So far we have used git and bash to accomplish familiar goals, so they may feel like extra work. Today we will start to see why they are worth that work.
We will do this by working in the gh-inclass repo on organizing a set of files. This is still a familiar task, but we will show how doing it on the terminal can make it faster[1], more robust to errors, and automatable[2].
Setting the stage¶
To prepare for today’s class you examined an open source project.
We noticed that while the contents, the distribution of languages used, and the specific code were all different, a lot of the organization was similar.
Most had certain community health files and basic info files:
CONTRIBUTING
CODE_OF_CONDUCT
README.MD
LICENSE
GOVERNANCE.MD
Setup¶
First, we’ll go back to our github inclass folder
cd gh-inclass-brownsaram
ls
about.md README.md
and check in with git
git status
On branch fun_fact
Your branch is up to date with 'origin/fun_fact'.
nothing to commit, working tree clean
and then get updated
git pull
remote: Enumerating objects: 18, done.
remote: Counting objects: 100% (18/18), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 18 (delta 0), reused 18 (delta 0), pack-reused 0 (from 0)
Unpacking objects: 100% (18/18), 1.64 KiB | 152.00 KiB/s, done.
From https://github.com/compsys-progtools/gh-inclass-fa25-brownsarahm
 * [new branch] organizing_ac -> origin/organizing_ac
Already up to date.
Note there is a new branch; I created that to give you some files to work with for today’s class.
Recall, we can use git branch with no inputs to see what branches we have locally
git branch
  1-add-a-readme
  fun_fact
* main
  my_branch
The new branch does not show because, while it has been pulled and git can tell what it should be, we have not used it locally, so it does not really exist locally yet.
To see branches that are in the remote we use the -r flag.
git branch -r
  origin/1-add-a-readme
  origin/HEAD -> origin/main
  origin/fun_fact
  origin/main
  origin/organizing_ac
Now we have the new branch!
We can switch to it with checkout
git checkout organizing_ac
branch 'organizing_ac' set up to track 'origin/organizing_ac'.
Switched to a new branch 'organizing_ac'
Now we look at the contents of the working directory again to see what the new branch includes.
ls
_config.yml LICENSE.md
_toc.yml philosophy.md
abstract_base_class.py README.md
alternative_classes.py scratch.ipynb
API.md setup.py
CONTRIBUTING.md tests_alt.py
example.md tests_helpers.py
helper_functions.py tests_imp.py
important_classes.py tsets_abc.py
We have a lot of new files that were not in the working directory before! git checkout does two things:
move the HEAD pointer to the new branch (here, organizing_ac)
set the contents of the working directory to the contents in the repository as of the last commit on that branch (where the branch pointer points)
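A minimal sketch of those two effects in a throwaway repository. The temporary directory, the file names, and the organizing_demo branch are made up for illustration and are not part of gh-inclass.

```shell
# scratch repo in a temporary directory (hypothetical, not gh-inclass)
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/main     # name the first branch main
echo "# practice" > README.md
git add README.md
git -c user.name=demo -c user.email=demo@example.com commit -q -m "start"

# a second branch with one extra committed file
git checkout -q -b organizing_demo
echo "notes" > extra.md
git add extra.md
git -c user.name=demo -c user.email=demo@example.com commit -q -m "add extra"

# effect 1: checkout moves HEAD
git checkout -q main
head_on_main=$(cat .git/HEAD)
files_on_main=$(ls)
git checkout -q organizing_demo
head_on_demo=$(cat .git/HEAD)
# effect 2: the working directory now matches that branch's last commit
files_on_demo=$(ls)

echo "$head_on_main has:" $files_on_main
echo "$head_on_demo has:" $files_on_demo
```

The .git/HEAD file is plain text, so cat is enough to watch the pointer move.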
Organizing a project (working with files)¶
A common question is about how to organize projects. While our main focus
in this class session is the bash commands to do it, the task that we are
going to do is to organize a hypothetical python project
Put another way, we are using organizing a project as the context to motivate practicing with bash commands for moving files.
A different instructor might go through a slide deck that lists commands and describes what each one does and then have examples at the end. Instead, we are going to focus on organizing files, and I will introduce the commands we need along the way.
next we are going to pretend we worked on the project and made a bunch of files
I gave a bunch of files, each with a short phrase in them.
none of these are functional files
the phrases mean you can inspect them on the terminal
file extensions are for people; they do not specify what the file is actually written like
these are all actually plain text files
cat concatenates the contents of a file to STDOUT, which can be thought of like a special file that our terminal reads
think about in C you can write to STDOUT or STDERR, some IDEs have separate visual panels for these two places
so for example:
cat API.md
jupyterbook file to generate api documentation
or:
cat scratch.ipynb
jupyter notebook from dev
We will create a new branch before we start doing some new work. We do not have to branch off of main; it can be off of the one we are already on. Remember, branches are not copies; they are pointers to specific commits.
git checkout -b organization
Switched to a new branch 'organization'
and confirm it is as we expect.
git status
On branch organization
nothing to commit, working tree clean
Manipulating the Streams¶
The STDOUT stream can be manipulated. Any time we open or manipulate a file, it goes through a stream. So we can use other files in place of how we by default use STDOUT.
To try this out, let’s first recall what is in the README file.
cat README.md
# GitHub practice
test
echo is a simple command that repeats what you give it, again by default to STDOUT.
echo "hello"
hello
Not so exciting, but there is a way we can use this!
An important part of the unix philosophy (that underlies bash and lots of other design in computing today) is to make small programs that each do one thing well and then connect them together to accomplish more complex tasks.
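As a small illustration of connecting programs, here is a sketch that uses the pipe (|), which sends one command's STDOUT into the next command. Pipes are an assumption here, since these notes have only used redirects so far; the file names are borrowed from the branch only as sample text.

```shell
# three small programs, each doing one job, connected with pipes:
# printf writes one name per line, sed keeps only the extension,
# sort groups equal lines together, and uniq -c counts each group
counts=$(printf '%s\n' API.md setup.py README.md tests_alt.py LICENSE.md \
  | sed 's/.*\.//' \
  | sort \
  | uniq -c)
echo "$counts"
```

None of these programs knows about the others; each one only reads lines and writes lines.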
Typically, when we write to a file, in programming, we also have to tell it what mode to open the file with, and some options are:
read
write
append
This could be familiar from:
fopen in C
or
open in Python
References
C is not an open source language in the typical sense, so there are no “official” C docs
We can also redirect (send) the contents of a command from stdout to a file in bash. Like file operations while programming there is a similar concept to this mode.
There are two types of redirects, like there are two ways to write to a file, more generally:
overwrite (>)
append (>>)
These symbols are each called a redirect
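A minimal side-by-side sketch of the two modes, run in a temporary directory with a made-up file name (notes.txt) so that no real file is touched:

```shell
# work in a temporary directory (hypothetical scratch space)
scratch=$(mktemp -d)
cd "$scratch"

echo "first"  > notes.txt      # > creates the file (or empties it) and writes
echo "second" >> notes.txt     # >> adds to the end
lines_after_append=$(wc -l < notes.txt)

echo "third"  > notes.txt      # > again: the two earlier lines are replaced
lines_after_write=$(wc -l < notes.txt)

echo "line count after >>:" $lines_after_append
echo "line count after >:" $lines_after_write
cat notes.txt
```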
Redirect in append mode¶
We can use a redirect to send the output of any command to a different stream than STDOUT. So for example, we can send a phrase into the README.md file.
echo "today is rainy" >> README.md
This time there was no output, since nothing went to STDOUT.
Then we check the contents of the file and we see that the new content is there.
cat README.md
# GitHub practice
test
today is rainy
the new content!
Redirect in write mode¶
We can redirect other commands too. This time we will use one > to use write mode instead of append mode.
git status > curgit
Again no output
The curgit[3] file did not exist before, so let’s see if it is there.
ls
_config.yml LICENSE.md
_toc.yml philosophy.md
abstract_base_class.py README.md
alternative_classes.py scratch.ipynb
API.md setup.py
CONTRIBUTING.md tests_alt.py
curgit tests_helpers.py
example.md tests_imp.py
helper_functions.py tsets_abc.py
important_classes.py
it’s there
and we can look at its contents too
cat curgit
On branch organization
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: README.md
Untracked files:
(use "git add <file>..." to include in what will be committed)
curgit
no changes added to commit (use "git add" and/or "git commit -a")
Connected bash commands work right to left¶
From the contents of the file, we can see something really important about how the redirect (and other ways of chaining bash commands together) work.
Notice that above, the curgit file is mentioned in the git status output that is the contents of the curgit file. We didn’t create the file on purpose before we ran the line git status > curgit. This is because bash works partially right to left and then resolves after: it sets up the redirect before it runs the command. So, what happens is:
the file curgit is created and a stream to it is opened
the STDOUT stream is connected to that stream
git status runs; it compares the working directory to the last commit and sends its output to STDOUT, which is connected to curgit
the file stream is closed.
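The same ordering can be seen without git. This is a minimal sketch in an empty temporary directory; listing.txt is a made-up name.

```shell
# an empty temporary directory (hypothetical scratch space)
empty=$(mktemp -d)
cd "$empty"

before=$(ls)             # nothing here yet
ls > listing.txt         # bash creates listing.txt first, then ls runs
after=$(cat listing.txt)

echo "before: '$before'"
echo "inside listing.txt: '$after'"
```

The file lists itself, because it already existed by the time ls looked at the directory.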
this is not a file we actually want, which gives us a chance to learn another new bash command:
rm, for remove, removes the entry for this file from the file table. Your computer keeps track of all of the paths and where on the disk (like a memory address) the corresponding content is. If there is no entry, we cannot find the contents using normal tools, though they are not erased from the disk or overwritten right away. The OS is then free to overwrite those bits if needed.
rm curgit
This is a true, full, and complete DELETE. It does not put the file in your recycling bin or the Apple trash can, where you could recover it; it is gone for real.
We will see soon a way around this, because git can help.
use rm with great care
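A sketch of where git can and cannot help after rm, assuming a throwaway repository with made-up file names. It relies on git restore, the command that git status suggests for discarding changes in the working directory.

```shell
# scratch repo (hypothetical, not gh-inclass)
demo=$(mktemp -d)
cd "$demo"
git init -q
echo "keep me" > tracked.txt
git add tracked.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "save tracked.txt"
echo "never committed" > untracked.txt

rm tracked.txt untracked.txt      # both are gone from the working directory

git restore tracked.txt           # git still has the committed version
restored=$(cat tracked.txt)
echo "$restored"
[ -e untracked.txt ] || echo "untracked.txt is gone: git never stored it"
```

Only content that was committed comes back, which is one more reason to commit before experimenting.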
Let’s check in with git
git status
On branch organization
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: README.md
no changes added to commit (use "git add" and/or "git commit -a")
Now we have made some changes we want, so let’s commit our changes.
git commit -a -m 'add note to readme'
[organization 285dd21] add note to readme
 1 file changed, 1 insertion(+)
and confirm a clean working directory
git status
On branch organization
nothing to commit, working tree clean
Git can save you from mistakes¶
Let’s try another redirect. With two (>>) we append to the file. With one (>), in write mode, on a file with pre-existing content, let’s see:
echo "rain is sad" > README.md
and then check the file
cat README.md
rain is sad
It wrote over the old content with this new content. This would be bad, we lost content, but this is what git is for!
It is very very easy to undo work since our last commit.
This is good for times when you have an idea and you do not know if it is going to work, so you make a commit before you try it. Then you can try it out. If it doesn’t work, you can undo and go back to the place where you made the commit.
To do this, we will first check in with git
git status
On branch organization
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: README.md
no changes added to commit (use "git add" and/or "git commit -a")
Notice that it tells us what to do (use "git restore <file>..." to discard changes in working directory). The version of README.md that we broke is in the working directory but not committed to git, so git refers to it as “changes” in the working directory.
We run that command
git restore README.md
This command has no output, so we can use git status to check first
and we can check the file
cat README.md
# GitHub practice
test
today is rainy
Back how we wanted it!
Manipulating files with bash¶
Next we will add some descriptive content to the README
echo "|file | contents |
> > | --| -- |
> > | abstract_base_class.py | core abstract classes for the project |
> > | helper_functions.py | utitly funtions that are called by many classes |
> > | important_classes.py | classes that inherit from the abc |
> > | alternative_classes.py | classes that inherit from the abc |
> > | LICENSE.md | the info on how the code can be reused|
> > | CONTRIBUTING.md | instructions for how people can contribute to the project|
> > | setup.py | file with function with instructions for pip |
> > | test_abc.py | tests for constructors and methods in abstract_base_class.py|
> > | tests_helpers.py | tests for constructors and methods in helper_functions.py|
> > | tests_imp.py | tests for constructors and methods in important_classes.py|
> > | tests_alt.py | tests for constructors and methods in alternative_classes.py|
> > | API.md | jupyterbook file to generate api documentation |
> > | _config.yml | jupyterbook config for documentation |
> > | _toc.yml | jupyter book toc file for documentation |
> > | philosophy.md | overview of how the code is organized for docs |
> > | example.md | myst notebook example of using the code |
> > | scratch.ipynb | jupyter notebook from dev |" >> README.md
This explains each file a little bit more than its name does. We see there are sort of 5 groups of files:
about the project/repository
code that defines a python module
test code
documentation
extra files that “we know” we can delete.
We also learn something about bash:
once you type an opening quote ", you stay inside it until you close it. When you press enter, the command does not run until after you close the quotes
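Another way to send several lines into a file is a here-document. It is standard bash but is not used elsewhere in these notes, so treat this as an optional sketch; table.md is a made-up file in a temporary directory.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# a quoted string spanning two lines: the newline is part of the string
echo "|file | contents |
| --| -- |" > table.md

# here-document: every line up to the one that is only EOF goes to cat's STDIN
cat >> table.md <<'EOF'
| setup.py | file with function with instructions for pip |
EOF

table_lines=$(wc -l < table.md)
cat table.md
```

With the here-document there is no closing quote to forget, and the pasted lines are not changed by the shell because the EOF marker is quoted.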
let’s look at the file again
cat README.md
# GitHub practice
test
today is rainy
|file | contents |
> | --| -- |
> | abstract_base_class.py | core abstract classes for the project |
> | helper_functions.py | utitly funtions that are called by many classes |
> | important_classes.py | classes that inherit from the abc |
> | alternative_classes.py | classes that inherit from the abc |
> | LICENSE.md | the info on how the code can be reused|
> | CONTRIBUTING.md | instructions for how people can contribute to the project|
> | setup.py | file with function with instructions for pip |
> | test_abc.py | tests for constructors and methods in abstract_base_class.py|
> | tests_helpers.py | tests for constructors and methods in helper_functions.py|
> | tests_imp.py | tests for constructors and methods in important_classes.py|
> | tests_alt.py | tests for constructors and methods in alternative_classes.py|
> | API.md | jupyterbook file to generate api documentation |
> | _config.yml | jupyterbook config for documentation |
> | _toc.yml | jupyter book toc file for documentation |
> | philosophy.md | overview of how the code is organized for docs |
> | example.md | myst notebook example of using the code |
> | scratch.ipynb | jupyter notebook from dev |
First, we’ll make a directory with mkdir
mkdir docs
next we will move a file there with mv
mv philosophy.md docs/
What this does is change the path of the file from gh-inclass-fa25-brownsarahm/philosophy.md to gh-inclass-fa25-brownsarahm/docs/philosophy.md
It does not rewrite the file contents to disk, so even for large files this is quick; it changes only the file table that stores each path and a disk address.
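One way to see this is to compare the file's inode number, the file system's id for the stored content, before and after the move. This is a sketch that assumes both paths are on the same file system; the temporary directory stands in for the repo.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "overview of how the code is organized" > philosophy.md
mkdir docs

# ls -i prints the inode number in front of the name
inode_before=$(ls -i philosophy.md | awk '{print $1}')
mv philosophy.md docs/
inode_after=$(ls -i docs/philosophy.md | awk '{print $1}')

echo "before: $inode_before  after: $inode_after"
```

The number is the same in both places: only the path changed, not the stored content.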
we can look in the docs folder
cd docs/
and use ls
ls
philosophy.md
or go back
cd ..
and use ls with the relative path to where we moved it
ls docs/
philosophy.md
we can also check the original location
ls
_config.yml important_classes.py
_toc.yml LICENSE.md
abstract_base_class.py README.md
alternative_classes.py scratch.ipynb
API.md setup.py
CONTRIBUTING.md tests_alt.py
docs tests_helpers.py
example.md tests_imp.py
helper_functions.py tsets_abc.py
not there!
Moving multiple files with patterns¶
We can use the * wildcard operator to move all files that match the pattern. We’ll start with the two yml (yaml)
files that are both for the documentation.
mv *.yml docs/
Again, we confirm it worked by seeing that they are no longer in the working directory.
ls
abstract_base_class.py LICENSE.md
alternative_classes.py README.md
API.md scratch.ipynb
CONTRIBUTING.md setup.py
docs tests_alt.py
example.md tests_helpers.py
helper_functions.py tests_imp.py
important_classes.py tsets_abc.py
and that they are in docs
ls docs/
_config.yml _toc.yml philosophy.md
We see that most of the test files start with tests_ but one starts with
tsets_. We can fix this!
Renaming files¶
We can use mv to change the name as well. This is because “moving” a file
is really about changing its path, not actually copying it from one location to
another, and the file name is a part of the path.
mv tsets_abc.py tests_abc.py
This changes the path from .../tsets_abc.py to .../tests_abc.py. It is doing the same thing as when we use it to move a file from one folder to another folder, but changing a different part of the path.
we can use ls to check that it worked
ls
abstract_base_class.py LICENSE.md
alternative_classes.py README.md
API.md scratch.ipynb
CONTRIBUTING.md setup.py
docs tests_abc.py
example.md tests_alt.py
helper_functions.py tests_helpers.py
important_classes.py tests_imp.py
Now we make a new folder:
mkdir tests
and move all of the test files there:
mv tests_* tests/
This is why good file naming is important: even if you have not organized the whole project yet, you can use good conventions to help yourself later.
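Before moving many files with a pattern, it can help to preview what the pattern matches. A minimal sketch, using a scratch directory with a few made-up empty files: putting echo in front of the command prints it after bash has expanded the pattern, without moving anything.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch tests_alt.py tests_imp.py setup.py
mkdir tests

# preview: echo prints the expanded command instead of running it
preview=$(echo mv tests_* tests/)
echo "$preview"

mv tests_* tests/        # now run it for real
ls tests
```

The tests folder itself is not matched by tests_* because its name has no underscore.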
Hidden files¶
We saw before that some files and folders are hidden
and that to see them with ls we need the -a flag
ls -a
. example.md
.. helper_functions.py
.git important_classes.py
.github LICENSE.md
abstract_base_class.py README.md
alternative_classes.py scratch.ipynb
API.md setup.py
CONTRIBUTING.md tests
docs
We are going to make a special hidden file and an extra one. We will use the following command:
touch .secret .gitignore
We also learn 2 things about touch and bash:
touch can make multiple files at a time
lists in bash are separated by spaces and do not require brackets
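A short sketch of both points, plus one more property of touch that is easy to check: running it on a file that already exists only updates the timestamp and leaves the content alone. The scratch directory and notes.md are made up for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch"

touch .secret .gitignore notes.md     # three names, separated by spaces
created=$(ls -A | grep -c .)          # -A shows hidden files but skips . and ..

# touch on an existing file keeps its content
echo "my dev secret" > .secret
touch .secret
kept=$(cat .secret)

echo "$created files created; .secret still says: $kept"
```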
Let’s put some content in the secret file. This is simple text, but it represents a real type of content: things we need but do not want to share with collaborators or make public.
echo "my dev secret" > .secret
It is common in a project to have an API key or secret that you need to keep with your code for it to run, but that you do not want to share.
Not everyone is careful with them though, over 12k were found in LLM training sets
again, we still do not see the file, it’s hidden from a casual user already!
ls
abstract_base_class.py important_classes.py
alternative_classes.py LICENSE.md
API.md README.md
CONTRIBUTING.md scratch.ipynb
docs setup.py
example.md tests
helper_functions.py
but we can see them both
ls -a
. docs
.. example.md
.git helper_functions.py
.github important_classes.py
.gitignore LICENSE.md
.secret README.md
abstract_base_class.py scratch.ipynb
alternative_classes.py setup.py
API.md tests
CONTRIBUTING.md
and the file is normally visible to all bash commands
cat .secret
my dev secret
Let’s check in with git
git status
On branch organization
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: README.md
deleted: _config.yml
deleted: _toc.yml
deleted: philosophy.md
deleted: tests_alt.py
deleted: tests_helpers.py
deleted: tests_imp.py
deleted: tsets_abc.py
Untracked files:
(use "git add <file>..." to include in what will be committed)
.gitignore
.secret
docs/
tests/
no changes added to commit (use "git add" and/or "git commit -a")
git sees the file, but it is not yet tracked. If we were to use git add . it would start getting tracked. We do not want that, and we do not want to have to manually add every other file just to avoid this one.
gitignore lets us not track certain files
let’s ignore that .secret file
echo ".secret" >> .gitignore
Now we check with git again:
git status
On branch organization
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: README.md
deleted: _config.yml
deleted: _toc.yml
deleted: philosophy.md
deleted: tests_alt.py
deleted: tests_helpers.py
deleted: tests_imp.py
deleted: tsets_abc.py
Untracked files:
(use "git add <file>..." to include in what will be committed)
.gitignore
docs/
tests/
no changes added to commit (use "git add" and/or "git commit -a")
no more secret!
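If you ever want to confirm which rule is hiding a file, git check-ignore can report it. This is a sketch in a throwaway repository, not the in-class repo; the file names mirror the ones above only for familiarity.

```shell
# scratch repo (hypothetical, not gh-inclass)
demo=$(mktemp -d)
cd "$demo"
git init -q
echo "my dev secret" > .secret
echo "notes" > README.md
echo ".secret" >> .gitignore

# --short lists one path per line; ignored files are left out
visible=$(git status --short)
# check-ignore -v reports the ignore file, the line number, and the pattern
rule=$(git check-ignore -v .secret)

echo "$visible"
echo "$rule"
```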
the file is still there:
cat .secret
my dev secret
Tracking moved files¶
Now we want to add the rest of the files we changed. Notice that above, in the git status output, git thought we had deleted a lot of files that we had actually moved.
Let’s add one of the folders for tracking
git add docs/
git status
On branch organization
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
new file: docs/_config.yml
new file: docs/_toc.yml
new file: docs/philosophy.md
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: README.md
deleted: _config.yml
deleted: _toc.yml
deleted: philosophy.md
deleted: tests_alt.py
deleted: tests_helpers.py
deleted: tests_imp.py
deleted: tsets_abc.py
Untracked files:
(use "git add <file>..." to include in what will be committed)
.gitignore
tests/
it still thinks the files are deleted, but also that there are new files.
Now let’s add all
git add .
and status
git status
On branch organization
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
new file: .gitignore
modified: README.md
renamed: _config.yml -> docs/_config.yml
renamed: _toc.yml -> docs/_toc.yml
renamed: philosophy.md -> docs/philosophy.md
renamed: tsets_abc.py -> tests/tests_abc.py
renamed: tests_alt.py -> tests/tests_alt.py
renamed: tests_helpers.py -> tests/tests_helpers.py
renamed: tests_imp.py -> tests/tests_imp.py
Now, it knows that we have actually renamed the files.
We had to tell it to track the changes to both . and docs for it to detect that they were the same files that we had only moved. It compares files using their names but also the contents. We will pick up this idea more next week, but this is an important observation.
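The same observation can be reproduced in a few lines. A sketch in a throwaway repository with one made-up file, using git status --short so the result fits on a line or two:

```shell
# scratch repo (hypothetical, not gh-inclass)
demo=$(mktemp -d)
cd "$demo"
git init -q
echo "overview of how the code is organized" > philosophy.md
git add philosophy.md
git -c user.name=demo -c user.email=demo@example.com commit -q -m "add philosophy"

mkdir docs
mv philosophy.md docs/

git add docs/
partial=$(git status --short)   # new path staged, old path not: add plus delete

git add .
full=$(git status --short)      # both sides staged: git pairs them as a rename

echo "$partial"
echo "---"
echo "$full"
```

In the short format, A marks an added file, D a deleted one, and R a rename.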
Finally, we will commit
git commit -m 'begin reorg
> '
I forgot to close my ' for the message, so it gave me a continued prompt with > at the start, and then I had to close it and press enter or return a second time
[organization e899a0e] begin reorg
9 files changed, 20 insertions(+)
create mode 100644 .gitignore
rename _config.yml => docs/_config.yml (100%)
rename _toc.yml => docs/_toc.yml (100%)
rename philosophy.md => docs/philosophy.md (100%)
rename tsets_abc.py => tests/tests_abc.py (100%)
rename tests_alt.py => tests/tests_alt.py (100%)
rename tests_helpers.py => tests/tests_helpers.py (100%)
rename tests_imp.py => tests/tests_imp.py (100%)
Recap¶
Why do I need a terminal
replication/automation
it’s always there and doesn’t change
it’s faster once you know it (also see above)
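To make the replication/automation point concrete, here is a sketch of today's moves wrapped in a function. The organize function and the scratch folder are hypothetical, written for this illustration and not taken from the course repo.

```shell
# organize: a made-up helper that repeats the same moves
# in whatever directory it is given
organize() {
  cd "$1" || return 1
  mkdir -p docs tests
  for f in *.yml tests_*.py; do
    [ -e "$f" ] || continue          # skip a pattern that matched nothing
    case "$f" in
      *.yml)   mv "$f" docs/ ;;
      tests_*) mv "$f" tests/ ;;
    esac
  done
}

# try it on a scratch copy of a messy project
messy=$(mktemp -d)
touch "$messy/_config.yml" "$messy/_toc.yml" "$messy/tests_alt.py" "$messy/setup.py"
( organize "$messy" )                # subshell, so our own directory does not change
ls "$messy" "$messy/docs" "$messy/tests"
```

Once the steps are in a function or script, running them on the next messy folder is one line.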
Prepare for Next Class¶
Bring git questions or scenarios you want to be able to solve to class (in your mind or comment here if that helps you remember)
Try to read and understand the workflow files in your KWL repo. The goal is not to be sure you understand every step, but to get an idea of the big picture and just enough to complete the following. Try to modify files, on a prepare branch, so that your name is already filled in and AymanBx or thomaspeck11 (whoever you sit closer to in class) is already requested as a reviewer when your experience badge (inclass) action runs. We will give the answer in class, but especially do not do this step on the main branch; it could break your action.
Hints:
Look for bash commands that we have seen before
cp copies a file
check what the experience makeup action does
Badges¶
badge steps marked lab are steps that you will be encouraged to use lab time to work on. In this case, in lab, we will check that you know what to do, but if we want you to do revisions those will be done through the badge.
Update your KWL chart with the new items and any learned items.
Clone the course website. Append the commands used and the contents of your fall2025/.git/config to a file in your KWL called terminal_review.md (hint: history outputs recent commands and redirects can work with any command, not only echo). Edit the README.md in the course website (you can add anything), commit, and try to push the changes. Describe what the error means and which GitHub Collaboration Feature you think would enable you to push. (answer in the terminal_review.md)
fork is the answer, must be one of the things highlighted in the link
lab Organize the provided messy folder in a Codespace (details will be provided in lab time). Commit and push the changes. Answer the questions below in your kwl repo in a file called terminal_organization.md
clone your messy_repo locally and append the history.md file to your terminal_organization.md
badge steps marked lab are steps that you will be encouraged to use lab time to work on. For this one in particular, I am going to give you the messy repo in lab.
Update your KWL chart with any learned items.
(option) Get set up so that you can pull from the course website repo and push to your own fork of the class website by cloning the main repo, then forking it and adding your fork as an additional remote. Append the commands used and the contents of your fall2025/.git/config to a file in your KWL repo called terminal_practice.md (hint: history outputs recent commands and redirects can work with any command, not only echo). Based on what you know so far about forks and branches, what advantage does this setup provide? (answer in the terminal_practice.md)
(option) Get set up so that you can contribute to the course website repo from your local system. Note: you can pull from the compsys-progtools/fall2024 repo, but you do not have push permission, so there is more to do than clone. Append the commands used and the contents of your local fall2025/.git/config to a git-remote-practice.md. Then, using a text editor (or IDE), wrap each log with three backticks to make them fenced code blocks and add headings to the sections.
clone (fork in browser)
git remote add with any name and their fork
lab Organize the provided messy folder (details will be provided in lab time). Commit and push the changes. Clone that repo locally.
Organize a folder on your computer (a good candidate may be your desktop or downloads folder), using only a terminal to make new directories, move files, check what’s inside them, etc. Answer reflection questions in a new file, terminal_organization_adv.md in your kwl repo. Tip: Start with a file explorer open, but then try to close it, and use only command line tools to explore and make your choices. If you get stuck, look up additional commands to accomplish your goals.
# Terminal File moving reflection
1. How was this activity overall? Did this get easier toward the end?
2. How was it different working on your own computer compared to the Codespace form?
3. Did you have to look up how to do anything we had not done in class?
4. When do you think that using the terminal will be better than using your GUI file explorer?
5. What questions/challenges/ reflections do you have after this?
6. Append all of the commands you used in lab below. (not from your local computer's history, from the codespace history)
Experience Report Evidence¶
The prepare work for both Tuesday and today had a file in it so you need to make sure that those get added to your experience badges. Review the process
Save your history with:
history > activity-2025-09-18.md
then append your git status and the contents of your github-in-class and github-in-class/docs, with "***--" lines to help visually separate the parts.
echo "***--" >> activity-2025-09-18.md
git status >> activity-2025-09-18.md
echo "***--" >> activity-2025-09-18.md
ls >> activity-2025-09-18.md
echo "***--" >> activity-2025-09-18.md
ls docs/ >> activity-2025-09-18.md
then edit that file (on terminal, any text editor, or an IDE) to make sure it only includes things from this activity.
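The repeated >> can also be written once by grouping commands in braces. This is an optional sketch, not the required way to build the file: it uses a scratch folder and a temporary log file in place of the real activity file, and it leaves out history and git status so that it runs anywhere.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir docs
touch README.md docs/philosophy.md
log=$(mktemp)            # kept outside the folder so ls does not list it

# { ...; } groups the commands so a single >> covers all of their output
{
  echo "***--"
  ls
  echo "***--"
  ls docs/
} >> "$log"

cat "$log"
```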
Questions After Today’s Class¶
How much longer is the grace period for badges?¶
The penalty free zone ends on 2025-09-25
I’m interested in learning more about wildcard operators.¶
explore They are an example of a type of pattern matching. The most general is regular expressions. This is a good explore badge topic.
Do patterns and the wild card operator (*) apply to all terminal commands?¶
I am sure some exist that do not, especially since people can make their own, but most bash commands that take a file as input do accept this.
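Part of the reason is that bash itself expands the pattern before the command runs, so the command only ever receives a list of file names. A sketch with a made-up helper, count_args, that reports how many arguments it was given (assuming bash's default settings):

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch tests_alt.py tests_imp.py

# count_args is a hypothetical helper: it prints how many arguments it received
count_args() { echo "$#"; }

unquoted=$(count_args tests_*)     # bash expands the pattern first: 2 file names
quoted=$(count_args "tests_*")     # quotes stop the expansion: 1 literal string
nomatch=$(echo nothing_*)          # an unmatched pattern is passed along as typed

echo "$unquoted $quoted $nomatch"
```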
After the initial adjustment period, learning new things can slow you down a bit, but once you get used to it, the terminal makes things faster.
With LLMs, people now think that lots of things can be “automated” by delegating them to the “AI.” However, LLMs will make mistakes, maybe not the first time, but they will make errors[4]. Automating something with a script makes it happen the same way every time: faster, cheaper[5], and more reliably, with a little bit of investment of your time up front.
AI errors are often called hallucinations. I try to avoid the term because hallucinations in a biological brain are perceptions of things that are not there, a replacement of sensory information, and are different from normal function. LLM hallucinations are provably inevitable (Kalai et al., 2025) and are not technically different from other outputs; it is a label we assign based on whether the output is correct or not.
LLMs use extraordinary amounts of electricity and water Luccioni et al. (2024), so using one for something that can be done with a script, over and over and over again is expensive (electricity costs $) and has a significant environmental impact.
notice that there is no extension! can you remember why?
- Kalai, A. T., Nachum, O., Vempala, S. S., & Zhang, E. (2025). Why Language Models Hallucinate. arXiv Preprint arXiv:2509.04664.
- Luccioni, S., Trevelin, B., & Mitchell, M. (2024). The environmental impacts of ai–primer. Hugging Face Blog.