Review the notes on what is a commit. In gitdef.md on the branch for this issue, try to describe git in the four ways we described a commit. The point here is to think about what you know about git and practice remembering it, not to “get the right answer”; this is prepare work, so we only check that it is complete, not that it is correct.
Start recording notes on how you use IDEs for the next couple of weeks using the template file below. We will come back to these notes in class later, but it is best to record over a time period instead of trying to remember at that time. Store your notes in your fall24 repo in idethoughts.md on a dedicated ide_prep branch. This is prep for a few weeks from now, not for October 8; keep this branch open until it is specifically asked for.
12. What is git?
Last class we created a local repo, then created an empty repo on GitHub. We linked the two repos and attempted to push our local changes to the “empty” repo, but were not successful.
The reason is the way the new GitHub repo was created. When we used GitHub Classroom to fork my template tiny-book repo as your new empty repo, GitHub made two commits, so it was not actually an empty repo.
cd Documents/systems/tiny-book
Let’s check it out
gh repo view --web
Let’s compare that to what we have locally
git log
commit 1da3fa4d2b1b14e3a92358455c2320697af43867 (HEAD -> main)
Author: AymanBx <ayman_sandouk@uri.edu>
Date: Tue Mar 4 13:42:31 2025 -0500
jupyter book template
When we attempted to pull, we weren’t fully successful, because we had failed to link the two main branches when the push command failed.
git pull
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 5 (delta 0), reused 0 (delta 0), pack-reused 0 (from 0)
Unpacking objects: 100% (5/5), 1.70 KiB | 102.00 KiB/s, done.
From https://github.com/compsys-progtools/tiny-book-AymanBx
* [new branch] feedback -> origin/feedback
* [new branch] main -> origin/main
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.
git pull <remote> <branch>
If you wish to set tracking information for this branch you can do so with:
git branch --set-upstream-to=origin/<branch> main
We should follow git's instructions
git branch --set-upstream-to=origin/main main
branch 'main' set up to track 'origin/main'.
Now we should be able to pull successfully
git pull
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint:
hint: git config pull.rebase false # merge
hint: git config pull.rebase true # rebase
hint: git config pull.ff only # fast-forward only
hint:
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.
Git recognized the different commit histories in the two repos as a conflict.
To resolve this type of conflict (main had unrelated changes on it that are unknown to my branch), we rebase.
Rebasing updates my current branch with the new commits that occurred on main (or any branch I want to rebase onto) by replaying my own commits on top of them.
git pull --rebase
Successfully rebased and updated refs/heads/main.
What did that do?
git log
Author: AymanBx <ayman_sandouk@uri.edu>
Date: Tue Mar 4 13:42:31 2025 -0500
jupyter book template
commit d781535217b324d9cb2ce6cb45ae565b54ee786f (origin/main)
Author: github-classroom[bot] <66690702+github-classroom[bot]@users.noreply.github.com>
Date: Tue Oct 8 16:54:12 2024 +0000
Setting up GitHub Classroom Feedback
commit 72bcbb8cbd2769d21aad3c23c8fbe477d0260ced (origin/feedback)
Author: github-classroom[bot] <66690702+github-classroom[bot]@users.noreply.github.com>
Date: Tue Oct 8 16:54:12 2024 +0000
GitHub Classroom Feedback
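One way to picture what the rebase did is with a toy model. The sketch below is not how git is implemented: the functions `commit_id`, `build_history`, and `rebase` are hypothetical, and the toy id only hashes the parent id and the message, while real git hashes more fields. It assumes, as in our case, that the local and upstream histories share no commits. Note that the lists are oldest first, while git log prints newest first.

```python
import hashlib


def commit_id(parent, message):
    """Toy commit id: a short hash over the parent id and the message."""
    return hashlib.sha1(f"{parent}\n{message}".encode()).hexdigest()[:7]


def build_history(messages, base=None):
    """Return a list of (id, message) pairs, oldest first, chained from base."""
    history, parent = [], base
    for message in messages:
        cid = commit_id(parent, message)
        history.append((cid, message))
        parent = cid
    return history


def rebase(local, upstream):
    """Replay the local commits on top of the newest upstream commit."""
    tip = upstream[-1][0]
    replayed = build_history([message for _, message in local], base=tip)
    return upstream + replayed


upstream = build_history(
    ["GitHub Classroom Feedback", "Setting up GitHub Classroom Feedback"]
)
local = build_history(["jupyter book template"])

for cid, message in rebase(local, upstream):
    print(cid, message)
```

Because the replayed commit has a new parent, it gets a new id. That matches what we saw: the local commit started as 1da3fa4, but the push output reports 60bd457 as the new tip of main.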
Now we should be able to push
git push
Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Delta compression using up to 8 threads
Compressing objects: 100% (10/10), done.
Writing objects: 100% (12/12), 16.33 KiB | 8.17 MiB/s, done.
Total 12 (delta 0), reused 0 (delta 0), pack-reused 0 (from 0)
To https://github.com/compsys-progtools/tiny-book-aymanbx.git
d781535..60bd457 main -> main
A couple of weeks ago, we learned what a commit is and then took a break from how git works to talk more about the unix philosophy and how developers communicate about code.
Today we will learn what git is more formally.
study tip
We will go in and out of topics at times in order to provide what is called spaced repetition: repeating material or key concepts with breaks in between.
Using git correctly is a really important goal of this course because git is an opportunity for you to demonstrate a wide range of both practical and conceptual understanding.
So, I have elected to interleave other topics with git to give core git ideas some time to simmer and give you time to practice them before we build on them in more depth.
Also, we are both learning git and using git as a motivating example for other important topics.
12.1. Why so much git?
Today, we are going to learn what git is and later we will learn more details of how it is implemented.
Remember we are spending so much time with git for two reasons:
it is an important developer tool
it demonstrates important conceptual ideas that occur in other areas of CS
The git book is the official reference on git.
It is also available in other spoken languages, if that is helpful for you.
12.2. git definition
From here, we have the full definition of git
We do not start from that point, because these documents were written for a target audience of working developers who are familiar with other, older version control systems and are learning an additional one.
Have you used another version control system before?
Most of you, however, have probably not used another version control system.
Let’s break down the definition
12.3. Git is a File system
Content-addressable filesystem means a key-value data store.
What are some examples of key-value pairs that you have seen in computer science broadly, and in this course specifically, so far?
python dictionaries
pointers (address,content)
parameter, passed values
yaml files
What this means is that you can insert any kind of content into a Git repository, for which Git will hand you back a unique key you can use later to retrieve that content.
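A minimal sketch of this idea in Python, assuming a repository that uses SHA-1 (git's default). The class `ContentStore` and the function `git_blob_key` are hypothetical names for illustration. The key is computed the way git computes it for file contents: a short header (`blob`, the size in bytes, and a null byte) followed by the content itself.

```python
import hashlib


def git_blob_key(content: bytes) -> str:
    """Compute the key git assigns to file content (a blob), assuming SHA-1."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ContentStore:
    """Toy content-addressable store: the key is derived from the content."""

    def __init__(self):
        self._data = {}

    def put(self, content: bytes) -> str:
        key = git_blob_key(content)
        self._data[key] = content
        return key

    def get(self, key: str) -> bytes:
        return self._data[key]


store = ContentStore()
key = store.put(b"test content\n")
print(key)             # the key handed back for this content
print(store.get(key))  # the same content, retrieved by its key
```

Since the key depends only on the content, storing the same content twice gives the same key. An empty file, for example, always gets the key e69de29bb2d1d6434b8b29ae775ad8c2e48c5391.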
12.4. Git is a Version Control System
In the before times
git stores snapshots of your work each time you commit.
Which unit of git represents a snapshot?
[ ] branch
[ ] blob
[x] commit
[ ] tag
it uses 3 stages:
These three stages are in relation to your working directory, and potentially remotes.
So in broader context, the git visual cheatsheet gives a more complete picture and has commands overlaid on the concepts.
12.5. Git has two sets of commands
Porcelain: the user friendly VCS
Plumbing: the internal workings, a toolkit for a VCS
Which of the following commands are porcelain commands?
git commit
git cat-file
git add
git status
git hash-object
We have so far used git as a version control system. A version control system, in general, will have operations like commit, push, pull, clone. These may work differently under the hood or be called different things, but those are what something needs to have in order to keep track of different versions.
The plumbing commands reveal the way that git performs version control operations. That is, they implement the git file system operations that the git version control system is built on.
You can think of the plumbing vs porcelain commands like public/private methods. As a user, you only need the public methods (porcelain commands) but those use the private ones to get things done (plumbing commands). We will use the plumbing commands over the next few classes to examine what git really does when we call the porcelain commands that we will typically use.
Example?
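The public/private analogy can be sketched as a small Python class. This is only an analogy, not git's implementation: `ToyRepo` is a hypothetical class, its private methods are loosely named after the real plumbing commands git hash-object and git update-index, and the toy key is a plain SHA-1 of the content without the header git adds.

```python
import hashlib


class ToyRepo:
    """Toy model: public methods play the porcelain, private ones the plumbing."""

    def __init__(self):
        self._objects = {}  # key -> content
        self._index = {}    # file name -> key

    # "plumbing": low-level steps a user rarely calls directly
    def _hash_object(self, content: bytes) -> str:
        key = hashlib.sha1(content).hexdigest()
        self._objects[key] = content
        return key

    def _update_index(self, name: str, key: str) -> None:
        self._index[name] = key

    # "porcelain": user-friendly operations built from the plumbing
    def add(self, name: str, content: bytes) -> None:
        self._update_index(name, self._hash_object(content))

    def status(self) -> list:
        return sorted(self._index)


repo = ToyRepo()
repo.add("intro.md", b"# Welcome\n")
print(repo.status())
```

The user only calls `add` and `status`, but each call is carried out by the lower-level steps underneath.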
12.6. Git is distributed
What does that mean?
Git runs locally. It can run in many places, and has commands to help sync across remotes, but git does not require one copy of the repository to be the “official” copy and the others to be subordinate. git just sees repositories.
For human reasons, we like to have one “official” copy and treat the others as other copies, but that is a social choice, not a technological requirement of git. Even though we will typically use it with an official copy and other copies, having a tool that does not care makes the tool more flexible and allows us to create workflows, or networks of copies, that have any relationship we want.
It’s about the workflows, or the ways we socially use the tool.
12.6.1. Subversion Workflow
subversion is an older VCS
12.6.2. Centralized Manager
12.6.3. Dictator and lieutenants
This is a variant of a multiple-repository workflow. It’s generally used by huge projects with hundreds of collaborators; one famous example is the Linux kernel. Various integration managers are in charge of certain parts of the repository; they’re called lieutenants. All the lieutenants have one integration manager known as the benevolent dictator. The benevolent dictator pushes from their directory to a reference repository from which all the collaborators need to pull.
12.7. How does git do all these things?
We can use the bash command find to search the file system. Note that this does not search the contents of the files, just the names.
find .git/objects/ -type f
.git/objects/06/d56f40c838b64eb048a63e036125964a069a3a
.git/objects/0e/2e3b27f61b5908c4bb75a1ca680ee4053aa992
.git/objects/1d/a3fa4d2b1b14e3a92358455c2320697af43867
.git/objects/29/a422c19251aeaeb907175e9b3219a9bed6c616
.git/objects/2b/d9785b546aa1af7d6e41a48d33a0af811082dd
.git/objects/5f/534f8051f6a94d40e57e58242ef0113fae4fd1
.git/objects/6e/b15166db3ad944529be060af334deb2c022bbd
.git/objects/74/d5c7101ed8c8c1a6f87e31debd9445df1f0e71
.git/objects/78/3ec6aa5afe2f0a66087d01a112f543e1ed287e
.git/objects/7e/821e45db31376729c73f3616fb24db2b655a95
.git/objects/a0/57a320dcd595f3f0e0d250c3af4a5653596914
.git/objects/d6/f9d92349c768da1863b412674f25cd27d23cfb
.git/objects/e3/5d8850c9688b1ce82711694692cc574a799396
.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
.git/objects/f8/cdc73cb2be06824f521837366ec95b73d55ef8
.git/objects/fa/eea606145667f54d220a0c17ffe8d22db07146
.git/objects/fd/b7176c429a73d5335e127b27d530b8aaa07c7d
We searched for anything of the type file with the option -type f.
This is a lot of files! It’s more than we have in our working directory.
We can see that by looking at the working directory with ls
ls
_build/ _toc.yml logo.png markdown.md references.bib
_config.yml intro.md markdown-notebooks.md notebooks.ipynb requirements.txt
And remember, _build is not being tracked. That means git knows nothing about it or about its contents.
Having more objects than files is a consequence of git taking snapshots and tracking both the actual contents of our working directory and our commit messages and other metadata about each commit.
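The paths in the listing encode the keys of the stored objects: the two-character directory name followed by the 38-character file name gives the full 40-character hash. A small sketch of that mapping, where `object_id_from_path` is a hypothetical helper and SHA-1 length hashes are assumed:

```python
from pathlib import PurePosixPath


def object_id_from_path(path: str):
    """Return the 40-character object id for an object path, or None if it is not one."""
    p = PurePosixPath(path)
    prefix, rest = p.parent.name, p.name
    candidate = prefix + rest
    is_hex = all(c in "0123456789abcdef" for c in candidate)
    if len(prefix) == 2 and len(rest) == 38 and is_hex:
        return candidate
    return None


paths = [
    ".git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391",
    ".git/objects/1d/a3fa4d2b1b14e3a92358455c2320697af43867",
]
for path in paths:
    print(object_id_from_path(path))
```

The second path gives back 1da3fa4d2b1b14e3a92358455c2320697af43867, the hash that git log showed for our first commit, so that file holds the commit itself.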
12.8. Git Variables
The program git does not run continuously the entire time you are using it for a project. It runs quick commands each time you tell it to; its goal is to manage files, so this makes sense. This also means that important information that git needs is also saved in files.
We can see the files that it has by listing the directory:
ls .git
COMMIT_EDITMSG HEAD description index logs/ refs/
FETCH_HEAD config hooks/ info/ objects/
The files in all caps are like git's variables.
Let's look at the one called HEAD. We have interacted with HEAD before when resolving merge conflicts.
cat .git/HEAD
ref: refs/heads/main
HEAD is a pointer to the currently checked out branch. Do you remember where we see this pointer?
The other files with HEAD in their name are similarly pointers to other references, named for what they track (for example, FETCH_HEAD records what was last fetched).
Imagine you had a copy of a git repo on a computer without the program git.
You ran the command ls .git/ in the repo and got the following output:
COMMIT_EDITMSG HEAD description index logs/ refs/
FETCH_HEAD config hooks/ info/ objects/
How could you find out what branch was checked out using only bash in one command?
cat .git/HEAD
ref: refs/heads/main
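The same idea works from any program that can read a file. A minimal sketch in Python, where `current_branch` is a hypothetical helper; the demo builds a throwaway directory with only a HEAD file, so it is not a real repository.

```python
import tempfile
from pathlib import Path


def current_branch(repo_dir):
    """Read .git/HEAD and return the checked out branch name, without running git."""
    head = (Path(repo_dir) / ".git" / "HEAD").read_text().strip()
    prefix = "ref: refs/heads/"
    if head.startswith(prefix):
        return head[len(prefix):]
    return None  # HEAD holds something other than a branch reference


# demo with a throwaway directory that only imitates a repo
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    git_dir.mkdir()
    (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
    print(current_branch(tmp))
```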
12.9. Git Objects
There are 3 main types:
blob objects: the content of your files (data)
tree objects: store file names and group files together (organization)
commit objects: store the SHA hash of the snapshot's tree, along with the author, the message, and the parent commit(s)
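A sketch of how the three types refer to each other by key, assuming SHA-1. The helper `object_key` is a hypothetical name, and the file name, author, and timestamps are made up for illustration. Each object is hashed as a header (its type, its size, and a null byte) followed by its body.

```python
import hashlib


def object_key(kind: str, body: bytes) -> str:
    """Key for an object of the given type, assuming SHA-1."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# blob: only the content of a file, no name
blob_body = b"test content\n"
blob_key = object_key("blob", blob_body)

# tree: one entry per file, "<mode> <name>\0" followed by the blob key as 20 raw bytes
tree_body = b"100644 intro.md\0" + bytes.fromhex(blob_key)
tree_key = object_key("tree", tree_body)

# commit: points at a tree and adds metadata (hypothetical author and timestamps)
commit_body = (
    f"tree {tree_key}\n"
    "author A Student <student@example.com> 0 +0000\n"
    "committer A Student <student@example.com> 0 +0000\n"
    "\n"
    "toy commit\n"
).encode()
commit_key = object_key("commit", commit_body)

print("blob  ", blob_key)
print("tree  ", tree_key)
print("commit", commit_key)
```

The file name lives in the tree, not in the blob, and the commit only names one tree. Changing the file content changes the blob key, which changes the tree key, which changes the commit key.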
12.10. Examining git objects
Which of the following commands that we have seen so far is a plumbing command?
[ ] git commit
[ ] git push
[x] git cat-file
[ ] git pull
git cat-file -t
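To see where the type comes from, here is a sketch of reading an object by hand. It assumes the object is stored as an individual zlib-compressed file under .git/objects (objects that git has packed together are stored differently). `parse_loose_object` is a hypothetical helper, and the demo compresses a small example in memory instead of reading a real file.

```python
import zlib


def parse_loose_object(raw: bytes):
    """Decompress an object file and split it into its type, size, and body."""
    data = zlib.decompress(raw)
    header, _, body = data.partition(b"\0")
    kind, size = header.decode().split(" ")
    return kind, int(size), body


# in-memory stand-in for the bytes of a file under .git/objects
raw = zlib.compress(b"blob 13\0test content\n")
kind, size, body = parse_loose_object(raw)
print(kind)  # the type, as git cat-file -t would report it
print(size)
print(body)
```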
12.11. Prepare for Next Class
Take a few minutes to think about what you know about hashing and numbers. Create hash_num_prep.md with two sections: ## Hashing, with a few bullet points summarizing key points about hashing, and ## Numbers, with what types of number representations you know.
Review notes from How do git branches work. Focus on resolving merge conflicts in preparation for the next lab.
Review notes from What is a commit & What is git (both notes should be fixed by 3/10) to be fully prepared for this class after a nice break. Bring questions if you come up with any. You may qualify for a community badge if you post or contribute in a discussion thread related to the concepts mentioned in those classes and ask meaningful questions that your classmates feel intrigued to discuss.
12.12. Experience Report Evidence
Append the contents of one of your trees or commits and one blob or tree inside of that first one to the bottom of your experience report.
12.13. Badges
Read about different workflows in git and describe which one you prefer to work with and why in favorite_git_workflow.md in your kwl repo. Two good places to read from are Git Book and the Atlassian Docs
Update your kwl chart with what you have learned or new questions in the want to know column
Separate from what you added in the previous step, add rows for git branches, merge conflicts, and commits to your kwl table and fill them out. Try to include an older understanding about them in the know column and a newer understanding in the learned column
In commit_contents.md, redirect the content of your most recent commit, its tree, and the contents of one blob. Edit the file or use echo to put markdown headings between the different objects. Add a title, # Complete Commit, to the file and at the bottom of the file add a ## Reflection subheading with some notes on how, if at all, this exercise helps you understand what a commit is.
git log to see the most recent commit
git cat-file that hash
git cat-file the tree hash
git cat-file the hash of one of those files
Read about different workflows in git and add responses to the below in a workflows.md in your kwl repo. Two good places to read from are Git Book and the Atlassian Docs
Update your kwl chart with what you have learned or new questions in the want to know column
Separate from what you added in the previous step, add rows for git branches, merge conflicts, and commits to your kwl table and fill them out. Try to include an older understanding about them in the know column and a newer understanding in the learned column
Add the hash of the content of your completed workflows.md file to the comment of your badge PR for this badge. Try to do this from your local CLI, but you will get full credit even if you use the website interface
## Workflow Reflection
1. Why is it important that git can be used with different workflows?
1. Which workflow do you think you would like to work with best and why?
1. Describe a scenario that might make it better for the whole team to use a workflow other than the one you prefer.