8. What is a commit?#
8.1. Defining terms#
A commit is the most important unit of git. Later we will talk about what git as a whole is in more detail, but understanding a commit is essential to understanding how to fix things using git.
In CS we often have multiple, overlapping definitions for a term depending on our goal.
In intro classes, we try really hard to only use one definition for each term to let you focus.
Now we need to contend with multiple definitions.
These definitions could be based on:
what it conceptually represents
its role in a larger system
what its parts are
how it is implemented
For a commit, today we are going to go through all of these, with a lighter treatment of the implementation for now and more detail later.
8.2. Conceptually, a commit is a snapshot#
git takes a full snapshot of the repo at each commit.
Under the hood, git only stores a new copy of the files that have changed. It uses the same technique to store every snapshot: each file's content is saved under a hash of that content, so a file that has not changed matches an object git already has and does not create a new file inside of git.
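To make the idea of "unchanged files do not create new files" concrete, here is a minimal sketch of a content-addressed store. It is not how git is actually written: the `store_snapshot` function and the dictionary `store` are hypothetical names for illustration, and real git hashes a small header together with the content rather than the content alone.

```python
import hashlib

def store_snapshot(store, files):
    """Save each file's content under the hash of that content.

    store: dict mapping hash -> content (a stand-in for git's object storage)
    files: dict mapping path -> content as bytes
    Returns the snapshot, a dict mapping path -> hash.
    """
    snapshot = {}
    for path, content in files.items():
        key = hashlib.sha1(content).hexdigest()
        # setdefault only adds an entry when this content has not been seen before
        store.setdefault(key, content)
        snapshot[path] = key
    return snapshot

store = {}
first = store_snapshot(store, {"README.md": b"hello\n", "LICENSE.md": b"MIT\n"})
objects_after_first = len(store)

# Second snapshot: only README.md changed, LICENSE.md is identical
second = store_snapshot(store, {"README.md": b"hello again\n", "LICENSE.md": b"MIT\n"})
objects_after_second = len(store)

print(objects_after_first, objects_after_second)  # 2 3
```

Both snapshots list every file, but the second one only added one new object to the store, because the unchanged file hashed to a key that already existed.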
8.3. A commit’s role is central to git#
a commit is the basic unit of what git manages
All other git things are defined relative to commits
branches are pointers to commits that move
tags are pointers to commits that do not move
trees are how file path/organization information is stored for a commit
blobs are how file contents are stored when a commit is made
8.4. Parts of a commit#
We will learn about the structure of a commit by inspecting it.
First we will go back to our gh-inclass repo:
cd Documents/inclass/systems/gh-inclass-brownsarahm/
We can use git log to view past commits:
git log
commit 1e2ab9259651a73ad277e826d602514d28969c86 (HEAD -> organization)
Author: Sarah M Brown <brownsarahm@uri.edu>
Date: Tue Sep 24 13:30:44 2024 -0400

    include readmen content

commit 87c72aeca9bd16700fc8fd8ee719136c13e83e01 (origin/organization)
Author: Sarah M Brown <brownsarahm@uri.edu>
Date: Tue Sep 24 13:23:36 2024 -0400

    stop tracking

commit d2d1fac72642204bfdcebc7703d786615b8de934
Author: Sarah M Brown <brownsarahm@uri.edu>
Date: Tue Sep 24 13:16:10 2024 -0400

    organized files into foleders and ignore private

commit a3904a0a5e7adbcbf9fe439c387fb4dbd7846c51
Author: Sarah M Brown <brownsarahm@uri.edu>
Date: Tue Sep 24 12:46:19 2024 -0400

    Revert "start organizing"

    This reverts commit 9120d9d88aa587e4ffda1ee9aa8c3dcf8f764f7e.
Here we see some parts:
hash (the long alphanumeric string)
merge parents (only shown if the commit is a merge)
author
time stamp
message
But we know commits are supposed to represent some content, and we have no information about that in this view.
The hash is the unique identifier of each commit.
We can view individual commits with git cat-file and at least 4 characters of the hash, or as many as it takes to be unique. We will try 4 characters, and I will use the most recent commit shown above (1e2ab92...).
git cat-file has different modes:
-p for pretty print
-t to return the type
git cat-file -p 1e2a
tree 7c055c5ff9309a982982db0b890bc2a02926d7e3
parent 87c72aeca9bd16700fc8fd8ee719136c13e83e01
author Sarah M Brown <brownsarahm@uri.edu> 1727199044 -0400
committer Sarah M Brown <brownsarahm@uri.edu> 1727199044 -0400
include readmen content
Here we see the actual parts of a commit file:
a pointer to a tree
a pointer to a parent commit (the parent line)
author info with timestamp
committer info with timestamp
commit message
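As a rough illustration of those parts, the sketch below splits the text that git cat-file -p prints for a commit into its fields. The `parse_commit` function is a hypothetical helper, not part of git; it assumes the header lines and the message are separated by one blank line, which is how git prints a commit object.

```python
def parse_commit(text):
    """Split pretty-printed commit text into its fields."""
    # Header lines come first, then a blank line, then the message
    header, _, message = text.partition("\n\n")
    fields = {"parent": []}  # a merge commit has more than one parent line
    for line in header.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            fields["parent"].append(value)
        else:
            fields[key] = value
    fields["message"] = message.strip()
    return fields

# The commit we inspected above, copied from the cat-file output
raw = (
    "tree 7c055c5ff9309a982982db0b890bc2a02926d7e3\n"
    "parent 87c72aeca9bd16700fc8fd8ee719136c13e83e01\n"
    "author Sarah M Brown <brownsarahm@uri.edu> 1727199044 -0400\n"
    "committer Sarah M Brown <brownsarahm@uri.edu> 1727199044 -0400\n"
    "\n"
    "include readmen content\n"
)
commit = parse_commit(raw)
print(commit["tree"])
print(commit["parent"])
print(commit["message"])
```

Notice that nothing in here is file content: the commit only points to a tree, and the tree is where the snapshot lives.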
8.4.1. Commit parents help us trace back#
Each commit points to its parent, so the history is kind of like a linked list.
We can use the hash of the parent in the output above to inspect the previous commit:
git cat-file -p 87c7
tree 06895b0f89062a5d9d12b5de5a068dc253f27092
parent d2d1fac72642204bfdcebc7703d786615b8de934
author Sarah M Brown <brownsarahm@uri.edu> 1727198616 -0400
committer Sarah M Brown <brownsarahm@uri.edu> 1727198616 -0400
stop tracking
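The linked list idea can be sketched by following parent pointers until there are none left. The `parents` dictionary below is a hypothetical stand-in for git's storage that uses shortened hashes from our log. The first two links come from the cat-file output above; the link from d2d1fac to a3904a0 is an assumption based on the order of the log, and the real history continues past a3904a0.

```python
# Hypothetical in-memory stand-in: commit hash -> parent hash
parents = {
    "1e2ab92": "87c72ae",  # seen in the cat-file output above
    "87c72ae": "d2d1fac",  # seen in the cat-file output above
    "d2d1fac": "a3904a0",  # assumption based on the log order
    "a3904a0": None,       # assumption: we stop here for the example
}

def history(start, parents):
    """Walk from a commit back through its parents, like traversing a linked list."""
    chain = []
    current = start
    while current is not None:
        chain.append(current)
        current = parents.get(current)
    return chain

print(history("1e2ab92", parents))
```

This only handles commits with a single parent; a merge commit has two or more parents, which makes the real structure a graph rather than a simple list.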
8.4.2. Commit trees are the hash of the content#
The snapshot is stored via a tree. We can use git cat-file to look at the tree object too.
The tree being a separate object from the overall commit allows us to “edit” a message or “change” the parent of a commit; we actually make a new commit with the same tree.
Let’s look at the tree for that commit.
git cat-file -p 0689
040000 tree 263fb9d22090e88edd2bf1847c24c3511de91b49 .github
100644 blob 9fdc6b1b8d6b0916ef50b0a37e8c31999117016d .gitignore
100644 blob 9ece5efa25710c8fad7d9f210928785b5362b06f CONTRIBUTING.md
100644 blob 2d232a2231c650dc4094606797fe0bd3e0ce4c65 LICENSE.md
100644 blob b8eb6e89c6295e574ee5e3363d51c917a16797ff README.md
040000 tree f596404cd28ea4bad49ff73fb4884049ab0e31f2 docs
100644 blob 39d5708913a6c708d1a505cde6da544785c086a6 setup.py
040000 tree 8c3cc97ca6446c270ca0b8f7d4ce640a6e81e468 src
040000 tree d3980efccf4856f0c61a6a16ed40be534c5230a5 tests
In this output we have several columns:
mode (indicates a normal file or a directory in the working directory)
git object type (blob or tree)
hash of the object
its file name in the working directory
For the LICENSE.md line, we all have the same hash (as long as you picked a commit and tree after that file was created). This is because the hash is of the contents, and our files all have the same contents.
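To see why identical contents always give the same hash, here is a small sketch of how a blob hash can be computed by hand. It assumes a repository that uses SHA-1 (the default): git hashes the word blob, a space, the size in bytes, a null byte, and then the content. The function name `git_blob_hash` is made up for this example.

```python
import hashlib

def git_blob_hash(content: bytes) -> str:
    """Compute the hash git gives a blob with this content (SHA-1 repositories)."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_hash(b"test content\n"))
# d670460b4b4aece5915caf5c68d12f560a9fe3e4
```

The file name is not part of the input, so two files with the same contents get the same hash no matter what they are called or whose computer they are on.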
8.4.3. Trees point to blobs of the file content#
We can also use git cat-file to view a blob.
git cat-file -p 2d23
the info on how the code can be reused
8.5. Commits are implemented as files#
Commits are stored in the .git directory as files. git itself is a file system, or a way of storing information.
Everything the git program uses is stored in the .git directory. You can think of it like all of the variables the program would need if it ran all the time.
ls .git
COMMIT_EDITMSG REBASE_HEAD index packed-refs
FETCH_HEAD config info refs
HEAD description logs
ORIG_HEAD hooks objects
The ones in all caps are mostly simple pointers (COMMIT_EDITMSG instead holds the most recent commit message), and the others use other formats.
Recall that we have seen the HEAD pointer before:
cat .git/HEAD
ref: refs/heads/organization
This file stores our current branch.
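Since HEAD is just a small text file, reading the current branch out of it takes only a few lines. This is a sketch with a hypothetical `current_branch` helper; it assumes HEAD either contains a line starting with "ref: refs/heads/" or, in a detached HEAD state, a commit hash.

```python
def current_branch(head_text):
    """Return the branch name stored in the text of .git/HEAD, or None if detached."""
    head_text = head_text.strip()
    prefix = "ref: refs/heads/"
    if head_text.startswith(prefix):
        return head_text[len(prefix):]
    # Detached HEAD: the file holds a commit hash instead of a branch reference
    return None

print(current_branch("ref: refs/heads/organization\n"))  # organization
```

In a real repo you could pass it the contents of the file, for example `current_branch(open(".git/HEAD").read())`.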
Most of the content is in the objects folder; git objects are the items that get stored.
ls .git/objects/
06 29 46 72 93 ab c7 e9
0c 2d 4c 76 94 b0 ca f1
0e 38 5b 7a 99 b1 cb f5
10 39 5f 7c 9d b8 d2 f9
19 3a 62 85 9e c0 d3 info
1e 3c 63 87 9f c2 d8 pack
1f 3d 66 8c a3 c3 dd
25 45 70 91 a8 c5 e0
We see a lot more folders here than we had commits. This is because commits are only one type of object: trees and blobs are stored here too (git also has a fourth type, the annotated tag object).
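The two-character folder names are the first two characters of each object's hash, and the rest of the hash is the file name inside that folder. A short sketch of that mapping, using a hypothetical `object_path` helper and the hash of our most recent commit:

```python
from pathlib import PurePosixPath

def object_path(full_hash):
    """Where git stores a loose object: first 2 hash characters are the folder."""
    return PurePosixPath(".git/objects") / full_hash[:2] / full_hash[2:]

print(object_path("1e2ab9259651a73ad277e826d602514d28969c86"))
# .git/objects/1e/2ab9259651a73ad277e826d602514d28969c86
```

You can check this against the listing above: there is a 1e folder for this commit, and an 87 folder for its parent.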
8.5.1. A commit is a type of git object#
This is a class diagram for the git objects:
[Figure: class diagram of the git objects]
cat .git/objects/29/245e4b9cce937fb9e50bc3762ab19c6a7a12c3
x%?A
?0Fa?9?nt!?]? *(
??x?1??`Ld2???V?????eS/???P???1?aLL?EUT???!=????fu??~?
??.???x?TItƤ???|)?>?'#?Fܢhϔ?%?Cu?ڮ.??ђGb?????|Ez8
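The output of cat is unreadable because git compresses each object with zlib before writing it to disk. The sketch below builds a small object the way git lays one out (type, space, size, null byte, content), compresses it, and reads it back. The helper names `make_loose_object` and `read_loose_object` are hypothetical; only the zlib calls are real library functions.

```python
import zlib

def make_loose_object(obj_type: str, content: bytes) -> bytes:
    """Build the compressed bytes for an object: header, null byte, content."""
    header = obj_type.encode() + b" " + str(len(content)).encode() + b"\0"
    return zlib.compress(header + content)

def read_loose_object(raw: bytes):
    """Decompress object bytes and split them into (type, size, content)."""
    data = zlib.decompress(raw)
    header, _, content = data.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    return obj_type.decode(), int(size), content

raw = make_loose_object("blob", b"# Sarah Brown\n")
print(raw[:1])                 # b'x', the same first character as the cat output
print(read_loose_object(raw))  # ('blob', 14, b'# Sarah Brown\n')
```

To try it on a real object, you could read the file in binary mode, for example `read_loose_object(open(path, "rb").read())`, where `path` is one of the files under .git/objects. git cat-file does this decompression for us, which is why we use it instead of cat.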
git cat-file -t 2924
blob
git cat-file -p 2924
# Sarah Brown
tenure year: 2027
- i skied competiively in hs
<<<<<<< HEAD
- i started at uri in 2020
=======
- i went to Northeastern
>>>>>>> 62dcf61 (local second fun fact)
git log
commit 1e2ab9259651a73ad277e826d602514d28969c86 (HEAD -> organization)
Author: Sarah M Brown <brownsarahm@uri.edu>
Date: Tue Sep 24 13:30:44 2024 -0400
include readmen content
commit 87c72aeca9bd16700fc8fd8ee719136c13e83e01 (origin/organization)
Author: Sarah M Brown <brownsarahm@uri.edu>
Date: Tue Sep 24 13:23:36 2024 -0400
stop tracking
commit d2d1fac72642204bfdcebc7703d786615b8de934
Author: Sarah M Brown <brownsarahm@uri.edu>
Date: Tue Sep 24 13:16:10 2024 -0400
organized files into foleders and ignore private
commit a3904a0a5e7adbcbf9fe439c387fb4dbd7846c51
Author: Sarah M Brown <brownsarahm@uri.edu>
Date: Tue Sep 24 12:46:19 2024 -0400
Revert "start organizing"
This reverts commit 9120d9d88aa587e4ffda1ee9aa8c3dcf8f764f7e.
commit 4ceb1500582236e98bdb141116821a5857f75a76
git status
On branch organization
Your branch is ahead of 'origin/organization' by 1 commit.
(use "git push" to publish your local commits)
nothing to commit, working tree clean
ls
CONTRIBUTING.md README.md scratch.ipynb src
LICENSE.md docs setup.py tests
8.6. Commit messages are essential#
A git commit message must exist and is always for people, but can also be for machines.
The Conventional Commits standard is a format for commit messages.
If you use it, you can use automated tools to generate a full changelog when you release code.
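As a rough idea of what such tools do, the sketch below parses the first line of each message, which in Conventional Commits has the form type, optional scope in parentheses, optional ! for a breaking change, a colon, and a description. It then groups descriptions by type. The regular expression and the `parse_header` and `changelog` functions are simplified assumptions for illustration, not the behavior of any particular tool.

```python
import re

# Simplified pattern for "type(scope)!: description"; scope and ! are optional
HEADER = re.compile(
    r"^(?P<type>\w+)(\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<description>.+)$"
)

def parse_header(line):
    """Parse one commit header; return None if it does not follow the format."""
    match = HEADER.match(line)
    if match is None:
        return None
    return {
        "type": match.group("type"),
        "scope": match.group("scope"),
        "breaking": match.group("breaking") is not None,
        "description": match.group("description"),
    }

def changelog(messages):
    """Group commit descriptions by type, skipping messages that do not parse."""
    sections = {}
    for message in messages:
        first_line = message.split("\n", 1)[0]
        parsed = parse_header(first_line)
        if parsed is not None:
            sections.setdefault(parsed["type"], []).append(parsed["description"])
    return sections

log = ["feat: add login", "fix: handle empty input", "feat!: drop old API", "stop tracking"]
print(changelog(log))
```

A message like "stop tracking" from our own log does not match the format, so a tool like this has no way to know which section of the changelog it belongs in.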
8.8. Badges#
Export your git log for your KWL main branch to a file called gitlog.txt and commit that as exported to the branch for this issue. Note that you will need to work between two branches to make this happen. Append a blank line, the heading ## Commands, and another blank line to the file, then append the command history used for this exercise to the end of the file.
In commit-def.md compare two of the four ways we described a commit today in class. How do the two descriptions differ? How does defining it in different ways help add up to improve your understanding?
Explore the tools for conventional commits and then pick one to try out. Work on the branch for this badge and use one of the tools that helps with making conventional commits (e.g. in VSCode or a CLI for it) for a series of commits adding “features” and “bug fixes” telling the story of a code project in a file called commit-story.md. For each edit, add short phrases like ‘new feature 1’, or ‘next bug fix’ to the single file each time, but use conventional commits for each commit. In total make at least 5 different types of changes (types per the conventional commits standard) including 2 breaking changes and at least 10 total commits to the file.
Learn about options for how git can display commit history. Try out a few different options. Choose two and write them both to a file, gitlog-compare.md. Using a text editor, wrap each log with three backticks to make them “code blocks” and then add text to the file describing a use case where that format in particular would be helpful. Do this after the above so that your git log examples include your conventional commits.
8.9. Experience Report Evidence#
8.10. Questions After Today’s Class#
8.10.1. Besides a linked list, what other data structures or algorithms does git use on its inner workings?#
The main algorithm it uses is hashing.
More detail here is a good explore badge.
8.10.2. Why did running ls .git/objects show no files even though I have a commit history?#
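git sometimes packs objects into packfiles (stored in .git/objects/pack) instead of keeping one file per object, for example after garbage collection. The objects still exist; they are just stored inside the pack. This is rare, but can happen.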
8.10.3. Is the only reason I would need to sign a commit is to show that I am the one dispersing it so it doesn’t seem like spam or a virus? or should signing commits become a frequent practice?#
The answer is yes to both questions.
8.10.4. When I entered “git cat-file -p 39d5” in the command line, the result was “file with function with instructions for pip”. What does this mean?#
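Your blob object 39d5 is a file with that content:
file with function with instructions for pip
The blob itself does not store a filename, so from this output alone we do not know what the file is called. The filename comes from the tree that points to the blob; in the tree we inspected above, 39d5 is listed as setup.py.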