7. What is a commit?#

7.1. Viewing Commit History#


Navigate to your in-class repo


We can see commits with git log

git log

commit 162a47b31c7a73969ce9cbcefd206c279a37433d (HEAD -> main, origin/main, origin/HEAD)
Author: Ayman Sandouk <111829133+AymanBx@users.noreply.github.com>
Date:   Thu Feb 13 11:32:48 2025 -0500

    Create README.md

commit b0b24aa53537557c582b6f61d409e9b2ee91333f
Merge: 8040553 9942c3c
Author: AymanBx <ayman_sandouk@uri.edu>
Date:   Tue Feb 11 13:35:41 2025 -0500

    Keeping both fun facts

commit 9942c3c00dcde51d52a1c110413522043eca07cd
Author: Ayman Sandouk <111829133+AymanBx@users.noreply.github.com>
Date:   Tue Feb 11 13:24:03 2025 -0500

    Update about.md

commit 804055399f6565aca3cb188fe771ef2dca99f959 (fun_fact)
Author: AymanBx <ayman_sandouk@uri.edu>
Date:   Tue Feb 11 13:20:41 2025 -0500

    Added fun fact

commit 8bd4ea38fe31186b9e5d0c1e19c9aef748c6dce6 (my_branch)
Merge: e427044 9e8a8b6
Author: Ayman Sandouk <111829133+AymanBx@users.noreply.github.com>
Date:   Tue Feb 11 12:55:21 2025 -0500

    Merge pull request #2 from compsys-progtools/1-create-an-about-file

    Created an about file. Closes #1

commit e427044744061f5f38b0c3d1ff76b56552e8355d
Author: AymanBx <ayman_sandouk@uri.edu>
Date:   Tue Feb 11 00:43:36 2025 -0500

    removed file
.
.
.

The logs are soreted in a latest to oldest manner


this is a program, we can use enter/down arrow to move through it and then q to exit.


Mine will look a bit different that yours because I filled out my README.md later


7.1.1. How does that help?#


Let’s compare

git switch fun_fact

Switched to branch 'fun_fact'

git log

commit 804055399f6565aca3cb188fe771ef2dca99f959 (HEAD -> fun_fact)
Author: AymanBx <ayman_sandouk@uri.edu>
Date:   Tue Feb 11 13:20:41 2025 -0500

    Added fun fact

commit 8bd4ea38fe31186b9e5d0c1e19c9aef748c6dce6 (my_branch)
Merge: e427044 9e8a8b6
Author: Ayman Sandouk <111829133+AymanBx@users.noreply.github.com>
Date:   Tue Feb 11 12:55:21 2025 -0500

    Merge pull request #2 from compsys-progtools/1-create-an-about-file

    Created an about file. Closes #1

commit e427044744061f5f38b0c3d1ff76b56552e8355d
Author: AymanBx <ayman_sandouk@uri.edu>
Date:   Tue Feb 11 00:43:36 2025 -0500

    removed file

Notice what the parts of each log are


  • commit: The specific checkpoint of the state of your project. It holds some metadata within it

  • Author: The author of the commit.

    • Notice if you look at the different commits in your logs, you will find two different authors.

    • One that is <your_username>@users.noreply.github.com

    • The other is Your name and email (however your have configured them locally in git)

  • Date: Date and time of the commit

  • Content: The message you (or GitHub) added to the commit exoplaining what happened in that particular commit


What do we notice when we compare both logs?


In branch fun_fact the latest commit that branch knows about is the one with the commit message “Added fun fact”. Everything below matches mostly logs of the main branch. Whereas the logs in main show two more commits that are more recent (three im my case)

  • One with the message “Update about.md” and the author shows my GitHub user tag (with the @user)

  • The other has the message “Keeping both fun facts” with the author being my locally configered user data (Yes, I used my GitHub username as my name because at the time I configerred my username locally I didn’t know they didn’t HAVE to match)

  • We also notice a Merge tag on that last commit with some numbers next to it


Let’s try to make sense of the number in Merge:

8040553 9942c3c

Notice the individual commit identifiers (hash) for the two previous commits 9942c3c00dcde51d52a1c110413522043eca07cd 804055399f6565aca3cb188fe771ef2dca99f959


One more thing we will notice:

  • At the latest commit in the fun_fact branch we see next to the commit “hash” (HEAD -> fun_fact)

  • Next to the commit hash below it we only see (my_branch)

Whereas in the main branch logs we see:

  • Next to the latest commit (HEAD -> main, origin/main, origin/HEAD)

  • Next to the latest commit fun_fact branch knows we see (fun_fact). Notice HEAD has been moved

  • Next to the commit hash below it we only see (my_branch)


Branches are pointers, each one is located (pointing) at a particular commit.


Instead of storing important things in variables that only have scope of how long the program is active git stores its imporant information in files that it reads each time we run a command. All of git’s data is in the .git directory.


Everything the git program uses is stored in the .git directory, you can think of that like all of the variables the program would need if it ran all the time.

ls .git/

COMMIT_EDITMSG	REBASE_HEAD	index		packed-refs
FETCH_HEAD	config		info		refs
HEAD		description	logs
ORIG_HEAD	hooks		objects

the ones in all caps are simple pointers and the others are other formats.


cd .git/HEAD

bash: cd: .git/HEAD: Not a directory

7.2. What is a commit?#


7.2.1. Defining terms#

A commit is the most important unit of git. Later we will talk about what git as a whole is in more detail, but understanding a commit is essential to understanding how to fix things using git.


In CS we often have multiple, overlapping definitions for a term depending on our goal.

In intro classes, we try really hard to only use one definition for each term to let you focus.

Now we need to contend with multiple definitions


These definitons could be based on

  • what it conceptually represents

  • its role in a larger system

  • what its parts are

  • how it is implemented


for a commit, today, we are going to go through all of these, with lighter treatment on the implementation for today, and more detail later.


7.3. Conceptually, a commit is a snapshot#

storing data as snapshots over time

git takes a full snapshot of the repo at each commit.

Under the hood, it only makes a new copy of files that have changed because it uses the same technique to store each snapshot, so any files that have not changed, do not create new files inside of git.


7.4. A commit’s role is central to git#

a commit is the basic unit of what git manages

All other git things are defined relative to commits

  • branches are pointers to commits that move

  • tags are pointers to commits that do not move

  • trees are how file path/organization information is stored for a commit

  • blobs are how files contents are stored when a commit is made


7.5. Parts of a commit#

We will learn about the structure of a commit by inspecting it.


First we will go back to our gh-inclass repo

 bash
:tags: ["skip-executio

cd Documents/inclass/systems/gh-inclass-sp24-brownsarahm/


---

We can use `git log` to view past commits

```bash
git log

<log_here>

here we see some parts:

  • hash (the long alphanumeric string)

  • (if merge)

  • author

  • time stamp

  • message

but we know commits are supposed to represent some content and we have no information about that in this view


the hash is the unique identifier of each commit


we can view individual commits with git cat-file and at least 4 characters of the hash or enough to be unique. We will try 4 characters and I will use the last (my second to last) visible commit above (b0b24aa53537557c582b6f61d409e9b2ee91333f)

git cat-file has different modes:

  • -p for pretty print

  • -t to return the type


```with `git cat-file` and at least 4 
characters of the hash or enough to be unique.  We will try 4 characters 
and I will use the last (my second to last) visible commit above (`b0b24aa53537557c582b6f61d409e9b2ee91333f`)

`git cat-file` has  different modes:
- `-p` for pretty print
- `-t` to return the type
  
```{code-cell} bash
git cat-file -p 1e2a 

:emphasize-lines: 2,3
tree a9ebb362bb6ccac3d4cd637d4afa34d39a874a9b
parent 804055399f6565aca3cb188fe771ef2dca99f959
parent 9942c3c00dcde51d52a1c110413522043eca07cd
author AymanBx <ayman_sandouk@uri.edu> 1739298941 -0500
committer AymanBx <ayman_sandouk@uri.edu> 1739298941 -0500

Keeping both fun facts

Here we see the actual parts of a commit file:

  • a pointer to a tree

  • a pointer to two parent commits (because this was a merge) (highlighted)

  • author info with timestamp

  • committer info with timestamp

  • commit message


7.5.1. What is the PGP signature?#

Signed commits are extra authentication that you are who you say you are.

The commits that are labeled with the verified tag on GitHub.com


If we pick a commit from the history on GitHub that does not have verified on it, then we can see it does not have the PGP signature


7.6. Commit parents help us trace back#

kind of like a linked list



:emphasize-lines: 2
tree 657c82a4b3c0aca11df1d3f9f3db46f067753120
parent 8bd4ea38fe31186b9e5d0c1e19c9aef748c6dce6
author AymanBx <ayman_sandouk@uri.edu> 1739298041 -0500
committer AymanBx <ayman_sandouk@uri.edu> 1739298041 -0500

Added fun fact

7.6.1. Commit trees are the hash of the content#

This separation is helpful.


The snapshot is stored via a tree, we can use git cat-file to look at the tree object too.

The tree being a separate object from the overall commit allows us to be able to “edit” a message or “change” the parent of a commit; we actually make a new commit with the same tree.


let’s look at the tree for that commit.


```bash
:tags: ["skip-execution"]
git cat-file -p 0689

 console
:emphasize-lines: 4
040000 tree 263fb9d22090e88edd2bf1847c24c3511de91b49	.github
100644 blob 9fdc6b1b8d6b0916ef50b0a37e8c31999117016d	.gitignore
100644 blob 9ece5efa25710c8fad7d9f210928785b5362b06f	CONTRIBUTING.md
100644 blob 2d232a2231c650dc4094606797fe0bd3e0ce4c65	LICENSE.md
100644 blob b8eb6e89c6295e574ee5e3363d51c917a16797ff	README.md
040000 tree f596404cd28ea4bad49ff73fb4884049ab0e31f2	docs
100644 blob 39d5708913a6c708d1a505cde6da544785c086a6	setup.py
040000 tree 8c3cc97ca6446c270ca0b8f7d4ce640a6e81e468	src
040000 tree d3980efccf4856f0c61a6a16ed40be53
```230a5	tests

in this we have several columns:

  • mode (indicates normla file or directory in the working directory)

  • git object type (block or tree)

  • hash of the object

  • its file name in the working directory

The highlighted line for LICENSE.md we all have the same hash (as long as you picked a commit and tree after that file was created). This is because the hashis of the contents and the files all do have the same contents


7.6.2. Trees point to blobs of the file content#

We can also use git cat-file to view a blob.


```o use `git cat-file` to view a blob. 
```{code-cell} bash
:tags: ["skip-execution"]
git cat-file -p 2d23

 console
the info on how the code
```n be reused

 bash
:tags: ["skip
```ecution"]

++{“lesson_part”:“main”}

7.7. Commits are implemented as files#

commits are stored in the .git directory as files. git itself is a file system, or a way of storing information.


Everything the git program uses is stored in the .git directory, you can think of that like all of the variables the program would need if it ran all the time.


 bash
:tags: ["skip-execut
```"]
ls .git

 console
COMMIT_EDITMSG	REBASE_HEAD	index		packed-refs
FETCH_HEAD	config		info		refs
HEAD		description	logs
ORIG_HEAD	
```ks		objects

the ones in all caps are simple pointers and the others are other formats.


Most of the content is in the objects folder, git objects are the items that get stored.


Recall, we had seen the HEAD pointer before


```{code-cell} bash
:tags: ["skip-execution"]
cat .git/HEAD

 console
ref: refs/head
```rganization

which stores our current branch


 bash
:tags: ["skip-execution"]
ls 
```t/objects/

 console
06	29	46	72	93	ab	c7	e9
0c	2d	4c	76	94	b0	ca	f1
0e	38	5b	7a	99	b1	cb	f5
10	39	5f	7c	9d	b8	d2	f9
19	3a	62	85	9e	c0	d3	info
1e	3c	63	87	9f	c2	d8	pack
1f	3d	66	8c	a3	c3	dd
25	45	
```91	a8	c5	e0

We see a lot more folders here than we had commits. This is because there are three types of objects.



 bash
:tags: ["skip-execution"]
cat .git/objects/29/245e4b9cce937fb9e50bc3762a
```c6a7a12c3 

 console
x%?A
?0Fa?9?nt!?]? *(
??x?1??`Ld2???V?????eS/???P???1?aLL?EUT???!=????fu??~?
                                                      ??.???x?TItƤ???|)?>?'#?Fܢhϔ?%?Cu?ڮ.?
```Gb?????|Ez8

 bash
git cat
```le -t 2924


```onsole
blob

 bash
:tags: ["skip-execution"]
git cat
```le -p 2924

 console
<last blob
```ntent here>

 bash
:tags: ["skip-execut
```"]
git log


```nsole
<log>

 bash
:tags: ["skip-execution
```git status

 console
On branch organization
Your branch is ahead of 'origin/organization' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, work
``` tree clean

 bash
:tags: ["skip-e
```ution"]
ls

 console
CONTRIBUTING.md	README.md	scratch.ipynb	src
LICENSE.md	docs		
```up.py	tests

7.8. Prepare for next class#

Examine an open source software project and fill in the template below in a file called software.md in your kwl repo on a branch that is linked to this issue. You do not need to try to understand how the code works for this exercise, but instead focus on how the repo is set up, what additional information is in there beyond the code. You may pick any mature open source project, meaning a project with recent commits, active PRs and issues, multiple contributors. In class we will have a discussion and you will compare what you found with people who examined a different project. Coordinate with peers (eg using the class discussion or in lab time) to look at different projects in order to discuss together in class.

## Software Reflection

Project : <markdown link to repo>

## README

<!-- what is in the readme? how well does it help you  -->

## Contents 

<!-- denote here types of files (code, what languages, what other files) -->


## Automation

<!-- comment on what types of stuff is in the .github directory -->

## Documentation

<!-- what support for users? what for developers? code of conduct? citation? -->

## Hidden files and support
 <!-- What type of things are in the hidden files? who would need to see those files vs not? -->

Some open source projects if you do not have one in mind:

7.9. Experience Report Evidence#

redirect your history to a file log-2024-02-08.txt and include it with your experience report.

7.10. Badges#

  1. Export your git log for your KWL main branch to a file called gitlog.txt and commit that as exported to the branch for this issue. note that you will need to work between two branchse to make this happen. Append a blank line, ## Commands, and another blank line to the file, then the command history used for this exercise to the end of the file.

  2. In commit-def.md compare two of the four ways we described a commit today in class. How do the two descriptions differ? How does defining it in different ways help add up to improve your understanding?

  1. Explore the tools for conventional commits and then pick one to try out. Work on the branch for this badge and use one of the tools that helps making conventional commits (eg in VSCode or a CLI for it)for a series of commits adding “features” and “bug fixes” telling the story of a code project in a file called commit-story.md. For each edit, add short phrases like ‘new feature 1’, or ‘next bug fix’ to the single file each time, but use conventional commits for each commit. In total make at least 5 different types of changes (types per conventional commits standard) including 2 breaking changes and at least 10 total commits to the file.

  2. learn about options for how git can display commit history. Try out a few different options. Choose two, write them both to a file, gitlog-compare.md. Using a text editor, wrap each log with three backticks to make them “code blocks” and then add text to the file describing a use case where that format in particular would be helpful. do this after the above so that your git log examples include your conventional commits