7. How do programmers communicate about code?#

Tip

Check whether your codespace has uncommitted changes at github.com/codespaces

Note

Today we pick up where we left off: talking about conventional commits.

They are a core example of the detailed communication that is embedded in programming work.

7.1. Why Documentation#

Today we will talk about documentation. It is important for several reasons:

  • using official documentation is the best way to get better at the tools

  • understanding how documentation is designed and built will help you use it better

  • writing and maintaining documentation is an important part of working on a team

  • documentation building tools are a type of developer tool (and these are generally good examples of software design)

Design is best learned from examples. Some of the best examples of software design come from developer tools.

In particular documentation tools are really good examples of:

  • pattern matching

  • modularity and abstraction

  • automation

  • the build process beyond compiling

By the end of today’s class you will be able to:

  • describe different types of documentation

  • find different information in a code repo

  • generate documentation as html

  • ignore content from a repo

  • create a repo locally and push to GitHub

7.2. What is documentation#

documentation types table

from an ethnography of documentation in data science

7.2.1. Why is documentation so important?#

We should probably spend more time on it.

difference in time spent on documentation vs. time that should be spent

via source

7.3. So, how do we do it?#

Documentation Tools

Write the Docs

The Linux kernel uses Sphinx; here is why, and how it works

7.4. Jupyterbook#

Jupyter Book wraps Sphinx and uses markdown instead of reStructuredText. The project authors note in the documentation that it “can be thought of as an opinionated distribution of Sphinx”. We’re going to use this.

Navigate to your folder for this course (mine is inclass/systems):

cd Documents/inclass/systems/

We can confirm that jupyter-book is installed by checking the version.

jupyter-book --version
Jupyter Book      : 0.15.1
External ToC      : 0.3.1
MyST-Parser       : 0.18.1
MyST-NB           : 0.17.2
Sphinx Book Theme : 1.0.1
Jupyter-Cache     : 0.6.1
NbClient          : 0.5.13

We will run a command to create a Jupyter Book from a template. The command has three parts:

  • jupyter-book is a program (the thing we installed)

  • create is a subcommand (one action that program can do)

  • tiny-book is an argument (a mandatory input to that action)
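The three parts listed above can be sketched in Python. This is only an illustration of how a command line is split into words; it uses the standard-library shlex module and does not run jupyter-book itself. The variable names are chosen for this example.

```python
import shlex

# Split the command line into words the way a POSIX shell would.
command = "jupyter-book create tiny-book"
program, subcommand, *arguments = shlex.split(command)

print("program:   ", program)      # the thing we installed
print("subcommand:", subcommand)   # one action that program can do
print("arguments: ", arguments)    # inputs to that action
```

Changing the last word of the command changes only the argument; the program and the subcommand stay the same.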

jupyter-book create tiny-book
===============================================================================

Your book template can be found at

    tiny-book/

===============================================================================

We see that it succeeds.

You can make a book with any name, because the name is an argument (an input).

jupyter-book create example
===============================================================================

Your book template can be found at

    example/

===============================================================================

Each one makes a directory, which we can see by listing:

ls
example				tiny-book
gh-inclass-sp24-brownsarahm

And we can delete the second one since we do not actually want it.

rm example/
rm: example/: is a directory

We get an error because, by default, rm only removes files. Deleting a directory means deleting everything inside it, which is potentially risky, so rm is written to refuse unless we ask for that explicitly.

Instead, we tell it two additional things:

  • to delete recursively, with r

  • to force it to do something risky without prompting for confirmation, with f

Note that we can stack single-character options together after a single -

rm -rf example/
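The same distinction between deleting a file and deleting a directory tree shows up in other tools. The sketch below is a Python analogy, not a description of how rm is implemented: os.remove plays the role of plain rm, and shutil.rmtree plays the role of rm -r. It works in a temporary directory so nothing real is deleted.

```python
import os
import shutil
import tempfile

# Work in a throwaway directory so nothing real is deleted.
workdir = tempfile.mkdtemp()
example = os.path.join(workdir, "example")
os.mkdir(example)
with open(os.path.join(example, "intro.md"), "w") as f:
    f.write("# Intro\n")

# Like plain `rm`, os.remove only deletes files, so it refuses a directory.
try:
    os.remove(example)
    removed_as_file = True
except OSError:
    removed_as_file = False

# Like `rm -r`, shutil.rmtree deletes the directory and everything inside it.
shutil.rmtree(example)
still_exists = os.path.exists(example)

shutil.rmtree(workdir)  # clean up the throwaway directory
print(removed_as_file, still_exists)
```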

Next we will go into the folder we made and explore it a bit.

cd tiny-book/

ls -a
.			intro.md		notebooks.ipynb
..			logo.png		references.bib
_config.yml		markdown-notebooks.md	requirements.txt
_toc.yml		markdown.md

7.5. Starting a git repo locally#

We made this folder, but we have not used any git operations on it yet. It is not actually a git repo, which we could tell from the output above (ls -a shows no .git directory), but let’s use git to inspect it and get another hint.

We can try git status:

git status
fatal: not a git repository (or any of the parent directories): .git

This tells us the .git directory is missing from the current directory and all of its parent directories.

To make it a git repo we use git init with the path we want to initialize, which here is . (the current directory).
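The search that the error message describes can be sketched as a walk from the current directory up through its parents. This is a simplified model under stated assumptions: find_repo_root is a name made up for this example, and real git also considers settings such as environment variables that are ignored here.

```python
from pathlib import Path
import tempfile

def find_repo_root(start):
    """Return the nearest directory at or above `start` that contains .git, or None."""
    start = Path(start).resolve()
    for directory in [start, *start.parents]:
        if (directory / ".git").exists():
            return directory
    return None

with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp).resolve() / "tiny-book"
    nested = project / "docs" / "chapter"
    nested.mkdir(parents=True)

    before = find_repo_root(nested)   # tiny-book has no .git yet
    (project / ".git").mkdir()        # stand-in for what `git init` creates
    after = find_repo_root(nested)    # now the walk stops at tiny-book

    found_project = (after == project)
    changed = (before != after)
    print(found_project, changed)
```

This is why git commands work from any subfolder of a repo: the walk finds the .git directory in a parent.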

git init .
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint: 
hint: 	git config --global init.defaultBranch <name>
hint: 
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint: 
hint: 	git branch -m <name>
Initialized empty Git repository in /Users/brownsarahm/Documents/inclass/systems/tiny-book/.git/

Here we are faced with a social aspect of computing that is also a good reminder about how git actually works.

7.5.1. Retiring racist language#

Historically the default branch was called master.

We’ll change our default branch to main:

git branch -m main

and check in with git now

git status
On branch main

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	_config.yml
	_toc.yml
	intro.md
	logo.png
	markdown-notebooks.md
	markdown.md
	notebooks.ipynb
	references.bib
	requirements.txt

nothing added to commit but untracked files present (use "git add" to track)

This time it works, and we see two important things:

  • there are no previous commits

  • all of the files are untracked

Next we will commit the template so that we have it saved as a point we could go back to. First we stage the files:

git add .

We have to add separately: because the files are untracked, we cannot use the -a option on commit, which only stages files that git already tracks.

Then we commit with a simple message:

git commit -m 'jupyter book template'
[main (root-commit) e34f91d] jupyter book template
 9 files changed, 341 insertions(+)
 create mode 100644 _config.yml
 create mode 100644 _toc.yml
 create mode 100644 intro.md
 create mode 100644 logo.png
 create mode 100644 markdown-notebooks.md
 create mode 100644 markdown.md
 create mode 100644 notebooks.ipynb
 create mode 100644 references.bib
 create mode 100644 requirements.txt

7.6. Structure of a Jupyter book#

We will explore the output by looking at the files

ls
_config.yml		logo.png		notebooks.ipynb
_toc.yml		markdown-notebooks.md	references.bib
intro.md		markdown.md		requirements.txt

A Jupyter Book has two required files (_config.yml and _toc.yml), some content files, and some helpers that are common but not required.

  • config defaults

  • toc file formatting rules

  • the *.md files are content

  • the .bib file is bibliography information

  • The other files are optional, but common. requirements.txt is the format pip uses to install Python dependencies. Other languages have different standards for declaring dependencies.

Note

The extension (.yml) is YAML, which stands for “YAML Ain’t Markup Language”. It consists of key-value pairs and is designed to be a human-friendly way to encode data for use in any programming language.

The table of contents file describes how to put the other files in order.

cat _toc.yml 
# Table of contents
# Learn more at https://jupyterbook.org/customize/toc.html

format: jb-book
root: intro
chapters:
- file: markdown
- file: notebooks
- file: markdown-notebooks

The configuration file tells jupyter-book basic information about the book. It provides all of the settings that jupyter-book and Sphinx need to render the content as whatever output format we want.

cat _config.yml 
# Book settings
# Learn more at https://jupyterbook.org/customize/config.html

title: My sample book
author: The Jupyter Book Community
logo: logo.png

# Force re-execution of notebooks on each build.
# See https://jupyterbook.org/content/execute.html
execute:
  execute_notebooks: force

# Define the name of the latex output file for PDF builds
latex:
  latex_documents:
    targetname: book.tex

# Add a bibtex file so that we can create citations
bibtex_bibfiles:
  - references.bib

# Information about where the book exists on the web
repository:
  url: https://github.com/executablebooks/jupyter-book  # Online location of your book
  path_to_book: docs  # Optional path to your book, relative to the repository root
  branch: master  # Which branch of the repository should be used when creating links (optional)

# Add GitHub buttons to your book
# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository
html:
  use_issues_button: true
  use_repository_button: true
ls
_config.yml		logo.png		notebooks.ipynb
_toc.yml		markdown-notebooks.md	references.bib
intro.md		markdown.md		requirements.txt
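As a hedged illustration of what "key-value pairs for use in any programming language" means: a YAML parser would be expected to hand a program something like the nested dictionary below for part of the configuration file shown above. The dictionary is typed out by hand for this example rather than parsed, so that the sketch needs no extra library.

```python
# Hand-written Python equivalent of part of _config.yml (not parsed from the file).
config = {
    "title": "My sample book",
    "author": "The Jupyter Book Community",
    "logo": "logo.png",
    # Nested keys in YAML become nested dictionaries.
    "execute": {"execute_notebooks": "force"},
    # A YAML list (lines starting with "-") becomes a Python list.
    "bibtex_bibfiles": ["references.bib"],
    # YAML true/false become Python booleans.
    "html": {"use_issues_button": True, "use_repository_button": True},
}

print(config["execute"]["execute_notebooks"])
print(config["bibtex_bibfiles"][0])
```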

7.6.1. Dev tools mean we do not have to write bibliographies manually#

Bibliographies are generated with BibTeX, which takes structured information about the references from a .bib file, with help from sphinxcontrib-bibtex.

For general reference, reference managers like Zotero and Mendeley can track all of your sources and output the references in BibTeX format, which you can use anywhere or sync with tools like MS Word or Google Docs.

The last file tells us what dependencies we have

cat requirements.txt 
jupyter-book
matplotlib
numpy

If your book builds with error messages, run pip install -r requirements.txt

7.7. Building Documentation#

We can transform from raw source to an output by building the book

jupyter-book build .
Running Jupyter-Book v0.15.1
Source Folder: /Users/brownsarahm/Documents/inclass/systems/tiny-book
Config Path: /Users/brownsarahm/Documents/inclass/systems/tiny-book/_config.yml
Output Path: /Users/brownsarahm/Documents/inclass/systems/tiny-book/_build/html
Running Sphinx v4.5.0
making output directory... done
[etoc] Changing master_doc to 'intro'
checking bibtex cache... out of date
parsing bibtex file /Users/brownsarahm/Documents/inclass/systems/tiny-book/references.bib... parsed 5 entries
myst v0.18.1: MdParserConfig(commonmark_only=False, gfm_only=False, enable_extensions=['colon_fence', 'dollarmath', 'linkify', 'substitution', 'tasklist'], disable_syntax=[], all_links_external=False, url_schemes=['mailto', 'http', 'https'], ref_domains=None, highlight_code_blocks=True, number_code_blocks=[], title_to_header=False, heading_anchors=None, heading_slug_func=None, footnote_transition=True, words_per_minute=200, sub_delimiters=('{', '}'), linkify_fuzzy_links=True, dmath_allow_labels=True, dmath_allow_space=True, dmath_allow_digits=True, dmath_double_inline=False, update_mathjax=True, mathjax_classes='tex2jax_process|mathjax_process|math|output_area')
myst-nb v0.17.2: NbParserConfig(custom_formats={}, metadata_key='mystnb', cell_metadata_key='mystnb', kernel_rgx_aliases={}, execution_mode='force', execution_cache_path='', execution_excludepatterns=[], execution_timeout=30, execution_in_temp=False, execution_allow_errors=False, execution_raise_on_error=False, execution_show_tb=False, merge_streams=False, render_plugin='default', remove_code_source=False, remove_code_outputs=False, code_prompt_show='Show code cell {type}', code_prompt_hide='Hide code cell {type}', number_source_lines=False, output_stderr='show', render_text_lexer='myst-ansi', render_error_lexer='ipythontb', render_image_options={}, render_figure_options={}, render_markdown_format='commonmark', output_folder='build', append_css=True, metadata_to_fm=False)
Using jupyter-cache at: /Users/brownsarahm/Documents/inclass/systems/tiny-book/_build/.jupyter_cache
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 4 source files that are out of date
updating environment: [new config] 4 added, 0 changed, 0 removed
/Users/brownsarahm/Documents/inclass/systems/tiny-book/markdown-notebooks.md: Executing notebook using local CWD [mystnb]
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
/Users/brownsarahm/Documents/inclass/systems/tiny-book/markdown-notebooks.md: Executed notebook in 2.18 seconds [mystnb]
/Users/brownsarahm/Documents/inclass/systems/tiny-book/notebooks.ipynb: Executing notebook using local CWD [mystnb]
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
/Users/brownsarahm/Documents/inclass/systems/tiny-book/notebooks.ipynb: Executed notebook in 2.44 seconds [mystnb]

looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] notebooks                                         
generating indices... genindex done
writing additional pages... search done
copying images... [100%] _build/jupyter_execute/137405a2a8521f521f06724f6d604e5a5544cce7bd94d903975cee58b0605ccb.png
copying static files... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded.

The HTML pages are in _build/html.
[etoc] missing index.html written as redirect to 'intro.html'

===============================================================================

Finished generating HTML for book.
Your book's HTML pages are here:
    _build/html/
You can look at your book by opening this file in a browser:
    _build/html/index.html
Or paste this line directly into your browser bar:
    file:///Users/brownsarahm/Documents/inclass/systems/tiny-book/_build/html/index.html            

===============================================================================

Try it yourself

Which files created by the template are not included in the rendered output? How could you tell?

Now we can look at what it did:

ls 
_build			logo.png		references.bib
_config.yml		markdown-notebooks.md	requirements.txt
_toc.yml		markdown.md
intro.md		notebooks.ipynb

We note that this made a new folder called _build. We can look inside it:

ls _build/
html		jupyter_execute

and in the html folder:

ls _build/html/
_images			index.html		objects.inv
_sources		intro.html		search.html
_sphinx_design_static	markdown-notebooks.html	searchindex.js
_static			markdown.html
genindex.html		notebooks.html

We can also copy the path to the file and open it in our browser.

We can change the size of the browser window, or use the screen size settings in inspect mode, to see that this site is responsive.

We didn’t have to write any HTML and we got a responsive site!

If you want to change the styling, Sphinx has built-in themes that tell it to put different files in the _static folder when it builds your site, and you don’t have to change any of your content! If you like working on front end things (which is great! it’s just not always the goal) you can even build your own theme that works with Sphinx.

7.8. Ignoring Built files#

The built site files are completely redundant, content-wise, with the original markdown files.

We do not want to keep track of changes for the built files since they are generated from the source files. It’s redundant and makes it less clear where someone should update content.

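Git helps us with this through the .gitignore file, which lists paths that git should not track. Here >> appends a line to the file, creating it if it does not exist.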

echo "_build/" >> .gitignore

Now we check with git status

git status
On branch main
Untracked files:
  (use "git add <file>..." to include in what will be committed)
	.gitignore

nothing added to commit but untracked files present (use "git add" to track)

Only the .gitignore file itself is listed, just as we want!

Now that is the only new file as far as git is concerned, so we will track it,
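The effect of the _build/ line can be sketched as a check on each path. This is a deliberate simplification that handles only directory patterns like _build/; is_ignored is a name invented for this example, and git's real matching rules (wildcards, negation, anchored patterns) are richer than this.

```python
from pathlib import PurePosixPath

def is_ignored(path, ignored_dirs):
    """Return True if `path` sits inside a directory named in `ignored_dirs`.

    Simplification: only handles patterns such as "_build/", which match a
    directory with that name and everything beneath it.
    """
    names = {d.rstrip("/") for d in ignored_dirs}
    parts = PurePosixPath(path).parts
    # Every component except the last one is a directory on the way to the file.
    return any(part in names for part in parts[:-1])

patterns = ["_build/"]
print(is_ignored("_build/html/index.html", patterns))
print(is_ignored("intro.md", patterns))
print(is_ignored(".gitignore", patterns))
```

Only the path inside _build is ignored; the source files and the .gitignore file itself are still visible to git, which matches the status output above.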

git add .

and finally commit

git commit -m 'ignore built site'
[main 844f6e4] ignore built site
 1 file changed, 1 insertion(+)
 create mode 100644 .gitignore

7.9. How do I push a repo that I made locally to GitHub?#

Right now, we do not have any remotes, so if we try to push it will fail. Next we will see how to fix that.

First let’s confirm

git push
fatal: No configured push destination.
Either specify the URL from the command-line or configure a remote repository using

    git remote add <name> <url>

and then push using the remote name

    git push <name>

and it tells us how to fix it. This is why inspection is so powerful in developer tools: the messages are where we developers give one another hints.

We can confirm that we do not have any remotes; git remote prints nothing:

git remote

For today, we will create an empty GitHub repo shared with me by accepting the assignment linked in Prismia (ask a TA or the instructor if you are making up class).

More generally, you can create a repo

If you do not initialize a new repo with any files, the default page for the empty repo gives you instructions for which remote to add.

Now we add the remote

git remote add origin https://github.com/compsys-progtools/tiny-book-brownsarahm.git

Then we can try to push

git push 
fatal: The current branch main has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream origin main

To have this happen automatically for branches without a tracking
upstream, see 'push.autoSetupRemote' in 'git help config'.

We get an error because we need to link the local branch to a remote branch:

git push --set-upstream origin main
To https://github.com/compsys-progtools/tiny-book-brownsarahm.git
 ! [rejected]        main -> main (fetch first)
error: failed to push some refs to 'https://github.com/compsys-progtools/tiny-book-brownsarahm.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

Now we get an error because the remote has commits that our local repo does not.

We follow the instruction again, and pull:

git pull
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 5 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (5/5), 1.70 KiB | 348.00 KiB/s, done.
From https://github.com/compsys-progtools/tiny-book-brownsarahm
 * [new branch]      feedback   -> origin/feedback
 * [new branch]      main       -> origin/main
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.

    git pull <remote> <branch>

If you wish to set tracking information for this branch you can do so with:

    git branch --set-upstream-to=origin/<branch> main

We see that the fetch part works, but then the linking fails because the current branch has no tracking information.

We need to tell it which remote branch to pull from:

git pull origin main
From https://github.com/compsys-progtools/tiny-book-brownsarahm
 * branch            main       -> FETCH_HEAD
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint: 
hint:   git config pull.rebase false  # merge
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint: 
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.

and we get an error because the two branches have divergent histories.

We tell it how we want to reconcile them:

git pull origin main --rebase
From https://github.com/compsys-progtools/tiny-book-brownsarahm
 * branch            main       -> FETCH_HEAD
Successfully rebased and updated refs/heads/main.

and success!
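Why the first push was rejected and why the rebase fixed it can be sketched with a toy model in which a history is just a list of commit labels. This is not how git stores history: real commits are identified by hashes, and a rebase creates new commits with new hashes. The remote labels below are hypothetical stand-ins, since the notes do not show what those commits contain.

```python
def can_fast_forward(remote, local):
    """In this toy model a push is accepted only if the remote history
    is a prefix of the local history."""
    return local[:len(remote)] == remote

def rebase(local, remote):
    """Replay the commits that exist only locally on top of the remote history."""
    local_only = [commit for commit in local if commit not in remote]
    return remote + local_only

local = ["jupyter book template", "ignore built site"]
remote = ["remote commit A", "remote commit B"]  # hypothetical labels

rejected_first = not can_fast_forward(remote, local)  # histories diverge
local = rebase(local, remote)
accepted_after = can_fast_forward(remote, local)      # remote is now a prefix

print(rejected_first, accepted_after)
print(local)
```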

and now we can actually push

git push --set-upstream origin main
Enumerating objects: 15, done.
Counting objects: 100% (15/15), done.
Delta compression using up to 8 threads
Compressing objects: 100% (12/12), done.
Writing objects: 100% (14/14), 16.53 KiB | 8.26 MiB/s, done.
Total 14 (delta 1), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (1/1), done.
To https://github.com/compsys-progtools/tiny-book-brownsarahm.git
   e5287d8..22f2227  main -> main
branch 'main' set up to track 'origin/main'.

7.10. Prepare for next class#

  1. Think through and make some notes about what you have learned about design so far. Try to answer the questions below in design_before.md. If you do not know how to answer any of the questions, write down what questions you have.

- What past experiences with making decisions about design of software do you have?
- What experiences studying design do you have? 
- What processes, decisions, and practices come to mind when you think about designing software?
- From your experiences as a user, how would you describe the design of command line tools vs other GUI based tools?

7.11. Badges#

  1. Review the notes, jupyterbook docs, and experiment with the jupyter-book CLI to determine what files are required to make jupyter-book build run. Make your kwl repo into a jupyter book. Set it so that the _build directory is not under version control. Complete basic customization steps for the necessary files and ensure that you do not add template files to your KWL repo.

  2. Add docs.md to your KWL repo and explain the most important things to know about documentation in your own words using other programming concepts you have learned so far. Include, in a markdown comment (same syntax as HTML: <!-- comment -->), the list of CSC courses you have taken for context while we give you feedback.

build idea: build a Sphinx extension that adds a particular feature to a documentation website. You can start a proposal and discuss ideas with Dr. Brown.

  1. Review the notes, jupyterbook docs, and experiment with the jupyter-book CLI to determine what files are required to make jupyter-book build run. Make your kwl repo into a jupyter book. Set it so that the _build directory is not under version control. Complete basic customization steps for the necessary files and ensure that you do not add template files to your KWL repo.

  2. Learn about the documentation ecosystem in another language that you know using at least one official source and additional sources as you find helpful. In docs_ecosystems.md include a summary of your findings and compare and contrast it to jupyter book/sphinx. Include a bibtex based bibliography of the sources you used. You can use this generator for informal sources and google scholar for formal sources.

explore idea: extend the conversion of your repo into a jupyter book by making links among pages, adding an intro, and adding a GitHub action so that a pdf is generated and added to an orphan branch named gh-pages. Also use at least 2 other jupyter book features from the docs; specify or discuss your extended features with Dr. Brown in your proposal.

7.12. Experience Report Evidence#

Link to your tiny-book repo in the experience report PR.

7.13. Questions After Today’s Class#

7.13.1. Can you create a new remote through the terminal without using the github ui?#

If you know the URL of an existing remote, yes: you can add it with git remote add. If the remote is on GitHub and does not exist yet, you have to create it first, either on github.com in your browser or with the gh CLI tool’s command gh repo create.

7.13.2. How is jupyterbook different than other ides or editors?#

jupyter-book is not an editor or IDE; it is a tool for building websites or PDFs.

Jupyter Notebook is a single stream of computational analysis. Jupyter Lab is a more IDE-like interface for doing computational analyses. Both are part of Project Jupyter and on GitHub.

Jupyter Book is for publishing book-like documents as websites and in other formats. It is designed to be compatible with Jupyter notebooks, but it is part of a separate project, Executable Books. It is specialized for cases where the content includes computation. See their gallery for examples. I use it for CSC310, which has code and plots in the notes.

Jupyter Book is an opinionated distribution of Sphinx, which can also be used to document other languages like C++.

7.13.3. What is the .doctrees/ folder inside _build/ with files like environment.pickle?#

That is an intermediate step between the markdown and the final HTML.

There are API-level docs.

7.13.4. Why is building necessary? Couldn’t it just be a part of compiling?#

Building is a more general process of transforming from source to an output format. Compiling is a specific step within building for some programming languages. We will learn more about the build process for C later.

What we did today was building, but not compiling.
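To illustrate building without compiling, here is a toy build step that transforms markdown source files into HTML files in an output folder. It is a sketch for this point only and is far simpler than what Sphinx does: the build function is invented for this example and understands just top-level headings and plain paragraphs.

```python
import html
import tempfile
from pathlib import Path

def build(source_dir, output_dir):
    """Transform every .md file in source_dir into an .html file in output_dir."""
    output_dir.mkdir(parents=True, exist_ok=True)
    built = []
    for source in sorted(source_dir.glob("*.md")):
        lines = []
        for line in source.read_text().splitlines():
            if line.startswith("# "):
                lines.append(f"<h1>{html.escape(line[2:])}</h1>")
            elif line.strip():
                lines.append(f"<p>{html.escape(line)}</p>")
        target = output_dir / (source.stem + ".html")
        target.write_text("\n".join(lines) + "\n")
        built.append(target.name)
    return built

with tempfile.TemporaryDirectory() as tmp:
    book = Path(tmp)
    (book / "intro.md").write_text("# Welcome\n\nThis is a tiny book.\n")
    built = build(book, book / "_build" / "html")
    page = (book / "_build" / "html" / "intro.html").read_text()

print(built)
print(page)
```

Nothing here is translated into machine code; source files go in and a different output format comes out, which is the sense in which this is a build.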

7.13.5. What other uses are there for jupyter notebook#

The course website is an example. They also maintain a gallery of jupyter books

7.13.6. Did we create a repo locally?#

Yes, git init . created a repo locally.

7.13.7. How to add more files and sections to your Jupyter Books website#

Create more files in the folder and then update the _toc.yml. For example, this whole website is a Jupyter Book; see how many files are in the repo.
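Updating the table of contents for a new page amounts to adding one entry. The sketch below shows that on the text of a flat _toc.yml like the template's. It assumes the chapters list is the last thing in the file; add_chapter and the page name docs are hypothetical choices for this example.

```python
def add_chapter(toc_text, page_name):
    """Return toc_text with one more chapter entry appended at the end.

    Assumes a flat table of contents whose chapters list comes last.
    """
    if not toc_text.endswith("\n"):
        toc_text += "\n"
    return toc_text + f"- file: {page_name}\n"

toc = "format: jb-book\nroot: intro\nchapters:\n- file: markdown\n"
updated = add_chapter(toc, "docs")  # "docs" stands for a new docs.md file
print(updated)
```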

7.13.8. how will we continue to use jupyter books going forward in this class?#

Your badge task is to convert your KWL repo to one. The course website is also a jupyter-book, so making more complex contributions to the site is one way to practice with it.

For example you might do an explore or build that contributes to the course website.

Also, most builds will require documentation made with a documentation builder. You might choose jupyter-book or a different one, but some of what we saw here will apply either way, making it easier to learn another similar tool.

7.13.9. how do we use the developer tools we talked about, for example the bibliography one?#

Within jupyter-book, the documentation includes a tutorial on references. This is a good, high-quality source because it is written by developers of the tool and describes in concrete terms what you need to do, with appropriate context.

To get the BibTeX-formatted reference for a source, you can use this generator for informal sources (like websites) and Google Scholar for formal sources (like journal articles or academic conference papers).

More generally, BibTeX was designed to be used with LaTeX, which is easiest to get started with using Overleaf, a cloud-hosted version with live collaboration.