7. How do programmers communicate about code?#
Tip
Check whether your codespaces have uncommitted changes at github.com/codespaces.
Note:
- you can only have 2 active at a time (green dots)
- you can see if any have uncommitted changes
- you can export those changes to a branch from this page
Today we are going to pick up from where we left off talking about conventional commits.
That is a core example of the types of detailed communication we do in programming that is embedded into the work.
7.1. Why Documentation#
Today we will talk about documentation. There are several reasons this is important:
- using official documentation is the best way to get better at the tools
- understanding how documentation is designed and built will help you use it better
- writing and maintaining documentation is a really important part of working on a team
- documentation building tools are a type of developer tool (and these are generally good examples of software design)
Design is best learned from examples. Some of the best examples of software design come from developer tools.
In particular documentation tools are really good examples of:
- pattern matching
- modularity and abstraction
- automation
- the build process beyond compiling
By the end of today’s class you will be able to:
- describe different types of documentation
- find different information in a code repo
- generate documentation as html
- ignore content from a repo
- create a repo locally and push to GitHub
7.2. What is documentation#
from an ethnography of documentation in data science
7.2.1. Why is documentation so important?#
we should probably spend more time on it
7.3. So, how do we do it?#
linux kernel uses sphinx and here is why and how it works
7.4. Jupyterbook#
Jupyterbook wraps sphinx and uses markdown instead of reStructuredText. The project authors note in the documentation that it “can be thought of as an opinionated distribution of Sphinx”. We’re going to use this.
Navigate to your folder for this course (mine is inclass/systems):
cd Documents/inclass/systems/
We can confirm that jupyter-book is installed by checking the version.
jupyter-book --version
Jupyter Book : 0.15.1
External ToC : 0.3.1
MyST-Parser : 0.18.1
MyST-NB : 0.17.2
Sphinx Book Theme : 1.0.1
Jupyter-Cache : 0.6.1
NbClient : 0.5.13
We will run a command to create a jupyterbook from a template. The command has 3 parts:
- jupyter-book is a program (the thing we installed)
- create is a subcommand (one action that program can do)
- tiny-book is an argument (a mandatory input to that action)
jupyter-book create tiny-book
===============================================================================
Your book template can be found at
tiny-book/
===============================================================================
We see that it succeeds.
You can make it with any name, because the name is an argument or input:
jupyter-book create example
===============================================================================
Your book template can be found at
example/
===============================================================================
Each one makes a directory, which we can see by listing:
ls
example tiny-book
gh-inclass-sp24-brownsarahm
And we can delete the second one since we do not actually want it.
rm example/
rm: example/: is a directory
We get an error because deleting a directory is not well defined for plain rm, and it is potentially risky, so rm is written to throw an error.
Instead, we tell it two additional things:
- to delete recursively, with r
- to force the deletion without prompting for confirmation, with f
Note that we can stack single-character options together after a single -:
rm -rf example/
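To see both behaviors without risking real files, here is a minimal sketch that works in a throwaway directory. The directory and file names are made up for illustration, and the exact wording of the error from plain rm varies by platform, so the sketch only looks at whether the command succeeded.

```shell
# Work in a temporary directory so nothing real is deleted.
demo=$(mktemp -d)
mkdir -p "$demo/example/nested"
touch "$demo/example/nested/file.txt"

# Plain rm refuses to remove a directory and exits with a non-zero status.
if rm "$demo/example" 2>/dev/null; then
    plain_rm="removed"
else
    plain_rm="refused"
fi
echo "plain rm: $plain_rm"

# -r recurses into the directory; -f never prompts for confirmation.
rm -rf "$demo/example"

# The directory is gone, so ls reports nothing inside the temp folder.
ls -a "$demo"
```

Because -f suppresses prompts, it is worth double-checking the path before pressing enter on any rm -rf command.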
Next we will go into the folder we made and explore it a bit:
cd tiny-book/
ls -a
. intro.md notebooks.ipynb
.. logo.png references.bib
_config.yml markdown-notebooks.md requirements.txt
_toc.yml markdown.md
7.5. Starting a git repo locally#
We made this folder, but we have not used any git operations on it yet. It is not actually a git repo, which we could tell from the output above (ls -a listed no .git directory), but let’s use git to inspect and get another hint.
We can try git status
git status
fatal: not a git repository (or any of the parent directories): .git
This tells us the .git directory is missing from the current path and all parent directories.
To make it a git repo we use git init with the path we want to initialize, which currently is .
git init .
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git repository in /Users/brownsarahm/Documents/inclass/systems/tiny-book/.git/
Here we are faced with a social aspect of computing that is also a good reminder about how git actually works.
7.5.1. Retiring racist language#
Historically the default branch was called master.
- the name derived from a master/slave analogy, which is not even how git works, but the terminology was adopted from other projects
- the person who chose the names “master” and “origin” literally regrets that choice
- the name main is a more accurate and not harmful term, and it is the current convention
We’ll rename our branch to main:
git branch -m main
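Renaming after the fact is not the only route. As a sketch, assuming Git 2.28 or newer, the first branch can also be named at the moment the repo is created. The temporary directory here is only for illustration.

```shell
# A scratch location so this does not touch your real course folder.
demo=$(mktemp -d)

# -b (long form --initial-branch) names the first branch at creation time.
git init -q -b main "$demo/tiny-book"

# With no commits yet, HEAD is a symbolic ref that points at the unborn branch.
git -C "$demo/tiny-book" symbolic-ref --short HEAD    # prints: main

# To make main the default for every new repo, the hint printed by git init
# suggests this setting (not run here because it changes your global config):
#   git config --global init.defaultBranch main
```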
and check in with git now
git status
On branch main
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
_config.yml
_toc.yml
intro.md
logo.png
markdown-notebooks.md
markdown.md
notebooks.ipynb
references.bib
requirements.txt
nothing added to commit but untracked files present (use "git add" to track)
This time it works and we see two important things:
- there are no previous commits
- all of the files are untracked
and we will commit the template so that we have it saved as a point we could go back to.
git add .
We have to add separately: because the files are untracked, we cannot use the -a option on commit.
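A quick way to convince yourself of this is to try it in a scratch repo. This is a sketch, assuming Git 2.28 or newer for init -b. The Demo name and email are placeholders passed with -c so the example does not depend on your git configuration.

```shell
demo=$(mktemp -d)
git init -q -b main "$demo"
cd "$demo"
echo "hello" > intro.md          # a brand-new, untracked file

# -a only stages changes to files git already tracks, so this commit
# finds nothing to record and exits with a non-zero status.
if git -c user.name="Demo" -c user.email="demo@example.com" \
       commit -q -a -m "try -a" >/dev/null 2>&1; then
    first_try="committed"
else
    first_try="nothing to commit"
fi
echo "commit -a: $first_try"

# After an explicit add the file is staged and the commit succeeds.
git add .
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "jupyter book template"
git rev-list --count HEAD        # prints: 1
```

Once a file has been committed, later edits to it can be picked up with commit -a, because git is tracking it by then.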
and then we will commit with a simple message
git commit -m 'jupyter book template'
[main (root-commit) e34f91d] jupyter book template
9 files changed, 341 insertions(+)
create mode 100644 _config.yml
create mode 100644 _toc.yml
create mode 100644 intro.md
create mode 100644 logo.png
create mode 100644 markdown-notebooks.md
create mode 100644 markdown.md
create mode 100644 notebooks.ipynb
create mode 100644 references.bib
create mode 100644 requirements.txt
7.6. Structure of a Jupyter book#
We will explore the output by looking at the files
ls
_config.yml logo.png notebooks.ipynb
_toc.yml markdown-notebooks.md references.bib
intro.md markdown.md requirements.txt
A jupyter book has two required files (_config.yml and _toc.yml), some for content, and some helpers that are common but not required.
- the *.md files are content
- the .bib file is bibliography information
The other files are optional, but common. requirements.txt is the format pip uses to install Python dependencies. There are different standards in other languages for how to specify dependencies.
Note
The extension (.yml) is YAML, which stands for “YAML Ain’t Markup Language”. It consists of key-value pairs and is designed to be a human-friendly way to encode data for use in any programming language.
The table of contents file describes how to put the other files in order.
cat _toc.yml
# Table of contents
# Learn more at https://jupyterbook.org/customize/toc.html
format: jb-book
root: intro
chapters:
- file: markdown
- file: notebooks
- file: markdown-notebooks
The configuration file tells jupyter-book basic information about the book. It provides all of the settings that jupyterbook and sphinx need to render the content as whatever output format we want.
cat _config.yml
# Book settings
# Learn more at https://jupyterbook.org/customize/config.html
title: My sample book
author: The Jupyter Book Community
logo: logo.png
# Force re-execution of notebooks on each build.
# See https://jupyterbook.org/content/execute.html
execute:
  execute_notebooks: force
# Define the name of the latex output file for PDF builds
latex:
  latex_documents:
    targetname: book.tex
# Add a bibtex file so that we can create citations
bibtex_bibfiles:
  - references.bib
# Information about where the book exists on the web
repository:
  url: https://github.com/executablebooks/jupyter-book # Online location of your book
  path_to_book: docs # Optional path to your book, relative to the repository root
  branch: master # Which branch of the repository should be used when creating links (optional)
# Add GitHub buttons to your book
# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository
html:
  use_issues_button: true
  use_repository_button: true
ls
_config.yml logo.png notebooks.ipynb
_toc.yml markdown-notebooks.md references.bib
intro.md markdown.md requirements.txt
7.6.1. Dev tools mean we do not have to write bibliographies manually#
Bibliographies are generated with bibtex, which takes structured information from the references in a bibtex file, with help from sphinxcontrib-bibtex.
For general reference, reference managers like Zotero and Mendeley can track all of your sources and output the references in bibtex format that you can use anywhere or sync with tools like MS Word or Google Docs.
The last file tells us what dependencies we have:
cat requirements.txt
If your book builds with error messages, run pip install -r requirements.txt
jupyter-book
matplotlib
numpy
7.7. Building Documentation#
We can transform from raw source to an output by building the book
jupyter-book build .
Running Jupyter-Book v0.15.1
Source Folder: /Users/brownsarahm/Documents/inclass/systems/tiny-book
Config Path: /Users/brownsarahm/Documents/inclass/systems/tiny-book/_config.yml
Output Path: /Users/brownsarahm/Documents/inclass/systems/tiny-book/_build/html
Running Sphinx v4.5.0
making output directory... done
[etoc] Changing master_doc to 'intro'
checking bibtex cache... out of date
parsing bibtex file /Users/brownsarahm/Documents/inclass/systems/tiny-book/references.bib... parsed 5 entries
myst v0.18.1: MdParserConfig(commonmark_only=False, gfm_only=False, enable_extensions=['colon_fence', 'dollarmath', 'linkify', 'substitution', 'tasklist'], disable_syntax=[], all_links_external=False, url_schemes=['mailto', 'http', 'https'], ref_domains=None, highlight_code_blocks=True, number_code_blocks=[], title_to_header=False, heading_anchors=None, heading_slug_func=None, footnote_transition=True, words_per_minute=200, sub_delimiters=('{', '}'), linkify_fuzzy_links=True, dmath_allow_labels=True, dmath_allow_space=True, dmath_allow_digits=True, dmath_double_inline=False, update_mathjax=True, mathjax_classes='tex2jax_process|mathjax_process|math|output_area')
myst-nb v0.17.2: NbParserConfig(custom_formats={}, metadata_key='mystnb', cell_metadata_key='mystnb', kernel_rgx_aliases={}, execution_mode='force', execution_cache_path='', execution_excludepatterns=[], execution_timeout=30, execution_in_temp=False, execution_allow_errors=False, execution_raise_on_error=False, execution_show_tb=False, merge_streams=False, render_plugin='default', remove_code_source=False, remove_code_outputs=False, code_prompt_show='Show code cell {type}', code_prompt_hide='Hide code cell {type}', number_source_lines=False, output_stderr='show', render_text_lexer='myst-ansi', render_error_lexer='ipythontb', render_image_options={}, render_figure_options={}, render_markdown_format='commonmark', output_folder='build', append_css=True, metadata_to_fm=False)
Using jupyter-cache at: /Users/brownsarahm/Documents/inclass/systems/tiny-book/_build/.jupyter_cache
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 4 source files that are out of date
updating environment: [new config] 4 added, 0 changed, 0 removed
/Users/brownsarahm/Documents/inclass/systems/tiny-book/markdown-notebooks.md: Executing notebook using local CWD [mystnb]
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
/Users/brownsarahm/Documents/inclass/systems/tiny-book/markdown-notebooks.md: Executed notebook in 2.18 seconds [mystnb]
/Users/brownsarahm/Documents/inclass/systems/tiny-book/notebooks.ipynb: Executing notebook using local CWD [mystnb]
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
/Users/brownsarahm/Documents/inclass/systems/tiny-book/notebooks.ipynb: Executed notebook in 2.44 seconds [mystnb]
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] notebooks
generating indices... genindex done
writing additional pages... search done
copying images... [100%] _build/jupyter_execute/137405a2a8521f521f06724f6d604e5a5544cce7bd94d903975cee58b0605ccb.png
copying static files... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded.
The HTML pages are in _build/html.
[etoc] missing index.html written as redirect to 'intro.html'
===============================================================================
Finished generating HTML for book.
Your book's HTML pages are here:
_build/html/
You can look at your book by opening this file in a browser:
_build/html/index.html
Or paste this line directly into your browser bar:
file:///Users/brownsarahm/Documents/inclass/systems/tiny-book/_build/html/index.html
===============================================================================
Try it yourself
Which files created by the template are not included in the rendered output? How could you tell?
Now we can look at what it did
ls
_build logo.png references.bib
_config.yml markdown-notebooks.md requirements.txt
_toc.yml markdown.md
intro.md notebooks.ipynb
We note that this made a new folder called _build. We can look inside there.
ls _build/
html jupyter_execute
and in the html folder:
ls _build/html/
_images index.html objects.inv
_sources intro.html search.html
_sphinx_design_static markdown-notebooks.html searchindex.js
_static markdown.html
genindex.html notebooks.html
We can also copy the path to the file and open it in our browser.
We can change the size of a browser window or use the screen size settings in inspect mode to see that this site is responsive.
We didn’t have to write any html and we got a responsive site!
If you want to change the styling, with sphinx you can use built-in themes, which tell sphinx to put different files in the _static folder when it builds your site, but you don’t have to change any of your content! If you like working on front end things (which is great! it’s just not always the goal) you can even build your own theme that can work with sphinx.
7.8. Ignoring Built files#
The built site files are completely redundant, content-wise, to the original markdown files.
We do not want to keep track of changes for the built files since they are generated from the source files. It’s redundant and makes it less clear where someone should update content.
Git helps us with this through the .gitignore file:
echo "_build/" >> .gitignore
Now we check with git status
git status
On branch main
Untracked files:
(use "git add <file>..." to include in what will be committed)
.gitignore
nothing added to commit but untracked files present (use "git add" to track)
Only the .gitignore file itself is listed, just as we want.
Now that’s the only new file as far as git is concerned, so we will track this,
git add .
and finally commit
git commit -m 'ignore built site'
[main 844f6e4] ignore built site
1 file changed, 1 insertion(+)
create mode 100644 .gitignore
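If you are ever unsure whether an ignore rule covers a path, git can report which rule matched. The sketch below rebuilds a tiny stand-in for the book folder in a scratch repo (assuming Git 2.28 or newer for init -b). The file contents are placeholders.

```shell
demo=$(mktemp -d)
git init -q -b main "$demo"
cd "$demo"

# Stand-ins for one source file and one built file.
echo "# Intro" > intro.md
mkdir -p _build/html
echo "<html></html>" > _build/html/index.html

# Same rule as in the notes: ignore the whole _build directory.
echo "_build/" >> .gitignore

# check-ignore succeeds when a path is ignored; -v also shows the
# file, line number, and pattern that matched.
git check-ignore -v _build/html/index.html

# Source files produce no output from check-ignore because no rule matches.
git check-ignore intro.md || echo "intro.md is not ignored"

# Only the non-ignored files show up as untracked.
git status --porcelain
```

In this scratch repo nothing has been committed yet, so intro.md is listed as untracked alongside .gitignore, while everything under _build/ is left out.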
7.9. How do I push a repo that I made locally to GitHub?#
Right now, we do not have any remotes, so if we try to push it will fail. Next we will see how to fix that.
First let’s confirm
git push
fatal: No configured push destination.
Either specify the URL from the command-line or configure a remote repository using
git remote add <name> <url>
and then push using the remote name
git push <name>
and it tells us how to fix it. This is why inspection is so powerful in developer tools: that is where we developers give one another hints.
We can list the configured remotes to confirm that there are none yet (the command prints nothing):
git remote
For today, we will create an empty github repo shared with me by accepting the assignment linked in prismia; ask a TA/instructor if you are making up class.
More generally, you can create a repo.
The default page for an empty repo, if you do not initialize it with any files, will give you the instructions for what remote to add.
Now we add the remote
git remote add origin https://github.com/compsys-progtools/tiny-book-brownsarahm.git
Then we can try to push
git push
fatal: The current branch main has no upstream branch.
To push the current branch and set the remote as upstream, use
git push --set-upstream origin main
To have this happen automatically for branches without a tracking
upstream, see 'push.autoSetupRemote' in 'git help config'.
We get an error because we need to link the local branch to a remote branch:
git push --set-upstream origin main
To https://github.com/compsys-progtools/tiny-book-brownsarahm.git
! [rejected] main -> main (fetch first)
error: failed to push some refs to 'https://github.com/compsys-progtools/tiny-book-brownsarahm.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
now we get an error because the remote and local have different commits
We follow the instruction again, and pull
git pull
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 5 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (5/5), 1.70 KiB | 348.00 KiB/s, done.
From https://github.com/compsys-progtools/tiny-book-brownsarahm
* [new branch] feedback -> origin/feedback
* [new branch] main -> origin/main
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.
git pull <remote> <branch>
If you wish to set tracking information for this branch you can do so with:
git branch --set-upstream-to=origin/<branch> main
We see the fetch part works, but then the merge step fails because the local branch has no tracking information.
We need to tell it which remote branch to pull:
git pull origin main
From https://github.com/compsys-progtools/tiny-book-brownsarahm
* branch main -> FETCH_HEAD
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint:
hint: git config pull.rebase false # merge
hint: git config pull.rebase true # rebase
hint: git config pull.ff only # fast-forward only
hint:
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.
and we get an error about how they have different histories.
we tell it how we want it to resolve that
git pull origin main --rebase
From https://github.com/compsys-progtools/tiny-book-brownsarahm
* branch main -> FETCH_HEAD
Successfully rebased and updated refs/heads/main.
and success!
and now we can actually push
git push --set-upstream origin main
Enumerating objects: 15, done.
Counting objects: 100% (15/15), done.
Delta compression using up to 8 threads
Compressing objects: 100% (12/12), done.
Writing objects: 100% (14/14), 16.53 KiB | 8.26 MiB/s, done.
Total 14 (delta 1), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (1/1), done.
To https://github.com/compsys-progtools/tiny-book-brownsarahm.git
e5287d8..22f2227 main -> main
branch 'main' set up to track 'origin/main'.
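To practice just the remote and upstream steps offline, a bare repository on your own disk can stand in for GitHub. This is a sketch under that assumption (and assuming Git 2.28 or newer for init -b). The paths and the Demo identity are made up. Because the stand-in remote starts empty, the push is not rejected and no pull or rebase is needed.

```shell
demo=$(mktemp -d)

# A bare repo (no working files) plays the role of the empty GitHub repo.
git init -q --bare -b main "$demo/remote.git"

# A local repo with one commit, like our tiny-book after the first commit.
git init -q -b main "$demo/tiny-book"
cd "$demo/tiny-book"
echo "# Intro" > intro.md
git add .
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "jupyter book template"

# Register the remote under the conventional name origin and inspect it.
git remote add origin "$demo/remote.git"
git remote -v

# --set-upstream (short form -u) pushes and records origin/main as the
# branch that main tracks, so a plain git push works from now on.
git push -q --set-upstream origin main

# Ask git which remote branch main now tracks.
git rev-parse --abbrev-ref 'main@{upstream}'    # prints: origin/main
```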
7.10. Prepare for next class#
Think through and make some notes about what you have learned about design so far. Try to answer the questions below in design_before.md. If you do not now know how to answer any of the questions, write in what questions you have.
- What past experiences with making decisions about design of software do you have?
- What experiences studying design do you have?
- What processes, decisions, and practices come to mind when you think about designing software?
- From your experiences as a user, how would you describe the design of command line tools vs other GUI based tools?
7.11. Badges#
Review the notes, jupyterbook docs, and experiment with the jupyter-book CLI to determine what files are required to make jupyter-book build run. Make your kwl repo into a jupyter book. Set it so that the _build directory is not under version control. Complete basic customization steps for the necessary files and ensure that you do not add template files to your KWL repo.
Add docs.md to your KWL repo and explain the most important things to know about documentation in your own words using other programming concepts you have learned so far. Include in a markdown comment (same as HTML: <!-- comment -->) the list of CSC courses you have taken for context while we give you feedback.
Build idea: build a sphinx extension that adds a particular feature to a documentation website. You can start a proposal and discuss ideas with Dr. Brown.
Review the notes, jupyterbook docs, and experiment with the jupyter-book CLI to determine what files are required to make jupyter-book build run. Make your kwl repo into a jupyter book. Set it so that the _build directory is not under version control. Complete basic customization steps for the necessary files and ensure that you do not add template files to your KWL repo.
Learn about the documentation ecosystem in another language that you know using at least one official source and additional sources as you find helpful. In docs_ecosystems.md include a summary of your findings and compare and contrast it to jupyter book/sphinx. Include a bibtex based bibliography of the sources you used. You can use this generator for informal sources and google scholar for formal sources.
Explore idea: extend the conversion of your repo into a jupyter book by making links among pages, adding an intro, and adding a github action so that a pdf is generated and added to an orphan branch named gh-pages. Plus, use at least 2 other jupyter book features from the docs; specify, or discuss with Dr. Brown, your extended features in your proposal.
7.12. Experience Report Evidence#
Link to your tiny-book repo in the experience report PR.
7.13. Questions After Today’s Class#
7.13.1. Can you create a new remote through the terminal without using the github ui?#
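If you know the URL of an existing remote, yes: git remote add works entirely in the terminal. If the remote repository does not exist yet and it is on GitHub, you have to create it either on github.com in your browser or with the gh CLI tool’s command gh repo create.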
7.13.2. How is jupyterbook different than other ides or editors?#
jupyter-book is not an editor or IDE; it is a tool for building websites or pdfs.
Jupyter Notebook is a single stream of computational analysis. Jupyter Lab is a more IDE-like interface for doing computational analyses. Both are part of project jupyter and on GitHub.
Jupyter book is for publishing book-like documents as websites and in other forms. It is designed to be compatible with jupyter notebooks, but is part of a separate executable books project. It is specialized for cases where there is computation in the content. See their gallery for examples. I use it for CSC310, which has code and plots in the notes.
Jupyter-book is an opinionated distribution of sphinx, which can also be used to document other languages like C++.
7.13.3. What is the .doctrees/ folder inside _build/ with files like environment.pickle?#
That is an intermediate step between the markdown and the final HTML.
There are API-level docs.
7.13.4. Why is building necessary? Couldn’t it just be a part of compiling?#
Building is a more general process of transforming from source to an output format. Compiling is a specific step within building for some programming languages. We will learn more about the build process for C later.
What we did today was building, but not compiling.
7.13.5. What other uses are there for jupyter notebook#
The course website is an example. They also maintain a gallery of jupyter books
7.13.6. Did we create a repo locally?#
Yes, git init . created a repo locally.
7.13.7. How to add more files and sections to your Jupyter Books website#
Create more files in the folder and then update the _toc.yml. For example, this whole website is a jupyter book; see how many files are in the repo.
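As a sketch of those two steps, the commands below recreate the template's table of contents in a scratch folder and then add one chapter. The page name setup and its text are hypothetical; any file name works as long as the _toc.yml entry matches it without the extension.

```shell
demo=$(mktemp -d)
cd "$demo"

# Recreate the template's table of contents shown earlier in these notes.
cat > _toc.yml <<'EOF'
format: jb-book
root: intro
chapters:
- file: markdown
- file: notebooks
- file: markdown-notebooks
EOF

# Step 1: create the new content file.
printf '# Setup\n\nHow to install the tools.\n' > setup.md

# Step 2: register it as a chapter; entries leave off the file extension.
echo "- file: setup" >> _toc.yml

cat _toc.yml
```

In a real book you would then rerun jupyter-book build . so the new page is rendered and appears in the navigation.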
7.13.8. how will we continue to use jupyter books going forward in this class?#
Your badge task is to convert your KWL repo to one. The course website is also a jupyter-book, so making more complex contributions to the site is one way to practice with it.
For example you might do an explore or build that contributes to the course website.
Also, most builds will require documentation using a documentation builder. You might choose jupyter-book or a different one, but some of what we saw here will apply, making it easier to learn another similar tool.
7.13.9. how do we use the developer tools we talked about, for example the bibliography one?#
Within jupyter-book, their documentation includes a tutorial on references. This is a good high quality source because it is written by developers of a tool and describes in concrete terms what you need to do, with appropriate context.
To get the bibtex formatted reference for a source, you can use this generator for informal sources (like websites) and google scholar for formal sources (like journal articles or academic conference papers).
More generally, bibtex was designed to be used with latex, which is easiest to get started with using the cloud-hosted, live-collaboration version in overleaf.