Using Git Effectively
Hello! If you're on this webpage, you're likely part of the Students Developing Software team or working on another git-based project with me. Because I work with students with a broad range of experience with git and GitHub, including none at all, I've put together this page as a reference of processes and tips for using git effectively for your work. It's not meant to be comprehensive—see the Resources section for more information. Rather, you should use this page when first starting out to make sure you have git set up, and then consult this page periodically when you have questions. I have attempted to provide some tips and solutions to some common problems that arise.
Important warning: all instructions in this guide use the terminal, a text-based way of interacting with your computer, and with git and GitHub in particular. There are graphical ways of using git/GitHub (e.g., PyCharm, GitHub Desktop), but you should NOT use these tools for git as part of SDS. Using the terminal can be a bit tricky at first, but it forces you to learn exactly what happens when you use git/GitHub, rather than hiding actions behind a graphical user interface. Get used to using these commands, and you'll go a long way to mastering git and GitHub!
Getting started
-
Install git.
- If you have previously installed git on your computer, I recommend upgrading to the latest version of git to benefit from the latest features and bug fixes.
-
Create a GitHub account.
-
Open a terminal (or Git bash on Windows), and set the following configuration settings (replace the generic name and email with your GitHub info):
$ git config --global user.name "Test User"
$ git config --global user.email "email@example.com"
-
Find your assigned SDS project on GitHub. (This can be done with a quick online search.)
-
Work through the Fork A Repo GitHub guide, replacing their
octocat/Spoon-Knife
repository with your project's repository. Make sure that after you've done the final step, you have both an "upstream" and "origin" remote.$ git remote -v > origin https://github.com/YOUR-USERNAME/YOUR-FORK.git (fetch) > origin https://github.com/YOUR-USERNAME/YOUR-FORK.git (push) > upstream https://github.com/ORIGINAL-OWNER/ORIGINAL-REPOSITORY.git (fetch) > upstream https://github.com/ORIGINAL-OWNER/ORIGINAL-REPOSITORY.git (push)
-
Run the following command in your project repository to set your
master
branch to track changes from the "upstream" (i.e., main project) repository:
$ git fetch upstream
$ git branch -u upstream/master
Throughout your time contributing to your project, your master branch should be an up-to-date copy of the
upstream/master
branch. You'll periodically "synchronize" this master branch with
upstream/master
to obtain new changes to the project made by other students.
-
Finally, send your GitHub username to your project supervisor (usually me) so that you can be added as a developer to the organization!
A deeper look
After you have completed the above steps, there are three different copies of the project code that you have access to.
-
The GitHub repository which you forked from. This is the version of the repository that is managed by your supervisor, and is the "definitive" version of the code. Anyone can access and read the files in the repository (which is why you were able to fork it in the first place), but only the project supervisor can modify the repository. If you followed the GitHub tutorial correctly, this repository will be named
upstream
when you type
git remote -v
.
-
The GitHub repository which you created through a fork. This is your public version of the repository. Everyone can access it on GitHub, but only you are allowed to modify it. Use this repository to share your work with the outside world, including when you want to submit a change to the "upstream" repository. If you followed the GitHub tutorial correctly, this repository will be named
origin
when you type
git remote -v
.
-
Your local copy of the repository. This is the one which consists of the files stored on your computer, not on a GitHub server. You have total control over this repository (it's on your machine, after all), but it is private to you, so no one else can see any changes you make here. This is what you obtained when you did the
git clone
command in the repository.
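To make the three copies concrete, here is a minimal practice sketch that imitates them entirely on your own machine, with no GitHub involved. It assumes two local "bare" repositories can stand in for the two GitHub repositories; the directory names (`seed`, `upstream.git`, `origin.git`, `local`) and the `README.md` file are made up for illustration and are not part of any SDS project.

```shell
#!/bin/sh
# Practice sandbox: everything lives in a temporary directory, nothing touches GitHub.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"

# Identity used only for the demo commit below (placeholder values).
export GIT_AUTHOR_NAME="Test User" GIT_AUTHOR_EMAIL="email@example.com"
export GIT_COMMITTER_NAME="Test User" GIT_COMMITTER_EMAIL="email@example.com"

# A scratch repository with one commit, used to seed the stand-in "GitHub" copies.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
echo "Hello, SDS!" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"

# 1. The definitive repository (stand-in for the supervisor's GitHub repo).
git clone -q --bare seed upstream.git
# 2. Your fork (stand-in for the copy GitHub creates when you fork).
git clone -q --bare upstream.git origin.git
# 3. Your local copy, cloned from your fork.
git clone -q origin.git local

cd local
git remote add upstream "$sandbox/upstream.git"
git fetch -q upstream
git branch -u upstream/master    # master now tracks the definitive repository

git remote -v      # lists both origin and upstream
git status -sb     # the first line names the branch that master tracks
```

Because the sandbox is created with `mktemp -d`, you can experiment freely and simply delete the directory afterwards.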
Making a change: basic workflow
After you have a copy of the project on your computer, you can start making changes. Most of your changes will be small and follow the same basic workflow, which we describe here.
Setting up
Note that you should do these steps before making any changes to the code!
- Switch to your master branch:
git checkout master
. -
Make sure your master branch is up to date with the definitive repo's master branch:
git pull upstream master
. Note that if you correctly switched your branch to track
upstream/master
(following the Getting started instructions), you can just run
git pull
instead.
-
Create and switch to a new branch to do your work on:
git checkout -b <branch-name>
. If you are working on a fix for a particular issue, name the branch
issue-<X>
, replacing
<X>
with the issue number. Otherwise, pick a short descriptive name, with all lowercase words
separated-by-hyphens
. We call this branch a feature branch.
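The three setup steps can be rehearsed in a throwaway sandbox. The sketch below assumes a local bare repository plays the role of the upstream project and that another student's change arrives there before you start; the file `notes.txt` and the issue number 42 are hypothetical.

```shell
#!/bin/sh
# Rehearse "Setting up" against a local stand-in for the upstream repository.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Test User" GIT_AUTHOR_EMAIL="email@example.com"
export GIT_COMMITTER_NAME="Test User" GIT_COMMITTER_EMAIL="email@example.com"

# Stand-in upstream with one commit on master.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/notes.txt
git -C seed add notes.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed upstream.git

# Local copy; "-o upstream" names the remote upstream instead of origin.
git clone -q -o upstream upstream.git local

# Meanwhile, another student's change lands in the upstream repository.
echo "v2" > seed/notes.txt
git -C seed commit -q -am "Update notes"
git -C seed push -q "$sandbox/upstream.git" master

# The three steps from this section.
cd local
git checkout -q master            # 1. switch to master
git pull -q upstream master       # 2. bring master up to date
git checkout -q -b issue-42       # 3. create and switch to a feature branch

git log --oneline -n 2            # the other student's commit is now included
```

Starting the feature branch only after the pull means your work begins from the newest upstream code.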
Committing your work
At this point (and only at this point) are you ready to make changes to your code. Git uses commits to record a logical change in the code. Git commits are quite flexible, and it will be up to you to decide what exactly constitutes a "logical change". The entirety of the change you make for a task might just be a single commit, or it might be several. Use your judgement when deciding when to commit, and try to err on the side of smaller commits.
With git, all commits you make are local to your computer, and are not sent to your public fork. This means that you don't need to worry too much about messing up a commit; even if you do, it won't be public, or affect anyone else! And it is possible to go back and edit your commit history later, though this is more advanced.
When you want to make a commit:
- Make sure your files have been saved (you don't want to commit stale changes, after all).
- Run
git status
. This should show you all of the files in your repo which you have modified, as well as which files are new or have been removed, since the last commit. Pay attention to this output: if there are any files which you didn't expect to see, that should be investigated. All changes at this point are unstaged, meaning git has noticed a change, but is not prepared to commit it.
- Run
git diff
. This lets you see all of the changes you have made since the last commit. This can be used as a final quality control check (think of the usual English meaning of the word "commit"). I often catch extraneous debugging statements or accidental whitespace changes here, and fix them before the actual commit.
-
Run
git add <file1> [<file2> ...]
for any files that have been modified/created that you want to commit, and similarly run
git rm <file1> [<file2> ...]
for files that you've deleted.
Tips:
- If you want to add all files in a folder, you can run
git add <folder>
rather than listing each file separately.
- You can use glob syntax to specify patterns of files to commit. For example,
git add *.py
will add all
.py
files in the current directory to your commit.
- Run
git status
again. You should now see a list of files that have staged changes, which are changes that have been marked for committing. Once again, this is a good check to make sure you are only committing the changes you want.
- Finally, run
git commit -m "type your commit message here"
. Your commit message should be a short but descriptive message describing the purpose of your commit.
About pre-commit hooks: All SDS projects use pre-commit hooks to run code checks on changes on each commit. The first time you make a commit you'll see a message saying that the pre-commit checks are being installed, which will take a bit of time. Most of the checks will make automatic changes to your files (fixing style errors), but occasionally they will report issues that can't be fixed automatically, and you'll need to fix them manually.
If the pre-commit checks report any issues, including when all of the issues have been fixed automatically, git will not actually commit your changes. If this happens, first make sure all of the issues have been fixed, and then repeat Steps 2-5 to add these fixes to your staged changes and then commit them.
- After making your commit, run a
git status
to check that your changes no longer appear as either staged or unstaged changes. That's because they're part of the git commit history now! If you now run
git log
, you should see the commit you just made as the most recent commit. You can make further changes, repeating the above steps every time you want to make a commit.
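Here is a small sketch of one full commit cycle, run in a throwaway repository so you can watch how the `git status` output changes at each step. The file names (`hello.py`, `notes.tmp`) and the branch name `issue-7` are made up for illustration, and the sandbox has no pre-commit hooks installed, so that part of the process is not shown.

```shell
#!/bin/sh
# One commit cycle in a scratch repository (no pre-commit hooks here).
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Test User" GIT_AUTHOR_EMAIL="email@example.com"
export GIT_COMMITTER_NAME="Test User" GIT_COMMITTER_EMAIL="email@example.com"

git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
echo 'print("hello")' > hello.py
git add hello.py
git commit -q -m "Add hello script"
git checkout -q -b issue-7

# Make a change we want to commit, plus a scratch file we do not.
echo 'print("goodbye")' >> hello.py
echo "scratch" > notes.tmp

git status --short          # hello.py shows as modified, notes.tmp as untracked
git --no-pager diff         # review the exact lines that changed
git add hello.py            # stage only the file we want
git status --short          # hello.py is now listed as staged
git commit -q -m "Print a goodbye message"
git status --short          # only the untracked notes.tmp remains
git log --oneline -n 1      # the new commit is the most recent one
```

Note that `notes.tmp` never enters the history, because it was never passed to `git add`.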
Adding yourself as a contributor
If you are making your first contribution to the project, please also add yourself to the list of the project's contributors. This helps us keep a record of everyone who's contributed to each project! Each SDS project has a list of contributors in the repository, though you might need to do a bit of searching to find it. Make sure to respect the existing order of contributors when adding your own name.
Note about selecting files to commit
There are some shortcuts you can take to save yourself some typing. You
can list files, directories, and glob patterns as arguments to git commit
rather
than running git add/rm
first, as long as git already tracks those files (brand-new files still
need a git add first). This is probably good enough for most
times when you'll want to commit.
You may also have heard about the -a
flag to git commit
. This causes the
commit to automatically include all modified and deleted files in the
commit. Use this flag with caution. It is very easy to overlook
files that you created but haven't added, or accidentally commit files
that you don't actually want to commit.
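The following sketch illustrates the first pitfall: `git commit -a` picks up changes to files git already tracks, but silently leaves out a newly created file. The file names are hypothetical and the repository is a throwaway one.

```shell
#!/bin/sh
# Show that "git commit -a" skips files that were never added.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Test User" GIT_AUTHOR_EMAIL="email@example.com"
export GIT_COMMITTER_NAME="Test User" GIT_COMMITTER_EMAIL="email@example.com"

git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
echo "one" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked file"

echo "two" >> tracked.txt       # change to a file git already tracks
echo "new" > brand_new.txt      # new file, never passed to git add

git commit -q -a -m "Commit with -a"

git diff-tree --no-commit-id --name-only -r HEAD   # files in the commit: tracked.txt only
git status --short                                 # brand_new.txt is still untracked
```

Running `git status` before and after the commit, as recommended above, is the easiest way to catch a file that was left behind.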
Sharing your changes
Because commits are local to your repository, you need to take some extra steps to share your changes with others. When you are ready to receive feedback on your work, do the following.
- Do a
git status
to check and make sure that you have no more changes left to commit.
- Update your branch with the latest version from master:
git pull upstream master
. It is important to make sure that your work is compatible with other updates to master which might have happened since you started working on your changes.
- Push your changes to your branch:
git push origin <branch-name>
. Note that
<branch-name>
should be the same as what you named your local branch, and that you should use
origin
(your fork) rather than
upstream
.
-
Visit your repository webpage on GitHub, and click on "New Pull Request." Make sure you have selected the correct fork and branches: the base branch should be the definitive master branch, and compare should be the new branch where you did your work.
Each SDS project uses a GitHub Pull Request Template to help you write informative pull request descriptions. Follow the instructions in the comments of the template to fill out each section.
Notes:
- GitHub uses
[ ]
to display a checkbox after you create the pull request; we use these to indicate the type of change you're making, and in the "Checklist" at the bottom of the pull request template. You can turn these checkboxes into a "checked" state by replacing the
[ ]
with
[x]
, or by clicking on the checkbox manually after creating your pull request.
- If your pull request resolves a GitHub issue (you will know if this is the case based on the task that was assigned to you), use a GitHub closing keyword to link your pull request to the issue. Then when your pull request is merged in, GitHub will close the linked issue automatically.
-
Before creating the pull request, carefully review your file changes (by scrolling down below the pull request description). This may seem redundant because you'll be seeing all the changes you made, but it serves as a final check before you request that others review your work. If you find some things you want to change, cancel the pull request, and update your feature branch (see below).
- Then, create your pull request!
- Wait until the continuous integration checks pass (see the status at the bottom of the pull request page). If the checks do not all pass, you'll need to click on "Details" to investigate why—I encourage you to ask other students about this.
- After you have performed one last self-review of your code and ensured that all of the checks pass, request a review from me (
david-yz-liu
). That will send me a notification that your work is ready for review.
After requesting a review, you can start working on new tasks, but remember to start back at the beginning of this workflow (right at Setting up).
Modifying your fork
You may want to modify the code you pushed to your fork, either when
reviewing your work before making a pull request, or after receiving
feedback on a pull request. To do this, simply make new commits, and do
another git push origin <branch-name>
; the branch will be updated
according to the new commits. If you have already made a pull request,
it will update automatically with the new changes, so there is no need
to make a new one.
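As a rough illustration, the sketch below pushes a branch, adds a follow-up commit, and pushes again, then checks that the copy on the "fork" has moved to the new commit. It assumes a local bare repository stands in for your GitHub fork; the names `origin.git`, `feature.txt`, and `issue-9` are invented for the example, and no pull request is involved.

```shell
#!/bin/sh
# Push a branch twice and confirm the remote branch follows the new commits.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Test User" GIT_AUTHOR_EMAIL="email@example.com"
export GIT_COMMITTER_NAME="Test User" GIT_COMMITTER_EMAIL="email@example.com"

# Stand-in fork with one commit on master.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "base" > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed origin.git

git clone -q origin.git local
cd local
git checkout -q -b issue-9

# First version of the work, shared with a push.
echo "first attempt" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git push -q origin issue-9

# After feedback: a new commit, then the same push command again.
echo "revised after review" > feature.txt
git commit -q -am "Address review feedback"
git push -q origin issue-9

git rev-parse issue-9                             # local branch tip
git -C "$sandbox/origin.git" rev-parse issue-9    # same hash on the stand-in fork
```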
Common questions/issues
I accidentally committed my changes to my master
branch rather than a feature branch!
-
First, create a new branch (following the naming guidelines described above):
$ git checkout -b <branch-name>
-
Then, switch back to your
master
branch.
$ git checkout master
-
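The above two steps ensure that your commits have been saved in a new feature branch. This last step is to reset your master branch to be an exact copy of the upstream master branch. The --hard flag makes your files match as well, so first run git status and make sure you have no uncommitted changes, because those would be discarded:
$ git reset --hard upstream/master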
To check your work, run git log
and verify that your commits are no longer on your master
branch.
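If you would like to rehearse this recovery before trying it on real work, here is a sketch in a throwaway sandbox. It assumes a local bare repository stands in for the upstream project, and it uses `git reset --hard` so that the files on master are reset along with the branch (a plain `git reset` moves the branch but leaves the files as they were). The names `fix.txt` and `issue-3` are hypothetical.

```shell
#!/bin/sh
# Rehearse moving an accidental commit from master onto a feature branch.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Test User" GIT_AUTHOR_EMAIL="email@example.com"
export GIT_COMMITTER_NAME="Test User" GIT_COMMITTER_EMAIL="email@example.com"

# Stand-in upstream and a local clone whose remote is named upstream.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "base" > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed upstream.git
git clone -q -o upstream upstream.git local
cd local

# The mistake: a commit made directly on master.
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix the bug"

git checkout -q -b issue-3             # 1. save the commit on a new branch
git checkout -q master                 # 2. go back to master
git reset -q --hard upstream/master    # 3. make master match upstream again

git log --oneline master     # only the initial commit
git log --oneline issue-3    # still contains "Fix the bug"
```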
Afterwards, you may want to do the following:
- Run
git pull upstream master
on both your
master
and feature branches to update them with the latest changes.
- Switch back to your feature branch and make a pull request.
I was working on two different issues, but accidentally created my second feature branch off of the first one rather than off of my master branch (and made commits)!
When this happens, your second feature branch will contain the commits from both the first and second issue. This can be a bit tricky to resolve, depending on the complexity of the commits you've made, so we've provided a few different options.
-
Option 1: use
git cherry-pick
(works best when the second feature has a small number of commits).
To do so, you should first do a
git log
to determine the hashes of the commit(s) that belong to the second issue you're working on, and record them.
Then, switch back to
master
, and create a new branch off of master; this branch will be your corrected "Issue 2" branch. Then use
git cherry-pick
to "bring over" the commits for just the second issue onto your new branch.
-
Option 2: use git interactive rebase (
git rebase -i
).
Git rebase is a powerful tool for rewriting the current branch's commit history, and in particular can be used to remove certain commits from the branch. Be warned, however, that rebase is a more advanced git feature, so if you are trying it out for the first time, I encourage you to first switch to a new branch so that any changes you make to this branch won't affect the original branch for the second feature.
-
Option 3: use
git reset master
to reset the commit history back to the master branch, but preserve the current state of all files. After doing this option, your repository should be in a state as if you've made changes for both issues you're working on, but have not committed any changes. (Do a
git status
to check this.)
Then, you can "undo" the changes you made for the first issue (e.g., using
git restore
), and then commit just the changes for the second issue.
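As an illustration of Option 1, the sketch below recreates the mistake in a throwaway repository and then repairs it with `git cherry-pick`. It assumes the second issue consists of a single commit; the branch name `issue-2-fixed` and the file names are made up for the example.

```shell
#!/bin/sh
# Recreate "second branch made off the first" and fix it with cherry-pick.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Test User" GIT_AUTHOR_EMAIL="email@example.com"
export GIT_COMMITTER_NAME="Test User" GIT_COMMITTER_EMAIL="email@example.com"

git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Work for issue 1 on its own branch.
git checkout -q -b issue-1
echo "one" > one.txt
git add one.txt
git commit -q -m "Fix issue 1"

# The mistake: issue-2 is created while still on issue-1.
git checkout -q -b issue-2
echo "two" > two.txt
git add two.txt
git commit -q -m "Fix issue 2"

# Record the hash of the commit that belongs to issue 2 (normally read off git log).
pick=$(git rev-parse issue-2)

# Rebuild the branch from master and bring over just that commit.
git checkout -q master
git checkout -q -b issue-2-fixed
git cherry-pick "$pick" > /dev/null

git log --oneline master..issue-2-fixed   # only the issue 2 commit
ls                                        # two.txt is present, one.txt is not
```

With several commits for the second issue, you would pass each recorded hash to `git cherry-pick` in the order the commits were made.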
Tips
This section includes a few tips to improve your git workflow. While not strictly necessary, I recommend going through this section at the start of the semester.
Using SSH for authentication
You can set up your GitHub account to use ssh keys instead of passwords for authentication. The following GitHub guides cover how to do this:
- Generating a new SSH key and adding it to the ssh-agent
- Adding a new SSH key to your GitHub account
- Testing your SSH connection
- Switching remote URLs from HTTPS to SSH (you'll need to do this for your local repository, for both the
origin
and
upstream
remotes)
Useful git commands
Below are some git commands that students often find useful.
I recommend running git status
before and after trying one of these commands to help you keep track of what's going on.
-
View the changes you have made on branch
issue-1234
(
git diff
documentation):
$ git diff --full-index master issue-1234
-
Temporarily set aside your uncommitted local changes, saving them for later (
git stash
documentation):
$ git stash
-
Reapply previously stashed changes:
$ git stash pop
-
Undo unstaged changes made at a specific path (
git restore
documentation):
$ git restore <path1> [<path2> ...]
-
Unstage changes made at a specific path:
$ git restore --staged <path1> [<path2> ...]
-
Delete a local branch, typically after your pull request has been merged in (
git branch
documentation):
$ git branch -D <branch-name>
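To see how the stash and restore commands above behave, here is a short sketch you can run in a throwaway repository. The file `notes.txt` and the variable names are hypothetical; `git restore` assumes a reasonably recent version of git.

```shell
#!/bin/sh
# Try out stash, stash pop, and the two forms of restore on a scratch file.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Test User" GIT_AUTHOR_EMAIL="email@example.com"
export GIT_COMMITTER_NAME="Test User" GIT_COMMITTER_EMAIL="email@example.com"

git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "line 2" >> notes.txt        # an uncommitted edit

git stash -q                      # set the edit aside
during_stash=$(cat notes.txt)     # the file is back to its committed content

git stash pop -q                  # bring the edit back
after_pop=$(cat notes.txt)

git add notes.txt                 # stage the edit...
git restore --staged notes.txt    # ...then unstage it again (the edit is kept)
git restore notes.txt             # finally, discard the edit entirely

git status --short                # prints nothing: the working tree is clean
```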
Useful GitHub features
- To see what commit introduced or last modified a specific line of code: GitHub Guide: Viewing the line-by-line revision history for a file
Recommended git configuration options
I recommend the following git configuration options to help simplify your workflow. If you are an experienced git user you may not wish to use (all of) these settings.
Based on this blog post.
-
pull.rebase false
: ensure that you trigger a merge (instead of a "rebase") when running
git pull
.
$ git config pull.rebase false
-
merge.conflictStyle zdiff3
: use a more helpful algorithm for displaying merge conflicts.
$ git config merge.conflictStyle zdiff3
-
push.default current
and
push.autoSetupRemote true
: when running
git push
, automatically create a branch on your fork with the same name as your current local branch.
$ git config push.default current
$ git config push.autoSetupRemote true
-
rerere.enabled true
: enable git rerere to help with resolving merge conflicts.
$ git config --global rerere.enabled true
-
diff.algorithm histogram
: use a more helpful algorithm for displaying
git diff
output.
$ git config --global diff.algorithm histogram
-
transfer.fsckobjects true
: detect malformed data when fetching/receiving data.
$ git config transfer.fsckobjects true
Resources
- GitHub Guides. Beginner-friendly introductions to how to use GitHub. "Understanding the GitHub Flow" is probably the most useful for our purposes.
- GitHub's Git Guide.
- Git Book. Very comprehensive resource to understanding more advanced topics in Git.
- Learning Git through interactive exercises.
- How to Write a Git Commit Message