Erik Wilde on Services and APIs

Sunday, February 24, 2013

Github Fork Etiquette

this is a brief post about how to contribute to a github repository through the fork and pull request method. while there are a lot of short intros and how-to pages out there, none had the right mix of commands and workflow for me, so i am writing this in part simply as my own cheat sheet.

keep in mind that this is just one way of doing this, and there are many other methods and workflows that work equally well (and may work better for your use case).

the main idea is to never ever commit to the master branch in your fork, and only use it as a reference for the upstream changes. all changes are made in topic branches, which you rebase onto the master branch before creating pull requests. if you are working on multiple issues, simply create multiple topic branches off the master branch.

once you have forked the repository on github and cloned your fork, the clone is ready to work on, but it is disconnected from the original. the convention is to add a remote called upstream, which you can later use to refer to the original repository:

cd repo
git remote add upstream https://github.com/original/repo.git

you can easily find the original repository's URI by visiting its home page on github. now that your fork is connected to the original repository, you should periodically pull the latest changes from the original into your master branch (with master checked out):

git pull --ff-only upstream master

the --ff-only option ensures that in case you accidentally did make any changes to your fork's master branch, the pull fails instead of creating a merge commit, and you first have to clean up. after a successful pull from upstream, make sure to push the changes to your fork on github:

git push origin master

the pull/push sequence is something you should repeat fairly frequently, in particular when you start working on a new issue. it makes sure you're up to date with what's going on in the original repository.
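
to see what --ff-only actually protects against, here is a minimal sketch that plays the sync sequence in a throwaway sandbox. the local repositories original and fork.git are assumptions standing in for the original repository and your fork on github, and the commits are empty placeholders:

```shell
#!/bin/sh
# sandbox sketch: local repositories stand in for the github repositories
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "original" plays the upstream repository, "fork.git" plays your fork
git init -q original
git -C original symbolic-ref HEAD refs/heads/master
git -C original commit -q --allow-empty -m "initial commit"
git clone -q --bare original fork.git
git clone -q fork.git repo
cd repo
git remote add upstream "$sandbox/original"

# upstream moves forward
git -C "$sandbox/original" commit -q --allow-empty -m "upstream change"

# the sync sequence from above
git checkout -q master
git pull -q --ff-only upstream master
git push -q origin master
synced=$(git rev-parse master)

# simulate the mistake: a commit made directly on master
git commit -q --allow-empty -m "accidental commit on master"
git -C "$sandbox/original" commit -q --allow-empty -m "another upstream change"
if git pull -q --ff-only upstream master 2>/dev/null; then
    ff_result="pulled"
else
    ff_result="refused"
fi
echo "$ff_result"
```

the first pull fast-forwards cleanly, while the second one is refused because master has diverged from upstream. the sandbox lives in a temporary directory and can simply be deleted afterwards.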

when you start working on an issue, create a branch off the master branch, and maybe give it a name that's helpful to the maintainer of the original repository:

git checkout -b yourname/issuename

a word of caution: when you are creating a new issue branch, make sure you are creating it from the master branch, i.e. make sure to first check out the master branch, and then create your new branch.
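
one way to make this harder to get wrong is to name master as the start point when creating the branch, so it does not matter which branch is currently checked out. a small sketch in a throwaway repository (the branch names first-issue and second-issue are made up for illustration):

```shell
#!/bin/sh
# sandbox sketch: a single throwaway repository, no remotes needed
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial commit"

# an existing topic branch that already has work on it
git checkout -q -b yourname/first-issue
git commit -q --allow-empty -m "work on the first issue"

# start the second issue from master, even though first-issue is checked out
git checkout -q -b yourname/second-issue master

# prints nothing: the new branch has no commits that master lacks
git log --oneline master..yourname/second-issue
```

without the trailing master, the second branch would have started from yourname/first-issue and dragged that issue's commit into the next pull request.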

all your changes should now be committed to that branch, which nicely isolates them. however, should the master branch move forward, your branch may get out of sync. this is something you need to address before creating a pull request.

when you want to submit your changes, first pull the latest changes from the upstream master, as shown above, so that your fork's master branch reflects the upstream master branch. then rebase your branch, so that it is based off this newer version of the master branch (make sure your topic branch is checked out when doing this):

git rebase master

at this point your branch looks as if you had made all the changes against the latest version of the upstream master. push the branch to your fork on github (git push origin yourname/issuename), and then create a pull request and review the changes before you actually submit it. make sure to create the pull request with the base being the original repository's master, and the head being your fork's topic branch.
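
the whole submit sequence can be sketched end to end in a sandbox. as before, the local repositories original and fork.git are assumptions standing in for the original repository and your fork on github, and fix.txt is a made-up file:

```shell
#!/bin/sh
# sandbox sketch: local repositories stand in for the github repositories
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q original
git -C original symbolic-ref HEAD refs/heads/master
git -C original commit -q --allow-empty -m "initial commit"
git clone -q --bare original fork.git
git clone -q fork.git repo
cd repo
git remote add upstream "$sandbox/original"

# work on a topic branch while upstream moves forward
git checkout -q -b yourname/issuename
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix the issue"
git -C "$sandbox/original" commit -q --allow-empty -m "upstream change"

# bring master up to date, then rebase the topic branch onto it
git checkout -q master
git pull -q --ff-only upstream master
git push -q origin master
git checkout -q yourname/issuename
git rebase -q master

# publish the topic branch to the fork, so a pull request can be opened from it
# (if the branch was already pushed before the rebase, the push has to be
# forced, for example with --force-with-lease)
git push -q origin yourname/issuename

# exactly one commit sits on top of the current master
git log --oneline master..yourname/issuename
```

after the rebase, the topic branch consists of the latest master plus your own commits, which is what keeps the pull request self-contained.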

once the pull request has been accepted, the changes become part of the original's master branch, which means the next time you pull from upstream, you should be getting your own changes (now part of the original master branch). this also means that the branch you've created does not need to exist anymore, and you can safely remove it (after switching to a different branch such as master, since git will not delete the branch that is currently checked out):

git branch -D yourname/issuename
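
a slightly more cautious variant of the cleanup is sketched below. it assumes the maintainer accepted the pull request with a regular merge commit (simulated here by a local merge in the stand-in repository original), and it uses lowercase -d, which refuses to delete a branch whose commits have not been merged:

```shell
#!/bin/sh
# sandbox sketch: local repositories stand in for the github repositories
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q original
git -C original symbolic-ref HEAD refs/heads/master
git -C original commit -q --allow-empty -m "initial commit"
git clone -q --bare original fork.git
git clone -q fork.git repo
cd repo
git remote add upstream "$sandbox/original"

# a topic branch with one commit, pushed to the fork
git checkout -q -b yourname/issuename
git commit -q --allow-empty -m "fix the issue"
git push -q origin yourname/issuename

# stand-in for the maintainer accepting the pull request with a merge commit
git -C "$sandbox/original" fetch -q "$sandbox/fork.git" yourname/issuename
git -C "$sandbox/original" merge -q --no-ff -m "merge pull request" FETCH_HEAD

# pick up the merged changes, then clean up locally and on the fork
git checkout -q master
git pull -q --ff-only upstream master
git push -q origin master
git branch -d yourname/issuename
git push -q origin --delete yourname/issuename
remaining=$(git branch --list 'yourname/*')
```

if the maintainer squashed or rebased the commits instead of merging them, -d will refuse because git cannot see the branch as merged, and the -D form shown above is the way to go.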

that's it. what's not covered here are two issues that also may be relevant:

rebasing may lead to conflicts, which have to be resolved. git will lead you through that process, but essentially you are paying the price to be able to create one neatly self-contained pull request.
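
as a rough idea of what that process looks like, here is a sketch that provokes a conflict in a throwaway repository and resolves it. the file notes.txt and its contents are made up, and the resolution is simply written out instead of being edited by hand:

```shell
#!/bin/sh
# sandbox sketch: a single throwaway repository, no remotes needed
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "original line" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

# the topic branch and master both change the same line
git checkout -q -b yourname/issuename
echo "topic version" > notes.txt
git commit -q -am "change notes on the topic branch"
git checkout -q master
echo "master version" > notes.txt
git commit -q -am "change notes on master"

# the rebase stops at the conflicting commit
git checkout -q yourname/issuename
if git rebase master >/dev/null 2>&1; then
    rebase_result="clean"
else
    rebase_result="conflict"
fi
git status --short   # UU marks files with unresolved conflicts

# resolve the conflict, stage the file, and let the rebase continue
echo "topic version on top of master version" > notes.txt
git add notes.txt
GIT_EDITOR=true git rebase --continue >/dev/null
# alternative: git rebase --abort returns the branch to its pre-rebase state
```

in real life the resolution step means opening the file, removing the conflict markers, and deciding what the combined content should be.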