How We Use GitHub To Build AppFog

Joining AppFog back in February was my first time working for a company that uses GitHub private repositories for source control. My previous gig used Bitbucket and Mercurial, which we migrated to from self-hosted SVN, which we migrated to from SourceGear Vault, which we migrated to from Visual Source Safe… etc. That’s how things go at a 12-year-old company, I guess. As a startup, AppFog can start off on the right foot and just use GitHub from the get-go. Yippee!

I have used git and GitHub for a while in open source projects and know the typical workflow of forking and submitting pull requests. AppFog’s workflow is very similar. Here’s how we approach it.

Setting Up Your Personal Fork

The AppFog organization has a number of projects in it. Everyone in the organization has commit access and can create/destroy/modify any of the repositories. We could just clone the repository down, edit the code, commit and push to master…but there’s a better way.

Our approach is for each developer to create a fork of the project they want to work on.

For example, suppose we have a repository in the AppFog org called yummy-sandwich. I’d head on over to the project page at https://github.com/appfog/yummy-sandwich and click the Fork button near the top of the page.

Once the fork exists, I clone it to my machine. One thing you’d notice is that git clone adds the origin remote automatically. That’s great and all, but it doesn’t give me an easy way to pull upstream changes from the main repository into my personal fork. To get that, I’ll hop into the cloned directory and add an upstream remote that points at the main appfog/yummy-sandwich repository.
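Here is a minimal sketch of that step. It assumes the remotes use HTTPS and that the clone URL is the project page URL with .git appended. The add_upstream helper is a hypothetical name used for illustration, not part of git.

```shell
# Assumption: HTTPS clone URL derived from the project page shown above.
UPSTREAM_URL="https://github.com/appfog/yummy-sandwich.git"

# add_upstream (hypothetical helper): register the shared repository as the
# "upstream" remote of the repository in the current directory, unless a
# remote with that name already exists, then list the configured remotes.
add_upstream() {
  if git remote get-url upstream >/dev/null 2>&1; then
    echo "upstream already configured"
  else
    git remote add upstream "$UPSTREAM_URL"
  fi
  git remote -v
}
```

Run add_upstream from inside the cloned yummy-sandwich directory. Without the helper, the single git remote add upstream line does the same job.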

Modifying The Code

Now that I’ve got my personal fork configured, I can do whatever I’d like to it without disturbing our pristine shared repository… but to keep things organized and clear, I’m not just going to start tweaking bits and committing to master. Instead, I’ll work entirely on topic branches, regardless of how small or large a change is.

The following steps detail how to create a branch, update the code, push changes to the remote, and bring these changes to the attention of the person responsible for merging pull requests.

First, I’ll bring the master up to date using that upstream remote I created.

# switch to master
git checkout master

# pull in latest code
git pull upstream master

This pulls all recently merged changes from the main appfog/yummy-sandwich repository into the master branch of my local clone. Next, I’ll create the branch that I’m going to work on. Branch names follow a specific structure.

Branch Names

We break down the work we do into four distinct types of tasks: bugs, features, chores and hotfixes.

A bug is a normal bug fix of existing functionality.

A feature is new functionality.

A chore is something that adds business value, but doesn’t qualify as a feature (e.g. refactoring).

A hotfix is an urgent fix for an immediate problem on a server (if you’re using hotfixes, you’re probably doing something wrong).

The branch name itself consists of the branch type, a brief underscore-separated description, and an optional GitHub issue number. The format is type/a_brief_description_#, where # is the issue number.

Suppose I wanted to fix a bug where shared users aren’t able to log in to the site. After creating the GitHub issue, I would name the branch bug/shared_users_cant_login_123, where 123 is the issue number.
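The naming rule is mechanical enough to sketch as a small shell function. The branch_name helper below is hypothetical and only illustrates the convention. The way it lowercases the text and drops apostrophes is my own assumption about how a description becomes underscore-separated.

```shell
# branch_name TYPE "description" [ISSUE]   (hypothetical helper)
# Prints a branch name in the form type/a_brief_description_#.
branch_name() {
  kind="$1"
  issue="${3:-}"

  # Only the four task types described above are accepted.
  case "$kind" in
    bug|feature|chore|hotfix) ;;
    *) echo "unknown branch type: $kind" >&2; return 1 ;;
  esac

  # Lowercase the description, drop apostrophes, turn every run of other
  # non-alphanumeric characters into one underscore, and trim the ends.
  slug=$(printf '%s' "$2" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -d "'" \
    | tr -cs 'a-z0-9' '_' \
    | sed 's/^_//; s/_$//')

  if [ -n "$issue" ]; then
    printf '%s/%s_%s\n' "$kind" "$slug" "$issue"
  else
    printf '%s/%s\n' "$kind" "$slug"
  fi
}
```

For the example above, branch_name bug "shared users can't login" 123 prints bug/shared_users_cant_login_123.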

Creating the Topic Branch

To create the branch, run git checkout -b <branch name>, where <branch name> is the type/description_id name we just described. For example, using our hypothetical bug fix task from before, I’d run:

git checkout -b bug/shared_users_cant_login_123

Once I’ve created the branch with the checkout command, I follow the normal process of changing code, staging, and committing, and then wrap it up by pushing the branch to my fork:

git push origin HEAD

After that, use the GitHub web interface to select the branch from the Current Branch dropdown and issue a Pull Request back to the original project.

So, start to finish, here are the steps to get some work done…

Steps To Work On The Branch

1. Checkout master: git checkout master

2. Pull in updates: git fetch upstream

3. Merge updates into master: git merge upstream/master

4. Create the branch: git checkout -b bug/branch_description_123

5. Make the code changes

6. Stage changes for commit: git add <changed filenames>

7. Commit changes: git commit -m "This is a descriptive commit message"

8. Push changes to my personal fork: git push origin HEAD

9. From GitHub, select the branch from the Current Branch dropdown and issue a Pull Request
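The first four steps and the push can be wrapped in two small shell functions. This is only a sketch: start_topic_branch and publish_topic_branch are hypothetical names, and the functions assume the origin and upstream remotes already exist in the current clone.

```shell
# start_topic_branch BRANCH   (hypothetical helper)
# Updates the local master from upstream, then creates BRANCH from it.
# The && chain stops at the first command that fails.
start_topic_branch() {
  git checkout master &&
  git fetch upstream &&
  git merge upstream/master &&
  git checkout -b "$1"
}

# publish_topic_branch   (hypothetical helper)
# Pushes the currently checked-out branch to the personal fork (origin).
publish_topic_branch() {
  git push origin HEAD
}
```

A session would then be start_topic_branch bug/branch_description_123, followed by the usual edit, git add, and git commit cycle, and finally publish_topic_branch. The pull request itself is still opened from the GitHub web interface.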

Pull Request Guidelines

We follow a few basic guidelines when creating pull requests.

Just Say No To Auto-Merge

Don’t auto-merge the pull request through GitHub’s interface. Ever.

Explain the Pull Request

Why is this pull request here? The branch name gives a brief description, but the pull request itself should say a bit more. Leave a nice, clear description of what the branch is for and why. Reference any related issues. GitHub makes this super easy: type something like #420 and it will auto-link the issue.

Don’t Commit Gemfile.lock

This one is Ruby-specific. If you didn’t make changes to Gemfile, don’t commit Gemfile.lock.
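A quick way to catch this before committing is to look at what is staged. The check_gemfile_lock function below is a hypothetical helper, not something git or Bundler provides. It treats "Gemfile.lock is staged but Gemfile is not" as the warning sign, which is my reading of the rule above.

```shell
# check_gemfile_lock   (hypothetical helper)
# Returns non-zero when Gemfile.lock is staged without Gemfile.
check_gemfile_lock() {
  # Names of all files currently staged for the next commit.
  staged=$(git diff --cached --name-only)

  # -x matches whole lines only, -F treats the name as a fixed string.
  if printf '%s\n' "$staged" | grep -qxF 'Gemfile.lock' &&
     ! printf '%s\n' "$staged" | grep -qxF 'Gemfile'; then
    echo "Gemfile.lock is staged but Gemfile is not." >&2
    echo "Unstage it with: git reset HEAD Gemfile.lock" >&2
    return 1
  fi
}
```

Running check_gemfile_lock just before git commit would flag the problem. It could also be called from a pre-commit hook, though that is a suggestion rather than part of the workflow described here.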

Config Files

This one is Rails-specific and a bit of hyperbole, but it relates to how we use git. If the branch requires the creation of a new config file, say so in the description in big, bold text.

Also do the following:

Add the config file to .gitignore. We do not store our configs in source control.

Provide an example of your config file suitable for development purposes at config/myconfig.example.yml, where myconfig is the name of your config and yml can be whatever format the file uses.

We have a rake task configs:copy that can find those example config files and copy them to the normal names. This is handy when developing locally. On the production servers, the config files are managed separately from source code and are linked into the config directory from another location.
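The rake task itself isn’t shown here, so the following is only a shell approximation of what a task like configs:copy might do. The copy_example_configs name is hypothetical, and skipping files that already exist is an assumption on my part.

```shell
# copy_example_configs [DIR]   (hypothetical helper)
# For each DIR/name.example.ext, create DIR/name.ext unless it already exists.
# DIR defaults to "config".
copy_example_configs() {
  dir="${1:-config}"
  for example in "$dir"/*.example.*; do
    # When nothing matches, the glob stays literal; skip that case.
    [ -e "$example" ] || continue

    # config/myconfig.example.yml -> config/myconfig.yml
    target=$(printf '%s' "$example" | sed 's/\.example\././')

    if [ -e "$target" ]; then
      echo "skip   $target (already exists)"
    else
      cp "$example" "$target"
      echo "create $target"
    fi
  done
}
```

The matching .gitignore entry for the example above would be a line containing config/myconfig.yml, so the copied file never ends up in a commit while the .example file does.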

Merging

Once the pull request is made, it becomes the responsibility of the merge master to accept or reject the changes. Being the merge master is a rotating responsibility that involves looking over the pull requests, reviewing the code, and possibly asking the developer to make additional changes, add more unit tests, and so on. This is our opportunity to discuss the changes in detail and introduce some standards and process for code quality.

When a pull request is accepted, we merge it onto a special qa branch in the main repository. Here we can get all willy-nilly and try out the code, blow away the changes, or what have you. If everything looks good, we then merge a known-good change set from the qa branch into the master branch of the main project. The qa branch is also the branch our continuous integration system watches: when it is updated, it kicks off our continuous delivery pipeline.
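One way to do these merges by hand is sketched below. It rests on several assumptions: the commands run in a clone of the main repository, the developer’s fork has been added there as a remote, and merge_to_qa and promote_qa are hypothetical helper names.

```shell
# merge_to_qa REMOTE BRANCH   (hypothetical helper)
# Fetches a developer's topic branch and merges it onto qa. --no-ff always
# records a merge commit, so the topic branch stays visible in the history.
merge_to_qa() {
  git fetch "$1" "$2" &&
  git checkout qa &&
  git merge --no-ff -m "Merge $2 from $1 into qa" FETCH_HEAD
}

# promote_qa COMMIT   (hypothetical helper)
# Merges a known-good commit (a hash, or simply "qa") into master.
promote_qa() {
  git checkout master &&
  git merge "$1"
}
```

For example, merge_to_qa jane bug/shared_users_cant_login_123 followed later by promote_qa qa, where jane is a made-up remote name. Pushing qa and master back to the main repository is left out because remote names vary between setups.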

Conclusion

At AppFog we make it a practice to continually review and improve our workflows when we’re working with GitHub (or any tools for that matter). We are always looking for ways to reduce complexity while still getting the same or even better results.

One of the things we’re currently considering is moving away from personal forks and instead just working with topic branches right in the main project. This should keep things just as organized but reduce the complexity of the process significantly. One radical member of our team suggests never using branches. So far, we smile and nod. He may convince us someday.

We’re also rethinking the merge master role, opting instead to distribute the responsibility for merging code across the team but with the axiom “never merge your own code”, ensuring that code review is baked into the process.

The main benefit that personal forks give is access control. Repositories that are owned by the organization have a different ACL than the forks. It also keeps the ‘main’ repo from having a lot of branches in it, so it’s a bit cleaner.

But I do agree with your sentiment that personal forks might not be that valuable. Originally, the role of merge master was important, and the ability to control commit access to the main repo was considered relevant. Imagine a scenario where you have an external contractor working on your code: give them read access to your organization’s repo, let them make a personal fork, do their work and submit a pull request, then someone at the company reviews, approves, and merges it back into the product. This workflow has some distinct advantages over just giving the contractor full access to the repository.

Increasingly, though, we’re loosening that process and sharing the merge responsibilities. This is all part of our gradual evolution towards a fully automated continuous delivery pipeline. We’re trying to remove the blocking elements of our workflow. Currently, all the devs have commit access to the main repos anyway, so the benefits aren’t as tangible as they once were.