Flawed ways of working: git-rebase

I tell my children not to lie. My parents told me not to lie. I’m pretty sure the vast majority of parents tell their children not to lie. Aside from used car salesmen and politicians, I think not lying is pretty ingrained in what we expect of each other. That is not to say that used car salesmen and politicians lie; it is to say that we expect them to lie: there is an ingrained distrust in society toward people who embellish or promise things for a living.

The real problem is when procedures force us to lie, and force us to use our tools in a way that makes them lie. One way to make my favorite version control tool, Git, lie is by using git-rebase. The rebase command allows you to re-write history, to pretend that the point at which you created a given branch isn’t really where you created it: you created it at some other point in the code’s history instead.

I’ve been using Git for several years now [1], and have seen only very few real use-cases for git-rebase.

One is to take a branch that is created from one branch and reproduce its changes onto another branch. If the two branches (the original branch and the product of the rebase) will never meet (i.e. will not both be merged into the same branch — for example a feature branch and a maintenance branch), and there were too many changes to cherry-pick them, you may want to use rebase as a mega-cherry-pick. The effects are nearly the same as a whole bunch of cherry-picks, but you do have to take care to leave a branch behind. This is done like this:
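A minimal sketch of such a "mega-cherry-pick", assuming a throw-away repository and hypothetical branch names (master, maintenance, feature, feature-maint), none of which come from Arachnida. The rebase is done on a copy of the feature branch, so the original branch is left behind untouched:

```shell
#!/bin/sh
# Sketch only: repository, file names and branch names are hypothetical.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Author"
git config user.email "author@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"
git branch maintenance                    # the maintenance line forks off here

echo more > master.txt
git add master.txt
git commit -q -m "work on master"

git checkout -q -b feature                # the feature branch is created from master
echo one > feature1.txt
git add feature1.txt
git commit -q -m "feature: part one"
echo two > feature2.txt
git add feature2.txt
git commit -q -m "feature: part two"

# Rebase a copy, so that "feature" itself stays where it was.
git checkout -q -b feature-maint feature
# Replay the commits in master..feature-maint on top of maintenance.
git rebase -q --onto maintenance master feature-maint

git log --format=%s feature-maint         # part two, part one, base
git log --format=%s feature               # part two, part one, work on master, base
```

The feature-maint branch now holds copies of the two feature commits on top of the maintenance line, without the work that only exists on master.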

A second use-case just happened about an hour ago: while conducting a routine security review of Arachnida’s code changes since the last release (three months ago) I found three commits that were not adequately commented: one comment just said “add eight use-cases, fix a bug” and two others said “debug session”. All three fixed real bugs, and no bug was adequately documented elsewhere (though none of the three bugs was a security issue and all were in code that had only recently been added, and for which unit tests had failed before the commit). As I was conducting a security review and these commits lit up, I decided to amend them, which goes like this:
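A minimal sketch of amending the message of an older commit with an interactive rebase, in a throw-away repository with hypothetical commits. Interactively you would run git rebase -i, change "pick" to "reword" on the offending line and type the new message in your editor; here the two editors are scripted through GIT_SEQUENCE_EDITOR and GIT_EDITOR so the example runs unattended. The replacement message is made up for illustration:

```shell
#!/bin/sh
# Sketch only: commits and the replacement message are hypothetical.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "initial import"
echo b > b.txt
git add b.txt
git commit -q -m "debug session"          # the inadequately commented commit
echo c > c.txt
git add c.txt
git commit -q -m "add tests"

# The message we want the offending commit to have instead.
msg=$(mktemp)
echo "Fix hypothetical bug found during a debug session" > "$msg"

# The todo list contains the last two commits, oldest first: turn the first
# "pick" into "reword", then let the "editor" overwrite the commit message.
GIT_SEQUENCE_EDITOR='sed -i.bak "1s/^pick/reword/"' \
GIT_EDITOR="cp $msg" \
git rebase -q -i HEAD~2

git log --format=%s
```

The reworded commit and every commit after it get new hashes, which is why the push that follows is no longer a fast-forward.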

Once I had done this on each of the offending commits, I had to push the modified commits to my server, which refuses non-fast-forward pushes — even with --force — so I had to hack on the server a bit:

git --bare branch master.orig-<datecode for today> # keeps the current master with a datecode, so we know what happened and how to undo
git update-ref HEAD HEAD~63 # backs up 63 commits, to just before the first amended commit


After that, a git-push worked.
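The whole sequence can be sketched locally, assuming the server's refusal comes from the receive.denyNonFastForwards setting (an assumption on my part for this illustration) and using a bare repository in a temporary directory as a stand-in for the server. The datecode and the commits are hypothetical, and only the last commit is amended here, so the server backs up one commit rather than 63:

```shell
#!/bin/sh
# Sketch only: a local bare repository plays the part of the server.
set -e
work=$(mktemp -d)
cd "$work"

git init -q --bare server.git
git --git-dir=server.git symbolic-ref HEAD refs/heads/master
git --git-dir=server.git config receive.denyNonFastForwards true

git init -q client
cd client
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
echo a > a.txt
git add a.txt
git commit -q -m "initial import"
echo b > b.txt
git add b.txt
git commit -q -m "debug session"
git remote add origin "$work/server.git"
git push -q origin master

# Amend the last commit locally: history now differs from the server's.
git commit -q --amend -m "Fix hypothetical bug"

# The server refuses the rewritten history, even with --force.
if git push -q --force origin master 2>/dev/null; then
    echo "unexpected: forced push was accepted"
    exit 1
fi

# On the "server": keep the old master under a dated name, then back up.
(
    cd "$work/server.git"
    git --bare branch master.orig-20100101
    git update-ref HEAD HEAD~1
)

# The amended history is now a fast-forward of the server's master.
git push -q origin master
```

The old history stays reachable on the server through the dated branch, which is what makes the hack reversible.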

The first of these two use-cases is, in my view, simply a case of automating an otherwise labor-intensive process: Git is doing almost exactly what it would do if I had cherry-picked the commits instead of rebasing the branch, but it just does it a lot quicker. The new branch history will show every commit having been committed at about the same time, but authored longer ago, perhaps by somebody else.
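The difference between "authored" and "committed" can be seen with a single cherry-pick. A minimal sketch, assuming a throw-away repository and made-up names and dates: the original commit is authored in 2010 by one person, and the copy keeps that author and author date while recording the current user and time as committer.

```shell
#!/bin/sh
# Sketch only: names, e-mail addresses and dates are hypothetical.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Committer"
git config user.email "committer@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"
git branch maintenance

git checkout -q -b feature
echo fix > fix.txt
git add fix.txt
# A commit authored long ago, by somebody else.
GIT_AUTHOR_NAME="Original Author" \
GIT_AUTHOR_EMAIL="original@example.com" \
GIT_AUTHOR_DATE="2010-01-01 12:00:00 +0000" \
git commit -q -m "fix"

git checkout -q maintenance
git cherry-pick feature >/dev/null

# Author and author date are preserved; committer and commit date are new.
git log -1 --format='authored:  %an, %ai%ncommitted: %cn, %ci'
```

A rebase records the same two sets of fields for every commit it replays.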

The second use-case has the disadvantage of making things a bit harder for others working from my public repository, myself included: I’ve amended commits in my branch’s history, so any branch that was based on the old commits either has to be rebased onto the new ones or will carry duplicates of them when it is merged. However, I frown on badly executed rebases more than I do on duplicate commits in a history, because I can always cherry-pick from, or rebase, a branch I’m pulling from if I want to. So if the programmer pulling from my repo doesn’t feel she can do the rebase appropriately, she just shouldn’t do it. This is normally not a problem because I do security reviews one week before a new release, at which time the code is frozen: Arachnida is currently in a code-freeze and has been in feature-freeze for three weeks already, so any branches that were spawned from my master should have been merged three weeks ago.

Using git-rebase to make merges look cleaner is, IMO, a flawed way of working — even if the Git book suggests using it exactly for this purpose. Git is very good at merging, so there is no real reason why I would want merges to look “prettier” in any way.

When doing reviews of my products’ code or history, I look for a number of things, including:

security errors (no known exploitable security errors may exist in released software)

code ownership (all code must be legally owned by or appropriately licensed to Vlinder Software if it is to be part of a Vlinder Software product)

bugs

test coverage

… etc. …

If a merge doesn’t “look pretty” there’s still the good ol’ git diff HEAD^ which will show what happened. Tests must still pass after the merge, and the merged code is reviewed no matter how the merge itself looks — how many branches were tied together and how many conflicts were resolved.
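A minimal sketch of what that looks like, assuming a throw-away repository with a hypothetical topic branch. Right after a merge, HEAD^ is the first parent (where the branch was before the merge) and HEAD^2 is the tip that was merged in:

```shell
#!/bin/sh
# Sketch only: branch and file names are hypothetical.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic work"

git checkout -q master
echo main > main.txt
git add main.txt
git commit -q -m "master work"

# Both branches have moved, so this creates a real merge commit.
git merge -q -m "merge topic" topic

# Everything the merge brought into master, relative to the first parent.
git diff --name-only HEAD^
# The same merge seen from the topic branch's side.
git diff --name-only HEAD^2
```

Dropping --name-only gives the full patch, which is the part that gets reviewed.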

Git is a very powerful tool — or rather, a very powerful set of power tools — and git-merge is incredibly good at merging. That doesn’t mean you should blindly trust it, but it does mean that several parallel development branches in your Git history should not be seen as a problem.

[1] I started using it about one year after its first use in the Linux kernel.

About rlc

Software Analyst in embedded systems and C++, C and VHDL developer, I specialize in security, communications protocols and time synchronization, and am interested in concurrency, generic meta-programming and functional programming and their practical applications.
I take a pragmatic approach to project management, focusing on the management of risk and scope.
I have over two decades of experience as a software professional and a background in science.