Menu

What Would Linus Do?

On good git pull requests

I love git for version control. Yeah it's a little crazy and the command line is a little intimidating.

Once you know what you're doing, though, there's no better way! There is so much flexibility and power there.

But, believe it or not, learning the git command line is the easy part.
Then comes the real challenge of satisfying the people reviewing your code.

And the first thing you may wonder is, "Why should I care if my code makes someone happy?"

You should always try to make it easier for someone to review your code. This applies to your teammates and to those you work with in the open source community. Making life easier is the whole point of working together!

But think for a second what will happen if they DON'T like reviewing your code..

If everyone hates reviewing your code, then your pull requests may sit and languish for days. With every day that passes, your eventual merge back into master will be more difficult.

While you are waiting for someone to review your pull request, you are also working in a new branch, on another feature.

Your co-workers are also working in their own branches. You're all working on the same files. Welcome to merge hell! You don't have any plans tonight do you?

If your code were a pleasure to review, though, your code could get in master quickly. You could avoid all this waiting and rotting and merging and conflicts. And your co-workers would like you better.

But it's easy to feel like a ping pong ball when others review your code. All developers have an opinion on the one true way to do anything. It's only natural that they think it the best and only way to do things.

I've had people complain that my PRs:

had too many commits

didn't have enough commits

that each PR should have one commit

that my commit messages were too long

that my commit messages were too short

that they never read commit messages anyway

that I didn't have enough tests

that my tests were pointless

that I didn't lint

that I didn't rebase enough

that I shouldn't have rebased to preserve history

that I didn't comment enough

that comments are useless

Every single person thought they were right in the absolute sense. There is a right and a wrong and I was doing it wrong.

I felt like Schrodinger's developer, right and wrong about everything at the same time!

I am writing this post, to give some small advice on how to create an effective pull request. Follow my advice, and you will avoid some amount of pain during your journey.

How PRs are reviewed on GitHub

For this article, I will make the assumption that code review will take place on GitHub.

There are two types of developers during code review:
1. Show me the commits
2. Show me the changed files

GitHub Review Commits

GitHub Review Files Changed

Commits make more sense and the argument is pretty straightforward.

One day in 2005, Linus Torvalds sat high on a mountain. He thought that commits would be a swell way to document software development.

Each commit is a unit of work associated with a few files and a comment explaining the purpose of the commit.

To me, this is using git as intended. To review code any other way, you're kind of going against the design of git itself.

You won't go to prison, it's not a capital crime, but that is how I see it.

Then there are those who like a long list of changed files, with no rhyme or reason to them. This is not a high crime or anything. But as we will see, it is inferior to reviewing commits.

If you are new to git, then you will likely prefer the changed files way. I know that I did, when I started working with git.

And there was a reason I preferred it this way. I didn't know enough about git.

Back then, to me, git was an esoteric ritual for neckbeard magicians.

Once I committed my changes, I had no idea how to change that commit. In fact, I would have been afraid to change anything concerning git commits.

But did you know that you can delete commits? Change commit messages? Re-order commits, combine them, or split them out into more commits?

You can do all this by something called rebasing. Many people prefer reviewing commits from the Changed Files tab, because they don't know how to rebase. We'll get into the specifics in a bit, but for now think of it as re-writing your commit history, because that's exactly what it is.

The Problem

I am the type of developer that once I get something working, I like to commit my changes before I break it. When things go south and all hell breaks loose, I can reset my local changes back to the commit that was working:

git reset --hard

Then I can try again from that working copy. But when you don't know how to manage your git history, these tiny commits become problematic.

Think of this situation. You write some code and add a few files including XYZ.b. You're happy with that so you commit it.

Later, you wake up in a cold sweat. You realize that XYZ was the dumbest thing you've ever done. Instead, you decide to create LOL.b and the need for XYZ.b is gone.

So, in a new commit, you delete XYZ.b and add LOL.b.

But now you have a problem. You're working on a feature branch (or at least you better be to avoid the pits of hell). And now you have two commits that reference XYZ.b which is now dead code.

You should rebase this code out of existence and we will talk about doing that in a minute.

There is an argument for leaving that code in your history. That you should want to see the history and the thought process of everything done.

I get that, but in this case it will make your pull request harder to read. It's not like we created XYZ.b in March and decided to delete it in October.

We created this file in this branch and removed it in this branch. At this point, it's noise for the reviewer. They should never see this and have to do mental gymnastics around what happened.

And this is exactly why I preferred the Files changed tab in GitHub when reviewing code. It prevented the viewer from seeing all this cruft.

The Files changed tab takes a diff between the end result and the branch you are merging into. And since XYZ.b does not exist in master and does not exist in the end result, they never see it.

We had group code reviews with the whole team. When I started on this team, I explained my code from the commits tab. There would be all this code where I would have to say, "Oh but this file doesn't exist anymore. Or ignore this, it all changed in a later commit."

It made the code so much harder to follow for the rest of the team. They complained and I listened. They said, "You should rebase your commits before making a pull request."

So, being a good citizen I did what any decent person would do. I googled "git rebase" and scratched my head for a while.

Why is it called rebasing?

It makes me think of freebasing or about how git has 50 million commands with confusing names.

If they had named it git rewrite instead, then everyone would use it!

So let's say that we have a bunch of commits that are hard to follow. We created files. We removed files. We wrote code and then changed our minds. We have commit messages like temp or WIP. And all this nonsense is in our commit history.

Squashing

To simplify things, let's imagine we have 4 commits. They're at the top of our history and they are ridiculous.

We want to squash them into one commit with a clear and concise message. We should rebase, it was born for this!

In SourceTree:

Select the commit that's below the 4 commits that we hate

Right-click on Rebase children interactively

SourceTree Interactive Rebase

Then you will see a dialog with the 4 bad commits. In this dialog you can re-order the commits, squash them, delete them and amend commit messages. So, let's squash.

SourceTree Rebase Dialog

Select the top commit

Click squash with previous 3 times

Click Edit message

Enter an inspirational commit message

Click Ok in the message dialog

Click Ok in the rebase dialog

Wallah! 4 confusing commits become 1 concise commit. All you have to do now is force push to your feature branch. Yes, with rebase you do have to force push. Don't do this to master unless you know what you're doing..Actually, don't do this to master at all..

Re-ordering

In the rebase dialog, you could have also used the arrows to move the commits up and down in the history.

Here's an example of why you may do this.

You commit to file abc.js. Later, you commit to abc.js again.

Start a rebase that contains these two commits.

Drag the last commit for abc.js to directly above the first commit for abc.js

Squash with previous

Edit message

Update your commit message

SourceTree Edit Commit Message

Click Ok

Click Ok

Force push

Done.

And be sure to clean up your commit messages. git will add a bunch of text about squashed commits and old commit messages, but I like to get rid of that stuff.

Deleting

Deleting a commit is simple as well.

In the main SourceTree window with the list of commits:

Click on the commit below the one you wish to delete

Right-click on rebase children interactively

In the dialog:

Click the commit you want to delete

Click the delete button

Click Ok

Force push to master

Done.

Commit Messages

When writing your commit messages, follow this pattern:

A short description of the commit (less than 80 chars)
<-- empty newline here
- Details about the commmit which can be as long as you wish
- More details going on about every last edge case you tested

The Command Line

You can also rebase from the command line. If you do, then you will need to understand how to exit vim (git uses vim to edit commit messages). You better have a PhD in applied cryptography if you're going to exit vim without frying your hard drive.

Here are the general steps for anyone who cares:

git rebase -i HEAD~4 # where 4 is the number of commits you want to rebase

This will bring up a screen in vim where you can do the same things. Re-order, squash, delete, etc.

Git Interactive Rebase Command Line

And when you're all done, vim will ask you for a new commit message. To do that:

Press i for interactive mode

Edit the message to your heart's content

Hit Escape

type :wq # that's colon wq as in write quit

Force push that mess

Preparing your PR

Rebasing is also good for bringing your feature branch up to date with master.

Say you are ready to create a pull request.

Then do this:

git co master
git pull
git co feat/branch-name
git rebase master

If there are no conflicts, then you're money. You've pulled master into your feature branch. You've rebased master onto your branch, or played your commits on top of master.

It's all semantics and no one understands this shit except Linus anyway. Go with it.

If there are no conflicts then force push to your branch.

If there are conflicts then let's deal with those.

Find the first conflict and resolve it.
Then do:

git add -u
git rebase --continue

Keep doing that until it stops complaining. I have an alias git next to do both those steps at once. Which is nice.

Once there are no more merge conflicts, force push your branch.

git redo?

Point of order though. Sometimes the rebasing gods are not kind. At times a rebase does not turn out as intended. In fact, a rebase can turn into the stuff of nightmares.

If this happens to you and you're in a bad state then do:

git reset --hard
git rebase --abort

And poof everything is back to normal.

Another point of order. Sometimes you try to rebase master onto your branch a couple times and it's a no-go. In this case, don't waste a lot of time. Abort the rebase and merge master into your branch instead. This may not keep the history as nice as a rebase, but merging is generally easier.

And now for the grand covfefe

My entire feature branch workflow in a nutshell.

git co master
git pull
git co -b feat/best-branch-ever

Lots of hectic committing happens

At this point, the history is a mess, and I have lots of commit messages like Work In Progress. But, on the bright side, we have conquered the feature.

Now, I will squash all the commits for this feature, into a single commit with a message like Work In Progress

Then I will force push to my branch (making a backup of my work)

Then do: git reset HEAD~

Be careful with that command, you can lose work.

What this command will do is uncommit everything in the last commit. All your changes for the feature will be pending and ready to re-commit.

But now you will have the benefit of hindsight. You can group the files into the perfect logical groups for each commit with a detailed message on the why.

Once everything is re-committed, push everything to your branch.

Now you are ready to create that pull request. And this time, someone may actually read it.

The reviewer won't need to go to the Changed Files tab anymore.

Instead, they can go to the Commits tab. They can see the files in each logically grouped commit. With stunning commit messages and clarity.

The way God and Linus Torvalds intended.

If you found this article to be helpful, then you should
subscribe here.