Nearly every code review guide tells us to keep Pull Requests (hereinafter: PRs) small so that they are easy to review. The gain is straightforward: reviewers catch more bugs and code flaws in a small PR, which leads to better code quality.

What they don’t tell us is how to split a big PR that you already have into a bunch of smaller ones.

I bet you have faced this kind of problem in your career. If not, here are a few situations in which you may need it:

Developing a big feature individually, separately from other developers

Code review after a period of time when there’s no reviewer available (these are not rare cases in small teams)

Syncing two main branches (for example, merging develop into master) as a big release

6 good practices for creating and splitting Pull Requests

These good practices will help you get into the right mindset for splitting Pull Requests. They also cover how to prevent situations in which PRs need splitting.

Good practice 1. It’s your (author’s) responsibility

First things first. It’s the author’s responsibility to make the code review easy. The reviewers can’t spend as much time as you did digging into the context of the problem. Therefore, you need to present the code changes in a friendly form. Delivering your PR in small portions is one of the ways to do it, but it is not the only one.

Don’t get angry when someone refuses to review your code because of its size. You should take some time and put in some effort to make it more reviewable.

Good practice 2. Prevention is better than cure

Splitting a big PR into smaller PRs is a hard job. The best way to avoid this headache is to avoid the problem altogether: focus on not having big PRs in the first place. If you wonder how, here’s a bunch of ideas:

Intermediate PRs. If you have mixed features A, B, and C in a single PR, and the reviewer finds an issue with one of them, you may need to rewrite all of the others. This problem is usually solved by delivering a partial solution to other developers sooner, which shortens the feedback loop. By the way, an intermediate PR doesn’t have to target the main branch: you may pick a feature branch instead (see Git Strategy 4 for more details).

Pair programming. When you pair with someone on a big issue, it’s less likely that the code will have serious flaws. Usually, pair-programmed code requires only a brief, summary review from your partner.

Idea review. That is a review of code that hasn’t been written yet. How? By discussing with someone how you intend to implement it. This technique lets you catch design flaws early and therefore prevents some serious rewrites. It also gives reviewers a hint of what they are going to see in the code review, which makes the review easier.

Find a different reviewer. If the problem is a lack of reviewers, you could look for one “outside the box”. Have you ever asked a person specializing in a different technology for a review? I have. Explaining all the details to them and questioning every line of code together was a very instructive experience.

In most cases, those techniques are cheaper (in terms of effort) than splitting a PR into smaller ones. But sometimes it’s too late, and we need other ways.

Good practice 3. Offer extra time to your reviewers

The situation when one PR has been split into multiple ones may be very confusing to your reviewers. It’s likely to introduce some complications. For example, a bug found in one PR is fixed in the next one. Or the commits have not been partitioned correctly and some code is missing in the PR.

In order to make the process more efficient, you can:

Proactively walk your reviewers through how to review the PRs

Stick around to dispel any doubts they may have

Good practice 4. Annotate Pull Requests

Depending on your code review flow, you may or may not already annotate the code before reviewers start their work (for example, in PR description or inline comments). In general, I find this technique very handy, and in the case of splitting Pull Requests into smaller PRs – a powerful clarification tool.

Focus on explaining the outcome, and remember that the reviewers haven’t seen the whole result:

Is all the code in the PR? For example: “Some code from the original PR is not here; it concerns refactoring which isn’t critical right now. I will create a separate PR with it after these ones are approved & merged”.

Is there a specific order in which to review? For example: “The code should be reviewed in the order of the PR numbers. Some features overlap each other, but each Pull Request should make sense on its own”.

Is this specific line of code in this specific PR in the final version? For example: “Please note that this bug is fixed in the next PR”.

Good practice 5. Let reviewers follow your thought process

It’s very unlikely that one big PR was created at once, without any intermediate steps. Try to reveal those in the flow of reviewing.

Sometimes, you’re lucky enough to have the appropriate division at hand, for example, if you sealed each step with a commit. If so, go for it, and show the reviewers the steps. Each commit could become a “milestone” commit that serves as the base for its own PR (see Git Strategy 3 & 4).
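If your branch history is reasonably clean, listing its commits oldest-first is a quick way to spot candidate milestones. Here is a minimal sketch, assuming the branch names master and feature_branch used later in this post; the function name is hypothetical:

```shell
# List the commits that are on the source branch but not on the base branch,
# oldest first, one per line as "<short hash> <subject>".
# Usage: list_candidate_milestones master feature_branch
list_candidate_milestones() {
  local base="$1" source="$2"
  git log --oneline --reverse "$base..$source"
}
```

Each line of the output is one step of your work; pick the ones that leave the code in a reviewable state.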

Good practice 6. Learn your lesson

Once you’re done with splitting, you may want to think about why it was even needed. If it was a big pain, maybe you should have a plan for how to avoid it in the future. Check Good Practice 2 again for potential solutions.

Additionally, here are a few mistakes that you should avoid if you don’t want to end up splitting a Pull Request again:

Too big features (i.e. too long a feedback loop). You’re violating Agile, which is mostly about keeping the feedback loop short. Think about how to shorten it; for example, take a look at XP techniques like pair programming.

You did too much extra work. That, in turn, could mean that you’re violating YAGNI. Extra work on code cleanup still counts. I get it, it’s very important to clean up the code as you go, but it should be done little by little. This is what the Boy Scout Rule tells us.

It turned out to be harder than expected. Classic. Well, maybe you should spend more time on proper estimation and refinement, or simply ask more experienced developers for help with estimation. This way you’d let everybody know early that this feature is going to be huge, and you could plan to present it to them as partial PRs.

The feature required too many code changes. This could mean a DRY violation. If this problem keeps popping up, consider writing some abstractions (e.g. introducing a design pattern), so that next time a new requirement needs a code change in a single place.

Too much test code needs to be changed. That could mean that your tests are too low-level, i.e. they test implementation too much. Think about whether you could test behavior rather than implementation. Test code shouldn’t restrain you from refactoring.

Too little test code needs to be changed. Tests have one important role that is easily overlooked: they document the code. Sometimes the best way to show your reviewers what the code is about is to show them the diff of the test code. If it’s missing, they may spend a lot of time figuring out the consequences of every single line of code under review. To avoid this kind of painful review, introduce a proper test suite to your codebase.

7 methods for splitting Pull Requests

Below you will find 7 ideas for splitting one large Pull Request into several smaller ones, and for doing it smartly. In my experience, you should try them in order: the earlier the method, the more value it brings.

Method 1. Ask reviewers if they know how they want to review

Your thinking about your own code is biased. You have spent much more time on the problem than your reviewers will. Something obvious to you might be totally unclear to them. It’s best to start by asking them if they know how they want to review the code. They won’t always know, but sometimes it works.

This can lead to unexpected results and point you to a different solution than the one you’d otherwise choose. For example, your reviewer may answer:

“I’m not checking the view until I’m sure the business logic is written correctly”.

“There are a few big generated files that I’m not going to check. I’m sure they are generated properly”.

“This class has been changed in four commits, I would like to review the changes separately, along with corresponding changes in other files”.

If they don’t know, that’s fine, because again, it’s not their responsibility.

I’m not talking about an “I will do everything to make your review comfortable, master” attitude, but rather: “Let me know what your biggest pain points are, and I will try to ease them”.

Method 2. Separate refactoring from features

This rule comes from most code review guides, but it comes in handy here as well. Mixing features with refactoring makes it hard for reviewers to tell which code change belongs to which. So let’s keep them in separate PRs.

A common case – you write a fix, but also run a linting procedure for the whole project. Now both are mixed in a single PR – multiple insignificant linting changes clutter the PR and make comprehending the other part (the actual fix) almost impossible. Once they are separated, the problem is gone.

Sometimes applying only this rule is enough: it can turn one big PR into two (or more) reasonably sized PRs. If not, the result can serve as a base for splitting into more meaningful units.

Method 3. Split Pull Request by features

In short, if your PR contains features A, B, and C, you split it into 3 PRs, each containing one feature (please note that Method 3 generalizes Method 2).

Easier said than done. Telling particular features apart may be hard even in your mind, not to mention in the code, and here we’re talking about code that has already been squashed into one big mash.

However, this should be your go-to method. It results in the most sensible and easiest code review. This approach relies on a logical partition, while the next ones rely on a more or less physical partition. Whenever possible, aim for this method, but if you decide it’s too much effort, check the next methods.

Once you have recognized particular features and split them, think about how tightly they are coupled. Sometimes the features are independent of each other, and that’s the easier case. A more challenging situation is when feature A needs to be in place to make feature B work. In that case, you can’t review B before A. For example, the presenter of a model needs that model to be properly implemented already.

In that case, make sure that your reviewer checks the PRs in the right order, and don’t be upset when changes to the “base” feature require more changes to the “child” features. Remember that you could have asked for feedback earlier.
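If the dependencies between the split features are easy to write down, the standard tsort utility can turn them into a review order. This is only a small sketch; the feature names (model, presenter, view) and the function name are illustrative:

```shell
# Read "base dependent" pairs (one per line) on stdin and print the features
# in an order in which every base feature precedes the ones that need it.
review_order() {
  tsort
}

# The presenter needs the model, and the view needs the presenter.
printf '%s\n' 'model presenter' 'presenter view' | review_order
```

The printed order is the order in which you would number the PRs and ask for them to be reviewed.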

Method 4. Units first, integration later

If it makes sense in your case, you can divide your code into one or more “detail” parts and a final integration part.

For example, if you implement a new API endpoint, the flow (request -> routing -> controller -> model -> view) is the bigger picture here, which means integration. Along with that, it usually requires significant changes to particular layers too. Separating “the what” from “the how” may be very helpful in comprehending code changes.

What I’d suggest is to group the units (the details), and check them one by one in separate PRs (remember to include tests!). After those are done, create a final, integration-level PR.

Method 5. Divide files by layer

For example, in Rails there’s a bunch of file types that usually could be logically grouped together and therefore checked in one go:

Models, repositories, migrations

Controllers, service objects, adapters

Views, decorators

Binary files

It’s certainly easier to review a bunch of files grouped by their purpose than all of them at once, even if the code changes mix multiple features. For example, you check the model and wonder how the migration for some newly added field is written, and the migration is right there in the same PR.

On the other hand, the PR doesn’t contain the controller code, which at this very moment is just noise, or at most a detail.

It makes reviewing easier, but not as easy as feature-separated PRs.
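To get a first draft of such a grouping, you could sort the list of changed files by layer. The sketch below is only an assumption about a conventional Rails layout (app/models, db/migrate, app/controllers, app/views are standard; app/repositories, app/services, app/adapters, and app/decorators are common but project-specific), and both function names are hypothetical:

```shell
# Map a changed file path to one of the review groups listed above.
# The directory names and extensions are assumptions; adjust them to your project.
layer_of() {
  case "$1" in
    *.png|*.jpg|*.gif|*.ico|*.woff|*.woff2|*.ttf) echo "binary" ;;
    app/models/*|app/repositories/*|db/migrate/*) echo "data" ;;
    app/controllers/*|app/services/*|app/adapters/*) echo "logic" ;;
    app/views/*|app/decorators/*) echo "presentation" ;;
    *) echo "other" ;;
  esac
}

# Read file paths on stdin (one per line) and print "<layer><TAB><path>",
# sorted so that files of the same layer end up next to each other.
group_by_layer() {
  while IFS= read -r path; do
    printf '%s\t%s\n' "$(layer_of "$path")" "$path"
  done | sort
}
```

You could feed it the changed files of your branch, for example with `git diff --name-only master...feature_branch | group_by_layer`, and use each group as the content of one PR.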

Method 6. Separate UI from logic

If you can’t or don’t want to find more layers, maybe you can do only this – separate the visual part from the logic.

Method 7. Split Pull Request by files

This method should be avoided in favor of any of the previous ones, but it can still bring some value.

Review files individually, either one by one or a few big files first and then the rest (the swarm). What does it give you?

More pleasant navigation. If the PR contains too many files, navigating through it in your browser may become annoying due to long load times.

Less noise. It depends on your code, but some big files don’t need other files to be checked at the same time.

More focus, and less temptation to treat the review mechanically.

Example: your PR contains, along with meaningful changes, asset files such as SVG images and fonts. Reviewing the content of such files makes little or no sense. You may want to check the filename, the directory where the file is placed, whether it duplicates an existing file, etc., but the content is not subject to review.

Similar story – the translations. These are more like configuration files, usually delivered by the client anyway, so even if you find an issue, the author may not be able to fix it.
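One way to carve a set of files out of a big branch is `git checkout <branch> -- <paths>`, which copies the given paths from another branch into your working tree and index. Below is a minimal sketch with a hypothetical function name; it assumes the files exist on the source branch (files deleted there would need separate handling) and that your working tree is clean:

```shell
# Create branch <new> from <base>, containing only the given paths as they
# exist on <source>, in a single commit.
# Usage: split_files_into_branch <base> <source> <new> <path>...
split_files_into_branch() {
  local base="$1" source="$2" new="$3"
  shift 3
  git checkout -q -b "$new" "$base" &&
    git checkout -q "$source" -- "$@" &&
    git commit -q -m "Split from $source: $*"
}
```

For example, `split_files_into_branch master feature_branch assets_only app/assets/images` (the branch name and path are made up) would give you a branch with just the images, ready to become its own PR.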

Git strategy 4. “Milestone” commits to the temporary main branch

This one could be useful if you can’t or don’t want to merge partial PRs into the main branch. Remember that in a PR, your target branch doesn’t have to be the main branch!

1. Check out the main branch: git checkout master

2. Create a temporary main branch: git checkout -b temp_master

3. Check out the source branch: git checkout feature_branch

4. Find the milestone commits

5. Check out the next earliest milestone commit: git checkout [commit_id]

6. Create a new branch (use a different name for each milestone): git checkout -b milestone_branch

7. Create a Pull Request to the temporary main branch

8. Annotate it with “do not merge until the previous ones are merged!”

9. Repeat steps 5-8 until there are no more milestone commits
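The branch-creation part of the steps above could be scripted roughly as follows. This is a sketch under a few assumptions: the function name is hypothetical, the milestone branches get a numeric suffix (milestone_1, milestone_2, ...) so that every name is unique, and `git branch <name> <start-point>` is used as a shortcut that creates a branch without checking anything out.

```shell
# Create the temporary main branch and one branch per milestone commit.
# Usage: create_milestone_branches <main_branch> <commit_id>...
create_milestone_branches() {
  local main="$1" i=1 commit
  shift
  # Same result as "git checkout master" + "git checkout -b temp_master",
  # but without switching branches.
  git branch temp_master "$main" || return 1
  for commit in "$@"; do
    # Same result as "git checkout <commit_id>" + "git checkout -b <name>".
    git branch "milestone_$i" "$commit" || return 1
    i=$((i + 1))
  done
}
```

Pass the milestone commits oldest-first, then push the branches and open one PR per milestone branch against temp_master in your hosting tool, adding the “do not merge” annotation by hand.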

Summary

I hope you found this post helpful. We discussed what’s important if you face the “too much PR” problem. I gave you some general thoughts as well as concrete git solutions, based on my personal experience.