I'm currently working on a project with a team that uses a git workflow. It's fairly simple: master should always be in a deployable state, and branches are used to create features and hotfixes. Whenever a feature or bugfix is completed and tested, we move it over to master as soon as we can. The idea is that branches should be as small as possible to make it easier to merge them back into master. Our policy is that any code pushed to master must be deployable and pass the tests.

We now have a situation where one of the developers has done a lot of work (a few months' worth) on a single branch that hasn't been merged back into master yet. There are now a few separate features and a bunch of commits on this branch; it really should have been merged back in a few times already. Most of the code is in a good state, with unit tests, and could be merged back into master, but the most recent changes certainly should not be, as they are incomplete and untested.

What is the best way to deal with such a situation where one branch is really far away from the others? What ways can we avoid branches getting a very large number of commits away from master in the future?

5 Answers

Let the guy who went for a couple of months without merging fix it. Maybe he can get one big chunk of code to merge, maybe he can get a bunch of little chunks to merge one at a time. In any case, he should be doing the legwork to fix the problem, since he caused it.

What is the best way to deal with such a situation where one branch is really far away from the others?

In general, don't worry about it: it's the other guy's problem. If two branches are really too far apart to merge, then they aren't really part of the same project anymore and you have a de facto fork. If it's an open source project, that may not even be a problem.

If this guy is really brilliant, and his code is better/smarter/more important than the rest of the team combined, then it's worth making it your problem instead of his. Otherwise, it isn't.

To answer the literal question: the best way to deal with this kind of situation is to not get in that kind of situation.

What ways can we avoid branches getting a very large number of commits away from master in the future?

Make sure everyone notices that the guy who went for months without merging is having to fix the problem he caused. Make sure everyone knows that it's easier to commit to master frequently than infrequently, since fewer changes mean fewer opportunities for conflicts.

Make sure that people know that they can pull from master to stay up-to-date with other people's changes.
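To make that concrete, here is a minimal sketch of the habit, run in a throwaway repository so it is safe to try. The branch name `feature-x` and the file names are made up for illustration. With a shared remote you would normally run `git fetch` first and merge `origin/master` rather than a local `master`.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Dev"        # placeholder identity for the scratch repo
git config user.email "dev@example.com"
git config commit.gpgsign false

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Start a feature branch and do some work on it.
git checkout -q -b feature-x
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Start feature X"

# Meanwhile a teammate lands something on master.
git checkout -q master
echo "hotfix" > hotfix.txt
git add hotfix.txt
git commit -q -m "Hotfix on master"

# The habit: bring master into the feature branch regularly.
git checkout -q feature-x
git merge -q --no-edit master

# Number of commits on master that feature-x does not have yet.
git rev-list --count feature-x..master
```

After the merge the count is zero, so the eventual merge back into master only has to deal with the feature's own changes.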

"If you merge every day, suddenly you never get to the point where you have huge conflicts that are hard to resolve." --Linus Torvalds

First, see if there truly are separate commits that can be merged or cherry-picked, as suggested by @Maciej Chalpuk. If this is the case, then the situation really isn't that bad, and I wouldn't worry too much about it in the future.

However, if the real situation is that multiple features have been developed concurrently in a single branch, within the same commits, then it becomes a much larger pain to deal with. Luckily, the prevention method is built in: require the developer to separate out the changes for each feature into separate branches and pull requests before merging them in. You'll both get your atomic merges and dissuade this developer from doing it again.

The actual process of separating out the features is entirely manual. For each feature, create a new branch off of master and copy in whichever changes from the mega branch relate to it. Compile, test the feature, push and issue a pull request. The less intermingled the code changes are, the easier it will be to do. If he was hacking a single method for all of them, well, have fun. He won't do it again.
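As a rough sketch of that manual split, assuming the easy case where each feature lives in its own files: the branch names (`mega-branch`, `feature-report`, `feature-export`) and file names are hypothetical, and everything runs in a scratch repository. When several features touch the same file, `git checkout -p mega-branch -- <file>` lets you pick individual hunks interactively instead of taking the whole file.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Dev"        # placeholder identity for the scratch repo
git config user.email "dev@example.com"
git config commit.gpgsign false

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# The problem branch: one commit that mixes two unrelated features.
git checkout -q -b mega-branch
echo "report code" > report.txt
echo "export code" > export.txt
git add report.txt export.txt
git commit -q -m "Reports and export, all in one"

# Split out the first feature: new branch off master, copy in only its files.
git checkout -q -b feature-report master
git checkout mega-branch -- report.txt    # takes the file as it is on mega-branch and stages it
git commit -q -m "Add reports"

# Same again for the second feature.
git checkout -q -b feature-export master
git checkout mega-branch -- export.txt
git commit -q -m "Add export"

# Each new branch now carries exactly one feature on top of master.
git ls-tree --name-only feature-report
git ls-tree --name-only feature-export
```

Each of these branches can then be compiled, tested and submitted as its own pull request.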

Track the features that this person has implemented and find the commit on that branch at which each feature was completed. Take each of those commits in turn and merge it into master.

Let me break this down in the form of an example.

Let: Branch A be the branch from the master
Branch A+ = Branch A + new feature 1
Branch A++ = Branch A + new feature 2
and so on and so forth

What you need to do is to go back to: Branch A+

Take Branch A+ and merge it with Master.

Now go to Branch A++ and merge it with (Master + Branch A+).

Repeat until you've reached the final Branch A+...+ that is stable.

This method may sound counter-intuitive at first, but if you merge each new feature into master on its own, it becomes easy to step through master's history one added feature at a time.
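The walk-through above might look like this in practice. This is only a sketch in a scratch repository: `branch-a` and the file names are invented, and `a_plus` / `a_plus_plus` stand for the commits where feature 1 and feature 2 were finished (in a real repository you would look those up with `git log`). It relies on `git merge` accepting a commit hash, not just a branch name.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Dev"        # placeholder identity for the scratch repo
git config user.email "dev@example.com"
git config commit.gpgsign false

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# The long-running branch: two finished features, then unfinished work.
git checkout -q -b branch-a
echo "feature 1" > feature1.txt
git add feature1.txt
git commit -q -m "Feature 1 complete"
a_plus=$(git rev-parse HEAD)              # "Branch A+"

echo "feature 2" > feature2.txt
git add feature2.txt
git commit -q -m "Feature 2 complete"
a_plus_plus=$(git rev-parse HEAD)         # "Branch A++"

echo "half done" > wip.txt
git add wip.txt
git commit -q -m "WIP: feature 3"

# Merge the stable points into master one at a time, testing after each.
git checkout -q master
git merge -q --no-edit "$a_plus"          # master now has feature 1
git merge -q --no-edit "$a_plus_plus"     # master now has feature 2

git ls-tree --name-only master            # wip.txt is not listed
git rev-list --count master..branch-a     # commits still waiting on branch-a
```

The unfinished commit stays on `branch-a` until it is ready, while master picks up everything that was already stable.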

What ways can we avoid branches getting a very large number of commits away from master in the future?

I think my solution above indicates what method you should adopt in the future: use one branch per feature or per task.

I would suggest using an approach of:

pre-master and master

master: Final/production level. It is rarely modified and is always considered stable.

pre-master: The area where new features are added to the existing code. It is tested thoroughly against the existing code base, and it is the branch that new feature branches fork from.

You should also try bundling features and aiming for version-targeting.

Version-targeting: Specify an arbitrary number that will act as the placeholder for the master branch. "In V1.0.0, we will want to achieve X, Y, Z features. V1.0.0 will also have all these functionalities available: ..."

Maintaining a version against master can also be a way of keeping "master" stable and production-ready at all times.
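One possible shape for this, sketched in a scratch repository. The branch `feature-login` is made up, and marking the version with an annotated tag is my own reading of "version-targeting", not something the approach strictly requires.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Dev"        # placeholder identity for the scratch repo
git config user.email "dev@example.com"
git config commit.gpgsign false

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# pre-master is the integration branch; feature branches fork from it.
git branch pre-master master
git checkout -q -b feature-login pre-master
echo "login" > login.txt
git add login.txt
git commit -q -m "Add login"

# Integrate and test on pre-master first.
git checkout -q pre-master
git merge -q --no-ff --no-edit feature-login

# Once pre-master holds everything planned for the version, promote and tag it.
git checkout -q master
git merge -q --ff-only pre-master
git tag -a v1.0.0 -m "Release 1.0.0"

git describe --tags master
```

Using `--ff-only` for the promotion means master only ever moves to a state that was already tested on pre-master.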

Fixing the issue of the large pull request is one thing, and there are some good answers regarding that. But as for dealing with branches that get far out of date, you might want to revisit your processes for dealing with team work.

If you're working within an Agile or Scrum framework, the team should really be asking why the feature wasn't completed and merged as part of the iteration/sprint. If it was "too big" to fit within an iteration, it should have been broken down into smaller chunks.

It also raises a question of code ownership -- within your team, do individual developers own their own work separately, or does the full team work together to ensure that items get done?

Of course, the above assumes that your team is within some kind of company structure. If this is an open-source project with volunteer contributors, that's another story. Generally such projects have looser control over workflows, but the onus for generating acceptable pull requests falls more often on the individual contributors.

In many ways this becomes a process question. Maybe your needed process includes checking periodically (weekly? monthly?) for long-running, unmerged branches. Some tools make this simple to check visually; e.g. on GitHub visit the "branches" link and it shows how far ahead/behind each branch is.
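If you would rather check from the command line, something along these lines could work. It is a sketch, not a polished tool: `branch_report` is a hypothetical helper name, the scratch repository and the `long-lived` branch exist only to give it something to report on, and it looks at local branches only.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Dev"        # placeholder identity for the scratch repo
git config user.email "dev@example.com"
git config commit.gpgsign false

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# A branch that has drifted: two commits of its own, one commit behind master.
git checkout -q -b long-lived
echo "one" > one.txt
git add one.txt
git commit -q -m "Work 1"
echo "two" > two.txt
git add two.txt
git commit -q -m "Work 2"
git checkout -q master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix on master"

# Hypothetical helper: report how far each local branch is from master.
branch_report() {
  for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/); do
    if [ "$branch" = master ]; then continue; fi
    # --left-right --count prints "<commits only on master> <commits only on branch>"
    set -- $(git rev-list --left-right --count "master...$branch")
    echo "$branch: $1 behind, $2 ahead of master"
  done
}

branch_report
```

Run weekly, a report like this makes a branch that keeps growing without being merged hard to miss.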