My personal blog

Tag Archives for dvcs

I’ve had plenty of discussions with mercurial fans, and one argument that always keeps poping up is how mercurial branches are superior. I’ve blogged in the past why I think the branching models are the only real difference between git and mercurial, and why git branches are superior.

So, in this blog post I will explain why mercurial branches are not superior, and how everything can be achieved with git branches just fine.

The big difference

The fundamental difference between mercurial and git branches can be visualized in this example:

In which branches is the commit ‘Quick fix’ contained? Is it in ‘quick-fix’, or is it both in ‘quick-fix’ and master? In mercurial it would be the former, and in git the latter. (If you ask me, it doesn’t make any sense that the ‘Quick fix’ commit is only on the ‘quick-fix’ branch)

In mercurial a commit can be only on one branch, while in git, a commit can be in many branches (you can find out with ‘git branch --contains‘). Mercurial “branches” are more like labels, or tags, which is why you can’t delete them, or rename them; they are stored forever in posterity just like the commit message.

That is why git branches are so useful; you can do absolutely anything that you want with them. When you are done with the ‘quick-fix’ branch, you can just remove it, and nobody has to know it existed (except for the fact that the merge commit message says “Merge branch ‘quick-fix'”, but you could have easily rebased instead). Then, the commit would only be on the ‘master’ branch’.

Bookmarks are not good enough

Mercurial has another concept that is more similar to git branches; bookmarks. In old versions of mercurial these were an extension, but now they are part of the core, and also new is the support for repository namespacing (so you could have upstream/master, backup/master, and so on).

However, these are still not as useful as git branches because of the fundamental design of mercurial; you can’t just delete stuff. So for example, if your ‘quick-fix’ bookmark didn’t go anywhere, you can delete it easily, but the commits won’t be gone; they’ll stay through an anonymous head (explained below). You would need to run ‘hg strip‘ to get rid of them. And then, if you have pushed this bookmark to a remote repository, you would need to do the same there.

And then, bookmark names are global, so you can’t push a branch with a different name like in git: ‘git push remote branch:branch-for-john‘.

Anonymous heads

Anonymous heads are probably the most stupid idea ever; in mercurial a branch can have multiple heads. So you can’t just merge, or checkout a branch, or really do any operation that needs a single commit.

Git forces you to either merge, or rebase before you push, this ensures that nobody else would need to do that; if you have a big project with hundreds of committers this is certain useful (imagine 10 people trying to merge the same two heads at the same time). In addition, you know that a branch will always be usable for all intends and purposes.

Even mercurial would try to dissuade you from pushing an anonymous head; you need to do ‘hg push -f‘ to override those checks.

The rest of the uses of anonymous heads were solved in git in much simpler ways; ‘git pull’ automatically merges the remote head, and remote namespaces of branches allow you to see their status after doing ‘git fetch’.

Anonymous heads only create problems and solve none.

Nothing is lost

According to him, it’s impossible to figure out what happened in this repository, but it’s not. In fact, git can automatically find out what is the corresponding branch for the commit with the ‘git name-rev‘ command (e.g. ‘release~1‘).

Now the only difference is that there is no ‘temp’ branch, but that is actually good; it was removed. Why would we want to see a branch that was removed? We wouldn’t. Either way, the information remains; “Merge branch ‘temp’ into release” says it all; that all those commits come from the ‘temp’ branch.

Of course, one would need to manually look through the commit messages to find those removed branches, but that is fine, because you would rarely (never?) need that. And if he really needs that information readily, he can write a prepare-commit-msg hook to store the branch name the commit was originally created from.

Real use-cases

jhw tried to defend the need for this information by presenting some use cases:

A more clever rebuttal to my question is to ask in return, “Why do you need to know?” Let me answer that preemptively:

A) I need to know which branch ab3e2afd was committed to know whether to include it in the change control review for the upcoming release

It’s easy to find out what commits are relevant for the next release with ‘git log master^..release‘:

But then he said:

I didn’t ask for a list of all the commits that are currently included in the head of the branch currently named ‘release’ that are not included in the head of the branch currently named ‘master’. I wanted to know what was the name of the branch on which the commit was made, at the time, and in the repository, where it was first introduced.

How convenient; now he doesn’t explain why he needs that information, he just says he needs it. ‘git log master..release‘ does what he said he was looking for.

B) I need to know which change is the first change in the release branch because I’d like to start a new topic branch with that as my starting point so that I’ll be as current as possible and still know that I can do a clean merge into master and release later

I didn’t want to know the most recent commit included in both the currently named ‘master’ and ‘release’ heads, because that may have actually occurred either prior to, or after, the creation of either the branch currently named ‘release’ or the branch currently named ‘master’.

And again he doesn’t explain why on earth would he need that.

To find the most current commit from the ‘release’ branch that can also be merged into ‘master’ cleanly you can use ‘git merge-base‘; the first commit of the ‘release’ branch doesn’t actually help as it has already diverged from ‘master’ and it’s not even “as current as possible” as there will probably be newer commits on the release branch.

Either way, if he really wants that, he can pick any commit that he wants from ‘git log master..release‘.

C) I need to know where topic branch started so that I can gather all the patches up together and send them to a colleague for review.

Easy: ‘git send-email --to john release..topic‘.

But then he said:

I didn’t want to know all the commits present in the head of the branch currently named ‘topic’ that aren’t present in head of the branch currently named ‘release. I wanted to know the first commit that went into a branch that was called ‘topic’ at the time when the change was committed. Your command may potentially include commits that were in a different branch that wasn’t called ‘topic’ at the time.

Why would you send patches for review that are dependent on commits your colleague has no visibility of? No, you want to send all the patches that comprise the ‘topic’ branch, doing anything else would be confusing

If for some reason you don’t want to send the patches that were part of another branch, you can select them out with ‘^temp’.

Conclusion

All the use-cases jhw explained are supported just fine in git, he is just looking for corner-cases and then complaining because we would need to do extra stuff.

I have never seen a sensible use-case in which mercurial “branches” (branch labels) would be more useful than git branches. And bookmarks are still not as good.

I have long criticized Pidgin’s move to Monotone. I have tried to analyse this rather marginal DVCS tool, I wrote my own mtn->git conversion tool, and I helped validate and improve Monotone’s official tool afterwards. I have spent countless hours identifying Pidgin’s contributors, finding their real names, email address, etc. I have also manually dug through old commits to properly identify the real authors of a patch (as opposed to the committer). Finally, I have also wrote scripts that automatically create a nice, clean, git repo.

However, I feel they are making yet another mistake, because they haven’t actually analysed their decision. In order to demonstrate the double standards, cognitive dissonance, and lack of follow up, I’m posting chunks of the recorded history on the mailing list, with links.