Decentralized versioning system at W3C

We’ve heard from several groups and individuals that they would like W3C to host a public decentralized versioning repository for W3C-related work items, such as editors’ drafts, test suites, tools and software.

The goal of such a repository would be to host the reference versions of these items, while allowing as many people as possible to modify, branch, and patch the content of the repository, without the hurdles that CVS creates for this kind of cooperation.

As we are looking into experimenting with such a service, we are hitting the question that many others have encountered in that process: which decentralized versioning system to choose?

The two main contenders seem to be Git and Mercurial; Git seems to have a growing number of tools, and more advanced features; Mercurial seems to be easier to use, and possibly easier to set up on a larger number of platforms. Here are some of the comparisons we have found in our early investigations:

I have a long history of VCS use. I started with RCS and SCCS in 1991, contributed to CVS at the time when it became network-aware, switched to PRCS, waited for it to evolve to the promising version 2 and in the end went to GNU Arch and Darcs. I managed to skip SVN except for a few projects I contribute to, such as GCC; even there, I usually manage to use gateways to more elaborate systems.

For a few years, I have been using Mercurial and GIT exclusively. I was first seduced by Mercurial because I like Python, and I heard a lot that GIT was a patchwork of shell scripts, Perl scripts and C programs. Mercurial was really pleasant to use, including in non-typical situations with limited connectivity [1].

In order to develop a device driver for my hosted Linux server watchdog, I had to use GIT to hack into the Linux kernel. I didn’t want to. And I just loved it.

Be it for large projects such as the kernel, medium projects with 15 developers or even single-developer code, I will not as of today choose anything other than GIT to manage my source files. I do not know of any other system that will let me accommodate every weird situation I may encounter (yes, I sometimes need to clean up the history before publishing my changes; yes, I sometimes need to fix authorship information before sharing the code; yes, I may want to reorder changes before showing the patches).

It is true that advanced GIT concepts are harder to learn than those of a less powerful system. But GIT is made for developers; people able to grapple with tricky specifications or able to learn several programming languages should not have any difficulty with it. Of course there is a learning curve; but the satisfaction once you master your tool is very, very rewarding!

I previously worked with the Mozilla team on their migration. Currently, I’m working with the python-dev community on bringing their code over. I’d be happy to help out with the W3C migration to Mercurial, if that will happen (also available for any questions or issues).

We have had a direct hand in implementing git and mercurial in production environments in two companies. Mercurial is fairly easy to set up and use, but git has far more support tools and documentation.

git is also so fast that you don’t really think about it. It is also well-supported by companies like github. We had the choice between Mercurial and Git; we chose git for the primary development repositories and mercurial as our product storage system (as our save file format).

While the learning curve of git is steeper than that of mercurial, it’s nothing that somebody who is familiar with Subversion couldn’t handle… and there are plenty of tutorials online to walk people familiar with subversion through the process of learning git.

I’d suggest that W3C purchase a commercial github account. That would allow W3C to easily manage the projects while W3C servers would do a git pull from github on a regular basis to ensure a backup of all git repositories. No need to expend energy creating a set of tools that will not provide the same level of developer experience as github. Plus, the brilliance of distributed version control is that it doesn’t matter where the “originating” repository is… because there really is no such thing.
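As a rough illustration of the backup idea above, here is a minimal Python sketch of a job a server could run on a schedule. It is only a sketch under stated assumptions: the function names, repository names and directory layout are hypothetical, and it keeps bare mirrors (git clone --mirror, then git remote update --prune) rather than the plain git pull mentioned above, since a mirror copies every branch and tag without needing a working tree.

```python
import subprocess
from pathlib import Path


def mirror_command(url: str, dest: Path) -> list[str]:
    """Return the git command that creates or refreshes a bare mirror of url at dest."""
    if dest.exists():
        # Mirror already present: fetch all refs, pruning ones deleted upstream.
        return ["git", "-C", str(dest), "remote", "update", "--prune"]
    # First run: make a bare clone that copies every ref from the remote.
    return ["git", "clone", "--mirror", url, str(dest)]


def backup(repos: dict[str, str], root: Path) -> None:
    """Mirror each repository (hypothetical name -> URL mapping) under root."""
    root.mkdir(parents=True, exist_ok=True)
    for name, url in repos.items():
        # check=True raises CalledProcessError if git exits with an error.
        subprocess.run(mirror_command(url, root / f"{name}.git"), check=True)
```

Calling backup() with a mapping of repository names to their hosted URLs, from cron or a similar scheduler, would keep a local copy of each one; the local copies are themselves complete repositories that can be cloned from if the hosted ones become unavailable.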

We picked git years ago and it has been a tremendous boon to our productivity.

We had to address the same issue recently. Accordingly, we compared the following systems:

Git, Mercurial, Bazaar, Subversion, VS Team System

The final winner was Bazaar (http://bazaar.canonical.com/en/), because it is the most user-friendly environment, with a tremendous feature list and immense flexibility (including extensibility with plugins). For example, it supports all the “new” stuff like branches that act like SVN, stacked branches, sending patches against any revision by email automatically, easy merging, mirrors, shared working trees, grouped logs …

Bazaar is the newest of these, and we believe in its future potential, because we believe in the Ubuntu guys developing it. Hope that helped.