Darcs retrospective, and the future

GHC has been using darcs for version control since the beginning of 2006. It has not been all plain sailing, so in this page we will record our experiences with darcs, and attempt to objectively evaluate whether we would be better off with a different version control system. In the event that we do switch, we need to track exactly what needs to change, so this page will also list those dependencies.

Problems we currently experience with darcs

Conflicts and merging. This is the biggest problem we encounter, and is also the #1 priority for
Darcs development. Any non-trivial branch is affected, and essentially the workaround is to discard
the history from the branch when merging, and use ordinary diff/patch tools. Keeping history is
possible, but impractical for branches with more than a few patches.

Speed. Many operations are impractical (annotate, darcs changes <file>), and many others just take "too
long" (i.e. long enough that you go and do something else rather than wait for them to finish,
which incurs a context-switch cost). We can't use Trac's darcs integration or darcsweb, for example,
because both rely on invoking darcs changes <file>. (That is not completely true of the
Trac darcs plugin: it does not execute that command on a per-file basis. Instead it loads the result of
darcs changes -v for the "not-yet-loaded" changesets into its own database and caches it,
visiting every patch in the repository just once. It also caches the actual content of each file
touched by any browsed changeset, in order to compute the unified diff.)

Bugs: we run into darcs bugs other than the conflict/merging bug on a regular basis.

User interface issues: e.g. in a conflict there's no way to tell
which two patches are conflicting with each other(!).

Windows support: still quite flaky (though it is certainly better than it used to be, and
at least some Windows users don't consider it to be bad).

The GHC developers have sufficient problems with darcs that a change would be beneficial.

We want to stick with distributed version control, and to have a widely-used and well-supported system, so Mercurial and Git are the only real
contenders.

Mercurial and Git are perceived as being mostly comparable in features and performance, although Git is more popular.

More investigation of the Mercurial option for GHC is needed, especially in light of reported poor support for Windows with Git. This
work is ongoing.

Important workflows

ToDo. Compare workflows using darcs with the same workflow in other systems.

Cherry-picking patches

This is how we maintain the stable GHC branch. Particular fixes are pulled from the HEAD. When the desired patches don't depend on undesired patches, darcs takes care of this automatically. Otherwise, the patch has to be merged by hand.
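The distinction between the automatic case and the hand-merge case can be sketched with a toy model. This is an illustration only, not how darcs itself works: darcs infers dependencies from the patches' contents, whereas here they are written out by hand, and all patch names are hypothetical. The sketch computes everything a wanted fix would drag along and reports whether any of it is something the stable branch does not want.

```python
from typing import Dict, Iterable, Set, Tuple


def dependency_closure(wanted: Iterable[str],
                       depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return the wanted patches plus everything they transitively depend on."""
    closure: Set[str] = set()
    stack = list(wanted)
    while stack:
        patch = stack.pop()
        if patch in closure:
            continue
        closure.add(patch)
        stack.extend(depends_on.get(patch, set()))
    return closure


def plan_cherry_pick(wanted: Iterable[str],
                     undesired: Iterable[str],
                     depends_on: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (patches that would be pulled, undesired patches among them)."""
    closure = dependency_closure(wanted, depends_on)
    blockers = closure & set(undesired)
    return closure, blockers


# Hypothetical patches on the HEAD; the dependencies are assumed, not inferred.
DEPENDS_ON: Dict[str, Set[str]] = {
    "fix-typechecker-crash": set(),
    "new-rts-feature": set(),
    "fix-rts-leak": {"new-rts-feature"},
}
# Patches we do not want on the stable branch.
UNDESIRED: Set[str] = {"new-rts-feature"}

if __name__ == "__main__":
    for fix in ("fix-typechecker-crash", "fix-rts-leak"):
        closure, blockers = plan_cherry_pick([fix], UNDESIRED, DEPENDS_ON)
        if blockers:
            print(f"{fix}: merge by hand, depends on {sorted(blockers)}")
        else:
            print(f"{fix}: can be pulled as-is, bringing {sorted(closure)}")
```

In this model the first fix has no unwanted dependencies and corresponds to the automatic case, while the second depends on a feature patch that the stable branch does not want, which is the situation where the fix has to be merged by hand.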

amend-record

So, you make your lovely patch, it all looks good, so you record it. Then you do a build to make sure it works, and during the build or testsuite run you find that the patch wasn't quite right after all. You could just add a little 2-line patch, but that isn't very pleasant: it's nice if, as far as possible, all intermediate compiler states are buildable. Also, people might pull the first patch but not the second when cherry-picking, leading to head-scratching down the line. It's much nicer to be able to just amend-record the fix into your original patch.

Darcs alternatives still in the running

Mercurial

Advantages:

Speed comparable to Git

Some operations become feasible (bisect, annotate)

Many helper tools

Good Windows support

HTTP and SSH sync possible, but unknown how this compares to Git native protocol sync speed

Git

Complex command set? (Though, it should be possible to find replacements for the darcs commands and be happy.)

Lack of good Windows support?

bisect support would require git modules to also pick the correct version of libraries. Keeping this in sync is not easy at the moment.

Uses its own protocol for network transmission (HTTP works but is slower; however, other hosting services are available, e.g. GitHub)

Eliminated alternatives

Bzr

Advantages:

Fairly fast

Portable (as portable as python, anyhow)

Merging works correctly based on closest-common-ancestor

Tracking of renamed files / directories merges correctly

Revisions form a DAG (more like a tree with merge-points) rather than patchsets

Supports convenient "centralised-style" commit-remote-by-default as well as "distributed-style" commit-local-by-default. Just 'bind' or 'unbind' your branch whenever you want.

Simple clear UI

Disadvantages:

Revisions form a DAG (more like a tree with merge-points) rather than patchsets (this is a subjective point, which is why it's in both lists. Which model do you believe in?)

Cherry-picking isn't very "native" to the data model.

UI is rather different from darcs (which current contributors are used to).

Reason for elimination: lack of uptake and hence more risk of Bzr becoming unmaintained.

Darcs

Advantages to staying with darcs:

Community consistency: essentially the Haskell community has standardised on darcs, so it would be
an extra barrier for contributors if they had to learn another VC system.

Merging, when it works, is done right in darcs.

Disadvantages to staying with darcs:

Uncertain future: no critical mass of hackers/maintainers. The technical basis is not well enough
understood by enough people.

Reason for elimination: persistent performance and algorithmic problems, see above.

Dependencies on darcs

The following is intended to be a complete list of the things that would need to change if we were to switch away from darcs, in addition to the conversion of the repository itself, which I am assuming can be done automatically using available tools.

The following code/scripts would need to be adapted or replaced:

The darcs-all script

The push-all script

The aclocal.m4 code that attempts to determine the source tree date

.darcs-boring

The buildbot scripts

checkin email script: /home/darcs/bin/commit-messages-split.sh

Trac integration (the GHC Trac does not currently integrate with darcs, however)