Darcs

Version control is a key tool when programming. Recently, much work has gone into distributed version control systems (DVCSes) that dispense with the requirement to have a single central repository, and instead allow a distributed, loosely-connected network of repositories. There are many DVCSes out there, but I use darcs (which happens to be written in Haskell) whenever I can. All my Haskell code lives in darcs repositories, and this post is intended to detail a little of how I use it.

Darcs and CHP

Here is the structure of my darcs repositories for CHP across my three development machines (no prizes for guessing my machine naming scheme) for the past few weeks:

Each box is a repository, with the thin arrow-less lines indicating the directory structure. The empty-headed arrows indicate the ways in which patches are pushed and pulled (they are pushed in the direction of the arrow, and pulled in the opposite direction). The directions of the horizontal arrows are far from arbitrary; while both my laptop and university machine can SSH into my home machine (middle), neither of the outer two can be contacted via SSH. Hence they also cannot push and pull directly between each other.

I have about ten repositories in that diagram, which is probably typical for CHP. I tend not to do much work in the top three (the main repositories on each machine, akin to trunks); they are mainly pass-through repositories as I shift patches around. All the work is done in the lower repositories (branches, if you like), which may exist on several of the machines (e.g. the split repository for the recent 2.0 split) and have patches passed between them regularly, or may exist only on one machine. I am usually logged in to two of the machines at once, so that I can push my patches across and test them on several different GHC versions (until last week, all three machines had distinct versions of GHC: 6.8, 6.10, 6.12).

For my other major Haskell project, Tock (which I will also make a post about soon), I could make a similar diagram, extended with the repositories of the maintainer, Adam. He accepts my emailed patches into one repository to examine them, then pushes them to his main repository, and later on pushes them to the webserver. So my working copy can be about seven repositories away from the publicly visible “trunk”, but it’s absolutely painless for both of us. There is a limit on how many branches are sensible, of course, but darcs makes branching so easy that a branch of a branch is no trouble, even when that might make you uneasy in older version control systems.

I should probably make the CHP repository available on a webserver somewhere — please say if that’s something you’d be interested in. But such a public repository is unlikely to be much more up to date than the latest release. This is a combination of keeping my work in branches, and of my fairly regular release policy. You can see from the above diagram that I don’t necessarily release from the top middle repository (which is the closest I have to a “trunk”), although generally all the changes do eventually reach that repository.

Branch and Merge: Hunky

Darcs has easy branching and merging (especially compared to version control systems like Subversion). One use I often make of branching is to get a clean copy of the repository. Running darcs get . foo will create a clean copy in the foo directory (i.e. a copy without any unrecorded changes). This is useful, for example, when you discover a bug and want to know if it was there before you made your current changes.

The darcs record command has a particularly useful hunk-oriented interface. It interactively shows you all the changes you’ve made, and you can pick which ones you want to form a part of the current commit. This nicely reflects the fact that just because you made two changes to the same file, it doesn’t mean they have to end up in the same commit: for example, they may be fixing different bugs, but you happened to notice the presence of one while fixing another. So you can fix them both, then record them separately. This page shows an example. And I gather that hunk-splitting (which would make the interface even better) is coming in the next major release.
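To make the idea concrete, here is a toy Python model of hunk-by-hunk recording. It is only a sketch of the workflow, not how darcs stores or selects changes: the Hunk type, the record function and the file names are all made up for illustration, and the lambda stands in for the yes/no answers you would give interactively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hunk:
    path: str   # file the change touches (hypothetical names below)
    line: int   # first line affected
    note: str   # what the change is for


def record(pending, wanted):
    """Move the hunks accepted by `wanted` into a new patch.

    Returns (patch, still_pending); unselected hunks stay unrecorded.
    """
    patch = [h for h in pending if wanted(h)]
    still_pending = [h for h in pending if not wanted(h)]
    return patch, still_pending


# Two unrelated fixes made in one editing session, partly in the same file.
pending = [
    Hunk("Channel.hs", 12, "buffer size bug"),
    Hunk("Channel.hs", 80, "poison bug"),
    Hunk("Tests.hs", 5, "buffer size bug"),
]

# First record: say "yes" only to the buffer-size hunks.
buffer_fix, pending = record(pending, lambda h: h.note == "buffer size bug")
# Second record: take everything that is left.
poison_fix, pending = record(pending, lambda h: True)

print([(h.path, h.line) for h in buffer_fix])
print([(h.path, h.line) for h in poison_fix])
```

The point is that the two changes to the same file end up in different patches, and nothing is left unrecorded after the second pass.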

Other uses for Darcs

It’s not just Haskell code — in fact, even my home directory is a darcs repository (with my settings files recorded in it):

That way, when I get access to a new machine, I just darcs get from my home machine and I have the settings for my favourite editors and for bash (once I source that bash file in my .bashrc on the local machine). It’s an amazingly useful idea (thanks to Adam Sampson), and it means that if I modify any settings on any machine, I can record a patch and push it to my main home machine, then pull it onto all the other machines at a later date. So all my editor macros and so on can be easily kept up to date on all my machines, and I can roll back any change I make later on, if needed.

Summary and Donations

I imagine many Haskell programmers are already using darcs, and are aware of its usefulness. The Darcs team are currently running a bit of a fund-raising drive. If you use darcs as much as I do, perhaps you can consider donating a little by way of thanks. If you’re not a darcs and/or DVCS user, hopefully this post shows why it’s useful.

Related

First, I’m pleased to note that the cherry picking of hunks is now also a feature in other DVCSes, notably Git. Hopefully, these systems will one day acquire Darcs’s ability to cherry pick *everywhere*, during revert, pull, push, send, rollback, because that’s really handy and unique. It’d be great to see this idea spread out in the DVCS world too :-)

As for hunk-splitting, we do have the code right now and have been quite eagerly using it for quite some time. I love it! It’s extremely simple to use: you are presented with a before and after block in your text editor and you make any modifications you want.

That said, we’re still trying to work out a concise explanation that we can present in the text editor. So I suspect what will happen is that we’ll hide the button (e), just not make a big deal of it, or mark it as experimental.

Anyway, the principle is that hunk splitting turns your patch into a patch sequence, which is *identical* to the original patch. So from a patch p, you get a sequence like pBefore, p1..pN, pAfter. The p1..pN are the direct result of your work in the text editor, and the pBefore and pAfter are inferred by Darcs. See what I mean? It’s the kind of thing that’s actually very easy to use in practice, but sounds really complicated when you first try to explain it. Hopefully, we’ll get there…
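A toy Python sketch may help here. It is not Darcs’s real patch representation: it assumes a patch is just a pair (old lines, new lines) for one region of a file, and that the states written in the editor sit between the original before and after blocks. The names apply_patch, split and apply_all are invented for the example.

```python
def apply_patch(patch, lines):
    """Apply a (old, new) patch; it only applies if the text matches `old`."""
    old, new = patch
    if lines != old:
        raise ValueError("patch does not apply to this text")
    return list(new)


def split(patch, states):
    """Chain the patch through the intermediate states from the editor.

    before -> states[0] plays the role of pBefore, the links between
    states are p1..pN, and states[-1] -> after plays the role of pAfter.
    Links that change nothing are dropped.
    """
    old, new = patch
    chain = [old, *states, new]
    return [(a, b) for a, b in zip(chain, chain[1:]) if a != b]


def apply_all(patches, lines):
    """Apply a sequence of patches in order."""
    for p in patches:
        lines = apply_patch(p, lines)
    return lines


before = ["x = 1", "y = 2", "z = 3"]
after = ["x = 10", "y = 20", "z = 30"]
p = (before, after)

# Two intermediate states typed into the editor (illustrative).
states = [
    ["x = 10", "y = 2", "z = 3"],
    ["x = 10", "y = 20", "z = 3"],
]

sequence = split(p, states)

# The sequence has the same overall effect as the single original patch.
print(apply_all(sequence, before) == apply_patch(p, before))
```

In this model each link can then be recorded or left out on its own, which is the benefit of splitting; how Darcs actually infers the first and last patches is not shown.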

When I look at a scheme of personal computers like that and think about all the time you must spend on it, I can’t help but think you would’ve been better off putting all that money & time into a really nice laptop which could replace them all!

(Not that managing ~/ in a DVCS isn’t a good idea for entirely other reasons, though.)