
Sunday, April 8, 2012

This year, Darcs had a sprint in two phases. It started with a one-day pre-sprint in Cordoba, Argentina (9 March), and then moved over to Southampton, England, for a three-day hackfest at the end of the month (30 March to 1 April).

No Darcs hackers were shipped between the two sprints, but we did have one visitor from afar. Potential GSoC student Bhimanavajjula Sri Rama Krishna (BSRK) Aditya flew the 9 hours between India and England to join us for the sprint. It was great to meet him in person!

Presprint

We think that Darcs could make a great project for people to get started with some practical Haskell hacking. It's a bit of a fixer-upper, but that also means there's a lot of room to make a difference!

Darcs veteran (and weekly news editor) Guillaume Hoffmann was joined by two students, Miguel Pagano and Mathías Etcheverry, who within a day and with no prior knowledge of the Darcs code base were able to make the following contributions:

In between getting Miguel and Mathías up to speed, Guillaume also got a chance to make some improvements himself, namely:

adding the --unified flag to record, revert and amend-record

Thanks to Miguel and Mathías for joining us at the sprint. Hopefully we'll be able to repeat the cycle of Darcs hacking with them. And since little one-day mini sprints like the one Guillaume started are so easy to organise, there's a chance we'll be seeing more of these in the future.

Summer of Code

If he participates in this year's Summer of Code, Aditya will be helping us to integrate the long-promised patch index optimisation into Darcs. The patch index was originally developed by Benedikt Schmidt. It caches a mapping from filenames to the patches that affect those files, which saves a lot of work for commands like darcs changes or darcs annotate, which would otherwise have to trawl through the entire Darcs history.
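To make the idea concrete, here is a minimal sketch of such a mapping. It is not the actual patch index code: the Patch record, PatchId and the function names are all made up for illustration, and a real index would also have to be stored on disk and track renames.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-in for a patch identifier; real Darcs patch
-- metadata is much richer than a plain string.
type PatchId = String

-- | A patch reduced to its name and the files it touches (assumed shape).
data Patch = Patch
  { patchId      :: PatchId
  , touchedFiles :: [FilePath]
  }

-- | File path -> the set of patches that affect that path.
type PatchIndex = Map.Map FilePath (Set.Set PatchId)

-- | Build the index with a single pass over the history.
buildIndex :: [Patch] -> PatchIndex
buildIndex history = Map.fromListWith Set.union
  [ (file, Set.singleton (patchId p))
  | p    <- history
  , file <- touchedFiles p
  ]

-- | Patches relevant to one file: a map lookup rather than a scan of
-- the whole history. Unknown paths yield the empty set.
patchesTouching :: FilePath -> PatchIndex -> Set.Set PatchId
patchesTouching = Map.findWithDefault Set.empty
```

The cost of walking the history is paid once, when the index is built. After that, a query for a single file only has to consider the patches listed under that path.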

Over the sprint, Aditya rebased the patch index code from Benedikt onto the current Darcs mainline. He studied the code a bit to understand what exactly was behind the index, and started working on implementing the integration with commands like darcs changes. He also got to explore a bit of Darcs internals, notably how Darcs makes use of matchers like 'date "before tea time"' to filter through patches.

One very concrete result of the sprint is that we now have a prototype of a patch-index-enabled darcs changes. If you can't wait to try it out, you could try applying the latest version of his patch.

Filepaths: bytes or code points?

Argh, Unicode, Argh

The main thing Ganesh worked on was fixing a problem with character set handling that has been outstanding for several months. The underlying problem was caused by recent versions of GHC changing the way it handles filenames on Linux; previously it treated them as a stream of raw bytes, but now it translates them into strings using an encoding. The eventual workaround was very short - explicitly set a global at the beginning of darcs, telling the GHC library to use no encoding at all - but it took a lot of investigation to get to that point, and the end result isn't very satisfactory for darcs as a library.
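For the curious, a workaround of this kind might look like the sketch below. It assumes the global in question is GHC's file system encoding and that "no encoding at all" corresponds to char8. Both setFileSystemEncoding and char8 are real exports of GHC.IO.Encoding (base 4.5 and later, i.e. GHC 7.4), but the wrapper names are hypothetical and this is not a copy of the Darcs source.

```haskell
import GHC.IO.Encoding (char8, getFileSystemEncoding, setFileSystemEncoding)

-- | Switch the process-wide file system encoding to 'char8', so that each
-- byte of a path becomes exactly one 'Char' and nothing is decoded.
-- Assumption: this is the global the post refers to.
useRawFilePaths :: IO ()
useRawFilePaths = setFileSystemEncoding char8

-- | Name of the encoding currently in force, handy for a sanity check.
currentFileSystemEncoding :: IO String
currentFileSystemEncoding = fmap show getFileSystemEncoding

-- | Hypothetical entry-point wrapper: the setting is global, so it has
-- to run before any file path is read or written.
runWithRawFilePaths :: IO a -> IO a
runWithRawFilePaths action = useRawFilePaths >> action
```

Because the setting is process-wide, a program that only links against Darcs as a library would presumably have to make the same call itself, which may be part of why the result is unsatisfactory on the library side.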

Darcs 2.8 Release Candidate 1

Florent and Ganesh also worked on getting a 2.8 release candidate ready. We'd love any feedback you could give us on it, so if you're up for a little beta testing:

cabal update
cabal install darcs-beta

The character set handling problem with GHC 7.2/7.4 was the main blocker for a release, so hopefully we can get the real release out pretty soon now.

Can you duplicate a rotcilfnoc (inverse conflictor)?

We are painfully aware that our current version of patch theory is broken with respect to conflicts. Owen Stephens (from Summer of Code 2011!), who generously hosted the sprint (thanks, Owen!), spent a good chunk of Friday staring at one example of the brokenness: a failing QuickCheck test which he minimised to a simple 3-way conflict. Create a directory and a file, then (A) remove the directory, rename the file, (B) remove the directory, (C) move the file inside the directory under a different name.

Ouch.

After much discussion with Camp hacker Ian Lynagh, Owen discovered that this was just a fundamental bug in the conflictor-based approach. We've already been back to the drawing board for a while, but now we have yet another test case for what the next patch theory should deal with.

Next Patch Theory?

We spent a bit of the weekend working on and discussing the new patch theory. Ian worked some more on Camp (more proofs!). Ganesh explained a bit more what he had in mind with the graphictors ideas he was exploring (each conflicting patch would be in a minimal context with respect to the conflict). And Owen talked us through some thinking he found digging through the archives of the old darcs-conflicts list. We know we need a successor to the current version of patch theory. But where are we going to end up?

Clean clean clean that code

The new patch theory won't be for a while. In the meantime, there is a ton of work we can do to prepare the ground for it. One thing we can do to help is to improve the Darcs code base to the point where shifting to a new patch theory, or a new repository format, or a new set of primitive patches is relatively smooth and easy. Darcs needs a cleanup effort.

Owen, Ganesh and Eric made several pushes towards making the Darcs code more approachable:

Eric made use of Cabal 1.8's shared library feature so that Darcs only has to be built once rather than 3 times

Eric (and a little bit of sed) replaced the confusing type witness C preprocessor macros with some more straightforward Haskell
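For readers who haven't met type witnesses before, here is a minimal sketch of the idea written as plain Haskell. The names (Prim, FL, lengthFL, +>+) are illustrative and simplified, not taken verbatim from the Darcs source: each patch type carries two phantom parameters naming the repository state before and after the patch, and the compiler checks that consecutive patches line up.

```haskell
{-# LANGUAGE GADTs #-}

-- | A hypothetical primitive patch. The phantom parameters wX and wY name
-- the repository states before and after the patch is applied.
data Prim wX wY = Prim String

-- | A forward list of patches: the end state of each patch must be the
-- start state of the next one, and the type checker enforces it.
data FL p wX wZ where
  NilFL :: FL p wX wX
  (:>:) :: p wX wY -> FL p wY wZ -> FL p wX wZ

infixr 5 :>:

-- | Count the patches in a sequence.
lengthFL :: FL p wX wZ -> Int
lengthFL NilFL      = 0
lengthFL (_ :>: ps) = 1 + lengthFL ps

-- | Concatenate two sequences. This only type checks when the first
-- sequence ends in the state where the second one starts.
(+>+) :: FL p wX wY -> FL p wY wZ -> FL p wX wZ
NilFL      +>+ ys = ys
(x :>: xs) +>+ ys = x :>: (xs +>+ ys)
```

With the witnesses written out like this, a mistake such as appending sequences in the wrong order tends to surface as an ordinary type error rather than as a corrupted repository.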

We have a very long way to go. But we are thinking harder about the concrete steps we can take towards making the Darcs code more respectable.

More helpful interactive mode

Florent worked on adding some more intelligence to the Darcs patch selection code. Hopefully this work will lead to more feedback and some nice new features like an interactive darcs diff. Cherry picking is one of the more distinctive aspects of Darcs, and one of the reasons we're so interested in getting patch theory right one day. The patch theory is what allows us to cherry-pick in almost all of our commands.

But while interactive mode can be pretty helpful, it can also provoke the kind of situation where good just makes you hungrier for better. For example, if you try to pull some patches but interactively decide that you want to skip some of them, Darcs will also skip over the patches that depend on them. But figuring out why exactly patches get skipped can still be a bit mysterious. What if instead of telling you it skipped some patches, Darcs could name the dependencies you'd need to pull in too? Hopefully, Florent's investigations will pay off!
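As a rough illustration of what "naming the dependencies" could mean, here is a small sketch. It assumes dependencies are available as a simple table from each patch to the patches it directly depends on. That table and all the names below are hypothetical, since Darcs works dependencies out by commutation rather than storing them like this.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type PatchName = String

-- | Hypothetical dependency table: each patch mapped to the patches it
-- directly depends on.
type Deps = Map.Map PatchName (Set.Set PatchName)

-- | Direct dependencies of a patch (none if the patch is not listed).
directDeps :: Deps -> PatchName -> Set.Set PatchName
directDeps deps p = Map.findWithDefault Set.empty p deps

-- | Everything a patch needs, directly or indirectly. The 'seen' set
-- keeps the walk from looping if the table contains a cycle.
allDeps :: Deps -> PatchName -> Set.Set PatchName
allDeps deps = go Set.empty . Set.toList . directDeps deps
  where
    go seen [] = seen
    go seen (q:qs)
      | q `Set.member` seen = go seen qs
      | otherwise =
          go (Set.insert q seen) (Set.toList (directDeps deps q) ++ qs)

-- | For each offered patch that is dropped only because of the user's
-- choices, the skipped patches it would need.
blockedBy :: Deps -> Set.Set PatchName -> [PatchName]
          -> Map.Map PatchName (Set.Set PatchName)
blockedBy deps skipped offered = Map.fromList
  [ (p, culprits)
  | p <- offered
  , p `Set.notMember` skipped
  , let culprits = allDeps deps p `Set.intersection` skipped
  , not (Set.null culprits)
  ]
```

A report built from blockedBy could then say which skipped patch is holding each dropped patch back, instead of only saying that something was skipped.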

Rebase

Owen and Eric spent some time getting to know the new darcs rebase feature that Ganesh has been working on. It's nice! Darcs rebase is for those situations where Darcs patch theory falls over (and fall over it does). It allows us to rescue long-term branches previously lost to intractable conflicts, or to do “deep amend-record” operations that break through dependency barriers.

And this being Darcs, it's done with the interactive cherry-picking interface which should be familiar to users. There's starting to be talk of getting this code in HEAD darcs so that people can try it out and we can start working towards refining the user interface.

Darcs Bridge

Darcs bridge isn't ready for prime time, we're afraid. It's good for one-shot conversions, but if you're hoping to maintain a long-term bridge and you have to deal with Git branches, we'd advise waiting. But we're getting closer. Owen and Ganesh spent some time hashing out the design for the darcs bridge and thinking more about how the respective Darcs and Git models of the universe mesh together.

Where next?

Finally, among our many discussions was a more general question of strategy. Darcs is a very long term project and it could take many years for us to get the version control system that we want. Over the past few years, we've placed a great emphasis on performance, addressing some day to day issues to bring Darcs to a more acceptable place, and now the efforts are starting to pay off. We now have faster local repository operations, faster repository fetching (mainly by deprecating the old fashioned format and getting people to switch to hashed repositories), and a much more usable darcs annotate command (in the upcoming 2.8 release). So Darcs is faster now. It's certainly no Git and the conflict merging issue is still there, but it's in a much better place than it was 4 years ago. Now what?

Now we start digging in for the long haul. We have essentially 3 development priorities for the future of Darcs:

Cleanup: There is a massive amount of work to be done here, ranging from entry level tweaks like shifting to a uniform coding style and getting more disciplined about haddocks; to deeper software engineering issues, like developing a cleaner separation between repository-management and core patch theory code. The code needs a lot of loving, and if you're ready to roll up your sleeves, we could use the help.

Hosting: Darcs isn't enough. We need to think about online hosting and GUIs. One of our goals is to have a Darcs library that makes it easier to write things like Patch-Tag, or Darcsden; or whatever interesting ideas the community may come up with. If we have to, we may even prototype some code ourselves to push the library forward.

Theory: The one thing that we absolutely have to get right for the next patch theory is our story on conflicts. As you can see, we are thinking about quite a few different ideas. It's too early to tell which of these we'll end up running with. More news when we have some more solid ideas.

Thanks!

Darcs is a long term project and, with all the ups and downs we've been through, we are grateful for the support the community has shown over the years. Thanks to Guillaume and Owen for their sprint organisation efforts, to our donors for making it possible for students like Aditya to get to sprints, and to the Software Freedom Conservancy for helping us with the administrative side of running an open source project.

If you'd like to support the Darcs team in our efforts to make an easy to use, flexible, formally backed version control system a reality one day, we would be thrilled if you could submit patches, bug reports, or comments on the IRC channel or darcs reddit. If you just want to send a little cash our way to push sprints along, we most certainly appreciate your donations.