Wednesday, May 23, 2007

I've proved that the central XMonad StackSet module is safe on several occasions, as the code keeps evolving. Each time I take the source code, run Catch on it, and send an email back to the XMonad list giving the results. So far only one other person (Spencer Janssen) has taken the time to download and install Catch and run the tests to validate my result. The reason for this is that building Catch is slightly tricky, due to the Yhc and Yhc.Core dependencies. I'm working on putting together a proper release for Hackage, expect that within a month - all the code works, its just the packaging and user interface thats lacking.

The other day dons asked me how he could "get an idea" of what Catch is doing. If you blindly accept a formal proof, its hard to get a feel of whether its correct, or if you are proving what you expect. The detailed answer is going to appear in my thesis (and hopefully as a paper beforehand), but I thought it may help to give a very light overview of what thoughts go through Catch.

In Principle

The concept behind Catch is that each function has a precondition that must hold in order for the function to execute without error. If a function f calls a function g which has a precondition, then the precondition for f must guarantee that the precondition to g holds. If you set this up so that error has the precondition False, then you have a pattern match checker.

The second concept is that given a postcondition on a function, you can transform that to a precondition on the arguments to the function. If there is a requirement that the result of f x y meets a certain condition, then this can be expressed as conditions on the variables x and y.

Before all this machinery can be put into action, it is first necessary to perform many transformations on the source code - filling in default pattern matches, doing simplifications, removing higher-order functions and abstracting some library functions. All these transformations are performed automatically, and are intended to set up the Catch machinery to do the best job it can.

In Reality

As it happens, most of the pattern matches that are checked in XMonad are relatively trivial - and do not push the power of Catch to its full potential. A few are slightly more complex, and one of these is focusLeft (XMonad code evolves very fast, so this may not be the current darcs version!):

Catch identifies two separate potential pattern match errors in this statement. Firstly the lambda expression passed as the second argument to modify is potentially unsafe - as it does not mention the Empty constructor. A quick look at the modify function shows that by this stage the value must be a Node. The way Catch solves this is by transforming the code, bringing the two pices of code together. Once the case expression within modify is merged with the one in the lambda, pattern match safety is a simple transformation of elimating redundant alternatives.

The second potential error is in the where statement. If reverse rs is empty then the pattern will not match. This is a perfect example of the postconditions in action, the generated postcondition is that reverse rs must be a (:) constructed value. Using the machinery in Catch, this condition is transformed into the condition that rs must be a (:) value. Looking at the alternatives, Catch can see this is always the case, and declares the pattern safe.

In order to prove the entire module, Catch requires 23 properties, and 12 preconditions. The process takes 1.16 seconds are requires 1521.05 Kb of memory.

In Conclusion

Catch is an automated tool - no user annotations are required - which means that some people may feel excluded from its computational thoughts. If a user does wish to gain confidence in the Catch checking process, a full log is produced of all the preconditions and postconditions required, but it isn't bedtime reading. Hopefully this post will let people get an idea for how Catch works at a higher level.

I am very glad that Catch is an automated tool. XMonad is a fast changing code base, with many contributors. In a manual system, requiring proofs to remain in lockstep with the code, the pace of progress would be slowed dramatically. Hopefully Catch can give cheap verification, and therefore verification which can be of practical user to non-experts.

I use both GHC and Hugs for my development work. I primarily work with WinHugs for development, then move on to GHC for deployment. Unfortunately one of the things that WinHugs doesn't do very well is CPP support, it can only be turned on or off for all files, and requires multiple passes and slows things down. So what you really need is some way of defining code which executes in the presence and absence of CPP - so you can customise correctly.

It's not too hard to make this work, but it can be a bit fiddly to come up with the correct constructs. The easier one is to exclude code from the CPP, which can be easily accomplished with #if 0:

{-#if 0-}cppOff = True{-#endif-}

If the CPP is on, the closing comment and the cppOff definition will be ignored. The other direction, having code only available when CPP is on, requires the nested comments available in Haskell:

{-#if 0{-#endif-}cppOn = 1{-#if 0-}#endif-}

The non-cpp contains an extra open comment, which nests the comments, and leaves the definition invisible.

With these two fragments, you can customise your Haskell code so that it performs differently on systems with and without a CPP, remaining valid Haskell on sytems with no knowledge of CPP.

Friday, May 18, 2007

* This is building on from last time. In particular, that means that all the same disclaimers apply.

Having done "wc -c" as the previous benchmark, the next thing to look at is "wc -l" - line counting. A line is defined as being a sequence of non-newline characters, optionally ending with a newline. This means that "test" has 1 line, "test\n" has 1 line, "test\ntest" has 2 lines. Its trivial to express this in Haskell:

print . length . lines =<< getContents

The Haskell code uses the same getContents as before, but the C code needed more changes:

The C code is quite hard to write. In fact, this code is the original version I wrote - with an accidental mistake. Rest assured that I benchmarked with the correct code, but see if you can spot the mistake.

As in the previous example, I've demanded that all implementations use getchar at their heart, to make everything a fair comparison. The benchmark was then line counting on the same 35Mb file as previously:

And the numbers: C=4.969, Supero=5.063, GHC=9.533. This time the difference between C and Supero is a mere 2%. However the difference between Supero and GHC has grown considerably! The reason for this is an advanced optimisation that has managed to remove all the intermediate lists. In a normal Haskell program, getContents would build a list of characters, lines would take that list, and build a list of strings, then length would take that list and consume it. With my optimisation, no intermediate lists are created at all.

While in the previous benchmark it was easy to examine and understand the generated GHC Core, for this one it is much harder. The benchmark has ended up using a recursive group of functions, none of which use unboxed numbers. For this reason, I suspect that the benchmark could be improved further for Supero. Once again, credit needs to be given to GHC for the results - it is clearly doing some clever things.

This benchmark is slightly more interesting than the previous one, but is still a bit of a micro-benchmark - not following a real program. My next task is to probably move on to "wc -w" to complete then set, then a real benchmark suite such as nobench. I hope that this demonstrates that Haskell can be beautiful, pure, and match C for speed in some cases.

Wednesday, May 16, 2007

* As usual, I'm going to qualify this with "on one particular benchmark, on one particular machine, with one particular set of following wind".

I've been working on optimisation techniques for the last week, as one chapter of my PhD is likely to be on that topic. The approach I'm using is to take Yhc.Core, optimise it at the Core level, then spit out GHC Haskell which can then be compiled. The optimisation technique I've been working on can best be described as "deforestation on steroids" - its entirely automatic (no notion of special named functions like in foldr/build etc) and isn't too expensive.

The particular benchmark I've been working on is "print . length =<< getContents" or "wc -c". In order to make it a fair comparison, I require that all implementations use the C getchar function at their heart for getting the next character - not allowing any buffering or block reads. This is actually a substantial benefit for C, since in Haskell things like clever buffering can be done automatically (and GHC does them!) - but it does result in a test which is purely on computational crunch.

I then benchmarked the C on -O3, versus the Haskell going through my optimiser, and going through GHC on -O2. The benchmark was counting the characters in a 35Mb file, run repeatedly, as usual:The actual numbers are C=4.830, Supero=5.064, GHC=5.705. This makes Supero 5% slower than C, and GHC 18% slower than C. The reason that Supero is faster is because its managed to deforest the length/getContents pair. The inner loop in Supero allocates no data at all, compared to the GHC one which does. Supero allocates 11,092 bytes in the heap, GHC allocates 1,104,515,888.

This benchmark is not a particularly good one - the task is simple, the computation is minimal and most expense is spent inside the getchar function. It is however a good place to start, and shows that Haskell can get very close to C speeds - even using standard type String = [Char] processing. Supero has made global optimisations that GHC was unable to, and does not have any techniques with which it could acheive the same results. GHC has then taken the Supero output and automatically identified strictness, performed unboxing/inlining, and done lots of other useful low-level optimisations.

I'm going to move on to "wc -l" next, and hopefully onto the nobench suite. The techniques used in Supero are all general, and I hope that as the complexity of the problem increases, the payoff will do so too.

Wednesday, May 09, 2007

Dons recently asked me to look over the StackSet.hs module of XMonad, checking for pattern-match errors, using my Catch tool. This module is the main API for XMonad, and having a proof of pattern-match safety would be a nice result. I performed an initial check of the library, including introducing the necessary infrastructure to allow Catch to execute, then sent them the required patches. After I had given them the Catch tool and the initial results, sjanssen picked up the results and went on to modify the code further - with the end result that now StackSet will never crash with a pattern-match error.

I'm not going to cover the changes that XMonad required to achieve safety - that is a story that sjanssen is much better equiped to tell. What I am going to describe is the process I followed to get to the initial results.

Step 1: Download Catch, Yhc and XMonad

Catch can be obtained from http://www-users.cs.york.ac.uk/~ndm/catch/ - however a word of warning - it is an involved installation procedure. In the next few months I hope to make Catch use Cabal, and then ordinary users will be able to easily install Catch.

Initially StackSet used pattern guards, one small patch and these were removed. There was also a small issue with defaulting and Yhc, which I was able to paper over with another small patch. With these two minor changes Yhc was able to compile StackSet.

Catch takes care of all the remaining details. The ||| operator says to check that under all circumstances the function will not crash with a pattern-match error. The module is actually slightly more involved, but this is all detail which Catch will shortly hide into a library.

A future intention is to have a tool which automatically generates such a module from a library. This then allows the person performing the checking to simply point at a library, with no additional infrastructure.

Step 4: Test using Catch

This step is the first time we actually invoke Catch - and its as simple as a single line:

At this point I hopped on IRC, and in discussions with sjannsen talked him through installing Catch and the necessary dependencies. Once he had this installed, he was able to modify the StackSet API and implementation to guarantee that no pattern match errors are raised. After all of this, now running Catch over the test script gives success.

A lesson to take away

There were only two sticking points in running Catch over XMonad. The first was that Yhc can't always compile the code, and may require slight tweaks to be made. There are two solutions to fixing this, we can either fix Yhc, or I can make Catch use GHC Core. The GHC Core solution shouldn't be too much work, once GHC has a library for external Core.

The second sticking point is that installing Catch is a lot harder than it should be. A little bit of Cabal magic and a proper release should squash these problems easily.

So what lessons did I learn? Perhaps the thing that surprised me most is how easily someone else was able to get to grips with the Catch tool. This gives me a strong sense of hope that average users may be able to employ the Catch tool upon their programs, either in this version of a future version.