2007-01-09

Recently I've read two blog posts about programming productivity and how it can be improved by better programming languages. I'd like to expand a bit on them.

The first post focuses on code reuse. The thesis is that the central thing that increases programming productivity is code reuse. As soon as you can reuse some bits of code you save effort. How does this relate to programming languages? One way is that programming languages can supply code in the runtime system, such as garbage collection. Thus, the programmer gets a well tuned bit of code which is most certainly correct and which he doesn't need to think about to use. Another example mentioned in the article is objects. It is indeed possible to program with objects in C but this programming pattern is captured by a language construct in object oriented languages and we can therefore reuse it.

The latter half of the post makes a case for standards. Having to choose from several libraries that does the same thing or programming constructs that can achieve the same thing are bound to make a mess. According to Murphy's Law, when you want to use libraries A and B, library A will use feature 1 and library B will use feature 2 and that will render them incompatible. So standards are good, be it libraries or programming idioms. They help making programs more composable.

Which leads me to the second post which talks about Composability and Productivity. This post starts out repeating a lot of the previous post but focuses on composability. But it goes further down the composability track talking about the problem with state. When composing two stateful pieces of software it might be important when the state changes happens and sometimes that leads to composability problems.

But there are even more interesting issues in composability and reuse that neither article mentions. One is concurrent programs. These are extremely hard to compose if they use ordinary locks and sometimes it is out right impossible to compose them. One promising line of work here is software transactional memory which can bring composability to concurrent programs. If think this sounds interesting you should check out the papers by Simon Peyton Jones on software transactional memory (or STM for short). STM has transactions, bits of code which either are run to completion or aren't run at all (perhaps due to a state change which happened while the transaction was running and which invalidates it's view of the world). The salient feature of STM is that it enables the programmer to compose transaction. Hence it increases composability.

The interesting thing is that this STM fits particularly nicely with Haskell. Why is that interesting? Because Haskell has some other very special features which makes Haskell programs more reusable and easier to compose.

One of the obvious benefits of Haskell when it comes to reuse is its strong type system. Writing reusable code is a lot easier if you don't have a very narrow minded type system which gets in your way all the time. To avoid that you can either ditch the static type system and go with dynamic types, which is what Lisp, Scheme, Ruby and Python does. The other option is to use a more fancy type system which lets give types even to the most abstract pieces of software. That way they can more easily be reused. This also improves composability.

One last feature of Haskell that I want to mention is laziness. Anyone who has ever heard of Haskell knows that this is the one feature which sets it apart from other languages (modulo Clean). Laziness makes evaluation order difficult to predict and therefore it is a bad idea to have state in a lazy language. Therefore Haskell is pure, allowing no mutable variables or IO except in a confined imperative sublanguage. As the first post I linked to above have already pointed out state is bad for composability. That should make Haskell programs more composable and reusable. While I think this is true it's not why I mention laziness. Lazy evaluation, used correctly, can enable a new kind of modularity which takes great effort to achieve by other means.

The standard example of modularity from laziness comes from John Hughes' classic article "Why functional programming matters" but I don't think it has gotten enough attention so I will repeat it here. Suppose you are writing an AI for some board game such as tick tack toe or chess. Writing good ones can be really tricky if the board game if moderately complex. In Haskell you would start out by generating a tree which models all possible moves from the current positions on the board. This tree can infinite. One can then write small functions which operates on this tree, such as pruning it or reorder branches to make things faster. Lastly one can compose these functions to construct the final evaluation function. The crucial thing here is that the small functions operating on the tree are highly reusable. It is very easy to cherry pick them and reorder them as you see fit. Therefore it is very easy to try out several different algorithms with very little effort. All this spells: Programmer Productivity. Lazy evaluation is crucial here in making sure that only parts of the tree that are actually needed are computed.

For those of you who think that this sounds horribly inefficient I have two answers. First, if you find a really good search algorithm this way then the constant factors due to lazy evaluation and many function calls doesn't really matter. Secondly, there are techniques to automatically remove the game tree which can make the program almost as fast as a hand tuned one.

OK, so this post turned out to be a sales pitch for Haskell in the end. No one who knows me would be really surprised by that. But I really do think that Haskell has a lot of features that takes code reuse and program composition to a new level. All those features such as purity, closures, powerful type system, laziness and STM which makes programming in the large easier come together very nicely in Haskell. So it's no surprise that the latest revisions of C#, Java and VB all borrow heavily from Haskell and it's language kins.

I completely agree with the posts I linked to above: Reuse and composability are very good for programming productivity. And there's a whole lot of things that can and should be done at the language level to improve these issues.

2 comments:

There is a lot of research going on about "Fusion"; removing intermediate and unnecessary data structures. Unfortunately there aren't any good overviews about this field. But one paper I can recommend is "Playing by the rules: rewriting as a practical optimisation technique in GHC" which shows how fusion is achieved in GHC, the dominant Haskell compiler. If you don't want to read the whole paper make sure you read section 7.