F# and Haskell, Estranged Cousins

In this post I compare and contrast Haskell and F#. With so much shared history, it may come as no surprise that the two have a great deal in common. However, it's interesting to consider how the perspectives of each language's developers play a large role in determining the differences between them.

A Shared History

As far as the family tree of functional programming is concerned, F# and Haskell are not too distant cousins.

They share a very similar syntax as well as a large number of features. A great example is Hindley–Milner type inference.

ML was the first widely used language to leverage Hindley–Milner for statically inferred typing, a feature to which it owes much of its success. However, almost all functional programming languages now have this feature. The FP community has always been quick to adopt obviously useful features. Some other things that fit into this category are garbage collection (Lisp) and lazy evaluation (Lazy ML).
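To make the feature concrete, here is a minimal Haskell sketch (the function names are my own, chosen for illustration): no type signatures are written anywhere, yet the compiler deduces the most general polymorphic type of each definition.

```haskell
-- No type signatures given; the compiler infers the most general
-- (polymorphic) type of each definition on its own.
compose f g x = f (g x)
-- inferred: compose :: (b -> c) -> (a -> b) -> a -> c

pairUp x y = (x, y)
-- inferred: pairUp :: a -> b -> (a, b)

main :: IO ()
main = print (compose (+ 1) (* 2) 5)  -- prints 11
```

F# performs the same style of inference, though its .NET interop means annotations are needed somewhat more often than in Haskell.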

The most obvious difference between Haskell and F# is somewhat easy to infer from this family tree: object oriented constructs. That is to say, OCaml pioneered the use of object oriented data structures in functional programming, and F# is its direct descendant. This has made OCaml (and in turn F#) something of a black sheep in the theoretical functional programming world.

The reason many functional programming theorists dislike objects is that they want a language based on math. Unlike the majority of the ideas in functional programming, objects have roots in neither the lambda calculus nor category theory. However, this has not stopped OCaml from being successful. In fact, quite the opposite.

The use of objects mitigates one of the largest roadblocks on the path to functional programming adoption by engineers: the difficulty inherent in organizing large functional programs. The OCaml language engineers also showed that leveraging the object oriented paradigm did not hamper their ability to use static analysis techniques. Because of this, OCaml approaches the speed of C.

While it is not pure, OCaml is almost an ideal compromise between theory and engineering. Indeed, no other functional language fits so well into the paradigms of the Microsoft .NET framework. It's easy to see why Microsoft chose to extend OCaml when building a functional language to bring to its software engineering masses.

On the other hand, Haskell is almost the ideal language for academic exploration of functional programming. Its strict limits on side effects and its adherence to abstract mathematical concepts mean there are no side-effecting surprises. Also, because it is a committee language, a researcher who can gather enough support for an idea can be almost certain it will be included in the next iteration of the language.

Haskell as a Committee Language

Repeat the mantra after me: Haskell is Lazy; Haskell is Pure; Haskell has Type Classes; Haskell is a Committee Language.

Of all of these, the most defining characteristic of Haskell is that it is a committee language. It’s an amalgamation of many different goals with no clear vision. This is at the same time Haskell’s greatest strength and greatest weakness. While it is the most widely used pure functional programming language, the quirks of committee design are obvious.

Some I ran into within two hours of starting with Haskell:

The first was integer rollover. Haskell has two standard integer types: Integer and Int. Integer is arbitrary-precision but can be quite slow to use, and because of that it's rather infrequently used. Int, on the other hand, is fast but, just like in C, can roll over. There is no overflow flag to check.
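A minimal sketch of the difference (one caveat: the Haskell standard leaves Int overflow behavior open, but GHC's fixed-width Int wraps around silently in practice, which is the behavior shown here):

```haskell
-- Int is fixed-width (64 bits on typical GHC targets) and, with GHC,
-- wraps around silently on overflow -- no error, no flag to inspect.
wrapped :: Int
wrapped = maxBound + 1            -- wraps to minBound

-- Integer is arbitrary-precision and never overflows.
exact :: Integer
exact = fromIntegral (maxBound :: Int) + 1

main :: IO ()
main = do
  print (wrapped == minBound)     -- True: the overflow went unnoticed
  print exact                     -- the mathematically correct value
</imports>
```

The trade-off is exactly the one described above: Integer buys correctness at the cost of speed, and Int buys speed at the cost of silent wraparound.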

So, Ints can roll over; I can accept that. What it implies to me is that speed is more important to Haskell than robustness. However, this brings me to my second point: many basic list operations, such as head and tail, throw errors on an empty list. This seems entirely inconsistent to me.

I understand that if they didn't, a logic error would be much more likely to cause an infinite loop in a tail recursive function. However, this seems completely at odds with the "speed first" definition of Int. It also means that almost everyone ends up wrapping the default list operations in the Maybe monad.

The third issue was that operations on the Float data type are slow. Real World Haskell suggests always using Double, because a great deal of effort has gone into optimizing Double arithmetic but very little into Float. This demonstrates another thing that happens with committee languages: things as important as the optimization of basic data types can fall through the cracks because everyone involved wants to work on more exciting problems.

Please don’t misunderstand me here, I really like Haskell. I’m hard on it because I can see that it has a great deal of unrealized potential. If Haskell is to be a language used for real software engineering, the committee needs to sit down and think hard about an overarching vision for the project.

What is the goal here?

The biggest difference between the world of theorists and the world of engineers is that each group has an entirely different set of concerns.

Theorists want to implement ideas quickly so that they can crank out papers quickly. A large part of this is having a language close enough to math that implementing ideas straight from the chalkboard is trivial. Because the theory world changes so fast, they often don't care much about organization or maintainability.

As the committee responsible for Haskell is mainly made up of theorists, it’s easy to see why the language has taken the direction it has. It’s a language that is very close to math. As the lifecycle of most academic code is very short, small implementation details which might cause a reduction in robustness are less important.

Engineers want to minimize time spent maintaining code. Part of this is having a language that emphasizes safety in that it facilitates catching as many bugs as possible, as early in the process as possible. Another important part of this is code organization as every moment that is spent trying to find a bug is a moment spent not fixing it. As the cost of maintaining software generally dwarfs the initial development cost, development speed must take a back seat to testing and organization.

The syntax heavy C# is a great example of this. It's slow to write in but provides many constructs for organizing and testing code. On top of this, a great number of design patterns exist to further categorize substructures in a computer program. C# is slow to write, but it's relatively safe, and mountains of patterns and best practices have been developed to guide its developers.

However, we in the software engineering world are in the midst of a crisis. It turns out that traditional imperative object oriented programs do not lend themselves to heavy parallelization. Yet, parallelize we must. We are looking at exponential growth in the number of cores contained in each processor. Because of this we engineers find ourselves at a bit of an impasse. Those that are looking ahead know…

Engineers will soon want very badly to minimize time spent maintaining parallelized code. We need our programs to be easy to organize, manage and test. Yet, as we will soon need to deal with massively parallelized systems, we find many of our ideas about what makes code robust and maintainable are broken. At the same time, to move to a purely functional language means leaving behind years of thought on how computer programs ought to be constructed, tested and maintained. Having any pattern, even if it’s wasteful or has many corner cases, is much better than having none. This is why a hybrid language is so important.

OCaml and F# provide engineers with the set-and-forget concurrency that comes along with the functional tradition. At the same time, these languages have all of the organizational constructs of object oriented programming as well. This means that we can continue to use the same kinds of large scale organizational structures in our programs while also gaining the safe parallelism that default immutability provides.

Conclusion

And so we see that it's important to consider a language in terms of how its creators envisioned its use. Haskell has been developed mainly with research in mind and so is a fantastic research language. F# has been developed mainly with engineering in mind and so is much better suited for engineering.

Comments

You conclude that Haskell is a great research language, but is not suitable for real world programming. I'm afraid I didn't follow the argument -- the quirks you mentioned aren't particularly horrendous. Certainly, I've seen far worse warts and inconsistencies in the languages heavily used in industry. Maybe Haskell isn't suitable for engineering, but I'm not totally convinced it's out of the question.

You also criticise Haskell for lacking an "overarching vision". I'd argue that Haskell has, and has always had, a clear (and largely unique) vision of being a pure, lazy functional programming language, a theme it's followed pretty much without compromise. Little of the language remains unaffected by those choices. You could even argue that its overarching vision makes it less practical; in particular, the lack of uncontrolled side effects. I'd also point to the development of Haskell Prime as an indication that, out of the dozens of academic extensions and experiments conducted with Haskell, only a very few proven ideas are being conservatively adopted into the next revision of the language standard. The GHC implementation may be a different story, but the standard Haskell language is far from being a big academic free-for-all where dozens of half-baked ideas and implementations are chucked in to see what happens.

If you'll forgive me for disagreeing on one more point, I also don't see that closeness to maths is a bad thing even for pragmatists. Sometimes a well thought-out theory for a system means that system ends up much more consistent and robust than something hacked together without any rigorous up-front intellectual effort, and then monkey-patched as necessary to fix the corner cases as they emerge. (And sure, sometimes there is no nice theory for something as messy as real world problems, but...not always!)

Thank you for taking the time to write such a well thought out comment.

Admittedly, I've just begun learning Haskell and don't have deep knowledge of the language. When using Haskell, I feel that many aspects are designed so that code is written in as safe a way as possible. However, my early experiences with Int and the list operations make me feel that not everyone involved in designing the language shares a common vision.

By vision here I don't mean the "what", as in lazy or pure; I mean the "why", as in making our programs as robust as possible, or facilitating concurrency.

Also, I don't see closeness to math as a bad thing either, quite the opposite. Math is the foundation of programming as a whole and many techniques used in software engineering, in all languages, have deep roots in theory.

However, much of modern software engineering also comes from patterns and organizational structures discovered from insight or developmental trial and error. I just don't feel like large sacrifices should be made to appease the math gods without sufficient reason.

I remember having a similar experience of being annoyed with list operations returning Maybe values while I was learning Haskell. It turns out not to be so much of an annoyance for me now, largely because I structure my code in a way that avoids explicit list values (more folds, for one).

One of the main advantages of Haskell over F# is that you can't use the same coding strategies that you are comfortable with. This both makes it great for learning and also impractical for everyday use until you climb the long and steep learning curve. I hope you at least make it to the point where you see a possibility of large-scale design as well-organized (or more) as OO design.

Prelude is rather inconsistent in places, it's true. I don't see this as a result of design by a committee as much as the relative immaturity of purely functional programming. As the other commenter says, I think of Haskell as having a more clear and consistent vision than most languages, at least at its core.

Great job articulating the committee language syndrome that stifles Haskell. You really nailed the issue.

One thing, though: you mentioned that C# is slow to program in. Not true. It certainly is far more verbose than any member of the ML clan, but that doesn't necessarily have a direct effect on developer productivity. For instance, I come from the school of C++ and have a heavily biased disposition toward curly-braced, imperative programming. Simply because I am very comfortable with the paradigm, I can crank out imperative code orders of magnitude faster than Haskell, F#, or Erlang.

I absolutely value the benefits of the functional paradigm, but the construction process is so different from what I grew up with, that I take forever to write anything useful. At the end of the day, my F# code is much more compact, less buggy, easier to understand, and more concurrency friendly than C#, but it took me much, much longer to write. For me, a big problem is not knowing how to express concepts I already know within the syntax of Haskell or F#.

Perhaps as I hone my functional skills and become more comfortable, the bias will begin to swing, but for the time being I am still far more productive in C++ and C#.

The real advantage of F# over Haskell is the .NET framework. I don't see the advantages of the language itself as being significant. It's the large framework and library that comes along with it that will have the greatest impact.

For sure, inventing patterns on the fly is great when working on your own but this does not scale to large projects. Even if you were to invent a full array of patterns, each well thought out, only you would know them. Before pure functional programming is acceptable for large scale software engineering there is an additional learning curve that must be met: that of the community. Books must be written, studies done, and best practices invented. Until then it will remain the domain of hobbyists and theorists.

Jason,

Don't you think that, given enough time to internalize the language and its paradigms, it might be faster to write programs in F# than C#? I would think the terminal velocity of a language would be directly related to the terseness of its syntax.

Brian,

The .NET framework (and the safety it provides) means quite a bit. However, let's not lose sight of the fact that it's entirely possible to link any C library into an OCaml application.

I disagree with the opinion that C# is slow to write in. I'd say that it's faster than writing C++ or Java, assuming that those are the sort of languages you compare it to. However, imperative languages are all slow to write in compared to functional ones.