The Aerodynamic Drag – Type Inference Analogy

Okay, so. Welcome to my new blog.

I’m going to start this whole thing off with an odd analogy that I’ve had in my head for a little while now. I’m going to attempt to make some sort of point about type systems and type inference, and I’ll probably end up bashing Scala somehow. Here goes.

Let’s say you’re flying an airplane. Even better, let’s say you’re a student pilot, learning to fly an airplane. At some point, your instructor sits you down, says “let’s talk about drag,” and draws a graph looking something like this on the whiteboard:

Your flight instructor explains that there are two types of aerodynamic drag that affect small planes like the ancient, busted-ass little Piper you’re training in. One is called parasite drag, a combination of form drag and skin friction. It’s sort of the obvious one: the faster you attempt to fly, the more air resistance you face. Parasite drag is proportional to the square of the aircraft’s velocity. The other drag force is called induced drag, and is probably slightly less obvious: the slower a plane flies, the stronger the wingtip vortices it generates as a byproduct of producing lift. Induced drag is inversely proportional to the square of velocity.

The important part here is that your total drag is the sum of the two. So if you look at the graph, you can see that there’s some velocity at which your airplane flies most efficiently, that is, with a minimum of drag. If you try to fly faster than this airspeed you’re fighting parasite drag, and if you try to fly slower you’re fighting induced drag. This optimally efficient airspeed has many names, but is often referred to simply as “best glide speed”: flying so as to minimize drag will take you the furthest distance in the event your crappy little engine dies.
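
For the algebraically inclined, the bottom of that curve is easy to find (my notation, not the instructor’s): write parasite drag as k_p·v² and induced drag as k_i/v², then

```latex
D(v) = k_p v^2 + \frac{k_i}{v^2},
\qquad
\frac{dD}{dv} = 2 k_p v - \frac{2 k_i}{v^3} = 0
\quad\Longrightarrow\quad
v^{*} = \left(\frac{k_i}{k_p}\right)^{1/4}
```

and at v* the two drag terms come out exactly equal, which is why minimum total drag sits right where the two curves cross.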

What the fuck does this have to do with type systems? Right.

Let’s say your flight training budget dried up, and you grudgingly went back to the software industry. You are tasked now with evaluating which programming language might be good for a project, resulting in optimal developer productivity. You figure it’s important to choose a programming language with a strong, static type system (you still have a bit of PTSD from your Perl days, before you took up flying lessons), but you also know that there are many statically typed languages to choose from. Now, since productivity is naturally measured in dollars per keystroke, you want to minimize the number of keystrokes needed to bring in the dollars. Thus your ears perk up when I tell you that type inference is the secret to minimizing keystrokes.

Now I take you into a conference room and show you this graph:

There are two kinds of type annotations a developer has to manually, explicitly provide in a program: annotations due to a language’s lameness, and annotations due to a language’s awesomeness.

Languages with type systems that aren’t very powerful or expressive, such as Java (at the left of the graph), force you to write a lot of lame, obvious type signatures in your code. These are type signatures that, you figure, the compiler should be able to figure out for you, but it stubbornly refuses. For example:
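
A hypothetical snippet in that spirit (the post’s original example isn’t shown, so this is my stand-in): the notorious pre-Java-7 double declaration, where you spell out the whole generic type on both sides of the `=` and the compiler still won’t meet you halfway.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Lameness {
    public static void main(String[] args) {
        // The full generic type, written out twice. The compiler already
        // knows the right-hand side's type; it just makes you repeat it.
        Map<String, List<Integer>> scores = new HashMap<String, List<Integer>>();
        scores.put("alice", new ArrayList<Integer>());
        scores.get("alice").add(42);
        System.out.println(scores);  // {alice=[42]}
    }
}
```

(Java has since grown the diamond operator and local `var`, but you get the idea.)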

On the other end of the spectrum, languages with type systems that are extremely powerful and expressive—dependently-typed languages (at the right of the graph, marked “dt”) such as Agda—also force you to write a lot of type signatures, because the compiler would never be able to infer correct types on its own. Type-checking in these languages is hard enough; full type inference is undecidable. For example:

not : Bool -> Bool
not true = false
not false = true

One would figure that the Agda compiler could figure out the type of not based on its simple definition, but nope. Any inference algorithm clever enough to recover this type would still fall flat as soon as a genuinely dependent type showed up.

So what about languages towards the middle, where the amount of both lameness-induced type annotations and awesomeness-induced type annotations is minimized? Well, it’s my belief that Hindley-Milner type systems (with Damas-Milner type inference) are basically the sweet spot. In other words, Haskell and ML are “just right.”
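
For instance (my example, nothing exotic), Damas-Milner inference hands you the most general type of a definition with zero annotations:

```haskell
-- No type signature anywhere; GHC infers the principal type
--   compose :: (b -> c) -> (a -> b) -> a -> c
-- all by itself.
compose f g x = f (g x)

main :: IO ()
main = print (compose (+ 1) (* 2) 3)  -- prints 7
```

That inferred type is the principal one: every other valid type for compose is an instance of it.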

Somewhat toward the “lameness” side, you have Haskell 98 with its monomorphism restriction (see “mmr” on the graph), which prevents us from saying something like:

bind = (>>=)

without an accompanying type signature:

bind :: Monad m => m a -> (a -> m b) -> m b
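
For what it’s worth, GHC lets you switch the restriction off, at which point the bare binding from above is accepted at its fully general inferred type. A small sketch:

```haskell
{-# LANGUAGE NoMonomorphismRestriction #-}

-- Without the pragma (and without a signature), the monomorphism
-- restriction would pin this binding to a single monad, or reject it
-- outright if nothing in the module constrained the type.
bind = (>>=)

main :: IO ()
main = do
  print (bind [1, 2] (\x -> [x, x * 10]))      -- the list monad
  print (bind (Just 3) (\x -> Just (x + 1)))   -- the Maybe monad
```

Using bind at two different monads in the same module is exactly what the restriction would forbid.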

Somewhat toward the “awesomeness” side, maybe, you have Scala’s object-oriented subtype polymorphism, which buys you things like simple heterogeneous collections:
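
Something like this sketch (hypothetical names, mine): because every element shares a supertype, a mixed list needs no ceremony at all.

```scala
trait Pet { def speak: String }
class Dog extends Pet { def speak = "woof" }
class Cat extends Pet { def speak = "meow" }

object Demo {
  // The element type is inferred as the common supertype, Pet;
  // no annotation in sight.
  val pets = List(new Dog, new Cat)

  def main(args: Array[String]): Unit =
    pets.foreach(p => println(p.speak))  // woof, then meow
}
```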

To emulate this dynamic dispatch in Haskell, you’d need existential types, which are a fair bit more cumbersome and not standard in the language. That being said, here comes the Scala bashing I promised earlier: this kind of polymorphism absolutely ruins type inference, for not much benefit. It prevents inference entirely in certain scenarios and leads to undesirable inferences in others. So although Scala’s type system is indeed more powerful than that of Haskell or ML, I wouldn’t call it more useful; thus its litany of required type annotations is actually lameness-induced, and I need to reconsider the layout of my chart.
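
Concretely, that emulation might look like this (one encoding among several, with names I’ve made up, using GHC’s ExistentialQuantification extension):

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Pack any Show-able value into a wrapper and forget its concrete type;
-- the heterogeneous list is then a plain list of wrappers.
data Showable = forall a. Show a => MkShowable a

heteroList :: [Showable]
heteroList = [MkShowable (1 :: Int), MkShowable "two", MkShowable True]

main :: IO ()
main = mapM_ (\(MkShowable a) -> print a) heteroList
```

Note the explicit wrapping and unwrapping at every use site, which is the cumbersomeness I’m talking about; Scala’s subtyping gives you the same dispatch for free.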

Okay, this is probably all a ridiculous analogy, but I’ve gone to the trouble of writing it up with pictures and all, so I’m sticking to it. The moral of the story is that languages with principal types—informally, languages in which suitable types can always be inferred for any well-typed expression—are the working software engineer’s “best glide speed.”