The abject failure of weak typing

Over the last few years of maintaining code old and new at REA, it has become abundantly clear that the neglect and misuse of type-systems has had a sharply negative impact on our codebases. Here we address the concrete causes and consequences, and propose concrete and achievable solutions.

Types at REA

REA’s codebases vary between statically typed languages such as Java and Scala on one hand, and “dynamic” languages such as Ruby, Javascript, and Perl on the other. I’ll mostly leave the dynamic ones alone — while the same arguments obviously apply, they are what they are. However, if we are paying for the conceptual overhead of a static type system, we need to be sure we are using it effectively; otherwise we are simply writing the same old Ruby play-dough but with long build-times and cumbersome syntax. Used properly, the tradeoff of static types is overwhelmingly beneficial.

What’s in a type?

A type is some logical proposition about a codebase, where an implementation is its proof. They are distinct from runtime tags, which are what people mostly mean when they talk about dynamic “types”. Terminology can vary, but I’ll stick to this usage. For example:

Haskell has a spectacularly rich static type system, with no runtime-accessible metadata, apart from pattern matching on tagged unions.

Dynamic languages such as Ruby are not really untyped, but rather unityped; statically, we know that everything inhabits a single message-receiving type. At runtime, of course, they employ tags to differentiate numbers from strings, arrays, maps, and user-defined structures.

Java has both a type system applied by its compiler, and a tag system at runtime that allows reflection, casting, RTTI, and various dynamic features. Neither is a compelling specimen.

Types as design

Even if the language we use doesn’t have a notion of types, you’d better believe they exist — how can we even write code without statically reasoning about it first? A maintainer must construct a model in their head: what keys do we assume are in this map? Might this thing be null? What are the allowed values of this symbol? Can this string be empty? What messages will this object respond to?

Many dynamic-typing advocates have a curiously limited view of what a “type” is – in fact, almost any proposition we can statically make about the code can be represented as a type. Language choice notwithstanding, there is an enormous amount of information that fits into this category; by proving it up-front we can drastically reduce the number of incorrect programs that are even expressible. As the saying goes, don’t write software that isn’t broken; write software that can’t be broken.

Types are a powerful tool for clarifying thoughts, and designing correct software, arguably far more so than popular test-driven methodologies.

Types done wrong

Essentially anything that can be cleanly and obviously known about the code up-front belongs in a type. The most egregious failures here often stem from using the most common everyday concepts — strings, exceptions, primitives, maps, nulls, typecasts and so forth. Many readers of this post will, like its author, find something to feel guilty about here. Let’s take a tour:

Nulls

The harm represented by nulls is hopefully widely understood by now, but bears repeating.

Any value that we know could be null cannot be directly accessed by correct software; we must either surround usages in an if-guard, or employ some kind of harmless Null Object that can hopefully respond in a sensible way. Null is a bald-faced lie told by the type system in Java, C# and Scala, and to the developer’s mental model in Ruby. If a variable claims to be a Banana, surely you can feel justified in peel()-ing it? If it is null, then it is no banana at all, but a ticking time bomb waiting to explode, potentially at an unrelated line of code far away. Well-written code cannot tolerate even the possibility.

The proliferation of duplicated defensive code at numerous locations is a further burden, which bloats both code and tests, while reducing quality.

Solution:

Never permit nulls in code you control, and firmly regulate the contact points of systems and frameworks you don’t.

If a type has a sensible “empty” or default value that can fulfil the contract of your type, then initialise variables to this, or employ a Null Object.

Avoid mutable entities that need to be initialised piecemeal after creation. Write immutable objects that are immediately and fully initialised from constructor input.

If a particular variable might or might not be present, then this should be encoded in the type system using an option type, like Scala’s Option, Function Java’s Option, Java 8’s Optional, or Haskell’s Maybe. This correctly represents the uncertainty in the type system, so that any access is safe, simply by the fact that it compiled.
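As a sketch of the Option approach in Scala (the `User` and `greeting` names here are invented for illustration):

```scala
// Absence is encoded in the type, so the compiler forces every caller to handle it.
case class User(name: String, nickname: Option[String])

def greeting(user: User): String =
  user.nickname match {
    case Some(nick) => s"Hi, $nick!"
    case None       => s"Hello, ${user.name}."
  }
```

There is no way to "forget" the empty case: the code only compiles once both branches are covered.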

Exceptions

Exceptions are the primary error-handling mechanism employed by many widely-used languages. They are also a side-effect that makes a liar of the type system, and makes local reasoning about code far more difficult. They represent an undeclared method result smuggled through a back-channel separate from its declared return type. Furthermore, they transitively become an undeclared result of anything that calls that method, and anything that calls that, and so on. Trying to reason about the correct behaviour of code becomes very difficult, since the return type can no longer give you enough information. Exceptions kill modularity and inhibit composition.

Java awkwardly attempts to mitigate this with checked exceptions; they become a fully-fledged, type-checked part of a method signature. While this is better from a type-safety point of view, they still use an exotic second channel for returning results totally incompatible with the first, require an insufferable amount of handling code, and have far poorer tools for abstraction and reuse. Checked exceptions are widely despised by Java programmers, and frequently ignored by library authors.

Solution:

Don’t throw exceptions in code you control, except in the most irretrievably broken circumstances.

When dealing with code you don’t control, catch their exceptions as soon as possible and lift the various results into your return type. In Scala, the easiest way to do this is the Try type, which directly lifts the result into an ADT of Success(yourValue) or Failure(thrownException).

Exclusively encode possible function results in the return type. Don’t throw that AuthenticationException for a totally plausible and normal outcome! Here are some alternatives:

When there is a main result alongside a possible failure result, use an existing Either or Validation type.

Define your own Algebraic Data Type (ADT) that describes the possible alternatives. For instance, in Scala or Java, this can take the form of a closed mini-class hierarchy.
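A sketch of both techniques in Scala (the names here are invented for illustration, not taken from any real API):

```scala
import scala.util.Try

// Lift a throwing call from code we don't control into our return type.
// Try("80".toInt) becomes Success(80); Try("x".toInt) becomes Failure(...).
def parsePort(s: String): Try[Int] = Try(s.toInt)

// Or define an ADT that names every plausible outcome up front:
sealed trait AuthResult
case class Authenticated(userId: String) extends AuthResult
case object BadCredentials extends AuthResult
case object AccountLocked extends AuthResult
```

A caller pattern-matching on `AuthResult` is then obliged by the compiler to consider locked accounts and bad credentials, rather than discovering them via a stack trace.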

Primitives

Primitive values such as integers and strings are often the first tools we reach for, but are woefully unsuited to most use-cases they are press-ganged into. This is because they have an astronomical number of possible values, and most use-cases do not.

Integers

Consider this function:

def blah(b: Boolean): Boolean

A function A -> B has Bᴬ possible implementations, where A and B stand for the number of inhabitants of each type. So this function has 2² = 4 possible implementations. Perhaps we could write a test case for each one.

Now consider this function:

def compare(a: Int, b: Int): Int

This one has not only 2³² possible results, but an incalculable (2³²)^(2⁶⁴) possible implementations. Even if we are generous and only allow the inputs to be 0 or 1, we still have 2¹²⁸ implementations. If the only meaningful return values are -1, 0 and 1, indicating “less than”, “equal” or “greater than”, then why would we embed them like specks of dust in a big desert of undefined behaviour?

If we encode the return value as an ADT representing the 3 possible results, as Haskell does, then we have a positively civilised 3⁴ = 81 possible mappings from input to output. Even in such a simple case, a more precise return type pruned 340 undecillion incorrect programs from the sphere of existence.
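A sketch of such an ADT in Scala (the names are assumed; Haskell's equivalent is its Ordering type):

```scala
// Exactly three values inhabit this type -- no specks of dust in a desert.
sealed trait Comparison
case object LessThan    extends Comparison
case object EqualTo     extends Comparison
case object GreaterThan extends Comparison

def compare(a: Int, b: Int): Comparison =
  if (a < b) LessThan
  else if (a > b) GreaterThan
  else EqualTo
```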

Remember, that was an utterly trivial example. So what happens when you have complex/composite/nested data structures, exceptions and even side-effects? How many of the possibilities are even remotely meaningful in your domain? Without precise types, how many were you hoping to reach with your TDD and “100% code coverage”?

Strings

Strings are perhaps the most commonly used data type, due to their immense versatility; however, they are rarely appropriate. Strings consist of a sequence of characters. This is the perfect representation for unstructured free text, and nothing else. Any restriction, structure or constraint in the format of the string means you don’t have a string at all; you have a URL, a Date, a Name, an Email, a Document, an ID, a Warning, or whatever else that might tempt you to deploy this amazing swiss-army-type.

Not only does “stringly typed” code result in a catastrophic expansion in the number of expressible incorrect programs, it inevitably results in duplication, as the validation, destructuring and restructuring code is repeated in every spot where the string format is supposed to be in use.

Solution:

If there are a finite number of possibilities, use an ADT to represent them. This internally prevents a vast number of incorrect implementations, and externally prevents a vast number of incorrect usages.

Use a wrapper type to encapsulate the desired structure; make it impossible to create invalid instances. This will also cull incalculable absurdities inside and outside of the function. This costs one line of code in Scala.

As a last resort, throw exceptions inside constructors to prevent any remaining possibility of an invalid instance.
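A sketch of both techniques (the `CustomerId` and `Email` types here are hypothetical examples):

```scala
// One line buys a distinct type that can no longer be confused with a bare String:
case class CustomerId(value: String) extends AnyVal

// Last resort: fail construction outright, so no invalid instance can ever exist.
case class Email(value: String) {
  require(value.contains("@"), s"Not a valid email: $value")
}
```

Once constructed, an `Email` can be passed around and stored with no further validation; the check lives in exactly one place.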

Records vs Domain modelling

A couple of years ago, we were keen to avoid over-specific domain modelling, and took care to build our services as dealing in the domain of “records” or “attribute-maps”, rather than specifically tie our logic to Listings, Agents, Dogs, Cats, Aeroplanes or what-have-you. This was to prevent the loss of generality, and to keep the application focused on web and infrastructural concerns. So following the famous Alan Perlis line “It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures”, we decided on a unityped representation of domain data.

In an absolute sense, I can’t say whether this is a good idea or not. It’s potentially a totally valid point of view — perhaps the detail and structure of the record data is of no relevance to the code that provisions, streams, encodes, stores, secures and displays it — as long as it stays that way. In practice, however, it has left our codebase with serious flaws.

Firstly, the approach of using unityped records is totally predicated on the application not needing to know anything about their structure, beyond the obvious tree-shape. When the application suddenly needs to, say, differentiate “floorplan” from “main” images, or know if a “logo” was included, then the concept is doomed. This knowledge, completely understood at compile time, must be expressed in clumsy string-based map retrieval, faith-based typecasting and a total absence of any way to reason about the correctness, or even the intended behaviour of the code.

Either business logic has to be strictly forbidden from this unityped pipeline, or we need the types of the data to truly reflect the knowledge we have, and need, at compile time.

Secondly, even if the unityped record approach was correctly chosen, using, as we did, Map<String, Object> was a clear blunder. The verbosity and repetition are appalling; every single access, construction, iteration, de-reference has to be laboriously performed by hand at myriad locations around the codebase. Even if all we need to know is “it’s a map with certain behaviours and constraints”, then we should have encoded that in the types, and created some sort of Record class. In this case, it was also compounded by the use of Java (pre-8), which has extremely poor facilities for abstracting over collections and maps.

Solution:

The static knowledge we have, or require, about our domain objects should be captured in types. If there is a clear case that we don’t need to differentiate between this-or-that domain object, then that should be captured in the types.

Quickly translate interchange formats into data structures that reflect the knowledge we require, and can only hold meaningful and valid states. There are mapping tools in Java and Scala that can safely map between JSON and typed objects.
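As a sketch of translating at the boundary (the `Listing` shape and field names are invented for illustration):

```scala
// The knowledge we actually have, captured in a type:
case class Listing(id: String, hasLogo: Boolean)

// Translate the untyped record once, at the edge; everything downstream
// works with Listing and cannot mistype a key or mis-cast a value.
def fromRecord(r: Map[String, Any]): Option[Listing] =
  for {
    id   <- r.get("id").collect { case s: String => s }
    logo <- r.get("logo").collect { case b: Boolean => b }
  } yield Listing(id, logo)
```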

Names are overrated

Let’s look at a function:

def findAddress(userId: String): String

What does it do? Are you sure?

Now lets look at another:

def blah(foo: UserId): Address

Which one tells you more about its purpose — the one with the businessy names, or the one with the types?

Naming has a role to play, but consider what it really does. It is a mnemonic, a reference that helps you uniquely recall a concept. While this is fine as far as it goes, names are totally useless for reasoning about software. For documentation, they are as poor as comments, or Word docs. Implementing a precise type signature proves that the software does what it says on the tin. If the types in question have been carefully designed to prevent invalid states, then often there will only be a handful of possible implementations, or even one — not a number you’ve never heard before, ending in -illion.

Solution:

Treat your types as the only real documentation.

Constrain argument and return types to named alternatives that limit the possible states. This is all maintainers should need to know about your functions.

Wrapper types start from one line of code in Scala.
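A hypothetical sketch of those wrappers; the `directory` parameter is invented purely to keep the example self-contained:

```scala
case class UserId(value: String)  extends AnyVal
case class Address(value: String) extends AnyVal

// The signature alone says what goes in and what comes out.
def findAddress(id: UserId, directory: Map[UserId, Address]): Option[Address] =
  directory.get(id)
```

Passing a raw `String` — or an `Address` where a `UserId` belongs — is now a compile error rather than a latent bug.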

The way forward

While I’ve listed a variety of different problems, you’ll notice a lot of repetition in the proposed solutions. In fact, we can solve these problems very simply, using only a few techniques — especially if we continue our adoption of Scala in place of Java and weakly-typed languages.

Algebraic Data Types (ADTs)

ADTs are a powerful tool for us here, because they allow us to encode limited possibilities in the type system, so that invalid combinations are inexpressible in a well-typed program.

They are called “algebraic” because they are sums and products of other types. (Sums are like OR, and products are like AND).

For instance, a List in Scala is defined as an ADT — it is a Cons of a value AND another List, OR an empty List. In Haskell, this could be written simply as:

data List a = Cons a (List a) | Empty

In Scala, at the cost of some more characters, we could encode this as a mini-class hierarchy:
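A minimal sketch of that encoding, mirroring the Haskell definition above:

```scala
// A List is a Cons of a value AND another List, OR an empty List.
sealed trait List[+A]
case class Cons[A](head: A, tail: List[A]) extends List[A]
case object Empty extends List[Nothing]
```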

This is almost a straightforward Java-style class hierarchy, but notice the sealed keyword: unlike normal OO classes, List cannot be extended, except by the classes below it. Without this feature, the number of possible outcomes would still be totally unbounded. In Java, we can still benefit from using this style, but the code required to manually write accessors, constructors, threading through arguments, correct hashcode/equals implementations and unit tests is significant, and error prone.

OO lore has it that pattern matching is evil, and that subtype-polymorphism is the answer to all questions. This is false; there are complementary pros and cons to subtype polymorphism and pattern matching. Since there are only a few fixed cases, it is perfectly idiomatic and sensible to pattern match on ADTs; the Scala compiler will even complain if we haven’t matched every possible eventuality.
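For example, a sketch with an assumed `Shape` ADT:

```scala
sealed trait Shape
case class Circle(radius: Double) extends Shape
case class Rect(w: Double, h: Double) extends Shape

// Because Shape is sealed, omitting a case here would trigger a
// "match may not be exhaustive" warning from the Scala compiler.
def area(s: Shape): Double = s match {
  case Circle(r)  => math.Pi * r * r
  case Rect(w, h) => w * h
}
```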

Wrapper types

Wrapper types are one of the best ways to avoid the buggy swamplands of code written with bare strings and primitives. In Scala, this starts at almost no effort:

case class Angle(radians: Double) extends AnyVal

“Case classes” in Scala are (mostly) like any other class, except that the compiler will generate useful functionality. Scala will automatically bind the constructor parameter to an immutable field exposed through accessor methods, generate correct equals and hashCode, a default toString, and a pattern matching extractor. Hugely useful.

Value Classes: eliminating runtime overhead

Because Angle extends AnyVal, it is a “value class”: in most contexts the compiler avoids allocating the wrapper object and operates on the underlying Double directly, so the extra type safety is essentially free at runtime.

Normalising and validating input

We can get all the benefits of classes here – we can define our own operations, and normalise or validate the constructor input. Here are some examples of this technique applied to Angles and Percentages. Note they are correct-by-construction; any instance of this type is guaranteed to represent a valid and normalised value.
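A sketch of what such definitions might look like (the exact details are assumed):

```scala
// Validation: construction fails outright for out-of-range input,
// so every Percentage in existence is valid.
case class Percentage(value: Double) {
  require(0 <= value && value <= 100, s"Invalid percentage: $value")
}

// Normalisation: any input is coerced into the canonical range [0, Tau),
// so every Angle in existence is normalised.
final class Angle private (val radians: Double)

object Angle {
  private val Tau = 2 * math.Pi
  def apply(radians: Double): Angle = {
    val r = radians % Tau
    new Angle(if (r < 0) r + Tau else r)
  }
}
```

The private constructor forces all construction through the normalising factory, which is what makes the type correct-by-construction.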

Types are low hanging fruit

While there are no silver bullets, there’s an awful lot of low-hanging fruit just lying around. Let’s pluck it! We can make major improvements in our software quality, even with minor adjustments to our coding style. Code can be easier to reason about, with vastly fewer ways to fail, at a very low cost.

Types, kindly bestowed upon us by some languages, are a magnificent tool to improve quality. They prove desirable properties of our code; we should make it our business to put as much code in their reach as we can! There will always be a point where types have no more to say, and must pass the quality baton to tests. Consider though, how much less work tests must do, and how much less code they must expend, when entire universes of nonsense have been prohibited from existence.

In particular, by eschewing exceptions, using Algebraic Data Types to model the precise shape of our data, and wrapper types to constrain crude Strings and primitives, we can make immediate gains before we even get to more advanced abstractions like typeclasses and higher-kinded types.

In Java, much of this has been known for a long time, but the language’s lack of support for value-based classes, ADTs and pattern-matching has meant that good practices are often discarded as prohibitively cumbersome or expensive. Regrettably, despite the welcome addition of lambdas, Java 8 provides little respite.

In languages like Haskell and Scala, these methods are so cheap as to be no-brainers; in new projects you have no excuse for passing up these delights!

Either way, I hope that I’ve convinced you of the good news — there are plentiful green fields of easy code-improvement ahead, before we even get close to tough tradeoffs.

Thanks Jem, good point. I briefly touched on that in the conclusion. It was more of a reaction to the “what do we need types for, we have tests!” line that I keep seeing. I’m leaving “100% coverage” in scare-quotes though. 🙂

Seems a waste to have such great commentary marooned in Facebook. I hope you don’t mind if I quote you at length:

“I don’t write super abstract Haskell code because I love abstraction. I
write it because it constrains the search space of possible
implementations so much that it’s obvious what few design choices I’ve
left myself. TDD is great when you are swimming in a sea of
possibilities. When you have only one or two possible implementations it
becomes rather redundant.

The challenge to me when writing Haskell is how to make the types abstract
enough to constrain away bad implementations while avoiding unmotivated
abstraction. If I can’t get all the way there, I throw something like
doctests (or add quickchecks) for the code to shore up my confidence and
communicate examples (or laws) to my users, but the types themselves
also communicate a great deal of information to users.”

Though I understand what you’re aspiring to here, I’ve been convinced by Rich Hickey’s line of thought regarding type systems: that ultimately they don’t bring as much gain as we would hope.

“Statically typed programs and programmers can certainly be as correct about the outside world as any other program/programmers can, but the type system is irrelevant in that regard. The connection between the outside world and the premises will always be outside the scope of any type system.”

Thanks for the link Jesse. I’m a huge fan of Rich Hickey, but I think he has the wrong end of the stick here. The connection between the real world and the program is down to modelling, either way. A type system can certainly do this, albeit up to a point. The question is, can you do it better in your head, with the astronomical explosion of possibilities and absence of reasoning tools that unityped languages necessarily entail?

For me, the answer is unambiguously no; I just can’t see how that pays for itself. I spend far more time maintaining than writing software, and the “connection to the real world” makes not a jot of difference. Difficulty in reasoning costs hours, days, months, sanity, and cold hard cash.

Well, pragmatically, a program can’t be correct if its inputs are wrong. The only thing static typing *might* give you in that case is an earlier report of an error (eg. if a string fails to be forced into its required type). That can still be extremely useful, of course, but it isn’t as compelling as the remainder of Ken’s arguments.

Hickey’s quote seems to limit the discussion to the interface between the program and the outside world. However, that’s only one location where problems can arise. As Ken describes, once types have been established, and transformations between those types are more carefully constrained, entire classes of *intra*-program bugs can be eliminated.

Another way to describe this is the famous Benjamin Pierce quote: “A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute.” By limiting the range of values computed, using more restrictive types, you necessarily increase the power that the type checker has to ensure correctness, by removing possibilities from the set of results. In other words: constraining your types makes the type checker work FOR you, and not the other way around! 🙂

Of course. But I have understood that he chose dynamic typing for Clojure for a reason, and I imagine it has to do with the consequences of including static typing on the rest of the language. I’m trying to understand what those are.

I think dynamic types make sense for what Clojure and especially Clojurescript intend to be. Lisps do give you some alternate ways of statically checking properties that type systems can’t. You can have a macro that raises an error at expansion time.

Static type systems are a strict subset of what macros can check since you can implement static typechecking with macros. On the other hand, having a good type system that everyone uses by default is still nice, it’s just not the only way to go.

You are positing a false dilemma: either something is backed by scientific evidence (by which I assume you mean a statistically valid set of empirical observations) or it is merely opinion. Mathematical propositions are an obvious exception, but even still there are propositions which, while not being strictly mathematical in nature, nonetheless seem to maintain a logical form, and are useful for discussing things which either lack existing scientific evidence (this topic) or for which scientific evidence cannot really apply (the epistemic principles we use to judge said evidence). How are you to argue to me that “Without scientific evidence it is just a long opinion piece”? How would you argue that scientific evidence is ever useful? Ultimately all of our judgements must rest on our intuitive sense of reason (sometimes referred to as “common sense”). We may occasionally discover techniques which increase our effectiveness at reasoning about certain classes of propositions (e.g. the scientific method) but even our judgements of this “effectiveness” are based in our intuitive sense of reason, and we should never make the mistake that these formal techniques ever completely subsume that intuition. You can disagree all you want, but then your actions can never be consistent with your philosophy.

Which is really just a long-winded way of saying that your comments have not been constructive. Please, by all means contribute to the conversation. It is just generally considered polite if you attempt to appeal to our reason. For starters, what techniques do you use in Python to mitigate the problems the author claims are addressed by static types? Are you practiced enough in the author’s typeful style of programming to be confident that you can say that in your experience it hasn’t helped much?

I have an opposing view. That means it will not be “constructive” to the author’s argument.

Depending on the definition of politeness, disagreeing might not be very polite.

There are too many posts like this where the author just states opinions which no doubt stroke his followers, but do little to move things on. Where is the analysis of experimental results? (Where’s the experimentation?) There is nothing here to lift it out of being (another) opinion piece.

If the author wants to do more than “preach to the choir” he might appeal to _our_ reason by being less dismissive and adding more science!

You can’t plug invariants or enums into an interface to guarantee a certain quality of the subject type.

So even if you type-ascertain an interface, you cannot prove from interface alone the quality of the subject type, except of course its name, input format and output format.

In functional languages, like F#, you can make a type of a list, list of pairs or pattern matching therefore ensuring some meaning to the type.

That’s what I guess he meant.

However, type declaration is somewhat limited in its power to deliver behavior.

Behavior interface is how you ask. Behavior implementation is how the computer executes. All invariants and checks that are not done at type declaration (either through oversight or through lack of power to describe the type) must be covered at the behavior implementation.

When someone reads this post from the symbolic programming background (Lisp, Haskell, etc.) – all this debate on type becomes hard to swallow. The symbolic will seek and return the data, or offer a modified copy of the data. Data itself being represented by a symbol. So the part where the author says: “Treat your types as the only real documentation.” becomes really weird when you have a functional code base – the _symbol_ is the real documentation. And that’s an amazing winner for future code extension.

100% Agree. We should take maximum care of types as they give real information about data structure. Instead of writing a bunch of hard-to-maintain docs, focus on using and creating good and meaningful types, and use them in the right place.

Thanks for this very good article; every programmer should read this…

Thanks Roland! Names do matter for recollection and navigation, but it seems to me that type names suffer the same weakness: they don’t convey any true information; like Blanche Dubois, we are still relying on the kindness of strangers.

Of course, at that point, there’s no better alternative than giving as sensible a name as you can. I’m not suggesting we replace everything with ASCII art and unicode! It’s just that names are often overrated as a means of understanding software. 🙂

with regards to “If a type has a sensible “empty” or default value that can fulfil the contract of your type, then initialise variables to this, or employ a Null Object.
If a particular variable might or might not be present, then this should be encoded in the type system using an option type, like Scala’s Option, Java 8’s Optional or Haskell’s Maybe. This correctly represents the uncertainty in the type system, so that any access is safe, simply by the fact that it compiled.”

I quite often find myself pondering whether I should be using an EmptyObject or Option. I like EmptyObjects, but they cause “more work”: if you implement it with Option, then frameworks such as spray-json and others know automatically how to refer to this and (de)serialize it correctly with no coding required from you. But if I use EmptyObject, which I think is more beautiful, then I have to do much more coding in such frameworks. This keeps bugging me: which one to use, taking into account the additional (and I really mean much more additional) coding required, rather than simplistic Options?

1) They say different things; Option says that “A Foo cannot be empty, but I may or may not have a Foo”. Null Objects say “An empty value is one of the things that a Foo can be”. Whether or not this is appropriate depends on the type – some, like monoids, will have an obvious “identity” or “empty” value. For other types, it might be another lie where it quacks like a Foo, walks like a Foo, but throws exceptions when you try to do useful things with it.

2) See my response to “Guest”, which seems to be you 🙂

3) Depends. Would they change for the same reason, or is their similarity a sheer coincidence? Also consider that in a language such as Java, the sheer verbosity and coordination of the layers might overwhelm the codebase with complexity, whereas in Scala or Haskell, defining separate structures might not be a big deal. I tend to dislike coupling more than I dislike verbosity, but you need to evaluate the tradeoff for yourself.

as for “When there is a main result alongside a possible failure result, use an existing Either or Validation type.
Define your own Algebraic Data Type (ADT) that describes the possible alternatives. For instance, in Scala or Java, this can take the form of a closed mini-class hierarchy.”

I tend to always define my own ADT; I always prefer my own encapsulation of things rather than something which looks like A[B, C] or ValidationNEL[String, Date]. I would prefer MyOwnClearReturnType; it’s the same reason that I prefer to return MyOwnEncapsulatingType rather than Int for a calcSomething method. What do you think?

I often prefer to do that too. That being the case, there’s something to be said for generality, and the various Either types often come with plenty of useful abstractions that you’d otherwise need to write yourself. You can also write type aliases in Scala/Haskell, so that you’re reusing the general structure, but using a name that means something to you.

I’m certainly interested in them, but I wonder — it seems that if you generate types based on a database, CSV file, or whatever, then the database or CSV file is part of your code. This “code” can’t be tested without real integration tests, with all the flakiness, slowness and yak-shaving that implies. I’d be interested to hear how they pan out in practice from someone who’s used them.

I’m not too sure what you mean by this comment. The types from a Type Provider are automatically generated from the schema data or inferred from its structure. If the schema changes then the generated types would have to be regenerated, but that should be no surprise, as no technology can solve that problem. As for how they pan out in practice, from the tests I’ve done with TPs and from what I’ve read, they’re very useful in practice.

Very sensible and useful viewpoints! For day-to-day-programming – which many of us do for a living – these are extremely relevant (and, no, let’s not get into Dynamic VS Static Typing warfare). Great article!

I enjoyed the article, but the combinatoric argument against tests is really tired, and misleading. There are effectively an infinite number of ways a car crash can unfold. If we crash 10k cars during testing, we’ve covered 0% of the possible number of crashes. This, in fact, doesn’t mean that the testing is pointless. The statistics don’t, in fact, work that way. Arguing that testing can’t work because there are a googolplex possible implementations is utterly specious. The number of implementations and classes of bugs we can reasonably expect to encounter is vastly smaller. Testing them is a tractable problem. It works.

This is the same reason that Calvin’s mom doesn’t actually need to disprove that aliens from Neptune stole the cookies from the cookie jar. Yes, it’s possible. But it’s not worth considering.

Sure, a lot of the space needn’t be considered. However, even when you discard specious and implausible cases, the possibilities still multiply astronomically. Testing is indeed useful and necessary; but it’s too expensive and piecemeal to be a replacement for types.

The endless stack traces filled with null pointers, class casts, ignored exceptions, string-munging errors and the “that’ll never happen!” parts in partial functions are a clear indication that the status quo is not already good enough; these can all be cheaply and totally eliminated with a judicious use of types.

What few studies we have point in the opposite direction: that type systems catch a paltry number of bugs, at great expense. Type systems address a very, very small subset of bugs. Type errors that can survive the simplest of hello-world unit tests are rare almost to the point of non-existence. Type errors fail quickly, and loudly. Developers who burn time on type errors in dynamic languages are those who didn’t bother writing the first unit test. These are the same developers who swear by type systems, and in that context, it’s easy to see why: a type system is indeed better than exactly no testing. But code without tests is unmaintainable, with or without types.

I tend not to put much stock in studies of programming, since there are so many variables, most of which are impossible to reproduce. There are plenty of studies with nice things to say about static types too, and I don’t think they’re worth the paper they’re printed on either.

Tests tend to take up between 50% and 60% of the volume of a codebase, and about the same proportion of development and ongoing maintenance effort. What is this “great expense” of types, in this context? By making a huge number of incorrect programs inexpressible, types can allow test suites to be sharp and focused, rather than expensive sprawling prayers to infinity.

Regarding the “very small subset”, this takes an absurdly limited view of what a “type error” is; one of the points I hoped to make in this article is that you can _turn_ a lot of runtime errors into type errors with not too much effort. It’s not simply replacing runtime tag-checking.

Index out of bounds? Make a NonNegativeInt class with a “smart constructor”. “Wednesday” fell through to an exception in a case statement? Make a “Weekday” ADT so that incomplete case-checks won’t compile. Can’t figure out where to handle the “UnauthorizedExceptions” that keep popping up? Wrap the call and make the return type encapsulate the failure modes too. And so forth.
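As a sketch of those two techniques in Java (the names NonNegativeInt, Weekday, Shop and opensAt are hypothetical, purely for illustration; the exhaustive switch expression requires Java 14+):

```java
import java.util.Optional;

final class NonNegativeInt {
    private final int value;
    private NonNegativeInt(int value) { this.value = value; }

    // Smart constructor: the only way to obtain an instance, so every
    // NonNegativeInt in the program is known to hold a value >= 0.
    static Optional<NonNegativeInt> of(int value) {
        return value >= 0 ? Optional.of(new NonNegativeInt(value)) : Optional.empty();
    }

    int value() { return value; }
}

enum Weekday { MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY }

class Shop {
    // A switch *expression* over an enum must cover every constant,
    // so adding SATURDAY later fails compilation here, not at runtime.
    static String opensAt(Weekday day) {
        return switch (day) {
            case MONDAY, TUESDAY, WEDNESDAY, THURSDAY -> "09:00";
            case FRIDAY -> "08:30";
        };
    }
}
```

The smart constructor forces callers to confront the failure case at construction time, once, instead of scattering range checks through the codebase.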

An enormous number of runtime errors can be coerced into type errors, which are far cheaper than tests to write and maintain. Sometimes it just takes a bit of imagination.

Expense refers to the measured difference in development costs of statically vs dynamically typed projects, which is large, like a factor of three. Difference in defect rate is about zero. I disagree about studies. Without empirical data, it’s all just hand waving. Do you have links to the studies you mention? In addition to development cost and defect rate, it would be interesting to see how long-term maintainability is affected, and how the results scale with the size of the team. I would expect types to be more useful in larger teams, but, again, it’s all just theorizing until there is some data.

The study found a ~30% improvement in initial build time with dynamic types for one part of the project, under the very specific circumstances of the test. To be noted:
– The authors explicitly say that no generalisations can be made from the results of the paper.
– They note that the results are contradicted by a number of other studies, and that the discrepancy can’t be explained and needs more research.
– The results only apply to a single-programmer task and have nothing to say about team projects.
– The results only measure initial-build time and have nothing to say about ongoing maintenance.
– The tasks didn’t require tests to be written.
– The language representing static types is based on Java 1.4’s (!!) type system, which has limited benefits.
– “Type errors” are considered to be method-not-found errors and casting errors, but not NullPointerExceptions. I hope I’ve made it clear why this is not a useful distinction.
– The authors do not, and cannot, flatly claim that “dynamic types reduce development time”.

Anyway, it goes on. It’s not that the authors didn’t do a good job; it’s just that it’s really really hard to do this kind of thing, which they are totally frank about. It’s a contribution to an emerging, and shaky field.

Unfortunately, it’s been misused and misunderstood as a lazy whacking stick in internet debates, like in Robert Smallshire’s sloppy polemic at JavaOne earlier this year.

Personally, I care very much about maintaining, extending and reasoning about team-built software, and very little about single-dev hacks. That’s what my article here is about.

By all means, read and consider the studies that have been made, but don’t underestimate the difficulty in drawing conclusions from them, and don’t use it as an excuse to stop thinking things through.

One highly subjective data point outside of your primary target region: I would not dare to do live coding without static type checking. It is a single-dev hack under time pressure that does not even allow you to write tests, so types are the only thing standing between me and live failure. Certainly not a problem everyone has, but it does highlight the usefulness of a static type system outside the classical domain of large long-lived projects.

Our systems are mostly Ruby, Java, Scala and some scattered Perl fossils. The size varies from very small codebases to multi-100KLOC monoliths. The article was written to encourage better use of type systems in Java and Scala, and not particularly as an anti-dynlang piece, although you can guess my opinions there.

I tried to be quite concrete with examples, causes and effects, but there’s only so much detail you can fit in one article without drowning it. The single-line examples are representative of the essence of the problems that manifest on a larger scale; you can make up your own mind, of course.
Cheers.

A -> B is equivalent to exponentiation B^A. If A = 2 * 2 and B = 2^32, the result is (2^32)^4. Wolfram Alpha can helpfully tell us the answer, and the English name. http://www.wolframalpha.com/input/?i=%282^32%29^4

For your last example, A = 2 * 2 and B = 3, so the answer is 3^4. Think of a diagram that has (true, true), (true, false), (false, true), (false, false) vertically on the left, with A, B and C on the right. There are 81 ways to assign every input on the left to some output on the right. Try drawing it on paper, if it helps it make sense.
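The 81 can be checked mechanically: a function table makes one output choice per input, so the choices multiply. A small sketch (class and method names hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

class CountFunctions {
    static final String[] OUTPUTS = {"A", "B", "C"};

    // Build every function table from the 4 values of (Bool, Bool)
    // to the 3 outputs: one output chosen independently per input.
    static List<String[]> allFunctions() {
        List<String[]> functions = new ArrayList<>();
        for (String tt : OUTPUTS)              // output for (true, true)
            for (String tf : OUTPUTS)          // output for (true, false)
                for (String ft : OUTPUTS)      // output for (false, true)
                    for (String ff : OUTPUTS)  // output for (false, false)
                        functions.add(new String[]{tt, tf, ft, ff});
        return functions;
    }

    public static void main(String[] args) {
        System.out.println(allFunctions().size()); // prints 81, i.e. 3^4
    }
}
```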

and a function
Bool -> Data
A Bool can have 2 values, a Data 3.
In my diagram I have on the left (True) and (False),
on the right (A) and (B) and (C).
then I draw all lines between every input and output:
True—A
True—B
True—C
False—A
False—B
False—C
I see 6 = 2 * 3 possibilities (not, as you say, 3^2, which would be 9).
Where did I miss your additional 3 lines?
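A quick enumeration may help locate the discrepancy: the six lines are individual input-output *pairs*, whereas 3^2 counts whole *functions*, each of which fixes one output for True and one for False. A sketch (class name hypothetical):

```java
class BoolToData {
    static final String[] DATA = {"A", "B", "C"};

    // A function picks one output for True AND one for False,
    // so the choices multiply: 3 * 3 = 9 distinct functions.
    static int countFunctions() {
        int n = 0;
        for (String forTrue : DATA)
            for (String forFalse : DATA)
                n++;
        return n;
    }

    // Counting single input-output lines instead gives 2 * 3 = 6.
    static int countLines() {
        return 2 * DATA.length;
    }

    public static void main(String[] args) {
        System.out.println(countLines());     // prints 6
        System.out.println(countFunctions()); // prints 9
    }
}
```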

Great one. However, one of the sad things about Java is that it has no “CTOR promotion” like in C++.
If, for example, I stored an object of my UserId type in a relational db (e.g. pgsql), and I read it back with code like resultSet.getString(“user_id”), I cannot have code like:

myobj.userId = resultSet.getString(“user_id”);

I have to have something like

myobj.userId = new UserId(resultSet.getString(“user_id”));

which kinda sucks.

In C++ I would have been able to code something like :

myobj.userId = resultSet.getString(“user_id”); – yes, CTOR promotion would have taken place here, and saved me some code.

It also means that for persistence reasons I will probably have to have a getString method that returns the string of the UserId (in C++ I would have used a nifty casting operator to String to spare me this).
Sure, I can add all kinds of annotations and other magic that will help me automate things, I just wish Java was better in that sense.
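Without constructor promotion, a common Java idiom is a static factory method, which at least keeps the call site short. A rough sketch (the UserId shape here is hypothetical, not the article’s exact class):

```java
final class UserId {
    private final String value;

    private UserId(String value) {
        // Centralised validation: every UserId in the program is non-empty.
        if (value == null || value.isEmpty())
            throw new IllegalArgumentException("empty user id");
        this.value = value;
    }

    // Static factory: shorter than "new UserId(...)" at every call site.
    static UserId of(String raw) { return new UserId(raw); }

    // Explicit unwrapping for persistence, in place of a casting operator.
    String asString() { return value; }
}

// Usage with a java.sql.ResultSet would look like:
//   myobj.userId = UserId.of(resultSet.getString("user_id"));
```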

Excellent post. There seems to be a whole generation of coders out there who grew up believing compiling and static typing are uncool and old-fashioned (not to mention SQL). Hopefully the success of Scala and other projects — e.g. Dart with its optional static type checking — can help people see the importance and usefulness of type systems.

I think some people frown upon types because they make programming “look” more difficult, but the real difficulty often lies in designing the application, in the actual problems we are solving with our programs and systems. A good programming language must help you solve real-world problems, and typing helps you not only to model your domain, but to make rules explicit. The zen of Python says: explicit is better than implicit. Yet dynamic languages in general end up pushing us towards doing things implicitly, for instance using strings for everything. I would love someone to show me how I could make lots of wrapper classes like that in Python. It’s just not fit for that… In Scala it’s a breeze.

Good article, very interesting.
While reading, I can’t help but keep thinking: “Hey, Ceylon does it right!”.
It addresses the null problem (the compiler forces you to test against null if a value can be null), has ADTs built in (this method can return Foo or Null, this one takes either a String or a Number, etc.), wraps primitives like any modern language should, etc.

Also thinking about JavaScript, which I use at work currently: we are indeed “productive” as we code away with “types” we create on the fly (objects with ad hoc structure), but despite the tests we put in (never enough, and they slow down this so-called productivity…), we are creating a ticking time bomb waiting for a junior developer to make it explode… 🙂 Or some hastiness of a senior dev!

Very nice article Ken! As one of its core developers I’m pleased to see you mention our JVM language Ceylon (http://ceylon-lang.org). It has union and intersection types, and we do away with `NullPointerException` by making optional types an integral part of the language (using unions, an optional Foo is written Foo|Null).

Hi Tako! I confess I haven’t looked at Ceylon since it was announced a few years ago. I think it’s great that “null” can only be expressed through a union. General purpose union and intersection types sound interesting too. I’ll read up on it again and get back to you. 🙂