Wednesday, June 18, 2014

The Safyness of Static Typing

I like static (manifest) typing. This may come as a shock to those who
have read other posts of mine, but it is true. I certainly am more
comfortable with having a MPWType1FontInterpreter *interpreter
rather than id interpreter. Much more comfortable, in fact, and
this feeling extends to
Xcode saying "0 warnings" and the clang static analyzer agreeing.

Safety

The question though is: are those feelings actually justified? The rhetoric
on the subject is certainly strong, and very rigid/absolute. I recently had a Professor
of Computer Science state unequivocally that anyone who doesn't use static typing should
have their degree revoked. In a room full of Squeakers. And that's not an
extreme or isolated case. Just about any discussion on the subject seems to
quickly devolve into proponents of static typing claiming absolutely that
dynamic typing invariably leads to programs that are steaming piles of bugs and crash left
and right in production, whereas statically typed programs have their bugs
caught by the compiler and are therefore safe and sound. In fact, Milner has supposedly made the claim that "well typed programs cannot go wrong". Hmmm...

That the compiler is capable of catching (some) bugs using static type checks
is undeniably true. However, it is equally true that not all bugs are type
errors: for example, most of the top 25 software errors don't look like type errors
to me, nor do goto fail; or Heartbleed, nor do the top errors
in my own projects.
So having the type-checker give our programs a clean bill of health does not
make them bug-free; it eliminates one particular class of bugs.
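
The point that not all bugs are type errors is easy to demonstrate with a sketch (TypeScript here, purely for illustration): the following function type-checks cleanly, yet contains an off-by-one bug that no type checker will catch.

```typescript
// Sums the first n elements of xs -- or is supposed to.
// The type checker is perfectly happy, but the loop bound is wrong:
// it skips the last element (i < n - 1 instead of i < n).
function sumFirst(xs: number[], n: number): number {
  let total = 0;
  for (let i = 0; i < n - 1; i++) {
    total += xs[i];
  }
  return total;
}

// sumFirst([1, 2, 3], 3) returns 3, not the intended 6.
```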

With that, we can take the question from the realm of religious zealotry to the
realm of reasoned inquiry: how many bugs does static type checking catch?

Alas, this is not an easy question to answer, because we are looking for something
that is not there. However, we can invert the question: what is the incidence
of type-errors in dynamically typed programs, ones that do not benefit from the
bug-removal that the static type system gives us and should therefore be steaming
piles of those type errors?

With the advent of public source repositories, we now have a way of answering that
question, and Robert Smallshire did the grunt work to come up with an answer: 2%.

So all those nasty type errors were actually not
having any negative impact on debug times; in fact, the reverse was true. Which of
course makes sense if the incidence of type errors is anywhere near 2%, because then other factors
are almost certain to dominate. Completely.

Some people are completely religious about type systems and as a mathematician I love the idea of type systems, but nobody has ever come up with one that has enough scope. If you combine Simula and Lisp—Lisp didn’t have data structures, it had instances of objects—you would have a dynamic type system that would give you the range of expression you need.
— ACM Queue: A Conversation with Alan Kay

Even stringent advocates of strong typing such as Uncle Bob Martin, with whom I sparred
many a time on that and other subjects in comp.lang.object, have now come around to this
point of view: yeah, it's nice, maybe, but just not that important. In fact, he
has actually reversed his position, as seen in this video of him debating static typing with Chad Fowler.

Truthiness and Safyness

What I find interesting is not so much whether one or the other is right/wronger/better/whatever, but rather
the disparity between the vehemence of the rhetoric, at least on one side of the
debate ("revoke degrees!", "can't go wrong!"), and both the complete lack of empirical evidence
for it (there is some against) and the small magnitude of the effect, if there is one at all.

Stephen Colbert coined the term "truthiness" for a "truth" that a person making an argument or assertion claims to know intuitively "from the gut" or because it "feels right", without regard to evidence, logic, intellectual examination, or facts. [Wikipedia]

To me it looks like a similar effect is at play here: as I notice myself, it just feels
so much safer if the computer tells you that there are no type errors. Especially if it
is quite a bit of effort to get to that state, which it is. As I wrote, I notice that
effect myself, despite the fact that I actually know the evidence is not there,
and have been a long-time friendly skeptic.

So it looks like static typing is "safy": people just know intuitively that it
must be safe, without regard to evidence. And that makes the debate both so
heated and so impossible to decide rationally, just like the political debate on
"truth" subjects.

Comments

I think that most people tend to think in types, anyway. Types are the assumptions you make about your data.

I expect a string input, or at least something it makes sense to write to a console. I'll give you something you can multiply.

Just documenting that sort of thing in the interface makes it much easier to understand, because I'm no longer guessing, or deriving the properties of some value from natural language descriptions, which tend to get very long.

I also think types can lead to abstractions that would be very difficult to use without them, and a powerful type system can lead to new programming styles.

It's also possible to encode notions of trust into types, so for instance, you can't use an unescaped string where an escaped one is expected. It's just that people use plain strings for far, far too many things. If you don't mean "arbitrary stream of characters", then String is not a good enough type.
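
That notion of trust can be sketched in TypeScript (illustrative only, not taken from the comment): a "branded" type makes it a compile-time error to use a raw string where an escaped one is required.

```typescript
// A branded type: at run time it is just a string, but the compiler
// treats it as distinct, so a plain string cannot be passed where
// Escaped is required.
type Escaped = string & { readonly __brand: "Escaped" };

function escapeHtml(raw: string): Escaped {
  return raw
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;") as Escaped;
}

function render(safe: Escaped): string {
  return `<p>${safe}</p>`;
}

render(escapeHtml("<script>"));  // OK
// render("<script>");           // compile-time error: string is not Escaped
```

The only way to obtain an Escaped value is through the escaping function, so the check cannot be forgotten at a call site.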

I'm not saying that dynamic languages are unusable. They're not. But the notion of types you're using seems somewhat limited.

The 'Top 25 Software Errors' listed are not the most common errors; just the most 'dangerous'. Many of the bugs relate to security concerns.

It's possible to have strongly typed programming languages without ridiculous type declarations cluttering things up. Look at Haskell and ML-derived languages such as OCaml and F#. We need to do away with object inheritance; then our programs can know what kind of thing a type is, and types can be inferred using Hindley-Milner-style algorithms. That removes much of the clutter from function signatures.
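
The point about uncluttered signatures shows up even in TypeScript, whose local inference is far weaker than full Hindley-Milner but illustrates the idea: none of the bindings below carry an annotation, yet all are statically typed.

```typescript
// No annotations anywhere, yet everything is statically typed:
const doubled = [1, 2, 3].map(x => x * 2);             // inferred: number[]
const shouted = ["a", "b"].map(s => s.toUpperCase());  // inferred: string[]

// The compiler rejects misuse without a single declaration:
// doubled[0].toUpperCase();  // error: toUpperCase does not exist on number
```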

My preference for strong typing is about writing bug-free, literate code which can be confidently refactored with editor tools. The last time I emphasised ease of refactoring to a dynamic fanboy I was told that code should be "re-written every 3 years anyway". I've yet to read a defence of Ruby/Python/Clojure that even admits of the need to keep complex code working for decades, with constant feature updates.

"how many bugs does static type checking catch?"

1. The research needs to be redone to treat ADT-style types (Haskell, F#) separately from object type systems.

2. It would be a huge mistake to assume that catching bugs is the main point behind static typing. Consider ease of refactoring and added expressiveness as equally important.

Heartbleed is actually a prime example of a type error. Here's someone who rewrote the code from the unsafe language C to the safe typed systems language ATS to demonstrate that Heartbleed would have been caught as a type error:
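
The ATS rewrite itself is not reproduced here, but the essence of the bug -- trusting a length the request claims over the length the buffer actually has -- can be sketched in TypeScript (illustrative only):

```typescript
// The essence of Heartbleed, simplified: a reply must never copy more
// bytes than the request actually contains, no matter what length the
// request *claims*. Encoding the check once, at the boundary, means no
// call site can forget it.
function heartbeatReply(payload: Uint8Array, claimedLength: number): Uint8Array {
  if (claimedLength > payload.length) {
    throw new RangeError("claimed length exceeds actual payload");
  }
  return payload.slice(0, claimedLength);
}

heartbeatReply(new Uint8Array([1, 2, 3]), 3);  // fine
// heartbeatReply(new Uint8Array([1, 2, 3]), 65535) throws instead of
// leaking adjacent memory, which is what the C code did.
```

In ATS the analogous check is pushed into the types themselves, so the mismatch fails at compile time rather than at run time.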

I'm more and more thinking that combining strong type systems and "shape of data" checks could be a good way of ensuring safety. I.e., check that an input is a string, and that this string is between n and m characters with only authorized chars. The same goes for more complex data structures (vectors, maps, lists, enums, etc.). Maybe some of those checks would need to be done at runtime, though.

My preference for typed languages has nothing to do with how many bugs remain in the program, it's when I find certain (trivial) bugs.

I frequently do very silly things that blow up in a very obvious way (thankfully). I prefer it if the syntax highlighting tells me about that error rather than having to start the program.

This is primarily a big deal in GUI applications, that often involve a lot of clicking until you even get to the newly written code and where TDD doesn't work all that great. Scripts and web applications work much better with dynamic typing.

Also note that ALL the injection and XSS bugs listed in that top 25 are type errors. The problem is that "String" is abused as a very universal type, while actually something like "user input" and "website output" should be different types so that they cannot simply be concatenated. But that's strongly versus weakly typed not static versus dynamic. Using "int" and "String" almost always navigates around the safe type system.

Its generics derive from parametric polymorphism in functional languages. Google for this, and parametricity instead:
http://ttic.uchicago.edu/~dreyer/course/papers/wadler.pdf
http://dl.dropboxusercontent.com/u/7810909/media/doc/parametricity.pdf

In a proper type system like Haskell, or ML, the benefits here are enormous; it gives you free theorems which allow you to know a great deal about the functionality without even seeing the implementation. Java's syntactic penalty is non-existent in Haskell.
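
A small sketch of a "free theorem" (in TypeScript; its generics are not as airtight as Haskell's, but the idea carries over): a function of type <T>(xs: T[]) => T[] can only rearrange, drop, or duplicate its elements, because it knows nothing about T and so cannot invent new values.

```typescript
// Because this function knows nothing about T, it can only shuffle,
// drop, or repeat the elements it was given -- something you know
// from the signature alone, without reading the body.
function rotate<T>(xs: T[]): T[] {
  if (xs.length === 0) return xs;
  return [...xs.slice(1), xs[0]];
}

// The free theorem in action: rotate commutes with map for any f, i.e.
// rotate(xs).map(f) gives the same result as rotate(xs.map(f)).
```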

2) If you think of types as mini-bug-catchers, you're not really understanding what they do. While they certainly rule out incalculably many incorrect programs, they are more valuable as a means to design and reason about software: you can truly _know_ what something does, not guess, or believe, or think. Given that maintenance takes vastly more time than initial creation, this is (IMHO) utterly indispensable.

3) Computer Science is more concerned with maths and engineering than social studies, although these have their place. There's no reason to confer a mystical Sciencey authority on them above all else; studies vary wildly in their rigour and reproducibility, and software development is a notoriously difficult area here, given the huge number of unrepeatable variables. Again, there's a place for it, but they are unlikely to be suitable for broad sweeping conclusions.

Maths is correct before science gets out of bed; there is provably, mathematically more you can know about software, less that can go wrong, and less you need to fit in your head in statically type-checked code.

One difficulty in checking the impact of static typing on software robustness is that quite often, the same problem is solved quite differently in a statically typed and a dynamically typed language. This is most visible at the extremes: a Haskell programmer is likely to design his types first and write the rest of the code to fit them. At the other extreme, a Python programmer often writes classes made to fit into an existing "duck typing" framework (e.g. Python's informal sequence protocol).

Milner did indeed claim that well-typed programs can't "go wrong", and he was absolutely correct. It's just that he didn't mean what it sounds like. It just means that the operational semantics for the language can't get in a state where no evaluation rule applies to a term.

I think it's most helpful to consider a modern static type system (like those in ML or Haskell) to be a lightweight formal modelling tool integrated with the compiler. Using it provides a couple of benefits:

1. Parametric polymorphism protects abstractions; this helps you reason about the program as you are designing the model.

2. Your code is automatically checked for conformance to the model you designed in the types.

If you only consider a type system as a tool to find obvious screw-ups, they don't sound quite as valuable as they can be.

They also don't automatically confer the benefits I mentioned; you have to learn to think of them as a modeling tool and understand how to take advantage of that aspect. I think the studies you quote don't take that factor into account, assuming they're the studies I think you're quoting. They compare statically typed languages and dynamically typed languages as *tools to translate a model to an executable implementation of it* rather than as *tools to develop a correct model* in the first place.

From the viewpoint of formal modeling and static verification, ML and Haskell are the 'dynamic, lightweight, high power-to-weight ratio' tools of the trade. They're the "dynamic languages" of the modeling world, and they happen to also be great programming languages as well.

@Marcel: "most of the 25 top software errors don't look like type errors to me"

I disagree: most, if not all, of those component interaction and resource management bugs should be totally prevented by the type system. If your type system isn't sufficiently expressive to describe and enforce the necessary checks and constraints, get yourself a better one. Frankly, it scares me silly the number of professional developers who absolutely believe that stuff like this is a Good Thing:

int i = 0;

It's not: it is worse than useless. Look, we can already see `i` is an integer (the `0` is kind of a giveaway). But what is `i`'s minimum bound; what is its maximum? Are there any numbers in between those bounds that it should also avoid (e.g. should only even integers be accepted)? If the developer wishes it to have unrestricted range, that's fine, but how then should it behave if it exceeds a machine-/runtime-imposed limitation such as MAXINT? And so on.
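
The kind of constraint being asked for can be sketched with a branded type and a smart constructor (TypeScript, names illustrative): the bounds live in one place, and the compiler tracks everywhere a checked value is required.

```typescript
// An "even percentage": an integer in [0, 100] that must also be even.
// The brand means the only way to obtain one is through the constructor,
// so every EvenPercent in the program has already passed the checks.
type EvenPercent = number & { readonly __brand: "EvenPercent" };

function evenPercent(n: number): EvenPercent {
  if (!Number.isInteger(n) || n < 0 || n > 100 || n % 2 !== 0) {
    throw new RangeError(`${n} is not an even integer in [0, 100]`);
  }
  return n as EvenPercent;
}

function setVolume(level: EvenPercent): void {
  // ... use the value, knowing its constraints already hold ...
}

setVolume(evenPercent(42));  // OK
// setVolume(7);             // compile-time error: number is not EvenPercent
// evenPercent(7);           // run-time error: fails the bounds check
```

What the compiler cannot prove statically is deferred to the single run-time check in the constructor, rather than scattered across every use site.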

And yet, functional languages, and even some imperative OO languages like Eiffel, have been thinking about, and addressing, these sorts of things for years - and doing it without stupid insipid ints infesting your code like roaches.

Using a type system that expresses such constraints doesn't necessarily mean that all errors will be caught at compile-time; for example, no compiler can perform a length check on externally supplied data that won't be received until run-time. However, encoding such constraints within the type definition means that the compiler can check as much as is practical at compilation time, and bake the remaining checks into the executable so that they are automatically applied wherever and whenever such checks should be performed at run-time. i.e. The developer is no longer required to include explicit tests at every point in the program code where new data enters the system for the first time, or where existing variables are rebound.

I suspect a lot of problems might disappear amazingly quickly if static-vs-dynamic religionists (especially looking at you, C and Ruby) were sent out of the room for a few hours. For all their big talk and foot-stamping bluster, they rarely seem to possess any genuine understanding of types and type systems, and seem utterly uninterested in educating themselves either. But all their noise makes it terribly hard for the grownups to talk.

@ Anonymous: "I'm more and more thinking that combining strong type systems and "shape of data" checks could be a good way of ensuring safety. IE. check that an input is a string, and that this string is between n and m characters with only authorized chars."

Absolutely. Bounds checking is precisely the sort of thing a real type system should do. Note that "real type system" here means pay no attention to C. Hell, C doesn't even have a type system: it has a handful of compiler directives for allocating memory on the stack, and delusions of competence for everything else. C++, Java, etc. also come from the same pants-down school of language design; ignore them too.

In fact, I'd go further: having come up by way of dynamic imperative "scripting" languages, but now in the language design game myself and discovering all the weird and wonderful declarative idioms also available, I now wonder why imperative languages don't run (e.g.) H-M checks as standard. The only difference should be: do they prevent you running a program until all warnings/errors are cleared, or let you freely proceed and just ask if you'd like to know more about any concerns found? Indeed, such checks shouldn't even wait till the full, formal compilation stage: a lot of this feedback really should be reaching the user as they're typing that code, as that's when it's of greatest value to them.

Even the most open, dynamic languages like Python and Ruby would greatly benefit from the presence of first-class features for applying type and constraint checks at inputs and interfaces. The weakly, dynamically typed language I'm developing provides only two 'basic' data types (strings and lists). Most of its expressive power instead comes from declarative 'type specifiers', which apply coercions and verifications as needed. For example, want to make sure your input value is a list containing exactly four whole/decimal numbers between 0 and 100? Simple:
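
A comparable check can be sketched in TypeScript (purely illustrative; this is not the commenter's own type-specifier syntax):

```typescript
// Check that a value is a list of exactly four numbers, each in [0, 100].
// Numeric strings are coerced, in the spirit of a weak, coercing type system.
function fourNumbers0to100(value: unknown): number[] {
  if (!Array.isArray(value) || value.length !== 4) {
    throw new TypeError("expected a list of exactly four items");
  }
  return value.map(item => {
    const n = typeof item === "string" ? Number(item) : item;
    if (typeof n !== "number" || Number.isNaN(n) || n < 0 || n > 100) {
      throw new TypeError(`item ${String(item)} is not a number in [0, 100]`);
    }
    return n;
  });
}

fourNumbers0to100([0, "25.5", 50, 100]);  // → [0, 25.5, 50, 100]
```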

So this one signature not only provides run-time checks of input/output values, but also a key piece of auto-generated human-readable documentation, and eventually entry assistance and auto-completion in the editor, and even auto-generated GUI input forms for end users.

Every time I drop back into Python I find myself increasingly wishing it let me annotate its function signatures in even half this detail. No more ad-hoc 'assert'/'if not...raise' guards. No more docstring-based parameter descriptions useless to the interpreter. There is so much extra value you can extract once your type information is formally encoded as a first-class language structure, fully introspectable and applicable across all stages of authoring and execution. Low-hanging fruit, and then some...

Open your mind with this: Consider an assignment where you must write a 16-hour project without a compiler or a run time environment. Clone yourself and put the two of you in separate rooms with plenty of caffeine, where one of you develops with a statically typed language in an IDE with a solid type checker and the other develops with a dynamically typed language in a sleek code editor. At the end of the assignment, the respective programs are examined for bugs. Which one are you OBJECTIVELY more confident about? There's no question that the dynamically typed language is going to allow typos and other things that the statically typed IDE would complain at you about while you are coding.

The reason why type bugs only account for 2% is because they are obvious at RUN TIME. They are identified and fixed before the code is ever released. You have to hunt for them at run time because they are not obvious at code WRITE TIME. So you see, there is a filter bias in the 2% statistic because the loss is actually seen during the development process. Developers need to bang their head against their keyboard many times in order to get those bugs down to 2% in order to be able to release a product that actually works. With great tools for type checking and intellisense that only typed languages enable, these kinds of bugs practically fix themselves while the code is written and there is no head banging to get what becomes an even better result of 0% for this class of bugs.

In short, the 2% statistic is misleading because it does not represent the pain. It represents what has come after the pain.

Exactly: your scenario is so hypothetical that it's more like apples and slide projectors. The point of dynamic languages is that they are dynamic, that is that the distinction between compile-time and run-time is at most blurry and ideally non-existent.

So saying you're not allowed to run the program is silly, it's like a static language not being allowed to run the type checker until run-time. Or having to develop your program without the assistance of an IDE, using only PowerPoint.

The way you develop programs in a dynamic language is that you continuously run the program as you are developing/extending, which gives you much higher bandwidth of feedback than even the best IDE can provide. In theory, having additional feedback from static checks is useful, in practice that tends to be subsumed by the dynamic feedback you get from running it.

Thank you for your response. It's not apples and slide projectors. Hear me out. My example provided an abstraction to conceptually simplify a real phenomenon that is actually much more detailed and tedious to illustrate perfectly. Your feedback did expose a weakness in the communication of the point I was making through that abstraction. Thank you.

The reality in dynamic programming is, like you say, not where you sit down and write the whole thing out without the aid of rerunning it often to help guide the development process. That is true. In fact, the advantage of this iterative development process is also available for those working with static typed languages (compiled or interpreted). The added advantage of the static language is that you less often need to rerun the program to gain the assurance that types aren't breaking everywhere as you proceed. So, while I agree with you that my abstraction wasn't an entirely accurate example where you've got all the extra debugging to do at the end of the project, it still helped me introduce the point that you've got all this extra overhead distributed throughout the development process. All of my other conclusions still apply.

You yourself just said that dynamic programming can overcome my illustrated weakness by rerunning the program often for iterative development. But that plays right into my point: it is something extra that you must do more often to make up for the lack of type safety. I am not speaking out of lack of experience with this. I've been working with both static and dynamic languages for years.

That is exactly the opposite of the approach you want from a dynamic language. You want to "rerun" the program as often as possible, because that gives you an incredibly high rate + bandwidth + quality of feedback. I've implemented a little program called "CodeDraw", which does "live" programming by rerunning the program on every keystroke. The difference is indescribable, you really have to experience it. Would it be nice to also have static types? Probably, but they don't matter that much, and when it comes to having this experience or static types, it's just no contest.

Hi Marcel. What you said stuck in my mind over the last few months. You invited me to experience something you created called "CodeDraw". Is this something that is available online somewhere? I'm interested in taking a look at it. Many thanks.