Sunday, 24 July 2011

I can't write about Kotlin without first talking about the folly of "reversed" type declarations.
Sadly this disease has afflicted Scala, Gosu and now Kotlin (but perhaps it's not yet too late for Kotlin ;-).

Reversed type declarations

When I see a new language, one of the first things I look at is the parameters and variable declarations.
For this blog I'll refer to them as "standard" (like Java, Ceylon and Fantom) and "reversed" (like Pascal, Scala, Gosu and Kotlin).

// standard
Type variableName
// reversed
variableName : Type

Here I compare a Java and Kotlin method, although the principle is similar for Gosu, Scala and quite a few others.
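
A minimal sketch of such a pair, assuming a hypothetical method called process (the class name, the "seen" list and the body are invented purely so that the Java compiles). The Kotlin form is shown as a comment for side-by-side reading:

```java
import java.util.ArrayList;
import java.util.List;

final class SingleParameter {
    // Hypothetical sink so the method has an observable effect.
    static final List<String> seen = new ArrayList<>();

    // Java ("standard"): the type comes first, then the name.
    static void process(String str) {
        seen.add(str);
    }

    // Kotlin ("reversed") equivalent, for comparison only:
    //   fun process(str : String) { ... }

    public static void main(String[] args) {
        process("hello");
        System.out.println(seen);
    }
}
```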

When I see the latter, I cringe. It's usually a sign that the language isn't going to win me over.

So why the big deal?

For a start, there is one extra syntax character, the colon.
Given that this is unnecessary (as shown by Java), it continues to be surprising to me that languages that aim to
be less verbose than Java begin with something that is more verbose.

Ignoring the annoyance of having to type the extra character, the two examples above are still fundamentally readable.
The human eye (specifically my human eye, but hey I'm being opinionated...) can cope with the extra colon
and "reversed" declaration order.
However, add complexity and the situation changes:
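
A sketch of the kind of signature I mean, assuming a hypothetical three-parameter method (the names str, total and input match the discussion; the class, the "log" list and the body are illustrative only):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ManyParameters {
    // Hypothetical sink so the method has an observable effect.
    static final List<String> log = new ArrayList<>();

    // Java ("standard"): commas are the only separators in the parameter list.
    static void process(String str, int total, List<String> input) {
        log.add(str + ":" + total + ":" + input.size());
    }

    // Kotlin ("reversed") equivalent, for comparison only:
    //   fun process(str : String, total : Int, input : List<String>) { ... }

    public static void main(String[] args) {
        process("a", 2, Arrays.asList("x", "y"));
        System.out.println(log);
    }
}
```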

As more parameters are added, the eye has more difficulty in picking out where one parameter ends and the next one starts.
This is simply because the colon is more visually arresting than the comma, so as the eye scans the line it breaks up the
parameters using the colons, not the commas. Thus at a glance I see "str", "String, total", "int, input", "List<String>".
In fact, my eye sometimes doesn't see the commas at all, thus I get "str", "String total", "int input", "List<String>"
which is horribly broken.

In order to actually read the information, I have to slow down and take longer.
But when designing a programming language, it is rapid readability that matters.
Slowing me down on reading is a Bad Thing.
So, beyond being unnecessary, the extra colons are actually making the code significantly harder to read (and write!).
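
A sketch of a three-parameter method that also returns a value, assuming hypothetical names throughout (Person is a placeholder type invented here so the signature compiles; it is not from any library):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

final class WithReturnType {
    // Hypothetical value type, present only so the signature compiles.
    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    // Java ("standard"): the return type leads the declaration.
    static Future<Person> process(String str, int total, List<String> input) {
        return CompletableFuture.completedFuture(new Person(str));
    }

    // Kotlin ("reversed") equivalent, for comparison only:
    //   fun process(str : String, total : Int, input : List<String>) : Future<Person> { ... }

    public static void main(String[] args) throws Exception {
        System.out.println(process("Ann", 1, Arrays.asList("x")).get().name);
    }
}
```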

Now, we have a return type, again separated by a colon.
The use of the same character (yes that is another verbosity character I have to type) makes it especially difficult
to visually parse.
For me, the strength of the colon overrides the end bracket, thus I end up seeing "Future<Person>" as a parameter.
Effectively my eye is parsing the line in a fraction of a second, but it gets to the end and has to double back
to "push" the last thing it saw onto the "return type" stack.
Try this one if you're struggling to see the issue.
Note how the types and colons dominate and flow into one another, causing the distinctions as to their meaning
(type of a parameter vs return type) to be lost:

fun process(a : Int, b : Int, c : Int) : Int { ... }

As an aside, let's look at default parameters, which many new languages support (using an equivalent syntax for Java):
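
A sketch under stated assumptions: Java has no default parameters, so the "equivalent syntax" appears only as a comment and is not valid Java. The compilable part emulates the default with an overload, and every name here is hypothetical:

```java
final class DefaultParameter {
    // Hypothetical Java-like syntax (NOT valid Java), for comparison only:
    //   static int process(String str, int total = 0) { ... }
    // Kotlin ("reversed") form, for comparison only:
    //   fun process(str : String, total : Int = 0) : Int { ... }

    // Valid Java has to emulate the default value with an overload.
    static int process(String str) {
        return process(str, 0);
    }

    static int process(String str, int total) {
        return total + str.length();
    }

    public static void main(String[] args) {
        System.out.println(process("abc"));     // prints 3
        System.out.println(process("abc", 10)); // prints 13
    }
}
```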

Now it is really broken! Now I've got "Int = 0" staring me in the eye, which really is not what the programmer
was trying to express. Again, that visual barrier of the colon, together with the type, makes it very hard to
connect the actual parameter name "total" with the value it has "0".

The real test for the syntax is the more complex case of higher order functions.
This varies a lot by language, so lots of examples (hopefully accurate - I don't have time for lots of testing).
I'm simulating some syntax for Java and Fantom:
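
As one concrete point of reference, here is a sketch in present-day Java. It assumes java.util.function.Function, which only arrived in Java 8 (after this post was written), so treat it as a stand-in for "real Java" rather than anything definitive. The method name transform is hypothetical, and the Scala parameter form is quoted in a comment purely for comparison:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class HigherOrder {
    // Real Java (8+): the function type is an ordinary generic type
    // written before the parameter name, "standard" style.
    static <T, R> List<R> transform(Function<T, R> transformer, List<T> input) {
        List<R> result = new ArrayList<>();
        for (T item : input) {
            result.add(transformer.apply(item));
        }
        return result;
    }

    // Scala ("reversed") parameter form, for comparison only:
    //   transformer : T => R

    public static void main(String[] args) {
        // prints [1, 2, 3]
        System.out.println(transform(String::length, Arrays.asList("a", "bb", "ccc")));
    }
}
```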

The Java strawman and BGGA examples and Fantom pseudo-example demonstrate that a form can be created where higher order
declarations are possible using "standard" declarations.
Ceylon chooses to go down the C style route, mixing the parameter name in the middle of the return type and arguments.
I don't find Ceylon's choice as readable to my eye when scanning as the Strawman/BGGA/Fantom pseudo-examples because
it mixes the type and the variable name.

The three "reversed" declaration approaches are very different.
Gosu (if I've read the documentation correctly) makes a very weird choice as the element after the colon is the
return type of the function not the type of the variable "transformer" within the method as would be expected most of the time.
Kotlin's choice is also poor, as it now means there are two colons in the parameter
declaration, one to separate the variable name from the type, and one to separate the function type input from its output.
Scala's is the most rational of the "reversed" declaration styles.
However, I find the lack of anything surrounding the "T => R" means that the eye struggles to find the start and end
of the type in more complex examples, which is essential to finding the variable name.

Of these, the Strawman/BGGA/Fantom pseudo-examples are the most readable of the first group, and Scala of the second group
(ignoring the real Java example for a minute, and noting that Scala would be clearer with something around the function type).
That is because when I'm performing the eye parsing/scanning I've been talking about, I am essentially trying to
grasp the signature. To do that I need to know the number of parameters, their names and their types.
Specifically, I want to put the name and type into different mental boxes.
Mixing the name and type as Ceylon and Gosu do makes that harder, while Kotlin's additional colon simply creates another fence
for my eye to have to jump.

To do this full justice, I should really have some examples of function types of function types.
However this blog is already very long...

Finally, I'll point out that this isn't just confined to method parameters, but also to variable declarations
such as local variables. Again, this gets complicated between languages in the detail, so I'll compare to a
typical "reversed" type language using a braindead stupid example:
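
A sketch of that comparison, assuming a hypothetical helper calculateTotal(). Java has since gained local type inference ("var", Java 10 and later), so the inferred form below is real Java today and needs a Java 10+ compiler. The "reversed" forms are Kotlin-style and shown as comments only:

```java
final class LocalDeclaration {
    // Hypothetical helper, invented for this illustration.
    static String calculateTotal() {
        return "42";
    }

    static String longForm() {
        // Java ("standard") long form: the name sits next to the value assigned.
        String str = calculateTotal();
        return str;
    }

    static String inferred() {
        // Java 10+ local type inference.
        var str = calculateTotal();
        return str;
    }

    // Typical "reversed" long form, for comparison only:
    //   var str : String = calculateTotal()
    // and with inference:
    //   var str = calculateTotal()

    public static void main(String[] args) {
        System.out.println(longForm().equals(inferred())); // prints true
    }
}
```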

So, I cheated right? By inventing a Java type inference syntax.
Well, I'm making the point that type inference need not be limited to new languages. Java, or any language using the "standard"
type declaration style, can have it too (and Fantom and Ceylon do).
Thus, we should judge the variable declarations by the long form, even if it is not used for local variables all the time.
And as shown above, the long form is awful in the "reversed" style.
I am most emphatically not assigning the total to "String", I'm assigning it to "str", and that is what the code should say.

I'm sure if you've read this far you have a number of comments.
Perhaps you believe tooling solves the issue, maybe syntax colouring?
Well, I'll simply say that while tooling helps, you should still be able to understand the language without it,
even if just for command line diffs around a version control system.
Or perhaps you're objecting to my methodology of trying to visually parse a line in a glance?
It's how I work. Don't you scan code too?

Let me be clear, in none of the above do I mention the task of the compiler.
My sole focus is on the developer reading the code and, an order of magnitude less often, writing it.

Any argument in support of "reversed" type declarations should never be based on relevance to the compiler
or some other element of type theory.

My view is that the usability to the mainstream developer is what matters. And that is primarily about ease of reading what is written.
I have endeavoured to show that the reversed style hampers readability, and that it is not needed to achieve the goals of
a more complex type system that are sometimes used to justify it.

Summary

I'm arguing that the "reversed" type declaration style is flat out harder to visually parse, and should therefore be rejected
by language authors, even if they believe they have sound compiler or type theory rationales.
Programming languages exist primarily for developers, not to aid the compiler or underlying theories.
"Write once, Read many" must be the first law of language design!

I am thus hugely disappointed that Kotlin, which has many fine features taken from Fantom, did not think through this choice
in more detail, and I plead with the authors to change their minds before Kotlin is locked down.

Thursday, 21 July 2011

My first point here is that another group is willing to go public and say that Scala is too complex. It is easy to miss this, but anyone writing a new language right now (Kotlin, Ceylon, Gosu, Fantom, ...) is implicitly saying Scala isn't right. Of course I don't expect Scala supporters to like this or agree with it, but the truth is that I and many others have looked at it and run fast in the opposite direction.

As I've commented before, my dislike for Scala is not support for Java. There is a lot wrong with Java, and that cannot be sorted out without breaking backwards compatibility: elements like primitives, arrays, wildcard generics and basic operators like equals. That is why I have proposed a backwards incompatible Java - #bijava.

More generally, my position is that as a community there is a role for a popular, statically typed, industrial, Java-like language without Java's warts. The Java-like also means a design that manages complexity and is usable by the mass-market. Fantom, Gosu, Ceylon and Kotlin are targeting that market. Groovy, Clojure, Scala and many others are simply not targeting that specific market.

I can't comment that much on Ceylon as it is more vapour than reality and from the little public information appears to have some dubious design decisions, especially around verbose words rather than syntax. I've also not studied Gosu much for some reason, yet in a day I've looked at Kotlin a lot (not sure why Gosu doesn't excite me...).

Fantom is probably the most different of the four. It provides a new platform, which happens to run on the JVM (and in the browser via JavaScript!). The core Fantom library is new and dedicated to Fantom, with some different design principles to those of Java. It also runs its own form of bytecode, allowing deep immutability, non-null types, full modules and reified generics amongst other things. In fact, one could question whether Fantom fits into the group of four at all; however, it does fit the criteria of statically typed, industrial and Java-like.

So, what about Kotlin? Well at first glance it gets a lot right, starting with null-safety and type inference. However, I have real issues with some features, which I think a proper Kotlin blog piece should focus on.

More generally, if I could wave my magic wand, I would probably wish that Ceylon and Kotlin would merge into a single project. (Gosu and Fantom are now used in production, and Fantom has many different goals, so both are harder to change now.) Basically, we need the energy of all that input (and associated money) but with a better, single focus. Both Ceylon and Kotlin are still mostly paper languages, and both are trying to achieve the same thing (with Kotlin looking closer than Ceylon at this point). Red Hat and JetBrains, could you please have a conversation? (I'm happy to mediate if desired.)

Summary

I still like Fantom and I think most people hugely underestimate it, especially those from a Scala background. Fantom rethinks what a language should be from the platform/productivity perspective with deep immutability, deep modules, no shared state and a practical type system that aims to eliminate bugs, not be in absolute control. Scala is, in many ways, light years behind Fantom.

And this is my key point with Kotlin too. Simply focussing on syntax is worthy, but kind of misses the point. Syntax exists simply to express the programmer's intent in a way that should be readable years later. What makes a bigger difference are the productivity issues that are not typically thought of when talking about a language - versioning of code, which logging library to use, how to access configuration or injected state. Kotlin tackles the syntax parts pretty well, though not perfectly. But it's not clear yet if they can grasp just how unimportant the syntax is relative to language-related productivity gains.