What Elm and Rust Teach us About the Future

So I recently started programming in Elm and also did some stuff in Rust. Honestly, it was mostly a hype-driven decision, but the journey was definitely worth it. I also noticed that although those two languages differ in their target audience and use cases, they made many similar design decisions. I think this is no coincidence. It is well possible that ten years from now, both Elm and Rust will be forgotten, but I am quite sure that the ideas they are built upon will be present in the languages we will use by then. This is a post about the ideas I find charming in Elm and Rust.

A quick disclaimer first: I am no expert in either language and while I am starting to feel comfortable in Elm, I am undoubtedly a Rust beginner, so please correct me if I am doing injustice to any of the languages.

Setting the Scene

Rust is a systems language which aims to compete with C++. Rust values performance, concurrency, and memory safety, but is not garbage-collected. Rust compiles to native binaries, not only for the major x86/x64 platforms but also on ARM and even certain ARM-based microcontrollers.

Elm is a language for web apps competing with Javascript in general and the virtual DOM frameworks in particular (e.g. ReactJS). Elm compiles to Javascript, is garbage collected and purely functional. Elm values simplicity and reliability.

Both languages are already usable for actual projects, but the ecosystems are still immature and the languages themselves are still evolving.

While I like both of the languages, I do not intend to limit this post to the positive sides and will also mention what are (to me) the pain points.
I will start with the ideas the languages have in common, and will give more details about either language later.

Common Themes

The features described here are mostly nothing completely new and could be found in languages like OCaml, Haskell and F#. The interesting part is that Elm and Rust prove they are useful for quite diverse use-cases.

Tagged unions

This is a small but very practical feature - I would say tagged unions are enums on steroids. Consider, how often did you write something like:

This makes room for some sweet bugs, as your data model can represent a state that should be impossible (savings account with non-null credit parameters, or a credit card account with null credit parameters). The programmer needs to take care that no manipulation of the Account object can lead to such a state which may be non-trivial and error-prone. It also creates ambiguity - for example, there are multiple ways to get the credit limit of an account:

Since both Elm and Rust don't have null values, you have to specify CreditParams when building an AccountDetails instance, and so the code above is safe in all situations.

A further bonus is that in both Elm and Rust, you have to handle all possible cases of a tagged union (or provide a default branch). Failing to handle all cases is a compile-time error. In this way, the compiler makes sure that you update all your code when you extend the AcountDetails.

Type Inference

Some people are fond of static typing as it is harder to write erroneous code in statically-typed languages. Some poeple like dynamic typing, because it avoids the bureacracy of adding type annotations to everything. Type inference tries to get the best of both worlds: the language is statically typed, but you rarely need to provide type annotations. Type inference in Rust and Elm works a bit like auto in C++, but it is much more powerful - it looks at broader context and takes also downstream code into consideration. So for example (Rust syntax, try it online)

Type inference in Rust has certain limitations and so explicit type annotations are still needed now and then. But Elm goes further, implementing a variant of the Hindley-Milner type system. In practice this means that type annotations in Elm are basically just comments (except some weird corner cases). While the Elm compiler enforces that type annotations match the code, they can be omitted and the compiler will still statically typecheck everything. Nevertheless, it is a warning to not annotate your functions with types, as type annotations let the compiler give you better error messages and force you to articulate your intent clearly.

Immutability

Immutability means that variables/data cannot be modified after initial assignment/creation. Another way to state it is that operations on immutable data can have no observable effect except for returning a value. This implies that functions on immutable data will always return the same value for the same arguments. Code working with immutable data is easier to understand and reason about and is inherently thread-safe. Consider this code with mutable data:

Now let's say we want to figure out why person.primaryAddress.street changed. Since the data is mutable, it is not sufficient to find all usages of person.primaryAddress - we also need to check the whole tree of all variables/fields that were assigned to/from person.primaryAddress. With immutable data structures this problem is prevented as the programmer is forced to write something like:

Elm goes all-in on immutability - everything is immutable and no function can have a side effect. Rust is a bit more relaxed: in Rust, you have to opt-in for mutability and the compiler ensures that as long as a piece of data can be changed within a code segment (there is a mutable reference to the data), no other code path can read or modify the same data.

The Problem with Immutability

Making sure that the data you are referencing cannot change without your cooperation generally makes your life easier. Unless this is EXACTLY what you want to achieve. Let's say you are writing a traffic monitoring tool. You might want to model your data like this (Elm syntax):

You may expect that when you receive new traffic information, you simply work with World.routes and the changes will be seen when accessing through City.routes. But you would be mistaken. In Elm this will not even compile (fields in record types are fully expanded at compile time, and thus cannot have circular references). And if you use tagged unions to make the model compile, the trafficLevel accessed via World.routes may not be the same as one accessed via City.routes, as those always behave as different instances.

A similar data model in Rust will compile but it will be difficult to actually instantiate the structure and you won't be able to ever modify the trafficLevel of any Route instance, because the compiler won't let you create a mutable reference to it (every Route is referenced at least twice).

This brings us to a less talked-about implication of immutability: immutable data structures are inherently tree-like. In both Elm and Rust, it is a pain to work with graph-like structures and you have to give up some guarantees the languages give you.

In Elm, the only way to represent a graph is by using indices to a dictionary (map) instead of direct references. For the above example a practical data model could look like:

Notice that nothing prevents us from having an invalid RouteId stored in City.routes. While Elm gives you good tools to work with such a model (e.g., it forces you to always handle the case where a given RouteId is not present in World.routes), and the advantages for every other use case make this an acceptable cost, it is still a bit annoying.

Smart but Restrictive Compilers

This is basically a generalization of the previous specific features. The compilers for Elm and Rust are powerful and they do a lot of stuff for you. They not only parse the code line-by-line, but they reason about your code in the context of the whole program. However, the most interesting thing about compilers for Rust and Elm is not what they let you do. It is what they DO NOT let you do (e.g., you cannot mix floats and ints without explicit conversion, you cannot get to the data stored in an tagged union without handling all possible cases, you cannot modify certain data etc.). At the same time, the compilers are smart enough to make conforming to these restrictions less of a chore. If you think that programmers will produce better code when given fewer limitations, think of the time people complained that restricting the use of GOTO hinders productivity.

Another way to formulate this stance is that languages should not strive to make best practices easy as much as they should make writing bad code hard. I think both languages achieve this to a good degree - writing any code is a bit harder than in their less restrictive relatives, but there is much less incentive to take shortcuts.

In practice, smart but restrictive compilers mean more time spent coding and less time spent debugging. Since debugging and reading messy code can be very time-consuming, this usually results in a net productivity gain. Personally, I love writing code, while debugging is often frustrating, so to me, this is a sweet deal.

Needless to say, all those restrictions make hacking one-off dirty solutions in Rust or Elm slightly annoying. But what code is truly one-off?

Style matters

The communities of both Elm and Rust make a big push for consistent presentation of source code. At the very least, this reduces the need for lengthy project-specific style guidelines at every team using the language. To be specific, Elm compiler enforces indentation for certain language constructs, does not allow Tabs for identation(!) and enforces that types begin with an upper-case letter while functions begin in lower-case. Further, there is elm-format, a community-endorsed source formatter.

In a similar vein, Rust compiler gives warnings if you do not stick to official naming conventions and also provides a community-endorsed formatter rustfmt.

More About Elm

Now is the time to talk about the languages individually, if you are still interested. We will take Elm first. Elm is a simple, small language. The complete syntax can be documented on a single page. Elm aimes at people already using Javascript and strives for low barrier of entry. Elm is currently at version 0.18 and new releases regularly bring backwards-incompatible changes (although official conversion tools are available). An interesting thing is that over the last few versions more syntax elements were removed than added, testifying to the focus on language simplicity.

Elm is purely functional. This means there are no variables in the classical sense, everything is a function. How does an application evolve over time if there are no variables? This is handled by The Elm Architecture (TEA). On the most simplistic level, an Elm application consists primarily of an update function and a view function. The update function takes a previous state of the application and input from the user/environment and returns a new state of the application. The view function than takes the state of the application and returns a HTML representation. All changes to the application state thus happen outside of Elm code, within the native code in TEA. The architecture also provides the necessary magic to correctly and efficiently update the DOM to match the latest view result.

TEA forces you to explicitly say what constitutes the state of your application and its inputs. This lets Elm to provide its killer feature: the time-travel debugger. In essence, when the debugger is turned on, you can replay the whole history of the application and inspect the application state at any point in past. And due to the way the language is designed, it works 100% of the time.

Another big plus of TEA is that you never have to worry about forgetting to hide an element when the user clicks a checkbox. If your view function correctly displays the element based on the current application state, the element will also be automatically hidden once the application state changes again.

Pain Points in Elm

A big downside of TEA is that it assumes that all state of the application can be made explicit. This makes working with HTML elements that have a state of their own tricky in certain contexts (e.g., text area contents, caret position in text areas, Web Components). You need care to prevent TEA from messing with such components destructively. Further, TEA can be resource intensive, albeit less than comparable JS frameworks. Last but not least, creating large apps in Elm involves writing a significant amount of boilerplate code. The Elm community is still discussing how to develop large projects more easily.

More About Rust

In comparison with Elm, Rust is quite the beast. There is a lot of syntax and a lot of things to learn. This is however not unexpected: if you want to write fast code, you really need a lot of control. Also, C and especially C++ also have loads of syntax, so Rust is definitely not at a big disadvantage here. Rust is currently at version 1.15 and has forward compatibility guarantees.

While Rust is imperative, it took in a lot of useful functional programming concepts and boasts zero cost abstractions - i.e. that all the fancy syntactic tricks that let you develop code easily incur no actual performance penalty in comparison with a hand-tuned but dirty solution.

Rust also has no OOP of the usual kind, instead it has traits (a bit like interfaces) and deliberately avoids inheritance (you should compose instead).

The weirdest and most interesting part of Rust is the borrow checker. While Rust does not have managed memory (garbage collection), it can still guarantee that you cannot access uninitialized memory, dereference a null pointer or otherwise corrupt your memory. This has big implications not only for reliability but also for security, as Rust automatically prevents whole classes of severe attacks as buffer overflow or Heartbleed (blog post). Rust also prevents most (but not all) memory leaks. The borrow checker is what enables a big portion of those guarantees by validating that your program accesses memory correctly at compile time, i.e. without the runtime penalty of managed memory. The borrow checker ensures that a mutable reference to a piece of data cannot coexist with any other reference (and thus that you cannot free memory while holding a reference to it). For some intuition, mutable references in Rust behave a bit like std::unique_ptr in C++ (specs), but with the uniqueness enforced at compile-time. More detailed description could not fit here, so check Rust by Example or just Google away :-).

Pain Points in Rust

The borrow checker is both the biggest strength and the biggest weakness of Rust. Although the Rust community took a lot of effort to make most code just work, you inevitably end up fighting the borrow checker. There are some promising updates to the borrow checker in the pipeline that could make the life of Rust programmer easier, but it will not be cakewalk anytime soon - making the compiler understand your program is hard (both for the programmer and for the compiler).

While Rust takes performance seriously and the compiler should in theory be able to do a lot more optimizations than C/C++, Rust is not quite there yet. Benchmarks I've seen put it equal or slightly behind C/C++ on gcc (e.g. Benchmarks game). From my memory gcc also used to produce slower code than MSVC or the Intel compiler which would be bad news for Rust. The Internet however suggests that recent gcc is on par with MSVC/Intel, but I was unable to find any good benchmark link.

Concluding

The same way functional programming has made its way from fringes to being included in mainstream languages, I believe the features that make both Elm and Rust interesting will show up in the mainstream.
Some of the ideas can also be immediately transferred to the current languages (e.g. ImmutableJS). I think the take-home message of this post is that you should consider learning a new language. Preferably one that is very different from what you have been working with so far. No only it is fun, it will make you a better programmer in your language of choice.

I'll be very happy if you provide your feedback on this post either here, on my Twitter or on Reddit.

This is a great post, and I agree that tagged unions and exhaustive 'match' really do seem to help with corner cases. And I also love the fact that both languages forbid null and require the use of an Option-like type to represent possibly-missing values.

However, I found that Rust's borrow checker was mostly a struggle very early on, but that I quickly learned to design code in ways that made the borrow checker almost invisible on a day-to-day basis. For Rust, I think it's useful to distinguish between:

The initial learning curve: fairly steep. This is definitely steep, especially for people who've never worked in C or C++, or who have zero experience with any kind of functional programming. I think that most people will only climb this learning curve if they have a real incentive.

The difficulty of Rust on a day-to-day basis once learned: not bad at all. After a few weeks of working in Rust, I found it to be a surprisingly comfortable language. It takes me maybe twice as long the write Rust code as Ruby code, but the Rust code is a lot faster, I trust it more, and I can refactor Rust code very aggressively without fear of breaking things.

For me, the actual big limitation with Rust is that the third-party library support for "work" stuff is almost there but has the occasional corner case. I find that I produce a steady stream of minor PRs to Rust crates whenever I try to do something tricky in production. Happily, Rust library maintainers seem to be a pretty friendly bunch and they process most of my PRs quickly.

I've been working on a rust+elm project for some time now: A NES emulator written in rust with a debugger front-end written in Elm. Both languages are an absolute pleasure to work with. With elm in particular, I'm amazed at what I'm able to accomplish considering a web app of equal complexity written in any other browser-centric language/framework would have become a massive burden to maintain and add features to.

I really need to do a proper demo of this thing, but here's a crappy demo showing off the debugger front-end: youtu.be/5JlHSK6BeKI

Thanks for the article, really interesting food for thought. I especially liked the reference to "GOTO considered harmful, considered harmful". Seems ridiculous in hindsight.

I really like the ideas of functional languages. I wish people would come to realize that functional and object oriented programming are not opposites; they can form powerful alliances. Immutability and pure (side-effect free, idempotent) functions are extremely useful concepts. However, I do not necessarily need to go as far as Haskell to implement those (neither do I need to go as far as Java to do object orientation).

I have one major pain point with pretty much every functional langauge out there, be it Elm, Haskell, OCaml, you name it: The syntax. "Unreadable" does not even begin to describe it. Sure, the program just has 250 characters (not lines!) in total. But if I need 10 hours to figure out what it actually does, then how does this help anyone? It's about as far away from "literate programming" as you can possibly be. Elm and Rust are nightmares come alive in that regard as well, albeit each for different reasons. I can't help but get the feeling that those languages were not designed to be written, read or interpreted by humans. In that particular regard, they are no better (if not worse) than the assembly code they compile to.

You are confusing readability with familiarity here, to the point that it is almost a little bit insulting to people who enjoy ML style syntax. :-)

I do agree though, that (academic) Haskell code can be hard to read: highly abstract, lots of cryptic operators, single letter module imports. But this is a separate issue, not to be equated with language syntax.

IntelliJ has a very well made plugin for rust that gives you syntax highlight and allows you to resolve cargo dependencies, also it can run cargo tasks and other stuff related to the day by day work in Rust.

It's a pain to have this super big IDE only to use Rust, but it's an start.

Rust's compiler backend is always LLVM. The only difference between the GNU and MSVC Rust compilers that you can download for Windows is the version of libc that your program is linked to, and which ABI they use for repr(C) structs and extern functions.

Decimal types differ from floating-point types in that they have a fixed precision (both before and after the decimal point), mimicking the way human clerks work with numbers. There are even regulations for this: in my country (Czech Republic), the law requires/suggests (IANAL) you to compute money-related operations (interest, fees, ...) with four decimal digit precision and then round the final number to two decimal digits.

This contrasts with floating point operations, that don't have fixed precision and cannot represent some values precisely. For example, subtracting 0.01 from a large number may yield the same number, which may be acceptable in physics simulation, but is not acceptable in finance.