Aurora and Learning Erlang

TLDR: I learned Erlang and wrote a JSON API server called Aurora for a school project.

Introduction

I recently concluded a school project that involved writing an Android app with “some elements of concurrency”. My group wrote a chat messaging app - a stripped-down clone of WhatsApp, but with some additional features such as message tagging, conversation notes (think Google Docs, but on a per-conversation basis), and event polling.

I volunteered to write the API server that would serve the Android front-end application. I chose Erlang because I heard it was cool and I wanted to learn functional programming.

It seems any blog post regarding projects written in Erlang should include performance benchmarks and an eventual (and completely obvious) conclusion as to why Erlang provides superior performance under heavily concurrent loads. Instead, I will write about my journey picking up Erlang and some other things I learned from writing Aurora.

Learning Erlang, and Why People Think That Learning Functional Programming is Difficult

Because it is, especially if you’re used to thinking imperatively. The soundbite I use most when talking to people is: “Imagine programming without loops”. Of course, it’s not as bad as it sounds, and once you get the hang of it, for loops will seem completely stone-age. map, filter and fold will be your new best friends.
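
As a minimal sketch of what “programming without loops” looks like in practice, the module below uses lists:map/2, lists:filter/2 and lists:foldl/3 from the standard library. The module and function names (no_loops, squares, evens, sum) are made up for illustration and do not come from Aurora.

```erlang
-module(no_loops).
-export([squares/1, evens/1, sum/1, sum_of_even_squares/1]).

%% map: apply a function to every element of the list.
squares(Xs) -> lists:map(fun(X) -> X * X end, Xs).

%% filter: keep only the elements that satisfy a predicate.
evens(Xs) -> lists:filter(fun(X) -> X rem 2 =:= 0 end, Xs).

%% fold: collapse a list into a single value using an accumulator.
sum(Xs) -> lists:foldl(fun(X, Acc) -> X + Acc end, 0, Xs).

%% The three compose where an imperative program would write one loop.
sum_of_even_squares(Xs) -> sum(squares(evens(Xs))).
```

Each function returns a new value instead of mutating a counter or a collection, which is the habit that takes the longest to build when coming from an imperative language.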

The difficulty is compounded by what I call the “I Think I Know What’s Going On Wait What” effect. Because functional programs (or at least, well-written ones) can be more terse and/or expressive than their imperative counterparts, reading them sometimes provides a false sense of understanding. When it comes time to write something from scratch, the programmer is completely paralyzed.

I was stuck in many ruts as I made my way through Fred Hebert’s excellent Erlang book, Learn You Some Erlang. The chapters on socket programming in Erlang (chapter 23), ETS (chapter 25), and Mnesia (chapter 29) were particularly illuminating.

The Importance of Ripping Other People’s Code Off

It wasn’t until I found some fairly basic code for a TCP server in a GitHub repo that I slowly began to understand how to write my own. And even then, I experimented a lot by copying and pasting the code and changing it line by line, to get a sense of how everything came together.

Beyond the initial scaffolding stage, I continued to find code samples indispensable: I would often find alternative ways to achieve the same functionality, compare them, and settle on the most elegant (insert some arbitrary measure of “best”) way. I think the relative paucity of Erlang code on the interwebs psychologically reinforces the preciousness of code samples.

Personal Notes on Erlang

Maps

I was fortunate enough to embark on this project just as Erlang/OTP 17 was released. It turns out that maps only found their way into Erlang with this release. I admit to a conspicuous lack of knowledge about Erlang’s development history, but I was nonetheless appalled to find that a key-value data structure took so long to be introduced. I used maps heavily throughout my code.
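
For readers who have not seen the syntax, here is a small sketch of creating, updating and pattern matching on maps. The module name, function names and keys (name, age) are hypothetical and are not taken from Aurora.

```erlang
-module(map_sketch).
-export([new_user/2, rename/2, greeting/1]).

%% => inserts a key; this builds a fresh map with two keys.
new_user(Name, Age) -> #{name => Name, age => Age}.

%% := updates a key that must already exist; it raises an error otherwise.
%% The original map is left untouched and a new map is returned.
rename(User, NewName) -> User#{name := NewName}.

%% Maps can be pattern matched directly in a function head.
greeting(#{name := Name}) -> <<"hello, ", Name/binary>>.
```

Lookups outside of pattern matching go through the maps module, for example maps:get(age, User).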

Atoms

I cannot overstate how much I enjoy first-class atom support in Erlang. For the uninitiated, atoms are conceptually similar to symbols in Ruby: both are immutable, string-like[1] objects, and multiple references to the same atom or symbol refer to the same object in memory. This improves performance, since comparing two atoms is cheap, and also reduces programming errors to some degree. Atoms in Erlang are maintained in a centralized atom table, and the text of each unique atom is stored only once. Notably, this atom table is not garbage collected, which means that dynamic creation of atoms is strongly discouraged. On the other hand, since symbols in Ruby are objects, I’m guessing they live on the heap along with other objects. I have no idea if symbols receive preferential treatment in, say, MRI Ruby’s incremental garbage collection system.
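
One common way to respect the “no dynamic atoms” rule is to convert external input with binary_to_existing_atom/2, which refuses to create new atoms. The sketch below assumes input arrives as a binary; the module name, the function name to_known_atom and the error tuple are invented for illustration.

```erlang
-module(atom_sketch).
-export([to_known_atom/1]).

%% Convert untrusted input to an atom only if that atom already exists,
%% so arbitrary input cannot grow the (never garbage collected) atom table.
to_known_atom(Bin) when is_binary(Bin) ->
    try
        {ok, binary_to_existing_atom(Bin, utf8)}
    catch
        %% binary_to_existing_atom/2 raises badarg for unknown atoms.
        error:badarg -> {error, unknown_atom}
    end.
```

By contrast, calling binary_to_atom/2 on every incoming string would add a permanent entry to the atom table for each new value.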

Some Other Details Regarding Aurora

As mentioned earlier, Aurora is an Erlang API server application that services a frontend Android chat messaging application. It uses JSON as the data format, transported over raw TCP. I don’t know why we didn’t go with HTTP, but it turned out fine, for the most part. I had to naively reimplement some features of HTTP that I needed, such as status codes.
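
To give a feel for what a request/response exchange over raw TCP with hand-rolled status codes might look like, here is a small self-contained sketch built on gen_tcp. It is not Aurora’s code: the newline-delimited framing ({packet, line}), the “ping” request, and the 200/400 replies are all assumptions, and the payload is plain text instead of JSON so that the example needs no external library.

```erlang
-module(tcp_sketch).
-export([start/0, request/2]).

%% Start a listener on an OS-assigned port and return {ok, Port}.
%% The listening socket is opened inside the acceptor process so that
%% it stays open for as long as that process is alive.
start() ->
    Parent = self(),
    spawn(fun() ->
        {ok, LSock} = gen_tcp:listen(0, [binary, {packet, line},
                                         {active, false}, {reuseaddr, true}]),
        {ok, Port} = inet:port(LSock),
        Parent ! {listening, Port},
        accept_loop(LSock)
    end),
    receive {listening, Port} -> {ok, Port}
    after 5000 -> {error, timeout}
    end.

%% Accept connections forever, handing each one to its own process.
accept_loop(LSock) ->
    {ok, Sock} = gen_tcp:accept(LSock),
    Pid = spawn(fun() -> receive go -> serve(Sock) end end),
    ok = gen_tcp:controlling_process(Sock, Pid),
    Pid ! go,
    accept_loop(LSock).

%% Read one line at a time and answer with a status line.
serve(Sock) ->
    case gen_tcp:recv(Sock, 0) of
        {ok, Line} ->
            _ = gen_tcp:send(Sock, [reply(Line), $\n]),
            serve(Sock);
        {error, _Reason} ->
            gen_tcp:close(Sock)
    end.

%% Hypothetical status codes borrowed from HTTP.
reply(<<"ping\n">>) -> <<"200 pong">>;
reply(_Other)       -> <<"400 unknown request">>.

%% Tiny client: send one line and return the reply line.
request(Port, Line) ->
    {ok, Sock} = gen_tcp:connect({127,0,0,1}, Port,
                                 [binary, {packet, line}, {active, false}]),
    ok = gen_tcp:send(Sock, [Line, $\n]),
    {ok, Reply} = gen_tcp:recv(Sock, 0, 5000),
    ok = gen_tcp:close(Sock),
    Reply.
```

Spawning one process per connection is the usual Erlang answer to concurrent clients; a real server would decode the received line as JSON before dispatching on it.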

Written over a period of about 2 months, Aurora’s source code is about 2100 lines long. About 1400 of those lines are contained in one monolithic file called controller.erl, which holds the controller/business logic and also, for legacy reasons, the Mnesia database’s RESTful APIs and some convenience functions wrapped around this core set of APIs.

Code Smell, And What I Think I’d Have Done Differently But Am Actually Too Lazy To Revisit

Undoubtedly, there is significant code smell in some portions of the code. Bloated functions are particularly bad. Some functions contain way too many nested syntactic constructs - case...ofs, ifs. For example, there are infinitely many better ways to write the function below, which handles the asynchronous cast from the listener socket:

However, making your functions too piecewise can adversely affect readability. As with most things, there is a balance to be achieved here.
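
To illustrate the trade-off in general terms, here is a hypothetical pair of functions (not Aurora’s actual handler) that implement the same logic twice: once with nested case expressions, and once as separate function clauses selected by pattern matching. The message shapes ({login, User}, {logout, User}) and result tuples are invented for this sketch.

```erlang
-module(clauses_sketch).
-export([nested/1, flat/1]).

%% Nested style: every decision adds another level of indentation.
nested(Msg) ->
    case Msg of
        {login, User} ->
            case User of
                <<>> -> {error, empty_user};
                _    -> {ok, {logged_in, User}}
            end;
        {logout, User} ->
            {ok, {logged_out, User}};
        _ ->
            {error, unknown}
    end.

%% Same logic with one clause per case; clauses are tried top to bottom.
flat({login, <<>>})  -> {error, empty_user};
flat({login, User})  -> {ok, {logged_in, User}};
flat({logout, User}) -> {ok, {logged_out, User}};
flat(_)              -> {error, unknown}.
```

The flat version reads like a table, but if each clause were further split into its own tiny helper, a reader would have to jump around the file to follow a single request.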

Build Tools

The standard for building Erlang projects is rebar, but I opted to keep it simple with a bare-bones Emakefile because I was in way over my head at the time and didn’t have the bandwidth to learn any other tooling.
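
For reference, a bare-bones Emakefile can be a single Erlang term. The one below is only a sketch: the src/ and ebin/ directory names are assumptions about the project layout, not necessarily what Aurora uses.

```erlang
%% Emakefile: compile every module under src/ into ebin/, with debug info.
%% The directory names src/ and ebin/ are assumed for illustration.
{"src/*", [debug_info, {outdir, "ebin"}]}.
```

With this file in the project root and the ebin/ directory already created, running erl -make compiles any modules that are out of date.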

Mnesia

I used Mnesia, which was nice because queries in its query language, QLC, are written in Erlang as well. As with Active Record in Rails, I was able to stick to the same language/DSL throughout the application. I discuss Mnesia further, including its locking strategies for handling concurrency, in the documentation report.
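
As a rough sketch of what querying Mnesia through QLC looks like, the module below creates a RAM-only table and filters it with a list comprehension. The message record, its fields and the function names are hypothetical and do not reflect Aurora’s actual schema.

```erlang
-module(qlc_sketch).
%% qlc.hrl enables the parse transform that qlc:q/1 relies on.
-include_lib("stdlib/include/qlc.hrl").
-export([setup/0, add/3, by_tag/1]).

%% Hypothetical record; the first field (id) is the table key.
-record(message, {id, tag, body}).

%% Start Mnesia and create a table (RAM-only by default) for the demo.
setup() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(message,
        [{attributes, record_info(fields, message)}]),
    ok.

%% Write one record inside a transaction.
add(Id, Tag, Body) ->
    {atomic, ok} = mnesia:transaction(fun() ->
        mnesia:write(#message{id = Id, tag = Tag, body = Body})
    end),
    ok.

%% The query is an ordinary list comprehension over the table,
%% evaluated inside a transaction. Results are sorted for stable output.
by_tag(Tag) ->
    {atomic, Bodies} = mnesia:transaction(fun() ->
        qlc:e(qlc:q([M#message.body || M <- mnesia:table(message),
                                       M#message.tag =:= Tag]))
    end),
    lists:sort(Bodies).
```

Because the query is just Erlang, the filter condition can use any guard-like expression, and the surrounding transaction is where Mnesia applies its locking.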

Dependencies

Aurora has only one dependency, jsx (which conveniently shares its name with the XML-like markup language popular in React), a JSON parsing library that converts JSON strings (represented as UTF-8 binaries) into Erlang terms and vice versa via its decode and encode functions. jsx is also nice enough to convert keys into atoms, and has map support via its return_maps option (godsend). Otherwise, output is represented as proplists.

Parting Notes

As part of the school project’s requirements, we also had to write a documentation report. Aurora’s part runs from page 21 to page 57 and goes into implementation-specific details and what functionality it (and by extension, the Android chat messaging app itself) supports. Thanks to Glen and Francisco for teaming up on this project.

[1] Don’t bastardize strings! They are innocent containers for data. It is also worth mentioning that strings are handled entirely as binaries within the application logic in Aurora. I heard that this is much more performant, but I never ran any tests to compare.