Tag: concurrency

Last week was hack week at Dropbox. I took the opportunity to explore the implementation of a GraphQL server that does optimal IO batching and concurrency.

Serial IO

A common performance problem in the implementation of web services is serial IO. Let’s say you have a service that returns the names of all of your friends. It’s easy and natural to implement it like this:

The problem is that each user info lookup for my friends occurs sequentially. Even with TCP connection pooling, that’s still a packet round-trip in the datacenter. “But that should be faster than 10 ms right?” Even if it is, it doesn’t take many friends to blow your performance budget and send your response time across the threshold of perceptible delay.

Moreover, this problem compounds on itself. If your inner functions have serial IO and you call them in a loop, you’ve just added potentially thousands of round-trips to your backend data sources. I would hazard a guess and say serial IO is the largest contributor to service response latencies.

Manually Batched IO

One solution is to always batch IO. Every function takes a list and returns a list. This can be made to work (indeed, I’ve achieved excellent performance by carefully, manually, batching) but doesn’t compose as your services scale. Sometimes it’s just too hard to know all of the data you will need to fetch, and the dependencies between that data.

OK, so serial IO is bad. More on that in a bit. Let’s talk about REST now.

REST

At IMVU, we went all-in on REST, building a rather substantial framework to make it easy to create REST services. We anticipated the fact that REST services tend to require many round-trips from clients, so we built a response denormalization system, where the service can anticipate the other services a client will need to hit and include their responses too.

This sounded great at first, and in fact was mostly an improvement from the prior state of affairs. But, at scale, REST has some challenging performance problems. For one, REST services have a defined schema, and they always return the data they advertise. As I mentioned, if a service doesn’t return all the data a client needs, the client needs to hit more services, increasing the number of client-to-server round-trips. In addition, because the service always returns the same set of data, it must query the backend database to fetch more data than the client even cares about.

This phenomenon tends to happen most frequently with core domain objects like users. Because users are so important, they accumulate relationships with so many pieces of data (e.g. list of subscribed experiments, contact lists, shopping carts, etc.), almost all of which is irrelevant to most clients.

Why GraphQL?

This is where GraphQL comes in. In GraphQL, the client specifies the data that it wants. There is only one request, even if the data is deeply nested, and the server only has to fetch exactly what’s needed.

There are GraphQL implementations for many languages but many of them don’t solve the serial IO problem I described to start this post. In fact, a naive GraphQL implementation might issue IO per field of each object requested.

For hack week, I wanted to explore the design of a GraphQL server that issued all of its backend IO in optimally concurrent batches.

Why Haskell?

Dropbox doesn’t use Haskell, but I find it to be a great language for exploring design spaces, particularly around execution models. Also, Facebook open sourced their excellent Haxl library which converts code written with serial IO into efficient batched requests. Haxl provides an effect type that, when it can understand that two data fetches are independent, runs them both in parallel. When all Haxl operations are blocked on backend data fetches, only then does it issue the backends. My prototype GraphQL resolvers are surprisingly naive, specified with sequential code. Haxl automatically batches up the requests and hands them to the DataSource for execution.

At this point, you might be thinking “So was anything challenging about this project?” On the data fetch side, no, not really. :) However, I did run into one unexpected snag when using Haxl for data updates: because Haxl tries really hard — and in a general, composable way — to run your IO in parallel, you must be careful about how writes and reads are sequenced together. If you leave out the () <- on line 105, Haskell sequences the operations with >> instead of >>=, and Haxl’s >> uses the Applicative bind operation instead of the Monad bind operation, and thus it assumes it can run them in parallel. And, as you might expect, issuing a write and read to the same data concurrently doesn’t end well. :)

Conclusions

I am very thankful for jdnavarro’s excellent GraphQL query parser. With it, in four days, I was able to get a prototype GraphQL server up and running. Using Haxl and Hedis, I have it hooked up to a Redis data source, and it correctly batches all independent IO reads and runs them concurrently.

You can even see that it noticed that R2-D2 is the hero of both movies, and only requested its info once.

The performance is pretty good: on some AWS instances, I measured about 3500 queries per second per machine and a query latency averaging 3 ms. Of course, that could worsen as permissions checks and so on are implemented. On the other hand, the code is completely unoptimized, full of lazy data structures and an unoptimized parser without a parser cache.

It’s probably possible to build something like Haxl in Python with generators, but you’d have to give up standard loops and list comprehensions, instead using some kind of parallel map operation. You also would not benefit from anything that teases concurrency out of imperative functions like GHC’s upcoming ApplicativeDo extension. There are some things that Haskell’s restricted effects are uniquely good at. :)

I’d guess it would probably be even trickier to do a good implementation in Go given that the programmer has less control over goroutines than Python’s generators and Monads in Haskell. That said, perhaps someone will discover a clean, optimal implementation. We should pay attention to efforts like this.

I think GraphQL will be a huge deal for high-performance web services. It’s harder to implement than REST or traditional RPC services, but the reduction in response latencies could be substantial. Naive implementations aren’t going to have good performance, but if you are interested in aiming straight at the finish line, Haskell and Haxl will give you a head start.

My colleague Andy recently wrote about what it’s like to use Haskell at IMVU. Since then, I’ve started writing some Haskell, and I wanted to share my thoughts on the language and ecosystem.

I am by no means a language expert. But I’ve gotten over the beginner’s hump and have written a fair amount of real code, so I have a decent sense of what the idioms are.

Haskell is Awesome

Overall, I find Haskell to be great. It’s not that fun to write — the common complaint that Haskell makes you think a lot up front is totally true — but once the code is written it is so easy to safely refactor. This is likely due to type inference, Haskell’s brevity, and the ease of creating new, lightweight types. At the technology strategy level, I value long-term maintainability and evolvability over ease of writing new code, so Haskell has a big advantage here.

With Warp and Wai, Haskell is particularly excellent for writing HTTP services. (Michael Snoyman generally has solid design taste and performance sense.) In just a few lines of code, you can write a scalable, concurrent, low-latency HTTP server. Better yet, you can deploy this HTTP server to your production servers as a static binary: no fancy configuration or package management necessary!

The GHC runtime is both innovative and mature. It has excellent concurrency support. With STM, avoiding deadlocks in concurrent algorithm is easy. Green threads and sparks enable spinning off hundreds of thousands of lightweight jobs, soaking up any available CPU capacity. The parallel I/O manager makes it efficient to concurrently handle multiple requests in a single process.

When digging into an unfamiliar codebase, the hardest part is building an understanding of the possible states and data flows. That is, understanding the big picture. Sum types help a lot here. They explicitly denote the possible states. Pattern matching, plus warnings for partial matches, make sure that all states are properly handled, largely preventing any kind of accidental null pointer errors. (Remember: Maybe a is a sum type: Just a | Nothing.)

Haskell draws a sharp line between pure code and effectful code. This makes it much easier to understand how information flows through the code. When you see a pure function, you know it is only calculating a result, and thus only need to read the type signature to understand its effect. Also, separating pure and effectful code prevents a common class of performance regression: accidentally adding expensive I/O to otherwise pure data transformation code. (The compiler also benefits from knowledge of which functions are pure: it can evaluate in arbitrary order, lift common function calls to outer scopes, and apply rewrite rules.)

In addition, through custom Monad types, Haskell makes it easy to restrict effects to specific contexts. For example, you can enforce rules like “No MySQL queries in Redis transactions” or “No arbitrary I/O in HTTP request handlers, only specific restricted actions.” By limiting your code to a restricted interface, it’s even possible to guarantee that your unit tests will be deterministic. This is a real problem in languages like Python, C++, and JavaScript, where a single unmocked call to “get the current time” can make your tests intermittent.

Type classes are awesome too! Traditional OO languages allow polymorphism through the first argument (this pointer, vtable). But type classes allow polymorphism on single values (see Data.Default) and across multiple arguments, even return values. See Eq.

Type classes can be created independently from their resident types. Types can be specified independently from type class implementations. This allows layering concepts into third-party libraries that weren’t designed with them in mind.

Type classes are especially powerful when you start building application frameworks with them. For example, consider the type of HTTP request handlers:

myRequestHandler :: ToHTTPResponse a => HTTPRequest -> a

That is, a request handler is a function that takes an HTTP request and returns any value that can be converted to an HTTP response. This means request handlers don’t themselves need to build the HTTP response: just return any compatible object and let the framework convert.

Haskell is Not Awesome

Haskell is not all roses. In fact, I’m not sure it can even become the Next Big Language. And for a rather sad, superficial reason: the syntax is a too idiosyncratic compared to every other mainstream language. I’m not talking about lets, operators, or separating functions from arguments by whitespace. Those things can be learned pretty quickly.

My biggest syntax problem with Haskell is that there is too much variety. I’m a strong believer in the Style is Substance thesis, which is that our precious brain cells and decision-making units shouldn’t be spent on stylistic choices — instead, we should all have a fairly consistent style and spend our energy on more important decisions.

Haskell, on the other hand, has WAY too many nerd knobs. You can use “let … in …” syntax, “expr … where …” syntax, do syntax (even in non-monadic contexts). You can write functions with explicit arguments or in point-free style.

In your own code, or even within an organization, you can define a coding style and aggressively follow it. But if Haskell syntax was as consistent and tight as, for example, Java or Python, it would have a better shot at being mainstream.

I have some other syntax nits too: the verbosity of module imports is annoying and some common libraries (Aeson) use too many custom operators.

Some extensions should definitely be promoted to the core language by now. String should basically never be used in production code (more on that later), and ByteString and Text aren’t usable without the OverloadedStrings extension, which allows specifying ByteString and Text values with string literals. OverloadedLists should also be enabled. Really, any literal should be overloaded. It’s very convenient to be able to use existing literal syntax to initialize values of many different types.

“ScopedTypeVariables simplifies real code. TupleSections are commonly-used. There’s little risk in folding both into the core language. I feel like RecordWildCards should also be enabled by default, but records are a mess in Haskell and that part of the language is actively evolving. ExistentialQuantification is also critical in most real codebases.

Now let’s talk about the Prelude. The Prelude probably made sense in the 80s and 90s when Haskell was a simple research language, but parts of it are showing its age. String should never be used in production code. A lazy linked list of 32-bit Unicode code points is… elegant? … but pretty much the least efficient data structure for text. Sadly, a lot of underlying APIs in Haskell use String, just because it’s standard and the original data type for text.

The Num typeclass is a mess. Just because a value can be added does not also mean it can be multiplied or has an absolute value or can be converted from a numeric program literal. Generally, type classes should be small and precise, and Num is none of those. The problems with Num are well-documented.

The Prelude in general should use more type classes. This is a point of contention within the Haskell community. My opinion, however, falls squarely on the “more polymorphic functions!” side of the argument. It really sucks to have to teach people about bothmap and fmap when they’re the same function. It generally goes like this:

“What is fmap?”
“It’s generalized map. It works on Maybe and Vector and HashMap.”
“Why doesn’t map work on those?”
“Well, some people in the community believe that type classes make learning harder, so the Prelude should stick to less general function definitions.”

I haven’t seen any actual human factors research that says type classes make learning the language harder. And you can’t really learn Haskell without understanding type classes, so a monomorphic Prelude simply delays some education that is necessary anyway.

The C++ standard library is a good counterexample: you can usestd::vector without understanding how to write templates. Python too. You can call len on any container without having to know how to implement the __len__ protocol. Haskell could have this same exact feature if people would stop inventing usability reasons not to. See Michael Snoyman’s work on ClassyPrelude.

This brings up a higher-level issue with the design of most Haskell APIs: little regard is paid to the number of symbols exposed. Ideal APIs are memorable and predictable. Once you know how to map across any collection, you should know how to map across all collections. Once you know how to calculate the length of a list, you should know how to calculate the length of a vector or map. In Haskell, today, you need separate functions for each of those (e.g. Prelude.length vs. Data.Vector.length vs. Data.HashMap.Strict.size). There is so much to remember that you need about six browser tabs open to get any useful work done.

In summary: type classes are an excellent language feature. But they’re not used enough or well enough in the base Haskell libraries. And that’s all I’ll say on the subject for now. :)

Now I will discuss integrating with libraries.

I think it’s time that certain libraries get promoted into the standard: ByteString, Data.Maybe, Data.Map, Data.Set, Data.Text. It’s not realistic to write a real application without using them, so they should be made ubiquitous.

Finally, I have one more major complaint about the Haskell ecosystem. Fortunately, it’s a solvable one. Cabal sucks. I don’t know enough about it to say specifically why, but literally every time I come back to our build system after time away, something goes horribly wrong. I end up having to manually deleted generated files, or manually uninstall some packages. The error messages are always inscrutable. Sometimes I have to pull in one of the Cabal experts at the company and it’s a coin flip as to whether they’ll be able to figure it out.

I assume Cabal is fixable, but it’s somewhat boggling to me that a community that thinks so much about precision and mathematical laws and reliability can produce a tool so flaky.

Haskell is Worth Using

That was a long list of complaints – longer than I spent on the upsides. :) But I want to be clear: I still think Haskell is pretty great. It’s certainly worth learning. It’s also a great platform for writing HTTP services. Unless someone with GvR-esque taste attacks the syntax, Prelude, and standard libraries with gusto, Haskell likely won’t become as popular as Java or PHP or Python. Nonetheless, Haskell’s on a rapid upward trajectory, and as the community becomes less academic and more practical, many of these problems will get addressed. Even if the community is slow to adapt, as with any language, you can define your own internal conventions and base libraries, including your own Prelude if desired.

Background: The IMVU client has a system that we call the ‘task system’ which works a lot like lightweight threads or coroutines. You can schedule up some work which gives you back a Future object on which you can either wait or request the actual result (if it has materialized, that is).

When we built the task system, there were two ways to cancel execution of a task: explicit and implicit. You could either call future.cancel() or just release your reference to the future and wait for it to be collected by the GC. In either case, the scheduler would notice and stop running your task. I’ve often wondered if support for implicit task cancellation was a good design.

After reading Joe’s article, I’m convinced. If you believe that resource utilization is also a resource, and that implicit resource disposal (garbage collection) is the right default, then implicit cancellation is the only sane default. In fact, we recently removed support for explicit cancellation, because we haven’t even ‘really’ used it yet. (Some of Dusty’s code used it, but he said it was fine to take it out.)

We may have to implement explicit cancellation again someday, but now I’m happy to wait until there is a compelling use case.