Scaling Play! to Thousands of Concurrent Requests

As Scala web developers, we often fail to consider the consequences of thousands of users accessing our applications at the same time. Perhaps it’s because we love to rapidly prototype; perhaps it’s because testing such scenarios is simply hard.

Regardless, I’m going to argue that ignoring scalability is not as bad as it sounds—if you use the proper set of tools and follow good development practices.

Lojinha and the Play! Framework

Some time ago, I started a project called Lojinha (Portuguese for “small store”), my attempt to build an auction site. (By the way, the project is open source.) My motivations were as follows:

I really wanted to sell some old stuff that I don’t use anymore.

I don’t like traditional auction sites, especially those that we have down here in Brazil.

So obviously, as mentioned above, I decided to use the Play! Framework. I don’t have an exact count of how long it took to build, but it certainly wasn’t long before I had my site up and running with the simple system deployed at http://lojinha.jcranky.com. Actually, I spent at least half of the development time on the design, which uses Twitter Bootstrap (remember: I’m no designer…).

The paragraph above should make at least one thing clear: I did not worry much about performance, if at all, when creating Lojinha.

And that is exactly my point: there’s power in using the right tools—tools that keep you on the right track, tools that encourage you to follow best development practices by their very construction.

In this case, those tools are the Play! Framework and the Scala language, with Akka making some “guest appearances”.

Let me show you what I mean.

Immutability and Caching

It’s generally agreed that minimizing mutability is good practice. Briefly, mutability makes it harder to reason about your code, especially when you try to introduce any parallelism or concurrency.

The Play! Scala framework makes you use immutability a good portion of the time, and so does the Scala language itself. For instance, the result generated by a controller is immutable. Sometimes you might consider this immutability “bothersome” or “annoying”, but these “good practices” are “good” for a reason.

In this case, the immutability of the controller’s result was absolutely crucial when I finally decided to run some performance tests: I discovered a bottleneck and, to fix it, simply cached this immutable response.

By caching, I mean saving the response object and serving an identical instance, as is, to any new clients. This frees the server from having to recalculate the result all over again. It wouldn’t be possible to serve the same response to multiple clients if this result were mutable.

The downside: for a brief period (the cache expire time), clients can receive outdated information. This is only a problem in scenarios where you absolutely need the client to access the most recent data, with no tolerance for delay.

For reference, here is the Scala code for loading the start page with a list of products, without caching:

Quite simple, isn’t it? Here, “index” is the key to be used in the cache system and 5 is the expiration time, in seconds.

To test the effect of this change, I ran some JMeter tests (included in the GitHub repo) locally. Before adding the cache, I achieved a throughput of approximately 180 requests per second. After caching, the throughput went up to 800 requests per second. That’s an improvement of more than 4x for less than two lines of code.
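The caching idea described above can be sketched without Play! at all. The block below is a minimal, standard-library-only illustration and not Play!'s cache API: the names TtlCacheSketch, TtlCache, Page, and getOrElse are made up for this example, and the injectable clock exists only so that expiry can be exercised without waiting. It shows why immutability matters: the very same Page instance is handed to every caller until the entry expires.

```scala
import scala.collection.concurrent.TrieMap

object TtlCacheSketch {
  // Hypothetical, immutable stand-in for a rendered response.
  final case class Page(body: String)

  // Keeps each value for `ttlMillis`. `now` is injectable so that expiry
  // can be tested deterministically; it defaults to the system clock.
  final class TtlCache[V](ttlMillis: Long,
                          now: () => Long = () => System.currentTimeMillis()) {
    private val entries = TrieMap.empty[String, (Long, V)]

    // Returns the cached value while it is fresh; otherwise recomputes and stores it.
    // Not atomic: two concurrent misses may both run `compute`, which is
    // harmless here because the values are immutable.
    def getOrElse(key: String)(compute: => V): V =
      entries.get(key) match {
        case Some((storedAt, value)) if now() - storedAt < ttlMillis => value
        case _ =>
          val value = compute
          entries.put(key, (now(), value))
          value
      }
  }
}
```

With a five-second lifetime, as in the article, the expensive rendering runs at most about once every five seconds per key, no matter how many requests arrive in between.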

Memory Consumption

Another area where the right Scala tools can make a big difference is in memory consumption. Here, again, Play! pushes you in the right (scalable) direction. In the Java world, for a “normal” web application written with the servlet API (i.e., almost any Java or Scala framework out there), it’s very tempting to put lots of junk in the user session because the API offers easy-to-call methods that allow you to do so:

session.setAttribute("attrName", attrValue);

Because it’s so easy to add information to the user session, it is often abused. As a consequence, the risk of using up too much memory for possibly no good reason is equally high.

With the Play! framework, this is not an option—the framework simply doesn’t have a server side session space. The Play! framework user session is kept in a browser cookie, and you have to live with it. This means that the session space is limited in size and type: you can only store strings. If you need to store objects, you’ll have to use the caching mechanism we discussed before. For example, you might want to store the current user’s e-mail address or username in the session, but you will have to use the cache if you need to store an entire user object from your domain model.

Again, this might seem like a pain at first, but in truth, Play! keeps you on the right track, forcing you to carefully consider your memory usage, which produces first-pass code that is practically cluster ready—especially given that there is no server-side session that would have to be propagated throughout your cluster, making life infinitely easier.
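The split between a string-only session and a server-side cache can be sketched as follows. This is an illustration under stated assumptions rather than Play!'s actual session or cache API: the session is modelled as a plain immutable Map[String, String], and User, UserCache, login, and currentUser are hypothetical names invented for the example.

```scala
import scala.collection.concurrent.TrieMap

object SessionSketch {
  // Hypothetical domain object: too rich to live in a cookie of strings.
  final case class User(email: String, name: String, bidCount: Int)

  // The cookie-backed session, modelled as an immutable string-to-string map.
  type Session = Map[String, String]

  // Stand-in for a server-side cache keyed by string.
  final class UserCache {
    private val users = TrieMap.empty[String, User]
    def put(user: User): Unit = users.update("user:" + user.email, user)
    def get(email: String): Option[User] = users.get("user:" + email)
  }

  // Only a small string goes into the session; the full object goes into the cache.
  def login(session: Session, user: User, cache: UserCache): Session = {
    cache.put(user)
    session + ("email" -> user.email)
  }

  // Resolve the session's e-mail back to the full object, if it is still cached.
  def currentUser(session: Session, cache: UserCache): Option[User] =
    session.get("email").flatMap(cache.get)
}
```

Because currentUser returns an Option, the calling code has to handle a cache miss explicitly, for example by reloading the user from the database.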

Async Support

Next in this Play! framework review, we will examine how Play! also shines in async(hronous) support. And beyond its native features, Play! allows you to embed Akka, a powerful tool for async processing.

Although Lojinha does not yet take full advantage of Akka, its simple integration with Play! made it really easy to:

Schedule an asynchronous e-mail service.

Process offers for various products concurrently.

Briefly, Akka is an implementation of the Actor Model made famous by Erlang. If you are not familiar with the Actor Model, just imagine an actor as a small unit that communicates with the rest of the system only through messages.

To send an e-mail asynchronously, I first create the proper message and actor. Then, all I need to do is something like:

EMail.actor ! BidToppedMessage(item.name, itemUrl, bidderEmail)

The e-mail sending logic is implemented inside the actor, and the message tells the actor which e-mail we would like to send. This is done in a fire-and-forget scheme, meaning that the line above sends the request and then continues to execute whatever we have after that (i.e., it does not block).
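To make the fire-and-forget behaviour concrete, here is a toy mailbox built on a single-threaded executor from the Java standard library. It is deliberately not Akka and lacks nearly everything Akka provides (supervision, remoting, routing); TinyActor and MailboxSketch are hypothetical names, and the BidToppedMessage fields are assumed to be plain strings. It only demonstrates the two properties mentioned above: sending returns immediately, and the handling logic lives inside the receiving unit.

```scala
import java.util.concurrent.{Executors, TimeUnit}

object MailboxSketch {
  // Message type mirroring the article's example; the field types are an assumption.
  final case class BidToppedMessage(itemName: String, itemUrl: String, bidderEmail: String)

  // A toy actor: one worker thread drains messages in arrival order,
  // so `handle` never runs concurrently with itself.
  final class TinyActor[A](handle: A => Unit) {
    private val worker = Executors.newSingleThreadExecutor()

    // Fire-and-forget: enqueue the message and return right away.
    def !(msg: A): Unit =
      worker.execute(new Runnable { def run(): Unit = handle(msg) })

    // Stop accepting messages and wait up to five seconds for queued ones to finish.
    // Returns true if everything was processed in time.
    def shutdown(): Boolean = {
      worker.shutdown()
      worker.awaitTermination(5, TimeUnit.SECONDS)
    }
  }
}
```

A caller would write something like mailer ! BidToppedMessage(name, url, email) and carry on with the request; the handler runs later on the worker thread.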

Conclusion

In summary: I rapidly developed a small application, Lojinha, capable of scaling up and out very well. When I ran into problems or discovered bottlenecks, the fixes were fast and easy, with much credit due to the tools I used (Play!, Scala, Akka, and so forth), which pushed me to follow best practices in terms of efficiency and scalability. With little concern for performance, I was able to scale to thousands of concurrent requests.

When developing your next application, consider your tools carefully.

About the author

Paulo is a passionate developer who found in Scala a chance to leverage years of experience with Java: beginning his career as a web-based Java developer, Paulo discovered Scala a few years ago and, since then, has been expanding his capabilities (and portfolio) with every passing project.

Comments

Hi. It looks like you serve images directly from S3. That's gonna hit your pocket under load. I cache the most recent ones with nginx, which saves me money.

fhuuucho

Is there any benefit of using Akka over simply posting Runnable to an Executor?

chetan conikee

Great article. Which diagramming tool did you use to generate the gif above?

Paulo "JCranky" Siqueira

Easier to scale, as long as you are familiar with the Actor Model.

Vitaly Dyatlov

Talking about good speed and high scalability, you forgot to optimize the biggest bottleneck here - JVM.
Simply set up a reverse proxy there (Varnish/Squid) and forget about caching on the JVM side. It will gain you much more speed, and much less memory will be used (on average).
As you said in your post - "use the proper tools".

Isaias Cristiano Barroso

Nice article, congratulations

Paulo "JCranky" Siqueira

thank you for the feedback :)
Next time I decide to spend some time testing performance again I might throw in something like that to get the numbers.

Paulo "JCranky" Siqueira

That is something to consider as well. The upside of using S3 is that serving the images from S3 won't have any performance hit at all.

Paulo, just a few more thoughts.
First. How long have you been stressing the app? How many cache keys were used? It's a JVM, you might get well, and after some period of time get fed up with garbage. Is memory cache used in Play or somewhat like EHcache?
Second. (I'm not familiar with Play) but it appears to me that you're caching the entire response. If so, you should have tested front-end caching scenarios like Vitaly mentioned. It would be interesting to compare Varnish or Nginx+Redis.

Luboš Volkov

To be accurate, the diagram was created in Adobe Photoshop.

Paulo "JCranky" Siqueira

1- EHcache by default, and the actual implementation being used can be plugged in if you need a different one.
2- That's true. I didn't test those other scenarios more due to time constraints than anything else, but it is something I might try in the future.

Pablo Fernandez

> It wouldn’t be possible to serve the same response to multiple clients if this result were mutable.
pretty much every framework offers a "response" cache of some sort, whether the response object is immutable or not.

Aliaksandr Hlinski

One of benefits: try to imagine "Runnable task" which can be executed on one of several nodes in cluster. Akka will manage load balancing between nodes in cluster and so on.

Alexey Migutsky

You can setup CloudFront to leverage CDN functionality.
Each CloudFront node will cache the content (and the cache expiration is highly configurable) in memory and will deliver the content right from S3.
The bill will be reduced significantly, and S3 reads will make up the largest part of it (S3 I/O will be considerably lower with the right cache expiration)

Jim Clermonts

What would be better if posting a 150 mb file? Splitting it up to smaller chunks and have more actors running or does it depend on the RAM size of the EC2 instance?

Lev

One thing I don't understand about caches (both JVM-side ones and reverse-proxy caches):
How can I cache a whole page if it can look differently for different users? Imagine index page which looks differently for signed in user ("Hi Bill/John/Paulo!" in the header). How to deal with this scenario?
Thanks!

Abdullah Alansari

As you are the one who wrote Lojinha and not me, you are more aware of the app needs than me, so take what I say with a grain of salt.
Here goes nothing!
You mentioned that after caching the response your app served 4x more responses/second.
1. If you expire your cache in about 1 second, then in all the requests in the second, which you based your scalability on, the response is computed only once, this means that the required computation for a request is only 4x slower than the cache lookup!
2. Depending on how many requests your app serves in production this case could be in the 97% of the times when you shouldn't optimize for performance. And regardless you should use a monitoring service (e.g: NewRelic).
3. Caching, even without invalidating, can in many cases make the code less maintainable and harder to change, for example in your case you'll have to keep track of the keys and decide how long each key in the cache should live.
4. Although caching is usually straightforward, it can be tempting to spend an unnecessarily long time on it, and it tends to divert your attention from more important stuff that could potentially deliver even better performance.
5. There are more elegant ways to optimize for performance without caching. For example you could use different databases (e.g: MongoDB for data that doesn't change very much and Redis for data that changes very frequently). This can make the code simpler and more manageable, since one database is better for certain tasks than another.
As for the cookies, you can do without them and should in many cases and store the data client-side, since client can be a browser or a mobile app.
One other thing is that you can write your app as a simple JSON REST API backend and write all the frontend in javascript (including HTML and CSS). It will seem contrived at the beginning and maybe hard (as is anything that you're doing for the first time).
And this approach has many advantages:
1. The user interface is more interactive and responsive.
2. Frontend frameworks are, obviously, better at the front-end than even the best backend frameworks (e.g: Play!, Rails).
3. If you want to use a different backend framework or language you don't have relearn how to develop the frontend and vice-versa.
4. With some work you can even do the rendering server-side if you need to.
5. You can more easily find better tools for the frontend, and still use them even if you changed your backend framework and vice-versa.
A couple years ago I wrote a Rails app and noticed that almost the majority of my time is spent on frontend stuff and even the frontend wasn't good enough. These days for a good UX on the browser you need to write lots of javascript, so why not write the whole frontend in javascript.