Scaling Your Dyno Formation

All apps on Heroku use a process model (via Procfile) that lets them scale up or down instantly from the command line or Dashboard. Each app has a set of running dynos, managed by the dyno manager, which are known as its dyno formation.

Scaling

Dynos are prorated to the second, so if you want to experiment with different configurations, you can do so and be billed only for the seconds each dyno actually runs. Remember, it’s your responsibility to set the correct number of dynos and workers for your app.

A web app typically has at least web and worker process types. You can set the concurrency level for either one by adjusting the number of dynos running each process type with the ps:scale command:
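For example, the following session (illustrative; the exact output and dyno size shown will vary by app) scales the web process type to two dynos and then adds a worker dyno:

```shell
$ heroku ps:scale web=2
Scaling dynos... done, now running web at 2:Standard-1X

$ heroku ps:scale worker=1
Scaling dynos... done, now running worker at 1:Standard-1X
```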

Dyno formation

The term dyno formation refers to the layout of your app’s dynos at a given time. The default formation for a simple app is a single web dyno, whereas more demanding applications may consist of web, worker, clock, and other process types. For example, a formation might first be changed to two web dynos, and then to two web dynos plus a worker.

The scale command affects only process types named in the command. For example, if the app already has a dyno formation of two web dynos, and you run heroku ps:scale worker=2, you will now have a total of four dynos (two web, two worker).
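For instance, with two web dynos already running, scaling the worker process type leaves the web dynos untouched (an illustrative session; output will vary):

```shell
$ heroku ps:scale worker=2
Scaling dynos... done, now running worker at 2:Standard-1X
```

The formation is now two web dynos and two worker dynos.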

Listing dynos

The current dyno formation can always be seen via the heroku ps command:
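A hypothetical Ruby app with the formation from the example above might show output along these lines (the process commands, dyno size, and timestamps are illustrative):

```shell
$ heroku ps
=== web (Standard-1X): bundle exec puma -C config/puma.rb (2)
web.1: up 2023/03/01 12:00:03 (~ 1h ago)
web.2: up 2023/03/01 12:00:07 (~ 1h ago)

=== worker (Standard-1X): bundle exec rake jobs:work (2)
worker.1: up 2023/03/01 12:10:02 (~ 1h ago)
worker.2: up 2023/03/01 12:10:04 (~ 1h ago)
```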

Introspection

Scaling events are recorded in your app’s log stream. Note that the logged message includes the full dyno formation, not just the dynos mentioned in the scale command.
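After scaling workers on an app that already runs two web dynos, the platform-emitted log line might look like this (illustrative timestamp and email; note that it lists every process type, not only worker):

```shell
$ heroku logs
2023-03-01T12:10:00+00:00 heroku[api]: Scale to web=2, worker=2 by user user@example.com
```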

Understanding concurrency

Singleton process types, such as a clock/scheduler process or a process that consumes the Twitter streaming API, should never be scaled beyond a single dyno. They can’t benefit from additional concurrency; in fact, running more than one will create duplicate records or events in your system as each dyno tries to do the same work at the same time.

Scaling up a given process type gives you more concurrency for the type of work handled by that process type. For example, adding more web dynos allows you to handle more concurrent HTTP requests, and therefore higher volumes of traffic. Adding more worker dynos will let you process more jobs in parallel, and therefore higher volumes of jobs.

There are circumstances where creating more dynos to run your web, worker, or other process types won’t help. One of these is bottlenecks on backing services, most commonly the database. If your database is a bottleneck, adding more dynos may actually make the problem worse. Instead, optimize your database queries, upgrade to a larger database, use caching to reduce load on the database, or switch to a sharded or read-slave database configuration.

Another circumstance where increased concurrency won’t help is long requests or jobs: for example, a slow HTTP request such as a report whose database query takes 30 seconds, or a job that emails a newsletter to 20k subscribers. Concurrency gives you horizontal scale, meaning it applies to work that can be subdivided, not to large, monolithic blocks of work.

The solution to the slow report might be to move the report calculation into the background and cache the results in memcache for later display. For the long job, the answer is to subdivide the work: create a single job that fans out by putting 20k jobs (one per newsletter to be sent) onto the queue. A single worker can consume all these jobs in sequence, or you can scale up to multiple workers to consume them more quickly. The more workers you add, the sooner the entire batch finishes.
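The fan-out step can be sketched as follows. This is a minimal illustration that uses a plain text file as a stand-in for a real job queue (such as Redis); the queue path and job names are hypothetical, and a real fan-out job would enqueue into whatever queueing backend your workers consume.

```shell
# Minimal fan-out sketch: a temp file stands in for a real job queue.
QUEUE=$(mktemp)

# The single fan-out job: enqueue one small, independent job per subscriber.
# (20 subscribers here for brevity; the pattern is the same for 20k.)
for id in $(seq 1 20); do
  echo "send_newsletter:subscriber-$id" >> "$QUEUE"
done

# Each line is now one job; any number of workers can consume them in parallel.
echo "queued $(wc -l < "$QUEUE") jobs"
```

Because every enqueued job is independent, adding worker dynos divides the remaining queue among more consumers without any coordination between them.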