Background jobs in AWS for Ruby

Jan 16, 2018

While you can run the classic resque+redis or sidekiq+redis duos on AWS, you can also use some AWS services to simplify both the setup and the day-to-day work.

What would replace Redis (for the queue part)

Redis is primarily a key/value store, and we use it extensively in many aspects of Rails apps nowadays. It does an OK job of storing the queues for resque and sidekiq, but that’s not really its primary function.

Before AWS, the go-to solution for handling queues of messages to process was RabbitMQ. It works OK, but you need to set up at least three instances of it to be sure it’s reliable and won’t cause trouble when one server falls over. Also, if one queue goes berserk and messages pile up due to some incident in the application, you might run out of memory and lose all your queues.

AWS has a pair of services that can be used to replace RabbitMQ: SNS and SQS. SNS is the “pub” part of “pub/sub” while SQS is the “sub” part. SNS exposes topics to which messages are published. Topics have subscribers that can be of different kinds (email, push notification, SQS queues, …).
SQS queues subscribed to the topic then receive the messages, which are ready to be processed.

You “just” need software polling that SQS queue and doing what needs to be done for each message (pick, process, delete).

Sidekiq, Resque and SQS

Neither Sidekiq nor Resque can use SQS as a source of messages, but Sidekiq inspired some people to create
Shoryuken, which brings all the basic boilerplate to the gig.

Once your friendly devops person (maybe yourself?) has created the topic and the queue, you can start
feeding messages in. Now, what does a worker look like? Like so:

It’s very similar to Sidekiq, with a couple of catches. The first one is that you need to mark the message for deletion once the job is done.
If you don’t, the message will reappear in the queue and be processed again. If it reappears more often than the maximum number of retries configured for the queue,
the message is passed to the Dead Letter Queue.
To be clear: you don’t need to mark a message as bad if the job fails. Just don’t delete the message: it will go back into the
queue, be tried again as many times as configured for the queue, and then be passed to the Dead Letter Queue. How will you be aware of the failure? If an
exception is thrown, you should be notified by your exception catcher. For any other kind of issue, you need your own way of being notified.

You can define the queue in the config file or in the worker, and using a URL is probably better. See the Shoryuken documentation.

Auto deleting

What about errors and messages that fail?

Exceptions should be handled properly by your exception catcher, and you should log as usual inside the worker. So
big errors will be taken care of that way. Yet one might be interested in knowing which messages
failed to be processed.

When an SQS queue is set up, a number of retries per message is defined, along with a “Dead Letter Queue”. See the AWS documentation.
The first ensures the message will be tried again (or not, depending on whether you set a number higher than 0), and the
second defines another SQS queue where failed messages are stored for some time, allowing your team to have a look and do something about them.

As pointed out earlier, a message that is retried more often than the configured maximum is passed to the Dead Letter Queue.
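To make the retry and Dead Letter Queue flow concrete, here is a small in-memory simulation. It is not the AWS API: `SimulatedQueue` and its methods are hypothetical, and `max_receive_count` only plays the role of the queue's redrive setting.

```ruby
# In-memory illustration of the retry / dead-letter behaviour described above.
# This is NOT the AWS SDK; every name here is hypothetical.
class SimulatedQueue
  attr_reader :messages, :dead_letters

  def initialize(max_receive_count:)
    @max_receive_count = max_receive_count
    @messages = []      # each entry: { body: ..., receive_count: ... }
    @dead_letters = []
  end

  def push(body)
    @messages << { body: body, receive_count: 0 }
  end

  # Delivers every pending message once. The block returns true when the job
  # succeeded (the message is deleted) and false when it failed (the message
  # goes back into the queue).
  def poll
    pending = @messages
    @messages = []
    pending.each do |msg|
      # Already received the maximum number of times: move it aside.
      if msg[:receive_count] >= @max_receive_count
        @dead_letters << msg[:body]
        next
      end

      msg[:receive_count] += 1
      @messages << msg unless yield(msg[:body])
    end
  end
end

queue = SimulatedQueue.new(max_receive_count: 3)
queue.push("bad payload")

attempts = 0
4.times { queue.poll { |_body| attempts += 1; false } } # the job always fails

# attempts is 3 and queue.dead_letters is ["bad payload"]
```

The worker never has to flag the message as failed: not deleting it is enough.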

Words of warning (on SQS)

SQS provides at-least-once message delivery, which means each job you publish could end up being delivered more than once by SQS.
This means your worker should be idempotent: each run of a job should produce the same result. If a message is processed twice, then
your code needs to produce the same result and, for example, not create the same object twice.
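One common way to get there is to remember which messages have already been handled and to skip duplicates. A minimal sketch, assuming each message carries a stable identifier; `IdempotentHandler` is hypothetical, and the in-memory `Set` stands in for something durable such as a unique index in your database:

```ruby
require "set"

# Hypothetical handler that ignores messages it has already processed.
class IdempotentHandler
  attr_reader :created

  def initialize
    @processed_ids = Set.new # stand-in for a durable store
    @created = []
  end

  def handle(message_id, body)
    return :skipped if @processed_ids.include?(message_id)

    @created << body             # the actual work
    @processed_ids << message_id # only recorded once the work succeeded
    :processed
  end
end

handler = IdempotentHandler.new
handler.handle("msg-1", "order #12") # => :processed
handler.handle("msg-1", "order #12") # => :skipped, nothing created twice
```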

A standard SQS queue doesn’t guarantee the order in which messages will be delivered; it only guarantees that each message will be delivered at least once. So
be careful, and don’t rely on a standard queue to give you the right order for a sequence of messages.
If order is very important, you should look at other solutions, such as SQS FIFO queues or AWS Kinesis.

From sidekiq

AWS configuration

As Shoryuken uses AWS, you need to configure your AWS client in the code. This can be done
through environment variables, an AWS credentials file, settings passed when initializing
Shoryuken, or IAM roles and profiles.
We usually prefer the latter, as it leaves the developers with a simpler job, and the devops
team will ensure that the instances running the workers have the rights to access the AWS resources
they need anyway.

Quick example

We usually like to have standalone worker services, so that the code base is light, very dumb, and does only a few things but does them well.

So the worker codebase is kept minimal, with just enough code to pull the data from the data store, the code to process that data,
and the code to put the result where it needs to go. We can have a simple class such as:
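A minimal sketch of such a class, assuming the data store and the destination are passed in as objects responding to `fetch(id)` and `store(id, result)`. `WordCountJob` and its word-counting logic are purely illustrative:

```ruby
# Hypothetical standalone job: pull, process, push.
# `source` must respond to fetch(id), `sink` to store(id, result).
class WordCountJob
  def initialize(source:, sink:)
    @source = source
    @sink = sink
  end

  def call(id)
    text = @source.fetch(id)  # pull the data from the data store
    count = text.split.size   # process it
    @sink.store(id, count)    # put the result where it needs to go
    count
  end
end

# Plain hashes respond to both fetch and store, which is handy for trying it out.
source = { "doc-1" => "background jobs on AWS" }
sink = {}
WordCountJob.new(source: source, sink: sink).call("doc-1")
# sink is now { "doc-1" => 4 }
```

Keeping the class free of any queue-specific code means the worker's `perform` method only has to build the job and call it.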

Running this in AWS with Docker and an ASG

We usually run background workers such as these in a very rudimentary but scalable way.
We set up an AWS AutoScalingGroup (ASG) with a Launch Configuration that starts EC2 instances
preinstalled with Docker and a little script able to pull the latest version of the background
workers container.

The idea is simple: the ASG is tied to the size of the SQS queue to process, and as soon as the count
of messages is above 0 the ASG starts an EC2 instance. In the same manner, if the number
of messages in the queue stays above 50 (for example) for more than 5 minutes, another instance is added.
If the number of messages stays below 10 for 5 minutes, an instance is removed. And if there are 0 messages
in the queue for more than 10 minutes, the last instance is removed.
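In practice these rules live in CloudWatch alarms on the queue's `ApproximateNumberOfMessagesVisible` metric, attached to scaling policies on the ASG, not in application code. The Ruby function below only spells out the decision table, using the example thresholds above; `scaling_action` and its parameters are hypothetical.

```ruby
# Decision table for the scaling rules described above (illustration only;
# the real rules are CloudWatch alarms plus ASG scaling policies).
#
# instances:        number of worker instances currently running
# queue_depth:      number of messages waiting in the queue
# minutes_in_state: how long the queue depth has stayed on the same side of
#                   the threshold being evaluated
def scaling_action(instances:, queue_depth:, minutes_in_state:)
  if instances.zero?
    # Start the first instance as soon as there is anything to process.
    queue_depth.positive? ? :add_instance : :none
  elsif queue_depth > 50 && minutes_in_state > 5
    :add_instance
  elsif queue_depth.zero? && minutes_in_state > 10
    :remove_instance # this is the rule that can remove the last instance
  elsif queue_depth < 10 && minutes_in_state >= 5 && instances > 1
    :remove_instance
  else
    :none
  end
end

scaling_action(instances: 0, queue_depth: 3, minutes_in_state: 0)  # => :add_instance
scaling_action(instances: 2, queue_depth: 80, minutes_in_state: 6) # => :add_instance
scaling_action(instances: 1, queue_depth: 0, minutes_in_state: 11) # => :remove_instance
```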

This is a very hands-free approach. There is more to say about how to handle deploys and so on, but the gist
of it is there.

How do you make the container run?

First, you need something to start a worker. As we define the queue to look into in the worker itself, we can
start the worker by using the following command in a file: