Overview

The Message Queue project is part of Project Sagrada,
providing a service for applications to queue messages for clients.

Queuey provides queues for applications, accessible via a RESTful HTTP API,
that can optionally be secured with a BrowserID assertion.

Queues are lists of messages ordered by time of insertion with several basic
operations available:

Create a queue (optionally with a BrowserID assertion to allow that user to read and/or delete their own messages in the queue)

Insert messages into a queue (optionally spread over queue partitions when the queue is anticipated to be used for worker processes)

Read messages from a queue, in ascending or descending order

Designate a starting timestamp to query only messages before or after a specific time

Delete a message in a queue

Remove a queue

Show information about a queue

Set TTLs controlling how long messages remain in a queue before being removed
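The operations above map naturally onto HTTP verbs. As a rough sketch only: the endpoint paths, parameter names, and payload shape below are illustrative assumptions, not Queuey's actual routes.

```python
# Illustrative mapping of queue operations to (HTTP method, path, params).
# All routes and parameter names here are assumed for the sketch; consult
# the deployed API for the real URLs.

def queue_request(operation, queue_name, **params):
    """Map a queue operation name to an (HTTP method, path, params) triple."""
    routes = {
        "create": ("PUT", "/queue/{name}"),
        "insert": ("POST", "/queue/{name}/messages"),
        "read": ("GET", "/queue/{name}/messages"),
        "delete": ("DELETE", "/queue/{name}/messages/{key}"),
        "remove": ("DELETE", "/queue/{name}"),
        "info": ("GET", "/queue/{name}"),
    }
    method, path = routes[operation]
    path = path.format(name=queue_name, key=params.pop("key", ""))
    return method, path, params

# Read messages newer than a timestamp, in ascending order:
method, path, params = queue_request(
    "read", "user-123", since=1318273824.0, order="ascending", limit=100)
```

A real client would issue the resulting request with any HTTP library and pass the BrowserID assertion where the queue requires one.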

Queuey is built to accommodate millions of queues in a single deployment, with
a message guarantee that is generally "Deliver once, and exactly once." These
guarantees can be altered based on deployment configuration.

Engineers

Ben Bangert

Hanno Schlichting

Architecture

Queue Web Application

queuey (Python web application providing RESTful API)

Queue Storage Backend

Cassandra

Coordination Service

(Only used when consuming from the queue)

Apache Zookeeper

Queue Consumption

* Consumers coordinate with coordination service
* Consumers split queues/partitions amongst themselves
* Consumers record in coordination service the farthest they've processed
in a queue/partition
* Consumers rebalance queue/partition allocation when consumers are added or
removed using coordination service
* Consumption and rebalancing are done entirely client-side
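The split-and-rebalance steps above can be sketched as a deterministic assignment: given the list of live consumers (as reported by the coordination service) and a queue's partitions, every consumer computes the same ownership map, so no partition ends up with two readers. The round-robin strategy and names here are illustrative assumptions.

```python
# Client-side rebalance sketch: deterministically divide partitions over
# the sorted consumer ids, so every consumer independently computes the
# same plan after a membership change.

def assign_partitions(consumers, partitions):
    """Round-robin sorted partitions over the sorted consumer ids."""
    consumers = sorted(consumers)
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

# Three consumers splitting five partitions; no partition has two readers.
plan = assign_partitions(["c1", "c2", "c3"], [1, 2, 3, 4, 5])
```

When a consumer joins or leaves, each remaining consumer reruns the same function on the new membership list and picks up its new partitions, which is why locking is only needed around membership changes.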

Queuey is composed of two core parts:

A web application (handles the RESTful API)

Storage back-end (used by the web application to store queues/messages)

The storage back-end is pluggable, and the current default storage back-end is
Cassandra.

Unlike other MQ products, Queuey does not hand out messages amongst all clients
reading from the same queue. Every client is responsible for tracking how far
it has read into the queue. If only a single client should see each message,
the queue should be partitioned, and clients should decide amongst themselves
who reads which partition of the queue.

Different performance and message guarantee characteristics can be configured by
changing the deployment strategy and the Cassandra replication and read / write
options. Multiple Queuey deployments may therefore be necessary, and applications
should use the deployment with the desired operational characteristics.

Since messages are stored in and read from Cassandra by timestamp, and Cassandra
is only eventually consistent, there is an enforced delay in how soon a message
is available for reading, to ensure a consistent picture of the queue. This
helps ensure that written messages show up when the queue is read, avoiding
'lost' messages caused by reading past the point where they later appear in the
queue.

For more on the probabilities involved with various replication
factors and differing read/write consistency levels, see:

In the event that clients delete messages from their queue as they read them,
the delay is unimportant. Enforcing a proper delay is only required when clients
read but never delete messages (and thus track how far into the queue they have
read by timestamp).
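The enforced delay amounts to a "safe read horizon": a reader that tracks its position by timestamp only reads messages older than `now - delay`, so a late-arriving write cannot land behind a position the reader has already passed. The delay value and function names below are illustrative assumptions; the real value is deployment-specific.

```python
# Sketch of the enforced read delay for position-tracking readers.
# READ_DELAY_SECONDS is an assumed, deployment-specific value that would
# depend on replication factor and consistency levels.

READ_DELAY_SECONDS = 5.0

def safe_read_horizon(now, delay=READ_DELAY_SECONDS):
    """Latest timestamp a position-tracking reader may safely read up to."""
    return now - delay

def readable(messages, now, delay=READ_DELAY_SECONDS):
    """Filter (timestamp, body) pairs to those inside the safe horizon."""
    horizon = safe_read_horizon(now, delay)
    return [m for m in messages if m[0] <= horizon]

msgs = [(100.0, "a"), (103.0, "b"), (107.0, "c")]
visible = readable(msgs, now=110.0)  # horizon is 105.0; "c" is held back
```

A client that deletes messages as it consumes them can skip the horizon entirely, matching the paragraph above.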

Queues that are to be consumed by workers must be declared up-front as
partitioned queues. The number of partitions should also be specified, and
new messages will be partitioned randomly. If messages must be processed in
order, they can be inserted into a single partition to enforce ordering. All
messages that are randomly partitioned should be considered loosely ordered.
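The partitioning rule above can be sketched as a small write-side helper: spread messages randomly across the declared partitions for throughput, or pin them to one partition when strict ordering matters. The function name and the convention of pinning to partition 1 are illustrative assumptions.

```python
import random

# Sketch of write-side partition selection: random for loose ordering and
# better spread, or a fixed partition when insertion order must be kept.

def choose_partition(num_partitions, ordered=False, pinned=1, rng=random):
    """Pick a partition number in [1, num_partitions]."""
    if ordered:
        return pinned  # a single partition preserves insertion order
    return rng.randint(1, num_partitions)  # loose ordering across partitions

p = choose_partition(8, ordered=True)  # always partition 1
```

Because random placement interleaves messages across partitions, consumers reading different partitions see them only loosely ordered, as noted above.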

Queue Workers

A worker library called Qdo handles coordinating and processing messages off
queues in a job processing setup. The Qdo library utilizes Apache Zookeeper for
worker and queue coordination.

Workers coordinate to divide up the queues, and the partitions in each queue,
so that no queue/partition has multiple readers. This avoids the need for
read-locking a queue; the read position for each host+queue+partition is stored
in Zookeeper.

This model is based exactly on how Apache Kafka workers divide up queues to
work on.

Initial User Requirements

Notifications

The Notifications project needs Queuey for storing
messages on behalf of users that want to receive them. Each user gets their own
queue.

If a user needs to get notifications for general topics, the Notifications
application will create a queue and clients will poll multiple queues.

The first version could be considered a Message Store rather than a
queue, as it supports a much richer set of query semantics and does not let
public consumers remove messages. Messages can only be removed by the App
deleting the entire queue or by expiring.

Requirements:

Service App can create queues

Service App can add messages to a queue

Messages on queues expire

Clients may read any queue they are aware of

Socorro

The second version allows authenticated applications to queue messages and for
clients to consume them. This model allows for a worker model where jobs are
added to a queue that multiple clients may be watching, and each message
will be given to an individual client.

Requirements:

Service App can create queues, which can be partitioned

Service App can add messages to a queue, and specify a partition to retain ordering

Clients can ask for information about how many partitions a queue has

Clients may read any queue, and any of its partitions, that they are aware of

API

Design Background

An initial search was done of available message queue products to determine if
any of them would be appropriate for Sagrada. Generally, most MQs do not
treat message persistence as important, don't have a good RESTful
API, and/or aren't built with the assumption that millions of queues will be
used at once. Having a configurable MQ whose message
delivery guarantees and availability options can easily be toggled is quite
important, so Queuey is unique in that deployment and configuration drastically
alter the actual MQ one works with.

For scalability purposes, since some messages may be retained for periods of
time per the Notifications requirements and millions of queues will be required,
the initial choice for the backend is Cassandra. As Cassandra also provides
fast reads with internal caching (similar to memcached), using it in single-node
mode for Socorro should also work well.

When used for Notifications, each user will have a single queue per Notification
Application. This helps ensure that even for an individual user receiving many
messages, the messages are partitioned by Notification Application, giving
a more level distribution rather than a single deep queue.

Under the Socorro use-case, massive amounts of messages will most likely be
intended for the same queue, which would normally result in a
single extremely deep queue. Deep queues do not scale well horizontally,
especially with Cassandra, which maximizes throughput across many
row keys.

The other issue is that to consume messages, there are only two effective ways
of marking a message as consumed:

1. The message may be deleted after it has been sent. The problem in
this case is that there is still no guarantee it has been processed, and
acquiring a lock to consume a message, plus the resulting write-back to
delete it, is expensive.

2. One consumer per queue/partition. This requires some up-front guess-work
about how many consumers there will be. Since consumers can consume from
multiple queues/partitions at once, a queue needs at least as many
partitions as desired consumers. The advantage of this
approach is that locking is only necessary when adding/removing consumers,
to ensure one consumer per partition.

Queuey goes with the second option, and only incurs the lock during consumer
addition/removal. The MessageQueue also does not track
the state or last message read in each queue/partition; it is the consumer's
responsibility to track how far it has read and processed successfully. There
is an API call available to record with the MessageQueue how far a consumer
has successfully processed in a queue/partition.
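The consumer's side of that responsibility can be sketched as a small progress tracker: remember the timestamp of the last message processed per queue/partition, and periodically report it via the record API. The in-memory class below is an illustrative assumption, not Queuey's client library.

```python
# Sketch of consumer-side progress tracking per queue/partition.  A real
# consumer would periodically flush these positions to the MessageQueue's
# record API (or to Zookeeper, in the Qdo setup).

class ProgressTracker:
    def __init__(self):
        self._positions = {}  # (queue, partition) -> last processed timestamp

    def mark_processed(self, queue, partition, timestamp):
        key = (queue, partition)
        # Only ever move forward, so a stale report cannot rewind progress.
        self._positions[key] = max(timestamp, self._positions.get(key, 0.0))

    def position(self, queue, partition):
        """Farthest timestamp successfully processed; 0.0 if none yet."""
        return self._positions.get((queue, partition), 0.0)

t = ProgressTracker()
t.mark_processed("crashes", 3, 1000.5)
t.mark_processed("crashes", 3, 999.0)  # stale update; position is unchanged
```

On restart or rebalance, a consumer would fetch the recorded position and resume reading the partition from just after that timestamp.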