How to write unit tests for Celery task chains

Celery chains allow you to modularise your application and reuse common Celery tasks.
A classic use case is a market data system we at Zoomer Analytics built
for a hedge fund client.

The aim was to consume market data from different data vendors such as
Bloomberg or Reuters. The APIs came in all kinds and shapes, but ultimately the data ended up in the same
database table.

By chaining the Celery tasks we only had to build a small number of specialised feed and data transformation tasks for
each vendor but could reuse common tasks such as deserialising and writing the data.

An example

Let's say we want to download cryptocurrency timeseries data from a number of different APIs and calculate
the moving averages for each of these timeseries.

For instance, one of these timeseries would be the Bitcoin Price Index, available via the Coindesk API.

After careful consideration of reusability and separation of concerns, we decide to implement a generic Celery
task to calculate the moving average, expecting as parameters a list of dicts - [{"date": "2018-05-01", "value": 1000.0}, {"date": "2018-05-02", "value": 1003.5}, ...] - and the number of days
we want to calculate the moving average over.
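
To make the contract concrete, here is a minimal sketch of the calculation such a task could wrap. It is an
illustration under assumptions, not the original implementation: the function name and the choice to label each
average with the last date of its window are ours, and the function is left as plain Python (in the application
it would be registered as a Celery task) so that it runs without a broker.

```python
def calculate_moving_average(timeseries, window):
    """Return the simple moving average over `window` consecutive points.

    `timeseries` is a list of {"date": ..., "value": ...} dicts, oldest first.
    Assumption: in the real application this function would be registered as
    a Celery task; it is undecorated here so the sketch runs on its own.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    # Slide a window of `window` points over the series, one step at a time.
    for end in range(window, len(timeseries) + 1):
        chunk = timeseries[end - window:end]
        mean = sum(point["value"] for point in chunk) / window
        # Assumption: each average is labelled with the last date in its window.
        averages.append({"date": chunk[-1]["date"], "value": mean})
    return averages
```

Because the timeseries is the first positional argument, the function can receive the result of a preceding
task in a chain, with the window supplied when the chain is built.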

We can now implement any data feed Celery task for any given API; as long as that task returns a list of dicts of
the expected format [{"date": "2018-05-01", "value": 1000.0}, {"date": "2018-05-02", "value": 1003.5}, ...],
we can simply chain these tasks to be able to calculate the moving average for any API result. For instance,
the Celery task for the Bitcoin Price Index Coindesk feed only has to call the API and return its data in this format.
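
A feed task of this kind could be sketched as follows. Treat everything vendor-specific here as an assumption:
the endpoint URL, the start and end query parameters and the {"bpi": {date: price}} response shape are what we
assume the Coindesk API to look like, and bpi_to_timeseries is a hypothetical helper name. The sketch uses only
the standard library and keeps the conversion separate from the download so it can be tested without a network.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; check the vendor documentation before relying on it.
COINDESK_URL = "https://api.coindesk.com/v1/bpi/historical/close.json"


def bpi_to_timeseries(payload):
    """Convert an assumed {"bpi": {date: price}} payload to the shared format."""
    return [
        {"date": date, "value": float(price)}
        for date, price in sorted(payload["bpi"].items())
    ]


def fetch_bitcoin_price_index(start_date, end_date):
    """Download the index for a date range and return a list of dicts.

    Assumption: in the real application this function would be registered
    as a Celery task and placed at the start of the chain.
    """
    query = urlencode({"start": start_date, "end": end_date})
    with urlopen(f"{COINDESK_URL}?{query}", timeout=10) as response:
        return bpi_to_timeseries(json.load(response))
```

Any other vendor feed only needs its own download and conversion step; everything downstream stays the same.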

How do we go about testing Celery chains? As usual, there is more than one answer. Let's have a look at two
different strategies and discuss which one makes sense in which context.

Mocking the Celery chain

Previously, we discussed the importance of unit-testing Celery tasks.
Assuming we have our Celery tasks test-covered, the only thing we are really interested in when it comes
to testing chained tasks is that the chain itself does the right thing.

In other words, we need to test, wherever we invoke a Celery chain, that the individual tasks are called in the
correct order with the correct arguments. Whatever happens inside each task is not our concern, as that is already
covered by the tasks' own unit tests.

In our example, the chain is invoked in a Flask view. Hence, this is what we need to write the test for (another common
approach would be to implement a dedicated method that invokes the Celery chain and write tests against that method).

By mocking the actual Celery chain and the Celery tasks inside the chain, we are able to assert the
order in which the tasks are called and their respective arguments.
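
The following sketch shows the shape of such a test. It is written so that it runs without Celery installed,
which means a few things are stand-ins: app_module is a hypothetical placeholder for the module that invokes the
chain, and start_workflow is a hypothetical dedicated function playing the role of the view. In a real project
the three attributes would be celery.chain and the two task objects, and you would patch them by import path
(for example patch("app.chain")) rather than with patch.object.

```python
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for the application module. In a real project these
# attributes would be `celery.chain` and the two Celery task objects.
app_module = SimpleNamespace(
    chain=None,
    fetch_bitcoin_price_index=None,
    calculate_moving_average=None,
)


def start_workflow(start_date, end_date, window):
    """Build the chain and send it off, the way the view would."""
    workflow = app_module.chain(
        app_module.fetch_bitcoin_price_index.s(start_date, end_date),
        app_module.calculate_moving_average.s(window=window),
    )
    return workflow.delay()


def test_chain_is_built_and_sent():
    with patch.object(app_module, "chain") as mock_chain, \
            patch.object(app_module, "fetch_bitcoin_price_index") as mock_fetch, \
            patch.object(app_module, "calculate_moving_average") as mock_average:
        start_workflow("2018-05-01", "2018-05-08", window=3)

    # Each task signature was created exactly once with the expected arguments.
    mock_fetch.s.assert_called_once_with("2018-05-01", "2018-05-08")
    mock_average.s.assert_called_once_with(window=3)
    # The signatures were handed to chain() in the right order ...
    mock_chain.assert_called_once_with(
        mock_fetch.s.return_value, mock_average.s.return_value
    )
    # ... and the resulting chain was actually dispatched.
    mock_chain.return_value.delay.assert_called_once_with()
```

Nothing inside the tasks runs in this test, so it stays fast and needs neither a broker nor network access.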

Testing the chain synchronously

There is an alternative approach. Instead of mocking the chain and the tasks, we can test the whole lot in one go by
calling the task chain synchronously (in the same way we did for unit-testing individual Celery tasks).

Here we use responses to mock out the requests call, but other than that
we let our test actually execute the chain and the two chained tasks themselves. If you do (and you should) unit-test
your Celery tasks, you end up with redundant tests, which in turn are a drag on your development flow. However, this
test setup might make sense if you have Celery tasks that are always called as part of a chain.
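
The shape of such an end-to-end test can be sketched without a broker, and for illustration even without Celery.
The block below is a stand-in under stated assumptions: run_synchronously is a hypothetical helper that mimics how
a chain hands each result to the next step (with Celery you would build the chain from task signatures and call
.apply() on it), the two steps are plain functions rather than registered tasks, and the HTTP call is made with
urllib and mocked with unittest.mock instead of requests and responses. The endpoint and payload shape are
assumed as well.

```python
import io
import json
import urllib.request
from functools import partial
from unittest.mock import patch

URL = "https://api.coindesk.com/v1/bpi/historical/close.json"  # assumed endpoint


def fetch_bitcoin_price_index(start_date, end_date):
    """Feed step: download the index and return the shared list-of-dicts format."""
    with urllib.request.urlopen(f"{URL}?start={start_date}&end={end_date}") as response:
        bpi = json.load(response)["bpi"]  # assumed shape: {"bpi": {date: price}}
    return [{"date": date, "value": price} for date, price in sorted(bpi.items())]


def calculate_moving_average(timeseries, window):
    """Generic step: simple moving average over `window` consecutive points."""
    return [
        {
            "date": timeseries[end - 1]["date"],
            "value": sum(p["value"] for p in timeseries[end - window:end]) / window,
        }
        for end in range(window, len(timeseries) + 1)
    ]


def run_synchronously(first_step, *next_steps):
    """Hypothetical stand-in for applying a chain eagerly in the test process."""
    result = first_step()
    for step in next_steps:
        result = step(result)  # each step receives the previous result first
    return result


def test_chain_end_to_end():
    payload = {"bpi": {"2018-05-01": 1.0, "2018-05-02": 2.0, "2018-05-03": 6.0}}
    fake_response = io.BytesIO(json.dumps(payload).encode())
    # Only the network call is replaced; both steps run for real.
    with patch("urllib.request.urlopen", return_value=fake_response) as mock_urlopen:
        result = run_synchronously(
            partial(fetch_bitcoin_price_index, "2018-05-01", "2018-05-03"),
            partial(calculate_moving_average, window=2),
        )
    mock_urlopen.assert_called_once()
    assert result == [
        {"date": "2018-05-02", "value": 1.5},
        {"date": "2018-05-03", "value": 4.0},
    ]
```

Note how the assertion is on the final output only: a failure tells you that something in the pipeline broke,
but not which step, which is one more reason to keep the per-task unit tests.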

How to apply this

Asynchronously linking Celery tasks via task chains is a powerful building block for constructing complex workflows (think Lego). Testing
Celery chains is as important as unit-testing individual Celery tasks. Mocking the Celery chain and the chained tasks
is an easy and effective way to stay on top of your Celery workflow, however complex.