Zencoder: 100% Integration Reliability

This topic discusses reasons you might not be able to connect to Zencoder and how to ensure reliable integration.

Overview

Zencoder is an essential software dependency for most of our customers. And while we aim for 100% uptime, there may be times when you can't connect to Zencoder:

Our service might be affected by problems at an upstream provider (e.g. Amazon Web Services)

We occasionally need to perform system maintenance that requires temporary downtime

When this happens and Zencoder is down, your application will typically get a '503 Service Unavailable' response from Zencoder, but you could get a different error (like a 500). If you have exceeded your API rate limit, you will get a '403 Rate Limit Exceeded' response.

The good news: since video encoding is an asynchronous process, you can build your application so that it never experiences downtime or problems related to our availability. If you do this, the worst case is that your jobs take a bit longer, but no errors occur. We highly recommend that you do this.

To put it more strongly, if you care about reliability, you should follow this approach to integration - for Zencoder, or for any critical API that you integrate with.

Reliable app integration

Include a Secondary URL as a backup in case upload to your primary location fails.

If you get a non-successful response code from Zencoder - basically, something other than a 200 or 201 - don't fail the job. A response code of 503 doesn't mean that your video can't be processed. It just means that Zencoder is temporarily unavailable.

If you get a connection error when trying to connect to Zencoder, do the same thing.

Similarly, wrap your API requests in a timeout. We recommend a 30-second timeout; Zencoder usually responds in less than a second, so 30 seconds is plenty of time.

In all three of these cases - if you get a non-successful response code, can't connect, or the API request times out - flag the job as 'pending'.

Periodically, resubmit any jobs in the 'pending' state. You could use cron to do this every minute, for instance.

Once the jobs are resubmitted, everything behaves like normal. This way, a failed job submission only makes the job take a little longer rather than causing trouble for your application or your users.
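
The steps above can be sketched in Ruby roughly as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the Video struct, the state names, and the submit, resubmit_pending and http_transport helpers are all hypothetical, and the endpoint URL and API key header should be checked against the API reference. The HTTP call is passed in as a callable so the retry logic can be exercised without touching the network.

```ruby
require "json"
require "net/http"
require "openssl"
require "timeout"
require "uri"

# Hypothetical stand-in for a row in the videos table.
Video = Struct.new(:id, :input_url, :state, :zencoder_job_id)

# Assumed endpoint and header name; confirm both against the API reference.
JOBS_URI = URI("https://app.zencoder.com/api/v2/jobs")

# Builds a callable that POSTs a job and returns [status_code, body].
def http_transport(api_key)
  lambda do |input_url|
    http = Net::HTTP.new(JOBS_URI.host, JOBS_URI.port)
    http.use_ssl = true
    request = Net::HTTP::Post.new(JOBS_URI.path,
                                  "Content-Type" => "application/json",
                                  "Zencoder-Api-Key" => api_key)
    request.body = JSON.generate({ "input" => input_url })
    response = http.request(request)
    [response.code.to_i, response.body]
  end
end

# Errors treated as "could not reach Zencoder" rather than "job failed".
CONNECTION_ERRORS = [Timeout::Error, SocketError, SystemCallError,
                     IOError, OpenSSL::SSL::SSLError].freeze

# Submits one video. Any non-200/201 response, connection error, or
# timeout leaves the video in the 'pending' state instead of failing it.
def submit(video, transport:, timeout_seconds: 30)
  status, body = Timeout.timeout(timeout_seconds) do
    transport.call(video.input_url)
  end
  if [200, 201].include?(status)
    video.zencoder_job_id = JSON.parse(body)["id"]
    video.state = "submitted"
  else
    video.state = "pending"
  end
  video
rescue *CONNECTION_ERRORS
  video.state = "pending"
  video
end

# Retries every video still marked 'pending'; intended to run periodically.
def resubmit_pending(videos, transport:)
  videos.select { |v| v.state == "pending" }
        .each { |v| submit(v, transport: transport) }
end
```

In a real application the videos would come from your database rather than an in-memory array, and a scheduled task (cron, for instance) would call something like resubmit_pending with http_transport(your_api_key) every minute.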

Pseudocode

OK, so this isn't pseudocode - it's Ruby. But Ruby is pretty easy to read.

Imagine a Videos table that includes these columns. (It will obviously have more, including columns to store a Zencoder job ID and a Zencoder output file ID.)

Also, by adding a 'lock_version' column to the videos table, we introduce optimistic locking. This means that if the record gets updated between the Video.find query and video.save, the save fails and the job is not submitted to Zencoder. This prevents the same job from being submitted to Zencoder twice by accident. You could use pessimistic locking, database-level locking, or some other locking method to accomplish the same thing.
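
To make the locking idea concrete, here is a small self-contained sketch of how a 'lock_version' check behaves. It uses a hypothetical in-memory VideoStore in place of a real database, and the VideoRow fields, the StaleRecordError class, and the claim_for_submission helper are illustrative names only. In a Rails application, ActiveRecord performs this check for you when the table has a 'lock_version' column and raises ActiveRecord::StaleObjectError on conflict.

```ruby
# Raised when a row changed between being loaded and being saved.
class StaleRecordError < StandardError; end

# Hypothetical row shape: only the columns this sketch needs.
VideoRow = Struct.new(:id, :state, :lock_version)

# In-memory stand-in for the videos table.
class VideoStore
  def initialize
    @rows = {}
  end

  def insert(id, state)
    @rows[id] = VideoRow.new(id, state, 0)
  end

  # Returns a copy, like loading a fresh record from the database.
  def find(id)
    @rows.fetch(id).dup
  end

  # Saves only if nobody else has saved since `row` was loaded.
  def save(row)
    current = @rows.fetch(row.id)
    if current.lock_version != row.lock_version
      raise StaleRecordError, "video #{row.id} was modified concurrently"
    end
    row.lock_version += 1
    @rows[row.id] = row.dup
  end
end

# Returns true only for the one worker that wins the right to submit.
def claim_for_submission(store, id)
  video = store.find(id)
  return false unless video.state == "pending"

  video.state = "submitting"
  store.save(video)
  true
rescue StaleRecordError
  false
end
```

Only the worker whose claim_for_submission call returns true goes on to send the job, so two resubmission runs that overlap cannot both submit the same video.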

It's that easy…

All things considered, this is a pretty simple approach to ensuring 100% integration reliability between Zencoder and your application. It takes a few more steps than naively submitting a job, but it ensures that no matter what happens - whether it's an occasional timeout, unexpected downtime at Zencoder, or scheduled maintenance - your app runs reliably.