Behind every great API is a reliable uptime monitoring system. In
today's internet world filled with SaaS apps, there are many
monitoring tools to choose from. If you're new to API monitoring, you
might be confused about exactly what you should be monitoring or the
right way to do it.

In this post I'll outline some tips to help you determine what to
monitor in your API and how to monitor it. Using these tips can help
you build out API monitoring automation that works well for your needs.

I may be a little biased since I work on Assertible, which is an
API monitoring tool. That said, this list of tips is intended to
help beginners create reliable API monitoring to detect downtime. I
can't help that Assertible is really good at monitoring and satisfies
these criteria!

1. Monitor multiple endpoints

When you first start monitoring an API, you will probably set up a
simple GET request to a specific endpoint (/ or /health) and
check that it returns the HTTP status 200 OK. This is the perfect
way to start, but it quickly becomes limiting for any growing API.

An API can suffer downtime for a plethora of reasons. Not only can
downtime be related to the entire system going offline due to server
issues, it can also be related to a specific endpoint or API
transaction caused by bugs in your application.

The first thing you should do to avoid these types of errors is
monitor multiple critical endpoints. In an ideal world, each
unique endpoint would have a simple monitor to check that a basic HTTP
request returns the 200 OK status.
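As a rough illustration of checking several endpoints for 200 OK, here
is a minimal Python sketch that uses only the standard library. The
base URL and endpoint paths are hypothetical placeholders, and the
helper names (fetch_status, check_endpoints) are made up for this
example rather than taken from any monitoring tool.

```python
from typing import Callable, Dict, List
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Hypothetical values: replace with your own API and its critical endpoints.
BASE_URL = "https://api.example.com"
ENDPOINTS = ["/health", "/users", "/orders"]


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Send a GET request and return the HTTP status code (0 if unreachable)."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code
    except (URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return 0


def check_endpoints(
    base_url: str,
    paths: List[str],
    fetch: Callable[[str], int] = fetch_status,
) -> Dict[str, bool]:
    """Map each path to True when its GET request returns 200 OK."""
    return {path: fetch(base_url + path) == 200 for path in paths}
```

Calling check_endpoints(BASE_URL, ENDPOINTS) returns a dictionary such
as {"/health": True, "/users": False, ...}. The fetch parameter exists
so the check logic can be exercised without touching the network.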

Tip: Start with one endpoint, then expand your monitoring to
multiple endpoints.

Tip: To detect errors which you may not know exist (and thus
aren't monitoring), use an error reporting service like Rollbar or
Sentry from within your application.

2. Functional checks

Basic uptime checks that monitor multiple API endpoints are a great
start. But now what? As you continue to work on your API and deploy
new versions, there are many scenarios where a simple 200 OK check
on an endpoint is not sufficient.

Functional API checks are a great way to model real-world user
scenarios and ensure the availability of a specific transaction
on one or more endpoints. For example, your functional test can:

Check for status codes other than HTTP 200 OK to confirm that API
transactions that should fail actually do fail
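To make the idea of expecting a failure concrete, here is a small
Python sketch. The scenarios, paths, and expected status codes are
hypothetical, and send stands in for whatever HTTP client you already
use; it is assumed to take a method and a path and return the response
status code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FunctionalCheck:
    name: str
    method: str
    path: str
    expected_status: int


# Hypothetical scenarios: two of them are *supposed* to fail.
CHECKS = [
    FunctionalCheck("list orders", "GET", "/orders", 200),
    FunctionalCheck("reject anonymous delete", "DELETE", "/orders/1", 401),
    FunctionalCheck("unknown order is not found", "GET", "/orders/does-not-exist", 404),
]


def run_checks(
    checks: List[FunctionalCheck],
    send: Callable[[str, str], int],
) -> List[str]:
    """Return one message per check whose status differs from the expected one."""
    failures = []
    for check in checks:
        actual = send(check.method, check.path)
        if actual != check.expected_status:
            failures.append(
                f"{check.name}: expected {check.expected_status}, got {actual}"
            )
    return failures
```

With this shape, a DELETE that unexpectedly returns 200 for an
anonymous user is reported as a failure, even though a plain "is it
200?" uptime check would have passed.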

Tip: After you have some simple monitoring checks set up, consider
your API's most important functional requirements and set up tests to
monitor those actions.

3. Eliminate flaky checks

One of the most important aspects of using an API monitoring
solution is to vigilantly eliminate flaky checks. A flaky test (or
check) is one that sends an alert or downtime notification when
nothing is wrong with your API. This can happen due to some
unexpected non-deterministic behavior, when a test has too many steps,
or when a test is otherwise too complicated.

Checks that are unreliable or that often fail with false positives
will create too much noise, distract you and your team, and
potentially cause team members to ignore important downtime alerts.

Tip: Don't allow a flaky check to stay in your system. Refactor a
flaky test immediately or, if at all possible, replace it with a
simpler test that does not raise false positives.
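One possible way to spot candidates for refactoring is to look at how
often a check flips between passing and failing. The Python sketch
below is a heuristic of my own choosing, not something a particular
tool prescribes: it assumes you can export each check's recent results
as a list of booleans, and the 0.3 threshold is an arbitrary starting
point.

```python
from typing import Dict, List


def flip_rate(history: List[bool]) -> float:
    """Fraction of consecutive runs whose result differs from the previous run."""
    if len(history) < 2:
        return 0.0
    flips = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return flips / (len(history) - 1)


def find_flaky(
    histories: Dict[str, List[bool]],
    threshold: float = 0.3,  # assumed cut-off; tune it for your own checks
) -> List[str]:
    """Return the names of checks that flip at least `threshold` of the time."""
    return sorted(
        name for name, history in histories.items()
        if flip_rate(history) >= threshold
    )
```

The reasoning is that a real outage tends to show up as one continuous
run of failures, while a flaky check bounces back and forth. Treat the
output as a list of checks to investigate, not as proof of flakiness.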

4. Actionable alert notifications

When you receive an alert that your API is down, it's critical
that the notification tells you the most vital information immediately
so you can take action. API downtime alerts that require you to open
a link to view the primary parts of the failure first are a step in
the wrong direction.

API downtime notifications should be immediately actionable.
Otherwise, you will waste time opening the dashboard for another web
app instead of responding to your web service's issue.

In practice, this means your alert should give you at least basic
information like the HTTP status code and precise information about
why the check failed. For example, a downtime notification from
Assertible's Slack integration looks like this:

Tip: Ensure that your downtime notifications give you enough
critical information to respond to your API without navigating to an
external website to view the failure first.
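As a sketch of what "enough critical information" could look like in
code, the Python example below builds a plain-text alert body that
carries the status code and the failure reason. The field names and
message layout are assumptions for illustration; this is not the
format Assertible or Slack uses.

```python
from dataclasses import dataclass


@dataclass
class CheckFailure:
    check_name: str
    method: str
    url: str
    expected_status: int
    actual_status: int
    reason: str


def format_alert(failure: CheckFailure) -> str:
    """Build an alert body that can be acted on without opening a dashboard."""
    return (
        f"[DOWN] {failure.check_name}\n"
        f"{failure.method} {failure.url}\n"
        f"Status: {failure.actual_status} (expected {failure.expected_status})\n"
        f"Reason: {failure.reason}"
    )
```

The resulting string can be sent through whichever channel your team
watches (chat, email, SMS). The point is that the request, the status
code, and the reason are all in the message itself.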

5. Monitor testing and staging environments

Lots of hosting providers make it easy to set up staging and testing
versions of your app so that you can test it live before deploying to
production. For example, Heroku has pipelines, which allow you to
have a staging environment and temporary review apps (when using
Heroku Review Apps).

I highly recommend monitoring non-production environments because
it allows you to catch API errors before they hit production. A
good API monitoring tool will allow you to reuse the same tests to
monitor each unique environment.
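One simple way to reuse the same tests across environments is to keep
the check definitions free of host names and supply the base URL
separately. The Python sketch below assumes hypothetical environment
names and URLs; only the structure matters.

```python
from typing import Dict, List

# Hypothetical environments: one base URL per deployment of the same API.
ENVIRONMENTS = {
    "production": "https://api.example.com",
    "staging": "https://staging.api.example.com",
}

# The checks themselves only know about paths, never about hosts.
CHECK_PATHS = ["/health", "/orders"]


def build_targets(
    environments: Dict[str, str],
    paths: List[str],
) -> Dict[str, List[str]]:
    """Expand the shared list of paths into full URLs for every environment."""
    return {
        env: [base_url.rstrip("/") + path for path in paths]
        for env, base_url in environments.items()
    }
```

Adding a new environment then means adding one entry to ENVIRONMENTS,
and every existing check applies to it automatically.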

Some monitoring tools also have features that allow you to
smoke-test your API after it's deployed to a staging or dev
environment. In Assertible, it's called the Deployments API.
Automating smoke-tests after a deployment is the best way to identify
errors in a new application version immediately and allows you to
potentially run a large set of integration tests against your live API.

Tip: Monitor staging, QA, and testing versions of your application
to further reduce the chances that a bug will land in production.

Tip: Automate comprehensive smoke-tests when your web service is
deployed.
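If your monitoring tool has no deployment hook, a post-deploy
smoke-test can be approximated with a small script that your
deployment pipeline runs as its last step. The Python sketch below is
a generic illustration and does not use Assertible's Deployments API;
the function names are hypothetical, and fetch is assumed to be a
function that takes a URL and returns the HTTP status code.

```python
from typing import Callable, Iterable, List, Tuple


def smoke_test(
    base_url: str,
    checks: Iterable[Tuple[str, int]],
    fetch: Callable[[str], int],
) -> List[str]:
    """Run every (path, expected_status) check and describe the failures."""
    failures = []
    for path, expected in checks:
        actual = fetch(base_url + path)
        if actual != expected:
            failures.append(f"{path}: expected {expected}, got {actual}")
    return failures


def deploy_gate(failures: List[str]) -> int:
    """Print each failure and return a process exit code (0 = pass, 1 = fail)."""
    for line in failures:
        print(f"SMOKE TEST FAILED {line}")
    return 1 if failures else 0
```

Passing the result of deploy_gate to sys.exit lets most CI systems
mark the deployment as failed when any smoke-test fails.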

Conclusions

The tips I've outlined in this post should help you start monitoring
your API effectively. No team is immune to bugs, so the most important
thing is to practice continuous testing and iteratively build out
better tests at every stage of your development process (remember,
vigilantly remove flaky tests!).

Tools like Assertible make API monitoring trivial and reduce bugs in
your API so users don't just leave. We've spent a lot of time ensuring
that Assertible meets the requirements I've outlined in this post. If
you're keen to try Assertible, I'd love to hear your feedback. Send
me a message or reach out on Twitter and let's talk testing!