June 9, 2015

An Overview of the Happy Apps' Checks

Written by
Gen

Checks test whether the various components of your system are up and running. Most commonly, people use Web checks to test whether their web servers are accessible. This is a great test since it's user-facing, but it's typically just a small piece of the puzzle. Happy Apps checks can test many different components of your system, not just web servers. View all of our supported check types here.

The Checks section of Happy Apps lists all the checks you have set up. By default, checks are sorted by status, with any failing checks listed at the top. This makes it easy to identify which checks may need attention.

List of Checks

A check can be in one of three states. Happy (green) means that the check has been consistently passing. Warning (yellow) means that the check is currently passing, but has failed at least once in the last 10 attempts. Sad (red) means that the check is currently failing. If a passing check fails, a second check is performed 30 seconds later to confirm the failure before the status changes to Sad.
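The status rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not Happy Apps' actual implementation: the function names, the boolean result history, and the choice to record only the confirmed result are all assumptions.

```python
import time
from typing import Callable, Sequence

WARNING_WINDOW = 10        # "last 10 attempts", from the text above
RETRY_DELAY_SECONDS = 30   # confirmation delay, from the text above


def check_status(results: Sequence[bool]) -> str:
    """Map a result history (oldest first, True = pass) to a status name."""
    if not results:
        raise ValueError("the check has not run yet")
    if not results[-1]:
        return "sad"       # currently failing
    if not all(results[-WARNING_WINDOW:]):
        return "warning"   # passing now, but a recent attempt failed
    return "happy"         # consistently passing


def confirmed_result(run_check: Callable[[], bool], was_passing: bool,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Run a check; if a previously passing check fails, retry once after 30 s."""
    ok = run_check()
    if ok or not was_passing:
        return ok
    sleep(RETRY_DELAY_SECONDS)
    return run_check()     # assumption: only the confirmed result is recorded
```

With this sketch, a history of ten passes maps to "happy", a single failure among the last ten attempts maps to "warning", and a failing latest attempt maps to "sad".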

The check list shows several key pieces of information to summarize each check.

Time shows how long it has been since the check last ran.

Availability shows the uptime percentage for that particular check. By default, availability covers the last 30 days, but you can adjust this period in the Account Preferences.

Response Time is how long it took to run the last check.

Last Metric displays information about the last check response. What it shows depends on the check type. For example, a web check will show the response code, a Redis check will show the number of keys, and a RabbitMQ check could show the number of items in a queue.
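To make the Availability column concrete, here is a rough Python sketch of an uptime percentage over a 30-day window. It assumes availability is simply the share of passing runs in the window; the real calculation is not described here and may, for example, weight by time instead. The function name and the `(timestamp, passed)` data shape are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def availability(runs: Iterable[Tuple[datetime, bool]], now: datetime,
                 window_days: int = 30) -> Optional[float]:
    """Percentage of passing runs within the last `window_days` days.

    Assumption: each run counts equally. Returns None if no run falls
    inside the window.
    """
    cutoff = now - timedelta(days=window_days)
    recent = [passed for ran_at, passed in runs if cutoff <= ran_at <= now]
    if not recent:
        return None
    return 100.0 * sum(recent) / len(recent)
```

Changing `window_days` plays the role of the Account Preferences setting mentioned above.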

Adding a Check

Checks can be added from the Checks section.

Add a Check

The options available when setting up a new check depend on its type.

Add Website Check

Some options are independent of the check type and are always available.

Name: The name you'd like to give the check.

Max Severity: If the check fails and creates an incident, this is the highest severity level that the check can raise the incident to. For example, if a check is part of an App and it fails, it will open a 'critical' incident for the app by default. If the check's max severity is set to 'info', the incident would instead be opened at the 'info' level. This can be useful if a check is more informational, or not important enough to cause a critical alert.

In Availability: This option determines whether the check is included in the overall availability calculation on the dashboard. For example, some users may want to track their test environments without having them factored into their overall availability.

SSH Tunneling: All check types can be performed over an SSH connection. This allows Happy Apps to test components that may not be publicly accessible. The SSH tab has all the relevant connection inputs, as well as a public SSH key that will need to be added to the host's authorized_keys file.

SSH Tunneling Option for a New Check
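The Max Severity option described above behaves like a cap on the incident level. A minimal Python sketch of that idea follows. Only 'info' and 'critical' are named in this post; the intermediate 'warning' level and the function name are assumptions added for illustration.

```python
# Ordered from least to most severe. 'warning' is a hypothetical
# intermediate level; only 'info' and 'critical' appear in the text.
SEVERITY_ORDER = ["info", "warning", "critical"]


def incident_severity(default_severity: str, max_severity: str) -> str:
    """Return the incident level after applying the check's Max Severity cap."""
    wanted = SEVERITY_ORDER.index(default_severity)
    cap = SEVERITY_ORDER.index(max_severity)
    return SEVERITY_ORDER[min(wanted, cap)]
```

For the example in the text, `incident_severity("critical", "info")` gives `"info"`: the App would normally open a critical incident, but the check's cap holds it at the info level.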

Once a check is added, it will appear in the checks list and begin running on its interval.

Duplicate Existing Checks

It is common to add many checks with nearly identical configuration options. For instance, when creating checks for a set of database nodes, they may all be the same except for a slightly different host. Duplicating a check lets you quickly set up repetitive checks by prepopulating all the fields for you.

The duplicate icon can be found at the far right of each row in the checks list.

Duplicate a Check

This action will open the new check window with all the fields populated. Simply give the check a new name and make whatever minor change you need before saving.

View Check Details

Clicking on a check name will take you to the details page for that check. At the top of this page are a few key metrics about the check.

Check Details Page

These metrics are the availability (uptime percentage), the response time of the last check, and the number of open incidents the check is currently affecting. Below these metrics are the check details, which describe what the check is configured to do. Chart Stats shows a graph of the response time for the last 24 hours. The History section shows the result of each run of the check.

History Check Area

Finally, the Groups section lists any groups the check belongs to.

Mute a Check

Often a check may go down for maintenance or another known reason. In this case, you may not want to trigger incident alerts or have the check affect your overall availability percentage. An example could be reseeding a node in your MongoDB cluster. Mute allows you to handle these scenarios. If a check is muted, no incidents will be created if the check fails and no alerts will be sent out for the check. You can mute a check at any time, whether it's passing or failing. If an already failing check is muted, any incidents it belongs to will be closed out, assuming it's the only failing check in the incident.

To mute a check, click the flag icon at the top right of the Check Details page.

Muted checks can be identified by the blue flag overlay on the status icon.

A check will stay muted until the flag is unchecked.

Edit or Delete a Check

To edit a check, click the pencil icon at the top right of the Check Details page.

To delete a check, click the trash icon.

Once a check is deleted, its name will be removed from any incident it was a part of. The incident will show 'No subject' in place of the check name.

Check by Agent

Agents can be used to check components that are not accessible directly from Happy Apps. If there are any agents set up in your account, the new check form will have an option for Agent. By default, the dropdown will be set to 'Happy Apps'. This means that Happy Apps will perform the check. To have one of your own agents perform the check instead, simply choose the desired agent from the dropdown.

Security

Protecting your data is something we take very seriously. We understand that setting up checks to connect to your infrastructure is a concern. Be assured that we take every precaution to ensure your data and connections are protected.

First, all check and connection data is encrypted, from a check’s hostname to its credentials. This provides the security needed to check systems that are publicly accessible.

As mentioned above, we also support SSH tunneling to perform checks. The keys we generate are all 2048-bit RSA. We take the added precaution of encrypting the private key as well. Your account comes with a single key pair that will be used for any SSH check. If desired, you can generate a new SSH key pair for your account.

If a new key is generated, you will need to update the key on all servers set up with the old one.

New Key Pair Generation
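Updating the key on each server means editing the authorized_keys file of the account used for the SSH check. The following Python sketch shows one way to swap the old public key for the new one without touching other entries. The function name is hypothetical, and the key strings in the usage comment are placeholders, not real Happy Apps keys.

```python
def rotate_key(authorized_keys_text: str, old_key: str, new_key: str) -> str:
    """Return authorized_keys content with old_key removed and new_key present.

    Other entries and comments are kept in their original order. Running the
    function twice gives the same result as running it once.
    """
    old_key, new_key = old_key.strip(), new_key.strip()
    kept = [line for line in authorized_keys_text.splitlines()
            if line.strip() != old_key]
    if new_key not in (line.strip() for line in kept):
        kept.append(new_key)
    return "\n".join(kept) + "\n"


# Example usage on a server (placeholder key strings, shown as comments so
# that nothing is modified by accident):
#
#   from pathlib import Path
#   path = Path.home() / ".ssh" / "authorized_keys"
#   path.write_text(rotate_key(path.read_text(), OLD_PUBLIC_KEY, NEW_PUBLIC_KEY))
```

It is worth keeping a backup copy of the file before rewriting it, since a mistake here can lock out other users of the same account.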

If you are still uncomfortable with Happy Apps initiating the connection to your servers to perform checks, agents can be utilized. Running an agent on your network controls the flow of access to your servers. It polls Happy Apps for any checks that need to be run and relays the check results back. Communication is done over a secure connection, and you are in full control of when the agent is running or stopped.
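The agent behavior described above (poll for pending checks, run them locally, relay the results) can be pictured with a short Python loop. This is a conceptual sketch only: the real agent's protocol and API are not covered in this post, so every callable passed in below is a hypothetical stand-in.

```python
import time
from typing import Callable, Iterable


def run_agent(fetch_pending: Callable[[], Iterable[str]],
              run_check: Callable[[str], bool],
              report: Callable[[str, bool], None],
              should_stop: Callable[[], bool],
              poll_seconds: float = 60.0,
              sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll for checks, run each one on the local network, relay the result.

    All connections are initiated by the agent: fetch_pending and report
    stand in for outbound calls to Happy Apps over a secure connection.
    """
    while not should_stop():
        for check in fetch_pending():
            report(check, run_check(check))
        sleep(poll_seconds)  # hypothetical polling interval
```

Because the loop only ever makes outbound calls, stopping the agent process is enough to cut off all checks against the servers behind it.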