Understanding performance and usage impact of releases using annotations in Application Insights

A little over a year ago, the Application Insights team re-introduced release annotations so that users could correlate app and service releases with their APM data. VSTS users can add a step to their release scripts to create annotations, and users of third-party release engines can push release annotations to Application Insights by calling a PowerShell script.

In the meantime, the VSTS team has been hard at work maturing continuous integration/deployment functionality in the VSTS release tools, as well as extending that functionality to many other platforms. As the concept of CI/CD continues to gain momentum and popularity in development shops around the world, the value of generating annotations for the associated tighter release cycle increases dramatically.

A simple example

In the above illustration, we have an app called MyEnterpriseApp being regularly deployed through continuous deployment. Two statistics have been chosen here: users and browser page load time. Since annotations have been added to the release script, we can clearly see when our releases occur and how they may be affecting performance or usage. In the earlier part of the chart, we see a very normal cadence of users: daily spikes during prime weekday hours, with valleys at night, followed by a lower number of users on the weekend. Our page load time is consistently around two seconds or less, and so our user base remains equally consistent.

If we look at the release that happens the following week (around April 17th on this chart), however, a problem is clearly introduced. Page load times shoot up to around eight seconds, and our number of active users goes into a nosedive, dropping until the number of users tolerating this level of performance is actually lower than what we would normally see on weekends. We then see another release around April 19th, which we can reasonably assume includes some kind of fix, because after that release our page load times drop back to the normal range and our active user count rebounds now that performance is back to an acceptable state.

So what happened? Was the release not tested properly? Did an environmental factor exist in production that wasn’t present in test? The natural next step is to investigate the problem to ensure it doesn’t happen again. Here too, we can leverage our release annotations to drill into greater detail and solve the issue. Remember that we can hover on an annotation, and it will give us information about the release:

If we click on the information balloon, a detail blade opens up describing the release, complete with a link to the release script in VSTS (or in a third-party system, so long as that link was provided when the annotation was created through the PowerShell script):

In this way, we can quickly see who we should contact about the release, and investigate individual steps or deployed components (if we suspect a code error was introduced).

Since we can save Metrics Explorer results as favorites, we can retain the view of these significant statistics for future use. This allows us to come back in the following days and very quickly get a view of our releases to ensure that our errors are not being repeated, or to investigate them immediately if they are. Simply being able to correlate our releases with our performance and usage data through release annotations dramatically reduces the time required for confirmation or investigation.
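
The before-and-after comparison we made by eye in the example can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not call Application Insights or VSTS, the function names (release_impact, flag_regressions) and release names are hypothetical, and the sample values and year are made up to mirror the rough shape of the chart described above (about two seconds before the April 17th release, about eight seconds after it, and back to normal after April 19th).

```python
from datetime import datetime, timedelta
from statistics import mean


def release_impact(samples, release_time, window=timedelta(days=1)):
    """Return (mean_before, mean_after) for a metric around a release.

    `samples` is a list of (timestamp, value) pairs. Only samples within
    `window` on either side of `release_time` are considered. Returns None
    if either side has no samples.
    """
    before = [v for t, v in samples if release_time - window <= t < release_time]
    after = [v for t, v in samples if release_time <= t < release_time + window]
    if not before or not after:
        return None
    return mean(before), mean(after)


def flag_regressions(samples, releases, ratio=1.5):
    """Return names of releases after which the metric rose by more than `ratio` times."""
    flagged = []
    for name, release_time in releases.items():
        impact = release_impact(samples, release_time)
        if impact is not None and impact[1] > ratio * impact[0]:
            flagged.append(name)
    return flagged


# Made-up page load times in seconds (hypothetical data, arbitrary year).
samples = [
    (datetime(2017, 4, 16, 9), 1.9),
    (datetime(2017, 4, 16, 15), 2.0),
    (datetime(2017, 4, 17, 9), 8.1),
    (datetime(2017, 4, 17, 15), 7.9),
    (datetime(2017, 4, 18, 9), 8.0),
    (datetime(2017, 4, 18, 15), 8.2),
    (datetime(2017, 4, 19, 9), 2.1),
    (datetime(2017, 4, 19, 15), 1.9),
]

# Hypothetical release annotations: name -> time of the release.
releases = {
    "Release-2": datetime(2017, 4, 17, 0),
    "Release-3": datetime(2017, 4, 19, 0),
}

if __name__ == "__main__":
    for name in flag_regressions(samples, releases):
        print(f"Page load time regressed after {name}")
```

With this data, only the first of the two releases is flagged, which matches what the annotations let us see at a glance on the chart. The one-day window and the 1.5x threshold are arbitrary choices for the sketch; a real check would need to account for the normal daily and weekend cycles in the data.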