Both Microsoft and AWS have built-in solutions for logging and historical tracking of metrics, with alerting against them. Microsoft Azure provides Application Insights in its portal, which operates as a SaaS offering. AWS provides a similar capability with its CloudWatch offering. These tools are built in so that you can see the big picture of what is going on in your PaaS environment, like standing on the peak of a mountain and surveying everything below.

Instrumenting your code to provide custom telemetry

Azure and AWS capture built-in metrics that give you the basics of what has become standard monitoring of machines, web applications and network utilization. You could use these metrics to report and alert on some of the SLA values you want to track, such as those for HTTP responses.

However, there are some things you would be missing that only your code knows how to measure and report on. The real power comes when you take the extra step of using a Node module provided by either Azure or AWS to instrument your code and output custom events, logs and metrics.

There are two parts to making this happen. First, as mentioned, you use a Node module that is part of the SDK to send telemetry up to Azure or AWS, where it is kept for you in their storage infrastructure. Second, you use the portal UI to look at the instrumentation graphs and the alerts you can configure. The portals also allow you to search and filter the metrics.

Be careful what you send as telemetry data! Remember to never send along any personally identifiable information or otherwise sensitive information.
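One way to guard against this is to scrub telemetry properties before they leave your process. The following is a minimal sketch and not part of either SDK: the scrubProperties helper and the list of sensitive key names are hypothetical and would need to be adapted to your own data. It only looks at top-level keys and does not recurse into nested objects.

```javascript
// Hypothetical helper: replaces the values of sensitive-looking keys
// before a properties object is handed to any telemetry call.
// The key list is an assumption; extend it for your own application.
const SENSITIVE_KEYS = ['email', 'password', 'name', 'address', 'phone', 'token'];

function scrubProperties(properties) {
  const scrubbed = {};
  for (const [key, value] of Object.entries(properties)) {
    // Case-insensitive substring match, so "userEmail" is caught by "email".
    const isSensitive = SENSITIVE_KEYS.some((s) => key.toLowerCase().includes(s));
    scrubbed[key] = isSensitive ? '[redacted]' : value;
  }
  return scrubbed;
}

// Example: the story id is kept, the email value is replaced.
console.log(scrubProperties({ storyId: 'abc123', userEmail: 'someone@example.com' }));
```

The substring match is deliberately conservative, so a harmless key such as "filename" would also be redacted. That trade-off seems reasonable for telemetry, where losing a value is cheaper than leaking one.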

You will now see specific examples of what can be done with some of the SDK functions for feeding telemetry to Azure Application Insights and AWS CloudWatch. We will start with Application Insights.

Application Insights with Azure

Application Insights can be set up to monitor and alert on certain predefined metrics and also on those you instrument in your code. To get started with your own custom instrumentation of your Node code to send telemetry up to Application Insights, download the Node.js module from NPM as follows:

npm install applicationinsights

The following are a few of the useful functions and what they are used for:

trackEvent(): This is used to log events such as interactions that a customer is having. For example, in our NewsWatcher sample application, we could create an event every time a user commented on a news story. The aggregate numbers for these events could then be viewed in the Application Insights Metrics Explorer.

trackTrace(): Call this to create diagnostic log traces. For example, you might log a diagnostic event such as running some batch file, or detecting an anomaly such as a performance issue. Definitely call this to report errors that occur in your code, such as exceptions. Add some type of session tracking number, if applicable, to help tie together the series of interactions that led up to an issue.

trackMetric(): Use this to report interesting numbers that give you point-in-time snapshots of the state of some internal code structures.

The telemetry values you send do not show up instantaneously in Application Insights. They are sent in batches and only become visible in the Azure portal some time later; according to the documentation, that could take up to an hour. There is a sendPendingData() function you can call to send your data immediately, but even that does not make the telemetry show up in the portal right away. Here is what the code looks like for using some of these telemetry sending functions:
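A minimal sketch of such calls follows. It assumes the positional call style of the early SDK releases that also expose sendPendingData(); later releases of the applicationinsights module changed these signatures, so check the version you have installed. The reportCommentTelemetry function, the event and metric names, and the recording stand-in client are all hypothetical. The client is passed in as a parameter so the sketch can be tried locally before wiring up the real module.

```javascript
// With the real module the client would come from something like this
// (an assumption based on early SDK releases; verify against your version):
//   const appInsights = require('applicationinsights');
//   appInsights.setup('<your instrumentation key>').start();
//   const client = appInsights.getClient();

// Hypothetical function: reports telemetry for one comment on a news story.
function reportCommentTelemetry(client, storyId, queueLength) {
  // Custom event: a customer interaction you want aggregate counts for.
  client.trackEvent('StoryComment', { storyId: storyId });

  // Diagnostic trace: free-form log text.
  client.trackTrace('Comment stored for story ' + storyId);

  // Custom metric: a point-in-time number from inside your code.
  client.trackMetric('CommentQueueLength', queueLength);

  // Ask the SDK to send the current batch now rather than waiting.
  client.sendPendingData();
}

// Stand-in client that only records calls, for trying the sketch locally.
function createRecordingClient() {
  const calls = [];
  const record = (method) => (...args) => calls.push({ method, args });
  return {
    calls,
    trackEvent: record('trackEvent'),
    trackTrace: record('trackTrace'),
    trackMetric: record('trackMetric'),
    sendPendingData: record('sendPendingData'),
  };
}

const demoClient = createRecordingClient();
reportCommentTelemetry(demoClient, 'abc123', 4);
console.log(demoClient.calls.map((c) => c.method).join(', '));
```

Keeping the telemetry calls behind one small function also gives you a single place to adjust if the SDK call signatures differ in your version.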

Here is what the Search blade looks like where you can see the different types of telemetry. There is a scrollable list of everything as well as search and filtering capabilities.

Here is the Metrics Explorer blade showing the metrics for the built-in as well as custom collected values.

There is the capability to set up alerts against some of the standard metrics that are being collected. It would be nice to alert against exceptions, traces and individual custom events. For example, you would certainly want to know if an exception was happening deep inside your code that you surfaced with trackException() and then investigate why.
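As a minimal sketch of surfacing such an exception, the hypothetical wrapper below reports any error thrown by a function through trackException() and then rethrows it. It assumes a client whose trackException() accepts an Error as its first argument, as in the early SDK releases; the wrapper name and the stand-in client are illustrative only. It handles synchronous functions only.

```javascript
// Hypothetical wrapper: runs fn, reports any thrown error, then rethrows
// so that normal error handling still happens.
function withExceptionTracking(client, fn) {
  return function (...args) {
    try {
      return fn.apply(this, args);
    } catch (err) {
      client.trackException(err);
      throw err;
    }
  };
}

// Stand-in client for trying the sketch without the real SDK.
const seenErrors = [];
const standInClient = { trackException: (err) => seenErrors.push(err) };

// Example: JSON parsing deep inside the code, wrapped for reporting.
const parseStory = withExceptionTracking(standInClient, (json) => JSON.parse(json));
try {
  parseStory('{not valid json');
} catch (err) {
  console.log('reported errors:', seenErrors.length);
}
```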

Application Insights in Azure is in Preview mode, so it may not always work as expected. You can also expect many more features to be coming as it progresses. Let’s now look at AWS CloudWatch.

CloudWatch with AWS

To send telemetry up to AWS, you use the aws-sdk Node module, which you download from NPM as follows:

npm install aws-sdk

The aws-sdk module exposes service objects for events, logs and metrics. The following are three of the useful service functions and what they are used for. They are analogous to the three calls we saw with the Azure SDK.

CloudWatchEvents.putEvents(): This sends a custom event up to AWS CloudWatch Events, where it can then be matched to rules. Use this to log customer events that are happening.
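The sketch below shows what the request parameters for these calls might look like. Only putEvents() is described above; the other two calls used here, CloudWatchLogs.putLogEvents() and CloudWatch.putMetricData(), are an assumption about which log and metric functions are meant. The builder function names, the source, namespace, group and stream values are all hypothetical.

```javascript
// Hypothetical builders for the request parameters of three aws-sdk calls.
// Source, namespace, group and stream names are illustrative assumptions.

// For CloudWatchEvents.putEvents(): a custom event describing a user action.
function buildCommentEvent(storyId) {
  return {
    Entries: [{
      Source: 'newswatcher.app',
      DetailType: 'StoryComment',
      Detail: JSON.stringify({ storyId: storyId }), // Detail is a JSON string
    }],
  };
}

// For CloudWatchLogs.putLogEvents(): one log line for an existing group and stream.
function buildLogEntry(message, timestampMs) {
  return {
    logGroupName: 'newswatcher',
    logStreamName: 'web',
    logEvents: [{ message: message, timestamp: timestampMs }],
  };
}

// For CloudWatch.putMetricData(): a single custom metric value.
function buildMetric(name, value) {
  return {
    Namespace: 'NewsWatcher',
    MetricData: [{ MetricName: name, Value: value, Unit: 'Count' }],
  };
}

// With the real SDK (v2 style) these would be sent along the lines of:
//   const AWS = require('aws-sdk');
//   new AWS.CloudWatchEvents().putEvents(buildCommentEvent('abc123'), callback);
//   new AWS.CloudWatchLogs().putLogEvents(buildLogEntry('saved', Date.now()), callback);
//   new AWS.CloudWatch().putMetricData(buildMetric('CommentQueueLength', 4), callback);
console.log(JSON.stringify(buildMetric('CommentQueueLength', 4)));
```

Depending on the SDK and service version, putLogEvents() may also require a sequenceToken taken from the previous call's response, and the service objects need a configured region and credentials.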

You can open the CloudWatch Management console to look at the telemetry and set up alerts. Here is what you see when viewing log telemetry.

AWS offers many divergent paths to take once you have an event, log or metric uploaded. I will explain just a few of them, as it would take quite an examination to really cover all of the possibilities.

For logs, you can go into the management console and click on the group and stream to see them listed, as shown in the previous screenshot. If you go to the log group view, you can select a group and use a menu to do things such as send the log stream into Elasticsearch or Lambda. You can learn about each of those separately, but from there you can do just about anything you like with the stream, such as visualizing the data, performing analytics on it, or triggering operations and alerts.

For now, you can click on the Events selection in the left navigation area of the management console and actually see the events listed. You can set up filtering rules for them and select actions for what happens to the events passing through, such as sending them to SNS or to a Lambda function.
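The same kind of rule can also be created from code. As a rough sketch, assuming the aws-sdk CloudWatchEvents putRule() and putTargets() calls, the request parameters might be built as follows. The rule name, target id and the source and detail-type values are hypothetical.

```javascript
// Hypothetical builder for CloudWatchEvents.putRule() parameters.
// The rule name and the matched values are illustrative assumptions.
function buildCommentRule() {
  return {
    Name: 'story-comment-rule',
    State: 'ENABLED',
    // EventPattern is passed as a JSON string; this one matches events
    // whose source and detail-type equal the listed values.
    EventPattern: JSON.stringify({
      source: ['newswatcher.app'],
      'detail-type': ['StoryComment'],
    }),
  };
}

// Hypothetical builder for CloudWatchEvents.putTargets() parameters.
// targetArn would be the ARN of an SNS topic or Lambda function you own.
function buildCommentTargets(targetArn) {
  return {
    Rule: 'story-comment-rule',
    Targets: [{ Id: 'comment-target-1', Arn: targetArn }],
  };
}

console.log(buildCommentRule().EventPattern);
```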

Metrics support is fairly mature in the management console: you can set up graphs to look at metrics in a dashboard, set thresholds to alert on, and so on. As mentioned before, this would be the main place for monitoring your SLA values.
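Thresholds can be set from code as well as from the console. The following sketch builds parameters for the aws-sdk CloudWatch putMetricAlarm() call; the alarm name, namespace, metric name and the five-minute period are hypothetical choices, and the SNS topic ARN is something you would supply yourself.

```javascript
// Hypothetical builder for CloudWatch.putMetricAlarm() parameters.
// Names and the evaluation window are illustrative assumptions.
function buildQueueLengthAlarm(threshold, snsTopicArn) {
  return {
    AlarmName: 'comment-queue-too-long',
    Namespace: 'NewsWatcher',
    MetricName: 'CommentQueueLength',
    Statistic: 'Average',
    Period: 300,                 // length of each evaluation window, in seconds
    EvaluationPeriods: 1,        // windows that must breach before alarming
    Threshold: threshold,
    ComparisonOperator: 'GreaterThanThreshold',
    AlarmActions: [snsTopicArn], // notified when the alarm state is entered
  };
}

// With the real SDK (v2 style) this would be sent along the lines of:
//   new AWS.CloudWatch().putMetricAlarm(buildQueueLengthAlarm(100, topicArn), callback);
console.log(buildQueueLengthAlarm(100, 'arn:aws:sns:example').AlarmName);
```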