Performance Monitoring for Cloud Hosted Applications – Time for a Clean Sheet of Paper

There are several barriers that must be overcome before the constituents that own, support and rely upon a business application feel comfortable putting that application in the cloud. One of these barriers is knowing what performance the cloud hosted application is delivering to its users, and having some insight into why degradations in performance occur.

There are several factors that make monitoring cloud resident applications a challenge:

Clouds are almost always built upon virtualization. This means that all of the challenges that arise in monitoring the performance of virtualized applications (detailed in the white paper available for download at the end of this article) apply to cloud based applications. The main challenge is that capacity in virtualized environments tends to be dynamic and shared. As a result, inferring good performance because the application is not using too much CPU, or bad performance because it is using too much CPU, no longer works.

Since resource utilization is not a reliable proxy for application performance in virtualized and cloud based environments, the focus needs to shift to response time as the primary metric of application performance. This is both a technical necessity and a political necessity, as the users of the application are not going to accept any metric other than whether the application feels fast or slow to them.
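To make the shift to response time concrete, here is a minimal sketch of recording per-operation response times and summarizing them. It is an illustration only: the `ResponseTimeRecorder` class, its method names, and the choice of nearest-rank percentiles are assumptions made for this example, not part of any product mentioned in this article.

```python
import math
import time
from contextlib import contextmanager


def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct * len(sorted_samples) / 100)
    return sorted_samples[max(rank, 1) - 1]


class ResponseTimeRecorder:
    """Hypothetical helper: collects response times (seconds) per operation."""

    def __init__(self):
        self.samples = {}

    @contextmanager
    def measure(self, name):
        # Time the wrapped block, even if it raises.
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.samples.setdefault(name, []).append(elapsed)

    def summary(self, name):
        data = sorted(self.samples.get(name, []))
        if not data:
            return None
        return {
            "count": len(data),
            "p50": percentile(data, 50),
            "p95": percentile(data, 95),
            "max": data[-1],
        }


if __name__ == "__main__":
    recorder = ResponseTimeRecorder()
    for _ in range(20):
        with recorder.measure("checkout"):
            time.sleep(0.01)  # stand-in for real request handling
    print(recorder.summary("checkout"))
```

Reporting a high percentile alongside the median matters here because users judge the application by its slow requests, not by its average.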

The dividing line between the application and the infrastructure is now more like a wall than a line. When you put your application in a cloud environment you cannot generally specify changes to the underlying software infrastructure. In other words, if you put an application up on Amazon EC2, you cannot at the same time ask Amazon to put in place the infrastructure monitoring that you have in place for that application in your data center. Whatever monitoring you put in place needs to be, for the most part, part of your application, and must be installable as a part of your application.
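One way to make monitoring "part of your application" is to wrap the application's request handler so that timing is deployed wherever the application is deployed. The sketch below does this for a Python WSGI application. The `TimingMiddleware` class, the `sink` callback and the `hello_app` demo are hypothetical names chosen for illustration.

```python
import time


class TimingMiddleware:
    """Hypothetical WSGI wrapper that reports how long each request took.

    `sink` is any callable accepting (path, seconds). Note that this
    measures the time until the wrapped app returns its response iterable,
    which does not include time spent streaming a lazily generated body.
    """

    def __init__(self, app, sink):
        self.app = app
        self.sink = sink

    def __call__(self, environ, start_response):
        start = time.perf_counter()
        try:
            return self.app(environ, start_response)
        finally:
            elapsed = time.perf_counter() - start
            self.sink(environ.get("PATH_INFO", ""), elapsed)


def hello_app(environ, start_response):
    """Tiny demo application used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


if __name__ == "__main__":
    timed_app = TimingMiddleware(
        hello_app, lambda path, secs: print(f"{path} took {secs:.6f}s")
    )
    timed_app({"PATH_INFO": "/ping"}, lambda status, headers: None)
```

Because the wrapper ships inside the application package, it needs no cooperation from the cloud provider's infrastructure.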

You need to plan for the case where parts of your application live in one cloud, parts in a second cloud and parts in your internal data center – with only the public internet available as the communication mechanism between all of the parts. This requirement breaks most data center hosted monitoring solutions, as these solutions assume that the monitoring product and the entire application system all live on the same subnet and inside of the corporate firewalls. Traditional monitoring solutions have no mechanism to support data collection agents that live outside of firewalls and that communicate back over the public internet.

For the above reasons, cloud resident applications require monitoring solutions that are architected in a fundamentally new way. The new cloud application monitoring architecture should adhere to the following principles:

The monitoring agent should travel with the application, and be easily installable as a part of installing the application on the cloud infrastructure.

Communications by the monitoring agent back to the analysis and reporting servers should be one way, initiated by the monitoring agent, and over public internet friendly ports and protocols (SSL:443).

The most attractive option for many enterprises would be for the monitoring solution to be hosted by the monitoring vendor. This would make provisioning monitoring as simple and fast as provisioning the application itself in the cloud.
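The second principle above (one-way, agent-initiated communication over HTTPS on port 443) can be sketched as follows. This is a hedged illustration, not any vendor's actual protocol: the collector URL, the `X-Api-Key` header and the JSON payload shape are all assumptions invented for the example.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Hypothetical collector endpoint; HTTPS defaults to port 443.
COLLECTOR_URL = "https://collector.example.com/v1/metrics"


def build_report(metrics, url=COLLECTOR_URL, api_key="YOUR-KEY"):
    """Build an outbound HTTPS POST carrying a batch of metrics."""
    if urlparse(url).scheme != "https":
        raise ValueError("agent only reports over HTTPS")
    body = json.dumps({"metrics": metrics}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Api-Key": api_key,  # assumed authentication scheme
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send_report(request, timeout=5.0):
    """Send the report; return False instead of raising on network failure.

    The agent always initiates the connection, so no inbound firewall
    rule is needed on the application side.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return False


if __name__ == "__main__":
    batch = [{"name": "checkout.response_time", "seconds": 0.182}]
    ok = send_report(build_report(batch))
    print("delivered" if ok else "delivery failed; keep batch for retry")
```

Swallowing network errors in `send_report` is a deliberate choice in this sketch: a monitoring agent that travels with the application should never take the application down when the collector is unreachable.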

There is one vendor that has stepped up to the plate with an application performance monitoring solution that both meets the above challenges and is architected in a cloud friendly manner. New Relic RPM is a hosted APM solution that monitors Ruby on Rails and Java based applications. Since RPM is hosted by New Relic and is offered on a Software as a Service (SaaS) basis, you can sign up for monitoring with New Relic as easily as you can sign up for cloud based hosting of your application. RPM is also specifically built for the shared capacity architectures that clouds are based upon – and focuses upon response time as the key metric of application performance.

Bernd Harzog is the Analyst at The Virtualization Practice for Performance and Capacity Management and IT as a Service (Private Cloud).
Bernd is also the CEO and founder of APM Experts, a company that provides strategic marketing services to vendors in the virtualization performance management and application performance management markets.
Prior to these two companies, Bernd was the CEO of RTO Software, the VP Products at Netuitive, a General Manager at Xcellenet, and Research Director for Systems Software at Gartner Group. Bernd has an MBA in Marketing from the University of Chicago.

Lots of great points, Bernd. As the co-founder of Scout (http://scoutapp.com), a hosted server monitoring solution, I wanted to highlight a couple of your points. Factor #2 (Shifting focus to response time from resource utilization): I completely agree, but I’d say the freedom to track any end-user visible metric is key (whether it’s response time, image conversion time, search query time, execution of a background job, etc). It’s important to have a monitoring solution that tracks changes over time and alerts you when the metrics change dramatically. Factor #3: (Whatever monitoring you put in place needs to be, for…