Why Application-Centric APM is Incomplete

Ten years ago, the main goal for managers in network operations was to ensure the network was simply up and running. Today, a well-performing network does not guarantee that critical business applications are being successfully delivered to users.

The reality for IT and operations teams is a constant stream of requests for faster, always-available applications. As a consequence, organizations must combine network and application performance efforts to optimize the delivery of business-critical applications and services.

In my last post I explored why network-centric approaches to monitoring have traditionally lacked insight into overall application performance and what really matters: the end-user’s experience. This post will focus on why application-centric APM is incomplete.

Traditional APM solutions focus solely on application performance and neglect to assess how the underlying infrastructure impacts the application. Application Performance Management (APM) has to be viewed from an application perspective, but performance must be tracked end-to-end through the entire infrastructure, including client devices, routing protocols, network configurations, switches, servers, and associated components.

Many solutions focus on the performance of an individual application, and while they measure criteria such as end-to-end transaction time, they do not allow for the troubleshooting that would indicate why a transaction may stall within a particular tier of network infrastructure. This limitation impedes the ability to perform fast fault domain isolation necessary to maintain positive end-user experiences.

Limited Insight into Supporting Infrastructure

Poor end-user experience can be caused by poorly performing application code, an overloaded server, a saturated load balancer, or some combination of these. Most solutions require two or more tools to troubleshoot the many issues affecting end-user experience. Without a unified view into applications and infrastructure, you can't easily determine where the delay lies (server, client, network, etc.), and you can't expect to find the root cause of a performance issue.

Consequently, without the ability to drill down to and isolate performance problems within the end-to-end path, troubleshooting and optimization efforts are simply guesswork. To control the quality of performance delivered to the end user, IT and operations need an end-to-end view of the entire application delivery chain, with the ability to drill down to any infrastructure element or to methods deep within the application.
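As a minimal sketch of what such drill-down looks like, assume we already have per-segment timing for a transaction (all names and fields here are illustrative, not any vendor's API). The end-to-end transaction time can be decomposed into client, network, and server components, and the dominant segment points to the fault domain:

```python
# Illustrative sketch: decompose an end-to-end transaction time into
# client, network, and server components to isolate the fault domain.
# All field names here are hypothetical, not any product's API.

from dataclasses import dataclass

@dataclass
class TransactionTiming:
    client_render_ms: float   # time spent in the browser/client
    network_rtt_ms: float     # time spent on the network path
    server_process_ms: float  # time the server spent handling the request

def dominant_delay(t: TransactionTiming) -> str:
    """Return the segment that contributes the most delay."""
    segments = {
        "client": t.client_render_ms,
        "network": t.network_rtt_ms,
        "server": t.server_process_ms,
    }
    return max(segments, key=segments.get)

# Example: a transaction that spent most of its time on the wire.
t = TransactionTiming(client_render_ms=40, network_rtt_ms=310, server_process_ms=120)
print(dominant_delay(t))  # -> network
```

A real monitoring product would populate these timings from instrumentation agents and packet data; the point is that without all three segments visible in one place, the `max()` comparison — i.e., fault domain isolation — is impossible.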

Fault Domain Isolation (FDI) without NPM

Excluding network performance metrics from the overall FDI process can cause a performance bottleneck to be misattributed to the network, yielding vague answers with no insight or granularity into the network itself or into what is actually causing the problem.

If something other than the application itself (a protocol delivery mechanism, coexisting nondependent application, etc.) is at fault for performance degradation, the lack of insight into network traffic will result in a lengthy mean-time-to-resolve (MTTR) or costly war room scenario.

Elongated problem-resolution timeframes mean significant costs to the organization: professional staffing expenses required to solve technology issues, potential loss of sales, decreased employee productivity, and even damage to the company's brand when the results of IT problems are felt outside the corporate walls. Including network performance metrics in FDI helps identify potential problems by distinguishing among bandwidth contention, latency, and server response time issues.
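To make that distinction concrete, here is a hedged sketch of a simple FDI classifier. The metric names and thresholds are assumptions chosen for illustration — any real deployment would tune them to its own baselines — but the structure shows how network metrics (link utilization, round-trip time) let you separate network-side causes from server-side ones:

```python
# Illustrative FDI heuristic: combine network metrics with server metrics
# to distinguish bandwidth contention, network latency, and slow server
# response. Thresholds and metric names are assumptions for this sketch.

def classify_fault_domain(link_utilization: float,
                          rtt_ms: float,
                          server_response_ms: float) -> str:
    """Classify the likely fault domain from three observed metrics."""
    if link_utilization > 0.90:
        return "bandwidth contention"   # the link itself is saturated
    if rtt_ms > 200:
        return "network latency"        # path delay dominates
    if server_response_ms > 500:
        return "server response time"   # the back end is slow
    return "no obvious bottleneck"

print(classify_fault_domain(0.95, 30, 80))   # -> bandwidth contention
print(classify_fault_domain(0.40, 350, 80))  # -> network latency
print(classify_fault_domain(0.40, 30, 900))  # -> server response time
```

Without the first two metrics — which come from network performance monitoring, not from the application — all three cases would look the same to an application-only tool: a slow transaction.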

The ultimate goal of an APM solution is to restore and maintain application performance. As APM vendors recognize the need to quickly identify the source of problems, it becomes apparent that understanding the relationship between the underlying infrastructure and the applications it supports is a requirement.

While vendors from both ends of the spectrum are attempting to expand their product portfolios’ reach, either by positioning their products as Application-aware NPM or Network-aware APM, they often fail to successfully correlate data from the network and applications to meet the expectations of IT and operations managers.
