Diagnose Network Problems with Integrated Network Visibility

More and more distributed apps are being deployed in private, hybrid, and public clouds, and the performance of these apps is becoming increasingly critical for enterprises.

In fact, the AppDynamics 2017 App Attention Index highlights the modern-day consumer demand for speed and consistency, with 62 percent of respondents expressing increased expectations for how well digital services should perform. What’s more, when apps don’t perform correctly, 80 percent of users will delete the app. Needless to say, the bar for application performance is extremely high.

AppDynamics APM is well-equipped to monitor the performance of these apps, pinpointing app flows that degrade the end-user experience through the lens of the Business Transaction (BT). However, operations teams triaging problems are always challenged with the question of whether the underlying network is the cause of the degradation.

Enterprises typically have dedicated teams to manage the infrastructure (including the network) and the apps, but these teams don’t necessarily speak the same language, which creates a communication barrier. AppDynamics Integrated Network Visibility aims to facilitate collaboration between these teams and bring down mean time to repair (MTTR). It is designed to enable AppOps to identify network-level problems during the “First Call” and escalate them to the right network team with actionable information. It also integrates with application flow maps and directly correlates network performance metrics with application performance metrics, all within the context of business transactions.

Dynamic Dashboard for Network Visibility

One of the standout features of Network Visibility is the Dynamic Dashboard, a set of widgets that chart trends in Transmission Control Protocol (TCP) connection metrics and host-level TCP socket metrics over a selected time range. It includes native metrics such as Throughput, Loss, Data and SACK Retransmissions, TCP Resets, and Connection Information, as well as uber metrics (a single representation of several related metrics) such as Network Errors and Performance Impacting Events (PIE). For example:

Network Errors bundles FIN Errors, SYN Black-holes, SYN Resets, and RST on Established, which capture errors that can occur during setup or teardown of TCP connections.

PIE coalesces Client Zero Window, Client Limited, RTOs (retransmission timeouts), Server Zero Window, and Server Limited, which help highlight symptoms of a problem on the client node, the server node, or the path between them. The full list of dashboard metrics can be found here.
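To make the idea of an uber metric concrete, here is a minimal sketch of rolling several component counters up into one number. The component names follow the lists above, but the counter keys, the `rollup` helper, and the choice of a plain sum are assumptions made for illustration. The article does not say how AppDynamics actually combines these metrics.

```python
# Component names mirror the article; the dictionary keys are hypothetical.
NETWORK_ERROR_COMPONENTS = (
    "fin_errors", "syn_blackholes", "syn_resets", "rst_on_established",
)
PIE_COMPONENTS = (
    "client_zero_window", "client_limited", "rtos",
    "server_zero_window", "server_limited",
)


def rollup(counters, components):
    """Sum the named component counters, treating missing ones as zero.

    A plain sum is an assumption; the product may weight or combine
    the components differently.
    """
    return sum(counters.get(name, 0) for name in components)


# Hypothetical per-interval counts for one TCP connection.
sample = {
    "fin_errors": 2, "syn_resets": 1, "rst_on_established": 4,
    "rtos": 3, "server_zero_window": 5,
}

network_errors = rollup(sample, NETWORK_ERROR_COMPONENTS)
pie = rollup(sample, PIE_COMPONENTS)
print(network_errors, pie)  # prints: 7 8
```

A single rolled-up value makes it easy to see that something is wrong; the component breakdown is still needed to tell, for instance, a server-side zero window from retransmission timeouts on the path.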

With this data, you can identify how much the underlying network infrastructure contributes to a problem. For example, consider a stalled transaction on your application flow map. With Network Visibility, you can launch this dashboard for the affected Tier / Node / Link and see relevant network information, including:

A spike in the Latency trend, which could indicate a sluggish TCP connection between two services.

An uptick in Retransmissions, which could indicate network congestion.

The “Network Impact on Transactions” widget, which juxtaposes PIE and Network Errors against Transactions so that the network contribution to afflicted transactions can be identified.

Network Errors and Connection Information widgets, which help identify issues with TCP connections and their lifetimes.

The Host Stack KPIs widget, which includes metrics such as Interface Collisions and Wait Sockets that can help unearth issues with NIC or duplex configuration.

Throughput, Loss and Latency widgets, which highlight the network health of the selected entity.
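The "spike" and "uptick" signals in the list above can be approximated with a simple threshold against a baseline. The sketch below is illustrative only: the `find_spikes` helper, the median baseline, the factor of 3, and the sample values are all assumptions, not a description of how the dashboard detects anomalies.

```python
from statistics import median


def find_spikes(samples, factor=3.0, min_baseline=1.0):
    """Return the indices of samples that exceed factor x the series median.

    min_baseline keeps the threshold above zero for series that are
    mostly zeros, such as retransmission counts on a healthy link.
    """
    baseline = max(median(samples), min_baseline)
    threshold = factor * baseline
    return [i for i, value in enumerate(samples) if value > threshold]


# Hypothetical per-minute samples for one link.
latency_ms = [12, 11, 13, 12, 95, 14, 12]
retransmissions = [0, 1, 0, 1, 1, 9, 1]

print(find_spikes(latency_ms))       # prints: [4]
print(find_spikes(retransmissions))  # prints: [5]
```

The median is used here instead of the mean so that the spike itself does not drag the baseline upward.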

Snapshot Correlation

Transaction Snapshots is a popular feature in which AppDynamics retains a snapshot of certain transaction instances. A snapshot can be triggered by automatic detection of a slow transaction or by a user-driven diagnostic session. It gives you a cross-tier view of the processing flow for that particular transaction.

Transaction Snapshot drill-downs include a Network tab for the Dynamic Dashboard, which lets you correlate network metrics captured at the time of snapshot collection. Each chart highlights the snapshot time range, so you can look for correlations across the charts and drill down to the root cause.
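Conceptually, highlighting the snapshot time range amounts to selecting the metric samples that fall inside the snapshot window. A minimal sketch of that selection, assuming metrics arrive as (timestamp, value) pairs, might look like the following. The `samples_in_window` helper and all of the data are hypothetical.

```python
from datetime import datetime, timedelta


def samples_in_window(samples, start, end):
    """Keep (timestamp, value) pairs whose timestamp lies within [start, end]."""
    return [(ts, value) for ts, value in samples if start <= ts <= end]


t0 = datetime(2017, 1, 1, 12, 0, 0)

# Hypothetical one-minute retransmission counts starting at t0.
retransmissions = [
    (t0 + timedelta(minutes=i), count)
    for i, count in enumerate([0, 1, 7, 8, 1, 0])
]

# Hypothetical snapshot window covering minutes 2 and 3.
snapshot_start = t0 + timedelta(minutes=2)
snapshot_end = t0 + timedelta(minutes=3)

in_window = samples_in_window(retransmissions, snapshot_start, snapshot_end)
print([count for _, count in in_window])  # prints: [7, 8]
```

In this made-up data the retransmission counts inside the window are clearly higher than those outside it, which is the kind of correlation the highlighted charts are meant to surface.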

With integrated network visibility now running alongside the APM metrics you rely on to run your business-critical applications, you can easily switch to a view of critical network performance indicators for your tiers, nodes, and the flows between them.

In high-production environments where release cycles are measured in hours or minutes — not days or weeks — there's little room for mistakes and no room for confusion. Everyone has to understand what's happening, in real time, and have the means to do whatever is necessary to keep applications up and running optimally.

DevOps is a high-stakes world, but done well, it delivers the agility and performance to significantly impact business competitiveness.