To diagnose a network element, right-click and choose View Metrics. You can troubleshoot the following network elements:

Tier – The performance chart (top left) shows the rate of application performance outliers on the relevant nodes (Errors and Slow/Very Slow/Stalled Calls) and the Key Performance Indicators for all Connections used by those nodes (Errors and PIE).

Network Link – The performance chart (top left) shows the rate of Performance Impacting Events for all member Connections of that link.

Network Connection – You can troubleshoot a Connection from the Connection Explorer or from the Connections tab in a link popup.

You can troubleshoot a transaction from a Transaction Snapshot.

You can right-click an application flow and choose View Network Metrics.

You can also drill down to a node where a transaction delay, stall, or error occurred and go to the Network tab.

To troubleshoot a tier, link, or connection in the Network Browser, right-click the network element and choose View Metrics. The top-left chart in the dashboard shows the overall network performance of the element, its application performance, or both. To troubleshoot the element, look for correlations between the performance chart (top left) and the other charts on the page.

Tip

You can switch between linear and logarithmic scale in each chart to best highlight metric spikes and variations. Click the settings button (top-right corner of the chart) to switch between scales.

If you see spikes in transaction outliers and correlated spikes in PIE or errors, this indicates that the network is affecting application performance. Look for correlated spikes in the other charts to identify specific issues and root causes: connection errors, packet loss, retransmissions, high-latency connections, and so on.
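
If you export the chart data, you can check a suspected correlation numerically instead of by eye. The following is a minimal sketch, assuming you have two per-minute series of equal length; the sample values and the names `slow_calls` and `pie_rate` are hypothetical and do not come from the product.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no spikes to correlate
    return cov / sqrt(var_x * var_y)


# Hypothetical per-minute samples (illustration only).
slow_calls = [2, 3, 2, 40, 38, 3, 2]   # transaction outliers
pie_rate = [1, 1, 2, 25, 30, 2, 1]     # Performance Impacting Events

# A value close to 1.0 means the two series spike together.
print(round(pearson(slow_calls, pie_rate), 2))
```

A high coefficient only shows that the spikes coincide. You still need the other charts to identify the root cause.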

KPI

Host Stack KPIs

TCP WAIT sockets can result in significant delays and/or errors for the application or service that relies on that socket. A large number of simultaneous WAIT sockets can prevent applications and services from creating new connections.
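
To cross-check this KPI on a Linux host, you can count sockets by state yourself. The following is a rough sketch, assuming the Linux `/proc/net/tcp` text format, where the fourth column holds a hexadecimal state code. Treating TIME_WAIT and CLOSE_WAIT as the "WAIT" states is an assumption here, and the sample text is made up.

```python
from collections import Counter

# State codes used in the "st" column of Linux /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from /proc/net/tcp-style text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts


def wait_sockets(counts):
    """Sum the states treated as WAIT sockets (assumption: TIME_WAIT + CLOSE_WAIT)."""
    return counts["TIME_WAIT"] + counts["CLOSE_WAIT"]


# Made-up sample in the /proc/net/tcp layout (trailing columns omitted).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000
   1: 0100007F:1F90 0100007F:C350 06 00000000:00000000
   2: 0100007F:1F90 0100007F:C351 08 00000000:00000000
   3: 0100007F:1F90 0100007F:C352 01 00000000:00000000
"""

print(wait_sockets(count_tcp_states(SAMPLE)))  # prints 2
```

On a real host you would pass the contents of `/proc/net/tcp` (and `/proc/net/tcp6` for IPv6) instead of `SAMPLE`.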

Network PIE - Contributors

Performance Impacting Events (PIE) are useful for identifying the location of actual or potential bottlenecks:

RSTs (resets) – A TCP Reset is an immediate closing of a connection. Not all connection resets indicate a problem, but it is good practice to investigate any spike in resets that coincides with a spike in application errors or slow transactions/calls. Connection resets can occur for various reasons, such as:

The server is unable to create connections.

Intermediate network elements, such as load balancers or firewalls, reset connections due to misconfiguration or other errors.

Current Established Connections

TCP Loss (mille)

The number of packets lost (sent but not received) per 1000 packets sent. "Per mille" means parts per thousand, so it expresses the loss rate with one more digit of precision than a percentage: 1 per mille equals 0.1 percent. TCP detects lost packets and retransmits each lost packet until it receives an ACK (acknowledgement) from the peer. Spikes in TCP Loss generally indicate that the network is overutilized.
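
The per-mille arithmetic can be sketched as follows. The function name `loss_per_mille` and the packet counts are illustrative assumptions, not part of the product.

```python
def loss_per_mille(packets_lost, packets_sent):
    """Return packets lost per 1000 packets sent."""
    if packets_sent <= 0:
        return 0.0  # nothing was sent, so no loss rate can be computed
    return 1000.0 * packets_lost / packets_sent


# 5 packets lost out of 2000 sent is 2.5 per mille, which is 0.25 percent.
print(loss_per_mille(5, 2000))  # prints 2.5
```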

Retransmissions per Minute

Data Retransmits – The percentage of packets that were retransmitted. This metric includes SYN and FIN retransmissions.

SACK Retransmits – The percentage of data packets that were retransmitted due to selective acknowledgements (SACK, a TCP feature).

Latency (RTT) Comparison

This chart compares the average TCP round-trip times (RTTs) for different types of packet request/responses.

Handshake RTT – Average RTT for the initial 3-way handshake (SYN, SYN/ACK, ACK) to set up a connection. If Handshake RTTs are significantly higher than Initial RTTs, this indicates that the client node is taking a long time to set up the connection.

Initial RTT – Average RTT for the initial SYN packets (between SYN and SYN-ACK or SYN-ACK and ACK). If Initial RTTs are high, the delay might be due to an intervening firewall, asymmetric routing, or other network issue that causes the network to treat SYN packets differently from data packets.

RTT – Average TCP RTT for data request/responses, that is, the time for TCP data segments to be transmitted and acknowledged by the peer. This is different from Application Response Time metrics, which measure the time an application takes to process a request.
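
The comparisons described above can be turned into a simple check. This is a sketch under stated assumptions: the `factor` threshold is hypothetical, and "high" Initial RTT is interpreted here as high relative to the data RTT, which the chart descriptions do not state explicitly.

```python
def interpret_rtts(handshake_ms, initial_ms, data_ms, factor=2.0):
    """Return findings suggested by comparing the three average RTTs.

    `factor` is a hypothetical threshold for "significantly higher".
    """
    findings = []
    if handshake_ms > factor * initial_ms:
        findings.append("client node is slow to set up connections")
    if initial_ms > factor * data_ms:
        # Assumption: Initial RTT is judged against the data RTT.
        findings.append("network may treat SYN packets differently from data packets")
    return findings


# Made-up averages in milliseconds.
print(interpret_rtts(handshake_ms=90, initial_ms=30, data_ms=28))
```

Tune `factor` to your environment. A fixed multiplier is only a starting point for deciding which chart to inspect next.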

Connection Lifetime

TCP is most efficient when long, stable connections are used. Connection setups and teardowns are time-consuming and resource-intensive. The more short-lived connections are created, the worse TCP performance becomes: most of the time is spent creating connections and, because of TCP Slow-Start, optimal TCP bandwidth is not achieved.
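
To see why short-lived connections are costly, consider an idealized model of slow start. The sketch below assumes no packet loss, no receiver-window limit, an initial congestion window of 10 segments, one round trip for the handshake, and a window that doubles every round trip. Real TCP stacks behave differently in detail, and the function name is illustrative.

```python
def round_trips_to_send(segments, initial_cwnd=10, handshake_rtts=1):
    """Round trips needed to deliver `segments` on a fresh connection.

    Idealized slow start: the congestion window doubles every round trip.
    """
    if segments < 0:
        raise ValueError("segments must be non-negative")
    rtts = handshake_rtts
    cwnd = initial_cwnd
    remaining = segments
    while remaining > 0:
        remaining -= cwnd  # send one full window per round trip
        cwnd *= 2          # slow start doubles the window
        rtts += 1
    return rtts


one_long = round_trips_to_send(100)        # one connection, 100 segments
ten_short = 10 * round_trips_to_send(10)   # ten sequential connections, 10 segments each
print(one_long, ten_short)  # prints 5 20
```

In this model, sending the same 100 segments over ten sequential short connections takes four times as many round trips as sending them over one connection.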