Scope

The diagram below provides guidance on the steps to take when troubleshooting performance issues. It is split into five sections for easier reading.

Each step in the diagram is linked to a documentation resource or a recommendation.

Prerequisites and Assumptions

The assumption is that a performance issue is observed on a given page (either an AEM console or a web page) and can be reproduced consistently. Having a way to test or monitor performance is a prerequisite before starting the investigation.

The analysis starts at step 0. The goal is to determine which entity (dispatcher, external host, or AEM) is responsible for the performance issue, and then which area (server or network) should be investigated.

Section 1

Section 2

Section 3

Section 4

Section 5

Reference Links

Step | Title | Resources

Step 0

Analyze Request Flow

You can use standard HTTP request analysis in the browser to analyze the request flow. In Chrome, this is done through the DevTools Network panel; see the Chrome DevTools documentation for details.
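As a rough command-line counterpart to the browser's Network panel, curl can print the same timing breakdown per request. A minimal sketch, run against a local file so it works anywhere; in practice, point it at the page you are investigating:

```shell
# Throwaway page so the example runs offline; replace the file:// URL
# with the real page URL when investigating.
printf '<html>test</html>' > /tmp/page.html

# -w prints curl's timing variables, which mirror the DevTools waterfall:
# DNS lookup, TCP connect, time to first byte, and total time.
curl -s -o /dev/null \
  -w 'dns=%{time_namelookup} connect=%{time_connect} ttfb=%{time_starttransfer} total=%{time_total}\n' \
  file:///tmp/page.html
```

Comparing time to first byte against total time helps separate server-side processing from payload transfer time.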

Check if the dispatcher sends HEAD requests to AEM for authentication before delivering the cached resource. You can do this by looking for HEAD requests in the AEM access.log. For more information, see Logging.
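A quick way to spot those HEAD requests is to grep the access log. A sketch against a fabricated log excerpt (the entries below are made up; in a default install the real log is typically found under crx-quickstart/logs/):

```shell
# Fabricated access.log excerpt for illustration only; substitute your
# real AEM access.log.
cat > /tmp/access.log <<'EOF'
127.0.0.1 - - 01/Jan/2024:10:00:00 +0000 "HEAD /content/page.html HTTP/1.1" 200 -
127.0.0.1 - - 01/Jan/2024:10:00:01 +0000 "GET /content/page.html HTTP/1.1" 200 1234
127.0.0.1 - - 01/Jan/2024:10:00:02 +0000 "HEAD /content/other.html HTTP/1.1" 200 -
EOF

# Count HEAD requests; a steady stream of them for cached paths suggests
# the dispatcher re-validates authentication before serving from cache.
grep -c '"HEAD ' /tmp/access.log
# prints 2 for the sample above
```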

Step 6

Is the geographic location of the Dispatcher far away from the users?

Move the Dispatcher closer to the users.

Step 7

Is the network layer of the Dispatcher OK?

Investigate the network layer for saturation and latency issues.

Step 8

Is the slowness reproducible with a local instance?

Use Tough Day to replicate "real world" conditions from the production instances. If this is not realistic for the scale of your development, make sure to test the production instance (or an identical staging one) in a different network context.

Step 9

Is the geographical location of the server far away from the users?

Move the server closer to the users.

Steps 10 and 29

Investigate network layer

Investigate the network layer for saturation and latency issues.

For the author tier, it is recommended that the latency does not surpass 100 milliseconds.
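For example, the average round-trip time from a ping summary can be checked against that 100 ms budget. A sketch using a fabricated summary line (in practice, capture the last line of something like ping -c 5 against your author host):

```shell
# Fabricated ping summary line; in practice capture it from, e.g.:
#   ping -c 5 <author-host> | tail -1
summary='rtt min/avg/max/mdev = 12.3/85.0/130.1/40.2 ms'

# The average RTT is the 5th '/'-separated field of the summary line.
avg=$(printf '%s\n' "$summary" | awk -F'/' '{print $5}')

# Compare against the 100 ms recommendation for the author tier.
awk -v a="$avg" 'BEGIN { print (a > 100) ? "latency too high" : "latency OK" }'
# prints "latency OK" for the 85.0 ms sample above
```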

Is the Keep-Alive header present in the requests so that connections are reused? If not, each request leads to a new connection being established, which introduces unnecessary overhead. (Use standard HTTP request analysis in the browser.)
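One way to verify this outside the browser is to inspect the response headers. A sketch against fabricated headers (in practice, run curl -sI against your page through the dispatcher and check the Connection header):

```shell
# Fabricated response headers for illustration; in practice:
#   curl -sI https://<your-site>/  (URL is a placeholder)
cat > /tmp/headers.txt <<'EOF'
HTTP/1.1 200 OK
Connection: keep-alive
Content-Type: text/html
EOF

# "Connection: keep-alive" means the TCP connection can be reused;
# "Connection: close" would force a new handshake for every request.
grep -i '^Connection:' /tmp/headers.txt
# prints: Connection: keep-alive
```

Note that with HTTP/1.1, persistent connections are the default unless the server explicitly sends "Connection: close".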