Issue

Environment

Cause

Many things can cause AEM to take a long time to shut down. When you stop the AEM Java process, the JVM runs a shutdown hook that stops the Apache Felix OSGi container in which AEM runs. While the container shuts down, it stops all OSGi bundles and components. As part of that process, various services finish write operations, close open file handles, and wait until all active HTTP requests have been answered.

The most common causes of slow shutdowns are:

The deactivate method for an OSGi component takes a long time to execute

Long-running requests are still being processed when the system is shut down

Resolution

To fix a slow shutdown issue, you need to analyze thread dumps to find out which threads are delaying the shutdown.
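Analyzing thread dumps requires several dumps taken while the shutdown is in progress. The following is a minimal sketch of one way to collect them, assuming the JDK's jstack tool is on the PATH and that you know the process ID of the AEM Java process. The function name capture_dumps, the output file naming, and the JSTACK override variable are illustrative assumptions, not part of AEM.

```shell
# Take COUNT thread dumps of a JVM, INTERVAL seconds apart.
# Usage: capture_dumps <pid> [count] [interval-seconds] [output-dir]
# Hypothetical helper; the file naming scheme is an assumption.
capture_dumps() {
  pid=$1
  count=${2:-10}
  interval=${3:-5}
  outdir=${4:-.}
  i=1
  while [ "$i" -le "$count" ]; do
    # jstack ships with the JDK; -l adds lock information to the dump.
    # JSTACK can be set to point at a different jstack binary.
    ${JSTACK:-jstack} -l "$pid" > "$outdir/jstack.$pid.$i.txt" 2>&1
    i=$((i + 1))
    # Pause between dumps, but not after the last one.
    if [ "$i" -le "$count" ]; then
      sleep "$interval"
    fi
  done
}

# Example (run right after issuing the stop command):
#   capture_dumps <aem-pid> 10 5 /tmp/dumps
```

Spacing the dumps a few seconds apart makes it easier to tell a thread that is genuinely stuck from one that only happened to be busy at a single moment.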

In addition to request threads, search for the thread named "FelixStartLevel". That thread handles starting and stopping all the OSGi bundles and components, so its stack trace usually gives some indication of what is delaying shutdown.
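To read what that thread is doing without scrolling through a whole dump, you can print only its stack. The sketch below assumes jstack-style dumps, in which each thread starts with a header line beginning with the quoted thread name and ends at the next blank line. The function name thread_stack and the file names in the usage comment are hypothetical.

```shell
# Print the header and stack trace of one named thread from a thread dump.
# Usage: thread_stack <thread-name> <dump-file>
# Assumes jstack-style output: "name" header line, frames, then a blank line.
thread_stack() {
  awk -v name="\"$1\"" '
    index($0, name) == 1 { printing = 1 }   # header line starts with "name"
    printing && /^[[:space:]]*$/ { exit }   # a blank line ends the thread block
    printing { print }
  ' "$2"
}

# Example, for dumps saved as jstack.1.txt, jstack.2.txt, ... (assumed names):
#   for f in jstack.*.txt; do echo "== $f"; thread_stack FelixStartLevel "$f"; done
```

If the same bundle or component appears in the output for most of the dumps, it is a likely candidate for what is holding up the shutdown.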

Look for patterns in the stack trace of the "FelixStartLevel" thread across thread dumps. Check whether it is stuck stopping the same bundle or deactivating the same OSGi component in many of the dumps. You can use a tool such as "grep" for this. For example, if you observe that the SlingServletResolver OSGi component is being deactivated in multiple thread dumps, you could use the following command, which counts how many thread dumps have the FelixStartLevel thread with SlingServletResolver in its stack trace: