Introduction to Memory Leaks In Java Apps

One of the core benefits of Java is the JVM's out-of-the-box memory management. Essentially, we create objects and the JVM takes care of allocating memory, while the Java Garbage Collector frees it up for us.

Nevertheless, memory leaks can still occur in Java applications.

In this article, we're going to describe the most common memory leaks, understand their causes, and look at a few techniques to detect and avoid them. We're also going to use the YourKit Java profiler throughout the article to analyze the state of our memory at runtime.

1. What is a Memory Leak in Java?

The standard definition of a memory leak is a scenario that occurs when objects are no longer being used by the application, but the Garbage Collector is unable to remove them from working memory - because they're still being referenced. As a result, the application consumes more and more resources - which eventually leads to a fatal OutOfMemoryError.

For a better understanding of the concept, here's a simple visual representation:

As we can see, we have two types of objects - referenced and unreferenced; the Garbage Collector can remove objects that are unreferenced. Referenced objects won't be collected, even if they're actually no longer used by the application.

Detecting memory leaks can be difficult. A number of tools perform static analysis to determine potential leaks, but these techniques aren't perfect because the most important aspect is the actual runtime behavior of the running system.

So, let's have a focused look at some of the standard practices of preventing memory leaks, by analyzing some common scenarios.

2. Java Heap Leaks

In this initial section, we're going to focus on the classic memory leak scenario - where Java objects are continuously created without being released.

A useful technique for understanding these situations is to make the leak easier to reproduce by lowering the size of the Heap. To do that, we can adjust the following JVM parameters when starting our application:

-Xms<size>

-Xmx<size>

These parameters specify the initial Java Heap size as well as the maximum Heap size.
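To confirm which limits the running JVM actually picked up, a small check along these lines can help. This is only a minimal sketch: the class name HeapSettings and the helper toMiB are our own illustration and not part of the original example.

```java
// Hypothetical helper class: prints the heap limits the running JVM is using.
class HeapSettings {

    // Converts a byte count to whole mebibytes for readable output.
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        // maxMemory() reflects the -Xmx limit (or the JVM default if the flag is absent).
        System.out.println("Max heap:       " + toMiB(runtime.maxMemory()) + " MiB");
        // totalMemory() is the heap currently reserved; it starts near -Xms and can grow.
        System.out.println("Committed heap: " + toMiB(runtime.totalMemory()) + " MiB");
        // freeMemory() is the unused part of the currently reserved heap.
        System.out.println("Free heap:      " + toMiB(runtime.freeMemory()) + " MiB");
    }
}
```

Assuming a JDK that can launch single source files (11 or later), running it with something like java -Xms64m -Xmx128m HeapSettings.java should report a maximum close to 128 MiB; the exact figure can be slightly lower depending on the collector in use.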

2.1. Static Field Holding On to the Object Reference

The first scenario that might cause a Java memory leak is referencing a heavy object with a static field.

We created our ArrayList as a static field - which will never be collected by the JVM Garbage Collector during the lifetime of the JVM process, even after the calculations it was used for are done. We also invoked Thread.sleep(10000) to allow the GC to perform a full collection and try to reclaim everything that can be reclaimed.
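The listing for this test isn't reproduced here, so the following is only a sketch of what such a test might look like. The class name StaticFieldLeak, the element count, and the explicit System.gc() request are assumptions on our part; only the static ArrayList and the Thread.sleep(10000) call come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical reconstruction of the scenario; names and sizes are assumptions.
class StaticFieldLeak {

    // Static field: reachable for as long as the class stays loaded,
    // so everything added here survives every garbage collection.
    static final List<Double> LIST = new ArrayList<>();

    private static final Random RANDOM = new Random();

    // Fills the static list with 'count' random values.
    static void populate(int count) {
        for (int i = 0; i < count; i++) {
            LIST.add(RANDOM.nextDouble());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        populate(1_000_000);
        // System.gc() is only a request to the JVM; the sleep keeps the process
        // alive so the effect can be observed in a profiler.
        System.gc();
        Thread.sleep(10000);
        System.out.println("Elements still held: " + LIST.size());
    }
}
```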

Let's run the test and analyze the JVM with our profiler:

Notice how, at the very beginning, all memory is, of course, free.

Then, in just 2 seconds, the iteration process runs and finishes - loading everything into the list (naturally this will depend on the machine you're running the test on).

After that, a full garbage collection cycle is triggered, and the test continues to execute, to allow this cycle time to run and finish. As you can see, the list is not reclaimed and the memory consumption doesn't go down.

Let's now see the exact same example, only this time, the ArrayList isn't referenced by a static variable. Instead, it's a local variable that gets created, used and then discarded:
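Again, the listing itself isn't shown here; a minimal sketch of the local-variable variant, with the hypothetical class name LocalListNoLeak and an assumed element count, could look like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical counterpart to the static-field scenario; names are assumptions.
class LocalListNoLeak {

    // The list is a local variable, so it becomes unreachable
    // (and therefore collectable) as soon as this method returns.
    static int addElementsToLocalList(int count) {
        List<Double> list = new ArrayList<>();
        Random random = new Random();
        for (int i = 0; i < count; i++) {
            list.add(random.nextDouble());
        }
        return list.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Added: " + addElementsToLocalList(1_000_000));
        // Request a collection (a hint only) and keep the process alive for the profiler.
        System.gc();
        Thread.sleep(10000);
    }
}
```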

Once the method finishes its job, we'll observe a major GC collection, around the 50th second in the image below:

Notice how the GC is now able to reclaim some of the memory utilized by the JVM.

How to prevent it?

Now that we understand the scenario, let's look at ways to prevent it from occurring.

First, we need to pay close attention to our usage of static; declaring any collection or heavy object as static ties its lifecycle to the lifecycle of the JVM itself, and makes the entire object graph impossible to collect.

We also need to be aware of collections in general - that's a common way to unintentionally hold on to references for longer than we need to.

2.2. Calling String.intern() on Long String

The second group of scenarios that frequently causes memory leaks involves String operations - specifically the String.intern() API.

Here, we simply try to load a large text file into running memory and then return a canonical form, using .intern().
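A minimal sketch of that kind of code is shown below. The class name InternExample, the method readAndIntern, and the temporary demo file are assumptions made so the example runs on its own; a real test would point at an actual large text file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example; the names and the demo file are assumptions.
class InternExample {

    // Reads the whole file into one String and returns its canonical (interned) instance.
    static String readAndIntern(Path file) throws IOException {
        String str = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        // intern() adds the String to the JVM-wide string pool and returns the pooled copy.
        return str.intern();
    }

    public static void main(String[] args) throws IOException {
        // A small throwaway file stands in for the large text file.
        Path file = Files.createTempFile("intern-demo", ".txt");
        try {
            Files.write(file, "some large text".getBytes(StandardCharsets.UTF_8));
            String interned = readAndIntern(file);
            System.out.println("Interned " + interned.length() + " characters");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```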

The intern API will place the str String in the JVM's string pool, where it stays for as long as it is referenced and, on older JVMs, occupies the fixed-size PermGen space. Again, this leaves the GC unable to free up enough memory:

We can clearly see that the JVM is stable for the first 15 seconds; then we load the file and the JVM performs a garbage collection (around the 20th second).

Finally, str.intern() is invoked, which leads to the memory leak - the stable line indicating high heap memory usage, which will never be released.

How to prevent it?

Please remember that, up to Java 6, interned String objects are stored in PermGen space - if our application is intended to perform a lot of operations on large strings, we might need to increase the size of the permanent generation:

-XX:MaxPermSize=<size>

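The second solution is to use a newer Java version. Since Java 7, the string pool has lived on the main heap, and Java 8 replaced the PermGen space with the Metaspace, so interning Strings no longer triggers a PermGen OutOfMemoryError (although heavy interning can still exhaust the heap).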

Finally, there are also several ways to avoid calling .intern() on Strings altogether.

2.3. Unclosed Streams

Forgetting to close a stream is a very common scenario, and certainly one that most developers can relate to. The problem was partially removed in Java 7, which introduced the try-with-resources statement to automatically close any stream (or other AutoCloseable resource) declared in it.

In this case, the BufferedReader will be automatically closed at the end of the try statement, without the need to close it in an explicit finally block.
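A try-with-resources block of this kind might look as follows. This is a sketch under our own assumptions: the class name ReadWithResources, the countLines method, and the temporary demo file are illustrative only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example; names and the demo file are assumptions.
class ReadWithResources {

    // Counts the lines in a file; the reader is closed automatically,
    // whether the loop completes normally or throws.
    static int countLines(Path file) throws IOException {
        int lines = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("stream-demo", ".txt");
        try {
            Files.write(file, "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
            System.out.println("Lines: " + countLines(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```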

2.4. Unclosed Connections

This scenario is quite similar to the previous one, the primary difference being that we're dealing with unclosed connections (e.g. to a database, to an FTP server, etc.). Again, improper implementation can do a lot of harm, leading to memory problems.

The URLConnection remains open, and the result is, predictably, a memory leak:

Notice how the Garbage Collector cannot do anything to release unused, but referenced memory. The situation is immediately clear after the 1st minute - the number of GC operations rapidly decreases, causing increased Heap memory use, which leads to the OutOfMemoryError.

How to prevent it?

The answer here is simple - we need to always close connections in a disciplined manner.
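One way to apply that discipline to a URLConnection is sketched below. The class name ClosingConnections and the readAll helper are hypothetical, and the demo reads from a local file: URL only so that it runs without network access; the same pattern is intended for http(s) URLs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example; names and the demo URL are assumptions.
class ClosingConnections {

    // Reads the full response body and releases the connection's resources.
    static byte[] readAll(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        // try-with-resources guarantees the stream is closed, even on failure.
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } finally {
            // For HTTP connections, disconnect() signals that no further
            // requests to this server are expected.
            if (connection instanceof HttpURLConnection) {
                ((HttpURLConnection) connection).disconnect();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("connection-demo", ".txt");
        try {
            Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
            byte[] body = readAll(file.toUri().toURL());
            System.out.println(new String(body, StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```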

2.5. Adding Objects with no hashCode() and equals() into a HashSet

A simple but very common example that can lead to a memory leak is to use a HashSet with objects that are missing their hashCode() or equals() implementations.

Specifically, when we start adding duplicate objects into a Set, it will only ever grow instead of ignoring duplicates as it should. We also won't be able to look up or remove these objects through an equal instance once they're added.
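The difference is easy to demonstrate with two small key classes. Both classes and the HashSetDuplicates wrapper are hypothetical names chosen for this sketch.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical example; all class and method names are assumptions.
class HashSetDuplicates {

    // No equals()/hashCode(): instances are compared by identity.
    static class KeyWithoutEquals {
        final String name;
        KeyWithoutEquals(String name) { this.name = name; }
    }

    // Value-based equals()/hashCode(): instances with the same name are equal.
    static class KeyWithEquals {
        final String name;
        KeyWithEquals(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof KeyWithEquals)) return false;
            return name.equals(((KeyWithEquals) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    // Adds the "same" key repeatedly; every instance is treated as distinct.
    static int sizeWithoutEquals(int insertions) {
        Set<KeyWithoutEquals> set = new HashSet<>();
        for (int i = 0; i < insertions; i++) {
            set.add(new KeyWithoutEquals("key"));
        }
        return set.size();
    }

    // Adds the "same" key repeatedly; duplicates are recognized and ignored.
    static int sizeWithEquals(int insertions) {
        Set<KeyWithEquals> set = new HashSet<>();
        for (int i = 0; i < insertions; i++) {
            set.add(new KeyWithEquals("key"));
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("Without equals/hashCode: " + sizeWithoutEquals(1000)); // prints 1000
        System.out.println("With equals/hashCode:    " + sizeWithEquals(1000));    // prints 1
    }
}
```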

3. How to Find Leaking Sources in Your Application

Diagnosing memory leaks is a lengthy process that requires a lot of practical experience, debugging skills and detailed knowledge of the application.

Let's see which techniques can help you in addition to standard profiling.

3.1. Verbose Garbage Collection

One of the quickest ways to identify a memory leak is to enable verbose garbage collection.

By adding the -verbose:gc parameter to the JVM configuration of our application, we're enabling a trace of GC activity. By default, the summary reports are written to the JVM's console output, which should help you understand how your memory is being managed.
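Similar summary figures can also be sampled from inside the application through the standard java.lang.management API. The sketch below is meant as a complement to -verbose:gc rather than a replacement for it, and the class name GcStats is our own.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the class and method names are assumptions.
class GcStats {

    // Returns one summary line per garbage collector registered in this JVM.
    static List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() and getCollectionTime() return -1 when undefined.
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : summarize()) {
            System.out.println(line);
        }
    }
}
```

Logging these values periodically gives a rough picture of the same trend: in a leaking application, collections tend to become more frequent while reclaiming less memory each time.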

3.2. Do Profiling

The second technique is the one we've been using throughout this article - and that's profiling. A popular profiler is VisualVM - which is a good place to start moving past command-line JDK tools and into lightweight profiling.

In this article, we used another profiler - YourKit - which has some additional, more advanced features compared to VisualVM.

3.3. Review Your Code

Finally, this is more of a general good practice than a specific technique to deal with memory leaks.

Simply put - review your code thoroughly, practice regular code reviews and make good use of static analysis tools to help you understand your code and your system.

Conclusion

In this tutorial, we had a practical look at how memory leaks happen on the JVM. Understanding how these scenarios happen is the first step in the process of dealing with them.

Then, having the techniques and tools to really see what's happening at runtime, as the leak occurs, is critical as well. Static analysis and careful code-focused reviews can only do so much, and - at the end of the day - it's the runtime that will show you the more complex leaks that aren't immediately identifiable in the code.

Finally, leaks can be notoriously hard to find and reproduce because many of them only happen under intense load, which generally happens in production. This is where you need to go beyond code-level analysis and work on two main aspects - reproduction and early detection.

The best and most reliable way to reproduce memory leaks is to simulate the usage patterns of a production environment as closely as possible, with the help of a good suite of performance tests.
