
Radek Hecl

Unit-Level Performance Tuning in Java

May 11, 2017

Performance, Programming, Testing

A simple way to improve the performance of your Java applications: everything happens purely in the IDE, without even starting the whole app.

When it comes to performance testing, I hear a lot about having a dedicated environment, funky tools like JMeter or Apica, and complicated scenarios. These take a lot of effort to set up and maintain. Therefore, I like to first make sure that the most critical units are well optimized without any of these tools. One way to do this is through unit-level performance test apps. What's great about these apps is that they need no special tooling, they can be ready to go within minutes, and they can save a lot of time, money, and calls from angry customers.

In this article, I am going to share an example of such a test app. You can do the same in your projects.

Technology stack:

NetBeans IDE

Java

Maven

Testable Unit

To run performance tests for a single unit, you need well-defined and testable units. Let's work with an example (I made this one up, but you will get the idea).

This is a module for message broadcasting. The core method accepts a pipe-delimited String as input, extracts the parameters, finds the appropriate username, and broadcasts the message. The implementation follows.
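The listing itself is not reproduced here, so the following is only a minimal sketch of what such a module might look like. The names processMessage, userProvider, broadcastService, and getUserName come from the article; the interface shapes, the broadcast method signature, the userId|timestamp|text input format, and the validation rules are assumptions made for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;

// Assumed dependency: resolves a user id to a user name (null when unknown).
interface UserProvider {
    String getUserName(String userId);
}

// Assumed dependency: delivers one message to all listeners.
interface BroadcastService {
    void broadcast(String userName, LocalDateTime sentAt, String text);
}

class MessageBroadcaster {
    private final UserProvider userProvider;
    private final BroadcastService broadcastService;

    MessageBroadcaster(UserProvider userProvider, BroadcastService broadcastService) {
        this.userProvider = userProvider;
        this.broadcastService = broadcastService;
    }

    // Assumed input format: userId|ISO-8601 local timestamp|text
    void processMessage(String message) {
        if (message == null) {
            throw new IllegalArgumentException("message must not be null");
        }
        // The -1 limit keeps trailing empty fields so they can be rejected below.
        String[] parts = message.split("\\|", -1);
        if (parts.length != 3 || parts[0].isEmpty() || parts[2].isEmpty()) {
            throw new IllegalArgumentException("expected userId|timestamp|text: " + message);
        }
        LocalDateTime sentAt;
        try {
            sentAt = LocalDateTime.parse(parts[1]);
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("invalid timestamp: " + parts[1], e);
        }
        String userName = userProvider.getUserName(parts[0]);
        if (userName == null) {
            throw new IllegalArgumentException("unknown user: " + parts[0]);
        }
        broadcastService.broadcast(userName, sentAt, parts[2]);
    }
}

class BroadcastDemo {
    public static void main(String[] args) {
        UserProvider users = userId -> "user-" + userId;
        BroadcastService console = (userName, sentAt, text) ->
                System.out.println(sentAt + " " + userName + ": " + text);
        new MessageBroadcaster(users, console).processMessage("42|2017-05-11T10:15:30|hello");
    }
}
```

Passing both dependencies through the constructor is what makes the unit testable in isolation: a test or a performance app can swap in its own implementations.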

In the unit test you can see one happy case and three cases where validation fails. In addition, you can see an invocation of verifyNoMoreInteractions to check that broadcastService doesn't broadcast any unwanted messages. That would be a harmful side effect. On the other hand, userProvider doesn't need this protection, because it performs a read-only operation (assuming it is truly read-only and multiple reads cause no harm). Regardless of your coding style, it is important to do similar work and make sure the code is logically correct before starting any optimization.
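The test listing is not reproduced here either. As a rough illustration of the same checks, the sketch below uses a hand-written recording fake instead of Mockito, so that it runs without any dependencies; in the article's version this role is played by a Mockito mock together with verifyNoMoreInteractions. The compact MessageBroadcaster, the message format, and the three failure cases are assumptions, not the original code.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

interface UserProvider { String getUserName(String userId); }
interface BroadcastService { void broadcast(String userName, LocalDateTime sentAt, String text); }

// Compact, hypothetical unit under test; input format: userId|ISO-8601 timestamp|text
class MessageBroadcaster {
    private final UserProvider userProvider;
    private final BroadcastService broadcastService;

    MessageBroadcaster(UserProvider userProvider, BroadcastService broadcastService) {
        this.userProvider = userProvider;
        this.broadcastService = broadcastService;
    }

    void processMessage(String message) {
        String[] parts = message == null ? new String[0] : message.split("\\|", -1);
        if (parts.length != 3 || parts[2].isEmpty()) {
            throw new IllegalArgumentException("malformed message");
        }
        LocalDateTime sentAt;
        try {
            sentAt = LocalDateTime.parse(parts[1]);
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("invalid timestamp", e);
        }
        String userName = userProvider.getUserName(parts[0]);
        if (userName == null) {
            throw new IllegalArgumentException("unknown user");
        }
        broadcastService.broadcast(userName, sentAt, parts[2]);
    }
}

// Records every broadcast so a test can assert that nothing unwanted was sent.
class RecordingBroadcastService implements BroadcastService {
    final List<String> sent = new ArrayList<>();

    @Override
    public void broadcast(String userName, LocalDateTime sentAt, String text) {
        sent.add(userName + ": " + text);
    }
}

class MessageBroadcasterChecks {
    // Returns true when processMessage rejects the input with IllegalArgumentException.
    static boolean rejects(MessageBroadcaster unit, String message) {
        try {
            unit.processMessage(message);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError(description);
        }
    }

    public static void main(String[] args) {
        RecordingBroadcastService broadcasts = new RecordingBroadcastService();
        UserProvider users = userId -> "42".equals(userId) ? "alice" : null;
        MessageBroadcaster unit = new MessageBroadcaster(users, broadcasts);

        // Happy case: exactly one broadcast, carrying the resolved user name.
        unit.processMessage("42|2017-05-11T10:15:30|hello");
        check(broadcasts.sent.size() == 1 && broadcasts.sent.get(0).equals("alice: hello"), "happy case");

        // Three validation failures: each must be rejected.
        check(rejects(unit, "42|2017-05-11T10:15:30"), "missing field");
        check(rejects(unit, "42|not-a-date|hello"), "bad timestamp");
        check(rejects(unit, "7|2017-05-11T10:15:30|hello"), "unknown user");

        // Stand-in for verifyNoMoreInteractions: the failures broadcast nothing.
        check(broadcasts.sent.size() == 1, "no unwanted broadcasts");
        System.out.println("all checks passed");
    }
}
```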

Test Application and Profiler

Now that you have the code separated into an isolated unit with well-defined unit tests, you are ready to start optimizing the performance. Before writing anything, ask yourself a question: Is this a critical part of the application? If the answer is no, then it's better not to optimize. Examples of critical parts are:

Methods called hundreds of millions of times.

Methods that process a lot of records.

Methods aggregating data from third parties in (near) real-time.

If you have evaluated your method as a critical part of the application, then it is time to write the test application. Typically, you can place the test application right next to the unit tests. Here is an example, which consists of the following parts:

Environment setup and data generation (this should be excluded from the measurement).

A waiting period to allow the user to attach the profiler.

Test execution.

Result presentation.

As you can see, I have used Mockito to mock userProvider, while broadcastService is implemented inline. Both approaches let you create a unit performance test without even having the real implementation of the dependent services. For this purpose, the difference between them is that the mock version carries additional overhead. The right choice depends on the particular use case. That's all about setup.
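The four parts of the test application can be sketched as follows. This is not the original listing: to stay free of dependencies it implements both userProvider and broadcastService inline rather than using Mockito, and the class name, message format, message count, and waiting time are all assumptions. The waiting period prints one dot per second, and both numbers can be overridden from the command line.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

interface UserProvider { String getUserName(String userId); }
interface BroadcastService { void broadcast(String userName, LocalDateTime sentAt, String text); }

// Compact, hypothetical unit under test; input format: userId|ISO-8601 timestamp|text
class MessageBroadcaster {
    private final UserProvider userProvider;
    private final BroadcastService broadcastService;

    MessageBroadcaster(UserProvider userProvider, BroadcastService broadcastService) {
        this.userProvider = userProvider;
        this.broadcastService = broadcastService;
    }

    void processMessage(String message) {
        String[] parts = message.split("\\|", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("malformed message");
        }
        String userName = userProvider.getUserName(parts[0]);
        broadcastService.broadcast(userName, LocalDateTime.parse(parts[1]), parts[2]);
    }
}

class MessageBroadcasterPerfApp {
    // Part 1: data generation, kept outside the measured section.
    static List<String> generateMessages(int count) {
        List<String> messages = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            messages.add((i % 100) + "|2017-05-11T10:15:30|message " + i);
        }
        return messages;
    }

    // Part 2: waiting period, one dot per second, to leave time for attaching a profiler.
    static void waitForProfiler(int seconds) {
        for (int i = 0; i < seconds; i++) {
            System.out.print(".");
            try {
                Thread.sleep(1000L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        System.out.println();
    }

    // Part 3: test execution; returns the elapsed time in milliseconds.
    static long run(MessageBroadcaster unit, List<String> messages) {
        long start = System.nanoTime();
        for (String message : messages) {
            unit.processMessage(message);
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;
        int waitSeconds = args.length > 1 ? Integer.parseInt(args[1]) : 20;

        AtomicLong delivered = new AtomicLong();
        UserProvider users = userId -> "user-" + userId;
        BroadcastService sink = (userName, sentAt, text) -> delivered.incrementAndGet();
        MessageBroadcaster unit = new MessageBroadcaster(users, sink);
        List<String> messages = generateMessages(count);

        System.out.println("Attach the profiler now");
        waitForProfiler(waitSeconds);
        long elapsedMillis = run(unit, messages);

        // Part 4: result presentation.
        System.out.println("Processed " + delivered.get() + " messages in " + elapsedMillis + " ms");
    }
}
```

Counting deliveries in the inline broadcastService gives a cheap sanity check that the measured loop really did the work it was supposed to do.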

Starting Profiler

When you have your application ready, you can run it and attach the profiler during the prepared waiting period (to get good results, you should attach the profiler during that time). In NetBeans, this is pretty easy. Run the application by right-clicking the file and choosing the Run File option. Attach the profiler from the top menu bar via Profile > Attach Profiler, then choose CPU and click Attach. Finally, choose your application and click OK. For illustration, please see the images below.

Analyzing Results

If everything is done correctly, then the console should look similar to the following dump after the application finishes. Note the profiler attaching inside the waiting period (in the middle of the dots):

From the console output you can read that the test took roughly 15 seconds. In addition to the console output, there is a profiler result which looks similar to the following.

It is possible to drill down within the profiler result and see how much time the program spends in each method. The point of this test is to see the details of the processMessage method. It is very clear that the majority of the time is taken by the getUserName method. In this case, it is caused by the call to the mock class. For simplicity, let's assume that it would look similar if the underlying implementation made a call to the database (in such a case, SQL would need to be sent to the database and the database would need to parse it, pull the data, and return the result over some protocol, which would definitely take some time). So, as a conclusion, let's consider the getUserName method the bottleneck to deal with.

Bottleneck Optimization

As you probably know, the typical way to avoid expensive queries is some form of caching. Let's try the most primitive one: using a HashMap. Here's how the optimized processMessage method looks:
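The optimized listing is not reproduced here, so the sketch below only illustrates the idea of putting a HashMap in front of getUserName. The class name, the message format, and the decision not to cache unknown users are assumptions. The demo counts how often the provider is actually called.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;
import java.util.HashMap;
import java.util.Map;

interface UserProvider { String getUserName(String userId); }
interface BroadcastService { void broadcast(String userName, LocalDateTime sentAt, String text); }

// Hypothetical optimized unit; input format: userId|ISO-8601 timestamp|text
class CachingMessageBroadcaster {
    private final UserProvider userProvider;
    private final BroadcastService broadcastService;
    // Primitive cache: unbounded, never invalidated, and not thread-safe.
    private final Map<String, String> userNames = new HashMap<>();

    CachingMessageBroadcaster(UserProvider userProvider, BroadcastService broadcastService) {
        this.userProvider = userProvider;
        this.broadcastService = broadcastService;
    }

    void processMessage(String message) {
        String[] parts = message == null ? new String[0] : message.split("\\|", -1);
        if (parts.length != 3 || parts[2].isEmpty()) {
            throw new IllegalArgumentException("malformed message");
        }
        LocalDateTime sentAt;
        try {
            sentAt = LocalDateTime.parse(parts[1]);
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("invalid timestamp", e);
        }
        String userName = userNames.get(parts[0]);
        if (userName == null) {
            // Cache miss: only this branch reaches the (expensive) provider.
            userName = userProvider.getUserName(parts[0]);
            if (userName == null) {
                throw new IllegalArgumentException("unknown user");
            }
            userNames.put(parts[0], userName);
        }
        broadcastService.broadcast(userName, sentAt, parts[2]);
    }
}

class CachingDemo {
    public static void main(String[] args) {
        int[] lookups = {0};
        UserProvider countingProvider = userId -> {
            lookups[0]++;
            return "user-" + userId;
        };
        BroadcastService ignore = (userName, sentAt, text) -> { };
        CachingMessageBroadcaster unit = new CachingMessageBroadcaster(countingProvider, ignore);

        unit.processMessage("42|2017-05-11T10:15:30|first");
        unit.processMessage("42|2017-05-11T10:15:30|second");
        System.out.println("provider lookups: " + lookups[0]); // prints "provider lookups: 1"
    }
}
```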

When you run the performance test program with this adjustment, the whole run takes around 4.4 seconds instead of the original 15 seconds (on the same machine). The profiler result looks like the following:

Now the bottleneck becomes the function for parsing dates from the string. This would be the next step for optimization, if required. Before closing, let me add a few notes.

Using hash maps for caching is probably not what you want in most real cases.

The optimization introduced a new branch of code which is not covered by the current unit test. Good practice is to revisit the unit test and get this case properly covered.

Optimization generally makes code more complex and less readable. Therefore, focus first only on the parts of your application which are critical, use a profiler to find the real bottleneck within the units, and stop optimizing when performance is good enough for your case.
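The article does not say what to use instead of a plain HashMap. One well-known limitation is that such a cache grows without bound and never evicts anything. Purely as an illustration of one possible alternative, the sketch below bounds the cache with the standard LinkedHashMap access-order mode and its removeEldestEntry hook; the class name and the size limit are made up for this example, and it still handles neither expiry nor thread safety.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: evicts the least recently used entry once full.
class BoundedUserNameCache extends LinkedHashMap<String, String> {
    private final int maxEntries;

    BoundedUserNameCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        // Called by put(); returning true drops the least recently used entry.
        return size() > maxEntries;
    }
}

class BoundedCacheDemo {
    public static void main(String[] args) {
        BoundedUserNameCache cache = new BoundedUserNameCache(2);
        cache.put("1", "alice");
        cache.put("2", "bob");
        cache.get("1");          // touch "1" so that "2" becomes the eldest entry
        cache.put("3", "carol"); // exceeds the limit and evicts "2"
        System.out.println(cache.keySet()); // prints [1, 3]
    }
}
```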

Summary

This article shows one way of performance tuning at the unit level. This type of optimization has the advantage that anyone can do it with only a laptop and a few basic tools, and everything can be set up within minutes. Therefore, it is a great first layer of performance tuning, which will save you a lot of time and money during the later stages.