Clean Readable Performant Java

Introduction

"Do more with less" is a frequent mantra when supporting legacy
applications. After years of accumulating cruft these same
applications are now expected to support the "big data" workloads
of today. In order to achieve these performance levels, it is
often necessary to run profilers to find and remove the
bottlenecks. Before doing any optimization, however, it is very
helpful if the code is easy to understand. (We assume that the
code is hard to understand because otherwise the optimization
would have already been done.) Once the code is readable, it
becomes clear where optimizations can be made while minimizing
changes to the rest of the code base. It is often assumed that
high performance code must be difficult to read. This leads to
the false assumption that easy-to-read code cannot be optimal.

The Java Virtual Machine (JVM) has many strategies for
optimizing code at runtime, but it is very limited in the time it
has for determining which ones should be applied. As a result, the
JVM works better on straightforward, easy-to-understand code. This
is advantageous because most developers also prefer very readable
code. What follows are a few techniques for writing both readable
and performant code for the JVM.

Write small methods and classes with high cohesion.

"Great things are done by a series of small things brought
together."
Vincent Van Gogh

The purpose of each method and class should be quickly and easily
understood. If this is not the case it may be a good idea to
refactor it into much smaller pieces. Break larger methods down
into smaller methods that each have a clear singular purpose.

Eclipse, IDEA, and NetBeans each have their own approach to
extracting methods and refactoring classes. The time you spend
learning these features for your platform will pay great dividends
in the future.

Make it smaller.

There is almost no performance penalty in the JVM for small
private or static methods. These methods are easily inlined at
runtime. Private and static methods greatly help readability by
breaking down complex tasks into easy to understand parts with
limited scope.
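
As a rough illustration of this advice, the sketch below splits one calculation into small private static helpers. The class name, the method names, and the flat 8 percent tax rule are all hypothetical and exist only for this example.

```java
// Hypothetical example: the names and the pricing rule are illustrative only.
final class InvoiceCalculator {

    // Assumed flat tax rate, expressed in whole percent to keep the math exact.
    private static final long TAX_PERCENT = 8;

    // The entry point reads as a summary of the steps.
    static long totalCents(long[] priceCents) {
        long subtotal = sum(priceCents);
        return subtotal + tax(subtotal);
    }

    // Small, private, static, single-purpose: a natural inlining candidate.
    private static long sum(long[] priceCents) {
        long sum = 0;
        for (long price : priceCents) {
            sum += price;
        }
        return sum;
    }

    // Integer arithmetic on cents avoids floating point rounding.
    private static long tax(long amountCents) {
        return amountCents * TAX_PERCENT / 100;
    }

    public static void main(String[] args) {
        System.out.println(totalCents(new long[] {1000, 2000}));
    }
}
```

Each helper can be read and tested in isolation, and none of them is large enough to discourage the JVM from inlining it.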

Public and protected methods are also frequently inlined but a
little more analysis must be done at run time to ensure the
correct behavior. This is because the implementation may have been
overridden and can't be inlined without an analysis of how many
times and where this has been done. Note that this same problem
can also frustrate the developer. Projects with too many
interfaces and abstract base classes can be very difficult to
trace through. This is one of the reasons that composition over
inheritance is generally encouraged.

The size of the method also plays a role. By default, methods
larger than 35 bytes of bytecode (the -XX:MaxInlineSize limit)
will not be inlined unless they are called very frequently. It is
not always clear from reading the source how many bytes of
bytecode a particular Java method will compile into or how
frequently it will be called. Fortunately, the JVM provides some
command line options to make this easier. Add the following to
the arguments when starting up the JVM.

-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining

These options write the method inlining details to the console
at runtime. Pay attention to the last part of each line. If the
line ends with "hot method too big", the JVM would have inlined
this method had it been smaller. Break this method down into
smaller parts or simplify it until the message disappears. If the
line ends with only "too big", the method should also be
considered for refactoring into smaller pieces. It was not called
frequently enough to be treated as hot, and it was too large to
be inlined otherwise.
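
To experiment with these options, a small program along the following lines may help. The class InliningDemo and its methods are hypothetical. The exact lines the JVM prints vary by JVM version and settings, so no sample output is shown here.

```java
// Hypothetical demo class. One possible way to run it:
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InliningDemo
final class InliningDemo {

    // A tiny static method, well under the default inlining size limit.
    private static int square(int x) {
        return x * x;
    }

    // Calls square() in a loop so that it becomes a candidate for inlining.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        long result = 0;
        // Repeat the work so the just-in-time compiler sees the code as hot.
        for (int round = 0; round < 20_000; round++) {
            result += sumOfSquares(1_000);
        }
        // Print the result so the work cannot be discarded as unused.
        System.out.println(result);
    }
}
```

Searching the console output for the names square and sumOfSquares should show how the JVM treated each call site.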

Make it tighter.

Methods are easier to understand when all the data they use is
local. This also helps the JVM because local values sit close
together on the stack rather than scattered across the heap. When
all the values are near at hand they can be cached and prefetched
by the underlying hardware. Keep in mind that references to member
variables have a price. The address of the object must be loaded
and the field dereferenced before it can be used.

In simple cases, the JVM can eliminate redundant member
references but it is a better approach to eliminate the clutter
and make the code more readable by using local variables. Only use
member variables within methods when it is required to modify or
read the state of the class. Eliminate member references inside
tight loops by reading the value once into a local variable before
entering the loop.
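
A minimal sketch of reading a member into a local before a loop might look like this. The Scaler class and its field are hypothetical names chosen for the example.

```java
// Hypothetical class used only to illustrate the local-copy pattern.
final class Scaler {

    // Member field that the loop below depends on.
    private int factor;

    Scaler(int factor) {
        this.factor = factor;
    }

    long scaledSum(int[] values) {
        // Read the member once; the loop then touches only local variables.
        final int localFactor = factor;
        long sum = 0;
        for (int value : values) {
            sum += (long) value * localFactor;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(new Scaler(3).scaledSum(new int[] {1, 2, 3}));
    }
}
```

The loop body now makes it obvious to a reader that the factor cannot change between iterations.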

Exposing members while they are being constructed or modified
should always be avoided. This helps simplify the
dependencies readers need to think about and reduces the risk of
external code interacting with these variables at inappropriate
times.

Sometimes it is unclear how to reduce the references within a
method due to the tight coupling of the algorithm in use. In this
case it may be better to create a class to represent a running
instance of this algorithm which will then be created and used
inside the large method. This should cause the method to be
smaller because the logic has been moved to the new class and it
also allows for easy replacement of the algorithm in the future.

Take care that the instance of the algorithm class is not
assigned to a member variable or returned from the method. If
this is done correctly, the JVM can use escape analysis to
recognize that the object is only used locally. Once this is
known, the JIT compiler may skip the heap allocation entirely and
inline the object construction and usages, keeping the object's
fields in registers or on the stack. This eliminates the need to
garbage collect this short-lived object.
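
The following sketch shows one way an algorithm object can stay local to a method. Report and RunningMean are hypothetical names, and whether the JVM actually removes the allocation depends on the JVM and on what it decides to inline.

```java
// Hypothetical example of an algorithm object that never escapes its method.
final class Report {

    // Holds the running state of the algorithm.
    private static final class RunningMean {
        private long sum;
        private int count;

        void add(int value) {
            sum += value;
            count++;
        }

        double mean() {
            return count == 0 ? 0.0 : (double) sum / count;
        }
    }

    static double average(int[] samples) {
        // Created locally, never stored in a field and never returned.
        RunningMean stats = new RunningMean();
        for (int sample : samples) {
            stats.add(sample);
        }
        // Only a primitive leaves the method, so the object does not escape.
        return stats.mean();
    }

    public static void main(String[] args) {
        System.out.println(average(new int[] {2, 4, 6}));
    }
}
```

Swapping in a different algorithm later only requires replacing RunningMean, since average() is the sole user of it.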

Make it happy.

The happy path is the code that should be executed in the normal
case when nothing goes wrong. Try to keep all the happy path code
in one place where it will be sequentially easy to read. Call out
as needed to methods specifically written to support the
infrequent corner cases.

Modern CPUs are much faster than their memory subsystems so they
prefetch data in order to maintain reasonable speeds. By ensuring
the sequential nature of the happy path, the code will work in
harmony with this prefetching behavior. At run time the JVM will
inline those methods that are called most often. Assuming these
methods are in the happy path, the prefetch will now be loading
the body of these inlined methods. This now gives us fast,
sequential execution and easy-to-think-about, clean readable code.
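
As an illustration of keeping the happy path in one place, the sketch below moves the rare case into its own method. The Account class, its method names, and the rule for what counts as invalid are hypothetical.

```java
// Hypothetical example: the happy path reads top to bottom in one method.
final class Account {

    static long withdraw(long balance, long amount) {
        if (amount <= 0 || amount > balance) {
            // Infrequent corner case, handled out of line.
            return rejectWithdrawal(balance, amount);
        }
        // Happy path: the normal case when nothing goes wrong.
        return balance - amount;
    }

    // Kept separate so that it does not clutter the method above.
    private static long rejectWithdrawal(long balance, long amount) {
        throw new IllegalArgumentException(
                "Cannot withdraw " + amount + " from a balance of " + balance);
    }

    public static void main(String[] args) {
        System.out.println(withdraw(100, 30));
    }
}
```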

Do not mix boxed and unboxed primitives.

Primitives are better for raw performance, but boxed values are
necessary when using the built-in collections classes. Mixing
them together, however, can lead to unexpected performance issues
and clutter. If possible, pick one style and be consistent.

Changing between the two styles can lead to overlooking the need
to ensure boxed primitives are not null or introducing unnecessary
defensive null checks. Null checks clutter up the code flow and
obscure the work to be completed. The best approach is to ensure
nulls are not produced at any point in the code.
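
One possible way to keep nulls from being produced is to confine the boxed values to the collection and expose only primitives. The HitCounter class below is a hypothetical sketch of that idea.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: boxed values stay inside the map, callers see only int.
final class HitCounter {

    private final Map<String, Integer> hits = new HashMap<>();

    void record(String page) {
        // merge() never stores null here; it inserts 1 or adds 1 to the old value.
        hits.merge(page, 1, Integer::sum);
    }

    // Returns a primitive: a page that was never recorded counts as zero.
    int count(String page) {
        return hits.getOrDefault(page, 0);
    }

    public static void main(String[] args) {
        HitCounter counter = new HitCounter();
        counter.record("home");
        counter.record("home");
        System.out.println(counter.count("home"));
        System.out.println(counter.count("missing"));
    }
}
```

Because count() can never return null, its callers need no defensive null checks.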

Mixing both styles also leads to confusion any time equivalence
needs to be checked. Using == is appropriate for primitives but it
is rarely the desired behavior when checking boxed values. This is
because == checks identity, ensuring both arguments are the SAME
object rather than simply equivalent.
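
The difference can be seen in a short sketch such as the one below. BoxedEquality and sameValue are hypothetical names. The result of the == comparison on the boxed values is deliberately not stated, because it depends on whether the JVM happens to reuse the same object for both.

```java
// Hypothetical example contrasting identity and equivalence for boxed values.
final class BoxedEquality {

    // Compares by value and tolerates a null first argument.
    static boolean sameValue(Integer a, Integer b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        Integer a = 1000;
        Integer b = 1000;

        // Equivalence: compares the wrapped values.
        System.out.println(sameValue(a, b));

        // Identity: compares object references, so the answer depends on
        // whether both variables refer to the same Integer instance.
        System.out.println(a == b);

        // Unboxing first turns the check back into a primitive comparison.
        System.out.println(a.intValue() == b.intValue());
    }
}
```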

Autoboxing adds hidden costs because it becomes a method call
that internally may use a pool to limit the number of objects
created. This "object pooling" is common with small Integers but
may not be helpful because the garbage collector on modern JVMs is
very efficient at reclaiming short lived objects.

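Using new to explicitly box a primitive via the constructor (as
in new Double(4.2) ) bypasses any cache lookup, but it is not
reliably faster than letting the JVM autobox, and the boxing
constructors have been deprecated since Java 9 in favor of
valueOf. Surprisingly, short-lived boxed values are often not
much more expensive than primitives due to the efficiencies of the
modern garbage collector.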

Minimize exception handling and throwing.

"Do or do not. There is no try."
Master Yoda

Never use exceptions as a form of flow control. Limit their use
to truly exceptional cases. Frequent use of 'try', 'catch' and
'finally' greatly detracts from easily understanding the code
flow.

Checked exceptions have long been a pain point within Java due to
the extra boilerplate they impose. A lesser known problem with
exceptions is that the just-in-time compiler optimizes for the
paths it sees executed, so a rarely executed catch block is
treated as cold code. As a result, code within a catch block
should not be expected to execute as quickly as other code.

The use of 'finally' can, at times, be excused because it is used
to promote safety and clarity when releasing a finite resource
such as a lock or connection handle. Caution should be used,
however, because frequent use can easily lead to misunderstandings
related to the order of execution. As stated previously,
sequential code leverages the prefetch behavior of the underlying
hardware and finally blocks can disrupt this.
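
For reference, a conventional use of 'finally' to release a lock looks roughly like the sketch below. SafeCounter is a hypothetical class; ReentrantLock is the standard class from java.util.concurrent.locks.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example: 'finally' guarantees that the lock is released.
final class SafeCounter {

    private final ReentrantLock lock = new ReentrantLock();
    private long value;

    long increment() {
        lock.lock();
        try {
            // Keep the guarded section short so the execution order stays obvious.
            return ++value;
        } finally {
            // Runs whether the try block returns normally or throws.
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        SafeCounter counter = new SafeCounter();
        counter.increment();
        System.out.println(counter.increment());
    }
}
```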

When possible, refactor out the body of the catch into a single
method. Then use a conditional check for that particular case. The
body of the catch code will no longer be inside the happy path.
The explicit check also encourages failing fast by pushing the
contract requirements for the code up to the front.
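
A sketch of this refactoring, assuming the original code caught an ArrayIndexOutOfBoundsException around an array read, might look as follows. SafeLookup, elementAt, onMissingIndex, and the -1 fallback value are all hypothetical.

```java
// Hypothetical example: an explicit check replaces a try/catch around the read.
final class SafeLookup {

    static int elementAt(int[] values, int index) {
        // The contract is checked up front instead of being discovered by a catch.
        if (index < 0 || index >= values.length) {
            return onMissingIndex(index);
        }
        // Happy path: a plain array read with no exception handling around it.
        return values[index];
    }

    // Formerly the body of the catch block, now a method of its own.
    private static int onMissingIndex(int index) {
        return -1; // assumed fallback value for this sketch
    }

    public static void main(String[] args) {
        int[] values = {5, 6, 7};
        System.out.println(elementAt(values, 1));
        System.out.println(elementAt(values, 9));
    }
}
```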

Conclusion

Now that the code is readable you are ready to fire up the
profiler and begin the optimization process. After applying the
above techniques, however, it may already be performing much
better than expected. The recommendations here demonstrate that
there is no reason to sacrifice readability for performance when
developing on the JVM.
