Class Ids

Because JaCoCo's class identifiers sometimes cause confusion, this chapter
explains the concepts and common issues around class ids in FAQ format.

What are class ids and how are they created?

Class ids are 64-bit integer values, for example
0x638e104737889183 in hex notation. Their calculation is
considered an implementation detail of JaCoCo. Currently ids are created with
a CRC64 checksum of the raw class file.
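
To make the idea of a checksum-based id concrete, here is a minimal sketch of a
table-driven 64-bit CRC over a byte array. The class name ClassIdSketch, the
polynomial constant and the stand-in input bytes are assumptions for
illustration. Since the calculation is an implementation detail, the values
printed here should not be expected to match real JaCoCo class ids.

```java
import java.nio.charset.StandardCharsets;

final class ClassIdSketch {

    // Reflected CRC-64 polynomial, chosen for illustration only (assumption).
    private static final long POLY = 0xd800000000000000L;

    private static final long[] TABLE = new long[256];

    static {
        // Precompute the checksum contribution of every possible byte value.
        for (int i = 0; i < 256; i++) {
            long v = i;
            for (int bit = 0; bit < 8; bit++) {
                v = (v & 1) == 1 ? (v >>> 1) ^ POLY : v >>> 1;
            }
            TABLE[i] = v;
        }
    }

    /** Returns a 64-bit checksum over the given raw bytes. */
    public static long checksum(byte[] data) {
        long sum = 0;
        for (byte b : data) {
            sum = (sum >>> 8) ^ TABLE[(int) ((sum ^ b) & 0xff)];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Stand-in for the content of a class file.
        byte[] original = "stand-in for class file bytes".getBytes(StandardCharsets.UTF_8);
        byte[] modified = original.clone();
        modified[0] ^= 1; // flip a single bit

        // The two ids differ although only one bit changed.
        System.out.printf("0x%016x%n", checksum(original));
        System.out.printf("0x%016x%n", checksum(modified));
    }
}
```

The point of the sketch is that the id depends on every byte of the input: the
same bytes always give the same id, and a single flipped bit gives a different
one.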

What are class ids used for?

Class ids are used to unambiguously identify Java classes. At runtime, execution
data is sampled for every loaded class and typically stored in
*.exec files. At analysis time, for example for report
generation, the class ids are used to relate the analyzed classes to the
execution data.

What are the advantages of JaCoCo class ids?

The concept of class ids allows distinguishing different versions of classes,
for example when multiple versions of an application are deployed to an
application server or different versions of libraries are included.

Class ids are also a prerequisite for JaCoCo's minimal runtime overhead and
small *.exec files, even for very large applications under test.

What is the disadvantage of JaCoCo class ids?

The fact that class ids identify a specific version of a class causes problems
in setups where different classes are used at runtime and at analysis time.

What happens if different classes are used at runtime and at analysis time?

In this case execution data cannot be related to the analyzed classes. As a
consequence such classes are reported with 0% coverage.
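
The lookup failure can be illustrated with a small sketch in which execution
data is kept in a map keyed by id. Everything here is an assumption for
illustration: the class IdLookupSketch, the methods idOf and lookup, and the use
of the JDK's CRC32 as a stand-in id function. It is not JaCoCo's own data
structure or checksum.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

final class IdLookupSketch {

    // Stand-in id function (assumption): CRC32 from the JDK, not JaCoCo's checksum.
    public static long idOf(byte[] classBytes) {
        CRC32 crc = new CRC32();
        crc.update(classBytes);
        return crc.getValue();
    }

    /** Returns the recorded probes for these class bytes, or null if the id is unknown. */
    public static boolean[] lookup(Map<Long, boolean[]> executionData, byte[] classBytes) {
        return executionData.get(idOf(classBytes));
    }

    public static void main(String[] args) {
        byte[] runtimeClass = {1, 2, 3, 4};
        byte[] rebuiltClass = {1, 2, 3, 5}; // e.g. compiled again with other settings

        // Execution data was recorded under the id of the class seen at runtime.
        Map<Long, boolean[]> executionData = new HashMap<>();
        executionData.put(idOf(runtimeClass), new boolean[] {true, true, false});

        System.out.println(lookup(executionData, runtimeClass) != null); // true
        System.out.println(lookup(executionData, rebuiltClass) != null); // false
    }
}
```

With no entry found for the rebuilt class, an analysis has nothing to mark as
executed, which is why such a class ends up at 0% coverage.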

How can I detect that I have a problem with class ids?

The typical symptom of a class id mismatch is that classes are not shown as
covered although they have been executed during the test. This situation can be
easily detected, e.g. in the HTML report: Open the Sessions page with the link
in the top-right corner. You see a list of all classes for which execution data
has been collected. Find the class in question and check whether the
entry has a link to the corresponding coverage report page. If the entry is
not linked, there is a class id mismatch between the class used at
runtime and the class provided to create the report.

What can cause different class ids?

Class ids are identical only for the exact same class file (byte-by-byte).
There are several reasons why you might get different class files. First,
compiling Java source files will result in different class files if you use
a different tool chain:

Different compiler vendor (e.g. Eclipse vs. Oracle JDK)

Different compiler versions

Different compiler settings (e.g. debug vs. non-debug)

Post-processing class files (obfuscation, AspectJ, etc.) will also typically
change the class files. JaCoCo will work well if you simply use the same class
files for runtime as well as for analysis. In that case the tool chain used to
create these class files does not matter.

Even if the class files on the file system are the same, it is possible that
the classes seen by the JaCoCo runtime agent are different anyway. This typically
happens when another Java agent is configured before the JaCoCo agent
or special class loaders pre-process the class files. Typical candidates are:

Mocking frameworks

Application servers

Persistence frameworks

What workarounds exist to deal with runtime-modified classes?

If classes get modified at runtime in your setup, there are some workarounds to
make JaCoCo work anyway:

If you use another Java agent, make sure the JaCoCo
agent is specified first on the command line. This way the JaCoCo
agent should see the original class files.

Specify the classdumpdir option of the
JaCoCo agent and use the dumped classes at report
generation. Note that only loaded classes will be dumped, i.e. classes that
are never loaded will be missing from your report instead of being shown as
not covered.

Use offline instrumentation before you run your
tests. This way classes get instrumented by JaCoCo before any runtime
modification can take place. Note that in this case the report has to be
generated with the original classes, not with instrumented ones.

Why can't JaCoCo simply use the class name to identify classes?

To understand why JaCoCo can't rely on class names, we need to have a look at
how JaCoCo measures code coverage.

JaCoCo tracks execution with so-called probes. Probes are additional
byte code instructions inserted into the original class file which note
when they are executed and report this to the JaCoCo runtime. This process is
called instrumentation. To keep the runtime overhead minimal, only a
few probes are inserted at "strategic" places. These probe positions are
determined by analyzing the control flow of all
methods of a class. As a result, every instrumented class produces a list of
n boolean flags indicating whether each probe has been executed or
not. A JaCoCo *.exec file simply stores a boolean array per
class id.
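
As a rough source-level picture of what probes record, consider the following
hand-written sketch. Real instrumentation works on byte code and chooses probe
positions from the control flow. The class ProbeSketch, its probes field and the
placement of the three probes are assumptions made up for this example.

```java
import java.util.Arrays;

final class ProbeSketch {

    // One flag per probe; in this sketch the array is a plain static field.
    private static final boolean[] probes = new boolean[3];

    public static String classify(int value) {
        if (value < 0) {
            probes[0] = true; // probe 0: negative branch reached
            return "negative";
        }
        if (value == 0) {
            probes[1] = true; // probe 1: zero branch reached
            return "zero";
        }
        probes[2] = true; // probe 2: fall-through path reached
        return "positive";
    }

    /** Returns a copy of the current probe flags. */
    public static boolean[] snapshot() {
        return probes.clone();
    }

    public static void main(String[] args) {
        classify(5);
        classify(-1);
        // Prints [true, false, true]: the zero branch was never executed.
        System.out.println(Arrays.toString(snapshot()));
    }
}
```

Note that the array itself carries no method names or line numbers. It only
says which probe indexes were hit.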

At analysis time, for example for report generation, the *.exec
file is used to get information about probe execution status. But as probes
are stored in a plain boolean array, there is no information like corresponding
methods or lines. To retrieve this information we need the original class
files and perform the exact same control flow analysis as at instrumentation
time. Because this is a deterministic process we get the same probe positions.
With this information we can now infer the execution status of every
single instruction and branch of a method. Using the debug information
embedded in the class files we can also calculate line coverage.

If we used even slightly different classes at analysis time than at
runtime, e.g. with different method ordering or additional branches,
we would end up with different probes. For example, the probe at index
i would be in method a() and not in method
b(). Obviously this would create random coverage results.
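
The effect can be sketched with a toy model in which each class version has
exactly one probe per method, listed in method order. The class
ProbeMismatchSketch, the method coveredMethods and the one-probe-per-method
layout are simplifying assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class ProbeMismatchSketch {

    /** Lists the methods whose probe flag is set, given the probe layout of one class version. */
    public static List<String> coveredMethods(String[] probeLayout, boolean[] probes) {
        List<String> covered = new ArrayList<>();
        for (int i = 0; i < probes.length && i < probeLayout.length; i++) {
            if (probes[i]) {
                covered.add(probeLayout[i]);
            }
        }
        return covered;
    }

    public static void main(String[] args) {
        String[] runtimeLayout = {"a()", "b()", "c()"};
        String[] analysisLayout = {"b()", "a()", "c()"}; // same methods, reordered

        boolean[] recorded = {true, false, false}; // at runtime only a() was executed

        System.out.println(coveredMethods(runtimeLayout, recorded));  // [a()]
        System.out.println(coveredMethods(analysisLayout, recorded)); // [b()]
    }
}
```

The recorded array is identical in both calls. Only the layout used to
interpret it differs, and that alone is enough to attribute the execution to
the wrong method.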

Why do I get an error when I try to analyze multiple versions of the same
class with a group?

JaCoCo always analyzes a set of classes as a group. The group is used to
aggregate data for source files and packages (both can contain multiple
classes). Within the reporting API classes are identified by their fully
qualified name (e.g. to create stable file names in the HTML reports).
Therefore it is not possible to include two different classes with the same
name within a group. However, it is possible to analyze different versions of
class files in separate groups, for example the Ant
report task can be configured with multiple groups.
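
The constraint can be pictured as a group that keys its classes by fully
qualified name. The following sketch is a hypothetical model, not JaCoCo's
reporting API: the class GroupSketch, its add method, the error message and the
choice to silently accept the same class twice are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class GroupSketch {

    // Classes of this group, keyed by fully qualified name.
    private final Map<String, Long> idsByName = new LinkedHashMap<>();

    /** Adds a class; rejects a second, different class with an already used name. */
    public void add(String className, long classId) {
        Long existing = idsByName.putIfAbsent(className, classId);
        if (existing != null && existing != classId) {
            throw new IllegalStateException("Different class with same name: " + className);
        }
    }

    public int size() {
        return idsByName.size();
    }

    public static void main(String[] args) {
        // Two versions of the same class in one group are rejected.
        GroupSketch single = new GroupSketch();
        single.add("com.example.Foo", 0x1L);
        try {
            single.add("com.example.Foo", 0x2L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // The same two versions fit into separate groups.
        GroupSketch version1 = new GroupSketch();
        GroupSketch version2 = new GroupSketch();
        version1.add("com.example.Foo", 0x1L);
        version2.add("com.example.Foo", 0x2L);
        System.out.println(version1.size() + version2.size());
    }
}
```

Keying by name is what makes stable per-class output possible, and it is also
why a name can appear only once per group.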