Macs, Modularity and More

Eclipse start optimisation

In my previous post, I talked about my war on new Boolean() and how this
was improving the code by reducing object overhead. Fortunately, all of these
changes have now been merged, and we’re a step closer to having no additional
Boolean instances in memory. Well, actually that’s not true; if you use
getField().get() on a boolean value, you get a new Boolean()
instance created instead of a pointer to Boolean.TRUE or Boolean.FALSE.
Oh well …

So the second part of my optimisation work is about speeding up Eclipse’s
launch. I’ve been looking through what happens when an Eclipse instance
starts in order to determine where we can optimise the start-up process.
I initially started this by running a new Eclipse instance, but this proved
too variable, and in any case too big, to measure or optimise appropriately.
Instead, I switched focus to running a simple Eclipse 4 based application
product, created by File → New → Project → Eclipse 4 →
Eclipse 4 Application Product. This provides a simple window but, importantly,
depends on everything that is needed to bring up an Eclipse instance. So
if we can optimise that, we can optimise Eclipse!

To determine where the time is spent, Eclipse has some debug options which
can be enabled with tracing. If you don’t know about Eclipse tracing, it’s
essentially a bunch of key=value pairs that you can set (like -D) but specified
in a .options file instead of on the command line. Tracing options can be
set when launching an Eclipse application by opening the launch configuration
and switching to the Tracing tab. Each bundle has possible debug options (specified
in a .options file inside the bundle itself) which can be enabled; if the
org.eclipse.osgi bundle is selected then this will switch on general debug
for this bundle (although there are other switches which can be enabled as
well).
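As a sketch, a .options file is just a set of key=value pairs; enabling the general debug flag, plus the two tracing options used later in this post, would look something like:

```
org.eclipse.osgi/debug=true
org.eclipse.osgi/debug/bundleLoad=true
org.eclipse.osgi/debug/loader=true
```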

OK, this gives a high-level view of how long it takes to start up, based on
the time taken to display a window. 1.7-1.8s on my laptop is not abnormal for
an empty window, but it’s not great. (Your environment will vary.)

Note that the first run is almost certainly invalid; when Equinox starts up,
it generates a lot of cached information and stores that in a set of files,
which it will re-read a second time. You can either measure the cold start
by deleting the workspace each time, or you can measure a warm start by
ignoring the first run and timing a subsequent one. In fact, running a few
times makes sense; the OS caches the JAR files in memory, so results can
vary depending on what the operating system has already cached.

(An empty product doesn’t take up much memory; however, size the JVM so that
it doesn’t incur a GC during start-up. This will allow more repeatable
measurements to be made between runs. Either pick a big number, or use
-verbose:gc if you want to verify that a GC run isn’t happening. Ideally,
use this to pick a JVM size you know isn’t going to GC and then remove the
-verbose:gc flag so that it doesn’t impact the measurements.)
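For example, the VM arguments for the launch might look something like this (the heap size is an arbitrary assumption, picked to be comfortably larger than the application needs):

```
-Xms512m -Xmx512m -verbose:gc
```

Setting -Xms equal to -Xmx also avoids heap resizing during the run; remove -verbose:gc once you’ve confirmed no collection happens.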

So, we’ve got a product and we’re running it to give us a number. Time to make
the number go down. How do we do that?

First, we need to understand what’s happening at start-up. The messages before
the Time to load bundles message appears are from the Equinox framework
booting up and initialising the classes specified in the config.ini file.
This runs the org.eclipse.equinox.simpleconfigurator bundle, which reads
a file bundles.info in a similarly named directory. This is the point at
which the Time to load bundles message is printed, and there’s not much
that can be improved here.

Equinox then moves on to starting the framework by raising the start level. As
the levels increase, bundles are started according to their individual
start level, which is recorded in bundles.info. This is what brings up Declarative
Services and the Event Admin in level 2, for example. Once the start level has
been raised, and all of the bundles have been started (or marked for lazy
start), control is handed over to the EclipseAppLauncher, which calls the
application’s run() method. There’s a listener passed back by the application
when it’s up (IApplicationContext has an applicationRunning() method), which is
also the one used to hide the splash screen.

OK, so each of these parts is, in itself, pretty understandable, and pretty tight.
There will always be a part that is proportional to the number of bundles
involved (the bundles.info has one line per installed bundle) but otherwise
this is pretty lean. So where does the time go?
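Each line of bundles.info follows a simple comma-separated shape – roughly symbolic name, version, location, start level and auto-start flag. The values below are illustrative, not taken from a real install:

```
org.eclipse.equinox.simpleconfigurator,1.1.0,plugins/org.eclipse.equinox.simpleconfigurator_1.1.0.jar,1,true
```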

In short, loading and defining classes from disk. When you run a Java class,
the JVM has to be able to acquire the bytes and define the class itself.
Similarly when calling a method, any referenced arguments and exceptions
have to be loaded. Fortunately Java is pretty good at this: when you define
a class, you don’t immediately need to define all of the referenced classes;
they are only defined when you use them for the first time.

However, there are things that run when classes are initialised. Firstly, a
class may have static blocks with code associated with them – although
these only get run once the class' static (or instance) contents are accessed
for the first time. This is a little like a constructor, but at the class
level instead of the instance level. The bottom line is that if you have a
static block containing blocking or long-running processing, accessing the
class' fields will incur that hit the first time.
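A minimal sketch of this behaviour (the class names are hypothetical): merely referring to a class doesn’t run its static block, but touching a static member does:

```java
import java.util.ArrayList;
import java.util.List;

public class StaticInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        static {
            // Imagine expensive work here: reading files, network calls, etc.
            EVENTS.add("static block ran");
        }
        static int value() { return 42; }
    }

    public static void main(String[] args) {
        Class<?> unused = Lazy.class;   // reference only: static block NOT run
        System.out.println(EVENTS);     // []
        int v = Lazy.value();           // first real use: static block runs now
        System.out.println(EVENTS);     // [static block ran]
        System.out.println(v);          // 42
    }
}
```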

The other thing – unique to OSGi – is that a bundle’s Bundle-Activator will
be run if a class is accessed from the bundle and the bundle is set to start
lazily (i.e. started when one of its classes is accessed for the first time).
The net effect is that whilst simply referring to a class won’t necessarily
run its static blocks, accessing it from another bundle may cause that bundle
to be started, and hence load and run its activator.

Activators are bad.

Well, in general they aren’t, but they can often cause delays in bundle
start-up. For example, in Eclipse 4.5, the contenttype bundle’s Activator
referred to Eclipse’s extension registry, which meant that – because the
extension registry’s interface was referenced – the extension registry
bundle was started. This of course takes some time (a few hundred ms on
average), which meant the bundle activator for the contenttype bundle was
blocked until the registry finished starting.

In theory, Eclipse can start bundles asynchronously – though at the moment,
it doesn’t – but when you have a blocking sequence of activators there’s
nothing much that Equinox can do. You can see how long an activator takes to
run if you enable the org.eclipse.osgi/debug/bundleLoad tracing option.

In the resulting trace, contenttype is starting, which (serially) starts the
registry and preference bundles before finishing its own start. The number at
the end is in ms, but it only represents the total time for the start() method
to run; anything else (including defining the classes themselves) is
effectively hidden here. (We can run this against a 4.6 nightly to see the
time taken to define the activator, which is new.)

So, removing the Activator can save some time – albeit only a little –
by removing the need for the bundle to be lazily started. In general,
activators do one of three things:

Set up data structures or load them from disk (preferences, registry)

Acquire references to the BundleContext

Acquire references to other services (either through the BundleContext
directly or through a ServiceTracker)

In the first case, there’s not much that can be done. There’s a possibility that
the code could be moved to provide a cache-on-demand style lazy accessor
instead of up front; but in the case of the Equinox registry or preference
service they’d be needed on start-up anyway (and they’re already pretty well
optimised).

The second one can trivially be replaced with a call to
FrameworkUtil.getBundle(the.class).getBundleContext() where needed. This is
actually a pretty fast operation; the bundle is associated with the class'
ClassLoader, so the net effect is that it boils down to something like
the.class.getClassLoader().getBundle() which is pretty fast.
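A sketch of this replacement, assuming a class that previously cached the BundleContext from its Activator (the class name here is hypothetical; FrameworkUtil is the standard OSGi API):

```java
import org.osgi.framework.BundleContext;
import org.osgi.framework.FrameworkUtil;

public class ContextLookup {
    // Instead of an Activator storing the context in a static field,
    // look it up on demand from the class' own bundle.
    static BundleContext getContext() {
        return FrameworkUtil.getBundle(ContextLookup.class).getBundleContext();
    }
}
```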

The third one can be replaced by moving the components that require the
services to a declarative services component. This has the advantage that DS
can bring up components asynchronously and use the dependency graph
to bring them up in the right order. In fact, in the case of some of the
bundles in the Eclipse platform, they were already being brought up by DS but
the connection to the other services was made outside of DS itself. By moving
everything into DS, the pattern is simplified and the code can be removed.
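As a sketch, a DS component that needs another service just declares the dependency and lets DS inject it when available (the component name here is illustrative; the annotations are the standard OSGi DS ones):

```java
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;
import org.osgi.service.event.EventAdmin;

@Component
public class Notifier {
    // DS satisfies this reference before activating the component, so no
    // Activator or ServiceTracker is needed to find the service.
    @Reference
    EventAdmin eventAdmin;
}
```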

Put all this together and, to date, four activators have been removed from
Eclipse 4.6.

So, Eclipse 4.6 is faster at starting up, right? Well, no … it turns out that
the empty project is now loading more classes than before. We’re not
loading four activators (yay!) but now we’re loading a bunch more classes.

It turns out that it’s a new bug, in that it’s new code: a static block added
to CSSRenderingUtils that adds a single image to JFaceResources' internal
ImageRegistry.
That’s all it takes to activate the JFace bundle (and, formerly, to run the
JFaceActivator too) and to trigger the creation of the image
registry (which includes auto-populating it). It doesn’t sound like much, but
the image registry pulls in on the order of 40 separate images, using constants
referenced from any number of classes (which themselves are then loaded and
defined …)

You can measure the amount of code that gets loaded by enabling another trace
option: org.eclipse.osgi/debug/loader. This will spit out a
(large) log which shows which classes are being defined and how many bytes
they take. It’s best to log this to a file so you can look through it
afterwards; it’s too big for the Eclipse console view to hold all in one go.

You can grep for classes defined with 'grep defined log.txt' – piping this
through wc -l will show how many classes get loaded during a run of the
application. (This was the clue that JFace’s ImageRegistry was being hit,
since it showed up as a diff between that and the same project running in
Eclipse 4.5.) You can also figure out how much code is being loaded;
grepping for ‘bytes’ and summing the second column will give you the number
of bytes that were loaded. In fact, it’s possible to sum these all up.
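A sketch of such a pipeline. The log lines below are hypothetical stand-ins (the real loader trace wording differs), but the grep/wc/awk shape is the same:

```shell
# Fake log lines, for illustration only, with the byte count in column two
cat > log.txt <<'EOF'
defined 1234 bytes for class org.example.Foo
defined 2345 bytes for class org.example.Bar
EOF

# How many classes were defined during the run?
grep defined log.txt | wc -l

# Total bytes loaded: grep for 'bytes' and sum the second column
grep bytes log.txt | awk '{ sum += $2 } END { print sum }'
```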

It turns out this single change added around 20 classes to the start-up of an
otherwise empty Eclipse 4 application, and an extra 100k of class data (not to
mention the bytes of image data being loaded).

Why do I mention all this? Well, this isn’t about this specific change as such
(after all, I broke the build not once but twice yesterday, and possibly a
third time as well) – but it’s more to highlight a few things:

Activators are generally not required, and can be replaced

The amount of time Eclipse spends loading is proportional to the number and size of classes being loaded

It’s really easy to add static dependencies between classes which then
trigger performance problems later on

Fortunately the size of an application’s loaded classes can be measured, as
can the time. So we can show whether or not we’re moving in the right direction
– and there are also Eclipse performance tests which look at other factors.
But it’s really easy to move the wrong way on the performance profile, and
only by identifying what’s on the (serial) critical path and removing
unnecessary cruft are we going to speed up Eclipse’s launch process.

PS Algorithm choices matter a lot more than micro-optimisations like these,
and in general use the JIT will do a lot of the optimisation for you. However,
start-up is a ‘special’ case; it only runs once (so most of it won’t be JITted)
and, even though Equinox’s classloader is re-entrant (and so can handle
multiple threads), the definition of classes is mainly single-threaded. Only
bundles whose components are instantiated from declarative services are going
to benefit from multi-threaded start-up. So: Activators bad, DS good, and
avoid expensive static blocks. Those are the takeaways here.