
Eclipse Goes Native

Eclipse is an open-source, extensible integrated development
environment (IDE) that's growing quickly in popularity. Written
in Java, it provides a multilanguage development environment
that allows developers to code in Java, C and C++. In response to
the need for improved performance and additional platform
coverage for the Red Hat Developer Suite, of which Eclipse is the
core, we created a version of Eclipse that's compiled natively.
Instead of running on top of a virtual
machine the way Java programs usually do—although that can
still be done if the user prefers—Red Hat's version of Eclipse
is compiled to binary and runs natively using the libgcj runtime
libraries, similar to the way a C program runs using the GNU C
libraries.

This article discusses why native compilation was an attractive
choice; explains what we had to do to GCJ, libgcj and Eclipse
to make it possible; and shows, using a real-world example, that
open-source Java has come a long way and now is useful commercially.

Motivation

Two main factors from the early days of
Developer Suite planning and engineering drove us toward
native compilation: platform coverage and performance.
Red Hat Enterprise Linux was scheduled to ship on
several 64-bit architectures, and we wanted to make
sure Developer Suite could run on all of them. One
big problem was that Eclipse had never been run
on a 64-bit platform, and it contained some code,
specifically the interface between SWT, the graphics
toolkit in Eclipse, and its native C libraries,
that assumed 32-bit addresses. Aside from having to
create a clean 64-bit version of SWT, we were faced
with a more significant problem: no 64-bit
Java Virtual Machine (JVM) for x86_64, AMD's 64-bit
architecture, existed at the time, and it looked unlikely
that one would be available before we had to ship.

Another problem we had was
performance. Eclipse worked well on Microsoft Windows but the version
available at the time was pretty slow on Linux. We found that
startup alone took well over a minute, and early user testing
found that the interface was a little too sluggish for comfortable use.
For example, Eclipse is based on perspectives, which are collections
of views and editors, only one of which is visible at a time. Switching
between them is something that a user does fairly frequently. However,
changing perspectives introduced substantial delays we thought
unacceptable for the enterprise development market
Red Hat Developer Suite was targeting.

The solution we came up with was to use GCJ to compile Eclipse into
native binaries that could run without having a JVM
installed. We knew that native compilation would help with the performance
problems, because we would no longer have the overhead that comes
with the JVM layer. It also would solve the platform coverage
problem, as GCJ/libgcj was available on all of the 64-bit
platforms we had to support, although in some cases, such as
x86_64, it still needed a lot of work. Native compilation
solved the technical problems we had and gave us the additional
benefits of reducing our external dependencies, allowing us to make
some significant improvements to open-source Java and to
demonstrate that open-source Java has matured to the point of being
useful commercially.

Approach

At the outset of this project, we really didn't know if it was
possible to compile Eclipse with GCJ and expect it to run. First,
Eclipse is a large program—more than two million lines of code as counted
by wc. We didn't know much about Eclipse internals or what runtime
facilities it might use. Second, GCJ's background is in embedded
systems, and we knew that work remained on parts of the Java
programming language, class loaders in particular, which are used
heavily by Eclipse. Third, the free class libraries were not complete.
We didn't know whether Eclipse used facilities we hadn't
written yet or whether it might even break the rules and use
internal, undocumented com.sun.* interfaces, as too many Java
programs seem to do.

We therefore took a two-pronged approach to determining whether a project
like this could succeed.
First, we used GCJ to make a list of the APIs used by
Eclipse that we did not or could not implement.
To accomplish this, we wrote a shell script that would try to compile each Eclipse
Java archive library (jar file) to object code. We then looked through
the error messages to see what was missing.
The results of this script were not encouraging: we found a
large number of missing packages. Still, more investigation was
required because some things didn't make sense. For instance,
there were dependencies on the Swing graphical user interface
classes, but we knew that Eclipse used SWT and not Swing.
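The survey described above can be sketched roughly as follows. The original was a shell script that is not reproduced in this article, so this Java version is only an illustration of the idea: ask gcj to compile each jar file to an object file and collect whatever diagnostics it prints. The class name, method names and output directory are assumptions made for the example; the only gcj behavior relied on is that it accepts a jar file as input together with -c and -o.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; not the script used for Developer Suite.
public class JarCompileSurvey {

    // Build the command that asks gcj to compile one jar file to one object file.
    // "runtime.jar" becomes "<outDir>/runtime.o" (a naming choice made for this sketch).
    static List<String> compileCommand(Path jar, Path outDir) {
        String name = jar.getFileName().toString();
        String objectName = name.endsWith(".jar")
                ? name.substring(0, name.length() - ".jar".length()) + ".o"
                : name + ".o";
        return Arrays.asList("gcj", "-c", jar.toString(),
                "-o", outDir.resolve(objectName).toString());
    }

    // Run gcj on one jar file and return every line it printed.
    // Requires gcj on the PATH; the lines are returned unparsed because
    // the diagnostic format is not something this sketch relies on.
    static List<String> diagnostics(Path jar, Path outDir)
            throws IOException, InterruptedException {
        Files.createDirectories(outDir);
        Process process = new ProcessBuilder(compileCommand(jar, outDir))
                .redirectErrorStream(true) // merge stderr into stdout
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        process.waitFor();
        return lines;
    }
}
```

Calling diagnostics() once per jar file and reading through the collected lines corresponds to the manual step described above of looking through the error messages to see what was missing.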

Further investigation showed that many of the weird undefined
references came not from Eclipse itself but from the third-party
jar files included with it. For example, Eclipse includes its own
copy of the Ant build tool and its own copy of the Apache Tomcat
dynamic Web server. We knew that in many cases, the referenced
classes would not actually be invoked in the Eclipse environment.
This encouraged us to take another look at how to get Eclipse working.

Our second angle of attack was to try running Eclipse
using the bytecode interpreter that comes with libgcj. By doing this, we
reasoned, we would concentrate on runtime bugs, including the aforementioned
class loader problems and missing functionality actually
used by Eclipse.

This approach also was discouraging initially. We ran
into problems not only with class loading, but also with the fact that
libgcj's implementation of protection domains needed work.
Protection domains are the basis of Java's secure sandbox architecture, which
allows untrusted code to be run in a secure way. Problems in
this area had an unfortunate shadowing effect—we had to fix
each bug before we could discover the next one.

Changes to libgcj

Our first round of changes to libgcj was bug fixing only. We
implemented protection domains properly. Then, we made a pass over the entire
runtime, fixing bugs related to class loading. Because of the way
class loading had been implemented in libgcj, we had to modify all the
places in the native code that conceivably might load a class to
forward the request to the appropriate class loader.

Once this was done, we were able to start Eclipse using the libgcj
bytecode interpreter. At this point the question became, how can we
take real advantage of GCJ to compile Eclipse?

The naïve approach, namely precompiling all the classes and
linking them all together, had been ruled out by our investigations
into Eclipse's internals. This approach would clash with Eclipse's
relatively sophisticated class loading strategy.

More investigation revealed that most classes are loaded
by instances of the DelegatingURLClassLoader, which is a subclass of
the standard URLClassLoader that has been extended to understand
Eclipse's plugin architecture. It seemed like the best approach was to
modify Eclipse to allow it to load precompiled shared libraries as
well as bytecode files. We reasoned that the required changes would be
localized due to the way plugin class loading had been structured.

In fact, we had to go one step further and extend libgcj a bit as well.
libgcj knew how to load shared libraries invisibly in
response to a call to, for example, Class.forName(). However, this magic
always happened at the level of the bootstrap class loader. That
wouldn't work well for Eclipse or for any other application that defines
its own class loaders, so we invented a new gcjlib URL type. This is
like a jar URL, but it points to a shared library. We also made some
minor extensions to our implementation of URLClassLoader so that
gcjlib URLs would be treated specially.

Doing this wasn't enough, however. We also had to solve the linkage
problems. In particular, if we compiled a jar file to a shared
library, how could we prevent the dlopen() of such a shared
library from immediately failing due to unresolved symbols? The
solution to this problem was to resurrect and clean up the
-fno-assume-compiled option in GCJ. This option, which never had
been finished, enabled an alternative ABI that caused GCJ's output to
resolve most references at runtime rather than at link time.
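As a loose analogy only, the difference between link-time and runtime resolution can be seen in plain Java by looking a class and method up by name when the call is made. This sketch does not show what -fno-assume-compiled generates; it only illustrates the general trade-off that a missing class surfaces at the point of use rather than when the library is loaded. The class and method names are made up for the example.

```java
import java.lang.reflect.Method;

// Analogy only: late binding by name, not GCJ's actual ABI.
public class LateBinding {

    // Look up a public static method taking one String and call it.
    // Nothing about className is checked until this method runs.
    static Object callStatic(String className, String methodName, String arg) {
        try {
            Class<?> target = Class.forName(className);          // resolved now, by name
            Method method = target.getMethod(methodName, String.class);
            return method.invoke(null, arg);                      // null receiver: static call
        } catch (ReflectiveOperationException e) {
            // A missing class or method is reported only when the call is attempted.
            throw new IllegalStateException("could not resolve "
                    + className + "." + methodName, e);
        }
    }
}
```

For example, callStatic("java.lang.Integer", "parseInt", "42") resolves and invokes Integer.parseInt at the moment of the call, whereas a reference to a class that does not exist fails only if and when that call is reached.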

The -fno-assume-compiled option has various limitations and
inefficiencies. On the drawing board for the future is a cleaner way to
achieve this same goal. On the GCJ mailing list (see the on-line
Resources section), this replacement
is referred to either as the binary compatibility ABI or as
-findirect-dispatch. This new ABI does everything
-fno-assume-compiled does, but in a much more efficient and
compatible way. Development of this new feature is underway and
coming along nicely; it is one of several contributing to
GCJ's enterprise readiness.

Changes to Eclipse

Once all this was in place, we finally were ready to make our
changes to Eclipse. These turned out to be remarkably small.
Most of the work involved making the same sort of change in three different
places. In essence, we modified Eclipse so that when it's looking for a
plugin's jar file, it also looks for a similarly named shared
library installed alongside it. If there is one, we rewrite the URL
passed to the class loader from a jar URL to a gcjlib URL.
All rewriting is done conditionally, so our natively compiled Eclipse
still works with an unmodified JVM. In other words, users are
not locked in to native compilation if they would rather use a JVM instead.
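The lookup-and-rewrite step described above might look roughly like the following. This is a minimal sketch, not the actual patch: the class name, the ".so" naming rule, the nativeRuntime flag and the exact textual layout of the gcjlib URL are all assumptions made for illustration. It uses java.net.URI rather than java.net.URL because a stock JVM has no handler for a gcjlib scheme.

```java
import java.io.File;
import java.net.URI;

// Illustrative sketch only; names and URL layout are assumptions.
public class PluginLibraryLocator {

    // Hypothetical naming rule: "runtime.jar" -> "runtime.jar.so" alongside it.
    static File sharedLibraryFor(File jar) {
        return new File(jar.getPath() + ".so");
    }

    // Assumed layout, modeled on the jar scheme: gcjlib:<file URI>!/
    static URI gcjlibUri(File library) {
        return URI.create("gcjlib:" + library.toURI() + "!/");
    }

    // Ordinary jar-style location: jar:<file URI>!/
    static URI jarUri(File jar) {
        return URI.create("jar:" + jar.toURI() + "!/");
    }

    // Choose what to hand to the plugin class loader. The rewrite happens
    // only when running natively AND a precompiled library is present;
    // in every other case the jar location is passed through unchanged.
    static URI classPathEntry(File jar, boolean nativeRuntime) {
        File library = sharedLibraryFor(jar);
        if (nativeRuntime && library.isFile()) {
            return gcjlibUri(library);
        }
        return jarUri(jar);
    }
}
```

Keeping the decision in one small function mirrors the point made above: the fallback to the jar file is what lets the same installation run on an unmodified JVM.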

Once that was done, we wrote our own launcher that understood how to
bootstrap the Eclipse platform from shared libraries. This was
accomplished in a
modest 90 lines of code.

Profiling

After all that, Eclipse was mysteriously
slow. Had we done something wrong? Was GCJ-compiled
code substantially worse than the code
generated on the fly by the current crop of
just-in-time (JIT) compilers? Did -fno-assume-compiled
have enormous overhead?

One nice advantage of GCJ is that its output generally can be treated in
the same way one treats any object code. That is, existing tools such
as OProfile can be applied to it directly without any change. And that,
in fact, is how we investigated our performance problem.

The first thing we noticed was a large number of exceptions being
thrown during platform startup. Amid the grumblings of compiler
writers (exceptions should be for exceptional circumstances), and
while we were considering changes to the GCJ runtime that would
violate Java semantics, we noticed a strange symbol in the OProfile output.
It turned out that a small bit of buggy assembly code deep in the
libgcj runtime was causing a linear search of exception handling
tables rather than the expected binary search. The overhead of this
search through the entire program every time an exception was thrown was
vast. A fix to the errant assembly code proved this was
the problem, and suddenly our natively compiled Eclipse was able to
start a second faster than the stock version using a JVM. To
quantify it a bit further, the startup time dropped from more than a minute
before the fix to less than 15 seconds after it.
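To see why the choice of search matters so much, consider a sorted table of code address ranges, each mapped to a handler. The sketch below is not libgcj's unwinder, which is native code; the table layout, class name and probe counter are invented for illustration. It only compares how many entries a linear scan and a binary search examine when looking up the same address.

```java
// Illustrative sketch only; not libgcj's exception handling code.
public class HandlerLookup {

    // One table entry: addresses in [start, end) are covered by this handler.
    static final class Range {
        final long start;
        final long end;
        final String handler;

        Range(long start, long end, String handler) {
            this.start = start;
            this.end = end;
            this.handler = handler;
        }
    }

    // Number of table entries examined since the counter was last reset.
    static int probes;

    // Build n adjacent 16-byte ranges, sorted by start address.
    static Range[] sampleTable(int n) {
        Range[] table = new Range[n];
        for (int i = 0; i < n; i++) {
            table[i] = new Range(i * 16L, i * 16L + 16, "handler" + i);
        }
        return table;
    }

    // Linear scan: examines entries one by one from the front.
    static Range linear(Range[] table, long pc) {
        for (Range r : table) {
            probes++;
            if (pc >= r.start && pc < r.end) {
                return r;
            }
        }
        return null;
    }

    // Binary search: requires the table to be sorted and non-overlapping,
    // and halves the remaining candidates on every probe.
    static Range binary(Range[] table, long pc) {
        int lo = 0;
        int hi = table.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            Range r = table[mid];
            probes++;
            if (pc < r.start) {
                hi = mid - 1;
            } else if (pc >= r.end) {
                lo = mid + 1;
            } else {
                return r;
            }
        }
        return null;
    }
}
```

With a 1,024-entry table and an address in the last range, the linear scan examines all 1,024 entries while the binary search examines at most 11. For a program the size of Eclipse, paying the linear cost on every thrown exception adds up quickly.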

Limitations and Shameless Hacks

Currently, we don't compile Eclipse directly from source to object
code. Instead, we compile to bytecode and then compile the jar files
to shared libraries. This is done for two reasons. First,
a few bugs in the GCJ source compiler haven't been fixed.
Second, Eclipse comes with its own build scripts that compile from
source to bytecode. Reworking the Eclipse build system to allow
building directly from source to binary seemed like a much larger
divergence from the upstream sources than we were willing to maintain.

Also, we currently don't precompile all the jar files to shared
libraries—some remain as jar files and are interpreted at runtime.
This is done because the class libraries still are incomplete, and
these jar files refer to classes that have not been implemented yet.

One of our patches is unsuitable for the public GCJ. We had to
disable the compile-time bytecode verifier, as it was too buggy to
compile some of the Eclipse jar files. We're in the process of
replacing this verifier with a more robust one.

In addition, one limitation of natively compiled Eclipse deserves
mention. You can't use natively compiled Eclipse to debug a
GCJ-compiled application, because JDWP, the Java Debug Wire Protocol
used by Eclipse, hasn't been implemented in libgcj yet.

Implications and Future Directions

The achievement of the native compilation of Eclipse is a strong
indication that open-source Java based on GCJ and libgcj/classpath
has reached the point of being commercially useful. That said,
it's still not complete. Some fairly substantial gaps
still need to be filled in before open-source Java can be a proper
drop-in substitute for proprietary JVMs.

One of the major areas that needs work is the development/integration
of a JIT compiler. A JIT would allow a
GCJ-based open-source Java environment to be used in a manner similar to a
conventional JVM, meaning that native compilation and
platform-specific binaries would not be necessary for performance reasons.

The other major piece that needs work is also, by far, the most
visible missing piece: Swing. Work on an open-source implementation
of Swing is coming along nicely as part of the GNU Classpath Project,
but Swing is a huge undertaking and the GNU Classpath implementation
is still not quite usable.

A full-featured and completely open-source Java environment is an
attractive alternative to proprietary JVMs, and it's now within reach.
During the past six months, Red Hat has more than doubled the number of
engineers working in support of the open-source Java
solution and community. Eclipse is a large, complicated piece of
software, and natively compiling and running it was an excellent test
of and testament to the progress being made on open-source Java.
The power of open source
lies in its communities, so please consider joining the open-source
Java community and contributing to the GCJ and GNU Classpath Projects
in any way that interests you.

John Healy is the manager of Red Hat's Eclipse Engineering group, based
in Toronto (people.redhat.com/jhealy). In the past
he's worked on custom open-source toolchains
for embedded processors as well as CRM and computer-telephony
applications.

Andrew Haley has been a programmer for longer than he cares to
remember. He is one of the maintainers of GCJ. He works for Red Hat,
which supports him in this task.

Tom Tromey has worked on free software since the early 1990s. Patches of
his appear in GCC, Emacs, GNOME, Autoconf, GDB and probably other
packages he has forgotten about. He works at Red Hat as the technical
lead of the Eclipse Engineering team. He can be reached at
tromey@redhat.com.