Most developers think of the Java compiler, javac, as an
unobtrusive command-line tool to invoke when you want to turn Java
source code into class files. The Java Compiler API, JSR 199, released
in final form last December, opens up the Java compiler to programmatic
interaction as well. Artima spoke with JSR 199 spec lead and Sun
engineer Peter von der Ahé about what programmatic compiler access
means for developers.

Frank Sommers: Can you start by giving us a
bird's eye view of the JSR 199 API?

Peter von der Ahé: The JSR 199 Compiler API
consists of three things: The first one basically allows you to invoke a
compiler via the API. Second, the API allows you to customize
how the compiler finds and writes out files. I mean files in the
abstract sense, since the files the compiler deals with aren't
necessarily on the file system. JSR 199's file abstraction allows
you to have files in a database, and to generate output directly to
memory, for example. Finally, the JSR 199 API lets you collect
diagnostics from the compiler in a structured way so that you can easily
transform error messages, for instance, into lines in an IDE's
editor.
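
To make the three parts concrete, here is a minimal sketch that is not taken from the interview. It invokes the system compiler through javax.tools, feeds it a source "file" held in a String, and collects structured diagnostics instead of printed error text. The class names CompileWithDiagnostics and StringSource and the method check are hypothetical names chosen for this illustration.

```java
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CompileWithDiagnostics {

    /** Hypothetical helper: a source "file" whose text lives in a String, not on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension),
                    Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Compiles the source and returns one "KIND at line N: message" entry per diagnostic. */
    static List<String> check(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler; run on a JDK, not a bare JRE");
        }
        DiagnosticCollector<JavaFileObject> collector = new DiagnosticCollector<>();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null,       // default Writer for any extra compiler output
                null,       // default (standard) file manager
                collector,  // receives structured diagnostics instead of printed text
                null,       // no compiler options
                null,       // no class names for annotation processing
                List.of(new StringSource(className, code)));
        task.call();        // false when compilation fails; the details are in the collector

        List<String> report = new ArrayList<>();
        for (Diagnostic<? extends JavaFileObject> d : collector.getDiagnostics()) {
            report.add(d.getKind() + " at line " + d.getLineNumber() + ": "
                    + d.getMessage(Locale.ENGLISH));
        }
        return report;
    }

    public static void main(String[] args) {
        // Deliberately broken source: a String assigned to an int on line 2.
        String broken = "class Broken {\n    int x = \"oops\";\n}\n";
        check("Broken", broken).forEach(System.out::println);
    }
}
```

Each Diagnostic carries its kind, source object, and line and column positions, which is the information an editor needs to mark the offending line. Note that with the default file manager a successful compilation writes class files to the default output location, so this sketch is best tried with source that fails to compile.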

Frank Sommers: How do you expect the Java Compiler
API to impact developers' work?

Peter von der Ahé: The main benefits to developers are
indirect in that the JSR 199 API allows better tools, shorter deployment times,
and better infrastructure to exist.

For example, one of the benefits of having a compiler API is that you can
make compilation part of an application-level service. Consider the case when
you upload JSP code to an app server: The server has to analyze the JSP files,
generate Java source code files from the JSPs, write those files out to disk, invoke an
external compiler that then reads the generated Java source code files from disk, writes
the class files back to disk, and then the app server needs to read those class
files into memory. With the Compiler API, you can keep the compiler running in
that app server, and keep all of that in memory. That can reduce deployment
time, and also eliminates the startup overhead of the compiler.
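
The in-memory round trip described above might look roughly like the following sketch, which is not from the interview. It assumes the javac implementation that ships with the JDK, and all class and method names (InMemoryCompiler, StringSource, MemoryClassFile, MemoryFileManager, MemoryClassLoader, compileAndInstantiate) are hypothetical. The source comes from a String, the class bytes are captured in a byte array, and a small class loader defines the class without touching the disk.

```java
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class InMemoryCompiler {

    /** Source text held in a String (it could equally come from a database row). */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension),
                    Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    /** Class-file output that collects the generated bytes in memory. */
    static final class MemoryClassFile extends SimpleJavaFileObject {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        MemoryClassFile(String className) {
            super(URI.create("mem:///" + className.replace('.', '/') + Kind.CLASS.extension),
                    Kind.CLASS);
        }
        @Override public OutputStream openOutputStream() { return bytes; }
    }

    /** Redirects compiler output to memory; all other requests go to the standard manager. */
    static final class MemoryFileManager extends ForwardingJavaFileManager<StandardJavaFileManager> {
        final Map<String, MemoryClassFile> classes = new HashMap<>();
        MemoryFileManager(StandardJavaFileManager standard) { super(standard); }
        @Override
        public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location, String className,
                                                   JavaFileObject.Kind kind, FileObject sibling) {
            MemoryClassFile file = new MemoryClassFile(className);
            classes.put(className, file);
            return file;
        }
    }

    /** Defines classes straight from the bytes captured by MemoryFileManager. */
    static final class MemoryClassLoader extends ClassLoader {
        private final Map<String, MemoryClassFile> classes;
        MemoryClassLoader(Map<String, MemoryClassFile> classes) {
            super(InMemoryCompiler.class.getClassLoader());
            this.classes = classes;
        }
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            MemoryClassFile file = classes.get(name);
            if (file == null) throw new ClassNotFoundException(name);
            byte[] b = file.bytes.toByteArray();
            return defineClass(name, b, 0, b.length);
        }
    }

    /** Compiles the source in memory, loads the class, and returns a new instance of it. */
    static <T> T compileAndInstantiate(String className, String code, Class<T> type) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) throw new IllegalStateException("No system compiler available");
        try (MemoryFileManager files =
                     new MemoryFileManager(compiler.getStandardFileManager(null, null, null))) {
            boolean ok = compiler.getTask(null, files, null, null, null,
                    List.of(new StringSource(className, code))).call();
            if (!ok) throw new IllegalArgumentException("Compilation failed for " + className);
            Class<?> loaded = new MemoryClassLoader(files.classes).loadClass(className);
            return type.cast(loaded.getDeclaredConstructor().newInstance());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String code = "public class Greeter implements java.util.function.Supplier<String> {\n"
                + "    public String get() { return \"hello from memory\"; }\n"
                + "}\n";
        Supplier<?> greeter = compileAndInstantiate("Greeter", code, Supplier.class);
        System.out.println(greeter.get());
    }
}
```

A server that generates Java source from JSPs, or keeps its programs in a database, could follow the same pattern by replacing StringSource and MemoryClassFile with file objects backed by its own storage.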

To mention another example, say, you have an app server that stores
most of its data in a database, and is highly optimized for database
access. It is then natural to store not just the data, but also the
program in a database. Before the Compiler API, you had to take the
program data out of the database, put it on the file system, run the
compiler as an external process, which would then have to start up,
incurring a time overhead. And once you've generated some results,
you'd have to copy those back into the database. The compiler API
allows you to shortcut these steps, since it can consume files directly
from the database, thus allowing better integration with the
database.

Another benefit for developers is that IDEs and other developer tools
can more tightly integrate with compilers. By using the JSR 199 API, you
can invoke a compiler directly from within an IDE's editor, or from
build tools, such as Ant. Those tools then have tighter control over
the compiler. In an application area where compilers are used a lot,
reducing the compile time significantly with the Compiler API can have a
big impact.

As a result, I think the expectation from developers will be higher
for tools that integrate with the compiler API. While I don't think
the Compiler API will fundamentally change how developers interact with
their IDEs, it's the combination of various subtle things the API
allows that will make a lot of difference.

Frank Sommers: To what extent will that tighter
compiler integration be available in the upcoming NetBeans 6.0
release?

Peter von der Ahé: Again, it's a combination of
small things that will make NetBeans 6 look very distinct from NetBeans
5 in the editor area. For NetBeans 6.0, we completely rewired the guts
of the Java source code editor, using the compiler to implement the editor.

That means a couple of things. For example, we expect quick NetBeans
integration of whatever new language features will be put into JDK 7.
Just what those features will be is an open issue. But once those are
implemented in the compiler, we expect that most of the new language
features will work in NetBeans without much extra work.

Another thing is that the editor simply has more information about your
program. For example, does a method override another
method? Or how many times do these methods get overridden in
subclasses? If you're editing a superclass, you can see how many
classes extend it or override a specific method. Code completion pops up
faster, and there is less overhead.

Note that the compiler integration is just a means to an end. NetBeans 6
simply has a better Java source code editor. We achieve that by working
more closely together and by providing more direct access to the
underlying compiler.

I keep saying NetBeans, but there are other IDEs out there. All the
changes we made to the compiler are in Java SE 6, and there are more changes
to come, of course, as we move along. Those are out for everybody to
use. But at the moment, NetBeans 6 is the only IDE I know of that uses Java
6's compiler features directly.

Frank Sommers: Based on your description, JSR 199 is
really an API around the Java compiler. How deep inside the actual
compiler can you get with this API?

Peter von der Ahé: Quite deep. JSR 199 works together with
the annotation processing API, JSR 269, for instance, that presents a
compile-time model similar to core reflection. Just as when running a Java
program you can reflect on objects and examine the structure of objects and
class files, the annotation framework allows you to examine the classes the
compiler is compiling. That lets you get fairly deep into the compiler data
structures.
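
As a rough illustration of that compile-time model (a sketch, not code from the interview), the following JSR 269 annotation processor walks the classes handed to the compiler and records their methods, much as core reflection would at run time. The names CompileTimeReflection, MethodLister, and listMethods are hypothetical. The -proc:only option asks javac to run annotation processing without generating class files.

```java
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.ExecutableElement;
import javax.lang.model.element.TypeElement;
import javax.lang.model.util.ElementFilter;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class CompileTimeReflection {

    /** Records every method of every top-level class the compiler is asked to process. */
    @SupportedAnnotationTypes("*") // "*" means: run even when the source uses no annotations
    static final class MethodLister extends AbstractProcessor {
        final List<String> methods = new ArrayList<>();

        @Override
        public SourceVersion getSupportedSourceVersion() {
            return SourceVersion.latestSupported();
        }

        @Override
        public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment round) {
            for (TypeElement type : ElementFilter.typesIn(round.getRootElements())) {
                for (ExecutableElement m : ElementFilter.methodsIn(type.getEnclosedElements())) {
                    methods.add(type.getQualifiedName() + "#" + m.getSimpleName()
                            + " returns " + m.getReturnType());
                }
            }
            return false; // leave any annotations unclaimed for other processors
        }
    }

    /** Runs the processor over one in-memory source file and returns what it saw. */
    static List<String> listMethods(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) throw new IllegalStateException("No system compiler available");
        SimpleJavaFileObject source = new SimpleJavaFileObject(
                URI.create("string:///" + className.replace('.', '/') + ".java"),
                JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };
        MethodLister lister = new MethodLister();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, null,
                List.of("-proc:only"), // annotation processing only; no class files are written
                null, List.of(source));
        task.setProcessors(List.of(lister));
        task.call();
        return lister.methods;
    }

    public static void main(String[] args) {
        String code = "class Shape {\n"
                + "    double area() { return 0; }\n"
                + "    String name() { return \"shape\"; }\n"
                + "}\n";
        listMethods("Shape", code).forEach(System.out::println);
    }
}
```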

In addition, Sun is providing access to the compiler syntax tree, a feature
relied on very heavily by the upcoming version of NetBeans. While this is not
directly in JSR 199, we're providing that capability in subclasses of the 199
API. At the moment, this only works on Sun's compiler, but providing
standardized access to a tree API is something we might want to consider in the
future.

Before we could think of a standard way to access the compiler tree,
though, we needed to have access to the tree API first. In the past, we
didn't even provide a stable API to those internal compiler
structures. Instead of trying to standardize on a tree API at this
stage, we made a more stable version of the API. You can say that the
tree API we're providing now is a first try in that direction. At a
later stage, we'll have to reconcile differences between various
compilers, and then we will perhaps look at proposing a standard API for
syntax trees.

Frank Sommers: How does parsing work in the current
compiler?

Peter von der Ahé: In JSR 199, there is a construct
called a compilation task. If you access the Sun subclass of that class,
then you get additional utilities from that subclass. One of them is the
ability to parse files. You can specify files using the file abstraction
from JSR 199. If you give the parse() method files in that
form, it returns a list of abstract syntax trees. You can then
run through the syntax trees and analyze them in your program.
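
A minimal sketch of that parse step follows. It is not from the interview, and it assumes the javac-specific class com.sun.source.util.JavacTask is the subclass in question, which the interview does not name. The names ParseOnly and methodNames are hypothetical. The code casts the compilation task, calls parse(), and walks the resulting trees with a TreeScanner to collect method names.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class ParseOnly {

    /** Parses one in-memory source file and returns the names of the methods declared in it. */
    static List<String> methodNames(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) throw new IllegalStateException("No system compiler available");
        SimpleJavaFileObject source = new SimpleJavaFileObject(
                URI.create("string:///" + className.replace('.', '/') + ".java"),
                JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };

        // The cast is the javac-specific step: other JSR 199 compilers need not return a JavacTask.
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(source));

        List<String> names = new ArrayList<>();
        // Read-only visitor: it inspects the tree but never changes it.
        TreeScanner<Void, Void> collector = new TreeScanner<Void, Void>() {
            @Override
            public Void visitMethod(MethodTree node, Void unused) {
                names.add(node.getName().toString());
                return super.visitMethod(node, unused); // keep scanning nested declarations
            }
        };
        try {
            for (CompilationUnitTree unit : task.parse()) {
                collector.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String code = "class Shape {\n"
                + "    double area() { return 0; }\n"
                + "    String name() { return \"shape\"; }\n"
                + "}\n";
        System.out.println(methodNames("Shape", code));
    }
}
```

Because only the parser runs, no class files are produced and the source does not even have to type-check.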

Note that the tree API we're exposing is not mutable—you
cannot change the syntax tree. Providing a mutable API for that would
involve a lot of challenges, as we learned from experience with some
NetBeans-related tools, notably the Jackpot project.

The Jackpot project learned that if you don't allow modification
of trees, but instead copy trees to a new version when you want to make
modifications, then you can compare the old version of a tree to a new
version. If a user later decides that a modification didn't work,
you can just throw away that modification, or the copy representing that
modification, and go back to the old version. That turns out to be a
great way to perform undo during refactoring. Actually, what Jackpot
provides goes beyond refactoring—it's more apt to call that
code re-engineering. While the compiler doesn't directly support
this, Jackpot extends the compiler's capabilities to recover trees
in that manner.
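
The copy-instead-of-mutate idea can be sketched in a few lines. This is a toy illustration only: it is not Jackpot's implementation and does not use the compiler's tree classes, and every name in it (CopyOnWriteTrees, Node, History, rename) is hypothetical. Each edit produces a new tree, the old version stays intact, and undo amounts to dropping the newest copy.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

class CopyOnWriteTrees {

    /** Immutable node: a label plus an unmodifiable list of children. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, List<Node> children) {
            this.label = label;
            this.children = Collections.unmodifiableList(new ArrayList<>(children));
        }

        /** Returns a new tree in which every node labelled oldName is labelled newName. */
        Node rename(String oldName, String newName) {
            List<Node> copied = new ArrayList<>();
            for (Node child : children) {
                copied.add(child.rename(oldName, newName));
            }
            return new Node(label.equals(oldName) ? newName : label, copied);
        }

        @Override
        public String toString() {
            return children.isEmpty() ? label : label + children;
        }
    }

    /** Keeps earlier versions so that an edit can simply be thrown away. */
    static final class History {
        private final Deque<Node> versions = new ArrayDeque<>();

        History(Node initial) { versions.push(initial); }

        Node current() { return versions.peek(); }

        void apply(Node edited) { versions.push(edited); }

        /** Discards the newest version; the original version is never removed. */
        void undo() {
            if (versions.size() > 1) versions.pop();
        }
    }

    public static void main(String[] args) {
        Node original = new Node("Shape",
                List.of(new Node("area", List.of()), new Node("name", List.of())));
        History history = new History(original);
        history.apply(history.current().rename("area", "surface"));
        System.out.println(history.current()); // the edited copy
        history.undo();
        System.out.println(history.current()); // the untouched original
    }
}
```

Because the old and new versions exist side by side, a tool can also compare them to show the user exactly what an edit would change before committing to it.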

Frank Sommers: You spoke before about the Kitchen
Sink project that provides an experimental playground for new Java
language features. If I have an idea for a great new Java programming language
feature, what steps would I have to take to implement that feature in
the open-source compiler?

Peter von der Ahé: When it comes to language
features, the first thing you want to do is not touch the compiler. The
really important part comes from first sitting down and thinking really
hard about the specification of that change. Think about how you want
certain things to work. Consider the various corner cases, and how you
want to solve them.

That will give you a good starting point for language features,
because you want to have fairly clear ideas of how the syntax should look
before you start modifying the compiler, especially the parser. You want
to think about whether there should be new types of syntax trees in the
compiler so that when the parser is analyzing the new language features,
it can create those trees.

Once you have a good plan for what you want to do, it should be fairly easy
to modify the parser. Note that we have a hand-written parser in
javac, and that the compiler relies extensively on the visitor
pattern. One of the things you need to decide is whether you'll need to extend
the visitors to support your new language feature.

Whether you are reusing an existing syntax tree or adding a new one,
you will know where to look for changes once you figure out which syntax
tree you're looking at. You'll want to go through all the phases
of the compiler, and see what's going on in each area. It's not
trivial, but it's not as daunting a task as one might think.

It's also possible to plug in a different parser. But then
we're not talking about a standard API any more, but about going in
and making changes to the internals of the compiler—you're off
the beaten path, and things at that level may change as we change the
implementation of the compiler.

Frank Sommers: The JSR 199 specs state that the
compiler should generate a valid Java class. Can you modify the compiler
so that you output something other than Java classes, even from Java
source code? Likewise, can you modify the compiler to read in some other
language and generate Java class files from that language?

Peter von der Ahé: You could probably use the
compiler to read some other language, such as JavaScript, for instance,
but then we get into details of our conformance rules, and what you can
and can't call Java. That's really a legal
issue that I would prefer to steer clear of.

We have discussed internally at Sun whether it would be a good idea to reuse
compiler technology and implementations between the Java programming language,
C++, and JavaScript, for example. One use of that would be in the NetBeans IDE.
When you look at the Java programming language, C, and C++, they share to some
extent certain traits. If you create a compiler that generates code in some
intermediate format, as in GCC, then you can probably share some technology
there, especially in the area of optimization in the back-end. However, when it
comes to applications, such as IDEs, then the differences become more of a
stumbling block.

In fact, such reuse was already tried, and we decided to stop doing
that. You will see the benefits of not doing that once NetBeans 6 comes
out, because the newly implemented Java source code editor is
very specific to the Java programming language. Separately, we can have a specific JavaScript
editor. So it's not so much reusing compiler technology as it is
reusing experience.

I could imagine that if someone was writing a compiler for some
scripting language to get it running on the JVM, they could reuse a few
classes that make up the back-end of javac. But even then,
they would have to do a lot of work, because javac is
targeted only at generating class files from the Java programming language.

Frank Sommers: Sun open-sourced javac
last November along with the JDK, and the JCP followed that with the
final release of JSR 199. Going forward, what upcoming compiler-related
features are you most excited about?

Peter von der Ahé: Right now, I'm most excited
about the community involvement. For example, with the Kitchen Sink
project we talked about, the potential is really exciting: we now have a
more open way to evaluate various language features, and see what works
best.

I'm also excited about some of the possible new language features
for JDK 7 that the compiler will support. While no decisions have been
made on the exact set of new Java programming language features to ship in JDK 7,
let me say that I'm very optimistic about closures, based on the
proposal led by Neal Gafter. As with other new language features,
it's now possible to see how they'll look in practice, since the
open-source compiler makes it easier to build an implementation of those
features.

About the author

Frank Sommers is a Senior Editor with Artima Developer. He also serves as chief
editor of the IEEE Technical Committee on Scalable
Computing's newsletter, and is an elected member of the Jini Community's Technical
Advisory Committee. Prior to joining Artima, Frank wrote the Jiniology and Web
services columns for JavaWorld.