In this final installment of the
Diagnosing Java code series, Eric Allen discusses some of the
current trends in software development and predicts what they may lead
to in the coming years.

Loyal readers: I regret to inform you that this will be the last column
in the Diagnosing Java code series. I've (finally) finished my
Ph.D., and I'm off to the industrial labs to help start a new research
program in programming languages.

In this farewell article, let's have some fun and look into our crystal
ball. We'll discuss some of the prevailing trends in the software industry
and the impact we can expect these trends to have on the future of
software development. We'll focus our discussion through the lens of
effective software development that we've used in this series over the past two
and a half years. As always, let's pay particular attention to the crucial
role that effective error prevention and diagnosis play in allowing us to
manage our increasingly complex digital world.

There are three prevalent trends in computing to consider when
examining where the industry is heading as a whole. These are:

The explosion in pervasive and wireless computing

The ever-increasing expense of developing reliable software

The continued rapid pace of improvement in computing performance

Together, these trends are reshaping the fundamental nature of software
and the engineering trade-offs involved in constructing it. At a high
level, the expense and ubiquity of software are pushing us in
the direction of greater and more pervasive abstractions, such as powerful
virtual machines with well-defined semantics and security properties,
allowing us to develop and maintain software more easily and across more
platforms.

At the same time, the continued improvements in computing performance
enable us to build these abstractions without suffering unacceptable
performance degradation. I'd like to take a stab at some of the ways I
believe we could construct new abstractions that would help in building
the next generation of software products.

Component-based development

In component-based development, software is
developed in modules whose external references are decoupled from
particular implementations. These modules can then be linked dynamically
to construct a full-fledged application. Notice that "external references"
includes not just objects that are referred to, but external classes that
can be used and even subclassed. (Think of the ways in which various
packages refer to each other in Java programming. Now think about
decoupling the packages from one another.) In this column, we've already
discussed one vision of component-based programming for Java -- Jiazzi
(see Resources
for the November 2002 column on decoupling package dependencies).

Component-based programming promises two complementary benefits, both
of which will become increasingly important as the above-mentioned trends
become more and more prominent. First, component-based systems allow for
much greater reuse. For example, consider the myriad programs today that
provide text-editing support (mail clients, word processors, IDEs, and so
on). Also, consider the numerous clients that provide support for handling
e-mail. Despite the number of programs offering these services, few
programs handle e-mail as well as a dedicated e-mail client. Likewise, no
mail program will allow text manipulation at the same level as a dedicated
text editor. But why should every mail client (IDE, word processor, and so
on) have to develop its own text editor? It would be so much better if
there were a standard "text-editor" API that could be implemented by
various third-party components. Tools such as mail clients could choose
their favorite implementation of this API and plug it in. In fact, one can
even imagine users assembling their environment from off-the-shelf
components (such as their favorite editor and their favorite mail client)
that could be linked dynamically when an application is run.
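
To make the idea concrete, here is a minimal sketch of what such a plug-in
arrangement might look like. The TextEditor interface and SimpleMailClient
class are hypothetical names invented for this illustration, not part of any
existing API:

// In TextEditor.java: a hypothetical "text-editor" component API.
public interface TextEditor {
    void open(String documentName);
    void insert(int position, String text);
    String contents();
}

// In SimpleMailClient.java: a client that codes only against the API.
public class SimpleMailClient {
    private final TextEditor editor;

    public SimpleMailClient(TextEditor editor) {
        this.editor = editor;
    }

    public static void main(String[] args) throws Exception {
        // The implementation class name could come from a configuration
        // file, so a third-party editor can be plugged in without
        // recompiling the mail client.
        TextEditor editor = (TextEditor) Class.forName(args[0]).newInstance();
        new SimpleMailClient(editor);
    }
}

A true component model would go further, decoupling classes that are
subclassed rather than just objects that are referenced, but even this
object-level decoupling shows the basic shape of plugging in implementations
at run time.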

Another benefit of a component-based model is the potential for greater
testability. In the Java language as it exists today, external references to
classes, such as the I/O library classes and the like, are hard-wired
references that cannot be altered without recompilation. As a result, it
is difficult to test parts of programs that rely on external references in
isolation. For example, it is difficult to test whether a program is
making proper use of the filesystem without actually allowing it to read
and write from the filesystem. But reading and writing real files in unit
tests slows the tests down and adds complexity (such as creating a temporary
directory and cleaning up files after each use). Ideally, we would like to
separate programs from external references to the I/O libraries for the
purposes of testing.
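
For instance, if file access went through an interface rather than hard-wired
references into the I/O libraries, a test could substitute an in-memory
implementation. The FileStore and InMemoryFileStore names below are
hypothetical, a sketch of the idea rather than an existing library:

import java.util.HashMap;
import java.util.Map;

// A hypothetical abstraction over the filesystem.
interface FileStore {
    String read(String path);
    void write(String path, String contents);
}

// A test implementation that keeps its "files" in memory, so unit
// tests never touch the disk and need no temp-directory cleanup.
class InMemoryFileStore implements FileStore {
    private final Map files = new HashMap();

    public String read(String path) {
        return (String) files.get(path);
    }

    public void write(String path, String contents) {
        files.put(path, contents);
    }
}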

There are numerous ways in which we can formulate a component model.
J2EE provides such a model at the level of objects for Web services.
Eclipse provides a model for IDE components. Jiazzi provides a model in
which independently compiled "units" of software can be linked to form a
complete application. Each of these formulations has its use in particular
contexts; we should expect to see yet more formulations in the coming
years.

Assertions and invariants

Hand in hand with component-based programming
must go increased emphasis on assertions and other methods of ensuring
that the invariants expected to hold for a component are actually met. The
type system by itself is not expressive enough to capture all of the
intended invariants. For example, we shouldn't expect that the method
types for a text-editing API would capture such invariants as "only opened
documents can be closed." We can rely on informal documentation to specify
such invariants, but the more invariants we can formalize and check, the
better.
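
For example, with the assertion facility introduced in J2SE 1.4 (enabled by
running java with the -ea flag), the "only opened documents can be closed"
invariant can at least be checked at run time. The Document class here is
purely illustrative:

class Document {
    private boolean opened = false;

    void open() {
        opened = true;
    }

    void close() {
        // The type system cannot express this invariant, but an
        // assertion can catch violations at run time.
        assert opened : "close() called on a document that was never opened";
        opened = false;
    }
}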

Ensuring that the requisite invariants are satisfied is a necessary
aspect of component encapsulation. In general, a client programmer
will have no way to reason about how a component will behave other than what is
said in the published API. Any behavior of a component that isn't included
in the API is not behavior that the client programmer can rely on. If
non-published behavior results in a run-time error, it will be exceedingly
difficult for the programmer to diagnose the problem and repair it.

There are several research projects underway to significantly improve
the sorts of invariants we can specify for a component. Some of them, such
as Time Rover (see Resources
for the July 2002 column), use modal logic and other logical formalisms to
express deep attributes of run-time behavior.

Another approach to expressing invariants is to bolster the type system
with generic types, types parameterized by other types (the topic of
the most recent series of articles in this column).
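
As a quick reminder of the kind of invariant generic types capture, consider
a JSR-14-style list of strings: inserting anything other than a String is
rejected at compile time instead of surfacing later as a ClassCastException:

import java.util.LinkedList;
import java.util.List;

class GenericsExample {
    public static void main(String[] args) {
        List<String> names = new LinkedList<String>();
        names.add("Duke");
        String first = names.get(0);    // no cast needed
        // names.add(new Integer(42));  // rejected by the compiler
        System.out.println(first);
    }
}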

Yet another approach to adding much more powerful invariants is that of
dependent types. Dependent types are types parameterized by
run-time values (compare this with generic types, which are
parameterized by other types).

The canonical example of dependent types is that of an array type
parameterized by the size of the array. By including the size in the type,
a compile-time checker can symbolically analyze the accesses to the array
to ensure that no accesses are done outside the bounds of the array.
Another compelling use of dependent types is that of ownership
types, developed by Boyapati, Liskov, and Shrira (see Resources
for a link to their original paper).

Ownership types are types of objects that are parameterized by an owner
object. For example, consider an iterator over a container. It is natural
to say that the iterator is owned by the container, and therefore,
that the container has special access privileges to the iterator. Inner
classes provide some of the same controls over access privileges, but
ownership types provide a much more powerful and flexible mechanism.
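
To see the connection, here is a small sketch (with an illustrative
IntContainer class) of the access control inner classes already give us: the
iterator can read the container's private state precisely because it is
defined inside the container. Ownership types would make this owner
relationship explicit in the type system and enforce it throughout a program,
not just within a single class body.

class IntContainer {
    private int[] elements = new int[16];  // growth omitted for brevity
    private int size = 0;

    void add(int value) {
        elements[size++] = value;
    }

    // Only the container can construct its own iterator.
    Iterator iterator() {
        return new Iterator();
    }

    class Iterator {
        private int cursor = 0;

        private Iterator() {}

        boolean hasNext() {
            return cursor < size;
        }

        int next() {
            // Reads the owning container's private state directly.
            return elements[cursor++];
        }
    }
}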

Continued improvements in refactoring tools

As software applications become larger, it
becomes increasingly difficult to maintain and improve code bases or to
diagnose bugs. This problem is exacerbated by the scarcity of qualified
developers. Fortunately, development tools are providing us with
increasingly powerful control over software systems. Two of the most
powerful forms of control are unit testing tools and refactoring
browsers.

Unit testing tools allow us to check that key invariants of our
programs continue to hold under refactoring. Refactoring browsers provide
many direct and powerful ways to modify code while preserving behavior. We
are starting to see "second generation" unit testing tools that leverage
static types and unit tests to mutual advantage, allowing for automatic
testing of code coverage and automatic generation of tests. Refactoring
browsers are adding more and more refactorings to the standard repertoire.
Over the longer term, we should look for even more sophisticated tools, such as
"pattern savvy" refactoring browsers that recognize uses (or potential
uses) of design patterns in a program and apply them.

We can even expect development tools to eventually leverage the unit
tests to perform more aggressive refactorings. In Martin Fowler's classic
text, Refactoring: Improving the Design of Existing Code, a
refactoring is defined to preserve the observable behavior of a program.
However, in general we are not concerned about all aspects of the
observable behavior of a program; instead, we generally care about
maintaining certain key aspects of the behavior. But these key aspects are
exactly what the unit test suite is supposed to check!
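
For example, a unit test (written here in JUnit 3.x style, with a
hypothetical Account class) pins down exactly such a key aspect of behavior:

import junit.framework.TestCase;

public class AccountTest extends TestCase {
    public void testDepositIncreasesBalance() {
        Account account = new Account();
        account.deposit(100);
        // However a refactoring browser reorganizes Account's internals,
        // this observable behavior must survive.
        assertEquals(100, account.getBalance());
    }
}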

Therefore, a refactoring browser could potentially leverage the unit
test suite to determine what aspects of behavior are important. Other
aspects could be aggressively modified by the refactoring browser at will,
in order to simplify the code. On the flip side, the functionality of such
a refactoring browser could be leveraged to check test coverage by
determining the kinds of refactorings that are allowed by the unit tests
and reporting them to the programmer.

Interactive debuggers

As applications become more complex and
increasingly run on remote platforms, diagnosis of bugs takes on whole new
challenges. Often, it's not possible or practical to debug software on the
deployment platform. Ideally, we'd like to be able to debug software
remotely.

The Java Platform Debugger Architecture (JPDA) provides exactly such a
facility by allowing the debugger to run in a separate JVM, which
communicates with the target JVM over the Java Debug Wire Protocol
(JDWP). But, in addition to remote
debugging, diagnosis can be made much more efficient by giving the
programmer more control over the access points available when starting a
debugger and the available views of the state of the computation during
debugging.
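
For example, on a 1.4-era JVM, the target application can be launched with
the JDWP agent listening on a socket, and a JPDA-based debugger can then
attach from another machine (the class name MyApp and port 8000 are
placeholders):

java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8000 MyApp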

Even with modern debuggers, programmers still have to resort to
printlns in many contexts to get the information they need.
Ideally, we would have debuggers that completely obviated the need for
printlns. In fact, on the Java programming languages team
(JavaPLT), we are working on just such a debugger. This debugger, due for
open source release in Fall 2003, will make use of a seamlessly integrated
"interactions window" that allows for incremental code evaluation (see Resources
for the March 2002 column). The interactions window lets you start a
debugging process through an arbitrary expression evaluation. It can also
be used at a breakpoint to interact with the running process in
context, accessing the scope visible at that point in the process and
modifying it at will. The JavaPLT debugger will be released both as part
of the DrJava IDE and as an independent Eclipse plug-in.

Lightweight, interoperable development tools

As development tools become more
sophisticated, it becomes increasingly difficult for a single vendor to
provide the best of all tools. So, developers tend to rely on a
smorgasbord of tools from different vendors. Doing so is most pleasant when
the various tools play well together, each designed with the expectation
that it will operate alongside tools from other vendors.

Projects like Eclipse take this philosophy one step further, providing
ways for tools to interoperate, leveraging one another's functionality to
offer services beyond what is possible with any of the tools in
isolation. With time, we should expect this model, or others like it, to
truly "eclipse" traditional all-in-one IDEs.

Meta-level application logic

As our final crystal ball vision, let's consider one
direction that software might take in the very long term. Many of
the most common bugs that occur in applications take the form of a simple
misconfiguration that is easily remedied once the user understands the
underlying details of the application. The problem is that most users
don't have the time to understand the underlying details of all the
applications that they use.

One long-range solution to this problem could be to embed meta-level
knowledge into applications that encodes the context in which the
application is run and what it is supposed to do. For example, meta-level
knowledge for a word processor would include logic explaining that the
program was used by humans on personal computers to produce English
documents that are then read by other humans. With this knowledge encoded,
an application could potentially make inferences about what a user was
trying to do when something goes wrong (of course, the application would
also have to determine that something was wrong in the first place).

Such meta-level knowledge is potentially a powerful mechanism for
adding robustness to an application. It is also extremely dangerous, and
what is most worrisome about the danger is that it is often overlooked by
the strongest advocates of moving in this direction. After all, the
behavior of an application that dynamically reconfigures itself can be
extremely unpredictable. A user may find it quite difficult to reason about
how the program will behave under certain circumstances. And the developers
of the program can find it extremely difficult to assure themselves of its
reliability. As we've seen time and again, the inability to
predict and understand a program's behavior has easily predictable
consequences -- buggy software.

To be clear, I really do think that adaptive software with meta-level
knowledge about the context in which it's used has the potential to vastly
improve the capabilities of software applications, but if we add such
capabilities, we must find ways to do so that still allow us to reason
effectively about our programs.

A great example of a software system that incorporates a form of
meta-level knowledge and adaptability (albeit extremely limited) without
sacrificing predictable behavior is the TiVo personal video recorder (or
other similar products). TiVo uses your television viewing habits to
adaptively determine what shows you might like to watch, but this
adaptability is stringently restricted. TiVo always follows explicit user
directives about which shows to record, regardless of its adaptive
behavior. TiVo uses a very simple form of meta-level knowledge,
but even as the meta-level knowledge used becomes more and more complex,
we should continue to keep control over adaptive behavior. If you'll
forgive a somewhat fanciful comparison from the realm of science fiction,
we could follow the precedent set by Isaac Asimov. Asimovian robots were
extraordinarily powerful machines, but they were controlled by absolute
and inviolable fundamental laws, allowing for some degree of
predictability in their behavior.

I bid you a fond adieu

And it's on the note of Asimovian robots that I'll
choose to wrap up this discussion. I'd like to thank the team at
developerWorks for their efforts over the past two-and-a-half
years: editor Jenni Aloi, for giving me the opportunity to write this
column; copy editor Christine Stackel, for her excellent attention to
detail; and finally, developmental editor Kane Scarlett, for his heroic
efforts, which substantially improved the content of this column.

To my readers: I hope you have found some lasting value in these
articles. Writing them has been an invaluable learning experience for me.
Thank you for your patronage, and I wish you the best of luck in your
efforts to prevent and diagnose bugs in your programs.

Resources

Get a jump on generics in the Java language by downloading the JSR-14
prototype compiler; it includes the sources for a prototype compiler
written in the extended language, a JAR file containing the class files
for running and bootstrapping the compiler, and a JAR file containing
stubs for the collection classes.

Follow the discussion of adding generic types to the Java language
by reading the Java Community Process proposal, JSR-14.

And don't forget to try the high-performance code-analysis engine
for both J2SE and J2EE development, CodeGuide from
OmniCore. It already provides IDE support for generic types in the Java
language with the JSR-14 prototype compiler.

Eric Allen has a new book on the subject of bug patterns, Bug Patterns
in Java, which presents a methodology for diagnosing and
debugging computer programs by focusing on bug patterns, Extreme
Programming methods, and ways to craft powerful testable and extensible
software.

About the author

Eric Allen sports a broad range of hands-on
knowledge of technology and the computer industry. With a B.S. in
computer science and mathematics from Cornell University and an M.S.
in computer science from Rice University, Eric is currently a Ph.D.
candidate in the Java programming languages team at Rice. Eric is a
project manager for and a founding member of the DrJava project, an
open-source Java IDE designed for beginners; he is also the lead
developer of the university's experimental compiler for the NextGen
programming language, an extension of the Java language with added
experimental features. Contact Eric at eallen@cs.rice.edu.