This chapter is from the book

Java classes preserve a wealth of information about programmer intent. Rather
than just containing a jumble of executable instructions, binary
classes also contain large amounts of
metadata: data that describes the structure of the binary class. Most
of this metadata is type information enumerating the base class,
superinterfaces, fields, and methods of the class. Type information is used to
make the dynamic linking of code more reliable by verifying at runtime that
clients and servers share a common view of the classes they use to communicate.

The presence of type information also enables dynamic styles of programming.
You can introspect against a binary class to discover its fields and
methods at runtime. Using this information, you can write generic services to
add capabilities to classes that have not even been written yet.
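As a minimal sketch of this kind of introspection, the following code lists the fields and methods of a class using the standard java.lang.reflect package. The Gadget and MemberLister classes are hypothetical names invented for this illustration; only the reflection calls themselves come from the SDK.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class that exists only to be inspected.
class Gadget {
    public int size;
    public String label;

    public void activate() { }
}

// Hypothetical helper; not part of any library.
class MemberLister {
    // Collects one line per declared field and declared method of the given class.
    static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            lines.add("field " + f.getType().getName() + " " + f.getName());
        }
        for (Method m : type.getDeclaredMethods()) {
            lines.add("method " + m.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        // The Reflection API does not guarantee the order of the members.
        describe(Gadget.class).forEach(System.out::println);
    }
}
```

Because describe takes any Class object, it works equally well on classes that did not exist when MemberLister was compiled.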

The binary class format is a simple data structure that you could parse to
perform introspection yourself. Rather than going to this trouble, you can use
the Java Reflection API instead. Reflection provides programmatic access to most
of the metadata in the binary class format. Beyond introspecting classes for
metadata, it also lets you access fields and invoke methods dynamically.
Reflective invocation is critical for writing generic
object services. As of SDK version 1.3, reflection also includes the ability to
manufacture classes called dynamic proxies at runtime. This chapter introduces
the binary class format, the uses of metadata, the Reflection API, dynamic
proxies, and custom metadata.
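To give a first taste of dynamic proxies, here is a small sketch built on java.lang.reflect.Proxy and InvocationHandler. The Greeter interface and the ProxyDemo class are assumptions made up for this example. The proxy records the name of each method it receives and then forwards the call reflectively.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical interface used only for this sketch.
interface Greeter {
    String greet(String name);
}

class ProxyDemo {
    // Wraps target in a proxy class that is manufactured at runtime.
    // Each call appends the method name to log, then forwards to target.
    static Greeter loggingGreeter(Greeter target, StringBuilder log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.append(method.getName());
            // Reflective invocation: the method is chosen at runtime.
            return method.invoke(target, args);
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Greeter g = loggingGreeter(name -> "Hello, " + name, log);
        System.out.println(g.greet("Igor"));
        System.out.println("calls seen: " + log);
    }
}
```

The handler never mentions greet by name, so the same logging code could serve any interface. A production handler would also unwrap the InvocationTargetException that method.invoke throws when the target fails.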

3.1 The Binary Class Format

The binary class format means different things to different people. To an
application developer, the binary class is the compiled output of a Java class.
Most of the time, you can treat the class format as a black box, a detail
that is thankfully hidden by the compiler. The binary class is also the unit of
executable code recognized by the virtual machine. Virtual machine developers
see the binary class as a data structure that can be loaded, interpreted, and
manipulated by virtual machines and by Java development tools. The binary class
is also the unit of granularity for dynamic class loading. Authors of custom
class loaders take this view and may use their knowledge of the binary class
format to generate custom classes at runtime. But most importantly, the binary
class is a well-defined format for conveying class code and class metadata.

Most of the existing literature on the binary class format targets compiler
and virtual machine developers. For example, the virtual machine specification
provides a wealth of detail about the exact format of a binary class, plus a
specific explanation of extensions that can legally be added to that format. For
a Java developer, such detail is overkill. However, hidden in that detail is
information that the virtual machine uses to provide valuable services, such as
security, versioning, type-safe runtime linkage, and runtime type information.
The availability and quality of these services are of great concern to all Java
developers. The remainder of Section 3.1 will describe the information in the
binary class format, and how that information is used by the virtual machine.
Subsequent sections show you how you can use this information from your own
programs.

3.1.1 Binary Compatibility

A clear example of the power of class metadata is Java's enforcement of
binary compatibility at runtime. Consider the MadScientist class and
its client class BMovie, shown in Listing 3-1. If you compile the
two classes and then execute the BMovie class, you will see that the
threaten method executes as expected. Now, imagine that you decide to
ship a modified version of MadScientist with the threaten
method removed. What happens if an old version of BMovie tries to use
this new version of MadScientist?
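The listing itself is not reproduced in this text. A minimal sketch that is consistent with the description might look like the following; the class and method names come from the text, while the method bodies, return types, and the run helper are assumptions.

```java
// Hypothetical reconstruction of version 1; bodies and messages are assumptions.
class MadScientist {
    // Declared first, so it is the "first method" of the class.
    public String threaten() {
        return "I plan to blow up the world";
    }

    // Declared second; becomes the first method if threaten is removed.
    public String blowUpWorld() {
        throw new Error("The world is destroyed");
    }
}

class BMovie {
    // The client only ever calls threaten.
    static String run() {
        return new MadScientist().threaten();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

To try the experiment, compile both classes, delete threaten from MadScientist, recompile only MadScientist, and run the old BMovie class file.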

In a language that does not use metadata to link methods at runtime, the
outcome is poorly defined. In this particular case, the old version of
BMovie probably would link to the first method in the object. Since
threaten is no longer part of the class, blowUpWorld is now
the first method. This program error would literally be devastating to the
caller.

As bad as this looks, an obvious failure is actually one of the best
possible outcomes for version mismatches in a language without adequate
metadata. Consider what might happen in a systems programming language, such as
C++, that encodes assumptions about other modules as numeric locations or
offsets. If these assumptions turn out to be incorrect at runtime, the resulting
behavior is undefined. Instead of the desired behavior, some random method may
be called, or some random class may be loaded. If the random method does not
cause an immediate failure, the symptoms of this problem can be incredibly
difficult to track down. Another possibility is that the code execution will
transfer to some location in memory that is not a method at all. Hackers may
exploit this situation to inject their own malicious code into a process.

Compare all the potential problems above with the actual behavior of the Java
language. If you remove the threaten method and recompile only the MadScientist class, you will see the following result:

If a class makes a reference to a nonexistent or invalid entity in some other
class, that reference will trigger some subclass of
IncompatibleClassChangeError, such as the NoSuchMethodError
shown above. All of these exception types indirectly extend Error, so
they do not have to be checked and may occur at any time. Java assumes fallible
programmers, incomplete compile-time knowledge, and partial installations of
code that change over time. As a result, the language makes runtime metadata
checks to ensure that references are resolved correctly. Systems languages, on
the other hand, tend to assume expert programmers, complete compile-time
knowledge, and full control of the installation process. The code that results
from these assumptions may load a little faster than Java code, but it will be
unacceptably fragile in a distributed environment.
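The relationship between these error types can be checked directly. The sketch below is an illustration only: LinkageErrors is a hypothetical class, and the error is simulated by throwing NoSuchMethodError by hand rather than by a real version mismatch.

```java
class LinkageErrors {
    // True when NoSuchMethodError is an IncompatibleClassChangeError,
    // which is a LinkageError, which is an Error (and never an Exception).
    static boolean hierarchyHolds() {
        return IncompatibleClassChangeError.class.isAssignableFrom(NoSuchMethodError.class)
                && LinkageError.class.isAssignableFrom(IncompatibleClassChangeError.class)
                && Error.class.isAssignableFrom(LinkageError.class)
                && !Exception.class.isAssignableFrom(NoSuchMethodError.class);
    }

    // A caller that wants to survive a version mismatch can catch the
    // common superclass instead of each specific error.
    static String callGuarded(Runnable call) {
        try {
            call.run();
            return "linked";
        } catch (IncompatibleClassChangeError e) {
            return "incompatible: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(hierarchyHolds());
        // Simulated failure; a real one would be thrown by the virtual machine.
        System.out.println(callGuarded(() -> {
            throw new NoSuchMethodError("threaten");
        }));
    }
}
```

Catching linkage errors is rarely the right design, but it can be useful at a plug-in boundary where you load code that you do not control.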

In the earlier example, the missing method threaten caused the new
version of MadScientist to be incompatible with the original version of
BMovie. This is an obvious example of incompatibility, but some other
incompatibilities are a little less obvious. The exact rules for binary class
compatibility are enumerated in [LY99], but you will rarely need to consult the
rules at this level. The rules all support a single, common-sense objective: no
mysterious failures. A reference either resolves to the exact thing the caller
expects, or an error is thrown; "exactness" is limited by what the
caller is looking for. Consider these examples:

You cannot reference a class, method, or field that does not exist. For
fields and methods, both names and types must match.

You cannot reference a class, method, or field that is invisible to you,
for example, a private method of some other class.

Because private members are invisible to other classes anyway, changes to
private members will not cause incompatibilities with other classes. A
similar argument holds for package-private members if you always update
the entire package as a unit.

You cannot instantiate an abstract class, invoke an abstract method,
subclass a final class, or override a final method.

Compatibility is in the eye of the beholder. If some class adds or
removes methods that you never call anyway, you will not perceive any
incompatibility when loading different versions of that class.

Another way to view all these rules is to remember that changes to invisible
implementation details will never break binary compatibility, but changes to
visible relationships between classes will.

3.1.1.1 Declared Exceptions and Binary Compatibility

One of the few oddities of binary compatibility is that you can refer
to a method or constructor that declares checked exceptions that you do not
expect. This is less strict than the corresponding compile-time rule, which
states that the caller must handle all checked exceptions. Consider the versions
of Rocket and Client shown in Listing 3-2. You can only
compile Client against version 1 of the Rocket since the
client does not handle the exception thrown by version 2. At runtime, a
Client could successfully reference and use either version because
exception types are not checked for binary compatibility.

This loophole in the binary compatibility rules may be surprising, but it
does not compromise the primary objective of preventing inexplicable failures.
Consider what happens if your Client encounters the second version of
Rocket. If and when the InadequateNationalInfrastructure
exception is thrown, your code will not be expecting it, and the thread will
probably terminate. Even though this may be highly irritating, the behavior is
clearly defined, and the stack trace makes it easy to detect the problem and add
an appropriate handler.
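One way to see why the virtual machine does not notice the difference is to compare the method descriptors, which record only parameter and return types. The sketch below makes several assumptions: the two versions are modeled as separate classes named RocketV1 and RocketV2 so that they fit in one file, the method is assumed to be called launch, and DescriptorCheck is a hypothetical helper.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Exception name taken from the text; its definition here is an assumption.
class InadequateNationalInfrastructure extends Exception { }

// Stand-ins for version 1 and version 2 of Rocket.
class RocketV1 {
    public void launch() { }
}

class RocketV2 {
    public void launch() throws InadequateNationalInfrastructure { }
}

class DescriptorCheck {
    private static Method find(Class<?> type, String name) {
        try {
            return type.getMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds the descriptor string that a method reference would carry.
    static String descriptor(Class<?> type, String name) {
        Method m = find(type, name);
        return MethodType.methodType(m.getReturnType(), m.getParameterTypes())
                .toMethodDescriptorString();
    }

    // Counts the declared exceptions, which are metadata but not part of the descriptor.
    static int declaredExceptions(Class<?> type, String name) {
        return find(type, name).getExceptionTypes().length;
    }

    public static void main(String[] args) {
        System.out.println(descriptor(RocketV1.class, "launch"));
        System.out.println(descriptor(RocketV2.class, "launch"));
        System.out.println(declaredExceptions(RocketV2.class, "launch"));
    }
}
```

Both versions yield the same descriptor, so a client compiled against one resolves its method reference against the other without complaint.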

3.1.1.2 Some Incompatible Changes Cannot Be Detected

The Java compiler enforces the rules of binary compatibility at compile time,
and the virtual machine enforces them again at runtime. The runtime enforcement
of these rules goes a long way toward preventing the accidental use of the wrong
class. However, these rules do not protect you from bad decisions when you are
shipping a new version of a class. You can still find clever ways to write new
versions of classes that explode when called by old clients.

Listing 33 shows an unsafe change to a class that Java cannot prevent.
Clients of the original version of Rocket expect to simply call
launch. The second version of Rocket changes the rules by
adding a mandatory preLaunchSafetyCheck. This does not create any
structural incompatibilities with the version 1 clients, who can still find all
the methods that they expect to call. As a result, old versions of the client
might launch new rockets without the necessary safety check. If you want to rely
on the virtual machine to protect the new version of Rocket from old
clients, then you must deliberately introduce an incompatibility that will break
the linkage. For example, your new version could implement a new and different
Rocket2 interface.

Listing 33 Some Legal Changes to a Class May Still Be Dangerous.

3.1.2 Binary Class Metadata

[LY99] documents the exact format of a binary class. My purpose here is not
to reproduce this information but to show what kinds of metadata the binary
class includes. Figure 3-1 shows the relevant data structures that you can traverse in the
binary class format. The constant pool is a shared data structure that contains
elements, such as class constants, method names, and field names, that are referenced
by index elsewhere in the class file. The other structures in the class file
do not hold their own data; instead, they hold indexes into the constant pool.
This keeps the overall size of the class file small by avoiding the repetition
of similar data structures.
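If you are curious, you can peek at the start of a class file yourself. The sketch below reads only the fixed header that precedes the constant pool: the magic number, the minor and major version, and the constant pool count. ClassHeader is a hypothetical helper, and the code assumes the class file can be found as a resource through the class's own loader.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper that reads the first ten bytes of a class file.
class ClassHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassHeader(int magic, int minor, int major, int cpCount) {
        this.magic = magic;
        this.minorVersion = minor;
        this.majorVersion = major;
        this.constantPoolCount = cpCount;
    }

    static ClassHeader read(Class<?> type) {
        // Absolute resource name, for example /java/lang/String.class
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalStateException("no class file found for " + type);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();             // always 0xCAFEBABE
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            int cpCount = in.readUnsignedShort(); // number of pool entries plus one
            return new ClassHeader(magic, minor, major, cpCount);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassHeader h = read(String.class);
        System.out.println(Integer.toHexString(h.magic));
        System.out.println("constant pool count: " + h.constantPoolCount);
    }
}
```

Parsing the constant pool entries themselves takes more work, because each entry has a tag byte and a tag-specific layout.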

The -superclass and -interfaces references contain indices
into the constant pool. After a few levels of indirection, these indices
eventually lead to the actual string names of the class's base class and
superinterfaces. The use of actual string names makes it possible to verify
at runtime that the class meets the contractual expectations of its
clients.

Note that the class name format used by the virtual machine is different from
the dotted notation used in Java code. The VM uses the "/" character
as a package delimiter. Also, it often uses the "L" and ";"
characters to delimit class names if the class name appears inside a stream
where other types of data might also appear. So, the class
java.lang.String will appear as either java/lang/String or
Ljava/lang/String; in the class file's constant pool.
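The two forms are easy to derive from the dotted name. The helper below is a hypothetical illustration that assumes an ordinary (non-array, non-primitive) class.

```java
// Hypothetical helper; assumes type is neither an array nor a primitive.
class VmNames {
    // Internal form: the dotted name with "/" as the package delimiter.
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    // Delimited form, used where other kinds of type data may also appear.
    static String delimitedName(Class<?> type) {
        return "L" + internalName(type) + ";";
    }

    public static void main(String[] args) {
        System.out.println(internalName(String.class));
        System.out.println(delimitedName(String.class));
    }
}
```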

The fields and methods arrays also contain indices into the constant pool.
Again, these constant pool entries lead to the actual string names of the
referenced types, plus the string names of the methods and fields. If the
referenced type is a primitive, the VM uses a special single-character string
encoding for the type, as shown in Table 3-1. A method also contains a
reference to the Java bytecodes that implement the method. Whenever these
bytecodes refer to another class, they do so through a constant pool index that
resolves to the string name of the referenced class. Throughout the virtual
machine, types are referred to by their full, package-qualified string names.
Fields and methods are also referenced by their string names.

Table 31 Virtual Machine Type Names

Java Type

Virtual Machine Name

int

I

float

F

long

J

double

D

byte

B

boolean

Z

short

S

char

C

type[ ]

[type

package.SomeClass

Lpackage.SomeClass;
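You can check these encodings without opening a class file. The java.lang.invoke.MethodType class, added to the platform long after the class format was designed, can print the descriptor string for a given return type and parameter list. TypeCodes is a hypothetical wrapper written for this sketch. The V that appears for void is the one code used here that is not listed in the table.

```java
import java.lang.invoke.MethodType;

// Hypothetical wrapper around MethodType, for illustration only.
class TypeCodes {
    // Returns the virtual machine descriptor: (parameter codes)return code
    static String describe(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void m(int, long[], String)
        System.out.println(describe(void.class, int.class, long[].class, String.class));
        // boolean m(char, double)
        System.out.println(describe(boolean.class, char.class, double.class));
    }
}
```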

3.1.2.1 Analyzing Classes with javap

The details of binary class data structures are of interest to VM writers,
and they are covered in detail in the virtual machine specification [LY99].
Fortunately, there are a large number of tools that will display information
from the binary class format in a human-friendly form. The javap tool
that ships with the SDK is a simple class file disassembler. Consider the simple
Echo1 class:
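The original listing is not reproduced in this text. As a stand-in, assume a class along the following lines; everything except the name Echo1 is a guess made for illustration, including the PREFIX constant and the format helper.

```java
// Hypothetical stand-in for the Echo1 class; only the class name comes from the text.
class Echo1 {
    private static final String PREFIX = "You said: ";

    // Small helper so that the behavior can be checked without capturing output.
    static String format(String word) {
        return PREFIX + word;
    }

    public static void main(String[] args) {
        if (args.length > 0) {
            System.out.println(format(args[0]));
        }
    }
}

// After compiling, you can inspect the class file from the command line:
//   javap Echo1       lists the class and its member signatures
//   javap -c Echo1    also prints the bytecodes for each method
```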

If you run javap on the compiled Echo1 class, you will see
output similar to Listing 34. As you can see, the class format contains
the class names, the method names, and the parameter type names. The
javap utility has a variety of more verbose options as well, including
the c flag to display the actual bytecodes that implement each
method, shown in Listing 35. Without worrying about what specific
bytecodes do, you can easily see that the bytecode instructions refer to
classes, fields, and members by name. The #10, #5,
#1, and #8 in the output are the indices into the constant
pool; javap helpfully resolves these indices so that you can see the
actual strings being referenced.

3.1.3 From Binary Classes to Reflection

Java class binaries always contain metadata, including the string names for
classes, fields, field types, methods, and method parameter types. This metadata
is used implicitly to verify that cross-class references are compatible. Both
metadata and the notion of class compatibility are built into the bones of the
Java language, so there is no subterranean level where you can avoid their
presence. By themselves, the binary compatibility checks provided by the virtual
machine would be sufficient to justify the cost of creating, storing, and
processing class metadata. In reality, these uses only scratch the surface. You
can access the same metadata directly from within your Java programs using the
Reflection API.