What's Wrong with the EJB 2 Specification?

The EJB 2.0 beta specification was released with great fanfare last
summer during JavaOne. The EJB 2.0 specification introduced new
features, including a souped-up version of entity bean CMP,
message-driven beans, and additional CORBA
interoperability. Application server vendors have rushed to support
the EJB 2.0 specification; many of them have quickly provided
message-driven bean support, but few have CMP and CORBA
interoperability support so far.

Today, the specification is still in public final draft form with
no official release date set, which is somewhat
uncharacteristic. Previous versions of EJB specifications have
typically remained in public final draft form for less than a
quarter. So what's going on? Whatever the reason for the stall, I'm
glad for it.

I have serious issues with some recommendations made in the EJB
specification, most of which revolve around the proposal for
introducing a fine-grained "dependent object". Before I ramble on
about the nuances of dependent objects, though, here are two other
inconsistencies in the EJB specification:

There are two tags for primary key identification:
<prim-key-class> and
<primkey-field>. The former tag is used to
indicate the fully qualified class name of the primary key object, and
the latter is used to indicate which CMP persistent field is the
non-compound primary key of the entity EJB. Why does one tag start
with prim-key and the other start with
primkey? Also, <prim-key-class> is
used for both simple and compound primary keys, while the
<primkey-field> tag is only used for simple
(non-compound) primary keys. You must pay close attention to the
specification; otherwise, you might make the silly assumption
that the naming and use of items in the deployment descriptor
would be uniform.
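To make the distinction concrete, here is a minimal sketch of the kind
of compound primary key class that <prim-key-class> would name. The
class and field names (OrderLineKey, orderId, lineNumber) are
hypothetical and not taken from the specification. A real key class
would be declared public in its own source file; it's left
package-private here only so that the sketch compiles as a single
file.

```java
import java.io.Serializable;

// Hypothetical compound key for an imaginary OrderLine entity bean.
// <prim-key-class> would name this class; <primkey-field> would be
// omitted because the key spans more than one CMP field.
class OrderLineKey implements Serializable {
    // For CMP, the key's public fields correspond by name to
    // container-managed fields of the bean.
    public String orderId;
    public int lineNumber;

    // A public no-argument constructor is required of key classes.
    public OrderLineKey() {
    }

    public OrderLineKey(String orderId, int lineNumber) {
        this.orderId = orderId;
        this.lineNumber = lineNumber;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof OrderLineKey)) {
            return false;
        }
        OrderLineKey that = (OrderLineKey) other;
        return lineNumber == that.lineNumber
            && (orderId == null ? that.orderId == null
                                : orderId.equals(that.orderId));
    }

    @Override
    public int hashCode() {
        return (orderId == null ? 0 : orderId.hashCode()) * 31 + lineNumber;
    }
}
```

With a simple key, by contrast, <primkey-field> names the single CMP
field and <prim-key-class> names that field's type, such as
java.lang.String.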

The ejbRemove() method is used inconsistently in
session and entity beans. For session beans, the
ejbRemove() method is invoked when a bean moves from the
"method-ready" state to the "does not exist" state. This transition
doesn't occur on a predictable basis, even though many developers
believe that it occurs every time a client invokes the
remove() method on a remote stub. For session beans,
ejbRemove() is simply a container callback that
notifies the bean that it's being placed into "inactive" status. It is
not used to indicate that the bean has been destroyed, although a
container is allowed to make this transition as part of a remove()
invocation from the client.

This contradicts the intended use of ejbRemove() for
entity beans, for which ejbRemove() is called when the
bean is being destroyed in the persistent store. If you take a close
look at the entity EJB state diagram, another container callback is
used to notify a bean that it is being moved from the "pooled" state
to the "does not exist" state: unsetEntityContext(). As
an instructor to hundreds of students learning EJB, I often find it
a nightmare to explain that ejbRemove() for
session beans and unsetEntityContext() for entity beans
represent the same state transition, but ejbRemove() for
entity beans really means destruction.
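The mismatch is easier to see side by side. The sketch below uses
trimmed stand-in interfaces rather than the real javax.ejb.SessionBean
and javax.ejb.EntityBean (which declare more callbacks and checked
exceptions), and the CallbackDemo class plays the part of a container.
All class names are hypothetical; only the callback names come from
the specification.

```java
import java.util.ArrayList;
import java.util.List;

// Trimmed stand-ins for the real javax.ejb interfaces.
interface SessionCallbacks {
    void ejbRemove();           // instance leaves "method-ready"
}

interface EntityCallbacks {
    void ejbRemove();           // entity is deleted from the store
    void unsetEntityContext();  // instance leaves "pooled"
}

class LoggingSessionBean implements SessionCallbacks {
    private final List<String> log;

    LoggingSessionBean(List<String> log) {
        this.log = log;
    }

    public void ejbRemove() {
        log.add("session ejbRemove: instance discarded");
    }
}

class LoggingEntityBean implements EntityCallbacks {
    private final List<String> log;

    LoggingEntityBean(List<String> log) {
        this.log = log;
    }

    public void ejbRemove() {
        log.add("entity ejbRemove: persistent data deleted");
    }

    public void unsetEntityContext() {
        log.add("entity unsetEntityContext: instance discarded");
    }
}

class CallbackDemo {
    // Invokes the callbacks in the order a container might.
    static List<String> run() {
        List<String> log = new ArrayList<String>();
        new LoggingSessionBean(log).ejbRemove();
        LoggingEntityBean entity = new LoggingEntityBean(log);
        entity.ejbRemove();           // data destroyed, instance pooled
        entity.unsetEntityContext();  // later, the instance itself goes
        return log;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

The same method name produces the first two log lines, yet it is the
first and third lines that describe the same kind of event.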

Here's a proposal, though the migration to support it would be
difficult: how about renaming ejbRemove() in the
SessionBean interface to unsetSessionContext() and
renaming ejbRemove() in the EntityBean interface to
ejbDestroy()? In this scenario, the ambiguous
ejbRemove() wouldn't exist in either interface, and the
setXXXContext() and unsetXXXContext() callbacks would be
consistent across both bean types.
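As a rough sketch, the renamed callbacks might look like the
interfaces below. These are purely hypothetical: neither interface
exists in any EJB release, the context parameters are typed as Object
to keep the example free of javax.ejb dependencies, and the remaining
callbacks of the real interfaces are omitted.

```java
// Hypothetical interfaces reflecting the proposed renaming.
interface ProposedSessionBean {
    void setSessionContext(Object context);
    void unsetSessionContext();  // would replace SessionBean.ejbRemove()
}

interface ProposedEntityBean {
    void setEntityContext(Object context);
    void unsetEntityContext();   // unchanged from today's EntityBean
    void ejbDestroy();           // would replace EntityBean.ejbRemove()
}

// A session bean written against the proposed interface: the set and
// unset callbacks now form an obvious pair.
class ProposedCartBean implements ProposedSessionBean {
    Object context;

    public void setSessionContext(Object context) {
        this.context = context;
    }

    public void unsetSessionContext() {
        this.context = null;
    }
}
```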

Dependent Objects

The EJB 2.0 specification introduces a new approach to CMP
entity beans. It also introduces the concept of relationships between
entity EJBs and dependent objects, which are also new to the
specification. In previous versions, all entity beans were considered
coarse-grained persistent objects, requiring every access to the bean
to be completed over an RMI invocation. The network overhead and the
request-level interception work done by a container -- transaction
demarcation, security checks, etc. -- made entity EJBs heavyweight
objects.

Coarse-grained objects are ideal in many situations, but other
situations call for finer-grained persistent objects. For example,
a customer profile might be considered a
coarse-grained object, but the credit card information stored for a
single customer profile might be too fine to warrant having RMI
invocations and separate transaction management as part of the
object's access logic. A system may need a fine-grained object
under one or more of the following conditions:

the accessed object doesn't need to have its interface exposed to
remote clients;

the accessed object can be accessed over a method invocation as
opposed to an RMI invocation (to improve performance);

the accessed object may not necessarily be identified through a
primary key, and duplicates may be possible (they're strictly
forbidden with entity EJBs);

the accessed object follows the life cycle of the accessor object,
so that it's created and destroyed along with its
accessor.

See Figure 1: Fine-Grained Objects as EJBs and Figure
2: Advantages of Local Fine-Grained Objects, both below, for another
viewpoint.

Figure 1: Fine-Grained Objects as EJBs

Figure 2: Advantages of Local Fine-Grained Objects

The ability of a component architecture to host local,
fine-grained objects as "attached children" to remote, coarse-grained
objects is critical. If two objects are both considered remote, the
persistence manager will ultimately have to perform separate database
updates to manage both of the objects. However, if a coarse-grained
object is managing a set of finer-grained objects, the persistence
manager has a chance to batch its database access, which can
significantly improve the performance of the system.
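The batching argument can be illustrated with a toy model. FakeDatabase
below is a hypothetical stand-in that merely counts round trips; it is
not how any real persistence manager is implemented, and the SQL
strings are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a database connection; counts round trips.
class FakeDatabase {
    int roundTrips = 0;
    final List<String> executed = new ArrayList<String>();

    void execute(String sql) {
        roundTrips++;
        executed.add(sql);
    }

    void executeBatch(List<String> statements) {
        roundTrips++;  // the whole batch travels in one trip
        executed.addAll(statements);
    }
}

class BatchingDemo {
    // Two independently managed remote objects: each is stored on its
    // own, so each costs a round trip.
    static int storeSeparately(FakeDatabase db) {
        db.execute("UPDATE customer SET name = 'Ann' WHERE id = 1");
        db.execute("UPDATE credit_card SET expiry = '12/03' WHERE customer_id = 1");
        return db.roundTrips;
    }

    // A parent that manages its fine-grained child: one persistence
    // manager sees both changes and can send them together.
    static int storeAsGraph(FakeDatabase db) {
        List<String> pending = new ArrayList<String>();
        pending.add("UPDATE customer SET name = 'Ann' WHERE id = 1");
        pending.add("UPDATE credit_card SET expiry = '12/03' WHERE customer_id = 1");
        db.executeBatch(pending);
        return db.roundTrips;
    }
}
```

In this model the separate stores cost two round trips and the managed
graph costs one; the gap widens as a parent gains more children.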

To meet this need, the EJB 2.0 specification defines a dependent
object as any object that's created by another object, can only be
accessed by another object, or follows the life cycle (creation and
destruction) of another object. Dependent objects are not entity EJBs
and have a JavaBean-like syntax. Dependent objects can also
participate in CMP relationships, giving a variety of options:

Entity Bean - Dependent Object relationships

Dependent Object - Dependent Object relationships (the depths of
which are unbounded as defined by the specification (!?))

Dependent objects can participate in one-to-one, one-to-many, and
many-to-many relationships

Dependent objects don't require primary keys

Dependent objects can have duplicates

Dependent objects don't have to be "attached" to a parent object.
This can occur if a parent object is destroyed and the deployment
descriptor is configured not to perform cascading deletions of any
child-dependent objects.

Dependent objects may have cascading deletions (if a parent object
that the dependent object is attached to is destroyed, the dependent
object and all of its dependent object children will be automatically
destroyed)
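The last two options pull in opposite directions, which a small model
makes plain. The sketch below is ordinary Java, not container code:
PersistentNode and CascadeDemo are hypothetical names, and the boolean
flag stands in for whatever the deployment descriptor would configure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical persistent object: an entity bean or a dependent object.
class PersistentNode {
    final String name;
    final List<PersistentNode> children = new ArrayList<PersistentNode>();
    boolean destroyed = false;

    PersistentNode(String name) {
        this.name = name;
    }

    PersistentNode attach(PersistentNode child) {
        children.add(child);
        return this;
    }
}

class CascadeDemo {
    // cascade == true models a relationship configured for cascading
    // deletion; cascade == false leaves the children alive but
    // detached from any parent.
    static void destroy(PersistentNode node, boolean cascade) {
        node.destroyed = true;
        if (cascade) {
            for (PersistentNode child : node.children) {
                destroy(child, true);
            }
        }
        node.children.clear();
    }
}
```

With cascading on, destroying a customer profile also destroys its
credit card and the card's own children; with it off, the card
survives with no parent at all.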

So What's My Problem?

With all these options available to developers, this must be the
right approach to take for incorporating fine-grained objects, right?
Wrong. All of this supposed flexibility will ultimately cause more
problems than it's meant to solve:

Dependent objects are meant to be all things to all people.
Developers need two things to be successful
with fine-grained objects in the near future:

Local fine-grained objects that have all of the benefits
mentioned earlier -- including local method invocations and bypassing
container enhancements, if necessary -- thus allowing developers to
model different persistent objects which are to be managed by a single
persistence manager employed by a parent, coarse-grained
object.

Life cycle objects that follow the same life cycle as a parent
object. A life cycle object is created when its parent is created and
destroyed when its parent is destroyed. The child object is
preferably created and destroyed automatically by the parent's
container or persistence manager. This is particularly useful for
modeling composition UML relationships where a child object cannot
exist without its parent object. In the earlier example, if credit
card information is only ever associated with a customer profile, the
credit card information may be modeled as a life cycle object where
the life of the credit card is managed automatically by the
persistence manager of the customer profile object.

Dependent objects have a cascading delete deployment descriptor
option that instructs a container to manage the destruction of
dependent objects attached to parent objects. This is needed to manage
life cycle objects. The other characteristic needed to support life
cycle objects, an automatic creation mechanism, does not exist, which
makes the intentions of the specification developers murky. If the
intent is to make dependent objects behave like life cycle objects,
which they are according to objectives listed in section 9.3.1 of the
EJB 2.0 public final draft specification, then why include cascading
delete behavior without including a mechanism for tying the creation
of a dependent object to its parent? Presently, if you want a
dependent object to be created at the same time its parent object is
created, you have to create it manually during a creation callback of
the parent object. For an entity bean that is the parent of a life
cycle dependent object, that means doing the work in its
ejbPostCreate() method.
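Here is a minimal sketch of that manual step. The classes are
plain-Java stand-ins: a real parent would be an abstract CMP class
implementing javax.ejb.EntityBean, and the newCreditCard() factory is
a hypothetical placeholder for however the container actually hands
out dependent object instances, which isn't modeled here.

```java
// Hypothetical dependent object.
class CreditCard {
    String number;
}

// Trimmed stand-in for the parent entity bean.
class CustomerProfileBean {
    String customerId;
    CreditCard creditCard;

    // Creation callback: initialize the bean's own persistent state.
    public String ejbCreate(String customerId) {
        this.customerId = customerId;
        return null;  // CMP beans return null from ejbCreate
    }

    // Nothing creates the child automatically, so the bean developer
    // has to do it by hand after the parent exists.
    public void ejbPostCreate(String customerId) {
        this.creditCard = newCreditCard();
    }

    // Hypothetical factory standing in for the container's mechanism.
    CreditCard newCreditCard() {
        return new CreditCard();
    }
}

class PostCreateDemo {
    // Plays the container: ejbCreate, then ejbPostCreate with the
    // same arguments.
    static CustomerProfileBean create(String customerId) {
        CustomerProfileBean bean = new CustomerProfileBean();
        bean.ejbCreate(customerId);
        bean.ejbPostCreate(customerId);
        return bean;
    }
}
```

Deletion of the card can be declared in the deployment descriptor, but
its creation has to be coded like this in every parent bean.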

Dependent objects are written differently than entity EJBs.
Since dependent objects can only be accessed locally, home and remote
interfaces are not necessary. Only an abstract implementation class
has to be provided as part of a dependent object's implementation.
The container still provides a concrete implementation similar to
entity EJBs. The additional syntax that an EJB developer will have to
learn will confuse new developers.
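For a sense of what that syntax looks like, here is a sketch of an
abstract dependent object class alongside a hand-written stand-in for
the concrete subclass a container would supply. Both class names and
the maskedNumber() method are hypothetical, and a real generated class
would read and write the persistent store rather than a plain field.

```java
// Hypothetical dependent object: JavaBean-like abstract accessors and
// no home or remote interface.
abstract class CreditCardInfo {
    public abstract String getNumber();
    public abstract void setNumber(String number);

    // Business logic is written against the abstract accessors.
    public String maskedNumber() {
        String number = getNumber();
        if (number == null || number.length() <= 4) {
            return number;
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < number.length() - 4; i++) {
            masked.append('*');
        }
        return masked.append(number.substring(number.length() - 4)).toString();
    }
}

// Stand-in for the concrete class a container would generate.
class GeneratedCreditCardInfo extends CreditCardInfo {
    private String number;  // a real container would use the data store

    public String getNumber() {
        return number;
    }

    public void setNumber(String number) {
        this.number = number;
    }
}
```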

I want to emphasize the confusion and ambiguity introduced by
dependent objects by raising a final issue. If the point of dependent
objects is to standardize local, fine-grained objects, which it must
be since dependent objects only partially support life cycle
requirements, then they are a small step forward at best. Many
developers believe that dependent objects are necessary since there
isn't another way to achieve CMP management of a local, fine-grained
object.

On the contrary, many application servers have the hooks, flags,
and proprietary measures needed to model two entity EJBs in a
remote-local fashion. Using these proprietary flags, it's possible to
achieve behavior and performance very similar to a dependent object
implementation. How? Using BEA's WebLogic Server 6.0 container
implementation, you would:

implement the parent object and the child object as entity EJBs
with CMP;

in the WebLogic-specific deployment descriptor, set the child EJB
to be accessed by reference (the EJB will be accessed over a local
method invocation instead of RMI if the child EJB is in the same
virtual machine as its parent);

set the transaction attribute of the child EJB to
SUPPORTS so that it will inherit any transaction context
that the parent EJB uses (this will ensure that any database updates
that are performed as a part of the child's ejbStore()
callback will be done at the same time that the parent's updates are
performed).
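The by-reference setting in the steps above changes more than speed:
RMI passes serializable arguments as copies, while a local call shares
the caller's object. The sketch below imitates the two behaviors in
plain Java, using serialization to stand in for an RMI call. It says
nothing about how WebLogic implements the flag, and all class names
are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class BillingAddress implements Serializable {
    String city;

    BillingAddress(String city) {
        this.city = city;
    }
}

// Hypothetical child bean with a method that mutates its argument.
class ChildBean {
    void relocate(BillingAddress address) {
        address.city = "Austin";
    }
}

class CallSemanticsDemo {
    // RMI-style call: the argument is serialized, so the callee works
    // on a copy and the caller's object is untouched.
    static String callByValue(ChildBean child, BillingAddress address) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(address);
            out.close();
            ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()));
            BillingAddress copy = (BillingAddress) in.readObject();
            in.close();
            child.relocate(copy);
            return address.city;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Local call: no copy is made, so the callee's change is visible
    // to the caller.
    static String callByReference(ChildBean child, BillingAddress address) {
        child.relocate(address);
        return address.city;
    }
}
```

Code that quietly relied on copy semantics can therefore behave
differently once the child is accessed by reference.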

Other application servers support similar proprietary flags that
achieve the same effect. The disadvantages of this technique are
twofold: first, it's proprietary; second, a persistence manager will
likely not be able to batch SQL statements at the end of a
transaction as it could with a parent object that has a
dependent object.

A more detailed analysis of the point of fine-grained objects is
obviously warranted. Even without that analysis, the intended use of
dependent objects as defined by the specification developers seems
fuzzy. Despite addressing a real need identified by developers in the
EJB community, this dependent object approach confuses the conceptual
issues and ultimately confuses developers too.

Conclusion: What Should Be Done?

Get rid of dependent objects.

Emphasize the importance of life cycle and local objects.
Allow developers to specify objects as life-cycle, local, or
both.

Develop a technique for fine-grained objects that doesn't
introduce more syntax for developers to learn. Consider leaving CMP
entity EJBs as the only persistent object implementation, but include
deployment descriptor flags that mark a bean as a fine-grained
object.

Is all of this controversial? You bet it is. Despite the number
of vendors that have embraced the EJB 2.0 specification, I'm not aware
of a single vendor that's implemented dependent objects, including BEA
WebLogic and JBoss Server, which usually have the most up-to-date
implementations. It's likely that the confusion described here is
felt by others in the industry too.