J2EE Transaction Frameworks: Building the Framework

Introduction

The availability of cheap computing power and increased network
bandwidth has given rise to distributed component-based
applications. A distributed component-based application is a
configuration of services provided by different application components
running on physically independent computers that appears to its users
as a single application running on a single physical machine. Several
factors motivate the adoption of distributed component-based systems
over traditional centralized systems.

Distributed application: Some tasks are inherently
distributed and by their very nature require cooperative work from
multiple agents. In such cases, it is preferable to locate and harness
computing power and data where they are naturally available and most
needed.

Reliability: Because of the shared, cooperative, and distributed
nature of the system, a well-designed distributed system has no single
point of failure. By applying failover, recovery, and distributed
synchronization techniques, greater reliability can be achieved.

Scalability: As the requirements of the application grow over
time, a properly designed system can handle increased load through
the addition of new services and hardware.

Performance: As the domain of computing covers wider application
areas, the nature of the problems that need to be solved becomes more
complex. Solving these more complex problems requires more computing
power at a reasonable price than a single machine can provide.

Economics: It is possible to pay less for equivalent levels of
computing power when the system is split across multiple
machines.

To give the illusion to users of a single unified application
running on a single physical machine, instead of a collection of
disparate applications running on heterogeneous computers connected
via a network, a distributed system needs to be transparent in the
following ways.

Data location: It is not necessary for the user of the system
to know where data is located in the network.

Failure: It is not necessary for the user of the system to worry
about consistency of data even if there is a failure within the
network or data sources.

Replication: It is not necessary for the user of the system to
know how data replication is done.

Distribution: It is not necessary for the user of the system to
know how computing power and data are distributed across the
system.

A distributed system allows a user to store, access, and
manipulate data transparently from many computers while maintaining
the integrity of data during system failures. The management of
distributed data and transactions is accomplished at the local and
global levels. A local data manager, or resource manager, enables the
access and manipulation of data or resources. These resource managers
provide the transparency of data location, data models, and database
security and authority control. A local transaction management system
is responsible for initiating, monitoring, and terminating
transactions in a computing system. A distributed transaction
management system extends the scope of a local transaction management
system by coordinating with the local resource managers to view
related transactions over a network as a single transaction.
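The coordination described above can be sketched in miniature. The following Java example is an illustrative toy, not the JTA or any real transaction manager's API: a global coordinator asks each local resource manager to prepare, and commits only if every participant votes yes (the classic two-phase commit pattern). All class and method names here are invented for illustration.

```java
import java.util.List;

// Toy two-phase-commit sketch: a global coordinator polls each local
// resource manager with "prepare" and commits only if all vote yes.
public class TwoPhaseCommitSketch {

    // A local resource manager guards one resource and votes in phase one.
    interface ResourceManager {
        boolean prepare();   // vote: can this participant commit?
        void commit();
        void rollback();
    }

    // An in-memory account standing in for a real resource (e.g. a database).
    static class Account implements ResourceManager {
        private int balance;
        private int pendingDelta;

        Account(int balance) { this.balance = balance; }

        void stage(int delta) { pendingDelta = delta; }

        public boolean prepare() { return balance + pendingDelta >= 0; }
        public void commit()     { balance += pendingDelta; pendingDelta = 0; }
        public void rollback()   { pendingDelta = 0; }

        int balance() { return balance; }
    }

    // The distributed transaction manager: commit everywhere or nowhere.
    static boolean runTransaction(List<ResourceManager> participants) {
        for (ResourceManager rm : participants) {
            if (!rm.prepare()) {                       // phase one: any "no" vote...
                participants.forEach(ResourceManager::rollback);
                return false;                          // ...aborts the whole transaction
            }
        }
        participants.forEach(ResourceManager::commit); // phase two
        return true;
    }

    public static void main(String[] args) {
        Account a = new Account(100);
        Account b = new Account(0);

        // Transfer 60 from a to b: both legs can commit, so both do.
        a.stage(-60); b.stage(+60);
        System.out.println(runTransaction(List.of(a, b)));   // true
        System.out.println(a.balance() + " " + b.balance()); // 40 60

        // Transfer 80 from a to b: a would go negative, so nothing changes.
        a.stage(-80); b.stage(+80);
        System.out.println(runTransaction(List.of(a, b)));   // false
        System.out.println(a.balance() + " " + b.balance()); // 40 60
    }
}
```

The key design point is that no participant changes its durable state until every participant has voted yes, which is how a distributed transaction manager makes several resource managers behave as one transaction.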

A transaction is a group of statements that represents a unit of
work and must be executed as a unit. Transactions are sequences of
operations on resources -- such as read, write, or update -- that
transform one consistent state of the system into a new consistent
state. In order to reflect the correct state of reality in the
system, a transaction should have the following properties.

Atomicity: This is the all-or-nothing property. Either
the entire sequence of operations succeeds or the entire sequence
fails. A transaction is treated as a single unit of operation: only
completed transactions are committed, and incomplete transactions are
rolled back, restoring the state in which they started. There is no
possibility of partial work being committed.

Consistency: A transaction maps one consistent state of the
resources (e.g. a database) to another. Consistency is concerned with
correctly reflecting the reality of the state of the resources.
Concrete examples of consistency constraints include referential
integrity and unique primary keys in database tables.

Isolation: A transaction should not reveal its results to
other concurrent transactions before it commits. Isolation ensures
that transactions do not access data that is being concurrently
updated. This property is also known as serializability.

Durability: The results of completed transactions must be
made permanent and must not be erased from the database by a system
failure. Resource managers ensure that the results of a committed
transaction survive system failures.
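To make the atomicity property concrete, the sketch below implements a toy in-memory store with an undo log: each write inside a transaction records how to reverse itself, so rollback restores exactly the state in which the transaction began. This is an invented illustration of the technique, not how any particular resource manager is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Atomicity via an undo log: every write inside a transaction records how
// to reverse itself, so an aborted transaction leaves no partial work.
public class UndoLogSketch {
    private final Map<String, Integer> store = new HashMap<>();
    private final Deque<Runnable> undoLog = new ArrayDeque<>();
    private boolean inTx = false;

    void begin() { inTx = true; undoLog.clear(); }

    void put(String key, int value) {
        if (inTx) {
            Integer old = store.get(key);        // remember the prior state
            undoLog.push(old == null
                ? () -> store.remove(key)        // key was absent: undo = remove
                : () -> store.put(key, old));    // key existed: undo = restore
        }
        store.put(key, value);
    }

    void commit() { inTx = false; undoLog.clear(); }   // changes become final

    void rollback() {                                  // undo in reverse order
        while (!undoLog.isEmpty()) undoLog.pop().run();
        inTx = false;
    }

    Integer get(String key) { return store.get(key); }

    public static void main(String[] args) {
        UndoLogSketch db = new UndoLogSketch();
        db.begin();
        db.put("a", 1);
        db.commit();

        db.begin();
        db.put("a", 99);   // overwrite inside a transaction...
        db.put("b", 2);
        db.rollback();     // ...then abort: both writes are undone

        System.out.println(db.get("a"));   // 1
        System.out.println(db.get("b"));   // null
    }
}
```

Because the undo log is replayed in reverse order, the store after rollback is byte-for-byte the state at begin(), which is exactly the all-or-nothing guarantee described above.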