Analyzing the Deployment Architecture

Using the OpenSSO Enterprise embedded configuration data store
can lower response time and ensure service availability when a machine
failure occurs. You can deploy multiple OpenSSO instances to serve
as a single system; their corresponding embedded configuration
data store instances are automatically configured in data replication
mode. Each embedded configuration data store instance in the system
contains the same set of data. Any update request made on one
instance is replayed on all other instances in the system. In
its simplest architecture, the embedded configuration data store
replication model uses a multi-master (peer-to-peer) network structure.

Single-Server and Multiple-Servers Modes

The following figure illustrates OpenSSO Enterprise deployed
with the embedded configuration data store in single-server mode.

Figure 15–1 Single-Server Mode

In multiple-servers mode, each OpenSSO Enterprise instance
works with its own embedded configuration data store instance in
the same memory space in the web container. The embedded configuration
data store replication mechanism uses a custom replication protocol
to maintain data consistency between directory service instances.

The following figure illustrates OpenSSO Enterprise deployed
with the embedded configuration data store in multiple-servers mode.

Figure 15–2 Multiple-Servers Mode

Replication Structure

Replication is entirely handled by OpenSSO Enterprise. The OpenSSO
Enterprise embedded configuration data store replication model supports
multi-master network architecture. The embedded configuration data
store separates actual data from replication metadata. In this model,
the server that stores the configuration data is called the directory
server. The server that stores the replication metadata
is called the replication server.

Even the smallest deployment must include two replication server
instances, to ensure availability when one replication server instance
fails. Replication servers perform the following functions:

Manage connections from directory servers

Connect to other replication servers

Listen for connections from other replication servers

Receive changes from directory servers

Forward changes to directory servers and to other
replication servers

Save changes to stable storage and trim older
operations

Each replication server contains a list of all the other replication
servers in the replication topology. Replication servers are also
responsible for providing other servers with information about
the replication topology.

Directory servers perform the following functions:

Receive read and write requests from client applications

Forward changes to specific replication servers

Each directory server contains a list of the suffix DNs to
be synchronized. For each suffix DN to be synchronized, each directory
server contains a list of replication servers to connect to. When
a change is made on a directory server, that directory server forwards
the change to the local replication server. The replication server
then relays the change to other replication servers in the topology,
which in turn relay the change to all other directory servers in the
topology.
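The relay path described above — a local write forwarded to the local replication server, relayed to peer replication servers, and applied on every other directory server — can be sketched as follows. This is an illustrative model only; the class and method names are assumptions, not part of the actual OpenSSO or OpenDS implementation.

```python
# Hypothetical sketch of the change fan-out described above; names are
# illustrative, not the actual OpenSSO/OpenDS API.

class ReplicationServer:
    def __init__(self, name):
        self.name = name
        self.peers = []              # other replication servers in the topology
        self.directory_servers = []  # locally connected directory servers

    def receive_change(self, change, source):
        # Forward to every locally connected directory server except the
        # directory server that originated the change.
        for ds in self.directory_servers:
            if ds is not source:
                ds.apply(change)
        # Relay to peer replication servers exactly once: source is None
        # when the change arrived from a peer, so relaying stops there.
        if source is not None:
            for peer in self.peers:
                peer.receive_change(change, None)

class DirectoryServer:
    def __init__(self, name, replication_server):
        self.name = name
        self.data = {}
        self.rs = replication_server
        replication_server.directory_servers.append(self)

    def write(self, dn, value):
        # A local write is applied, then forwarded to the local
        # replication server, which fans it out across the topology.
        self.apply((dn, value))
        self.rs.receive_change((dn, value), source=self)

    def apply(self, change):
        dn, value = change
        self.data[dn] = value

# Two replication servers (the recommended minimum) and two directory servers.
rs1, rs2 = ReplicationServer("rs1"), ReplicationServer("rs2")
rs1.peers.append(rs2)
rs2.peers.append(rs1)
ds1, ds2 = DirectoryServer("ds1", rs1), DirectoryServer("ds2", rs2)

ds1.write("ou=config", "v1")
print(ds2.data["ou=config"])  # the change reaches every directory server
```

Note that in this sketch, as in the model described above, directory servers never talk to each other directly; all change traffic flows through the replication servers.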

Applications should typically perform reads and writes on the
same directory server instance. This reduces the likelihood of consistency
problems due to replication.

Every replication server instance maintains a message queue
that is used to store pending changes. When one of the directory
servers is down, all changes applied to the other servers are
stored in the corresponding message queue of the server instance that
receives the requests. Once the directory server instance is back
online, the replication servers relay all the changes to restore
data consistency. However, the size of the message queue and the purge
delay time are limited. By default, the size of the message queue
is 10000 changes, and the purge delay time is 24 hours. If one of the
servers is down longer than the purge delay time, or if the changes
applied to a particular directory server exceed the size of the message
queue, the replication system loses synchronization.
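The interaction between the queue size and the purge delay can be modeled as a bounded queue of timestamped changes. The following sketch is an assumption-laden illustration of the behavior described above, not the actual OpenDS implementation: once a change is evicted (by overflow or by age), a directory server that was offline the whole time can no longer catch up.

```python
from collections import deque

# Illustrative model of the per-replication-server message queue described
# above; the class name and eviction logic are assumptions, not OpenDS code.

class ChangeQueue:
    def __init__(self, queue_size=10000, purge_delay=24 * 3600):
        self.queue_size = queue_size    # default: 10000 changes
        self.purge_delay = purge_delay  # default: 24 hours, in seconds
        self.changes = deque()          # (timestamp, change) pairs
        self.lost_sync = False

    def enqueue(self, now, change):
        # Purge changes older than the purge delay.
        while self.changes and now - self.changes[0][0] > self.purge_delay:
            self.changes.popleft()
            self.lost_sync = True   # an offline server can no longer catch up
        # A full queue drops the oldest pending change.
        if len(self.changes) >= self.queue_size:
            self.changes.popleft()
            self.lost_sync = True   # overflow also loses synchronization
        self.changes.append((now, change))

q = ChangeQueue(queue_size=3, purge_delay=60)
for t in range(4):                  # four changes overflow a queue of three
    q.enqueue(t, f"change-{t}")
print(q.lost_sync)  # True: a server offline for all four changes is stale
```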

You can change the value of the purge delay and the size of the
message queue by adding the ds-cfg-replication-purge-delay and ds-cfg-queue-size attributes to
the file config.ldif. The config.ldif file
is under the OpenSSO base directory/opends/config directory. The value of ds-cfg-replication-purge-delay is expressed in seconds, and the value of ds-cfg-queue-size is
an integer number of changes. Once an embedded configuration data store instance loses
synchronization, the only way to bring the system back into synchronization
is to reconfigure OpenSSO Enterprise with the embedded configuration
data store.
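A fragment of this kind might look as follows. This is an illustrative sketch only: the attribute names come from the text above, but the entry DN shown is an assumption and may differ across OpenDS versions, so check the replication server entry in your own config.ldif before editing.

```
# Illustrative fragment; the exact entry DN is an assumption -- locate
# the replication server entry in your config.ldif before editing.
dn: cn=Replication Server,cn=Multimaster Synchronization,cn=Synchronization Providers,cn=config
ds-cfg-replication-purge-delay: 86400
ds-cfg-queue-size: 10000
```

As the text above notes, raising these limits buys more tolerance for long outages at the cost of additional storage for pending changes.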

To determine whether embedded configuration data store instances
are synchronized, the OpenSSO Enterprise CLI tool ssoadm provides
the embedded-status subcommand to check the status of embedded configuration
data store instances. See Chapter 1, ssoadm Command Line Interface Reference, in Sun OpenSSO Enterprise 8.0 Administration Reference. Alternatively,
you can check the embedded configuration data store logs when you
suspect a problem with configuration data store inconsistencies. The
logs are under the OpenSSO base directory/opends/logs directory. The current OpenSSO Enterprise embedded
configuration data store replication implementation is recommended
for use with server instances located within the same geographical
region.

Summary of Actual Replication Test Results

Replication tests were run using up to four instances of the
OpenSSO Enterprise embedded configuration data store with Tomcat and
GlassFish. The results show that replication was successful among
the four instances using 8000 policies. The following is a summary
of the test results:

When all four instances are online, the delay of synchronization
of replication is generally less than one second, which is negligible.

The time required to load the same amount of data
into an embedded configuration data store instance is longer when replication
is enabled, whether or not all instances are online.

The time required to load the data increases as data accumulates:
loading the second 1000 entries takes longer than
loading the first 1000 entries.

Data loading time can be significantly reduced by
breaking up the data and loading it in parallel through multiple
instances.

With the same heap size (2 GB), Tomcat
performs better than GlassFish v2 for smaller settings (one or two
instances). GlassFish performs better in larger settings (four instances).