The EclipseLink cache is an in-memory repository that stores recently read or written objects based on class and primary key values. EclipseLink uses the cache to do the following:

Improve performance by holding recently read or written objects and accessing them in-memory to minimize database access.

Manage locking and isolation level.

Manage object identity.

Cache Architecture

EclipseLink uses two types of cache: the session cache maintains objects retrieved from and written to the data source; and the unit of work cache holds objects while they participate in transactions. When a unit of work successfully commits to the data source, EclipseLink updates the session cache accordingly.

As [[#Figure 98-1] shows, the session cache and the unit of work cache work together with the data source connection to manage objects in a EclipseLink application. The object life cycle relies on these three mechanisms.

Figure 98-1 Object Life Cycle and the EclipseLink Caches

[img_text/cacharch Description of "Figure 98-1 Object Life Cycle and the EclipseLink Caches"]

Session Cache

The session cache is a shared cache that services clients attached to a given session. When you read objects from or write objects to the data source using a client session, EclipseLink saves a copy of the objects in the parent server session's cache and makes them accessible to all other processes in the session.

EclipseLink adds objects to the session cache from the following:

The data store, when EclipseLink executes a read operation

The unit of work cache, when a unit of work successfully commits a transaction

An isolated client session is a special type of client session that provides its own session cache isolated from the shared object cache of its parent server session. The isolated client session cache can be used to improve user-based security or to avoid caching highly volatile data. For more information, see Isolated Client Sessions.

Unit of Work Cache

The unit of work cache services operations within the unit of work. It maintains and isolates objects from the session cache, and writes changed or new objects to the session cache after the unit of work commits changes to the data source.

Cache Concepts

This section describes concepts unique to the EclipseLink cache, including the following:

Cache Type and Object Identity

EclipseLink preserves object identity through its cache using the primary key attributes of a persistent entity. These attributes may or may not be assigned through sequencing (see Projects and Sequencing). In a Java application, object identity is preserved if each object in memory is represented by one, and only one, object instance. Multiple retrievals of the same object return references to the same object instance–not multiple copies of the same object.

Maintaining object identity is extremely important when the application's object model contains circular references between objects. You must ensure that the two objects are referencing each other directly, rather than copies of each other. Object identity is important when multiple parts of the application may be modifying the same object simultaneously.

Full Identity Map

This option provides full caching and guaranteed identity: objects are never flushed from memory unless they are deleted.

It caches all objects and does not remove them. Cache size doubles whenever the maximum size is reached. This method may be memory-intensive when many objects are read. Do not use this option on batch operations.

We recommend using this identity map when the data set size is small and memory is in large supply.

Weak Identity Map

This option is similar to the full identity map, except that the map holds the objects by using weak references. This method allows full garbage collection and provides full caching and guaranteed identity.

The weak identity map uses less memory than full identity map but also does not provide a durable caching strategy across client/server transactions. Objects are available for garbage collection when the application no longer references them on the server side (that is, from within the server JVM).

Soft Identity Map

This option is similar to the weak identity map, except that the map uses soft references instead of weak references. This method allows full garbage collection and provides full caching and guaranteed identity.

The soft identity map allows for optimal caching of the objects, while still allowing the JVM to garbage collect the objects if memory is low.

Soft Cache Weak Identity Map and Hard Cache Weak Identity Map

These options are similar to the weak identity map except that they maintains a most frequently used subcache. The subcache uses soft or hard references to ensure that these objects are garbage-collected only if the system is low on memory.

The soft cache weak identity map and hard cache weak identity map provide more efficient memory use. They release objects as they are garbage-collected, except for a fixed number of most recently used objects. Note that weakly cached objects might be flushed if the transaction spans multiple client/server invocations. The size of the subcache is proportional to the size of the identity map as specified by the ClassDescriptor method setIdentityMapSize. You should set this cache size to be as large as the maximum number of objects (of the same type) referenced within a transaction (see Configuring Cache Type and Size at the Descriptor Level).

We recommend using this identity map in most circumstances as a means to control memory used by the cache.

If object identity is important, but caching is not, use a WeakIdentityMap policy.

If an object has a long life span or requires frequent access, or object identity is important, use a FullIdentityMap policy.

WARNING:Use the FullIdentityMap only if the class has a small number of finite instances. Otherwise, a memory leak will occur.

If an object has a short life span or requires frequent access, and identity is not important, use a CacheIdentityMap policy.

If objects are discarded immediately after being read from the database, such as in a batch operation, use a NoIdentityMap policy. The NoIdentityMap does not preserve object identity.

Note: We do not recommend the use of CacheIdentityMap and NoIdentityMap policies.

What You May Need to Know About the Internals of Weak, Soft, and Hard Identity Maps

The WeakIdentiyMap and SoftIdentityMap use JVM weak and soft references to ensure that any object referenced by the application is held in the cache. Once the application releases it reference to the object, the JVM is free to garbage collection the objects. When a weak and soft reference is garbage collected is determined by the JVM. In general one could expect a weak reference to be garbage collected on each JVM garbage collector, and a soft reference to be garbage collected when the JVM determines memory is low.

The SoftCacheWeakIdentityMap and HardCacheWeakIdentityMap types of identity map contain the following two caches:

Reference cache: implemented as a LinkedList that contains soft or hard references, respectively.

Weak cache: implemented as a HashMap that contains weak references.

When you create a SoftCacheWeakIdentityMap or HardCacheWeakIdentityMap with a specified size, the reference cache LinkedList is exactly this size. The weak cache HashMap is initialized to 100 percent of the specified size: the weak cache will grow when more objects than the specified size are read in. Because EclipseLink does not control garbage collection, the JVM can reap the weakly held objects whenever it sees fit.

Because the reference cache is implemented as a LinkedList, new objects are added to the end of the list. Because of this, it is by nature a least recently used (LRU) cache: fixed size, object at the top of the list is deleted, provided the maximum size has been reached.

The SoftCacheWeakIdentityMap and HardCacheWeakIdentityMap are essentially the same type of identity map. The HardCacheWeakIdentityMap was constructed to work around an issue with some JVMs.

If your application reaches a low system memory condition frequently enough, or if your platform's JVM treats weak and soft references the same, the objects in the reference cache may be garbage-collected so often that you will not benefit from the performance improvement provided by it. If this is the case, we recommend that you use the HardCacheWeakIdentityMap. It is identical to the SoftCacheWeakIdentityMap except that it uses hard references in the reference cache. This guarantees that your application will benefit from the performance improvement provided by it.

When an object in a HardCacheWeakIdentityMap or SoftCacheWeakIdentityMap is pushed out of the reference cache, it gets put in the weak cache. Although it is still cached, EclipseLink cannot guarantee that it will be there for any length of time because the JVM can decide to garbage-collect weak references at anytime.

Querying and the Cache

A query that is run against the shared session cache is known as an in-memory query. Careful configuration of in-memory querying can improve performance (see How to Use In-Memory Queries).

By default, a query that looks for a single object based on primary key attempts to retrieve the required object from the cache first, searches the data source only if the object is not in the cache. All other query types search the database first, by default. You can specify whether a given query runs against the in-memory cache, the database, or both.

Handling Stale Data

Stale data is an artifact of caching, in which an object in the cache is not the most recent version committed to the data source. To avoid stale data, implement an appropriate cache locking strategy.

By default, EclipseLink optimizes concurrency to minimize cache locking during read or write operations. Use the default EclipseLink isolation level, unless you have a very specific reason to change it. For more information on isolation levels in EclipseLink, see [[#Cache Isolation].

Cache locking regulates when processes read or write an object. Depending on how you configure it, cache locking determines whether a process can read or write an object that is in use within another process.

A well-managed cache makes your application more efficient. There are very few cases in which you turn the cache off entirely, because the cache reduces database access, and is an important part of managing object identity.

To make the most of your cache strategy and to minimize your application's exposure to stale data, we recommend the following:

Configuring a Locking Policy

Make sure you configure a locking policy so that you can prevent or at least identify when values have already changed on an object you are modifying. Typically, this is done using optimistic locking. EclipseLink offers several locking policies such as numeric version field, time-stamp version field, and some or all fields.

Configuring the Cache on a Per-Class Basis

If other applications can modify the data used by a particular class, use a weaker style of cache for the class. For example, the SoftCacheWeakIdentityMap or WeakIdentityMap minimizes the length of time the cache maintains an object whose reference has been removed.

For more information, see [[Configuring a Descriptor (ELUG)|Configuring Cache Type and Size at the Descriptor Level].

Forcing a Cache Refresh when Required on a Per-Query Basis

Any query can include a flag that forces EclipseLink to go to the data source for the most up-to-date version of selected objects and update the cache with this information.

Configuring Cache Invalidation

Using descriptor API, you can designate an object as invalid: when any query attempts to read an invalid object, EclipseLink will go to the data source for the most up to date version of that object and update the cache with this information. You can manually designate an object as invalid or use a CacheInvalidationPolicy to control the conditions under which an object is designated invalid.

Configuring Cache Coordination

If your application is primarily read-based and the changes are all being performed by the same Java application operating with multiple, distributed sessions, you may consider using the EclipseLink cache coordination feature. Although this will not prevent stale data, it should greatly minimize it.

Explicit Query Refreshes

Some distributed systems require only a small number of objects to be consistent across the servers in the system. Conversely, other systems require that several specific objects must always be guaranteed to be up-to-date, regardless of the cost. If you build such a system, you can explicitly refresh selected objects from the database at appropriate intervals, without incurring the full cost of distributed cache coordination.

To implement this type of strategy, do the following:

Configure a set of queries that refresh the required objects.

Establish an appropriate refresh policy.

Invoke the queries as required to refresh the objects.

Refresh Policy

When you execute a query, if the required objects are in the cache, EclipseLink returns the cached objects without checking the database for a more recent version. This reduces the number of objects that EclipseLink must build from database results, and is optimal for noncoordinated cache environments. However, this may not always be the best strategy for a coordinated cache environment.

To override this behavior, set a refresh policy that specifies that the objects from the database always take precedence over objects in the cache. This updates the cached objects with the data from the database.

You can implement this type of refresh policy on each EclipseLink descriptor, or just on certain queries, depending upon the nature of the application.

Alternatively, you can configure any object with a CacheInvalidationPolicy that lets you specify, either automatically or manually, under what circumstances a cached object is invalid: when any query attempts to read an invalid object, EclipseLink will go to the data source for the most up-to-date version of that object, and update the cache with this information.

You can use any of the following CacheInvalidationPolicy instances:

DailyCacheInvalidationPolicy: the object is automatically flagged as invalid at a specified time of day.

NoExpiryCacheInvalidationPolicy: the object can only be flagged as invalid by explicitly calling org.eclipse.persistence.sessions.IdentityMapAccessor method invalidateObject.

TimeToLiveCacheInvalidationPolicy: the object is automatically flagged as invalid after a specified time period has elapsed since the object was read.

If you configure a query to cache results in its own internal cache (see How to Cache Query Results in the Query Cache), the cache invalidation policy you configure at the query level applies to the query's internal cache in the same way it would apply to the session cache.

If you are using a coordinated cache (see [[#Cache Coordination]), you can customize how EclipseLink communicates the fact that an object has been declared invalid. For more information, see [Configuring%20a%20Descriptor%20(ELUG)|Configuring Cache Coordination Change Propagation at the Descriptor Level].

Cache Coordination

The need to maintain up-to-date data for all applications is a key design challenge for building a distributed application. The difficulty of this increases as the number of servers within an environment increases. EclipseLink provides a distributed cache coordination feature that ensures data in distributed applications remains current.

Cache coordination reduces the number of optimistic lock exceptions encountered in a distributed architecture, and decreases the number of failed or repeated transactions in an application. However, cache coordination in no way eliminates the need for an effective locking policy. To effectively ensure working with up-to-date data, cache coordination must be used with optimistic or pessimistic locking. We recommend that you use cache coordination with an optimistic locking policy (see [[Configuring a Descriptor (ELUG)|Configuring Locking Policy]).

You can use cache invalidation to improve cache coordination efficiency. For more information, see #Cache Invalidation.

For more information, see [[#Cache Coordination].

Cache Isolation

Isolated client sessions provide a mechanism for disabling the shared server session cache. Any classes marked as isolated only cache objects relative to the life cycle of their client session. These classes never utilize the shared server session cache. This is the best mechanism to prevent caching as it is configured on a per-class basis allowing caching for some classes, and denying for others.

Cache Locking and Transaction Isolation

By default, EclipseLink optimizes concurrency to minimize cache locking during read or write operations. Use the default EclipseLink transaction isolation configuration unless you have a very specific reason to change it.

Cache Coordination

As Cache Coordination shows, cache coordination is a session feature that allows multiple, possibly distributed, instances of a session to broadcast object changes among each other so that each session's cache is either kept up-to-date or notified that the cache must update an object from the data source the next time it is read.

When sessions are distributed, that is, when an application contains multiple sessions (in the same JVM, in multiple JVMs, possibly on different servers), as long as the servers hosting the sessions are interconnected on the network, sessions can participate in cache coordination. Coordinated cache types that require discovery services also require the servers to support User Datagram Protocol (UDP) communication and multicast configuration (for more information, see Coordinated Cache Architecture).

Cache coordination reduces the occurrence of stale data by increasing the likelihood that distributed caches are kept up-to-date with changes and are notified when one of the distributed caches must update an object from the data source the next time it is read.

Cache coordination reduces the number of optimistic lock exceptions encountered in a distributed architecture, and decreases the number of failed or repeated transactions in an application. However, cache coordination in no way eliminates the need for an effective locking policy. To effectively ensure working with up-to-date data, cache coordination must be used with optimistic or pessimistic locking. We recommend that you use cache coordination with an optimistic locking policy (see Configuring Locking Policy).

For other options to reduce the likelihood of stale data, see #Stale Data.

Coordinated Cache Architecture

EclipseLink provides coordinated cache implementations that perform discovery and message transport services using various technologies including the following:

Message transport services allow the session to broadcast object change notifications to other sessions participating in cache coordination when a unit of work from this session commits a change.

Descriptor

You can configure how object changes are broadcast on a descriptor-by-descriptor basis. This lets you fine-tune the type of notification to make.

For example, for an object with few attributes, you can configure its descriptor to send object changes. For an object with many attributes, it may be more efficient to configure its descriptor so that the object is flagged as invalid (so that other sessions will know to update the object from the data source the next time it is read).

Unit of Work

Only changes committed by a unit of work are subject to propagation when cache coordination is enabled. The unit of work computes the appropriate change set based on the descriptor configuration of affected objects.

Coordinated Cache Types

JMS Coordinated Cache

For a JMS coordinated cache, when a particular session's coordinated cache starts up, it uses its JNDI naming service information to locate and create a connection to the JMS server. The coordinated cache is ready when all participating sessions are connected to the same topic on the same JMS server. At this point, sessions can start sending and receiving object change messages. You can then configure all sessions that are participating in the same coordinated cache with the same JMS and JNDI naming service information.

Because you must supply the necessary information to connect to the JMS Topic, a JMS coordinated cache does not use a discovery service.

If you do use cache coordination, we recommend that you use JMS cache coordination: JMS is robust, easy to configure, and provides efficient support for asynchronous change propagation.

For more information on configuring JMS, see Oracle Fusion Middleware Services Guide for Oracle Containers for Java EE.

RMI Coordinated Cache

For an RMI coordinated cache, when a particular session's coordinated cache starts up, the session binds its connection in its naming service (either an RMI registry or JNDI), creates an announcement message (that includes its own naming service information), and broadcasts the announcement to its multicast group (see Configuring a Multicast Group Address and Configuring a Multicast Port). When a session that belongs to the same multicast group receives this announcement, it uses the naming service information in the announcement message to establish bidirectional connections with the newly announced session's coordinated cache. The coordinated cache is ready when all participating sessions are interconnected in this way, at which point sessions can start sending and receiving object change messages. You can then configure each session with naming information that identifies the host on which the session is deployed.

CORBA Coordinated Cache

For a CORBA coordinated cache, when a particular session's coordinated cache starts up, the session binds its connection in JNDI, creates an announcement message (that includes its own JNDI naming service information), and broadcasts the announcement to its multicast group (see Configuring a Multicast Group Address and Configuring a Multicast Port). When a session that belongs to the same multicast group receives this announcement, it uses the naming service information in the announcement message to establish bidirectional connections with the newly announced session's coordinated cache. The coordinated cache is ready when all participating sessions are interconnected in this way, at which point, sessions can start sending and receiving object change messages. You can then configure each session with naming information that identifies the host on which the session is deployed.

Currently, EclipseLink provides support for the Sun Object Request Broker.

Custom Coordinated Cache

Using the classes in org.eclipse.persistence.sessions.coordination package, you can define your own coordinated cache for custom solutions. For more information, contact your EclipseLink support representative.

Cache Invalidation API

You configure cache invalidation using ClassDescriptor methods getCacheInvalidationPolicy and setCacheInvalidationPolicy to configure an org.eclipse.persistence.descriptors.invalidation.CacheInvalidationPolicy.

You can use any of the following CacheInvalidationPolicy instances:

DailyCacheInvalidationPolicy: The object is automatically flagged as invalid at a specified time of day.

NoExpiryCacheInvalidationPolicy: The object can only be flagged as invalid by explicitly calling org.eclipse.persistence.sessions.IdentityMapAccessor method invalidateObject.

TimeToLiveCacheInvalidationPolicy: The object is automatically flagged as invalid after a specified time period has elapsed since the object was read.

You can also configure cache invalidation using a variety of API calls accessible through the Session. The org.eclipse.persistence.sessions.IdentityMapAccessor provides the following methods:

getRemainingValidTime: Returns the remaining life of the specified object. This time represents the difference between the next expiry time of the object and its read time.

invalidateAll: Sets all objects for all classes to be invalid in EclipseLink identity maps.

invalidateClass(Class klass) and invalidateClass(Class klass, boolean recurse): Set all objects of a specified class to be invalid in EclipseLink identity maps.