Application data is cached in a "region". The RegionFactory class provides the simplest
entry point into the Region
API. A Region implements Map, but
it also provides caching behavior such as data loading,
eviction control, and distribution. Every region has a name, and
regions may be nested to provide a cache-naming hierarchy ("parent
regions" with "subregions"). The root regions of the naming hierarchy
(that is, the regions with no parent) are obtained with the Cache.getRegion(java.lang.String) method.

Region properties such as the region's cache loader, mirroring policy, and
storage model are specified by an instance of RegionAttributes. A
RegionAttributes object can be specified when creating a
region.

Region data can be partitioned across many distributed system members to create one large logical heap. PartitionAttributes are used to configure
a partitioned region. A partitioned region can be configured to be
highly available, surviving the loss of one or more system members, by
maintaining copies of data. These extra copies also benefit read operations by
allowing load balancing across all the copies.

Partitioned Regions can also store more data than a single Java VM
can hold. Because the data is spread across multiple Java VMs, each with
its own garbage collector, a full garbage collection cycle in one VM affects
only part of the Region, which improves the performance of the Region as a whole.

Partitioned Region caveats

As of GemFire 5.0, Partitioned Regions do not support the following GemFire features:

A region contains key/value pairs of objects known as the region's
"entries". The
Region class provides a number of methods for
manipulating the region's entries such as create, put, invalidate, and destroy. The following
diagram describes the life cycle of a region entry.

A region's scope attribute
determines how the region's entries will be distributed to other
caches. A region with local scope
will not distribute any of its changes to any other members of the
distributed system, nor will it receive changes when another cache
instance is updated.

When a change (such as a put or
invalidate) is made to a region with non-local scope,
that change is distributed to the other members of the distributed
system that have created that region in their cache instance. There
are three kinds of distributed scope, each of which guarantees a
different level of consistency for distributed data. "Global"
scope provides the highest level of data consistency by obtaining a
distributed lock on a region entry before propagating a change to other
members of the distributed system. With globally-scoped regions, only
one thread in the entire distributed system may modify the region entry at a
time.

"Distributed ACK" scope provides slightly weaker data consistency
than global scope. With distributed ACK scope, the method that
modifies the region (such as a call to Region.destroy(java.lang.Object)) will not return until an
acknowledgment of the change has been received from every member of
the distributed system. Multiple threads may modify the region
concurrently, but the modifying thread may proceed knowing that its
change to the region has been seen by all other members.

"Distributed NO ACK" scope provides the weakest data consistency of
all the scopes, but also provides the best performance. A method invocation that
modifies a region with distributed NO ACK scope will return
immediately after it has updated the contents of the region in its own
cache instance. The updates are distributed
asynchronously.

Once the amount of data stored in the region exceeds the eviction
controller's threshold, least recently used data is written to
disk and removed from the VM until the region's size is below the
threshold.

Disk for persistence (persistBackup is true; no eviction controller, or an eviction controller with an eviction
action other than "overflow to disk")

All data in the region is scheduled to be
written to disk as soon as it is placed in the region. Thus,
the data on disk contains a complete backup of the region. No
information about recently used data is maintained and,
therefore, the size of the VM will continue to grow as more data
is added to the region. "Disk for persistence" mode is
appropriate for situations in which the user wants to back up a
region whose data fits completely in the VM.

All data in the region is scheduled to be
written to disk as soon as it is placed in the region. Unlike
"disk for persistence" mode, however, least recently used data is
removed from the VM once the eviction controller's
threshold is reached.

There are several things to keep in mind when working with regions
that store data on disk.

When disk overflow is enabled, only the value of the least
recently used entry is removed from the VM. That is, the entry's key
and the entry itself remain in the VM.

When disk overflow is enabled and backup is not, files whose names
start with OVERFLOW_ are created in each disk directory.
These files are deleted automatically when they
are no longer being used. If a VM crashes, it may not clean up its
overflow files. In that case, they will be cleaned up the next time
the same region is created.

GemFire will look for the value of a region entry on disk before
it will look in other cache instances. That is, it will not perform a
net search or a load if the value exists
on disk.

Region operations such as destroy and invalidate affect the
data that is stored on disk in addition to the data stored in the VM.
That is, if an entry is destroyed in the VM (even by an ExpirationAction), it will also be destroyed on disk.

Region backup and restore

A put on a region
that is configured to have a disk "backup" (persistBackup
is true) will result in the immediate scheduling of a
disk write according to the region's DiskWriteAttributes
(synchronous or asynchronous).

The actual backup data is stored in each of the specified disk
directories. If any one of these directories runs out of space then
any further writes to the backed up region will fail with a
DiskAccessException. The actual file names begin with BACKUP_
and KEYS_. If you wish to store a backup in another location
or offline, then all of these files need to be saved. Files that share
a directory must always be kept together, although the directory itself
may be renamed.

When a region with a disk backup is created, it initializes itself
with a "bulk load" operation that reads region entry data from its
disk files. Note that the bulk load operation does not create cache
events and it does not send update messages to other members of the
distributed system. Additionally, bulk loading reads only the
keys of
the region entries and not the values. Entry
values will be lazily read into the VM as they are requested.

A common high-availability scenario may involve mirrored regions
that are configured to have disk backups. When a mirrored backup
region is created in a distributed system that already contains a
mirrored backup region, GemFire optimizes the initialization of the
backup region by streaming the contents of the backup file to the
region being initialized. If there is no other mirrored backup region
in the distributed system, the backup file for the region being
initialized may contain stale data. (That is, the value of region
entries may have changed while the backup VM was down.) In this
situation, the region being initialized will consult other VMs in the
distributed system to obtain an up-to-date version of the cached
data.

A cache loader
allows data from outside of the VM to be placed into a region. When
Region.get(java.lang.Object) is called for a region
entry that has a null value, the load method of the
region's cache loader is invoked. The load method
creates the value for the desired key by performing an operation such
as a database query. The load may also perform a
net
search that will look for the value in a cache instance hosted by
another member of the distributed system.
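The load-on-miss behavior described above can be sketched in plain Java. This is only an illustration, not GemFire code: the class LoadingRegionSketch, its loadCount field, and the use of java.util.function.Function in place of a cache loader's load method are all assumptions made for the example, and net search is not modeled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a region with a cache loader; not the GemFire API.
class LoadingRegionSketch<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Function<K, V> loader; // plays the role of the loader's load method
    int loadCount = 0;                   // number of times the loader was invoked

    LoadingRegionSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader only when the entry has no value.
    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            loadCount++;
            value = loader.apply(key);   // e.g. a database query in a real loader
            if (value != null) {
                entries.put(key, value); // keep the loaded value for later gets
            }
        }
        return value;
    }
}
```

In this sketch a second get for the same key is served from the map, so the loader runs once per missing entry.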

If a region was not created with a user-specified cache loader, the
get operation will, by default, perform a special
variation of net search: if the value cannot be found in any of the
members of the distributed system, but one of those members has
defined a cache loader for the region, then that remote cache loader
will be invoked (a "net load") and the loaded value returned to the
requester. Note that a net load does not store the loaded value in
the remote cache's region.

The CacheWriter is a type of event handler
that is invoked synchronously before the cache is modified, and has
the ability to abort the operation. Only one CacheWriter in the distributed system
is invoked before the operation that would modify a cache. A CacheWriter
is typically used to update an external database.
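As a rough illustration of a handler that runs synchronously before a modification and can abort it, consider the plain Java sketch below. It is not the GemFire CacheWriter interface: WriterSketch, WriterRegionSketch, and the choice of IllegalStateException as the abort signal are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "before update" hook; throwing an exception aborts the operation.
interface WriterSketch {
    void beforeUpdate(String key, String newValue);
}

// Hypothetical region that consults its writer before changing an entry.
class WriterRegionSketch {
    private final Map<String, String> entries = new HashMap<>();
    private final WriterSketch writer;

    WriterRegionSketch(WriterSketch writer) {
        this.writer = writer;
    }

    void put(String key, String value) {
        writer.beforeUpdate(key, value); // runs first; an exception leaves the map untouched
        entries.put(key, value);
    }

    String get(String key) {
        return entries.get(key);
    }
}
```

Because the hook runs before the map is touched, a rejected put leaves the previous value in place. A writer that updates an external database would do so inside beforeUpdate.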

Sometimes cached data has a limited lifetime. The region attributes
regionTimeToLive, regionIdleTimeout, entryTimeToLive, and entryIdleTimeout specify how data is handled when it becomes too
old. There are two conditions under which cache data is considered
too old: data has resided unchanged in a region for a given amount of time
("time to live") and data has not been accessed for a given amount of time
("idle time"). GemFire's caching implementation launches an "expiration
thread" that periodically monitors region entries and expires
those that have become too old. When a region entry expires, it can
be invalidated, destroyed, locally invalidated, or locally destroyed.
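The two "too old" conditions can be expressed as a small check. The sketch below is plain Java and only an illustration: the class ExpirationSketch, the millisecond units, and the convention that a timeout of 0 disables a condition are assumptions of the example, not statements about GemFire's implementation.

```java
// Hypothetical expiration check; timestamps and timeouts are in milliseconds.
class ExpirationSketch {
    // In this sketch a timeout of 0 means the corresponding condition is disabled.
    static boolean isExpired(long now, long lastModified, long lastAccessed,
                             long timeToLiveMillis, long idleTimeoutMillis) {
        // "Time to live": the data has resided unchanged for too long.
        boolean tooOld = timeToLiveMillis > 0 && now - lastModified >= timeToLiveMillis;
        // "Idle time": the data has not been accessed for too long.
        boolean idle = idleTimeoutMillis > 0 && now - lastAccessed >= idleTimeoutMillis;
        return tooOld || idle;
    }
}
```

An expiration thread in this model would periodically call isExpired for each entry and then apply the configured action.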

A region's "data policy" attribute determines if data is stored in the local cache.
The normal policy
will store region data in the local cache.
The empty policy
will never store region data in the local cache. Regions with this policy act as proxy regions that
distribute write operations to others and receive events from others.
The replication policies
may reduce the number of net searches that a caching application has to perform,
and can provide a backup mechanism. A replicated region initializes itself when
it is created with the keys and values of the region as found in other caches.
The replicate policy simply stores the replicated data in memory, and
the persistent replicate policy stores the data in memory and on disk.

The CacheListener
interface provides callback methods that are invoked synchronously in
response to certain operations (such as a put or
invalidate) being performed on a region. The event
listener for a region is specified with the setCacheListener method. Each callback method on the
CacheListener receives a CacheEvent describing the operation
that caused the callback to be invoked and possibly containing information
relevant to the operation
(such as the old and new values of a region entry).

Before a new entry is created in a region, the region's eviction controller
is consulted. The eviction controller may perform some action on the region (usually an
action that makes the region smaller) based on certain criteria. For
instance, an eviction controller could keep track of the sizes of the
entry values. Before a new entry is added, the eviction controller
could remove the entry with the largest value to free up space in
the cache instance for new data. GemFire provides EvictionAttributes that will create an eviction controller
that destroys the "least recently used" Region Entry once the Region
exceeds a given number of entries.
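A "least recently used" policy with an entry-count limit can be sketched with the JDK's LinkedHashMap. This is not how GemFire implements EvictionAttributes; the class LruRegionSketch and its maximumEntries parameter are names invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU-limited map; removes the least recently used entry when over the limit.
class LruRegionSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maximumEntries;

    LruRegionSketch(int maximumEntries) {
        super(16, 0.75f, true); // true = order entries by access, not by insertion
        this.maximumEntries = maximumEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maximumEntries;
    }
}
```

Note one difference from the text: LinkedHashMap checks the limit after the new entry has been inserted, whereas the eviction controller described above is consulted before the entry is created.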

The CacheStatistics class
provides statistics information about the usage of a region or entry.
Statistics are often used by eviction controllers to determine which
entries should be invalidated or destroyed when the region reaches its
maximum size.

GemFire caches can be configured in a distributed hierarchical configuration.
In this configuration, there are GemFire cache instances at the edge
(a.k.a. Edge Caches) that talk to GemFire instances that sit in the backend
(a.k.a. Server caches). The edge cache talks to the server cache when it records
a miss. The server cache can either fulfill the request or it can load the data
from the database and return it to the edge cache. The loaded data is cached on
the server cache and used to serve future requests from edge caches. The
communication between the edge and the server is implemented using a custom
cache loader at the edge and a cache server process on the server. The cache
loader transmits requests to the cache server which in turn looks in its local
cache to fulfill the request. The custom cache loader and cache server use a
communication protocol that is designed to reduce latency and provide failover
capabilities.

The GemFire cache supports transactions, providing enhanced data
consistency across multiple Regions. GemFire Transactions are
designed to run in a highly concurrent environment and as such have
optimistic conflict behavior. They are optimistic in the sense that
they assume little data overlap between transactions. Using that
assumption, they do not reserve entries that are being changed by the
transaction until the commit operation. For example, when two
transactions operate on the same Entry, the last one to commit will
detect a conflict and fail to commit. The changes made by the
successful transaction are only available to other threads after the
commit has finished; in other words, GemFire Transactions exhibit
Read Committed behavior.
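The optimistic conflict behavior can be modeled with per-entry version numbers. The following plain Java sketch is an assumption-laden illustration, not GemFire's transaction implementation: OptimisticStoreSketch, its nested Tx class, and the version counters are all invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical store with optimistic transactions; nothing is reserved until commit.
class OptimisticStoreSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Integer> versions = new HashMap<>();

    class Tx {
        private final Map<String, Integer> seenVersions = new HashMap<>();
        private final Map<String, String> pendingWrites = new HashMap<>();

        // Buffers the write and remembers which version of the entry the transaction saw.
        void put(String key, String value) {
            synchronized (OptimisticStoreSketch.this) {
                seenVersions.putIfAbsent(key, versions.getOrDefault(key, 0));
            }
            pendingWrites.put(key, value);
        }

        // Returns false (a conflict) if any touched entry changed since it was first seen.
        boolean commit() {
            synchronized (OptimisticStoreSketch.this) {
                for (Map.Entry<String, Integer> seen : seenVersions.entrySet()) {
                    int current = versions.getOrDefault(seen.getKey(), 0);
                    if (current != seen.getValue()) {
                        return false;
                    }
                }
                for (Map.Entry<String, String> write : pendingWrites.entrySet()) {
                    values.put(write.getKey(), write.getValue());
                    versions.merge(write.getKey(), 1, Integer::sum);
                }
                return true;
            }
        }
    }

    Tx begin() {
        return new Tx();
    }

    // Readers see only committed values.
    synchronized String get(String key) {
        return values.get(key);
    }
}
```

When two transactions in this sketch write the same key, the first commit succeeds and bumps the version, so the second commit detects the change and fails. Uncommitted writes are never visible through get.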

To provide application integration with GemFire transactions, a
TransactionListener with associated
TransactionEvents is provided as a Cache
attribute. The listener is notified of commits, both failed and
successful, as well as explicit rollbacks. When a commit message is
received by a distributed member with the same Region, again the
TransactionListener is invoked.

GemFire transactions also integrate well with JTA transactions. If
a JTA transaction has begun and an existing GemFire transaction is not
already present, any transactional region operation will create a
GemFire transaction and register it with the JTA transaction, causing
the JTA transaction manager to take control of the GemFire
commit/rollback operation.

Similar to JTA transactions, GemFire transactions are associated
with a thread. Only one transaction is allowed at a time per thread,
and a transaction is not allowed to be shared among threads. The
changes made by a GemFire transaction are distributed to other
distributed members as per the Region's Scope.

Finally, GemFire transactions allow for the "collapse" of multiple
operations on an Entry. For example, if an application destroys an
Entry and follows with a create operation and then a
put operation, all three operations are combined into
one action reflecting the sum of all three.
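One way to picture this collapse is a transaction that records only the net outcome per key. The sketch below is plain Java and purely illustrative: CollapsingTxSketch and its use of Optional to mark a destroyed entry are assumptions, and the text does not say how GemFire represents collapsed operations internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical transaction buffer that keeps one net change per key.
class CollapsingTxSketch {
    // Optional.empty() means the entry ends up destroyed; otherwise it holds the final value.
    private final Map<String, Optional<String>> netChanges = new LinkedHashMap<>();

    void destroy(String key) {
        netChanges.put(key, Optional.empty());
    }

    void create(String key, String value) {
        netChanges.put(key, Optional.of(value));
    }

    void put(String key, String value) {
        netChanges.put(key, Optional.of(value));
    }

    // The changes that would be applied at commit: at most one per key.
    Map<String, Optional<String>> netChanges() {
        return netChanges;
    }
}
```

A destroy followed by a create and a put on the same key leaves a single pending change holding the last value.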

The GemFire region can be configured to require the
presence of one or more user-defined membership roles. Each Role is assigned to any number of
applications when each application connects to the GemFire distributed
system. MembershipAttributes are
then used to specify which roles
are required to be online and present in that region's membership
for access to the cache region being configured.

In addition to specifying which roles are required,
MembershipAttributes are used to specify the LossAction. The loss action determines how
access to the region is affected when one or more required roles are
offline and no longer present in the system membership. The region can be
made completely "NO_ACCESS",
which means that the application cannot read from or write to that region.
"LIMITED_ACCESS" means that the application cannot write to that region.
"FULL_ACCESS" means that the application can read from and write to that
region as normal. If "FULL_ACCESS" is selected,
then the application would only be aware of the missing required role if it
registered a RegionRoleListener.
This listener provides callbacks to notify the application when required
roles are gained or lost, thus providing for custom integration of required
roles with the application.
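The effect of each loss action on region access can be summarized in a small plain Java model. This is not the GemFire LossAction class: LossActionSketch, RequiredRoleCheckSketch, and the representation of roles as strings are assumptions made for the example.

```java
import java.util.Set;

// Hypothetical model of the three loss actions and what they permit while roles are missing.
enum LossActionSketch {
    NO_ACCESS(false, false),
    LIMITED_ACCESS(true, false),
    FULL_ACCESS(true, true);

    final boolean readAllowed;
    final boolean writeAllowed;

    LossActionSketch(boolean readAllowed, boolean writeAllowed) {
        this.readAllowed = readAllowed;
        this.writeAllowed = writeAllowed;
    }
}

// Hypothetical access check; roles are modeled as plain strings.
class RequiredRoleCheckSketch {
    static boolean rolesMissing(Set<String> required, Set<String> present) {
        return !present.containsAll(required);
    }

    // Reads are always allowed when no role is missing; otherwise the loss action decides.
    static boolean mayRead(LossActionSketch action, Set<String> required, Set<String> present) {
        return !rolesMissing(required, present) || action.readAllowed;
    }

    // Writes follow the same rule using the loss action's write permission.
    static boolean mayWrite(LossActionSketch action, Set<String> required, Set<String> present) {
        return !rolesMissing(required, present) || action.writeAllowed;
    }
}
```

With FULL_ACCESS both checks always pass, which matches the point above that the application would only notice a missing role through a listener.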

ResumptionAction defined in the
MembershipAttributes specifies how the application responds
to having a missing required role return to the distributed membership.
"None"
results in no special action, while "Reinitialize"
causes the region to be reinitialized. Reinitializing the region will
clear all entries from that region. If it is a replicated mirror, the
region will repopulate with entries available from other distributed
members.

RequiredRoles provides methods to
check for missing required roles for a region and to wait until all required
roles are present. In addition, this class can be used to check if a role
is currently present in a region's membership.