Cache Topologies

A caching topology defines the strategy for storing data in a clustered cache.
It specifies how the cache cluster behaves under different scenarios, and these
behavior specifications help in choosing the topology that best suits an
application's environment.

NCache provides the following set of topologies for clustered caches.

Mirrored Cache (Two-Server Active/Passive)

Replicated Cache

Partitioned Cache

Partitioned-Replica Cache (Asynchronous and Synchronous)

Reference Data Versus Transactional Data: Cached data can be divided into two
broad types: reference and transactional. Reference data has a much higher
ratio of reads to writes, e.g., a product catalog whose prices change only once
every day or two. Transactional data, on the other hand, has roughly the same
ratio of reads and writes, e.g., each ASP.NET session is updated with every
activity of each application user.

Each of these caching topologies is explained in more detail below.

Mirrored Cache

A mirrored cache consists of two server nodes in an active/passive arrangement.
One node is the active node, which also acts as the coordinator node; the
coordinator is the oldest server node in any cache cluster and holds membership
management authority. The other node is the passive/mirror node.

Clients Connection: All cache clients connect only to the active node. The
passive node exists solely to maintain an exact backup of the active node.

Backup Support: A mirrored cache provides data reliability by asynchronously
replicating all updates to the mirror/passive node. Because replication between
the active and passive nodes is asynchronous, cache clients get better
performance. The active and passive roles can be interchanged at any time.

Failure Recovery: If the active node goes down for any reason, the passive node
automatically becomes active without any intervention. All cache clients then
connect to the new active node and continue their operations without
performance degradation or downtime.

Runtime Scalability: Since there is a single active node in this topology, its
scalability is the same as that of a local cache. However, because data is
mirrored, it also provides the fault tolerance that a local cache lacks.

State Transfer: The mirror node always replicates its state from the active
node when it joins the cache cluster.

Replicated Cache

Multiple Server Nodes: A replicated cache can have two or more server nodes.
Each node is a replica of the cache, i.e., all server nodes contain the same
set of cached data.

More than One Backup: It maintains more than one backup of the cached data; as
server nodes are added to the cluster, the number of backups also increases.

Replication Mode: Replication is performed synchronously on all other nodes of
the cache. With synchronous parallel replication, two parallel requests
updating the same data could leave different replicas in different states. To
avoid such data integrity issues, a global token is used to apply updates on
all nodes in the same sequence.
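The sequencing idea above can be sketched in a few lines of Python (a
simplified, single-process illustration; the token source and update format
are hypothetical, not NCache's actual protocol):

```python
# Illustrative sketch: a global token (sequence number) orders parallel
# updates so that every replica applies them in the same order.
from itertools import count

_token = count(1)  # stand-in for the cluster-wide token source

def sequenced(update):
    # Attach a globally increasing token to each update.
    return (next(_token), update)

def apply_in_order(replica, updates):
    # Each replica sorts by token before applying, so concurrent writes
    # to the same key converge to the same final state everywhere.
    for _, (key, value) in sorted(updates):
        replica[key] = value
    return replica

u1 = sequenced(("price", 100))
u2 = sequenced(("price", 105))
r1 = apply_in_order({}, [u2, u1])  # updates arrive out of order here
r2 = apply_in_order({}, [u1, u2])
print(r1 == r2)  # both replicas end in the same state
```

Without the token, the replica that received the updates out of order would
finish with a different value than the others.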

Clients Load Balancing: Cache clients are automatically load balanced by the
cluster for an even distribution of load across the cache nodes. This ensures
that no single node is overwhelmed by cache client requests. Client load
balancing is configurable and can be turned off.

High Performance with Reference Data: A replicated cache is suitable when data
reads far outnumber data writes. For any cache client, reads are local to the
connected server, which means the client has direct access to all cached data.

Runtime Scalability: As in other topologies, server nodes can be added to or
removed from a replicated cache at runtime.

Low Performance with Transactional Data: As more server nodes are added to
scale the replicated cache, write overhead increases. This topology is
therefore not suitable for applications with transactional data.

Storage Limited to Each Server Node: Adding more nodes does not increase
storage capacity because every node holds the same data set.

High Availability: The replicated topology is best for applications where high
availability and reliability with zero downtime are the priority, because it
maintains multiple backups of the cached data set.

State Transfer: Every server node replicates its state from the coordinator
node (the senior-most node in terms of cluster joining) when it joins the
cache cluster.

Partitioned Cache

Multiple Server Nodes: This topology has two or more servers. When a cache item
is added, it is stored only on the relevant server node, which means every
server node holds a unique subset of the data.

Data Distribution: Data is distributed/partitioned among all server nodes based
on the hash code of the cache key. A hash-based distribution map is generated
and then distributed to all server nodes.

Clients Connections: Cache clients establish connections with every server node
in the cluster. The distribution map is also shared with cache clients, so they
are aware of the actual data distribution. When a cache client wants to send a
request to the cache servers, it first determines which server node the key
belongs to and then sends the request to that node, based on the distribution
map.
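The routing described above can be sketched as follows (a minimal illustration
with a hypothetical bucket count, node names, and hash function; NCache's
actual distribution map is more elaborate):

```python
# Illustrative sketch of one-hop key routing via a hash-based
# distribution map shared between the servers and the clients.

NUM_BUCKETS = 4  # hypothetical; real maps use far more buckets

# Distribution map: bucket index -> owning server node.
distribution_map = {0: "ServerA", 1: "ServerB", 2: "ServerA", 3: "ServerB"}

def bucket_for(key):
    # A stable hash is required so that every client maps the same
    # key to the same bucket.
    return sum(ord(c) for c in key) % NUM_BUCKETS

def node_for(key):
    # One-hop routing: the client looks up the owner directly,
    # with no extra network round trip.
    return distribution_map[bucket_for(key)]

print(node_for("Product:1001"))
```

Because every client computes the same bucket for a given key, each request
goes straight to the owning node without consulting any other server.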

One-Hop Reads and Writes: Based on the distribution map, the cache client
adds/fetches a cached item to/from the exact server node. In this manner, both
cache reads and writes become one-hop operations with no extra delays.

No Backups/Fault Tolerance: When any server node leaves the cluster, the data
residing on that node is lost because no node has a replica. This topology
therefore provides no fault tolerance for cached data.

Runtime Scalability: Server nodes can be added or removed at runtime whenever
needed.

High Performance: Scaling the cache does not affect the performance of the
clustered cache for either reference or transactional data, because both reads
and writes are one-hop operations in this topology.

Storage Scalability: When more nodes are added to a partitioned cache, storage
capacity increases because each partition in the cluster holds a unique set of
data.

Reliability: This topology is not fault tolerant because there is no backup of
the data; if any server node goes down, the data belonging to that partition is
permanently lost.

State Transfer: When a node joins the cache cluster in the partitioned
topology, existing nodes give up a portion of their data, according to the
hash-based distribution, to the newly joined node. The new node transfers its
data share from the existing nodes, which results in an even distribution of
the cached data among all cluster nodes. State transfer occurs in the
background, so user cache operations do not suffer while it is in progress.
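The rebalancing step can be illustrated with a minimal sketch (hypothetical
bucket counts and node names; the real state-transfer protocol moves data
incrementally in the background):

```python
# Illustrative sketch: when a new node joins, existing nodes give up a
# share of their buckets so the data ends up evenly distributed again.

def rebalance(distribution_map, new_node):
    nodes = sorted(set(distribution_map.values())) + [new_node]
    buckets = sorted(distribution_map)
    target = len(buckets) // len(nodes)  # fair share for the new node
    new_map = dict(distribution_map)
    # Reassign buckets to the new node until it owns its fair share;
    # only the data of the reassigned buckets has to be transferred.
    for bucket in buckets[:target]:
        new_map[bucket] = new_node
    return new_map

old_map = {0: "ServerA", 1: "ServerB", 2: "ServerA",
           3: "ServerB", 4: "ServerA", 5: "ServerB"}
new_map = rebalance(old_map, "ServerC")
print(new_map)
```

Note that only the reassigned buckets move; data in every other bucket stays
where it is, which is what keeps state transfer cheap.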

Partitioned-Replica Cache

Multiple Server Nodes: This topology has two or more servers. When a cache item
is added, it is stored only on the relevant server node, which means every
server node holds a unique subset of the data.

Data Distribution: Data is distributed/partitioned among all server nodes based
on the hash code of the cache key. A hash-based distribution map is generated
and then distributed to every server node.

Fault Tolerance: It provides fault tolerance with one replica per partition.
Each server node in the cluster hosts one active partition, and that
partition's passive/mirror replica resides on another node. Thus, when a node
goes down, its data can be restored from its replica.

Double Memory Consumption: Each server node consumes double the configured
memory because it hosts one active partition and one replica; both the active
partition and the replica have the same size as configured for the cache.

Clients Connections: Cache clients are connected to all server nodes in this
topology. The distribution map is also shared with cache clients, so they are
aware of the actual data distribution. With the help of the distribution map,
operations are sent directly to the node containing the data, based on the
cache key.

Runtime Scalability: Server nodes can be added or removed at runtime
whenever needed.

Storage Scalability: When more nodes are added to a partitioned-replica cache,
storage capacity increases because each partition in the cluster holds a unique
set of data.

High Performance: Scaling the cache does not affect the performance of the
clustered cache for either reference or transactional data, because both reads
and writes are one-hop operations in this topology. Adding new nodes to the
cache cluster also redistributes the load of existing members, resulting in
higher throughput.

State Transfer: When a node joins the cache cluster in the partitioned-replica
topology, existing nodes give up a portion of their data, according to the
hash-based distribution, to the newly joined node. The new node gets its data
share transferred from the existing nodes, which results in an even
distribution of the cached data among all cluster nodes. Similarly, when a node
leaves the cache cluster, the data from its backup node is distributed among
the existing partitions. State transfer also occurs between active and passive
nodes. It takes place in the background, so user cache operations do not
suffer while it is in progress.

Replication Strategy/Modes: The following data replication modes are provided
in this topology.

Synchronous Replication

Cache clients have to wait for replication to the backup partition as part of
the cache operation. With synchronous replication, it is guaranteed that if a
cache client's request completes successfully, the data is already stored on
the backup node.

Asynchronous Replication

A cache client's request returns immediately after the operation is performed
on the active partition node; operations are replicated asynchronously to the
replica node. If a node goes down ungracefully, any operations still in the
replication queue may be lost.
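The difference between the two modes can be sketched as follows (a simplified
single-process illustration; the dictionaries stand in for the active and
backup partitions, which in reality live on different servers):

```python
# Illustrative sketch of synchronous vs. asynchronous replication
# to a backup partition.
from queue import Queue

active = {}                   # active partition's store
backup = {}                   # backup (replica) partition's store
replication_queue = Queue()   # pending asynchronous replications

def put_sync(key, value):
    # Synchronous: the call does not return until the backup also holds
    # the value, so a successful call implies the backup is up to date.
    active[key] = value
    backup[key] = value

def put_async(key, value):
    # Asynchronous: the call returns after updating the active partition;
    # replication is queued and applied later. If the node dies before
    # the queue drains, the queued updates are lost.
    active[key] = value
    replication_queue.put((key, value))

def drain_replication_queue():
    # Stand-in for the background replicator.
    while not replication_queue.empty():
        key, value = replication_queue.get()
        backup[key] = value

put_sync("a", 1)
put_async("b", 2)
print(backup)   # "a" is already replicated; "b" is still only queued
drain_replication_queue()
print(backup)   # now "b" has been replicated as well
```

The window between `put_async` returning and the queue draining is exactly
where data can be lost on an ungraceful shutdown, which is the trade-off
asynchronous replication makes for lower latency.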