I modified the GUI demo to test cache replication in a 2-node cluster using the TCP transport protocol (with Infinispan 5.0).

Here is the config:

<default>
   <locking
      isolationLevel="READ_COMMITTED"
      lockAcquisitionTimeout="20000"
      writeSkewCheck="false"
      concurrencyLevel="5000"
      useLockStriping="false"
   />
   <jmxStatistics enabled="true"/>
   <clustering mode="replication">
      <stateRetrieval
         timeout="240000"
         fetchInMemoryState="false"
         alwaysProvideInMemoryState="false"
      />
      <sync replTimeout="20000"/>
   </clustering>
   <loaders
      passivation="false"
      shared="false"
      preload="true">
      <loader
         class="org.infinispan.loaders.file.FileCacheStore"
         fetchPersistentState="true"
         purgeOnStartup="false">
         <properties>
            <property name="location" value="/var/tmp/demostore"/>
         </properties>
      </loader>
   </loaders>

Replication works fine for changes made while both nodes are up (forming a cluster). However, if one node starts early and makes some changes to the cache (such as adding some entries) before a second node joins the cluster, the second node will NOT get the changes made on the first node while the second node was down; i.e. the cache data on the second node differs from that on the first.

I tried setting fetchInMemoryState="true" in the config, and got the same behaviour.

My understanding is that fetchPersistentState="true" will make the second node sync the state with the first node.

What should I set to ensure that new nodes sync their state (cache data) with the existing node(s)?
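To state what I expect a join-time sync to do (a plain-Java sketch of the intended behaviour, not Infinispan internals): the joining node should adopt the existing node's current state wholesale, rather than merging it into whatever the joiner already holds:

```java
import java.util.HashMap;
import java.util.Map;

public class ExpectedJoinSync {
    public static void main(String[] args) {
        // State on the existing node: k1 was deleted before the joiner arrived
        Map<String, String> existingNode = new HashMap<>();
        existingNode.put("k2", "v2");

        // State the joiner preloaded from its own local store
        Map<String, String> joiner = new HashMap<>();
        joiner.put("k1", "v1");
        joiner.put("k2", "v2");

        // Expected sync semantics: replace the joiner's state, don't merge into it
        joiner.clear();
        joiner.putAll(existingNode);

        System.out.println(joiner.keySet()); // only k2 remains
    }
}
```

With replace semantics, the deleted k1 disappears on the joiner as well, which is the behaviour I am looking for.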

!! The problem is seen when I use the org.infinispan.loaders.jdbm.JdbmCacheStore loader !!

Indeed, I do not see the issue when using the FileCacheStore loader!

With the JdbmCacheStore:

If I set fetchPersistentState="true" along with fetchInMemoryState="true", entries added on one node show up on a newly joining cluster node, but removed entries are NOT synchronized to the new node.

For example:

We have the following data pre-existing in each node's local cache store:

k1 --> v1

k2 --> v2

1) Start the first node (only) and delete cache entry k1. On node 1, the cache data now looks like this:

k2 --> v2

2) Start the second node and wait for it to join the cluster.

You will see that the cache on node 2 still contains both entries, k1 --> v1 and k2 --> v2.
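The mechanism I suspect can be illustrated with plain Java maps (an illustration of the merge behaviour I am observing, not Infinispan API): node 2 preloads both entries from its local store, and the transferred state carries no record of the deletion, so the stale entry survives:

```java
import java.util.HashMap;
import java.util.Map;

public class StaleEntryDemo {
    public static void main(String[] args) {
        // Node 2's cache after preload="true" loads its local store
        Map<String, String> node2Cache = new HashMap<>();
        node2Cache.put("k1", "v1");
        node2Cache.put("k2", "v2");

        // State received from node 1, which deleted k1 before node 2 joined;
        // the transfer contains only entries that still exist, no tombstone for k1
        Map<String, String> node1State = new HashMap<>();
        node1State.put("k2", "v2");

        // A merge-style transfer only overwrites/adds entries
        node2Cache.putAll(node1State);

        System.out.println(node2Cache.containsKey("k1")); // stale entry survives
    }
}
```

This would explain why additions propagate to the joiner but deletions do not.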