Question, Problem

I would like to save disk space. How can I consolidate and share the storage of CRX data files?

Note: The same method can be applied to instances of CQ5 WCM version 5.2.1.

Answer, Resolution

One way to preserve disk space in your environment is to share the CRX datastore directory over a network share between multiple installations of CRX.

WARNING: This process requires you to move your datastore directory to a shared network drive. Moving the datastore from a local folder to a shared network directory will cause a considerable loss in performance. Consider this before implementing the process, and weigh the benefits accordingly.

First of all, what is the datastore?

The datastore is used by CRX to store large binary values. Normally, all node and property data is stored in a persistence manager, but for large binaries, special treatment can improve performance and reduce disk usage.

How to combine the datastore between multiple CRX instances

Consider a scenario where you have two CRX instances, A and B (it does not matter whether each is an author or a publish instance). A is installed under /opt/day/crxA and B is installed under /opt/day/crxB. In a default unclustered installation of CRX, the datastore is stored under <path to instance>/crx-quickstart/repository/shared/repository/datastore.

Note that your "shared" directory path may be different if you have configured a cluster "shared" directory. In that case, the datastore is stored under <shared path>/repository/datastore.

Instructions:
Copy and consolidate the files of instance A's and instance B's datastores into a single directory, for example on a common network share such as /mnt/nfsshare1 if A and B are on two different physical servers.
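
The consolidation step can be sketched in Java as follows. This is a minimal illustration, not the procedure from Day: the class name, the mergeInto helper, and the target directory are hypothetical, and the sketch assumes that both instances are stopped while files are copied and that datastore files are named after a hash of their content (as in Jackrabbit's FileDataStore), so a file that already exists in the target can be skipped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of CRX: merges datastore directories into one.
class DatastoreConsolidationSketch {

    /** Copies every file under source into target, keeping relative paths; returns files copied. */
    static int mergeInto(Path source, Path target) throws IOException {
        int copied = 0;
        try (Stream<Path> paths = Files.walk(source)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                Path dest = target.resolve(source.relativize(p).toString());
                if (Files.isDirectory(p)) {
                    Files.createDirectories(dest);
                } else if (!Files.exists(dest)) {
                    // Assumption: same file name means same content, so existing files are skipped.
                    Files.createDirectories(dest.getParent());
                    Files.copy(p, dest);
                    copied++;
                }
            }
        }
        return copied;
    }

    /** Usage: java DatastoreConsolidationSketch <source datastore>... <target datastore> */
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: <source datastore>... <target datastore>");
            return;
        }
        Path target = Paths.get(args[args.length - 1]);
        for (int i = 0; i < args.length - 1; i++) {
            int n = mergeInto(Paths.get(args[i]), target);
            System.out.println(args[i] + ": copied " + n + " files");
        }
    }
}
```

In the scenario above, the sources would be the two datastore directories under /opt/day/crxA and /opt/day/crxB, and the target would be a directory on the share, for instance /mnt/nfsshare1/datastore (an assumed name).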

Configure repository.xml to point to the new datastore path for both instance A and B.
Open repository.xml on instance A (/opt/day/cq5A/crx-quickstart/server/runtime/0/crx/WEB-INF/repository.xml) and change the datastore path to the shared location, then do the same on instance B.
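
As a rough illustration of the changed entry, the fragment below assumes that the repository uses Jackrabbit's FileDataStore and that the consolidated directory is /mnt/nfsshare1/datastore. Both are assumptions: the class name and any other parameters in your own repository.xml may differ, in which case only the path value should change.

```xml
<!-- Illustrative only: keep the class and other params already present in your file -->
<DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
    <!-- Assumed location of the consolidated datastore on the network share -->
    <param name="path" value="/mnt/nfsshare1/datastore"/>
</DataStore>
```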

How to run datastore garbage collection when the datastore is shared by multiple instances of CRX

WARNING: This only applies to CRX 1.4.2 or patched versions of 1.4.1, because datastore garbage collection only works properly in 1.4.1 after a CRX hotfix has been applied (contact Day support for more information). Test this in a development environment before implementing it in production.

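When multiple CRX instances use the same datastore, first call gc.scan() on instance A, then on B, and so on. Finally, call gc.deleteUnused() on instance A. The order matters: deleteUnused() removes every binary that no scan marked as in use, so running it before all instances have been scanned would delete binaries that the unscanned instances still reference.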
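
The reason for this order can be shown with a toy mark-and-sweep model. The classes below are hypothetical and are not the CRX or Jackrabbit garbage collector API. They only mimic the two calls named above: scan() marks the binaries one instance still references, and deleteUnused() removes whatever no scan marked.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: hypothetical classes, not the CRX or Jackrabbit API.
class SharedDataStoreGcSketch {

    /** A shared datastore: record ids plus the ids marked during the current GC run. */
    static class SharedStore {
        final Set<String> records = new HashSet<>();
        final Set<String> marked = new HashSet<>();
    }

    /** One instance's collector: knows which binaries that instance still references. */
    static class InstanceGc {
        private final SharedStore store;
        private final Set<String> referenced;

        InstanceGc(SharedStore store, Set<String> referenced) {
            this.store = store;
            this.referenced = referenced;
        }

        /** Mark phase: flag every stored record this instance still uses. */
        void scan() {
            for (String id : referenced) {
                if (store.records.contains(id)) {
                    store.marked.add(id);
                }
            }
        }

        /** Sweep phase: delete every record that no scan marked; returns the number deleted. */
        int deleteUnused() {
            int before = store.records.size();
            store.records.retainAll(store.marked);
            store.marked.clear();
            return before - store.records.size();
        }
    }

    /** Runs the documented order: scan on every instance, then one sweep on the first. */
    static Set<String> collect(SharedStore store, List<InstanceGc> instances) {
        for (InstanceGc gc : instances) {
            gc.scan();
        }
        instances.get(0).deleteUnused();
        return store.records;
    }

    public static void main(String[] args) {
        SharedStore store = new SharedStore();
        store.records.addAll(Set.of("a1", "b1", "ab", "orphan"));
        InstanceGc a = new InstanceGc(store, Set.of("a1", "ab"));
        InstanceGc b = new InstanceGc(store, Set.of("b1", "ab"));
        // prints [a1, ab, b1]: only the unreferenced record is removed
        System.out.println(new TreeSet<>(collect(store, List.of(a, b))));
    }
}
```

If a.scan() were followed directly by a.deleteUnused() in this model, the record "b1" would be deleted even though instance B still references it.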