Fast Cluster Restore

The Fast Cluster Restore procedure documented on this page is recommended
to speed up arangorestore
in a Cluster environment.

It is assumed that a Cluster environment is running and a logical backup
with arangodump has already been taken.

The procedure described on this page is particularly useful for ArangoDB
version 3.3, but can be used in 3.4 and later versions as well. Note that
from v3.4, arangorestore includes the option --threads, which is already a good
first step towards parallelizing the restore and benefiting from its speed-up.
However, the procedure below allows for even further parallelization (making
use of different Coordinators), and the part regarding temporarily setting
the replication factor to 1 is still useful in 3.4 and later versions.

The speed improvement obtained by the procedure below is achieved by:

Restoring into a Cluster that has replication factor 1, thus reducing the
number of network hops needed during the restore operation (the replication
factor is reverted to its initial value at the end of the procedure; see steps
#2, #3 and #6).

Restoring multiple collections in parallel on different Coordinators
(steps #4 and #5).

Please refer to the
arangorestore examples
for further context on the factors affecting restore speed when restoring
using arangorestore in a Cluster.

Step 1: Copy the dump directory to all Coordinators

The first step is to copy the directory that contains the dump to all machines
where Coordinators are running.

This step is not strictly required, as the backup can be restored over the
network. However, executing the restore locally significantly improves the
restore speed.
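
As an illustration, the copy could be scripted as sketched below. This is only a sketch under assumptions: the host names, the /tmp target path, and the use of rsync over SSH are hypothetical choices and not part of the procedure itself.

```shell
# Copy the dump directory to every Coordinator host (hypothetical helper).
# Usage: copy_dump_to_coordinators <dump-directory> <host> [<host> ...]
copy_dump_to_coordinators() {
  dump_dir="$1"
  shift
  for host in "$@"; do
    # -a preserves file attributes, -z compresses data in transit.
    # The source has no trailing slash, so the directory itself is
    # created below /tmp on the target host (an assumed location).
    rsync -az "$dump_dir" "$host:/tmp/" || return 1
  done
}

# Example with hypothetical host names:
# copy_dump_to_coordinators ./dump coord1.example.com coord2.example.com coord3.example.com
```

Any other copy tool works equally well; what matters is that each Coordinator machine ends up with a local copy of the dump.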

Step 2: Restore collection structures

The collection structures have to be restored from exactly one Coordinator (any
Coordinator can be used) with a command similar to the following one. Please add
any additional options needed for your specific use case, e.g. --create-database
if the database you want to restore into does not exist yet:
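
A minimal sketch of such a command is shown here, wrapped in a hypothetical helper function. It assumes that --import-data false is used to skip the documents so that only the structures are created; the endpoint, database, user, and directory values are placeholders.

```shell
# Restore only the collection structures (no documents) via one Coordinator.
# Usage: restore_structures <endpoint> <database> <user> <dump-directory>
restore_structures() {
  # --import-data false skips the data and restores the structures only.
  # No --server.password is given, so the password does not end up in the
  # shell history; arangorestore is expected to prompt for it instead.
  arangorestore \
    --server.endpoint "$1" \
    --server.database "$2" \
    --server.username "$3" \
    --import-data false \
    --input-directory "$4"
}

# Example with placeholder values:
# restore_structures tcp://coord1.example.com:8529 mydb root dump
```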

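Step 3: Set Replication Factor to 1

To speed up the restore, the replication factor of the restored collections is
temporarily set to 1. Run this change from exactly one Coordinator (any
Coordinator can be used) and note the initial values, since they are needed
again in step 6.

Step 4: Create parallel restore scripts

The parallel restore scripts are generated by a command that is given the
endpoints of all Coordinators together with the usual arangorestore options.
The option --create-collection false is passed since the collection
structures were already created in step 2.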

Starting from v3.4.0, the arangorestore option --threads N, where N is an
integer, can be passed to the command above to further parallelize
the restore (the default is --threads 2).

The above command creates one script per listed Coordinator, so three scripts
for three listed Coordinators.

The resulting scripts are named coordinator_<number-of-coordinator>.sh (e.g.
coordinator_0.sh, coordinator_1.sh, coordinator_2.sh).
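
To illustrate what such per-Coordinator scripts could contain, the sketch below distributes collections round-robin over the Coordinators and writes one script for each. It is a hypothetical stand-in and not the generator used by this procedure: the function name and argument layout are assumptions, authentication options are left out, and it assumes there are at least as many collections as Coordinators.

```shell
# Hypothetical sketch: write coordinator_<n>.sh files into the current directory.
# Usage: make_restore_scripts "<endpoints>" "<collections>" <database> <dump-dir>
# The first two arguments are space-separated lists.
make_restore_scripts() {
  endpoints="$1"; collections="$2"; database="$3"; dump_dir="$4"
  count=0
  for ep in $endpoints; do
    # Start each script with the options shared by all of its collections.
    printf '#!/bin/sh\narangorestore --server.endpoint %s --server.database %s --create-collection false --input-directory %s' \
      "$ep" "$database" "$dump_dir" > "coordinator_${count}.sh"
    count=$((count + 1))
  done
  [ "$count" -gt 0 ] || return 1
  i=0
  for col in $collections; do
    # Append the collection to the script of the next Coordinator in turn.
    printf ' --collection %s' "$col" >> "coordinator_$((i % count)).sh"
    i=$((i + 1))
  done
  n=0
  while [ "$n" -lt "$count" ]; do
    # Terminate the command line and make the script executable.
    printf '\n' >> "coordinator_${n}.sh"
    chmod +x "coordinator_${n}.sh"
    n=$((n + 1))
  done
}

# Example with placeholder values:
# make_restore_scripts "tcp://c1:8529 tcp://c2:8529" "users orders items" mydb dump
```

Each resulting script restores a disjoint subset of the collections, which is what allows the Coordinators to work in parallel without restoring the same data twice.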

Step 5: Execute parallel restore scripts

The coordinator_<number-of-coordinator>.sh scripts that were created in the
previous step now have to be executed on each machine where a Coordinator
is running. This starts a parallel restore of the dump.
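
If the scripts are started from a single control machine, the execution could be sketched as follows. The sketch assumes SSH access to hypothetical Coordinator hosts, that coordinator_<n>.sh has already been copied to the remote login directory of host number n, and that the scripts can run without interactive input.

```shell
# Start each script on its Coordinator host in the background, then wait
# for all of them; returns non-zero if any restore failed.
# Usage: run_parallel_restore <host-0> <host-1> ...
run_parallel_restore() {
  n=0
  pids=""
  for host in "$@"; do
    # Host number n runs coordinator_<n>.sh.
    ssh "$host" "sh ./coordinator_${n}.sh" &
    pids="$pids $!"
    n=$((n + 1))
  done
  status=0
  for pid in $pids; do
    wait "$pid" || status=1
  done
  return "$status"
}

# Example with hypothetical host names:
# run_parallel_restore coord1.example.com coord2.example.com coord3.example.com
```

Waiting for every process individually makes it possible to tell whether all restores finished successfully before moving on to the next step.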

Step 6: Revert to the initial Replication Factor

Once the arangorestore process on every Coordinator is completed, the
replication factor has to be set to its initial value.

Run the following command from exactly one Coordinator (any Coordinator can be
used). Please adjust the replicationFactor value to your specific case (2 in the
example below):
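
A minimal sketch of such a command is given here as a hypothetical helper around arangosh. It assumes that only non-system collections (names not starting with an underscore) should be changed; the endpoint, database, and user values are placeholders.

```shell
# Set the replication factor of all non-system collections in one database.
# Usage: set_replication_factor <endpoint> <database> <user> <factor>
set_replication_factor() {
  # Accept only a plain positive integer before embedding it in JavaScript.
  case "$4" in
    ''|*[!0-9]*) echo "factor must be an integer" >&2; return 1 ;;
  esac
  # Skip system collections, then update the remaining ones.
  js='db._collections().filter(function (c) { return c.name()[0] !== "_"; })'
  js="$js.forEach(function (c) { c.properties({ replicationFactor: $4 }); });"
  arangosh \
    --server.endpoint "$1" \
    --server.database "$2" \
    --server.username "$3" \
    --javascript.execute-string "$js"
}

# Example with placeholder values, reverting to replication factor 2:
# set_replication_factor tcp://coord1.example.com:8529 mydb root 2
```

Called with a factor of 1, the same helper corresponds to the temporary reduction made earlier in the procedure, before the data is restored.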