Replication failure handling

Replication can encounter several failure situations. The following table
lists these situations and describes the actions that
Derby takes as a result.

Table 1. Replication failure handling

Failure situation

Action taken

Master loses connection with slave.

Transactions are allowed to continue processing while the
master tries to reconnect with the slave. Log records generated while the
connection is down are buffered in main memory. If the log buffer reaches its
size limit before the connection can be reestablished, the master replication
functionality is stopped. You can use the property
derby.replication.logBufferSize to configure the size limit of the
buffer; see the Derby Reference Manual for
details.

Slave loses connection with master.

The slave tries to reestablish the connection with the
master by listening on the specified host and port. It will not give up until it
is explicitly requested to do so by either the failover=true or
stopSlave=true connection URL attribute. If a failover is requested, the
slave applies all received log records and boots the database as described in
Forcing a failover. If the
stopSlave=true attribute is specified, the slave database is shut down
without further actions.

Two different masters of database D try to replicate to
the same slave.

The slave will only accept the connection from the first
master attempting to connect. Note that authentication is required to start
both the slave and the master, as described in
Replication and security.

The master and slave Derby instances are not at the same Derby version.

The master Derby instance is not able to send log data to the slave at the
same pace as the log is generated. The main memory log buffer gradually fills
up and eventually becomes full.

The master notices that the main memory log buffer is
filling up. It first tries to increase the speed of the log shipment to keep
the amount of log in the buffer below the maximum. If that is not enough to
keep the buffer from getting full, the response time of transactions may
increase for as long as log shipment has trouble keeping up with the amount of
generated log records. You can use properties to tune both the log buffer size
and the minimum and maximum interval between consecutive log shipments. See
the Derby Reference Manual for details.

The slave Derby instance crashes.

The master sees this as a lost connection to the slave.
The master tries to reestablish the connection until the replication log buffer
is full. Replication is then stopped on the master. Replication must be
restarted, as described in
Starting and running replication.

An unexpected failure is encountered.

Replication is stopped. The other Derby instance of the replication pair is
notified of the decision if the network connection is still alive.
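
The tuning properties and connection URL attributes mentioned in the table can
be sketched in Java as follows. This is a minimal illustration, not a
definitive recipe. The class name, method names, database name, and numeric
values are hypothetical. The property derby.replication.logBufferSize and the
failover=true and stopSlave=true attributes come from the table above; the two
log shipping interval property names are assumed from the Derby Reference
Manual and should be checked there, along with the permitted ranges and
defaults.

```java
// Sketch only: class, method, and constant names are hypothetical.
final class ReplicationSketch {

    // Illustrative values, not recommendations. See the Derby Reference
    // Manual for the allowed range and default of each property.
    static final int LOG_BUFFER_SIZE_BYTES = 65536;
    static final int MIN_SHIPPING_INTERVAL_MS = 200;
    static final int MAX_SHIPPING_INTERVAL_MS = 4000;

    /**
     * Sets the replication tuning properties as JVM system properties.
     * Assumption: this runs before the Derby engine is loaded, so that
     * Derby sees the values when it boots.
     */
    static void applyTuning() {
        System.setProperty("derby.replication.logBufferSize",
                Integer.toString(LOG_BUFFER_SIZE_BYTES));
        // Interval property names assumed from the Derby Reference Manual.
        System.setProperty("derby.replication.minLogShippingInterval",
                Integer.toString(MIN_SHIPPING_INTERVAL_MS));
        System.setProperty("derby.replication.maxLogShippingInterval",
                Integer.toString(MAX_SHIPPING_INTERVAL_MS));
    }

    /** Builds a connection URL that requests a failover for dbName. */
    static String failoverUrl(String dbName) {
        return "jdbc:derby:" + dbName + ";failover=true";
    }

    /** Builds a connection URL that asks the slave of dbName to stop. */
    static String stopSlaveUrl(String dbName) {
        return "jdbc:derby:" + dbName + ";stopSlave=true";
    }

    public static void main(String[] args) {
        applyTuning();
        // "salesDb" is a placeholder database name.
        System.out.println(failoverUrl("salesDb"));
        System.out.println(stopSlaveUrl("salesDb"));
    }
}
```

In a real deployment, one of these URLs would be passed to
java.sql.DriverManager.getConnection, together with whatever credentials the
database requires, on the Derby instance that hosts the slave. The sketch only
builds the strings, so it runs without Derby on the classpath.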