16.4.1.13 Replication and Master or Slave Shutdowns

It is safe to shut down a master server and restart it later.
When a slave loses its connection to the master, it tries to
reconnect immediately and, if that fails, retries periodically.
The default is to retry every 60 seconds. This interval can be
changed with the MASTER_CONNECT_RETRY option of the
CHANGE MASTER TO statement or with the
--master-connect-retry option. A slave can also cope with
network connectivity outages. However, the slave notices an
outage only after receiving no data from the master for
slave_net_timeout seconds. If
your outages are short, you may want to decrease
slave_net_timeout so that they are detected sooner. See
Section 5.1.4, “Server System Variables”.
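
The interaction between slave_net_timeout and the retry interval can be
illustrated with a small calculation. The following Python sketch is not
part of the server; the function name and the model are assumptions made
for illustration. It assumes that the connection is dead from the moment
the outage starts, that the slave makes its first reconnect attempt as
soon as slave_net_timeout seconds have passed without data, and that an
attempt succeeds once the network is back.

```python
import math


def seconds_until_reconnected(outage_length, slave_net_timeout, connect_retry=60):
    """Toy estimate of the seconds from the start of an outage until the
    slave is connected again. All arguments are in seconds.

    Hypothetical model: the old connection is unusable from the start of
    the outage, and a reconnect attempt succeeds once the network is back.
    """
    # The slave notices the outage only after slave_net_timeout seconds
    # of silence and makes its first reconnect attempt at that moment.
    first_attempt = slave_net_timeout
    if first_attempt >= outage_length:
        return first_attempt
    # Otherwise it retries every connect_retry seconds until an attempt
    # falls at or after the end of the outage.
    retries = math.ceil((outage_length - first_attempt) / connect_retry)
    return first_attempt + retries * connect_retry
```

Under these assumptions, a 300-second outage with slave_net_timeout set
to 3600 leaves the slave disconnected for 3600 seconds, whereas a value
of 30 brings it back after 330 seconds with the default 60-second retry
interval.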

An unclean shutdown (for example, a crash) on the master side
can leave the master binary log with a final position that is
less than the most recent position read by the slave, because
the master binary log file was not flushed to disk. This can
prevent the slave from replicating when the master comes back up.
Setting sync_binlog=1 in the
master's my.cnf file helps to minimize this
problem because it causes the master to synchronize its binary
log to disk after every write to the binary log.
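
The effect can be sketched with a toy model in Python. This is not server
code: ToyMaster, positions_after_crash, and the idea of counting positions
in whole events are assumptions made for illustration. The model treats
sync_binlog=N as "force the log to disk after every N writes", treats 0 as
"never force it", and takes the worst case in which a crash loses every
write that was not forced to disk.

```python
class ToyMaster:
    """Hypothetical master whose binary log position is counted in events."""

    def __init__(self, sync_binlog):
        self.sync_binlog = sync_binlog
        self.written = 0   # events written to the log (may be in OS cache only)
        self.durable = 0   # events known to have reached the disk

    def write_event(self):
        self.written += 1
        # Force the log to disk after every sync_binlog writes; 0 never syncs.
        if self.sync_binlog and self.written % self.sync_binlog == 0:
            self.durable = self.written

    def crash(self):
        # Worst case: everything that was not forced to disk is lost.
        self.written = self.durable


def positions_after_crash(events, sync_binlog):
    """Return (master final position, slave read position) after a crash."""
    master = ToyMaster(sync_binlog)
    slave_read_pos = 0
    for _ in range(events):
        master.write_event()
        # The slave reads each event as soon as the master has written it.
        slave_read_pos = master.written
    master.crash()
    return master.written, slave_read_pos
```

In this model, positions_after_crash(7, 0) returns (0, 7): the slave has
read up to a position that no longer exists on the master. With
sync_binlog=1 the two positions are equal, (7, 7). A real server can still
lose the write that was in progress at the moment of the crash, which the
model does not represent.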

The --innodb-safe-binlog option is
unneeded as of MySQL 5.0.3, having been made obsolete by the
introduction of XA transaction support.

Shutting down a slave cleanly is safe because the slave keeps
track of where it left off. However, make sure that the slave
does not have temporary tables open; see
Section 16.4.1.15, “Replication and Temporary Tables”. Unclean
shutdowns might produce problems, especially if the disk cache
was not flushed to disk before the shutdown occurred:

For transactions, the slave commits and then updates
relay-log.info. If a crash occurs
between these two operations, relay log processing will have
proceeded further than the information file indicates and
the slave will re-execute the events from the last
transaction in the relay log after it has been restarted.

A similar problem can occur if the slave updates
relay-log.info but the server host
crashes before the write has been flushed to disk. Writes
are not forced to disk because the server relies on the
operating system to flush the file from time to time.
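
The first of these cases can be made concrete with a toy model in Python.
This is a sketch, not the server's implementation: ToySlave, its
attributes, and the crash_after_commit_of parameter are hypothetical names
introduced for illustration. The model applies each transaction in two
steps, commit first and position update second, and simulates a crash
between the two.

```python
class ToySlave:
    """Hypothetical slave that applies a relay log one transaction at a time."""

    def __init__(self, relay_log):
        self.relay_log = relay_log
        self.applied = []   # committed transactions (survive a crash)
        self.info_pos = 0   # position recorded in the toy relay-log.info

    def run(self, crash_after_commit_of=None):
        """Apply transactions starting from the recorded position.

        If crash_after_commit_of is the index of a transaction, stop right
        after committing it, before the recorded position is updated.
        """
        pos = self.info_pos
        while pos < len(self.relay_log):
            # Step 1: commit the transaction.
            self.applied.append(self.relay_log[pos])
            if crash_after_commit_of == pos:
                return  # simulated crash between step 1 and step 2
            # Step 2: record the new position.
            pos += 1
            self.info_pos = pos


slave = ToySlave(["t1", "t2", "t3"])
slave.run(crash_after_commit_of=1)   # crash after committing t2
slave.run()                          # restart from the recorded position
```

After the restart, slave.applied is ["t1", "t2", "t2", "t3"]: the recorded
position still pointed at t2, so the slave executed it a second time.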

The fault tolerance of your system for these types of problems
is greatly increased if you have a good uninterruptible power
supply.

User Comments

The docs should highlight the lack of crash safety. It is easy to miss one sentence at the end of this section. Note that this warning is incomplete as a crash between commit and the call to flush_relay_log_info() will cause the last group of events to be repeated from the relay log. That doesn't require a server reboot.