Cannot configure wsrep_sst_donor correctly?

08-20-2014, 02:19 PM

During recent scheduled maintenance, I decided to improve SST donor selection by setting wsrep_sst_donor before rebooting all our server nodes. While monitoring the reboot, I noticed that half of our nodes ignored the setting and received their SST from nodes other than the ones configured. Details follow.

We have a 6-node cluster (weighted) across 2 datacenters. The nodes are named similarly to:

Primary DC:
node01
node02
node03

Secondary DC:
node01-dr
node02-dr
node03-dr

Since we're running 5.5, I wanted to configure each node to try to get its SST from another node in the same DC.

I noticed an interesting issue: the nodes whose names end in "-dr" correctly request their SST from the specified donors, but the other nodes always get their SST from the wrong DC. For example:

node01-dr is configured as wsrep_sst_donor = "node02-dr,node03-dr"
When it requests an SST, it correctly asks node02-dr.

node01-dr is configured as wsrep_sst_donor = "node03-dr,node02-dr"
When it requests an SST, it correctly asks node03-dr.

node01 is configured as wsrep_sst_donor = "node02,node03"
BUT when it requests an SST, it incorrectly receives it from node02-dr.

node01 is configured as wsrep_sst_donor = "node03,node02"
BUT when it requests an SST, it incorrectly receives it from node03-dr.
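
As a minimal sketch, the setting described for node01 would sit in my.cnf roughly as follows. The [mysqld] placement and the wsrep_node_name line are assumptions for illustration, not copied from the servers; wsrep_node_name is shown because the names listed in wsrep_sst_donor are matched against each donor's wsrep_node_name.

```ini
# Sketch of my.cnf on node01 (primary DC); values are illustrative.
[mysqld]
# Assumed: this node's own name as seen by the rest of the cluster.
wsrep_node_name = node01
# Preferred donors, in order, as described above for node01.
wsrep_sst_donor = "node02,node03"
```

The "-dr" nodes would use the same two lines with their own name and the "-dr" donor names.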

Can anyone shed some light on this? This cluster has been in production for a year and everything else is working as expected. It is strange that this one parameter does not take effect.