Hi,
"How to determine being out-of-sync?" -- some ideas.
> [...] One problem we ran
> into is determining if the slave server is out of sync with that
> master
If I understand correctly:
. updates are processed in a fixed sequence
. updates have timestamps on them
. being out of sync means having missed some data?
It might be quite simple to dump all LDAP data in a canonical format and
pipe it through a secure hash like SHA1. Because only the hash is compared,
access privileges are not put at risk, and the hash gives a change-sensitive
summary of the data for comparison of master and slaves.
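A minimal sketch of that idea in Python, under stated assumptions: the
directory is already available as a dictionary mapping each DN to its
attributes, and "canonical" simply means sorted DNs, sorted attribute names
and sorted values. The names canonical_dump and directory_digest are made up
for illustration; a real canonical form would also have to normalise case
and whitespace.

```python
import hashlib

def canonical_dump(entries):
    """Render entries (DN -> {attr: [values]}) as deterministic text."""
    lines = []
    for dn in sorted(entries):
        lines.append("dn: " + dn)
        attrs = entries[dn]
        for attr in sorted(attrs):
            for value in sorted(attrs[attr]):
                lines.append(attr + ": " + value)
        lines.append("")  # a blank line closes each entry
    return "\n".join(lines)

def directory_digest(entries):
    """SHA1 over the canonical dump; only this digest needs to be shared."""
    text = canonical_dump(entries)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()
```

Master and slave would each compute directory_digest over their own data and
compare the two hex strings; any difference in content changes the digest.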
Alternatively, it might be simple to compute that secure hash over the
timestamps alone; this could be done anywhere, even in a slurpd that only
knows part of what's going on. This would probably require changes to the
software.
An intermediate solution would be to parse all replog files, collect all
timestamps from a given moment on, order them, hash them, and have that as
material for comparison between master and slaves.
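Roughly, that could look like the sketch below. It assumes that every
replog record carries a line of the form "time: <integer>"; please check
that against your own replog files. The function names are hypothetical.

```python
import hashlib

def replog_timestamps(replog_text, since=0):
    """Collect the timestamps at or after `since`, in sorted order."""
    stamps = []
    for line in replog_text.splitlines():
        if line.startswith("time:"):
            stamp = int(line[len("time:"):].strip())
            if stamp >= since:
                stamps.append(stamp)
    return sorted(stamps)

def timestamp_digest(stamps):
    """SHA1 over the ordered timestamps, one per line."""
    joined = "\n".join(str(stamp) for stamp in stamps)
    return hashlib.sha1(joined.encode("ascii")).hexdigest()
```

Master and slave would agree on the starting moment, run this over their own
replog files, and compare the digests. A missed update shows up as a
different digest, but not which update was missed.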
To signal an out-of-sync condition immediately, one could consider
mentioning the timestamp of the previous update in every update. The slave
would notice right away that it's missing information -- but that would make
the slave rather inflexible.
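To illustrate the chaining, here is a small sketch. The update layout (a
dictionary with "time", "prev" and "change" keys) and the Slave class are
invented for this example; they are not how slurpd represents updates.

```python
class Slave:
    """Accepts an update only if it names the last update applied here."""

    def __init__(self):
        self.last_applied = None  # timestamp of the latest applied update
        self.applied = []

    def receive(self, update):
        # A mismatch means at least one update in between was missed.
        if update["prev"] != self.last_applied:
            return False
        self.applied.append(update["change"])
        self.last_applied = update["time"]
        return True
```

The inflexibility is visible here: once one update is missing, every later
update is refused until the gap has been filled.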
A very useful solution would be to represent the updates in some idempotent
format; that is, applying an update more than once would have exactly the
same effect as applying it once.
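A tiny illustration of the difference, with hypothetical helper names and an
entry modelled as a plain dictionary: "set attribute to value" is idempotent,
whereas "add N to attribute" is not.

```python
def apply_set(entry, attr, value):
    """Idempotent: repeating the update leaves the same result."""
    updated = dict(entry)
    updated[attr] = value
    return updated

def apply_increment(entry, attr, amount):
    """Not idempotent: every repetition changes the result again."""
    updated = dict(entry)
    updated[attr] = updated.get(attr, 0) + amount
    return updated
```

With updates of the first kind, a slave that is unsure whether it has already
seen an update can simply apply it again.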
Even better, but probably impossible, would be for the updates to be
commutative; that is, they could be applied in any order. If this seems
infeasible, it might be interesting to look into what changes should be made
to an update when it "jumps over" another update that was applied too early
due to mis-syncs.
An example of the latter:
"change attr kids to 3 for cn=rick,dc=vanrein,dc=org"
WHEN JUMPING OVER
"modify rdn to cn=henderikus for cn=rick,dc=vanrein,dc=org"
BECOMES
"change attr kids to 3 for cn=henderikus,dc=vanrein,dc=org"
Another example:
"change attr kids to 3 for cn=rick,dc=vanrein,dc=org"
WHEN JUMPING OVER
"delete the object for cn=rick,dc=vanrein,dc=org"
BECOMES
"no operation"
The terms "idempotent" and "commutative" come from algebra, by the way.
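The two "jumping over" examples above could be sketched as follows. The
update layout (dictionaries with "op", "dn", "newrdn" and so on) is invented
for illustration, the DN is split naively at the first comma without regard
for escaped commas, and only the two cases from the examples are handled.

```python
def jump_over(update, applied):
    """Rewrite `update` so it can still be applied after `applied`,
    an update that reached the slave too early."""
    if update["dn"] != applied["dn"]:
        return update  # different objects: nothing to adjust
    if applied["op"] == "delete":
        return {"op": "noop"}  # the target object no longer exists
    if applied["op"] == "modrdn":
        # Keep the parent part of the DN, swap in the new RDN.
        _, sep, parent = update["dn"].partition(",")
        return dict(update, dn=applied["newrdn"] + sep + parent)
    return update  # other combinations are not handled in this sketch
```

Other combinations, such as two changes to the same attribute, would each
need their own rule.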
> When I backup the slave, I want to know that I am
> getting all of the data.
Yep, you'd want to calculate one of the proposed hashes above on the
material on your tape or CDROM.
Cheers,
Rick van Rein,
OpenFortress.