Once again, replication/sync has been lost. I really wish the
product were more stable; it has so much potential, and yet.

The servers ran for 6 days with no issues. There were no new
accounts or changes (maybe a few users changing passwords), and
yet again 5 out of 16 servers are no longer in sync.

I can test it easily by adding an account, waiting a few
minutes, and then running "ipa user-show --all username" on all
the servers; only a few of them have the account. I have
now waited 15 minutes, still no luck.
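
The check described above could be scripted roughly like this. It
is only a sketch: the host names are hypothetical, and it assumes
passwordless ssh to each server, a valid Kerberos ticket there, and
that "ipa user-show" exits non-zero when the account is not found.

```python
import subprocess
from typing import Callable, Iterable, List


def account_visible(server: str, username: str) -> bool:
    """Run `ipa user-show --all <username>` on one server over ssh.

    Assumption: a zero exit status means the account was found there.
    """
    result = subprocess.run(
        ["ssh", server, "ipa", "user-show", "--all", username],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_out_of_sync(
    servers: Iterable[str],
    username: str,
    check: Callable[[str, str], bool] = account_visible,
) -> List[str]:
    """Return the servers on which the test account is not visible."""
    return [server for server in servers if not check(server, username)]


# Example call (hypothetical host names):
#   find_out_of_sync(["ipa01.example.com", "ipa02.example.com"], "synctest")
```

The check is passed in as a parameter so the comparison can be tried
without touching real servers.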

Oh well.. I guess I will go look at alternatives. I had such
high hopes for this tool. Thanks so much everyone for all your
help in trying to get things stable, but for whatever reason,
there is a random loss of sync among the servers and obviously
this is not acceptable.

What I also found interesting is that I have not deleted any
masters at all, so it is quite perplexing where the orphaned
entries came from. However, I did find that 3 of the replicas did
not show complete RUV lists. While most of the replicas had a list
of all 16 servers, a couple of them listed only 4 or 5 (using
ipa-replica-manage list-ruv).

I don't know about the orphaned entries. Did you get entries below
deleted parents?

AFAIK all replicas are masters and so have an entry {replica <rid>}
in the RUV. We should expect all servers to have the same number of
RUV elements (whether 16, 4 or 5). The servers with 4 or 5 may be
isolated, so that they did not receive updates from those with 16
RUV elements.
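
To compare the RUV lists between servers, something like the
following sketch could help. It assumes each normal line of the
"ipa-replica-manage list-ruv" output looks like "host:port: rid"
(an assumption about the format, not taken from this thread); the
server names are made up, and lines that do not match are ignored.

```python
import re
from typing import Dict, Set

# Assumed line format: "<host>:<port>: <replica id>"
RUV_LINE = re.compile(r"^(?P<host>\S+):(?P<port>\d+):\s*(?P<rid>\d+)\s*$")


def parse_ruv(output: str) -> Set[int]:
    """Collect the replica ids listed in one server's list-ruv output."""
    rids = set()
    for line in output.splitlines():
        match = RUV_LINE.match(line.strip())
        if match:
            rids.add(int(match.group("rid")))
    return rids


def missing_rids(ruv_by_server: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """For each server, list replica ids that other servers know but it lacks."""
    all_rids = set().union(*ruv_by_server.values())
    return {
        server: all_rids - rids
        for server, rids in ruv_by_server.items()
        if all_rids - rids
    }
```

A server that shows up in the result with many missing replica ids
would be a candidate for the "isolated" case described above.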

Would you copy/paste an example of a RUV with 16 elements and one
with 4-5?

Now, the steps to clear this were:

Removed the "unable to decode" entries with direct ldapmodify
commands. This worked across all replicas, which was nice, and it
did not have to be repeated on each one. In other words, it was
entered on a single server and the entries were removed on all.
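
The exact ldapmodify input is not shown above. As a sketch only,
and assuming the 389 Directory Server CLEANALLRUV task was what
was used, the LDIF for one replica id could be generated like
this. The base DN is a placeholder, not the real suffix.

```python
def cleanallruv_ldif(replica_id: int, base_dn: str) -> str:
    """Build LDIF for a CLEANALLRUV task entry for one replica id.

    Assumption: this is the kind of entry fed to ldapmodify; the
    original post does not show the actual LDIF.
    """
    name = f"clean {replica_id}"
    lines = [
        f"dn: cn={name},cn=cleanallruv,cn=tasks,cn=config",
        "changetype: add",
        "objectclass: extensibleObject",
        f"replica-base-dn: {base_dn}",
        f"replica-id: {replica_id}",
        f"cn: {name}",
    ]
    return "\n".join(lines) + "\n"


# Example with a placeholder suffix:
#   print(cleanallruv_ldif(16, "dc=example,dc=com"))
```

The output would be piped to ldapmodify on a single server, bound
as Directory Manager.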

And just like that - for no reason, they all reappeared:
unable to decode {replica 16} 55356472000300100000 55356472000300100000
unable to decode {replica 23} 5545d61f000200170000 5552f718000300170000
unable to decode {replica 24} 554d53d3000000180000 554d54a4000200180000
:-(
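
For reading the lines above: a 389 DS CSN is 20 hex digits, made
up of an 8-digit timestamp, a 4-digit sequence number, a 4-digit
replica id and a 4-digit sub-sequence number. A small sketch to
pull those apart (the names here are only for illustration, and
the two values per line are assumed to be the oldest and newest
CSN seen for that replica):

```python
import re
from typing import List, NamedTuple, Tuple


class Csn(NamedTuple):
    timestamp: int   # seconds since the Unix epoch
    seqnum: int
    replica_id: int
    subseqnum: int


def parse_csn(text: str) -> Csn:
    """Split a 20-hex-digit CSN into its four fields."""
    if not re.fullmatch(r"[0-9a-fA-F]{20}", text):
        raise ValueError(f"not a 20-digit hex CSN: {text!r}")
    return Csn(
        timestamp=int(text[0:8], 16),
        seqnum=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseqnum=int(text[16:20], 16),
    )


UNDECODABLE = re.compile(
    r"unable to decode \{replica (\d+)\} ([0-9a-fA-F]{20}) ([0-9a-fA-F]{20})"
)


def undecodable_entries(output: str) -> List[Tuple[int, Csn, Csn]]:
    """Return (replica id, first CSN, second CSN) per 'unable to decode' line."""
    return [
        (int(m.group(1)), parse_csn(m.group(2)), parse_csn(m.group(3)))
        for m in UNDECODABLE.finditer(output)
    ]
```

The replica id embedded in each CSN (0x0010, 0x0017, 0x0018) matches
the {replica 16}, {replica 23} and {replica 24} labels, and the
timestamps show when each of those replicas last originated a change.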
~J