Interesting... I guess you have to add one node at a time and run repair on
it.
On Sat, Jul 20, 2013 at 7:30 AM, E S <tr1sklion@yahoo.com> wrote:
> I am trying to understand the best procedure for adding new nodes. The
> one that I see most often online seems to have a hole where there is a low
> probability of permanently losing data. I want to understand what I am
> missing in my understanding.
>
> Let's say I have a 3 node cluster (node A,B,C) with a RF of 3. I want to
> double the cluster size to 6 (node A,B,C,D,E,F) while keeping the
> replication factor of 3. Let's assume we use vnodes.
>
> My understanding is to bootstrap the 3 nodes and then run repair then
> cleanup. Here is my failure case:
>
> Before bootstrapping I have a row that is only replicated onto node A and
> B. Assume I did a quorum write and there was some hiccup on C, hinted
> handoff didn't work, and a repair has not yet been run. Let's also assume
> that once nodes D,E, F have been bootstrapped, this rows new replicas are
> D,E, and F.
>
> My reading through the bootstrapping code shows that for a given range, it
> streams it only from one node (unlike repair). There is a 1/9 chance that
> D,E,F will have streamed the range containing the row from C, which does
> not have this row.
>
> Now not even a consistency level read of ALL will return the row. A
> repair will not solve it, and when cleanup is run, the row is permanently
> deleted.
>
> I don't think this problem would normally happen without vnodes, because
> when doubling you would alternate the new nodes with the old nodes in the
> ring, so while quorum might not work until the final repair, "all" would,
> and a repair would solve the problem. With vnodes though, some of the
> ranges will follow the pattern above (range ownership moving from A,B,C to
> D,E,F).
>
> Am I missing something here? If I'm right, I think the only way to avoid
> this is adding less then a quorum of new nodes (in this case 1) before
> doing a repair. That would be painful since repairs take a while.
>
> Thanks for any help.
>
> Eddie
>
>
>