
in reviewing some of the patches lately, i was again bothered by the complexity of outstandingChanges used by PrepRP and FinalRP. Both threads touch that list and prepRP checks both the list and the datatree.

right now datatree has the committed changes, but if we make datatree store the pending changes, then it can be completely maintained by the prepRP. it would also mean that prepRP would execute all operations: both the read and write operations. it turns out that we never take advantage of the fact that datatree contains only committed operations.

in general, i think this would simplify the code, make finalRP go away, allow us to get rid of outstandingChanges and allow us to do all datatree manipulation in the prepRP.
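A minimal sketch of that idea, assuming a toy path-to-data map in place of the real datatree. `PendingTreeSketch`, `ToyTree` and `SingleProcessor` are hypothetical names for illustration, not ZooKeeper classes. Because the one processor applies each write to the tree as soon as it has validated it, the next request is validated against the tree alone and there is no separate list of outstanding changes to consult.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in ZooKeeper.
class PendingTreeSketch {

    /** Toy stand-in for the datatree: maps a path to its data. */
    static final class ToyTree {
        private final Map<String, String> nodes = new HashMap<>();

        boolean exists(String path) { return nodes.containsKey(path); }
        String get(String path) { return nodes.get(path); }
        void put(String path, String data) { nodes.put(path, data); }
        void remove(String path) { nodes.remove(path); }
    }

    /**
     * A single processor that validates and applies every request in arrival
     * order. The tree therefore holds pending (not yet committed) changes, and
     * validation never needs a second data structure.
     */
    static final class SingleProcessor {
        private final ToyTree tree = new ToyTree();

        String create(String path, String data) {
            if (tree.exists(path)) {
                return "NODEEXISTS"; // an earlier pending create is already visible
            }
            tree.put(path, data);
            return "OK";
        }

        String delete(String path) {
            if (!tree.exists(path)) {
                return "NONODE";
            }
            tree.remove(path);
            return "OK";
        }

        /** Reads run in the same processor, so they observe pending writes. */
        String read(String path) {
            return tree.get(path);
        }
    }
}
```

In this sketch a second `create` of the same path fails immediately, even though the first one has not been committed, which is the check that currently needs both the list and the datatree.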

there is a downside: currently we order operations as we get them from the clients, which works with the above. we have occasionally talked about letting read operations short circuit the pipeline if there are no writes from the client issuing the read in the pipeline. short circuited requests would go straight to finalRP. if we implement the above, this optimization would be much harder to implement. (i don't see it as much of an issue since we don't have any plans to implement it currently.)

Benjamin Reed
added a comment - 12/Jun/11 01:59

just to correct slightly my earlier statement. we do indirectly take advantage of the fact that the datatree contains only committed operations: if the leader tells a follower to truncate, we know that we only need to truncate the log; we don't need to worry about removing anything from the data tree since everything in the tree is committed.
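A rough sketch of why truncation is cheap today, assuming a toy follower with an in-memory log and a tree that is only updated on commit. `TruncateSketch`, `Txn` and `Follower` are hypothetical names, and the sketch assumes the leader never asks a follower to truncate below what the follower has already committed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in ZooKeeper.
class TruncateSketch {

    /** One logged transaction: set the data at a path, identified by zxid. */
    static final class Txn {
        final long zxid;
        final String path;
        final String data;

        Txn(long zxid, String path, String data) {
            this.zxid = zxid;
            this.path = path;
            this.data = data;
        }
    }

    static final class Follower {
        final List<Txn> log = new ArrayList<>();
        final Map<String, String> tree = new HashMap<>(); // committed state only
        private int applied = 0; // number of log entries already applied to the tree

        /** A proposal is logged but does not touch the tree yet. */
        void logProposal(long zxid, String path, String data) {
            log.add(new Txn(zxid, path, data));
        }

        /** Commit applies every logged txn up to and including zxid. */
        void commit(long zxid) {
            while (applied < log.size() && log.get(applied).zxid <= zxid) {
                Txn t = log.get(applied++);
                tree.put(t.path, t.data);
            }
        }

        /**
         * Truncate drops uncommitted log entries above zxid. The tree is not
         * touched because it never contained them. (Assumption: committed
         * entries are never truncated, so the loop stops at 'applied'.)
         */
        void truncate(long zxid) {
            while (log.size() > applied && log.get(log.size() - 1).zxid > zxid) {
                log.remove(log.size() - 1);
            }
        }
    }
}
```

If the tree also held pending changes, `truncate` in this sketch would additionally have to undo the dropped transactions in the tree or rebuild it, which is the cost the correction above points at.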

Benjamin Reed
added a comment - 12/Jun/11 04:46

With an immutable DataTree as proposed in ZOOKEEPER-1285, the PrepRP would perform the operation and get a reference to a new immutable DataTree holding the result of the operation.
The FinalRP would then just take the prepared DataTree from a queue and put it into the ZKDatabase as the now committed DataTree.
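One way to picture this hand-off, as a sketch only: `ImmutableTreeSketch`, `ImmutableTree` and `Pipeline` are hypothetical names, not the API proposed in ZOOKEEPER-1285, and the tree here copies the whole map on every write, where a real immutable tree would presumably share structure between versions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; not the design from ZOOKEEPER-1285.
class ImmutableTreeSketch {

    /** Immutable toy tree: every write returns a new instance. */
    static final class ImmutableTree {
        static final ImmutableTree EMPTY =
                new ImmutableTree(Collections.<String, String>emptyMap());

        private final Map<String, String> nodes;

        private ImmutableTree(Map<String, String> nodes) {
            this.nodes = nodes;
        }

        /** Full copy for simplicity; a real tree would share unchanged parts. */
        ImmutableTree with(String path, String data) {
            Map<String, String> copy = new HashMap<>(nodes);
            copy.put(path, data);
            return new ImmutableTree(Collections.unmodifiableMap(copy));
        }

        String get(String path) {
            return nodes.get(path);
        }
    }

    static final class Pipeline {
        // Only the prep stage touches this reference.
        private ImmutableTree latestPrepared = ImmutableTree.EMPTY;
        private final BlockingQueue<ImmutableTree> prepared = new LinkedBlockingQueue<>();
        private final AtomicReference<ImmutableTree> committed =
                new AtomicReference<>(ImmutableTree.EMPTY);

        /** Prep stage: perform the operation and queue the resulting tree. */
        void prep(String path, String data) {
            latestPrepared = latestPrepared.with(path, data);
            prepared.add(latestPrepared);
        }

        /** Final stage: publish the next prepared tree as the committed one. */
        boolean commitNext() {
            ImmutableTree next = prepared.poll();
            if (next == null) {
                return false; // nothing prepared yet
            }
            committed.set(next);
            return true;
        }

        ImmutableTree committedTree() {
            return committed.get();
        }
    }
}
```

Readers of `committedTree()` only ever see a fully committed version, while the prep stage keeps building on its own latest version without locking.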

Thomas Koch
added a comment - 04/Nov/11 16:20