Bikas Saha
added a comment - 20/Nov/12 18:40

Both stores have a similar design. Each app's data and each attempt's data is stored in its own znode/file. The names of the znodes/files are the app and attempt IDs. This creates discrete units of writes for each store operation and reduces potential areas of contention for downstream HA work.
For loading the stored state, each app and attempt is parsed into its corresponding object. Attempts are then placed into collections for their corresponding apps, and orphans are discarded.
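
A minimal sketch of that load-time reassembly, assuming hypothetical StoredApp/StoredAttempt types rather than the real RMState classes: attempts are attached to their parent app, and orphans are dropped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch only: reassembling recovered state. StoredApp, StoredAttempt and
 * their fields are illustrative stand-ins, not the actual RMStateStore types.
 */
public class RecoveryAssemblySketch {

  static class StoredApp {
    final String appId;
    final List<StoredAttempt> attempts = new ArrayList<>();
    StoredApp(String appId) { this.appId = appId; }
  }

  static class StoredAttempt {
    final String appId;      // parent application id
    final String attemptId;
    StoredAttempt(String appId, String attemptId) {
      this.appId = appId;
      this.attemptId = attemptId;
    }
  }

  /** Attach each attempt to its parent app; attempts with no parent are discarded. */
  static Map<String, StoredApp> assemble(List<StoredApp> apps,
                                         List<StoredAttempt> attempts) {
    Map<String, StoredApp> byId = new HashMap<>();
    for (StoredApp app : apps) {
      byId.put(app.appId, app);
    }
    for (StoredAttempt attempt : attempts) {
      StoredApp parent = byId.get(attempt.appId);
      if (parent != null) {
        parent.attempts.add(attempt);
      }
      // else: orphan attempt whose application znode/file is missing -> dropped
    }
    return byId;
  }

  public static void main(String[] args) {
    List<StoredApp> apps = List.of(new StoredApp("application_1"));
    List<StoredAttempt> attempts = List.of(
        new StoredAttempt("application_1", "appattempt_1_000001"),
        new StoredAttempt("application_9", "appattempt_9_000001")); // orphan
    Map<String, StoredApp> state = assemble(apps, attempts);
    System.out.println(state.get("application_1").attempts.size()); // prints 1
  }
}
```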

Tom White
added a comment - 23/Nov/12 15:37

Some initial feedback on FileSystemRMStateStore (I haven't looked at the ZK store):
There are references to znodes in FileSystemRMStateStore.
Failure to read a file in FileSystemRMStateStore should not cause the whole recovery process to fail, just that particular application or attempt.
The literals "application_" and "appattempt_" should be made into constants that live in ApplicationId and ApplicationAttemptId.
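
A sketch of the per-entry error handling being asked for, written against the Hadoop FileSystem API. The prefix constants, class name and the recovered map are illustrative, not the actual FileSystemRMStateStore members.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch only: a single unreadable entry skips that application rather than
 * aborting the whole recovery, and the id prefixes live in named constants.
 */
public class TolerantRecoverySketch {

  static final String APPLICATION_PREFIX = "application_";
  static final String APP_ATTEMPT_PREFIX = "appattempt_";  // attempts handled analogously

  Map<String, byte[]> loadApplications(FileSystem fs, Path rootDir)
      throws IOException {
    Map<String, byte[]> recovered = new HashMap<>();
    for (FileStatus node : fs.listStatus(rootDir)) {
      String name = node.getPath().getName();
      if (!name.startsWith(APPLICATION_PREFIX)) {
        continue;
      }
      try (InputStream in = fs.open(node.getPath())) {
        recovered.put(name, in.readAllBytes());
      } catch (IOException e) {
        // A corrupt or unreadable file costs only this application;
        // the rest of the recovery continues.
        System.err.println("Skipping unreadable state for " + name + ": " + e);
      }
    }
    return recovered;
  }
}
```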

Bikas Saha
added a comment - 24/Nov/12 09:12

At this point I have tried to make the logic fail-safe, keeping in mind the imminent HA work, where load/store operations need to make sure that partial operations don't happen. Errors could be caused either by a store error or by loss of master status. Loss of master status may actually manifest as a store error in many cases. Hence I have chosen to bail out on every error in the store. Of course, this may change to allow partial reads if such optimizations are deemed safe. Does that help clarify?
That's also partly why I am currently throwing Exception everywhere. I am not yet sure about all the different exceptions that might validly be thrown by different store implementations. In the near term, I do expect these to narrow down at least to IOException for a pure store error (e.g. the NN is unavailable) and an HAException that would clearly show that loss of master status has occurred. Until some preliminary HA work provides more input on this, I thought I would leave the exceptions generic for now. All interfaces are marked Evolving for this reason.
My next patch should address the remaining feedback and any other new comments that you might have.
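
A rough illustration of the bail-out policy being described, with a hypothetical StateStoreOp callback and notifyFatal hook standing in for whatever the store dispatch ends up looking like; the exception remains deliberately generic until it can be split into store vs. HA errors.

```java
/**
 * Sketch only: any error during a store operation is treated as fatal, since
 * partial writes are not tolerated. StateStoreOp and notifyFatal are
 * hypothetical names, not the real RMStateStore API.
 */
public class FailFastStoreSketch {

  interface StateStoreOp {
    void run() throws Exception;   // deliberately generic for now
  }

  private final Runnable notifyFatal;

  public FailFastStoreSketch(Runnable notifyFatal) {
    this.notifyFatal = notifyFatal;
  }

  void storeOrBail(StateStoreOp op) {
    try {
      op.run();
    } catch (Exception e) {
      // Could be a pure store error or a loss of master status; until the
      // two can be told apart, stop rather than continue with state that may
      // be half-persisted.
      System.err.println("State store operation failed, bailing out: " + e);
      notifyFatal.run();
    }
  }
}
```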

Tom White
added a comment - 27/Nov/12 17:12

Errors could either be because of store error or loss of master status.

Do you mean loss of master status of the RM doing the store? I wouldn't expect the store to know about the master's status, since that would be handled at a higher level (i.e. by the RM itself). So IOException would be sufficient, at least for this JIRA.

All interfaces are marked Evolving for this reason.

It would be better to use Unstable until the RM HA work is done. Evolving would mean that you couldn't change the exceptions between 2.0.3 and 2.0.4 (say).
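
For reference, the annotation change being suggested would look roughly like this; the interface and method shown are placeholders, not the real RMStateStore signature. Hadoop's InterfaceStability annotations express how much the API may change between releases, and Unstable gives no compatibility guarantee, which leaves room to tighten the exception types later.

```java
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

/**
 * Sketch only: marking the state-store interface Unstable rather than
 * Evolving, so the thrown exception types can still change between point
 * releases while the HA work settles.
 */
@InterfaceAudience.Private
@InterfaceStability.Unstable
public interface RMStateStoreSketch {
  // Deliberately generic for now; expected to narrow to IOException plus an
  // HA-specific exception once that work lands.
  void storeApplicationState(byte[] serializedState) throws Exception;
}
```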

Bikas Saha
added a comment - 27/Nov/12 17:25

No. I mean RMStateStoreImpl catching store exceptions and figuring out whether master status has been lost or it's an unrelated store error, e.g. an error saying the store has disallowed access vs. an error saying the store is unavailable. The first would be an HA exception meaning the RM should stop, while the second would mean a loss of the store, which may be tolerable when storing non-critical information like node health. So I would like to narrow down from Exception into a better subset as we progress, instead of making everything look like an IOException now.
Good point. I will mark stuff as Unstable instead for both this and YARN-230.
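
A sketch of the classification being described, with a hypothetical RMFencedException standing in for the eventual HA exception type and illustrative handler names.

```java
/**
 * Sketch only: splitting "we are no longer the master" from "the store is
 * unavailable". RMFencedException, handleStoreError and shutdownThisRM are
 * hypothetical names, not the real RM code.
 */
public class StoreErrorClassifierSketch {

  /** Hypothetical marker for "the store rejected us because mastership was lost". */
  static class RMFencedException extends Exception {
    RMFencedException(String msg) { super(msg); }
  }

  void handleStoreError(Exception e, boolean criticalWrite) {
    if (e instanceof RMFencedException) {
      // Access was disallowed: another RM has taken over, so this RM must stop.
      shutdownThisRM(e);
    } else if (criticalWrite) {
      // Store unavailable while persisting critical state (apps/attempts):
      // still fatal, since recovery would otherwise be incomplete.
      shutdownThisRM(e);
    } else {
      // Store unavailable for non-critical data (e.g. node health): tolerable.
      System.err.println("Ignoring store error for non-critical data: " + e);
    }
  }

  private void shutdownThisRM(Exception cause) {
    System.err.println("Fatal state-store error, stopping RM: " + cause);
  }
}
```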

Tom White
added a comment - 28/Nov/12 12:39

I was assuming that the store was HA, since it would be backed by HDFS or ZK.
Anyway, I'm fine with refining the set of exceptions in follow-on JIRAs as this changes, even though it probably is just IOException at this point.

Bikas Saha
added a comment - 21/Dec/12 21:39

Code Warning
IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.zkClient; locked 81% of time
IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.zkSessionTimeout; locked 50% of time

These are the warnings. I looked at the code and I don't see a synchronization issue. Maybe a different pair of eyes might spot something. If it's clean, I will disable the warning for that part of the code.
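
For context, this is the shape of code the FindBugs IS detector flags: a field accessed under the object lock on most paths but read or written without it on others. The fields and methods below are illustrative, not the actual ZKRMStateStore members.

```java
/**
 * Sketch only: the "locked N% of time" ratio comes from counting locked vs.
 * unlocked accesses to the same field.
 */
public class InconsistentSyncSketch {
  private Object zkClient;          // guarded by "this" on most accesses
  private int zkSessionTimeout;     // written under the lock, read without it

  public synchronized void reconnect(Object newClient) {
    zkClient = newClient;           // locked write: counted by FindBugs
  }

  public synchronized void setSessionTimeout(int timeoutMs) {
    zkSessionTimeout = timeoutMs;   // locked write
  }

  public void readUnlocked() {
    // Unlocked reads: these drive the ratio below 100%. Possible fixes are to
    // take the same lock here, make the field volatile if a bare read is all
    // that is needed, or suppress the warning once the accesses are verified
    // safe (e.g. the field is set only before other threads start).
    if (zkClient != null) {
      System.out.println("session timeout = " + zkSessionTimeout);
    }
  }
}
```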

Arun C Murthy
added a comment - 21/Jan/13 18:20

Bikas Saha, apologies for taking this long to respond, but it looks great.
I'd really like to get this into 2.0.3-alpha to go with YARN-230, but I'm worried about the ZK changes (and since I'm not an expert I can't go very deep into them either).
So, as a compromise, would you mind splitting this up into 2 patches? a) FS Store b) ZK Store.
This way I can commit the FS Store into 2.0.3 and then commit the ZK Store into trunk for a subsequent release. I know it's unfair given I've taken so long to get around to this, but I'd really appreciate it if you could accommodate my request for 2.0.3. Thanks!

Bikas Saha
added a comment - 22/Jan/13 18:07

Hitesh, createConnection() is private and called from within synchronized functions. ZKAction operations are invoked from within synchronized functions. Let me know if you see that not happening.
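
A sketch of the locking discipline being described, with illustrative names: the private helper and the ZKAction callback only ever run from synchronized entry points, so the client handle is never touched without the lock.

```java
/**
 * Sketch only: class, field and method names are placeholders, not the actual
 * ZKRMStateStore code.
 */
public class LockedZkAccessSketch {

  interface ZKAction<T> {
    T run(Object zkClient) throws Exception;
  }

  private Object zkClient;

  public synchronized void start() throws Exception {
    createConnection();                       // lock already held
  }

  public synchronized <T> T runWithRetries(ZKAction<T> action) throws Exception {
    if (zkClient == null) {
      createConnection();                     // still under the same lock
    }
    return action.run(zkClient);              // action runs while locked
  }

  // Private: only reachable from synchronized methods, so it never touches
  // zkClient without the lock.
  private void createConnection() {
    zkClient = new Object();                  // placeholder for a ZooKeeper handle
  }
}
```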

Hitesh Shah
added a comment - 22/Jan/13 20:01

A couple of minor comments on the FS patch:
FSRMStateStore class:
needs a default value for fsWorkingPath in case YarnConfiguration.FS_RM_STATE_STORE_URI is not defined
should "FileStatus[] childNodes = fs.listStatus(fsRootDirPath);" use the listStatus overload that accepts a path filter?