Details

Description

With the help of AsyncDiskService from MAPREDUCE-1213, we should be able to delete files from distributed cache asynchronously.

That will help make task initialization faster, because task initialization calls the code that localizes files into the cache and may delete some other files.
The deletion can slow down the task initialization speed.

Amareshwari Sriramadasu
added a comment - 16/Dec/09 07:52 There is known issue with deletion, see MAPREDUCE-1141 . Deletion should not happen by task localization, it should be done by separate thread in TaskTracker (CleanupQueue ?).

deletion of contents of work dir in TaskTracker.SetupWorkDir() should also be done asynchronously(as in MAPREDUCE-1213, by first moving to toBeDeleted and then removing by a separate thread(could be CleanupQueue)) sothat the next task in the same jvm(in case of jvm reuse) can start immediately. Currently this deletion is done inline, which can take a long time based on the number of files existing in work dir.

Ravi Gummadi
added a comment - 16/Dec/09 08:16 deletion of contents of work dir in TaskTracker.SetupWorkDir() should also be done asynchronously(as in MAPREDUCE-1213 , by first moving to toBeDeleted and then removing by a separate thread(could be CleanupQueue)) sothat the next task in the same jvm(in case of jvm reuse) can start immediately. Currently this deletion is done inline, which can take a long time based on the number of files existing in work dir.

Zheng Shao
added a comment - 21/Dec/09 19:43 The test errors are not related.
The first one seems like random error - it didn't appear on the last hudson test of the same patch.
The next 3 are common to all patches' test results.
>>><<< org.apache.hadoop.security.authorize.TestServiceLevelAuthorization.testServiceLevelAuthorization
>>><<< org.apache.hadoop.streaming.TestStreamingExitStatus.testMapFailOk
>>><<< org.apache.hadoop.streaming.TestStreamingExitStatus.testReduceFailOk
>>><<< org.apache.hadoop.streaming.TestStreamingKeyValue.testCommandLine

dhruba borthakur
added a comment - 24/Dec/09 18:56 The code looks good to me. What happens to a file if the TaskTracker dies after renaming the file but before the async thread gets around to deleting the file?

The code didn't create a single task for toBeDeleted, but went through the toBeDeleted directory and create one task per each.
The reason for that is:
1. This allows parallel deletion of the contents inside toBeDeleted
2. A single list call per volume shouldn't take too long
3. If we want to create a single task for toBeDeleted, then we need to rename it to something else, and recreate toBeDeleted, and then move the old one to be a sub directory inside the new toBeDeleted. This will introduce additional intermediate states that may be hard to recover from.

Zheng Shao
added a comment - 29/Dec/09 20:38 The code didn't create a single task for toBeDeleted, but went through the toBeDeleted directory and create one task per each.
The reason for that is:
1. This allows parallel deletion of the contents inside toBeDeleted
2. A single list call per volume shouldn't take too long
3. If we want to create a single task for toBeDeleted, then we need to rename it to something else, and recreate toBeDeleted, and then move the old one to be a sub directory inside the new toBeDeleted. This will introduce additional intermediate states that may be hard to recover from.

#getMRAsyncDiskService() method is not used anywhere. Can we remove it?

We should add javadoc of #cleanupStorage() method as to what one should use instead of this deprecated method.

The method calls this.distributedCacheManager.purgeCache() and asyncDiskService.moveAndDeleteAllVolumes() (at +645-46) are not needed at all because there is a asyncDiskService.moveAndDeleteFromEachVolume(SUBDIR) (at +575) itself that completely clears the taskTracker subdir. They can be removed.

MRAsyncDiskService.java

There is an unused import of Configuration which can be removed.

Annotate MRAsyncDiskService as @InterfaceAudience.Private as it is for internal use only.

Please change the java comment of the class to javadoc. Also we should make it clear that clients of this service should NOT write files themselves in the TOBEDELTED directory in any of the volumes or risk losing such files.

Documentation (javadoc) is needed as to how MRAsyncDiskService(JobConf conf) uses the 'conf' parameter.

Can we rename moveAndDeleteAllVolumes() to something like cleanupAllVolumes() or similar. This is because this method is only moving and deleting the contents of the volumes and not the volumes themselves.

To be more clear, can we also rename moveAndDelete(String absolutePathName) to moveAndDeleteAbsolutePath(String absolutePathName) and moveAndDelete(String volume, String pathName) to moveAndDeleteRelativePath(String volume, String pathName)?

In any case, we should make the absolute/relative difference explicit in the javadoc of both the methods.

getRelativePathName() is bit buggy in that it will return null if any one or both of 'volume' string and 'absolutePathName' string have multiple Path.SEPARATOR characters in them. We should construct normalized paths before working on them.

Similar to above, we should make sure that paths/path fragments are normalized before their comparision in moveAndDeleteAllVolumes() at +289 and before passing to DeleteTask at +246 moveAndDelete(String volume, String pathName)

Javadoc of moveAndDeleteAllVolumes() needs to be fixed, it looks incomplete now

Nit: Need to fix the comment at +288 to use TOBEDELETED instead of SUBDIR

Not related to this JIRA: I think that uniqueId generation should be static and be valid across different MRAsyncDiskService objects. We can file another issue to fix this if you agree that it is a bug here.

JobConf

When deprecating Jobconf.deleteLocalFiles(), we should also modify javadoc to state that MRAsyncDiskService should be used instead.

What about JobConf.deleteLocalFiles(String subDir) and its callers? Another JIRA?

Testing:

Even though the inline removal of files is tested by various other test-cases, we haven't test the actual deletion anywhere. For this, we can refactor the DeleteTask part of moveAndDelete(String, String) into a new method and can then override it with inline deletion to assert the actual deletion of files by verifying the contents of TOBEDELETED directory.

Vinod Kumar Vavilapalli
added a comment - 06/Jan/10 11:10 I've looked at your latest patch. Some comments I have on the patch:
TaskTracker.java
#getMRAsyncDiskService() method is not used anywhere. Can we remove it?
We should add javadoc of #cleanupStorage() method as to what one should use instead of this deprecated method.
The method calls this.distributedCacheManager.purgeCache() and asyncDiskService.moveAndDeleteAllVolumes() (at +645-46) are not needed at all because there is a asyncDiskService.moveAndDeleteFromEachVolume(SUBDIR) (at +575) itself that completely clears the taskTracker subdir. They can be removed.
MRAsyncDiskService.java
There is an unused import of Configuration which can be removed.
Annotate MRAsyncDiskService as @InterfaceAudience.Private as it is for internal use only.
Please change the java comment of the class to javadoc. Also we should make it clear that clients of this service should NOT write files themselves in the TOBEDELTED directory in any of the volumes or risk losing such files.
Documentation (javadoc) is needed as to how MRAsyncDiskService(JobConf conf) uses the 'conf' parameter.
Can we rename moveAndDeleteAllVolumes() to something like cleanupAllVolumes() or similar. This is because this method is only moving and deleting the contents of the volumes and not the volumes themselves.
To be more clear, can we also rename moveAndDelete(String absolutePathName) to moveAndDeleteAbsolutePath(String absolutePathName) and moveAndDelete(String volume, String pathName) to moveAndDeleteRelativePath(String volume, String pathName) ?
In any case, we should make the absolute/relative difference explicit in the javadoc of both the methods.
getRelativePathName() is bit buggy in that it will return null if any one or both of 'volume' string and 'absolutePathName' string have multiple Path.SEPARATOR characters in them. We should construct normalized paths before working on them.
Similar to above, we should make sure that paths/path fragments are normalized before their comparision in moveAndDeleteAllVolumes() at +289 and before passing to DeleteTask at +246 moveAndDelete(String volume, String pathName)
Javadoc of moveAndDeleteAllVolumes() needs to be fixed, it looks incomplete now
Nit: Need to fix the comment at +288 to use TOBEDELETED instead of SUBDIR
Not related to this JIRA: I think that uniqueId generation should be static and be valid across different MRAsyncDiskService objects. We can file another issue to fix this if you agree that it is a bug here.
JobConf
When deprecating Jobconf.deleteLocalFiles(), we should also modify javadoc to state that MRAsyncDiskService should be used instead.
What about JobConf.deleteLocalFiles(String subDir) and its callers? Another JIRA?
Testing:
Even though the inline removal of files is tested by various other test-cases, we haven't test the actual deletion anywhere. For this, we can refactor the DeleteTask part of moveAndDelete(String, String) into a new method and can then override it with inline deletion to assert the actual deletion of files by verifying the contents of TOBEDELETED directory.

Zheng Shao
added a comment - 06/Jan/10 20:11 Modified according to Vinod's comments.
I didn't change the test. I did verify the deletion in the makeSureCleanedUp() method.
I will deprecate JobConf.deleteLocalFiles(subdir) in a follow-up jira.

+1 to the final patch.
I only have a minor comment. Can we change #getRelativePathName() to #isInVolume() instead? Just return true if the absoluteFileName is in the volume. It is actually how this function is used right now. This is not necessary. Just to make it more clear.

Scott Chen
added a comment - 06/Jan/10 23:53 +1 to the final patch.
I only have a minor comment. Can we change #getRelativePathName() to #isInVolume() instead? Just return true if the absoluteFileName is in the volume. It is actually how this function is used right now. This is not necessary. Just to make it more clear.

Scott, I do use the returned value of #getRelativePathName() after comparing it with null.
So the other option is to have 2 functions: #isInVolume() and #getRelativePathName().
I prefer having only 1 function just for simplification.

Zheng Shao
added a comment - 07/Jan/10 00:06 Scott, I do use the returned value of #getRelativePathName() after comparing it with null.
So the other option is to have 2 functions: #isInVolume() and #getRelativePathName().
I prefer having only 1 function just for simplification.

Zheng, sorry for the delay, got caught up with other things. MAPREDUCE-1186 got committed which invalidated your latest patch. It changed the constructor of MRAsyncDiskService and Tasktracker.initialize(). Can you please generate a new one?

I myself manually merged the changes for review of code. Rest of the changes look good. I'll do a +1 once you upload the updated patch. Thanks for being patient!

Vinod Kumar Vavilapalli
added a comment - 11/Jan/10 07:09 Zheng, sorry for the delay, got caught up with other things. MAPREDUCE-1186 got committed which invalidated your latest patch. It changed the constructor of MRAsyncDiskService and Tasktracker.initialize(). Can you please generate a new one?
I myself manually merged the changes for review of code. Rest of the changes look good. I'll do a +1 once you upload the updated patch. Thanks for being patient!

Amareshwari Sriramadasu
added a comment - 12/Jan/10 10:09 Since Hudson does not run LinuxTaskController tests, can you run TestTrackerDistributedCacheManagerWithLinuxTaskController and make sure it passes?