Details

Description

On large directory structures it takes significant time to iterate through the file and directory counts recursively to get a complete ContentSummary.
When you want to just check for the quota on a higher level directory it would be good to have an option to skip the file and directory counts.

Moreover, currently one can only check the quota if you have access to all the directories underneath. For example, if I have a large home directory under /user/joep and I host some files for another user in a sub-directory, the moment they create an unreadable sub-directory under my home I can no longer check what my quota is. Understood that I cannot check the current file counts unless I can iterate through all the usage, but for administrative purposes it is nice to be able to get the current quota setting on a directory without the need to iterate through and run into permission issues on sub-directories.

Activity

This would solve a significant annoyance with computing quotas on a shared tree. However I think it has security implications. If one can get the quota totals for the entire tree then they can calculate what must be used by the parts they cannot access via quota_usage - usage_visible. If what is being stored in the restricted area is sensitive (e.g.: records related to financials) then knowing how many files or the size of the restricted data could leak sensitive information.

Jason Lowe
added a comment - 14/Aug/15 18:53 This would solve a significant annoyance with computing quotas on a shared tree. However I think it has security implications. If one can get the quota totals for the entire tree then they can calculate what must be used by the parts they cannot access via quota_usage - usage_visible. If what is being stored in the restricted area is sensitive (e.g.: records related to financials) then knowing how many files or the size of the restricted data could leak sensitive information.

So it sounds like we're discussing two things here:
1) Getting the quota itself for a directory that a user has access to. There seems to be little security concerns with this.
2) Getting the quota, and the "ContentSummary" / count / usage for a directory that a user has access to, even if they might not have access to all the sub-directories. This is where Jason Lowe pointed out that there could be a potential security implication.

Even with yielding the NN lock, it seems the NN can still lock for ~1 sec per 10M files in a sub-directory to check the entire sub-directory sub-directory tree for permissions.
To address the potential security implications for 2) we could either make this a cluster-wide (final) config value, or we could do something with an extended attribute on the directory itself to allow or disallow a particular directory to be traversed (or not).

1) would give a huge performance boost for the cases when people just want to know what the quota is.
2) would give a huge performance boost for the cases when people want to know a quota plus what's left for large directories relatively high in the directory structure (let alone / on a huge namespace of many tens of millions of files).

Joep Rottinghuis
added a comment - 11/Sep/15 22:21 So it sounds like we're discussing two things here:
1) Getting the quota itself for a directory that a user has access to. There seems to be little security concerns with this.
2) Getting the quota, and the "ContentSummary" / count / usage for a directory that a user has access to, even if they might not have access to all the sub-directories. This is where Jason Lowe pointed out that there could be a potential security implication.
Even with yielding the NN lock, it seems the NN can still lock for ~1 sec per 10M files in a sub-directory to check the entire sub-directory sub-directory tree for permissions.
To address the potential security implications for 2) we could either make this a cluster-wide (final) config value, or we could do something with an extended attribute on the directory itself to allow or disallow a particular directory to be traversed (or not).
1) would give a huge performance boost for the cases when people just want to know what the quota is.
2) would give a huge performance boost for the cases when people want to know a quota plus what's left for large directories relatively high in the directory structure (let alone / on a huge namespace of many tens of millions of files).

Can we provide optimizations for specific use cases without sacrificing the overall security requirement Jason mentioned above?

Allow super users to get a simplified version of ContentSummary quickly. To add to the Joep’s scenarios, we have scenarios which require quota and high-level usage “quota + namespace files/dirs count + disk consumption”, which is a subset of what ContentSummary provides(file length, distinction between files and dirs aren’t important). If we provide a ContentSummaryV2 with only these fields, NN can just return usage data cached on directory objects with quota set. For regular user or directories without quota set, traversal is still required.

Support other users besides super users. The permissions check is already skipped if the caller is a super user. If we define something like power user which have less power than super user but have read access to all directories, we can apply the same optimization to these power users. At the minimal, proxy users that can represent everyone should be treated like super users for the getContentSummary scenario.

Ming Ma
added a comment - 15/Sep/15 18:03 Can we provide optimizations for specific use cases without sacrificing the overall security requirement Jason mentioned above?
Allow super users to get a simplified version of ContentSummary quickly. To add to the Joep’s scenarios, we have scenarios which require quota and high-level usage “quota + namespace files/dirs count + disk consumption”, which is a subset of what ContentSummary provides(file length, distinction between files and dirs aren’t important). If we provide a ContentSummaryV2 with only these fields, NN can just return usage data cached on directory objects with quota set. For regular user or directories without quota set, traversal is still required.
Support other users besides super users. The permissions check is already skipped if the caller is a super user. If we define something like power user which have less power than super user but have read access to all directories, we can apply the same optimization to these power users. At the minimal, proxy users that can represent everyone should be treated like super users for the getContentSummary scenario.

Here is the patch just to illustrate the idea. Applications have the option to get QuotaUsage for any directory that has quota set. It contains the quota and the usage. This allows NN to directly use the cached data.

For a regular user, NN's recursive file permission check still takes time, but at least getting the actual usage is fast. So the overall latency of getting quota usage is faster than getContentSummary. For a super user, given there is no more traversal so it will just take few milliseconds for any large directory.

Ming Ma
added a comment - 16/Sep/15 18:05 Here is the patch just to illustrate the idea. Applications have the option to get QuotaUsage for any directory that has quota set. It contains the quota and the usage. This allows NN to directly use the cached data.
For a regular user, NN's recursive file permission check still takes time, but at least getting the actual usage is fast. So the overall latency of getting quota usage is faster than getContentSummary. For a super user, given there is no more traversal so it will just take few milliseconds for any large directory.

Here is the updated patch with new unit tests. Another change in the updated patch is to have ContentSummary reuse QuotaUsage structure given they have lots of overlaps. I have manually verified that existing applications compiled with the old ContentSummary will still work with the new binary without recompilation. Procobuf definition for these two structures remain separate.

Ming Ma
added a comment - 02/Nov/15 17:11 Here is the updated patch with new unit tests. Another change in the updated patch is to have ContentSummary reuse QuotaUsage structure given they have lots of overlaps. I have manually verified that existing applications compiled with the old ContentSummary will still work with the new binary without recompilation. Procobuf definition for these two structures remain separate.

Ming Ma
added a comment - 04/Nov/15 05:20 Thanks Vinayakumar B ! Here is the updated patch that addresses your comments, findbugs and many of the checkstyle issues.
3. Option added to Count command needs document update.
It looks like the document also misses the storage type quota. I updated it as part of this change. If more work is needed for storage type quota, I can open a new jira.
4. Following code too is expected to be inside fsd lock.
Fixed. good point. It seems getFileInfo and getContentSummary have similar issues. Any ideas? If we need to fix those cases, maybe it is better to open a new jira for that.

mvninstall failure is due to the known issue, where another job on the same machine updated mvn cache for hadoop-hdfs-project/hadoop-hdfs-client and thus caused the build of hadoop-hdfs-project/hadoop-hdfs to fail. The failed unit tests aren't related. Vinayakumar B, Kihwal Lee, any more comments?

Ming Ma
added a comment - 19/Nov/15 17:48 mvninstall failure is due to the known issue, where another job on the same machine updated mvn cache for hadoop-hdfs-project/hadoop-hdfs-client and thus caused the build of hadoop-hdfs-project/hadoop-hdfs to fail. The failed unit tests aren't related. Vinayakumar B , Kihwal Lee , any more comments?

How does a new client work with an older namenode that does not implement getQuotaUsage?
If we are not planning on making it work, we should probably show more user friendly error message.
Other than that the patch looks good. Also it still applies!

Kihwal Lee
added a comment - 19/Jan/16 18:11 How does a new client work with an older namenode that does not implement getQuotaUsage?
If we are not planning on making it work, we should probably show more user friendly error message.
Other than that the patch looks good. Also it still applies!

Ming Ma
added a comment - 20/Jan/16 00:42 Thanks Kihwal Lee !
In the case of new client talking to older namenode, we could have DFSClient fall back to use the existing getContentSummary RPC. Here is the updated patch. I have tested this specific case manually.

The patch looks good.
The checkstyle warnings seem to be due to the builder methods being non-void. I think we can ignore that.
Please make sure the unit test failures are unrelated and whether there already is a jira that reported the issue.

Kihwal Lee
added a comment - 21/Jan/16 15:53 The patch looks good.
The checkstyle warnings seem to be due to the builder methods being non-void. I think we can ignore that.
Please make sure the unit test failures are unrelated and whether there already is a jira that reported the issue.

Ran these local tests locally with java 1.7 and 1.8. They all passed. No jiras have been opened so far. Let me open a jira for TestNNThroughputBenchmark. It has had issues as reported in other jenkins jobs.

Ming Ma
added a comment - 21/Jan/16 17:53 Ran these local tests locally with java 1.7 and 1.8. They all passed. No jiras have been opened so far. Let me open a jira for TestNNThroughputBenchmark. It has had issues as reported in other jenkins jobs.

Zhe Zhang
added a comment - 12/Sep/16 21:31 Quick note that I'm working on backporting HDFS-10744 to branch-2.7; since this change is not in branch-2.7, the overlapping part won't be backported.
If someone plans to backport this to branch-2.7, please remember to add the optimization from HDFS-10744 .