Details

Description

KMSClientProvider is designed to be shared by different KMS clients. When HDFS Namenode as KMS client talks to KMS to generateEncryptedKey for new file creation from proxy user (hive, oozie), the proxyuser handling for KMSClientProvider in this case is unnecessary, which cause 1) an extra proxy user configuration allowing hdfs user to proxy its clients and 2) KMS acls to allow non-hdfs user for GENERATE_EEK operation.

This ticket is opened to always use HDFS namenode login user (hdfs) when talking to KMS to generateEncryptedKey for new file creation. This way, we have a more secure KMS based HDFS encryption (we can set kms-acls to allow only hdfs user for GENERATE_EEK) with less configuration hassle for KMS to allow hdfs to proxy other users.

Xiao Chen, thanks for the review. This reason for not repro with the unit test is not HDFS-9405. There are 150 key (hadoop.security.kms.client.encrypted.key.cache.size*hadoop.security.kms.client.encrypted.key.cache.low-watermark = 500*0.3=150) getting pre-created upon encryption zone creation. Change the unit test to hadoop.security.kms.client.encrypted.key.cache.size = 4 and hadoop.security.kms.client.encrypted.key.cache.low-watermark=0.5 so that the refill of EDEK cache happen upon the 3rd file creation.

Update the unit test based on that and now we can repro the original issue without the code fix in the patch with the exception stack below.

Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User: hdfs/localhost@EXAMPLE.COM is not allowed to impersonate oozie_user
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:616)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:574)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.access$200(KMSClientProvider.java:91)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider$EncryptedQueueRefiller.fillQueueForKey(KMSClientProvider.java:146)
at org.apache.hadoop.crypto.key.kms.ValueQueue.getAtMost(ValueQueue.java:299)

Also update the unit test to remove the try/catch as suggested. Please review and let me know your thoughts. Thanks!

Xiaoyu Yao
added a comment - 10/Aug/16 19:22 Xiao Chen , thanks for the review. This reason for not repro with the unit test is not HDFS-9405 . There are 150 key (hadoop.security.kms.client.encrypted.key.cache.size*hadoop.security.kms.client.encrypted.key.cache.low-watermark = 500*0.3=150) getting pre-created upon encryption zone creation. Change the unit test to hadoop.security.kms.client.encrypted.key.cache.size = 4 and hadoop.security.kms.client.encrypted.key.cache.low-watermark=0.5 so that the refill of EDEK cache happen upon the 3rd file creation.
Update the unit test based on that and now we can repro the original issue without the code fix in the patch with the exception stack below.
Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User: hdfs/localhost@EXAMPLE.COM is not allowed to impersonate oozie_user
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:616)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:574)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.access$200(KMSClientProvider.java:91)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider$EncryptedQueueRefiller.fillQueueForKey(KMSClientProvider.java:146)
at org.apache.hadoop.crypto.key.kms.ValueQueue.getAtMost(ValueQueue.java:299)
Also update the unit test to remove the try/catch as suggested. Please review and let me know your thoughts. Thanks!

The change LGTM too, but the test is passing even without the fix. I think (not debugged, sorry if not correct) this is because NN will warm up the cache after HDFS-9405, so the test didn't trigger the KMS ACL check.
Why createFile is done 3 times in the test? Is it for cache draining? I think we could set the cache size to 1 make it fail if so.

Xiao Chen
added a comment - 02/Aug/16 05:52 Thanks Xiaoyu Yao for revving!
The change LGTM too, but the test is passing even without the fix. I think (not debugged, sorry if not correct) this is because NN will warm up the cache after HDFS-9405 , so the test didn't trigger the KMS ACL check.
Why createFile is done 3 times in the test? Is it for cache draining? I think we could set the cache size to 1 make it fail if so.
Also a nit: in the test, can we remove this?
try {
...
} catch (IOException e) {
throw new IOException(e);
}

Thanks Xiaoyu Yao for opening the issue and the patch.
I think the idea makes sense, since from HDFS perspective the only user needs to generate EDEK is hdfs. Ping Andrew Wang for awareness.

Regarding checkTGTAndReloginFromKeytab, you're absolutely right that we don't need it in the client code here. I think adding it to KerberosAuthencitator makes sense logically, and in that case we don't need these in DTA any more.

I didn't put it there in HADOOP-13255 because KA is in hadoop-auth component, while DTA and UGI are both in hadoop-common. Feels like we'll need a dependency between the two in order to add this... Let's follow up on this in the separate jira.

Xiao Chen
added a comment - 19/Jul/16 01:18 Thanks Xiaoyu Yao for opening the issue and the patch.
I think the idea makes sense, since from HDFS perspective the only user needs to generate EDEK is hdfs . Ping Andrew Wang for awareness.
Regarding checkTGTAndReloginFromKeytab , you're absolutely right that we don't need it in the client code here. I think adding it to KerberosAuthencitator makes sense logically, and in that case we don't need these in DTA any more.
public void authenticate(URL url, AuthenticatedURL.Token token)
throws IOException, AuthenticationException {
if (!hasDelegationToken(url, token)) {
// check and renew TGT to handle potential expiration
UserGroupInformation.getCurrentUser().checkTGTAndReloginFromKeytab();
authenticator.authenticate(url, token);
}
}
I didn't put it there in HADOOP-13255 because KA is in hadoop-auth component, while DTA and UGI are both in hadoop-common. Feels like we'll need a dependency between the two in order to add this... Let's follow up on this in the separate jira.

checkTGTAndReloginFromKeytab is not needed with HADOOP-13255 per discussion with Jitendra Nath Pandey. Adding a patch v1 for that.
I'm working on the unit test of this and will update the patch again later.

I also found a potential issue with HADOOP-13255 where the checkTGTAndReloginFromKeytab is invoked with only DelegationTokenAuthenticator#authenticate but not KerberosAuthenticator#authenticate. This is not an issue now because we currently don't use KerberosAuthenticator directly. Only DelegationTokenAuthenticator or KerberosDelegationTokenAuthenticator are being used. Since both KerberosAuthenticator and DelegationTokenAuthenticator implement the Authenticator interface, it is good to have checkTGTAndReloginFromKeytab added to authenticate implementations for consistency. I will open a separate ticket for it.

Xiaoyu Yao
added a comment - 19/Jul/16 00:00 checkTGTAndReloginFromKeytab is not needed with HADOOP-13255 per discussion with Jitendra Nath Pandey . Adding a patch v1 for that.
I'm working on the unit test of this and will update the patch again later.
I also found a potential issue with HADOOP-13255 where the checkTGTAndReloginFromKeytab is invoked with only DelegationTokenAuthenticator#authenticate but not KerberosAuthenticator#authenticate . This is not an issue now because we currently don't use KerberosAuthenticator directly. Only DelegationTokenAuthenticator or KerberosDelegationTokenAuthenticator are being used. Since both KerberosAuthenticator and DelegationTokenAuthenticator implement the Authenticator interface, it is good to have checkTGTAndReloginFromKeytab added to authenticate implementations for consistency. I will open a separate ticket for it.
cc: Xiao Chen and Zhe Zhang for additional feedback.