Activity

I see this is handling the real user for proxy tokens the same way as HDFS-3509 for webhdfs. After my work on HDFS-3553, I suggested on HDFS-3509 that perhaps the doAs required by SPNEGO should be pushed lower in the stack instead of all the callers implementing the same logic at a high level. If so, I think that will provide a solution for httpfs, webhdfs, fetchdt, and future callers so the new logic can be removed in this jira.

Other than that, I'm not confident in my knowledge to fully review the patch, but I'm +1 based on my knowledge and Aaron's +1.

Daryn Sharp
added a comment - 26/Jul/12 14:45 I see this is handling the real user for proxy tokens the same way as HDFS-3509 for webhdfs. After my work on HDFS-3553 , I suggested on HDFS-3509 that perhaps the doAs required by SPNEGO should be pushed lower in the stack instead of all the callers implementing the same logic at a high level. If so, I think that will provide a solution for httpfs, webhdfs, fetchdt, and future callers so the new logic can be removed in this jira.
Other than that, I'm not confident in my knowledge to fully review the patch, but I'm +1 based on my knowledge and Aaron's +1.

Hadoop QA
added a comment - 26/Jul/12 01:47 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12537931/HDFS-3113.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 14 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-httpfs.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2908//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2908//console
This message is automatically generated.

taking care of javac warnings as well as unhanded condition when canceling the delegation token without a token parameter (was setting both BAD_REQUEST and SC_OK as response, now doing BAD_REQUEST only).

Created a follow up JIRA to annotate all classes in HTTPFS as private HDFS-3724

Alejandro Abdelnur
added a comment - 26/Jul/12 01:18 taking care of javac warnings as well as unhanded condition when canceling the delegation token without a token parameter (was setting both BAD_REQUEST and SC_OK as response, now doing BAD_REQUEST only).
Created a follow up JIRA to annotate all classes in HTTPFS as private HDFS-3724

Alejandro Abdelnur
added a comment - 25/Jul/12 17:44 New patch addressing Eli's feedback, except for:
HttpFSKerberosAuthenticator. Why wrap AuthenticationException in IOEs for the token methods?
This ends up used by FileSystem methods which throw only IOExceptions.
DelegationTokenManagerException This class should be public / stable right?
Not really, all this is HttpFS internal.

Alejandro Abdelnur
added a comment - 18/Jul/12 21:06 @eli, thanks for the review. I'll address your comments today or tomorrow.
@daryn, HttpFS issues it own delegation tokens, and they can be used only against the HttpFs service. Hope this clarifies.

Just to make sure I understand: WebHDFS client -> HttpFS -> SomeFS. Is HttpFS acting like a pure proxy and relaying the real token for SomeFS back to the client? If so, what's the purpose of the secret manager in httpfs?

Daryn Sharp
added a comment - 18/Jul/12 21:01 Just to make sure I understand: WebHDFS client -> HttpFS -> SomeFS. Is HttpFS acting like a pure proxy and relaying the real token for SomeFS back to the client? If so, what's the purpose of the secret manager in httpfs?
Regarding token decoding, this:
ByteArrayInputStream buf = new ByteArrayInputStream(token.getIdentifier());
DataInputStream dis = new DataInputStream(buf);
DelegationTokenIdentifier id = new DelegationTokenIdentifier();
id.readFields(dis);
dis.close();
can be replaced with DelegationTokenIdentifier id = token.decodeIdentifier()
BTW, have you tested this with oozie to ensure proxy tokens work correctly?

Eli Collins
added a comment - 18/Jul/12 18:40 Hey Tucu,
Overall looks good, only minor comments follow. What testing have you done with these patches, confirmed distcp against httpfs with kerberos enabled now works?
pom.xml
Intend to comment out <forkMode> always?
ServerWebApp.java
Annotate setAuthority VisibleForTesting?
HttpFSKerberosAuthenticator
Why wrap AuthenticationException in IOEs for the token methods?
DelegationTokenManagerException
This class should be public / stable right?
HttpFSUtils
Use HttpFSFileSystem.SCHEME
HttpFSKerberosAuthenticationHandler
The last part of the authenticate javadoc is not filled out
NetUtils
The diff is a nop

Alejandro Abdelnur
added a comment - 02/Jul/12 20:48 @Daryn, I'm looking at the Token#decodeIdentifier, but I'm sure where I can use that method instead of what the patch is currently doing. Would you please elaborate? thx

If a client uses webhdfs:// they'll be using WebHdfsFileSystem class, the HttpFSFileSystem class is not avail for clients. WebHDFSFileSystem usas the token renewer, and that one will kick in. The WebHdfsFileSystem client will get a token from HttpFS server and will renew it against HttpFS. It is not the intention for a token from WebHdfs server to be used against HttpFS server or viceversa. Does this address your concern?

Alejandro Abdelnur
added a comment - 02/Jul/12 19:21 Would you please elaborate ...
If a client uses webhdfs:// they'll be using WebHdfsFileSystem class, the HttpFSFileSystem class is not avail for clients. WebHDFSFileSystem usas the token renewer, and that one will kick in. The WebHdfsFileSystem client will get a token from HttpFS server and will renew it against HttpFS. It is not the intention for a token from WebHdfs server to be used against HttpFS server or viceversa. Does this address your concern?
I'll look at the Token#decoderIndentifier, thx

I think the new patch addressed the few token service issues I cited. I don't feel comfortable though reviewing the entire patch, so I'd like someone else to help out.

keep in mind this FileSystem implementation is not used by clients, clients use the WebHdfsFileSystem implementation against HttpFS.

Would you please elaborate further on why a token renewer isn't needed? If a job is using web hdfs, and tries to renew the web hdfs token, doesn't that mean web hdfs needs to in turn renew the httpfs token? Or how do web and http fs and their tokens interact?

BTW, Tom added a really nifty Token#decodeIdentifier method that you can use.

Daryn Sharp
added a comment - 02/Jul/12 17:59 I think the new patch addressed the few token service issues I cited. I don't feel comfortable though reviewing the entire patch, so I'd like someone else to help out.
keep in mind this FileSystem implementation is not used by clients, clients use the WebHdfsFileSystem implementation against HttpFS.
Would you please elaborate further on why a token renewer isn't needed? If a job is using web hdfs, and tries to renew the web hdfs token, doesn't that mean web hdfs needs to in turn renew the httpfs token? Or how do web and http fs and their tokens interact?
BTW, Tom added a really nifty Token#decodeIdentifier method that you can use.

Regarding the token renewal, yes I know that is missing from the HttpFSFileSystem implementation. However, keep in mind this FileSystem implementation is not used by clients, clients use the WebHdfsFileSystem implementation against HttpFS. The HttpFSFileSystem implementation is used by the HttpFSServer as well as for the testcases. This could change when we unify the codebase of webhdfs and httpfs.

Alejandro Abdelnur
added a comment - 30/Jun/12 01:20 Thanks Daryn. The new patch takes care of your first 2 comments.
Regarding the token renewal, yes I know that is missing from the HttpFSFileSystem implementation. However, keep in mind this FileSystem implementation is not used by clients, clients use the WebHdfsFileSystem implementation against HttpFS. The HttpFSFileSystem implementation is used by the HttpFSServer as well as for the testcases. This could change when we unify the codebase of webhdfs and httpfs.

Daryn Sharp
added a comment - 29/Jun/12 18:17 This is a big patch, so here's what I've immediately noticed:
The uri should be constructed as URI.create(uri.getScheme()+"://"+uri.getAuthority()) in case it doesn't include the default port. Otherwise the current patch I think will cause -1 to be in the uri.
The httpFSAddr (used for the token service) needs to be based on getCanonicalUri which adds the default port if necessary. Similarly, getDefaultPort should be implemented.
A TokenRenewer is needed that is registered for the service loader.

the findbugs warning of lack of synchronization is in a method used for test initialization, synchronizing is not necessary and it does not make sense as it would no work due to how the getAuthority() method works. I'll add the findbug exclusion at commit time.

Alejandro Abdelnur
added a comment - 29/Jun/12 05:25 the findbugs warning of lack of synchronization is in a method used for test initialization, synchronizing is not necessary and it does not make sense as it would no work due to how the getAuthority() method works. I'll add the findbug exclusion at commit time.