Due to the requirement that KSSL use weak encryption types for Kerberos tickets, HTTP authentication to the NameNode will now use SPNEGO by default. This will require users of previous branch-1 releases with security enabled to modify their configurations and create new Kerberos principals in order to use SPNEGO. The old behavior of using KSSL can optionally be enabled by setting the configuration option "hadoop.security.use-weak-http-crypto" to "true".
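For operators who want the legacy behavior described above, the override would look something like this in core-site.xml. This is a hedged sketch: only the option name `hadoop.security.use-weak-http-crypto` comes from the release note; the surrounding layout is just the standard Hadoop property format.

```xml
<!-- core-site.xml: opt back into the legacy KSSL behavior.
     Only the property name is taken from the release note above. -->
<property>
  <name>hadoop.security.use-weak-http-crypto</name>
  <value>true</value>
  <description>Re-enable Kerberized SSL (KSSL) for NameNode HTTP
  authentication instead of the new SPNEGO default.</description>
</property>
```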

Description

The current approach to secure and authenticate nn web services is based on Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now that we have one, we can get rid of the non-standard KSSL and use SPNEGO throughout. This will simplify setup and configuration. Also, Kerberized SSL is a non-standard approach with its own quirks and dark corners (HDFS-2386).
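For readers unfamiliar with SPNEGO: it rides on the standard HTTP Negotiate challenge/response rather than a custom transport. A minimal, illustrative sketch of the message shapes follows — the token here is a placeholder, not a real GSS-API blob, and the function names are invented for illustration.

```python
import base64

def server_challenge():
    # The server rejects an unauthenticated request and asks the
    # client to negotiate (this is the SPNEGO entry point).
    return 401, {"WWW-Authenticate": "Negotiate"}

def client_response(gss_token: bytes):
    # The client retries, attaching its Kerberos GSS token,
    # base64-encoded, in the Authorization header.
    b64 = base64.b64encode(gss_token).decode("ascii")
    return {"Authorization": "Negotiate " + b64}

status, headers = server_challenge()
assert status == 401 and headers["WWW-Authenticate"] == "Negotiate"
retry = client_response(b"\x60\x82placeholder-gss-blob")
assert retry["Authorization"].startswith("Negotiate ")
```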

Sub-Tasks

Activity

Aaron T. Myers
added a comment - 02/Dec/11 01:42 Hey Jakob, I think getting rid of KSSL (e.g. replacing it with SPNEGO) would be a huge improvement. But, I'd like to propose an alternate design which should be just as secure, and potentially simpler to implement.
Given that a checkpoint by the 2NN already has to do a few RPCs before the transfer of the fsimage/edits (to, e.g. initiate an edits log roll), we could easily generate a shared secret token between the NN and 2NN in this RPC, which then could be included as a URL parameter during the fsimage/edits transfers over HTTP. I suspect this will be easier to implement, is just as secure as SPNEGO/KSSL since when security is enabled the RPCs creating the secret token will be authenticated, and has the advantage of making checkpoints execute the same code paths with or without security enabled.
Thoughts?
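The shared-secret design sketched above was never adopted (see the follow-up comments), but the idea is a standard HMAC-over-URL pattern. A self-contained illustration, with every name here hypothetical rather than actual Hadoop code:

```python
import hashlib
import hmac
import secrets

def issue_secret() -> bytes:
    # The NN would hand this to the 2NN over the (authenticated) RPC
    # that initiates the checkpoint.
    return secrets.token_bytes(32)

def sign_url(secret: bytes, path: str) -> str:
    # The 2NN appends the MAC as a URL parameter on the image transfer.
    mac = hmac.new(secret, path.encode(), hashlib.sha256).hexdigest()
    return f"{path}?token={mac}"

def verify_url(secret: bytes, path: str, token: str) -> bool:
    # The NN recomputes the MAC and compares in constant time.
    mac = hmac.new(secret, path.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(mac, token)

secret = issue_secret()
url = sign_url(secret, "/getimage")
token = url.split("token=")[1]
assert verify_url(secret, "/getimage", token)
assert not verify_url(secret, "/getimage", "0" * 64)
```

Because the secret is minted over an already-authenticated RPC, the HTTP transfer needs no Kerberos machinery of its own — which is what made the proposal attractive before it was dropped in favor of SPNEGO.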

Jakob Homan
added a comment - 13/Dec/11 21:32 @Aaron - think about this some more, and not hearing any comments, I think it'd be better to go with SPNEGO for a couple of reasons: (1) keep a consistent approach to the web interfaces for the NN/2NN (we could re-use tokens from the map-output fetch, but it would be a bit messy) and (2) the current kerbssl approach is used to fetch/renew/etc delegation tokens explicitly so we don't have to have an API call (to enabled hftp). Moving to SPNEGO for these would preserve this behavior.
The next question is - how to deprecate the kerbssl. It'll be quite annoying to have to support both for a couple releases.

Aaron T. Myers
added a comment - 13/Dec/11 21:50 @Jakob - those are certainly good reasons. Another benefit of going with the SPNEGO approach is that fsck just goes straight to a URL on the NN, so the approach I proposed would require adding some system to authenticate that as well.
So, +1 to SPNEGO. It seems like my proposal would be more trouble than it's worth.

Aaron T. Myers
added a comment - 21/Dec/11 06:39 The next question is - how to deprecate the kerbssl. It'll be quite annoying to have to support both for a couple releases.
I realize this isn't ideal, but I'm in favor of just ripping out kssl in an incompatible way. I think it's reasonable to expect that the NN and 2NN be upgraded in lockstep. Though it's obviously undesirable to have to match client version to server version just to run fsck, I'm of the opinion that that's an acceptable tradeoff versus the pain of having to support both kssl and something else for a few releases.

Jakob Homan
added a comment - 19/Jan/12 18:55 Are you working on this? Since the change may be incompatible, it seems like we should get this into as early a 23.x release as possible.
Yes. I should have something for 1 next week, and 23 after that. It likely will be incompatible for both, though I'm not necessarily going to push for the 1 patch to be committed (although we'll be using it). It will be very painful to try to support both KerbSSL and SPNEGO, so I'm writing the patch to not do this. I've chatted with Devaraj and Owen and this seems reasonable. The only change will be new keytabs (with an HTTP principal rather than a host principal) and some config changes.

Jakob Homan
added a comment - 25/Jan/12 22:00 Here's a draft patch for 1.0 we're testing here. I'm not planning on committing this for reasons described below. Getting away from Kerberized SSL makes lots of things simpler, most significantly not having to switch users in the NN and 2NN since the spnego filter handles that itself.
One issue that this patch creates is that the next Hadoop release that uses it won't be able to obtain/renew/cancel delegation tokens from earlier clusters since this is done over http, which was supposed to be our never-change protocol. Older clusters will speak kerb-ssl and not be able to support spnego. For this reason, it's probably best to just apply this to trunk.
Supporting both SPNEGO and KerbSSL would be really, really gnarly, so I still don't recommend trying to do that.
Thoughts?

Devaraj Das
added a comment - 01/Feb/12 18:24 Jakob, I should have asked this before .. If you consider a situation where there're multiple clusters some running a version of hadoop with spnego, and some without this (kerbssl), how would the distcp jobs that copy data between these clusters work?
I've seen this usecase quite common in Yahoo! (where not all clusters are upgraded in lock-step)

Tsz Wo Nicholas Sze
added a comment - 07/Feb/12 19:51 > ... Has distcp been updated to include that support?
distcp uses FileSystem API and so there is nothing to change. You can simply use webhdfs:// URIs in the distcp command.
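Nicholas's point above, as an illustrative command line — not runnable without a cluster, and the host names and paths here are made up:

```shell
# Pull data over WebHDFS instead of hftp; the FileSystem API resolves
# the scheme, so distcp itself needs no changes.
hadoop distcp webhdfs://old-nn.example.com:50070/data/src \
              hdfs://new-nn.example.com:8020/data/dst
```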

Alejandro Abdelnur
added a comment - 08/Mar/12 01:22 @Jakob, approach seems reasonable. A couple of things
The AuthenticatedURL.Token could be created once and then reused, then SPNEGO will be triggered only when the token expires instead on every request (if I understand correctly, the default freq is 1hr, thus what you are doing it is not a biggy).
Why do we need to keep the SSL HTTP configuration?
Thxs
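Alejandro's reuse suggestion is a plain cache-until-expiry pattern. A generic sketch follows — nothing here is Hadoop API (AuthenticatedURL.Token itself lives in hadoop-auth); the class and the one-hour default are illustrative, echoing the expiry mentioned above.

```python
import time

class TokenCache:
    """Cache an auth token and only re-negotiate after it expires."""

    def __init__(self, negotiate, ttl_secs=3600, clock=time.monotonic):
        self._negotiate = negotiate   # the expensive SPNEGO round trip
        self._ttl = ttl_secs          # default expiry ~1 hour, per the comment above
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            # Only trigger SPNEGO when there is no valid cached token.
            self._token = self._negotiate()
            self._expires_at = now + self._ttl
        return self._token

calls = []
cache = TokenCache(lambda: calls.append(1) or f"tok{len(calls)}", ttl_secs=10)
t1 = cache.get()
t2 = cache.get()
assert t1 == t2 == "tok1" and len(calls) == 1  # token reused, no second handshake
```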

Jakob Homan
added a comment - 28/Mar/12 02:53 In the meantime can you answer Alejandro's question wrt why we need to keep the SSL HTTP configuration?
We don't. As described in the comments above, the posted patch is what we've deployed here, not the final version to be committed. Removing SSL is a fine thing to do, when I finish the main patch.

Jakob Homan
added a comment - 28/Mar/12 03:47 In terms of when I can get the 23 patch done, I've scheduled time next week. This isn't worth holding up any point release of 23 as it's an improvement rather than a bug fix; don't bother doing so.

Devaraj Das
added a comment - 17/Apr/12 20:35 Alejandro, on your comment, on the SSL HTTP configuration, this existed even before the work on KRB5 ssl connector went in. Look at the patch that introduced krb5sslconnector - http://bit.ly/HQNVGm (and search for dfs.https.enable). I guess Jakob maintained that, and maybe that can be taken out anyway.
The patch doesn't currently apply (not surprising, really).

Alejandro Abdelnur
added a comment - 03/May/12 19:20 updated patch for trunk. After some testing in a real cluster we found the SecondaryNameNode and StandbyCheckpointer were trying to open HTTPS (Thxs ATM for testing/finding this).

After a little more testing of the trunk patch, I found a few more spots where we were still erroneously using the HTTPS port instead of the HTTP port. The only difference between this patch and the last one Tucu uploaded is the following:

Aaron T. Myers
added a comment - 04/May/12 02:56 I'd like to add default values for the new config knobs so that if the user already has webhdfs configured, they don't need to do anything extra.
That seems like a good idea to me. Perhaps it can be done as a separate JIRA, though?

Alejandro Abdelnur
added a comment - 04/May/12 04:29 @atm, I'd add Owen's config patch to this one as it is directly related to the functionality being introduced by this patch (saving one JIRA at a time).

Hadoop QA
added a comment - 04/May/12 05:51 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525563/HDFS-2617-trunk.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated 2 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.fs.viewfs.TestViewFsTrash
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2373//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2373//console
This message is automatically generated.

Aaron T. Myers
added a comment - 04/May/12 06:29 No tests are included since this feature can only be tested with security enabled. I'm confident that the test failure and javadoc warnings are unrelated, and the findbugs warning is likely spurious: HADOOP-8354

Eli Collins
added a comment - 04/May/12 22:53 +1 to the trunk/branch-2 patch. Thanks for all the testing ATM.
DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY/DEFAULT are now dead code and the deprecation of dfs.secondary.https.port can now be removed, but this is trivial, can be handled in a separate jira.

Eli Collins
added a comment - 07/May/12 05:52 DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY/DEFAULT are now dead code and the deprecation of dfs.secondary.https.port can now be removed, but this is trivial, can be handled in a separate jira.
Filed HDFS-3378 for this

Aaron T. Myers
added a comment - 17/May/12 20:28 Should this be marked resolved or are we leaving it open for commit to 1.1?
I'd like to put some version of this patch in 1.1, perhaps with a config option to continue to use KSSL if one wants so we don't necessarily break deployments that are currently successfully using KSSL.
Perhaps we should resolve this one and open a new JIRA along the lines of "Back-port HDFS-2617 to branch-1" ?

Todd Lipcon
added a comment - 18/May/12 21:03 Marking this as an incompatible change, since users need to update the keytabs with new principals, etc. Can someone please fill in the release note with a pointer on how a user needs to update confs/keytabs?

Kihwal Lee
added a comment - 19/Jun/12 19:08 I wish the hftp part had been done in a backward-compatible way. We won't be able to do production-like testing in this state because data cannot be pulled from existing branch-1 based clusters.

Todd Lipcon
added a comment - 23/Jun/12 20:44 Owen: I noticed you're shipping this patch as part of HDP according to your release notes, but you haven't posted anything here. Do you plan on open sourcing your patch?

Hadoop QA
added a comment - 26/Jun/12 06:15 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12533408/hdfs-2617-1.1.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2699//console
This message is automatically generated.

Eli Collins
added a comment - 26/Jun/12 18:49 Hey Owen,
Looks like the patch removes KSSL support entirely, which will break existing 1.x users. How about adding a config option so the new SPNEGO-based solution can be enabled via a config?
Thanks,
Eli

Kihwal Lee
added a comment - 26/Jun/12 22:37 WebHDFS works, but we have customers who built their stuff around Hftp, which has served as a compatibility layer between different releases. This assumption is broken after this jira and it is a bit difficult to provide a transition plan, unless there is a release that supports both.

Kihwal Lee
added a comment - 26/Jun/12 23:30 unless there is a release that supports both
I meant supporting both SPNEGO and krb5ssl on Hftp. If we don't have this, we can't try 2.0 until we deprecate Hftp and have all users transition to webhdfs on 1.x. It's doable but takes time. If Hftp in 2.0 was backward compatible, we would be able to have people move to webhdfs and also try 2.0 at the same time.

Allen Wittenauer
added a comment - 26/Jun/12 23:44 Given that 2.x is a major release, it seems a reasonable time to break HFTP over KSSL especially given that one has to severely cripple their security in order to make secure Hadoop work on recent Kerberos implementations.
It also seems reasonable to explain to users as part of their transition to 2.x from prior releases that this functionality is going away. This primarily is going to sting the early adopters, an audience who has essentially volunteered to be our lab rats. But for the folks who favor stability, now is the time to get the word out to start switching to a 1.x branch with a working WebHDFS. By the time 2.0 is stable and/or ready for those people to deploy, they should be in relatively good shape.
Something else to consider: the impacted audience is likely low, as I suspect most people probably aren't running a 1.x release yet and/or have security turned on. (I'd love to see some stats though. I really hope I'm wrong. However knowing that it took us several months to transition from 0.20.2 to secure 1.x... and part of that time is directly correlated to the lack of the code in this patch... I have a feeling I'm correct.)

Owen O'Malley
added a comment - 29/Jun/12 22:49 There doesn't seem to be a good answer. KSSL depends on weak ciphers and doesn't work at all on RHEL 6. We need to get users migrating off of it as quickly as possible. In fact, I'm currently testing my patch above on branch-1.1 and I think we should include it there.
Would it work to enable the hftp client to fall back to KSSL if it can't connect using SPNEGO or unauthenticated?

Eli Collins
added a comment - 06/Jul/12 16:47 KSSL works on RHEL6, we've tested secure RHEL6 clusters w/o SPNEGO (you have to install the JCE policy file or not use AES-256). Also, WebHDFS isn't yet a reasonable Hftp replacement, e.g. you can't distcp files larger than 24kb, see HDFS-3577.
Given that removing KSSL support is an incompatible change, this is our stable branch, and that we can add SPNEGO support w/o removing KSSL support, I'm -1 on a patch that removes KSSL entirely.

Daryn Sharp
added a comment - 09/Jul/12 15:14 I too would like kssl to be supported, even as just a fallback, a bit longer, because this impacts the ability to migrate data from older clusters not yet upgraded to 1.x+. I'm a bit concerned that webhdfs hasn't (yet) been "battle hardened", so any bugs may severely impact production environments.
From a quick search, it looks like 128 bit encryption is considered weak. 128 bits isn't exactly terrible, so can we just disable <128 bit ciphers?

Allen Wittenauer
added a comment - 09/Jul/12 19:36 No. KSSL is hard-coded by RFC to only use certain ciphers. To put this into terms that many might have an easier time understanding, KSSL is roughly equivalent to WEP in terms of its vulnerability.
I'd also like to point out what our 'spread' looks like:
0.20.2 and lower: insecure only, so irrelevant
0.20.203 through 0.20.205: only had KSSL+hftp
1.0.0 and up: WebHDFS is available
So we're looking at a window of releases of about 5-6 months. Folks that are running something in 0.20.203 through 1.0.1 should really upgrade anyway due to the severity of some of the bugs never mind the security holes that have since been found.

Daryn Sharp
added a comment - 09/Jul/12 21:54 I'm interested in learning the details of why kssl is so bad. I can't find much online except early versions of java 6 had an issue, and a solaris kext for kssl has had a number of problems.
WEP's usage of RC4 is an egregious example of a bad RC4 implementation. WPA also used RC4 (TKIP) in a more sane manner before WPA2 switched to AES. As best I can tell, the java gss doesn't use a WEP-style RC4 impl, and gss also supports AES. Both kssl and spnego are protected via SSL's encryption, and the krb tickets are encrypted. Where is the Achilles' heel that affects kssl but not spnego?

Daryn Sharp
added a comment - 09/Jul/12 23:12 Ack, no, I didn't think to look in the jetty socket factory. That begs the question: can we change the hardcoded value?
My understanding is kerberos is designed to be used on an insecure network, so does ssl provide much benefit? If yes, then why is ssl used to get a token, and then the token is passed in cleartext w/o ssl?

Devaraj Das
added a comment - 16/Jul/12 19:28 Should we look at what use cases we must absolutely support (so that folks in production are not impacted):
1. Is it (a) old clients talking to new servers, or, (b) new clients talking to old servers, or, (c) both.
2. If (a), then it can be addressed without many complications IMO. NameNode would try to login using HOST/ and HTTP/ principals (first for the KerbSSL and second for the SPNEGO), so that it can serve both KerbSSL and SPNEGO clients.
3. If (b) (where I think most users with prod deployments would fall), it's slightly more tricky - the client would have to discover that the server can't speak SPNEGO.
3.1 Hack exception handling and try KerbSSL as a fallback.
3.2 Configure the client to talk different protocols (http or https) based on the namenode's address.
4. If (c), then yeah, its a combination of (2) and (3).
Thoughts?

eric baldeschwieler
added a comment - 17/Jul/12 06:13 Looking at how HFTP is used, it is both. When two clusters are out of sync, data is moved both ways with HFTP. And since it is always a pull, you have both old -> new and new -> old.

Dave Thompson added a comment - 17/Jul/12 18:30

Not weighing in either way; just thought it might be helpful to point out some details relevant to some of the above comments.

1. SPNEGO is not a direct replacement for KSSL. SPNEGO as an authentication method is only a subset of what SSL provides, which is a full transport-layer security mechanism (read: authentication and payload protection). Further, with regard to authentication, KSSL provides both client- and server-side authentication. Kerberos SPNEGO (as in HTTP Negotiate) means we will lose server-side authentication at the point of connection, though we maintain client-side authentication.

2. Known WEP attacks are not applicable to SSL, despite the fact that RC4 may be a stream cipher common to both. Pre-shared keys are effectively non-existent in the SSL world.

3. DES and 3DES are very different block ciphers with regard to cipher strength: 56-bit vs. 168-bit keys.

4. It's not clear what the concern is regarding the current hard-coding of 3DES, though it may be worth pointing out that it's customary for a client to present a suite of acceptable ciphers in SSL, meaning the current implementation could be expanded to include a set. A list of Kerberos cipher suites is in RFC 2712, though one needs to cross-reference which ones are implemented in the client and server crypto libraries. A quick list includes RC4_128, RC2_128, and 3DES (168). I haven't checked whether there are further extensions. This can be done asymmetrically (think upgradable), though of course it's not effective until both client and server support the new ciphers.

5. Regarding the RC4 stream cipher with 128-bit keys: it's likely that the vast majority of one's credit card transactions still occur using it, as it has been the de facto standard since the mid-90s, though earlier with 88 bits of the key exposed, making it effectively 40-bit to comply with old US import/export restrictions.

Aaron T. Myers added a comment - 18/Jul/12 08:22

Here's a patch against branch-1 which provides the option of using either KSSL or SPNEGO for HTTP authentication. It's basically the same as Owen's last patch, except that instead of completely removing KSSL support, it adds a new configuration option (dfs.use.kssl.auth) which defaults to "true" to preserve the existing branch-1 behavior. If this new option is set to "false", then KSSL will not be used for any authentication, and plain HTTP will be used instead.

I've tested this patch manually on a pseudo cluster by ensuring that 2NN checkpointing, HFTP, and WebHdfs all work without security enabled, with security enabled and KSSL for auth, and with security enabled and SPNEGO for auth.

I'm running the full HDFS test suite tonight, and will report back with any errors encountered tomorrow.

Daryn Sharp added a comment - 18/Jul/12 15:38

Have you tested HsftpFileSystem too? Do we even support encrypting the transferred data if SPNEGO is enabled?

The addition of useKssl(conf) seems rather invasive in the sense that many callers have to be modified to specifically have knowledge of KSSL. A simple boolean complicates the ability to add new auth systems in the future. Maybe we can push the decision to use KSSL deeper into the system so it's more transparent? Rough ideas:

SecurityUtil.openSecureHttpConnection swaps out the https scheme with http if KSSL is not enabled. This negates a bunch of changes in HftpFileSystem and DelegationTokenFetcher.

Add a NameNode.getSecurePort(conf) that can use the KSSL setting to determine whether the https or http port should be returned; HftpFileSystem could use this for the default secure port to be agnostic to KSSL.

Maybe add an arg to the ctor of HttpServer for the auth filter, or add a setter for the auth filter, so addInternalServlet and the many calls to it don't need to be modified.

The initialization of a secure HttpServer in places such as the NN and 2NN seems virtually identical; maybe create a common method? It would centralize one of the main KSSL checks.

A few places appear to assume that if KSSL is off, the connection must be SPNEGO, without even checking whether security is enabled.
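The first idea above could be sketched roughly as follows. Note this is an illustrative stand-in, not the actual patch: the class, method, and flag names here are hypothetical, and the real decision would read the configuration rather than a static field.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical sketch of the scheme-swap idea: a SecurityUtil-style
// helper downgrades https to http when KSSL auth is disabled, so
// callers like HftpFileSystem stay agnostic to the auth mechanism.
public class SchemeSwapSketch {

    // Stand-in for reading the KSSL config flag from a Configuration.
    static boolean useKsslAuth = false;

    // Rewrite secure URLs to plain http when KSSL is not in use;
    // leave everything else untouched.
    static URL toConnectUrl(URL requested) throws MalformedURLException {
        if ("https".equals(requested.getProtocol()) && !useKsslAuth) {
            return new URL("http", requested.getHost(),
                           requested.getPort(), requested.getFile());
        }
        return requested;
    }

    public static void main(String[] args) throws Exception {
        URL u = toConnectUrl(new URL("https://nn.example.com:50470/getimage"));
        System.out.println(u);  // http://nn.example.com:50470/getimage
    }
}
```

Centralizing the choice this way means callers keep requesting the "secure" URL and only one place knows whether that currently means KSSL-over-https or SPNEGO-over-http.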

Aaron T. Myers added a comment - 18/Jul/12 17:57

Have you tested HsftpFileSystem too? Do we even support encrypting the transfer data if spnego is enabled?

No I didn't, but you make a good point. Thoughts on what to do here? Perhaps we should disallow use of Hsftp if SSL (either KSSL or cert-based) isn't enabled?

The addition of useKssl(conf) seems rather invasive in the sense that many callers have to be modified to specifically have knowledge of kssl.

It's really not all that many callers. Maybe we can cut down on a few, but it's going to go from something like 10 to 7.

A simple boolean complicates the ability to add new auth systems in the future.

I'm skeptical that we'll be adding more HTTP auth systems to branch-1, so I'm hesitant to build up a generic system for something that will only have two actual options.

SecurityUtil.openSecureHttpConnection swaps out the https scheme with http if kssl is not enabled. Negates a bunch of changes in HftpFileSystem and DelegationTokenFetcher.

The reason I didn't do this is that use of KSSL is only within HDFS, whereas SecurityUtil is in Common, and I didn't want to add a dependency on HDFS to Common. That said, it might be reasonable to move the dfs.use.kssl.auth key to Common, change it to "hadoop.security.use.kssl.auth", and then move the NameNode#useKsslAuth method to SecurityUtil itself. How does this sound?

Add a NameNode.getSecurePort(conf) that can use kssl to determine if the https or http port should be returned, HftpFileSystem could use this for the default secure port to be agnostic to kssl.

Seems like a good idea.

Maybe add an arg to the ctor of HttpServer for the auth filter, or add a setter for the auth filter so addInternalServlet and the many calls to it don't need to be modified.

I don't think this is reasonable, since a single HttpServer might very well have individual servlets with different auth filter requirements.

The initialization of a secure HttpServer in places such as the NN and 2NN seem virtually identical, maybe create a common method? Would centralize one of the main kssl checks.

I'll take a look at options for refactoring this, but I honestly don't think there's a lot of code that can be shared, and any new method would have to be heavily parameterized for just two call sites.

A few places appear to assume that if kssl is off, that the connection must be spnego w/o even checking if security is enabled.

Can you point them out specifically?

Owen O'Malley added a comment - 18/Jul/12 17:58

Sigh.

If we have the config knob, it has to be something way more obvious that you are basically turning off security:

dfs.use-insecure-http

and of course it should default to off.

Aaron T. Myers added a comment - 18/Jul/12 18:13

If we have the config knob, it has to be something way more obvious that you are basically turning off security.

What makes you say we're "basically turning off security"? The config knob as implemented switches between using KSSL or SPNEGO for HTTP auth; it does not disable HTTP auth entirely.

Aaron T. Myers added a comment - 18/Jul/12 19:13

OK, I'm fine with changing the default value for the config knob to not use KSSL, as long as we call it out with a release note that it's an incompatible change for branch-1.

As for the name, how about "hadoop.security.use-weak-http-crypto"?

Daryn Sharp added a comment - 18/Jul/12 22:46

I'm still confused about how KSSL is insecure vs. SPNEGO. They just seem different to me. KSSL appears to be a generic means of authenticating a secure socket, whereas SPNEGO is HTTP-specific. Here's what I understand; please correct me if necessary, because I must be missing something:

Kerberos is specifically designed for insecure networks, so the mutual auth exchange always uses strong encryption. Hftp + KSSL encrypts the Kerberos auth before the HTTP request occurs. SPNEGO does the Kerberos auth in the clear via standard HTTP request headers. SSL encryption atop Kerberos seems to be of little value, whether "weak" or not, since Kerberos auths are already encrypted. Why is SPNEGO considered more secure when it lacks a second layer of (unnecessary) encryption?

After the Kerberos auth, all actual fs operations and transfers are in the clear using a token. I think the weakest link is the token being passed around insecurely. Hftp + KSSL gets the token "securely", but then uses it insecurely over HTTP, which negates any advantage to getting it securely. Hftp + SPNEGO does everything insecurely over HTTP, so why is SPNEGO more secure?

Also, why can't we simply change/remove the hardcoded cipher?

Joey Echeverria added a comment - 18/Jul/12 22:53

The issue with KSSL is that it requires you to enable weak cipher support for all of Kerberos. This means that completely unrelated actions using Kerberos could end up using a weak cipher. If you use KSSL with JDK 7, you could avoid enabling the weak ciphers, as a bug related to using AES for session initialization was fixed.

Todd Lipcon added a comment - 18/Jul/12 22:56

The KSSL is required to use weak encryption. It should not be used in secure deployments except as a compatibility hack.

I'm not clear what you're saying here, Owen. Are you saying SPNEGO should not be used except as a compat hack? Or that KSSL shouldn't be used except as a compat hack?

I'm not sure why encryption is being mentioned here, since most of our RPC and data transfer is not encrypted by default. So I don't see any reason that the fsimage transfer should be the one place where encryption is necessary, rather than simple authentication.

Aaron T. Myers added a comment - 18/Jul/12 23:16

The trouble with KSSL is not in KSSL itself; it's the JDK bug that Joey mentioned: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6946669

This bug unfortunately requires that the Kerberos authentication part of the KSSL connection use DES encryption for the Kerberos tickets. Pretty much everyone agrees that DES is unacceptably weak, which is also why MIT KRB5 has been phasing out support for it.

Also, why can't we simply change/remove the hardcoded cipher?

The cipher you're referring to isn't the issue, and in fact it's hard-coded to 3DES, whose strength I don't think folks here are concerned about. That cipher is used to encrypt the traffic via SSL after the Kerberos handshake has completed.

If you enable Java SSL/KRB5 debug output when performing an NN checkpoint, you'll see that DES is used for the Kerberos handshake, and thereafter 3DES for the SSL encryption.
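For reference, the debug output described above comes from the standard JDK system properties for JSSE and Kerberos. The daemon-options variable below is the usual hook in branch-1's hadoop-env.sh, but treat the exact variable name as an assumption that may differ by deployment:

```shell
# Enable JDK SSL-handshake and Kerberos debug output for the NameNode
# daemon (hadoop-env.sh). javax.net.debug and sun.security.krb5.debug
# are standard JDK properties; HADOOP_NAMENODE_OPTS is the assumed hook.
export HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS \
  -Djavax.net.debug=ssl \
  -Dsun.security.krb5.debug=true"
```

With these set, the daemon log shows the Kerberos encryption type negotiated during the handshake and the SSL cipher suite used afterward, which is how the DES-then-3DES behavior can be observed.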

Aaron T. Myers added a comment - 19/Jul/12 00:52

Here's an updated patch which addresses Owen's comments, as well as those of Daryn's comments that I agree with. Indeed, the patch got a little smaller and more self-contained once I did the refactors Daryn proposed.

I once again tested this patch by ensuring that WebHdfs, HFTP, NN checkpointing, and FSCK work with security disabled, with security enabled using SPNEGO for HTTP auth, and with security enabled using KSSL for HTTP auth.

As for testing HSFTP, I don't think this patch will break it, but I don't have an easy way of testing it. Daryn, any chance you could give that a shot?

Owen O'Malley added a comment - 19/Jul/12 13:31

Todd, KSSL is the unfortunate choice and needs to be phased out as quickly as possible.

Aaron, I have patches that apply on top that fix HSFTP to work correctly.

Aaron T. Myers added a comment - 19/Jul/12 18:23

Aaron, I have patches that apply on top that fix HSFTP to work correctly.

Great, so it sounds like the testing I did should cover all the pertinent cases.

I also ran all of the branch-1 tests with this patch applied, and the only one that failed was TestJobTrackerRestart, which I'm confident is unrelated to this patch.

Eli Collins added a comment - 19/Jul/12 19:04

ATM, looks great; only minor comments, +1 otherwise.

hadoop.security.use-weak-http-crypto should go in core-default.xml with a comment to the effect that if it is enabled then SPNEGO is used (since this flag effectively controls SPNEGO enablement as well).

s/"false to use SPNEGO"/"false to use SPNEGO or if security is disabled"/

Define/use KERB5_FILTER = "krb5Filter".

Owen, you'll file a follow-up jira/patch for HSFTP?

Aaron T. Myers added a comment - 19/Jul/12 19:36

Thanks a lot for the review, Eli. Here's an updated patch which addresses your comments.

If there are no more comments, I'm going to go ahead and commit this to branch-1 in an hour or two based on Eli's +1.

hadoop.security.use-weak-http-crypto should go in core-default.xml with a comment to the effect that if it is enabled then SPNEGO is used (since this flag effectively controls SPNEGO enablement as well)

Added the following to core-default.xml:

<property>
  <name>hadoop.security.use-weak-http-crypto</name>
  <value>false</value>
  <description>If enabled, use KSSL to authenticate HTTP connections to the
  NameNode. Due to a bug in JDK6, using KSSL requires one to configure
  Kerberos tickets to use encryption types that are known to be
  cryptographically weak. If disabled, SPNEGO will be used for HTTP
  authentication, which supports stronger encryption types.
  </description>
</property>

s/"false to use SPNEGO"/"false to use SPNEGO or if security is disabled"/

Done.

Define/use KERB5_FILTER = "krb5Filter"

Done.

Aaron T. Myers added a comment - 20/Jul/12 00:18

I've just committed this to branch-1. Thanks a lot for the contribution and discussion, all. Particular thanks go out to Jakob Homan for getting the ball rolling on this issue and posting the original rev of this patch.

Eli Collins added a comment - 20/Jul/12 04:40

Yup, from the above comments:

Owen: "and of course it should default to off."

ATM: "OK, I'm fine with changing the default value for the config knob to not use KSSL, as long as we call it out with a release note that it's an incompatible change for branch-1."

I preferred ATM's first attempt, where compatibility was preserved.

Daryn Sharp added a comment - 20/Jul/12 15:03

I would prefer compatibility by default, but a flag isn't the end of the world.

Can we pretty please merge this into branch-2? I know that's an unpopular position, but we require at least client-side KSSL compat on 2.x for hftp, else we are going to have a very hard time migrating data from earlier grids. Given recent webhdfs jiras, I think it's safe to say it's not yet hardened enough to be suitable for mission-critical production environments. I'm confident webhdfs will be shored up in 2.x, so KSSL compat can be dropped in future releases.

Owen O'Malley added a comment - 20/Jul/12 17:47

Eric, we have talked through the options. We've gotten to the place where the user can choose:

No authentication
Weak authentication
Strong authentication

The default has always been no authentication. If someone has bothered to ask for strong authentication, our project shouldn't subvert their effort by having them use known-weak crypto unless they explicitly declare that hftp compatibility without a pre-fetched token is more important than the strength of their authentication.

Eli Collins added a comment - 20/Jul/12 23:45

If security was enabled, the default was not "no authentication"; that's why this change is incompatible.

I think the alternative option Eric is referring to is to default use-weak-http-crypto to true, which means people with secure clusters who just update their bits don't switch from KSSL to SPNEGO automatically, i.e. they'd have to explicitly enable SPNEGO by setting use-weak-http-crypto to false.

Fortunately people can override this and ship with use-weak-http-crypto set to true.

eric baldeschwieler added a comment - 21/Jul/12 04:05

I've been talking over the options with various actors to determine where this needs to be patched. This is what I propose:

1) We patch 1.0 as proposed here.

2) We do not take these patches to 2.0.

3) We additionally patch the client to first try the SPNEGO token protocol, and then KSSL if that fails. We patch both the 1.0 and 2.0 HFTP clients to do this.

With these changes we introduce the least possible cruft into 2.0, and we support a gradual transition in the installed base from weak to strong, so that orgs do not need a D-Day config switch, which would require organized validation and disruption.

Further, the default behavior is right for folks not worrying about this transition.

Any concerns with this approach?
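The client fallback in (3) amounts to ordered authentication attempts with the first success winning. A minimal sketch under that assumption follows; the Authenticator interface and both anonymous implementations are hypothetical stand-ins, not the real HftpFileSystem internals:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed client behavior: try SPNEGO first, and fall
// back to KSSL only if SPNEGO fails. All names here are illustrative.
public class AuthFallbackSketch {

    interface Authenticator {
        String name();
        void authenticate() throws IOException;
    }

    // Attempt each mechanism in order, returning the name of the first
    // that succeeds; rethrow the last failure if none do.
    static String authenticateWithFallback(List<Authenticator> mechanisms)
            throws IOException {
        IOException last = null;
        for (Authenticator a : mechanisms) {
            try {
                a.authenticate();
                return a.name();
            } catch (IOException e) {
                last = e;  // remember the failure, try the next mechanism
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        List<Authenticator> order = new ArrayList<>();
        // Simulate an old, pre-SPNEGO server rejecting the first attempt.
        order.add(new Authenticator() {
            public String name() { return "SPNEGO"; }
            public void authenticate() throws IOException {
                throw new IOException("server does not speak SPNEGO");
            }
        });
        order.add(new Authenticator() {
            public String name() { return "KSSL"; }
            public void authenticate() { /* succeeds */ }
        });
        System.out.println(authenticateWithFallback(order));  // KSSL
    }
}
```

Against a new server the SPNEGO attempt succeeds and KSSL is never tried, which is what makes the transition gradual: clients work against both old and new grids without a coordinated config switch.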

Allen Wittenauer added a comment - 21/Jul/12 04:44

a) Without these or similar patches, 2.0 can't run in secure mode on various OS distributions without crippling them. In other words, this is clearly a blocker for any 2.0 stable release, and should likely be a blocker for even non-stable releases.

b) Upgrades almost always trigger a config change anyway. Given that we want folks to move from hftp to webhdfs, forcing a config change is a good thing, as we can tell them to turn on webhdfs at the same time.

Aaron T. Myers added a comment - 21/Jul/12 08:39

Hi Eric,

1) We patch 1.0 as proposed here

Agree.

2) We do not take these patches to 2.0.

I'm not entirely sure what you mean by this. SPNEGO support is already in branch-2. Are you saying that you just want to leave that as-is, and not add an option to use KSSL on the server side to branch-2? If so, I agree with that.

3) We additionally patch the client to try first the SPNEGO token protocol and then KSSL if that fails. We patch both 1.0 and 2.0 HFTP clients to do this.

That seems fine to me, but I think it should be done as a separate JIRA, along the lines of "HftpFileSystem should try both KSSL and SPNEGO when authentication is required". If you agree, mind filing that JIRA? If you post a patch, I'll be happy to review it.

eric baldeschwieler added a comment - 22/Jul/12 21:00

Yup. We leave 2.0 as is, without KSSL. The 0.23 guys can choose to patch or not.

I've created HDFS-3699 - "HftpFileSystem should try both KSSL and SPNEGO when authentication is required". I'll muster up some help with the implementation.