Centralized Hadoop SSO/Token Server

Details

Description

This is an umbrella Jira filing to oversee a set of proposals for introducing a new master service for Hadoop Single Sign On (HSSO).

There is an increasing need for pluggable authentication providers that authenticate both users and services as well as validate tokens in order to federate identities authenticated by trusted IDPs. These IDPs may be deployed within the enterprise or third-party IDPs that are external to the enterprise.

These needs speak to a specific pain point: which is a narrow integration path into the enterprise identity infrastructure. Kerberos is a fine solution for those that already have it in place or are willing to adopt its use but there remains a class of user that finds this unacceptable and needs to integrate with a wider variety of identity management solutions.

Another specific pain point is that of rolling and distributing keys. A related and integral part of the HSSO server is library called the Credential Management Framework (CMF), which will be a common library for easing the management of secrets, keys and credentials.

Initially, the existing delegation, block access and job tokens will continue to be utilized. There may be some changes required to leverage a PKI based signature facility rather than shared secrets. This is a means to simplify the solution for the pain point of distributing shared secrets.

This project will primarily centralize the responsibility of authentication and federation into a single service that is trusted across the Hadoop cluster and optionally across multiple clusters. This greatly simplifies a number of things in the Hadoop ecosystem:

1. a single token format that is used across all of Hadoop regardless of authentication method
2. a single service to have pluggable providers instead of all services
3. a single token authority that would be trusted across the cluster/s and through PKI encryption be able to easily issue cryptographically verifiable tokens
4. automatic rolling of the token authority’s keys and publishing of the public key for easy access by those parties that need to verify incoming tokens
5. use of PKI for signatures eliminates the need for securely sharing and distributing shared secrets

In addition to serving as the internal Hadoop SSO service this service will be leveraged by the Knox Gateway from the cluster perimeter in order to acquire the Hadoop cluster tokens. The same token mechanism that is used for internal services will be used to represent user identities. Providing for interesting scenarios such as SSO across Hadoop clusters within an enterprise and/or into the cloud.

The HSSO service will be comprised of three major components and capabilities: