Details

Introduced access tokens as capabilities for accessing datanodes. This change to internal protocols does not affect client applications.

Description

Currently, DataNodes do not enforce any access control on accesses to its data blocks. This makes it possible for an unauthorized client to read a data block as long as she can supply its block ID. It's also possible for anyone to write arbitrary data blocks to DataNodes.

When users request file accesses on the NameNode, file permission checking takes place. Authorization decisions are made with regard to whether the requested accesses to those files (and implicitly, to their corresponding data blocks) are permitted. However, when it comes to subsequent data block accesses on the DataNodes, those authorization decisions are not made available to the DataNodes and consequently, such accesses are not verified. Datanodes are not capable of reaching those decisions independently since they don't have concepts of files, let alone file permissions.

In order to implement data access policies consistently across HDFS services, there is a need for a mechanism by which authorization decisions made on the NameNode can be faithfully enforced on the DataNodes and any unauthorized access is declined.