RFC 7532

Namespace Database (NSDB) Protocol for Federated File Systems

Internet Engineering Task Force (IETF) J. Lentini
Request for Comments: 7532 NetApp
Category: Standards Track R. Tewari
ISSN: 2070-1721 IBM Almaden
C. Lever, Ed.
Oracle Corporation
March 2015
Abstract
This document describes a file system federation protocol that
enables file access and namespace traversal across collections of
independently administered fileservers. The protocol specifies a set
of interfaces by which fileservers with different administrators can
form a fileserver federation that provides a namespace composed of
the file systems physically hosted on and exported by the constituent
fileservers.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc7532.

Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.

1. Introduction
A federated file system enables file access and namespace traversal
in a uniform, secure, and consistent manner across multiple
independent fileservers within an enterprise or across multiple
enterprises.
This document specifies a set of protocols that allow fileservers,
possibly from different vendors and with different administrators, to
cooperatively form a federation containing one or more federated file
systems. Each federated file system's namespace is composed of the
file systems physically hosted on and exported by the federation's
fileservers. A federation comprises a common namespace across all
its fileservers. A federation can project multiple namespaces and
enable clients to traverse each one. A federation can contain an
arbitrary number of namespace repositories, each belonging to a
different administrative entity and each rendering a part of the
namespace. A federation might also have an arbitrary number of
administrative entities responsible for administering disjoint
subsets of the fileservers.
Traditionally, building a namespace that spans multiple fileservers
has been difficult for two reasons. First, the fileservers that
export pieces of the namespace are often not in the same
administrative domain. Second, there is no standard mechanism for
the fileservers to cooperatively present the namespace. Fileservers
may provide proprietary management tools, and in some cases, an
administrator may be able to use the proprietary tools to build a
shared namespace out of the exported file systems. However, relying
on vendor-specific proprietary tools does not work in larger
enterprises or when collaborating across enterprises because the
fileservers are likely to be from multiple vendors or use different
software versions, each with their own namespace protocols, with no
common mechanism to manage the namespace or exchange namespace
information.

The federated file system protocols in this document define how to
construct a namespace accessible by a Network File System (NFS)
version 4.0 [RFC7530], NFSv4.1 [RFC5661], or newer client and have
been designed to accommodate other file-access protocols in the
future.
The requirements for federated file systems are described in
[RFC5716]. A protocol for administering a fileserver's namespace is
described in [RFC7533]. The mechanism for discovering the root of a
federated namespace is described in [RFC6641].
In the rest of the document, the term "fileserver" denotes a
fileserver that is part of a federation.
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Overview of Features and Concepts
2.1. File-Access Protocol
A file-access protocol is a network protocol for accessing data. The
NFSv4.0 protocol [RFC7530] is an example of a file-access protocol.
2.2. File-Access Client
File-access clients are standard, off-the-shelf network-attached
storage (NAS) clients that communicate with fileservers using a
standard file-access protocol.
2.3. Fileserver
Fileservers are servers that store physical fileset data or refer
file-access clients to other fileservers. A fileserver provides
access to its shared file system data via a file-access protocol. A
fileserver may be implemented in a number of different ways,
including a single system, a cluster of systems, or some other
configuration.
2.4. Referral
A referral is a mechanism by which a fileserver redirects a file-
access client to a different fileserver or export. The exact
information contained in a referral varies from one file-access
protocol to another. The NFSv4.0 protocol, for example, defines the

fs_locations attribute for returning referral information to NFSv4.0
clients. The NFSv4.1 protocol introduces the fs_locations_info
attribute that can return richer referral information to its clients.
NFSv4.1 fileservers may use either attribute during a referral. Both
attributes are defined in [RFC5661].
2.5. Namespace
The goal of a unified namespace is to make all managed data available
to any file-access client via the same path in a common file system
namespace. This should be achieved with minimal or zero
configuration on file-access clients. In particular, updates to the
common namespace should not require configuration changes to any
file-access client.
Filesets, which are the units of data management, are a set of files
and directories. From the perspective of file-access clients, the
common namespace is constructed by mounting filesets that are
physically located on different fileservers. The namespace, which is
defined in terms of fileset names and locations, is stored in a set
of namespace repositories, each managed by an administrative entity.
The namespace schema defines the model used for populating,
modifying, and querying the namespace repositories. It is not
required by the federation that the namespace be common across all
fileservers. It should be possible to have several independently
rooted namespaces.
2.6. Fileset
A fileset is loosely defined as a set of files and the directory tree
that contains them. The fileset abstraction is the basic unit of
data management. Depending on the configuration, a fileset may be
anything from an individual directory of an exported file system to
an entire exported file system on a fileserver.
2.7. Fileset Name (FSN)
A fileset is uniquely represented by its fileset name (FSN). An FSN
is considered unique across a federation. After an FSN is created,
it is associated with one or more fileset locations (FSLs) on one or
more fileservers.

An FSN consists of:
NsdbName: the network location of the Namespace Database (NSDB)
node that contains authoritative information for this FSN.
FsnUuid: a UUID (universally unique identifier), conforming to
[RFC4122], that is used to uniquely identify an FSN.
FsnTTL: the time-to-live of the FSN's FSL information, in
seconds. Fileservers MUST NOT use cached FSL records after the
parent FSN's FsnTTL has expired. An FsnTTL value of zero
indicates that fileservers MUST NOT cache the results of
resolving this FSN.
The NsdbName is not physically stored as an attribute of the record.
It is evident to any client that accesses an NSDB, and it is
authenticated in cases where Transport Layer Security (TLS) is in
effect.
The FsnUuid and NsdbName values never change during an FSN's
lifetime. However, an FSN's FSL information can change over time and
is typically cached on fileservers for performance. More detail on
FSL caching is provided in Section 2.8.3.
An FSN record may also contain:
Annotations: name/value pairs that can be interpreted by a
fileserver. The semantics of this field are not defined by
this document. These tuples are intended to be used by higher-
level protocols.
Descriptions: text descriptions. The semantics of this field are
not defined by this document.
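The FSN structure above can be sketched as a small in-memory record.
This is only an illustration: the class and field names below are
invented for the example and are not part of the protocol or its LDAP
schema.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class FsnRecord:
    """Illustrative in-memory form of an FSN; names are not normative."""
    fsn_uuid: str    # RFC 4122 UUID that uniquely identifies the fileset
    fsn_ttl: int     # seconds; zero means FSL results MUST NOT be cached
    annotations: dict = field(default_factory=dict)   # opaque name/value pairs
    descriptions: list = field(default_factory=list)  # free-form text
    # Note: NsdbName is deliberately absent.  It is implied by which NSDB
    # node a client queried and is not stored as an attribute of the record.

fsn = FsnRecord(fsn_uuid=str(uuid.uuid4()), fsn_ttl=3600)
print(fsn.fsn_ttl)
```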
2.8. Fileset Location (FSL)
An FSL describes one physical location where a complete copy of the
fileset's data resides. An FSL contains generic and type-specific
information that together describe how to access the fileset data at
this location. An FSL's attributes can be used by a fileserver to
decide which locations it will return to a file-access client.

An FSL consists of:
FslUuid: a UUID, conforming to [RFC4122], that is used to
uniquely identify an FSL.
FsnUuid: the UUID of the FSL's FSN.
NsdbName: the network location of the NSDB node that contains
authoritative information for this FSL.
The NsdbName is not stored as an attribute of an FSL record for the
same reason it is not stored in FSN records.
An FSL record may also contain:
Annotations: name/value pairs that can be interpreted by a
fileserver. The semantics of this field are not defined by
this document. These tuples are intended to be used by higher-
level protocols.
Descriptions: text descriptions. The semantics of this field are
not defined by this document.
In addition to the attributes defined above, an FSL record contains
attributes that allow a fileserver to construct referrals. For each
file-access protocol, a corresponding FSL record subtype is defined.
This document defines an FSL subtype for NFS. An NFS FSL contains
information suitable for use in one of the NFSv4 referral attributes
(e.g., fs_locations or fs_locations_info, described in [RFC5661]).
Section 4.2.2.4 describes the contents of an NFS FSL record.
A fileset may also be accessible by file-access protocols other than
NFS. The contents and format of such FSL subtypes are not defined in
this document.
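Continuing the illustrative record sketch, a generic FSL and its NFS
subtype might be modeled as follows. The class and field names
(including nfs_uri) are invented for this example; a real NFS FSL
record carries the full set of referral attributes described in
Section 4.2.2.4.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class FslRecord:
    """Illustrative generic FSL; names are not normative."""
    fsl_uuid: str    # RFC 4122 UUID identifying this location
    fsn_uuid: str    # UUID of the parent FSN (NsdbName again not stored)
    annotations: dict = field(default_factory=dict)
    descriptions: list = field(default_factory=list)

@dataclass
class NfsFslRecord(FslRecord):
    """NFS subtype: adds data needed to construct an NFSv4 referral."""
    nfs_uri: str = ""   # hostname and rootpath of this location

loc = NfsFslRecord(fsl_uuid=str(uuid.uuid4()),
                   fsn_uuid=str(uuid.uuid4()),
                   nfs_uri="nfs://files.example.net//vol/export")
print(loc.nfs_uri)
```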
2.8.1. The NFS URI Scheme
To capture the location of an NFSv4 fileset, we extend the NFS URL
scheme specified in [RFC2224]. This extension follows rules for
defining Uniform Resource Identifier schemes (see [RFC3986]). In the
following text, we refer to this extended NFS URL scheme as an NFS
URI.
An NFS URI MUST contain both an authority and a path component. It
MUST NOT contain a query component or a fragment component. Use of
the familiar "nfs" scheme name is retained.

2.8.1.1. The NFS URI Authority Component
The rules for encoding the authority component of a generic URI are
specified in Section 3.2 of [RFC3986]. The authority component of an
NFS URI MUST contain the host subcomponent. For globally scoped NFS
URIs, the hostname SHOULD be a fully qualified domain name. See
Section 3.2.2 of [RFC3986] for rules on encoding non-ASCII characters
in hostnames.
An NFS URI MAY contain a port subcomponent as described in Section
3.2.3 of [RFC3986]. If this subcomponent is missing, a port value of
2049 is assumed, as specified in Section 3.1 of [RFC7530].
2.8.1.2. The NFS URI Path Component
The rules for encoding the path component of a generic URI are
specified in Section 3.3 of [RFC3986].
According to Sections 5 and 6 of [RFC2224], NFS URLs specify a
pathname relative to an NFS fileserver's public filehandle. However,
NFSv4 fileservers do not expose a public filehandle. Instead, NFSv4
pathnames contained in an NFS URI are evaluated relative to the
pseudoroot of the fileserver identified in the URI's authority
component.
Each component of an NFSv4 pathname is represented as a component4
string (see Section 3.2, "Basic Data Types", of [RFC5661]). The
component4 elements of an NFSv4 pathname are encoded as path segments
in an NFS URI. NFSv4 pathnames MUST be expressed in an NFS URI as an
absolute path. An NFS URI path component MUST NOT be empty. The NFS
URI path component starts with a slash ("/") character, followed by
one or more path segments that each start with a slash ("/")
character [RFC3986].
Therefore, a double slash always follows the authority component of
an NFS URI. For example, the NFSv4 pathname "/" is represented by
two slash ("/") characters following an NFS URI's authority
component.
The component names of an NFSv4 pathname MUST be prepared using the
component name rules defined in Section 12 ("Internationalization")
of [RFC7530] prior to encoding the path component of an NFS URI. As
specified in [RFC3986], any non-ASCII characters and any URI-reserved
characters, such as the slash ("/") character, contained in a
component4 element MUST be represented by URI percent encoding.
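The encoding rules above can be demonstrated with a short helper that
builds an NFS URI from a hostname and a list of component4 strings.
The function is a sketch for illustration only (it is not defined by
this document); it percent-encodes every reserved and non-ASCII
character within each component, including any embedded slash, and
yields the double slash that always follows the authority component.

```python
from urllib.parse import quote

def nfs_uri(host, path_components, port=None):
    """Build an NFS URI from a host and a list of NFSv4 component4
    strings (illustrative helper, not defined by the protocol)."""
    authority = host if port is None else f"{host}:{port}"
    # Each component becomes one path segment; percent-encode reserved
    # and non-ASCII characters, including any "/" inside a component.
    segments = "".join("/" + quote(c, safe="") for c in path_components)
    # The path itself always begins with "/", so a double slash follows
    # the authority; the NFSv4 root pathname "/" is rendered as "//".
    return f"nfs://{authority}/{segments if segments else '/'}"

print(nfs_uri("files.example.net", ["vol", "data set"]))
# nfs://files.example.net//vol/data%20set
print(nfs_uri("files.example.net", []))
# nfs://files.example.net//
```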

2.8.1.3. Encoding an NFS Location in an FSL
The path component of an NFS URI encodes the rootpath field of the
NFSv4 fs_location4 data type or the "fli_rootpath" of the NFSv4
fs_locations_item4 data type (see [RFC5661]).
In its server field, the NFSv4 fs_location4 data type contains a list
of universal addresses and DNS labels. Each may optionally include a
port number. The exact encoding requirements for this information are
found in Section 12.6 of [RFC7530]. The NFSv4 fs_locations_item4
data type encodes the same data in its fli_entries field (see
[RFC5661]). This information is encoded in the authority component
of an NFS URI.
The server and fli_entries fields can encode multiple server
hostnames that share the same pathname. An NFS URI, and hence an FSL
record, represents only a single hostname and pathname pair. An NFS
fileserver MUST NOT combine a set of FSL records into a single
fs_location4 or fs_locations_item4 unless each FSL record in the set
contains the same rootpath value and extended file system
information.
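As a sketch of the rule above, one fs_location4-style entry listing
several servers for a single rootpath expands into one (host, path)
pair per FSL record; the function and field names here are
illustrative only.

```python
def split_location(servers, rootpath_components):
    """Expand a multi-server location entry into per-host pairs, one
    for each FSL record (illustrative helper, not from the protocol)."""
    return [(host, list(rootpath_components)) for host in servers]

pairs = split_location(["a.example.net", "b.example.net"], ["vol", "exp"])
print(pairs)
```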
2.8.2. Mutual Consistency across Fileset Locations
All of the FSLs that have the same FSN (and thereby reference the
same fileset) are equivalent from the point of view of access by a
file-access client. Different fileset locations for an FSN represent
the same data, though potentially at different points in time.
Fileset locations are equivalent but not identical. Locations may be
either read-only or read-write. Typically, multiple read-write
locations are backed by a clustered file system while read-only
locations are replicas created by a federation-initiated or external
replication operation. Read-only locations may represent consistent
point-in-time copies of a read-write location. The federation
protocols, however, cannot prevent subsequent changes to a read-only
location nor guarantee point-in-time consistency of a read-only
location if the read-write location is changing.
Regardless of the type, one file-access client may be referred to a
location described by one FSL while another client chooses to use a
location described by another FSL. Since updates to each fileset
location are not controlled by the federation protocol, it is the
responsibility of administrators to guarantee the functional
equivalence of the data.
The federation protocols do not guarantee that different fileset
locations are mutually consistent in terms of the currency of their
data. However, they provide a means to publish currency information

so that all fileservers in a federation can convey the same
information to file-access clients during referrals. Clients use
this information to ensure they do not revert to an out-of-date
version of a fileset's data when switching between fileset locations.
NFSv4.1 provides guidance on how replication can be handled in such a
manner. In particular, see Section 11.7 of [RFC5661].
2.8.3. Caching of Fileset Locations
To resolve an FSN to a set of FSL records, a fileserver queries the
NSDB node named in the FSN for FSL records associated with this FSN.
The parent FSN's FsnTTL attribute (see Section 2.7) specifies the
period of time during which a fileserver may cache these FSL records.
The combination of FSL caching and FSL migration presents a
challenge. For example, suppose there are three fileservers named A,
B, and C. Suppose further that fileserver A contains a junction J to
fileset X stored on fileserver B (see Section 2.10 for a description
of junctions).
Now suppose that fileset X is migrated from fileserver B to
fileserver C, and the corresponding FSL information for fileset X in
the authoritative NSDB is updated.
If fileserver A has cached FSLs for fileset X, a file-access client
traversing junction J on fileserver A will be referred to fileserver
B, even though fileset X has migrated to fileserver C. If fileserver
A had not cached the FSL records, it would have queried the NSDB and
obtained the correct location of fileset X.
Typically, the process of fileset migration leaves a redirection on
the source fileserver in place of a migrated fileset (without such a
redirection, file-access clients would find an empty space where the
migrated fileset was, which defeats the purpose of a managed
migration).
This redirection might be a new junction that targets the same FSN as
other junctions referring to the migrated fileset, or it might be
some other kind of directive, depending on the fileserver
implementation, that simply refers file-access clients to the new
location of the migrated fileset.

Back to our example. Suppose, as part of the migration process, a
junction replaces fileset X on fileserver B. Later, either:
o New file-access clients are referred to fileserver B by stale FSL
information cached on fileserver A, or
o File-access clients continue to access fileserver B because they
cache stale location data for fileset X.
In either case, thanks to the redirection, file-access clients are
informed by fileserver B that fileset X has moved to fileserver C.
Such redirecting junctions (here, on fileserver B) would not be
required to be in place forever. They need to stay in place at least
until FSL entries cached on fileservers and locations cached on file-
access clients for the target fileset are invalidated.
The FsnTTL field in the FSL's parent FSN (see Section 2.7) specifies
an upper bound for the lifetime of cached FSL information and thus
can act as a lower bound for the lifetime of redirecting junctions.
For example, suppose the FsnTTL field contains the value 3600 seconds
(one hour). In such a case, administrators SHOULD keep the
redirection in place for at least one hour after a fileset migration
has taken place because a referring fileserver might cache the FSL
data during that time before refreshing it.
To get file-access clients to access the destination fileserver more
quickly, administrators SHOULD set the FsnTTL field of the migrated
fileset to a low number or zero before migration begins. It can be
reset to a more reasonable number at a later point.
Note that some file-access protocols do not communicate location
cache expiry information to file-access clients. In some cases, it
may be difficult to determine an appropriate lifetime for redirecting
junctions because file-access clients may cache location information
indefinitely.
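The caching rules in this section can be sketched as a minimal TTL
cache. This is an illustration only (the class and method names are
invented): it honors the parent FSN's FsnTTL, refuses to cache when
the TTL is zero, and treats expired entries as cache misses that force
a fresh NSDB query.

```python
import time

class FslCache:
    """Minimal sketch of fileserver-side FSL caching; not normative."""
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # fsn_uuid -> (expiry time, FSL records)

    def store(self, fsn_uuid, fsn_ttl, fsl_records):
        if fsn_ttl <= 0:
            return           # FsnTTL of zero: results MUST NOT be cached
        self._entries[fsn_uuid] = (self._clock() + fsn_ttl, fsl_records)

    def lookup(self, fsn_uuid):
        entry = self._entries.get(fsn_uuid)
        if entry is None:
            return None      # miss: caller must query the NSDB
        expiry, records = entry
        if self._clock() >= expiry:
            del self._entries[fsn_uuid]
            return None      # expired: cached FSLs MUST NOT be used
        return records

cache = FslCache()
cache.store("fsn-1", 3600, ["nfs://a.example.net//vol/x"])
print(cache.lookup("fsn-1"))
```

A real fileserver would also bound the cache's size and would
typically refresh entries proactively rather than only on demand.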
2.8.4. Generating a Referral from Fileset Locations
After resolving an FSN to a set of FSL records, the fileserver
generates a referral to redirect a file-access client to one or more
of the FSN's FSLs. The fileserver converts the FSL records to a
referral format understood by a particular file-access client, such
as an NFSv4 fs_locations or fs_locations_info attribute.

To give file-access clients as many options as possible, the
fileserver SHOULD include the maximum possible number of FSL records
in a referral. However, the fileserver MAY omit some of the FSL
records from the referral. For example, the fileserver might omit an
FSL record because of limitations in the file-access protocol's
referral format.
For a given FSL record, the fileserver MAY convert or reduce the FSL
record's contents in a manner appropriate to the referral format.
For example, an NFS FSL record contains all the data necessary to
construct an fs_locations_info attribute, but an fs_locations_info
attribute contains several pieces of information that are not found
in the simpler fs_locations attribute. A fileserver constructs
entries in an fs_locations attribute using the relevant contents of
an NFS FSL record.
Whenever the fileserver converts or reduces FSL data, the fileserver
SHOULD attempt to maintain the original meaning where possible. For
example, an NFS FSL record contains the rank and order information
that is included in an fs_locations_info attribute (see NFSv4.1's
FSLI4BX_READRANK, FSLI4BX_READORDER, FSLI4BX_WRITERANK, and
FSLI4BX_WRITEORDER). While this rank and order information is not
explicitly expressible in an fs_locations attribute, the fileserver
can arrange the fs_locations attribute's locations list based on the
rank and order values.
Another example: A single NFS FSL record contains the hostname of one
fileserver. A single fs_locations attribute can contain a list of
fileserver names. An NFS fileserver MAY combine two or more FSL
records into a single entry in an fs_locations or fs_locations_info
array only if each FSL record contains the same pathname and extended
file system information.
Refer to Sections 11.9 and 11.10 of the NFSv4.1 protocol
specification [RFC5661] for further details.
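The conversion rules above might be sketched as follows: FSL records
(represented here as plain dicts with invented keys) are ordered by
their rank and order values, and hosts are combined into a single
fs_locations-style entry only when they share a rootpath. A real
implementation would also verify that extended file system information
matches before combining records, which this sketch omits.

```python
def to_fs_locations(fsls):
    """Collapse NFS FSL records into fs_locations-style entries
    (illustrative; the record shape is invented for this example)."""
    # Lower rank/order values are more preferred; sorting first lets the
    # entry list itself convey preference, since fs_locations has no
    # explicit rank or order fields.
    ordered = sorted(fsls, key=lambda f: (f["rank"], f["order"]))
    entries, by_rootpath = [], {}
    for f in ordered:
        entry = by_rootpath.get(f["rootpath"])
        if entry is None:
            # Hosts may share an entry only if they share a rootpath.
            entry = {"rootpath": f["rootpath"], "servers": []}
            by_rootpath[f["rootpath"]] = entry
            entries.append(entry)
        entry["servers"].append(f["host"])
    return entries

refs = to_fs_locations([
    {"host": "b.example.net", "rootpath": "/vol/x", "rank": 1, "order": 1},
    {"host": "a.example.net", "rootpath": "/vol/x", "rank": 1, "order": 0},
    {"host": "c.example.net", "rootpath": "/backup/x", "rank": 2, "order": 0},
])
print(refs)
```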
2.9. Namespace Database (NSDB)
The NSDB service is a federation-wide service that provides
interfaces to define, update, and query FSN information, FSL
information, and FSN-to-FSL mapping information.
An individual repository of namespace information is called an NSDB
node. The difference between the NSDB service and an NSDB node is
analogous to that between the DNS service and a particular DNS
server.

Each NSDB node is managed by a single administrative entity. A
single administrative entity can manage multiple NSDB nodes.
Each NSDB node stores the definition of the FSNs for which it is
authoritative. It also stores the definitions of the FSLs associated
with those FSNs. An NSDB node is authoritative for the filesets that
it defines.
An NSDB MAY be replicated throughout the federation. If an NSDB is
replicated, the NSDB MUST exhibit loose, converging consistency as
defined in [RFC3254]. The mechanism by which this is achieved is
outside the scope of this document. Many Lightweight Directory
Access Protocol (LDAP) implementations support replication. These
features MAY be used to replicate the NSDB.
2.9.1. NSDB Client
Each NSDB node supports an LDAP [RFC4510] interface. An NSDB client
is software that uses the LDAP protocol to access or update namespace
information stored on an NSDB node.
A domain's administrative entity uses NSDB client software to manage
information stored on NSDB nodes. Details of these transactions are
discussed in Section 5.1.
Fileservers act as an NSDB client when contacting a particular NSDB
node to resolve an FSN to a set of FSL records. The resulting
location information is then transferred to file-access clients via
referrals. Therefore, file-access clients never need to access NSDBs
directly. These transactions are described in Section 5.2.
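As an example of the resolution step, a fileserver acting as an NSDB
client performs an anonymous LDAP search for FSL entries whose FSN
UUID matches the junction's FSN. The helper below constructs only the
search filter string; the fedfsFsl object class and fedfsFsnUuid
attribute come from this document's NSDB schema (Section 4.2.2), while
the function itself is illustrative.

```python
def fsl_search_filter(fsn_uuid):
    """Build the LDAP filter used to resolve an FSN to its FSL records.
    UUID strings contain only hex digits and hyphens, so no RFC 4515
    filter escaping is needed here."""
    return f"(&(objectClass=fedfsFsl)(fedfsFsnUuid={fsn_uuid}))"

print(fsl_search_filter("f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
```

With an LDAP client library, this filter would be used in a subtree
search under the appropriate NSDB Container Entry.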
2.10. Junctions and Referrals
A junction is a point in a particular fileset namespace where a
specific target fileset may be attached. If a file-access client
traverses the path leading from the root of a federated namespace to
the junction referring to a target fileset, it should be able to
mount and access the data in that target fileset (assuming
appropriate permissions). In other words, a junction can be viewed
as a reference from a directory in one fileset to the root of the
target fileset.
A junction can be implemented as a special marker on a directory or
by some other mechanism in the fileserver's underlying file system.
What data is used by the fileserver to represent junctions is not
defined by this document. The essential property is that given a
junction, a fileserver must be able to find the FSN for the target
fileset.

When a file-access client reaches a junction, the fileserver refers
the client to a list of FSLs associated with the FSN targeted by the
junction. The client can then mount one of the associated FSLs.
The federation protocols do not limit where and how many times a
fileset is mounted in the namespace. Filesets can be nested; a
fileset can be mounted under another fileset.
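Conceptually, a junction maps a directory to the (NsdbName, FsnUuid)
pair naming the target fileset. The toy table below illustrates only
that mapping; how a fileserver actually marks junctions in its
underlying file system is implementation-defined, and all names here
are invented for the example.

```python
class JunctionTable:
    """Toy junction store: directory path -> (NsdbName, FsnUuid).
    Illustrative only; not how any real fileserver represents junctions."""
    def __init__(self):
        self._junctions = {}

    def add(self, directory, nsdb_name, fsn_uuid):
        self._junctions[directory] = (nsdb_name, fsn_uuid)

    def resolve(self, directory):
        # Returns the FSN to look up on the named NSDB node, or None if
        # the directory is not a junction.
        return self._junctions.get(directory)

j = JunctionTable()
j.add("/exports/projects", "nsdb.example.net", "fsn-uuid-1")
print(j.resolve("/exports/projects"))
```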
2.11. Unified Namespace and the Root Fileset
The root fileset, when defined, is the top-level fileset of the
federation-wide namespace. The root of the unified namespace is the
top-level directory of this fileset. A set of designated fileservers
in the federation can export the root fileset to render the
federation-wide unified namespace. When a file-access client mounts
the root fileset from any of these designated fileservers, it can
view a common federation-wide namespace.
2.12. UUID Considerations
To ensure FSN and FSL records are unique across a domain, Federated
File System (FedFS) employs UUIDs conforming to [RFC4122] to form the
distinguished names of LDAP records containing FedFS data (see
Section 4.2.2.2).
Because junctions store a tuple containing an FSN UUID and the name
and port of an NSDB node, an FSN UUID must be unique only on a single
NSDB node. An FSN UUID collision can be detected immediately when an
administrator attempts to publish an FSN or FSL by storing it under a
specific NSDB Container Entry (NCE) on an authoritative NSDB host.
Note that one NSDB node may store multiple NCEs, each under a
different namingContext. If an NSDB node must contain more than one
NCE, the federation's admin entity SHOULD provide a robust method for
preventing FSN UUID collisions between FSNs that reside on the same
NSDB node but under different NCEs.
Because FSLs are children of FSNs, FSL UUIDs must be unique for just
a single FSN. As with FSNs, as soon as an FSL is published, its
uniqueness is guaranteed.
A fileserver performs the operations described in Section 5.2 as an
unauthenticated user. Thus, distinguished names of FSN and FSL
records, as well as the FSN and FSL records themselves, are required
to be readable by anyone who can bind anonymously to an NSDB node.
Therefore, FSN and FSL UUIDs should be considered public information.

Version 1 UUIDs contain a host's Media Access Control (MAC) address
and a timestamp in the clear. This gives provenance to each UUID,
but attackers can use such details to guess information about the
host where the UUID was generated. Security-sensitive installations
should be aware that on externally facing NSDBs, UUIDs can reveal
information about the hosts where they are generated.
In addition, version 1 UUIDs depend on the notion that a hardware MAC
address is unique across machines. As virtual machines do not depend
on unique physical MAC addresses and, in any event, an administrator
can modify the physical MAC address, version 1 UUIDs are no longer
considered sufficient.
To minimize the probability of UUIDs colliding, a consistent
procedure for generating UUIDs should be used throughout a
federation. Within a federation, UUIDs SHOULD be generated using the
procedure described for version 4 of the UUID variant specified in
[RFC4122].
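In Python, for example, the recommended version 4 procedure
corresponds to uuid.uuid4(), which draws the UUID from random bits and
therefore embeds neither a MAC address nor a timestamp (a minimal
sketch):

```python
import uuid

# Version 4: generated from random bits; reveals no MAC address or
# timestamp, unlike version 1 (uuid.uuid1()).
fsn_uuid = uuid.uuid4()
print(fsn_uuid, fsn_uuid.version)
```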