About Lustre file systems

Overview

Lustre is a high-performance storage architecture and scalable parallel file system for use with computing clusters, supercomputers, visualization systems, and desktop workstations. Its compliance with the POSIX standard makes it well suited for use with Linux-based systems, but it can be re-exported with NFS or CIFS to enable use by Windows and OS X clients. Lustre can scale to provide petabytes of storage capacity to thousands of clients with hundreds of gigabytes per second of I/O bandwidth. Lustre also features integrated network diagnostics, and mechanisms for performance monitoring and tuning.

Lustre started as a research project at Carnegie Mellon University (CMU) and is now developed and distributed as open source software under the GNU General Public License version 2 (GPLv2). Development of Lustre is supported by the non-profit Open Scalable File Systems (OpenSFS) organization. For more, see the Lustre website.

Key components

A Lustre file system comprises the following components:

Lustre clients: Lustre client software runs on computational, visualization, or desktop nodes that communicate with the file system's servers via the Lustre Network (LNET) layer, which supports a variety of network technologies, including InfiniBand, Ethernet, Seastar, and Myrinet. Within an organization, Lustre presents clients with a consolidated global namespace for all files and data in the file system. When Lustre is mounted on a client, its users can transfer and manage file system data as if those data were stored locally; however, clients never have direct access to the underlying file storage.

Management Target (MGT): The MGT stores file system configuration information for use by the clients and other Lustre components. Although MGT storage requirements are relatively small even in the largest file systems, the information stored there is vital to system access.

Management Server (MGS): The MGS manages the configuration data stored on the MGT. Lustre clients contact the MGS to retrieve information from the MGT.

Metadata Server (MDS): The MDS manages the namespace metadata stored on the Metadata Target (MDT). Lustre clients contact the MDS to retrieve this information from the MDT. The MDS is not involved in file read/write operations.

Object Storage Targets (OSTs): The OSTs store user file data in one or more logical objects that can be striped across multiple OSTs.
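
To make the idea of striping concrete, the following sketch computes which stripe (and therefore which OST object) holds a given byte offset, assuming the round-robin (RAID 0 style) layout used for striped files. The function name stripe_index_for_offset and the stripe size and stripe count in the example calls are illustrative assumptions, not values taken from any particular file system.

```shell
#!/bin/sh
# Sketch: map a byte offset within a file to the 0-based index of the
# stripe that holds it, assuming round-robin striping.
# Arguments are hypothetical illustration parameters:
#   $1 = byte offset, $2 = stripe size in bytes, $3 = stripe count
stripe_index_for_offset() {
    offset=$1
    stripe_size=$2
    stripe_count=$3
    # Find the fixed-size chunk containing the offset, then wrap the
    # chunk number around the available stripes.
    echo $(( (offset / stripe_size) % stripe_count ))
}

# Example: 4 stripes of 1 MiB each (assumed values).
stripe_index_for_offset 0 1048576 4          # prints 0 (first chunk)
stripe_index_for_offset 5242880 1048576 4    # prints 1 (sixth chunk wraps around)
```

With more stripes, consecutive chunks of a large file land on more OSTs, which is why striping can increase bandwidth for large files.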

Some helpful commands

Following are some helpful commands for working with files on the DC2 or DC-WAN2 file systems:

Get the total sum of data stored in your scratch directory:

du -hc /N/dc2/scratch/username

List your files in reverse order by date modified:

find . -type f -exec ls -hltr {} +

Note:

On a Lustre file system, using the ls command with the -l option to list the contents of a directory in long format can cause performance issues for you and other users, especially if the directory contains a large number of files. Because Lustre performs file read/write and metadata operations separately, executing ls -l involves contacting both the Lustre MDS (to get path, ownership, and permissions metadata) and one or more Object Storage Servers (OSSs), which in turn must contact one or more OSTs to get information about the data objects that make up your files. Use ls -l only when necessary, or on directories that contain a small number of files. For more, see Listing files.
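
As a sketch of the advice above, the following helpers list and count directory entries by name only, which generally avoids the per-file size and timestamp lookups that ls -l triggers. The function names are hypothetical. Using command ls bypasses any shell alias or function for ls; such aliases often add options like --color, which can cause the same per-file lookups as -l.

```shell
#!/bin/sh
# Sketch: list a directory by name only, without the long-format details
# that require extra per-file lookups.
# Function names are hypothetical helpers for illustration.

# Print entry names, one per line (hidden files excluded, as with plain ls).
list_names() {
    command ls -1 -- "$1"
}

# Count entries without retrieving sizes or timestamps.
# (File names containing newlines would be counted more than once.)
count_entries() {
    list_names "$1" | wc -l | tr -d ' '
}

# Usage (the path is a placeholder):
# count_entries /N/dc2/scratch/username
```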

Set Lustre striping:

lfs setstripe -c X <file|directory>

In the example above, replace X with the number of stripes to set for a file or directory (the default is one stripe).

Note:

Too many stripes may negatively impact performance (16 should be the maximum). Also, setstripe does not affect existing data.
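
Because setstripe does not affect existing data, one common way to give an existing file a new layout is to create an empty file with the desired stripe count and then copy the data into it. The following is a minimal sketch of that approach, assuming the lfs utility is on your PATH; the function name restripe_copy and the .restripe.tmp suffix are hypothetical, and the copy does not preserve timestamps or permissions unless you add the appropriate cp options.

```shell
#!/bin/sh
# Sketch: rewrite an existing file with a new stripe count by copying it.
# Assumes the Lustre `lfs` utility is available; `restripe_copy` and the
# ".restripe.tmp" suffix are hypothetical names for illustration.
#   $1 = stripe count, $2 = path to an existing file
restripe_copy() {
    count=$1
    src=$2
    tmp="$src.restripe.tmp"

    # Create an empty file that carries the requested stripe count.
    lfs setstripe -c "$count" "$tmp" || return 1
    # Copy the data into the new file, removing it again if the copy fails.
    cp -- "$src" "$tmp" || { rm -f -- "$tmp"; return 1; }
    # Replace the original with the restriped copy.
    mv -- "$tmp" "$src"
}

# Usage (the file name is a placeholder):
# restripe_copy 4 mydata.bin
```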

Show the number of stripes for a file and the OSTs on which the stripes are located: