EMC DSSD D5 Storage Appliance Integration for Hadoop DataNodes

Overview of EMC DSSD D5 Integration

The EMC DSSD D5 provides a high-speed, low-latency storage solution based on flash media. It has been optimized for use as storage for DataNodes in the
Cloudera CDH distribution. The DataNode hosts connect directly to the DSSD D5 using a PCIe card interface. In a CDH cluster, only the DataNodes use the DSSD D5 for storage; all other hosts use
standard disks.

To manage clusters that use DSSD D5 storage, enable DSSD Mode in Cloudera Manager. All other Hadoop components operate normally. When this mode is
enabled, Cloudera Manager can manage only clusters whose DataNodes use the DSSD D5; a single Cloudera Manager instance cannot manage both a DSSD D5 cluster and a cluster that uses regular DataNodes.
Within a cluster, all DataNodes must connect to the DSSD D5; you cannot mix DataNode types.
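
The DSSD Mode constraints described above can be expressed as a simple pre-installation check. The following is a minimal sketch only: the DataNode record, the uses_dssd flag, and the find_dssd_mode_violations function are hypothetical names invented for illustration and are not part of Cloudera Manager or its API. It assumes you keep your own inventory of planned DataNode hosts per cluster.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DataNode:
    # Hypothetical inventory record; not a Cloudera Manager type.
    hostname: str
    uses_dssd: bool  # True if the host connects to the DSSD D5 over PCIe


def find_dssd_mode_violations(clusters: Dict[str, List[DataNode]]) -> List[str]:
    """Return one message per cluster that breaks the DSSD Mode rules.

    With DSSD Mode enabled, every DataNode in every managed cluster must
    use the DSSD D5, so any non-DSSD DataNode is reported as a violation.
    """
    problems = []
    for cluster_name, nodes in clusters.items():
        non_dssd = [node.hostname for node in nodes if not node.uses_dssd]
        if non_dssd:
            problems.append(
                f"{cluster_name}: DataNodes without DSSD D5 storage: "
                + ", ".join(non_dssd)
            )
    return problems
```

An empty result means the planned layout satisfies both rules: no regular-DataNode clusters under the same Cloudera Manager instance, and no mixed DataNode types within a cluster.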

Installing CDH with DSSD DataNodes

Use Cloudera Manager to install a DSSD D5-enabled cluster. You can install Cloudera Manager in several ways, and you can use it to install agents and other software on
all hosts in your cluster. Installing CDH with DSSD D5 DataNodes is similar to a non-DSSD D5 installation, except for the following:

You cannot install a DSSD D5 cluster using a Cloudera Manager instance that is already managing a cluster.
