Using the same version of the same operating system on all cluster hosts is strongly recommended.

Supported JDK Versions

Cloudera Manager supports Oracle JDK 1.7.0_67 and 1.8.0_11 when managing CDH 5.x, and Oracle JDK 1.6.0_31 and 1.7.0_67 when managing CDH 4.x. When managing both CDH 4.x and CDH 5.x clusters, Cloudera Manager supports Oracle JDK 1.7.0_67 and 1.8.0_11. Oracle JDK 1.6.0_31 and 1.7.0_67 can be installed during installation and upgrade. For further information, see Java Development Kit Installation.

Supported Browsers

The Cloudera Manager Admin Console, which you use to install, configure, manage, and monitor services, supports the following browsers:

Mozilla Firefox 24 and 31

Google Chrome

Internet Explorer 9 and higher; Internet Explorer 11 in Native Mode

Safari 5 and higher

Supported Databases

Cloudera Manager requires several databases. The Cloudera Manager Server stores information about configured services, role assignments, configuration history, commands, users, and running processes in a database of its own. You must also specify a database for the Activity Monitor and Reports Manager management services.

Important: When processes restart, the configuration for each of the services is redeployed using information that is saved in the Cloudera Manager database. If this information is not available, your cluster will not start or function correctly. You must therefore schedule and maintain regular backups of the Cloudera Manager database in order to recover the cluster in the event of the loss of this database.

After installing a database, upgrade to the latest patch version and apply any other appropriate updates. Available updates may be specific to the operating system on which the database is installed.

Cloudera Manager and its supporting services can use the following databases:

MySQL - 5.5 and 5.6

Oracle 11gR2

PostgreSQL - 8.4, 9.2, and 9.3

Cloudera supports the shipped version of MySQL and PostgreSQL for each supported Linux distribution. Each database is supported for all components in Cloudera Manager and CDH subject to the notes in CDH 4 Supported Databases and CDH 5 Supported Databases.
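As a minimal sketch, the version list above can be captured in a small lookup for pre-installation checks. The function name and the major.minor matching rule are assumptions made for illustration. The sketch does not account for distribution-shipped versions or for the per-component notes referenced above.

```python
# Supported database versions, copied from the list above.
SUPPORTED_DATABASES = {
    "mysql": {"5.5", "5.6"},
    "oracle": {"11gR2"},
    "postgresql": {"8.4", "9.2", "9.3"},
}


def is_supported_database(engine: str, version: str) -> bool:
    """Return True if the engine and version appear in the list above.

    A patch suffix such as "5.6.21" is tolerated: only the leading
    major.minor part is compared (an assumption of this sketch).
    """
    supported = SUPPORTED_DATABASES.get(engine.lower())
    if supported is None:
        return False
    if version in supported:
        return True
    major_minor = ".".join(version.split(".")[:2])
    return major_minor in supported


print(is_supported_database("MySQL", "5.6.21"))     # True
print(is_supported_database("PostgreSQL", "9.4"))   # False
```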

Supported CDH and Managed Service Versions

The following versions of CDH and managed services are supported:

Warning: Cloudera Manager 5 does not support CDH 3, and you cannot upgrade Cloudera Manager 4 to Cloudera Manager 5 if you have a cluster running CDH 3. Therefore, to upgrade CDH 3 clusters to CDH 4 using Cloudera Manager, you must use Cloudera Manager 4.

Resource Requirements

Cloudera Manager requires the following resources:

Disk Space

Cloudera Manager Server

5 GB on the partition hosting /var.

500 MB on the partition hosting /usr.

For parcels, the space required depends on the number of parcels you download to the Cloudera Manager Server and distribute to Agent hosts. You can download multiple parcels of the same product, of different versions and builds. If you are managing multiple clusters, only one parcel of a product/version/build/distribution is downloaded on the Cloudera Manager Server—not one per cluster. In the local parcel repository on the Cloudera Manager Server, the approximate sizes of the various parcels are as follows:

Cloudera Management Service - The Host Monitor and Service Monitor databases are stored on the partition hosting /var. Ensure that you have at least 20 GB available on this partition. For more information, see Data Storage for Monitoring Data.

Agents - On Agent hosts each unpacked parcel requires about three times the space of the downloaded parcel on the Cloudera Manager Server. By default unpacked parcels are located in /opt/cloudera/parcels.
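The "about three times" rule of thumb above lends itself to a quick estimate. The sketch below assumes the factor is exactly 3, and the parcel sizes are hypothetical values chosen for illustration, not figures from this document.

```python
UNPACK_FACTOR = 3  # "about three times" per the text above; an approximation


def agent_parcel_space_mb(downloaded_sizes_mb):
    """Estimate the space (MB) one Agent host needs for unpacked parcels."""
    return sum(size * UNPACK_FACTOR for size in downloaded_sizes_mb)


# Hypothetical downloaded parcel sizes in MB (not taken from the source).
example_parcels_mb = [1500, 200]
print(agent_parcel_space_mb(example_parcels_mb))  # 5100
```

With the default location, this space would need to be available on the partition hosting /opt/cloudera/parcels.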

RAM - 4 GB is recommended for most cases and is required when using Oracle databases. 2 GB may be sufficient for non-Oracle deployments with fewer than 100 hosts. However, to run the Cloudera Manager Server on a machine with 2 GB of RAM, you must tune down its maximum heap size (by modifying -Xmx in /etc/default/cloudera-scm-server). Otherwise the kernel may kill the Server for consuming too much RAM.
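As a hedged illustration of the heap adjustment described above, the following sketch rewrites the -Xmx value in a single options line. The variable name CMF_JAVA_OPTS, the other option shown, and the 1G target are assumptions made for illustration; check the actual contents of /etc/default/cloudera-scm-server and choose a value suited to the host before changing anything.

```python
import re


def set_max_heap(line: str, new_size: str) -> str:
    """Replace the first -Xmx value in a JVM options line with new_size."""
    if not re.fullmatch(r"\d+[kKmMgG]", new_size):
        raise ValueError(f"unexpected heap size: {new_size!r}")
    updated, count = re.subn(r"-Xmx\d+[kKmMgG]?", f"-Xmx{new_size}", line, count=1)
    if count == 0:
        raise ValueError("no -Xmx option found in line")
    return updated


# Hypothetical line; the real file may differ.
sample = 'export CMF_JAVA_OPTS="-Xmx2G -XX:MaxPermSize=256m"'
print(set_max_heap(sample, "1G"))
# export CMF_JAVA_OPTS="-Xmx1G -XX:MaxPermSize=256m"
```

The function only transforms text; writing the result back to the file and restarting the Server are left to the administrator.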

Networking and Security Requirements

The hosts in a Cloudera Manager deployment must satisfy the following networking and security requirements:

Cluster hosts must have a working network name resolution system and a correctly formatted /etc/hosts file. All cluster hosts must have properly configured forward and reverse host resolution through DNS. The /etc/hosts files must:

Contain consistent information about hostnames and IP addresses across all hosts

Not contain uppercase hostnames

Not contain duplicate IP addresses

Also, do not use aliases, either in /etc/hosts or in configuring DNS. A properly formatted /etc/hosts file should be similar to the following example:
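As a rough sketch of how the /etc/hosts rules above could be checked automatically, the Python below parses hosts-file text and reports duplicate IP addresses and uppercase hostnames. The sample entries (addresses and example.com names) are hypothetical, and the function names are illustrative. Consistency across hosts and DNS resolution are not checked here.

```python
from collections import Counter

# Hypothetical sample; addresses and hostnames are illustrative only.
SAMPLE_HOSTS = """\
127.0.0.1    localhost
192.168.1.1  cluster-01.example.com
192.168.1.2  cluster-02.example.com
192.168.1.3  cluster-03.example.com
"""


def parse_hosts(text):
    """Return a list of (ip, [names]) tuples, skipping comments and blanks."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        entries.append((ip, names))
    return entries


def check_hosts(text):
    """Return a list of human-readable problems found in the hosts text."""
    problems = []
    entries = parse_hosts(text)
    ip_counts = Counter(ip for ip, _ in entries)
    for ip, count in ip_counts.items():
        if count > 1:
            problems.append(f"duplicate IP address: {ip}")
    for _, names in entries:
        for name in names:
            if name != name.lower():
                problems.append(f"uppercase hostname: {name}")
    return problems


print(check_hosts(SAMPLE_HOSTS))  # []
```

Running the same check on the file from every host and comparing the parsed entries would be one way to approach the consistency requirement.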

In most cases, the Cloudera Manager Server must have SSH access to the cluster hosts when you run the installation or upgrade wizard. You must log in using a root account or an account that has password-less sudo permission. For authentication during the installation and upgrade procedures, you must either enter the password or upload a public and private key pair for the root or sudo user account. If you want to use a public and private key pair, the public key must be installed on the cluster hosts before you use Cloudera Manager.

Cloudera Manager uses SSH only during the initial install or upgrade. Once the cluster is set up, you can disable root SSH access or change the root password. Cloudera Manager does not save SSH credentials, and all credential information is discarded when the installation is complete. For more information, see Permission Requirements for Package-based Installations and Upgrades of CDH.

If single user mode is not enabled, the Cloudera Manager Agent runs as root so that it can make sure the required directories are created and that processes and files are owned by the appropriate user (for example, the hdfs and mapred users).

No blocking is done by Security-Enhanced Linux (SELinux).

IPv6 must be disabled.

No blocking by iptables or firewalls; port 7180 must be open because it is used to access Cloudera Manager after installation. Cloudera Manager communicates using specific ports, which must be open.

For RedHat and CentOS, the /etc/sysconfig/network file on each host must contain the hostname you have just set (or verified) for that host.
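To confirm that no firewall is blocking the Admin Console port mentioned above, a simple TCP connection test is often enough. The sketch below is a generic reachability check, not a Cloudera-provided tool, and the function name is illustrative.

```python
import socket

CM_ADMIN_PORT = 7180  # Admin Console port named in the text above


def port_is_open(host: str, port: int = CM_ADMIN_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling port_is_open("cm-server.example.com") from a workstation (the hostname is hypothetical) returns False if the port is filtered, the Server is not running, or the name does not resolve. A False result therefore does not by itself identify the cause.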

Cloudera Manager and CDH use several user accounts and groups to complete their tasks. The set of user accounts and groups varies according to the components you choose to install. Do not delete these accounts or groups and do not modify their permissions and rights. Ensure that no existing systems prevent these accounts and groups from functioning. For example, if you have scripts that delete user accounts not in a whitelist, add these accounts to the list of permitted accounts. Cloudera Manager, CDH, and managed services create and use the following accounts and groups:

Table 1. Users and Groups

| Component (Version) | Unix User ID | Groups | Notes |
| --- | --- | --- | --- |
| Cloudera Manager (all versions) | cloudera-scm | cloudera-scm | Cloudera Manager processes such as the Cloudera Manager Server and the monitoring roles run as this user. The Cloudera Manager keytab file must be named cmf.keytab since that name is hard-coded in Cloudera Manager. Note: Applicable to clusters managed by Cloudera Manager only. |
| Apache Accumulo (Accumulo 1.4.3 and higher) | accumulo | accumulo | Accumulo processes run as this user. |
| Apache Avro | | | No special users. |
| Apache Flume (CDH 4, CDH 5) | flume | flume | The sink that writes to HDFS as this user must have write privileges. |
| Apache HBase (CDH 4, CDH 5) | hbase | hbase | The Master and the RegionServer processes run as this user. |
| HDFS (CDH 4, CDH 5) | hdfs | hdfs, hadoop | The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it. |
| Apache Hive (CDH 4, CDH 5) | hive | hive | The HiveServer2 process and the Hive Metastore processes run as this user. A user must be defined for Hive access to its Metastore DB (for example, MySQL or Postgres), but it can be any identifier and does not correspond to a Unix uid. This is javax.jdo.option.ConnectionUserName in hive-site.xml. |
| Apache HCatalog (CDH 4.2 and higher, CDH 5) | hive | hive | The WebHCat service (for REST access to Hive functionality) runs as the hive user. |
| MapReduce (CDH 4, CDH 5) | mapred | mapred, hadoop | Without Kerberos, the JobTracker and tasks run as this user. The LinuxTaskController binary is owned by this user for Kerberos. |
| Apache Oozie (CDH 4, CDH 5) | oozie | oozie | The Oozie service runs as this user. |
| Parquet | | | No special users. |
| Apache Pig | | | No special users. |
| Cloudera Search (CDH 4.3 and higher, CDH 5) | solr | solr | The Solr processes run as this user. |
| Apache Spark (CDH 5) | spark | spark | The Spark History Server process runs as this user. |
| Apache Sentry (incubating) (CDH 5.1 and higher) | sentry | sentry | The Sentry service runs as this user. |
| Apache Sqoop (CDH 4, CDH 5) | sqoop | sqoop | This user is only for the Sqoop1 Metastore, a configuration option that is not recommended. |
| Apache Sqoop2 (CDH 4.2 and higher, CDH 5) | sqoop2 | sqoop, sqoop2 | The Sqoop2 service runs as this user. |
| Apache Whirr | | | No special users. |
| YARN (CDH 4, CDH 5) | yarn | yarn, hadoop | Without Kerberos, all YARN services and applications run as this user. The LinuxContainerExecutor binary is owned by this user for Kerberos. |
| Apache ZooKeeper (CDH 4, CDH 5) | zookeeper | zookeeper | The ZooKeeper processes run as this user. It is not configurable. |

What's New

What's New in Cloudera Manager 5.3.0

JDK 1.8 - Cloudera Manager adds support for Oracle JDK 1.8.

Single user mode - The Cloudera Manager Agent and all service processes can now be run as a single configured user in environments where running as root is not permitted. See Single User Mode Requirements.

CDH upgrade wizard enhanced - The CDH upgrade wizard now supports minor and maintenance version upgrade as well as major version upgrade.

Oozie Sharelib - The Oozie Sharelib can be updated without restarting the Oozie service.

Read-only users prevented from viewing process logs or environment - Read-only users can no longer view the environment or logs of a process. This is to prevent read-only users from seeing potentially sensitive information.

Data-at-rest encryption

Important: Cloudera provides two solutions:

Navigator Encrypt is production ready and available to Cloudera customers licensed for Cloudera Navigator. Navigator Encrypt operates at the Linux volume level, so it can encrypt cluster data inside and outside HDFS. Consult your Cloudera account team for more information.

HDFS Encryption is production ready and operates at the HDFS directory level, enabling encryption to be applied only to HDFS folders where needed.

HDFS encryption implements transparent, end-to-end encryption of data read from and written to HDFS by creating encryption zones. An encryption zone is a directory in HDFS with every file and subdirectory in it encrypted. Use one of the following services to store, manage, and access encryption zone keys:

KMS (File) - The Hadoop Key Management Server with a file-based Java keystore; maintains a single copy of keys, using simple password-based protection.

KMS (Navigator Key Trustee) - An enterprise-grade key management service that replaces the file-based Java keystore and leverages the advanced key-management capabilities of Cloudera Navigator Key Trustee. Navigator Key Trustee is designed for secure, authenticated administration and cryptographically strong storage of keys on multiple redundant servers that can be located outside the cluster.
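To make the encryption zone workflow described above more concrete, the sketch below assembles the standard Hadoop command lines for creating a key and turning an empty directory into an encryption zone. It only builds and prints the commands; it does not run them. The key name and path are hypothetical, and the sketch assumes a KMS is already configured and that the commands would be run by a user with sufficient privileges.

```python
def encryption_zone_commands(key_name: str, zone_path: str):
    """Return the CLI invocations that create a key and an encryption zone."""
    return [
        # Create the encryption zone key in the configured key provider (KMS).
        ["hadoop", "key", "create", key_name],
        # The zone directory must exist and be empty before it becomes a zone.
        ["hdfs", "dfs", "-mkdir", "-p", zone_path],
        # Mark the directory as an encryption zone that uses the key.
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone_path],
    ]


# Hypothetical key name and HDFS path.
for cmd in encryption_zone_commands("mykey", "/zones/secure"):
    print(" ".join(cmd))
```

Files written under the zone path afterwards are encrypted and decrypted transparently for authorized clients.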

The Cloudera Manager Server now reports the correct number of physical cores and hyper-threading cores if hyper-threading is enabled.

Client configurations - Client configurations are now managed so that they are redeployed when a machine is re-imaged.

Important: The changes to client configurations affect some API calls, as follows:

When a host ceases to have a client configuration assigned to it, Cloudera Manager will remove it, rather than leaving it behind. If a host has a client configuration assigned and the client configuration is missing, Cloudera Manager will recreate it.

If you currently use the API command deployClientConfig to deploy the client configurations for a particular service, and you pass a specific set of role names to this call to narrow the set of hosts that receive the new client configuration, then you should be aware that:

The API command will continue to generate and deploy the client configuration only to the hosts that correspond to the specified role names.

Any other hosts that previously had deployed client configurations, but do not have gateway roles assigned to them, will have those client configurations removed from them. This is the new behavior.

The behavior of the cluster-level deployClientConfig command, and of the service-level command when called with no arguments, is unchanged. The command still deploys a new client configuration to all hosts with roles corresponding to the specified service or cluster.

Because this change results from internal functional changes in Cloudera Manager, it is not restricted to any new API level. The deployClientConfig command is affected at all API levels.
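For readers who call the service-level command from scripts, the sketch below shows roughly what such a request looks like. It only constructs the HTTP request and does not send it. The base URL, API version segment, cluster name, service name, and role name are all hypothetical, and the endpoint path and "items" body shape are written from memory of the Cloudera Manager REST API, so verify them against the API documentation for your release. Authentication is omitted.

```python
import json
import urllib.request
from urllib.parse import quote


def build_deploy_client_config_request(base_url, cluster, service, role_names=None):
    """Build (but do not send) a service-level deployClientConfig request.

    An empty role list corresponds to calling the command with no arguments.
    """
    url = "{}/clusters/{}/services/{}/commands/deployClientConfig".format(
        base_url.rstrip("/"), quote(cluster, safe=""), quote(service, safe="")
    )
    body = json.dumps({"items": list(role_names or [])}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# All values below are hypothetical.
req = build_deploy_client_config_request(
    "http://cm-server.example.com:7180/api/v9",
    "Cluster 1",
    "hdfs",
    ["hdfs-GATEWAY-1"],
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Under the behavior described above, passing a narrowed role list like this means that hosts outside the list that lack gateway roles would have their previously deployed client configurations removed.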