6. Define Cluster Configuration

The Hortonworks Data Platform (HDP) consists of multiple components that are installed across the cluster.
The cluster properties file specifies the directory locations and node host locations for each component.
This section describes how to construct the cluster properties file, which defines the cluster blueprint for all HDP
components that need to be installed.

Use the following instructions to configure the HDP installer for your cluster:

Create a clusterproperties.txt file.

Add the properties to the clusterproperties.txt file as
described in the table given below:

Important

Ensure that all the properties in the
clusterproperties.txt file are separated by a newline
character.

Ensure that the directory paths do not contain any whitespace characters.

For example, C:\Program Files\Hadoop is an invalid
directory path for HDP because it contains a space.

Use Fully Qualified Domain Names (FQDN) for specifying the network host
name for each cluster host. The FQDN is a DNS name that uniquely identifies
the computer on the network. By default, it is a concatenation of the host
name, the primary DNS suffix, and a period.

When specifying the host lists in the clusterproperties.txt file, if the hosts are multi-homed
or have multiple NICs, make sure that each name or IP address by which you specify a
host is the preferred name or IP address by which the hosts communicate among
themselves. In other words, these should be the addresses used internally within the cluster, not
those used for addressing cluster nodes from outside the cluster.
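The checks in the notes above can be sketched as a small pre-flight script. This is a minimal illustration, not part of the HDP installer: it assumes each line of the file has the form NAME=value (the notes only state that properties are separated by newlines), and the helper names parse_properties and check_properties are hypothetical. The FQDN test is deliberately rough and only verifies that a host name carries a DNS suffix.

```python
import re

# Assumption: one NAME=value pair per line. The values are taken from the
# example column of the table in this section.
SAMPLE = r"""
HDP_LOG_DIR=d:\hadoop\logs
HDP_DATA_DIR=d:\hdp\data
NAMENODE_HOST=NAMENODE_MASTER.acme.com
SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com
"""


def parse_properties(text):
    """Return a dict mapping property name to value, one property per line."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a NAME=value line: {line!r}")
        props[name.strip()] = value.strip()
    return props


def check_properties(props):
    """Return a list of problem descriptions (empty if none were found)."""
    problems = []
    for name, value in props.items():
        if name.endswith("_DIR"):
            # A directory property may hold several comma-separated paths.
            for path in value.split(","):
                path = path.strip()
                if re.search(r"\s", path):
                    problems.append(f"{name}: path contains whitespace: {path!r}")
        elif name.endswith(("_HOST", "_HOSTS", "_HOSTNAME")):
            for host in value.split(","):
                host = host.strip()
                # Rough check only: a fully qualified name has a DNS suffix.
                if "." not in host.strip("."):
                    problems.append(f"{name}: not a fully qualified name: {host!r}")
    return problems


if __name__ == "__main__":
    for problem in check_properties(parse_properties(SAMPLE)):
        print(problem)
```

Run against the sample values, the script reports nothing. A value such as C:\Program Files\Hadoop for a directory property, or a bare host name such as namenode, would each produce one reported problem.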

Table 1.6. Configuration values for MSI installer

Configuration Property Name

Description

Example value

Mandatory/Optional/Conditional

HDP_LOG_DIR

HDP's operational logs will be written to this directory on each cluster
host. Ensure that you have sufficient disk space for storing these
log files.

d:\hadoop\logs

Mandatory

HDP_DATA_DIR

HDP data will be stored in this directory on each cluster node. To use
multiple data directories, specify a comma-separated list of
locations.

d:\hdp\data

Mandatory

NAMENODE_HOST

The FQDN for the cluster node that will run the NameNode master
service.

NAMENODE_MASTER.acme.com

Mandatory

SECONDARY_NAMENODE_HOST

The FQDN for the cluster node that will run the Secondary NameNode master
service.

SECONDARY_NN_MASTER.acme.com

Mandatory

JOBTRACKER_HOST

The FQDN for the cluster node that will run the JobTracker master
service.

JOBTRACKER_MASTER.acme.com

Mandatory

HIVE_SERVER_HOST

The FQDN for the cluster node that will run the Hive Server master
service.

HIVE_SERVER_MASTER.acme.com

Mandatory

OOZIE_SERVER_HOST

The FQDN for the cluster node that will run the Oozie Server master
service.

OOZIE_SERVER_MASTER.acme.com

Mandatory

WEBHCAT_HOST

The FQDN for the cluster node that will run the WebHCat master
service.

WEBHCAT_MASTER.acme.com

Mandatory

FLUME_HOSTS

A comma-separated list of FQDNs for those cluster nodes that will run the Flume
service.

A comma-separated list of FQDNs for those cluster nodes that will run the
HBase Region Server services.

slave1.acme.com, slave2.acme.com, slave3.acme.com

Mandatory

SLAVE_HOSTS

A comma-separated list of FQDNs for those cluster nodes that will run the
DataNode and TaskTracker services.

slave1.acme.com, slave2.acme.com, slave3.acme.com

Mandatory

ZOOKEEPER_HOSTS

A comma-separated list of FQDNs for those cluster nodes that will run the
ZooKeeper service.

ZOOKEEPER_HOST.acme.com

Mandatory

DB_FLAVOR

Database type for the Hive and Oozie metastores (allowed databases are SQL
Server and Derby). To use the default embedded Derby instance, set the value of
this property to derby. To use an existing SQL Server
instance as the metastore DB, set the value to mssql.

mssql or derby

Mandatory

DB_HOSTNAME

FQDN for the node where the metastore database service is installed.
If using SQL Server, set the value to your SQL Server hostname. If
using Derby for Hive metastore, set the value to HIVE_SERVER_HOST.

sqlserver1.acme.com

Mandatory

DB_PORT

This is an optional property, required only if you are using SQL
Server for the Hive and Oozie metastores. By default, the
database port is set to 1433.

1433

Optional

HIVE_DB_NAME

Database for Hive metastore. If using SQL Server, ensure that you
create the database on the SQL Server instance.