How-to: Automate Your Cluster with Cloudera Manager API

API access was a new feature introduced in Cloudera Manager 4.0 (a free edition is available for download). Although not visible in the UI, this feature is very powerful, providing programmatic access to cluster operations (such as configuration and restart) and monitoring information (such as health and metrics). This article walks through an example of setting up a 4-node HDFS and MapReduce cluster via the Cloudera Manager (CM) API.

Cloudera Manager API Basics

The CM API is an HTTP REST API that uses JSON serialization. The API is served on the same host and port as the CM web UI, and does not require an extra process or extra configuration. The API supports HTTP Basic Authentication, accepting the same users and credentials as the web UI. API users have the same privileges as they do in the web UI.
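
To make the request shape concrete, here is a minimal sketch of what a direct call might look like when assembled by hand in Python. It only builds the request and does not send it. The port (7180), the /api/v1 URL prefix, the resource path, and the sample response body are assumptions made for illustration and are not taken from this article; the helper names build_service_request and parse_service_status are hypothetical.

```python
import base64
import json

try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2


def build_service_request(cm_host, cluster, service, username, password,
                          port=7180, api_version="v1"):
    """Build (but do not send) a GET request for one service's status."""
    # The port and the /api/<version> prefix are assumptions for this sketch.
    url = "http://%s:%d/api/%s/clusters/%s/services/%s" % (
        cm_host, port, api_version, cluster, service)
    # HTTP Basic Authentication: base64("user:password") in the header.
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


def parse_service_status(body):
    """Extract (serviceState, healthSummary) from a JSON response body."""
    service = json.loads(body)
    return service["serviceState"], service["healthSummary"]


request = build_service_request("cm_host", "dev01", "hdfs2", "admin", "admin")
print(request.get_full_url())

# Hypothetical response body, trimmed to the two fields used in this article.
sample_body = '{"name": "hdfs2", "serviceState": "STARTED", "healthSummary": "GOOD"}'
print(parse_service_status(sample_body))
```

Passing the request object to urlopen would send it to a live CM server; that step is left out here so the sketch runs without one.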

Interacting with the API

The most basic way to use the API is by making HTTP calls directly using tools like curl. For example, to obtain the status of service hdfs2 in cluster dev01 (note: italics are used for interactive shell code throughout):

The API also comes with a Python client for your convenience. To do the same in Python:

>>> from cm_api.api_client import ApiResource
>>> api = ApiResource('cm_host', username='admin', password='admin')
>>> dev01 = api.get_cluster('dev01')
>>> hdfs = dev01.get_service('hdfs2')
>>> print hdfs.serviceState, hdfs.healthSummary
STARTED GOOD

You can expect to see client bindings in more languages. A Java client is in the works right now.

Setting up a Cluster

Next I will demonstrate an API Python script that defines, configures, and starts a cluster. You are about to see some of the low-level details of Cloudera Manager. Compared with the UI wizard, the API route is more tedious, but it provides flexibility and programmatic control. You will also notice that this setup process does not require my cluster to be online until the very last step, where I start the services. This has proven useful to people who are stamping out pre-configured clusters.

Step 1. Define the Cluster

#!/usr/bin/env python

import socket
from cm_api.api_client import ApiResource

CM_HOST = "centos56-17.ent.cloudera.com"

api = ApiResource(CM_HOST, username="admin", password="admin")
cluster = api.create_cluster("prod01", "CDH4")

This creates a handle on the API. The ApiResource object also accepts other optional arguments such as port, TLS, and API version. With that handle, I create a cluster called prod01 on version CDH4. The call returns a handle to the new cluster.

Step 2. Create HDFS Service and Roles

Now we can create the services. HDFS comes first:

hdfs = cluster.create_service("hdfs01", "HDFS")

At this point, if I query the different role types supported by hdfs01, I will get:

Most of the code is performing host creation. That is required for role creation, as each role needs to be assigned to a host. In the end, the first host is assigned the NameNode, the Secondary NameNode and a DataNode. The rest are DataNodes.
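
The role layout described above can be sketched as a small planning function. This is not the article's script: it only computes which role goes on which host, so it runs without a CM server. The function name plan_hdfs_roles, the example hostnames, and the role names other than hdfs01-nn and hdfs01-dn3 are assumptions; the DataNode numbering follows the "mr01-tt" + str(i) scheme used in Step 4.

```python
def plan_hdfs_roles(service_name, hostnames):
    """Return (role_name, role_type, hostname) tuples for an HDFS service.

    The first host gets the NameNode, the Secondary NameNode and a
    DataNode; every other host gets a DataNode only.
    """
    if not hostnames:
        raise ValueError("at least one host is required")
    first = hostnames[0]
    plan = [
        (service_name + "-nn", "NAMENODE", first),
        (service_name + "-snn", "SECONDARYNAMENODE", first),  # assumed name
    ]
    # One DataNode per host, numbered from 0 (assumed naming scheme).
    for i, hostname in enumerate(hostnames):
        plan.append(("%s-dn%d" % (service_name, i), "DATANODE", hostname))
    return plan


# Hypothetical hostnames for a 4-node cluster.
hostnames = ["host%d.example.com" % i for i in range(4)]
plan = plan_hdfs_roles("hdfs01", hostnames)
for role_name, role_type, hostname in plan:
    print("%-12s %-18s %s" % (role_name, role_type, hostname))
```

Each tuple would then map onto a create_role call of the kind shown in Step 4, with the host's hostId in place of the hostname used here.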

At this point, if I query the first host, I can see the correct roles assigned to it:

Step 3. Configure HDFS

Service configuration is separated into service-wide configuration and role type configuration. Service-wide configuration typically covers settings that affect multiple role types, such as the HDFS replication factor. Role type configuration is a template that is inherited by specific role instances. For example, at the role type template level, I can set all DataNodes to use 3 data directories, and then override that for specific DataNodes by setting the role-level configuration.

hdfs_service_config = {
    'dfs_replication': 2,
}
nn_config = {
    'dfs_name_dir_list': '/dfs/nn',
    'dfs_namenode_handler_count': 30,
}
snn_config = {
    'fs_checkpoint_dir_list': '/dfs/snn',
}
dn_config = {
    'dfs_data_dir_list': '/dfs/dn1,/dfs/dn2,/dfs/dn3',
    'dfs_datanode_failed_volumes_tolerated': 1,
}
hdfs.update_config(
    svc_config=hdfs_service_config,
    NAMENODE=nn_config,
    SECONDARYNAMENODE=snn_config,
    DATANODE=dn_config)

# Use a different set of data directories for DN3
hdfs.get_role('hdfs01-dn3').update_config({
    'dfs_data_dir_list': '/dn/data1,/dn/data2'})
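
To illustrate the inheritance described in this step, the following sketch models how a role-level setting might shadow the role type template. It is a plain dictionary merge written for this article, not CM's actual resolution logic, and the function name effective_config is hypothetical. The keys and values are the ones used in the script above.

```python
def effective_config(role_type_config, role_override=None):
    """Merge a role type template with an optional role-level override.

    Keys set at the role level win; everything else is inherited
    from the template. The template itself is left unchanged.
    """
    merged = dict(role_type_config)
    merged.update(role_override or {})
    return merged


# Role type template shared by all DataNodes.
dn_template = {
    'dfs_data_dir_list': '/dfs/dn1,/dfs/dn2,/dfs/dn3',
    'dfs_datanode_failed_volumes_tolerated': 1,
}
# Role-level override applied to hdfs01-dn3 only.
dn3_override = {'dfs_data_dir_list': '/dn/data1,/dn/data2'}

dn1_effective = effective_config(dn_template)
dn3_effective = effective_config(dn_template, dn3_override)
print(dn1_effective['dfs_data_dir_list'])
print(dn3_effective['dfs_data_dir_list'])
```

Under this model, hdfs01-dn3 uses its own two data directories but still inherits dfs_datanode_failed_volumes_tolerated from the template.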

How do I find out the configuration keys used by CM? For example, how do I know that dfs_replication is the key for setting replication factor? I query the service:

Step 4. Create MapReduce Service and Roles

This step is similar to the HDFS one. I assign a TaskTracker to each node, and the JobTracker to the first one.

mr = cluster.create_service("mr01", "MAPREDUCE")
jt = mr.create_role("mr01-jt", "JOBTRACKER", hosts[0].hostId)
for i in range(4):
    mr.create_role("mr01-tt" + str(i), "TASKTRACKER", hosts[i].hostId)

Step 5. Configure MapReduce

Here is the code to configure the “mr01” service:

mr_service_config = {
    'hdfs_service': 'hdfs01',
}
jt_config = {
    'jobtracker_mapred_local_dir_list': '/mapred/jt',
    'mapred_job_tracker_handler_count': 40,
}
tt_config = {
    'tasktracker_mapred_local_dir_list': '/mapred/local',
    'mapred_tasktracker_map_tasks_maximum': 10,
    'mapred_tasktracker_reduce_tasks_maximum': 6,
}
gateway_config = {
    'mapred_reduce_tasks': 10,
    'mapred_submit_replication': 2,
}
mr.update_config(
    svc_config=mr_service_config,
    JOBTRACKER=jt_config,
    TASKTRACKER=tt_config,
    GATEWAY=gateway_config)

Two items deserve elaboration. The first is hdfs_service. Rather than asking the user for the equivalent of “fs.defaultFS”, CM has a MapReduce service depend on an HDFS service and derives its HDFS access parameters from how that HDFS service is configured.

Second, the “gateway” role type is unique to CM. It represents a client. A gateway role does not run any daemons. It simply receives client configuration, as part of the “deploy client configuration” process, which we will perform later.

Step 6. Start HDFS

HDFS is ready to start. This is the step that requires the cluster nodes to be up, CDH installed, and Cloudera Manager Agents running. (The API does not perform software installation.) As part of the preparation, I did that, and pointed the CM agents to the CM server by editing the server_host in /etc/cloudera-scm-agent/config.ini.

Now I can format HDFS and start it.

CMD_TIMEOUT = 180
# format_hdfs takes a list of NameNodes
cmd = hdfs.format_hdfs('hdfs01-nn')[0]
if not cmd.wait(CMD_TIMEOUT).success:
    raise Exception("Failed to format HDFS")

cmd = hdfs.start()
if not cmd.wait(CMD_TIMEOUT).success:
    raise Exception("Failed to start HDFS")

Each cmd object represents an asynchronous command. I wait for each one to complete and assert that it succeeded. Then I deploy the HDFS client configuration to the host running hdfs01-nn.
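
The wait-and-check pattern from Step 6 repeats for every command, so it could be factored into a small helper. The sketch below assumes only what the script shows: a command object has a wait(timeout) method whose result carries a success attribute. FakeCommand is a hypothetical stand-in so the example runs without a CM server, and wait_or_raise is a name invented here.

```python
class FakeCommand(object):
    """Hypothetical stand-in for the command objects the client returns."""

    def __init__(self, success):
        self.success = success

    def wait(self, timeout=None):
        # A real command would block here until done or timed out;
        # this stub is already finished and returns itself.
        return self


def wait_or_raise(cmd, timeout, description):
    """Wait for cmd and raise if it did not succeed."""
    result = cmd.wait(timeout)
    if not result.success:
        raise RuntimeError("Failed to " + description)
    return result


CMD_TIMEOUT = 180
results = []
for description, cmd in [("format HDFS", FakeCommand(True)),
                         ("start HDFS", FakeCommand(True))]:
    wait_or_raise(cmd, CMD_TIMEOUT, description)
    results.append(description + ": ok")
print(results)
```

With real command objects, the same helper would stop the script at the first failed step instead of continuing to start services on a half-prepared cluster.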

Advanced Usage

The Cloudera Manager API provides a lot more than configuration and service life-cycle management. You can also obtain service health information and metrics (for the Enterprise Edition), and configure Cloudera Manager itself. Here are some resources for your exploration: