Automate CM Express Wizard installation via API?

I work at Hadapt, a Cloudera partner company, on a product that builds on top of Cloudera CDH. For testing, we would like to be able to automatically create clusters with different kinds of full Cloudera deployments, as created and configured by Cloudera Manager. For example, we would like to be able to deploy CDH 4.3 and 4.4 using RPMs and Parcels into different hardware environments in a fully automated way.

We believe most users will be using the Express Wizard to install clusters. Is there any way to simply automate the equivalent of an Express Wizard install via CM without needing to interact with the GUI?

For example, I am able to install Cloudera Manager the way a customer would using:

I have been attempting to do so using the v5 API, but find that I need to do a lot of things outside of the API, such as setting up the agents on the cluster nodes via packages, trying to set possibly unsettable things like zookeeper-autocreate-dirs, or manually formatting HDFS. (I see that the v6 API includes support for doing a hostInstall. While necessary, I think this would still not be sufficient, as it does not cover the wizard side of things. Also, we would like to replicate the experience a customer would get when using CM 4.x, not the CM 5 beta.)

While I imagine I can eventually figure out how to make the cm-api calls work, I worry it will not completely replicate the customer experience.

Have I missed some piece of documentation that explains how to do this? Or are there perhaps some undocumented calls that we could take advantage of? There must be something that Cloudera uses internally to deploy for testing...

2) Install CM agents on cluster hosts (as you pointed out, this is only possible via the CM API in CM5)

3) Replicate every configuration and command performed by the CM wizard through the API. Every config and command involved is exposed in the API (our internal testing automation uses this). You can even go further and add things like enabling NN HA or JT HA.

Note that it is not possible to get CM recommendations via the API. You will need to determine all configuration manually. It may help to run the wizard through the UI, then capture what CM configures and repeat it in your API scripts. The "deployment" endpoint in the API can be very useful for this.
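One way to do the "capture what CM configures" step is to save the deployment JSON before and after running the wizard and diff the two. The following is only a rough sketch: the helper names `flatten` and `diff_snapshots` are made up for illustration, and the sample data is a hypothetical, heavily trimmed stand-in for a real deployment dump.

```python
def flatten(node, prefix=""):
    """Turn nested JSON-like data into a flat {path: leaf_value} dict.

    Empty dicts and lists produce no entries.
    """
    items = {}
    if isinstance(node, dict):
        for key, value in node.items():
            items.update(flatten(value, "%s/%s" % (prefix, key)))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            items.update(flatten(value, "%s[%d]" % (prefix, index)))
    else:
        items[prefix] = node
    return items


def diff_snapshots(before, after):
    """Return {path: (old_value, new_value)} for every leaf that differs.

    A path missing from one side is reported with None on that side.
    """
    old, new = flatten(before), flatten(after)
    changes = {}
    for path in sorted(set(old) | set(new)):
        if (path in old) != (path in new) or old[path] != new[path]:
            changes[path] = (old.get(path), new.get(path))
    return changes


# Hypothetical, trimmed snapshots taken before and after the wizard run.
before = {"clusters": []}
after = {"clusters": [{"name": "c1", "services": [{"type": "HDFS"}]}]}

for path, (old_value, new_value) in diff_snapshots(before, after).items():
    print(path, old_value, "->", new_value)
```

Note that list entries are compared by position, so a reordered list will show up as a set of changes. The resulting paths are just a checklist of what the wizard touched; turning them into API calls is still up to your scripts.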

Also, don't forget to install JDBC drivers on all cluster hosts if using MySQL or PostgreSQL.

Re: Automate CM Express Wizard installation via API?

Manually re-creating the behavior seemed somewhat error-prone because the "automated" version that the Express Wizard performs may change without us noticing. (We have even considered using Selenium or something similar to drive the wizard remotely, though that seems fraught with its own headaches.)

I had not noticed the deployment endpoint, thanks for the pointer. It looks like the deployment endpoint is not available in the Python API. From skimming the Java code, it looks functionally similar to this dump script that I threw together last week: https://gist.github.com/sit/7208850

Do you have a script that automatically restores, via the API, all the settings found in a deployment?

Looks like deployment is missing from the python bindings, as you pointed out. You'll have to manually access the URL, or enhance the python bindings = ). It's in the Java bindings.
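For the "manually access the URL" route, something like the sketch below should work with just the Python standard library. Treat the details as assumptions to check against your own setup: it assumes CM is listening on plain HTTP on port 7180, that the endpoint lives at `/api/v5/cm/deployment`, and that HTTP basic auth is accepted. The function names are made up for illustration and are not part of cm-api.

```python
import base64
import json
import urllib.request


def deployment_url(host, port=7180, api_version="v5"):
    """Build the deployment endpoint URL (port and path are assumptions)."""
    return "http://%s:%d/api/%s/cm/deployment" % (host, port, api_version)


def build_request(url, username, password):
    """Create a GET request that carries HTTP basic auth credentials."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Accept", "application/json")
    return request


def fetch_deployment(host, username, password, port=7180, api_version="v5"):
    """Download the deployment description and parse it as JSON."""
    request = build_request(deployment_url(host, port, api_version),
                            username, password)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage would be along the lines of `fetch_deployment("cm.example.com", "admin", "admin")`, with the hostname and credentials replaced by your own. The parsed result can then be written to a file with `json.dump` for later comparison.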

Using deployment won't run the commands for you, but is one way of quickly creating a cluster with the desired role assignments and configuration.

While it's true that steps change a little over time, CM will generally add steps and configs rather than remove them, and your old workflows will work as well as they used to because we try to maintain API compatibility. This means that you can usually expect your scripts to keep working, even when the CM version changes, and only need to update them if you want to take advantage of a new feature. One notable exception will be Impala, which will get a new mandatory role soon, even for CDH4. Keep an eye out for that when CM 4.8 comes out. We'll also make some minor incompatible changes in CM 5, like deleting deprecated, refactored, or unused configs. Other partners are using the CM API effectively to automate deployments.