Using the AWS SDK for Java to Create an Amazon EMR Cluster

This documentation is for AMI versions 2.x and 3.x of
Amazon EMR. For information about Amazon EMR releases 4.0.0 and above, see the Amazon EMR Release Guide.
For information about managing the Amazon EMR service in 4.x releases, see the Amazon EMR Management Guide.

The following example illustrates how the SDKs can simplify programming with Amazon EMR.
The code sample below uses the StepFactory object, a helper class for creating common Amazon EMR step types, to create
an interactive Hive cluster with debugging enabled.

Note

If you are adding IAM user visibility to a new cluster, call RunJobFlow and set
VisibleToAllUsers=true; otherwise, IAM users cannot view the cluster.

At minimum, you must pass a service role and a job flow role corresponding to EMR_DefaultRole and
EMR_EC2_DefaultRole, respectively. You can set up these roles with the AWS CLI
for the same account. First, check whether the roles already exist:

aws iam list-roles | grep EMR
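The same check can be sketched in Java for cases where the role lookup is part of a larger program. This is a minimal illustration, not part of the AWS SDK for Java: the class name EmrRoleCheck and the method missingRoles are hypothetical, and the sketch assumes the command output is in the CLI's JSON format, where each role appears with a "RoleName" field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it is not an AWS SDK class.
class EmrRoleCheck {

    // The two default role names named in the text above.
    static final List<String> REQUIRED_ROLES =
            Arrays.asList("EMR_DefaultRole", "EMR_EC2_DefaultRole");

    // Assumption: the output is JSON, so each role shows up as "RoleName": "<name>".
    private static final Pattern ROLE_NAME =
            Pattern.compile("\"RoleName\"\\s*:\\s*\"([^\"]+)\"");

    /**
     * Returns the required role names that do not appear in the given
     * "aws iam list-roles" output (filtered through grep or not).
     */
    static List<String> missingRoles(String listRolesOutput) {
        // Collect every role name found in the output.
        Set<String> found = new HashSet<>();
        Matcher matcher = ROLE_NAME.matcher(listRolesOutput);
        while (matcher.find()) {
            found.add(matcher.group(1));
        }

        // Keep the required names that were not found, in a stable order.
        List<String> missing = new ArrayList<>();
        for (String role : REQUIRED_ROLES) {
            if (!found.contains(role)) {
                missing.add(role);
            }
        }
        return missing;
    }
}
```

Passing the captured command output to EmrRoleCheck.missingRoles returns an empty list when both roles are present. Matching on the full "RoleName" value, rather than on any line containing EMR, avoids treating a role whose name merely starts with one of the default names as a match.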

If the roles exist, the command output lists both the instance profile (EMR_EC2_DefaultRole) and the service role (EMR_DefaultRole):