For advanced use only, a string to be inserted into log4j.properties for this role only.

log4j_safety_valve

false

Heap Dump Directory

Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java process heap size configured for this role.

oom_heap_dump_dir

/tmp

oom_heap_dump_dir

false

Dump Heap When Out of Memory

When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown.

false

oom_heap_dump_enabled

true

Kill When Out of Memory

When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.

true

oom_sigkill_enabled

true

Automatically Restart Process

When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure.

false

process_auto_restart

true

Logs

Display Name

Description

Related Name

Default Value

API Name

Required

Failover Controller Log Directory

Directory where Failover Controller will place its log files.

/var/log/hadoop-0.20-mapreduce

failover_controller_log_dir

false

Failover Controller Logging Threshold

The minimum log level for Failover Controller logs.

INFO

log_threshold

false

Failover Controller Maximum Log File Backups

The maximum number of rolled log files to keep for Failover Controller logs. Typically used by log4j or logback.

10

max_log_backup_index

false

Failover Controller Max Log Size

The maximum size, in megabytes, per log file for Failover Controller logs. Typically used by log4j or logback.

200 MiB

max_log_size

false

Monitoring

Display Name

Description

Related Name

Default Value

API Name

Required

Enable Health Alerts for this Role

When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold.

true

enable_alerts

false

Enable Configuration Change Alerts

When set, Cloudera Manager will send alerts when this entity's configuration changes.

false

enable_config_alerts

false

File Descriptor Monitoring Thresholds

The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.

Enables the health test that the Failover Controller's process state is consistent with the role configuration

true

failovercontroller_scm_health_enabled

false

Heap Dump Directory Free Space Monitoring Absolute Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.

Warning: 10 GiB, Critical: 5 GiB

heap_dump_directory_free_space_absolute_thresholds

false

Heap Dump Directory Free Space Monitoring Percentage Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as
a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.

Warning: Never, Critical: Never

heap_dump_directory_free_space_percentage_thresholds

false
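
The interaction between the absolute and percentage free-space thresholds can be sketched in Python. This is only an illustration under stated assumptions: the function name, the return labels, and the use of a strict less-than comparison are hypothetical and are not taken from Cloudera Manager. `None` stands for a threshold of "Never".

```python
GIB = 1024 ** 3

def free_space_health(free_bytes, capacity_bytes,
                      abs_warning=10 * GIB, abs_critical=5 * GIB,
                      pct_warning=None, pct_critical=None):
    """Classify free space as 'good', 'concerning', or 'bad' (hypothetical labels).

    Percentage thresholds are consulted only when no absolute threshold
    is configured, mirroring the description of the percentage setting.
    """
    if abs_warning is not None or abs_critical is not None:
        value, warning, critical = free_bytes, abs_warning, abs_critical
    else:
        value = 100.0 * free_bytes / capacity_bytes
        warning, critical = pct_warning, pct_critical
    # Assumption: a threshold is crossed when the value drops below it.
    if critical is not None and value < critical:
        return "bad"
    if warning is not None and value < warning:
        return "concerning"
    return "good"
```

With the defaults shown in the table (Warning: 10 GiB, Critical: 5 GiB), a filesystem with 8 GiB free would be classified as "concerning" in this sketch.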

Log Directory Free Space Monitoring Absolute Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.

Warning: 10 GiB, Critical: 5 GiB

log_directory_free_space_absolute_thresholds

false

Log Directory Free Space Monitoring Percentage Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a
percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.

Warning: Never, Critical: Never

log_directory_free_space_percentage_thresholds

false

Rules to Extract Events from Log Files

This file contains the rules which govern how log messages are turned into events by the custom log4j appender that this role loads.
It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each rule has
some or all of the following fields:

alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not
specified, the default is "false".

rate (mandatory) - the maximum number of log messages matching this rule that may be sent as events every minute. If more
than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of
messages per minute is unlimited.

periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.

threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.

exceptiontype - match only those messages which are part of an exception message. The exception type must match this regular expression.

Example:

{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}

This rule will send events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
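
To make the rule semantics concrete, here is a minimal Python sketch of how a single rule might be evaluated against log messages. It is not the log4j appender's implementation: the class name `RuleMatcher`, the sliding-window rate limiter, and the choice to match `exceptiontype` against the whole exception class name are all assumptions made for illustration.

```python
import re
import time
from collections import deque

# log4j severity levels, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]

class RuleMatcher:
    """Hypothetical evaluator for one event-extraction rule."""

    def __init__(self, rule):
        self.alert = bool(rule.get("alert", False))        # default "false"
        self.rate = rule["rate"]                           # mandatory field
        self.period_s = 60 * rule.get("periodminutes", 1)  # default one minute
        self.threshold = rule.get("threshold")
        pattern = rule.get("exceptiontype")
        self.exception_re = re.compile(pattern) if pattern else None
        self.sent = deque()  # timestamps of events sent in the current window

    def matches(self, level, exception_type=None):
        if self.threshold and LEVELS.index(level) < LEVELS.index(self.threshold):
            return False
        if self.exception_re is not None:
            # Assumption: the regular expression must match the full class name.
            if exception_type is None or not self.exception_re.fullmatch(exception_type):
                return False
        return True

    def should_send(self, level, exception_type=None, now=None):
        if not self.matches(level, exception_type):
            return False
        if self.rate < 0:  # a negative rate means unlimited
            return True
        now = time.time() if now is None else now
        # Drop timestamps that have fallen out of the window.
        while self.sent and now - self.sent[0] >= self.period_s:
            self.sent.popleft()
        if len(self.sent) >= self.rate:
            return False  # extra messages are ignored
        self.sent.append(now)
        return True
```

Built from the example rule above, a `RuleMatcher` would accept the first 10 matching messages in a minute and ignore further ones until the window moves on.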

The configured triggers for this role. This is a JSON formatted list of triggers. These triggers are evaluated as part of the health
system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:

triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.

streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.

enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.

expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.

For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:

[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}]

See the trigger rules documentation for more details on how to write triggers using tsquery.

The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.

[]

role_triggers

true
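
A role_triggers value can be assembled programmatically rather than typed by hand. The Python sketch below builds the sample trigger from the description and performs two basic checks taken from the field list (a mandatory, unique triggerName). The helper names `make_trigger` and `check_triggers` are hypothetical; Cloudera Manager's own validation may differ.

```python
import json

def make_trigger(name, expression, stream_threshold=0, enabled=True):
    """Build one trigger dict using the field names from the description."""
    return {
        "triggerName": name,
        "triggerExpression": expression,
        "streamThreshold": stream_threshold,
        # The sample in the description encodes this flag as a string.
        "enabled": "true" if enabled else "false",
    }

def check_triggers(triggers):
    """Return a list of problems: missing or duplicate triggerName values."""
    problems = []
    seen = set()
    for index, trigger in enumerate(triggers):
        name = trigger.get("triggerName")
        if not name:
            problems.append(f"trigger {index}: triggerName is mandatory")
        elif name in seen:
            problems.append(f"trigger {index}: duplicate triggerName {name!r}")
        else:
            seen.add(name)
    return problems

triggers = [
    make_trigger(
        "sample-trigger",
        "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) "
        "DO health:bad",
    )
]
# JSON text suitable for pasting into the role_triggers setting.
role_triggers_value = json.dumps(triggers)
```

Because the JSON format may change between releases, a helper like this is best treated as a convenience for one release rather than a stable interface.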

Unexpected Exits Thresholds

The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window
configuration for the role.

Warning: Never, Critical: Any

unexpected_exits_thresholds

false

Unexpected Exits Monitoring Period

The period to review when computing unexpected exits.

5 minute(s)

unexpected_exits_window

false

Performance

Display Name

Description

Related Name

Default Value

API Name

Required

Maximum Process File Descriptors

If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.

Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be
given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.

cpu.shares

1024

rm_cpu_shares

true
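
CPU shares are relative weights, so their effect depends on what else is competing for the CPU. The following Python sketch estimates the fraction of CPU a role would receive under full contention. It assumes every listed role is runnable and competing at the same cgroup level; the function name and the role names in the usage note are hypothetical.

```python
def cpu_fraction_under_contention(shares_by_role, role):
    """Estimate a role's CPU fraction when all listed roles compete for CPU."""
    for name, shares in shares_by_role.items():
        # Range documented for this setting.
        if not 2 <= shares <= 262144:
            raise ValueError(f"{name}: shares must be between 2 and 262144")
    return shares_by_role[role] / sum(shares_by_role.values())
```

For instance, with hypothetical roles `{"a": 1024, "b": 1024, "c": 2048}`, role `c` would be entitled to half of the contended CPU time.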

Cgroup I/O Weight

Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host
experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.

blkio.weight

500

rm_io_weight

true

Cgroup Memory Hard Limit

Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager will have no limit.

memory.limit_in_bytes

-1 MiB

rm_memory_hard_limit

true

Cgroup Memory Soft Limit

Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager will have no limit.

memory.soft_limit_in_bytes

-1 MiB

rm_memory_soft_limit

true

Stacks Collection

Display Name

Description

Related Name

Default Value

API Name

Required

Stacks Collection Data Retention

The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.

stacks_collection_data_retention

100 MiB

stacks_collection_data_retention

false

Stacks Collection Directory

The directory in which stacks logs are placed. If not set, stacks are logged into a stacks
subdirectory of the role's log directory.

stacks_collection_directory

stacks_collection_directory

false

Stacks Collection Enabled

Whether or not periodic stacks collection is enabled.

stacks_collection_enabled

false

stacks_collection_enabled

true

Stacks Collection Frequency

The frequency with which stacks are collected.

stacks_collection_frequency

5.0 second(s)

stacks_collection_frequency

false

Stacks Collection Method

The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.

stacks_collection_method

jstack

stacks_collection_method

false

Suppressions

Display Name

Description

Related Name

Default Value

API Name

Required

Suppress Configuration Validator: CDH Version Validator

Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.

Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log
Files parameter.

false

role_config_suppression_log_event_whitelist

true

Suppress Parameter Validation: Heap Dump Directory

Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter.

false

role_config_suppression_oom_heap_dump_dir

true

Suppress Parameter Validation: Role Triggers

Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter.

false

role_config_suppression_role_triggers

true

Suppress Parameter Validation: Stacks Collection Directory

Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory
parameter.

false

role_config_suppression_stacks_collection_directory

true

Suppress Health Test: File Descriptors

Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_mapreduce_failovercontroller_file_descriptor

true

Suppress Health Test: Heap Dump Directory Free Space

Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored
when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the
overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_mapreduce_failovercontroller_host_health

true

Suppress Health Test: Log Directory Free Space

Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing
the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_mapreduce_failovercontroller_scm_health

true

Suppress Health Test: Swap Memory Usage

Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_mapreduce_failovercontroller_unexpected_exits

true

gateway

Advanced

Display Name

Description

Related Name

Default Value

API Name

Required

Deploy Directory

The directory where the client configs will be deployed.

/etc/hadoop

client_config_root_dir

true

Gateway Logging Advanced Configuration Snippet (Safety Valve)

For advanced use only, a string to be inserted into log4j.properties for this role only.

For advanced use only, key-value pairs (one on each line) to be inserted into the client configuration for hadoop-env.sh.

mapreduce_client_env_safety_valve

false

Client Java Configuration Options

These are Java command line arguments. Commonly, garbage collection flags or extra debugging flags would be passed here.

-Djava.net.preferIPv4Stack=true

mapreduce_client_java_opts

false

Compression

Display Name

Description

Related Name

Default Value

API Name

Required

Use Compression on Map Outputs

If enabled, uses compression on the map outputs before they are sent across the network. Will be part of generated client
configuration.

mapred.compress.map.output

true

mapred_compress_map_output

false

Compression Codec of MapReduce Map Output

For MapReduce map outputs that are compressed, specify the compression codec to use. Will be part of generated client
configuration.

mapred.map.output.compression.codec

org.apache.hadoop.io.compress.SnappyCodec

mapred_map_output_compression_codec

false

Compress MapReduce Job Output

Compress the output of MapReduce jobs. Will be part of generated client configuration.

mapred.output.compress

false

mapred_output_compress

false

Compression Codec of MapReduce Job Output

For MapReduce job outputs that are compressed, specify the compression codec to use. Will be part of generated client
configuration.

mapred.output.compression.codec

org.apache.hadoop.io.compress.DefaultCodec

mapred_output_compression_codec

false

Compression Type of MapReduce Job Output

For MapReduce job outputs that are compressed as SequenceFiles, you can select one of these compression type options: NONE, RECORD
or BLOCK. Cloudera recommends BLOCK. Will be part of generated client configuration.

mapred.output.compression.type

BLOCK

mapred_output_compression_type

false

Compression Level of Codecs

Compression level for the codec used to compress MapReduce outputs. Default compression is a balance between speed and compression
ratio.

zlib.compress.level

DEFAULT_COMPRESSION

zlib_compress_level

false

Jobs

Display Name

Description

Related Name

Default Value

API Name

Required

Number of Tasks to Run per JVM

Number of tasks to run per JVM. If set to -1, there is no limit. Will be part of generated client configuration.

mapred.job.reuse.jvm.num.tasks

1

mapred_job_reuse_jvm_num_tasks

false

Map Tasks Speculative Execution

If enabled, multiple instances of some map tasks may be executed in parallel.

mapred.map.tasks.speculative.execution

false

mapred_map_tasks_speculative_execution

false

Number of Map Tasks to Complete Before Reduce Tasks

Fraction of the number of map tasks in the job which should be completed before reduce tasks are scheduled for the job.

mapred.reduce.slowstart.completed.maps

0.8

mapred_reduce_slowstart_completed_maps

false

Default Number of Reduce Tasks per Job

The default number of reduce tasks per job. Will be part of generated client configuration.

mapred.reduce.tasks

1

mapred_reduce_tasks

false

Reduce Tasks Speculative Execution

If enabled, multiple instances of some reduce tasks may be executed in parallel.

mapred.reduce.tasks.speculative.execution

false

mapred_reduce_tasks_speculative_execution

false

Maximum Time to Retain User Logs

The maximum time, in hours, to retain the user logs after job completion.

mapred.userlog.retain.hours

1 day(s)

mapred_userlog_retain_hours

false

Logs

Display Name

Description

Related Name

Default Value

API Name

Required

Gateway Logging Threshold

The minimum log level for Gateway logs.

INFO

log_threshold

false

Monitoring

Display Name

Description

Related Name

Default Value

API Name

Required

Enable Configuration Change Alerts

When set, Cloudera Manager will send alerts when this entity's configuration changes.

false

enable_config_alerts

false

Other

Display Name

Description

Related Name

Default Value

API Name

Required

Alternatives Priority

The priority level that the client configuration will have in the Alternatives system on the hosts. Higher priority levels will
cause Alternatives to prefer this configuration over any others.

91

client_config_priority

true

Mapreduce Submit Replication

The replication level for submitted job files.

mapred.submit.replication

10

mapred_submit_replication

false

MapReduce Task Timeout

The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status
string.

mapred.task.timeout

10 minute(s)

mapred_task_timeout

false

Performance

Display Name

Description

Related Name

Default Value

API Name

Required

I/O Sort Factor

The number of streams to merge at the same time while sorting files. That is, the number of sort heads to use during the merge sort
on the reducer side. This determines the number of open file handles. Merging more files in parallel reduces merge sort iterations and improves run time by eliminating disk I/O. Note that merging
more files in parallel uses more memory. If 'io.sort.factor' is set too high or the maximum JVM heap is set too low, excessive garbage collection will occur. The Hadoop default is 10, but Cloudera
recommends a higher value. Will be part of generated client configuration.

io.sort.factor

64

io_sort_factor

false

I/O Sort Memory Buffer (MiB)

The total amount of memory buffer, in megabytes, to use while sorting files. Note that this memory comes out of the user JVM heap size (meaning total user JVM heap - this amount of memory = total user usable heap space). Note that Cloudera's default differs from Hadoop's default; Cloudera uses a bigger buffer by default because modern machines often have more RAM. The smallest value across all TaskTrackers will be part of generated client configuration.

io.sort.mb

256 MiB

io_sort_mb

false

I/O Sort Record Percent

The percentage of 'io.sort.mb' dedicated to tracking record boundaries. If this value is represented as 'r', and 'io.sort.mb' is
represented as 'x', then the maximum number of records collected before the collection thread must block is equal to (r * x) / 4. The syntax is in decimal units; the default is 5% and is formatted
0.05. Will be part of generated client configuration.

io.sort.record.percent

0.05

io_sort_record_percent

false

I/O Sort Spill Percent

The soft limit in either the buffer or record collection buffers. When this limit is reached, a thread will begin to spill the
contents to disk in the background. Note that this does not imply any chunking of data to the spill. A value less than 0.5 is not recommended. The syntax is in decimal units; the default is 80% and
is formatted 0.8. Will be part of generated client configuration.

io.sort.spill.percent

0.8

io_sort_spill_percent

false

MapReduce Child Java Opts Base

Java opts for the TaskTracker child processes. The following symbol, if present, will be interpolated: @taskid@ is replaced by
current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable verbose gc logging to a file named for the taskid in /tmp pass a value of: "-verbose:gc
-Xloggc:/tmp/@taskid@.gc". The configuration variable 'mapred.child.ulimit' can be used to control the maximum virtual memory of the child processes. Note that unlike Hadoop, Cloudera Manager
separates the child options into this setting and a separate setting just for the maximum heap size. Will be part of generated client configuration.

mapred.child.java.opts

-Djava.net.preferIPv4Stack=true

mapred_child_java_opts_base

false
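
The @taskid@ interpolation described for the Java opts settings amounts to a literal string substitution. The Python sketch below shows the effect; the function name and the sample task ID are illustrative assumptions, and the real substitution is performed by Hadoop, not by this code.

```python
def interpolate_child_opts(opts, task_id):
    """Replace the @taskid@ symbol; any other '@' is left unchanged."""
    return opts.replace("@taskid@", task_id)

# The task ID below is only an illustrative value.
example = interpolate_child_opts(
    "-verbose:gc -Xloggc:/tmp/@taskid@.gc",
    "attempt_200707121733_0003_m_000005_0",
)
```

Here `example` is the option string with the GC log path named for the task, so each task writes its GC log to its own file under /tmp.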

Map Task Java Opts Base

Java opts for the TaskTracker child map processes. The following symbol, if present, will be interpolated: @taskid@ is replaced by
current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable verbose gc logging to a file named for the taskid in /tmp pass a value of: "-verbose:gc
-Xloggc:/tmp/@taskid@.gc". The configuration variable 'Map Task Maximum Virtual Memory' can be used to control the maximum virtual memory of the map processes. This takes precedence over the generic
'mapred.child.java.opts'. Will be part of generated client configuration.

mapred.map.child.java.opts

mapred_map_task_java_opts

false

Default Number of Parallel Transfers During Shuffle

The default number of parallel transfers run by reduce during the copy (shuffle) phase. This number should be between
sqrt(nodes*number_of_map_slots_per_node) and nodes*number_of_map_slots_per_node/2. Will be part of generated client configuration.

mapred.reduce.parallel.copies

10

mapred_reduce_parallel_copies

false
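
The recommended range for the number of parallel transfers is simple arithmetic on the cluster size. As a small worked sketch (the function name is hypothetical), the bounds from the description can be computed as follows:

```python
import math

def parallel_copies_range(nodes, map_slots_per_node):
    """Return (lower, upper) bounds from the description:
    sqrt(nodes * slots) and nodes * slots / 2."""
    total_slots = nodes * map_slots_per_node
    return math.sqrt(total_slots), total_slots / 2
```

For a cluster of 20 nodes with 5 map slots each, this gives a range of 10 to 50, so the default of 10 sits at the lower bound.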

Reduce Task Java Opts Base

Java opts for the TaskTracker child reduce processes. The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable verbose gc logging to a file named for the taskid in /tmp pass a value of: "-verbose:gc
-Xloggc:/tmp/@taskid@.gc". The configuration variable 'Reduce Task Maximum Virtual Memory' can be used to control the maximum virtual memory of the reduce processes. This takes precedence over the
generic 'mapred.child.java.opts'. Will be part of generated client configuration.

mapred.reduce.child.java.opts

mapred_reduce_task_java_opts

false

Resource Management

Display Name

Description

Related Name

Default Value

API Name

Required

MapReduce Child Java Maximum Heap Size

The maximum heap size, in bytes, of the Java child process. This number will be formatted and concatenated with the 'base' setting
for 'mapred.child.java.opts' to pass to Hadoop. Will be part of generated client configuration.

1 GiB

mapred_child_java_opts_max_heap

false

MapReduce Maximum Virtual Memory (KiB)

The maximum virtual memory, in KiB, of a process launched by the MapReduce framework. This can be used to control both the MapReduce
tasks and applications using Hadoop Pipes, Hadoop Streaming, and so on. By default, it is left unspecified to allow administrators to control it via 'limits.conf' and other mechanisms. Note:
'mapred.child.ulimit' must be greater than or equal to approximately 1.5 times the -Xmx passed to JavaVM, or else the VM might not start. Will be part of generated client configuration.

mapred.child.ulimit

mapred_child_ulimit

false
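
The note on 'mapred.child.ulimit' implies a lower bound that depends on the configured heap size. The Python sketch below computes that bound in KiB from a heap size in bytes, using the approximate factor of 1.5 given in the description. The function name is hypothetical, and rounding up is an assumption made to stay on the safe side.

```python
import math

def min_child_ulimit_kib(max_heap_bytes, factor=1.5):
    """Smallest virtual memory limit, in KiB, that is at least
    `factor` times the -Xmx heap size given in bytes."""
    return math.ceil(max_heap_bytes * factor / 1024)
```

With the default maximum heap size of 1 GiB, the limit would need to be at least 1572864 KiB (1.5 GiB).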

Map Task Maximum Heap Size

The maximum heap size, in bytes, of the child map processes. This number will be formatted and concatenated with 'Map Task Java Opts
Base' to pass to Hadoop. Will be part of generated client configuration.

mapred_map_task_max_heap

false

Map Task Maximum Virtual Memory (KiB)

The maximum virtual memory, in KiB, available to map tasks. Note: this must be greater than or equal to the -Xmx passed to the
JavaVM via 'Map Task Java Opts', or else the VM might not start. This takes precedence over the generic 'mapred.child.ulimit'. Will be part of generated client configuration.

mapred.map.child.ulimit

mapred_map_task_ulimit

false

Reduce Task Maximum Heap Size

The maximum heap size, in bytes, of the child reduce processes. This number will be formatted and concatenated with 'Reduce Task
Java Opts Base' to pass to Hadoop. Will be part of generated client configuration.

mapred_reduce_task_max_heap

false

Reduce Task Maximum Virtual Memory (KiB)

The maximum virtual memory, in KiB, available to reduce tasks. Note: this must be greater than or equal to the -Xmx passed to the
JavaVM via 'Reduce Task Java Opts', or else the VM might not start. This takes precedence over the generic 'mapred.child.ulimit'. Will be part of generated client configuration.

jobtracker

Advanced

Display Name

Description

Related Name

Default Value

API Name

Required

Hadoop Metrics Advanced Configuration Snippet (Safety Valve)

Advanced Configuration Snippet (Safety Valve) for Hadoop Metrics. Properties will be inserted into hadoop-metrics.properties for this role only. Note that Cloudera Manager tunes hadoop-metrics.properties to work optimally with its Service Monitoring features. If you override the defaults, Cloudera Manager might not be able to provide accurate monitoring information, health tests, or alerts.

For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of
this role except client configuration.

JOBTRACKER_role_env_safety_valve

false

JobTracker Logging Advanced Configuration Snippet (Safety Valve)

For advanced use only, a string to be inserted into log4j.properties for this role only.

log4j_safety_valve

false

JobTracker Client Connection Retries

The maximum number of times to retry between failovers.

mapred.client.failover.connection.retries

0

mapred_client_failover_connection_retries

false

JobTracker Client Max Retries

The maximum number of times to retry on timeouts between failovers.

mapred.client.failover.connection.retries.on.timeouts

0

mapred_client_failover_connection_retries_on_timeouts

false

JobTracker Client Max Failover Attempt

The maximum number of times a client of JobTracker tries to fail over.

mapred.client.failover.max.attempts

15

mapred_client_failover_max_attempts

false

JobTracker Client Base Sleep

The time in milliseconds to wait before the first failover.

mapred.client.failover.sleep.base.millis

500 millisecond(s)

mapred_client_failover_sleep_base_millis

false

JobTracker Client Maximum Sleep

The maximum amount of time in milliseconds to wait between failovers (for exponential backoff).

mapred.client.failover.sleep.max.millis

1.5 second(s)

mapred_client_failover_sleep_max_millis

false
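
The base and maximum sleep settings together describe an exponential backoff between failover attempts. The Python sketch below shows a simplified version using the defaults from the table (500 ms base, 1.5 s maximum). It is an assumption-laden illustration: the function name is hypothetical, the doubling factor is assumed, and the actual Hadoop client may add randomization that is omitted here.

```python
def failover_sleep_ms(attempt, base_ms=500, max_ms=1500):
    """Simplified backoff: double the base sleep on each failover attempt
    (attempt starts at 1) and cap it at the configured maximum."""
    if attempt < 1:
        raise ValueError("attempt must be 1 or greater")
    return min(max_ms, base_ms * 2 ** (attempt - 1))

# Sleep times for the first four failover attempts with the defaults.
schedule = [failover_sleep_ms(n) for n in range(1, 5)]
```

Under these assumptions the sleep reaches the 1.5 second cap by the third attempt and stays there for the remaining attempts allowed by mapred.client.failover.max.attempts.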

Heap Dump Directory

Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java process heap size configured for this role.

oom_heap_dump_dir

/tmp

oom_heap_dump_dir

false

Dump Heap When Out of Memory

When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown.

false

oom_heap_dump_enabled

true

Kill When Out of Memory

When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.

true

oom_sigkill_enabled

true

Automatically Restart Process

When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure.

false

process_auto_restart

true

Classes

Display Name

Description

Related Name

Default Value

API Name

Required

Hadoop Socket Factory for Job Submission

Socket Factory to use to connect to a MapReduce master (JobTracker). If null or empty, then use
hadoop.rpc.socket.factory.class.default.

hadoop.rpc.socket.factory.class.JobSubmissionProtocol

hadoop_rpc_socket_factory_class_job_submission_protocol

false

Task Scheduler

The class responsible for scheduling tasks. Cloudera recommends the Fair Scheduler. The JobQueueTaskScheduler is often referred to
as the FIFO scheduler.

Jobs

Enter an XML string that represents the Capacity Scheduler configuration.

<?xml version="1.0"?> <!-- This is the configuration file for the resource manager in Hadoop. --> <!-- You can
configure various scheduling parameters related to queues. --> <!-- The properties for a queue follow a naming convention, such as, --> <!--
mapred.capacity-scheduler.queue.<queue-name>.property-name. --> <configuration> <property> <name>mapred.capacity-scheduler.queue.default.capacity</name>
<value>100</value> <description>Percentage of the number of slots in the cluster that are to be available for jobs in this queue. </description> </property>
<property> <name>mapred.capacity-scheduler.queue.default.maximum-capacity</name> <value>-1</value> <description> maximum-capacity defines a limit beyond which a
queue cannot use the capacity of the cluster. This provides a means to limit how much excess capacity a queue can use. By default, there is no limit. The maximum-capacity of a queue can only be
greater than or equal to its minimum capacity. Default value of -1 implies a queue can use complete capacity of the cluster. This property could be to curtail certain jobs which are long running in
nature from occupying more than a certain percentage of the cluster, which in the absence of pre-emption, could lead to capacity guarantees of other queues being affected. One important thing to note
is that maximum-capacity is a percentage , so based on the cluster's capacity the max capacity would change. So if large no of nodes or racks get added to the cluster , max Capacity in absolute terms
would increase accordingly. </description> </property> <property> <name>mapred.capacity-scheduler.queue.default.supports-priority</name> <value>false</value>
<description>If true, priorities of jobs will be taken into account in scheduling decisions. </description> </property> <property>
<name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name> <value>100</value> <description> Each queue enforces a limit on the percentage of
resources allocated to a user at any given time, if there is competition for them. This user limit can vary between a minimum and maximum value. The former depends on the number of users who have
submitted jobs, and the latter is set to this property value. For example, suppose the value of this property is 25. If two users have submitted jobs to a queue, no single user can use more than 50%
of the queue resources. If a third user submits a job, no single user can use more than 33% of the queue resources. With 4 or more users, no user can use more than 25% of the queue's resources. A
value of 100 implies no user limits are imposed. </description> </property> <property>
<name>mapred.capacity-scheduler.queue.default.maximum-initialized-jobs-per-user</name> <value>2</value> <description>The maximum number of jobs to be pre-initialized for
a user of the job queue. </description> </property> <!-- The default configuration settings for the capacity task scheduler --> <!-- The default values would be applied to all
the queues which don't have --> <!-- the appropriate property for the particular queue --> <property> <name>mapred.capacity-scheduler.default-supports-priority</name>
<value>false</value> <description>If true, priorities of jobs will be taken into account in scheduling decisions by default in a job queue. </description> </property>
<property> <name>mapred.capacity-scheduler.default-minimum-user-limit-percent</name> <value>100</value> <description>The percentage of the resources limited to a
particular user for the job queue at any given point of time by default. </description> </property> <property>
<name>mapred.capacity-scheduler.default-maximum-initialized-jobs-per-user</name> <value>2</value> <description>The maximum number of jobs to be pre-initialized for a
user of the job queue. </description> </property> <!-- Capacity scheduler Job Initialization configuration parameters --> <property>
<name>mapred.capacity-scheduler.init-poll-interval</name> <value>5000</value> <description>The amount of time in milliseconds which is used to poll the job queues for
jobs to initialize. </description> </property> <property> <name>mapred.capacity-scheduler.init-worker-threads</name> <value>5</value>
<description>Number of worker threads used by the initialization poller to initialize jobs in a set of queues. If the number equals the number of job queues, a
single thread initializes the jobs in each queue. If it is smaller, each thread is assigned a set of queues. If it is greater, the number of threads equals the number of job queues.
</description> </property> </configuration>
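
The minimum-user-limit-percent arithmetic described in the configuration above can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheduler's implementation: the function name `max_user_share_percent` is hypothetical, and the limit is modeled simply as the larger of an equal split among active users and the configured value.

```python
def max_user_share_percent(active_users: int, minimum_user_limit_percent: int) -> float:
    """Upper bound on the share of a queue that one user can hold under contention.

    Hypothetical helper: an equal split among the active users, but never
    lower than the configured minimum-user-limit-percent.
    """
    if active_users < 1:
        raise ValueError("active_users must be at least 1")
    equal_split = 100.0 / active_users
    return max(equal_split, float(minimum_user_limit_percent))


# Reproduce the example from the description (property value 25).
for users in (2, 3, 4, 8):
    print(users, round(max_user_share_percent(users, 25)))
```

With a value of 100, the function returns 100 for any number of users, which corresponds to "no user limits are imposed".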

Allows the Fair Scheduler to assign both a map task and a reduce task on each TaskTracker heartbeat, which improves cluster
throughput when there are many small tasks to run.

mapred.fairscheduler.assignmultiple

true

mapred_fairscheduler_assignmultiple

false

Fair Scheduler Pool Name Property

The 'jobconf' property that determines which pool a job belongs to. The default is 'user.name' (one pool for each user).
To use MapReduce's "queue" system to enable authorization for the Fair Scheduler, specify 'mapred.job.queue.name'. This requires adding the Fair Scheduler's pool names to
'mapred.queue.names' and having users submit jobs with the 'mapred.job.queue.name' property instead of the 'mapred.fairscheduler.pool' property. Note that 'mapred.fairscheduler.poolnameproperty' is used
only for jobs in which 'mapred.fairscheduler.pool' is not explicitly set.

mapred.fairscheduler.poolnameproperty

user.name

mapred_fairscheduler_poolnameproperty

false
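
The pool lookup described for the Fair Scheduler Pool Name Property can be sketched roughly as below. The function `resolve_pool` and the `"default"` fallback are assumptions made for illustration; only the precedence of 'mapred.fairscheduler.pool' over the configured pool name property comes from the description.

```python
def resolve_pool(jobconf: dict, pool_name_property: str = "user.name") -> str:
    """Pick the pool for a job from its jobconf (hypothetical helper)."""
    # An explicitly set pool always wins.
    explicit = jobconf.get("mapred.fairscheduler.pool")
    if explicit:
        return explicit
    # Otherwise use the jobconf property named by
    # mapred.fairscheduler.poolnameproperty.
    # Assumption: fall back to "default" if that property is absent.
    return jobconf.get(pool_name_property, "default")


job = {"user.name": "alice", "mapred.job.queue.name": "etl"}
print(resolve_pool(job))                           # alice
print(resolve_pool(job, "mapred.job.queue.name"))  # etl
```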

Fair Scheduler Preemption

Enables Fair Scheduler preemption. If a pool's minimum share is not met for some period of time, the Fair Scheduler optionally
supports preemption of jobs in other pools. The pool will be allowed to kill tasks from other pools to make room to run. Preemption can be used to guarantee that production jobs are not starved while
also allowing the Hadoop cluster to be used for experimental and research jobs. In addition, a pool can also be allowed to preempt tasks if it is below half of its fair share for a configurable
timeout (generally set larger than the minimum share preemption timeout). When choosing tasks to kill, the Fair Scheduler picks the most-recently launched tasks from over-allocated jobs, to minimize
wasted computation. Preemption does not cause the preempted jobs to fail because Hadoop jobs tolerate losing tasks; it only makes them take longer to finish.

mapred.fairscheduler.preemption

false

mapred_fairscheduler_preemption

false

Fair Scheduler Weight Adjuster

An extension point that lets you specify a class to adjust the weights of running jobs. This class should implement the
WeightAdjuster interface. There is currently one example implementation, NewJobWeightBooster, which increases the weight of jobs for their first 5 minutes to let short jobs finish faster. To use it,
set the weightadjuster property to the full class name, org.apache.hadoop.mapred.NewJobWeightBooster. NewJobWeightBooster itself provides two parameters for setting the duration and boost factor:

mapred.newjobweightbooster.factor - Factor by which a new job's weight should be boosted. Default is 3.

mapred.newjobweightbooster.duration - Boost duration in milliseconds. Default is 300000 (5 minutes).

mapred.fairscheduler.weight.adjuster

mapred_fairscheduler_weight_adjuster

false
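
The effect of NewJobWeightBooster's two parameters can be pictured with the short sketch below. It assumes the boost is a plain multiplication of the job's weight while the job is younger than the configured duration; `boosted_weight` is a hypothetical name, not part of Hadoop.

```python
DEFAULT_FACTOR = 3.0          # mapred.newjobweightbooster.factor
DEFAULT_DURATION_MS = 300000  # mapred.newjobweightbooster.duration (5 minutes)


def boosted_weight(base_weight: float, job_age_ms: int,
                   factor: float = DEFAULT_FACTOR,
                   duration_ms: int = DEFAULT_DURATION_MS) -> float:
    """Return the job weight after applying the new-job boost (illustrative)."""
    if job_age_ms < duration_ms:
        return base_weight * factor
    return base_weight


print(boosted_weight(1.0, 60_000))   # job is 1 minute old: 3.0
print(boosted_weight(1.0, 600_000))  # job is 10 minutes old: 1.0
```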

Persist JobTracker Job Status

If enabled, job status information is persisted.

mapred.job.tracker.persist.jobstatus.active

false

mapred_job_tracker_persist_jobstatus_active

false

Directory for JobTracker Job Status Persistence

The HDFS directory in which job status information is kept persistently. The directory must exist and be owned by the mapred
user.

mapred.job.tracker.persist.jobstatus.dir

/jobtracker/jobsInfo

mapred_job_tracker_persist_jobstatus_dir

false

Time Limit of JobTracker Job Status Persistence

The number of hours job status information is persisted in HDFS. The job status information will be available after it drops out of
the memory queue and between JobTracker restarts. If zero is specified for this property, the job status information is not persisted.

mapred.job.tracker.persist.jobstatus.hours

0

mapred_job_tracker_persist_jobstatus_hours

false

Maximum Completed User Jobs

The maximum number of completed jobs per user to retain before delegating them to the job history.

mapred.jobtracker.completeuserjobs.maximum

5

mapred_jobtracker_completeuserjobs_maximum

false

Enable Job Recovery Upon Restart

Enables job recovery upon restart. If the property is set to true, then if and when the JobTracker stops while a job is running, it
will resubmit the job on restart.

mapred.jobtracker.restart.recover

false

mapred_jobtracker_restart_recover

false

JobTracker Retire Job Interval (milliseconds)

Number of milliseconds job history objects are kept.

mapred.jobtracker.retirejob.interval

86400000

mapred_jobtracker_retirejob_interval

false

MapReduce Queue Names

Comma-separated list of queues configured for the JobTracker in this service instance. Jobs are added to queues. Schedulers can
configure different scheduling properties for the queues specified in this list. You can configure queue properties that are common to all schedulers by using the naming convention
'mapred.queue.$QUEUE-NAME.$PROPERTY-NAME' in this property (for example, 'mapred.queue.default.submit-job-acl'). The number of queues configured in this property depends on the type of scheduler
specified in 'mapred.jobtracker.taskScheduler'. The default scheduler, JobQueueTaskScheduler, supports a single queue only. Before adding more queues to this property, make sure that the scheduler in
'mapred.jobtracker.taskScheduler' supports multiple queues. This property can also be populated with the Fair Scheduler's pool names to enable authorization for the Fair Scheduler. This requires
setting 'mapred.fairscheduler.poolnameproperty' to 'mapred.job.queue.name' and having users submit jobs to the right queue by setting the 'mapred.job.queue.name' property in their jobs.

mapred.queue.names

default

mapred_queue_names_list

false

Logs

Display Name

Description

Related Name

Default Value

API Name

Required

JobTracker Log Directory

Directory where JobTracker will place its log files.

hadoop.log.dir

/var/log/hadoop-0.20-mapreduce

jobtracker_log_dir

false

JobTracker Logging Threshold

The minimum log level for JobTracker logs.

INFO

log_threshold

false

JobTracker Maximum Log File Backups

The maximum number of rolled log files to keep for JobTracker logs. Typically used by log4j or logback.

10

max_log_backup_index

false

JobTracker Max Log Size

The maximum size, in megabytes, per log file for JobTracker logs. Typically used by log4j or logback.

200 MiB

max_log_size

false

Metrics

Display Name

Description

Related Name

Default Value

API Name

Required

Hadoop Metrics Class

The implementation that daemons use to report some internal statistics. The default (NoEmitMetricsContext) displays metrics on
/metrics on the status port. The GangliaContext and GangliaContext31 classes report metrics to your specified Ganglia Monitoring Daemons (gmond). The Ganglia wire format changed incompatibly at
version 3.1.0. If you are running Ganglia 3.1.0 or newer, use the GangliaContext31 metric class; otherwise, use the GangliaContext metric class.

org.apache.hadoop.metrics.spi.NoEmitMetricsContext

hadoop_metrics_class

false

Hadoop Metrics Output Directory

If using FileContext, directory to write metrics to.

/tmp/metrics

hadoop_metrics_dir

false

Hadoop Metrics Ganglia Servers

If using GangliaContext, a comma-delimited list of host:port pairs pointing to 'gmond' servers you would like to publish metrics to.
In practice, this set of 'gmond' should match the set of 'gmond' in your 'gmetad' datasource list for the cluster.

hadoop_metrics_ganglia_servers

false

Monitoring

Display Name

Description

Related Name

Default Value

API Name

Required

Enable Health Alerts for this Role

When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting
eventserver_health_events_alert_threshold

true

enable_alerts

false

Enable Configuration Change Alerts

When set, Cloudera Manager will send alerts when this entity's configuration changes.

false

enable_config_alerts

false

Heap Dump Directory Free Space Monitoring Absolute Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.

Warning: 10 GiB, Critical: 5 GiB

heap_dump_directory_free_space_absolute_thresholds

false

Heap Dump Directory Free Space Monitoring Percentage Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified
as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.

Warning: Never, Critical: Never

heap_dump_directory_free_space_percentage_thresholds

false
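
How the absolute and percentage free-space thresholds interact can be sketched as follows. This is only an illustration: the function name, the result strings, and the strict "less than" comparison are assumptions, and `None` stands in for a threshold of "Never". The precedence of the absolute setting over the percentage setting follows the description above.

```python
GIB = 1024 ** 3


def free_space_health(free_bytes, capacity_bytes, absolute=None, percentage=None):
    """Classify free space as 'good', 'warning', or 'critical' (illustrative).

    absolute:   (warning_bytes, critical_bytes) or None if not configured
    percentage: (warning_percent, critical_percent) or None if not configured
    """
    if absolute is not None:
        # The percentage setting is not used when an absolute one is configured.
        warning, critical = absolute
        value = free_bytes
    elif percentage is not None:
        warning, critical = percentage
        value = 100.0 * free_bytes / capacity_bytes
    else:
        return "good"
    if critical is not None and value < critical:
        return "critical"
    if warning is not None and value < warning:
        return "warning"
    return "good"


# Defaults from the table: Warning 10 GiB, Critical 5 GiB.
print(free_space_health(7 * GIB, 100 * GIB, absolute=(10 * GIB, 5 * GIB)))  # warning
```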

File Descriptor Monitoring Thresholds

The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.

Warning: 50.0 %, Critical: 70.0 %

jobtracker_fd_thresholds

false

Garbage Collection Duration Thresholds

The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall
clock time.

Warning: 30.0, Critical: 60.0

jobtracker_gc_duration_thresholds

false

Garbage Collection Duration Monitoring Period

The period to review when computing the moving average of garbage collection time.

5 minute(s)

jobtracker_gc_duration_window

false

JobTracker Host Health Test

When computing the overall JobTracker health, consider the host's health.

true

jobtracker_host_health_enabled

false

JobTracker Process Health Test

Enables the health test that the JobTracker's process state is consistent with the role configuration.

true

jobtracker_scm_health_enabled

false

Health Check Startup Tolerance

The amount of time allowed after this role is started that failures of health checks that rely on communication with this role will
be tolerated.

5 minute(s)

jobtracker_startup_tolerance

false

Web Metric Collection

Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server.

true

jobtracker_web_metric_collection_enabled

false

Web Metric Collection Duration

The health test thresholds on the duration of the metrics request to the web server.

Warning: 10 second(s), Critical: Never

jobtracker_web_metric_collection_thresholds

false

Log Directory Free Space Monitoring Absolute Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.

Warning: 10 GiB, Critical: 5 GiB

log_directory_free_space_absolute_thresholds

false

Log Directory Free Space Monitoring Percentage Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a
percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.

Warning: Never, Critical: Never

log_directory_free_space_percentage_thresholds

false

Rules to Extract Events from Log Files

This file contains the rules which govern how log messages are turned into events by the custom log4j appender that this role loads.
It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each rule has
some or all of the following fields:

alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not
specified, the default is "false".

rate (mandatory) - the maximum number of log messages matching this rule that may be sent as events every minute. If more
than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of
messages per minute is unlimited.

periodminutes - the number of minutes during which the publisher will only publish rate events
or fewer. If not specified, the default is one minute.

threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.

exceptiontype - match only those messages which are part of an exception message. The exception type must match this regular expression.

Example:

{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}

This rule will send events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
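
The way a rule's fields combine can be sketched in a few lines. This is not the appender's actual code: `rule_matches` and `events_sent` are hypothetical helpers, the severity ordering is the standard log4j one, and matching the regular expression anywhere in the exception class name is an assumption.

```python
import json
import re

# log4j severity levels in increasing order.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def rule_matches(rule, level, exception_type=None):
    """Return True if a log message satisfies the rule's filters."""
    threshold = rule.get("threshold")
    if threshold is not None and LEVELS.index(level) < LEVELS.index(threshold):
        return False
    pattern = rule.get("exceptiontype")
    if pattern is not None:
        # Assumption: the pattern may match anywhere in the class name.
        if exception_type is None or re.search(pattern, exception_type) is None:
            return False
    return True


def events_sent(rule, matching_messages):
    """Number of events sent for messages that arrive within one period."""
    rate = rule["rate"]
    if rate < 0:  # a negative rate means unlimited
        return matching_messages
    return min(matching_messages, rate)


rule = json.loads(
    '{"alert": false, "rate": 10, '
    '"exceptiontype": "java.lang.StringIndexOutOfBoundsException"}'
)
print(rule_matches(rule, "ERROR", "java.lang.StringIndexOutOfBoundsException"))  # True
print(events_sent(rule, 25))  # 10
```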

The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health
system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:

triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.

streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition
fires. By default set to 0, and any stream returned causes the condition to fire.

enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.

expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the
Edit Trigger page; editing the trigger here can lead to inconsistencies.

For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:

[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}]

See the trigger rules documentation for more details on how to write triggers using tsquery.

The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.

[]

role_triggers

true
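
Before pasting a trigger list into the role_triggers setting, it can help to check it locally. The sketch below is a hypothetical helper, not a Cloudera Manager API: it enforces the documented rules for triggerName (mandatory, unique within the role) and fills in the documented defaults for streamThreshold and enabled. The triggerExpression value is copied from the example in the description.

```python
import json


def normalize_triggers(triggers):
    """Validate trigger names and apply the documented default values."""
    seen = set()
    result = []
    for trigger in triggers:
        name = trigger.get("triggerName")
        if not name:
            raise ValueError("triggerName is mandatory")
        if name in seen:
            raise ValueError(f"duplicate triggerName: {name}")
        seen.add(name)
        # Documented defaults: streamThreshold 0, enabled 'true'.
        merged = {"streamThreshold": 0, "enabled": "true"}
        merged.update(trigger)
        result.append(merged)
    return result


sample = [{
    "triggerName": "sample-trigger",
    "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME "
                         "and last(fd_open) > 1500) DO health:bad",
}]
print(json.dumps(normalize_triggers(sample), indent=2))
```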

Unexpected Exits Thresholds

The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window
configuration for the role.

Warning: Never, Critical: Any

unexpected_exits_thresholds

false

Unexpected Exits Monitoring Period

The period to review when computing unexpected exits.

5 minute(s)

unexpected_exits_window

false

Other

Display Name

Description

Related Name

Default Value

API Name

Required

JobTracker Logical Name

For High Availability, the logical name for the JobTracker active-standby pair. This name is serialized as part of the path of the
ZooKeeper node storing high-availability data. Renaming the JobTracker requires reinitializing the ZooKeeper state.

logicaljt

job_tracker_name

false

JobTracker Local Data Directory

Directory on the local filesystem where the JobTracker stores job configuration data. Directories that do not exist are ignored. A
single directory is sufficient; a list of multiple directories will not cause problems.

mapred.local.dir

jobtracker_mapred_local_dir_list

true

Job History Files Cleaner Interval

Time interval for history cleaner to check for files to delete. Files are only deleted if they are older than
mapreduce.jobhistory.max-age-ms.

mapreduce.jobhistory.cleaner.interval

1 day(s)

mapreduce_jobhistory_cleaner_interval

false

Job History Files Maximum Age

Job history files older than this time duration will be deleted when the history cleaner runs.

mapreduce.jobhistory.max-age-ms

7 day(s)

mapreduce_jobhistory_max_age_ms

false
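
The interplay between the cleaner interval and the maximum age can be pictured with the sketch below. It is an illustration only: `files_to_delete` is a hypothetical function, file ages are derived from modification times supplied by the caller, and the strict "older than" comparison is an assumption.

```python
DAY_MS = 24 * 60 * 60 * 1000


def files_to_delete(file_mtimes_ms, now_ms, max_age_ms=7 * DAY_MS):
    """Return the history files a cleaner run would remove (illustrative).

    file_mtimes_ms maps a file name to its modification time in milliseconds.
    Only files strictly older than max_age_ms are selected.
    """
    return sorted(
        name for name, mtime in file_mtimes_ms.items()
        if now_ms - mtime > max_age_ms
    )


now = 10 * DAY_MS
history = {"job_old": 1 * DAY_MS, "job_edge": 3 * DAY_MS, "job_new": 9 * DAY_MS}
print(files_to_delete(history, now))  # ['job_old']
```

A file that ages past the limit between two cleaner runs stays on disk until the next run, so with the defaults a file can live for up to roughly eight days.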

Paths

Display Name

Description

Related Name

Default Value

API Name

Required

Running Job History Location

Location to store the job history files of running jobs. This is a path on the host where the JobTracker is running.

hadoop.job.history.location

/var/log/hadoop-0.20-mapreduce/history

hadoop_job_history_dir

false

Completed Job History Location

Location to store the job history files of completed jobs. If a location is not specified, the job history files of completed jobs
are stored in a subdirectory of the 'Running Job History Location'. If set, completed jobs will be moved into this directory in HDFS.

mapred.job.tracker.history.completed.location

mapred_job_tracker_history_completed_dir

false

MapReduce JobTracker Staging Root Directory

The root HDFS directory of the staging area for users' MapReduce jobs; for example /user. The staging directories are always named
after the user.

mapreduce.jobtracker.staging.root.dir

/user

mapreduce_jobtracker_staging_root_dir

false

Performance

Display Name

Description

Related Name

Default Value

API Name

Required

Hue Thrift Server Max Threadcount

Maximum number of running threads for the Hue Thrift server running on the JobTracker.

dfs.thrift.threads.max

20

dfs_thrift_threads_max

false

Hue Thrift Server Min Threadcount

Minimum number of running threads for the Hue Thrift server running on the JobTracker.

dfs.thrift.threads.min

10

dfs_thrift_threads_min

false

Hue Thrift Server Timeout

Timeout in seconds for the Hue Thrift server running on the JobTracker.

dfs.thrift.timeout

60

dfs_thrift_timeout

false

JobTracker Handler Count

The number of server threads for the JobTracker. This should be roughly 4% of the number of TaskTracker nodes.

mapred.job.tracker.handler.count

10

mapred_job_tracker_handler_count

false

Maximum Tasks per Job

The maximum number of tasks for a single job. Use a value of -1 to specify no maximum. Note that allowing jobs with a large number
of tasks increases memory usage by the JobTracker.

mapred.jobtracker.maxtasks.per.job

mapred_jobtracker_maxtasks_per_job

false

User JobConf Limit

The maximum allowed size of the user jobconf.

mapred.user.jobconf.limit

5 MiB

mapred_user_jobconf_limit

false

JobTracker MetaInfo Maxsize

The maximum permissible size of the split metainfo file. The JobTracker won't attempt to read split metainfo files bigger than the
configured value. No limits if set to -1.

mapreduce.jobtracker.split.metainfo.maxsize

10000000

mapreduce_jobtracker_split_metainfo_maxsize

false

Maximum Process File Descriptors

If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.

rlimit_fds

false

Plugins

Display Name

Description

Related Name

Default Value

API Name

Required

Enable JobTracker Plugins Required for Hue

If enabled, adds 'org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin' to the 'mapred.jobtracker.plugins' configuration. This property
must be enabled to allow Hue to operate.

true

hue_jobtracker_plugin

false

MapReduce JobTracker Plugins

mapred.jobtracker.plugins: Comma-separated list of JobTracker plugins to be activated. If one plugin cannot be loaded, all plugins
are ignored. Note that there are separate controls below to enable the Hue Thrift plugin.

mapred.jobtracker.plugins

mapred_jobtracker_plugins_list

false

Ports and Addresses

Display Name

Description

Related Name

Default Value

API Name

Required

JobTracker Port for HA

Port of the High Availability service protocol for the JobTracker. The JobTracker listens on a separate port for High Availability
operations, which is why this property exists in addition to 'mapred.job.tracker'.

mapred.ha.job.tracker

8023

ha_job_tracker_port

false

Bind JobTracker to Wildcard Address

If enabled, the JobTracker binds to the wildcard address ("0.0.0.0") on all of its ports.

false

job_tracker_bind_wildcard

false

JobTracker Port

Port for the internal JobTracker protocol.

mapred.job.tracker

8021

job_tracker_port

false

JobTracker HTTP Server Address

The address where the JobTracker HTTP server listens. The default address, 0.0.0.0, binds to all interfaces.

0.0.0.0

mapred_job_tracker_http_host

false

JobTracker HTTP Server Port

The port where the JobTracker HTTP server listens. If the port is 0, the server starts on a free port.

mapred.job.tracker.http.address

50030

mapred_job_tracker_http_port

false

Hue Thrift Plugin Port

Port to use for 'org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin' that is used by Hue's NameNode plugin.

Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be
given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.

cpu.shares

1024

rm_cpu_shares

true
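
To make the meaning of CPU shares concrete, the sketch below computes the fraction of CPU time a role would receive when every competing group is busy. It is a simplified model under stated assumptions: all groups are siblings in the same cgroup hierarchy and all are runnable at once. The role names and function names are hypothetical.

```python
MIN_SHARES, MAX_SHARES = 2, 262144  # valid range stated in the description


def validate_cpu_shares(shares: int) -> int:
    """Reject values outside the documented range."""
    if not MIN_SHARES <= shares <= MAX_SHARES:
        raise ValueError(f"cpu.shares must be between {MIN_SHARES} and {MAX_SHARES}")
    return shares


def contended_cpu_fraction(shares_by_role: dict, role: str) -> float:
    """Fraction of CPU a role gets when all listed roles compete for it."""
    for value in shares_by_role.values():
        validate_cpu_shares(value)
    return shares_by_role[role] / sum(shares_by_role.values())


shares = {"jobtracker": 2048, "other-role": 1024}
print(round(contended_cpu_fraction(shares, "jobtracker"), 2))  # 0.67
```

Shares only matter under contention; an otherwise idle host lets any role use as much CPU as it needs.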

Cgroup I/O Weight

Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host
experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.

blkio.weight

500

rm_io_weight

true

Cgroup Memory Hard Limit

Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages
charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default
processes not managed by Cloudera Manager will have no limit.

memory.limit_in_bytes

-1 MiB

rm_memory_hard_limit

true

Cgroup Memory Soft Limit

Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages
charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use
a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit.

memory.soft_limit_in_bytes

-1 MiB

rm_memory_soft_limit

true

Security

Display Name

Description

Related Name

Default Value

API Name

Required

Enable MapReduce ACLs

Specifies whether ACLs should be checked for authorization of users who are doing various queue and job-level operations. ACLs are
disabled by default. If enabled, the JobTracker and TaskTracker perform access control checks when users make requests for queue and job operations. Examples of queue operations are submitting a job
to the queue and killing a job in the queue. Examples of job operations are viewing the job details (mapreduce.job.acl-view-job), modifying the job (mapreduce.job.acl-modify-job), or using MapReduce
APIs, RPCs, or the console and web user interfaces.

mapred.acls.enabled

false

mapred_acls_enabled

false

MapReduce Queue ACLs

String representing an XML file that controls, per queue, which users are allowed to submit and administer jobs in that queue. The
default setting is that all users and groups are allowed to submit jobs to queue 'default' and no users or groups are allowed to administer jobs other than their own that are submitted to queue
'default'.

If enabled, administrative actions such as 'kill job' will be displayed in the JobTracker's web interface. These actions can then be
triggered by anyone who has access to the web interface.

webinterface.private.actions

false

webinterface_private_actions

false

Stacks Collection

Display Name

Description

Related Name

Default Value

API Name

Required

Stacks Collection Data Retention

The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.

stacks_collection_data_retention

100 MiB

stacks_collection_data_retention

false

Stacks Collection Directory

The directory in which stacks logs are placed. If not set, stacks are logged into a stacks
subdirectory of the role's log directory.

stacks_collection_directory

stacks_collection_directory

false

Stacks Collection Enabled

Whether or not periodic stacks collection is enabled.

stacks_collection_enabled

false

stacks_collection_enabled

true

Stacks Collection Frequency

The frequency with which stacks are collected.

stacks_collection_frequency

5.0 second(s)

stacks_collection_frequency

false

Stacks Collection Method

The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon
process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint
is periodically scraped.

stacks_collection_method

jstack

stacks_collection_method

false

Suppressions

Display Name

Description

Related Name

Default Value

API Name

Required

Suppress Configuration Validator: CDH Version Validator

Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.

false

role_config_suppression_cdh_version_validator

true

Suppress Parameter Validation: Running Job History Location

Whether to suppress configuration warnings produced by the built-in parameter validation for the Running Job History Location
parameter.

false

role_config_suppression_hadoop_job_history_dir

true

Suppress Parameter Validation: Hadoop Metrics Output Directory

Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics Output Directory
parameter.

false

role_config_suppression_hadoop_metrics_dir

true

Suppress Parameter Validation: Hadoop Metrics Ganglia Servers

Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics Ganglia Servers
parameter.

Whether to suppress configuration warnings produced by the built-in parameter validation for the MapReduce JobTracker Staging Root
Directory parameter.

false

role_config_suppression_mapreduce_jobtracker_staging_root_dir

true

Suppress Parameter Validation: Heap Dump Directory

Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter.

false

role_config_suppression_oom_heap_dump_dir

true

Suppress Parameter Validation: Role Triggers

Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter.

false

role_config_suppression_role_triggers

true

Suppress Parameter Validation: Stacks Collection Directory

Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory
parameter.

false

role_config_suppression_stacks_collection_directory

true

Suppress Health Test: File Descriptors

Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_job_tracker_file_descriptor

true

Suppress Health Test: GC Duration

Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing the
overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_job_tracker_gc_duration

true

Suppress Health Test: Heap Dump Directory Free Space

Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored
when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_job_tracker_heap_dump_directory_free_space

true

Suppress Health Test: Host Health

Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the
overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_job_tracker_host_health

true

Suppress Health Test: Log Directory Free Space

Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_job_tracker_log_directory_free_space

true

Suppress Health Test: Process Status

Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing
the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_job_tracker_scm_health

true

Suppress Health Test: Swap Memory Usage

Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_job_tracker_swap_memory_usage

true

Suppress Health Test: Unexpected Exits

Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_job_tracker_unexpected_exits

true

Suppress Health Test: Web Server Status

Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_job_tracker_web_metric_collection

true

service_wide

Advanced

Display Name

Description

Related Name

Default Value

API Name

Required

System User's Home Directory

The home directory of the system user on the local filesystem. This setting must reflect the system's configured value - only
changing it here will not change the actual home directory.

For advanced use only, a string to be inserted into ssl-server.xml. Applies to configurations of all
roles in this service except client configuration.

mapreduce_ssl_server_safety_valve

false

System Group

The group that this service's processes should run as.

hadoop

process_groupname

true

System User

The user that this service's processes should run as.

mapred

process_username

true

Monitoring

Display Name

Description

Related Name

Default Value

API Name

Required

Enable Log Event Capture

When set, each role identifies important log events and forwards them to Cloudera Manager.

true

catch_events

false

Enable Service Level Health Alerts

When set, Cloudera Manager will send alerts when the health of this service reaches the threshold specified by the EventServer
setting eventserver_health_events_alert_threshold.

true

enable_alerts

false

Enable Configuration Change Alerts

When set, Cloudera Manager will send alerts when this entity's configuration changes.

false

enable_config_alerts

false

Failover Controllers Healthy

Enables the health check that verifies that the failover controllers associated with this service are healthy and running.

true

failover_controllers_healthy_enabled

false

Activity Duration Rules

To generate an event when certain activities are running slowly, enter rules for the activities in this setting. The syntax for a
rule is 'regex=number' where number is in minutes. Enter one rule per line in this text box.
When a new activity starts, each regular expression is tested against the name of the activity for a match. The first rule that matches is used. If an activity
matches a rule and runs longer than the number of minutes, an event will be sent.

firehose_activity_duration_rules

false
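The rule syntax described above can be sketched in a few lines of Python. This is only an illustration of the 'regex=number' format and the first-match-wins behavior, not Cloudera Manager's implementation. The function names are hypothetical. Splitting on the last '=' and using an unanchored search are assumptions, because the description does not state either detail.

```python
import re

def parse_duration_rules(text):
    """Parse one 'regex=number' rule per line into (compiled_regex, minutes) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: split on the last '=' so the regex itself may contain '='.
        pattern, _, minutes = line.rpartition("=")
        rules.append((re.compile(pattern), float(minutes)))
    return rules

def limit_for_activity(rules, activity_name):
    """Return the minutes limit of the first rule that matches, or None."""
    for regex, minutes in rules:
        # Assumption: an unanchored search counts as a match.
        if regex.search(activity_name):
            return minutes
    return None

def is_slow(rules, activity_name, elapsed_minutes):
    """True if the activity matches a rule and has run longer than its limit."""
    limit = limit_for_activity(rules, activity_name)
    return limit is not None and elapsed_minutes > limit
```

Because the first matching rule is used, more specific patterns should be listed before catch-all patterns such as `.*`.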

Alert on Activity Failure

If enabled, an alert will be generated when any activity fails.

true

firehose_activity_failure_alert

false

Alert on Slow Activities

If enabled, an alert will be generated when an activity has been running longer than the duration specified in the 'Activity
Duration Rules' setting.

true

firehose_activity_slow_alert

false

Log Event Retry Frequency

The frequency, in seconds, with which the log4j event publication appender retries sending undelivered log events to the Event
Server.

30

log_event_retry_frequency

false

Active JobTracker Detection Window

The tolerance window that will be used in MapReduce service tests that depend on detection of the active JobTracker.

3 minute(s)

mapreduce_active_jobtracker_detecton_window

false

JobTracker Activation Startup Tolerance

The amount of time after JobTracker(s) start that the lack of an active JobTracker will be tolerated. This is intended to allow
either the auto-failover daemon to make a JobTracker active, or a specifically issued failover command to take effect. This is an advanced option that does not often need to be changed.

When computing the overall cluster health, consider the health of the standby JobTracker.

true

mapreduce_standby_jobtrackers_health_enabled

false

Healthy TaskTracker Monitoring Thresholds

The health test thresholds of the overall TaskTracker health. The check returns "Concerning" health if the percentage of "Healthy"
TaskTrackers falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" TaskTrackers falls below the critical threshold.

Warning: 95.0 %, Critical: 90.0 %

mapreduce_tasktrackers_healthy_thresholds

false
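As a rough illustration of how the two thresholds combine, the following Python sketch classifies overall TaskTracker health from role counts. The function name, the result labels "Good" and "Bad", and the handling of an empty service are assumptions made for the example. Only the two percentage comparisons come from the description above.

```python
def tasktracker_health(healthy, concerning, total, warning=95.0, critical=90.0):
    """Classify overall TaskTracker health from counts of roles in each state."""
    if total == 0:
        # Assumption: with no TaskTrackers there is nothing to evaluate.
        return "Unknown"
    healthy_pct = 100.0 * healthy / total
    healthy_or_concerning_pct = 100.0 * (healthy + concerning) / total
    # Critical threshold applies to "Healthy" plus "Concerning" TaskTrackers.
    if healthy_or_concerning_pct < critical:
        return "Bad"
    # Warning threshold applies to "Healthy" TaskTrackers only.
    if healthy_pct < warning:
        return "Concerning"
    return "Good"
```

With the default thresholds, 94 healthy and 6 concerning TaskTrackers out of 100 would be reported as "Concerning", because the healthy share is below 95% while the combined share is still above 90%.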

Service Triggers

The configured triggers for this service. This is a JSON formatted list of triggers. These triggers are evaluated as part of the
health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:

triggerName(mandatory) - The name of the trigger. This value must be unique for the specific service.

streamThreshold(optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition
fires. By default set to 0, and any stream returned causes the condition to fire.

enabled(optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.

expressionEditorConfig(optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the
Edit Trigger page; editing the trigger here can lead to inconsistencies.

For example, the following JSON formatted trigger fires if there are more than 10 DataNodes with more than 500 file descriptors opened:

[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleType = DataNode and last(fd_open) > 500) DO health:bad", "streamThreshold": 10, "enabled": "true"}]

See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.

[]

service_triggers

true
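When the trigger list is generated by a script rather than typed by hand, a small helper can keep the JSON well formed. The Python sketch below builds the sample trigger shown above and checks the one constraint the description states explicitly: every trigger needs a unique triggerName. The helper names are hypothetical. Writing "enabled" as a string mirrors the sample, and whether other encodings are accepted is not covered here.

```python
import json

def make_trigger(name, expression, stream_threshold=0, enabled=True):
    """Build one trigger dictionary using the field names from the description."""
    return {
        "triggerName": name,
        "triggerExpression": expression,
        "streamThreshold": stream_threshold,
        # The sample in the description writes this flag as a string.
        "enabled": "true" if enabled else "false",
    }

def check_trigger_names(triggers):
    """Return a list of problems: missing or duplicate trigger names."""
    problems = []
    seen = set()
    for trigger in triggers:
        name = trigger.get("triggerName")
        if not name:
            problems.append("missing triggerName")
        elif name in seen:
            problems.append("duplicate triggerName: " + name)
        else:
            seen.add(name)
    return problems

triggers = [
    make_trigger(
        "sample-trigger",
        "IF (SELECT fd_open WHERE roleType = DataNode and last(fd_open) > 500) "
        "DO health:bad",
        stream_threshold=10,
    )
]
service_triggers_value = json.dumps(triggers)
```

The resulting string in `service_triggers_value` is the kind of value this property expects. The default, `[]`, is simply an empty list.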

Service Monitor Client Config Overrides

For advanced use only, a list of configuration properties that will be used by the Service Monitor instead of the current client
configuration for the service.

For advanced use only, a list of derived configuration properties that will be used by the Service Monitor instead of the default
ones.

smon_derived_configs_safety_valve

false

Other

Display Name

Description

Related Name

Default Value

API Name

Required

HDFS Service

Name of the HDFS service that this MapReduce service instance depends on

hdfs_service

true

ZooKeeper Service

Name of the ZooKeeper service that this MapReduce service instance depends on

zookeeper_service

false

Paths

Display Name

Description

Related Name

Default Value

API Name

Required

MapReduce System Directory

The HDFS directory where the MapReduce service stores system files. This directory must be accessible from both the server and
client machines. For example: /hadoop/mapred/system/

mapred.system.dir

/tmp/mapred/system

mapred_system_dir

false

Performance

Display Name

Description

Related Name

Default Value

API Name

Required

Enable HDFS Short-Circuit Read

Enable HDFS short-circuit read. This allows a client colocated with the DataNode to read HDFS file blocks directly. This gives a
performance boost to distributed clients that are aware of locality.

dfs.client.read.shortcircuit

false

dfs_client_read_shortcircuit

false

SequenceFile I/O Buffer Size

Size of buffer for read and write operations of SequenceFiles.

io.file.buffer.size

64 KiB

io_file_buffer_size

false

Job Counters Limit

Limit on the number of counters allowed per job.

mapreduce.job.counters.max

120

mapreduce_job_counters_limit

false

Security

Display Name

Description

Related Name

Default Value

API Name

Required

Enable Kerberos Authentication for HTTP Web-Consoles

Enables Kerberos authentication for Hadoop HTTP web consoles for all roles of this service using the SPNEGO protocol. Note: This is effective only if Kerberos is enabled for the HDFS service.

false

hadoop_secure_web_ui

false

Hue's Kerberos Principal Short Name

The short name of Hue's Kerberos principal. Normally, you do not need to specify this configuration. Cloudera Manager
auto-configures this property so that Hue and the Cloudera Management Service work properly.

hue.kerberos.principal.shortname

hue_kerberos_principal_shortname

false

Kerberos Principal

Kerberos principal short name used by all roles of this service.

mapred

kerberos_princ_name

true

TLS/SSL Client Truststore File Location

Path to the truststore file used when roles of this service act as TLS/SSL clients. Overrides the cluster-wide default truststore
location set in HDFS. This truststore must be in JKS format. The truststore contains certificates of trusted servers, or of Certificate Authorities trusted to identify servers. The contents of the
truststore can be modified without restarting any roles. By default, changes to its contents are picked up within ten seconds. If not set, the default Java truststore is used to verify
certificates.

ssl.client.truststore.location

ssl_client_truststore_location

false

TLS/SSL Client Truststore File Password

Password for the TLS/SSL client truststore. Overrides the cluster-wide default truststore password set in HDFS.

ssl.client.truststore.password

ssl_client_truststore_password

false

Hadoop TLS/SSL Server Keystore Key Password

Password that protects the private key contained in the server keystore used for encrypted shuffle and encrypted web UIs. Applies to
configurations of all daemon roles of this service.

ssl.server.keystore.keypassword

ssl_server_keystore_keypassword

false

Hadoop TLS/SSL Server Keystore File Location

Path to the keystore file containing the server certificate and private key used for encrypted shuffle and encrypted web UIs.
Applies to configurations of all daemon roles of this service.

ssl.server.keystore.location

ssl_server_keystore_location

false

Hadoop TLS/SSL Server Keystore File Password

Password for the server keystore file used for encrypted shuffle and encrypted web UIs. Applies to configurations of all daemon
roles of this service.

Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop TLS/SSL Server Keystore File
Password parameter.

false

service_config_suppression_ssl_server_keystore_password

true

Suppress Configuration Validator: TaskTracker Count Validator

Whether to suppress configuration warnings produced by the TaskTracker Count Validator configuration validator.

false

service_config_suppression_tasktracker_count_validator

true

Suppress Health Test: Failover Controllers Health

Whether to suppress the results of the Failover Controllers Health health test. The results of suppressed health tests are ignored
when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

service_health_suppression_mapreduce_failover_controllers_healthy

true

Suppress Health Test: JobTracker Health

Whether to suppress the results of the JobTracker Health health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

service_health_suppression_mapreduce_ha_job_tracker_health

true

Suppress Health Test: TaskTracker Health

Whether to suppress the results of the TaskTracker Health health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

service_health_suppression_mapreduce_task_trackers_healthy

true

tasktracker

Advanced

Display Name

Description

Related Name

Default Value

API Name

Required

Hadoop Metrics Advanced Configuration Snippet (Safety Valve)

Advanced Configuration Snippet (Safety Valve) for Hadoop Metrics. Properties will be inserted into hadoop-metrics.properties for this role only. Note that Cloudera Manager tunes hadoop-metrics.properties to work optimally with its Service Monitoring features. By overriding the
default, Cloudera Manager might not be able to provide accurate monitoring information, health tests or alerts.

hadoop_metrics_safety_valve

false

TaskTracker Logging Advanced Configuration Snippet (Safety Valve)

For advanced use only, a string to be inserted into log4j.properties for this role only.

log4j_safety_valve

false

Healthchecker Script Arguments

Comma-separated list of arguments to pass to the node health script when it is launched.

mapred.healthChecker.script.args

mapred_healthchecker_script_args

false

Healthchecker Script Path

Absolute path to the script which is periodically run by the node health monitoring service to determine if the node is healthy or
not. If the value of this key is empty or the file does not exist in the location configured here, the node health monitoring service is not started.

mapred.healthChecker.script.path

mapred_healthchecker_script_path

false

Heap Dump Directory

Path to the directory where heap dumps are generated when java.lang.OutOfMemoryError is thrown. This directory is automatically
created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions.
The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured
for this role.

oom_heap_dump_dir

/tmp

oom_heap_dump_dir

false

Dump Heap When Out of Memory

When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown.

false

oom_heap_dump_enabled

true

Kill When Out of Memory

When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.

true

oom_sigkill_enabled

true

Automatically Restart Process

When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure.

For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of
this role except client configuration.

TASKTRACKER_role_env_safety_valve

false

Classes

Display Name

Description

Related Name

Default Value

API Name

Required

TaskTracker Instrumentation Class

The instrumentation class to associate with each TaskTracker. If using Cloudera's Activity Monitor, adjust this to use
org.apache.hadoop.mapred.TaskTrackerCmonInst.

mapred.tasktracker.instrumentation

org.apache.hadoop.mapred.TaskTrackerMetricsInst

mapred_tasktracker_instrumentation

false

Compression

Display Name

Description

Related Name

Default Value

API Name

Required

Compression Codecs (Client Override)

Comma-separated list of compression codecs that can be used in job or map compression.

io.compression.codecs

override_io_compression_codecs

false

Use Compression on Map Outputs (Client Override)

If enabled, uses compression on the map outputs before they are sent across the network. Will override value in client
configuration.

mapred.compress.map.output

no_override

override_mapred_compress_map_output

false

Compression Codec of MapReduce Map Output (Client Override)

For MapReduce map outputs that are compressed, specify the compression codec to use. Will override value in client
configuration.

mapred.map.output.compression.codec

override_mapred_map_output_compression_codec

false

Compress MapReduce Job Output (Client Override)

Compress the output of MapReduce jobs. Will override value in client configuration.

mapred.output.compress

no_override

override_mapred_output_compress

false

Compression Codec of MapReduce Job Output (Client Override)

For MapReduce job outputs that are compressed, specify the compression codec to use. Will override value in client
configuration.

mapred.output.compression.codec

override_mapred_output_compression_codec

false

Compression Type of MapReduce Job Output (Client Override)

For MapReduce job outputs that are compressed as SequenceFiles, you can select one of these compression type options: NONE, RECORD
or BLOCK. Cloudera recommends BLOCK. Will override value in client configuration.

mapred.output.compression.type

override_mapred_output_compression_type

false

Jobs

Display Name

Description

Related Name

Default Value

API Name

Required

Number of Tasks to Run per JVM (Client Override)

Number of tasks to run per JVM. If set to -1, there is no limit. Will override value in client configuration.

mapred.job.reuse.jvm.num.tasks

override_mapred_job_reuse_jvm_num_tasks

false

Map Tasks Speculative Execution (Client Override)

If enabled, multiple instances of some map tasks may be executed in parallel.

mapred.map.tasks.speculative.execution

no_override

override_mapred_map_tasks_speculative_execution

false

Number of Map Tasks to Complete Before Reduce Tasks (Client Override)

Fraction of the number of map tasks in the job which should be completed before reduce tasks are scheduled for the job.

mapred.reduce.slowstart.completed.maps

override_mapred_reduce_slowstart_completed_maps

false

Reduce Tasks Speculative Execution (Client Override)

If enabled, multiple instances of some reduce tasks may be executed in parallel.

mapred.reduce.tasks.speculative.execution

no_override

override_mapred_reduce_tasks_speculative_execution

false

Mapreduce Submit Replication (Client Override)

The replication level for submitted job files.

mapred.submit.replication

override_mapred_submit_replication

false

Maximum Time to Retain User Logs (Client Override)

The maximum time, in hours, to retain the user logs after job completion.

mapred.userlog.retain.hours

override_mapred_userlog_retain_hours

false

Logs

Display Name

Description

Related Name

Default Value

API Name

Required

TaskTracker Logging Threshold

The minimum log level for TaskTracker logs

INFO

log_threshold

false

TaskTracker Maximum Log File Backups

The maximum number of rolled log files to keep for TaskTracker logs. Typically used by log4j or logback.

10

max_log_backup_index

false

TaskTracker Max Log Size

The maximum size, in megabytes, per log file for TaskTracker logs. Typically used by log4j or logback.

200 MiB

max_log_size

false

TaskTracker Log Directory

Directory where TaskTracker will place its log files.

hadoop.log.dir

/var/log/hadoop-0.20-mapreduce

tasktracker_log_dir

false

Metrics

Display Name

Description

Related Name

Default Value

API Name

Required

Hadoop Metrics Class

The implementation that daemons use to report some internal statistics. The default (NoEmitMetricsContext) will display metrics on
/metrics on the status port. The GangliaContext and GangliaContext31 classes will report metrics to your specified Ganglia Monitoring Daemons (gmond). The Ganglia wire format changed incompatibly at
version 3.1.0. If you are running Ganglia 3.1.0 or newer, use the GangliaContext31 metric class; otherwise, use the GangliaContext metric class.

org.apache.hadoop.metrics.spi.NoEmitMetricsContext

hadoop_metrics_class

false

Hadoop Metrics Output Directory

If using FileContext, directory to write metrics to.

/tmp/metrics

hadoop_metrics_dir

false

Hadoop Metrics Ganglia Servers

If using GangliaContext, a comma-delimited list of host:port pairs pointing to 'gmond' servers you would like to publish metrics
to. In practice, this set of 'gmond' should match the set of 'gmond' in your 'gmetad' datasource list for the cluster.

hadoop_metrics_ganglia_servers

false

Monitoring

Display Name

Description

Related Name

Default Value

API Name

Required

Enable Health Alerts for this Role

When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting
eventserver_health_events_alert_threshold.

false

enable_alerts

false

Enable Configuration Change Alerts

When set, Cloudera Manager will send alerts when this entity's configuration changes.

false

enable_config_alerts

false

Heap Dump Directory Free Space Monitoring Absolute Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.

Warning: 10 GiB, Critical: 5 GiB

heap_dump_directory_free_space_absolute_thresholds

false

Heap Dump Directory Free Space Monitoring Percentage Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified
as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.

Warning: Never, Critical: Never

heap_dump_directory_free_space_percentage_thresholds

false
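The relationship between the absolute and percentage thresholds for free space can be sketched as follows. This is an illustration under stated assumptions, not Cloudera Manager's logic: the function name and result labels are hypothetical, `None` stands in for a threshold of "Never", and a strict less-than comparison is assumed. The one rule taken from the descriptions is that the percentage thresholds are not used when absolute thresholds are configured.

```python
GIB = 1024 ** 3

def free_space_health(free_bytes, capacity_bytes, absolute=None, percentage=None):
    """Classify free space given (warning, critical) threshold pairs.

    A threshold of None stands for 'Never'.
    """
    if absolute is not None and any(value is not None for value in absolute):
        # Absolute thresholds are configured, so percentage thresholds are ignored.
        warning, critical = absolute
        measured = free_bytes
    elif percentage is not None:
        warning, critical = percentage
        measured = 100.0 * free_bytes / capacity_bytes
    else:
        return "Good"
    # Assumption: a threshold is crossed when free space drops strictly below it.
    if critical is not None and measured < critical:
        return "Bad"
    if warning is not None and measured < warning:
        return "Concerning"
    return "Good"
```

With the defaults listed above (Warning: 10 GiB, Critical: 5 GiB, percentage thresholds set to Never), a filesystem with 8 GiB free would be classified as "Concerning" in this sketch.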

Log Directory Free Space Monitoring Absolute Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.

Warning: 10 GiB, Critical: 5 GiB

log_directory_free_space_absolute_thresholds

false

Log Directory Free Space Monitoring Percentage Thresholds

The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a
percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.

Warning: Never, Critical: Never

log_directory_free_space_percentage_thresholds

false

Rules to Extract Events from Log Files

This file contains the rules which govern how log messages are turned into events by the custom log4j appender that this role
loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each
rule has some or all of the following fields:

alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not
specified, the default is "false".

rate(mandatory) - the maximum number of log messages matching this rule that may be sent as events every minute. If more
than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of
messages per minute is unlimited.

periodminutes - the number of minutes during which the publisher will only publish rate events
or fewer. If not specified, the default is one minute.

threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.

exceptiontype - match only those messages which are part of an exception message. The exception type must match this regular expression.

Example:

{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}

This rule will send events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
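To make the interaction of threshold, exceptiontype, rate, and periodminutes concrete, here is a minimal Python sketch of how a single rule could be applied. It is not the appender's actual code. The severity ordering, the unanchored regular expression search, and the fixed-window counter are all assumptions, and the names `rule_matches` and `RateLimiter` are hypothetical.

```python
import re

# Assumed log4j severity ordering, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]

def rule_matches(rule, level, exception_type=None):
    """Check a message's level and exception type against one rule."""
    threshold = rule.get("threshold")
    if threshold is not None and LEVELS.index(level) < LEVELS.index(threshold):
        return False
    pattern = rule.get("exceptiontype")
    if pattern is not None:
        # The rule only applies to messages that carry a matching exception.
        if exception_type is None or re.search(pattern, exception_type) is None:
            return False
    return True

class RateLimiter:
    """Allow at most `rate` events per window of `periodminutes` minutes."""

    def __init__(self, rate, periodminutes=1):
        self.rate = rate
        self.period_seconds = periodminutes * 60
        self.window_start = None
        self.count = 0

    def allow(self, now_seconds):
        if self.rate < 0:
            return True  # a negative rate means unlimited
        expired = (self.window_start is None
                   or now_seconds - self.window_start >= self.period_seconds)
        if expired:
            # Assumption: windows are fixed and restart at the next event.
            self.window_start = now_seconds
            self.count = 0
        if self.count < self.rate:
            self.count += 1
            return True
        return False
```

Applied to the example rule, twelve matching messages in the same minute would produce ten events, and the counter would start again once the minute has passed.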

The configured triggers for this role. This is a JSON formatted list of triggers. These triggers are evaluated as part of the
health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:

triggerName(mandatory) - The name of the trigger. This value must be unique for the specific role.

streamThreshold(optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition
fires. By default set to 0, and any stream returned causes the condition to fire.

enabled(optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.

expressionEditorConfig(optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the
Edit Trigger page; editing the trigger here can lead to inconsistencies.

For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:

[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}]

See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.

[]

role_triggers

true

TaskTracker Blacklisted Health Test

Enables the health test that checks that the TaskTracker is not blacklisted.

true

tasktracker_blacklisted_health_enabled

false

TaskTracker Connectivity Health Test

Enables the health test that checks that the TaskTracker is connected to the JobTracker.

true

tasktracker_connectivity_health_enabled

false

TaskTracker Connectivity Tolerance at Startup

The amount of time to wait for the TaskTracker to fully start up and connect to the JobTracker before enforcing the connectivity
check.

3 minute(s)

tasktracker_connectivity_tolerance

false

File Descriptor Monitoring Thresholds

The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.

Warning: 50.0 %, Critical: 70.0 %

tasktracker_fd_thresholds

false

Garbage Collection Duration Thresholds

The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed
wall clock time.

Warning: 30.0, Critical: 60.0

tasktracker_gc_duration_thresholds

false

Garbage Collection Duration Monitoring Period

The period to review when computing the moving average of garbage collection time.

5 minute(s)

tasktracker_gc_duration_window

false

TaskTracker Host Health Test

When computing the overall TaskTracker health, consider the host's health.

The health test thresholds for monitoring of free space on the filesystem that contains this role's TaskTracker Local Data
Directories. Specified as a percentage of the capacity on that filesystem. This setting is not used if a TaskTracker Local Data Directories Free Space Monitoring Absolute Thresholds setting is
configured.

Warning: Never, Critical: Never

tasktracker_local_data_directories_free_space_percentage_thresholds

false

TaskTracker Process Health Test

Enables the health test that checks that the TaskTracker's process state is consistent with the role configuration.

true

tasktracker_scm_health_enabled

false

Web Metric Collection

Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server.

true

tasktracker_web_metric_collection_enabled

false

Web Metric Collection Duration

The health test thresholds on the duration of the metrics request to the web server.

Warning: 10 second(s), Critical: Never

tasktracker_web_metric_collection_thresholds

false

Unexpected Exits Thresholds

The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window
configuration for the role.

Warning: Never, Critical: Any

unexpected_exits_thresholds

false

Unexpected Exits Monitoring Period

The period to review when computing unexpected exits.

5 minute(s)

unexpected_exits_window

false

Other

Display Name

Description

Related Name

Default Value

API Name

Required

TaskTracker Local Data Directories

List of directories on the local filesystem where a TaskTracker stores intermediate data files. To spread disk I/O, enter a
comma-separated list of directories on different devices. Directories that do not exist are ignored. Typical values are /data/N/mapred/local for N = 1, 2, 3...

mapred.local.dir

tasktracker_mapred_local_dir_list

true
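For hosts with many data disks, the comma-separated value can be generated rather than typed. The short Python sketch below follows the /data/N/mapred/local pattern mentioned above. The function name and the `template` parameter are hypothetical, and the mount points are assumed to exist already, since directories that do not exist are ignored.

```python
def local_dir_list(disk_count, template="/data/{n}/mapred/local"):
    """Build a comma-separated directory list for disks numbered 1..disk_count."""
    return ",".join(template.format(n=n) for n in range(1, disk_count + 1))
```

For example, `local_dir_list(3)` yields `/data/1/mapred/local,/data/2/mapred/local,/data/3/mapred/local`.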

Performance

Display Name

Description

Related Name

Default Value

API Name

Required

I/O Sort Factor (Client Override)

The number of streams to merge at once while sorting files. That is, the number of sort heads to use during the merge sort on the
reducer side. This determines the number of open file handles. Merging more files in parallel reduces merge sort iterations and improves run time by eliminating disk I/O. Note that merging more files
in parallel uses more memory. If 'io.sort.factor' is set too high or the maximum JVM heap is set too low, excessive garbage collection will occur. The Hadoop default is 10, but Cloudera recommends a
higher value. Will override value in client configuration.

io.sort.factor

override_io_sort_factor

false

I/O Sort Memory Buffer (MiB) (Client Override)

The total amount of memory buffer, in megabytes, to use while sorting files. Note that this memory comes out of the user JVM heap
size (meaning total user JVM heap - this amount of memory = total user usable heap space). Note that Cloudera's default differs from Hadoop's default; Cloudera uses a bigger buffer by default because
modern machines often have more RAM. Will override value in client configuration.

io.sort.mb

override_io_sort_mb

false

I/O Sort Record Percent (Client Override)

The percentage of 'io.sort.mb' dedicated to tracking record boundaries. If this value is represented as 'r', and 'io.sort.mb' is
represented as 'x', then the maximum number of records collected before the collection thread must block is equal to (r * x) / 4. Specify the value as a decimal fraction; the default of 5% is written as
0.05. Will override value in client configuration.

io.sort.record.percent

override_io_sort_record_percent

false

I/O Sort Spill Percent (Client Override)

The soft limit in either the buffer or record collection buffers. When this limit is reached, a thread will begin to spill the
contents to disk in the background. Note that this does not imply any chunking of data to the spill. A value less than 0.5 is not recommended. Specify the value as a decimal fraction; the default of 80%
is written as 0.8. Will override value in client configuration.

io.sort.spill.percent

override_io_sort_spill_percent

false

MapReduce Child Java Opts Base (Client Override)

Java opts for the TaskTracker child processes. The following symbol, if present, will be interpolated: @taskid@ is replaced by
current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable verbose gc logging to a file named for the taskid in /tmp pass a value of: "-verbose:gc
-Xloggc:/tmp/@taskid@.gc". The configuration variable 'mapred.child.ulimit' can be used to control the maximum virtual memory of the child processes. Note that unlike Hadoop, Cloudera Manager
separates the child options into this setting and a separate setting just for the maximum heap size. Will override value in client configuration.

mapred.child.java.opts

override_mapred_child_java_opts_base

false
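The @taskid@ substitution described above amounts to a literal string replacement. The Python sketch below shows the effect on the sample value from the description. It is illustrative only: the function name is hypothetical, and the task ID used in the usage note is a made-up value in the usual attempt-ID style.

```python
def interpolate_child_opts(opts, task_id):
    """Replace every '@taskid@' token with the task ID; other '@' characters are untouched."""
    return opts.replace("@taskid@", task_id)
```

Calling `interpolate_child_opts("-verbose:gc -Xloggc:/tmp/@taskid@.gc", "attempt_0001_m_000000_0")` gives `-verbose:gc -Xloggc:/tmp/attempt_0001_m_000000_0.gc`, so each task writes its garbage collection log to its own file.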

Map Task Java Opts Base (Client Override)

Java opts for the TaskTracker child map processes. The following symbol, if present, will be interpolated: @taskid@ is replaced by
current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable verbose gc logging to a file named for the taskid in /tmp pass a value of: "-verbose:gc
-Xloggc:/tmp/@taskid@.gc". The configuration variable 'Map Task Maximum Virtual Memory' can be used to control the maximum virtual memory of the map processes. This takes precedence over the generic
'mapred.child.java.opts'.

mapred.map.child.java.opts

override_mapred_map_task_java_opts

false

Default Number of Parallel Transfers During Shuffle (Client Override)

The default number of parallel transfers run by reduce during the copy (shuffle) phase. This number should be between
sqrt(nodes*number_of_map_slots_per_node) and nodes*s/2. Will override value in client configuration.

mapred.reduce.parallel.copies

override_mapred_reduce_parallel_copies

false
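The suggested range for this setting can be worked out with a few lines of Python. The sketch assumes that `s` in the upper bound stands for the number of map slots per node, which the description does not state explicitly. The function name is hypothetical.

```python
import math

def parallel_copies_range(nodes, map_slots_per_node):
    """Return the (lower, upper) bounds suggested for mapred.reduce.parallel.copies."""
    # Assumption: 's' in the description means map slots per node.
    total_slots = nodes * map_slots_per_node
    return math.sqrt(total_slots), total_slots / 2
```

For a cluster of 50 nodes with 2 map slots each, the bounds are 10 and 50, so a value somewhere in that interval would fit the guidance.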

Reduce Task Java Opts Base (Client Override)

Java opts for the TaskTracker child reduce processes. The following symbol, if present, will be interpolated: @taskid@ is replaced by
current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable verbose gc logging to a file named for the taskid in /tmp pass a value of: "-verbose:gc
-Xloggc:/tmp/@taskid@.gc". The configuration variable 'Reduce Task Maximum Virtual Memory' can be used to control the maximum virtual memory of the reduce processes. This takes precedence over the
generic 'mapred.child.java.opts'.

mapred.reduce.child.java.opts

override_mapred_reduce_task_java_opts

false

Maximum Process File Descriptors

If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.

rlimit_fds

false

Number of TaskTracker HTTP Threads

The number of worker threads for the HTTP server. This is used for map output fetching.

tasktracker.http.threads

80

tasktracker_http_threads

false

Ports and Addresses

Display Name

Description

Related Name

Default Value

API Name

Required

TaskTracker Activity Monitor Instrumentation Plugin Address

Address where TaskTracker Activity Monitor instrumentation plugin listens for requests. This setting is ignored unless the
TaskTracker Instrumentation Class is set to org.apache.hadoop.mapred.TaskTrackerCmonInst. This is usually set to 127.0.0.1.

mapred.tasktracker.instrumentation.cmon.jettyhost

127.0.0.1

mapred_tasktracker_instrumentation_cmon_jettyhost

false

TaskTracker Activity Monitor Instrumentation Plugin Port

Port where TaskTracker Activity Monitor instrumentation plugin listens for requests. This setting is ignored unless the TaskTracker
Instrumentation Class is set to org.apache.hadoop.mapred.TaskTrackerCmonInst.

mapred.tasktracker.instrumentation.cmon.jettyport

4867

mapred_tasktracker_instrumentation_cmon_jettyport

false

TaskTracker Web UI Address

Address where TaskTracker listens for web requests.

0.0.0.0

task_tracker_http_address

false

TaskTracker Web UI Port

Port where TaskTracker listens for web requests.

mapred.task.tracker.http.address

50060

task_tracker_http_port

false

Resource Management

Display Name

Description

Related Name

Default Value

API Name

Required

Maximum Number of Simultaneous Map Tasks

The maximum number of map tasks that a TaskTracker can run simultaneously. Sometimes referred to as "map slots."

mapred.tasktracker.map.tasks.maximum

2

mapred_tasktracker_map_tasks_maximum

false

Maximum Number of Simultaneous Reduce Tasks

The maximum number of reduce tasks that a TaskTracker can run simultaneously. Sometimes referred to as "reduce slots."

mapred.tasktracker.reduce.tasks.maximum

2

mapred_tasktracker_reduce_tasks_maximum

false

MapReduce Child Java Maximum Heap Size (Client Override)

The maximum heap size, in bytes, of the Java child process. This number will be formatted and concatenated with the 'base' setting
for 'mapred.child.java.opts' to pass to Hadoop. Will override value in client configuration.

override_mapred_child_java_opts_max_heap

false
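The "formatted and concatenated" step can be sketched as follows. The exact flag format Cloudera Manager emits is not stated here; this sketch assumes the heap size is passed as a plain byte count in a -Xmx flag appended to the base options, and the function name is hypothetical.

```python
def child_java_opts(base_opts: str, max_heap_bytes: int) -> str:
    """Join the 'base' Java opts with a heap flag built from a byte count."""
    # Assumption: the heap limit is rendered as -Xmx<bytes>.
    heap_flag = f"-Xmx{max_heap_bytes}"
    return f"{base_opts} {heap_flag}".strip()


if __name__ == "__main__":
    # 1 GiB heap on top of GC logging options.
    print(child_java_opts("-verbose:gc -Xloggc:/tmp/@taskid@.gc", 1024 ** 3))
```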

MapReduce Maximum Virtual Memory (KiB) (Client Override)

The maximum virtual memory, in KiB, of a process launched by the MapReduce framework. This can be used to control both the
MapReduce tasks and applications using Hadoop Pipes, Hadoop Streaming, and so on. By default, it is left unspecified to allow administrators to control it via 'limits.conf' and other mechanisms.
Note: 'mapred.child.ulimit' must be greater than or equal to the -Xmx passed to JavaVM, or else the VM might not start. Will override value in client configuration.

mapred.child.ulimit

override_mapred_child_ulimit

false

Map Task Maximum Heap Size (Client Override)

The maximum heap size, in bytes, of the child map processes. This number will be formatted and concatenated with 'Map Task Java
Opts Base' to pass to Hadoop. Will override value in client configuration.

override_mapred_map_task_max_heap

false

Map Task Maximum Virtual Memory (KiB) (Client Override)

The maximum virtual memory, in KiB, available to map tasks. Note: this must be greater than or equal to the -Xmx passed to the
JavaVM via 'Map Task Java Opts', or else the VM might not start. This takes precedence over the generic 'mapred.child.ulimit'. Will override value in client configuration.

mapred.map.child.ulimit

override_mapred_map_task_ulimit

false
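Because the heap size settings are given in bytes and the virtual memory settings in KiB, the two are easy to mix up. A minimal sanity check, with a hypothetical function name, might look like this:

```python
KIB = 1024  # bytes per KiB


def ulimit_covers_heap(ulimit_kib: int, max_heap_bytes: int) -> bool:
    """Return True if the virtual memory limit (KiB) is at least the heap size (bytes)."""
    return ulimit_kib * KIB >= max_heap_bytes


if __name__ == "__main__":
    one_gib = 1024 ** 3
    print(ulimit_covers_heap(ulimit_kib=2 * 1024 * 1024, max_heap_bytes=one_gib))
    print(ulimit_covers_heap(ulimit_kib=512 * 1024, max_heap_bytes=one_gib))
```

The check covers only the minimum stated in the description. A JVM also needs virtual memory beyond its heap, so a limit that merely equals -Xmx may still be too tight in practice.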

Reduce Task Maximum Heap Size (Client Override)

The maximum heap size, in bytes, of the child reduce processes. This number will be formatted and concatenated with 'Reduce Task
Java Opts Base' to pass to Hadoop. Will override value in client configuration.

override_mapred_reduce_task_max_heap

false

Reduce Task Maximum Virtual Memory (KiB) (Client Override)

The maximum virtual memory, in KiB, available to reduce tasks. Note: this must be greater than or equal to the -Xmx passed to the
JavaVM via 'Reduce Task Java Opts', or else the VM might not start. This takes precedence over the generic 'mapred.child.ulimit'. Will override value in client configuration.

mapred.reduce.child.ulimit

override_mapred_reduce_task_ulimit

false

Cgroup CPU Shares

Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be
given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.

cpu.shares

1024

rm_cpu_shares

true
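To illustrate how shares translate into CPU time under contention, the sketch below divides CPU proportionally among competing groups. It assumes all groups sit at the same cgroup level and are all busy; the role names and function name are hypothetical.

```python
from typing import Dict

MIN_SHARES = 2
MAX_SHARES = 262144


def cpu_fraction_under_contention(shares: Dict[str, int]) -> Dict[str, float]:
    """Return each group's fraction of CPU time when every group is busy."""
    for name, value in shares.items():
        if not MIN_SHARES <= value <= MAX_SHARES:
            raise ValueError(f"{name}: cpu.shares must be between 2 and 262144")
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


if __name__ == "__main__":
    # A role at the default 1024 competing with a role given 3072 shares.
    print(cpu_fraction_under_contention({"tasktracker": 1024, "datanode": 3072}))
```

When the host is not under CPU contention, shares place no cap on a role, so these fractions describe only the contended case.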

Cgroup I/O Weight

Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host
experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.

blkio.weight

500

rm_io_weight

true

Cgroup Memory Hard Limit

Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages
charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default
processes not managed by Cloudera Manager will have no limit.

memory.limit_in_bytes

-1 MiB

rm_memory_hard_limit

true

Cgroup Memory Soft Limit

Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages
charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use
a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit.

Security

Comma-separated list of users banned from submitting MapReduce jobs to this TaskTracker. Only applies when the TaskTracker is
running in secure mode.

banned.users

mapred, hdfs, bin

taskcontroller_banned_users

false

Task Controller Group

The system group that owns the task-controller binary. This does not need to be changed unless the ownership of the binary is
explicitly changed.

mapreduce.tasktracker.group

mapred

taskcontroller_group

false

Minimum User ID for Job Submission

The lowest user ID (UID) that a user may have in order to submit a job to this TaskTracker. Only applies when the TaskTracker is
running in secure mode.

min.user.id

1000

taskcontroller_min_user_id

false
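Taken together, banned.users and min.user.id act as a gate on job submission in secure mode. The following sketch models that gate using the default values listed above; it is an illustration of the rule, not the task-controller's actual implementation, and the function name is hypothetical.

```python
def may_submit_job(user: str, uid: int,
                   banned_users: str = "mapred, hdfs, bin",
                   min_user_id: int = 1000) -> bool:
    """Return True if the user passes both secure-mode submission checks."""
    # banned.users is a comma-separated list; ignore surrounding whitespace.
    banned = {name.strip() for name in banned_users.split(",") if name.strip()}
    return user not in banned and uid >= min_user_id


if __name__ == "__main__":
    print(may_submit_job("alice", 1500))  # ordinary user above the UID floor
    print(may_submit_job("hdfs", 1500))   # banned user
    print(may_submit_job("bob", 500))     # UID below min.user.id
```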

Stacks Collection

Display Name

Description

Related Name

Default Value

API Name

Required

Stacks Collection Data Retention

The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.

stacks_collection_data_retention

100 MiB

stacks_collection_data_retention

false

Stacks Collection Directory

The directory in which stacks logs are placed. If not set, stacks are logged into a stacks
subdirectory of the role's log directory.

stacks_collection_directory

stacks_collection_directory

false

Stacks Collection Enabled

Whether or not periodic stacks collection is enabled.

stacks_collection_enabled

false

stacks_collection_enabled

true

Stacks Collection Frequency

The frequency with which stacks are collected.

stacks_collection_frequency

5.0 second(s)

stacks_collection_frequency

false

Stacks Collection Method

The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon
process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint
is periodically scraped.

stacks_collection_method

jstack

stacks_collection_method

false

Suppressions

Display Name

Description

Related Name

Default Value

API Name

Required

Suppress Configuration Validator: CDH Version Validator

Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.

false

role_config_suppression_cdh_version_validator

true

Suppress Parameter Validation: Hadoop Metrics Output Directory

Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics Output Directory
parameter.

false

role_config_suppression_hadoop_metrics_dir

true

Suppress Parameter Validation: Hadoop Metrics Ganglia Servers

Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics Ganglia Servers
parameter.

Whether to suppress the results of the Blacklisted Status health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_blacklisted

true

Suppress Health Test: JobTracker Connectivity

Whether to suppress the results of the JobTracker Connectivity health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_connectivity

true
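The effect of a suppression on overall health can be sketched as below. The description states only that suppressed results are ignored; the rule that overall health is the worst remaining result, the status names, and the test names are assumptions made for this illustration.

```python
from typing import Dict, Set

# Assumed ordering of health states, from best to worst.
SEVERITY = {"GOOD": 0, "CONCERNING": 1, "BAD": 2}


def overall_health(results: Dict[str, str], suppressed: Set[str]) -> str:
    """Return the worst status among health tests that are not suppressed."""
    considered = [status for test, status in results.items()
                  if test not in suppressed]
    # With nothing left to consider, report GOOD (an assumption).
    return max(considered, key=SEVERITY.__getitem__, default="GOOD")


if __name__ == "__main__":
    results = {"task_tracker_connectivity": "BAD",
               "task_tracker_gc_duration": "GOOD"}
    print(overall_health(results, suppressed=set()))
    print(overall_health(results, suppressed={"task_tracker_connectivity"}))
```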

Suppress Health Test: File Descriptors

Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_file_descriptor

true

Suppress Health Test: GC Duration

Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing
the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_gc_duration

true

Suppress Health Test: Heap Dump Directory Free Space

Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are
ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_heap_dump_directory_free_space

true

Suppress Health Test: Host Health

Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing
the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_host_health

true

Suppress Health Test: Log Directory Free Space

Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_log_directory_free_space

true

Suppress Health Test: Process Status

Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing
the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_scm_health

true

Suppress Health Test: Swap Memory Usage

Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_swap_memory_usage

true

Suppress Health Test: Unexpected Exits

Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_unexpected_exits

true

Suppress Health Test: Web Server Status

Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when
computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.

false

role_health_suppression_task_tracker_web_metric_collection

true

Suppress Health Test: TaskTracker Local Data Directories Free Space

Whether to suppress the results of the TaskTracker Local Data Directories Free Space health test. The results of suppressed health
tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.