Package org.apache.hadoop.metrics2 Description

Metrics 2.0

This package provides a framework for metrics instrumentation
and publication.

The framework makes metrics instrumentation easy to implement, either
via the simple MetricsSource interface or via the more concise,
declarative metrics annotations. Consumers of metrics only need to
implement the simple MetricsSink interface. Producers register metrics
sources with a metrics system, while consumers register sinks. A default
metrics system is provided to marshal metrics from sources to sinks based
on (per source/sink) configuration options. All metrics are also
published and queryable via the standard JMX MBean interface. This
document targets users of the framework. Framework developers can also
consult the design document for architecture and implementation notes.

Sub-packages

org.apache.hadoop.metrics2.annotation

Public annotation interfaces for simpler metrics instrumentation.

org.apache.hadoop.metrics2.impl

Implementation classes of the framework for the interfaces and
abstract classes defined in the top-level package. Sink plugin code
usually does not need to reference any class here.

The Metrics annotation is
used to indicate that the class is a metrics source.

MyContext

The optional context name typically identifies either the
application, or a group of modules within an application or
library.

MyStat

The class name is used as the metrics record name (by default; it can be
overridden with the name=value parameter of the Metrics annotation). A
record groups the set of metrics to be reported. For example, you could
have a record named "CacheStat" for reporting a number of statistics
relating to the usage of some cache in your application.

@Metric

The Metric annotation identifies a particular metric, which in this
case is the result of the method call getMyMetric. It is of the "gauge"
(default) type, which means it can vary in both directions, in contrast
to a "counter" type, which can only increase or stay the same. The name
of the metric is "MyMetric" (inferred from the getMyMetric method name by
default). The 42 here is the value of the metric, which can be replaced
with any valid Java expression.

Note that the MetricsSource interface is more verbose but more flexible,
allowing generated metric names and multiple records. In fact, the
annotation interface is implemented with the MetricsSource interface
internally.

This object corresponds to the record created in metrics sources,
e.g., the "MyStat" record in the previous example.

conf

The configuration object for the sink instance, with the prefix removed.
You can therefore read any sink-specific option using the usual
get* methods.

flush

This method is called once per update cycle, which may involve
more than one record. The sink should try to flush any buffered metrics
to its backend when it is called, but the implementation is not required
to be synchronous.

To make use of our MyStat source and MySink, they need to be hooked up
to a metrics system. In this case (and in most cases), the
DefaultMetricsSystem suffices.

DefaultMetricsSystem.initialize("test"); // called once per application
DefaultMetricsSystem.register(new MyStat());

Sinks are usually specified in a configuration file, say,
"hadoop-metrics2-test.properties", as:

test.sink.mysink0.class=com.example.hadoop.metrics.MySink

The configuration syntax is:

[prefix].[source|sink|jmx|].[instance].[option]

In the previous example, test is the prefix and mysink0 is an
instance name. DefaultMetricsSystem tries to load
hadoop-metrics2-[prefix].properties first and, if that file is not
found, falls back to the default hadoop-metrics2.properties on the
class path. Note that [instance] is an arbitrary name that uniquely
identifies a particular sink instance. The asterisk (*) can be
used to specify default options.
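
As a rough illustration of how asterisk defaults can be resolved, the
following stand-alone Java sketch looks up an option by trying the most
specific key first and then progressively more general wildcard keys. The
class ConfigLookupSketch, the lookup method, the myoption key and the
exact fallback order are assumptions made for this example; this is not
the framework's own configuration code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-alone illustration of "*" defaults; not part of Hadoop. */
class ConfigLookupSketch {

  /**
   * Returns the value of [prefix].[type].[instance].[option], falling back
   * to keys in which the instance and/or the prefix is replaced by "*".
   * The fallback order (most specific key first) is an assumption.
   */
  static String lookup(Map<String, String> props, String prefix,
                       String type, String instance, String option) {
    String[] candidates = {
      prefix + "." + type + "." + instance + "." + option,
      prefix + "." + type + ".*." + option,
      "*." + type + "." + instance + "." + option,
      "*." + type + ".*." + option
    };
    for (String key : candidates) {
      String value = props.get(key);
      if (value != null) {
        return value;
      }
    }
    return null; // option not configured anywhere
  }

  public static void main(String[] args) {
    Map<String, String> props = new LinkedHashMap<>();
    props.put("test.sink.mysink0.class", "com.example.hadoop.metrics.MySink");
    props.put("*.sink.*.myoption", "fallback"); // hypothetical default option
    System.out.println(lookup(props, "test", "sink", "mysink0", "class"));
    System.out.println(lookup(props, "test", "sink", "mysink0", "myoption"));
  }
}
```

Here the class option is found under the exact key, while myoption is
only found through the wildcard entry.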

Consult the metrics instrumentation in jvm, rpc, hdfs and mapred, etc.
for more examples.

One of the features of the default metrics system is configurable
metrics filtering by source, context, record/tags and metrics. The least
expensive way to filter out metrics is at the source level, e.g.,
filtering out the source named "MyMetrics". The most expensive way is
per-metric filtering.

In this example, we specify a source filter that includes source
foo and excludes bar. When only include patterns are specified, the
filter operates in whitelisting mode, where only matched sources are
included. Likewise, when only exclude patterns are specified, only
matched sources are excluded. When both kinds of patterns are present,
sources that match neither pattern are included as well. Note that
include patterns take precedence over exclude patterns.
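
The include/exclude rules described above can be modelled in a few lines
of plain Java. This is a minimal sketch, assuming regular expressions
rather than whichever pattern syntax the configured filter class uses;
the class SourceFilterSketch and its accepts method are hypothetical
names for this example and are not the framework's filter classes.

```java
import java.util.regex.Pattern;

/** Stand-alone model of the include/exclude rules; not part of Hadoop. */
class SourceFilterSketch {
  private final Pattern include; // null when no include pattern is set
  private final Pattern exclude; // null when no exclude pattern is set

  SourceFilterSketch(String includeRegex, String excludeRegex) {
    this.include = includeRegex == null ? null : Pattern.compile(includeRegex);
    this.exclude = excludeRegex == null ? null : Pattern.compile(excludeRegex);
  }

  boolean accepts(String name) {
    // Include patterns take precedence over exclude patterns.
    if (include != null && include.matcher(name).matches()) {
      return true;
    }
    if (exclude != null && exclude.matcher(name).matches()) {
      return false;
    }
    // Whitelisting mode: only include patterns are set and none matched.
    if (include != null && exclude == null) {
      return false;
    }
    // Matched by neither pattern, or no patterns at all.
    return true;
  }

  public static void main(String[] args) {
    SourceFilterSketch filter = new SourceFilterSketch("foo", "bar");
    for (String source : new String[] {"foo", "bar", "baz"}) {
      System.out.println(source + " accepted: " + filter.accepts(source));
    }
  }
}
```

With both patterns set, foo is accepted, bar is rejected, and an
unmatched name such as baz is accepted.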

Similarly, you can specify the record.filter and
metric.filter options, which operate at record and metric
level, respectively. Filters can be combined to optimize
the filtering efficiency.

This is usually an abstract class (or interface) to define an
instrumentation interface (incrCounter0 etc.) that allows different
implementations. This could be a mechanism to allow different metrics
systems to be used at runtime via configuration.
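
To make the idea concrete, here is a minimal stand-alone sketch of such an
instrumentation abstraction with two interchangeable implementations
selected by a configuration value. Everything in it is hypothetical: the
class MyInstrumentationSketch, the counter0 accessor, the create factory
and the "memory" setting are illustrative names, and neither
implementation is wired to a real metrics system.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Stand-alone sketch of a pluggable instrumentation API; not part of Hadoop. */
abstract class MyInstrumentationSketch {

  abstract void incrCounter0(long delta);

  abstract long counter0();

  /** Picks an implementation from a (hypothetical) configuration value. */
  static MyInstrumentationSketch create(String impl) {
    return "memory".equals(impl) ? new InMemory() : new NoOp();
  }

  /** Discards all updates, e.g. when metrics are disabled. */
  static final class NoOp extends MyInstrumentationSketch {
    @Override
    void incrCounter0(long delta) {
      // intentionally empty
    }

    @Override
    long counter0() {
      return 0L;
    }
  }

  /** Keeps the counter in memory; a real one would feed a metrics system. */
  static final class InMemory extends MyInstrumentationSketch {
    private final AtomicLong value = new AtomicLong();

    @Override
    void incrCounter0(long delta) {
      value.addAndGet(delta);
    }

    @Override
    long counter0() {
      return value.get();
    }
  }
}
```

Application code only calls incrCounter0 on the abstract type, so the
backing implementation can be swapped without touching the call sites.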

Mutable[Gauge*|Counter*|Rate]

These are library classes to manage mutable metrics for
implementations of metrics sources. They produce immutable gauges and
counters (Metric[Gauge*|Counter*]) for downstream consumption (sinks)
upon snapshot. MutableRate, in particular, provides a way to measure
the latency and throughput of an operation. In this particular case, it
produces a long counter "Rate0NumOps" and a double gauge "Rate0AvgTime"
when snapshotted.
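
The relationship between the two values produced by a rate metric can be
sketched without the library. The following is a simplified stand-alone
model, not the MutableRate class itself: the class RateSketch and its
add and snapshot methods are hypothetical, and the choice to report a
cumulative operation count together with an average time that is reset at
each snapshot is an assumption made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified stand-alone model of a rate metric; not part of Hadoop. */
class RateSketch {
  private final String name;
  private long totalOps;     // cumulative count, reported as a counter
  private long intervalOps;  // operations since the last snapshot
  private long intervalTime; // summed elapsed time since the last snapshot

  RateSketch(String name) {
    this.name = name;
  }

  /** Records one completed operation and its elapsed time. */
  synchronized void add(long elapsed) {
    totalOps++;
    intervalOps++;
    intervalTime += elapsed;
  }

  /** Emits [name]NumOps and [name]AvgTime, then starts a new interval. */
  synchronized Map<String, Number> snapshot() {
    Map<String, Number> out = new LinkedHashMap<>();
    out.put(name + "NumOps", totalOps);
    out.put(name + "AvgTime",
        intervalOps == 0 ? 0.0 : (double) intervalTime / intervalOps);
    intervalOps = 0;
    intervalTime = 0;
    return out;
  }

  public static void main(String[] args) {
    RateSketch rate = new RateSketch("Rate0");
    rate.add(10);
    rate.add(30);
    System.out.println(rate.snapshot());
  }
}
```

The operation count gives throughput when compared across snapshots,
while the average time gives the latency observed in the latest interval.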

Users of the previous metrics system will notice the lack of a
context prefix in the configuration examples. The new metrics system
decouples the concept of context (for grouping) from the
implementation. In the previous system, a particular context object did
the updating and publishing of metrics, which caused problems when a
single context had to be consumed by multiple backends. You also had to
configure an implementation instance per context, even if the backend
could handle multiple contexts (file, ganglia, etc.).

The context option in the sink configuration can be used to send metrics
of a particular context to a particular backend. Note that myprefix is an
arbitrary prefix for configuration groupings, typically the name of a
particular process (namenode, jobtracker, etc.).