Description

Currently, the Namenode and Datanode web pages need to be scraped by scripts to get this information. An interface that provides this information in structured form will make it easier to build scripts around it.

Description of the patch:
Expose the following information about name node and data node through JMX to the end users.
each datanode:
hostname (string)
rpcport (string)
httpport (string)
version (string): hadoop version
each volume:
name: name of the volume
used: used dfs space
free: free dfs space
reserved: reserved space

We group the information for each volume into a map object and serialize it to a JSON string before sending it to the client. This is an example of the volume information: {"/tmp/user/dfs/data/current/finalized":{"freeSpace":86118031360,"usedSpace":28672,"reservedSpace":0}}
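The grouping described above can be sketched as a nested map that is written out as JSON. This is only an illustration and not the code in the patch: the class name `VolumeInfoJson` and its helper methods are hypothetical, and the serializer is hand-rolled so the example has no library dependency. The sample values are the ones from the example string above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: shows one way to turn
// per-volume statistics into the JSON shape quoted in the description.
class VolumeInfoJson {

    // Serializes {volumeName -> {fieldName -> value}} into a compact JSON string.
    static String toJson(Map<String, Map<String, Long>> volumes) {
        StringBuilder sb = new StringBuilder("{");
        boolean firstVolume = true;
        for (Map.Entry<String, Map<String, Long>> volume : volumes.entrySet()) {
            if (!firstVolume) {
                sb.append(',');
            }
            firstVolume = false;
            sb.append(quote(volume.getKey())).append(":{");
            boolean firstField = true;
            for (Map.Entry<String, Long> field : volume.getValue().entrySet()) {
                if (!firstField) {
                    sb.append(',');
                }
                firstField = false;
                sb.append(quote(field.getKey())).append(':').append(field.getValue());
            }
            sb.append('}');
        }
        return sb.append('}').toString();
    }

    // Wraps a string in double quotes, escaping backslashes and quotes.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds the single-volume sample from the description.
    // LinkedHashMap keeps insertion order, so the field order is stable.
    static Map<String, Map<String, Long>> sampleVolumes() {
        Map<String, Long> fields = new LinkedHashMap<>();
        fields.put("freeSpace", 86118031360L);
        fields.put("usedSpace", 28672L);
        fields.put("reservedSpace", 0L);
        Map<String, Map<String, Long>> volumes = new LinkedHashMap<>();
        volumes.put("/tmp/user/dfs/data/current/finalized", fields);
        return volumes;
    }

    public static void main(String[] args) {
        System.out.println(toJson(sampleVolumes()));
    }
}
```

Returning a single JSON string keeps the JMX attribute type simple (a plain string), at the cost of requiring clients to parse it.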

each namenode:
hostname (string)
version (string)
used (long): used dfs space
free (long): free dfs space
total (long): total dfs space
safemode (string): safemode status
isfinalize (boolean): whether the upgrade is finalized
nondfsusedspace (long): total space used by data nodes for non-DFS purposes, such as storing temporary files on the local file system
percentused (float): the total space used by data nodes as a percentage of total capacity
percentremaining (float): percentage of remaining space on the cluster
totalblocks (long): the total number of blocks on the cluster
totalfiles (long): total number of files on the cluster
threads (int): the number of threads

each alive name node:
name: host name of the alive name node
last contact: time since last contact in seconds
used: space used
each dead name node:
name: host name of the dead name node
last contact: time since last contact in seconds
each decommissioning node:
name: host name of the decommissioning node
under replicated blocks
decommission only replicas
under replicate in open files

We group the information for each alive/dead/decommissioning node into a map object and serialize it to a JSON string before sending it to the client. This is an example of alive node information: {"somehost.com":{"usedSpace":28672,"lastContact":1}}

Two public interfaces are defined and implemented: NameNodeMXBean.java and DataNodeMXBean.java. The exposed name node and data node information can be used for monitoring purposes.
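As a rough illustration of how an MXBean interface makes such attributes readable over JMX, here is a minimal sketch using only the standard `javax.management` API. It is not the patch itself: the interface `DataNodeInfoMXBean`, the class `DataNodeInfo`, the object name `example.hdfs:type=DataNodeInfo`, and all returned values are assumptions made for this example and do not reflect the actual signatures of DataNodeMXBean.java.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class DataNodeJmxSketch {

    // Hypothetical, trimmed-down stand-in for an MXBean interface.
    // Each getter becomes a JMX attribute (getHostName -> "HostName").
    public interface DataNodeInfoMXBean {
        String getHostName();
        String getRpcPort();
        String getVersion();
        String getVolumeInfo();
    }

    // Hypothetical implementation returning fixed placeholder values.
    public static class DataNodeInfo implements DataNodeInfoMXBean {
        public String getHostName() { return "somehost.com"; }
        public String getRpcPort() { return "50020"; }          // placeholder port
        public String getVersion() { return "0.0.0-example"; }  // placeholder version
        public String getVolumeInfo() {
            // JSON string in the shape shown in the description.
            return "{\"/tmp/user/dfs/data/current/finalized\":"
                + "{\"freeSpace\":86118031360,\"usedSpace\":28672,\"reservedSpace\":0}}";
        }
    }

    // Registers the bean on the platform MBean server (once) and reads
    // one attribute back through JMX, as a monitoring client would.
    static Object readAttribute(String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // The object name is an assumption made for this sketch.
            ObjectName name = new ObjectName("example.hdfs:type=DataNodeInfo");
            if (!server.isRegistered(name)) {
                server.registerMBean(new DataNodeInfo(), name);
            }
            return server.getAttribute(name, attribute);
        } catch (JMException e) {
            throw new IllegalStateException("JMX access failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAttribute("HostName"));
        System.out.println(readAttribute("VolumeInfo"));
    }
}
```

Once a bean is registered this way, any JMX client (for example jconsole) attached to the same JVM can read the same attributes without scraping a web page.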

Tanping Wang
added a comment - 04/Aug/10 22:35

Eli, the goal of HDFS-453 is to provide REST APIs for the namenode web UI. Adding a REST interface is not the goal of this jira. Now that this jira exposes namenode and datanode web UI information, the JMX interface added here could be used to provide a REST or servlet-based interface more cleanly, without the need to access Namenode internal classes directly.

Suresh Srinivas
added a comment - 11/Aug/10 18:10
If a REST interface is no longer the goal, HDFS-453 can be closed.