Description

Currently NameNode WebUI server starts only after the fsimage is loaded, edits are applied and checkpoint is complete. Any status related to namenode startin up is available only in the logs. I propose starting the webserver before loading namespace and providing namenode startup information.

Suresh Srinivas
added a comment - 30/Nov/12 18:00 Detailed changes proposed:
Start the web server on the namenode before loading namespace.
As namenode loads the namespace, print the following information (currently present in logs):
Current startup state - loading fsimage/editlog/checkpointing/checkpointing complete/waiting for registration of datanodes/safemode/startup complete.
A message when fsimage is loaded, the size of the fsimage loaded and time taken
A message When editlog is loaded, the size of the edits and time taken
A message when checkpointing is complete and how long it took.
The information that is displayed on the webUI will also be made available over JMX/http+JMX for monitoring and tooling purposes.

Do you think it also makes sense to expose journal state here too? For example, where the edits are being read/recovered from. If an external editlog service (eg QJM or BKJM) then perhaps also it can be interrogated for service specific information (hostnames, instance health status) which can also be printed as part of NN startup status? Might help reveal issues or reassure because all the relevant state would be side by side at a glance.

Andrew Purtell
added a comment - 01/Dec/12 03:51 +1
Do you think it also makes sense to expose journal state here too? For example, where the edits are being read/recovered from. If an external editlog service (eg QJM or BKJM) then perhaps also it can be interrogated for service specific information (hostnames, instance health status) which can also be printed as part of NN startup status? Might help reveal issues or reassure because all the relevant state would be side by side at a glance.

Suresh Srinivas
added a comment - 02/Dec/12 02:47 Andrew, sounds like a good idea. We should add that information as well. I will start with basic information in this jira. Will follow up this jira with jiras for adding others such as QJM/BKJM etc.

Seems like a good idea to me. It would also be interesting to see which edit log files or QJM sources were actually selected for use. Similarly, seeing which FSImage was loaded could be interesting in some cases.

Colin P. McCabe
added a comment - 03/Dec/12 22:06 Seems like a good idea to me. It would also be interesting to see which edit log files or QJM sources were actually selected for use. Similarly, seeing which FSImage was loaded could be interesting in some cases.

I expect to start posting patches for this feature on the sub-tasks later this week after additional testing. I am attaching several screenshots.

HDFS-4249-1.png: This shows a new section on dfshealth.jsp called Startup Progress. It displays overall elapsed time and percent complete. Below that, the NameNode startup sequence is divided into phases: loading fsimage, loading edits, saving a checkpoint, and safe mode. Phases are sub-divided into steps, which show more granular operations within each phase. We display counters, percent complete, and elapsed time per step, which is also aggregated at the phase level. Phases in progress display in italics. Phases not yet started display in gray text. Note that some information typically displayed on dfshealth.jsp is missing: RPC server address, cluster ID and block pool ID. This is because we are starting the HTTP server before initializing FSNamesystem and the RPC server, so that information isn't available yet.

HDFS-4249-2.png: Here we see that the saving checkpoint phase has begun. The interesting thing about this is that the phase is multi-threaded, one thread per dfs.namenode.name.dir, so we see steps related to 3 different paths simultaneously, with progress tracked separately for each one. This can help identify if namenode startup is blocked waiting on a particularly slow disk while saving the checkpoint.

HDFS-4249-3.png: Once the namenode reaches safe mode and the RPC server is available, we see more of the traditional output of dfshealth.jsp. At this point, we move the Startup Progress section to the bottom of the page. This keeps the focus on Cluster Summary, which is probably more useful than Startup Progress during normal operation.

HDFS-4249-4.png: This shows the same information exposed as JSON by making an HTTP call to a new relative URI: /startupProgress. This supports clients such as Ambari that may want to display the data in a different UI.

HDFS-4249-5.png: This is a jconsole screenshot showing that progress information at the phase level is available via JMX too in a new MBean named StartupProgress.

Chris Nauroth
added a comment - 26/Feb/13 19:19 I expect to start posting patches for this feature on the sub-tasks later this week after additional testing. I am attaching several screenshots.
HDFS-4249 -1.png: This shows a new section on dfshealth.jsp called Startup Progress. It displays overall elapsed time and percent complete. Below that, the NameNode startup sequence is divided into phases: loading fsimage, loading edits, saving a checkpoint, and safe mode. Phases are sub-divided into steps, which show more granular operations within each phase. We display counters, percent complete, and elapsed time per step, which is also aggregated at the phase level. Phases in progress display in italics. Phases not yet started display in gray text. Note that some information typically displayed on dfshealth.jsp is missing: RPC server address, cluster ID and block pool ID. This is because we are starting the HTTP server before initializing FSNamesystem and the RPC server, so that information isn't available yet.
HDFS-4249 -2.png: Here we see that the saving checkpoint phase has begun. The interesting thing about this is that the phase is multi-threaded, one thread per dfs.namenode.name.dir, so we see steps related to 3 different paths simultaneously, with progress tracked separately for each one. This can help identify if namenode startup is blocked waiting on a particularly slow disk while saving the checkpoint.
HDFS-4249 -3.png: Once the namenode reaches safe mode and the RPC server is available, we see more of the traditional output of dfshealth.jsp. At this point, we move the Startup Progress section to the bottom of the page. This keeps the focus on Cluster Summary, which is probably more useful than Startup Progress during normal operation.
HDFS-4249 -4.png: This shows the same information exposed as JSON by making an HTTP call to a new relative URI: /startupProgress. This supports clients such as Ambari that may want to display the data in a different UI.
HDFS-4249 -5.png: This is a jconsole screenshot showing that progress information at the phase level is available via JMX too in a new MBean named StartupProgress.

I'm about to post patches on all 3 sub-tasks: HDFS-4372, HDFS-4373, and HDFS-4374. The patches are separated to try to simplify code review, though it probably makes sense for code reviewers to give all 3 at least a quick scan. HDFS-4372 contains the majority of core NameNode changes. HDFS-4373 and HDFS-4374 contain code for different consumers of the startup progress information. I've made sure that everyone who is a watcher on HDFS-4249 is also a watcher on all sub-tasks, so that everyone receives all of the related updates.

Chris Nauroth
added a comment - 27/Mar/13 21:25 I'm about to post patches on all 3 sub-tasks: HDFS-4372 , HDFS-4373 , and HDFS-4374 . The patches are separated to try to simplify code review, though it probably makes sense for code reviewers to give all 3 at least a quick scan. HDFS-4372 contains the majority of core NameNode changes. HDFS-4373 and HDFS-4374 contain code for different consumers of the startup progress information. I've made sure that everyone who is a watcher on HDFS-4249 is also a watcher on all sub-tasks, so that everyone receives all of the related updates.