Re: YARN is completely down. Cluster is in trouble

Start by looking for FATAL/ERROR logs in each of those roles (ResourceManager first perhaps). You should see a potential cause in the logs right before the crash time. The logs are typically under /var/log/hadoop-yarn/ if you are using CDH.
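If grepping by hand gets tedious, here is a rough Python sketch of that log scan. It assumes the CDH default directory mentioned above and log4j-style lines that contain the level as a plain word; the function name `find_severe_lines` and the `*.log*` filename pattern are my own choices for illustration, not anything YARN provides.

```python
import glob
import os
import re

# Matches lines whose text contains the level FATAL or ERROR as a whole word.
LEVEL_RE = re.compile(r"\b(FATAL|ERROR)\b")


def find_severe_lines(log_dir="/var/log/hadoop-yarn", pattern="*.log*"):
    """Return (path, line_number, text) for every FATAL/ERROR line found.

    log_dir defaults to the CDH location; pattern is an assumed filename glob.
    """
    hits = []
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        # Skip directories and compressed rotations, which are not plain text.
        if not os.path.isfile(path) or path.endswith(".gz"):
            continue
        try:
            with open(path, errors="replace") as fh:
                for number, line in enumerate(fh, 1):
                    if LEVEL_RE.search(line):
                        hits.append((path, number, line.rstrip("\n")))
        except OSError:
            # Unreadable file (permissions, rotation race): move on.
            continue
    return hits


if __name__ == "__main__":
    for path, number, text in find_severe_lines():
        print(f"{path}:{number}: {text}")
```

Run it on the host of the role that crashed, and read a few lines above each hit as well, since the cause is usually logged right before the failure.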

If you have trouble interpreting the logs once you've located them, please share them here (via pastebin or such if they are large).

Then I checked the HDFS root directory and the YARN configuration to find out which system users the JobHistory Server runs as. I found that there was no /tmp directory and no mapred user. Based on the log data, I created the mapred user and the /tmp and /tmp/log directories.

Then I gave the mapred user permissions on the /tmp directory.
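For anyone hitting the same thing, the HDFS side of those steps could look roughly like the sketch below. It only builds and prints the commands, it does not run them. The assumptions are mine: that `hdfs` is the HDFS superuser on the cluster, that "giving permission" means making mapred the owner of /tmp, and the helper name `hdfs_tmp_commands`. The post does not state the exact mode bits, so none are set here. Creating the mapred OS user is a separate step and is not covered.

```python
import shlex


def hdfs_tmp_commands(owner="mapred", superuser="hdfs"):
    """Build the `hdfs dfs` commands that recreate /tmp and /tmp/log.

    owner and superuser are assumptions; adjust them to your cluster.
    """
    prefix = ["sudo", "-u", superuser, "hdfs", "dfs"]
    return [
        # -p also creates the parent /tmp if it is missing.
        prefix + ["-mkdir", "-p", "/tmp/log"],
        # Hand /tmp and everything under it to the owner.
        prefix + ["-chown", "-R", owner, "/tmp"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before use.
    for command in hdfs_tmp_commands():
        print(shlex.join(command))
```

Review the printed commands first and run them by hand once they match what your own logs are asking for.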

The restart command successfully restarted my JobHistory Server.

All of this happened because restoring the root directory snapshot had failed.