Support a hierarchy of queues in the Map/Reduce framework

Description

In MAPREDUCE-824, we proposed introducing a hierarchy of queues in the capacity scheduler. Currently, the M/R framework provides the notion of job queues and handles some functionality related to queues in a scheduler-agnostic manner. This functionality includes:

Displaying the list of jobs in a queue in the jobtracker web UI and the job client CLI

Providing APIs in JobClient for listing queues and retrieving queue information

Since it would be beneficial to extend this functionality to hierarchical queues, this JIRA proposes introducing the concept into the map/reduce framework as well. We could treat this as an umbrella JIRA and file additional tasks for each of the changes involved, sticking to the high-level approach described here.

Hemanth Yamijala added a comment - 13/Aug/09 05:49
The motivation for hierarchical queues is discussed in the proposal on MAPREDUCE-824, for those interested.
Briefly, hierarchical queues allow administrators to have greater control over how capacity (or conceivably other policies) associated with a queue can be used. They also allow delegation of control. Large clusters could have root-level queues set up for organizations, and then operators from the organizations could be given access to manage queues under the root-level queues (which we are calling sub-queues in MAPREDUCE-824).

Hemanth Yamijala added a comment - 13/Aug/09 12:13
The basic proposal is to define the concept of sub-queues. Sub-queues are queues that are contained in other queues. Sub-queues can be nested. The last level of queues in the hierarchy, the leaf-level queues, are called job queues, as they contain jobs. By that definition, all the queues that are defined in the present system are job queues.
An example organization could be:
grid {
  org1 {
    priority {
      production
      proj1
      proj2
      proj3
    }
    miscellaneous
  }
  org2
}
In this example, grid, org1, and priority are container queues, while production, proj1, proj2, proj3, miscellaneous, and org2 are job queues. An example of how policies such as capacity can be assigned to this hierarchy, and of the benefits this brings, is described in MAPREDUCE-824.
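
To make the container queue / job queue distinction concrete, here is a minimal Java sketch of the example hierarchy. It is only an illustration under stated assumptions: QueueNode, QueueHierarchyExample, and their methods are hypothetical names invented for this sketch, not existing or proposed Hadoop classes, and the actual representation is left to the sub-tasks of this JIRA.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of one queue in the hierarchy; not a Hadoop class.
final class QueueNode {
    private final String name;
    private final List<QueueNode> children = new ArrayList<>();

    QueueNode(String name, QueueNode... subQueues) {
        this.name = name;
        Collections.addAll(children, subQueues);
    }

    String getName() {
        return name;
    }

    // A job queue is a leaf: it contains jobs and has no sub-queues.
    boolean isJobQueue() {
        return children.isEmpty();
    }

    // Names of all leaf (job) queues at or below this queue, depth first.
    List<String> jobQueueNames() {
        List<String> names = new ArrayList<>();
        collect(this, true, names);
        return names;
    }

    // Names of all non-leaf (container) queues at or below this queue, depth first.
    List<String> containerQueueNames() {
        List<String> names = new ArrayList<>();
        collect(this, false, names);
        return names;
    }

    private static void collect(QueueNode node, boolean wantJobQueues, List<String> out) {
        if (node.isJobQueue() == wantJobQueues) {
            out.add(node.name);
        }
        for (QueueNode child : node.children) {
            collect(child, wantJobQueues, out);
        }
    }
}

public class QueueHierarchyExample {
    // Builds the grid/org1/priority example from the comment above.
    static QueueNode exampleHierarchy() {
        return new QueueNode("grid",
            new QueueNode("org1",
                new QueueNode("priority",
                    new QueueNode("production"),
                    new QueueNode("proj1"),
                    new QueueNode("proj2"),
                    new QueueNode("proj3")),
                new QueueNode("miscellaneous")),
            new QueueNode("org2"));
    }

    public static void main(String[] args) {
        QueueNode root = exampleHierarchy();
        // prints: Container queues: [grid, org1, priority]
        System.out.println("Container queues: " + root.containerQueueNames());
        // prints: Job queues: [production, proj1, proj2, proj3, miscellaneous, org2]
        System.out.println("Job queues: " + root.jobQueueNames());
    }
}
```

Under this model, a present-day single-level configuration is simply a set of nodes with no children, so every existing queue is already a job queue.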

Hemanth Yamijala added a comment - 13/Aug/09 12:34
In summary, the following three areas would be impacted by this change:

APIs in JobClient (public), JobSubmissionProtocol and QueueManager

Configuration in mapred-queues.xml and mapred-site.xml

UI: the hadoop queue family of commands in the CLI, and the scheduling information shown in the Jobtracker's web UI and queue details pages

A goal is to keep this feature backward compatible, so that sites not interested in hierarchical queues, and using the current single level of queues, can continue to function as before.

To keep the details manageable and avoid obscuring the high-level picture, I will add tasks under this JIRA for each of these areas.