Enabling and Configuring Resource Pools

Starting with HPC Pack 2008 R2 with Service Pack 2 (SP2), you can configure the HPC Job Scheduler Service to allocate resources based on Resource Pools. Resource Pools help you define what proportion of your cluster cores must be guaranteed for specific user groups (or job types). If a user group is not using all of their guaranteed cores, those cores can be used by other groups. You must use job templates to associate a user group with a Resource Pool. Jobs that use the job template will collectively be guaranteed the proportion of cluster cores that are defined for the Resource Pool, and will be scheduled within the pool according to job priority, submit time, and scheduling mode (Queued or Balanced). Resource Pool scheduling works best on clusters with homogeneous resources.

Sample scenario: Various user groups in your organization have contributed to the cluster budget, and in return each group expects to have a fixed portion of the cluster at its disposal. If at any given time a group has a light workload and does not use its entire share of the cluster, you want those resources temporarily made available to other groups. To guarantee availability and maximize cluster utilization, you want the HPC Job Scheduler Service to allocate resources based on Resource Pools.

Weight

An integer between 0 and 999,999 that represents the proportion of cluster cores that should be guaranteed to the pool.

Guaranteed cores

Set by the HPC Job Scheduler Service.

The number of cores that correspond to the weight defined for the pool. The number of guaranteed cores will vary according to how many nodes are Online and reachable at any given time. The number of guaranteed cores is calculated as (poolWeight/totalWeights)*NumberOfCoresOnline.

Allocated cores

Set by the HPC Job Scheduler Service.

The number of cores that are actually being used by jobs that are submitted to the pool. This number can be higher or lower than the number of guaranteed cores.

Important considerations

A pool with a weight of 0 has no guaranteed cores, but can have allocated cores if there are jobs that are submitted to the pool, and the other pools are not using all of their resources.

The Default Pool cannot be deleted. When Resource Pools are enabled in the HPC Job Scheduler Service, any job that does not specify a pool uses the Default Pool. Unlike specifying a custom pool, specifying the Default Pool does not provide any guarantee of resources. You can set the weight of the Default Pool to 0.

When the HPC Job Scheduler Service calculates the number of cores for each Resource Pool (according to pool weight), the resulting value for each pool is rounded down to the nearest whole number. The remainder cores are added to the Default Pool.
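
The calculation described above can be sketched in Python. This is an illustrative sketch only, not the scheduler's implementation: the function name guaranteed_cores, the pool names, and the weights are hypothetical. The behavior when every pool has a weight of 0 is not described in this topic, so the sketch simply rejects that case.

```python
def guaranteed_cores(pool_weights, cores_online, default_pool="Default"):
    """Estimate guaranteed cores per pool from pool weights.

    Applies (poolWeight / totalWeights) * NumberOfCoresOnline, rounds each
    result down to a whole number, and adds the leftover cores to the
    Default Pool. All names here are hypothetical, for illustration only.
    """
    total_weight = sum(pool_weights.values())
    if total_weight == 0:
        # Assumption: the all-zero case is not covered by the text above.
        raise ValueError("at least one pool must have a non-zero weight")

    # Integer floor division rounds each pool's share down.
    shares = {
        name: weight * cores_online // total_weight
        for name, weight in pool_weights.items()
    }

    # Cores lost to rounding are added to the Default Pool.
    remainder = cores_online - sum(shares.values())
    shares[default_pool] = shares.get(default_pool, 0) + remainder
    return shares


# Hypothetical cluster: 37 cores online, weights 60 and 40, Default Pool at 0.
weights = {"Default": 0, "GroupA": 60, "GroupB": 40}
print(guaranteed_cores(weights, 37))
# {'Default': 1, 'GroupA': 22, 'GroupB': 14}
```

In this example the shares for GroupA and GroupB round down to 22 and 14 cores, so the one remaining core goes to the Default Pool even though its weight is 0. Because the number of cores online changes as nodes come and go, the same weights can produce different guaranteed core counts over time.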

Node groups and a list of requested nodes provide alternative ways to allocate cluster resources to a job, and neither is intended to be used together with Resource Pools. If you add both specific node groups (or a list of requested nodes) and Resource Pools to a job template, the HPC Job Scheduler Service will restrict access to cluster resources based on both properties independently.

To configure Resource Pools, you must define one or more pools, and then associate the pools with job templates. As an example, let’s say you have two user groups, and each group expects to be able to use the following proportions of the cluster at any given time: Group A 60%, and Group B 40%. Let’s also say that Group A has two distinct types of jobs for which they want separate job templates: one type is high priority, and the other type is low priority. To enforce the desired scheduling policies, you create three job templates: “GroupA_HighPriJobs”, “GroupA_LowPriJobs”, and “GroupB_AllJobs”.

Important

After you define Resource Pools and associate them with Job Templates, you must enable Resource Pool scheduling in the Job Scheduler settings. See Enable resource pools in this topic.

You must enable Resource Pool scheduling in the Job Scheduler configuration settings. You can do this through HPC Cluster Manager, or by using command utilities. Use one of the following methods to enable Resource Pool scheduling: