High-Throughput Resource Scheduling with IBM Platform Session Scheduler

In environments that run large numbers of short-duration jobs, the workload scheduler can spend much of its time making repetitive scheduling decisions for identical workloads, which delays job turnaround. IBM Platform Session Scheduler provides high-throughput, low-latency scheduling for faster, more predictable job turnaround times.

The Challenge

Tired of waiting for jobs?

For technical and high performance computing (HPC) users, the productivity gains of scalable workloads often require balancing scheduling complexity against job throughput. As job duration decreases, the overhead of applying the desired scheduling policies can exceed the duration of the job itself. When workloads consist primarily of short jobs, event-driven scheduling is a potential solution. However, it comes at the expense of rich scheduling and management policies such as those in IBM Platform LSF.

The Solution

Scheduling speed and volume

The IBM® Platform™ Session Scheduler gives you the best of both worlds: speed and volume. It combines the benefits of event-driven scheduling with the rich scheduling policies of IBM Platform LSF® to provide high-throughput, low-latency scheduling for a wide range of workloads. Platform Session Scheduler is particularly well suited to environments that run high volumes of short-duration jobs and where users require faster and more predictable job turnaround times.

Traditional batch schedulers make a resource allocation decision for every job submission. With Platform Session Scheduler, you specify the resource allocation once for multiple jobs in a session, which effectively provides you with your own virtual private cluster. With this more efficient scheduling model, you benefit from higher job throughput and faster response times.
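The difference between deciding once per job and deciding once per session can be illustrated with a simple cost model. The sketch below is not a measurement of Platform Session Scheduler; the function names and all timing figures are hypothetical, chosen only to show how per-job scheduling overhead adds up when jobs are short.

```python
def per_job_scheduling_time(num_jobs, job_seconds, decision_seconds):
    """Serial time on one slot when every job waits for its own scheduling decision."""
    return num_jobs * (decision_seconds + job_seconds)


def session_scheduling_time(num_jobs, job_seconds, decision_seconds, dispatch_seconds):
    """One allocation decision for the whole session, then a small per-task dispatch cost."""
    return decision_seconds + num_jobs * (dispatch_seconds + job_seconds)


# Hypothetical figures, for illustration only.
NUM_JOBS = 1000
JOB_SECONDS = 2.0        # run time of each short job
DECISION_SECONDS = 5.0   # assumed cost of one full scheduling decision
DISPATCH_SECONDS = 0.05  # assumed cost of handing a task to a pre-allocated slot

per_job = per_job_scheduling_time(NUM_JOBS, JOB_SECONDS, DECISION_SECONDS)
session = session_scheduling_time(NUM_JOBS, JOB_SECONDS, DECISION_SECONDS, DISPATCH_SECONDS)

print(f"per-job decisions:    {per_job:.0f} s")
print(f"one session decision: {session:.0f} s")
```

Under these assumed figures the scheduling decision takes longer than the job itself, so paying for it once per session rather than once per job accounts for most of the difference.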

The Benefits

Efficient use of resources

In traditional batch scheduling environments, large numbers of short-running jobs can leave clusters poorly utilized. With Platform Session Scheduler, you can dispatch tasks immediately without waiting for the main scheduler to make a decision, which leads to dramatic gains in efficiency. Because cluster resources are used more efficiently, a fixed number of cluster nodes can process a higher volume of jobs or support a larger user community.

Reduce operational and infrastructure costs by providing optimal SLA management and greater flexibility, visibility and control of job scheduling.

Improve productivity and resource sharing by fully utilizing hardware and application resources, whether they are just down the hall or halfway around the globe.

The Specifics

Higher throughput, improved utilization

IBM Platform Session Scheduler gives users the ability to request resources once and interact with what is in effect their own personal scheduler. Because resources are pre-allocated, users can submit large volumes of tasks immediately with essentially no scheduling delays. In addition, because session scheduling is performed asynchronously, without waiting for the main scheduler’s next scheduling interval, resources can be kept fully utilized, resulting in faster job completion times.
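The effect of waiting for a scheduling interval can be sketched with a toy model of a single slot. This is an illustrative simplification, not a description of how Platform LSF or Platform Session Scheduler is implemented: the function names, the ten-second interval and the three-second task length are all assumptions.

```python
import math


def interval_dispatch_makespan(num_tasks, task_seconds, interval_seconds):
    """Finish time when a free slot receives new work only at the next scheduling tick."""
    clock = 0.0
    for _ in range(num_tasks):
        # The slot sits idle until the next multiple of the scheduling interval.
        start = math.ceil(clock / interval_seconds) * interval_seconds
        clock = start + task_seconds
    return clock


def immediate_dispatch_makespan(num_tasks, task_seconds):
    """Finish time when the next task starts as soon as the slot is free."""
    return num_tasks * task_seconds


# Hypothetical figures, for illustration only.
NUM_TASKS = 100
TASK_SECONDS = 3.0
INTERVAL_SECONDS = 10.0

waited = interval_dispatch_makespan(NUM_TASKS, TASK_SECONDS, INTERVAL_SECONDS)
immediate = immediate_dispatch_makespan(NUM_TASKS, TASK_SECONDS)

print(f"interval-driven dispatch: {waited:.0f} s, slot busy {immediate / waited:.0%} of the time")
print(f"immediate dispatch:       {immediate:.0f} s, slot busy 100% of the time")
```

In this model the slot is idle whenever a task finishes between ticks, so the shorter the task relative to the interval, the lower the utilization.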

IBM Platform Session Scheduler provides the performance benefits of low-latency operation without the need to re-code applications or adapt to client-side or server-side APIs. Because it preserves existing Platform LSF semantics and supports the same syntax as job arrays, parametric problems in which inputs are selected by job array index are readily adaptable to Platform Session Scheduler. With minimal changes to scripts, users enjoy faster throughput with essentially no scheduling delays once a session is started.
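A parametric script of the kind described above might select its input from an array index as in the following minimal sketch. The sweep values and function names are hypothetical. LSB_JOBINDEX is the environment variable that Platform LSF sets for job array elements; that a session task receives its index through the same variable is an assumption here and should be confirmed in the product documentation.

```python
import os


def parameters_for_index(index, sweep_values):
    """Map a 1-based array index to one input value of a parametric sweep."""
    if not 1 <= index <= len(sweep_values):
        raise ValueError(f"index {index} is outside 1..{len(sweep_values)}")
    return sweep_values[index - 1]


def current_index(environ=os.environ, variable="LSB_JOBINDEX"):
    """Read the array index from the environment; fall back to 1 when run by hand."""
    return int(environ.get(variable, "1"))


# Hypothetical sweep: one input value per array element.
SWEEP = [0.1, 0.2, 0.5, 1.0]

index = current_index()
if 1 <= index <= len(SWEEP):
    print(f"element {index} uses input value {parameters_for_index(index, SWEEP)}")
else:
    print(f"element {index} has no entry in the sweep")
```

Because the script depends only on the index, the same file can serve every element of the array without modification.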