Chapter 8 Fair Share Scheduler (Overview)

The analysis of workload data can indicate that a particular workload
or group of workloads is monopolizing CPU resources. If these workloads are
not violating resource constraints on CPU usage, you can modify the allocation
policy for CPU time on the system. The fair share scheduling class described
in this chapter enables you to allocate CPU time based on shares instead of
the priority scheme of the timesharing (TS) scheduling class.

Introduction to the Scheduler

A fundamental job of the operating system is to arbitrate which processes
get access to the system's resources. The process scheduler, which is also
called the dispatcher, is the portion of the kernel that controls allocation
of the CPU to processes. The scheduler supports the concept of scheduling
classes. Each class defines a scheduling policy that is used to schedule processes
within the class. The default scheduler in the Solaris Operating System, the
TS scheduler, tries to give every process relatively equal access to the available
CPUs. However, you might want to specify that certain processes be given more
resources than others.

You can use the fair share scheduler (FSS) to control the allocation of
available CPU resources among workloads, based on their importance. This
importance is expressed by the number of shares of CPU resources that you
assign to each workload.

You give each project CPU shares to control the project's entitlement
to CPU resources. The FSS guarantees a fair dispersion of CPU resources among
projects that is based on allocated shares, independent of the number of processes
that are attached to a project. The FSS achieves fairness by reducing a project's
entitlement for heavy CPU usage and increasing its entitlement for light usage,
relative to the usage of other projects.

The FSS consists of a kernel scheduling
class module and class-specific versions of the dispadmin(1M) and priocntl(1) commands. Project shares used
by the FSS are specified through the project.cpu-shares property
in the project(4) database.

CPU Share Definition

The term “share” is used to define a portion
of the system's CPU resources that is allocated to a project. If you assign
a greater number of CPU shares to a project, relative to other projects, the
project receives more CPU resources from the fair share scheduler.

CPU shares are not equivalent to percentages of CPU resources. Shares
are used to define the relative importance of workloads in relation to other
workloads. When you assign CPU shares to a project, your primary concern is
not the number of shares the project has. Knowing how many shares the project
has in comparison with other projects is more important. You must also take
into account how many of those other projects will be competing with it for
CPU resources.

Note –

Processes in projects with zero shares always run at the lowest
system priority (0). These processes run only when projects with nonzero shares
are not using CPU resources.

CPU Shares and Process State

In the Solaris system, a project workload usually consists of
more than one process. From the fair share scheduler perspective, each project
workload can be in either an idle state or an active state. A project is considered idle if none of its processes are
using any CPU resources. This usually means that such processes are either sleeping (waiting for I/O completion) or stopped. A project is
considered active if at least one of its processes is using CPU resources.
The sum of shares of all active projects is used in calculating the portion
of CPU resources to be assigned to projects.

When more projects become active, each project's CPU allocation is reduced,
but the proportion between the allocations of different projects does not
change.
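
The effect of projects becoming active can be illustrated with a minimal
sketch. It is only an arithmetic model of the rule stated above, not the
kernel's algorithm. The function name fss_entitlements and the project names
are hypothetical, and the sketch assumes that at least one active project has
a nonzero number of shares.

```python
from fractions import Fraction


def fss_entitlements(shares, active):
    """Return each active project's fraction of the CPU resources.

    shares: dict mapping project name -> number of CPU shares
    active: iterable of project names that are currently using CPU
    Only the shares of active projects enter the calculation.
    """
    active = list(active)
    total = sum(shares[name] for name in active)
    if total == 0:
        raise ValueError("no active project has a nonzero number of shares")
    return {name: Fraction(shares[name], total) for name in active}


# Hypothetical projects and share counts, used only for illustration.
shares = {"web": 1, "db": 3, "batch": 4}

two_active = fss_entitlements(shares, ["web", "db"])
all_active = fss_entitlements(shares, ["web", "db", "batch"])

# Each allocation shrinks when "batch" becomes active, but the ratio
# between "db" and "web" stays the same.
ratio_before = two_active["db"] / two_active["web"]
ratio_after = all_active["db"] / all_active["web"]
```

In this model, the web and db projects drop from 1/4 and 3/4 of the CPU
resources to 1/8 and 3/8 when batch becomes active, while the ratio between
them remains 3 to 1.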

CPU Share Versus Utilization

Share allocation is not the same as utilization. A project that is allocated
50 percent of the CPU resources might average only a 20 percent CPU use. Moreover,
shares serve to limit CPU usage only when there is competition from other
projects. Regardless of how low a project's allocation is, it always receives
100 percent of the processing power if it is running alone on the system.
Available CPU cycles are never wasted. They are distributed between projects.

The allocation of a small share to a busy workload might slow its performance.
However, the workload is not prevented from completing its work if the system
is not overloaded.

CPU Share Examples

Assume you have a system with two CPUs running two parallel CPU-bound
workloads called A and B,
respectively. Each workload is running as a separate project. The projects
have been configured so that project A is assigned SA shares,
and project B is assigned SB shares.

On average, under the traditional TS scheduler, each of the workloads
that is running on the system would be given the same amount of CPU resources.
Each workload would get 50 percent of the system's capacity.

When run under the control of the FSS scheduler with SA=SB, these projects are also given approximately the
same amounts of CPU resources. However, if the projects are given different
numbers of shares, their CPU resource allocations are different.

The next three examples illustrate how shares work in different configurations.
These examples show that shares are only mathematically accurate for representing
the usage if demand meets or exceeds available resources.

Example 1: Two CPU-Bound Processes in Each Project

If A and B each
have two CPU-bound processes, and SA = 1 and SB = 3, then the total
number of shares is 1 + 3 = 4. In this configuration, given sufficient CPU
demand, projects A and B are
allocated 25 percent and 75 percent of CPU resources, respectively.

Example 2: No Competition Between Projects

If A and B have
only one CPU-bound process each, and SA = 1 and SB = 100, then the total
number of shares is 101. Each project cannot use more than one CPU because
each project has only one running process. Because no competition exists between
projects for CPU resources in this configuration, projects A and B are each allocated 50 percent of all CPU resources. In this
configuration, CPU share values are irrelevant. The projects' allocations
would be the same (50/50), even if both projects were assigned zero shares.

Example 3: One Project Unable to Run

If A and B have
two CPU-bound processes each, and project A is
given 1 share and project B is given 0 shares,
then project B is not allocated any CPU resources
and project A is allocated all CPU resources. Processes
in B always run at system priority 0, so they will
never be able to run because processes in project A always
have higher priorities.
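
The three examples can be reproduced with a small model, shown here as a
sketch under stated assumptions rather than as a description of the actual
scheduler. The model assumes that a project can use at most one CPU per
CPU-bound process, that CPU time is divided among projects with nonzero shares
in proportion to their shares, and that projects with zero shares split
equally whatever is left over. The names allocate_cpus and _fill are
hypothetical.

```python
from fractions import Fraction


def _fill(weights, caps, capacity):
    """Split capacity in proportion to weights without exceeding any cap."""
    alloc = {}
    pending = dict(weights)
    while pending:
        total = sum(pending.values())
        offers = {n: capacity * w / total for n, w in pending.items()}
        capped = [n for n in pending if offers[n] >= caps[n]]
        if not capped:
            # Nobody hits a cap, so the proportional split is final.
            alloc.update(offers)
            break
        for n in capped:
            # A capped project takes its maximum; the rest is redistributed.
            alloc[n] = caps[n]
            capacity -= caps[n]
            del pending[n]
    return alloc


def allocate_cpus(projects, ncpus):
    """projects maps name -> (shares, cpu_bound_processes); returns CPUs used."""
    caps = {n: Fraction(procs) for n, (_, procs) in projects.items()}
    nonzero = {n: s for n, (s, _) in projects.items() if s > 0}
    # Assumption: zero-share projects are weighted equally among themselves.
    zero = {n: 1 for n, (s, _) in projects.items() if s == 0}
    alloc = _fill(nonzero, caps, Fraction(ncpus))
    leftover = Fraction(ncpus) - sum(alloc.values())
    alloc.update(_fill(zero, caps, leftover))
    return alloc


example1 = allocate_cpus({"A": (1, 2), "B": (3, 2)}, 2)
example2 = allocate_cpus({"A": (1, 1), "B": (100, 1)}, 2)
example3 = allocate_cpus({"A": (1, 2), "B": (0, 2)}, 2)
```

The results are expressed in CPUs. On the two-CPU system, example1 gives
project A half a CPU and project B one and a half CPUs (25 percent and 75
percent), example2 gives each project one CPU, and example3 gives project A
both CPUs and project B none.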

FSS Configuration

Projects and Users

Projects are the workload containers in the FSS scheduler. Groups of
users who are assigned to a project are treated as single controllable blocks.
Note that you can create a project with its own number of shares for an individual
user.

Users can be members of multiple projects that have different numbers
of shares assigned. By moving processes from one project to another project,
processes can be assigned CPU resources in varying amounts.

CPU Shares Configuration

The configuration of CPU shares is managed by the name service as a property
of the project database.

When the first task (or process) that is associated with a project is
created through the setproject(3PROJECT) library function, the number of CPU shares defined
as resource control project.cpu-shares in the project database is passed to the kernel. A project that does not have
the project.cpu-shares resource control defined is assigned
one share.

In the following example, this entry in the /etc/project file
sets the number of shares for project x-files to 5:

x-files:100::::project.cpu-shares=(privileged,5,none)
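
As an illustration of how such an entry is laid out, the following sketch
extracts the share count from a single line of the file. It assumes the
colon-separated layout of the entry above (project name, project ID, comment,
user list, group list, attributes) and handles only one
(privilege,value,action) triple for project.cpu-shares. The function name
cpu_shares is hypothetical. The default of one share and the limit of 65535
are taken from this chapter.

```python
import re

MAX_SHARES = 65535  # maximum number of shares for one project, per this chapter


def cpu_shares(entry, default=1):
    """Return (name, projid, shares) for one line of the project database."""
    fields = entry.strip().split(":", 5)
    if len(fields) != 6:
        raise ValueError("expected six colon-separated fields")
    name, projid, _comment, _users, _groups, attrs = fields

    # Attributes are separated by semicolons and written as name=value.
    for attr in attrs.split(";"):
        key, _, value = attr.partition("=")
        if key != "project.cpu-shares":
            continue
        match = re.fullmatch(r"\((\w+),(\d+),(\w+)\)", value)
        if match is None:
            raise ValueError("unsupported value: " + value)
        shares = int(match.group(2))
        if shares > MAX_SHARES:
            raise ValueError("share count exceeds the maximum")
        return name, int(projid), shares

    # A project without the resource control is assigned one share.
    return name, int(projid), default


parsed = cpu_shares("x-files:100::::project.cpu-shares=(privileged,5,none)")
```

For the entry shown above, parsed holds the project name x-files, the project
ID 100, and the share count 5.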

If you alter the number of CPU shares allocated to a project in the
database when processes are already running, the number of shares for that
project will not be modified at that point. The project must be restarted
for the change to become effective.

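If you want to temporarily change the number of shares assigned to a
project without altering the project's attributes in the project database,
use the prctl command. For example, to change the value of the
project.cpu-shares resource control of project x-files to 3 while processes
associated with that project are running, type the following:

# prctl -r -n project.cpu-shares -v 3 -i project x-files

In this command, -r replaces the current value of the resource control that
is named by -n, -v supplies the new value, and -i gives the ID type of the
final argument. The final argument, x-files, specifies the object of the
change. In this instance, project x-files is the object.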

The system project, which has project ID 0, includes all system daemons that
are started by the boot-time initialization scripts. The system project can be
viewed as a project with an unlimited number of shares. This means that system
is always scheduled first, regardless of how many shares have been given to
other projects. If you do not want the system project to have unlimited
shares, you can specify a number of shares for this project in the project
database.

As stated previously, processes that belong to projects with zero shares
are always given zero system priority. Processes in projects with one or more
shares run at priorities of one and higher. Thus, projects with zero shares are
only scheduled when CPU resources are available that are not requested by
a nonzero share project.

The maximum number of shares that can be assigned to one project is
65535.

FSS and Processor Sets

The FSS can be used in conjunction with processor
sets to provide more fine-grained controls over allocations of CPU resources
among projects that run on each processor set than would be available with
processor sets alone. The FSS scheduler treats processor sets as entirely
independent partitions, with each processor set controlled independently with
respect to CPU allocations.

The CPU allocations of projects running in one processor set are not
affected by the CPU shares or activity of projects running in another processor
set because the projects are not competing for the same resources. Projects
only compete with each other if they are running within the same processor
set.

The number of shares allocated to a project is system wide. Regardless
of which processor set it is running on, each portion of a project is given
the same number of shares.

When processor sets are used, project CPU allocations are calculated
for active projects that run within each processor set.

Project partitions that run on different processor sets might have different
CPU allocations. The CPU allocation for each project partition in a processor
set depends only on the allocations of other projects that run on the same
processor set.

The performance and availability of applications that run within the
boundaries of their processor sets are not affected by the introduction of
new processor sets. The applications are also not affected by changes that
are made to the share allocations of projects that run on other processor
sets.

Empty processor sets (sets without processors in them) or processor
sets without processes bound to them do not have any impact on the FSS scheduler
behavior.

FSS and Processor Sets Examples

Assume that a server with eight CPUs is running several CPU-bound applications
in projects A, B, and C. Project A is allocated one share,
project B is allocated two shares, and project C is allocated three shares.

Project A is running only on processor set
1. Project B is running on processor sets 1 and
2. Project C is running on processor sets 1, 2,
and 3. Assume that each project has enough processes to utilize all available
CPU power. Thus, there is always competition for CPU resources on each processor
set.

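The total system-wide project CPU allocations on such a system are shown
in the following table. The fractions in the table correspond to processor
sets 1, 2, and 3 that contain two, four, and two of the eight CPUs,
respectively.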

Project      Allocation

Project A    4% = (1/6 X 2/8)pset1

Project B    28% = (2/6 X 2/8)pset1 + (2/5 X 4/8)pset2

Project C    67% = (3/6 X 2/8)pset1 + (3/5 X 4/8)pset2 + (3/3 X 2/8)pset3

These percentages do not match the corresponding amounts of CPU shares
that are given to projects. However, within each processor set, the per-project
CPU allocation ratios are proportional to their respective shares.
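
The per-set arithmetic can be checked with a short sketch. The processor set
sizes of two, four, and two CPUs are an assumption read off the fractions in
the table (2/8, 4/8, and 2/8), and the function name system_wide_allocation is
hypothetical. Within each set, a project's fraction is its shares divided by
the shares of all projects that run in that set. That fraction is then
weighted by the set's portion of the eight CPUs.

```python
from fractions import Fraction

SHARES = {"A": 1, "B": 2, "C": 3}

# Processor set -> (number of CPUs, projects running in the set).
# The CPU counts are inferred from the table, not stated in the text.
PSETS = {
    "pset1": (2, ["A", "B", "C"]),
    "pset2": (4, ["B", "C"]),
    "pset3": (2, ["C"]),
}


def system_wide_allocation(shares, psets):
    """Return each project's fraction of all CPUs, summed over its sets."""
    total_cpus = sum(ncpus for ncpus, _ in psets.values())
    result = {name: Fraction(0) for name in shares}
    for ncpus, members in psets.values():
        set_shares = sum(shares[name] for name in members)
        for name in members:
            within_set = Fraction(shares[name], set_shares)
            result[name] += within_set * Fraction(ncpus, total_cpus)
    return result


allocation = system_wide_allocation(SHARES, PSETS)
percentages = {name: float(value) * 100 for name, value in allocation.items()}
```

Under these assumptions the exact fractions are 1/24 for project A, 17/60 for
project B, and 27/40 for project C, which correspond to the rounded
percentages in the table.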

On the same system without processor sets, the
distribution of CPU resources would be different, as shown in the following
table.

Project      Allocation

Project A    16.66% = (1/6)

Project B    33.33% = (2/6)

Project C    50% = (3/6)

Combining FSS With Other Scheduling Classes

By default, the FSS scheduling class uses the same range of priorities
(0 to 59) as the timesharing (TS), interactive (IA), and fixed priority (FX)
scheduling classes. Therefore, you should avoid having processes from these
scheduling classes share the same processor set. A mix
of processes in the FSS, TS, IA, and FX classes could result in unexpected
scheduling behavior.

With the use of processor sets, you can mix TS, IA, and FX with FSS
in one system. However, all the processes that run on each processor set must
be in one scheduling class, so they do not compete for
the same CPUs. The FX scheduler in particular should not be used in conjunction
with the FSS scheduling class unless processor sets are used. Separating the
classes in this way prevents applications in the FX class from using priorities
high enough to starve applications in the FSS class.

You can mix processes in the TS and IA classes in the same processor
set, or on the same system without processor sets.

The Solaris system also offers a real-time (RT) scheduler to users with
superuser privileges. By default, the RT scheduling class uses system priorities
in a different range (usually from 100 to 159) than FSS. Because RT and FSS
are using disjoint, or non-overlapping, ranges of priorities,
FSS can coexist with the RT scheduling class within the same processor set.
However, the FSS scheduling class does not have any control over processes
that run in the RT class.

For example, on a four-processor system, a single-threaded RT process
can consume one entire processor if the process is CPU bound. If the system
also runs FSS, regular user processes compete for the three remaining CPUs
that are not being used by the RT process. Note that the RT process might
not use the CPU continuously. When the RT process is idle, FSS utilizes all
four processors.

You can type the following command to find out which scheduling classes
the processes in each processor set are running in, and to ensure that each
processor set is configured to run either TS, IA, FX, or FSS processes.

$ ps -ef -o pset,class | grep -v CLS | sort | uniq

Scheduling Class on a System with Zones Installed

Non-global zones use the default scheduling class for the system. If
the system is updated with a new default scheduling class setting, non-global
zones obtain the new setting when booted or rebooted.

The preferred way to use FSS in this case is to set FSS to be the system
default scheduling class with the dispadmin command. All
zones then benefit from getting a fair share of the system CPU resources.
See Scheduling Class in a Zone for more information
on scheduling class when zones are in use.

For information about moving running processes into a different
scheduling class without changing the default scheduling class and rebooting,
see Table 27–5 and the priocntl(1) man page.

Commands Used With FSS

The commands that are shown in the following table provide the
primary administrative interface to the fair share scheduler.