FSS(7)

NAME

FSS– Fair share scheduler

DESCRIPTION

The fair share scheduler (FSS) guarantees application performance by
explicitly allocating shares of CPU resources to projects. A share indicates
a project's entitlement to available CPU resources. Because shares are meaningful
only in comparison with other project's shares, the absolute quantity of shares
is not important. Any number that is in proportion with the desired CPU entitlement
can be used.

The goals of the FSS scheduler differ from the traditional time-sharing
scheduling class (TS). In addition to scheduling individual LWPs, the FSS
scheduler schedules projects against each other, making it impossible for
any project to acquire more CPU cycles simply by running more processes concurrently.

A project's entitlement is individually calculated by FSS independently
for each processor set if the project contains processes bound to them. If
a project is running on more than one processor set, it can have different
entitlements on every set. A project's entitlement is defined as a ratio between
the number of shares given to a project and the sum of shares of all active
projects running on the same processor set. An active project is one that
has at least one running or runnable process. Entitlements are recomputed
whenever any project becomes active or inactive, or whenever the number of
shares is changed.

Processor sets represent virtual machines in the FSS scheduling class
and processes are scheduled independently in each processor set. That is,
processes compete with each other only if they are running on the same processor
set. When a processor set is destroyed, all processes that were bound to it
are moved to the default processor set, which always exists. Empty processor
sets (that is, sets without processors in them) have no impact on the FSS
scheduler behavior.

If a processor set contains a mix of TS/IA and FSS processes, the fairness
of the FSS scheduling class can be compromised because these classes use the
same range of priorities. Fairness is most significantly affected if processes
running in the TS scheduling class are CPU-intensive and are bound to processors
within the processor set. As a result, you should avoid having processes from
TS/IA and FSS classes share the same processor set. RT and FSS processes use
disjoint priority ranges and therefore can share processor sets.

As projects execute, their CPU usage is accumulated over time. The FSS
scheduler periodically decays CPU usages of every project by multiplying it
with a decay factor, ensuring that more recent CPU usage has greater weight
when taken into account for scheduling. The FSS scheduler continually adjusts
priorities of all processes to make each project's relative CPU usage converge
with its entitlement.

While FSS is designed to fairly allocate cycles over a long-term time
period, it is possible that projects will not receive their allocated shares
worth of CPU cycles due to uneven demand. This makes one-shot, instantaneous
analysis of FSS performance data unreliable.

Note that share is not the same as utilization. A project may be allocated
50% of the system, although on the average, it uses just 20%. Shares serve
to cap a project's CPU usage only when there is competition from other projects
running on the same processor set. When there is no competition, utilization
may be larger than entitlement based on shares. Allocating a small share to
a busy project slows it down but does not prevent it from completing its work
if the system is not saturated.

The configuration of CPU shares is managed by the name server as a property
of the project(4)
database. In the following example, an entry in the /etc/project file sets the number of shares for project "x-files" to 10:

x-files:100::::project.cpu-shares=(privileged,10,none)

Projects with undefined number of shares are given one share each. This
means that such projects are treated with equal importance. Projects with
0 shares only run when there are no projects with non-zero shares competing
for the same processor set. The maximum number of shares that can be assigned
to one project is 65535.

You can use the prctl(1)
command to determine the current share assignment for a given project:

$ prctl -n project.cpu-shares -i project x-files

or to change the amount of shares if you have root privileges:

# prctl -r -n project.cpu-shares -v 5 -i project x-files

See the prctl(1)
man page for additional information on how to modify and examine resource
controls associated with active processes, tasks, or projects on the system.

By default, project "system" (project ID 0) includes all system daemons
started by initialization scripts and has an "unlimited" amount of shares.
That is, it is always scheduled first no matter how many shares are given
to other projects.

The following command sets FSS as the default scheduler for the system:

# dispadmin -d FSS

This change will take effect on the next reboot. Alternatively, you
can move processes from the time-share scheduling class (as well as the special
case of init) into the FSS class without changing your default scheduling
class and rebooting by becoming root, and then using the priocntl(1) command,
as shown in the following example:

# priocntl -s -c FSS -i class TS
# priocntl -s -c FSS -i pid 1

CONFIGURING SCHEDULER WITH DISPADMIN

You can use the dispadmin(1M)
command to examine and "tune" the FSS scheduler's time quantum value. Time
quantum is the amount of time that a thread is allowed to run before it must
relinquish the processor. The following example dumps the current time quantum
for the fair share scheduler:

The value of the QUANTUM represents some fraction of a second with the
fractional value determied by the reciprocal value of RES. With the default
value of RES = 1000, the reciprocal of 1000 is .001, or milliseconds. Thus,
by default, the QUANTUM value represents the time quantum in milliseconds.

If you change the RES value using dispadmin with
the -r option, you also change the QUANTUM value. For example,
instead of quantum of 110 with RES of 1000, a quantum of 11 with a RES of
100 results. The fractional unit is different while the amount of time is
the same.

You can use the -s option to change the time quantum
value. Note that such changes are not preserved across reboot. Please refer
to the dispadmin(1M)
man page for additional information.