No age scheduling for Oracle on HP-UX: HPUX_SCHED_NOAGE (2/2)

christianbilien

12 years ago

Advertisements

The first post on this topic was presenting the normal HP-UX timeshare scheduling, under which all processes will run. Its main objective was to give the rules for context switching (threads releasing CPUs and getting a CPU back), before comparing it to the timeshare no age scheduling.

The decaying priority scheme has not been altered since the very early HP-UX workstations, at a time when I assume data bases were a remote if not non existent consideration to the HP-UX designers.

As a thread gets weaker because of a decreasing priority, its probability of being switched out will increase. If this process holds a critical resource (an Oracle latch, not even mentioning the log writer), this context switch will cause resource contention should other threads wait for the same resource.

A context switch (you can see them with in the pswch/s column of sar -w) is a rather expensive task: the system scheduler makes a copy of all the switched out process registries and copy into a private area known as the UAREA. The reverse operation needs to be performed for the incoming thread: its UAREA will be copied in the CPU registries. Another problem will of course occurs when CPU switches also happen on top of thread switches: modified cache lines need to be invalidated and read lines reloaded on the new CPU. I once read that a context switch would cost around 10 microseconds (I did not verify it myself), which is far for negligible. The HP-UX internals manuals mentions 10000 CPU cycle, which would indeed be translated into 10 microseconds with a 1Ghz CPU.

So when does a context switch occurs? HP-UX considers two cases: forced and voluntary context switches. Within the Oracle context, voluntary context switches are most of the time disk I/Os. A forced context switch will occur at the end of a timeslice when a thread with an equal or higher priority is runnable, or when a process returns from a system call or trap and a higher-priority thread is runnable.

The sched no age policy was added in HP-UX11i: it is part of the 178-225 range (the same range as the normal time share scheduler user priorities), but a thread running in this scheduler will not experience any priority decay: its priority will remain constant over time although the thread may be timesliced. Some expected benefits:

Less context switches overhead (lower cpu utilization)

The cpu hungry transactions may complete faster in a CPU busy system (the less hungry transactions may run slower, but as they do not spend much time on the CPUs, it may not be noticeable).

May help resolve an LGWR problem: the log writer inherently experiences a lot of voluntary context switches, which helps keeping a high priority. However, it may not be enough on CPU-busy systems.

As the root user, give the RTSCHED and RTPRIO privileges to the dba group setprivgrp dba RTSCHED RTPRIO. Create the /etc/privgroup file, if it does not exist, and add the following line to it: dba RTSCHED RTPRIO. Finally update the spfile/init with hpux_sched_noage=178 to 255. See rtsched(1).