No age scheduling for Oracle on HP-UX: HPUX_SCHED_NOAGE (1/2)

christianbilien

12 years ago

Advertisements

The HPUX_SCHED_NOAGE initialization parameter has been around for some times, but in most of the situation where I found out it would be useful, sys admins and DBAs alike were a bit scared (to say the least) to use it. However, I cannot see a real threat to using this parameter: some degradation may occur if set inappropriately, but not outages and side effects associated with real time. Here are a couple of notes to explain this parameters usefulness. This first post will present time share generalities, the second one will be more specific about this parameter.

HP-UX scheduling policies can be classified using different criteria. One of them is priority range. Basically, processes can belong to one of 3 priorities ranges: rtprio and rtsched are real-time, HP-UX time share may be either real time or not. I am only interested here by the latest: HPUX_SCHED_NOAGE only interacts with the time share mechanism, not with the POSIX (rtsched) or rtprio (the oldest HPUX real time system) schedulers.

Time-share scheduling

This is the default HP-UX scheduler: its priority range is from 128 (highest) to 255 (lowest). It is itself divided between system (128-177) and user priorities (178-255). I’ll start to explain what happens to processes when HPUX_SCHED_NOAGE is not set in order to build up the case for using it.

1. Normal time-share scheduling (HPUX_SCHED_NOAGE unset)

Each CPU has a run queue, also sometimes designated as the dispatch queue (each queue length can be seen by sar –qM). Let’s start with a mono processor: processes using the normal time-share scheduling will experience priority changes over time.

CPU time is accumulated for the running thread (interestingly, this could lead to “phantom” thread using for a transaction less than 10ms (a tick) of CPU time, for which the probability of priority decay would be low). Every 40ms (4 ticks), the priority of the running thread is lowered. The priority degradation is a function of the CPU time accumulated during those 40ms (I do not know what the function formula is, but it is linear). Finally, a thread which has been running for TIMESLICE (default 10) x 10ms will be forced out (this is a forced context switch, seen in Glance). TIMESLICE is a changeable HP-UX kernel parameter. Thus, the default forced context switches will by occurs after a 100ms quantum. The thread waiting in the run queue having the highest priority will then be dispatched. Threads waiting (for a semaphore, an I/O, or any other reason) will regain their priority exponentially (I do not know the formula either).

Some known problems:

The scheduler assumes that all threads are of equal importance: this can be corrected by using nice(1) to slow down or accelerate the priority change when the process has no thread running.

CPU-hungry transactions tend to be bogged in low priorities: the more they need CPU, the less they get it.