At the Hotsos Symposium last week, Tanel Poder pointed out an interesting hidden parameter (_high_priority_processes). We were discussing the occasional need to increase priority on the LGWR process when dealing with “log file sync” issues. The parameter tells Oracle to automatically set the listed background processes to a higher than normal priority when the database is started. On Linux this means putting them in the RR class; on Solaris, the RT class is used. The parameter defaults to LMS* in 10.2 and to LMS*|VKTM in 11gR1. (LMS* are the Lock Manager Server processes in a RAC instance and VKTM is the new Time Keeper process in 11g.) It makes sense that these processes would run at a higher priority.

Recently I have been working on a system that was spending a significant amount of time waiting on “log file sync”. Increasing the priority of the LGWR process made a significant improvement. It’s not a silver bullet though. When the system becomes really busy, the “log file sync” times still suck, just not as much. But I digress. This post is just to point out that Oracle has a built-in mechanism to increase priority on background processes, not to tell you how to reduce long “log file sync” times. There are numerous posts on that subject already. Here are a couple of pretty good ones:

Back to my story. Unfortunately, every time the database is bounced, LGWR goes back to his original priority and class and we have to get the Unix guys to reset it. So I got to wondering if the _high_priority_processes parameter could be used to do this automatically. And sure enough it looks like it can. First here’s a look at the class and priority info of the standard high priority processes on a Solaris system and a Linux system.

So you can see that the database did indeed cause the background processes specified in the _high_priority_processes parameter to run with a higher priority / scheduling class. Cool, maybe we can let the Unix guy get a little sleep now.
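For anyone who wants to run the same check on their own Linux box, here is a minimal sketch. It assumes the procps version of ps, whose cls and rtprio output columns report the scheduling class and real-time priority (Solaris ps uses different options, so this is Linux only). It also assumes the usual ora_ prefix on Oracle background process names. The function names are made up for this illustration.

```python
import subprocess


def parse_ps(output):
    """Parse the text produced by 'ps -eo pid,cls,rtprio,comm' into dicts."""
    rows = []
    for line in output.splitlines()[1:]:  # first line is the column header
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        pid, cls, rtprio, comm = parts
        rows.append({
            "pid": int(pid),
            "cls": cls,  # e.g. TS (time sharing) or RR (round robin)
            # ps prints '-' when the process has no real-time priority
            "rtprio": None if rtprio == "-" else int(rtprio),
            "comm": comm.strip(),
        })
    return rows


def oracle_background(rows, prefix="ora_"):
    """Keep only rows whose command name starts with the assumed ora_ prefix."""
    return [r for r in rows if r["comm"].startswith(prefix)]


def read_ps():
    """Run ps (Linux procps assumed) and return its raw text output."""
    result = subprocess.run(
        ["ps", "-eo", "pid,cls,rtprio,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling oracle_background(parse_ps(read_ps())) after a restart should show whether the processes named in the parameter came up in the RR class rather than the default TS class.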

22 Comments

Does anyone know what the LMS* value stands for?
I’m also wondering if the higher priority is a fixed “jump” or is there a way to specify a given one?
For example, in AIX I can only change priorities within certain limits set by the sysadmin, what happens if Oracle tries to go higher?
(I guess I have to try it out soon, currently I just sudo a renice…)

RAC has multiple LMS processes (lms0, lms1, …), so LMS* appears to mean apply it to all the LMS background processes. I am not aware of any controls for the priority number itself. If you find one let me know.

Hmmm, I can confidently report that in AIX 5.3 with Oracle 10.2.0.3, this does not work.
I’ve tried exactly the same string as here, no effect whatsoever on the priority.
Mind you: I’m not running RAC, don’t know if that would have any effect.

Interesting. I wouldn’t think RAC would make a difference. The instances I tested it on were not RAC. I have only tried it on Redhat Linux so far though. I’ll give it a try on Solaris and post my results.

To Tanel’s and Bob Sneed’s point, RT CPU priority is kernel preemptive. By executing in RT mode on the CPU, LGWR can block the interrupts that are waiting to be serviced. LGWR needs those interrupts to be serviced, and they could quite probably be the I/O completion interrupts LGWR needs in order to complete commit/log file sync processing. A waiter blocking the interrupt servicer leads to an incorrect priority anomaly, analogous to a waiter of a resource evicting the holder of that resource off the CPU.

On the other hand, in RAC, LMS processes will wait for LGWR to complete ‘log file sync’ if the block to be transferred is considered “busy”. But LMS is also running in RT mode. So if the LGWR process is not running in RT mode, the LMS process can evict LGWR from the CPU. This also leads to an incorrect priority anomaly.

If there are enough CPUs, in RAC, I would prefer to run both LGWR and LMS* processes in RT mode (and VKTM in 11g), then fence the interrupts to a different processor set so that LGWR/LMS* cannot block those interrupts. In this way we should be able to keep interrupts/LMS/LGWR priorities straight. [At the very least, run LGWR in FX 60 in Solaris so that LGWR does not suffer from TS class scheduling latencies.]

It is also important to realize that the default number of LMS processes is quite excessive. For example, one of my client sites had 28 LMS processes, all 28 of them running in RT mode! You can imagine how much CPU those LMS processes will use. In my opinion, in most cases, 3 or 5 LMS processes per instance is enough to service GC traffic. So, if the number of LMS processes is excessive, it is probably worthwhile to reduce it and then also increase LGWR priority.

Last but not least, I still fail to see the point of running VKTM in RT mode!

Found something very interesting. Remember when I asked what priority level did this set processes to?
Well, it turns out it’s nothing to do with that. This parameter raises the processes to real time priority class, not level. What controls the level is (yet!) another hidden parameter:
_os_sched_high_priority
For complete reference, go to MOS and search for “602419.1”.
From the result set, pick the
“LMS not running in RT (real time) mode in 10.2.0.3 RAC database” link.
It’s all explained there.
See what you done? Now I *must* test all these things!!!!
:D

Interesting. The _os_sched_high_priority parameter appears to just ratchet the priority up within the RT/RR class. On Linux with 10.2.0.4, a value of 1 = priority of 41, 2 = 42, 10 = 50, 20 = 60, etc… I think once you get into the Real Time class, moving the priority up will have little value. At that point I think you’d need to follow Riyaj’s suggestion regarding interrupt fencing / isolating LGWR. Thanks for pointing out the parameter and the Metalink reference.
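The four data points above all fit priority = 40 + parameter value. Here is a tiny sketch of that relationship, assuming it stays linear between and beyond the values actually tested. Only those four were observed, on Linux with 10.2.0.4. The constant and function name are made up for illustration and are not documented by Oracle.

```python
# Offset inferred from the four observations (1 -> 41, 2 -> 42, 10 -> 50, 20 -> 60).
# This is an assumption drawn from one Linux 10.2.0.4 test, not a documented value.
OBSERVED_BASE = 40


def observed_priority(os_sched_high_priority):
    """Estimate the displayed priority for a given _os_sched_high_priority value."""
    return OBSERVED_BASE + os_sched_high_priority


# The values that were actually tested
for value in (1, 2, 10, 20):
    print(value, "->", observed_priority(value))
```

Other platforms and versions may well number things differently, so treat this as a way to read the observations rather than a rule.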

How do I get the LMS processes to run in RR? I’m on Linux IA64 and RHEL EE 5.3, so there is no support for this combination… even on Metalink!
I’m suffering node evictions without reason. After I set diagwait and made some configuration changes, there has been no eviction in the last 10 days… even under heavy stress (800 sql connections)…
So I’m pretty desperate… searching for the reason…

The SR is closed as unresolved and there is nothing else to do.
The main problem from my point of view is the Intel Itanium CPU, a configuration that is really rare, so I think there is no test configuration for it on the Oracle Technical Support side.
Rg,
Damir Vadas

[…] are not preempted by lower priority processes. We know that since 10g some of Oracle processes have Real-Time priority. By default only LMS and VKTM processes scheduled in RT class. These processes do not use mutexes. […]

The answer is maybe it could help. Those wait events are related to moving blocks between RAC nodes, so the best resolution would be to figure out why so many blocks have to move (as opposed to just trying to make the move faster). Maybe some sort of workload partitioning would work better.