Briefly, the technology makes spinlocks and rwlocks preemptible by default.

The patch auto-detects at compile time which type of lock to use for each spinlock (a mutex, or the original raw_spinlock).

It uses a feature of GCC to manage this (reducing the patch size).

It uses native Linux semaphores to implement the preemptible locks.

It converts rwlocks to rw-semaphores.

Apparently, about 90 locks are targeted for NON-conversion to preemptibility (that is, they are preserved as raw spinlocks).

Ingo mentioned at one time that this was about 20% of the locks in his kernel configuration, implying that there were about
450 spinlocks present in the kernel in his configuration.

Ingo said this about how well this works on uni-processor (UP) systems versus
SMP systems:

...and no matter how well UP works, to fix SMP one has to 'cover' all the
necessary locks first before fixing it, which (drastic) increase in raw
locks invalidates most of the UP efforts of getting rid of raw locks.
That's why i decided to go for SMP primarily - didnt see much point in
going for UP.

Normally, on UP the spinlocks are compiled away. When PREEMPT is turned on (without the new patch),
these spinlocks are turned into markers for non-preemptible regions. When PREEMPT_RT is used,
the spinlocks become preemptible mutexes.

Comments regarding the scheduling of RT tasks

you have to enable CONFIG_PREEMPT_RT to activate this feature. I've
designed this code to not hurt non-RT scheduling, and i've optimized
performance for the 'lightly loaded case' (which is the most common to
occur on mainline-using systems).

A very short description of the design: there's a global 'RT overload
counter' - which is zero and causes no overhead if there is at most 1 RT
task in every runqueue. (i.e. at most 2 RT tasks on a 2-way system, at
most 4 RT tasks on a 4-way system, etc.) If the system gets into 'RT
overload' mode (e.g. the third RT task gets activated on a 2-way box),
then the scheduler starts to balance the RT tasks aggressively. Also,
whenever an RT task is preempted on a CPU, or is woken up but cannot
preempt a higher-prio RT task on a given CPU, then it's 'pushed' to
other CPUs if possible. This design avoids global locking (it avoids a
global runqueue), which simplifies things immensely. (I first tried a
global runqueue for RT tasks but the complexity impact was much bigger.)

(note that these scheduler changes are reasonably self-contained and do
not depend on other parts of PREEMPT_RT, so in theory they could be
added to mainline too, after some time - given lots of testing and broad
agreement.)

Comments about the number of raw spinlocks needed

Sven Dietrich <sdietr...@mvista.com> wrote:
> IMO the number of raw_spinlocks should be lower, I said teens before.
> Theoretically, it should only need to be around hardware registers and
> some memory maps and cache code, plus interrupt controller and other
> SMP-contended hardware.
yeah, fully agreed. Right now the 90 locks i have means roughly 20% of
all locking still happens as raw spinlocks.
But, there is a 'correctness' _minimum_ set of spinlocks that _must_ be
raw spinlocks - this i tried to map in the -T4 patch. The patch does run
on SMP systems for example. (it was developed as an SMP kernel - in fact
i never compiled it as UP :-|.) If code has per-CPU or preemption
assumptions then there is no choice but to make it a raw spinlock, until
those assumptions are fixed.

Rationale

This feature is intended to provide much better realtime scheduling response for a Linux
system.

Resources

Projects

Various parties are working on ports: TimeSys and MontaVista, in particular, seem to have
made ports to PPC and ARM platforms.