This is to announce the first stable release of the BFS CPU scheduler for linux 3.3.0, designed for optimal interactivity, responsiveness and throughput on commodity hardware.

The changes since BFS version 0.416 include a fairly large architectural change just to bring the codebase in sync with 3.3, but none of those changes should be noticeable in any way. One change that may be user-visible is that the high resolution IRQ accounting now appears to be on by default for x86 architectures. System time accounting is wrong in BFS without this feature enabled, so this should correct that problem.

Other changes:

416-417: A number of ints were changed to bool which, though unlikely to have any performance impact, does make the code cleaner, and the compiled code does often come out different. rq_running_iso was converted from a function to a macro so that it cannot be compiled as a separate function call with the attendant overhead. requeue_task within the scheduler tick was moved to being done under lock, which may prevent rare races. test_ret_isorefractory() was optimised. set_rq_task() was not being called on tasks that were being requeued within schedule(), which could possibly have led to issues if the task ran out of timeslice during that requeue and should have had its deadline offset. The need_resched() check that occurs at the end of schedule() was changed to unlikely() since it really is that. Moved the scheduler version print function to bfs.c to avoid recompiling the entire kernel if the version number is changed.
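To illustrate the macro conversion and the unlikely() hint described above, here is a minimal userspace sketch. It is not the BFS source: struct demo_rq, DEMO_SCHED_ISO, demo_rq_running_iso and demo_schedule_tail are hypothetical names made up for this example, and unlikely() is redefined here on top of the GCC/Clang __builtin_expect builtin so that the snippet builds outside the kernel.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's unlikely() hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical, heavily simplified run queue; NOT the real struct rq. */
struct demo_rq {
	int policy;          /* policy of the task currently running */
	bool resched_wanted; /* stand-in for what need_resched() reports */
};

#define DEMO_SCHED_ISO 4 /* illustrative value only */

/* Macro form: expanded at the call site, so no function call is emitted. */
#define demo_rq_running_iso(rq) ((rq)->policy == DEMO_SCHED_ISO)

/*
 * Tail of a pretend schedule(): the "go around again" branch is marked
 * unlikely so the compiler lays out the common path as the fall-through.
 * Returns how many passes were made (1 normally, 2 if a resched was pending).
 */
static int demo_schedule_tail(struct demo_rq *rq)
{
	int passes = 1;

	if (unlikely(rq->resched_wanted)) {
		rq->resched_wanted = false;
		passes++;
	}
	return passes;
}
```

The hint changes only code layout and branch prediction bias, never the result, which is why such a change is safe even when the guess is occasionally wrong.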

417-418: Fixed a problem with the accounting resync for linux 3.3.

418-419: There was a small possibility that an unnecessary resched would occur in try_preempt if a task had changed affinity and called try_preempt with its ->cpu still set to the old cpu it could no longer run on, so try_preempt was reworked slightly. Reintroduced the deadline offset based on CPU cache locality on sticky tasks in a way that is cheaper than the way we currently offset the deadline.
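The affinity problem can be pictured with a small sketch. This is an assumption-laden illustration, not the reworked try_preempt: struct demo_task, its fields and demo_try_preempt are hypothetical, and affinity is modelled as a plain 32-bit mask. The point it shows is that the preemption target is chosen only from CPUs the task may currently run on, so a stale ->cpu value can never cause a resched on a CPU the task has left.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical task model; NOT the kernel's task_struct. */
struct demo_task {
	int cpu;               /* last CPU the task ran on; may be stale */
	uint32_t cpus_allowed; /* bit n set => task may run on CPU n */
	uint64_t deadline;     /* smaller value = more urgent */
};

/*
 * Pick the CPU whose running task should be preempted by p, or -1.
 * running_deadline[cpu] is the deadline of the task running on that CPU.
 * Only CPUs in p->cpus_allowed are considered; p->cpu is never trusted.
 */
static int demo_try_preempt(const struct demo_task *p,
			    const uint64_t *running_deadline, int ncpus)
{
	int target = -1;
	uint64_t latest = p->deadline;

	for (int cpu = 0; cpu < ncpus && cpu < 32; cpu++) {
		if (!(p->cpus_allowed & (UINT32_C(1) << cpu)))
			continue; /* not allowed here: never resched this CPU */
		if (running_deadline[cpu] > latest) {
			latest = running_deadline[cpu];
			target = cpu;
		}
	}
	return target;
}
```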

419-420: Finally rewrote the earliest_deadline_task code. This has long been one of the hottest code paths in the scheduler, and small changes here that made it look nice would often slow it down. I spent quite a few hours reworking it to include fewer GOTOs while disassembling the code to make sure it was actually getting smaller with every change. Then I wrote a scheduler-specific version of find_next_bit which could be inlined into this code and avoid another function call in the hot path. The overall behaviour is unchanged from previous BFS versions, but initial benchmarking confirms slight improvements in throughput.
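As a rough idea of what an inlinable bit search looks like, here is a userspace sketch. It is not the BFS implementation: demo_next_set_bit and DEMO_BITS_PER_LONG are hypothetical names, and the code assumes a GCC/Clang compiler for __builtin_ctzl. It returns the index of the first set bit at or after a starting position, or the bitmap size if there is none, which is the same contract as the kernel's find_next_bit.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first set bit at or after `start`,
 * or `nbits` if no such bit exists. Small enough to be inlined
 * into its caller, so the hot path pays no function call.
 */
static inline size_t demo_next_set_bit(const unsigned long *map,
				       size_t nbits, size_t start)
{
	if (start >= nbits)
		return nbits;

	size_t nwords = (nbits + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG;
	size_t word = start / DEMO_BITS_PER_LONG;
	/* Mask off the bits below `start` in the first word examined. */
	unsigned long cur = map[word] & (~0UL << (start % DEMO_BITS_PER_LONG));

	while (!cur) {
		if (++word >= nwords)
			return nbits;
		cur = map[word];
	}

	/* Count trailing zeros to locate the lowest set bit in this word. */
	size_t idx = word * DEMO_BITS_PER_LONG + (size_t)__builtin_ctzl(cur);
	return idx < nbits ? idx : nbits;
}
```

A caller would loop with start = idx + 1 to walk every set bit in order, stopping when the return value equals nbits.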

While interactivity is the prime concern for BFS, as part of the regression testing, throughput benchmarks are performed with kernbench. This is a plot of BFS 418/420 and mainline archlinux 3.3.0 kernel on a dual quad hyperthread core2 (lower is better):