A development blog of what Con Kolivas is doing with code at the moment with the emphasis on linux kernel, BFS and -ck.

Monday, 26 September 2011

BFS 0.410 test with skiplists.

Hi all.

TL;DR: Test release for fastest BFS ever.

The skiplists patch has proven to be quite stable, but a couple of minor issues have shown up. First, as I knew, the ondemand behaviour would not quite match the current release, and second, SCHED_IDLEPRIO tasks weren't scheduled properly (they acted like normal tasks). However the code itself seemed quite safe otherwise. So I've tentatively put out a test release of the next version of BFS. The two main changes to the skiplist code are to bring the ondemand governor performance up to par with current BFS, and to fix the behaviour of SCHED_IDLEPRIO tasks.

Quick benchmarks:
BFS 406:
Make -j4: 26.6s
Make -j : 27.8s

BFS 410:
Make -j4: 26.4s
Make -j : 27.1s

Changelog in patch:

Implement skip lists as described here:
http://en.wikipedia.org/wiki/Skip_list
for the main priority queue.
The old queue had: O(1) insertion, O(1) removal, but a lookup involving
both a binary search of a bitmap and O(n) search through a linked list which
is very cache unfriendly as the list gets larger.
This queue is now: O(log n) insertion, O(1) lookup and O(k) removal in a much
more cache friendly manner.
This should not compromise performance at all at very low loads, but improve
both throughput and latency as loads get higher, as confirmed by benchmarks.
Other changes: Cleanups of variable choices and micro-optimisations.
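
To make the complexity claims above concrete, here is a minimal sketch of a skip list used as a priority queue. It is not the BFS kernel code: the class and method names, the MAX_LEVEL cap of 16, the 0.5 promotion probability and the use of a "deadline" integer as the sort key are all assumptions chosen for illustration. Insertion walks down the levels (O(log n) expected), the lowest key is always the first node after the head (O(1) lookup), and removing that node only touches as many pointers as the node has levels (O(k)).

```python
import random

MAX_LEVEL = 16  # assumption: fixed cap on levels, picked for illustration


class Node:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level  # one forward pointer per level


class SkipListQueue:
    """Hypothetical priority queue ordered by key (e.g. a deadline)."""

    def __init__(self, seed=0):
        self.head = Node(None, None, MAX_LEVEL)  # sentinel, holds no task
        self.level = 1                           # levels currently in use
        self.rng = random.Random(seed)

    def _random_level(self):
        # Each extra level is taken with probability 1/2.
        level = 1
        while level < MAX_LEVEL and self.rng.random() < 0.5:
            level += 1
        return level

    def insert(self, key, value):
        # O(log n) expected: find the predecessor at every level, top down.
        update = [self.head] * MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            # "<=" keeps equal keys in insertion (FIFO) order.
            while node.forward[i] is not None and node.forward[i].key <= key:
                node = node.forward[i]
            update[i] = node
        level = self._random_level()
        if level > self.level:
            self.level = level  # new levels are already preceded by head
        new = Node(key, value, level)
        for i in range(level):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def peek(self):
        # O(1): the lowest key is always first on the bottom level.
        first = self.head.forward[0]
        return None if first is None else (first.key, first.value)

    def pop(self):
        # O(k): unlink the first node from each of its k levels.
        first = self.head.forward[0]
        if first is None:
            return None
        for i in range(len(first.forward)):
            self.head.forward[i] = first.forward[i]
        while self.level > 1 and self.head.forward[self.level - 1] is None:
            self.level -= 1
        return first.key, first.value
```

Usage would look like `q = SkipListQueue(); q.insert(deadline, task); q.pop()`, with pop() always returning the entry with the earliest key. Because nodes are reached by following a handful of pointers rather than scanning a long list, fewer cache lines are touched as the queue grows.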

@Con, I am not a programmer. Does this work aim to get your scheduler BFS to scale up to very large machines?

Didn't you once say something like: it is more realistic to accept that a scheduler cannot scale to all use cases (as CFS pretends to do)? Are you trying to disprove your own assumption with this work?

That's a very good question. In fact this is not about scaling BFS up to large machines. Believe it or not, the machine that will benefit the most from this change is the one with the lowest CPU count! The reason is that load is relatively higher the fewer CPUs you have. Although the benchmark is aimed at showing the heavily loaded "make -j" case, this is simply a way of quantifying it somehow. In my experimentation, even if load is often low at around 1, you can *easily* get bursts of load up to 10-15; it's just that they're so short lived they never show up in your load average.
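
The point about short bursts hiding from the load average can be sketched numerically. The simulation below assumes a load average built the way Linux is generally understood to do it: the number of runnable tasks is sampled about every 5 seconds and folded into an exponentially decaying 1-minute average. The workload itself is made up: one runnable task, plus a 200 ms burst to 15 runnable tasks every 5 seconds that happens to fall between samples. All names and numbers here are illustrative assumptions, not measurements.

```python
import math

SAMPLE_INTERVAL_MS = 5000  # assumption: load sampled about every 5 seconds
DECAY_1MIN = math.exp(-SAMPLE_INTERVAL_MS / 60000.0)  # 1-minute decay factor


def instantaneous_load(t_ms):
    """Hypothetical workload: load 1, bursting to 15 for 200 ms every 5 s."""
    phase = t_ms % 5000
    return 15 if 1000 <= phase < 1200 else 1


def simulate(duration_ms=300000, step_ms=100):
    """Return (sampled load average, peak load, true time-averaged load)."""
    avg = 1.0
    peak = 0
    total = 0
    steps = duration_ms // step_ms
    for i in range(steps):
        t_ms = i * step_ms
        load = instantaneous_load(t_ms)
        peak = max(peak, load)
        total += load
        if t_ms % SAMPLE_INTERVAL_MS == 0:
            # Exponentially decaying average over the sampled values only.
            avg = avg * DECAY_1MIN + load * (1.0 - DECAY_1MIN)
    return avg, peak, total / steps


if __name__ == "__main__":
    avg, peak, mean = simulate()
    print(f"sampled load average: {avg:.2f}, peak: {peak}, true mean: {mean:.2f}")
```

In this contrived case the sampled average never moves from 1 even though the run queue repeatedly holds 15 tasks. Real bursts will not line up so neatly with the sampling, but the shorter they are, the less often a sample catches one.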

CK - I ran the following benchmark on a dual Xeon system: 8 physical cores and 8 hyperthreaded cores = 16 total. My benchmark was running a make -j16 bzImage modules on linux v3.0.4 source code using the Arch Linux .config file. The boxplot shows compile time and the distributions for each of the 10 runs per kernel. As usual, your shit outperforms mainline (identified as "stock" in my plots). On this machine, there isn't a statistically significant difference between bfs v0.406 and bfs v0.410 :(

When I get some CPU time on my home system (X3360, 4 physical cores and 0 hyperthreaded cores), I will repeat the experiment on it and see if there is a bigger difference.

Serious regression to report with v0.410 using the ondemand governor on my quad core workstation: when I tried to load virtualbox, it spit back a bunch of errors related to my vdi file. After I forcefully quit vbox, nothing worked properly. For example, grub-mkconfig -o /boot/grub/grub.cfg just froze up. After rebooting into my 3.0.4 kernel with bfs v0.406 everything was fine. I repeated with v0.410 and again, vbox errors :(

Interesting. Thanks for testing. No regression is probably the most important part of this code, and even if the performance is the same at -j16 on 16x, that's fine by me as it will be hard to overload a machine like that in normal use! The vdi regression concerns me more. What were the errors exactly? Was there anything in dmesg/syslog?

CK - well, this time I nuked my vbox modules and recompiled them after booting into my kernel with bfs v0.410 and no errors. Perhaps that was the problem before? I NEVER needed to rebuild modules like this before. Only when a two point kernel is released...

So, is this a one-off occurrence, the result of solar flares (you read that article?) or what...

CK - I am experiencing a serious lack of responsiveness when I compile on my workstation and browse the web or use any GUI app. The mouse does not scroll smoothly nor does my cursor move smoothly while typing. As soon as I stop the compilation, desktop responsiveness returns to normal.

It addresses a bug in the deadline calculation. It's essential. I'm trying to accumulate fixes before the next test release as this is a major change, as you can imagine. There is also a suspend/resume regression (like there always is with scheduler changes, sigh).