A development blog of what Con Kolivas is doing with code at the moment, with the emphasis on the Linux kernel, MuQSS, BFS and -ck.

Friday, 26 May 2017

linux-4.11-ck2, MuQSS version 0.156 for linux-4.11

Announcing a new -ck release, 4.11-ck2 with the latest version of the Multiple Queue Skiplist Scheduler, version 0.156. These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but configurable for any workload.

linux-4.11-ck2

MuQSS

MuQSS 0.156 updates

- Fixed failed UP builds.

- Remove the last traces of the global run queue data, moving nr_running, nr_uninterruptible and nr_switches to each runqueue. Calculate nr_running accurately at the end of each context switch only once, reusing the variable in place of rq_load. (May improve reported load accuracy.)

4.11-ck2 updates

- Make full preempt default on all arches.

- Revert inappropriately reverted part of vmsplit patch.

Enjoy!

Please enjoy.

-ck

I seem to have unintentionally deleted the -ck1 post, sorry about that.

The symptom is that on a 2-CPU system (FUJITSU ESPRIMO Mobile V6555 laptop w/ Intel Core2 Duo T6570) with a single CPU-intensive process, the frequency does not get raised at all. In this case each core probably sees about 50% load, which is lower than the 80% default threshold of the conservative governor.

If the process is pinned to one of the cores, the frequency of the core the process is pinned to rises to the maximum as expected.

Running two cpu-intensive processes on this 2 core system raises the frequency of both cores as expected.
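The arithmetic behind these three cases can be sketched roughly as follows. This is an illustration only: it assumes the load of CPU-bound tasks is spread evenly over the cores they run on, and `per_core_load` and `governor_raises` are hypothetical helper names, not part of the kernel or any cpufreq tool. The 80% figure is the default threshold mentioned above.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the reasoning; not a real tool.

# per_core_load TASKS CORES
# Approximate per-core load (percent) when TASKS CPU-bound tasks are
# spread evenly over CORES cores, capped at 100.
per_core_load() {
    load=$(( $1 * 100 / $2 ))
    if [ "$load" -gt 100 ]; then
        load=100
    fi
    echo "$load"
}

# governor_raises LOAD THRESHOLD
# Succeeds (exit status 0) when LOAD is above THRESHOLD.
governor_raises() {
    [ "$1" -gt "$2" ]
}

# One and two busy tasks on a 2-core system, against the 80% threshold.
for tasks in 1 2; do
    l=$(per_core_load "$tasks" 2)
    if governor_raises "$l" 80; then
        echo "$tasks task(s): ~$l% per core, frequency ramps up"
    else
        echo "$tasks task(s): ~$l% per core, frequency stays low"
    fi
done
```

A task pinned to one core corresponds to `per_core_load 1 1`, which is above the threshold and matches the pinned case described above.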

Any ideas how to fix this?

By the way, powertop seems to mess up something in the kernel and frequencies stay low after starting powertop. Changing the governor to something else and then back again to conservative fixes this issue.
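The governor workaround can be scripted along these lines. This is a sketch only: it assumes the usual cpufreq sysfs layout under /sys/devices/system/cpu, the function names are made up for illustration, and `powersave` is just an assumed stand-in for "something else" (check scaling_available_governors on the machine in question). Writing to the real files needs root.

```shell
#!/bin/sh
# Sketch only; helper names are hypothetical. The optional ROOT argument
# lets the functions be tried against a scratch directory first.

# set_governor NAME [ROOT]
# Write NAME into every cpuN/cpufreq/scaling_governor file under ROOT.
set_governor() {
    root=${2:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        echo "$1" > "$f"
    done
}

# bounce_governor [ROOT]
# Switch away from conservative and back again.
# "powersave" is an assumed choice of intermediate governor.
bounce_governor() {
    set_governor powersave "$1"
    set_governor conservative "$1"
}
```

Run as root after powertop has been started, `bounce_governor` with no arguments would apply the switch to all CPUs.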

It's intrinsic to the design to minimise latency that tasks will move around to get the lowest latency scheduling for them. If you want it to do that less, disable interactive mode:

echo 0 > /proc/sys/kernel/interactive
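For anyone wanting to check the setting before changing it, a small sketch follows. The path is the one quoted above and only exists on MuQSS/-ck kernels; the helper names are hypothetical, and the assumption that only 0 and 1 are valid values should be checked against the MuQSS documentation shipped with the patch. Writing to the real file needs root.

```shell
#!/bin/sh
# Sketch only; helper names are hypothetical.

# show_interactive [PATH]
# Print the current setting, or a note if the file is not there
# (for example on a kernel without MuQSS).
show_interactive() {
    f=${1:-/proc/sys/kernel/interactive}
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "not available"
    fi
}

# set_interactive VALUE [PATH]
# Write 0 or 1 (assumed to be the only valid values).
set_interactive() {
    f=${2:-/proc/sys/kernel/interactive}
    case $1 in
        0|1) echo "$1" > "$f" ;;
        *) echo "value must be 0 or 1" >&2; return 1 ;;
    esac
}
```

`show_interactive` with no arguments prints the current value, and `set_interactive 0` as root is equivalent to the echo command above.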

Thank you both for bringing this up and for clarifying! The combination of symptoms, design intentions and possible successful solution makes it easier to understand how MuQSS works under certain conditions. :-)

BR, Manuel Krause

Mmmh, hasn't there been an issue with iwlwifi users some time ago, discussed on here?! Don't remember completely. Maybe switching from builtin to module or vice versa may help, or getting a fresh firmware.

Yeah I've been logging / graphing (munin, because it works for what I need) ever since the -ck2 bump and when I'm actually AFK and things are idle it does seem to show reasonable "very close to zero" loads. Good job it's working sanely :)

Fwiw, I've been using MuQSS on my old netbook (Eee 701 with Celeron M ULV 353) for a long time, and with BFS before that. However, mine is UP, vs the Z520's SMT. Perhaps providing the panic info would help. Did you use the vanilla kernel's config?

I have a quad-core CPU (and no SMT). The output of 'top' seems to be kind of right, but I have a hunch it's also off and always calculating a result that's double of what's in 'htop'. It might just get clamped to 100%, and that makes the numbers up to 4 look good.

This is supposed to be in a busy loop for 10ms, then sleep for 10ms, and then this all repeats. It's supposed to show 50% in the CPU% column of 'top' and 'htop', and it behaves exactly like that with a 4.11.3 kernel using CFS.

When changing that "$t" in the while loop to "$t/2", "$t/3", "$t*2", "$t*3", it's supposed to result in 33%, 25%, 66%, 75% CPU usage, and that's again what happens with CFS.
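Those expected percentages follow from a simple duty-cycle calculation, sketched below. It assumes that only the busy phase is scaled while the sleep phase stays at t, which is inferred from the percentages quoted rather than from the script itself; `duty_percent` is a hypothetical helper name. Integer division truncates, which is why 2/3 shows up as 66.

```shell
#!/bin/sh
# Sketch only; duty_percent is a hypothetical helper.

# duty_percent NUM DEN
# Busy phase lasts t*NUM/DEN and the sleep phase lasts t, so
# busy / (busy + sleep) = NUM / (NUM + DEN); t cancels out.
duty_percent() {
    echo $(( 100 * $1 / ($1 + $2) ))
}

# t, t/2, t/3, t*2, t*3 expressed as NUM DEN pairs.
for ratio in "1 1" "1 2" "1 3" "2 1" "3 1"; do
    set -- $ratio
    echo "busy = t*$1/$2 -> $(duty_percent "$1" "$2")% expected CPU usage"
done
```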

Then going to the kernel using MuQSS, the displayed numbers are jumping around a lot, so I have to guess the average. With CFS, the percentage shown was quite stable. The numbers I see with MuQSS are like this:

expected, htop, top
25, 33, 35
33, 43, 47
50, 55, 69
66, 66, 81
75, 71, 91

The numbers displayed in top/htop were changing by over 10% from second to second, so they are really just guesses. I wrote down min/max values that I saw and used the average between those two. This wasn't needed at all when testing with CFS where top and htop showed pretty stable numbers.

The scheduler can't physically make the virtualised operating system use any more CPU and what you are seeing is almost certainly simply sampling error differences between CFS and MuQSS. The CPU accounting is performed differently by both schedulers.

Windows' timers work differently to linux and I'm guessing they happen to be landing at exactly the sampling points used in muqss. Fixing it is unlikely any time soon without knowing exactly what's causing it and I'm afraid I don't have such spare time to dedicate.

Hi, I've been using the ck patches for a long time. I was experiencing a freeze in Xorg (firefox + gnome) every 2-3 seconds with the default 100Hz tick even before MuQSS, so I had been manually setting it to 1000Hz. The first iterations were better, I think, but I'm still getting that now with the default 100Hz. Even with 1000Hz it's still there, but much less than before, so it might not even be noticeable. Any idea how I can track down its source? CPU: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz