A development blog of what Con Kolivas is doing with code at the moment with the emphasis on linux kernel, MuQSS, BFS and -ck.

Saturday, 16 August 2014

BFS 450, 3.16-ck1

Announcing a resync and update of BFS for linux kernel 3.16.x. Coding has proven a nice distraction from unpleasant life events so I've been able to bring the patch up to date with the latest kernel.

A number of minor fixes queued up since 3.15-ck1 made their way into this patchset, along with some changes inspired by the development work of Alfred Chen (thanks!).

The major feature upgrade in this one is the inclusion of SMT nice, as discussed at length on this blog. This version of BFS includes an updated version of SMT nice beyond version 6 posted here, with one change: 25% of the CPU time of any nice level of SCHED_NORMAL tasks can be shared with any other nice level, over and above the nice-based CPU distribution. This is to capitalise on the slightly increased throughput that is available by using the sibling CPU concurrently, without costing higher priority processes too much CPU. In addition, it dramatically reduces the massive latencies that heavily niced tasks can otherwise sometimes see with SMT nice enabled, by dithering the metering out of CPU to a task instead of giving it all as a single burst only when the task is entitled to CPU.

Making SMT nice configurable means users can choose whether they still want the standard behaviour. The config option will recommend that users who enable the SMT scheduler option also enable the SMT nice option. I believe this to be a good default choice for virtually all desktop users, and selectively for server users if they depend heavily on the use of 'nice' or scheduling policies for their workloads (but otherwise it should be disabled).

EDIT: A build fix that prevents SMT nice from being enabled on kernels built without SMT support is here: bfs450-nosmt-buildfix.patch
Just disabling SMT nice will achieve the same thing for those affected.

@ck In 0450, I can see the tsk_is_polling check is totally removed from resched_task(). But debug code shows that the TIF_POLLING_NRFLAG bit is set when checked in resched_task(), about 10000 times in 2 minutes: [ 116.728800] bfs: resched_task 9571. Would you give a further hint as to why this check was removed in BFS? Thanks.

@pf Good finding. scheduler_ipi() is removed by my patch "[BFS] Remove runqueue wake_list". I did miss the preempt_fold_need_resched() call in it. ck's patch fixed it. Do you have a workable older kernel version (maybe 3.13 or 3.14) to jump back to and check the ksoftirqd behaviour?

@pf Would you apply this patch on top of bfs450-sched-ipi.patch and see if it helps with the ksoftirqd CPU usage issue? It enables the mainline TIF_POLLING_NRFLAG checking routines, which should help with IPIs somehow, but I am not sure if it helps with the ath9k module.

Thanks PF. I'd spent the last couple of days auditing code to see what might be responsible and that was the only solution I could come up with. The behaviour with this patch is definitely correct, but it's a bit disappointing because it means there's something fundamentally different in BFS handling the resched flag compared to mainline, and I didn't intend to start diverging from mainline in this way. I'll keep auditing the code to see if there's an obvious trigger to act on this flag in a different place that I've missed, but it's fair to say this is a sane solution for the time being and if I can't come up with anything, I'll just run with it.

Aha! Now we're talking! This last set of patches is the correct fix (unlike the tifcheck patch). Let's try it for a day or two and then I can formalise these changes as a new BFS if nothing shows up. Thanks for testing!

@PF: From linux 3.13, setting just the "tif needs resched" flag alone was not enough to trigger a descheduling from certain places in the code; it also needed the "preempt needs resched" flag set to trigger a different type of descheduling, to hand over to another process or to kick a task off a CPU where it should no longer be.
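
To make the two-flag requirement concrete, here is a minimal user-space sketch in C. It is a toy model and not kernel code: the struct and every toy_* name are hypothetical. It assumes the x86-style arrangement in mainline since 3.13, where the "preempt needs resched" bit is kept inverted inside the preempt count, so that a count of zero means both "preemption is enabled" and "a resched is wanted".

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the inverted "preempt needs resched" bit (hypothetical name). */
#define TOY_PREEMPT_NEED_RESCHED 0x80000000u

struct toy_cpu {
	bool tif_need_resched;   /* models the per-task "tif needs resched" flag */
	uint32_t preempt_count;  /* nesting depth plus the inverted bit above */
};

/* Start with no resched wanted, so the inverted bit is set. */
static void toy_init(struct toy_cpu *cpu)
{
	cpu->tif_need_resched = false;
	cpu->preempt_count = TOY_PREEMPT_NEED_RESCHED;
}

/* Set only the task flag, as a wakeup from another CPU might do. */
static void toy_set_tif_need_resched(struct toy_cpu *cpu)
{
	cpu->tif_need_resched = true;
}

/* Fold the task flag into the preempt count, in the spirit of
 * preempt_fold_need_resched(). Clearing the inverted bit means "resched wanted". */
static void toy_fold_need_resched(struct toy_cpu *cpu)
{
	if (cpu->tif_need_resched)
		cpu->preempt_count &= ~TOY_PREEMPT_NEED_RESCHED;
}

/* The cheap check made at reschedule points: it looks only at the preempt count. */
static bool toy_should_resched(const struct toy_cpu *cpu)
{
	return cpu->preempt_count == 0;
}
```

In this model, setting the task flag alone leaves toy_should_resched() false; it only becomes true once the flag has been folded into the preempt count, which is the step that goes missing if nothing calls the equivalent of preempt_fold_need_resched().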

The recurring theme is that _cond_resched no longer works properly in BFS. It presents as a different bug for the i8k module not unconditionally rescheduling when the affinity changes, but it is the same issue as the ath9k tasklet not properly rescheduling and ksoftirqd spinning without rescheduling. Now to go back to 3.13 and see what changed at that time and how it broke.

I would only comment that my system is running fine with 3.16.1 and BFS+SMT on an i7. Hibernate and suspend work trouble-free (now with an Intel Wireless 7260 card and no longer with the ath9k module). I had in mind that there was a higher load value, but this was not the case. And the NFS server on my machine gives the same throughput as without BFS, or at least enough to stream some HD videos over wireless. Make operations may take somewhat longer now, but that was the goal ;) Or in other words, no drawbacks.

So, if I read correctly, the patches:
(1) http://ck.kolivas.org/patches/bfs/3.0/3.16/test/bfs450-sched-ipi.patch
(2) http://ck.kolivas.org/patches/bfs/3.0/3.16/test/bfs450-tifcheck_in_cond_resched.patch
are of benefit? And this one is not needed?:
(3) http://ck.kolivas.org/patches/bfs/3.0/3.16/test/bfs450-resched-scap.patch

Do these patches "only" (but thankfully!) heal the issues post-factum and others reported, or are they considered as bug-fixes for BFS?

The NEW patch set also works well for a non-affected system with a 3.16.y-gc patched kernel, applied on top, here: bfs450-resched-scap.patch, bfs450-sched-ipi.patch, bfs450-add-preempt-resched.patch (No.1 with fuzz o.k., No.8 failed, as already removed, o.k.)

I hope Alfred Chen does consider this safe... ^^

Thank you all, and best regards, Manuel Krause

BTW, I knew the "behavioural issues" are meant regarding my system, that's why I found it so funny as it can have double meaning for real life..

I'm watching this thread. I will update my -gc branch by rebasing on 0450 and syncing with 3.16.2 from mainline, hopefully next week. Since ck said debugging is not finished, I will not include these 3 patches, so you can apply updated ones if you are affected by similar issues.