A development blog of what Con Kolivas is doing with code at the moment with the emphasis on linux kernel, BFS and -ck.

Wednesday, 10 July 2013

BFS 0.440, -ck1 for linux-3.10

I finally managed to set up some 3G wireless internet in this remote mountain village I'm staying in (probably the first to ever do so). After a few revisions I was able to bring BFS into line with mainline. There are no significant changes to the design itself, but hopefully a few minor fixes have come along as a result of the resync, as I also carved out bits of code not relevant to BFS and tinkered with the shutdown mechanism a bit more. As for the new tickless-on-busy-CPU feature from mainline, it is not being offered in BFS: it is quite orthogonal to a design that so easily moves tasks from one CPU to another, and it provides no advantage for the desktop/laptop/tablet/PDA/mobile device/phone/router etc. that BFS is targeted towards.

Some of the configuration code was also changed, because the last version allowed you to generate an invalid configuration. You might get some strange warnings about the IRQ TIME ACCOUNTING configuration option, but they should be harmless.

After careful consideration, I've decided to remove the remaining -ck patches and just make the -ck patchset BFS with some extra default config options and the -ck tag. As I've said previously, those other patches were from long ago, the kernel has changed a lot since then, and I've been unable to confirm they do anything useful any more, whereas there have been reports of regressions with them.

54 comments:

Btw, FYI: it looks like RFC patches have been published for a new scheduler "that's power-aware and aims for offering power-efficient performance ... The patch set introduces a cpu capacity managing 'power scheduler' which lives BY THE SIDE of the existing (process) scheduler."

Nice job, CK. Running just fine with 300 Hz tick rate and haven't been able to trip the shutdown freeze which was solved in 3.9 by jacking the tick rate up to 1k. Will post if I see it. Will also post the usual Pepsi Challenge comparing CFS to BFS in the 3.10 tree when I get some time.

Thanks for picking that up. These new RCU features are designed to help debug the new full dynticks option which BFS doesn't even support so I never even bothered to try enabling them with the latest BFS (since they're not supposed to do anything unless you enable full dynticks, yet they add significant overhead). I guess I should have masked the options for it to not even be possible to enable them.

Hello, has anyone else had problems with suspend to disk/RAM? Since 3.9 it's been broken on my machines with the BFS patches. Shutdown was broken too with the first 3.9 patches, but it worked again with 3.9.4 or 3.9.5... What's the best way to debug this? The last thing I get is a blinking cursor, no stack trace...

So I recompiled the kernel again (3.9.10 on Xubuntu 12.04), but this time I changed to CONFIG_HZ=1000 (was CONFIG_HZ=300). And now I can suspend/resume, as well as restart/shutdown, normally.

Of course I cannot be certain that this one change rectified the problem I was experiencing with suspend, as I have changed other options as well (using graysky's linux-ck-atom-3.9.10-1-i686 config as guide, customizing to my preference and selections appropriate to a Debian-based distro from there).
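To confirm which tick rate a given kernel configuration actually selects, the CONFIG_HZ line can be read straight out of the config text. The following is only a minimal sketch: the function name `read_config_hz` and the sample text are hypothetical, and where the config lives on a given system (for example a file under /boot, or /proc/config.gz if the kernel exposes it) is an assumption that varies by distro.

```python
import re

def read_config_hz(config_text):
    """Return the CONFIG_HZ value found in kernel config text, or None."""
    # Match the exact option name so CONFIG_HZ_300 / CONFIG_HZ_1000 are ignored.
    match = re.search(r"^CONFIG_HZ=(\d+)$", config_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None

# Hypothetical config excerpt, for illustration only.
sample = """\
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
"""
print(read_config_hz(sample))
```

Comparing the value from the old and the new config is a quick way to check that the tick rate was really the setting that changed between two builds.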

Anyway, I'm a happy camper now, and thanks to graysky for his public repository and architecture-specific configurations.

No, that didn't help, sadly... I tried enabling some debug stuff but all I see is the blinking cursor. Any help to pin down this issue would be appreciated. If I switch to CFS everything works, but that's not a good solution.

Since kernel 3.9.8 I had issues with suspend/resume, too, using suspend-to-disk usually. As I often try different kernel config options and there were many changes in 3.10 and CK's patch drops I wasn't able to track this down to a particular patch/config/setting with a minimum of rational reasoning -- at least I tried to. ;-)

- disabled radeon UVD, as it completely breaks suspend (with a not-accepted patch from LKML: http://pastebin.com/0mRGb224 and passing radeon.no_uvd=1 on the kernel command line)
- applied mm-drop_swap_cache_aggressively.patch from 3.9-ck1
- CONFIG_HZ_300
- CONFIG_HIGHPTE=n
- (for 3.10.3 I newly tried Transparent Hugepage Support, which enables memory compaction and page migration as well)
- set /proc/sys/vm/dirty_background_ratio to 3 (openSUSE seems to default to 5)
- set /proc/sys/vm/dirty_ratio to 8 (openSUSE seems to default to 10)
- additionally, I had set 'early writeout = n' in /etc/suspend.conf some time ago
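The two writeback thresholds in the list above are plain text files under /proc/sys/vm, so they can be read and changed with ordinary file I/O. This is a rough sketch only: the helper names `read_vm_settings` and `write_vm_settings` are hypothetical, the values are the ones quoted in the list, and the demo runs against a scratch directory because writing to the real /proc tree needs root.

```python
import tempfile
from pathlib import Path

VM_KEYS = ("dirty_background_ratio", "dirty_ratio")

def read_vm_settings(base="/proc/sys/vm"):
    """Return the current writeback thresholds as a dict of ints."""
    return {key: int((Path(base) / key).read_text().strip()) for key in VM_KEYS}

def write_vm_settings(settings, base="/proc/sys/vm"):
    """Write new thresholds; on the real /proc tree this needs root."""
    for key, value in settings.items():
        if key not in VM_KEYS:
            raise KeyError(f"unexpected setting: {key}")
        (Path(base) / key).write_text(f"{value}\n")

# Demo against a scratch directory so the sketch runs without root.
with tempfile.TemporaryDirectory() as scratch:
    # Start from the defaults the comment attributes to openSUSE.
    write_vm_settings({"dirty_background_ratio": 5, "dirty_ratio": 10}, base=scratch)
    # Apply the lower values from the list.
    write_vm_settings({"dirty_background_ratio": 3, "dirty_ratio": 8}, base=scratch)
    print(read_vm_settings(base=scratch))
```

Values written this way do not survive a reboot, so anyone who finds them helpful would still need to make them persistent through their distro's usual sysctl configuration.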

So, maybe one or more of these is helpful to you; here it survived 3 consecutive suspend/resume cycles within ~2 1/2 days of uptime.

Thanks for the help, but none of these tips helped. But I could pin it down further. I enabled no_console_suspend and let my machine try to suspend; now I see that suspending gets to the last phase, and I see the "CPU x is now offline" stuff. But now the strange thing, most likely related to the BUG: sometimes it gets to CPU 4 before it hangs and sometimes to CPU 6, but never further. This stuff is in "kernel/cpu.c" in the "disable_nonboot_cpus" function, but I'm not a kernel programmer, so I'm stuck there.
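When comparing several failed suspend attempts, it can help to extract which CPUs were reported offline before the hang from whatever console output was captured. A minimal sketch, assuming the messages contain the wording quoted in the comment above ("CPU x is now offline"); the function name `offlined_cpus` and the sample log text are hypothetical.

```python
import re

# Assumed message wording, taken from the comment above; the prefix may differ.
OFFLINE_RE = re.compile(r"CPU\s*(\d+) is now offline")

def offlined_cpus(log_text):
    """Return the CPU numbers reported offline, in the order they appear."""
    return [int(m.group(1)) for m in OFFLINE_RE.finditer(log_text)]

# Hypothetical capture of one failed attempt, for illustration only.
sample_log = """\
Disabling non-boot CPUs ...
CPU 1 is now offline
CPU 2 is now offline
CPU 3 is now offline
CPU 4 is now offline
"""
cpus = offlined_cpus(sample_log)
print("last CPU taken offline:", cpus[-1] if cpus else None)
```

Running this over each capture shows at a glance whether the hang always follows the same CPU or moves around between attempts.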

I see now that my proposed workarounds don't heal the suspend/resume problems, which must be somewhere else in the kernel. Three times it works, then the 4th or 5th attempt fails. This is still on my old unicore PIII Tualatin with ATI Radeon HD 4350 Gfx and the open-source radeon driver (and the known setup).

When resuming, if the BUG finds its way to the logs, I usually get something like this: http://pastebin.com/91fy4Rcr The unlink_anon_vmas can come earlier than __rb_erase_color with kernel versions < 3.10.4.

Don't know whom to contact about this and whether it is BFS related at all. I haven't tested against CFS so far, but I'll do now (don't want, don't want, don't want) ;-)

look for fixes for rbtree on lkml or following patches for 3.10 or file a bug report/ask on lkml - could be triggered by BFS or it's simply an issue which got introduced recently - didn't see it so far with 3.10

the unlink_anon_vmas related issue: might be an inherent issue in preemption code that gets triggered by BFS - search for related messages on lkml

AMD Phenom II / Radeon HD6950 here refuses to suspend to RAM or disk with kernel 3.10.1-ck. With the Arch 3.9.9 it suspends, albeit with some errors ([Firmware Bug]: cpu x, try to use APIC500 (LVT offset 0), with x being every CPU core present except 0). Was told this was a BIOS bug for AMD K10 chips.

HD turns off, screen shows a non-flashing cursor at the top left. Hard boot required to get things going again.

Also compiled it for an AMD E-450 (or was it E-350..) and there it suspends correctly.

Hmmm. On one of my machines I got a panic with a (repeated) "BUG: unable to handle kernel paging request" (and 10 pages of info). The IP points to anon_vma_clone+0x7f/0x160. Since the kernel (3.9.8) is patched (BFS, BFQ) and tainted (nvidia), and since the issue is most likely not easily reproducible, I will never be able to report this bug to the right people... sigh.

Hmmm. Another machine panicked on me, this time under Xorg, and I got no screenshots or logs. Common to both machines was: kernel 3.9.8, CK1 patch, BFQ v6r2 patch, nvidia blob, and the panic occurred during or minutes after resuming from STR. I've upgraded the kernels to 3.9.10 and I'll keep observing... Going 3.10 soon.

Finally updated to kernel 3.10. No problems so far with BFS. 3.10 is probably going to be a longterm kernel, so hopefully I'll be sticking with it for the next year or so. I'm sick and tired of new kernels breaking half my software.

Not here: using an i7, s2ram and s2disk work fine. I had a problem with s2disk, which could not find the swap disk, but that was another story.

If I remember correctly, since 3.9 (until kernel 3.10.3?) I had a problem with CPU frequency scaling: the load was nearly 0, but the CPUs ran at max MHz. There was a new config switch for Intel governors (CONFIG_X86_INTEL_PSTATE); my .config still had the old ondemand, and this was the cause. For additional info: https://plus.google.com/117091380454742934025/posts/2vEekAsG2QT
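One way to see which frequency scaling driver and governor are actually in effect is to read the per-CPU cpufreq files in sysfs. The sketch below is illustrative only: the helper names `read_governors` and `make_fake_tree` are hypothetical, the driver and governor values in the demo are sample data rather than anything reported in this thread, and the demo builds a scratch tree so it runs on machines without cpufreq support.

```python
import tempfile
from pathlib import Path

def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each cpuN directory name to its (scaling_driver, scaling_governor)."""
    result = {}
    for gov_file in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor")):
        cpu = gov_file.parent.parent.name
        driver_file = gov_file.with_name("scaling_driver")
        driver = driver_file.read_text().strip() if driver_file.exists() else None
        result[cpu] = (driver, gov_file.read_text().strip())
    return result

def make_fake_tree(root, entries):
    """Hypothetical helper: build a sysfs-like tree for the demo."""
    for cpu, (driver, governor) in entries.items():
        cpufreq_dir = Path(root) / cpu / "cpufreq"
        cpufreq_dir.mkdir(parents=True)
        (cpufreq_dir / "scaling_driver").write_text(driver + "\n")
        (cpufreq_dir / "scaling_governor").write_text(governor + "\n")

# Demo with sample values; call read_governors() with no argument on a real system.
with tempfile.TemporaryDirectory() as scratch:
    make_fake_tree(scratch, {"cpu0": ("acpi-cpufreq", "ondemand"),
                             "cpu1": ("acpi-cpufreq", "ondemand")})
    for cpu, (driver, governor) in read_governors(scratch).items():
        print(cpu, driver, governor)
```

If the reported driver or governor is not what the .config was meant to select, that points at a configuration mismatch of the kind described above rather than at the scheduler.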

So Con, thanks for your work on BFS. Using it with the ZEN Kernel.

PS: I only have a performance problem with BFQ in conjunction with BFS: the disk I/O sometimes drops to 10% of normal.

Almost certainly any suspend to disk/RAM issues that arise as a result of patching with BFS are the fault of BFS itself. The complete rewrite of the suspend mechanism after linux-3.8 led to drastic changes being required in BFS to suspend again (see the BFS announce for 3.9). As is increasingly common, I have to say a lack of time and enthusiasm prevents me from investigating much further right now. If suspending is crucial to your everyday activities, and you wish to use BFS, a 3.8 based kernel is your best option. Sorry, I wish I had infinite hours to work on every one of my projects as much as I'd like...

I've dropped my retest against 3.10.x CFS, as I faced new and different and sometimes looping BUGs/OOPSes after suspend-to-disk that never showed up earlier. I have to say, CFS is not more reliable than BFS at all. And the experience with CFS is so bad in comparison to BFS that I'd rather trust the ext4 journalling not to lose too much data when expecting a lockup upon the 5th resume with BFS. CFS may happen to fail on the 2nd resume already, unpredictably. I even reverted some PCI register tunings that had worked until 3.8.y -- with no advantage.

I still, for years now, suspect that there is some severe problem with memory distribution within physical<->swap<->shm that no kernel developer feels responsible for. And I don't have enough knowledge to dig into that.

The three patches are a real benefit! And I don't understand why they haven't been included in 3.10.y yet. I'm now @ 3.10.10. Please see & apply the two related fixes in: http://www.ozlabs.org/~akpm/mmots/broken-out/

I have now adapted my old machine's setup with shm & swap to my new system. Although it is much faster and has SMP now, the old glitches remain: when using swap-backed shm, the system often stalls for a second or so when processing a file from there.