People often ask me why I don't maintain a git tree of my patches, or at least of BFS, to make it easier on myself and on those who download them. As it turns out, a git tree would only be less work for those who download it; it would actually be more work for me to maintain.

I do NOT keep track of the linux kernel patches as they come in during the development phase prior to the latest stable release. Unfortunately I simply do not have the time nor the inclination to care on that level about the linux kernel any more. However I still believe quite a lot in what BFS has to offer. If I watched each patch as it came into git, I could simply keep my fork with BFS and merge the linux kernel patches as they came in, resyncing and modifying BFS along with the changes. When new patches go into the kernel, there is a common pattern of many changes occurring shortly after they're merged: a few fixes going in, some files being moved around a few times, and occasionally the patch being backed out when it's found to introduce some nasty regression that proves a showstopper to release. Each one of these changes - fixes, moves, renames, removals - requires a resync if you are maintaining a fork.

The way I've coded up the actual BFS patch itself is to be as unobtrusive as possible - it does not replace large chunks of code en bloc; it mostly adds files and redirects the build to use those new files instead of the mainline ones. This is done to minimise the effort of resyncing when new changes come. The vast majority of the time, only trivial changes are needed for the patch to even apply. Thus applying an old patch to a new kernel usually just needs fixes to make it apply (even if the result doesn't build yet). This is usually the first step I do in syncing BFS: fix up the rejects until the old patch applies to the new tree.
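The mechanics of that first step look roughly like this (a sketch only; the patch and file names here are illustrative, not the real ones):

```
cd linux-3.7
patch -p1 < ../3.6-sched-bfs-425.patch    # apply the old patch to the new tree
find . -name '*.rej'                      # patch leaves .rej files for the failed hunks
$EDITOR kernel/sched/bfs.c                # fix each reject up by hand
diff -uprN ../linux-3.7.orig . > ../3.7-sched-bfs-rediffed.patch
```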

After that, I go through the incremental changes from mainline 3.6 to 3.7 to find any scheduler-related changes that should be applied to BFS, to 1. make it build with API changes in mainline and 2. benefit from any new features going into mainline that are relevant to the scheduler in general. I manually add those changes and end up with an incremental patch on top of the rediffed one.
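A rough sketch of how such a review can be done (the exact paths vary by release; these are the usual mainline scheduler files):

```
cd linux
git log --oneline v3.6..v3.7 -- kernel/sched/ include/linux/sched.h
git diff v3.6 v3.7 -- kernel/sched/ include/linux/sched.h > ../sched-3.6-to-3.7.diff
# relevant hunks then get folded into the BFS code by hand
```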

Git is an excellent source control tool, but it provides me with almost nothing for this sort of process, where a patch is synced up after 3 months of development. If I were to keep my fork and start merging all the patches between 3.6 and 3.7, it would fail to merge new changes probably dozens and potentially hundreds of times along the way, each failure requiring manual correction. Merge conflicts are just as easy to resolve with git as they are with patch, but they are not easier, and instead of conflicts occurring precisely once in the development process, there would likely be many with this approach.
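Concretely, the workflow being ruled out here would look something like this (a sketch; tag and patch names are illustrative):

```
git checkout -b bfs v3.6                      # fork at the last stable release
git apply --index ../3.6-sched-bfs-425.patch
git commit -m "bfs 425"
git merge v3.7-rc1      # then merge each mainline step as it lands,
git merge v3.7-rc2      # resolving scheduler conflicts every time
# ... and so on, all the way to v3.7
```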

Nor does git provide me with any way to port new changes from mainline to the BFS patch itself. They still need to be applied manually, and if changes occur along the way from 3.6 stable through the 3.7-rc series to 3.7 stable, each change to mainline has to be reflected in BFS. Thus I would end up reproducing all the bugfixes, moves, renames and back-outs that mainline does along the way, instead of doing it just once.

Hopefully this gives some insight into the process and why git is actually counter-productive to BFS syncing.

Thanks as usual CK. For those interested readers, here are the tests comparing the performance of bfs v0.425 and bfs v0.426, as I usually provide. A reminder that these are my `make bzImage` tests on my workstation comparing 4 kernels: 3.6.9, 3.6.9-bfs, 3.7.0, and 3.7.0-bfs. The script is on my github linked below.
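The script itself is the one linked below, but the shape of the test is roughly this (a sketch, not the actual script; the -j value and repeat count are arbitrary choices):

```
cd linux-3.7                  # repeat under each of the four booted kernels
for i in 1 2 3 4 5 6 7; do
    make clean > /dev/null
    /usr/bin/time -f "%e" -o ../times-$(uname -r).txt -a make -j8 bzImage > /dev/null
done
# the recorded wall-clock times are then compared across kernels
```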

That graphic is overly complicated for people who don't do statistics, but since CK has a background in stats, I included them. In plain English, both versions of BFS gave statistically significant DECREASES in compile time compared to CFS. Less time = faster compile.

Look at the median for each group. BFS v0.426 (the current one that patches into the linux 3.7 series) was around 350 ms faster than the corresponding mainline scheduler (CFS).

I thought I just explained why, so I guess you didn't even read the article and just assumed I'm trolling against git... Feel free to grab a 3.6-bfs425 patched kernel and then git rebase v3.7 for yourself and see if you end up with anything like bfs426.

Con Kolivas, please look at what this is doing: http://lkml.indiana.edu/hypermail/linux/kernel/1212.1/03729.html

I do not mean to be mean here. But this person is at long last addressing the faults that were raised against BFS, which you did not want to hear a while back when you stormed away from the Linux kernel main developers and made an ass out of yourself.

As the maintainer of CFS said at the time, BFS was not scaling properly, and your response was that those systems were more complex than desktops. We are now getting desktops with 8 cores and more.

It would be good to get you back into mainline development, Con Kolivas. Hopefully this was a big enough lesson not to be pig-headed, even if the other person appears to be. If it does not work on large machines, sooner or later those large machines will be the general desktop/mobile phone machines. Yes, there is an 8-core mobile phone due out as well. The sub-4-core gains are over.

So scaling well as the number of cores increases has become critical.

The one thing CFS has always had over BFS is better handling of large systems, by avoiding CPUs having to lock to access data as much.

http://cs.unm.edu/~eschulte/classes/cs587/data/bfs-v-cfs_groves-knockel-schulte.pdf

"BFS takes a different approach than both the O(1) scheduler and CFS. BFS uses runqueues like O(1); however, unlike O(1), which has both an active and an expired runqueue per CPU, BFS has only one system-wide runqueue containing all non-running tasks."

Yes, that is where BFS broke from the O(1) scheduler, and it comes back to hurt you as you get more and more cores all needing to talk to the one single system-wide runqueue.

Now a wise person would have investigated why. The worst locking cost is memory controller to memory controller, so use a runqueue per memory controller. Possibly, with a cgroup of processes assigned to the physical cores connected to that memory controller, the cost would be minimal, and you would avoid having to perform load balancing.

Now, you have also stated your hatred of cgroups around processes. That forces you down the path of a runqueue per core, which leads back to costly load balancing.

Basically, the problem you put off a long time ago, Con, is back for revenge.

Clearly you do not understand me at all then. If regaining my life makes me an ass then so be it - I shall remain an ass. As an amateur hacker it is absolutely impossible to maintain that degree of interaction with the linux kernel in my spare time without it affecting my personal life and health. So as it stands, I don't actually want to engage them again at that price. There is no "problem I put off long ago" as far as I can see. You make it sound like I'm obliged to do something. I never once pretended BFS was a fix for everything, nor was I trying to make it so. I was the one who pointed out all its faults long before that paper came out.

As for the actual code, I shall respond in time to the email as appropriate.

Furthermore... BFS didn't exist when I "stormed off", by the way. And it had nothing to do with the scalability of the Staircase Deadline scheduler, which was what was considered the problem at the time. But time muddies the story and people mix up the new and the old, so it's understandable.

"...stormed away... and made an ass out of yourself". Smells like an uninformed troll. I for one thank CK for continuing his work DESPITE what the "Linux kernel main developers" said and wanted, it's thanks to him that my Linux-based laptop is responsive.

The limited-scaling argument to which you refer needs data to support it, AFAIK. Look back a few posts to see that on a dual quad-core (hyperthreaded) machine, a `make -j16` endpoint clearly establishes that BFS outperforms CFS. It would be interesting to see the numbers on a larger machine - what is the point at which it breaks down, etc. I don't think that we will see 16-core desktops/laptops in the near future :p

Thanks for the detailed explanation. I always wondered why you don't use git for this. I didn't consider how different it would be to do countless smaller changes throughout the process of a kernel release instead of one larger rebase of your code.

Time to put your new ck patch to work on my shiny new (to me) Core 2 Duo. It always helped dramatically on my old Pentium 4 boxes.

Unfortunately the new BFS does account the process times wrongly again :( I had this issue with linux-3.4.x-bfs, worked around it with a higher RCU boost on linux-3.5.x-bfs, had no such issues with linux-3.6.x-bfs, and now manipulating the RCU .config options doesn't help with linux-3.7.1-bfs.

As a simple user I feel dependent on observing htop (ordering by time) for checking the functioning of my system(d). Also I feel very insecure observing some 50 million hours on some tasks :(

At this point I want to add some experiences I've made with the 3.6.x series + ck/bfs + bfq. At some point suspend-to-disk broke. I tried to investigate further and came to the conclusion that my switch from pure bfs+bfq to ck+bfq made the difference. So... long story short... On my old machine with a 1.4 GHz CPU, 2 GB RAM, 3 GB shm and 4 GB swap, my normal operations AND suspend-to-disk+resume worked again when leaving swappiness = 60 (the openSUSE kernel-source default; ck = 10 always distorted my desktop latency), setting dirty_ratio = 6 (default = 20; ck = 1), and setting dirty_background_ratio = 2 (default = 10; ck = 1). [The last two settings are the lowest known to work with suspend, +1 to be on the safe side.]
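For reference, applied at runtime those settings look like this (the values are the ones above that worked on this machine, not general recommendations):

```
sysctl -w vm.swappiness=60
sysctl -w vm.dirty_ratio=6
sysctl -w vm.dirty_background_ratio=2
# or persistently in /etc/sysctl.conf:
#   vm.swappiness = 60
#   vm.dirty_ratio = 6
#   vm.dirty_background_ratio = 2
```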

I haven't retested this on 3.7.1 as thoroughly as on 3.6.x so far, but it's been working fine for 5 days of uptime with 3.7.1 and regular nightly suspends! Is this difference due to my outdated computer?

Thanks, I'll try dirty_ratio = 6 and dirty_background_ratio = 2 then. Looks like there are lots of contradicting opinions on the net about what these values should be set to. In 2007 Linus decreased the ratios to 10/5, but people complained about poor DB performance.

PS: I have perfected the low-jitter config on linux, for the mainline kernel. Please see the jitter links on my blog: www.paradoxuncreated.com

I also tested BFS during this. It seems the general experience of jitter (lost frames, poor frame timing) is worse with BFS. So if I were you, I'd drop it. Get an Intel E5 workstation on top of this, and even Windows won't stutter.

Unless of course you have some idea you want to realize. But then some measure of fairness seems quite good.

I was looking for actual numbers on your blog. It seems the only thing I've found is your conclusion that with BFS native games run with almost as low jitter as under CFS, while games under wine perform as well as under a real Windows OS. Which is a good thing :)

@Paradox Knows: Is it an Islamic/Muslim/halal kernel that you provide? -> After reading your BLOG while keeping up all my "western" tolerance, I'm severely in doubt that this could serve our needs in general. <-

I also managed to read some of Paradox Knows' low-jitter related articles and links (and found some newer config for his local kernel as of 3.6.6).

There are ~440 lines of difference comparing his .config with mine. Some of them are only for SMP & x64, which I don't use. And, of course, his is WITHOUT BFS (or CK) & WITHOUT BFQ. Without an outline of what magic setting makes that kernel "lower-jitter" than BFS/CK + BFQ, I feel left in the dark. Sorry for not checking each of the ~440 differing lines for now.

After reading some more & testing... I can say that Paradox Uncreated's approach of adjusting priorities via schedtool (as in his ljtune script) can still help to achieve better == lower latencies effectively. This at least applies to my low-performance system. (Many months ago I asked Con about adjustments of this kind and he said he didn't use them. O.k. ... ^^) And, remember, Paradox Uncreated doesn't use BFS or BFQ. I do and will use both.
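To give an idea of what such adjustments look like (this is not the actual ljtune script; the processes, policies and values below are only examples):

```
schedtool -R -p 1 $(pidof Xorg)        # give the X server SCHED_RR priority 1
schedtool -n -10 $(pidof pulseaudio)   # renice the audio daemon
schedtool -B $(pidof make)             # push a batch compile to SCHED_BATCH
```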

Hello, I'm testing the multithreading behaviour of BFS, especially the hyperthreading awareness. Here is a sample with the Whetstone MP benchmark:

normal run with 4 threads: MWIPS 9775
with taskset -c 0,1,2,3: MWIPS 11449

As you can see, the taskset variant is ~17% faster. My question now is: is there any way to let BFS prioritize the physical cores until there are more than 4 active threads, and only then use the virtual cores 4,5,6,7 for each new thread? So basically handle my processor as a quad core until there are more than 4 active threads. A behaviour like this would be nice because most of the time we don't use more than 4 cores, and the normal BFS behaviour diminishes performance.
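For reference, the sibling layout can be checked from sysfs before pinning; the CPU numbering and benchmark binary below are just placeholders for a 4-core/8-thread part:

```
cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list   # e.g. "0,4"
taskset -c 0,1,2,3 ./whetstone    # one thread unit per physical core
```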

I saw this benchmark a long time ago (note he calls CFS "CFQ" by mistake) and it is a one-off test for one particular workload. "Wrong" is too strong a word to describe this behaviour, because it depends largely on what endpoint you're measuring. BFS prioritises latency over throughput, and in the relatively unloaded CPU case, BFS shines in its ability to find the earliest available CPU to minimise latency. Doing this slightly sacrifices the throughput of cache-bound, throughput-intensive workloads. BFS is a scheduler designed to optimise interactivity and responsiveness primarily and to maintain good throughput secondarily. The fact that BFS does better at any throughput benchmark compared to the mainline scheduler is a bonus.

There is no such thing as "real" cores versus siblings. Siblings only become SMT siblings when something is bound to the other thread unit on a core. Yes, BFS already does bind to unused "cores" before trying siblings of busy units. However, if they're all in use, it will then find a sibling in the interests of latency rather than hold off and wait to get back on the same core.

In the last versions, from 3.6.x upwards I think, I am having problems making backups with rsnapshot (an rsync-based backup solution) to my mdadm software RAID 5 XFS filesystem. Here you can see a thread where I reported the problem to the xfs mailing list, but it seems to be related to the kernel. Could this be a BFS problem?

This was introduced with BFS 426 for linux-3.7 (the hunk @ line 754) and is not present in the previous versions.

Is there a reason to prefer IRQ_TIME_ACCOUNTING over TICK_CPU_ACCOUNTING? (@ck: do I read the patch correctly that this is what you intended?) The config help says, "... so there can be a small performance impact" with it. And, ck, do you know which takes precedence when both are wrongly set to y, or whether this has side effects?

The CPU accounting in BFS is already using high resolution by default, just like the mainline "IRQ TIME ACCOUNTING", so it doesn't actually change anything. It's just the newly visible kernel option in mainline. I've investigated, and enabling both does nothing harmful to BFS.
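For reference, the two options in question as they appear in a kernel .config (per the above, having both set to y did no harm here):

```
CONFIG_TICK_CPU_ACCOUNTING=y
CONFIG_IRQ_TIME_ACCOUNTING=y
```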

I tend to find BFS doesn't play nice with nice. When I run a processor-intensive task such as a backup or a compile in the background, niced so I can keep working on a responsive machine, BFS seems to thrash, pushing the load up, grinding the desktop to a halt, and taking around 3 times as long to complete the task as the normal kernel scheduler.
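For concreteness, the kind of niced background job meant here is something like the following (the commands are only examples of such a workload, not a prescription):

```
nice -n 19 ionice -c3 rsync -a /home /mnt/backup &
nice -n 19 make -j2 &
```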

For this reason I have always avoided BFS-enabled kernels, but the distro I use (PCLinuxOS) has now removed the non-BFS kernels from its repository, so it looks as if I will be faced with a choice between BFS kernels or compiling my own in future.
