
Comment

So basically we're stuck with CFS, which is inferior to whatever Windows uses and produces worse behavior, and then we wonder why Linux cannot take on Windows?

CFS works well for most workloads with the automatic cgroups (autogroup) patch that was merged in 2.6.38. Performance lags more likely originate in things like the kernel's page replacement algorithm and the block/buffer/VFS/filesystem IO stack. Speaking of which, ZFS is designed to replace most of that. I maintain/develop the Gentoo Linux ZFS packages, so I tend to hear how it works for people. People who migrate to ZFS usually find that their systems are more responsive. The only person who did not see an improvement in interactive response had more RAM than storage, which kept everything cached in RAM no matter what he did.

Comment

When I turned to Linux I formatted my partitions with ext4, since that seemed to be the default. Changing the filesystem is pretty hard right now and I don't want to reformat my HDD. Why isn't ZFS the default then? My mouse stuttering, for example, doesn't always happen when I run out of memory and have to swap, but it's true that swapping tends to degrade performance really badly. Way worse than in Windows: there, even when swapping, I could still listen to music and have continuous mouse movement. It's really frustrating to hear people boast about how great the Linux kernel is, a magnificent piece of engineering, when I can't do basic stuff that I never had a problem with even in Win 98.

The "default filesystem" is a distribution decision; it has nothing to do with the Linux kernel. Getting any distribution to change its filesystem recommendation is hard. It basically needs a drop-in replacement that does things better with zero regressions in even the most obscure scenarios before anyone will consider it. ZFS is the best overall filesystem, but getting a distribution to switch to it will be hard as long as there is a single area where its incumbent filesystem works better. ZFS' main weaknesses are memory management, architecture support and partition management. Basically, ZFS won't work on the WRT54G, and it really isn't compatible with existing installers like anaconda. I do not expect many distributions to switch to ZFS until those issues are fixed. There is nothing stopping end users from jumping ship early, though.

With that said, the continuous mouse movement that you observed while playing music on Windows can likely be attributed to having the display server in the kernel. Things that live in the kernel are always resident, which insulates them from the effects of disk thrashing. Windows put all of the components required to draw the cursor on your screen into the NT kernel, which is why it performed so well. On Linux, the display server and compositor live in userland, so the kernel can page them out to a swap device as it pleases. This is likely why you experience lags. Some people think that changing the CPU scheduler will help, but the effect a CPU scheduler has on this is fairly chaotic in nature. The proper way of handling this would be a more intelligent page replacement algorithm.

ZFS has the ARC, which tends to handle this more gracefully. Unfortunately, there are some outstanding memory management issues that prevent it from handling this as well as it could. Specifically, the Linux kernel's virtual memory support is awful: it lacks support for slab-based allocations, all allocations go through a single lock, and it does not obey GFP flags. LLNL wrote a compatibility shim that attempts to handle this, but it is far from ideal. In addition, mmap()'ed data is currently double-cached between the page cache and the ZFS ARC, which can create churn as the kernel evicts pages required by your display server only to load them back from the ARC. Admittedly, this is better than going to disk, but it still degrades performance.

Comment

Using BFS for desktop users should be OK, as they generally have at most a 4-core + 4-HT system. Servers have more than 8 cores and need the throughput and scalability of CFS.
Desktop users generally need responsiveness.

Comment


Most of the scenarios where people claim that BFS helps seem to involve disk IO. While the more greedy nature of BFS might have an effect on the behavior of the kernel's page replacement algorithm, it does not address actual IO problems.

Comment

The needs aren't the same: servers and desktops have different latency requirements, so it's normal that a specific scheduler could be warranted depending on the situation.

If you need the BFS scheduler then you will use it; otherwise it's better to stick with a sane default. Hacky switches based purely on core count aren't a good idea, because scalability can vary from version to version, and that switch would need updating all the time.

Comment

Can't they add something like a flag that determines whether a memory region is swappable or not, and put my mouse and my music in a part of memory that can never leave main RAM for the HDD?

Supported since the 2.4 kernel (mlockall), but only usable by root.

Why not lock X and your music into memory by default, then (beyond having to run the music player as root)? But what if the computer is in DPMS sleep or a screensaver and tries to do some processing? Sure, you'd want to let X get swapped out then.

Comment

Con Kolivas has some ideas to make the scheduler scalable by automatically trickling down through a hierarchy of runqueues. But he has no time, and no one is sponsoring him...

I found the place where CK wrote about it some time ago.
Perhaps another expert will take up this idea? Because the Linux scheduler is broken! What would you say, for example, about a security concept that provides security for only 90 percent of users?

https://lkml.org/lkml/2012/12/20/509

However the main reason for developing the upgradeable rwlocks was not just to
create more critical sections that other CPUs can have read access. Ultimately
I had a pipe dream that it could be used to create multiple runqueues as you
have done in your patch. However, what I didn't want to do was to create a
multi runqueue design that then needed a load balancer as that took away one
of the advantages of BFS needing no balancer and keeping latency as low as
possible.

I've not ever put a post up about what my solution was to this problem because
the logistics of actually creating it, and the work required kept putting me
off since it would require many hours, and I really hate to push vapourware.
Code speaks louder than rhetoric. However since you are headed down creating
multi runqueue code, perhaps you might want to consider it.

What I had in mind was to create varying numbers of runqueues in a
hierarchical fashion. Whenever possible, the global runqueue could be grabbed
in order to find the best possible task to schedule on that CPU from the entire
pool. If there was contention however on the global runqueue, it could step
down in the hierarchy and just grab a runqueue effective for a numa node and
schedule the best task from that. If there was contention on that it could
step down and schedule the best task from a physical package, and then shared
cache, then shared threads, and if all that failed only would it just grab a
local CPU runqueue. The reason for doing this is it would create a load
balancer by sheer virtue of the locking mechanism itself rather than there
actually being a load balancer at all, thereby benefiting from the BFS approach
in terms of minimising latency, finding the best global task, not requiring a
load balancer, and at the same time benefit from having multiple runqueues to
avoid lock contention - and in fact use that lock contention as a means to an
endpoint.

Alas to implement it myself I'd have to be employed full time for months
working on just this to get it working...
