On Wed, Feb 23, 2011 at 05:10:47PM +0000, Mel Gorman wrote:
> On Wed, Feb 23, 2011 at 05:24:32PM +0100, Andrea Arcangeli wrote:
> > On Wed, Feb 23, 2011 at 04:17:44AM +1030, Arthur Marsh wrote:
> > > OK, these patches applied together against upstream didn't cause a crash
> > > but I did observe:
> > >
> > > significant slowdowns of MIDI playback (moreso than in previous cases,
> > > and with less than 20 Meg of swap file in use);
> > >
> > > kswapd0 sharing equal top place in CPU usage at times (e.g. 20 percent).
> > >
> > > If I should try only one of the patches or something else entirely,
> > > please let me know.
> >
> > Yes, with irq off, schedule won't run and need_resched won't get set.
> >
>
> Stepping back a little, how did you determine that isolate_migrate was the
> major problem? In my initial tests using the irqsoff tracer (sampled for
> the duration fo the test every few seconds and resetting the max latency
> each time), compaction_alloc() was a far worse source of problems and
> isolate_migratepage didn't even register. It might be that I'm not testing
> on large enough machines though.

I think you're right that compaction_alloc is a bigger problem. Your
patch to isolate_freepages is a must have and in the right direction.

However, I think having large areas set as PageBuddy may be common too,
so the irq latency source in isolate_migratepages needs fixing as well.
We must be guaranteed to re-enable irqs after at most N pages (where N
is SWAP_CLUSTER_MAX in my last two patches).
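
To make the "irqs back on after max N pages" rule concrete, here is a
rough userspace sketch. It is not the code from the patches: scan_range(),
struct scan_stats and BATCH_MAX are hypothetical names, BATCH_MAX only
stands in for SWAP_CLUSTER_MAX, and the irq-off section is simulated with
a flag and counters. The point it tries to show is that every page
visited counts against the batch, including PageBuddy pages that are
skipped, so a large free region cannot stretch the irq-off window.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stands in for SWAP_CLUSTER_MAX; not the kernel definition. */
#define BATCH_MAX 32

/* Hypothetical bookkeeping for the simulated scan. */
struct scan_stats {
    size_t lock_sections;   /* number of simulated irq-off sections */
    size_t longest_section; /* most pages visited in one section */
    size_t isolated;        /* non-free pages picked up for migration */
};

/*
 * Walk nr_pages pages. is_buddy[i] is true when page i is free
 * (PageBuddy) and must be skipped. The simulated irq-off section is
 * closed and reopened every BATCH_MAX visited pages, whether or not
 * those pages were isolated.
 */
static struct scan_stats scan_range(const bool *is_buddy, size_t nr_pages)
{
    struct scan_stats st = {0, 0, 0};
    size_t in_section = 0;
    bool locked = false;

    for (size_t pfn = 0; pfn < nr_pages; pfn++) {
        /* Batch exhausted: the real code would unlock and enable irqs. */
        if (locked && in_section == BATCH_MAX)
            locked = false;

        /* Real code would disable irqs and take the lock here. */
        if (!locked) {
            locked = true;
            st.lock_sections++;
            in_section = 0;
        }

        /* Count the page before deciding whether it is skipped. */
        in_section++;
        if (in_section > st.longest_section)
            st.longest_section = in_section;

        if (is_buddy[pfn])
            continue; /* free page: skipped, but it still used up budget */

        st.isolated++;
    }
    return st;
}
```

With this shape the longest irq-off section is bounded by BATCH_MAX no
matter how many consecutive PageBuddy pages sit in the range.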

> In another mail, I posted a patch that dealt with compaction_alloc after
> finding that IRQs were being disabled for millisecond lengths of time.
> That length of time for IRQs being disabled could account for the performance
> loss on the network load. Can test the network load with it applied?

kswapd was also running at 100% on all CPUs in that test.

The z1 patch, which doesn't fix the latency source in compaction but
removes compaction from kswapd (a light/hackish version of
compaction-no-kswapd-3 that I just posted), fixes the problem
completely for the network load too.

So clearly it's not only a problem we can fix in compaction: the irq
latency will improve for sure, but we still get an overload from
kswapd, which is not ok I think.

What I am planning to test on the network load is
high-wmark+compaction_alloc_lowlat+compaction-kswapd-3 vs
high-wmark+compaction_alloc_lowlat+compaction-no-kswapd-2.

Is this ok? If you want I can also test
high-wmark+compaction_alloc_lowlat without
compaction-kswapd-3/compaction-no-kswapd-2, but I think the irq-latency
source in isolate_migratepages in the presence of large PageBuddy
regions (after any large application started at boot quits) isn't ok.
Also, I think having kswapd at 100% cpu load isn't ok. So I doubt we
should stop at compaction_alloc_lowlat.