> On Tue, 24 Nov 1998, Stephen C. Tweedie wrote:
>>
>> Indeed. However, I think it misses the real advantage, which is that
>> the mechanism would be inherently self-tuning (much more so than the
>> existing code).

> Yes, that's one of the reasons I like it.

> The other reason I like it is that right now it is extremely hard to share
> swapped out pages unless you share them due to a fork(). The problem is
> that the swap cache supports the notion of sharing, but our swap-out
> routines do not - they swap things out on a per-virtual-page basis, and
> that results in various nasty things - we page out the same page to
> multiple places, and lose the sharing.

No, I fixed that in 2.1.89. Shared anonymous pages _must_ be COW and
therefore readonly (this is why moving to MAP_SHARED anonymous regions
is so hard). So, the first process which tries to swap such a shared
page will write it to disk and set up a swap cache entry. Because the
page is necessarily readonly, we can safely assume it is OK to write it
at this point and not at the point of the last unmapping.

Subsequent processes which page out the same page will find it in the
swap cache already and will just free the page. I've tested this with a
program which sets up a large anonymous region, forks, and then thrashes
the memory. On prior kernels we lose the sharing, but on 2.1.89 and
later, that sharing is maintained perfectly even after fork and we never
grow the amount of swap which is used.
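
To make the swap-out behaviour above concrete, here is a toy model in C.
It is only a sketch, not the kernel code: the types and names
(toy_page, toy_swap, toy_swap_out) are invented for illustration, and it
assumes the page is read-only, as a shared COW page must be.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical types and names, not kernel code. */
struct toy_page {
    int  map_count;      /* processes that still map the page */
    bool in_swap_cache;  /* true once a swap entry has been set up */
    int  swap_slot;      /* valid only when in_swap_cache is true */
};

struct toy_swap {
    int slots_used;      /* swap slots allocated so far */
    int disk_writes;     /* pages actually written to swap */
};

/* One process swaps out its mapping of a shared, read-only (COW) page.
 * Returns the swap slot that the process's pte would now refer to. */
static int toy_swap_out(struct toy_swap *swap, struct toy_page *page)
{
    assert(page->map_count > 0);

    if (!page->in_swap_cache) {
        /* First process: allocate a slot and write the page now.
         * The page is read-only, so its contents cannot change
         * before the last process unmaps it. */
        page->swap_slot = swap->slots_used++;
        page->in_swap_cache = true;
        swap->disk_writes++;
    }

    /* First or later process: just drop this mapping. Later
     * processes find the swap cache entry and do no I/O. */
    page->map_count--;
    return page->swap_slot;
}
```

With three processes sharing one page, three calls to toy_swap_out()
return the same slot and leave slots_used and disk_writes at 1, which
is the "we never grow the amount of swap" property in miniature.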

> The VM policy changes weren't stability issues, they were only "timing".
> As such, if they broke something, it was really broken before too.

Absolutely.

> And I agree that the mechanism is already there, however as it stands we
> really populate the swap cache at page-in rather than page-out, and
> changing that is fairly fundamental. It would be good, no question about
> it, but it's still fairly fundamental.

We still have to populate the swap cache at page-in time. The initial
reason for the early swap cache implementation was to prevent us from
having to re-write to disk pages which are still clean in memory. For
that to work we need to cache the page-in.
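
As a rough illustration of why caching the page-in saves writes, here is
another toy model in C. Again this is a sketch under assumptions, not
kernel code: toy_anon_page, toy_io and the three functions are made-up
names, and a single flag stands in for "the memory copy is known to
match the copy on swap".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's structures. */
struct toy_anon_page {
    bool resident;       /* page is currently in memory */
    bool in_swap_cache;  /* memory copy matches the copy on swap */
    bool has_swap_slot;  /* a copy exists on swap at all */
};

struct toy_io {
    int reads;           /* pages read from swap */
    int writes;          /* pages written to swap */
};

/* Page-in: read from swap and remember that memory == disk. */
static void toy_page_in(struct toy_io *io, struct toy_anon_page *p)
{
    assert(!p->resident && p->has_swap_slot);
    io->reads++;
    p->resident = true;
    p->in_swap_cache = true;
}

/* A write makes the swap copy stale, so the cache entry is dropped. */
static void toy_write_fault(struct toy_anon_page *p)
{
    assert(p->resident);
    p->in_swap_cache = false;
}

/* Page-out: only pages without a valid cache entry need a write. */
static void toy_page_out(struct toy_io *io, struct toy_anon_page *p)
{
    assert(p->resident);
    if (!p->in_swap_cache) {
        io->writes++;
        p->has_swap_slot = true;
    }
    p->resident = false;
    p->in_swap_cache = false;  /* the in-memory page is freed */
}
```

A page that is paged in and then paged out again untouched costs no
write in this model; only a page dirtied in between (toy_write_fault)
is written again.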

However, for pages which become dirty in memory, we _do_ populate the
swap cache only at page-out time. That's why the sharing still works.

I think that the real change we need is to cleanly support PG_dirty
flags per page. Once we do that, not only do all of the dirty inode
pageouts get fixed, but we also automatically get MAP_SHARED |
MAP_ANONYMOUS.

While we're on that subject, Linus, do you still have Andrea's patch to
propagate page writes around all shared ptes? I noticed that Zlatko
Calusic recently re-posted it, and it looks like the sort of short-term
fix we need for this issue in 2.2 (assuming we don't have time to do a
proper PG_dirty fix).

--Stephen
