The idea is simple: write protect clean shared writeable pages,
catch the write-fault, make writeable and set dirty. On page write-back
clean all the PTE dirty bits and write protect them once again.
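As a rough illustration of that cycle, here is a userland toy model of a
single PTE. It is only a sketch: struct toy_pte and the toy_*() helpers
are hypothetical names invented for this example and are not part of the
patch or of the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one PTE mapping a shared writeable page. */
struct toy_pte {
	bool writable;	/* write permission currently granted */
	bool dirty;	/* dirty bit */
};

/* A clean page starts out write protected so the first store faults. */
static struct toy_pte toy_map_clean(void)
{
	struct toy_pte pte = { .writable = false, .dirty = false };
	return pte;
}

/* Model a store to the page; returns true if a write fault was taken. */
static bool toy_store(struct toy_pte *pte)
{
	bool faulted = !pte->writable;

	if (faulted) {
		/* Invariant of the scheme: a protected PTE is clean. */
		assert(!pte->dirty);
		pte->writable = true;	/* fault handler: make writeable */
	}
	pte->dirty = true;		/* ... and set dirty */
	return faulted;
}

/* Model write-back: clear the dirty bit and write protect again. */
static void toy_writeback(struct toy_pte *pte)
{
	pte->dirty = false;
	pte->writable = false;
}
```

Only the first store after each write-back faults in this model; later
stores to the already dirty page proceed without one.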

The implementation is a tad harder, mainly because the default
backing_dev_info capabilities were too loosely maintained. Hence it is
not enough to test the backing_dev_info for cap_account_dirty.

The current heuristic is as follows, a VMA is eligible when:
 - it is shared writeable
    (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED)
 - it is not a 'special' mapping
    (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0
 - the backing_dev_info is cap_account_dirty
    mapping_cap_account_dirty(vma->vm_file->f_mapping)
 - f_op->mmap() didn't change the default page protection
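The four conditions can be put together roughly as below. This is a
self-contained userland sketch, not the patch code: struct toy_vma, its
boolean fields and toy_vma_trackable() are hypothetical stand-ins, and
the flag values are placeholders chosen for the example rather than the
kernel's real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for this sketch only. */
#define VM_WRITE	0x1UL
#define VM_SHARED	0x2UL
#define VM_PFNMAP	0x4UL
#define VM_INSERTPAGE	0x8UL

/* Hypothetical, flattened view of what the heuristic looks at. */
struct toy_vma {
	unsigned long vm_flags;
	bool cap_account_dirty;	/* stands in for mapping_cap_account_dirty() */
	bool prot_is_default;	/* false if f_op->mmap() changed the protection */
};

/* Takes vm_flags only, in the style of is_cow_mapping(). */
static bool toy_is_shared_writable(unsigned long vm_flags)
{
	return (vm_flags & (VM_WRITE | VM_SHARED)) == (VM_WRITE | VM_SHARED);
}

/* Returns true when all four conditions of the heuristic hold. */
static bool toy_vma_trackable(const struct toy_vma *vma)
{
	assert(vma != NULL);

	if (!toy_is_shared_writable(vma->vm_flags))
		return false;
	if (vma->vm_flags & (VM_PFNMAP | VM_INSERTPAGE))
		return false;	/* 'special' mapping */
	if (!vma->cap_account_dirty)
		return false;
	return vma->prot_is_default;
}
```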

Pages from remap_pfn_range() are explicitly excluded because their
COW semantics are already horrid enough (see vm_normal_page() in
do_wp_page()) and because they don't have a backing store anyway.

mprotect() is taught about the new behaviour as well. However it fudges
the last condition.

Cleaning the pages on write-back is done with page_mkclean(), a new
rmap call. It can be called on any page, but is currently only
implemented for mapped pages; if the page is found to be of a
trackable VMA it will also wrprotect the PTE.
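One reading of that description, as a userland sketch: walk every
mapping of the page, clear its dirty bit, and write protect only those
that belong to a trackable VMA. struct toy_mapping and
toy_page_mkclean() are hypothetical names for this example; the real
rmap walk, locking and TLB handling are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one PTE that maps the page. */
struct toy_mapping {
	bool trackable;	/* PTE belongs to a trackable VMA */
	bool writable;
	bool dirty;
};

/*
 * Clean all mappings of one page.
 * Returns true if any of the PTEs was dirty.
 */
static bool toy_page_mkclean(struct toy_mapping *maps, size_t n)
{
	bool was_dirty = false;
	size_t i;

	assert(maps != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (maps[i].dirty) {
			was_dirty = true;
			maps[i].dirty = false;
		}
		/* Only trackable VMAs get write protected again. */
		if (maps[i].trackable)
			maps[i].writable = false;
	}
	return was_dirty;
}
```

With no mappings (n == 0) the function does nothing and reports the page
as not dirty, mirroring the remark that only mapped pages are handled.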

Bah Bah Bah, why didn't the page_mkwrite() patch re-protect clean pages?
And is it a Bad-Thing (tm) that that can happen now?

Finally, in fs/buffer.c:try_to_free_buffers(); remove clear_page_dirty()
from under ->private_lock. This seems to be safe, since ->private_lock
is used to serialize access to the buffers, not the page itself.
This is needed because clear_page_dirty() will call into page_mkclean()
and would thereby violate locking order.
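The reordering can be pictured with the following userland toy. It is a
sketch under loud assumptions: the lock is just a flag, and struct
toy_page and all toy_*() functions are hypothetical names that only
mimic the call order described above, not the real fs code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page state; the lock is modelled as a plain flag. */
struct toy_page {
	bool private_lock_held;
	bool has_buffers;
	bool dirty;
};

static void toy_page_mkclean(struct toy_page *page)
{
	/* Lock order rule of this sketch: never run under private_lock. */
	assert(!page->private_lock_held);
}

static void toy_clear_page_dirty(struct toy_page *page)
{
	toy_page_mkclean(page);		/* calls into the rmap side */
	page->dirty = false;
}

/* Returns true if the buffers were freed. */
static bool toy_try_to_free_buffers(struct toy_page *page)
{
	bool freed;

	page->private_lock_held = true;		/* lock */
	freed = page->has_buffers;		/* buffer work stays locked */
	page->has_buffers = false;
	page->private_lock_held = false;	/* unlock */

	/* The dirty bit is cleared only after the lock is dropped. */
	if (freed)
		toy_clear_page_dirty(page);
	return freed;
}
```

Moving the toy_clear_page_dirty() call back inside the locked region
would trip the assert in toy_page_mkclean(), which is the ordering
problem the change avoids.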

 - make page_mkclean_one() modify the pte more like change_pte_range()
   (suggested by Christoph Lameter)
 - made is_shared_writable() take vm_flags, it now resembles
   is_cow_mapping().
 - fixed the mprotect() bug (spotted by Hugh Dickins)
 - hopefully fixed the tiresome issue of do_mmap_pgoff() trampling on
   driver specific vm_page_prot settings (spotted by Hugh Dickins)
 - made a new version of the page_mkwrite() patch to go on top of all
   this. This so that Linus could merge this very early on in 2.6.18.

Changes in -v5

 - rename page_wrprotect() to page_mkclean() (suggested by Nick Piggin)
 - added comment to test_clear_page_dirty() (Andrew Morton)
 - cleanup page_wrprotect() (Andrew Morton)
 - renamed VM_SharedWritable() to is_shared_writable()
 - fs/buffer.c try_to_free_buffers(): remove clear_page_dirty() from under
   ->private_lock. This seems to be safe, since ->private_lock is used to
   serialize access to the buffers, not the page itself.
 - rebased on top of David Howells' page_mkwrite() patch.