On Mon, May 03, 2010 at 01:58:36PM -0400, Rik van Riel wrote:
>>
>> Btw, Mel's patch doesn't really match the description of 2/2. 2/2 says
>> that all pages must always be findable in rmap. Mel's patch seems to
>> explicitly say "we want to ignore that thing that is busy for execve". Are
>> we just avoiding a BUG_ON()? Is perhaps the BUG_ON() buggy?
>
> I have no good answer to this question.
>
> Mel? Andrea?
>

The wording could have been better.

The problem is that once a migration PTE is established, it is expected that
rmap can find it. In the specific case of exec, this can fail because of
how the temporary stack is moved. As migration colliding with exec is rare,
the approach taken by the patch was to not create migration PTEs that rmap
could not find. On the plus side, exec (the common case) is unaffected. On
the negative side, it's avoiding the exec vs migration problem instead of
fixing it.

The BUG_ON is not a buggy check. While migration is taking place, the page lock
is held and not released until all the migration PTEs have been removed. If
a migration entry exists and the page is unlocked, it means that rmap failed
to find all the entries. If the BUG_ON were not there, do_swap_page() would
either end up looking up a semi-random entry in swap cache and inserting it
(memory corruption), inserting a random page from swap (memory corruption)
or returning VM_FAULT_OOM to the fault handler (general carnage).
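
To make the invariant concrete, here is a rough user-space sketch of it. This
is a toy model, not kernel code: every toy_* name is made up for illustration,
and the "visible_to_rmap" flag is only an assumed stand-in for the window in
which exec has moved the temporary stack out from under rmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or helpers exist in the kernel. */
struct toy_page {
	bool locked;
};

enum toy_pte_kind { TOY_PTE_PRESENT, TOY_PTE_MIGRATION };

struct toy_pte {
	enum toy_pte_kind kind;
	struct toy_page *page;
	bool visible_to_rmap;	/* false models the exec stack-move window */
};

/* Convert every PTE of 'page' that rmap can see from 'from' to 'to'. */
static size_t toy_rmap_walk(struct toy_page *page, struct toy_pte *ptes,
			    size_t n, enum toy_pte_kind from,
			    enum toy_pte_kind to)
{
	size_t changed = 0;

	for (size_t i = 0; i < n; i++) {
		if (ptes[i].page == page && ptes[i].visible_to_rmap &&
		    ptes[i].kind == from) {
			ptes[i].kind = to;
			changed++;
		}
	}
	return changed;
}

/* The condition the BUG_ON guards: a migration entry implies a locked page. */
static bool toy_fault_invariant_holds(const struct toy_pte *pte)
{
	return pte->kind != TOY_PTE_MIGRATION || pte->page->locked;
}

/* Run one migration cycle; return true if the invariant still holds after. */
static bool toy_migrate_cycle(bool exec_moves_stack)
{
	struct toy_page page = { .locked = false };
	struct toy_pte ptes[2] = {
		{ TOY_PTE_PRESENT, &page, true },
		{ TOY_PTE_PRESENT, &page, true },
	};

	page.locked = true;	/* lock held for the whole migration */
	toy_rmap_walk(&page, ptes, 2, TOY_PTE_PRESENT, TOY_PTE_MIGRATION);

	if (exec_moves_stack)	/* rmap loses sight of one mapping */
		ptes[1].visible_to_rmap = false;

	/* Removal walk misses any entry rmap cannot find. */
	toy_rmap_walk(&page, ptes, 2, TOY_PTE_MIGRATION, TOY_PTE_PRESENT);
	page.locked = false;	/* migration believes it is finished */

	return toy_fault_invariant_holds(&ptes[0]) &&
	       toy_fault_invariant_holds(&ptes[1]);
}
```

In this model toy_migrate_cycle(false) leaves no migration entries behind,
while toy_migrate_cycle(true) leaves one pointing at an unlocked page, which
is the state the BUG_ON exists to catch.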

Lazily cleaning up the migration PTEs was considered
(http://lkml.org/lkml/2010/4/27/458) but there is no guarantee that the page
the migration PTE pointed to is still the correct one. If it had been freed
and re-used, the results would probably be memory corruption.