* anonvma:
anonvma: when setting up page->mapping, we need to pick the _oldest_ anonvma
anon_vma: clone the anon_vma chain in the right order
vma_adjust: fix the copying of anon_vma chains
Simplify and comment on anon_vma re-use for anon_vma_prepare()

…anonvma
Otherwise we might be mapping in a page in a new mapping, but that page
(through the swapcache) would later be mapped into an old mapping too.
The page->mapping must be the case that works for everybody, not just
the mapping that happened to page it in first.
Here's the scenario:
- page gets allocated/mapped by process A. Let's call the anon_vma we
associate the page with 'A' to keep it easy to track.
- Process A forks, creating process B. The anon_vma in B is 'B', and has
a chain that looks like 'B' -> 'A'. Everything is fine.
- Swapping happens. The page (with mapping pointing to 'A') gets swapped
out (perhaps not to disk - it's enough to assume that it's just not
mapped any more, and lives entirely in the swap-cache)
- Process B pages it in, which goes like this:
do_swap_page ->
page = lookup_swap_cache(entry);
...
set_pte_at(mm, address, page_table, pte);
page_add_anon_rmap(page, vma, address);
And think about what happens here!
In particular, what happens is that this will now be the "first"
mapping of that page, so page_add_anon_rmap() used to do
if (first)
__page_set_anon_rmap(page, vma, address);
and notice what anon_vma it will use? It will use the anon_vma for
process B!
What happens then? Trivial: process 'A' also pages it in (nothing
happens, it's not the first mapping), and then process 'B' execve's
or exits or unmaps, making anon_vma B go away.
End result: process A has a page that points to anon_vma B, but
anon_vma B does not exist any more. This can go on forever. Forget
about RCU grace periods, forget about locking, forget anything like
that. The bug is simply that page->mapping points to an anon_vma
that was correct at one point, but was _not_ the one that was shared
by all users of that possible mapping.
Changing it to always use the deepest anon_vma in the anonvma chain gets
us to the safest model.
This can be improved in certain cases: if we know the page is private to
just this particular mapping (for example, it's a new page, or it is the
only swapcache entry), we could pick the top (most specific) anon_vma.
But that's a future optimization. Make it _work_ reliably first.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "What do you know, I think you fixed it!" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

We want to walk the chain in reverse order when cloning it, so that the
order of the result chain will be the same as the order in the source
chain. When we add entries to the chain, they go at the head of the
chain, so we want to add the source head last.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "No, it still oopses" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

When we move the boundaries between two vma's due to things like
mprotect, we need to make sure that the anon_vma of the pages that got
moved from one vma to another gets properly copied around. And that was
not always the case, in this rather hard-to-follow code sequence.
Clarify the code, and fix it so that it copies the anon_vma from the
right source.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "Yeah, not so much this one either" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

This changes the anon_vma reuse case to require that we only reuse
simple anon_vma's - ie the case when the vma only has a single anon_vma
associated with it.
This means that a reuse of an anon_vma from an adjacent vma will always
guarantee that both vma's are associated not only with the same
anon_vma, they will also have the same anon_vma chain (of just a single
entry in this case).
And since anon_vma re-use was the only case where the same anon_vma
might be associated with different chains of anon_vma's, we now have the
case that every vma that shares the same anon_vma will always also have
the same chain. That makes it much easier to think about merging vma's
that share the same anon_vma's: you can always just drop the other
anon_vma chain in anon_vma_merge() since you know that they are always
identical.
This also splits up the function to validate the anon_vma re-use, and
adds a lot of commentary about the possible races.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "That didn't fix it" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dq_flags are modified non-atomically in do_set_dqblk via __set_bit calls and
atomically for example in mark_dquot_dirty or clear_dquot_dirty. Hence a
change done by an atomic operation can be overwritten by a change done by a
non-atomic one. Fix the problem by using atomic bitops even in do_set_dqblk.
Signed-off-by: Andrew Perepechko <andrew.perepechko@sun.com>
Signed-off-by: Jan Kara <jack@suse.cz>

…turned on
For a root filesystem write to the filesystem before quota is turned on happens
regularly and there's no way around it because of writes to syslog, /etc/mtab,
and similar. So the warning is rather pointless for ordinary users. It's
still useful during development so we just hide the warning behind
__DQUOT_PARANOIA config option.
Signed-off-by: Jan Kara <jack@suse.cz>

The ebase is relative to CKSEG0 not CAC_BASE. On a 32-bit kernel they
are the same thing, for a 64-bit kernel they are not.
It happens to kind of work on a 64-bit kernel as they both reference
the same physical memory. However since the CPU uses the CKSEG0 base,
determining if a J instruction will reach always gives the wrong result
unless we use the same number the CPU uses.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/1093/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

While playing with the out-of-tree MAE driver module, the system would
panic after a while in the db1200 custom wait code after wakeup due to
a clobbered k0 register being used as target address of a store op.
Remove the custom wait implementation and revert back to the Alchemy-
recommended implementation already set as default.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
To: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/1092/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Commit b3594a0 (lmo) rsp.
3513369 (kernel.org) break non-GPL modules
that use __vmalloc() or any of the vmap(), vm_map_ram(), etc functions on
MIPS.
All those functions are EXPORT_SYMBOL() so are meant to be allowed to be
used by non-GPL kernel modules. These calls all take page protection as
an argument which is normally a constant like PAGE_KERNEL.
This commit causes all protection constants like PAGE_KERNEL to not be
constants and instead to contain the GPL-only symbol _page_cachable_default.
This means that all calls to __vmalloc(), vmap(), etc, cause non-GPL
modules to fail to link with the complaint that they are trying to use the
GPL-only symbol _page_cachable_default...
Change EXPORT_SYMBOL_GPL(_page_cachable_default) to EXPORT_SYMBOL() for
non-GPL modules that call __vmalloc(), vmap(), vm_map_ram() etc.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Cc: Chris Dearman <chris@mips.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: http://patchwork.linux-mips.org/patch/1084/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Since 2083e83, the SPROM is now registered
in the board_prom_init callback, but it references variables and functions
which are declared below. Move the variables and functions above
board_prom_init.
Signed-off-by: Florian Fainelli <ffainelli@freebox.fr>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/1077/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

…sions.
Previously it was unconditionally used on all Sibyte family SOCs. The
M3 bug has to be handled in the TLB exception handler which is extremly
performance sensitive, so this modification is expected to deliver around
2-3% performance improvment. This is important as required changes to the
M3 workaround will make it more costly.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>