The current scheme assumes mmu_gather is always done with preemption disabled and uses per-cpu storage for the page batches. Change this to try and allocate a page for batching and, in case of failure, use a small on-stack array to make some progress.
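For illustration, the page-backed batch with an on-stack fallback could look roughly like the sketch below; the names (mmu_gather_batch, tlb_next_batch, MAX_GATHER_BATCH) and the exact gfp flags are meant to convey the approach, not quote the final code. When no page can be allocated, the gather keeps working out of a small batch embedded in struct mmu_gather itself, at the cost of more frequent flushes:

struct mmu_gather_batch {
	struct mmu_gather_batch	*next;
	unsigned int		nr;	/* pages queued in this batch */
	unsigned int		max;	/* capacity of pages[] */
	struct page		*pages[0];
};

#define MAX_GATHER_BATCH	\
	((PAGE_SIZE - sizeof(struct mmu_gather_batch)) / sizeof(void *))

/* struct mmu_gather then carries a small on-stack fallback batch:
 *
 *	struct mmu_gather_batch	*active;
 *	struct mmu_gather_batch	local;
 *	struct page		*__pages[8];	// backs local.pages[]
 */

static int tlb_next_batch(struct mmu_gather *tlb)
{
	struct mmu_gather_batch *batch;

	/* Try to get a whole page for the next batch of pages. */
	batch = (void *)__get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
	if (!batch)
		return 0;	/* no memory: stick with what we have */

	batch->next = NULL;
	batch->nr   = 0;
	batch->max  = MAX_GATHER_BATCH;

	tlb->active->next = batch;
	tlb->active = batch;
	return 1;
}

A failed allocation is not an error here; it only means the batch fills up sooner and the TLB gets flushed more often.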

Preemptible mmu_gather is desired in general and usable once i_mmap_lock becomes a mutex. Doing it before the mutex conversion saves us from having to rework the code by moving the mmu_gather bits inside the pte_lock.

Also avoid flushing the tlb batches from under the pte lock; this is useful even without the i_mmap_lock conversion as it significantly reduces pte lock hold times.
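Concretely, the unmap loop can note that the batch filled up, drop the pte lock, and only then flush. A sketch of that pattern follows; the force_flush / __tlb_remove_page interplay shown here is illustrative of the idea rather than a quote of the final code:

again:
	start_pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
	pte = start_pte;
	do {
		/* ... clear the pte and pick up its page ... */
		if (!__tlb_remove_page(tlb, page)) {
			/* Batch full: stop gathering, flush below. */
			force_flush = 1;
			break;
		}
	} while (pte++, addr += PAGE_SIZE, addr != end);
	pte_unmap_unlock(start_pte, ptl);

	/*
	 * The potentially expensive TLB invalidate and page freeing
	 * run with the pte lock dropped, keeping hold times short.
	 */
	if (force_flush) {
		force_flush = 0;
		tlb_flush_mmu(tlb);
		if (addr != end)
			goto again;
	}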

-DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
-
 /*
  * We have up to 8 empty zeroed pages so we can map one of the right colour
  * when needed. This is necessary only on R4000 / R4400 SC and MC versions
diff --git a/arch/mn10300/mm/init.c b/arch/mn10300/mm/init.c
index 48907cc..1380182 100644
--- a/arch/mn10300/mm/init.c
+++ b/arch/mn10300/mm/init.c
@@ -37,8 +37,6 @@
 #include <asm/tlb.h>
 #include <asm/sections.h>

 /* If there's a TLB batch pending, then we must flush it because the
  * pages are going to be freed and we really don't want to have a CPU
@@ -164,8 +164,10 @@ void tlb_flush(struct mmu_gather *tlb)
 	if (tlbbatch->index)
 		__flush_tlb_pending(tlbbatch);