On processors with deep write buffers, it is likely that many cycles
will pass between a CACHE instruction and the time the data actually
gets written out to DRAM. Add a SYNC instruction to ensure that the
buffers get emptied before the flush functions return.
Actual problem seen in the wild:
1) dma_alloc_coherent() allocates cached memory
2) memset() is called to clear the new pages
3) dma_cache_wback_inv() is called to flush the zero data out to memory
4) dma_alloc_coherent() returns an uncached (kseg1) pointer to the
freshly allocated pages
5) Caller writes data through the kseg1 pointer
6) Buffered writeback data finally gets flushed out to DRAM
7) Part of caller's data is inexplicably zeroed out
This patch adds SYNC between steps 3 and 4, which fixed the problem.
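
The sequence above can be replayed in a small user-space toy model. This is only a sketch for illustration, not kernel code: struct toy_mem, toy_cache_wback(), toy_drain(), toy_uncached_write() and toy_run() are hypothetical names, and the "write buffer" is just an array that holds writeback data until it is drained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_WORDS 8

/* Toy model (assumption): DRAM plus a write buffer of pending writebacks. */
struct toy_mem {
	uint32_t dram[TOY_WORDS];
	uint32_t wbuf[TOY_WORDS];
	int wbuf_valid[TOY_WORDS];
};

/* Step 3: the cache writeback only posts data into the write buffer. */
static void toy_cache_wback(struct toy_mem *m, const uint32_t *cached)
{
	for (size_t i = 0; i < TOY_WORDS; i++) {
		m->wbuf[i] = cached[i];
		m->wbuf_valid[i] = 1;
	}
}

/* Stand-in for SYNC (and for the late drain in step 6): empty the buffer. */
static void toy_drain(struct toy_mem *m)
{
	for (size_t i = 0; i < TOY_WORDS; i++) {
		if (m->wbuf_valid[i]) {
			m->dram[i] = m->wbuf[i];
			m->wbuf_valid[i] = 0;
		}
	}
}

/* Step 5: an uncached write bypasses the buffer and lands in DRAM at once. */
static void toy_uncached_write(struct toy_mem *m, size_t idx, uint32_t val)
{
	assert(idx < TOY_WORDS);
	m->dram[idx] = val;
}

/* Returns 1 if the caller's data survives, 0 if it gets zeroed (step 7). */
static int toy_run(int use_sync)
{
	struct toy_mem m;
	uint32_t cached[TOY_WORDS];

	memset(&m, 0, sizeof(m));
	memset(cached, 0, sizeof(cached));      /* step 2: clear the new pages */
	toy_cache_wback(&m, cached);            /* step 3: flush zero data */
	if (use_sync)
		toy_drain(&m);                  /* SYNC between steps 3 and 4 */
	toy_uncached_write(&m, 0, 0x12345678u); /* step 5: caller writes data */
	toy_drain(&m);                          /* step 6: buffer finally drains */
	return m.dram[0] == 0x12345678u;
}
```

In this model toy_run(0) returns 0 (the late drain zeroes the caller's word) and toy_run(1) returns 1 (the buffer is already empty when the caller writes).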
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
---
arch/mips/mm/c-r4k.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
index 6721ee2..05c3de3 100644
--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -605,6 +605,7 @@ static void r4k_dma_cache_wback_inv(unsigned long addr,
unsigned long size)
r4k_blast_scache();
else
blast_scache_range(addr, addr + size);
+ __sync();
return;
}

Basically, agreed. I have similar workarounds when initiating DMA,
where we need to flush data out to DRAM before starting DMA
transactions. These look like similar situations.
But I have a concern.
I suspect that a SYNC insn alone is still not enough, is it? In
systems with such a 'deep' write buffer, where data incoherency is
visibly observed, there may still be data write transactions floating
in the internal bus system.
To make sure that all data (data inside the processor's write buffer
and data floating in the internal bus system) has reached DRAM, we
need the following three steps:
1. Flush data cache
2. Uncached, dummy load operation from _DRAM_ (not somewhere else)
3. then SYNC instruction
With these steps, the data is pushed out of the processor's write
buffer, we wait for the uncached load operation to complete, and
then finally the pipeline gets cleared. Thoughts?
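
Steps 2 and 3 could be sketched roughly as below. This is a minimal sketch, not a tested patch: sketch_sync() and sketch_drain() are hypothetical names, the caller is assumed to have already flushed the data cache (step 1), and the pointer is assumed to be an uncached (kseg1) alias of a DRAM address. On non-MIPS hosts the barrier falls back to the GCC/Clang builtin __sync_synchronize() so the sketch still compiles; on older MIPS ISAs the sync instruction may also need a ".set mips2" wrapper, as the kernel's own __sync() uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 3: SYNC on MIPS, a generic full barrier elsewhere (assumption). */
static inline void sketch_sync(void)
{
#if defined(__mips__)
	__asm__ __volatile__("sync" : : : "memory");
#else
	__sync_synchronize();
#endif
}

/*
 * Steps 2 and 3. The caller must already have flushed the data cache
 * (step 1), and uncached_dram must point into DRAM through an uncached
 * mapping, not at a device register or some other region.
 */
static inline uint32_t sketch_drain(const volatile uint32_t *uncached_dram)
{
	uint32_t dummy;

	assert(uncached_dram != NULL);
	dummy = *uncached_dram;	/* step 2: uncached dummy load from DRAM */
	sketch_sync();		/* step 3: SYNC after the load */
	return dummy;
}
```

The loaded value is returned only so that the compiler cannot discard the load; callers would normally ignore it.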
Shinya