 # Windows needs the TAP-Win32 adapter name
 # from the Network Connections panel if you
@@ -93,7 +93,7 @@ dh dh1024.pem
 # Each client will be able to reach the server
 # on 10.8.0.1. Comment this line out if you are
 # ethernet bridging. See the man page for more info.
-server 10.8.0.0 255.255.255.0
+;server 10.8.0.0 255.255.255.0

Tuesday, February 16, 2010

On map, buffers are cleaned if they're being used for DMA_TO_DEVICE and DMA_BIDIRECTIONAL, or invalidated in the case of DMA_FROM_DEVICE.

(On unmap, nothing else is done)

However, because ARM CPUs (ARMv7, and some cores from ARM11 MPCore upward) can now speculatively prefetch, just leaving it at that (doing nothing on unmap) results in corruption of buffers used for DMA. So we have to invalidate DMA_FROM_DEVICE and DMA_BIDIRECTIONAL buffers on unmap to ensure coherency with DMA operations.

If the CPU writes to a DMA_FROM_DEVICE buffer between map and unmap, the writes can sit in the cache, and on unmap, they will be discarded.

(Mixing PIO and DMA on the same buffers wasn't a big issue with previous ARM CPUs, whose unmap didn't do any cache operation; on ARM CPUs with speculative prefetch, invalidating on unmap throws away the valid data that PIO left in the cache.)

Cleaning the cache on unmap is not an option; that too can lead to DMA buffer corruption in the DMA case.
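The map/unmap rules above can be sketched with a toy model. This is not kernel code: `ToyCache`, `dma_map`, `dma_unmap` and the two scenario functions are hypothetical names made up for illustration, the cache holds one word per line, and an ordinary `cpu_read()` stands in for a speculative prefetch.

```python
from enum import Enum


class Dir(Enum):
    TO_DEVICE = 1
    FROM_DEVICE = 2
    BIDIRECTIONAL = 3


class ToyCache:
    """One-word-per-line write-back cache in front of a dict acting as RAM."""

    def __init__(self):
        self.ram = {}
        self.lines = {}  # addr -> [value, dirty]

    def cpu_read(self, addr):
        # A miss allocates a clean line from RAM; a hit returns the cached value.
        if addr not in self.lines:
            self.lines[addr] = [self.ram.get(addr, 0), False]
        return self.lines[addr][0]

    def cpu_write(self, addr, value):
        # Write-allocate: the new value sits in the cache as a dirty line.
        self.lines[addr] = [value, True]

    def device_write(self, addr, value):
        # DMA bypasses the cache and goes straight to RAM.
        self.ram[addr] = value

    def clean(self, addr):
        # Write a dirty line back to RAM, keep it cached.
        line = self.lines.get(addr)
        if line and line[1]:
            self.ram[addr] = line[0]
            line[1] = False

    def invalidate(self, addr):
        # Drop the line without writing it back.
        self.lines.pop(addr, None)


def dma_map(cache, addr, direction):
    # Assumed rule from the text: invalidate for FROM_DEVICE, clean otherwise.
    if direction is Dir.FROM_DEVICE:
        cache.invalidate(addr)
    else:
        cache.clean(addr)


def dma_unmap(cache, addr, direction, invalidate_on_unmap=True):
    # Assumed rule from the text: invalidate FROM_DEVICE and BIDIRECTIONAL.
    if invalidate_on_unmap and direction in (Dir.FROM_DEVICE, Dir.BIDIRECTIONAL):
        cache.invalidate(addr)


def receive_with_prefetch(invalidate_on_unmap):
    c = ToyCache()
    c.ram[0x100] = 0xAA                  # old buffer contents
    dma_map(c, 0x100, Dir.FROM_DEVICE)
    c.cpu_read(0x100)                    # stands in for a speculative prefetch
    c.device_write(0x100, 0xBB)          # DMA data lands in RAM
    dma_unmap(c, 0x100, Dir.FROM_DEVICE, invalidate_on_unmap)
    return c.cpu_read(0x100)


def cpu_write_while_mapped():
    c = ToyCache()
    dma_map(c, 0x100, Dir.FROM_DEVICE)
    c.cpu_write(0x100, 0xCC)             # CPU touches a buffer it does not own
    c.device_write(0x100, 0xBB)
    dma_unmap(c, 0x100, Dir.FROM_DEVICE)
    return c.cpu_read(0x100)
```

In this model, `receive_with_prefetch(False)` hands back the old contents because the prefetched line survives the unmap, while `receive_with_prefetch(True)` sees the DMA data. `cpu_write_while_mapped()` shows the CPU's write being discarded by the unmap-time invalidate.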

USB and associated host driver must abide by the DMA API buffer ownership rules otherwise the result will be data corruption; either that or USB/host driver people need to have a discussion with the DMA API authors to remove this sensible "restriction".

Tuesday, February 9, 2010

I've been trying for some time to use a rootfs (ext2) on a USB memory stick on ARM platforms but without any success. The USB HCD driver is ISP1760 which doesn't use DMA.

ARM has a Harvard cache architecture and what I get is incoherency between the I and D caches. The CPU I'm using (ARM11MPCore) has PIPT caches with D-cache lines allocation on write.

Basically, when user space tries to execute from a new page, it faults and the data is requested via the VFS layer, SCSI block device and USB mass storage from the ISP1760 driver. The page is then mapped into user space and update_mmu_cache() called.

However, since the driver is PIO, the data copied from the USB device into RAM gets stuck in the D-cache. On the above page requesting path there is no call to flush_dcache_page() to handle D-cache maintenance (for DMA drivers, that's handled by the DMA API).

Since the USB mass storage code has the information about the USB driver capabilities (DMA or PIO), it looks like the best place to call flush_dcache_page(). But I got lost in the SCSI emulation and all my attempts failed to get a working rootfs.

Adding flush_dcache_page() higher up in mpage_end_io_read() solves the problem but that's not the correct fix as it has wider implications and it's not needed for DMA-capable devices.

(.................)

isp1760: Flush the D-cache for the pipe-in transfer buffers

From: Catalin Marinas <catalin.marinas@arm.com>

When the HCD driver writes the data to the transfer buffers it pollutes the D-cache (unlike DMA drivers where the device writes the data). If the corresponding pages get mapped into user space, there are no additional cache flushing operations performed and this causes random user space faults on architectures with separate I and D caches (Harvard) or those with aliasing D-cache.

(.................)

The PIO-MMC drivers walk through a scatter list via sg_miter_start() and friends. Those helpers take care of this automatically. (Actually I just ran into an issue that seems related to it. PIO SDHC

(.................)

My issue is with both I-D coherency and D-cache aliasing caused by pages mapped in both user and kernel space (with different colours). The flush_dcache_page() call should target both cases.
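For readers unfamiliar with cache "colours", here is a minimal sketch of how the colour of a mapping is usually derived for a virtually indexed cache. The 4 KiB page size, the 16 KiB way size and the example addresses are assumptions chosen for illustration, and `page_colour` and `may_alias` are hypothetical helper names, not kernel functions.

```python
PAGE_SIZE = 4096  # assumed page size


def page_colour(vaddr, way_size):
    """Colour = the virtual page number bits that still take part in cache indexing."""
    return (vaddr // PAGE_SIZE) % (way_size // PAGE_SIZE)


def may_alias(vaddr_a, vaddr_b, way_size):
    """Two mappings of one physical page can land in different cache lines
    only when their colours differ."""
    return page_colour(vaddr_a, way_size) != page_colour(vaddr_b, way_size)


WAY_SIZE = 16 * 1024          # assumed: 16 KiB per way, so 4 colours
kernel_va = 0xC0001000        # hypothetical kernel mapping of the page
user_va_bad = 0x40002000      # hypothetical user mapping, different colour
user_va_ok = 0x40005000       # hypothetical user mapping, same colour
```

With these assumed numbers the kernel mapping has colour 1, `user_va_bad` has colour 2 (so the two mappings may alias), and `user_va_ok` has colour 1 (so they index the same cache lines).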

(.................)

We could of course flush the caches every time we get a page fault but that's far from optimal, especially since DMA-capable drivers do not pollute the D-cache and don't need this extra flushing. Note that the recent ARM processors have PIPT caches but separate for I and D and it's the PIO drivers that pollute the D-cache.

The kernel API provides flush_dcache_page() to be called every time the kernel writes to a page cache page. This is further optimised to work in tandem with update_mmu_cache(), delaying the flush until the page is actually mapped into user space and that latter function is called (which in general is not a cache maintenance function).

The problem with some PIO drivers and a filesystem like ext2 is that there is no call to flush_dcache_page() when getting data into a page cache page. Since the page isn't marked as dirty (PG_arch_1), a subsequent call to update_mmu_cache() as a result of a page fault doesn't flush the caches.
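The deferred-flush handshake described above can be sketched as a toy model. The function names mirror the kernel ones only to make the mapping obvious; the bodies are simplified assumptions, not the kernel implementation, and `Page`, `pio_copy` and `fault_in` are hypothetical.

```python
class Page:
    def __init__(self):
        self.ram = "stale"       # what an instruction fetch would see
        self.dcache = None       # data still sitting in the D-cache, if any
        self.pg_arch_1 = False   # "a D-cache flush is owed" flag


def pio_copy(page, data):
    """The CPU copies device data into the page; it lands in the D-cache."""
    page.dcache = data


def flush_dcache_page(page):
    """Toy version: only record that a flush is owed."""
    page.pg_arch_1 = True


def update_mmu_cache(page):
    """Toy version: perform the owed write-back when the page gets mapped."""
    if page.pg_arch_1:
        if page.dcache is not None:
            page.ram, page.dcache = page.dcache, None
        page.pg_arch_1 = False


def fault_in(call_flush):
    """Fill a page by PIO, optionally mark it, then map it into user space."""
    page = Page()
    pio_copy(page, "code")
    if call_flush:
        flush_dcache_page(page)
    update_mmu_cache(page)
    return page.ram
```

In this model `fault_in(False)` leaves the stale RAM contents visible to the instruction side, which is the failure described for the PIO path, while `fault_in(True)` makes the new data visible.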

> This seems wrong to me. Buffers for control transfers may be transferred
> by DMA, so the caches must be flushed on architectures whose caches
> are not coherent with respect to DMA.

Indeed, and that's what I mentioned in the comment. But we shouldn't have DMA cache maintenance operations done for the buffers which would use PIO-based transfer.

> Would you care to elaborate on the exact nature of the bug you are fixing?

On the OMAP4 (ARM Cortex-A9) platform, the enumeration fails because control transfer buffers are corrupted. On our platform, we use PIO mode for control transfers and DMA for bulk transfers.

The current stack performs DMA cache maintenance even for the PIO transfers, which leads to the corruption issue. The control buffers are handled by the CPU and they are already coherent from the CPU's point of view.

(.................)

On map, buffers are cleaned if they're being used for DMA_TO_DEVICE and DMA_BIDIRECTIONAL, or invalidated in the case of DMA_FROM_DEVICE.

However, because ARM CPUs can now speculatively prefetch, just leaving it at that results in corruption of buffers used for DMA. So we have to invalidate DMA_FROM_DEVICE and DMA_BIDIRECTIONAL buffers on unmap to ensure coherency with DMA operations.

If the CPU writes to a DMA_FROM_DEVICE buffer between map and unmap, the writes can sit in the cache, and on unmap, they will be discarded.

Cleaning the cache on unmap is not an option; that too can lead to DMA buffer corruption in the DMA case.

USB and associated host driver must abide by the DMA API buffer ownership rules otherwise the result will be data corruption; either that or USB/host driver people need to have a discussion with the DMA API authors to remove this sensible "restriction".

A cache flush is supposed to be equivalent to eight 32-bit memory writes (one 32-byte cache line), but somehow it seems it is not. Likewise, eight NCNB (non-cacheable, non-bufferable) accesses do not take eight times as long as one NCNB access.

The result might not be very accurate, but if the driver accesses a cache line more than twice, then it is worth using a cacheable buffer. In some extreme cases one might like to try it out, but a cacheable buffer might still be the one

But the kernel doesn't give you a lot of help to use FIQs; you have to do most of the work yourself. See arch/arm/kernel/fiq.c for some example stuff, though it may not suit your needs. Really, it all depends what you want to do.