Linus Torvalds wrote:> > On Fri, 9 Aug 2002, Andrew Morton wrote:> >> > What would be nice is a way of formalising the prefault, to pin> > the mm's pages across the copy_*_user() in some manner, perhaps?> > Too easy to create a DoS-type attack with any trivial implementation.

hmm, yes. The pin has to be held across ->prepare_write. That
tears it.

> However, I don't think pinning is worthwhile, since even if the page goes> away, the prefaulting was just a performance optimization. The code should> work fine without it. In fact, it would probably be good to _not_ prefault> for a development kernel, and verify that the code works without it. That> way we can sleep safe in the knowledge that there isn't some race through> code that requires the prefaulting..

OK. That covers reads. But we need to do something short-term to get
these large performance benefits, and I don't know how to properly fix
the write deadlock. The choices here are:

- live with the current __get_user thing

- make filemap_nopage aware of the problem, via a new `struct page *'
in task_struct (this would be very messy on the reader side).

- or?

(Of course, the write deadlock is a different and longstanding
problem, and I don't _have_ to fix it here, weasel, weasel)

> I agree that if you could guarantee pinning the out-of-line code would be> a bit simpler, but since we have to handle the EFAULT case anyway, I doubt> that it is _that_ much simpler.> > Also, there are actually advantages to doing it the "hard" way. If we ever> want to, we can actually play clever tricks that avoid doing the copy at> all with the slow path.> > Example tricks: we can, if we want to, do a read() with no copy for a> common case by adding a COW-bit to the page cache, and if you do aligned> reads into a page that will fault on write, you can just map in the page> cache page directly, mark it COW in the page cache (assuming the page> count tells us we're the only user, of course), and mark it COW in the> mapping.

glibc malloc currently returns well-aligned-address + 8. If
it were taught to return well-aligned-address+0 then presumably a
lot of applications would automatically benefit from these
zero-copy reads.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/