Another detour: short-circuiting cat(1) - Solaris Rss

What do you think happens when you do this:

    # cat vmcore.4 > /dev/null

If you've used Unix systems before, you might expect this to read vmcore.4 into memory and do nothing with it, since cat(1) reads a file, and "> /dev/null" sends it to the null driver, which accepts data and does nothing. This appears pointless, but can actually be useful to bring a file into memory, for example, or to evict other files from memory (if this file is larger than total cache size).

But here's a result I found surprising:

    # ls -l vmcore.1
    -rw-r--r-- 1 root root 5083361280 Oct 30 2009 vmcore.1
    # time cat vmcore.1 > /dev/null

    real    0m0.007s
    user    0m0.001s
    sys     0m0.007s

That works out to 726GB/s. That's way too fast, even reading from main memory. The obvious question is how does cat(1) know that I'm sending to /dev/null and not bother to read the file at all?
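As a quick check of that figure, the rate is just the file size divided by the elapsed time. The helper names below are made up for illustration, and GB is taken to mean 10^9 bytes:

```c
#include <stdio.h>

/* Decimal gigabytes per second: bytes / seconds / 10^9. */
double
gigabytes_per_second(double bytes, double seconds)
{
	return (bytes / seconds / 1e9);
}

/* Print the rate for the run above: 5083361280 bytes in 0.007s. */
void
print_rate(void)
{
	printf("%.0f GB/s\n", gigabytes_per_second(5083361280.0, 0.007));
}
```

Calling print_rate() prints "726 GB/s".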

Of course, you can answer this by examining the cat source in the ON gate. There's no special case for /dev/null (though that does exist elsewhere). Rather, this behavior is a consequence of an optimization: cat(1) maps the input file with mmap(2) and writes the mapped buffer, instead of using read(2) to fill a buffer and writing that. Mapping a file doesn't read it; pages are brought in only when something touches them, and the null driver discards the write without ever looking at the buffer. With truss(1) it's clear exactly what's going on:

    # time cat vmcore.1 | cat > /dev/null

    real    0m32.661s
    user    0m0.865s
    sys     0m32.127s

That's more like it: about 155MB/s streaming from a single disk. In this case the second cat invocation can't use this optimization since stdin is actually a pipe, not the input file.

There's another surprising result of the initial example: the file's access time actually gets updated even though it was never read:

    /*
     * NFS V2 will let root open a file it does not have permission
     * to read. This read() is here to make sure that the access
     * time on the input file will be updated. The VSC tests for
     * cat do this:
     *    cat file > /dev/null
     * In this case the write()/mmap() pair will not read the file
     * and the access time will not be updated.
     */

    if (read(fi_desc, &x, 1) == -1)
            read_error = 1;

I found this all rather surprising because I think of cat(1) as one of the basic primitives that's dead-simple by design. If you really want something simple to read a file into memory, you might be better off with dd(1M).
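For comparison, a plain read(2) loop has no such shortcut, because every byte has to be copied into the caller's buffer. The sketch below is an illustration of that idea, not dd source; read_and_discard and the 1MB buffer size are arbitrary choices made for the example.

```c
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

/*
 * Hypothetical helper: read every byte of a file with read(2) and
 * discard it. Unlike the mmap()/write() path, read() has to copy the
 * data into buf, so the file contents really are fetched.
 * Returns total bytes read, or -1 on error.
 */
long long
read_and_discard(const char *path)
{
	static char buf[1 << 20];	/* 1MB buffer; size chosen arbitrarily */
	long long total = 0;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd == -1)
		return (-1);
	for (;;) {
		n = read(fd, buf, sizeof (buf));
		if (n == -1 && errno == EINTR)
			continue;	/* interrupted by a signal; retry */
		if (n <= 0)
			break;		/* end of file or error */
		total += n;
	}
	close(fd);
	return (n == -1 ? -1 : total);
}
```

Calling read_and_discard("vmcore.1") would pull the whole file through the buffer no matter where standard output points.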