Feature #4038

IO#advise

=begin
As discussed in #4015, I suggest a wrapper around posix_fadvise(2) named IO#advise. On platforms that don't support this system call, IO#advise is a no-op. Otherwise, it provides a hint to the kernel as to how the given file descriptor will be accessed in the future. This allows the kernel to optimise its page cache accordingly.
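For illustration, here is a minimal sketch of how the proposed method might be called from Ruby. The helper name `read_sequentially` and the temporary file are made up for the example; the advice symbol and the default arguments follow the call-seq in the patch.

```ruby
require 'tempfile'

# Hypothetical helper: hint sequential access, then read the file in chunks.
def read_sequentially(path, chunk_size = 4096)
  bytes = 0
  File.open(path, 'rb') do |f|
    # offset and len default to 0, so the advice covers the whole file.
    # On platforms without posix_fadvise(2) this is expected to be a no-op.
    f.advise(:sequential)
    while (chunk = f.read(chunk_size))
      bytes += chunk.bytesize
    end
  end
  bytes
end

Tempfile.create('advise-demo') do |tmp|
  tmp.write('x' * 10_000)
  tmp.flush
  puts read_sequentially(tmp.path)
end
```

The hint does not change what is read, only how the kernel may schedule read-ahead, so the helper returns the same byte count with or without the advise call.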

+static VALUE sym_normal, sym_sequential, sym_random,
+ sym_willneed, sym_dontneed, sym_noreuse;
+/*
+ * call-seq:
+ * ios.advise(advice, offset=0, len=0) -> nil
+ *
+ * Announce an intention to access data from the current file in a
+ * specific pattern. On platforms that do not support the
+ * posix_fadvise(2) system call, this method is a no-op.
+ *
+ * advice is one of the following symbols:
+ *
+ * * :normal - No advice to give; the default assumption for an open file.
+ * * :sequential - The data will be accessed sequentially:
+ * with lower offsets read before higher ones.
+ * * :random - The data will be accessed in random order.
+ * * :willneed - The data will be accessed in the near future.
+ * * :dontneed - The data will not be accessed in the near future.
+ * * :noreuse - The data will only be accessed once.

We should probably note that the detailed meaning is OS dependent. For example,
on most OSes POSIX_FADV_DONTNEED only adds a hint to the cache
reclaim logic, but on Linux it drops the page cache immediately.

+ *
+ * "data" means the region of the current file that begins
+ * at offset and extends for len bytes. By default, both offset
+ * and len are 0, meaning that the advice applies to the entire
+ * file.

This doesn't describe the offset != 0 && len == 0 case.
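One way to read that case, assuming the method passes its arguments straight through to posix_fadvise(2): POSIX specifies that a len of zero covers all data from offset to the end of the file. The sketch below uses a hypothetical helper, `advise_tail`, to show what such a call might look like.

```ruby
require 'tempfile'

# Hypothetical helper: hint that everything from +offset+ to EOF will be
# needed soon. len = 0 is assumed to mean "until end of file", as in
# posix_fadvise(2).
def advise_tail(io, offset)
  io.advise(:willneed, offset, 0)
end

Tempfile.create('advise-tail') do |tmp|
  tmp.write('a' * 8192)
  tmp.flush
  # Covers bytes 4096 through the end of the file under the assumption above.
  advise_tail(tmp, 4096)
end
```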

+ * If an error occurs, one of the following exceptions will be raised:
+ *
+ * * IOError - The IO stream is closed.
+ * * Errno::EBADF - The file descriptor of the current file is
+ *   invalid.
+ * * Errno::EINVAL - An invalid value for advice was given.
+ * * Errno::ESPIPE - The file descriptor of the current
+ *   file refers to a FIFO or pipe. (Linux raises Errno::EINVAL
+ *   in this case).

Many OSes may return OS-specific errno values, so we should probably note
that other Errno::* exceptions may be raised.
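Since the exact errno depends on the OS, a caller that treats the hint as best effort could rescue the common superclass instead of individual Errno::* classes. This is a sketch only; `try_advise` is a hypothetical wrapper, not part of the patch.

```ruby
# Hypothetical wrapper: report an OS-level failure instead of propagating it.
# All Errno::* classes descend from SystemCallError.
def try_advise(io, advice, offset = 0, len = 0)
  io.advise(advice, offset, len)
  :ok
rescue SystemCallError => e
  e.class
end

reader, writer = IO.pipe
# A pipe is not seekable, so the result here is platform dependent:
# an Errno::* class where the syscall rejects it, :ok where the call is a no-op.
p try_advise(reader, :sequential)
reader.close
writer.close
```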

Thank you. I've updated the documentation according to your suggestions. Does it look OK?

Don't we need the following?

    if (!advice)
        return Qnil;

Or should we raise a NotImplementedError?

My thinking was that in the bizarre case where none of the POSIX_FADV_* constants were defined, but HAVE_POSIX_FADVISE was, it was acceptable to pass posix_fadvise() 0 for the advice argument. On my system, at least, POSIX_FADV_NORMAL has the value 0, so this makes even more sense. On other systems, posix_fadvise() would presumably return EINVAL in this case, which we would then raise. If this isn't acceptable, perhaps we initialise advice to a sentinel value, then return Qnil if it has this value after the else statement? I'd rather not raise NotImplementedError because otherwise we try to fail silently on platforms without this syscall.

Are there any security issues we need to consider? $SAFE, tainting, trust? #advise already raises a SecurityError when $SAFE=4.
=end

Thank you. I've updated the documentation according to your suggestions. Does it look OK?

Looks good. :)

I would release the GVL when making this call; some implementations may
block the running thread for disk ops (even if only for certain advice).
This is the case with POSIX_FADV_WILLNEED under Linux: the entire range
is read into the page cache before the function returns,
which is very noticeable on a slow device/filesystem with large files,
such as sshfs.
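From the caller's side, the concern can be sketched as follows. The helper `prefetch_in_background` is hypothetical; it issues the hint from a worker thread, which only lets other Ruby threads make progress during the call if the implementation releases the GVL as suggested.

```ruby
require 'tempfile'

# Hypothetical helper: issue a possibly slow :willneed hint from a worker
# thread and return that thread so the caller can join it later.
def prefetch_in_background(path)
  Thread.new do
    File.open(path, 'rb') { |f| f.advise(:willneed) }
  end
end

Tempfile.create('advise-bg') do |tmp|
  tmp.write('z' * 65_536)
  tmp.flush
  worker = prefetch_in_background(tmp.path)
  # Other work would go here while the kernel populates its page cache.
  worker.join
end
```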

|> Please check-in, but I think symbol initialization (:normal etc.) is
|> needed only when posix_fadvise(2) is available.
|
|The advantage of the current approach is that we can fail earlier.
|Otherwise, a script containing io.advise([1,2,3]) would not raise an
|exception under Windows, but would if run under Linux.
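The fail-early behaviour described above could be checked with a sketch like this one. It assumes the argument check runs before any platform-specific code, as the quoted reply says; `advice_error_for` is a hypothetical helper.

```ruby
require 'tempfile'

# Hypothetical helper: return the class of the exception raised for a given
# advice argument, or nil if the call succeeds.
def advice_error_for(io, advice)
  io.advise(advice)
  nil
rescue Exception => e # deliberately broad: capture errors outside StandardError too
  e.class
end

Tempfile.create('advise-args') do |tmp|
  p advice_error_for(tmp, :normal)    # a valid symbol should not raise
  p advice_error_for(tmp, [1, 2, 3])  # a non-symbol should raise on every platform
end
```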