I'm building a set of test kernels with this patchset now and plan to spend
significant time over the next few days running regression tests on it.
Cursory hello-world style testing so far looks good...

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
release.

They have run into other problems using directio in the cifs mount options,
so this isn't going to help them work around the behavior they are seeing.
There would be performance differences anyway, since with directio we
would NOT be using the local pagecache to enhance client performance.
In any case, as Jeff mentioned, what they are seeing is really a different
issue than what the case was originally opened on (and which a fix has been
committed for), so Jeff will need a separate BZ to be filed for this
different problem. This will also mean we need a separate Issue Tracker.
Thanks,
Vince
Internal Status set to 'Waiting on Support'
This event sent from IssueTracker by vincew
issue 134794

Navid,
Actually we need to focus a little closer on what the firefox strace log
is really telling us. I'm going to quote some lines from the firefox
strace file and cite line numbers for purpose of reference below:
lines 76700 - 76703:
6527 open("/mnt/cifs/foo.dat", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0600) = 38
6527 open("/tmp/ft3un2ku.bin", O_RDONLY|O_LARGEFILE) = 39
6527 read(39, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
6527 write(38, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
After doing some initial "setup", firefox open()'s /mnt/cifs/foo.dat
(the DESTINATION file on the CIFS mount) for WRITING, and gets file
descriptor 38 returned from the open() call.
Firefox then open()'s /tmp/ft3un2ku.bin (the SOURCE file) for READING,
and gets file descriptor 39 back for this open() call.
With both files open, it begins reading from fd 39 (the source file) in 8K
chunks and writing each buffer out to fd 38 (the destination on CIFS),
until it hits EOF on the source file:
lines 79517 - 79523
6527 read(39, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
6527 write(38, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
6527 read(39, "", 8192) = 0
6527 close(38 <unfinished ...>
6531 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
6531 gettimeofday({1201511323, 406093}, NULL) = 0
6531 futex(0x8b9f340, FUTEX_WAKE, 1) = 0
Above, we have finished reading the source file (read() on fd 39 returned
0, i.e. EOF), so there is nothing more to read and therefore nothing more
to write to the destination file on fd 38. So we call close() on fd 38,
the destination file on the CIFS mount. We then see a series of
futex/gettimeofday calls from another thread (6531) while the close() call
on the destination file descriptor is still pending. However, we do not
get a normal return from that close(): a -1 ENOSPC error is returned TO
FIREFOX instead:
lines 79545 - 79549
6527 <... close resumed> ) = -1 ENOSPC (No space left on device)
6527 close(39) = 0
6527 lstat64("/tmp/ft3un2ku.bin", {st_mode=S_IFREG|0600, st_size=11534336, ...}) = 0
6527 unlink("/tmp/ft3un2ku.bin") = 0
6527 chmod("/mnt/cifs/foo.dat", 0644) = 0
Firefox fails to see (check for) an error return on the close and proceeds
to close the source file's file descriptor (fd39). Note that a 0 return
code from the close() on the SOURCE file is completely normal and expected
behavior since all we needed to do was READ from this file.
However, we DID get an error returned to Firefox when we tried to close()
the destination file (the file we wanted to write to the CIFS mount) - but
Firefox ignored the error and went on about its business as if the file was
written successfully.
So this is a Firefox problem, and a separate case should be opened against
Firefox for failing to properly check for (and/or handle) errors returned
from close() calls. Jeff suggests that if they are going to fix Firefox
to check the return value on close(), they should also make it do an
fsync() beforehand and check that return value as well.
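For reference, the pattern Jeff is suggesting could look roughly like the
sketch below. This is NOT Firefox code: copy_file_checked() is a
hypothetical helper written only to illustrate the idea, and it assumes a
plain blocking copy in 8K chunks like the one in the strace. The point is
that write(), fsync() and close() each get their return value checked, so
an ENOSPC from the server is reported to the caller instead of being lost.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not taken from Firefox): copy src to dst in 8K
 * chunks. Returns 0 on success, or -1 with errno set on the first error. */
int copy_file_checked(const char *src, const char *dst)
{
    char buf[8192];
    ssize_t nread;
    int saved_errno;

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        saved_errno = errno;
        close(in);
        errno = saved_errno;
        return -1;
    }

    while ((nread = read(in, buf, sizeof buf)) != 0) {
        if (nread < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        /* write() may be short; loop until the whole buffer is handed off. */
        ssize_t off = 0;
        while (off < nread) {
            ssize_t n = write(out, buf + off, (size_t)(nread - off));
            if (n < 0) {
                if (errno == EINTR)
                    continue;
                goto fail;
            }
            off += n;
        }
    }

    /* A successful write() may only mean the data reached the page cache.
     * fsync() forces it out, so errors such as ENOSPC can surface here. */
    if (fsync(out) < 0)
        goto fail;

    close(in);
    /* close() can still report a deferred write error; do not ignore it. */
    if (close(out) < 0)
        return -1;
    return 0;

fail:
    saved_errno = errno;
    close(in);
    close(out);
    errno = saved_errno;
    return -1;
}
```

A caller would treat any -1 return as "the file was NOT saved" and report
errno to the user rather than carrying on as if the download had succeeded.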
Hopefully it is understood now that the problem here is Firefox, *NOT*
CIFS. As such, there is not much more to be done here.
Thank you,
Vince

Please refer them to the write(2) man page, NOTES section at the bottom, to
answer their question. The behavior is by design and a direct result of
using the system page cache. The man page also discusses using fsync()
calls as a way to ensure data is actually being written. This may be worth
mentioning in the ticket opened to request that firefox be fixed to
actually check for errors when close() is called.
--vince

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2008-0314.html