Description of Problem:
If a filesystem that uses the standard page cache routines needs a per-inode
lock in (amongst others) its fsync and write file_operations, then an ABBA
deadlock can occur between file_fsync() and sys_write(). The former grabs i_sem
and then calls the filesystem's fsync routine, which grabs the filesystem's
per-inode lock; the latter calls the filesystem's write routine, which grabs the
filesystem's per-inode lock and then calls generic_file_write(), which grabs
i_sem. Providing a __generic_file_write() entry point that assumes i_sem is
already held would allow the filesystem's write routine to grab i_sem and then
grab the per-inode lock, preserving the locking order in which VFS locks
precede filesystem locks.
I am working on a clustered filesystem that has such a requirement and am
requesting that this otherwise benign change be made.
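
The lock ordering described above can be sketched in user space. This is only
an illustrative model, not kernel code: pthread mutexes stand in for i_sem and
the filesystem's per-inode lock, and every name (model_inode, model_fs_write,
model_fsync, model_generic_file_write_locked, run_model) is hypothetical. It
assumes the proposed entry point is called with i_sem already held, so both
paths take i_sem first and the per-inode lock second.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for an inode; not the kernel's struct inode. */
struct model_inode {
    pthread_mutex_t i_sem;   /* models the VFS per-inode semaphore */
    pthread_mutex_t fs_lock; /* models the filesystem's per-inode lock */
    long writes;             /* protected by i_sem + fs_lock */
    long syncs;              /* protected by i_sem + fs_lock */
};

struct model_job {
    struct model_inode *ino;
    int iterations;
};

/* Models the requested __generic_file_write(): the caller already holds i_sem. */
static void model_generic_file_write_locked(struct model_inode *ino)
{
    ino->writes++;
}

/* Filesystem write routine using the proposed order: i_sem, then fs_lock. */
static void model_fs_write(struct model_inode *ino)
{
    pthread_mutex_lock(&ino->i_sem);
    pthread_mutex_lock(&ino->fs_lock);
    model_generic_file_write_locked(ino);
    pthread_mutex_unlock(&ino->fs_lock);
    pthread_mutex_unlock(&ino->i_sem);
}

/* Models the fsync path: the VFS takes i_sem, then the filesystem takes fs_lock. */
static void model_fsync(struct model_inode *ino)
{
    pthread_mutex_lock(&ino->i_sem);
    pthread_mutex_lock(&ino->fs_lock);
    ino->syncs++;
    pthread_mutex_unlock(&ino->fs_lock);
    pthread_mutex_unlock(&ino->i_sem);
}

static void *writer_thread(void *arg)
{
    struct model_job *job = arg;
    for (int i = 0; i < job->iterations; i++)
        model_fs_write(job->ino);
    return NULL;
}

static void *syncer_thread(void *arg)
{
    struct model_job *job = arg;
    for (int i = 0; i < job->iterations; i++)
        model_fsync(job->ino);
    return NULL;
}

/* Runs both paths concurrently and returns the total number of completed
 * operations. Because both paths use the same lock order, neither thread can
 * hold one lock while waiting on the other in the opposite order. */
long run_model(int iterations)
{
    struct model_inode ino = { .writes = 0, .syncs = 0 };
    struct model_job job = { &ino, iterations };
    pthread_t writer, syncer;
    int rc;

    pthread_mutex_init(&ino.i_sem, NULL);
    pthread_mutex_init(&ino.fs_lock, NULL);

    rc = pthread_create(&writer, NULL, writer_thread, &job);
    assert(rc == 0);
    rc = pthread_create(&syncer, NULL, syncer_thread, &job);
    assert(rc == 0);

    pthread_join(writer, NULL);
    pthread_join(syncer, NULL);

    pthread_mutex_destroy(&ino.fs_lock);
    pthread_mutex_destroy(&ino.i_sem);
    return ino.writes + ino.syncs;
}
```

Swapping the two pthread_mutex_lock() calls in model_fs_write() reproduces the
ABBA ordering from the description, and the model may then hang under
contention.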
Version-Release number of selected component (if applicable):
All kernels.
How Reproducible:
N/A
Steps to Reproduce:
1. N/A
Actual Results:
Expected Results:
Additional Information:

I'm not the copyright owner, so I can't change the licensing. However, I should
point out that this change has been made in the 2.5 development kernel to
support XFS now that it is using the kernel's buffer/page cache rather than its
own, so I assert that this change is generally useful.
Regards,
Tim
