*[PATCH v2 00/10] server-side support for "inter" SSC copy@ 2018-11-30 20:03 Olga Kornievskaia
2018-11-30 20:03 ` [PATCH v2 01/10] VFS generic copy_file_range() support Olga Kornievskaia
` (10 more replies)0 siblings, 11 replies; 37+ messages in thread
From: Olga Kornievskaia @ 2018-11-30 20:03 UTC (permalink / raw)
To: bfields; +Cc: linux-nfs, linux-fsdevel
This patch series adds support for NFSv4.2 copy offload feature
allowing copy between two different NFS servers.
This functionality depends on the VFS ability to support generic
copy_file_range() where a copy is done between an NFS file and
a local file system.
This feature is enabled by the kernel module parameter --
inter_copy_offload_enable -- and by default is disabled. There is
also a kernel compile configuration of NFSD_V4_2_INTER_SSC that
adds dependency on the NFS client side functions called from the
server.
These patches work on top of existing async intra copy offload
patches. For the "inter" SSC, the implementation only supports
asynchronous inter copy.
On the source server, upon receiving a COPY_NOTIFY, it generate a
unique stateid that's kept in the global list. Upon receiving a READ
with a stateid, the code checks the normal list of open stateid and
now additionally, it'll check the copy state list as well before
deciding to either fail with BAD_STATEID or find one that matches.
The stored stateid is only valid to be used for the first time
with a choosen lease period (90s currently). When the source server
received an OFFLOAD_CANCEL, it will remove the stateid from the
global list. Otherwise, the copy stateid is removed upon the removal
of its "parent" stateid (open/lock/delegation stateid).
On the destination server, upon receiving a COPY request, the server
establishes the necessary clientid/session with the source server.
It calls into the NFS client code to establish the necessary
open stateid, filehandle, file description (without doing an NFS open).
Then the server calls into the copy_file_range() to preform the copy
where the source file will issue NFS READs and then do local file
system writes (this depends on the VFS ability to do cross device
copy_file_range().
v2:
-- in on top of 4.20-rc4 + client side inter patch series
-- VFS changes to do enable generic copy_file_range() and then NFS
falls back on generic_copy_file_range() for previous EXDEV/OPNOTSUPP
errors
-- hopefully addressed Bruce's review comments (highlights are):
--- copy_notify patch: addressed naming, sc_cp_list access is
now protected by s2s_cp_lock
--- fillin netloc4 patch: address the size and added WARN_ON
--- add ca_source to COPY: decode only 1 address, dont allocate
memory (the rest into dummy)
--- check stateid against stored: moved the refcount under lock
--- allow stale filehandle: adding a loop to go thru the ops in
the compound, store/manage puttfh if copy is present in the compound
mark the source putfh as "no verify".
All the patches (client inter) and this patch series is available
from git://linux-nfs.org/projects/aglo/linux.git under the "linux-ssc"
branch
Olga Kornievskaia (10):
VFS generic copy_file_range() support
NFS fallback to generic_copy_file_range
NFSD fill-in netloc4 structure
NFSD add ca_source_server<> to COPY
NFSD return nfs4_stid in nfs4_preprocess_stateid_op
NFSD add COPY_NOTIFY operation
NFSD check stateids against copy stateids
NFSD generalize nfsd4_compound_state flag names
NFSD: allow inter server COPY to have a STALE source server fh
NFSD add nfs4 inter ssc to nfsd4_copy
fs/nfs/nfs4file.c | 9 +-
fs/nfsd/Kconfig | 10 ++
fs/nfsd/nfs4proc.c | 406 ++++++++++++++++++++++++++++++++++++++++++++++-----
fs/nfsd/nfs4state.c | 124 ++++++++++++++--
fs/nfsd/nfs4xdr.c | 166 ++++++++++++++++++++-
fs/nfsd/nfsd.h | 32 ++++
fs/nfsd/nfsfh.h | 5 +-
fs/nfsd/nfssvc.c | 6 +
fs/nfsd/state.h | 21 ++-
fs/nfsd/xdr4.h | 37 ++++-
fs/read_write.c | 66 +++++++--
include/linux/fs.h | 7 +
include/linux/nfs4.h | 1 +
mm/filemap.c | 6 +-
14 files changed, 810 insertions(+), 86 deletions(-)
--
1.8.3.1
^permalinkrawreply [flat|nested] 37+ messages in thread

*Re: [PATCH v2 01/10] VFS generic copy_file_range() support
2018-11-30 20:03 ` [PATCH v2 01/10] VFS generic copy_file_range() support Olga Kornievskaia
@ 2018-12-01 8:11 ` Amir Goldstein
2018-12-01 13:23 ` Olga Kornievskaia
2018-12-01 22:00 ` Dave Chinner
2018-12-01 21:18 ` Matthew Wilcox1 sibling, 2 replies; 37+ messages in thread
From: Amir Goldstein @ 2018-12-01 8:11 UTC (permalink / raw)
To: Olga Kornievskaia
Cc: bfields, Linux NFS Mailing List, linux-fsdevel, Dave Chinner,
Matthew Wilcox, Jeff Layton, Steve French
On Fri, Nov 30, 2018 at 10:04 PM Olga Kornievskaia
<olga.kornievskaia@gmail.com> wrote:
>
> Relax the condition that input files must be from the same
> file systems.
>
> Add checks that input parameters adhere semantics.
>
> If no copy_file_range() support is found, then do generic
> checks for the unsupported page cache ranges, LFS, limits,
> and clear setuid/setgid if not running as root before calling
> do_splice_direct(). Update atime,ctime,mtime afterwards.
>
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
This patch is either going to bring you down or make you stronger ;-)
This is not how its done. Behavior change and refactoring mixed into
one patch is wrong for several reasons. And when you relax same sb
check you need to restrict it inside filesystems, like your previous patch
did.
You already had v7 patch reviewed-by 4 developers.
What made you go and change it (and posted as v2)?
Your intentions were good trying to fix the broken syscall, but
I hope you understood that Dave didn't mean that you *have* to
add the missing generic checks as part of your work. He just
pointed out how broken the current interface is in the context of
reviewing your patch.
In any case, I hear that Dave is neck deep in fixing copy_file_range()
so changes to this function should be collaborated with him. Or better
yet, wait until he posts his fixes and carry on from there.
If I were you, I would just go back to the reviewed v7 vfs patch.
Thanks,
Amir.
^permalinkrawreply [flat|nested] 37+ messages in thread

*Re: [PATCH v2 01/10] VFS generic copy_file_range() support
2018-12-01 8:11 ` Amir Goldstein@ 2018-12-01 13:23 ` Olga Kornievskaia
2018-12-01 13:44 ` Olga Kornievskaia
2018-12-01 22:00 ` Dave Chinner1 sibling, 1 reply; 37+ messages in thread
From: Olga Kornievskaia @ 2018-12-01 13:23 UTC (permalink / raw)
To: Amir Goldstein
Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, david, willy, jlayton,
stfrench
On Sat, Dec 1, 2018 at 3:11 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Nov 30, 2018 at 10:04 PM Olga Kornievskaia
> <olga.kornievskaia@gmail.com> wrote:
> >
> > Relax the condition that input files must be from the same
> > file systems.
> >
> > Add checks that input parameters adhere semantics.
> >
> > If no copy_file_range() support is found, then do generic
> > checks for the unsupported page cache ranges, LFS, limits,
> > and clear setuid/setgid if not running as root before calling
> > do_splice_direct(). Update atime,ctime,mtime afterwards.
> >
> > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > ---
>
> This patch is either going to bring you down or make you stronger ;-)
>
> This is not how its done. Behavior change and refactoring mixed into
> one patch is wrong for several reasons. And when you relax same sb
> check you need to restrict it inside filesystems, like your previous patch
> did.
>
> You already had v7 patch reviewed-by 4 developers.
> What made you go and change it (and posted as v2)?
>
> Your intentions were good trying to fix the broken syscall, but
> I hope you understood that Dave didn't mean that you *have* to
> add the missing generic checks as part of your work. He just
> pointed out how broken the current interface is in the context of
> reviewing your patch.
>
> In any case, I hear that Dave is neck deep in fixing copy_file_range()
> so changes to this function should be collaborated with him. Or better
> yet, wait until he posts his fixes and carry on from there.
>
> If I were you, I would just go back to the reviewed v7 vfs patch.
This is NOT a replacement to the v7 vfs patch??? This is a new patch
on top of that one.
I assume that v7 patch has been OK-ed by everybody and is ready to go in???
As you recall, what was left is to provide the functionality to relax
the check for the superblocks to be the same before calling the
do_splice_direct(). This patch attempt do this. I was under the
impression that to do so extra checks were needed to be added which I
added.
>
> Thanks,
> Amir.
^permalinkrawreply [flat|nested] 37+ messages in thread

*Re: [PATCH v2 01/10] VFS generic copy_file_range() support
2018-12-01 13:23 ` Olga Kornievskaia@ 2018-12-01 13:44 ` Olga Kornievskaia
[not found] ` <CAOQ4uxgENLCDH7QwtBPxA60dKEXvLVknBMY_Lgoetq_uQ=7gwA@mail.gmail.com>
0 siblings, 1 reply; 37+ messages in thread
From: Olga Kornievskaia @ 2018-12-01 13:44 UTC (permalink / raw)
To: Amir Goldstein
Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, david, willy, jlayton,
stfrench
On Sat, Dec 1, 2018 at 8:23 AM Olga Kornievskaia
<olga.kornievskaia@gmail.com> wrote:
>
> On Sat, Dec 1, 2018 at 3:11 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Fri, Nov 30, 2018 at 10:04 PM Olga Kornievskaia
> > <olga.kornievskaia@gmail.com> wrote:
> > >
> > > Relax the condition that input files must be from the same
> > > file systems.
> > >
> > > Add checks that input parameters adhere semantics.
> > >
> > > If no copy_file_range() support is found, then do generic
> > > checks for the unsupported page cache ranges, LFS, limits,
> > > and clear setuid/setgid if not running as root before calling
> > > do_splice_direct(). Update atime,ctime,mtime afterwards.
> > >
> > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > > ---
> >
> > This patch is either going to bring you down or make you stronger ;-)
> >
> > This is not how its done. Behavior change and refactoring mixed into
> > one patch is wrong for several reasons. And when you relax same sb
> > check you need to restrict it inside filesystems, like your previous patch
> > did.
> >
> > You already had v7 patch reviewed-by 4 developers.
> > What made you go and change it (and posted as v2)?
> >
> > Your intentions were good trying to fix the broken syscall, but
> > I hope you understood that Dave didn't mean that you *have* to
> > add the missing generic checks as part of your work. He just
> > pointed out how broken the current interface is in the context of
> > reviewing your patch.
> >
> > In any case, I hear that Dave is neck deep in fixing copy_file_range()
> > so changes to this function should be collaborated with him. Or better
> > yet, wait until he posts his fixes and carry on from there.
> >
> > If I were you, I would just go back to the reviewed v7 vfs patch.
>
> This is NOT a replacement to the v7 vfs patch??? This is a new patch
> on top of that one.
>
> I assume that v7 patch has been OK-ed by everybody and is ready to go in???
>
> As you recall, what was left is to provide the functionality to relax
> the check for the superblocks to be the same before calling the
> do_splice_direct(). This patch attempt do this. I was under the
> impression that to do so extra checks were needed to be added which I
> added.
>
To clarify, previously I had a VFS patch with the client-side series
to support "server to server" copy offload. It needed the
functionality to be able to call copy_file_range with different super
blocks.
This patch series is for the server side support for the "server to
server" copy offload. It requires ability to call copy_file_range()
and do a copy between NFS and a local file system. Thus it needs
generic_copy_file_range.
>
> >
> > Thanks,
> > Amir.
^permalinkrawreply [flat|nested] 37+ messages in thread

*Re: [PATCH v2 01/10] VFS generic copy_file_range() support
[not found] ` <CAN-5tyFGV=fUCbAG5mSvy=LXDpdp8VG9Sh1aGMkBHQAG1Rp1sQ@mail.gmail.com>
@ 2018-12-01 16:59 ` Amir Goldstein0 siblings, 0 replies; 37+ messages in thread
From: Amir Goldstein @ 2018-12-01 16:59 UTC (permalink / raw)
To: Olga Kornievskaia
Cc: bfields, Dave Chinner, Jeff Layton, Matthew Wilcox,
linux-fsdevel, Linux NFS Mailing List
On Sat, Dec 1, 2018 at 5:57 PM Olga Kornievskaia
<olga.kornievskaia@gmail.com> wrote:
>
> On Sat, Dec 1, 2018 at 9:03 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> >
> >
> > On Sat, Dec 1, 2018, 3:44 PM Olga Kornievskaia <olga.kornievskaia@gmail.com wrote:
> >>
> >> On Sat, Dec 1, 2018 at 8:23 AM Olga Kornievskaia
> >> <olga.kornievskaia@gmail.com> wrote:
> >> >
> >> > On Sat, Dec 1, 2018 at 3:11 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >> > >
> >> > > On Fri, Nov 30, 2018 at 10:04 PM Olga Kornievskaia
> >> > > <olga.kornievskaia@gmail.com> wrote:
> >> > > >
> >> > > > Relax the condition that input files must be from the same
> >> > > > file systems.
> >> > > >
> >> > > > Add checks that input parameters adhere semantics.
> >> > > >
> >> > > > If no copy_file_range() support is found, then do generic
> >> > > > checks for the unsupported page cache ranges, LFS, limits,
> >> > > > and clear setuid/setgid if not running as root before calling
> >> > > > do_splice_direct(). Update atime,ctime,mtime afterwards.
> >> > > >
> >> > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> >> > > > ---
> >> > >
> >> > > This patch is either going to bring you down or make you stronger ;-)
> >> > >
> >> > > This is not how its done. Behavior change and refactoring mixed into
> >> > > one patch is wrong for several reasons. And when you relax same sb
> >> > > check you need to restrict it inside filesystems, like your previous patch
> >> > > did.
> >> > >
> >> > > You already had v7 patch reviewed-by 4 developers.
> >> > > What made you go and change it (and posted as v2)?
> >> > >
> >> > > Your intentions were good trying to fix the broken syscall, but
> >> > > I hope you understood that Dave didn't mean that you *have* to
> >> > > add the missing generic checks as part of your work. He just
> >> > > pointed out how broken the current interface is in the context of
> >> > > reviewing your patch.
> >> > >
> >> > > In any case, I hear that Dave is neck deep in fixing copy_file_range()
> >> > > so changes to this function should be collaborated with him. Or better
> >> > > yet, wait until he posts his fixes and carry on from there.
> >> > >
> >> > > If I were you, I would just go back to the reviewed v7 vfs patch.
> >> >
> >> > This is NOT a replacement to the v7 vfs patch??? This is a new patch
> >> > on top of that one.
> >> >
> >> > I assume that v7 patch has been OK-ed by everybody and is ready to go in???
> >> >
> >> > As you recall, what was left is to provide the functionality to relax
> >> > the check for the superblocks to be the same before calling the
> >> > do_splice_direct(). This patch attempt do this. I was under the
> >> > impression that to do so extra checks were needed to be added which I
> >> > added.
> >> >
> >>
> >> To clarify, previously I had a VFS patch with the client-side series
> >> to support "server to server" copy offload. It needed the
> >> functionality to be able to call copy_file_range with different super
> >> blocks.
> >>
> >> This patch series is for the server side support for the "server to
> >> server" copy offload. It requires ability to call copy_file_range()
> >> and do a copy between NFS and a local file system. Thus it needs
> >> generic_copy_file_range.
> >
> >
> > Ah. Sorry for the confusion.
> > My comment on change of behavior and refactoring in same patch still hold.
> > My comment about coordinate your work with Dave Chinner still hold.
>
> Understood. I will email Dave directly and coordinate.
>
> > Raise that with a comment about adding test coverage to the new
> > generic cross fs copy API to xfstest.
>
> What kind of extra coverage are you envisioning? Something that
> requires two different file systems mounted and then does a fs copy?
>
Yes, if you add this functionality you should add test coverage for the
added functionality. It's not going to be trivial to add cross fs type tests
to xfstests, but adding cross fs (same type) should be relatively easy
(copy_file_range from test fs to scratch fs).
> > Am I mistaken that this change affects any cross fs copy file range
> > by userspace and not only by kernel nfsd?
>
> That's correct, any cross fs copy is what I'm going for here.
>
Forgive me for being thick. After briefly going over the patches, I still don't
understand if you *need* to add generic cross fs copy to implement
server side copy support in nfsd? Or if you are adding it as an added bonus
to the community along with your SSC patch set?
The first two patches of the series seem unrelated to the rest, but maybe
I'm just not getting the connection?
Thanks,
Amir.
^permalinkrawreply [flat|nested] 37+ messages in thread

*Re: [PATCH v2 01/10] VFS generic copy_file_range() support
2018-12-01 8:11 ` Amir Goldstein
2018-12-01 13:23 ` Olga Kornievskaia@ 2018-12-01 22:00 ` Dave Chinner
2018-12-02 3:12 ` Olga Kornievskaia1 sibling, 1 reply; 37+ messages in thread
From: Dave Chinner @ 2018-12-01 22:00 UTC (permalink / raw)
To: Amir Goldstein
Cc: Olga Kornievskaia, bfields, Linux NFS Mailing List,
linux-fsdevel, Matthew Wilcox, Jeff Layton, Steve French
On Sat, Dec 01, 2018 at 10:11:48AM +0200, Amir Goldstein wrote:
> On Fri, Nov 30, 2018 at 10:04 PM Olga Kornievskaia
> <olga.kornievskaia@gmail.com> wrote:
> >
> > Relax the condition that input files must be from the same
> > file systems.
> >
> > Add checks that input parameters adhere semantics.
> >
> > If no copy_file_range() support is found, then do generic
> > checks for the unsupported page cache ranges, LFS, limits,
> > and clear setuid/setgid if not running as root before calling
> > do_splice_direct(). Update atime,ctime,mtime afterwards.
> >
> > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > ---
>
> This patch is either going to bring you down or make you stronger ;-)
>
> This is not how its done. Behavior change and refactoring mixed into
> one patch is wrong for several reasons. And when you relax same sb
> check you need to restrict it inside filesystems, like your previous patch
> did.
.....
> In any case, I hear that Dave is neck deep in fixing copy_file_range()
> so changes to this function should be collaborated with him. Or better
> yet, wait until he posts his fixes and carry on from there.
Yeah, because I've heard nothing for a month and this is kinda
important, I have a series of 8-9 patches that make all the fixes we
need, push the cross-filesystem checks down into the filesystems,
and let filesystems handle the fallback to a splice based copy
themselves (because there are way more fallback cases than just
EOPNOPSUPP and EXDEV).
I also have a patch for the man page that document all the missing
failure cases, and document where things are filesystem specific or
not.
And I also have a fstests patch that exercises all the failure cases
so that all filesystems will end up behaving the same way for all
the same cases they should.
I'm still sorting out the fstests patch (it requires changes
to xfs_io's copy-range command) so I've got some confidence that the
code actually does what it says in the man page, but I should have
that sorted in a couple of days.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^permalinkrawreply [flat|nested] 37+ messages in thread

*Re: [PATCH v2 01/10] VFS generic copy_file_range() support
2018-12-01 22:00 ` Dave Chinner@ 2018-12-02 3:12 ` Olga Kornievskaia
2018-12-02 15:19 ` Olga Kornievskaia
2018-12-02 20:47 ` Dave Chinner0 siblings, 2 replies; 37+ messages in thread
From: Olga Kornievskaia @ 2018-12-02 3:12 UTC (permalink / raw)
To: david
Cc: Amir Goldstein, J. Bruce Fields, linux-nfs, linux-fsdevel, willy,
jlayton, stfrench
On Sat, Dec 1, 2018 at 5:00 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Sat, Dec 01, 2018 at 10:11:48AM +0200, Amir Goldstein wrote:
> > On Fri, Nov 30, 2018 at 10:04 PM Olga Kornievskaia
> > <olga.kornievskaia@gmail.com> wrote:
> > >
> > > Relax the condition that input files must be from the same
> > > file systems.
> > >
> > > Add checks that input parameters adhere semantics.
> > >
> > > If no copy_file_range() support is found, then do generic
> > > checks for the unsupported page cache ranges, LFS, limits,
> > > and clear setuid/setgid if not running as root before calling
> > > do_splice_direct(). Update atime,ctime,mtime afterwards.
> > >
> > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > > ---
> >
> > This patch is either going to bring you down or make you stronger ;-)
> >
> > This is not how its done. Behavior change and refactoring mixed into
> > one patch is wrong for several reasons. And when you relax same sb
> > check you need to restrict it inside filesystems, like your previous patch
> > did.
> .....
> > In any case, I hear that Dave is neck deep in fixing copy_file_range()
> > so changes to this function should be collaborated with him. Or better
> > yet, wait until he posts his fixes and carry on from there.
>
> Yeah, because I've heard nothing for a month and this is kinda
> important
Dave I think that's unfair. It is important. NFS is actually the file
system that needed VFS support for cross fs copy_file_range and I was
working on it. If you were in doubt, you could have emailed and asked
me.
I'm unsure now what does this mean. I have a patch series with a VFS
patch that went thru the extensive review (people spend time on it)
and an NFS patch series that depends on it that is ready for the
upstream push. Are you saying that the VFS patch is no longer welcomed
and thus NFS series is no longer viable either?
, I have a series of 8-9 patches that make all the fixes we
> need, push the cross-filesystem checks down into the filesystems,
> and let filesystems handle the fallback to a splice based copy
> themselves (because there are way more fallback cases than just
> EOPNOPSUPP and EXDEV).
Are you saying it is each individual filesystem responsibility to
fallback on splice? Isn't that a step backwards? Each individual
filesystem is going to implement the same code of calling
do_splice_direct() to do the functionally that could and should be in
VFS?
>
> I also have a patch for the man page that document all the missing
> failure cases, and document where things are filesystem specific or
> not.
>
> And I also have a fstests patch that exercises all the failure cases
> so that all filesystems will end up behaving the same way for all
> the same cases they should.
>
> I'm still sorting out the fstests patch (it requires changes
> to xfs_io's copy-range command) so I've got some confidence that the
> code actually does what it says in the man page, but I should have
> that sorted in a couple of days.
>
> Cheers,
>
> Dave.
>
> --
> Dave Chinner
> david@fromorbit.com
^permalinkrawreply [flat|nested] 37+ messages in thread

*Re: [PATCH v2 01/10] VFS generic copy_file_range() support
2018-12-02 3:12 ` Olga Kornievskaia@ 2018-12-02 15:19 ` Olga Kornievskaia
2018-12-02 20:47 ` Dave Chinner1 sibling, 0 replies; 37+ messages in thread
From: Olga Kornievskaia @ 2018-12-02 15:19 UTC (permalink / raw)
To: david
Cc: Amir Goldstein, J. Bruce Fields, linux-nfs, linux-fsdevel, willy,
jlayton, stfrench
On Sat, Dec 1, 2018 at 10:12 PM Olga Kornievskaia
<olga.kornievskaia@gmail.com> wrote:
>
> On Sat, Dec 1, 2018 at 5:00 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Sat, Dec 01, 2018 at 10:11:48AM +0200, Amir Goldstein wrote:
> > > On Fri, Nov 30, 2018 at 10:04 PM Olga Kornievskaia
> > > <olga.kornievskaia@gmail.com> wrote:
> > > >
> > > > Relax the condition that input files must be from the same
> > > > file systems.
> > > >
> > > > Add checks that input parameters adhere semantics.
> > > >
> > > > If no copy_file_range() support is found, then do generic
> > > > checks for the unsupported page cache ranges, LFS, limits,
> > > > and clear setuid/setgid if not running as root before calling
> > > > do_splice_direct(). Update atime,ctime,mtime afterwards.
> > > >
> > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > > > ---
> > >
> > > This patch is either going to bring you down or make you stronger ;-)
> > >
> > > This is not how its done. Behavior change and refactoring mixed into
> > > one patch is wrong for several reasons. And when you relax same sb
> > > check you need to restrict it inside filesystems, like your previous patch
> > > did.
> > .....
> > > In any case, I hear that Dave is neck deep in fixing copy_file_range()
> > > so changes to this function should be collaborated with him. Or better
> > > yet, wait until he posts his fixes and carry on from there.
> >
> > Yeah, because I've heard nothing for a month and this is kinda
> > important
>
> Dave I think that's unfair. It is important. NFS is actually the file
> system that needed VFS support for cross fs copy_file_range and I was
> working on it. If you were in doubt, you could have emailed and asked
> me.
Just to be clear. What I think was unfair in that comment was the
wording "this is kinda important". I think a lot stems from lack of
clarity in the the mailing list communications. I object to the fact
that it wasn't clear who was going to implement the functionality.
Since the work was needed by NFS I didn't want to assume that somebody
in VFS would just do it for us. At the time nobody in VFS stood up and
said they would do the work and thus I tried to do my best.
I'm grateful, and would have been in the first place, that somebody
did support generic cross-filesystem functionality. Thus I'm by no
means speaking against Dave's work.
> I'm unsure now what does this mean. I have a patch series with a VFS
> patch that went thru the extensive review (people spend time on it)
> and an NFS patch series that depends on it that is ready for the
> upstream push. Are you saying that the VFS patch is no longer welcomed
> and thus NFS series is no longer viable either?
I'm unclear of the fate of the patch set that has the (v7) VFS patch
that was reviewed and approved and is thought to be pushed for 4.21.
It is unclear if the new work is on top of that or not.
> , I have a series of 8-9 patches that make all the fixes we
> > need, push the cross-filesystem checks down into the filesystems,
> > and let filesystems handle the fallback to a splice based copy
> > themselves (because there are way more fallback cases than just
> > EOPNOPSUPP and EXDEV).
>
> Are you saying it is each individual filesystem responsibility to
> fallback on splice? Isn't that a step backwards? Each individual
> filesystem is going to implement the same code of calling
> do_splice_direct() to do the functionally that could and should be in
> VFS?
>
> >
> > I also have a patch for the man page that document all the missing
> > failure cases, and document where things are filesystem specific or
> > not.
> >
> > And I also have a fstests patch that exercises all the failure cases
> > so that all filesystems will end up behaving the same way for all
> > the same cases they should.
> >
> > I'm still sorting out the fstests patch (it requires changes
> > to xfs_io's copy-range command) so I've got some confidence that the
> > code actually does what it says in the man page, but I should have
> > that sorted in a couple of days.
> >
> > Cheers,
> >
> > Dave.
> >
> > --
> > Dave Chinner
> > david@fromorbit.com
^permalinkrawreply [flat|nested] 37+ messages in thread

*Re: [PATCH v2 01/10] VFS generic copy_file_range() support
2018-12-02 3:12 ` Olga Kornievskaia
2018-12-02 15:19 ` Olga Kornievskaia@ 2018-12-02 20:47 ` Dave Chinner1 sibling, 0 replies; 37+ messages in thread
From: Dave Chinner @ 2018-12-02 20:47 UTC (permalink / raw)
To: Olga Kornievskaia
Cc: Amir Goldstein, J. Bruce Fields, linux-nfs, linux-fsdevel, willy,
jlayton, stfrench
On Sat, Dec 01, 2018 at 10:12:05PM -0500, Olga Kornievskaia wrote:
> On Sat, Dec 1, 2018 at 5:00 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Sat, Dec 01, 2018 at 10:11:48AM +0200, Amir Goldstein wrote:
> > > On Fri, Nov 30, 2018 at 10:04 PM Olga Kornievskaia
> > > <olga.kornievskaia@gmail.com> wrote:
> > > >
> > > > Relax the condition that input files must be from the same
> > > > file systems.
> > > >
> > > > Add checks that input parameters adhere semantics.
> > > >
> > > > If no copy_file_range() support is found, then do generic
> > > > checks for the unsupported page cache ranges, LFS, limits,
> > > > and clear setuid/setgid if not running as root before calling
> > > > do_splice_direct(). Update atime,ctime,mtime afterwards.
> > > >
> > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > > > ---
> > >
> > > This patch is either going to bring you down or make you stronger ;-)
> > >
> > > This is not how its done. Behavior change and refactoring mixed into
> > > one patch is wrong for several reasons. And when you relax same sb
> > > check you need to restrict it inside filesystems, like your previous patch
> > > did.
> > .....
> > > In any case, I hear that Dave is neck deep in fixing copy_file_range()
> > > so changes to this function should be collaborated with him. Or better
> > > yet, wait until he posts his fixes and carry on from there.
> >
> > Yeah, because I've heard nothing for a month and this is kinda
> > important
>
> Dave I think that's unfair. It is important. NFS is actually the file
> system that needed VFS support for cross fs copy_file_range and I was
> working on it. If you were in doubt, you could have emailed and asked
> me.
Last I heard from you was "this isn't my problem and I don't have
time to deal with it". You were fairly unambiguous in saying you
weren't going to spend any time on it.
> I'm unsure now what does this mean. I have a patch series with a VFS
> patch that went thru the extensive review (people spend time on it)
> and an NFS patch series that depends on it that is ready for the
> upstream push. Are you saying that the VFS patch is no longer welcomed
> and thus NFS series is no longer viable either?
No, I'm saying that this is urgent work and needs to be separated
from the NFS patch series, of which there are now two and you've
split copy_file_range() changes across both patch sets.
copy_file_range() is broken for *everyone*, not just NFS. i.e.
fixing these problems should not be tied to some other filesystem
feature patchset.
> , I have a series of 8-9 patches that make all the fixes we
> > need, push the cross-filesystem checks down into the filesystems,
> > and let filesystems handle the fallback to a splice based copy
> > themselves (because there are way more fallback cases than just
> > EOPNOPSUPP and EXDEV).
>
> Are you saying it is each individual filesystem responsibility to
> fallback on splice? Isn't that a step backwards? Each individual
> filesystem is going to implement the same code of calling
> do_splice_direct() to do the functionally that could and should be in
> VFS?
I've done this because one of the problems I've found is that
different filesystems *do not fall back consistently*. e.g. the NFS
client will return -EINVAL if src/dst are the same file, but -EINVAL
is not one of the errors that the vfs code falls back to a data copy
on.
This is despite the fact that the fallback path can copy to/from
the same file, we support same file copy through the
->remap_file_range offload, etc. IOWs, the behaviour of the syscall
when it comes to single file ranges is completely inconsistent
because fallbacks are implemented on a filesystem-by-filesystem
basis.
I called the fallback generic_copy_file_range(), and filesystems that
implement ->copy_file_range() are responsible for calling it
themselves if they want a fallback. That's because there may be
different error/constraint conditions at the filesystem level that
prevent offloading the copy, and we can't distinguish at the VFs
between "-EINVAL means fallback because it was a single file copy"
and "-EINVAL means fail, parameter out of range".
IOWs, if you implement ->copy_file_range() you take full
resposnsibility for implementing the copying function. This is
exactly what we do for all the other file methods, so this is just
making the implementation behaviour consistent with the rest of the
code.
FWIW, this also points out a problem with the copy_file_range()
definition - it does not say WTF should happen if the copy ranges
/overlap/ in the same file. clone is clear on that - support is
determined by the filesystem (i.e. "EINVAL [...] XFS and Btrfs do
not support overlapping reflink ranges in the same file."). For
copying, the fallback code can't copy the file data correctly if the
ranges overlap, so I've added checks to make this illegal and added
that overlapping ranges are not supported to the man page.....
These are the sort of API definition problems that I'm fixing with
right now, and I'm writing tests to make sure that all filesystems
will behave the same way for given copy scenarios.
i.e. I'm not doing this so I can get a NFS feature patchset merged,
I'm doing this to make the copy_file_range API well defined and
robust and allow implementations to be verified against the
specification the man page lays out.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^permalinkrawreply [flat|nested] 37+ messages in thread

*Re: [PATCH v2 01/10] VFS generic copy_file_range() support
2018-12-01 21:18 ` Matthew Wilcox@ 2018-12-01 22:36 ` Dave Chinner0 siblings, 0 replies; 37+ messages in thread
From: Dave Chinner @ 2018-12-01 22:36 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Olga Kornievskaia, bfields, linux-nfs, linux-fsdevel
On Sat, Dec 01, 2018 at 01:18:06PM -0800, Matthew Wilcox wrote:
> On Fri, Nov 30, 2018 at 03:03:39PM -0500, Olga Kornievskaia wrote:
> > Relax the condition that input files must be from the same
> > file systems.
>
> > + ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
> > + count > MAX_RW_COUNT ? MAX_RW_COUNT : count, 0);
>
> Wasn't there a concern about splicing between filesystems with different
> block sizes mentioned the last time this came up? I can't find a citation
> for that now.
the filesystems should be able to handle that themselves - they are
just passes an iter that has a range of data regions in pages that
they copy the required data into/out of. The data transfer mechanism
itself is completely independent of filesystem block sizes....
There's lots of other problems with do_splice_direct, but I don't
think this is one of them. I coul dbe wrong - this code has pretty
much zero documentation on how it is supposed to work and what it is
supposed to do - so don't take my word for it...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^permalinkrawreply [flat|nested] 37+ messages in thread