trans2 response handling

On 7/31/06, Andrew Bartlett <abartlet at samba.org> wrote:
>> On Sun, 2006-07-30 at 16:04 -0700, Murali Bashyam wrote:
> > On samba4 TP2, i am noticing the following problem when there are
> multiple
> > trans2 responses to a single trans2 request.
> >
> > After processing the first response, smb_raw_trans2_recv calls
> > smbcli_request_receive_more to handle the next response message, and
> ends up
> > blocking in epoll_wait waiting for a read event. Some time later when
> the
> > next trans2 response message is received, the wait unblocks, and then
> ends
> > up calling packet_receive (via smbcli_transport_event_handler).
> packet_recv
> > checks for the 'processing' flag in the packet context, and since this
> flag
> > is still set, it simply marks the fde not readable and returns without
> > picking up the next message, ultimately leading to a timeout. The
> > 'processing' flag in the packet context is not reset until we return
> from
> > smb_raw_trans2_recv back to the smbcli_transport_finish_recv function.
> This
> > seems to be leading to a deadlock in the multiple trans2 response
> scenario
> > where the smb client raw layer thinks there is no message pending for it
> > even though there is one and it is ready to process it, and the packet
> layer
> > is serializing the next response waiting for the smb client raw layer to
> > indicate to the packet layer somehow that it's finished with the
> previous
> > response. Is this a known issue with the tp2 snapshot?
> >
> > I don't see any change to fix this problem in these code segments in the
> > latest version of samba 4, and was wondering whether anyone else has
> > observed this.
>> I've been looking into this (thankyou very much for your clear
> analysis), and I think there are a few issues here:
>> Firstly, we should be passing the trans requests though the ntvfs layer,
> not directly to the remote server. That issue is in
> smb_server/smb/trans2.c:trans2_backend(). I think we should only pass
> on unknown trans2 requests (this will also better test our code).
>> The issue you are actually seeing comes from
> ntvfs/cifs/vfs_cifsc:async_trans2() calling smb_raw_trans2_recv().
Correct.
If you fixed the code not to serialise it might work, but I wonder if we
> should alter the logic in smb_raw_trans2_recv() to only loop once? If
> that's not practical (I'm not sure how easy it is for the current code
> to pass on the partial trans2) then we should convert this function into
> a callback based system.
The serialise flag is for the packet context / transport as a whole, and it
does affect the behaviour for any other concurrent operations occurring on
that context. It would fix this issue, but the approach does have a side
effect, though i am not sure of the impact of the side effect. If the impact
is none, then this approach achieves what we want at the smallest code
change.
The current reply and ntvfs code wont handle the partial trans2 well at all
without making significant code changes, the callback based approach seems
to be the cleanest one, it preserves the async nature of this interface. So
let's see :
- rewrite the raw trans2 recv function to loop once and accept a callback,
- process the next chunk of the response whenever it's called by the event
handling logic.
- when it has completely accumulated all the chunks of the trans2 responses,
it turns around and invokes the callback and passes the finished trans2
block to the callback.
Is this what u had in mind?
Murali
Andrew Bartlett
> --
> Andrew Bartlett http://samba.org/~abartlet/> Authentication Developer, Samba Team http://samba.org> Samba Developer, Red Hat Inc. http://redhat.com>>> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.4 (GNU/Linux)
>> iD8DBQBEzqecz4A8Wyi0NrsRAtlLAJoDP59iCdiFCU4N1JjMFd9e7MHOkgCfYzf/
> nEjTpuChlqvzbQJIelhN/OQ=
> =lmfH
> -----END PGP SIGNATURE-----
>>>