Re: [Samba] Samba segs when serving files from a windows partition on

On Tue, Apr 29, 2008 at 10:06:18AM +0100, Edd Barrett wrote:
> Hi,
>
> On Fri, Apr 25, 2008 at 3:00 PM, Edd Barrett wrote:
> > I am willing to test patches. I may have a prod about in the source at
> > some point, but you guys can probably diagnose and fix the fault a
> > whole load better than I can. I have never looked at the samba source
> > before.
>
> It turns out OpenBSD-current has some patches to fix this problem
> which came from FreeBSD, just after the release of 4.2.
>
> Is the samba team interested in taking the patches upstream?
>
> http://www.openbsd.org/cgi-bin/cvswe...-cvsweb-markup
> http://www.openbsd.org/cgi-bin/cvswe...-cvsweb-markup

Unfortunately the patch-lib_replace_repdir_getdirentries_c patch
is completely wrong. It removes the abort assert, but doesn't change
the code that the abort is trying to assert. That whole replace
file assumes that an integral number of directory entries always
fit in a DIR_BUF_SIZE (1<<9) sized buffer. If they don't then
this code simply doesn't work, which is why the abort is called.
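
To make the assumption concrete, here is a minimal sketch in C. It is not the code from lib/replace/repdir_getdirentries.c; the record layout (struct rec_hdr) and the helper names are hypothetical, and only DIR_BUF_SIZE (1<<9) is taken from the text above. It walks a buffer of variable-length records and reports when the last record is cut off by the end of the buffer, which is the situation the abort is there to catch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIR_BUF_SIZE (1 << 9)   /* 512 bytes, the size quoted above */

/* Hypothetical record header, loosely modelled on struct dirent:
 * every record begins with its own total length. */
struct rec_hdr {
    uint16_t reclen;            /* size of the whole record, header included */
};

/* Test helper (hypothetical): write a record header at offset off and
 * return the offset just past that record. */
size_t put_record(unsigned char *buf, size_t off, uint16_t reclen)
{
    struct rec_hdr h;

    assert(reclen >= sizeof(h));
    assert(off + sizeof(h) <= DIR_BUF_SIZE);
    h.reclen = reclen;
    memcpy(buf + off, &h, sizeof(h));
    return off + reclen;
}

/*
 * Count the complete records in buf[0..len). Returns the count, or -1 if
 * the final record does not fit, i.e. the buffer does not hold an
 * integral number of records.
 */
int count_whole_records(const unsigned char *buf, size_t len)
{
    size_t off = 0;
    int n = 0;

    assert(len <= DIR_BUF_SIZE);
    while (off < len) {
        struct rec_hdr h;

        if (len - off < sizeof(h))
            return -1;                      /* not even a full header left */
        memcpy(&h, buf + off, sizeof(h));   /* avoid unaligned reads */
        if (h.reclen < sizeof(h) || h.reclen > len - off)
            return -1;                      /* record straddles the buffer end */
        off += h.reclen;
        n++;
    }
    return n;
}
```

Removing the check without handling the -1 case leaves the caller walking past the end of valid data, which is why dropping the abort alone does not fix anything.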

This file should be removed once we know that this bug has
been fixed in the *BSDs. The comment at the top of it reads:

"This is needed because the existing directory handling in FreeBSD
and OpenBSD (and possibly NetBSD) doesn't correctly handle unlink()
on files in a directory where telldir() has been used. On a block
boundary it will occasionally miss a file when seekdir() is used to
return to a position previously recorded with telldir().

This also fixes a severe performance and memory usage problem with
telldir() on BSD systems. Each call to telldir() in BSD adds an
entry to a linked list, and those entries are cleaned up on
closedir(). This means with a large directory closedir() can take an
arbitrary amount of time, causing network timeouts as millions of
telldir() entries are freed."

Is this still the case? The last time I requested info on this, Terry
Lambert @ Apple claimed that this behavior (not correctly handling unlink()
on files in a directory where telldir() has been used, so that on a block
boundary seekdir() will occasionally miss a file when returning to a position
previously recorded with telldir()) was allowed by POSIX and that there was
no intention of fixing it.
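
For anyone who wants to check a given system, here is a rough probe in C. It is only a sketch under my own assumptions: the function name, the /tmp path and the file names are made up, and it is not taken from Samba. It creates some files, reads about half of them, records the position with telldir(), unlinks the files already read, calls seekdir() to go back to the recorded position, and counts how many files were seen in total. If the count is lower than the number of files created, an entry was missed. Hitting the block-boundary case may need a larger directory than a quick run uses.

```c
#define _XOPEN_SOURCE 700   /* mkdtemp(), telldir(), seekdir() */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_FILES 256

/* "." and ".." are not test files. */
static int is_dot(const char *name)
{
    return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}

/*
 * Returns the number of test files seen across both read phases,
 * or -1 if the test directory could not be set up.
 */
int probe_telldir_unlink(int nfiles)
{
    char dir[] = "/tmp/telldir-probe-XXXXXX";   /* hypothetical location */
    char path[512];
    static char first_half[MAX_FILES][16];
    struct dirent *de;
    DIR *d;
    long pos;
    int i, saved = 0, seen = 0;

    assert(nfiles > 0 && nfiles <= MAX_FILES);
    if (mkdtemp(dir) == NULL)
        return -1;

    for (i = 0; i < nfiles; i++) {
        FILE *f;

        snprintf(path, sizeof(path), "%s/f%04d", dir, i);
        f = fopen(path, "w");
        if (f == NULL) {
            seen = -1;
            goto cleanup;
        }
        fclose(f);
    }

    d = opendir(dir);
    if (d == NULL) {
        seen = -1;
        goto cleanup;
    }

    /* Phase 1: read about half of the files and remember their names. */
    while (saved < nfiles / 2 && (de = readdir(d)) != NULL) {
        if (is_dot(de->d_name))
            continue;
        snprintf(first_half[saved], sizeof(first_half[saved]), "%s", de->d_name);
        saved++;
    }
    seen = saved;
    pos = telldir(d);

    /* Phase 2: unlink what has been read, then return to the saved position. */
    for (i = 0; i < saved; i++) {
        snprintf(path, sizeof(path), "%s/%s", dir, first_half[i]);
        unlink(path);
    }
    seekdir(d, pos);

    /* Phase 3: read the rest; every remaining file should show up here. */
    while ((de = readdir(d)) != NULL) {
        if (!is_dot(de->d_name))
            seen++;
    }
    closedir(d);

cleanup:
    /* Remove whatever is left, ignoring errors for files already unlinked. */
    for (i = 0; i < nfiles; i++) {
        snprintf(path, sizeof(path), "%s/f%04d", dir, i);
        unlink(path);
    }
    rmdir(dir);
    return seen;
}
```

Calling probe_telldir_unlink(n) from a small main() and comparing the result with n is enough to report whether a platform drops entries in this pattern.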

If this is true it puts us at an impasse, as all other POSIX systems
don't behave like this. I did do some work on our directory handling
code in smbd/dir.c by adding a parameter "directory name cache size"
which turns off the performance boost if set to zero. Check out the
(long) bug report here :

The last person to check this reported the change did not work
for him. If this is incorrect, and setting "directory name cache size =
0" works for *BSD systems then I can remove the code in

lib/replace/repdir_getdirentries.c

entirely.

In addition, has the second bug been fixed in the *BSDs (the
"Each call to telldir() in BSD adds an entry to a linked list"
bug)?
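
To show why that matters, here is a toy model in C of the bookkeeping described in the quoted comment. It is not the actual BSD libc code; the toy_* names and structures are invented for illustration. Each call to toy_telldir() allocates one list node, and nothing is released until toy_closedir(), so the cleanup cost grows with the number of telldir() calls made on the directory.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: these are not the real BSD libc data structures. */
struct toy_loc {
    long index;             /* value handed back to the caller */
    long offset;            /* directory offset remembered for seekdir() */
    struct toy_loc *next;
};

struct toy_dir {
    struct toy_loc *locs;   /* one node per telldir() call so far */
    long next_index;
    long offset;            /* current read position */
};

/* Every call adds a node; nothing is freed here. Returns -1 if out of memory. */
long toy_telldir(struct toy_dir *d)
{
    struct toy_loc *l;

    assert(d != NULL);
    l = malloc(sizeof(*l));
    if (l == NULL)
        return -1;
    l->index = d->next_index++;
    l->offset = d->offset;
    l->next = d->locs;
    d->locs = l;
    return l->index;
}

/* All nodes are released only at close; returns how many were freed. */
long toy_closedir(struct toy_dir *d)
{
    long freed = 0;

    assert(d != NULL);
    while (d->locs != NULL) {
        struct toy_loc *next = d->locs->next;

        free(d->locs);
        d->locs = next;
        freed++;
    }
    return freed;
}
```

With one telldir() per entry on a directory of a million files, the model frees a million nodes in a single close, which matches the stall the quoted comment complains about.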

If you give me feedback, I will close this out for 3.2. Unfortunately
it's hard to get anyone on the *BSD side to work on this with me. I
tend to be demand driven, and if someone from the *BSD community is
willing to work directly with me to ensure Samba works on *BSD, I'd
be happy to keep Samba working happily on these platforms. The problem
is that I don't have time to do a lot of testing on *BSD myself.
Guenther Kukkuk is a great example of how this can work.
He drove us to keep fixing bugs in the OS/2 client support and
is now a member of the Samba Team.