On Fri, Sep 24, 2004 at 11:32:26PM -0700, Anthony Baratta wrote:
>
> I rummaged through the code and discovered someone had kindly added
>
> $self->{pid} = $pid;
>
> in the windows_fork of Filter.pm. But I didn't find any references to
> "waitpid".
Ah, we have been through this before. waitpid is called in swish.cgi
and I thought it got added to Filter.pm after this discussion:
http://swish-e.org/Discussion/search/swish.cgi?query=%22thread+safety%22&submit=Search%21&metaname=swishtitle&sort=swishlastmodified
Looks like the discussion didn't finish.
> In "$filter_sub = sub { ... " (Approx. line 1051 in spider.pl), I added
> "waitpid($doc->{pid},0);" just after "my $doc = $filter->convert( .."
> and before "return 1 unless $doc;"
But that won't really work because, as in the case of the PDF
filter, two programs are being run, and $self->{pid} only holds the
PID of the last one.
The correct solution is to make the call to windows_fork (the call to
IPC::Open2) return an object and then have a DESTROY function that
calls waitpid.
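A minimal sketch of that idea, in case it helps. Everything here is an assumption for illustration: ChildHandle is a hypothetical class name (nothing like it exists in Filter.pm), and the child command is just the running perl ($^X) upper-casing its input so the example has something to talk to. The point is only that the object holds the PID and DESTROY reaps it.

```perl
use strict;
use warnings;

# Hypothetical wrapper class, not part of SWISH::Filter.
package ChildHandle;

use IPC::Open2;

sub new {
    my ( $class, @command ) = @_;
    my ( $rdr, $wtr );

    # open2 returns the child's PID and dies if the program can't be started.
    my $pid = open2( $rdr, $wtr, @command );
    return bless { pid => $pid, rdr => $rdr, wtr => $wtr }, $class;
}

sub pid    { $_[0]{pid} }
sub reader { $_[0]{rdr} }
sub writer { $_[0]{wtr} }

sub DESTROY {
    my $self = shift;

    # Don't let the cleanup clobber the caller's status variables.
    local ( $?, $!, $@ );

    # Drop our handles first so a child still reading STDIN sees EOF
    # and exits; otherwise waitpid could block.
    delete @{$self}{qw( wtr rdr )};

    # Reap the child so it doesn't linger in the process table.
    waitpid $self->{pid}, 0 if $self->{pid};
}

package main;

my $output;
{
    # Stand-in child: the current perl, upper-casing whatever it reads.
    my $child = ChildHandle->new( $^X, '-e', 'print uc $_ while <STDIN>' );

    print { $child->writer } "hello\n";
    close( $child->writer );
    $output = readline( $child->reader );
}    # $child goes out of scope here, so DESTROY runs and calls waitpid

print $output;    # prints "HELLO"
```

With this shape, a filter that runs two programs just ends up with two objects, and each one cleans up after itself when it goes out of scope.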
Another way might be to save all the PIDs. So in the windows_fork()
function:

    push @{$self->{pid}}, $pid;

Then in convert():

    eval {
        local $SIG{__DIE__};
        $filtered_doc = $filter->filter($doc_object);
    };

    # clean up Windows process table
    if ( ref $doc_object->{pid} ) {
        waitpid $_, 0 for @{ $doc_object->{pid} };
        delete $doc_object->{pid};
    }

Can you test that one? I'm not sure how long it takes to test --
maybe you could create a list of links to a bunch of small PDFs on
your local machine so it will run fast.
Or, if you can figure out how to use Win32::Process and avoid
IPC::Open2 completely, that would be another option.
I wonder what happens on Win98. I thought I tried there once and $pid
was always the same number.
--
Bill Moseley
moseley@hank.org