Hi --
This issue with httpd illustrates a difficulty when working with
reslists and sub-pools (note that the patches attached are not a complete
solution; that's still forthcoming, I hope):
http://issues.apache.org/bugzilla/show_bug.cgi?id=39985
The reslist code itself never creates any sub-pools with
apr_pool_create(). It passes the pool given to apr_reslist_create()
to the constructor function whenever a resource needs to be created.
If the constructor function merely allocates its opaque structure
out of the pool, all is (mostly) OK. An internal mutex ensures that two
constructors won't try to allocate out of the memory pool at the same time.
The destructor doesn't have to do anything related to memory management.
As a sidebar, there are two minor issues that still arise from this
simple case. First, the caller has to remember not to use the pool they
passed to apr_reslist_create() in any other places where it might
execute at the same time as a constructor function, since the caller
doesn't have access to the reslist's internal mutex. Second, over
the lifetime of the reslist, the reslist will gradually leak memory,
since there's no way to free the memory allocated in the constructor.
The more serious consequence, and the one that pertains to the
httpd bug report, occurs if one tries to plug the memory leak. Suppose
the constructor function creates a sub-pool, and then allocates its
opaque structure out of that sub-pool. The destructor can then
call apr_pool_destroy() on the sub-pool and that will free the memory
used by the opaque structure. Now the reslist won't leak memory over
time (from the constructor, anyway).
Assume, though, that a well-behaved program will eventually call
apr_pool_destroy() on the pool that was passed to apr_reslist_create().
Now apr_reslist_create() registers an internal cleanup function on this
pool, and this cleanup function iterates over all current resources
and calls the destructor function on them.
But, before the cleanup function runs, apr_pool_destroy() will
have recursively called itself on all the sub-pools of the reslist's
pool, including all the sub-pools created by the constructor function.
Thus when the reslist's cleanup function runs and calls the destructor
on a resource, the resource's sub-pool and opaque structure have
been freed; the destructor will issue a second call to apr_pool_destroy()
on the sub-pool and likely cause the program to crash.
I'm just finishing up a proposed solution to this for httpd, but it
depends on a fortuitous situation which can't be expected in most
well-behaved programs.
My immediate thought is that the reslist code could be changed
to (a) create a sub-pool in apr_reslist_create(), so that callers
don't have to avoid using the pool elsewhere, and (b) creating a sub-pool
prior to each invocation of the constructor. These per-resource sub-pools
would be passed to the constructor, so it could allocate without having to
worry about leaks. Also, the destructor would be registered as a cleanup
function on the sub-pool.
To destroy a resource, just calling apr_pool_destroy() on the
resource's sub-pool would then call the destructor automatically.
It would be possible to simplify or remove the main cleanup function,
because there would be no need to iterate over all the resources
and call the destructor -- this would happen automatically within
apr_pool_destroy().
The internal mutex might be able to used less extensively, because
you wouldn't need to ensure that two constructor functions couldn't
run at the same time; they'd be allocating out of private sub-pools
and therefore be thread-safe (at least in this regard).
The problem I see with this change is that it would break existing
applications, like httpd, that work around the current behaviour.
If a program's destructor function already calls apr_pool_destroy() on
any sub-pools created in its constructor, this would cause the same
kind of double-free as before, because these sub-pools would already
have been destroyed before the destructor ran. (The destructor would
now be a cleanup function registered on the per-resource sub-pool,
the one that was passed to the constructor.)
So I'm not sure much can be attempted until, perhaps, APR 2.x.
Maybe one can change such things in with a minor version release, but
it seems to me that while it might not break strict binary compatibility,
it would be a surprise to anyone upgrading to 1.3.
Is there anywhere to put suggestions for APR 2.x? Are there other
solutions or problems that I haven't seen? Thanks,
Chris.
--
GPG Key ID: 366A375B
GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B