I did experiment with cleaning the threadlocals from their owning thread by doing it in org.apache.tomcat.util.threads.ThreadPoolExecutor.afterExecute(Runnable, Throwable)
It does work (i.e. it improves leak protection for the http://wiki.apache.org/tomcat/MemoryLeakProtection#webappClassInstanceAsThreadLocalIndirectValue case ), but since it clear all thread locals after each request, it breaks some optimizations that are done by some frameworks, applications or even tomcat itself.
I'm experimenting with the approach of recreating the thread pool, it seems cleaner and more efficient.

It shouldn't clean all thread locals after each request, it should clean all thread locals loaded by a web application.
If the cleaning breaks apps or frameworks then those apps/frameworks are broken since they are leaking memory and should be fixed. One option to reduce side-effects would be only to clean the thread locals if the WebAppClassloader is stopped. The downside is that memory leaks won't be fixed immediately.
The ThreadLocals created by Tomcat will be loaded by the common class loader. Those shouldn't be causing any memory leaks, even in the embedded case since Tomcat will stop all the threads it created.

Created attachment 25411[details]
Patch for tomcat 7 to renew threads
Here is finally my proposition of patch for tomcat 7 to recreate the ThreadPoolExecutor of each Executor when a Context is stopped.
I also removed the clearReferencesThreadLocals property on WebApp[Class]Loader since my patch makes it useless and I think this feature is too unsafe.
I'm working on another patch for tomcat 6. There's a little more work to do because of the old WorkerThread stuff that has been replace by ThreadPoolExecutor in trunk.

Created attachment 26068[details]
Alternative approach to renewing threads
Instead of completely renewing the entire thread pool, it's possible to do it incrementally, which might be less invasive, in the event of a large number of applications being updated at the same time.
This patch adds a TaskThread class and modifies the StandardContextValve. The goal is to terminate a class after it's used, when it's discovered to be older than the Context which it last serviced.
This approach wouldn't be as comprehensive as replacing the entire pool, but should cause the pool to refresh incrementally over time. Further enhancement could include self-cleaning of thread locals etc.

Pid, I applied your patch but it does not work.
Actually when the thread calls join(0) on itself it just leads to a deadlock : the Thread will remain forever in the join method unless it is interrupted.
Furthermore your idea works only if a new context is started. If a context is stopped and no other context is started, Threads won't be renewed.

(In reply to comment #6)
> Pid, I applied your patch but it does not work.
> Actually when the thread calls join(0) on itself it just leads to a deadlock :
> the Thread will remain forever in the join method unless it is interrupted.
It's experimental admittedly. I'll clean up my SVN repo & retry it, thought it was working...
If it won't work at all, then the idea is dead, but there must be a way to terminate an individual thread & remove it from the pool.
> Furthermore your idea works only if a new context is started.
Or an existing one is restarted, the point is that it's a compromise.
The idea was to have a lower impact than restarting the whole thread pool.
> If a context is stopped and no other context is started, Threads won't be renewed.
If a context is stopped and no other context is started, how will the server work?

Created attachment 26074[details]
proposed patch to renew threads one by one with a delay
I explored your idea and managed to force threads to "kill themselves" after a context is stopped.
Here are the main points of my new patch :
- I save the last time a context is stopped in a static variable of StandardContext (this could be improved to record it somewhere at the Service or Engine level, in a instance variable)
- The "suicide" is done in ThreadPoolExecutor.afterExecute(), in case the thread was created before the last context was stopped.
- the rate at which threads are renewed is throttled to avoid renewing all threads at the same time (otherwise it would have been the same as my initial proposal to renew the whole pool).
- I had to make TaskQueue extend ArrayBlockingQueue instead of LinkedBlockingQueue because the latter does not propose to have a "fair" policy. Fairness is required so that every thread of the pool is given the opportunity to be killed.
- Unlike LinkedBlockingQueue, ArrayBlockingQueue requires a reasonable size (not Integer.MAX_VALUE). I set it to 100 by default (configurable with the maxQueueSize property at the Connector or Executor level)
I'm quite happy with this patch except the fairness requirement :
- The javadocs of ArrayBlockingQueue say that it may decrease performance, but I don't how much
- Since each thread of the pool is used in turn, we might lose some caching effects
- and it also prevents the size of the pool to be decreased in case of low activity : without fairness, some Thread could timeout waiting for an element from the queue while other threads could serve new requests. Eventually the pool could decide to decrease it size.
What do you think ? If we don't use a fair queue, we do not have a deterministic behavior and some thread might still trigger a memory leak...

> What do you think ? If we don't use a fair queue, we do not have a
> deterministic behavior and some thread might still trigger a memory leak...
Unless we have minSpareThreads=0. This way even threads that are no longer used will eventually be stopped (the timeout is 60s by default, maxIdleTime in StandardThreadExecutor and hardcoded in AbstractEndpoint.createExecutor() )

Created attachment 26097[details]
Patch for tomcat 7 to renew threads one by one with a delay without requiring a fair queue
I worked a little more on my patch to avoid requiring a fair queue. I actually reused part of my very first patch (the Listener). Basically here is what is done :
- idle threads are stopped all at once, even core pool threads. This is done by ThreadLocalLeakPreventionListener
- like my previous patch, active threads are stopped one by one with a delay.
All in all, this avoids performance impacts under load since only idle threads will be stopped as a whole, and other threads will be renewed slowly.
This memory leak protection is not 100% deterministic since one could think of scenarios where the load decreases just after a leaking context is stopped, so that some (core pool) threads are no longer used but still present in the pool and thus triggering the memory leak...
But it's a compromise between performance impacts and the duration of the memory leak.
I'd be glad to have some feedback on this patch!

Created attachment 26108[details]
tc7 renew threads one by one in a bounded time
New patch with further improvements : now the behavior is more deterministic, with an upper bound to the time necessary to renew all the threads of the pool. The upper bound is something like N*max(threadKeepAliveTimeout, longestRequest + threadRenewalDelay). Where N is the number of threads in the pool and longestRequest is the maximum time a request takes to be processed (of course it depends on the application).
This is really a worst case scenario and it should be much quicker in usual cases, something closer to max(threadKeepAliveTimeout, longestRequest).
I still have to make the threadRenewalDelay configurable (hardcoded to 1s for now) and to propose some cleanups in WebAppClassLoader since this patch makes the unsafe-and-disabled-by-default ThreadLocal cleaning obsolete. (it's still interesting to introspect the ThreadLocals to detect potential leaks and warn the user, it will help to improve libraries and applications...).
Any volunteer for a review before I continue finalizing this patch ?

Created attachment 26156[details]
patch tc7 renew threads RC2
patch "RC2" for tomcat 7 : using ProtocolHandler.getExecutor() instead of testing each possible implementation (the method was not present on ProtocolHandler a couple of weeks ago).

Created attachment 26157[details]
patch tc6 renewThreads 2010-10-11
Patch for tomcat 6.
It is essentially the same behavior as for tomcat 7, excepted that it requires that (at least) one Executor be configured and the Connectors be configured to use that(these) Executor(s).
So, the memory leak protection is not enabled by default, unlike for tomcat 7 which always uses an Executor.

Though my proposed patch renews threads in the pool, there are actually 2 other types of threads that can prevent an application from being garbage collected if it has ThreadLocal-related leaks :
- the "main" thread (the one when starting up tomcat)
- the ContainerBackgroundProcessor thread
(- the shutdown hook thread too, but since the JVM is shutting down it does not really matter)
- in an embedded scenario, the thread of the "containing" application
Currently when a webapp is started or stopped, the lifecycle callbacks into the webapp (context listener, filters, servlet) are executed by one of those threads.
I'm working on a patch where I spawn a thread that runs just during those callbacks then shuts down, and so far it prevents such leaks.

rev 1044145 : committed some more enhancements in case a web application creates some ThreadLocals that it does not clean during its startup or shutdown phases (calls to Listeners, Filters and Servlets). A dedicated thread is used in these phases.
Will be available for 7.0.6

This is ASF Bugzilla: the Apache Software Foundation bug system. In case
of problems with the functioning of ASF Bugzilla, please contact
bugzilla-admin@apache.org.
Please Note: this e-mail address is only for reporting problems
with ASF Bugzilla. Mail about any other subject will be silently
ignored.