Tomcat 7 log's this when doing a shutdown:
The web application [/MyApp] created a ThreadLocal with key of type [org.apache.log4j.helpers.ThreadLocalMap] (value [org.apache.log4j.helpers.ThreadLocalMap@1d3e069]) and a value of type [java.util.Hashtable] (value [{}]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.
I'm using it the following way:
try {
if (MDC.get(..) == null) {
MDC.put(.., ...);
}
....
} finally {
MDC.remove(...);
}
After taking a quick look at the source of log4j i think the problem is that remove only removes the entry from the HashTable but does not call remove on the ThreadLocal itself.

Created attachment 28091[details]
submit a patch for memory leak in application server
This patch file just contains the "((ThreadLocalMap)tlm).remove();" to fix the problem below:
SEVERE: The web application [/my_app] created a ThreadLocal with key of type [org.apache.log4j.helpers.ThreadLocalMap] (value [org.apache.log4j.helpers.ThreadLocalMap@568614]) and a value of type [java.util.Hashtable] (value [{userId=:unknown:, monitorID=0000000000000000}]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.
Dec 19, 2011 12:02:09 AM org.apache.catalina.loader.WebappClassLoader clearThreadLocalMap

I'm monitoring this report for quiet some time now and I don't understand the problem why it is not fixed.
This is one of the major sources of memory leaks in tomcat which bugges every developer for quiet some time now.
The fix would be something like a 3-liner and is a complete no-brainer: simply call the remove method via reflections if it exists.
So here is the "patch":
try {
ThreadLocalMap.class.getMethod( "remove" ).invoke( tlm );
} catch( NoSuchMethodException e ){
// Do nothing
}
This won't fix the bug for the 1.4 people but for the 90% of the rest of us which aren't bound to old problems anyway.
So please please fix this.

Is there a particular reason I can't vote for this bug (as I usually can in bugzilla)? I guess votes would be through the roof if voting was activated/allowed...
Do you (log4j devs) really want all of us affected user to create our own custom log4j version just to incorporate this fix?

Ok, the patch was accepted, modified and applied I get that.
a) Shouldn't this ticket be marked as 'resolved'?
b) What's the ETA for 1.2.17?
Also, I believe there's a second issue. The way it's implemented right know clients must explicitly call MDC.clear(). Couldn't/shouldn't remove() call clear() internally once the last item was removed()?
How do clients know when to call clear()? In a request processing chain (e.g. Servlet filter) several parties may write items and each one of them is responsible to call remove() for the items it added. On the way it items are added, on the way out they're removed in reverse order. And the last guy closes the door i.e. calls clear(). However, how does a client know that he's the last?

Hello Marcel,
1) yes- but lets wait if other devs have comments to your questions below. I reposted them to the ml
2) I think the release should happen immediately after we agree on this fix. I have volunteered to make this release, but haven't done this before. I cannot give you a date, but I will try to release a few days after your questions are answered and after everybody agrees to the content of 1.2.17
Cheers
Christian

Created attachment 28436[details]
clear the MDC when last key is removed + testcase
I agree that remove() should call clear when the last key was removed for the reason described by marcel@frightanic.com
I have attached a trivial fix and a small test-case with a simplified version of the code that tomcat uses to check for memory leaks.

Maarteen,
I made your change in MDC.java, recompiled the code with the build.xml and created new log4j.jar and still get errors in a Struts 2 application running on Tomcat:
created a ThreadLocal with key of type [org.apache.log4j.helpers.ThreadLocalMap] (value [org.apache.log4j.helpers.ThreadLocalMap@13c339f]) and a value of type [java.util.Hashtable] (value [{userId=jo2372}]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.

We have vote recently to drop jdk 1.3 - means, no, we don't need to take care on Java 1.1 anymore (thanks heaven).
This also makes the patch easier, because in the current version I took care on jdk 1.3 users.

Hello Marcel,
at the ASF we need to vote on our releases. I have create a first RC candidate before 2 days, but need to create another one today. Then add 72 hours voting time to it. After that I can push it to maven. So sorry, it needs another little while, but we are close :-)
We will send an announcement to our users mailinglist and the dev list - feel free to join :-). In addition we will announce it on our blog:
http://blogs.apache.org/logging/
Cheers
Christian

It’s almost frightening how long it takes for such a popular and widely used project to get a new release out.
It’s even more alarming that the reason for the delay is that they do not know how to build the release package anymore (see above mentioned thread at the mailing list for details).

Hi Mattias,
the log4j build is highly complex due to windows dll building. Imagine a build system which has grown for the last 10 years or so. This is an issue which must be fixed soon. Actually there are people who know how to build it, but as we are a small team of volunteers we are in the bad situation that we still miss a required +1.
Once we have this release out I will put pressure on the refactoring of the log4j build system. It actually must become easier, otherwise it will become more and more difficult to release log4j1.
You are welcome to the project if you want to help us. Please join the log4j-dev mailinglist.
Cheers
Christian

(In reply to comment #30)
> Hello all,
> 1.2.17 is in the wild and can be downloaded now.
> thanks for your patience
> Cheers!
Are the updated files at:
http://logging.apache.org/log4j/1.2/download.html
Tried the jar, still getting ...[webapp] created a ThreadLocal with key of type [org.apache.log4j.helpers.ThreadLocalMap] (value [org.apache.log4j.helpers.ThreadLocalMap@f6af3b]) and a value of type [java.util.Hashtable] (value [{userId=jo2372}]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.

It took one and a half year to "fix" this bug, two months to get a release out and now the issue is still occurring?
I must be dreaming - this cannot be true!
Has nobody tested this fix by just putting a simple war (which is putting some MDC value, logging only one line) to a fairly modern tomcat instance, start and stop the app and just took a look at the log files?
My feelings about this project is getting even worse.
Sorry for the rant...

I hate to ask the obvious, but was MDC.remove() called for all the items? I notice that the message Jim is getting still has a userId. The fix for this issue removes the underlying Hashtable only when all the MDC values are removed.

Isn't that what clear() is for?
When somebody calls clear() then ThreadLocal.remove() is called which should remove the whole map. Imho on container shutdown MDC.clear() should be called. Therefore I would think
1) Hierarchy.shutdown() should call MDC.clear()
2) MDC.clear should do this after clearing: MDC.mdc = new MDC();
Wdyt?

Yes, that is what clear is for.
I don't see how Hierarchy.shutdown() can call MDC.clear(). Well, it could but it would only clear the thread the shutdown is performed on which isn't likely to help much.
Item 2 is likely to cause many problems. MDC.clear only clears the current thread's ThreadLocal. Reinitializing the MDC will cause problems for all the other threads.

Jim, it seems the usual way to hold your memory clean is to call the clear() method appropriate.
Ralph Goers explained on the mailinglist:
>> All the MDC is is a ThreadLocal where each Thread has its own Map associated
>> with the ThreadLocal. You can get and put all you want. In a webapp, of
>> course, if you add stuff to the MDC at the beginning of a request then you
>> need to remove it all at the end of the request. The normal pattern is
>>
>> MDC.put(key, value);
>> try {
>> call servlet
>> } finally {
>> MDC.remove(key);
>> }
As far as I understand it, using the MDC.clear() method should help you in this case (with 1.2.17). In a servlet environment, you might want to put this code into the Servlet.destroy() method:
http://docs.oracle.com/javaee/6/api/javax/servlet/Servlet.html#destroy()
"This method gives the servlet an opportunity to clean up any
resources that are being held (for example, memory, file handles,
threads)"
Could you try out if that helps you? If it does, we'll provide accordingly documentation and finally close the issue afterwards. If it does not, we'll do some more bug hunting.
Thank you!

In comment #12 I raised this issue before...
"Couldn't/shouldn't remove() call clear() internally once the last item was removed()?" For the reasons given in my comment (Servlet filter chain) you want to clear the MDC/ThreadLocal with each thread and not just when the Servlet/web app is destroyed.
With comment #15 a patch was supplied that promises to do just that?

(In reply to comment #38)
> [...] In a servlet environment, you might want to put
> this code into the Servlet.destroy() method:
I don't think that calling the clear method in the servlet's destroy method does the job. The destroy method may be invoked by a background thread of the application server and would remove the ThreadLocal value only from this background thread, not form the HTTP worker threads.
I agree with Marcel that a servlet filter is the best option:
public class ClearMDCFilter implements Filter {
@Override
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
try {
chain.doFilter(request, response);
} finally {
MDC.clear();
}
}
[...]
}
This way, you don't have to care about removing all keys you set somewhere in your code.

In reply to comment #42)
> Yes, looks good to me. I just mentioned it again because calling clean() in
> Servlet#destroy() doesn't make sense in my opinion. In fact, clients
> shouldn't have to call clean() at all if they properly call remove() for
> each key they add.
Sorry, make that clear() instead of clean().

Thanks Stephan, I agree with you guys. I will use your code snippet for the FAQ.
Thanks Marcel for confirming the change.
Once the FAQ entry is ready, I will close this issue again with the appropriate link.

(In reply to comment #44)
> Thanks Stephan, I agree with you guys. I will use your code snippet for the
> FAQ.
While Stephan's code certainly is correct (actually, log4j could provide that filter) and does the job I wouldn't be happy to see this mentioned prominently in the FAQ.
You should make clear to users that if using Servlet filters the proper pattern is something like
try {
MDC.put(myKey);
chain.doFilter(request, response);
} finally {
MDC.remove(myKey);
}
With Stephan's filter you have to be careful to configure it in the right position in web.xml (they're processed sequentially in configuration order). If the filter runs before the last piece of code that depends on the MDC is completed you're in trouble.

I can confirm that the memory leak has been fixed for our product after upgrading to 1.2.17. I tested with Tomcat 7.0.26 running on an Oracle JVM 1.6.0_23 on Windows 7.
Thanks to all the people contributing to this fix! You made my day!

What if you do not know which key/value pairs have been put to the MDC (if you think of 3rd party libraries, runtime dependent logic etc.)?
Is calling MDC.clear() at the end of doFilter() of a servlet filter sufficient?
I am really looking forward to some kind of Log4j shutdown method (which should be called during servlet context shutdown) which ensures that all ThreadLocal objects have been removed.

You can receive all the keys with doing:
MDC.getContext();
It returns a Hashmap. For each key in the Hashmap, call remove(key).
If you have ideas for such a shutdown() Method, please propose it on log4j-dev
Cheers!

@grobmeier
You missed a question in my last comment:
Is calling MDC.clear() at the end of doFilter() of a servlet filter sufficient?
IMHO (just took a quick look at the source), a shutdown() method should just set the field MDC.tlm to null (which references the ThreadLocalMap) - it should be a one-liner.

@Patric:
yes I missed your question. Calling MDC.clear() does clear all values from the underlying hashtable and afterwards calls the remove method (if available, which is from java > 1.4). So yes, it should be enough.
Where should the shutdown method reside? In MDC? What is the benefit over clear()? The shutdown method would only affect the current thread btw - other threads would keep the tlm.

>Where should the shutdown method reside? In MDC?
Within MDC would be ok, but for a matter of good design I would put this on an higher level, e.g. inside LogManager.shutdown()
>What is the benefit over clear()?
>The shutdown method would only affect the current thread btw - other threads >would keep the tlm.
In fact, the opposite is true:
From a real-world perspective, all we need is one method which cares about all threads. This method would be called (for example) in the contextDestroyed() method of a web application and would ensure that the application will never leak again after it has been shut down.
IMHO, this would be a one-liner. (just set tlm to null - instead of calling tlm.remove()).

(In reply to comment #52)
> From a real-world perspective, all we need is one method which cares about
> all threads.
I claim to live in the same real world as you but my perspective is still different. It's called /Thread/Local for a reason. The one thread that adds anything to MDC is responsible for removing it afterwards. If a thread doesn't clean up behind itself it leaves it up for someone else to do the dirty work - not nice.
What you propose is IMO nothing but a fallback for applications that don't adhere to that simple pattern. If log4j offered code for this it would actually discourage folks to use MDC the way it is supposed to be used.

>If a thread doesn't clean up behind itself it leaves it up for someone else to >do the dirty work - not nice.
I totally agree - that's way it should be.
Unfortunately this seems not clear for everybody and there will always be enough code around (e.g. 3rd party libraries) which violates this principle (just look at this bug - it took a long time to get log4j cleaning up the resources it acquired)
>What you propose is IMO nothing but a fallback for applications that don't adhere to that simple pattern.
Yes, that's true. But a failsafe which prevents a crash of a production server is better than insisting a perfect world.
And a shutdown procedure that has some effect on other threads is not uncommon - almost any shutdown() I am aware of concerns the outer world/other threads aswell (if you think of connection pools, application servers, operating systems...).

I don't think that setting tlm = null on shutdown would make a big difference.
When your web application is undeployed, all classes loaded by the web app's classloader should get unloaded and GC'ed, including the log4j classes. Once the MDC class has been unloaded, the MDC singleton instance and the ThreadLocal instance referenced by it should get collected. And once the ThreadLocal instance is gone, all values stored for worker threads of the application server may get collected as well.
If you deploy log4j in a way that multiple web applications may share a common logger hierarchy, you don't want to call shutdown when you undeploy one of the web applications. A shutdown would affect the other web applications.
Or do I miss something?

>Or do I miss something?
I suppose you are.
I'll let some details out but I hope that this helps nevertheless:
The problem is that a ThreadLocal is always implicitly strongly referenced by the thread which it's tied to (until the thread dies).
Because most application servers used thread pools the thread will most probably remain alive.
As a result, a reference to the ThreadLocal is still kept.
If the ThreadLocal references an object of the web application, the web application classloader can also not be garbage collected and most of the memory of the (already shut down) application cannot be freed.
This is exactly the reason why tomcats memory leak detection mechanism is correctly issuing the warning (which this bug report is about).

Folks,
I would like to suggest you to bring this discussion to the mailinglist: log4j-dev@logging.apache.org
At the moment this discussion gets only a low exposure as we are mainly discussing new features on the mailinglist (and not on closed issues :-))
Cheers!
Christian

(In reply to Gary Gregory from comment #60)
> Why not use the latest?
Because you probably don't want the test yourself whether the leak still exists if you could simply check the issue list: http://jira.qos.ch/browse/SLF4J-257 - still open.

This is ASF Bugzilla: the Apache Software Foundation bug system. In case
of problems with the functioning of ASF Bugzilla, please contact
bugzilla-admin@apache.org.
Please Note: this e-mail address is only for reporting problems
with ASF Bugzilla. Mail about any other subject will be silently
ignored.