Details

Description

AsyncLogger caches a thread's name in a thread-local info variable. I make use of a thread pool where the submitted Runnables call Thread.setName() at the beginning of their task and the thread name is included in the log message. For an example of this behavior, see org.jboss.netty.util.ThreadRenamingRunnable in Netty 3.x. With the cached thread name, the log messages will contain whatever name the thread had when it logged for the first time and so long as the thread doesn't terminate (such as in a core pool thread), all log messages involving this thread will be erroneous. If Thread.getName has a significant performance impact for async logging, I would be satisfied if this behavior were configurable, perhaps on a per-logger basis, so that the penalty only needs to be taken by users who make use of Thread.setName()

Remko Popma
added a comment - 12/Dec/13 01:24 Anthony, thanks for your feedback.
Caching the thread name did give a significant performance improvement, so I would like to keep it as the default,
but making it configurable is certainly possible.
I'm still thinking about how to do this. (Perhaps do this as part of LOG4J2-321 .)
Do you need this to be configurable on a per-logger basis or is that a nice-to-have and would a global setting work for you?

Gary Gregory
added a comment - 12/Dec/13 01:53 Was caching the thread name a premature optimization?
It saves creating one string for each call to Thread#getName() , which would be called once per log event.
Rather than adding the complexity of making this configurable, why not simply back out the caching?

Remko, thank you for your quick feedback. My use case does not require this to be configurable on a per-logger basis; a global setting would suit me fine. I mentioned the potential for it being set on a per-logger basis simply as a middle-ground option if there was an expedient method for enabling such.

Anthony Baldocchi
added a comment - 12/Dec/13 02:43 Remko, thank you for your quick feedback. My use case does not require this to be configurable on a per-logger basis; a global setting would suit me fine. I mentioned the potential for it being set on a per-logger basis simply as a middle-ground option if there was an expedient method for enabling such.

Anthony, understood, thanks!
I've given it some thought and the per-logger config would be a lot more complex than a global config, so I'll start with the simple solution and will consider a per-logger solution if a strong use case comes up.

Remko Popma
added a comment - 12/Dec/13 03:22 Anthony, understood, thanks!
I've given it some thought and the per-logger config would be a lot more complex than a global config, so I'll start with the simple solution and will consider a per-logger solution if a strong use case comes up.

I find it hard to believe that a method call and one new String per log event makes a 10% difference. Can you create a benchmark before we go on and make things even more complicated than they are? 10% of what is also an issue

Gary Gregory
added a comment - 12/Dec/13 14:52 Hi Remko,
I find it hard to believe that a method call and one new String per log event makes a 10% difference. Can you create a benchmark before we go on and make things even more complicated than they are? 10% of what is also an issue
Gary

Gary, I've tested the performance difference like you asked. I used the existing benchmark test that I used to generate the performance numbers for Async Loggers: org.apache.logging.log4j.core.async.perftest.*
I modified the PerfTestDriver class to comment out all tests except for "Loggers all async", and tested for 1, 2 and 4 threads. Here are the results:

I conclude that caching the thread name gives a significant performance benefit.

I attached the PerfTestDriver class so you can run the tests yourself to verify these results if you want. Each run takes about 30 minutes. To toggle between Thread.currentThread().getName() and the cached thread name, you need to modify AsyncLogger and recompile.

Anthony,
I've added an option to switch off thread name caching with system property -DAsyncLogger.ThreadNameStrategy=UNCACHED

Ideally this should be configurable in the log4j2 configuration file (and there is a Jira ticket LOG4J2-321 to address this) but this option is consistent with other options for tuning AsyncLogger behaviour.

Remko Popma
added a comment - 03/Jan/14 12:56 Anthony,
I've added an option to switch off thread name caching with system property -DAsyncLogger.ThreadNameStrategy=UNCACHED
Ideally this should be configurable in the log4j2 configuration file (and there is a Jira ticket LOG4J2-321 to address this) but this option is consistent with other options for tuning AsyncLogger behaviour.
Fixed in revision 1555076.
Please verify and close.

Using a single-threaded executor, thread names in my test log output change when the Runnable calls setName when tuning this option to UNCACHED and remain fixed with CACHED, which is the expected behavior. Verified and thanks.

Anthony Baldocchi
added a comment - 03/Jan/14 18:23 Using a single-threaded executor, thread names in my test log output change when the Runnable calls setName when tuning this option to UNCACHED and remain fixed with CACHED, which is the expected behavior. Verified and thanks.

CACHED still has an advantage over UNCACHED, but there seems to be a lot of variance between runs. Especially the 4-thread results have too much variance to be useful. (This may be because my laptop only has 4 virtual cores, and occasionally running OS tasks skew the results.)

Hi Scott!
Good question, not sure.
Disk access should be efficient as the RandomAccessFileAppender does a lot of buffering and becomes more efficient (batching multiple events together into a single I/O call) when it falls behind the producer. It could be lock contention, although use of the Disruptor in Async Loggers should resolve most of that. If your have 4 cores then the single-threaded part of the test should show around 75% idle CPU.
How many cores do you have? (What is an MBP, by the way?)

Can you post your results here? (Just the throughput numbers is fine.)

Remko Popma
added a comment - 04/Jan/14 04:04 Hi Scott!
Good question, not sure.
Disk access should be efficient as the RandomAccessFileAppender does a lot of buffering and becomes more efficient (batching multiple events together into a single I/O call) when it falls behind the producer. It could be lock contention, although use of the Disruptor in Async Loggers should resolve most of that. If your have 4 cores then the single-threaded part of the test should show around 75% idle CPU.
How many cores do you have? (What is an MBP, by the way?)
Can you post your results here? (Just the throughput numbers is fine.)

Scott Deboy
added a comment - 04/Jan/14 04:25 The test fails part way through for me since it isn't resolving the slf4j classes: Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
I am only getting through the single threaded tests before it fails.
Assume I need the slf4j API and an implementation? What's the right command to get those included?
MBP means macbook pro..it's a mid 2012, 2.3 i7, SSD Samsung 840 pro w/16G, on Mavericks. Once I have a good command to run I'll update the bug with results.

Remko Popma
added a comment - 04/Jan/14 04:50 (away from PC)
You shouldn't need slf4j to run the modified PerfTestDriver attached to this Jira.
Just found one issue with that PerfTestDriver: it doesn't pass command line system properties to the PerfTest process it creates, so it will only allow one to test the CACHED scenario.
Will fix PerfTestDriver and attach to this Jira.

Scott Deboy
added a comment - 04/Jan/14 05:22 I'm not sure why, but I do get an slf4j error as you see above:
Logback: Sync (single thread) (1/5)...Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory

Remko Popma
added a comment - 04/Jan/14 06:29 Scott,
If you see "Logback..." you are not using the PerfTestDriver class attached to this Jira. The attached PerfTestDriver has all tests commented out except the Log4J2 AsyncLogger test.
I've attached a new version of PerfTestDriver to this Jira that correctly passes command line property -DAsyncLogger.ThreadNameStrategy=UNCACHED (or CACHED) to the actual test process.
To run this test:
update log4j2 project from SVN
replace log4j2/log4j-core/src/test/java/org/apache/logging/log4j/core/async/perftest/PerfTestDriver.java with the file attached to this Jira
build with mvn clean install -DskipTests
cd log4j2/log4j-core/target
java -DAsyncLogger.ThreadNameStrategy=UNCACHED -cp .;log4j-core-2.0RC1-SNAPSHOT-tests.jar;log4j-core-2.0RC1-SNAPSHOT.jar;../../log4j-api/target/log4j-api-2.0RC1-SNAPSHOT.jar;test-classes;c:/users/remko/.m2/repository/com/lmax/disruptor/3.2.0/disruptor-3.2.0.jar org.apache.logging.log4j.core.async.perftest.PerfTestDriver
(You probably need to modify the path to disruptor-3.2.0.jar )

The one thing that the enterprise hardware has going for it is that there is very little variance between runs.

Summary (in million ops/sec):

Cached

Uncached

1 thread

3.0

2.4

2 threads

1.7

1.3

4 threads

1.4

1.1

On the Windows and Unix platforms I tested, caching the thread name seems to boost performance. I can't explain Scott's results. However, I think the results as a whole justify the slightly added complexity of caching the thread name.