Add LOG.isDebugEnabled() guard for each LOG.debug("...")

Details

Description

Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some cases it's expensive to construct the string that is being printed to log. It's much easier to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case reason whether it's necessary or not).

[exec] There appear to be 2 release audit warnings before the patch and 2 release audit warnings after applying the patch.[exec][exec][exec][exec][exec] -1 overall.[exec][exec] +1 @author. The patch does not contain any @author tags.[exec][exec] +1 tests included. The patch appears to include 18 new or modified tests.[exec][exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages.[exec][exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.[exec][exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.[exec][exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.[exec]

Javadoc warnings are unrelated to the patch:

[exec][javadoc] Constructing Javadoc information...[exec][javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/SecurityUtil.java:39: warning: sun.security.jgss.krb5.Krb5Util is Sun proprietary API and may be removed in a future release[exec][javadoc] import sun.security.jgss.krb5.Krb5Util;[exec][javadoc] ^[exec][javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/SecurityUtil.java:40: warning: sun.security.krb5.Credentials is Sun proprietary API and may be removed in a future release[exec][javadoc] import sun.security.krb5.Credentials;[exec][javadoc] ^[exec][javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/SecurityUtil.java:41: warning: sun.security.krb5.PrincipalName is Sun proprietary API and may be removed in a future release[exec][javadoc] import sun.security.krb5.PrincipalName;[exec][javadoc] ^[exec][javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/KerberosName.java:31: warning: sun.security.krb5.Config is Sun proprietary API and may be removed in a future release[exec][javadoc] import sun.security.krb5.Config;[exec][javadoc] ^[exec][javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/KerberosName.java:32: warning: sun.security.krb5.KrbException is Sun proprietary API and may be removed in a future release[exec][javadoc] import sun.security.krb5.KrbException;[exec][javadoc] ^[exec][javadoc] /home/steffl/work/svn.isDebugEnabled/common-trunk/src/java/org/apache/hadoop/security/KerberosName.java:81: warning: sun.security.krb5.Config is Sun proprietary API and may be removed in a future release[exec][javadoc] private static Config kerbConf;[exec][javadoc] ^[exec][javadoc] ExcludePrivateAnnotationsStandardDoclet[exec][javadoc] Standard Doclet version 1.6.0_06[exec][javadoc] Building tree for all the packages and classes...[exec][javadoc] Building index for all the packages and classes...[exec][javadoc] Building index for all classes...[exec][javadoc] Generating /home/steffl/work/svn.isDebugEnabled/common-trunk/build/docs/api/stylesheet.css...[exec][javadoc] 6 warnings

I've got a feeling that Scott is very strong about moving Hadoop to SLF4J.

Moreso that if there was going to be a lot of work to build custom wrappers and change logging signatures across the code-base, that SLF4J should be a strong contender since that is what it does.
If building such a thing is outside the scope here, then guard statements surely work.

In a blue-sky world, SLF4J (or some framework) would modify bytecode to push its guard statement from inside its log method to around it at call sites. But now I'm just dreaming.

I have had good experience with a proprietary logging wrapper that lowered string concatenation churn caused by method invocations in the past – many years before SLF4J did it – and with an additional 200 or so method signatures to avoid autoboxing in 99% of use cases. A bit ugly, but very effective and we didn't have to teach developers when or why to guard their logging statements.

Scott Carey
added a comment - 28/Aug/10 03:03 I've got a feeling that Scott is very strong about moving Hadoop to SLF4J.
Moreso that if there was going to be a lot of work to build custom wrappers and change logging signatures across the code-base, that SLF4J should be a strong contender since that is what it does.
If building such a thing is outside the scope here, then guard statements surely work.
In a blue-sky world, SLF4J (or some framework) would modify bytecode to push its guard statement from inside its log method to around it at call sites. But now I'm just dreaming.
I have had good experience with a proprietary logging wrapper that lowered string concatenation churn caused by method invocations in the past – many years before SLF4J did it – and with an additional 200 or so method signatures to avoid autoboxing in 99% of use cases. A bit ugly, but very effective and we didn't have to teach developers when or why to guard their logging statements.

Tsz Wo Nicholas Sze
added a comment - 28/Aug/10 02:18 Thanks everyone for spending time on this issue. Again, I do not opposite the other ideas but they are just outside the scope of this issue. You are very welcome to work on them.
(Erik is running tests on his patch.)

I've got a feeling that Scott is very strong about moving Hadoop to SLF4J. This clearly does not fall within the scope of this issue as I explained before. But if you would like to invest in this subject you are welcome to create an issue and provide arguments for the transition, may be your experience with SLF4J in Avro, performance comparison, etc.

Konstantin Shvachko
added a comment - 28/Aug/10 02:15 I've got a feeling that Scott is very strong about moving Hadoop to SLF4J. This clearly does not fall within the scope of this issue as I explained before. But if you would like to invest in this subject you are welcome to create an issue and provide arguments for the transition, may be your experience with SLF4J in Avro, performance comparison, etc.

Konstantin, thanks very much for running this benchmark! It sounds like that, or any other LOG.debug() call that might be called while servicing an otherwise inexpensive RPC request, should be guarded with an isDebugEnabled() or replaced with an slf4j or some other wrapper that avoids string concatenation.

Doug Cutting
added a comment - 28/Aug/10 01:45 Konstantin, thanks very much for running this benchmark! It sounds like that, or any other LOG.debug() call that might be called while servicing an otherwise inexpensive RPC request, should be guarded with an isDebugEnabled() or replaced with an slf4j or some other wrapper that avoids string concatenation.

I think Doug's request to provide benchmarks is reasonable. We should benchmark more.
So I modified NNThroughputBenchmark, to benchmark getFileStatus(). In current trunk getFileStatus() calls Groups.getGroups(), which has a LOG.debug() in it. In NNThroughputBenchmark you can specify a logLevel, so if I run it with logLevel = DEBUG the debug message from Groups.getGroups() is printed for every getFileStatus() call, and no other messages are printed.

I compared two versions of the code with logLevel = INFO:

Unmodified Groups.getGroups(), where the argument for the debug call is calculated as a sum of strings.

Modified Groups.getGroups() with if(LOG.isDebugEnabled()) before the logging statement.

I increased the namespace to 100,000 files and performed that many calls of getFileStatus() for both setups. I still see 3-4% improvement in case (2).
The gain is less because it is absorbed by the increased cost of navigating down the namespace tree.

Did the same with 1,000 files. (2) is better about 12-13%.

Groups.getGroups() is a very popular method it is a part of user authentication, so this log message is a part of each and every name-node call. In many cases the cost of concatenating strings in it will be absorbed by disk io operations, but it is still good to have this improvement.

I'll post the patch for NNThroughputBenchmark in another jira shortly.

Konstantin Shvachko
added a comment - 28/Aug/10 01:31 I think Doug's request to provide benchmarks is reasonable. We should benchmark more.
So I modified NNThroughputBenchmark , to benchmark getFileStatus() . In current trunk getFileStatus() calls Groups.getGroups() , which has a LOG.debug() in it. In NNThroughputBenchmark you can specify a logLevel, so if I run it with logLevel = DEBUG the debug message from Groups.getGroups() is printed for every getFileStatus() call, and no other messages are printed.
I compared two versions of the code with logLevel = INFO :
Unmodified Groups.getGroups() , where the argument for the debug call is calculated as a sum of strings.
Modified Groups.getGroups() with if(LOG.isDebugEnabled()) before the logging statement.
Results:
10,000 calls of getFileStatus() in (1) yield 22,905 ops/sec
10,000 calls of getFileStatus() in (2) yield 24,610 ops/sec
This about 7% difference.
I increased the namespace to 100,000 files and performed that many calls of getFileStatus() for both setups. I still see 3-4% improvement in case (2).
The gain is less because it is absorbed by the increased cost of navigating down the namespace tree.
Did the same with 1,000 files. (2) is better about 12-13%.
Groups.getGroups() is a very popular method it is a part of user authentication, so this log message is a part of each and every name-node call. In many cases the cost of concatenating strings in it will be absorbed by disk io operations, but it is still good to have this improvement.
I'll post the patch for NNThroughputBenchmark in another jira shortly.

Suresh Srinivas
added a comment - 27/Aug/10 21:20 I had posted my +1 for the patch earlier. But echoing what Eli has said already, we should open another jira and explore some of the good ideas that this jira has generated.

Seems like a version of the patch that only added the ifs statements when needed (when the log argument is not a constant string) would be a nice middle ground since it addresses both the performance and code readability concerns, and does not require a lot of effort. I think we should implement Konstantin/Doug's proposal in a separate jira and make forward progress on this one.

Eli Collins
added a comment - 27/Aug/10 20:04 Seems like a version of the patch that only added the ifs statements when needed (when the log argument is not a constant string) would be a nice middle ground since it addresses both the performance and code readability concerns, and does not require a lot of effort. I think we should implement Konstantin/Doug's proposal in a separate jira and make forward progress on this one.

Doug, I don't see your point - the patch only makes the code consistent, as I mentioned previously lot of debug() calls are already guarded by isDebugEnabled().

If I started a new practice or tried to bring in new coding style/practice I would understand your point. But defending inconsistent code?

Patch might not be nice to look at but:

it's best from performance viewpoint

even though it adds code still LESS code is executed

code is less readable on small scale (individual statement is longer) but more readable on big scale - no surprises (all debug messages are same), easy to maintain (easy to find debug messages that are not guarded thus potential performance problems), it's obvious that debug message is optimal (don't need to investigate whether it's on critical path, whether it's good enough - in some cases the arguments themselves might be expensive, i.e. it's a function call or a new object is created)

it's relatively standard practice (i.e. java language doesn't offer much better solution, as seen by this long discussion)

Erik Steffl
added a comment - 27/Aug/10 19:45 Doug, I don't see your point - the patch only makes the code consistent, as I mentioned previously lot of debug() calls are already guarded by isDebugEnabled().
If I started a new practice or tried to bring in new coding style/practice I would understand your point. But defending inconsistent code?
Patch might not be nice to look at but:
it's best from performance viewpoint
even though it adds code still LESS code is executed
code is less readable on small scale (individual statement is longer) but more readable on big scale - no surprises (all debug messages are same), easy to maintain (easy to find debug messages that are not guarded thus potential performance problems), it's obvious that debug message is optimal (don't need to investigate whether it's on critical path, whether it's good enough - in some cases the arguments themselves might be expensive, i.e. it's a function call or a new object is created)
it's relatively standard practice (i.e. java language doesn't offer much better solution, as seen by this long discussion)

-0. This patch is not well motivated. Reasonable alternatives have been proposed. No urgency has been demonstrated. No great effort was required in creating this patch: if it goes stale it would be easy to reconstruct. Further effort to reach a solution acceptable to all before anything is committed seems well warranted.

Doug Cutting
added a comment - 27/Aug/10 17:44 -0. This patch is not well motivated. Reasonable alternatives have been proposed. No urgency has been demonstrated. No great effort was required in creating this patch: if it goes stale it would be easy to reconstruct. Further effort to reach a solution acceptable to all before anything is committed seems well warranted.

This issue draw up an unbelievably long discussion. Erik's patch is great. Of course, there is a room for improvement as always, and as many of other patches. Quite a few people have proposed "better" ideas but only Erik has provided patches. I have fine with ideas like Doug's or Konstantin's as long as some one is willing to work on it. IMO, trying to make the logging perfect is a kind of wasting resources in this community.

Konstantin and I have +1'ed earlier but no one has -1'ed yet. I will commit Erik's patch 24 hours later if there is no -1.

Tsz Wo Nicholas Sze
added a comment - 27/Aug/10 01:50 This issue draw up an unbelievably long discussion. Erik's patch is great. Of course, there is a room for improvement as always, and as many of other patches. Quite a few people have proposed "better" ideas but only Erik has provided patches. I have fine with ideas like Doug's or Konstantin's as long as some one is willing to work on it. IMO, trying to make the logging perfect is a kind of wasting resources in this community.
Konstantin and I have +1'ed earlier but no one has -1'ed yet. I will commit Erik's patch 24 hours later if there is no -1.

What has been shown is that over 80% of the performance problem, if it exists (I believe it does, but no one has actually shown that here or in the sibling tickets), is corrected by removing String concatenation in favor of varargs and Object[] creation.

So therefore Object[] creation is at least 5x faster than string concatenation. Generally speaking, autoboxing and transient creation of Object[] would be similar in cost, so it is probably closer to 10x faster.

Pushing on the SLF4J project (which is actively developed, unlike Log4j and commons-logging, etc) to add a few method signatures that reduce Object[] creation and autoboxing feels like a better idea to me than re-inventing SLF4J – which was created in the first place to solve this problem along with a few others.

Scott Carey
added a comment - 27/Aug/10 01:16 What has been shown is that over 80% of the performance problem, if it exists (I believe it does, but no one has actually shown that here or in the sibling tickets), is corrected by removing String concatenation in favor of varargs and Object[] creation.
So therefore Object[] creation is at least 5x faster than string concatenation. Generally speaking, autoboxing and transient creation of Object[] would be similar in cost, so it is probably closer to 10x faster.
Pushing on the SLF4J project (which is actively developed, unlike Log4j and commons-logging, etc) to add a few method signatures that reduce Object[] creation and autoboxing feels like a better idea to me than re-inventing SLF4J – which was created in the first place to solve this problem along with a few others.

No, I have not demonstrated that the cost string concatenation in debug existing log messages is insignificant, but neither has anyone else to my knowledge shown that this cost is significant. I'm willing to accept optimizations that don't make the code any larger without proof of their significance, but optimizations that increase code size should first be shown to significantly improve performance, no?

Doug Cutting
added a comment - 26/Aug/10 20:52 No, I have not demonstrated that the cost string concatenation in debug existing log messages is insignificant, but neither has anyone else to my knowledge shown that this cost is significant. I'm willing to accept optimizations that don't make the code any larger without proof of their significance, but optimizations that increase code size should first be shown to significantly improve performance, no?

Tsz Wo Nicholas Sze
added a comment - 26/Aug/10 20:41 > ... , given that even that the string concatenation has not shown to be significant, ...
Doug, have you done any benchmark to support the above statement?

> SLF4J [ ...] has the regular method void debug(String msg) so people will just use it as they did before.

Not if we have an automated test that looks for string concatenation in log statements. That's required for your proposal too, no?

> Migrating to SLF4J will mean changing logging for all levels in the entire project(s).

SLF4J wraps log4j, just like commons logging. So existing log configuration & format should be unaffected.

> creating extra object[] is a performance problem

No performance problems have actually been identified. Some have been suggested, but none shown to be significant in real operations.

That said, micro-benchmarks show that, were there a performance problem, getting rid of string concatenation would reduce the impact of logging by around 90%. Removing Object[] allocation would improve it further, but it's not clear, given that even that the string concatenation has not shown to be significant, that this would be a significant improvement. Plus, the majority of log statements have a single parameter and would not require an Object[] allocation. So, unless someone can show that Object[] allocation in multi-parameter log statements is slowing things significantly, slf4j would suffice.

Doug Cutting
added a comment - 26/Aug/10 20:16 > SLF4J [ ...] has the regular method void debug(String msg) so people will just use it as they did before.
Not if we have an automated test that looks for string concatenation in log statements. That's required for your proposal too, no?
> Migrating to SLF4J will mean changing logging for all levels in the entire project(s).
SLF4J wraps log4j, just like commons logging. So existing log configuration & format should be unaffected.
> creating extra object[] is a performance problem
No performance problems have actually been identified. Some have been suggested, but none shown to be significant in real operations.
That said, micro-benchmarks show that, were there a performance problem, getting rid of string concatenation would reduce the impact of logging by around 90%. Removing Object[] allocation would improve it further, but it's not clear, given that even that the string concatenation has not shown to be significant, that this would be a significant improvement. Plus, the majority of log statements have a single parameter and would not require an Object[] allocation. So, unless someone can show that Object[] allocation in multi-parameter log statements is slowing things significantly, slf4j would suffice.

Yes, SLF4J does provide a way to log more efficiently, but it also has the regular method void debug(String msg) so people will just use it as they did before. In my approach I intentionally excluded it from the api.

Migrating to SLF4J will mean changing logging for all levels in the entire project(s). This issue is targeting only debug messages.
Logging is a big deal. apache.commons worked for us for a long time, will SLF4J - I don't know. There are analytic tools out there relying on the format of the logs, will SLF4J break them? All this needs investigation. Just optimizing debug logging will not affect anything except improving the performance.

In fact, people in this discussion mentioned that creating extra object[] is a performance problem. I tried to address this issue in my proposal. And I think this makes a big difference compared to SLF4J, because 90% of the calls will end up executing the optimized versions of logDebug().

Konstantin Shvachko
added a comment - 26/Aug/10 19:59 I am reluctant to migrate to SLF4J at the moment for 3 main reasons.
Yes, SLF4J does provide a way to log more efficiently, but it also has the regular method void debug(String msg) so people will just use it as they did before. In my approach I intentionally excluded it from the api.
Migrating to SLF4J will mean changing logging for all levels in the entire project(s). This issue is targeting only debug messages.
Logging is a big deal. apache.commons worked for us for a long time, will SLF4J - I don't know. There are analytic tools out there relying on the format of the logs, will SLF4J break them? All this needs investigation. Just optimizing debug logging will not affect anything except improving the performance.
In fact, people in this discussion mentioned that creating extra object[] is a performance problem. I tried to address this issue in my proposal. And I think this makes a big difference compared to SLF4J, because 90% of the calls will end up executing the optimized versions of logDebug().

I too would prefer we simply switch to slf4j. In performance-critical code that requires complex log messages we might still ocasionally use isDebugEnabled(), as we might make other local code optimizations in critical sections. To be clear, I find Konstantin's approach acceptable, but feel a move to slf4j would be simpler with essentially the same benefits.

String concatenation is the biggest potential performance problem. Adding an automated test that detects string concatenation in log statements would be a good addition. Before we can add that test though, we need a logging API that does not require string concatenation and does not bloat code. Slf4j and Konstantin's approach both provide one.

Doug Cutting
added a comment - 26/Aug/10 18:27 Scott> This is not much different than just using SLF4J [ ... ]
I too would prefer we simply switch to slf4j. In performance-critical code that requires complex log messages we might still ocasionally use isDebugEnabled(), as we might make other local code optimizations in critical sections. To be clear, I find Konstantin's approach acceptable, but feel a move to slf4j would be simpler with essentially the same benefits.
String concatenation is the biggest potential performance problem. Adding an automated test that detects string concatenation in log statements would be a good addition. Before we can add that test though, we need a logging API that does not require string concatenation and does not bloat code. Slf4j and Konstantin's approach both provide one.

This is not much different than just using SLF4J, the only difference is that there would be some extra Object[] creation with SLF4J (which is really not a big performance problem, about the same as auto-boxing). But SLF4J would have a much faster and easier to use formatting implementation, and have performance benefits for INFO and other debug levels.

Additionally, we can still create a wrappper around that to add signatures to prevent object array creation and/or autoboxing for a limited subset of cases.

In short, why not just use SLF4J's more featured and flexible wrapper and either extend that or push for adding a few extra signatures on that side? It even has a tool to go through the source code and change stuff from log4j for you.

Scott Carey
added a comment - 26/Aug/10 17:51 This is not much different than just using SLF4J, the only difference is that there would be some extra Object[] creation with SLF4J (which is really not a big performance problem, about the same as auto-boxing). But SLF4J would have a much faster and easier to use formatting implementation, and have performance benefits for INFO and other debug levels.
Additionally, we can still create a wrappper around that to add signatures to prevent object array creation and/or autoboxing for a limited subset of cases.
In short, why not just use SLF4J's more featured and flexible wrapper and either extend that or push for adding a few extra signatures on that side? It even has a tool to go through the source code and change stuff from log4j for you.

Each statementLog.debug(a1 + a2 + ...)
in current code is replaced by one of the methods above:Logger.logDebug(Log, a1, a2, ...)
Looking at our code, about 90% of debug calls will be replaced by the first 4 methods,
which do not compute the arguments and do not instantiate an array object.
Most of the rest debug calls will be converted into a logDebug(Log log, Object... args) call.
Which is less efficient, but still faster than building a string.
The remaining calls can be transformed into "formatter" version. We should make sure these are not
performance-critical parts of the code, or otherwise rewrite the messages to be able to call one
of the optimized versions of logDebug().
Autoboxing of primitive types should be ok with these less efficient versions of logDebug().

The implementation of logDebug() should use optimized string building tools:

The JavaDoc for logDebug() methods should state which methods are optimized, and
what to avoid when calling them, like primitive type autoboxing, string concatenation, etc.

This will give us all three goals above, up to a point. Efficiency and readability should be obvious.
The tool to verify should be checking that there are no "naked" calls of ".debug" anywhere but
in the Logger. And that calls of logDebug() do not contain "+" operations.

This is not the ideal solution, but it is practical imo.

Future directions

In the future we can turn org.apache.hadoop.log.Logger into a log wrapper by converting
the static methods into non static, and adding methods logInfo(), logError(), logWarn().
This should optimize logging in Hadoop in general. Non optimal string building in info-level
logging hearts the performance right now as I write it.
The idea is to replace org.apache.commons.logging.Log with org.apache.hadoop.log.Logger.logging.Log will become a member of the Logger.
This is a bigger change and should be done in a different jira.

Konstantin Shvachko
added a comment - 26/Aug/10 00:53 This issue
Based on what is said in this jira and discussions with developers I'd like to make the following proposal.
There are 3 main goals the change should be targeting for:
debug() statement optimization : do not calculate the arguments if they are not going to be logged.
code readability : avoid if-debug-enabled statement before logging.
new code verification : provide an automatic procedure or simple rules to verify that the newly introduced code does not introduces that same inefficiencies.
My approach is a variant of Doug's proposal and SLF4J formatting.
Let's create a new class org.apache.hadoop.log.Logger
We will have only static methods in the class for now:
class Logger {
// Optimized logging
static void logDebug(Log log, Object a1);
static void logDebug(Log log, Object a1, Object a2);
static void logDebug(Log log, Object a1, Object a2, Object a3);
static void logDebug(Log log, Object a1, Object a2, Object a3, Object a4);
// not efficient, do not use on performance critical paths
static void logDebug(Log log, Object ... args);
static void logDebug(Log log, String formatter, Object ... args);
static void logDebug(Log log, Throwable e, Object ... args);
}
Each statement
Log.debug(a1 + a2 + ...)
in current code is replaced by one of the methods above:
Logger.logDebug(Log, a1, a2, ...)
Looking at our code, about 90% of debug calls will be replaced by the first 4 methods,
which do not compute the arguments and do not instantiate an array object.
Most of the rest debug calls will be converted into a logDebug(Log log, Object... args) call.
Which is less efficient, but still faster than building a string.
The remaining calls can be transformed into "formatter" version. We should make sure these are not
performance-critical parts of the code, or otherwise rewrite the messages to be able to call one
of the optimized versions of logDebug() .
Autoboxing of primitive types should be ok with these less efficient versions of logDebug() .
The implementation of logDebug() should use optimized string building tools:
static void logDebug(Log log, Object a1, Object a2) {
if (!log.isDebugEnabled())
return ;
StringBuilder str = new StringBuilder().append(a1).append(a2);
log.debug(str);
}
The JavaDoc for logDebug() methods should state which methods are optimized, and
what to avoid when calling them, like primitive type autoboxing, string concatenation, etc.
This will give us all three goals above, up to a point. Efficiency and readability should be obvious.
The tool to verify should be checking that there are no "naked" calls of ".debug" anywhere but
in the Logger. And that calls of logDebug() do not contain "+" operations.
This is not the ideal solution, but it is practical imo.
Future directions
In the future we can turn org.apache.hadoop.log.Logger into a log wrapper by converting
the static methods into non static, and adding methods logInfo(), logError(), logWarn().
This should optimize logging in Hadoop in general. Non optimal string building in info-level
logging hearts the performance right now as I write it.
The idea is to replace org.apache.commons.logging.Log with org.apache.hadoop.log.Logger .
logging.Log will become a member of the Logger .
This is a bigger change and should be done in a different jira.

The unanswered question is whether this patch does anything useful for the project. Does it actually make Hadoop measurably faster? Shouldn't we demonstrate that for optimizations that increase the code size, a cost/benefit analysis?

Doug Cutting
added a comment - 20/Aug/10 01:20 The unanswered question is whether this patch does anything useful for the project. Does it actually make Hadoop measurably faster? Shouldn't we demonstrate that for optimizations that increase the code size, a cost/benefit analysis?

Suresh Srinivas
added a comment - 19/Aug/10 22:49 I agree with Erik. I am actually surprised by how many comments this simple proposal has generated!
My +1 for committing the patch with a minor change - we do not need isDebugEnabled() check for tests.

the practice of using isDebugEnabled() is already common in Hadoop code, this patch just makes it more consistent

HDFS 187 in patch, 86 before patch

Common 63 in patch, 52 before patch

Note: patch adds isDebugEnabled to tests as well so the counts that matter are even lower.

None of the proposed solutions is solving all the problems and none of them is a short term solution. The only complaint against the patch is that it looks inelegant but from the discussion in this bug (and looking up others' effort to do the same) there's really no elegant solution for this problem in Java.

Claiming that patch adds code is a bit tricky. Yes it adds to total code written but in production runs it means less code being executed and less objects created (in alternative proposals even if the objects are not stringified they are still created). Given that we already have problems with GC that seems like a good idea.

We can still work on a long term better solution, once we have one we would have to convert all the code to the new solution anyway (at least remove already existing isDebugEnabled calls). Patch does not add significant (possibly none) additional work for that case.

Erik Steffl
added a comment - 19/Aug/10 22:45 To put things in perspective:
the practice of using isDebugEnabled() is already common in Hadoop code, this patch just makes it more consistent
HDFS 187 in patch, 86 before patch
Common 63 in patch, 52 before patch
Note: patch adds isDebugEnabled to tests as well so the counts that matter are even lower.
None of the proposed solutions is solving all the problems and none of them is a short term solution. The only complaint against the patch is that it looks inelegant but from the discussion in this bug (and looking up others' effort to do the same) there's really no elegant solution for this problem in Java.
Claiming that patch adds code is a bit tricky. Yes it adds to total code written but in production runs it means less code being executed and less objects created (in alternative proposals even if the objects are not stringified they are still created). Given that we already have problems with GC that seems like a good idea.
We can still work on a long term better solution, once we have one we would have to convert all the code to the new solution anyway (at least remove already existing isDebugEnabled calls). Patch does not add significant (possibly none) additional work for that case.

That last parameter would make things slow, but, I'd be surprised if you can show that optimizing this substantially improves Hadoop performance, since this only runs in the case that a retryable method fails with an exception. If we ever found that such cases were problematic, we might add methods like:

> Your benchmark also shows that LOG.isDebugEnabled() provides the best results.

It provides significantly better results in a micro-benchmark. But optimizations that increase code size should show significant improvement in real application performance, not just micro-benchmarks. Until someone's shown that this optimization measurably improves end performance, I see no urgency to commit it.

Doug Cutting
added a comment - 19/Aug/10 19:21 > I found at least two three-variable cases in Erik's latest patch.
I missed those, but it's still different from your benchmark, which focussed on boxed variables. Those examples might look something like:
debug("Exception while invoking {} of {}. Retrying.", method.getName(), implementation.getClass(), StringUtils.stringifyException(e));
That last parameter would make things slow, but, I'd be surprised if you can show that optimizing this substantially improves Hadoop performance, since this only runs in the case that a retryable method fails with an exception. If we ever found that such cases were problematic, we might add methods like:
static void debug(String message, Throwable t);
static void debug(String message, Throwable t, Object p1);
static void debug(String message, Throwable t, Object p1, Object p2);
static void debug(String message, Throwable t, Object... parameters);
Perhaps we could submit a patch to slf4j adding such methods there.
The second case could become something like:
debug("for protocol authorization compare ({}):{} with ", clientPrincipal, shortName, user.getShortUserName));
In this case all parameters are strings, so no boxing would occur.
> Your benchmark also shows that LOG.isDebugEnabled() provides the best results.
It provides significantly better results in a micro-benchmark. But optimizations that increase code size should show significant improvement in real application performance, not just micro-benchmarks. Until someone's shown that this optimization measurably improves end performance, I see no urgency to commit it.

Your benchmark also shows that LOG.isDebugEnabled() provides the best results. A patch using this approach is already there. Do you really want to stop committing it?

There may be other comparable solutions out there but (1) we don't have a patch yet, (2) it may require further modifying the codes and (3) the performance is not as good as LOG.isDebugEnabled().

I forgot to mention that my benchmark only measures running time but it does not account the GC overhead induced by unnecessary object creation due to auto-boxing. This is another reason that LOG.isDebugEnabled() is better than the other suggested solution.

Tsz Wo Nicholas Sze
added a comment - 19/Aug/10 18:56 > What does this show? ...
Your benchmark also shows that LOG.isDebugEnabled() provides the best results. A patch using this approach is already there. Do you really want to stop committing it?
There may be other comparable solutions out there but (1) we don't have a patch yet, (2) it may require further modifying the codes and (3) the performance is not as good as LOG.isDebugEnabled().
I forgot to mention that my benchmark only measures running time but it does not account the GC overhead induced by unnecessary object creation due to auto-boxing. This is another reason that LOG.isDebugEnabled() is better than the other suggested solution.

What does this show? It shows that simply switching to slf4j using format strings would remove string concatenation costs speeding log statements by around 10x without any bloat. It does not show whether that improvement would be significant in any larger Hadoop benchmark, but at least the cost in code readability would be null. Further it shows that, if we find that logging costs are significant somewhere due to boxing, we could optimize that with static methods, gaining another 5x without losing any readability. Finally, if we find that logging costs are still somewhere significant, we can improve them yet another 6x with some impairment to readability. This is a bit like manual loop unrolling. We'll do it in certain performance-critical areas when it's shown to provide a significant advantage, but we shouldn't do it blindly for every loop.

Doug Cutting
added a comment - 19/Aug/10 17:34 You've benchmarked a case that isn't in the existing patch (3 variables) and certainly isn't typical. Your 'static' case still boxes.
So your benchmark mostly shows that boxing costs, and string concatenation costs even more yet. I don't see how it shows that any of these costs are significant in the Hadoop codebase.
I've attached a version that benchmarks the more typical single-parameter using slf4j and a static version that avoids boxing.
The output this gives is:
java.version = 1.6.0_20
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_20-b02
java.vm.version = 16.3-b01
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = i386
os.name = Linux
os.version = 2.6.32-24-generic-pae
n=10000000
LOG.isDebugEnabled(): 13 ms
static debug(..) : 78 ms
LOG.debug(..) : 472 ms
What does this show? It shows that simply switching to slf4j using format strings would remove string concatenation costs speeding log statements by around 10x without any bloat. It does not show whether that improvement would be significant in any larger Hadoop benchmark, but at least the cost in code readability would be null. Further it shows that, if we find that logging costs are significant somewhere due to boxing, we could optimize that with static methods, gaining another 5x without losing any readability. Finally, if we find that logging costs are still somewhere significant, we can improve them yet another 6x with some impairment to readability. This is a bit like manual loop unrolling. We'll do it in certain performance-critical areas when it's shown to provide a significant advantage, but we shouldn't do it blindly for every loop.

Nicholas> I have checked the slf4j Logger API but does find a method like that.

That's right. I said, such an API "need not be a varargs call, but can a normal, 4-arg method call". With slf4j it would be varrargs. Also note, however, that the current patch doesn't contain any log examples with three parameters.

Scott> The autoboxing and varargs is significantly less expensive than string concatenation.
Scott> Sun's JVM will already avoid the varargs Object[] construction, and the Double, but not the Long or Integer, if +UseEscapeAnalysis is on. That flag becomes the default soon.

Good points, Scott.

This issue lacks benchmarks. Proposed optimizations should include benchmarks. The following article has comments that indicate that, as Scott suggests, autoboxing and varargs are pretty fast. With escape analysis they might be even faster.

If someone adds debug log lines in performance sensitive code, e.g., when calculating CRC32 or somesuch, then explicitly calling isDebugEnabled() would probably be a significant optimization, but such cases are rare, and performance sensitive code should be modified cautiously anyway, since other minor changes can have big effects.

Doug Cutting
added a comment - 17/Aug/10 19:50 Nicholas> I have checked the slf4j Logger API but does find a method like that.
That's right. I said, such an API "need not be a varargs call, but can a normal, 4-arg method call". With slf4j it would be varrargs. Also note, however, that the current patch doesn't contain any log examples with three parameters.
Scott> The autoboxing and varargs is significantly less expensive than string concatenation.
Scott> Sun's JVM will already avoid the varargs Object[] construction, and the Double, but not the Long or Integer, if +UseEscapeAnalysis is on. That flag becomes the default soon.
Good points, Scott.
This issue lacks benchmarks. Proposed optimizations should include benchmarks. The following article has comments that indicate that, as Scott suggests, autoboxing and varargs are pretty fast. With escape analysis they might be even faster.
http://www.nearinfinity.com/blogs/jeff_kunkle/performance_of_guarding_debug_log.html
If someone adds debug log lines in performance sensitive code, e.g., when calculating CRC32 or somesuch, then explicitly calling isDebugEnabled() would probably be a significant optimization, but such cases are rare, and performance sensitive code should be modified cautiously anyway, since other minor changes can have big effects.

Dong, I have checked the slf4j Logger API but does find a method like that. There is one requiring creating an array, debug(String format, Object[] argArray). From the FAQ link you provided earlier, it also suggests creating an array in this case. Below is quoted from the FAQ.

If three or more arguments need to be passed, you can make use of the Object[] variant. For example, you can write:

Tsz Wo Nicholas Sze
added a comment - 17/Aug/10 19:08 >The way you'd write this is:
>
>LOG.debug("a={} b={} c={}", a, b, c);
Dong, I have checked the slf4j Logger API but does find a method like that. There is one requiring creating an array, debug(String format, Object[] argArray). From the FAQ link you provided earlier, it also suggests creating an array in this case. Below is quoted from the FAQ.
If three or more arguments need to be passed, you can make use of the Object[] variant. For example, you can write:
logger.debug("Value {} was inserted between {} and {}.", new Object[]
Unknown macro: {newVal, below, above}
);

That said, numeric values would get boxed. In the attached patch there are some calls that pass a single numeric value that would get boxed. If we're concerned about the performance cost of the boxing in these cases, then we could define methods like:

public static void debug(String format, int i)

{ if (log.isDebugEnabled()) log.debug(format, i); }

public static void debug(String format, long l)

{ if (log.isDebugEnabled()) log.debug(format, l); }

public static void debug(String format, float f)

{ if (log.isDebugEnabled()) log.debug(format, f); }

public static void debug(String format, double d)

{ if (log.isDebugEnabled()) log.debug(format, d); }

If benchmarks show this to be a significant optimization, this might argue for creating a Hadoop-specific logging wrapper, as slf4j does not make this optimization.

Here's a random idea: use assertions for logging! So folks could write:
assert LOG.debug(....);
where LOG.debug() would need to return 'true'.

Doug Cutting
added a comment - 17/Aug/10 18:38 > LOG.debug("a=" + a + ", b=" + b + ", c=" + c);
The way you'd write this is:
LOG.debug("a={} b={} c={}", a, b, c);
This need not be a varargs call, but can a normal, 4-arg method call.
That said, numeric values would get boxed. In the attached patch there are some calls that pass a single numeric value that would get boxed. If we're concerned about the performance cost of the boxing in these cases, then we could define methods like:
public static void debug(String format, int i)
{ if (log.isDebugEnabled()) log.debug(format, i); }
public static void debug(String format, long l)
{ if (log.isDebugEnabled()) log.debug(format, l); }
public static void debug(String format, float f)
{ if (log.isDebugEnabled()) log.debug(format, f); }
public static void debug(String format, double d)
{ if (log.isDebugEnabled()) log.debug(format, d); }
If benchmarks show this to be a significant optimization, this might argue for creating a Hadoop-specific logging wrapper, as slf4j does not make this optimization.
Here's a random idea: use assertions for logging! So folks could write:
assert LOG.debug(....);
where LOG.debug() would need to return 'true'.

If aspectj can't do this, then its likely a no-go. Spring or Guice cannot do it either (they both change methods by overriding them, not wrapping them).
That leaves ASM or low level bytecode changes which definitely will work but would not be easy at all and would require significant testing. At this point, a feature like that should be part of the logging framework.

If I understand the slf4j API correctly, slf4j does not really help for the codes above. It has to create Integer, Long and Double objects for the primitive data type and an array for vararg.

It helps a lot, even with the autoboxing and varargs. A String has a hashCode and byte array member variable, a minimum memory overhead much larger than a boxed object.
The autoboxing and varargs is significantly less expensive than string concatenation.

In slf4j this requires allocating a varargs array with 3 elements (~32 bytes), and one Integer, one Long, and one Double (16, 24, and 24 bytes).
With log4j, it does not require the vararg array, but creates a string out of each number – which uses CPU and allocates more memory than just the boxed object, and then concatenates that into another String. The end result is at minimum 15 characters (30 bytes) + String overhead (~48 bytes), and the intermediate result overhead is larger. If any of the numbers are not so small, the overhead grows quickly.

The JVM is getting smarter with its Escape Analysis optimizations too – which will either eliminate many autobox and vararg object allocations or push them on the stack. I think Sun's JVM will already avoid the varargs Object[] construction, and the Double, but not the Long or Integer, if +UseEscapeAnalysis is on. That flag becomes the default soon.

Scott Carey
added a comment - 17/Aug/10 18:38 Thanks Luke for clarfying aspectj.
If aspectj can't do this, then its likely a no-go. Spring or Guice cannot do it either (they both change methods by overriding them, not wrapping them).
That leaves ASM or low level bytecode changes which definitely will work but would not be easy at all and would require significant testing. At this point, a feature like that should be part of the logging framework.
If I understand the slf4j API correctly, slf4j does not really help for the codes above. It has to create Integer, Long and Double objects for the primitive data type and an array for vararg.
It helps a lot, even with the autoboxing and varargs. A String has a hashCode and byte array member variable, a minimum memory overhead much larger than a boxed object.
The autoboxing and varargs is significantly less expensive than string concatenation.
int a = 0; long b = 1L; double c = 0.0d;
LOG.debug( "a={}, b={}, c={}" , a, b, c);
In slf4j this requires allocating a varargs array with 3 elements (~32 bytes), and one Integer, one Long, and one Double (16, 24, and 24 bytes).
With log4j, it does not require the vararg array, but creates a string out of each number – which uses CPU and allocates more memory than just the boxed object, and then concatenates that into another String. The end result is at minimum 15 characters (30 bytes) + String overhead (~48 bytes), and the intermediate result overhead is larger. If any of the numbers are not so small, the overhead grows quickly.
The JVM is getting smarter with its Escape Analysis optimizations too – which will either eliminate many autobox and vararg object allocations or push them on the stack. I think Sun's JVM will already avoid the varargs Object[] construction, and the Double, but not the Long or Integer, if +UseEscapeAnalysis is on. That flag becomes the default soon.

For aspectj, first of all, we currently don't use aspectj in Hadoop except for testing. I beg there are people against using aspectj or other byte-code rewriting technique on the non-testing codes since it is hard to debug.

Let's be academic (as what I see in this issue) and ignore the statement above. I cannot see a simple solution on using aspectj. Clearly, if we define a point cut per LOG.debug(..), aspectj will works but this is too much of works and, in some cases, we need to change the codes in order to define a point cut. For the people suggesting aspectj and saying that it is simple, could you please provide a simple example?

I heard one suggestion from Luke yesterday. He claimed we could use the "around" advice. However, it seems to me that around does not prevent parameter evaluation. I have not tested it yet.

Tsz Wo Nicholas Sze
added a comment - 17/Aug/10 17:42 For aspectj, first of all, we currently don't use aspectj in Hadoop except for testing. I beg there are people against using aspectj or other byte-code rewriting technique on the non-testing codes since it is hard to debug.
Let's be academic (as what I see in this issue) and ignore the statement above. I cannot see a simple solution on using aspectj. Clearly, if we define a point cut per LOG.debug(..), aspectj will works but this is too much of works and, in some cases, we need to change the codes in order to define a point cut. For the people suggesting aspectj and saying that it is simple, could you please provide a simple example?
I heard one suggestion from Luke yesterday. He claimed we could use the "around" advice. However, it seems to me that around does not prevent parameter evaluation. I have not tested it yet.

Did some digging on the aspectj wrapping approach. Looks like it's a no go performance wise, as it cannot solve the arguments building cost issue without fixing aspectj itself, as someone else tried to do exactly the same thing here:

Looks the most reasonable current course of action is just committing the patch as it looks correct, low risk and significantly lower gc stress (especially in namenode code, according to Suresh in offline discussions.)

We should file a separate jira to explore the switching to slf4j api (still using log4j as backend) approach. One open issue even with slf4j API is that it doesn't solve the autoboxing cost issue for primitive types, which we use a lot the logs:

Luke Lu
added a comment - 17/Aug/10 17:40 Did some digging on the aspectj wrapping approach. Looks like it's a no go performance wise, as it cannot solve the arguments building cost issue without fixing aspectj itself, as someone else tried to do exactly the same thing here:
http://dev.eclipse.org/mhonarc/lists/aspectj-users/msg09294.html
Looks the most reasonable current course of action is just committing the patch as it looks correct, low risk and significantly lower gc stress (especially in namenode code, according to Suresh in offline discussions.)
We should file a separate jira to explore the switching to slf4j api (still using log4j as backend) approach. One open issue even with slf4j API is that it doesn't solve the autoboxing cost issue for primitive types, which we use a lot the logs:
http://qos.ch/pipermail/slf4j-dev/2005-August/000241.html (old but still reflecting the current API design.)

add a checkstyle or other test that warns about string concatenation in log expressions

> you also need to choose at redist time what your back end will be and include that JAR

Unless we want to break things incompatibly, Hadoop's standard logging backend will remain log4j for the forseeable future. So slf4j is only being proposed here as a front end API. To my thinking, the question is whether we should invent our own improved front-end API for log4j that makes it easier to be efficient, or whether we should just use an existing efficient API like slf4j. I lean towards the latter.

Doug Cutting
added a comment - 17/Aug/10 16:54 As Luke implies, slf4j already implements the API style I proposed above:
http://www.slf4j.org/faq.html#logging_performance
As for commons logging, there are a large number of articles suggesting that it should not be revived:
http://articles.qos.ch/thinkAgain.html
http://tapestryjava.blogspot.com/2007/08/so-long-commons-logging-hello-slf4j.html
So, to my thinking, a good approach might be:
switch to slf4j
replace string concatenation with slf4j format strings
add a checkstyle or other test that warns about string concatenation in log expressions
> you also need to choose at redist time what your back end will be and include that JAR
Unless we want to break things incompatibly, Hadoop's standard logging backend will remain log4j for the forseeable future. So slf4j is only being proposed here as a front end API. To my thinking, the question is whether we should invent our own improved front-end API for log4j that makes it easier to be efficient, or whether we should just use an existing efficient API like slf4j. I lean towards the latter.

Scott Carey
added a comment - 17/Aug/10 16:54 Of course, the other tactic is to say "no layers, go to log4j" and work with the log4j team on having log4j push out data to chukwa and scribus, feed it back into Hadoop for analysis. Just a thought.
slf4j is log4j's successor, the 'founder' of slf4j is the same person that started log4j. Like commons logging, log4j is in maintenance mode. http://logging.apache.org/log4j/1.2/changes-report.html
One maintenance release in 2.5 years. Thats good, its stable and simple. But new features or adapters? Unlikely.

Luke Lu _said Feature/design/performance aside, the main problem of commons logging is that there are no active maintainers working on it (last release in 2007), unlike slf4j. _

We can fix that, if the project is idle it could be taken over, but I fear the #of open bugreps. The thought appeals though. Strip out avalon support, pare back the POM files and assume log4j and java.util are the two backends.

Maybe I've just had too much negative experience of SLF4J the redistributable, rather than as an API.

it's another JAR

you also need to choose at redist time what your back end will be and include that JAR

If you don't want to log things like Jetty through it you have to write your own bridge specifically for that app [1] and make sure it's on the classpath.

so: I get to deal with the downstream problems, things not logging, things not logging in forked JVMs, etc, the SLF4J api is something I have no knowledge of.

Of course, the other tactic is to say "no layers, go to log4j" and work with the log4j team on having log4j push out data to chukwa and scribus, feed it back into Hadoop for analysis. Just a thought.

Steve Loughran
added a comment - 17/Aug/10 10:07 Luke Lu _said Feature/design/performance aside, the main problem of commons logging is that there are no active maintainers working on it (last release in 2007), unlike slf4j. _
We can fix that, if the project is idle it could be taken over, but I fear the #of open bugreps. The thought appeals though. Strip out avalon support, pare back the POM files and assume log4j and java.util are the two backends.
Maybe I've just had too much negative experience of SLF4J the redistributable, rather than as an API.
it's another JAR
you also need to choose at redist time what your back end will be and include that JAR
If you don't want to log things like Jetty through it you have to write your own bridge specifically for that app [1] and make sure it's on the classpath.
so: I get to deal with the downstream problems, things not logging, things not logging in forked JVMs, etc, the SLF4J api is something I have no knowledge of.
Of course, the other tactic is to say "no layers, go to log4j" and work with the log4j team on having log4j push out data to chukwa and scribus, feed it back into Hadoop for analysis. Just a thought.
[1] http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/jetty/src/org/smartfrog/services/jetty/log/JettyToCommonsLogger.java?revision=8363&view=markup

Also #3 seems to be fairly simple with a single "around" advice in aspectj and a couple of lines for ajc target in build.xml. Why use an ad-hoc script that's known (according Erik himself) to have false positive that require manual inspection (which is also not part of the version controlled source tree), when you we have a built-in (aspectj*.jars are already included as build/runtime dependencies) tool specifically for such tasks?

You can do this for info messages too for more performance when that is off. There are no extra lines of code or maintenance; it is the simplest solution. As long as the change is applied at build time, I don't see any real drawback.

Scott Carey
added a comment - 17/Aug/10 02:21 Also #3 seems to be fairly simple with a single "around" advice in aspectj and a couple of lines for ajc target in build.xml. Why use an ad-hoc script that's known (according Erik himself) to have false positive that require manual inspection (which is also not part of the version controlled source tree), when you we have a built-in (aspectj*.jars are already included as build/runtime dependencies) tool specifically for such tasks?
You can do this for info messages too for more performance when that is off. There are no extra lines of code or maintenance; it is the simplest solution. As long as the change is applied at build time, I don't see any real drawback.

@Steve & Suresh: slf4j support "{}" holders in the format string (which has a much faster implementation than jdk MessageFormat), so we don't have to create new strings objects with all these + operators and/or implicit toString() calls. One can create a few overloads (for number of args under n, and delegate to vargs a la slf4j (n=3) for larger n) to avoid vargs (hence the new Object array) overhead in common cases. Feature/design/performance aside, the main problem of commons logging is that there are no active maintainers working on it (last release in 2007), unlike slf4j.

Luke Lu
added a comment - 16/Aug/10 23:37 @Steve & Suresh: slf4j support "{}" holders in the format string (which has a much faster implementation than jdk MessageFormat), so we don't have to create new strings objects with all these + operators and/or implicit toString() calls. One can create a few overloads (for number of args under n, and delegate to vargs a la slf4j (n=3) for larger n) to avoid vargs (hence the new Object array) overhead in common cases. Feature/design/performance aside, the main problem of commons logging is that there are no active maintainers working on it (last release in 2007), unlike slf4j.

> 1. Code must remain as readable/simple as possible and maintainable (no wrapping isDebugEnabled() in all of source code).

Does anyone believe that? I think the guards should ideally be limited to performance sensitive parts of the code. But if folks don't trust that can be maintained, then a warning when non-constant strings are logged might be acceptable. Constant strings should be as fast as the guard.

Also #3 seems to be fairly simple with a single "around" advice in aspectj and a couple of lines for ajc target in build.xml. Why use an ad-hoc script that's known (according Erik himself) to have false positive that require manual inspection (which is also not part of the version controlled source tree), when you we have a built-in (aspectj*.jars are already included as build/runtime dependencies) tool specifically for such tasks?

Luke Lu
added a comment - 16/Aug/10 23:07 We can save ~296 lines from the patch if we exclude the test code:
[...hadoop]$ find */src/java -name \*.java | xargs grep '\.debug(' | wc -l
410
[...hadoop]$ find */src/test -name \*.java | xargs grep '\.debug(' | wc -l
148
Also #3 seems to be fairly simple with a single "around" advice in aspectj and a couple of lines for ajc target in build.xml. Why use an ad-hoc script that's known (according Erik himself) to have false positive that require manual inspection (which is also not part of the version controlled source tree), when you we have a built-in (aspectj*.jars are already included as build/runtime dependencies) tool specifically for such tasks?

Jakob Homan
added a comment - 16/Aug/10 22:56 If we run it as a target of ant build, we can enforce people to maintain the guards automatically.
I'd like to avoid any potential false positives the script may identify. Checking for appropriate if blocks around costly debug logging should be a responsibility of the patch reviewer.

Scott Carey
added a comment - 16/Aug/10 22:45 It must also be ensured that isDebugEnabled() is NOT wrapped around an info, warn, error, or fatal. Someone might check in a change that turns a debug into an info without removing the guard.

Luke> The problem with the current approach is that you'd have to keep patching people's new code manually.

I understand that Erik has a script, which automates the process of finding calls to debug() unguarded with isDebugEnabled(). If we run it as a target of ant build, we can enforce people to maintain the guards automatically.

Konstantin Shvachko
added a comment - 16/Aug/10 22:33 Luke> The problem with the current approach is that you'd have to keep patching people's new code manually.
I understand that Erik has a script, which automates the process of finding calls to debug() unguarded with isDebugEnabled() . If we run it as a target of ant build, we can enforce people to maintain the guards automatically.

The current patch seems to be a bit of a sledge hammer where a ball-peen might be more appropriate. However, since Java doesn't give us any way of lazily evaluating the arguments to a method, this check does seem to be a standard way of working around this limitation. I'm a bit surprised at the debug logs in the test cases. Doesn't it seem more reasonable to do all logging in tests at an always-on level?

Jakob Homan
added a comment - 16/Aug/10 22:29 The current patch seems to be a bit of a sledge hammer where a ball-peen might be more appropriate. However, since Java doesn't give us any way of lazily evaluating the arguments to a method, this check does seem to be a standard way of working around this limitation. I'm a bit surprised at the debug logs in the test cases. Doesn't it seem more reasonable to do all logging in tests at an always-on level?

Actually, #1 is subjective. I don't think adding an if-statement per LOG.debug(..) make the codes hard to read or maintain. On the other hand, #2 is very important in production systems. Source-code/byte-code rewriting mentioned in #3 needs a lot of works. Why don't we just commit Erik's patch?

Tsz Wo Nicholas Sze
added a comment - 16/Aug/10 22:11 Thanks Scott for a nice summary .
Actually, #1 is subjective. I don't think adding an if-statement per LOG.debug(..) make the codes hard to read or maintain. On the other hand, #2 is very important in production systems. Source-code/byte-code rewriting mentioned in #3 needs a lot of works. Why don't we just commit Erik's patch?

I have additional concerns about AOP. Herriot plans to introduce AOP based interfaces for exposing system internals. This should not be deployed on production clusters. Using AOP to check isDebugEnabled(), results in two flavors of AOP code. One that must be deployed on the system and the other that should not be. Confusion here could result in serious issues.

Second, with AOP, if some one deploys the version without AOP based debug check, there is no way to know if the check is happening or not, by the system behavior alone.

My vote is to keep it simple for now and check in the code changes that Erik has. This is a widely used java coding practice and not a huge code bloat. If others still feel that this causes code bloat, we could further choose one of the choices proposed in the jira. At that time, we need to any way remove all debug enabled checks that are currently being made.

Suresh Srinivas
added a comment - 16/Aug/10 22:08 I have additional concerns about AOP. Herriot plans to introduce AOP based interfaces for exposing system internals. This should not be deployed on production clusters. Using AOP to check isDebugEnabled(), results in two flavors of AOP code. One that must be deployed on the system and the other that should not be. Confusion here could result in serious issues.
Second, with AOP, if some one deploys the version without AOP based debug check, there is no way to know if the check is happening or not, by the system behavior alone.
My vote is to keep it simple for now and check in the code changes that Erik has. This is a widely used java coding practice and not a huge code bloat. If others still feel that this causes code bloat, we could further choose one of the choices proposed in the jira. At that time, we need to any way remove all debug enabled checks that are currently being made.

There appear to be "-1"'s that require the above to all be true. Therefore, this ticket will go nowhere until there is consensus to let one of the three be false. Implicitly, #2 is false if no action is taken.

Scott Carey
added a comment - 16/Aug/10 21:57 At most two of these can be true:
1. Code must remain as readable/simple as possible and maintainable (no wrapping isDebugEnabled() in all of source code).
2. Code must perform best (no varargs, autoboxing, string concatenation, etc for unused debug lines)
3. The build system can't re-write code (AOP, source modification, or bytecode modification)
There appear to be "-1"'s that require the above to all be true. Therefore, this ticket will go nowhere until there is consensus to let one of the three be false. Implicitly, #2 is false if no action is taken.

Steve Loughran
added a comment - 16/Aug/10 16:30 -1 to ASM-style work as it just complicates build/test cycles
-0 to SLF4J. I don't see the point. I use commons-logging as the front end with either Log4J or other tools as the back end.
Now, we could improve commons-logging to add varargs...

Luke Lu
added a comment - 14/Aug/10 00:02 Leaving out test code doesn't change the situation.
Why? It would make the patches smaller and quicker to review.
build time - means we wouldn't be able to enable debug at runtime, that's a significant restriction compared to current state and limits troubleshooting capabilities.
AOP can insert if LOG.isDebugEnabled()) guards at build time, so you can definitely enable debug at runtime.
The problem with the current approach is that you'd have to keep patching people's new code manually.

Scott Carey
added a comment - 13/Aug/10 23:56 Scott: build time - means we wouldn't be able to enable debug at runtime, that's a significant restriction compared to current state and limits troubleshooting capabilities.
Thats not what I meant. I mean that at build time, the source code:
log.debug( "hello " + "world" + "!" );
can be transformed into:
if (log.isDebugEnabled()) {
log.debug( "hello " + "world" + "!" );
}
The transformation can be at the source code level prior to compilation OR at byte code level post compilation.
Obviously, that is a longer term option.
Do we have any information about how much improved the performance is with the patch here?

Why not committing a simple patch now and work on a good solution later?

Luke: we all prefer fixing the tools. Consider the timeframe and effort required. In the meantime we can have proposed patch as an intermediate solution. Leaving out test code doesn't change the situation. Adding comments is also code bloat (code bloat is mostly relevant for people reading the code) plus it's not guaranteed to work (isDebugEnabled() is guaranteed to work but people will add a marker more or less randomly without considering price of debug call).

Scott: build time - means we wouldn't be able to enable debug at runtime, that's a significant restriction compared to current state and limits troubleshooting capabilities.

I appreciate comments but how about separating a short term quick fix and long term proper solution? At this point it's not clear what a proper solution is. If you want I'll file a jira for long term solution so that we don't forget about it? I mean we probably spent more time discussing this that it take to both create a patch and reverse it (once we have the proper solution).

Erik Steffl
added a comment - 13/Aug/10 22:34 Why not committing a simple patch now and work on a good solution later?
Luke: we all prefer fixing the tools. Consider the timeframe and effort required. In the meantime we can have proposed patch as an intermediate solution. Leaving out test code doesn't change the situation. Adding comments is also code bloat (code bloat is mostly relevant for people reading the code) plus it's not guaranteed to work (isDebugEnabled() is guaranteed to work but people will add a marker more or less randomly without considering price of debug call).
Scott: build time - means we wouldn't be able to enable debug at runtime, that's a significant restriction compared to current state and limits troubleshooting capabilities.
I appreciate comments but how about separating a short term quick fix and long term proper solution? At this point it's not clear what a proper solution is. If you want I'll file a jira for long term solution so that we don't forget about it? I mean we probably spent more time discussing this that it take to both create a patch and reverse it (once we have the proper solution).

One could use something like ASM to re-write log.debug() statements with the guard around it at class load time, or some build system trickery can do something similar at build time. An AOP tool could do this sort of thing.

Scott Carey
added a comment - 13/Aug/10 16:55 One could use something like ASM to re-write log.debug() statements with the guard around it at class load time, or some build system trickery can do something similar at build time. An AOP tool could do this sort of thing.
I wonder if any logging tool project has already done such a thing?

Luke Lu
added a comment - 13/Aug/10 01:23 I'd prefer fixing the tools instead of bloating the code. Some simple heuristics that could work:
Skip checking all the test code, which is trivial, as they live in different trees.
Put some marker, say a trailing /// for verified debug calls that are not expensive, so a grep -v '///$' can filter them out etc.

Me too however then there's no way to automatically or at least semi-automatically check for the cases where the optimisation would be helpful.

If all debug() logging calls have isDebugEnabled() guard it's easy to check the whole codebase in few minutes (I have a simple script that only produces few false positives that need to be eyeballed). If the guard is used in minority of cases where it's actually useful then verification whether it's used everywhere where it's needed is very expensive.

Possibly better solutions but I don't think they are realistic in short/medium term:

use scala and lazy values (arguments for debug() are not constructed unless used)

use extensive profiling and see where debug() calls matter (this could be part of automatic builds/tests)

Erik Steffl
added a comment - 13/Aug/10 00:46 Re https://issues.apache.org/jira/browse/HADOOP-6884?focusedCommentId=12897536&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12897536
Me too however then there's no way to automatically or at least semi-automatically check for the cases where the optimisation would be helpful.
If all debug() logging calls have isDebugEnabled() guard it's easy to check the whole codebase in few minutes (I have a simple script that only produces few false positives that need to be eyeballed). If the guard is used in minority of cases where it's actually useful then verification whether it's used everywhere where it's needed is very expensive.
Possibly better solutions but I don't think they are realistic in short/medium term:
use scala and lazy values (arguments for debug() are not constructed unless used)
use extensive profiling and see where debug() calls matter (this could be part of automatic builds/tests)

This bloats the code when the logged string is a constant. Also, in the vast majority of cases, this is not a significant optimization. I'd rather keep most code more readable and only perform the check in performance-critical code.

Doug Cutting
added a comment - 12/Aug/10 02:17 This bloats the code when the logged string is a constant. Also, in the vast majority of cases, this is not a significant optimization. I'd rather keep most code more readable and only perform the check in performance-critical code.

The debug log check is critical for performance sensitive paths. Due to lack of macros in Java, log wrappers doesn't really help if additional objects need to be created for the messages (the template stuff helps but not in many cases.)

Luke Lu
added a comment - 28/Jul/10 18:12 The debug log check is critical for performance sensitive paths. Due to lack of macros in Java, log wrappers doesn't really help if additional objects need to be created for the messages (the template stuff helps but not in many cases.)
-1 for yet another home brew log wrappers.
+1 for slf4j.

Alejandro Abdelnur
added a comment - 27/Jul/10 23:56 While common practice, this result in lot of code uglyfication/noise.
I'd rather suggest the following (we are doing this in Oozie):
Create a Log wrapper, XLog which extends Log and provides the following additional methods:
info(String msgTemplate, Object ... args)
warn(String msgTemplate, Object ... args)
debug(String msgTemplate, Object ... args)
trace(String msgTemplate, Object ... args)
In each one of these methods, if the log is enabled use the JDK MessageFormat class to create the log message and call the corresponding Log method, else do nothing.
Caveat: the last args has to be tested for being a Throwable and if so the corresponding Log method with (String, Throwable) signature.