Dustin's Pages

Friday, March 13, 2009

Java String Literals: No String Constructor Required

I think that most experienced Java developers are aware of many of the characteristics of the Java String that make it a little different from other objects. One nuance that people new to Java sometimes don't fully appreciate is that a String literal is already a String object.

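When first learning Java, it is easy to write a String assignment like this:

// Unnecessary use of 'new' with a String literal
String blogUrlString = new String("http://marxsoftware.blogspot.com/");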

This will compile, and the initialized String blogUrlString will behave as one would expect of any String. The downside of this statement is that it involves two String objects, one of which is unnecessary. Because the String literal "http://marxsoftware.blogspot.com/" is already a full-fledged Java String, the new operator is unnecessary and results in an extraneous instantiation. The code above can be rewritten as follows:

// The 'new' keyword is not needed because the literal String is a full String object
String blogUrlString = "http://marxsoftware.blogspot.com/";
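To make the difference concrete, here is a minimal sketch that compares references rather than contents. The class name StringLiteralIdentityDemo and the helper sameInstance are hypothetical names chosen for illustration. The sketch relies on two language guarantees: identical compile-time String literals refer to the same interned object, and the new operator always creates a distinct object.

```java
final class StringLiteralIdentityDemo {

    /** Returns true only when both references point at the very same String object. */
    static boolean sameInstance(final String first, final String second) {
        return first == second; // reference comparison, deliberately not equals()
    }

    public static void main(final String[] arguments) {
        final String literal = "http://marxsoftware.blogspot.com/";
        final String sameLiteral = "http://marxsoftware.blogspot.com/";
        final String constructed = new String("http://marxsoftware.blogspot.com/");

        // A literal is already an object, so methods can be called on it directly.
        System.out.println("http://marxsoftware.blogspot.com/".length());

        System.out.println(sameInstance(literal, sameLiteral)); // true: one shared literal
        System.out.println(sameInstance(literal, constructed)); // false: 'new' made a second object
        System.out.println(literal.equals(constructed));        // true: same characters
    }
}
```

The second object carries the same characters as the literal, so it adds nothing except the cost of creating it.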

The unnecessary String instantiation demonstrated first can reduce performance in Java applications. If the extraneous instantiation occurs only in limited cases outside of loops, it is unlikely to cause significant degradation. If it occurs within a loop, however, its impact can be much more significant. Even when the performance issue is only slight, I still find the extra "new" instantiation to be less readable than the second form shown above.

Joshua Bloch uses an example similar to mine above to illustrate Item 5 ("Avoid Creating Unnecessary Objects") in the Second Edition of Effective Java. He points out that this extra instantiation in frequently called code can lead to performance problems.

To demonstrate the effect of this unnecessary extra instantiation of a String, I put together the following simple class (with a nested member class and a nested enum). The full code for it appears next.

/**
 * Test performance in loop over single String instantiation that is
 * executed the number of times as provided by the passed-in argument.
 *
 * @param numberOfLoops Number of times to instantiate Single String.
 * @return Results of this test.
 */
public TestResult testSingleString(final int numberOfLoops) {
    final TestResult result = new TestResult(numberOfLoops, TestType.SINGLE);
    result.startTimer();
    for (int counter = 0; counter < numberOfLoops; counter++) {
        strings.add("http://marxsoftware.blogspot.com/");
    }
    result.stopTimer();
    return result;
}

/**
 * Test performance in loop over redundant String instantiations that is
 * executed the number of times as provided by the passed-in argument.
 *
 * @param numberOfLoops Number of times to instantiate redundant Strings.
 * @return Results of this test.
 */
public TestResult testRedundantStrings(final int numberOfLoops) {
    final TestResult result = new TestResult(numberOfLoops, TestType.REDUNDANT);
    result.startTimer();
    for (int counter = 0; counter < numberOfLoops; counter++) {
        strings.add(new String("http://marxsoftware.blogspot.com/"));
    }
    result.stopTimer();
    return result;
}

/**
 * Run the examples based on provided command-line arguments.
 *
 * @param arguments Command-line arguments where the first argument should
 *    be an integer (not decimal) numeral.
 */
public static void main(final String[] arguments) {
    final int numberArguments = arguments.length;
    if (numberArguments < 2) {
        System.err.println("Please provide two command-line arguments:");
        System.err.println("\tIntegral number of times to instantiate Strings");
        System.err.println("\tType of test to run ('redundant', 'constant', or 'single')");
        System.exit(-2);
    }

/**
 * Stop timer.
 *
 * @throws IllegalStateException Thrown if this stopTimer() method is
 *    called and the corresponding startTimer() method was never called or
 *    if the calculated finish time is earlier than the start time.
 */
public void stopTimer() {
    if (startTime < 0) {
        throw new IllegalStateException(
            "Cannot stop timer because it was never started!");
    }
    finishTime = System.currentTimeMillis();
    if (finishTime < startTime) {
        throw new IllegalStateException(
            "Cannot have a stop time [" + finishTime + "] that is less than "
            + "the start time [" + startTime + "]");
    }
}

/**
 * Provide the number of milliseconds spent in execution of test.
 *
 * @return Number of milliseconds spent in execution of test.
 * @throws IllegalStateException Thrown if the time spent is invalid
 *    due to the finish time being less than (earlier than) the start time.
 */
public long getMillisecondsSpent() {
    if (finishTime < startTime) {
        // The message must match the condition: the finish time is EARLIER than the start time.
        throw new IllegalStateException(
            "The time spent is invalid because the finish time ["
            + finishTime + "] is earlier than the start time [" + startTime + "].");
    }
    return finishTime - startTime;
}

/**
 * Provide the number of seconds spent in execution of test.
 *
 * @return Number of seconds spent in execution of test.
 */
public double getSecondsSpent() {
    return getMillisecondsSpent() / MILLISECONDS_PER_SECOND;
}

/**
 * Provide the number of executions run as part of this test.
 *
 * @return Number of executions of this test.
 */
public int getNumberOfExecution() {
    return numberOfExecutions;
}

For the very simple code used in the tests above, I needed to run many loops to see truly dramatic differences, but the performance difference was clear. I ran each test several times and averaged the results. In general, once the loop counts were large enough for the difference to be measurable, I found the method using the extraneous String instantiation to take roughly four times as long to execute as the loop using the String literal directly without the extra "new."
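For readers who want to try a comparison of this kind without the full test class, the following is a compact, self-contained sketch rather than the original code. It swaps the TestResult timer for direct System.nanoTime() calls, and the names StringInstantiationTiming, timeLiteral, timeRedundant, and the default of one million loops are assumptions made for illustration. Single-shot timings like these are sensitive to JIT warm-up and garbage collection, so treat the numbers as rough indications only; a dedicated harness such as JMH is likely to give more trustworthy results.

```java
import java.util.ArrayList;
import java.util.List;

final class StringInstantiationTiming {

    /** Adds the same literal to the list numberOfLoops times; returns elapsed nanoseconds. */
    static long timeLiteral(final int numberOfLoops, final List<String> sink) {
        final long start = System.nanoTime();
        for (int counter = 0; counter < numberOfLoops; counter++) {
            sink.add("http://marxsoftware.blogspot.com/");
        }
        return System.nanoTime() - start;
    }

    /** Adds a freshly constructed copy of the literal each time; returns elapsed nanoseconds. */
    static long timeRedundant(final int numberOfLoops, final List<String> sink) {
        final long start = System.nanoTime();
        for (int counter = 0; counter < numberOfLoops; counter++) {
            sink.add(new String("http://marxsoftware.blogspot.com/"));
        }
        return System.nanoTime() - start;
    }

    public static void main(final String[] arguments) {
        // Loop count comes from the first argument when present (assumed default: 1,000,000).
        final int numberOfLoops =
            arguments.length > 0 ? Integer.parseInt(arguments[0]) : 1_000_000;
        final List<String> literals = new ArrayList<>(numberOfLoops);
        final List<String> copies = new ArrayList<>(numberOfLoops);
        System.out.println("literal   (ns): " + timeLiteral(numberOfLoops, literals));
        System.out.println("redundant (ns): " + timeRedundant(numberOfLoops, copies));
    }
}
```

The first list ends up holding many references to one object, while the second holds a separate object per iteration, which is where the extra time and memory go.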

Although I ran each test on each number of loops, I show just one representative sample run for a few key data points in the following screen capture. I mark the results of running tests with 1 million loops in yellow and running with 10 million loops in red.

There are many cases in which the extra String instantiation demonstrated above might not have any significant performance impact. However, there is no benefit to specifying the extra instantiation, and besides the reduced performance it adds code clutter.

Note that the examples above extend to similar String uses. Here is another slightly altered example.

// the way NOT to do it
String someString = new String("http://" + theDomain + ":" + thePort + "/servicecontext");
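For contrast, here is a minimal sketch of the same assignment without the constructor call. The concatenation expression already evaluates to a String, so wrapping it in new String(...) only copies it. The class name ServiceUrlDemo, the method buildServiceUrl, the int type for thePort, and the sample values are assumptions made for illustration.

```java
final class ServiceUrlDemo {

    /** Builds the URL directly; the concatenation result is already a String. */
    static String buildServiceUrl(final String theDomain, final int thePort) {
        return "http://" + theDomain + ":" + thePort + "/servicecontext";
    }

    public static void main(final String[] arguments) {
        // Hypothetical host and port, used only to show the result.
        System.out.println(buildServiceUrl("localhost", 8080));
        // prints http://localhost:8080/servicecontext
    }
}
```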

Finally, as a reminder for anyone new to Java and Java Strings, if you find yourself assembling a large String from a large number of pieces, you will typically be better off using a StringBuilder or StringBuffer instead of a String. The root cause for this again has to do with too many String instantiations.
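As a rough illustration of where this matters most, the sketch below assembles a String across loop iterations in two ways. The class name StringAssemblyDemo and both method names are hypothetical. The += version produces a new intermediate String on every pass, while the StringBuilder version appends into one buffer and creates the final String once.

```java
final class StringAssemblyDemo {

    /** Repeated += in a loop: each pass creates a new, longer String. */
    static String joinWithConcatenation(final String[] pieces) {
        String result = "";
        for (final String piece : pieces) {
            result += piece;
        }
        return result;
    }

    /** StringBuilder: pieces are appended to one buffer, one String is built at the end. */
    static String joinWithBuilder(final String[] pieces) {
        final StringBuilder builder = new StringBuilder();
        for (final String piece : pieces) {
            builder.append(piece);
        }
        return builder.toString();
    }

    public static void main(final String[] arguments) {
        final String[] pieces = {"http://", "marxsoftware", ".blogspot", ".com", "/"};
        System.out.println(joinWithConcatenation(pieces));
        System.out.println(joinWithBuilder(pieces));
    }
}
```

Both methods return the same text; the difference lies only in how many intermediate objects are created along the way, which grows with the number of loop iterations.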

The Java String's behavior can seem a little strange until one gets used to it and even then it still might seem a little strange. The main point to remember related to this blog posting is that String literals are full-fledged String objects and so do not require the String constructor to be explicitly invoked.

5 comments:


"if you find yourself assembling a large String from a large number of pieces, you will typically be better off using a StringBuilder or StringBuffer instead of a String."

The statement "S = S1 + S2 + S3;" is no more efficient if you convert it to a StringBuilder (which is how it's compiled anyway), but it will be harder to read. StringBuilders are for loops and other cases of string concatenation across multiple *statements*, not just multiple pieces.

Normally, the VM will optimize an S1 + S2 + S3 into a StringBuilder-like structure, as it can perfectly allocate enough size for the StringBuilder (S1.length() + S2.length() + S3.length()). It's a lot more of an issue if you're using String concatenation in a loop. In that case you should ALWAYS use a StringBuilder or StringBuffer. However, stating that you should always use StringBuilder or StringBuffer when handling a known number of Strings is incorrect.