Java String Concatenation

You have probably been told many times not to use + (the Java plus operator) to concatenate Strings, because it is bad for performance. Have you researched it? Do you know what is happening under the hood? Let's explore String concatenation.

In the early days of Java, around JDK 1.2, everybody used + to concatenate two String literals. When I say literal, I mean it. Strings are immutable; that is, a String cannot be modified. Then what happens when we do the following?

String fruit = "Apple"; fruit = fruit + "World";

In the above Java snippet, it looks like the String is modified, but that is not what happens. Up to JDK 1.4 a StringBuffer is used internally, and from JDK 1.5 a StringBuilder is used, to perform the concatenation. After concatenation, the resulting StringBuffer or StringBuilder is converted to a new String.
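A minimal sketch of this behaviour, assuming a Java 5 to 8 style compiler that translates + into StringBuilder calls (newer compilers may use a different strategy). The class name ImmutableConcat, the method concatExplicit, and the variable alias are hypothetical names chosen for illustration.

```java
public class ImmutableConcat {

    // Roughly what the compiler generates for: fruit = fruit + "World";
    // (assumption: a Java 5-8 style translation using StringBuilder)
    static String concatExplicit(String fruit) {
        return new StringBuilder().append(fruit).append("World").toString();
    }

    public static void main(String[] args) {
        String fruit = "Apple";
        String alias = fruit;      // second reference to the same "Apple" object

        fruit = fruit + "World";   // builds a NEW String; "Apple" itself is untouched

        System.out.println(alias); // prints "Apple": the original String is unchanged
        System.out.println(fruit); // prints "AppleWorld": the newly built String
        System.out.println(fruit.equals(concatExplicit("Apple"))); // prints "true"
    }
}
```

The alias reference still sees "Apple" after the concatenation, which shows that only the variable fruit was repointed; the original String object was never modified.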

Java experts say, "don't use +, use StringBuffer". But if + is going to use StringBuffer internally, what difference does it make? Look at the following example. I have used both + and StringBuffer as two different cases. In case 1, I use only + to concatenate. In case 2, I convert the String to a StringBuffer, do the concatenation, and finally convert it back to a String. I used a timer to record the time taken for each case.
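The original listing is not reproduced here, so the following is only a sketch of what the two cases could look like. The class name ConcatCases, the method names, and the iteration count are assumptions; the timer from the original example is left out.

```java
public class ConcatCases {

    // Case 1: repeated + on a String inside a loop.
    static String withPlus(int count) {
        String string1 = "";
        for (int i = 0; i < count; i++) {
            string1 = string1 + "*"; // a new String is built on every pass
        }
        return string1;
    }

    // Case 2: one StringBuffer is appended to repeatedly,
    // then converted back to a String once at the end.
    static String withStringBuffer(int count) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < count; i++) {
            buffer.append("*");
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        int count = 10_000; // hypothetical iteration count
        // Both cases produce the same text; only the amount of work differs.
        System.out.println(withPlus(count).equals(withStringBuffer(count))); // prints "true"
    }
}
```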

Look at the output (if you run this java program the result numbers might slightly vary based on your hardware / software configuration). The difference between the two cases is astonishing.

The question is: if + uses StringBuffer internally for concatenation, why is there such a huge difference in time? When + is used for concatenation, the following steps are involved every time:

A StringBuffer object is created

string1 is copied to the newly created StringBuffer object

The “*” is appended to the StringBuffer (concatenation)

The result is converted back to a String object.

The string1 reference is made to point at that new String.

The old String that string1 previously referenced is no longer referenced and becomes eligible for garbage collection.
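The steps above can be written out by hand to see where the cost comes from. This is an illustrative sketch, not the compiler's actual output: the class PlusSteps and the counter copied are hypothetical, and only the characters copied in the second step are counted.

```java
public class PlusSteps {

    // Performs 'count' appends of "*" following the listed steps and
    // returns how many characters of string1 were copied into fresh buffers.
    static long copiedChars(int count) {
        String string1 = "";
        long copied = 0;
        for (int i = 0; i < count; i++) {
            StringBuffer temp = new StringBuffer(); // a StringBuffer object is created
            temp.append(string1);                   // string1 is copied into it
            copied += string1.length();
            temp.append("*");                       // the "*" is appended
            string1 = temp.toString();              // converted back; string1 now points at the new String
            // the previous String is unreferenced from here on and can be garbage collected
        }
        return copied;
    }

    public static void main(String[] args) {
        // 0 + 1 + 2 + ... + 999 characters are copied for 1000 appends.
        System.out.println(copiedChars(1000)); // prints 499500
    }
}
```

The copied total grows with the square of the number of appends, which is why the loop with + falls behind a single reused buffer so quickly.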

I hope this explains the performance issue and why it is important to use StringBuffer or StringBuilder (from Java 1.5) when concatenating Strings repeatedly.

Therefore you can see initially it was +, then StringBuffer came and now StringBuilder. Surely Java is improving release by release!

Comments on "Java String Concatenation" Tutorial:

Hey,
I stumbled upon your website today and I am loving reading each and every one of your posts.
The explanation of Java String manipulation with sample source code is good.
I always wanted to understand Java under the hood, and you are making that very easy for me.
I really appreciate that. :)

In doing performance checks, you should never do I/O to
report intermediate logging. Also, you are creating a new
Clock object in-between the code you’re checking – you
can never be sure what the performance penalty is for the
I/O and the object creation (and GC).

The better way to do it is to create some “long” vars before
executing any code that you want to check, such as:

long startTime, middleTime, endTime ;

Then, just before the first line of code to be measured:
[…]
startTime = System.currentTimeMillis();
[…] executable code to be measured

And now, just before the code you want to compare:
[…]
middleTime = System.currentTimeMillis();
[…] executable code that is to be measured /compared

And then finally, get the end time
endTime = System.currentTimeMillis();

Now it’s a simple matter to subtract the values
to arrive at the first section of code and the second
section of code and report the statistics AFTER all
the code has run.

Don’t do I/O and “object creation” during critical
time-checks of code.
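A sketch of the measurement pattern described in this comment, with all reporting done after both blocks have run. The class TimingSketch, the method measure, and the iteration count are hypothetical; the measured blocks are stand-ins for the article's two cases. System.nanoTime() could be used in the same way if finer resolution is wanted.

```java
public class TimingSketch {

    // Returns { elapsed ms for the + block, elapsed ms for the StringBuffer block }.
    static long[] measure(int count) {
        long startTime, middleTime, endTime;

        startTime = System.currentTimeMillis();
        String string1 = "";
        for (int i = 0; i < count; i++) {
            string1 = string1 + "*";            // first block to be measured
        }

        middleTime = System.currentTimeMillis();
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < count; i++) {
            buffer.append("*");                 // second block to be compared
        }
        String string2 = buffer.toString();

        endTime = System.currentTimeMillis();

        // Use both results so the work cannot be skipped as unused.
        if (string1.length() != string2.length()) {
            throw new IllegalStateException("blocks produced different results");
        }
        return new long[] { middleTime - startTime, endTime - middleTime };
    }

    public static void main(String[] args) {
        long[] elapsed = measure(20_000); // hypothetical iteration count
        // No I/O happens until both measurements are complete.
        System.out.println("+ operator:   " + elapsed[0] + " ms");
        System.out.println("StringBuffer: " + elapsed[1] + " ms");
    }
}
```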

I am measuring performance only for code blocks, not for a whole method or program as a single unit. When I instantiate the clock object my timer starts, and when I print to the console I consider the measurement complete. The same steps are repeated for the next code block. I am not measuring to find performance bottlenecks; I am finding the difference between two code blocks. The time taken for object instantiation and printing is added to both blocks, so the overhead is the same in both measurements.

The point you mention is good if we are trying to find performance bottlenecks in order to optimize the code. But if you consider popular Java profilers, they instrument the bytecode to measure performance, which is not a good way of doing it either. I agree with that.

Storing time in long variables and at the end subtracting it is a good point. I should have written like that. That’s a nice catch!

I think this article misses the point about using multiple + operators in a single expression.
For example
String str = "This " + is + " a dynamic" + string;
This uses a single StringBuilder object behind the scenes and is therefore as effective as using a StringBuilder explicitly.
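To illustrate the point, here is a sketch of the single expression next to an explicit equivalent. It assumes a Java 5 to 8 style compiler; the class name SingleExpression, the method names, and the sample argument values are hypothetical.

```java
public class SingleExpression {

    // Several + operators in one expression.
    static String withPlus(String is, String string) {
        return "This " + is + " a dynamic" + string;
    }

    // Roughly the explicit form: one builder, one chain of appends.
    static String withBuilder(String is, String string) {
        return new StringBuilder()
                .append("This ")
                .append(is)
                .append(" a dynamic")
                .append(string)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("is", " string"));    // prints "This is a dynamic string"
        System.out.println(withBuilder("is", " string")); // prints "This is a dynamic string"
    }
}
```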

Don't be surprised if, because the StringBuffer implementation is slower than that of StringBuilder, an expression that uses + is faster than using a StringBuffer explicitly (quite common in pre-1.5 Java code).

Your code quite correctly shows how to use StringBuffer explicitly to achieve the same inefficiency as the + operator – but that’s exactly why you don’t want to use either one in this case.

The example is taking an existing string, and appending to it a number of times. Using a single StringBuffer and .append is much faster than the + operator, OR instantiating a new StringBuffer for each append.

Single-line efficiency is also another matter, as you point out. That doesn’t change the validity of the article, which is not dealing with several appends.

There is an advantage in letting StringBuffer's parameterized constructor add the initial String over calling StringBuffer.append() ourselves on an empty buffer.

The StringBuffer.append() method checks its capacity; if it is not sufficient, it expands the buffer and then appends.
Expansion involves several steps: creating a new char array, using System.arraycopy to copy the old char array into the new one, and then switching to the newly created array.

The parameterized constructor that accepts a String, on the other hand, knows the length of the String. It allocates a buffer of that length plus 16 characters, so the append that follows does not need to expand the buffer because it already has sufficient capacity.

In your code, once the content exceeds the default capacity of 16 characters, StringBuffer.append() has to expand the buffer whenever it runs out of room as you append '*', which takes even more time.
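The capacity behaviour discussed in this comment can be observed with StringBuffer.capacity(). The sketch below is illustrative only: the class CapacityDemo and its method names are hypothetical, and the exact growth policy is treated as an implementation detail, so the number of expansions is printed rather than assumed.

```java
public class CapacityDemo {

    // Capacity of an empty StringBuffer (documented as 16).
    static int defaultCapacity() {
        return new StringBuffer().capacity();
    }

    // Capacity after the String constructor (documented as 16 plus the String's length).
    static int capacityFor(String s) {
        return new StringBuffer(s).capacity();
    }

    // Counts how often the capacity changes while appending one char at a time.
    static int countExpansions(int appends) {
        StringBuffer buffer = new StringBuffer();
        int lastCapacity = buffer.capacity();
        int expansions = 0;
        for (int i = 0; i < appends; i++) {
            buffer.append('*');
            if (buffer.capacity() != lastCapacity) {
                expansions++;
                lastCapacity = buffer.capacity();
            }
        }
        return expansions;
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity());     // prints 16
        System.out.println(capacityFor("Apple"));  // prints 21
        System.out.println(countExpansions(1000)); // far fewer expansions than appends
    }
}
```

If the final size is known in advance, passing it to the StringBuffer(int capacity) constructor avoids the expansions altogether.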


I really love the clear, concise and to the point explanation with great examples. All your blogs are based on concepts which are well known and very useful practically. Thanks for sharing your knowledge!

This article makes a valid point, but it would be more interesting if another point were made clear: using StringBuilder is even more efficient than StringBuffer, because it is not synchronized. This means that it is not thread-safe, but when it is used as a local variable inside a method, that is not an issue.
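A small sketch of the distinction drawn in this comment, assuming Java 8 or later for the lambda syntax. The class SharedBufferDemo and its method names are hypothetical. A StringBuffer shared by two threads keeps every append, while a StringBuilder is used here only as a method-local variable, where synchronization is not needed.

```java
public class SharedBufferDemo {

    // Two threads append to one shared StringBuffer; its synchronized
    // methods ensure that no append is lost.
    static int appendFromTwoThreads(int perThread) {
        final StringBuffer shared = new StringBuffer();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                shared.append('*');
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.length();
    }

    // A StringBuilder confined to one method is only ever touched by
    // the calling thread, so the missing synchronization does not matter.
    static String localBuilder(int count) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++) {
            builder.append('*');
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendFromTwoThreads(10_000)); // prints 20000
        System.out.println(localBuilder(5));              // prints "*****"
    }
}
```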

When I run the code example with a third variant using StringBuilder, the StringBuilder version is about twice as fast as the StringBuffer version, which in turn is 10-15 times faster than using +.