No, my opinions as expressed in this blog are not those of my employer. They're mine alone. That's why they're called "my opinions." Helloooo.

Monday, March 1, 2010

The little optimization that couldn't

Let's say you represent two groups of people. If I collect one penny from each member of Group A, then, from the unbounded kindness of my heart, give one dollar to each member of Group B, has the total wealth of the groups combined become greater or less?

If you answered "greater", this article's for you! You see, there's a question you forgot to ask -- and it's a question that we as programmers forget to ask constantly.

A nicely optimized method?

In our internal core Java libraries at Google, we have a method called StringUtil.repeat(). It returns a string consisting of n copies of a base string -- so StringUtil.repeat("hey", 3) produces "heyheyhey". When I first came across it, and cleaned it up a bit, its implementation looked a little bit like this:

public static String repeat(String base, int count) {
  if (base.equals("") || count == 0) {
    return "";
  }
  if (count == 1) {
    return base;
  }
  StringBuilder sb = new StringBuilder(base.length() * count);
  for (int i = 0; i < count; i++) {
    sb.append(base);
  }
  return sb.toString();
}

What's going on here? Well, there are four basic cases.

Case one: base is the empty string, so the result should always be the empty string. We don't want to loop and all that, so this optimization returns an empty string straight away.

Case two: count is zero. Here again, why do any actual work? We should return the empty string here and now.

Case three: count is one. We can avoid instantiating a new string by returning the original string directly!

Case four: aww nuts. In this case, we really do have to loop and create the new string. Well, at least we optimized out as much as we could first!

In each case, we're very carefully doing the bare minimum amount of work we can! With me so far? Sounds good?

When I found this, I proceeded to make the method run even faster. Any guesses how I did it?

That's right, I simply removed all of these so-called optimizations.
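After the removal, the method reduces to just the general case. A minimal sketch of what that simplified version might look like (the exact internals are an assumption; only the method's behavior comes from the post):

```java
// Hypothetical simplified version: the loop itself handles the "special"
// cases. An empty base or a zero count naturally yields "".
final class StringUtil {
  private StringUtil() {}

  static String repeat(String base, int count) {
    StringBuilder sb = new StringBuilder(base.length() * count);
    for (int i = 0; i < count; i++) {
      sb.append(base);
    }
    return sb.toString();
  }
}
```

Note that every former special case still produces the right answer: a count of zero never enters the loop, an empty base appends nothing, and a count of one appends the base exactly once.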

Remember that to optimize a special case, you must check for that special case... in every case! That small extra cost goes to every single user of the method. And what of the benefits? Hah! Notice that the "optimized" special cases are the cases that are the fastest to compute anyway!

What's more, surprisingly enough, users don't actually call a repeat() method with zero or one as the count very often. Why would they? Commonly, that is. So we're "optimizing" a case that hardly even exists, at the expense of all the cases that do exist. The special-case checks were a net loss, and better off removed.

It's not like my removing them radically improved the performance of anyone's application. However, the experience is useful as a lesson. You'll encounter this same situation many times in many guises, and it will often be tempting to think about the benefits to the few rather than the overall aggregate cost to society as a whole (hmm... remind you of any politicians?).

If Group A of my opening scenario has thousands of times more members than Group B -- not to mention that the dollar really turns out to be a dime -- it's a bad deal. Just say no!

21 comments:

I've used the phrase "longcuts" to describe those sorts of optimisation in the past. Makes me wonder how many of these little nuggets might be in the core Java code, for example, just waiting for someone to do exactly what you've just done.

Would be interesting if it was possible to mark these sorts of shortcuts in some way and have something like caliper run tests with and without them to see which is better, so that a more informed decision can be made.

I'm finding that string concatenation can be the root of a lot of evil. Does your caller really need a String? Can you return a CharSequence that delegates to the original String and implement charAt(int) to mod the original string length and get it from that? Then you can do the work in toString() if you absolutely must…
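The commenter's lazy-view idea could be sketched like this (the class name and details are hypothetical, not from the post): a CharSequence that pretends to be the repeated string, computing each character on demand and only materializing a String in toString().

```java
// Hypothetical lazy view: "repeats" base without building the result
// until toString() is actually called.
final class RepeatedCharSequence implements CharSequence {
  private final String base;
  private final int count;

  RepeatedCharSequence(String base, int count) {
    this.base = base;
    this.count = count;
  }

  @Override public int length() {
    return base.length() * count;
  }

  @Override public char charAt(int index) {
    // Map the index back into the base string with a modulus.
    return base.charAt(index % base.length());
  }

  @Override public CharSequence subSequence(int start, int end) {
    StringBuilder sb = new StringBuilder(end - start);
    for (int i = start; i < end; i++) {
      sb.append(charAt(i));
    }
    return sb.toString();
  }

  @Override public String toString() {
    // Do the real work only when a String is truly required.
    StringBuilder sb = new StringBuilder(length());
    for (int i = 0; i < count; i++) {
      sb.append(base);
    }
    return sb.toString();
  }
}
```

Whether this wins depends entirely on the caller: if the result is immediately turned into a String anyway, the indirection is pure overhead.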

Your point seems to be "weigh the likelihood of each case when optimizing."

But if the simple-minded question that readers forgot to ask was "what's the likelihood of each case"...then what's the answer? What *was* the likelihood of each case, and what was the cost of each case to optimize? Were any of them worth optimizing?

Case one (checking base.equals("")) requires another method call and does sound expensive, but cases two and three (checking the count against 0/1) sound so cheap that I might expect your JVM to optimize them away after it warms up.

Probably we can make use of the count. If count is even, we can append to the StringBuilder fewer times. So for repeat("try", 4), we can create "trytry" and then append "trytry" to it. If count is odd, that case needs to be handled separately. Anyway, I enjoyed your YouTube video on collections.
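The doubling idea this commenter describes could be sketched as follows (a hypothetical variant for illustration, not the method from the post): build the result by repeatedly appending the buffer to itself, so only O(log count) append calls are needed instead of count.

```java
// Hypothetical doubling variant: O(log count) appends instead of count.
final class RepeatDemo {
  static String repeatByDoubling(String base, int count) {
    int target = base.length() * count;
    StringBuilder sb = new StringBuilder(target);
    if (count > 0) {
      sb.append(base);
      // Double the buffer while the doubled length still fits.
      while (sb.length() * 2 <= target) {
        sb.append(sb);  // StringBuilder may safely append itself
      }
      // Top up the remainder with one final partial append.
      sb.append(sb, 0, target - sb.length());
    }
    return sb.toString();
  }
}
```

As the next comment points out, the appends here are bulk array copies either way, so the win from fewer calls is likely marginal; this is the kind of cleverness worth benchmarking before keeping.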

@rahul what you propose is called loop unrolling, and it's typically done by the JIT. Unless really needed, never even think about doing it by hand, since its "optimization" can turn into a "deoptimization" on the next JIT update.

@Kartik I agree, early returns (so-called guards) are not only for optimization reasons, but also for readability and maintainability, which is often much more important than squeezing out the last bit of performance. Besides, with EA and guards it can even get faster in the general case (because the JIT knows it does not need to do null checks anymore).

@mk I agree that the equals check is badness, whereas the integer compares can be neglected. I think I would keep only the count == 1 case, since the count == 0 case is handled smartly by the loop anyway.

Good entry. I had a similar experience with a colleague writing a kind of buffer abstraction where array operations were done internally. He was constantly checking three or four cases, all of which would have failed immediately anyway: a Java byte[] access already throws ArrayIndexOutOfBoundsException, and dereferencing a null reference already throws a NullPointerException. No need to check twice.
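The redundancy this commenter describes might look something like this (an illustrative, hypothetical class, not the colleague's actual code): the manual guards duplicate checks the JVM already performs on every array access.

```java
// Hypothetical buffer wrapper: the guards in getChecked() duplicate
// checks the JVM performs anyway in get().
final class Buffer {
  private final byte[] data;

  Buffer(byte[] data) {
    this.data = data;
  }

  byte getChecked(int index) {
    // Redundant: the array access below already throws
    // NullPointerException / ArrayIndexOutOfBoundsException.
    if (data == null) {
      throw new NullPointerException("data is null");
    }
    if (index < 0 || index >= data.length) {
      throw new ArrayIndexOutOfBoundsException(index);
    }
    return data[index];
  }

  byte get(int index) {
    // The JVM's built-in null and bounds checks suffice.
    return data[index];
  }
}
```

Both methods fail in exactly the same situations; the second just lets the JVM do the checking, and the JIT can often eliminate those checks entirely when it can prove the index is in bounds.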