This obvious solution would however not yield the same results as the function above, as the StringTokenizer handles sequences of token characters as one token, thus a sequence of n spaces will result in one underscore instead of n.

I suppose the amount of WTFness depends on a) if the coder more or less knows the length of the string and the number of spaces it will contain; for short strings, it's probably more efficient to do it like this than with regex's, and b) if the coder even knows that you can use a regex with String.replaceAll

I wouldn't of thought of the regex at first, but I would have used a StringBuffer. If that function gets a large string with many spaces, the whole app may crash. (I learned that the hard way when I was a Java newbie)

I wouldn't of thought of the regex at first, but I would have used a StringBuffer. If that function gets a large string with many spaces, the whole app may crash. (I learned that the hard way when I was a Java newbie)

came across a similar function in a .net production system. It looked something like

All fine, not a major major WTF except IndexOf thought Nul chars where a space, and Replace didn't.... The net effect was an 8cpu server chugging along at 30% averaging on each CPU, then 7 CPU's would go quite and 1 CPU would go nuts at 100% untill the service was shut down.

(We never quite figured why only the infinite looping thread became the only one running... we guessed it was some kind of wierd thread starvation problem, due to some thread synchronisation going on somewhere up the call stack...After all, the system in question was The Beast)

Our code is littered with stuff like this, mostly because regex and elegant replacement methods didn't come around until much later (that and inexperience with Java's core methods). I don't know whether String.replace(char, char) was around for 1.0 or whenever this may have been written, but it's possible that it was not, or that the developer just didn't know about it.

Not using a StringBuffer, or even a char array, though, that's pretty silly.

WTF? He didn't temporarily store the tokens in XML, interspersed with the underscores? That way he would be able to get the final string with one giant, performance crippling swoop of XSLT. That way it's really enterprisey, and customers will better make use of their oversized and overpriced hardware, thereby justifying the cost.

I agree with Volmarias: our project is chock-full of seeming head-slappers like this from years ago. One's sense of smug superiority gets a bit dented when you realise Java's String functionality in 1.2 was on a par with BBC BASIC. So I call it only a 0.5 WTF for dumb StringTokenizer usage.
If my CAPTCHA was almost a rude word, is that a separate WTF?

How do you figure? The token between each space will just be a blank string. He will basically be appending the blank string to the end, then adding the underscore. Multiple consecutive spaces mean multiple consecutive underscores.

yeah, people say that about StringBuffer all the time. But the compiler fixes it, so chill.

Stylistically, I'd agree that StringBuffer is the correct choice since it makes clear what is happening and doesn't rely on optimization, but it is with a clear conscience that I write readable code such as :

yeah, people say that about StringBuffer all the time. But the compiler fixes it, so chill.

Stylistically, I'd agree that StringBuffer is the correct choice since it makes clear what is happening and doesn't rely on optimization, but it is with a clear conscience that I write readable code such as :

sb.append(S+"...");

You realize that this creates a StringBuffer just to parse S + "...", right? Easier to write sp.append(S).append("...")

Fortunately, this is one of those rare WTFs that can be refactored away in seconds. The Real WTF here is all of the above people who didn't immediately think of regex. MAYBE I could understand not liking regex because they can get really complicated but for something like this?

If you guys don't know the regex solution, look it up now. I'll be here....waiting....

Fortunately, this is one of those rare WTFs that can be refactored away in seconds. The Real WTF here is all of the above people who didn't immediately think of regex. MAYBE I could understand not liking regex because they can get really complicated but for something like this?

If you guys don't know the regex solution, look it up now. I'll be here....waiting....

Fortunately, it can be. Unfortunately, at the time that it was written, the version was likely such that there was no convenient regex method to use.

Fortunately, this is one of those rare WTFs that can be refactored away in seconds. The Real WTF here is all of the above people who didn't immediately think of regex. MAYBE I could understand not liking regex because they can get really complicated but for something like this?

If you guys don't know the regex solution, look it up now. I'll be here....waiting....

Fortunately, it can be. Unfortunately, at the time that it was written, the version was likely such that there was no convenient regex method to use.

Yeah, I believe that. I don't really blame the original programmer here. The implementation this guy picked is a little wacky but at least he had the foresight to make it a static utility method that's used everywhere instead of copying-and-pasting this logic all over the place. At least, that's what I hope happened.

Concatenating strings inside a loop like that has quadratic complexity in Java because a copy of the string has to be made on each iteration. This is why Java has StringBuffers and StringBuilders, the latter being added in Java 5.

This is a solved problem. Save string to a text file, load an instance of Internet Explorer, navigate to an asp page that reads the text file, scrape the text, replace (the now) single spaces with underscores. That's how we do it here at MoroniTech, anyway.

yeah, people say that about StringBuffer all the time. But the compiler fixes it, so chill.

Stylistically, I'd agree that StringBuffer is the correct choice since it makes clear what is happening and doesn't rely on optimization, but it is with a clear conscience that I write readable code such as :

sb.append(S+"...");

You are wrong. Yes, the compiler MAY (as per Java Language Specification 15.18.1.2 "Optimization of String Concatenation") turn the + operator used on Strings in one expression into calls to StringBuffer/StringBuilder. But it can NOT do that when the operator is used inside a loop as in the WTF (and only then is it really a big problem). The only reason it didn't kill the app's performance is probably that it was only used on very short strings.

This must be the Java issue with the most amount of misconceptions flying around:

Some are blissfully unaware of the problem and write code like in the WTF

Some have heard about it but have not understood, thus believe that "+ is Teh Evil" and will use StringBuffer whenever they concatenate Strings, not understanding when concatenation is a potential performance problem and when not.

Some have heard that compilers "fix it" but have not understood that they can only fix the simple cases that usually don't even cause performance problems.

Very few people actually understand how String concatenation can cause performance problems and in which cases it is or is not an issue.

Until version 1.5, Java's regex's were slow, much slower than iterating through each character in a string and checking it. On the other hand, StringTokenizer was even slower than using a regex.

And to Andrew Trumper - be careful of trying to prove a point using microbookmarks (including testing something that may be optimized away - I don't think that's the case here though). The first "native" block involves recreating a new String and copying the older one into it on each iteration. Yes, that's something that you would have to be careful not to do, but this code:

Fortunately, this is one of those rare WTFs that can be refactored away in seconds. The Real WTF here is all of the above people who didn't immediately think of regex. MAYBE I could understand not liking regex because they can get really complicated but for something like this?

If you guys don't know the regex solution, look it up now. I'll be here....waiting....

Fortunately, it can be. Unfortunately, at the time that it was written, the version was likely such that there was no convenient regex method to use.

The really funny part is that a non-regexp method that fulfills the requirements (as per the method name, i.e. assuming that the multipe spaces case is not actually a requirement) WAS present in even very early JDKs.

WTF? He didn't temporarily store the tokens in XML, interspersed with the underscores? That way he would be able to get the final string with one giant, performance crippling swoop of XSLT. That way it's really enterprisey, and customers will better make use of their oversized and overpriced hardware, thereby justifying the cost.

And to Andrew Trumper - be careful of trying to prove a point using microbookmarks (including testing something that may be optimized away - I don't think that's the case here though). The first "native" block involves recreating a new String and copying the older one into it on each iteration.

No, it doesn't. And it's YOUR microbenchmark that is completely broken by creating objects scoped inside the loop and not actually doing anything quadratical.

This must be the Java issue with the most amount of misconceptions flying around:

Some are blissfully unaware of the problem and write code like in the WTF

Some have heard about it but have not understood, thus believe that "+ is Teh Evil" and will use StringBuffer whenever they concatenate Strings, not understanding when concatenation is a potential performance problem and when not.

Some have heard that compilers "fix it" but have not understood that they can only fix the simple cases that usually don't even cause performance problems.

Very few people actually understand how String concatenation can cause performance problems and in which cases it is or is not an issue.

The difference between being useful and annoying:

Annoying people will supply you with a list of problems they know the answers too already and make you find them.

Annoying people will supply you with a list of problems they know the answers too already and make you find them.

Useful people will type the answers with their list of problems.

I'm like you. I'm an annoying person.

Well, I sort of had the answer at least hinted at in the part before your quote.

Basically, String concatenation is only a serious problem (with quadratic running time) when you concatenate stuff inside a loop in small increments (i.e. many iterations) onto a String scoped outside the loop. Because in each iteration you copy the entire String, which gets bigger with each iteration. As an added bonus, all those increasingly huge String objects will strain the garbage collector as well.

If there is no loop involved (and of course what goes for loops also goes for recursion), you're very unlikely to reach the number of concatenations and the size of Strings where it is a noticeable problem. Copying a String of size 100 five times rather than once is irrelevant (unless you've identified it as a critical path with a profiler). Copying Strings of sizes growing to 100.000 five thousand times rather than five times IS relevant.

I wouldn't of thought of the regex at first, but I would have used a StringBuffer. If that function gets a large string with many spaces, the whole app may crash. (I learned that the hard way when I was a Java newbie)

The person who wrote that had probably never programmed in Java before. They at least had the decency to wrap their code in a logically named function that could be modified later if and when necessary. Of course, the repeated creation of strings does suck, but when you have a deadline to make it's not the kind of thing that would be a top priority.

Only idiot would use regexes for replacing single character value with another character.

Only idiot would not know how to use indefinite article.

Only idiot would think " +" was matching single character.

(It's quite amusing how Java is advertised as being so much simpler than C++, yet the simple issue of efficient string concatenation spawns large controversial discussions. In C++, you can just use std::string's efficient += operator.)

Which JDK did you use to compile that? Most of the current generation of JDK will optimize that code such that StringBuilder will be used for the first example behind the scenes (I've verified this by decompiling the class files).

Until version 1.5, Java's regex's were slow, much slower than iterating through each character in a string and checking it. On the other hand, StringTokenizer was even slower than using a regex.

And to Andrew Trumper - be careful of trying to prove a point using microbookmarks (including testing something that may be optimized away - I don't think that's the case here though). The first "native" block involves recreating a new String and copying the older one into it on each iteration. Yes, that's something that you would have to be careful not to do, but this code: