Monday, May 5, 2014

Ever since Java started allowing String variables in switch and case statements, many programmers have been using this feature in code that could be better written with integer constants or the enum pattern. It was one of the popular features of the JDK 7 release, along with automatic resource management (try-with-resources) and multi-catch blocks. Although I didn't like this feature at first, because enumeration types are a better alternative, I am not totally against it. One reason is convenience: given how widely String is used in Java programs, it is quite handy. Still, I prefer to learn more about any new feature before using it in production code.

When I first came to know about this feature, I guessed that String in switch could be implemented using the equals() and hashCode() methods, and I was curious to see how it actually works in Java 7. I also wanted to ask about it in Java interviews, since a question like this makes an interview a little more interesting. Testing was simple: write code that uses a String variable in a switch block, then decompile the class to see how the compiler has translated it. So what are we waiting for? Let's see how String in a switch block actually works.

Original Code:

This is a simple test program with a main method and a switch block that operates on a String variable. The String argument is supplied when the program is run and is read from the String array parameter of the main method. There are three modes in which to start our application: Active, Passive and Safe. Though it's better to use an enum to represent this kind of well-known, fixed set of values, if you decide to use String, write the values in upper case to avoid case-sensitivity issues with lower case and camel case. You can also see this tutorial to learn more about correctly using String in switch expressions in Java SE 7.

You need JDK 7 or later to compile and run this code. Any update of JDK 7 will do.
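A minimal sketch of such a test program might look like the following. The class name StringInSwitchCase, the helper method describe() and the printed messages are assumptions made for illustration; only the three modes and the use of the main method's String array come from the description above.

```java
// Illustrative sketch only: class name, helper method and messages are assumptions.
class StringInSwitchCase {

    // Maps a start-up mode to a message; kept apart from main() so it is easy to call.
    static String describe(String mode) {
        switch (mode) {
            case "ACTIVE":
                return "Application is starting on Active mode";
            case "PASSIVE":
                return "Application is starting on Passive mode";
            case "SAFE":
                return "Application is starting on Safe mode";
            default:
                return "Unknown mode: " + mode;
        }
    }

    public static void main(String[] args) {
        // The mode is read from the first command-line argument.
        String mode = args.length > 0 ? args[0] : "";
        System.out.println(describe(mode));
    }
}
```

Running it with the argument ACTIVE selects the first case. A lower-case argument such as active falls through to the default branch, which is the case-sensitivity problem mentioned above.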

Decompiled Code:

This is the decompiled version of the above class after compiling it with jdk1.7.0_40. If you are new to Java and want to learn how to decompile a Java class file for reverse engineering, see that post. Since every new release brings more and more syntactic sugar, knowing how to decompile a class has become important for Java programmers at all levels. The gap between the code you write and what actually gets executed is widening fast, so basic knowledge of the Java class file format and bytecode instructions will only help you. Lambda expressions, a key feature of the recently released Java 8, also rely on the compiler: it adds synthetic methods to your class, and you can decompile the class file to see them.
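As a rough, hand-written approximation of the shape a decompiler shows for a String switch over ACTIVE, PASSIVE and SAFE (not the exact output of any particular tool or compiler version), consider the sketch below. The class name, the method describe() and the messages are assumptions; the integer case labels are the hash codes of the three string literals.

```java
// Hand-written approximation of a desugared String switch; names and messages are assumptions.
class DesugaredStringSwitch {

    static String describe(String mode) {
        // The switch expression is evaluated once and kept in a temporary variable.
        String s = mode;

        // The switch itself runs on the int hash code of the String.
        switch (s.hashCode()) {
            case 1925346054: // "ACTIVE".hashCode()
                // equals() guards against a different String with the same hash code.
                if (s.equals("ACTIVE")) {
                    return "Application is starting on Active mode";
                }
                break;
            case -74056953: // "PASSIVE".hashCode()
                if (s.equals("PASSIVE")) {
                    return "Application is starting on Passive mode";
                }
                break;
            case 2537357: // "SAFE".hashCode()
                if (s.equals("SAFE")) {
                    return "Application is starting on Safe mode";
                }
                break;
            default:
                break;
        }
        // Reached when no label matched, like the default branch of the original switch.
        return "Unknown mode: " + mode;
    }
}
```

The exact structure varies between compilers and decompilers, but the two ingredients are the same: an integer switch on hashCode() and an equals() check inside each case.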

If you look at this code, you will find that String in switch works by using the hashCode() and equals() methods. Remember that, enums aside, switch only works on integer values, i.e. values of type byte, short, char and int. The good thing is that the return type of hashCode() is int, not long. By the way, this is one way to remember that fact, which I often forget or get confused about myself. If you look closely, the switch is on the hash code, followed by a safety check that compares the String with equals(). This check is required because two unequal objects can have the same hash code.

Performance-wise, this is not as fast as switching on enum constants or plain integer constants, but it's not bad at all. The compiler adds only one extra call, equals(), which can be very fast when both references point to the same String object, e.g. the same interned literal, where "abc" == "abc" holds. The call to hashCode() is another cost, but it is paid only once per String object, because once computed, a String caches its hash code, as discussed in my favourite article, why String is immutable in Java. So the cost of calling hashCode() will not be significant even if this switch is used in a tight loop, e.g. a loop that processes items or a game-engine loop that renders screens, as long as the same String objects are reused. Nevertheless, I still think that using String in a switch statement to represent a fixed number of things is not a good practice. The enumeration type in Java is there for a reason, and every Java programmer should use it.
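To see why the equals() check cannot be skipped, here is a small sketch using "Aa" and "BB", two different strings that share the hash code 2112. The class and method names are made up for this example.

```java
// Illustrative sketch: two distinct strings with the same hash code used as case labels.
class HashCollisionDemo {

    // A String switch whose two labels collide on hashCode().
    static String classify(String key) {
        switch (key) {
            case "Aa":
                return "first";
            case "BB":
                return "second";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        // Both hash codes are 2112, so hashCode() alone cannot tell the labels apart.
        System.out.println("Aa".hashCode() + " " + "BB".hashCode());
        // The switch still picks the right branch for each string.
        System.out.println(classify("Aa") + " " + classify("BB"));
    }
}
```

With a switch on the hash code alone, both strings would land in the same case. The string comparison inside that case is what separates them.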

That's all on how String in switch works in Java 7. As expected, it uses the hashCode() method for switching and the equals() method for verification. This means it is just syntactic sugar rather than built-in native functionality. Now the choice is yours. I am personally not a big fan of using String in switch, as it results in brittle code, case-sensitivity issues, and no compile-time check for invalid input. Plain old integer constants are my favourite for performance-critical code, and Java's enumeration type where readability and code quality are more important. In fact, in 99.99% of cases an enum is a better choice than a String or integer variable; that is the very reason enums exist in the Java programming language. All this feature has done is promote a bad coding practice. I struggle to find a proper use case for String in switch with a fixed set of inputs for any purpose other than testing and debugging. Let me know if you have a convincing reason for using String in switch in your project; maybe that will change my mind.
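For comparison, the enum-based alternative could be sketched roughly as follows. The names EnumSwitchSketch, Mode, parse() and describe() are assumptions made for this example, not code from the test program.

```java
import java.util.Locale;

// Illustrative sketch of the enum alternative; all names here are assumptions.
class EnumSwitchSketch {

    enum Mode { ACTIVE, PASSIVE, SAFE }

    // Converts raw input once, at the boundary; valueOf() throws
    // IllegalArgumentException for anything that is not a known mode.
    static Mode parse(String raw) {
        return Mode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    // A misspelled case label here is a compile-time error, unlike a String label.
    static String describe(Mode mode) {
        switch (mode) {
            case ACTIVE:
                return "Application is starting on Active mode";
            case PASSIVE:
                return "Application is starting on Passive mode";
            case SAFE:
                return "Application is starting on Safe mode";
            default:
                throw new IllegalStateException("Unhandled mode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(parse(args.length > 0 ? args[0] : "SAFE")));
    }
}
```

Normalising the input in one place handles the case-sensitivity issue, and invalid input fails loudly at parse time instead of silently reaching a default branch.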

Nice article. It made me think about what happens if there's a hash code collision on two strings in the switch. Here's the result: http://blog.tremblay.pro/2014/05/collisions-on-switch-on-strings.html

What I didn't understand in the reverse engineered code above is why mode is assigned to var s and s used in the equals check inside the switch. mode might as well be directly used for the equals check right? Any answers or views on this?

@girish: The expression that you switch upon is not necessarily a local variable or a field, it can be any expression evaluating to a string, for example a method invocation. As the value is used at least twice, it must be saved to a local variable because the result of the method invocation might change over time. If the expression is a field, its value might be changed by a thread running concurrently.
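A small sketch can illustrate that point. Here the switch expression is a method call that returns a different value each time it is invoked; the class, field and method names are made up for this example.

```java
// Illustrative sketch: the switch expression is a method call whose result changes between calls.
class SingleEvaluationDemo {

    static int calls = 0;

    // Returns "ACTIVE" on the first call and "PASSIVE" afterwards.
    static String nextMode() {
        calls++;
        return calls == 1 ? "ACTIVE" : "PASSIVE";
    }

    static String run() {
        // nextMode() is evaluated once; the result is used for both the
        // hash code lookup and the equals() check.
        switch (nextMode()) {
            case "ACTIVE":
                return "active";
            case "PASSIVE":
                return "passive";
            default:
                return "unknown";
        }
    }
}
```

If the expression were re-evaluated for the equals() check, the hash code would come from "ACTIVE" and the comparison from "PASSIVE", and no case would match. Saving the value in a temporary avoids that.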