I work in a .NET/C# shop, and I have a coworker who keeps insisting that we should use giant switch statements in our code with lots of cases rather than more object-oriented approaches. His argument consistently comes back to the claim that a switch statement compiles to a "CPU jump table" and is therefore the fastest option (even though in other matters our team is told that we don't care about speed).

I honestly don't have an argument against this...because I don't know what the heck he's talking about.
Is he right?
Is he just talking out his ass?
Just trying to learn here.

You can verify whether he's right by using something like .NET Reflector to look at the generated code and check for the "CPU jump table".
– FrustratedWithFormsDesigner, May 25 '11 at 15:21

"A switch statement compiles to a 'CPU jump table'" — so does worst-case method dispatching with all pure virtual functions. Non-virtual functions are simply linked in directly. Have you dumped any code to compare?
– S.Lott, May 25 '11 at 15:26

Code should be written for PEOPLE, not for machines; otherwise we would just do everything in assembly.
– maple_shaft♦, May 25 '11 at 16:14

If he's that much of a noodge, quote Knuth to him: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
– DaveE, May 25 '11 at 16:50

Maintainability. Any other questions with one-word answers I can help you with?
– Matt Ellen, May 25 '11 at 19:34

14 Answers

He is probably an old C hacker and yes, he is talking out of his ass. .NET is not C++; the .NET compiler keeps getting better, and most clever hacks are counter-productive, if not today then in the next .NET version.

Small functions are preferable because .NET JIT-compiles each function once, just before it is first used. So if some cases are never hit during the lifetime of a program, no cost is incurred in JIT-compiling them. Anyhow, if speed is not an issue, there should not be optimizations. Write for the programmer first, for the compiler second. Your co-worker will not be easily convinced, so I would prove empirically that better-organized code is actually faster. I would pick one of his worst examples, rewrite it in a better way, and then make sure that your code is faster. Cherry-pick if you must. Then run it a few million times, profile it, and show him. That ought to teach him well.

EDIT

Bill Wagner wrote:

Item 11: Understand the Attraction of Small Functions (Effective C#, Second Edition)
Remember that translating your C# code into machine-executable code is a two-step process. The C# compiler generates IL that gets delivered in assemblies. The JIT compiler generates machine code for each method (or group of methods, when inlining is involved), as needed. Small functions make it much easier for the JIT compiler to amortize that cost. Small functions are also more likely to be candidates for inlining. It’s not just smallness: Simpler control flow matters just as much. Fewer control branches inside functions make it easier for the JIT compiler to enregister variables. It’s not just good practice to write clearer code; it’s how you create more efficient code at runtime.

Well, my favorite approach to replacing a huge switch statement is a dictionary (or sometimes even an array, if I am switching on enums or small ints) that maps values to functions that get called in response to them. Doing so forces one to remove a lot of nasty shared spaghetti state, but that is a good thing. A large switch statement is usually a maintenance nightmare. So... with arrays and dictionaries, the lookup takes constant time, and little extra memory is wasted.
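A minimal sketch of that dictionary-of-functions idea (the command names and `CommandDispatcher` class are invented for illustration, not from the original answer):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example: dispatching on a command string via a dictionary
// instead of a large switch. Lookup takes constant time, and a new command
// is registered by adding one entry rather than editing a switch body.
static class CommandDispatcher
{
    private static readonly Dictionary<string, Func<int, int, int>> Handlers =
        new Dictionary<string, Func<int, int, int>>
        {
            ["add"] = (a, b) => a + b,
            ["sub"] = (a, b) => a - b,
            ["mul"] = (a, b) => a * b,
        };

    public static int Dispatch(string command, int a, int b)
    {
        // TryGetValue keeps unknown commands an explicit, testable error case.
        if (!Handlers.TryGetValue(command, out var handler))
            throw new ArgumentException($"Unknown command: {command}");
        return handler(a, b);
    }
}
```

Because the handlers are plain delegates, each one can live next to the code it belongs to instead of sharing a giant method body with every other case.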

Don't worry about proving it faster. This is premature optimization. The millisecond you might save is nothing compared to that index you forgot to add to the database that costs you 200ms. You're fighting the wrong battle.
– Rein Henrichs, May 25 '11 at 15:47

I know that, but humans are not robots. Many cannot be told unless they already know. So they need a dramatic experience to change their habits.
– Job, May 25 '11 at 15:56

@Job what if he's actually right? The point isn't that he's wrong, the point is that he's right and it doesn't matter.
– Rein Henrichs, May 25 '11 at 16:00

Even if he were right about 100% of the cases, he is still wasting our time.
– Jeremy, May 25 '11 at 18:32

I want to gouge my eyes out trying to read that page you linked.
– AttackingHobo, May 25 '11 at 19:25

Unless your colleague can provide proof that this alteration provides an actual, measurable benefit on the scale of the whole application, it is inferior to your approach (i.e. polymorphism), which actually does provide such a benefit: maintainability.

-1 for "Premature optimization is the root of all evil." Please display the entire quote, not just one part that biases Knuth's opinion.
– alternative, May 25 '11 at 19:15

@mathepic: I intentionally did not present this as a quote. This sentence, as is, is my personal opinion, although of course not my creation. It may be noted, though, that the folks from c2 seem to consider just that part the core wisdom.
– back2dos, May 25 '11 at 20:53

@alternative The full Knuth quote "There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." describes the OP's coworker perfectly. IMHO back2dos summarised the quote well with "premature optimisation is the root of all evil".
– MarkJ, Jul 25 '13 at 10:47

Unless you're writing real-time software, it's unlikely that the minuscule speedup you might possibly get from doing something in a completely insane manner will make much difference to your client. I wouldn't even battle this one on the speed front; this guy is clearly not going to listen to any argument on the subject.

Maintainability, however, is the aim of the game, and a giant switch statement is not even slightly maintainable. How do you explain the different paths through the code to a new guy? The documentation would have to be as long as the code itself!

Plus, you've then got the complete inability to unit test effectively (too many possible paths, not to mention the probable lack of interfaces etc.), which makes your code even less maintainable.

[On the performance side: the JITter performs better on smaller methods, so giant switch statements (and the inherently large methods that contain them) will hurt your speed in large assemblies, IIRC.]

A giant switch statement is a lot easier for the new guy to comprehend: all the possible behaviors are collected right there in a nice neat list. Indirect calls are extremely difficult to follow, in the worst case (function pointer) you need to search the entire code base for functions of the right signature, and virtual calls are only a little better (search for functions of the right name and signature and related by inheritance). But maintainability is not about being read-only.
– Ben Voigt, Aug 12 '14 at 18:44

This type of switch statement should be shunned like the plague because it violates the Open/Closed Principle. It forces the team to change existing code when new functionality needs to be added, as opposed to just adding new code.

That comes with a caveat. There are operations (functions/methods) and types. When you add a new operation, you only have to change code in one place in the switch-based design (add one new function containing a switch), but you'd have to add that method to every class in the OO case (violating the Open/Closed Principle). If you add new types, you have to touch every switch statement, but in the OO case you'd just add one more class. Therefore, to make an informed decision, you have to know whether you'll be adding more operations to existing types or adding more types.
– Scott Whitlock, May 25 '11 at 15:34
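The trade-off described in that comment can be sketched side by side (the `ShapeKind`/`Shape` names are invented for illustration):

```csharp
using System;

// With a switch: adding a new operation means one new method here,
// but adding a new ShapeKind means editing every such switch.
enum ShapeKind { Circle, Square }

static class SwitchStyle
{
    public static double Area(ShapeKind kind, double size)
    {
        switch (kind)
        {
            case ShapeKind.Circle: return Math.PI * size * size;
            case ShapeKind.Square: return size * size;
            default: throw new ArgumentOutOfRangeException(nameof(kind));
        }
    }
}

// With polymorphism: adding a new shape means one new class and no
// edits to existing code, but adding a new operation means touching
// every subclass.
abstract class Shape
{
    public abstract double Area();
}

class Circle : Shape
{
    public double Radius;
    public override double Area() => Math.PI * Radius * Radius;
}

class Square : Shape
{
    public double Side;
    public override double Area() => Side * Side;
}
```

Neither layout is free: each one makes a different axis of change cheap, which is exactly the point of the comment above.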

If you need to add more operations to existing types in an OO paradigm without violating the OCP, then I believe that's what the visitor pattern is for.
– Scott Whitlock, May 25 '11 at 15:37

@Martin - name-call if you will, but this is a well-known tradeoff. I refer you to Clean Code by R.C. Martin. He revisits his article on the OCP, explaining what I outlined above. You can't simultaneously design for all future requirements. You have to choose whether it's more likely you'll add more operations or more types. OO favours the addition of types. You can use OO to add more operations if you model operations as classes, but that gets into the visitor pattern, which has its own issues (notably overhead).
– Scott Whitlock, May 25 '11 at 18:14

@Martin: Have you ever written a parser? It's quite common to have large switch-cases that switch on the next token in the lookahead buffer. You could replace those switches with virtual function calls on the next token, but that would be a maintenance nightmare. It's rare, but sometimes the switch-case is actually the better choice, because it keeps code that should be read/modified together in close proximity.
– nikie, May 25 '11 at 18:53

@Martin: You used words like "never", "ever" and "Poppycock", so I assumed you were talking about all cases without exception, not just the most common case. (And BTW: people still write parsers by hand. For example, the CPython parser is still written by hand, IIRC.)
– nikie, May 26 '11 at 9:05

I have survived the nightmare known as the massive finite state machine manipulated by massive switch statements. Even worse, in my case the FSM spanned three C++ DLLs, and it was quite plain that the code was written by someone versed in C.

The metrics you need to care about are:

Speed of making a change

Speed of finding the problem when it happens

I was given the task of adding a new feature to that set of DLLs, and I was able to convince management that it would take me just as long to rewrite the three DLLs as one properly object-oriented DLL as it would to monkey-patch and jury-rig the solution into what was already there. The rewrite was a huge success: it not only supported the new functionality but was much easier to extend. In fact, a task that would normally take a week (to make sure you didn't break anything) ended up taking a few hours.

So how about execution times? There was no speed increase or decrease. To be fair, our performance was throttled by the system drivers, so if the object-oriented solution was in fact slower, we wouldn't have known it.

What's wrong with massive switch statements for an OO language?

Program control flow is taken away from the object where it belongs and placed outside the object

Many points of external control translate into many places you need to review

It is unclear where state is stored, particularly if the switch is inside a loop

The quickest comparison is no comparison at all (you can avoid the need for many comparisons with a good object oriented design)

It's more efficient to iterate through your objects and always call the same method on all of the objects than it is to change your code based on the object type or enum that encodes the type.
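That last point — "the quickest comparison is no comparison at all" — can be sketched like this (the `Document` hierarchy is invented for illustration):

```csharp
using System;
using System.Collections.Generic;

// Each object already carries its own behavior, so the loop performs
// no comparisons at all: it just calls the same virtual method on
// every element, and dispatch picks the right override.
abstract class Document
{
    public abstract string Render();
}

class Invoice : Document
{
    public override string Render() => "<invoice/>";
}

class Report : Document
{
    public override string Render() => "<report/>";
}

class Printer
{
    public static List<string> RenderAll(IEnumerable<Document> docs)
    {
        var output = new List<string>();
        foreach (var doc in docs)
            output.Add(doc.Render()); // no switch, no type checks
        return output;
    }
}
```

Compare this with a switch on a type-code field: every new document kind would mean another case to write and another branch to execute.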

I don't buy the performance argument; it's all about code maintainability.

BUT: sometimes, a giant switch statement is easier to maintain (less code) than a bunch of small classes overriding virtual function(s) of an abstract base class. For example, if you were to implement a CPU emulator, you would not implement the functionality of each instruction in a separate class -- you would just stuff it into a giant switch on the opcode, possibly calling helper functions for more complex instructions.

Rule of thumb: if the switch dispatches on the TYPE, you should probably use inheritance and virtual functions. If the switch dispatches on a VALUE of a fixed type (e.g. the instruction opcode, as above), it's OK to leave it as it is.
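A minimal sketch of the "switch on a value" case: a toy stack-machine step function dispatching on an opcode byte. The opcodes and their encodings here are invented for illustration, not taken from any real instruction set.

```csharp
using System;
using System.Collections.Generic;

// Toy VM: the switch on the opcode value is the natural structure here;
// splitting each one-line instruction into its own class would add
// ceremony without adding clarity.
static class ToyVm
{
    public const byte OpPush = 0x01; // push the next byte as a value
    public const byte OpAdd  = 0x02; // pop two values, push their sum
    public const byte OpMul  = 0x03; // pop two values, push their product

    public static int Run(byte[] program)
    {
        var stack = new Stack<int>();
        for (int pc = 0; pc < program.Length; pc++)
        {
            switch (program[pc])
            {
                case OpPush:
                    stack.Push(program[++pc]); // operand follows the opcode
                    break;
                case OpAdd:
                    stack.Push(stack.Pop() + stack.Pop());
                    break;
                case OpMul:
                    stack.Push(stack.Pop() * stack.Pop());
                    break;
                default:
                    throw new InvalidOperationException("Unknown opcode");
            }
        }
        return stack.Pop();
    }
}
```

Note that the switch is on a value of a fixed type (the opcode), not on the type of an object — which is exactly the case the rule of thumb says is fine to leave as a switch.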

He is correct that the resulting machine code will probably be more efficient. The compiler essentially transforms a switch statement into a set of tests and branches, which will be relatively few instructions. There is a high chance that the code resulting from more abstracted approaches will require more instructions.

HOWEVER: it's almost certainly the case that your particular application doesn't need to worry about this kind of micro-optimisation, or you wouldn't be using .NET in the first place. For anything short of very constrained embedded applications or CPU-intensive work, you should always let the compiler deal with optimisation. Concentrate on writing clean, maintainable code. This is almost always of far greater value than a few tenths of a nanosecond in execution time.

For some things, and for smaller numbers of actions, the OO version is much goofier. It has to have some kind of factory to convert some value into the creation of an IAction. In many cases it's a lot more readable to just switch on that value instead.
– Zan Lynx, May 25 '11 at 19:49

@Zan Lynx: Your argument is too generic. Creating the IAction object is just as difficult as retrieving the action integer: no harder, no easier. So, to have a real conversation without being too generic, consider a calculator. What's the difference in complexity here? The answer is zero, as all actions are pre-created. You get the input from the user and it's already an action.
– Loki Astari, May 25 '11 at 20:33

@Martin: You are assuming a GUI calculator app. Let's take a keyboard calculator app written in C++ for an embedded system instead. Now you have a scan-code integer from a hardware register. Now what is less complex?
– Zan Lynx, May 25 '11 at 21:15

@Martin: You don't see how integer -> lookup table -> creation of new object -> virtual function call is more complicated than integer -> switch -> function? How do you not see that?
– Zan Lynx, May 26 '11 at 16:30

@Martin: Maybe I will. In the meantime, explain how you get the IAction object on which to call action() from an integer, without a lookup table.
– Zan Lynx, May 26 '11 at 18:08

One major reason to use classes instead of switch statements is that switch statements tend to lead to one huge file containing lots of logic. This is both a maintenance nightmare and a problem for source management, since you have to check out and edit that huge file instead of several smaller class files.

It sounds like your coworker is very concerned about performance. It might be that in some cases a large case/switch structure will perform faster, but hopefully you would run an experiment, doing timing tests on the OO version and the switch/case version. I am guessing the OO version has less code and is easier to follow, understand, and maintain. I would argue for the OO version first (since maintainability/readability should initially be more important), and consider the switch/case version only if the OO version has serious performance issues and it can be shown that a switch/case would make a significant improvement.

One maintainability advantage of polymorphism that no one has mentioned: if you are always switching on the same list of cases, but sometimes several cases are handled the same way and sometimes they aren't, you can structure your code much more nicely using inheritance.

E.g. if you are switching between Dog, Cat and Elephant, and sometimes Dog and Cat are handled the same way, you can make them both inherit from an abstract class DomesticAnimal and put those functions in the abstract class.
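A sketch of that idea, using the class names from the example above (the `Habitat` member is invented for illustration):

```csharp
using System;

// Dog and Cat share behavior through DomesticAnimal, while Elephant
// provides its own. The shared case lives in exactly one place instead
// of being duplicated across branches of a switch.
abstract class Animal
{
    public abstract string Habitat();
}

abstract class DomesticAnimal : Animal
{
    // The case that Dog and Cat would otherwise duplicate.
    public override string Habitat() => "House";
}

class Dog : DomesticAnimal { }

class Cat : DomesticAnimal { }

class Elephant : Animal
{
    public override string Habitat() => "Savanna";
}
```

With a switch, the shared Dog/Cat handling would either be duplicated in two cases or rely on fall-through tricks; with inheritance it falls out of the class hierarchy naturally.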

Also, I was surprised that several people used a parser as an example of where you wouldn't use polymorphism. For a tree-like parser this is definitely the wrong approach, but if you have something like assembly, where each line is somewhat independent, and start with an opcode that indicates how the rest of the line should be interpreted, I would totally use polymorphism and a Factory. Each class can implement functions like ExtractConstants or ExtractSymbols. I have used this approach for a toy BASIC interpreter.

Even if this weren't bad for maintainability, I don't believe it would be better for performance. A virtual function call is simply one extra indirection (the same as the best case for a switch statement), so even in C++ the performance should be roughly equal. In C#, where calls through base-class and interface references are virtual anyway, the switch statement should be worse, since you pay similar virtual-call overhead in both versions.