Logical Operator Performance

An absolute fundamental of programming is the concept of logical operators like && and ||. In a recent comment, Chris H pointed out that MXMLC doesn’t do a particularly good job generating bytecode for these operators. Today I’ll look further into the subject and see just how much this impacts performance.

Let’s take a look at a very simple function using the “logical and” (&&) operator:
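As a rough sketch of the kind of function being described, here is a TypeScript rendering, chosen because its syntax is close to AS3. The function name `compound` and the empty `if` body are assumptions. Only the variables a through e and the literals 1 through 5 come from the bytecode discussion in this article. In AS3 the parameters would be typed `int` and compared with `==`.

```typescript
// Hypothetical sketch: five comparisons joined by the "logical and" operator.
// The if body is intentionally empty, so the function has no observable effect.
function compound(a: number, b: number, c: number, d: number, e: number): void {
    if (a === 1 && b === 2 && c === 3 && d === 4 && e === 5) {
        // nothing to do; only the comparisons themselves are of interest
    }
}
```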

First note that this function doesn’t really do anything. MXMLC is perfectly free to delete the contents of this function as an optimization since they can have no effect on anything. However, MXMLC (at least in version 4.1) doesn’t do this sort of optimization. While bad for performance, this lets us see the bytecode that would be generated if MXMLC were forced to keep the if statement around, as it would be in a real function. So, let’s examine the output bytecode:

While a bit long, this bytecode is basically doing the same thing over and over as it checks a, b, c, d, and e against 1, 2, 3, 4, and 5. The way it does this is through the following series of operations:

While not extremely complicated, this approach has the downside of performing multiple jumps via iffalse and quite a bit of seemingly-unnecessary stack operations like dup. Let’s look at an alternative approach in AS3 to see if we can trick the compiler into generating better bytecode whilst still accomplishing the exact same task:
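A minimal sketch of what such an alternative might look like, again in TypeScript as a stand-in for AS3: each comparison gets its own nested `if`, so no logical operators appear at all. The name `chain` and the empty innermost body are assumptions made for illustration.

```typescript
// Hypothetical sketch: the same five comparisons written as a chain of
// nested if blocks instead of a single && expression.
function chain(a: number, b: number, c: number, d: number, e: number): void {
    if (a === 1) {
        if (b === 2) {
            if (c === 3) {
                if (d === 4) {
                    if (e === 5) {
                        // reached only when all five comparisons succeed
                    }
                }
            }
        }
    }
}
```

The innermost block runs under exactly the same condition as the body of the && version, and a failed comparison skips everything nested inside it.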

Although the AS3 source is longer, the bytecode it produces is quite a bit shorter and therefore starts out with a natural advantage: smaller SWF size. Let’s see how it accomplishes the task:

getlocal1 – Push the variable to compare to the operand stack

pushbyte 1 – Push the literal value to compare to the operand stack

ifne L1 – If the variable doesn’t equal the literal, skip to the end

By using ifne instead of equals, dup, and iffalse to do comparisons, this bytecode is much more compact and takes much better advantage of the available instructions. Further, this version properly skips to the end of the if chain as soon as a comparison fails rather than simply skipping to a strange mid-way point in the process. This allows the Flash VM to “short circuit” the comparisons properly and hopefully save some execution time. Speaking of execution time, let’s see a performance test:
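The shape of such a test can be sketched as follows, in TypeScript rather than AS3. Everything here is an assumption made for illustration: the names `compound`, `chain`, and `timeIt`, the repetition count, the use of `Date.now()` for timing, and the decision to wrap the comparisons in functions. Timings from a JavaScript engine say nothing about MXMLC or the Flash VM, so this only shows the structure of the measurement.

```typescript
type Args = [number, number, number, number, number];
type Check = (a: number, b: number, c: number, d: number, e: number) => void;

// "Compound" approach: one if statement using logical operators.
const compound: Check = (a, b, c, d, e) => {
    if (a === 1 && b === 2 && c === 3 && d === 4 && e === 5) {
        // intentionally empty
    }
};

// "Chain" approach: one nested if block per comparison.
const chain: Check = (a, b, c, d, e) => {
    if (a === 1) {
        if (b === 2) {
            if (c === 3) {
                if (d === 4) {
                    if (e === 5) {
                        // intentionally empty
                    }
                }
            }
        }
    }
};

// Run `fn` `reps` times with the same arguments; return elapsed milliseconds.
function timeIt(fn: Check, args: Args, reps: number): number {
    const start = Date.now();
    for (let i = 0; i < reps; i++) {
        fn(args[0], args[1], args[2], args[3], args[4]);
    }
    return Date.now() - start;
}

const REPS = 1000000;                  // hypothetical repetition count
const allTrue: Args = [1, 2, 3, 4, 5]; // every comparison succeeds
const allFalse: Args = [0, 0, 0, 0, 0]; // the first comparison already fails

console.log("compound, all true: ", timeIt(compound, allTrue, REPS));
console.log("chain, all true:    ", timeIt(chain, allTrue, REPS));
console.log("compound, all false:", timeIt(compound, allFalse, REPS));
console.log("chain, all false:   ", timeIt(chain, allFalse, REPS));
```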

The performance test measures the time it takes to run through these five comparisons via the “compound” approach (using logical operators) and via the “chain” approach (using a chain of if blocks). Here are the performance results I get:

| Environment | All True: Compound | All True: Chain | All False: Compound | All False: Chain |
|---|---|---|---|---|
| 3.0 GHz Intel Core 2 Duo, Windows XP | 806 | 428 | 565 | 291 |
| 2.0 GHz Intel Core 2 Duo, Mac OS X | 1302 | 639 | 844 | 432 |

As you can see, the performance difference is striking: if chains are about twice as fast regardless of platform and regardless of whether the comparisons all succeed or all fail. If you don’t mind a little extra typing in AS3, you can save a little SWF size and a lot of execution time by using an if chain instead of logical operators like &&.
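To check the “about twice as fast” figure, the speedups can be computed directly from the measured times. This is a small TypeScript sketch; the helper name `speedup` is made up for illustration, and the numbers are the ones reported in the results above.

```typescript
// Ratio of compound time to chain time: how many times faster the chain is.
function speedup(compoundTime: number, chainTime: number): number {
    return compoundTime / chainTime;
}

console.log(speedup(806, 428).toFixed(2));  // Windows XP, all true:  1.88
console.log(speedup(565, 291).toFixed(2));  // Windows XP, all false: 1.94
console.log(speedup(1302, 639).toFixed(2)); // Mac OS X, all true:    2.04
console.log(speedup(844, 432).toFixed(2));  // Mac OS X, all false:   1.95
```

All four ratios fall between roughly 1.9 and 2.0.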

Wow, apparently HaXe does an even worse job at this than MXMLC/AS3. Your processor is comparable to the ones I used in the test and yet the HaXe test running on it is 2-3x slower. Also, it’s just as disturbing to me that there is a ~3x advantage in HaXe with the “compound” method as the ~2x advantage I found in AS3 for the “chain” method. Really, both methods should result in the same bytecode and it should be the fastest bytecode possible. Currently, that’s the bytecode that MXMLC generates for the AS3 “chain” method as shown above.

Thanks for the HaXe figures. HaXe has been very interesting to me for quite some time and it’s always good to see how performance is shaping up with it.

Wow, this is interesting. Unfortunately, for me this is something that should be done at the toolchain level (nice to hear Joa is going to implement this in Apparat ^_^), since it breaks readability (well, for me at least) :(