Wednesday, October 31, 2012

Turn Those IF Branches Around

When I find a conditional structure of IF-ELSEIF-ELSE or bigger, in most cases I've noticed that the first things being checked for are the exceptional conditions, the exclusionary conditions, and then the expected follow-the-rules condition will be the last piece. In effect, the branch that gets chosen the most is the last one reached.

In looking for refactoring opportunities to simplify these, it occurred to me to wonder "what happens to the conditional if I check for what I really want first?" In some cases, it has turned out to greatly simplify the check altogether, even to allow for IF to be the branch I want, and ELSE be a simplified catch-all for the conditions I don't want.

No concrete examples here... just a thought to ponder when you see a structure like this. It has proven useful enough times to me that I thought I'd share it.

I think from a readability and maintainability perspective, it makes sense. If the code follows your natural thought process, it should be easier to understand.

The way I typically approach IF-ELSE blocks is by being lazy. The check that requires the least amount of work goes first. For example, checking a boolean is easy. Calling a method is harder. Looking stuff up in a database is hardest. If the boolean check fails, you don't have to do as much hard work to check the other stuff.

One of the reasons to look for exceptional conditions first is to prevent deep nesting. Check for the "bad" conditions first and return early from method. That way, the "happy path" can be outside the conditional structure, and the majority of the logic is at the top level of the method body. Greatly simplifies readability, and tends to lead to shorter, less complex methods.

Don't get me wrong... I'm not knocking it. Just highlighting a question to answer to oneself before leaving a complicated IF-ELSE in place :-)

Single checks that are standalone and return early are indeed one of the patterns I learned since coming to PHP. Being the dinosaur that I am, the old "only ONE return point per function" rule was a hard one to break myself from. If you grew up on that rule, you wrote increasing deeply nested IF-ELSEs that had more IF-ELSEs inside them (yuck).

If you're writing for performance, you should put the most commonly occurring conditional first, that was your code has less 'else' conditionals to test. However, this is a micro optimization and not recommended in the vast majority of cases.

Another trick I use, is to add failure filters as I go in order to reduce indentation headaches, ie:

Matthew's suggestion definitely leads to more readable code, but I don't find it to be a good rule of thumb.

Your branches should be ordered from most likely true to least likely true. If you check the exclusionary cases first, then you're guaranteeing that your code will always run those checks, when they may only be true in a ridiculously small number of cases. That's wasted processing.

So, I would say that trivial exclusionary checks (e.g. checking for null values and such) should be checked first and returned from, but anything that's non-trivial should come last in your if/else block.

The same logic holds true for multiple conditions within the same check (e.g. if A and B and C). You always want your most-likely-to-be-true condition to come first.

Hehe... again, this is just a suggestion to ask oneself when seeing a complex conditional block, in case viewing it from the other direction results in a less complex block. It's not a silver bullet, or the One True Way.

In the case of returning early so that longer logic can follow more legibly, my point does not apply.

In the cases where this step-back-and-reconsider things was helpful for me, I saw three or more conditionals where nothing "returned early", as each branch was choosing what data to set in order for the overall process to continue. In all the instances, I discovered that I could reverse the direction of the individual conditions, join them into the same IF, and then the IF pointed to the this-is-what-we-want result. That then left the one ELSE to provide for the errors, which usually was either "foo is empty" or "foo is invalid". I took this as a refactor-for-readability win :-)