public business secrets

Post navigation

How to avoid premature optimization

A common quote linked with Donald E. Knuth of TeX fame is “premature optimization is the root of all evil”. While this might sound a bit harsh, it holds a lot of truth.

Performance as an asset

If you consider software performance as an asset, you can determine its characteristics and derive your decisions about whether to work on it from them. For example, you will discover that while a good performance is paramount, there is a certain threshold when further optimizations are worthless from the asset’s point of view. If you happen to develop a game, you only need to draw as much frames as the monitor can handle. If you process sensor data in real time, there is no need for a prolonged pause between data packets, because computers don’t grow tired.
If you treat performance as an asset, you can also apply a worth to every optimization you want to make and contrast it to the cost of the work you expect to have to invest. This divides the possible optimizations into a group of lucrative and a (probably larger) group of unprofitable investments.

Simple rules

Treating performance as an asset gives you the mental tools to make profound decisions about when and what to optimize. But there are also three simple rules you can apply if you don’t want to write a business plan every time you think “if I just change this line, the code will run much smoother”.

First rule: Don’t

The first rule of performance optimization with a tendency to avoid premature optimization is to just don’t care. You ask yourself if a LinkedList is faster than an ArrayList for a given use case? The short (and ignorant) answer is: both will be fast enough. Is it better to explicitly set all references to null after usage? Why bother when the garbage collector won’t slow you down anyway. Following this rule, you deliberately act dumber than you are with the goal to delay action.

There is a disclaimer, though. There are two different kinds of performance optimization: The first one was referenced in the examples above and deals with actual, but rather local code changes. The second and more important type of performance consideration deals with complexity theory (the one with big O notation) and isn’t measured in milliseconds, but in scalability. You don’t want to be ignorant of the latter type because it will always ruin your runtime behaviour regardless of any optimization of the former type if you implement an exponential or even factorial algorithm. You can be ignorant of “real performance tuning”, but should always be aware of the complexity category your algorithm is living in.

Second rule: Not Yet

There will be a moment when you clearly see an opportunity to improve the runtime performance of your code with just this very small (and very clever) modification. This is when you are ready to break the first rule. Now you should adhere to the second rule: If the cost is as marginal as you say and the gain is profound, go for it – but not now. Performance tuning isn’t a time limited sale that you are only offered right now or never. You can make the same change and reap the same advantages next week or next month. You doubt that you will remember the details? Write an issue or insert some code comment about it. You probably have another task on your todo list that is more important than speeding up the functionality at hand.

The goal of the first rule was to delay action, and that’s the goal of the second rule, too. You’ve probably guessed it already: you avoid premature optimization best by not optimizing at all or at least not optimizing too early. You need to be sure about the value of an optimization before you implement it. As a result of the second rule, your code will be enriched with possibilities for performance improvement. And if you actually need to improve your performance, you can orient yourself along these possibilities or find them then. You want to invest in the tuning business as late as possible, for it is highly speculative.

Third rule: Measure

If you cannot hold on to the first two rules, for example when a real performance issue is reported, you need to take action. But as you are going to invest work into performance optimization, you can as well invest it efficiently. In most applications, there is a 90/10 rule in effect, stating that 90 percent of the runtime is spent in just 10 percent of the code. If you don’t know exactly where your performance bottleneck is, find it using a profiler and remember the 90/10 rule. It’s not efficient nor effective to improve the 90 percent of your code that doesn’t matter in regard to performance.

If you have identified the piece of code that most likely slows your application down, you should remember the second part of the third rule: Never make performance optimizations without a meaningful benchmark that you can run beforehand and afterwards. All to often, the clever performance trick you remember from long ago is actually hurting your performance now. A meaningful benchmark will tell you if you did good. To make a benchmark “meaningful”, you really need to read up on benchmarking in your target platform. In Java, for example, you need to know about proper warm-up of the VM and perform enough cycles to not include one-time effects in your numbers. If you’ve written such a benchmark, keep it! Try to fully automate it and let it be the cornerstone of your growing performance test suite. There might come the day when this test/benchmark tells you that your formerly clever optimization is now obsolete due to internal platform changes.

Conclusion

If you follow these three simple rules, you won’t automatically write high performance software. But you will spend your valuable time fixing real performance issues instead of tinkering with your code to no effect. You definitely won’t optimize prematurely and steer clear of this “root of all evil”.

Not only should you regard optimizations among each other as an investment, but the act of optimizing itself. The time you do an unnecessary optimization is better spent fixing bugs, because no one wants a program that is fast but doesn’t work correctly.
And if you happen to gain performance through some kinky hack, you hurt your codebase and/or stability and that’s a huge price to pay.

I think premature optimization is often executed because the developer underestimates the speed of modern computers.
I’m developing a game at the moment and sometimes I think: It’s impossible that the CPU will iterate through hundreds or thousands of game objects, handle physics AND update GPU data sixty times per second without performance tweaks.
But yes, the CPU can, and it will still have idle time. It’s pretty astonishing.

One thing I’ve learned from Daniel is: It depends. If your miscalculations will cost money, or worse yet, lives, performance is secondary at best.
On the other hand, I would rather use a chat application that drops every 100th message than one that needs 3 seconds to deliver a message.

I would like to see performance/scalability tests early on in the project. Like with automated (unit) tests you want to catch really bad scalability/performance early on. So maybe point 3 should be the first point. With numbers you can better decide if you want to invest in optimization or not.

Yes, it depends mostly on the requirements of the domain and your customers/users. About the chat application: in this case I see it the other way around: for me I can wait some time to get the message but when it is lost: that’s pretty annoying.