I've been programming in Java for quite some time, and I always find myself doing very little "planning" (I haven't worked on anything huge yet, but I'm no stranger to big projects); I develop ideas and fix code efficiency as I go.

I tend to write a block of code, test it, test it again, try to break it, then make it look nice (visually) and then worry about modifying it to be more efficient (if possible).

Is this how most people think/work? I'm wondering if anyone else takes the time to pretty up their code after it's done, or if they plan how it's supposed to look and the most efficient way of writing it before actually coding anything.

I'm trying to fine-tune my own efficiency, and hearing from the community will give me a broader idea of how professional and amateur programmers alike think.

There are a lot of good answers, but it depends on what you mean by making your code more efficient. You need to make the correct architectural decisions before, and sometimes during, coding (obviously it's bad to reparse a string an extra 20 times because you chose the wrong algorithm), but don't try to make 'tiny' optimisations to the code (especially things that make the code less readable or understandable) -- let the compiler or JIT do that.
– Code Bling Jun 7 '11 at 15:09

Putting together a flying kite? First get it to fly. Once it's flying, make it fly higher until the rope snaps. Build a better one and get it to fly even higher than the first kite. Once you can't go any higher, make it go faster...
– Darknight Jun 7 '11 at 15:22

I almost never think about it. It's very rare that well-written code is too slow or too big, in my experience.
– nbt Jun 7 '11 at 17:34

13 Answers

There are times when I just need to add a piece of functionality, and since this piece is just done by gluing together other (existing) pieces, I don't worry too much about efficiency.

There are other times where I know from the start that efficiency will be a bottleneck. That's when I first think about the most efficient way to achieve the solution and then I sit down and write the code.

In any case: cover your code with unit tests as much as possible. Once you have a working solution you can then go back and make it better (in any perceivable way).

Over the years I have done a few projects where at a later stage in the project some major refactoring was needed because of insufficient performance, and the code wasn't easy to tweak. That's where experience comes into the picture. After 20 years of writing code, you get a pretty good understanding of where your bottlenecks will be, and those are the things you want to think about before you actually write any code.

Unreadable code is what costs you a lot of time, energy and money later on. Therefore, make sure the code is as readable as possible.

Optimize for performance when absolutely necessary.

Often, optimizing for performance means producing less readable code. This is acceptable if absolutely required, e.g. because you can't fulfill a non-functional requirement without it. If this is the case, it also means that you have a very clear goal: You know exactly how much faster it needs to be, and when you can stop optimizing. That way, you won't spend more time than necessary on the fix and you won't change more of the running system than absolutely necessary.

That being said, I haven't followed my own advice often enough, only to find out later that my performance optimization was completely unnecessary (or worse: broke something) :(

EDIT: There's an exception to this rule: when it's obvious from the very start that you will need to squeeze out every last drop of performance, don't wait until you run into performance problems. This may be the case for OS design, system-critical realtime applications or (far more important in the eyes of the end-user) mobile applications.

Veto on "optimize for performance when absolutely necessary". It depends heavily on the environment you work in. If you develop a mobile app or in an embedded area, you have to care about performance from the very beginning.
– WarrenFaith Jun 7 '11 at 15:08

@Simon: +1 for the (far more important in the eyes of the end-user) :)
– WarrenFaith Jun 7 '11 at 15:14

I agree with Simon, even in the mobile or embedded world, readability should come first because even those applications have features that are subject to change. It doesn't matter if the app is fast or great on battery life if your clients don't want it.
– Peter Smith Jun 7 '11 at 15:15

+1: It's also much more likely you're going to change code in terms of maintenance (process/requirements change or refactoring) than solving performance bottlenecks. So it makes sense that readability is first. Never thought of it this way, but you made me think of it.
– Robert Koritnik Jun 7 '11 at 15:39

In general, write your code to work first. Once you've proven it can work, then you make it work faster - if necessary.

Sometimes through experience you learn little tricks that make small improvements to performance at little to no extra cost to code up front, so you may implement those in the initial iteration (and if my brain were fully awake I'd give you some recent examples of mine).

The primary goal is to get something working; in some cases, anything that works will do.

Then, once you've got it working, worry about making it fast.

However, you may need to think about efficiency in your initial coding or design if you know enough about your requirements.

For example, suppose you're writing a social networking site. Initially it's only going to be rolled out to your class/work mates. You've no idea whether it will take off or not, so you don't have to worry about handling hundreds or thousands of concurrent users. However, if you're writing a stock control system and you know that your customers have millions of items in their database, then you need to be looking at how to handle this volume of data from the very beginning.

I tend to write a block of code, ... then worry about modifying it to be more efficient (if possible).

This is generally doomed to failure.

A really bad algorithm cannot be tweaked into efficiency. You cannot put lipstick on a pig.

If, through lack of planning, you chose a dreadful algorithm, what you have written may not ever work acceptably fast.

You must choose the correct algorithm first.

Once you've chosen the correct algorithm, you still have to write tests, write code and make the code look good.

But, if you've chosen the right algorithm, you rarely (1% of the time) need to make it "as efficient as possible". It's already as efficient as possible. The only thing left to do is fix any clumsy coding mistakes that defeat the optimizer.
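A contrived sketch of that point (class and method names are mine, not from the answer): checking a list for duplicates pair-by-pair is an O(n²) algorithm, and no micro-tweaking changes its growth rate, while the HashSet version is O(n) and needs no further "efficiency" pass.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateCheck {

    // Wrong algorithm: compares every pair, O(n^2).
    // No amount of tweaking changes its growth rate.
    static boolean hasDuplicateQuadratic(List<String> items) {
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                if (items.get(i).equals(items.get(j))) {
                    return true;
                }
            }
        }
        return false;
    }

    // Right algorithm: a HashSet makes it O(n), already as efficient
    // as it needs to be the moment it is written.
    static boolean hasDuplicateLinear(List<String> items) {
        Set<String> seen = new HashSet<>();
        for (String item : items) {
            if (!seen.add(item)) {   // add() returns false if already present
                return true;
            }
        }
        return false;
    }
}
```

Both return the same answers; only the first one ever becomes a performance problem as the list grows.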

But what about Unix's "When in doubt, use brute force"?
– Christopher Mahan Jun 7 '11 at 19:40

@Christopher Mahan: Prevent doubt: research first. Also, it doesn't really matter: brute force can't be "tweaked up" to make it run faster; it's always going to be slow. Always pick the optimal algorithm first, and if the optimal algorithm is still slow, it's still optimal. You solve performance first by using an optimal algorithm (even if it's brute force).
– S.Lott Jun 7 '11 at 20:00

Yeah, but what if, after research, you're still in doubt? And brute force can definitely be tweaked up: you just have to rewrite that piece of the code. Only after you've measured, and only if it's a problem. Right?
– Christopher Mahan Jun 14 '11 at 1:12

@Christopher Mahan: "And brute force can definitely be tweaked up". This has limited benefit. You'll never get 2x or 10x out of tweaked up. You get 5% or 10%. Tweaking only helps make the right algorithm a little faster. It won't help the wrong algorithm much at all.
– S.Lott Jun 15 '11 at 15:08

@S.Lott No, you rip that whole part out and you recode with a different algorithm, different approach. I never said you need to keep the brute-force approach in place.
– Christopher Mahan Jun 15 '11 at 21:11

I take an iterative approach where I get the bare minimum working, test it, then add the next increment and test it, rewriting for readability and maintainability as I go.

As far as efficiency goes, I almost always optimize for the large efficiency gains up front, and almost never worry about the small efficiency gains. That means choosing the best asymptotic complexity for the algorithms and using data structures that work well with those algorithms from the beginning, but not worrying about squeezing out an extra cycle or byte here or there.

However, I also don't sacrifice efficiency for readability or development speed unless it is absolutely necessary. There are times when I know an O(log n) or even O(1) way to do something, but I know I'll never have more than 100 or so elements, and the O(n) way will take me a tenth the time to develop and test, and be very straightforward to maintain.
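A sketch of that trade-off (the example and names are mine): with a handful of entries, a plain O(n) scan over a list of key/value pairs is simpler to write, test, and maintain than a keyed structure, and its cost is negligible at that size.

```java
import java.util.List;

public class SmallConfig {

    // With well under 100 entries, a straight O(n) scan is fast enough,
    // and far simpler than building and maintaining a hashed or sorted
    // structure. pair[0] is the key, pair[1] the value.
    static String lookup(List<String[]> pairs, String key) {
        for (String[] pair : pairs) {
            if (pair[0].equals(key)) {
                return pair[1];
            }
        }
        return null;  // not found
    }
}
```

If the entry count ever grows into the thousands, swapping the list for a HashMap is a small, local change.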

I always start by writing the most expressive code I can. I worry about efficiency by choosing the correct algorithm for the problem. If the input is known to be small, then a simple brute-force algorithm is better than complex but theoretically faster code.
Even O(2^n) may be ok if n will never exceed 20.
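For instance (a sketch of mine, not from the answer): an exhaustive subset-sum search is O(2^n), yet perfectly serviceable while n is capped around 20, since that is only about a million subsets to try.

```java
public class SubsetSum {

    // Exhaustive O(2^n) search over all non-empty subsets: brute force,
    // but entirely adequate while values.length stays at 20 or below
    // (roughly a million bitmasks).
    static boolean canReach(int[] values, int target) {
        int n = values.length;
        for (int mask = 1; mask < (1 << n); mask++) {
            int sum = 0;
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    sum += values[i];
                }
            }
            if (sum == target) {
                return true;
            }
        }
        return false;
    }
}
```

The simple, obviously-correct version wins here; a cleverer dynamic-programming variant would only be worth the effort if n grew.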

I never worry about low-level optimization until I have a demonstrated problem. For example: I don't try to limit Java object creation; if that is a concern then Java is probably the wrong language for the job. Similarly, I rarely save computed properties of mutable classes; doing this by hand is ugly and error-prone.

+1, though there is an exception with Java: boxed primitives. Avoiding boxing isn't so hard that you should use some other language instead, and it often makes the difference between Python-like performance and C-like performance.
– Rex Kerr Jun 7 '11 at 19:32

@Rex - I might think about that in very tight loops. I was happy when boxing/unboxing became automatic. Now if only Java would join the 20th century and support operator overloading.
– kevin cline Jun 7 '11 at 23:07
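The boxing cost the comments describe can be sketched like this (a contrived example of mine): a boxed accumulator forces an unbox/re-box, and usually a heap allocation, on every loop iteration, while the primitive version stays in a register.

```java
public class BoxingDemo {

    // Boxed accumulator: every '+=' unboxes total, adds, and re-boxes,
    // typically allocating a new Long object each time around the loop.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Primitive accumulator: no allocation at all; the JIT can keep
    // the running total in a register.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }
}
```

Both compute the same sum; in a tight loop over large n, the primitive version is the kind of near-free win Rex Kerr's comment is pointing at.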

Who provides the input, and when/how often do they provide it?
That determines whether the input should be stored in some data which is then pawed over by some interpreter-like program, or translated into compilable code to do what the interpreter would do (but much faster).

If you elect the data-structure route,
what's the time span between the time information is acquired, and the time it is needed?
That determines the minimum data structure you need.

Is it reading a file and processing it? Then ideally it should be I/O bound, reading no more than necessary.
Similarly, it should write no more than necessary.

When you've framed the program that way and you've decided you need a data structure, keep it as simple as possible. That means normalized. It means no notification-style consistency maintenance if you can possibly help it. If you must have denormalization, be willing to tolerate temporary inconsistency that you clean up with a periodic sweep.

When you've got it working, then don't optimize anything without letting the program itself tell you where it is needed. Some people say "profile". That's the right idea, but I find random-pausing much more direct and effective.
If you start optimizing anything, without having had the program tell you what to optimize, you're walking into the guesswork trap, which is almost always a big disappointment.

I always ask myself, starting out, where the majority of the waiting is going to occur:

Waiting for the program to exist so it can be run

Waiting for the program to be fixed so it can run properly

Waiting for the program to finish executing so you can see the result

If the answer is 1 or 2, then I only worry about efficiency in the sense that I avoid doing incredibly inefficient things (e.g. using a linked list when I should use a hash map). If the answer is 3, then at the outset I consider which algorithms will allow computational efficiency.
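A minimal sketch of that linked-list-vs-hash-map point (my example): keyed lookups in a list are an O(n) walk each time, while a HashMap keeps them O(1) on average, so counting occurrences is the natural hash-map job.

```java
import java.util.HashMap;
import java.util.Map;

public class WordCount {

    // HashMap lookups are O(1) on average; doing the same with a linked
    // list of (word, count) pairs would make every lookup an O(n) walk,
    // turning the whole count into O(n^2).
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);  // insert 1 or add 1
        }
        return counts;
    }
}
```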

Keep in mind that there may be external constraints that alter the decision. For example, if the code is mission-critical, then proper execution wins over rapid execution (so, for example, one would choose to write code in a more immutable/functional style even in those cases where there was an extremely efficient mutable alternative). If the code needs to ship soon, then getting the program working at a tolerable level may be necessary to the exclusion of good performance. If the code is going to be used by hundreds of thousands of people, then tiny improvements in performance (if the program ever takes long enough to be perceptibly slow) save a huge amount of time overall, even if it takes you an hour to shave a second off of a wait time.

If you know in advance that you want code that performs well, you'd better design it that way from the start. Achieving good performance is not just a matter of getting something that works and then applying successive rounds of optimization; you need to have algorithms and a design that are compatible with efficiency. For example, if you have a computationally intensive process that you decide to solve by rearranging a complicated data structure over and over, and then you decide you need to parallelize your code for additional speed, you're going to have a problem.
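A small sketch of that design point (my example, not the answer's): a computation expressed as a pure function over independent inputs parallelises with one method call, whereas a design built around repeatedly mutating one shared structure would need locking, and likely a rewrite.

```java
import java.util.stream.IntStream;

public class ParallelFriendly {

    // Each term depends only on its own index, so the work splits across
    // cores trivially; a shared mutable accumulator would need locking
    // (or redesign) to parallelise safely.
    static long sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n)
                        .parallel()
                        .mapToLong(i -> (long) i * i)
                        .sum();
    }
}
```

Dropping the `.parallel()` call gives the identical result sequentially, which is exactly the property a parallelism-compatible design buys you.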

It really depends on what I'm writing. At times, I glue together code for prototyping purposes to ensure feasibility, then go back to the drawing board and think (usually when reusing/integrating well-established software blocks). At other times, I want to make sure things will work as efficiently as possible, so it's pen and paper until I'm confident I can write whatever it is I'm building efficiently, with possible additional optimizations down the road.

For database code, I have learned which constructs are most likely to produce efficient queries, and I use those first. It doesn't take any longer to write efficient SQL if you already know what is more efficient and what is less efficient before you start. Since efficiency is critical to database-oriented applications, you must, in my opinion, practice coding for efficiency. This is not "premature optimization"; it is doing the job correctly the first time. Many of the more efficient constructs are hugely more efficient than something like a cursor, for instance, and there is no excuse for not using them from the start. I believe it is different when writing application code, but efficiency is critical when writing database code. Good design should consider which of two or more choices is likeliest to work most efficiently.