Hey, just thought I'd ask for some good tips for optimizing my add-ons. Any really obvious things I should do/ tools I should use?

So far what I've been doing is focusing on functions that run frequently (OnUpdate, UNIT_AURA, log events) and removing all calls to global functions/ variables, plus making sure no variables are created in that scope. The only tool I have for it is WowGlobalFinder, and, err, the game's default display of the 3 add-ons that take up the most memory.
What other techniques and add-ons should I know about? I can only assume there are a bunch of tips and tricks for working with strings and tables that I'm not aware of...

__________________SanityCheck - If you've ever said the words "Sorry, I forgot" then you need this add-on.

Remember, every time you post a comment on an add-on, a kitten gets its wings!

1. Make sure you are not leaking any globals, and reconsider the necessity of any intentional globals, including the names of frames and other objects. Your main addon object and/or main display frame are good candidates for global names; other objects generally don't need names unless you're using Blizzard templates that require them.

2. Upvalue any globals you're reading in frequently called functions, like OnUpdate, CLEU, UNIT_AURA, etc. Don't bother upvaluing globals you only read infrequently, such as in response to the user changing an option, the player leaving combat, etc.

3. Avoid using OnUpdate scripts for anything other than updating a visual display, such as a statusbar, and move as much of that work as you can to other places, such as event handlers. For example, if you're showing a timer bar for a buff on the player, don't call UnitAura each time your OnUpdate script runs -- store the info you need in variables, and update those variables when UNIT_AURA fires for the player unit instead. Depending on what you're updating, you may be able to offload some/all of the work into C code by using animations.

4. Avoid calling functions when you can, as it's really slow. Some common examples of unnecessary function calls include:

(a) using local x = select(2, ...) instead of local _, x = ...
(b) using string.format("raid%d", i) instead of "raid"..i
(c) using for i, v in ipairs(tbl) do instead of for i = 1, #tbl do local v = tbl[ i ]

5. Avoid creating new tables when you can reuse existing tables. In keeping with #4, don't use wipe unless you actually need to -- for example, if you have a table that's storing info about a buff obtained via UnitAura, you don't need to wipe it, since the keys don't change, and you just overwrite the values.

6. Keep variables limited to the narrowest scope necessary. For example, if a variable is only used inside a loop, define it locally inside the loop, not outside.

7. Call string functions directly, eg. format("banana", "%l", strupper, 1) instead of of ("banana"):format("%l", strupper, 1) -- both incur the cost of a function call, but the second also incurs the costs of getting a metatable and performing a table lookup. If you upvalue format then the direct-call method is enormously faster, but even as a global lookup it's still slightly faster. The only time I'd recommend using the meta-method is when you're chaining multiple string operations; in that case, it's still the slower way to do it, but the increased readability of your code is usually worth it.

No, I don't think you want me to post the thousands of lines of all my add-ons

Good tips, I especially like 5 (because I haven't thought of it at all).
However, 4 surprised me a bit. When I was doing my testing, defining a single number variable in OnUpdate had a very visible effect on my performance; calling a function (or twenty) didn't seem to have such an effect.

__________________SanityCheck - If you've ever said the words "Sorry, I forgot" then you need this add-on.

Remember, every time you post a comment on an add-on, a kitten gets its wings!

Static memory usage is almost completely irrelevant, and I have nothing but contempt for whichever Blizzard developer thought it would be a good idea to put "how much memory your addons are using" into the default UI. Most users (and most addon developers) do not understand what those numbers mean, and having them displayed just leads to annoyed developers having to explain over and over that "lower memory usage" is not intrinsically good or a worthwhile goal.

The metrics you should be caring about -- the ones that actually matter -- are increasing memory usage (how quickly your addon's memory usage is growing) and CPU time. Those are the things that actually affect framerates. I use OptionHouse to measure them, but there are plenty of other options, including Broker plugins. Make sure you disable CPU profiling when you're done, as leaving it enabled will slow down everything across the board.

This thread on WowAce from last year includes some benchmark tests I did showing the differences in speed (CPU time) for global lookups vs. upvalues, and different styles of calling string methods:

Also, you should just attach your files, or post a link to your addon download. It doesn't really matter whether it's 100 lines or 100,000 lines; any bad coding habits will be just as easy to scan through and see regardless of the size, and I have a lot of free time at work.

Don't worry, I get that static memory bit. What I meant was, that display helped me see how SanityCheck was taking up rapidly increasing amounts of memory.
I wouldn't go as far as to flame the very idea of that display... my machine, for one, is old enough for me to care if I have several heavy add-ons loaded, even if they don't use much CPU. I doubt it was intended for this purpose; Blizzard, or any other sane organization, would not put that kind of tool on the main display of their game.

Also, just to make sure we're on the same page: when you say "upvalue" you mean this sort of assignment, right?

Lua Code:

local_G=_G

Anyway, I will definitely get that OptionHouse and see about that old thread, thanks.
And if you're truly bored enough, please have a look at this and this and maybe this. Do not under any circumstances look at the code of MooTrack, it was my first add-on over 50 lines and the code sucks immensely

__________________SanityCheck - If you've ever said the words "Sorry, I forgot" then you need this add-on.

Remember, every time you post a comment on an add-on, a kitten gets its wings!

If we would ignore the fact that I need maxCPoints as an upper bound for the loop for the sake of the example, wouldn't it be still better to define maxCPoints outside the for-loop, despite it is only needed inside? Wouldn't this spare me a global look-up for MAX_COMBO_POINTS and a garbage collection on every loop iteration? I don't know how GC works in loops, but if it collects on every iteration would this be cheaper than having to move one scope up to find maxCPoints?

That's actually more specific of a situation, as your definition of maxCPoints is needed in all iterations in the for loop, it's best to keep it where it is instead of reinitializing the same value to the same local multiple times. This can impact performance to do so unnecessarily.

As a rule of thumb, if a variable is needed in all iterations of a loop instead of in each independent iteration, it's best to define it as an upvalue instead of inside the loop.

As far as using unpack() versus manually indexing tables, it depends on how you're indexing the table for each return and at some point, I would think unpack() would start to be more appealing, but no data exists to suggest at what point (if any) this will start to happen.

It can be guessed at that indexing the table will happen in C side when dealing with unpack(), but another question is how would it compare to manual indexing in a single assignment statement? To make the comparison fair, the manual indexing will need to be a single-dimensional table. For example, ns.colors.cpoints[i][1] will use 4 indexing operations for a single value when storing the specific table into a local, then indexing it in your function call will result in only 1 index operation per value.

What is the difference between "needed in all iterations" and "needed in each independent iteration"? Please ignore the use of maxCPoints as the upper bound of the loop as this would be an obvious reason to define it locally in a higher scope as I also need it in each iteration. I thought that the point here would be that I assign a global (MAX_COMBO_POINTS) to maxCPoints, so that I would have a local variable definition, a global look-up and a garbage collect(??) for each loop iteration, where the global look-up is the most expensive part. So would the global look-up be reason enough to define maxCPoints outside the loop despite it being used only in the loop?

What is the difference between "needed in all iterations" and "needed in each independent iteration"? Please ignore the use of maxCPoints as the upper bound of the loop as this would be an obvious reason to define it locally in a higher scope as I also need it in each iteration. I thought that the point here would be that I assign a global (MAX_COMBO_POINTS) to maxCPoints, so that I would have a local variable definition, a global look-up and a garbage collect(??) for each loop iteration, where the global look-up is the most expensive part. So would the global look-up be reason enough to define maxCPoints outside the loop despite it being used only in the loop?

I think you misunderstood the message Phanx was trying to convey. She meant more along the lines of your cPoint definition inside the loop as being the correct way to do that as some people would define cPoint outside the loop and update it within the loop in the hopes of avoiding non-existent overhead from garbage collection.

A good rule of thumb is that the cost of defining a local vs indexing hits the break even point at 3 look-ups. So, if you need an indexed value 3+ times you are not hurting anything by creating a local reference and using it instead.

That said, you do not need MAX_COMBO_POINTS within the loop at all. The calculation you are doing is static for each iteration so it only needs to be done once prior to the loop. Doing that also lets you get rid of the cPoint:GetWidth() index/call as you will already have that value.

Changes:
1. Moved the size calculations to outside the loop
2. Remove the name of the points to avoid clutter in _G
3. Changed the math a bit on sizing to make full use of the width passed
4. Used local references for indexes used in the loop (if you don't use the alpha color parameter remove the color[4] bit)

With that said, I can't imagine that function ever being called enough times to have ever warranted more that a quick pass for optimization. You only really need to scrutinize code for heavily called stuff like OnUpdate scripts and certain event handlers (CLEU, UNIT_AURA, etc...).

What is the difference between "needed in all iterations" and "needed in each independent iteration"?

To explain this simply, every single time code in a loop is run is considered a single iteration of the loop. All locals defined inside a loop only exist in a single iteration independent of all the others that may be run. As an example, a loop that runs code 5 times has 5 iterations whose scope is independent of each other.

__________________

"All I want is a pretty girl, a decent meal, and the right to shoot lightning at fools."

As far as using unpack() versus manually indexing tables, it depends on how you're indexing the table for each return and at some point, I would think unpack() would start to be more appealing, but no data exists to suggest at what point (if any) this will start to happen.

I did some benchmarking on this a while ago. I don't know whether t[a][b][1], t[a][b][2], t[a][b][3] is faster than unpack(t[a][b]) (though I'd guess it is) but I do know that t[1], t[2], t[3] is significantly faster than unpack(t), so I would strongly advise using SDPhantom's alternative instead of using unpack:

Phanx: Sorry, what I meant to say was not that I don't understand the statement, but that I don't understand the reasoning behind it, which appears counter-intuitive to me.
If we take your first and second examples, what do you gain by moving the variable into the loop? It seems like you just define 4 variables instead of just 1, which means more CPU time spent on creating the variable and more work for the GC.

__________________SanityCheck - If you've ever said the words "Sorry, I forgot" then you need this add-on.

Remember, every time you post a comment on an add-on, a kitten gets its wings!

2. Upvalue any globals you're reading in frequently called functions, like OnUpdate, CLEU, UNIT_AURA, etc. Don't bother upvaluing globals you only read infrequently, such as in response to the user changing an option, the player leaving combat, etc.

Regarding the OnUpdate functions, I presume you mean globals you would call each time OnUpdate is run, right?
I mean, in the example bellow (from a non publshed little addon of mine), the function AlkaT_SpORCo_update() is called only once every 0.25 seconds, and it is in this function that I call two global functions. If I understood correctly, upvalueing these two functions would bring no benefit at all, right?