Wednesday, April 24, 2013

Performance tuning. Spending time is NOT OK (if you do not know exactly why)

Yet another performance tuning story, similar to one that happened about a month ago. Sad thing - something was was coded years ago we never questioned the time spent... Just to keep it short:

once upon a time there was a very time-consuming module.

eventually we were forced to take much closer look - WHY is it so time-consuming

we found in our own view a function call that was completely unnecessary.

this function was very light that initially we ignored it altogether, but later we realized that we've been calling it 80000 times - and in that case even the lightest cost adds up.

removing this function (all needed data was already available via joined tables) took the cost of a class from 40 seconds down to 3.

Profit! :-)

We knew that the module is extremely complicated and that a lot of database operations was involved - so we assumed that it is absolutely fine! Of course, we've hit a case when assumptions are the worst enemies of a real knowledge... Especially if we are talking about performance tuning - because unless we've proven where exactly we are losing N seconds in the module X, we cannot say that module X is cannot take less than N seconds. Because those N seconds could be related to the different module/outdated coding technique/structural change etc. And only knowing what exactly is going on we can make a decision whether it is OK or not.

Summary: A couple of lessons learned:

If something does not perform well in the existing system - check it (unless it is something known and documented)! You may think that that time loss is inevitable, but who knows?

Function calls in SELECT statements could cause a lot of issues even if each call is light - you must think not only about the cost of a function call, but about how many times this function will be fired

By the way, at the recent IOUG Collab'13 in the presentation "Top 5 Issues that Cannot be Resolved by DBAs (other than
missed bind variables)" I covered the second topic (number of calls) a bit deeper. Below is a snippet from my white-paper:

Number of calls in SELECT clause

There are multiple ways of ensuring that if a function is
referenced in the SELECT clause, it is not fired more often than needed.
Unfortunately, few developers are even aware of this problem. My recommendation
is to include the following set of examples in any PL/SQL class. This explicitly
illustrates the difference between real understanding and guessing.

First, set up a basic environment to count total number of
calls: a package variable to store the counter, a simple function, and a checker
to display/reset the counter:

create package misha_pkg is

v_nr number:=0;

end;

create or replace function
f_change_tx (i_tx varchar2)

return varchar2 is

begin

misha_pkg.v_nr:=misha_pkg.v_nr+1;

return lower(i_tx);

end;

Create or replace procedure
p_check is
begin

dbms_output.put_line('Fired:'||misha_pkg.v_nr);

misha_pkg.v_nr:=0;

end;

Second, run a very simple query against table EMP, where the function above will be applied
against EMP.JOB. And let us keep in mind that there are 14 total rows in the
table EMP:

SQL> select empno, ename,
f_change_tx(job) job_change_tx

2 from emp;

...

14 rows selected.
SQL> exec p_check

Fired:14

PL/SQL procedure successfully
completed.

If you just use the function, it will be fired for every row.
But we know that there are only 5 distinct JOB values, so we should try to
decrease the number of calls. In Oracle 11gR2, there is a very interesting
internal operation called “scalar sub-query caching” being used while
processing SQL queries. It allows Oracle to internally reuse previously calculated
results on SELECT statements if they are called multiple times in the same
query. The following example tests to see if using this operation helps:

SQL> select empno, ename, (select f_change_tx(job) from dual)

2 from emp;

...

14 rows selected.

SQL> exec p_check Fired:5

PL/SQL procedure successfully
completed.

SQL>

The result shows that it did help. Now, only five distinct
calls are registered, which isexactly as needed. Although, since we are
discussing cache, why not use it explicitly? There is another very powerful
feature called “PL/SQL function result cache.” The following example enables it
on the function while the same query is run two times:

The result is impressive! If the first call matches the
sub-query caching, the second call is a fantastic example of great performance
tuning – everything works as needed, but nothing is being done (actually, this
is not 100% true, since the cache should be retrieved anyway, but for practical
purposes it is a very simple PK lookup).