Frustrated by some of the confusing and contradictory information I have encountered on the Internet over the years, I put together a “cheat sheet” to help identify useful information. The “cheat sheet” lists several questions that one might consider when reviewing books and web articles prior to changing parameters (or implementing other changes) based on the information found in those sources:

Is a specific Oracle release mentioned in the book or article? What was true, or thought to be true, with release 8.0.5 might not be true or even a good idea with release 11.2.0.1.

Does the article have a publication date, and is there a revision history that identifies when the article was modified and what changed? Articles that change from one day to the next, with no indication of what changed or why, are difficult to use as justification for changes to the initialization parameters.

Are there any articles by other authors on the Internet that agree with the author’s suggestions or sharply disagree with the author’s suggestions? If Oracle’s official documentation strongly disagrees with the contents of the article, which of the two sources is correct? Should the advice be deemed an over-generalization that worked as a fix for a one-time problem and is now advertised as something all DBAs should do as a first step in performance tuning?

Is there reproducible evidence that supports the claims made? Or is the majority of the justification similar to “I have seen it a hundred times” or “a DBA at a Fortune 50 company said to do this” or “I have been doing this for 25 years, and you should too”?

Does the parameter actually control the behavior which it is purported to control, and are there any potential side effects from modifying the parameter?

A couple of months ago I scanned through Metalink looking for interesting articles. I found several that seemed to be well written, most with recent modification dates, that someone out there might enjoy reading when troubleshooting performance problems. Hopefully, the documents still exist on the Metalink replacement.

A couple of years ago I was under the impression that increasing the SORT_AREA_SIZE parameter could reduce the number of sorts to disk in Oracle 10g R2 with the PGA_AGGREGATE_TARGET parameter set, using dedicated server processes, and with the WORKAREA_SIZE_POLICY set to AUTO (yes, confirmed to be set in the session). Why? Because I saw it happen a couple of times when troubleshooting performance problems. But, the sorts to disk still happened on occasion when there seemed to be no reason why Oracle would not be able to perform an optimal, in-memory workarea execution.

Thankfully, someone questioned my earlier testing that produced results which were inconsistent with the documentation. I then attempted to produce a test case to prove myself wrong. I set out to trigger a query to perform a sort to disk, and then by changing only the SORT_AREA_SIZE parameter, have Oracle switch to perform an in-memory workarea execution for the query. Once the test case produced that result, I attempted to identify the trigger that caused Oracle to switch. The test case appears in the following Usenet thread: http://groups.google.com/group/comp.databases.oracle.server/msg/e0b8bea300e45ae4

Line wrapping is a problem in the above thread, so I have reproduced the test case below:

I was able to produce a test case on the base patch of Oracle 10.2.0.2 on Win 32. The following required roughly 10 hours of testing and analysis. I built a new database instance with the following pfile:

Note:
Max Area size is reported as 7,864,320 (7680 KB), while _smm_max_size is reported as 30720 KB. On Oracle 10.2.0.3, when optimizer_features_enable is set to 10.2.0.3, both will be reported with the same values.
5% of 150MB = 7.5MB = 7864320 bytes
150MB/5 = 30MB = 31457280 bytes = 30720 KB
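As a side note, _smm_max_size is a hidden parameter, so it does not appear in V$PARAMETER. The commonly circulated X$ query below is a sketch of how the value may be checked — it must be run as SYS, and the parameter names and availability vary by Oracle release:

```sql
-- Sketch: examine hidden PGA-related parameters (run as SYS).
-- _smm_max_size is reported in KB.
SELECT
  I.KSPPINM NAME,
  CV.KSPPSTVL VALUE
FROM
  X$KSPPI I,
  X$KSPPCV CV
WHERE
  I.INDX=CV.INDX
  AND I.KSPPINM IN ('_smm_max_size','_pga_max_size');
```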

From my notes recorded during the second read through of “Cost-Based Oracle Fundamentals”:

sort width = same as the “max intermediate sort width” from a 10032 trace file
area size = amount of memory available for processing data – the number reported will be smaller than the SORT_AREA_SIZE due to overhead
max area size = maximum memory available for sorting
degree = degree of parallelism for the query
blocks to sort = row_size*rows/db_block_size
row size = estimate of the average row size in bytes
rows = computed (filtered) cardinality of the table
initial runs = optimizer’s estimate of the number of sort runs that will be dumped to disk
merge passes = always at least one, even for an in-memory sort; counts the number of times the entire data set will be written to and read from disk in the event of a disk sort
IO cost/pass = cost of doing a single merge pass
total IO sort cost = combines the cost per pass with the number of passes
total CPU cost = CPU component of the cost, measured in CPU operations
total temp space used = estimated amount of temporary space needed for the sort operation

From the Oracle documentation:

TEMP_SPACE: Temporary space usage of the operation (sort or hash-join) as estimated by the optimizer’s cost-based approach. For statements that use the rule-based approach, this column is null.
ESTIMATED_OPTIMAL_SIZE: Estimated size (in KB) required by this work area to execute the operation completely in memory (optimal execution). This is either derived from optimizer statistics or from previous executions.
ESTIMATED_ONEPASS_SIZE: Estimated size (in KB) required by this work area to execute the operation in a single pass. This is either derived from optimizer statistics or from previous executions.
LAST_MEMORY_USED: Memory size (in KB) used by this work area during the last execution of the cursor.
LAST_EXECUTION: Indicates whether this work area ran using OPTIMAL, ONE PASS, or under ONE PASS memory requirement (MULTI-PASS) during the last execution of the cursor.
LAST_TEMPSEG_SIZE: Temporary segment size (in bytes) created in the last instantiation of this work area. This column is null if the last instantiation of this work area did not spill to disk.
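Those columns belong to the V$SQL_WORKAREA view; a sketch of the kind of query used to check them follows (the &SQL_ID substitution variable is a placeholder for the SQL statement of interest):

```sql
-- Sketch: check how the workarea(s) for a given statement last
-- executed (OPTIMAL, ONE PASS, or MULTI-PASS).
SELECT
  OPERATION_TYPE,
  ESTIMATED_OPTIMAL_SIZE,
  ESTIMATED_ONEPASS_SIZE,
  LAST_MEMORY_USED,
  LAST_EXECUTION,
  LAST_TEMPSEG_SIZE
FROM
  V$SQL_WORKAREA
WHERE
  SQL_ID='&SQL_ID';
```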

The SQL statement required a 1 pass sort to disk.

Let’s try again, this time with a 200MB PGA_AGGREGATE_TARGET rather than a 150MB:

The Used-Tmp column did not print, so no sort to disk was required. So, increasing the PGA_AGGREGATE_TARGET from 150MB to 200MB apparently eliminated the sort to disk, just as decreasing the PGA_AGGREGATE_TARGET from 300MB to 200MB did not require a sort to disk.

Note that LAST_TEMPSEG_SIZE is null in the above, and that the last execution was OPTIMAL. So, as I originally suggested, changing the SORT_AREA_SIZE for the session helped by removing the sort to disk. Or not – I suspect the second time the query needed to be parsed, it would not have required a sort to disk.

—-

Now, let’s try a similar test on Oracle 10.2.0.3 with a pre-existing database that was started with the following parameters:

Oracle 10.2.0.3 with optimizer_features_enable set to 10.2.0.3 reports Max Area size at 62914560 bytes, which is identical to _smm_max_size, which is 1/5 of the pga_aggregate_target. In the above, Oracle 10.2.0.2 reported Max Area size at roughly 5% of the pga_aggregate_target.

Let’s repeat, this time requesting the additional statistics from DBMS XPLAN:

I performed a similar test on Oracle 10.2.0.2 with the July 2006 CPU on Win x64, and the results were similar to those of Oracle 10.2.0.3.

In summary, as the “Oracle Database Performance Tuning Guide 10g Release 2” Pg 7-38 (PDF page 146) documentation states, “sizing of work areas for all sessions becomes automatic and the *_AREA_SIZE parameters are ignored by all sessions running in that mode.” There is apparently an odd quirk that once in a while, the first time a SQL statement is parsed, a sort to disk may be required, at least under the base patch of Oracle 10.2.0.2. This led me, incorrectly, to believe that setting the SORT_AREA_SIZE to a larger value and re-executing the query actually removed the sort to disk – but it was actually the second parse that resulted in the removal of the sort to disk. This test case disproves my suggestion that the SORT_AREA_SIZE has any impact on Oracle 10.2.0.2 when all sessions are set to auto for the WORKAREA_SIZE_POLICY. It is possible to modify the WORKAREA_SIZE_POLICY at the session level, and then the SORT_AREA_SIZE setting takes effect for that session.
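For completeness, switching a single session to manual workarea sizing looks roughly like the following sketch (the 100MB value is an arbitrary example, not a recommendation):

```sql
-- Sketch: only with a MANUAL workarea policy does SORT_AREA_SIZE take
-- effect for the session (values in bytes; 100MB is arbitrary).
ALTER SESSION SET WORKAREA_SIZE_POLICY=MANUAL;
ALTER SESSION SET SORT_AREA_SIZE=104857600;
ALTER SESSION SET SORT_AREA_RETAINED_SIZE=104857600;
```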

This example is adapted from an example I created for a presentation a couple months ago. While the example as displayed is specific to displaying ERP data, the example may be easily adapted for use in monitoring Oracle database performance.

—

Vista and Windows 7 offer a set of built-in gadgets that perform a variety of tasks, such as displaying calendars, clocks, resource meters, stock quotes, and so on. It might be nice to show a quick overview of various statistics in an ERP system (or in the database itself) so that a determination may be made whether or not a potential problem exists. Such a gadget would need to be able to automatically update its statistics. This example is complicated, as it combines a large number of technologies, each with its own unique syntax that looks vaguely like English. This example uses VBScript, Oracle database access, cascading style sheets, XML, HTML, DHTML, and coding that is specific to Vista/Windows 7 gadgets.

When the gadget is opened, it determines the statistics that should be displayed for the user, and the order in which the statistics should be displayed. The selected statistics are then retrieved from the database and displayed on the gadget surface in a DIV tag. Every ten minutes the gadget reconnects to the database and displays updated statistics. As the mouse pointer is moved over the gadget, the gadget’s background lights up slightly, and as the mouse pointer passes over the statistics, those also light up (more specifically, the background image for the statistic is changed). When one of the statistics is clicked, a fly-out child window appears on the screen that shows the detailed statistics behind the one line numeric statistic.

The gadget code files should be placed into the C:\Program Files\Windows Sidebar\Gadgets\KMInspector.gadget folder on the computer (note that KMInspector may be specified with a different name). The gadget.xml file in that folder describes the gadget to Vista/Windows 7 and tells Windows where to find the main gadget HTML file, the name of the gadget, and the name of the various icons – this file MUST be saved in a UTF-8 character set (this is an option when saving with Notepad). The KMInspector.html file is the main gadget page, and the various pages with names beginning with FlyOut are the pages containing the detail information. The security requirement that all Windows gadgets must be signed needs to be disabled.

Below are a couple of screen shots of what we are trying to achieve (a standard Windows sidebar calendar gadget appears above, with the custom developed gadget below):

The top left screen shot shows the custom gadget directly below a built-in Windows calendar gadget. The top right screen shot shows what happens when the mouse pointer passes over the gadget (note that the background becomes lighter in color, and the button under the mouse pointer changes color). The bottom picture shows what happens when one of the buttons in the gadget is clicked.

Inspector.png is the picture that is displayed when the gadget is dragged from the list of available Windows gadgets to the sidebar. InspectorIcon.png is the picture of the gadget that is displayed in the list of available Windows gadgets. KMInspector.html is the web page that contains the code for the gadget.

The lead-in to this slide is that a report in an ERP package was running far slower than expected following an upgrade of the ERP package. Of course the SQL statements are hard coded into the ERP package, so there is not much that can be done, right? I created a 10046 extended SQL trace file at level 12, and then passed the trace file through my Toy Project for Performance Tuning (https://hoopercharles.wordpress.com/2009/12/13/toy-project-for-performance-tuning-2/). One of the outputs of my program’s 10046 trace file parser provides an overview of the trace file, as well as an overview of each SQL statement. A screen shot of that output follows:

The screen shot shows that there were 10,399 execute calls in the trace file that consumed 47.67 seconds of the server’s CPU time, and that the elapsed time/wall clock time from the server’s perspective for executing the SQL statements is 50.32 seconds for the executions. There were 16,146 fetch calls that consumed 263.48 seconds of the server’s CPU time, and the elapsed time/wall clock time from the server’s perspective for fetching the rows for the SQL statements is 263.64 seconds. Note that there were 0 physical reads and 8,804,970 consistent gets during the fetch (what’s that buffer cache hit ratio?). The “SQL*Net message from client” wait totaled 107.33 seconds, but 42.62 seconds were in a single block of time (likely at the end of the trace file, just before tracing was disabled), so the actual total time waiting for the next request from the client is about 64.7 seconds in 22,825 round-trips.
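For reference, a level 12 (waits plus binds) 10046 trace like the one analyzed above may be enabled roughly as sketched below — the SID and SERIAL# values passed to DBMS_MONITOR are placeholders for the ERP session being traced:

```sql
-- Sketch: enable a level 12 10046 trace in the current session:
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';
-- ... execute the report's SQL statements here, then:
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

-- Or trace another (ERP) session, identified by SID and SERIAL#
-- (123 and 456 are placeholders):
EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(123, 456, TRUE, TRUE)
EXEC DBMS_MONITOR.SESSION_TRACE_DISABLE(123, 456)
```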

If we scroll down a bit, we might find a couple of the greatest contributors to the server-side processing:

The screen shot shows that there are two groups of SQL statements that combined contributed to 86% of the total server-side processing time for the report. The first SQL statement group is identified as Cursor 16 Ver 1, and the second is identified as Cursor 17 Ver 1.

We could then search the remainder of this file to locate those identifiers.

The above screen shot shows that there are two identical SQL statements in the first group – the first SQL statement (Cursor 16 Ver 1) was parsed, but never executed. The row source execution plan from the 10046 trace file (in the STAT lines) shows that the optimizer decided to use the index X_RECEIVABLE_3 that is on the column ENTITY_ID (there are two distinct values for this column), rather than the much more selective index on the INVOICE_ID column. Notice the number of consistent gets and the CPU utilization for this one SQL statement that was executed 934 times.

Why was the index selected? Would a DBMS_XPLAN help?
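The actual execution plan, with estimated versus actual row counts, may be pulled for the most recently executed statement in the session along these lines (STATISTICS_LEVEL must be set to ALL, or the statement hinted with GATHER_PLAN_STATISTICS, for the actual-rows columns to populate):

```sql
-- Sketch: display the last execution plan with estimated vs. actual rows.
ALTER SESSION SET STATISTICS_LEVEL='ALL';
-- ... execute the SQL statement of interest here, then:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));
```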

The DBMS_XPLAN output shows that the optimizer estimated that the X_RECEIVABLE_3 index would return a single row, when in fact it returned 66,124 rows. A problem where statistics were not collected in the last 15 years? No.

Maybe a 10053 trace will help:
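A 10053 trace may be enabled as sketched below; note that the optimizer only writes to this trace file during a hard parse, so the statement may need to be changed slightly (an extra space is enough) to force a reparse:

```sql
-- Sketch: capture the optimizer's cost calculations during a hard parse.
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';
-- ... hard parse the SQL statement of interest here, then:
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';
```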

In the above we see that the predicted number of rows returned with the X_RECEIVABLE_3 index will be less than that for the other indexes, and the expected CPU resources will also be slightly less, but something is not right. The optimizer elected not to use the primary key index on the RECEIVABLE table. Note that the calculated cost for all three indexes is 2.

Let’s try an experiment with a NO_INDEX hint to prevent the optimizer from using the X_RECEIVABLE_3 index:
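The hinted test might look roughly like the following sketch — the select list and bind variable are stand-ins for the ERP report’s actual SQL statement:

```sql
-- Sketch: block just the problem index, letting the optimizer choose
-- among the remaining indexes (select list and predicate are stand-ins).
SELECT /*+ NO_INDEX(R X_RECEIVABLE_3) */
  R.*
FROM
  RECEIVABLE R
WHERE
  R.INVOICE_ID = :1;
```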

In the above, notice that the primary key index (SYS_C006885) was selected by the optimizer, that the execution time dropped from 0.31 seconds to 0.01 seconds (or less), and the number of consistent gets dropped from 1311 to 5. A rather nice improvement, but how do we force the ERP package to not use the index without the ability to modify the program’s source code?

We could do something like this, if we assume that bind variable peeking is the source of the problem:
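A hedged sketch of such a logon trigger follows — the trigger name is invented, _optim_peek_user_binds is a hidden parameter (check with Oracle Support before changing it in production), and the PROGRAM matching should be adjusted to the actual ERP executable names:

```sql
-- Sketch: disable bind variable peeking only for sessions created by
-- programs whose names begin with VM or VF (trigger name is invented).
CREATE OR REPLACE TRIGGER DISABLE_BIND_PEEK_TRIG
AFTER LOGON ON DATABASE
DECLARE
  V_PROGRAM VARCHAR2(64);
BEGIN
  SELECT
    UPPER(PROGRAM)
  INTO
    V_PROGRAM
  FROM
    V$SESSION
  WHERE
    SID=(SELECT SID FROM V$MYSTAT WHERE ROWNUM=1);

  IF V_PROGRAM LIKE 'VM%' OR V_PROGRAM LIKE 'VF%' THEN
    EXECUTE IMMEDIATE 'ALTER SESSION SET "_optim_peek_user_binds"=FALSE';
  END IF;
END;
/
```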

The above logon trigger results in the following row source execution plans:

But, the above approach may be too broad as it will affect every SQL statement executed by a program that begins with the letters VM or VF. It did help this report:

The time to fetch all rows dropped from 264 seconds to just 11 seconds, and the full report displays in just over a minute.

Let’s see if we are able to trick the optimizer into not using the X_RECEIVABLE_3 index without disabling bind variable peeking. First, we need to enable private outlines in the current session:

ALTER SESSION SET USE_PRIVATE_OUTLINES=TRUE;

For demonstration purposes, I will explicitly tell the optimizer to use the X_RECEIVABLE_3 index when creating the outline (the exact SQL statement used by the program should be used, without the hint):
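The outline creation might look like this sketch — the outline name is invented, and the SELECT is a stand-in for the program’s actual statement:

```sql
-- Sketch: capture the bad plan in a private outline so that its hints
-- can later be edited (outline name and SELECT are stand-ins).
CREATE OR REPLACE PRIVATE OUTLINE RECEIVABLE_OUTLINE ON
SELECT /*+ INDEX(R X_RECEIVABLE_3) */
  R.*
FROM
  RECEIVABLE R
WHERE
  R.INVOICE_ID = :1;
```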

Note that I had to specify a hint in this case as I was at the time testing in Oracle 11g R1 (11.1.0.6), which refused to use the X_RECEIVABLE_3 index on its own to reproduce the problem seen with Oracle 10.2.0.x.

Let’s view the hints generated by the optimizer when it created the outline:
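Private outline hints land in the OL$HINTS working table, so a query along these lines shows them (note that if the rows in OL$HINTS are edited directly, DBMS_OUTLN_EDIT.REFRESH_PRIVATE_OUTLINE should be called before the changed outline takes effect):

```sql
-- Sketch: list the hints stored for the private outline(s).
SELECT
  OL_NAME,
  HINT#,
  HINT_TEXT
FROM
  OL$HINTS
ORDER BY
  OL_NAME,
  HINT#;
```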

Note that even though we hinted/forced Oracle to use the X_RECEIVABLE_3 index, it elected to use the primary key index SYS_C0011925 due to the hacked private outline. Oracle IGNORED MY HINT (actually, it did exactly as the outline instructed).

Now, let’s make the outline a bit more permanent, converting it from a private outline to a public outline:
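Publishing an edited private outline is a single statement — the outline names below are stand-ins:

```sql
-- Sketch: publish the edited private outline as a public stored outline.
CREATE OR REPLACE OUTLINE RECEIVABLE_PUBLIC_OUTLINE
  FROM PRIVATE RECEIVABLE_OUTLINE;
```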

But there is a catch. The USE_STORED_OUTLINES parameter cannot be set in the init.ora or spfile, so we need a STARTUP trigger to set the parameter:

CREATE OR REPLACE TRIGGER ENABLE_OUTLINES_TRIG AFTER STARTUP ON DATABASE
BEGIN
EXECUTE IMMEDIATE('ALTER SYSTEM SET USE_STORED_OUTLINES=TRUE');
END;
/

After implementing the stored outlines there is a new top SQL statement in the 10046 trace file profile. But, at some point we have to stop (before the effects of compulsive tuning disorder are achieved).

My original idea for hacking the stored outlines came from the book “Troubleshooting Oracle Performance”.

Note that hacking stored outlines should not be the first step. Instead, see if it is possible to modify optimizer parameters at the session level first to achieve the desired execution plan. Once the desired execution plan is achieved, create the stored outline to freeze the execution plan.

I need help with a query which involves the 2 tables defined below. What I need to do is choose the record with the max “Eff Date” from “Table A” for a particular “Emp No.” and update the “Desc” from that record in the field “Desc” of “Table B” for the same “Emp No.”. I am able to choose the max “Eff Date” record for each employee from Table A but somehow am not able to update the same “Desc” in “Table B”.

Request you to please help with the query. Any help would be appreciated.
Thanks!

SET PAGESIZE 1000
SET LINESIZE 150
ALTER SESSION SET STATISTICS_LEVEL='ALL';
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
SPOOL C:\CHECKTIMING.TXT
UPDATE
  T2
SET
  DESCR=(
    SELECT
      DESCR
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1) T1
    WHERE
      RN=1
      AND T1.EMP_NO=T2.EMP_NO
      AND T1.GROUPING=T2.GROUPING)
WHERE
  (T2.EMP_NO,T2.GROUPING) IN (
    SELECT
      EMP_NO,
      GROUPING
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1)
    WHERE
      RN=1);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));
SELECT
*
FROM
T2
WHERE
LENGTH(DESCR)=1
ORDER BY
EMP_NO;
ROLLBACK;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
merge into t2 t2
using (
select emp_no,grouping,
max(descr) keep(dense_rank last order by eff_date) descr
from t1 group by emp_no,grouping) t1
on (t1.emp_no=t2.emp_no
and t1.grouping=t2.grouping)
when matched then update set t2.descr=t1.descr
;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));
SELECT
*
FROM
T2
WHERE
LENGTH(DESCR)=1
ORDER BY
EMP_NO;
ROLLBACK;
SPOOL OFF

This post is adapted from a small part of a presentation I gave a couple months ago. The original code sample integrated into an ERP system to display an employee’s running attendance record for the last 6 or 12 months in graphical form, output to an Internet Explorer window.
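A rough sketch of the sample-data setup follows — the table name, column names, employee names, and indicator codes are all invented for illustration; only the row count, the five-employee/999-day/eight-indicator randomization is taken from the description:

```sql
-- Sketch: 1,000 rows, one of five random employees per row, a random
-- date within the last 999 days, and one of eight random indicators.
CREATE TABLE EMP_ATTENDANCE AS
SELECT
  DECODE(TRUNC(DBMS_RANDOM.VALUE(0,5)),
    0,'MIKE',1,'ERIC',2,'ROB',3,'SAM',4,'CARY') EMPLOYEE_ID,
  TRUNC(SYSDATE-DBMS_RANDOM.VALUE(0,1000)) ATTEND_DATE,
  DECODE(TRUNC(DBMS_RANDOM.VALUE(0,8)),
    0,'VAC',1,'HOL',2,'BER',3,'JUR',4,'ABS',5,'EXC',6,'PER',7,'OTH') INDICATOR
FROM
  DUAL
CONNECT BY
  LEVEL<=1000;
```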

The above created a table with 1,000 rows that picked one of five employees at random for each row, specifying a random date between today and 999 days ago, with one of eight random identifiers for the date. The data in the table will look something like this:

Now that we have sample data, let’s see what we are trying to achieve (reduced in size):

As the color-coded output shows, Eric took a vacation day on January 4 and March 8. Eric also was on bereavement leave on February 23 and 24, as well as March 22 and 25. So, how was this output created? A VBS script connected to the Oracle database (using a custom DLL to hide the username and password, and to simplify the process of submitting the SQL statement with bind variables), submitted a query, and then built the web page on the fly.

While the above uses a custom DLL for the database connection, a standard ADO connection will work just as well. The script also uses a custom program that I created called BringToTop that simply addresses the “pop under” behavior on Vista and Windows 7.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:

<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and less than signs in code sections are often interpreted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: