In a recent OTN thread a person asked a couple of good questions about why histograms and the use of bind variables sometimes cause problems. The questions did not ask whether one should use histograms on table columns that often appear in WHERE clauses compared to bind variables. Instead, the original poster (OP) asked the all-important question WHY. In a previous article I provided my response to an OTN thread where the OP of that thread wanted to use histograms to fix bind peeking problems.

The specific questions asked in the recent OTN thread include:

When a SQL statement is using bind variables, how do histograms affect the execution plan?

Why can’t histograms work well with bind variables?

I remember a document that mentioned “do not use histograms when using bind variables”. But why?

These questions have been answered many times in articles written by a number of authors, for example:

Rather than point the OP to one of the above articles, I decided instead to create a test case to demonstrate what could happen on Oracle Database 10.2.0.4 (simulated) and 11.2.0.2 when columns that are compared to bind variables in the WHERE clause also have histograms. Below is my response, slightly reworded:

—-

Histograms can work with bind variables, but the end result is typically not the desired outcome. Bind variables are used to reduce the number of different execution plans. Histograms are used to help the optimizer find what is supposed to be the best execution plan for the supplied predicates – and in the case of bind variables, those predicates use the peeked values of the bind variables. So, if you have a histogram on a column, and for the initial hard parse of the SQL statement the most common value in that column is submitted in the bind variable, the generated execution plan is considered by the optimizer to be the “best” execution plan for those peeked bind variable values. Now assume that instead the least popular value in the column is specified – the optimizer could produce a very different execution plan for the same SQL statement, one that is optimized for the least popular value (this might be an index range scan rather than a full table scan). Assume also that the execution plan cannot change when the bind variable values change during future executions. If the table column contains a single popular value and many unpopular values, and the initial hard parse is performed with the single popular value, you could find that all future executions of that SQL statement perform full table scans, even when only a couple of rows from the table are selected.

The above created a table with 1,000,000 rows where 99% of the rows have a value of 1 in column C2 and 1% have a value of 99, and the rows are inserted with a perfect clustering factor due to the ORDER BY clause. A histogram was created on the indexed column.

Let’s try a test: we will pick the unpopular value 2 for the bind variable when the query is initially hard parsed:

So, no rows were selected, yet the optimizer predicted that 5,957 rows would be returned, and an index access path was selected for data retrieval. Would this index access path also be appropriate for a bind variable value of 1? Let’s continue the test, this time picking the value 99 for the bind variable:

Once again, the execution plan shows that the optimizer predicted 5,957 rows would be retrieved even though 10,000 rows were actually retrieved. Notice also that the child number is still shown as 0, indicating that a hard parse was not performed. Let’s continue the test, this time with a bind variable value of 1:

That cannot be the execution plan that was used because it still shows that 10,000 rows were retrieved during the last execution, while the AUTOTRACE statistics showed that 990,000 rows were actually retrieved. Let’s try again, this time retrieving the execution plan for CHILD_NUMBER 1:

The above shows the actual execution plan that was used (see the article Explain Plan Lies, Autotrace Lies, TKPROF Lies, What is the Plan? to see why we cannot use AUTOTRACE or EXPLAIN PLAN to see the actual execution plan). Adaptive cursor sharing (first available with Oracle Database 11.1) stepped in and forced the re-evaluation of the execution plan to prevent a very slow retrieval through the index – that re-evaluation will not happen prior to Oracle Database 11.1 (CURSOR_SHARING=’SIMILAR’ might have the same effect in older Oracle Database releases when literal values are used in the SQL statement).

The above is the execution plan for CHILD_NUMBER 2 – notice that this time it is reporting 990,000 rows retrieved, so this IS the execution plan that was used for the bind variable value that exists in 99% of the table rows. Adaptive cursor sharing did not take effect and force the re-evaluation of the execution plan – the execution plan was NOT changed to a full table scan. That is the risk that you take when histograms exist on columns with unequal value distributions, bind variables are used in the WHERE clause that references the column, and bind variable peeking is enabled (it is enabled by default in Oracle Database 9i and above; bind variable peeking is controlled by the hidden parameter _OPTIM_PEEK_USER_BINDS, which defaults to TRUE).
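To make the mechanism described above concrete, here is a toy sketch in Python of how a peeked bind value can lock in an inappropriate plan. The row counts, the 5% selectivity threshold, and the plan-choice rule are illustrative only – this is not Oracle’s actual costing model:

```python
ROWS = 1_000_000
# Toy histogram: value 1 appears in 99% of rows, value 99 in 1% of rows.
SELECTIVITY = {1: 0.99, 99: 0.01}

def choose_plan(peeked_value):
    """Pick an index scan for selective predicates, a full scan otherwise
    (illustrative rule, not Oracle's costing)."""
    est_rows = ROWS * SELECTIVITY.get(peeked_value, 1 / ROWS)
    return "INDEX RANGE SCAN" if est_rows < ROWS * 0.05 else "FULL TABLE SCAN"

class SharedCursor:
    """Caches the plan chosen on the first (hard) parse, like a shared SQL
    cursor without adaptive cursor sharing."""
    def __init__(self):
        self.plan = None

    def execute(self, bind_value):
        if self.plan is None:          # hard parse: peek at the bind value
            self.plan = choose_plan(bind_value)
        return self.plan               # later executions reuse the cached plan
```

With this toy cursor, a first execution with the rare value 99 caches an INDEX RANGE SCAN, and a later execution with the popular value 1 silently reuses that plan, even though choose_plan(1) on its own would have picked a full table scan.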

It happened again: another blog article that forced me to stop, think, and … hey, why did Oracle Database 11.2.0.2 do something different than Oracle Database 10.2.0.5? What is different, even when the OPTIMIZER_FEATURES_ENABLE parameter is set to 10.2.0.4 (or 10.2.0.5)? The number of consistent gets for a SQL statement is significantly different – we have seen a similar difference between release versions before, but for a different reason. We need the help of Oracle Database trace event 10200 to determine why there is a difference. Once we have the trace file, we need an easy way to process the trace file.

Excel macro that will work with a trace file produced by Oracle Database running on Windows (also works in Microsoft Visual Basic 6.0 and earlier; for an Oracle Database running on Unix/Linux, open the trace file with WordPad first, and then save the trace file using WordPad). Replace c:\or10s_ora_4256_watch_consistent.trc with the actual name of the generated trace file – the script as written seems to work with 10.2.0.x and 11.2.0.x:

Private Sub Read_10200_Trace1()
    Dim intFileNum As Integer       '10200 trace file
    Dim intFileNum2 As Integer      'Output file
    Dim strInput As String          'Line read from the 10200 trace file
    Dim strOutput As String         'Line to be written to the output file
    Dim strBlock(2000) As String    'Block read from the trace file
    Dim strBlockCounter(2000) As Integer 'Number of times read
    Dim intBlocks As Integer        'Total number of blocks
    Dim i As Integer                'Loop counter
    Dim intFound As Integer         'Indicates whether or not the block was found

    intFileNum = FreeFile
    Open "c:\or10s_ora_4256_watch_consistent.trc" For Input As #intFileNum
    intFileNum2 = FreeFile
    Open "c:\watch_consistent.txt" For Output As #intFileNum2

    Do While Not EOF(intFileNum)
        Line Input #intFileNum, strInput
        If InStr(strInput, "started for block") > 0 Then
            strOutput = Trim(Right(strInput, Len(strInput) - InStr(InStr(strInput, "started for block"), strInput, ":")))

            'Find the number of times the block was accessed
            intFound = 0
            For i = 1 To intBlocks
                If strOutput = strBlock(i) Then
                    intFound = i
                    strBlockCounter(i) = strBlockCounter(i) + 1
                    Exit For
                End If
            Next i

            'If the block was not found, record it
            If intFound = 0 Then
                intBlocks = intBlocks + 1
                intFound = intBlocks
                strBlockCounter(intBlocks) = 1
                strBlock(intBlocks) = strOutput
            End If

            Print #intFileNum2, strOutput; vbTab; strBlockCounter(intFound)
        End If
    Loop

    Print #intFileNum2, ""
    For i = 1 To intBlocks
        Print #intFileNum2, strBlock(i); vbTab; strBlockCounter(i)
    Next i

    Close #intFileNum
    Close #intFileNum2
End Sub

Excel macro equivalent that will work with a trace file produced by Oracle Database running on Windows/Unix/Linux (also works in Microsoft Visual Basic 6.0 and earlier). Replace c:\or10s_ora_4256_watch_consistent.trc with the actual name of the generated trace file:

Private Sub Read_10200_Trace2()
    Dim strInput As String          'Line read from the 10200 trace file
    Dim strOutput As String         'Line to be written to the output file
    Dim strBlock(2000) As String    'Block read from the trace file
    Dim strBlockCounter(2000) As Integer 'Number of times read
    Dim intBlocks As Integer        'Total number of blocks
    Dim i As Integer                'Loop counter
    Dim intFound As Integer         'Indicates whether or not the block was found
    Dim objFSO As Object            'FileSystemObject
    Dim objFile1 As Object          'The FileSystemObject handle to the raw 10200 trace file
    Dim objFile2 As Object          'The FileSystemObject handle to the output file
    Const ForReading = 1
    Const ForWriting = 2

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile1 = objFSO.OpenTextFile("c:\or10s_ora_4256_watch_consistent.trc", ForReading)
    Set objFile2 = objFSO.CreateTextFile("c:\watch_consistent.txt", True)

    Do While Not (objFile1.AtEndOfStream)
        strInput = objFile1.ReadLine
        If InStr(strInput, "started for block") > 0 Then
            strOutput = Trim(Right(strInput, Len(strInput) - InStr(InStr(strInput, "started for block"), strInput, ":")))

            'Find the number of times the block was accessed
            intFound = 0
            For i = 1 To intBlocks
                If strOutput = strBlock(i) Then
                    intFound = i
                    strBlockCounter(i) = strBlockCounter(i) + 1
                    Exit For
                End If
            Next i

            'If the block was not found, record it
            If intFound = 0 Then
                intBlocks = intBlocks + 1
                intFound = intBlocks
                strBlockCounter(intBlocks) = 1
                strBlock(intBlocks) = strOutput
            End If

            objFile2.Write strOutput & vbTab & strBlockCounter(intFound) & vbCrLf
        End If
    Loop

    objFile2.Write "" & vbCrLf
    For i = 1 To intBlocks
        objFile2.Write strBlock(i) & vbTab & strBlockCounter(i) & vbCrLf
    Next i

    objFile1.Close
    objFile2.Close
End Sub
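
For readers who would rather avoid the VB family altogether, the same counting logic can be sketched in a few lines of Python. This is a rough port of the macros above; the trace-line format assumed in the test data is illustrative only, since the exact 10200 output varies by Oracle release and platform:

```python
from collections import OrderedDict

def count_consistent_gets(lines):
    """Rough Python port of the VBA macros above: for each trace line that
    contains 'started for block', keep everything after the first ':' that
    follows the phrase, and count how many times each block appears."""
    counts = OrderedDict()  # preserves first-seen order, like the VBA arrays
    for line in lines:
        marker = line.find("started for block")
        if marker < 0:
            continue
        colon = line.find(":", marker)
        if colon < 0:
            continue
        block = line[colon + 1:].strip()
        counts[block] = counts.get(block, 0) + 1
    return counts

def write_summary(trace_path, out_path):
    """Read the 10200 trace file and write a block-access summary,
    mirroring the watch_consistent.txt output of the macros."""
    with open(trace_path) as trace:
        counts = count_consistent_gets(trace)
    with open(out_path, "w") as out:
        for block, count in counts.items():
            out.write(f"{block}\t{count}\n")
```

For example, write_summary(r"c:\or10s_ora_4256_watch_consistent.trc", r"c:\watch_consistent.txt") would produce a tab-separated summary similar to the final loop of the macros.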

VBS script equivalent that will work with a trace file produced by Oracle Database running on Windows/Unix/Linux (also works in Excel and Microsoft Visual Basic 6.0 and earlier). Replace c:\or10s_ora_4256_watch_consistent.trc with the actual name of the generated trace file:

In the above, there were 2 consistent gets for the PAR_I1 index, 4 consistent gets for the T1 table, 202 consistent gets for the CHI_I1 index, and 200 consistent gets for the T2 table. While it might not be obvious from the above, the BLEVEL for both indexes is 1 (HEIGHT = 2 – see the quiz article linked to at the start of this article for an explanation). When I first saw the quiz that is linked to at the start of this article, I mentally assumed that there would be about 400 consistent gets for the CHI_I1 index – for every Start of the INDEX UNIQUE SCAN operation, I expected the index root block and the index leaf block to count as a consistent get, while the above showed that did not happen. Let’s trace the consistent gets to see why there were only 202 consistent gets and not roughly 400:

If we then process the resulting 10200 trace file through one of the above trace file parsers, we might see output like what is listed below (the RDBA in hex is listed first, followed by the number of times that block had been accessed by a consistent get to that point in the trace file):

At the bottom of the output is a summary that shows (in order) RDBA 0206e214 was accessed a total of 1 time, RDBA 0206e215 was accessed 1 time, RDBA 01c0000c was accessed 1 time, RDBA 01c72e14 was accessed 2 times, RDBA 01c72e15 was accessed 200 times, etc. Nice, but what do those RDBA numbers represent? We will get to that later.

Inside the raw 10200 trace file we might see something like this (I am able to identify some items that appear in the raw trace file, but I do not yet fully understand the file):

Now that we see the RDBA numbers again, I suppose that it is time to try to determine the objects that are referenced by the RDBA numbers. We can try dumping the index structure to see which blocks are read, but first need to find the OBJECT_IDs for the two indexes:

Taking the above HEADER_FILE, HEADER_BLOCK, and MAX_BLOCKS numbers and dumping the block contents to a trace file (this will work in this test case script because all of the extents for the table blocks are probably close to each other – looking back, it probably would have been a better idea to use DBA_EXTENTS rather than DBA_SEGMENTS and just dump the first extent for each object):

Those datafile dumps can be quite time consuming – is there anything else we can try?

We could try to find the RDBA for the ten blocks (note that there is a risk here if the first extent is only eight blocks in length) in the first extent of each segment using the DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS function (items appearing in bold were identified in the trace file summary):

Or, we could try working from the opposite direction. With the knowledge that the lower 22 bits of the RDBA is the block number and the upper ten bits of the RDBA is the relative file number, we can manually calculate the relative file number and the block number from the RDBA and then look up the object name associated with the file and block. First, we need the decimal equivalent of (binary) 1111111111111111111111:

(binary) 1111111111111111111111 = (decimal) 4194303

So, if we BITAND the RDBA with 4194303 we can obtain the block number, and if we divide the RDBA by 4194304 we can determine the relative file number for two of the RDBA numbers that were listed in the trace file summary, as shown below:
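
The same arithmetic can also be checked outside the database; below is a minimal Python sketch of the bit manipulation just described. The functions are stand-ins for the DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE, DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK, and DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS functions, and the RDBA value in the example is one of those listed in the trace file summary:

```python
BLOCK_MASK = (1 << 22) - 1  # 4194303: the lower 22 bits of the RDBA hold the block number

def rdba_to_file_and_block(rdba):
    """Split an RDBA into (relative file number, block number): the upper
    10 bits are the relative file number, the lower 22 bits are the block number."""
    return rdba >> 22, rdba & BLOCK_MASK

def file_and_block_to_rdba(file_no, block_no):
    """Inverse operation: rebuild the RDBA from a file number and block number."""
    return (file_no << 22) | block_no
```

For example, rdba_to_file_and_block(0x01C0000C) returns (7, 12) – relative file 7, block 12 – and file_and_block_to_rdba(7, 12) returns the original RDBA.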

Remembering the number 4194303 might be challenging, so we can just use the DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE and DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK functions instead. Looking up the associated object names can be slow, unless we are able to limit the object names to a list of specific objects (ideally, we would also specify the DE.OWNER column in the WHERE clause):

Notice in the above that the 202 consistent gets that we saw for the CHI_I1 index in Oracle Database 10.2.0.5 oddly only required 9 consistent gets in Oracle Database 11.2.0.2. But that is not the only change. If we process the 10200 trace file through one of the trace file parsers, we might see something like this:

Interesting – it appears that Oracle Database 11.2.0.2 writes the DATA_OBJECT_ID that is related to the block directly into the trace file, so that we no longer need to execute several SQL statements to determine the object names related to the blocks.

Inside the raw 10200 trace file from 11.2.0.2 we might see something like the following:

The 10200 trace file in 11.2.0.2 provided the DATA_OBJECT_ID for the consistent reads, while the 10.2.0.5 trace file did not. We can use this information to determine which objects were accessed, and in which order by pulling in the unique OBJD values from the summary:

I pre-ordered this book in October 2010 while searching for a way to learn some of the more advanced features of Oracle Database’s SQL dialect; I had been looking for a follow-up to the “Mastering Oracle SQL and SQL*Plus” book that I read a couple of years ago, and I think that I finally found that follow-up book. Written as the Oracle Database 11.2 sequel to the book “Mastering Oracle SQL and SQL*Plus” (and that book’s much updated revision titled “Beginning Oracle SQL”), this book is not written as a simple reference for the Oracle specific SQL dialect. This is not a book that you will want to use when trying to learn the basics of creating a simple SQL statement with a four table join. However, if you are able to create that simple SQL statement with a four table join in less than 30 minutes with the help of an entity-relationship diagram (ERD), and without using a DISTINCT clause, this book’s contents will prove to be a valuable resource not only to expand your depth of knowledge of the SQL language, but also to extract the extra value that becomes visible when Oracle Database is viewed as much more than a black box dumping ground for data.

The authors of this book are all OakTable Network members and have between 15 and 29 years of experience working with Oracle products. This long term exposure to Oracle Database is quite clearly an advantage when discussing many of the subtopics that are simply missed by other Oracle specific SQL language references. While there was no discussion of the SQL language from Oracle Database 2.0, the authors are clearly comfortable with the features in Oracle Database 11.2 as well as the features that are available in Oracle Database 8i, 9i, 10g, and 11g R1.

The book’s contents are well organized. While each chapter identifies the author who wrote the chapter (possibly indicating that the authors did not work together on each individual chapter), the book flows well with plenty of forward and backward references between chapters, as well as references to other resources (Metalink, books, and blogs). Chapter one of the book is a bit different from the rest of the chapters, and appears to be written as a transition area for readers to become familiar with SQL*Plus and Oracle Database. Chapter one will be valuable to readers attempting to adjust from using graphical query tools (TOAD, SQL Developer, Excel, or any number of other graphical tools) to using SQL*Plus. Additionally, that chapter helps readers who are familiar with other SQL dialects (such as that used by SQL Server) take advantage of Oracle Database’s special characteristics, and introduces readers to multi-table INSERTs and MERGE operations. Understanding the information presented in execution plans is an emphasis in the remaining chapters of the book; this book’s explanation of execution plan content ranks among the best, if not the best, that I have seen to date. While there is not a single cartoon drawing in the book, and it does not appear that any sections of this book were borrowed from other books, there are several very well placed diagrams in the book’s chapters. The book makes a significant effort to control the scope of the material presented. In most cases, that effort resulted in very easy to understand, yet thorough, discussions of complex topics while building bridges to help the reader transition into Oracle performance specific books, such as the book “Troubleshooting Oracle Performance”. In a couple of areas, adding an additional half-dozen words might have saved the reader a little confusion, but those are rare occurrences in this book.
Be certain to install Oracle Database’s sample schema, and download the script library for this book from the Apress website (some of the scripts, especially those in chapter 16, are quite useful and may not be printed directly in the book).

Foundation knowledge, and miscellaneous comments while reading the book:

Pages 3-8: Provides a quick demonstration of how to connect to the database using SQL*Plus, which is helpful for people who have primarily used other query tools, or other database platforms. Also demonstrates the basics of executing SQL statements, setting up the SQL*Plus environment, and executing scripts.

The book briefly touches on why it is important to use consistent formatting and bind variables in order to reduce the number of hard parses, and why writing SQL to limit logical IO is also important.

Page 40: Automatic query transformation often takes place, for instance, converting an IN subquery into a standard join.

Page 59: The book demonstrates that the array fetch size has an impact on the number of consistent gets performed when executing a query.

Pages 64-65: Includes a test case that shows why an index would be used in one case to retrieve 1% of the table rows, while in another case a full table scan was more appropriate to retrieve 1% of the rows.

Page 66: Nice summary of what controls the number of blocks that are read in a single read call during a full table scan or a fast full index scan. There is one potentially confusing sentence, “This could mean that a multiblock read might only read one block at a time.” The concepts were correctly stated, however it might be worthwhile to state that “a multiblock read might be truncated to a single block read due to the blocks that are already in the buffer cache.”

Pages 74-75: Describes how B*tree indexes grow from a single block in size, and what triggers the index height to increase.

Long code sections in the early chapters with embedded comments – probably OK for the intended audience.

Page 171: Provides a warning to SET SERVEROUTPUT OFF before displaying the execution plan using DBMS_XPLAN.DISPLAY_CURSOR.

Page 175: Describes all of the format parameters for DBMS_XPLAN, including the undocumented ADVANCED parameter.

Page 187: Example of collecting statistics on a table and its indexes without creating histograms.

Page 191: States that using an account with DBA privileges simplifies the process of using DBMS_XPLAN.DISPLAY_CURSOR, using data from other accounts, and selecting from system views, but advises to do so only in a test environment.

The book typically indicates the first Oracle release version that supports a feature that is being described.

Page 218: Includes a test case that seems to demonstrate an error in the Oracle Database 11.2 documentation.

Pages 223-224: Very helpful brief summary of most of the analytic functions that are discussed in the chapter – this list reduces the frustration in finding the correct analytic function that is needed to produce a desired result.

Page 225: Demonstrates how to create a running sum that automatically resets when the value of a column changes.

Page 244: Presence of the keywords WINDOW SORT in an execution plan indicates that the SQL statement uses an analytic function.

The KEEP keyword is not mentioned in the chapter that describes analytic functions, however an example is provided on page 173 of the book in the pln.sql script (in the script download for chapters 5 and 15).

Page 283: WITH clause is known as subquery factoring, and other database platforms (and the ANSI standard) refer to subquery factoring as common table expression.

Page 286: Oracle may process a WITH block as either an inline view or as a temporary table.

Book describes several methods for optimizing query performance, and suggests revisiting queries created in older release versions of Oracle Database to take advantage of performance optimizations that are available in more recent release versions. Also suggests revisiting PL/SQL code to determine if such procedural code may be accomplished with set-based plain SQL code. Emphasizes testing for performance, rather than just using the first query that seems to produce the expected result.

Describes SQL features through version 11.2 (as does the book “Beginning Oracle SQL”).

Page 309: Table 10-1 quickly describes the various functions, operators, and pseudo columns that are related to the Oracle CONNECT BY syntax, including SYS_CONNECT_BY_PATH, CONNECT_BY_ROOT, CONNECT_BY_ISCYCLE, CONNECT_BY_ISLEAF, PRIOR, LEVEL, and NOCYCLE.

Pages 331-334: Provides various demonstrations of common mistakes with queries containing EXISTS clauses.

Pages 336-339: Demonstration that EXISTS and IN queries may be automatically rewritten into equivalent SQL statements and suggests looking in a 10053 trace file to see the actions performed by the optimizer.

Page 349: NOT IN and NOT EXISTS may return different results, and are not functionally equivalent due to the different handling of NULL values.

Page 355: Prior to 11g, anti-joins could not be performed on NOT IN queries unless the optimizer was sure that NULL values could not be returned.

Page 378: Shows how to index NULL values in a B*Tree index by using a constant as the second column in the index.

Page 388: Using DBMS_ROWID.ROWID_OBJECT(ROWID) to show how table rows are distributed with hash partitioning.

Page 388: Using ORA_HASH( column_name, 31, 0) to predict the partition into which a row will be placed in a hash partitioning scheme with 32 partitions.

Page 392: Function based indexes add a virtual column to the table, and those columns are viewable through the DBA_TAB_COLS view.

Page 393: Shows how to create a virtual column that always returns a specified calculated value.

Page 397: Starting in Oracle Database 11g it is possible to alter an index to make it invisible – this can be used to reduce the risks associated with just dropping the index.

Page 403: Direct path inserts are invoked with the APPEND hint, causing rows to be inserted above the high water mark.

Page 408: DBMS_ERRLOG.CREATE_ERROR_LOG procedure creates a logging table that is the destination for rows that fail during an INSERT, UPDATE, or DELETE when the LOG ERRORS INTO clause is included in the INSERT, UPDATE, or DELETE.

Pages 418-421: Demonstrates with a test case that when a large percentage of rows in a table need to be updated, creating a new table and using INSERT APPEND may be much faster than a typical INSERT statement if a little down time is acceptable during the creation of the new table.

Page 482: Demonstrates one of the potential problems associated with using statically defined views in a query, where a query is later extended to return additional information by joining directly to one of the base tables specified in the view.

Page 498: Oracle 10g and 11g use rolling invalidation of cursors when statistics are collected on the objects referenced by the cursors.

Page 514: “Hints are actually directives to the optimizer. As long as the hint is valid, the optimizer will obey it.”

Page 516: Placing the word COMMENT in front of a set of hints in a SQL statement prevents the optimizer from using the hints that follow.

Page 517: Script to extract the OTHER_XML column from V$SQL.

Page 527: Example of creating a stored outline from a SQL statement in the library cache.

Page 538: SQL Profiles do not lock an execution plan in place – instead they lock a cardinality estimate adjustment in place through the use of embedded hints. It is also possible to embed specific hints into SQL statements by creating a SQL profile using the undocumented DBMS_SQLTUNE.IMPORT_SQL_PROFILE function.

ORA_HASH function (pages 95, 388 – used to predict the partition into which a row will be placed in a hash partitioning scheme)

DBMS_APPLICATION_INFO (page 440)

DBMS_RANDOM.VALUE (page 473)

DBMS_UTILITY.GET_TIME and DBMS_UTILITY.GET_CPU_TIME (page 491)

While the “Pro Oracle SQL” book is an overall excellent book, clearly demonstrating that there was a great deal of care put into the book, there are a couple of small problem areas in the book (note that the authors appear to have addressed most of these issues on the errata page for the book on the Apress website):

While not significantly affecting the comprehension of the material presented in the book, there are a couple of typos in the book. For instance, on page 9 the book states “In this book” rather than “In this chapter”; page 31 of the book uses the word “effect” rather than “affect”; and page 221 of the book demonstrates that simple connecting words were sometimes lost, “… would require multiple self-joins to [the] employees table.” None of these typos change the intended message of the paragraphs. However, the writing style in some of the chapters is significantly more fluid than in other chapters.

Pages 51-52: When the optimizer did not select to automatically use the materialized view, the author used a REWRITE hint to force the optimizer to use that materialized view. It probably would have been a good idea to mention that the optimizer likely did not automatically use the materialized view because the calculated cost for that execution plan (1935) exceeded the cost for the execution plan that directly accessed the tables (485).

Page 73: States, “Block accesses made via an index scan are made using single-block reads.” Technically, index range scans may employ multi-block reads in certain cases (such as index pre-fetching), but stating that at this point in the book might lead to unnecessary confusion.

Page 85: A missing word might lead to misunderstanding: “However, in the cases where the number of subindexes needed would be smaller, the operation can be many times more efficient than a full scan as scanning [a] smaller [number of] index blocks can be more efficient than scanning [a] larger [number of] table blocks.”

Page 94: Missing words in an explanation regarding how hash joins work might lead to misunderstanding: “Based on table and index statistics, the table that is determined to return the fewest rows will be hashed in its entirety into memory. This hash table includes all the row data for that table…” Based on testing, it appears that the hash table only includes the selected columns and the columns that are joined to other row sources. Additionally, the row source that is hashed into the hash table might not be from a table. Including this additional detail might be too much information for the scope of the book (reference).

Page 102: There is a risk that the full outer join Oracle syntax equivalent implementation, when the approach is applied to other data, could yield different results from the ANSI full outer join if each row returned is not unique. One way to work around that limitation is to replace the UNION with a UNION ALL and add AND E2.ROWID IS NULL to the final WHERE clause.

Page 163: States, “The operation [in the execution plan] that is most indented is actually the first operation that will be executed. If there are multiple operations at the same level, the operations are executed in a top-down order.” The Oracle Database Performance Tuning Guide for 11.2 from the Oracle documentation library states essentially the same fact, and is also incorrect. This guideline is correct in some cases, but incorrect in other cases as can be confirmed with a 10046 extended SQL trace. The first operation executed in an execution plan is actually the first operation from the top of the execution plan that has no child operations (reference, reference2).

Page 379: States, “B-tree indexes are suitable for columns with lower selectivity. If the columns are not selective enough, the index range scan will be slower. Further, less selective columns will retrieve numerous rowids from the leaf blocks leading to excessive single block access to the table.” Based on the second and third quoted sentences, it appears that the author intended to state that “B-tree indexes are GENERALLY NOT suitable for columns with lower selectivity.”

Page 397: States, “There is another use case for the invisible indexes. These indexes are useful to reduce the risk while dropping unused indexes… From Oracle Database version 11g onwards, you can mark the index as invisible, wait for few weeks, and then drop the index if no process is affected with less risk.” It is important that the book states “less risk” and not that there is no risk – it would have been helpful if the book discussed what risks remain (but again this might exceed the intended scope of the book). Just because an index is not used for a couple of weeks does not mean that the index will not be used during month-end closing, year-end processing, or some other infrequently occurring activity. Invisible indexes on foreign key columns may still be used to prevent table locking problems on the child table when the parent’s primary key columns are updated. Additionally, the statistics from the invisible indexes may be used by the optimizer to more accurately determine cardinality estimates (reference, reference2).

Page 437: The third paragraph contains an apparent typo regarding the location of the online redo logs: “Undo blocks retain the before condition of the data, while the redo information is stored in the online redo logs in the system global area (SGA).”

Page 529: The outline_startup_trigger.sql script is not included in the script library for the book.

Page 539: The create_tuning_task.sql and accept_sql_profile.sql scripts are not included in the script library for the book.

As indicated by the above, most of the problem areas are related to unintentional word substitutions. The majority of the other problem areas are cases where the authors had to draw the line of discussion at a selected level of detail in order to limit potential confusion and control the scope of the book. The remaining problem areas are minor in the overall context of the book, and might elicit a response from the authors along the lines of “I thought that I modified that phrase in draft 20 of the chapter”. This is a fantastic Oracle SQL book that ventures well beyond simple syntax diagrams, and it is an excellent value for the price.

While reading the “Pro Oracle SQL” book I learned something interesting. Commenting your work can improve database performance. You certainly are aware that thoroughly documenting what you do could prevent hours of headaches that might appear later when trying to investigate problems or add additional features to an existing procedure (I think that was part of the message in Cary Millsap’s recent blog article).

But how can commenting what you do actually improve database performance? To demonstrate, let’s create a simple test case using two tables and a single insert statement to populate the two tables:
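The original script is not reproduced here; a minimal sketch that matches the row counts described below might look like the following (the column names, padding values, and the use of a conditional multi-table insert are my assumptions, chosen to satisfy the “single insert statement” description):

```sql
CREATE TABLE T1 (
  C1 NUMBER NOT NULL,
  C2 VARCHAR2(100),
  PRIMARY KEY (C1));

CREATE TABLE T2 (
  C1 NUMBER NOT NULL,
  C2 VARCHAR2(100),
  PRIMARY KEY (C1));

/* A single multi-table insert populates both tables:
   1,000,000 rows into T1, every second row (500,000 rows) into T2 */
INSERT ALL
  WHEN 1 = 1 THEN
    INTO T1 (C1, C2) VALUES (RN, PADDING)
  WHEN MOD(RN, 2) = 0 THEN
    INTO T2 (C1, C2) VALUES (RN, PADDING)
SELECT
  ROWNUM RN,
  RPAD('A', 100, 'A') PADDING
FROM
  DUAL
CONNECT BY
  LEVEL <= 1000000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1')
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T2')
```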

The tables created by the above script both have primary key indexes, with table T1 having 1,000,000 rows and table T2 having 500,000 rows. Now let’s create a simple SQL statement that joins the two tables and output the execution plan for the SQL statement:
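The SQL statement itself was omitted above; a representative equality join on the primary key columns (the select list and column names are assumptions), with AUTOTRACE used to display the execution plan, might be:

```sql
SET AUTOTRACE TRACEONLY EXPLAIN

SELECT
  T1.C1,
  T1.C2,
  T2.C2
FROM
  T1,
  T2
WHERE
  T1.C1 = T2.C1;
```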

The execution plan indicates that a hash join will be used to join the tables, with table T2 listed as the first row source below the words “HASH JOIN”. The query is expected to require roughly 2 minutes and 48 seconds to execute. Now let’s try the query again with a hint:
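The specific hint used in the test case is not shown; a LEADING hint (my assumption) is one way to flip the join order so that table T1 becomes the first row source below the hash join:

```sql
SELECT /*+ LEADING(T1) */
  T1.C1,
  T1.C2,
  T2.C2
FROM
  T1,
  T2
WHERE
  T1.C1 = T2.C1;
```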

The execution plan again indicates that a hash join will be used to join the tables, this time with table T1 listed as the first row source below the words “HASH JOIN”. The query is still expected to require roughly 2 minutes and 48 seconds to execute. Let’s try the query again with a second hint:
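Again, the exact second hint is not shown above; adding a USE_NL hint (my assumption) alongside the join order hint is one way to obtain the nested loops joins described below:

```sql
SELECT /*+ LEADING(T1) USE_NL(T2) */
  T1.C1,
  T1.C2,
  T2.C2
FROM
  T1,
  T2
WHERE
  T1.C1 = T2.C1;
```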

The execution plan this time indicates that two nested loops joins will be used to join the tables, with table T1 listed as the first row source below the words “NESTED LOOPS”. The query is now expected to require roughly 301 minutes and 46 seconds to execute. Now let’s document what we did to alter performance so that the next person to investigate the performance of this SQL statement will know why it performs as it does:
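The commented version of the SQL statement was also omitted above; one plausible reconstruction (the comment text, its position, and the hint are all my assumptions) simply places the documentation comment ahead of the hint:

```sql
SELECT /* 2011: join order and join method changed to improve performance */ /*+ LEADING(T1) USE_NL(T2) */
  T1.C1,
  T1.C2,
  T2.C2
FROM
  T1,
  T2
WHERE
  T1.C1 = T2.C1;
```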

As you can see from the AUTOTRACE generated execution plan, simply commenting our changes is sufficient to convince the optimizer that the SQL statement will execute roughly 100 times faster than the version of the SQL statement without the comment that documents our work. The reason for this estimated performance improvement is explained on page 516 of the book. :-)

I found a couple more Oracle Database related quizzes on the Internet. These quizzes, I would guess, are designed for Oracle Database 8.1 and earlier. I missed one question in each of the quizzes, but I suspect that I could have missed many more if someone had simply handed the quizzes to me and asked that I answer the questions. I think that I would intentionally answer the questions as if the quiz were designed for Oracle Database 11.2, because the database release version for the quiz is not specified. How well would you do if:

You answer the questions knowing that the quizzes were designed for Oracle Database 8.1 and earlier?

You answer the questions believing that the quizzes were designed for the latest release version of Oracle Database?

An interesting hypothetical question was posed on the OTN forums yesterday. Consider this situation. You are using Oracle Database 11.2.0.2 and you find a query that is executing much slower than expected, typically requiring 10 to 15 minutes to execute. You add a RULE hint to the query and find that the query completes in just a couple of seconds. How would you do to address this issue?

—-

To help your thought process, consider the following test case that I included in the OTN thread. The table definition:
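The script from the OTN thread is not reproduced here; a condensed sketch of the shape of the test case (the table name, column names, row count, and predicate are my assumptions – only the four hint variations are taken from the description below) might look like this:

```sql
CREATE TABLE T3 (
  C1 NUMBER,
  C2 VARCHAR2(100));

INSERT INTO T3
SELECT
  ROWNUM,
  RPAD('A', 100, 'A')
FROM
  DUAL
CONNECT BY
  LEVEL <= 100000;

COMMIT;

CREATE INDEX IND_T3_C1 ON T3(C1);

EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T3')

SET TIMING ON

SELECT C1, C2 FROM T3 WHERE C1 <= 100;
SELECT /*+ RULE */ C1, C2 FROM T3 WHERE C1 <= 100;
SELECT /*+ __FAST=TRUE */ C1, C2 FROM T3 WHERE C1 <= 100;
SELECT /*+ INDEX(T3 IND_T3_C1) */ C1, C2 FROM T3 WHERE C1 <= 100;
```

Note that __FAST=TRUE is not a valid Oracle hint – the optimizer silently ignores hint comments that it does not recognize, which is part of what makes the question interesting.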

In the above, we have an unhinted query, the same query with a RULE hint, the same query with the __FAST=TRUE hint, and the same query with an index hint. Which query will execute the fastest, and why? Let’s execute the test case script to find out (note that your results could be very different from my results):

The hypothetical question was essentially very simple. How would you guide Oracle’s optimizer to find the optimal execution path?

—-

Edit January 10, 2011: Note that the phrase “How would you do to address this issue” in the initial paragraph is intentionally left undefined, and left for your interpretation. “This issue” could very well have multiple intended meanings, depending on how you read the paragraph.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:

<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and Less than signs in code sections are often interpreted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters:
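The standard HTML character entities for these characters are:

&lt; in place of the less than sign
&gt; in place of the greater than sign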