This chapter expands on the ideas introduced in Chapter 1 and explains optimizer actions in greater detail for specific cases. This chapter describes how the cost-based optimizer evaluates expressions and performs specific operations. It also explains how the CBO transforms some SQL statements into others to achieve the same goal more efficiently.

How the Optimizer Performs Operations

The optimizer fully evaluates expressions whenever possible and translates certain syntactic constructs into equivalent constructs. Constructs are translated in the following cases:

Oracle can more quickly evaluate the resulting expression than the original expression.

The original expression is a syntactic equivalent of the resulting expression. Sometimes, different SQL constructs can operate identically (for example, = ANY (subquery) and IN (subquery)); Oracle maps these to a single construct.

How the CBO Evaluates IN-List Iterators

The IN-list iterator is used when a query contains an IN clause with values.

The execution plan is identical to what would result for a statement with an equality clause instead of IN. There is one additional step: The IN-list iterator feeds the equality clause with unique values from the IN-list.

Example 2-1 How the CBO Evaluates IN-List Iterators

Both of the following statements are equivalent and produce the same plan:

The IN-list operator is especially useful if there is an expensive first step that you do not want to repeat for every IN-list element. In Example 2-4, even though there are three IN-list elements, the full scan on so_lines_all happens only once.

Example 2-4 Using the IN-List Iterator to Avoid an Expensive First Step

When the Optimizer Uses IN-List Iterators

The optimizer uses an IN-list iterator when an IN clause is specified with values, and the optimizer finds a selective index for that column. If there are multiple OR clauses using the same index, then the optimizer chooses this operation rather than CONCATENATION or UNIONALL, because it is more efficient.

IN-List Iterator Hints

There are no hints for this operation. You can provide a hint to use the relevant index, which can cause this operation. Example 2-5 shows a query with and without an INDEX hint, and the resulting execution plans.

How the CBO Evaluates Concatenation

Concatenation is useful for statements with different conditions combined with an OR clause. With concatenation, you get a good execution plan with appropriate indexes. Examples 2-6 and 2-7 show two plans, each accessing the table through the appropriate index and combined using concatenation.

The plan in Example 2-6 does not return duplicate rows, so for each component it appends a negation of the previous components.

Trying to execute the statement in a single query produces a poor execution plan. Because the optimizer has two paths to follow and has been instructed not to decompose the query, it needs to access all the rows in the second table to see if any rows match one of the conditions.

Concatenation Hints

Use the hint USE_CONCAT for this operation.

When Not to Use Concatenation

Concatenation is expensive in the following cases and should not be used:

When OR conditions are on same column and can use the IN-list operator, which is more efficient than concatenation

If, instead, you provide a hint to use the two-column index, then the optimizer switches to that with an IN-list iterator. The initial scan is not repeated, and a better execution plan results, as follows:

SELECT "ORDER_NUMBER","CUSTOMER_ID"
FROM "SO_HEADERS_ALL" "H"
WHERE "CUSTOMER_ID" IS NOT NULL
ORDER BY "CUSTOMER_ID";

A couple of factors influence the execution plan:

Network round trips can be several orders of magnitude more expensive than the physical and logical I/Os.

The optimizer does not have any statistics on the remote database, which might not be an Oracle database.

In general, the optimizer chooses to access the remote tables first, before accessing the local tables. This works well for cases like Example 2-10, where the driving table is the remote table. However, if the driving table is the local table, then there might not be any selective way of accessing the remote table without first accessing the local tables. In such cases, you might need to provide appropriate hints to avoid performance problems.

Besides the conditions used in the access path, a table might have additional conditions to filter rows when the table is visited. Conditions that get applied when the table is accessed, like attribute1ISNOTNULL, do not show up as FILTER. Examples 2-13 and 2-14 illustrate such conditions.

Example 2-13 shows a query with a GROUPBY condition that creates a filter.

In this example, for every row meeting the condition of the outer query, the correlated EXISTS subquery is executed. If a row meeting the condition is found in the so_lines_all table, then the row from so_headers_all is returned.

How the CBO Executes Distributed Statements

The optimizer chooses execution plans for SQL statements that access data on remote databases in much the same way that it chooses execution plans for statements that access only local data:

If all the tables accessed by a SQL statement are located together on the same remote database, then Oracle sends the SQL statement to that remote database. The remote Oracle instance executes the statement and sends only the results back to the local database.

If a SQL statement accesses tables that are located on different databases, then Oracle decomposes the statement into individual fragments, each of which accesses tables on a single database. Oracle then sends each fragment to the database that it accesses. The remote Oracle instance for each of these databases executes its fragment and returns the results to the local database, where the local Oracle instance can perform any additional processing that the statement requires.

When choosing a cost-based execution plan for a distributed statement, the optimizer considers the available indexes on remote databases, just as it considers indexes on the local database. The optimizer also considers statistics on remote databases for the CBO. Furthermore, the optimizer considers the location of data when estimating the cost of accessing it. For example, a full scan of a remote table has a greater estimated cost than a full scan of an identical local table.

For a rule-based execution plan, the optimizer does not consider indexes on remote tables.

In Example 2-23, a view is created because the IN subquery requires a SORTUNIQUE on the values being selected. This view would be unnecessary if the columns being selected were unique, not requiring a sort.

How the CBO Evaluates Constants

Computation of constants is performed only once, when the statement is optimized, rather than each time the statement is executed.

For example, the following conditions all test for monthly salaries greater than 2000:

sal > 24000/12
sal > 2000
sal*12 > 24000

If a SQL statement contains the first condition, then the optimizer simplifies it into the second condition.

Note:

The optimizer does not simplify expressions across comparison operators: In the preceding examples, the optimizer does not simplify the third expression into the second. For this reason, application developers write conditions that compare columns with constants whenever possible, rather than conditions with expressions involving columns.

How the CBO Evaluates the UNION and UNION ALL Operators

This operator is useful for combining OR clauses into one compound statement or for breaking up a complex statement into a compound statement containing simpler select statements that are easier to optimize and understand.

As with concatenation, you do not want to duplicate expensive operations by using UNIONALL.

When the Optimizer Uses UNION or UNION ALL

The optimizer uses UNION or UNIONALL when the SQL statement contains UNION or UNIONALL clauses.

Example 2-25 shows a query without a UNION clause. The query finds customers who are new or have open orders.

Because the driving conditions come from different tables, you cannot execute the query effectively in a single statement.

With a UNION clause, you can break the query into two statements:

New customers

Customers with open orders

These two statements can be optimized easily. Because you do not want duplicates (some customers meet both criteria), use UNION, which eliminates duplicates by using a sort. If you use two disjoint sets, then you can use UNIONALL, eliminating the sort. The query from Example 2-25 is shown using UNION in Example 2-26.

UNION and UNION ALL Hints

How the CBO Evaluates the LIKE Operator

The optimizer simplifies conditions that use the LIKE comparison operator to compare an expression with no wildcard characters into an equivalent condition that uses an equality operator instead.

In the following example, the optimizer simplifies the first condition into the second:

ename LIKE 'SMITH'
is transformed into
ename = 'SMITH'

The optimizer can simplify these expressions only when the comparison involves variable-length datatypes. For example, if ename was of type CHAR(10), then the optimizer cannot transform the LIKE operation into an equality operation due to the equality operator following blank-padded semantics and LIKE not following blank-padded semantics.

How the CBO Evaluates the IN Operator

The optimizer expands a condition that uses the IN comparison operator to an equivalent condition that uses equality comparison operators and OR logical operators.

In the following example, the optimizer expands the first condition into the second:

How the CBO Evaluates the ANY or SOME Operator

The optimizer expands a condition that uses the ANY or SOME comparison operator followed by a parenthesized list of values into an equivalent condition that uses equality comparison operators and OR logical operators.

In the following example, the optimizer expands the first condition into the second:

How the CBO Evaluates the ALL Operator

The optimizer expands a condition that uses the ALL comparison operator followed by a parenthesized list of values into an equivalent condition that uses equality comparison operators and AND logical operators.

In the following example, the optimizer expands the first condition into the second:

The optimizer transforms a condition that uses the ALL comparison operator followed by a subquery into an equivalent condition that uses the ANY comparison operator and a complementary comparison operator. In the following example, the optimizer transforms the first condition into the second:

x > ALL (SELECT sal
FROM emp
WHERE deptno = 10)
is transformed into
NOT (x <= ANY (SELECT sal
FROM emp
WHERE deptno = 10) )

The optimizer then further transforms the second query into the following query using the rule for transforming conditions with the ANY comparison operator, followed by a correlated subquery:

NOT EXISTS (SELECT sal
FROM emp
WHERE deptno = 10
AND x <= sal)

How the CBO Evaluates the BETWEEN Operator

The optimizer always replaces a condition that uses the BETWEEN comparison operator with an equivalent condition that uses the >= and <= comparison operators.

In the following example, the optimizer replaces the first condition with the second:

sal BETWEEN 2000 AND 3000
is transformed into
sal >= 2000 AND sal <= 3000

How the CBO Evaluates the NOT Operator

The optimizer simplifies a condition to eliminate the NOT logical operator. The simplification involves removing the NOT logical operator and replacing a comparison operator with its opposite comparison operator.

In Example 2-27, the optimizer simplifies the first condition into the second one:

Example 2-27 How the CBO Evaluates the NOT Operator

Often, a condition containing the NOT logical operator can be written in many different ways. The optimizer attempts to transform such a condition so that the subconditions negated by NOTs are as simple as possible, even if the resulting condition contains more NOTs. In Example 2-28, the optimizer simplifies the first condition into the second, and then into the third.

Example 2-28 How the CBO Simplifies a NOT Statement

NOT (sal < 1000 OR comm IS NULL)
is transformed into
NOT sal < 1000 AND comm IS NOT NULL
which is further transformed into
sal >= 1000 AND comm IS NOT NULL

How the CBO Evaluates Transitivity

If two conditions in the WHERE clause involve a common column, then the optimizer sometimes can infer a third condition, using the transitivity principle. The optimizer can then use the inferred condition to optimize the statement. The inferred condition can make available an index access path that was not made available by the original conditions.

Note:

Transitivity is used only by the CBO.

Consider a WHERE clause containing two conditions of these forms:

WHERE column1 comp_oper constant
AND column1 = column2

In this case, the optimizer infers the condition:

column2 comp_oper constant

where:

comp_oper is any of the comparison operators =, !=, ^=, <, <>, >, <=, or >=.

Example 2-29 Query Causing Transivity Evaluation

If an index exists on the dept.deptno column, then this condition makes available access paths using that index.

The optimizer only infers conditions that relate columns to constant expressions, rather than columns to other columns. Consider a WHERE clause containing two conditions of these forms:

WHERE column1 comp_oper column3
AND column1 = column2

In this case, the optimizer does not infer the condition, column2comp_opercolumn3.

How the CBO Optimizes Common Subexpressions

Common subexpression optimization is an optimization heuristic that identifies, removes, and collects common subexpressions from disjunctive (OR) branches of a query. In most cases, it reduces the number of joins to be performed.

Common subexpression optimization is enabled with the initialization parameter OPTIMIZER_FEATURES_ENABLE.

A query is considered valid for common subexpression optimization if its WHERE clause is in the following form:

The top-level is a disjunction (that is, a list of ORed logs).

Each disjunct is either a simple predicate or a conjunction (that is, a list of ANDed logs).

Each conjunct is either a simple predicate or a disjunction of simple predicates. (A predicate is considered simple if it does not contain AND or OR.)

The expression appears in all the disjunctive branches of the query.

Examples of Common Subexpression Optimization

The query in Example 2-30 finds names of employees who work in a department located in L.A. and who make more than 40K or who are accountants. The query contains common subexpressions in its two disjunctive branches.

This query is transformed by common subexpression optimization to the following query, reducing the number joins from three to one.

Optimized Query

SELECT SUM (l_extendedprice* (1 - l_discount))
FROM PARTS, LINEITEM
WHERE (p_partkey = l_partkey /* these are the four common subexpressions */
AND p_size >= 1
AND l_shipmode IN ('AIR', 'REG AIR')
AND l_shipinstruct = 'DELIVER IN PERSON')
AND

How the CBO Evaluates DETERMINISTIC Functions

In some cases, the optimizer can use a previously calculated value rather than executing a user-written function. This is only safe for functions that behave in a restricted manner. The function must return the same output return value for any given set of input argument values.

The function's result must not differ because of differences in the content of package variables or the database, or session parameters such as the globalization support parameters. Furthermore, if the function is redefined in the future, then its output return value must be the same as that calculated with the prior definition for any given set of input argument values. Finally, there must be no meaningful side effects to using a precalculated value instead of executing the function again.

The creator of a function can promise to the Oracle server that the function behaves according to these restrictions by using the keyword DETERMINISTIC when declaring the function with a CREATEFUNCTION statement or in a CREATEPACKAGE or CREATETYPE statement. The server does not attempt to verify this declaration--even a function that obviously manipulates the database or package variables can be declared DETERMINISTIC. It is the programmer's responsibility to use this keyword only when appropriate.

Calls to a DETERMINISTIC function might be replaced by the use of an already calculated value when the function is called multiple times within the same query, or if there is a function-based index or a materialized view defined that includes a relevant call to the function.

How the Optimizer Transforms SQL Statements

SQL is a very flexible query language; there are often many statements you could use to achieve the same goal. Sometimes, the optimizer (Query Transformer) transforms one such statement into another that achieves the same goal if the second statement can be executed more efficiently.

How the CBO Transforms ORs into Compound Queries

If a query contains a WHERE clause with multiple conditions combined with OR operators, then the optimizer transforms it into an equivalent compound query that uses the UNIONALL set operator, if this makes the query execute more efficiently:

If each condition individually makes an index access path available, then the optimizer can make the transformation. The optimizer chooses an execution plan for the resulting statement that accesses the table multiple times using the different indexes and then puts the results together.

If any condition requires a full table scan because it does not make an index available, then the optimizer does not transform the statement. The optimizer chooses a full table scan to execute the statement, and Oracle tests each row in the table to determine whether it satisfies any of the conditions.

For statements that use the CBO, the optimizer might use statistics to determine whether to make the transformation, by estimating and then comparing execution costs of the original statement and the resulting statement.

The CBO does not use the OR transformation for IN-lists or ORs on the same column; instead, it uses the INLIST iterator operator.

Figure 2-1 Execution Plan for a Transformed Query Containing OR

To execute the transformed query, Oracle performs the steps in Figure 2-1 in the following order:

Steps 3 and 5 scan the indexes on the job and deptno columns using the conditions of the component queries. These steps obtain rowids of the rows that satisfy the component queries.

Steps 2 and 4 use the rowids from steps 3 and 5 to locate the rows that satisfy each component query.

Step 1 puts together the row sources returned by steps 2 and 4.

If either the job column or the deptno column is not indexed, then the optimizer does not even consider the transformation, because the resulting compound query would require a full table scan to execute one of its component queries. Executing the compound query with a full table scan in addition to an index scan could not possibly be faster than executing the original query with a full table scan.

Example 2-33 shows the kind of query that is not transformed. The query in this example assumes that there is an index on the ename column only:

Because the condition in the WHERE clause of the second component query (sal > comm) does not make an index available, the compound query requires a full table scan. For this reason, the optimizer does not make the transformation, but chooses a full table scan to execute the original statement.

How the CBO Unnests Subqueries

To optimize a complex statement, the optimizer chooses to do one of the following:

Transform the complex statement into an equivalent join statement, and then optimize the join statement.

Optimize the complex statement as it is.

The optimizer transforms a complex statement into a join statement whenever the resulting join statement is guaranteed to return exactly the same rows as the complex statement. This transformation allows Oracle to execute the statement by taking advantage of join optimizer techniques.

Example 2-34 shows how the optimizer uses a join to unnest a subquery. The complex statement in this example selects all rows from the accounts table for which owners appear in the customers table.

Example 2-34 How the CBO Unnests Subqueries

Initial Query

SELECT *
FROM accounts
WHERE custno IN
(SELECT custno FROM customers);

If the custno column of the customers table is a primary key or has a UNIQUE constraint, then the optimizer can transform the complex query into the following join statement that is guaranteed to return the same data:

Figure 2-2 Execution Plan for a Nested Loops Join

If the optimizer cannot transform a complex statement into a join statement, then it chooses execution plans for the parent statement and the subquery as though they were separate statements. Oracle then executes the subquery and uses the rows returned to execute the parent query.

Example 2-35 shows the kind of statement is not transformed. The complex statement in this example returns all rows from the accounts table that have balances greater than the average account balance.

How the CBO Merges Views

To merge the view's query into a referencing query block in the accessing statement, the optimizer replaces the name of the view with the names of its base tables in the query block and adds the condition of the view's query's WHERE clause to the accessing query block's WHERE clause.

This optimization applies to select-project-join views, which are views that contain only selections, projections, and joins--that is, views that do not contain set operators, aggregate functions, DISTINCT, GROUPBY, CONNECTBY, and so on (as described in "Mergeable and Nonmergeable Views").

Example 2-36 shows how the CBO merges views. The view in this example is of all employees who work in department 10:

Example 2-36 How the CBO Merges Views

The following query accesses the view. The query selects the IDs greater than 7800 of employees who work in department 10:

SELECT empno
FROM emp_10
WHERE empno > 7800;

The optimizer transforms the query into the following query that accesses the view's base table:

SELECT empno
FROM emp
WHERE deptno = 10
AND empno > 7800;

If there are indexes on the deptno or empno columns, then the resulting WHERE clause makes them available.

Mergeable and Nonmergeable Views

The optimizer can merge a view into a referencing query block when the view has one or more base tables, provided the view does not contain any of the following:

Set operators (UNION, UNIONALL, INTERSECT, MINUS)

A CONNECTBY clause

A ROWNUM pseudocolumn

Aggregate functions (AVG, COUNT, MAX, MIN, SUM) in the select list

When a view contains one of the following structures, it can be merged into a referencing query block only if Complex View Merging is enabled:

A GROUPBY clause

A DISTINCT operator in the select list

View merging is not possible for a view that has multiple base tables, if it is on the right side of an outer join. However, if a view on the right side of an outer join has only one base table, then the optimizer can use complex view merging, even if an expression in the view can return a nonnull value for a NULL.

If a query has a CURSOR expression, then no view merging will take place, even for views which would normally be mergeable. An example is the following:

Complex View Merging

If a view's query contains a GROUPBY clause or DISTINCT operator in the select list, then the optimizer can merge the view's query into the accessing statement only if complex view merging is enabled. Complex merging of a view with a GROUPBY clause is illustrated in Example 2-37.

Complex merging can also be used to merge an IN subquery into the accessing statement if the subquery is uncorrelated, as shown in Example 2-38.

Complex merging is not cost-based--it must be enabled with the initialization parameter OPTIMIZER_FEATURES_ENABLE or the MERGE hint. Without this hint or parameter setting, the optimizer uses another approach, pushing predicates.

The following query accesses the view. The query selects the IDs and names of all employees in either table who work in department 20:

SELECT empno, ename
FROM two_emp_tables
WHERE deptno = 20;

Because the view is defined as a compound query, the optimizer cannot merge the view's query into the accessing query block. Instead, the optimizer can transform the accessing statement by pushing its predicate, the WHERE clause condition (deptno = 20), into the view's compound query.

If there is an index on the deptno column, then the resulting WHERE clause makes it available. Figure 2-4 shows the execution plan for the resulting statement. The execution plan uses an index on the deptno column.

Figure 2-4 Accessing a View Defined with a GROUP BY Clause

To execute this statement, Oracle performs the steps in Figure 2-4 in the following order:

Step 4 performs a range scan on the index emp_deptno_index (an index on the deptno column of the emp table) to retrieve the rowids of all rows in the emp table with a deptno value of 10.

Step 3 accesses the emp table using the rowids retrieved by step 4.

Step 2 sorts the rows returned by step 3 to calculate the average, minimum, and maximum sal values.

Step 1 indicates that the view's query was not merged into the accessing query.

How the CBO Applies an Aggregate Function to the View

The optimizer can transform a query that contains an aggregate function (AVG, COUNT, MAX, MIN, SUM) by applying the function to the view's query.

The query in Figure 2-41 accesses the emp_group_by_deptno view defined in Figure 2-40. The query derives the averages for the average department salary, the minimum department salary, and the maximum department salary from the employee table.

Figure 2-5 Applying Aggregate Functions to a View Defined with GROUP BY Clause

To execute this statement, Oracle performs the steps in Figure 2-5 in the following order:

Step 4 performs a full scan of the emp table.

Step 3 sorts the rows returned by step 4 into groups based on their deptno values and calculates the average, minimum, and maximum sal value of each group.

Step 2 indicates that the view's query was not merged into the accessing query.

Step 1 calculates the averages of the values returned by step 2.

How the CBO Executes Views in Outer Joins

For a view that is on the right side of an outer join, the optimizer can use one of two methods, depending on how many base tables the view accesses:

If the view has only one base table, then the optimizer can use view merging.

If the view has multiple base tables, then the optimizer can push the join predicate into the view.

How the CBO Accesses the View's Rows with the Original Statement

The optimizer cannot transform all statements that access views into equivalent statements that access base table(s). For example, if a query accesses a ROWNUM pseudocolumn in a view, then the view cannot be merged into the query, and the query's predicate cannot be pushed into the view.

To execute a statement that cannot be transformed into one that accesses base tables, Oracle issues the view's query, collects the resulting set of rows, and then accesses this set of rows with the original statement as though it were a table. Example 2-42 illustrates this process.

Example 2-42 How the CBO Accesses the View's Rows

The following query accesses this view. The query joins the average, minimum, and maximum salaries from each department represented in the view to the name and location of the department in the dept table.

Because there is no equivalent statement that accesses only base tables, the optimizer cannot transform this statement. Instead, the optimizer chooses an execution plan that issues the view's query and then uses the resulting set of rows as it would the rows resulting from a table access.

Figure 2-6 Joining a View Defined with a GROUP BY Clause to a Table

To execute this statement, Oracle performs the steps in Figure 2-6 in the following order:

Step 4 performs a full scan of the emp table.

Step 3 sorts the results of step 4 and calculates the average, minimum, and maximum sal values selected by the query for the emp_group_by_deptno view.

Step 2 used the data from the previous two steps for a view.

For each row returned by step 2, step 6 uses the deptno value to perform a unique scan of the pk_dept index.

Step 5 uses each rowid returned by step 6 to locate the row in the deptno table with the matching deptno value.

Oracle combines each row returned by step 2 with the matching row returned by step 5 and returns the result.

How the CBO Executes Compound Queries

To choose the execution plan for a compound query, the optimizer chooses an execution plan for each of its component queries, and then combines the resulting row sources with the union, intersection, or minus operation, depending on the set operator used in the compound query. Examples 2-43, 2-44 and 2-45 illustrate the process.

The query in Example 2-43 uses the UNIONALL operator to select all occurrences of all parts in either the orders1 table or the orders2 table.

Figure 2-8 Compound Query with UNION Set Operator

This execution plan is identical to the one for the UNIONALL operator shown in Figure 2-7, except that in this case, Oracle uses the SORT operation to eliminate the duplicates returned by the UNIONALL operation.

The query in Example 2-45 uses the INTERSECT operator to select only those parts that appear in both the orders1 and orders2 tables.