If you need to store numbers in the database, store the numbers in a column with a numeric datatype – storing the numbers in a column with a VARCHAR2 datatype invites problems.

If you need to store dates in the database, store the dates in a column with a DATE datatype – storing the dates in a column with a VARCHAR2 datatype invites problems; storing the dates in a column with a numeric datatype invites problems.

Yes, it happens in production envionments

Why is my SQL statement executing so slowly.

Why am I receiving unexpected results from this query?

The same holds true when selecting data from tables.

If a number happens to exist in a column with a VARCHAR2 datatype, and it is necessary to retrieve that row by specifying that column in the WHERE clause, make certain that either the bind variable is defined as a VARCHAR/VARCHAR2 or that the constant (literal) value is wrapped in single quotes so that it is treated as a VARCHAR2, rather than as a number. Yes, this happens in real life, as demontrated in this very recent OTN thread.

If the column is a DATE datatype, do not define the bind variable with a VARCHAR2 datatype and do not pass in a constant as a VARCHAR2 (string). Something is bound to go wrong at some point. I am still trying to determine why this technique was demonstrated multiple times in the “Oracle SQL Recipes” book (I have not had a chance to finish reading this book, so I have not posted a review yet that draws attention to this bad practice).

In the comments section of the most recent True or False Quiz article I showed a couple of demonstrations why numbers should not be stored in VARCHAR2 columns, if only numbers are to be stored in that column. One of my comments showed why, when the column datatype does not match the datatype of the constant, and one of those is a numeric, why the other entity is converted to a numeric, rather than converting the numeric to a VARCHAR2. From that comment entry, consider the following:

Now consider the following three queries – would the developer expect the three queries to return the same result rows?

SELECT
*
FROM
T3
WHERE
CHAR_COL=1;
CHAR_COL C2
---------- -----
1 A
1.0 A
1.00 A
1.000 A
1.0000 A
1.00000 A
1.000000 A
1.0000000 A
1.00000000 A
9 rows selected.
---
SELECT
*
FROM
T3
WHERE
TO_NUMBER(CHAR_COL)=1;
CHAR_COL C2
---------- -----
1 A
1.0 A
1.00 A
1.000 A
1.0000 A
1.00000 A
1.000000 A
1.0000000 A
1.00000000 A
9 rows selected.
---
SELECT
*
FROM
T3
WHERE
CHAR_COL=TO_CHAR(1);
CHAR_COL C2
---------- -----
1 A

Notice that the last query only returned one row, while the other two queries returned nine rows. So, why are VARCHAR2 columns always converted to numeric values, rather than numeric values converted to VARCHAR2? If the number was automatically converted by Oracle into a character value, Oracle might need to test a nearly infinite number of 0 characters appended to the end of the converted value after the decimal point (up to the number of characters of precision) for a matching result – this extra work is avoided by converting the character value to a number.

I might have appeared to have drifted off the topic of this blog article, so now let’s see a case where Oracle’s cost-based optimizer does the impossible – it does not pick the execution plan with the lowest calculated cost for a simple SQL statement involving a single table. This test case can be reproduced on Oracle Database 10.2.0.4 through 11.2.0.1 (and probably a couple of other releases as well). This test case is from another one of my comments in the recent True or False Quiz article – the bonus question. The test case:

“It’s a great shame that Oracle Corp. decided to use the name “hints” for its optimizer directive mechanism. “Hints” are not hints, they are interception points in the optimizer code path, and must be obeyed.”

So, which expert’s explanation is correct for Oracle Database 10.2.0.4 through 11.2.0.1 for this particular test case? Neither? Both? Let’s take a look at the output that was written to the SQL*Plus screen:

The above SQL statement is the same as the first, just with an INDEX hint. The index hint was obeyed. Jonathan Lewis’ blog article is correct – index hints are directives, so Oracle’s optimizer selected an index access path using the only available index – the index on the primary key column.

But now we have a serious problem. What is the problem? For a SQL statement involving a single table a higher calculated cost execution plan (4,450) with a higher estimated time (54 seconds) was selected rather than using the obviously less expensive execution plan with a lower calculated cost (3,961) and with a lower estimated time (48 seconds). Interesting…

Notice that the calculated cost decreased significantly when we constructed the query correctly, and the available index access path was automatically selected.

So, why did the optimizer not select the lowest cost access path? Fortunately, the test case created a 10053 trace file that helps explain what happened. In the 10053 trace file, we see that the optimizer transformed the original SQL statement a bit:

Now, with the transformed version of the SQL statement, it appears that we need a function based index on the CHAR_COL column that converts the column value to a number so that an index access path is possible. Further down in the 10053 trace we find the following:

In the above, we see that the optimizer immediately jumped to a full table scan access path and then immediately declared that a full table scan offered the lowest cost – the optimizer did not even consider an index access path. Now, let’s compare the above with the SQL statement having a hinted access path:

The hint provided in the SQL statement forced the optimizer to do something that is supposedly not possible. Notice that unlike the previous output, the full table scan was not even considered because of the hint.

The short summary of the above: do things correct from the start to avoid confusion and unexplained performance problems later.

* Late edit May 28, 2010: In the most recent True or False quiz, Centinul provided a documentation reference that states that VARCHAR2 values are always converted to numbers when a number is implicitly compared to a VARCHAR2 value.

Thank you for the compliment. Here is an example of unexplained problems, one of the reasons why I took issue with the advice found in the “Oracle SQL Recipes” book.

I noticed that some of our programs were taking a long time to log in for no apparent reason – the login code was well constructed, other than not using bind variables. But the lack of bind variables was not the problem. The database connection used an Oracle 10.2 ODBC client, with a connection string that looked something like this:

"Data Source=ODBCName;User ID=MyUserName;Password=MyPassword;"

I noticed that an OLEDB connection was much faster for the connection. The OLEDB connection string looked something like this:

Why the difference in performance? If I recall correctly, changing from ODBC to OLEDB decreased the connection time from about 3-5 seconds to less than a second. A 10046 trace file captured during the ODBC connection showed repeated executions of SQL statements like this:

Both of those SQL statements show dep=0, yet the application is not submitting those SQL statements – Oracle’s ODBC client for 10.2 is the culprit. No problem, the solution is to just switch to Oracle’s OraOLEDB provider.

But now we have a problem. The SQL statements that for years have worked passing in dates as string literals:

SELECT * FROM T1 WHERE C1='13-MAR-2010';

Are sometimes, maybe always, failing with either a compilation error, or returning the wrong results.

But, how can it fail – the book stated that it was OK, and it has worked for YEARS. Magically, doing it the right way is the fix:

SELECT * FROM T1 WHERE C1=TO_DATE('13-MAR-2010', "DD-MON-YYYY");

Of course, really doing it the right way is to declare a bind variable of type DATE – and that works just as well.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:
<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and Less than signs in code sections are often interpretted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: