Archive for July, 2007

If you are an Oracle DBA, chances are that you are familiar with the DBMS_STATS package and its most common subroutines to collect statistics for an entire database, a specific schema, or a specific user. If not, we would advise taking a look at it as a replacement for the ANALYZE command – to learn why using DBMS_STATS over the ANALYZE command is the better way to collect statistics, please see our previous blog post.

Using the DBMS_STATS package, we can even delete statistics, and export and import statistics from one database to another in case we need to reproduce a specific performance problem. In this post, we are going to talk about how we can trick the optimizer by using one of the subprograms of the DBMS_STATS package.

Normally, when we are performing benchmark or load tests, one of the major requirements is to make sure that the queries in the application use optimal execution plans. A normal task for the DBA or the database developer, then, is to create volume data for the benchmark to take place. That works well for automated benchmark tests done via LoadRunner or other load test suites.

However, if a developer also wants to test their code for performance during development, one cannot just get into the habit of creating volume data each and every time a query changes or new code is written. It would be useful to have an option with which one can see how the query execution plan changes as the data set grows. Here the SET_TABLE_STATS procedure comes to the rescue. Using this procedure, one can set the number of rows and the number of blocks to some large values, which will make the optimizer think that there are a large number of rows in the table and that the data distribution is different. When the query is run against these new values, the optimizer may change the query plan based on the available data. There are similar procedures to set index-level and column-level statistics as well; please refer to the Oracle manuals for complete details on those. Let us check this with the help of a working example. First we will create a table. Connect to the appropriate schema with proper authentication using SQL*Plus.
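The original scripts are not reproduced here, but a minimal sketch of the setup would look along these lines (the table, column, and index names are our own for illustration):

```sql
-- Create a small indexed table, gather real statistics on it,
-- and display the plan the optimizer picks for an indexed predicate.
CREATE TABLE test (
  test_id   NUMBER,
  test_name VARCHAR2(30)
);

CREATE INDEX test_idx ON test (test_id);

INSERT INTO test
  SELECT ROWNUM, 'NAME' || ROWNUM FROM dual CONNECT BY LEVEL <= 100;
COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'TEST', cascade => TRUE);

EXPLAIN PLAN FOR SELECT * FROM test WHERE test_id = 1;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```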

The plan indicates that even though an index is present, a full table scan is performed, which is what is expected, since the cost of scanning the table is less than going through the index and then using a rowid to access the table. Now let us set the number of rows and number of blocks to some higher values and see whether the optimizer changes its mind about the access path. First we will set the values using the SET_TABLE_STATS stored procedure.
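The call itself looks like this (a sketch, assuming the table in our example is named TEST):

```sql
-- Make the optimizer believe TEST has 10,000 rows in 1,000 blocks.
-- no_invalidate => FALSE invalidates dependent cursors so the fake
-- numbers take effect on the next parse.
BEGIN
  DBMS_STATS.SET_TABLE_STATS(
    ownname       => USER,
    tabname       => 'TEST',
    numrows       => 10000,
    numblks       => 1000,
    no_invalidate => FALSE
  );
END;
/
```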

In the above execution, we are setting the number of rows to 10000 and the number of blocks to 1000. Based on this data, the optimizer may choose a different plan. Now we will re-execute the same query. Below is the query followed by its execution plan.

In our example, when we updated the table statistics, the optimizer opted for an index scan rather than a full table scan.

This method can also be used to set statistics on temporary tables to improve performance when temporary tables are used for larger data sets. Oracle guru Tom Kyte has explained in detail how we can utilize statistics for temporary tables at his site – here and here.

By no means should this procedure be used to simulate the production volume by merely setting the number of rows, distinct values or number of blocks for a specific table, index or columns. It only helps us figure out what plan a query can have, or how the join conditions will change the plan, if the data volume changes. For proper testing, one must create data volumes in all the tables that reflect the production data volume, and perform thorough load and performance testing, to have a smooth rollout. Still, having this option is very handy for database developers to check code performance at design and development time.

In a future post, we will cover how one can do this in SQL Server (both 2000 and 2005).

In Oracle, one can use the INSTR function to search for a string within another string. If there is a match, it returns the starting position of the matching string; if there is no match, it returns 0. The search can also begin from a specific starting point within the string, and if the starting position is negative, the search is performed backwards from the end of the string. INSTR takes four arguments; the first two are mandatory and the last two are optional.

INSTR(<Actual String>, <Search String>, <Starting Position>, <Occurrence>)
<Actual String> is the string or expression in which we need to look for the search string.
<Search String> is the string or expression to be searched for.
<Starting Position> is the position from which the search should start. This is optional; if it is not given, the search starts from the beginning of <Actual String>.
<Occurrence> is also an optional argument. If given, INSTR looks for that specific occurrence of the search string within the string.

In order to extract the country from the above record, we have to use the INSTR function with the starting position and occurrence arguments. Here is the SQL.

SELECT substr(state, instr(state,',',1,2)+1, length(state) - instr(state,',',1,2))
FROM inpt_state
WHERE state like 'MIAMI%'
/

In the above statement, 1 is the starting position and 2 is the occurrence of the search string. Since the country 'USA' comes after the second comma, we take the substring after the second occurrence of the search string (',') to get the desired result.
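To see the mechanics without the table, the same expression can be tried against a literal (this assumes the state column holds comma-separated values such as 'MIAMI,FL,USA'):

```sql
-- INSTR('MIAMI,FL,USA', ',', 1, 2) = 9, the position of the 2nd comma,
-- so SUBSTR starts at position 10 and returns everything after it.
SELECT SUBSTR('MIAMI,FL,USA', INSTR('MIAMI,FL,USA', ',', 1, 2) + 1) AS country
FROM dual;

-- COUNTRY
-- -------
-- USA
```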

Some folks at client sites have asked this question, so I am posting it. Oracle DBAs who worked with version 7.x and earlier are sometimes still under the impression that there is a difference between count(*) and count(1). In those old releases, there used to be a performance difference between the two (count(1) used to be slower). Count(1) and count(*) are now essentially the same thing: there is absolutely no difference between them. They do the same thing, take the same amount of resources and return the same output. You can very easily run a trace and tkprof to compare the number of blocks read, written or processed, along with the CPU and elapsed times, and they will be identical. You can also see examples of such runs (including an example of count(cola) where cola is a not-null indexed column vs. a non-indexed column) over here: http://www.oracledba.co.uk/tips/count_speed.htm

In addition, let me mention one more thing which I have seen some developers do – they use a query like "select count(*) from table_foo where x = :1" just to find out whether there are any qualifying records for that criteria, and then the code takes the respective route based on the zero or non-zero value returned. Instead of doing that, you can just do: "select /*+ first_rows(1) */ col1 from table_foo where x = :1 and rownum < 2". That way, if a lot of records qualify for that criteria, you do not have to incur the cost of the aggregate operation, since all that you are interested in is whether any record exists at all. Likewise, you can use the TOP 1 clause in SQL Server and the FETCH FIRST N ROWS clause in DB2 LUW to do the same thing.
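An existence check along those lines might look like this (the table and bind variable names come from the example above; the cross-platform variants are sketches):

```sql
-- Oracle: stop after the first qualifying row instead of counting all of them.
SELECT /*+ FIRST_ROWS(1) */ col1
FROM   table_foo
WHERE  x = :1
AND    ROWNUM < 2;

-- SQL Server equivalent:
-- SELECT TOP 1 col1 FROM table_foo WHERE x = @x;

-- DB2 LUW equivalent:
-- SELECT col1 FROM table_foo WHERE x = ? FETCH FIRST 1 ROWS ONLY;
```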

We ran into a performance issue at one of the client sites yesterday. They were using SQL Server and were relying on automatic stats collection. Auto-stats gets kicked off only based on a sampling of the records (you can read this KB article to see how SQL Server determines when to kick off auto-stats, and you can also search the blog for other posts related to stats collection). For their tables that had millions of records, with keys that grew monotonically, this meant that the stats covering the rows newly inserted by last night's feeds were grossly incorrect. Since SQL Server's optimizer is a Cost Based Optimizer (CBO), up-to-date stats are very important for the generation of good execution plans.

They were also running a de-frag job regularly, and as you might already know, when you rebuild an index, the stats are updated automatically. However, this weekly de-frag job was kicked off based on a certain percentage of fragmentation. As a result, the tables with clustered keys that were growing monotonically ran into stale statistics, since they stayed below the fragmentation threshold specified in the maintenance task. So we added another step to the maintenance task to update the statistics with a fullscan on the indexes of the tables that did not get de-fragmented. We restricted this fullscan stats update to the indexes to keep the timing short. This resolved the performance issue for the client, as the optimizer now had good, up-to-date statistics to generate a good execution plan.
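The added maintenance step boils down to something like this (the table and index names here are placeholders, not the client's):

```sql
-- Update statistics with a full scan, restricted to index statistics only,
-- for a table the de-frag job skipped.
UPDATE STATISTICS dbo.big_table WITH FULLSCAN, INDEX;

-- Or target the statistics of a single index:
UPDATE STATISTICS dbo.big_table ix_big_table_key WITH FULLSCAN;
```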

There are additional changes to be aware of between SQL Server 2000 and SQL Server 2005 when it comes to the counter updates caused by DML modifications. SQL Server 2000 uses a counter that tracks row-level modifications, whereas SQL Server 2005 uses a counter that tracks changes at the column level. The counter updates also differ when an update is made to key column(s) vs. non-key column(s): a non-key column update raises the counter by the number of updated columns, whereas a key column update raises the counter by 2 for each column. And another very welcome change is that the TRUNCATE TABLE and BULK INSERT commands do not raise the counters in SQL Server 2000, but they do in SQL Server 2005.

Using a flashback versions query, one can look into the historical changes that were made to a table. Before Oracle 10g, it was not possible to view the series of changes made to a table in the past unless one was doing auditing. Oracle introduced the 'Flashback Query' feature in Oracle 9i, which gave a view of the table at one specific time in the past. To view all the changes made to a specific row between two points in time, Oracle introduced 'Flashback Versions Query' in 10g. Using this feature, we can see all the versions of a row (the changes made to it) within a specific time interval. Whenever a commit happens, a new version of the row is created.

SELECT versions_startscn, versions_endscn,
       versions_xid, CASE
                       WHEN versions_operation = 'I' THEN 'INSERT'
                       WHEN versions_operation = 'U' THEN 'UPDATE'
                       WHEN versions_operation = 'D' THEN 'DELETE'
                     END AS operation,
       tran_amt
FROM test_ac
VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
ORDER BY versions_startscn;

We can also query versions_starttime and versions_endtime instead of the SCN. Run the following query to see the creation and expiration time of each row version. Also, instead of MINVALUE and MAXVALUE, we can use an interval the same way as we use it in a regular flashback query. Following is the example.

SELECT versions_starttime, versions_endtime,
       versions_xid, CASE
                       WHEN versions_operation = 'I' THEN 'INSERT'
                       WHEN versions_operation = 'U' THEN 'UPDATE'
                       WHEN versions_operation = 'D' THEN 'DELETE'
                     END AS operation,
       tran_amt
FROM test_ac
VERSIONS BETWEEN TIMESTAMP systimestamp - INTERVAL '10' MINUTE AND systimestamp
ORDER BY versions_startscn;

In the above query, we are requesting all the changes made to the rows in the last 10 minutes. The output is shown below.

Using a flashback transaction query, we can obtain transaction information, including the SQL code to undo the changes made by a transaction. A flashback transaction query is a query on the view FLASHBACK_TRANSACTION_QUERY. We can use the versions_xid column from the above queries to query the view and obtain the transaction information. Run the following query to get the transaction details; the output is shown below the query.
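The lookup is along these lines (the XID value below is a placeholder; substitute one of the versions_xid values returned by the queries above):

```sql
-- Look up the undoing SQL and other details for one transaction.
SELECT xid, operation, logon_user, undo_sql
FROM   flashback_transaction_query
WHERE  xid = HEXTORAW('0200030041000000');
```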

If you look at the UNDO_SQL column carefully, you will see that the update statement sets the value to 5000 and not to 7000, because the column displays the SQL to undo the changes made by the transaction. In this case, the transaction changed the value of the tran_amt column from 5000 to 7000, and hence the UNDO_SQL column shows the SQL to revert the change. The LOGON_USER column shows the user responsible for the change.

Warning

Flashback query uses Oracle's multiversion read-consistency to retrieve the data by applying undo as needed. So the data will be available only for the time specified by the UNDO_RETENTION parameter of the database; it will not return the historical data if the time difference exceeds the time defined by that parameter. On our database the value for this parameter is set to 900 (15 minutes), so if we run the same query after 15 minutes, we don't get anything back. Following is the example.

SELECT versions_starttime, versions_endtime,
       versions_xid, CASE
                       WHEN versions_operation = 'I' THEN 'INSERT'
                       WHEN versions_operation = 'U' THEN 'UPDATE'
                       WHEN versions_operation = 'D' THEN 'DELETE'
                     END AS operation,
       tran_amt
FROM test_ac
VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
ORDER BY versions_startscn;

Also, as per the Oracle documentation, specify the RETENTION GUARANTEE clause for the undo tablespace to ensure that unexpired undo is not discarded. For performance suggestions and a detailed description of flashback query, please refer to the Oracle 10g Application Developer's Guide.

In Oracle, sequences are generally used to auto-generate column IDs. You can read more about sequences and their usage in our previous post – here. In today's post, we are going to look at how we can reset the sequence value. Resetting might be needed when we have used up all the values for a given sequence (a very rare occurrence, even if you are calling it thousands of times per second) or when someone set the increment value wrongly and you need to correct it.

The output of the above query will be 5. Since we have incremented it by the desired value, let us alter the sequence again to be incremented by 1 and get the next value, so that we get continuous numbers from now on.

The insert will be successful and we are back on track. Here is the result set.

SQL> select test_id from test;
TEST_ID
——-
1
2
3
4
7
8

Let us delete the records from the TEST table and reset the sequence back to 1. For resetting the sequence to 1, we will again alter the sequence increment, but this time with a negative number, get the next value of the sequence, and then alter it once more to be incremented by 1.
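The steps just described can be sketched as follows (assuming a sequence named TEST_SEQ whose last generated value was 8):

```sql
-- 1. Step the sequence backwards by (current value - 1).
ALTER SEQUENCE test_seq INCREMENT BY -7;

-- 2. Consume one value; this moves the sequence down to 1.
SELECT test_seq.NEXTVAL FROM dual;

-- 3. Restore the normal increment.
ALTER SEQUENCE test_seq INCREMENT BY 1;
```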

This post explains how one can re-sequence a number column in Oracle. Let's take a hypothetical example and walk through it. Suppose that a big departmental store has its own stocking facility, and each shelf in the facility is identified as a location (STORE_LOCN_ID) in which more than one item can be stored. Whenever a new item is added to a shelf, a new record is created in the table by increasing the value in the LOCN_SEQ column by 1. Whenever the actual quantity (ACTL_QTY) of an item decreases to 0, the record is deleted from the table. Following is the table structure that holds the data of the items in the facility.

There is also a unique index on the STORE_LOCN_ID + LOCN_SEQ column combination, since that forms the business key of this table. Please note that the LOCN_SEQ column has been defined as NUMBER(3) in this example to demonstrate how to re-sequence the numbers; in an actual design, you may want to increase the length of that column. The example is so chosen to illustrate the re-sequencing using an analytic function in Oracle.

And for this example, this is how a set of data looks like in this table:

Now, suppose that because of frequent movement of items, we reached the maximum limit (999) of LOCN_SEQ for a particular STORE_LOCN_ID, but there are a few more items that need to be stocked. Since the max value for LOCN_SEQ has been reached, any attempt to insert another record with a LOCN_SEQ value of 1000 will result in an ORA-01438 error (value larger than specified precision). We can thus re-sequence the values so that for a given STORE_LOCN_ID, the LOCN_SEQ values are sequenced and any gaps that resulted from deletion of records can be re-used again. Let us see how we can resolve this using an analytic function.

We will create a temporary table using CTAS (Create Table As Select) to store the data. A key thing to note is that we are using the Row_Number() analytic function with a partition clause. Since we want to initialize LOCN_SEQ to 1 for each STORE_LOCN_ID, we partition by STORE_LOCN_ID and order by LOCN_SEQ.

SQL> create table temp_store_stock as
     select store_stock_id, store_locn_id, locn_seq,
            row_number() over (partition by store_locn_id order by locn_seq) new_seq
     from store_stock
     order by store_locn_id, locn_seq;

The data in this table is shown below. You will notice that the NEW_SEQ number is in continuous order without any gaps.

If you want to dig into Row_Number() and the partition-by logic, you can search this web-site for more information on those functions or look up the Oracle docs. Oracle has a very rich set of analytic functions which can be very useful for development as well as troubleshooting purposes. We also covered one of those in a post a few days ago – you can access it here.
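To complete the exercise, the new values have to be applied back to the base table. One possible way, a sketch of our own that the post does not show, is a MERGE keyed on the surrogate STORE_STOCK_ID (unique-key checking happens at the end of the statement, so the transient overlaps during renumbering are not a problem):

```sql
-- Apply the gap-free sequence numbers back to STORE_STOCK.
MERGE INTO store_stock s
USING temp_store_stock t
ON (s.store_stock_id = t.store_stock_id)
WHEN MATCHED THEN
  UPDATE SET s.locn_seq = t.new_seq;

COMMIT;
```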

Sometimes there is a need to store large data sets like big XML files, resumes, book reviews etc. in the database. And since we store the data, occasionally we may have to update it as well. MS SQL Server 2000 provides the TEXT data type to store such large content in a table. In the following example, we will show how we can manipulate such columns. We will also demonstrate how to perform similar string manipulation on columns with the VARCHAR(MAX) data type introduced in MS SQL Server 2005.

We cannot update a TEXT column using regular DML statements; instead, we need to use READTEXT (to read the value), WRITETEXT (to replace the existing value) and UPDATETEXT (to update a partial value). For the detailed syntax of all these commands, refer to BOL.

First, let us start by creating a table and populating it with sample data to test.

In order to read, write or update a TEXT column, we first need to obtain a valid pointer to the TEXT column using the TEXTPTR function. If the query returns more than one row, the pointer value will be that of the last row fetched.
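Obtaining the pointer and reading through it looks roughly like this (a sketch assuming a table named test with columns test_id and test_text, matching the names used later in this post):

```sql
DECLARE @ptr VARBINARY(16);

-- Get a pointer to the TEXT value of one row.
SELECT @ptr = TEXTPTR(test_text)
FROM   test
WHERE  test_id = 1;

-- Read 5 characters starting at offset 0.
READTEXT test.test_text @ptr 0 5;
```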

-- This will replace the text with new text
WRITETEXT test.test_text @ptr 'Hello World!'

Since the command replaces the entire string, it doesn't need any offset value; it just requires a valid pointer to the row. Execute a select statement to see the updated string. It should return 'Hello World!'.

UPDATETEXT
Following is the code snippet to update a portion of the string in the column. We will replace 'Wor' with 'Arno' so that the result will be 'Hello Arnold!'.
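Assuming @ptr again holds a valid text pointer obtained via TEXTPTR, the call is along these lines:

```sql
-- Starting at offset 6, delete 3 characters ('Wor') and insert 'Arno'.
UPDATETEXT test.test_text @ptr 6 3 'Arno';
```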

UPDATETEXT requires an insert_offset (the position from which to insert data) and a delete_length (the length of data to be deleted, starting from the insert_offset position). In our example, 6 is the insert_offset and 3 is the delete_length. Selecting the data should return the string 'Hello Arnold!'.

.WRITE (To manipulate VARCHAR(MAX) Columns)
In MS SQL Server 2005, Microsoft introduced three new data types – VARCHAR(MAX), NVARCHAR(MAX) and VARBINARY(MAX) – to store large data sets. These will replace the TEXT, NTEXT and IMAGE data types in future releases. To manipulate columns defined with these data types, Microsoft also introduced the .WRITE clause. Using this clause, we can replace the entire column value (like WRITETEXT) or update a portion of the column (like UPDATETEXT).

Microsoft strongly suggests using VARCHAR(MAX) and the related data types, as the TEXT, NTEXT and IMAGE data types and their related functions (UPDATETEXT, WRITETEXT and READTEXT) will be removed from future releases.

In the following example, we will see how we can perform string manipulation on columns defined with the VARCHAR(MAX) data type. Let us add a new column to the table.
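A sketch of that step and of the full-replacement call (the column name test_max follows the post; the seed text is our own, and is needed because .WRITE cannot operate on a NULL value):

```sql
-- Add a VARCHAR(MAX) column and seed it with some text.
ALTER TABLE test ADD test_max VARCHAR(MAX);
GO
UPDATE test SET test_max = 'Original text' WHERE test_id = 1;
GO
-- Replace the entire value: offset 0, length = current length.
UPDATE test
SET test_max.WRITE('Hello World!', 0, LEN(test_max))
WHERE test_id = 1;
GO
```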

The .WRITE clause takes three arguments: the new string, an offset and a length. In our case, 0 is the offset (the starting position) and LEN(test_max) is the length; we are replacing the entire string with the new string 'Hello World!'.

Now, to update a portion of the string, execute the following command.

UPDATE test
SET test_max.WRITE('Arno', 6, 3) WHERE test_id = 1
GO

The above command replaces 3 bytes starting from byte 7 (since the offset is a zero-based ordinal position). Here we are replacing the word 'Wor' with 'Arno'. Executing SELECT * FROM TEST will give us the output 'Hello Arnold!'.

Using the .WRITE clause, we can also append a string at the end of the existing string, remove a portion of the string, or remove data from a specified position to the end of the string. To achieve similar functionality with regular VARCHAR columns, we need to use the STUFF function.

Prior to Oracle 10g, it was not possible to recover a table once it was dropped. Among the many useful features introduced in 10g is 'Oracle Flashback Table'; using it, we can recover a dropped table to its earlier state with a single statement.

First, let us create an empty table and populate it with some data. If you already have a table with the same name, change all occurrences of the table name to some other name.

In order to recover the table, we need to make sure that the recyclebin parameter is on. You can read more about the recycle bin in our previous post here. Run the following SQL to see the current value of the parameter.

SQL>show parameter recyclebin

NAME TYPE VALUE
———————————— ———– ——————————
recyclebin string OFF

The recyclebin parameter value is set to OFF, so we need to turn it on first. We can do it either at the session level or at the system level. It is a dynamic parameter, so there is no need to stop and start the database after changing its value. We will set it at the system level; alternatively, we can put it in the init<sid>.ora file as well.

SQL> ALTER SYSTEM SET recyclebin = ON;

Flashing back to before drop

The BEFORE DROP syntax allows us to restore the table to the state it was in just before it was dropped. We already created the table previously; let us check it first and then drop it.

Even though we have dropped the table, it is not actually removed. Instead, it is renamed with a system-generated name and kept in the recycle bin; any indexes and triggers on the table are renamed as well. You can view the details either from the USER_RECYCLEBIN view or the RECYCLEBIN synonym. The query below shows the original name and the new name of all the objects that were dropped.
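That query is along these lines:

```sql
-- Original names vs. system-generated BIN$ names of dropped objects.
SELECT original_name, object_name, type, droptime
FROM   user_recyclebin
ORDER BY droptime;
```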

Now let us try to recover the table. Issue the following command to revive it.

SQL> flashback table test to before drop;
Flashback complete.

SQL> select * from tab;

TNAME TABTYPE CLUSTERID
—————————— ——- ———-
TEST TABLE

The table is now back in its entirety, including the trigger and indexes along with the data. The flashback command renames the table to its original name, but does not rename the indexes, triggers or constraints; we need to rename those manually. Get the object names and their original names from the USER_RECYCLEBIN query shown earlier.

Constraint names are not retrievable from the USER_RECYCLEBIN view, so we have to refer to the original scripts or other documentation to get the original names, but the following is the syntax to rename the constraints.
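The rename statements take this general shape (the BIN$ names and the original names below are placeholders for the actual system-generated and original names):

```sql
-- Rename a recovered index back to its original name.
ALTER INDEX "BIN$abcdefghijk==$0" RENAME TO test_idx;

-- Rename a recovered trigger.
ALTER TRIGGER "BIN$lmnopqrstuv==$0" RENAME TO test_trg;

-- Rename a recovered constraint.
ALTER TABLE test RENAME CONSTRAINT "BIN$wxyzabcdefg==$0" TO test_pk;
```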

Another use of flashback table is to restore the data in a table to a different point in time, using the TIMESTAMP or SCN clause. Here we will show how to achieve it using the SCN.

Let us first get the current SCN value. Make sure that you have the execute privilege on the DBMS_FLASHBACK package, and run the following statement to get the current SCN.

SQL> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
————————
3965449

At this point, data in the table is as shown below.

SQL> select test_id, test_indic from test;

TEST_ID TEST_INDIC
———- ———-
1 0
2 0

Now we will update the value of the test_indic column to 1.

SQL> update test
2 set test_indic = 1
3 /

2 rows updated.

SQL> commit;

Commit complete.

First, we will compare the old and new values of the test_indic column by joining against the table version at a particular SCN. Run the following SQL, making sure to change the SCN to the correct value, or parameterize the query to take the SCN as input.

In the above query, we are self-joining the test table with one of its past versions to compare the updated column values. Now let us see how we can restore the table to that SCN. Issue the following command to reinstate the copy of the table as of that SCN.
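Using the SCN captured earlier (3965449), the restore looks like this; note that row movement must be enabled on the table first:

```sql
-- Flashback requires row movement, since rows may get new rowids.
ALTER TABLE test ENABLE ROW MOVEMENT;

FLASHBACK TABLE test TO SCN 3965449;
```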

We have successfully reverted the changes and restored the table to a specific point in time. In a similar fashion, the flashback clause can be used with the TIMESTAMP and RESTORE POINT options as well. For the limitations and a detailed description of the flashback and row movement clauses, please refer to the Oracle 10g SQL Reference.

In SQL Server, in order to find out whether triggers are enabled or disabled, we can query the metadata tables. Using the OBJECTPROPERTY() function, we can write a simple script to see whether the trigger in question is disabled or enabled.
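The single-trigger check is along these lines (assuming a trigger named TEST_TRIGGER, as used below):

```sql
-- Returns 1 if the trigger is disabled, 0 if enabled, NULL if not found.
SELECT OBJECTPROPERTY(OBJECT_ID('TEST_TRIGGER'), 'ExecIsTriggerDisabled') AS is_disabled;
```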

The above statement can be run from Query Analyzer (SQL Server 2000) or SSMS (SQL Server 2005). OBJECT_ID and OBJECTPROPERTY are built-in functions. In the above statement, OBJECT_ID('TEST_TRIGGER') returns the id of the TEST_TRIGGER object. OBJECTPROPERTY requires two arguments (id, property), where property is an expression naming the information to be returned for the object specified by id; for the various properties, refer to BOL. If a developer wants to see the status of all triggers in the database, then the following query will be useful.

SELECT
  OBJECT_NAME(PARENT_OBJ) TABLE_NAME,
  NAME AS TRIGGER_NAME,
  CASE OBJECTPROPERTY(ID, 'EXECISTRIGGERDISABLED')
    WHEN 0 THEN 'ENABLED'
    ELSE 'DISABLED'
  END AS STATUS
FROM SYSOBJECTS
WHERE XTYPE = 'TR'

The above query returns the TABLE_NAME and the associated TRIGGER_NAME along with its STATUS. If you are using SQL Server 2005, you can use sys.sysobjects instead of sysobjects, though the above query will work fine as well. It is usually advisable to use the INFORMATION_SCHEMA views when dealing with metadata, but not all the information can be obtained from those, so one also needs to understand the system tables and views.