One of the biggest responsibilities of a DBA is to ensure that the Oracle database is tuned properly. The Oracle RDBMS is highly tunable and allows the database to be monitored and adjusted to increase its performance.

One should do performance tuning for the following reasons:

The speed of computing might be wasting valuable human time (users waiting for response);

Enable your system to keep-up with the speed business is conducted; and

Never forget that tuning should be aimed at fixing business problems: the overnight batch jobs don't finish till lunchtime; a screen takes 10 seconds to refresh, and SLA says 1 second; a report is needed every 10 minutes, but it takes 20 minutes to generate. Do not tune because you see some figure in a report that you don't like. For example, no end user has ever telephoned the help desk to complain that "there are too many buffer busy wait events".

Although this site is not overly concerned with hardware issues, one needs to remember than you cannot tune a Buick into a Ferrari.

Consider the following areas for tuning. The order in which steps are listed needs to be maintained to prevent tuning side effects. For example, it is no good increasing the buffer cache if you can reduce I/O by rewriting a SQL statement.

Database Design (if it's not too late):

Poor system performance usually results from a poor database design. One should generally normalize to the 3NF. Selective denormalization can provide valuable performance improvements. When designing, always keep the "data access path" in mind. Also look at proper data partitioning, data replication, aggregation tables for decision support systems, etc.

Application Tuning:

Experience shows that approximately 80% of all Oracle system performance problems are resolved by coding optimal SQL. Also consider proper scheduling of batch tasks after peak working hours.

It's important to have statistics on all tables for the CBO (Cost Based Optimizer) to work correctly. If one table involved in a statement does not have statistics, and optimizer dynamic sampling isn't performed, Oracle has to revert to rule-based optimization for that statement. So you really want for all tables to have statistics right away; it won't help much to just have the larger tables analyzed.

The STATSPACK and UTLESTAT reports show I/O per tablespace. However, they do not show which tables in the tablespace has the most I/O operations.

The $ORACLE_HOME/rdbms/admin/catio.sql script creates a sample_io procedure and table to gather the required information. After executing the procedure, one can do a simple SELECT * FROM io_per_object; to extract the required information.

For more details, look at the header comments in the catio.sql script.

The likely cause of this is because the execution plan has changed. Generate a current explain plan of the offending query and compare it to a previous one that was taken when the query was performing well. Usually the previous plan is not available.

Some factors that can cause a plan to change are:

Which tables are currently analyzed? Were they previously analyzed? (ie. Was the query using RBO and now CBO?)

Has OPTIMIZER_MODE been changed in INIT<SID>.ORA?

Has the DEGREE of parallelism been defined/changed on any table?

Have the tables been re-analyzed? Were the tables analyzed using estimate or compute? If estimate, what percentage was used?

Have the statistics changed?

Has the SPFILE/ INIT<SID>.ORA parameter DB_FILE_MULTIBLOCK_READ_COUNT been changed?

Has the INIT<SID>.ORA parameter SORT_AREA_SIZE been changed?

Have any other INIT<SID>.ORA parameters been changed?

What do you think the plan should be? Run the query with hints to see if this produces the required performance.

It can also happen because of a very high high water mark. Typically when a table was big, but now only contains a couple of records. Oracle still needs to scan through all the blocks to see if they contain data.

One can use the index monitoring feature to check if indexes are used by an application or not. When the MONITORING USAGE property is set for an index, one can query the v$object_usage to see if the index is being used or not. Here is an example:

This problem normally only arises when the query plan is being generated by the Cost Based Optimizer (CBO). The usual cause is because the CBO calculates that executing a Full Table Scan would be faster than accessing the table via the index. Fundamental things that can be checked are:

USER_TAB_COLUMNS.NUM_DISTINCT - This column defines the number of distinct values the column holds.

USER_TABLES.NUM_ROWS - If NUM_DISTINCT = NUM_ROWS then using an index would be preferable to doing a FULL TABLE SCAN. As the NUM_DISTINCT decreases, the cost of using an index increase thereby making the index less desirable.

USER_INDEXES.CLUSTERING_FACTOR - This defines how ordered the rows are in the index. If CLUSTERING_FACTOR approaches the number of blocks in the table, the rows are ordered. If it approaches the number of rows in the table, the rows are randomly ordered. In such a case, it is unlikely that index entries in the same leaf block will point to rows in the same data blocks.

Decrease the INIT<SID>.ORA parameter DB_FILE_MULTIBLOCK_READ_COUNT - A higher value will make the cost of a FULL TABLE SCAN cheaper.

Remember that you MUST supply the leading column of an index, for the index to be used (unless you use a FAST FULL SCAN or SKIP SCANNING).

There are many other factors that affect the cost, but sometimes the above can help to show why an index is not being used by the CBO. If from checking the above you still feel that the query should be using an index, try specifying an index hint. Obtain an explain plan of the query either using TKPROF with TIMED_STATISTICS, so that one can see the CPU utilization, or with AUTOTRACE to see the statistics. Compare this to the explain plan when not using an index.

You can run the ANALYZE INDEX <index> VALIDATE STRUCTURE command on the affected indexes - each invocation of this command creates a single row in the INDEX_STATS view. This row is overwritten by the next ANALYZE INDEX command, so copy the contents of the view into a local table after each ANALYZE. The 'badness' of the index can then be judged by the ratio of 'DEL_LF_ROWS' to 'LF_ROWS'.

For example, you may decide that index should be rebuilt if more than 20% of its rows are deleted:

However, generally speaking, indexes should not be rebuilt because performance will be degraded for a considerable time afterwards. The issue is that after rebuild, all blocks are full up to the percent free setting. Subsequent DML will therefore cause many block splits, until the index stabilizes with an appropriate amount of free space.

[edit]What is the difference between DBFile Sequential and Scattered Reads?

Both "db file sequential read" and "db file scattered read" events signify time waited for I/O read requests to complete. Time is reported in 100's of a second for Oracle 8i releases and below, and 1000's of a second for Oracle 9i and above. Most people confuse these events with each other as they think of how data is read from disk. Instead they should think of how data is read into the SGA buffer cache.

db file sequential read:

A sequential read operation reads data into contiguous memory (usually a single-block read with p3=1, but can be multiple blocks). Single block I/Os are usually the result of using indexes. This event is also used for rebuilding the controlfile and reading datafile headers
(P2=1). In general, this event is indicative of disk contention on index reads.

db file scattered read:

Similar to db file sequential reads, except that the session is reading multiple data blocks and scatters them into different discontinuous buffers in the SGA. This statistic is NORMALLY
indicating disk contention on full table scans. Rarely, data from full table scans could be fitted into a contiguous buffer area, these waits would then show up as sequential reads instead of scattered reads.

The following query shows average wait time for sequential versus scattered reads:

The size of the Redo log buffer is determined by the LOG_BUFFER parameter in your SPFILE/INIT.ORA file. The default setting is normally 512 KB or (128 KB * CPU_COUNT), whichever is greater. This is a static parameter and its size cannot be modified after instance startup.

When a transaction is committed, info in the redo log buffer is written to a Redo Log File. In addition to this, the following conditions will trigger LGWR to write the contents of the log buffer to disk:

Whenever the log buffer is MIN(1/3 full, 1 MB) full; or

Every 3 seconds; or

When a DBWn process writes modified buffers to disk (checkpoint).

Larger LOG_BUFFER values reduce log file I/O, but may increase the time OLTP users have to wait for write operations to complete. In general, values between the default and 1 to 3MB are optimal. However, you may want to make it bigger to accommodate bulk data loading, or to accommodate a system with fast CPUs and slow disks. Nevertheless, if you set this parameter to a value beyond 10M, you should think twice about what you are doing.

Statistic "REDO BUFFER ALLOCATION RETRIES" shows the number of times a user process waited for space in the redo log buffer. This value is cumulative, so monitor it over a period of time while your application is running. If this value is continuously increasing, consider increasing your LOG_BUFFER (but only if you do not see checkpointing and archiving problems).

"REDO LOG SPACE WAIT TIME" shows cumulative time (in 10s of milliseconds) waited by all processes waiting for space in the log buffer. If this value is low, your log buffer size is most likely adequate.