If response times are lagging in your high-transaction system,
reducing disk I/O is the best way to bring about quick improvement.
And when you access tables in a transaction system exclusively
through range scans in primary-key indexes, reorganizing the tables
with the CTAS method should be one of the first strategies you use
to reduce I/O. By physically sequencing the rows in the same order
as the primary-key index, this method can considerably speed up data
retrieval.

For
queries that access common rows with a table (e.g. get all items in
order 123), unordered tables can experience huge I/O as the index
retrieves a separate data block for each row requested.

If we
group like rows together (as measured by the clustering_factor in
dba_indexes) we can get all of the row with a single block read
because the rows are together. You can use 10g hash cluster
tables, single table clusters, or manual row re-sequencing (CTAS
with ORDER BY) to achieve this goal:

Like disk load balancing, row resequencing is easy, inexpensive, and
relatively quick. With both techniques in your DBA bag of tricks,
you'll be well equipped to shorten response times-often
dramatically-in high-I/O systems.

Experienced Oracle DBAs know that I/O is the single greatest
component of response time and regularly work to reduce I/O. Disk
I/O is expensive because when Oracle retrieves a block from a data
file on disk, the reading process must wait for the physical I/O
operation to complete. Physical reads (disk access) is at
least 100x slower than a buffer get, and anything you can do to
minimize I/O can greatly improve the performance of any Oracle database.

Oracle database administrators also know that storing data in the
same way that it is retrieved will reduce I/O. Re-sequencing
table rows to place related rows together on the same data blocks
can greatly reduce I/O because Oracle can grab more relevant data in
a single block fetch.

For a simple illustration, assume a
system where 95% of the SQL queries retrieve 50 random item rows for
a specific order number. Queries for "show me all items for
this order" would greatly benefit from reduced if we sequence all
item rows by order number, placing together on adjacent data blocks.

For example, clustering all
orders for a customer together on the same data
block allows the query "show me all orders
for this customer" to be retrieved in a
single fetch.
See here for more details on using hashing
to group data rows onto related data blocks.
The
clustering_factor metric shows how in-sync
the rows are.

Note that in
the most beneficial case, dba_indexes will show
this column with a low
clustering_factor, small avg_row_len and we are using
large data blocks (found in dba_segments) to minimize
physical reads.

When row re-sequencing does not help

For the majority of tables, physically re-sequencing
the rows is not going to make a dramatic difference in SQL response
time. Row re-sequencing is best for databases when the vast
majority of SQL queries traverse an un-clustered one-to-many data
relationship.

For example, a shipping database where clerks enter customer data
and then retrieve random order data might benefit by re-sequencing the
orders onto adjacent data blocks.

Packing adjacent rows together on a single data block to
reduce block fetches for index row scans.

Row re-sequencing does not help queries that perform full-scans
or index unique scans, and careful attention must be given to the
nature of the queries against the target table.

Proving the benefits of row-resequencing

There are still relational purists and neophytes who don't
believe that storing related rows together will reduce query I/O.
The best way to address the concerns of Luddites is to create a
simple test case:

Run the original query and measure consistent gets and
physical reads (TKPROF)

Copy the table to a re-sequenced table (using CTAS with
order by), build indexes and re-run the query noting the logical
and physical reads.

Your mileage may vary

The runtime benefits of row-resequencing (as measured by a
reduction in consistent gets) depends on many factors, most
important, the overall amount that the table is accessed via the
index (as a range scan) and the average number of rows returned by
the range scan:

Client access method – The SQL*plus arraysize
parameter and the use of PL/SQL "bulk collect" can affect consistent gets
for row-ordered tables. See my
notes here on
arraysize timings.

Data structure - Rows per block
fetch is important and it depends on block size and avg_row_len.

SQL access method – The way that
your table is accessed has a great effect on the benefits of
row-resequencing:

No benefit for single block access
(“index unique scan” and “fetch by ROWID”) or for full-table
scans.

Best when a large majority of the SQL
does long range scans against the target table

The number of rows returned by the
range scan is a major factor. A system that displays an
average of 200 line items for an order would see a far
greater LIO reduction than a “small” index range scan that
only fetches 20 rows.

Historically, Oracle dropped-the-ball in offering the robust
physical data sequencing methods. Competing databases such as
IDMS/R and DB2 have offered the ability to pre-sequence rows and
manage new inserts.

Since 1981 IDMS has offered the
VIA set construct that allows adjunct access for
key-sequenced rows.

When an owner participates in more than
one set, you can request that a specified set be given priority
and those member records will be stored first, ensuring that
they are stored on the same page as the owner.

In
DB2 cluster indexes, we have been able to cluster rows
together in index sequence sequence for decades:

The keyword CLUSTER, specified when the index is created,
instructs DB2 to maintain the rows on the data pages in sequence
according to the indexed column

Originally, relational theory stated that the physical placement
of rows on the database did not matter, causing consternating among
realists who have known for decades that query performance is highly
related to the placement of the rows on the data blocks.

It was not until Oracle8i that Oracle lifted this silly
restriction and the Oracle DBA was allowed to use "Create Table as
Select (CTAS)" with an ORDER BY clause. Prior to that, the
Oracle DBA had to "force" row re-sequencing with an "index" hint in
the CTAS syntax.

Inside Row re-sequencing

The degree to which resequencing improves performance depends on how
far out of sequence the rows are when you begin and how many rows
you will be accessing in sequence. You can find out how well a
table's rows match the index's sequence key by looking at the
dba_indexes and dba_tables views in the data dictionary.

In the dba_indexes view, we look at the clustering_factor
column. If the clustering factor-an integer-roughly matches the
number of blocks in the table, your table is in sequence with the
index order. However, if the clustering factor is close to the
number of rows in the table, it indicates that the rows in the table
are out of sequence with the index.

The benefits of row resequencing cannot be underestimated. In large
active tables with a large number of index scans, row resequencing
can triple the performance of queries. Once you have decided
to re-sequence the rows in a table, you can use one of the following
tools to reorganize the table.

Copy the table using Oracle's Create Table As Select (CTAS)
syntax.

Oracle9i dbms_redefinition, an in-place table reorganization
tool.

Migrate to a single-table or multi-table cluster.

Row re-sequencing FAQ

>> I am trying my best to make data
sequencing a performance tactic at my shop and when I speak
about it's benefits and what it is......Oracle DBA's look at me
like I'm speaking in tongues.

They must be beginners. Data
sequence on disk has been beneficial for decades before
Oracle came along. Just ask them why Oracle provides
cluster tables, specifically made for row sequencing.

>> I'm more than in agreement with
you that data sequencing is an extremely powerful SQL
performance tactic for sequential heavy queries.

Well, it's best for "large
index range scans", and it can reduce LIO by a factor of
thousands. It's easily proven with a simple test,
cloning the table into row sequence and re-running the
query.

>> So, I'm hoping you could provide
me with a customer reference or two who uses data sequencing.

I would start with ANY DB2 or
IDMS shop, where it's easy to sequence rows into key order.
Also, search MOSC for "table cluster" questions and you
should find hundreds of shops using Oracle cluster tables.

You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts.

��

Burleson is the American Team

Note:This Oracle
documentation was created as a support and Oracle training reference for use by our
DBA performance tuning consulting professionals.
Feel free to ask questions on our
Oracle forum.

Verify
experience!Anyone
considering using the services of an Oracle support expert should
independently investigate their credentials and experience, and not rely on
advertisements and self-proclaimed expertise. All legitimate Oracle experts
publish
their Oracle
qualifications.

Errata? Oracle technology is changing and we
strive to update our BC Oracle support information. If you find an error
or have a suggestion for improving our content, we would appreciate your
feedback. Just e-mail:
and include the URL for the page.