This loop produces output that shows how long it takes to load
100000 rows into the table. Ideally, we would like the last
100000 rows to be inserted in roughly the same amount of time as
the first 100000 rows. In this case, this is exactly what we find
- that is, there is no slowdown as more rows are loaded.

Perhaps this is because we are inserting known data values in
order? As a second attempt, we can modify the insert statement in
line 9 of the above procedure to read:

insert into DESTINATION (KEY) select dbms_random.random from SOURCE;

Looking at the times taken for each 100000 row insert, we notice
that although the time is slightly longer on average than it was
for the first test, there is still no noticeable increase in time
through the iterations of the loop. The extra time taken is
simply the overhead of the call to dbms_random.

So, maybe we need an index on the DESTINATION table in order
to see the effect that inserting many rows into the table has on
performance. Create an index on the table as follows:

create index IX1 on DESTINATION (KEY);

and then rerun the first PL/SQL block - the one that inserts the
values in order. Perhaps surprisingly, there is still no
noticeable increase in the amount of time taken to load 100000
rows at the end of the procedure compared to the beginning.

As a last resort, we can run the second PL/SQL block,
inserting a random number into the table. At last, we come to a
point where it takes longer to insert the last 100000 rows than
it did to insert the first 100000 rows. (In my test, an increase
in time of over 5000%!) This indicates that the presence of an
index on a column that contains values in a random order is
likely to be the culprit that is slowing the inserts.

The reason for this slowdown can be explained if the test is
modified to provide statistics on the number of reads and writes
that are performed on the table and index during the load. As the
table is initially empty when the load starts, there is no need
to read from the disk - new blocks are simply created internally
and mapped onto the data file. There are, however, disk writes as
the database writer flushes dirty buffers to the disk, making
space for new blocks to be created. Initially this is true for
the index as well. However, after about 1 million rows have been
inserted, the number of reads from the index starts to increase.
In my tests, the insert of the last 100000 rows into the table
required over 80000 disk reads from the index segment! (And 3
from the table.)
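
One way to collect these figures is to sample the segment-level
statistics before and after each batch and report the difference.
The following query is a minimal sketch of that idea, assuming
the test objects live in a schema called SCOTT (a placeholder)
and that statistics_level is set to typical or all so that
V$SEGMENT_STATISTICS is populated:

select object_name, statistic_name, value
from v$segment_statistics
where owner = 'SCOTT'
and object_name in ('DESTINATION', 'IX1')
and statistic_name in ('physical reads', 'physical writes');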

The reason for the slowdown is therefore clear. Up to about a
million rows, the index is held entirely within the buffer cache.
In order to insert a new entry into the index, Oracle can
traverse the relevant blocks in memory and insert the row. Once
the index grows beyond the space available in the block buffers,
some of the blocks needed for the index maintenance may not be in
memory, and will need to be read from disk. However, in order to
read them, Oracle first needs to make space for them by aging out
old blocks, which may in turn be needed for a later insert. This
thrashing of blocks in and out of memory is the source of the
performance problem.
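
You can watch this happen by counting how many of each object's
blocks are currently in the buffer cache, and comparing that
figure with the total size of the index segment. The queries
below are a sketch of the idea; they join V$BH to DBA_OBJECTS on
the data object id and assume you have SELECT access to these
dictionary views:

select o.object_name, count(*) as cached_blocks
from v$bh b, dba_objects o
where b.objd = o.data_object_id
and b.status != 'free'
and o.object_name in ('DESTINATION', 'IX1')
group by o.object_name;

select blocks from dba_segments where segment_name = 'IX1';

When the cached_blocks figure for IX1 stops growing while the
segment's total block count keeps increasing, that is a sign that
the index no longer fits in the cache and the thrashing described
above has begun.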

This also explains why the index did not show the same
characteristics when the rows were inserted in order - once a
leaf block had been filled and written out, it would never be
needed again, and so no reads from the index were necessary.

In order to try to relieve the problem, the insert statement
can be modified to read:

insert into DESTINATION (KEY) select dbms_random.random from SOURCE order by 1;

This forces each batch of 100000 rows to be inserted in numeric
order, progressing through the index from start to finish.
Although this does not completely get rid of the problem, on my
(rather slow) test system the time to load 20 million rows was
reduced by over 28 hours (a saving of 29%)!

The ideal solution is of course to drop the indexes on a table
before running a large load and recreate them after the load has
completed.
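
For the test case above, that would look something like the
following; the nologging clause is an optional extra that is
often used to speed up the rebuild, not something the technique
depends on:

drop index IX1;

-- run the large load here, then rebuild the index

create index IX1 on DESTINATION (KEY) nologging;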

It is worth noting that this characteristic is only observed
when inserting large numbers of rows in a batch, at a time when
there is little other activity on the database. Inserting a
single row into a table with 20 million rows on a busy database
will not take significantly more time than inserting the same row
into an empty table, as the chance of the necessary blocks being
in the buffer cache depends only on how the database has been
used in the recent past.