Separate Indexes From Tables, Some Thoughts Part II (There There) April 18, 2008

In Part I, I discussed how separating indexes from tables won’t likely improve performance as:

Oracle moves from reading index blocks to table blocks in a sequential manner

Most of the associated I/Os are likely to be random anyways

Multi-User environments would result in disk contention regardless

That being said, why is it then some sites claim performance improvements after separating indexes from tables ? Previously, performance was sluggish however after moving indexes into a separate tablespace, performance appears to have picked up. Clearly then, moving indexes into a separate tablespace does improve performance, even if common sense might suggest otherwise.

Well, not quite. Here’s a scenario that’s not entirely uncommon …

Currently, an application has both tables and indexes in the same tablespace. The tablespace consists of various datafiles distributed across (say) 4 physical devices. Most database waits are I/O related with both db file sequential reads and db file scattered reads featuring heavily in performance metric reports. However, I/O performance is somewhat average with slow I/O related wait times and performance is generally suffering as a result.

Maybe, just maybe, the problem is due to having tables and indexes in the same tablespace. Perhaps if we separate the indexes away from the tables, contention will reduce, I/O wait times will decrease and database performance might improve as a result.

So we create a shining new, index only tablespace spread across (say) 4 additional physical disks and rebuild all our indexes in this new tablespace. To our relief, thankfully, performance has indeed improved. Average I/O wait times have been reduced and overall database performance has improved as a result. Despite what folks like that Richard Foote dude claims, here is clear proof and evidence of performance indeed improving, purely and simply by just separating indexes from their tables.

All we did was pull the wings off the fly and now it won’t take off after clapping our hands. Clear proof that flies go deaf when you pull off their wings …

There are of course two additional, potentially significant events that have also occurred other than just the indexes being separated from the tables.

The first one is that not only have all the indexes been moved to another tablespace, but all indexes have also been rebuilt as a consequence. Now, I’m the last person to get all excited about indexes being rebuilt, however as I’ve gone to great lengths to document, there are rare scenarios when indexes can get fragmented and may benefit from a rebuild. By moving indexes into a new tablespace, we’ve effectively rebuilt all the indexes, the (say) 99% where it wouldn’t have mattered but also the (say) 1% where it may have improved things. We have also rebuilt those indexes where there may be some temporary improvement until the index starts to flesh itself out again.

As a result, there could be all manner of related changes to execution plans and performance generally, especially related to larger index range scans and index fast full scans.

It’s not the indexes being separate from the tables that’s making some difference here, it’s the fact all the indexes have been rebuilt (especially those that were badly fragmented and accessed by large index scans).

The fly isn’t really deaf …

However, the far more significant difference we’ve also made is that we have of course just introduced 4 new physical devices into our database infrastructure. As a result, we may have significantly enhanced our I/O bandwidth and possibly reduced I/O related contention issues. All the general I/O activity related to indexes that was occurring on our initial 4 disk table/index tablespace have all been removed and are now occurring on our new, separate 4 disk index only tablespace.

But that’s a good thing right, that’s what we wanted to achieve ?

Not quite.

In the index range scan scenario I discussed in Part I, just note how few of the overall I/Os were related to the index. In larger index range scans where in theory separating indexes might improve performance, very few of the related physical I/O activity is actually attributed to indexes. The index would have to have an extremely low (and rare) clustering factor for index costs to be significant. In most “randomly” distributed index scans, there’s significantly more table related physical I/O activity than index activity.

By moving just the indexes into these new physical devices, we’ve just moved a whole bunch of segments that as a group incur relatively low physical I/O related activity while leaving together all those that result in the majority of physical I/Os.

Yes. we’ve reduced contention and I/O demands on the initial tablespace but as whole, we haven’t done it very well at all. Yes, we’ve reduced contention and perhaps improved performance, but we could have done it so much better. Yes, it appears separating indexes from tables has improved performance but has it really …

It’s not the separating of indexes from tables that’s improved performance, it’s the fact we’ve introduced 4 new disks and we’ve shifted some of the I/O activity away from the initial tablespace.

The fly isn’t deaf after all …

As an example, previously we had 100% of related I/O activity in the initial table/index, 4 disk tablespace. However, only (say) 20% of the activity was actually related to the indexes, 80% was attributed to all the tables. By moving all the indexes into the new, 4 disk index only tablespace, we therefore reduce the load on the initial tablespace by 20%. We now have 80% of the I/O load on 4 disks and just 20% on the other 4 disks. Yes, performance might improve as a result but we could do so much better. Currently, 4 of the disks have 20% of all segment related load on them and the other 4 disks have just 5% of all associated load.

Instead, if only we either added the 4 disks to the other 4 disk set and striped both tables and indexes across all 8 disks or moved and distributed both indexes and tables into the new 4 disk set, we might have been able to distribute load much more evenly across all 8 disks with approximately 12.5% load across each one.

By doing so, we may have improved performance by an even better and more significant amount. Conversely, by separating indexes into their own tablespace, we may actually be hurting general database performance because database performance is not optimal due to the uneven distribution of I/O related activities.

Of course, there’s a very easy way to confirm this. Look at the statistics in V$FILESTAT or look at a statspack report and carefully study the physical I/O activity in the table only and index only tablespaces and compare the results. Just how evenly distirubuted are the I/O related workloads …

Yes, there are scenarios where distributing individual segments here or there may be beneficial but the overall objective is generally to try and even out disk/spindle workloads as much as possible. Separating all indexes blindly is typically a very poor method of trying to achieve this.

If an individual query is not likely to improve by having an index in a separate tablespace and if separating indexes results in a non-uniform distribution of physical I/O activity, then you may want to start questioning whether it’s all really worth it.

Of course, database recoveries will be simplified by having indexes in their own tablespace, right ?. Ummm, I’ll tackle that myth next …

Share this:

Like this:

Related

You end with “Ummm, I’ll tackle that myth next …” – I start to believe that the fact we see so many Oracle related myths is somehow (mythically?) related to the name of this RDBMS. 🙂

A more scientific explanation would probably point to the evolution of the product, improved documentation and especially knowledge and management tools that allow introspection where in the old days (I assume) there was just a black box (probably with few peepholes). I cannot reliably speak about these days though as 8.1.7 was the earliest release I got my hands on…

Ah, yes. Right, actually I read that post of yours but somehow forgot about it. Thanks for reminding me!

Do you agree that one factor that helped make the wrongs persistent was the greater difficulty to inspect Oracle behavior in the “old days”? As I said, I cannot witness myself but I have seen management tools improve (even from 8 to 10) and I believe Cary Millsap and Jeff Holt mention some deficiencies (measurements as well as documentation IIRC) of earlier version’s tracing in “Optimizing Oracle Performance”.

Yes and no. Yes I think there’s certainly more detailed information that one can suck out of Oracle these days from an instrumentation perspective and it’s somewhat easier to do so.

And kinda no because I think the major change over the last 10 years or so is that there’s so much more information out there and with the web and so forth, I think when things get stated incorrectly, it’s just so much easier and there’s so many more people who can now instantly respond, “no, that’s crap” 😉

Richard,
I’m a little confused. How do you get that 80% of the i/o is dedicated to the table? From your part I post it looked like the table i/o was 50% of the i/o, (index, table, index, table). Could you please clarify?
Thank you,
Claudia

Yes it does kinda flip/flop between the index and the table but in the case of the index, it’s almost certainly the same block we go back to again and again and again as we read all the index entries from a leaf block (and hence does not generate any additional physical I/Os)

In the case of the table it’s a different table block, followed by yet another different table block and another and another, unless the index is really well clustered, which means the I/Os are much more likely to be physical I/Os.

So the table I/Os are much much more likely to be physical I/Os than the index I/Os and it’s the amount of physical I/Os that’s potentially relevant when we talk about separating indexes from tables.

what about performance of updates? i never assumed different tablespaces had anything to do with simple queries. But can they help with updates? I’m not saying they can…but can they? Different tablespaces must manage their allocation of new space separately, don’t they? Is that a benefit?

Not really IMHO. Updates are performed in a serialised manner as well and although an update to row could update multiple indexes (or none at all), and although a table might need to allocate a new extent whereas an index might not (or visa versa), again in a multi user environment with lots of users updating lots of different tables and lots of different indexes (or different rows in the same tables and indexes), the net benefit of separating indexes from tables will be “disappointing”.

[…] “simplifies administration and maintenance” – it would have been helpful for the authors to provide supporting evidence for this claim (pages 9-12). In recipe 2-2 on page 51, the authors suggests that backup and […]