Indexes In Their Own Tablespace: Availabilty Advantages (Is There Anybody Out There?) April 28, 2008

I’ve already discussed here some reasons why performance is not particularly improved by simply separating indexes from tables and storing them in a different tablespace. It sounds like it might be helpful from a performance perspective but when one digs down a little, it turns out any so-called performance benefits are questionable to say the least.

However, performance is only one reason why it’s claimed storing indexes in their own tablespace is beneficial. There are wondrous advantages to database availability and database recovery options if only indexes are stored in their own tablespaces. The loss of all indexes due to a catastrophic disaster in the index tablespace means that the database tables themselves are all still potentially available.

This sounds as if there might be a number of advantages with this strategy, right ?

Well it means for a start that none of the “real” data has been lost. If we store indexes away from the parent tables and we only lose the index tablespace, the table tablespace could possible be totally unaffected by this loss. This potentially suggests a number of things:

The Database will still be functional. Yes it might run a little slower without indexes but at least with the tables still available, we can still perform our business critical operations until the indexes have been fixed as the actual tables are unaffected

We don’t actually have to perform a database recovery to get us out of this mess. So long as all the tables are still available, we can simply recover the situation by rebuilding all the indexes from the problematic tablespace. This will hopefully be more simplistic, more efficient and most importantly faster than having to perform an actual database recovery

This all sounds perfectly reasonable …

Today, I’m just going to briefly mentioned some thoughts on the first point, the second point I’ll discuss another day.

I guess the key question here (pun fully intended) is just how important and vital are indexes to the “normal” operation of a database? Is a database effectively operational if we were to lose all our indexes, is an application still effective and operational if we were to lose all indexes belonging to the application? If by storing indexes in their own tablespace, do we get availability benefits if we were to lose only the index related tablespace?

All good questions to ask and ponder about every now and then.

Let’s be clear I’m not discussing the loss or corruption of a single (or handful) of indexes. If a specific index gets corrupted for whatever reason, yes we could recover the index by (say) making the index unusable and rebuilding the index. However, we can do this whether the specific problematic index in question was stored with or separate from the parent table so the scenario doesn’t really differ much.

No, one of the (so-called) benefits of storing indexes in their own tablespace is that if we have a catastrophic issue with the index tablespace, we only lose a whole bunch of indexes. No tables are impacted, just all the indexes stored in the tablespace. However, just how well will business critical operations function without indexes in our database …

The suggestion is that things will just be a lot slower. We’ll have lots of Full Table Scans where previously we had nice efficient index related scans, but at least data can be viewed and manipulated as the actual tables themselves will still be available. Yes things will be slower and less than ideal but better than if we had stored tables and indexes together because in this scenario we would have lost both indexes and tables making the database effectively useless until recovered.

Well let’s setup a really simple scenario and see how things fair without indexes …

First, we create a simple little “parent” test table and populate it with a few rows:

SQL> create table bowie_1 (id number, name varchar2(20));

Table created.

SQL> insert into bowie_1 values (1, ‘Bowie’);

1 row created.

SQL> insert into bowie_1 values (2, ‘Ziggy’);

1 row created.

SQL> insert into bowie_1 values (3, ‘Floyd’);

1 row created.

SQL> commit;

Commit complete.

Next, we create a simple little “child” table and populate it with a few rows:

SQL> create table bowie_2 (id number, fk_value number);

Table created.

SQL> insert into bowie_2 values (1,1);

1 row created.

SQL> insert into bowie_2 values (2,1);

1 row created.

SQL> insert into bowie_2 values (3,2);

1 row created.

SQL> insert into bowie_2 values (4,3);

1 row created.

SQL> insert into bowie_2 values (5,3);

1 row created.

SQL> commit;

Commit complete.

We now add a Primary Key to the parent table which will create for us an index. Note this is the only index in this demonstration which is stored in a separate tablespace to the table:

Finally, we take the index tablespace offline to simulate a problematic index related tablespace:

SQL> alter tablespace users offline;

Tablespace altered.

OK, the setup is now complete. Let’s see what life is like without our poor little index. First, let’s perform a simple query on our parent table. I’ve hinted the query to make the CBO use the index which the CBO is of course likely to do with most of our queries on most of our tables (and if the CBO doesn’t want to use the index for a specific query, the loss of an index is not going to be an issue then anyways):

Now the error one may get if the index was simply corrupted or if there’s a problem or corruption at the hardware level may differ but the overall ramification will be the same. Queries that the CBO deems should use a “problematic” index will simply fall over. This is not exactly a good thing from an availability perspective …

Oh yeah, that’s right. We have an index that also needs to be inserted as well. Not just any index mind you, but an index that is used to police the uniqueness of the associated PK constraint. Yes, if the problem was at the hardware level, the error message will differ but the ramifications will be the same. We will not be able to insert into the table unless the index is dropped and we can’t drop the index unless the PK constraint is dropped as well.

How about an insert into the other table that doesn’t even have an index:

Oh for goodness sake, what now !! Well the table has a FK that points to the table with the problematic index and we need to check to ensure the FK value actually exists in the parent table. How do we perform such a check, why by using the index on the PK column of course and yep, the index can’t currently be used. So unless we drop the FK constraint, we’re stuffed here as well …

Perhaps life isn’t so sweet without these indexes after all …

What if we make the index unusable first rather than it be simply “unavailable” or “damaged” for whatever reason:

We can at least now make our queries run without the use of any problematic indexes:

SQL> select /*+ index */ * from bowie_1 where id = 1;

ID NAME
———- ——————–
1 Bowie

If this table contained 100M rows, it might of course take a long long long time and if we had too many users performing too many Full Table Scans, the entire database might of course scream to a thudding halt, but yes at least we’ve now got our queries working to the point of ignoring unusable indexes.

But is a database (or application or part thereof) that performs nothing but Full Table Scans really a scenario we want to be in? Does this really help to justify the separating of indexes from our tables ? Hummm, not sure about that one …

What about our DML operations now the index is unusable, do these at least function to some degree ?

SQL> insert into bowie_1 values (4, ‘IGGY’);
insert into bowie_1 values (4, ‘IGGY’)
*
ERROR at line 1:
ORA-01502: index ‘BOWIE.BOWIE_1_PK’ or partition of such index is in unusable state

That’s a no for our first parent table example …

SQL> insert into bowie_2 values (6, 1);
insert into bowie_2 values (6, 1)
*
ERROR at line 1:
ORA-01502: index ‘BOWIE.BOWIE_1_PK’ or partition of such index is in unusable state

And that’s also a no for our child, FK table example. Oracle still needs to use the problematic PK related index to police the value in our FK column.

So what kind of database environment are we left with when the indexes from our index only tablespace becomes problematic, even with all our tables totally intact.

Well, until we make the indexes unusable, all index related queries will be falling over all over the place with database related errors. Once we go through a process of identifying all problematic indexes and making them all unusable, we’re left with a database environment that’s performing Full Table Scans all over the place. Just imagine how long it’ll now take to find the customer details of that 10G table. Just imagine the user experience on the database when that 6 table join query can only be performed with Full Table Scans. Just imagine your user concurrent activity with no associated indexes available …

The good news of course is that the tables will at least get no bigger as all inserts will fail, all deletes will fail and many of the updates will fail, except on all those tables that have no Primary Key and no Unique Key and no Foreign Key. Ummm, just how many tables do you have that have no PK or UK or FK constraint ? Not many right …

Losing an index only tablespace would be a catastrophic event, one that would ruin the day of not only the poor DBA having to recover from such a scenario but also any poor user needing to access an impacted application.

One might even argue things could be better if a tablespace containing both tables and indexes was lost if it resulted in another tablespace containing other tables and indexes still being available as at least some table/indexes would be accessible and usable in a viable manner.

Regardless, in either scenario, the database/tablespace/datafile would need to be recovered ASAP to stop user complaints flooding the help desk.

Of course having indexes in their own tablespace will help us recover from such a catastrophic scenario in a more simplistic, efficient and ultimately faster manner, right ?

Share this:

Like this:

Related

While reasoning about the “separating indexes” topic throughout the last weeks, one thought occurred to me that is applicable to a lot other Oracle myths as well, I believe.

The reasoning goes like this: with a product as mature as Oracle (“mature” measured in versions and number of years it has been around as well as support organization etc.) *if* doing X would make sense *then* it’s highly likely that the product will – in some form or another – support doing X (e.g. by doing it automatically, making it easier…). The absence of any such support should spark scepticism about X actually making sense.

Applied to index tablespace separation these could be examples of the mentioned support

a) system database ships with two tablespaces one for data and one for indexes

b) there is a default tablespace “users_index” and newly created indexes without explicit tablespace clause go there by default

c) tablespaces have a flag indicating “data” tablespace or “index” tablespace, trying to place an entity in the wrong tablespace type results in an error

One could probably dream up a lot more like these. Since we’re not seeing them we might as well question the original thesis that separating indexes from tables does improve performance.

Richard,
You have very convincingly shown that there is no particular advantage in performance terms to putting indexes in a separate tablespace.

But I have users who want to separate them because that is what they are comfortable with. Therefore, is there a DISADVANTAGE in separating tables and indexes into separate tablespaces? The advantage of separating them, that you haven’t mentioned, is that it saves me a lot of arguments.
Thanks,
Claudia

Possible Disadvantage : If you put indexes and tables in different tablespaces, where do you put your index-organized tables ? Do you refuse to use them because they don’t “fit”, or you have to come up with some reason to justify them in one place or the other.

The biggest disadvantage is what I mentioned in one of my earlier pieces, in that by separating indexes and hence leaving all the tables grouped together, one runs the real risk of not distributing I/O evenly across available disks/spindles.

If you group together and separate objects that generally have much fewer physical I/Os (the indexes), the table tablespace is quite likely to have the vast majority of disk related I/Os and hence experience possible disk contention. A statspack report will quickly reveal if this is a problem and whether this might actually be impacting general performance.

Other possible problems include having uneven amounts of free disk available (ooooh, we can’t use all that free disk over there, that’s just for indexes), availability issues, (damn, we’ve just lost every single index in all our applications, if only we had lost just the tables and indexes of just this application or that subset of our application) and also impact of problematic queries.

There are often two types of “nasty” queries, those which are terribly expensive and that the CBO has just got plain wrong. Those that use a FTS incorrectly and those that use an index incorrectly. In the case of a FTS, all the I/Os are related to the table tablespace and so the table tablespace can become a hotspot as a result, the index tablespace sits there relatively idle. In the case of an index scan, again, the vast vast majority of I/Os are typically table related and so again, it’s the table tablespace that suffers.

I was assuming that you’d be getting round to the other extreme on the I/O balancing bit, but after this comment I’m not sure that it’s going to appear.

Don’t forget that if you have a lot of indexes on a single table, inserting a couple of hundred rows in the table could result in a couple of table block writes but hundreds of phyiscal writes and reads of index leaf blocks.

Personally I separate indexes from tables only for the benefits of oversight: indexes and tables are usually a little different in their activity patterns and in their sizing – so grouping objects by characteristic behaviour gives you an overview that is a little finer in detail than “the system”, without attempting to go to an insanely detailed level.

That’s of course true although the benefits of separating indexes from tables in this scenario is likewise debatable as now it’s the index tablespace that’s hot during such insert activity.

This also can potentially apply to deletes and updates as well, if for example the updated column features in numerous indexes.

However, the key point is that a specific row is inserted once, maybe updated very occasionally and maybe deleted once.

But the row can be read over and over again (until it’s deleted). So write activity may be roughly comparable but the read activity is typically much much higher in table only tablespaces.

Indeed, just one large range scan can potentially tip the balance of physical I/Os the other way. A moderate number of range scans and we have the unbalanced scenario I described.

This is also why I pulled a figure out of the air of 20% of index related I/Os and not 0.5% as my previous example may have suggested as yes there can also be activity where the balance of index blocks read can be higher such as DMLs and the other classic example of the humble unique scan which can potentially feature more index blocks than table blocks.

However as I mentioned, simply go back to a statspack report or some such and simply see what ratio of physical read/write I/Os are actually index related and what ratio are table related. For the number of disks, what average loads do the index disks have if physically separated and what average loads do the table disks have.

The issue I have is that in the vast majority of cases, when indexes are physically (not so much logically) separately in this manner, the I/O distribution is very uneven which is ultimately unhelpful, or in some cases problematic.

The advantages of separating indexes are often overstated or plain incorrect and the disadvantages are often understated or ignored. Just trying to make people consider the implications of such a strategy a bit more here🙂

I was really only making the point that the “separate indexes and tables for performance” line has another extreme case where you can cause a problem by following the strategy slavishly without thinking about the differences between indexes and tables.

My extreme happens to be more relevant to the types of scenario that crop up in overnight loads – your extreme is geared more towards the day time OLTP activity: which means that it’s easy to lose at both ends.

The first definition on Dictionary.com “a traditional or legendary story, usually concerning some being or hero or event, with or without a determinable basis of fact or a natural explanation, esp. one that is concerned with deities or demigods and explains some practice, rite, or phenomenon of nature.” has nothing to do with the veracity of the statement, but rather, its source.

This is quite apropos for Oracle Myths, as that is what they are, traditional or legendary stories.

Many people use the word myth to take a superiority over the matter (scientists who challenge myths) but others do not (MythBusters, the television show).