September 30, 2011

Table rebuilds

Here’s a question from the OTN forum that I’ve seen fairly frequently in the last couple of years:

I have a table abc (c_date date, c_id varchar2(20), c_name varchar2). The table is already range partitioned on the “c_date” column and I have a lot of data in that table. Now my task is to create hash partitions under the range partitions; the hash partitioning needs 16 partitions. I don’t want to create another table with range and hash partitions and migrate the data. The amount of data I have is 3 terabytes.

The statement of requirements is a little ambiguous in this case – does the OP want a table where new partitions should be hash partitioned, or does he want the entire data set to be range/hash partitioned? If it’s the latter then he will have to do something that rewrites (and re-indexes) the entire data set; if the former then he can run through a process that does a small amount of data dictionary manipulation to get to the required state. Here’s a quick outline that I hacked together in a few minutes between sessions (okay, I’ll admit it also needed a few minutes during a session) at the ACED briefing.

- Create an empty clone of the table, but introduce hash subpartitioning with a single subpartition per partition.
- Create a pair of ‘staging’ clones of the table – the first should be a non-partitioned table, the second should be a hash-partitioned table with a single partition.
- For each partition in the original table:
    - exchange the partition with the first staging table
    - exchange the first staging table with the single partition of the second staging table
    - exchange the second staging table with the target partition of the final table
- Patch some statistics that can’t otherwise be created in all the exchanges:
    - copy the hash subpartition stats up to the range partition
    - copy the original table-level stats to the new table
- Change the hash partition default, drop the staging tables and the old table, and rename the new table.

Here’s a code demonstration – but it doesn’t include the stats patching, and it’s not a model you should use exactly as it is on a production system.
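Since the original demonstration script isn’t reproduced here, the following is a hedged reconstruction of the strategy – the table, column, and partition names are invented, and this is a sketch of the three-way exchange rather than the original code:

```sql
rem  Hypothetical names throughout -- a sketch, not the original demo.

create table pt_range (
        c_date  date,
        c_id    varchar2(20),
        c_name  varchar2(40)
)
partition by range (c_date) (
        partition p2011_08 values less than (to_date('01-Sep-2011','dd-Mon-yyyy')),
        partition p2011_09 values less than (to_date('01-Oct-2011','dd-Mon-yyyy'))
);

rem  Empty clone -- range/hash, one hash subpartition per partition

create table pt_range_hash (
        c_date  date,
        c_id    varchar2(20),
        c_name  varchar2(40)
)
partition by range (c_date)
subpartition by hash (c_id)
subpartition template (subpartition s1)
(
        partition p2011_08 values less than (to_date('01-Sep-2011','dd-Mon-yyyy')),
        partition p2011_09 values less than (to_date('01-Oct-2011','dd-Mon-yyyy'))
);

rem  Staging tables: one simple, one hash-partitioned with a single partition

create table stage_1 (
        c_date  date,
        c_id    varchar2(20),
        c_name  varchar2(40)
);

create table stage_2 (
        c_date  date,
        c_id    varchar2(20),
        c_name  varchar2(40)
)
partition by hash (c_id) (partition s1);

rem  The three-way exchange, repeated for each partition of the original

alter table pt_range      exchange partition p2011_08 with table stage_1;
alter table stage_2       exchange partition s1       with table stage_1;
alter table pt_range_hash exchange partition p2011_08 with table stage_2;
```

Each pass of the loop moves only data dictionary pointers: the segment that started life as a range partition ends up as the single hash subpartition of the matching partition in the new table.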

You’ll notice that I’ve matched the partition names of the full clone using a very simple subpartition name for the one hash subpartition, and repeated that simple name for one partition of the hash-partitioned table. This isn’t absolutely necessary, but I think it makes the next process simpler to understand.

When this block completes your data hasn’t moved and hasn’t been copied, but it is now in a range/hash partitioned table. The statistics that used to exist at the partition level on the original table will now be on the new table at the subpartition level, and the new table won’t have partition-level or table-level stats. (I may find time to write part 2 to this note with code that fixes the stats – but any reader who wants to have a shot at it instead is welcome to supply the code).
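As a starting point for that “part 2”, here’s a hedged sketch of copying one subpartition’s basic stats up to its enclosing range partition with dbms_stats – the table and (sub)partition names are assumptions, and the column and index statistics would need the same treatment:

```sql
declare
        m_numrows       number;
        m_numblks       number;
        m_avgrlen       number;
begin
        --  read the stats that arrived at the subpartition level
        dbms_stats.get_table_stats(
                ownname  => user,
                tabname  => 'PT_RANGE_HASH',
                partname => 'P2011_08_S1',      -- hypothetical subpartition name
                numrows  => m_numrows,
                numblks  => m_numblks,
                avgrlen  => m_avgrlen
        );

        --  write them to the enclosing range partition
        dbms_stats.set_table_stats(
                ownname  => user,
                tabname  => 'PT_RANGE_HASH',
                partname => 'P2011_08',
                numrows  => m_numrows,
                numblks  => m_numblks,
                avgrlen  => m_avgrlen
        );
end;
/
```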

Now we can change the default behaviour of the new table to specify the number of hash partitions for future partitions:

alter table pt_range_hash
set subpartition template 16
;
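With the template in place, a quick (hypothetical) check is to add a new partition and confirm that it arrives with 16 hash subpartitions:

```sql
alter table pt_range_hash
        add partition p2011_10
        values less than (to_date('01-Nov-2011','dd-Mon-yyyy'))
;

select  partition_name, count(*) subpartition_count
from    user_tab_subpartitions
where   table_name = 'PT_RANGE_HASH'
group by
        partition_name
order by
        partition_name
;
```

The pre-existing partitions keep their single subpartition; only partitions added after the template change pick up the 16-way hash split.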

NOTE: I did say that this is a model that you shouldn’t follow in a production system. The main reason for this is that if something does go wrong in the loop then you could end up with partitions switching around into the wrong place in the wrong table. (And you definitely don’t want to drop any objects until you’ve checked them all!) Realistically you would probably transfer one partition at a time and check the results after each transfer. Bear in mind, also, that this was just a little hack put together very quickly. Even though the general strategy is sound I may have overlooked some critical detail and, obviously, there are some variations on the theme (IOTs, for example) where you would need to work out some code changes. There’s also an implicit assumption that the only indexes on the table are local indexes – if there are any global, or globally partitioned, indexes then the code has to be modified to allow for them, and the time for the various exchanges will be affected by the choice of index maintenance options.


An interesting approach. The use of dbms_redefinition in production is popular for near-zero-downtime situations where a table restructuring (or re-indexing) is a requirement and where the space exists to create the new object. With the synchronising process, it makes for a very smooth way of making changes.

I would guess that the volume of undo and redo generated was a problem, not to mention the work done in creating new indexes. Nologging operations could help, of course, so long as you back up the tablespaces before you do the final switch. Then there’s always the question of how much time it takes for the final bit of journalling to be applied – I’ve seen a couple of reports of the tables being locked for a couple of minutes at that step (though I’ve not been able to reproduce the problem; I think it’s only likely to happen at large volumes and high throughput).

Yes, undo and redo were a huge problem. And the final journal apply locked the table not for minutes but for a few hours (the table was quite heavily used for most of the day). However, I have never tried DBMS_REDEFINITION in 11gR2 so I don’t know whether things have changed since my last try.

That’s a particularly interesting observation. Which version was that, and have you seen anything that looks like it might be a patch related to the lock time?
The manuals suggest running the resynchronisation routine more than once, of course, but I guess you probably did that anyway.
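The resynchronisation the manuals describe can be sketched along these lines – a hedged illustration, with hypothetical owner and table names:

```sql
rem  Repeatable: each call applies the journalled changes accumulated
rem  since the last sync, shrinking the work left for the final switch.

begin
        dbms_redefinition.sync_interim_table(
                uname      => user,
                orig_table => 'BIG_TABLE',          -- hypothetical
                int_table  => 'BIG_TABLE_INTERIM'   -- hypothetical
        );
end;
/
```

Calling it shortly before dbms_redefinition.finish_redef_table keeps the final locked interval as short as possible.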

Sorry about the nitpick, but I got a bit confused by “the former” and “the latter” cases – I think they need to swap places. “The former” (the first-mentioned) case is the one where “new partitions should be hash partitioned”, and that’s what your process appears to accomplish, unless I got irreparably confused.
Cheers,
Flado

We have used that approach several times and it worked. The only problem is that ALTER TABLE EXCHANGE (SUB)PARTITION still does not work with primary key/unique constraints in a satisfactory manner (one has to use the WITH VALIDATION syntax when a PK is in place). It is quite unbelievable, since the problem has not been fixed by Oracle Development since Oracle 8.
So for such a large table (2TB), the process could take some time. But I would still favour it instead of DBMS_REDEFINITION.

Thanks for the comment – it’s a point I should have raised myself (and one I highlighted, I think, in Practical Oracle 8i): the exchange may be faster with unique (or PK) constraints in place if you choose the without validation option.
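For illustration, the without validation variant of the exchange looks like this – the table and partition names here are hypothetical, and you carry the responsibility of guaranteeing that the incoming data really does belong in the partition:

```sql
alter table pt_range
        exchange partition p2011_08
        with table stage_1
        including indexes
        without validation
;
```

Without validation skips the row-by-row check that each row belongs in the target partition, which is where most of the exchange time can go on a large segment.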

For those who never bought a copy, here’s a link to an article on my old website explaining the anomaly. (The link is to a page introducing a Word document, and contains an addendum to that document.)

It was our first big investigation in Oracle 8i, in ’98 or ’99 :) I didn’t like those times, the “dark age” times, and I’m so happy that the situation is much better nowadays. I can read excellent books and excellent blogs (like yours, Tanel’s, etc.), so it takes definitely less time to keep myself informed and to use my own tools to watch Oracle kernel behaviour.