Sunday, July 11, 2010

Compression Restrictions

When using table or index compression certain restrictions apply. Since I find it always hard to gather that information from the manuals and also some of the restrictions are not documented properly or not at all, I'll attempt to summarize the restrictions that I'm aware of here:

1. Compression attribute on subpartitionsNote that you can't define the COMPRESSION on subpartition level - the subpartitions inherit this attribute from the parent partition.

It is however perfectly valid and legal to exchange a subpartition with a compressed table and that way introduce individual compression on subpartition level. It is unclear why this is only possible using this indirect method - it simply looks like a implementation restriction. Since compressed data is most suitable for data warehouse environments and these ought to use EXCHANGE [SUB]PARTITION technology anyway this restriction is probably not that crucial.

2. Number of columnsBasic heap table compression and the new OLTP (in 11.1 called "for all operations") compression are both limited to 255 columns. If you attempt to compress a table with more than 255 columns then no compression will be performed although you don't get an error message. Interestingly the COMPRESSION column in the dictionary still shows "ENABLED" which is certainly misleading.

3. DDL Restrictions - modification of table structureAs soon as a segment contains compressed data or has been marked for compression (does not apply in all cases, see below for more information), the following restrictions regarding the modification of the table structure apply:

a. Basic heap table compression will not prevent the addition of more than 255 columns via ADD COLUMN - but if you update the segment the affected blocks will be silently decompressed during the update operation. In 9.2 no schema evolution is possible with compressed segments, see the next paragraph.

b. In Oracle 9.2 a lot of modifications to the table are no longer possible with compression enabled (and this applies to partitioned tables as well if at least one partition is compressed):

- You cannot add or drop any columns, and you even can't set any column unused. The error messages are to say at least confusing, since it tells you something about virtual and object columns, but in fact it is about the enabled compression. These are serious limitations in terms of schema evolution. I'll address below what can be done to overcome this issue.

c. In Oracle 10.2 some restrictions have been lifted but there are still a couple of limitations:

- You can now add columns and you can set columns unused but you can't drop columns nor can you add columns with default values

This allows schema evolution but has interesting side effects if you attempt to use for example a Create Table As Select (CTAS) statement to dynamically create a copy of a table and attempt to perform an EXCHANGE PARTITION operation with that newly created table: If there are unused columns in the source table then these won't be copied over to the table created via CTAS and therefore the EXCHANGE PARTITION operation will fail with column mismatch error messages. Since you can't drop the unused columns as long as there is compression enabled on the table (more on that later since it is a bit more complex with partitioned tables), you effectively can't use that approach. In order to perform EXCHANGE PARTITION with compressed tables and evolved schemas that include unused columns you have basically three choices:

- Maintain the "EXCHANGE" table along with the target table which means apply the same DDL in the same order to both tables in order to keep them synchronized

- Extract the hidden columns from the dictionary and create a exchange table with the correct hidden/unused columns which can turn into a complex operation if you need to cover all possible column data types and features including LOBs etc.

- Perform some kind of rebuild operation on the target table to get rid of the unused columns. Some ideas regarding this point are going to follow further below

d. Oracle 11 introduces Advanced Compression which is COMPRESS FOR ALL OPERATIONS in 11.1 or OLTP compression in 11.2 (don't confuse this with HCC compression levels)

Basic compression in 11.1 has the same limitations as 10.2, but with advanced compression (called COMPRESS FOR ALL OPERATIONS in 11.1) all mentioned schema evolution restrictions have been lifted as it can be seen below.

This is more of theoretical nature since you usually won't mix basic with advanced compression, bit things become more interesting when mixing basic and advanced compression in different partitions of the same table:

Notice the inconsistent behaviour: The first DDL command after having added a partition with advanced compression succeeds, but the following DDLs fail with an error message.

If the order of these DDLs is changed, then it becomes obvious that always the first DDL is successful no matter which of those failing otherwise comes first.

Furthermore in 11.1.0.7 if you have a mixture of Advanced Compression (COMPRESS FOR ALL OPERATIONS) and NOCOMPRESS partitions still above restrictions apply - only if you have set all partitions to COMPRESS FOR ALL OPERATIONS then the restrictions are lifted which looks more like a bug than a feature.

I'll go into more details below which settings influence this behaviour.

e. Oracle 11.2 introduces Hybrid Columnar Compression (HCC) which is only enabled in conjunction with ExaData

In 11.2 basic compression still has the same restrictions as already mentioned above, and if at least one partition of a partitioned table has advanced compression enabled then some of restrictions are lifted - even with a mixture of basic and advanced compression or a mixture of no compression and advanced compression, however adding a column with a default value is still only supported if all partitions are compressed with Advanced Compression. I'm not sure if this is really a feature since at least mixing Advanced Compression with no compression adding a column with default value should be possible.

When using HCC the most important difference is that it is not restricted to 255 columns - it successfully compresses segments with any number of columns (up to the hard limit of 1,000 columns).

HCC can only be used with direct-path operations, any conventional DML will decompress the affected blocks and use OLTP compression instead - which means that in case the segment has more than 255 columns the compression will effectively be disabled since any non-HCC compression supports only up to 255 columns.

HCC has only a few restrictions regarding schema evolution - the most noticeable one is that DROP COLUMN is always turned into a SET COLUMN UNUSED, so with HCC enabled you can't drop columns since DROP UNUSED COLUMNS is not supported and raises an error. This means that dropping / adding columns works towards the hard limit of 1,000 columns with HCC enabled.

Again this has the same side effect as described above - a simple CTAS operation to create a table to be exchanged with won't work any longer with unused columns.

As you can see dropping columns is only supported with HCC disabled which allows in this particular case to add another column after dropping the hidden/unused one.

- Index restrictions

* You cannot compress a table or a partition of table (or exchange a compressed partition into it) if there are usable bitmap indexes defined on the table due to the different Hakan factor that is reset when using by the compression - which means that with compressed segments more rows will fit into a single block leading to a different Hakan factor used for the bitmap indexes. The bitmap indexes need to be set unused (all partitions in case of a partitioned table/index) or dropped before enabling the compression and rebuild / recreated after compressing the heap segment. This is also officially documented. It is however interesting to note that this restriction is only enforced when switching from uncompressed to compressed segments / partitions. Changing the compression for example from basic compression to HCC compression is possible without hitting this restriction but due to the different compression levels in my opinion this potentially leads again to significantly different Hakan factors - so actively resetting the Hakan Factor using "ALTER TABLE ... MINIMIZE RECORDS_PER_BLOCK" and rebuilding the bitmap indexes might be in order when moving between different compression levels.

As you can see switching on the compression is only allowed if all partitions of the bitmap index(es) are unusable - the opposite is not true: You can uncompress afterwards without any restrictions.

* This is related to B*Tree index compression and again more of theoretical nature but might become relevant when trying to introduce B*Tree index compression partition-wise via EXCHANGE PARTITION: You can't compress a single partition of a partitioned B*Tree index (or exchange a partition with a compressed index into the table) if the whole index has been not initially created with compression enabled. If you drop the index and recreate the whole index with compression enabled then you can turn the index compression individually on and off per index partition.

Very likely this is related to the restriction that the B*Tree index compression prefix length needs to be the same across all index partitions - this can only be defined on global level and I assume this is the reason why the index needs to be created compressed first. Since you usually want to have all partitions of an partitioned index to be compressed the same way this is probably a rather irrelevant restriction (except for the mentioned case regarding EXCHANGE PARTITION)

As already mentioned above there are different options how to overcome schema evolution restrictions. One obvious option, although very likely not applicable in many cases simply due to space and load constraints, is to temporarily uncompress the segments and to disable the compression. It is interesting to note that it is not as simple as it seems to undo compression with partitioned tables as I will demonstrate - I think I've even read somewhere that it is not possible at all to revert a table into a state where the mentioned restrictions no longer apply. This is definitely not true.

First of all it is important to note that in addition to the obvious decompression of each partition that contains compressed data the "COMPRESS" attribute that determines if a suitable operation will generate compressed data or not needs to be reset to NOCOMPRESS (but see below for "oddities"). So there is a significant difference between "ALTER ... [NO]COMPRESS" and "ALTER ... MOVE [NO]COMPRESS". The latter reorganizes the segment / partition and compresses/uncompresses the data while doing so, the former just "marks" the segment / partition as "enabled/disabled for compression" but doesn't touch the data and leaves it in whatever state it is at present. As as side note, the "COMPRESSION" column in the dictionary can not be used to distinguish between these two - you can have the column show "DISABLED" with data still compressed, and you can have the column show "ENABLED" with no compressed data and you can end up with a mixture of both - not every block of a segment is necessarily in a compressed state.

Furthermore the "COMPRESSION" attribute can be defined on multiple levels with partitioned tables:

Now in order to be able to overcome the schema evolution restrictions all levels except the partition default attribute level need to be addressed: Any compressed data needs to be uncompressed, but also any partition that has been marked for compression (but doesn't necessarily contain compressed data) needs to be modified as well - finally the global table level also needs to be reset.

The setting on global table level seems to be the crucial step. It looks like Oracle checks during this operation (ALTER TABLE ... NOCOMPRESS) if all subordinate (sub-)partitions have a flag set that marks them as uncompressed. Only if that is the case some internal flag on global table level can also be reset so that the restrictions no longer apply.

There are however some oddities that need to be considered when doing so:

- You need to "move" any partition (ALTER TABLE ... MOVE PARTITION NOCOMPRESS) that has marked for compression if any other partition actually held compressed data - simply "unmarking" it using "ALTER TABLE ... MODIFY PARTITION NOCOMPRESS" is not sufficient even it does not hold any compressed data (for example, has only be marked with "ALTER TABLE ... MODIFY PARTITION COMPRESS").

- If you have BITMAP INDEXES defined on the table with compressed partitions you need to either set the whole indexes unusable or drop them before the COMPRESSION can be effectively disabled on global table level using ALTER TABLE ... NOCOMPRESS