TECHNOLOGY: Partitioning

Partition Decisions

By Arup Nanda

Understand your partition choices, and choose the best partition and key for your data.

The partitioning feature of Oracle Database enables you to partition stored data segments such as tables and indexes for easier management and improved performance. Oracle Database provides many types of partitioning options, including range, list, hash, range/list, and range/hash. The real challenge for database designers is not creating the partitioned objects; rather, it's the decision that precedes it—what type of partitioning to use and what column to use as a partition key. This article describes how to choose the best partitioning strategy, including the best partition type and best partition key column.

A Quick Primer

Partitioning, in a nutshell, stores a data segment such as a table as multiple segments while retaining a logically monolithic structure. The most popular partitioning option israngepartitioning, with which you define a range of values for each partition. For example, in a table of customer transactions called TRANS, you can range-partition it by using the TRANS_DT (transaction date) column as the partition key so that a first partition holds records in which the TRANS_DT value is between January 1 and March 31, 2005; the second partition holds records in which the TRANS_DT value is between April 1 and June 30; and so on. Listing 1 creates this table.

After you've created the TRANS table, you will see—if you query the DBA_TABLES dictionary view—that the PARTITIONED column value is YES. You can get further details about the partitioning, such as the type of partitioning scheme and the number of partitions, from the DBA_PART_TABLES view. You can get the details of each partition, such as the name and the upper boundary of the partition, from the DBA_TAB_PARTITIONS view, as shown in Listing 2. Note that this view shows the high value of a partition, which is actually the lower boundary of the next partition. A partition holds records up to but not including its high value. For instance, the upper boundary of the Y05Q2 partition is "2005-07-01 00:00:00", meaning this partition contains values up to "2005-06-30 23:59:59". Records with TRANS_DT values of 2005-07-01 00:00:00 and later will be stored in the next partition-—Y05Q3.

In the TRANS table example, you could find a column that can be grouped into ranges of values such as dates, but it may not be possible to group all columns that way. For example, a column holding names of U.S. states contains a finite and small number of values. This type of column calls forlistpartitioning, in which the partitions hold discrete values instead of ranges. Here is an example:

Sometimes it may not be possible to define the ranges of—or a set of known values for—each partition. A typical example is a table, CUST, containing customers, with the CUST_ID column as the primary key. The CUST_ID value is an incrementally increasing but rather meaningless number, so a range partition based on these values may also be rather meaningless.

You can partition the CUST table by usinghash, where a hash function is applied to the partition key of each row and, based on the output, the row is placed in an appropriate partition. All the hash partitions hold an equal number of rows. Here is how you create the CUST table with four hash partitions:

Indexes

Local.The index is partitioned in exactly the same way as the base table. For instance, you can create a local index on the TRANS_AMOUNT column of the TRANS table as follows:

create index in_trans_01
on trans (trans_amount)
local;

This creates a range-partitioned index, on the TRANS_DT column, the same way the TRANS table is partitioned. All the index entries for a specific partition, such as Y05Q1, will exist only inside the corresponding partition of the index.

Global.The index can span all partitions in the base table. For example, suppose there is a primary key on TRANS, on the TRANS_ID column. The primary key can be anywhere inside the table, across all the partitions. In this case, the index entries of a partition of the table may exist outside the corresponding partition of the index. For such a situation, create a global index, as follows:

The Partition Decision

Now that you've seen the basics of partitioning, here comes the tough part: How do you decide on the type and columns used for partitioning? Your choice of partitioning has to address key partitioning objectives for manageability and performance. How you partition will be based on the priorities you assign to your objectives. I break down the objectives into the following categories:

Performance

Ease of administration

Data purge

Data archiving

Data movement

Data lifecycle management

Efficiency of backup

Performance.This, of course, is the primary objective of many partitioning schemes. Performance advantages come from partition pruning or partitionwise joins, so if your queries do a lot of full-table scans, partitioning will help immensely, because partitions will limit the scope of the search. Consider a scenario involving the TRANS table. Suppose everyone is interested in the aggregate sales figures but product managers (PMs) are interested in the sales figures of their respective products only, not in all the sales. So the PM of product code "8" issues this SQL statement several times an hour:

You can improve the performance of this query by building an index on the PRODUCT_CODE column, but it will increase the execution time on INSERT statements and might also adversely influence the execution plans of other queries on the same table. Therefore, it might be judicious not to create the index but to list-partition the table on the PRODUCT_CODE column, with one partition per PRODUCT_CODE value. The PM's query will still do a full-table scan, but this time the scope of the "full table" is actually thepartition, not the entire table. This enhances the performance immensely. Recall that list partitioning was preferred over range partitioning in this case because the PRODUCT_CODE is a discrete value.

If two rather large tables are often joined together in SQL statements, consider the potential benefits of partitionwise joins. Suppose you have another table called ADVERTISING, which contains detailed information on the company's advertising. Here is a description of the ADVERTISING table:

Suppose that PMs often query the ADVERTISING and TRANS tables to see the revenue vis-à-vis various advertising amounts spent to check the effectiveness of the advertising campaigns. Here is the query they use to find out the revenue numbers and advertising expenditures for a specific period for each product:

select t.product_code, sum(trans_amount), sum(ad_cost)
from trans t, advertising a
where t.trans_dt = a.ad_dt
and t.product_code = a.product_code
and trans_dt between '1-aug-05' and
'1-sep-05'
group by t.product_code;

Here the tables are joined on the columns TRANS_DT and AD_DT, both date columns. To enhance performance, you might decide to partition both tables in the same way. In this case, range partitioning on the columns being joined and partitioning on the same ranges is the most effective. Listing 3 shows the script for creating the ADVERTISING table with this range partition.

Similarly, if you often join these tables on the PRODUCT_CODE column, then that column should be a strong candidate for the partition key. And because product code values are in a specific set, the partitioning scheme should be list partitioning.

If the ADVERTISING and TRANS tables are not joined on some specific date or date fields but rather on some sequential number such as the AD_ID or TRANS_ID value, you should reevaluate the partitioning strategy. Because records have unique values in those columns, you can't choose a list-partitioning strategy (which requires discrete, finite values). A range partition would be ineffective, because the sequentially increasing values in these two columns would require numerous partitions to be created that might not have any specific meaning.

In such a case, the hash partition is useful. It makes sure the records are evenly spread over all the partitions. You can create the tables as follows:

The output is shown in Listing 4. Note the PSTART and PSTOP columns, which indicate which table partitions were selected. For the line with Id=4, the values of these columns are 4 and 4, which indicates that the fourth partition of the TRANS table was selected, where the record with TRANS_ID=101 resides. Note the line with Id=6, which shows that the optimizer will search only partition 4 of the joined ADVERTISING table as well, not the entire table. In the absence of an index, this hash-partition strategy makes the PM's query much faster than a full-table scan on the entire table, for both the TRANS and ADVERTISING tables.

Ease of administration.Partitioning is always introduced as a technique for managing large objects. Although it's applicable to objects of any size, the advantages are more obvious in large tables. When you rebuild an index on a nonpartitioned table, your only option is to build the entire index in one statement. If the table is partitioned, however, you can rebuild partitions of local indexes one at a time. For instance, in the case of the TRANS table, you can rebuild the Y05Q1 partition of the IN_TRANS_01 local index:

alter index in_trans_01 rebuild partition y05q1;

as opposed to rebuilding the IN_TRANS_01 index as a whole. The advantage of this approach is even more pronounced when you load a specific partition and the index needs to be rebuilt only on that partition, not on the rest of the index. In addition to doing index rebuilds, you can also move tables across tablespaces, export tables, delete data, and so on—one partition at a time.

Data purge.Suppose that TRANS is a table in an OLTP system and you want to remove old records, such as the ones with TRANS_DT values in the first quarter of 2005. The conventional method is to delete the records, but deletion causes undo and redo information to be generated, which impacts database performance. This can also be a very time-consuming operation, depending on the volume of data being purged. An alternative to deleting records is to drop a partition. In the case of the TRANS table, if you decided to drop all the records from the first quarter of 2005, you'd issue

alter table trans drop partition y05q1;

That's it. After you execute this command, the partition disappears from the table. This statement does not actually delete anything from the table; it merely updates the data dictionary to indicate that the Y05Q1 partition no longer belongs to the table. Because no data moves (except a small amount of information in the data dictionary), undo and redo generation is minimal, making the operation extremely fast compared to the deletion approach and virtually impact-free in terms of performance.

Note that a regular DELETE statement deletes table rows from wherever they exist, leaving empty spaces in the blocks. Although the table data may be sparse, the overall size of the segment may still be large and have a very highhigh-water mark(HWM, the largest size the table has ever occupied). A high HWM slows full-table scans, because Oracle Database has to search up to the HWM, even if there are no records to be found. When a partition is dropped, the segment itself is gone and the HWM of other segments remains the same, so the query runs faster.

While the DELETE statement deletes the rows, the corresponding index entries are also adjusted, causing fragmentation, which may impact performance.

When a partition is dropped, the corresponding partition of any local index is also dropped. If there is a global index, however, that index can become unusable when you drop the partition. To prevent the index from becoming unusable, in Oracle9i Database and later, you can update the global index when you drop the partition. The following drops the Y05Q1 partition and updates the global index:

alter table trans drop partition y05q1 update global indexes;

If you decide not to do an automatic index update as part of dropping the partition, that global index will become unusable. You will have to rebuild the index to make it usable, but you can defer that task to later, after you've dropped other partitions or done some other operations.

You can manually rebuild the global index on TRANS, by using

alter index pk_trans rebuild;

Data archiving.If you decide to purge data without retaining it, you can use the technique described in the previous section. Data to purge without retention includes log or debug information, which can simply be dropped. Most information, however, probably needs to be archived for future access. In the TRANS table example, if you wanted to store the contents of the Y05Q1 partition before purging it, you could use the following partition exchange technique:

1.Create a table that is almost identical in structure to the TRANS table, except that it is not partitioned:

create table trans_y05q1 as
select * from trans
where 1=2;

2.This creates an empty unpartitioned table TRANS_Y05Q1, whose structure is identical to that of TRANS. The 1=2 clause returns false, so no rows of the TRANS table are transferred. Exchange the contents of the partition with this new table:

alter table trans
exchange partition y05q1
with table trans_y05q1;

This operation makes the data inside the Y05Q1 partition appear inside the TRANS_Y05Q1 table and empties the partition. The data does not physically move from the partition to the new table. This exchange partition statement merely updates the data dictionary to reset a pointer from the partition to the table and vice versa. Because there is no physical movement of data, this exchange does not generate redo and undo, making it faster and far less likely to impact performance than traditional data-movement approaches such as INSERT.

After the EXCHANGE PARTITION command is executed, the partition is empty and you can drop it, as shown earlier. After the table is created, you can transport it out of the database and archive it for future use.

Data movement.One of the most challenging aspects of data warehouse administration is the development of ETL (extract, transform, and load) processes that load data from OLTP systems into data warehouse databases. Other instances in which large amounts of data are moved include replicating data from production to QA databases, populating data marts from main data warehouses and vice versa, and populating summary tables from transaction tables. Partitions may prove to be extremely valuable in these processes. Take, for instance, the TRANS table and another table—TRANS_SUMMARY—which records the total transaction amount for a particular date. The following describes the TRANS_SUMMARY table:

A traditional approach to populating this data warehouse table every day might be as follows:

insert into trans_summary
select product_code, sum(trans_amount)
from trans
group by product_code;

This approach works but is plagued by some problems:

1.The INSERT statement will generate a large amount of undo and redo, which will affect the performance of the database. You can reduce the undo and redo generation by introducing multiple commits in a loop while loading data, but the overall elapsed time will increase significantly. You can use the direct path INSERT (using the APPEND hint) with NOLOGGING, as follows:

Data being loaded into the TRANS_SUMMARY table will remain unavailable, however, until the data load is complete.

2.A daily process will have to wipe the TRANS_SUMMARY table clean before loading fresh data; otherwise, it will load duplicates. This table is unavailable for general use from the time it is truncated until it is finally loaded. On some very large systems—depending on the complexity of the query, the size of the table, and the general load on both the source and target databases—this process can take hours, during which the users cannot see even theolddata (which has been deleted prior to loading). If the INSERT statement fails, due to lack of space or data errors, the users will have to wait until the new data is loaded, which again can be hours.

The use of partitions eliminates or largely mitigates these two issues. Because the summary table is grouped by PRODUCT_CODE, you can create the TRANS_SUMMARY table list partitioned on the PRODUCT_CODE column:

During the loading process, you can load the data from TRANS into TRANS_SUMMARY partition by partition instead of loading the whole table. Here is how you would load the data for PRODUCT_CODE value=1:

1.First create a temporary table whose structure is identical to that of the TRANS table except that it's not partitioned:

create table trans_summary_temp
nologging as
select
cast(1 as number) product_code,
cast(sum(trans_amount) as number(12,2)) tot_amt
from trans
where product_code = 1
group by 1;

Note that the table is created with the NOLOGGING option, which significantly reduces the undo and redo generation. This is a dynamically created table, and while it is being created, the original TRANS and TRANS_SUMMARY tables are fully accessible to users.

2.Once the table has been created, you can exchange the TRANS_SUMMARY table with the p1 partition:

This operation instantly makes the data in the TRANS_SUMMARY_TEMP table available as the data in the p1 partition of the TRANS_SUMMARY table. This "switch" merely updates the pointers in the data dictionary; no data is actually moved. So this process is extremely fast (usually a few seconds), and the table is locked for that period only. The table data is available to users at all other times in the process. So this approach has four key advantages over loading the table directly:

1.The performance impact is low.2.The main table is always available (except for a few seconds).3.In case of data-loading failure, the old data is available.4.You can refresh data selectively.

Data lifecycle management.Most business data follows a predictable lifecycle: In the beginning, it is accessed moderately; then it is updated heavily; and finally access slows down to almost nothing. Regulatory requirements may mandate retention in the database of even this least accessed data. However, because the older data is accessed with decreasing frequency, you can use partitioning to develop a disk access strategy that lowers the total cost of ownership without sacrificing performance. Put partitions with the most accessed and updated data on the fastest disk storage, and put older, less accessed data on slower—and cheaper—disk storage.

For instance, in the TRANS table, the records of more-recent transactions, such as in the current partition, Y05Q4, are accessed heavily, followed by those of the partition immediately preceding it—Y05Q3—and so on. Because the older partitions are not accessed that frequently, it might be worthwhile to put them on a storage tier that is slower than the one on which the current (and most-accessed) partitions reside. One way to save money by using inexpensive storage is to create a new tablespace on the cheaper storage and move the partition to this new tablespace:

alter table trans move partition y05q1 tablespace y05q1_inexpensive;

During the partition move, the partition will be available for using SELECTs but not for UPDATEs.

Efficiency of backup.Consider the example at the beginning of this article—the TRANS table. In that example, each partition is located in a different tablespace named for the partition. If you can say with certainty that the records with a TRANS_DT value earlier than today (or earlier than some specific date) will not change, then you can also consider the corresponding partition to be read-only. In that case, you can convert the tablespace the partition resides in to read-only, as follows:

alter tablespace y05q1 read only;

When you make the tablespace read-only, the Oracle Recovery Manager (RMAN) backup can exclude it while making backups, because the tablespace will not change over time; one backup is enough. The more tablespaces you can make read-only, the shorter the duration of the RMAN job, which also reduces the load the RMAN job puts on the database. This benefit is most visible in databases containing historical data, especially data warehouses, in which the total amount of data to be backed up is usually quite high and you can make numerous tablespaces read-only.

Decisions, Decisions

To make an informed partitioning decision, first prioritize the objectives for your own partitioning setup. For example, using some of this article's alternatives, suppose you decide on the following order of priority:

You have deliberately chosen performance after the other objectives; in a different partitioning situation, however, performance may be the top priority.

Data archiving and data purge.For the data archiving and data purge priorities, you have to choose a differentiating column. Is it time-based, as in the case of TRANS_DT in the TRANS example table? If so, which column differentiates the records to be archived and purged? The TRANS table example uses the TRANS_DT column as the partition key in a range-partitioned table, but suppose you have a similar table with a column named EXPIRY_DT, which indicates the date after which the record will definitely not be updated. In that case, to better meet the data archiving and data purge priorities, make EXPIRY_DT the partitioning key in the range-partitioning scheme.

Suppose the differentiating column contains discrete values, such as CAMPAIGN_CODE (in the ADVERTISING table), denoting advertising campaigns. After each campaign is completed, its records are archived and purged. To meet the data archiving and data purge priorities in this situation, use list partitioning with the CAMPAIGN_CODE column as the partitioning key. A range scheme would be useless here, because you would need to drop partitions of a specific CAMPAIGN_CODE, not the range that contains it.

Now suppose that most queries against the TRANS table do not choose TRANS_DT in the WHERE clause. Choosing this column as a partitioning column doesnothelp performance. Is it a good decision to choose a range-partition scheme, with TRANS_DT as the partition key? The answer lies in your prioritizing. In this example, you list performanceafterdata archiving and data purge, so this scheme offers the best solution. If you had prioritized performance over data archiving and data purge, you would have chosen a different scheme.

Efficiency of backup.If efficient backup is your primary objective, ask yourself this question: Which column makes a record read-only? Suppose the answer is the TRANS_DT column. A record with a TRANS_DT value that is more than a month old will not be updated and should be considered read-only. In such a case, choose range partitioning on TRANS_DT, with each partition residing on a different tablespace, as shown in Listing 1. When the records on a partition will never be updated again, make the corresponding tablespace read-only. The RMAN backups will skip this tablespace during the backup, making the backup process faster.

Ease of administration.If ease of administration is the primary objective, your partition decision must be based on how the data is modified. Suppose you do a lot of index rebuilds on tables, due to data loads. You should note the column that determines which data is to be loaded. Is it a date—as in the TRANS_DT column in the TRANS table? If so, a range-partitioned table with TRANS_DT as the partitioning key is most appropriate. Is this column a discrete value, as in PRODUCT_CODE? If so, list partitioning will be useful. If no such pattern exists, you can use hash partitioning on the table and rebuild the indexes of the generated partitions.

Performance.If performance is the highest priority, choose the column that is used mostly in WHERE conditions to filter rows and in joins with other tables. If it's a time-based column, then range partitioning is very useful, as in the case of TRANS_DT in the TRANS table. Even if it's not time-based, the presence of some logical range will help. For instance, suppose the PRODUCT_CODE column (a number) follows a pattern such as 1000 to 1999 for consumer products, 2000 to 2999 for industrial products, and so on. When users query the table, are they focused on only a specific type of product, such as the industrial category? If that is the case, range partitioning can be used on the PRODUCT_CODE column, with partitions in the ranges 1000-1999, 2000-2999, and so on.

Combining partitioning strategies.You can also combine several of your most important goals with Oracle Database's composite partitioning strategies. For example, you can create a range/list-partitioned ADVERTISING table, using range partitioning on the AD_DT column to satisfy the performance requirement and list partitioning on the PRODUCT_CODE column to satisfy the data purge requirement.

Conclusion

The most important decisions in partition design are choosing the partitioning scheme and the column(s) to partition on, and these decisions depend heavily on the priorities of potentially conflicting objectives. Once you are familiar with the advantages of each type of partitioning scheme, you will be able to choose the scheme and column(s) appropriate for your partitioning needs.

Arup Nanda(arup@proligence.com) has been an Oracle DBA for more than 12 years, handling all aspects of database administration—from performance tuning to security and disaster recovery. He is a coauthor of PL/SQL for DBAs(O'Reilly Media, 2005). He wasOracle Magazine's DBA of the Year in 2003.