Ensuring Correlated Partition Exclusion December 7, 2009

I’ve posted a couple of times recently about a tuning method with partition exclusion where you infer a relationship between the value you want to limit a query on and the partition key. It takes a while to explain the theory so I am going to give it a name, in the vain hope it catches on {forgive me if someone has already done this – in fact, just flame me with pointers if you already know a name for this}. At the very least, for my own postings, I can use this name from now on and link back to a single posting explaining it.

I’m going to call it Correlated Partition Exclusion. In essence, you have partitioned the table on a key, in my case the primary key ID, which is an ascending numeric probably sourced from a sequence.
You have a second column, in my case CRE_DATETIME, which increases in line with the PK. If you limit your query on CRE_DATETIME, partition exclusion is not possible, as there is no guarantee which CRE_DATETIME values appear in which partition. But, as a human, you understand that if you create 10,000 records a day, and you want to look at the last week's data, you can use:

WHERE ID > MAX_ID-(7*10000)

to exclude partitions holding IDs more than 7 days' worth old.
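As a concrete sketch of the idea (table and column names follow the post; the 10,000-records-a-day rate is the assumption discussed above, not a fixed rule):

```sql
-- Hypothetical query: the "real" filter is on CRE_DATETIME, but the extra
-- predicate on ID (the partition key) is what lets the CBO exclude partitions.
select *
from   test_p3
where  cre_datetime > sysdate - 7
and    id > (select max(id) - (7 * 10000) from test_p3);
```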

So, you have your Correlated Partition Exclusion.

How can you be sure that going back 70,000 IDs is going to safely cover one week of data, and how can you include only the partitions that actually cover the date range? {Note: I am using a numeric ID as my partition range key and a datetime as my correlated column; this principle works just as well if you partition on datetime and want to correlate to a (generally) ascending numeric key.}

If you look at the plan {towards the end, and sorry about the naff layout, my blog is not wide enough for this sort of printout} there are two checks on the TP_CRE_DT indexes that scan all partitions of the index – Pstart/Pstop are 1-46. This is the CBO checking every partition for records matching the stated CRE_DATETIME range. The cost is 892 consistent gets, much of which is the checking of local index partitions that will hold no relevant entries.

Bernard also spotted that the code was considering all partitions and was not as efficient as it could be, but did not know how to get around it except with a global index, which brings its own issues.

One way is to just say "well, I know how many records per day I get, so I will fake up the ID limits based on this". The problem, and this is why the CBO cannot make the decision for you, is that there is no guarantee, no fixed rule, that CRE_DATETIME will always increment with the ID. In fact, there is nothing stopping you altering a record with an ID from yesterday to have a CRE_DATETIME from 10 years ago {but forget I said that until tomorrow's post}. Now, in my example the CRE_DATETIME does increment with ID. We will use this special case for now; I know it is flawed and will address that flaw {tomorrow}.

So, to ensure you do not miss records, you end up making your ID window pretty large. Say you want all records for the last day and you process 1,500 records a day; you might consider a window covering all records with an ID within the last 10,000. It will still be more efficient than scanning all partitions and should be safe. Fairly safe.

The way out of this is to have a ranges table: something that tells you, for each partition, the minimum and maximum CRE_DATETIME and the range of IDs it covers. You can then use that to identify the partitions that cover the date range you are interested in.
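A minimal sketch of such a ranges table, using the tp3_range name that appears later in the post (the column names are my assumption):

```sql
-- Hypothetical ranges table: one row per partition of TEST_P3.
create table tp3_range (
  min_cre_datetime date,
  max_cre_datetime date,
  min_id           number,
  max_id           number
);

-- Look up the ID window covering a date range of interest; the resulting
-- low/high IDs can then be used as partition-pruning predicates.
select min(min_id) low_id, max(max_id) high_id
from   tp3_range
where  max_cre_datetime >= :start_date
and    min_cre_datetime <= :end_date;
```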

As you can see from the plan and cost, this is not very efficient as it has to scan the whole partition. Maybe not a problem if you do this once, but there are indexes on both these columns, so can't this be done more efficiently? Yes, if you split the code up into four in-line selects (if you want more details on why it is more performant to do MIN and MAX on their own rather than in one statement, see this post on the topic):
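A sketch of the four-in-line-selects approach for a single partition (the partition name P_0001 is made up for illustration):

```sql
-- Each scalar subquery can satisfy its MIN or MAX from the relevant local
-- index partition alone, instead of scanning the partition's table data.
insert into tp3_range
select (select min(cre_datetime) from test_p3 partition (p_0001)),
       (select max(cre_datetime) from test_p3 partition (p_0001)),
       (select min(id)           from test_p3 partition (p_0001)),
       (select max(id)           from test_p3 partition (p_0001))
from   dual;
```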

{You may wonder why the Explain Plan section above has a different look. This is because there seems to be a bug in 10.2.0.3 where the autotrace plan for the insert statement comes out wrong, as a FAST DUAL access, so I had to Explain the statement in a different way}

You would run one of the above statements against each partition. Probably, the “Best Practice” way would be to generate a SQL script from a query against DBA_TAB_PARTITIONS.
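Something along these lines could generate that script (the owner and table names are placeholders):

```sql
-- Emit one INSERT statement per partition of the target table.
select 'insert into tp3_range'
    || ' select min(cre_datetime), max(cre_datetime), min(id), max(id)'
    || ' from test_p3 partition (' || partition_name || ');'
from   dba_tab_partitions
where  table_owner = 'MY_OWNER'
and    table_name  = 'TEST_P3'
order  by partition_position;
```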

To populate my table I cheated: I just assumed all my partitions are the same size (1,000 records) and used:
insert into tp3_range
select min(cre_datetime),max(cre_datetime),min(id),max(id)
from test_p3
-- where id between 10000 and 20000
group by trunc(id/1000)
/

Note I left in a commented-out line. You could run the above against the whole table once and then limit it to just the new partitions as you add them. This is a tad risky though: you are relying on the partitioning being perfect, and it would not scale to hundreds of very large partitions. You would be better off with the partition-by-partition method above.

If you look way back up this post, you will see that the number of records selected by the above is the same as from the code indicated by Bernard (721 records) and for a lot less cost – 133 consistent gets compared to 892.

So I have hopefully explained the issue (having to visit all the index partitions to check the date range), shown how a ranges table can help, given some simple and a less-simple-but-more-efficient examples of code to populate the table and finally shown that using the range table can be more efficient.

And I arrogantly gave it a name – Correlated Partition Exclusion 🙂

Some of you may remember that this example is a special case, as there is no overlap between the dates; the relationship between CRE_DATETIME and ID is perfect. You are unlikely to have that in real life. Also, some of you may be tired of reading this post by now. So I will cover the general case in the NEXT post.

Clever trick, but I am afraid it is not recommendable on an OLTP system. The problem lies in the integrity of the returned boundaries in the last partition, and it supposes that past partitions are not modified.

The version with the inline view over the target partitioned table has bad performance, but it guarantees that get_min_id/get_max_id return the real current min and max id for any range period. Though the proposed solution solves the performance issue, it introduces a risk of discrepancy: if the range boundary data held in the range table (tp3_range) is not in sync with the partitioned data, the final query will be wrong.

I already looked at a single query with its inline view access restricted to one partition, but the syntax 'select * from my_table partition (X)' does not accept a subquery for X, for parsing reasons I suspect.

There is also a problem with the partition condition: in ALL_TAB_PARTITIONS it is stored as a LONG (column HIGH_VALUE), making it a pain to use in comparisons, though this can be overcome with a customised function.

The best I could achieve to reach my single query is to first fetch the partition name in a separate SQL statement and use it in the form:

create or replace function funct_cvt_long_to_date
…
-- code varies depending on whether the high value is a timestamp or a date
…

col partition_name new_value my_partition noprint;

select partition_name from all_tab_partitions where funct_cvt_long_to_date(high_value) between &my_date1 and &my_date2 and table_owner = … and table_name = …;

-- My single query then becomes the following; it is parsable only at the last moment, when &my_partition gets its value:

with get_min_id as (select min(id) min_id from my_part_table partition (&my_partition)),
     get_max_id as (select max(id) max_id from my_part_table partition (&my_partition))
select ……. from my_part_table where …..

Alas, on an OLTP system with heavy concurrent access, I fear the load on the data dictionary. So in the end I prefer to pay the gets against each local index root block. At worst, this is a known and predictable load.

I find partitions less useful in OLTP (though they do have their place) than in DW. The issue you highlight, about things being beyond those bounds, I will address today (err, maybe tomorrow). But I don't really address things being changed… I might tweak the next post for that. It is not restricted to OLTP of course, but with a DW, for most queries missing one record is sometimes acceptable, e.g. for reporting stats on monthly growth changes. So long as the financial system catches everything, that's fine.

If you want to alter the partition you look at, try dynamic SQL from within PL/SQL. {I no longer claim any skill in PL/SQL, but I have done similar in the past.} Your function for extracting the end ranges from the data dictionary is nice. PL/SQL works around Oracle's Very Vexing decision to use a LONG for the HIGH_VALUE (it is a LONG in the underlying sys.tabpartv$ and sys.tabcompartv$ objects too, so you can't side-step it). You could work it out once and put the data into a range table though 🙂
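A hedged sketch of what I mean by dynamic SQL here. A partition name cannot be a bind variable, so it has to be concatenated into the statement text (the table and partition names are made up):

```sql
declare
  -- v_part would come from the dictionary or range-table lookup above
  v_part   varchar2(30) := 'P_0001';
  v_min_id number;
  v_max_id number;
begin
  -- Build the statement at runtime so the partition name can vary
  execute immediate
    'select min(id), max(id) from my_part_table partition (' || v_part || ')'
    into v_min_id, v_max_id;
end;
/
```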

You are right to be wary of heavy access on the data dictionary for OLTP. I am fighting slow performance of a DW data dictionary at the moment. I gathered all the fixed object and dictionary stats, but there are so many segments, and the DBA_ views are so complex, that performance for some statements is very bad.

I suspect my next post will not be good enough for your situation Bernard, but maybe it might.

elsif v_col_type = 'NUMBER' or v_col_type = 'INTEGER' then
select high_value into v_hv_num
from all_tab_partitions
where TABLE_OWNER = P_OWNER and
TABLE_NAME = P_TABLE and
PARTITION_NAME = P_PARTITION_NAME;
return to_char(to_date(v_hv_num,'J'),'YYYYMMDD');
end if;
-- For a varchar2 type column, you need to provide your own code, as the date format is unknown, if it is a date :p
end;
