Friday, June 14, 2013

TIMESTAMP WITH TIME ZONE Aggregation

The TIMESTAMP WITH TIME ZONE data type that got introduced a long time ago is known for some oddities, for example Tony Hasler has a nice summary of some of them here.
Here is another oddity that shows up when trying to aggregate on such a data type. Have a look at the following simple example:

Notice the HASH GROUP BY operation selected by default by the optimizer (which can be influenced using the [NO_]USE_HASH_AGGREGATION hint to switch between a SORT and HASH GROUP BY).
But for the TIMESTAMP WITH TIME ZONE column, the following execution plan will be shown:

Notice the SORT GROUP BY instead - and this cannot be influenced using the above mentioned hint. So when using TIMESTAMP WITH TIME ZONE, the hash aggregation obviously isn't supported, but for all other TIMEZONE data types it is.
Depending on the scenario this might already influence the performance as the HASH based aggregation in many cases is more efficient than the sort based aggregation (bugs aside).
Things however get really bad when using Parallel Execution:

Notice how the (hash) aggregation is performed as a parallel operation (and you might even see a so called GROUP BY PUSHDOWN from 11g on represented by a second GROUP BY operation before re-distributing the data depending on the cardinalities or the usage of the [NO_]GPY_PUSHDOWN hint).
Now look closely at the execution plan of the last statement using the TIMESTAMP WITH TIME ZONE data type:

So the (sort) aggregation is no longer performed in parallel and this means that the single Query Coordinator process has to perform the aggregation on its own, which clearly can be a threat when dealing with larger data sets that need to be aggregated.

Footnote

Since internally the DISTINCT / UNIQUE operator uses the same implementation as the GROUP BY operator, exactly the same limitations apply when trying to do a DISTINCT / UNIQUE on a TIMESTAMP WITH TIME ZONE data type.
I could reproduce the described behaviour on all versions tested, starting from 10.2.0.4 and including the latest 11.2.0.3.

11 comments:

...Since internally the DISTINCT / UNIQUE operator uses the same implementation as the GROUP BY operator...

not in all cases.SORT GROUP BY NOSORT for example doesn't seem to be applied by a DISTINCT operator, only by a GROUP BY OPERATOR, seehttp://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6841372900346319251#6849721900346948113

I wonder if this is related in some way to the fact that two "timestamp with timezone" values that look different, and are stored differently, can represent the same time and ought to be aggregated together.

Anyway, I think your explanation sounds very reasonable and would also explain the Parallel Execution behaviour - the common value to use for such cases could only be determined when looking at all data, so it has to be done in the Query Coordinator.

However, thinking about it, at least a pre-aggregation step would be possible in the Parallel Execution Servers, similar to what Oracle already does for SORT AGGREGATE operations for example, but it doesn't seem to be implemented.

... I wonder if this is related in some way to the fact that two "timestamp with timezone" values that look different, and are stored differently, can represent the same time and ought to be aggregated together. ....

This cannot be reproduced in the case of case-insensitivity where two varchar2 values that look different, and are stored differently, represent the "same" varchar2 and ought to be aggregated together:sokrates@11.2 > create table t 2 as 3 select 4 rownum as id, 5 dbms_random.string(opt => 'a', len => 5) as s 6 from dual 7 connect by level <= 1e6 8 /

Table created.

sokrates@11.2 > exec dbms_stats.gather_table_stats(null, 't')

PL/SQL procedure successfully completed.

sokrates@11.2 > set autotr on explainsokrates@11.2 > select count, count(*) from (select s, count(*) as count from t group by s) group by count;

Twice the same plan via HASH GROUP BY, though the different results of the query show clearly that in the second group by different VARCHAR2 were counted the same ( not different when case-insensitive ).

If your conjecture were right, I would have expected a similar plan switch to SORT GROUP BY here too.

With the linguistic sort, although the data is stored differently, you can arrive at the common value by applying a function (NLSSORT in your case I think) to each value independently from other values, so this can be done in the Parallel Execution Servers.

However as pointed out above, in the case of TIMESTAMP WITH TIME ZONE, the common value cannot be derived by applying a function independently from other values.

You need to look at all values that represent the same TIME to choose the common one for aggregation - at least that seems to be the way Oracle handles this.

So there is a significant difference between the two cases in my understanding.

...However as pointed out above, in the case of TIMESTAMP WITH TIME ZONE, the common value cannot be derived by applying a function independently from other values.

Please note that any aggregation algorithm doesn't need the common value here, but just one normalized value, which is the same for all differently represented TIMESTAMP WITH TIME ZONEs, which belongs to the same group.Note furthermore, that this value may be of any datatype, it need not be a TIMESTAMP WITH TIME ZONE, it could also be a VARCHAR2 or a NUMBER instead.

I think that such a function does exist.What Jonathan pointed out above (You have to wonder how Oracle decides whether to convert +3:00 to +1:00 or vice versa in this case - does it minimise the timezone offset, perhaps.) showed that Oracle doesn't apply a function here independently from other values , but not that such a function does not exist.

What about this function that computes a normalized value independently of all other values ( returns a timestamp) :

sokrates@11.2 > set autotr traceonlysokrates@11.2 > select count(*), cast(some_timestamp_with_timezone as timestamp with local time zone) 2 from t 3 group by cast(some_timestamp_with_timezone as timestamp with local time zone) 4 /

sokrates@11.2 > select /*+ parallel(t 4) */ count(*), cast(some_timestamp_with_timezone as timestamp with local time zone) 2 from t 3 group by cast(some_timestamp_with_timezone as timestamp with local time zone) 4 /