Inspired by Mark's excellent post, I wanted to get to the bottom of this. And while I'm at it, I'll also take a look at star transformation in snowflaked dimensional models. We will be using Oracle 11.1.0.7 on Windows XP.

Let's start by setting up our snowflaked star schema. We will be using the SH sample schema as a basis for this.
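The setup might look something like the following sketch. The table names (`sales_star`, `customers_star`, `countries_star`) are taken from the discussion later in this post, but the exact columns and constraint names are my assumptions; the key point is that the country attributes are snowflaked out of the customer dimension, and that the fact table's foreign key columns carry bitmap indexes, which star transformation requires.

```sql
-- Sketch of a snowflaked star schema built from the SH sample schema.
-- Column selections and constraint names are illustrative assumptions.
CREATE TABLE countries_star AS
SELECT country_id, country_name, country_region
  FROM sh.countries;

CREATE TABLE customers_star AS
SELECT cust_id, cust_first_name, cust_last_name,
       cust_city, cust_year_of_birth, country_id
  FROM sh.customers;

CREATE TABLE sales_star AS
SELECT prod_id, cust_id, time_id, channel_id,
       quantity_sold, amount_sold
  FROM sh.sales;

-- The snowflake: countries hangs off customers, not off the fact table.
ALTER TABLE countries_star ADD CONSTRAINT pk_countries_star
  PRIMARY KEY (country_id);
ALTER TABLE customers_star ADD CONSTRAINT pk_customers_star
  PRIMARY KEY (cust_id);
ALTER TABLE customers_star ADD CONSTRAINT fk_cust_country
  FOREIGN KEY (country_id) REFERENCES countries_star (country_id);

-- Bitmap indexes on the fact table's foreign key columns,
-- a prerequisite for star transformation.
CREATE BITMAP INDEX bix_sales_cust ON sales_star (cust_id);
CREATE BITMAP INDEX bix_sales_prod ON sales_star (prod_id);
```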

So what conclusions can we draw from the above explain plan? Well, first of all we see that Oracle has used star transformation for this query. This demonstrates that star transformation is used by the CBO in a snowflaked dimensional model. The next question then is: how exactly did this happen?

As a first step, on lines 2-7 (Id 2-7), Oracle loads a global temporary table (GTT). It expects to load 29 rows into this table from the bitmap-ANDed predicates on the customer table. It uses that GTT in the star transformation itself (lines 11-27). So rather than joining directly to the customer dimension, Oracle uses the GTT as part of the star transformation. On lines 10 and 28 our GTT is hash joined to the results of the star transformation. On lines 9 and 29 our snowflaked countries_star dimension is joined to our result set, and this is then finally aggregated in line 1 and returned in line 0. Interestingly, the customers_star dimension does not directly take part in a join at all.
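For readers who want to reproduce this, a query of the following shape should exercise the snowflake. This is a hypothetical example, not the exact query from the original test; the predicates on the customer and country dimensions are illustrative, and `DBMS_XPLAN.DISPLAY_CURSOR` is one standard way to pull the plan for the last executed statement.

```sql
-- Star transformation must be enabled for the session (or instance).
ALTER SESSION SET star_transformation_enabled = TRUE;

-- Hypothetical query: fact table joined to the snowflaked customer
-- and country dimensions, with filter predicates on both.
SELECT co.country_name,
       SUM(s.amount_sold) AS total_sales
  FROM sales_star     s,
       customers_star cu,
       countries_star co
 WHERE s.cust_id            = cu.cust_id
   AND cu.country_id        = co.country_id
   AND cu.cust_year_of_birth = 1970
   AND co.country_region    = 'Europe'
 GROUP BY co.country_name;

-- Show the execution plan of the statement just run:
SELECT * FROM TABLE(dbms_xplan.display_cursor);
```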

Let's move on to the next item in our list: Does the CBO use star transformation when it finds a compound key in both fact and dimension table?

In order to demonstrate this, we will first create a compound key in our products_star dimension and also set it up as a foreign key in the sales_star fact table. We will use the prod_name in products_star as the second item in the compound key. We will also create the prod_name column in the sales_star table.
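The steps above might be scripted as follows. This is a sketch under the assumptions already stated (a `products_star` dimension copied from `SH.PRODUCTS`); the `VARCHAR2(50)` length and the constraint and index names are mine.

```sql
-- Compound primary key on the dimension: (prod_id, prod_name).
ALTER TABLE products_star ADD CONSTRAINT pk_products_star
  PRIMARY KEY (prod_id, prod_name);

-- Add the second key column to the fact table and populate it
-- from the dimension (length is an assumption).
ALTER TABLE sales_star ADD (prod_name VARCHAR2(50));

UPDATE sales_star s
   SET prod_name = (SELECT p.prod_name
                      FROM products_star p
                     WHERE p.prod_id = s.prod_id);

-- Compound foreign key from fact to dimension.
ALTER TABLE sales_star ADD CONSTRAINT fk_sales_prod
  FOREIGN KEY (prod_id, prod_name)
  REFERENCES products_star (prod_id, prod_name);

-- Bitmap index on the new fact column, so both parts of the
-- compound key are covered by bitmap indexes.
CREATE BITMAP INDEX bix_sales_prod_name ON sales_star (prod_name);
```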

So, what does this mean? First of all it means that the CBO can use star transformation with compound keys. Claims to the contrary are simply false. It also means that surrogate keys are not a prerequisite for star transformation in a dimensional model. That is another reason to get rid of them (in most situations).

About the author

Uli has 18 years' hands-on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.

Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a not-for-profit global data management organization, and has co-founded the Irish Oracle Big Data User Group.