November 26, 2012

Performance is the most ignored ingredient of any application implementation, be it ERP, reporting, or transaction systems. Reports, however, are the highest-visibility item, since their audience sits at the top of the organization chart, so a slow report is quickly diagnosed and subsequently punished. Hence a correctly built report is only half built until it runs fast.

We recently came across one such implementation where a revenue report (the name itself suggests its criticality) took hours to run. Though perfectly designed, it still had miles to go before the results appeared. It was evidently a big pain point for everyone from report authors to report consumers, and we had to put our thinking caps on!

Problem Statement & Approach

There are plenty of performance-tuning solutions offered by different reporting tools (OBIEE, Cognos, MicroStrategy), such as aggregations, event polling, and cache seeding. Yet one solution remains out of the spotlight despite being among the most cost-effective and, more importantly, tool-agnostic: the DWH exchange partition. It is already documented by Oracle, but applying it in the right situation is the chance we got recently with one of our client requirements.

Revenue data resided in different source systems but was loaded into a unified warehouse, and management expected a single revenue report on a monthly basis. That essentially meant querying a table containing 3 billion records, with retrieval times running into double-digit minutes. Certainly not the most desirable situation.

Exchange partitioning comes in handy when dealing with fact tables that contain billions of records. So what we did was partition the DWH table by interval and sub-partition it by list. A partitioned table is divided into multiple logical segments depending on the user requirement, but to the user it remains a single table with billions of records.
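A minimal sketch of such a table follows. The table name, columns, and source-system values are illustrative assumptions, not the client's actual schema; the point is the interval partitioning by month combined with a list sub-partition template per source system:

```sql
-- Illustrative fact table: interval-partitioned by month,
-- list sub-partitioned by source system (all names hypothetical).
CREATE TABLE revenue_fact (
    revenue_date   DATE          NOT NULL,
    source_system  VARCHAR2(10)  NOT NULL,
    account_id     NUMBER,
    revenue_amt    NUMBER(18,2)
)
PARTITION BY RANGE (revenue_date)
  INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
SUBPARTITION BY LIST (source_system)
SUBPARTITION TEMPLATE (
    SUBPARTITION sp_erp VALUES ('ERP'),
    SUBPARTITION sp_crm VALUES ('CRM'),
    SUBPARTITION sp_pos VALUES ('POS')
)
(
    -- Anchor partition; Oracle creates one new monthly partition
    -- automatically for data beyond this date.
    PARTITION p_initial VALUES LESS THAN (DATE '2012-01-01')
);
```

With the template in place, every automatically created monthly partition inherits one sub-partition per source system.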

In almost all traditional data-loading techniques, the table is not committed until the jobs are processed completely; herein lies the difference between a partition exchange load and traditional loads. In a partition exchange, a "swap" occurs between an ordinary (non-partitioned) table and a specific partition of the partitioned table: the data loaded into the ordinary table is exchanged into the target partition. We wrote an automated script to add monthly partitions, source system wise, to logically segment the table into interval partitions and list sub-partitions. The swap occurs at the Oracle data dictionary level rather than at the actual data level; as there is no physical data movement, the exchange is extremely fast.
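The swap itself can be sketched as below. The staging table and sub-partition names are hypothetical (interval sub-partitions get system-generated names unless looked up or renamed); the staging table must have the same column layout as the fact table:

```sql
-- Monthly data for one source system is first loaded into an
-- ordinary staging table (revenue_stg_erp, hypothetical name)
-- whose structure mirrors revenue_fact.

-- Swap the staging table into the target sub-partition. Only the
-- data dictionary is updated; no rows are physically moved.
ALTER TABLE revenue_fact
  EXCHANGE SUBPARTITION p_201211_sp_erp
  WITH TABLE revenue_stg_erp
  INCLUDING INDEXES
  WITHOUT VALIDATION;
```

WITHOUT VALIDATION skips the row-by-row bounds check and is safe only when the load guarantees every staging row belongs to that month and source system; otherwise the (slower) default validation should be kept.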

Two custom scripts were written to achieve this objective.

Solution

Creating the partition: The first script, embedded in a procedure, executes during the monthly load to create the partition for the month and the source system for which data is being loaded.
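A hedged sketch of what such a procedure might look like, not the client's actual script: with interval partitioning, Oracle materializes the monthly partition (and its template-driven sub-partitions) automatically on first insert, and a `LOCK TABLE ... PARTITION FOR` statement is one known way to force that creation up front so the segment exists before the exchange. All names here are illustrative:

```sql
-- Hypothetical procedure run at the start of the monthly load.
CREATE OR REPLACE PROCEDURE create_monthly_partition (
    p_load_date IN DATE
) AS
BEGIN
    -- Forces Oracle to create the interval partition for the load
    -- month (with its list sub-partitions from the template), so the
    -- subsequent exchange has a segment to swap into.
    EXECUTE IMMEDIATE
        'LOCK TABLE revenue_fact PARTITION FOR (DATE '''
        || TO_CHAR(TRUNC(p_load_date, 'MM'), 'YYYY-MM-DD')
        || ''') IN SHARE MODE';
END create_monthly_partition;
/
```

The actual partition name can then be looked up in `USER_TAB_SUBPARTITIONS` to drive the exchange step.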