Oracle Data Warehouse Design and Architecture


Category Archives: Performance


I’m told that it’s very important to keep one’s brain in shape as one grows older. Or “drifts gently towards the twilight” as I prefer to think of it. Trying to look on the optimistic side, obviously. Fortunately I end each day with my brain wrung out like a well-used sponge courtesy of some ETL development work on a largeish data warehouse development.

Although the regular ETL development is outsourced, we are responsible for the coding that processes historical data, so on deployment we will run our historical load code and then the “other team’s” regular load will take over. Yes, it is a strange arrangement. And by “we” I really mean my colleagues somewhere in the pestilent jungles of Brazil, surrounded by poisonous frogs and who-knows-what other fearsome things, and living on a diet of rice and beans in their houses on stilts.

My own role is not so bad. I don’t have to get too involved in the detailed business logic that the code has to implement, and thank goodness for that because frankly I’ve not seen anything so complex for … well, forever. So I sit here in a metaphorically lofty perch, making myself popular by criticising the hard work of others and encouraging them to greater efforts by calling them lazy if they stop work before midnight (the traditional management role). Yes, it’s a good life alright. I constantly remind them that the real aim of the project is to produce a process for loading five years of historical data that is faster than the “B-Team’s” load of a single day.

I think our methodology works quite well though. When you are down in the weeds of the logic (did I mention that it is particularly complex logic?) it is always difficult to see the slightly larger picture of where you’re taking the data next, how you optimise specific ETL stages, and how you can adjust the overall process to make everything more efficient. It’s a little like pair programming, I guess.

And we do get to do the fun bit — 10 million records joining 0-M:0-N through eight OR’d join conditions to 10 million other records, with a couple of billion joins resulting that we then artfully winnow down to a few million. The optimization process is a little mind bending, with all sorts of issues that we wouldn’t normally worry about too much becoming vitally important in an effort to avoid blowing out the host server’s memory with a degree of parallelism of 40. I have to say that there was initially some FUD over the historical process, with estimates of the load duration being bandied about that were in the order of months — we never thought it would take more than five days, and it’s now looking like it’ll run in under one day, which means that we’d be able to start the daily load process directly afterwards without altering it. And I think we still have a few more tricks up our sleeves as well in case we need to buy an extra hour somehow.

In fact it’s turning into something of a stroll through the SQL Reference book. The list of features we’re touching on include analytic functions (which have become a second language for everyone now, along with the “PCTFREE 0 NOLOGGING COMPRESS” which I nearly appended to my sandwich order the other day), connect by clauses, and even some rather exotic stuff with finding the lowest set bit in a something-or-other that I don’t understand yet but which Heitor thought of while lying in bed one night and which he used to help reduce an eight hour process to fifteen minutes. They say that they don’t have siestas in Brazil, but maybe they should start.

I just posted this question to the Oracle list, but thought I might post it here in case it reaches a wider audience. It also may be a situation that others in the DW arena have faced before.

I have a large and busy OLTP table, 100GB or so, against which there is a need to capture changes. Until an asynchronous CDC solution is in place we have to rely on two columns: create_date and update_date (null until the first update), both being of DATE type of course.

These are currently unindexed, but there is a desire to index them to improve change capture performance for queries such as:

select …
from …
where (create_dt >= trunc(sysdate)-1 and create_dt < trunc(sysdate))
or (update_dt >= trunc(sysdate)-1 and update_dt < trunc(sysdate))

The desire is obviously to provide the maximum performance benefit while reducing the impact on the OLTP system.

I thought of four different indexing options:

i) Indexing the columns separately, leading in the best case to an unusual execution plan where the indexes are range scanned and the results merged before accessing the table.

ii) A single composite index (create_date, update_date), leading to a fast full index scan.

iii) A single composite index (update_date,create_date), rewriting the query predicate as …

iv) A single-column function based index on (Nvl(update_dt,create_dt)) and rewriting the predicate appropriately.
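To make the options concrete, the four candidates might be declared like this (the table and index names here are hypothetical), with the predicate rewrite that option iv implies shown at the end:

```sql
-- i)  separate single-column indexes: range scan each index and
--     merge the results before visiting the table
create index big_tab_cre_ix on big_tab (create_dt);
create index big_tab_upd_ix on big_tab (update_dt);

-- ii) single composite index, create_dt leading
create index big_tab_cu_ix on big_tab (create_dt, update_dt);

-- iii) single composite index, update_dt leading
create index big_tab_uc_ix on big_tab (update_dt, create_dt);

-- iv) function-based index on the "most recent change" date
create index big_tab_chg_ix on big_tab (nvl(update_dt, create_dt));

-- predicate rewrite for iv -- note the semantics shift slightly,
-- as a row is now found only by its latest change date:
-- where nvl(update_dt, create_dt) >= trunc(sysdate) - 1
-- and   nvl(update_dt, create_dt) <  trunc(sysdate)
```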

Whichever of these is chosen the growth pattern for the data naturally tends towards index contention as all the new values are equal to sysdate.

So the nub of my problem comes down to these questions:

Is it possible to mitigate the index contention problem with a suitably high initrans value? Does it seem likely that implementing a reverse key index would be worthwhile in reducing contention (I could probably take the pain of the fast full scan if it was, although it would lead to a larger index due to the 50/50 block splits)?

Would you regard implementing a function-based index as risky in any significant way on major tables of a busy OLTP system? How about a reverse key function-based index?

Basically, “what would you do”?

Maybe I’m overthinking it, but I’d like to go to the system developers with a fully thought out set of options (even if they do tell me to take a running jump anyway).

When you’re designing ETL processes the mantra of “do it in SQL” really gets ingrained in your way of thinking. PL/SQL-based processing as a method of handling business logic is anathema, as the principle of “row-by-row equals slow-by-slow” is very well highlighted by the processing of millions of rows.

Very recently though I found a case where this principle had to be violated, and with performance results that were not at all unpleasant. I was stumped by a SQL problem that I thought would be amenable to analytic functions, but I couldn’t think of a way of doing it. Here’s the situation.

I have two sets of data representing, let us say, sources and targets. Sources and targets only have one attribute each, which is a date in each case, and sources and targets can be joined in a many-to-many relationship. However only one join is valid per source and only one is valid per target, and we have to form connections between sources and targets on a one-to-one basis. We work out which one of the possibly-many joins is the valid connection according to the following logic.

We start with the most recent source. We work out the earliest target to which it is joined, and discard all other targets to which it is joined (if any). The target to which it connected is now ineligible for any other source to connect to it. Then we take the next most recent source and we work through the targets in ascending target-date order and we connect to the first (ie. earliest) target available, discarding all others. The only reason why a target might not be available is that we already connected a source to it.

One way of looking at this is to imagine that the set of joined sources and targets are ordered by source-date descending and then target-date ascending. Examining each pair in turn we ask two questions:

“Has this source already connected to a target?”

“Has this target already connected to a source?”

If the answer to both of those is “No” then we take that join as the valid connection and we move on to the next pair.

It’s very easy to construct a sample data set for this problem as well, by simply joining a list of integers to itself to form pairs such as (1,1), (1,2), (1,3), … , (1000,1000) and ordering the output with a query such as this:
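The generating query itself was not preserved here, but from the description it would be something along these lines, with the integers standing in both for identifiers and for the dates:

```sql
-- cross-join the integers 1..1000 with themselves to form every
-- (source, target) pair, then order by source descending and
-- target ascending, as the connection logic requires
with ints as (select level n from dual connect by level <= 1000)
select s.n as source_id, t.n as target_id
from   ints s
cross  join ints t
order  by s.n desc, t.n asc;
```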

Given that data set the first pair is (1000,1), which becomes the first connection. The next connection will be (999,2), the next will be (998,3) etc.. It’s important to note that the real data set is not like that though, where every source connects to every target, so special conditions in this data set do not apply to the real one.

By the way, there are some special conditions that are easy even in the real data — if a row represents the first occurrence for both the source and the target then that is always going to be the valid connection. Um … that’s actually the only special condition I could think of. There may be some very clever mathematical way of working out others, but I don’t know what it is.

So, the first solution I came up with was to use PL/SQL to read each row in order and attempt to insert into a table having unique constraints on the source_id and target_id columns, handling any uniqueness error. The handling of the error is the only virtue that PL/SQL brings to this case. It was functional, but it was slow because of course it requires a lot of logical reads and writes in the index maintenance. I considered an insert with an error-logging clause by the way, but that doesn’t handle unique or primary key constraint errors and if it did it would still have the same problem of high logical reads and writes.

The next solution was to maintain a pair of PL/SQL associative arrays to handle the detection of uniqueness, and that worked well except that I was using conventional path inserts for the result set, and I didn’t fancy coding up bulk inserts. I know enough PL/SQL to be dangerous but it’s not my native language.

My (currently) final solution was to embed the PL/SQL into a package so that a function to detect uniqueness could be embedded in a SQL statement. I was pretty concerned about the context switch to PL/SQL, and the logic means that this process can’t be parallelised, but the result surprised me. When I streamed 100,000,000 rows through the process to detect 10,000 valid connections the process completed in 770 seconds, so that was at least 250,000 uniqueness tests per second. I can’t imagine that being achieved based on detecting violations of a unique constraint.
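The package itself did not survive the journey to this page, but from the description (associative arrays tracking seen values, an ss.is_new function callable from SQL, an ss.reset procedure) it would be a sketch along these lines; everything other than the names ss.is_new and ss.reset is my reconstruction:

```sql
create or replace package ss as
  function  is_new (p_source in number, p_target in number) return number;
  procedure reset;
end ss;
/
create or replace package body ss as
  type used_tab is table of boolean index by pls_integer;
  g_sources used_tab;
  g_targets used_tab;

  -- returns 1 and records the pair if neither side has been seen
  -- before; returns 0 otherwise
  function is_new (p_source in number, p_target in number) return number is
  begin
    if g_sources.exists(p_source) or g_targets.exists(p_target) then
      return 0;
    end if;
    g_sources(p_source) := true;
    g_targets(p_target) := true;
    return 1;
  end is_new;

  procedure reset is
  begin
    g_sources.delete;
    g_targets.delete;
  end reset;
end ss;
/
-- hypothetical driving query: rows must reach ss.is_new in
-- source-date descending, target-date ascending order
select source_id, target_id
from  (select source_id, target_id
       from   joined_pairs
       order  by source_date desc, target_date asc)
where  rownum > 0
and    ss.is_new(source_id, target_id) = 1;
```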

The call to ss.reset purges the arrays, and must be executed between runs. The rownum > 0 predicate is just the poor man’s way of ensuring that the call to ss.is_new is not pushed into the inline view — Oracle obviously doesn’t know that the row ordering is a critical part of the logic.

Obviously an INSERT (with append hint to invoke direct path) can be prepended to that to perform the actual ETL task.

No distinct on the first select clause because we assume that City is unique (it would be if it were the PK on a real dimension, of course). In the absence of indexes to support fast full index scans we have four full table scans and three operations to support the distinct operator …
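That first method, reconstructed from the description (with the hier and dim column names borrowed from the multitable insert further down), would read something like:

```sql
insert into hier
select city, state, 'City' from dim          -- city assumed unique
union all
select distinct state,   region,  'State'   from dim
union all
select distinct region,  country, 'Region'  from dim
union all
select distinct country, null,    'Country' from dim
/
```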

A pretty clean looking piece of SQL, but multiple scans of tables make my hackles rise. If this were a very large table and indexes couldn’t be leveraged then it could be a performance problem.

An alternative occurred to me:

insert all
when 1 = 1 then into hier values (city ,state ,'City' )
when state_row_ind = 1 then into hier values (state ,region ,'State' )
when region_row_ind = 1 then into hier values (region ,country,'Region' )
when country_row_ind = 1 then into hier values (country,null ,'Country')
select city,
state,
region,
country,
row_number() over (partition by region order by 1) region_row_ind,
row_number() over (partition by state order by 1) state_row_ind,
row_number() over (partition by country order by 1) country_row_ind
from dim
/

The cost of the two methods is exactly the same, but the cardinalities are both out — the correct answer is “19”. Of the two methods I’d expect the first to be more accurate and it did do pretty well. With such a small amount of data it’s difficult to make an accurate comparison, and the proper approach would be to benchmark each method on the production data.

One interesting difference is that with the single table insert method you can control the order of insert into the hierarchy table, which might be important if you wanted to physically cluster particular rows together. I don’t think that there is a deterministic way of controlling the insert order in the multitable insert.

Yes, don’t get excited — I don’t know of a way to do it. I wish there was though, and the ability to rebuild multiple indexes (or partitions of different indexes) at the same time would be useful because …

Well the problem is in the domain of data warehousing — frankly, all the interesting problems are there ;) — and it arises in ETL operations when we want to load to an unindexed table and then create or rebuild multiple indexes or index partitions on that data. With each of those index operations we need a full table or partition scan (you know, let’s just assume the “or partition” bit from now on) and for a large data set that may mean very high physical read rates are required. Although we can avoid or mitigate the physical reads by building multiple indexes at the same time in different sessions using dbms_job or dbms_scheduler and relying on caching in the SGA or at a hardware level, we still incur the logical reads.
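The concurrent-sessions workaround might be sketched like this, with hypothetical fact_tab column and index names, using one-shot dbms_scheduler jobs so the builds run simultaneously in separate sessions and can share cached blocks:

```sql
begin
  for r in (select column_value ix_col
            from   table(sys.odcivarchar2list('col1', 'col2', 'col3')))
  loop
    dbms_scheduler.create_job(
      job_name   => 'BUILD_IX_' || r.ix_col,
      job_type   => 'PLSQL_BLOCK',
      job_action => 'begin execute immediate ''create index fact_'
                    || r.ix_col || '_ix on fact_tab (' || r.ix_col
                    || ') nologging''; end;',
      enabled    => true);  -- runs immediately, dropped on completion
  end loop;
end;
/
```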

So I’d like to be able to create multiple indexes based on a single scan of the data segment. A “Create Indexes” statement in which some slave process spits out the required columns and rowids to a set of processes which handle the sorts etc.. Even if it meant writing a temporary (compressed?) data set of the required columns that then gets scanned multiple times, then as long as we’re not indexing the majority of the columns on the table we could still see a net benefit.

I really can’t think of a downside to this. Of course, you could try to build fifty indexes on a fact table in a single operation and end up cooking your CPUs and/or blowing out your temporary disk space and/or memory and incurring horrendous levels of the dreaded multipass sort, but so much of what we do is finding that sweet spot between two extremes anyway. Maybe building those fifty indexes in ten groups of five would reduce the number of reads of the data segment by a factor of five without spilling the sorts to disk, and would also put some decent load on those pricey CPUs.

Enhancement request time again, then.

Actually, I suppose you can create multiple indexes in a single DDL statement …

I interviewed a couple of senior ETL developer candidates last week, so I shook the dust off of some of my tried-and-true probes for technical knowledge.

I’ve always had a liking for asking for the differences between “truncate” and “delete” ever since I was asked it myself some years ago. I think it allows candidates to demonstrate a nice range of knowledge.

Here are some answers in no particular order.

“Truncate is fast, delete is slow”. Hmmm, well that’s mostly true I guess. The most basic of all answers.

“Truncate is DDL, delete is DML”. Is this just so obvious that people don’t mention it?

“You can’t rollback a truncate”. Indeed. See 2 above, of course. And 1.

“Truncate is implicitly a commit”. A better answer than 3 I think. I seem to recall that someone once mentioned that there are actually two commits in a truncate, one before and one after … but maybe I dreamed that. I should test it really.

“You can’t grant permission to truncate a table”. Ah, practical experience shines through. If you don’t like your ETL process to connect as the owner of the schema then this is a challenge that has to be overcome with stored procedures or something sophisticated like that. You really don’t want to grant “DROP ANY TABLE” to your ETL user.

“You can delete any subset of rows, but you can only truncate the complete table, or a partition or subpartition of it”. Is this also so obvious that nobody mentions it?

“Truncate makes unusable indexes usable again”. A real gotcha for the unwary. If you attempt to optimise a data load by rendering indexes unusable and truncating a table (possibly followed by an index rebuild and a partition exchange) then be careful of the order.

“Truncate can’t maintain foreign keys”. It’s “cascading delete”, not “cascading truncate”. That would be an interesting feature though — point 4 above would make it a little trickier. Truncating an index cluster is pretty close to a “cascading truncate” to a limited extent though. In any case no truncate is permitted on a table referenced by an enabled foreign key.

“You can’t flashback a truncate”. This is an oddity to me. We can flashback a “drop table”, rollback uncommitted deletes, or use flashback to recover pre-commit deleted data, but a truncate is a barrier across which we cannot flashback.

“Truncate deallocates space, delete doesn’t”. Unless you want it not to, using the “reuse storage” clause. However the high water mark is reset in either case so maybe that’s a better answer …

“Truncate resets the high water mark, delete doesn’t”. And on the indexes, also. See 7 above also.

“DML triggers do not fire on a truncate”. Because … um … it’s DDL not DML.

I suppose there are other issues as well, but this shows that it’s a complex business. I really wish we could grant “Truncate table” to a user or role … enhancement request time I think.
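The “can’t grant truncate” restriction mentioned above is usually worked round with a definer-rights procedure in the owning schema, something like this sketch (all names are hypothetical; dbms_assert guards the dynamic SQL against injection):

```sql
-- created in and owned by the schema owner, executed with its rights
create or replace procedure truncate_my_table (p_table in varchar2)
authid definer
as
begin
  execute immediate 'truncate table '
                    || dbms_assert.simple_sql_name(p_table);
end;
/
grant execute on truncate_my_table to etl_user;
```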

Isn’t it interesting that in some circumstances only one method can be used to remove all rows from a table in a single step, while in others both methods can; and that sometimes only one method can remove a subset of the rows from a table, while at other times (a partition’s worth, say) both can.

Here’s a curiosity. Or a banal observation, depending on your perspective.

There are, broadly speaking, two different causes of buffer busy waits. They are nicely described in the documentation.

Waiting for a block to be read into the buffer cache by a different session.

Waiting for a block in memory to become available in some way (it’s being modified by another session, for example).

The second of these is likely to be observed in an OLTP system, the first of them is more likely in a reporting system with multiple nonparallel full table/partition scans.

The curiosity is that the cures for these two causes are exactly the opposite of each other. To reduce the number of buffer busy waits you either have to increase the number of rows per block (to reduce the number of blocks needed to fulfill a full table scan) or reduce the number of rows per block (to reduce the chance of multiple sessions requiring rows that are in the same block).

So what to make of a consultant who sees a high number of buffer busy waits on a read-only reporting system, and advises that you increase pctfree or partition the table (sic) to reduce the number of rows per block?

Incidentally, I have a headache. Possibly it is caused by spending so much of the day frowning.