Wednesday, March 3, 2010

Learning About Performance

I've found myself at a certain disadvantage lately, specifically in regards to understanding tuning. I don't like being at a disadvantage...I'm competitive that way.

I understand, from a very high level of what needs to be done, but I could not prove my theories. Well, I could, sort of, but it wasn't necessarily a reasoned or logical approach. I just tried all possible permutations. There's a certain time constraint with that method and when you're talking about huge volumes of data (i.e. datawarehouses), time is short.

I've decided it's time to change that, so I'll begin to peck away ever so slowly.

In Usage Tracking, there is a table called S_ETL_TIME_DAY. It's a sister table to S_ETL_DAY which is your everyday time dimension. S_ETL_TIME_DAY breaks down a single day into hours and minutes which means there are 1440 records (24*60*60).

Somewhere I saw the following SQL:

SELECT DISTINCT hoursFROM s_etl_time_day

which just returns the hours in the day (24). I've pondered on whether DISTINCT is a bug, but it seems fairly innocuous here.

Sadly, about the only thing I can usually see in an explain plan is whether an index is being picked up or if there is a nested loop. I'm not going to worry about all of it now...this is just a start. More to get me in the habit.

What does it all mean? I am sure you know or understand better than I. For me, I just need to create the habit. I'll read (more) about the specifics of the explain plan and the output from tkprof. If you want to explain, feel free. If you want to do it in your own post, link it up or you can write it here. I'd be happy to host it.

8 comments:

> I understand, from a very high level of what needs to be done, but I could not prove my theories. Well, I could, sort of, but it wasn't necessarily a reasoned or logical approach. I just tried all possible permutations.

I have no idea about your application butIf this will be running by more than one session at the same time , before conclusion with one session test, you should run both statements with number of session which will be actively running them.

My vote is to dual one which is simplest and doesn't depend on the size of the table you are querying from.

I'd second Coskan. Especially as FAST DUAL is really fast ;-)In <our particular case, It might be sufficient to get v$sesstat before and after each statement and calculate the deltas (to see what the session was doing). This could give you an even closer insight.

I wouldn't say either would noticably benefit from tuning for a single call/session. If there were LOTS of calls it may be worth it.

I suspect in a very high concurrency environment (lots of sessions running the same query concurrently) you'd be more worried by latch contention. When a block is queried it needs to be latched momentarily (for the duration of the read) as you can't have another session (on another CPU) writing to the block at the same time.

Because DUAL doesn't really exist as a block, it doesn't need that protection so using a DUAL query can reduce latch contention.

That's my point. I would bet all of you have a much better understanding of the why's involved...I don't. I would probably pick dual, mostly because it will always be there.

I wish I could say emphatically and with examples, that dual is the single best way to go. I can't yet because I don't understand at that level yet.

I don't like being in that situation...at all. Much of my knowledge is based on reading and internalizing asktom for years...but to repeat the tests that he performs, or to even create the tests that he performs is just now out of my grasp. That's what I want to change.