If this is your first visit, be sure to
check out the FAQ by clicking the
link above. You may have to register
before you can post: click the register link above to proceed. To start viewing messages,
select the forum that you want to visit from the selection below.

How to start using histograms?

Hi guys,

I read some articles about histograms and a question that was posted here: link

I really think that setting up the histograms could improve performance on larger tables, but the Oracle documentation isn't very clear on their explanation. So, I hope you guys can enlighten some questions I have.

Choosing the number of buckets for a histogram.
If the number of frequently occurring distinct values in a column is relatively small, then set the number of buckets to be greater then that number.

Okay, I get this, but how much bigger must that number be? Number of distinct values+1 or +2, +10?

2. How do I determine the number of buckets, when the number of distinct values is bigger than the maximum number of buckets(254)?
Do you make an estimated guess and test the results and increase/decrease the number and test again? I think analyzing all kinds of columns would take a lot of time and my production db has limited rest time.
Can someone give me some guidelines on this?

3. How big must a table be, before I can use column-histograms?
(1000 rows? 10000 rows? 1 mln rows?)

Any help, links or guidance are welcome...

Erik

Last edited by efrijters; 09-17-2003 at 04:54 AM.

An expert is one who knows more and more about less and less until he knows absolutely everything about nothing.

The DBMS_STATS.GATHER_TABLE_STATS() procedure in 9isupports automatic identification of number of buckets, so you might try that.

If you are using an OLTP system with extensive use of bind variables, then histograms will not do anything for you -- if you are predicating on highly skewed columns and you want the optimizer plan to be sensitive to the skew, then you have to stopusing bind variables on that column.

For DSS/DW batch jobs, histograms may be beneficial - I'd look at the automatic skew identification through the DBMS_STATS procedure to make it simpler.