How to find tables containing lots of unused space?

Have you ever wondered why the table segment consumes as much space as it does and how does one know if the space allocated by each of the segments is actually used for storing data and is not mostly empty? Those question did bother me time to time and I was looking for a method that would not require to license any packs (like Diagnostic Pack for Segment Advisor, because it requires AWR) and which would not do lots of IOs by scanning the segments. In the end I found a simple solution for this…

Starting with 10gR1 Oracle provides procedure DBMS_SPACE.CREATE TABLE_COST that helps estimate the size of the table segment based on input about the expected number of rows and the average size of rows (in bytes), pct_free setting for the table and the name of the tablespace where the table would be created. The output contains information on the size of the user data and size consumed by the table segment.

I found the procedure could be used to estimate if the actual size of the table differs from the expected size as all the input metrics are available - pct_free and the tablespace name from the properties of the table and the number of rows and the average size of the row metrics are collected while gathering the statistics for the table. The same procedure can be used for table partition segments as well as the same metrics are available.

To make things simpler, I wrote a function that returns the estimated size of the segment, so I can use it in query (it returns null in case of error, errors usually happen if any of the input data is null or provided number or size of rows is 0):

I wrote a query that estimates the size of top 500 largest table and table partition segments and reports them sorted by estimated size of the “wasted space” in descending order. It uses information in DBA_SEGMENTS, DBA_TABLES and DBA_TAB_PARTITIONS to provide the estimates. Try it in your test environment, the results might surprise you! And I guess you’ll know how to change it if estimates for other tables are needed. Here it is!

I found the actual benefit after moving the table can be different - for some tables it was as estimated, but for some it was not. Even if the statistics were gathered having estimate_percent=100 the situation was not better. I think the procedure is not able to handle all the cases properly, but anyway, that’s a very good starting point to find where the space might have been wasted.