This post is about one of those performance housekeeping tasks I do almost automatically, without really thinking about it, and which I have been meaning to mention for ages.

There can be quite a lot to gain by checking out full scans on “small” tables and seeing if they are as efficient as they can be. Sometimes they are, sometimes they are not. Often they are “good enough”. Occasionally they are awful.

Recently I was casting an eye over the “top 20” SQL statements on a system, looking for any code generating a lot of consistent gets, and I came across the below:
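The sort of query I use for that trawl looks something like this sketch – the columns picked and the top-20 cut-off are just my habit, and you need access to V$SQL:

```sql
-- Sketch: top 20 statements by total buffer gets, with gets per execution
SELECT *
  FROM (SELECT sql_id,
               executions,
               buffer_gets,
               ROUND(buffer_gets / GREATEST(executions, 1)) AS gets_per_exec,
               SUBSTR(sql_text, 1, 60)                      AS sql_text
          FROM v$sql
         WHERE executions > 0
         ORDER BY buffer_gets DESC)
 WHERE ROWNUM <= 20;
```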

It caught my eye as that struck me as a lot of buffer gets for what I thought was quite a small table, and the statement is executed quite often – a few thousand times a day in this case. It takes 2,500 buffer gets per COUNT(*). All buffered, but the CPU still has to churn through those blocks.

The table IS small, only 73 rows. It is also interesting that the stats are out by quite a percentage, but the table is still seen as small according to the stats. This indicates the table is quite volatile but consistently small. Another thing to notice is that the number of distinct values for each column in the WHERE clause is only 4, so an index lookup would not be very selective. With such a small number of records in the table and such an unselective index I “know” that an index lookup would not be of much benefit {BTW everything that you “know” in this way – do check it in reality every now and again…}. So I don’t look at adding an index; I look at why such a small table needs so many buffer gets.

So, why so many buffer gets?

First of all, I need to be careful. The original source of information I was looking at was V$SQL, and that holds cumulative stats for all executions of the same SQL statement. Thus the buffer gets could have been very high a while ago (say, yesterday) and now be small, so the average is 2,500 buffer gets per select but right now it is running fine. So I verify that the code is still doing a lot of work for such a small number of records:
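A quick way to verify is to re-run the statement with autotrace statistics on – something like the following, using the table and columns that appear in the plan (SNP_SESSION, with the AGENT_NAME and SESS_STATUS predicates):

```sql
-- Run the statement once with statistics shown, and watch "consistent gets"
SET AUTOTRACE ON STATISTICS
SELECT COUNT(*)
  FROM snp_session
 WHERE agent_name  = 'DEV_ODI_AGENT'
   AND sess_status = 'D';
```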

That’s still a lot of consistent gets to scan all of 75 rows and do a count (which is the SORT AGGREGATE step). A second to process and 2,013 consistent gets to look at 75 rows in a skinny table – just trust me, it is a skinny table. And the optimiser costed the full-scan step at 546 “equivalent single IOs”.
{Oh, if anyone is wondering about the “table access storage full” step and its predicate “storage(“SESS_STATUS”=’D’ AND “AGENT_NAME”=’DEV_ODI_AGENT’)”, it’s because this system is an Exadata box. Just ignore this for now; it is not relevant to this issue. I’ll blog about it *next week*.}

The table is 2,048 blocks in size and the consistent gets to look at it are 2,013. That’s close.

The problem is that when you full scan a table, Oracle scans up to the High Water Mark (HWM) of that table, i.e. the highest block it knows has ever held data. As Oracle inserts new data into a table and extends it, it keeps a note of the highest block data has been put into – the High Water Mark. When you delete data from a table, Oracle marks individual blocks for reuse as they empty (drop below a certain percentage full), but it does NOT check whether enough blocks have been completely emptied that the HWM could be lowered. In fact, there could well be a few records in the first block of the table and then nothing until a few records in the last block ever used – the HWM; Oracle still scans every block up to it, counting relevant records as it goes. This is actually a very long-standing performance tweak: you can allocate GBs of space to a table, but Oracle knows the very highest block it has put data into for that segment and only scans up to the HWM. Usually, the HWM is useful.
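You can see the effect for yourself with a scratch demo along these lines – HWM_DEMO is a made-up table, and the exact block counts you see will depend on block size and PCTFREE:

```sql
-- Create a scratch table and fill it with 100,000 padded rows
CREATE TABLE hwm_demo AS
SELECT ROWNUM AS id, RPAD('x', 100, 'x') AS padding
  FROM dual CONNECT BY LEVEL <= 100000;

-- Empty it with DELETE: the rows go, but the HWM stays where it was
DELETE FROM hwm_demo;
COMMIT;

-- A full scan still reads every block up to the HWM
SET AUTOTRACE ON STATISTICS
SELECT COUNT(*) FROM hwm_demo;
-- zero rows returned, yet the consistent gets remain in the thousands
```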

I happen to know that this table used to hold several thousand records but is now used far less extensively and holds only a few, which are constantly deleted and replaced by new ones. I have seen {very regularly, at least once a year} some extreme situations where a table held many millions of rows at one point, setting the HWM several hundred thousand blocks into the table segment, but the table now holds only a few hundred records. A full table scan still scans all the way up to the HWM, taking several minutes to access those few hundred records. Often such a table is being inefficiently cleared with a “DELETE FROM table_x” statement and then re-populated.

What is the answer? Well, I could TRUNCATE the table, which would reset the HWM to zero, but this table actually holds some real data, so I can’t truncate it in this situation. I could MOVE the table, i.e. get Oracle to recreate the table segment and move the existing data into it. That would set the HWM of the new segment to block one and then raise it as the data is inserted neatly into the first few blocks. The third option, and the one I took, is to SHRINK the table. This basically tells Oracle to shuffle the data down to the start of the table and reset the HWM. You can do this on the fly and I have done it on relatively active systems, but I must give the usual advice to test this yourself, and you may hit locking issues on a busy system that is using the table you shrink very hard. Also, you do need to enable row movement on the table for the shrink to work, as is demonstrated below:
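For reference, the shrink itself is only a couple of statements – shown here against a hypothetical table HWM_DEMO rather than the real one:

```sql
-- Row movement must be enabled first, or the shrink fails with ORA-10636
ALTER TABLE hwm_demo ENABLE ROW MOVEMENT;

-- Compact the data down to the start of the segment and reset the HWM
ALTER TABLE hwm_demo SHRINK SPACE;

-- Optionally shrink any dependent indexes at the same time
ALTER TABLE hwm_demo SHRINK SPACE CASCADE;
```

On a busy system you can split the work: SHRINK SPACE COMPACT does the row shuffling without moving the HWM, leaving a final quick SHRINK SPACE to take the brief lock and lower it.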

Fabulous. The run time is very short and the number of consistent gets is tiny. You might notice that the explain plan still thinks the cost of the full scan will be 546 (i.e. the equivalent of 546 single-block reads), but that is because I have not re-gathered the stats. I left the system to do that automatically that night.
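If you would rather not wait for the overnight job, the stats can be re-gathered by hand – the table name here is again a stand-in for the real one:

```sql
-- Re-gather stats so the optimiser sees the new, much smaller segment
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'HWM_DEMO');
END;
/
```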

I’ll just mention one more thing. Shrinking a table is a bit like the old trick of rebuilding indexes to compact them. People misunderstood what rebuilding indexes achieved, and why the indexes were so much larger than they “needed” to be, and would spend CPU and IO rebuilding indexes overnight – only for them to quickly explode back to their original size (creating a lot of overhead as they did). Shrinking oversized tables can have a great benefit, but keep in mind that if some process pushes the HWM right up every day by shoving lots of data into the table and then clearing it down later, shrinking the table is just a waste of time. It gives benefit when the table really has shrunk down in size for a while, or only occasionally explodes to a massive size. I have in the past written “regular shrink” scripts for tables I know explode once in a while (e.g. once a week) but are usually small – but you need to know what the pattern is. Don’t just blindly shrink tables.
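For what it’s worth, a “regular shrink” script need be nothing grander than a scheduled block like this sketch – the table name and the crude “oversized” test are purely illustrative, and it assumes a simple non-partitioned table:

```sql
-- Sketch: shrink a known-volatile table only when its segment is far
-- larger than its current row count warrants
DECLARE
  v_blocks NUMBER;
  v_rows   NUMBER;
BEGIN
  SELECT blocks INTO v_blocks
    FROM user_segments
   WHERE segment_name = 'HWM_DEMO';   -- illustrative table name

  SELECT COUNT(*) INTO v_rows FROM hwm_demo;

  IF v_blocks > 1000 AND v_rows < 1000 THEN   -- thresholds to taste
    EXECUTE IMMEDIATE 'ALTER TABLE hwm_demo ENABLE ROW MOVEMENT';
    EXECUTE IMMEDIATE 'ALTER TABLE hwm_demo SHRINK SPACE';
  END IF;
END;
/
```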

For me in this situation, that code scanning that table dropped out of my “top 20 most demanding SQL statements” and has stayed out of it for weeks. The code now completes in considerably under 0.1 seconds as opposed to a whole second – but no one has noticed. I’d removed about 5% of the workload of the system – but no one noticed. It took me half an hour to investigate and fix, and I’d freed up a non-trivial chunk of processing resource. In my performance-tuning heart I know I did a good job that day.

Story of my life when I first had a look at our DW, a few years ago. Apparently an “expert consultant” had swooped in and told our developers they had to use APPEND for top performance. On all large tables. I love “hit-and-run expertise”…

You should feel good about the effort as long as there was no pressing issue where current response time for something unrelated was more important. You have increased the remaining headroom for both horsepower and scaling by eliminating the wasteful scan of empty blocks. And you have warned against getting on a compulsive tuning disorder (CTD) treadmill of indiscriminately rebuilding tables.

Dom mentioned one of the several ways to get a gratuitous scan up to the high water mark (HWM) when there is an unrestricted full table scan (FTS) – something that can also be a huge problem if the scan has a stop-key count or is run in service of a WHERE EXISTS. I call this problem “Empty Front.”

If any of the application code really just needs one row and there is no index with the appropriate columns, this can turn what should be a minimal scan into an arbitrarily large one. For cases where routine maintenance is problematic, I’ve written an enhancement request on Oracle Mix about allowing you to set a low water mark for a given table and monitor it. Of course this is unlikely to be a good general-case solution, so it would have to be something you could turn on and off per table. I encourage folks to read it and consider voting for it.

Another “your mileage may vary” solution to this problem is that, IF the population of the table has settled down for a while, it may make sense to build an all-columns index, or an “all the columns I ever use in a query” index. Of course which solution is best, or even workable, will vary, and the risk of spending more trouble on this than it is worth is real.

Thanks for highlighting a very real situation that often wastes substantial resources without popping into the visibility of currently affecting response time.

As you say, fixing these background issues is something you do when you are not faced with more urgent issues that are threatening SLAs. Having said that, I’ve always been keen to dedicate some time to such things in order to maintain the horsepower headroom your system has.

That’s a good point about the all-columns index. An index in my example would probably have been a good alternative fix, especially if the number of blocks had been, say, 2 or 4 times higher (so still a “small” table) and the selectivity of the columns 3 or 4 times better. The index would have picked out the few blocks of interest from the mass of empty or mostly empty ones.

I’ve got another blog to do on a closely related matter, which is about scans of smallish tables which have indexes and return only one value – but where a slight index tweak can again reduce the effort from a wasteful index or even table scan to a very efficient index scan. Again, it may well not help meet any SLAs, but it reduces an overhead on the system.

…which means that the table SNP_SESSION resides in an Exadata machine. But I don’t think a smart scan is done on this table because it is so small, so a direct path read will not be chosen to full scan it. Is this correct?

Then I am wondering: in case this table was big enough to be full scanned via a direct path read, would a shrink be of any help in improving the performance? In other words, what is the effect of the HWM for tables that are smart scanned?

You are right, this is on Exadata. I plan to do a short blog about what is going on here (though I am NOT an Exadata expert). In this case, the table is small enough that it is not being smart scanned. If it were being smart scanned, the benefit (or not) would depend on a few things, such as the size of the table, whether the storage indexes are used and how busy the storage cells are. One of the “problems” with Exadata is that effort can hide on the storage cells. It may well be fine to have the storage cells take the load but, as with all computer systems, generally speaking it is better not to do stuff you don’t need to do.

As a matter of routine, I always check the nightly ADDM run using OEM DB Console to look at what savings can be made. We still use multi-master replication and the ADVREP tablespace can grow to over 8GB in size. There are no indexes on the DEF$AQCALL table and the number of rows drops down to zero as deferred transactions are cleared. The table is always accessed via full table scans. Dropping the high water mark regularly improves replication performance.

One question: is there any way to find tables which have more than 50% free space? i.e. if I have a table which contains 10 blocks (let’s say each block can hold 10 rows), then 100 rows in total could be stored in those 10 blocks, but it holds only 10 rows spread over different blocks – so there is a good amount of free space in each block. Is there any way to find the blocks which have 50% empty space?

I was going through a metalink note on reducing space usage in the SYSAUX tablespace (AWR Data Uses Significant Space in the SYSAUX Tablespace (Doc ID 287679.1)) and it recommended shrinking the wrh$_active_session_history table on 11.2, but did not mention enabling row movement. So I sent my query on the note to Oracle and they responded by saying “You don’t have to enable row movement, by default this is disabled”. So in 11.2, do we need to enable row movement before a shrink operation?