It covers how using ALL_TAB_MODIFICATIONSand the NUM_ROWS column in ALL_TABLES can be combined to get a very accurate estimate of count(*) for most tables in most situations.

If this does not work for you {e.g no privilege to see the tables or flush the information), or you absolutely must have the count as accurately as you can right now, as a Manager is threatening you, this is another method using a count table. It can be expanded to hold more than just count(*) and is particularly useful for partitioned tables {which are often huge}.

Basically, every time you select count(*)on a table {or even eg select count(*) where status=0} you are probably scanning a lot of table you scanned last time you did count(*) on that table. If you are not deleting records from that table, that scanning effort is effectively a waste of time – it won’t have changed since last time you did select count(*). So, as a developer or DBA, you can decide you can trust the last count(*), so you need only count more recent records.

If you have an ascending primary key {traditionally a number sourced from a sequence} you can store the max(PK) and count(*) in a simple table. You then count all the records with a PK greater than the stored one. I’ve used count tables several times in the past and two enhancements invariably crop up, so I am going to include them in the below example of how you can implement this.

You need the partitioning option to run the test yourself, so sorry you will need Enterprise Edition with the option. If you downloaded Oracle to your PC/linux box to play on, you probably have it. {it never ceases to amaze me that Oracle Corp will only allow Partitions to be used in EE edition and it is an extra cost option!}

You need to create tablespaces to hold the partition tables {or alter the test script to use existing tablespaces}. Feel free to take and modify this script to create the tablespaces.

So, it takes 111 consistent gets and an estimated cost of 26 to do a count(*) on this relatively small table, doing a fast full scan of the PK index. It takes only slightly more effort to create a record to hold this information.

{As an aside, it is curious that the total estimated cost of the insert is the same as the select, suggesting naughtily that the insert is free, but that’s not my topic today}.

The CBO scanned the index of the TEST_P_RUNNING_COUNT table using an index full scan (min/max), which means Oracle basically worked down the index to get a min or max value, which is highty efficient. This value was then used to do an index range scan of only one index partition (partition number 5).

Estimated cost was 4 and the execution took 3 consistent gets {NB this is for the second execution, it is 12 consistent gets an 4 recursive calls the first time, when the statement is parsed}.

But that only got the new max(id) and how many records have been created since you last recorded it, 100. What you really want is the count of records since you last recorded the count and max(id) PLUS the count as it was then. ie the full count now.
This is not straightforward in a single SQL statement as you want the count, a goup function, from one table and the single row value from another. You can do this though. You basically write SQL statements to do the bits you want and then select from them. My example script shows how I build up to this, but the final statement I came up with {and you might well be able to come up with better options yourself} is:

{you might want to click on “show plain” in the above code window for a clearer layout}
That’s an estimated cost of 7 by the CBO and 10 consistent gets {again, the second itteration}. Remember, scanning the whole partitioned table was costed at 27 and took 111 consistent gets.

You can convert that sql select statement into an insert and use it to update your count table:-

That is one of the improvements usually asked for when you have a count table, people want to see what the counts were historically, seeing as you are now holding the data in a table {why is it so many people suddenly want to keep information once it is seen in a table?}. That is why the above does an insert and not an update and my code selects the max(max_id) and max(rowcount). It is not perfect, if no records have been inserted into your massive table since last the max(id) and row count was inserted, you will get a duplicate primary key error. You could add a datetime column to get around this.

Also, my SQL is a little naughty in that I select the max(id) and max(rowcount). There is nothing in the table design to enforce the unspoken rule that they both appear in the “last record” of the table, but as a human you can see that they do. Again, the use of a datetime can help with this.

The final bit for today. The above two sql statments look a bit nasty, certainly a lot more complex than “select count(*) from table”. Also, they are not as efficient as they could be, the selecting max(row_count) is not supported by an index and could slow down over time. I could tweak the SQL statement even further to work around this, but a simple piece of PL/SQL does the job better, and is what I usually end up implementing {within a package, along with a load of other count table functions for several large tables}.

And, as you can see below, it remains an efficient trick. I’ll leave it to you to create a procedure to update the count table and with the observation that you can then argue for several hours as to how often you update the count table… :-)