Confessions of an Oracle Database Junkie - Arup Nanda
The opinions expressed here are mine and mine alone. They may not necessarily reflect that of my employers and customers - both past or present. The comments left by the reviewers are theirs alone and may not reflect my opinion whether implied or not. None of the advice is warranted to be free of errors and ommision. Please use at your own risk and after thorough testing in your environment.

During the recently concluded Collaborate 11 (http://collaborate11.ioug.org) I was asked a question which led me to the idea for this entry - the fourth in the series. If the database size is 100GB and you have a 100GB buffer cache, all the blocks will be in the memory and there will never be an I/O to the datafiles other than the regular DBWR lazy writes, right?

This is a very important question and you must consider the implications carefully. Many folks assume that by getting a large buffer cache eliminates or reduces the buffer related I/O - a very wrong assumption.A large buffer cache helps; but the relationship between buffer and block is not one to one. A block may have more than one buffer in the buffer cache. How so? Let's see how that happens. We will examine this in a non-RAC database to keep it simple.

Note how I used char(2000) instead of varchar2(2000). The reason is simple. The char datatype takes up entire string of values, all 2000 of them regardless of the actual value placed there. So, even though I put 'x' there, the entire row will be quite a long one.

After the insertion, check which blocks these rows went into. We can do that by calling dbms_rowid.rowid_block_number() function.

This completes the setup. In summary, we know that these rows are in blocks 4419 and 4423 and the object ID is 99360.

Experiment

If possible, clear out the cache by shutting down and restarting the database. This brings up an empty cache. It's not absolutely necessary though. Now select from the table:

SQL> select * from bhtest;

This will bring up all the blocks of the table into the buffer cache, To check for the same, you can check the view V$BH (the buffer headers). The column OBJD is the object_id. (Actually it's the DATA_OBJECT_ID. In this case both are the same; but may not be in all cases). Here are the columns of interest to us:

There are 7 buffers. In this example we have not restarted the cache. So there are two buffers for the segment header. There is one buffer for each data block - from 4419 to 4423. The status is "xcur", which stands for Exclusive Current. I will explain that in detail in a later blog. But in summary it means that the buffer was acquired (or filled by a block) with the intention of being modified. If the intention is merely to select, then the status would have shown CR (Consistent Read). In this case since the rows were inserted modifying the buffer, the blocks were gotten in xcur mode. From a different session update a single row. For easier identification I have used Sess2> as the prompt:

There are 8 buffers now, up one from the previous seven. Note there are two buffers for block ID 4423. One CR and one xcur. Why two?

It's because when the update statement was issued, it would have modified the block. Instead of modifying the existing buffer, Oracle creates a "copy" of the buffer and modifies that. This copy is now [Note there was a typo earlier "not", it should have been "now". Corrected. Thanks to Martin Bex] XCUR status because it was acquired for the purpose of being modified. The previous buffer of this block, which used to be xcur, is converted to "CR". There can't be more than one XCUR buffer for a specific block, that's why it is exclusive. If someone wants to find out the most recently updated buffer, it will just have to look for the copy with the XCUR status. All others are marked CR.

Let's continue with this experiment. From a third session, update a different row in the same block.

There are 4 buffers for block 4423 now - up from 2. What happened? Since the buffer was required to be modified once more, Oracle created yet another "copy", marked it "xcur" and relegated the older one to "cr". What about the extra CR copy? That was done because Oracle had to perform something called CR processing to create a CR copy from another CR copy or an XCUR copy.

You can notice how the number of buffers proliferate. Let's change the experiment a little bit. From a 4th session, select from the table, instead of updating a row:

Whoa! there are 12 buffers now. Block 4423 now has 6 buffers - up from 4 earlier. This was merely a select statement, which, by definition does not change data. Why did Oracle create a buffer for that?

Again, the answer is CR processing. The CR processing creates copies of the buffer and rolls them back or forward to create the CR copy as of the correct SCN number. This created 2 additional CR copies. From one block, now you have 6 buffers and some buffers were created as a result of select statement. This should answer the question whether the buffer cache of size of the database will be able to hold all the buffers.

Free Buffers

While being on the subject, let's ponder over another question - what happens when you flush the buffer cache? Let's see.

The buffers are still there and marked as belonging to the object. However the status is "free", i.e. the buffers can be reused if some session wants them for some other block. If a session wants a free buffer and can't find one, it waits with the wait event "free buffer wait". At that point, Oracle makes room in the buffer cache for the blocks requested by the session by forcing the buffers out of the buffer cache. The CR copies are discarded (since they were not updated) and the XCUR copies were written to the disk (if not written already).

Who does the writing? It's the process known as DBWR - Database Buffer Writer, which is generally named DBW0, DBW1, etc. The DBWR is a very lazy process. It sleeps most of the time, unless it is woken up by someone. In this case the session (actually the server process) kicks DBWn to write the dirty buffers to the disk and change the status to non-dirty. This is why sometimes SELECT statement may generate writing to data files. Until the buffers are freed up, the session waits patiently and displays to all that it's waiting on free buffer waits. You can check it by selecting the EVENT column from V$SESSION.

Impact on Segment Header

So far we talked about the data block. When does the segment header see some action? Segment header does not contain any data; so it does not need to be updated every time an update is made. But when the segment itself is modified, segment header is updated. Let's see an example with adding a column:

There are 6 copies of the segment header. Table alteration changed the segment header block - block# 4418. Just like any other block, the buffer was copied over and modified, creating multiple copies of the block. Whe you issue another DDL operation - truncate - the result is the same:

There are now additional segment header buffers, since truncate is a DDL command and segment header is modified. Even though there was no block of the table in the buffer cache (remember, we flushed it), the segment header still needs to be updated. Truncate also marks all blocks as free, and resets the high water mark and updates the bitmap block. The bitmap block (BMB) is used in tablespaces with automatic segment space management (ASSM) to display whether a block is free or not, which is similar to the functionality of freelists. The truncate caused the BMBs (there are two - 1st level and 2nd level) to be modified as well and they also come into the buffer cache.

Takeaways

From the above discussion you saw how a table with just two blocks populated fills up the buffer cache with 20 buffers. Imagine a normal database with, say 10000 filled blocks (8KX10,000 = 80M). It might easily fill 200,000 buffers. With a 8K block size that amounts to 8 K X 200 K = 1600M, or about 1.6 GB of buffer cache. The amount of buffers taken up depends on several factors:

(1) less buffers consumed in selects
(2) less buffers consumed if the commits occur more frequently, since the demand for CR processing is less
(3) more buffers are consumed if the activity updates buffers
(4) more the logical I/O, the more the need for buffers
(5) DDL activities increase the demand for the buffers, even if the blocks are empty

Food for Thought

Now that you know how the buffers are consumed, what do you think the situation is in a RAC environment where buffers are transferred from one instance to the other - better, worse, or about the same? The answer will come in a future blog. Stay tuned.