KEEP Pool – What is Wrong with this Quote?

August 28, 2011

The book “Oracle Database 11g Performance Tuning Recipes” is out of the alpha stage – if you bought the alpha version from Apress, you should now be able to download the finished book. I am currently working through the review notes that I collected while reading the alpha version, to see whether any problems I found in the alpha version also appear in the final printed version of the book. My final review of this book could be interesting… stay tooned (yes, an intentional typo).

One of the sections of the book that survived the alpha stage is Recipe 3-2, Managing Buffer Pools, found on page 87. The first paragraph under the Solution heading states the following (note that I removed a couple of sentences from the middle of the paragraph, so the words in [brackets] were added to paraphrase the removed sentences):

“You can use multiple buffer pools instead of Oracle’s single default buffer pool, to ensure that frequently used segments stay cached in the buffer pool without being recycled out of the buffer pool… For example, if you want a segment to be cached (or pinned) in the buffer pool, you must specify the KEEP buffer pool [when creating the segment].”

*** Note that this section of the book is NOT suggesting the use of multiple block sizes in a single database, but is trying to suggest why KEEP and RECYCLE buffer pools should be used in addition to the DEFAULT buffer pool.
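For readers who have not used the feature, assigning a segment to the KEEP pool is done through the segment’s storage clause. A minimal sketch (the table and column names here are invented for illustration):

```sql
-- Hypothetical example: assign a table to the KEEP buffer pool at creation time.
CREATE TABLE t_keep_demo (
  id  NUMBER,
  val VARCHAR2(30)
) STORAGE (BUFFER_POOL KEEP);
```

Whether that assignment actually guarantees the blocks stay cached is, of course, the question posed below.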

What, if anything, is wrong with the above quote from the book?

The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true. It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past). If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.

** Based on what you wrote, I think that you understand the point of this blog article. ** If you have a copy of the entire book (I see that you are listed as the technical reviewer of the book), take a close look at the rest of Recipe 3-2 to see if anything else in that Recipe should be mentioned.

“if you want a segment to be cached (or pinned) in the buffer pool”
That, to me, is the biggest problem. “Pin” and “cache” are not an “or” proposition here. The buffer pool WILL cache, regardless of the KEEP pool setting.

You touched on an important point here. One might be wondering: what is the purpose of the CACHE keyword in a CREATE TABLE command? It would be redundant if it meant the same thing as using the KEEP buffer pool.
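Note that the CACHE clause is specified independently of the buffer pool assignment, which itself suggests the two are not the same thing. A hypothetical example (table name invented):

```sql
-- Hypothetical example: CACHE is specified separately from the buffer pool.
CREATE TABLE t_cache_demo (
  id NUMBER
) CACHE
  STORAGE (BUFFER_POOL DEFAULT);
```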

“You can use multiple buffer pools instead of Oracle’s single default buffer pool, to ensure that frequently used segments stay cached in the buffer pool without being recycled out of the buffer pool… ”

I think this is worded generally enough to be correct. 🙂 You might want to argue if segments are cached in the buffer pool or blocks, but well…

“For example, if you want a segment to be cached (or pinned) in the buffer pool, you must specify the KEEP buffer pool [when creating the segment].”

Noons already commented on “cached or pinned”.

My additional 2 cents: specifying KEEP does not guarantee that the buffers are kept in the buffer cache. KEEP, DEFAULT, and RECYCLE are just three different LRU lists that can help you prevent “unimportant” blocks from aging “important” blocks out of the buffer cache, by classifying their importance with regard to lifetime in the buffer cache.
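As background, the KEEP and RECYCLE pools only exist if they are explicitly sized; something along these lines, with the values being arbitrary examples rather than recommendations:

```sql
-- Hypothetical sizing example - values are arbitrary and instance-specific.
ALTER SYSTEM SET db_keep_cache_size    = 512M;
ALTER SYSTEM SET db_recycle_cache_size = 256M;
```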

The CACHE keyword in a CREATE TABLE command indicates that blocks retrieved for the table should go to the top of the (KEEP, DEFAULT, or RECYCLE) LRU list, whereas by default those blocks go to the lower end of the list, where they age out quickly. I think, but I am not sure, that the default behaviour for indexes is that blocks retrieved for them go to the top of the LRU list, so the CACHE keyword makes blocks for table segments act like blocks for index segments with regard to aging out.
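The current CACHE and buffer pool settings for a schema’s tables can be checked in the data dictionary; a sketch (output columns may vary slightly by release):

```sql
-- Show CACHE and buffer pool assignment for the current schema's tables.
SELECT table_name, cache, buffer_pool
FROM   user_tables
ORDER BY table_name;
```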

On a side note, I think the default behaviour for index organized tables in this regard is like that of indexes (or of tables with the CACHE option). This means that an (accidental) full table scan of a (large) IOT will really mess up the buffer cache (buffer cache hit ratio going down).

You mentioned several points that will be in my final review of the book for recipe 3-2 (I did not think that I provided enough of a quote for you to make those observations 🙂 ). I was not aware that index blocks go to the top end of the buffer pool – I thought that all blocks were inserted in the middle (of course, it was not long ago that I thought that blocks read by full table scan were inserted at the LRU (least recently used) end). Kind of makes you wonder what happens when serial direct path read is used in 11.1.0 and later rather than db file scattered reads – would we still be ensuring that frequently used segments stay cached in the buffer pool?
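On the serial direct path read question, one rough way to see whether a full scan bypassed the buffer cache is to compare the session’s direct read statistics before and after the scan; a sketch (statistic names are from my recollection of 11g and should be verified against the release in use):

```sql
-- Sketch: check whether the current session performed direct path reads.
SELECT sn.name, ms.value
FROM   v$mystat ms, v$statname sn
WHERE  ms.statistic# = sn.statistic#
AND    sn.name IN ('physical reads direct', 'table scans (direct read)');
```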

The following is from my review for recipe 3-2 of this book (so far I have only double-checked recipes 3-1 through 3-10, so only the review of those sections has been finalized… I might need to reduce the sarcasm when the book review is posted):

• In recipe 3-2 the authors state, “if you want a segment to be cached (or pinned) in the buffer pool, you must specify the KEEP buffer pool [when creating the segment].” Something just doesn’t seem to be right (or complete) about this statement – for one, I wonder what the CACHE and NOCACHE clauses might affect? Secondly, this statement might be implying that blocks belonging to an object will only remain in the buffer cache if those objects are assigned to the KEEP buffer cache. The recipe states that the approximate size of the KEEP buffer pool can be determined by “summing up the size of all candidate objects for this pool” – great, now what happens when Oracle Database creates up to six (and potentially many more) consistent read versions of the objects’ blocks (reference)? The last two paragraphs of this recipe seem to indicate that it is the segments themselves (apparently all blocks) that are aged out of the buffer cache when space is needed to cache other blocks, rather than the segments’ blocks that are aged out of the cache – additional clarity is needed in these two paragraphs. The second to last paragraph also makes a claim that I am not sure how to interpret: “Oracle figures that if a single segment takes up more than 10% of (nonsequential) physical reads, it’s a large segment, for the purpose of deciding if it needs to use the KEEP or RECYCLE buffer pools.” The SQL statement provided in recipe 3-2 is also found in the Oracle Database 9.2 Performance Tuning Manual, but that SQL statement potentially produces misleading information. (pages 87-88)
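Regarding “summing up the size of all candidate objects for this pool”, the obvious starting point would presumably be something like the following sketch – which, as noted above, says nothing about the extra space consumed by consistent read clones of the blocks:

```sql
-- Sketch: approximate size of the segments currently assigned to the KEEP pool.
SELECT SUM(bytes)/1024/1024 size_mb
FROM   dba_segments
WHERE  buffer_pool = 'KEEP';
```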

I would suggest that Marcus’ explanation is as good or better than mine, and Nuno and Surachart both touched on important points.

> default behaviour for indexes is that blocks retrieved for them will go to the top of the LRU list …

Marcus,

I think I’ve read somewhere that the distinction is along the lines of “sequential reads” vs. “scattered reads”, and then subject to the “small table threshold”. I have this (oversimplified and outdated) mental model for the buffer cache policy:
1. scattered reads from large segments go into the cold end of the LRU list.
2. everything else (either small or read via a “sequential read”) goes into the middle of the LRU list.
3. CACHE moves segments from Case 1 to Case 2 in the DEFAULT pool and any nK-pools.
4. The KEEP pool has only Case 2.
5. The RECYCLE pool has only Case 1.

I did some experiments several years ago (Oracle 9iR2), and got the impression that the KEEP and DEFAULT pools indeed have different policies for ageing stuff out. I would start a full scan of a large table and repeatedly count the number of buffers with its objd in v$bh. For the DEFAULT pool, the number of buffers grew as long as there were any buffers with status “free,” at which point it stopped growing; if there weren’t any free buffers at the start of the scan, the number of blocks cached from that table never got higher than db_file_multiblock_read_count. In the KEEP pool, on the other hand, the full scan would throw everything else out of the cache.
Meanwhile there are multiple subpools and therefore multiple LRU lists, so things are definitely not so simple anymore.
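For anyone wanting to repeat the experiment described above, the buffer count per object can be obtained with a query along these lines (OBJ_NAME is a placeholder for the table being scanned; repeat the count while the full scan runs):

```sql
-- Sketch: count the cached buffers for one object during a full scan.
SELECT COUNT(*) cached_buffers
FROM   v$bh
WHERE  objd = (SELECT data_object_id
               FROM   user_objects
               WHERE  object_name = 'OBJ_NAME');
```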

BTW, I think no blocks can go directly to the top (the hot end) of the LRU; buffers have to “earn” their way there by increasing their touch counts (which is pretty hard, since not every “get” counts as a “touch”).

> you must specify the KEEP buffer pool [when creating the segment].
I’m not sure whether this is a paraphrasing artefact, but if this is correct, then I am wasting 90 GiB of RAM on my empty KEEP pool, since I assigned the segments using ALTER … STORAGE (buffer_pool keep)! OMG!
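For completeness, the command alluded to above would look roughly like this (table name is a placeholder):

```sql
-- Move an existing segment to the KEEP pool - no table rebuild required.
ALTER TABLE some_big_table STORAGE (BUFFER_POOL KEEP);
```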

Regarding where buffers go in the LRU list(s): different pieces of Oracle documentation, namely the Concepts Guide and the Performance Tuning Guide, give different information about where buffer pointers go in the LRU list; sometimes it says the tail or head of the list, sometimes it says the middle of the LRU list.

I tend to believe buffers go to the tail or into the middle, and never directly to the head of the list, simply because that would make sense: a block with a high touch count should be higher in the list than a block that was just loaded into the buffer cache.

It is somewhat a question of which documentation, book, ML note, expert blog post, etc. you believe, since there is no way (at least none that I know of) to find out the position of a block in an LRU list.
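While the exact LRU position does not appear to be externalized anywhere, the touch counts at least are visible (to a suitably privileged user) in X$BH. A hedged sketch – X$ structures are undocumented, so the column names may differ across releases, and OBJ_NAME is a placeholder:

```sql
-- Sketch (connect as SYS): distribution of touch counts for one object's
-- cached buffers. X$BH is undocumented and version-specific.
SELECT tch, COUNT(*) buffer_count
FROM   x$bh
WHERE  obj = (SELECT data_object_id
              FROM   dba_objects
              WHERE  object_name = 'OBJ_NAME')
GROUP BY tch
ORDER BY tch;
```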

@Flado:

I believe your mental model is closer to reality than the model I gave.

> I did some experiments several years ago (Oracle 9iR2), and got the impression that the KEEP and DEFAULT pools indeed have
> different policies for ageing stuff out. I would start a full scan of a large table and repeatedly count the number of buffers
> with its objd in v$bh. For the DEFAULT pool, the number of buffers grew as long as there were any buffers with status “free,” at
> which point it stopped growing; if there weren’t any free buffers at the start of the scan, the number of blocks cached from
> that table never got higher than db_file_multiblock_read_count. In the KEEP pool, on the other hand, the full scan would throw
> everything else out of the cache.

I think what you saw in your test can directly be explained by your points 1, 2 and 4.

However, I guess the thrust of this article is that you cannot guarantee that an object placed in the KEEP cache will stay in the KEEP cache, particularly if you size the pool to the object requirements. I read from the “buffer-states” article by Jonathan Lewis referenced above that cloned consistent read blocks may spill out of the KEEP cache if it is not sized sufficiently large.

Taking the article at face value might provoke the question “Why isn’t my object in the KEEP cache?” from the unwary.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:
<pre>

SQL> SELECT
  2    SYSDATE TODAY
  3  FROM
  4    DUAL;

TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
  2    SYSDATE TODAY
  3  FROM
  4    DUAL;

********************
Greater than and less than signs in code sections are often interpreted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: