What Could Go Wrong? Testing the DB_FILE_MULTIBLOCK_READ_COUNT Parameter


October 14, 2011 (Modified October 15, 2011)

When reading various Oracle Database books, I often perform mental exercises related to the book’s contents, asking “what could go wrong?” That is probably why some of my book reviews have extended to record setting lengths… who knew that Amazon imposed a limit on book review lengths…

I recently started re-reading the “Troubleshooting Oracle Performance” book after buying a PDF copy of the book. As previously mentioned, the ultimate reason for re-reading the book is so that I may compose a detailed book review, just as I have done for most of the Oracle Database related books that I have read in the past couple of years. So far, I have had few opportunities to ask “what could go wrong” – until of course I made it to page 178 during the second read-through of the book. That page includes a PL/SQL block that tests the performance effects of modifying the DB_FILE_MULTIBLOCK_READ_COUNT parameter from a value of 1 to 32, increasing the value by 1 for each pass through the loop in the PL/SQL block. What could go wrong?

What Could Go Wrong? (I had two or three answers for this question as I was reading the script, but I have a couple more after a bit of experimentation. Think about what might go wrong without looking in the book.)


—

Added October 15, 2011:

The “Troubleshooting Oracle Performance” book, to my knowledge, was the first performance tuning book to address Oracle Database 11.1, and it was published shortly after the release of Oracle Database 11.1.0.6. While it was not widely known at the time (I do not think that it was documented), Oracle Database 11.1.0.6 might employ serial direct path reads (directly into the PGA) rather than db file scattered reads (into the buffer cache) when performing a full table scan. Based on an experiment, the DB_FILE_MULTIBLOCK_READ_COUNT parameter influences the maximum number of blocks read in a single direct path read request; the maximum number of blocks read by direct path read appears to be the greatest power of two that is equal to or less than the value of the DB_FILE_MULTIBLOCK_READ_COUNT parameter. The paragraphs that follow this test case in the book suggest that a parallel full table scan should not be utilized – that implies that the book author is aware that direct path reads might distort the outcome of the test case. This was the first thought that I had when viewing this test case while re-reading the book.

The table blocks that are already buffered in the buffer cache are another potential source of distorted results for this test case (the paragraphs that follow the test case in the book address this issue). The test table is roughly 1,152MB in size (minus the blocks above the high water mark, of course), so how likely is it that some of the table blocks might be cached in the buffer cache either before the first execution of the SELECT from table T1, or between executions of the SELECT from T1? Does the answer depend on what was performed prior to the test case – how full is the buffer cache with unrelated blocks? What is possibly interesting is that in my test, all of the blocks belonging to the entire table were buffered in the buffer cache prior to the start of the PL/SQL block in the test case. Without the timing information displayed by SET TIMING ON, it is difficult to determine whether the DBMS_UTILITY.GET_TIME function simply is not supported on the operating system platform used for the test, or whether, as Timur stated in the comments section, the full table scans were completing so quickly that the timing of the individual runs was rounded to 0 seconds. I was careful not to include the SQL*Plus timing output in the initial posting of this blog article; repeated executions of the PL/SQL block produced total times ranging from 00:00:37.05 to 00:00:39.37.

The potential buffering of some or all of the table blocks was the second thought that I had while reading the test case during the re-read of the book. The paragraph in the book that followed the test case addressed the potential problems of the table blocks being cached in the buffer cache, the operating system cache, a SAN cache, or a similar cache. As previously mentioned, Oracle Database 11.1.0.6 and later are able to perform serial direct path reads where previous Oracle Database versions would have used db file scattered reads, so the Oracle Database version is potentially important. On page 175 the book mentioned three items that might cause fewer blocks to be read than the number specified by the DB_FILE_MULTIBLOCK_READ_COUNT parameter – those items might impact the results of the test if the test table is not representative of the other objects in the database that might be read by a db file scattered read.

While reading the test case, another thought raced through my mind: why is the author experimenting with the DB_FILE_MULTIBLOCK_READ_COUNT parameter at all – just leave it unset and make certain that system statistics are collected so that the MBRC system statistic is used (this is essentially the point that Mohamed made in the comments section). Since the release of Oracle Database 10.2.0.1 it has been possible to leave that parameter unset (and a bug fixed in 10.2.0.4 allowed the parameter to be set to 0, which has the same effect as leaving it unset). The book author covered the possibility of leaving the DB_FILE_MULTIBLOCK_READ_COUNT parameter unset in the paragraphs that followed the test case, and since the book attempts to help with performance problems in Oracle 9.0.1 and later, it probably is a good idea that the test case was included to benefit those who had not yet transitioned all of their databases to Oracle Database 10.2 or later.

Result caching, mentioned by Steve Catmull in the comments section, is not something that I had considered while reading the test case. If I remember correctly, when the RESULT_CACHE_MODE parameter is set to FORCE, the result cache could distort the results of the test case.

I thought that I would include a couple more test cases on Oracle Database 11.2.0.2. First, with an unset DB_FILE_MULTIBLOCK_READ_COUNT parameter and the following script:
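A minimal sketch of such a script, in the spirit of the book’s PL/SQL block (the table name T1, the 1 to 32 range, the hinted full table scan, and the DBMS_UTILITY.GET_TIME timing are taken from the description earlier in this article; the 10046 trace is my addition for observing the read sizes – treat the exact script as an illustration rather than the book’s original):

```sql
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';
SET SERVEROUTPUT ON

DECLARE
  l_start NUMBER;
  l_rows  NUMBER;
BEGIN
  FOR l_mbrc IN 1 .. 32 LOOP
    -- Adjust the parameter for this pass through the loop
    EXECUTE IMMEDIATE
      'ALTER SESSION SET db_file_multiblock_read_count = ' || l_mbrc;
    l_start := DBMS_UTILITY.GET_TIME;
    -- Force a full table scan of the test table
    SELECT /*+ FULL(T1) */ COUNT(*)
      INTO l_rows
      FROM T1;
    -- GET_TIME returns hundredths of a second
    DBMS_OUTPUT.PUT_LINE('MBRC ' || l_mbrc || ': ' ||
      (DBMS_UTILITY.GET_TIME - l_start) / 100 || ' seconds');
  END LOOP;
END;
/
```

The generated trace file shows, for each pass, whether the blocks arrived by db file scattered read or direct path read, and how many blocks each read request returned.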

In the above, note that the direct path read waits did not entirely adhere to the value of the modified DB_FILE_MULTIBLOCK_READ_COUNT parameter; the actual number of blocks read was rounded down to the next lower power of 2.

Let’s try again with another value for the DB_FILE_MULTIBLOCK_READ_COUNT parameter:
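A hypothetical trial with a value that is not a power of two (the value 42 is my illustration, not a value from the original test; per the behavior described above, direct path reads should be limited to 32 blocks, the greatest power of two not exceeding 42):

```sql
ALTER SESSION SET db_file_multiblock_read_count = 42;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

SELECT /*+ FULL(T1) */ COUNT(*) FROM T1;

-- In the 10046 trace file, the direct path read waits should report
-- at most 32 blocks per request (the p3 parameter), not 42.
```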

In the above, note that the direct path read waits again did not entirely adhere to the value of the modified DB_FILE_MULTIBLOCK_READ_COUNT parameter; the actual number of blocks read was rounded down to the next lower power of 2.

Let’s try again with another value for the DB_FILE_MULTIBLOCK_READ_COUNT parameter:
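Another hypothetical value, again expected to round down to a power of two (100 is my illustration; the direct path reads should be limited to 64 blocks per request):

```sql
ALTER SESSION SET db_file_multiblock_read_count = 100;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ FULL(T1) */ COUNT(*) FROM T1;
-- Expected in the trace file: direct path read waits of at most
-- 64 blocks per request (the greatest power of two <= 100).
```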

Other testing showed that if the buffer cache was flushed after creating and populating the table, but before collecting statistics, Oracle Database 11.2.0.2 performed serial direct path reads to read the table blocks, rather than reading those blocks into the buffer cache using db file scattered reads.

So far, the re-read of the book has not changed my original opinion of the book, as expressed in my original review. The updated review is currently 5 typewritten pages long through page 178 of the book, which is about on par with my review of the “Pro Oracle SQL” book. It appears that the errata pages for all books on the Apress site have disappeared, including the errata for your book. If I recall correctly, you have a page on your website that is dedicated to book errata. So that I am consistent with the practice used when reviewing other books, I will not look at that errata until my review is nearly finalized (and then likely will only mention where to find the errata). Such an approach will hopefully provide insight into what types of problems I might have overlooked in the other book reviews, and allow the book review for the “Troubleshooting Oracle Performance” book to be fairly compared with my other book reviews.

—

The test case is interesting, as others have mentioned in the comments below. I will try not to comment further in this article for at least 24 hours.

Timur, at first I thought that the DBMS_UTILITY.GET_TIME function was not functioning correctly on my operating system platform. :-) So, I enabled a 10046 trace and was surprised that buffered access (as you mentioned) was the cause of the unexpected test script output.

Steve – good point, automatic result caching might have been enabled – something that I had not considered.

Mohamed, good points, thanks for providing the link. Technically, because we are forcing a full table scan through a hint, we would not need to re-collect system statistics after each change of the DB_FILE_MULTIBLOCK_READ_COUNT parameter. I see your point about the DB_FILE_MULTIBLOCK_READ_COUNT parameter’s effect on the possible maximum value for the MBRC system statistic, so there is a potential need to re-collect system statistics after changing the DB_FILE_MULTIBLOCK_READ_COUNT parameter.

> While reading the test case, another thought raced through my mind: why is the author
> experimenting with the DB_FILE_MULTIBLOCK_READ_COUNT parameter – just leave
> it unset and make certain that system statistics are collected so that the MBRC system
> statistic is used

Because the engine determines the value without considering the real performance!
Hence my advice is to perform some tests and find out what the optimal value is.

Good point! By the way, nice formula (on the page containing this test case) that shows how the value of the DB_FILE_MULTIBLOCK_READ_COUNT parameter is automatically derived.
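For readers without the book at hand, the derivation that the formula describes, as I recall it from the documentation (treat the exact terms as an assumption), is that the automatically derived value corresponds to a 1MB read request, capped by the share of the buffer cache available to each session:

```
db_file_multiblock_read_count =
    min( 1048576 / db_block_size,
         db_cache_size / (sessions * db_block_size) )
```

With an 8KB block size the first term works out to 128 blocks, which matches the commonly seen default value.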

I probably should have written that paragraph a little differently. The comment was definitely not intended to be critical of what you wrote in this section of the book. In the paragraph I attempted to describe how, when I was reading the test case, I should have glanced further down the page of the book to notice that you also described the possibility of leaving the DB_FILE_MULTIBLOCK_READ_COUNT parameter unset. I also should have remembered that the book, while targeting Oracle Database 11.1, also provides very detailed information about how the behavior of the earlier Oracle Database versions differed (other books fail to do this).

Hi Charles,
reading your test, the first thing that came to mind was the use of the result cache, because the test uses an 11g database version, but I immediately thought that this could not be the case, for two reasons:

1) You did not specify
ALTER SESSION SET result_cache_mode = FORCE;
nor did you use the /*+ RESULT_CACHE */ hint,

and the default for this parameter is ‘MANUAL’.

2) The first run lasted 0 seconds; using the result cache
would take 0 seconds only from the second run onwards….

Another thing caught my attention while reading the comments….
You did not specify whether your test was done with workload system statistics
or noworkload system statistics, but I think it does not matter in your tests.
In fact, we are not comparing the costs of two queries; we are just trying
to compare the execution times while varying db_file_multiblock_read_count, right?

But then, why does the discussion come to recalculating the system statistics?
We are not discussing a change in cost, and thus the CBO;
we are talking about a change in timing of the Engine when it is called
to recall the blocks…

It is interesting to see the application of logic to determine what did or did not happen – listing the clues that lead someone to believe that something happened, and then counter-clues that indicate that something else likely happened.

Good point about not seeing an “ALTER SESSION SET result_cache_mode = FORCE;” in the test case – I try to make certain that my test cases are reproducible, but there is always a small chance that a modified setting will be left over from a previous test case. As mentioned in the additional information that I added to the blog post, the total execution time required ranged between 00:00:37.05 and 00:00:39.37 for testing 128 values of the DB_FILE_MULTIBLOCK_READ_COUNT parameter. I was careful to hide that timing information when the blog post was originally created, in part so that I could see the rules of logic that people would use to determine what happened.

You are correct that in my test case it does not matter whether workload statistics or noworkload statistics were used – in the test case it really does not matter what the calculated cost is based on the change to the workload statistics. It might, however, make a difference if the blocks belonging to the table are read by direct path read (commonly used in Oracle Database 11.1 and above) or are read into the buffer cache by a db file scattered read – it also makes a difference if the table’s blocks are already in the buffer cache.

The idea of the test, as mentioned by Chris, is to consider the real performance impact caused by artificially limiting the size of a multi-block read request below the typical 1MB limit. Consider a database block size of 8KB, where collecting workload statistics results in an MBRC value (the average multi-block read size) of 64. The system can only obtain an MBRC value of 64 if the DB_FILE_MULTIBLOCK_READ_COUNT parameter is at least 64. Now, assume that testing shows that the system performance improves when the DB_FILE_MULTIBLOCK_READ_COUNT parameter is adjusted from the default (unset) value of 128 to a value of 32. If you change that parameter value to 32, you need to make certain that you either collect updated system statistics, or manually set the MBRC value to a lower value. So, one thing that could go wrong is that you set the DB_FILE_MULTIBLOCK_READ_COUNT parameter to a value smaller than the MBRC system statistic, which means that the MBRC statistic is then unrealistic. But, if you change the MBRC statistic, you risk changing the execution plans of most of the SQL statements that are executed in the database. If you are not using workload statistics, changing the value of the DB_FILE_MULTIBLOCK_READ_COUNT parameter could also change the execution plans.
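One way to keep the two values consistent, sketched with the example numbers from the paragraph above (DBMS_STATS.GATHER_SYSTEM_STATS and DBMS_STATS.SET_SYSTEM_STATS are the documented interfaces; the value 32 is just the illustration used above):

```sql
-- Lower the parameter, as in the example above:
ALTER SYSTEM SET db_file_multiblock_read_count = 32;

-- Then either re-gather workload statistics during a representative
-- 30 minute window...
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('INTERVAL', interval => 30)

-- ...or set the MBRC system statistic manually:
EXEC DBMS_STATS.SET_SYSTEM_STATS('MBRC', 32)
```

Either action risks changing execution plans, as noted above, so the change should be verified in a test environment first.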

>> It might, however, make a difference if the blocks belonging to the table are read by direct path read (commonly
>> used in Oracle Database 11.1 and above) or are read into the buffer cache by a db file scattered read – it also
>> makes a difference if the table’s blocks are already in the buffer cache.

which is what I meant when I wrote …

>> … a “change in timing” of Engine when it is called to recall the blocks …
