Large Page Support or Not – ORA_LPENABLE, Granule Size

17052011

May 17, 2011

Have you ever tried to bounce a database (maybe to adjust a parameter such as PROCESSES, which can only be altered in the spfile/init.ora) and encountered something like this (note that this is on a 64 bit Windows server):

Interesting, 14,256MB of free memory, the database instance was just running with an SGA_TARGET set to 10,016M, and now the database will not OPEN. The SGA_TARGET need not be more than 10,000M in size, the same problem can happen for small databases with the SGA_TARGET set to just 600M. If you check the alert log, you might find something like this for the database instance with the small SGA_TARGET (note that the server has Intel Xeon X5687 CPUs, Oracle Database 11.2.0.2):

Straying a bit from the main topic of this blog article, I was a little curious about the change in the Large page request size that was reported for the two database instances. So, I performed a couple of tests with different SGA_TARGET values:

“Memory for the shared pool, large pool, java pool, and buffer cache is allocated in units of granules. The granule size is 4MB if the SGA size is less than 1GB. If the SGA size is greater than 1GB, the granule size changes to 16MB. The granule size is calculated and fixed when the instance starts up. The size does not change during the lifetime of the instance.”

The observation of 4MB up to a SGA_TARGET of 1GB matches the documentation that describes the granual size, as does the observation of 16MB for an SGA_TARGET of 8GB. However, notice that at a SGA_TARGET of 1 byte beyond 8GB, the Large page request size jumped to 32MB, which is a value greater than what the documentation describes for the granule size. So, is the granule size still 16MB when the SGA_TARGET is set to a value just larger than 8GB, as indicated it would be in the 11.2 documentation?

You might recall reading in the past that as of Oracle Database 10.2 the LOG_BUFFER parameter is set by default to a value of just less than the granule size, typically being set to a value around 14MB with SGA values larger than 1GB – see Metalink (MOS) Doc ID 351857.1, “The Log_buffer Default Size Cannot Be Reduced In 10g R2″. Let’s take a look at the size of the LOG_BUFFER in the database instance with the 10,016M SGA_TARGET:

The LOG_BUFFER was auto-set to a value of 17.36MB, which possibly indicates that the granule size in this database instance is not 16MB. If we check Metalink (MOS) Doc ID 947152.1, “How to determine granule size”, we find that the calculation for the granule size has changed from Oracle Database 9.2 to 10.2, again with the release of 11.1, and again with the release of 11.2. With the discovery of this new bit of information, the values reported for the Large page request size exactly mirrors that of the granule size.

“Large page support is a feature of Oracle Database 10g Release 1 (10.1) or later. It provides a performance boost for memory-intensive database instances. By taking advantage of newly introduced operating system support, Oracle Database 10g Release 1 (10.1) or later can now make more efficient use of processor memory addressing resources. Specifically, when large page support is enabled, the CPUs in the system will be able to access the Oracle Database buffers in RAM more quickly. Instead of addressing the buffers in 4KB increments, the CPUs are told to use 4MB page sizes when addressing the database buffers.”

A quote from the same section of the Oracle Database 11.2 documentation, with one extra line copied at the end:

“Large page support is a feature of Oracle Database 10g Release 1 (10.1) or later. It provides a performance boost for memory-intensive database instances on Windows Server 2003. By taking advantage of newly introduced operating system support, Oracle Database 10g Release 1 (10.1) or later can now make more efficient use of processor memory addressing resources. Specifically, when large page support is enabled, the CPUs in the system will be able to access the Oracle Database buffers in RAM more quickly. Instead of addressing the buffers in 4KB increments, the CPUs are told to use 2 MB page sizes in Physical Address Extension (PAE) mode and 4MB page sizes in non-PAE mode when addressing the database buffers.”

“By default Oracle allocates the minimum available large page size when using large pages. The minimum available large page size, 16 MB, is obtained by using the GetLargePageMinumum function.”

As you can see by the above, a little additional clarity was added to the 11.2 documentation, but where are the 2MB, 4MB, and 16MB numbers coming from? Let’s check Metalink (MOS) Doc ID 422844.1, “Using Large Memory Pages on 64-Bit Windows Systems” – a very brief quote:

“… instead of addressing the buffers in 4KB increments, the CPUs are told to use 2 MB page sizes or larger. The default page size depends on the chipset, and the total amount of memory in the system.”

From the above we learn that the default page size depends on the chipset and the amount of memory in the server, with 2MB page sizes being the minimum (that 2MB figure does match the Large page size figure that appears above). Is it really the chipset, or is it the CPU architecture (Intel vs. AMD, NUMA vs. non-NUMA) that is a potential source of the change? Is it really the total amount of memory in the system, or is it the total amount of memory allocated to the SGA that is a potential source of the change?

—

Let’s take a look at a Microsoft document to see if it provides a little more insight into why the ORA-27102: out of memory error might have appeared when there was 14,256MB of free memory in the server:

“Large-page memory regions may be difficult to obtain after the system has been running for a long time because the physical space for each large page must be contiguous, but the memory may have become fragmented. Allocating large pages under these conditions can significantly affect system performance…”

The memory must be contiguous… in this case, I wonder if Oracle Database is requesting multiple (313) 32MB allocations of memory without physical gaps in each memory region, or a single 10,016MB contiguous memory region at start up – the difference is significant.

Incidentally, the quick work-around for this problem is to disable large page support by changing the ORA_LPENABLE Windows registry key from a value of 1 to a value of 0 (this parameter is set for the Oracle home, but can also be set for each database instance running from the Oracle home). There is also a mention of the ORA_LPSIZE registry key in Metalink (MOS) for controlling the size of the large page request size, but not enough information is provided in the articles to determine its exact usage.

On a slight tangent, the ORA_LPENABLE and ORA_LPSIZE registry keys, which are quite clearly intended to store numbers, must be defined as string (REG_SZ) datatypes in the Windows registry. A bit of irony for a database company to use a string field to store a number, when the same action in the company’s database may have undesirable consequences.

Like this:

LikeLoading...

Related

Actions

Information

11 responses

(rant on)
Well, it seems better to fail at startup than what happens on AIX (it tries to allocate the entire SGA in paged memory if the allocated large pages are not suficient ==> OS hang).
We just went through a migration to 11gR2 this weekend, and the Metalink note you quote gives a granule size of 512MiB for our configuration. Now I have a log buffer of half a gig (minus the SGA fixed size – they share a granule, it seems), and streams and java pools (which I don’t use much) each wasting half a gig of precious RAM I’d have other uses for… It is still better than 10.2.0.4, though – with 16MiB-granules it flatly refused to allocate more than 136GiB for the SGA. I’ll never know the magical properties of this constant…
(rant off)

Sorry for the late answer – the “Notify me of follow-up comments via email” thing seems to have not worked…
Anyway, the error we were hitting in 10.2 was “ORA-00064: object is too large to allocate on this O/S (1,16777216)” and the first thing we tried was of course to up the number of large pages – in fact, we allocated several thousand (1000 lp ~ 16 GiB!) more than should have been enough given the configured SGA size. Then we went back to a much smaller DB_KEEP_CACHE_SIZE and started increasing it until we got to the 136 GiB total SGA size limit, at which point we still had plenty of free large pages (vmstat -l: flp), but a 1 GiB-increase threw the ORA-64. We didn’t have time to play with smaller increases (steps of 16 MiB would have made sense to try, as it was the granule size as well as the large page size) or trying to increase other SGA components instead of the KEEP pool – the maintenance window on the production system was about to close (who has such amounts of RAM on a test system?).
A Metalink search discovered “Bug 7659217: ORA-64 ERROR RECEIVED WHEN SGA_MAX_SIZE INCREASED TO 85GB” which provides some explanation of what might be going wrong plus the workaround of increasing the granule size, but we decided not to play with hidden parameters (even if Oracle says it would be supported – Note 985884.1) and just run with only 136 GiB for the next couple of months until we would upgrade to 11.2.0.2 and get 512 MiB-granules automatically. During THAT upgrade we got the hang – resolved by increasing the number of large pages (after rebooting, of course). One might say we should have expected this increase and should have pre-allocated enough large pages, and we had, but apparently they weren’t enough. Again, we don’t have a test system with that much RAM.

Overall, we’ve been pretty happy with the large pages on AIX. We’ve been using them for about four years now. One needs more planning for upgrades of any sort (hardware, OS, Oracle), but having a huge SGA in pinned memory is just beautiful.

An hour and 50 minute video? Thanks for the link, although I have only managed to listen to the first 50 minutes of the video, it is very well done.

From the video, to check the granule size used by the database instance, take a look at the _KSMG_GRANULE_SIZE hidden parameter value or query V$SGA_DYNAMIC_COMPONENTS. For some reason, yesterday I was drawing a blank when trying to quickly determine the granule size.

The situation with Aix is a bit sensitive to the actual large page size granule defined in the OS. It supports – as of Aix 6.1 – 4KB, 64KB and 16MB page sizes. Dunno if 7.1 has changed that but I guess I’ll find out soon as we’re upgrading.
What memory is used has to be pre-defined with vmo – large page pools are non-pageable and locked, they must come from somewhere in the total memory pool if not already defined. That means major memory re-work/re-allocation if doing it on a hot production system already under memory stress! Hence the OS hangs and other problems folks experience. Guess that’s a shared concern with Windows, given this blog entry.
I wrote an entry on Aix largepages a couple of years ago: http://dbasrus.blogspot.com/2009/06/size-sometimes-does-matter.html
Careful: it’s based on real active systems, not awesomeness, and it is SPECIFIC to Aix only. Read it entirely: there is a lot there including in the comments.

Your blog article is very helpful for people running AIX and Oracle Database, and considering that you mentioned the vmo command and configuring the number of large pages, it appears that my Google search worked properly. The Google search also mentioned using the hugetlb_pool in the proc filesystem on Linux systems to configure the number of large pages (of course, this is just a starting point for additional research).

From what I am able to see from reading your blog article, AIX and Windows differs a bit regarding the large page usage. In the comments section of your blog article, you mentioned:

“Largepage memory space is pre-reserved, non-pageable. All virtual memory page management is active only for the REMAINING 4K pages.”

As far as I know, the large pages are not pre-reserved on Windows – they are attempted to be acquired from memory (the memory regions which are also used by the small memory pages) when requested by the process. A scan through Metalink (MOS) only offers the suggestion of either rebooting the server or disabling large memory pages when the error displayed at the start of this blog article appears on the Windows platform. As I see it, this is a significant disadvantage of Windows compared to AIX. I wonder if other operating systems that work with Oracle Database also pre-set-aside memory chunks for the large memory pages.

Oh, wasn’t aware of that difference with Windows. Dang, that throws a spanner in the works, doesn’t it? In a busy or stressed system, I can see the impact of re-arranging memory on the fly! I’ve seen Aix page out an entire active SGA to get memory re-arranged for 16MB largepages – fortunately, that was my test instance and I learned fast not to do foolish things! ;)
Thing I like about the Aix largepages – 16MB – is that I can kinda “partition” the memory into a fixed pool that never gets paged and is used by Oracle, and a normal 4K page pool that acts as conventional paged memory, for use by everything else. Mind you: it only works with 16MB largepages: 64K largepages are not locked by default…

That, and the fact that Oracle tries to fit the entire SGA into the remaining paged memory. And when your desired SGA approaches 40-50% (or 80% in my case) of the total RAM and doesn’t fit in the allocated LPs, major thrashing occurs. The thing is, when the punishment for underestimating lgpg_regions is so severe, you tend to overestimate it, thus wasting RAM (since no-one else can use the free large pages). Because – who knows? – the next security patch can cause the SGA to grow just enough to no longer fit…
I hear (https://kevinclosson.wordpress.com/2010/09/28/configuring-linux-hugepages-for-oracle-database-is-just-too-difficult-part-i/) there’s a parameter in 11.2.0.2 (use_large_pages=only), but it only applies to Linux (Note 361323.1). Oh well. Maybe next release.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:

<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and Less than signs in code sections are often interpretted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: