8.12.5.2 Enabling Large Page Support

Some hardware/operating system architectures support memory
pages greater than the default (usually 4KB). The actual
implementation of this support depends on the underlying
hardware and operating system. Applications that perform a lot
of memory accesses may obtain performance improvements by
using large pages due to reduced Translation Lookaside Buffer
(TLB) misses.

In MySQL, large pages can be used by InnoDB, to allocate
memory for its buffer pool and additional memory pool.

Currently, MySQL supports only the Linux implementation of
large page support (which is called HugeTLB in Linux).

Before large pages can be used on Linux, the kernel must be
enabled to support them and it is necessary to configure the
HugeTLB memory pool. For reference, the HugeTBL API is
documented in the
Documentation/vm/hugetlbpage.txt file of
your Linux sources.

The kernel for some recent systems such as Red Hat Enterprise
Linux appear to have the large pages feature enabled by
default. To check whether this is true for your kernel, use
the following command and look for output lines containing
“huge”:

The nonempty command output indicates that large page support
is present, but the zero values indicate that no pages are
configured for use.

If your kernel needs to be reconfigured to support large
pages, consult the hugetlbpage.txt file
for instructions.

Assuming that your Linux kernel has large page support
enabled, configure it for use by MySQL using the following
commands. Normally, you put these in an
rc file or equivalent startup file that
is executed during the system boot sequence, so that the
commands execute each time the system starts. The commands
should execute early in the boot sequence, before the MySQL
server starts. Be sure to change the allocation numbers and
the group number as appropriate for your system.

# Set the number of pages to be used.
# Each page is normally 2MB, so a value of 20 = 40MB.
# This command actually allocates memory, so this much
# memory must be available.
echo 20 > /proc/sys/vm/nr_hugepages
# Set the group number that is permitted to access this
# memory (102 in this case). The mysql user must be a
# member of this group.
echo 102 > /proc/sys/vm/hugetlb_shm_group
# Increase the amount of shmem permitted per segment
# (12G in this case).
echo 1560281088 > /proc/sys/kernel/shmmax
# Increase total amount of shared memory. The value
# is the number of pages. At 4KB/page, 4194304 = 16GB.
echo 4194304 > /proc/sys/kernel/shmall

For MySQL usage, you normally want the value of
shmmax to be close to the value of
shmall.

To verify the large page configuration, check
/proc/meminfo again as described
previously. Now you should see some nonzero values:

The final step to make use of the
hugetlb_shm_group is to give the
mysql user an “unlimited”
value for the memlock limit. This can by done either by
editing /etc/security/limits.conf or by
adding the following command to your
mysqld_safe script:

ulimit -l unlimited

Adding the ulimit command to
mysqld_safe causes the
root user to set the memlock limit to
unlimited before switching to the
mysql user. (This assumes that
mysqld_safe is started by
root.)

Large page support in MySQL is disabled by default. To enable
it, start the server with the
--large-pages option. For
example, you can use the following lines in your server's
my.cnf file:

[mysqld]
large-pages

With this option, InnoDB uses large pages
automatically for its buffer pool and additional memory pool.
If InnoDB cannot do this, it falls back to
use of traditional memory and writes a warning to the error
log: Warning: Using conventional memory
pool

User Comments

An error will always generate on initial startup of mysqld since the 'mysql_install_db' script is run necessarily prior to 'mysqld_safe' in the daemon startup script. In order to allow the initialization of the database using huge page support, ease the memlock limit by typing 'ulimit -l unlimited' as root at the command prompt before either starting mysqld or executing mysql_install_db. This is in addition to adding it to the mysqld_safe script.

Also, choose values wisely since I believe that the 'nr_hugepages' parameter will actually reserve physical memory. At 2MB per page that means a value of 512 will reserve 1GB.

I had an issue where large pages were enabled however mysql was not using the whole chunk of large page memory allocated (approx 23 GB) so the server was swapping constantly. This was MySQL 5.1.49 using the new innodb plugin 1.0.10 (also applies to MySQL 5.5 and above most likely). The solution: the amount of large page memory you allocate should be > innodb_buffer_pool size.

I allocated 13,500 large pages for a 24 GB buffer pool and it works fine now.

Therefore the following /etc/sysctl.d/mysql.conf----# Set the number of pages to be used:# Add innodb_buffer_pool_size and# innodb_additional_mem_pool_size# and divide by Hugepagesize.# Each page is normally 2MB, so a value of 6100 = 12200MB.# This command actually allocates memory, so this much# memory must be available.# Important:# ulimit -l unlimited# and set in my.cnf:# ...# [mysqld]# large-pages# ...vm.nr_hugepages = 6100

# Set the group number that is permitted to access this# memory (110 in this case). The mysql user must be a# member of this group.vm.hugetlb_shm_group = 110

# Set the amount of shmem permitted per segment in bytes# (12199Mb in this case).kernel.shmmax = 12791578624

# Set the total amount of shared memory. The value# is the size in pages. At 4KB/page, 3122944 = 12199MB.kernel.shmall = 3122944----

I added the ulimit to the startup script and added the large-pages in the mysqld section.

But I get the following error:....110823 13:41:00 mysqld_safe Starting mysqld daemon with databases from /mnt/mysql/data110823 13:41:00 [Warning] '--log_slow_queries' is deprecated and will be removed in a future release. Please use ''--slow_query_log'/'--slow_query_log_file'' instead.110823 13:41:00 [Note] Plugin 'FEDERATED' is disabled.InnoDB: HugeTLB: Warning: Failed to allocate 12582928384 bytes. errno 28InnoDB HugeTLB: Warning: Using conventional memory pool...

The existing samples are not very clear - I would like to know if my example is right and clear enough for calculation the size correctly.

I hope this comment will save severals hours and white nights on production launching...After folowing every How-to and all's documentation over Google, to enable huge pages... i must give you this post.

# Total of allowed memoryvm.nr_hugepages = YYYYYY# total amount of memory that can be allocated to shared memory, huge pages or not, on the boxkernel.shmall = XXXXXXXXXX# maximum single shared memory segment, which for me was basically innodb_buffer_pool+1%kernel.shmmax = XXXXXXXXXX# Groupe autorisévm.hugetlb_shm_group = `id -g mysql`

A bit of a note on the math here, some articles and blogs say that you should add your innodb_buffer_pool size to your innodb_additional_mem_pool_size, and divide that by your hugetlb page size. Then add a few on to that. Unfortunately, that doesn't seem to be the whole story.

For those who want to allocate as little RAM as possible to HugeTLB while still satisfying the requirements outlined in my.cnf, this formula might be a little better. This is after some experimentation led me to put some effort behind finding out why I always had to allocate many more pages than the math suggested.

Then to that, add an additional 11 - 15 pages until MySQL starts. I give my best guess as to why these pages are unaccounted for below.

First, a note on why tmp_table_size is included: I'm not sure if it *should* be tmp_table_size * max_tmp_tables, but MySQL starts and runs with only tmp_table_size included. I think this only applies if default_tmp_storage_engine is InnoDB. If a tmp table needs to be created for a sort or order, and that table is going to be InnoDB in RAM, then hugetlb will need to be used.

Secondly, I noticed in the source code that the InnoDB buffer log uses the 'os_mem_alloc_large' function. So I think that should be included in the calculation as well. In my experimentation, I had 22 pages unaccounted for until I found that, then my unaccounted for pages went down to 11.

As for the pages which don't seem to be accounted for, I think that is the overhead cost of the nature of pages. For instance, if you have an innodb_buffer_pool size of 256 MB, and you have 8 buffer instances then you have:

At 2048 KB per page, that comes to 16.4 pages per buffer. That .4 of a page means an entire page must be allocated, or 17 pages per buffer instead of 16.4. That would account for 8 pages right there. So if one is really picky, declaring buffer sizes that meet the page size exactly would theoretically leave no overhead to absorb. I don't know why but MySQL and google convert have differing opinions on how to convert megabytes to bytes, and vice versa. So if you want to cut it as close as possible, fill out your my.cnf. Start mysql without large-pages, and take note of the values of these 4 variables. Then convert those values into kilobytes for the page count calculation.