InnoDB Command Options

In MySQL 5.1, this option caused the server to behave as if
the built-in InnoDB were not present, which
enabled the InnoDB Plugin to be used
instead. In MySQL 5.7, InnoDB
is the default storage engine and InnoDB
Plugin is not used, so this option is ignored.

The InnoDB storage engine can no longer be
disabled, and the
--innodb=OFF
and
--skip-innodb
options are deprecated and have no effect. Their use results
in a warning. These options will be removed in a future MySQL
release.

--innodb-status-file

Command-Line Format    --innodb-status-file
Permitted Values       Type: boolean
                       Default: OFF

Controls whether InnoDB creates a file
named
innodb_status.pid
in the MySQL data directory. If enabled,
InnoDB periodically writes the output of
SHOW ENGINE
INNODB STATUS to this file.

By default, the file is not created. To create it, start
mysqld with the
--innodb-status-file=1 option. The file is
deleted during normal shutdown.

Enable this option on the
master server to use
the InnoDB memcached
plugin (daemon_memcached) with the MySQL
binary log. This option
can only be set at server startup. You must also enable the
MySQL binary log on the master server using the
--log-bin option.

The path of the directory containing the shared library that
implements the InnoDB memcached plugin. The default value is
NULL, representing the MySQL plugin directory. You should not
need to modify this parameter unless specifying a
memcached plugin for a different storage
engine that is located outside of the MySQL plugin directory.

Used to pass space-separated memcached options to the
underlying memcached memory object caching
daemon on startup. For example, you might change the port that
memcached listens on, reduce the maximum
number of simultaneous connections, change the maximum memory
size for a key/value pair, or enable debugging messages for
the error log.

This value is set to 1 by default, so that any changes made to
the table through SQL statements are immediately visible to
memcached operations. You might increase it
to reduce the overhead from frequent commits on a system where
the underlying table is only being accessed through the
memcached interface. If you set the value
too large, the amount of undo or redo data could impose some
storage overhead, as with any long-running transaction.

Specifies how many memcached write
operations, such as add,
set, and incr, to
perform before doing a COMMIT
to start a new transaction. Counterpart of
daemon_memcached_r_batch_size.

This value is set to 1 by default, on the assumption that data
being stored is important to preserve in case of an outage and
should immediately be committed. When storing non-critical
data, you might increase this value to reduce the overhead
from frequent commits; but then the last
N-1 uncommitted write operations
could be lost if a crash occurs.
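
The trade-off above can be illustrated with a minimal Python sketch. It is
not InnoDB or memcached code: the function name uncommitted_at_crash and the
simple "one COMMIT after every N writes" model are assumptions made only for
illustration.

```python
def uncommitted_at_crash(writes_before_crash: int, batch_size: int) -> int:
    """Return how many writes are still uncommitted when a crash hits.

    Hypothetical model: a COMMIT is issued after every `batch_size`
    write operations, so only the remainder is pending at any moment.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return writes_before_crash % batch_size


# With a batch size of 1 (the default), nothing is ever pending.
print(uncommitted_at_crash(1000, 1))

# With a batch size of 8, the worst case over many crash points is N-1.
print(max(uncommitted_at_crash(n, 8) for n in range(100)))
```

Under this model, the worst-case loss grows linearly with the batch size,
which is why a value of 1 suits data that must survive an outage.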

Whether the InnoDB adaptive hash
index is enabled or disabled. It may be desirable,
depending on your workload, to dynamically enable or disable
adaptive hash
indexing to improve query performance. Because the
adaptive hash index may not be useful for all workloads,
conduct benchmarks with it both enabled and disabled, using
realistic workloads. See
Section 14.4.3, “Adaptive Hash Index” for details.

This variable is enabled by default. You can modify this
parameter using the SET GLOBAL statement,
without restarting the server. Changing the setting requires
the SUPER privilege. You can also use
--skip-innodb_adaptive_hash_index at server
startup to disable it.

Disabling the adaptive hash index empties the hash table
immediately. Normal operations can continue while the hash
table is emptied, and executing queries that were using the
hash table access the index B-trees directly instead. When the
adaptive hash index is re-enabled, the hash table is populated
again during normal operation.

Partitions the adaptive hash index search system. Each index
is bound to a specific partition, with each partition
protected by a separate latch.

In earlier releases, the adaptive hash index search system was
protected by a single latch
(btr_search_latch) which could become a
point of contention. With the introduction of the
innodb_adaptive_hash_index_parts option,
the search system is partitioned into 8 parts by default. The
maximum setting is 512.

Permits InnoDB to automatically adjust the
value of
innodb_thread_sleep_delay up
or down according to the current workload. Any non-zero value
enables automated, dynamic adjustment of the
innodb_thread_sleep_delay value, up to the
maximum value specified in the
innodb_adaptive_max_sleep_delay option. The
value represents the number of microseconds. This option can
be useful in busy systems, with greater than 16
InnoDB threads. (In practice, it is most
valuable for MySQL systems with hundreds or thousands of
simultaneous connections.)

The size in bytes of a memory pool InnoDB
uses to store data
dictionary information and other internal data
structures. The more tables you have in your application, the
more memory you allocate here. If InnoDB
runs out of memory in this pool, it starts to allocate memory
from the operating system and writes warning messages to the
MySQL error log. The default value is 8MB.

Use this option to disable row locks when
InnoDB memcached
performs DML operations. By default,
innodb_api_disable_rowlock is disabled,
which means that memcached requests row
locks for get and set
operations. When innodb_api_disable_rowlock
is enabled, memcached requests a table lock
instead of row locks.

innodb_api_disable_rowlock is not dynamic.
It must be specified on the mysqld command
line or entered in the MySQL configuration file. Configuration
takes effect when the plugin is installed, which occurs when
the MySQL server is started.

innodb_buffer_pool_chunk_size defines the
chunk size for InnoDB buffer pool resizing
operations. The
innodb_buffer_pool_size
parameter is dynamic, which allows you to resize the buffer
pool without restarting the server.

To avoid copying all buffer pool pages during resizing
operations, the operation is performed in
“chunks”. By default,
innodb_buffer_pool_chunk_size is 128MB
(134217728 bytes). The number of pages contained in a chunk
depends on the value of
innodb_page_size.
innodb_buffer_pool_chunk_size can be
increased or decreased in units of 1MB (1048576 bytes).
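
As a rough illustration of how the page count per chunk follows from
innodb_page_size, the Python sketch below divides the chunk size by the page
size. The helper name pages_per_chunk is hypothetical, and the result is an
approximation that ignores any memory InnoDB sets aside for control
structures.

```python
DEFAULT_CHUNK_BYTES = 128 * 1024 * 1024  # 134217728, the documented default


def pages_per_chunk(chunk_bytes: int, page_bytes: int) -> int:
    """Approximate number of pages that fit in one buffer pool chunk."""
    if page_bytes <= 0:
        raise ValueError("page size must be positive")
    return chunk_bytes // page_bytes


# Compare a few page sizes against the default chunk size.
for page_kb in (4, 8, 16, 32, 64):
    count = pages_per_chunk(DEFAULT_CHUNK_BYTES, page_kb * 1024)
    print(f"{page_kb}KB pages: about {count} pages per chunk")
```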

Specifies the percentage of the most recently used pages for
each buffer pool to read out and dump. The range is 1 to 100.
The default value is 25. For example, if there are 4 buffer
pools with 100 pages each, and
innodb_buffer_pool_dump_pct
is set to 25, the 25 most recently used pages from each buffer
pool are dumped.
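
The arithmetic in the example above can be written out as a short Python
sketch. The function name pages_dumped_per_pool is hypothetical, and rounding
down for fractional results is an assumption, not documented behavior.

```python
def pages_dumped_per_pool(pages_in_pool: int, dump_pct: int) -> int:
    """Pages dumped from one buffer pool for a given percentage (1-100)."""
    if not 1 <= dump_pct <= 100:
        raise ValueError("dump_pct must be between 1 and 100")
    # Assumption: fractional pages are rounded down.
    return pages_in_pool * dump_pct // 100


pools = [100, 100, 100, 100]  # four buffer pools with 100 pages each
per_pool = [pages_dumped_per_pool(pages, 25) for pages in pools]
print(per_pool, sum(per_pool))
```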

Specifies the name of the file that holds the list of
tablespace IDs and page IDs produced by
innodb_buffer_pool_dump_at_shutdown
or
innodb_buffer_pool_dump_now.
Tablespace IDs and page IDs are saved in the following format:
space, page_id. By default, the file is
named ib_buffer_pool and is located in
the InnoDB data directory. A non-default
location must be specified relative to the data directory.

You can also specify a file name at startup, in a startup
string or MySQL configuration file. When specifying a file
name at startup, the file must exist or
InnoDB will return a startup error
indicating that there is no such file or directory.

The number of regions that the InnoDB buffer pool is divided
into. For systems with buffer pools in the multi-gigabyte
range, dividing the buffer pool into separate instances can
improve concurrency, by reducing contention as different
threads read and write to cached pages. Each page that is
stored in or read from the buffer pool is assigned to one of
the buffer pool instances randomly, using a hashing function.
Each buffer pool manages its own free lists,
flush lists,
LRUs, and all other data
structures connected to a buffer pool, and is protected by its
own buffer pool mutex.

Immediately warms up the
InnoDB buffer pool by loading
a set of data pages, without waiting for a server restart. Can
be useful to bring cache memory back to a known state during
benchmarking, or to ready the MySQL server to resume its
normal workload after running queries for reports or
maintenance.

The size in bytes of the
buffer pool, the
memory area where InnoDB caches table and
index data. The default value is 128MB. The maximum value
depends on the CPU architecture; the maximum is 4294967295
(2^32-1) on 32-bit systems and
18446744073709551615 (2^64-1) on
64-bit systems. On 32-bit systems, the CPU architecture and
operating system may impose a lower practical maximum size
than the stated maximum. When the size of the buffer pool is
greater than 1GB, setting
innodb_buffer_pool_instances
to a value greater than 1 can improve the scalability on a
busy server.

A larger buffer pool requires less disk I/O to access the same
table data more than once. On a dedicated database server, you
might set the buffer pool size to 80% of the machine's
physical memory size. Be aware of the following potential
issues when configuring buffer pool size, and be prepared to
scale back the size of the buffer pool if necessary.

Competition for physical memory can cause paging in the
operating system.

InnoDB reserves additional memory for
buffers and control structures, so that the total
allocated space is approximately 10% greater than the
specified buffer pool size.

Address space for the buffer pool must be contiguous,
which can be an issue on Windows systems with DLLs that
load at specific addresses.

The time to initialize the buffer pool is roughly
proportional to its size. On instances with large buffer
pools, initialization time might be significant. To reduce
the initialization period, you can save the buffer pool
state at server shutdown and restore it at server startup.
See Section 14.6.3.8, “Saving and Restoring the Buffer Pool State”.
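
The sizing guidance above (about 80% of physical memory, plus roughly 10%
extra for buffers and control structures) can be checked with a small Python
sketch. The function name plan_buffer_pool and its parameters are
hypothetical, and the percentages are the rules of thumb quoted above, not
exact values.

```python
def plan_buffer_pool(physical_bytes: int, pool_pct: int = 80,
                     overhead_pct: int = 10):
    """Return (pool_bytes, estimated_total_bytes, fits_in_memory).

    Integer arithmetic is used so that results are exact and repeatable.
    """
    pool = physical_bytes * pool_pct // 100
    total = pool * (100 + overhead_pct) // 100
    return pool, total, total < physical_bytes


GIB = 1024 ** 3
for pct in (80, 95):
    pool, total, fits = plan_buffer_pool(64 * GIB, pool_pct=pct)
    print(f"{pct}%: pool={pool} total~{total} fits={fits}")
```

In this model, a 95% setting does not fit once the extra allocation is
counted, which is the kind of situation that leads to operating system
paging.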

When you increase or decrease buffer pool size, the operation
is performed in chunks. Chunk size is defined by the
innodb_buffer_pool_chunk_size
configuration option, which has a default of 128 MB.

Whether InnoDB performs
change buffering,
an optimization that delays write operations to secondary
indexes so that the I/O operations can be performed
sequentially. Permitted values are described in the following
table.

Table 14.13 Permitted Values for innodb_change_buffering

Value      Description
-------    -----------
none       Do not buffer any operations.
inserts    Buffer insert operations.
deletes    Buffer delete marking operations; strictly speaking, the writes
           that mark index records for later deletion during a purge
           operation.
changes    Buffer inserts and delete-marking operations.
purges     Buffer the physical deletion operations that happen in the
           background.

Sets a debug flag for InnoDB change
buffering. A value of 1 forces all changes to the change
buffer. A value of 2 causes a crash at merge. A default value
of 0 indicates that the change buffering debug flag is not
set. This option is only available when debugging support is
compiled in using the WITH_DEBUG CMake option.

Specifies how to generate and verify the
checksum stored in the
disk blocks of InnoDB tablespaces.
crc32 is the default value as of MySQL
5.7.7.

innodb_checksum_algorithm replaces the
innodb_checksums option. The
following values were provided for compatibility, up to and
including MySQL 5.7.6:

innodb_checksums=ON is the same as
innodb_checksum_algorithm=innodb.

innodb_checksums=OFF is the same as
innodb_checksum_algorithm=none.

As of MySQL 5.7.7, with a default
innodb_checksum_algorithm value of crc32,
innodb_checksums=ON is now the same as
innodb_checksum_algorithm=crc32.
innodb_checksums=OFF is still the same as
innodb_checksum_algorithm=none.

The value innodb is backward-compatible
with earlier versions of MySQL. The value
crc32 uses an algorithm that is faster to
compute the checksum for every modified block, and to check
the checksums for each disk read. It scans blocks 32 bits at a
time, which is faster than the innodb
checksum algorithm, which scans blocks 8 bits at a time. The
value none writes a constant value in the
checksum field rather than computing a value based on the
block data. The blocks in a tablespace can use a mix of old,
new, and no checksum values, being updated gradually as the
data is modified; once blocks in a tablespace are modified to
use the crc32 algorithm, the associated
tables cannot be read by earlier versions of MySQL.

The strict form of a checksum algorithm reports an error if it
encounters a valid but non-matching checksum value in a
tablespace. It is recommended that you only use strict
settings in a new instance, to set up tablespaces for the
first time. Strict settings are somewhat faster, because they
do not need to compute all checksum values during disk reads.

Note

Prior to MySQL 5.7.8, a strict mode setting for
innodb_checksum_algorithm caused
InnoDB to halt when encountering a
valid but non-matching checksum. In
MySQL 5.7.8 and later, only an error message is printed, and
the page is accepted as valid if it has a valid
innodb, crc32 or
none checksum.

The following table shows the difference between the
none, innodb, and
crc32 option values, and their strict
counterparts. none,
innodb, and crc32 write
the specified type of checksum value into each data block, but
for compatibility accept other checksum values when verifying
a block during a read operation. Strict settings also accept
valid checksum values but print an error message when a valid
non-matching checksum value is encountered. Using the strict
form can make verification faster if all
InnoDB data files in an instance are
created under an identical
innodb_checksum_algorithm value.

Table 14.14 innodb_checksum_algorithm Settings

Value: none
Generated checksum (when writing): A constant number.
Permitted checksums (when reading): Any of the checksums generated by
none, innodb, or crc32.

Value: innodb
Generated checksum (when writing): A checksum calculated in software,
using the original algorithm from InnoDB.
Permitted checksums (when reading): Any of the checksums generated by
none, innodb, or crc32.

Value: crc32
Generated checksum (when writing): A checksum calculated using the crc32
algorithm, possibly done with a hardware assist.
Permitted checksums (when reading): Any of the checksums generated by
none, innodb, or crc32.

Value: strict_none
Generated checksum (when writing): A constant number.
Permitted checksums (when reading): Any of the checksums generated by
none, innodb, or crc32. InnoDB prints an error message if a valid but
non-matching checksum is encountered.

Value: strict_innodb
Generated checksum (when writing): A checksum calculated in software,
using the original algorithm from InnoDB.
Permitted checksums (when reading): Any of the checksums generated by
none, innodb, or crc32. InnoDB prints an error message if a valid but
non-matching checksum is encountered.

Value: strict_crc32
Generated checksum (when writing): A checksum calculated using the crc32
algorithm, possibly done with a hardware assist.
Permitted checksums (when reading): Any of the checksums generated by
none, innodb, or crc32. InnoDB prints an error message if a valid but
non-matching checksum is encountered.

InnoDB can use
checksum validation on
all tablespace pages read from disk to ensure extra fault
tolerance against hardware faults or corrupted data files.
This validation is enabled by default. Under specialized
circumstances (such as when running benchmarks) this safety
feature can be disabled with
--skip-innodb-checksums. You can specify the
method of calculating the checksum using the
innodb_checksum_algorithm
option.

Remove any innodb_checksums options from
your configuration files and startup scripts to avoid
conflicts with innodb_checksum_algorithm.
innodb_checksums=OFF automatically sets
innodb_checksum_algorithm=none.
innodb_checksums=ON is ignored and
overridden by any other setting for
innodb_checksum_algorithm.

Enables per-index compression-related statistics in the
INFORMATION_SCHEMA.INNODB_CMP_PER_INDEX
table. Because these statistics can be expensive to gather,
only enable this option on development, test, or slave
instances during performance tuning related to
InnoDB compressed tables.

Compresses all tables using a specified compression algorithm
without having to define a COMPRESSION
attribute for each table. This option is only available if
debugging support is compiled in using the
WITH_DEBUG CMake option.

Sets the cutoff point at which MySQL begins adding padding
within compressed
pages to avoid expensive
compression
failures. A value of zero disables the mechanism that
monitors compression efficiency and dynamically adjusts the
padding amount.

Specifies the maximum percentage that can be reserved as free
space within each compressed
page, allowing room to
reorganize the data and modification log within the page when
a compressed table or
index is updated and the data might be recompressed. Only
applies when
innodb_compression_failure_threshold_pct
is set to a non-zero value, and the rate of
compression
failures passes the cutoff point.

Determines the number of
threads that can enter
InnoDB concurrently. A thread is placed in
a queue when it tries to enter InnoDB if
the number of threads has already reached the concurrency
limit. When a thread is permitted to enter
InnoDB, it is given a number of
“tickets” equal to the value of
innodb_concurrency_tickets,
and the thread can enter and leave InnoDB
freely until it has used up its tickets. After that point, the
thread again becomes subject to the concurrency check (and
possible queuing) the next time it tries to enter
InnoDB. The default value is 5000.

With a small innodb_concurrency_tickets
value, small transactions that only need to process a few rows
compete fairly with larger transactions that process many
rows. The disadvantage of a small
innodb_concurrency_tickets value is that
large transactions must loop through the queue many times
before they can complete, which extends the amount of time
required to complete their task.

With a large innodb_concurrency_tickets
value, large transactions spend less time waiting for a
position at the end of the queue (controlled by
innodb_thread_concurrency)
and more time retrieving rows. Large transactions also require
fewer trips through the queue to complete their task. The
disadvantage of a large
innodb_concurrency_tickets value is that
too many large transactions running at the same time can
starve smaller transactions by making them wait a longer time
before executing.

With a non-zero
innodb_thread_concurrency
value, you may need to adjust the
innodb_concurrency_tickets value up or down
to find the optimal balance between larger and smaller
transactions. The SHOW ENGINE INNODB STATUS
report shows the number of tickets remaining for an executing
transaction in its current pass through the queue. This data
may also be obtained from the
TRX_CONCURRENCY_TICKETS column of the
INFORMATION_SCHEMA.INNODB_TRX
table.
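
The effect of the ticket count on large and small transactions can be
sketched in Python. This is a simplified model, not InnoDB's scheduler: it
assumes one ticket is consumed per row processed, and the function name
queue_passes is hypothetical.

```python
def queue_passes(rows_to_process: int, tickets: int) -> int:
    """Number of times a transaction passes through the concurrency check.

    Assumption for illustration: each row processed uses one ticket, and
    the transaction re-enters the check whenever its tickets run out.
    """
    if rows_to_process < 1 or tickets < 1:
        raise ValueError("rows_to_process and tickets must be positive")
    return -(-rows_to_process // tickets)  # ceiling division


# A large transaction with the default of 5000 tickets and a smaller value.
print(queue_passes(1_000_000, 5000))
print(queue_passes(1_000_000, 500))

# A small transaction needs a single pass in either case.
print(queue_passes(10, 5000), queue_passes(10, 500))
```

In this model, lowering the ticket count multiplies the number of passes for
the large transaction while leaving the small one unaffected, which matches
the fairness trade-off described above.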

An optimized temporary table is a lightweight subclass of
temporary table that excludes certain functionality and
benefits from optimizations that make it faster than a normal
temporary table. Like normal temporary tables, optimized
temporary tables are only visible to the current connection
and are dropped when the connection is terminated. Unlike
normal temporary tables, optimized temporary tables are
operational when InnoDB is in read-only
mode.

Row format COMPRESSED is not supported. If
you attempt to create a compressed optimized temporary table,
the innodb_create_intrinsic=ON setting is
ignored and InnoDB creates a normal
temporary table.

Undo logging is disabled for optimized temporary tables, which
means that rollback is not supported.

Atomicity for optimized temporary tables is supported at the
row-level, not at the statement level.

Statistics generated by the same workload may differ for
optimized temporary tables compared to normal temporary
tables, as optimized temporary tables may use a different
algorithm to complete certain types of operations.

Defines the path and file size for individual
InnoDB system tablespace data
files. The full directory path for system tablespace
data files is formed by concatenating the paths defined by
innodb_data_home_dir and
innodb_data_file_path. File sizes are
specified in KB, MB, or GB (1024MB) by appending
K, M, or
G to the size value. If specifying the data
file size in kilobytes (KB), do so in multiples of 1024.
Otherwise, KB values are rounded to the nearest megabyte (MB)
boundary. The sum of the sizes of the files must be at least
slightly larger than 10MB. If you do not specify
innodb_data_file_path, the default behavior
is to create a single auto-extending data file, slightly
larger than 12MB, named ibdata1. The size
limit of individual files is determined by your operating
system. You can set the file size to more than 4GB on
operating systems that support large files. You can also
use raw disk partitions as
data files. For more information about configuring
system tablespace data files, see
Section 14.6.1, “InnoDB Startup Configuration”.
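
To show how the size suffixes combine into a total, the following Python
sketch parses a value written in the form
file_name:file_size[:autoextend[:max:max_file_size]], with multiple files
separated by semicolons. The parser is a simplified illustration, not the
server's own logic: the function names are hypothetical, a K, M, or G suffix
is required, and file names containing a colon (such as Windows drive
letters) are not handled.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a size such as '12M' to bytes (suffix required here)."""
    suffix = text[-1].upper()
    if suffix not in UNITS:
        raise ValueError(f"missing K, M, or G suffix in {text!r}")
    return int(text[:-1]) * UNITS[suffix]


def parse_data_file_path(value: str) -> list:
    """Split a data file path value into one dict per data file."""
    files = []
    for spec in value.split(";"):
        parts = spec.strip().split(":")
        if len(parts) < 2:
            raise ValueError(f"expected name:size in {spec!r}")
        autoextend = len(parts) > 2 and parts[2] == "autoextend"
        max_size = None
        if autoextend and len(parts) == 5 and parts[3] == "max":
            max_size = parse_size(parts[4])
        files.append({"name": parts[0], "size": parse_size(parts[1]),
                      "autoextend": autoextend, "max_size": max_size})
    return files


files = parse_data_file_path("ibdata1:50M;ibdata2:50M:autoextend:max:500M")
total = sum(f["size"] for f in files)
print(files)
print("total bytes:", total)
```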

The following minimum file sizes are enforced for the
first system tablespace data file
(ibdata1) to ensure that there is enough
space for doublewrite buffer pages:

For an innodb_page_size
value of 16KB or less, the minimum data file size is 3MB.

This option is used to disable deadlock detection. On high
concurrency systems, deadlock detection can cause a slowdown
when numerous threads wait for the same lock. At times, it may
be more efficient to disable deadlock detection and rely on
the innodb_lock_wait_timeout
setting for transaction rollback when a deadlock occurs.

The innodb_default_row_format option
defines the default row format for InnoDB
tables and user-created temporary tables. The default setting
is DYNAMIC. Other permitted values are
COMPACT and REDUNDANT.
The COMPRESSED row format, which is not
supported for use in the
system
tablespace, cannot be defined as the default.

Newly created tables use the row format defined by
innodb_default_row_format
when a ROW_FORMAT option is not specified
explicitly or when ROW_FORMAT=DEFAULT is
used.

When enabled (the default), InnoDB stores
all data twice, first to the
doublewrite
buffer, then to the actual
data files. This
variable can be turned off with
--skip-innodb_doublewrite for benchmarks or
cases when top performance is needed rather than concern for
data integrity or possible failures.

If system tablespace data files (ibdata*
files) are located on Fusion-io devices that support atomic
writes, doublewrite buffering is automatically disabled and
Fusion-io atomic writes are used for all data files. Because
the doublewrite buffer setting is global, doublewrite
buffering is also disabled for data files residing on
non-Fusion-io hardware. This feature is only supported on
Fusion-io hardware and only enabled for Fusion-io NVMFS on
Linux. To take full advantage of this feature, an
innodb_flush_method setting
of O_DIRECT is recommended.

The InnoDB shutdown mode. If the
value is 0, InnoDB does a
slow shutdown, a
full purge and a change
buffer merge before shutting down. If the value is 1 (the
default), InnoDB skips these operations at
shutdown, a process known as a
fast shutdown. If
the value is 2, InnoDB flushes its logs and
shuts down cold, as if MySQL had crashed; no committed
transactions are lost, but the
crash recovery
operation makes the next startup take longer.

The slow shutdown can take minutes, or even hours in extreme
cases where substantial amounts of data are still buffered.
Use the slow shutdown technique before upgrading or
downgrading between MySQL major releases, so that all data
files are fully prepared in case the upgrade process updates
the file format.

Use innodb_fast_shutdown=2 in emergency or
troubleshooting situations, to get the absolute fastest
shutdown if data is at risk of corruption.

By default, setting
innodb_fil_make_page_dirty_debug to the ID
of a tablespace immediately dirties the first page of the
tablespace. If
innodb_saved_page_number_debug
is set to a non-default value, setting
innodb_fil_make_page_dirty_debug dirties
the specified page. The
innodb_fil_make_page_dirty_debug option is
only available if debugging support is compiled in using the
WITH_DEBUG CMake option.

The innodb_file_format
setting is ignored when creating tables that use the
DYNAMIC row format. A table created using
the DYNAMIC row format always uses the
Barracuda file format, regardless of the
innodb_file_format setting.
To use the COMPRESSED row format,
innodb_file_format must be
set to Barracuda.

The innodb_file_format option
is deprecated and will be removed in a future release. The
purpose of the
innodb_file_format option was
to allow users to downgrade to the built-in version of
InnoDB in MySQL 5.1. Now that MySQL 5.1 has
reached the end of its product lifecycle, downgrade support
provided by this option is no longer necessary.

This variable can be set to 1 or 0 at server startup to enable
or disable whether InnoDB checks the
file format tag in the
system
tablespace (for example, Antelope or
Barracuda). If the tag is checked and is
higher than that supported by the current version of
InnoDB, an error occurs and
InnoDB does not start. If the tag is not
higher, InnoDB sets the value of
innodb_file_format_max to the
file format tag.

Note

Despite the default value sometimes being displayed as
ON or OFF, always use
the numeric values 1 or 0 to turn this option on or off in
your configuration file or command line string.

At server startup, InnoDB sets the value of
this variable to the file
format tag in the
system
tablespace (for example, Antelope or
Barracuda). If the server creates or opens
a table with a “higher” file format, it sets the
value of
innodb_file_format_max to
that format.

When innodb_file_per_table is enabled (the
default), InnoDB stores the data and
indexes for each newly created table in a separate
.ibd
file instead of the system tablespace. The storage for
these tables is reclaimed when the tables are dropped or
truncated. This setting enables
InnoDB features such as table
compression. See
Section 14.7.4, “InnoDB File-Per-Table Tablespaces” for more
information.

When innodb_file_per_table is disabled,
InnoDB stores the data for tables and
indexes in the ibdata
files that make up the
system
tablespace. This setting reduces the performance
overhead of file system operations for operations such as
DROP TABLE or
TRUNCATE TABLE. It is most
appropriate for a server environment where entire storage
devices are devoted to MySQL data. Because the system
tablespace never shrinks, and is shared across all databases
in an instance, avoid
loading huge amounts of temporary data on a space-constrained
system when innodb_file_per_table is
disabled. Set up a separate instance in such cases, so that
you can drop the entire instance to reclaim the space.

innodb_file_per_table is enabled by
default. Consider disabling it if backward compatibility with
MySQL 5.5 or 5.1 is a concern. This will prevent
ALTER TABLE from moving
InnoDB tables from the system
tablespace to individual .ibd files.

innodb_file_per_table is
dynamic and can be set ON or
OFF using SET GLOBAL.
You can also set this option in the MySQL
configuration
file (my.cnf or
my.ini) but this requires shutting down
and restarting the server.

Dynamically changing the value requires the
SUPER privilege and immediately affects the
operation of all connections.

InnoDB performs a bulk load when creating
or rebuilding indexes. This method of index creation is known
as a “sorted index build”.

innodb_fill_factor defines the percentage
of space on each B-tree page that is filled during a sorted
index build, with the remaining space reserved for future
index growth. For example, setting
innodb_fill_factor to 80 reserves 20
percent of the space on each B-tree page for future index
growth. Actual percentages may vary. The
innodb_fill_factor setting is interpreted
as a hint rather than a hard limit.

An innodb_fill_factor setting
of 100 leaves 1/16 of the space in clustered index pages free
for future index growth.

innodb_fill_factor applies to both B-tree
leaf and non-leaf pages. It does not apply to external pages
used for TEXT or
BLOB entries.
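
As a rough guide to what these percentages mean in bytes, the Python sketch
below estimates the space left free on a clustered index page after a sorted
index build. The function name reserved_bytes is hypothetical, and because
the setting is a hint rather than a hard limit, the results are estimates
only.

```python
def reserved_bytes(page_bytes: int, fill_factor: int) -> int:
    """Estimate free space left on a clustered index page.

    A fill factor of 100 is modeled as leaving 1/16 of the page free,
    as described above; other values leave (100 - fill_factor) percent.
    """
    if not 1 <= fill_factor <= 100:
        raise ValueError("fill_factor must be between 1 and 100")
    if fill_factor == 100:
        return page_bytes // 16
    return page_bytes * (100 - fill_factor) // 100


PAGE = 16 * 1024  # a 16KB page, used here only as an example
print(reserved_bytes(PAGE, 80))
print(reserved_bytes(PAGE, 100))
```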

Write and flush the logs every N
seconds.
innodb_flush_log_at_timeout
allows the timeout period between flushes to be increased in
order to reduce flushing and avoid impacting performance of
binary log group commit. The default setting for
innodb_flush_log_at_timeout
is once per second.

Controls the balance between strict
ACID compliance for
commit operations and
higher performance that is possible when commit-related I/O
operations are rearranged and done in batches. You can achieve
better performance by changing the default value but then you
can lose up to a second of transactions in a crash.

The default value of 1 is required for full ACID
compliance. With this value, the contents of the
InnoDB log buffer are
written out to the log
file at each transaction commit and the log file is
flushed to disk.

With a value of 0, the contents of the
InnoDB log buffer are written to the
log file approximately once per second and the log file is
flushed to disk. No writes from the log buffer to the log
file are performed at transaction commit. Once-per-second
flushing is not guaranteed to happen every second due to
process scheduling issues. Because the flush to disk
operation only occurs approximately once per second, you
can lose up to a second of transactions with any
mysqld process crash.

With a value of 2, the contents of the
InnoDB log buffer are written to the
log file after each transaction commit and the log file is
flushed to disk approximately once per second.
Once-per-second flushing is not 100% guaranteed to happen
every second, due to process scheduling issues. Because
the flush to disk operation only occurs approximately once
per second, you can lose up to a second of transactions in
an operating system crash or a power outage.

InnoDB log flushing frequency is
controlled by
innodb_flush_log_at_timeout,
which allows you to set log flushing frequency to
N seconds (where
N is 1 ...
2700, with a default value of 1). However, any
mysqld process crash can erase up to
N seconds of transactions.

DDL changes and other internal InnoDB
activities flush the InnoDB log
independent of the
innodb_flush_log_at_trx_commit setting.

InnoDB crash recovery
works regardless of the
innodb_flush_log_at_trx_commit setting.
Transactions are either applied entirely or erased
entirely.
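
The three settings described above can be summarized as a small decision
function. This Python sketch is a simplified model: the function name and
the two failure labels are hypothetical, it assumes the default one-second
flush interval, and it assumes the operating system and disk hardware honor
flush requests.

```python
def can_lose_committed(setting: int, failure: str) -> bool:
    """Can recently committed transactions be lost for this failure type?

    `failure` is either "mysqld_crash" (only the server process dies) or
    "os_or_power" (operating system crash or power outage).
    """
    if failure not in ("mysqld_crash", "os_or_power"):
        raise ValueError("unknown failure type")
    if setting == 1:
        return False  # log written and flushed at every commit
    if setting == 0:
        return True   # log written and flushed only about once per second
    if setting == 2:
        # Log written at commit, flushed about once per second: data that
        # reached the operating system survives a server process crash.
        return failure == "os_or_power"
    raise ValueError("setting must be 0, 1, or 2")


for setting in (0, 1, 2):
    print(setting,
          can_lose_committed(setting, "mysqld_crash"),
          can_lose_committed(setting, "os_or_power"))
```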

For durability and consistency in a replication setup that
uses InnoDB with transactions:

If binary logging is enabled, set
sync_binlog=1.

Always set
innodb_flush_log_at_trx_commit=1.

Caution

Many operating systems and some disk hardware misreport the
flush-to-disk operation. They may tell
mysqld that the flush has taken place,
even though it has not. In this case, the durability of
transactions is not guaranteed even with the setting 1, and
in the worst case, a power outage can corrupt
InnoDB data. Using a battery-backed disk
cache in the SCSI disk controller or in the disk itself
speeds up file flushes, and makes the operation safer. You
can also try to disable the caching of disk writes in
hardware caches.

If innodb_flush_method is set to
NULL on a Unix-like system, the
fsync option is used by default. If
innodb_flush_method is set to
NULL on Windows, the
async_unbuffered option is used by default.

The innodb_flush_method options for
Unix-like systems include:

fsync: InnoDB uses
the fsync() system call to flush both
the data and log files. fsync is the
default setting.

O_DSYNC: InnoDB uses
O_SYNC to open and flush the log files,
and fsync() to flush the data files.
InnoDB does not use
O_DSYNC directly because there have
been problems with it on many varieties of Unix.

littlesync: This option is used for
internal performance testing and is currently unsupported.
Use at your own risk.

nosync: This option is used for
internal performance testing and is currently unsupported.
Use at your own risk.

O_DIRECT: InnoDB
uses O_DIRECT (or
directio() on Solaris) to open the data
files, and uses fsync() to flush both
the data and log files. This option is available on some
GNU/Linux versions, FreeBSD, and Solaris.

O_DIRECT_NO_FSYNC:
InnoDB uses O_DIRECT
during flushing I/O, but skips the
fsync() system call afterward. This
setting is suitable for some types of file systems but not
others. For example, it is not suitable for XFS. If you
are not sure whether the file system you use requires an
fsync(), for example to preserve all
file metadata, use O_DIRECT instead.

How each setting affects performance depends on hardware
configuration and workload. Benchmark your particular
configuration to decide which setting to use, or whether to
keep the default setting. Examine the
Innodb_data_fsyncs status
variable to see the overall number of
fsync() calls for each setting. The mix of
read and write operations in your workload can affect how a
setting performs. For example, on a system with a hardware
RAID controller and battery-backed write cache,
O_DIRECT can help to avoid double buffering
between the InnoDB buffer pool and the
operating system file system cache. On some systems where
InnoDB data and log files are located on a
SAN, the default value or O_DSYNC might be
faster for a read-heavy workload with mostly
SELECT statements. Always test this
parameter with hardware and workload that reflect your
production environment. For general I/O tuning advice, see
Section 8.5.8, “Optimizing InnoDB Disk I/O”.
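
One way to compare settings, sketched here on the assumption that the same workload is replayed for each innodb_flush_method value, is to sample the counter before and after each run. innodb_flush_method cannot be changed at runtime, so each run requires a server restart with the new value.

```sql
-- Show the flush method in effect (NULL means the platform default is used).
SELECT @@GLOBAL.innodb_flush_method;

-- Cumulative number of fsync() operations since server startup.
-- Sample before and after the workload and subtract the two values.
SHOW GLOBAL STATUS LIKE 'Innodb_data_fsyncs';
```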

The default value of 1 flushes contiguous dirty pages in
the same extent from the buffer pool.

A setting of 0 turns
innodb_flush_neighbors off and no other
dirty pages are flushed from the buffer pool.

A setting of 2 flushes dirty pages in the same extent from
the buffer pool.

When the table data is stored on a traditional
HDD storage device, flushing
such neighbor pages
in one operation reduces I/O overhead (primarily for disk seek
operations) compared to flushing individual pages at different
times. For table data stored on
SSD, seek time is not a
significant factor and you can turn this setting off to spread
out write operations. For related information, see
Section 14.6.3.7, “Fine-tuning InnoDB Buffer Pool Flushing”.
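
For example, on a server whose table data is assumed to reside on SSD, neighbor flushing might be turned off at runtime as sketched below. The statement requires the SUPER privilege.

```sql
-- 0 = flush only the selected page, not its neighbors in the same extent.
SET GLOBAL innodb_flush_neighbors = 0;

-- Confirm the value now in effect.
SELECT @@GLOBAL.innodb_flush_neighbors;
```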

The innodb_flush_sync parameter, which is
enabled by default, causes the
innodb_io_capacity setting to
be ignored for bursts of I/O activity that occur at
checkpoints. To adhere
to the limit on InnoDB background I/O
activity defined by the
innodb_io_capacity setting,
disable innodb_flush_sync.

Number of iterations for which InnoDB keeps
the previously calculated snapshot of the flushing state,
controlling how quickly
adaptive
flushing responds to changing
workloads. Increasing the
value makes the rate of
flush operations change
smoothly and gradually as the workload changes. Decreasing the
value makes adaptive flushing adjust quickly to workload
changes, which can cause spikes in flushing activity if the
workload increases and decreases suddenly.

Permits InnoDB to load tables at startup
that are marked as corrupted. Use only during troubleshooting,
to recover data that is otherwise inaccessible. When
troubleshooting is complete, disable this setting and restart
the server.

Only set this variable to a value greater than 0 in an
emergency situation so that you can start
InnoDB and dump your tables. As a safety
measure, InnoDB prevents
INSERT,
UPDATE, or
DELETE operations when
innodb_force_recovery is greater than 0.
An innodb_force_recovery setting of 4 or
greater places InnoDB into read-only
mode.

The memory allocated, in bytes, for the
InnoDB FULLTEXT search
index cache, which holds a parsed document in memory while
creating an InnoDB FULLTEXT index. Index inserts and updates
are only committed to disk when the
innodb_ft_cache_size size limit is reached.
innodb_ft_cache_size defines the cache size
on a per table basis. To set a global limit for all tables,
see
innodb_ft_total_cache_size.

Whether to enable additional full-text search (FTS) diagnostic
output. This option is primarily intended for advanced FTS
debugging and will not be of interest to most users. Output is
printed to the error log and includes information such as:

Specifies that a set of
stopwords is associated
with an InnoDB FULLTEXT
index at the time the index is created. If the
innodb_ft_user_stopword_table
option is set, the stopwords are taken from that table. Else,
if the
innodb_ft_server_stopword_table
option is set, the stopwords are taken from that table.
Otherwise, a built-in set of default stopwords is used.

Maximum character length of words that are stored in an
InnoDB FULLTEXT index.
Setting a limit on this value reduces the size of the index,
thus speeding up queries, by omitting long keywords or
arbitrary collections of letters that are not real words and
are not likely to be search terms.

Minimum length of words that are stored in an
InnoDB FULLTEXT index.
Increasing this value reduces the size of the index, thus
speeding up queries, by omitting common words that are
unlikely to be significant in a search context, such as the
English words “a” and “to”. For
content using a CJK (Chinese, Japanese, Korean) character set,
specify a value of 1.

Number of words to process during each
OPTIMIZE TABLE operation on an
InnoDB FULLTEXT index.
Because a bulk insert or update operation to a table
containing a full-text search index could require substantial
index maintenance to incorporate all changes, you might do a
series of OPTIMIZE TABLE
statements, each picking up where the last left off.
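
A possible sequence is sketched below. It assumes a hypothetical table test.articles that has a FULLTEXT index, and it relies on the innodb_optimize_fulltext_only variable, which is not described in this entry, so that OPTIMIZE TABLE processes only the full-text index data. The word count of 10000 is an arbitrary illustration.

```sql
-- Restrict OPTIMIZE TABLE to full-text index maintenance.
SET GLOBAL innodb_optimize_fulltext_only = ON;

-- Number of words processed by each OPTIMIZE TABLE statement.
SET GLOBAL innodb_ft_num_word_optimize = 10000;

-- Repeat until the index has been fully processed;
-- each statement continues where the previous one stopped.
OPTIMIZE TABLE test.articles;
OPTIMIZE TABLE test.articles;

-- Restore normal OPTIMIZE TABLE behavior afterward.
SET GLOBAL innodb_optimize_fulltext_only = OFF;
```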

The InnoDB full-text search query result
cache limit (defined in bytes) per full-text search query or
per thread. Intermediate and final InnoDB
full-text search query results are handled in memory. Use
innodb_ft_result_cache_limit to place a
size limit on the full-text search query result cache to avoid
excessive memory consumption in case of very large
InnoDB full-text search query results
(millions or hundreds of millions of rows, for example).
Memory is allocated as required when a full-text search query
is processed. If the result cache size limit is reached, an
error is returned indicating that the query exceeds the
maximum allowed memory.

The maximum value of
innodb_ft_result_cache_limit for all
platform types and bit sizes is 2**32-1.

This option is used to specify your own
InnoDB FULLTEXT index
stopword list for all InnoDB tables. To
configure your own stopword list for a specific
InnoDB table, use
innodb_ft_user_stopword_table.

Set innodb_ft_server_stopword_table to the
name of the table containing a list of stopwords, in the
format
db_name/table_name.

The stopword table must exist before you configure
innodb_ft_server_stopword_table.
innodb_ft_enable_stopword must be enabled
and innodb_ft_server_stopword_table option
must be configured before you create the
FULLTEXT index.

The stopword table must be an InnoDB table,
containing a single VARCHAR column named
value.
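
The required order of steps can be sketched as follows. The database name test, the table names my_stopwords and articles, and the sample stopwords are hypothetical.

```sql
-- 1. Create the stopword table: InnoDB, one VARCHAR column named value.
CREATE TABLE test.my_stopwords (value VARCHAR(30)) ENGINE = InnoDB;
INSERT INTO test.my_stopwords (value) VALUES ('the'), ('and');

-- 2. Confirm that stopword processing is enabled.
SELECT @@GLOBAL.innodb_ft_enable_stopword;

-- 3. Point the server at the table, using the db_name/table_name format.
SET GLOBAL innodb_ft_server_stopword_table = 'test/my_stopwords';

-- 4. Only now create the FULLTEXT index that should use the list.
CREATE TABLE test.articles (
  id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  body TEXT,
  FULLTEXT KEY ft_body (body)
) ENGINE = InnoDB;
```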

The total memory allocated, in bytes, for the
InnoDB full-text search index cache for all
tables. Creating numerous tables, each with a
FULLTEXT search index, could consume a
significant portion of available memory.
innodb_ft_total_cache_size
defines a global memory limit for all full-text search indexes
to help avoid excessive memory consumption. If the global
limit is reached by an index operation, a forced sync is
triggered.

This option is used to specify your own
InnoDB FULLTEXT index
stopword list on a specific table. To configure your own
stopword list for all InnoDB tables, use
innodb_ft_server_stopword_table.

Set innodb_ft_user_stopword_table to the
name of the table containing a list of stopwords, in the
format
db_name/table_name.

The stopword table must exist before you configure
innodb_ft_user_stopword_table.
innodb_ft_enable_stopword must be enabled
and innodb_ft_user_stopword_table must be
configured before you create the FULLTEXT
index.

The stopword table must be an InnoDB table,
containing a single VARCHAR column named
value.

The innodb_io_capacity limit
is a total limit for all buffer pool instances. When dirty
pages are flushed, the limit is divided equally among buffer
pool instances.

innodb_io_capacity should be
set to approximately the number of I/O operations that the
system can perform per second. Ideally, keep the setting as
low as practical, but not so low that background activities
fall behind. If the value is too high, data is removed from
the buffer pool and insert buffer too quickly for caching to
provide a significant benefit.

The default value is 200. For busy systems capable of higher
I/O rates, you can set a higher value to help the server
handle the background maintenance work associated with a high
rate of row changes.

In general, you can increase the value as a function of the
number of drives used for InnoDB
I/O. For example, you can increase the value on systems that
use multiple disks or solid-state disks (SSD).

The default setting of 200 is generally sufficient for a
lower-end SSD. For a higher-end, bus-attached SSD, consider a
higher setting such as 1000, for example. For systems with
individual 5400 RPM or 7200 RPM drives, you might lower the
value to 100, which represents an estimated
proportion of the I/O operations per second (IOPS) available
to older-generation disk drives that can perform about 100
IOPS.

Although you can specify a very high value such as one
million, in practice such large values have little if any
benefit. Generally, a value of 20000 or higher is not
recommended unless you have proven that lower values are
insufficient for your workload.

Consider write workload when tuning
innodb_io_capacity. Systems
with large write workloads are likely to benefit from a higher
setting. A lower setting may be sufficient for systems with a
small write workload.

You can set innodb_io_capacity to any
number 100 or greater to a maximum defined by
innodb_io_capacity_max.
innodb_io_capacity can be set in the MySQL
option file (my.cnf or
my.ini) or changed dynamically using a
SET GLOBAL statement, which requires the
SUPER privilege.
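
A minimal sketch of a dynamic change follows. The values 1000 and 2500 are illustrations for a system assumed to use a high-end SSD, not recommendations. The upper limit is raised first so that the new innodb_io_capacity value fits below it.

```sql
-- Raise the ceiling for bursts of background I/O.
SET GLOBAL innodb_io_capacity_max = 2500;

-- Set the normal background I/O budget (must be 100 or greater).
SET GLOBAL innodb_io_capacity = 1000;

-- Confirm both values.
SELECT @@GLOBAL.innodb_io_capacity, @@GLOBAL.innodb_io_capacity_max;
```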

If you specify an
innodb_io_capacity setting at
startup but do not specify a value for
innodb_io_capacity_max,
innodb_io_capacity_max defaults to twice
the value of
innodb_io_capacity, with a
minimum value of 2000.

When configuring innodb_io_capacity_max,
twice the innodb_io_capacity
is often a good starting point. The default value of 2000 is
intended for workloads that use a solid-state disk (SSD) or
more than one regular disk drive. A setting of 2000 is likely
too high for workloads that do not use SSD or multiple disk
drives, and could allow too much flushing. For a single
regular disk drive, a setting between 200 and 400 is
recommended. For a high-end, bus-attached SSD, consider a
higher setting such as 2500. As with the
innodb_io_capacity setting,
keep the setting as low as practical, but not so low that
InnoDB cannot sufficiently extend beyond
the innodb_io_capacity limit,
if necessary.

Consider write workload when tuning
innodb_io_capacity_max. Systems with large
write workloads may benefit from a higher setting. A lower
setting may be sufficient for systems with a small write
workload.

For tables that use
REDUNDANT
or
COMPACT
row format, this option does not affect the permitted index
key prefix length.

innodb_large_prefix is enabled by default
in MySQL 5.7. This change coincides with the
default value change for
innodb_file_format, which is set to
Barracuda by default in MySQL
5.7. Together, these default value changes allow
larger index key prefixes to be created when using
DYNAMIC or COMPRESSED
row format. If either option is set to a non-default value,
index key prefixes larger than 767 bytes are silently
truncated.
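
As an illustration, with the MySQL 5.7 defaults described above, a statement such as the following creates an index key prefix longer than 767 bytes. The table and column names are hypothetical. The latin1 character set is chosen so that each character occupies one byte, which makes the 800-character prefix 800 bytes long.

```sql
-- Requires innodb_large_prefix=ON and innodb_file_format=Barracuda
-- (both defaults in MySQL 5.7) together with DYNAMIC or COMPRESSED row format.
CREATE TABLE test.long_keys (
  id INT UNSIGNED NOT NULL PRIMARY KEY,
  c1 VARCHAR(1000) CHARACTER SET latin1,
  KEY idx_c1 (c1(800))
) ENGINE = InnoDB ROW_FORMAT = DYNAMIC;
```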

innodb_large_prefix is
deprecated and will be removed in a future release.
innodb_large_prefix was
introduced in MySQL 5.5 to disable large index key prefixes
for compatibility with earlier versions of
InnoDB that do not support large index key
prefixes.

The length of time in seconds an InnoDB transaction waits for
a row lock before giving
up. The default value is 50 seconds. A transaction that tries
to access a row that is locked by another
InnoDB transaction waits at most this many
seconds for write access to the row before issuing the
following error:

You might decrease this value for highly interactive
applications or OLTP systems,
to display user feedback quickly or put the update into a
queue for processing later. You might increase this value for
long-running back-end operations, such as a transform step in
a data warehouse that waits for other large insert or update
operations to finish.

innodb_lock_wait_timeout applies to
InnoDB row locks only. A MySQL
table lock does not
happen inside InnoDB and this timeout does
not apply to waits for table locks.

innodb_lock_wait_timeout can
be set at runtime with the SET GLOBAL or
SET SESSION statement. Changing the
GLOBAL setting requires the
SUPER privilege and affects the operation
of all clients that subsequently connect. Any client can
change the SESSION setting for
innodb_lock_wait_timeout,
which affects only that client.
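
The two scopes can be sketched as follows. The values 5 and 120 are arbitrary illustrations, and the GLOBAL statement assumes an account with the SUPER privilege.

```sql
-- Affects only the current session, for example to fail fast
-- in an interactive application.
SET SESSION innodb_lock_wait_timeout = 5;

-- Affects clients that connect after this point.
SET GLOBAL innodb_lock_wait_timeout = 120;

-- Compare the two values.
SELECT @@SESSION.innodb_lock_wait_timeout AS session_timeout,
       @@GLOBAL.innodb_lock_wait_timeout  AS global_timeout;
```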

This variable affects how InnoDB uses
gap locking for searches
and index scans.
innodb_locks_unsafe_for_binlog is
deprecated and will be removed in a future MySQL release.

Normally, InnoDB uses an algorithm called
next-key locking that combines index-row locking with
gap locking.
InnoDB performs row-level locking in such a
way that when it searches or scans a table index, it sets
shared or exclusive locks on the index records it encounters.
Thus, row-level locks are actually index-record locks. In
addition, a next-key lock on an index record also affects the
gap before the index record. That is, a next-key lock is an
index-record lock plus a gap lock on the gap preceding the
index record. If one session has a shared or exclusive lock on
record R in an index, another session
cannot insert a new index record in the gap immediately before
R in the index order. See
Section 14.5.1, “InnoDB Locking”.

By default, the value of
innodb_locks_unsafe_for_binlog is 0
(disabled), which means that gap locking is enabled:
InnoDB uses next-key locks for searches and
index scans. To enable the variable, set it to 1. This causes
gap locking to be disabled: InnoDB uses
only index-record locks for searches and index scans.

Enabling innodb_locks_unsafe_for_binlog
does not disable the use of gap locking for foreign-key
constraint checking or duplicate-key checking.

The effects of enabling
innodb_locks_unsafe_for_binlog are the same
as setting the transaction isolation level to
READ COMMITTED, with these
exceptions:

Enabling
innodb_locks_unsafe_for_binlog
is a global setting and affects all sessions, whereas the
isolation level can be set globally for all sessions, or
individually per session.
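
For comparison, the isolation-level alternative can be sketched as follows. The SESSION form affects only the session that executes it. The GLOBAL form sets the default for sessions that connect afterward and requires the SUPER privilege.

```sql
-- Per-session: subsequent transactions in this session use READ COMMITTED.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Server-wide default for sessions that connect afterward.
SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
```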

Enabling innodb_locks_unsafe_for_binlog may
cause phantom problems because other sessions can insert new
rows into the gaps when gap locking is disabled. Suppose that
there is an index on the id column of the
child table and that you want to read and
lock all rows from the table having an identifier value larger
than 100, with the intention of updating some column in the
selected rows later:

SELECT * FROM child WHERE id > 100 FOR UPDATE;

The query scans the index starting from the first record where
the id is greater than 100. If the locks
set on the index records in that range do not lock out inserts
made in the gaps, another session can insert a new row into
the table. Consequently, if you were to execute the same
SELECT again within the same
transaction, you would see a new row in the result set
returned by the query. This also means that if new items are
added to the database, InnoDB does not
guarantee serializability. Therefore, if
innodb_locks_unsafe_for_binlog is enabled,
InnoDB guarantees at most an isolation
level of READ COMMITTED.
(Conflict serializability is still guaranteed.) For more
information about phantoms, see
Section 14.5.4, “Phantom Rows”.

Enabling innodb_locks_unsafe_for_binlog has
additional effects:

For UPDATE or
DELETE statements,
InnoDB holds locks only for rows that
it updates or deletes. Record locks for nonmatching rows
are released after MySQL has evaluated the
WHERE condition. This greatly reduces
the probability of deadlocks, but they can still happen.

For UPDATE statements, if a
row is already locked, InnoDB performs
a “semi-consistent” read, returning the
latest committed version to MySQL so that MySQL can
determine whether the row matches the
WHERE condition of the
UPDATE. If the row matches
(must be updated), MySQL reads the row again and this time
InnoDB either locks it or waits for a
lock on it.

Suppose also that a second client performs an
UPDATE by executing these
statements following those of the first client:

SET autocommit = 0;
UPDATE t SET b = 4 WHERE b = 2;

As InnoDB executes each
UPDATE, it first acquires an
exclusive lock for each row, and then determines whether to
modify it. If InnoDB does not modify the
row and innodb_locks_unsafe_for_binlog is
enabled, it releases the lock. Otherwise,
InnoDB retains the lock until the end of
the transaction. This affects transaction processing as
follows.

If innodb_locks_unsafe_for_binlog is
disabled, the first UPDATE
acquires x-locks and does not release any of them:

For the second UPDATE,
InnoDB does a
“semi-consistent” read, returning the latest
committed version of each row to MySQL so that MySQL can
determine whether the row matches the WHERE
condition of the UPDATE:

This configuration option was removed and replaced by
innodb_log_checksums.

Specifies how to generate and verify the
checksum stored in each
redo log disk block.
innodb_log_checksum_algorithm supports the same
algorithms as innodb_checksum_algorithm.
Previously, only the innodb algorithm was
supported for redo log disk blocks.
innodb_log_checksum_algorithm=innodb is the
default setting.

The strict forms work the same as innodb,
crc32, and none, except
that InnoDB halts if it encounters a mix of
checksum values in the same redo log. You can only use the
strict settings in a completely new instance. The strict
settings are somewhat faster, because they do not need to
compute both new and old checksum values to accept both during
disk reads.

The following table shows the difference between the
none, innodb, and
crc32 option values, and their strict
counterparts. none,
innodb, and crc32 write
the specified type of checksum value into each data block, but
for compatibility accept any of the other checksum values when
verifying a block during a read operation. The strict form of
each option only recognizes one kind of checksum, which makes
verification faster but requires that all
InnoDB redo logs in an instance are created
under an identical
innodb_log_checksum_algorithm value.

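Table 14.15 innodb_log_checksum_algorithm Settings

Value: none
Generated checksum (when writing): A constant number.
Permitted checksums (when reading): Any of the checksums
generated by none, innodb, or crc32.

Value: innodb
Generated checksum (when writing): A checksum calculated in
software, using the original algorithm from InnoDB.
Permitted checksums (when reading): Any of the checksums
generated by none, innodb, or crc32.

Value: crc32
Generated checksum (when writing): A checksum calculated using
the crc32 algorithm, possibly done with a hardware assist.
Permitted checksums (when reading): Any of the checksums
generated by none, innodb, or crc32.

Value: strict_none
Generated checksum (when writing): A constant number.
Permitted checksums (when reading): Only the checksum generated
by none.

Value: strict_innodb
Generated checksum (when writing): A checksum calculated in
software, using the original algorithm from InnoDB.
Permitted checksums (when reading): Only the checksum generated
by innodb.

Value: strict_crc32
Generated checksum (when writing): A checksum calculated using
the crc32 algorithm, possibly done with a hardware assist.
Permitted checksums (when reading): Only the checksum generated
by crc32.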

Specifies whether images of
re-compressed pages are written to the
redo log. Re-compression
may occur when changes are made to compressed data.

innodb_log_compressed_pages is enabled by
default to prevent corruption that could occur if a different
version of the zlib compression algorithm
is used during recovery. If you are certain that the
zlib version will not change, you can
disable innodb_log_compressed_pages to
reduce redo log generation for workloads that modify
compressed data.

To measure the effect of enabling or disabling
innodb_log_compressed_pages, compare redo
log generation for both settings under the same workload.
Options for measuring redo log generation include observing
the Log sequence number (LSN) in the
LOG section of
SHOW ENGINE
INNODB STATUS output, or monitoring
Innodb_os_log_written status
for the number of bytes written to the redo log files.
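
A minimal sketch of the status-variable approach follows. It assumes that the same workload can be repeated for each setting. The difference between the two samples approximates the redo generated by one run.

```sql
-- Sample the counter before the workload.
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';

-- ... run the workload ...

-- Sample again; subtract the first value to get bytes written to the redo log.
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';

-- Switch the setting, then repeat the measurement with the same workload.
SET GLOBAL innodb_log_compressed_pages = OFF;
```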

The size in bytes of each log
file in a log
group. The combined size of log files
(innodb_log_file_size *
innodb_log_files_in_group)
cannot exceed a maximum value that is slightly less than
512GB. A pair of 255GB log files, for example, approaches the
limit but does not exceed it. The default value is 48MB.
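
The combined size for a running server can be computed from the two variables, as in this sketch. The column aliases are illustrative.

```sql
SELECT @@innodb_log_file_size      AS bytes_per_file,
       @@innodb_log_files_in_group AS files_in_group,
       @@innodb_log_file_size * @@innodb_log_files_in_group
                                   AS combined_bytes;
```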

Generally, the combined size of the log files should be large
enough that the server can smooth out peaks and troughs in
workload activity, which often means that there is enough redo
log space to handle more than an hour of write activity. The
larger the value, the less checkpoint flush activity is
required in the buffer pool, saving disk I/O. Larger log files
also make crash
recovery slower, although improvements to recovery
performance in MySQL 5.5 and higher make the log file size
less of a consideration.

The minimum innodb_log_file_size value was
increased from 1MB to 4MB in MySQL 5.7.11.

The directory path to the InnoDB redo log files, whose
number is specified by
innodb_log_files_in_group. If
you do not specify any InnoDB log
variables, the default is to create two files named
ib_logfile0 and
ib_logfile1 in the MySQL data directory.
Log file size is given by the
innodb_log_file_size system
variable.

The write-ahead block size for the redo log, in bytes. To
avoid “read-on-write”, set
innodb_log_write_ahead_size to match the
operating system or file system cache block size.
Read-on-write occurs when redo log blocks are not entirely
cached to the operating system or file system due to a
mismatch between write-ahead block size for redo logs and
operating system or file system cache block size.

Valid values for
innodb_log_write_ahead_size are multiples
of the InnoDB log file block size (2^n).
The minimum value is the InnoDB log file
block size (512). Write-ahead does not occur when the minimum
value is specified. The maximum value is equal to
innodb_page_size. If you
specify a value for
innodb_log_write_ahead_size that is larger
than the innodb_page_size
value, the innodb_log_write_ahead_size
value is truncated to the
innodb_page_size value.

Setting the innodb_log_write_ahead_size
value too low in relation to the operating system or file
system cache block size results in
“read-on-write”. Setting the value too high may
have a slight impact on fsync performance
for log file writes due to several blocks being written at
once.
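
As a sketch, the statements below set the write-ahead size for a file system that is assumed to use a 4096-byte cache block size. The value 4096 is an illustration; it is a multiple of the 512-byte log block size and does not exceed the default innodb_page_size.

```sql
-- The write-ahead size cannot exceed the page size, so check it first.
SELECT @@innodb_page_size;

-- Match the assumed 4096-byte file system cache block size.
SET GLOBAL innodb_log_write_ahead_size = 4096;

-- Confirm the value that the server accepted.
SELECT @@GLOBAL.innodb_log_write_ahead_size;
```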

A parameter that influences the algorithms and heuristics for
the flush operation for the
InnoDB buffer pool. Primarily
of interest to performance experts tuning I/O-intensive
workloads. It specifies, per buffer pool instance, how far
down the buffer pool LRU list the page cleaner thread scans
looking for dirty pages
to flush. This is a background operation performed once per
second.

A setting smaller than the default is generally suitable for
most workloads. A value that is much higher than necessary may
impact performance. Only consider increasing the value if you
have spare I/O capacity under a typical workload. Conversely,
if a write-intensive workload saturates your I/O capacity,
decrease the value, especially in the case of a large buffer
pool.

When tuning innodb_lru_scan_depth, start
with a low value and configure the setting upward with the
goal of rarely seeing zero free pages. Also, consider
adjusting innodb_lru_scan_depth when
changing the number of buffer pool instances, since
innodb_lru_scan_depth *
innodb_buffer_pool_instances
defines the amount of work performed by the page cleaner
thread each second.
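
The product mentioned above, and the free-page count it is tuned against, can be inspected as sketched here. The value 512 is an arbitrary low starting point, and the column alias is illustrative.

```sql
-- Upper bound on LRU pages examined per second across all instances.
SELECT @@innodb_lru_scan_depth * @@innodb_buffer_pool_instances
       AS lru_pages_scanned_per_second;

-- Free pages currently available in the buffer pool;
-- the goal is for this to rarely reach zero under load.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_free';

-- Start low, then raise the value if free pages run out.
SET GLOBAL innodb_lru_scan_depth = 512;
```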

The InnoDB transaction system maintains a
list of transactions that have index records delete-marked by
UPDATE or
DELETE operations. The length
of the list represents the
purge_lag value. When
purge_lag exceeds
innodb_max_purge_lag,
INSERT,
UPDATE, and
DELETE operations are delayed.

To prevent excessive delays in extreme situations where
purge_lag becomes huge, you can
limit the delay by setting the
innodb_max_purge_lag_delay
configuration option. The delay is computed at the beginning
of a purge batch.

A typical setting for a problematic workload might be 1
million, assuming that transactions are small, only 100 bytes
in size, and it is permissible to have 100MB of unpurged
InnoDB table rows.
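
The threshold from this example could be applied, and the current lag sampled, as in the sketch below. It assumes that the trx_rseg_history_len counter in INFORMATION_SCHEMA.INNODB_METRICS is enabled and reports the length of the list described above.

```sql
-- Delay DML when more than one million transactions await purge.
SET GLOBAL innodb_max_purge_lag = 1000000;

-- Sample the current length of the list.
SELECT NAME, COUNT
  FROM INFORMATION_SCHEMA.INNODB_METRICS
 WHERE NAME = 'trx_rseg_history_len';
```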

The lag value is displayed as the history list length in the
TRANSACTIONS section of
InnoDB Monitor
output. For example, if the output includes the
following lines, the lag value is 20:

Specifies the maximum delay in milliseconds for the delay
imposed by the
innodb_max_purge_lag
configuration option. A non-zero value represents an upper
limit on the delay period computed from the formula based on
the value of innodb_max_purge_lag. The
default of zero means that there is no upper limit imposed on
the delay interval.

Defines a threshold size for undo tablespaces. If an undo
tablespace exceeds the threshold, it can be marked for
truncation when
innodb_undo_log_truncate is
enabled. The default value is 1024 MiB (1073741824 bytes).

Enables the NUMA interleave memory policy for allocation of
the InnoDB buffer pool. When
innodb_numa_interleave is enabled, the NUMA
memory policy is set to MPOL_INTERLEAVE for
the mysqld process. After the
InnoDB buffer pool is allocated, the NUMA
memory policy is set back to MPOL_DEFAULT.
For the innodb_numa_interleave option to be
available, MySQL must be compiled on a NUMA-enabled Linux
system.

Specifies the approximate percentage of the
InnoDB buffer pool used for
the old block sublist. The
range of values is 5 to 95. The default value is 37 (that is,
3/8 of the pool). Often used in combination with
innodb_old_blocks_time.

Non-zero values protect against the
buffer pool being
filled by data that is referenced only for a brief period,
such as during a full
table scan. Increasing this value offers more
protection against full table scans interfering with data
cached in the buffer pool.

Specifies how long in milliseconds a block inserted into the
old sublist must stay
there after its first access before it can be moved to the new
sublist. If the value is 0, a block inserted into the old
sublist moves immediately to the new sublist the first time it
is accessed, no matter how soon after insertion the access
occurs. If the value is greater than 0, blocks remain in the
old sublist until an access occurs at least that many
milliseconds after the first access. For example, a value of
1000 causes blocks to stay in the old sublist for 1 second
after the first access before they become eligible to move to
the new sublist.
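
A short sketch of the 1000-millisecond example follows. Both variables shown are dynamic; changing them requires the SUPER privilege.

```sql
-- Keep newly read blocks in the old sublist for at least 1 second
-- after their first access.
SET GLOBAL innodb_old_blocks_time = 1000;

-- Review the setting together with the size of the old sublist.
SELECT @@GLOBAL.innodb_old_blocks_time, @@GLOBAL.innodb_old_blocks_pct;
```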

Specifies an upper limit on the size of the temporary log
files used during online
DDL operations for InnoDB tables.
There is one such log file for each index being created or
table being altered. This log file stores data inserted,
updated, or deleted in the table during the DDL operation. The
temporary log file is extended when needed by the value of
innodb_sort_buffer_size, up
to the maximum specified by
innodb_online_alter_log_max_size. If a
temporary log file exceeds the upper size limit, the
ALTER TABLE operation fails and
all uncommitted concurrent DML operations are rolled back.
Thus, a large value for this option allows more DML to happen
during an online DDL operation, but also extends the period of
time at the end of the DDL operation when the table is locked
to apply the data from the log.
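
For example, the limit might be raised before a long online operation on a busy table, as sketched below. The table test.articles, the column created_at, and the 1GB value are hypothetical.

```sql
-- Allow up to 1GB of concurrent DML to be logged per index or table.
SET GLOBAL innodb_online_alter_log_max_size = 1073741824;

-- Online index build that permits concurrent reads and writes.
ALTER TABLE test.articles
  ADD INDEX idx_created_at (created_at),
  ALGORITHM = INPLACE, LOCK = NONE;
```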

This configuration option is only relevant if you use multiple
InnoDB tablespaces. It
specifies the maximum number of
.ibd
files that MySQL can keep open at one time. The minimum
value is 10. The default value is 300 if
innodb_file_per_table is not
enabled, and the higher of 300 and
table_open_cache otherwise.

The number of page cleaner threads that flush dirty pages from
buffer pool instances. Page cleaner threads perform flush list
and LRU flushing. A single page cleaner thread was introduced
in MySQL 5.6 to offload buffer pool flushing
work from the InnoDB master thread. In
MySQL 5.7, InnoDB provides
support for multiple page cleaner threads. A value of 1
maintains the pre-MySQL 5.7 configuration in
which there is a single page cleaner thread. When there are
multiple page cleaner threads, buffer pool flushing tasks for
each buffer pool instance are dispatched to idle page cleaner
threads. The innodb_page_cleaners default
value was changed from 1 to 4 in MySQL 5.7. If
the number of page cleaner threads exceeds the number of
buffer pool instances, innodb_page_cleaners
is automatically set to the same value as
innodb_buffer_pool_instances.

If your workload is write-IO bound when flushing dirty pages
from buffer pool instances to data files, and if your system
hardware has available capacity, increasing the number of page
cleaner threads may help improve write-IO throughput.

Multi-threaded page cleaner support is extended to shutdown
and recovery phases in MySQL 5.7.

The setpriority() system call is used on
Linux platforms where it is supported, and where the
mysqld execution user is authorized to give
page_cleaner threads priority over other
MySQL and InnoDB threads to help page
flushing keep pace with the current workload.
setpriority() support is indicated by this
InnoDB startup message:

[Note] InnoDB: If the mysqld execution user is authorized, page cleaner
thread priority can be changed. See the man page of setpriority().

For systems where server startup and shutdown is not managed
by systemd, mysqld execution user
authorization can be configured in
/etc/security/limits.conf. For example,
if mysqld is run under the
mysql user, you can authorize the
mysql user by adding these lines to
/etc/security/limits.conf:

mysql hard nice -20
mysql soft nice -20

For systemd managed systems, the same can be achieved by
specifying LimitNICE=-20 in a localized
systemd configuration file. For example, create the file
/etc/systemd/system/mysqld.service.d/override.conf
and add this entry:

[Service]
LimitNICE=-20

After creating or changing override.conf,
reload the systemd configuration, then tell systemd to restart
the MySQL service:

Specifies the page size
for all InnoDB tablespaces in a MySQL
instance. You can specify
page size using the values 64k, 32k, 16k
(the default), 8k, or
4k. Alternatively, you can specify page
size in bytes (65536, 32768, 16384, 8192, 4096).

Support for 32k and 64k page sizes was added in MySQL
5.7. For both 32k and 64k page sizes, the maximum
row length is approximately 16000 bytes.
ROW_FORMAT=COMPRESSED is not supported when
innodb_page_size is set to 32KB or 64KB.
For innodb_page_size=32k, extent size is
2MB. For innodb_page_size=64k, extent size
is 4MB.
innodb_log_buffer_size should
be set to at least 16M (the default) when using 32k or 64k
page sizes.

The default 16KB page size or larger is appropriate for a wide
range of workloads,
particularly for queries involving table scans and DML
operations involving bulk updates. Smaller page sizes might be
more efficient for OLTP
workloads involving many small writes, where contention can be
an issue when single pages contain many rows. Smaller pages
might also be efficient with
SSD storage devices, which
typically use small block sizes. Keeping the
InnoDB page size close to the storage
device block size minimizes the amount of unchanged data that
is rewritten to disk.

The minimum file size for the first system tablespace data
file (ibdata1) differs depending on the
innodb_page_size value. See
the innodb_data_file_path
option description for more information.

When this option is enabled, information about all
deadlocks in
InnoDB user transactions is recorded in the
mysqld error
log. Otherwise, you see information about only the last
deadlock, using the SHOW ENGINE INNODB
STATUS command. An occasional
InnoDB deadlock is not necessarily an
issue, because InnoDB detects the condition
immediately and rolls back one of the transactions
automatically. You might use this option to troubleshoot why
deadlocks are occurring if an application does not have
appropriate error-handling logic to detect the rollback and
retry its operation. A large number of deadlocks might
indicate the need to restructure transactions that issue
DML or SELECT ... FOR
UPDATE statements for multiple tables, so that each
transaction accesses the tables in the same order, thus
avoiding the deadlock condition.

Defines the number of undo log pages that purge parses and
processes in one batch from the
history list. In a
multi-threaded purge configuration, the coordinator purge
thread divides innodb_purge_batch_size by
innodb_purge_threads and
assigns that number of pages to each purge thread. The
innodb_purge_batch_size option also defines
the number of undo log pages that purge frees after every 128
iterations through the undo logs.
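The division performed by the coordinator thread can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, integer division is assumed, and the sample values 300 and 4 are chosen only for the example.

```python
def pages_per_purge_thread(purge_batch_size, purge_threads):
    """Undo log pages handed to each purge thread in one batch.

    Assumes plain integer division; how InnoDB treats any remainder
    is not described in the text above.
    """
    if purge_threads < 1:
        raise ValueError("at least one purge thread is required")
    return purge_batch_size // purge_threads


# Sample values for illustration only.
share = pages_per_purge_thread(300, 4)
```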

The innodb_purge_batch_size option is
intended for advanced performance tuning in combination with
the innodb_purge_threads
setting. Most MySQL users need not change
innodb_purge_batch_size from its default
value.

The number of background threads devoted to the
InnoDB purge operation. A minimum
value of 1 signifies that the purge operation is always
performed by a background thread, never as part of the
master thread.
Running the purge operation in one or more background threads
helps reduce internal contention within
InnoDB, improving scalability. Increasing
the value to greater than 1 creates that many separate purge
threads, which can improve efficiency on systems where
DML operations are performed
on multiple tables. The maximum is 32.

Defines the frequency with which the purge system frees
rollback segments. An undo tablespace cannot be truncated
until its rollback segments are freed. Normally, the purge
system frees rollback segments once every 128 times that purge
is invoked. Reducing the
innodb_purge_rseg_truncate_frequency value
increases the frequency with which the purge thread frees
rollback segments. The default value is 128.

Controls the sensitivity of linear
read-ahead that
InnoDB uses to prefetch pages into the
buffer pool. If
InnoDB reads at least
innodb_read_ahead_threshold pages
sequentially from an extent
(64 pages), it initiates an asynchronous read for the entire
following extent. The permissible range of values is 0 to 64.
A value of 0 disables read-ahead. For the default of 56,
InnoDB must read at least 56 pages
sequentially from an extent to initiate an asynchronous read
for the following extent.
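The threshold rule can be expressed as a small predicate. This is a sketch of the rule as described above, not InnoDB's implementation; the function name is hypothetical.

```python
EXTENT_PAGES = 64  # pages per extent, as stated above


def triggers_linear_read_ahead(sequential_pages, threshold=56):
    """Return True if the sequential reads from one extent reach the threshold."""
    if not 0 <= threshold <= EXTENT_PAGES:
        raise ValueError("innodb_read_ahead_threshold must be between 0 and 64")
    if threshold == 0:
        # A value of 0 disables read-ahead.
        return False
    return sequential_pages >= threshold
```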

SHOW ENGINE
INNODB STATUS also shows the rate at which the
read-ahead pages are read in and the rate at which such pages
are evicted without being accessed. The per-second averages
are based on the statistics collected since the last
invocation of SHOW ENGINE INNODB STATUS and
are displayed in the BUFFER POOL AND MEMORY
section of the
SHOW ENGINE
INNODB STATUS output.

On Linux systems, running multiple MySQL servers (typically
more than 12) with default settings for
innodb_read_io_threads,
innodb_write_io_threads,
and the Linux aio-max-nr setting can
exceed system limits. Ideally, increase the
aio-max-nr setting; as a workaround, you
might reduce the settings for one or both of the MySQL
configuration options.

InnoDB rolls
back only the last statement on a transaction timeout
by default. If
--innodb_rollback_on_timeout is
specified, a transaction timeout causes
InnoDB to abort and roll back the entire
transaction.

Note

If the statement that started the transaction was a
START
TRANSACTION or
BEGIN
statement, rollback does not cancel that statement. Further
SQL statements become part of the transaction until the
occurrence of COMMIT,
ROLLBACK,
or some SQL statement that causes an implicit commit.

One rollback segment is always assigned to the system
tablespace, and 32 rollback segments are reserved for use by
temporary tables and are hosted in the temporary tablespace
(ibtmp1). To allocate additional rollback
segments for data-modifying transactions that generate undo
records,
innodb_rollback_segments must
be set to a value greater than 33. If you configure separate
undo tablespaces, the rollback segment in the system
tablespace is rendered inactive. Each rollback segment can
support a maximum of 1023 data-modifying transactions.

When innodb_rollback_segments
is set to 32 or less, InnoDB assigns one
rollback segment to the system tablespace and 32 to the
temporary tablespace (ibtmp1).

When innodb_rollback_segments
is set to a value greater than 32, InnoDB
assigns one rollback segment to the system tablespace, 32 to
the temporary tablespace (ibtmp1), and
additional rollback segments to undo tablespaces, if present.
If undo tablespaces are not present, additional rollback
segments are assigned to the system tablespace.

Although you can increase or decrease the number of rollback
segments used by InnoDB, the number of
rollback segments physically present in the system never
decreases. Thus, you might start with a low value for this
parameter and gradually increase it, to avoid allocating
rollback segments that are not required. The
innodb_rollback_segments
default value is 128, which is also the maximum value.
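The assignment rules above can be summarized in a short sketch. The function names and the dictionary layout are hypothetical; the lower bound of 1 is an assumption, while the other numbers (1, 32, 1023, 128) come from the text.

```python
SYSTEM_SEGMENTS = 1         # always assigned to the system tablespace
TEMP_SEGMENTS = 32          # reserved for temporary tables (ibtmp1)
MAX_TRX_PER_SEGMENT = 1023  # data-modifying transactions per rollback segment


def plan_rollback_segments(rollback_segments, undo_tablespaces_present):
    """Return where rollback segments are placed for a given setting."""
    # The lower bound of 1 is an assumption; 128 is the stated maximum.
    if not 1 <= rollback_segments <= 128:
        raise ValueError("innodb_rollback_segments must be between 1 and 128")
    additional = max(0, rollback_segments - SYSTEM_SEGMENTS - TEMP_SEGMENTS)
    plan = {"system": SYSTEM_SEGMENTS, "temporary": TEMP_SEGMENTS, "undo": 0}
    if undo_tablespaces_present:
        plan["undo"] = additional
    else:
        plan["system"] += additional
    return plan


def max_data_modifying_transactions(segments):
    """Upper bound on concurrent data-modifying transactions for N segments."""
    return segments * MAX_TRX_PER_SEGMENT
```

For example, the default of 128 without undo tablespaces places 96 segments in the system tablespace, and a setting of 35 with undo tablespaces places 2 segments in the undo tablespaces.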

Saves a page number. Setting the
innodb_fil_make_page_dirty_debug
option dirties the page defined by
innodb_saved_page_number_debug. The
innodb_saved_page_number_debug option is
only available if debugging support is compiled in using the
WITH_DEBUG CMake option.

Specifies the size of sort buffers used to sort data during
creation of an InnoDB index. The specified
size defines the amount of data that is read into memory for
internal sorting and then written out to disk. This process is
referred to as a “run”. During the merge phase,
pairs of buffers of the specified size are read in and merged.
The larger the setting, the fewer runs and merges there are.

This sort area is only used for merge sorts during index
creation, not during later index maintenance operations.
Buffers are deallocated when index creation completes.

The value of this option also controls the amount by which the
temporary log file is extended to record concurrent DML during
online DDL operations.

Before this setting was made configurable, the size was
hardcoded to 1048576 bytes (1MB), which remains the default.

During an ALTER TABLE or
CREATE TABLE statement that
creates an index, 3 buffers are allocated, each with a size
defined by this option. Additionally, auxiliary pointers are
allocated to rows in the sort buffer so that the sort can run
on pointers (as opposed to moving rows during the sort
operation).
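As a rough illustration of the buffer portion only, the sketch below computes the memory taken by the three sort buffers and an estimate of the number of runs. The function names are hypothetical, the auxiliary pointers are not included, and the run estimate assumes each run fills exactly one buffer with row data.

```python
DEFAULT_SORT_BUFFER_SIZE = 1048576  # 1MB default, as stated above


def sort_buffers_bytes(sort_buffer_size=DEFAULT_SORT_BUFFER_SIZE):
    """Memory for the three sort buffers allocated per index creation."""
    return 3 * sort_buffer_size


def estimated_runs(data_bytes, sort_buffer_size=DEFAULT_SORT_BUFFER_SIZE):
    """Approximate number of runs: ceiling of data size over buffer size.

    Assumption: every run fills one buffer completely; real runs also
    carry per-record overhead that this estimate ignores.
    """
    return -(-data_bytes // sort_buffer_size)
```

Raising the buffer from 1MB to 4MB for 10MB of data reduces the estimate from 10 runs to 3, which matches the statement that a larger setting produces fewer runs and merges.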

For a typical sort operation, a formula such as this one can
be used to estimate memory consumption:

The maximum delay between polls for a
spin lock. The low-level
implementation of this mechanism varies depending on the
combination of hardware and operating system, so the delay
does not correspond to a fixed time interval. For more
information, see
Section 14.6.10, “Configuring Spin Lock Polling”.

Causes InnoDB to automatically recalculate
persistent
statistics after the data in a table is changed
substantially. The threshold value is 10% of the rows in the
table. This setting applies to tables created when the
innodb_stats_persistent
option is enabled. Automatic statistics recalculation may also
be configured by specifying
STATS_PERSISTENT=1 in a
CREATE TABLE or
ALTER TABLE statement. The
amount of data sampled to produce the statistics is controlled
by the
innodb_stats_persistent_sample_pages
configuration option.
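The 10% threshold can be sketched as a simple check. The function name is hypothetical, and whether the boundary itself counts (exactly 10%) is an assumption; the text states only the threshold value.

```python
def needs_stats_recalc(changed_rows, table_rows):
    """Return True once more than 10% of the table's rows have changed.

    Integer arithmetic avoids floating-point rounding. Treating exactly
    10% as "not yet" is an assumption made for this sketch.
    """
    return changed_rows * 10 > table_rows
```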

By default, InnoDB reads uncommitted data
when calculating statistics. In the case of an uncommitted
transaction that deletes rows from a table,
InnoDB excludes records that are
delete-marked when calculating row estimates and index
statistics, which can lead to non-optimal execution plans for
other transactions that are operating on the table
concurrently using a transaction isolation level other than
READ UNCOMMITTED. To avoid
this scenario,
innodb_stats_include_delete_marked
can be enabled to ensure that InnoDB
includes delete-marked records when calculating persistent
optimizer statistics.

How the server treats NULL values when
collecting statistics
about the distribution of index values for
InnoDB tables. Permitted values are
nulls_equal,
nulls_unequal, and
nulls_ignored. For
nulls_equal, all NULL
index values are considered equal and form a single value
group with a size equal to the number of
NULL values. For
nulls_unequal, NULL
values are considered unequal, and each
NULL forms a distinct value group of size
1. For nulls_ignored,
NULL values are ignored.
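The effect of each method on value groups can be illustrated with a small sketch. The function name is hypothetical, and Python's None stands in for SQL NULL.

```python
from collections import Counter


def value_group_sizes(values, method):
    """Return the sorted sizes of the value groups for an index column."""
    sizes = list(Counter(v for v in values if v is not None).values())
    nulls = sum(1 for v in values if v is None)
    if method == "nulls_equal":
        # All NULLs form one group whose size is the number of NULLs.
        if nulls:
            sizes.append(nulls)
    elif method == "nulls_unequal":
        # Each NULL forms its own group of size 1.
        sizes.extend([1] * nulls)
    elif method == "nulls_ignored":
        pass  # NULLs contribute no groups.
    else:
        raise ValueError(f"unknown method: {method!r}")
    return sorted(sizes)


sample = [1, 1, 2, None, None, None]
```

For the sample column, nulls_equal yields groups of sizes 1, 2, and 3; nulls_unequal yields four groups of size 1 and one of size 2; nulls_ignored yields groups of sizes 1 and 2.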

When innodb_stats_on_metadata is enabled,
InnoDB updates non-persistent
statistics when
metadata statements such as SHOW TABLE
STATUS are executed, or when the
INFORMATION_SCHEMA.TABLES or
INFORMATION_SCHEMA.STATISTICS
tables. (These updates are similar to what happens for
ANALYZE TABLE.) When disabled,
InnoDB does not update statistics during
these operations. Leaving the setting disabled can improve
access speed for schemas that have a large number of tables or
indexes. It can also improve the stability of
execution
plans for queries that involve
InnoDB tables.

To change the setting, issue the statement SET GLOBAL
innodb_stats_on_metadata=mode,
where mode is
either ON or OFF (or
1 or 0). Changing the
setting requires the SUPER privilege and
immediately affects the operation of all connections.

Specifies whether InnoDB index statistics
are persisted to disk. Otherwise, statistics may be
recalculated frequently which can lead to variations in
query execution
plans. This setting is stored with each table when the
table is created. You can set
innodb_stats_persistent at the global level
before creating a table, or use the
STATS_PERSISTENT clause of the
CREATE TABLE and
ALTER TABLE statements to
override the system-wide setting and configure persistent
statistics for individual tables.

Enables or disables the InnoDB Lock
Monitor. When enabled, the InnoDB Lock
Monitor prints additional information about locks in
SHOW ENGINE INNODB STATUS output and in
periodic output printed to the MySQL error log. Periodic
output for the InnoDB Lock Monitor is
printed as part of the standard InnoDB
Monitor output. The standard InnoDB Monitor
must therefore be enabled for the InnoDB
Lock Monitor to print data to the MySQL error log
periodically. For more information, see
Section 14.17.2, “Enabling InnoDB Monitors”.

When innodb_strict_mode is enabled,
InnoDB returns errors rather than warnings
for certain conditions.

Strict mode helps
guard against ignored typos and syntax errors in SQL, or other
unintended consequences of various combinations of operational
modes and SQL statements. When
innodb_strict_mode is enabled,
InnoDB raises error conditions in certain
cases, rather than issuing a warning and processing the
specified statement (perhaps with unintended behavior). This
is analogous to
sql_mode in
MySQL, which controls what SQL syntax MySQL accepts, and
determines whether it silently ignores errors, or validates
input syntax and data values.

The innodb_strict_mode setting affects the
handling of syntax errors for CREATE
TABLE, ALTER TABLE,
CREATE INDEX, and
OPTIMIZE TABLE statements.
innodb_strict_mode also enables a record
size check, so that an INSERT or
UPDATE never fails due to the record being
too large for the selected page size.

Oracle recommends enabling
innodb_strict_mode when using
ROW_FORMAT and
KEY_BLOCK_SIZE clauses in
CREATE TABLE,
ALTER TABLE, and
CREATE INDEX statements. When
innodb_strict_mode is disabled,
InnoDB ignores conflicting clauses and
creates the table or index with only a warning in the message
log. The resulting table might have different characteristics
than intended, such as lack of compression support when
attempting to create a compressed table. When
innodb_strict_mode is enabled, such
problems generate an immediate error and the table or index is
not created.

You can enable or disable
innodb_strict_mode on the command line when
starting mysqld, or in a MySQL
configuration
file. You can also enable or disable
innodb_strict_mode at runtime with the
statement SET [GLOBAL|SESSION]
innodb_strict_mode=mode,
where mode is
either ON or OFF.
Changing the GLOBAL setting requires the
SUPER privilege and affects the operation
of all clients that subsequently connect. Any client can
change the SESSION setting for
innodb_strict_mode, and the setting affects
only that client.

Enables InnoDB support for two-phase commit
in XA transactions, causing an
extra disk flush for transaction preparation. The XA mechanism
is used internally and is essential for any server that has
its binary log turned on and is accepting changes to its data
from more than one thread. If you disable
innodb_support_xa, transactions can be
written to the binary log in a different order than the live
database is committing them, which can produce different data
when the binary log is replayed in disaster recovery or on a
replication slave. Do not disable
innodb_support_xa on a replication master
server unless you have an unusual setup where only one thread
is able to change data.

innodb_support_xa is deprecated and will be
removed in a future MySQL release. InnoDB
support for two-phase commit in XA transactions is always
enabled as of MySQL 5.7.10. Disabling
innodb_support_xa is no
longer permitted as it makes replication unsafe and prevents
performance gains associated with binary log group commit.

Splits an internal data structure used to coordinate threads,
for higher concurrency in workloads with large numbers of
waiting threads. This setting must be configured when the
MySQL instance is starting up, and cannot be changed
afterward. Increasing the value is recommended for workloads
that frequently produce a large number of waiting threads,
typically greater than 768.

Enables sync debug checking for the InnoDB
storage engine. This option is only available if debugging
support is compiled in using the
WITH_DEBUG CMake option.

Previously, enabling InnoDB sync debug
checking required that the Debug Sync facility be enabled
using the ENABLE_DEBUG_SYNC CMake option. This requirement was removed
in MySQL 5.7 with the introduction of this
configuration option.

Specifies the path, file name, and file size for
InnoDB temporary tablespace data files. The
full directory path for a file is formed by concatenating
innodb_data_home_dir to the
path specified by
innodb_temp_data_file_path. File size is
specified in KB, MB, or GB (1024MB) by appending
K, M, or
G to the size value. The sum of the sizes
of the files must be slightly larger than 12MB. If you do not
specify a value for
innodb_temp_data_file_path, the default
behavior is to create a single auto-extending temporary
tablespace data file, slightly larger than 12MB, named
ibtmp1. The size limit of individual
files is determined by your operating system. You can set the
file size to more than 4GB on operating systems that support
big files. Use of raw disk partitions for temporary tablespace
data files is not supported.
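The size suffixes can be illustrated with a small parser. This is a sketch under assumptions: the helper names are hypothetical, only a single file specification of the form name:size[:autoextend] is handled, and the sample string is an illustrative value matching the default behavior described above.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def size_to_bytes(size):
    """Convert a size such as '12M' to bytes using the K, M, or G suffix."""
    text = size.strip()
    if text[-1:] not in UNITS:
        raise ValueError(f"size needs a K, M, or G suffix: {size!r}")
    return int(text[:-1]) * UNITS[text[-1]]


def parse_file_spec(spec):
    """Split one 'name:size[:autoextend]' specification into its parts."""
    parts = spec.split(":")
    if len(parts) < 2:
        raise ValueError(f"expected name:size, got {spec!r}")
    autoextend = len(parts) > 2 and parts[2] == "autoextend"
    return parts[0], size_to_bytes(parts[1]), autoextend


# Illustrative value: a single auto-extending 12MB file named ibtmp1.
name, size, autoextend = parse_file_spec("ibtmp1:12M:autoextend")
```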

A temporary tablespace data file name cannot be the same as an
InnoDB data file name. Any inability or
error creating a temporary tablespace data file is treated as
fatal and server startup is refused. The temporary tablespace
has a dynamically generated space ID, which can change on each
server restart.

The InnoDB temporary tablespace is shared
by all non-compressed InnoDB temporary
tables. Compressed InnoDB temporary tables
reside in file-per-table tablespace files, located in the
temporary file directory defined by
tmpdir.

Used to define an alternate directory for temporary sort files
created during online ALTER
TABLE operations that rebuild the table.

Online ALTER TABLE operations
that rebuild the table also create an
intermediate table file in the same
directory as the original table. The
innodb_tmpdir option is not applicable to
intermediate table files.

A valid value is any directory path other than the MySQL data
directory path. If the value is NULL (the default), temporary
files are created in the MySQL temporary directory
($TMPDIR on Unix, %TEMP%
on Windows, or the directory specified by the
--tmpdir configuration
option). If a directory is specified, existence of the
directory and permissions are only checked when
innodb_tmpdir is configured using a
SET
statement. If a symlink is provided in a directory string, the
symlink is resolved and stored as an absolute path. The path
should not exceed 512 bytes. An online
ALTER TABLE operation reports
an error if innodb_tmpdir is set to an
invalid directory. innodb_tmpdir overrides
the MySQL tmpdir setting but
only for online ALTER TABLE
operations.

The FILE privilege is required to configure
innodb_tmpdir.

The innodb_tmpdir option was introduced to
help avoid overflowing a temporary file directory located on a
tmpfs file system. Such overflows could
occur as a result of large temporary sort files created during
online ALTER TABLE operations
that rebuild the table.

In replication environments, only consider replicating the
innodb_tmpdir setting if all servers have
the same operating system environment. Otherwise, replicating
the innodb_tmpdir setting could result in a
replication failure when running online
ALTER TABLE operations that
rebuild the table. If server operating environments differ, it
is recommended that you configure
innodb_tmpdir on each server individually.

InnoDB tries to keep the number of
operating system threads concurrently inside
InnoDB less than or equal to the limit
given by this variable (InnoDB uses
operating system threads to process user transactions). Once
the number of threads reaches this limit, additional threads
are placed into a wait state within a “First In, First
Out” (FIFO) queue for execution. Threads waiting for
locks are not counted in the number of concurrently executing
threads.

The range of this variable is 0 to 1000. A value of 0 (the
default) is interpreted as infinite concurrency (no
concurrency checking). Disabling thread concurrency checking
enables InnoDB to create as many threads as
it needs. A value of 0 also disables the queries
inside InnoDB and queries in queue
counters in the ROW OPERATIONS
section of SHOW ENGINE INNODB STATUS
output.

Consider setting this variable if your MySQL instance shares
CPU resources with other applications, or if your workload or
number of concurrent users is growing. The correct setting
depends on workload, computing environment, and the version of
MySQL that you are running. You will need to test a range of
values to determine the setting that provides the best
performance. innodb_thread_concurrency is a
dynamic variable, which allows you to experiment with
different settings on a live test system. If a particular
setting performs poorly, you can quickly set
innodb_thread_concurrency back to 0.

Use the following guidelines to help find and maintain an
appropriate setting:

If the number of concurrent user threads for a workload is
less than 64, set
innodb_thread_concurrency=0.

If your workload is consistently heavy or occasionally
spikes, start by setting
innodb_thread_concurrency=128 and then
lowering the value to 96, 80, 64, and so on, until you
find the number of threads that provides the best
performance. For example, suppose your system typically
has 40 to 50 users, but periodically the number increases
to 60, 70, or even 200. You find that performance is
stable at 80 concurrent users but starts to show a
regression above this number. In this case, you would set
innodb_thread_concurrency=80 to avoid
impacting performance.

If you do not want InnoDB to use more
than a certain number of vCPUs for user threads (20 vCPUs,
for example), set
innodb_thread_concurrency to this
number (or possibly lower, depending on performance
results). If your goal is to isolate MySQL from other
applications, you may consider binding the
mysqld process exclusively to the
vCPUs. Be aware, however, that exclusive binding could
result in non-optimal hardware usage if the
mysqld process is not consistently
busy. In this case, you might bind the
mysqld process to the vCPUs but also
allow other applications to use some or all of the vCPUs.

Note

From an operating system perspective, using a resource
management solution to manage how CPU time is shared
among applications may be preferable to binding the
mysqld process. For example, you
could assign 90% of vCPU time to a given application
while other critical processes are
not running, and scale that value back to 40%
when other critical processes are
running.

innodb_thread_concurrency values that
are too high can cause performance regression due to
increased contention on system internals and resources.

In some cases, the optimal
innodb_thread_concurrency setting can
be smaller than the number of vCPUs.

Monitor and analyze your system regularly. Changes to
workload, number of users, or computing environment may
require that you adjust the
innodb_thread_concurrency setting.
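The first two guidelines can be turned into a small tuning aid. This is only a sketch: the function names are hypothetical, and the throughput figures used to try it out are invented measurements, not benchmark results.

```python
def starting_thread_concurrency(concurrent_user_threads):
    """Starting point from the guidelines: 0 below 64 user threads, else 128."""
    return 0 if concurrent_user_threads < 64 else 128


# Values to try in turn when lowering the setting, per the guideline.
TRIAL_SEQUENCE = [128, 96, 80, 64]


def best_setting(throughput_by_setting):
    """Pick the tested setting with the highest measured throughput."""
    return max(throughput_by_setting, key=throughput_by_setting.get)


# Invented measurements (transactions per second) for illustration.
measured = {128: 900.0, 96: 950.0, 80: 1000.0, 64: 940.0}
chosen = best_setting(measured)
```

Because innodb_thread_concurrency is dynamic, each candidate can be applied on a live test system and measured before moving to the next one.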

Pauses purging of delete-marked records while allowing the
purge view to be updated. This option artificially creates a
situation in which the purge view is updated but purges have
not yet been performed. This option is only available if
debugging support is compiled in using the
WITH_DEBUG CMake option.

Sets a debug flag that limits
TRX_RSEG_N_SLOTS to a given value for the
trx_rsegf_undo_find_free function that
looks for free slots for undo log segments. This option is
only available if debugging support is compiled in using the
WITH_DEBUG CMake option.

Defines how long InnoDB threads sleep
before joining the InnoDB queue, in
microseconds. The default value is 10000. A value of 0
disables sleep. You can set the configuration option
innodb_adaptive_max_sleep_delay
to the highest value you would allow for
innodb_thread_sleep_delay, and
InnoDB automatically adjusts
innodb_thread_sleep_delay up or down
depending on current thread-scheduling activity. This dynamic
adjustment helps the thread scheduling mechanism to work
smoothly during times when the system is lightly loaded or
when it is operating near full capacity.

When enabled, undo tablespaces that exceed the threshold value
defined by
innodb_max_undo_log_size are
marked for truncation. Only undo tablespaces can be truncated.
Truncating undo logs that reside in the system tablespace is
not supported. For truncation to occur, there must be at least
two undo tablespaces and two redo-enabled undo logs configured
to use undo tablespaces. This means that
innodb_undo_tablespaces must
be set to a value equal to or greater than 2, and
innodb_rollback_segments must
be set to a value equal to or greater than 35.
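The prerequisites can be collected into a single check. This is a sketch of the conditions listed above, with a hypothetical function name; it does not model the size threshold set by innodb_max_undo_log_size.

```python
def undo_truncation_possible(truncate_enabled, undo_tablespaces, rollback_segments):
    """Return True if the configuration allows undo tablespace truncation."""
    return bool(
        truncate_enabled
        and undo_tablespaces >= 2      # at least two undo tablespaces
        and rollback_segments >= 35    # 33 reserved plus two for undo tablespaces
    )
```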

Because undo logs can become large during long-running
transactions, having undo logs in multiple tablespaces reduces
the maximum size of any one tablespace. The undo tablespace
files are created in the location defined by
innodb_undo_directory, with
names in the form of
undoN, where
N is a sequential series of
integers (including leading zeros) representing the space ID.
The default size of an undo tablespace file is 10MiB.

innodb_undo_tablespaces can
only be configured prior to initializing the MySQL instance
and cannot be changed afterward. If no value is specified,
the instance is initialized using the default setting of 0.
Attempting to restart InnoDB with a
greater number of undo tablespaces than specified when the
MySQL instance was initialized results in a startup failure
and an error stating that InnoDB did not
find the expected number of undo tablespaces.

Specifies whether to use the Linux asynchronous I/O subsystem.
This variable applies to Linux systems only, and cannot be
changed while the server is running. Normally, you do not need
to configure this option, because it is enabled by default.

The asynchronous
I/O capability that InnoDB has on
Windows systems is available on Linux systems. (Other
Unix-like systems continue to use synchronous I/O calls.) This
feature improves the scalability of heavily I/O-bound systems,
which typically show many pending reads/writes in
SHOW ENGINE INNODB STATUS\G output.

Running with a large number of InnoDB I/O
threads, and especially running multiple such instances on the
same server machine, can exceed capacity limits on Linux
systems. In this case, you may receive the following error:

EAGAIN: The specified maxevents exceeds the user's limit of available events.

You can typically address this error by writing a higher limit
to /proc/sys/fs/aio-max-nr.

However, if a problem with the asynchronous I/O subsystem in
the OS prevents InnoDB from starting, you
can start the server with
innodb_use_native_aio=0. This
option may also be disabled automatically during startup if
InnoDB detects a potential problem such as
a combination of tmpdir location,
tmpfs file system, and Linux kernel that
does not support AIO on tmpfs.

On Linux systems, running multiple MySQL servers (typically
more than 12) with default settings for
innodb_read_io_threads,
innodb_write_io_threads, and the Linux
aio-max-nr setting can exceed system
limits. Ideally, increase the aio-max-nr
setting; as a workaround, you might reduce the settings for
one or both of the MySQL configuration options.

Also take into consideration the value of
sync_binlog, which controls
synchronization of the binary log to disk.

Be careful not to be too aggressive with settings like innodb_buffer_pool_size. Although your system might have a lot of RAM installed, a 32-bit Linux operating system can't allocate more than 2.2-2.7G* per process.

This is particularly important when performing a file-based sync to set up replication. If you have a different (or no) innodb_log_file_size setting on the slave, you will be puzzled for hours (I was).

Posted by Simon Mudd on October 13, 2009

NOTE: The time to initialise the InnoDB buffer pool is roughly proportional to the size of the pool created. On large installations[*] this initialisation time may be significant.