MySQL Glossary

These terms are commonly used in information about the MySQL
database server. This glossary originated as a reference for
terminology about the InnoDB storage engine, and the majority of
definitions are InnoDB-related.

A

.ARM file

Metadata for ARCHIVE tables. Contrast with
.ARZ file. Files with this
extension are always included in backups produced by the
mysqlbackup command of the
MySQL Enterprise Backup
product.

An acronym standing for atomicity, consistency, isolation, and
durability. These properties are all desirable in a database
system, and are all closely tied to the notion of a
transaction. The transactional
features of InnoDB adhere to the ACID principles.

Transactions are atomic units
of work that can be committed
or rolled back. When a
transaction makes multiple changes to the database, either all
the changes succeed when the transaction is committed, or all
the changes are undone when the transaction is rolled back.

The database remains in a consistent state at all times -- after
each commit or rollback, and while transactions are in progress.
If related data is being updated across multiple tables, queries
see either all old values or all new values, not a mix of old
and new values.

Transactions are protected (isolated) from each other while they
are in progress; they cannot interfere with each other or see
each other's uncommitted data. This isolation is achieved
through the locking mechanism.
Experienced users can adjust the isolation
level, trading off less protection in favor of
increased performance and
concurrency, when they can be
sure that the transactions really do not interfere with each
other.

The results of transactions are durable: once a commit operation
succeeds, the changes made by that transaction are safe from
power failures, system crashes, race conditions, or other
potential dangers that many non-database applications are
vulnerable to. Durability typically involves writing to disk
storage, with a certain amount of redundancy to protect against
power failures or software crashes during write operations. (In
InnoDB, the doublewrite buffer
assists with durability.)

An algorithm for InnoDB tables
that smooths out the I/O overhead introduced by
checkpoints. Instead of
flushing all modified
pages from the
buffer pool to the
data files at once, MySQL
periodically flushes small sets of modified pages. The adaptive
flushing algorithm extends this process by estimating the
optimal rate to perform these periodic flushes, based on the
rate of flushing and how fast
redo information is generated.
First introduced in MySQL 5.1, in the InnoDB Plugin.

An optimization for InnoDB tables that can speed up lookups
using = and IN operators,
by constructing a hash index in
memory. MySQL monitors index searches for InnoDB tables, and if
queries could benefit from a hash index, it builds one
automatically for index pages
that are frequently accessed. In a sense, the adaptive hash
index configures MySQL at runtime to take advantage of ample
main memory, coming closer to the architecture of main-memory
databases. This feature is controlled by the
innodb_adaptive_hash_index
configuration option. Because this feature benefits some
workloads and not others, and the memory used for the hash index
is reserved in the buffer pool,
typically you should benchmark with this feature both enabled
and disabled.

The hash index is always built based on an existing InnoDB
secondary index, which is
organized as a B-tree
structure. MySQL can build a hash index on a prefix of any
length of the key defined for the B-tree, depending on the
pattern of searches against the index. A hash index can be
partial; the whole B-tree index does not need to be cached in
the buffer pool.

In MySQL 5.6 and higher, another way to take advantage of fast
single-value lookups with InnoDB tables is to use the
memcached interface to InnoDB.
See InnoDB Integration with memcached for details.

The code name for the original InnoDB file
format. It supports the
redundant and
compact row formats, but not
the newer dynamic and
compressed row formats
available in the Barracuda file
format.

If your application could benefit from InnoDB table
compression, or uses BLOBs or
large text columns that could benefit from the dynamic row
format, you might switch some tables to Barracuda format. You
select the file format to use by setting the
innodb_file_format option
before creating the table.

A set of functions or procedures. An API provides a stable set
of names and types for functions, procedures, parameters, and
return values.

apply

When a backup produced by the MySQL
Enterprise Backup product does not include the most
recent changes that occurred while the backup was underway, the
process of updating the backup files to include those changes is
known as the apply step. It is
specified by the apply-log option of the
mysqlbackup command.

Before the changes are applied, we refer to the files as a
raw backup. After the changes
are applied, we refer to the files as a
prepared backup. The changes
are recorded in the
ibbackup_logfile file; once the
apply step is finished, this file is no longer necessary.

A type of I/O operation that allows other processing to proceed
before the I/O is completed. Also known as
non-blocking I/O and
abbreviated as AIO. InnoDB uses
this type of I/O for certain operations that can run in parallel
without affecting the reliability of the database, such as
reading pages into the buffer
pool that have not actually been requested, but might
be needed soon.

Historically, InnoDB has used asynchronous I/O on Windows
systems only. Starting with the InnoDB Plugin 1.1 and MySQL 5.5,
InnoDB uses asynchronous I/O on Linux systems. This change
introduces a dependency on libaio.
Asynchronous I/O on Linux systems is configured using the
innodb_use_native_aio option,
which is enabled by default. On other Unix-like systems, InnoDB
uses synchronous I/O only.

In the SQL context,
transactions are units of work
that either succeed entirely (when
committed) or have no effect at
all (when rolled back). The
indivisible ("atomic") property of transactions is the "A" in
the acronym ACID.

Special instructions provided by the CPU, to ensure that
critical low-level operations cannot be interrupted.

auto-increment

A property of a table column (specified by the
AUTO_INCREMENT keyword) that automatically
adds an ascending sequence of values in the column. InnoDB
supports auto-increment only for primary
key columns.

It saves work for the developer, not to have to produce new
unique values when inserting new rows. It provides useful
information for the query optimizer, because the column is known
to be not null and with unique values. The values from such a
column can be used as lookup keys in various contexts, and
because they are auto-generated there is no reason to ever
change them; for this reason, primary key columns are often
specified as auto-incrementing.

Auto-increment columns can be problematic with statement-based
replication, because replaying the statements on a slave might
not produce the same set of column values as on the master, due
to timing issues. When you have an auto-incrementing primary
key, you can use statement-based replication only with the
setting
innodb_autoinc_lock_mode=1. If
you have innodb_autoinc_lock_mode=2, which
allows higher concurrency for insert operations, use
row-based replication rather
than statement-based
replication. The setting
innodb_autoinc_lock_mode=0 is the previous
(traditional) default setting and should not be used except for
compatibility purposes.

The convenience of an
auto-increment primary key
involves some tradeoff with concurrency. In the simplest case,
if one transaction is inserting values into the table, any other
transactions must wait to do their own inserts into that table,
so that rows inserted by the first transaction receive
consecutive primary key values. InnoDB includes optimizations,
and the
innodb_autoinc_lock_mode
option, so that you can choose how to trade off between
predictable sequences of auto-increment values and maximum
concurrency for insert
operations.

A setting that causes a commit
operation after each SQL
statement. This mode is not recommended for working with InnoDB
tables with transactions that
span several statements. It can help performance for
read-only transactions on
InnoDB tables, where it minimizes overhead from
locking and generation of
undo data, especially in MySQL
5.6.4 and up. It is also appropriate for working with MyISAM
tables, where transactions are not applicable.

The ability to cope with, and if necessary recover from,
failures on the host, including failures of MySQL, the operating
system, or the hardware and maintenance activity that may
otherwise cause downtime. Often paired with
scalability as critical aspects
of a large-scale deployment.

B

B-tree

A tree data structure that is popular for use in database
indexes. The structure is kept sorted at all times, enabling
fast lookup for exact matches (equals operator) and ranges (for
example, greater than, less than, and BETWEEN
operators). This type of index is available for most storage
engines, such as InnoDB and MyISAM.

Because B-tree nodes can have many children, a B-tree is not the
same as a binary tree, which is limited to 2 children per node.

Contrast with hash index, which
is only available in the MEMORY storage engine. The MEMORY
storage engine can also use B-tree indexes, and you should
choose B-tree indexes for MEMORY tables if some queries use
range operators.

The use of the term B-tree is intended as a reference to the
general class of index design. B-tree structures used by MySQL
storage engines may be regarded as variants due to
sophistications not present in a classic B-tree design. For
related information, refer to the InnoDB Page Structure
Fil
Header section of the
MySQL
Internals Manual.

Identifiers within MySQL SQL statements must be quoted using the
backtick character (`) if they contain
special characters or reserved words. For example, to refer to a
table named FOO#BAR or a column named
SELECT, you would specify the identifiers as
`FOO#BAR` and `SELECT`.
Since the backticks provide an extra level of safety, they are
used extensively in program-generated SQL statements, where the
identifier names might not be known in advance.

Many other database systems use double quotation marks
(") around such special names. For
portability, you can enable ANSI_QUOTES mode
in MySQL and use double quotation marks instead of backticks to
qualify identifier names.

The process of copying some or all table data and metadata from
a MySQL instance, for safekeeping. Can also refer to the set of
copied files. This is a crucial task for DBAs. The reverse of
this process is the restore
operation.

With MySQL, physical backups
are performed by the MySQL Enterprise
Backup product, and logical
backups are performed by the
mysqldump command. These techniques have
different characteristics in terms of size and representation of
the backup data, and speed (especially speed of the restore
operation).

Backups are further classified as
hot,
warm, or
cold depending on how much they
interfere with normal database operation. (Hot backups have the
least interference, cold backups the most.)

The code name for an InnoDB file
format that supports compression for table data. This
file format was first introduced in the InnoDB Plugin. It
supports the compressed row
format that enables InnoDB table compression, and the
dynamic row format that
improves the storage layout for BLOB and large text columns. You
can select it through the
innodb_file_format option.

The InnoDB system tablespace is
stored in the original Antelope
file format. To use features supported by the Barracuda file
format, you can enable the
file-per-table setting, which
allows tables to be created in file-per-table tablespaces,
separate from the system tablespace. Alternatively, you can
create tables in general tablespaces, which have no dependence
on the InnoDB file format setting. General tablespaces were
introduced in MySQL 5.7.6.

The MySQL Enterprise Backup
product version 3.5 and above supports backing up tablespaces
that use the Barracuda file format.

An early stage in the life of a software product, when it is
available only for evaluation, typically without a definite
release number or a number less than 1. InnoDB does not use the
beta designation, preferring an early
adopter phase that can extend over several point
releases, leading to a GA
release.

A file containing a record of all statements that attempt to
change table data. These statements can be replayed to bring
slave servers up to date in a
replication scenario, or to
bring a database up to date after restoring table data from a
backup. The binary logging feature can be turned on and off,
although Oracle recommends always enabling it if you use
replication or perform backups.

For the MySQL Enterprise Backup
product, the file name of the binary log and the current
position within the file are important details. To record this
information for the master server when taking a backup in a
replication context, you can specify the
--slave-info option.

Prior to MySQL 5.0, a similar capability was available, known as
the update log. In MySQL 5.0 and higher, the binary log replaces
the update log.

A special mode of full-text
search enabled by the WITH QUERY
EXPANSION clause. It performs the search twice, where
the search phrase for the second search is the original search
phrase concatenated with the few most highly relevant documents
from the first search. This technique is mainly applicable for
short search phrases, perhaps only a single word. It can uncover
relevant matches where the precise search term does not occur in
the document.

A portion of a system that is constrained in size or capacity,
that has the effect of limiting overall throughput. For example,
a memory area might be smaller than necessary; access to a
single required resource might prevent multiple CPU cores from
running simultaneously; or waiting for disk I/O to complete
might prevent the CPU from running at full capacity. Removing
bottlenecks tends to improve
concurrency. For example, the
ability to have multiple InnoDB buffer
pool instances reduces contention when multiple
sessions read from and write to the buffer pool simultaneously.

A memory or disk area used for temporary storage. Data is
buffered in memory so that it can be written to disk
efficiently, with a few large I/O operations rather than many
small ones. Data is buffered on disk for greater reliability, so
that it can be recovered even when a
crash or other failure occurs
at the worst possible time. The main types of buffers used by
InnoDB are the buffer pool, the
doublewrite buffer, and the
change buffer.

The memory area that holds cached InnoDB data for both tables
and indexes. For efficiency of high-volume read operations, the
buffer pool is divided into
pages that can potentially hold
multiple rows. For efficiency of cache management, the buffer
pool is implemented as a linked list of pages; data that is
rarely used is aged out of the cache, using a variation of the
LRU algorithm. On systems with
large memory, you can improve concurrency by dividing the buffer
pool into multiple buffer pool
instances.

Several InnoDB status variables,
information_schema tables, and
performance_schema tables help to monitor the
internal workings of the buffer pool. Starting in MySQL 5.6, you
can also dump and restore the contents of the buffer pool,
either automatically during shutdown and restart, or manually at
any time, through a set of InnoDB
configuration variables such as
innodb_buffer_pool_dump_at_shutdown
and
innodb_buffer_pool_load_at_startup.

Any of the multiple regions into which the
buffer pool can be divided,
controlled by the
innodb_buffer_pool_instances
configuration option. The total memory size specified by the
innodb_buffer_pool_size is
divided among all the instances. Typically, multiple buffer pool
instances are appropriate for systems devoting multiple
gigabytes to the InnoDB buffer pool, with each instance 1
gigabyte or larger. On systems loading or looking up large
amounts of data in the buffer pool from many concurrent
sessions, having multiple instances reduces the contention for
exclusive access to the data structures that manage the buffer
pool.

The built-in InnoDB storage engine within MySQL is the original
form of distribution for the storage engine. Contrast with the
InnoDB Plugin. Starting with
MySQL 5.5, the InnoDB Plugin is merged back into the MySQL code
base as the built-in InnoDB storage engine (known as InnoDB
1.1).

This distinction is important mainly in MySQL 5.1, where a
feature or bug fix might apply to the InnoDB Plugin but not the
built-in InnoDB, or vice versa.

The relationships and sequences of actions that form the basis
of business software, used to run a commercial company.
Sometimes these rules are dictated by law, other times by
company policy. Careful planning ensures that the relationships
encoded and enforced by the database, and the actions performed
through application logic, accurately reflect the real policies
of the company and can handle real-life situations.

For example, an employee leaving a company might trigger a
sequence of actions from the human resources department. The
human resources database might also need the flexibility to
represent data about a person who has been hired, but not yet
started work. Closing an account at an online service might
result in data being removed from a database, or the data might
be moved or flagged so that it could be recovered if the account
is re-opened. A company might establish policies regarding
salary maximums, minimums, and adjustments, in addition to basic
sanity checks such as the salary not being a negative number. A
retail database might not allow a purchase with the same serial
number to be returned more than once, or might not allow credit
card purchases above a certain value, while a database used to
detect fraud might allow these kinds of things.

C

.cfg file

A metadata file used with the InnoDBtransportable tablespace
feature. It is produced by the command FLUSH TABLES ...
FOR EXPORT, puts one or more tables in a consistent
state that can be copied to another server. The
.cfg file is copied along with the
corresponding .ibd file, and
used to adjust the internal values of the
.ibd file, such as the
space ID, during the
ALTER TABLE ... IMPORT TABLESPACE step.

The number of different values in a table
column. When queries refer to
columns that have an associated
index, the cardinality of each
column influences which access method is most efficient. For
example, for a column with a unique
constraint, the number of different values is equal
to the number of rows in the table. If a table has a million
rows but only 10 different values for a particular column, each
value occurs (on average) 100,000 times. A query such as
SELECT c1 FROM t1 WHERE c1 = 50; thus might
return 1 row or a huge number of rows, and the database server
might process the query differently depending on the cardinality
of c1.

If the values in a column have a very uneven distribution, the
cardinality might not be a good way to determine the best query
plan. For example, SELECT c1 FROM t1 WHERE c1 =
x; might return 1 row when x=50 and
a million rows when x=30. In such a case, you
might need to use index hints
to pass along advice about which lookup method is more efficient
for a particular query.

Cardinality can also apply to the number of distinct values
present in multiple columns, as in a
composite index.

For InnoDB, the process of estimating cardinality for indexes is
influenced by the
innodb_stats_sample_pages and
the innodb_stats_on_metadata
configuration options. The estimated values are more stable when
the persistent statistics
feature is enabled (in MySQL 5.6 and higher).

A special data structure that records changes to
pages in
secondary indexes. These values
could result from SQL INSERT,
UPDATE, or
DELETE statements
(DML). The set of features
involving the change buffer is known collectively as
change buffering, consisting of
insert buffering,
delete buffering, and
purge buffering.

Changes are only recorded in the change buffer when the relevant
page from the secondary index is not in the
buffer pool. When the relevant
index page is brought into the buffer pool while associated
changes are still in the change buffer, the changes for that
page are applied in the buffer pool
(merged) using the data from
the change buffer. Periodically, the
purge operation that runs
during times when the system is mostly idle, or during a slow
shutdown, writes the new index pages to disk. The purge
operation can write the disk blocks for a series of index values
more efficiently than if each value were written to disk
immediately.

Physically, the change buffer is part of the
system tablespace, so that the
index changes remain buffered across database restarts. The
changes are only applied
(merged) when the pages are
brought into the buffer pool due to some other read operation.

The general term for the features involving the
change buffer, consisting of
insert buffering,
delete buffering, and
purge buffering. Index changes
resulting from SQL statements, which could normally involve
random I/O operations, are held back and performed periodically
by a background thread. This
sequence of operations can write the disk blocks for a series of
index values more efficiently than if each value were written to
disk immediately. Controlled by the
innodb_change_buffering and
innodb_change_buffer_max_size
configuration options.

As changes are made to data pages that are cached in the
buffer pool, those changes are
written to the data files
sometime later, a process known as
flushing. The checkpoint is a
record of the latest changes (represented by an
LSN value) that have been
successfully written to the data files.

In InnoDB, a validation mechanism to detect
corruption when a page in a
tablespace is read from disk
into the InnoDB buffer pool.
This feature is turned on and off by the
innodb_checksums configuration
option. In MySQL 5.6, you can enable a faster checksum algorithm
by also specifying the configuration option
innodb_checksum_algorithm=crc32.

The innochecksum command helps to diagnose
corruption problems by testing the checksum values for a
specified tablespace file while
the MySQL server is shut down.

In a foreign key relationship,
a child table is one whose rows refer (or point) to rows in
another table with an identical value for a specific column.
This is the table that contains the FOREIGN KEY ...
REFERENCES clause and optionally ON
UPDATE and ON DELETE clauses. The
corresponding row in the parent
table must exist before the row can be created in the
child table. The values in the child table can prevent delete or
update operations on the parent table, or can cause automatic
deletion or updates in the child table, based on the ON
CASCADE option used when creating the foreign key.

A type of program that sends requests to a server, and
interprets or processes the results. The client software might
run only some of the time (such as a mail or chat program), and
might run interactively (such as the mysql
command processor).

The InnoDB term for a primary
key index. InnoDB table storage is organized based on
the values of the primary key columns, to speed up queries and
sorts involving the primary key columns. For best performance,
choose the primary key columns carefully based on the most
performance-critical queries. Because modifying the columns of
the clustered index is an expensive operation, choose primary
columns that are rarely or never updated.

In the Oracle Database product, this type of table is known as
an index-organized table.

A data item within a row, whose
storage and semantics are defined by a data type. Each
table and
index is largely defined by the
set of columns it contains.

Each column has a cardinality
value. A column can be the primary
key for its table, or part of the primary key. A
column can be subject to a unique
constraint, a NOT NULL
constraint, or both. Values in different columns,
even across different tables, can be linked by a
foreign key relationship.

In discussions of MySQL internal operations, sometimes
field is used as a synonym.

When an index is created with a length specification, such as
CREATE INDEX idx ON t1 (c1(N)), only the
first N characters of the column value are stored in the index.
Keeping the index prefix small makes the index compact, and the
memory and disk I/O savings help performance. (Although making
the index prefix too small can hinder query optimization by
making rows with different values appear to the query optimizer
to be duplicates.)

For columns containing binary values or long text strings, where
sorting is not a major consideration and storing the entire
value in the index would waste space, the index automatically
uses the first N (typically 768) characters of the value to do
lookups and sorts.

A SQL statement that ends a
transaction, making permanent
any changes made by the transaction. It is the opposite of
rollback, which undoes any
changes made in the transaction.

InnoDB uses an optimistic
mechanism for commits, so that changes can be written to the
data files before the commit actually occurs. This technique
makes the commit itself faster, with the tradeoff that more work
is required in case of a rollback.

By default, MySQL uses the
autocommit setting, which
automatically issues a commit following each SQL statement.

The default InnoDBrow
format since MySQL 5.0.3. Available for tables that
use the Antelopefile format. It has a more
compact representation for nulls and variable-length fields than
the prior default (redundant row
format).

Because of the B-tree indexes
that make row lookups so fast in InnoDB, there is little if any
performance benefit to keeping all rows the same size.

The compression feature of the MySQL
Enterprise Backup product makes a compressed copy of
each tablespace, changing the extension from
.ibd to .ibz. Compressing
the backup data allows you to keep more backups on hand, and
reduces the time to transfer backups to a different server. The
data is uncompressed during the restore operation. When a
compressed backup operation processes a table that is already
compressed, it skips the compression step for that table,
because compressing again would result in little or no space
savings.

A set of files produced by the MySQL
Enterprise Backup product, where each
tablespace is compressed. The
compressed files are renamed with a .ibz file
extension.

Applying compression right at
the start of the backup process helps to avoid storage overhead
during the compression process, and to avoid network overhead
when transferring the backup files to another server. The
process of applying the
binary log takes longer, and
requires uncompressing the backup files.

A row format that enables data
and index compression for
InnoDB tables. It was introduced in the
InnoDB Plugin, available as part of the
Barracuda file format. Large
fields are stored away from the page that holds the rest of the
row data, as in dynamic row
format. Both index pages and the large fields are
compressed, yielding memory and disk savings. Depending on the
structure of the data, the decrease in memory and disk usage
might or might not outweigh the performance overhead of
uncompressing the data as it is used. See
Section 14.9, “InnoDB Table Compression” for usage details.

A feature with wide-ranging benefits from using less disk space,
performing less I/O, and using less memory for caching. InnoDB
table and index data can be kept in a compressed format during
database operation.

The data is uncompressed when needed for queries, and
re-compressed when changes are made by
DML operations. After you
enable compression for a table, this processing is transparent
to users and application developers. DBAs can consult
information_schema tables to
monitor how efficiently the compression parameters work for the
MySQL instance and for particular compressed tables.

When InnoDB table data is compressed, the compression applies to
the table itself, any
associated index data, and the
pages loaded into the buffer
pool. Compression does not apply to pages in the
undo buffer.

The table compression feature requires using MySQL 5.5 or
higher, or the InnoDB Plugin in MySQL 5.1 or earlier, and
creating the table using the
Barracuda file format and
compressed row format, with the
innodb_file_per_table setting
enabaled.

Not actually an error, rather an expensive operation that can
occur when using compression in
combination with DML
operations. It occurs when: updates to a compressed
page overflow the area on the
page reserved for recording modifications; the page is
compressed again, with all changes applied to the table data;
the re-compressed data does not fit on the original page,
requiring MySQL to split the data into two new pages and
compress each one separately. To check the frequency of this
condition, query the table
INFORMATION_SCHEMA.INNODB_CMP and
check how much the value of the COMPRESS_OPS
column exceeds the value of the
COMPRESS_OPS_OK column. Ideally, compression
failures do not occur often; when they do, you can adjust the
configuration options
innodb_compression_level,
innodb_compression_failure_threshold_pct,
and
innodb_compression_pad_pct_max.

The ability of multiple operations (in database terminology,
transactions) to run
simultaneously, without interfering with each other. Concurrency
is also involved with performance, because ideally the
protection for multiple simultaneous transactions works with a
minimum of performance overhead, using efficient mechanisms for
locking.

The file that holds the option
values used by MySQL at startup. Traditionally, on Linux and
UNIX this file is named my.cnf, and on
Windows it is named my.ini. You can set a
number of options related to InnoDB under the
[mysqld] section of the file.

Typically, this file is searched for in the locations
/etc/my.cnf/etc/mysql/my.cnf/usr/local/mysql/etc/my.cnf and
~/.my.cnf. See
Section 4.2.6, “Using Option Files” for details about the search path
for this file.

When you use the MySQL Enterprise
Backup product, you typically use two configuration
files: one that specifies where the data comes from and how it
is structured (which could be the original configuration file
for your real server), and a stripped-down one containing only a
small set of options that specify where the backup data goes and
how it is structured. The configuration files used with the
MySQL Enterprise Backup product
must contain certain options that are typically left out of
regular configuration files, so you might need to add some
options to your existing configuration file for use with
MySQL Enterprise Backup.

A read operation that uses snapshot information to present query
results based on a point in time, regardless of changes
performed by other transactions running at the same time. If
queried data has been changed by another transaction, the
original data is reconstructed based on the contents of the
undo log. This technique avoids
some of the locking issues that
can reduce concurrency by
forcing transactions to wait for other transactions to finish.

With the repeatable read
isolation level, the snapshot is based on the time when the
first read operation is performed. With the
read committed isolation level,
the snapshot is reset to the time of each consistent read
operation.

Consistent read is the default mode in which InnoDB processes
SELECT statements in
READ COMMITTED and
REPEATABLE READ isolation
levels. Because a consistent read does not set any locks on the
tables it accesses, other sessions are free to modify those
tables while a consistent read is being performed on the table.

An automatic test that can block database changes to prevent
data from becoming inconsistent. (In computer science terms, a
kind of assertion related to an invariant condition.)
Constraints are a crucial component of the
ACID philosophy, to maintain
data consistency. Constraints supported by MySQL include
FOREIGN KEY constraints and
unique constraints.

A value that is incremented by a particular kind of
InnoDB operation. Useful for measuring how
busy a server is, troubleshooting the sources of performance
issues, and testing whether changes (for example, to
configuration settings or indexes used by queries) have the
desired low-level effects. Different kinds of counters are
available through
performance_schema tables and
information_schema tables,
particularly
information_schema.innodb_metrics.

An index that includes all the
columns retrieved by a query. Instead of using the index values
as pointers to find the full table rows, the query returns
values from the index structure, saving disk I/O. InnoDB can
apply this optimization technique to more indexes than MyISAM
can, because InnoDB secondary
indexes also include the primary key columns. InnoDB
cannot apply this technique for queries against tables modified
by a transaction, until that transaction ends.

Any column index or
composite index could act as a
covering index, given the right query. Design your indexes and
queries to take advantage of this optimization technique
wherever possible.

MySQL uses the term "crash" to refer generally to any unexpected
shutdown operation where
the server cannot do its normal cleanup. For example, a crash
could happen due to a hardware fault on the database server
machine or storage device; a power failure; a potential data
mismatch that causes the MySQL server to halt; a
fast shutdown initiated by the
DBA; or many other reasons. The robust, automatic
crash recovery for
InnoDB tables ensures that data
is made consistent when the server is restarted, without any
extra work for the DBA.

The cleanup activities that occur when MySQL is started again
after a crash. For
InnoDB tables, changes from
incomplete transactions are replayed using data from the
redo log. Changes that were
committed before the crash, but
not yet written into the data
files, are reconstructed from the
doublewrite buffer. When the
database is shut down normally, this type of activity is
performed during shutdown by the
purge operation.

During normal operation, committed data can be stored in the
change buffer for a period of
time before being written to the data files. There is always a
tradeoff between keeping the data files up-to-date, which
introduces performance overhead during normal operation, and
buffering the data, which can make shutdown and crash recovery
take longer.

Acronym for "create, read, update, delete", a common sequence of
operations in database applications. Often denotes a class of
applications with relatively simple database usage (basic
DDL,
DML and
query statements in
SQL) that can be implemented
quickly in any language.

An internal data structure that is used to represent the result
set of a query, or other
operation that performs a search using an SQL
WHERE clause. It works like an iterator in
other high-level languages, producing each value from the result
set as requested.

Although usually SQL handles the processing of cursors for you,
you might delve into the inner workings when dealing with
performance-critical code.

D

Metadata that keeps track of InnoDB-related objects such as
tables,
indexes, and table
columns. This metadata is
physically located in the InnoDB system
tablespace. For historical reasons, it overlaps to
some degree with information stored in the
.frm files.

Because the MySQL Enterprise
Backup product always backs up the system tablespace,
all backups include the contents of the data dictionary.

A database system or application that primarily runs large
queries. The read-only or
read-mostly data might be organized in
denormalized form for query
efficiency. Can benefit from the optimizations for
read-only transactions in MySQL
5.6 and higher. Contrast with
OLTP.

Within the MySQL data
directory, each database is represented by a separate
directory. The InnoDB system
tablespace, which can hold table data from multiple
databases within a MySQL
instance, is kept in
data files that reside outside
of individual database directories. When
file-per-table mode is enabled,
the .ibd files representing
individual InnoDB tables are stored inside the database
directories unless created elsewhere using the DATA
DIRECTORY clause. General tablespaces, introduced in
MySQL 5.7.6, also hold table data in .ibd
files. Unlike file-per-table
.ibd files, general tablespace
.ibd files can hold table data
from multiple databases within a MySQL
instance, and can be assigned
to directories relative to or independent of the MySQL data
directory.

For long-time MySQL users, a database is a familiar notion.
Users coming from an Oracle Database background will find that
the MySQL meaning of a database is closer to what Oracle
Database calls a schema.

Data definition language, a set of
SQL statements for manipulating
the database itself rather than individual table rows. Includes
all forms of the CREATE,
ALTER, and DROP
statements. Also includes the TRUNCATE
statement, because it works differently than a DELETE
FROM table_name statement,
even though the ultimate effect is similar.

DDL statements automatically
commit the current
transaction; they cannot be
rolled back.

A situation where different
transactions are unable to
proceed, because each holds a
lock that the other needs.
Because both transactions are waiting for a resource to become
available, neither will ever release the locks it holds.

A deadlock can occur when the transactions lock rows in multiple
tables (through statements such as UPDATE or
SELECT ... FOR UPDATE), but in the opposite
order. A deadlock can also occur when such statements lock
ranges of index records and
gaps, with each transaction
acquiring some locks but not others due to a timing issue.

To reduce the possibility of deadlocks, use transactions rather
than LOCK TABLE statements; keep transactions
that insert or update data small enough that they do not stay
open for long periods of time; when different transactions
update multiple tables or large ranges of rows, use the same
order of operations (such as SELECT ... FOR
UPDATE) in each transaction; create indexes on the
columns used in SELECT ... FOR UPDATE and
UPDATE ... WHERE statements. The possibility
of deadlocks is not affected by the
isolation level, because the
isolation level changes the behavior of read operations, while
deadlocks occur because of write operations.

If a deadlock does occur, InnoDB detects the condition and
rolls back one of the
transactions (the victim).
Thus, even if your application logic is perfectly correct, you
must still handle the case where a transaction must be retried.
To see the last deadlock in an InnoDB user transaction, use the
command SHOW ENGINE INNODB STATUS. If
frequent deadlocks highlight a problem with transaction
structure or application error handling, run with the
innodb_print_all_deadlocks
setting enabled to print information about all deadlocks to the
mysqld error log.

When InnoDB processes a DELETE statement, the
rows are immediately marked for deletion and no longer are
returned by queries. The storage is reclaimed sometime later,
during the periodic garbage collection known as the
purge operation, performed by a
separate thread. For removing large quantities of data, related
operations with their own performance characteristics are
truncate and
drop.

The technique of storing changes to secondary index pages,
resulting from DELETE operations, in the
change buffer rather than
writing the changes immediately, so that the physical writes can
be performed to minimize random I/O. (Because delete operations
are a two-step process, this operation buffers the write that
normally marks an index record for deletion.) It is one of the
types of change buffering; the
others are insert buffering and
purge buffering.

A data storage strategy that duplicates data across different
tables, rather than linking the tables with
foreign keys and
join queries. Typically used in
data warehouse applications,
where the data is not updated after loading. In such
applications, query performance is more important than making it
simple to maintain consistent data during updates. Contrast with
normalized.

A type of index available with some database systems, where
index storage is optimized to process ORDER BY
column DESC clauses.
Currently, although MySQL allows the DESC
keyword in the CREATE TABLE
statement, it does not use any special storage layout for the
resulting index.

An operation that retrieves unreliable data, data that was
updated by another transaction but not yet
committed. It is only possible
with the isolation level known
as read uncommitted.

This kind of operation does not adhere to the
ACID principle of database
design. It is considered very risky, because the data could be
rolled back, or updated further
before being committed; then, the transaction doing the dirty
read would be using data that was never confirmed as accurate.

Its polar opposite is consistent
read, where InnoDB goes to great lengths to ensure
that a transaction does not read information updated by another
transaction, even if the other transaction commits in the
meantime.

A kind of database that primarily organizes data on disk storage
(hard drives or equivalent). Data is brought back and forth
between disk and memory to be operated upon. It is the opposite
of an in-memory database.
Although InnoDB is disk-based, it also contains features such as
the buffer pool, multiple
buffer pool instances, and the adaptive
hash index that allow certain kinds of workloads to
work primarily from memory.

Data manipulation language, a set of
SQL statements for performing
insert, update, and delete operations. The
SELECT statement is sometimes
considered as a DML statement, because the SELECT ...
FOR UPDATE form is subject to the same considerations
for locking as
INSERT,
UPDATE, and
DELETE.

DML statements for an InnoDB table operate in the context of a
transaction, so their effects
can be committed or
rolled back as a single unit.

In the InnoDB full-text search
feature, a special column in the table containing the
FULLTEXT index, to uniquely
identify the document associated with each
ilist value. Its name is
FTS_DOC_ID (uppercase required). The column
itself must be of BIGINT UNSIGNED NOT NULL
type, with a unique index named
FTS_DOC_ID_INDEX. Preferably, you define this
column when creating the table. If InnoDB must add the column to
the table while creating a FULLTEXT index,
the indexing operation is considerably more expensive.

InnoDB uses a novel file flush technique called doublewrite.
Before writing pages to the
data files, InnoDB first writes
them to a contiguous area called the doublewrite buffer. Only
after the write and the flush to the doublewrite buffer have
completed, does InnoDB write the pages to their proper positions
in the data file. If there is an operating system, storage
subsystem, or mysqld process crash in the
middle of a page write, InnoDB can later find a good copy of the
page from the doublewrite buffer during
crash recovery.

Although data is always written twice, the doublewrite buffer
does not require twice as much I/O overhead or twice as many I/O
operations. Data is written to the buffer itself as a large
sequential chunk, with a single fsync() call
to the operating system.

A kind of DDL operation that
removes a schema object, through a statement such as
DROP TABLE or
DROP INDEX. It maps internally to
an ALTER TABLE statement. From an
InnoDB perspective, the performance considerations of such
operations involve the time that the data
dictionary is locked to ensure that interrelated
objects are all updated, and the time to update memory
structures such as the buffer
pool. For a table,
the drop operation has somewhat different characteristics than a
truncate operation
(TRUNCATE TABLE statement).

A row format introduced in the InnoDB Plugin,
available as part of the
Barracudafile format. Because
TEXT and BLOB fields are
stored outside of the rest of the page that holds the row data,
it is very efficient for rows that include large objects. Since
the large fields are typically not accessed to evaluate query
conditions, they are not brought into the
buffer pool as often, resulting
in fewer I/O operations and better utilization of cache memory.

E

early adopter

A stage similar to beta, when a software product is typically
evaluated for performance, functionality, and compatibility in a
non-mission-critical setting. InnoDB uses the
early adopter designation
rather than beta, through a
succession of point releases leading up to a
GA release.

The process of removing an item from a cache or other temporary
storage area, such as the InnoDB buffer
pool. Often, but not always, uses the
LRU algorithm to determine
which item to remove. When a dirty
page is evicted, its contents are
flushed to disk, and any
dirty neighbor pages might be
flushed also.

A kind of lock that prevents
any other transaction from
locking the same row. Depending on the transaction
isolation level, this kind of
lock might block other transactions from writing to the same
row, or might also block other transactions from reading the
same row. The default InnoDB isolation level,
REPEATABLE READ, enables higher
concurrency by allowing
transactions to read rows that have exclusive locks, a technique
known as consistent read.

A group of pages within a
tablespace. With the default
page size of 16KB, an extent
contains 64 pages. In MySQL 5.6, the page size for an
InnoDB instance can be 4KB, 8KB, or 16KB,
controlled by the
innodb_page_size configuration
option. For 4KB, 8KB, and 16KB pages sizes, the extent size is
always 1MB (or 1048576 bytes).

Support for 32KB and 64KB InnoDB page sizes
was added in MySQL 5.7.6. For a 32KB page size, the extent size
is 2MB. For a 64KB page size, the extent size is 4MB.

InnoDB features such as
segments,
read-ahead requests and the
doublewrite buffer use I/O
operations that read, write, allocate, or free data one extent
at a time.

F

A file containing the metadata, such as the table definition, of
a MySQL table.

For backups, you must always keep the full set of
.frm files along with the backup data to be
able to restore tables that are altered or dropped after the
backup.

Although each InnoDB table has a .frm file,
InnoDB maintains its own table metadata in the system
tablespace; the .frm files are not needed for
InnoDB to operate on InnoDB tables.

These files are backed up by the MySQL
Enterprise Backup product. These files must not be
modified by an ALTER TABLE operation while
the backup is taking place, which is why backups that include
non-InnoDB tables perform a FLUSH TABLES WITH READ
LOCK operation to freeze such activity while backing
up the .frm files. Restoring a backup can
result in .frm files being created, changed,
or removed to match the state of the database at the time of the
backup.

A capability first introduced in the InnoDB Plugin, now part of
the MySQL server in 5.5 and higher, that speeds up creation of
InnoDB secondary indexes by
avoiding the need to completely rewrite the associated table.
The speedup applies to dropping secondary indexes also.

Because index maintenance can add performance overhead to many
data transfer operations, consider doing operations such as
ALTER TABLE ... ENGINE=INNODB or
INSERT INTO ... SELECT * FROM ... without any
secondary indexes in place, and creating the indexes afterward.

In MySQL 5.6, this feature becomes more general: you can read
and write to tables while an index is being created, and many
more kinds of ALTER TABLE
operations can be performed without copying the table, without
blocking DML operations, or
both. Thus in MySQL 5.6 and higher, we typically refer to this
set of features as online DDL
rather than Fast Index Creation.

The default shutdown procedure
for InnoDB, based on the configuration setting
innodb_fast_shutdown=1. To save
time, certain flush operations
are skipped. This type of shutdown is safe during normal usage,
because the flush operations are performed during the next
startup, using the same mechanism as in
crash recovery. In cases where
the database is being shut down for an upgrade or downgrade, do
a slow shutdown instead to
ensure that all relevant changes are applied to the
data files during the shutdown.

The format used by InnoDB for each table, typically with the
file-per-table setting enabled
so that each table is stored in a separate
.ibd file.
Currently, the file formats available in InnoDB are known as
Antelope and
Barracuda. Each file format
supports one or more row
formats. The row formats available for Barracuda
tables, COMPRESSED and
DYNAMIC, enable important new
storage features for InnoDB tables.

A general name for the setting controlled by the
innodb_file_per_table option,
which is an important configuration option that affects aspects
of InnoDB file storage, availability of features, and I/O
characteristics. In MySQL 5.6.7 and higher, it is enabled by
default. Prior to MySQL 5.6.7, it is disabled by default.

With the innodb_file_per_table
option enabled, you can create a table in its own
.ibd file rather than in the
shared ibdata files of the
system tablespace. When table
data is stored in an individual .ibd
file, you have more flexibility to choose nondefault
file formats and
row formats, which are required
for features such as data
compression. The
TRUNCATE TABLE operation is also much faster,
and the reclaimed space can be used by the operating system
rather than remaining reserved for InnoDB.

The MySQL Enterprise Backup
product is more flexible for tables that are in their own files.
For example, tables can be excluded from a backup, but only if
they are in separate files. Thus, this setting is suitable for
tables that are backed up less frequently or on a different
schedule.

In an InnoDB index, the
proportion of a page that is
taken up by index data before the page is split. The unused
space when index data is first divided between pages allows for
rows to be updated with longer string values without requiring
expensive index maintenance operations. If the fill factor is
too low, the index consumes more space than needed, causing
extra I/O overhead when reading the index. If the fill factor is
too high, any update that increases the length of column values
can cause extra I/O overhead for index maintenance. See
Section 14.5.6.3, “Physical Structure of an InnoDB Index” for more
information.

This row format is used by the MyISAM storage engine, not by
InnoDB. If you create an InnoDB table with the option
row_format=fixed, InnoDB translates this
option to use the compact row
format instead, although the fixed
value might still show up in output such as SHOW TABLE
STATUS reports.

To write changes to the database files, that had been buffered
in a memory area or a temporary disk storage area. The InnoDB
storage structures that are periodically flushed include the
redo log, the
undo log, and the
buffer pool.

Flushing can happen because a memory area becomes full and the
system needs to free some space, because a
commit operation means the
changes from a transaction can be finalized, or because a
slow shutdown operation means
that all outstanding work should be finalized. When it is not
critical to flush all the buffered data at once,
InnoDB can use a technique called
fuzzy checkpointing to flush
small batches of pages to spread out the I/O overhead.

An internal InnoDB data structure that tracks
dirty pages in the
buffer pool: that is,
pages that have been changed
and need to be written back out to disk. This data structure is
updated frequently by InnoDB's internal
mini-transactions, and so is
protected by its own mutex to
allow concurrent access to the buffer pool.

A type of pointer relationship, between rows in separate InnoDB
tables. The foreign key relationship is defined on one column in
both the parent table and the
child table.

In addition to enabling fast lookup of related information,
foreign keys help to enforce referential
integrity, by preventing any of these pointers from
becoming invalid as data is inserted, updated, and deleted. This
enforcement mechanism is a type of
constraint. A row that points
to another table cannot be inserted if the associated foreign
key value does not exist in the other table. If a row is deleted
or its foreign key value changed, and rows in another table
point to that foreign key value, the foreign key can be set up
to prevent the deletion, cause the corresponding column values
in the other table to become
null, or automatically delete
the corresponding rows in the other table.

One of the stages in designing a
normalized database is to
identify data that is duplicated, separate that data into a new
table, and set up a foreign key relationship so that the
multiple tables can be queried like a single table, using a
join operation.

The type of constraint that
maintains database consistency through a
foreign key relationship. Like
other kinds of constraints, it can prevent data from being
inserted or updated if data would become inconsistent; in this
case, the inconsistency being prevented is between data in
multiple tables. Alternatively, when a
DML operation is performed,
FOREIGN KEY constraints can cause data in
child rows to be deleted,
changed to different values, or set to
null, based on the ON
CASCADE option specified when creating the foreign
key.

An operation that requires reading the entire contents of a
table, rather than just selected portions using an index.
Typically performed either with small lookup tables, or in data
warehousing situations with large tables where all available
data is aggregated and analyzed. How frequently these operations
occur, and the sizes of the tables relative to available memory,
have implications for the algorithms used in query optimization
and managing the buffer pool.

The purpose of indexes is to
allow lookups for specific values or ranges of values within a
large table, thus avoiding full table scans when practical.

The MySQL feature for finding words, phrases, Boolean
combinations of words, and so on within table data, in a faster,
more convenient, and more flexible way than using the SQL
LIKE operator or writing your own
application-level search algorithm. It uses the SQL function
MATCH() and
FULLTEXT indexes.

The special kind of index that
holds the search index in the
MySQL full-text search
mechanism. Represents the words from values of a column,
omitting any that are specified as
stopwords. Originally, only
available for MyISAM tables. Starting in
MySQL 5.6.4, it is also available for
InnoDB tables.

G

A place in an InnoDB index data
structure where new values could be inserted. When you lock a
set of rows with a statement such as SELECT ... FOR
UPDATE, InnoDB can create locks that apply to the gaps
as well as the actual values in the index. For example, if you
select all values greater than 10 for update, a gap lock
prevents another transaction from inserting a new value that is
greater than 10. The supremum
record and infimum
record represent the gaps containing all values
greater than or less than all the current index values.

A lock on a
gap between index records, or a
lock on the gap before the first or after the last index record.
For example, SELECT c1 FOR UPDATE FROM t WHERE c1
BETWEEN 10 and 20; prevents other transactions from
inserting a value of 15 into the column t.c1,
whether or not there was already any such value in the column,
because the gaps between all existing values in the range are
locked. Contrast with record
lock and next-key
lock.

Gap locks are part of the tradeoff between performance and
concurrency, and are used in
some transaction isolation
levels and not others.

A type of log used for
diagnosis and troubleshooting of SQL statements processed by the
MySQL server. Can be stored in a file or in a database table.
You must enable this feature through the
general_log configuration
option to use it. You can disable it for a specific connection
through the sql_log_off
configuration option.

Records a broader range of queries than the
slow query log. Unlike the
binary log, which is used for
replication, the general query log contains
SELECT statements and does not
maintain strict ordering. For more information, see
Section 5.2.3, “The General Query Log”.

A shared InnoDB tablespace created using
CREATE TABLESPACE syntax. General
tablespaces can be created outside of the MySQL data directory,
are capable of holding multiple tables, and support tables of
all row formats. General tablespaces were introduced in MySQL
5.7.6.

A type of transaction involved
in XA operations. It consists
of several actions that are transactional in themselves, but
that all must either complete successfully as a group, or all be
rolled back as a group. In essence, this extends
ACID properties "up a level" so
that multiple ACID transactions can be executed in concert as
components of a global operation that also has ACID properties.
For this type of distributed transaction, you must use the
SERIALIZABLE isolation level to
achieve ACID properties.

H

hash index

A type of index intended for
queries that use equality operators, rather than range operators
such as greater-than or BETWEEN. It is
available for MEMORY tables. Although hash indexes are the
default for MEMORY tables for historic reasons, that storage
engine also supports B-tree
indexes, which are often a better choice for general-purpose
queries.

MySQL includes a variant of this index type, the
adaptive hash index, that is
constructed automatically for InnoDB tables if needed based on
runtime conditions.

Acronym for "hard disk drive". Refers to storage media using
spinning platters, usually when comparing and contrasting with
SSD. Its performance
characteristics can influence the throughput of a
disk-based workload.

A periodic message that is sent to indicate that a system is
functioning properly. In a
replication context, if the
master stops sending such
messages, one of the slaves can
take its place. Similar techniques can be used between the
servers in a cluster environment, to confirm that all of them
are operating properly.

A list of transactions with
delete-marked records scheduled to be processed by the
InnoDBpurge
operation. Recorded in the undo
log. The length of the history list is reported by
the command SHOW ENGINE INNODB STATUS. If the
history list grows longer than the value of the
innodb_max_purge_lag
configuration option, each DML
operation is delayed slightly to allow the purge operation to
finish flushing the deleted
records.

A backup taken while the database and is running and
applications are reading and writing to it. The backup involves
more than simply copying data files: it must include any data
that was inserted or updated while the backup was in process; it
must exclude any data that was deleted while the backup was in
process; and it must ignore any changes that were not committed.

The Oracle product that performs hot backups, of InnoDB tables
especially but also tables from MyISAM and other storage
engines, is known as MySQL Enterprise
Backup.

The hot backup process consists of two stages. The initial
copying of the data files produces a raw
backup. The apply
step incorporates any changes to the database that happened
while the backup was running. Applying the changes produces a
prepared backup; these files
are ready to be restored whenever necessary.

I

.ibd file

The data file for
file-per-table tablespaces and
general tablespaces.
File-per-table tablespace
.idb files contain a single table and
associated index data. General tablespace
.idb files may contain table and index data
for multiple tables. General tablespaces were introduced in
MySQL 5.7.6.

The .ibd file extension does not apply to the
system tablespace, which
consists of the ibdata files.

If a file-per-table table is created with the DATA
DIRECTORY = clause (in MySQL 5.6 and higher), the
.ibd file is located outside the normal
database directory, and is pointed to by a
.isl file.

When a .ibd file is included in a compressed
backup by the MySQL Enterprise
Backup product, the compressed equivalent is a
.ibz file.

When the MySQL Enterprise
Backup product performs a
compressed backup, it
transforms each tablespace file
that is created using the
file-per-table setting from a
.ibd extension to a .ibz
extension.

The compression applied during backup is distinct from the
compressed row format that
keeps table data compressed during normal operation. A
compressed backup operation skips the compression step for a
tablespace that is already in compressed row format, as
compressing a second time would slow down the backup but produce
little or no space savings.

A file that specifies the location of a
.ibd file for an InnoDB table
created with the DATA DIRECTORY = clause in
MySQL 5.6 and higher, or with the CREATE TABLESPACE ...
ADD DATAFILE clause in MySQL 5.7.8 and higher. It
functions like a symbolic link, without the platform
restrictions of the actual symbolic link mechanism. You can
store InnoDB tablespaces
outside the database directory,
for example, on an especially large or fast storage device
depending on the usage of the table. For details, see
Creating a File-Per-Table Tablespace Outside the Data Directory, and
InnoDB General Tablespaces.

The set of files managed by InnoDB within a MySQL database: the
system tablespace, any
file-per-table tablespaces, and
the (typically 2) redo log
files. Used sometimes in detailed discussions of InnoDB file
structures and formats, to avoid ambiguity between the meanings
of database between different
DBMS products, and the non-InnoDB files that may be part of a
MySQL database.

A supplemental backup file created by the
MySQL Enterprise Backup product
during a hot backup operation.
It contains information about any data changes that occurred
while the backup was running. The initial backup files,
including ibbackup_logfile, are known as a
raw backup, because the changes
that occurred during the backup operation are not yet
incorporated. After you perform the
apply step to the raw backup
files, the resulting files do include those final data changes,
and are known as a prepared
backup. At this stage, the
ibbackup_logfile file is no longer necessary.

A set of files with names such as ibdata1,
ibdata2, and so on, that make up the InnoDB
system tablespace. These files
contain metadata about InnoDB tables, (the
data dictionary), and the
storage areas for one or more undo
logs, the change
buffer, and the doublewrite
buffer. They also can contain some or all of the
table data also (depending on whether the
file-per-table mode is in
effect when each table is created). When the
innodb_file_per_table option is
enabled, data and indexes for newly created tables are stored in
separate .ibd files rather than
in the system tablespace.

The InnoDB temporary tablespace data file for non-compressed
InnoDB temporary tables and related objects.
The configuration file option,
innodb_temp_data_file_path,
allows users to define a relative path for the temporary data
file. If
innodb_temp_data_file_path is
not specified, the default behavior is to create a single
auto-extending 12MB data file named ibtmp1
in the data directory, alongside ibdata1.

A set of files, typically named ib_logfile0
and ib_logfile1, that form the
redo log. Also sometimes
referred to as the log group.
These files record statements that attempt to change data in
InnoDB tables. These statements are replayed automatically to
correct data written by incomplete transactions, on startup
following a crash.

This data cannot be used for manual recovery; for that type of
operation, use the binary log.

A type of database system that maintains data in memory, to
avoid overhead due to disk I/O and translation between disk
blocks and memory areas. Some in-memory databases sacrifice
durability (the "D" in the ACID
design philosophy) and are vulnerable to hardware, power, and
other types of failures, making them more suitable for read-only
operations. Other in-memory databases do use durability
mechanisms such as logging changes to disk or using non-volatile
memory.

MySQL features that are address the same kinds of
memory-intensive processing include the InnoDB
buffer pool,
adaptive hash index, and
read-only transaction
optimization, the MEMORY storage engine, the MyISAM key cache,
and the MySQL query cache.

A type of hot backup, performed
by the MySQL Enterprise Backup
product, that only saves data changed since some point in time.
Having a full backup and a succession of incremental backups
lets you reconstruct backup data over a long period, without the
storage overhead of keeping several full backups on hand. You
can restore the full backup and then apply each of the
incremental backups in succession, or you can keep the full
backup up-to-date by applying each incremental backup to it,
then perform a single restore operation.

The granularity of changed data is at the
page level. A page might
actually cover more than one row. Each changed page is included
in the backup.

A data structure that provides a fast lookup capability for
rows of a
table, typically by forming a
tree structure (B-tree)
representing all the values of a particular
column or set of columns.

InnoDB tables always have a clustered
index representing the primary
key. They can also have one or more
secondary indexes defined on
one or more columns. Depending on their structure, secondary
indexes can be classified as
partial,
column, or
composite indexes.

Indexes are a crucial aspect of
query performance. Database
architects design tables, queries, and indexes to allow fast
lookups for data needed by applications. The ideal database
design uses a covering index
where practical; the query results are computed entirely from
the index, without reading the actual table data. Each
foreign key constraint also
requires an index, to efficiently check whether values exist in
both the parent and
child tables.

Although a B-tree index is the most common, a different kind of
data structure is used for hash
indexes, as in the MEMORY storage
engine and the InnoDB adaptive hash
index.

A memory area that holds the token data for InnoDB
full-text search. It buffers
the data to minimize disk I/O when data is inserted or updated
in columns that are part of a FULLTEXT
index. The token data is written to disk when the
index cache becomes full. Each InnoDB
FULLTEXT index has its own separate index
cache, whose size is controlled by the configuration option
innodb_ft_cache_size.

Extended SQL syntax for overriding the
indexes recommended by the
optimizer. For example, the FORCE INDEX,
USE INDEX, and IGNORE
INDEX clauses. Typically used when indexed columns
have unevenly distributed values, resulting in inaccurate
cardinality estimates.

In an index that applies to
multiple columns (known as a composite
index), the initial or leading columns of the index.
A query that references the first 1, 2, 3, and so on columns of
a composite index can use the index, even if the query does not
reference all the columns in the index.

A pseudo-record in an
index, representing the
gap below the smallest value in
that index. If a transaction has a statement such as
SELECT ... FOR UPDATE ... WHERE col < 10;,
and the smallest value in the column is 5, it is a lock on the
infimum record that prevents other transactions from inserting
even smaller values such as 0, -10, and so on.

The name of the database that
provides a query interface to the MySQL
data dictionary. (This name is
defined by the ANSI SQL standard.) To examine information
(metadata) about the database, you can query tables such as
INFORMATION_SCHEMA.TABLES and
INFORMATION_SCHEMA.COLUMNS, rather than using
SHOW commands that produce unstructured
output.

The information schema contains some tables that are specific to
InnoDB, such as
INNODB_LOCKS and
INNODB_TRX. You use these tables
not to see how the database is structured, but to get real-time
information about the workings of InnoDB tables to help with
performance monitoring, tuning, and troubleshooting. In
particular, these tables provide information about MySQL
features related to
compression, and
transactions and their
associated locks.

A MySQL component that combines high performance with
transactional capability for
reliability, robustness, and concurrent access. It embodies the
ACID design philosophy.
Represented as a storage
engine; it handles tables created or altered with the
ENGINE=INNODB clause. See
Chapter 14, The InnoDB Storage Engine for architectural
details and administration procedures, and
Section 8.5, “Optimizing for InnoDB Tables” for performance advice.

In MySQL 5.5 and higher, InnoDB is the default storage engine
for new tables and the ENGINE=INNODB clause
is not required. In MySQL 5.1 only, many of the advanced InnoDB
features require enabling the component known as the InnoDB
Plugin. See Section 14.1.2, “InnoDB as the Default MySQL Storage Engine” for the
considerations involved in transitioning to recent releases
where InnoDB tables are the default.

The innodb_autoinc_lock_mode
option controls the algorithm used for
auto-increment locking. When
you have an auto-incrementing primary
key, you can use statement-based replication only
with the setting
innodb_autoinc_lock_mode=1.
This setting is known as
consecutive lock mode, because
multi-row inserts within a transaction receive consecutive
auto-increment values. If you have
innodb_autoinc_lock_mode=2, which allows
higher concurrency for insert operations, use row-based
replication rather than statement-based replication. This
setting is known as interleaved
lock mode, because multiple multi-row insert statements running
at the same time can receive autoincrement values that are
interleaved. The setting
innodb_autoinc_lock_mode=0 is the previous
(traditional) default setting and should not be used except for
compatibility purposes.

An important configuration option that affects many aspects of
InnoDB file storage, availability of features, and I/O
characteristics. In MySQL 5.6.7 and higher, it is enabled by
default. Prior to MySQL 5.6.7, it is disabled by default. The
innodb_file_per_table option
turns on file-per-table mode.
With this mode enabled, a newly created InnoDB table and
associated indexes can be stored in an
.ibd file, outside the
system tablespace.

This option affects the performance and storage considerations
for a number of SQL statements, such as
DROP TABLE and
TRUNCATE TABLE.

Enabling the
innodb_file_per_table option
allows you to take advantage of other features, such as table
compression, and backups of
named tables in MySQL Enterprise
Backup.

The innodb_lock_wait_timeout
option sets the balance between
waiting for shared resources to
become available, or giving up and handling the error, retrying,
or doing alternative processing in your application. Rolls back
any InnoDB transaction that waits more than a specified time to
acquire a lock. Especially
useful if deadlocks are caused
by updates to multiple tables controlled by different storage
engines; such deadlocks are not
detected automatically.

One of the primary DML
operations in SQL. The
performance of inserts is a key factor in
data warehouse systems that
load millions of rows into tables, and
OLTP systems where many
concurrent connections might insert rows into the same table, in
arbitrary order. If insert performance is important to you, you
should learn about InnoDB
features such as the insert
buffer used in change
buffering, and
auto-increment columns.

The former name of the change
buffer. In MySQL 5.5, support was added for buffering
changes to secondary index pages for
DELETE and
UPDATE operations. Previously,
only changes resulting from
INSERT operations were buffered.
The preferred term is now change
buffer.

The technique of storing changes to secondary index pages,
resulting from INSERT operations,
in the change buffer rather
than writing the changes immediately, so that the physical
writes can be performed to minimize random I/O. It is one of the
types of change buffering; the
others are delete buffering and
purge buffering.

Insert buffering is not used if the secondary index is
unique, because the uniqueness
of new values cannot be verified before the new entries are
written out. Other kinds of change buffering do work for unique
indexes.

A single mysqld daemon managing
a data directory representing
one or more databases with a
set of tables. It is common in
development, testing, and some
replication scenarios to have
multiple instances on the same
server machine, each managing
its own data directory and listening on its own port or socket.
With one instance running a
disk-bound workload, the server
might still have extra CPU and memory capacity to run additional
instances.

Modifications at the source code level to collect performance
data for tuning and debugging. In MySQL, data collected by
instrumentation is exposed through a SQL interface using the
INFORMATION_SCHEMA and
PERFORMANCE_SCHEMA databases.

A kind of lock that applies to
the table level, used to indicate what kind of lock the
transaction intends to acquire on rows in the table. Different
transactions can acquire different kinds of intention locks on
the same table, but the first transaction to acquire an
intention exclusive (IX) lock
on a table prevents other transactions from acquiring any S or X
locks on the table. Conversely, the first transaction to acquire
an intention shared (IS) lock
on a table prevents other transactions from acquiring any X
locks on the table. The two-phase process allows the lock
requests to be resolved in order, without blocking locks and
corresponding operations that are compatible. For more details
on this locking mechanism, see
Section 14.5.2.1, “InnoDB Lock Modes”.

A data structure optimized for document retrieval systems, used
in the implementation of InnoDB full-text
search. The InnoDB FULLTEXT
index, implemented as an inverted index, records the
position of each word within a document, rather than the
location of a table row. A single column value (a document
stored as a text string) is represented by many entries in the
inverted index.

Acronym for I/O operations per
second. A common measurement for busy systems,
particularly OLTP applications.
If this value is near the maximum that the storage devices can
handle, the application can become
disk-bound, limiting
scalability.

One of the foundations of database processing. Isolation is the
I in the acronym
ACID; the isolation level is
the setting that fine-tunes the balance between performance and
reliability, consistency, and reproducibility of results when
multiple transactions are
making changes and performing queries at the same time.

From highest amount of consistency and protection to the least,
the isolation levels supported by InnoDB are:
SERIALIZABLE,
REPEATABLE READ,
READ COMMITTED, and
READ UNCOMMITTED.

With InnoDB tables, many users can keep the default isolation
level (REPEATABLE READ) for all
operations. Expert users might choose the
read committed level as they
push the boundaries of scalability with OLTP processing, or
during data warehousing operations where minor inconsistencies
do not affect the aggregate results of large amounts of data.
The levels on the edges
(SERIALIZABLE and
READ UNCOMMITTED) change the
processing behavior to such an extent that they are rarely used.

J

join

A query that retrieves data
from more than one table, by referencing columns in the tables
that hold identical values. Ideally, these columns are part of
an InnoDB foreign key
relationship, which ensures referential
integrity and that the join columns are
indexed. Often used to save
space and improve query performance by replacing repeated
strings with numeric IDs, in a
normalized data design.

K

KEY_BLOCK_SIZE

An option to specify the size of data pages within an InnoDB
table that uses compressed row
format. The default is 8 kilobytes. Lower values risk
hitting internal limits that depend on the combination of row
size and compression percentage.

L

latch

A lightweight structure used by InnoDB to implement a
lock for its own internal
memory structures, typically held for a brief time measured in
milliseconds or microseconds. A general term that includes both
mutexes (for exclusive access)
and rw-locks (for shared
access). Certain latches are the focus of InnoDB performance
tuning, such as the data
dictionary mutex. Statistics about latch use and
contention are available through the
Performance Schema interface.

The InnoDB buffer pool is
represented as a list of memory
pages. The list is reordered as
new pages are accessed and enter the buffer pool, as pages
within the buffer pool are accessed again and are considered
newer, and as pages that are not accessed for a long time are
evicted from the buffer pool.
The buffer pool is actually divided into
sublists, and the replacement
policy is a variation of the familiar
LRU technique.

The high-level notion of an object that controls access to a
resource, such as a table, row, or internal data structure, as
part of a locking strategy. For
intensive performance tuning, you might delve into the actual
structures that implement locks, such as
mutexes and
latches.

An operation used in some database systems that converts many
row locks into a single table lock, saving memory space but
reducing concurrent access to the table. InnoDB uses a
space-efficient representation for row locks, so that lock
escalation is not needed.

A shared (S) lock allows a transaction to read a row. Multiple
transactions can acquire an S lock on that same row at the same
time.

An exclusive (X) lock allows a transaction to update or delete a
row. No other transaction can acquire any kind of lock on that
same row at the same time.

Intention locks apply to the
table level, and are used to indicate what kind of lock the
transaction intends to acquire on rows in the table. Different
transactions can acquire different kinds of intention locks on
the same table, but the first transaction to acquire an
intention exclusive (IX) lock on a table prevents other
transactions from acquiring any S or X locks on the table.
Conversely, the first transaction to acquire an intention shared
(IS) lock on a table prevents other transactions from acquiring
any X locks on the table. The two-phase process allows the lock
requests to be resolved in order, without blocking locks and
corresponding operations that are compatible.

The system of protecting a
transaction from seeing or
changing data that is being queried or changed by other
transactions. The locking strategy must balance reliability and
consistency of database operations (the principles of the
ACID philosophy) against the
performance needed for good
concurrency. Fine-tuning the
locking strategy often involves choosing an
isolation level and ensuring
all your database operations are safe and reliable for that
isolation level.

A SELECT statement that also
performs a locking operation on
an InnoDB table. Either
SELECT ... FOR
UPDATE or SELECT ... LOCK
IN SHARE MODE. It has the potential to produce a
deadlock, depending on the
isolation level of the
transaction. The opposite of a non-locking
read. Not allowed for global tables in a
read-only transaction.

In the InnoDB context, “log”log or “log
files” typically refers to the redo
log represented by the
ib_logfile* files. Another log
area which may be physically part of the
system tablespace is the
undo log.

Other kinds of logs that are important in MySQL are the
error log (for diagnosing
startup and runtime problems), binary
log (for working with replication and performing
point-in-time restores), the general query
log (for diagnosing application problems), and the
slow query log (for diagnosing
performance problems).

A type of operation that involves high-level, abstract aspects
such as tables, queries, indexes, and other SQL concepts.
Typically, logical aspects are important to make database
administration and application development convenient and
usable. Contrast with physical.

A backup that reproduces table
structure and data, without copying the actual data files. For
example, the
mysqldump
command produces a logical backup, because its output contains
statements such as CREATE TABLE and
INSERT that can re-create the data. Contrast
with physical backup. A logical
backup offers flexibility (for example, you could edit table
definitions or insert statements before restoring), but can take
substantially longer to restore
than a physical backup.

In MySQL 5.1, a prefix added to InnoDB configuration options
when installing the InnoDB
Plugin after server startup, so
any new configuration options not recognized by the current
level of MySQL do not cause a startup failure. MySQL processes
configuration options that start with this prefix, but gives a
warning rather than a failure if the part after the prefix is
not a recognized option.

An acronym for "least recently used", a common method for
managing storage areas. The items that have not been used
recently are evicted when space
is needed to cache newer items. InnoDB uses the LRU mechanism by
default to manage the pages
within the buffer pool, but
makes exceptions in cases where a page might be read only a
single time, such as during a full table
scan. This variation of the LRU algorithm is called
the midpoint insertion
strategy. The ways in which the buffer pool
management differs from the traditional LRU algorithm is
fine-tuned by the options
innodb_old_blocks_pct,
innodb_old_blocks_time, and the
new MySQL 5.6 options
innodb_lru_scan_depth and
innodb_flush_neighbors.

Acronym for "log sequence number". This arbitrary,
ever-increasing value represents a point in time corresponding
to operations recorded in the redo
log. (This point in time is regardless of
transaction boundaries; it can
fall in the middle of one or more transactions.) It is used
internally by InnoDB during crash
recovery and for managing the buffer pool.

Prior to MySQL 5.6.3, the LSN was a 4-byte unsigned integer. The
LSN became an 8-byte unsigned integer in MySQL 5.6.3 when the
redo log file size limit increased from 4GB to 512GB, as
additional bytes were required to store extra size information.
Applications built on MySQL 5.6.3 or later that use LSN values
should use 64-bit rather than 32-bit variables to store and
compare LSN values.

In the MySQL Enterprise Backup
product, you can specify an LSN to represent the point in time
from which to take an incremental
backup. The relevant LSN is displayed by the output
of the mysqlbackup command. Once you have the
LSN corresponding to the time of a full backup, you can specify
that value to take a subsequent incremental backup, whose output
contains another LSN for the next incremental backup.

M

.MRG file

A file containing references to other tables, used by the
MERGE storage engine. Files with this
extension are always included in backups produced by the
mysqlbackup command of the
MySQL Enterprise Backup
product.

Frequently shortened to "master". A database server machine in a
replication scenario that
processes the initial insert, update, and delete requests for
data. These changes are propagated to, and repeated on, other
servers known as slave servers.

An InnoDB thread that performs
various tasks in the background. Most of these tasks are I/O
related, such as writing changes from the
change buffer to the
appropriate secondary indexes.

To improve concurrency,
sometimes actions are moved from the master thread to separate
background threads. For example, in MySQL 5.6 and higher,
dirty pages are
flushed from the
buffer pool by the
page cleaner thread rather than
the master thread.

A popular component of many MySQL and
NoSQL software stacks, allowing
fast reads and writes for single values and caching the results
entirely in memory. Traditionally, applications required extra
logic to write the same data to a MySQL database for permanent
storage, or to read data from a MySQL database when it was not
cached yet in memory. Now, applications can use the simple
memcached protocol, supported by client
libraries for many languages, to communicate directly with MySQL
servers using InnoDB or MySQL
Cluster tables. These NoSQL interfaces to MySQL tables allow
applications to achieve higher read and write performance than
by issuing SQL commands directly, and can simplify application
logic and deployment configurations for systems that already
incorporated memcached for in-memory caching.

To apply changes to data cached in memory, such as when a page
is brought into the buffer
pool, and any applicable changes recorded in the
change buffer are incorporated
into the page in the buffer pool. The updated data is eventually
written to the tablespace by
the flush mechanism.

Enhancements to online
operations, particularly in MySQL 5.6 and higher, are focused on
reducing the amount of metadata locking. The objective is for
DDL operations that do not change the table structure (such as
CREATE INDEX and
DROP INDEX for
InnoDB tables) to proceed while the table is
being queried, updated, and so on by other transactions.

A feature implemented by the
innodb_metrics table in the
information_schema, in MySQL
5.6 and higher. You can query
counts and totals for low-level
InnoDB operations, and use the results for performance tuning in
combination with data from the
performance_schema.

The technique of initially bringing
pages into the InnoDB
buffer pool not at the "newest"
end of the list, but instead somewhere in the middle. The exact
location of this point can vary, based on the setting of the
innodb_old_blocks_pct option.
The intent is that blocks that are only read once, such as
during a full table scan, can
be aged out of the buffer pool sooner than with a strict
LRU algorithm.

An internal phase of InnoDB processing, when making changes at
the physical level to internal
data structures during DML
operations. A mini-transaction (mtr) has no notion of
rollback; multiple
mini-transactions can occur within a single
transaction. Mini-transactions
write information to the redo
log that is used during crash
recovery. A mini-transaction can also happen outside
the context of a regular transaction, for example during
purge processing by background
threads.

An INSERT statement where
auto-increment values are
specified for some but not all of the new rows. For example, a
multi-value INSERT could specify a value for
the auto-increment column in some cases and
NULL in other cases.
InnoDB generates auto-increment values for
the rows where the column value was specified as
NULL. Another example is an
INSERT ...
ON DUPLICATE KEY UPDATE statement, where
auto-increment values might be generated but not used, for any
duplicate rows that are processed as UPDATE
rather than INSERT statements.

Can cause consistency issues between
master and
slave servers in a
replication configuration. Can
require adjusting the value of the
innodb_autoinc_lock_mode
configuration option.

Informal abbreviation for "mutex variable". (Mutex itself is
short for "mutual exclusion".) The low-level object that InnoDB
uses to represent and enforce exclusive-access
locks to internal in-memory
data structures. Once the lock is acquired, any other process,
thread, and so on is prevented from acquiring the same lock.
Contrast with rw-locks, which
InnoDB uses to represent and enforce shared-access
locks to internal in-memory
data structures. Mutexes and rw-locks are known collectively as
latches.

Acronym for "multiversion concurrency control". This technique
lets InnoDB transactions with
certain isolation levels to
perform consistent read
operations; that is, to query rows that are being updated by
other transactions, and see the values from before those updates
occurred. This is a powerful technique to increase
concurrency, by allowing
queries to proceed without waiting due to
locks held by the other
transactions.

This technique is not universal in the database world. Some
other database products, and some other MySQL storage engines,
do not support it.

A command that performs a logical
backup of some combination of databases, tables, and
table data. The results are SQL statements that reproduce the
original schema objects, data, or both. For substantial amounts
of data, a physical backup
solution such as MySQL Enterprise
Backup is faster, particularly for the
restore operation.

N

natural key

An indexed column, typically a primary
key, where the values have some real-world
significance. Usually advised against because:

If the value should ever change, there is potentially a lot
of index maintenance to re-sort the
clustered index and update
the copies of the primary key value that are repeated in
each secondary index.

Even seemingly stable values can change in unpredictable
ways that are difficult to represent correctly in the
database. For example, one country can change into two or
several, making the original country code obsolete. Or,
rules about unique values might have exceptions. For
example, even if taxpayer IDs are intended to be unique to a
single person, a database might have to handle records that
violate that rule, such as in cases of identity theft.
Taxpayer IDs and other sensitive ID numbers also make poor
primary keys, because they may need to be secured,
encrypted, and otherwise treated differently than other
columns.

Thus, it is typically better to use arbitrary numeric values to
form a synthetic key, for
example using an auto-increment
column.

Any page in the same
extent as a particular page.
When a page is selected to be
flushed, any neighbor pages
that are dirty are typically
flushed as well, as an I/O optimization for traditional hard
disks. In MySQL 5.6 and up, this behavior can be controlled by
the configuration variable
innodb_flush_neighbors; you
might turn that setting off for SSD drives, which do not have
the same overhead for writing smaller batches of data at random
locations.

A query that does not use the
SELECT ... FOR UPDATE or SELECT ...
LOCK IN SHARE MODE clauses. The only kind of query
allowed for global tables in a read-only
transaction. The opposite of a
locking read.

The situation when a query retrieves data, and a later query
within the same transaction
retrieves what should be the same data, but the queries return
different results (changed by another transaction committing in
the meantime).

This kind of operation goes against the
ACID principle of database
design. Within a transaction, data should be consistent, with
predictable and stable relationships.

Among different isolation
levels, non-repeatable reads are prevented by the
serializable read and
repeatable read levels, and
allowed by the consistent read,
and read uncommitted levels.

A database design strategy where data is split into multiple
tables, and duplicate values condensed into single rows
represented by an ID, to avoid storing, querying, and updating
redundant or lengthy values. It is typically used in
OLTP applications.

For example, an address might be given a unique ID, so that a
census database could represent the relationship
lives at this address by
associating that ID with each member of a family, rather than
storing multiple copies of a complex value such as
123 Main Street, Anytown, USA.

For another example, although a simple address book application
might store each phone number in the same table as a person's
name and address, a phone company database might give each phone
number a special ID, and store the numbers and IDs in a separate
table. This normalized representation could simplify large-scale
updates when area codes split apart.

Normalization is not always recommended. Data that is primarily
queried, and only updated by deleting entirely and reloading, is
often kept in fewer, larger tables with redundant copies of
duplicate values. This data representation is referred to as
denormalized, and is frequently
found in data warehousing applications.

A broad term for a set of data access technologies that do not
use the SQL language as their
primary mechanism for reading and writing data. Some NoSQL
technologies act as key-value stores, only accepting
single-value reads and writes; some relax the restrictions of
the ACID methodology; still
others do not require a pre-planned
schema. MySQL users can combine
NoSQL-style processing for speed and simplicity with SQL
operations for flexibility and convenience, by using the
memcached API to directly
access some kinds of MySQL tables. The
memcached interface to InnoDB tables is
available in MySQL 5.6 and higher; see
InnoDB Integration with memcached for details. The
memcached interface to MySQL Cluster tables
is available in MySQL Cluster 7.2; see
http://dev.mysql.com/doc/ndbapi/en/ndbmemcache.html
for details.

A type of constraint that
specifies that a column cannot
contain any NULL values. It
helps to preserve referential
integrity, as the database server can identify data
with erroneous missing values. It also helps in the arithmetic
involved in query optimization, allowing the optimizer to
predict the number of entries in an index on that column.

A special value in SQL,
indicating the absence of data. Any arithmetic operation or
equality test involving a NULL value, in turn
produces a NULL result. (Thus it is similar
to the IEEE floating-point concept of NaN, "not a number".) Any
aggregate calculation such as AVG() ignores
rows with NULL values, when determining how
many rows to divide by. The only test that works with
NULL values uses the SQL idioms IS
NULL or IS NOT NULL.

NULL values play a part in index operations,
because for performance a database must minimize the overhead of
keeping track of missing data values. Typically,
NULL values are not stored in an index,
because a query that tests an indexed column using a standard
comparison operator could never match a row with a
NULL value for that column. For the same
reason, unique indexes do not prevent NULL
values; those values simply are not represented in the index.
Declaring a NOT NULL constraint on a column
provides reassurance that there are no rows left out of the
index, allowing for better query optimization (accurate counting
of rows and estimation of whether to use the index).

Because the primary key must be
able to uniquely identify every row in the table, a
single-column primary key cannot contain any
NULL values, and a multi-column primary key
cannot contain any rows with NULL values in
all columns.

Although the Oracle database allows a NULL
value to be concatenated with a string, InnoDB treats the result
of such an operation as NULL.

A column containing variable-length data (such as
BLOB and VARCHAR) that is
too long to fit on a B-tree
page. The data is stored in overflow
pages. The DYNAMIC row format in
the InnoDB Barracuda file
format is more efficient for such storage than the older
COMPACT row format.

Acronym for "Online Transaction Processing". A database system,
or a database application, that runs a workload with many
transactions, with frequent
writes as well as reads, typically affecting small amounts of
data at a time. For example, an airline reservation system or an
application that processes bank deposits. The data might be
organized in normalized form
for a balance between DML
(insert/update/delete) efficiency and
query efficiency. Contrast with
data warehouse.

With its row-level locking and
transactional capability,
InnoDB is the ideal storage
engine for MySQL tables used in OLTP applications.

A type of operation that involves no downtime, blocking, or
restricted operation for the database. Typically applied to
DDL. Operations that shorten
the periods of restricted operation, such as
fast index creation, have
evolved into a wider set of online
DDL operations in MySQL 5.6.

In the context of backups, a hot
backup is an online operation and a
warm backup is partially an
online operation.

A feature that improves the performance, concurrency, and
availability of InnoDB tables during
DDL (primarily
ALTER TABLE) operations. See
InnoDB and Online DDL for details.

The details vary according to the type of operation. In some
cases, the table can be modified concurrently while the
ALTER TABLE is in progress. The operation
might be able to be performed without doing a table copy, or
using a specially optimized type of table copy. Space usage is
controlled by the
innodb_online_alter_log_max_size
configuration option.

This feature is an enhancement of the Fast
Index Creation feature in MySQL 5.5 and the InnoDB
Plugin for MySQL 5.1.

A methodology that guides low-level implementation decisions for
a relational database system. The requirements of performance
and concurrency in a relational
database mean that operations must be started or dispatched
quickly. The requirements of consistency and
referential integrity mean that
any operation could fail: a transaction might be rolled back, a
DML operation could violate a
constraint, a request for a lock could cause a deadlock, a
network error could cause a timeout. An optimistic strategy is
one that assumes most requests or attempts will succeed, so that
relatively little work is done to prepare for the failure case.
When this assumption is true, the database does little
unnecessary work; when requests do fail, extra work must be done
to clean up and undo changes.

InnoDB uses optimistic strategies for operations such as
locking and
commits. For example, data
changed by a transaction can be written to the data files before
the commit occurs, making the commit itself very fast, but
requiring more work to undo the changes if the transaction is
rolled back.

The opposite of an optimistic strategy is a
pessimistic one, where a system
is optimized to deal with operations that are unreliable and
frequently unsuccessful. This methodology is rare in a database
system, because so much care goes into choosing reliable
hardware, networks, and algorithms.

A unit representing how much data InnoDB transfers at any one
time between disk (the data
files) and memory (the buffer
pool). A page can contain one or more
rows, depending on how much
data is in each row. If a row does not fit entirely into a
single page, InnoDB sets up additional pointer-style data
structures so that the information about the row can be stored
in one page.

One way to fit more data in each page is to use
compressed row format. For
tables that use BLOBs or large text fields,
compact row format allows those
large columns to be stored separately from the rest of the row,
reducing I/O overhead and memory usage for queries that do not
reference those columns.

When InnoDB reads or writes sets of pages as a batch to increase
I/O throughput, it reads or writes an
extent at a time.

All the InnoDB disk data structures within a MySQL instance
share the same page size.

For releases up to and including MySQL 5.5, the size of each
InnoDB page is fixed at 16
kilobytes. This value represents a balance: large enough to hold
the data for most rows, yet small enough to minimize the
performance overhead of transferring unneeded data to memory.
Other values are not tested or supported.

Starting in MySQL 5.6, the page size for an InnoDB
instance can be either 4KB,
8KB, or 16KB, controlled by the
innodb_page_size configuration
option. As of MySQL 5.7.6, InnoDB also provides support for 32KB
and 64KB page sizes. For both page sizes,
ROW_FORMAT=COMPRESSED is not supported and
the maximum record size is 16KB.

You set the size when creating the MySQL instance, and it
remains constant afterwards. The same page size applies to all
InnoDB tablespaces, both the
system tablespace and any
separate tablespaces created in
file-per-table mode.

Smaller page sizes can help performance with storage devices
that use small block sizes, particularly for
SSD devices in
disk-bound workloads, such as
for OLTP applications. As
individual rows are updated, less data is copied into memory,
written to disk, reorganized, locked, and so on.

The table in a foreign key
relationship that holds the initial column values pointed to
from the child table. The
consequences of deleting, or updating rows in the parent table
depend on the ON UPDATE and ON
DELETE clauses in the foreign key definition. Rows
with corresponding values in the child table could be
automatically deleted or updated in turn, or those columns could
be set to NULL, or the operation could be
prevented.

The performance_schema schema, in MySQL 5.5
and up, presents a set of tables that you can query to get
detailed information about the performance characteristics of
many internal parts of the MySQL server.

A methodology that sacrifices performance or concurrency in
favor of safety. It is appropriate if a high proportion of
requests or attempts might fail, or if the consequences of a
failed request are severe. InnoDB uses what is known as a
pessimistic locking strategy,
to minimize the chance of
deadlocks. At the application
level, you might avoid deadlocks by using a pessimistic strategy
of acquiring all locks needed by a transaction at the very
beginning.

Many built-in database mechanisms use the opposite
optimistic methodology.

A row that appears in the result set of a query, but not in the
result set of an earlier query. For example, if a query is run
twice within a transaction, and
in the meantime, another transaction commits after inserting a
new row or updating a row so that it matches the
WHERE clause of the query.

This occurrence is known as a phantom read. It is harder to
guard against than a non-repeatable
read, because locking all the rows from the first
query result set does not prevent the changes that cause the
phantom to appear.

Among different isolation
levels, phantom reads are prevented by the
serializable read level, and
allowed by the repeatable read,
consistent read, and
read uncommitted levels.

A type of operation that involves hardware-related aspects such
as disk blocks, memory pages, files, bits, disk reads, and so
on. Typically, physical aspects are important during
expert-level performance tuning and problem diagnosis. Contrast
with logical.

A backup that copies the actual
data files. For example, the
mysqlbackup
command of the MySQL Enterprise
Backup product produces a physical backup, because
its output contains data files that can be used directly by the
mysqld server, resulting in a faster
restore operation. Contrast
with logical backup.

The process of restoring a
backup to recreate the state of
the database at a specific date and time. Commonly abbreviated
PITR. Because it is unlikely
that the specified time corresponds exactly to the time of a
backup, this technique usually requires a combination of a
physical backup and a
logical backup. For example,
with the MySQL Enterprise
Backup product, you restore the last backup that you
took before the specified point in time, then replay changes
from the binary log between the
time of the backup and the PITR time.

A set of backup files, produced by the
MySQL Enterprise Backup
product, after all the stages of applying
binary logs and
incremental backups are
finished. The resulting files are ready to be
restored. Prior to the apply
steps, the files are known as a raw
backup.

A set of columns -- and by implication, the index based on this
set of columns -- that can uniquely identify every row in a
table. As such, it must be a unique index that does not contain
any NULL values.

InnoDB requires that every table has such an index (also called
the clustered index or
cluster index), and organizes
the table storage based on the column values of the primary key.

When choosing primary key values, consider using arbitrary
values (a synthetic key) rather
than relying on values derived from some other source (a
natural key).

An instance of an executing program. The operating system
switches between multiple running processes, allowing for a
certain degree of concurrency.
On most operating systems, processes can contain multiple
threads of execution that share
resources. Context-switching between threads is faster than the
equivalent switching between processes.

A type of garbage collection performed by a separate thread,
running on a periodic schedule. The purge includes these
actions: removing obsolete values from indexes; physically
removing rows that were marked for deletion by previous
DELETE statements.

The technique of storing changes to secondary index pages,
resulting from DELETE operations, in the
change buffer rather than
writing the changes immediately, so that the physical writes can
be performed to minimize random I/O. (Because delete operations
are a two-step process, this operation buffers the write that
normally purges an index record that was previously marked for
deletion.) It is one of the types of
change buffering; the others
are insert buffering and
delete buffering.

A thread within the InnoDB
process that is dedicated to performing the periodic
purge operation. In MySQL 5.6
and higher, multiple purge threads are enabled by the
innodb_purge_threads
configuration option.

Q

query

In SQL, an operation that reads
information from one or more
tables. Depending on the
organization of data and the parameters of the query, the lookup
might be optimized by consulting an
index. If multiple tables are
involved, the query is known as a
join.

For historical reasons, sometimes discussions of internal
processing for statements use "query" in a broader sense,
including other types of MySQL statements such as
DDL and
DML statements.

The set of decisions made by the optimizer about how to perform
a query most efficiently,
including which index or
indexes to use, and the order in which to
join tables.
Plan stability involves the
same choices being made consistently for a given query.

To reduce the amount of database activity, often in preparation
for an operation such as an ALTER
TABLE, a backup, or a
shutdown. Might or might not
involve doing as much flushing
as possible, so that InnoDB
does not continue doing background I/O.

In MySQL 5.6 and higher, the syntax FLUSH TABLES ...
FOR EXPORT writes some data to disk for
InnoDB tables that make it simpler to back up
those tables by copying the data files.

R

Acronym for "Redundant Array of Inexpensive Drives". Spreading
I/O operations across multiple drives enables greater
concurrency at the hardware
level, and improves the efficiency of low-level write operations
that otherwise would be performed in sequence.

A technique for quickly estimating the number of different
values in a column (the column's cardinality). InnoDB samples
pages at random from the index and uses that data to estimate
the number of different values. This operation occurs when each
table is first opened.

Originally, the number of sampled pages was fixed at 8; now, it
is determined by the setting of the
innodb_stats_sample_pages
parameter.

The way the random pages are picked depends on the setting of
the innodb_use_legacy_cardinality_algorithm parameter. The
default setting (OFF) has better randomness than in older
releases.

The initial set of backup files produced by the
MySQL Enterprise Backup
product, before the changes reflected in the
binary log and any
incremental backups are
applied. At this stage, the files are not ready to
restore. After these changes
are applied, the files are known as a
prepared backup.

An isolation level that uses a
locking strategy that relaxes
some of the protection between
transactions, in the interest
of performance. Transactions cannot see uncommitted data from
other transactions, but they can see data that is committed by
another transaction after the current transaction started. Thus,
a transaction never sees any bad data, but the data that it does
see may depend to some extent on the timing of other
transactions.

When a transaction with this isolation level performs
UPDATE ... WHERE or DELETE ...
WHERE operations, other transactions might have to
wait. The transaction can perform SELECT ... FOR
UPDATE, and LOCK IN SHARE MODE
operations without making other transactions wait.

The isolation level that
provides the least amount of protection between transactions.
Queries employ a locking
strategy that allows them to proceed in situations where they
would normally wait for another transaction. However, this extra
performance comes at the cost of less reliable results,
including data that has been changed by other transactions and
not committed yet (known as dirty
read). Use this isolation level only with great
caution, and be aware that the results might not be consistent
or reproducible, depending on what other transactions are doing
at the same time. Typically, transactions with this isolation
level do only queries, not insert, update, or delete operations.

An internal snapshot used by the
MVCC mechanism of InnoDB.
Certain transactions, depending
on their isolation level, see
the data values as they were at the time the transaction (or in
some cases, the statement) started. Isolation levels that use a
read view are REPEATABLE READ,
READ COMMITTED, and
READ UNCOMMITTED.

A type of I/O request that prefetches a group of
pages (an entire
extent) into the
buffer pool asynchronously, in
anticipation that these pages will be needed soon. The linear
read-ahead technique prefetches all the pages of one extent
based on access patterns for pages in the preceding extent, and
is part of all MySQL versions starting with the InnoDB Plugin
for MySQL 5.1. The random read-ahead technique prefetches all
the pages for an extent once a certain number of pages from the
same extent are in the buffer pool. Random read-ahead is not
part of MySQL 5.5, but is re-introduced in MySQL 5.6 under the
control of the innodb_random_read_ahead
configuration option.

A type of transaction that can be optimized for
InnoDB tables by eliminating some of the
bookkeeping involved with creating a read
view for each transaction. Can only perform
non-locking read queries. It
can be started explicitly with the syntax
START TRANSACTION READ
ONLY, or automatically under certain conditions. See
Optimizing InnoDB Read-Only Transactions for details.

A lock on an index record. For
example, SELECT c1 FOR UPDATE FROM t WHERE c1 =
10; prevents any other transaction from inserting,
updating, or deleting rows where the value of
t.c1 is 10. Contrast with
gap lock and
next-key lock.

The data, in units of records, recorded in the
redo log when
DML statements make changes to
InnoDB tables. It is used during crash
recovery to correct data written by incomplete
transactions. The
ever-increasing LSN value
represents the cumulative amount of redo data that has passed
through the redo log.

A disk-based data structure used during
crash recovery, to correct data
written by incomplete
transactions. During normal
operation, it encodes requests to change InnoDB table data,
which result from SQL statements or low-level API calls through
NoSQL interfaces. Modifications that did not finish updating the
data files before an unexpected
shutdown are replayed
automatically.

The redo log is physically represented as a set of files,
typically named ib_logfile0 and
ib_logfile1. The data in the redo log is
encoded in terms of records affected; this data is collectively
referred to as redo. The
passage of data through the redo logs is represented by the
ever-increasing LSN value. The
original 4GB limit on maximum size for the redo log is raised to
512GB in MySQL 5.6.3.

The oldest InnoDB row format, available for
tables using the Antelopefile format. Prior to MySQL
5.0.3, it was the only row format available in
InnoDB. In My SQL 5.0.3 and later, the
default is compact row format.
You can still specify redundant row format for compatibility
with older InnoDB tables.

The technique of maintaining data always in a consistent format,
part of the ACID philosophy. In
particular, data in different tables is kept consistent through
the use of foreign key
constraints, which can prevent changes from happening
or automatically propagate those changes to all related tables.
Related mechanisms include the unique
constraint, which prevents duplicate values from
being inserted by mistake, and the NOT
NULL constraint, which prevents blank values from
being inserted by mistake.

An important aspect of modern database systems. The database
server encodes and enforces relationships such as one-to-one,
one-to-many, many-to-one, and uniqueness. For example, a person
might have zero, one, or many phone numbers in an address
database; a single phone number might be associated with several
family members. In a financial database, a person might be
required to have exactly one taxpayer ID, and any taxpayer ID
could only be associated with one person.

The database server can use these relationships to prevent bad
data from being inserted, and to find efficient ways to look up
information. For example, if a value is declared to be unique,
the server can stop searching as soon as the first match is
found, and it can reject attempts to insert a second copy of the
same value.

At the database level, these relationships are expressed through
SQL features such as columns
within a table, unique and NOT NULLconstraints,
foreign keys, and different
kinds of join operations. Complex relationships typically
involve data split between more than one table. Often, the data
is normalized, so that
duplicate values in one-to-many relationships are stored only
once.

In a mathematical context, the relations within a database are
derived from set theory. For example, the OR
and AND operators of a
WHERE clause represent the notions of union
and intersection.

In the full-text search
feature, a number signifying the similarity between the search
string and the data in the FULLTEXT
index. For example, when you search for a single
word, that word is typically more relevant for a row where if it
occurs several times in the text than a row where it appears
only once.

The default isolation level for
InnoDB. It prevents any rows that are queried from being changed
by other transactions, thus blocking
non-repeatable reads but not
phantom reads. It uses a
moderately strict locking
strategy so that all queries within a transaction see data from
the same snapshot, that is, the data as it was at the time the
transaction started.

When a transaction with this isolation level performs
UPDATE ... WHERE, DELETE ...
WHERE, SELECT ... FOR UPDATE, and
LOCK IN SHARE MODE operations, other
transactions might have to wait.

The practice of sending changes from a
master database, to one or more
slave databases, so that all
databases have the same data. This technique has a wide range of
uses, such as load-balancing for better scalability, disaster
recovery, and testing software upgrades and configuration
changes. The changes can be sent between the database by methods
called row-based replication
and statement-based
replication.

The process of putting a set of backup files from the
MySQL Enterprise Backup product
in place for use by MySQL. This operation can be performed to
fix a corrupted database, to return to some earlier point in
time, or (in a replication
context) to set up a new slave
database. In the MySQL
Enterprise Backup product, this operation is
performed by the copy-back option of the
mysqlbackup command.

The disk storage format for
rows of an InnoDB
table. As InnoDB gains new
capabilities such as compression, new row formats are introduced
to support the resulting improvements in storage efficiency and
performance.

The row format of an InnoDB table is specified by the
ROW_FORMAT option. Row formats include
REDUNDANT, COMPACT,
COMPRESSED, and DYNAMIC.
To view the row format of an InnoDB table, you can issue the
SHOW TABLE STATUS statement, or query
INFORMATION_SCHEMA.INNODB_SYS_TABLES
(available in MySQL 5.6 or higher). Tables with a
DYNAMIC or COMPRESSED row
format are only supported with file-per-table tablespaces and
general tablespaces. General tablespaces were introduced in
MySQL 5.7.6.

A lock that prevents a row from
being accessed in an incompatible way by another
transaction. Other rows in the
same table can be freely written to by other transactions. This
is the type of locking done by
DML operations on
InnoDB tables.

Contrast with table locks used
by MyISAM, or during DDL
operations on InnoDB tables that cannot be done with
online DDL; those locks block
concurrent access to the table.

A form of replication where
events are propagated from the
master server specifying how to
change individual rows on the
slave server. It is safe to use
for all settings of the
innodb_autoinc_lock_mode
option.

The locking mechanism used for
InnoDB tables, relying on
row locks rather than
table locks. Multiple
transactions can modify the
same table concurrently. Only if two transactions try to modify
the same row does one of the transactions wait for the other to
complete (and release its row locks).

The low-level object that InnoDB uses to represent and enforce
shared-access locks to internal
in-memory data structures following certain rules. Contrast with
mutexes, which InnoDB uses to
represent and enforce exclusive access to internal in-memory
data structures. Mutexes and rw-locks are known collectively as
latches.

An x-lock provides write access to a
common resource while not permitting inconsistent reads by
other threads.

An sx-lock provides write access to a
common resource while permitting inconsistent reads by other
threads. sx-locks were introduced in
MySQL 5.7 to optimize concurrency and improve scalability
for read-write workloads.

S

savepoint

Savepoints help to implement nested
transactions. They can be used
to provide scope to operations on tables that are part of a
larger transaction. For example, scheduling a trip in a
reservation system might involve booking several different
flights; if a desired flight is unavailable, you might
roll back the changes involved
in booking that one leg, without rolling back the earlier
flights that were successfully booked.

The ability to add more work and issue more simultaneous
requests to a system, without a sudden drop in performance due
to exceeding the limits of system capacity. Software
architecture, hardware configuration, application coding, and
type of workload all play a part in scalability. When the system
reaches its maximum capacity, popular techniques for increasing
scalability are scale up
(increasing the capacity of existing hardware or software) and
scale out (adding new servers
and more instances of MySQL). Often paired with
availability as critical
aspects of a large-scale deployment.

A technique for increasing
scalability by adding new
servers and more instances of MySQL. For example, setting up
replication, MySQL Cluster, connection pooling, or other
features that spread work across a group of servers. Contrast
with scale up.

A technique for increasing
scalability by increasing the
capacity of existing hardware or software. For example,
increasing the memory on a server and adjusting memory-related
parameters such as
innodb_buffer_pool_size and
innodb_buffer_pool_instances.
Contrast with scale out.

Conceptually, a schema is a set of interrelated database
objects, such as tables, table columns, data types of the
columns, indexes, foreign keys, and so on. These objects are
connected through SQL syntax, because the columns make up the
tables, the foreign keys refer to tables and columns, and so on.
Ideally, they are also connected logically, working together as
part of a unified application or flexible framework. For
example, the information_schema
and performance_schema
databases use "schema" in their names to emphasize the close
relationships between the tables and columns they contain.

In MySQL, physically, a schema
is synonymous with a database.
You can substitute the keyword SCHEMA instead
of DATABASE in MySQL SQL syntax, for example
using CREATE SCHEMA instead of
CREATE DATABASE.

Some other database products draw a distinction. For example, in
the Oracle Database product, a
schema represents only a part
of a database: the tables and other objects owned by a single
user.

In MySQL, full-text search
queries use a special kind of index, the
FULLTEXT index. In MySQL 5.6.4
and up, InnoDB and MyISAM
tables both support FULLTEXT indexes;
formerly, these indexes were only available for
MyISAM tables.

A type of InnoDB index that
represents a subset of table columns. An InnoDB table can have
zero, one, or many secondary indexes. (Contrast with the
clustered index, which is
required for each InnoDB table, and stores the data for all the
table columns.)

A secondary index can be used to satisfy queries that only
require values from the indexed columns. For more complex
queries, it can be used to identify the relevant rows in the
table, which are then retrieved through lookups using the
clustered index.

Creating and dropping secondary indexes has traditionally
involved significant overhead from copying all the data in the
InnoDB table. The fast index
creation feature of the InnoDB Plugin makes both
CREATE INDEX and DROP
INDEX statements much faster for InnoDB secondary
indexes.

A division within an InnoDB
tablespace. If a tablespace is
analogous to a directory, the segments are analogous to files
within that directory. A segment can grow. New segments can be
created.

For example, within a
file-per-table tablespace, the
table data is in one segment and each associated index is in its
own segment. The system
tablespace contains many different segments, because
it can hold many tables and their associated indexes. The system
tablespace also includes one or more
rollback segments used for
undo logs.

Segments grow and shrink as data is inserted and deleted. When a
segment needs more room, it is extended by one
extent (1 megabyte) at a time.
Similarly, a segment releases one extent's worth of space when
all the data in that extent is no longer needed.

A property of data distribution, the number of distinct values
in a column (its cardinality)
divided by the number of records in the table. High selectivity
means that the column values are relatively unique, and can
retrieved efficiently through an index. If you (or the query
optimizer) can predict that a test in a WHERE
clause only matches a small number (or proportion) of rows in a
table, the overall query tends
to be efficient if it evaluates that test first, using an index.

A type of read operation used for UPDATE
statements, that is a combination of read
committed and consistent
read. When an UPDATE statement
examines a row that is already locked, InnoDB returns the latest
committed version to MySQL so that MySQL can determine whether
the row matches the WHERE condition of the
UPDATE. If the row matches (must be updated),
MySQL reads the row again, and this time InnoDB either locks it
or waits for a lock on it. This type of read operation can only
happen when the transaction has the read committed
isolation level, or when the
innodb_locks_unsafe_for_binlog
option is enabled.

The isolation level that uses
the most conservative locking strategy, to prevent any other
transactions from inserting or changing data that was read by
this transaction, until it is finished. This way, the same query
can be run over and over within a transaction, and be certain to
retrieve the same set of results each time. Any attempt to
change data that was committed by another transaction since the
start of the current transaction, cause the current transaction
to wait.

This is the default isolation level specified by the SQL
standard. In practice, this degree of strictness is rarely
needed, so the default isolation level for InnoDB is the next
most strict, repeatable read.

A type of program that runs continuously, waiting to receive and
act upon requests from another program (the client). Because
often an entire computer is dedicated to running one or more
server programs (such as a database server, a web server, an
application server, or some combination of these), the term
server can also refer to the
computer that runs the server software.

The process of flushing to disk
all dirty buffer pool pages
whose redo entries are contained in certain portion of the
redo log. Occurs before InnoDB
reuses a portion of a log file; the log files are used in a
circular fashion. Typically occurs with write-intensive
workloads.

The process of stopping the MySQL server. By default, this
process does cleanup operations for
InnoDB tables, so it can
slow to shut down, but fast to
start up later. If you skip the cleanup operations, it is
fast to shut down but must do
the cleanup during the next restart.

Frequently shortened to "slave". A database
server machine in a
replication scenario that
receives changes from another server (the
master) and applies those same
changes. Thus it maintains the same contents as the master,
although it might lag somewhat behind.

In MySQL, slave servers are commonly used in disaster recovery,
to take the place of a master servers that fails. They are also
commonly used for testing software upgrades and new settings, to
ensure that database configuration changes do not cause problems
with performance or reliability.

Slave servers typically have high workloads, because they
process all the DML (write)
operations relayed from the master, as well as user queries. To
ensure that slave servers can apply changes from the master fast
enough, they frequently have fast I/O devices and sufficient CPU
and memory to run multiple database instances on the same slave
server. For example, the master server might use hard drive
storage while the slave servers use
SSDs.

A type of log used for
performance tuning of SQL statements processed by the MySQL
server. The log information is stored in a file. You must enable
this feature to use it. You control which categories of "slow"
SQL statements are logged. For more information, see
Section 5.2.5, “The Slow Query Log”.

A type of shutdown that does additional
InnoDB flushing operations before completing.
Also known as a clean shutdown.
Specified by the configuration parameter
innodb_fast_shutdown=0 or the
command SET GLOBAL innodb_fast_shutdown=0;.
Although the shutdown itself can take longer, that time will be
saved on the subsequent startup.

The buffer used for sorting data during creation of an
InnoDB index. Sort buffer size is configured
using the
innodb_sort_buffer_size
configuration option.

space ID

An identifier used to uniquely identify an
InnoDBtablespace within a MySQL
instance. The space ID for the system
tablespace is always zero; this same ID applies to
all tables within the system tablespace or within a general
tablespace. Each file-per-table
tablespace and general tablespace has its own space ID.

Prior to MySQL 5.6, this hardcoded value presented difficulties
in moving InnoDB tablespace files between
MySQL instances. Starting in MySQL 5.6, you can copy tablespace
files between instances by using the
transportable tablespace
feature involving the statements FLUSH TABLES ... FOR
EXPORT, ALTER TABLE ... DISCARD
TABLESPACE, and ALTER TABLE ... IMPORT
TABLESPACE. The information needed to adjust the space
ID is conveyed in the .cfg file
which you copy along with the tablespace. See
Copying File-Per-Table Tablespaces to Another Server for details.

A type of wait operation that
continuously tests whether a resource becomes available. This
technique is used for resources that are typically held only for
brief periods, where it is more efficient to wait in a "busy
loop" than to put the thread to sleep and perform a context
switch. If the resource does not become available within a short
time, the spin loop ceases and another wait technique is used.

Acronym for "solid-state drive". A type of storage device with
different performance characteristics than a traditional hard
disk drive (HDD): smaller
storage capacity, faster for random reads, no moving parts, and
with a number of considerations affecting write performance. Its
performance characteristics can influence the throughput of a
disk-bound workload.

A form of replication where SQL
statements are sent from the
master server and replayed on
the slave server. It requires
some care with the setting for the
innodb_autoinc_lock_mode
option, to avoid potential timing problems with
auto-increment locking.

Estimated values relating to each InnoDBtable and
index, used to construct an
efficient query execution plan.
The main values are the
cardinality (number of distinct
values) and the total number of table rows or index entries. The
statistics for the table represent the data in its
primary key index. The
statistics for a secondary
index represent the rows covered by that index.

The values are estimated rather than counted precisely because
at any moment, different
transactions can be inserting
and deleting rows from the same table. To keep the values from
being recalculated frequently, you can enable
persistent statistics, where
the values are stored in InnoDB system
tables, and refreshed only when you issue an
ANALYZE TABLE statement.

You can control how NULL values
are treated when calculating statistics through the
innodb_stats_method
configuration option.

Other types of statistics are available for database objects and
database activity through the
INFORMATION_SCHEMA and
PERFORMANCE_SCHEMA tables.

The ability to search for different variations of a word based
on a common root word, such as singular and plural, or past,
present, and future verb tense. This feature is currently
supported in MyISAM full-text
search feature but not in
FULLTEXT indexes for InnoDB
tables.

In a FULLTEXT index, a word
that is considered common or trivial enough that it is omitted
from the search index and
ignored in search queries. Different configuration settings
control stopword processing for InnoDB and
MyISAM tables. See
Section 12.9.4, “Full-Text Stopwords” for details.

A component of the MySQL database that performs the low-level
work of storing, updating, and querying data. In MySQL 5.5 and
higher, InnoDB is the default
storage engine for new tables, superceding MyISAM. Different
storage engines are designed with different tradeoffs between
factors such as memory usage versus disk usage, read speed
versus write speed, and speed versus robustness. Each storage
engine manages specific tables, so we refer to
InnoDB tables,
MyISAM tables, and so on.

The MySQL Enterprise Backup
product is optimized for backing up InnoDB tables. It can also
back up tables handled by MyISAM and other storage engines.

The general name for the setting controlled by the
innodb_strict_mode option.
Turning on this setting causes certain conditions that are
normally treated as warnings, to be considered errors. For
example, certain invalid combinations of options related to
file format and
row format, that normally
produce a warning and continue with default values, now cause
the CREATE TABLE operation to fail.

Within the list structure that represents the buffer pool, pages
that are relatively old and relatively new are represented by
different portions of the list. A set of parameters control the
size of these portions and the dividing point between the new
and old pages.

A pseudo-record in an index,
representing the gap above the
largest value in that index. If a transaction has a statement
such as SELECT ... FOR UPDATE ... WHERE col >
10;, and the largest value in the column is 20, it is
a lock on the supremum record that prevents other transactions
from inserting even larger values such as 50, 100, and so on.

An indexed column, typically a primary
key, where the values are assigned arbitrarily. Often
done using an auto-increment
column. By treating the value as completely arbitrary, you can
avoid overly restrictive rules and faulty application
assumptions. For example, a numeric sequence representing
employee numbers might have a gap if an employee was approved
for hiring but never actually joined. Or employee number 100
might have a later hiring date than employee number 500, if they
left the company and later rejoined. Numeric values also produce
shorter values of predictable length. For example, storing
numeric codes meaning "Road", "Boulevard", "Expressway", and so
on is more space-efficient than repeating those strings over and
over.

One or more data files (ibdata
files) containing the metadata for InnoDB-related objects (the
data dictionary), and the
storage areas for one or more undo
logs, the change
buffer, and the doublewrite
buffer. Depending on the setting of the
innodb_file_per_table, when
tables are created, it might also contain table and index data
for some or all InnoDB tables. The data and metadata in the
system tablespace apply to all the
databases in a MySQL
instance.

Prior to MySQL 5.6.7, the default was to keep all InnoDB tables
and indexes inside the system tablespace, often causing this
file to become very large. Because the system tablespace never
shrinks, storage problems could arise if large amounts of
temporary data were loaded and then deleted. In MySQL 5.6.7 and
higher, the default is
file-per-table mode, where each
table and its associated indexes are stored in a separate
.ibd file. This new default
makes it easier to use InnoDB features that rely on the
Barracuda file format, such as
table compression and the
DYNAMIC row format.

In MySQL 5.6 and higher, the
innodb_undo_tablespaces option
allows you to configure separate tablespace files for undo logs.
These files are still considered part of the system tablespace.

Keeping all table data in the system tablespace or in separate
.ibd files has implications for storage
management in general. The MySQL
Enterprise Backup product might back up a small set
of large files, or many smaller files. On systems with thousands
of tables, the file system operations to process thousands of
.ibd files can cause bottlenecks.

InnoDB introduced general tablespaces in MySQL 5.7.6. General
tablespaces are shared tablespaces created using
CREATE TABLESPACE syntax. They
can be created outside of the MySQL data directory, are capable
of holding multiple tables, and support tables of all row
formats.

In terms of file storage, an InnoDB table belongs to one of the
following tablespace types:

The shared InnoDB system
tablespace, which is comprised of one or more
.ibdata files.

A file-per-table tablespace, comprised of an individual
.ibd file.

A shared general tablespace, comprised of an individual
.ibd file. General tablespaces were
introduced in MySQL 5.7.6.

.ibd data
files contain both table and
index data.

InnoDB tables created in file-per-table tablespaces can use the
Barracuda file format.
Barracuda tables can use the DYNAMIC row
format or the COMPRESSED row
format. These row formats enable InnoDB features such
as compression,
off-page columns, and large
index key prefixes (see
innodb_large_prefix). General
tablespaces support tables of all row formats regardless of the
innodb_file_format setting.

For backward compatibility with MySQL 5.1 and earlier, InnoDB
tables inside the system tablespace must use the
Antelope file format, which
supports the compact row format
and the redundant row format.

The rows of an InnoDB table are
organized into an index structure known as the
clustered index, with entries
sorted based on the primary key
columns of the table. Data access is optimized for queries that
filter and sort on the primary key columns, and each index
contains a copy of the associated primary key columns for each
entry. Modifying values for any of the primary key columns is an
expensive operation. Thus an important aspect of InnoDB table
design is choosing a primary key with columns that are used in
the most important queries, and keeping the primary key short,
with rarely changing values.

A lock that prevents any other
transaction from accessing a
table. InnoDB makes considerable effort to make such locks
unnecessary, by using techniques such as
online DDL,
row locks and
consistent reads for processing
DML statements and
queries. You can create such a
lock through SQL using the LOCK TABLE
statement; one of the steps in migrating from other database
systems or MySQL storage engines is to remove such statements
wherever practical.

InnoDB introduced general tablespaces in MySQL 5.7.6. General
tablespaces are shared tablespaces created using
CREATE TABLESPACE syntax. They
can be created outside of the MySQL data directory, are capable
of holding multiple tables, and support tables of all row
formats.

A representation of the data
dictionary metadata for a table, within the InnoDB
tablespace. This metadata can
be checked against the .frm
file for consistency when the table is opened, to
diagnose errors resulting from out-of-date
.frm files. This information is present for
InnoDB tables that reside in the system
tablespace, a
file-per-table tablespace, or a
general tablespace.

A table whose data does not need to be truly permanent. For
example, temporary tables might be used as storage areas for
intermediate results in complicated calculations or
transformations; this intermediate data would not need to be
recovered after a crash. Database products can take various
shortcuts to improve the performance of operations on temporary
tables, by being less scrupulous about writing data to disk and
other measures to protect the data across restarts.

Sometimes, the data itself is removed automatically at a set
time, such as when the transaction ends or when the session
ends. With some database products, the table itself is removed
automatically too.

The tablespace for non-compressed InnoDB
temporary tables and related objects, introduced in MySQL 5.7.1.
The configuration file option,
innodb_temp_data_file_path,
allows users to define a relative path for the temporary
tablespace data file. If
innodb_temp_data_file_path is
not specified, the default behavior is to create a single
auto-extending 12MB data file named ibtmp1
in the data directory, alongside ibdata
system tablespace files. The temporary tablespace is recreated
on each server start and receives a dynamically generated space
ID, which helps avoid conflicts with existing space IDs. The
temporary tablespace cannot reside on a raw device. Startup is
refused if the temporary tablespace cannot be created.

The temporary tablespace is removed on normal shutdown or on an
aborted initialization. The temporary tablespace is not removed
when a crash occurs. In this case, the database administrator
may remove the temporary tablespace manually or restart the
server with the same configuration, which removes and recreates
the temporary tablespace.

An error condition that can occur due to a combination of I/O
device configuration and hardware failure. If data is written
out in chunks smaller than the InnoDB page
size (by default, 16KB), a hardware failure while
writing could result in only part of a page being stored to
disk. The InnoDB doublewrite
buffer guards against this possibility.

Acronym for "transactions per
second", a unit of measurement sometimes used in benchmarks. Its
value depends on the workload
represented by a particular benchmark test, combined with
factors that you control such as the hardware capacity and
database configuration.

Transactions are atomic units of work that can be committed or
rolled back. When a transaction makes multiple changes to the
database, either all the changes succeed when the transaction is
committed, or all the changes are undone when the transaction is
rolled back.

Database transactions, as implemented by InnoDB, have properties
that are collectively known by the acronym
ACID, for atomicity,
consistency, isolation, and durability.

A feature added in MySQL 5.7.8 that permits page-level
compression for InnoDB tables that reside in
file-per-table tablespaces. Page compression is enabled by
specifying the COMPRESSION attribute with
CREATE TABLE or
ALTER TABLE. For more
information, see InnoDB Page Compression.

A DDL operation that removes
the entire contents of a table, while leaving the table and
related indexes intact. Contrast with
drop. Although conceptually it
has the same result as a DELETE statement
with no WHERE clause, it operates differently
behind the scenes: InnoDB creates a new empty table, drops the
old table, then renames the new table to take the place of the
old one. Because this is a DDL operation, it cannot be
rolled back.

If the table being truncated contains foreign keys that
reference another table, the truncation operation uses a slower
method of operation, deleting one row at a time so that
corresponding rows in the referenced table can be deleted as
needed by any ON DELETE CASCADE clause.
(MySQL 5.5 and higher do not allow this slower form of truncate,
and return an error instead if foreign keys are involved. In
this case, use a DELETE statement instead.

A technical term designating an ordered set of elements. It is
an abstract notion, used in formal discussions of database
theory. In the database field, tuples are usually represented by
the columns of a table row. They could also be represented by
the result sets of queries, for example, queries that retrieved
only some columns of a table, or columns from joined tables.

An operation that is part of a distributed
transaction, under the
XA specification. (Sometimes
abbreviated as 2PC.) When multiple databases participate in the
transaction, either all databases
commit the changes, or all
databases roll back the
changes.

U

undo

Data that is maintained throughout the life of a
transaction, recording all
changes so that they can be undone in case of a
rollback operation. It is
stored in the undo log either
within the system tablespace or
in separate undo tablespaces.

A storage area that holds copies of data modified by active
transactions. If another
transaction needs to see the original data (as part of a
consistent read operation), the
unmodified data is retrieved from this storage area.

By default, this area is physically part of the
system tablespace. In MySQL 5.6
and higher, you can use the
innodb_undo_tablespaces and
innodb_undo_directory
configuration options to split it into one or more separate
tablespace files, the
undo tablespaces, optionally
stored on another storage device such as an
SSD.

The undo log is split into separate portions, the
insert undo buffer and the
update undo buffer.

One of a set of files containing the undo
log, when the undo log is separated from the
system tablespace using the
innodb_undo_tablespaces and
innodb_undo_directory
configuration options. Only applies to MySQL 5.6 and higher.

A kind of constraint that
asserts that a column cannot contain any duplicate values. In
terms of relational algebra, it
is used to specify 1-to-1 relationships. For efficiency in
checking whether a value can be inserted (that is, the value
does not already exist in the column), a unique constraint is
supported by an underlying unique
index.

An index on a column or set of columns that have a
unique constraint. Because the
index is known not to contain any duplicate values, certain
kinds of lookups and count operations are more efficient than in
the normal kind of index. Most of the lookups against this type
of index are simply to determine if a certain value exists or
not. The number of values in the index is the same as the number
of rows in the table, or at least the number of rows with
non-null values for the associated columns.

Change buffering optimization
does not apply to unique indexes. As a workaround, you can
temporarily set unique_checks=0 while doing a
bulk data load into an InnoDB table.

The set of columns (one or more) comprising a
unique index. When you can
define a WHERE condition that matches exactly
one row, and the query can use an associated unique index, the
lookup and error handling can be performed very efficiently.

W

wait

When an operation, such as acquiring a
lock,
mutex, or
latch, cannot be completed
immediately, InnoDB pauses and tries again. The mechanism for
pausing is elaborate enough that this operation has its own
name, the wait. Individual
threads are paused using a combination of internal InnoDB
scheduling, operating system wait() calls,
and short-duration spin loops.

On systems with heavy load and many transactions, you might use
the output from the SHOW INNODB STATUS
command to determine whether threads are spending too much time
waiting, and if so, how you can improve
concurrency.

A backup taken while the
database is running, but that restricts some database operations
during the backup process. For example, tables might become
read-only. For busy applications and web sites, you might prefer
a hot backup.

To run a system under a typical
workload for some time after
startup, so that the buffer
pool and other memory regions are filled as they
would be under normal conditions.

This process happens naturally over time when a MySQL server is
restarted or subjected to a new workload. Starting in MySQL 5.6,
you can speed up the warmup process by setting the configuration
variables
innodb_buffer_pool_dump_at_shutdown=ON
and
innodb_buffer_pool_load_at_startup=ON,
to bring the contents of the buffer pool back into memory after
a restart. Typically, you run a workload for some time to warm
up the buffer pool before running performance tests, to ensure
consistent results across multiple runs; otherwise, performance
might be artificially low during the first run.

The built-in InnoDB storage
engine and the InnoDB Plugin
are supported on all the same Microsoft Windows versions as the
MySQL server. The MySQL Enterprise
Backup product has more comprehensive support for
Windows systems than the InnoDB Hot
Backup product that it supersedes.

The combination and volume of
SQL and other database
operations, performed by a database application during typical
or peak usage. You can subject the database to a particular
workload during performance testing to identify
bottlenecks, or during capacity
planning.

An optimization technique that reduces write operations when
dirty pages are
flushed from the InnoDB
buffer pool. If a row in a page
is updated multiple times, or multiple rows on the same page are
updated, all of those changes are stored to the data files in a
single write operation rather than one write for each change.

X

XA

A standard interface for coordinating distributed
transactions, allowing multiple
databases to participate in a transaction while maintaining
ACID compliance. For full
details, see Section 13.3.7, “XA Transactions”.

XA Distributed Transaction support is turned on by default. If
you are not using this feature, you can disable the
innodb_support_xa configuration
option, avoiding the performance overhead of an extra fsync for
each transaction.

Y

young

A characteristic of a page in
the InnoDBbuffer
pool meaning it has been accessed recently, and so is
moved within the buffer pool data structure, so that it will not
be flushed soon by the
LRU algorithm. This term is
used in some information schema
column names of tables related to the buffer pool.