This chapter discusses user-defined
partitioning. In MySQL 8.0, partitioning
support is provided by the InnoDB
storage engine. (The NDB storage engine
used by MySQL Cluster also provides partitioning support, but
NDB is not included in MySQL 8.0.)

MySQL 8.0 does not currently support partitioning of
tables using any storage engine other than
InnoDB, such as
MyISAM. An attempt to create a
partitioned tables using a storage engine that does not supply
native partitioning support fails with
ER_CHECK_NOT_IMPLEMENTED.

MySQL 8.0 Community binaries provided by Oracle include
partitioning support provided by the InnoDB
storage engine. For information about partitioning support offered
in MySQL Enterprise Edition binaries, see Chapter 29, MySQL Enterprise Edition.

If you are compiling MySQL 8.0 from source, configuring
the build with InnoDB support is sufficient to
produce binaries with partition support for
InnoDB tables. For more information, see
Section 2.8, “Installing MySQL from Source”.

Nothing further needs to be done to enable partitioning support by
InnoDB (for example, no special entries are
required in the my.cnf file).

It is not possible to disable partitioning support by the
InnoDB storage engine.

This is the official discussion forum for those interested in or
experimenting with MySQL Partitioning technology. It features
announcements and updates from MySQL developers and others. It
is monitored by members of the Partitioning Development and
Documentation Teams.

A MySQL news site featuring MySQL-related blogs, which should be
of interest to anyone using my MySQL. We encourage you to check
here for links to blogs kept by those working with MySQL
Partitioning, or to have your own blog added to those covered.

22.1 Overview of Partitioning in MySQL

This section provides a conceptual overview of partitioning in
MySQL 8.0.

The SQL standard does not provide much in the way of guidance
regarding the physical aspects of data storage. The SQL language
itself is intended to work independently of any data structures or
media underlying the schemas, tables, rows, or columns with which
it works. Nonetheless, most advanced database management systems
have evolved some means of determining the physical location to be
used for storing specific pieces of data in terms of the file
system, hardware or even both. In MySQL, the
InnoDB storage engine has long supported the
notion of a tablespace (see Section 15.7, “InnoDB Tablespaces”),
and the MySQL Server, even prior to the introduction of
partitioning, could be configured to employ different physical
directories for storing different databases (see
Section 8.12.2, “Using Symbolic Links”, for an explanation of how this
is done).

Partitioning takes this
notion a step further, by enabling you to distribute portions of
individual tables across a file system according to rules which
you can set largely as needed. In effect, different portions of a
table are stored as separate tables in different locations. The
user-selected rule by which the division of data is accomplished
is known as a partitioning
function, which in MySQL can be the modulus, simple
matching against a set of ranges or value lists, an internal
hashing function, or a linear hashing function. The function is
selected according to the partitioning type specified by the user,
and takes as its parameter the value of a user-supplied
expression. This expression can be a column value, a function
acting on one or more column values, or a set of one or more
column values, depending on the type of partitioning that is used.

In the case of RANGE, LIST,
and [LINEAR] HASH
partitioning, the value of the partitioning column is passed to
the partitioning function, which returns an integer value
representing the number of the partition in which that particular
record should be stored. This function must be nonconstant and
nonrandom. It may not contain any queries, but may use an SQL
expression that is valid in MySQL, as long as that expression
returns either NULL or an integer
intval such that

-MAXVALUE <= intval <= MAXVALUE

(MAXVALUE is used to represent the least upper
bound for the type of integer in question.
-MAXVALUE represents the greatest lower bound.)

For [LINEAR] KEY,
RANGE COLUMNS, and LIST
COLUMNS partitioning, the partitioning expression
consists of a list of one or more columns.

For [LINEAR] KEY
partitioning, the partitioning function is supplied by MySQL.

This is known as horizontal
partitioning—that is, different rows of a table
may be assigned to different physical partitions. MySQL
8.0 does not support
vertical partitioning, in
which different columns of a table are assigned to different
physical partitions. There are no plans at this time to introduce
vertical partitioning into MySQL.

For creating partitioned tables, you must use a storage engine
that supports them. In MySQL 8.0, all partitions of
the same partitioned table must use the same storage engine.
However, there is nothing preventing you from using different
storage engines for different partitioned tables on the same MySQL
server or even in the same database.

In MySQL 8.0, the only storage engine that supports
partitioning is InnoDB. Partitioning cannot be
used with storage engines that do not support it; these include
the MyISAM, MERGE,
CSV, and FEDERATED storage
engines.

When creating a partitioned table, the default storage engine is
used just as when creating any other table; to override this
behavior, it is necessary only to use the [STORAGE]
ENGINE option just as you would for a table that is not
partitioned. The target storage engine must provide native
partitioning support, or the statement fails. You should keep in
mind that [STORAGE] ENGINE (and other table
options) need to be listed before any
partitioning options are used in a CREATE
TABLE statement. This example shows how to create a
table that is partitioned by hash into 6 partitions and which uses
the InnoDB storage engine (regardless of the
value of default_storage_engine):

Each PARTITION clause can include a
[STORAGE] ENGINE option, but in MySQL
8.0 this has no effect.

Unless otherwise specified, the remaining examples in this
discussion assume that
default_storage_engine is
InnoDB.

Important

Partitioning applies to all data and indexes of a table; you
cannot partition only the data and not the indexes, or vice
versa, nor can you partition only a portion of the table.

Data and indexes for each partition can be assigned to a specific
directory using the DATA DIRECTORY and
INDEX DIRECTORY options for the
PARTITION clause of the
CREATE TABLE statement used to
create the partitioned table.

All columns used in the table's partitioning expression must
be part of every unique key that the table may have, including any
primary key. This means that a table such as this one, created by
the following SQL statement, cannot be partitioned:

Because the keys pk and uk
have no columns in common, there are no columns available for use
in a partitioning expression. Possible workarounds in this
situation include adding the name column to the
table's primary key, adding the id column
to uk, or simply removing the unique key
altogether. See
Section 22.6.1, “Partitioning Keys, Primary Keys, and Unique Keys”,
for more information.

In addition, MAX_ROWS and
MIN_ROWS can be used to determine the maximum
and minimum numbers of rows, respectively, that can be stored in
each partition. See Section 22.3, “Partition Management”, for
more information on these options.

Some advantages of partitioning are listed here:

Partitioning makes it possible to store more data in one table
than can be held on a single disk or file system partition.

Data that loses its usefulness can often be easily removed
from a partitioned table by dropping the partition (or
partitions) containing only that data. Conversely, the process
of adding new data can in some cases be greatly facilitated by
adding one or more new partitions for storing specifically
that data.

Some queries can be greatly optimized in virtue of the fact
that data satisfying a given WHERE clause
can be stored only on one or more partitions, which
automatically excludes any remaining partitions from the
search. Because partitions can be altered after a partitioned
table has been created, you can reorganize your data to
enhance frequent queries that may not have been often used
when the partitioning scheme was first set up. This ability to
exclude non-matching partitions (and thus any rows they
contain) is often referred to as
partition pruning. For
more information, see Section 22.4, “Partition Pruning”.

In addition, MySQL 8.0 supports explicit
partition selection for queries. For example,
SELECT * FROM t
PARTITION (p0,p1) WHERE c < 5 selects only those
rows in partitions p0 and
p1 that match the WHERE
condition. In this case, MySQL does not check any other
partitions of table t; this can greatly
speed up queries when you already know which partition or
partitions you wish to examine. Partition selection is also
supported for the data modification statements
DELETE,
INSERT,
REPLACE,
UPDATE, and
LOAD DATA,
LOAD XML. See the descriptions
of these statements for more information and examples.

Other benefits usually associated with partitioning include those
in the following list. These features are not currently
implemented in MySQL Partitioning, but are high on our list of
priorities.

Queries involving aggregate functions such as
SUM() and
COUNT() can easily be
parallelized. A simple example of such a query might be
SELECT salesperson_id, COUNT(orders) as order_total
FROM sales GROUP BY salesperson_id;. By
“parallelized,” we mean that the query can be run
simultaneously on each partition, and the final result
obtained merely by summing the results obtained for all
partitions.

HASH partitioning.
With this type of partitioning, a partition is selected
based on the value returned by a user-defined expression
that operates on column values in rows to be inserted into
the table. The function may consist of any expression valid
in MySQL that yields a nonnegative integer value. An
extension to this type, LINEAR HASH, is
also available. See Section 22.2.4, “HASH Partitioning”.

KEY partitioning.
This type of partitioning is similar to partitioning by
HASH, except that only one or more
columns to be evaluated are supplied, and the MySQL server
provides its own hashing function. These columns can contain
other than integer values, since the hashing function
supplied by MySQL guarantees an integer result regardless of
the column data type. An extension to this type,
LINEAR KEY, is also available. See
Section 22.2.5, “KEY Partitioning”.

A very common use of database partitioning is to segregate data by
date. Some database systems support explicit date partitioning,
which MySQL does not implement in 8.0. However, it is
not difficult in MySQL to create partitioning schemes based on
DATE,
TIME, or
DATETIME columns, or based on
expressions making use of such columns.

When partitioning by KEY or LINEAR
KEY, you can use a DATE,
TIME, or
DATETIME column as the partitioning
column without performing any modification of the column value.
For example, this table creation statement is perfectly valid in
MySQL:

In MySQL 8.0, it is also possible to use a
DATE or
DATETIME column as the partitioning
column using RANGE COLUMNS and LIST
COLUMNS partitioning.

Other partitioning types require a partitioning expression that
yields an integer value or NULL. If you wish to
use date-based partitioning by RANGE,
LIST, HASH, or
LINEAR HASH, you can simply employ a function
that operates on a DATE,
TIME, or
DATETIME column and returns such a
value, as shown here:

It is important to remember—regardless of the type of
partitioning that you use—that partitions are always
numbered automatically and in sequence when created, starting with
0. When a new row is inserted into a
partitioned table, it is these partition numbers that are used in
identifying the correct partition. For example, if your table uses
4 partitions, these partitions are numbered 0,
1, 2, and
3. For the RANGE and
LIST partitioning types, it is necessary to
ensure that there is a partition defined for each partition
number. For HASH partitioning, the
user-supplied expression must evaluate to an integer value greater
than 0. For KEY
partitioning, this issue is taken care of automatically by the
hashing function which the MySQL server employs internally.

Names of partitions generally follow the rules governing other
MySQL identifiers, such as those for tables and databases.
However, you should note that partition names are not
case-sensitive. For example, the following
CREATE TABLE statement fails as
shown:

Failure occurs because MySQL sees no difference between the
partition names mypart and
MyPart.

When you specify the number of partitions for the table, this must
be expressed as a positive, nonzero integer literal with no
leading zeros, and may not be an expression such as
0.8E+01 or 6-2, even if it
evaluates to an integer value. Decimal fractions are not
permitted.

In the sections that follow, we do not necessarily provide all
possible forms for the syntax that can be used for creating each
partition type; for this information, see
Section 13.1.15, “CREATE TABLE Syntax”.

22.2.1 RANGE Partitioning

A table that is partitioned by range is partitioned in such a
way that each partition contains rows for which the partitioning
expression value lies within a given range. Ranges should be
contiguous but not overlapping, and are defined using the
VALUES LESS THAN operator. For the next few
examples, suppose that you are creating a table such as the
following to hold personnel records for a chain of 20 video
stores, numbered 1 through 20:

The employees table used here has no
primary or unique keys. While the examples work as shown for
purposes of the present discussion, you should keep in mind
that tables are extremely likely in practice to have primary
keys, unique keys, or both, and that allowable choices for
partitioning columns depend on the columns used for these
keys, if any are present. For a discussion of these issues,
see
Section 22.6.1, “Partitioning Keys, Primary Keys, and Unique Keys”.

This table can be partitioned by range in a number of ways,
depending on your needs. One way would be to use the
store_id column. For instance, you might
decide to partition the table 4 ways by adding a
PARTITION BY RANGE clause as shown here:

In this partitioning scheme, all rows corresponding to employees
working at stores 1 through 5 are stored in partition
p0, to those employed at stores 6 through 10
are stored in partition p1, and so on. Each
partition is defined in order, from lowest to highest. This is a
requirement of the PARTITION BY RANGE syntax;
you can think of it as being analogous to a series of
if ... elseif ... statements in C or Java in
this regard.

It is easy to determine that a new row containing the data
(72, 'Mitchell', 'Wilson', '1998-06-25', NULL,
13) is inserted into partition p2,
but what happens when your chain adds a
21st store? Under this scheme, there
is no rule that covers a row whose store_id
is greater than 20, so an error results because the server does
not know where to place it. You can keep this from occurring by
using a “catchall”VALUES LESS
THAN clause in the CREATE
TABLE statement that provides for all values greater
than the highest value explicitly named:

MAXVALUE represents an integer value that is
always greater than the largest possible integer value (in
mathematical language, it serves as a
least upper bound). Now,
any rows whose store_id column value is
greater than or equal to 16 (the highest value defined) are
stored in partition p3. At some point in the
future—when the number of stores has increased to 25, 30,
or more—you can use an
ALTER
TABLE statement to add new partitions for stores
21-25, 26-30, and so on (see
Section 22.3, “Partition Management”, for details of how to
do this).

In much the same fashion, you could partition the table based on
employee job codes—that is, based on ranges of
job_code column values. For
example—assuming that two-digit job codes are used for
regular (in-store) workers, three-digit codes are used for
office and support personnel, and four-digit codes are used for
management positions—you could create the partitioned
table using the following statement:

In this instance, all rows relating to in-store workers would be
stored in partition p0, those relating to
office and support staff in p1, and those
relating to managers in partition p2.

It is also possible to use an expression in VALUES LESS
THAN clauses. However, MySQL must be able to evaluate
the expression's return value as part of a LESS
THAN (<) comparison.

Rather than splitting up the table data according to store
number, you can use an expression based on one of the two
DATE columns instead. For
example, let us suppose that you wish to partition based on the
year that each employee left the company; that is, the value of
YEAR(separated). An example of a
CREATE TABLE statement that
implements such a partitioning scheme is shown here:

In this scheme, for all employees who left before 1991, the rows
are stored in partition p0; for those who
left in the years 1991 through 1995, in p1;
for those who left in the years 1996 through 2000, in
p2; and for any workers who left after the
year 2000, in p3.

It is also possible to partition a table by
RANGE, based on the value of a
TIMESTAMP column, using the
UNIX_TIMESTAMP() function, as
shown in this example:

Any other expressions involving
TIMESTAMP values are not
permitted. (See Bug #42849.)

Range partitioning is particularly useful when one or more of
the following conditions is true:

You want or need to delete “old” data. If you
are using the partitioning scheme shown previously for the
employees table, you can simply use
ALTER TABLE employees DROP PARTITION p0;
to delete all rows relating to employees who stopped working
for the firm prior to 1991. (See
Section 13.1.7, “ALTER TABLE Syntax”, and
Section 22.3, “Partition Management”, for more
information.) For a table with a great many rows, this can
be much more efficient than running a
DELETE query such as
DELETE FROM employees WHERE YEAR(separated) <=
1990;.

You want to use a column containing date or time values, or
containing values arising from some other series.

A variant on this type of partitioning is RANGE
COLUMNS partitioning. Partitioning by RANGE
COLUMNS makes it possible to employ multiple columns
for defining partitioning ranges that apply both to placement of
rows in partitions and for determining the inclusion or
exclusion of specific partitions when performing partition
pruning. See Section 22.2.3.1, “RANGE COLUMNS partitioning”, for
more information.

Partitioning schemes based on time intervals.
If you wish to implement a partitioning scheme based on ranges
or intervals of time in MySQL 8.0, you have two
options:

Partition the table by RANGE, and for the
partitioning expression, employ a function operating on a
DATE,
TIME, or
DATETIME column and returning
an integer value, as shown here:

The use of partitioning columns employing date or time types
other than DATE or
DATETIME is not supported with
RANGE COLUMNS.

22.2.2 LIST Partitioning

List partitioning in MySQL is similar to range partitioning in
many ways. As in partitioning by RANGE, each
partition must be explicitly defined. The chief difference
between the two types of partitioning is that, in list
partitioning, each partition is defined and selected based on
the membership of a column value in one of a set of value lists,
rather than in one of a set of contiguous ranges of values. This
is done by using PARTITION BY
LIST(expr) where
expr is a column value or an
expression based on a column value and returning an integer
value, and then defining each partition by means of a
VALUES IN
(value_list), where
value_list is a comma-separated list
of integers.

However, other column types may be used in value lists when
employing LIST COLUMN partitioning, which
is described later in this section.

Unlike the case with partitions defined by range, list
partitions do not need to be declared in any particular order.
For more detailed syntactical information, see
Section 13.1.15, “CREATE TABLE Syntax”.

For the examples that follow, we assume that the basic
definition of the table to be partitioned is provided by the
CREATE TABLE statement shown
here:

This makes it easy to add or drop employee records relating to
specific regions to or from the table. For instance, suppose
that all stores in the West region are sold to another company.
In MySQL 8.0, all rows relating to employees
working at stores in that region can be deleted with the query
ALTER TABLE employees TRUNCATE PARTITION
pWest, which can be executed much more efficiently
than the equivalent DELETE
statement DELETE FROM employees WHERE store_id IN
(4,12,13,14,18);. (Using ALTER TABLE
employees DROP PARTITION pWest would also delete all
of these rows, but would also remove the partition
pWest from the definition of the table; you
would need to use an ALTER TABLE ... ADD
PARTITION statement to restore the table's
original partitioning scheme.)

As with RANGE partitioning, it is possible to
combine LIST partitioning with partitioning
by hash or key to produce a composite partitioning
(subpartitioning). See
Section 22.2.6, “Subpartitioning”.

Unlike the case with RANGE partitioning,
there is no “catch-all” such as
MAXVALUE; all expected values for the
partitioning expression should be covered in PARTITION
... VALUES IN (...) clauses. An
INSERT statement containing an
unmatched partitioning column value fails with an error, as
shown in this example:

When inserting multiple rows using a single
INSERT statement into a single
InnoDB table,
InnoDB considers the statement a single
transaction, so that the presence of any unmatched values causes
the statement to fail completely, and so no rows are inserted.

You can cause this type of error to be ignored by using the
IGNORE keyword. If you do so, rows containing
unmatched partitioning column values are not inserted, but any
rows with matching values are inserted, and
no errors are reported:

MySQL 8.0 also provides support for LIST
COLUMNS partitioning, a variant of
LIST partitioning that enables you to use
columns of types other than integer types for partitioning
columns, and to use multiple columns as partitioning keys. For
more information, see
Section 22.2.3.2, “LIST COLUMNS partitioning”.

22.2.3 COLUMNS Partitioning

The next two sections discuss
COLUMNS
partitioning, which are variants on
RANGE and LIST
partitioning. COLUMNS partitioning enables
the use of multiple columns in partitioning keys. All of these
columns are taken into account both for the purpose of placing
rows in partitions and for the determination of which partitions
are to be checked for matching rows in partition pruning.

In addition, both RANGE COLUMNS partitioning
and LIST COLUMNS partitioning support the use
of non-integer columns for defining value ranges or list
members. The permitted data types are shown in the following
list:

22.2.3.1 RANGE COLUMNS partitioning

Range columns partitioning is similar to range partitioning,
but enables you to define partitions using ranges based on
multiple column values. In addition, you can define the ranges
using columns of types other than integer types.

RANGE COLUMNS partitioning differs
significantly from RANGE partitioning in
the following ways:

RANGE COLUMNS does not accept
expressions, only names of columns.

RANGE COLUMNS accepts a list of one or
more columns.

RANGE COLUMNS partitions are based on
comparisons between
tuples (lists of
column values) rather than comparisons between scalar
values. Placement of rows in RANGE
COLUMNS partitions is also based on comparisons
between tuples; this is discussed further later in this
section.

In the syntax just shown,
column_list is a list of one or
more columns (sometimes called a
partitioning column
list), and value_list is
a list of values (that is, it is a
partition definition value
list). A value_list must
be supplied for each partition definition, and each
value_list must have the same
number of values as the column_list
has columns. Generally speaking, if you use
N columns in the
COLUMNS clause, then each VALUES
LESS THAN clause must also be supplied with a list
of N values.

The elements in the partitioning column list and in the value
list defining each partition must occur in the same order. In
addition, each element in the value list must be of the same
data type as the corresponding element in the column list.
However, the order of the column names in the partitioning
column list and the value lists does not have to be the same
as the order of the table column definitions in the main part
of the CREATE TABLE statement.
As with table partitioned by RANGE, you can
use MAXVALUE to represent a value such that
any legal value inserted into a given column is always less
than this value. Here is an example of a
CREATE TABLE statement that
helps to illustrate all of these points:

Table rcx contains the columns
a, b,
c, d. The partitioning
column list supplied to the COLUMNS clause
uses 3 of these columns, in the order a,
d, c. Each value list
used to define a partition contains 3 values in the same
order; that is, each value list tuple has the form
(INT, INT,
CHAR(3)), which corresponds to the data
types used by columns a,
d, and c (in that
order).

Placement of rows into partitions is determined by comparing
the tuple from a row to be inserted that matches the column
list in the COLUMNS clause with the tuples
used in the VALUES LESS THAN clauses to
define partitions of the table. Because we are comparing
tuples (that is, lists or sets of values) rather than scalar
values, the semantics of VALUES LESS THAN
as used with RANGE COLUMNS partitions
differs somewhat from the case with simple
RANGE partitions. In
RANGE partitioning, a row generating an
expression value that is equal to a limiting value in a
VALUES LESS THAN is never placed in the
corresponding partition; however, when using RANGE
COLUMNS partitioning, it is sometimes possible for a
row whose partitioning column list's first element is
equal in value to the that of the first element in a
VALUES LESS THAN value list to be placed in
the corresponding partition.

If we insert 3 rows into this table such that the column value
for a is 5 for each row,
all 3 rows are stored in partition p1
because the a column value is in each case
not less than 5, as we can see by executing the proper query
against the
INFORMATION_SCHEMA.PARTITIONS
table:

This is because we are comparing rows rather than scalar
values. We can compare the row values inserted with the
limiting row value from the VALUES THAN LESS
THAN clause used to define partition
p0 in table rc1, like
this:

The 2 tuples (5,10) and
(5,11) evaluate as less than
(5,12), so they are stored in partition
p0. Since 5 is not less than 5 and 12 is
not less than 12, (5,12) is considered not
less than (5,12), and is stored in
partition p1.

The SELECT statement in the
preceding example could also have been written using explicit
row constructors, like this:

For a table partitioned by RANGE COLUMNS
using only a single partitioning column, the storing of rows
in partitions is the same as that of an equivalent table that
is partitioned by RANGE. The following
CREATE TABLE statement creates a table
partitioned by RANGE COLUMNS using 1
partitioning column:

It is also possible to create tables partitioned by
RANGE COLUMNS where limiting values for one
or more columns are repeated in successive partition
definitions. You can do this as long as the tuples of column
values used to define the partitions are strictly increasing.
For example, each of the following CREATE
TABLE statements is valid:

The following statement also succeeds, even though it might
appear at first glance that it would not, since the limiting
value of column b is 25 for partition
p0 and 20 for partition
p1, and the limiting value of column
c is 100 for partition
p1 and 50 for partition
p2:

When you get such an error, you can deduce which partition
definitions are invalid by making “less than”
comparisons between their column lists. In this case, the
problem is with the definition of partition
p2 because the tuple used to define it is
not less than the tuple used to define partition
p3, as shown here:

It is also possible for MAXVALUE to appear
for the same column in more than one VALUES LESS
THAN clause when using RANGE
COLUMNS. However, the limiting values for individual
columns in successive partition definitions should otherwise
be increasing, there should be no more than one partition
defined where MAXVALUE is used as the upper
limit for all column values, and this partition definition
should appear last in the list of PARTITION ...
VALUES LESS THAN clauses. In addition, you cannot
use MAXVALUE as the limiting value for the
first column in more than one partition definition.

As stated previously, it is also possible with RANGE
COLUMNS partitioning to use non-integer columns as
partitioning columns. (See
Section 22.2.3, “COLUMNS Partitioning”, for a complete listing
of these.) Consider a table named employees
(which is not partitioned), created using the following
statement:

Because different character sets and collations have
different sort orders, the character sets and collations in
use may effect which partition of a table partitioned by
RANGE COLUMNS a given row is stored in
when using string columns as partitioning columns. In
addition, changing the character set or collation for a
given database, table, or column after such a table is
created may cause changes in how rows are distributed. For
example, when using a case-sensitive collation,
'and' sorts before
'Andersen', but when using a collation
that is case insensitive, the reverse is true.

Similarly, you can cause the employees
table to be partitioned in such a way that each row is stored
in one of several partitions based on the decade in which the
corresponding employee was hired using the
ALTER
TABLE statement shown here:

22.2.3.2 LIST COLUMNS partitioning

MySQL 8.0 provides support for LIST
COLUMNS partitioning. This is a variant of
LIST partitioning that enables the use of
multiple columns as partition keys, and for columns of data
types other than integer types to be used as partitioning
columns; you can use string types,
DATE, and
DATETIME columns. (For more
information about permitted data types for
COLUMNS partitioning columns, see
Section 22.2.3, “COLUMNS Partitioning”.)

Suppose that you have a business that has customers in 12
cities which, for sales and marketing purposes, you organize
into 4 regions of 3 cities each as shown in the following
table:

Region

Cities

1

Oskarshamn, Högsby, Mönsterås

2

Vimmerby, Hultsfred, Västervik

3

Nässjö, Eksjö, Vetlanda

4

Uppvidinge, Alvesta, Växjo

With LIST COLUMNS partitioning, you can
create a table for customer data that assigns a row to any of
4 partitions corresponding to these regions based on the name
of the city where a customer resides, as shown here:

As with partitioning by RANGE COLUMNS, you
do not need to use expressions in the
COLUMNS() clause to convert column values
into integers. (In fact, the use of expressions other than
column names is not permitted with
COLUMNS().)

It is also possible to use DATE
and DATETIME columns, as shown
in the following example that uses the same name and columns
as the customers_1 table shown previously,
but employs LIST COLUMNS partitioning based
on the renewal column to store rows in one
of 4 partitions depending on the week in February 2010 the
customer's account is scheduled to renew:

This works, but becomes cumbersome to define and maintain if
the number of dates involved grows very large; in such cases,
it is usually more practical to employ
RANGE or RANGE COLUMNS
partitioning instead. In this case, since the column we wish
to use as the partitioning key is a
DATE column, we use
RANGE COLUMNS partitioning, as shown here:

22.2.4 HASH Partitioning

Partitioning by HASH is used primarily to
ensure an even distribution of data among a predetermined number
of partitions. With range or list partitioning, you must specify
explicitly which partition a given column value or set of column
values should be stored in; with hash partitioning, this
decision is taken care of for you, and you need only specify a
column value or expression based on a column value to be hashed
and the number of partitions into which the partitioned table is
to be divided.

To partition a table using HASH partitioning,
it is necessary to append to the CREATE
TABLE statement a PARTITION BY HASH
(expr) clause, where
expr is an expression that returns an
integer. This can simply be the name of a column whose type is
one of MySQL's integer types. In addition, you most likely
want to follow this with PARTITIONS
num, where
num is a positive integer
representing the number of partitions into which the table is to
be divided.

Note

For simplicity, the tables in the examples that follow do not
use any keys. You should be aware that, if a table has any
unique keys, every column used in the partitioning expression
for this table must be part of every unique key, including the
primary key. See
Section 22.6.1, “Partitioning Keys, Primary Keys, and Unique Keys”,
for more information.

The following statement creates a table that uses hashing on the
store_id column and is divided into 4
partitions:

expr must return a nonconstant,
nonrandom integer value (in other words, it should be varying
but deterministic), and must not contain any prohibited
constructs as described in
Section 22.6, “Restrictions and Limitations on Partitioning”. You should also keep
in mind that this expression is evaluated each time a row is
inserted or updated (or possibly deleted); this means that very
complex expressions may give rise to performance issues,
particularly when performing operations (such as batch inserts)
that affect a great many rows at one time.

The most efficient hashing function is one which operates upon a
single table column and whose value increases or decreases
consistently with the column value, as this allows for
“pruning” on ranges of partitions. That is, the
more closely that the expression varies with the value of the
column on which it is based, the more efficiently MySQL can use
the expression for hash partitioning.

For example, where date_col is a column of
type DATE, then the expression
TO_DAYS(date_col) is said to vary
directly with the value of date_col, because
for every change in the value of date_col,
the value of the expression changes in a consistent manner. The
variance of the expression
YEAR(date_col) with respect to
date_col is not quite as direct as that of
TO_DAYS(date_col), because not
every possible change in date_col produces an
equivalent change in
YEAR(date_col). Even so,
YEAR(date_col) is a good
candidate for a hashing function, because it varies directly
with a portion of date_col and there is no
possible change in date_col that produces a
disproportionate change in
YEAR(date_col).

By way of contrast, suppose that you have a column named
int_col whose type is
INT. Now consider the expression
POW(5-int_col,3) + 6. This would
be a poor choice for a hashing function because a change in the
value of int_col is not guaranteed to produce
a proportional change in the value of the expression. Changing
the value of int_col by a given amount can
produce widely differing changes in the value of the expression.
For example, changing int_col from
5 to 6 produces a change
of -1 in the value of the expression, but
changing the value of int_col from
6 to 7 produces a change
of -7 in the expression value.

In other words, the more closely the graph of the column value
versus the value of the expression follows a straight line as
traced by the equation
y=cx where
c is some nonzero constant, the
better the expression is suited to hashing. This has to do with
the fact that the more nonlinear an expression is, the more
uneven the distribution of data among the partitions it tends to
produce.

In theory, pruning is also possible for expressions involving
more than one column value, but determining which of such
expressions are suitable can be quite difficult and
time-consuming. For this reason, the use of hashing expressions
involving multiple columns is not particularly recommended.

When PARTITION BY HASH is used, the storage
engine determines which partition of
num partitions to use based on the
modulus of the result of the expression. In other words, for a
given expression expr, the partition
in which the record is stored is partition number
N, where
N =
MOD(expr,
num). Suppose that table
t1 is defined as follows, so that it has 4
partitions:

If you insert a record into t1 whose
col3 value is
'2005-09-15', then the partition in which it
is stored is determined as follows:

MOD(YEAR('2005-09-01'),4)
= MOD(2005,4)
= 1

MySQL 8.0 also supports a variant of
HASH partitioning known as
linear hashing which
employs a more complex algorithm for determining the placement
of new rows inserted into the partitioned table. See
Section 22.2.4.1, “LINEAR HASH Partitioning”, for a description of
this algorithm.

The user-supplied expression is evaluated each time a record is
inserted or updated. It may also—depending on the
circumstances—be evaluated when records are deleted.

22.2.4.1 LINEAR HASH Partitioning

MySQL also supports linear hashing, which differs from regular
hashing in that linear hashing utilizes a linear powers-of-two
algorithm whereas regular hashing employs the modulus of the
hashing function's value.

Syntactically, the only difference between linear-hash
partitioning and regular hashing is the addition of the
LINEAR keyword in the PARTITION
BY clause, as shown here:

Given an expression expr, the
partition in which the record is stored when linear hashing is
used is partition number N from
among num partitions, where
N is derived according to the
following algorithm:

Find the next power of 2 greater than
num. We call this value
V; it can be calculated as:

The advantage in partitioning by linear hash is that the
adding, dropping, merging, and splitting of partitions is made
much faster, which can be beneficial when dealing with tables
containing extremely large amounts (terabytes) of data. The
disadvantage is that data is less likely to be evenly
distributed between partitions as compared with the
distribution obtained using regular hash partitioning.

22.2.5 KEY Partitioning

Partitioning by key is similar to partitioning by hash, except
that where hash partitioning employs a user-defined expression,
the hashing function for key partitioning is supplied by the
MySQL server. This internal hashing function is based on the
same algorithm as PASSWORD().

The syntax rules for CREATE TABLE ... PARTITION BY
KEY are similar to those for creating a table that is
partitioned by hash. The major differences are listed here:

KEY is used rather than
HASH.

KEY takes only a list of zero or more
column names. Any columns used as the partitioning key must
comprise part or all of the table's primary key, if the
table has one. Where no column name is specified as the
partitioning key, the table's primary key is used, if
there is one. For example, the following
CREATE TABLE statement is
valid in MySQL 8.0:

The preceding statement would not be
valid, were a different partitioning type to be specified.
(In this case, simply using PARTITION BY
KEY() would also be valid and have the same effect
as PARTITION BY KEY(s1), since
s1 is the table's primary key.)

The LINEAR keyword has the same effect on
KEY partitioning as it does on
HASH partitioning, with the partition number
being derived using a powers-of-two algorithm rather than modulo
arithmetic. See Section 22.2.4.1, “LINEAR HASH Partitioning”, for
a description of this algorithm and its implications.

22.2.6 Subpartitioning

Subpartitioning—also known as
composite
partitioning—is the further division of each
partition in a partitioned table. Consider the following
CREATE TABLE statement:

Table ts has 3 RANGE
partitions. Each of these
partitions—p0, p1,
and p2—is further divided into 2
subpartitions. In effect, the entire table is divided into
3 * 2 = 6 partitions. However, due to the
action of the PARTITION BY RANGE clause, the
first 2 of these store only those records with a value less than
1990 in the purchased column.

It is possible to subpartition tables that are partitioned by
RANGE or LIST.
Subpartitions may use either HASH or
KEY partitioning. This is also known as
composite partitioning.

Note

SUBPARTITION BY HASH and
SUBPARTITION BY KEY generally follow the
same syntax rules as PARTITION BY HASH and
PARTITION BY KEY, respectively. An
exception to this is that SUBPARTITION BY
KEY (unlike PARTITION BY KEY)
does not currently support a default column, so the column
used for this purpose must be specified, even if the table has
an explicit primary key. This is a known issue which we are
working to address; see
Issues with subpartitions, for
more information and an example.

It is also possible to define subpartitions explicitly using
SUBPARTITION clauses to specify options for
individual subpartitions. For example, a more verbose fashion of
creating the same table ts as shown in the
previous example would be:

Each SUBPARTITION clause must include (at
a minimum) a name for the subpartition. Otherwise, you may
set any desired option for the subpartition or allow it to
assume its default setting for that option.

Subpartition names must be unique across the entire table.
For example, the following CREATE
TABLE statement is valid:

22.2.7 How MySQL Partitioning Handles NULL

Partitioning in MySQL does nothing to disallow
NULL as the value of a partitioning
expression, whether it is a column value or the value of a
user-supplied expression. Even though it is permitted to use
NULL as the value of an expression that must
otherwise yield an integer, it is important to keep in mind that
NULL is not a number. MySQL's
partitioning implementation treats NULL as
being less than any non-NULL value, just as
ORDER BY does.

This means that treatment of NULL varies
between partitioning of different types, and may produce
behavior which you do not expect if you are not prepared for it.
This being the case, we discuss in this section how each MySQL
partitioning type handles NULL values when
determining the partition in which a row should be stored, and
provide examples for each.

Handling of NULL with RANGE partitioning.
If you insert a row into a table partitioned by
RANGE such that the column value used to
determine the partition is NULL, the row is
inserted into the lowest partition. Consider these two tables
in a database named p, created as follows:

As with other MySQL functions,
YEAR(NULL) returns
NULL. A row with a dt
column value of NULL is treated as though the
partitioning expression evaluated to a value less than any other
value, and so is inserted into partition p0.

Handling of NULL with LIST partitioning.
A table that is partitioned by LIST admits
NULL values if and only if one of its
partitions is defined using that value-list that contains
NULL. The converse of this is that a table
partitioned by LIST which does not
explicitly use NULL in a value list rejects
rows resulting in a NULL value for the
partitioning expression, as shown in this example:

Only rows having a c1 value between
0 and 8 inclusive can be
inserted into ts1. NULL
falls outside this range, just like the number
9. We can create tables
ts2 and ts3 having value
lists containing NULL, as shown here:

When defining value lists for partitioning, you can (and should)
treat NULL just as you would any other value.
For example, both VALUES IN (NULL) and
VALUES IN (1, 4, 7, NULL) are valid, as are
VALUES IN (1, NULL, 4, 7), VALUES IN
(NULL, 1, 4, 7), and so on. You can insert a row
having NULL for column c1
into each of the tables ts2 and
ts3:

By issuing the appropriate query against
INFORMATION_SCHEMA.PARTITIONS, you
can determine which partitions were used to store the rows just
inserted (we assume, as in the previous examples, that the
partitioned tables were created in the p
database):

As shown earlier in this section, you can also verify which
partitions were used for storing the rows by deleting these
partitions and then performing a
SELECT.

Handling of NULL with HASH and KEY partitioning. NULL is handled somewhat differently for
tables partitioned by HASH or
KEY. In these cases, any partition
expression that yields a NULL value is
treated as though its return value were zero. We can verify
this behavior by examining the effects on the file system of
creating a table partitioned by HASH and
populating it with a record containing appropriate values.
Suppose that you have a table th (also in
the p database) created using the following
statement:

Recall that for any integer N, the
value of NULL MOD
N is always
NULL. For tables that are partitioned by
HASH or KEY, this result
is treated for determining the correct partition as
0. Checking the
INFORMATION_SCHEMA.PARTITIONS table
once again, we can see that both rows were inserted into
partition p0:

There are a number of ways using SQL statements to modify
partitioned tables; it is possible to add, drop, redefine, merge,
or split existing partitions using the partitioning extensions to
the
ALTER
TABLE statement. There are also ways to obtain
information about partitioned tables and partitions. We discuss
these topics in the sections that follow.

All partitions of a partitioned table must have the same number
of subpartitions; it is not possible to change the
subpartitioning once the table has been created.

To change a table's partitioning scheme, it is necessary only
to use the
ALTER
TABLE statement with a
partition_options option, which has the
same syntax as that as used with CREATE
TABLE for creating a partitioned table; this option
(also) always begins with the keywords PARTITION
BY. Suppose that the following
CREATE TABLE statement was used to
create a table that is partitioned by range:

To repartition this table so that it is partitioned by key into
two partitions using the id column value as the
basis for the key, you can use this statement:

ALTER TABLE trb3 PARTITION BY KEY(id) PARTITIONS 2;

This has the same effect on the structure of the table as dropping
the table and re-creating it using CREATE TABLE trb3
PARTITION BY KEY(id) PARTITIONS 2;.

ALTER TABLE ... ENGINE = ... changes only the
storage engine used by the table, and leaves the table's
partitioning scheme intact. The statement succeeds only if the
target storage engine provides partitioning support. You can use
ALTER TABLE ... REMOVE PARTITIONING to remove a
table's partitioning; see Section 13.1.7, “ALTER TABLE Syntax”.

Important

Only a single PARTITION BY, ADD
PARTITION, DROP PARTITION,
REORGANIZE PARTITION, or COALESCE
PARTITION clause can be used in a given
ALTER
TABLE statement. If you (for example) wish to drop a
partition and reorganize a table's remaining partitions,
you must do so in two separate
ALTER
TABLE statements (one using DROP
PARTITION and then a second one using
REORGANIZE PARTITION).

22.3.1 Management of RANGE and LIST Partitions

Adding and dropping of range and list partitions are handled in
a similar fashion, so we discuss the management of both sorts of
partitioning in this section. For information about working with
tables that are partitioned by hash or key, see
Section 22.3.2, “Management of HASH and KEY Partitions”.

Dropping a partition from a table that is partitioned by either
RANGE or by LIST can be
accomplished using the
ALTER
TABLE statement with the DROP
PARTITION option. Suppose that you have created a
table that is partitioned by range and then populated with 10
records using the following CREATE
TABLE and INSERT
statements:

The NDBCLUSTER storage engine
does not support ALTER TABLE ... DROP
PARTITION. It does, however, support the other
partitioning-related extensions to
ALTER
TABLE that are described in this chapter.

It is very important to remember that, when you drop a
partition, you also delete all the data that was stored in that
partition. You can see that this is the case by
re-running the previous SELECT
query:

mysql> SELECT * FROM tr WHERE purchased
-> BETWEEN '1995-01-01' AND '1999-12-31';
Empty set (0.00 sec)

Note

DROP PARTITION is supported by native
partitioning in-place APIs and may be used with
ALGORITHM={COPY|INPLACE}. DROP
PARTITION with ALGORITHM=INPLACE
deletes data stored in the partition and drops the partition.
However, DROP PARTITION with
ALGORITHM=COPY or
old_alter_table=ON rebuilds
the partitioned table and attempts to move data from the
dropped partition to another partition with a compatible
PARTITION ... VALUES definition. Data that
cannot be moved to another partition is deleted.

Because of this, you must have the
DROP privilege for a table before
you can execute ALTER TABLE ... DROP
PARTITION on that table.

If you intend to change the partitioning of a table
without losing data, use ALTER
TABLE ... REORGANIZE PARTITION instead. See below or
in Section 13.1.7, “ALTER TABLE Syntax”, for information about
REORGANIZE PARTITION.

If you now execute a SHOW CREATE
TABLE statement, you can see how the partitioning
makeup of the table has been changed:

The number of rows dropped from the table as a result of
ALTER TABLE ... DROP PARTITION is not
reported by the server as it would be by the equivalent
DELETE query.

Dropping LIST partitions uses exactly the
same ALTER TABLE ... DROP PARTITION syntax as
used for dropping RANGE partitions. However,
there is one important difference in the effect this has on your
use of the table afterward: You can no longer insert into the
table any rows having any of the values that were included in
the value list defining the deleted partition. (See
Section 22.2.2, “LIST Partitioning”, for an example.)

To add a new range or list partition to a previously partitioned
table, use the ALTER TABLE ... ADD PARTITION
statement. For tables which are partitioned by
RANGE, this can be used to add a new range to
the end of the list of existing partitions. Suppose that you
have a partitioned table containing membership data for your
organization, which is defined as follows:

Suppose further that the minimum age for members is 16. As the
calendar approaches the end of 2015, you realize that you will
soon be admitting members who were born in 2000 (and later). You
can modify the members table to accommodate
new members born in the years 2000 to 2010 as shown here:

With tables that are partitioned by range, you can use
ADD PARTITION to add new partitions to the
high end of the partitions list only. Trying to add a new
partition in this manner between or before existing partitions
results in an error as shown here:

mysql> ALTER TABLE members
> ADD PARTITION (
> PARTITION n VALUES LESS THAN (1970));
ERROR 1463 (HY000): VALUES LESS THAN value must be strictly »
increasing for each partition

You can work around this problem by reorganizing the first
partition into two new ones that split the range between them,
like this:

Because any rows with the data column value
12 have already been assigned to partition
p1, you cannot create a new partition on
table tt that includes 12
in its value list. To accomplish this, you could drop
p1, and add np and then a
new p1 with a modified definition. However,
as discussed earlier, this would result in the loss of all data
stored in p1—and it is often the case
that this is not what you really want to do. Another solution
might appear to be to make a copy of the table with the new
partitioning and to copy the data into it using
CREATE TABLE ...
SELECT ..., then drop the old table and rename the new
one, but this could be very time-consuming when dealing with a
large amounts of data. This also might not be feasible in
situations where high availability is a requirement.

You can add multiple partitions in a single ALTER TABLE
... ADD PARTITION statement as shown here:

Fortunately, MySQL's partitioning implementation provides ways
to redefine partitions without losing data. Let us look first at
a couple of simple examples involving RANGE
partitioning. Recall the members table which
is now defined as shown here:

Suppose that you would like to move all rows representing
members born before 1960 into a separate partition. As we have
already seen, this cannot be done using
ALTER
TABLE ... ADD PARTITION. However, you can use another
partition-related extension to
ALTER
TABLE to accomplish this:

In effect, this command splits partition p0
into two new partitions s0 and
s1. It also moves the data that was stored in
p0 into the new partitions according to the
rules embodied in the two PARTITION ... VALUES
... clauses, so that s0 contains
only those records for which
YEAR(dob) is less than 1960 and
s1 contains those rows in which
YEAR(dob) is greater than or
equal to 1960 but less than 1970.

A REORGANIZE PARTITION clause may also be
used for merging adjacent partitions. You can reverse the effect
of the previous statement on the members
table as shown here:

No data is lost in splitting or merging partitions using
REORGANIZE PARTITION. In executing the above
statement, MySQL moves all of the records that were stored in
partitions s0 and s1 into
partition p0.

Here, tbl_name is the name of the
partitioned table, and partition_list
is a comma-separated list of names of one or more existing
partitions to be changed.
partition_definitions is a
comma-separated list of new partition definitions, which follow
the same rules as for the
partition_definitions list used in
CREATE TABLE. You are not limited
to merging several partitions into one, or to splitting one
partition into many, when using REORGANIZE
PARTITION. For example, you can reorganize all four
partitions of the members table into two,
like this:

You can also use REORGANIZE PARTITION with
tables that are partitioned by LIST. Let us
return to the problem of adding a new partition to the
list-partitioned tt table and failing because
the new partition had a value that was already present in the
value-list of one of the existing partitions. We can handle this
by adding a partition that contains only nonconflicting values,
and then reorganizing the new partition and the existing one so
that the value which was stored in the existing one is now moved
to the new one:

Here are some key points to keep in mind when using
ALTER TABLE ... REORGANIZE PARTITION to
repartition tables that are partitioned by
RANGE or LIST:

The PARTITION options used to determine
the new partitioning scheme are subject to the same rules as
those used with a CREATE
TABLE statement.

A new RANGE partitioning scheme cannot
have any overlapping ranges; a new LIST
partitioning scheme cannot have any overlapping sets of
values.

The combination of partitions in the
partition_definitions list should
account for the same range or set of values overall as the
combined partitions named in the
partition_list.

For example, partitions p1 and
p2 together cover the years 1980 through
1999 in the members table used as an
example in this section. Any reorganization of these two
partitions should cover the same range of years overall.

For tables partitioned by RANGE, you can
reorganize only adjacent partitions; you cannot skip over
range partitions.

For instance, you could not reorganize the example
members table using a statement beginning
with ALTER TABLE members REORGANIZE PARTITION p0,p2
INTO ... because p0 covers the
years prior to 1970 and p2 the years from
1990 through 1999 inclusive, so these are not adjacent
partitions. (You cannot skip partition p1
in this case.)

You cannot use REORGANIZE PARTITION to
change the type of partitioning used by the table; for
example, you cannot change RANGE
partitions to HASH partitions or the
reverse. You also cannot use this statement to change the
partitioning expression or column. To accomplish either of
these tasks without dropping and re-creating the table, you
can use
ALTER
TABLE ... PARTITION BY ..., as shown here:

ALTER TABLE members
PARTITION BY HASH( YEAR(dob) )
PARTITIONS 8;

22.3.2 Management of HASH and KEY Partitions

Tables which are partitioned by hash or by key are very similar
to one another with regard to making changes in a partitioning
setup, and both differ in a number of ways from tables which
have been partitioned by range or list. For that reason, this
section addresses the modification of tables partitioned by hash
or by key only. For a discussion of adding and dropping of
partitions of tables that are partitioned by range or list, see
Section 22.3.1, “Management of RANGE and LIST Partitions”.

You cannot drop partitions from tables that are partitioned by
HASH or KEY in the same
way that you can from tables that are partitioned by
RANGE or LIST. However,
you can merge HASH or KEY
partitions using ALTER TABLE ... COALESCE
PARTITION. Suppose that a clients
table containing data about clients is divided into 12
partitions, created as shown here:

COALESCE works equally well with tables that
are partitioned by HASH,
KEY, LINEAR HASH, or
LINEAR KEY. Here is an example similar to the
previous one, differing only in that the table is partitioned by
LINEAR KEY:

To increase the number of partitions for the
clients table from 12 to 18, use
ALTER TABLE ... ADD PARTITION as shown here:

ALTER TABLE clients ADD PARTITION PARTITIONS 6;

22.3.3 Exchanging Partitions and Subpartitions with Tables

In MySQL 8.0, it is possible to exchange a table
partition or subpartition with a table using ALTER
TABLE pt EXCHANGE PARTITION
p WITH TABLE
nt, where
pt is the partitioned table and
p is the partition or subpartition of
pt to be exchanged with unpartitioned
table nt, provided that the following
statements are true:

Table nt is not itself
partitioned.

Table nt is not a temporary
table.

The structures of tables pt and
nt are otherwise identical.

Table nt contains no foreign key
references, and no other table has any foreign keys that
refer to nt.

There are no rows in nt that lie
outside the boundaries of the partition definition for
p. This condition does not apply
if WITHOUT VALIDATION is used.

The IGNORE keyword has no effect when
used with ALTER TABLE ... EXCHANGE
PARTITION.

The syntax for
ALTER TABLE ...
EXCHANGE PARTITION is shown here, where
pt is the partitioned table,
p is the partition (or subpartition)
to be exchanged, and nt is the
nonpartitioned table to be exchanged with
p:

ALTER TABLE pt
EXCHANGE PARTITION p
WITH TABLE nt;

Optionally, you can append WITH VALIDATION or
WITHOUT VALIDATION. When WITHOUT
VALIDATION is specified, the
ALTER TABLE ...
EXCHANGE PARTITION operation does not perform any
row-by-row validation when exchanging a partition a
nonpartitioned table, allowing database administrators to assume
responsibility for ensuring that rows are within the boundaries
of the partition definition. WITH VALIDATION
is the default.

One and only one partition or subpartition may be exchanged with
one and only one nonpartitioned table in a single
ALTER TABLE
EXCHANGE PARTITION statement. To exchange multiple
partitions or subpartitions, use multiple
ALTER TABLE
EXCHANGE PARTITION statements. EXCHANGE
PARTITION may not be combined with other
ALTER TABLE options. The
partitioning and (if applicable) subpartitioning used by the
partitioned table may be of any type or types supported in MySQL
8.0.

Exchanging a Partition with a Nonpartitioned Table

Suppose that a partitioned table e has been
created and populated using the following SQL statements:

More precisely, the statement just issued causes any rows found
in the partition to be swapped with those found in the table.
You can observe how this has happened by querying the
INFORMATION_SCHEMA.PARTITIONS
table, as before. The table row that was previously found in
partition p0 is no longer present:

The table to be exchanged with the partition does not
necessarily have to be empty. To demonstrate this, we first
insert a new row into table e, making sure
that this row is stored in partition p0 by
choosing an id column value that is less than
50, and verifying this afterward by querying the
PARTITIONS table:

The output of the following queries shows that the table row
that was stored in partition p0 and the table
row that was stored in table e2, prior to
issuing the
ALTER
TABLE statement, have now switched places:

Nonmatching Rows

You should keep in mind that any rows found in the
nonpartitioned table prior to issuing the
ALTER TABLE ...
EXCHANGE PARTITION statement must meet the conditions
required for them to be stored in the target partition;
otherwise, the statement fails. To see how this occurs, first
insert a row into e2 that is outside the
boundaries of the partition definition for partition
p0 of table e. For
example, insert a row with an id column value
that is too large; then, try to exchange the table with the
partition again:

When a partition is exchanged with a table that contains rows
that do not match the partition definition, it is the
responsibility of the database administrator to fix the
non-matching rows, which can be performed using
REPAIR TABLE or
ALTER
TABLE ... REPAIR PARTITION.

Exchanging Partitions Without Row-By-Row Validation

To avoid time consuming validation when exchanging a partition
with a table that has many rows, it is possible to skip the
row-by-row validation step by appending WITHOUT
VALIDATION to the
ALTER
TABLE ... EXCHANGE PARTITION statement.

The following example compares the difference between execution
times when exchanging a partition with a nonpartitioned table,
with and without validation. The partitioned table (table
e) contains two partitions of 1 million rows
each. The rows in p0 of table e are removed and p0 is exchanged
with a nonpartitioned table of 1 million rows. The WITH
VALIDATION operation takes 0.74 seconds. By
comparison, the WITHOUT VALIDATION operation
takes 0.01 seconds.

If a partition is exchanged with a table that contains rows that
do not match the partition definition, it is the responsibility
of the database administrator to fix the non-matching rows,
which can be performed using REPAIR
TABLE or
ALTER
TABLE ... REPAIR PARTITION.

Exchanging a Subpartition with a Nonpartitioned Table

You can also exchange a subpartition of a subpartitioned table
(see Section 22.2.6, “Subpartitioning”) with a
nonpartitioned table using an
ALTER TABLE ...
EXCHANGE PARTITION statement. In the following
example, we first create a table es that is
partitioned by RANGE and subpartitioned by
KEY, populate this table as we did table
e, and then create an empty, nonpartitioned
copy es2 of the table, as shown here:

Although we did not explicitly name any of the subpartitions
when creating table es, we can obtain
generated names for these by including the
SUBPARTITION_NAME column of the
PARTITIONS table from
INFORMATION_SCHEMA when selecting from that
table, as shown here:

Table structures are compared in a strict fashion; the number,
order, names, and types of columns and indexes of the
partitioned table and the nonpartitioned table must match
exactly. In addition, both tables must use the same storage
engine:

You can use a number of extensions to
ALTER
TABLE for performing operations of this type on one or
more partitions directly, as described in the following list:

Rebuilding partitions.
Rebuilds the partition; this has the same effect as
dropping all records stored in the partition, then
reinserting them. This can be useful for purposes of
defragmentation.

Example:

ALTER TABLE t1 REBUILD PARTITION p0, p1;

Optimizing partitions.
If you have deleted a large number of rows from a
partition or if you have made many changes to a
partitioned table with variable-length rows (that is,
having VARCHAR,
BLOB, or
TEXT columns), you can use
ALTER
TABLE ... OPTIMIZE PARTITION to reclaim any
unused space and to defragment the partition data file.

Example:

ALTER TABLE t1 OPTIMIZE PARTITION p0, p1;

Using OPTIMIZE PARTITION on a given
partition is equivalent to running CHECK
PARTITION, ANALYZE PARTITION,
and REPAIR PARTITION on that partition.

Some MySQL storage engines, including
InnoDB, do not support
per-partition optimization; in these cases,
ALTER
TABLE ... OPTIMIZE PARTITION analyzes and rebuilds
the entire table, and causes an appropriate warning to be
issued. (Bug #11751825, Bug #42822) Use ALTER TABLE
... REBUILD PARTITION and ALTER TABLE ...
ANALYZE PARTITION instead, to avoid this issue.

Analyzing partitions.
This reads and stores the key distributions for
partitions.

Example:

ALTER TABLE t1 ANALYZE PARTITION p3;

Repairing partitions.
This repairs corrupted partitions.

Example:

ALTER TABLE t1 REPAIR PARTITION p0,p1;

Normally, REPAIR PARTITION fails when the
partition contains duplicate key errors. You can use
ALTER
IGNORE TABLE with this option, in which case all
rows that cannot be moved due to the presence of duplicate
keys are removed from the partition (Bug #16900947).

Checking partitions.
You can check partitions for errors in much the same way
that you can use CHECK TABLE with
nonpartitioned tables.

Example:

ALTER TABLE trb3 CHECK PARTITION p1;

This command will tell you whether the data or indexes in
partition p1 of table
t1 are corrupted. If this is the case,
use
ALTER
TABLE ... REPAIR PARTITION to repair the
partition.

Normally, CHECK PARTITION fails when the
partition contains duplicate key errors. You can use
ALTER
IGNORE TABLE with this option, in which case the
statement returns the contents of each row in the partition
where a duplicate key violation is found. Only the values
for the columns in the partitioning expression for the table
are reported. (Bug #16900947)

Each of the statements in the list just shown also supports the
keyword ALL in place of the list of partition
names. Using ALL causes the statement to act
on all partitions in the table.

The output from SHOW TABLE STATUS
for partitioned tables is the same as that for nonpartitioned
tables, except that the Create_options column
contains the string partitioned. The
Engine column contains the name of the
storage engine used by all partitions of the table. (See
Section 13.7.5.36, “SHOW TABLE STATUS Syntax”, for more information about
this statement.)

It is possible to determine which partitions of a partitioned
table are involved in a given
SELECT query using
EXPLAIN. The
partitions column in the
EXPLAIN output lists the
partitions from which records would be matched by the query.

In this case, all four partitions are searched. However, when a
limiting condition making use of the partitioning key is added
to the query, you can see that only those partitions containing
matching values are scanned, as shown here:

22.4 Partition Pruning

The optimization known as partition
pruning is based on a relatively simple concept which
can be described as “Do not scan partitions where there can
be no matching values”. Suppose a partitioned table
t1 is created by this statement:

It is easy to see that none of the rows which ought to be returned
are in either of the partitions p0 or
p3; that is, we need search only in partitions
p1 and p2 to find matching
rows. By limiting the search, it is possible to expend much less
time and effort in finding matching rows than by scanning all
partitions in the table. This “cutting away” of
unneeded partitions is known as
pruning. When the optimizer
can make use of partition pruning in performing this query,
execution of the query can be an order of magnitude faster than
the same query against a nonpartitioned table containing the same
column definitions and data.

The optimizer can perform pruning whenever a
WHERE condition can be reduced to either one of
the following two cases:

partition_column =
constant

partition_column IN
(constant1,
constant2, ...,
constantN)

In the first case, the optimizer simply evaluates the partitioning
expression for the value given, determines which partition
contains that value, and scans only this partition. In many cases,
the equal sign can be replaced with another arithmetic comparison,
including <, >,
<=, >=, and
<>. Some queries using
BETWEEN in the WHERE clause
can also take advantage of partition pruning. See the examples
later in this section.

In the second case, the optimizer evaluates the partitioning
expression for each value in the list, creates a list of matching
partitions, and then scans only the partitions in this partition
list.

Pruning can also be applied to short ranges, which the optimizer
can convert into equivalent lists of values. For instance, in the
previous example, the WHERE clause can be
converted to WHERE region_code IN (126, 127, 128,
129). Then the optimizer can determine that the first
two values in the list are found in partition
p1, the remaining two values in partition
p2, and that the other partitions contain no
relevant values and so do not need to be searched for matching
rows.

The optimizer can also perform pruning for
WHERE conditions that involve comparisons of
the preceding types on multiple columns for tables that use
RANGE COLUMNS or LIST
COLUMNS partitioning.

This type of optimization can be applied whenever the partitioning
expression consists of an equality or a range which can be reduced
to a set of equalities, or when the partitioning expression
represents an increasing or decreasing relationship. Pruning can
also be applied for tables partitioned on a
DATE or
DATETIME column when the
partitioning expression uses the
YEAR() or
TO_DAYS() function. Pruning can
also be applied for such tables when the partitioning expression
uses the TO_SECONDS() function.

Suppose that table t2, partitioned on a
DATE column, is created using the
statement shown here:

Scan only these two partitions and any partitions
that may lie between them.

In this case, this means that only partitions
d3, d4, and
d5 are scanned. The remaining partitions
may be safely ignored (and are ignored).

Important

Invalid DATE and DATETIME
values referenced in the WHERE condition of a
statement against a partitioned table are treated as
NULL. This means that a query such as
SELECT * FROM
partitioned_table WHERE
date_column <
'2008-12-00' does not return any values (see Bug
#40972).

So far, we have looked only at examples using
RANGE partitioning, but pruning can be applied
with other partitioning types as well.

Consider a table that is partitioned by LIST,
where the partitioning expression is increasing or decreasing,
such as the table t3 shown here. (In this
example, we assume for the sake of brevity that the
region_code column is limited to values between
1 and 10 inclusive.)

For a statement such as SELECT * FROM t3 WHERE
region_code BETWEEN 1 AND 3, the optimizer determines in
which partitions the values 1, 2, and 3 are found
(r0 and r1) and skips the
remaining ones (r2 and r3).

For tables that are partitioned by HASH or
[LINEAR] KEY, partition pruning is also
possible in cases in which the WHERE clause
uses a simple = relation against a column used
in the partitioning expression. Consider a table created like
this:

A statement that compares a column value with a constant can be
pruned:

UPDATE t4 WHERE region_code = 7;

Pruning can also be employed for short ranges, because the
optimizer can turn such conditions into IN
relations. For example, using the same table t4
as defined previously, queries such as these can be pruned:

SELECT * FROM t4 WHERE region_code > 2 AND region_code < 6;
SELECT * FROM t4 WHERE region_code BETWEEN 3 AND 5;

In both these cases, the WHERE clause is
transformed by the optimizer into WHERE region_code IN
(3, 4, 5).

Important

This optimization is used only if the range size is smaller than
the number of partitions. Consider this statement:

DELETE FROM t4 WHERE region_code BETWEEN 4 AND 12;

The range in the WHERE clause covers 9 values
(4, 5, 6, 7, 8, 9, 10, 11, 12), but t4 has
only 8 partitions. This means that the DELETE
cannot be pruned.

When a table is partitioned by HASH or
[LINEAR] KEY, pruning can be used only on
integer columns. For example, this statement cannot use pruning
because dob is a
DATE column:

SELECT * FROM t4 WHERE dob >= '2001-04-14' AND dob <= '2005-10-15';

However, if the table stores year values in an
INT column, then a query having
WHERE year_col >= 2001 AND year_col <=
2005 can be pruned.

Tables using a storage engine that provides automatic
partitioning, such as the NDB storage engine
used by MySQL Cluster (not currently supported in MySQL
8.0), can be pruned if they are explicitly
partitioned.

22.5 Partition Selection

Explicit selection of partitions and subpartitions for rows
matching a given WHERE condition is supported.
Partition selection is similar to partition pruning, in that only
specific partitions are checked for matches, but differs in two
key respects:

The partitions to be checked are specified by the issuer of
the statement, unlike partition pruning, which is automatic.

Whereas partition pruning applies only to queries, explicit
selection of partitions is supported for both queries and a
number of DML statements.

The remainder of this section discusses explicit partition
selection as it applies generally to the statements just listed,
and provides some examples.

Explicit partition selection is implemented using a
PARTITION option. For all supported statements,
this option uses the syntax shown here:

PARTITION (partition_names)
partition_names:
partition_name, ...

This option always follows the name of the table to which the
partition or partitions belong.
partition_names is a comma-separated
list of partitions or subpartitions to be used. Each name in this
list must be the name of an existing partition or subpartition of
the specified table; if any of the partitions or subpartitions are
not found, the statement fails with an error (partition
'partition_name' doesn't
exist). Partitions and subpartitions named in
partition_names may be listed in any
order, and may overlap.

When the PARTITION option is used, only the
partitions and subpartitions listed are checked for matching rows.
This option can be used in a SELECT
statement to determine which rows belong to a given partition.
Consider a partitioned table named employees,
created and populated using the statements shown here:

The result is the same as obtained by the query SELECT *
FROM employees WHERE id BETWEEN 5 AND 9.

To obtain rows from multiple partitions, supply their names as a
comma-delimited list. For example, SELECT * FROM
employees PARTITION (p1, p2) returns all rows from
partitions p1 and p2 while
excluding rows from the remaining partitions.

Any valid query against a partitioned table can be rewritten with
a PARTITION option to restrict the result to
one or more desired partitions. You can use
WHERE conditions, ORDER BY
and LIMIT options, and so on. You can also use
aggregate functions with HAVING and
GROUP BY options. Each of the following queries
produces a valid result when run on the
employees table as previously defined:

Statements using partition selection can be employed with tables
using any of the supported partitioning types. When a table is
created using [LINEAR] HASH or
[LINEAR] KEY partitioning and the names of the
partitions are not specified, MySQL automatically names the
partitions p0, p1,
p2, ...,
pN-1, where
N is the number of partitions. For
subpartitions not explicitly named, MySQL assigns automatically to
the subpartitions in each partition
pX the names
pXsp0,
pXsp1,
pXsp2, ...,
pXspM-1,
where M is the number of subpartitions.
When executing against this table a
SELECT (or other SQL statement for
which explicit partition selection is allowed), you can use these
generated names in a PARTITION option, as shown
here:

You can explicitly select partitions (or subpartitions, or both)
from any or all of the tables in a join. (The
PARTITION option used to select partitions from
a given table immediately follows the name of the table, before
all other options, including any table alias.) For example, the
following query gets the name, employee ID, department, and city
of all employees who work in the Sales or Delivery department
(partition p1 of the
departments table) at the stores in either of
the cities of Nambucca and Bellingen (partition
p0 of the stores table):

When the PARTITION option is used with
DELETE statements, only those
partitions (and subpartitions, if any) listed with the option are
checked for rows to be deleted. Any other partitions are ignored,
as shown here:

Only the two rows in partitions p0 and
p1 matching the WHERE
condition were deleted. As you can see from the result when the
SELECT is run a second time, there
remains a row in the table matching the WHERE
condition, but residing in a different partition
(p2).

UPDATE statements using explicit
partition selection behave in the same way; only rows in the
partitions referenced by the PARTITION option
are considered when determining the rows to be updated, as can be
seen by executing the following statements:

For statements that write multiple rows to a partitioned table
that using the InnoDB storage engine:
If any row in the list following VALUES cannot
be written to one of the partitions specified in the
partition_names list, the entire
statement fails and no rows are written. This is shown for
INSERT statements in the following
example, reusing the employees table created
previously:

Arithmetic and logical operators.
Use of the arithmetic operators
+,
-, and
* is permitted in
partitioning expressions. However, the result must be an integer
value or NULL (except in the case of
[LINEAR] KEY partitioning, as discussed
elsewhere in this chapter; see
Section 22.2, “Partitioning Types”, for more information).

The bit operators
|,
&,
^,
<<,
>>, and
~ are not
permitted in partitioning expressions.

Server SQL mode.
Tables employing user-defined partitioning do not preserve the
SQL mode in effect at the time that they were created. As
discussed elsewhere in this Manual (see
Section 5.1.8, “Server SQL Modes”), the results of many MySQL functions
and operators may change according to the server SQL mode.
Therefore, a change in the SQL mode at any time after the
creation of partitioned tables may lead to major changes in the
behavior of such tables, and could easily lead to corruption or
loss of data. For these reasons, it is strongly
recommended that you never change the server SQL mode after
creating partitioned tables.

Examples.
The following examples illustrate some changes in behavior of
partitioned tables due to a change in the server SQL mode:

Error handling.
As discussed elsewhere, handling of “special”
values such as zero and NULL can differ
between different server SQL modes (see
Section 5.1.8, “Server SQL Modes”). For example,
ERROR_FOR_DIVISION_BY_ZERO
can affect whether or not 0 can be inserted as a value into
a table whose paritioning expression uses
column DIV
value or
column MOD
value.

Table accessibility.
Sometimes a change in the server SQL mode can make
partitioned tables unusable. The following
CREATE TABLE statement can be
executed successfully only if the
NO_UNSIGNED_SUBTRACTION
mode is in effect:

Server SQL modes also impact replication of partitioned tables.
Disparate SQL modes on master and slave can lead to partitioning
expressions being evaluated differently; this can cause the
distribution of data among partitions to be different in the
master's and slave's copies of a given table, and may
even cause inserts into partitioned tables that succeed on the
master to fail on the slave. For best results, you should always
use the same server SQL mode on the master and on the slave.

Performance considerations.
Some effects of partitioning operations on performance are given
in the following list:

File system operations.
Partitioning and repartitioning operations (such as
ALTER
TABLE with PARTITION BY ...,
REORGANIZE PARTITION, or REMOVE
PARTITIONING) depend on file system operations for
their implementation. This means that the speed of these
operations is affected by such factors as file system type
and characteristics, disk speed, swap space, file handling
efficiency of the operating system, and MySQL server options
and variables that relate to file handling. In particular,
you should make sure that
large_files_support is
enabled and that
open_files_limit is set
properly. Partitioning and repartitioning operations
involving InnoDB tables may be made more
efficient by enabling
innodb_file_per_table.

Table locks.
Generally, the process executing a partitioning operation on
a table takes a write lock on the table. Reads from such
tables are relatively unaffected; pending
INSERT and
UPDATE operations are
performed as soon as the partitioning operation has
completed. For InnoDB-specific exceptions
to this limitation, see
Section 15.12.7, “Online DDL for Partitioned Tables”.

Indexes; partition pruning.
As with nonpartitioned tables, proper use of indexes can
speed up queries on partitioned tables significantly. In
addition, designing partitioned tables and queries on these
tables to take advantage of
partition pruning can
improve performance dramatically. See
Section 22.4, “Partition Pruning”, for more
information.

Performance with LOAD DATA.
In MySQL 8.0, LOAD
DATA uses buffering to improve performance. You
should be aware that the buffer uses 130 KB memory per
partition to achieve this.

Maximum number of partitions.
In MySQL 8.0, the maximum possible number of
partitions for a given table is 8192. This number includes
subpartitions.

If, when creating tables with a large number of partitions (but
less than the maximum), you encounter an error message such as
Got error ... from storage engine: Out of resources
when opening file, you may be able to address the
issue by increasing the value of the
open_files_limit system variable.
However, this is dependent on the operating system, and may not be
possible or advisable on all platforms; see
Section B.5.2.17, “File Not Found and Similar Errors”, for more information.
In some cases, using large numbers (hundreds) of partitions may
also not be advisable due to other concerns, so using more
partitions does not automatically lead to better results.

Query cache not supported.
The query cache is not supported for partitioned tables, and is
automatically disabled for queries involving partitioned tables.
The query cache cannot be enabled for such queries.

Foreign keys not supported for partitioned InnoDB tables.
Partitioned tables using the InnoDB
storage engine do not support foreign keys. More specifically,
this means that the following two statements are true:

No InnoDB table definition may contain a
foreign key reference to a user-partitioned table; no
InnoDB table with user-defined partitioning
may contain columns referenced by foreign keys.

The scope of the restrictions just listed includes all tables that
use the InnoDB storage engine.
CREATE
TABLE and ALTER TABLE
statements that would result in tables violating these
restrictions are not allowed.

ALTER TABLE ... ORDER BY.
An ALTER TABLE ... ORDER BY
column statement run
against a partitioned table causes ordering of rows only within
each partition.

Data type of partitioning key.
A partitioning key must be either an integer column or an
expression that resolves to an integer. Expressions employing
ENUM columns cannot be used. The
column or expression value may also be NULL;
see Section 22.2.7, “How MySQL Partitioning Handles NULL”.

There are two exceptions to this restriction:

When partitioning by [LINEAR]
KEY, it is possible to use columns of any
valid MySQL data type other than
TEXT or
BLOB as partitioning keys,
because the internal key-hashing functions produce the correct
data type from these types. For example, the following two
CREATE TABLE statements are
valid:

Neither of the preceding exceptions applies to
BLOB or
TEXT column types.

Subqueries.
A partitioning key may not be a subquery, even if that subquery
resolves to an integer value or NULL.

Issues with subpartitions.
Subpartitions must use HASH or
KEY partitioning. Only
RANGE and LIST partitions
may be subpartitioned; HASH and
KEY partitions cannot be subpartitioned.

SUBPARTITION BY KEY requires that the
subpartitioning column or columns be specified explicitly, unlike
the case with PARTITION BY KEY, where it can be
omitted (in which case the table's primary key column is used
by default). Consider the table created by this statement:

However, the following statement that attempts to create a
subpartitioned table using the default column as the
subpartitioning column fails, and the column must be specified for
the statement to succeed, as shown here:

In addition, you can use ALTER TABLE ... REBUILD
PARTITION to rebuild one or more partitions of a
partitioned table; ALTER TABLE ... REORGANIZE
PARTITION also causes partitions to be rebuilt. See
Section 13.1.7, “ALTER TABLE Syntax”, for more information about these
two statements.

22.6.1 Partitioning Keys, Primary Keys, and Unique Keys

This section discusses the relationship of partitioning keys
with primary keys and unique keys. The rule governing this
relationship can be expressed as follows: All columns used in
the partitioning expression for a partitioned table must be part
of every unique key that the table may have.

In other words, every unique key on the table must use
every column in the table's partitioning
expression. (This also includes the table's
primary key, since it is by definition a unique key. This
particular case is discussed later in this section.) For
example, each of the following table creation statements is
invalid:

The CREATE TABLE statement fails
because both col1 and col3
are included in the proposed partitioning key, but neither of
these columns is part of both of unique keys on the table. This
shows one possible fix for the invalid table definition:

If a table has no unique keys—this includes having no
primary key—then this restriction does not apply, and you
may use any column or columns in the partitioning expression as
long as the column type is compatible with the partitioning
type.

For the same reason, you cannot later add a unique key to a
partitioned table unless the key includes all columns used by
the table's partitioning expression. Consider the partitioned
table created as shown here:

In the case of np_pk, the only column that
may be used as part of a partitioning expression is
id; if you wish to partition this table using
any other column or columns in the partitioning expression, you
must first modify the table, either by adding the desired column
or columns to the primary key, or by dropping the primary key
altogether.

22.6.2 Partitioning Limitations Relating to Storage Engines

In MySQL 8.0, partitioning support is not actually
provided by the MySQL Server, but rather by a table storage
engine's own or native partitioning handler. In MySQL
8.0, only the InnoDB
storage engine provides a native partitioning handler. This
means that partitioned tables cannot be created using any other
storage engine.

Note

MySQL Cluster's NDB storage engine
also provides native partitioning support, but is not
currently supported in MySQL 8.0.

Upgrading partitioned tables.
When performing an upgrade, tables which are partitioned by
KEY must be dumped and reloaded.
Partitioned tables using storage engines other than
InnoDB cannot be upgraded from MySQL 5.7 or
earlier to MySQL 8.0 or later; you must either drop the
partitioning from such tables with ALTER TABLE ...
REMOVE PARTITIONING or convert them to
InnoDB using ALTER TABLE ...
ENGINE=INNODB prior to the upgrade.

CEILING() and FLOOR().
Each of these functions returns an integer only if it is
passed an argument of an exact numeric type, such as one of
the INT types or
DECIMAL. This means, for
example, that the following CREATE
TABLE statement fails with an error, as shown here:

EXTRACT() function with WEEK specifier.
The value returned by the
EXTRACT() function, when used
as EXTRACT(WEEK FROM
col), depends on the
value of the
default_week_format system
variable. For this reason,
EXTRACT() is not permitted as a
partitioning function when it specifies the unit as
WEEK. (Bug #54483)