WL#9507: InnoDB: Make the number of undo tablespaces and rollback segments dynamic

Requirements
============
There are two primary goals of this worklog:
1. Initiate the process of moving rollback segments and undo logs away from the
system tablespace. The following two tasks are listed in their own worklogs for
tracking purposes, but will be done within this work.
WL#10498: Change the default value of innodb-undo-tablespaces from 0 to 2. We
need at least 2 undo tablespaces for undo truncation to work.
WL#10499: Change the default value for innodb_undo_log_truncate to true.
The goal is that the system tablespace will no longer be used for rollback
segments. So at a later time we should change the minimum value of
innodb_undo_tablespaces to 2.
2. Allow the number of undo tablespaces to be set at runtime using the
existing interface, which is the setting innodb-undo-tablespaces. This work
prepares the way for WL#9508, which introduces CREATE UNDO TABLESPACE.
Tasks
=====
1. This worklog will include a redesign of the way rollback segments are found
in undo tablespaces. This will change the meaning of the rollback segment
identifier associated with each modified clustered index record. In addition, it
will create a new rollback segment mapping page in each undo tablespace it
creates and give that tablespace a new default name so it can be distinguished
from the old style.
2. This worklog will allow the setting innodb_undo_tablespaces to be increased
or decreased at runtime or at startup. If increased, new undo tablespaces will
be created and then atomically added to the list of active undo tablespaces. If
decreased, the list of active undo tablespaces is atomically reduced in size.
Existing undo tablespaces will continue to be used by existing transactions and
eventually will become inactive. Version 8.0 style undo tablespaces will not be
deleted. In this way, innodb_undo_tablespaces can be easily increased again.
Note that older undo tablespaces are deleted after they are replaced by v8.0
undo tablespaces.
If innodb_undo_tablespaces is set to 0, which is the default for v5.7, this
means that the system tablespace is used for rollback segments.
If innodb_undo_tablespaces is set at runtime from 0 to a number between 1 and
127, new undo tablespaces will be created with the specified number of rollback
segments. Once they are ready to use, new transactions will be assigned to
these rollback segments on a round robin basis and rollback segments in the
system tablespace will no longer be assigned. Once all the transactions using
undo logs in the system tablespace have been fully purged, the rollback segments
there will only consist of one page per rollback segment.
If innodb_undo_tablespaces is > 0 and then set to 0 at runtime, then the
rollback segments in the system tablespace will be used for new transactions and
the existing undo tablespaces will eventually become unused. But unused v8.0
undo tablespaces will not be deleted in case the setting is changed again.
The default setting for innodb_rollback_segments will stay at 128. v5.7 reserved
32 of the 128 rollback segment slots available in the system tablespace for
temporary rollback segments physically found in the temporary tablespace.
3. This worklog will use innodb_rollback_segments to indicate how many rollback
segments are in each tablespace. That includes the temporary tablespace, the
system tablespace, and each undo tablespace. Using the default setting of 128,
combined with innodb_undo_tablespaces=0, it is possible that rollback segments
will need to be added when innodb_undo_tablespaces is changed.
4. The setting innodb_rollback_segments will also be adjustable at startup and
at runtime.
5. In short, this worklog will add the ability to add new undo tablespaces and
rollback segments at any time. But it will not delete any undo tablespaces or
rollback segments in case they are needed in the future.
6. This worklog will add the ability to upgrade undo tablespaces from
existing databases to the new undo tablespaces mentioned in task #1. Any old
undo tablespaces in a 5.7 database will be replaced by v8.0 undo tablespaces.

FR1: (WL#10498) Change the default value for innodb-undo-tablespaces to 2.
FR2: (WL#10499) Change the default value for innodb-undo-log-truncate to true.
FR3: innodb-undo-tablespaces can be changed at runtime or offline to any value
from 0 to 127.
FR4: A deprecation message is put to the error log if innodb_undo_tablespaces is
set to 0, either during runtime or at startup. It will look like this:
[Info] InnoDB: Setting 'innodb_undo_tablespaces' to 0 is deprecated and will
not be supported in a future release.
FR5: innodb-rollback-segments can be changed at runtime or offline to any
value from 1 to 128. V5.7 allowed this to be reduced from what was originally
created. Now it can be increased to a higher number.
FR6: The setting 'innodb-rollback-segments' is no longer a database wide limit.
It now controls the number of rollback segments per tablespace. That includes
the temporary tablespace, the system tablespace (if innodb_undo_tablespaces=0),
and every undo tablespace (if innodb_undo_tablespaces > 0).
FR7: A v5.7 database will be upgraded when a v8.0 server is started on
it. That database can be the result of either a normal or fast shutdown
(innodb_fast_shutdown=0 or =1).
FR7a: If this upgrade occurs with innodb_undo_tablespaces=0, 2 undo tablespaces
will be used for that instance of InnoDB and this message will be put into the
log:
[Info] InnoDB: Database upgrade will use 2 undo tablespaces instead of 0.
Please set innodb_undo_tablespaces=2 or higher.
FR8: Implicit external undo tablespaces are named "undo_nnn" where nnn is the
undo_space_num. The space_id is related to the undo_space_num like this:
undo_space_num = 0xFFFFFFF0 - undo_space_id
undo_space_id = 0xFFFFFFF0 - undo_space_num
FR9: The setting innodb_undo_logs is deleted in this worklog. So if it is used,
there will be an error message. The setting is deprecated in 5.7 with WL#10322.

This worklog creates the infrastructure to create and drop undo tablespaces at
runtime. It also establishes a new undo tablespace file format and the upgrade
procedures associated with that. All this is done without requiring any new
interfaces. This task will allow the number of Undo Tablespaces to be changed
during runtime by changing the value of innodb_undo_tablespaces.

Terminology

RSEG = structure in memory to represent a rollback segment in a datafile
rseg_id = a number from 0 to 127 that is put into 7 bits of a rollback pointer

WHY: These are not needed for recovery so the mapping from rseg_id to a
Rollback Segment Header Page can be done in memory only. They will never need to
be persistent. So they need to be tracked separately. There are only 128
numbers that can be tracked in the rollback pointer for every clustered record
change. There is no need to reserve any of those numbers for changes to
temporary tables.

2) After upgrading from a 5.7 system tablespace that does not use undo
tablespaces, there will be gaps in the numbering of RSEG IDs from 1 to 32 in
existing databases (new databases will be initialized without that gap). This
vector of durable RSEGs will eliminate that 32 slot gap so that round robin
traversal of this list will not slow down.

Task #3. Switch from a fixed array to a std::vector to list space_id_t for undo
tablespaces.

For reasons similar to those described above, this list will go from a fixed
unchanging size to a dynamic list as tablespaces are added and dropped. This
list of undo tablespaces will become the source of the round robin algorithm to
assign RSEG objects to transactions once we do item 4 below.

WHY: When undo tablespaces are randomly dropped by DROP UNDO TABLESPACE there
will be holes in this array. The worst case is if only slots 1 and 127 contain
undo tablespaces, so that every other transaction will need to traverse 125
empty slots looking for an available rollback segment. A vector allows faster
access since it is sized to the actual number of entries and has no empty slots.
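
As an illustration of the worst case described above, the following Python
sketch counts how many empty slots a round robin scan crosses in a sparse
128-slot array compared with a compact list. The function name and the slot
contents are hypothetical and are not InnoDB code.

```python
def scan_to_next(slots, start):
    """Round robin scan: return (index, empties_crossed) for the next
    occupied slot after position `start`, wrapping around the array."""
    n = len(slots)
    empties = 0
    for step in range(1, n + 1):
        idx = (start + step) % n
        if slots[idx] is not None:
            return idx, empties
        empties += 1
    raise LookupError("no occupied slot")


# Fixed array with holes: only slots 1 and 127 hold undo tablespaces.
sparse = [None] * 128
sparse[1] = "undo_a"    # hypothetical entries
sparse[127] = "undo_b"

# Compact list: the same two entries with no holes.
compact = ["undo_a", "undo_b"]

print(scan_to_next(sparse, 1))    # scan from slot 1 crosses slots 2..126
print(scan_to_next(compact, 0))   # compact list has nothing to skip
```

In the sparse array, the scan from slot 1 to slot 127 crosses 125 empty slots,
while the compact list never crosses any.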

The following tasks will be done in this worklog:

Task #4. Change the way rollback segments are located by purge, MVCC, transaction rollback and rollback segment assignment at start transaction time.

MVCC and transaction rollback only need to know the space_id associated with a
7-bit rseg_id in a rollback pointer. But purge and start transaction need to
find the Rollback Segment Header Page for each rollback segment. Both the
space_id and page_num are currently found in the TRX_SYS page in a 256 slot
array of which only the first 128 slots are used in v5.6 and v5.7.

We should drop this dependency on the TRX_SYS page and the system tablespace by
making a way to translate a 7-bit number into a space_id for MVCC and
transaction rollback, and making a way to find each rollback segment within an
undo tablespace so that purge and start transaction can do their jobs.

See LLD for Details.

WHY: 1) It will be a good thing to reduce dependency on the system tablespace.
2) Increase the number of possible rollback segments. Note that it is
questionable if any customer has ever run out of 128 rollback segments
since each rollback segment can contain about 4000 Undo Segments.
But using more of them may be better in a highly concurrent environment.
3) Make Rollback Segments independent of transaction rollback pointers.
4) Make discovery, translation and identification of undo space IDs quick,
easy and deterministic, even during redo recovery.

Each undo tablespace will have its own list of rollback segments. So the
algorithm for assigning a rollback segment to a transaction will need to choose
a tablespace first by round robin and then a rollback segment by round-robin.

By preallocating these vectors and only adding new rollback segments to the back
of the vector when in concurrent mode, these vectors will not need any new
latches. See LLD for a description.

WHY: What used to be a fixed size array can have holes and will be changing
size when undo tablespaces can be added and dropped.
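
The two-level round robin described above (choose a tablespace first, then a
rollback segment within it) could look roughly like the following Python
sketch. The class names and the (space_id, index) representation of a rollback
segment are assumptions made for illustration. The sketch is single threaded
and ignores the latch-free publication details left to the LLD.

```python
class UndoSpace:
    """Hypothetical in-memory view of one undo tablespace."""

    def __init__(self, space_id, n_rsegs):
        self.space_id = space_id
        # Each rollback segment is represented as (space_id, index).
        self.rsegs = [(space_id, i) for i in range(n_rsegs)]
        self.next_rseg = 0


class RsegAssigner:
    """Pick a tablespace by round robin, then a rollback segment
    inside that tablespace by round robin."""

    def __init__(self, spaces):
        self.spaces = spaces
        self.next_space = 0

    def assign(self):
        space = self.spaces[self.next_space]
        self.next_space = (self.next_space + 1) % len(self.spaces)
        rseg = space.rsegs[space.next_rseg]
        space.next_rseg = (space.next_rseg + 1) % len(space.rsegs)
        return rseg


# Two undo tablespaces with two rollback segments each (made-up sizes).
assigner = RsegAssigner([UndoSpace(0xFFFFFFEF, 2), UndoSpace(0xFFFFFFEE, 2)])
for _ in range(5):
    space_id, rseg_index = assigner.assign()
    print(hex(space_id), rseg_index)
```

Consecutive transactions alternate between the tablespaces, and each
tablespace hands out its own rollback segments in turn.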

Task #6. Deprecate the innodb_undo_logs setting in the latest v5.7 (WL#10322)
and delete this setting in this worklog for v8.0. Use innodb_rollback_segments
to specify the number of rollback segments that is used within each undo
tablespace created. The default will remain 128 so that even when
innodb-undo-tablespaces=0, which is the default in v5.7, the total number of
rollback segments will not be less than in v5.7.

WHY: 1) These two settings alias each other. The user does not know which will
be used if one is different from the other. The second alias setting called
innodb_undo_logs uses the wrong name. They actually control the number of
Rollback Segments. And innodb_rollback_segments is the setting that reports a
deprecation warning!

2) For 5.6 and 5.7, the minimum value accepted is 1, which is OK if you do
not use innodb_undo_tablespaces > 0, which is the default. But if you try to
use extra undo tablespaces, and you forget to set innodb_undo_logs or
innodb_rollback_segments to at least 33 + the number of undo tablespaces you
want, then the undo tablespaces will be created but not used.

3) innodb_rollback_segments will change meaning from the number of rollback
segments in a bootstrapped database to the number of rollback segments in a
tablespace.

If innodb-undo-tablespaces is increased, create the requested number of new undo
tablespaces and fill them with the set number of rollback segments. If
innodb-undo-tablespaces is decreased, reduce the active number of undo
tablespaces used to assign rollback segments to a new transaction. Keep the
inactive undo tablespaces in the vector, even after they are no longer being
used.

If innodb-rollback-segments is increased, create the requested number of
rollback segments in each known undo tablespace, or in the system tablespace if
innodb-undo-tablespaces=0. If innodb-rollback-segments is decreased, reduce the
active number of rollback segments used to assign one to a new transaction.
Keep the inactive rollback segments in the vectors, even after they are no
longer being used. They will remain in the tablespace and use only one empty
header page.

WHY: 1) This code is the core ability to add tablespaces. It will be necessary
for CREATE UNDO TABLESPACE.

2) If a database is initialized with innodb-undo-tablespaces > 1, only one
rollback segment will be created in the system tablespace. Then if
innodb-undo-tablespaces is set to 0 at runtime, we will need to be able to
create the necessary number of rollback segments in the system tablespace.
Since this routine is necessary, it is very little extra effort to also be able
to add rollback segments to any undo tablespace in response to a change in
innodb-rollback-segments. This will provide a more thorough user experience.

Task #8. Create an upgrade path for existing databases.

When InnoDB starts up with an existing database from an older engine, it will
attempt to upgrade the old data dictionary to the new DD. As part of this
process, the undo tablespaces and rollback segments will
also be upgraded. The tricky part is if there are existing undo logs in the
existing rollback segments. We will allow the purge threads to process
these undo logs before the new rollback segments and undo tablespaces are used.

But we must not make any changes to the TRX_SYS page or delete any old undo
tablespaces until the upgrade process has succeeded.

Fortunately, new undo tablespaces do not need the TRX_SYS page. So they can be
created at the normal time in the startup process. They must be created early
in the upgrade process since they will be used for creating the new DD and
upgrading the user tables. And they can be created before the existing undo logs
are purged because these new undo tablespaces do not get in the way of the purge
process, nor do they change any existing rollback segment references.

If the upgrade process fails after new undo tablespaces have been created, they
will not be seen by an earlier engine. They can be deleted by the user. If an
upgrade is attempted again, they will be used. If they do not yet have enough
rollback segments, more will be added during startup.

During the purge of old undo logs, a flag will be used to indicate that undo
logs are being purged. While this flag is set, the interpretation of the 7-bit
rollback pointer is done as if it is a slot in the TRX_SYS page instead of an
undo space number that can be converted directly to an undo space ID. After
purge of old undo is complete and that flag is turned off, rollback pointers
will be interpreted the new way.
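
A minimal Python sketch of this flag-dependent interpretation follows. The
function name, the dictionary standing in for the TRX_SYS slot array, and the
slot values are hypothetical. Only the formula
undo_space_id = 0xFFFFFFF0 - undo_space_num comes from this worklog.

```python
SRV_LOG_SPACE_FIRST_ID = 0xFFFFFFF0  # value given in this worklog


def rseg_id_to_space_id(rseg_id, old_undo_found, trx_sys_slots):
    """Translate the 7-bit rseg_id from a rollback pointer to a space_id.

    While old undo logs are still being purged, the id is a slot in the
    TRX_SYS page. Afterwards it is an undo space number.
    """
    if not 0 <= rseg_id <= 127:
        raise ValueError("rseg_id must fit in 7 bits")
    if old_undo_found:
        space_id, _page_no = trx_sys_slots[rseg_id]
        return space_id
    if rseg_id == 0:
        return 0  # system tablespace
    return SRV_LOG_SPACE_FIRST_ID - rseg_id


# Made-up TRX_SYS contents: slot -> (space_id, rseg header page number).
old_slots = {0: (0, 6), 33: (1, 3), 34: (2, 3)}

print(rseg_id_to_space_id(33, True, old_slots))        # old interpretation
print(hex(rseg_id_to_space_id(33, False, old_slots)))  # new interpretation
```

The same 7-bit value resolves to different tablespaces depending on the flag,
which is why the flag must stay set until purge of the old undo is complete.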

Even though the default value in this worklog is innodb_undo_tablespaces=2, the
minimum accepted value is still 0. We will set that minimum value to 2 at a
later time.
Until then, if MySQL is started with innodb_undo_tablespaces=0 and upgrade is
needed, we will ignore that value and use innodb_undo_tablespaces=2. This will
prevent the TRX_SYS page from being modified during the upgrade process, so
nothing needs to be 'undone' to the TRX_SYS page if the upgrade fails.

Task #9. Optional but not recommended: Delete unused rollback segments in the
system tablespace.

If the value of innodb-undo-tablespaces changes from 0 to >0 at runtime, the
engine already has code that stops using the rollback segments in the system
tablespace. This happens lazily by assigning to each new transaction only
rollback segments from undo tablespaces. Eventually, all transactions that were
using rollback segments in the system tablespace will be completed and those
rollback segments will no longer be used.

It has been suggested that unused rollback segments in the system tablespace
should be removed. They only consist of an inode and a header page. With
default settings, these empty rollback segments will use up 128 slots in the
TRX_SYS page, 128 inodes, and 128 empty rollback segment header pages. This
overhead could be deleted from the system tablespace to save space. But it
might be handy to keep them there in case all the undo tablespaces are dropped
at runtime. In this worklog that could happen if innodb-undo-tablespaces is set
back to 0 at runtime. If the existing rollback segments in the system
tablespace are deleted, then it should no longer be possible to set
innodb_undo_tablespaces=0 or drop the last undo tablespace.

WHY: The idea is that when a bunch of new undo tablespaces are created, we need
to stop using the rollback segments in the system tablespace. That much is
already being done and will continue. The question is whether to delete these
header pages and make the inodes and pages available for reuse. But a more
important reason to quit using these pages is to be able to free up space used
by the system tablespace back to the file system. This can only be done by
optimizing or rebuilding the system tablespace.

Task #2. Detail - Switch from a fixed array of RSEG pointers to a std::vector
We will preset the size of this vector to 128 so that it does not need to be
resized. This allows new entries to continue to be added without mutex
protection.
The list of rollback segments is accessed in 2 ways:
1. By the purge thread traversing from beginning to end
2. At the start of each transaction. The list of rollback segments is
traversed from a starting point using a round-robin algorithm.
It is not necessary to access the list of rollback segments when undoing records
for MVCC or rollback. The rollback pointer contains a page number and offset of
a specific undo record, plus a 7-bit number called an rseg_id. The rseg_id is
used to find the space_id of the tablespace containing that undo record.
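
To make the contents of a rollback pointer concrete, here is a small Python
sketch that packs and unpacks one. The exact bit layout (a 1-bit insert flag,
the 7-bit rseg_id, a 32-bit page number and a 16-bit offset in a 7-byte value)
is an assumption that is not stated in this worklog, which only says that the
pointer holds a page number, an offset and a 7-bit rseg_id. The function names
are hypothetical.

```python
def build_roll_ptr(is_insert, rseg_id, page_no, offset):
    """Pack the fields into a 56-bit integer (assumed layout)."""
    if not 0 <= rseg_id < (1 << 7):
        raise ValueError("rseg_id must fit in 7 bits")
    if not 0 <= page_no < (1 << 32):
        raise ValueError("page_no must fit in 32 bits")
    if not 0 <= offset < (1 << 16):
        raise ValueError("offset must fit in 16 bits")
    return (int(bool(is_insert)) << 55) | (rseg_id << 48) | (page_no << 16) | offset


def decode_roll_ptr(roll_ptr):
    """Unpack a value produced by build_roll_ptr()."""
    offset = roll_ptr & 0xFFFF
    page_no = (roll_ptr >> 16) & 0xFFFFFFFF
    rseg_id = (roll_ptr >> 48) & 0x7F
    is_insert = bool((roll_ptr >> 55) & 1)
    return is_insert, rseg_id, page_no, offset


ptr = build_roll_ptr(False, 5, 1234, 567)
print(decode_roll_ptr(ptr))
```

Under this layout, MVCC and rollback need only the rseg_id to pick the
tablespace, after which the page number and offset locate the undo record.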
Task #4 Detail - Change the way rollback segments are located
A Rollback Pointer contains a 7-bit identifier which is only used to find the
undo tablespace where the undo record is found. This 7-bit number is usually
referred to as the rseg_id, but we can also call it the undo_space_num.
An analysis of the InnoDB code reveals that the 7-bit rseg_id from a rollback
pointer is only used to find the space_id. The TRX_SYS page contains both the
space_id and the page number of the rollback segment header page for each
rseg_id.
If all undo tablespaces use fixed space_ids in a reserved range near
0xFFFFFFFF, then an rseg_id between 1 and 127 will map directly to an Undo
Tablespace ID and 0 will map to the system tablespace as before.
Redo log space IDs are reserved from 0xFFFFFFF0 to 0xFFFFFFFF (see
SRV_LOG_SPACE_FIRST_ID). So we will reserve numbers from 0xFFFFFFF0 - 1 to
0xFFFFFFF0 - 127 for undo tablespace IDs.
undo_space_num = 1 to 127
undo_space_id = 0xffffffef to 0xffffff71 or (uint32) -17 to (uint32) -143
SRV_LOG_SPACE_FIRST_ID = 0xfffffff0 or (uint32) -16
SPACE_UNKNOWN = 0xffffffff or (uint32) -1
undo_space_id = SRV_LOG_SPACE_FIRST_ID - undo_space_num
undo_space_num = SRV_LOG_SPACE_FIRST_ID - undo_space_id
The following chart shows this mapping:
RSEG_ID   SPACE_ID
=======   ==================
0         0
1         -17  (0xffffffef)
2         -18  (0xffffffee)
3         -19  (0xffffffed)
...       ...
127       -143 (0xffffff71)
These undo tablespaces can easily be discovered at startup since the space_ids
are at a fixed range and the predicted file name can be assembled just as it is
done now. CREATE UNDO TABLESPACE can create an unpredictable file
name with ADD DATAFILE. So we will need to implement a way to discover
these undo tablespaces not only using the data dictionary but also during
startup before redo recovery when the DD is not yet available.
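
The mapping between rseg_id (undo_space_num) and space_id given above can be
checked with a short Python sketch. The function names are hypothetical, and
the three-digit zero padding of the implicit "undo_nnn" name is an assumption.
The constants and formulas are the ones listed in this section.

```python
SRV_LOG_SPACE_FIRST_ID = 0xFFFFFFF0
SYSTEM_SPACE_ID = 0
MAX_UNDO_SPACE_NUM = 127


def undo_space_num_to_id(num):
    """rseg_id / undo_space_num -> space_id (0 stays the system tablespace)."""
    if num == 0:
        return SYSTEM_SPACE_ID
    if not 1 <= num <= MAX_UNDO_SPACE_NUM:
        raise ValueError("undo_space_num must be between 0 and 127")
    return SRV_LOG_SPACE_FIRST_ID - num


def undo_space_id_to_num(space_id):
    """space_id -> rseg_id / undo_space_num."""
    if space_id == SYSTEM_SPACE_ID:
        return 0
    num = SRV_LOG_SPACE_FIRST_ID - space_id
    if not 1 <= num <= MAX_UNDO_SPACE_NUM:
        raise ValueError("not a reserved undo space_id")
    return num


def as_int32(space_id):
    """Show an unsigned 32-bit space_id as the signed value used in the chart."""
    return space_id - (1 << 32) if space_id >= (1 << 31) else space_id


def implicit_undo_name(num):
    """Assumed formatting of the implicit tablespace name."""
    return "undo_%03d" % num


for num in (0, 1, 2, 3, 127):
    sid = undo_space_num_to_id(num)
    print(num, as_int32(sid), hex(sid))
```

Because the mapping is pure arithmetic, it works before the data dictionary is
available, which is what makes discovery during redo recovery deterministic.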
Purge can find a list of each Rollback Segment Header Page in an Undo Tablespace
by reading a page at a fixed page number that contains a list of Rollback
Segment Header page numbers, similar to the TRX_SYS page. This page could be
called an RSEG_ARRAY page type. The size of this array would be limited to
(page size - overhead) / 4 so it could potentially be very large compared to
what is available now. We could go from a max of 128 rollback segments to
~512,000 (128 undo tablespaces * ~4000 for 16k page size). Initially though, we
will set the default and maximum number of rollback segments per tablespace to
128 since that is the previous limit for the setting innodb-rollback-segments.
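
The capacity estimate above can be reproduced with a few lines of Python. The
overhead value is an assumption: it counts only a 38-byte page header and an
8-byte page trailer, so the real array page, which needs its own header
fields, would hold somewhat fewer entries. The function name is hypothetical.

```python
PAGE_HEADER_BYTES = 38   # assumed generic page header size
PAGE_TRAILER_BYTES = 8   # assumed generic page trailer size
SLOT_BYTES = 4           # one 4-byte page number per rollback segment


def rseg_array_capacity(page_size,
                        overhead=PAGE_HEADER_BYTES + PAGE_TRAILER_BYTES):
    """Upper bound on page numbers that fit in one array page."""
    return (page_size - overhead) // SLOT_BYTES


for page_size in (4096, 8192, 16384, 32768, 65536):
    print(page_size, rseg_array_capacity(page_size))

# Rough database-wide figure quoted in the text for 16k pages.
print(128 * 4000)
```

With a 16k page this gives a little over 4000 slots per tablespace, which is
consistent with the "~4000" used in the text.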
Task #5 Detail - Implement in memory storage for rollback segments
All undo tablespace and RSEG vectors will be preallocated to the maximum size of
128 slots. The number of slots in use may grow up to 128 but will never shrink.
When a setting is reduced, only the current active size of these vectors will
be reduced.
If innodb-undo-tablespaces or innodb-rollback-segments is reduced, purge will
continue to attempt to purge all undo tablespaces and rollback segments in each
vector. Even after these inactive undo tablespaces and rollback segments are
completely purged and empty of undo logs and segments, this worklog will not
attempt to delete them and take them out of the vectors. In this way, there
will be no need for any new mutexes. Only after a clean restart with a smaller
value for innodb_undo_tablespaces or innodb-rollback-segments will the vectors
hold fewer objects. The ability to delete a fully purged undo tablespace and
associated in-memory objects will be added later.
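
The bookkeeping described in this task (a preallocated list that only grows,
with a separate active size that can shrink) might be sketched in Python as
follows. The class and method names are hypothetical. The sketch shows only
the size accounting, not the memory ordering needed for latch-free readers.

```python
class PreallocatedSlots:
    """Fixed-capacity list: entries are appended but never removed.
    Only the 'active' count changes when a setting is reduced."""

    MAX_SLOTS = 128

    def __init__(self):
        self._slots = [None] * self.MAX_SLOTS  # preallocated, never resized
        self._size = 0     # entries created so far, never decreases
        self._active = 0   # entries eligible for new transactions

    def set_target(self, target, create):
        if not 0 <= target <= self.MAX_SLOTS:
            raise ValueError("target out of range")
        # Create any missing entries at the back, then publish them.
        while self._size < target:
            self._slots[self._size] = create(self._size)
            self._size += 1
        self._active = target

    def active(self):
        """Entries used when assigning to new transactions."""
        return self._slots[:self._active]

    def known(self):
        """Entries that purge must still visit, active or not."""
        return self._slots[:self._size]


rsegs = PreallocatedSlots()
rsegs.set_target(4, lambda i: "rseg_%d" % i)
rsegs.set_target(2, lambda i: "rseg_%d" % i)   # reduce: nothing is deleted
print(len(rsegs.active()), len(rsegs.known()))
```

Raising the setting again later reuses the inactive entries instead of
creating new ones, which matches the intent of keeping them in the vectors.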
Task #8. Detail - Create an upgrade path for existing databases.
Here are the steps involved in upgrading the undo tablespaces and rollback
segments:
1. Discover if upgrade is required. If so, set srv_is_upgrade_mode=true.
See innobase_init_files() just before calling srv_start().
2. If upgrading and innodb_undo_tablespaces=0, use innodb_undo_tablespaces=2
instead so that the TRX_SYS page does not need to be modified until upgrade is
successful.
See srv_start().
3. Discover all existing undo tablespaces whether old or new style.
See srv_start() -> srv_undo_tablespaces_open().
4. Create any new 8.0 undo tablespaces that will be needed.
See srv_start() -> srv_undo_tablespaces_create().
5. Find all existing rollback segments whether old or new and set up the
in-memory rseg objects.
See srv_start() -> trx_sys_init_at_db_start() -> trx_rsegs_init().
6. Search all existing rollback segments for undo logs, add to purge queue if
found.
See srv_start() -> trx_sys_init_at_db_start() -> trx_lists_init_at_db_start()
-> trx_resurrect().
7. If upgrading and undo logs are found, set srv_upgrade_old_undo_found=true.
See srv_start().
8. While this is true, make sure DB_ROLL_PTR is decoded the old way where the
7-bit rseg_id is a slot in the TRX_SYS page.
See trx_rseg_id_to_space_id().
9. Create new rollback segments in the new undo tablespaces from step 4 and add
rseg objects. These are empty and are not on the purge queue.
See srv_start() -> trx_rseg_adjust_rollback_segments().
10. Start purge threads and wait for purge to become empty.
See innobase_init_files() -> srv_start_purge_threads().
11. When purge becomes empty, set srv_upgrade_old_undo_found=false. From this
point forward, any change that creates an undo log will use the new undo
tablespaces and rollback segments.
See innobase_init_files() after waiting for purge to become empty.
12. Create Dictionary tables and finish the upgrade process.
See dd_upgrade_*.
13. When the upgrade succeeds, delete any old undo tablespaces and clean up the
slots in the TRX_SYS rseg array that pointed to rollback segments in those old
undo tablespaces. If the slots in the TRX_SYS page referred to rollback segments
in the system tablespace, leave them alone. This worklog does not delete
useable rollback segments.
See dd_upgrade_finish().
14. Set srv_is_upgrade_mode=false.
See dd_upgrade_finish().