WL#9467: Resource Groups

## GENERAL IDEA
The idea itself is pretty simple here :
1. Introduce global Resource Groups feature within MySQL Server
2. Each Resource Group has its dedicated *"Attributes"*
3. All the data/info/configs related to Resource Group are managed by MySQL Server (DBA actions)
4. The only user command or *"external"* API is allowed : assign / switch a given thread to a given Resource Group (no direct way to apply resource related action, only via Resource Group assignment)..
For this WL we're selecting the *"minimal"* required functionality to allow a user manually or semi-automatically (by external tools) to assign dedicated resources to a given query / session / thread. All configuration operations should be made dynamically via SQL commands and saved in internal tables (in *"mysql"* database, similar to users creation/delete/etc.), so all the applied changes will remain persistent over MySQL Server restart.
So we'll need to introduce and implement the following :
1. MySQL Resource Groups
2. Extend the current THREADS table in PFS schema with RESOURCE_GROUP column
3. Extend SQL with new "RESOURCE GROUP" related commands (allowing a manual
action, external tool action, etc.)
4. Add a new Query HINT (could be added by user app, Query Rewrite Plugin,
external tool like ProxySQL, etc.)
## RESOURCE GROUP ATTRIBUTES :
* Name
* CPU Affinity
* OS thread priority
* Group type ("system" or "user")
* Enabled flag (1 or 0)
## THREADS IN MYSQL :
1. Currently we have all threads already referenced in PFS THREADS table
2. Threads can be of 2 types :
+ FOREGROUND (FG) - "user" threads
+ BACKGROUND (BG) - "system" threads (internal Engine threads, e.g. "purge"in InnoDB, etc.)
3. Each thread has its corresponding functional name in the table (e.g.
"thread/innodb/page_cleaner_thread", etc. ) as well the thread_os_id.
4. We just need to add a new "resource_group" column to complete the whole picture
NOTE: If a thread is not referenced in PFS it cannot be addressed, so all MySQL/SE threads *must* be registered in PFS on creation!
## "SYSTEM" AND "USER" RESOURCE GROUPS :
+ Each Resource Group must have one of attributed types: "system" or "user"
+ Once the Group is created, the type cannot be changed anymore
+ The goal of having this type attribute is to protect MySQL "system" threads
from fighting for CPU resources with "user" threads
+ The idea here is the following :
+ Only "system" Resource Groups can have priority higher than 0.
+ Only "system" (BG) threads can be assigned to "system" Resource Groups.
+ "User" Resource Groups are allowed to have 0 (normal) priority or
lower only.
+ This approach will guarantees that never by mistake a user thread could
run on a higher priority than than Storage Engine itself.
## CONFIGURING RESOURCE GROUPS :
1. There will be one "user" group (USR_Default) and "system"
group (SYS_Default)
2. Both default groups are having 0 priority and no CPU affinity
3. These default groups attributes cannot be modified (and these groups
cannot be dropped either)
4. Any newly created user threads is automatically assigned to USR_Default.
5. Any "system" thread is assigned by default to SYS_Default.
6. Later we may decide to have more pre-defined groups created by default
(according different Storage Engines needs, etc. - for ex. to match NDB threads
CPU affinity requirements, as well Replication threads too, etc.)
7. However, by default MySQL Server is starting with just default groups, and
once online, the validation process of all Resource Groups is done and then
user and system threads are re-assigned to defined Resource Groups if needed
8. If one of Resource Groups did not pass "validation" check, it should be
flagged as disabled and warning message sent to MySQL log output, no threads can
be assigned to such a Resource Group until its attributes is adjusted (and the enabled flag can be changed then to "true")
9. All Resource Groups configuration actions are done via SQL :
<pre style="white-space:pre-wrap; font-family: 'Open Sans', Tahoma, Verdana, Arial, Helvetica, sans-serif; font-size:14px; font-weight: 400; line-height: 17px; border:0px solid lightgrey; padding: 10px; background:#e3e3e7; color:#000080; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace;">
mysql> CREATE RESOURCE GROUP 'name'
TYPE=SYSTEM|USER [VCPU=num|start-end[,num|start-end]*]
[THREAD_PRIORITY=num] [ENABLE|DISABLE] ;
mysql> ALTER RESOURCE GROUP 'name'
[VCPU=num|start-end[,num|start-end]*]
[THREAD_PRIORITY=num] [ENABLE|DISABLE] [FORCE] ;
mysql> DROP RESOURCE GROUP 'name' [FORCE];
mysql> SELECT * from INFORMATION_SCHEMA.RESOURCE_GROUPS;
Where :
VCPU : a list of CPUs (vcpu) bind to the group, ex: 0,2,4-7,12,14
THREAD_PRIORITY : a "renice" value to apply to Group threads
(range in [-19,19], 0 is default)
Example :
mysql> CREATE RESOURCE GROUP 'slow_batch' type='user' vcpu='10-12,22-24' ;
</pre>
Note about FORCE option:
There may be an urgent situation to DISABLE a group. Hence a DBA may prefer to not wait until no thread will use this group, the following semantics apply:
1. DISABLE : will just set DISABLE flag on a group, so all newly coming
threads will not be able to use it anymore.. -- while
already running threads will continue to run as it
till disconnect..
2. DISABLE FORCE : set DISABLE flag to the group and move all threads
using it to Default group..
3. DROP : returns error if there are some threads in group
4. DROP FORCE : move group threads to Default group, drop the group
## SQL COMMANDS :
Allow to change the Resource Group for a given connection by its thread_id
or to the user session itself :
<pre style="white-space:pre-wrap; font-family: 'Open Sans', Tahoma, Verdana, Arial, Helvetica, sans-serif; font-size:14px; font-weight: 400; line-height: 17px; border:0px solid lightgrey; padding: 10px; background:#e3e3e7; color:#000080; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace;">
mysql> SET RESOURCE GROUP 'name' FOR thread_id ;
mysql> SET RESOURCE GROUP 'name' FOR thread_id1, thread_id2,thread_id3, ...;
mysql> SET RESOURCE GROUP 'name' ;
</pre>
The thread id in the above resource group commands should be obtained from the
performance schema threads table.
## QUERY HINTS :
1. /*+ RESOURCE_GROUP(resource_group_name) */ -- for executing this SQL
query switch the given thread to Resource Group 'name', then switch it back once
the query execution is finished.
2. The above query hint shall be applicable only to data manipulation statements and should be placed after the initial keyword of SELECT, UPDATE, INSERT, REPLACE and DELETE.
3. Having a hints feature is very important as it'll allow users to fix problematic queries generated by "black box" apps (when changing apps code or
generated query is not possible, while a query can be rewritten on fly by plugin
(like Query Rewrite Plugin) or external tools (like ProxySQL))..
## PRIVILEGES :
The following privilege levels are associated with resource groups
and their meanings are explained below:
1. RESOURCE_GROUP_ADMIN:- RESOURCE_GROUP_ADMIN privilege allows resource group
management (creation, deletion and modification) as well as assignment of the
resource groups(both system/user) to various threads. It is maximum privilege
allowing for any operation relating to resource groups.
2. RESOURCE_GROUP_USER: RESOURCE_GROUP_USER privilege bestows a user to
assign other users session or queries or threads as well his own to user
resource groups defined.
## NOTE :
Performance Schema shall provide an API for providing the security context
info required for privilege checks associated with a given thread os id. In
addition, if the privilege context associated with a thread( identified by
thread os id) changes during it's lifetime, Performance schema shall provide a
notification service.
## INTERACTION WITH PFS
1. The PFS threads feature should be available to make use of the Resource group
feature.
2. PFS shall provide API to obtain thread os id and security context information
associated with thread id.
3. PFS shall provide API to update the resource group name associated with a
given thread id.
4. PFS shall provide notification to the resource group component on the
following events:
- Thread create event.
- Thread destruct event
- Session connect event
- Session disconnect event.
- Security context change event.
The API shall be provided by Performance Schema, Resource Control**.
## RESOURCE GROUP CONFIGURATION PERSISTENCE:
The resource group configs shall be persisted in the Data Dictionary (DD) table
*mysql.resource_groups*. This shall be a system table and the persistence shall be done via use of Data Dictionary APIs. The schema definitions for
resource_groups:
<pre style="white-space:pre-wrap; font-family: 'Open Sans', Tahoma, Verdana, Arial, Helvetica, sans-serif; font-size:14px; font-weight: 400; line-height: 17px; border:0px solid lightgrey; padding: 10px; background:#e3e3e7; color:#000080; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace;">
id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
resource_group_name VARCHAR(64) NOT NULL COLLATE utf8_general_ci,
resource_group_type enum('SYSTEM', 'USER') NOT NULL,
resoruce_group_enabled boolean NOT NULL,
cpu_id_mask VARCHAR(1024) NOT NULL,
thread_priority int NOT NULL,
PRIMARY KEY(id),
UNIQUE KEY (resource_group_name)
</pre>
## NOTE:
The Resource group feature shall not be available when the thread pool plugin
is enabled. This is to keep the scope to a minimum.
## PLATFORM RELATED NOTES:
1. Mac OS doesn't provide an explicit API for binding a set of CPUs to a thread.
[thread affinity mac os ](https://developer.apple.com/library/content/releasenotes/Performance/RN-
AffinityAPI/#//apple_ref/doc/uid/TP40006635-CH1-DontLinkElementID_2).
Thus resource groups shall not be available on Mac OSX.
2. On FreeBSD and Solaris, thread priority would not be ignored. This platform
different implementations of thread library and assignment of LWPs thread
priority is prohibited by native syscalls. [FreeBSD thread priority issues](https://lists.freebsd.org/pipermail/freebsd-stable/2007-November/038624.html). Thus Thread priority values are ignored are unsupported and ignored on these platforms.
3. Windows has the thread priority classes:
* THREAD_PRIORITY_BELOW_NORMAL
* THREAD_PRIORITY_NORMAL
* THREAD_PRIORITY_ABOVE_NORMAL
* THREAD_PRIORITY_HIGHEST
The thread priority values shall be mapped in to the 4 classes by the simple
linear mapping as below:
|Priority Range |Windows Priority Level |
|:----------------:|:------------------------------:|
|-20 to -10 | THREAD_PRIORITY_HIGHEST |
|-9 to -1 | THREAD_PRIORITY_ABOVE_NORMAL |
| 0 | THREAD_PRIORITY_NORMAL |
| 1 to 10 | THREAD_PRIORITY_BELOW_NORMAL |
| 11 t0 20 | THREAD_PRIORITY_LOWEST
## REPLICATION SEMANTICS
CREATE, ALTER and DROP shall not be binlogged. Resource groups are managed locally at each node.

## Functional Requirements:
1. Provide functionality to create, drop and alter **resource groups**. The **resource group** associate a set of controllers which can be applied to resources of the system. This is to manage and throttle the resource usage consumption for different workloads. In the current WL, resource groups contain
specification of CPU affinity and thread priority which can be applied to
threads (which are representatives of workloads).
2. **Resource group** shall have a type associated with it and the type of the
resource group shall be either **SYSTEM** or **USER**.
* **SYSTEM** resource group allow for assignment of system threads to it as
well have a higher control setting. Thread priority associated with a
system resource group can be 0 or higher (negative value of thread
priority).
* **USER** resource group allow for assignment of user threads and the thread
priority can be 0 or lower (positive value of thread priority).
3. **Resource group** can be either in **ENABLED** OR **DISABLED** state. A resource group can be disabled explicitly by SQL command (CREATE or ALTER) or it can become disabled as some of controllers that make up the resource group are in invalid state (eg: invalid value of CPU ID due to hardware change etc.). No control operations can be done on a disabled resource group.
4. **Resource group** should allow to specify the control properties. The control properties are **CPU list** and **thread priority** value. The CPU list is a comma separated list of cpu ids and the cpu ids can specify a range (using the hyphen separator) in addition to a single cpu id value. The range of priority values for a system resource group is -20 to 0 and for a user resource group is 0 to 19.
5. A **resource group** shall be associated with a thread, a session thread, list of thread and a hint to run a particular query using a resource group. The hint is applicable to data change statements. These statements include *SELECT*, *UPDATE*, *INSERT*, *REPLACE* and *DELETE*. The threads specification shall be via the *PFS* thread id value. This value can be queried from the *performance_schema.threads* table.
6. By default there will be two resource groups - the system default resource
resource group (*SYS_default*) and the user default resource group
(*USR_default*).
Both of these resource groups will have a priority of 0 and no specific cpu affinity. All threads unless otherwise explicitly changed, shall be associated with *SYS_default* or *USR_default* resource group depending on it being a system or user thread.
7. The *alter* and *drop* operation of a resource group shall have an option **FORCE**.
* **FORCE** allows a resource group to be disabled even when some threads are
using the resource group (as part of an *alter* operation). It should move the threads into their respective default resource groups.
* **FORCE** allows a resource group to be dropped while some threads are associated with the resource group. The threads associated with the resource group shall be moved to the respective default resource groups.
8. Two new privileges shall be introduced:
* RESOURCE_GROUP_ADMIN:- RESOURCE_GROUP_ADMIN privilege allows resource
group management (creation, deletion and modification) as well
as assignment of the resource groups(both system/user) to various threads.
It is maximum privilege allowing for any operation relating to
resource groups.
* RESOURCE_GROUP_USER: RESOURCE_GROUP_USER privilege bestows a user to
assign other users session or queries or threads as well his own to user
resource groups defined.
9. The *performance_schema.threads* shall be augmented with a new column
*RESOURCE_GROUP* which shall indicate the resource group the thread is
associated with.
10. The resource groups shall be persisted in the data dictionary table (mysql.
resource_groups) and be made available across server reboots/future
upgrades. The resource groups defined shall be exposed to the users (
who has appropriate privileges) via the INFORMATION_SCHEMA.RESOURCE_GROUPS
table. The Data Dictionary table mysql.resource_groups is local to the node
and this table shall not be replicated.
11. Resource groups shall not be available on *Mac OS*. Also resource group
feature shall not be available if server is using thread pool plugin for
connection handling. On these, all resource group operation shall return
the error ER_FEATURE_UNSUPPORTED.
Thread priority specification shall be ignored on *FreeBSD* and *Solaris*
platforms. On Linux systems, the mysqld process should have CAP_SYS_NICE
capability for setting thread priority. Packaging changes shall be done on
linux to ensure that mysqld has CAP_SYS_NICE capability. An warning
ER_ATTRIBUTE_IGNORED shall be indicated to the user and the default
priority of zero shall be used on these platforms.
12. There shall be new error messages and error codes introduced as result of
this worklog.
13. The DDL for CREATING, DROPPING and ALTERING a resource group configuration
has only local scope and these statements are not replicated. This is
because different replicas may have different hardware profiles. Hence the
resource group configurations are not replicated.
14. No Resource group management APIs shall be exposed to the any subsystems or
as services in the service registry (for use by other components/plugins).
This can be considered in future extensions.
15. This feature doesn't impact any upgrade & cross replication scenarios. It
is to be noted that an extra column is added in pfs threads table.
## Non-Functional Requirements:
1. In the default configuration (where in all threads are bound to default
resource groups), the server throughput and transaction latency shall not
decrease for sysbench RW & RO benchmark tests.

# Contents
1. [Parser Component](#parserc)
2. [Platform API Component](#platformc)
3. [Core Runtime Component](#corec)
4. [Persistence (DD) Component](#persistc)
5. [Performance Schema (PFS) Component](#pfsc)
6. [Privilege Component](#privilegec)
7. [Resource Group Hint Component](#hintc)
## Parser Component <a name="parserc"></a>
The parser component consists of required changes in parser related components to implement SQL interfaces and query hint described in the introduction section. The component provides interaction with DBAs and end-users for resource group implementation. It gets input from user and does syntax validation and produces parsed output and hands control to core runtime component.
It consists of modifying bison parser files , lexer and the parser related source files to provide grammar rules, new tokens and defining the relevant semantic actions. New symbols are introduced in the lexer. New classes are introduced to represent the parsed object output .
## Platform API Component <a name="platformc"></a>
The platform component provides API in a platform agnostic-way to Core Runtime Component. It specifies API implementation to bind & unbind threads to CPU cores (identified by CPU ID) , get and set thread priorities, validate thread priority and get the total number of virtual CPUs. The APIs are implemented across the platforms (Linux, FreeBSD, Solaris & Windows).
## Core Runtime Component <a name="corec"></a>
The core runtime component interacts with all other components to implement resource group functionality. The class Resource_group contains the attributes of the resource group such as name, type and whether resource is active or inactive. It is composed of the Thread_resource_control class. The Thread_resource_control class consists of control methods which apply cpu affinity and thread priority to the thread resource. A singleton Resource_group_mgr class is introduced and this class is responsible for maintaining the mapping of the resource group names and the corresponding resource group objects. It also provides API methods to apply the resource controls onto the thread, get default resource groups, abstracts services to the PFS component and move resource group functionality (for Resource group Hint feature.)
## Persistence (DD) Component <a name="persistc"></a>
The persistence or dictionary component is responsible for persisting the resource group configurations defined by DBA user in the relational tables and make it available across server reboots. The system table mysql.resource_groups is introduced and the schema definitions of this table are:
```sql
id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
resource_group_name VARCHAR(64) NOT NULL COLLATE utf8_general_ci,
resource_group_type enum('SYSTEM', 'USER') NOT NULL
resource_group_enabled boolean NOT NULL
cpu_id_mask varchar(1024) NOT NULL,
thread_priority int NOT NULL,
PRIMARY_KEY(id),
UNIQUE KEY (resource_group_name)
```
mysql.resource_groups shall be a hidden system table and the information pertaining to the resource groups shall be exposed to the end-user as an information schema view **INFORMATION_SCHEMA.RESOURCE_GROUPS**.
## Performance Schema (PFS) Component <a name="pfsc"></a>
The performance schema provides threads table. It lists out the threads which are in existence in mysql server. It maintains information relating to thread such as it's operating system thread identifier, security context information and whether thread is a system or user thread. The performance schema threads table (performance_schema.threads) table shall be augmented with an additional column by name resource_group_name which shall be a varchar attribute. It identifies resource group name which the thread was associated with. The performance_schema.threads provides the thread_id value which is required for usage in SET RESOURCE GROUP command.
In addition, the resource group component shall register with the performance schema to listen to certain events which are of interest. These events include:
1. thread create event: On notification of thread create event from Performance Schema, the resource group component associates the default resource group. The resource shall be user or system default depending on the thread being a user or system thread.
2. thread destruct event: Currently there is no action performed on thread destruct event by the resource group.
3. session connect event: - When a thread is binded to a new session (either from thread cache or thread pool), we get notification of this event. Upon this event, we associate the resource group the session was associated with earlier.
4. session disconnect event:- On a session disconnect event, we revert the thread to the default user resource group.
5. security context change event: When a security context associated with a thread undergoes a change, we get notification of this event. Upon received, reevaluation of the association of the thread with it's resource group. If security credentials are validated for that resource group, the thread shall remain in the same resource group else it shall reverted to the appropriate default resource group.
The API shall be provided by Performance Schema, Resource Control.
## Privilege Component <a name="privilegec"></a>
Two dynamic privileges are introduced:
1. RESOURCE_GROUP_ADMIN:- RESOURCE_GROUP_ADMIN privilege allows resource group management (creation, deletion and modification) as well as assignment of the resource groups(both system/user) to various threads. It is maximum privilege allowing for any operation relating to resource groups.
2. RESOURCE_GROUP_USER: RESOURCE_GROUP_USER privilege bestows a user to assign other users session or queries or threads as well his own to user resource groups defined.
## Resource Group Hint Component <a name="hintc"></a>
Resource group hint component consists of the changes necessary to implement the
hint functionality of the resource group. Changes are made to introduce the token RESOURCE_GROUP in the hints yy file. The necessary parser classes are introduced. A new data member to the THD class which consists of context info necessary for switching the resource group during execution of the query is added. Before execution of data change query begins, the context info is checked and if necessary the switch is done. The context info is protected LOCK_thd_data mutex to synchronization with modification from a concurrent ALTER, DROP and SET resource group commands. We do not hold the MDL lock on resource group names as part of the query execution for faster switch.