WL#4922: Backup endurance testing

Test that BACKUP and RESTORE work in a highly parallel environment (many
concurrent transactions).
See also WL#4220, WL#4406, and WL#4044.
BRAINSTORMING IDEAS
===================
Endurance testing:
- Checks for problems (slow response, CPU saturation, system hang) that may
occur during prolonged execution of backup with concurrent DML operations.
- Is conducted to understand the behavior of backup under a specific expected
load.
- The load can be generated by a number of concurrent users / connections
performing a specific number of transactions within the set duration while
backup is running in the background.
- The response times measured in this test can themselves point to the
bottleneck in backup.
- Makes it possible to measure response times, throughput rates, and
resource-utilization levels (CPU, Memory), and to identify the breaking point
for backup.
In a nutshell, with Endurance (alias Load) testing we intend to verify that
Backup / Restore operations on a MySQL server under considerable load execute
successfully without adverse effects. The backup time should stay the same, and
resource utilization levels * should not degrade, even if the operations
continue for long durations (24, 48 and 72 hour tests).
* Resource utilization level - (CPU %, Memory, N/w throughput, # of concurrent
connections) -> TBD

When writing the HLS for this worklog, please review / consider the work done
in WL#4044 (phase #1).
Variables and Measurement criteria
==================================
1. Variables - Backup time, # of client connections, # of concurrent operations,
MySQL server CPU, Memory, Network, Backup database size.
2. Run DML operations and Backup concurrently, scale up the number of clients
and measure the variables at each step.
3. End goal - To define a specific load condition for this endurance test and
run continuous Backup and DML under that load for a long duration.
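
The ramp-up in step 2 could be driven by a small harness along the following
lines. This is only a sketch under stated assumptions: run_load_step, ramp,
dml_op and backup_op are hypothetical names, and the two callables are
placeholders that a real test would replace with code issuing DML statements
and the backup command against the server under test.

```python
import threading
import time
from typing import Callable, Dict, List, Sequence


def run_load_step(n_clients: int,
                  dml_op: Callable[[], None],
                  backup_op: Callable[[], None]) -> Dict[str, float]:
    """Run backup_op once while n_clients threads loop on dml_op.

    Both callables are placeholders (assumptions) for real DML and backup
    operations; this function only handles concurrency and timing.
    """
    stop = threading.Event()
    counts = [0] * n_clients  # one slot per client, so no lock is needed

    def client(idx: int) -> None:
        # Keep issuing DML until the backup has finished.
        while not stop.is_set():
            dml_op()
            counts[idx] += 1

    workers = [threading.Thread(target=client, args=(i,))
               for i in range(n_clients)]
    for w in workers:
        w.start()

    start = time.perf_counter()
    try:
        backup_op()
    finally:
        # Always record the time and stop the clients, even on failure.
        backup_seconds = time.perf_counter() - start
        stop.set()
        for w in workers:
            w.join()

    return {
        "clients": n_clients,
        "backup_seconds": backup_seconds,
        "dml_ops": sum(counts),
    }


def ramp(client_counts: Sequence[int],
         dml_op: Callable[[], None],
         backup_op: Callable[[], None]) -> List[Dict[str, float]]:
    """Repeat the measurement for each client count, one step at a time."""
    return [run_load_step(n, dml_op, backup_op) for n in client_counts]


if __name__ == "__main__":
    # Stand-in workloads so the sketch runs without a server.
    for row in ramp([1, 2, 4],
                    dml_op=lambda: time.sleep(0.001),
                    backup_op=lambda: time.sleep(0.05)):
        print(row)
```

Server-side variables (CPU, Memory, Network) are not sampled here; they would
have to be collected by a separate monitor running for the same interval.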
Factors to be considered
========================
* N/w bandwidth - Assume the MySQL server has a 1 Gb NIC and we hit line speed
with 10 concurrent client connections performing DML. In that case we won’t be
able to scale up the # of concurrent connections to reach the expected load.
* Client type - Windows, Linux or a combination (we need to execute the load
test for each of them separately).
* Disk utilization factor - Performance degradation is to be expected on any
server if very little free space is left. We need to ensure that disk
utilization doesn’t go beyond 90 % of total disk space.
* Backup size (we may need to repeat the test with different backup sizes to
measure the delta in CPU in the above test caused by the change in backup size).
### Note: The above test considers the CPU Utilization as the "Load factor" for
the endurance test. Similar tests can be performed for Memory, # of concurrent
connections and N/w speed. ###
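
The 90 % disk utilization limit listed among the factors above could be
enforced by a pre-check before each backup iteration. The following is a
minimal sketch; disk_utilization and disk_within_limit are hypothetical helper
names, and the path passed in is assumed to be on the volume that holds the
data directory or the backup image.

```python
import shutil


def disk_utilization(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def disk_within_limit(path: str, limit: float = 0.90) -> bool:
    """True if utilization is at or below limit (default 90 %, as above)."""
    return disk_utilization(path) <= limit


if __name__ == "__main__":
    # "." is only a stand-in; a real run would pass the backup directory.
    print(f"utilization: {disk_utilization('.'):.1%}")
    print("within limit:", disk_within_limit("."))
```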
The endurance testing requirement should be something like the following:
* With DML from 150 client connections, utilizing 80% of disk, with system
memory & n/w bandwidth not exceeding 80%, a large database Backup (250 to 500
GB) completes within the specified time range.
* This time range or backup speed shouldn’t degrade if the DML operations are
sustained.
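
The second requirement could be checked mechanically from the recorded backup
durations. The sketch below is an illustration only: the function name, the
use of the first few runs as the baseline, and the 10 % tolerance are all
assumptions, since the acceptable time range is not defined in this worklog.

```python
from typing import List, Sequence


def backup_time_regressions(durations: Sequence[float],
                            baseline_runs: int = 3,
                            tolerance: float = 0.10) -> List[int]:
    """Return indexes of runs slower than the baseline by more than tolerance.

    The baseline is the mean of the first baseline_runs durations. Both
    baseline_runs and tolerance are assumed values, not worklog requirements.
    """
    if baseline_runs < 1:
        raise ValueError("baseline_runs must be at least 1")
    if len(durations) <= baseline_runs:
        raise ValueError("need more runs than baseline_runs")
    baseline = sum(durations[:baseline_runs]) / baseline_runs
    limit = baseline * (1 + tolerance)
    return [i for i, d in enumerate(durations)
            if i >= baseline_runs and d > limit]


if __name__ == "__main__":
    # Made-up durations in seconds, purely to exercise the function.
    sample = [100.0, 100.0, 100.0, 105.0, 112.0, 109.0]
    print(backup_time_regressions(sample))
```

An empty result means no run after the baseline exceeded the allowed backup
time; any returned index identifies an iteration worth investigating.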
NOTE:
* Endurance testing and Stress testing can be combined with Performance testing
of backup.
* Results obtained from performance testing could be used as requirements for
Endurance and Stress testing.