Wednesday, May 31, 2017

Sysbench, IO-bound & Intel NUC for MySQL 5.6, 5.7 & 8

Update - the regression isn't as bad as I have been reporting. Read this post to understand why.

In this post I share results from sysbench with an IO-bound workload on Intel NUC servers. My previous post shared results from an in-memory workload. This is part of my work from bug 86215 as I identify CPU performance regressions from MySQL 5.6 to 5.7 and 8.

tl;dr

The results here are similar to, but not as bad as, the results from the in-memory workload. This is reasonable because CPU regressions can be hidden by IO-bound tests. But if you scroll to the bottom of this page and look at the graph for the point-query test you will see there are problems even for IO-bound tests.

For the i5 NUC most of the regression is from 5.6 to 5.7

For the i3 NUC, MySQL 5.7 did better especially on range scans but there is still a regression from 5.6 to 8. From many tests I have run it looks like someone did great work in MySQL 5.7 to make range scans more efficient in InnoDB.

For long scans the overhead from the default charset/collation in MySQL 8 is significant compared to latin1/latin1_swedish_ci.

Configuration

I tested MySQL using upstream 5.6.35, 5.7.17 and 8.0.1. For 8.0.1 I repeated the test: first with the default charset/collation and then with latin1/latin1_swedish_ci.

My servers are described here. The charts below refer to the servers as i3 NUC and i5 NUC. Compared to the i3 NUC, the i5 NUC is newer, has a CPU that is 2X faster, 2X more RAM and storage that is more than 2X faster.

My usage of sysbench, including the tests run, and the my.cnf used for each database engine is described here. The my.cnf files I share were for the i3 NUC. For the i5 NUC the InnoDB buffer pool and IO capacity options were increased using these values. I use the same server for mysqld and the sysbench clients. The binlog is enabled but sync-on-commit is disabled. Sysbench is run with 4 tables and 40M rows per table. The database is larger than RAM. Although in future tests I might need to use larger tables for the i5 NUC which has twice the memory of the i3 NUC.

ResultsI first ran these tests on the i3 NUC and then was curious about performance on a more modern CPU so I setup a cluster of new i5 NUC servers. I am a big fan of Intel NUC and have 6 at home and have yet to trip a circuit breaker, overheat my office or suffer from too much fan noise. The results on the i5 NUC are different than on the i3 NUC. On the i3 NUC MySQL 5.7 has much less of a regression from 5.6. While on the i5 NUC most of the regression is from MySQL 5.6 to 5.7. Perhaps one day I will have time to explain that, but I am too busy right now.

The results are in the order in which tests are run.

There is a small regression from 5.6 to 5.7/8 for update-index on the i5 NUC. This test requires secondary index maintenance as part of the update (read-modify-write of secondary index leaf pages).

The regression for i5 NUC for update-nonindex is larger than for update-index. This test is less IO-bound than update-index because secondary index maintenance is not needed.

The regression for delete is worse on the i5 NUC. If fact, there isn't a regression on the i3 NUC.

The regression for write-only is worse on the i5 NUC and there isn't a regression on the i3 NUC.

For read-write the regression is worse on the i5 NUC. MySQL 5.7 does better than 5.6 on the i3 NUC and that difference is larger for the larger range scan (10,000 row) than the smaller (100 row).

For read-only using latin1/latin1_swedish_ci improves QPS at 1000 and 10,000 row range scans with MySQL 8 compared to the default charset/collation. Again, MySQL 5.7 does great on the i3 NUC and does better as the range scan increases. MySQL 8 has regressions on both the i3 and i5 NUC and that isn't explained by charset.

For point-query the regression on the i3 NUC starts with MySQL 8 and on the i5 NUC starts with MySQL 5.7.

2 comments:

I wonder if you measured more details here for the cause of regression - does 5.7 do more IO for the same workload or it is just the CPU path in IO bound workload is so much impacted so it makes the difference ?

I have those details for the insert benchmark and linkbench and will share them in the next few days. I have yet to update my sysbench helper scripts to collect them. From what I have seen, newer releases use more CPU per query/insert/transaction but not more IO.