You don’t get to chose what you lose, it could be a $10 order, or a $10M order.

Independent of losing orders, DBA time is expensive. Having to perform post-failure investigation to handle discrepancies has a high cost.

The cost of durability is much lower with SSDs. In a comment, Daniël van Eeden also correctly said that the cost is much lower in 5.6 with group commit (thanks Daniël!).

I recommend most users should be running with MySQL as fully durable.

Now to explain each of these proposed changes:

Sync_binlog

By setting sync_binlog=1 you are ensuring that any changes written to the binary log will not be lost in the event of a crash. This is a problem because with the current default of sync_binlog=0, a replication master crashing may lose events and require all slaves to be re-imaged with a fresh copy of data to be consistent.

It’s important to note that changing this setting has a downside: syncing the binary log causes a performance impact. As I mentioned above, this was reduced significantly in MySQL 5.6 with group commit, but it will still be evident in some workloads. The most unfortunate may be in single-threaded performance on systems with hard-drives and no RAID controller with a battery backed write cache.

Master-info-repository/relay-log-info-repository

These two options were first introduced in MySQL 5.6 as part of a feature called transactional replication. When both set to TABLE, they store the internal replication coordinates in an InnoDB table, and COMMIT changes as part of the same transaction as the replication events being applied.

What this means in practical terms, is that slaves are in a consistent state when they crash. Replication can just resume processing events from the point of failure, and data does not need to be re-imaged from the master or another slave.

Conclusion

As with previous proposals, we are seeking feedback on how these changes will impact you.

Do you currently use sync_binlog and transactional replication?

Have you considered these features, but decided not to switch? (This is a case where we’d really love to hear from you).

Do you agree that these changes will make MySQL safer, and easier to administer?

When MySQL 5.7 DMR3 was released, I couldn’t wait to try out the new UNION ALL optimization that no longer requires data to be materialized in a temporary table.

This new optimization can be seen via EXPLAIN, but it is also nice to run it through SHOW PROFILES, which breaks down query execution step by step. However, this feature is now deprecated (since it overlaps with performance_schema), and will be removed in a future version.

So today I wanted to show you a combination of:
* What a UNION ALL statement looks like in MySQL 5.6 (EXPLAIN, SHOW PROFILES).
* How it is improved in MySQL 5.7 (EXPLAIN, SHOW PROFILES).
* How you can easily emulate the SHOW PROFILES feature with performance_schema + ps_helper 🙂

Initial Setup

The UNION ALL query I am going to demonstrate uses the following fake sample data:

You will notice that there is in fact still a temporary table, denoted by the step removing tmp table. The release notes explain this point very carefully:

“The server no longer uses a temporary table for UNION statements that
meet certain qualifications. Instead, it retains from temporary table
creation only the data structures necessary to perform result column
typecasting. The table is not fully instantiated and no rows are
written to or read from it; rows are sent directly to the client. As
a result, The result is reduced memory and disk requirements, and
smaller delay before the first row is sent to the client because the
server need not wait until the last query block is executed. EXPLAIN
and optimizer trace output will change: The UNION RESULT query block
will not be present because that block is the part that reads from
the temporary table.”

Example using ps_helper

And now to demonstrate how to replace SHOW PROFILES with performance_schema! To do this, I decided to base my scripts on ps_helper by Mark Leith.

There is no specific reason you need to do this, but I admire the way ps_helper works and its something I’ve wanted to try extending for a while. It also includes useful helper functions to intelligently truncate/format SQL, and convert times to a human readable format.

I renamed the command SHOW PROFILE FOR QUERY x to show_profile_for_event_id, since this better maps to performance_schema naming, but otherwise it behaves pretty much the same.

For some reason, there is a second query end state in the SHOW PROFILES version after removing the temporary table. I’m not sure why it is, and if not including it will be an issue.

Conclusion

It’s great to see these optimizations introduced into MySQL – I can think of a number of users who can benefit from a better UNION ALL.

I implemented my own SHOW PROFILES feature really as an educational step for myself to make sure I knew performance_schema, and it was actually quite a quick and joyful experience. I plan to work with the maintainer of ps_helper and see if this can be included in future versions.

As Giuseppe has written about previously, writing notes or “informational events” can create debugging problems because they reduce the signal to noise ratio. There is now an easy way to reduce the logging to be only warnings and errors!

Thank you to the MySQL Community, on behalf of the MySQL team @ Oracle. Your bug reports, testcases and patches have helped create a better MySQL 5.6.15.

In particular:

Thanks to Yoshinori Matsunobu, who reported that the performance_schema did not instrument locks related to semisync replication. Bug #70577.

Thanks to Sebastian Strzelczyk, who reported excessive memory usage when querying INFORMATION_SCHEMA.INNODB_FT_INDEX. Sebastian uploaded a number of reports of memory usage, which was helped us generate a good reproducible testcase. Bug #70329.

Thanks to Inaam Rana, who suggested improvements to InnoDB counters in information_schema, and contributed a patch! Bug #70241.

Thanks to Yoshinori Matsunobu, who reported inefficiency in InnoDB memcached API. Yoshinori’s feedback was also part of the 1 million QPS record set for 5.7. Bug #70172.

Thanks to Nizameddin Ordulu, who identified a case where crash recovery might not be possible with InnoDB. Nizameddin pointed out where in the code the bug existed, and we were quickly able to identify that this was a regression from an earlier bug fix. Bug #69122.

Thanks to Elena Stepanova, who identified a case where SELECT COUNT queries would run very slowly when run concurrently with a “LOAD DATA” operation. Elena also provided a testcase! Bug #69141.

Thanks to Daniël van Eeden, who reported a confusing/incorrect error message when using the InnoDB memcached API. Bug #68684.

Thanks to Monty Widenius and Michal Sulik, who both reported an issue where a unique composite index would return wrong results. Bug #70038, Bug #60220.

Thanks to Edward Dore, who reported an issue when converting tables from MyISAM to InnoDB. It was from a table that Edward provided that we were able to reproduce the issue! Bug #50366.

Thanks to Elena Stepanova who reported an error when deleting from a partitioned table, and provided a testcase! Bug #67982.

Thanks to Vlad Lesin, who reported that certain LOAD DATA statements would appear in the binary log twice, and provided a testcase! Bug #70277.

Thanks to Yoshinori Matsunobu, who reported that transactions could be lost when starting/stopping slaves with GTIDs. Yoshinori also provided a small program which we were able to use to reproduce the issue. Bug #69943.

Thanks to Roderik van der Veer, who reported that MySQL would not compile on Mac OS X 10.9. Thanks also to David Peixotto who provided an example patch of how to fix the problem. While we didn’t end up using David’s patch, we certainly value his contribution. Bug #70542.