Of course there are other improvements on top of this as well, such as the new replication tables, but I won’t go in to that here. Today I want to focus on the kind of data you can now get in MySQL 5.7 for debugging what a foreground user connection has been doing.

The problem is that a lot of that extra data that I wanted exposed is really of a structured form, such as the last N statements (or in 5.7, last N transactions even) that a connection has executed, or a summary of wait information, for example. Everybody loves JSON now’a’days, so that was the obvious choice to me – I wanted a JSON object, per thread, of the available data from Performance Schema.

And so started the writing of a query from hell. But the output is, if I may say so myself, a thing of beauty.

Lets whet that appetite a little (and you may need to scroll to the right):

This is a snapshot of one of the connections from a MySQL Enterprise Monitor server, with this particular connection having inserted some data related to the Query Analyzer, with one past transaction that inserted a normalized and example statement in a single transaction with a final commit, and currently inserting another normalized statement.

Because it is currently executing an INSERT on the `memtrunkio__quan`.`normalized_statements_by_server_by_schema` table, you can also see that it holds a SHARED_WRITEmetadata lock (for the duration of the TRANSACTION) on that table, and holds a table handle on that table as well.

Finally, you can see a summary of all of that connections memory usage, and timing information for both stages and lower level waits for those transactions and statements that it has executed.

Whilst developing this output, I’ve been using it almost exclusively against our own MEM schema. In the process it’s managed to help me find some interesting bugs. For instance, here’s a previous incarnation of that output, that helped find a bug where we were not committing between DELETE statements when purging data from all of our instrument tables. Ultimately this meant we were holding open huge transactions, with many many metadata locks being held across all of the instrument tables:

This comes with a bonus though, there was also a new view that was contributed by Jesper Wisborg Krogh waiting to be merged as well, so I’ve added that to the 1.3.0 version (hence the 1.3.0 instead of 1.2.1).

This is the innodb_lock_waits view, which shows all sessions that are waiting for a lock within InnoDB, as well as the details of who is blocking them, here’s an example output:

All views now work with MySQL 5.7.5, and the ONLY_FULL_GROUP_BY changes.

There is also a new script available (generate_sql_file.sh) that will allow RDS users to easily generate a single SQL file, using a specified user, that can be loaded in to an RDS instance. See the README for details on how to use that.

Here’s a full summary of the other changes:

Backwards Incompatible Changes

The host_summary_by_stages and user_summary_by_stages wait_sum and wait_avg columns were renamed to total_latency and avg_latency respectively, for consistency.

The host_summary_by_file_io_type and user_summary_by_file_io_type latency column was renamed to total_latency, for consistency.

Improvements

Made the truncation length for the format_statement view configurable. This includes adding a new persistent sys_config table to store the new variable – statement_truncate_len – see the README for usage

Added total_latency to the schema_tables_with_full_table_scans view, and added an x$ counterpart

Added innodb_buffer_free to the schema_table_statistics_with_buffer view, to summarize how much free space is allocated per table in the buffer pool

The schema_unused_indexes view now ignores indexes named PRIMARY (primary keys)

Added rows_affected and rows_affected_avg stats to the statement_analysis views

The statements_with_full_table_scans view now ignores any SQL that starts with SHOW

Added a script, generate_sql_file.sh, that can be used to generate a single SQL file, also allowing substitution of the MySQL user to use, and/or whether the SET sql_log_bin … statements should be omitted. This is useful for those using RDS, where the root@localhost user is not accessible, and sql_log_bin is disabled (Issue #5)

Added a set of memory_by_thread_by_current_bytes views, that summarize memory usage per thread with MySQL 5.7’s memory instrumentation

Improved each of the host specific views to return aggregate values for background threads, instead of ignoring them, in the same way as the user summary views

This release is hugely pleasing to me, in that I actually didn’t have to do too much work on it myself! There were a significant number of contributions from Jesper Wisborg Krogh and Arnaud Adant, both MySQL Support Engineers (at the time at least, Arnaud has moved on to pastures new now), as well as again from Joe Grasse.

Thank you all for your contributions!

Here’s a summary of the changes:

Improvements

Added host summary views, which have the same structure as the user summary views, but aggregated by host instead (Contributed by Arnaud Adant)

host_summary

host_summary_by_file_io_type

host_summary_by_file_io

host_summary_by_statement_type

host_summary_by_statement_latency

host_summary_by_stages

waits_by_host_by_latency

Added functions which return instruments are either enabled, or timed by default (#15) (Contributed by Jesper Wisborg Krogh)

ps_is_instrument_default_enabled

ps_is_instrument_default_timed

Added a ps_thread_id function, which returns the thread_id value exposed within performance_schema for the current connection (Contributed by Jesper Wisborg Krogh)

Improved each of the user specific views to return aggregate values for background threads, instead of ignoring them (Contributed by Joe Grasse)

Optimized the schema_table_statistics and schema_table_statistics_with_buffer views, to use a new view that will get materialized (x$ps_schema_table_statistics_io), along with the changes to the RETURN types for extract_schema_from_file_name and extract_table_from_file_name, this results in a significant performance improvement – in one test changing the run time from 14 minutes to 20 seconds. (Conceived by Roy Lyseng, Mark Leith and Jesper Wisborg Krogh, implemented and contributed by Jesper Wisborg Krogh)

Like this:

I’ve just released the 1.0.1 version of the MySQL sys schema. It contains a few bug fixes (including a contribution from Joe Grasse, thanks Joe!), and a number of new helper procedures around viewing and altering configuration for consumers and instruments, contributed by the MySQL QA team, as they’ve started to incorporate more Performance Schema trace data in to their automated testing.

Next up is the 1.1.0 release, that also includes a bunch of new host summary views that were contributed by Arnaud Adant (of the MySQL Support team). I have a number of new things in development to add as well before then though.

Let me know if there are things you’d like to see as well, maybe I can find time to work on those too.

Visible there on the left is also Marc Alff, the principle developer of Performance Schema, who was also able to interject some great points for people as well.

My presentation was a super-set of a few of my previous ones, but updated and with a little added content here and there (particularly around the MySQL sys updates from ps_helper), so it’s pretty long (I had 3 hours to fill), but probably the most complete that I’ve done.

Like this:

I had a great time last week at Percona Live, meeting up with lots of old friends, and getting to know lots of new ones.

It was great to meet many of the people that hang around on DBHangOps face to face. Geoff even got a community award (well done)! Unfortunately I had to miss the lunch.

It was also good to see Oracle getting a community award. Our engineers are extremely hard working, and all want to help community and customers alike be successful with their MySQL environments. There was lots of great positive attitude towards the work we’ve been doing, it was pleasing to hear that we are on the right track.

I haven’t been to a conference at that venue since the “old school” MySQL UC. Well done Percona for putting on such a great event.

And as a bonus, I even got to sit in front of Shlomi Noach on my flight from SFO to Newark:

Like this:

MySQL’s Performance Schema is an incredibly rich and versatile instrumentation engine, but apparently, with great power, comes increased complexity for the user to understand up front.

I think this is pretty natural:

The more flexibility you provide to the user on what to monitor (and this is one of the major goals of Performance Schema) – the more there is to twiddle from a configuration perspective.

The more data we provide in as generic a ways as possible – the more data there is.

Performance Schema is also only going to grow over time. There are many many missing bits of instrumentation within MySQL that users want us to add, and for the most part, most of those new things are now going in to Performance Schema. Things such as memory instrumentation, stored programs, transactions, prepared statements, better replication monitoring … the list goes on, and the things I mention here are just the things that have gone in to MySQL 5.7 so far.

When I started ps_helper it was nothing more than a little playground for me to research both the breadth of the data that Performance Schema provides, as well as to try and nail down exactly the kinds of views that users want and need, based around various different use cases.

Judging by the feedback that I’ve gotten on ps_helper so far, it turns out that I was actually pretty successful with the latter goal. I’ve heard about it being used in all sorts of situations – from large websites, to small installations, everybody seems to have found something useful with it. MySQL Support also regularly directs customers to install it, and return the output from many of the views, to help customers tune and debug what their MySQL instances are doing.

One of the biggest pieces of feedback that I (and others) have heard is that users really want something like ps_helper to ship with the server itself – there is even a bug to have it included opened by my old colleague Valerii Kravchuk.

Getting something like ps_helper in to the server is not a small task. We need to know that what we are providing makes sense to users, commit to maintaining it “forever”, make sure that it is properly tested, is as bug free as possible, that it follows the right procedures for licenses and that it is properly structured for any future plans that we may have (of which there are many).

Recently, the MySQL Workbench team announced that they would be shipping a new schema that was something like ps_helper:

Additionally Workbench 6.1 is leveraging work from various teammates in MySQL Engineering in introducing a schema called “SYS” that provides simplified views on Performance Schema, Information Schema, and other areas.

This is the first step towards moving the kinds of views and procedures that ps_helper provides in to a shipping “MySQL product”. It started with the Workbench team taking ps_helper, performing a bunch of post processing on it (to rename schema etc.), and then integrating the resulting set of objects. In discussion with them after their first try at this, we decided instead to start a new branch, that didn’t need post-processing, and that starts following some of the other strict guidelines for getting a product shipped within MySQL/Oracle.

And so we get to the point of this blog finally, the mysql-sys repository was born on Github.

Why the name “sys“?

Firstly, and more importantly for some people – it is shorter and easier to type than “ps_helper”.

When you look at other database systems though, and at other similar things that people have done within the MySQL Community, it’s clear that there is room for a schema that can more generically provide a set of objects to administrators that help them with their day to day work.

DB2 has SYSIBM, which is a mixture of some DD and some runtime info. They also have syscat and sysstat that can be used too.

Oracle has a SYS catalog, owned by the SYS user. It’s a little different to our goals here (it’s more the DD owner). When comparing our goals with sys to Oracle, you could compare it to the V$ tables in many ways. However in Oracle SYS also includes a number of packages (routines) that can be used for administration tasks – the SYS.DBMS_* packages.

Looking a little closer to home, one only has to look at the excellent common_schema maintained by Shlomi Noach, which provides a whole host of views, procedures and functions to help manage and interact with MySQL, around objects, runtime metadata, and more (QueryScript for example is incredibly powerful).

With all of these in mind, we also wanted something a little more generic than ps_helper, which was created to make Performance Schema itself a little easier to understand. Something that could contain routines and views that are not necessarily Performance Schema specific (although that is what the bulk of the objects are for at the moment).

So what are the changes that have been made when migrating ps_helper to sys?

A quick summary would be:

mysql-sys now uses the GPLv2 license.

This is an important change, it aligns it with other MySQL products, and will follow the same procedures as other Oracle products. If you want to submit patches, you should follow the OCA. Unfortunately, this means that merges from mysql-sys to ps_helper can not happen. More on this shortly

All objects now fully specify all characteristic clauses

For instance, they all have a “root@localhost” DEFINER with SQL SECURITY INVOKER, and fully specify other things such as whether a routine is DETERMINISTIC, how it interacts (i.e. READS SQL DATA), or the ALGORITHM a view should use.

All [name]_raw views have been renamed to x$[name]

Within ps_helper, the “foo” and “foo_raw” tables all sort together, which makes it a little harder to scan the list of user friendly views from the output of SHOW TABLES. By renaming them all to x$foo, all of the views with some form of raw output sort at the end, and give the nice user friendly list individually first. Oracle also has something like this with their X$ vs V$ views, although X$ is “hidden” in that case.

A lot of the objects have had their names changed, as well as some columns

Many of the routines are now prefixed with ps_*, many of them have had their names changed for a little more clarity etc.

All “count” or “count_star” columns now use “total”, which was also used a lot. There was a lot of inconsistency there.

All stored routines (functions and procedures) now have a little documentation in their COMMENT clauses.

This means you can get their docs with “SELECT ROUTINE_NAME, ROUTINE_COMMENT FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_SCHEMA = ‘sys’;”

A number of new views were added, particularly around summarizing user activity, and making sure that each formatted view had an x$ counterpart.

A WHOLE BUNCH of bugs were fixed along the way.

Unfortunately, I can not really merge those bug fixes back to ps_helper – they happened as I was going through each of the objects for the clean up, but their commits were kind of mixed in with all these other changes (including the license header changes). If I try to merge this back to ps_helper, I have to re-license ps_helper as GPLv2. I’d still like to keep ps_helper as my little playground though, to try out views and routines that could be useful one day to the sys schema.

I’m not sure what to do with the dbahelper (ps_helper) repository at the moment. I am tempted to remove all of the objects that are within mysql-sys itself, and do a similar rename of the remaining objects to “sys” (so they would build on each other). I do still want to use it as my “hey this could be an interesting object to add” playground. Your thoughts on this from the community would be appreciated.

I’ve now updated the ps_helper page to remove the outdated examples, and to simply point at the Github pages directly instead. The README for the mysql-sys repository now contains the most up to date examples and documentation, and should be used instead.

I hope to blog some more about the changes, and the new things that came along in the near future, but this blog is long enough for now.

Finally, I’d like to take the opportunity to thank all in the community for their great feedback on ps_helper. It helped to turn it in to a much more mature project.

After a slight delay (travel and catching up with “real work”), I’ve now uploaded the talks that I gave at MySQL Connect and Oracle Open World.

They are available on my Presentations Page, and inline below for convenience. The “Introduction to MySQL Enterprise Monitor” talk was actually a full demo, but there are some screenshots of MEM 3.0 in there if you’re interested in seeing a high level picture of what it looks like now.

Thanks to all that attended my talks, I got a lot of good questions and feedback!