Uh, uh… extending mysqlnd: monitoring and statement redirection

2011/10/11by admin

Uh, uh… about a year ago Mayflower OpenSource Labs released the mysqlnd user handler plugin (PECL/mysqlnd_uh). The extension lets you extend and replace mysqlnd internal function calls with PHP. Uh, uh… mysqlnd internals exported to user space? Who cares as long as it does the trick?! Let me show you seven lines of PHP code to monitor all queries issued by any PHP MySQL application using any PHP MySQL extension (mysql, mysqli, PDO_MySQL) compiled to use the mysqlnd library.

query_monitor.php

That’s it. Install PECL/mysqlnd_uh and load the above query_monitor.php before your application starts using the PHP configuration directive auto_prepend_file to see all queries issued by your application, including a backtrace. The proof:

In preparation for the third and last PHP MySQL webinar, I wrote documentation for PECL/mysqlnd_uh and made the extension compile with PHP 5.4. Get PHP 5.4, get a development version of mysqlnd_uh and give it a try. Both 5.4 and mysqlnd_uh deserve user feedback.

On the C level, the methods of the built-in classes are installed to be used instead of the original mysqlnd library functions. The classes do nothing but call the original library methods. The built-in classes behave like a transparent proxy. To change the default behaviour of mysqlnd, you have to subclass the two built-in classes and install proxy objects of you derived classes.

Basics: query logging and rewriting

If that sounds confusing, forget about it for now. Get yourself used to the pattern shown in the monitoring example. Its sufficient for basic stuff like monitoring and rewriting. Rewriting? Sure… let’s assume the database schema has been re-factored over the years. You are understaffed and there is a legacy application accessing a renamed table. The application is building SQL dynamically and its hard to locate the source. Ulf helps: rewriting and logging the backtrace.

There are many use cases for this little snippet: monitoring, rewriting but also auditing and even SQL injection protection using black- and whitelists is possible. If you are interested in the latter and you have some C skills, check out PECL/mysqlnd_sip. The name mysqlnd_sip stands for mysqlnd SQL injection protection. PECL/mysqlnd_sip is a proof-of-concept. Otherwise, try out mysqlnd_uh. It has been written for no other purpose but making it possible to play with mysqlnd, to prototype plugins with PHP.

Note

The mysqlnd library and its functions have not been designed to exposed to the user space. PECL/mysqlnd_uh lets you manipulate the inner workings of mysqlnd. It tries to detect abuse. Nonetheless, it is possible to make mysqlnd leak memory or even crash PHP, if you hook mysqlnd calls inappropriately. This is not considered a mysqlnd bug. Please, report issues which you
want PECL/mysqlnd_uh try to detect and try to prevent from happening.

Standard PHP code encryption tools cannot stop users from monitoring mysqlnd activities with PECL/mysqlnd_uh. PECL/mysqlnd_uh lets users hook mysqlnd C calls. That’s on a level beneath the application. Whatever technology is used to hide the source code of the PHP application, the PHP application still has to call the C calls internally. By hooking the mysqlnd C calls with PECL/mysqlnd_uh users get access to database users, database passwords (MysqlndUhConnection::connect()) and SQL commands executed (MysqlndUhConnection::query(), MysqlndUhPreparedStatement::prepare()). Therefore, administrators should restrict access to PECL/mysqlnd_uh to trusted persons only. Just like access to a database server log should be restricted…

Advanced: statement redirection

After the note only experts should be left. Let’s do something expert-like. Let’s do something in user space which PECL/mysqlnd_ms, the replication and load balancing plugin, does in C. Let’s redirect statements to different servers without letting the application know. The replication plugin does read/write splitting to send read requests to slave nodes and write requests to master nodes in a MySQL replication cluster. We want to pick a server based on a SQL hint. A SQL hint, which shall represent a sharding key.

Again, we start deriving a proxy from MysqlndUhConnection, overwriting the query method. The query method has two arguments. The first argument is a resource of type mysqlnd connection, the second argument is the query statement string. Whenever the statement begins with a certain SQL hint, we shall redirect it to a certain server. To do so, we’ll open a new connection to that server and use that new connection to execute the statement on.

Warning: MysqlndUhConnection::query() expects parameter 1 to be resource, object given in /home/nixnutz/php-src/branches/PHP_5_4/foo.php on line 7
Warning: mysqli::query(): (Mysqlnd User Handler) The method MysqlndUhConnection::query() did not return a boolean value as it should in /home/nixnutz/php-src/branches/PHP_5_4/foo.php on line 17

Good try, but obviously wrong. The parent implementation of the query must not be given a mysqli object as its first parameter but a mysqlnd connection resource. The conversion between the two connection handles is to be done using mysqlnd_uh_convert_to_mysqlnd(). Next try.

The errors are gone but we are not quite there yet: no proper result set. By scanning the example you may be tempted to forget that a user space call, such as mysqli_query(), may invoke many mysqlnd library calls, no just query. The mysqli_query() function sends a query, checks if there is a result set to fetch and if so, does fetch and store the result set. The one line $res = $mysqli->query("/*shard1*/SELECT DATABASE() AS _shard1 FROM DUAL"); expands to three mysqlnd library calls. All three functions must be modified to use the proxy connection which has been used for redirection. You have to implement MysqlndUhConnection::query(), MysqlndUhConnection::getFieldCount() and MysqlndUhConnection::storeResult().

Here comes the "working" example. "Working" means it is capable of redirecting exactly the user calls shown in the example. Other user API calls may require additional mysqlnd function calls to be overwritten.

Finding which functions to overwrite is a bit of try-and-error process. Basically you have three options to go though the iterative process.

Reverse engineer mysqlnd and extension source

Check mysqlnd debug and trace log for pointers

Implement all MysqlndUhConnection methods to see what gets called

As a mysqlnd developer, I usually take the reverse engineering route. It does not require extensive C knowledge to see which function calls which. However, if you fear C, try using the debug and trace log. Below is my standard way of working with mysqlnd_uh, if trying to achieve anything fancy, such as statement redirection.

Too complicated? Go back to the basics from the beginning. Monitoring, rewriting and auditing is super easy. Other tasks can easily become more complex. That’s one reason why we have written the PECL/mysqlnd_ms in C. However, the power and flexibility of hooking mysqlnd library calls, the possibility of installing a client-side proxy is quite unique.