MySQL Architecture meeting at Google

On the Friday after the MySQL Users Conference we had a smaller meeting at the Google campus to talk about MySQL architecture, focusing mainly on storage engine vendors and other extension areas. It was very interesting to see all the storage engine interface extensions planned for MySQL 6.0 and beyond – the ability to intercept query execution, or to offload query fragments and operations (sorting, LIMIT, etc.) to the storage engine. This is great news, as it would allow truly innovative storage engines to be built for MySQL, which was previously hard because of the fixed row-by-row retrieval interface and the nested-loop joins built on top of it.
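To make the contrast concrete, here is a minimal sketch (in Python, with entirely hypothetical class and column names – MySQL's actual handler interface is C++) of the difference between a server that pulls every row through a row-by-row interface and an engine that can execute an ORDER BY ... LIMIT itself:

```python
import heapq

class RowByRowEngine:
    """Classic handler-style interface: the server pulls one row at a time."""
    def __init__(self, rows):
        self._rows = rows

    def rnd_next(self):
        # The server sees every row, then sorts and cuts above the engine.
        yield from self._rows

class PushdownEngine(RowByRowEngine):
    """Engine that can offload ORDER BY ... LIMIT internally."""
    def top_n(self, n, key):
        # A real engine could use its own indexes or statistics; here we
        # just simulate the offload with a heap instead of materializing
        # and sorting the full row set.
        return heapq.nsmallest(n, self._rows, key=key)

rows = [{"id": i, "score": (i * 37) % 101} for i in range(1000)]

# Server-side: fetch all 1000 rows through rnd_next(), then sort and cut.
server_side = sorted(RowByRowEngine(rows).rnd_next(), key=lambda r: r["score"])[:3]

# Offloaded: the engine returns only the 3 rows the query needs.
offloaded = PushdownEngine(rows).top_n(3, key=lambda r: r["score"])

print(server_side == offloaded)  # True
```

Both paths produce identical results; the difference is that the pushdown engine never has to hand the server the 997 rows the query will discard.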

However, what struck me was a thought: this thing is really getting complicated. A few years ago Marten would frequently mention Oracle (and other commercial databases) as complicated beasts, overkill for most of their users.

Is MySQL becoming such a beast as well? Will MySQL be able to deliver quality software on a good schedule with this increased complexity? Only time will tell, though so far the track record is not perfect – MySQL 4.1 was still rather simple, while the MySQL 5.0 GA release was a very late, buggy disaster. A lot has changed inside MySQL since then, and with the forthcoming MySQL 5.1 and 6.0 releases we'll see whether the new VP of Engineering has reorganized and straightened out the team enough to deliver high-quality software in a timely fashion.

It was also very interesting for me to see that most (commercial) storage engine vendors really do not care about operating in a mixed storage engine environment – the assumption is that people will simply use their very best storage engine for everything. That may be true in most cases, but it handicaps the real flexibility of the storage engine architecture: using different engines for different tables while still being able to conveniently access data across them.

Comments

There are trade-offs that many deployments will have to make. Today, with InnoDB on a slave, you can use MySQL replication to trickle changes to the slave and run long-running reporting queries concurrently with the replication changes. Reporting is done against current data, and changes are pushed to the slave automatically.

Some of the new storage engines don’t support replication. You have to figure out how to capture changes on your OLTP masters (full extraction might cost too much) and then automate pushing them to the new storage engine. And if the change capture mechanism is to use a trigger on every table on the OLTP master, then that makes the OLTP master a lot slower.
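That trigger-based capture approach can be sketched as follows. This is a hedged illustration only: table and column names are hypothetical, and it uses SQLite's trigger syntax (via Python's stdlib `sqlite3`) rather than MySQL's, just to show the mechanism and its per-write overhead:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);

-- Changelog table an extraction job would read from and truncate.
CREATE TABLE orders_changelog (
    id INTEGER,
    amount REAL,
    op TEXT,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);

-- One trigger per operation per table: this extra per-row write on
-- every statement is exactly the OLTP-master slowdown the comment
-- warns about.
CREATE TRIGGER orders_ai AFTER INSERT ON orders
BEGIN
    INSERT INTO orders_changelog (id, amount, op)
    VALUES (NEW.id, NEW.amount, 'I');
END;

CREATE TRIGGER orders_au AFTER UPDATE ON orders
BEGIN
    INSERT INTO orders_changelog (id, amount, op)
    VALUES (NEW.id, NEW.amount, 'U');
END;
""")

conn.execute("INSERT INTO orders (id, amount) VALUES (1, 9.99)")
conn.execute("UPDATE orders SET amount = 19.99 WHERE id = 1")
conn.commit()

captured = conn.execute(
    "SELECT id, amount, op FROM orders_changelog ORDER BY rowid"
).fetchall()
print(captured)  # [(1, 9.99, 'I'), (1, 19.99, 'U')]
```

Every insert and update now costs an extra row write, and a full set of such triggers has to be maintained on every table being captured.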

Others might support replication but don't support MVCC, meaning that when you do updates, all queries have to stop.