Overview of fragmented MySQL InnoDB tables

The major engine for MySQL is InnoDB, it complies with the ACID properties for a transactional database engine. Even if InnoDB is the most recommended engine for MySQL, it has also some caveats. The biggest criticism lies in the amount of disk space used. And even when we remove data, tablespaces don’t shrink.

This post is about those tables that are fragmented.

To find the top 10 of tables with free space (free space means gaps that may happen when for example large delete operations happened or if many pages had to be moved around), a simple query can be run:

I hacked innodb_ruby from Jeremy Cole (a tool that I really appreciate and that I find very useful especially when used to show the lsn heat map), to have the possibility to output
a map of the table showing the used and the free pages.

This is the output for our tables (sbtest.sbtest1):

As you can see the 5MB of data free are mostly at the end (happening when allocating extra extends) but there are also some small gaps (I didn’t check yet why?).

Deleting the first 100k records

As innodb_ruby reads from the table space file, it’s mandatory to flush all pages from the Buffer Pool to disk (all the changes must be applied to disk). The best way to achieve this is to
stop mysqld not using the default fast shutdown and restart it:

mmm… !? less rows but the data_free is still the same (25MB). What does the output tell ?

We don’t see any difference !?

In fact, this is because the records deleted were certainly stored on a page with others records that were not deleted. So the page is still in use. Not completely but still in use, so not yet
totally free.

Deleting 100k records in the middle

OK… let’s then delete 100k sequential records in the middle of the table:

Optimizing

So we don’t have any records and we use a “lot” of diskpace, which is completely wasted. Can we do something ?
Yes we can optimize the table (the table doesn’t support it but will perform a recreate + analyze). We can so run one of these commands:

mysql> optimize table sbtest1;

or

mysql> alter table sbtest1 engine=innodb;

After such operation, we can finally see the result on the filesystem and on the page’s illustration:

Conclusion

In conclusion, if diskpace is a concern for you (or if you are doing physical backups regularly), it could be interesting for you to check the fragmentation of your tables (when you use InnoDB). And if like me (and many others) you like InnoDB, please take a look at innodb_ruby if it’s not yet done 😉