Limiting InnoDB Data Dictionary

One of InnoDB’s features is that memory allocated for internal tables definitions is not limited and may grow indefinitely. You may not notice it if you have an usual application with say 100-1000 tables. But for hosting providers and for user oriented applications ( each user has dedicated database / table) it is disaster. For 100.000+ tables InnoDB is consuming gigabytes of memory, keeping definition in memory all time after table was once opened. Only way to cleanup memory is to drop table or restart mysqld – I can’t say this is good solution, so we made patch which allows to restrict memory dedicated for data dictionary.

Some internals: There is already implemented in InnoDB LRU-based algorithm to keep only recent table entries, but it was not used by reason that InnoDB has to know if table is used or not on MySQL level. We made it by checking MySQL table_cache. If table is placed in table_cache we consider it as used, if not – we can delete it from InnoDB data dictionary. So there is the trick – if you have big enough table_cache, memory consumed by data dictionary may exceed innodb_dict_size_limit, as we can’t delete any table entry from it.

To finalize this post small marketing message – if you faced bug or problem which exists for long time and is not going to be solved by MySQL / InnoDB – contact us regarding Custom MySQL Development.

About Vadim Tkachenko

Vadim leads Percona's development group, which produces Percona Clould Tools, the Percona Server, Percona XraDB Cluster and Percona XtraBackup. He is an expert in solid-state storage, and has helped many hardware and software providers succeed in the MySQL market.

great to hear. “We made it by checking MySQL table_cache. If table is placed in table_cache we consider it as used, if not – we can delete it from InnoDB data dictionary.”

Does this mean a global lock on the table cache is required when checking for table existence? (the table could be loaded into cache right after you check, in which case you must not delete it from data dictionary)

If so, how does that affect performance? Is the table cache lock a bottleneck?

This patch has a problem : innodb_dict_size_limit cannot exceed 4GB on 64 bit Windows. Using variations of long is generally not a good idea – it is the worst datatype when it comes to portability. You need size_t for it or larger (i.e ulonglong).

As innodb_open_files exists as max open handles; it could have been nice if it was controlled by innodb_table_open_cache (same as table_[open_]cache); so that it limits based on table count instead of size

Innodb data dictionary is different. First you can run with single large tablespace and in this case number of open files can be irrelevant. Second if you have very many tables (millions) you may have more of them cached in the innodb data dictionary than have file handler associated with it. Opening/Closing file is relatively fast. The update of Innodb stats (bunch of random dives) is the real cost here.

ANSI does not specify that size_t is equal to unsigned long anywhere. Casting pointer or long to int is commonly considered bad. Casting pointer or size_t to long is exactly as bad, given that a platform/compiler can choose LP or LLP (or ILP or SILP) model for 64 bit handling (http://en.wikipedia.org/wiki/64-bit)

In fact, the whole 64 bit support on Windows in compiler and SDK and reference compiler on this platform (MSVC) has been LLP on 64 bit (long is int is 4 bytes, long long is size_t is 8 bytes) since they started playing with Itanium prototypes, and that is almost a decade ago.