Opening Tables scalability

I was restarting MySQL on a box with 50,000 InnoDB tables, and again it took a couple of hours to reach decent performance because the “Opening tables” stage was taking so long.

Part of the problem is that InnoDB updates statistics on each table open, which can be an expensive operation, but really this is just a great test case for a more general MySQL problem.

During the warmup process I noticed very low CPU and disk usage, with IOWait around 25% (a 4-CPU box), which indicates that table opens and stats updates are serialized rather than performed in parallel.

I checked with Heikki and he confirmed MySQL has a global table cache mutex (LOCK_open) which is held for the whole table open operation, so table opens are serialized.
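The effect is easy to reproduce in miniature. In this toy shell sketch (not MySQL code), `sleep 1` stands in for the blocking IO of one table open; run one after another, as they effectively are under a single global mutex, four opens take ~4 seconds of wall clock, while overlapped they take ~1 second:

```shell
# Toy illustration only: "sleep 1" stands in for the blocking IO of one
# table open. Under a global mutex the opens run one after another;
# without it they could overlap.
serialized() { for i in 1 2 3 4; do sleep 1; done; }
parallel()   { for i in 1 2 3 4; do sleep 1 & done; wait; }

time serialized   # roughly 4s of wall clock
time parallel     # roughly 1s of wall clock
```

This matches the symptom above: the box sits mostly idle because only one blocking open is in flight at a time.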

Do not think, however, that this is only an InnoDB problem. I’ve seen similar problems with MyISAM – its tables also take a few IO operations to open and can take quite a while to close if there are unflushed key blocks, and we have no idea what other storage engines may do to perform a table open – some may need network operations, etc.

In general I think it is an extremely poor design choice to have a global mutex around anything which may require blocking physical IO or network operations, if you care about scalability. MySQL had the same problem with the key buffer, but that was fixed in MySQL 4.1; this one still remains.

I do not blame Monty – implementing the first MySQL version in 3 months, he had to take shortcuts and pick the simplest solutions for many things. But now, years later and with 50+ developers, this should have been fixed.

Until this is fixed, it is especially important to keep your table_cache large enough that table opens are rare (I use 1/sec as the number to worry about), and also to expect possibly limited performance while the table cache is warming up, in addition to all the other caches 🙂
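To put a number on it, you can sample the Opened_tables status counter twice and divide by the interval. The counter values below are made up for illustration; on a live server you would take the two readings yourself, e.g. with `mysqladmin extended-status` or `SHOW STATUS LIKE 'Opened_tables'`:

```shell
# Sample values for illustration; on a real server take two readings of
# Opened_tables some seconds apart (e.g. via: mysqladmin extended-status).
before=180240   # Opened_tables at time t
after=180540    # Opened_tables at time t + interval
interval=60     # seconds between the readings

rate=$(( (after - before) / interval ))
echo "table opens/sec: $rate"
# A rate persistently above ~1/sec suggests table_cache is too small.
```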


Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

50,000 tables is probably a bit too many, but many tables per server is often much better than a single set of large tables, even if you use InnoDB.

You get better clustering of related data, so local operations are faster, and each table fits in memory completely, so batch operations like purging go much faster than they would with a single large table, not to mention ALTER TABLE, OPTIMIZE TABLE, etc.

It also makes it easier to transfer a table to another server if needed.

I know it looks quite contradictory to DBMS theory, which teaches you to use large tables and let the DBMS do its job, but in practice it may not always be best 🙂

Frank, there are people who have more than 100,000 tables on a server. It’s one reason why the InnoDB data dictionary in RAM was made 50-70% smaller in 5.1 and why a couple of other InnoDB changes have been made in 5.0 and 5.1 to speed up handling of lots of tables.

We had a feature request recently from someone who wanted more than 32,000 databases on a file system that couldn’t handle that many subdirectories, so the MySQL database count was limited by that. There are no current plans to do anything about this.

The InnoDB stats continue to be quite frustrating, since the sample size isn’t big enough to give consistent query results at larger table sizes and the overhead at opening is annoying. It’d be nice if it remembered the stats and refreshed occasionally, or perhaps added to them in the background during use.

I honestly do not think MySQL should work around file system limits (this is about the more-than-32,000-databases case), plus there is a solution to that problem – use symlinks – this way you can get as many directories as you want 🙂
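A minimal sketch of the symlink trick (all paths here are made up): keep the actual database directories spread across several real directories and symlink each one into the datadir, so no single directory has to hold them all:

```shell
# Hypothetical paths for illustration; a real setup would use the actual
# MySQL datadir and check permissions/SELinux/AppArmor on the target.
real=/tmp/demo/volumes/vol1/customer_db
datadir=/tmp/demo/datadir

mkdir -p "$real" "$datadir"
ln -s "$real" "$datadir/customer_db"   # MySQL sees datadir/customer_db as a normal database dir
ls -ld "$datadir/customer_db"
```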

With 32,000 databases the real bottleneck would be the MySQL privilege system. It becomes a problem when you have 5,000 databases, and with 10,000 databases you’ll have to fix some things in the source – otherwise queries like SHOW DATABASES will use 100% of your CPU for a couple of minutes.

Why would the privilege system cause problems? Or are you speaking about the special case of web hosting, where there would be 32,000 users, one per database? With a handful of users I have never seen it be a problem.

Also, “SHOW DATABASES” is probably not the query you really care about (that does not mean it should run for minutes, though) – most queries only need to check permissions on a few databases/tables.

Yes, I was speaking about the web hosting case. Companies that sell shared web hosting often place all of their customers’ DBs on dedicated boxes. Most of these DBs do not produce high load, so it’s fine to stuff thousands or even tens of thousands of databases onto a single server.
I’m not sure why one would want to create thousands of DBs belonging to a handful of users. 🙂

Regarding ‘SHOW DATABASES’ – you have to care about it in the web hosting case. phpMyAdmin calls it twice when you open its main page. 10 users actively working with phpMyAdmin can ruin the performance of an otherwise not-heavily-loaded server.

The first two functions are privilege-system related (strcmp is used for comparing logins, filenames, etc.; MySQL uses its own functions for matching data). As you can see, those two functions alone consume more than 40% of CPU time. That’s after fixing the SHOW DATABASES problem; with SHOW DATABASES the number would be more like 95%. 🙂

If anyone is interested in improving the way MySQL deals with a ton of tables, we (WordPress.com) would happily sponsor that development. We have about 5 million tables today and the number is going up constantly. I’m reachable on my site.

OK, this is a different story when the privilege check system comes into play. I guess it was not optimized well to handle such a large number of tables. I remember seeing quite a few list scans when I worked with the code a few years ago.
I would suggest posting it as a bug and hoping it gets fixed sometime.

The 32,000 limit is in ext2 and ext3 (still not fixed in ext4, although there is an old patch to raise it to 64,000). If you happen to be using ext as the file system with a lot of databases, you might try enabling dir_index with tune2fs. I don’t know what the speedup is; your mileage may vary.
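For the record, enabling dir_index on an existing ext2/ext3 file system looks roughly like this (the device name is a placeholder; run it against an unmounted file system):

```shell
# /dev/sdb1 is a placeholder; operate on the unmounted file system.
tune2fs -O dir_index /dev/sdb1   # turn on hashed b-tree directory indexes
e2fsck -fD /dev/sdb1             # -D rebuilds existing directories to use the new index
```

Newly created directories get the index automatically; the e2fsck -D pass is what re-indexes directories that already exist.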

Matt: why 5 million tables? We host 500,000 forums and don’t give each one its own set of tables, it was just too much work and too slow and complex.

We run a forum hosting community – around 3,500,000 MyISAM tables on each server. Each forum gets its own database (~90 tables per database). We have a similar issue – when the server is overloaded, the status of all threads in SHOW PROCESSLIST is “Opening tables”. We have many PHP and system tweaks, but this problem seems hard to resolve without rewriting MySQL source code, and we don’t have enough knowledge to do it. It’s MySQL 4.0.

We have tried increasing table_cache, but it doesn’t help with so many actively used tables – the best-performing setting for us is table_cache=0.

Now it’s hosted on Linux 2.6 with XFS, which supports more than 32K subdirectories in one directory. ReiserFS supports that too, but it is terribly unstable under such load and breaks its journal very often, and ReiserFS is not visibly faster than XFS at this.

I’m afraid of putting so many tables into one database – that causes worse performance problems – like hosting >10,000,000 files in one directory (3 files per table)…


We work on the basis of a separate database for each new company account. The problem I’m starting to have with 5,000 databases is that the SHOW DATABASES and USE database commands take a long time to run and are showing up in my slow query log. Are there any tips for making this faster?

We have some 2,000 databases with the same set of 250+ tables (both numbers constantly growing) = 500,000 tables and counting on the same server.

As we are trying to scale out, our main options seem to be either “manually” partitioning different databases across different servers, or putting all the data in a single set of large tables combined with built-in MySQL Partitioning.

We run a WP multisite with about 200K tables, and as the number of tables in the open table cache grows, query execution time increases; eventually there are always 400 queries running, most of them in the opening/closing tables state. If I restart the process it works like a charm for a couple of hours, and then it starts all over again.