Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I remember from the stackoverflow podcasts that Fog Creek use a database per customer for Fogbugz. I assume that means the Fogbugz On Demand servers have 10s of thousands of databases.

We are just starting to develop a web app and have a similar problem to solve (lots of customers with their own isolated data).

What problems should I expect with using a database per customer? How can I solve them?

My Initial Thoughts

Advantages of a database per customer

Simpler database schema

Simpler backups - you can backup each customer in turn without it really impacting on other customers.

Makes it easy to export a given customers data.

Better cache performance - a write to one of the more active tables only impacts that single customer that performed the write.

Easier to scale across hardware. For example, when we need to go from 1 to 2 servers, we just move half our customers to the new server.

Disadvantages

Can MySQL cope with 5,000 databases? Would performance suck?

Changes to the schema can be hard to replicate out across all the databases. We would really really have to have an automated plan for this, such as versioning the schema and a script that understands how to take a database from one version to another.

Doing anything that is common to all our customers might be awkward or impossible

Similar to above, but any analytics we want to perform across all our customers might be impossible. How should we track usage across all customers for example?

Remember that "database" means different things to different people. In the Oracle world, a database per user would be massive overkill. But in MySQL "database" is synonymous with "schema".
–
GaiusFeb 3 '11 at 11:14

I mean it in the mysql sense. USE CompanyData;
–
Rik HeywoodFeb 3 '11 at 11:31

6 Answers
6

This solution is called a multi-tenant design where each tenant (customer) has their own database. Given that, there are some other considerations to the alternative approach which is a single database:

With a single database, everyone must be on the same version no matter what. It isn't possible to upgrade some customers and not others. This can be problematic if a customer wants a hotfix of an application that isn't ready for wide release.

With a single database, when you do an upgrade, every client is down. If something goes wrong, every client is screwed.

With a single database, it is much more difficult to throttle resources. I.e., if one client is hammering the database, it is harder to give them more resources separate from everyone else.

It is much more difficult to allow users to host their own versions of your application. If you are building a solution that will be used by large enterprises, this is often a non-starter. Their IT department wants complete control over access to the system.

It is probably cheaper to scale out databases rather than scale them up. I.e., having to invest in faster hardware to host one database to rule them all is probably more expensive than being able to scale customers out to smaller less expensive database servers. I can't say this one definitively because it depends greatly on the server software. If you stick with MySQL, this is probably true because the licensing costs are negligible. However, if you move up to SQL Server for example, scaling out becomes much more expensive unless you use a VPS environment and cost-benefit of scaling up vs. scaling out changes. I can say however that once your database gets very large, management requires every greater levels of expertise. Very large databases require playing around with multiple filegroups and pushing certain indexes to different spindles to get better performance. In short, they get can complicated very quickly.

Having separate databases does mean you have to build an update mechanism which matches the database version with the application/site version. However, separate databases do provide superior isolation of data and IMO have a lower the cost of hosting. It isn't a solution for all scenarios. If your system were never going to be hosted outside of your hosting and needed to scale up in customers rapidly and having all users on the same version of the application and database schema was desirable, then certainly having a single database is a better approach.

I run web services with both the shared database and multi-tenant separate database setups. There's times where both are the right choice. On the app where I have a separate database per customer, I've run into the exact same 5 reasons it was the right choice for that app.
–
Dan GrossmanFeb 5 '11 at 20:06

You'd likely want to keep another database to track what version each customer is at, so you could keep track of which ones have or haven't undergone the last round of modifications.

Scripting the upgrades wouldn't be that difficult ... you could write something that looks at the catalog of databases and applied the necessary changes to get each database to the latest version, possibly skipping those that shouldn't be upgraded for some reason.

As mysql 'databases' are just schemas, as Gaius pointed out, if it's all running from the same server instance, you can just qualify the name of the tables you're trying to modify, or get information out of:

alter schema.table ...
select ... from schema.table

...

If you start breaking things up across multiple servers, you can still script something that makes connections to multiple servers so you can apply all of the changes; for the analytics, again, you could set a bunch of database links using federated tables in your master database to access the data from one place, as you'd just be reading from the tables.

...

Also, be aware that they're not using mySQL for stack exchange, they're using SQL Server.

And I have no idea what sort of performance overhead there'd be in mysql at that scale, I don't think I've ever gotten past 30 'databases' in mysql.

In my experience you shouldn't create a db per costumer, let me give you an example:

Last year I worked with 70 databases (a lot less than 5000), each with the same scheme and all, in theory, things would go as planed (as you mention in the advantages section), but in reality not so much. We had many problems with updating schemes, user support, software update, you name it. It was awful.

We used firebird and I was hired way after the product was shiped, but this gave me the knowledge to never work with separated databases.

I'm not saying you can't pull it off, I'm saying things can go very wrong and to be honest, your advatage list did't sound so appealing to take the risk, most of them can be done with a sigle database.

We implemented a Multiple Listings Database that services several customers. We wound up in a situation where customers started wanting custom results. To solve this issue we cloned the stored procs and gave them unique customer name prefixes and then called them from within the application. On the other hand we sold 150 webstores each with their own separate database (97% the same). So both can be done it depends on the situation.
–
Cape Cod GunnyMay 12 '11 at 19:38

Nice. I'm not saying it can't be done, just that its not as easy as it sounds, good for you Gunny.
–
eiefaiMay 12 '11 at 20:17

I have a Web/DB Hosting client that has 750+ customer databases with the same number of tables (162) and same table structures. Combined, all of my client's customer data total 524GB (95% InnoDB)

Imagine all of these databases competing for 13G of innodb buffer pool on nine DB servers via circular replication. Scaling out with that hardware configuration was not enough. Immediately, we recommended to the client to scale up.

We recently migrated this client to 3 DB servers with far more horsepower (At all costs, stay away from SSD in high-write environments, ALWAYS !!!). We upgraded them from MySQL 5.0.90 to MySQL 5.5.9. Dramatic Differences were seen almost instantly.

Scaling out must also be considered because if you have hundreds of clients hitting the same memory and disk resources, scaling out reduces their usage linearly (O(n)) where n is based on the number of DB servers in a multimaster environment.

In the case of my client, my company is reducing him from 9 DB servers (Quad Code, 32GB RAM, 824G RAID10) to faster DB servers (Dual HexaCore [that's right 12 CPUs], 192GB RAM,1.7TB RAID10) of MySQL 5.5.9 (to table take advantage of the multiple CPUs). In addition, imagine 150GB innodb buffer pool in 50 partitions of 3GB each (Multiple InnoDB buffer pools is a new feature in MySQL 5.5). A smaller scale out, but massive scale up, had worked for my client's unique infrastructure.

MORAL OF THE STORY : Scaling up or out is not always the solution if you have badly designed tables. What I mean is this: If index pages have lopsided key population for multicolumn indexes, querying keys from the lopsided parts of indexes leads to table scan after table scan, or at least indexes that never get used due to being ruled out by the MySQL Query Optimizer. There simply is no substitute for proper design.

MySQL creates databases in separate directories so a lot depends on the underlying operating system and how many folders/file handles it can handle. Shouldn't be an issue with modern operating systems but that's where a lot of the bottleneck will come from.

There isn't anything saying you have to host different versions of the database or app. Whats wrong with simply isolating the data by doing one db per customer and having one version of the database and app? Of course each customer db would have to be cloned from a template of the current working version. From a security and data isolation standpoint, I think this is ideal.

The only downside I can see is that you would have to manually update each database when creating a new version. This could be easily automated though.