What is the largest amount of data you store in MySQL?

My previous poll got a great response, so I thought I should continue this series. The question of the day today is: how much data do you store in your largest MySQL instance? I'm interested in a single instance, not the total amount of data you keep in MySQL across a sharded, replicated environment. Feel free to share details in the comments: are you using MyISAM or InnoDB? Is it working out well for you, or are there challenges you're seeing?

What is the largest data size for a single MySQL instance you use in production?

100GB to 1TB (34%, 298 Votes)

10GB to 100GB (25%, 226 Votes)

1GB to 10GB (16%, 145 Votes)

1TB to 10TB (15%, 134 Votes)

less than 1GB (7%, 60 Votes)

more than 10TB (3%, 25 Votes)

Total Voters: 888


Thank you for your participation!


Author

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments (28)

We have 16TB, and the hardest part is backups. XtraBackup works well, but incrementals do not work with compressed XtraBackup backups. We had a bug open for this. (We have Percona Gold support.)

So, if we can get (a) compressed xtrabackup, (b) compressed incrementals, and (c) parallel recovery (or, to put it more simply, a simple way to parallelize the restore such that the entire I/O capacity of the disk subsystem is utilized), we'd be much happier.

– Large DBs imply large tables, so DDLs are too slow. We are using Percona 5.1.6x builds, so online index additions would be ideal. Not sure if 5.5.x or 5.6.x already addresses this.

– Large DBs also come with large buffer pool sizes, and that implies very slow shutdown and startup (we have fast shutdown disabled; all that seems to do is shift the slowness from shutdown to startup). I am thinking there is no easy solution here.

– A simpler way to distribute some DBs to a different mount point. We have to use symlinks to point some DBs to other places. Instead, something like “create database new_db using db_file_space ‘/new_mount_point/mysql_data_dir’;” would be nice.
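There is no per-database tablespace location in MySQL, but per-table placement is possible: MyISAM has long supported DATA DIRECTORY in CREATE TABLE, and InnoDB gains it in 5.6 (with innodb_file_per_table enabled). A minimal sketch, with a hypothetical table name and mount point:

```sql
-- Requires innodb_file_per_table=ON; InnoDB support arrives in MySQL 5.6.
-- Table name and path are illustrative.
CREATE TABLE new_db.big_table (
  id      BIGINT UNSIGNED NOT NULL PRIMARY KEY,
  payload VARBINARY(255)
) ENGINE=InnoDB
  DATA DIRECTORY = '/new_mount_point/mysql_data_dir';
```

Note this is per-table rather than per-database, so on 5.1 symlinks remain the only whole-database option.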

– Backport crash-safe replication to 5.1.6x 🙂 Yes, we want to get to 5.5, but now that 5.6.x is around the corner, we have put the 5.5.x migration on hold and are planning to go straight to 5.6.x.

Thanks for the feedback. A number of compressed backup issues are fixed in XtraBackup 2.0; do you have a specific unresolved issue? Please feel free to open a support request, or let me know if there is an unresolved support issue and I will investigate it.

Our main pain point is that we have a single table that is ~12TB+. This is a rolling 90 days' worth of data, write-mostly, but it needs to be available for immediate queries if the need presents itself. It is an InnoDB table. The main issue is that with a table this large, any table changes are out of the question, as are backups, etc. We write into this table at a rate of 30k+ rows/sec.

If DDL statements could be run in a multithreaded fashion through some sort of divide-and-conquer methodology, that would help immensely. It's nearly impossible to run any sort of DDL on a table with 180 billion rows when each row needs to be processed sequentially.
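The commenter doesn't say whether the table is partitioned, but rolling-window data like this is a classic fit for RANGE partitioning (available since MySQL 5.1): expiring the oldest day becomes a metadata-only DROP PARTITION rather than deleting rows one at a time, and maintenance can proceed one partition at a time. A sketch with hypothetical names:

```sql
-- Hypothetical schema; one partition per day of the 90-day window.
-- The partitioning column must be part of every unique key, hence the
-- composite primary key.
CREATE TABLE events (
  id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  created DATETIME NOT NULL,
  payload VARBINARY(255),
  PRIMARY KEY (id, created)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(created)) (
  PARTITION p20120901 VALUES LESS THAN (TO_DAYS('2012-09-02')),
  PARTITION p20120902 VALUES LESS THAN (TO_DAYS('2012-09-03')),
  PARTITION pmax      VALUES LESS THAN MAXVALUE
);

-- Nightly expiry of the oldest day is near-instant:
ALTER TABLE events DROP PARTITION p20120901;
```

This doesn't make an ALTER of the full table parallel, but it reduces the unit of maintenance from 180 billion rows to one day's worth of data.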

I once had five minutes of stress when I couldn't boot up a ~30TB MySQL instance; the errors were way too cryptic 😉 I don't think we're using that configuration any more, with regular ~2-3TB machines serving the workload nowadays.

With MySQL one must shard/partition/whatnot large tables because of index locking, so I don't even see people running a single huge table 😉

Currently we have a new 300G internal-analytics instance that is growing by 3G daily. This is the entry instance for raw data from ETL scripts.

The machine is a 16GB-RAM, quad-core Xeon with HT, with 2 x 3TB HDD (software RAID 1). It runs CentOS 6.3, and I'm using MariaDB Server 5.5.27.

On the same machine there are also:
– an instance replicating our website prod database (too small to count; also MariaDB),
– a TokuDB instance that computes data from the ETL instance (under 2G of permanent data).

Laurynas, Thanks – I must have misread the documentation. For some reason I had thought that crash-resistant replication was in 5.5.x only.

Peter, re: the compressed incremental backups with the latest xtrabackup, I'll double-check if it's working now and report back.

Domas, re: your point about sharding: well, yes, true. We are right on the cusp of whether to shard or not. Sharding will greatly increase application complexity, and given the rate at which high-IOPS storage (FusionIO, Virident, even commodity OCZ) costs are plummeting, I suspect our needs will be met by a largish single-node MySQL for quite a few more years, allowing us to postpone the sharding decision.

Hi Peter, I have designed and replicated a MySQL instance as a slave for archiving our daily transactions for 6 months; we are using this server for some reporting activities as well. It has grown to around 700G so far.

Expired data is deleted every night after working hours. We started with 5 million transactions per day, and now I'm upgrading it to store around 9 million transactions every day (the extra transactions are fetched with ETL from raw data sent by our partners).

I've used the InnoDB engine and have partitioned the big tables; they are around 300G now, and each table contains several data files.

We are planning to integrate around 250G of data into our main production server, which contains our user information. Our plan is to have complete user information in one database; the current configuration is below.

1. How much RAM do we need to request? (Currently we have 45G, which is fully occupied.)
2. How much disk space do we need to request? (MySQL data dir: 480G; total disk space allocated: 650G)