When you have to drop a large database, you'll run into a problem: replication lag. In this post I'll show you how to avoid it.

What causes replication lag when you drop a database? First, it takes some disk I/O to unlink the files, and second, MySQL scans through the buffer pool to find pages belonging to that database. On a huge (or at least big) database this can take seconds or even minutes, which means your slaves will accumulate lag for seconds or (of course) even minutes.

Dropping a database should always start with a backup of that database. I took mine on a slave.
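A minimal sketch of such a backup, assuming mysqldump and a database named 'chomp' (the name used in the pt-archiver DSN later in this post); adjust credentials and paths to your environment:

```shell
# Run on a slave so the dump does not put load on the master.
# --single-transaction gives a consistent snapshot for InnoDB without locking.
mysqldump --single-transaction --routines --triggers chomp \
  | gzip > chomp-$(date +%F).sql.gz
```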

You can see we have only one really big table, 'images', with about 9 million rows. The 'videos' table is the second biggest, at about 1/1000th the size of 'images'.
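One way to get such an overview, assuming the database is named 'chomp' as in the DSN later in the post (note that table_rows is only an estimate for InnoDB, so use SELECT COUNT(*) when you need exact numbers):

```shell
# List the tables of the database by approximate row count, biggest first.
mysql -e "SELECT table_name, table_rows
          FROM information_schema.tables
          WHERE table_schema = 'chomp'
          ORDER BY table_rows DESC"
```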

So far, so good.

I use the pt-archiver tool to clean up tables, because it chunks up the data that needs to be deleted. That means it runs the DELETE command with a LIMIT clause, where the limit equals the chunk size. I calculated earlier that our database can delete about 2000 rows per chunk with no replication lag, so I'll use the same chunk size for cleaning up 'images'.


Before I start the delete, I check how many chunks there will be - just to help me keep track of the delete's progress.
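The chunk count is simply the row count divided by the chunk size, rounded up. A quick sketch with shell arithmetic, using 9,000,000 as a stand-in for the actual SELECT COUNT(*) result:

```shell
# ceil(ROWS / CHUNK) via integer arithmetic: add (CHUNK - 1) before dividing.
ROWS=9000000     # stand-in for the real COUNT(*) of the 'images' table
CHUNK=2000       # rows deleted per chunk, as calibrated earlier
CHUNKS=$(( (ROWS + CHUNK - 1) / CHUNK ))
echo "$CHUNKS"   # 4500
```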

After the progress bar is set up, we can start the table cleanup job with pt-archiver:

[root@db-master.bfc /home/banyek]# pt-archiver --statistics --no-check-charset --limit 2000 --txn-size 2000 --source D=chomp,t=images,u=user,p=password --purge --where "1=1"
Source: D=chomp,p=...,t=images,u=user
SELECT 8420052
INSERT 0
DELETE 8420052
Action Count Time Pct
deleting 8420052 1981.1935 72.48
select 4212 72.4313 2.65
commit 4211 9.3239 0.34
other 0 670.3669 24.53
# You can see the 'Count' value is a bit less than the earlier SELECT COUNT(*) result,
# but that is just because I ran the archiver twice.
#
# After the cleanup is done on this table, repeat the process for the other tables too.

Shrink the datafiles

After you have cleaned up the data from the tables themselves, you can see that the datafiles are still the same size. (You can read more about this here.) So the next step is to shrink the datafiles. The easiest way to achieve this is to run a simple ALTER like 'ALTER TABLE tablename ENGINE=InnoDB', but since we still want to avoid replication lag, run it via pt-online-schema-change.

Conclusion

Dropping a database is really an easy thing: just one command and everything is done. But if your site has a lot of traffic and your users don't tolerate weird behavior, you have to be cautious. In this scenario a simple 'DROP DATABASE' would have taken only a few seconds, with a few seconds of lag, but building up good habits is mandatory - so I recommend always cleaning up a database before dropping it.