A 100TB Database June 23, 2009

Some of you may know of me as the guy who constantly presented on the “massive database of genetic information” he was creating. I started presenting about it around 2003 and I said it would be 7TB. As I built it, the data kept flooding in and by the time I had it up and running and fully populated, around 2005, it was getting scary – it had grown to over 25TB. Who am I kidding? It was beyond scary, it kept me awake at nights.

Well, it still exists and continues to grow. I saw it up to 45TB before I left the Sanger institute {where I built it} and it continues to grow towards the 100TB I designed it to scale to.

Why am I bragging about this? {” hey, have you seen the size of my database”?!}. Well, I am very proud of it. It was my pet project.

But pet project or not, I had a team of DBAs at the Sanger and of course, when I say “I built it” I should say “My team and I built it”. And they looked after it after I departed and it got even bigger.

Last week I got an email off one of them to invite me over for a small celebration this week. What are we celebrating? The first database on-site to hit 100TB. Am I proud? Hell yes, I am proud.

But not proud of what you probably think I am, given my careful preamble.

It is not my database that has broached the 100TB limit.

It is another one, a database the team put together after I left and that they have looked after without my input. What I am really proud about is that, with Shanthi Sivadasan who was holding the fort when I arrived at the Sanger {and remains there}, we put together a team of DBAs that is capable of creating and looking after such a large behemoth. It could not be done without excellent support from the Systems Administrators as well, but I feel particularly proud of the DBAs.

So congratulations to the DBAs at the Wellcome Trust Sanger Institue: Shanthi Sivadasan, Tony Webb, Andy Bryant, Aftab Ahmed, Kalyan Kallepally and Karen Ambrose. You went further with this than I did.