Raptorn: GB, rather. I just checked and the human genome has roughly 3.3 billion base pairs. Since you can store a DNA sequence with just the 4 letters ATGC and you only need one of the two strands to have all the information, that makes roughly 1.6 billion bytes of data = 1.6 GB. Since you only need 4 symbols, even that isn't accurate, since you could store those 4 letters in just 2 bits each (00 = A, 10 = T, 01 = G, 11 = C), so in one byte you can encode 4 bases. That makes it 400 MB of data. Since parts of the DNA are highly redundant, you could probably get to 75 MB with a compression algorithm. Gotta check if that is correct.
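For anyone who wants to see the 2-bit trick spelled out, here is a rough Python sketch. The bit mapping is the one from the post; the function names pack and unpack are just made up for illustration, and padding the last byte with zero bits is an assumption on my part.

```python
# Mapping taken from the post: 00 = A, 10 = T, 01 = G, 11 = C
CODE = {"A": 0b00, "T": 0b10, "G": 0b01, "C": 0b11}
BASE = {bits: letter for letter, bits in CODE.items()}


def pack(seq):
    """Pack a string of A/T/G/C into bytes, 4 bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Assumption: a short final chunk is padded with zero bits on the right.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data, n_bases):
    """Reverse of pack(); n_bases tells us how much of the last byte is padding."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE[(byte >> shift) & 0b11])
    return "".join(bases[:n_bases])


packed = pack("ATGCA")
print(len(packed))        # 5 bases fit in 2 bytes
print(unpack(packed, 5))  # round-trips back to the original string
```

Note that you have to store the number of bases separately, otherwise the zero padding in the last byte would decode as extra A's.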

Edit: Minor correction: Dividing by 2 isn't allowed here, since a base pair still takes 2 bits to store (3.3 billion base pairs at 4 per byte). So that makes around 800 MB for the whole genome. The coding part of that is about 3%, which amounts to about 24 MB - but since we don't know what the other approximately 97% of the DNA does, it's probably safer not to ditch it. Otherwise evolution would probably have taken care of that already.

Edit II: Oh, and if we're talking about the complete genome, not the one in sperm or egg cells, then we have to multiply by 2, since all other cells are diploid (they have 2 complete sets of DNA, one from the father and one from the mother).
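Here is the arithmetic as a quick Python sketch, in case someone wants to play with the numbers. It assumes 2 bits per base pair and uses the rough figures from this post (3.3 billion base pairs, about 3% coding); the variable names are just mine, and MB here means 10^6 bytes.

```python
BASE_PAIRS = 3_300_000_000   # rough figure for one (haploid) copy of the genome
BITS_PER_BASE_PAIR = 2       # 4 possible letters fit in 2 bits
CODING_PERCENT = 3           # rough share of the genome that is coding

haploid_bytes = BASE_PAIRS * BITS_PER_BASE_PAIR // 8
coding_bytes = haploid_bytes * CODING_PERCENT // 100
diploid_bytes = haploid_bytes * 2   # two copies, one from each parent

MB = 1_000_000
print(f"haploid genome: {haploid_bytes / MB:.0f} MB")
print(f"coding part:    {coding_bytes / MB:.2f} MB")
print(f"diploid genome: {diploid_bytes / MB:.0f} MB")
```

That comes out to 825 MB for one copy, a bit under 25 MB for the coding part, and 1650 MB for the diploid genome, all before any compression.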

I thought I remembered hearing it was TBs. Thanks for clearing that up.

"The reason an author needs to know the rules of grammar isn't so he or she never breaks them, but so the author knows how to break them."

Have you linked that picture to one you received in an email, Meer? Because it won't work like that. In the case of posting an emailed picture you would have to download it to your computer and then use Photobucket or similar to show it.