A giant virus has its genome sequenced, which shows that it has stolen genes …

A few years back, a giant virus made news because it blurred the boundaries between what we consider living cells and viruses. The Mimivirus, which infected a freshwater amoeba, had a genome that contained over a million base pairs of DNA, and carried a set of genes that were previously only found in living cells. In a PNAS paper that will be released later this week, researchers describe the genome of an oceanic mimivirus cousin that has the second largest viral genome ever seen. It was apparently discovered off the coast of Texas; you can insert your own jokes here.

The virus was found in a single-celled host that preys on the bacteria and plankton at the base of the food chain. The new find, Cafeteria roenbergensis virus (CroV), has a genome that's over 700,000 base pairs long (700 kilobases, or kb). It's linear, and the ends are filled with repetitive DNA, which the authors speculate acts a bit like telomeres do in human cells, protecting the important DNA. The gene-containing region is about 630kb, and encodes about 550 proteins, along with a handful of transfer RNAs (part of the protein production machinery).
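As a back-of-the-envelope check on those figures, the sketch below works out how much of the genome is gene-containing and how much sequence there is, on average, per protein. The inputs are only the rounded numbers quoted above (roughly 700kb total, 630kb gene-containing, 550 proteins), so the results are approximate; the function name `genome_summary` and its field names are invented for illustration.

```python
def genome_summary(total_bp, gene_region_bp, protein_count):
    """Return rough density figures from rounded genome statistics."""
    return {
        # Share of the genome occupied by the gene-containing region.
        "gene_fraction": gene_region_bp / total_bp,
        # Sequence left over outside that region (the repetitive ends).
        "flanking_bp": total_bp - gene_region_bp,
        # Average stretch of the gene-containing region per protein.
        "bp_per_protein": gene_region_bp / protein_count,
    }


# Rounded values from the article; the real genome is "over" 700,000 bp.
summary = genome_summary(700_000, 630_000, 550)

print(f"gene-containing fraction: {summary['gene_fraction']:.0%}")
print(f"sequence outside gene region: {summary['flanking_bp']:,} bp")
print(f"average span per protein: {summary['bp_per_protein']:,.0f} bp")
```

With these rounded inputs, about 90 percent of the genome is gene-containing, and each protein accounts for a little over 1,100 base pairs on average.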

The virus also carries some molecular parasites called inteins. These pieces of DNA insert into the middle of genes and encode a sequence that ends up in the middle of its host's protein once the gene is translated. This doesn't actually disrupt the host's protein, though—amazingly, the intein protein sequence can cut itself out of the surrounding host protein and link its two ends together as if the intein were never there. The excised intein protein then helps ensure that the DNA sequence that made it gets inherited and spreads.
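The cut-and-rejoin step can be pictured with a toy string model. This is only an analogy, assuming made-up amino acid sequences and hypothetical helper names (`insert_intein`, `splice`); real protein splicing is a chemical reaction carried out by the intein itself, not a text search.

```python
def insert_intein(host_protein, intein, position):
    """Model the precursor protein made from a gene with an intein inserted."""
    return host_protein[:position] + intein + host_protein[position:]


def splice(precursor, intein):
    """Cut the intein out and join the two flanking pieces back together.

    Returns (mature_host_protein, excised_intein). If the intein is not
    present, the precursor is returned unchanged with None.
    """
    start = precursor.find(intein)
    if start == -1:
        return precursor, None
    end = start + len(intein)
    mature = precursor[:start] + precursor[end:]
    return mature, precursor[start:end]


# Made-up sequences, for illustration only.
host = "MKTAYIAKQR"
intein = "CLAEGTHN"

precursor = insert_intein(host, intein, 5)
mature, excised = splice(precursor, intein)

print(precursor)
print(mature == host)  # the host protein ends up as if nothing was inserted
```

The point of the sketch is the last line: after splicing, the host protein is identical to the version that never carried the intein, while the excised piece is left over as a separate product.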

About half of the virus' genes are similar to genes from either other giant viruses or living cells, with pieces from all three domains of life (eukaryotes, bacteria, and archaea). But the majority of genes have no known function, so it's hard to know what to make of them. Some of the genes with recognizable roles, however, are pretty sophisticated. The virus has its own DNA repair system, and can hijack a system its host uses to destroy unwanted proteins; it uses this to get rid of the host's defense proteins. It also seems to have picked up 38kb from a bacteria (potentially, one of the meals of its host) that encodes a pathway that attaches sugars to proteins.

When mimivirus was first identified, some of its discoverers looked at its large size and distinct genome and decided that viruses like that may have participated in the origin of life. CroV, the new virus, is actually the mimivirus' closest known relative, but the two viruses only share about a third of their genes, and many of the critical ones have come from completely different species. Thus, the viruses seem to have been built from a small, ancient core that has undergone a lot of horizontal gene transfers that provided them with genes from living cells.

This doesn't rule out a possible ancient existence for this group of viruses, but it does suggest that the ancient versions probably looked very little like the modern ones. Which probably means that the argument over their role in the origin of life is going to be very difficult to settle, and may go on for years.

Promoted Comments

Bacteria is plural of bacterium. Please use it correctly. The same goes for media and medium, which is not in this article but is often misused in scientific writing.

In Latin maybe. And the phrase you are nitpicking is actually "from a bacteria". So if you were anything but pedantic, you would exclaim "that requires the ablative of source! In the singular." Following your logic, the article should read "...from a bacterio". But wait, Latin has no indefinite article, so whether it is "a bacterium" or "a bacterio", the noun phrase is redundant since indefiniteness is presupposed in simple noun forms. But "from bacterio" is neither grammatical English nor comprehensible Latin. And the ablative of source usually employs a preposition, so "from a bacteria" should read "ab bacterio" to be exact.

Problem is: This is not FRICKING Latin. This is a word of Latin origin that has entered into English. Therefore our rules apply. Because if you demand a Latin singular, I demand the proper Latin case, pronunciation, etc. We took the plural form for obvious reasons. Because of the physical size of bacteria, the word became a mass noun and functions as both plural and singular. Same reason for taking "data": it is collective. There are rarely "bacterias". And certainly no "datas". One sheep. Two sheep. Ten sheep. One form, all numbers. It is legal in English - accept it.

I can only assume you are one of those people who thinks the plural of "octopus" is "octopi" as well. Except there is no such word as "octopus" in Latin. The word is "polypus"; "octopus" is from the Greek ὀκτάπους, and the plural of that is ὀκτάποδες. And even if people knew "octopodes" was the true plural, they would say it wrong since the epsilon is not silent. Why? Because it is adapted for usage in the new language. It has no obligations to its old morphology and phonology.

By your inanity, if you ever say the word "cherry" for a single unit, I have every right to chastise you. That word never existed in French. "Cheris(e)" is the singular form that was introduced into English. How dare you impose English conventions of depluralization on it! You will say "Shair-eez" for one piece of fruit, you'll do it in a beret and you'll like it. You doctrinaire dope.

What I personally wonder is how much variation they would find if they sequenced a few different copies of this from the same general area. Would they be almost the same with some mutations, or would there be larger differences where they've grabbed different chunks from their environment?

Put another way - what's the lifetime of these absorbed chunks? Do they come and go, or is this a fairly stable virus?

This is very interesting. So viruses injecting genetic material could have played a larger role in evolution than random genetic mutation and natural selection? That opens up one hell of a can of worms and lends credence to the viral zombies of pop culture material like Left 4 Dead or 28 Days Later.

Are they sure that it copies this stuff on purpose? Because the way it reads, it sounds like that virus has a toolbox and filled it with useful gene sequences, just in case it would need them later. Although I'd find something like it actually being defective plausible too. Maybe many cells can't reproduce the virus cleanly and every generation of the virus gets a few bases of random crap attached to its genetic code?

Also, didn't that 'new flu' virus they panicked so much about a year or so ago also have more base sequences than the normal flu? I can certainly see the benefit of having more tools just in case the host cells don't want to let you in as easily. But doesn't that additional DNA also mean that there might be new weak points for fighting the virus? Also, doesn't it drastically decrease the number of viruses each cell produces?

I think of viruses as being like jellyfish: blind, deaf, mute, hardly there. Floating seemingly randomly in a sea of life. Seemingly simple, yet amazingly complex and nuanced. Like a Japanese camera, impossibly tiny yet jammed to the brim with technology.

The only things to come out of Texas… and I don’t see any steers, so what does that make this virus?

In Texas, it’s ‘Go big or go fishin’!’ Or both!

Knock, knock. Who’s there? A Texas-sized (proportionately) linear mimivirus cousin, found in a single-celled host that preys on the bacteria and plankton at the base of the food chain, over 700,000 base pairs long, carrying molecular parasites called inteins and its own DNA repair system. I don’t see no steers…

Are they sure that it copies this stuff on purpose? Because the way it reads, it sounds like that virus has a toolbox and filled it with useful gene sequences, just in case it would need them later.

It's more like the host's genetic material is sometimes shuffled into the viral genome by accident. The reverse also happens, and a not-insignificant part of your DNA originally came from viral infections of our distant ancestors. If the genetic material proves useful to the virus in helping it spread more effectively, I would expect it to become fixed pretty easily.

The second largest viral genome was discovered in the ocean off the coast of Texas. Hence, the joke that practically writes itself (or so I thought): "World's largest gene found in a pool next to the world's smallest gene pool." (you know, because of inbreeding)

Really beautiful stuff. The inteins and messy horizontal transfer, and how life starts and takes ANY opportunity it has, is so amazing. Reminds me of some of the parasitism in the Tierra artificial life evolution experiments.

It's so powerful... get anything, by accident, to self-reproduce mutably and it just GOES off (if the medium is rich in "opportunity" at least). Any phenomenon. You can easily imagine how the inteins might have started off as an accidental defect in the virus that happened to reproduce itself... and OFF it bloody went!

Someone needs to make a self-reproducing chemical system in the lab so we can see what happens. It does not even have to have all the properties of life, like a boundary etc. Just make it reproduce and see what happens. Doesn't even have to be very similar to biology, just make it reproducing and mutable.

Life is the single most powerful force in the universe. It is the only thing which exists specifically for the sake of its own existence. It is amazingly tenacious.

I would give you two more (which in fact connect with life, but exist outside living organisms, or mimic the effects): fire, which eats, excretes, replicates (breeds) and dies. The other is entropy, which exists specifically for its own existence. And hopefully neither fire nor entropy is aware - I shudder to think of a self-aware decaying stellar process, or a fire sad that it must destroy to reproduce.
