Ancient Egyptians described diabetes on a scrap of papyrus 3,500 years ago. Two thousand four hundred years ago, Parkinson's was first outlined in a Chinese medical text. And Chinese, Greek, Roman, and Indian civilizations had all recognized malaria long before we had microscopes to observe the parasites that cause the disease. By comparison, HIV is a distinctly modern disease. It was first described in 1981, and drugs to treat it weren't available until 1987. But for how long before its discovery did HIV lurk unnoticed in human populations? New research paints a clearer picture of when (and how) HIV got its start. Last month, an international group of researchers reported on fragments of HIV discovered in a preserved 1960 tissue sample from a woman living in Leopoldville (now Kinshasa, Democratic Republic of the Congo). The sample, along with many other lines of evidence, confirms that HIV's roots burrow much deeper than our first recognition of the virus, back to the turn of the twentieth century.

Where's the evolution?

A tour of HIV genealogy
HIV (Human Immunodeficiency Virus) is actually composed of five different viral "families": HIV-1 groups M, N, and O and HIV-2 groups A and B. All of these families are descended from SIV (Simian Immunodeficiency Virus, which infects apes and monkeys), but each is the result of a separate invasion of human hosts. On two separate occasions, an SIV virus was passed from a Sooty Mangabey (a monkey of western Africa) to a human, resulting in HIV-2 groups A and B. On three separate occasions, an SIV virus was passed from a chimpanzee to a human, resulting in the three HIV-1 groups.

Despite all this diversity, only one of these invasions has really taken off. HIV-1 group M is responsible for 95% of HIV infections globally. Michael Worobey and colleagues studied the origins of that viral family.

The scientists, led by Michael Worobey at the University of Arizona, relied on evolutionary theory to dig into HIV's past. HIV, like all pathogens, is not a static infectious particle but an evolving population. HIV's evolution is particularly rapid because of its high mutation rate, up to a million times faster than our own. As the virus proliferates within an individual, its genetic material (RNA) accumulates mutations that are, in turn, passed on to its descendants. Different viral lineages accumulate different mutations, so as the virus infects new individuals, it begins to diversify, forming a tree-like network of relationships. By studying the genetic sequences of HIV, scientists can figure out how the viruses infecting different people are related to one another and reconstruct their evolutionary tree, or phylogeny.
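To make the idea of diversifying lineages concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the 200-base random "ancestor," the numbers of mutations, and the helper names `mutate` and `differences` are invented here and do not come from the study or from real HIV data.

```python
import random

BASES = "ACGU"  # HIV's genetic material is RNA


def mutate(seq, n_mutations, rng):
    """Return a copy of seq with n_mutations distinct sites changed to a different base."""
    sites = rng.sample(range(len(seq)), n_mutations)
    out = list(seq)
    for i in sites:
        out[i] = rng.choice([b for b in BASES if b != out[i]])
    return "".join(out)


def differences(a, b):
    """Count the sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)  # fixed seed so the toy run is repeatable

# A made-up ancestral sequence (hypothetical; not a real viral genome)
ancestor = "".join(rng.choice(BASES) for _ in range(200))

# Two lineages split from the ancestor and mutate independently
lineage_a = mutate(ancestor, 5, rng)
lineage_b = mutate(ancestor, 5, rng)

# A later descendant of lineage A picks up a few more mutations
a_child = mutate(lineage_a, 3, rng)

print("ancestor vs A:", differences(ancestor, lineage_a))
print("A vs B:       ", differences(lineage_a, lineage_b))
print("A vs A's child:", differences(lineage_a, a_child))
```

In this toy model, a descendant differs from its immediate parent at only a few sites, while two lineages that split earlier can differ at more. That pattern of shared and unshared mutations is the kind of signal a phylogeny is built from.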

The HIV sequence that the researchers recovered from the Kinshasa sample belonged to the group known as M, the strain responsible for most HIV infections worldwide. The sample was interesting not so much because of its age, but because of how different the newly discovered 1960 sequence was from that of a previously known 1959 sample. The fact that these two early HIV sequences were already quite different from one another (differing by 11.7% across the regions where the two samples' fragments overlapped) suggested to the researchers that the virus had been diversifying in humans for several decades before 1960. The logic here is a bit like coming home from vacation to find that a slow but steady leak has left a pool of water in your basement: a minor puddle means that the leak must have started recently, while a serious flood means that the pipe has been leaking for some time. The substantial difference between the two HIV sequences amounted to a "flood" of genetic diversity, indicating that the virus lineages had been diverging behind our backs (and in human bodies) for quite a while. But for just how long? Figuring that out would require more evidence.
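A percentage difference like the one quoted above can be thought of as the share of mismatched sites among the positions that both fragments cover. The sketch below shows one simple way to compute such a figure. It assumes the two fragments have already been aligned, with "-" marking positions missing from a fragment; the function name `p_distance` and both toy sequences are invented for illustration and are not the real 1959 or 1960 data.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing sites among positions where both aligned sequences have a base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip positions that one fragment does not cover
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("the two fragments do not overlap")
    return differing / compared


# Toy aligned fragments (made up; NOT real HIV sequences)
toy_first = "ACGTACGTAC--GTAC"
toy_second = "--GTACCTACGAGTTC"

print(f"{p_distance(toy_first, toy_second):.1%} of overlapping sites differ")
```

Only the overlapping sites are counted, which matters for old, degraded samples where just a few short fragments of the viral genome can be recovered.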

To work their way back to the origins of HIV, the research team took a phylogenetic approach. They used the sequences of many other, more modern HIV samples to build an evolutionary tree depicting how the modern and historic sequences are related. Tracing the branches of this tree back toward its trunk is like travelling back in time. The team just needed to figure out how long the journey from root to branch took. Luckily, the 1959 and 1960 samples, along with three other old samples, could be used to calculate typical rates of evolution for the virus. Knowing these rates is something like knowing how quickly a particular tree grows; the growth rate, along with data on the size of the tree, can be used to estimate the tree's age. The researchers applied the same logic to their HIV phylogeny. Based on their calculated rates of evolution and some statistical models of evolution, they extrapolated backwards to the root of the epidemic to estimate the date at which the common ancestor of all these different viral sequences must have existed.
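The "growth rate plus tree size" logic can be sketched with a simple straight-line fit. This is a deliberately simplified stand-in: it assumes a constant rate of evolution and uses ordinary least squares, whereas the researchers relied on statistical models of evolution that this sketch does not attempt to reproduce. The function name `fit_clock` and the three data points are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def fit_clock(samples):
    """Fit a least-squares line through (sampling_year, divergence_from_root) points.

    Returns (rate, root_year): the slope in substitutions per site per year,
    and the year at which the fitted line reaches zero divergence.
    """
    n = len(samples)
    mean_year = sum(year for year, _ in samples) / n
    mean_div = sum(div for _, div in samples) / n
    sxx = sum((year - mean_year) ** 2 for year, _ in samples)
    sxy = sum((year - mean_year) * (div - mean_div) for year, div in samples)
    rate = sxy / sxx
    intercept = mean_div - rate * mean_year
    root_year = -intercept / rate  # where divergence extrapolates back to zero
    return rate, root_year


# Toy data (made up; NOT the study's measurements): each pair is
# (year the sample was collected, genetic distance from the root of the tree)
toy_samples = [(1960, 0.12), (1980, 0.16), (2000, 0.20)]

rate, root_year = fit_clock(toy_samples)
print(f"estimated rate: {rate:.4f} substitutions per site per year")
print(f"estimated root year: {root_year:.0f}")
```

The toy points were placed on an exact line that reaches zero divergence in 1900, so the fit recovers that year. With real sequences the points scatter around the line, which is one reason an estimate like this comes with a range of plausible dates rather than a single year.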

The data point to 1908 as the year that HIV group M (which now infects more than 31 million people worldwide) began its assault, somewhat earlier than the previous best estimate of 1931. Though 1908 is an approximation, the evidence suggests that the true date almost certainly falls sometime between 1884 and 1924.

When such evolutionary studies are overlaid with the history of human societies in Africa, a detailed picture of the origins of HIV group M comes into focus. Historically, chimpanzees in west-central Africa have been hunted for food. Many of them are also infected with the virus that HIV evolved from, Simian Immunodeficiency Virus (SIV). Butchering chimps probably repeatedly exposed local hunters to SIV. The virus may have made the leap to infect people many times, but only around the turn of the twentieth century did this viral invasion gain a foothold in the population. Around that time, a hunter seems to have picked up the virus from a chimp in the southeast corner of Cameroon and carried the pathogen along the main route out of the forest at the time, the Sangha river, to Leopoldville (modern-day Kinshasa). Mirroring the growth of the cities in Africa, the virus spread slowly in Leopoldville until around 1950, when it began to proliferate rapidly. Still undetected, the virus continued to evolve and to diversify, leapfrogging through burgeoning cities. With the increasing ease of global travel, HIV was carried out of Africa and around the world, and the rest, as they say, is history.


Video podcast on virus evolution provided by the National Evolutionary Synthesis Center (NESCent). To learn more, visit the NESCent website.

This reconstruction of HIV's origins certainly satisfies our curiosity, but it also serves as a practical reminder of the conditions that foster the emergence of new diseases. We cannot stop evolution. Pathogens regularly make the leap to infect new hosts, and we increase our chances of being victimized by one of these host switches when we take on lifestyles that put us in close contact with other species, especially ones closely related to us, like chimpanzees. The early history of HIV also illustrates that the virus is not invincible. For more than 50 years, HIV infected human populations but had such a small impact that it wasn't noticed against the backdrop of other diseases. In comparison to pathogens like malaria (which is carried by mosquitoes) and the common cold (which can travel through the air), HIV is pretty terrible at getting from one person to the next, relying on the direct transfer of body fluids. The virus only got its start in humans through a confluence of opportunity and history: the practice of hunting chimpanzees, the rise of densely populated cities in Africa, and a correlated increase in high-risk behaviors involving the exchange of body fluids (e.g., injection drug use, prostitution). The fact that changes in human societies were so critical in the rise of the virus suggests that changes in human societies could snuff it out as well. Though the changes wouldn't be easy, through monitoring, testing, treatment, and prevention, HIV could once again be shut down.

News update, August 2010

Since we published this report on using phylogenetics to study the origins of HIV, scientists have learned even more about the virus's evolutionary history. In August of 2009, researchers announced that they'd discovered a new strain of HIV infecting humans, but this one seems to have come to us from gorillas, not chimpanzees. The new viral strain was identified when a Cameroonian woman living in Paris was diagnosed with HIV, but the RNA of her virus could not be copied using standard genetic techniques for HIV analysis. Eventually, scientists discovered that the virus's RNA wasn't being copied because its genetic sequence was quite different from that of other known HIV strains. The new strain much more closely resembled immunodeficiency viruses infecting gorillas. The most likely explanation is that an SIV virus was passed from a gorilla to a human and then experienced natural selection favoring adaptations for living in the human body, just as chimp SIV viruses were passed to humans multiple times and independently evolved specialized traits for infecting humans. Since the Cameroonian woman hadn't had close contact with apes and did not eat ape meat, the researchers reasoned that she probably got the virus from another person in Cameroon and that there are probably many other individuals in Africa infected with this new strain of the virus.

The scientists used the RNA sequences of viral strains infecting humans, chimps, and gorillas to construct a family tree of the new virus and its close relatives. If we incorporate this new information into the phylogeny above, it would look something like this:

This new discovery emphasizes the complexity of HIV's evolutionary history and, perhaps more importantly, just how easy it is for diseases to cross species barriers when those species are as closely related as humans, chimps, and gorillas. HIV has jumped to humans from chimps (our closest relatives) at least three times, and likely from gorillas (our next closest relatives) at least once.

Since we first published this report, scientists have used phylogenetics to delve even deeper into the history of the HIV epidemic, focusing in on how the virus spread once it arrived in Kinshasa. It now seems that the virus responsible for the global pandemic began to proliferate in Kinshasa in the early 1920s. Then, in the 1930s, when railroads were thriving in Africa, the virus traveled with train riders to other cities. From those outposts, it spread throughout Sub-Saharan Africa with migrant workers. The strain of HIV responsible for most infections in the U.S. and Europe (subtype B) evolved from the HIV strains circulating in Kinshasa in the 1940s. Haitian professionals came to the Democratic Republic of the Congo in the 1960s and then carried the virus home with them to Haiti. From there, subtype B spread to the U.S., and then to Europe.

The new research also helps explain a mystery. Scientists have now documented that the SIV virus has jumped from monkeys or apes into humans at least 13 separate times! Several of those invasions have resulted in outbreaks; yet only one of those strains (group M, which contains HIV subtype B and many others) has become a global pandemic. Why? Is there something special about group M? The new analysis indicates that for 40 years, group M (which came to us from chimpanzees) was spreading at the same rate as another HIV strain (group O, which recent evidence suggests came to us from gorillas). This might suggest that there is nothing biologically special about group M since its basic infection rate was similar to that of group O. The researchers hypothesize that when group M later started to spread much more quickly and broadly it was simply because the virus was in the right place at the right time — or from our perspective as potential hosts to this virus, the wrong place at the wrong time — and was perfectly positioned to hitchhike to new populations as global travel rapidly expanded.

If you obtained the genetic sequences of HIV viruses infecting two different people living in the U.S., would you expect the sequences to be the same or different? Explain your reasoning.

How did the researchers use evolutionary trees to figure out the approximate date at which HIV-1 group M first began infecting humans? In your own words, explain the steps they took to make this estimate.

The researchers who found the 1960 Kinshasa HIV sample found that it was quite different from a 1959 sample. If they had instead found that the two samples were quite similar, would that have changed their conclusions? If so, how? Explain your reasoning.

Read this article about an HIV outbreak in a Libyan hospital. Describe the similarities between scientists' investigation of the Libyan HIV outbreak and the investigation of HIV origins described here.

HIV made the jump to humans from chimpanzees (several times, in fact). Research another case of an infectious disease that has evolved from a strain originally infecting a wild animal population. Explain how that disease made the jump to humans and how that "host switch" is similar to and different from the emergence of HIV.