It’s 2014 in Liberia. The country’s largest hospital is full to the
brim and unable to admit new patients. Instead, the sick lie on the ground outside,
writhing and crying in pain. They’re struck with severe bouts of vomiting
and diarrhea, impaired kidney and liver function, perhaps even internal
bleeding. Up to 90 percent of those sick will die. Their only hope of
getting treatment is if someone else dies first, freeing up a bed.

The culprit behind the devastation? Ebola. By 2016, the outbreak ended with
more than 11,000 reported deaths across West Africa. In the aftermath,
experts underscored that, during the worst months, few were prepared for
such catastrophe — neither the countries that suffered most, nor the
international community at large. Though the World Health Organization
eventually declared a Public Health Emergency of International Concern, it
came late, and arguably, so did vital funding. And while few cases spread
outside the continent, the resulting panic certainly did.

But what if there were an early warning system for the outbreak? Something
that could have given health organizations a heads-up, allowing them to
organize an effective response, contain the disease’s spread, save
thousands of lives and prevent an international crisis? Such a system may
be in the not-too-distant future.

Non-profit, university and non-government groups across the globe are
tackling this idea from various angles. From compiling information on
potentially infectious agents to tracking real-time diagnoses in disease
hot spots, epidemiologists — those who study the incidence and prevalence
of disease — are getting us closer to a world with fewer surprise
pandemics.

Pulling It Together

Pathogens, disease-causing microorganisms such as bacteria and viruses,
spread by hitching rides on hosts. Researchers have studied many of these pathogens (human
immunodeficiency virus, for example). But the results of their work aren’t
all stored in the same place — they’re scattered across journals and
various databases. If experts sequence a pathogen’s DNA (which helps
identify and track it), that data typically gets uploaded to a public
database, but it’s not paired with any additional existing descriptions.
Instead, interested researchers have to manually cross-reference with
various journals.

“We needed to find a better way of bringing the information together,” says
Marie McIntyre, an epidemiologist at the Institute of Infection and Global
Health at the University of Liverpool.

So McIntyre and her colleagues created a new database: the Enhanced
Infectious Diseases Database, or EID2. It’s programmed to link publicly
available information on all known pathogens of a given host, all hosts
of a given pathogen, and when and where that pathogen has shown up.
“It’s not about where the disease is occurring today,” McIntyre says. “It’s
about where the disease is occurring and who the disease is occurring in.”
That combination of information can help researchers look for long-term
drivers of disease, such as how climate affects the spread of a pathogen.
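To make the linking idea concrete, here is a minimal sketch of a cross-reference index that answers the three questions described above: which pathogens a host carries, which hosts a pathogen infects, and where and when a pathogen was recorded. It is an illustration only, not EID2's actual design; the class name, field names and every record in it are hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One published observation: a pathogen found in a host, somewhere, in some year."""
    pathogen: str
    host: str
    country: str
    year: int


class LinkedIndex:
    """Toy cross-reference index (hypothetical; not how EID2 is built)."""

    def __init__(self):
        self.by_host = defaultdict(set)      # host -> pathogens seen in it
        self.by_pathogen = defaultdict(set)  # pathogen -> hosts it was seen in
        self.sightings = defaultdict(list)   # pathogen -> [(country, year), ...]

    def add(self, rec):
        # File each record under both lookups so either question is one step.
        self.by_host[rec.host].add(rec.pathogen)
        self.by_pathogen[rec.pathogen].add(rec.host)
        self.sightings[rec.pathogen].append((rec.country, rec.year))

    def pathogens_of(self, host):
        return sorted(self.by_host.get(host, set()))

    def hosts_of(self, pathogen):
        return sorted(self.by_pathogen.get(pathogen, set()))

    def where_and_when(self, pathogen):
        return sorted(self.sightings.get(pathogen, []))


# Placeholder records, invented purely for illustration.
index = LinkedIndex()
for rec in [
    Record("pathogen A", "human", "Country X", 2001),
    Record("pathogen A", "bat", "Country Y", 1998),
    Record("pathogen B", "human", "Country X", 2010),
]:
    index.add(rec)

print(index.pathogens_of("human"))
print(index.hosts_of("pathogen A"))
print(index.where_and_when("pathogen A"))
```

Storing each record under both the host and the pathogen is what removes the manual cross-referencing step: a question from either direction becomes a single lookup.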

This representation of EID2 data shows the link between pathogens and their human and domestic animal hosts. Each host is represented by a node; the bigger the node, the more pathogens found in that host. The lines between hosts indicate the number of pathogens that show up in both hosts; the thicker the line, the more pathogens the pair share. Colors simply indicate the type of host: human, rodents, other mammals and birds.

Maya Wardeh
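The network in that caption can be derived from nothing more than a set of pathogens per host. The sketch below assumes such sets are available and computes the two quantities the caption describes: node size (pathogens per host) and line thickness (pathogens shared by a pair of hosts). The function name, host labels and pathogen IDs are made up for illustration.

```python
from itertools import combinations


def host_network(host_pathogens):
    """Return (node_size, edge_weight) for a host-to-pathogen-set mapping."""
    # Node size: how many pathogens were found in each host.
    node_size = {host: len(found) for host, found in host_pathogens.items()}
    # Edge weight: how many pathogens a pair of hosts has in common.
    edge_weight = {}
    for a, b in combinations(sorted(host_pathogens), 2):
        shared = host_pathogens[a] & host_pathogens[b]
        if shared:
            edge_weight[(a, b)] = len(shared)
    return node_size, edge_weight


# Invented example data: pathogen IDs p1..p5 are placeholders.
example = {
    "human": {"p1", "p2", "p3"},
    "rodent": {"p2", "p3"},
    "bird": {"p3", "p4"},
    "other mammal": {"p5"},
}
node_size, edge_weight = host_network(example)
print(node_size)
print(edge_weight)
```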

By pooling information from various public databases, EID2 lets users see in one spot lots of data that’s usually scattered. Above is a look at sources of information on HIV overlaid on areas where the pathogen exists.

Laptop Inset: Courtesy of Marie McIntyre. Laptop: Kostov/Shutterstock

Pros: By bringing knowledge together in one place, EID2 makes it easier to
investigate, anticipate and prepare for a pathogen. McIntyre says the
database, which includes millions of sequences and information on thousands
of pathogen species, is also easily updated. Plus, it’s free, and anyone
can use it.

Cons: EID2 relies on public information, so it’s limited to already published
knowledge. If researchers discover a pathogen but its DNA isn’t sequenced,
or if no one else has posted information about it in a public forum, EID2
can’t incorporate it.

Up Next: The EID2 team plans to expand the database, incorporating diseases that
affect crops.

A Learning Process

In the world of epidemiology, diseases that have seen an uptick in recent
years are called “emerging infectious diseases.” But are there really more
cases of these diseases, or have we just become better at spotting them?
According to Barbara Han, a disease ecologist at the non-profit Cary
Institute of Ecosystem Studies in New York, it’s not just us getting
better. “It’s actually an increasing problem of infectious diseases,” she
says. And most of these diseases originate in animals.

Han decided to figure out what makes certain animals more likely to host
specific diseases. “There is something inherent about a species that
enables it to carry disease, compared to the vast majority that don’t,” she
says. “I want to know what the data can give me, what can the data show me,
about what distinguishes those two.” She turned to algorithms and machine
learning.

Han starts with a list of species that researchers have already flagged as
disease carriers or non-disease carriers. She then trains a computer
algorithm to separate the species on the list — not labeled in any way, so
the algorithm doesn’t know which is which — by dozens of traits. For
example, the algorithm may start by looking at an animal’s body mass,
followed by its age of sexual maturity and finally by whether it’s
nocturnal or not. At the end of this sorting, the algorithm will ideally
have grouped species by whether they’re disease carriers or not.

But this first sort gets a fair bit wrong. To make the algorithm more
accurate, Han has the computer do another round of sorting, this time
focusing on the species it miscategorized the first time. When it does this
over and over again, the algorithm learns. And, importantly, it learns
which factors contribute to a species carrying a transmissible disease or
not. “At the end of that process, you get a very powerful predictor,” Han
says. When the model examines a species that’s a question mark — whether or
not it carries disease isn’t known beforehand — it can use what it’s
learned to study that species’ traits, compare them with traits from known
carriers and predict the likelihood of that species hosting a disease.
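The article does not name Han's algorithm, but "sort, then re-sort with extra attention on the mistakes" is the core idea of boosting. The sketch below uses AdaBoost with one-trait threshold rules as a stand-in; Han's real model may differ. It also assumes the known carrier labels are given to the training step (+1 for carrier, -1 for non-carrier), and every species and trait value in it is invented.

```python
import math


def best_stump(X, y, w):
    """Find the single-trait threshold rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign


def train_boosted(X, y, rounds=3):
    """AdaBoost: each round puts more weight on the species missed so far."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = best_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # this rule's say in the vote
        model.append((alpha, stump))
        # Raise the weight of misclassified species, lower the rest.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def score(model, row):
    """Weighted vote of all rules; a positive score means 'looks like a carrier'."""
    return sum(alpha * stump_predict(stump, row) for alpha, stump in model)


# Invented traits: [body mass in grams, age at sexual maturity in days, nocturnal 0/1]
X = [
    [20, 40, 1],    # species A
    [25, 45, 0],    # species B
    [30, 50, 1],    # species C
    [300, 200, 1],  # species D
    [250, 180, 0],  # species E
    [28, 48, 0],    # species F
]
y = [1, 1, 1, -1, -1, -1]  # +1 = known carrier, -1 = known non-carrier

model = train_boosted(X, y, rounds=3)
first_rule_guesses = [stump_predict(model[0][1], row) for row in X]
final_guesses = [1 if score(model, row) > 0 else -1 for row in X]
print(first_rule_guesses, final_guesses)
```

On this toy data the first rule alone gets one species wrong, while the combined vote after three rounds separates all six. Calling `score(model, traits)` on a species of unknown status then gives a signed score that can be used to rank candidates by risk.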

The algorithm can also create a list of animals ranked by their risk of
carrying disease, as well as a description of the traits that determine
that risk. For example, when Han trained the algorithm with hundreds of
mouse species, it determined disease-carrying risk was associated with a
rapid life cycle: early sexual maturity, frequent reproduction and fast
growth rates. Knowing which animals and which traits are most likely to be
associated with disease allows researchers to zero in on and prepare for
where the next pandemic could originate.

An example of how machine learning can help researchers predict where and when outbreaks might occur.

Pros: This model is based on objective facts about animals, so predictions are less prone to bias. And the model’s predictions of risk are stable because they’re based on biological traits that aren’t likely to change anytime soon.

Cons: The ability to predict any species’ disease risk relies on how much we know about it. So if we don’t have enough information, the algorithm has little to work with, and that could lead to inaccurate predictions. There’s also the problem of follow-up. “It’s almost like selling an insurance policy,” Han says. Her model can produce a list of potentially risky animals, but if no one investigates them firsthand, the prediction is just a prediction. So in many cases, confirming the model’s output takes some time.

Up Next: Han is working out how to turn prediction systems like her algorithms, which can be valuable tools for researchers already focused on sniffing out emerging diseases, into something more proactive, such as an early warning system. She’s now focusing on what types of data are necessary for such an alert system and what still needs to be collected.

The High Costs of Fighting Disease

Working to give people a heads-up when diseases break out is useless
without resources to deal with the situation. Once experts predict a
potential outbreak, who funds the necessary preventive and containment
measures? And how much will they give?

Here’s a look at some of the major contributors and how much money they’ve
committed to fighting significant disease outbreaks.

Alison Mackey/Discover

Location, Location, Location

EcoHealth Alliance, another New York-based non-profit focused on global
health, is also interested in how and when diseases jump from animals to
humans. Not only is it looking at which species put humans at risk, it also
focuses on which regions and animal habitats are more susceptible to
sparking pandemics.

“A few years ago, we compiled a database of every known emerging disease to
find out what the reality is,” says Peter Daszak, a disease ecologist and
the organization’s president. “Around two-thirds of all emerging diseases,
maybe even more, are of animal origin.”

Daszak and his team created a mathematical model that uses outbreak data
from the last 50 years to predict where outbreaks might occur. With that
tool, he and his colleagues found that many of these hot spots of emerging
diseases were in tropical areas. Then, EcoHealth team members went out to
these areas, testing local residents and wildlife for disease to confirm
their model’s accuracy. Those regions host incredibly dense and diverse
wildlife, and since each species comes with its own set of pathogens, the
more biodiversity you have, the greater the risk of emerging diseases.
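As a rough picture of how past outbreak locations can become a hot-spot map, the sketch below bins outbreak coordinates into grid cells and reports each cell's share of all recorded events. This is far simpler than EcoHealth Alliance's model, whose inputs and math are not described here; the function name, the 10-degree cell size and all coordinates are assumptions made for illustration.

```python
from collections import Counter


def hotspot_scores(outbreaks, cell_deg=10.0):
    """Bin (lat, lon) outbreak locations into grid cells.

    Returns a dict mapping each grid cell to its share of all events.
    """
    counts = Counter(
        (int(lat // cell_deg), int(lon // cell_deg)) for lat, lon in outbreaks
    )
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


# Invented coordinates, not real outbreak records.
past_outbreaks = [(2.0, 15.0), (4.5, 18.0), (-3.0, 12.0), (48.0, 2.0)]
scores = hotspot_scores(past_outbreaks)
hottest = max(scores, key=scores.get)
print(hottest, scores[hottest])
```

A count like this only reflects where outbreaks were detected and reported in the past, which is one reason field testing in the flagged regions still matters.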

“We live in a globalized world where we’re changing the environment so
fundamentally that pathogens are changing their behavior,” Daszak says.
“They can jump from one species to another more easily because we’re
butting up against different species.”

Based on data from past outbreaks, EcoHealth Alliance’s mathematical model flags areas (usually those rich in biodiversity) that are more likely to spawn an emerging disease in the future. The warmer the color, the greater the likelihood.

EcoHealth Alliance

Pros: Using these analyses to pinpoint potential outbreak hot spots allows
health care organizations and governments to direct resources to that area.
Researchers and physicians then can focus on that region and directly test
for the emergence of diseases from both wildlife and humans, allowing for a
better chance at prevention and containment.

Cons: Relying on a mathematical model requires researchers to make
assumptions. For example, the model may show that deforested areas are hot
spots for new outbreaks. But it doesn’t explain the complex reasons that
make up the whole picture of why this occurs. So while the map is limited
in what it can explain, it does point researchers to key places to
seek underlying causes.

Up Next: Emerging-disease experts from around the world, including those
from EcoHealth Alliance, have come together to form the Global Virome
Project. The goal is to identify all currently unknown viruses that could
emerge in the future — an estimated 1.6 million. By knowing which viruses
pose a threat to humans and which animals carry them, EcoHealth and similar
groups will be even better prepared to predict where the next pandemic may
spring up. The project is expected to take 10 years and cost up to $5
billion.

There’s a Map for That

Doctors Without Borders (also known by the French name Médecins Sans
Frontières, or MSF) and the British Red Cross (BRC) are collaborating to
tackle the spread of disease in real time.

Their efforts began with the Missing Maps Project, a 2014 initiative
carried out by MSF, BRC, the American Red Cross and the U.S.-based
non-profit Humanitarian OpenStreetMap Team. The project trained citizen
volunteers to digitally trace the buildings and roads that appear in
satellite images, creating maps. They focused on regions that are most
vulnerable to crises like disease outbreaks and natural disasters, but
aren’t typically mapped in detail — which can be a problem for aid workers
responding to a disaster.

MSF and BRC applied this technique in Lubumbashi, a city in the Democratic
Republic of Congo. They mapped buildings and road networks, as well as
details like neighborhood limits, identifying key areas where crisis
victims might arrive. These maps provided a basis on which to build an
outbreak tracking system: The team created software that would combine the
maps with patient details collected by doctors, making it easier to check
for patterns or signs of an outbreak.

Doctors and nurses enter patient information, including age, length of stay
and admission date, into the software, and an animated map shows where
patients are coming from and when. The tool “will show a map of the city
and the administration areas, and will show colors in different intensity
where the outbreak is occurring the highest,” says Simon Johnson, a BRC
technical leader who helped develop the software. “The idea is you can then
start preventative exercises, rather than just treatment of patients coming
in.”
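The counting behind such a map can be sketched in a few lines: group admissions by area and week, then flag the areas where cases cluster. This is not the MSF and BRC software. The `area` field, the weekly grouping and the threshold of three cases are assumptions, and the records are invented.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Admission:
    age: int
    length_of_stay_days: int
    admitted: date
    area: str  # assumed field: the administrative area the patient came from


def weekly_counts(admissions):
    """Count admissions per (area, ISO year, ISO week)."""
    counts = Counter()
    for a in admissions:
        iso = a.admitted.isocalendar()
        counts[(a.area, iso[0], iso[1])] += 1
    return counts


def flag_areas(counts, threshold=3):
    """Return the (area, year, week) keys that reach the case threshold."""
    return sorted(key for key, n in counts.items() if n >= threshold)


# Invented patient records; "Zone A" and "Zone B" are placeholder area names.
records = [
    Admission(4, 3, date(2018, 3, 5), "Zone A"),
    Admission(31, 5, date(2018, 3, 6), "Zone A"),
    Admission(12, 2, date(2018, 3, 8), "Zone A"),
    Admission(45, 4, date(2018, 3, 6), "Zone B"),
    Admission(7, 1, date(2018, 3, 14), "Zone B"),
]
counts = weekly_counts(records)
flagged = flag_areas(counts)
print(flagged)
```

In a dashboard, the per-area counts would drive the color intensity on the map, and the flagged list would tell responders where to start prevention work.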

The British Red Cross and Doctors Without Borders teamed up to build this digital dashboard. The tool combines local maps with patient data, so first responders can track details that could help them spot an outbreak in real time.