Biology after virus: think first, count later

A protective mask is shown on a statue of a mermaid playing harp in St. Clair Shores, Mich., Thursday, May 7, 2020. (AP Photo/Paul Sancya)

When I was a student at Patha Bhavan, a school that was founded by Tagore in Santiniketan, my father presented me with a book. The title of the book was The Two Cultures by C.P. Snow, later Lord Snow. I discovered that Snow had a PhD in physics and became a fellow at Cambridge University. The literary bug proved too strong even for the pulls of gravity, and Snow took to writing, which later included a memorable profile of the mathematician Ramanujan. In The Two Cultures, Snow bemoans the chasm between science and the humanities and laments the fact that Britain rewarded the humanities over science. Some 40 years have gone by. The strapping youth from Santiniketan is now a biologist in Bangalore. As the pandemic sweeps across the world, I am struck by a different kind of cultural divide, this time within science.

Two aspects of this health crisis strike me as a scientist – one at the “micro” level and the other at the “macro” level. First, the micro-level issue is exemplified by the question I am asked frequently – when will we have a vaccine or effective treatment? This, in turn, centres on a detailed scientific understanding of how the virus does damage, and how to prevent it. Unfortunately, this is where the word “novel” in “novel coronavirus” looms large. To quote the infamous phrase used by Donald Rumsfeld to describe the situation in Iraq – in our war against this new virus, we are stuck somewhere between “known unknowns” and “unknown unknowns”. In fact, much of the biomedical sciences, in the initial stages of research, function in this domain. And in a massive and rapidly evolving health crisis like this, the best we can do is “learn on the job”. So, this is what clinicians and biologists are doing – collecting noisy, real-world data. The hope is that with enough data we will be able to make sense of it all – how the virus invades and damages body cells and gives rise to life-threatening symptoms. The trouble is that even within this one subclass of viruses – the coronaviruses – there are enough differences to make it difficult to predict how a new one will affect us or what effective treatment may look like.

This is why biology is so messy. We understand the broad principles of how these little monsters attack our bodies. But one size never fits all. So, for every challenge posed by these “unknowns” we have to first collect enough data before we can devise an effective strategy to fight it. And this is largely true for how we study biological phenomena. Hence, the need to gather good data invariably precedes the process of understanding what the data “means”.

At the other end of the spectrum, the “macro” level, we face similar challenges in trying to guess how the virus may spread across large populations – how many will get infected, and how many may die or recover. All the talk of “flattening the curve” and “rate of doubling”, phrases which have now become part of our daily vocabulary, is a reflection of how theoretical models strive to predict the rate at which the disease is likely to affect us over time. No matter how sophisticated the algorithms or computers may be, all this depends on how good the data is. So, once again – data first, make sense of it later. At both ends – from the actions of the virus within tiny cells of the human body to its devastating impact on millions of human beings – data reigns supreme.

Scientists conduct research with coronaviruses in a biological safety level 3 laboratory (high-security laboratory) at the Helmholtz Centre for Infection Research HZI, in Brunswick, Germany, May 8, 2020. PTI

Research dependent on descriptive approach

In fact, this holds true for much of the biomedical sciences. Traditionally, this line of research has relied heavily on a descriptive approach wherein scientists are driven by the hope that if we can gather large amounts of data, that will lead to insights and big-picture principles – be it how the immune system fights infection or how the brain gives rise to complex functions like speech or memory. With rapid and impressive technological breakthroughs in the biomedical sciences we are now flooded with massive volumes of data. But this rich treasure trove of data has not necessarily translated into better understanding that leads to deeper principles – or overarching rules or laws – like we have in physics. Thus, more often than not, theoretical analysis or model building comes after data is collected. Also, there is an implicit assumption that although the descriptive findings are largely correlative in nature, enough of them will eventually reveal a causal link.

What about the other way around? Imagine a different sequence wherein the starting point is a theoretical framework, like in physics, that gives rise to specific predictions that are proven or refuted through experimental data. Indeed, this is how many areas of physics have developed over the centuries. The initial theoretical formalism may not be perfect or may not be an accurate description of the natural phenomenon being studied. But it still suggests specific strategies for how an idea may be tested through empirical means. Data is still pivotal to the eventual refinement and strengthening of the theory, but the trajectory or sequence of how the science is done has a very different flavour to it.

Technician of the French General Armament Directorate, DGA, specialized in research of Chemical Biological Radiological and Nuclear military protection gear analyzes public masks in their lab at Vert Le Petit, south of Paris, Wednesday, May 6, 2020. (AP Photo/Francois Mori)

Two to tango

The co-evolution of theory and experiments – when the two tango well – is what clicks. I saw this up close while pursuing my PhD in neuroscience at the Salk Institute for Biological Studies in California. Francis Crick, who along with Jim Watson discovered the structure of DNA and transformed modern biology, was an integral part of our lab at the Salk. He was a regular participant in our lab’s afternoon tea – a daily “chaa-er adda” – that covered almost every interesting scientific topic under the sun. Our discussions sitting around a small table often led to heated arguments on some of the most exciting and unresolved issues of brain science, such as the neurobiological basis of consciousness. Research on consciousness was, and still is, a “high-risk” profession because everybody from the philosopher, to the psychologist, to your barber or a new-age spiritual guru has an opinion on it.

But Francis had this remarkable ability to sift through various scientific reports and focus on a few key experimental findings that he felt held the key to moving to the next step of analysis – small, incremental steps without getting caught up in big battles with either the guru or the barber. Here is why he was so striking – the next step did not necessarily involve another dozen experiments, but thinking first about how the existing data could be synthesized into a testable idea. Francis insisted on formulating a hypothesis based on whatever we already knew, rather than complaining about how much was unknown. Then that hypothesis could be tested using new experiments. And the data emerging from those experiments could be used to refine the idea further – this is how things could move ahead through a co-evolution of ideas and experiments, unified seamlessly in pursuit of a question, however intractable it may be. Francis did this with an amazing clarity of mind that valued a good idea without being bogged down by loads of data, or the lack of it. Perhaps this was not an accident, because he was originally trained as a physicist and only later moved into biology.

A researcher tests a patient's blood for Covid-19 at a private laboratory in Rome, Tuesday, May 5, 2020. (AP Photo/Alessandra Tarantino)

Theorists’ advantage

Of course, any such scientific endeavour has a high failure rate – rarely do things work as planned. But theorists have an advantage. They do not get bogged down by the fear of being wrong, or of not having “all the data” or even “enough data”, before they come up with an idea that can be tested with experiments. It may take multiple iterations of theory and experiment before meaningful insights emerge. An experimental biologist, on the other hand, may feel that more data is necessary before one can start making sense of it all. As a result, one often ends up with piles of data and then it becomes even harder, not easier, to extract higher-level insights into which aspects of the data are more important or consequential than others. In short, this balance between theory and experiments – and how the two interact – is at the heart of a cultural divide between physics and biology. A physicist is more likely to take an educated guess at a crude “first approximation” of a natural phenomenon, just to get the ball rolling – even in the absence of “enough data”. As the data rolls in from more experimental tests of theoretical predictions, the theory improves. So, data still rules – but it is not a “pre-condition” for the two sides to do business. In biology, this relationship is, more often than not, flipped – get data first, think later.

Thus, there is a fundamental difference in how the “two cultures” – of biology and physics – use theory and experiment. It is not about one being “better” than the other. It is simply a reflection of how the two cultures differ in how they approach the same problem.

In this screen grab from video issued by Britain's Oxford University, a volunteer is injected with either an experimental COVID-19 vaccine or a comparison shot as part of the first human trials in the U.K. to test a potential vaccine, led by Oxford University in England on April 25, 2020. (University of Oxford via AP)

Seamless blending of two cultures

Why does it matter? Increasingly, our world faces existential threats that no single discipline of science can tackle alone. The coronavirus crisis is a stark reminder of the need to unite the two cultures. We are in desperate need of a deeper theoretical and quantitative understanding of the forces in play across biological scales. For example, this crisis demands that we analyse not only the biological features that make a virus lethal and contagious, but also how it impacts different populations and societies across the world. This understanding, in turn, will only emerge if data is collected in a manner that is in synergy with, and inspired by, theory. Such breakthroughs require innovations that are far more likely to happen through a seamless blending of the “two cultures”, not just an uneasy and occasional handshake.

The writer is a professor with National Centre for Biological Sciences, Bangalore.