What Statistics and Statistical Modelling Tell Us About COVID-19

Featured image: Volunteers prepare themselves to spray disinfectants on streets and shops in Kabul, Afghanistan, on March 30, 2020. Photo: Reuters/Mohammad Ismail.

It is too early to assess the direct or indirect consequences of the coronavirus outbreak on the Indian economy.

However, before the 21-day lockdown announced by Prime Minister Narendra Modi, a survey conducted by the Federation of Indian Chambers of Commerce & Industry (FICCI) among its member companies reported that 53% of respondents had experienced a significant impact of the COVID-19 pandemic on their businesses, 73% a big reduction in orders, 81% a decrease in cash flow and 63% a significant effect on their supply chains, which they expected to worsen.

The economic costs of fighting the coronavirus epidemic are mounting and will be huge. Stock markets are tumbling. Citizens are forced to stay at home. Educational institutions have been shut down. The airline and hospitality industries are taking the biggest hit.

Initially, there were large-scale cancellations and rescheduling of trips; then aircraft were grounded, airline staff were asked to take pay cuts, and hotel occupancy rates touched zero. The prediction is that if this situation persists for the next two to three months, most airlines may go bankrupt and the hospitality industry may incur a $1.5 billion loss of revenue.

Millions will lose their jobs. We are in a truly uncertain situation, close to chaos, and no one knows where we are heading. Most importantly, how long will things take to return to normal? About 88% of the respondents in the FICCI survey expect it to take between three and six months. These are forecasts based on the respondents’ own beliefs, which aren’t very reliable.

However, a lot depends on this magic, uncertain number. If we have to wait for the development and distribution of a safe and effective vaccine or an affordable treatment, it may take months or even a year or two. Can we afford to enforce a lockdown and engage in social distancing for months?

Perhaps not – in that event the economy might go from recession to severe depression. That would be financially ruinous for hundreds of millions of Indians besides affecting their mental, emotional and physical wellbeing. It may lead to a complete breakdown of governance and the occurrence of deadly and widespread civil unrest.

It is essentially from this apprehension that the disease prevention expert John P. A. Ioannidis questions the decision to go by the reasoning of “preparing for the worst” in the absence of reliable data and hard evidence. He cautions:

If the pandemic dissipates — either on its own or because of these measures — short-term extreme social distancing and lockdowns may be bearable. How long, though, should measures like these be continued if the pandemic churns across the globe unabated? How can policymakers tell if they are doing more good than harm?

Finally, he issues a caveat: COVID-19, instead of being a “once-in-a-century pandemic”, may turn out to be a “once-in-a-century evidence fiasco”.

Going forward, no one disagrees with Ioannidis. However, at this juncture, should we have waited for reliable data before we acted? The worst-case scenario that he describes is indeed catastrophic:

In the most pessimistic scenario, which I do not espouse, if the new coronavirus infects 60% of the global population and 1% of the infected people die, that will translate into more than 40 million deaths globally, matching the 1918 influenza pandemic.
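The arithmetic behind this scenario can be checked with a minimal sketch. The world population of roughly 7.8 billion in 2020 is an assumption added here; the quoted passage does not state the figure it used.

```python
# Assumption: world population of roughly 7.8 billion in 2020
# (not stated in the quoted passage).
WORLD_POPULATION = 7.8e9


def projected_deaths(population, infected_share, fatality_share):
    """Deaths = population x share infected x share of the infected who die."""
    return population * infected_share * fatality_share


# The pessimistic scenario: 60% infected, 1% of the infected die.
deaths = projected_deaths(WORLD_POPULATION, 0.60, 0.01)
print(f"{deaths / 1e6:.1f} million deaths")
```

Under that population assumption the product comes to a little under 47 million, consistent with the “more than 40 million” in the quotation.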

Still, he feels that the decision to reach for extreme measures is not rational.

To many, his argument seems to be unacceptable. Nassim Nicholas Taleb quipped on Twitter:

My own review is “Ioannidis mistakes absence of evidence for evidence of absence /recommends to buy insurance AFTER the harm when we now have evidence”.

In the case of a black-swan-like event – an event with low probability and high consequences – pragmatism should prevail. Every possible step must be taken to lower the probability and mitigate the consequences. Harry Crane attributes the US government’s failure to act, and its refusal to acknowledge the outbreak of the epidemic early enough, to “naive probabilism”.

[This is the view] held by many technocrats and academics, that all rational thought boils down to probability calculations. Naive probabilism puts accuracy (being ‘right’) first and common sense last.

The real-world application of naive probabilism is disproportionately risky for a black-swan-like event. “The 2019-2020 coronavirus outbreak (COVID-19) is a living example of the dire consequences of probabilistic naïveté,” Crane argues, adding that when survival is the issue, at a societal level we should do everything possible to reduce the spread of the virus.

Currently, the attention of policymakers is focussed on two numbers: the percentage of infected people (I) and the percentage of deaths (D) due to the new coronavirus over time. As expected, the media is flooded with all kinds of predictions of these two numbers.

Unfortunately, some pseudo-experts indulging in sensationalism have been making wild and scary predictions just for 15 minutes of fame. Given the dynamic and complex nature of the phenomenon, making a prediction even a few weeks ahead is notoriously difficult. And in the absence of test data, the actual number of infected cases is hugely uncertain. It is possibly a large and uncertain multiple of the number of confirmed cases.

As Ioannidis writes, “Reported case fatality rates, like the official 3.4% rate from the World Health Organisation, cause horror — and are meaningless.”

A migrant worker’s family draped in blanket rests along a road as they return to their village, during a 21-day nationwide lockdown to limit the spreading of coronavirus disease (COVID-19), in New Delhi, India, March 30, 2020. Photo: Reuters/Danish Siddiqui

Using multiple data sources and sensible assumptions, he concludes that a reasonable guesstimate of case fatality ratio for the US population could be anywhere between 0.5% and 1%. A report from China that recently appeared in the Journal of the American Medical Association found this rate to be 2.3%, based on 72,314 case records of the Chinese Centre for Disease Control and Prevention. This estimate also appears to be biased upward given the missing infected cases. The data indicates that the rate is considerably lower for individuals below 65 years of age, and it is extremely low for very young people.

For example, the data reports no deaths among individuals below nine years of age. Given that the Indian population is considerably younger than that of most other countries, an estimate between 0.5% and 2% seems reasonable for India even if we factor in the quality of healthcare facilities.
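The upward bias from missing infections can be illustrated with a rough sketch. It assumes that every death is counted among the confirmed cases and that true infections are some multiple of confirmed cases; the multipliers below are hypothetical values chosen for illustration, not estimates.

```python
def adjusted_fatality_rate(reported_cfr, multiplier):
    """Fatality rate among all infections, assuming true infections equal
    `multiplier` times the confirmed cases and no deaths are missed."""
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    return reported_cfr / multiplier


REPORTED_CFR = 0.023  # the 2.3% rate from the report cited above

# Hypothetical multipliers, for illustration only.
for multiplier in (1, 2, 5, 10):
    rate = adjusted_fatality_rate(REPORTED_CFR, multiplier)
    print(f"true cases = {multiplier:>2} x confirmed: fatality rate {rate:.2%}")
```

Even a modest multiplier pulls the reported rate down into the range suggested above, which is why the number of undetected infections matters so much.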

In this whole business of prediction, what is notoriously difficult is to project the percentage of infected individuals (I) over time. From policymakers’ perspective, the simple and clear objective is to reduce I as far as possible over time. This will flatten the curve, and so lower the stress on the existing health care infrastructure.

Controlling I, in turn, depends crucially on the basic reproductive number R0, which represents how many people an infected person is likely to infect on average. In the extreme case, when R0 = 0, an infection stops spreading.

If R0 exceeds one, i.e., each patient infects more than one other person on average, the population of infected people will grow. On the other hand, if it is less than one, the infected population shrinks over time. However, it is difficult to estimate R0.
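The threshold at one can be seen in a deliberately simple sketch in which each generation of cases infects R0 times as many people as the one before. It ignores the depletion of susceptible people and any change in behaviour, and the R0 values are illustrative, not estimates for COVID-19.

```python
def cases_by_generation(r0, generations, initial_cases=1.0):
    """Expected new cases in each generation when every case infects
    r0 others on average (no depletion of susceptibles assumed)."""
    return [initial_cases * r0 ** g for g in range(generations + 1)]


# Illustrative values only: below, at and above the threshold of one.
for r0 in (0.8, 1.0, 2.5):
    final = cases_by_generation(r0, 10, initial_cases=100)[-1]
    print(f"R0 = {r0}: about {final:.0f} new cases in generation 10")
```

Starting from 100 cases, the count fades when R0 is below one, stays flat at exactly one and explodes above it.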

For its estimation, one needs to know the duration (how long someone would remain infectious with the coronavirus; the current estimate is between one and two weeks), the opportunity (how many people a patient would come in contact with in a day on average), the transmission probability (the chance of infecting someone during an interaction) and the susceptibility (the chance that the person at the other end of the interaction catches the infection).
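One common simplification, assumed here rather than taken from the article, treats R0 as the product of the four factors above. The input values in this sketch are hypothetical and serve only to show how cutting daily contacts changes the result.

```python
def basic_reproduction_number(duration_days, contacts_per_day,
                              transmission_prob, susceptibility):
    """R0 as a simple product of the four factors (a simplifying assumption)."""
    return duration_days * contacts_per_day * transmission_prob * susceptibility


# Hypothetical inputs: 10 infectious days, a 2.5% chance of transmission per
# contact, and everyone susceptible. Only the contact rate differs.
baseline = basic_reproduction_number(10, 10, 0.025, 1.0)
reduced_contacts = basic_reproduction_number(10, 2, 0.025, 1.0)

print(f"10 contacts a day: R0 = {baseline:.2f}")
print(f" 2 contacts a day: R0 = {reduced_contacts:.2f}")
```

Because the factors multiply, reducing any one of them by a given proportion reduces R0 by the same proportion.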

Interestingly, of these four factors, policymakers can effectively control only the opportunity, by enforcing a lockdown and physical distancing measures. And enforcing a lockdown for a two-week period could bring R0 down to zero if all infected people are quarantined.

This conclusion, however, hinges on two key assumptions:

(i) the duration is at most two weeks, and
(ii) as soon as someone is infected, the person gets detected.

Assumption (i) seems to be reasonable in light of the data collected thus far. However, assumption (ii) is clearly not tenable.

A recent article in the journal Science reported that about “86% of cases were ‘undocumented’ – that is, asymptomatic or had only very mild symptoms” and that “undocumented infections were the infection source for 79% of documented cases. These findings explain the rapid geographic spread of SARS-CoV-2 and indicate containment of the virus will be particularly challenging.”

Thus, to understand the dynamics of the virus’s spread, estimating the percentage of asymptomatic patients is critical. More importantly, some of these asymptomatic patients turn out to be ‘super-spreaders’, whose R0 values are extremely high and are outliers in the distribution of R0.

Super-spreaders are also not uncommon, making the distribution of R0 heavy-tailed on the right side. CNN reported that three asymptomatic employees who attended a recent Biogen conference in Boston later tested positive: “As of March 16, 108 cases of Massachusetts’ total 164 coronavirus cases have been linked back to the Biogen conference.”

Another study reported that “66% of the 91 people in Singapore region contracted the infection from someone who was pre-symptomatic, while between 62% and 77% of the people caught the coronavirus from someone who had yet to show symptoms in the Tianjin cluster.”

The challenge for policy planners thus boils down to pushing the whole mass of the distribution of R0 below 1 by cutting the right-side tail. Thus, banning all kinds of gatherings and enforcing physical distancing for a considerably longer period are crucial in the absence of comprehensive test data.
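What a heavy right tail means, and why cutting it matters, can be sketched with a small simulation. It assumes, purely for illustration, that the number of people each case would infect follows a gamma distribution with a mean of 2.5 and a small shape parameter; neither value comes from the article.

```python
import random


def simulate_individual_r(mean_r, dispersion, n, seed=0):
    """Draw n individual reproduction numbers from a gamma distribution
    with the given mean; a small `dispersion` (shape) gives a heavy right tail."""
    rng = random.Random(seed)
    scale = mean_r / dispersion  # gamma mean = shape * scale
    return [rng.gammavariate(dispersion, scale) for _ in range(n)]


def share_from_top(values, top_fraction=0.2):
    """Share of all transmission attributable to the top fraction of cases."""
    ordered = sorted(values, reverse=True)
    top_n = max(1, int(len(ordered) * top_fraction))
    return sum(ordered[:top_n]) / sum(ordered)


def capped_mean(values, cap):
    """Average reproduction number if nobody can infect more than `cap` people."""
    return sum(min(v, cap) for v in values) / len(values)


# Hypothetical parameters, for illustration only.
values = simulate_individual_r(mean_r=2.5, dispersion=0.1, n=10_000)
mean_r = sum(values) / len(values)

print(f"average R: {mean_r:.2f}")
print(f"share of transmission from the top 20% of cases: {share_from_top(values):.0%}")
print(f"average R with each case capped at 5: {capped_mean(values, 5):.2f}")
```

In such a distribution a small minority of cases accounts for most of the transmission, so limiting the largest values, as a ban on gatherings aims to do, lowers the average far more than the typical case would suggest.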

How long the pandemic will continue is the most crucial question going forward.

Who knows whether there will be a second phase of the pandemic, caused by a mutated and more virulent coronavirus, as happened in the Spanish flu pandemic of 1918?

The renowned virus ‘hunter’ and researcher W. Ian Lipkin of Columbia University shared his view on this in a recent interview with PBS, available on YouTube:

I don’t think that this virus is ever gonna disappear completely. I hope I’m wrong. I think it’s with us to stay. It would not surprise me if a substantial proportion of the world’s population ultimately became infected with this virus.

He then reaffirmed that we are not going to lose “the equivalent of the number of people we lost during 1918 flu pandemic” because:

… that was a very different era. It’s true that more people become exposed more rapidly with this virus, well, we have antibiotics to treat secondary infections, we understand more about how virus is transmitted, and we will have a vaccine, and I am confident, working together, pulling the world together in this fashion, and sharing information and sharing technologies, we will get to a vaccine in less than a year. We must get to a vaccine in less than a year.

These words from a world-renowned expert are like music to our ears.

At this point, however, the priority of the government is to control the progression of the epidemic in order to avoid stressing the existing healthcare infrastructure.

Going forward, no country – including India – can afford to fight the coronavirus by locking down for months. The economic and social costs are of unimaginable magnitude. In India, the population of gig workers is already around 10 million, apart from millions of daily-wage labourers in the unorganised sector.

Their livelihoods are already in peril, and their families would soon be thrown into an abyss of debt and poverty in the absence of adequate fiscal stimulus from the government and the mobilisation of all kinds of support by civil society. One needs to make a trade-off between lives and livelihoods.

In this context, Mireille Jacobson and Tom Chang, two health policy experts, provided an economic rationale for the extreme measures adopted by the US government. Economists often use a concept called ‘the value of a statistical life’ (VSL) for rationalising the tradeoffs between the risk of dying and money.

Estimates of VSL are obtained from data on willingness to pay to achieve incremental reductions in the risk of dying and are widely used by different countries, including the US, to examine the cost-benefit tradeoffs of health, safety and environmental regulations.

Doing a back-of-the-envelope calculation, Jacobson and Chang arrive at $8.5 trillion as the approximate economic cost of lives lost for the US in the worst-case scenario. This is around half the GDP of the country. Thus, they conclude that the costs of not taking these extreme measures are too high.

A similar calculation for India, presuming the number of deaths to be 6.8 million (the population size of India is approximately four times that of the US), and VSL at age 60 to be about Rs 2.3 crore (half the VSL estimate for India) would get us to $2.23 trillion, close to our GDP of $2.72 trillion in 2018.
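The India figure can be reproduced with a short sketch. The exchange rate of about Rs 70 to the dollar is an assumption added here because it recovers the quoted total; the article does not state the rate it used.

```python
CRORE = 10_000_000  # one crore = ten million

deaths = 6.8e6                  # presumed number of deaths in India
vsl_rupees = 2.3 * CRORE        # VSL at age 60, in rupees
INR_PER_USD = 70.0              # assumption: exchange rate not given in the article
GDP_2018_USD = 2.72e12          # India's 2018 GDP as quoted above

cost_usd = deaths * vsl_rupees / INR_PER_USD

print(f"cost of lives lost: ${cost_usd / 1e12:.2f} trillion")
print(f"as a share of 2018 GDP: {cost_usd / GDP_2018_USD:.0%}")
```

At that rate the total is about $2.23 trillion, a little over four-fifths of the 2018 GDP figure.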

But this rationale seems to be flawed and missing the all-important human factor. These numbers don’t make much sense considering the disproportionately negative impact of extreme measures on the livelihoods of millions of Indians at the bottom of the pyramid, most of whom are not covered by any social security scheme.

As a civil society, we should all come forward and help these people tide over this crisis-of-a-century.

Tathagata Bandyopadhyay is professor of production and quantitative methods at the Indian Institute of Management, Ahmedabad.