Rothamsted turn to harvesting coronavirus data

A group of researchers based at Rothamsted Research, one of the oldest agricultural research institutions in the world, has responded to a request from the White House, Microsoft, Mark Zuckerberg and others to find a way to rapidly sift through the mountain of COVID-19 scientific data.

Taking time off from their own research, the Rothamsted team repurposed a tool they had originally developed for crop scientists, so that it now gives medical researchers quick and intuitive access to all documented linkages between genes, medicines, and the virus.

The hope is that bringing COVID-19-related data together in one place will speed up the international search for useful drugs, stop researchers repeating work done elsewhere, avoid harmful interventions, and ultimately help pave the way to a vaccine.

A US Government-backed call had urged the world’s artificial intelligence experts to develop new text and data mining techniques that could help the science community answer urgent questions related to the deadly outbreak.

Project leader Dr Keywan Hassani-Pak originally developed the KnetMiner software to support scientists studying complex plant traits and diseases, but he and his team quickly realized its potential to aid coronavirus research.

“Using KnetMiner, medical researchers can now search for genes and keywords, visualize connections between biological concepts and explore knowledge relating to the new coronavirus and COVID-19 disease.

“Users can search for drugs related to coronavirus and explore the surrounding connected data. Alternatively, they can investigate what pathways the drugs affect and visualize if any negative downstream effects may be present with using the drug in certain diseased populations.

“The genetic component of how SARS-CoV-2 and the human body interact can also be explored.”

The software links together almost 170,000 scientific articles, the majority with detailed information about human genes, plus SARS and COVID-19 related proteins, drugs and other medical conditions.

This works out at more than 1.6 million relationships between biological entities, something that would take years to search through using conventional means.
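
For readers curious what "relationships between biological entities" might look like in practice, the following is a minimal sketch of a knowledge graph held as subject-relation-object triples, with a search for everything within a set number of links of a starting entity. It is an illustration only and not KnetMiner's implementation or data model: the entity names (DrugA, GeneX and so on), the relation labels, and the functions build_graph and neighbourhood are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical toy triples (subject, relation, object); not real KnetMiner data.
TRIPLES = [
    ("DrugA", "targets", "GeneX"),
    ("GeneX", "participates_in", "PathwayP"),
    ("PathwayP", "associated_with", "ConditionC"),
    ("ProteinV", "interacts_with", "GeneX"),
    ("Article1", "mentions", "DrugA"),
]


def build_graph(triples):
    """Index triples as an undirected adjacency map that keeps relation labels."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        graph[obj].append((relation, subject))
    return graph


def neighbourhood(graph, start, max_hops):
    """Breadth-first search returning {entity: hop distance} within max_hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the requested radius
        for _relation, neighbour in graph[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen


graph = build_graph(TRIPLES)
# Everything within two links of the hypothetical drug.
print(neighbourhood(graph, "DrugA", 2))
```

In this toy example a two-link search from the drug reaches its target gene, the article mentioning it, and the pathway and viral protein linked to that gene; the associated condition only appears when the radius is widened to three links.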

“We have connected the dots in the COVID-19 biomedical data and put the information in a machine-readable format and in context with human genetics, pathogen-host, and drug-target interaction data,” said Dr Hassani-Pak.

It was mid-March when the White House Office of Science and Technology Policy launched the COVID-19 call to action.

Over 500 scientists, software developers and clinicians joined forces in the COVID-19 virtual Biohackathon at the beginning of April to develop new tools for working with COVID-19 data.

Working from their homes, the team of Joseph Hearnshaw, Dr Marco Brandizi, Ajit Singh and Dr Keywan Hassani-Pak managed to develop the COVID-19 knowledge graph for KnetMiner in less than a month.

Dr Hassani-Pak said: “I knew our technology was versatile, but to deliver this within such a short time scale was beyond my expectation and only possible due to a fantastic team and a global effort to make COVID-19 data openly available.

“The newly developed biomedical resource offers developers and analysts the opportunity to use our data for new analyses and applications. A full download of our COVID-19 knowledge graph is available on request.”