VISUALISING DATA CONNECTIONS PROMISES FASTER DISCOVERIES

A new software collaboration makes a vast range of biological databases more accessible for exploitation

30 August 2017

Data mining and visualisation software, created to help plant scientists trawl diverse biological databases for clues to designing better crops, can now support less specialist users across a range of disciplines, including human disease research.

KnetMiner, with a silent “K” and standing for Knowledge Network Miner, is a suite of open-source software tools developed at Rothamsted Research for integrating and visualising large biological datasets. Genestack, with whom Rothamsted is collaborating, provides a secure, commercial platform.

The software mines the myriad databases that describe an organism’s biology to present links between relevant pieces of information, such as genes, biological pathways, phenotypes or publications. The aim is to provide leads for scientists who are investigating the molecular basis of a particular trait or looking for ways to improve the organism’s performance.

“It takes the slog out of preliminary investigations when you need to explore multiple online resources to look for what might be happening,” says Chris Rawlings, head of computational and analytical sciences at Rothamsted. “It helps people to understand the complexity of the biology that underpins traits, of how different genes contribute to a phenotype.”

KnetMiner has proved itself at Rothamsted as it has evolved over the past five years from being a visualisation component of a data integration system, known as Ondex, developed at Rothamsted more than a decade ago.

Published research aids the hunt for dormancy genes in common wheat, Triticum aestivum

“Genotype to phenotype analysis is at the core of what biologists do,” says Keywan Hassani-Pak, head of bioinformatics at Rothamsted and KnetMiner’s lead developer. “With KnetMiner, we have created software that enables biologists to take their own high-throughput experimental data and to see them in the context of all the public knowledge that is out there. This can help them to interpret their own data faster and more effectively.

“For a particular target species, such as a crop plant, KnetMiner integrates all the relevant genomics and multi-omics information that is present in more than 25 sources under a multitude of formats…and brings it together in the form of a heterogeneous knowledge network,” says Hassani-Pak.

He adds: “We don’t only integrate the data; we also create new relationships based, for example, on co-occurrences of genes and phenotypes in the scientific literature. We are the first in the UK to develop such detailed networks and make them mineable.”

With security-conscious corporations keen to use the software, KnetMiner has now advanced from being a research tool to a commercial product by joining with the Genestack software platform that is designed to overcome the challenges of bioinformatics in research enterprises.

“The Rothamsted researchers could spend months collecting all the data that was available for a particular organism, cleaning the data and writing scripts to transfer it into a format that was usable in KnetMiner and then presenting it so that other scientists could use the information,” says Misha Kapushesky, chief executive of Genestack.

After migrating KnetMiner onto the Genestack platform and automating the collection process, says Kapushesky: “It is now possible to simply ‘point and click’ on data that is in the public domain to create a network and then overlay your own data, using KnetMiner to visualise it.

“You can build your own network with collaborators in a secure environment. It is no longer a fixed set of data on the Rothamsted website but a dynamic tool that can be made commercially available,” says Kapushesky.

Genestack now hosts more than 40 plant and crop networks, as well as a prototype human disease network. Although the software originated in agri-research, network mining for gene discovery is generic and Genestack provides an environment for building and distributing these large-scale knowledge networks.

“There are a lot of tools out there that will return a list of ranked genes when you are conducting a gene candidate analysis, and of course KnetMiner also does that with its evidence-based gene rank algorithm. But most of them also stop there,” says Hassani-Pak.

“KnetMiner is unique as it allows users to see how and why the prediction was made. They can fully understand the results because the process is completely transparent and the provenance is visualised,” says Hassani-Pak. “There is no black box approach here.”

Hassani-Pak and Kapushesky say that this approach supports human-augmented knowledge discovery, which puts human experts – rather than machines – at the core of the decision-making process.

“We need to free [the human brain] from tedious tasks,” says Kapushesky. “By reducing the complexity, it makes it easier for researchers to see the patterns and links that push the frontiers of science further, and the tools also make it possible for others to apply the findings in a commercial environment.”

Rawlings adds: “This is a good example of how research software can be translated into a commercial platform for industry, with potential revenues, through royalty payments, returning to Rothamsted to fund further research.”

Funding from the Biotechnology & Biological Sciences Research Council supported the original Ondex project, and then a follow-on project under the BBSRC’s Tools and Resources Development Fund. Funding from Innovate UK has supported the collaboration between Rothamsted and Genestack.

Genestack is a bioinformatics company offering a platform for complex multi-omics data and metadata management, analysis and visualisation. The platform transforms the way genomics research and development is done by eliminating routine tasks, tackling inefficiencies and helping its users to overcome the challenges of bioinformatics. The Genestack team brings together expertise in computational biology, genetics and algorithm design to help customers across a range of industries, including pharmaceuticals, healthcare and agriculture, to accelerate their genomics-based research.

About Rothamsted Research

Rothamsted Research is the longest-running agricultural research institute in the world. We work from gene to field with a proud history of ground-breaking discoveries, from crop treatment to crop protection, from statistical interpretation to soils management. Our founders, in 1843, were the pioneers of modern agriculture, and we are known for our imaginative science and our collaborative influence on fresh thinking and farming practices.

Through independent science and innovation, we make significant contributions to improving agri-food systems in the UK and internationally. In terms of the institute’s economic contribution, the cumulative impact of our work in the UK was calculated to exceed £3000 million a year in 2015 [1]. Our strength lies in our systems approach, which combines science and strategic research, interdisciplinary teams and partnerships.

Rothamsted is also home to three unique resources. These National Capabilities are open to researchers from all over the world: The Long-Term Experiments, Rothamsted Insect Survey and the North Wyke Farm Platform.

We are strategically funded by the Biotechnology and Biological Sciences Research Council (BBSRC), with additional support from other national and international funding streams, and from industry. We are also supported by the Lawes Agricultural Trust (LAT).

For more information, visit https://www.rothamsted.ac.uk/; Twitter @Rothamsted

[1] Rothamsted Research and the Value of Excellence: A synthesis of the available evidence, by Séan Rickard (Oct 2015)

About BBSRC

The Biotechnology and Biological Sciences Research Council is part of UK Research and Innovation, a non-departmental public body funded by a grant-in-aid from the UK government.

BBSRC invests in world-class bioscience research and training on behalf of the UK public. Our aim is to further scientific knowledge, to promote economic growth, wealth and job creation and to improve quality of life in the UK and beyond.

Funded by government, BBSRC invested £469 million in world-class bioscience in 2016-17. We support research and training in universities and strategically funded institutes. BBSRC research and the people we fund are helping society to meet major challenges, including food security, green energy and healthier, longer lives. Our investments underpin important UK economic sectors, such as farming, food, industrial biotechnology and pharmaceuticals.

More information about BBSRC, our science and our impact.

More information about BBSRC strategically funded institutes.

About LAT

The Lawes Agricultural Trust, established in 1889 by Sir John Bennet Lawes, supports Rothamsted Research’s national and international agricultural science through the provision of land, facilities and funding. LAT, a charitable trust, owns the estates at Harpenden and Broom's Barn, including many of the buildings used by Rothamsted Research. LAT provides an annual research grant to the Director, accommodation for nearly 200 people, and support for fellowships for young scientists from developing countries. LAT also makes capital grants to help modernise facilities at Rothamsted, or invests in new buildings.