Microbiome Search Engine Assesses Microbiome Novelty and Impact

Our group developed a way to objectively evaluate the novelty and impact of the many microbiomes in the vast universe of microbiome big data, based on an innovative tool called Microbiome Search Engine (MSE). These inventions, published in mBio, are compasses guiding mankind’s exploration of that universe.

Microbiomes, the microbial societies that colonize almost every corner of our planet, are pivotal to human health and to indoor environments, air, soil and the ocean, and they shape the past, present and future of these ecosystems.

Despite the immense volume of microbiome data already published, few computational approaches are available to process and integrate them. In particular, it is difficult to relate a new microbiome sample to the huge number of existing microbiome samples.

"MSE to microbiome big-data is like Google or Baidu to webpage big-data. By searching for the most structurally or functionally similar microbiomes in a super-fast manner, MSE offers the first opportunity to relate each microbiome ever published to the microbiome big-data known to mankind so far," said SU Xiaoquan, Lead of the Bioinformatics Group.

Built on MSE, the group introduced three metrics. The Microbiome Novelty Score (MNS) evaluates the compositional uniqueness of a microbiome sample at the time of its birth. The Microbiome Attention Score (MAS) quantifies the scientific attention devoted to a microbiome by counting its close neighbors. The Microbiome Focus Index (MFI), which is derived from MNS and MAS, measures the impact and contribution of a microbiome sample to mankind’s exploration for novel microbiomes.
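To make the idea of novelty and attention scores concrete, here is a minimal sketch in Python. It is not the published MSE implementation: the Bray-Curtis similarity is a stand-in for whatever similarity measure MSE uses, and the function names, the `top_k` parameter and the `threshold` value are all assumptions made for illustration. The text does not say how MFI is derived from MNS and MAS, so MFI is deliberately left out.

```python
from typing import Dict, List

# A sample is modelled as a mapping from taxon name to relative abundance.
# This representation is an assumption made for the sketch.
Profile = Dict[str, float]


def bray_curtis_similarity(a: Profile, b: Profile) -> float:
    """Return 1 minus the Bray-Curtis dissimilarity of two profiles.

    Used here only as a stand-in similarity measure.
    """
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total > 0 else 0.0


def novelty_score(query: Profile, earlier: List[Profile], top_k: int = 1) -> float:
    """Toy novelty: 1 minus the mean similarity to the top_k closest
    samples that already existed when the query sample appeared."""
    if not earlier:
        return 1.0  # nothing to compare against, so maximally novel
    sims = sorted(
        (bray_curtis_similarity(query, s) for s in earlier), reverse=True
    )[:top_k]
    return 1.0 - sum(sims) / len(sims)


def attention_score(query: Profile, others: List[Profile],
                    threshold: float = 0.8) -> int:
    """Toy attention: number of other samples whose similarity to the
    query is at least `threshold` (its "close neighbors")."""
    return sum(
        1 for s in others if bray_curtis_similarity(query, s) >= threshold
    )
```

In this sketch a sample scores high on novelty when nothing published before it looks similar, and high on attention when many other samples now sit close to it.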

Microbiome samples with extraordinary MFI are samples that were born with high novelty and then attracted a lot of follow-up scientific investigation. Therefore, MNS, MAS and MFI serve as one objective way to measure the novelty and impact of a sample, a project, a scientist or a research area. These so-called "alt-metrics", which are based on the data themselves, are fundamentally different from conventional ways of assessing research impact, such as citation numbers or the Impact Factor, which are subject to human judgment and thus can be biased or skewed.

Using MSE, we predicted the "sleeping beauty" microbiomes, i.e., published microbiome samples that are still very novel in structure at present yet are expected to attract a lot of scientific attention in the next several years, based on the temporal growth of their MAS.
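One simple way to picture a prediction based on temporal growth of MAS is a trend fit over yearly scores. The sketch below is an assumption-laden illustration, not the method used in the study: it fits an ordinary least-squares slope to a sample's MAS over time and flags the sample when it is still novel and its attention is rising. The function names and the `min_novelty` and `min_growth` cutoffs are hypothetical.

```python
from typing import Sequence


def growth_rate(years: Sequence[float], scores: Sequence[float]) -> float:
    """Least-squares slope of attention score against year."""
    n = len(years)
    if n < 2 or n != len(scores):
        raise ValueError("need at least two (year, score) pairs of equal length")
    mean_x = sum(years) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
    return sxy / sxx


def is_sleeping_beauty_candidate(
    novelty: float,
    years: Sequence[float],
    scores: Sequence[float],
    min_novelty: float = 0.5,   # hypothetical cutoff
    min_growth: float = 1.0,    # hypothetical cutoff, in neighbors per year
) -> bool:
    """Flag a sample that is still novel and whose attention is growing."""
    return novelty >= min_novelty and growth_rate(years, scores) >= min_growth
```

A real analysis would need cutoffs chosen from the data and a growth model validated against held-out years; the linear fit here only shows the shape of the reasoning.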

These "sleeping beauties" are mainly from marine environments and mother-baby interactions. Thus, data mining, made possible by MSE, can help the scientific community and funding agencies identify the research areas with the highest potential for generating high-novelty and high-impact microbiome data.

As one of the first big-data mining tools introduced by Chinese scientists in the Earth Microbiome Project (EMP), MSE will support ongoing mining of the immense datasets being generated by EMP as well as the CAS Microbiome Project.