Cleaning, Tagging, and Structuring Life Science Data

Changing the Experimentation Model

Any researcher knows that proper preparation and planning are crucial to the success of a new research project in the lab. While traditional life science search tools can help researchers identify lab products that worked well in similar experiments (by pulling from scientific research papers), the burden is still on the researcher to download and manually sift through what seems like an endless number of PDF articles. With 30 million scientific articles available today and 300 million life science products on the market, this is an almost impossible task.

The problem is that traditional life science search tools don’t perform the difficult but necessary intelligence work of analyzing, aggregating, and structuring the data in the papers. Lab researchers, in turn, are burning through unnecessary cycles and wasting money to work out which products are likely to work best in their experiments and how to use them.

To advance life science and bring innovation to market, the research industry needs to address several significant issues that have plagued it for far too long. Here is a look at what is needed:

A new purchasing process for laboratory products

No doubt about it, our industry would greatly benefit from a modern laboratory products purchasing process, one that emulates the everyday retail eCommerce experience. This includes transparency of product information and price, fast shipping, and the ability to compare similar products and read reviews from actual customers. Laboratory product suppliers have lagged behind other industries, failing to provide websites and purchasing experiences that meet the latest eCommerce standards.

Changing the model would create demand for non-product-supplying companies to help laboratories navigate the current lab product purchasing process. Locating and solving unmet market needs and customer challenges (including those around the purchasing process) will set innovative companies apart from stagnant ones. A revived research tool purchasing process would provide researchers with collated, simple-to-use information at their fingertips, allowing them to make quick and informed purchase decisions. Too many laboratories waste precious dollars to find the right product for their experiments. This is a huge problem with antibody purchases in particular, as labs sometimes perform months of trial and error to find the right antibody.

Address the repeatability and reproducibility problem of many experiments

It’s key for our industry to come together in insisting on more transparency. Given that the U.S. government funds much experimentation work, journal publishers should be required to make scientific articles publicly available. In an industry as important as life science research, which is seeing the decline of printed journals in favor of online information, it is unacceptable that researchers are blocked by publisher paywalls. Paywalls charge researchers to view articles that they need to read in order to do their jobs.

Last year, the founder of Sci-Hub refused to shut down the website, which made 48 million journal articles freely available online, despite a court injunction and a lawsuit. While Sci-Hub is no longer available on the open web, I anticipate this is just the beginning of a movement, as more researchers are banding together to support the open flow of research. Congress is now getting involved as well, with two acts currently under consideration that promote open access to published scientific research funded by federal agencies: the Public Access to Public Science Act and the Fair Access to Science and Technology Research Act. Over the next few years, I expect significant strides will be made in making this information accessible to the public.

Keep vendors from selling incorrectly characterized products

Researchers need to demand transparent disclosure of the quality and compatibility of life science tools. Each year, over $30 billion in taxpayer money is allotted to the National Institutes of Health (NIH), which means every taxpayer has reason to expect openness and honesty from life science tool companies. This is especially so given that products are used by researchers performing work that can lead to breakthroughs, such as discovering cures for diseases and new medications and treatments for ailments affecting everyone. Taxpayer money consistently goes to waste when research products do not work as advertised, causing researchers to lose weeks and months working with “bad” products. As a case in point, antibodies fail almost 50 percent of the time, requiring researchers to perform multiple redundant experiments just to determine which antibody works. These wasteful experiments result in hundreds of millions of dollars being spent on bad products, and further millions being spent on wasted researcher time. The public will demand more transparency to ensure its money is being put to good use.

Create an objective rating for life science products based on quality

The industry rating system needs to evolve to a point where researchers are more dependent on data-driven objective product rating parameters to help improve experiment success rates. Amazon and Yelp provide user-generated reviews and ratings, and while these can be useful in many industries, they are less effective in life science research where biased agendas can impact the ratings of a product or assay. Moreover, the lack of data-driven objective product and service rating parameters has led to a flood of fake reviews in many industries, with some sites reporting 30 percent of their reviews as fake. Objective, unbiased rating algorithms based on reliable sources of data will form the basis for accurate rating platforms that facilitate better researcher decision making.

Today, there are about 10,000 companies out there selling life science tools. Some of them sell great products, while others do not. By algorithmically rating products based on quality, the industry will start to see the vendors selling good products float to the top, cleaning up the field. Grant money from the NIH will be used more wisely, and that $30 billion can be spent on products that actually work. This will all translate into faster, more effective drug discovery, as well as better basic research to find cures for cancer, Alzheimer’s, Parkinson’s, and other diseases. It will impact everything we see being done today in the life science domain. I liken it to the changes we saw in the travel industry when suddenly we were able to explore travel options instantaneously. This led to disruption in the field, but it was good disruption. We now have similar changes taking place in the life science space, with researchers now able to explore experimentation options instantaneously. They have been waiting a long time for this.
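To make the idea of algorithmic rating concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any existing rating platform: the `ProductEvidence` record, the product names, the counts, and the choice of a smoothed success rate are all hypothetical. The sketch scores each product by how often it worked in published experiments, and pulls products with little evidence toward a neutral prior so that two lucky results do not outrank a long track record.

```python
from dataclasses import dataclass


@dataclass
class ProductEvidence:
    """Hypothetical evidence record for one product, mined from publications."""
    name: str
    successes: int  # experiments in which the product reportedly worked
    failures: int   # experiments in which it reportedly did not


def smoothed_score(evidence: ProductEvidence,
                   prior_mean: float = 0.5,
                   prior_weight: float = 10.0) -> float:
    """Success rate shrunk toward prior_mean when evidence is scarce.

    prior_mean and prior_weight are assumed values chosen for illustration.
    """
    total = evidence.successes + evidence.failures
    return (evidence.successes + prior_mean * prior_weight) / (total + prior_weight)


def rank_products(products):
    """Return products ordered from highest to lowest smoothed score."""
    return sorted(products, key=smoothed_score, reverse=True)


# Made-up example data, not real measurements.
products = [
    ProductEvidence("antibody-A", successes=2, failures=0),
    ProductEvidence("antibody-B", successes=90, failures=10),
    ProductEvidence("antibody-C", successes=20, failures=20),
]
ranked = [p.name for p in rank_products(products)]
print(ranked)
```

With these made-up counts, antibody-B ranks first even though antibody-A has a perfect raw record, because two experiments are too little evidence to outweigh one hundred.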

Introduce more modern technologies to life science research

Finally, to help inspire innovation and improve life science research and drug discovery, it’s time for the life science industry to evolve and rely on more modern technology, such as natural language processing, machine learning, and artificial intelligence. These technologies are already broadly used in other industries, including travel, to help users make smarter choices more easily and quickly. The same technologies can and should play a big role in advancing methods and practices in both academic and biopharma research environments. They can be used to clean and tag scientific data hidden in long, text-dense PDF files. Mining hundreds of millions of pages of research publications and structuring the data in a comparable, easy-to-use, and visual fashion in mere seconds will no doubt benefit and revolutionize the industry.
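As a rough illustration of what cleaning and tagging can mean in practice, consider the sketch below. It uses only Python's standard library and simple pattern matching, which is far cruder than the natural language processing described above. The technique vocabulary, the catalog number pattern, and the sample sentence are all hypothetical, chosen only to show the shape of the task: repair text extracted from a PDF, then attach structured tags to it.

```python
import re

# Hypothetical vocabulary of experimental techniques to look for.
TECHNIQUES = ["western blot", "immunohistochemistry", "elisa", "flow cytometry"]

# Hypothetical pattern for catalog numbers written as "cat. no. XYZ-123".
CATALOG_PATTERN = re.compile(
    r"cat(?:alog)?\.?\s*(?:no\.?|#)\s*([A-Za-z0-9-]+)", re.IGNORECASE
)


def clean(text: str) -> str:
    """Rejoin words hyphenated across line breaks, then collapse whitespace."""
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    return re.sub(r"\s+", " ", text).strip()


def tag(text: str) -> dict:
    """Return the cleaned text with the techniques and catalog numbers found."""
    cleaned = clean(text)
    lowered = cleaned.lower()
    return {
        "text": cleaned,
        "techniques": [t for t in TECHNIQUES if t in lowered],
        "catalog_numbers": CATALOG_PATTERN.findall(cleaned),
    }


# Made-up sentence with the line breaks typical of PDF extraction.
raw = ("Lysates were analyzed by western\nblot using an anti-actin anti-\n"
       "body (cat. no. AB-1234).")
record = tag(raw)
print(record)
```

A structured record like this one is what makes products comparable across papers. A production system would need much richer language models and curated vocabularies than this sketch assumes.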

In conclusion, 2017 should be a pivotal year for the life science industry, one that brings about innovation and more speed in drug discovery and disease cure rates. It’s crucial that the industry as a whole works together and shifts its thinking to embrace and adopt modern cloud and IT data platforms in the research lab, such as advanced cloud-based search engines, unbiased reagent rating platforms, life science tool usage guides, and more efficient and transparent procurement methods. The modern, data-driven intelligent search engine will no doubt play a pivotal role in advancing life science research by offering higher-quality product ratings to inform experiments.