Spidey Sense: Open Source Chemistry

Community Service Prize
Winner: Royal Society of Chemistry, nominated by Collaborative Drug Discovery
Project: ChemSpider

By Allison Proffitt

July 29, 2010 | ChemSpider started as a “hobby project out of a basement” in 2007, says Antony Williams, ChemSpider’s VP of Strategic Development at the Royal Society of Chemistry. “We’ve seen for years that a lot of efforts were going on to create silos, but nobody was really doing the work to bring the information together.” Williams and a few others tried to do just that with ChemSpider, a free, curated online resource that integrates chemical structure data with biology data and lets users search chemistry or biology databases, chemistry articles, patents, and web pages such as blogs and wikis. ChemSpider serves as a central spot to access information that was previously distributed across journals, commercial databases such as CAS SciFinder, and hundreds of smaller databases.

The work paid off, and ChemSpider was acquired by the RSC in May 2009. With the RSC’s support, ChemSpider has continued to grow. “Ours is a community website that [is] free to access, anybody can deposit data there and we ask the community to participate in curating the data,” Williams explains, pointing out that ChemSpider is different from commercial databases that host and maintain their own data. ChemSpider links out to data and relies on the chemistry community to constantly expand the offerings.

ChemSpider has “provided a platform for searching existing data, the deposition of new data, and curation of existing data,” said nominating software company Collaborative Drug Discovery.

ChemSpider is committed to quality data. “[We] focus on the quality of data that we have in the database, to remove erroneous data, to deal with the errors that proliferate across the Internet—there’s a lot of junk floating around the Internet in these public compound databases and people are not addressing [the errors] and not curating [the data]. We’ve got a very stringent focus of improving the quality of what’s online through our system. That’s the only way we can do it.”

Under the RSC umbrella, ChemSpider’s mandate has grown. “We have a new project called ChemSpider Synthetic Pages that went live just after Christmas,” says Williams. “But we also work on internal projects to the RSC, supporting cheminformatics inside the organization... They are certainly steering those efforts, because we’re focused on them for the organization: semantic marker projects, integration of chemistry into their databases etc.”

The ChemSpider Synthetic Pages project is currently Williams’ biggest focus. “[ChemSpider is] moving from just a structure-centric database to building this community resource for synthesis procedures—chemical reactions if you like—CSSP.chemspider.com. We’ve got about 300 synthesis procedures on there right now, they’re very high quality and peer-reviewed,” he says. An editorial board reviews the procedures before they are released to ChemSpider users, who also act as a peer review board.

Submitting syntheses is part of a trend Williams calls “micro publishing.” Each submission is issued a digital object identifier and can be listed on resumes and CVs. Williams says that this type of contribution is essential to growing the field, and that ChemSpider’s platform enables it. “It’s a true semantic web platform. It exists because of the community’s contribution.”

Williams says that winning Bio-IT World’s first Community Service Award was a validation of much hard work. “To be recognized for contributing to the community, that was always our intent, to give back. This was to help the community,” he says. “It was very humbling. It was quite emotional actually. We worked very [hard] to be able to do it and to be recognized was wonderful.”

This article also appeared in the July-August 2010 issue of Bio-IT World Magazine.