4 PARTNERS
TIB is the largest scientific library in the world
Architecture, Chemistry, Computer Science, Mathematics, Physics, Engineering and Technology
Financed by the Federal Government and all Federal States
€8 million annual acquisition budget
18,500 journal subscriptions
7.0 million items
Global supplier of scientific and technical information of all types: text, numeric data, audio, video, etc.

Global consortium carried by local institutions
Focused on improving the scholarly infrastructure around datasets and other non-textual information
Focused on working with data centers and organizations that hold data
Providing standards, workflows and best practice
Initially, but not exclusively, based on the DOI system
Founded December 1st 2009 in London

5 PARTNERS
Thieme Chemistry
Part of the Thieme publishing group, based in Stuttgart (Germany)
Has published highly evaluated information about synthetic and general chemistry for professional chemists and advanced students since 1909.

6 PARTNERS
This is one of the (far too few) intense co-operations between libraries and publishers.
Our journals have always been at the forefront of innovation, and we are proud that once again we can lead the way.

8 BACKGROUND
Gap in the scientific record between published research and the underlying data
  Published work held by publishers and libraries
  Datasets held by data centers
No effective way to link between datasets and articles
No widely used method to identify datasets
No widely used method to cite datasets
As a result, datasets are
  Difficult to discover
  Difficult to access

12 PROCESS
What is needed:
Servers/Data Centers
  Creation of new and strengthening of existing data centers.
  Responsible for:
    Quality assurance
    Storage of the content and accessibility
    Creation of metadata
Metadata
  Global access to data sets and their metadata through existing catalogues.
  TIB stores the metadata and keeps it searchable.
DOI
  Use of persistent identifiers, also for data (DOI = Digital Object Identifier)
  TIB registers research data worldwide from a scientific, technical or medical background

Digital Object Identifier (DOI)
The Digital Object Identifier (DOI®) System is for identifying content objects in the digital environment.
Information about a digital object may change over time, including where to find it, but its DOI name will not change.
The DOI System provides a framework for persistent identification.
The system is managed by the International DOI Foundation.
Over 40 million DOI names have been assigned by DOI System Registration Agencies in the US, Australia, and Europe.
You might have come across this when citing advance online articles.
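Because the DOI name stays fixed while the location of the object may change, a reader only ever needs the name plus a resolver. The following is a minimal sketch of turning a DOI name into a resolvable link. It assumes the public resolver at https://doi.org/; the function name `doi_to_url` and the DOI `10.1234/example.data-01` are made up for illustration and are not taken from the project.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver; the DOI names used below are hypothetical.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI name into a resolver URL (hypothetical helper)."""
    doi = doi.strip()
    # Strip prefixes that people commonly paste together with the name.
    for prefix in ("doi:", "https://doi.org/", "http://dx.doi.org/"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI name starts with "10." and has a prefix/suffix separator.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI name: {doi!r}")
    # Percent-encode anything that is not safe in a URL path, but keep "/".
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_url("doi:10.1234/example.data-01"))
```

The resolver, not the citing article, is what needs to know where the data currently live, which is why the link in a reference list can stay unchanged.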

13 PROCESS
At the same time as the article, the author submits the research data to Thieme.
Thieme hosts the research data in a data center (FIZ Karlsruhe).
TIB assigns a DOI to the data.
When the article is published, the primary data are published as an independent entity, but in connection with the article.
The article cites the research data as a reference item with the assigned DOI.
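The steps on this slide can be sketched as a small data model. This is only an illustration of the idea that article and data are separate records, each with its own DOI, linked through a reference. All class names, function names, titles and DOIs below are hypothetical; nothing here describes the actual systems at Thieme, FIZ Karlsruhe or TIB.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    title: str
    data_center: str
    doi: str = ""  # empty until a DOI has been assigned


@dataclass
class Article:
    title: str
    doi: str
    references: List[str] = field(default_factory=list)


def assign_doi(dataset: Dataset, doi: str) -> None:
    """Give the dataset its own DOI, exactly once."""
    if dataset.doi:
        raise ValueError("dataset already has a DOI")
    dataset.doi = doi


def publish_together(article: Article, dataset: Dataset) -> None:
    """Publish both records; the article cites the data by its DOI."""
    if not dataset.doi:
        raise ValueError("assign a DOI to the data before publishing")
    article.references.append(
        f"Primary data for this article: {dataset.title}. doi:{dataset.doi}"
    )


# Hypothetical example records (made-up titles and DOIs).
article = Article(title="Example synthesis paper", doi="10.1234/example.article")
data = Dataset(title="Spectra for compounds 1-3", data_center="FIZ Karlsruhe")
assign_doi(data, "10.1234/example.data")
publish_together(article, data)
print(article.references[0])
```

Keeping the dataset as its own record is what makes it citable independently of the paper.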

15 RESULTS
An abstract with primary data as supplementary information.
Primary data have their own DOI, different from that of the paper; thus, primary data can be cited independently.
Clicking the link (or entering the DOI in a web browser) downloads a zip file.

16 RESULTS
Primary data come neatly organized in a zip file.
Numbering of the folders corresponds to numbering of the compounds in the corresponding article.
The folder also contains a Read Me file.

17 RESULTS
The Read Me PDF in the zip file describes the content and which programs can be used to view it.
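The archive layout described on the results slides (one numbered folder per compound plus a Read Me file) is easy to inspect programmatically. The sketch below uses Python's standard `zipfile` module on a small in-memory demo archive. The file names, the `.jdx` extension, and the placement of the Read Me at the archive root are assumptions made for illustration; a real download may be organized differently.

```python
import io
import zipfile
from collections import defaultdict


def group_by_compound(zip_bytes: bytes) -> dict:
    """Map each top-level folder name to the sorted files inside it."""
    groups = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            folder, _, rest = name.partition("/")
            if rest:
                groups[folder].append(rest)
            else:
                groups[""].append(folder)  # file at the archive root
    return {key: sorted(files) for key, files in groups.items()}


def make_demo_zip() -> bytes:
    """Build a tiny stand-in archive (hypothetical names, dummy content)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("ReadMe.pdf", b"placeholder")
        zf.writestr("1/1H_NMR.jdx", b"placeholder")
        zf.writestr("1/13C_NMR.jdx", b"placeholder")
        zf.writestr("2/1H_NMR.jdx", b"placeholder")
    return buf.getvalue()


layout = group_by_compound(make_demo_zip())
for folder, files in sorted(layout.items()):
    print(folder or "(root)", files)
```

With folders numbered like the compounds in the article, the key `"1"` in the result corresponds to compound 1, and so on.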

20 SUMMARY
Benefits
  Citability of research data
  High visibility of the data
  Easy re-use and verification of the data sets
  Avoiding duplication
  Motivation for new research

21 SUMMARY
Benefits for authors
  More work
  Proof of quality
  Documentation of validity
  More exposure for results
At first, it looks like just another burden.
Not only do the original data show how clean the products really were, they also add great value and trust to the methods used.
In addition, the work gets a good deal more exposure, as it will show up not only when someone looks for an article, but also for a substance, a method, a spectrum, etc.
Imagine what that can mean for the rating of an author (h-factor), if the methodology is good, of course.

22 SUMMARY
Benefits for users
  Quick evaluation of papers
  Find structures by spectra
  Find similar patterns
  Understand individual peaks
But fast forward a few years with me:
  Most articles come with primary data
  This data itself is fully searchable
  Spectra are linked via InChIs to the structures
  Users will be able to search for patterns, or even single peaks!
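To make the "search for single peaks" idea concrete, here is a toy sketch of what such a lookup could look like once spectra are stored with their InChI. It is not a description of any existing service. The record layout, the function `find_by_peak`, the DOIs and the peak positions are all illustrative assumptions (the peak values are not measured data); only the two InChI strings are the standard identifiers for ethanol and methanol.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Spectrum:
    inchi: str              # structure the spectrum is linked to
    dataset_doi: str        # hypothetical DOI of the primary data
    peaks_ppm: List[float]  # illustrative peak positions, not measured data


def find_by_peak(spectra: List[Spectrum], shift_ppm: float,
                 tolerance: float = 0.05) -> List[Spectrum]:
    """Return all spectra with at least one peak within the tolerance."""
    return [
        s for s in spectra
        if any(abs(p - shift_ppm) <= tolerance for p in s.peaks_ppm)
    ]


library = [
    Spectrum("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",  # ethanol
             "10.1234/example.data-01", [1.25, 3.72]),
    Spectrum("InChI=1S/CH4O/c1-2/h2H,1H3",          # methanol
             "10.1234/example.data-02", [3.49]),
]

hits = find_by_peak(library, 3.71)
for hit in hits:
    print(hit.inchi, hit.dataset_doi)
```

Because every hit carries both an InChI and a DOI, a peak search leads directly to the structure and to the citable primary data.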

23 SUMMARY
Open questions
  No specific regulations so far
  Copyright
  Centralized data hosting
  Data compatibility
To realize all this, there are some hurdles to be cleared:
Centralized data hosting with clearly defined requirements is needed.
  See PANGAEA (Publishing Network for Geoscientific & Environmental Data), hosted by the Alfred Wegener Institute for Polar and Marine Research (Bremerhaven) and the Center for Marine Environmental Sciences (Bremen).
  Supported by:
    The European Commission, Research
    Federal Ministry of Education and Research (BMBF)
    Deutsche Forschungsgemeinschaft (DFG)
    International Ocean Drilling Program (IODP)
  The information system PANGAEA is operated as an Open Access library aimed at archiving, publishing and distributing georeferenced data from earth system research. The system guarantees long-term availability of its content through a commitment of the operating institutions.
No regulations regarding format, copyright, use of data, definition of data as primary, etc.
  The project presented here is a start-up prototype and currently nothing more, which is also the reason why the issues I just mentioned have not been fully addressed and solved yet.
Data quality
  This does not mean the scientific quality of the data but their technical characteristics and compatibility with different "hardware" (e.g. there are currently two main suppliers of NMR spectrometers, and their data are cross-readable in one direction only).

24 SUMMARY
So, the only thing I have left to say: go, share your primary chemical data with your fellow researchers, now. Details on how to do it can be found in our instructions for authors.
If you also publish in journals other than SYNLETT and SYNTHESIS, please talk to your editor about primary data; they might already be working on it.