Open Access Scientific and Technical Information: Research and Monitoring

Open access scientific and technical resources represent a significant complementary collection: about 20% of scientific articles are available for free. Access to these resources is an important issue, but so is knowing how to make the best use of them. Between the numerous open archives (institutional, thematic and central) and open access scientific journals, it is not always easy for researchers to find their way around. The aim of this article is to present a few “tools” that make it easier to search for and monitor open access scientific resources.

This article is a translation of “L’information scientifique et technique en libre accès : recherche et veille”, available at: http://blog.mysciencework.com/2011/09/05/linformation-scientifique-et-technique-en-libre-acces-recherche-et-veille.html
It was written by Hans Dillaerts and translated from French into English by Mayte Perea López.

Scientific and technical information (STI) can be defined as “all the information produced by research that is essential not only to scientific activity but also to industry”[1]. Author-researchers are at the heart of the knowledge production process: they initiate the research carried out and they are responsible for the dissemination of their research results. This dissemination occurs through scientific publications. Scientific journals are the main vectors to spread new knowledge once it has been approved by the Editorial Review Board[2]. As the GFII Working Group on Open Access recalled [3]: “Scientific and technological information should be spread as widely, quickly and efficiently as possible, while retaining the highest level of quality, because this dissemination fully participates in the functioning of research”.

I. Why Are Open Access Scientific Resources Interesting?

Having immediate access to the latest research results is a major issue. Through libraries’ subscriptions, researchers are able to view and download the latest publications from paid editorial platforms such as ScienceDirect and SpringerLink for STM fields (scientific, technical and medical), or CAIRN and SAGE journals for humanities and social sciences (HSS).

Even though the Salençon Report [4] highlights the problems raised by the agreements on e-journal bundles, it adds that: “The agreements […] probably had some beneficial effects, among them, it should be noted, a rationalization of resources and the increase in the number of journals available online in libraries”. However, the fact that researchers are given access to quite a large number of journals through libraries does not mean that they have access to all the documentary resources they need for their research. This is the reason why open access scientific resources are a real advantage for research.

Open access can be defined as “the provision of online scientific and technical documents and data that everyone can freely view, download, copy, spread, print, index”[5]. It should be noted that approximately 20% of articles are freely accessible online: about 8% are published in open access electronic journals and another 12% are placed in open archives[6]. Open access electronic journals[7] have experienced strong growth over the past ten years: there were only 740 such journals in 2000[8], whereas the Directory of Open Access Journals (DOAJ) today provides access to more than 6,300. The OAI Baselab[10] search engine processes more than 31,416,000 scientific documents, freely accessible in full text format.

These open access scientific resources therefore constitute a significant complementary collection. Making them available is not enough, however: it is essential to know how to find them in order to be able to use them.

II. Some Practical Advice to Properly Prepare Your Scientific Research and Literature Monitoring

II.1 Monitoring: Definition

“Getting information is above all a fundamental need of human beings. In an often hostile natural environment, humans had to acquire information to ensure their survival”[11].

Setting up a monitoring system involves controlling our environment. Monitoring open access scientific resources therefore means tracking the latest scientific articles, and scientific literature in general, on a particular subject, concept, object or field. Access to these scientific resources is immediate and free.

II.2 Implementing Your Alert System

II.2.1 Preparing for Information Retrieval: Steps

1) Identify the main concepts and the concepts related to the research subject.
2) Make a list of keywords related to the research subject. Do not forget to take into account possible synonyms.
3) “Specify the areas of knowledge that are related to the subject”[12].
4) Take into account the names of the authors who worked on the subject or a related subject.

All the terms (and names of authors) selected during this preparatory step will be used in the different types of search tools described later in this article. The more consistent the selected terms are, the more consistent the search results will be across the various types of search tools.
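The preparatory steps above can be sketched as a small helper that turns the selected concepts, synonyms and author names into one reusable query string. This is only an illustration: the function name `build_query` is hypothetical, and the `AND`/`OR` operators and the `author:"..."` filter are assumptions about query syntax, which differs from one search tool to another.

```python
def build_query(concept_groups, authors=()):
    """Join synonyms with OR inside a group, then join the groups with AND."""
    parts = []
    for synonyms in concept_groups:
        # Quote multi-word terms so they are searched as exact phrases.
        terms = [f'"{term}"' if " " in term else term for term in synonyms]
        parts.append("(" + " OR ".join(terms) + ")")
    if authors:
        # Hypothetical author filter; the exact syntax varies by search tool.
        parts.append("(" + " OR ".join(f'author:"{a}"' for a in authors) + ")")
    return " AND ".join(parts)


concepts = [
    ["open access", "open archive"],  # main concept and a synonym
    ["monitoring", "alert"],          # related concept and a synonym
]
print(build_query(concepts))
print(build_query(concepts, authors=["Dillaerts"]))
```

Keeping the terms in one place like this makes it easier to reuse exactly the same wording on every search tool, which is the consistency the preparatory step aims for.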

II.2.2 Preparing for the Monitoring

1) Sources to take into account are open archives and open access scientific journals.
2) Transmission means for alerts: open archives and open access electronic journals are equipped with RSS feeds and/or email alerts [13].
3) Two online aggregators for RSS feeds: Google Reader (http://www.google.fr/reader/) and Netvibes [14] (http://www.netvibes.com/fr).
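To give an idea of what an aggregator does with such a feed, here is a minimal sketch that reads an RSS 2.0 document and keeps only the items whose title contains one of the monitored keywords. The feed content, the example.org links and the function name `matching_items` are invented for the illustration; a real feed would first be downloaded from the archive or journal, for instance with `urllib.request.urlopen`.

```python
import xml.etree.ElementTree as ET

# Invented sample feed standing in for an archive's "latest deposits" feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example archive: latest deposits</title>
    <item>
      <title>Open access and scientific monitoring</title>
      <link>http://example.org/doc/1</link>
    </item>
    <item>
      <title>A study of protein folding</title>
      <link>http://example.org/doc/2</link>
    </item>
  </channel>
</rss>"""


def matching_items(feed_xml, keywords):
    """Return (title, link) pairs for items whose title contains a keyword."""
    root = ET.fromstring(feed_xml)
    wanted = [k.lower() for k in keywords]
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Case-insensitive substring match on the item title only.
        if any(k in title.lower() for k in wanted):
            hits.append((title, link))
    return hits


for title, link in matching_items(SAMPLE_FEED, ["open access"]):
    print(title, "->", link)
```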

III. The Importance of General Public Search Engines and Google Scholar for Finding Open Access Scientific Resources

Search engines for the general public (such as Google and Bing) and Google Scholar must not be neglected when searching for open access scientific resources. A study carried out in 2008 by Michael Norris, Charles Oppenheim and Fytton Rowland [15] shows that Google and Google Scholar are efficient tools for finding open access scientific literature. These two search engines have the advantage, as opposed to other more specialized search engines, of indexing the articles published on authors’ personal websites and research laboratories’ websites. Email alerts can be created on Google Scholar: Google Scholar then sends a notification each time it finds a new reference matching the search terms used. It is therefore important for users to target their search carefully (and consequently to choose consistent and targeted keywords) in order to avoid receiving too many notifications with useless references or, on the contrary, too few of them.

IV. Finding the Potential of Open Archives and Making the Most of It

It should be remembered that “the term open archive refers to a repository in which data coming from scientific research and teaching is placed and the access to which is meant to be open, that is to say without barriers” [16].

IV.1 What Types of Documents?

Pre-print or pre-publication: this is the version of the article submitted by the author to the scientific journal, before evaluation by the Editorial Review Board. It has not yet been evaluated by peers, so it has not yet been published by a scientific journal.

Post-print or post-publication: this is the version of the article that has been approved by the Editorial Review Board of a scientific journal. A distinction should be made between:
1) The author version of the article: this is the author’s final version, including any corrections required by the Editorial Review Board. The publisher uses this version to edit and publish the article.
2) The publisher version of the article: this is the published version of the article (in other words, the publisher’s PDF file).

Grey literature: this is “what is produced by all the government, education, public research, trade and industrial institutions, in a paper or digital format, and that is not controlled by commercial publishers”. In open archives, it is therefore possible to find, and to cite, literature in any form: conference proceedings, theses, posters, presentations, reports, etc.

It must be noted, though, that a great majority of the articles placed in an open archive are copies of published articles: in that case, the reference to cite is the exact journal reference, not the copy placed in the open archive. The open archive acts primarily as an intermediary: its aim is to provide access to articles that a researcher might otherwise never have been able to view.

IV.2 Open Archives: Some Examples

1) HAL

This is a French multi-disciplinary open access archive (STM and HSS). “Through the years it has become a national reference in terms of self-archiving”[17]. More than 204,400 publications have been deposited in HAL. This archive provides an alert system through RSS feeds. RSS feeds offer the possibility to subscribe to:
- the latest publications deposited in the archive
- the latest publications deposited by an author
- the latest publications deposited by a research laboratory
- the latest publications deposited in a field of science

2) TEL

This is a French multi-disciplinary open access archive (STM and HSS) for depositing theses and “Habilitations à diriger des recherches” (accreditation to conduct research). More than 30,600 publications have been deposited in TEL. This archive provides an alert system through RSS feeds. RSS feeds offer the possibility to subscribe to:
- the latest publications deposited in the archive
- the latest publications deposited by an author
- the latest publications deposited by a research laboratory
- the latest publications deposited in a field of science

3) E-LIS

This is an open archive dedicated to information science. More than 13,900 publications have been deposited in E-LIS. This open archive provides an alert system through RSS feeds. These RSS feeds offer the possibility to subscribe to:
- the latest publications deposited in the archive
- search results
- the latest publications deposited on a particular theme

4) ArXiv

This was the first open archive, created 20 years ago. Launched in 1991 by Paul Ginsparg, it is “an archive for electronic preprints of scientific papers in the fields of mathematics, physics, astronomy, computer science, quantitative biology, statistics, and quantitative finance”[18]. More than 791,380 publications have been deposited in ArXiv. This open archive provides an alert system through RSS feeds. These RSS feeds offer the possibility to subscribe to the latest publications made available in a field or subfield of knowledge.

5) PubMed Central

This is an open archive dedicated to biomedical science. The articles deposited in this archive are post-publications. 2.5 million articles have been deposited in PubMed Central. This open archive provides an email alert system [19]. You have to register on the website to be able to create email alerts and save your searches.

6) RePEc

This is an open archive dedicated to economics. It is comprehensive because it collects references from other open archives. 74 countries participate in the project. This turns it into a central point of entry to carry out research in the field of economics. More than 1,133,415 publications can be downloaded in full text.

7) EconBiz: http://www.econbiz.de/en/search/search/search-all/

EconBiz is a search engine including important German and international databases for economics and business studies (about 1 million free online full texts). This search engine is also a central point of entry to carry out research in the field of economics. [Source: http://www.econbiz.de/en/about-econbiz/about/]

8) Finding Other Open Archives

OpenDOAR[20], the directory of open archives, facilitates the search for other open archives. A very useful functionality of OpenDOAR is the open archives search tool[21]: it enables you, among other things, to search for open archives in a particular discipline.

IV.3 Multi-Source Search Engines: Some Examples

As these search engines rely on multiple resources (in this case open archives), they are particularly interesting in terms of information retrieval. Some of these multi-source search engines for open access resources rely on the OAI-PMH protocol: these are known as OAI harvesters. The OAI-PMH protocol is “a protocol that determines the transfer conditions of metadata from an open archive, created by a data provider, to the server of a service provider”[22]. An OAI harvester is thus a service that collects the metadata from open archives (or other repositories) through the OAI-PMH protocol. RePEc, mentioned previously, is one example.
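As a rough illustration of what a harvester does, the sketch below builds an OAI-PMH `ListRecords` request and extracts identifiers and titles from a response. The `verb`, `metadataPrefix`, `from` and `set` parameters and the `oai_dc` (Dublin Core) format belong to the protocol; the base URL, the sample response and the function names are invented for the example, and a real harvester would also have to handle resumption tokens and errors.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, from_date=None, set_spec=None):
    """Build a ListRecords request asking for Dublin Core metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date:
        params["from"] = from_date  # only records changed since this date
    if set_spec:
        params["set"] = set_spec    # restrict harvesting to one set
    return base_url + "?" + urlencode(params)


# Invented response, shaped like a minimal ListRecords answer.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample deposited article</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_titles(response_xml):
    """Return (identifier, title) pairs for every record in the response."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


print(list_records_url("http://example.org/oai", from_date="2011-09-01"))
print(extract_titles(SAMPLE_RESPONSE))
```

Only metadata travels through the protocol in this sketch, which is why some harvesters index references without necessarily indexing the full text.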

The OpenDOAR search engine facilitates searches within the full text of some documents deposited in the open archives listed in the OpenDOAR directory. It does not use the OAI-PMH protocol but Googlebot, Google’s indexing robot. Searches cannot be saved and OpenDOAR does not provide an alert system.

Scientific Commons is a search engine that uses the OAI-PMH protocol. It indexes both metadata and full texts. Scientific Commons displays more than 38,354,160 full-text documents and relies on 1,269 open archives. The website is quite slow and search results can sometimes take a long time to appear. Searches cannot be saved and Scientific Commons does not provide an alert system.

BASELAB (Bielefeld Academic Search Engine) is an OAI-PMH service provider. As such, Baselab collects, normalizes and indexes the data of open archives and other platforms [23] in accordance with the Open Archives Initiative Protocol for Metadata Harvesting. Baselab is therefore a central point of entry to search for open access articles. Baselab displays more than 37,442,830 documents. It keeps a record of searches and provides an alert system through RSS feeds: search results can be exported in the form of an RSS feed.

“ISIDORE is a French search platform that gives access to digital data in the field of humanities and social science (HSS). Open to all, and especially to teachers, researchers and PhD and other students, it is based on the principles of linked data and provides open access to data”[24]. This search engine displays more than 1,867,727 documents from 1,814 sources. It provides an alert system through RSS feeds: search results can be exported in the form of an RSS feed.

V. Open Access Scientific Journals

Remember that open access electronic journals are journals whose publications can be accessed online free of charge.

V.1 Open Access Publishing Platforms: Some Examples

OpenEdition is a French platform that provides access to scientific journals (≥ 374), book collections (≥ 18) and scholarly blogs (≥ 543) in the field of humanities and social sciences. This publishing platform provides an alert system through RSS feeds and emails:
- RSS feeds offer the option of subscribing to the latest articles published in a particular journal and exporting search results in the form of an RSS feed.
- Once an account is created on the website, it is possible to set up email alerts. As it says on the website, “the alert system makes it possible to save in a personal account a selection of searches made on the search engine of the OpenEdition portal. Once searches are saved in your account, you will receive by email the news of the portal related to these searches: articles and journal issues, scientific events or research notebook publications”.

BioMed Central publishes the content of more than 240 scientific journals in the field of science, technology and medicine. Journals are classified according to the scientific field they cover, so it is easy to select directly the scientific journals related to your research subject. This publishing platform provides an alert system through RSS feeds and emails. It is possible to subscribe to the latest articles published in a particular journal thanks to RSS feeds. Users can save their searches and create email alerts if they create an account (which is free).

INTECH is a publisher that issues open access scientific books in the field of science, technology and medicine. More than 1,960 books are available on the website and can be downloaded for free. The INTECH catalog does not provide a real alert system and searches cannot be saved.

OAPEN (Open Access Publishing Network) is a catalog of open access scientific books in the field of humanities and social sciences. Electronic books are classified by discipline and can be downloaded for free. The OAPEN catalog does not provide an alert system and searches cannot be saved.

Other publishers can be found on the webpage “Publishers of OA books” of the wiki Open Access Directory: http://oad.simmons.edu/oadwiki/Publishers_of_OA_books

V.2 Multi-Source Search Engines

1) DOAJ

The DOAJ is a directory that gathers open access electronic journals: more than 6,880 scientific journals are available. DOAJ offers two options to its users:
- Manually searching for the journals that are relevant to the research subject. Journals are classified according to their area of knowledge. Several indicators can help you decide whether a journal is relevant to a research subject:
  - The title of the journal is often a good indicator in itself.
  - The language of the journal.
  - The keywords and subjects associated with the journal.
- Searching in the articles themselves, from more than 4,090 scientific journals.
Searches cannot be saved and no alert system is provided.
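The three relevance indicators listed for DOAJ (title, language, keywords and subjects) can be applied as a first automatic filter before journals are checked by hand. The sketch below assumes the journal list has already been copied into simple records; the field names, the sample journals and the function `is_candidate` are all invented for the illustration and do not reflect any DOAJ export format.

```python
def is_candidate(journal, keywords, languages):
    """Return True when a journal record passes the title/subject and language checks."""
    # Search the keywords in the title and in the associated subjects together.
    text = (journal["title"] + " " + " ".join(journal["subjects"])).lower()
    has_keyword = any(k.lower() in text for k in keywords)
    has_language = journal["language"] in languages
    return has_keyword and has_language


# Invented journal records, for illustration only.
journals = [
    {"title": "Journal of Information Science Studies", "language": "en",
     "subjects": ["library science"]},
    {"title": "Revista de Biologia", "language": "pt",
     "subjects": ["biology"]},
]

shortlist = [j["title"] for j in journals
             if is_candidate(j, ["information science"], {"en", "fr"})]
print(shortlist)
```

Such a filter only narrows the list: each remaining journal still has to be checked on its own website before subscribing to its alerts.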

2) Open J-Gate

This portal gathers more than 9,350 open access journals. Open J-Gate offers two options to its users:
- Searching journals manually by title, publisher or scientific field. Note that this search option seems to be experiencing problems: for instance, journals cannot be searched by scientific field.
- Searching in the articles from the journals listed in Open J-Gate.
Searches cannot be saved and no alert system is provided.

3) Jurn

This search engine is dedicated to open access electronic journals in the field of humanities and social sciences. Jurn gathers more than 4,400 scientific journals and it is possible to search in the articles from listed journals. Searches cannot be saved and no alert system is provided.

4) FreeFullPDF

“FreeFullPDF.com is a search engine developed by the French company KnowMade, specialized in the search for pdf format scientific articles. It currently provides more than 80 million open access articles”[27]. The knowledge areas concerned are medicine, biology, chemistry, physics, materials science and economics. Searches cannot be saved and no alert system is provided.

5) DOAB: http://www.doabooks.org/

DOAB (Directory of Open Access Books) was launched by the OAPEN (Open Access Publishing Network) Foundation. This service gathers more than 1,200 academic peer-reviewed books from more than 30 publishers. “The primary aim of DOAB is to increase discoverability of Open Access books […] The directory will be open to all publishers who publish academic, peer reviewed books in Open Access and should contain as many books as possible, provided that these publications are in Open Access and meet academic standards.” [Source: http://www.doabooks.org/doab?func=about&uiLanguage=en]

VI. Conclusion

Setting up an efficient scientific monitoring system is an arduous exercise. It must be highlighted that the preparatory steps for information retrieval and the implementation of scientific monitoring are essential. The keywords and concepts chosen will have a direct impact, first on search results, then on the efficiency of scientific monitoring, as the search results are exported in the form of RSS feeds.

On the basis of the above analysis of search and monitoring tools, it can be concluded that open archives and open access scientific journals are used in different ways. When they use open archives, users look primarily for the categories (or areas of knowledge) they are interested in for their research subject. Then they subscribe to RSS feeds to receive the latest publications deposited in the categories concerned. Users will also be able to export the results of their search in the form of RSS feeds if the open archive or the OAI harvester offers this service. It is relatively easy and fast to implement scientific monitoring of open archives.

In the case of scientific journals, implementing scientific monitoring is more complicated. On multi-source search engines for scientific journals, search results cannot be exported in the form of RSS feeds. Revues.org and BioMed Central offer interesting functionalities to implement scientific monitoring, but this is not the case for DOAJ. It is more difficult to select relevant journals on DOAJ. Even though a journal can look interesting at first sight, thanks to the keywords and concepts associated with it, it is crucial to visit its official website to check whether this actually is the case[28] before subscribing to the alert system of the journal, if there is one. When a field or subfield covers more than 100 journals, this exercise can be very time-consuming.

In any event, scientific monitoring that focuses on open access scientific resources can prove crucial to a literature search. The aim of this type of monitoring is not to replace the paid publishing platforms to which a researcher can have access through research libraries’ subscriptions. These scientific resources are complementary to “traditional” scientific resources and contribute to the improvement of the process of collecting scientific and technical information.