11/04/14 Sharing and archiving of publicly funded research data. Report to the Research Council of Norway

2 2 SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA DAMVAD.COM

For information on obtaining additional copies, permission to reprint or translate this work, and all other correspondence, please contact: DAMVAD, damvad.com. Copyright 2014

Contents

1 Executive Summary
   Mandate
   Main findings
   Recommendations
2 Sammendrag (Summary, in Norwegian)
   Mandat (Mandate)
   Sentrale funn (Main findings)
   Anbefalinger (Recommendations)
3 Background
   Mandate
   Context
   Data is vital to research
   Growing consensus on the importance of sharing publicly funded data
   Structure of the report
4 Former studies used to develop the hypotheses
   The consensus on the importance of access to data
   Lack of recognition, time and proper infrastructure
   Variations across disciplines and ages
   Input from researchers and data managers in Norway
   Hypotheses
5 Methodology
   Conceptual clarifications
   Scope
   Financing
   Research data
   Archiving
   Open access to research data
   Selecting the population
   Survey process
   Response rate
   A significant proportion of the researchers actively chose not to participate
6 Descriptive statistics
7 Researchers use data generated by other researchers
   Data formats vary across research disciplines
   Numerical data are easier to restore
   Researchers frequently use other researchers' data
   Researchers mainly use data produced by other researchers from the same institution
   7.5 Researchers would like even better access to other researchers' data
8 Research data is rarely archived in data centres
   Most data is archived on portable storage units or institutional servers
   Storage reflects costs of recreation
   Most researchers are satisfied with their current archiving solution
   Those who are not satisfied point to security risks
   Archiving activities are financed as a part of project- and institutional funding
9 Most researchers share research data
   Researchers are positive to the principle of open access
   Many researchers are left undecided
   Health trusts are positive towards the effects of sharing data on research
   Researchers share their research data, but upon request
   More openness within humanities
   More openness among more experienced researchers
   Lack of time, infrastructure and incentives hamper further sharing of data
   Variety of barriers
   Relatively small differences across sector
   Textual records are more sensible
   Researchers see little support from management
   Limited institutional support
   Researchers call for better infrastructure, citation systems and guidelines
   Researchers working internationally find time to be a bigger challenge
   Researchers welcome data sharing as a part of publishing
10 Main findings and recommendations
   Main findings
   Recommendations
References
Appendix

1 Executive Summary

1.1 Mandate

Data is an important asset in the knowledge society and is vital to research. Open access to research data allows data to be used for different purposes, including purposes other than those originally intended. Sharing and archiving of data allows for further research, re-analysis, validation and research cooperation on complex matters. Consequently, open access to research data can enable both new research and innovation and the dissemination of knowledge.

The debate about open access to research data is by no means new. It has intensified in recent years due to a growing amount of data and the growing possibilities offered by information technology, along with growing recognition of the value of data. The Organisation for Economic Co-operation and Development (OECD) has developed guidelines on the sharing of publicly funded research data. Publicly funded research data could be considered a public good, and as such should be available to the greatest extent possible, not reserved for the individual researcher or institution. Nonetheless, the sharing and archiving of research data faces technical, financial, legal and cultural obstacles and questions that remain unanswered.

The objective of this study is to gain a better understanding of the current practice of researchers in Norway regarding sharing and archiving, as well as of the barriers to the sharing and archiving of research data. The study also proposes possible approaches to overcome these barriers. The study will serve as a contribution to the Research Council of Norway's work on developing a strategy and guidelines for sharing and archiving of publicly funded research data in Norway.

1.2 Main findings

Overall, the findings in this report support findings in other international surveys. A total of 1,474 researchers completed the survey. This constitutes 21.8 percent of the selected survey population. Another 604 researchers actively indicated that they did not want to participate in the survey. In total, that is a response rate of 30.6 percent. An analysis of the respondents indicates that they are highly representative across institution types and subject matters.

Norwegian researchers frequently use and share research data with each other. As many as 64 percent of researchers had used research data from other researchers in the last three years. The researchers mostly used research data generated by other researchers from the same institution, though this is closely followed by data from researchers at other institutions outside of Norway and other researchers nationally.

The remaining 36 percent of researchers report that they have not used data gathered by other researchers. Of these, 71.5 percent report that they would have liked to make use of other researchers' data. The numbers indicate untapped potential for increased and improved sharing of data. Only 10 percent of the researchers had not used research data generated by other researchers over the past three years and did not wish to use data generated by others.
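The percentages quoted in these main findings can be cross-checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original study: the size of the survey population is not stated in this section, so it is backed out from the reported 21.8 percent and is only approximate, and all names in the code are hypothetical.

```python
# Figures quoted in the executive summary.
COMPLETED = 1474          # researchers who completed the survey
COMPLETED_SHARE = 0.218   # reported share of the selected survey population
DECLINED = 604            # researchers who actively declined to participate

# Assumption: the population size is not given in this section, so it is
# backed out from the two reported figures and is only approximate.
approx_population = COMPLETED / COMPLETED_SHARE


def combined_response_rate(completed: int, declined: int, population: float) -> float:
    """Share of the population that either completed or actively declined."""
    return (completed + declined) / population


def neither_used_nor_wanted(non_user_share: float, wanted_share: float) -> float:
    """Share of all researchers who did not use others' data and did not wish to."""
    return non_user_share * (1.0 - wanted_share)


rate = combined_response_rate(COMPLETED, DECLINED, approx_population)
neither = neither_used_nor_wanted(0.36, 0.715)

print(f"approximate population: {approx_population:.0f}")
print(f"combined response rate: {rate:.1%}")
print(f"neither used nor wanted: {neither:.1%}")
```

Under these assumptions the combined rate comes out at roughly 30.7 percent, close to the reported 30.6 percent; the small gap is consistent with rounding in the 21.8 percent figure. The share of researchers who neither used nor wished to use others' data comes out at about 10 percent, in line with the text.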

The survey confirms that researchers in Norway see the benefits of the sharing and archiving of research data. Around 80 percent of the responding researchers agreed that open access to research data enhances research, and that it is an ethical obligation of research to make research data available for validation. These are also the two reasons for open access agreed to by most researchers. Further, 77 percent agree that open access to research data facilitates the education of students and new researchers and 74 percent agree that open access stimulates research collaboration.

Although most researchers agree on the benefits of sharing data, many researchers are also undecided about whether publicly funded research data should be considered public property. The remaining 20 percent do not agree that open access to research data will enhance research: 15 percent are undecided and around 5 percent disagree. This high proportion of undecided researchers may reflect the complexity of the issue and the distance between good intentions and practical solutions that address storage, ownership and credit, replicability of use, and other obstacles.

When asked about the barriers to sharing even more of their data, researchers emphasized the following:
1. Preparing data for open access takes up valuable time.
2. I do not have an adequate technical infrastructure.
3. Open access to research data might reduce my options for scientific publications in the future.

These responses indicate, inter alia, that researchers need adequate and user-friendly infrastructure, guidelines and procedures, and certainty about intellectual property rights in order to embrace the idea of sharing data. Contrary to our hypothesis, we did not find any major differences across sectors, fields of research or years of professional experience.

The study further finds that 85 percent of the respondents archive their data on their own devices or else on an institutional server. The figures do not vary across sectors, disciplines or scientific experience.

The survey included an open answer option where respondents could write free text. Inputs in this section show that many researchers find the issue of open access challenging and complex.

Most researchers share their research data with other researchers. Yet research data is generally shared under certain conditions (e.g., only upon request, under a non-disclosure agreement, in an anonymized format). Researchers want to control who gets access to their data and how they use it. With each researcher setting the terms, there is a risk that she becomes a gatekeeper.

The survey responses suggest significant differences in the way in which research managers deal with the sharing and archiving of data. Consequently, researchers see a need for greater institutional support.

1.3 Recommendations

The study reveals multiple obstacles and, therefore, that there is no single solution as to how to increase the sharing and archiving of research data. Both this and former studies suggest that there is a need for

work directed at the level of researchers, data managers and research funders, as well as government/international levels.

Overall, researchers agree in principle as to the value of sharing data. However, increased sharing is hampered by uncertainty about how to go about it technically, and by the feeling that it takes valuable time away from research and will reduce academic credentials.

The flipside of these barriers is a set of possible solutions. These include:
- Better infrastructure.
- Implementing a system for citation.
- Implementing guidelines, training and standards for sharing data.

The Research Council of Norway can play a key role. Specific recommendations include raising awareness, finding ways to recognize data sharing, putting in place standards, rules and best practice, providing technical infrastructure, and making funding available for necessary infrastructure and training. Our recommendations are summarized in Figure 1.

First, we suggest that the Research Council of Norway actively work to raise awareness of the benefits and pitfalls of the archiving and sharing of research data. In particular, exemplifying potential opportunities and their value is important, inter alia, by using best practice cases. Focus should be placed on showing that sharing and archiving is also worthwhile for researchers. It is also important to communicate that the archiving of data does not necessarily imply full open access to research data for all, but should be seen more as a premise for the sharing of data.

Second, our study indicates that a lack of incentives for the crediting of data is a barrier. This could be addressed by clarifying and implementing a system for citation, but also by outlining the inherent responsibility and expectation on the part of researchers. For example, the Research Council of Norway can introduce a requirement for data management plans and support the implementation of systems for crediting, to raise awareness, experience and recognition among researchers. Ideally, such measures should be easy to use, similar to international systems, and work alongside the system for scientific publication.

Third, many researchers lack knowledge as to what data to share and archive and how to do so. This includes information about what form the data should be archived in and how proper information about the data should be assigned. There is a need for guidelines, standards and training on the sharing and archiving of research data. Defining what data to share and what is worth archiving (or not) could help clarify the debate. These guidelines and standards should be developed in close interaction with researchers, institutions and legal experts. Such work should be inspired by work initiated internationally to avoid creating a Norwegian bureaucracy alongside international standards.

Furthermore, selective investments in infrastructure and technical skills are necessary. Both interviews and studies suggest that the infrastructure for sharing and archiving data is fragmented, overlapping and insufficient. Our study also suggests that

many researchers archive most of their data on their own servers or portable computers. Better infrastructure could increase the motivation for archiving data at data archiving centres. This could provide a more secure means of archiving data, and the data could be more easily restored. Finally, archiving will lay the groundwork for the sharing of more research data. Infrastructure investments should involve all relevant stakeholders while also ensuring a robust infrastructure that will serve the needs of the future.

FIGURE 1 Problems, solutions and recommendations

3 Background

This chapter presents the background and context for the report.

3.1 Mandate

Data constitutes knowledge and is a valuable asset in the knowledge society. Sharing of research data allows for the use of data for purposes other than originally intended, linking of data across different data sets and validation of data. Accessible data also underpins democratic processes by making information available to a wider audience. Retrieving information and allowing new generations of researchers to stand on the shoulders of giants is the very essence of research (PARSE.Insight 2012).

The Norwegian Government, alongside international organizations such as the OECD and the EU, seeks to promote more sharing and archiving of research data. In its most recent White Paper [1] on research policy, the Ministry of Education noted that: Better access to research data helps facilitate research and to increase the quality of research. The government wishes to facilitate increased availability of publicly funded research data.

Better utilization of research data could thus strengthen the quality of Norwegian research and ensure a more efficient use of resources. Consequently, enhanced access to research data is a key measure to reach overall research policy objectives.

Yet archiving and sharing of research data brings to the fore a number of technical, financial, legal and sociological obstacles. While overall policy goals and benefits are agreed, many questions still stand in the way of effective and successful implementation of the principles of open access to research data.

The Norwegian Government has mandated the Research Council of Norway to explore and facilitate work on sharing and archiving of research data. The Research Council of Norway is the national strategic and funding agency for research activities in Norway. The goal of the Research Council is to strengthen the Norwegian research and innovation system and its infrastructure through the effective use of public resources. As previously noted, enhanced access to research data can be seen as a measure to help achieve these goals. The Research Council of Norway is also the principal source of expertise and advice on research policy for the Norwegian Government, the central government administration and the overall research community, including universities, research institutes and health trusts.

In the autumn of 2012, the Research Council of Norway initiated an internal project called "Principles for open access to publicly funded research data", led by the Department for Research Infrastructure. The main objective of the project was to provide a knowledge base for further work shaping the Council's policy in line with the OECD guidelines from 2007. [2]

[1] Meld. St. 18, Report to the Storting, "Long lines - knowledge provides opportunities"; freely translated by DAMVAD.
[2] OECD (2007) Principles and Guidelines for Access to Research Data from Public Funding.

A working group has been formed and a number of activities are being undertaken in close cooperation

with the research communities and data managers to explore how open access to research data can be strengthened.

It is in this context that the Research Council of Norway has commissioned DAMVAD to undertake a survey among researchers in Norway. The objective of the survey is to gain a better understanding of researchers' current practices and position regarding the archiving and sharing of research data. The topic clearly involves a broad range of stakeholders, including the government, research organizations, researchers, research institutes and civil society. This study exclusively investigates the viewpoint of researchers.

The two overall questions investigated in this study are:
1. How do researchers in Norway share and archive research data?
2. What are the obstacles to the increased sharing and archiving of research data?

Based on the results and analysis of these two questions, the study discusses measures to reduce or overcome the identified barriers. The study feeds into the Research Council of Norway's work on developing strategies and guidelines for sharing and archiving of research data in Norway.

3.2 Context

3.2.1 Data is vital to research

Research can be defined in many ways. In the OECD Frascati Manual [3], research is defined as "(...) creative work undertaken on a systematic basis in order to increase the stock of knowledge, including knowledge of man, culture and society, and the use of this stock of knowledge to devise new applications." Other definitions can be used, but regardless of the definition applied, data remains a vital part of research.

Data is vital to researchers in investigating events, features and correlations, in adjusting findings from previous research, solving new or existing problems, supporting theorems and developing new theories for the benefit of society.

The debate on open access to research data is not new. The concept and related policy goals were institutionalized by the establishment of the World Data Centre system, in preparation for the International Geophysical Year. [4] The International Council of Scientific Unions (now the International Council for Science) established several World Data Centres to minimize the risk of data loss and maximize data accessibility, further recommending in 1955 that all research data should be made available in machine-readable form. [5]

[3] OECD (2002) Frascati Manual: proposed standard practice for surveys on research and experimental development, 6th edition. Retrieved 27 May 2012.
[4] National Research Council (2008). Earth Observations from Space: The First 50 Years of Scientific Achievements. The National Academies Press.
[5] World Data Center System. "About the World Data Center System". NOAA, National Geophysical Data Center.

The debate on open access to research data has intensified in recent years following the growing amount of data and the growing possibilities offered by information technology.

The rapidly increasing amount of data allows for the analysis of complex issues involving large datasets. New technology generates big data, which carries significant opportunities for data analysis, but also challenges in terms of storage, communication and processing software, and ownership issues. Sources of big data include information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, etc. [6]

Information technology and the Internet have increased the amount of available data. This implies new and more extensive opportunities for collecting, analysing, storing and sharing data. Information technology has also affected the way in which research is done. Science has become more collaborative, data-intensive and computational, leaving academic researchers with new data management needs that have to be addressed as an integrated part of the data lifecycle. [7]

FIGURE 3.1 Data management as an integrated part of Research life cycle
Source: JISC Research 3.0: driving the knowledge economy.

This new, data-intensive research environment has been called the fourth paradigm of scientific inquiry, where all science literature is online, all of the science data is online and they interoperate with each other (Hey et al. 2009). [8]

"We must all accept that science is data and that data are science, and thus provide for and justify the need for the support of much-improved data curation." [9]

[6] Big data is a term used for large and complex data sets.
[7] JISC Research 3.0: driving the knowledge economy, and Tenopir et al. (2011).
[8] Tony Hey, Stewart Tansley and Kristin Tolle, eds. (2009): The Fourth Paradigm: Data-Intensive Scientific Discovery.
[9] Hanson, Sugden & Alberts (2011) Making Data Maximally Available. Science.

3.2.2 Growing consensus on the importance of sharing publicly funded data

There is growing consensus that data in all its forms represents a significant resource in today's knowledge society. Access to data and the infrastructure allowing for the utilization of data has become a resource that should be protected and utilized in an efficient manner. A growing number of governments, organizations and research funders are actively working to increase openness to data. This is not limited to research data but applies to all publicly funded data.

The relevance of sharing publicly funded research data rests on two arguments. Firstly, publicly funded research data should be utilized to the greatest extent possible and not be reserved for individual researchers or institutions. Further, open access to research data can be a means to utilise resources more efficiently.

"Sharing and open access to publicly funded research data not only helps to maximize the research potential of new digital technologies and networks, but provides greater returns from the public investment in research." OECD Guidelines (2007)

In 2004, the governments of the 30 OECD countries as well as China, Israel, Russia and South Africa adopted the Declaration on Access to Research Data from Public Funding. In this declaration, they recognized the importance of access to research data and invited the OECD to develop a set of OECD guidelines based on commonly agreed principles to facilitate optimal cost-effective access to digital research data from public funding.

The OECD principles were endorsed by the OECD Council in December 2006 and published in 2007. The OECD "Principles and Guidelines for Access to Research Data from Public Funding" (2007) essentially recommends that research data generated through publicly funded research is to be made publicly available to others:

"The value of data lies in their use. Full and open access to scientific data should be adopted as the international norm for the exchange of scientific data derived from publicly funded research." National Research Council study, Bits of Power. Cited in the OECD Guidelines (2007)

A recommendation is a legal instrument of the OECD that is not legally binding and which is often referred to as soft law. As such, there are no legal obligations towards publishing data. However, when a recommendation is endorsed by a country, the country is obligated to work towards fulfilling that recommendation.

The Norwegian government has endorsed the OECD guidelines in, for example, the previous white paper on research from 2009:

"Increased availability of research data, both in Norway and in the partner countries, helps to facilitate research and disseminate knowledge across borders. This is fundamental to the quality and something the government wants to facilitate. The Government intends to follow up on the OECD principles and guidelines for access to publicly funded research data." St. Meld 30, Report to the Storting, Climate for research. Freely translated by DAMVAD

Alongside the work at the OECD level, the European Commission is also working towards more

openness of research data. Efforts have been made both in terms of building competences [10] and infrastructure as well as in developing European-wide policies and guidelines.

In 2010, the High Level Expert Group on Scientific Data submitted its report "Riding the wave. How Europe can gain from the rising tide of scientific data" to the European Commission. Riding the Wave offers a vision of how Europe, through the efficient use of research resources, can strengthen research and innovation in Europe and, thereby, strengthen Europe's competitiveness in the global economy.

Since the beginning of its Seventh Framework Programme (FP7) for research and innovation in 2008, the European Commission has operated an Open access pilot to ensure open access to research publications from the FP7-funded projects. Based on these experiences, the European Commission has communicated that not only publications but also research data from the EU-funded projects should be openly available (when possible) in the future. In December 2013, the European Commission published Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020. Such initiatives are likely to affect Norwegian researchers in the times to come.

Although the OECD guidelines have been endorsed in Norway, it has largely been left to the various institutions and disciplines to develop methods for implementing them. In addition, the infrastructure for sharing and archiving research data in Norway is fragmented, with a decentralized system of local, regional, national and international data centres. There are wide variations between different subjects and disciplines.

The government of Norway and the Research Council of Norway now see a need for more coordinated efforts to ensure that more data are shared and archived. However, knowledge as to practices and the obstacles faced is needed. This study will serve as input for such work.

3.3 Structure of the report

Following this chapter on the mandate for and context of the report, the report provides a brief summary of the main findings from former studies and interviews in Chapter 4. Chapter 5 gives a detailed description of the methodology applied, covering conceptual clarifications, the selection of the population and the survey process, etc.

The results of the survey are presented in Chapters 6 through 10. The results are presented in the following order: presentation of the respondents, the respondents' practices regarding data usage and generation, the respondents' practices regarding data archiving and, last but not least, the respondents' practices and obstacles in relation to the sharing of data. The main findings and recommendations are included in the final chapter.

[10] For example, through the funding of Parse.Insight, a two-year project co-funded by the European Union under the Seventh Framework Programme. It was concerned with the preservation of digital information in science, from primary data through analysis to the final publications resulting from the research.

4 Former studies used to develop the hypotheses

Numerous studies have addressed the definition and importance of sharing research data (Borgman 2012, Kowalczyk & Shankar 2011). Several studies have addressed the technical aspects of infrastructure and data management (Tenopir 2012, Graaf et al. 2011), while strategy papers and policy documents have focused on the research process and proposed policies for the promotion of data sharing (PARSE.Insight 2012, EC 2012, Hey et al. 2009). Various studies have also focused on the practices of and barriers to sharing and archiving from the viewpoint of researchers.

The following chapter summarizes the main findings from studies dealing with the current practices of and obstacles to sharing and archiving from the researchers' point of view. The findings from previous studies have yielded significant insights into the matter and have been used in the development of the hypotheses and questions of our study.

4.1 The consensus on the importance of access to data

Several studies find that researchers acknowledge the benefits of open access to publicly funded research data. A European Commission [12] study from 2012 on scientific information in the digital age found strong support for open access to publicly financed research data among various stakeholders. Nine out of ten respondents stated that research data that is publicly available and publicly funded has to be, as a matter of principle, available for re-use and free of charge on the Internet.

Similarly, studies find support for the importance of archiving data. PARSE.Insight's (2012) European study on data archiving (preservation) concurs that the preservation of research output is important, the reasons being that it may stimulate the advancement of science and that it allows for the re-analysis and validation of research. [13]

4.2 Lack of recognition, time and proper infrastructure

Previous studies show, however, that data are often unavailable for various reasons. One of the key challenges for sharing research data concerns the legal issues involved. Data must be stored and shared in a way that safeguards privacy. Laws and regulatory policies in this area comprise provisions that have their origin in general social considerations and the need to protect citizens. There may also be other legal challenges relating to who owns the rights to data when multiple funders are involved in a given research activity.

[12] The EC Online. The online survey on scientific information in the digital age was open from July 2011 to September. The team received 1,140 responses in total from all Member States, except Ireland, Malta, Slovenia and Slovakia. 37 percent of all responses were submitted by German respondents. The responses represented the different stakeholders, 429 of which were individual researchers; six respondents (not limited to researchers) hailed from Norway.
[13] Apparently, validation of research is a growing global concern; see Trouble at the Lab, The Economist (October 2013).
[14] In Tenopir et al. (2011), the survey was open from October 2009 to July. Initially, the investigators used a snowball sampling method. They sent a cover letter to DataONE team members (about 35 individuals throughout the world, but primarily in the United States). To increase international response, surveys were sent by an academic publisher to its database of over 7,000 previous authors. Ultimately, 1,329 respondents answered at least one question. It is not unreasonable to estimate that the survey instrument reached 15,000 people, in which case the response rate was approximately 9 percent.

Tenopir et al. (2011) conducted a survey among 1,329 scientists, [14] exploring current data sharing practices and perceptions of the barriers to - and

enabling of - data sharing. In this survey, the principal reasons stated by scientists for not sharing data were insufficient time and a lack of funding.

The respondents in the EU study previously referred to also cited funding as a central barrier. In addition, the lack of credit given to researchers for making data available was raised as a concern. Most of the researchers in the EU study (81.1 percent) rated insufficient credit given as a very important or important barrier to accessing research data, followed by lack of funding to develop and maintain the necessary data infrastructures (78.7 percent) and insufficient national or regional strategies (74.6 percent).

The European-wide study Parse.Insight [15] found that researchers often had major concerns about legal issues, misuse of data and incompatible data types, all of which interfered with data-sharing practices.

Enke et al. (2012) found a diverse mix of both technological (e.g., a lack of appropriate databases/mechanisms) and sociological (e.g., time, funding, etc.) causes that may impede scientists from sharing data. The main reason for not sharing data (cited in their international survey on data sharing in the field of biodiversity) was loss of control over the data, followed closely by the amount of time that would need to be invested in sharing data sets.

Studies indicate that sharing research data reflects personal factors, such as attitudes and culture. Tenopir et al. (2012) found that barriers to sharing research data were deeply rooted in the practices and culture of the research process, as well as in the researchers themselves. These factors can include privacy concerns, concerns about publishing opportunities, and the desire to retain exclusive rights to data.

Many journals require authors to share their data with other investigators, either by depositing the data in a public repository or else by making it freely available upon request. Caroline J. Savage and Andrew J. Vickers (2009) endeavoured to determine how well authors comply with such policies by requesting data from authors who had published in one of two journals that had clear data-sharing policies. They received only one of 10 raw data sets requested. This suggests that journal policies requiring data sharing do not lead to authors making their data sets available to independent investigators.

Researchers who choose to withhold datasets often have specific reasons for doing so. Savage and Vickers (2009) noted that these reasons included concerns about patient privacy (for medical fields), concerns about future publishing opportunities and the desire to retain exclusive rights to data that had taken many years to produce.

The studies presented above have provided insights into research practices and views regarding sharing and archiving. Nonetheless, the studies suggest that various barriers persist. We sum up the findings in a simple illustration in Figure 4.1. There seems to be a diverse mix of barriers involved, of which privacy issues, losing control over data, lack of credit, time for preparation and lack of proper infrastructure appear to be the most important (highlighted in Figure 4.1).

[15] Parse.Insight (2010) was a two-year project co-funded by the European Commission under the Seventh Framework Programme (FP7) on Research Infrastructures. Major surveys were held within three stakeholder domains: research, publishing and data management. The survey includes 1,389 responses from researchers, 262 responses from data managers and 178 responses from publishing. All parts of Europe were represented in these surveys.

4.3 Variations across disciplines and ages
Although the studies often highlight many of the same obstacles, the importance of each barrier may differ between the studies. This can be an indicator of differences in the nature of barriers across the respondent groups, but could also be a consequence of the different methodologies applied. Surveys are typically sensitive to the way questions are articulated and the context in which they are placed. Consequently, one should be careful when comparing different studies. That said, some studies have investigated the differences between different types of respondents within the same study and still found variations, especially across disciplines and age ranges.

Some research disciplines are typically more reluctant to share data than others. Tenopir et al. (2011) found that the actual rate of data sharing varied considerably according to subject discipline, age, and geographic location. Researchers in medicine and social science were the least likely to share data. Atmospheric scientists were the most inclined to make their data available to others. Interestingly, when asked whether a lack of access to other researchers' or institutions' data was a major impediment to their research, social scientists agreed that this was the case more than other respondents (80 percent compared to 60 percent across disciplines).

Such a lack of data sharing may also be a question of competition. Campbell et al. (2002) found that fields with increased opportunities for commercial applications, such as genetics, were less likely to share data when compared to less competitive fields.

Younger researchers tend to be less likely to share data. This may be due to concerns regarding their career path. Tenopir et al. (2011) found differences in responses based on the age of respondents. Younger people were less likely to make their data available to others, whereas people above 50 years old showed more interest in sharing their data.

FIGURE 4.1 Barriers to sharing research data
Legal: Privacy; Shared ownership of data; Lack of knowledge on legal issues related to data
Sociological: Lack of incentives/credit to researcher; Concerns about researchers freeriding on data gathered by other researchers; Fear of losing control over data; Fear of losing scientific edge; Fear others might not understand data
Technical: Lack of infrastructure; Sharing data is time-consuming; Lack of standards for sharing and preparing metadata; Lack of technical skills
Based on Tenopir et al. (2011), Enke et al. (2012), EC (2012), Kvale (2012), PARSE.Insight (2010).

These results correspond to findings in Kvale's (2012) 16 study of life science researchers in Norway. Kvale (2012) found the argument that publicly funded research should become public property to be stronger among researchers with more experience. However, the proposition that the sharing of research data might stimulate inter-disciplinary collaborations stood out as an argument with much stronger support among younger researchers than more experienced ones.

4.4 Input from researchers and data managers in Norway
As part of preparing this report, DAMVAD participated in a workshop organized by the Research Council on sharing and archiving of research data in Norway in October. Interviews and participation in this workshop provided certain insights and allowed for detailed discussion of the practice of sharing and archiving in Norway.

Further, informants amongst researchers and data managers in Norway offered additional insights into the barriers to sharing and archiving in the Norwegian context. Many of the barriers identified in the former studies (such as issues relating to privacy, lack of credit and time) were confirmed. Data managers are also typically concerned about the technical aspects, describing the Norwegian data management infrastructure as fragmented and overlapping. The informants point to a lack of central coordination, a lack of established standards for data formats and metadata 17, and a lack of professional data curators responsible for facilitating sharing and archiving on behalf of researchers. A list of informants is included in the appendix.

4.5 Hypotheses
The international literature, workshop, and preliminary interviews served as a basis for the formation of hypotheses to explore through the survey. We present the hypotheses in Table 4.1. Together, the different hypotheses allow for a detailed analysis of the practice of sharing and archiving of research data in Norway, what the main barriers to sharing and archiving are, and how these barriers can be reduced.

16 A survey was conducted by Kvale as part of her Master's thesis on data sharing in the life sciences among researchers at the Norwegian University of Life Sciences. The questions in the survey were largely similar to the questions included in the Parse.Insight survey. Of the 650 researchers and PhD students at the Norwegian University of Life Sciences (UMB) selected as a sample population for the questionnaire, 147 respondents (or 23 percent) replied.
17 Metadata is "data about data", i.e., information about the content of data.

TABLE 4.1 Hypotheses to be tested in the survey
Researchers see the benefit of accessing other researchers' data, but want to retain control of their own data.
There are various barriers to sharing research data (legal, technical, ethical and financial).
Some data cannot be shared; nonetheless, a lack of incentives, time and infrastructure remain central obstacles to sharing research data.
Research data is archived for later reanalysis and validation.
Sharing and archiving activities are financed as a part of research project funding.
The barriers differ significantly across sector, discipline and age.
Younger researchers are more negative about sharing research data than older scientists.
Researchers in the institute sector are more concerned with future revenue, whereas researchers in the university sector are more concerned with losing scientific edge.
Researchers in disciplines using numerical data are more experienced with sharing data.
Internationally-oriented researchers are more open to sharing data than those that primarily work alone.
Management supports the sharing and archiving of data.
Work to increase sharing and archiving of research data needs to take place on many levels: policy level (guidelines, standards etc.), infrastructure/data management level, institutional level and research level.

5 Methodology
This chapter describes the methodology of the survey: the definitions used, how we selected the survey population, and an analysis of the respondents.

5.1 Conceptual clarifications
This study contributes to a field of growing interest from researchers, research organizations, governments and civil society. Several studies have sought to investigate the field from a range of angles using different methodologies, concepts and terms. We have largely used the terms and definitions offered in the OECD guidelines and completed the necessary delineations to make the study relevant for the work of the Research Council of Norway.

Scope
The researchers relevant to the study included researchers working at research institutes, universities and university colleges and health trusts (Helseforetak) in Norway. Researchers outside such institutions (e.g., researchers employed in private companies) are not included in the survey. This delineation ensured that the study focused on the activities of those researchers one might expect to be publicly financed.

Financing
The study will serve as input to the Council's work on drawing up guidelines for publicly funded research data. The survey has therefore sought to focus on publicly funded research data and not on data that have been gathered for other reasons (such as for commercialization). This is in line with OECD guidelines. As an introduction to the survey, we informed the respondents about the definition of publicly funded research data:

Publicly funded research data is defined as the use and generation of research data that is publicly funded (e.g., fully or partly funded by the Research Council of Norway, hospital trusts, universities and colleges, ministries and other public entities). Research that is fully funded by private or international organizations is not included in this survey.

Research data
Various definitions of research data can be found in the literature on the topic. This study uses the term in accordance with the OECD guidelines. As part of the introduction to the survey, we informed the respondents about the definition of research data:

Research data are defined in accordance with the OECD guidelines for open access to research data, in which research data comprises factual records (numerical scores, textual records, images and sounds) used as primary sources for scientific research and which are commonly accepted in the scientific community as necessary to validate research findings. A research data set constitutes a systematic, partial representation of the subject being investigated. This term does not cover the following: laboratory notebooks, preliminary analyses and drafts of scientific papers, plans for future research, peer reviews, personal communications with colleagues, or physical objects.

The OECD guidelines are primarily aimed at research data in digital, computer-readable format. It is in this format that the greatest potential lies for

improvements in the efficient distribution of data and its application to research, largely because of the low marginal costs of transmitting data through the Internet. However, the guidelines could also apply to analogue research data in situations where the marginal cost of giving access to such data can be kept reasonably low.

PARSE.Insight (2010) used the term digital research data for all output in research. In practical terms, raw data, processed data, publications and post-publication materials are all covered by the same term. We have not used the term digital, as we would like to cover the entire range of research data. Moreover, we did not wish the respondents to make subjective valuations as to what type of data the survey covers. One can imagine research data that has not been made digital but which can be digitized in the future.

It is common to use several data sources in research. It is useful to distinguish between source data and output data. Source data is data that already exists independently of the research to be undertaken. This may be information that is collected for a different purpose (e.g., administrative data or clinical data) or physical or digitized collections of objects and texts (such as libraries, text corpora and other scientific collections). Output data is data generated through research. This can be data generated through new analysis or a compilation of existing data sources, but it can also be completely new data generated through new data collection. Typically, such data will be data from experiments, simulations, field work or interviews. However, the distinction between primary (output) and secondary (source) data can sometimes be subjective and contextual. As such, we have not applied this distinction between types of research data in the survey.

One of the research questions posed includes the term metadata. Metadata can be understood as structured data and information about data, of any sort in any media, which imposes order on a disordered information universe. Typically, metadata comprises index files and data dictionaries that store administrative information.

Archiving
Storage, archiving and preservation are all terms used to describe how access to data at some later point in time is ensured. Although no clear distinction between the three terms can be made, storage might be understood as the saving of data during a project, archiving as the medium- to long-term saving of data after a project, and preservation as professional saving for even longer periods. This study focuses on the viewpoints of researchers and how they deal with their research data; in this study, we have used the term data archiving to denote storage beyond the lifetime of a project. As an introduction to the survey, we informed the respondents as to how archiving is defined:

Data archiving refers to the long-term storage of scientific data and methods. Typically, data are archived at the end of a research project or after a scientific publication or report has been prepared.

Parse.Insight (2010) used the term digital preservation to refer to a set of processes and activities that ensure continued access to information in digital form. It denotes the process of storing digital information in such a way that it remains accessible, understandable and usable over the long term (usually five, 10 or 50 or more years). The survey explored

several related activities, such as taking into account environmental changes (preservation watch), preservation planning (what needs to be done and when) and preservation actions (e.g., migration and emulation). We have chosen not to use the term digital preservation in the survey, as it can be understood as an activity for professional data managers rather than as something researchers actively undertake in their everyday research activities.

Open access to research data
We have used the term open access to research data in line with OECD guidelines, which state that openness refers to access on equal terms for the international research community at the lowest possible cost, preferably at no more than the marginal cost of dissemination. The OECD guidelines also state that open access to research data from public funding should be easy, timely, user-friendly and, preferably, Internet-based. The latter part of these guidelines can be seen as a normative judgement rather than a definition of the term; therefore, to avoid misunderstandings and differences in interpretation, we have not included it in the survey. Instead, we have used the following definition, of which the respondents were also informed in the introduction to the survey:

Open access to research data is the practice of providing access on equal terms at the lowest possible cost, preferably at no more than the marginal cost of dissemination.

The term open access is in some instances used only for open access to research publications and not to data. Therefore, we have specified throughout the survey that we are dealing with open access to research data.

5.2 Selecting the population
To ensure robustness of survey results, it was important to obtain a representative number of completed answers from each sub-population (i.e., the university sector, research institutes and health trusts). With representative sub-samples, we are able to compare different groups of respondents. We sampled our population by randomly selecting researchers from CRIStin. 18 In addition to the mentioned sub-populations, we sought representativeness within research disciplines in research institutes, universities and university colleges. All the sub-populations had a representative number of completed surveys once the survey had ended, with the exception of the humanities within research institutes.

18 The Research Information System CRIStin is a tool aimed at the recording and promotion of publication data, projects, units and competency profiles. The system is also used to report publication points.

5.3 Survey process
In-depth understanding and knowledge reduces the risk of misinterpreting questions and ensures that a survey can cover all areas of the topic in question. Prior to designing the survey, we conducted extensive desk research including a literature review. Based on this, we formed a survey grounded in a proper understanding of the obstacles and barriers to open access for research data.

Getting the researchers' views also helped to define the questions and their response alternatives. To this end, we conducted explorative and in-depth interviews and participated in a workshop on data management organized by the Research Council of Norway. With the information provided, we were able to confirm the hypotheses and gain a better understanding of the status of the archiving of and open access to publicly funded research data in Norway.

An initial draft of the survey was developed in Enalyzer Survey Solution. We tested the draft extensively, both internally and in collaboration with the Research Council of Norway. These tests helped to ensure that the survey addressed the central hypotheses. Further, it was important that the questions asked should be unambiguous and easy for the respondent to understand. Finally, it was of particular importance that the survey should draw a clear distinction between what information was needed and what information would merely be useful to have. We did not want a survey that was too long or contained irrelevant information.
TABLE 5.1 Population, invites, response rates and the degree of representativeness
Sub-population Population Invites Response rate Degree of representation

Universities and university colleges
Humanities 2, % 114.9%
Agriculture and fishery % 93.3%
Mathematics and natural science 1, % 110.1%
Medical science 3, % 112.2%
Social science 2, % 121.0%
Technology 1, % 93.6%

Health trusts
Medical science 1, % 84.3%

Research institutes
Humanities % 50.0%
Agriculture and fishery 1, % 110.2%
Mathematics and natural science % 97.9%
Medical science % 107.0%
Social science % 112.0%
Technology 1, % 102.5%

Total 18,863 6, %

Note: The degree of representativeness indicates how close the survey is to being representative for each sub-population, allowing for a 6 percent margin of error at a 90 percent confidence level. This means that, within a 6 percent margin, the analyst can be 90 percent confident that the sample is representative of the population.
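The note to Table 5.1 can be made concrete with a small calculation. The sketch below assumes that the degree of representation is the ratio of completed surveys to the sample size required for a 6 percent margin of error at 90 percent confidence, using the standard Cochran formula with a finite population correction and a worst-case proportion of 0.5. The report does not state which formula was actually used, so the function names, the choice of formula and the example population sizes are illustrative assumptions and will not necessarily reproduce the values in the table.

```python
from statistics import NormalDist


def required_sample_size(population, margin=0.06, confidence=0.90, p=0.5):
    """Cochran sample size with finite population correction (assumed formula)."""
    # Two-sided critical value, about 1.645 for 90 percent confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size needed for an effectively infinite population.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Reduce the requirement when the population is finite.
    return n0 / (1 + (n0 - 1) / population)


def degree_of_representation(completed, population, **kwargs):
    """Completed surveys as a share of the required sample (hypothetical helper)."""
    return completed / required_sample_size(population, **kwargs)


if __name__ == "__main__":
    # Hypothetical sub-population sizes, not figures from the report.
    for size in (100, 1_000, 10_000):
        needed = required_sample_size(size)
        print(f"N={size}: about {needed:.0f} completes, {needed / size:.1%} of the population")
```

Under these assumptions, a value above 100 percent means that more surveys were completed than the stated precision requires, and the required share of the population shrinks as the population grows.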

After developing the survey internally, DAMVAD invited 68 randomly drawn researchers from CRIStin to test the survey electronically using Enalyzer Survey Solution. Ten researchers completed the pilot, which provided us with good feedback. After adjusting the pilot survey, the final survey was launched through Enalyzer Survey Solution on the 18th of December 2013 (week 51).

5.4 Response rate
DAMVAD invited 9,262 researchers to participate in the survey, of which 2,480 addresses were no longer working. This left us with 6,782 active invitees. 1,474 researchers completed the survey while 604 actively chose not to participate. The response rate for the population as a whole was 30.6 percent, while it varied between 23 percent and 42 percent within different sub-populations. Table 5.1 includes a complete overview of the number of invites, the response rates and the population size of our sample.

5.5 A significant proportion of the researchers actively chose not to participate
Approximately 600 researchers actively chose not to participate in the survey. Health trusts saw the highest share of researchers not willing to participate, as shown in Figure 5.1. The share of respondents that did not want to participate in the survey was higher than what we have experienced in other surveys.

There are variations across sectors and research disciplines. Besides the health trusts, 14.6 percent of those working in the research institute sector in agriculture and fishery did not wish to participate. Researchers at universities were keener on participating. An average of six percent did not wish to participate, with the lowest share being in the research disciplines of mathematics and the natural sciences, where five percent did not wish to participate.

One of the main objectives of the survey was to ensure representation in all the relevant sub-populations. The number of completed surveys needed for representativeness varied according to the size of the total population. As the population size increases, the number of completed surveys needed for a representative sample, as a percentage of the population, falls. That is, for small populations, a large portion of the actual population needs to complete the survey in order to generate a representative sample. The degree of representation is smaller for medical science performed at health trusts (84.3 percent) in comparison to medical science performed at research institutes (107 percent).
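The response figures reported in Section 5.4 can be cross-checked with simple arithmetic. In the sketch below, the counts are taken from the text and the variable names are illustrative. With these figures, the reported 30.6 percent appears to correspond to completed surveys plus active refusals as a share of working addresses, while completed surveys alone come to roughly 21.7 percent. This reading is an inference from the numbers, not something the report states.

```python
invited = 9_262          # researchers invited
dead_addresses = 2_480   # addresses no longer working
completed = 1_474        # completed surveys
declined = 604           # actively chose not to participate

# Invitations that could actually reach a researcher.
active = invited - dead_addresses

completion_rate = completed / active
reaction_rate = (completed + declined) / active
decline_rate = declined / active

print(f"active invitations: {active}")
print(f"completed only: {completion_rate:.1%}")
print(f"completed or declined: {reaction_rate:.1%}")
print(f"declined: {decline_rate:.1%}")
```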

6 Descriptive statistics
This chapter presents the characteristics of the survey respondents. Information about the researchers that have completed the survey is useful both to assess the robustness of findings and to illustrate the complexity of the researcher population. A sufficient number of respondents within each category is important for later discussions and comparisons of findings and differences between sectors, research disciplines and scientific experience.

Table 6.1 shows the distribution of the respondents by sector. Research institutes and universities together cover 85 percent of the respondents. Eleven percent are in hospital trusts, while the last four percent comprise others, covering inter alia companies. The distribution between research institutes and universities is relatively even, which allows for comparisons between the two respondent groups. The number of respondents from hospital trusts is lower than for the two other sectors.

One concern is the level of respondents within the hospital trusts. From Table 5.1 we saw that we only reached an 84.3 percent degree of representation. Though 84.3 percent is relatively high, it still does not qualify as fully statistically representative. With 158 observations, we still find that we can use the category when comparing with other sectors. Nevertheless, we will keep in mind the limitations of this category.

TABLE 6.1 Participation across sector ("At what type of institution is your main occupation?")
Sector Freq. Pct.
Research institute %
University and university college %
Hospital trust %
Other %
Note: Other covers private organisations, non-profit organisations and foundations.

Table 6.2 shows the differences in gender across respondents. There is an over-representation of male respondents, at 60 percent, yet the respondents constitute a representative sample of both genders.

TABLE 6.2 Participation across gender ("What is your gender?")
Gender Freq. Pct.
Female %
Male %
Total 1, %

Although the survey allows for analysis based on gender, gender is not used extensively to compare the results. This dimension is interesting, but less relevant to specific policies and strategies going forward, where most efforts will need to cut across gender.

Table 6.3 shows the distribution of respondents across research disciplines. Social sciences and health sciences are the disciplines with the highest

number of responses, whereas humanities has the fewest. In total, we estimate that the different categories are well represented, enabling robust analysis and comparisons across different research disciplines.

Finally, Table 6.4 shows the distribution of respondents by scientific experience. We measure scientific experience in terms of the number of years the respondents have been conducting research (i.e., the number of years since and including their PhD). The distribution of the respondents is relatively even on this dimension as well. One fourth have conducted research for 11 to 20 years, and almost the same share have conducted research for more than 20 years.

TABLE 6.3 Participation across research discipline ("Which is your primary research discipline?")
Field of research Freq. Pct.
Social science %
Health science %
Mathematics and science %
Technology %
Farming and fishery %
Humanities %
Other %
Total 1, %
Note: Other typically covers multi-disciplinary research.

TABLE 6.4 Participation across scientific experience ("For how many years has research constituted a major part of your work (including PhD or similar)?")
Scientific experience Freq. Pct.
Less than 3 years %
3-6 years %
7-10 years %
11-20 years %
More than 20 years %
Total 1, %

7 Researchers use data generated by other researchers
This section includes the findings related to how researchers generate and use data. This constitutes an important context for understanding the possibilities and limitations for archiving and, especially, sharing.

We have applied a definition of research data in line with the OECD guidelines, which distinguishes between numerical records, textual records, sounds, images, videos and graphics. The questions about data formats are included in the survey for two reasons. First, they offer an interesting perspective as to what kind of research data are most commonly used. Second, they allow for the investigation of whether researchers' views on the sharing and archiving of data differ across data formats.

7.1 Data formats vary across research disciplines
Three-quarters of the respondents generated numerical data (e.g., quantitative data, data models, data series, statistics, etc.). Health trusts in particular use numerical data in their research. Of all the respondents, almost 60 percent stated that they mainly generate numerical data. This is especially true for agriculture and fishing, as well as for mathematics, the natural sciences and medicine. A total of 23 percent mainly generate textual records, and 5 percent most frequently generate images, sounds, videos and so forth.

In humanities, numerical data is rare. Researchers in humanities typically base their research on textual records (qualitative data, field reports, interviews, social studies, etc.), images, sound and the like, or else they do not generate data at all. Only 14 percent of the respondents within the humanities stated that they mainly generate numerical data.

Compared with health trusts, only half of the respondents working in universities generated numerical scores. Textual records are more common at universities, where 28 percent stated that they mainly generated textual records. This is shown in Figure 7.1 on the following page.

Textual records are more common within social sciences and humanities. Almost 50 percent of the responding researchers in these fields stated that they mainly generate textual records, in contrast to mathematics and natural sciences, where very few (7 percent) primarily generated textual records.

Some researchers report that they do not generate data at all. This is true for 6 percent of the respondents at universities, 3 percent at research institutes and 2 percent at health trusts. There are differences between research disciplines as to who does not generate data. For example, no data is reported by 1 percent within agriculture and fishing, but by 7 percent within technology. This is not shown in the figure.

TABLE 7.1 Type of data generated ("What is the main format of your research data?")
Freq. Pct.
Numerical scores %
Textual records %
Images, sounds, videos and graphics %
I do not generate any research data %
Other %
Total 1, %

The distribution of the respondents also shows that approximately 10 percent within each group have answered other. Going through the survey, this most often implies that they generate both numerical and textual data.

We found little evidence of differences in terms of the type of data generated by experience, which means that we will not present or comment upon the types of data generated by researchers with different levels of experience.

7.2 Numerical data are easier to restore
As illustrated in Figure 7.2, numerical data are easier to restore. Fifteen percent answered that their numerical data could be restored very easily, and almost 50 percent stated that they could restore their numerical data with the same effort as they used when producing the data.

Textual records are the type of data that is hardest to restore. Almost 50 percent answered that textual records are either impossible or at least difficult to restore.

FIGURE 7.1 Data format, by institution ("What is the main format of the research data you generate?")
Categories: Numerical scores; Textual records; Images, sounds, videos and graphics; Do not generate data
Series: Universities and university colleges; Research institutes; Health trusts (hospitals)

FIGURE 7.2 Numerical data is easily restored ("If your research data gets lost, how easily can you recreate it?")
Categories: Hardly; Not possible; Very easily; With same effort; I don't know
Series: Numerical data; Textual records; Images, sounds, videos and graphics

7.3 Researchers frequently use other researchers' data
The survey also asked researchers about the extent to which they use other researchers' data in their work, and the extent to which they share their own data with other researchers.

Many researchers have utilized the research data of other researchers. Almost two thirds of the responding researchers had utilized research data provided by other researchers within the past three years.

TABLE 7.2 Use of other researchers' data ("Have you within the last three years used research data gathered by other researchers?")
Freq. Pct.
No %
Yes %
Total 1, %

Across affiliations, most researchers use data gathered by other researchers.

FIGURE 7.3 Researchers use other researchers' data, by sector ("Have you within the last three years used research data gathered by other researchers?"). Groups: universities and university colleges, research institutes, health trusts (hospitals). Note: The figure only includes those who answered yes to the question: "Have you within the last three years used research data gathered by other researchers?"

Figure 7.3 shows that 68 percent of the respondents within research institutes have used research data gathered by other researchers within the last three years. This is a little less common within health trusts, yet 52 percent of the respondents within health trusts have used data gathered by other researchers within the last three years.

Differences are larger across research disciplines. The use of other researchers' data seems more commonplace within mathematics and natural sciences, and less so within humanities and medical science. This corresponds to our hypothesis and to international studies across disciplines (Tenopir, 2011). Specifically, 50 percent of the respondents within humanities and 44 percent in medical science report not having used research data gathered by other researchers within the last three years. In comparison, the share is 24 percent within mathematics and the natural sciences, and 32 percent within agriculture and fishery.

FIGURE 7.4 Researchers' use of other researchers' data, across disciplines ("Have you within the last three years used research data gathered by other researchers?"). Groups: humanities, agriculture and fishery, mathematics and natural science, medical science, social science, technology.

7.4 Researchers mainly use data produced by other researchers from the same institution

Researchers do not travel far in their search for research data. Two thirds of the researchers used research data produced by other researchers from the same institution. However, many respondents also use data gathered by researchers at international institutions: across all respondents, 56 percent stated that they used data from other researchers at international institutions.

TABLE 7.3 Researchers use data produced by other researchers at their institute ("Whose research data have you used the most within the past 3 years?" Multiple answers allowed). Columns: Freq., Pct.
Rows:
Research data from other researchers at my institution
Research data from other researchers at national institutions
Research data from other researchers at international institutions
Other
Total

7.5 Researchers would like even better access to other researchers' data

As illustrated in Table percent of researchers have not used data gathered by other researchers. Table 7.4 contrasts this finding with their reported interest in using data. Almost three-quarters (71.5 percent) of these respondents would like to make use of other researchers' data. In other words, there is a substantial unmet demand for research data generated by other researchers. As few as 144 respondents (10 percent of the total respondent group) have not used research data generated by other researchers during the past three years and do not wish to use data generated by others. Nine out of ten respondents either want to use, or are already using, research data gathered by other researchers.

TABLE 7.4 Researchers that have not used other researchers' data, but would like to do so ("If 'no' to the above question: Would you like to make use of research data gathered by other researchers or institutions?"). Columns: Freq., Pct. Rows: No; Yes; Total.

8 Research data is rarely archived in data centres

Data archiving refers to the long-term storage of scientific data and methods, that is, data which are archived at the end of a research project or after a scientific publication or report has been prepared. Archiving of data is an important prerequisite for validation of research findings. Infrastructure for archiving data can also be an important enabling factor for sharing data. This chapter presents the findings related to researchers' practice concerning archiving of research data.

8.1 Most data is archived on portable storage units or institutional servers

Various systems for data archiving exist, and one can easily imagine that researchers use a variety of them. Sometimes data are archived on the institutional server, other times at a national data archive centre. When asked about their most common way of archiving data, the vast majority of respondents report that research data is stored locally, either on researchers' own personal computers, USB or CD/DVD/floppy disks, or on local servers at their institutes. More than 80 percent archive data locally (Table 8.1). One out of ten store their data at central data archive centres, either at their organizations or at national centres. Finally, less than two percent use archive solutions outside of Norway.

These findings are both surprising and cause for concern. The major concern relates to data security. If sensitive data is stored on CD/DVDs or personal computers, it is vulnerable to Internet-based intrusions. Institutional servers are better at keeping intruders out, but they are still not as good, or as secure, as more professional data archive centres (either local or national) which specialize in taking care of sensitive data.

TABLE 8.1 Data archiving ("What is the most common way of archiving your research data after results are ready or beyond the life of a project?"). Columns: Freq., Pct.
Where do you mainly store the data you generate?
Portable storage unit
Institutional server
Data is submitted to digital archive centre in my organisation
Data is submitted to a national digital archive centre
Data is submitted to an international digital archive centre
Do not archive
Other
Total

Further, there is an issue concerning the restoration of data if they are lost. If a USB drive is lost or if a personal computer crashes, it can be rather difficult to restore the data, and months of work can be lost.

There are some differences between types of affiliations. Storing locally on portable storage units is more common at the universities compared to the institute sector. Researchers at research institutes more often use institutional servers to store their data. But in general, Figure 8.1 confirms that 85 percent of respondents mainly store their data on a portable storage unit or on the institutional server. The figure is 82 percent for universities and university colleges, 84 percent for research institutes and 88 percent for health trusts.

That in turn leaves a rather limited share of respondents who archive their data at archiving centres, either nationally or internationally. At universities and university colleges, 3 percent mainly store their data at national archiving centres. The corresponding figure for research institutes is 1 percent, whereas 2 percent within health trusts mainly store their data at national archiving centres.

FIGURE 8.1 Data archiving across institution ("What is the most common way of archiving your research data after results are ready or beyond the life of a project?"). Answer categories: portable storage unit; institutional server; organizational digital archive/data center; national digital archive/data center; international digital archive/data center; do not archive data. Groups: universities and university colleges, research institutes, health trusts (hospitals).

FIGURE 8.2 Data archiving by research discipline ("What is the most common way of archiving your research data after results are ready or beyond the life of a project?"). Answer categories: portable storage unit; institutional server; organizational digital archive/data center; national digital archive/data center; international digital archive/data center; do not archive data. Groups: humanities, agriculture and fishery, mathematics and natural science, medical science, social science, technology.

These differences in the use of portable storage units versus institutional servers are largely explained by differences between research disciplines. Humanities, where portable storage is more common, are strongly represented in the university sector, while agriculture and fishery are strongly represented in the research institutes and report a higher share of centralized storage. Within humanities, 34 percent store their data on a portable storage unit. In agriculture and fishery, 81 percent of researchers store their data on institutional servers. This is illustrated in Figure 8.2.

8.2 Storage reflects costs of recreation

The implications of losing data are particularly significant for data that would be costly or impossible to regenerate. Data that can be regenerated with the same effort as its initial creation is more commonly stored on portable storage units or institutional servers than

by other storage methods. Nine percent of data stored on a portable unit can be restored very easily, whereas 27 percent of data stored at international archives and data centres can easily be restored. On the other hand, 29 percent of the respondents who do not archive their data are unable to restore their data. That is the case for only 9 percent of those storing their data at national archiving centres. In total, we see that those researchers who do not archive their data also find it harder or even impossible to restore data. Almost 60 percent of those not archiving their data cannot restore their data without extensive effort. This should be compared to those archiving at national or international archiving centres, where 50 percent or more can restore their data with the same or less effort.

FIGURE 8.3 Data archiving by data regeneration ("What is the most common way of archiving your research data after results are ready or beyond the life of a project?"). Answer categories: not possible; hardly; with same effort; very easily. Groups: portable storage unit; institutional server; organizational digital archive/data center; national digital archive/data center; international digital archive/data center; do not archive data.

8.3 Most researchers are satisfied with their current archiving solution

Most respondents seem to be satisfied with their current solution for the archiving of data. Two thirds stated that they were satisfied with their current solutions, while 15 percent did not know.

TABLE 8.2 Satisfaction with current archiving solutions ("Are you satisfied with your archiving solution?"). Columns: Freq., Pct. Rows: Yes; No; I don't know; Total.

8.4 Those who are not satisfied point to security risks

One-fifth report that they are not satisfied. Of these, half point to a lack of security as a problem. Others pointed out that archiving is too complicated, that there is not enough capacity, and even that there are too many possible solutions.

TABLE 8.3 Satisfaction with current archiving solutions ("If not, why are you not satisfied with your archiving solution?" Multiple answers allowed). Columns: Freq., Pct.
Rows:
Too complicated to use
Too expensive (Freq. 3, Pct. 1.3%)
Too little capacity
Too many archiving solutions
Not secure enough
Other
Total

These are all important barriers to the active use of archiving solutions as a part of sharing data. Many researchers deal with sensitive information, and hence security is essential. The presence of too many archiving solutions means that it can be difficult for researchers to know where to archive their data, and that archiving can be time-consuming. Further, it is notable that respondents find such systems too complicated to use, which again means that researchers have to spend valuable time archiving their data.

Interestingly, more than 25 percent answered "other" to the question about satisfaction. When answering "other", the respondents were able to add comments describing what they meant by this. The frequently used arguments are categorized and included in Table 8.5.

8.5 Archiving activities are financed as a part of project- and institutional funding

Archiving activities are financed mainly as a part of institutional funding. Further, many such activities are financed on a project-by-project basis. Nearly 11 percent responded "other" to this question. Many of these state that archiving activities are not funded or that they do not know how they are funded. Some even say that they have paid for archiving solutions themselves.

TABLE 8.4 How archiving activities are financed ("How is your archiving activities financed?"). Columns: Freq., Pct.
Rows:
Part of research projects
Part of funding for research-based operative tasks
Part of institutional funding
Other
Total

TABLE 8.5 Frequently used arguments in the open answer on why researchers are not satisfied with their existing archiving solution (argument posed, quoting the respondents)

Argument posed: The archiving solution does not enable sharing data with others
Not easily accessible by other researchers

Argument posed: Too complicated and time consuming to use archiving systems
Too time consuming to do all the back-up solutions are non-standard ad-hoc
No common procedure for archiving makes it difficult and often not properly done.
Lack of routines about how to store raw data.

Argument posed: Lack of security and stability of the archiving systems
Damage to hard drives pose a risk
The back-up regime is not reliable
Data has been lost due to change in storing technology, e.g. magnetic tapes were discarded without transfer of content to a new media.
We do not trust the back up and use our own external hard disk
Data can be lost at system upgrades etc.

9 Most researchers share research data

Research is cumulative in that it often builds on previous research. Similarly, researchers often use other researchers' data, in line with the principles of the OECD guidelines.

9.1 Researchers are positive to the principle of open access

Researchers clearly see the benefits of the sharing and archiving of research data. About 80 percent of the respondents agree that open access to research data enhances research. In addition, 79 percent agree that it is an ethical obligation of research to make research data available for validation. These are the two reasons for open access to research data that most researchers agree with. Only 6.5 percent agreed that open access to research data would lead to less interesting research. Further, 77 percent agree that open access to research data facilitates the education of students and new researchers, and 74 percent agree that it stimulates research collaboration. The comment below illustrates a positive attitude towards sharing data:

As a matter of principle, generated data of a certain magnitude (small-scale surveys exempted) on publicly funded projects should be shared with the rest of the research community. A data set can in most cases be used and analysed for diverse purposes. In my view this is a matter of research ethics and should be included in the guidelines of the National Committee for Research Ethics in the Social Sciences and the Humanities (NESH).

9.2 Many researchers are left undecided

Although most researchers agree on the benefits of sharing data, many are undecided as to whether publicly funded research data should be open or whether it should be considered public property. Table 9.1 illustrates this.
Roughly 20 percent do not agree that open access to research data will enhance research, or that it is an ethical obligation of research to make research data available for validation: between 15 and 16 percent are undecided and 4 to 5 percent disagree. Similarly, 53 percent agree that publicly funded research data should be public property, 31 percent are undecided and 15 percent disagree. In both cases, the share of undecided respondents is higher than the share who disagree. The relatively high number of researchers who did not want to participate in the survey can also be seen as an indication of the complexity of the issue. Views from both those who support and those who disagree with the overall principle of open access to research data are elaborated below:

I don't see any challenges. Free access to everything.

Researchers should not hoard their data, especially if publicly funded. After publishing their work - the data should ideally be available to others for robustness testing, replication, and the exploration of new hypotheses.

Researchers will often dislike open access because they are afraid of having made mistakes that might be revealed or having missed important patterns in the data that others might get good publications from. More important however, is that science is a social process where checking and challenging each other is what moves us forward - even if the process itself can be painful for the participants.

I see this as a hyped up issue. As long as interpretations are needed to make sense of the data, there is no way those data are useful for others unless the original researchers are also part of a new study involving the data.

9.3 Health trusts are positive towards the effects of sharing data on research

There are only small differences across sectors, research disciplines and professional experience. When looking across sectors, researchers at health trusts are a bit more positive towards the effects of sharing data. Figure 9.1 illustrates this. The figure shows respondents' positions in relation to the question of whether open access to research data would enhance research.

TABLE 9.1 Attitudes towards open access to research data ("Please indicate if you agree to the following statements related to open access to research data:"). Columns: Agree (Freq., Pct.), Undecided (Freq., Pct.), Disagree (Freq., Pct.)
Statements:
Open access to research data will enhance research
Open access to research data will stimulate more research collaborations
Open access to research data will make research less interesting
Open access to research data will facilitate education of students and new researchers
Publicly funded research data should not be public property
Lack of open access to research data has restricted my ability to answer scientific questions
It is a research-ethical obligation to make data available
Note: We have collapsed the positive statements in the survey, "I strongly agree" and "I agree", and called it "Agree" in the table. Likewise we have collapsed "I strongly disagree" and "I disagree" and called it "Disagree" in the table.

FIGURE 9.1 Attitude towards open access to research data ("Open access to research data will enhance research"). Share answering "Agree". Groups: universities and university colleges, research institutes, health trusts (hospitals). Note: We have collapsed the positive statements in the survey, "I strongly agree" and "I agree", and called it "Agree".

When looking across disciplines, respondents in agriculture and fishery are less positive and respondents within humanities are most positive. As illustrated in Figure 9.2, only 2 percent within humanities disagreed with the statement that open access to research data will stimulate more research collaboration, while the number is 14 percent within agriculture and fishery. Respondents from agriculture and fishery, alongside those within social sciences, were also the most undecided. As many as 27 percent within social sciences and 24 percent within agriculture and fishery declared themselves undecided as to the statement "open access to research data will stimulate more research collaboration".

FIGURE 9.2 Attitude towards open access to research data ("Open access to research data will stimulate more research collaboration"). Answer categories: agree; disagree; undecided. Groups: humanities, agriculture and fishery, mathematics and natural science, medical science, social science, technology. Note: We have collapsed the positive statements in the survey, "I strongly agree" and "I agree", and called it "Agree" in the figure. Likewise we have collapsed "I strongly disagree" and "I disagree" and called it "Disagree" in the figure.

Differences across professional experience are negligible and thus not reported here.

9.4 Researchers share their research data, but upon request

Many researchers use research data generated by others. Even more researchers support the idea of sharing. Logically, one would expect that many researchers also share research data with other researchers.

The survey confirms that most researchers share data with other researchers. Only 16 percent of the respondents stated that most of their research data is not available to other researchers. Further, 16 percent of the generated research data is available to everyone, while 12 percent of the generated research data is only available to other researchers. About half of the respondents state that their data is available to other researchers, but only upon request or under certain conditions. Researchers typically prefer to keep track of who is accessing their data and for what purpose. Consequently, each researcher becomes a gatekeeper for her own data.

There are many reasons for being more restrictive about one's own data in practice than in principle. One is to ensure that data is understood and used correctly. One researcher commented that:

Generally, there is no big impediment against sharing my research data. I feel, however, that in most cases it is best done on a case-by-case basis upon a personal request because this allows me to give adequate explanation of the data and will best ensure that my contribution is adequately acknowledged.

TABLE 9.2 Data availability ("Which of the following applies to the accessibility of most of your research data?"). Columns: Freq., Pct.
Rows:
Data is available to all
Data is available to other researchers
Available for other researchers, but only upon request
For other researchers, but under a license or non-disclosure agreement
Could be made available with appropriate changes
Data is not available
Other
Total

9.5 More openness within humanities

There seems to be more openness towards data sharing within humanities compared to other research disciplines. In medical sciences and social sciences, 12 percent report that they generate data that is readily available to others. The corresponding share within humanities is one third (see Figure 9.3). Indeed, one might argue that the research data and sharing possibilities are fundamentally different between the medical sciences and the humanities.

For example, and as illustrated in figure 7.1, 76 percent of data generated within the medical sciences is numerical data and 12 percent textual records. This is in contrast to humanities, where 14 percent is numerical data and 48 percent consists of textual records.

The comparison is perhaps more interesting between social sciences and humanities. The two have an equal share of respondents generating textual records, though they are somewhat different when it comes to numerical data. These two disciplines differ in their approaches to the sharing of data. Researchers in humanities are somewhat more inclined to share data unconditionally than their colleagues in social sciences.

Reporting that most data is not available is more common within medical and social sciences, at 24 and 25 percent respectively. Only 6 percent within mathematics stated that most of their research data is not available. For agriculture and fishery, humanities and technology, the share is between 12 percent and 15 percent.

FIGURE 9.3 Data availability across research discipline ("Which of the following applies to the accessibility of most of your research data?"). Answer categories: data is available; data is available for other researchers; data is available on demand; data is not available. Groups: humanities, agriculture and fishery, mathematics and natural science, medical science, social science, technology. Note: The statement "Data is available on demand" consists of the following possible answers: "Available for other researchers, but only upon request", "For other researchers, but under a license or non-disclosure agreement", "Could be made available with appropriate changes".

9.6 More openness among more experienced researchers

Researchers with more experience appear more confident in sharing data. Among the respondents with more than 20 years of research experience, 19 percent report that their data is free to use. In comparison, 11 percent of the less experienced researchers make their data freely available.

FIGURE 9.4 Data availability across experience ("Which of the following applies to the accessibility of most of your research data?"). Answer categories: data is available; data is available for other researchers; data is available on demand; data is not available. Groups: years of research experience, from less than 3 years to more than 20 years. Note: The statement "Data is available on demand" consists of the following possible answers: "Available for other researchers, but only upon request", "For other researchers, but under a license or non-disclosure agreement", "Could be made available with appropriate changes".

10 Lack of time, infrastructure and incentives hamper further sharing of data

Although researchers already share data with other researchers, the study indicates that there is potential for more sharing. Former studies and interviews point to a number of barriers to further sharing of research data. A central objective of the study is to identify the main barriers to more sharing of research data in Norway. This chapter presents the main findings on barriers and obstacles to the sharing of research data in Norway and possible ways to reduce these barriers.

10.1 Variety of barriers

TABLE 10.1 Main barriers towards sharing research data ("Do you see any challenges in making more of your research data available for other researchers?" Maximum 3 answers). Ordered by frequency. Columns: Frequency, Pct.
Rows:
Preparing data for open access takes away valuable time for research
Lack of technical infrastructure
Reduce possibilities of future scientific publications
I am afraid other researchers will not understand my data
I cannot give access due to sensitivity issues
I cannot give access due to shared ownership
I don't know
I am afraid data will be misused
I cannot give access due to intellectual property rights
Open access to research data might have a negative economic impact for me and my institution
It would be unethical
I cannot give access due to commercial issues
I do not believe my research data is of interest to others
I do not believe data is secure at a data centre, journal site or alike
Other
Total

As expected, the survey documents that there is a variety of obstacles to sharing research data. Some research data cannot be shared due to issues of privacy, commercial issues or shared ownership. These aspects are, however, not the most important barriers.

The time involved is a main barrier to sharing data. Almost one-third of the respondents pointed out that

preparing data for open access takes away valuable time for research. One-quarter of the respondents pointed out that lacking technical infrastructure is a challenge for the sharing of data. One way of reducing the time constraint would be to improve the technical infrastructure. Further, many researchers are concerned that sharing data would reduce their possibilities regarding future scientific publications (25 percent). More than 20 percent of the respondents were afraid that others would not understand their data. Only one-fifth stated that they could not share data due to sensitivity issues or because of shared ownership of the data. These findings support those of other international surveys, such as Kvale (2012) and Tenopir (2011).

Respondents who chose the category "other" were given the opportunity to make additional comments. The following comments were made.

Some respondents pointed out the risk of others not understanding their data, and that it would require a significant effort to set up metadata such that others would understand them:

I am in the process of making my research data as public as possible. This takes a lot of time, and although I can't see any problem with it, there are little rewards except scientific/ethic satisfaction.

Preparing data would be very time consuming. Research projects are often under financed and setting up data and metadata to enable open access take extra time usually spent in the last part of the project when the project run out of time and money.

Preparing data so others can easily use it in the right way takes a lot of time. Often this time is not budgeted for and therefore the necessary data preparation is not possible within the given time frame for a project without taking away research time, using additional funding or using private time. / Making data available without a sufficiently detailed description of methods and the data generation may lead to misinterpretations of data and possibly wrong use of data.

Many respondents focused on how open access could reduce their chances of producing scientific publications, arguing that researchers should have exclusive ownership of their data for an extended period of time:

Data can be made available to others, but only after our institutions have had a reasonable period (3 years, for example) to analyse and publish in order to justify the high costs for data gathering. Or else, no institution will pay for data gathering.

In general I may want some time of non-access (say 1-3 years) giving us, the researchers carrying out the project, the possibility of presenting results/documentation first, but then, afterwards, I would be thrilled if others would apply my/our data for re-analysis or new types of analyses. / We do try to support master students when they request use of our data, and I would also try to support other researchers in case of requests. / I do not know about funding of open access activity, thus any such activity will imply problems for my/our hour list.

The limited resources and funding available for long term field experiments requires very large extra input of labour from scientists as compared to

50 the actual hours we are paid for. One of the few incentives to continue to do this is to collect and have unique access to the data. If anyone can use the same data without contributing to the work that is required in writing applications, designing and setting up the experiments and collecting the actual data, a large part of the incentive for research is gone. Then what is left is lots of hard labour with very low hourly wages and limited credit for the ideas or the results - who wants that? Such a situation comes through as very unfair. Some also comment on data sensitivity. The example below shows that it is not only a question of how sensitive your data are - it is also a question of respect for the informants: Most of my data could in principle be made available. Because of what I have informed my respondents about, the purpose of the study, and what the data is going to be used for, it would not be ethically defendable to share the data with others for other purposes than originally planned Relatively small differences across sector The survey does not indicate significant differences between respondent groups in terms of observed barriers. This section summarized the observed differences. Time constraints is a less important barrier for respondents working at health trusts than other types FIGURE 10.1 Main barriers for increased sharing of research data ( Do you see any challenges in making more of your research data available for other researchers? Maximum 3 answers). Across sector. Only includes the five major obstacles. 
40% 35% 30% 25% 20% 15% 10% 5% 31% 35% 16% 23% 24% 28% 26% 26% 23% 26% 18% 19% 20% 14% 28% 0% Making data available takes away valuable time for research Lack of technical infrastructure Open access would reduce possibilities of scientific publications Concerns connected to misinterpretation of data Cannot give access due to sensitivity issues Univerities and university colleges Research institutes Health trusts (hospitals) 50 SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA DAMVAD.COM

51 of institutions. Researchers in health trusts are more concerned about the sensitivity of the data and the lack of infrastructure. Figure 10.1 shows the results. Researchers at institutes and universities are on the other hand more concerned with time, but also with the risk that others might misinterpret their data. Within humanities and medical science, the respondents are not particularly concerned about the misinterpretation of their data. within humanities, mathematics and the natural sciences. Researchers in these disciplines are significantly more concerned about this challenge than are researchers in social sciences and technology. Sensitivity issues was the key reason for not being able to share data within social sciences and health science. Figure 10.2 shows the results. Time is especially scarce for respondents within agriculture and fishing as well as those within mathematics and natural science. Some differences across disciplines When looking across disciplines, lack of technical infrastructure constitutes a particular challenge FIGURE 10.2 Main barriers for increased sharing of research data ( Do you see any challenges in making more of your research data available for other researchers? Maximum 3 answers). Across research discipline. Only includes the five major obstacles. 45% 42% 40% 39% 39% 38% 35% 30% 25% 20% 15% 10% 5% 27% 21% 31% 29% 33% 31% 29% 25% 26% 24% 21% 20% 18% 16% 11% 26% 26% 26% 25% 23% 18% 18% 13% 8% 7% 5% 0% Making data available takes away valuable time for research Lack of technical infrastructure Open access would reduce possibilities of scientific publications Concerns connected to misinterpretation of data Cannot give access due to sensitivity issues Humanities Agriculture and fishery Mathematics and natural science Medical science Social science Technology SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA DAMVAD.COM 51

Younger researchers more concerned with sensitivity

In terms of professional experience, two differences are worth noting. First, the less experienced respondents did not think of time as a challenge - they might be more familiar with technology and various solutions for sharing files. The results are shown in figure 10.3.

Second, the less experienced respondents were more alert to possible sensitivity issues concerning their data. This seems to be less of an issue for the more experienced respondents. The difference might reflect the less experienced respondents' lack of experience with legal issues or a fear of misuse. Otherwise, the survey suggests small differences in the perceived barriers and challenges for sharing data across years of experience.

FIGURE 10.3 Main barriers for increased sharing of research data, across experience ("Do you see any challenges in making more of your research data available for other researchers?" Maximum 3 answers). Only includes the five major obstacles.
Categories: Making data available takes away valuable time for research; Lack of technical infrastructure; Open access would reduce possibilities of scientific publications; Concerns connected to misinterpretation of data; Cannot give access due to sensitivity issues.
Series: bands of professional experience, from less than 3 years to more than 20 years.

Textual records are more sensitive

This section discusses variations in responses across different data formats and the challenges the respondents foresaw.

In figure 10.4, we can see that textual records more often involve sensitive data. There are also stronger concerns as to whether textual records will be misinterpreted. The context of textual records is thus more important than it is for numerical scores. Furthermore, the respondents stated that they could not give access to textual records due to sensitivity issues.

On the other hand, there are time issues related to making numerical scores available. Moreover, those respondents mainly working with numerical scores stated that open access to research data would reduce their possibilities for future scientific publications.

FIGURE 10.4 Main barriers for more sharing of research data, across data type ("Do you see any challenges in making more of your research data available for other researchers?" Maximum 3 answers). Only includes the five major obstacles.
Categories: Lack of technical infrastructure; Concerns connected to misinterpretation of data; Cannot give access due to sensitivity issues; Making data available takes away valuable time for research; Open access would reduce possibilities of scientific publications.
Series: Numerical scores; Textual records; Images, sounds, videos and graphics.

Researchers see little support from management

The survey shows that there is a perceived lack of support for open access to research data from management.

There has been little focus on implementing open access to research data at the organizational level. Less than 50 percent of the respondents reported that management encouraged open access to research data.

Limited institutional support

Less than one-fifth (16 percent) of the respondents reported that their organization provided training on best practices for sharing research data. Guidelines and standards existed for 27 percent of the respondents. Finally, 30 percent stated that their organization provided tools, technical support and infrastructure facilitating open access to research data.

One of the respondents highlights the problem with the following quote:

"Institutions usually focus on measurement of performance through amount of publications and coarse counting of results, leaving aspects of collecting, investigation of, archiving and handling of data to be the responsibility of individual researchers/groups. There seem to be little focusing on long term archiving and data handling."

Our study suggests significant differences in the way research management deals with the sharing and archiving of data. Management at research institutes facilitates open access to research data to a larger extent than is the case at universities and health trusts. Around 56 percent of the respondents within research institutes stated that their management, either to a high or to some degree, "encourage[d] that our [the respondent's] research data should be open". Only 30 percent of the respondents from health trusts stated that their management encouraged open access to research data.

TABLE 10.2 Does management or the organisation support open access to research data? ("To what extent do you experience that open access to research data is implemented in your organization?")
Columns (frequency and percent for each): To a high degree; To some extent; Not at all; I do not know.
Rows: Management encourage that our research data should be open; My organization provides training on best practice for open access to research data; My organization has guidelines and standards for data format and for assigning information to data; My organization provides the necessary tools, technical support and technical infrastructure for open access to research data.
(Cell values are not recoverable from the source.)

FIGURE 10.5 Does management encourage open access to research data? ("Management encourage that our research data should be open")
To a high degree: Universities and university colleges 9%; Research institutes 16%; Health trusts (hospitals) 6%.
To some extent: Universities and university colleges 30%; Research institutes 40%; Health trusts (hospitals) 23%.

Research institutes have the best preconditions for sharing research data. The survey shows that 39 percent of respondents at research institutes stated that their organization, either to a high or to some extent, provided the necessary tools, technical support and technical infrastructure for open access to research data. At universities the share is 27 percent, and at health trusts it is 18 percent.

Very few answered that their organization provided the necessary solutions for open access to a high degree. A high share of respondents stated either that they did not know whether their organization provided the necessary solutions or that it did not do so.

FIGURE 10.6 Does the organisation provide support and solutions for open access to research data? ("My organization provides the necessary tools, technical support and technical infrastructure for open access to research data")
To a high degree: Universities and university colleges 3%; Research institutes 6%; Health trusts (hospitals) 2%.
To some extent: Universities and university colleges 24%; Research institutes 31%; Health trusts (hospitals) 16%.

There are small or no differences across professional experience. The figures for professional experience are therefore not presented in this report.

10.6 Researchers call for better infrastructure, citation systems and guidelines

The survey also asked the respondents about possible measures that would facilitate more sharing of research data. These answers constitute important input to recommended actions that can facilitate open access to research data.

Respondents most often point to better infrastructure as a way to increase access to research data. Further, respondents state that the implementation of a citation system would facilitate increased sharing and availability of data. A better citation system for research data would make it easier to give credit to researchers preparing and generating data.

In addition to increased funding, respondents highlighted various aspects of formal competence and technical support. The respondents called for the implementation of guidelines, standards and more training on open access to research data in order to increase sharing of data. These proposed solutions are in line with the challenges identified, especially the time constraints cited earlier.

TABLE 10.3 Solutions to facilitate increased sharing of data ("What efforts would make open access to publicly funded research data more interesting for you?" Maximum 3 answers)
Columns: Solutions for increased sharing of data; Frequency; Pct.
Rows, in the order given in the source: Better infrastructure for open access to research data; Implementation of a system for citation; More resources allocated for open access to research data activities; Implementation of guidelines; More training on open access to research data; Implementation of standards; Don't know; Make open access to research data an indicator in the funding scheme; Guidelines to how long I can retain ownership to data before sharing; Make it mandatory to explain how data will be made available; Not allowed to share anyways; Total.
(Cell values are not recoverable from the source.)

Few differences across sectors

As for the proposed solutions, we see few differences across sectors, with one important exception. The respondents based at health trusts are particularly concerned with the need for guidelines: they more often state that guidelines for open access to research data would facilitate increased sharing of research data. This is perhaps not surprising, given their emphasis on the sensitivity of data. Guidelines could focus on how to share sensitive data and on what kinds of data can be shared. As noted by some respondents, education might also cover how to handle open access to research data.

FIGURE 10.7 Solutions to facilitate increased sharing of data, across sector ("What efforts would make open access to publicly funded research data more interesting for you?" Maximum 3 answers)
Categories: Better infrastructure; More training; Implementation of a citation system; More resources; Implementation of guidelines for open access.
Series: Universities and university colleges; Research institutes; Health trusts (hospitals).

A blurred picture across discipline

In general, infrastructure is the dominant proposed solution, but it is more important for some research disciplines than for others. Respondents within technology and within mathematics and natural sciences, with 46 and 45 percent respectively, point to infrastructure as the most important solution.

For humanities, training and to some extent guidelines would do more to make open access to research data more interesting. Around 30 percent within humanities point to training and guidelines. Within mathematics and natural science the corresponding shares are 16 percent and 19 percent.

Agriculture and fishery have a stronger focus on developing citation systems than other disciplines. In figure 10.8, we see that 46 percent within agriculture and fishery state that citation is an important factor for making open access to research data more interesting for them, whereas this is the case for only 30 percent within humanities and 33 percent within social science.

FIGURE 10.8 Solutions to facilitate increased sharing of data, across research discipline ("What efforts would make open access to publicly funded research data more interesting for you?" Maximum 3 answers)
Categories: Better infrastructure; More training; Implementation of a citation system; More resources; Implementation of guidelines for open access.
Series: Humanities; Agriculture and fishery; Mathematics and natural science; Medical science; Social science; Technology.

Little difference across experience

Interestingly, but not surprisingly, the less experienced researchers favour training. They want to learn how to share data. More experienced researchers are more concerned with the lack of resources. In figure 10.9 we see that around 30 percent of the more experienced researchers point to more resources as a way to make open access to research data more interesting, compared with only around 11 percent of the inexperienced researchers. Further, around 20 percent of the more experienced researchers point to more training as a measure for making open access to research data more interesting, whereas the share is about 30 percent among the less experienced researchers.

Across all levels of experience, infrastructure and a citation system are the measures researchers most prefer for making open access to research data more interesting.

FIGURE 10.9 Solutions to facilitate increased sharing of data, across experience ("What efforts would make open access to publicly funded research data more interesting for you?" Maximum 3 answers)
Categories: Better infrastructure; More training; Implementation of a citation system; More resources; Implementation of guidelines for open access.
Series: bands of professional experience, from less than 3 years to more than 20 years.

10.7 Researchers working internationally find time to be a bigger challenge

We hypothesized that researchers working in different settings have different perceptions of the barriers to sharing data. We asked each of the respondents to state the proportions of their working hours where they:
- Work alone
- Work in collaboration with colleagues within their institution
- Work in collaboration with researchers at international institutions

Working mainly alone was defined as the researcher working alone more than 40 percent of their time. The same threshold delimits researchers collaborating with others within their own institution. Finally, we defined researchers working internationally as those spending more than 20 percent of their time collaborating with researchers at international institutions.

FIGURE Main barriers for increased sharing of research data, across researchers' way of working ("Do you see any challenges in making more of your research data available for other researchers?" Maximum 3 answers).
Categories: Making data available takes away valuable time for research; Lack of technical infrastructure; Open access would reduce possibilities of scientific publications; Concerns connected to misinterpretation of data; Cannot give access due to sensitivity issues.
Series: Alone (> 40 pct.); Collaboration within the institution (> 40 pct.); International collaboration (> 20 pct.).
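The grouping rule described above can be illustrated with a short sketch. It is not the report's actual analysis code: the field names and group labels are hypothetical, and the sketch assumes that a respondent may fall into more than one group (or none), since the report does not say whether the groups are mutually exclusive. Only the thresholds (more than 40, 40 and 20 percent) come from the text.

```python
from dataclasses import dataclass
from typing import List

# Thresholds as stated in the report ("more than" is read as strictly greater).
ALONE_THRESHOLD = 40.0
WITHIN_INSTITUTION_THRESHOLD = 40.0
INTERNATIONAL_THRESHOLD = 20.0


@dataclass
class WorkingTime:
    """Self-reported shares of working hours in percent (hypothetical field names)."""
    alone: float
    within_institution: float
    international: float


def working_modes(t: WorkingTime) -> List[str]:
    """Return every group a respondent falls into.

    Assumption: groups may overlap, so a list is returned rather than
    a single label. An empty list means no threshold was exceeded.
    """
    modes = []
    if t.alone > ALONE_THRESHOLD:
        modes.append("alone")
    if t.within_institution > WITHIN_INSTITUTION_THRESHOLD:
        modes.append("within_institution")
    if t.international > INTERNATIONAL_THRESHOLD:
        modes.append("international")
    return modes
```

For example, under these assumptions a respondent reporting 45 percent alone, 30 percent within the institution and 25 percent internationally would be counted both as working mainly alone and as working internationally.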

One might expect researchers' attitudes towards sharing data to differ with their way of working. However, we find little evidence that this is the case.

The main difference between researchers' ways of working concerns the challenge that making data available takes away time for research. The respondents mainly working alone did not see this as a big issue. On the other hand, the respondents collaborating internationally stated that this was the most important issue. There is an obvious explanation for this. Whereas researchers mainly working alone share data to a lesser extent than others do, they face fewer issues in sharing their data.

The researchers collaborating internationally, on the other hand, have to deal with international standards and practice. With few existing standards, multiple archiving solutions and different legislation across borders, the time involved in sharing data is a significant challenge.

FIGURE Solutions to facilitate increased sharing of data, by how researchers work ("What efforts would make open access to publicly funded research data more interesting for you?" Maximum 3 answers)
Categories: Better infrastructure; More training; Implementation of a citation system; More resources; Implementation of guidelines for open access.
Series: Alone; Collaboration within the institution; International collaboration.

10.8 Researchers welcome data sharing as a part of publishing

We have seen that the respondents would like to use data generated by others, but that they lack the incentives to make their own data available for others. Many scientific journals increasingly require that data be made available as a part of the publishing process, but only 11 percent of researchers have already experienced this practice.

Making data available through scientific journals could lead to increased interest from other researchers. As illustrated below, 50 percent see that an increased focus on making data available as a part of scientific publications could mean that their research becomes more interesting for others to follow.

Another positive outcome would be that the research could be quality assured. This is important for every researcher, as their research aims at creating new knowledge based on solid evidence. Roughly half the respondents (54 percent) agreed with this. Slightly less than thirty percent stated that their data and publications would be cited more.

Interestingly enough, 11 percent did not see any benefits in making data available as a part of scientific publications. If we also set aside the 9 percent who did not know, the remaining 80 percent can be considered positive towards making data available as a part of scientific publications.

TABLE 10.4 Researchers welcome sharing data as part of scientific publications ("Do you welcome the trend of making data available as a part of scientific publications?")
Columns: Frequency; Pct.
Rows: Yes, it could mean that my research could be more interesting for others to follow; Yes, it is a sign that my research can be quality assured; Yes, it could mean that my data and/or my publications will be more cited; No, I see no benefit for me; I do not know; Other; Total.
(Cell values are not recoverable from the source.)

11 Main findings and recommendations

The objective of this study has been to gain a better understanding of researchers' current practice in sharing and archiving research data. In addition, we analyse the various fears and barriers involved from a researcher's point of view and how these barriers might be overcome. This chapter summarizes the main findings and recommendations for the Research Council of Norway.

Main findings

Researchers share data

Our study shows that Norwegian researchers frequently use and share research data with each other. As many as 64 percent of researchers have used research data from other researchers over the last three years. Researchers mostly use research data generated by other researchers from their own institution, but only slightly more often than data from researchers at institutions outside Norway.

Potential to increase sharing of research data

About one-third (36 percent) of the researchers have not used data gathered by other researchers. Of these, 71.5 percent reported that they would like to make use of other researchers' data. This indicates a clear, untapped potential for increased and improved sharing of research data. Only one in ten respondents had neither used research data generated by other researchers over the past three years nor wished to use this type of data.

Researchers see benefits in sharing their data

The survey confirms that researchers in Norway see benefits in sharing and archiving their research data. Around 80 percent of the respondents agreed that open access to research data enhances research and that it is an ethical obligation to make their data available for validation. These are also the two reasons for open access to research data that most researchers agreed with. Further, 77 percent agreed that open access to research data facilitates the education of students and new researchers, and 74 percent that it stimulates research collaboration.

Many researchers are undecided

Although most researchers agree as to the benefits of sharing their data, many researchers are undecided as to whether publicly funded research data should be considered public property. The remaining 20 percent, who did not agree that open access to research data would enhance research, comprised 15 percent who were undecided and about 5 percent who disagreed. This large proportion of undecided respondents may reflect the complexity of the issue.

Additionally, more than 600 of those invited actively decided not to participate in the survey. This reluctance to participate might also be seen as an indication that questions regarding open access to research data are perceived as being irrelevant to the individual (i.e., he or she is not an active researcher) or that he or she is undecided on the issue. These 600 non-respondents correspond to 30 percent of the actual respondents. If they are regarded as undecided, the share of researchers without a clear position on the issue of open access to research data is quite significant.

The survey included an open answer option where respondents could write free text. Inputs in this section show that many researchers find the issue of open access to research data challenging and complex. Many researchers are clearly positive towards sharing, but many are also negative, as we have tried to show in this report.

Researchers want to remain in control of their data

Most researchers share their research data with other researchers. Yet research data is generally shared under certain conditions (e.g., only upon request, under a non-disclosure agreement, in an anonymized format). Researchers want to control who gets access to their data and how it is used. With each researcher setting the terms, there is a risk that he or she becomes a gatekeeper.

Contrary to our hypothesis, we did not find any major differences across sectors, fields of research or years of professional experience.

Archiving data on local computers and institutional servers

The study found that 85 percent of the respondents archived their data on their own devices or on an institutional server. This figure does not vary across sectors, disciplines or professional experience, which is something of a paradox, since storing data on one's own devices cannot be regarded as the most secure means of storage. This is especially apparent insofar as many of the respondents were concerned about security and the sensitivity of their data.

Lack of proper infrastructure and incentives for sharing

When asked about the barriers to sharing more of their data, the researchers identified the following central barriers:
1. Preparing data for open access takes valuable time away from research activities.
2. Respondents do not have adequate technical infrastructure.
3. Open access to research data might reduce researchers' possibilities for future scientific publications.

These responses indicate, inter alia, that researchers lack adequate and user-friendly infrastructure, guidelines and procedures, and certainty about immaterial rights, all of which they need in order to embrace the idea of sharing data.

Researchers see few initiatives from their management

The survey responses suggest significant differences in the way in which research managers deal with the sharing and archiving of data. Consequently, researchers see a need for greater institutional support. Only 16 percent (within the research institutes) and 6 percent (within health trusts) perceived that their management encouraged them to share data to a significant extent. Moreover, only six percent (within the research institutes) and two percent (within health trusts) perceived a significant degree of solutions and technical support for sharing their data.

There is a need for better infrastructure and a credit system for researchers

The survey indicates a strong relationship between the major barriers to sharing data and the researchers' proposed solutions to overcome them. The flip side of these barriers is a set of possible solutions. These include:
1. Better infrastructure.
2. Implementing a system for citation.
3. Implementing guidelines, training and standards for sharing data.

Again, we find very limited differences across sectors, disciplines and professional experience.

Recommendations

Previous studies suggest that there are multiple obstacles and, hence, no single solution to increase the sharing and archiving of research data. Yet we will present some recommendations here and in figure 11.1.

FIGURE 11.1 Problems, solutions and recommendations

Former studies, as well as this analysis, suggest that there is a need for work directed at the level of researchers, research institutions and research funders, as well as at government and international levels. Initiatives need to take place in parallel. For example, taking action to make more researchers share data without the proper infrastructure will most likely prove counterproductive. Thus, there is a strong need for a coordinated effort. We see that the Research Council of Norway can play a key role in promoting open access to research data in Norway.

Raising awareness

The sharing and archiving of research data entails many obstacles and questions that need to be answered. Many respondents were undecided or did not wish to participate in the survey. This might suggest that researchers consider sharing and archiving of research data a complex and difficult topic.

We would suggest that the Research Council of Norway actively work to raise awareness of the issue, covering both the benefits and pitfalls of archiving and sharing research data. In particular, it is important to exemplify potential opportunities and value, inter alia by using best practice cases. Emphasis should be on showing that sharing and archiving is worthwhile for researchers.

In this respect, there also seems to be a need for clarity as to the differences between archiving and open access to research data. The archiving process does not necessarily imply full open access to research data for all - it should be considered a premise for sharing data. Even if the two matters differ, they are closely linked, and should be seen in relation to one another.

Most researchers use other researchers' data. Most researchers are also willing to let others reuse the data they have generated if certain conditions and restrictions are fulfilled. The researchers are the ones who gather and analyse the data, and who will archive and share the data in the end. Researchers want to know what happens to their research data. As such, it is important to raise awareness among researchers.

However, the study also indicates that researchers need support, and many do not see this support from their management. Thus, it is also important to raise awareness at the institutional level.

Giving credit as well as responsibility to researchers

The study indicates that a lack of incentives and credit for gathering data is a barrier to increased sharing of research data. These findings correspond to findings in former national and international studies (e.g., Kvale, 2012). The respondents would be more willing to share data if they received credit for their data generation work.

One obvious way of crediting researchers would be to support the implementation of a citation or reference system for data. Accreditation is an important motivation for researchers.

References can be seen as a kind of normative payment (Ingwersen, 2011, "Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure"). There is no well-established citation system for research data in Norway, which gives researchers few incentives to set aside time to prepare data for sharing. The lack of a well-established citation system is also an international issue. We thus see a benefit in coordinating such systems at the international level. Ideally, the system should be easy to use and work alongside existing systems for publishing.

Tenopir (2011) suggests promoting good sharing practices among researchers. For example, obtaining copies of articles that use a researcher's data is one condition that would encourage sharing and promote best practice.

The Council could also introduce some kind of requirements on researchers. Lord et al. (2006) studied large-scale data sharing in the life sciences based on ten case studies, and found that a laissez-faire approach to the collection and distribution of data results in waste, as such data will not contain sufficient information to enable re-use. A key recommendation from Lord et al. (2006) is to insist on a data management plan that clearly defines responsibilities and goals, together with awareness of the needs and practices of data management.

The Research Council can introduce a requirement for data management plans as a part of the traditional application procedure. It is also possible to make sharing of research data a part of the financial system for basic funding. Further, there could be a system in which the Research Council of Norway withholds funding until data is properly shared and archived. We do not recommend implementing such stringent measures at the current stage, as this would require considerable work on their design and on putting the proper infrastructure in place. Without proper guidelines and a sound infrastructure, such a system could be counterproductive.

In recent years, research communities have been left to establish their own methods and practices for sharing and archiving their research data. We are concerned that this leads to a suboptimal organization of solutions. As stated earlier, we do not find large differences across sectors or research disciplines. Hence, we cannot support arguments for designing tailored solutions for each specific sector or individual research discipline. Yet the work must still include all research communities, as they have the knowledge and will have to implement the proposed strategies and solutions.

Guidelines, rules and best practice

Our study suggests that many researchers lack knowledge as to what data to share and archive. In addition, researchers lack knowledge as to what form the data should have, and how proper information about the data should be assigned. Thus, the study suggests a need for better guidelines, standards and education relating to sharing and archiving research data. Such guidelines and standards should be developed in close interaction with researchers, institutions and legal experts. We recommend that the implementation of guidelines and standards be inspired by work initiated internationally, to avoid creating a Norwegian bureaucracy alongside international standards.

One way of promoting the use of shared data would be to create a solid and informative platform for metadata. A metadata platform can be a low-key activity, as it can be seen as a first step towards more complex infrastructure solutions. In addition, we perceive that many researchers are not aware of the possibilities for accessing data gathered by other researchers. Better metadata can overcome this issue. Finally, it would lay the ground for increased sharing of research data.

Debate on infrastructure investment should involve all relevant stakeholders, while ensuring a robust infrastructure that will serve the needs of the future. We are somewhat cautious as to the design and scale of such a system, because it is a matter of cost and benefit. We thus see that more information on ambitions is needed.

We would also suggest starting to work on data selection (i.e., on defining which data are worthwhile and which are not). Even though our study does not suggest any major differences in practices and barriers across research disciplines and sectors, the open answers indicated a strong need for better understanding and guidance as to which data to archive and share, and in which form to do so. In particular, researchers who mainly use textual data (e.g., interviews) have difficulties deciding which data to share and preserve.

An ideal data infrastructure for scientific research would have a long list of technical characteristics. We refer to the wish list included in the EC white paper on scientific data, Riding the Wave.

Infrastructure and funding

Interviews and studies both suggest that the infrastructure for the sharing and archiving of data is fragmented, overlapping and inadequate. Many are satisfied with the current archiving solutions, yet researchers seem to archive most of their data on their own institutional servers or local storage devices. We found no differences across sectors or research disciplines on the topic of storage.

Given the large share of data stored locally, there is clearly a need for better infrastructure solutions. Better infrastructure could increase the motivation for archiving data at data archiving centres, which could provide more secure means of archiving data and make data easier to restore.

TEXTBOX 11.1 A WISH LIST FOR E-INFRASTRUCTURE

- Open deposit, allowing user-community centres to store data easily
- Bit-stream preservation, ensuring that data authenticity will be guaranteed for a specified number of years
- Format and content migration, executing CPU-intensive transformations on large data sets at the command of the communities
- Persistent identification, allowing data centres to register a huge amount of markers to track the origins and characteristics of the information
- Metadata support to allow effective management, use and understanding
- Maintaining proper access rights as the basis of all trust
- A variety of access and curation services that will vary between scientific disciplines and over time
- Execution services that allow a large group of researchers to operate on the stored data
- High reliability, so researchers can count on its availability
- Regular quality assessment to ensure adherence to all agreements
- Distributed and collaborative authentication, authorisation and accounting
- A high degree of interoperability at format and semantic level

Adapted from the PARADE (Partnership for Accessing Data in Europe) White Paper (2009). Partnership for Accessing Data in Europe (PARADE) is a consortium aiming to build efficient services addressing the data management needs of multiple research communities. Strategy for a European Data Infrastructure (White Paper) was published in October 2009.

Appendix

Participants at workshop on open access and data management, Research Council of Norway, October 25th, 2013

- Øystein Godøy, Norwegian Meteorological Institute
- Dagmar Langeggen, BI Norwegian Business School
- Andreas Jaunsen, UNINETT Sigma
- Koenraad De Smedt, University of Bergen
- Vigdis Kvalheim, Norwegian Social Science Data Services (NSD)
- Olav Hagen Sataslåtten, The National Archives' Central Office
- Frode Arntsen, BIBSYS
- Helge Sagen, Institute of Marine Research
- Per Magnus, The Norwegian Institute of Public Health
- Terje Risberg, Statistics Norway
- Dag Undlien, University of Oslo and Oslo University Hospital
- Jan Bjaalie, University of Oslo
- Live Kvale, University of Oslo
- Asbjørn Mo, Research Council of Norway
- Roar Skålin, Research Council of Norway
- Inngunn Sagebø, Research Council of Norway
- Øystein Godøy, Research Council of Norway
- Siri Lader Brun, Research Council of Norway

Additional interviews

- Øystein Godøy, Norwegian Meteorological Institute
- Per Magnus, The Norwegian Institute of Public Health
- Gunnar Simonsen, University hospital of Tromsø
- Bjarne Strøm, Norwegian University of Science and Technology
- Helge Sagen, Institute of Marine Research
