Collaborating on an exploratory basis to trial the ‘CASRAI approach’ in the UK

João Mendes Moreira (Head of Scientific Information at FCT – Foundation for Science and Technology in Portugal) was a member of the Organisational Identifiers (Org Id) working group and, as part of this group, contributed to the review of organisational identifiers. Following the production of the review and its recommendations, FCT|FCCN, within their PTCRIS programme, looked at the recommendations to help address their own issues around organisational identifiers.

The PTCRIS action plan is based on the Org Id review, and its goal is to reconcile international initiatives, which aim to address Org Id in a more comprehensive and sustainable way, with national efforts. They believe that, independently of international efforts, there are a number of basic things that they should address at the national level. Their roadmap is summarised in the following image and based on the recommendations from the Org Id study:

They aim to create a national registry that will synchronise with international authoritative databases (e.g. ISNI, Ringgold). In order to do so, they are gathering information on rules, procedures and support services regarding national registries.

As part of this approach they have developed a survey. The following text provides further background information. Please take the time to complete this survey and note that the closing date is 18 January:

Fundação para a Ciência e a Tecnologia (FCT) is the Portuguese funding agency that supports science, technology and innovation, in all scientific domains, under the responsibility of the Ministry for Education and Science. FCT is currently developing and implementing a national Current Research Information System (CRIS) – PTCRIS. Within this challenging programme, we are investigating the adoption of unique identifiers for organizations.

The preparation and definition of the action plan for adoption of unique identifiers for organizations is based on the recommendations of the study produced under the CASRAI-UK Organizational ID study.

PTCRIS is also aware of project THOR’s mission and goals. The PTCRIS team believes that the work being undertaken on organization identifiers will fit well with project THOR’s results.

This survey aims to gather information about rules, procedures and support services regarding organization identifiers (OrgID). Results from the survey will inform the development of a set of procedures and services to manage OrgIDs nationally. These results will, upon request, be made available.

The survey is addressed to funding agencies, registration agencies and data contributors. It should take no longer than 15 minutes to complete and should be completed by 18 January.

Landscape Study
The terms of reference for this study were to interview members of the working group to establish what authoritative lists of organisations involved in UK research are being, or could be, used and undertake a landscape review of organisational identifiers currently used, and for what purpose, in the UK. The study identified the key aspects for an organisational identifier to support the widest range of use cases. These are governance, trust, transparency, temporal (historical) and having the appropriate metadata. When it came to identifying use cases against the core set of identifiers it was decided to look at the highest level first, rather than those more granular cases such as looking at departmental level, and determine which identifiers best satisfied these use cases.

There were 23 identifiers described within the landscape study; almost all of these are internal (they solve a specific requirement within the organisation that has created them, and any additional re-use of these identifiers is unintended). However, the evidence collected shows that some of the examples do go beyond this internal use:

Regulated lists. The focus of these lists is not the identifier, but that inclusion on the list represents some specific status (HEFCE’s list of HE providers for England and Companies House).

Technical systems. These are primarily intended for network interactions, rather than as a governance mechanism (Janet and MACE/UK Federation).

UK Provider Reference Number (UKPRN). Although it is essentially a regulated list, UKPRN has been adopted as an identifier by a range of parties, including HESA and HEFCE.

International Standard Name Identifier (ISNI). ISNI is intended as a bridge identifier: it has no authority of its own, and is intended to be used to link disparate data sources together.

Business information. These sources focus on providing information about the organisations, and an identifier is necessary to do this (Ringgold and Dun & Bradstreet).

Authority lists. These lists are intended to increase harmonisation, but do not have the restrictive nature of a Regulated list (FundRef, VIAF, and ResearchFish).

The study came to the following conclusions:

None of the identifiers investigated fulfils the role of being an “authoritative list” of organisations involved in research. They are all constrained in scope, or not authoritative.

ISNI and UKPRN both have traction, and warrant particularly careful consideration by the working group. UKPRN does not cover the full range of organisations involved in research, is limited to the UK, and does not include departments, but is a robustly managed list that covers a defined subset of organisations very well. The role of the registration agency in ISNI is crucial, and whether the existing agencies offer appropriate services for this domain will need to be considered.

The Research Councils, as major funders of research in the UK, should be closely involved in the development of any new identifier system. At present, ROS, ResearchFish and Gateway to Research all use their own identifiers.

Given the range of existing identifiers, any new identifier system should only be developed and introduced if there is clear evidence of demand, and sufficient buy in to ensure that it is universally adopted.

The authority can remain separate from the identifier (for example, it would be feasible to establish an authority list with appropriate metadata but using the ISNI as the identifier).

Organisational Identifiers Review
Following on from the recommendations of the Landscape Study a further review was undertaken on a core set of identifiers – ISNI, Ringgold, Digital Science and UKPRN. Specifically, it looked to:

clarify a representative but not comprehensive set of use cases for the UK research community to use organisational identifiers (OrgIds);

survey and interview a small number of well-informed people in the field in order to create and prioritise a list of desirable features for the provision of OrgIds and potential services built around them;

check the use cases and these required features against four possible candidate OrgIds and their providers;

inform the Working Group of its conclusions and, if appropriate, make recommendations for adoption by the UK research community.

Before summarising the outputs from this review, it’s worth briefly describing the core set of identifiers chosen:

ISNI: ISNI is an ISO standard, in use by numerous libraries, publishers, databases, and rights management organisations around the world. The ISNI database is built from hundreds of databases worldwide and holds public records of over 8.6 million identities, including 8.24 million individuals (of which 2.25 million are researchers) and 446,000 organisations. The ISNI International Agency (ISNI-IA), in compliance with ISO’s policy and procedures, is designated by ISO as the ISNI Registration Authority. Its charge includes the maintenance and revision of the standard, the responsibility for the central ISNI database and assignment system, and the development of the related activities around the identifier, including contractual relations with the network of ISNI Registration agencies, ISNI members, etc. ISNI is a bridge identifier, designed to provide interoperability between different proprietary identifiers, such as the Ringgold ID.

Ringgold: The Ringgold Identifier was implemented as a key solution in a project undertaken with a major scholarly publisher seeking best practices for the identification and disambiguation of institutional subscribers. Their database contains 400,000 organisation records with organisational identifiers and associated metadata. It’s global and covers all market sectors, including but not limited to, universities, research centres, funders, corporations, non-profit organisations, government entities and organisations, healthcare and hospitals, schools and public libraries.

Digital Science: The Digital Science Institute Database provides global coverage of organisations that feature in the scientific lifecycle. This includes funders, those that receive funding, collaborators, those that publish articles in journals or conference proceedings, or any institution that consumes or produces any kind of scientific artefact such as data or software. It has been developed to provide solutions to typical data integration and scholarly attribution problems experienced across the portfolio of Digital Science companies and is actively used by Figshare, Altmetric, Symplectic Elements, Dimensions for Funders and Symplectic Dimensions for Institutions. The number of organisations indexed is expected to exceed 25,000 by its public release (expected February 2015).

UKPRN: The UK Register of Learning Providers (UKRLP) is a register of legally verified learning providers in the UK. Each verified provider is assigned a unique UK Provider Reference Number (UKPRN). This information is shared across the sector with agencies such as the Skills Funding Agency, the Higher Education Statistics Agency (HESA), the Higher Education Funding Council for England (HEFCE) and UCAS. Registration is optional, so not all learning providers need to register with UKRLP.

The conclusion of the review was that while one single candidate would not fulfil all the criteria, it would be useful to separate the infrastructure element (the provision and maintenance of the OrgID itself) and the service element (the services offered both to registrants and to end users of the services). The most desirable vision for the future would be for ISNI to emerge as a strong, sustainable and internationally well supported baseline or, in their own words, “bridging” ID with a few commercial players, and perhaps some non-commercial ones such as the British Library and HEFCE, acting as registration agencies and holding crosswalks or equivalence tables to their own IDs.
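To make the idea of a "bridging" ID with crosswalks more concrete, here is a minimal sketch in Python. It assumes each organisation has one ISNI acting as the hub, with any number of scheme-specific IDs attached. The class names, scheme labels and identifier values are all hypothetical placeholders and do not reflect the real data model of ISNI, Ringgold or UKRLP.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class OrgRecord:
    """One organisation, keyed by its bridging ISNI (placeholder values only)."""
    isni: str
    name: str
    # Maps a scheme label (e.g. "ringgold", "ukprn") to that scheme's own ID.
    local_ids: Dict[str, str] = field(default_factory=dict)


class Crosswalk:
    """Hypothetical equivalence table that routes every lookup through the ISNI."""

    def __init__(self) -> None:
        self._by_isni: Dict[str, OrgRecord] = {}
        self._by_local: Dict[Tuple[str, str], str] = {}

    def add(self, record: OrgRecord) -> None:
        self._by_isni[record.isni] = record
        for scheme, value in record.local_ids.items():
            self._by_local[(scheme, value)] = record.isni

    def translate(self, scheme_from: str, value: str, scheme_to: str) -> Optional[str]:
        """Return the equivalent ID in scheme_to, or None if no mapping is known."""
        isni = self._by_local.get((scheme_from, value))
        if isni is None:
            return None
        return self._by_isni[isni].local_ids.get(scheme_to)


# Illustrative data only: these are not real identifiers.
cw = Crosswalk()
cw.add(OrgRecord(
    isni="ISNI-PLACEHOLDER-1",
    name="Example University",
    local_ids={"ringgold": "RG-0001", "ukprn": "UKPRN-0001"},
))
print(cw.translate("ringgold", "RG-0001", "ukprn"))
```

Because every scheme maps to the ISNI rather than directly to every other scheme, a new registration agency only needs to maintain one mapping per organisation, which is the appeal of separating the infrastructure element from the service element.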

The Organisational Identifier Working Group accepted the following recommendations:

The Working Group should consider recommending a hybrid approach with ISNI as the backbone. Institutions and others needing to register and use OrgIds should use a solution which relies on and feeds the minimum data set curated by ISNI.

In considering registration solutions and value-added services, organisations should bear in mind that, in the short term, Ringgold is the most developed agency conforming to recommendation 1.

However, we very much hope that soon there will be other service providers working to deliver value added services on top of ISNI and the Working Group should do what they can to encourage such competition by, for example, Digital Science, who should consider the possibility of acting as a registration agency for ISNIs in a similar way to Ringgold.

Jisc should investigate the possibilities and costs of a bulk deal for UK academic institutions for value added services with Ringgold and (in time) with other service providers.

We understand that the Bibliothèque nationale de France (BnF) has recently become a registration agency for ISNI and we recommend that HEFCE and the British Library discuss whether it would be appropriate for there to be a UK-based registration agency and how bulk creation/checking of ISNIs might take place for UK academic institutions and other UK organisations involved in research.

The report and its findings are currently being publicised and disseminated. A draft statement of agreement (based on the report’s findings and highlighting the direction of travel) is being prepared so that key organisations, such as RCUK, Jisc, HEFCE, etc can sign up to it.

On 10 February I gave a presentation on Organisational Identifiers and the outputs from the working group at the Jornadas FCCN workshop.

I had been kindly invited over there by João Moreira and I’d like to thank him and FCCN for their invitation and hospitality. Although the workshop was in Portuguese, so it’s difficult for me to summarise the other presentations, it was a great opportunity to disseminate the outputs from the group to a non-UK audience. There are clearly lessons learned from the group that are relevant internationally and not restricted to the UK. Although the original aim of the Organisational Identifier review, for example, was to focus on the UK, it soon became clear that many use cases have an international element as researchers and research groups collaborate more and more beyond the borders of their institution and home nation.

From discussions with João, it would seem that there is a lot of interest in Portugal in the work of Jisc, in particular in areas such as open access, research data, ORCID and the work of the CASRAI-UK pilot. There has also been interest from Sweden and the Netherlands in the work of this pilot, its outputs and lessons learned. Although we are trying to solve UK issues on organisational identifiers, data management plans and open access reporting, the work is proving useful for other countries.

You can view and download my presentation from the Jornadas FCCN workshop here on Slideshare.

The minutes from the February (online) meetings of all three working groups are now available on the CASRAI blog.

There were no meetings in January, due to the Christmas and New Year holiday, so these were the first meetings of the year. As the pilot nears completion, and we move to a new phase, each working group is busy producing outputs for the end of March and beyond.

The three Working Groups that make up this pilot have monthly meetings to discuss progress. To improve how we communicate and disseminate the work and outputs from the three working groups, we have decided to make the minutes from these meetings available online. These are summaries so that they are easier to digest and we hope will make the work of the groups more visible.

The minutes are published on the CASRAI website and are available on their blog.

The three working groups (WGs) that make up the Jisc CASRAI-UK pilot meet online every four weeks to discuss progress. Recently, group meetings have focused on what has been achieved by each group and what still needs to be done. This post provides an update on where we are with each group.

Organisational Identifiers

This group has made better progress than other groups for various reasons. It has a large membership that has grown over the last year. Its mandate is clear and described on the WG’s page. As described in the previous post, the WG recently funded a review of Organisational Identifiers. This work is due for completion by the end of November and the consultants have been busy working with the WG members to review the core set of identifiers against a number of use cases. Details of this work will be released later this year and communicated via this blog, in the first instance.

Data Management Plans

Recent meetings have focused on reviewing the work already done and planning what needs to be done in the next few months. A first use case, “Submit Data Management Plan (DMP) to Funders”, has been developed based on the current version of the DCC (Digital Curation Centre) data management plan template v1.0. However, this requires additional work to include the harmonisation of DMP requirements across the funders. Once this is done the application profile can be published. Also, the group will incorporate feedback from Research At Risk (a Jisc co-design theme) in future work. There may also be further use cases required. One issue with this group is that its membership is quite small. There is also a lack of funder engagement, but the mitigating action we are taking is to refocus on institutional membership of the WG to suggest where harmonisation across funder requirements for DMPs could take place.

Open Access Reporting

This group was put on hold earlier in the year and it was agreed that the group would reconvene in September 2014 while waiting for forthcoming work from other initiatives (HEFCE REF OA policy and RIOXX work, for example). When the group recently came back together it was agreed that the WG should set forth on a new phase of work and to look at what can be practically delivered. The remit of the group was reviewed and it was decided that the focus of the group will initially be on Open Access reporting with action on other entities being reconsidered later.

The summary plans for each group have been updated and the Working Group pages have been modified to reflect these changes.

Jisc has commissioned a review of organisational identifiers for the Organisational Identifiers Working Group, which is one of three working groups operating under the Jisc CASRAI-UK pilot. The review will report to a workshop planned for the end of November 2014.

This is part of Jisc’s collaborative work with CASRAI (Consortia Advancing Standards in Research Administration Information) to trial the “CASRAI approach” in the UK. The working group on “organisational lists” aims to:

develop a sustainable process for maintaining authoritative lists of organisations.

The review will examine three candidate “standards” for use with organisational lists:

International Standard Name Identifier (ISNI)/Ringgold

UK Provider Reference Number (UKPRN)

Digital Science.

Important elements to investigate will include metadata, accessibility, associated licences/IPR, current scope, classification, coverage, relationships with other standards/organisations, maintenance and development, functions supported and business model.

In particular, the reviewers will check these standards against use cases already articulated by the working group and at the same time check that those use cases are accurately expressed and adequately reflect the needs of UK research. From a university perspective, the basic set of specific use cases will include:

Funder: reviewers’ matching (process of funding allocation based on review and matching of expertise).

Publishing an article: identify publisher; identify affiliations of all authors; identify funder(s) – who funded the research; funder(s) – who paid for any APC; identify data steward (often an organisation rather than a person).

Matching of institute names / identifiers across historical or non-scientific datasets, e.g. to preserve the historical integrity of patent applications; to capture and preserve the organisation associated with a data collection at the time of deposit (and at other times), without updating it should the organisation’s name or structure change; and to link two or more incarnations of that organisation.
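The last use case above, preserving the organisation as it was at the time of deposit while still linking its later incarnations, can be sketched as a small data structure. This is only an illustration under stated assumptions: the `Incarnation` record, its fields, the two helper functions and all identifiers, names and dates are hypothetical and do not come from any of the candidate identifier schemes.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Incarnation:
    """One historical form of an organisation (hypothetical record layout)."""
    org_id: str
    name: str
    valid_from: date
    valid_to: Optional[date] = None      # None means still current
    successor_id: Optional[str] = None   # link to the next incarnation, if any


def incarnation_at(records: List[Incarnation], when: date) -> Optional[Incarnation]:
    """Return the incarnation that was valid on the given date, if any."""
    for r in records:
        if r.valid_from <= when and (r.valid_to is None or when < r.valid_to):
            return r
    return None


def lineage(records: List[Incarnation], org_id: str) -> List[Incarnation]:
    """Follow successor links from org_id to the most recent incarnation."""
    by_id = {r.org_id: r for r in records}
    chain: List[Incarnation] = []
    seen = set()
    current: Optional[str] = org_id
    while current is not None and current in by_id and current not in seen:
        seen.add(current)  # guards against accidental cycles in the data
        chain.append(by_id[current])
        current = by_id[current].successor_id
    return chain


# Illustrative data only: a made-up organisation that was renamed in 1992.
records = [
    Incarnation("org-A", "Example Polytechnic", date(1970, 1, 1), date(1992, 1, 1), "org-B"),
    Incarnation("org-B", "Example University", date(1992, 1, 1)),
]

deposit = incarnation_at(records, date(1985, 6, 1))
print(deposit.name if deposit else None)
print([r.name for r in lineage(records, "org-A")])
```

Keeping each incarnation as its own immutable record means a dataset deposited in 1985 keeps pointing at the name that was valid then, while the successor link still lets a later query join it to the present-day organisation.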

The review will take key steps towards identifying whether a clear candidate exists for adoption in the UK. It will be run so that:

the appropriate members of the working group can express in detail their requirements and expert opinions;

the project team can summarise and clarify points of agreement and investigate, check and test areas where consensus does not exist;

the project team can review the candidate solutions and check the use cases already outlined (and possibly others suggested by the working group) against each of the three suggested candidates;

a validation workshop is held at the end of the project to present the results with a view to the working group then making clear decisions with summary information, data and results at hand.

In addition to the checking of use cases against the candidate “standards”, the project team will interview members of the working group and the providers of each identifier.

The membership of the working group includes representatives from ARMA[1], the UK Research Councils, HEDIIP[2], CrossRef[3], The British Library, Wellcome Trust, CRIS (current research information system) vendors and UK universities.

[1] ARMA is the UK’s professional Association for Research Managers and Administrators.

[2] HEDIIP is the Higher Education Data & Information Improvement Programme.

[3] CrossRef is an association of scholarly publishers that develops shared infrastructure to support more effective scholarly communications.

Jisc and CASRAI are strategic partners and are collaborating on an exploratory basis to trial the ‘CASRAI approach’ in the UK to improve research interoperability.

CASRAI engages national constituencies in a collaborative curation of an international dictionary to enable research information management metadata to be reused and shared in order to enhance the management and flow of information within and between organisations.

The dictionary contains definitions of key terms or information elements which relate to the management of e.g. research grants, CVs or data management plans. It documents controlled vocabularies, authoritative lists and identifiers that are relevant for these terms.

CASRAI data standards are aimed at simplifying interoperability in order to reduce duplication and improve the quality of research administration data that is distributed across multiple software tools, organisations, disciplines and countries.

The CASRAI dictionary provides a single, open and unambiguous reference source for standards agreements. It also includes national and international standards developed outside of the collaborative forum provided by CASRAI when consensus can be reached that these meet the UK research community’s requirements.

Following the CASRAI summit held in December 2012, it was agreed that Jisc and CASRAI should work together in the following three working groups:

Develop a sustainable process for maintaining authoritative lists of organisations in the CASRAI dictionary.

Research Contributions and Open Access Reporting

Data profile supporting institutional report to UK funders for the new policy on Open Access, and for research contributions / outputs more generally

A key input for this working group will be the final report from the UKRISS project. This project is consulting with a range of UK stakeholders to work towards a shared vocabulary to be used in research reporting.