Summary of contents:
HILTIV has produced a number of service demonstrators examining “cross-searching multi-subject scheme information environments”. The demonstration aspect of the work is that the mappings used cover selected sections of the vocabulary schemas.

Summary of contents:
“It is often stated that, worldwide, the spontaneous level of self-archiving is around 10-15% (i.e. about 15% of published articles are made openly available by their authors). [Harnad (2006), Björk, B-C., Roos, A. & Lauri, M. (2008)] We found similar levels of archiving: 16% of questionnaire respondents link to local, open copies of their work; 19% link to external copies – though often these are not openly accessible. Having said this, much of the self-archived content on web sites is working papers, reports and conference papers; the % of published journal papers spontaneously self-archived (on personal web sites or in any repository) by White Rose authors is likely to be lower than 15%. Of course, there is considerable variation between subject disciplines. This highlights the immediate potential value of open access repositories but also, perhaps, underlines the scale of the cultural change required – even after several years of institutional repository development – to engage researchers in active dissemination of their outputs.”

Comments:
This provides further evidence for the percentage figures on self-archiving. One consequence of this figure (even within a now established repository) is the challenge faced by institutions seeking to comply with funders’ deposit mandates.

“Our experience to date, though, suggests authors will make the most of administrative support and that a helpful administrative framework results in higher levels of self-archiving overall. In particular, authors are responsive to well-known individuals in their departments: for example, local administrators have good success rates in persuading authors to re-send appropriate versions of their work where a non-archivable version (generally the published PDF) has been sent initially. Local administrators are well placed to “champion” and support the repository in ways that more “remote” central repository staff are not; this advantage needs to be balanced against the need to provide training and support for departmentally based administrators.”

The project also notes that encouraging this practice may hinder the promotion of self-archiving as such.

Comment:
This raises an interesting question of priority – is the goal author self-archiving or increased repository content?
From the point of view of a funding body, of the promotion of Open Access, or of institutional statistics (and REF), the latter is what matters;
however, there are strong historical ties to author self-archiving, the author is (in some senses) the one doing the sharing, and the less self-archiving there is, the greater the organisational and financial overhead of the repository.

Either way, the project’s findings support the view that the involvement of local administrators increases deposit rates (motivation).

Summary of contents:
“Analysis of individual researcher publication pages revealed a good deal of inconsistency of formatting, including within individual publication lists. The idea of “scraping” publication metadata from researcher pages is attractive, but the reality is quite challenging.”
“The Perl code written for one author could not be reused with another and would need tweaking every time.”
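The difficulty the project describes can be illustrated with a minimal sketch. This is not the project's Perl code: the pattern, the function name and both citation strings below are invented for illustration, and Python is used only for convenience. A pattern written for one author's list layout simply fails on another author's layout.

```python
import re

# Hypothetical pattern written for ONE author's list style:
# "Surname, I. (2007) Title. Journal, 12(3), 45-67."
PATTERN = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\s+(?P<title>.+?)\.\s+(?P<source>.+)$"
)


def parse_citation(line):
    """Return a dict of fields, or None if the line does not fit the pattern."""
    match = PATTERN.match(line.strip())
    return match.groupdict() if match else None


# Two invented citations for the same paper, formatted differently.
style_a = "Smith, J. (2007) Open access in practice. Journal of Examples, 12(3), 45-67."
style_b = "J. Smith, 'Open access in practice', Journal of Examples 12 (2007) 45-67"

parsed_a = parse_citation(style_a)  # fits the pattern, fields are extracted
parsed_b = parse_citation(style_b)  # does not fit: a new pattern would be needed
```

Each additional layout needs its own pattern, which matches the report's point that the code "would need tweaking every time".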

Summary of contents:
The project had various intended outcomes. One of these was for the repository to double in size over the course of the project.

“At the original start date for the project (April 07), the repository held somewhere over 1,600 items. Taking this as the baseline, we have exceeded our target. However, as we delayed the official project start date to allow for staff recruitment, if we take our figure from July 07, we have fallen slightly short but will meet the target approximately 1 month post-project. As can be seen from the graph, the growth rate has been much stronger in the latter half of the project.”

The project has not met its related goal of capturing 20% of the consortium’s research outputs, but “progress has been made”.
“Across the partnership, we estimate nine-ten thousand items falling within repository scope are produced per annum. Eventually, we need to be ingesting / be capable of ingesting over 200 new items each week; this excludes the “mountain” of legacy metadata and publications which could potentially be added to WRRO.”
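As a rough check on the quoted figures, the annual estimate can be converted to a weekly rate. The sketch below assumes output is spread evenly over a 52-week year, which the report does not state; the function name is illustrative.

```python
def weekly_rate(items_per_year, weeks_per_year=52):
    """Average items per week, assuming an even spread over the year (an assumption)."""
    return items_per_year / weeks_per_year


low = weekly_rate(9_000)    # lower end of the quoted annual estimate
high = weekly_rate(10_000)  # upper end of the quoted annual estimate

print(f"{low:.1f} to {high:.1f} items per week")
```

On this assumption the annual estimate works out at roughly 173 to 192 items per week, so the quoted figure of "over 200" sits a little above the even-spread average, leaving some headroom.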

Target of at least 80% full text:
“This target has been met. For the majority of its life, WRRO has had a high proportion of full text records (90-95%). At the close of the project, approximately 82% of items have a local full text openly accessible copy of the research outputs; an additional 5% or so link to a full text open access works outside the repository. The proportion of metadata only records is increasing because of the addition of the University of York’s RAE data and other bulk imports. It is anticipated that the proportion of full text items will fall to 60% for a short time but that the proportion of full text will then start to recover.”

Summary of contents:
The project looked at importing from departmental bibliographic databases and from other departmental bibliographic collections (some of which were created explicitly for this purpose).

“It is interesting to note that the department preferred on balance to create their own local database and upload material en masse at the end of the summer. Similar suggestions have been made from time to time by other departments even though creating an additional collection system involves more work at the local level. For example, we have been asked to provide an Excel template to allow data to be collected ready for periodic bulk import into the repository. Though this approach may seem counterintuitive, local academics and administrators have suggested that, for some departments, this [local collection] may be a more sustainable method of data collection. Such solutions may be worth considering, perhaps as an interim measure, where sustained self-archiving activity is proving particularly elusive – though could prove counterproductive overall.”

It is also notable that a number of departments already had their own bibliographic management tools, some of which could export in formats that are directly importable into EPrints via plugins (DOI, EndNote, BibTeX, Multiline Excel and PubMed ID). More details on the use of the plugins are available: http://eprints.whiterose.ac.uk/increase/plugins.html [page 18 notes that one difficulty with using DOI material from CrossRef is the lack of author data; as a result, “We have used CrossRef as a base source of metadata but not to enhance metadata in records already created within the repository.”]

The project notes that some of the desire to use other tools may be sidestepped by future developments that better integrate repository deposit into researchers’ workflows and by the introduction of research information / management systems.

From the conclusions:
“There are likely to be personal and departmental sources of metadata suitable for bulk import at most /all HEIs. The metadata within such systems may well be inconsistent and incomplete. We found import to be more time-consuming than we hoped. A high degree of manual intervention was required: mainly to supplement incomplete metadata or add full publication details to imported “in press” items. Unless effective ways can be found to automatically check and improve bulk metadata this type of import may be a false economy and may not be the best way to grow the repository sustainably nor to embed into researchers’ workflow. An alternative approach would be to identify sources of pre-quality checked metadata – possibly from commercial sources – to create a back-catalogue of publication metadata.”
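The kind of automatic check the report calls for might look something like the sketch below. It is only an illustration under stated assumptions: the field names, the "status" value and the function names are hypothetical and are not taken from the project or from any repository software. It separates records that look complete from those needing the manual intervention the report describes.

```python
# Hypothetical minimum set of fields a record needs before bulk import.
REQUIRED_FIELDS = ("title", "authors", "year", "publication")


def find_problems(record):
    """List the reasons a record would need manual attention (empty if none)."""
    problems = [f"missing {field}" for field in REQUIRED_FIELDS if not record.get(field)]
    if str(record.get("status", "")).lower() == "in press":
        problems.append("in press: full publication details needed")
    return problems


def triage(records):
    """Split records into those ready for import and those needing review."""
    ready, needs_review = [], []
    for record in records:
        problems = find_problems(record)
        if problems:
            needs_review.append((record, problems))
        else:
            ready.append(record)
    return ready, needs_review


# Invented sample records.
sample = [
    {"title": "Paper A", "authors": ["Smith, J."], "year": 2007, "publication": "Journal of Examples"},
    {"title": "Paper B", "authors": [], "year": 2008, "publication": "Journal of Examples"},
    {"title": "Paper C", "authors": ["Jones, K."], "year": 2008, "publication": "", "status": "In press"},
]
ready, needs_review = triage(sample)
```

A check like this only flags gaps; supplying the missing details would still be manual work unless a reliable external source of metadata were available.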

Comments:
The project again highlights a concern that alternative solutions may hinder the adoption of self-archiving.

[I think] The project’s experience that departments may opt to run their own bibliographic systems is an important reminder that there is no single solution to archiving Open Access copies, and that information in one place does not equate to information in one system.

The project also demonstrates the effective use of a number of plugins for the EPrints software to import data successfully.

Summary of contents:
“Our observations suggest that conditions likely to improve self-deposit are:
(i) keeping things as simple as possible from the author’s perspective
(ii) always asking for the author’s final version of a work (… “Accepted Version” suggested by The VERSIONS project …)
(iii) facilitating capture of the work at the point of acceptance for publication. …
(iv) providing central support to monitor uploaded files and seek copyright clearance where required
(v) reminding authors to deposit: this could be a periodic reminder, or could be linked to a publication “event” such as a publication being indexed in a bibliographic database
(vi) highlighting the impact of deposit through the regular provision of usage data”

From the conclusions:
“There is probably no simple “optimum” deposit point for research outputs; however, in the short term, capturing papers at the point of acceptance for publication is probably the most realistic option. The emergence of desktop capture/deposit tools may facilitate earlier capture and assist with version control. Capturing the most appropriate version of a work continues to be an issue; all efforts should be made to inform researchers about the “accepted version” and its importance in the open access landscape. It is likely to be helpful to instil this awareness in early career researchers and PhD students by including open access / scholarly communication elements in training.”

Comments:
These are the project’s suggestions for supporting the self-archiving process, based on its survey work and interviews. Self-archiving remains an ongoing challenge even with mandates. The list itself offers workflow advice and suggests what software tools are needed.