How to keep up with the parallel sessions

It is only when I come to write up the conference that I wish I were ubiquitous, able to listen to three sessions simultaneously, or at least to make better choices about which session to attend, as the geography of the venue made it difficult to flit from one to another. Parallel sessions are of course useful in that they cram a great deal of diverse material into a compact time frame, and fortunately you can glean at least some of what you missed from the ALPSP website later.

The ‘Publishing practicalities’ session, for example, chaired by David Smith of The IET, looked at Creative Commons BY licences, so central to the discussion of developing Open Access. It explained the reasoning behind PeerJ, an open access journal based on a ‘lifetime membership’ rather than an APC model, which sees itself as a ‘technology first’ publisher, opting to outsource to ‘the cloud’ from the outset rather than hosting and maintaining its own technical infrastructure. Finally, the session gave space to Google Analytics, described as the most popular and widely used web-traffic analytics tool, which can help publishers make scientific, data-driven decisions about the development of their websites.

Alan Hyndman from Digital Science

“Source: Wikipedia, so it must be true!” Alan Hyndman

The familiar ‘Industry updates’ session chaired by Toni Tracy gave opportunities to learn about the Copyright Hub, designed to help overcome some of the difficulties experienced in copyright licensing. ‘Force11: The future of research communication and e-scholarship’ is described as a community of scholars, librarians, archivists, publishers and research funders that has arisen organically to help facilitate the change toward improved knowledge creation and sharing; perhaps we can look forward to its being lifted out of the relative obscurity of parallel into one of the plenary sessions.

In the same update session, Heather Staines of SIPX talked about ‘The MOOC craze: what’s in it for publishers?’ MOOCs are massive open online courses aimed at large-scale participation and open (free) access via the internet. Those publishers interested in the freemium approach might be open to these opportunities. Finally Steve Pettifer of University of Manchester told how he “stopped worrying and learned to love the PDF”.

“While licensing content for use in [MOOC] courses challenges every existing model, there is a place for your content, whether it is OA, subscription or ownership based” Heather Ruland Staines

Fiona Murphy from Wiley

Apart from the session on accessibility, of which more in the final post to follow, I did seek out the parallel session on data chaired by Fiona Murphy of Wiley. Access to the data underlying reported research assists verifiability and reproducibility, and can help advance scholarly progress through evaluation and data mining. Questions arise over which data should be shared: raw, reduced or structural (as in crystallography), for example (Simon Hodson).

To be fit for re-use or development, data must be discoverable, openly accessible, safe and useful (Kerstin Lehnert). There is a need for data provenance, for standards for incorporation into metadata, and for stewardship of data repositories. Steps towards meeting these needs include the DRYAD repository, a nonprofit membership organization that makes the data underlying scientific publications discoverable, freely reusable and citable. Another is IEDA (Integrated Earth Data Applications), a community-based data facility funded by the US National Science Foundation to support, sustain and advance the geosciences by providing data services for observational solid-earth data from the ocean, earth and polar sciences.

A somewhat contrary view was provided by Anthony Brookes of Leicester University, who suggested focusing not on the sharing of data but on the exploitation of knowledge. In biomedical, clinical, genetic and similar research areas there are privacy and ethical barriers to unfiltered sharing and access. That does not undermine the idea of sharing ‘data’ at various levels, and indeed the more abstracted data that can be shared under such circumstances might be richer and fuller of ‘knowledge’. He foresaw a hierarchy in which ‘safe’ data can be openly shared, ‘mildly risky’ data are accessible in an automated, ID-assisted fashion, and personal data are subject to managed access or none at all. A prototype of this approach is Café Variome, which seeks to provide the framework for such access/sharing management.
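Brookes’s tiered model can be sketched as a simple policy lookup. This is purely illustrative: the tier names follow his talk, but the mapping, function and access-mode labels are hypothetical and do not describe Café Variome’s actual design.

```python
# Illustrative sketch of a tiered data-access policy along the lines
# Anthony Brookes described. Tier names follow the talk; everything
# else here is a hypothetical stand-in, not a real system's API.

ACCESS_POLICY = {
    "safe": "open",                  # openly shareable data
    "mildly_risky": "id_assisted",   # automated, ID-assisted access
    "personal": "managed_or_none",   # managed access, or none at all
}

def access_mode(tier: str) -> str:
    """Return the access mode for a given data-sensitivity tier."""
    try:
        return ACCESS_POLICY[tier]
    except KeyError:
        raise ValueError(f"unknown data-sensitivity tier: {tier!r}")

print(access_mode("mildly_risky"))  # -> id_assisted
```

The point of such a scheme is that the decision about *how* data may be accessed is made once, at the tier level, rather than renegotiated for every dataset.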

The discussion following this session suggested that there is room at future conferences to debate the wider issues: the value added by linking across datasets; knowledge engineering from datasets, which demands full metadata and provenance; publishing models that facilitate all this; and the role of scientists, editorial boards and learned societies in defining standards for data quality, description, metadata and identifiers, all seen as matters of some urgency.