Building Virtual Organizations: enhancing discovery and innovation by bringing people and resources together across institutional, geographical and cultural boundaries.

Congruent with the three thematic areas, CDI projects will enable transformative discovery to identify patterns and structures in massive datasets; exploit computation as a means of achieving deeper understanding in the natural and social sciences and engineering; simulate and predict complex stochastic or chaotic systems; explore and model nature’s interactions, connections, complex relations, and interdependencies, scaling from sub-particles to galactic, from subcellular to biosphere, and from the individual to the societal; train future generations of scientists and engineers to enhance and use cyber resources; and facilitate creative, cyber-enabled boundary-crossing collaborations, including those with industry and international dimensions, to advance the frontiers of science and engineering and broaden participation in STEM fields.

My own research work fits very well into this program. It is an area that will alter the nature of how science is done, and promises to -- as I said earlier today in an email -- radically increase the usefulness and value of the research ecosystem, both to those inside and outside of research. But more important, it will change many aspects of the dominant way in which science is done, including researchers being restricted to narrow and deep knowledge in a particular research area, sometimes comically (and somewhat inaccurately) illustrated by this limerick:

There once was an old man from Esser, Whose knowledge grew lesser and lesser. It at last grew so small, He knew nothing at all, And now he's a college professor.

The tools that this program suggests, along with the research programs at other organizations (including CISTI Research here at CISTI), allow for rich, productive and integrative inter-disciplinary approaches. Some of these tools will themselves grow towards being a researcher's de facto collaborator(s): they can represent expert knowledge external to the researcher's own focus and flag important external and out-of-band connections to the researcher. Who wouldn't want to have 30-40 (friendly!) experts in domains -- in which you are not an expert -- monitoring everything you write, everything you search, everything you read, having an understanding of your core research interests, ready to identify and flag important external connections, collaborators, publications, initiatives, datasets, claims, knowledge?

continuously anticipate and adapt to changes in technologies and in user needs and expectations;

engage at the frontiers of computer and information science and cyberinfrastructure with research and development to drive the leading edge forward; and

serve as component elements of an interoperable data preservation and access network.

...these exemplar organizations can serve as the basis for rational investment in digital preservation and access by diverse sectors of society at the local, regional, national, and international levels, paving the way for a robust and resilient national and global digital data framework.

This is a significant step forward, in the spirit of many of the plans and ideas contained in the Canadian NCASRD (and other efforts), and hopefully some of this visionary investment in data management and data for science will take hold here in Canada.

The "Big Opportunities..." article I find particularly engaging for a couple of reasons: 1) their introduction of the "Commons of Scientific and Technical Data (CSTD)" as managed by the network of university libraries sounds promising, and 2) my own pet belief, that the data produced by "small science" is also useful and needs to be looked after.

Friday, September 14, 2007

In "An Information Revolution", David Penman discusses Open Access and Open Data (especially as applied to government-funded research) in general, and more specifically as applied to New Zealand science and scientists. While there is some good news:

"The Foundation for Research, Science and Technology is now reviewing its data policy and moving towards the norm for the OECD – greater open access for publicly-funded data. Rather than the research provider deciding on access, all information is openly and freely available unless restrictions such as national security, environmental damage (eg, the GPS co-ordinates of threatened species), or clear commercial disadvantage can be justified."

He has some blunt - and appropriate - words for NZ scientists:

Our researchers will also have to change. No longer can they sit with filing cabinets full of data waiting for the definitive experiment or the life time monograph. Publish quickly in electronic media, make your data and models freely available and get rewards from both publishing and showing that your data are being used by others – this should become the norm.

He also has an interesting view of the future of libraries, one that many libraries would be unwise to ignore:

Libraries are becoming available to all without leaving your home, information on your environment will become openly and freely available and communities will be able to use the internet to take more control of our institutions – a new style of democracy will emerge.

"If New Zealand is to assess this international science trend and respond to it, a much more united scientific response is required, and also a “whole of government” approach. Not only the traditional science agencies, but also the National Library, Statistics New Zealand, even local government, all may need to contribute in some way and assist with a consensual and comprehensive solution."

Fortunately, there are other examples where New Zealand is moving ahead in this area: the recent (July 2007) announcement of a free data policy by the New Zealand National Institute of Water & Atmospheric Research (NIWA). Hopefully more like this will follow....

Wednesday, September 05, 2007

The Canadian Institutes of Health Research (CIHR) have announced this new policy, which includes publications AND data. Basically, they have taken a gentle but significant step in opening up the research outputs of the grant recipients they support. It does not impact all forms of publications. Two of the most salient points:

"5.1.1 Peer-reviewed Journal Publications: Grant recipients are now required to make every effort to ensure that their peer-reviewed publications are freely accessible through the Publisher's website (Option #1) or an online repository as soon as possible and in any event within six months of publication (Option #2)."

and

"5.1.2 Publication-related Research Data: Recognizing that access to research data promotes the advancement of science and further high-quality and ethical investigation, CIHR explored current best practices and standards related to the deposition of publication-related data in openly accessible databases. As a first step, CIHR will now require grant recipients to deposit bioinformatics, atomic, and molecular coordinate data into the appropriate public database, as already required by most journals, immediately upon publication of research results (e.g., deposition of nucleic acid sequences into GenBank). Please refer to the Annex for examples of research outputs and the corresponding publicly accessible repository or database."

Gentle but firm: "as soon as possible....in any event within six months".

In "The irony of a web without science" (Sept 4 2007) James Boyle decries the state of scientific research and describes the limited amount of scientific output - in particular journal publications - that can be accessed in an Open Access manner. The author cannot reconcile his observation that the "genius of the web is that it is an open network" with the closed and expensive nature of modern science and the modern scientific publishing landscape.

But the author goes on to say:

Thus I do not support the proposal that all articles based on state-funded research must pass immediately into the public domain. But there are more modest proposals that deserve our attention.

Pending legislation in the US balances the interest of commercial publishers and the public by requiring that, a year after its publication, NIH-funded research must be available, online, in full...

I think the author muddles a number of different ideas here (OA does not imply public domain...) and does not properly understand what the Open Access movement is. But it is interesting to see this discussed in something like the FT.