Data management solutions

Ecosystem research here examines Australian ecosystem dynamics, in particular the role of Australian ecosystems in the cycling of water and carbon between biospheric and atmospheric stores, and the response of these ecosystems to changes in these cycles. Effective research was hampered by a lack of coordination in the collection, archiving and quality control of measurements from remote stations across Australia.

With funding from the Education Investment Fund (EIF), directed through the Australian National Data Service, a repository was developed and deployed for the OzFlux community. This system provides researchers with integrated access to Australian ecosystem research data, facilitates collaborative research, and promotes the re-use of this data.

In 2010 the Human Proteome Organisation (HUPO) launched the Human Proteome Project (HPP), aimed at cataloguing the protein information arising from the many proteomics-based studies worldwide. To support complete coverage, one arm of the project will take a gene- or chromosome-centric strategy (C-HPP). Labour in this international effort is divided by assigning each of the 24 human chromosomes to one or more countries. Under this scheme, the Australian/New Zealand consortium has been assigned Chromosome 7, as this chromosome contains various genetic markers associated with diseases relevant to the Australian population.

Despite multiple large international biological databases housing genomic and protein data, there is currently no single system that integrates up-to-date pertinent information from each of these data repositories and assembles the information into a format suitable for a global proteomics effort of the type proposed by the C-HPP.

Monash, in close collaboration with the Proteomics research community, is developing a data integration and analysis software system for the C-HPP effort to make data collections from this resource discoverable through ANDS (Australian National Data Service) Research Data Australia. Whilst the software is being designed to be species and chromosome independent, the initial focus is on the development of a resource for Human Chromosome 7.

Ultimately, this tool will assist in the analysis of biological function and the study of human disease.

The innate immune system is a highly conserved first line of host defence against infections and other disease stimuli. It initiates inflammatory responses and is important in the surveillance of developing cancers. As a result of recent discoveries and an increased understanding of the components of this response, we are now better able to determine its role in diseases, assess human susceptibility to disease and target therapeutics to this system. A key component of the innate response is the production of hormone-like proteins called interferons (IFNs). These proteins activate signalling pathways in cells, modulating the expression of up to several thousand genes whose protein products are responsible for the many biological effects of interferons: they inhibit viral replication, prevent cell growth and activate cells of the immune response.

Interferome V2.0 is effectively a catalogue of IFN-regulated genes. It assimilates a large number of data sets, including detailed annotation and quantitative data, from a microarray analysis pipeline and makes them available to researchers through enhanced search capabilities that allow queries across more than 2000 data points. Interferome V2.0 can also publish metadata about the catalogued research data collections to ANDS (Australian National Data Service) Research Data Australia, which promotes citations and data re-use, and enables new discoveries from old data.

This project was funded by the Education Investment Fund (EIF), directed through ANDS.

Microscopy images of animal and/or human tissues and cells are taken in many research projects at most universities. Cellular and systems biology, for example, have become increasingly data-intensive fields of research. Current-generation imaging instruments are automated high-throughput devices capable of generating terabytes of data daily. With various microscope types and vendors producing many different proprietary microscopy image data formats, some of the major issues are:

how to extract, store and query the machine-generated metadata; and

how to annotate and associate experimental metadata with the raw data whilst ensuring the integrity of the association and provenance of the data.
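
As an illustration of the second issue only, the sketch below binds experimental annotations to raw image data through a content checksum, so that a later mismatch between data and annotation can be detected. It is a minimal sketch under stated assumptions, not how any particular system implements this: the `ImageRecord` class, the `ingest` helper, the field names and the sample file name are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the raw image bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class ImageRecord:
    """Hypothetical record linking metadata and annotations to one raw image."""
    filename: str
    checksum: str               # digest of the raw bytes at ingest time
    instrument_metadata: dict   # machine-generated values (assumed already parsed)
    annotations: list = field(default_factory=list)  # experimenter-supplied notes

    def annotate(self, author: str, note: str) -> None:
        # Store the checksum with each annotation as a simple provenance link.
        self.annotations.append(
            {"author": author, "note": note, "checksum": self.checksum}
        )

    def verify(self, data: bytes) -> bool:
        # True only if these bytes are the ones the record was created for.
        return sha256_of(data) == self.checksum


def ingest(filename: str, data: bytes, instrument_metadata: dict) -> ImageRecord:
    """Create a record whose checksum ties the metadata to the raw bytes."""
    return ImageRecord(filename, sha256_of(data), dict(instrument_metadata))


# Illustrative data only: stand-in bytes rather than a real microscopy image.
raw = b"\x00\x01fake-pixel-data"
record = ingest("cells_001.tif", raw, {"objective": "40x", "channels": 3})
record.annotate("researcher", "control sample")
```

Calling `record.verify(raw)` on the unmodified bytes confirms the association, while any change to the bytes makes the check fail.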

Open Microscopy Environment Remote Objects (OMERO) is an open-source software platform that was developed in the UK. It enables access to and use of a wide range of biological data. OMERO uses a server-based middleware application to provide a unified interface for images, matrices and tables. OMERO's design and flexibility have enabled its use for light microscopy, high-content screening, electron microscopy and even non-image genotype data. Monash adapted OMERO so that its data collections can be registered with ANDS Research Data Australia, promoting data re-use and citations.

OMERO benefits research by enhancing research practice and facilitating collaborative research.

The Australian Synchrotron generates terabytes of data daily from a range of scientific instruments. In the past, this crucially important data often ended up on old disk drives, unlabelled and offline. Sharing data with collaborators often involved physically sending it via the post.

MyTardis was created at Monash University as a web application geared towards receiving data from scientific instruments, such as those at the Australian Synchrotron, in an automated fashion. This gives researchers power and convenience, facilitating the organisation, long-term archival and sharing of data with collaborators. MyTardis also allows researchers to cite their data, as has been done for publications in high-impact journals such as Science and Nature.

Since its creation and deployment at the Australian Synchrotron, MyTardis has expanded to meet the data management needs of researchers in areas such as microscopy and microanalysis, particle physics, next-generation sequencing and medical imaging. It has been deployed at more than 10 universities and other research institutions across Australia.