Lead Senior Research Data Management System Developer

Submitted by STulley on Fri, 2016-01-08 13:18

Posted to IASSIST on:

2015-12-17

Employer:

Harvard Medical School

Employer URL:

http://www.harvard.edu/

Description:

A joint project
between the SBGrid Consortium at Harvard Medical School and the
Dataverse Team at the Institute for Quantitative Social Science at
Harvard University has an immediate opening for a lead developer to help
us build a next-generation data publication system for large biomedical
datasets. We aim to make biomedical datasets publicly available through
a federated data grid to facilitate access, citation, and data analysis
by scientists. Our pilot collection includes datasets generated using
X-ray crystallography, computer modeling, lattice light sheet
microscopy, and microED diffraction. This collection is currently
replicated to computing centers in the US, Europe, Asia, and South
America. The project is supported by the Helmsley Charitable Trust and
was recently selected as a pilot of the U.S. National Data Service. To
learn more about the environment, please visit our current
implementation at data.sbgrid.org and our group websites at sbgrid.org,
slizlab.org, and http://datascience.iq.harvard.edu/team.

The
lead developer will be responsible for successfully migrating our
in-house research data management system, written in Python, to
Dataverse (http://dataverse.org) after first extending Dataverse (with
the full support of the Dataverse development team) to include the
features necessary for the migration. The candidate will develop a final
set of requirements based on the feedback and experience of the
end-user community using our current pilot system. Examples of features
that must be added to Dataverse include better support for large (~100
GB) datasets, automatic data validation pipelines, and other
functionality relevant to specific biomedical data types. The lead
developer will also help evaluate data transfer, upload, and
management technologies, such as Globus, that can integrate with
Dataverse to support larger datasets and allow computing directly on the
data. The developer will work with our team to ensure that all new
functionality developed under this project is merged into the Dataverse
open source project and shared with the community.

As a senior
member of our team, this individual will also help train junior
team members, collaborate with collection specialists, and present outcomes
of the project at meetings and conferences.

Qualifications:

A Bachelor's degree in
computer science or engineering and 5-8 years of strong programming
experience are essential, preferably in Java and Python and ideally in the
context of web applications.

Our team will welcome
candidates with diverse technical backgrounds, but the successful
candidate will have experience handling large datasets and leading
software development projects. A working knowledge of Linux, shell
scripting, databases, and distributed version control systems (Git,
Mercurial, etc.) is also necessary. The ideal candidate will also be
familiar with data management software and the handling and analysis of
large datasets.
