Summary

This tutorial will train reproducible research warriors on the practices
and tools that make experimental verification possible across an
end-to-end data analysis workflow. Attendees will apply open science
methods at every stage, from data gathering, storage, and analysis
through to publication as a reproducible article. Attendees are expected
to have basic familiarity with scientific Python and Git.

Description

The tutorial runs for four hours and covers the following topics:

Introduction (10min)

History of scientific societies and publications

Leeuwenhoek was the Man!

The Invisible College

Nullius in Verba

Replication of the early microscope experiments by Leeuwenhoek

Image Acquisition (15min)

Hands on: Cell phone camera microscope

With a drop of water

Hands on: Each pair acquires images

Data Sharing (45min)

Image gathering, storage, and sharing (15min)

GitHub (www.github.com)

Figshare (www.figshare.com)

Midas (www.midasplatform.com)

Hands on: Upload the images

Metadata Identifiers (15min)

Citable

Machine Readable

Hands on: Create data citation and machine readable metadata
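
A citable, machine-readable record can be as simple as a JSON document stored next to the images. The sketch below shows one possible shape, not a prescribed schema: the field names, the dataset title, the creator names, and the year are all placeholders chosen for illustration.

```python
import hashlib
import json


def build_metadata(title, creators, files, identifier=None):
    """Build a small machine-readable metadata record.

    `files` maps a file name to its raw bytes. A SHA-256 checksum is
    stored for each file so others can verify what they downloaded.
    The field names are hypothetical, chosen for this sketch.
    """
    return {
        "title": title,
        "creators": list(creators),
        "identifier": identifier,  # e.g. a DOI, once a repository assigns one
        "files": [
            {"name": name, "sha256": hashlib.sha256(data).hexdigest()}
            for name, data in sorted(files.items())
        ],
    }


def format_citation(record, year):
    """Render a human-readable citation string from the record."""
    authors = ", ".join(record["creators"])
    citation = f"{authors} ({year}). {record['title']}."
    if record["identifier"]:
        citation += f" {record['identifier']}"
    return citation


record = build_metadata(
    title="Pond water micrographs",           # hypothetical dataset title
    creators=["Attendee A", "Attendee B"],    # placeholder names
    files={"drop_01.png": b"example bytes"},  # stand-in for real image data
)
print(json.dumps(record, indent=2))           # machine-readable form
print(format_citation(record, 2000))          # citable form, placeholder year
```

Keeping the checksum in the record ties the citation to the exact bytes that were shared, which is what makes the later download step verifiable.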

Hands on: Download data via RESTful API (15min)

Provenance and

Python scripts

Hands on: Download the data via HTTP
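
A scripted download leaves a record of where the data came from and whether it arrived intact. The following is a minimal sketch using only the Python standard library; it does not target the API of any particular hosting service, and the function names are assumptions made for this example.

```python
import hashlib
import urllib.request


def download(url, destination, chunk_size=65536):
    """Stream `url` to `destination` and return the SHA-256 hex digest."""
    digest = hashlib.sha256()
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            digest.update(chunk)  # hash the same bytes that are written
    return digest.hexdigest()


def verify(actual_digest, expected_digest):
    """Raise if the downloaded bytes do not match the published checksum."""
    if actual_digest != expected_digest:
        raise ValueError(
            f"checksum mismatch: {actual_digest} != {expected_digest}"
        )
```

A call might look like `verify(download(url, "drop_01.png"), published_sha256)`, where `url` and `published_sha256` would come from the dataset's metadata record; both names are hypothetical here.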

Break (10min)

Local processing (60min)

Replication Enablement (20min)

Package versioning

Virtual Machines

Docker

Cloud services

Hands on:

Create a virtualenv

Run our tutorial package verification script
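
The tutorial's own verification script is not reproduced in this outline. As a rough idea of what such a check can do, the sketch below compares installed package versions against pinned ones using `importlib.metadata` from the standard library (Python 3.8 or later). The function names and the pinned versions passed in are assumptions.

```python
from importlib import metadata


def check_packages(required):
    """Compare installed package versions against pinned ones.

    `required` maps a distribution name to the exact version expected.
    Returns {name: (installed_version_or_None, matches)}.
    """
    report = {}
    for name, wanted in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # not installed in this environment
        report[name] = (installed, installed == wanted)
    return report


def summarize(report):
    """Format the report as one status line per package."""
    lines = []
    for name, (installed, ok) in sorted(report.items()):
        status = "OK" if ok else "MISMATCH"
        lines.append(f"{status:8} {name} (installed: {installed or 'missing'})")
    return "\n".join(lines)
```

Running `print(summarize(check_packages({"numpy": "1.0.0"})))` inside a fresh virtualenv would report a mismatch or a missing package; the version shown is a placeholder, not a requirement of the tutorial.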

Revision Control with Git (20min)

Keeping track of changes

Unique hashes

Hands on:

Forking a repository in GitHub

Cloning a repository

Creating a branch

Making a commit

Pushing a branch

Diffing

Merging

Pushing again

Create pull request
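
The "unique hashes" topic above can be made concrete with a few lines of Python. In repositories that use Git's default SHA-1 object format, the ID of a file's contents is the SHA-1 of a short header followed by the bytes themselves. The helper name below is an assumption made for this sketch.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a file with this content."""
    # Git prefixes the content with "blob <size in bytes>\0" before hashing.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The empty file always hashes to the same well-known ID.
print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the ID depends only on the content, two attendees who commit identical files get identical blob IDs, and any change to the bytes produces a different one.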

Python scripts (20min)

Data analysis, particle counting.

Hands on:

Run scripts on new data

Generate histogram for the data
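
The tutorial's analysis scripts are not included in this outline. As an illustration of the particle-counting idea, the sketch below thresholds a tiny grayscale image and labels 4-connected bright regions with a flood fill, then prints a text histogram of particle areas. It uses only the standard library; the function names, the threshold, and the sample image are assumptions, and a real script would more likely rely on a scientific Python imaging package.

```python
from collections import Counter, deque


def count_particles(image, threshold):
    """Label 4-connected regions of pixels brighter than `threshold`.

    `image` is a non-empty list of equal-length rows of numbers.
    Returns a list with the pixel area of each particle found.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill from this unvisited bright pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def text_histogram(areas):
    """One line per distinct area, with one '#' per particle of that area."""
    counts = Counter(areas)
    return "\n".join(
        f"{area:3d} px | {'#' * counts[area]}" for area in sorted(counts)
    )


image = [
    [0, 9, 9, 0, 0],
    [0, 9, 0, 0, 8],
    [0, 0, 0, 0, 8],
    [7, 0, 0, 0, 0],
]
areas = count_particles(image, threshold=5)
print(len(areas), "particles with areas", areas)
print(text_histogram(areas))
```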

Testing (30min)

Unit testing with known data

Regression testing with known data

Hands on:

Run tests

Add coverage for another method to the unit tests
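
The difference between the two kinds of test can be shown with the standard `unittest` module. In the sketch below, `mean_area` is a hypothetical function under test and the baseline values are invented for illustration; the unit tests check small inputs with obvious answers, while the regression test checks that known data keeps producing a previously recorded result.

```python
import unittest


def mean_area(areas):
    """Average particle area; the function under test."""
    if not areas:
        raise ValueError("no particles to average")
    return sum(areas) / len(areas)


# Baseline recorded from an earlier, trusted run (hypothetical values).
KNOWN_AREAS = [3, 2, 1]
BASELINE_MEAN = 2.0


class MeanAreaTests(unittest.TestCase):
    def test_single_particle(self):
        # Unit test: smallest input with an obvious answer.
        self.assertEqual(mean_area([4]), 4.0)

    def test_empty_input_is_rejected(self):
        # Unit test: the error path is part of the contract.
        with self.assertRaises(ValueError):
            mean_area([])

    def test_regression_against_baseline(self):
        # Regression test: known data must keep producing the stored result.
        self.assertAlmostEqual(mean_area(KNOWN_AREAS), BASELINE_MEAN)


def run_tests():
    """Run the suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeanAreaTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests().wasSuccessful()` returns `True` when all three tests pass. Adding coverage for another method amounts to adding one more `test_...` method to the class.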

Break (10min)

Publication Tools (30min)

Article generation

RST to HTML

GitHub replication and sharing

Hands on:

Run dexy to generate the document
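
The dexy configuration used in the tutorial is not shown in this outline, and the sketch below does not use dexy. It only illustrates the underlying idea of article generation: the text of the report is produced from the analysis results rather than typed by hand. The function name and the sample row are assumptions; the output is a reStructuredText title followed by a simple table, which an RST-to-HTML tool could then convert.

```python
def rst_report(title, rows):
    """Render analysis results as a reStructuredText document.

    `rows` is a list of (image_name, particle_count) pairs.
    """
    name_width = max([len("Image")] + [len(name) for name, _ in rows])
    count_width = max([len("Particles")] + [len(str(n)) for _, n in rows])
    # Simple-table borders: one run of '=' per column, two spaces apart.
    rule = "=" * name_width + "  " + "=" * count_width
    lines = [
        title,
        "=" * len(title),  # title underline, same length as the title
        "",
        rule,
        f"{'Image':<{name_width}}  Particles",
        rule,
    ]
    for name, count in rows:
        lines.append(f"{name:<{name_width}}  {count}")
    lines.append(rule)
    return "\n".join(lines) + "\n"


print(rst_report("Results", [("drop_01.png", 3)]))
```

Because the table is regenerated from the data on every run, rerunning the pipeline on new images updates the article without manual edits.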

Reproducibility Verification (30min)

Reproducing Works

Publication of Positive and Negative results

Hands on:

Create Open Science Framework (OSF) project

Connect Figshare and GitHub to the OSF project

Fork or link another group’s project in the OSF to run dexy on
their work

Infrastructure:

Attendees will use software installed on their laptops to gather and
process data, then publish and share a reproducible report.

They will access repositories on GitHub, upload data to a repository, and
publish the materials necessary to replicate their data analysis.

We expect the wireless network to provide at least moderate bandwidth, so
that all attendees can move data, source code, and publications between
their laptops and the hosting servers.