One of the biggest unanswered questions in neuroscience today is the organization of the brain at the level of the neural microcircuits that form the basis of neural computation. Recently, due to technological and experimental advances in electron microscopy, it has become possible to investigate these networks through high-resolution imaging of the brain (Denk and Horstmann, 2004; Hayworth et al., 2006; Bock et al., 2011). For example, Bock et al. have recently imaged a 350 x 450 x 50 micron region of mouse cortex with 4 nanometer lateral resolution, a dataset sufficiently detailed to resolve every synaptic connection in the field of view (indeed, even vesicles are readily apparent). It has been estimated that imaging the whole brain at this resolution would require multiple exabytes; a single cubic millimeter occupies roughly 1 petabyte. Ultimately, to fully exploit these data, it is desirable to assign a label to each voxel indicating its identity and the structure to which it belongs.
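The storage figures above can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: the 45 nanometer section thickness, the 1 byte per voxel, and the function name `em_volume_bytes` are assumptions not stated in the text, and compression and tile overlap are ignored. Under these assumptions the imaged region comes to roughly 10 terabytes and a cubic millimeter to somewhat more than 1 petabyte, the same order of magnitude as the figures quoted here.

```python
# Back-of-the-envelope storage estimate for a serial-section EM volume.
# Assumptions (not given in the text): 8-bit grayscale (1 byte per voxel),
# 45 nm section thickness, no compression, no overlap between image tiles.

def em_volume_bytes(size_um, lateral_nm=4.0, section_nm=45.0, bytes_per_voxel=1):
    """Return the uncompressed size in bytes of a volume given as (x, y, z) in microns."""
    x_um, y_um, z_um = size_um
    nx = x_um * 1000.0 / lateral_nm   # pixels along x
    ny = y_um * 1000.0 / lateral_nm   # pixels along y
    nz = z_um * 1000.0 / section_nm   # number of sections along z
    return nx * ny * nz * bytes_per_voxel

bock_bytes = em_volume_bytes((350, 450, 50))          # region imaged by Bock et al.
cubic_mm_bytes = em_volume_bytes((1000, 1000, 1000))  # one cubic millimeter

print(f"350 x 450 x 50 micron region: {bock_bytes / 1e12:.1f} TB")
print(f"1 cubic millimeter:           {cubic_mm_bytes / 1e15:.2f} PB")
```

The estimate is most sensitive to the assumed section thickness: halving it doubles the number of sections and therefore the storage required.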

Clearly, while even collecting this type of data is an enormous task, interpreting and analyzing the data is far more difficult. It is infeasible to annotate this volume of data manually, and probably unrealistic to assume that any one group will devise a perfect automated solution. We are therefore working to provide universal access to this type of data via web services hosted at http://openconnectomeproject.org. More specifically, we are developing tools for both human (visualization) and computer (application programming interface, or API) access to the data. Granting global access will enable the largest possible community of image processing and machine learning experts to investigate the data and develop algorithms to annotate it. Unlike standard crowdsourcing endeavors, we aim to compile efforts from a variety of machine annotators, as opposed to human annotators, an approach we have dubbed “alg-sourcing” (for algorithm outsourcing). As different groups tackle different aspects of the problem with different approaches, we intend to aggregate the results and share the collective output, building towards our long-term vision of a fully annotated cortical volume.
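As a purely illustrative example of how output from several machine annotators might be aggregated, the sketch below applies a per-voxel majority vote. This is one simple possibility, not a description of the project's aggregation method. The function name `majority_vote` and the class codes (0 = unlabeled, 1 = membrane, 2 = synapse, 3 = vesicle) are hypothetical, and all annotators are assumed to share the same voxel grid and label vocabulary.

```python
from collections import Counter

def majority_vote(label_volumes, unlabeled=0):
    """Combine per-voxel class labels from several annotators by strict majority.

    label_volumes: list of equally long flat sequences of integer labels,
    one sequence per annotator. Voxels on which no label wins more than
    half of the votes are marked as `unlabeled`.
    """
    n = len(label_volumes)
    if n == 0:
        raise ValueError("at least one annotation is required")
    if any(len(v) != len(label_volumes[0]) for v in label_volumes):
        raise ValueError("all annotations must cover the same voxels")

    consensus = []
    for votes in zip(*label_volumes):  # votes for one voxel across annotators
        label, count = Counter(votes).most_common(1)[0]
        consensus.append(label if 2 * count > n else unlabeled)
    return consensus

# Three hypothetical annotators labeling the same five voxels.
annotator_a = [1, 1, 2, 3, 0]
annotator_b = [1, 2, 2, 3, 1]
annotator_c = [1, 1, 2, 0, 2]

consensus = majority_vote([annotator_a, annotator_b, annotator_c])
print(consensus)
```

A strict majority is a deliberately conservative choice: voxels on which the annotators disagree remain unlabeled rather than receiving an arbitrary label.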

Our project is being initialized with two datasets: (1) the 12 TB dataset from Bock et al. described above, and (2) a >600 GB dataset from Kasthuri and Lichtman (unpublished; spatial resolution: 3 x 3 x 29 nanometers). Panning, zooming, and manual annotation are made possible via a web-based graphical user interface called CATMAID (Saalfeld et al., 2009). An API for two-dimensional analysis of the data, including downloading arbitrary image planes and uploading planar annotations to the shared repository, is in progress. An additional server for three-dimensional representation of the data is being built, along with an API for downloading volumes and uploading volumetric annotations. Graphics processing unit (GPU)-enabled software will allow for visualizing arbitrary rotations of the data in three dimensions, overlaid with the annotations. All of the services are designed to scale up to petabytes and beyond, and all of the code we develop will be released as open source.

In conclusion, the Open Connectome Project is gearing up for massive polyscience, i.e. science collectively conducted by a large group of individuals. This marks a radical departure from the typical scientific workflow, in which raw data are kept local until results are released, and will hopefully usher in a new era of understanding about the brain.