Members of the Norwegian, German and Polish INCF Node communities have set up a successful collaboration on validation of spike sorting methods. A paper describing the resulting tools and services was recently published in Current Opinion in Neurobiology.

The new generation of multi-electrodes, with the opportunity to record hundreds of spike trains at once, offers researchers a more detailed view of the activity in brain networks than ever before. But the crucial step, disentangling which neuron produced which spikes, still depends heavily on manual input in many of the processing steps. This makes the spike sorting process time-consuming as well as variable between labs, and the methods involved are usually not quantitatively validated. "No spike-sorting method will ever be perfect, so it is important to know how accurate they are and under what circumstances they fail," says Gaute Einevoll, lead author of a recent paper in Current Opinion in Neurobiology describing how a collaboration between the Norwegian, German and Polish INCF Node communities spurred the development of a user-friendly service for validating and benchmarking spike sorting methods.

The collaboration started in 2009, when Gaute Einevoll and Espen Hagen in Ås found out that they and Felix Franke (working with Klaus Obermayer in Berlin) were involved in similar projects, namely promoting the testing and validation of automatic spike-sorting methods. In Berlin, Felix had already set up a first version of a collaborative website where researchers with algorithms could meet researchers with data.

Following a suggestion by Gyuri Buzsaki at the 1st INCF congress in Stockholm in 2008, our group in Ås had secured funding from the NevroNor program of the Research Council of Norway for a project to make test data tailored for the development and testing of spike-sorting algorithms, and to further stimulate an international collaborative effort to address the problem. We decided to pool our resources and met several times during 2010 to work together on specific aspects of the project and to plan how to get more of the international research community involved.

After attending a "very useful" workshop on the challenge of spike sorting arranged by Buzsaki and Dima Rinberg at Janelia Farm in May 2010, the collaborators decided to follow up by arranging a workshop on Validation of Automatic Spike-Sorting Methods at Ski outside Oslo in May 2011. Thomas Wachtler and Andrey Sobolev at the German INCF Node (G-node) got involved along the way, and at the workshop in Ski the group decided that the new website would be hosted and maintained by G-node: www.g-node.org/spike. A preliminary version is already up, and the full version is expected to launch in December.

Several other people at G-node are now also involved. The Polish node, hosted by the Nencki Institute in Warsaw, joined when Szymon Leski spent a year in our lab at Ås, working together with Henrik Linden (now with Anders Lansner at KTH) and Espen on developing a tool called LFPy, which Espen now uses to generate the spike-sorting test data. Right now we are finalizing LFPy and writing up a paper on it, and we plan to release the first version next year.

Using LFPy, a Python toolbox designed specifically for simulating extracellular field potentials, which runs on top of the NEURON simulator, the group can take advantage of a biophysical forward-modeling scheme to calculate quite realistic neuronal spike waveforms for all kinds of neurons under all kinds of conditions. Since the spike data are generated by models rather than by unidentified neurons, the "ground truth", that is, the true underlying spike trains, is completely known and can be used to validate the various spike-sorting methods.
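The forward-modeling scheme rests on volume-conductor theory: in the simplest point-source approximation, the extracellular potential at an electrode is the sum of each compartment's transmembrane current I_n(t) weighted by 1/(4πσr_n), where r_n is the compartment's distance to the electrode and σ the extracellular conductivity. The sketch below illustrates this principle with NumPy and made-up currents; it is not LFPy's actual API (the toolbox was unreleased at the time of writing), just a minimal illustration of the idea.

```python
import numpy as np

def point_source_potential(imem, pos, electrode, sigma=0.3):
    """Point-source forward model: phi(t) = sum_n I_n(t) / (4*pi*sigma*r_n).

    imem:      (n_sources, n_timesteps) transmembrane currents [nA]
    pos:       (n_sources, 3) source positions [um]
    electrode: (3,) electrode position [um]
    sigma:     extracellular conductivity [S/m]
    With nA and um as above, the result comes out in mV.
    """
    r = np.linalg.norm(pos - electrode, axis=1)          # distances [um]
    return (imem / (4 * np.pi * sigma * r[:, None])).sum(axis=0)

# Toy example: a current dipole (the currents sum to zero at every time
# step, as current conservation requires for a whole neuron).
t = np.linspace(0.0, 1.0, 100)
imem = np.vstack([np.sin(2 * np.pi * t), -np.sin(2 * np.pi * t)])
pos = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 20.0]])

# An electrode equidistant from the two sources sees the equal and
# opposite contributions cancel exactly.
phi = point_source_potential(imem, pos, electrode=np.array([30.0, 0.0, 10.0]))
```

Moving the electrode closer to one source than the other breaks the symmetry and yields a nonzero waveform, which is how electrode position shapes the recorded spike waveform in this kind of model.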

At the moment, spike sorting typically involves a large manual component, making the procedure labor-intensive and unreliable. Further, different labs often use their own methods, making it difficult to compare results. The spike-sorting procedures in use today have also not been validated quantitatively, at least not comprehensively, which makes their accuracy difficult to assess and the methods difficult to trust.
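To make "quantitative validation" concrete: with ground-truth spike times available, a sorter's output can be scored by matching each detected spike to a true spike within a small tolerance window and counting hits, misses and false positives. The snippet below is a hypothetical illustration of such a scoring rule, not a method from the paper; the spike times and tolerance are made up.

```python
import numpy as np

def match_spikes(ground_truth, detected, tol=1.0):
    """Greedily match detected spike times (ms) to ground-truth times
    within +/- tol ms; each ground-truth spike is matched at most once.
    Returns (hits, misses, false_positives)."""
    unmatched = list(ground_truth)
    hits = 0
    for s in detected:
        if unmatched:
            # nearest still-unmatched ground-truth spike
            j = int(np.argmin([abs(g - s) for g in unmatched]))
            if abs(unmatched[j] - s) <= tol:
                hits += 1
                unmatched.pop(j)
    misses = len(unmatched)
    false_positives = len(detected) - hits
    return hits, misses, false_positives

# Hypothetical sorter output: one true spike missed (40.0 ms) and one
# spurious detection (70.0 ms).
truth = [10.0, 25.0, 40.0, 55.0]
detected = [10.2, 24.6, 54.8, 70.0]
hits, misses, fps = match_spikes(truth, detected, tol=1.0)  # -> (3, 1, 1)
```

From such counts one can derive summary scores (e.g. hit rate or false-positive rate) and compare sorters on the same simulated data set.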

The key motivation for the project is to support ongoing research on the dynamical properties of neural networks. Allowing researchers to reliably automate the spike-sorting process would make analyses from different labs easier to replicate, but most importantly, it would make the analysis process scalable in a way that is not possible as long as the manual bottleneck remains.

So far, the initiative has had very positive responses, the group says. The main challenge ahead is to get people to use and take advantage of the opportunities offered by the new collaborative website. They are therefore currently thinking of ways to give credit to people who contribute to the website by providing test data or ready-to-run spike-sorting algorithms. Asked if their involvement in the INCF community and activities has helped the project in any way, they reply:

Absolutely. For one, the involvement in INCF has made us more aware of the need for robust and validated neuroinformatics tools for the analysis of neurobiological data. But an even more important role of INCF in this project is the critical involvement of G-node in setting up the collaborative website, and their stated long-term commitment to host and maintain it. This long-term perspective on the necessary neuroinformatics infrastructure increases the credibility of the project and likely also the willingness of the research community to contribute to it, as well as our motivation to work on this project.