Getting a Bigger 'N' for Studies of Rare Conditions

I get a kick out of seeing "ordinary" information technologies make new kinds of science possible. For example, the ability to move and share digital data is a pretty mundane topic in computer science circles. The mass market entertainment industry (think: iTunes, Netflix) has even brought streaming data and multi-megabyte files into our living rooms--for fun! Despite this dramatic adoption and transformation in some industries, there are still plenty of areas where the effects are being felt for the first time.

When a bunch of neuroscientists begin sharing data, it can still have a huge impact, both on the rate of their discoveries and on the nature of their day-to-day work. The FBIRN functional MRI collaboration, for instance, is enabling studies of rare neurological conditions such as schizophrenia by increasing the sample sizes of quantitative data available to individual researchers. Traditionally, sample sizes for these studies are limited to the number of people with a rare condition who can be physically transported to a single location where a magnetic resonance imaging (MRI) scanner is located. (It is difficult to compare scans from different scanners using quantitative analyses. See some sample images here.) This number is often too low to allow meaningful analysis of rare conditions.

FBIRN is intentionally comparing scans from several institutions (often of the same human subject) in order to develop a methodology for normalizing scans from many sites. Once normalized, the scans can be quantitatively compared and used together in studies, with a high degree of confidence that the variations are real and not artifacts of the scanners. This increases the number of human subjects available for a study, which means that we can be more confident in its results.

In addition to making it possible for scientists to conduct studies of rare conditions, this has also introduced a new challenge in the lives of the scientists: how to get the scan data from each site copied to other sites for comparisons, quality checking, and scientific analysis. This is a “new problem” for neuroscientists, who are more accustomed to producing and using their own data. FBIRN is currently using the same GridFTP technology that is used by Globus Online.

When FBIRN began using GridFTP in 2009 (before Globus Online was available), we encountered several minor issues with the technology and with the underlying networking infrastructure. These issues consumed a lot of the researchers’ time as they became familiar with sharing data on a large scale. The management apparatus that FBIRN built to compensate for these issues looks a great deal like Globus Online! Now, Globus Online is able to handle these issues automatically, on a large scale and for many scientific teams, without the scientists getting personally involved in debugging the problems or building solutions to them. This will further accelerate FBIRN’s dramatic progress in its quest to make multi-site MRI studies a reality on which all neuroscientists can build.
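
For readers who like to see the "bigger N" argument in numbers, here is a minimal sketch of why pooling subjects across sites helps. It is not FBIRN's methodology: the measurements are simulated, the site count, subjects per site, mean, and spread are all made-up values, and the sketch simply assumes the scans have already been normalized so that they can be pooled. It shows only the general statistical point that the standard error of a mean shrinks roughly with the square root of the sample size.

```python
import math
import random
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def simulate_site(n_subjects, rng, true_mean=100.0, spread=15.0):
    """Draw hypothetical, already-normalized measurements for one site.

    true_mean and spread are illustrative assumptions, not real MRI values.
    """
    return [rng.gauss(true_mean, spread) for _ in range(n_subjects)]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumption: 8 sites, each able to scan only 12 subjects with the condition.
sites = [simulate_site(12, rng) for _ in range(8)]

single_site = sites[0]
pooled = [value for site in sites for value in site]

print(f"single site: n={len(single_site)}, "
      f"standard error={standard_error(single_site):.2f}")
print(f"pooled:      n={len(pooled)}, "
      f"standard error={standard_error(pooled):.2f}")
```

With eight times as many subjects, the pooled standard error should come out at roughly a third of the typical single-site value, which is the kind of gain that makes a study of a rare condition worth running at all. The catch, of course, is the assumption baked into the sketch: pooling only works if cross-site differences have been normalized away first.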