A new database pools health registry data from seven countries, dramatically boosting sample sizes for epidemiological studies of autism. The virtual tool, built by an international consortium of researchers, allows them to effectively compare data across populations.

“This is a first for autism,” says Diana Schendel, professor of psychiatric epidemiology at Aarhus University in Denmark, who spearheaded the project. Dubbed the International Collaboration for Autism Registry Epidemiology (iCARE), the project yokes together data from population-based health registries in Denmark, Finland, Israel, Norway, Sweden and Western Australia, as well as data from California.

National health registries are a boon for epidemiological research — both in autism and other fields. But there have been few attempts to combine data from different countries, in part because of strict regulations that prohibit the information from being transported outside the countries of origin.

“In principle, this kind of data collaboration should have been done a long time ago, but no one has been ambitious enough to tackle the logistical and technical challenges,” says Brian Lee, assistant professor of epidemiology and biostatistics at Drexel University in Philadelphia. Lee has conducted autism studies using the Swedish and Danish national registries individually, but has not combined the data. He is not a member of iCARE.

The resource was launched in May 2009 with the help of a four-year, $1.2 million grant from the research and advocacy organization Autism Speaks. It pools data on more than 80,000 individuals with autism, drawn from a total of about 10.8 million births between 1967 and 2009, and includes variables such as birth weight, birth order and age at diagnosis.

Most analyses are likely to focus on a smaller range of birth years in which the data of interest are represented across all the sites.

“iCARE can be analyzed on an iPad at Starbucks anywhere around the globe, and it’s safe and secure,” says Abraham Reichenberg, professor of psychiatry at the Icahn School of Medicine at Mount Sinai in New York and a member of the consortium. The group is open to considering applications from researchers who have ideas for using the resource, he adds.

Virtual pool:
Researchers often share large datasets by physically transferring them to a central location, but this was not possible for the national and state health registries iCARE intended to combine. The group’s first task was to devise a simple way for everyone to access the data that would still allow researchers complete control of the database within their home country.

“When we started out, it was very much, ‘Let’s see if we can do this,’” says Kim Carter, associate professor of bioinformatics at the Telethon Institute for Child Health Research in Perth, Australia, one of the researchers who led the development of the data analysis system.

The solution involved a database technique called federation, in which the system virtually pools data from the various sites for the duration of a given analysis session. Once the analysis is complete, the researcher can download the results, but the pooled dataset disappears: it is never saved to the researcher’s computer, and nothing is altered at any of the original sites, allaying ethical and data privacy concerns.
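
For readers who want a concrete picture of the idea, here is a minimal Python sketch of a session-scoped pool. It is an illustration only, not iCARE's actual system: the site names, the record fields and the `federated_session` and `prevalence` helpers are all hypothetical, and the numbers are made up.

```python
from contextlib import contextmanager

# Hypothetical per-site records (made-up values, not registry data).
SITES = {
    "site_a": [
        {"birth_year": 1990, "autism": True},
        {"birth_year": 1991, "autism": False},
        {"birth_year": 1992, "autism": False},
    ],
    "site_b": [
        {"birth_year": 1990, "autism": False},
        {"birth_year": 1993, "autism": True},
    ],
}


@contextmanager
def federated_session(sites):
    """Yield a temporary pooled view of all sites, then discard it."""
    # Copy each row so the pooled view never touches the site's own data.
    pooled = [dict(row, site=name) for name, rows in sites.items() for row in rows]
    try:
        yield pooled
    finally:
        # The pooled dataset exists only for the length of the session.
        pooled.clear()


def prevalence(rows):
    """Share of records flagged with an autism diagnosis."""
    cases = sum(1 for row in rows if row["autism"])
    return cases / len(rows)


with federated_session(SITES) as pool:
    result = prevalence(pool)  # only this summary leaves the session

print(result)     # the downloadable analysis result
print(len(pool))  # the pooled view is empty once the session ends
```

In this toy version the only thing that outlives the session is the summary statistic, and the original site records are left exactly as they were.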

Another challenge is harmonizing the different datasets, says Ezra Susser, professor of epidemiology and psychiatry at Columbia University in New York, who played a key role in bringing together the iCARE team. “Even a simple variable like birth weight or gestational week can have different meanings across registries,” he says.

Although none of the analyses have been published yet, iCARE offers a much more fine-grained look at many variables than was previously possible, Schendel says. For example, studies of the effects of parental age on autism have been largely limited to broad age categories, but iCARE’s pooled analysis includes enough individuals to examine the risk of autism for particular maternal and paternal ages. The analysis is already uncovering surprising distributions of risk across certain ages, which would never have emerged from analyses using single databases, she says.
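
To show what moving from broad age categories to particular ages can look like in practice, the following Python sketch tallies risk for each exact age rather than for a band such as "30 to 35." It assumes a hypothetical list of (maternal age, diagnosis) pairs with made-up values; the `risk_by_age` helper is illustrative and says nothing about how the consortium runs its analyses or what it has found.

```python
from collections import defaultdict


def risk_by_age(records):
    """Return {age: cases / births} for each exact parental age in years."""
    births = defaultdict(int)
    cases = defaultdict(int)
    for age, has_autism in records:
        births[age] += 1
        if has_autism:
            cases[age] += 1
    return {age: cases[age] / births[age] for age in sorted(births)}


# Made-up records: (maternal age at birth, autism diagnosis).
records = [
    (30, False), (30, True),
    (31, False), (31, False),
    (35, True), (35, False), (35, False), (35, False),
]

risks = risk_by_age(records)
for age, risk in risks.items():
    print(age, risk)
```

With only a handful of births at each age, estimates like these would be far too noisy to interpret, which is why per-age analysis depends on the kind of sample sizes a pooled resource provides.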

Although the group’s initial grant support is winding down, the infrastructure will remain operational for five more years.

Reichenberg is leading a new project built on the iCARE infrastructure that will include many of the same researchers. Funded last year with a network grant from the Autism Centers of Excellence program at the National Institutes of Health, that effort will take a multigenerational look at potential risk factors for autism, such as whether exposure to various medications during pregnancy is associated with autism in the child.

Participants in the collaboration say that once the publications start to flow, the research community will take notice. “One of the steps we need to take in epidemiology is what was done in genetics, to create these repositories so that you can combine samples and get much, much larger numbers,” says Susser. “It’s where the future needs to go.”