Call it the end of creative individualism. (Just kidding!) But seriously, to hear leading scientists tell it, the field of Alzheimer disease research has reached a point where, on the translational goal of cerebrospinal fluid diagnosis, it’s time for individual centers to set aside their local modus operandi and agree to march in lockstep with the collective. Teasing aside, this is a story about how differences among individual centers’ CSF Aβ and tau measurements are posing a serious problem just as the field prepares to deploy those measures for multicenter drug trials in the earliest pre-dementia stages of AD, trials that critically depend on CSF testing. It is as much a story about a potential solution. A large, external quality-control initiative led by Kaj Blennow at Sahlgrenska University Hospital in Mölndal near Göteborg, Sweden, began operating last month. It offers a standard protocol, and it welcomes centers around the world to join. (It’s free, too.)

Funded by the Alzheimer’s Association, the initiative aims to reduce site-to-site and batch-to-batch CSF test variation. Broadly speaking, this is intended to help CSF diagnostics mature from single-center testing (aka “It works fine here!”) into robust procedures that yield the same results everywhere. This is necessary, research leaders in academia and industry agree, to enable direct comparison of data in large research databases and therapeutic trials.

The problem and proposed solution are not unique to AD research. Standardization coupled with external quality control (QC) has brought about consistency and uniform cutoff values in other areas of medicine, such as blood glucose or hormone testing, and the same can be done for AD, researchers say. Moreover, second-generation biomarker candidates now beginning to emerge from basic proteomics research in CSF and plasma will likely face the same challenge once they have more research evidence behind them. At that point they could be added to the ongoing external QC program, said Blennow’s colleague Henrik Zetterberg of Sahlgrenska University Hospital.

Fine points in protein measurement, technical details of sample collection, analytical procedure, and test manufacturing—oh, a writer can fairly hear those readers eager for the big new gene or clever new concept quietly sigh, “boooring.” But wait—the topic is important because the devil in the details of CSF testing could make a hash of the field’s collective push to implement biomarker-enhanced diagnostic criteria for a prodromal diagnosis (e.g., Dubois et al., 2007). Likewise, it could derail expensive multicenter drug trials that use such criteria.

“As we start to use CSF biomarkers in clinical trials to either identify patient populations at risk of disease or to measure progression of disease, it is critically important that we have reproducibility and consistency among sites and investigators around the world. Without standardization of these assays, misinterpretation of data is possible if not likely, and interpretation of data among studies and sites will be challenging if not impossible,” Menelas Pangalos wrote to Alzforum. After Pfizer’s acquisition last month of Wyeth, Pangalos became Chief Scientific Officer of Pfizer’s Neuroscience Research Unit.

The issue came to a head at the International Conference on Alzheimer’s Disease (ICAD) held last July in Vienna, Austria. In his lecture, Howard Feldman of Bristol-Myers Squibb, describing the first such Phase 2 trial, cited reliability and quality control of the CSF component at the trial’s 75 study locations as his biggest concern. Also at ICAD, Niek Verwey of VU University Medical Center in Amsterdam, the Netherlands, presented results from his group’s recently published comparison of some 18 centers in Europe and the U.S. Established labs in the field had measured the Aβ, tau, and phospho-tau content of identical test samples with mean variations of up to 37 percent (Verwey et al., 2009; ARF related news story). And in a plenary lecture, Zetterberg reported that a massive clinical study, in which 12 international centers in Europe and the U.S. followed 1,583 people for two years, had achieved lower accuracy in predicting incipient dementia within its MCI cohort based on CSF testing than single-center studies typically do (Mattsson et al., 2009).

“As CSF measurement of protein levels in the brain becomes more validated as a biomarker of early identification of AD, it is disturbing that when different labs take those measures you may not get robust and comparable results,” Maria Carrillo of the Alzheimer’s Association told this reporter. The problem is acute because variation of this magnitude—in the 20, even 30 percent range—may be as large as the hoped-for effect of the treatment under trial, effectively swallowing up a potential efficacy signal. Scientists generally agree they have to keep test variation reliably below 10 percent.
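For readers who want to see the arithmetic behind that concern, here is a back-of-the-envelope sketch in Python. The numbers are entirely illustrative, not drawn from any of the studies cited; the point is only how inter-lab scatter on one identical sample, summarized as a coefficient of variation (CV), stacks up against a treatment effect of similar size:

```python
# Illustrative only: hypothetical Abeta42 readings (pg/mL) of one pooled
# CSF sample as measured by five different labs.
import statistics

readings = [350.0, 510.0, 680.0, 420.0, 590.0]  # hypothetical values

mean = statistics.mean(readings)
sd = statistics.pstdev(readings)       # spread across the labs
cv_percent = 100.0 * sd / mean         # coefficient of variation

print(f"mean = {mean:.0f} pg/mL, inter-lab CV = {cv_percent:.0f}%")

# A treatment expected to shift Abeta42 by ~20 percent would move the
# mean by this much -- comparable to the inter-lab scatter itself:
effect = 0.20 * mean
print(f"treatment effect ~{effect:.0f} pg/mL vs. inter-lab SD of {sd:.0f} pg/mL")
```

With a CV in the 20-plus percent range, as in this toy example, the noise between labs is about as large as the signal a trial is trying to detect, which is why the field wants variation reliably under 10 percent.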

At ICAD in Vienna, Zetterberg called for worldwide standardization of procedures in observational and therapeutic studies. And things have moved quickly. Carrillo invited some 20 experts in AD fluid biomarkers to meet there, hoping to gauge their interest in creating, and then validating, a standard protocol for CSF collection and analysis. Sixty-six people came. The crowd included many pharma researchers whose companies are increasingly collecting CSF in their trials. “Everyone was for the idea of doing this together. The energy in the room was phenomenal,” Carrillo recalled. Because this many heads don’t make a nimble planning group, a handful of representatives from leading institutions formed an e-mail working group that, over the following month, created a consensus protocol for sample collection, handling, and analysis. By the time the Society for Neuroscience meeting rolled around last month, a grant proposal to validate this protocol in an external QC program had been funded, and some 31 centers in academia, pharma, and biotech had signed up to participate, Carrillo said in Chicago. Already in the last week of October, Blennow’s group began sending out QC samples of pooled CSF to the participating centers in 12 countries and to three reference laboratories. The program is open to additional participants; for inquiries, contact the QC Programme Coordinator at neurochem@neuro.gu.se or Henrik Zetterberg at henrik.zetterberg@clinchem.gu.se. See the Neurochemical Pathophysiology and Diagnostics Research Unit for more information.

Blennow and others interviewed for this article said they hope the QC program will help participating labs synchronize their procedures and enable them to see how their local performance of a given assay compares relative to an independently established reference range for that assay. If run continuously, the program may also spur test manufacturers to minimize batch-to-batch variation of the assay kits they sell. In addition, the program might offer a new quality claim for studies worldwide, whereby a study would demonstrate in its methods section that its tests were performed within the range established by this external quality control. There are hopes and early discussion about using this QC program to help the Food and Drug Administration validate testing labs faster, Carrillo added.

This worldwide QC program is independent of the Alzheimer’s Disease Neuroimaging Initiative (ADNI), the 800-pound gorilla of AD biomarker studies, which is itself expanding around the globe. The two initiatives are linked indirectly, though. The QC program incorporates some steps toward standardization that ADNI1 has already worked out, and vice versa: the results of the QC program may inform which assay is eventually chosen for use in ADNI2, Carrillo said. The grant for ADNI2 was submitted in October; if it is funded, enrollment will begin a year from now. Several CSF tests are in wide use, and some scientific discussion surrounds their relative strengths and weaknesses. The QC program includes them all and, over the course of the next year, is expected to show which ones perform most reliably across sites. “Robustness of CSF measurement is a necessity as we move forward into global trials, including ADNI2 and any other biomarker efforts that take place across the world,” Carrillo said.

Importantly, the QC program offers one additional service, Zetterberg told ARF. For QC purposes, the Göteborg group has established large pools of carefully calibrated CSF reference samples. They span the range of Aβ, tau, and phospho-tau concentrations scientists can expect to encounter in their studies. Besides supplying the QC protocol validation program, this reference CSF is also available upon request to investigators who want to run it alongside study samples in their drug trials. The Göteborg group is known for using such pooled samples as internal controls for their own research and for service measurement of research and clinical samples that are sent to them from across Sweden. Building on that, the group has made enough reference CSF to cover entire therapeutic trials by outside drug sponsors, Blennow said. If each study plate included a QC sample, investigators could monitor the performance of their sites and fix any problems the QC sample might flag. Sponsors could also normalize the data to the QC standard; this would make it easier to compare separate trials to each other. “This is the newest part of the QC program. It offers a practical way of dealing with the variation even now in ongoing studies, before the field has achieved worldwide standardization,” Zetterberg added.

This QC program comes free of charge to participating labs. In contrast, ongoing QC programs for other assays, such as glucose, hormones, liver enzymes, cardiac and tumor markers, etc., can cost participating labs a pretty penny. “This is an important service from the Alzheimer’s Association,” Blennow wrote. What’s free, exactly? Receipt, every three months, of QC samples that participating sites can analyze as part of their routine assays or planned studies, as well as inclusion in the ongoing QC reference analysis.

“The QC program is the right way forward,” concluded Amsterdam’s Verwey, who was among the first researchers to put his finger on the problem when he compared site performance and published the results. To learn where the variation arises and exactly how the QC program will work, see Part 2 and Part 3—Gabrielle Strobel.