A critical goal of Alzheimer disease research is to identify disease biomarkers that can be used in clinical trials to assist in the adjudication of treatment effects. While clinical validation remains a goal for many potential Alzheimer disease biomarkers, the rapid proliferation of markers has sparked comparative efforts as well. New data acquisition methods and sophisticated image-processing algorithms are poised to substantially improve our ability to measure precisely the structure, function, connections, and chemical composition of regions within the living human brain. This commentary provides a perspective on a recently published paper and how it illustrates progress and challenges in the field.

A decade ago, Nick Fox and colleagues [1] used sample size estimates for hypothetical disease-modifying clinical trials to call attention to the value of magnetic resonance imaging (MRI)-based biomarkers for Alzheimer disease (AD). Since then, this approach has been employed to demonstrate the potential value of new methods for measuring anatomic, metabolic, and other putative AD biomarkers [2]. A new paper illustrates progress and challenges in this area [3].

The authors measured longitudinal change in cortical and subcortical volume with an interesting new method that takes advantage of a precise image registration algorithm (comparable to tensor-based morphometry, or TBM). A so-called volume-change field is produced at each voxel and is then averaged over a set of a priori-defined anatomic regions of interest (ROIs), some of them large, to obtain the percentage change from baseline. This averaging step is somewhat puzzling, however, particularly given the stated precision of the registration and the conception of this method as providing 'subregion' measures. The a priori atlas [4] provides a valuable service in that it does an excellent job of automating the cumbersome process of identifying neuroanatomic ROIs in individual scans. However, many of the cortical ROIs are quite large, and most neurodegenerative diseases do not respect the anatomic boundaries of these ROIs: only subregions tend to be affected, and some effects span multiple ROIs. Thus, the power of this precise registration method may be reduced by constraining it to anatomic ROIs rather than to ROIs generated from the known effects of AD itself ('disease signature' effects), such as have been described for both MRI [5, 6] and fluorodeoxyglucose-positron emission tomography (FDG-PET) data [7]. This point is illustrated in a recent TBM study [8] showing that an anatomically defined ROI in the temporal lobe required consistently larger sample sizes than a 'disease signature' ROI defined from an independent longitudinal AD patient sample (Table 1 of [8]).
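The dilution concern can be illustrated with a toy sketch. Everything here is hypothetical: the one-dimensional "image", the masks, the percentage values, and the function name `roi_mean_change` are invented for illustration and do not come from the paper under discussion or from any cited study. The sketch only shows how averaging a focal change over a large ROI shrinks the measured effect; it does not model noise.

```python
from statistics import mean


def roi_mean_change(change_field, roi_mask):
    """Average a per-voxel percentage volume change over the voxels in an ROI."""
    values = [change for change, inside in zip(change_field, roi_mask) if inside]
    return mean(values)


# Hypothetical 10-voxel image: percentage volume change from baseline per voxel.
# Only voxels 2 and 3 are affected by disease (assumed values).
change_field = [0.0, 0.0, -4.0, -4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# A large anatomic ROI covering voxels 0-7 (assumed atlas region).
anatomic_roi = [True] * 8 + [False] * 2

# A 'disease signature' ROI covering only the affected voxels 2-3 (assumed).
signature_roi = [False, False, True, True] + [False] * 6

anatomic_mean = roi_mean_change(change_field, anatomic_roi)
signature_mean = roi_mean_change(change_field, signature_roi)

print(f"anatomic ROI mean change:  {anatomic_mean:.1f}%")
print(f"signature ROI mean change: {signature_mean:.1f}%")
```

In this toy case the anatomic ROI reports a quarter of the change seen in the signature ROI, because six of its eight voxels are unaffected. Whether this translates into a loss of statistical power in real data also depends on how variability behaves in each ROI.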

Although differences in sample size estimation methods make it difficult to directly compare the present study [3] with that of Fox and colleagues (2000) [1], a cursory comparison of the whole-brain measure in AD patients in Table 1 of the present study (n = 189) with that in Table 1 of Fox and colleagues (n = 168) does not suggest an obvious advantage. With more closely matched sample size estimation calculations, comparison with the recent TBM study [8] demonstrates that the present method requires consistently larger sample sizes for both AD and mild cognitive impairment. (Compare Tables 1 and 2 of the present study [3] with Table 1 of the TBM study [8].) Nevertheless, the new method ultimately may make important contributions with further development.

How can we decide whether a new marker adds value? Sample size estimation is one approach since it reflects the size of the biologic or clinical effect of interest and its variability (subsuming both biologic and measurement variability).

One important challenge in comparing papers using such analyses is that many of the investigator-specified variables differ in addition to the parameters of relevance. Larger or smaller sample size estimates can be derived from the same measures simply by choosing different hypothetical drug effects. If sample size estimates are not recalculated using original data, it can be difficult to directly compare such measures, requiring readers to resort to comparing the atrophy rates and standard deviations, which also may be variably reported.
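A minimal sketch shows how strongly the investigator-specified drug effect drives the estimate. It uses the standard normal-approximation formula for a two-arm comparison of mean change, which is an assumption here and not necessarily the calculation used in any of the cited papers. The function name `sample_size_per_arm` and the marker values (mean annual atrophy of 2.0% with a standard deviation of 1.0%) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(mean_change, sd_change, effect_fraction,
                        alpha=0.05, power=0.80):
    """Subjects per arm for a two-arm trial comparing mean change.

    Normal-approximation formula for a two-sided test:
        n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / delta^2
    where delta = effect_fraction * mean_change is the absolute
    difference in mean change attributed to the hypothetical drug.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    delta = effect_fraction * mean_change
    return ceil(2 * (z_alpha + z_power) ** 2 * sd_change ** 2 / delta ** 2)


# Hypothetical marker: mean annual atrophy 2.0%, SD 1.0% (illustrative only).
for effect in (0.20, 0.25, 0.50):
    n = sample_size_per_arm(mean_change=2.0, sd_change=1.0,
                            effect_fraction=effect)
    print(f"{effect:.0%} slowing of atrophy -> {n} subjects per arm")
```

Because the required sample size scales with the inverse square of the assumed effect, the same marker looks several times "better" when a 50% slowing is assumed than when a 20% slowing is assumed. Published estimates are therefore comparable only when the assumed effect, significance level, and power are matched.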

In large part because of the profound advances in infrastructure and standards being developed by the Alzheimer's Disease Neuroimaging Initiative (ADNI), it is now possible to efficiently perform comparisons of increasingly sophisticated measures derived from computational processing of MRI and PET data. Yet it can still be difficult to compare measures because different subsets of subjects may be included in any analysis. Laurel Beckett, Danielle Harvey, and colleagues affiliated with the ADNI biostatistics core are finishing an analysis that emphasizes the need to compare markers on a common set of subjects and demonstrates a method not only for characterizing biomarkers but also for statistically testing for differences between measures. These important advances should provide a framework that makes it easier to determine the pros and cons of new imaging analytic methods, which are advancing in at least two domains.

First, they are becoming more refined with respect to anatomy. Since the pioneering efforts of the neuroanatomist Constantin von Economo, who not only exhaustively mapped cortical cytoarchitecture but also painstakingly measured the thickness of cortical regions and laminae [9], anatomists have been interested in measures of the size of different brain regions. Although anatomists and pathologists over the years have observed cortical thinning in AD [10, 11], cortical thickness has proven very challenging to measure in vivo. These challenges have been largely overcome in recent years through advanced computational procedures [12–14]. Since the volume of a gyral cortical region reflects both its thickness and surface area and since AD appears to affect thickness more prominently than surface area [15] (although this issue deserves further study), it stands to reason that measures of thickness may be particularly sensitive to neurodegenerative disease effects. It is somewhat surprising that computational methods perform as well as they do in detecting submillimeter disease effects with raw voxel sizes of at least 1 mm; the precision of these measures will undoubtedly improve as the resolution of MRI data acquisition improves.
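The thickness argument can be made explicit with a first-order sketch. It rests on a simplifying assumption, not drawn from the cited work, that the volume V of a cortical region is approximately the product of its mean thickness T and its surface area A:

```latex
V \approx T \cdot A
\quad\Longrightarrow\quad
\frac{\Delta V}{V} \approx \frac{\Delta T}{T} + \frac{\Delta A}{A}
```

Under this approximation, if disease mainly changes T, the area term adds little disease signal to the volume measure but still contributes between-subject and measurement variability. A thickness measure may therefore show a larger effect relative to its variability than the corresponding volume measure.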

Second, advanced methods for mapping the spatial patterns of disease will likely enhance our ability to differentiate the effects of one disease from those of another or from normal aging. A number of these methods are being developed; most derive from machine learning and pattern recognition algorithms of the kind increasingly used in applications such as face or voice recognition. Initial applications of these methods to AD and related disorders have been very promising [16–18], and these types of procedures will likely have an important impact on improving the specificity of imaging biomarkers in AD.

Ultimately, the goal of this research is to enable disease-modifying treatments to be identified more efficiently. A large and growing community of investigators in the field believes that we need not only to measure brain changes that can provide a glimpse of the benefits of a given intervention but also to recognize the complexities of linking these changes to clinical benefit [19].