Validation of Intersubject Warping

Validation of strategies for warping one subject onto another poses unique difficulties. Visual inspection can be extremely misleading when evaluating warping algorithms that have many degrees of freedom (sometimes one or more degrees of freedom per voxel), potentially allowing them to morph any shape into any other. Point landmarks identified by a single anatomist cannot be used to estimate errors in landmark identification because no constrained spatial transformation model exists to allow errors to be distinguished from true variability. In many regions of the brain, experts are likely to disagree about what constitutes the correct homology or may simply be unable to identify a reasonable homology. Cross-validation against a second method may show that at least one of the two methods has a large error, but if neither method has been well validated previously, this information is not very helpful. Most algorithms used for nonlinear modeling have been deliberately chosen because they are quite robust in identifying a unique (even if possibly incorrect) solution to the warping problem, so little is likely to be learned from restarting the algorithm from a new set of initial parameters. Simulations may give a clue that the spatial transformation model is overparameterized (the ability to warp some arbitrarily shaped object of equivalent topology into the form of a brain is not necessarily a good thing), but are otherwise likely to reflect the similarities of modeling assumptions (especially the form of the spatial transformation model) between the simulation and the registration method. Phantoms offer no special advantages in validating intersubject warping since morphometry, not movement, is the main source of difficulty.

Despite the potential problems associated with expert identification of landmarks, this is the most convincing strategy for validation of warping algorithms. Demonstration that a warping algorithm approximates the accuracy of experts is an important first step in validation. Since experts may disagree about what constitutes a homology in certain brain regions, the degree of agreement between well-trained experts is a reasonable metric against which a warping algorithm can be calibrated. Some warping algorithms do not seek to exceed or even match the performance of experts in any particular region, but instead are intended to give a reasonable level of registration accuracy in an automated fashion. Validation against anatomic landmarks identified by a single expert is often sufficient for showing that these methods meet their objective. An example of this type of validation is given in Woods et al. [13]. For methods that use many degrees of freedom, anatomic validation is essential for demonstrating that the apparently good registration seen on visual inspection is due to true superimposition of anatomic homologues and not to the morphing of structures to create the appearance of homology. This means that the landmarks must be identified in the original images and then tracked through all steps of the warping transformation for comparison to one another in a common frame of reference.
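The calibration described above can be sketched in a few lines. The following is a minimal illustration, not the procedure used in any cited study: the function names, the two-step toy warp, and all coordinates are hypothetical. Landmarks identified in the original source image are tracked through every step of the warp into the target's frame of reference, and the resulting error is compared with the disagreement between two experts who marked the same landmarks in the target.

```python
import math

def mean_landmark_error(points_a, points_b):
    """Mean Euclidean distance between paired landmarks (same units as the coordinates)."""
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("landmark lists must be non-empty and paired")
    return sum(math.dist(p, q) for p, q in zip(points_a, points_b)) / len(points_a)

def track_landmarks(points, transform_steps):
    """Carry each landmark through every step of the warp, in order."""
    tracked = []
    for p in points:
        for step in transform_steps:
            p = step(p)
        tracked.append(p)
    return tracked

# Hypothetical warp steps, chosen only so the arithmetic is easy to follow.
def shift_x(p):
    return (p[0] + 1.0, p[1], p[2])

def unit_scale(p):
    return tuple(1.0 * c for c in p)

# Illustrative coordinates (not real data).
source_landmarks = [(10.0, 20.0, 30.0), (40.0, 50.0, 60.0)]  # marked in the source image
expert1_target = [(11.0, 20.0, 32.0), (41.0, 50.0, 60.0)]    # expert 1, target image
expert2_target = [(11.0, 23.0, 36.0), (41.0, 50.0, 60.0)]    # expert 2, target image

warped = track_landmarks(source_landmarks, [shift_x, unit_scale])
algorithm_error = mean_landmark_error(warped, expert1_target)
expert_disagreement = mean_landmark_error(expert1_target, expert2_target)

# A ratio near or below 1 suggests the algorithm approximates expert-level agreement.
ratio = algorithm_error / expert_disagreement
print(algorithm_error, expert_disagreement, ratio)
```

In practice the warp steps would be the actual transformations produced by the registration method, and the comparison would typically be reported per region, since expert agreement varies across the brain.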

For comparisons of different methods, care must be taken to ensure that the space in which errors are measured is comparable. Otherwise, a method that shrinks all images will appear to perform better than it actually does, because the distances between landmarks, and therefore the measured errors, shrink along with the images. One strategy for dealing with this problem is to map the anatomic landmarks of each individual used for validation into the unwarped original brain images of a randomly selected representative subset of these individuals, averaging the errors across the subset. For methods that use an atlas as the warping target, each landmark would be mapped into the atlas using a forward transformation and then back out into the selected subset using reverse transformations. This strategy effectively prevents the size and shape of the atlas from influencing the metric used for validation.
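A toy sketch of this round-trip strategy follows. Everything in it is assumed for illustration: the transforms are uniform scalings about the origin, each subject has a single landmark, and the helper names are hypothetical. It compares an error measured in atlas space with an error measured in the subjects' original image spaces for two "methods" that differ only in how much they shrink the images.

```python
import itertools
import math
import random

def make_scaling(factor):
    """Toy forward (subject -> atlas) and reverse (atlas -> subject) transform pair."""
    def forward(p):
        return tuple(factor * c for c in p)
    def reverse(p):
        return tuple(c / factor for c in p)
    return forward, reverse

def atlas_space_error(landmarks, forward):
    """Mean distance between homologous landmarks after mapping all subjects into the atlas."""
    errors = []
    for a, b in itertools.combinations(sorted(landmarks), 2):
        for p, q in zip(landmarks[a], landmarks[b]):
            errors.append(math.dist(forward[a](p), forward[b](q)))
    return sum(errors) / len(errors)

def native_space_error(landmarks, forward, reverse, reference_ids):
    """Map each subject's landmarks into the atlas, then back out into each
    reference subject's original image, and average the errors there."""
    errors = []
    for ref in reference_ids:
        for subj in sorted(landmarks):
            if subj == ref:
                continue
            for p, q in zip(landmarks[subj], landmarks[ref]):
                mapped = reverse[ref](forward[subj](p))
                errors.append(math.dist(mapped, q))
    return sum(errors) / len(errors)

# Illustrative landmarks (not real data): one homologous point per subject.
landmarks = {"A": [(10.0, 0.0, 0.0)], "B": [(12.0, 0.0, 0.0)]}

# Randomly selected reference subset (here it happens to contain both subjects).
reference_ids = random.Random(0).sample(sorted(landmarks), 2)

results = {}
for name, factor in [("no_shrink", 1.0), ("shrink_by_half", 0.5)]:
    pairs = {s: make_scaling(factor) for s in landmarks}
    forward = {s: pairs[s][0] for s in pairs}
    reverse = {s: pairs[s][1] for s in pairs}
    results[name] = (
        atlas_space_error(landmarks, forward),
        native_space_error(landmarks, forward, reverse, reference_ids),
    )

print(results)
```

In this toy case the shrinking method halves the error measured in atlas space while the error measured in the original image spaces is identical for both methods, which is the property the round-trip metric is intended to have.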

New and better ways of defining anatomic homologies are likely to play an important role in future intersubject warping validation studies. Microscopic examination of postmortem brain specimens can identify features that uniquely label certain regions. Although this work is very laborious, efforts are ongoing to identify such features and to map them precisely back into MRI images obtained before or shortly after death (see the chapter entitled "Image Registration and the Construction of Multidimensional Brain Atlases"). Functional imaging techniques are also likely to provide imaging signatures that uniquely identify specific brain regions. Work is underway in our laboratory to identify functional imaging tasks and paradigms that can be used for this purpose. The fact that intersubject warping is essentially a biological question ensures that warping methods will continue to evolve and improve as our ability to identify homologies improves and as our understanding of biological processes that lead to intersubject variability deepens.