Abstract

Optical coherence tomography (OCT) is the de facto standard imaging modality for ophthalmological assessment of retinal disease, and is of increasing importance in the study of neurological disorders. Quantification of the thicknesses of the various retinal layers within the macular cube provides unique diagnostic insights for many diseases, but the capability for automatic segmentation and quantification remains quite limited. While manual segmentation has been used in many scientific studies, it is extremely time-consuming and is subject to intra- and inter-rater variation. This paper presents a new computational domain, referred to as flat space, and a segmentation method for specific retinal layers in the macular cube using a recently developed deformable model approach for multiple objects. The framework maintains object relationships and topology while preventing overlaps and gaps. The algorithm segments eight retinal layers over the whole macular cube, with each boundary defined at subvoxel precision. Evaluation of the method on single-eye OCT scans from 37 subjects, each with manual ground truth, shows improvement over a state-of-the-art method.

Shown is (a) the original image in native space. In flat space are (b) the original image, (c) a heat map of the probabilities for one of the boundaries (ILM), and (d) the y-component of the GVF field for that same boundary. The color scale in (c) represents zero as blue and one as red.

Shown in flat space are (a) the MGDM initialization, and (b) the MGDM result. The MGDM result mapped back to the native space of the subject is shown in (d) and for comparison the manual segmentation of the same subject is shown in (c). The same color map is used in this figure and in Fig. 4.

Shown is a magnified (×18) region around the fovea for each of (a) the original image, (b) the manual delineation, and the automated segmentations generated by (c) RF+Graph [24] and (d) our method. The result in (d) is generated from the continuous representation of the level sets in the subject's native space; the voxelated equivalent for our method is shown in (e). The RF+Graph method has to keep each layer at least one voxel thick (the GCIP and INL in this case). We also observe the voxelated nature of the RF+Graph result, whereas our approach has a continuous representation, shown in (d), due to its use of level sets, but can also be converted to a voxelated format (e). The same color map is used in this figure and in Fig. 3.

Tables (2)

Table 1 Mean (and standard deviation) of the Dice coefficient across the eight retinal layers. A paired Wilcoxon rank sum test was used to test the significance of any improvement between RF+Graph [24] and our method, with strong significance (an α level of 0.001) in six of the eight layers.
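For reference, the overlap metric reported in Table 1 can be computed per layer as below. This is a minimal sketch (the function name and toy masks are illustrative, not from the paper); the resulting per-subject Dice values for the two methods would then be compared with a paired Wilcoxon test.

```python
def dice(a, b):
    """Dice coefficient between two binary masks, given as flat 0/1 sequences.

    Dice = 2|A ∩ B| / (|A| + |B|); returns 1.0 when both masks are empty.
    """
    inter = sum(1 for x, y in zip(a, b) if x and y)  # |A ∩ B|
    total = sum(a) + sum(b)                          # |A| + |B|
    return 2.0 * inter / total if total else 1.0

# Toy example: two 4-voxel masks for one layer of one subject.
print(dice([1, 1, 0, 0], [1, 0, 0, 0]))  # 2*1/(2+1) ≈ 0.667
```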

Table 2 Mean absolute error (and standard deviation) in microns for our method (MGDM) in comparison to RF+Graph [24] on the nine estimated boundaries. A paired Wilcoxon rank sum test was used to compute p-values between the two methods, with strong significance (an α level of 0.001) in six of the nine boundaries.
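The boundary metric in Table 2 can be sketched as follows: for each A-scan, take the absolute difference between the estimated and ground-truth boundary depth (in pixels, possibly subvoxel), average over A-scans, and scale by the axial voxel size to get microns. The axial scale used here is a placeholder, as the actual scanner resolution is device-specific and not stated in this excerpt.

```python
# Hypothetical axial resolution in microns per pixel; the true value
# depends on the OCT device and is NOT taken from the paper.
AXIAL_UM_PER_PIXEL = 3.9

def boundary_mae_um(est, gt, um_per_pixel=AXIAL_UM_PER_PIXEL):
    """Mean absolute boundary error in microns.

    est, gt: per-A-scan boundary depths in (possibly fractional) pixels.
    """
    diffs = [abs(e - g) for e, g in zip(est, gt)]  # per-A-scan |error| in pixels
    return um_per_pixel * sum(diffs) / len(diffs)  # mean error, scaled to microns

# Toy example: two A-scans, half-pixel error each, 4 µm pixels -> 2 µm MAE.
print(boundary_mae_um([10.0, 12.5], [10.5, 12.0], um_per_pixel=4.0))
```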
