Bottom Line:
The space locates patients in the context of known syndromes and thereby facilitates the generation of diagnostic hypotheses. Consequently, the approach will aid clinicians by greatly narrowing (by 27.6-fold) the search space of potential diagnoses for patients with suspected developmental disorders. Furthermore, this Clinical Face Phenotype Space allows the clustering of patients by phenotype even when no known syndrome diagnosis exists, thereby aiding disease identification.

Affiliation: Department of Engineering Science, University of Oxford, Oxford, United Kingdom; Medical Research Council Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom.

fig4s4: Simulated example of probabilistic querying of Clinical Face Phenotype Space. (A) Visualization of a population of simulated faces in the first two Multi-Dimensional Scaling (MDS) modes. Seven classes of points (simulated 'syndrome groups') are shown with different distributions and variances. A central 'query' face is indicated by the boxed cross. The 20 nearest neighbors of the query are encircled with a black border. (B) Inset bar graph shows diagnosis hypotheses ranked by class priority. The class priority ranking weights the dispersion and prevalence (spread and number) of a class in the Clinical Face Phenotype Space with the nearest neighbors to assign the most probable diagnosis hypotheses. In the example, the ranked diagnosis estimates of the query point would be class 7, then class 6, then class 4. The scatter plot shows the individual similarity p0p1 estimates, reflecting their relative closeness in the space compared to the local neighborhood, for the 20 nearest neighbors of the query. The first nearest neighbor is estimated to be 2.6-fold closer to the query than the average based on the local density of neighbors. The dotted line indicates the average relative distance between points among the 20 nearest neighbors. (C) Inset bar graph shows the number of neighbors of the query per class. The scatter plot shows dispersion versus cardinality, i.e. the relative spread of points in each class and the proportion of the total number of points in the simulated space that belong to that class. Plots (B) and (C) allow objective assessment of the distribution of points shown in (A), and aid the interpretation of classification confidence. DOI: http://dx.doi.org/10.7554/eLife.02020.015

Mentions:
For any given image located in Clinical Face Phenotype Space, we obtain confidence-ranked classifications to known disorders (see 'Materials and methods' and Figure 4—figure supplement 4). In addition, we objectively compare the image to others within the space. For any given query image, a probabilistic ranking of similar syndromes is obtained by comparing the representation of each syndrome among the nearest neighbors with the random expectation of clustering among the 90 syndromes and 2754 faces. The classification confidence for a particular disorder depends not only on its location within the space but also on the local density of similar faces. We find that for the eight initial syndromes used to construct Clinical Face Phenotype Space, 93.1% (range 81.0–99.2%) are correctly classified at the top rank, cumulatively converging on 99.1% (95.8–100%) by the 20th rank (Figure 4B). For syndromes that were not part of the Clinical Face Phenotype Space training, classification accuracy correlated strongly and positively with the number of instances in the database (Figure 4B). For the 20 syndromes where the database held 5 or fewer examples (Table 1), we classify on average 20.3% correctly by the 6th rank (exceeding chance alone by 16.3-fold).
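The idea of ranking syndromes by how strongly they are represented among a query's nearest neighbors, relative to random expectation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rank_classes`, the enrichment score (observed neighbor count divided by the count expected from class prevalence alone), the Euclidean metric, and the simulated two-dimensional data are all assumptions made for illustration. In particular, it ignores the dispersion weighting that the class priority ranking described in the figure legend takes into account.

```python
import numpy as np


def rank_classes(points, labels, query, k=20):
    """Rank classes by their enrichment among the k nearest neighbors of query.

    Enrichment = observed neighbor count / count expected if the k neighbors
    were drawn at random from the whole population. This is an illustrative
    score, not the class priority measure used in the paper.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Euclidean distance from every point to the query (assumed metric).
    dists = np.linalg.norm(points - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]
    classes, totals = np.unique(labels, return_counts=True)
    ranking = []
    for cls, total in zip(classes, totals):
        observed = int(np.sum(labels[nearest] == cls))
        expected = k * total / len(labels)
        ranking.append((cls.item(), observed / expected))
    # Highest enrichment first.
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# Simulated space: three hypothetical classes of 50 points each in 2-D.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
points = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in centers])
labels = np.repeat([0, 1, 2], 50)

for cls, enrichment in rank_classes(points, labels, query=[5.0, 5.0]):
    print(f"class {cls}: {enrichment:.1f}-fold over random expectation")
```

Because the query sits at the center of simulated class 1 and the clusters are well separated, all 20 neighbors come from that class. With each class making up one third of the population, the expected count is 20 × 50/150 ≈ 6.7, so class 1 ranks first with a 3-fold enrichment.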
