Understanding and classifying the different variants of Primary Progressive Aphasia based on spelling performance

1 Johns Hopkins University, Department of Cognitive Science, United States
2 Johns Hopkins Medicine, Department of Neurology, United States

Introduction: Previous findings suggest that written spelling performance differs among the three variants of Primary Progressive Aphasia (PPA): semantic (svPPA), logopenic (lvPPA) and non-fluent (nfvPPA) (Shim et al., 2012; Sepelyak et al., 2011). However, no attempts have been made to systematically distinguish the three variants in terms of their spelling performance. Classifying the variants is challenging, and because a spelling test is easy to administer, we aimed to determine to what extent a spelling task can provide accurate classification of the PPA variants.
Method: Thirty-three participants with PPA were included (14 lvPPA, 11 nfvPPA and 8 svPPA), originally classified using the neuropsychological and spoken language criteria defined by Gorno-Tempini et al. (2011). Data were collected prior to spelling treatment using a spelling-to-dictation task with both real-words and pseudowords (92-138 items per participant). Responses were scored for each grapheme (i.e., letter) and analyzed for each participant individually using generalized linear mixed-effects models (GLMEMs), fitted separately for real-words and pseudowords. The variables of interest for both real-words and pseudowords were word length, phoneme-grapheme conversion probability and grapheme position. The real-word models also included frequency, imageability, and the orthographic and phonological neighborhood density of the target words.
The coefficients from the GLMEMs, together with three additional variables (verb/noun and pseudoword/word accuracy differences from the spelling task, and language impairment severity according to the FTD-CDR; Knopman et al., 2008), were used as predictors in a Random Forests (RFs) model implemented in Python to identify the variables that contribute most to distinguishing the three variants. The three most important predictors identified with RFs were then used in multinomial models implemented in R to classify the PPA variants. Each model was trained on all participants except one and evaluated on the left-out participant (Leave-One-Out cross-validation). This process was repeated 33 times so that every participant was evaluated once.
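The Leave-One-Out procedure described above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the multinomial models were fitted in R, whereas this sketch fits a softmax (multinomial logistic) regression by plain gradient descent in Python so that it runs with the standard library alone. The data are synthetic and well separated, the three feature columns only stand in for the three selected predictors, and all function names, the learning rate and the number of epochs are assumptions made for illustration.

```python
import math
import random

VARIANTS = ["lvPPA", "nfvPPA", "svPPA"]  # class labels 0, 1, 2


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, features):
    """Linear score per class; the trailing 1.0 is the intercept term."""
    row = features + [1.0]
    return [sum(w * v for w, v in zip(w_k, row)) for w_k in weights]


def fit_multinomial(X, y, n_classes, lr=0.1, epochs=200):
    """Fit softmax regression by batch gradient descent (hypothetical settings)."""
    n_cols = len(X[0]) + 1
    weights = [[0.0] * n_cols for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_cols for _ in range(n_classes)]
        for features, label in zip(X, y):
            probs = softmax(class_scores(weights, features))
            row = features + [1.0]
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                for j, v in enumerate(row):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_cols):
                weights[k][j] -= lr * grad[k][j] / len(X)
    return weights


def predict(weights, features):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, features)
    return max(range(len(scores)), key=scores.__getitem__)


def leave_one_out(X, y, n_classes):
    """Train on all participants but one, predict the left-out one, repeat."""
    predictions = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        weights = fit_multinomial(X_train, y_train, n_classes)
        predictions.append(predict(weights, X[i]))
    return predictions


# Synthetic stand-in data (NOT study data): 5 simulated participants per
# variant, each variant clustered around its own center in a 3-predictor space.
rng = random.Random(0)
centers = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
X, y = [], []
for label, center in enumerate(centers):
    for _ in range(5):
        X.append([c + rng.uniform(-0.3, 0.3) for c in center])
        y.append(label)

predictions = leave_one_out(X, y, n_classes=len(VARIANTS))
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"LOOCV accuracy on synthetic data: {accuracy:.2f}")
```

Because each participant is predicted by a model that never saw that participant, the resulting accuracy estimates out-of-sample performance, which matters with a sample as small as 33.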
Results: The three most important predictors in the RFs analysis were: (1) grapheme position in real-words, (2) pseudoword/word accuracy difference, and (3) length of real-words (Figure 1). The overall accuracy of the multinomial models using only these three predictors was 67% (lvPPA = 71%, nfvPPA = 64%, svPPA = 63%). When severely impaired cases (language severity = 3 on the FTD-CDR; Knopman et al., 2008) were excluded, leaving a dataset of 22 participants, the overall accuracy increased to 91% (lvPPA = 90%, nfvPPA = 86%, svPPA = 100%).
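As a consistency check on the reported percentages, the arithmetic can be reproduced with a short sketch. The abstract reports percentages, not counts, so the correct/total pairs below are back-calculated assumptions: for the full sample they follow from the stated group sizes (14, 11, 8), and for the reduced sample both the counts and the group sizes are inferred from the percentages and the total of 22. The function names are hypothetical.

```python
def percent_half_up(correct, total):
    """Integer percentage, rounding halves up (round() would give 62 for 62.5)."""
    return (200 * correct + total) // (2 * total)


# (correct, total) per variant; INFERRED from the reported percentages,
# not reported in the abstract itself.
full_sample = {"lvPPA": (10, 14), "nfvPPA": (7, 11), "svPPA": (5, 8)}
reduced_sample = {"lvPPA": (9, 10), "nfvPPA": (6, 7), "svPPA": (5, 5)}


def summarize(counts):
    """Return per-variant percentages, overall percentage, and sample size."""
    per_variant = {v: percent_half_up(c, n) for v, (c, n) in counts.items()}
    correct = sum(c for c, _ in counts.values())
    total = sum(n for _, n in counts.values())
    return per_variant, percent_half_up(correct, total), total


full_result = summarize(full_sample)
reduced_result = summarize(reduced_sample)
print(full_result)
print(reduced_result)
```

Under these inferred counts, 22 of 33 participants are classified correctly in the full sample and 20 of 22 in the reduced sample, which matches the reported overall accuracies of 67% and 91%.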
Discussion: Our study provides evidence of the value of considering spelling performance in understanding and classifying the different variants of PPA. The results suggest that lexical status, word length and grapheme position are useful parameters for classification; these parameters index key components of the cognitive architecture of spelling (Rapp, 2002). In addition, the finding that prediction accuracy increased when more severe cases were excluded is consistent with previous findings (Mesulam et al., 2012): as severity increases, the variants become less differentiated and classification becomes more difficult. In sum, a relatively short, easy-to-administer spelling test provides useful information for PPA variant classification and can potentially be used as a clinical tool.