Abstract

Previous research has shown that individuals with greater cognitive abilities display a greater speed of higher-order cognitive processing. These results suggest that speeded neural information processing may facilitate evidence accumulation during decision making and memory updating and thus yield advantages in general cognitive abilities. We used a hierarchical Bayesian cognitive modeling approach to test the hypothesis that individual differences in the velocity of evidence accumulation mediate the relationship between neural processing speed and cognitive abilities. We found that a higher neural speed predicted both the velocity of evidence accumulation across behavioral tasks and cognitive ability test scores. However, only a negligible part of the association between neural processing speed and cognitive abilities was mediated by individual differences in the velocity of evidence accumulation. The model demonstrated impressive forecasting abilities by predicting 36% of individual variation in cognitive ability test scores in an entirely new sample solely based on their electrophysiological and behavioral data. Our results suggest that individual differences in neural processing speed might affect a plethora of higher-order cognitive processes that only in concert explain the large association between neural processing speed and cognitive abilities, rather than the effect being entirely explained by differences in evidence accumulation speeds.

Introduction

Individual differences in cognitive abilities are important predictors for real-world achievements such as job performance and highest level of educational attainment (Schmidt and Hunter 2004). Cognitive ability differences also predict differences in individuals’ health (Deary 2008; Der et al. 2009), happiness (Nikolaev and McGee 2016), and well-being (Pesta et al. 2010). However, the fundamental biological processes that give rise to individual differences in cognitive abilities remain largely unexplored. In this study, we explore how individual differences in cognitive abilities are associated with individual differences in neural processing speed, and how this association can be explained by individual differences in the velocity of evidence accumulation as an intermediate cognitive process.

Previous research has suggested that those individuals with greater cognitive abilities have a higher speed of information processing, typically measured as reaction or inspection times in elementary cognitive tasks on a behavioral level (Kyllonen and Zu 2016; Sheppard and Vernon 2008), or as latencies of event-related potential (ERP) components on a neurophysiological level (e.g., Bazana and Stelmack 2002; Schubert et al. 2015; Troche et al. 2015). Neuroimaging studies have shown that the association between the speed of information processing and cognitive abilities may reflect individual differences in white-matter tract integrity, either as an overall brain property (Penke et al. 2012) or in specific brain regions such as the forceps minor and the corticospinal tract (Kievit et al. 2016).

However, those with greater cognitive abilities do not seem to benefit from a higher speed of information processing during all stages of information processing. Instead, individuals with greater cognitive abilities show a higher speed of information processing only in higher-order cognitive processes such as decision making and memory updating (Schmiedek et al. 2007; Schubert et al. 2017). In particular, the velocity of evidence accumulation during decision making has been repeatedly associated with individual differences in cognitive abilities (Schmiedek et al. 2007; Schmitz and Wilhelm 2016; Schubert et al. 2015; van Ravenzwaaij et al. 2011). Moreover, cognitive abilities have been specifically associated with the latencies of ERP components reflecting higher-order cognitive functions such as memory and context updating (Bazana and Stelmack 2002; McGarry-Roberts et al. 1992; Schubert et al. 2017; Troche et al. 2009). Taken together, these results suggest that a greater speed of information processing may facilitate evidence accumulation during decision making and memory updating and may give rise to advantages in general cognitive abilities. In the present study, we explore this hypothesis by using a hierarchical Bayesian cognitive modeling approach to investigate if individual differences in the velocity of evidence accumulation mediate the relationship between neural processing speed and general cognitive abilities.

Measuring the Speed of Higher-Order Cognitive Processes

Reaction time measures are affected by a variety of cognitive and motivational processes, and differences across individuals are not solely due to differences in the specific processes of interest (Nunez et al. 2015; Schubert et al. 2015). Therefore, mean reaction times and differences in reaction times between certain experimental conditions can only provide very imprecise measurements of the speed of specific higher-order cognitive processes. One approach to measuring the speed of higher-order cognitive processes is to use validated mathematical models of decision making, which allow estimating the speed and efficiency of specific cognitive processes (Voss et al. 2004). Diffusion models are among the most influential model types used to jointly describe reaction time distributions and accuracies in binary choice tasks. Diffusion models assume that information accumulation follows a continuous, stochastic Wiener process that terminates once one of two decision thresholds has been reached (Stone 1960; Ratcliff 1978; Ratcliff and McKoon 2008). That is, it is assumed that on any given trial an individual will accumulate evidence for one choice over another in a random walk evidence accumulation process with an infinitesimal time step (while neural coding may be more sequential in nature, the infinitesimal approximation should hold true for small time steps). It is predicted that the change in relative evidence E_t follows a Wiener (i.e., Brownian motion) process with an average evidence accumulation rate δ and instantaneous variance σ² (Ross 2014).
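In stochastic differential form, the evidence process described above can be sketched as follows. The symbol $W_t$ (a standard Wiener process) and the time step $\Delta t$ are notation introduced here for illustration; the second expression is the equivalent statement about increments over a small time step.

```latex
dE_t = \delta\,dt + \sigma\,dW_t,
\qquad
E_{t+\Delta t} - E_t \sim \mathcal{N}\!\left(\delta\,\Delta t,\; \sigma^{2}\,\Delta t\right)
```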

Typically, the variance σ² is fixed to some standardized value for reasons of identifiability (but see Nunez et al. 2017). The drift rate (δ) measures the relative velocity of evidence accumulation during decision making and individual differences in this parameter have been suggested to be associated with individual differences in cognitive abilities (Schmiedek et al. 2007; Ratcliff et al. 2010; 2011; Schubert et al. 2015; Schmitz and Wilhelm 2016). The evidence units per second of the drift rate (δ) are relative to a predetermined decision criterion for evidence (α), which reflects speed–accuracy trade-offs (Voss et al. 2004). In addition, a basic diffusion model includes one further parameter that complements the decision process: the non-decision time (t_er) encompasses all non-decisional processes such as encoding and motor reaction time.
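To make the roles of the three parameters concrete, the following minimal sketch simulates single trials of a basic diffusion process with a discrete random-walk approximation. It is an illustration only, not the estimation procedure used in this study: the function name, the parameter values, the 1 ms step size, and the unbiased starting point at α/2 are all assumptions made for the example.

```python
import math
import random

def simulate_trial(delta, alpha, ter, sigma=1.0, dt=0.001, rng=random):
    """Simulate one trial; return (reaction time in s, True if upper threshold hit)."""
    evidence = alpha / 2.0             # assumed unbiased starting point
    decision_time = 0.0
    sd_step = sigma * math.sqrt(dt)    # SD of the evidence change per time step
    while 0.0 < evidence < alpha:      # accumulate until a threshold is reached
        evidence += delta * dt + sd_step * rng.gauss(0.0, 1.0)
        decision_time += dt
    return ter + decision_time, evidence >= alpha

# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(1)
trials = [simulate_trial(delta=2.0, alpha=1.0, ter=0.3, rng=rng) for _ in range(500)]
accuracy = sum(hit for _, hit in trials) / len(trials)
mean_rt = sum(rt for rt, _ in trials) / len(trials)
```

In this sketch, raising `delta` increases accuracy and shortens reaction times, raising `alpha` trades speed for accuracy, and `ter` simply shifts every reaction time by a constant.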

It is not surprising that the drift rate parameter in particular has become widely popular in individual differences research (Frischkorn and Schubert 2018), because it allows quantifying the speed of information uptake free of confounding process parameters such as encoding and motor times or decision cautiousness, which are captured by other model parameters and are largely irrelevant for cognitive abilities research. Individual differences in drift rates have been shown to exhibit trait-like properties (i.e., they show temporal stability and trans-situational consistency; Schubert et al. 2016) and to be associated with individual differences in cognitive abilities (Ratcliff et al. 2010; 2011; Schmiedek et al. 2007; Schmitz and Wilhelm 2016; Schubert et al. 2015), attention (Nunez et al. 2015), and word recognition (Yap et al. 2012). The drift rate can even be interpreted in the framework of item response theory (IRT), in which it can under certain assumptions be decomposed into an ability and difficulty parameter (van der Maas et al. 2011).

Moreover, several studies suggest a direct link between drift rates and neural processing correlates in the EEG. In particular, it has been shown that the P3, an ERP component occurring typically about 250–500 ms after stimulus onset with a positive deflection that is maximal at parietal electrodes (Polich 2007), is a neural correlate of the evidence accumulation process captured in the drift rate (Kelly and O’Connell 2013; O’Connell et al. 2012; Ratcliff et al. 2009, 2016; van Ravenzwaaij et al. 2017). O’Connell et al. (2012) and Kelly and O’Connell (2013) even suggested that the buildup rate of this centroparietal positive potential may directly reflect the rate of evidence accumulation on a neural level.

Particularly intriguing from an individual-differences perspective is the observation that individual differences in P3 amplitudes across conditions have been shown to explain about 74% of the variance in drift rates δ (Ratcliff et al. 2009). Because both individual differences in drift rates and individual differences in P3 characteristics have been shown to explain cognitive abilities, a theoretical framework of the neurocognitive processes underlying cognitive abilities needs to specify if individual differences in P3 characteristics and drift rates contribute jointly or independently to intelligence differences.

Bridging the Gap Between Neural and Behavioral Correlates of Cognitive Abilities to Outline a Cognitive Theory of Intelligence

As of yet, researchers from the fields of mathematical modeling and cognitive neuroscience have largely independently contributed to our understanding of the basic processes underlying individual differences in cognitive abilities. While mathematical modeling researchers have suggested that the velocity of evidence accumulation may be specifically related to cognitive abilities (Ratcliff et al. 2010; 2011; Schmiedek et al. 2007; Schubert et al. 2015), cognitive neuroscience researchers have characterized the time course of information processing and identified structural and functional neural correlates of cognitive abilities (Basten et al. 2015; Neubauer and Fink 2009; Jung and Haier 2007). However, to meaningfully explain the processes underlying individual differences in intelligence, neurophysiological correlates of cognitive abilities still need to be integrated into a theoretical framework that outlines how advantages in neural processing translate into advantages in cognitive information processing, which in turn give rise to advantages in cognitive abilities.

Based on the associations of P3 latencies and drift rates with intelligence, it may be proposed that the relationship between ERP latencies reflecting higher-order cognition and cognitive abilities is mediated by individual differences in drift rates. Such a mediation account is empirically supported by the result that reaction times partly mediate the relationship between ERP latencies and cognitive abilities (Schubert et al. 2015). Moreover, it has been shown that advantages in a larger number of white-matter tract integrity measures gave rise to advantages in a smaller number of behavioral processing speed measures, which in turn explained about 60% of variance in fluid intelligence in a many-to-one way (Kievit et al. 2016). On the other hand, individual differences in both neural processing speed and drift rates may reflect some confounding variable (e.g., functional brain properties) that is also substantially related to cognitive abilities. This confounding variable account was supported by a recent study that failed to find any transfer of an experimentally induced increase in both neural and behavioral processing speed by transdermal nicotine administration on intelligence test scores (Schubert et al. 2018). Candidate confounding variables may be properties of the salience network that have been associated both with P3 elicitation and individual differences in cognitive abilities (Hilger et al. 2017; Menon and Uddin 2010; Soltani and Knight 2000).

Recent advancements in the emerging field of model-based cognitive neuroscience have demonstrated the advantages of integrating mathematical modeling and cognitive neuroscience to generate and test theoretical accounts that jointly account for neural correlates and cognitive models of psychological processes (e.g., Forstmann et al. 2011; Nunez et al. 2017; Palmeri et al. 2017; Turner et al. 2017). In the present study, we used a model-based cognitive neuroscience approach to test the hypothesis that the relationship between ERP latencies reflecting higher-order cognition and cognitive abilities is mediated by individual differences in drift rates. If evidence in favor of the mediation hypothesis is found, the mediation model will provide a clear theoretical outline of how advantages in neural processing speed give rise to advantages in cognitive abilities. However, if evidence against the mediation model is found, this will imply that a confounding variable is likely to explain the association of neural processing and drift rates with cognitive abilities.

Jointly analyzing behavioral and brain data improves inferences about human cognition, because it is assumed that both measures reflect properties of the same latent cognitive process. In particular, the joint analysis of both behavioral and brain data makes it possible to explicitly test theories regarding the cognitive processes and mechanisms governing the association between neural correlates and observable behavior. This simultaneous analysis can be achieved in a hierarchical Bayesian framework using formal mathematical models such as the diffusion model to constrain or inform inferences based on the brain data (Forstmann et al. 2011; Turner et al. 2017). The hierarchical Bayesian framework provides many advantages (Lee 2011; Shiffrin et al. 2008). First and foremost, joint models are fit to all data simultaneously and do not require separate parameter estimation stages that lead to an underestimation of parameter uncertainty or standard errors (Vandekerckhove 2014). Both empirical and simulation studies have shown that ignoring the hierarchy in hierarchically structured data can bias inferences drawn from these data (Boehm et al. 2018; Vandekerckhove 2014).

Second, hierarchical Bayesian models can easily handle low observation counts or missing data structures (Lee and Wagenmakers 2014), which is an ideal property when the cost of collecting neural measurements is high. In particular, Bayesian Markov Chain Monte Carlo (MCMC) sampling finds posterior distributions of model parameters without the need for strong assumptions regarding the sampling distribution of these parameters (Levy and Choi 2013). Moreover, Bayesian statistical modeling approaches do not rely on asymptotic theory (Lee and Song 2004). These two properties make convergence issues in multivariate regression models in smaller samples less likely. Another favorable property of Bayesian hierarchical modeling is shrinkage, which describes the phenomenon that individual parameter estimates are informed by parameter estimates for the rest of the sample. Because less reliable and outlier estimates are pulled toward the group mean, shrinkage has been used in neuroimaging research to improve the reliability of individual functional connectivity estimates by 25 to 30% (Dai and Guo 2017; Mejia et al. 2018; Shou et al. 2014). Taken together, these desirable properties of hierarchical Bayesian models open up the possibility to use multivariate regression models such as structural equation models (SEM) or latent growth curve models in neuroimaging research, where sample sizes are usually smaller than in behavioral research due to the cost associated with the collection of neural measures.

The joint analysis of behavioral and neural data can be expanded into a cognitive latent variable model (CLVM) by including data from multiple conditions and/or tasks and by introducing covariates such as cognitive ability tests or personality questionnaires into the hierarchical model (Vandekerckhove et al. 2011; Vandekerckhove 2014). In addition to jointly modeling behavioral and neural data, the cognitive latent variable framework allows estimating correlations between higher-order variables, which reflect the covariances between behavioral, neural, and cognitive abilities data across experimental tasks or ability tests. As such, a CLVM is a computationally expensive but highly flexible tool that strongly resembles structural equation modeling (SEM) in that it allows specifying associations between latent variables and distinguishing between constructs and their measurements. Vandekerckhove (2014) demonstrated the advantages of a CLVM in comparison to a more conventional two-stage analysis when modeling the latent association between evidence accumulation rates in executive function tasks and psychometric measures of dysphoria.

In the present study, we constructed CLVMs to assess the latent relationship between latencies of ERP components reflecting higher-order processing (P2, N2, P3), reaction times and accuracies in elementary cognitive tasks, and general cognitive abilities (see Fig. 1). For this purpose, we reanalyzed data from a study with multiple measurement occasions previously reported in Schubert et al. (2017). In particular, we wanted to test if the association between latencies of ERP components associated with higher-order cognitive functions and general cognitive abilities established with conventional structural equation modeling could be explained by individual differences in the velocity of evidence accumulation.

For this purpose, we constructed one measurement model for each of the three variable domains (ERP latencies, behavioral data, intelligence test performance). In each of these measurement models, a superordinate latent variable provides an estimate of the common variance of conditions or subtests within each variable domain. This latent variable can be considered a latent trait free of measurement error and task-specific variances. The main reason for estimating those latent traits is that they allow the estimation of individual differences on the construct level and are therefore not restricted to specific measurements or operationalizations of constructs. For ERP latencies, this latent variable reflects an error-free estimate of the neural processing speed of higher-order cognitive processes. For behavioral data, this latent variable reflects an error-free estimate of velocity of evidence accumulation across different elementary cognitive tasks and their conditions. While we used a cognitive model (the diffusion model) to describe performance in these cognitive tasks, we could also have estimated behavioral processing speed as mean reaction times in these tasks. Finally, for intelligence test performance, the superordinate latent variable reflects an error-free measurement of general intelligence across different intelligence subtests.

Each of these latent traits contains a surplus meaning that allows the generalization of any results to other measurements of the same construct, i.e., any association between general intelligence and neural processing speed should not only hold for the specific tests used in the present study, but also for similar cognitive ability tests. To test the mediation hypothesis, we only used those superordinate latent variables and regressed general intelligence on neural processing speed and evidence accumulation velocity, which was in turn regressed on neural processing speed. Hence, the core of our hypothesis that individual differences in the velocity of evidence accumulation mediate the association between neural processing speed and general intelligence is reflected in this regression model of latent variables. The measurement models giving rise to the latent variables only serve to provide error-free and task-/test-general estimates of these three traits.
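The logic of this regression structure can be sketched with ordinary least squares on observed variables. Note that this is a deliberate simplification: the present study estimates the paths between latent variables in a hierarchical Bayesian model, whereas the sketch below uses simulated observed scores. All variable names and path coefficients are hypothetical and chosen only for illustration.

```python
import random

def cov(u, v):
    """Sample covariance of two equally long lists."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)

def mediation_paths(x, m, y):
    """OLS paths for x -> m (a) and y regressed on x and m (direct, b)."""
    a = cov(x, m) / cov(x, x)
    det = cov(x, x) * cov(m, m) - cov(x, m) ** 2
    b = (cov(x, x) * cov(m, y) - cov(x, m) * cov(x, y)) / det
    direct = (cov(m, m) * cov(x, y) - cov(x, m) * cov(m, y)) / det
    total = cov(x, y) / cov(x, x)
    return {"a": a, "b": b, "direct": direct, "indirect": a * b, "total": total}

# Simulated data with hypothetical coefficients (not estimates from this study).
rng = random.Random(7)
speed = [rng.gauss(0, 1) for _ in range(2000)]                 # neural speed
drift = [0.5 * s + rng.gauss(0, 1) for s in speed]             # mediator
ability = [0.4 * s + 0.1 * d + rng.gauss(0, 1) for s, d in zip(speed, drift)]
paths = mediation_paths(speed, drift, ability)
```

The indirect effect is the product of the two paths through the mediator, and in a linear model the total effect decomposes exactly into the direct plus the indirect effect. A mediation account requires the indirect effect to carry a substantial share of the total effect.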

We also conducted out-of-sample forecasts to validate how well this mediation model was able to predict individual cognitive ability test scores solely based on new participants’ electrophysiological and behavioral data. We expected that a greater speed of neural information-processing would facilitate evidence acquisition during decision making and memory updating, and that this advantage in the velocity of evidence accumulation would mediate the predicted association between neural processing speed and general cognitive abilities.

Materials and Methods

Participants

N = 122 participants (72 females, 50 males) from different occupational and educational backgrounds participated in three sessions of the study. They were recruited via local newspaper advertisements, social media platforms, and flyer distributions in the Rhine-Neckar metropolitan region. Participants were between 18 and 60 years old (M = 36.7, Med = 35.0, SD = 13.6), had normal or corrected-to-normal vision, and reported no history of mental illness. All participants signed an informed consent form prior to their participation in the experiment. The study was approved by the ethics committee of the faculty of behavioral and cultural studies, Heidelberg University.

Procedure

The study consisted of three sessions that were each approximately 4 months apart. Participants completed the experimental tasks in the first and third sessions while their EEG was recorded in a dimly lit, sound-attenuated cabin. The order of tasks (choice reaction time task, recognition memory task, letter matching task) was the same for all participants and both sessions. During the second session, participants completed the cognitive ability tests, a personality questionnaire (data reported in Kretzschmar et al. 2018), and a demographic questionnaire. Each session lasted approximately 3–3.5 h, with EEG being collected for approximately 2.5 h. Participants were given breaks between tasks and conditions to reduce mental fatigue.

Measures

Experimental Tasks

Choice Reaction Time Task (CR)

Participants completed a choice reaction time task with two conditions, a two-alternative (CR2) and a four-alternative (CR4) choice condition. Four white squares were presented in a row on a black screen. Participants’ middle and index fingers rested on four keys directly underneath the squares. After a delay of 1000–1500 ms, a cross appeared in one of the four squares and participants had to press the corresponding key as quickly and accurately as possible. The screen remained unchanged for 1000 ms after their response to allow the recording of post-decision neural processes. Then, a black screen was shown for 1000–1500 ms between subsequent trials; the length of the inter-trial interval (ITI) was uniformly distributed. See the left part of Fig. 2 for an overview of the experimental procedure. While the task may suggest that the stimulus might simply “pop out,” resulting in immediate stimulus detection after its onset, this is not corroborated by empirical data. An increase in the logarithm of stimulus alternatives leads to a linear increase in RTs (Hick’s law, Hick 1952), which indicates that evidence is accumulated continuously until a decision point is reached and that this process takes longer the more stimulus alternatives are presented, either because more evidence has to be considered or because the process gets noisier. The slope of a regression across choice alternatives in Hick-like tasks is supposed to reflect the “rate of gain of information” (Hick 1952), which is conceptually very similar to the drift rate as a measure of the rate of evidence accumulation.
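As a brief illustration of Hick's law, the following sketch regresses mean reaction times on the binary logarithm of the number of response alternatives. The function name and the reaction times are hypothetical values chosen for the example, not data from this study. With only a two- and a four-alternative condition, the slope reduces to the difference between the two condition means, because the conditions differ by exactly one bit.

```python
import math

def fit_hick(n_alternatives, mean_rts):
    """Least-squares fit of RT = intercept + slope * log2(number of alternatives)."""
    bits = [math.log2(n) for n in n_alternatives]
    mean_bits = sum(bits) / len(bits)
    mean_rt = sum(mean_rts) / len(mean_rts)
    sxx = sum((b - mean_bits) ** 2 for b in bits)
    sxy = sum((b - mean_bits) * (rt - mean_rt) for b, rt in zip(bits, mean_rts))
    slope = sxy / sxx                       # seconds per bit of information
    return mean_rt - slope * mean_bits, slope

# Hypothetical mean RTs (in seconds) for a 2-choice and a 4-choice condition.
intercept, slope = fit_hick([2, 4], [0.38, 0.48])
```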

Fig. 2 Participants completed three experimental tasks. The choice reaction time task (CR) consisted of 2-choice (CR2) and 4-choice (CR4) conditions with 200 trials each, the letter matching task of a physical identity (PI) and name identity (NI) condition with 300 trials each, and the recognition memory task (RM) of memory set sizes 1 (RM1), 3 (RM3), and 5 (RM5) with 100 trials each.

In the two-choice response time condition, the number of choices was reduced to two squares in which the cross could appear for 50 subsequent trials. In the four-choice response time condition, the cross could appear in any of the four squares. Both conditions began with ten practice trials with immediate feedback followed by 200 test trials without feedback. The order of conditions was counterbalanced across participants. In the four-choice condition, we treated all three responses that were not the correct one as incorrect, allowing us to model the decision process with two decision thresholds. Due to the high accuracy in the four-choice condition, it is unlikely that this simplification of the decision process distorted the results, which is also supported by the similar and high factor loadings of the latent choice reaction time factor on the two- and four-choice conditions (see “Results”).

Letter Matching Task (LM)

Participants saw two white letters on a black screen and had to decide whether they were physically (physical identity condition) or semantically (name identity condition) identical by pressing one of two keys. Letters were identical in 50% of the trials. Each trial was followed by an inter-trial interval (ITI) of 1000–1500 ms. See the middle part of Fig. 2 for an overview of the experimental procedure. Conditions were presented block-wise. Each condition began with ten practice trials with immediate feedback followed by 300 test trials without feedback. All participants completed the physical identity condition first at the first measurement occasion, and second at the second measurement occasion.

Recognition Memory Task (RM)

Participants viewed memory sets of white, numerical digits (0 to 9) on a black screen. Digits were presented sequentially for 1000 ms each followed by a blank inter-stimulus interval shown for 400–600 ms. After the final digit was presented, participants saw a black screen with a white question mark for 1800–2200 ms. Subsequently, they were shown a single digit and had to decide whether the digit had been included in the previously presented memory set by pressing one of two keys. Each trial was followed by a uniformly distributed ITI of 1000–1500 ms. The probe digit was included in the memory set in 50% of the trials. There were three conditions of the experiment with the memory set consisting of either one, three, or five digits. See the right part of Fig. 2 for an overview of the experimental procedure in the set size 3 condition. The three conditions were presented block-wise and the order of presentation was counterbalanced across participants. Each condition consisted of ten practice trials with immediate feedback followed by 100 test trials without feedback.

Cognitive Abilities Tests

Berlin Intelligence Structure Test (BIS)

We administered the Berlin intelligence structure test (Jäger and Süß 1997), which distinguishes between four operation-related (processing speed, memory, creativity, processing capacity) and three content-related (verbal, numerical, figural) components of cognitive abilities. Each of the 45 tasks included in the test combines one operation-related with one content-related component. Following the manual, we calculated participants’ scores in the four operation-related components by aggregating the normalized z-scores of tasks reflecting the specific operational components irrespective of content. The mean score of the processing capacity (PC) component was M = 101.70 (SD = 7.99), the mean score of the processing speed (PS) component was M = 98.00 (SD = 7.10), the mean score of the memory (M) component was M = 99.40 (SD = 6.51), and the mean score of the creativity (C) component was M = 98.02 (SD = 6.14). We then transformed these scores to z-scores for further analyses.

Advanced Progressive Matrices (APM)

Participants completed a computer-adapted version of Raven’s Advanced Progressive Matrices (Raven et al. 1994). The APM is a fluid intelligence test that consists of 36 items. Each item consists of a 3 × 3-matrix with geometric figures that follow certain logical rules and symmetries. The last element of the matrix is missing and must be chosen out of eight alternatives without time limit (see Fig. 3 for a fictional sample item). Following the manual, participants’ performance was calculated as the number of correctly solved items of the second set. Moreover, we calculated performance in the odd and even trials of the test separately to construct two indicators of latent APM performance. We then transformed these raw test scores to z-scores for further analyses. Participants solved on average M = 23.43 (SD = 6.71) items correctly, which corresponds to a mean IQ score of M = 98.80 (SD = 15.68). Performance on even trials, Meven = 12.23 (SD = 3.51) correctly solved items, was comparable to performance on odd trials, Modd = 11.20 (SD = 3.52) correctly solved items.

Fig. 3 Example stimuli of Raven’s Progressive Matrices. Each item consists of a 3 × 3-matrix with geometric figures that follow certain logical rules and symmetries. The last element of the matrix is missing and must be chosen out of eight alternatives.

EEG Recording

Participants’ EEG was recorded with 32 equidistant silver-silver chloride electrodes, a 32-channel BrainAmp DC amplifier (Brain Products, Munich), and a sampling rate of 1000 Hz (software bandpass filter of 0.1–100 Hz with a slope of 12 dB/octave). In addition, participants’ electrooculogram (EOG) was recorded bipolarly with two electrodes positioned above and below the left eye and two electrodes positioned at the outer corners of the eyes. Electrode impedances were kept below 5 kΩ during recording. Data were collected with a central electrode reference but later re-referenced offline to the average activity of all electrodes (average reference). The data were filtered offline with a low-pass filter of 16 Hz with a slope of 12 dB/octave.

Data Analysis

Behavioral Data

To remove outliers in the behavioral data, we first discarded any reaction time faster than 100 ms or slower than 3000 ms. In a second step, we discarded any trial with logarithmized reaction times exceeding ± 3 standard deviations from the mean reaction time of each condition. Deviations in criteria (i.e., less strict criteria) did not affect the covariance structure between variables, suggesting adequate robustness.
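The two-step outlier procedure can be sketched as follows for the reaction times of a single condition. The function name and the example values are hypothetical, and the sketch assumes that the ± 3 SD criterion is applied to the mean and standard deviation of the logarithmized reaction times that survive the first step; the text does not specify this detail.

```python
import math
import statistics

def clean_rts(rts_ms, lower=100.0, upper=3000.0, n_sd=3.0):
    """Apply absolute cutoffs, then a +/- n_sd criterion on log reaction times."""
    # Step 1: discard reaction times faster than `lower` or slower than `upper`.
    kept = [rt for rt in rts_ms if lower <= rt <= upper]
    # Step 2: discard trials whose log RT deviates more than n_sd SDs from the
    # condition mean of log RTs (assumed to be computed after step 1).
    logs = [math.log(rt) for rt in kept]
    mean_log = statistics.mean(logs)
    sd_log = statistics.stdev(logs)
    return [rt for rt, lg in zip(kept, logs) if abs(lg - mean_log) <= n_sd * sd_log]

# Hypothetical condition: 30 regular trials plus three extreme values.
example_rts = [400 + 5 * i for i in range(30)] + [50, 2900, 3500]
cleaned_rts = clean_rts(example_rts)
```

In this example, 50 ms and 3500 ms are removed by the absolute cutoffs, and 2900 ms passes the first step but is removed by the criterion on logarithmized reaction times.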

Evoked Electrophysiological Measures

Event-related potentials (ERPs) were analyzed separately for each task and condition. ERPs were calculated by averaging all experimental trials, time-locked to the onset of the task-relevant visual stimuli, with windows of interest that were 1000 ms long with a preceding baseline of 200 ms. We corrected for ocular artifacts with the regression procedure suggested by Gratton et al. (1983). Windows of EEG data with amplitudes exceeding ± 70 μV at least once within the time window, with amplitude changes exceeding 100 μV within 100 ms, or with activity lower than 0.5 μV were discarded as artifacts.
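The three rejection criteria can be sketched for one channel of one window as shown below. This is an illustration under stated assumptions, not the software routine used in this study: the function name is hypothetical, "amplitude change within 100 ms" is read as the difference between the maximum and minimum in any 100 ms stretch, and "activity lower than 0.5 μV" is read as the peak-to-peak range of the whole window.

```python
import math

def is_artifact(window_uv, srate_hz=1000.0, max_abs_uv=70.0,
                max_change_uv=100.0, change_ms=100.0, min_activity_uv=0.5):
    """Return True if a single-channel window violates any rejection criterion."""
    # Criterion 1: absolute amplitude exceeds +/- max_abs_uv at least once.
    if max(abs(v) for v in window_uv) > max_abs_uv:
        return True
    # Criterion 3: too little activity (assumed: peak-to-peak range of window).
    if max(window_uv) - min(window_uv) < min_activity_uv:
        return True
    # Criterion 2: amplitude change above max_change_uv within change_ms.
    span = max(2, int(round(change_ms * srate_hz / 1000.0)) + 1)
    for start in range(0, max(1, len(window_uv) - span + 1)):
        chunk = window_uv[start:start + span]
        if max(chunk) - min(chunk) > max_change_uv:
            return True
    return False

# Hypothetical 1200-sample windows (200 ms baseline + 1000 ms at 1000 Hz).
clean = [10.0 * math.sin(2 * math.pi * 5 * t / 1000.0) for t in range(1200)]
flat = [0.1] * 1200
spike = clean[:600] + [85.0] + clean[601:]
step = [-60.0] * 600 + [60.0] * 600
```

The `step` example shows why the second criterion is not redundant with the first: the signal never exceeds ± 70 μV but changes by 120 μV within a few samples.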

Latencies of three ERP components were calculated for each participant in each experiment. Grand-average waveforms of event-related potentials are presented in Fig. 4. P2 peak latencies were determined with regard to the greatest positive local maximum at the fronto-central electrode on the midline, which roughly corresponds to the Fz electrode in the 10-20 system, in a 120 to 320 ms time window. N2 and P3 peak latencies were determined with regard to the most negative local minimum and the greatest positive local maximum, respectively, at the parietal electrode on the midline, which roughly corresponds to the Pz electrode in the 10-20 system, in a 140 to 370 ms time window (N2) and a 200 to 630 ms time window (P3). Peak latencies were determined separately for each condition of each experimental task, then averaged across conditions within each experiment, and then z-standardized for further analyses. Prior to averaging across experimental conditions, we discarded any peak latencies exceeding ± 3 SDs from the mean peak latency of each condition. If any peak latencies were discarded, the average across conditions was calculated based on the remaining conditions.
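A minimal sketch of the peak-latency search within a time window is given below. The function name and the synthetic waveform (two Gaussian bumps standing in for an N2 and a P3) are assumptions made for illustration; only the time windows are taken from the text.

```python
import math

def peak_latency(erp_uv, times_ms, t_min, t_max, polarity=1):
    """Latency of the largest local peak in [t_min, t_max].

    polarity=1 searches positive peaks, polarity=-1 negative peaks.
    Returns None if the window contains no local peak.
    """
    best_amp, best_time = None, None
    for i in range(1, len(erp_uv) - 1):
        if not t_min <= times_ms[i] <= t_max:
            continue
        amp = polarity * erp_uv[i]
        is_local_peak = (amp >= polarity * erp_uv[i - 1]
                         and amp >= polarity * erp_uv[i + 1])
        if is_local_peak and (best_amp is None or amp > best_amp):
            best_amp, best_time = amp, times_ms[i]
    return best_time

# Synthetic average waveform: 200 ms baseline plus 1000 ms post-stimulus.
times = list(range(-200, 1000))
erp = [5.0 * math.exp(-((t - 400) / 60.0) ** 2)
       - 3.0 * math.exp(-((t - 250) / 30.0) ** 2) for t in times]
p3_latency = peak_latency(erp, times, 200, 630, polarity=1)
n2_latency = peak_latency(erp, times, 140, 370, polarity=-1)
```

Restricting the search to local peaks, rather than taking the largest value in the window, avoids selecting a window edge that lies on the slope of a neighboring component.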

Grand averages of event-related potentials at frontal, central, and parietal electrodes over midline. ERPs were elicited by stimulus onset and averaged across laboratory sessions and conditions for each experimental task

Cognitive Latent Variable Models

We constructed hierarchical Bayesian models to assess the latent relationship between reaction times, latencies of the three ERP components (P2, N2, P3), and cognitive ability test scores. For this purpose, we defined three separate sub-models describing the domain-specific associations between (a) ERP latencies in experimental tasks across two measurement occasions, (b) behavioral data in experimental tasks across two measurement occasions, and (c) performance in cognitive ability tests.

Then, we constructed two models using either (1) only ERP latencies or (2) ERP latencies and behavioral data to predict performance in cognitive ability tests. To test the hypothesis that drift rates mediate the relationship between neural processing speed and cognitive abilities, we compared performance of a direct regression model, in which ERP latencies predicted cognitive abilities (“Regression Model”), to a mediation model, in which the effect of ERP latencies on cognitive abilities was mediated by drift rates (“Mediation Model”).

We used Just Another Gibbs Sampler (JAGS; Plummer 2003) with a module that adds a diffusion model distribution to JAGS (jags-wiener; Wabersich and Vandekerckhove 2014) to find parameter estimates for the hierarchical model. Each model was fit with three Markov Chain Monte Carlo (MCMC) chains run in parallel. Each chain contained 2000 burn-in samples and 100,000 additional samples with a thinning parameter of 10, resulting in 10,000 posterior samples per chain. Posterior samples from the three chains were combined to one posterior sample consisting of 30,000 samples for each model parameter. Model convergence was evaluated based on the Gelman-Rubin convergence statistic R̂, which compares the estimated between-chains and within-chain variances for each model parameter (Gelman and Rubin 1992). Negligible differences between these variances were indicated by R̂ values close to 1.
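The sampling arithmetic and the convergence check can be illustrated with a short sketch. It implements the basic form of the Gelman-Rubin statistic from between-chain and within-chain variances; the function name is hypothetical, and the exact variant used to evaluate the fitted models (for example, split-chain versions) is not specified here.

```python
import numpy as np

def gelman_rubin(chains):
    """Basic R-hat for one parameter; chains has shape (n_chains, n_samples)."""
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    between = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance B
    within = chains.var(axis=1, ddof=1).mean()      # within-chain variance W
    pooled = (n - 1) / n * within + between / n     # pooled variance estimate
    return float(np.sqrt(pooled / within))

# Sample bookkeeping as described in the text.
n_chains, n_iterations, thinning = 3, 100_000, 10
kept_per_chain = n_iterations // thinning      # 10,000 samples per chain
total_kept = n_chains * kept_per_chain         # 30,000 samples per parameter
```

Values of R-hat near 1 arise when the chains agree, because the between-chain term then adds almost nothing to the pooled variance.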

Submodel: ERP Latencies in Experimental Tasks

ERP latencies were modeled in a hierarchical structural equation model (SEM) inspired by the parameter expansion approach suggested by Merkle and Rosseel (2018). Each of the three ERP latencies (P2, N2, P3) was quantified in three tasks at two sessions. Hence, six observed variables (3 tasks j × 2 sessions m) loaded onto each of the three first-order, component-specific ERP factors η(P2), η(N2), and η(P3), indexed by component c. These three latent factors loaded onto a second-order latent factor B that was estimated per participant i.

Latent factors and observed variables had normally distributed prior and hyperprior distributions. The means of these priors reflected linear regressions of the respective higher-order factors. For reasons of identifiability, the loading γ(P2) of the first lower-order factor η(P2) on the higher-order factor B was fixed to 1, while the other loadings, γ(N2) and γ(P3), were given standard normal priors: γ(P2) = 1 and \(\gamma _{(N2)}, \gamma _{(P3)} \sim \mathcal {N}(0,1)\).

Subsequently, the observed latencies \(ERP_{icjm}\) of ERP components c, tasks j, and measurement occasions m for each participant i were regressed onto the first-order latent variables. These regressions were defined by the respective factor loadings \(\lambda_{cjm}\), the respective higher-order latent variables \(\eta_{ic}\), and the respective precisions \(\theta_{cjm}\). Factor loadings \(\lambda_{cjm}\) on the first-order latent variables were fixed to 1 for task j = CR and measurement occasion m = 1 for all three ERP components for reasons of identifiability. See the bottom left parts of Figs. 5, 6, and 7 for a graphical illustration of the measurement model of ERP latencies.
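The generative structure of this submodel can be sketched by simulating from it. All numerical values below (loadings, precisions, and the standard normal distribution of B) are hypothetical placeholders chosen for illustration; only the layering of second-order factor, first-order component factors, and observed latencies follows the description above.

```python
import numpy as np

rng = np.random.default_rng(1)
n_participants, n_components, n_tasks, n_sessions = 92, 3, 3, 2

# Hypothetical parameter values (not estimates from the study).
gamma = np.array([1.0, 0.8, 0.9])      # loadings on B; gamma[0] (P2) fixed to 1
psi = np.array([4.0, 4.0, 4.0])        # precisions of the first-order factors
lam = np.full((n_components, n_tasks, n_sessions), 0.7)
lam[:, 0, 0] = 1.0                     # task CR (index 0), session 1 fixed to 1
theta = np.full((n_components, n_tasks, n_sessions), 2.0)  # residual precisions

# Second-order neural speed factor, one value per participant (assumed N(0, 1)).
B = rng.normal(0.0, 1.0, size=n_participants)
# First-order factors: eta_ic ~ N(gamma_c * B_i, 1 / psi_c).
eta = rng.normal(gamma[None, :] * B[:, None], 1.0 / np.sqrt(psi)[None, :])
# Observed z-standardized latencies: ERP_icjm ~ N(lambda_cjm * eta_ic, 1 / theta_cjm).
erp = rng.normal(lam[None] * eta[:, :, None, None], 1.0 / np.sqrt(theta)[None])
```

Note that the precisions are converted to standard deviations before sampling, because NumPy parameterizes the normal distribution by its standard deviation whereas JAGS uses the precision.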

Graphical visualization of both the regression-linking and mediation-linking models (such that the mediation-linking model includes dashed connections). An alternate way of understanding the neurocognitive models presented in this manuscript is by viewing the graphical notation for hierarchical models as described by Lee and Wagenmakers (2014). Shaded nodes represent observed data while unshaded nodes represent unknown (fitted) parameters. Arrows represent direction of influence such that hierarchical parameters influence lower level parameters and observed data. Plates denote the number of observations for each variable and data point of participant i, experimental task j, experimental condition k, measurement occasion m, ERP component c, cognitive abilities task t, and trial n. Behavioral data y is a vector of both reaction time and accuracy observations

Submodel: Behavioral Data in Experimental Tasks

We used a combination of the SEM approach based on parameter expansion described above and the hierarchical diffusion model approach described by Vandekerckhove et al. (2011) to model individual differences in reaction times and accuracies in experimental tasks j, conditions k, and measurement occasions m.

In a first step, we modeled task-, condition-, and measurement occasion-specific drift rates in a hierarchical SEM with three task-specific first-order factors \(\eta_{ij}\). These three latent components loaded onto a second-order latent factor \(\Delta_{i}\). Again, latent factors and observed variables had normally distributed priors and hyperpriors. The means of these priors reflected linear regressions of the respective higher-order factors.

For reasons of identifiability, the loading γ(CR) of the first lower-order factor η(CR) on the higher-order factor Δ was fixed to 1, while the other loadings, γ(RM) and γ(LM), were given standard normal priors: γ(CR) = 1 and \(\gamma _{(RM)}, \gamma _{(LM)} \sim \mathcal {N}(0,1)\). Precisions ψ (inverses of variances) of all latent variables were modeled as gamma distributed variables: ψ(CR), ψ(RM), ψ(LM) ∼ Γ(1, 0.5).

Subsequently, the condition-, task-, and measurement-occasion-specific drift rates \(\delta_{ijkm}\) were regressed onto the first-order latent variables \(\eta_{ij}\). Factor loadings on the respective first-order latent variables were fixed to 1 for condition k = 1, referring to the condition with lowest information processing demands within each task, and measurement occasion m = 1 for all three tasks for reasons of identifiability. The other loadings \(\lambda_{jkm}\) were given standard normal priors: \(\lambda _{jkm} \sim \mathcal {N}(0,1)\). Precisions of drift rates were modeled as gamma distributed variables: \(\theta_{jkm} \sim \Gamma(1, 0.5)\). In addition, we estimated intercepts \(\nu_{jkm}\) for the lowest-order drift rates, because the behavioral data were not z-standardized: \(\nu _{jkm} \sim \mathcal {N}(2,1.5^{2})\).

In a second step, these drift rates were entered into the diffusion model distribution in addition to task-, condition-, measurement occasion-, and person-specific boundary separation \(\alpha_{ijkm}\) and non-decision time \(\tau_{ijkm}\) parameters (with the starting point parameter fixed at 0.5). Both boundary separation parameters and non-decision times were given normal priors: \(\alpha _{ijkm} \sim \mathcal {N}(1, 0.5^{2})\), \(\tau _{ijkm} \sim \mathcal {N}(0.3, 0.2^{2})\). See the top parts of Figs. 5, 6, and 7 for a graphical illustration of the measurement model of behavioral data in experimental tasks.
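To illustrate how drift rate, boundary separation, and non-decision time jointly generate reaction times and accuracies, the following sketch simulates single trials of a Wiener diffusion process with an unbiased starting point. It uses a simple Euler discretization with a unit diffusion coefficient; the function name, step size, and time limit are assumptions of this sketch, and it is not the likelihood-based estimation performed by the jags-wiener module.

```python
import numpy as np

def simulate_wiener_trial(drift, boundary, ndt, rng, start=0.5, dt=0.001, max_t=5.0):
    """Simulate one trial; returns (reaction time in s, upper boundary reached)."""
    x = start * boundary          # relative starting point of 0.5 = unbiased
    t = 0.0
    # Accumulate noisy evidence until a boundary is crossed (or time runs out).
    while 0.0 < x < boundary and t < max_t:
        x += drift * dt + np.sqrt(dt) * rng.normal()
        t += dt
    # A trial that times out is counted as not reaching the upper boundary.
    return ndt + t, bool(x >= boundary)
```

With the upper boundary coded as the correct response, a larger drift rate produces both faster and more accurate simulated decisions, which is why drift rate is interpreted as the velocity of evidence accumulation.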

Linking Models

Finally, we linked all submodels in two linking structures. Whereas the three submodels only established latent measurement models for each of the three variable domains (neural data, behavioral data, and cognitive abilities data), the two linking structures specified structural associations between variable domains. Hence, the comparison of the two linking models contained the critical comparison: If the velocity of evidence accumulation mediated the relationship between neural speed and cognitive abilities, the mediation model should outperform a direct regression of cognitive abilities on ERP latencies.

We therefore specified two linking structures. In the first linking structure, we specified a regression model and predicted cognitive abilities test scores solely through neural processing speed by regressing the latent cognitive abilities factor \(g_{i}\) on the latent ERP latency factor \(B_{i}\) (see Fig. 1 and compare to Fig. 6), while the latent drift rate factor \(\Delta_{i}\) was unrelated to the other two latent variables.

The second linking structure consisted of a mediation model, in which the latent cognitive abilities factor \(g_{i}\) was regressed onto both the latent ERP latency factor \(B_{i}\) and the latent drift rate factor \(\Delta_{i}\), which was in turn regressed onto the latent ERP latency factor \(B_{i}\) (see Fig. 7).

The data of 92 randomly drawn participants (of 114 total; drawn without replacement) were used as a training set to find posterior distributions of cognitive latent variables (i.e., samples from probability distributions that reflect certainty/uncertainty about parameter estimates as reflected by the data). Standardized regression weights were calculated by multiplying each unstandardized regression weight by the ratio of the standard deviation of the predictor (the higher-order latent variable) to the standard deviation of the criterion (the lower-order latent or observed variable): \(\beta = b\cdot \frac {\sigma _{x}}{\sigma _{y}}\), where x denotes the predictor and y the criterion. The indirect mediation effect \(\beta_{indirect}\) was calculated by multiplying the standardized regression weights \(\beta_{2}\) and \(\beta_{3}\) in the mediation model as discussed by Baron and Kenny (1986). We report the median and the 2.5th and 97.5th percentiles, which form an equal-tailed 95% credible interval (CI), to describe the posterior distributions of standardized regression weights.
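The posterior summaries described here can be sketched as follows. The standardization follows the verbal description (predictor standard deviation divided by criterion standard deviation). The posterior draws are hypothetical stand-ins generated from normal distributions purely for illustration; they are not the MCMC samples of the study, and the function names are assumptions.

```python
import numpy as np

def standardize_weight(b, sd_predictor, sd_criterion):
    """Standardized weight: b scaled by SD(predictor) / SD(criterion)."""
    return b * sd_predictor / sd_criterion

def summarize_posterior(samples):
    """Posterior median and equal-tailed 95% credible interval."""
    lower, median, upper = np.percentile(samples, [2.5, 50.0, 97.5])
    return median, (lower, upper)

rng = np.random.default_rng(0)
# Hypothetical draws standing in for posterior samples of the two paths
# (drift rate -> abilities, neural speed -> drift rate).
beta_2 = rng.normal(0.23, 0.14, size=30_000)
beta_3 = rng.normal(0.17, 0.07, size=30_000)
# The indirect effect is computed draw by draw, so its posterior
# uncertainty reflects the uncertainty in both paths.
beta_indirect = beta_2 * beta_3
median_indirect, ci_indirect = summarize_posterior(beta_indirect)
```

Multiplying the samples before summarizing, rather than multiplying the two medians, yields a full posterior distribution for the indirect effect and therefore a credible interval for it.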

Model Evaluation

The performance of both linking structures was compared based on their in-sample prediction ability, their Deviance Information Criterion (Spiegelhalter et al. 2014), and, crucially, their out-of-sample-prediction ability of new participants’ data.

Table 1

Mean RTs (SD in parentheses) for all conditions of the three experimental tasks

| Task | Condition | Session 1: Accuracies | Session 1: RTs (ms) | Session 2: Accuracies | Session 2: RTs (ms) |
|---|---|---|---|---|---|
| Choice reaction time task | CRT2 | .99 (.01) | 382.79 (58.02) | 1.00 (.01) | 381.27 (61.01) |
| Choice reaction time task | CRT4 | .99 (.01) | 477.22 (82.64) | .98 (.02) | 467.31 (85.70) |
| Recognition memory task | Set size 1 | .97 (.02) | 590.96 (115.67) | .98 (.02) | 584.02 (135.64) |
| Recognition memory task | Set size 3 | .97 (.02) | 728.46 (167.21) | .98 (.03) | 706.61 (176.81) |
| Recognition memory task | Set size 5 | .97 (.03) | 890.03 (240.74) | .95 (.09) | 850.98 (223.18) |
| Letter matching task | Physical identity | .98 (.02) | 617.79 (93.93) | .98 (.02) | 605.19 (102.41) |
| Letter matching task | Name identity | .98 (.02) | 699.50 (113.02) | .97 (.02) | 704.38 (126.36) |

In-Sample Prediction

Fitting the model with the training set, we created posterior predictive distributions by simulating new neural, behavioral, and cognitive abilities data separately for each participant based on each participant’s posterior distributions of model parameters and on model specifications. Hence, we simulated two posterior predictive data sets for each of the 92 participants in the training set: One of these posterior predictive data sets was based on model specifications and parameter estimates of the regression model, and the other one based on model specifications and parameter estimates of the mediation model. Subsequently, we assessed how strongly these simulated data were related to the observed data for the whole sample of 92 participants separately for each of the two candidate models. For this purpose, we compared (a) observed and predicted ERP latencies for each ERP component c, experimental task j, and session m; (b) observed and predicted RT distributions and accuracies for each condition k, experimental task j, and session m; and (c) observed and predicted IQ test scores for each subtest t. Because accuracies in elementary cognitive tasks are typically near ceiling, the prediction of accuracies was considered less critical than the prediction of the other three variables in the present study. RT distributions were compared by comparing the 25th, 50th, and 75th percentiles of the observed and predicted RT distributions. To quantify the association between observed and predicted values, we calculated \(R^{2}_{pred}\) as the proportion of variance of values T (ERP latencies, percentiles of the RT distribution, accuracies in the experimental tasks, cognitive abilities test scores) explained by model predictions. This statistic is based on the mean squared error of prediction of T, \(MSEP_{T}\), and the estimated variance of T across participants, \(\widehat {Var(T)}\).
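A statistic built from these two quantities that behaves as described is \(R^{2}_{pred} = 1 - MSEP_{T} / \widehat{Var(T)}\). The sketch below assumes this form and uses the sample variance (denominator n − 1) as the variance estimate; the function name and that choice of estimator are assumptions, not details confirmed by the text.

```python
import numpy as np

def r2_pred(observed, predicted):
    """Proportion of variance in observed values explained by predictions."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    msep = np.mean((observed - predicted) ** 2)   # mean squared error of prediction
    return 1.0 - msep / np.var(observed, ddof=1)  # sample variance across participants

# Perfect predictions give 1; predictions that are worse than the spread
# of the observed values themselves give a negative value.
perfect = r2_pred([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
reversed_order = r2_pred([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
```

Unlike a squared correlation, this quantity penalizes systematic bias in the predictions, so it can fall below zero when predictions are evaluated on data that were not used for fitting.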

Deviance Information Criterion (DIC)

DIC is a measure of goodness-of-fit for hierarchical models that provides a penalty for model complexity (Spiegelhalter et al. 2014). DIC can be thought of as an extension of the Akaike information criterion (AIC) to hierarchical models that enforce shrinkage, for which the number of parameters k is no longer useful as a penalty for model complexity. Another alternative is the Bayesian information criterion (BIC), which approximates the logarithm of the Bayes Factor (i.e., the ratio of Bayesian probabilities for two comparison hypotheses), but which is difficult to estimate in most hierarchical models (Kass and Raftery 1995). Due to ease of estimation and implementation in JAGS (Plummer 2003), we used DIC as a known model comparison metric. Smaller DIC values indicate more favorable models. However, we consider out-of-sample prediction of new participants to be the ultimate test of models that natively penalizes model complexity due to overfitting of in-sample data.
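For orientation, the textbook form of the DIC adds an effective-number-of-parameters penalty to the posterior mean deviance. The sketch below shows only this textbook definition; JAGS may estimate the complexity penalty differently, so it should not be read as the computation that produced the values reported in this article. The function and argument names are hypothetical.

```python
import numpy as np

def dic_from_deviance(deviance_samples, deviance_at_posterior_mean):
    """Textbook DIC: mean deviance plus effective number of parameters p_D."""
    mean_deviance = float(np.mean(deviance_samples))
    # p_D: how much the average deviance exceeds the deviance at the
    # posterior mean of the parameters.
    p_d = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + p_d, p_d

# Toy numbers for illustration only.
dic_value, p_d = dic_from_deviance([10.0, 12.0, 14.0], 11.0)
```

Because shrinkage reduces the effective number of parameters below the nominal count, this penalty is smaller than the count-based penalty that AIC would apply to the same hierarchical model.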

Out-of-Sample Prediction

A test set of 22 new participants (the randomly drawn remaining participants) was used to find a second set of posterior predictive distributions for each participant. This test set allowed us to assess how well models were able to predict new participants’ data in one domain (e.g., cognitive abilities) based on data from the other two domains (e.g., electrophysiological and behavioral data). We iteratively predicted data from each of the three domains (electrophysiological, behavioral, and cognitive abilities data) by the other two for each new participant and each of the two models. Out-of-sample prediction was then evaluated in each of the three data domains using \(R^{2}_{pred}\) as a measure of variance explained in variables of one domain by variables from the other two domains. Note that there is no constraint of \(R^{2}_{pred}\) in out-of-sample evaluation to values above 0. Negative values indicate that there is more deviation of the predicted values from the true values than there is variance in the true values themselves.

Open-Source Data and Analysis Code

Results

Mean performance (reaction times and accuracies) in the three experimental tasks is shown in Table 1. Grand-average waveforms of event-related potentials are presented in Fig. 4. See Table 2 for mean ERP latencies in both sessions.

Table 2

Mean ERP Latencies (SD in parentheses) averaged across conditions of each of the three experimental tasks

| Session | Task | P2 (ms) | N2 (ms) | P3 (ms) |
|---|---|---|---|---|
| Session 1 | Choice reaction time task | 211.54 (32.82) | 206.15 (27.71) | 330.67 (44.26) |
| Session 1 | Recognition memory task | 234.08 (34.48) | 251.11 (42.05) | 374.35 (74.76) |
| Session 1 | Letter matching task | 222.26 (33.74) | 247.87 (36.80) | 414.97 (86.45) |
| Session 2 | Choice reaction time task | 208.44 (33.77) | 210.38 (29.62) | 324.40 (42.04) |
| Session 2 | Recognition memory task | 230.35 (28.19) | 248.48 (43.74) | 382.39 (81.13) |
| Session 2 | Letter matching task | 218.16 (25.27) | 240.02 (44.65) | 377.74 (75.09) |

In-Sample Prediction

The first linking model (see Figs. 5 and 6), in which cognitive abilities were solely predicted by neural processing speed, provided an acceptable account of the training data. On average, it explained 63% of the variance in cognitive abilities tests, 62% of the variance in ERP latencies, 87% of the variance in the 25th percentile of the RT distribution, 89% of the variance in the 50th (median) percentile of the RT distribution, 83% of the variance in the 75th percentile of the RT distribution, and 30% of the variance in accuracies in reaction time tasks. Note that the cognitive latent variable model may have explained more variance in reaction times than in ERP latencies and cognitive abilities tests because the measurement model of reaction times was more complex (allowing the task-, condition-, and session-specific estimation of boundary separation and non-decision time models not depicted in the structural equation model visualization) than the other two more parsimonious measurement models. The DIC of the overall hierarchical model with the first linking structure was \(-3.2012 \times 10^{5}\) and was thus the favored model by the DIC (compared to the second linking structure DIC below). The latent neural processing speed variable predicted the latent cognitive abilities variable to a large degree, β = .84, CI 95% [.75; .91], suggesting that participants with greater cognitive abilities showed a substantially higher neural processing speed.

The second linking model (see Figs. 7 and 5), in which the effect of neural processing speed was partly mediated by drift rates, also provided a good account of the training data. It explained on average 63% of the variance in cognitive abilities tests, 63% of the variance in ERP latencies, 89% of the variance in the 25th percentile of the RT distribution, 90% of the variance in the 50th (median) percentile of the RT distribution, 83% of the variance in the 75th percentile of the RT distribution, and 25% of the variance in accuracies in reaction time tasks. The explained variance is therefore nearly identical to the first linking model. The DIC of the model with the second linking structure was \(-3.2007 \times 10^{5}\), a larger, and thus unfavored, DIC compared to the previous model. Again, the latent neural processing speed variable predicted the latent cognitive abilities variable, \(\beta_{1}\) = .78, CI 95% [.63; .89]. Individual latent neural processing speeds also predicted individual latent drift rates, \(\beta_{3}\) = .17, CI 95% [.05; .33]. However, there was only weak evidence that greater latent drift rates predicted greater cognitive abilities, \(\beta_{2}\) = .23, CI 95% [−.05; .52]. In addition, we found some evidence for a negligible indirect effect of neural processing speed on cognitive ability test scores that was mediated by drift rates, \(\beta_{indirect}\) = .04, CI 95% [−.01; .09]. See Fig. 8 for posterior density distributions of the standardized regression weights. To compare both models, we calculated DICs as measures of model fit. The difference between DICs of ΔDIC = 43.27 indicated that the mediation model could not provide a better account of the data than the more parsimonious regression model.

Out-of-Sample Prediction of New Participants

To evaluate the ability to predict unknown data of a new participant in one domain (e.g., unknown cognitive ability test scores) from observed data in another domain (e.g., observed ERP latencies), we assessed out-of-sample-prediction ability for both models in a test set of 22 randomly drawn participants.

Given a new participant’s ERP and RT data, the regression linking model (see Fig. 6) yielded the ability to make somewhat accurate predictions of that participant’s cognitive abilities test scores and ERP latencies. That is, out-of-sample prediction explained 39% of the variance in cognitive abilities tests across participants and tasks and 22% of the variance in ERP latencies across participants and tasks. However, out-of-sample prediction of reaction time data was not successful, \(R^{2}_{pred} = -.51\) in the 25th percentile of the RT distribution, \(R^{2}_{pred} = -.50\) in the 50th (median) percentile of the RT distribution, and \(R^{2}_{pred} = -.67\) in the 75th percentile of the RT distribution. Accuracies could also not be predicted successfully, \(R^{2}_{pred} = -1.22\). Note that \(R^{2}_{pred}\) is not constrained to values above 0 in out-of-sample prediction. Hence, negative values indicated that there was more deviation of the predicted values from the true values than there was variance in the true values themselves. The lack of a successful prediction of behavioral data is not surprising, as the regression model contained no link between drift rates and the other covariates.

The mediation linking model (see Fig. 7) produced very similar predictions of participants’ cognitive ability test scores and ERP latencies. Out-of-sample prediction explained 36% of the variance in cognitive abilities tests across participants and tasks and 23% of the variance in ERP latencies across participants and tasks. Again, prediction of out-of-sample reaction time data was not successful, \(R^{2}_{pred} = -1.10\) in the 25th percentile of the RT distribution, \(R^{2}_{pred} = -.96\) in the 50th (median) percentile of the RT distribution, \(R^{2}_{pred} = -2.09\) in the 75th percentile of the RT distribution, and \(R^{2}_{pred} = -1.46\) for accuracies in the reaction time tasks. This lack of a successful prediction of the behavioral data indicates that the covariation of drift rates with ERP latencies and intelligence test scores on the latent level was insufficient to account for observed reaction time data in specific tasks and conditions. The predictive failure likely results from the small latent association of drift rates with ERP latencies and cognitive abilities, but also from large proportions of task- and condition-specific variances in condition-specific drift rates that were not predicted by any covariates.

Discussion

We investigated whether the association between neural processing speed and general cognitive abilities was mediated by the velocity of evidence accumulation. For this purpose, we used a Bayesian cognitive latent variable modeling approach that allowed the joint modeling of behavioral, neural, and cognitive abilities data and estimation of relationships between higher-order latent variables. The cognitive latent variable model was able to predict a substantial amount of variance in cognitive ability test scores in new participants solely based on those participants’ cortical processing speeds.

We observed a strong association between neural processing speed and general cognitive abilities in the way that individuals with greater cognitive abilities showed shorter latencies of ERP components associated with higher-order cognition. Moreover, we found that individuals with greater neural processing speed also showed a greater velocity of evidence accumulation. Given an individual’s speed of neural information processing and evidence accumulation, we could predict about 40 percent of their variance in intelligence test scores. However, the association between neural processing and general cognitive abilities was only mediated by drift rates to a very small degree, and the more complex mediation model did not provide a better account of the data than the more parsimonious regression model.1

These results support the idea that a greater speed of neural information processing facilitates evidence accumulation, and that this increase in the velocity of evidence accumulation translates to some negligible degree to advantages in general cognitive abilities. Although previous studies reported substantial correlations between drift rates and cognitive abilities (Schmiedek et al. 2007; Schmitz and Wilhelm 2016; van Ravenzwaaij et al. 2011), and although preliminary results suggested that measures of neural processing speed and drift rates can load onto the same factor (Schubert et al. 2015), the present study provided the first direct test of the hypothesis that the velocity of evidence accumulation mediates the relationship between neural processing speed and cognitive abilities. Our results suggest that only a very small amount of the shared variance between neural processing speed and cognitive abilities can be explained by individual differences in the velocity of evidence accumulation as a mediating cognitive process. In the following sections, we provide three conceptual explanations why the velocity of evidence accumulation may only explain little of the natural variation in human cognitive abilities associated with cerebral processing speed. Subsequently, we discuss methodological advantages, challenges, and possible extensions of the cognitive latent variable model used in the present study.

1. A Common Latent Process

Both neural processing speed and the velocity of evidence accumulation may reflect properties of the same latent process that is related to general cognitive abilities. However, the drift rate may be a more impure measure of this latent process or may be contaminated by properties of other processes unrelated to cognitive abilities. This position is supported by the observation that we found an association between ERP latencies and drift rates, and by our result that drift rates mediated the relationship between ERP latencies and cognitive abilities at least partially. Moreover, this explanation is consistent with previous research, which suggested that the P3 may be a neural correlate of the evidence accumulation process captured by drift rates (Kelly and O’Connell 2013; O’Connell et al. 2012; Ratcliff et al. 2009, 2016; van Ravenzwaaij et al. 2017). The fact that the associations between neural processing speed and drift rates were lower than the correlations reported in the literature may be due to deviations from previous studies: First, the current study focused on ERP latencies as measures of neural processing speed, whereas previous studies analyzed the relationship between amplitude and capacity-related measures of the EEG and drift rates. Second, previous studies focused mostly on late centro-parietal potentials, whereas the current study included a more diverse time course and topography of ERP components. Third, we only related the latent neural processing speed factor, which reflected the shared variance between different ERP latencies across different tasks, to the latent drift rate factor, and did not inspect task- or component-specific correlations. Considering the psychometric properties of both ERP latencies and drift rates (Schubert et al. 2015; Schubert et al. 2017), it is highly likely that associations between ERP latencies and drift rates would have been higher if we had modeled correlations separately for each condition of each experimental task. However, this task- or condition-specific variance in ERP latencies and drift rates is not of interest regarding general cognitive abilities.

2. Other Candidate Cognitive Processes

The velocity of evidence accumulation may not be the appropriate candidate process mediating the relationship between neural processing speed and cognitive abilities. Instead, shorter latencies of ERP components associated with higher-order cognitive processing may reflect a faster inhibition of extraneous processes and may thus be a neural correlate of the efficiency of selective attention (Polich 2007). The idea that attentional processes underlie individual differences in cognitive abilities has been discussed numerous times. Process overlap theory (Kovacs and Conway 2016), for example, proposes that a limited number of domain-general and domain-specific cognitive processes contribute to individual differences in general cognitive abilities. In the framework of process overlap theory, attentional processes represent a central domain-general bottleneck that constrains cognitive performance across different tasks. This notion is supported by several studies reporting substantial associations between measures of attentional control and executive processes and general cognitive abilities (e.g., Unsworth et al. 2014; Wongupparaj et al. 2015).

Additionally, a greater neural processing speed may directly facilitate the storage and updating of information in working memory (Polich 2007), and may thus lead to a greater working memory capacity, which may positively affect performance in a large number of cognitive tasks. This notion is supported by numerous studies reporting large and even near-unity correlations between measures of cognitive abilities and working memory capacity (e.g., Engle et al. 1999; Conway et al. 2002; Kyllonen and Christal 1990). Individual differences in these working memory processes may not be reflected in drift rates estimated in simple binary decision tasks. Instead, future studies could use mathematical models of working memory, such as mathematical implementations of the time-based resource sharing model (Barrouillet et al. 2004) or the SOB-CS (Oberauer et al. 2012), to explicitly model individual differences in parameters of working memory processes and relate these parameters to neural data in a cognitive latent variable model.

Finally, it might even be possible that several cognitive processes mediate the relationship between neural processing speed and cognitive abilities, and that parameters of each single cognitive process only account for a small amount of the substantial association. Larger multivariate studies incorporating cognitive models of these candidate cognitive processes would be required to quantify additive and multiplicative effects of different cognitive processes on the relationship between neural processing speed and general cognitive abilities.

3. Brain Properties as Confounding Variables

Individual differences in neural processing speed may reflect structural properties of the brain that give rise to individual differences in cognitive abilities. Brain properties may be related both to neural processing speed and general cognitive abilities and may thus explain the substantial association between the two variables. Previous research has shown that individuals with greater cognitive abilities showed greater nodal efficiency in the right anterior insula and the dorsal anterior cingulate cortex (Hilger et al. 2017). These brain regions are core components of the salience network that is assumed to be responsible for the detection of salient information and its evaluation with regard to behavioral relevance and an individual’s goals (Downar et al. 2002; Menon and Uddin 2010; Seeley et al. 2007). Dynamic source imaging and lesion studies have revealed that the relative timing of responses of the anterior insula and the dorsal anterior cingulate cortex to stimuli can be indexed by the N2b/P3a component of the ERP, followed by an elicitation of the P3b in neocortical regions in response to the attentional shift (Soltani and Knight 2000; Menon and Uddin 2010). Hence, a more efficient functional organization of the salience network may affect the timing of these ERP components and may also positively affect performance in cognitive ability tests by facilitating the goal-driven selection of task-relevant information.

Cognitive Latent Variable Models

The use of cognitive latent variable models allows the simultaneous modeling of cognitive, neural, and behavioral data across different tasks and ability tests. CLVMs thus allow estimating latent correlations between different measurement areas that are free of unsystematic measurement error. This property is particularly useful when dealing with time-related electrophysiological data, which have been shown to be very inconsistent in their reliability (Cassidy et al. 2012; Schubert et al. 2017). Moreover, CLVMs allow modeling the shared variance between diffusion model parameters across different tasks and conditions in a hierarchical way and can thus solve the problem of low-to-moderate consistencies of model parameters in individual differences research (Schubert et al. 2016).

Three advantages of the hierarchical Bayesian approach have been highlighted by the present study: First, the CLVM demonstrated advantages over classical structural equation modeling approaches in its predictive abilities in small-to-moderate sample sizes. The model has been developed based on only 92 participants and has successfully predicted 62 to 89% of the within-sample variance in neural, behavioral, and cognitive abilities data. A conventional structural equation model with the same number of free parameters would require a substantially larger sample size. Following the rule of thumb to collect at least five observations per estimated parameter (Bentler and Chou 1987), the same model would require a sample size of at least 480 participants in a conventional SEM framework. Taking into account the ratio of indicators to free parameters r (r = number of indicators/number of free parameters), a sample size of at least 930 participants would be required according to the equation \(n = 50 \cdot r^{2} - 450 \cdot r + 1100\) proposed by Westland (2010) based on the simulation results by Marsh et al. (1998). Such large sample sizes are hardly feasible for neuroimaging research except in large-scale collaborative research projects. The Bayesian approach presented here enabled us to fit a structural equation model of great complexity to a sample of only 92 participants. Most importantly, one of the main results previously shown in a more parsimonious conventional structural equation model applied to the same data set (i.e., the great association between neural processing speed and cognitive abilities reported by Schubert et al. 2017) was adequately recovered by the Bayesian model.
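The two sample-size heuristics mentioned above can be written out as a short sketch. The figure of 96 free parameters is not stated directly but follows from the reported minimum of 480 participants at five observations per parameter. The ratio used in the second example call is a hypothetical value chosen only to show how the formula is evaluated, and the function names are assumptions.

```python
def n_five_per_parameter(n_free_parameters, per_parameter=5):
    """Rule of thumb: at least five observations per estimated parameter."""
    return per_parameter * n_free_parameters

def n_westland(r):
    """Lower bound n = 50 r^2 - 450 r + 1100, with r = indicators / free parameters."""
    return 50 * r ** 2 - 450 * r + 1100

# 480 participants at five per parameter implies 96 free parameters.
n_free_parameters = 480 // 5
n_rule_of_thumb = n_five_per_parameter(n_free_parameters)
# Hypothetical ratio of indicators to free parameters, for illustration only.
n_example = n_westland(0.4)
```

Because the quadratic decreases as r grows toward 4.5, models with few indicators per free parameter, such as the one discussed here, are the ones for which this heuristic demands the largest samples.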

Moreover, the latent drift rate trait and task-, condition-, and state-specific boundary separation and non-decision time parameters could account for nearly 90% of the variance in the in-sample reaction time data. In comparison, latent diffusion model parameter traits have been shown to account for only 36 to 39% of variance in single-task parameter estimates in a conventional structural equation model (Schubert et al. 2016). This in-sample prediction ability demonstrates that it may be beneficial to model only parameters with known trait properties (e.g., drift rate, see Schubert et al. 2016) as hierarchical factors, while the other model parameters that are known to be more strongly affected by task-specific influences (e.g., non-decision time and boundary separation, see Schubert et al. 2016) are estimated separately for each task and condition.

Second, both the cognitive model and the structural model were fitted to the data in a single step, allowing an accurate representation of parameter uncertainty in posterior distributions (Vandekerckhove 2014), whereas previous studies relating diffusion model parameters to cognitive ability tests have relied on a two-step process (e.g., Schmiedek et al. 2007; Schmitz and Wilhelm 2016; Schubert et al. 2015).

Third, posterior distributions of model parameters were used to predict cognitive ability test scores from neural and behavioral data in a second independent sample. This is the first study to show that posterior predictives of regression weights relating ERP latencies, behavioral data, and cognitive ability test scores may be used to successfully generalize predictions to another independent sample and to predict a substantial amount of variance in new individuals’ cognitive ability test scores solely based on their electrophysiological and behavioral data. That about 40% of new participants’ variance in intelligence test scores could be predicted by the model demonstrates that individual differences in cortical and behavioral processing speed are closely related to general intelligence, and that both models retained their ability to predict previously unseen data despite their complexity.
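The logic of such an out-of-sample prediction can be illustrated with a small sketch. It assumes simulated posterior draws of an intercept and two regression weights (in the study these would come from the fitted CLVM), applies every draw to the predictors of new participants, and averages the resulting predictions before computing the proportion of explained variance. All names, dimensions, and values are hypothetical and do not reproduce the authors' model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws of regression weights (n_draws x n_predictors),
# standing in for the output of a fitted hierarchical model.
n_draws, n_predictors, n_new = 2000, 2, 50
true_weights = np.array([-0.6, 0.3])
weight_draws = true_weights + 0.05 * rng.standard_normal((n_draws, n_predictors))
intercept_draws = 0.05 * rng.standard_normal(n_draws)

# Simulated standardized predictors and test scores for a new sample.
x_new = rng.standard_normal((n_new, n_predictors))
y_new = x_new @ true_weights + 0.8 * rng.standard_normal(n_new)

# One prediction per posterior draw and participant: shape (n_draws, n_new).
pred_draws = intercept_draws[:, None] + weight_draws @ x_new.T

# Point prediction = mean over draws; the interval below reflects
# parameter uncertainty only (no residual noise is added).
y_pred = pred_draws.mean(axis=0)
lower, upper = np.percentile(pred_draws, [2.5, 97.5], axis=0)

ss_res = np.sum((y_new - y_pred) ** 2)
ss_tot = np.sum((y_new - y_new.mean()) ** 2)
r2_new = 1.0 - ss_res / ss_tot
print(f"out-of-sample R^2: {r2_new:.2f}")
```

Carrying the full set of draws through to the predictions, rather than plugging in point estimates, is what lets the uncertainty in the regression weights show up in the predicted scores.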

The model developed in the present study can be easily adjusted to include different sources of neural data, such as functional magnetic resonance imaging or diffusion tensor imaging data, and to relate these data to diffusion model parameters and cognitive ability tests. Within the same hierarchical framework, parameters of different cognitive models could be related to neural and cognitive abilities data. This would, for example, allow testing hypotheses about the relationship between parameters of working memory processes and neural and cognitive abilities data. The flexibility of the hierarchical Bayesian approach allows specifying model and linking structures directly guided by theoretical assumptions, which in turn allows direct comparisons of contradicting theories. In related areas of research, the joint modeling of neural and behavioral data has contributed to our understanding of episodes of mind wandering (Mittner et al. 2014; Hawkins et al. 2015), the dynamic inhibitory processes underlying intertemporal choice (Turner et al. 2018), stopping behavior (Sebastian et al. 2018), the role of attention in perceptual decision making (Nunez et al. 2017), the neurocognitive processes contributing to individual differences in mental rotation (van Ravenzwaaij et al. 2017), and the neurocognitive mechanisms underlying several other cognitive processes. All of these fields of research are of great relevance for individual differences research and may contribute to our understanding of the neurocognitive mechanisms underlying general cognitive abilities. In order to relate covariates to joint models of neural and cognitive behavioral data, different linking strategies have been suggested, ranging from simple regression models to multivariate factor-analytical approaches (e.g., Turner et al. 2017; Turner et al. 2017; Ly et al. 2017; de Hollander et al. 2016).

Limitations

One limitation of the present study is that the tasks used to assess individual differences in the efficiency of information processing are so-called elementary cognitive tasks. Elementary cognitive tasks are cognitively relatively undemanding tasks typically used in individual differences research to minimize the influence of individual differences in strategy use and of previous experience with these tasks on task performance. However, cognitively more demanding tasks might yield a stronger association between the velocity of evidence accumulation and cognitive abilities. Whether drift rates based on performance in more demanding tasks such as working memory tasks mediate the association between neural processing speed and cognitive abilities remains an open question. In addition, low error rates may have limited the estimation and interpretation of diffusion model parameters. In particular, identifying drift rate and boundary separation parameters becomes difficult in tasks with few incorrect responses. Although diffusion model parameters provided a good account of the behavioral data in all three tasks, drift rate parameters might have reflected participants’ decision times to a larger degree than their evidence accumulation rates.
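The identification problem with few errors can be made concrete with the closed-form expressions for a simplified diffusion process (unbiased starting point, no across-trial variability, diffusion constant s = 1). This simplification is an assumption for illustration and is not the model fitted in the study; the parameter values are made up. The sketch shows that two settings with the same ratio of boundary separation a to drift rate v produce near-ceiling accuracy and almost identical mean decision times, so these two summary statistics alone barely separate the parameters.

```python
import math


def accuracy(v, a, s=1.0):
    """P(correct) for drift v, boundary separation a, start point a/2."""
    return 1.0 / (1.0 + math.exp(-v * a / s**2))


def mean_decision_time(v, a, s=1.0):
    """Mean decision time (without non-decision time) for the same process."""
    return (a / (2.0 * v)) * math.tanh(v * a / (2.0 * s**2))


# Illustrative (v, a) pairs that share the same ratio a / v.
settings = [(3.0, 2.0), (6.0, 4.0)]
for v, a in settings:
    print(
        f"v={v}, a={a}: accuracy={accuracy(v, a):.4f}, "
        f"mean decision time={mean_decision_time(v, a):.4f}"
    )
```

The shape of the reaction time distribution still carries some information in such cases, which is one reason why fitting full distributions remains preferable to relying on accuracy and mean latency alone.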

Conclusions

We used a cognitive latent variable model approach to show that a higher neural information processing speed predicted both the velocity of evidence accumulation and general cognitive abilities, and that only a negligible part of the association between neural processing speed and cognitive abilities was mediated by individual differences in the velocity of evidence accumulation. The model demonstrated impressive forecasting abilities by predicting 35 to 40% of the variance of individual cognitive ability test scores in an entirely new sample solely based on their electrophysiological and behavioral data.

Our results illustrate, however, that the assumption of a unidirectional causal cascade model, in which a higher neural processing speed facilitates evidence accumulation, which may in turn give rise to advantages in general cognitive abilities, was not supported by the data. This result provides important novel insights for intelligence research, because the great associations between both neural and behavioral processing speed and cognitive abilities reported in previous studies may have suggested that a greater neural processing speed gives rise to greater cognitive abilities by facilitating the velocity of evidence accumulation (Schmiedek et al. 2007; Schubert et al. 2017). Our results contradict this hypothesis and instead suggest that neural correlates of higher-order information processing and drift rates might reflect the same latent process that is strongly related to general intelligence. Future research will reveal whether structural or functional brain properties may act as confounding variables giving rise to the association between mental speed and mental abilities by affecting both the speed of information processing and general cognitive abilities.

Footnotes

We fitted another variant of the mediation model, in which reaction times were described by a normal distribution instead of a diffusion model distribution to evaluate the benefits of diffusion modeling and the generalizability of our results (for details regarding modeling choices and results, see the online repository). The model predicted the same amount of in-sample variance in ERP latencies and intelligence test scores, but was less accurate in predicting reaction time data (75–84% of explained variance in percentiles of the RT distribution). The out-of-sample prediction of both reaction time data and cognitive ability test scores also deteriorated, with R² values ranging from −1.79 to −2.40 for the percentiles of the RT distribution and only 30% of explained variance in cognitive ability test scores. Taken together, these results illustrate the benefits of diffusion modeling and support the notion of a small mediating effect of drift rate, as predictability of cognitive abilities decreased when drift was not included in the model.
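Negative out-of-sample R² values, such as those reported above for the RT percentiles, arise when predictions are worse than simply predicting the mean of the new sample. A minimal sketch, assuming the common definition R² = 1 − SS_res/SS_tot (the exact computation used in the study is not specified in this section) and using made-up reaction times:

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Made-up reaction times in seconds for five new participants.
observed = np.array([0.40, 0.45, 0.50, 0.55, 0.60])

# Predictions with small errors versus predictions with a constant offset.
good_pred = observed + np.array([0.01, -0.01, 0.0, 0.01, -0.01])
biased_pred = observed + 0.15

r2_good = r_squared(observed, good_pred)
r2_biased = r_squared(observed, biased_pred)
print(f"small errors: R^2 = {r2_good:.2f}")
print(f"systematic offset: R^2 = {r2_biased:.2f}")
```

In this toy case the biased predictions preserve the rank order of participants perfectly, yet R² is negative because the constant offset inflates the residual sum of squares beyond the total sum of squares.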

Notes

Acknowledgments

The authors thank Gidon T. Frischkorn, Ramesh Srinivasan, and members of the Human Neuroscience Laboratory for their constructive criticism on work related to this manuscript.

Funding Information

This work was supported by the National Science Foundation [No. 1658303] and the G.A.-Lienert-Foundation.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
