Recent studies have reported improvements in a variety of cognitive functions following sole working memory (WM) training. In spite of the emergence of several successful training paradigms, findings on the scope of transfer effects have remained mixed. This is most likely due to the heterogeneity of cognitive functions that have been measured and tasks that have been applied. In the present study, we approached this issue systematically by investigating transfer effects from WM training to different aspects of executive functioning. Our training task was a demanding WM task that requires simultaneous performance of a visual and an auditory n-back task, while the transfer tasks tapped WM updating, coordination of the performance of multiple simultaneous tasks (i.e., dual-tasks) and sequential tasks (i.e., task switching), and the temporal distribution of attentional processing. Additionally, we examined whether WM training improves reasoning abilities, a hypothesis that has so far gained mixed support. Following training, participants showed improvements in the trained task as well as in the transfer WM updating task. As for the other executive functions, trained participants improved in a task switching situation and in attentional processing. There was no transfer to the dual-task situation or to reasoning skills. These results, therefore, confirm previous findings that WM can be trained, and additionally, they show that the training effects can generalize to various other tasks tapping executive functions.

Introduction

In recent years, interest in “brain training” and its mechanisms has grown rapidly. Such training aims to improve cognitive functions that were previously considered stable abilities unaffected by training. One of the most studied topics in this area has been working memory (WM) training. The concept of WM refers to a limited-capacity system that includes a short-term storage of information and the functions of updating and manipulating the storage contents. Studies have shown that the capacity of WM predicts performance in several other cognitive tasks ranging from simple attentional tasks (Kane et al., 2001; Bleckley et al., 2003; Fukuda and Vogel, 2009) to tasks tapping more complex abilities, such as reading comprehension (Daneman and Carpenter, 1980), reasoning and problem-solving (Kyllonen and Christal, 1990; Engle et al., 1991; Fry and Hale, 1996; Barrouillet and Lecas, 1999; Engle et al., 1999), along with executive functioning in everyday life (Kane et al., 2007). Accordingly, one could expect that training-related increases in WM efficiency are reflected in improvements in several other functions.

Although these studies offer intriguing insights into the potential of WM training, the diversity of training and transfer effects is still obscure. In other words, despite the vast amount of training literature, we are still far from a comprehensive understanding of which cognitive functions may benefit from WM training. The present study aimed to contribute to answering this question by systematically investigating which cognitive improvements following WM training can transfer to other tasks and situations. In particular, we focused on executive control processes. To our knowledge, there exists no study that has specifically investigated transfer from WM training to executive functions. This is somewhat surprising, considering that executive functions are involved in the control and coordination of various sub-processes or tasks (e.g., Miyake et al., 2001). Due to the general nature of these functions, we assume that they are involved in a number of situations and tasks, for instance in the coordination of the performance of multiple tasks; in attention tasks that require either selective attention or attentional switches; as well as in tasks, such as comprehension and learning, that require activation of representations in long-term memory. Since WM is essential in the execution of all of these processes (e.g., Baddeley, 1996a), we assume that WM training also benefits performance in tasks requiring such functions.

We trained participants on a task that has recently been shown to improve performance in tests of Gf, namely the dual n-back (Jaeggi et al., 2008, 2010). The dual n-back task is an inherently complex task that taps various executive processes. This is because it consists of two n-back tasks—a visuospatial (VS) and an auditory-verbal (AV) one—and they have to be performed simultaneously. An n-back task alone requires diverse executive processes, such as WM updating, monitoring of ongoing performance, and inhibition of irrelevant items. In the dual n-back, the presentation of two n-back tasks in different modalities calls for yet additional processes, such as dividing of attentional resources and managing the performance of two simultaneous tasks (Jaeggi et al., 2008). Accordingly, training on the dual n-back could presumably have separable, advantageous effects on the different executive functions it engages. Another crucial component of the task is that it is adaptive; that is, the level of difficulty is constantly adjusted according to each individual's performance. As a consequence, the development of task specific strategies is minimized, which is a prerequisite in training WM processes as such, independent of the trained material (Klingberg et al., 2002, 2005; Jaeggi et al., 2008).

We specified four executive functions that seem to correspond to particular requirements of the dual n-back, and investigated transfer effects from training to tasks measuring these four processes separately. First, the n-back task taxes WM updating processes: while new, relevant stimuli have to be coded into WM, old, irrelevant items have to be replaced (Morris and Jones, 1990; Miyake et al., 2000). In accordance with the dual modality nature of the training paradigm, we included three WM updating tasks: an AV task, a VS task, and a dual-modality task involving both AV and VS items. All tasks included stimulus sequences of varying lengths, and after each sequence participants had to reproduce the four last presented items of the sequence in the correct order. As it cannot be anticipated by the participants at which point the four last items have to be reported, this task requires continuous updating of WM contents. Previous studies have already reported increases in the amount of correctly reported item sequences following training on similar updating tasks, as well as transfer effects to an n-back task (Dahlin et al., 2008a,b). Therefore, we tested whether participants would show improvements in a WM updating task following training on the dual n-back task.

Second, a key feature of the dual n-back is the requirement to coordinate the concurrent performance of two tasks. To investigate whether training-related improved coordination of performing two simultaneous tasks would generalize beyond the training task, our second transfer task required dual-task performance, although with a reduced WM load as compared with our training task. Generally, executing two simultaneous tasks leads to increases in reaction times (RTs) and error rates, in contrast to a situation in which only one task has to be performed. In speeded choice RT tasks, these dual-task costs are assumed to be the consequence of capacity-limited task processes (e.g., central response selection), which prevent the concurrent performance of two temporally overlapping tasks. In situations of the psychological refractory period (PRP) type, performance of two temporally overlapping tasks varies as a function of the interval between the two tasks [stimulus onset asynchrony (SOA)]. Dual-task costs occur mainly in the second task: the shorter the SOA, the more the response to the second task is delayed (Pashler, 1994; Schubert, 1999). Training the performance of two concurrent tasks has been shown to improve dual-task performance as indicated by reduced dual-task costs. Among other findings, such studies have reported that practice can decrease dual-task costs by improving task coordination skills (Liepelt et al., 2011; Strobach et al., 2012a, in press). In the present dual-task paradigm of the PRP type, in each trial first an auditory and then a visual discrimination task was presented, with varying SOAs between these tasks. Participants responded to both tasks in the order of presentation as fast and as correctly as possible. Considering the demand of our training task to simultaneously perform two tasks tapping two different modalities, we investigated whether dual-task costs would decrease in a multimodal dual-task of the PRP type following dual n-back training.

Alternatively, one could assume that the dual-task coordination skills required in the dual n-back and in the PRP paradigm differ: while the dual n-back task requires the correct performance of two simultaneous tasks in WM, in the PRP paradigm the emphasis is on RTs when performing two tasks that are separated by a varying interval. Thus, it is possible that the dual-task coordination skills that consist of successful coordination of two simultaneous tasks within WM, and that are gained in dual n-back training, do not manifest as improvements in the PRP task, which in turn indexes the speed of processing two tasks.

Third, simultaneous performance of both n-back tasks requires rapid switching between the two task streams. Typically, task switching leads to longer RTs compared with situations in which the same task is repeated. This delay is explained by task-set reconfiguration processes that need to take place before the execution of the next task (Rogers and Monsell, 1995; Monsell, 2003). However, previous research has shown that task-switch abilities can be improved by training (Minear and Shah, 2008; Karbach and Kray, 2009; Strobach et al., 2012a,b). To investigate whether improved task-switching abilities gained after training on the dual n-back would transfer to task-switch performance, we included a transfer task that taps task switching processes. This paradigm comprises two tasks: letter categorization and digit categorization (Rogers and Monsell, 1995). In every trial, a stimulus pair consisting of a letter and a digit is presented and the participant has to perform either one of the categorization tasks so that in every other trial the tasks switch. In this way, switch and repetition trials alternate in these so-called mixed blocks. Performance in task switching situations can be measured in different ways, depending on what processes one is interested in. Sustained control processes, including maintaining task-set information and selecting between two tasks, are reflected in mixing costs. These are obtained by comparing performance in the repetition trials of mixed blocks with performance in trials of single-task blocks (i.e., blocks in which only one of the tasks is completed throughout the whole block; Meiran et al., 2000). The flexibility of task-switching abilities is indicated by switch costs, which are obtained by contrasting switch trials with repetition trials within mixed blocks. Consistent with the requirement in the dual n-back to both maintain task information of two different tasks and to switch between the tasks, we tested whether there would be a transfer effect to the mixing and/or switch costs in a task switching paradigm following training.
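
The two cost measures defined above can be made concrete with a minimal Python sketch. It is an illustration only, not the analysis code of the study: the function name `task_switching_costs` and the RT values are hypothetical, and mean RT is assumed as the summary statistic.

```python
from statistics import mean


def task_switching_costs(single_rts, mixed_repeat_rts, mixed_switch_rts):
    """Return (mixing_cost, switch_cost) in the unit of the input RTs.

    mixing cost: repetition trials of mixed blocks minus single-task trials
    switch cost: switch trials minus repetition trials within mixed blocks
    """
    single = mean(single_rts)
    repeat = mean(mixed_repeat_rts)
    switch = mean(mixed_switch_rts)
    return repeat - single, switch - repeat


# Hypothetical RTs in ms, invented for illustration
mixing_cost, switch_cost = task_switching_costs(
    single_rts=[500, 520],
    mixed_repeat_rts=[600, 640],
    mixed_switch_rts=[800, 820],
)
print(mixing_cost, switch_cost)
```

A training-related transfer effect would then appear as a larger pre- to post-test reduction of one or both costs in the training group than in the control group.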

Fourth, training on the dual n-back task engages attention processes. Specifically, it requires continuous switching of attention between items in WM, so that when attending to a new item, attention is detached from an old, irrelevant item. These operations require efficient control of attention under strong time pressure. A typical finding in studies of the temporal distribution of attentional resources is the attentional blink (AB). When two targets are presented in a rapid serial visual presentation (RSVP) stream, separated by a temporal interval between 200 and 500 ms, the detection of the second target (T2) is impaired; that is, attention “blinks.” It is not clearly established yet what causes the blink, but several models emphasize the role of central capacity limitations for the occurrence of the AB: attentional resources are depleted by the processing of the first target (T1), thus causing a deterioration in the processing of T2 (Shapiro et al., 1994; Chun and Potter, 1995; Jolicœur, 1998; Dux and Harris, 2007). However, the AB is not insensitive to training effects, as reported in a study by Slagter and colleagues (2007). In their experiment, participants attended a three-month meditation training, after which an improvement in T2 detection was observed, that is, a decrease in the AB. In the present study, we hypothesized that the demands of the dual n-back may lead to an increase in attentional control by improving the abilities to distribute attentional resources. In accordance with the dual-modality nature of our training task, we included a cross-modal AB paradigm, which consists of two concurrently presented rapid serial presentation streams: a visual and an auditory one (Arnell and Jolicœur, 1999). There are two targets presented in each stream and the targets are separated by either a short or a long lag. Participants are required to detect the visual T1 and the auditory T2. We investigated whether correct T2 reports at the short lag would change from pre- to post-test after training, which would indicate a decrease in the magnitude of the AB.

Finally, we tested the hypothesis that WM training leads to increases in Gf. To date, evidence concerning this intriguing hypothesis has been inconclusive: some studies have reported improvements in reasoning tests following WM training (Klingberg et al., 2002, 2005; Olesen et al., 2004), while others have failed to show such transfer (Dahlin et al., 2008a; Thorell et al., 2009; Chein and Morrison, 2010). We administered the Raven's Advanced Progressive Matrices (RAPM) test, which is a classical measure of reasoning skills (Raven, 1990). We expected that, in line with the findings of Jaeggi and colleagues (2008), participants attending extensive and demanding WM training would score higher in the RAPM after training than before it, compared with their untrained counterparts. This assumption is plausible given that we provided an amount of training similar to that received by the groups whose reasoning scores increased after training in the study of Jaeggi and colleagues (2008).

There are several ways in which the transfer effects could arise. For example, transfer could occur when the training task and the transfer task engage shared processes of a single skill. For instance, Dahlin and colleagues (2008a) showed transfer from WM updating training to a 3-back task but not to other cognitive measures. Since both the training and the n-back task required continuous updating of WM contents, the authors inferred that the sharing of this process by the two tasks enabled the transfer effect. Along these lines, improvements gained via dual n-back training should be observed also in tasks that tap the respective executive functions involved in the dual n-back. On the other hand, it is possible that the training task affects a relevant domain-general mechanism that underlies both the training and the transfer tasks. Evidence in favor of this account was recently provided by Chein and Morrison (2010), who showed transfer from WM training to a broader scope of cognitive processes. They administered four weeks of training on complex verbal and spatial WM tasks taxing several different processes, such as encoding, attention, and WM updating. After training, improvements were demonstrated in other WM tasks as well as in cognitive control and complex reading comprehension tasks. Since training affected inherently different abilities, Chein and Morrison inferred that the training task must have affected a domain-general mechanism. The authors proposed that such a mechanism is likely to be responsible for attentional control processes that coordinate the maintenance of WM contents, irrespective of their modalities (verbal vs. spatial). Such a domain-general mechanism could comparably be affected by dual n-back training. Together, these accounts suggest that transfer effects could emerge in the current study, although it remains open whether such transfer would emerge for each of the applied transfer tasks given the differences in the underlying executive functions.

In summary, the present study set out to investigate whether training effects from the dual n-back transfer to (1) a WM updating task, (2) dual-tasks with different demands on WM updating, (3) task switching, and (4) an AB task. Additionally, transfer to reasoning abilities was tested. Participants in the training group trained on the dual n-back task for 14 days, before and after which they attended pre- and post-tests on the training task as well as the five transfer tasks. In order to rule out mere retest effects, performance of the training group in each task was contrasted with the performance of a control group that underwent no training, but had a temporal interval between the pre- and post-tests equivalent in length to the training period of the training group. We are aware of the possible problems related to the use of an inactive control group, and these will be addressed in the discussion.

Materials and Methods

Subjects

Altogether 38 university students were recruited via announcements on notice boards at the psychology department of the Ludwig-Maximilians-University (LMU) Munich. They were randomly assigned to two groups. While 20 participants (five male, mean age 24.4 years, two left-handed) took part in the training program, 18 participants (four male, mean age 24.5 years, two left-handed) were assigned to a control group that did not attend training; these group sizes exceeded the size in most of the training studies included in a recent review in the field of cognitive training (Morrison and Chein, 2011). Participants in both groups were equally rewarded with a monetary compensation of €8 per hour and all had normal or corrected-to-normal vision.

Design and Procedure

In the beginning, all participants completed the four transfer tasks as well as the dual n-back task. For the next three weeks, the training group attended 14 daily training sessions (excluding weekends) on the dual n-back, while the control group underwent no training. Approximately three weeks after the first assessments, all participants attended a post-test on the dual n-back task and on the four transfer tasks. Additionally, in the beginning, all participants attended a pre-test session on the RAPM. However, only 13 participants from the training group and nine participants from the control group were available for a RAPM post-test. All tasks, except for the RAPM, were computerized and all tasks were performed in a laboratory. During the dual n-back sessions as well as in the RAPM pre- and post-tests, several participants could complete the tasks at the same time, while for the other tasks, only one participant at a time was tested. In all computerized tasks, except for task switching, responses were given on a German standard computer keyboard (QWERTZ).

Materials

Training task

Our training task, the dual n-back, utilized the material described by Jaeggi and colleagues (2007), including simultaneously presented AV and VS stimuli (Figure 1). The AV stimuli consisted of eight German consonants (C, G, H, K, P, Q, T, and W) spoken in random order via headphones. The VS stimuli were blue squares presented one by one on a black background, randomly in eight possible locations. All stimuli were presented for 500 ms, and the interstimulus interval (ISI) was 2500 ms, resulting in a presentation rate of one stimulus every 3 s. A white fixation cross was present throughout each run. Participants reacted by pressing the key “A” with their left index finger for the VS task (i.e., match of square position in the present and n-back trial) and the key “L” with their right index finger for the AV task (i.e., match of consonant in the present and n-back trial). A new run was commenced by pressing the space-bar. Each run started with instructions about the level of n in the upcoming run, and ended with feedback on the participant's performance in the preceding run. The level of n was always the same in both tasks, with each training session starting from level n = 2. For each consecutive run, the n-back level was automatically adjusted as follows: if the participant had at least 90% correct in both modalities in the previous run, the level of n in the next run was increased by one. If, however, the participant had at most 70% correct in either of the modalities, the level of n was decreased by one in the next run, with the minimum level always being n = 1. In all other cases the n-level stayed constant between successive runs. Altogether, 20 runs were completed in each session, and one run consisted of 20 + n trials (e.g., a 2-back task contained 22 trials). The dependent measure was the mean n-back level achieved during a training session.
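
The adaptive rule described above can be summarized in a short Python sketch. It is a minimal illustration under stated assumptions, not the original training software: the function names are hypothetical, and accuracies are assumed to be passed as proportions between 0 and 1 (how accuracy was computed per run is not specified here).

```python
def next_n_level(n, visual_accuracy, auditory_accuracy):
    """Return the n-back level for the next run.

    Accuracies are proportions correct in the previous run (assumed 0..1).
    """
    # At least 90% correct in BOTH modalities: increase n by one
    if visual_accuracy >= 0.90 and auditory_accuracy >= 0.90:
        return n + 1
    # At most 70% correct in EITHER modality: decrease n by one, minimum n = 1
    if visual_accuracy <= 0.70 or auditory_accuracy <= 0.70:
        return max(1, n - 1)
    # Otherwise the level stays constant
    return n


def trials_in_run(n):
    """A run consists of 20 + n trials."""
    return 20 + n


# Each session starts at n = 2; the accuracies below are invented
level = 2
for visual, auditory in [(0.95, 0.92), (0.85, 0.80), (0.95, 0.65)]:
    level = next_n_level(level, visual, auditory)
print(level, trials_in_run(level))
```

Because the increase requires both modalities to reach the criterion while the decrease is triggered by either one, the level tracks the weaker of the two n-back tasks.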

Figure 1. Example of a 2-back condition in the dual n-back task that was used as the training task. The visual and auditory stimuli are presented simultaneously at identical rates. Figure adapted from Buschkuehl and colleagues (2007).

Transfer tasks

Updating. This task included AV and VS stimuli. The AV stimuli consisted of the numbers 1, 2, 3, and 4, spoken in German and presented through headphones. The VS stimuli were black bars that appeared one by one in four different locations on the vertical axis of a computer screen. All stimuli were presented for 2000 ms with an ISI of 1000 ms. Each trial included a list of sequentially presented stimuli, and the list lengths were 5, 7, 9, 11, 13, and 15 items. On the presentation of the digits 1, 2, 3, and 4 in the AV task, participants responded by pressing the keys “Y”, “X”, “C”, and “V” with the little, ring, middle, and index fingers of the left hand, respectively. In the VS task, responses were given using the right hand. Participants pressed the key “.” with the little finger for a bar presented in the uppermost part of the screen, the “,” key with the ring finger for a bar presented slightly above the middle of the screen, the “M” key with the middle finger for a bar presented slightly below the middle of the screen, and the “N” key with the index finger for a bar presented in the lowermost part of the screen. Altogether three blocks of 10 trials each were completed. The first block contained only AV stimuli, the second block only VS stimuli, and in the last block the AV and the VS stimuli were presented simultaneously. In the first two blocks, immediately following the presentation of a list, participants were asked to report the four last presented items of that list in the correct order. In the third block the task was the same; however, it was randomly required to reproduce either the last four AV or the last four VS items (the respective correct modality was indicated in the request presented after each sequence, i.e., “Please report the four last positions” or “Please report the four last digits”). In each task, participants were instructed to constantly update the four last items during the presentation of the lists. 
No speeded responses were required, but the participants were informed that a new list would start automatically after a fixed period of time (6000 ms) following the question about the last four presented items. Here, the outcome measure was the number of correctly reported four-item sequences, in each block separately.
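
The outcome measure of the updating task can be sketched in Python as follows. The sketch assumes all-or-nothing scoring of each four-item report, as implied by counting correctly reported sequences; the function names and the example lists are hypothetical, and the handling of missing or late responses is not covered.

```python
def sequence_correct(presented, reported):
    """True if the report matches the last four presented items in order."""
    return list(reported) == list(presented)[-4:]


def block_score(trials):
    """Number of correctly reported four-item sequences in one block.

    `trials` is an iterable of (presented_list, reported_items) pairs.
    """
    return sum(1 for presented, reported in trials
               if sequence_correct(presented, reported))


# Invented example: a 7-item list and a 5-item list of spoken digits
trials = [
    ([1, 3, 2, 4, 1, 2, 3], [4, 1, 2, 3]),  # correct
    ([2, 2, 4, 1, 3], [2, 4, 3, 1]),        # right items, wrong order
]
print(block_score(trials))
```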

Dual-task. The dual-task comprised two discrimination tasks. Task 1 was an auditory task in which participants had to react according to the pitch of a tone that was low (350 Hz), medium (950 Hz), or high (1650 Hz). Task 2 was a visual task in which participants were instructed to react according to the size of a triangle that was small (3.0° × 3.0°, of visual angle), medium (3.6° × 3.6°), or large (5.3° × 5.3°). Each trial started with the presentation of a white horizontal line in the middle of the screen, and it remained visible through the whole trial. Stimulus presentation followed 500 ms later. In the dual-task blocks, each trial started with the presentation of Task 1, followed by Task 2, and the SOA was randomly 50, 100, or 400 ms. In task instructions the correct order of responses (that is, first to Task 1, and then to Task 2) was emphasized. The intertrial interval (ITI) following correct trials was 1000 ms. After an erroneous response the word “Error” appeared on the screen for 1500 ms and the ITI was extended to 2500 ms. In the auditory task, responses were given with the left hand, by pressing the key “C” with the index finger, “X” with the middle finger, and “Y” with the ring finger for a low, medium, and high tone, respectively. The right hand was used for reactions in the visual task, by pressing “N” with the index finger, “M” with the middle finger, and “,” with the ring finger for a large, medium-size, and small triangle, respectively. The whole experiment included five blocks, of which the first was a single-task block with Task 1 and the second was a single-task block with Task 2. Each of these blocks contained 45 trials. The last three blocks were dual-task blocks of 54 trials each. In all blocks participants were instructed to respond as fast and as correctly as possible. The RTs and error rates of Task 1 and Task 2 were used as the dependent measures.

Task switching. Each trial consisted of the presentation of a character pair including a digit that was either even (2, 4, 6, 8) or odd (3, 5, 7, 9) and a letter that was either a consonant (G, K, M, R) or a vowel (A, E, I, U). One pair at a time was presented in the center of a cell of a 2 × 2 grid. The first pair of each block appeared in the upper left cell, and the presentation of the following pairs moved always to the next cell clockwise. Each trial lasted until participant's response, or until 5000 ms had elapsed. The ITI was 150 ms; however, after an erroneous trial it was extended to 1500 ms and during this time also a tone of 30 ms in length was presented to indicate error. Participants were instructed to perform a number discrimination task (even vs. odd) and a letter discrimination task (consonant vs. vowel). They were asked to respond as fast and as correctly as possible with a response-box including two keys, by pressing the left key with the left index finger for even digits or consonants, and the right key with the right index finger for odd digits or vowels. Altogether six blocks of 48 trials each were completed. The first two blocks were single-task blocks: one letter categorization and one digit categorization block; and their order was counterbalanced across participants. The last four blocks were mixed blocks, in which both tasks had to be performed so that whenever the stimulus pair appeared in one of the upper cells of the grid, the digit categorization task was to be performed, and whenever the pair appeared in one of the lower cells of the grid, the participant had to perform the letter categorization task. Thus, half of the trials in these blocks were trials in which the same task was repeated from one trial to the next, and half were switch-trials in which the task switched. RT and error rates were used as outcome measures.
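
The trial structure of the mixed blocks follows directly from the clockwise movement through the grid, as the following Python sketch illustrates. The names are hypothetical, and labeling the first trial of a block as neither a repetition nor a switch is an assumption of this sketch rather than a detail given in the text.

```python
CELLS = ["upper left", "upper right", "lower right", "lower left"]  # clockwise


def task_for_cell(cell):
    """Upper cells call for the digit task, lower cells for the letter task."""
    return "digit" if cell.startswith("upper") else "letter"


def correct_key(digit, letter, task):
    """Left key: even digit or consonant. Right key: odd digit or vowel."""
    if task == "digit":
        return "left" if digit % 2 == 0 else "right"
    return "right" if letter in "AEIU" else "left"


def mixed_block_schedule(n_trials=48):
    """Return a list of (cell, task, trial_type) tuples for one mixed block."""
    schedule = []
    previous_task = None
    for i in range(n_trials):
        cell = CELLS[i % 4]
        task = task_for_cell(cell)
        if previous_task is None:
            trial_type = "first"  # assumption: no preceding trial to compare
        elif task == previous_task:
            trial_type = "repeat"
        else:
            trial_type = "switch"
        schedule.append((cell, task, trial_type))
        previous_task = task
    return schedule


schedule = mixed_block_schedule()
n_repeat = sum(1 for _, _, kind in schedule if kind == "repeat")
n_switch = sum(1 for _, _, kind in schedule if kind == "switch")
print(n_repeat, n_switch)
```

Under this labeling, a 48-trial block yields 24 repetition and 23 switch trials, which corresponds to the roughly equal split described above.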

Attentional blink. This task included visual and auditory stimuli comprising letters of the alphabet (excluding N, X, C, and Y), and the digits 1, 2, 3, and 4. All visual items appeared sequentially in the same location in the middle of the screen. The auditory stimuli were presented through headphones. Each trial consisted of a concurrently presented visual and auditory stream. The lengths of the streams varied randomly, with one stream including 13, 15, 17, 19, or 21 items. Each stream consisted of mainly letters, except for two digits that appeared concurrently at two positions in the two modalities (i.e., simultaneous visual and auditory digits at position A and simultaneous visual and auditory digits at position B). The positions of the digits in the streams varied randomly, so that the first digits were presented at position 5, 7, 9, 11, or 13 and the second digits followed either three or six positions later. Each stimulus was presented for 80 ms, and with an ISI of 13 ms the presentation rate of the stimuli was 10.75 stimuli per second. Thus, the lag between the first and the second digit pair was either 279 ms or 558 ms. The first trial of a block was commenced by pressing the space-bar, and the following trials started automatically once the preceding trial had ended. In each trial, first a fixation cross was presented (500 ms), followed by a blank screen (500 ms), after which the auditory and the visual streams started simultaneously. At the end of each trial the participants were asked about the identities of the first visual digit (T1) and of the second auditory digit (T2). Responses were given with the right hand, using the number pad of a keyboard. Altogether two blocks with 40 trials each were completed. The critical outcome measure was the proportion of correctly identified T1 and T2.
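
The timing values reported above follow from the stimulus duration and the ISI, as this brief Python sketch shows. The function names are illustrative and not taken from the experimental software.

```python
STIMULUS_MS = 80  # presentation duration of each item
ISI_MS = 13       # interstimulus interval


def onset_interval_ms():
    """Time from the onset of one item to the onset of the next."""
    return STIMULUS_MS + ISI_MS


def presentation_rate_per_s():
    """Number of items presented per second."""
    return 1000 / onset_interval_ms()


def lag_ms(positions):
    """Onset-to-onset lag between two items `positions` apart in the stream."""
    return positions * onset_interval_ms()


print(round(presentation_rate_per_s(), 2))  # 10.75
print(lag_ms(3), lag_ms(6))                 # 279 558
```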

RAPM. The RAPM consists of 36 test items, in each of which the task is to select a correct alternative among several possibilities to a matrix of patterns in which one pattern is missing. To enable the administration of the test two times (pre- and post-test) while avoiding test repetition effects, all participants completed either the odd-numbered or the even-numbered problems in the pre-test and the other half in the post-test (counterbalanced between participants). In both sessions, participants were given 20 min to finish the test (i.e., half of the time allotted for the whole test in the test manual). The dependent measure was the number of correctly solved problems.

Results

We first conducted a multivariate analysis of variance (MANOVA, Pillai's Trace) with Group (training vs. control) as a between-subject factor and Session (pre-test vs. post-test) as a within-subject factor on the data of each task as dependent variables (i.e., the mean level of n in the dual n-back, the number of correctly reported items in the WM updating task, RTs in Task 1 and Task 2 in the dual-task as well as in each trial type of task-switching, and the proportion of correct target identifications in the AB task). Since RTs were our primary measures in dual-task and in task-switching situations, we did not include the error rate data of these tasks in the MANOVA. This analysis yielded significant main effects of Group [F(17, 54) = 3.78, p < 0.001, ηp² = 0.54] and Session [F(17, 54) = 3.32, p < 0.001, ηp² = 0.51] as well as a significant Group × Session interaction [F(17, 54) = 3.39, p < 0.001, ηp² = 0.52], which indicated that there were reliable group-specific performance changes from pre- to post-test. In the following we report the follow-up analyses for each task.

Training Task

Owing to technical problems, the data of two participants in the control group were lost (one male, one female), and thus, the analyses for the dual n-back task included the data of 16 control participants. A 2 (Group: training vs. control) × 2 (Session: pre-test vs. post-test) mixed-design analysis of variance (ANOVA) yielded main effects of Group [F(1, 34) = 29.18, p < 0.001, ηp² = 0.46] and Session [F(1, 34) = 60.52, p < 0.001, ηp² = 0.64], indicating that the trained group generally showed higher n-back levels (M = 3.63) than the control group (M = 1.24), and that the achieved mean n-back level at post-test (M = 3.78) was higher than that at pre-test (M = 2.31) across groups. Importantly, the Group × Session interaction was significant [F(1, 34) = 54.94, p < 0.001, ηp² = 0.62], indicating a larger improvement of the training group than that of the control group (Table 1, Figure 2). This was confirmed by paired t-tests that showed a significant difference between the pre-test and post-test performances of the training group [t(19) = −8.70, p < 0.001] and no such difference for the control group (p > 0.44). There was no difference between the performances of the two groups at pre-test (p = 0.49).

TABLE 1

Table 1. Pre- and post-test performance as well as the effect sizes for pre- and post-test comparisons of the training group and the control group in each transfer task.

FIGURE 2

Figure 2. Improvement in the performance of the training group through the training period and the performance of the control group in the pre- and post-tests in the dual n-back task. For each session, the mean n-back level is presented. Error bars indicate standard errors of the mean.

Transfer Tasks

Means and standard deviations in pre-test and in post-test, as well as effect sizes of the pre-test—post-test comparisons are presented in Table 1 for each task, separately for the training group and the control group.

Updating

A 2 (Group: training vs. control) × 2 (Session: pre-test vs. post-test) × 3 (Block: AV vs. VS vs. dual-modality) mixed-design ANOVA conducted on the mean number of correctly reported four-item sequences yielded a main effect of Session [F(1, 36) = 11.95, p < 0.005, η2p = 0.25], reflecting the fact that the participants reported more sequences correctly at post-test (M = 4.21) than at pre-test (M = 3.27). The main effect of Block was also significant [F(2, 72) = 57.93, η2p = 0.62], which confirmed that the number of correctly reported sequences varied between the three blocks (AV: M = 5.11; VS: M = 4.08; dual-modality: M = 1.98). The Group × Session × Block interaction reached significance [F(2, 72) = 3.60, p < 0.05, η2p = 0.09], suggesting that the interaction of Session and Block was modulated by the factor Group. Therefore, each block was separately submitted to a 2 (Group: training vs. control) × 2 (Session: pre-test vs. post-test) ANOVA. For the AV and dual-modality blocks, the Group × Session interaction was not significant (both p's > 0.3). However, for the VS block, this interaction was reliable [F(1, 36) = 5.48, p < 0.05, η2p = 0.13]. Bonferroni-corrected paired t-tests conducted for the pre-test and post-test performances of the training and the control group confirmed that the trained participants showed an increase in the number of correctly reported four-item sequences [t(19) = −2.49, p < 0.05], while there was no difference for the control group between their pre- and post-test performances (p > 0.48) (Figure 3). The two groups did not differ in their pre-test performance (p = 0.80), but they did differ in their post-test performance [t(17) = 3.02, p < 0.01, Cohen's d = 0.82]. The main effect of Group and the remaining interactions were non-significant (all p's > 0.10).
These results suggest that the trained participants improved in the VS updating task but not in the AV or the dual-modality task, and that the improvement of the training group in the VS task was not driven by differences in the groups' performances already at pre-test.

FIGURE 3

Figure 3. The number of correctly reported four-item sequences in the VS updating task. Performance for both groups is illustrated separately for pre-test and post-test. Error bars indicate standard errors of the mean.

Dual-task

The RTs and error rates in Task 1 and in Task 2 were analysed separately with mixed-design 2 (Group: training vs. control) × 2 (Session: pre-test vs. post-test) × 3 (SOA: 50 ms vs. 100 ms vs. 400 ms) ANOVAs. For the RT analyses, we excluded trials in which an erroneous response was made in either one or both of the tasks.

Task 1. Participants were faster in post-test (M = 866 ms) than in pre-test (M = 932 ms), as confirmed by the significant main effect of Session [F(1, 36) = 13.15, p < 0.005, η2p = 0.27] in the RT analysis. The analysis of error rates revealed that participants made fewer errors in post-test (M = 5.79%) than in pre-test (M = 9.62%), as indicated by the significant main effect of Session [F(1, 36) = 6.33, p < 0.05, η2p = 0.15]. The main effect of SOA [F(2, 72) = 3.97, p < 0.05, η2p = 0.10] revealed that the proportion of errors varied as a function of SOA (error rate for SOA 50 ms: M = 8.44%; for SOA 100 ms: M = 7.83%; and for SOA 400 ms: M = 6.78%). No further main effect and no interaction reached significance in the Task 1 data (all p's > 0.10). These results indicate that both groups improved their performance from pre- to post-test equally; thus, there was no training-related improvement in Task 1 performance.

Task 2. A main effect of Session [F(1, 36) = 40.16, p < 0.001, η2p = 0.53] was obtained, indicating that the RTs in post-test (M = 989 ms) were significantly faster than in pre-test (M = 1098 ms). Additionally, the main effect of SOA was significant [F(2, 72) = 590.01, p < 0.001, η2p = 0.94], revealing the typical PRP effect in that the mean RTs decreased as the SOA increased (mean RT for SOA 50 ms: M = 1177 ms; for SOA 100 ms: M = 1115 ms; and for SOA 400 ms: M = 838 ms). The error rate analysis revealed a significant main effect of Session [F(1, 36) = 4.72, p < 0.05, η2p = 0.12], showing that more errors were made in pre-test (M = 7.31%) than in post-test (M = 4.18%). No other main effect and no interaction were significant in the Task 2 data (all p's > 0.07). The results of the error rate analyses are thus in concordance with the results of the RT as well as the Task 1 analyses, which showed an equal improvement for the training group and the control group from pre- to post-test, indicating that there was no training-related improvement in the dual-task performance.
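
The size of the PRP effect can be illustrated directly from the Task 2 means reported above, as the difference between the mean RT at the shortest SOA and the mean RT at the longest SOA. The sketch below is only an illustration based on the means averaged across groups and sessions; the function name is hypothetical and this calculation was not part of the reported analyses.

```python
def prp_effect(mean_rt_by_soa):
    """Return the Task 2 RT difference (ms) between the shortest and longest SOA."""
    shortest_soa = min(mean_rt_by_soa)
    longest_soa = max(mean_rt_by_soa)
    return mean_rt_by_soa[shortest_soa] - mean_rt_by_soa[longest_soa]


# Mean Task 2 RTs (ms) per SOA (ms), as reported above
task2_mean_rt = {50: 1177, 100: 1115, 400: 838}

effect_ms = prp_effect(task2_mean_rt)
print(effect_ms)  # 339
```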

Task switching

We conducted separate three-way mixed-design ANOVAs with factors Group, Session, and Trial type for analysing mixing costs and switch costs. In both analyses, the first two factors were identical (Group: training vs. control and Session: pre-test vs. post-test). In the analysis for mixing costs, the factor Trial type included data (RTs and error rates) from repetition trials vs. single-task trials; while in the analysis for switch costs, this factor included data (RTs and error rates) from switch trials vs. repetition trials. Due to an error in data acquisition, one participant in the training group had more than 87% incorrect responses on each trial type in the post-test, for which reason this subject's data was omitted from the task switching analyses. In the RT analyses of the remaining data, trials with incorrect responses (5.6% of trials) were excluded.

Mixing costs. We were interested in whether training affected sustained control processes, reflected as mixing costs in our task switching paradigm. The analysis on the mixing costs revealed a main effect of Session [F(1, 35) = 51.14, p < 0.001, η2p = 0.59], indicating faster RTs in post-test (M = 719 ms) than in pre-test (M = 803 ms). The RTs were also faster in single-task trials (M = 716 ms) than in repetition trials (M = 806 ms), [F(1, 35) = 28.12, p < 0.001, η2p = 0.45]. Furthermore, two interactions were significant. First, the reliable Group × Session interaction [F(1, 35) = 4.38, p < 0.05, η2p = 0.11] reflects the fact that the training group's improvement from pre-test to post-test was larger (M = 108 ms) than that of the control group (M = 59 ms). Second, and importantly, the three-way interaction Group × Session × Trial type was also significant [F(1, 35) = 4.55, p < 0.05, η2p = 0.12], which suggests that the group-specific improvement is differently expressed for different types of trials. Two further Group × Session ANOVAs were conducted separately on the RTs in single-task trials and repetition trials in order to investigate, which types of trials showed the stronger group-specific training effect. For the single-task trials, only the main effect of Session reached significance [F(1, 35) = 14.51, p < 0.001, η2p = 0.29], such that all participants improved from pre-test (M = 744 ms) to post-test (M = 689 ms). The analysis for the repetition trials revealed a reliable main effect of Session [F(1, 35) = 55.13, p < 0.001, η2p = 0.61] but, additionally, the Group × Session interaction reached significance [F(1, 35) = 8.52, p < 0.01, η2p = 0.20], confirming that the improvement of the training group from pre-test to post-test was larger (M = 155 ms) than that of the control group (M = 68 ms) (Figure 4). This indicates a greater improvement of the training group in mixing costs, compared with the control group. 
Other main effects or interactions or results from the analysis on error rates were not significant (all p's > 0.12).

FIGURE 4

Figure 4. Reaction times of the training and control groups in the repetition and single-task trials of the task switching experiment. Error bars indicate standard errors of the mean.

Switch costs. To investigate the effect of dual n-back training on the flexibility of task-switching abilities, we ran an analysis on the switch costs. This revealed a significant main effect of Session [F(1, 35) = 35.06, p < 0.001, η2p = 0.50], which indicated that the RTs were faster in post-test (M = 984 ms) than in pre-test (M = 1123 ms). Also the main effect of Trial type was significant [F(1, 35) = 306.80, p < 0.001, η2p = 0.90], indicating that the RTs in repetition trials (M = 806 ms) were faster than in switch trials (M = 1300 ms). An analysis for the error rates revealed only a significant main effect of Trial type [F(1, 35) = 98.96, p < 0.001, η2p = 0.74], indicating that the participants made more errors in switch trials (M = 8.02%) than in repetition trials (M = 2.99%). The other main effects and interactions were not significant (all p's > 0.06; for the important interaction Group × Session × Trial type p = 0.54), which indicates that the improvements from pre- to post-test were equal across both groups and that no group-specific transfer effects occurred for the switch costs.
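
To make the two cost measures concrete, mixing costs correspond to the RT difference between repetition trials and single-task trials, and switch costs to the RT difference between switch trials and repetition trials. The following sketch applies these definitions to the overall mean RTs reported above (averaged across groups and sessions); the function names are illustrative assumptions, and the statistical analyses themselves were run as ANOVAs with Trial type as a factor rather than on difference scores.

```python
def mixing_cost(repetition_rt, single_task_rt):
    """Mixing cost (ms): repetition trials in mixed blocks minus single-task trials."""
    return repetition_rt - single_task_rt


def switch_cost(switch_rt, repetition_rt):
    """Switch cost (ms): switch trials minus repetition trials."""
    return switch_rt - repetition_rt


# Overall mean RTs (ms) per trial type, as reported above
mean_rt = {"single": 716, "repetition": 806, "switch": 1300}

overall_mixing_cost = mixing_cost(mean_rt["repetition"], mean_rt["single"])
overall_switch_cost = switch_cost(mean_rt["switch"], mean_rt["repetition"])
print(overall_mixing_cost, overall_switch_cost)  # 90 494
```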

T1

The analysis yielded a significant main effect of Session [F(1, 36) = 11.81, p < 0.005, η2p = 0.25], indicating that the participants identified T1 more often correctly in post-test (M = 88.84%) than in pre-test (M = 82.99%). Also the main effect of Lag reached significance [F(1, 36) = 9.37, p < 0.005, η2p = 0.21], indicating that T1 was more often correctly identified in the long lag (M = 87.64%) than in the short lag (M = 84.19%). The main effect of Group and the interactions did not reach significance (all p's > 0.09), thus showing that training had no effect on T1 identification.

T2

The means were calculated using only trials in which T1 was identified correctly. Significant main effects of Session [F(1, 36) = 20.76, p < 0.001, η2p = 0.37] and Lag [F(1, 36) = 70.93, p < 0.001, η2p = 0.66] revealed that the participants identified T2 better in post-test (M = 58.03%) than in pre-test (M = 50.03%) as well as in the long lag (M = 60.73%) than in the short lag (M = 47.33%). The Group × Session interaction was significant as well [F(1, 36) = 6.14, p < 0.05, η2p = 0.15]. Follow-up analyses confirmed that the training group improved significantly in T2 identification from pre-test (M = 51.60%) to post-test (M = 63.73%) [t(19) = −5.04, p < 0.001], while the control group performed equally well in both sessions (p = 0.16). Other main effects or interactions were not significant (all p's > 0.07). Because the Group × Session interaction was not modulated by Lag, the improvement of the training group from pre-test to post-test was similar in the long and the short lag (Figure 5); that is, the training group improved in the identification of T2 across both lags.
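
The conditional scoring of T2 (i.e., T2|T1) can be sketched as follows. This is a minimal illustration assuming trial-level data coded as pairs of booleans; the function name and the example trials are hypothetical and do not come from the study's data.

```python
def t2_given_t1(trials):
    """Percentage of correct T2 reports among trials with a correct T1 report.

    `trials` is an iterable of (t1_correct, t2_correct) boolean pairs.
    """
    t2_outcomes = [t2_correct for t1_correct, t2_correct in trials if t1_correct]
    if not t2_outcomes:
        raise ValueError("No trials with a correct T1 report.")
    return 100.0 * sum(t2_outcomes) / len(t2_outcomes)


# Hypothetical trials: T1 is correct on three trials, and T2 is correct on two of those
example_trials = [
    (True, True),
    (True, False),
    (False, True),   # ignored: T1 was reported incorrectly
    (True, True),
    (False, False),  # ignored: T1 was reported incorrectly
]

accuracy = t2_given_t1(example_trials)
print(round(accuracy, 2))  # 66.67
```

Scoring T2 only on T1-correct trials ensures that a T2 miss is not simply a consequence of the participant having failed to process T1 at all.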

FIGURE 5

Figure 5. Proportion of correctly reported T2|T1 for both lags in pre-test and in post-test for the training group and the control group. Error bars indicate standard errors of the mean.

Raven's advanced progressive matrices (RAPM)

Performance scores of the participants who completed the RAPM in both the pre-test and the post-test were submitted to a 2 (Group: training vs. control) × 2 (Session: pre-test vs. post-test) mixed-design ANOVA. The training group gained higher scores (M = 13.77) than the control group (M = 9.94), [F(1, 20) = 9.69, p < 0.01, η2p = 0.33]. However, a significant Group × Session interaction [F(1, 20) = 5.25, p < 0.05, η2p = 0.21] indicated that the size of this group difference varied between the pre- and post-test sessions. While the training group showed higher scores than the control group in the pre-test session [t(8) = −3.69, p < 0.01], this difference disappeared in the post-test session (p > 0.8). This finding can probably be attributed to a general ceiling effect in the training group, which performed very well in both the pre- and the post-test sessions. Owing to its relatively low performance level in the pre-test session, the control group therefore had more room for improvement in the RAPM scores in the post-test session than the training group. In any case, we found no evidence for transfer effects of WM training to RAPM performance.

Discussion

The purpose of this study was to investigate which improvements in executive control functions achieved through WM training can generalize beyond the training task and situation. Within three weeks of training with a demanding WM task, the dual n-back, participants improved their performance significantly from the first to the last session. A control group that did not undergo training performed at the same level in post-test as in its pre-test three weeks earlier. The improvement of the training group generalized to three untrained tasks: a VS WM updating task, task switching, and an AB task. Importantly, the improvement of the training group was confirmed by a MANOVA. There was no transfer to an AV WM updating task, to a dual-modality WM updating task, or to a dual-task of the PRP type.

Transfer Effects

The nearest transfer occurred to the VS WM updating task. The dual n-back and the updating task share the requirement to constantly update WM contents. However, there are crucial differences between the tasks that must be noted. First of all, there are dissimilarities between the stimuli of the two tasks (blue squares vs. black bars). Furthermore, the presentation time of the stimuli in the transfer task is different from that of the training task. Most importantly, the two tasks engage different processes: the n-back requires recognition of stimuli, whereas in the updating task correct stimuli have to be recalled from WM. With these aspects in mind, it can be concluded that the training paradigm indeed enhanced the ability to update WM contents, independent of the trained material. Interestingly, this transfer effect was only seen in the VS modality and spared the AV modality. There are two possibilities, which are not mutually exclusive, to explain this observation. Firstly, it is plausible that the auditory WM system is more rehearsed or automatized as a result of everyday auditory experiences, because remembering auditory information demands effective rehearsal processes (for example to understand speech) (Baddeley, 2003). Thus, there could be less room for improvement than in visual WM, which for its part is not as strained in daily life (Baddeley, 1996b). According to our results, auditory WM updating is not insensitive to improvements related to task repetition, since we did see an improvement for both groups from pre-test to post-test in the AV WM updating task. However, to induce an effect of training on skill level, a more demanding task than the current auditory part of the dual n-back task would probably be required. The second possibility is related to a theory posited by Miyake and colleagues (2001), according to which VS WM is more closely related to executive functioning (or, “the central executive”) than verbal WM (see also Baddeley, 1996b).
It might therefore be that the training task indeed rehearsed a central executive mechanism; but, since such mechanism is more closely tied to VS WM processes than to auditory ones, the current transfer effect was more pronounced in the VS updating task.

As for task switching, we found a transfer effect that was reflected in mixing costs but not in switching costs. It, therefore, seems that the transfer effect did not tap transient processes related to task switching (i.e., the ability to rapidly switch between performing two different tasks), but rather covered processes concerning sustained control (i.e., maintaining the two task sets in WM and in selecting appropriately between them when task performance is required). To calculate the magnitude of mixing costs, we compared performance in repetition trials to that in single-task trials. Even though these two trial types both require the performance of the same task from one trial to the next, they differ from each other in one critical aspect. In repetition trials, one has to maintain two task sets in WM, while in single-task trials only one task set is sufficient. The observation of a transfer effect on mixing costs (i.e., the difference between repetition and single-task trials) is therefore, nicely in accordance with the nature of the training task, which requires efficient control over the contents of WM. It is also congruent with the results from the WM updating task, in that an improvement in WM updating was observed only in the VS task and the stimuli in our task switching paradigm were also presented visually. With respect to switch costs, they have been described to be—at least partly—a measure of interference from the preceding task set (Allport et al., 1994; Mayr and Keele, 2000; Monsell, 2003; Kiesel et al., 2010). Thus, it is conceivable that our trained participants showed no reduction in switch costs since the training task did not encourage inhibiting one or the other task: participants were explicitly instructed that only successful performance of both the AV and the VS task would make them advance to the next n-level. 
Thus, concentrating on only one of the tasks, and therefore having to inhibit the information from the other task, would not have led to a performance improvement. This interpretation would also be in accordance with the lack of transfer to the dual-modality updating task (see above), which specifically required inhibition of the irrelevant task modality at the response phase.

Finally, we found a transfer effect to the AB task, such that T2 identification was improved after training. Also T1 accuracy improved from pre-test to post-test, excluding the possibility that the improvement in T2 identification was a sole consequence of the participants simply attending more to T2 at the expense of T1. Since our AB task tapped both the visual and the auditory modality, this is the first time that a training-related effect to a cross-modal AB task is shown; note that previous studies have shown effects only within the visual modality (Green and Bavelier, 2003; Slagter et al., 2007).

In the present study, participants showed an improvement in T2 accuracy in both the short and the long lag. Therefore, we cannot infer that there was a specific decrease in the trained participants' AB, but only that they could report T2 more correctly in general. However, a closer inspection of our data shows that participants still seemed to manifest an AB at the long lag (i.e., they detected T2 worse than T1 even though T2 followed T1 beyond the supposed AB time frame of 500 ms). In that event, it could be that our long lag may have not been long enough for the T2 to surpass the effect of AB. Assuming that the AB was indeed decreased and that we missed it because of the properties of our task, this finding would suggest that the improvement in temporal dividing of attentional resources was transferable beyond the training task. This would be in accordance with a previous study by Oberauer (2006), in which it was suggested that WM training (specifically on the n-back task) leads to a speed up in attentional processes within WM, rather than to a pure increase in WM capacity. Theories of AB generally address the magnitude of AB to be dependent on the amount of attentional capture by T1 and on the efficiency of T1 processing (Shapiro et al., 1997, 2006). It is thus possible that the improvement in the auditory T2 identification in our paradigm came about by a reduced limitation of T2 encoding due to an improvement in the processing of the visual T1. This would particularly be consistent with the already reported effects of transfer to tasks in the visual modality (i.e., the VS WM updating task and task switching). In fact, in a study by Slagter and colleagues (2007), a decreased AB after meditation training was explained by more efficient processing of T1. This was evident in their electrophysiological (EEG) data as a smaller P3b-component for T1 after training. 
As the P3b-component generally reflects the allocation of attentional resources, Slagter and colleagues suggested that meditation training improved the participants' control over the distribution of attentional resources: they were more efficient in deploying resources to T1, thus leading to an increased T2 accuracy. The findings of Green and Bavelier (2003) are also consistent with our interpretation of an improved division of attentional resources in time. In their study, participants trained by playing action video games. Following training, T2 accuracy was improved, such that the trained participants recovered faster than non-trainers from the effects of AB.

There is, however, another study by Boot and colleagues (2008) that did not find transfer after video-game training to AB. We believe that this discrepancy could be due to general differences between the studies. For example, the AB task itself was somewhat different between these studies. In the Boot and colleagues' study the task was to identify T1 and to detect whether T2 appeared or not; whereas in our study the task was to identify both T1 and T2, and T2 also appeared in every trial. Moreover, we used a cross-modal AB task, while Boot and colleagues' AB task was purely visual. It is thus possible that our AB task was more sensitive to the type of training we implemented. Yet another critical difference between these studies is that the collection of the transfer tasks in the study by Boot and colleagues was different from the present study: while in the former study participants performed 12 different tasks, in the latter study participants performed only four different tasks. Thus, it is possible that the larger number of transfer tasks in the study by Boot and colleagues, compared with the number of transfer tasks in the present study (four tasks) and in the study by Green and Bavelier (three tasks) counteracted a possible manifestation of transfer in the AB task. This would be consistent with findings of Schmeichel (2007), who has shown that engaging in one task including an executive function component can have a debilitating effect on the performance in other executive function tasks.

Lacking Transfer Effects

Interestingly, training did not transfer to dual-task coordination skills, as revealed by a lack of training-related improvements in the PRP paradigm. Although we initially expected an improvement in dual-task abilities following training, the lack of transfer to the PRP task may not be surprising for two reasons. First, a key element of the training task was the demand to efficiently update WM contents, which was not essential for the transfer situation in the PRP dual-task. Second, the training task did not require speeded processing and execution of appropriate stimulus-response mappings, which is an essential characteristic of dual-task processing in the PRP task type (Schubert, 1999, 2008). Thus, the lack of commonalities between the dual-task processing in the trained dual n-back task and the transfer PRP dual-task situation may have prevented specific transfer effects between the two task situations.

We also found no transfer to Gf, as measured by the RAPM. This finding is consistent with the study by Jaeggi and colleagues (2008), which used the same training paradigm and found no transfer to the RAPM after eight sessions of training. However, another study by Jaeggi and colleagues (2010) did find transfer to RAPM after 20 sessions of dual n-back training. There is a critical difference between the way the RAPM was administered in the present study and in those other studies: Jaeggi and colleagues (2008, 2010) applied the test with a time restriction (20 min), whereas in our study the test was conducted according to the standardized procedure (Raven, 1990), which specifies that participants be given a sufficient amount of time to finish the test. It seems plausible to explain the observation of a training-induced improvement of Gf in a speeded version of the RAPM by the hypothesis that the current WM training specifically optimizes the efficiency of attentional processes within WM, as suggested by our AB results. Therefore, when the test is administered in line with the standardized procedure described in the test manual (as was the case in the present study), potentially improved attentional processes may not decisively contribute to the performance level in the Gf test. As a consequence, the improvement in attentional processing is not reflected in the Gf results obtained with the current type of RAPM administration. It has already been suggested elsewhere that the link between Gf and WM is a common attentional control mechanism (Gray et al., 2003; Kane et al., 2004; Halford et al., 2007), and in fact, Jaeggi and colleagues (2008) also included such views in their explanation for transfer from the dual n-back to measures of Gf.
Other studies using a different WM training paradigm but that have administered the RAPM similarly to the present study (i.e., without time restrictions), have likewise not shown reliable transfer effects to Gf (Dahlin et al., 2008a; Chein and Morrison, 2010; Richmond et al., 2011), and thus our results support findings from these studies. In the present study, some participants were not available for the post-test on the RAPM. Thus, the sample size in this test was fairly small, and the lack of power might have contributed to the non-significant transfer effect. However, we applied a power analysis using G*Power (Faul et al., 2009), given α, power, and the effect size of our experiment to have an idea about whether a lack of power may explain lacking effects from pre- to post-test in the RAPM (see Faul et al., 2007, for critical issues with retrospective power analyses). Consistent with this idea, the present power analysis demonstrated that even the original sample size of 38 participants would not have been sufficient to lead to a significant training advantage from pre- to post-test.

Summarizing our results, we found transfer to a VS WM updating task, to a task switching situation as measured by mixing costs as well as to the AB task. The diversity of these transfer effects corresponds to the findings of Chein and Morrison (2010), who found transfer effects from a complex WM span task to a variety of other tasks, for example the Stroop-task and reading comprehension, and who proposed training of a domain-general mechanism as a prerequisite for transfer effects. The observations in the present study are also consistent with the assumption that cognitive enhancements from our training paradigm may have affected not only a specific but also a more domain-general mechanism involved in various executive processes. A strong candidate for such a more general mechanism would be, according to Chein and Morrison, the mechanism of attentional control. Attentional control processes are strongly present in all of the processes to which we observed transfer: in WM updating as detaching attention from irrelevant items and attending to new relevant items (similarly to our training task), in task switching mixing costs as the requirement to control attention between the two task sets (Braver et al., 2003), and finally in AB as the requirement to control the temporal dividing of attentional resources. Notably, regarding WM updating, we found transfer only to the VS task. This is worthy of mentioning in reference to theories, which propose that executive attentional mechanisms are more closely related to VS WM than auditory WM processes (Baddeley, 1996b; Miyake et al., 2001). Alternatively, it is possible, that our transfer effects were the consequence of improvements in the separate processes that were recruited by the training task and tapped by our transfer tasks. 
However, this approach would be problematic in explaining the lack of transfer to certain tasks and/or modalities, especially when one regards how small the differences between these distinct processes seem.

Finally, there are certain limitations in the present study that should be acknowledged and discussed. In controlled cognitive training studies, one general practice has been to compare the performance of the training group to that of a control group, which does not attend any intervention (e.g., Olesen et al., 2004; Dahlin et al., 2008b; Jaeggi et al., 2008). In this way it has been possible to eliminate re-test effects; however, it is still questionable to what degree performance changes of the training group can be attributed to the training task and not just to the existence of an intervention per se (Shipstead et al., 2010). In the current study we did not include an active control group, which might raise the question of how much of the performance improvement of the training group in the transfer tasks was due to our training paradigm and how much can be attributed to rather unspecific effects such as the Hawthorne effect (an improvement in a participant's performance caused by the sole awareness of being studied), effects of motivation, or simply the engagement in a challenging and adaptive training task. Generally, we believe that, had the performance improvements been affected by these factors, we would have observed improvements across all tasks and situations. This was not the case in the present study. In fact, we demonstrated specific transfer effects (e.g., effects on repetition but not on single-task trials in task switching). Of course, one could argue that the transfer tasks were of different difficulty and, therefore, unspecific training effects could occur only in a subset consisting of the easiest tasks. However, this argument does not seem to be valid: in the updating task, for example, according to the number of correctly reported sequences across both sessions and groups, the VS task was more difficult than the AV task, whereas the dual-modality task seemed to be the most difficult one.
These observations are also supported by the comments of participants, who reported the VS task to have been more difficult than the AV task and the dual-modality task to have been the most difficult one. Therefore, if the transfer effect was driven by the easiness of the task, we should have observed improvements in the AV task rather than in the VS task. Similarly in the task switching paradigm, we observed transfer to the mixing costs, and this effect was driven by a group-specific improvement in the repetition trials compared with single-task trials, in which there was no training-related improvement. Considering that the RTs in the repetition trials were generally slower than the RTs in the single-task trials, it seems plausible that the repetition trials were more complex than the single-task trials. On the other hand, we found no transfer to switching costs, although the performance in the repetition and switch trials differed from each other significantly so that the RTs in switch trials were slower than the RTs in repetition trials. If the simplicity of the task underlay the transfer effect, our transfer effects in the task switching paradigm would seem counterintuitive. Based on this rather unsystematic pattern of transfer effects (from the perspective of task difficulty), we believe that the easiness or the simplicity of a transfer task does not determine transfer. Further, a study by Thorell and colleagues (2009) has shown that motivational factors as well as pure engagement in an intervention play a rather minor role in cognitive training, as in their study there were no differences in the performances of an active and a passive control group.

Apart from the methodological concerns about a no-contact control group, we would also emphasize that the inclusion of an active control group may not have been critical to the research question of our study. Our aim was to investigate transfer effects related to the dual n-back task without thoroughly specifying the components of the training that may underlie transfer.

Another issue raised by Shipstead and colleagues (2010) is the inclusion of only a single task for each function. We recognize the problem with this approach, as it cannot be unambiguously concluded that there are improvements in a certain function, but rather in an aspect of a function as measured by a single task. With respect to the present study, we emphasize first that, on a general level, we investigated transfer effects from WM training to executive functions, and we used not one but four different executive tasks for this purpose (WM updating, dual-task, task switching, AB). Second, although at first glance it would seem that we implemented only one task for each executive function, we would like to highlight that our transfer tasks also involved overlapping processes. For example, WM updating is an essential process in our updating task as well as in task switching. Attentional control was required in the updating task, task switching, and the AB task. Multitasking was relevant in the dual-task and in the dual-modality part of the updating task. Our results are also in accordance with these overlaps, in that we, for instance, found no transfer to either the dual-task or the dual-modality updating task.

The overlap of processes between our transfer tasks aside, it should be kept in mind that in studies as comprehensive as the present one, one important criterion is not to exhaust the participants by bombarding them with an immense battery of tests. This assumption is consistent with (1) the findings of Schmeichel (2007), who demonstrated exhaustion effects between executive tasks, and (2) the reduced transfer effects in a more exhausting post-test session including 12 transfer tasks (Boot et al., 2008), compared with a less exhausting test session including three transfer tasks (Green and Bavelier, 2003; see also Strobach et al., 2012a). We aimed to tap several executive functions, and we encourage future studies to broaden the range of measurements in order to clarify the specific effects of WM training.

In sum, in the present study we have provided evidence that complex WM training can produce transfer effects to executive functions. Given the relatively new field of training research and the contradictions in transfer findings, it is of great importance that future studies consistently aim at replicating the transfer effects found thus far in this and in previous training studies, with alterations in training and transfer tasks, as well as at investigating the crucial components and characteristics of successful training paradigms.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.