Summary

Patients in the completely locked-in state have no means of communication and have been the target population for brain–computer interface research over the last 15 years. Although different paradigms have been tested and different physiological signals used, to date no sufficiently documented patient in the completely locked-in state has been able to control a brain–computer interface over an extended time period. We introduce Pavlovian semantic conditioning to enable basic communication in the completely locked-in state. This novel paradigm is based on semantic conditioning for online classification of neuroelectric or any other physiological signals to discriminate between covert (cognitive) ‘yes’ and ‘no’ responses. The paradigm comprised the presentation of affirmative and negative statements used as conditioned stimuli, while the unconditioned stimulus consisted of electrical stimulation of the skin paired with affirmative statements. Three patients with advanced amyotrophic lateral sclerosis participated over an extended time period, one of whom was in a completely locked-in state and the other two in the locked-in state. The patients’ level of vigilance was assessed through auditory oddball procedures to study the correlation between vigilance level and the classifier’s performance. The average online classification accuracies of slow cortical components of electroencephalographic signals were around chance level for all the patients. The use of a non-linear classifier in the offline classification procedure resulted in a substantial improvement of the accuracy in one locked-in state patient, who achieved 70% correct classification. A reliable level of performance in the completely locked-in state patient was not achieved uniformly throughout the 37 sessions despite intact cognitive processing capacity, but in some sessions communication accuracies up to 70% were achieved. Paradigm modifications are proposed. A rapid drop in vigilance was detected, suggesting that attentional variations or variations of the circadian period are important factors in brain–computer interface communication with patients in the locked-in and completely locked-in states.

amyotrophic lateral sclerosis

brain–computer interface

semantic conditioning

locked-in state

Introduction

People with complete paralysis due to neurological disease (e.g. amyotrophic lateral sclerosis, stroke, brain damage, etc.) lose all communication channels with their social environment. In particular, patients with amyotrophic lateral sclerosis in the course of their disease first enter the locked-in state, in which residual voluntary control of some muscles (e.g. eye muscles, lips, fingers, external sphincter) is still possible (Bauer et al., 1979), then if they had accepted artificial ventilation and nutrition they might finally lose all motor control entering the completely locked-in state (Birbaumer et al., 2008). Therefore, those completely paralysed patients represent the primary target population for assistive communication technology such as a brain–computer interface (Birbaumer, 2006a). A brain–computer interface represents a communication aid that relies on brain signals rather than on spinal and peripheral motor systems. Different types of brain signals can be used to control a brain–computer interface, all generated in the brain, measurable and available for online processing and classification whose digital output can be used not only for communication purposes but also to interact with the environment or control an external device (e.g. move a wheelchair, surf the web, etc.) (Galán et al., 2008; Mugler et al., 2008).

Previous attempts to restore communication with brain–computer interfaces were successful in locked-in patients (Birbaumer et al., 1999; Kübler et al., 2005; Nijboer et al., 2008) but failed in patients with complete locked-in syndrome without any remaining motor output channel, primarily without eye movements (Kübler and Birbaumer, 2008). It was hypothesized (Birbaumer and Cohen, 2007) that extended periods of complete paralysis lead to extinction of goal-directed thinking and intentions, comparable to a lack of voluntary (operant) learning in chronically curarized animals (Dworkin and Miller, 1986); in that study, learning and execution of operant-voluntary response control over physiological functions turned out to be impossible under complete paralysis. In addition, the progression of amyotrophic lateral sclerosis and the loss of communication and interaction with the external environment may severely impair the controlled attention necessary to drive a brain–computer interface and to select meaningful information from a computer menu with a particular brain potential or any other non-motor physiological signal (Birbaumer, 2006b). Recently, Halder et al. (2011) studied the cortical correlate of brain–computer interface control and found that during motor observation in a functional MRI scanner the activation of the right dorsolateral prefrontal cortex, known for its attention and workload monitoring function, correlated strongly with the users’ brain–computer interface control performance. Additionally, in another study, Halder et al. (2013) found that the results of a standard auditory oddball session can be employed to predict aptitude in an auditory and a visual P300 brain–computer interface. These results highlighted the importance of brain state monitoring during brain–computer interface sessions for a better analysis of the users’ performance.

The visual modality cannot be used for interactive purposes because paralysis of eye movements in patients with amyotrophic lateral sclerosis is frequently accompanied by loss of clear vision due to drying of the cornea and eyeball (Birbaumer, 2006a; Murguialday et al., 2011). Additionally, owing to extended bed-rest and skin deformations, the tactile input channel is also sometimes impaired, as documented previously (Murguialday et al., 2011). Therefore, the paradigm described here employed only auditory and somatosensory nociceptive channels.

In this study, Pavlovian semantic conditioning was proposed as an alternative to conventional brain–computer interfaces. These are either based on operant (voluntary) learning paradigms or they require intact selective attention, as in the frequently investigated P300-brain–computer interface using an event-related brain potential for letter selection (Farwell and Donchin, 1988). The paradigm used here was based on semantic classical conditioning (Razran, 1939, 1949; Lacey and Smith, 1954; Razran, 1961). These studies demonstrated that it is possible to learn a reaction to the meaning of a word or a sentence irrespective of the constituents of the word or the sentence. Indeed, Razran reported generalization of the conditioned semantic response to sentences with similar contextual content (Razran, 1949).

After extensive testing in healthy samples (Furdea et al., 2012; De Massari et al., 2013; Ruf et al., 2013), this ‘reflexive’ classical conditioning paradigm was applied in this study online with patients with amyotrophic lateral sclerosis (two of them in a locked-in state, one in a completely locked-in state) to allow basic communication (i.e. responding to yes–no statements). During the experiment patients were acoustically presented with true and false statements. Only true statements were followed by electrical stimulation to the hand. The electroencephalographic (EEG) signal was recorded and used to differentiate between the brain responses to true and false statements by means of an online classifier. Further to the conditioning paradigm, the patients underwent auditory oddball experiments before and after conditioning to assess the patients’ vigilance changes. It was hypothesized that the use of a paradigm based on classical conditioning involving auditory and electrical stimulation would establish a new communication channel in patients in a locked-in or completely locked-in state.

Materials and methods

Patients

Patient 1 (female, 66 years old, completely locked-in state) was diagnosed with spinal and sporadic amyotrophic lateral sclerosis 4 years before the experiment. One year before the first session, she entered the completely locked-in state and no communication with eye movements or any other external devices was possible (score 0/48 in the Amyotrophic Lateral Sclerosis Functional Rating Scale–Revised; Cedarbaum et al., 1999; Supplementary Video 1). She had been artificially ventilated and fed for 2 years. Her cognitive functions were assessed by means of an extensive neurophysiological examination based on event-related brain potentials as described in Neumann and Kotchoubey (2004). Specifically, four recording sessions were conducted, each comprising five different cognitive paradigms selected from Neumann and Kotchoubey (2004), namely mismatch negativity, oddball without instructions, oddball with instructions, word pairs and semantically incongruent sentences. The results, along with additional information concerning these paradigms, are presented in the Supplementary material and suggested partially intact cognitive functioning.

At the age of 72, Patient 2 (male, locked-in state) received a diagnosis of bulbar amyotrophic lateral sclerosis. He was artificially fed through a percutaneous endoscopic gastrostomy and 1 year before the first measurement he started to be artificially ventilated (score 0/48, in the Amyotrophic Lateral Sclerosis Functional Rating Scale–Revised). During the period of measurement, he still had some residual gaze control and he could slightly twitch one hand through which he could communicate ‘yes’ and ‘no’ signals.

Patient 3 (male, 40 years old, locked-in state) was diagnosed with spinal amyotrophic lateral sclerosis 5 years before the beginning of this study and 4 years later he started to be artificially ventilated. He was also artificially fed (score 1/48, in the Amyotrophic Lateral Sclerosis Functional Rating Scale–Revised) and showed only minimal voluntary control over gaze which allowed him to answer yes–no questions by rolling the eyes up or down for a limited period of time.

Experimental procedure

The experiments reported in this study were approved by the Internal Review Board of the Medical Faculty of the University of Tübingen. Informed consent was given either by the patient or, in the case of Patient 1, by the patient’s legal representative, her husband.

The semantic conditioning paradigm was based on the pairing between true statements (conditioned stimuli, CS1) and an electrical stimulus (unconditioned stimulus) delivered over the left thumb immediately after the end of the statement. These trials were regarded as CS1+ to differentiate them from the trials composed by a true statement not followed by any unconditioned stimuli and called CS1−. Conversely, the false statements (CS2) were not conditioned and never followed by any unconditioned stimulus.

The study consisted of 2 weeks of measurements with four sessions per week. The first session comprised five blocks: the first four blocks consisted of 50 trials each (25 CS1+ and 25 CS2 at random) and the last block of 60 trials (15 CS1+, 15 CS1− and 30 CS2 at random). The remaining seven sessions all had the same structure: a first block of 50 trials (25 CS1+ and 25 CS2 at random) followed by four blocks of 60 trials (15 CS1+, 15 CS1− and 30 CS2 at random). Figure 1 gives a graphical representation of the experimental design and examples of statements. The intertrial interval had a duration of 5 s. Some of the trials consisted of personalized statements, which referred to the biography of the patient (e.g. ‘The name of my son is Ander/Markus’) and may elicit stronger responses (Perrin et al., 2006; Di et al., 2007); those trials were used both as the training set for the classifier and as the input for the online classification and feedback generation (for a detailed description refer to the ‘Online classification’ section). Additionally, in the case of Patient 1, in the fifth block of all sessions the feedback was computed on the cortical reactions to open statements whose truthfulness was known only by the patient (e.g. ‘The position of the legs is comfortable’). For all patients the feedback consisted of a sentence played acoustically according to the outcome of the classification in discriminating between covert ‘yes’ (e.g. ‘You thought yes’) and ‘no’ (e.g. ‘You thought no’) responses.

Figure 1. Experimental design. The number and types of trials per block are displayed, as well as the number of blocks per session: CS1+ consists of a trial containing an affirmative statement followed by the electrical stimulus; CS1− consists of a trial containing an affirmative statement not paired with electrical stimulation; CS2 consists of a trial containing a negative statement without electrical stimulation. All the sessions had the same structure except session one during which no feedback was provided. Additionally, it is shown from which blocks the trials are selected to form the training data set provided to the classifier.
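The block layout described above can be sketched as a short script. This is only an illustration of the trial counts stated in the text; the function names, the fixed random seed and the ASCII label `CS1-` (standing in for CS1−) are assumptions made for the example and are not part of the original stimulus-presentation software.

```python
import random

# Trial counts per block type as (CS1+, CS1-, CS2), taken from the text.
BLOCK_50 = (25, 0, 25)   # 50 trials: paired true statements and false statements
BLOCK_60 = (15, 15, 30)  # 60 trials: additionally contains unpaired true statements


def make_block(counts, rng):
    """Return a randomly ordered list of trial labels for one block."""
    n_plus, n_minus, n_cs2 = counts
    trials = ["CS1+"] * n_plus + ["CS1-"] * n_minus + ["CS2"] * n_cs2
    rng.shuffle(trials)
    return trials


def make_session(session_number, rng):
    """Return the five blocks of one session (session_number is 1-based)."""
    if session_number == 1:
        layout = [BLOCK_50] * 4 + [BLOCK_60]   # first session
    else:
        layout = [BLOCK_50] + [BLOCK_60] * 4   # sessions two to eight
    return [make_block(counts, rng) for counts in layout]


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
first_session = make_session(1, rng)
later_session = make_session(2, rng)
print([len(block) for block in first_session])
print([len(block) for block in later_session])
```

Under these counts the first session contains 260 trials and each later session 290 trials.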

Each daily measurement included two auditory oddball runs, one performed before (oddball-pre) and the other after (oddball-post) the conditioning paradigm. Each oddball run consisted of 360 stimuli: 60 high-pitch tones served as target stimuli and 300 low-pitch tones as frequent stimuli, with an interstimulus interval of 900 ms. Both the frequent and the target stimuli had a duration of 100 ms (Neumann and Kotchoubey, 2004). These measurements were introduced to assess changes in the patient’s level of vigilance within a daily session and its relationship with semantic conditioning performance.

The EEG was continuously recorded with a multi-channel EEG amplifier (Brain Amp DC, Brain Products) from 32 Ag/AgCl electrodes mounted on a 64-channel cap (Easycap). The placement of the electrodes was based on the international 10-10 system and the selected channels were F3, Fz, F4, T7, C5, C3, C1, Cz, C2, C4, C6, T8, Cp5, Cp3, Cp1, Cpz, Cp2, Cp4, Cp6, P5, P3, P1, Pz, P2, P4, P6, Po7, Po3, Poz, Po4, Po8 and Oz. Each channel was referenced to an electrode placed on the tip of the nose and grounded to the left mastoid (for one patient the ground electrode was located on the right mastoid because this configuration was more compatible with his bed rest position). Electrode impedances were kept below 5 kΩ and the EEG signal was sampled at 200 Hz. The auditory stimuli, both the statements employed as conditioned stimuli and the feedback, were provided through pneumatic earphones inserted in the patient’s ears. All the experiments were conducted at the patients’ homes. Patient 3 was measured at his bedside, whereas the other patients were sitting in a wheelchair during the measurements.

The electrical stimulation was generated by a bipolar direct current stimulator (DS5, Digitimer Ltd). The stimulus was delivered through Velcro® fastening electrodes placed on the distal and proximal phalanges of the left thumb and the unconditioned stimulus had a duration of 1 ms. The stimulus intensity was adjusted at the beginning of each session in order to account for the daily fluctuation of the somatosensory and perception thresholds. For the two patients in the locked-in state the intensity of the stimulus was subjectively set at the beginning of each session using a procedure aimed at monitoring the somatosensory and pain thresholds of the participant (De Massari et al., 2013). The patient was asked to rate individually several stimuli with increasing magnitude by using a visual analogue scale ranging from 0 (no sensation) to 10 (pain tolerance threshold, ‘unbearable pain’). The intensity rated as 8 on the visual analogue scale was selected to be employed throughout the session. For the patient in the completely locked-in state another approach was adopted because her subjective rating could not be assessed. In this case, multiple trains of stimuli were delivered, each comprising 25 stimuli with the same intensity. After each train the intensity was increased and a two-sided signed rank test was performed on the 25 previous trials to assess whether the somatosensory evoked potential peak amplitude in the EEG was significantly different from zero. The lowest stimulus intensity for which the statistical test found a significant difference was employed throughout the experiment.
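The intensity selection for the patient in the completely locked-in state can be illustrated with a minimal sketch. Several points are assumptions made only for the example: the significance level of 0.05 (the text does not state it), the normal approximation used for the signed rank test, the function names, and the simulated amplitudes, which are not patient data.

```python
import math


def signed_rank_p(values):
    """Two-sided Wilcoxon signed rank p-value (normal approximation,
    average ranks for ties, no tie or continuity correction)."""
    nonzero = [v for v in values if v != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank of the tie group
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, nonzero) if v > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def lowest_significant_intensity(intensities, record_train, alpha=0.05):
    """Step through increasing intensities and return the first one whose
    train of peak amplitudes differs significantly from zero."""
    for intensity in intensities:
        amplitudes = record_train(intensity)  # one value per stimulus in the train
        if signed_rank_p(amplitudes) < alpha:
            return intensity
    return None


# Simulated recordings: 25 noise values plus a hypothetical evoked response
# that appears only above an arbitrary intensity of 2.0.
NOISE = [(-1) ** k * (k % 5 + 1) * 0.2 for k in range(25)]


def simulated_train(intensity):
    evoked = 0.5 * max(0.0, intensity - 2.0)
    return [evoked + noise for noise in NOISE]


chosen = lowest_significant_intensity([1.0, 2.0, 3.0, 4.0, 5.0], simulated_train)
print(chosen)
```

In practice a statistics library would normally provide the signed rank test; it is written out here only to keep the sketch self-contained.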

Unlike Patients 2 and 3, Patient 1 underwent an extensive event-related brain potential examination and an additional 29 conditioning sessions were recorded (37 sessions on 37 days in total). Starting from the third week of measurement, a sixth block was added at the end of each session containing the semantically reversed versions of the open statements presented in the fifth block (e.g. ‘Last night I slept well’ was changed to ‘Last night I slept badly’). Therefore, it was possible to assess the accuracy of the classifier comparing the classification of the EEG response to the two versions of each open statement presented in blocks 5 and 6. Additionally, from session 12 to session 19 the electrical pulse was substituted with a train of pulses (duration of 1 s, frequency of 20 Hz, width of single pulses of 200 µs) generated by a functional electrical stimulation device (UNAFET 8, UNA Systems) and paired with an auditory stimulus (75 dB) resembling the sound of metal scraping to increase the salience of the unconditioned stimulus. The functional electrical stimulation was delivered over the left upper arm through two patch electrodes positioned over extensor communis digitorum muscle, which is responsible for the extension of the fingers. This combination of stimuli was introduced in order to involve both the auditory sensory system and the proprioceptive afferent channels, as suggested by Murguialday et al. (2011). Both the electrical pulse and the functional electrical stimulation were extensively tested with healthy participants in our laboratories before the measurements; moreover, the husband of Patient 1 (completely locked-in state) underwent a pilot experiment after which he gave his consent for the use of these stimuli on his wife.
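The comparison between blocks 5 and 6 amounts to counting the statement pairs whose two versions received opposite labels. A minimal sketch follows; the function name and the example labels are hypothetical and do not reproduce any recorded session.

```python
def count_semantic_switches(labels_block5, labels_block6):
    """Count open statements whose original and semantically reversed
    versions were classified with opposite labels."""
    if len(labels_block5) != len(labels_block6):
        raise ValueError("both blocks must contain the same statements")
    return sum(1 for a, b in zip(labels_block5, labels_block6) if a != b)


# Hypothetical classifier outputs for five open statements (block 5)
# and their reversed versions (block 6), in the same order.
block5 = ["yes", "no", "yes", "yes", "no"]
block6 = ["no", "yes", "yes", "no", "no"]
switches = count_semantic_switches(block5, block6)
print(switches, "of", len(block5), "pairs were labelled oppositely")
```

A pair counted in this way shows that the classifier responded consistently to the reversal of meaning; it does not by itself reveal which of the two versions the patient held to be true.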

Online classification

The stimulus presentation and online classification were performed using BCI2000 (Schalk et al., 2004) and Matlab R2009b (www.mathworks.com). The classification was performed using a linear discriminant analysis classifier with the input features being extracted from the EEG segments immediately following the end of the statements used as conditioning stimuli. The segments of interest had a duration of 4 s. Because the conditioned response was not expected to resemble the somatosensory evoked potential elicited by the electrical stimulation, the classifier could not be provided with those trials contaminated by the cortical responses triggered by the unconditioned stimulus. This procedure was also suggested by the visual inspection and analysis of data acquired in a previous pilot study. Therefore, only the trials without electrical stimulation could be included in the classification procedure. The features were represented by wavelet coefficients computed using a fast (discrete) wavelet transform focusing on the spectral components below 3.125 Hz because a slow wave was recorded following the end of the statements. From the second session the feedback was provided online in the fourth and fifth block (Fig. 1). During those blocks the 15 CS1+ and 15 CS1− trials recorded were five affirmative and five negative statements that were repeated three times, and the three EEG responses were used to generate the features provided to the classifier for the online feedback. Thus, the patient was provided with feedback of the classification outcome 10 times per block by means of a sentence stating the classified ‘answer’ (e.g. ‘You thought yes’ or ‘You thought no’). In block 4 the classifier was trained with 30 CS1− and 30 CS2 trials acquired during blocks 2 and 3, whereas in block 5 the training data set was expanded to include also 15 CS1− and 15 CS2 trials recorded during the fourth block.
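The 3.125 Hz limit follows from the sampling rate: at 200 Hz the Nyquist frequency is 100 Hz, and each level of a discrete wavelet transform halves the band retained by the approximation coefficients, so five levels leave 100/2^5 = 3.125 Hz. The sketch below illustrates this arithmetic and the resulting number of coefficients for one 4 s segment. The Haar wavelet is an assumption made only for the example (the text does not name the wavelet family), and the function names are hypothetical.

```python
import math


def approximation_band_hz(sampling_rate_hz, level):
    """Upper edge of the band kept by the level-`level` approximation."""
    return (sampling_rate_hz / 2) / (2 ** level)


def haar_approximation(signal, level):
    """Approximation coefficients of a Haar wavelet decomposition.
    Each level halves the number of samples."""
    coeffs = list(signal)
    for _ in range(level):
        if len(coeffs) % 2:
            coeffs = coeffs[:-1]  # drop a trailing sample if the length is odd
        coeffs = [(coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
                  for i in range(0, len(coeffs), 2)]
    return coeffs


SAMPLING_RATE_HZ = 200   # stated in the recording description
SEGMENT_SECONDS = 4      # duration of the segment of interest
n_samples = SAMPLING_RATE_HZ * SEGMENT_SECONDS

# Synthetic 1 Hz oscillation standing in for one EEG channel.
segment = [math.sin(2 * math.pi * 1.0 * n / SAMPLING_RATE_HZ)
           for n in range(n_samples)]
features = haar_approximation(segment, level=5)

print(approximation_band_hz(SAMPLING_RATE_HZ, 5))  # 3.125
print(len(features))                               # 25
```

Under these assumptions each channel contributes 25 low-frequency coefficients per trial.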

Offline analysis

The offline analysis was performed to assess whether the use of an alternative classifier and the inclusion of other features could improve the classification accuracy. After visual inspection, only the data sets containing eye related artefacts in Patients 2 and 3 underwent the ocular correction through independent component analysis using the toolbox available in the BrainVision Analyzer 2 software (Brain Products). The patient in the completely locked-in state showed no eye movements, therefore in this case, the ocular correction was not performed.

Two different types of features were analysed for classification: wavelet coefficients and EEG signal amplitudes. In the offline processing, the same spectral method and frequency band as in the online classification were considered for generating the wavelet coefficients. To obtain the signal amplitude features the signal was low-pass filtered below 5 Hz (linear phase filter), moving average-filtered and decimated using a factor of 5, therefore both input feature types focused on low frequency components. These features were provided as input to a radial basis function kernel support vector machine classifier that was already shown to outperform linear classifiers (Furdea et al., 2012). The offline performances were computed using a 10-fold cross-validation approach including all the trials used during online classification both for training and testing the linear classifier. Furthermore, the use of support vector machines implies the selection of some parameters that can influence the output accuracy (Burges, 1998), thus a grid search based on a 10-fold cross-validation scheme was performed to obtain the best kernel parameters for each training data set.
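The effect of the moving average and the decimation on the size of the amplitude feature vector can be sketched as follows. The sketch covers only these two steps: the linear-phase 5 Hz low-pass filter is omitted because its design is not given in the text, and the window length of five samples is an assumption made for the example.

```python
def moving_average(signal, window):
    """Causal moving average; the first samples use a shorter window."""
    smoothed = []
    running = 0.0
    for i, sample in enumerate(signal):
        running += sample
        if i >= window:
            running -= signal[i - window]  # remove the sample leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def decimate(signal, factor):
    """Keep every `factor`-th sample."""
    return signal[::factor]


SAMPLING_RATE_HZ = 200
segment = [float(n % 7) for n in range(4 * SAMPLING_RATE_HZ)]  # placeholder data

amplitude_features = decimate(moving_average(segment, window=5), factor=5)
print(len(amplitude_features))   # 160 values per channel
print(SAMPLING_RATE_HZ / 5)      # effective rate after decimation: 40.0 Hz
```

With a 4 s segment sampled at 200 Hz, decimation by a factor of 5 leaves 160 amplitude values per channel at an effective rate of 40 Hz.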

The oddball data were analysed only offline using a step-wise linear discriminant analysis classifier, which has been widely applied in P300 brain–computer interfaces (Krusienski et al., 2006; Furdea et al., 2009). This analysis was performed to detect possible changes within each daily measurement in the patient’s attention level monitored by the oddball paradigm. Each auditory oddball run comprised 60 sequences of stimuli and a sequence consisted of six stimuli (one target and five frequent tones). To increase the accuracy of the classifier in discriminating between the EEG responses to the target and the frequent tones, several sequences have to be provided to the classifier rather than performing a single-trial classification. The higher the number of sequences, the more accurate the classification, but also the lower the output bit rate if the classification is used online in an event-related brain potential-based brain–computer interface speller. The number of sequences needed to correctly classify the event-related brain potential is subject-dependent; however, to provide consistency among the results of the different patients, a value of 15 sequences per classification step was selected. Consequently, a 4-fold cross-validation classification was performed for each oddball run using baseline-corrected EEG epochs of 1 s. The correlation between the auditory oddball classification results and the offline semantic conditioning performances was obtained for each patient. Specifically, Pearson’s r was computed between the change in the classification accuracy of the auditory oddball paradigm from the oddball-pre to the oddball-post run (i.e. the accuracy obtained in the oddball-post was subtracted from the accuracy obtained in the oddball-pre) and the offline semantic conditioning classification accuracy obtained in the same day’s measurement, considering both input features for each patient. It was hypothesized that those two values would be negatively correlated if the patient was able to perceive the probability difference of target and non-target stimuli; that is, a decrease in the accuracy from oddball-pre to oddball-post, e.g. due to fatigue, would predict a low classification accuracy of the semantic conditioning data.
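The correlation measure can be written out as a short sketch. The accuracy values below are invented solely to show the sign convention (pre minus post, so that a positive change means a drop in oddball accuracy); they are not results from the patients, and the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented per-session accuracies in percent, for illustration only.
oddball_pre = [40.0, 35.0, 30.0, 45.0]
oddball_post = [20.0, 30.0, 30.0, 25.0]
conditioning_accuracy = [45.0, 58.0, 62.0, 48.0]

# Positive values indicate a decrease from the pre-run to the post-run.
oddball_change = [pre - post for pre, post in zip(oddball_pre, oddball_post)]

r = pearson_r(oddball_change, conditioning_accuracy)
print(round(r, 3))
```

In this invented example the sessions with the largest drop in oddball accuracy also have the lowest conditioning accuracy, which yields the negative r expected under the hypothesis.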

In order to analyse the stability of the entire EEG frequency spectrum and its relationship with spelling performance achieved in the semantic conditioning paradigm, the sessions were divided into two samples according to the online classification accuracy. A first sample, regarded as the ‘chance-level’ sample, comprised all the sessions with an online classification accuracy below or equal to chance level, which was five correct answers out of 10 statements for Patient 1 and 10 correct answers out of 20 statements for Patients 2 and 3; a second sample, regarded as the ‘non-chance level’ sample, included all the sessions with an online classification accuracy above chance level. This procedure was performed for each patient. The spectral estimation was performed for all CS1− and CS2 trials using an autoregressive spectral estimator, namely the maximum entropy method (McFarland et al., 2000). Five different frequency bands were investigated (0–3.5 Hz, 3.5–6 Hz, 6–11 Hz, 11–30 Hz and 30–40 Hz); therefore, all the standard EEG oscillatory bands were included up to the low-gamma band. Frequency power >50 Hz, at which line noise is predominant (in Europe), is difficult to separate from noise or muscle activity in scalp EEG (Lutzenberger et al., 1997). At each channel the signal power in the selected frequency bands was computed for all trials used in the offline analysis comprising the cortical responses to positive and negative statements. Additionally, the relative change in signal power was computed by dividing the signal power calculated as described above by the signal power of the baseline, which was recorded at the beginning of each session. Therefore, it was possible to compare the relative power change in each frequency band between ‘chance-level’ sessions and ‘non-chance level’ sessions for trials comprising cortical responses to positive and negative statements. A statistical test was performed to assess any significant differences between samples; because normality was violated (tested with the Kolmogorov–Smirnov test), a non-parametric statistical test was applied, namely the Wilcoxon rank-sum test.
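The grouping and comparison steps can be sketched as follows. The normal approximation for the rank-sum test, the function names and the example values are assumptions made for illustration; the numbers are invented and the approximation is coarse for samples as small as those used here.

```python
import math


def relative_power(trial_power, baseline_power):
    """Signal power expressed relative to the session baseline."""
    return trial_power / baseline_power


def split_by_chance(records, chance_correct):
    """Split (correct_outputs, value) records into the 'chance-level'
    sample (at or below chance) and the 'non-chance level' sample."""
    chance_level = [v for correct, v in records if correct <= chance_correct]
    non_chance = [v for correct, v in records if correct > chance_correct]
    return chance_level, non_chance


def rank_sum_p(sample_a, sample_b):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation,
    average ranks for ties, no tie correction)."""
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b],
                    key=lambda item: item[0])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    w = sum(rank for rank, (_, group) in zip(ranks, pooled) if group == 0)
    n_a, n_b = len(sample_a), len(sample_b)
    mean = n_a * (n_a + n_b + 1) / 2
    sd = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return math.erfc(abs((w - mean) / sd) / math.sqrt(2))


# Invented records: (correct online outputs out of 10, relative power in one band).
records = [(4, 0.95), (5, 1.02), (7, 0.71), (6, 0.78),
           (3, 1.10), (8, 0.65), (5, 0.99), (6, 0.80)]
chance_sample, non_chance_sample = split_by_chance(records, chance_correct=5)
p_value = rank_sum_p(chance_sample, non_chance_sample)
print(len(chance_sample), len(non_chance_sample), round(p_value, 4))
```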

Results

Figure 2 shows the results for the online and offline classification for Patient 1, and the classifiers’ accuracies for Patients 2 and 3 are reported in Fig. 3. The number of correctly classified outputs (online) per session for the completely locked-in state patient (Patient 1) and the two locked-in state patients (Patients 2 and 3) are displayed in the upper panels of Figs 2 and 3, respectively. The online feedback was provided during the fourth and fifth block starting from the second session. For the patient in the completely locked-in state, only the 10 classifier outputs of the fourth block of each session are considered, since in the fifth block open statements were played and thus, even though the classifier provided a classification for those trials and feedback was given, the accuracy could not be computed because only the patient knew whether the statement was true or false. Thus, the chance level (horizontal line) was 5 for the results displayed in the upper panel of Fig. 2, whereas it was 10 for the upper panel of Fig. 3. From the ninth session an additional block was added for Patient 1, in which the open statements used for classification in block 5 were changed in their last word in order to reverse the meaning of the statements. Therefore, the comparison of the online outputs of the classifier for each pair of reversed statements could provide the level of accuracy for the open statements. On average across all sessions, the online correct outputs for the linear discriminant analysis classifier were around chance level for all the patients. On the other hand, the offline classification showed an average accuracy of 57.81% and 61.79% for Patient 3 when the radial basis function kernel support vector machine was provided with wavelet coefficients and signal amplitudes as input features, respectively. In contrast, the average offline accuracies were 49.80% (wavelet coefficients) and 50.35% (signal amplitudes) for Patient 1 (completely locked-in state), and 47.95% (wavelet coefficients) and 49.05% (signal amplitudes) for Patient 2. The offline classification results are shown in Figs 2 and 3.

Figure 2. Classification results for Patient 1 (completely locked-in state). In the upper panel the online results are displayed: the number of correct classifier (linear discriminant analysis) outputs is shown for each session. The circles indicate the number of correct classifications for block 4, that is how often the classifier correctly classified known statements, whereas the downward-pointing triangles indicate the number of correctly classified semantic switches (from session 9), that is how often the classifier assigned opposite labels to the two semantically reversed versions of the same open statement in blocks 5 and 6. In the lower panel the offline results are presented: the circles and the downward-pointing triangles indicate the classification accuracy when wavelet coefficients (mean accuracy = 49.80%) and signal amplitudes (mean accuracy = 50.35%) were provided to the radial basis function-support vector machine classifier, respectively. The horizontal line indicates chance level.

Figure 3. Classification results for Patients 2 and 3. In the upper panel the online results are displayed: the number of correct classifier outputs is shown for each session. The squares indicate the number of correct classifications for blocks 4 and 5 for Patient 2, i.e. how often the classifier correctly classified known statements, whereas asterisks indicate the number of correct classifications for Patient 3. In the middle and lower panels the offline results are displayed for wavelet coefficients and signal amplitudes, respectively, provided to the radial basis function-support vector machine classifier. The horizontal line indicates chance level. The mean accuracies across sessions for Patient 2 were 47.95% (wavelet coefficients) and 49.05% (signal amplitudes). The mean accuracies for Patient 3 were 57.81% (wavelet coefficients) and 61.79% (signal amplitudes). Additionally, the time of hospitalization for Patient 3 is visualized in each graph.

Figure 4 presents the time courses of the averaged (195 trials acquired in the first four sessions) EEG signal recorded at channel Cp3 with respect to the end of the sentences for the three patients along with their standard error. Each panel shows the averaged EEG conditioned response to true (CS1−) and false (CS2) statements. For Patient 1, the two averaged signals show no difference in the time course, while a difference is present for the other two patients. Specifically, two distinguishable patterns in the two cortical responses are visible around 1500 ms after the end of the sentence, with Patient 3 showing a larger difference between CS1− and CS2 trials.

Figure 4. Time courses of the averaged (195 trials acquired in the first four sessions) EEG signal recorded at channel Cp3 with respect to the end of the sentences for the three patients. CS1− trials are represented by a grey line and CS2 trials by a black line, along with their standard error. The upper, middle and lower panels show the cortical responses for Patient 1 (completely locked-in state), Patient 2 and Patient 3, respectively.

In the oddball paradigm the mean accuracies in detecting the target stimulus for Patient 1 were 27.96% and 22.14% for the pre- and post-runs, respectively. Patient 3 had the highest mean accuracy in detecting the target stimulus both in the oddball-pre (38.02%) and oddball-post (27.6%) runs compared with the other patients, whereas Patient 2 had a mean accuracy in detecting the target stimulus of 29.16% in oddball-pre and 22.91% in oddball-post. A tendency towards a decrease in accuracy was revealed when comparing the classification performances obtained in the pre-runs with the performances computed in the post-runs. Indeed, in 19 (out of 37) sessions the accuracy from the oddball-pre to the oddball-post run decreased for Patient 1, in 5 (out of 8) for Patient 2, and in 4 (out of 8) for Patient 3.

Within a session, correlation analysis was performed to study the linear dependence between (i) the change in the auditory oddball classification accuracy (from oddball-pre to oddball-post); and (ii) the offline semantic conditioning classification accuracy. Figure 5 shows the results of the correlation analysis for each patient and for both types of input features that were considered in the offline analysis of the conditioning paradigm data. A significant negative correlation was obtained for Patient 2 in both scenarios: namely, r = −0.770 (P < 0.05) when wavelet coefficients were used as input features for the classifier and r = −0.796 (P < 0.05) when signal amplitudes were the input features. Additionally, for Patient 1, a significant negative correlation r = −0.396 (P < 0.05) was found when signal amplitudes were used as features.

Pearson’s correlation between offline semantic classification accuracies and difference in the auditory oddball accuracies between pre- and post-runs. Six scatter plots are shown. Each row refers to one patient; the left and right columns refer to the scenarios in which wavelet coefficients and signal amplitudes, respectively, were used as input features for the semantic conditioning data classification. Pearson’s correlation coefficients are reported in each panel (*P < 0.05). The y-axis shows the P300 difference accuracy, which is obtained by subtracting the classifier performance in the auditory oddball-post run from the classifier performance in the auditory oddball-pre run; therefore, positive and negative values are indirect measures of a decrease and an increase, respectively, of the patient’s attention level. The x-axis reports the offline classification accuracies of the semantic conditioning data.
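The correlation analysis described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the per-session accuracies are made-up values, and the helper names `pearson_r` and `t_statistic` are hypothetical. The difference is taken as pre minus post, matching the sign convention of the figure caption.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical per-session accuracies in percent (NOT data from the study).
oddball_pre = [40.0, 35.0, 30.0, 38.0, 33.0, 36.0, 31.0, 34.0]
oddball_post = [25.0, 30.0, 32.0, 24.0, 31.0, 27.0, 33.0, 30.0]
offline_acc = [42.0, 50.0, 55.0, 41.0, 53.0, 46.0, 57.0, 49.0]

# Pre minus post: a positive value means the oddball accuracy dropped
# over the session, read here as a decrease in attention level.
p300_diff = [pre - post for pre, post in zip(oddball_pre, oddball_post)]

r = pearson_r(offline_acc, p300_diff)
t = t_statistic(r, len(offline_acc))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(offline_acc) - 2}")
```

With these illustrative numbers, sessions with a larger drop in oddball accuracy have lower offline accuracy, so r is negative. The P value would be obtained from the Student t distribution with n − 2 degrees of freedom.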

Supplementary Figs 5, 6 and 7 report the mean signal power relative changes and standard deviations of one exemplary channel Cp3 (the same channel selected to depict the averaged time course of the different cortical responses in Fig. 4) in the selected frequency bands for Patients 1, 2 and 3 respectively. For Patient 1 (completely locked-in state), a significant difference (P < 0.01) was revealed between ‘chance-level’ and ‘non-chance level’ sessions in the 0–3.5 Hz, 6–11 Hz and 30–40 Hz bands both for cortical responses to positive (CS1−) and negative statements (CS2), and in the 3.5–6 Hz band only for CS1− trials. For Patient 2, a significant difference was found in 0–3.5 Hz and 30–40 Hz bands for both CS1− and CS2, in 3.5–6 Hz only for CS1− and in 6–11 Hz only for CS2. For Patient 3, a significant difference was revealed in 0–3.5 Hz, 6–11 Hz, 11–30 Hz and 30–40 Hz bands for both CS1− and CS2, and in 3.5–6 Hz only for CS1−. For Patients 1 (completely locked-in state) and 2, a more pronounced power decrease in the 0–3.5 Hz band was revealed in ‘non-chance level’ sessions compared to ‘chance level’ sessions both for CS1− and CS2 trials. Conversely, for Patient 3 in the 0–3.5 Hz band a more pronounced power decrease was revealed in ‘chance level’ sessions compared to ‘non-chance level’ sessions both for CS1− and CS2 trials. Moreover, for Patient 3, a more pronounced power increase was detected in the 6–11 Hz band in ‘chance level’ sessions compared to ‘non-chance level’ sessions both for CS1− and CS2 trials. Conversely, in Patient 1 (completely locked-in state) a more pronounced power increase was detected in the 6–11 Hz band in ‘non-chance level’ sessions compared with ‘chance level’ sessions both for CS1− and CS2 trials. 
These results support the conclusions drawn from the analysis of the auditory P300 pre- and post-session data that Patients 1 (completely locked-in state) and 2 performed at chance level when vigilance dropped, as indicated by the frequency changes. While comparing ‘chance level’ to ‘non-chance level’ sessions, Patients 1 (completely locked-in state) and 2 showed a similar power change pattern in opposition to the pattern revealed for Patient 3, whose data led to a better offline classification on average.
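To make the notion of a relative power change per frequency band concrete, the following sketch computes it for the five bands named above. It rests on several assumptions that are not taken from the study: a plain discrete Fourier transform as the power estimate, a 100 Hz sampling rate, half-open band edges, a separate baseline segment as the reference, and synthetic sine-wave signals. The function names are hypothetical.

```python
import cmath
import math

# Bands (Hz) named in the text; treating them as half-open [lo, hi) is an assumption.
BANDS = [(0.0, 3.5), (3.5, 6.0), (6.0, 11.0), (11.0, 30.0), (30.0, 40.0)]

def band_power(signal, fs, lo, hi):
    """Sum of squared DFT magnitudes over frequency bins in [lo, hi) Hz."""
    n = len(signal)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if lo <= freq < hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(signal))
            power += abs(coeff) ** 2
    return power

def relative_change(response, baseline, fs, band):
    """(P_response - P_baseline) / P_baseline for one band."""
    p_resp = band_power(response, fs, *band)
    p_base = band_power(baseline, fs, *band)
    return (p_resp - p_base) / p_base

def synth(amplitudes, fs=100, seconds=2):
    """Synthetic signal: a sum of sines given as {frequency_hz: amplitude}."""
    n = fs * seconds
    return [sum(a * math.sin(2 * math.pi * f * i / fs)
                for f, a in amplitudes.items())
            for i in range(n)]

# One sine per band. In the "response" the 2 Hz component is halved and
# the 8 Hz component is doubled; the other components are unchanged.
baseline = synth({2: 1.0, 5: 1.0, 8: 1.0, 20: 1.0, 35: 1.0})
response = synth({2: 0.5, 5: 1.0, 8: 2.0, 20: 1.0, 35: 1.0})

changes = {band: relative_change(response, baseline, 100, band) for band in BANDS}
for (lo, hi), change in changes.items():
    print(f"{lo:g}-{hi:g} Hz: {change:+.2f}")
```

Because power scales with the square of the amplitude, halving the 2 Hz component gives a relative change of about −0.75 in the 0–3.5 Hz band, and doubling the 8 Hz component gives about +3.0 in the 6–11 Hz band.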

Discussion

This experiment represents the first attempt to classify, in real time, EEG data recorded for communication purposes within a semantic conditioning paradigm in locked-in state and completely locked-in state over extended sessions. Three severely affected patients with amyotrophic lateral sclerosis were recruited, two of them with minimal motor control and one in completely locked-in state without any motor control. The communication speed provided by the paradigm described here is slow compared with other assistive communication devices and most EEG-based brain–computer interface spellers such as the P300-speller (Farwell and Donchin, 1988); nevertheless, in this clinical scenario the information transfer rate is not the main issue, as no brain–computer interface or other assistive technology has ever been reliably controlled by a completely locked-in state patient for communication. On the other hand, the lack of a communication channel is of great concern since there is no possibility to comprehend or infer the current emotional and cognitive state of a patient in a completely locked-in state. In this context, real-time decoding of macroscopic brain states becomes vitally important (Blankertz et al., 2010). Because surgical implantation of electrocorticographic electrodes failed to improve classification and communication in completely locked-in state (Wilhelm et al., 2006; Murguialday et al., 2011), implantation was not considered a viable alternative to improve signal quality.

The online classification of the conditioned response, which was generated by semantic conditioning in the EEG slow components <3.125 Hz, was around chance level in all patients if averaged across all sessions. The analysis and classification of different frequency bands are reported in the Supplementary material and do not lead to different conclusions. Figure 4 shows the averaged conditioned response for the first four sessions for each patient. A linear classifier, namely linear discriminant analysis, which has been widely used in the brain–computer interface field for its simplicity and computational efficiency (Lotte et al., 2007), was employed for real-time detection of the conditioned response in the brain signals. However, the linear discriminant analysis classifier performs poorly on complex non-linear EEG data; additionally, the absence of a regularization parameter makes this type of classifier unable to properly weight outliers (Lotte et al., 2007), which might explain the low accuracies with this type of analysis. In the offline classification, a non-linear classifier (radial basis function kernel support vector machine) was used. This analysis provided high accuracies for Patient 3, with an average accuracy of 61.79% across sessions (Fig. 3). Therefore, in the case of Patient 3, the non-linear classifier outperformed the linear discriminant analysis classifier. Interestingly, the first three sessions for Patient 3 revealed high offline classification performances followed by a substantial decrease. At this time point, Patient 3 was hospitalized because of medical complications, which might explain the drop in accuracy due to his deteriorating physical condition.
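The general point that a linear decision rule can sit at chance on data that a radial basis function kernel separates can be illustrated with a toy example. The sketch below is not the linear discriminant analysis or support vector machine used in the study. As simplified stand-ins it uses a nearest-class-mean rule (whose decision boundary is a straight line) and an RBF-kernel class-mean rule (a Parzen-style kernel classifier), applied to synthetic two-dimensional data in an XOR layout. All names, parameters and data are hypothetical.

```python
import math
import random

def make_xor_data(n_per_cluster, seed, spread=0.2):
    """Synthetic two-class data in an XOR layout (labels +1 and -1)."""
    rng = random.Random(seed)
    centres = [((1.0, 1.0), 1), ((-1.0, -1.0), 1),
               ((1.0, -1.0), -1), ((-1.0, 1.0), -1)]
    points, labels = [], []
    for (cx, cy), label in centres:
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(label)
    return points, labels

def class_mean(points, labels, label):
    members = [p for p, l in zip(points, labels) if l == label]
    return (sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members))

def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def linear_predict(x, mean_pos, mean_neg):
    """Nearest-class-mean rule: assign x to the class with the closer mean."""
    return 1 if sq_dist(x, mean_pos) <= sq_dist(x, mean_neg) else -1

def kernel_predict(x, points, labels, gamma=1.0):
    """Assign x to the class with the larger mean RBF similarity."""
    pos = [math.exp(-gamma * sq_dist(x, p)) for p, l in zip(points, labels) if l == 1]
    neg = [math.exp(-gamma * sq_dist(x, p)) for p, l in zip(points, labels) if l == -1]
    return 1 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1

def accuracy(predictions, labels):
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)

# Separate training and test sets drawn with different seeds.
train_x, train_y = make_xor_data(25, seed=0)
test_x, test_y = make_xor_data(25, seed=1)

mean_pos = class_mean(train_x, train_y, 1)
mean_neg = class_mean(train_x, train_y, -1)

linear_acc = accuracy([linear_predict(x, mean_pos, mean_neg) for x in test_x], test_y)
kernel_acc = accuracy([kernel_predict(x, train_x, train_y) for x in test_x], test_y)
print(f"linear rule: {linear_acc:.2f}, RBF kernel rule: {kernel_acc:.2f}")
```

In this layout both class means lie close to the origin, so the linear rule stays near 50%, whereas the kernel rule responds to local similarity and separates the clusters. Real EEG features are far noisier, so the gap seen here should not be read as a prediction for patient data.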

The classification results for the oddball paradigms revealed average accuracies in discriminating between target and non-target trials higher than chance level for all three patients. The recording of two runs on the same day, one before and one after each conditioning session, allowed us to detect possible changes in the cortical processing of rare and frequent auditory stimuli within the same daily session and provided an indirect measure of automatic attention and vigilance. These changes in the classification of attentional performance were found to be significantly correlated with the offline semantic conditioning classification results. Specifically, a negative and significant Pearson’s r was obtained for Patients 1 and 2 when the oddball classification changes were correlated with the offline results obtained with signal amplitudes as input features, and for Patient 2 also when the oddball classification changes were correlated with the offline classification performances obtained with wavelet coefficients as input features. A decrease in the cortical processing of frequent and rare auditory stimuli revealed a change in the attention level of the patient, which influences the outcome of the administered paradigm (e.g. semantic conditioning paradigm) as described by Neumann and Kotchoubey (2004). Both patients with a significant negative correlation performed around chance level, whereas Patient 3 showed significant classification and no negative correlation between P300 changes over time and performance. This suggests that Patients 1 and 2 suffered from a drop in vigilance during communication periods, which may explain the negative result in about half the sessions for the patient in the completely locked-in state.

Although these findings were observed in a specific patient population, they might also have implications for, and applications in, other clinical populations. For instance, the misdiagnosis of disorders of consciousness is a major issue and has different clinical, ethical and legal implications (Monti et al., 2010). The absence of a reliable motor output, which can be seen in brain-injured patients, represents the main obstacle towards an accurate diagnosis or assessment of a capacity to communicate (Bardin et al., 2011). In recent years, a small number of studies proposed new paradigms based on different imaging approaches in order to overcome this issue (Neumann and Kotchoubey, 2004; Monti et al., 2010; Bardin et al., 2011). We believe that the identification of a neurophysiological marker of the attention level could be employed to reduce the performance variability of patients with disorders of consciousness in the use of functional MRI or EEG-based communication paradigms, leading to a potential solution for the unresponsiveness of those patients.

The analysis of all EEG frequency components from 0 to 40 Hz for chance level and non-chance level sessions strengthens the interpretation of a vigilance problem in Patients 1 and 2. ‘Chance level’ and ‘non-chance level’ sessions differ in their arousal pattern: Patients 1 and 2, who more frequently performed at chance level, show a significant difference in the very slow (0–3.5 Hz) frequency band and in the high-frequency low-gamma band (30–40 Hz) both for CS1− and CS2. Moreover, a different power change pattern was detected in the 6–11 Hz band when comparing the data of Patients 1 (completely locked-in state) and 3. Both the P300 decrement and the slow frequency EEG during communication sessions reflect a decrement in vigilance, and future brain–computer interface experiments should try to prevent drops in arousal by adapting the communication sessions to the patient’s circadian pattern and vigilance level.

Except for Patient 3 (who had an Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised score of 1 and some minimal residual eye control), communication by means of the semantic conditioning brain–computer interface and the applied classification of these specific slow EEG components (see Supplementary material for the analysis of different frequency bands) was not reliably achieved, particularly in the patient in the completely locked-in state who was trained for 37 sessions over 37 days. These data emphasize the fact that a valid assessment of communication capacity in locked-in state and completely locked-in state with any type of assistive communication, particularly with brain–computer interfaces, needs extensive repetition and long periods of training with careful control of the vigilance status. Patient 1 (completely locked-in state) showed peaks of high communication accuracy of 60 to 70% in 4 of 37 sessions. If we exclude cognitive decline and dementia in this patient because of her largely intact cognitive event-related brain potential data (Supplementary material), two explanations for these inconclusive results are possible: (i) our theoretical account predicting extinction of goal directed cognition and volition (Birbaumer et al., 2012) is correct and may even sometimes extend into the late locked-in state, excluding all types of motivated communication and instrumental learning; and (ii) attention and arousal requirements for this semantic classical conditioning brain–computer interface are still too high, and patients with reduced vigilance and drowsiness during many of the training sessions fail to demonstrate stable above chance level classification. In this respect, future experiments should involve real-time vigilance state assessment in order to adapt the paradigm and the stimulus presentation rate to the patient’s brain state of consciousness and awareness.
As demonstrated in Patient 3, the use of a non-linear classifier is recommended for discriminating among different conditioning stimuli. Physiological signals other than the EEG (e.g. heart rate or haemodynamic response) need to be included despite the constant and redundant physiological state of the patients caused by artificial respiration and feeding and the completely motionless body position over years. Another possible explanation for the negative results in Patients 1 and 2 is the short time interval available for brain-responding: patients with complete paralysis and compromised cognitive functioning and disorders of attention may need more time to respond to complex statements.

Frequent pairing of a neutral stimulus with a biologically relevant stimulus leads to the enhancement of neuronal connections between the activated neuronal pools (Montoya, 1996). In the semantic conditioning paradigm presented herein, affirmative statements were paired with electrical stimulation, resulting in the simultaneous activation of brain regions involved in language and sensory processing. According to the principles of classical conditioning, once the new cell assembly underlying the paired stimulation is formed, presentation of the neutral stimulus alone is enough for its activation. Conceivably, this can trigger the EEG pattern that could have been identified by the classifier and is visualized in Fig. 4. An amplification of this signal of interest would support a more accurate classification. In this respect, future studies could investigate the use of different stimulation techniques. If a stable and reliable conditioned response could be observed in a patient, this paradigm could be used to address questions concerning basic needs as well as legal issues.

Conclusion

Excluding the possibility of cognitive decline with progression of the disease in these patients, several explanations for the lack of stable conditioning effects and brain-communication with EEG are possible. Attention span and vigilance fade rapidly and prevent conditioning and communication, particularly during the critical final phases of an experimental session; the auditory P300 data and the EEG frequency analysis reported here support that interpretation. The proposed ‘extinction of goal directed thinking effect’ (Birbaumer et al., 2012) prevents semantic conditioning because the motivated intention to anticipate a ‘yes’ or ‘no’ answer after the conditioning stimulus is attenuated or extinguished (Perky, 1910) and may also be responsible for the inconsistent results. Another explanation of the results concerns the anticipation of the aversive stimulus after an affirmative statement, which may retroactively block processing of affirmative representations; alternatively, no aversive stimulus, or an aversive stimulus after negative statements and/or positive stimuli such as harmonic sounds and/or statements of significant others, could be used as unconditioned stimulus; however, auditory stimuli were used as unconditioned stimulus in the past in the procedure described here without significant above-chance classification accuracies (Furdea et al., 2012; Ruf et al., 2013). It is also possible that the selected slow wave and frequency bands (Fig. 4 and Supplementary Figs 5, 6 and 7) do not represent the differential representations underlying ‘yes’ and ‘no’ cognition; semantic identification and semantic conditioning may involve different neuronal signatures such as travelling alpha-waves moving semantic memory representations from their prefrontal storage areas to the more central or limbic affirmative or negative areas (Fellinger et al., 2012). In addition, measures based on metabolic changes such as brain blood flow and oxygenation level (e.g. measured by near-infrared spectroscopy) may be more suited to reflect the underlying cognitive process in paralysis.

Except for the second possibility of general extinction of goal-directed thinking, solutions to the other three can be, and currently are being, explored. Fading of vigilance could be postponed with anodal transcranial direct current stimulation or high frequency transcranial magnetic stimulation, increasing brain excitation level (Cohen et al., 1998). Invasive electrical microstimulation of attentional brain sites or pharmacological application of stimulating drugs such as amphetamine constitutes another possible solution. Different classification algorithms may also improve segregation of the two representations, but only small differences in classification performance between methodologies have emerged in the brain–computer interface literature.

Supplementary material

Acknowledgements

The authors wish to thank Stephan Rink for his invaluable contribution towards the implementation of the brain–computer interfaces described herein. Particularly, the patients and their significant others assisted with untiring patience and hope. Their contribution has to be admired. The authors would also like to acknowledge the strong help that was provided by Axel Wingerath during the experiments.