It is generally agreed that considerable amounts of low-level sensory processing of visual stimuli can occur without conscious awareness. On the other hand, the degree of higher level visual processing that occurs in the absence of awareness is as yet unclear. Here, event-related potential (ERP) measures of brain activity were recorded during a sandwich-masking paradigm, a commonly used approach for attenuating conscious awareness of visual stimulus content. In particular, the present study used a combination of ERP activation contrasts to track both early sensory-processing ERP components and face-specific N170 ERP activations, in trials with versus without awareness. The electrophysiological measures revealed that the sandwich masking abolished the early face-specific N170 neural response (peaking at ∼170 ms post-stimulus), an effect that paralleled the abolition of awareness of face versus non-face image content. Furthermore, the masking appeared to strongly attenuate earlier feedforward visual sensory-processing signals. This early attenuation presumably resulted in insufficient information being fed into the higher level visual system pathways specific to object category processing, thus leading to unawareness of the visual object content. These results support a coupling of visual awareness and neural indices of face processing, while also demonstrating an early low-level mechanism of interference in sandwich masking.

Introduction

The degree to which any cognitive or perceptual process proceeds in the absence of awareness can be explored by creating conditions in which stimuli invoking that process cannot be explicitly reported but are still possibly being processed by the subject. Demonstrations of such processes often use so-called dissociation paradigms, which seek to establish that stimuli of which the subject is unaware (evident in an explicit behavioral measure) still exert an influence at a neural, cognitive, or behavioral level (as in priming, e.g., Merikle & Cheesman, 1987; Reingold & Merikle, 1988). As generally applied, when the relative sensitivity of the two measures (an explicit behavioral measure and an implicit behavioral or neural measure) related to the same perceptual process changes across conditions of awareness, perceptual processing in the absence of awareness is implied.

For example, it has been shown that the affective valence of a visually masked image of an emotional face, even though not perceived as assessed by explicit report, can still influence decisions about other stimuli (e.g., concurrent or succeeding neutral ones). This behavioral demonstration of affective priming (Murphy & Zajonc, 1993), as well as corroborating electrophysiological findings (Vizueta et al., 2007), suggests at least partial emotional processing outside of awareness. More broadly, a dissociation such as this helps to delineate the type and extent of visual processing that can occur outside of awareness (Holender, 1986; Reingold & Merikle, 1988). Such logic has also been used to provide evidence that visual feature extraction, such as line orientation (Montaser-Kouhsari, Moradi, Zandvakili, & Esteky, 2004), the binding of low-level visual features based on Gestalt principles of good continuation or common region (Mitroff & Scholl, 2005), and the grouping of visual cues to form the Müller-Lyer illusion (Moore & Egeth, 1997), can occur in the absence of explicit awareness. However, while these previous findings are compelling, it has not been demonstrated, behaviorally or neurally, that visual processing at the level of specific object category discrimination occurs outside of awareness. Moreover, the neural mechanisms by which such higher level visual processing does or does not reach conscious awareness are not at all clear.

Neuroimaging studies have provided compelling evidence of visual object category processing of stimuli within awareness. For example, the perception of human faces closely correlates with greater hemodynamic responses in the ventral occipital area known as the fusiform face area (FFA; Andrews & Schluppeck, 2004; Kanwisher, 2000; Kanwisher & Yovel, 2006; McCarthy, Puce, Gore, & Allison, 1997). This finding is corroborated by corresponding event-related potential (ERP; Allison, McCarthy, Nobre, Puce, & Belger, 1994; Allison, Puce, Spencer, & McCarthy, 1999; Bentin, Allison, Puce, Perez, & McCarthy, 1996; McCarthy, Puce, Belger, & Allison, 1999; Puce, Allison, Asgari, Gore, & McCarthy, 1996; Puce, Allison, & McCarthy, 1999) and event-related magnetoencephalographic (MEG) measures (Liu, Higuchi, Marantz, & Kanwisher, 2000). These latter two measures of high temporal resolution, EEG and MEG, have succeeded in characterizing the well-studied N170 component and its MEG analog, the M170 (Bentin, 1998; Bentin et al., 1996), as reflecting face-specific processing. The N170 is a negative-polarity ERP response to images of faces relative to images of other object categories. It typically peaks first at approximately 170 ms after stimulus onset and is often followed by an extended negative-polarity ERP wave with a similar topographic distribution over the next several hundred milliseconds, with this later effect tending to be more closely tied to behavior (i.e., delayed for longer face categorization response times; Philiastides & Sajda, 2006). Such a dual-phase spatiotemporal profile of visual evoked potentials in response to objects (Fahrenfort, Scholte, & Lamme, 2008) and faces in particular (Jemel, Schuller, & Goffaux, 2010; Luo, Feng, He, Wang, & Luo, 2010) has been characterized in several studies and supports an account of feedforward signal propagation followed by reentrant processing of the same polarity and topographic distribution. The face-specific N170 exhibits a bilateral, although typically somewhat right-weighted, occipitotemporal scalp distribution (Bentin, 1998; Bentin et al., 1996). By tracking the intactness of the N170, along with the later recurrent face-specific ERP activity phase, in conditions of awareness and unawareness of visual object stimuli, it is possible to evaluate whether this type of object category level processing occurs in the brain in the absence of awareness.

The extent to which visual processing of faces occurs in the absence of awareness remains controversial. One particularly rich body of literature concerns the phenomenon of face priming in which behavioral or neuronal responses to a face are modulated by pre-exposure to the same face (Jemel, Pisani, Calabria, Crommelinck, & Bruyer, 2003). Studies of face priming have provided conflicting indirect evidence either for or against face-specific processing in the absence of awareness. On the one hand, priming effects manifested by higher accuracy, shorter response times, and above-chance face–name associations were observed in studies using sandwich masking for faces that were presented but undetected according to verbal report (Schweinberger, Klos, & Sommer, 1995). In addition, several electrophysiological studies have shown reduced visual ERP responses at both early and late time windows to masked faces when they are preceded by the same face (Henson, Mouchlianitis, Matthews, & Kouider, 2008). Similar masked face priming effects have been localized to face-specific regions within the occipital complex using functional magnetic resonance imaging (Kouider, Eger, Dolan, & Henson, 2009). These effects suggest that some amount of face processing at the level of identity continues uninterrupted by masking and is dissociable from awareness. On the other hand, other studies using sandwich-masking procedures have also shown that early visual evoked potentials to a masked face exhibit adaptation effects while later indices of face familiarity do not, suggesting a higher level processing interruption by masking (Jemel, Calabria, Delvenne, Crommelinck, & Bruyer, 2003; Martens, Schweinberger, Kiefer, & Burton, 2006; Trenner, Schweinberger, Jentzsch, & Sommer, 2004).

Although studies of masked face priming have been useful in examining the extent of face processing in both behavioral and neural terms, there are still several limitations of the literature with respect to the question of face-specific processing being investigated in the present study. For example, while adaptation effects are sensitive and valid as indices for the level of target processing, they are somewhat indirect, as they relate to a later probe and not necessarily to the target of interest. In fact, some studies assert that observing a reduction of adaptation effects at high levels may not necessarily reflect the effect of decreased awareness but rather of the persistence or lack thereof of a “fast-decaying iconic memory trace” (Martens et al., 2006). In addition, previous studies of face priming have focused on face processing at the level of familiarity and not at the level of object category. Accordingly, the level of object categorization processing (e.g., faces versus other objects) that can be achieved within and outside of awareness still remains unclear.

Other studies have examined the processing of face targets directly in various conditions of identification and categorization performance, which presumably reflect various levels of awareness. For example, the amplitude of the face-specific N170 component to face images of parametrically degraded contrast was found to be positively correlated with the level of participants' subjective awareness of the face images (Jemel, Schuller et al., 2003). In the case of masking approaches, some degree of face-specific hemodynamic activity in the right fusiform gyrus has been reported to be present in a sandwich masking experiment (Morris, Pelphrey, & McCarthy, 2007), while in an object substitution masking paradigm containing parafoveal face and house targets presented at unpredicted locations, face-specific electrophysiological indices were reported absent (Reiss & Hoffman, 2007). Indeed, different studies use different paradigms to render a visual stimulus invisible, each of which presumably suppresses visual processing at a unique stage of processing (Kim & Blake, 2005). Therefore, the discrepancies among studies may be due to the wide-ranging quality and nature of stimulus exposure to the visual system.

The present study aimed to address some of these issues concerning face processing in the absence of awareness through an electrophysiological approach employing event-related potential (ERP) measures of brain activity. By using a variant of visual masking wherein target stimuli are both preceded and followed by non-object visual masks (i.e., sandwich masking), we investigated how the face-sensitive ERP component, the N170 (Bentin, 1998; Bentin et al., 1996), would vary with masking-induced modulations of perceptual awareness. The design of the present study afforded several advantages. First, sandwich masking robustly reduces perceptual awareness while keeping the physical qualities and exposure duration of the masked images themselves the same as those of the unmasked stimuli. Second, because of the efficacy of sandwich masking, it is possible to use stimuli that are presented foveally and that are always spatially attended, thus dissociating the effects of awareness of interest here from previously observed effects of visuospatial attention (Crist, Wu, Karp, & Woldorff, 2008). Third, using the face-specific N170 effect as a neuronal signature of category-specific processing provides clear insight into the underlying neural substrates of conscious vs. unconscious perceptual processing of object category. Fourth, the use of an electrophysiological measure with high temporal resolution affords the ability to decompose the visual processes leading up to and presumably contributing to face perception. Fifth, and finally, the inclusion of blank-image trials provided an important control condition for investigating the underlying mechanisms of sandwich masking by providing electrophysiological indices of image vs. no-image processing in both masked and unmasked conditions.

Methods

Participants

Two separate experiments were performed, in which thirty-four healthy adults with normal or corrected-to-normal vision participated. Data from eight participants were excluded from analyses due to excessive eye movements, leaving data from 12 participants (mean age of 20 ± 1.5 years, 6 males) and 14 participants (mean age of 25 ± 6.9 years, 7 males) from the first and second experiments, respectively. All participants used their right hand to perform the task (two participants were left-handed). Informed consent was obtained for each subject according to protocols approved by the Duke University Institutional Review Board (IRB). All participants were recruited through local advertisements on the Duke University campus and were compensated for their participation in accordance with stipulations outlined by the IRB.

Stimuli and task

Participants were seated with their eyes 60 cm from the center of a 19-inch CRT stimulus presentation monitor with a 60-Hz refresh rate. During each session, participants completed two blocks of runs, each with a different task. The stimulus set consisted of 6.6° × 8.8° grayscale face and house images, along with scrambled non-object masks produced by the “liquefaction” function in Adobe Photoshop, which imparts a set of masking random swirls to images. In each trial, a first mask (100-ms duration), an object image (17-ms duration), and then a second mask (100-ms duration) were presented sequentially, with varied intervals between these stimuli that robustly modulated the perceptual awareness of the object images. More specifically, in the unmasked condition, the interval between the masks and object images was 100 ms (distal masking), while in the masked condition it was 0 ms (proximal masking). The intertrial interval (ITI) was randomly jittered between 500 and 800 ms. Four trial types—masked faces and houses and unmasked faces and houses—occurred with equal frequency and with a randomized order within each run.
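The trial timing described above can be made concrete with a small sketch. It assumes that every duration was realized as a whole number of refresh frames on the 60-Hz display (so that the 17-ms object image corresponds to a single frame); the function and event names are hypothetical and are not taken from the original stimulus-presentation code.

```python
REFRESH_HZ = 60  # monitor refresh rate stated in the text


def ms_to_frames(duration_ms: float) -> int:
    """Convert a nominal duration to the nearest whole number of frames."""
    return round(duration_ms * REFRESH_HZ / 1000)


def trial_timeline(masked: bool):
    """Return (event, onset_frame, n_frames) tuples for one trial.

    Assumption: the mask-object interval is 0 ms in the masked (proximal)
    condition and 100 ms in the unmasked (distal) condition, on both sides
    of the object image.
    """
    gap_ms = 0 if masked else 100
    events = [
        ("mask_1", 100),
        ("gap_1", gap_ms),
        ("object", 17),
        ("gap_2", gap_ms),
        ("mask_2", 100),
    ]
    timeline, onset = [], 0
    for name, duration_ms in events:
        n_frames = ms_to_frames(duration_ms)
        if n_frames == 0:
            continue  # zero-length gaps are simply omitted
        timeline.append((name, onset, n_frames))
        onset += n_frames
    return timeline


if __name__ == "__main__":
    for label, masked in (("masked", True), ("unmasked", False)):
        print(label, trial_timeline(masked))
```

Under these assumptions the object image occupies one frame in both conditions, and only the surrounding gaps differ, which is the property that keeps the physical exposure of the target identical across conditions.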

In the first experiment, participants performed either a color detection task or a two-alternative forced-choice (2AFC) task in each run, with these runs separated into two different task blocks. In the color detection task (Figures 1A and 1B), participants were instructed to identify, with a button press, rare target masks (20% of trials) that were either slightly magenta- or cyan-tinted (the rest being grayscale). For the color detection task, subjects completed six runs of 100 trials each, evenly and randomly distributed among the four trial types (masked and unmasked faces and houses). In the 2AFC task, participants were instructed to categorize, with a button press, the object images as being either a face or a house (Figures 1C and 1D; note that none of the masks in the 2AFC runs were colored). In the second experiment, a “blank-image” trial type was added to the color detection task, in which a blank image of background color was presented between the two masks, a trial type that provided a baseline measure for assessment of low-level visual processing of the object images. Thus, in this second experiment, trials could randomly include a face, a house, or a blank image in between the masks. Aside from the introduction of the blank image to the color detection task block, task instructions (for color detection and categorization tasks) were identical across experimental groups, the only difference being the number of trials in the color detection task in each run (120 randomized trials evenly distributed between masked and unmasked faces, houses, and blank images).

Figure 1

Stimuli and task. In each trial, five consecutive images were presented in quick succession at fixation. The target image (i.e., face, house, or blank) was always presented in the third temporal position and was either immediately preceded and followed by non-object scrambled masks (masked/subliminal condition) or preceded and followed by non-object scrambled masks with a 100-ms period between images (unmasked/supraliminal condition). (A, B) In the color detection task, participants were required to make a speeded response to the infrequent color-tinted masks. (C, D) In the two-alternative forced choice categorization task, participants were required to identify target object stimuli, masked or unmasked, as being either a face or a house.

In addition to accuracy and reaction time data for the color detection task, d-prime scores based on signal detection theory (Macmillan & Creelman, 1997) were calculated to quantify the amount of object information participants acquired in the subliminal vs. supraliminal conditions during the 2AFC task (see relevant Results section).
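As an illustration of the sensitivity measure, the following minimal sketch computes d-prime as z(hit rate) minus z(false-alarm rate). It assumes, purely for illustration, that "face" responses to faces count as hits and "face" responses to houses count as false alarms, and it applies a log-linear correction (adding 0.5 to each cell) to avoid infinite z-scores; neither choice is stated in the text, and the function name is hypothetical.

```python
from statistics import NormalDist


def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity index d' = z(H) - z(F) from a 2x2 response table.

    Assumption: a log-linear correction is applied so that hit or
    false-alarm rates of exactly 0 or 1 do not produce infinite z-scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    # Hypothetical counts: near-ceiling versus near-chance discrimination.
    print(d_prime(95, 5, 5, 95))
    print(d_prime(54, 46, 46, 54))
```

A d-prime near zero indicates that face and house trials could not be told apart, independent of any bias toward one response.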

EEG and ERP data

The electroencephalogram (EEG) was recorded continuously from a custom 64-channel cap (Electrocap, Eaton, OH) with a right mastoid reference, using a band-pass filter of 0.01–100 Hz, sampling rate of 500 Hz, and gain of 1000 (Neuroscan Amplifier System, Charlotte, NC). Eye movements were monitored with a zoom-lens video camera, two vertical EOG channels below the eyes referenced to prefrontal electrodes (Fp1 and Fp2), and a horizontal EOG channel measuring differential activity between the left and right outer canthi. Artifact rejection was performed offline to remove trials contaminated by blinks, muscle activity, drift, or eye movement.

The artifact-free data were time-locked-averaged selectively for the different stimulus types. These averages were then low-pass filtered offline using a nine-point moving-average filter, which, at our 500-Hz sample rate, heavily attenuates signal noise with frequencies at and above ∼56 Hz. The ERP averages were subsequently algebraically re-referenced to the average of all electrodes (common reference). In addition, we conducted analyses employing an average mastoid reference, a commonly used referencing scheme. All electrophysiological data were time-locked to the onset of the object image in each trial (faces, houses, and, in the case of the color detection task in the second experiment, blank images). These time-locked electrophysiological responses were then averaged according to condition and stimulus type to extract comparisons of interest as described below. Face-selective ERP effects were extracted by contrasting the ERPs evoked by the face image stimuli with those evoked by the house image stimuli, separately for the different perceptual conditions. Object-specific ERP effects were extracted by contrasting the ERPs evoked by faces and houses (collapsed) with those evoked by the blank-image trials included in the color detection task in the second experiment. The ERPs were statistically analyzed using repeated-measures analyses of variance (ANOVAs) (Object Category by Electrode Site [left versus right]) of mean amplitudes within specific latency windows (6 ms wide, from 0 to 636 ms post-stimulus) for the corresponding electrodes over left and right occipitotemporal scalp sites (TO1 and TO2). A separate analysis with the same factors was performed at two more medial and posterior scalp sites (O1 and O2), which are sensitive to early visual sensory ERP responses, to examine mean amplitude differences in masked and unmasked conditions in response to objects (face and houses, collapsed) versus blank-image trials.
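The ∼56-Hz figure quoted for the nine-point moving-average filter follows from the standard frequency response of an N-point boxcar average, whose gain is |sin(πfN/fs) / (N sin(πf/fs))| and whose first null falls at fs/N = 500/9 ≈ 55.6 Hz. The sketch below evaluates that textbook expression; it is not the authors' filtering code, and the function name is hypothetical.

```python
import math


def moving_average_gain(freq_hz: float, n_points: int = 9, fs_hz: float = 500.0) -> float:
    """Magnitude response of an n-point moving-average (boxcar) filter."""
    if freq_hz == 0:
        return 1.0  # DC passes unchanged
    x = math.pi * freq_hz / fs_hz
    return abs(math.sin(n_points * x) / (n_points * math.sin(x)))


if __name__ == "__main__":
    first_null_hz = 500.0 / 9  # fs / N
    for f in (10.0, 30.0, first_null_hz, 250.0 / 3):
        print(f"{f:6.1f} Hz -> gain {moving_average_gain(f):.3f}")
```

Note that a boxcar filter does not remove everything above its first null: sidelobes remain (for example, a gain of 2/9 at about 83 Hz), so "heavily attenuates" is the appropriate description rather than "eliminates".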
Observed mean amplitude effects between conditions were considered significant if the p-value was less than 0.05 for at least 6 consecutive time bins of 6 ms, assessed between 0 and 636 ms post-stimulus. The onset of the component was identified as the first of these 6 consecutive significant time bins.
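The consecutive-bin criterion can be sketched as a simple run-length search over the per-bin p-values. The example assumes that one p-value is available per 6-ms bin starting at 0 ms and that the onset is reported as the start time of the first bin in the qualifying run; the function name and the p-values in the usage example are hypothetical.

```python
def first_sustained_onset(p_values, alpha=0.05, min_run=6, bin_ms=6, start_ms=0):
    """Return the onset (ms) of the first run of >= min_run significant bins.

    Returns None if no run of sufficient length exists.
    """
    run_start, run_len = None, 0
    for i, p in enumerate(p_values):
        if p < alpha:
            if run_len == 0:
                run_start = i  # remember where the current run began
            run_len += 1
            if run_len >= min_run:
                return start_ms + run_start * bin_ms
        else:
            run_len = 0  # a non-significant bin breaks the run
    return None


if __name__ == "__main__":
    # Hypothetical p-values: a 5-bin run (too short), then a 6-bin run.
    p = [0.5] * 10 + [0.01] * 5 + [0.5] + [0.01] * 6 + [0.5] * 3
    print(first_sustained_onset(p))
```

Requiring several adjacent significant bins is a common guard against isolated false positives when many time windows are tested.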

Results

Behavior

In the color detection task, the mean accuracy and reaction time values were 96.7 ± 1.8% and 399 ± 7.8 ms, respectively. This high level of performance indicates that participants were in fact closely attending to the visual stimulus stream, as instructed.

A highly significant decrement of visual awareness in the subliminal masking condition as compared to the supraliminal one was revealed by behavioral measures of participants' percepts about the objects (i.e., the ability to discriminate between faces and houses). Mean discrimination accuracy approached ceiling at 94.5% for the supraliminal trials but was 54.3% for the subliminal ones (very close to chance level performance of 50%). The mean d-prime score for the supraliminal condition was 3.7 ± 0.14, whereas the mean subliminal one was over an order of magnitude lower (0.31 ± 0.1), resulting in a highly significant difference in discrimination sensitivity between the conditions (t25 = −19.0; p < 0.00001). These behavioral results provide evidence for robust attenuation of awareness of the image objects in the subliminal masking condition, despite sustained focal attention toward the stimuli.

Electrophysiology

All statistics and plots below are with respect to the ERP data measures relative to a common reference. Parallel analyses employing an average mastoid reference yielded the same pattern of effects.

Face/House N170 difference (data from both Experiments 1 and 2)

Results from the electrophysiological data, for both the color-mask detection and the 2AFC tasks in both experiments, indicated that early face-selective neural processing was present in the supraliminal conditions but eliminated in the subliminal masked conditions. In particular, hallmark face-selective N170 ERP activity—namely, a significant negative-polarity difference between faces and houses, peaking at around 170 ms and having an occipitotemporal scalp distribution—was observed for the supraliminal trials but not for the subliminal ones (Figure 2). For the supraliminal condition in the color detection task (Figure 2A), a significant face-specific N170 effect was observed between 132 and 246 ms post-stimulus (F1,25 = 58.2, p < 0.0001), followed by a second phase of similarly distributed, longer latency, negative-polarity, face-selective activity between 510 and 588 ms (F1,25 = 7.94, p = 0.009). Likewise, for the supraliminal conditions in the forced-choice categorization task (Figure 2C), a similar set of biphasic face-selective effects was observed in the post-stimulus intervals of 138–264 and 504–588 ms (F1,25 = 32.7, p < 0.0001 and F1,25 = 33.3, p < 0.0001, respectively). These face-specific ERP effects did not differ significantly between the color detection and categorization tasks.

Figure 2

Face-selective ERP effects in the subliminal vs. supraliminal condition. Plotted here are the ERP waveforms time-locked to the onset of the object image stimulus and the corresponding topographical distribution maps of the face-minus-house contrast for (A) the supraliminal condition in the color detection task, (B) the subliminal condition in the color detection task, (C) the supraliminal condition in the forced-choice categorization task, and (D) the subliminal condition in the forced-choice categorization task. For both tasks, a bilateral, right lateralized, negative-polarity difference between the ERPs to the faces and to the houses was observed in the supraliminal conditions but not in the subliminal conditions.

In summary, we observed normal face-selective neural activity with intact perceptual awareness and accurate discrimination of the object image content in the distal sandwich-masking conditions (i.e., trials with the brief blank interval between the masks and object images). In contrast, both the face-selective neural activity and the perceptual awareness of the visual stimulus content were eliminated by the proximal masking (i.e., trials with no blank interval between the masks and the object image stimulus), despite the stimulus images being displayed with the same duration.

Object-specific P1 difference (data from Experiment 2 only)

In addition to the modulation of the face-specific N170 responses, the effects of the masking on earlier low-level visual processing were of interest. In Experiment 2, to investigate possible contributing mechanisms of the sandwich-masking paradigm being employed here, responses on trials with face and house images were collapsed and compared to responses on the trials with a blank target image (essentially providing a “something versus nothing” comparison), separately for the masked and unmasked conditions. In the supraliminal (unmasked) condition, we observed a robust positive-polarity ERP response (Figure 3A) over bilateral medial posterior scalp in the sensory ERP component latency range of 100 ms post-stimulus (66–138 ms; F1,13 = 48.16, p < 0.0001). This positivity exhibited a similar scalp distribution and onset latency to that of the visual evoked P1 effect generally thought to originate from early extrastriate low-level visual processing areas in occipital cortex (Clark & Hillyard, 1996; Di Russo, Martinez, Sereno, Pitzalis, & Hillyard, 2002; Woldorff et al., 1997). In contrast, for the subliminal (masked) conditions, the same component (faces and houses collapsed vs. blank images), although reaching significance (F1,13 = 9.48, p = 0.01), was drastically reduced in both amplitude and duration (84–114 ms post-stimulus; Figure 3B). These results suggest that in the subliminal condition, early visual sensory processing, although present, was so heavily attenuated as to be almost eliminated by the sandwich masking.
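The "something versus nothing" contrast can be sketched as a difference wave followed by a mean-amplitude measurement in a latency window. The example assumes that each condition is already available as an averaged ERP sampled at common time points, and that faces and houses are collapsed by averaging their two ERPs with equal weight; the helper names and the toy waveforms are hypothetical.

```python
def collapse(*waves):
    """Average several equally weighted ERP waveforms sample by sample."""
    return [sum(samples) / len(samples) for samples in zip(*waves)]


def difference_wave(a, b):
    """Sample-by-sample contrast a minus b."""
    return [x - y for x, y in zip(a, b)]


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean of the samples whose time stamp lies in [start_ms, end_ms)."""
    values = [v for v, t in zip(wave, times_ms) if start_ms <= t < end_ms]
    return sum(values) / len(values)


if __name__ == "__main__":
    times_ms = list(range(0, 200, 2))  # 500-Hz sampling gives 2-ms steps
    # Toy waveforms (microvolts): a step response inside 66-138 ms only.
    faces = [1.0 if 66 <= t < 138 else 0.0 for t in times_ms]
    houses = [3.0 if 66 <= t < 138 else 0.0 for t in times_ms]
    blank = [0.5 for _ in times_ms]

    objects_minus_blank = difference_wave(collapse(faces, houses), blank)
    print(mean_amplitude(objects_minus_blank, times_ms, 66, 138))
```

The same three helpers also cover the face-minus-house contrast used for the N170 analysis, by passing the face and house ERPs directly to `difference_wave`.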

Figure 3

Sensory-related ERP effects in the subliminal vs. supraliminal condition. The early sensory evoked ERP effects were extracted by contrasting the responses to trials with object images (faces and houses) with the responses to trials with blank images, separately for the two masking contexts (N = 14). (A) In the supraliminal condition, we observed a robust positive-polarity ERP over bilateral occipital regions peaking at 100 ms, a response highly consistent with the well-studied sensory evoked P1 effect elicited by the object stimuli. (B) In the subliminal condition, a much weaker (∼five times smaller) sensory evoked P1 effect was observed at this latency.

Discussion

The present study shows that visual awareness of faces, as measured by object category discrimination ability, was eliminated in conditions of immediately temporally adjacent masks. When visual awareness of faces was abolished through visual sandwich masking, the face-specific N170 effect was also eliminated. This is not to say that the N170 can or should be equated to the awareness process but rather that the masking as employed here disrupted both the N170 and the emergence of awareness. Moreover, and more specifically, the present results suggest that sandwich masking reduces awareness by disrupting the visual signal at processing stages prior to face-specific processing and prior to the emergence of awareness.

Discordant findings regarding the extent of face-specific processing that can occur within and outside visual awareness seem to have arisen from three issues in terms of design and interpretation. First, in many studies, the experimental designs did not allow for a direct measure of face-specific categorical processing. Second, in some studies reporting a lack of face processing in the absence of awareness, the interpretation is based upon physical manipulation of the target image itself (i.e., degrading the contrast of a face image until viewers no longer categorize it as one). Third, studies asserting intact processing of faces without awareness often define and sort “aware” and “unaware” trials in a way that risks contaminating the “unaware” condition with trials in which the subject may have in fact been aware of the stimuli. For example, if the method employed to disrupt visual awareness is not robust enough to yield chance performance, or no awareness, trials deemed “unaware” based only on the presentation manipulation will contain some trials in which there was some awareness. This, in turn, may lead to an inflated estimate of the extent of face-specific processing outside of awareness.

The extent to which the content of faces is processed in the absence of awareness is of broad interest and has led to a number of cases of indirect evidence supporting unconscious face-related processing. However, the indirect nature of these studies, especially vis-à-vis face-specific processing, tends to limit the interpretability of these findings concerning such specificity. For example, several previous studies have reported that the emotional content of a face stimulus is processed in the absence of awareness (Jiang & He, 2006; Kiss & Eimer, 2008; Murphy & Zajonc, 1993; Pessoa, 2005; Pessoa, Japee, & Ungerleider, 2005; Vizueta et al., 2007; Whalen et al., 1998; Wiens, 2006; Williams, Morris, McGlone, Abbott, & Mattingley, 2004). This is manifested as enhanced (higher amplitude) visual evoked potential responses during perceptual suppression or as enhanced hemodynamic activity in the amygdala. In these studies, however, the responses to purportedly suppressed face stimuli are not compared to those associated with other object categories. This is true of both the behavioral measure establishing conditions of awareness (an affective discrimination task) and the implicit measure of the same process (modulation of scalp ERP components or of amygdala activation). Because these studies do not directly probe category-specific processing, and because the fast and presumably automatic processing of the affective content of faces does not necessarily require the categorical processing of the face as an object (Hung et al., 2010), these studies would not seem to be adequate grounds upon which to base claims of object category processing in the absence of awareness.

This general issue of internal validity is also instantiated in a number of studies claiming that face-specific processing does not occur in the absence of awareness. For example, in face priming studies, reaction times and neural responses to faces that were previously presented under masked conditions are modulated, providing evidence that processing of the face had occurred in the absence of awareness. On the other hand, several studies (Jemel, Pisani et al., 2003; Martens et al., 2006; Trenner et al., 2004) have demonstrated that priming effects at low levels of visual processing occur in masked conditions without awareness but that such priming effects are not manifested at the level of face processing in these conditions. Simply put, the absence of a reduced face-specific response to a face previously presented under unaware conditions (an effect of adaptation) suggests that the face was not processed to the level of identity during the “unaware” presentation. In each of these studies, relevant comparisons are made not at the categorical level but at the level of face familiarity, which does not necessarily address face–object-specific processing in the absence of awareness.

Another common issue in some studies reporting a lack of face-specific processing during unawareness is that the conditions identified as unaware are actually conditions of degraded physical integrity of the target (i.e., face) stimuli. For example, one group reported the stepwise emergence of the face-specific N170 component as a function of increasing awareness of the face images (Jemel, Schuller et al., 2003). In that study, however, parametric degradation of image contrast was employed to extract behavioral curves of increasing categorization performance. Manipulating such a low-level visual property would seem to represent less a decrease in the awareness of the viewer and more a decrease in the actual faceness of the visual image stimulus input. This finding was also supported by data showing that as the duration of a masked face image is decreased and the image is increasingly scrambled, behavioral performance as well as neural indices of face-specific processing decrease (Grill-Spector, Kushnir, Hendler, & Malach, 2000). What these studies have in common is the direct manipulation of the physical integrity of target stimuli to reduce categorization performance, as reflected behaviorally or by the brain. Such results, although of considerable interest, would seem to be less about perceptual processing outside of awareness and more about how physical stimulus integrity relates to object-related processing.

Finally, in the case of studies asserting that face processing occurs in the absence of awareness, it is important to consider possible contamination of the “unaware” condition with trials in which the subject may actually be aware. Here, the issue arises from the way trials are binned to explore implicit measures rather than from the visual presentation protocol itself. For example, in a study employing sandwich masking, the results showed hemodynamic activation in right fusiform gyrus in the masked condition in response to face images relative to images of other objects (Morris et al., 2007). While the comparison examined in this study was directed toward assessing object category processing outside of awareness, the manner in which the data were examined may be important to consider. More specifically, the manipulation did effectively diminish awareness, as evident in decrements to chance levels in both detection and categorization of stimuli. As is common practice, however, masked trials were collapsed regardless of behavior into a condition called “unawareness.” This means that, in this particular instance, the 10% of trials in which the subject managed to detect the images in the masked condition could have been driving the face-specific activity observed in that condition. Furthermore, because no direct comparison was made between the effect sizes of masked and unmasked trials, it remains possible that a markedly smaller effect in “subliminal” conditions was driven by a small number of trials in which the viewer was aware of the images. This problem has previously been articulated by Kouider and Dupoux (2004), who asserted that partial awareness of a target stimulus in experiments investigating semantic priming can lead to its illusory reconstruction at the probe stage, thus perpetuating the controversy surrounding unconscious semantic priming.
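
The concern about collapsing trials can be illustrated with simple arithmetic. The sketch below is not taken from any of the studies discussed; the function name `collapsed_effect` and the effect sizes are hypothetical, chosen only to show that a condition average is a weighted mixture of its aware and unaware trials.

```python
def collapsed_effect(aware_fraction, aware_effect, unaware_effect=0.0):
    """Expected effect when aware and unaware trials are averaged together.

    aware_fraction : proportion of trials in the bin that were actually detected
    aware_effect   : face-minus-object effect on detected trials (arbitrary units)
    unaware_effect : effect on truly undetected trials (0.0 = no unconscious effect)
    """
    if not 0.0 <= aware_fraction <= 1.0:
        raise ValueError("aware_fraction must lie between 0 and 1")
    return aware_fraction * aware_effect + (1.0 - aware_fraction) * unaware_effect


# Hypothetical values: 10% detected trials, unit-sized effect on those trials,
# and no effect at all on the remaining 90% of trials.
masked_average = collapsed_effect(0.10, 1.0)
print(f"effect in the collapsed masked condition: {masked_average:.2f}")
```

Under these assumed numbers, a face-specific effect that exists only on detected trials would still appear in the collapsed average, at roughly one tenth of its unmasked size. A direct comparison of effect sizes between masked and unmasked conditions is one way to check for this pattern.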

The present study sought to mitigate these issues in several ways. First, all relevant electrophysiological data were time-locked to the onset of the object images, and the contrasts were made between responses to the face images and responses to the house images, thus explicitly differing at the categorical level. Second, the physical integrity of the target images themselves was preserved while visual awareness was fully eliminated. Thus, the “faceness” of the face image stimuli themselves was not compromised. Finally, the conditions upon which interpretations are based in the current study reflect robust and unambiguous disruption of visual awareness. The distinction between awareness and unawareness was bolstered by the drastic behavioral performance difference between the distally masked (supraliminal) and proximally masked (subliminal) conditions. More specifically, in the distally masked (i.e., supraliminal) condition, the face/house discrimination performance indicated that the physical characteristics of the target images were readily and easily identified, whereas in the proximally masked (subliminal) condition task performance was essentially at chance. Because the physical qualities of the target stimuli were held constant (i.e., identical images presented for identical durations of 17 ms), this perceptual decrement can only be attributed to the masking-related interaction processes in the brain. Furthermore, such robust masking made it possible to examine electrophysiological data according to trial type and not necessarily according to behavioral performance. Examining only misses and hits in a less robust paradigm is subject to a certain degree of ambiguity, in that any extracted face/house differences in the masked condition may reflect later processes associated with behavioral responses (Summerfield, Egner, Mangels, & Hirsch, 2006) and not the stimuli that are of primary interest here.

Concerning the electrophysiological findings, the addition of a blank-image condition allowed us to assess the possible mechanisms through which sandwich masking disrupts awareness. By comparing electrophysiological responses to objects versus blanks across the masking conditions, we observed that, in the subliminal masking conditions, there was strong attenuation of the feedforward visual signal reflected in the P1 component at 100 ms, which has been associated by source analysis and neuroimaging linkages with the initial feedforward activity in extrastriate regions of visual cortex (Clark & Hillyard, 1996; Di Russo et al., 2002; Heinze et al., 1994; Woldorff et al., 1997). This result suggests that the disruption by sandwich-masking techniques such as that used here occurs at or before the level of the feedforward signal through extrastriate visual cortex. This, in turn, suggests that under such circumstances insufficient visual information is able to reach the hierarchically later stages of visual processing, such as those associated with face detection/discrimination, resulting in the observed decrement in both categorization ability and the face-specific N170 neural response. Thus, the current results suggest that the effects of sandwich masking derive from an early signal disruption mechanism, which leads to the visual object image content never reaching awareness and never eliciting electrophysiological measures of that content.
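
The objects-minus-blanks contrast described above can be sketched as follows. This is a minimal illustration on synthetic single-channel data, not the analysis pipeline used in the study; the array shapes, the 80 to 120 ms measurement window, the simulated gains, and the helper names are all assumptions made for the example.

```python
import numpy as np


def difference_wave(epochs_a, epochs_b):
    """Trial-average each condition (axis 0 = trials) and return a minus b."""
    return np.mean(epochs_a, axis=0) - np.mean(epochs_b, axis=0)


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean of the waveform within an inclusive latency window."""
    window = (times_ms >= start_ms) & (times_ms <= end_ms)
    return float(np.mean(wave[window]))


# Synthetic data only: one occipital channel sampled every 2 ms (assumed).
rng = np.random.default_rng(0)
times_ms = np.arange(-100, 400, 2)
p1_shape = np.exp(-0.5 * ((times_ms - 100) / 15.0) ** 2)  # bump peaking at 100 ms


def simulate(n_trials, p1_gain):
    """Return (n_trials, n_times) epochs: a scaled P1-like bump plus noise."""
    noise = rng.normal(0.0, 1.0, size=(n_trials, times_ms.size))
    return p1_gain * p1_shape + noise


# Hypothetical gains. The mask-evoked response is omitted because it is common
# to object and blank trials and would cancel in the subtraction.
supra_objects, supra_blanks = simulate(200, 5.0), simulate(200, 0.0)
sub_objects, sub_blanks = simulate(200, 1.0), simulate(200, 0.0)

supra_p1 = mean_amplitude(difference_wave(supra_objects, supra_blanks), times_ms, 80, 120)
sub_p1 = mean_amplitude(difference_wave(sub_objects, sub_blanks), times_ms, 80, 120)
print(f"supraliminal P1: {supra_p1:.2f}, subliminal P1: {sub_p1:.2f}")
```

Because the masks are identical on object and blank trials within a masking context, the subtraction isolates the activity attributable to the target image, which is what allows the P1 effect to be compared across the two contexts.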

The present results are in contrast to those functional accounts of backward masking and object substitution masking that propose that unawareness is accomplished through disrupting longer-latency reentrant processing to early visual areas (Fahrenfort, Scholte, & Lamme, 2007; Reiss & Hoffman, 2007). Our findings suggest that masking as employed here disrupts awareness through an early stage of disruption not unlike that described by the race model of sustained (mask) and transient (target) channels proposed by other groups (Breitmeyer & Ogmen, 2000; Ogmen, Breitmeyer, & Melvin, 2003). This difference, however, seems likely to be due to the inclusion of a forward mask and its disruptive effect on the feedforward signal. In examining the raw electrophysiological data comparing masked objects (faces and houses) with masked blank images, it appears that the visual evoked potential (VEP) associated with the target was integrated with the larger and dominant VEP initiated by the forward mask in the subliminal masking conditions. This may have, in turn, resulted in a low visual target signal-to-noise ratio and the inability to extract the target image content in those conditions (Keysers & Perrett, 2002), a mechanism comparable to that reflected in the decrements in higher order categorization performance for lag 1 secondary targets embedded in a rapid serial visual presentation (RSVP) stream (Hommel & Akyurek, 2005). In sum, and taking into account previous findings concerning the mechanisms of backward masking, it would appear that while it is possible that the backward mask disrupts reentrant visual processing of a signal associated with the target, the forward mask, because of its relative strength and temporal proximity, may disrupt the feedforward signal before it reaches such reentrant stages, as suggested by our results.

A possible caveat of the present study is that it employed only two interstimulus intervals (ISIs; 0 and 100 ms) for the masked and unmasked conditions, respectively. Although this resulted in very clear-cut conditions of awareness as measured behaviorally, these conditions represent the two extremes of a possibly wide spectrum of intermediate levels of awareness. A parametric approach that varies the ISI between masks and object images across intermediate d-prime values could also be a useful way of assessing the association between awareness and the neural indices of face-specific processing.
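
As a sketch of how the intermediate sensitivity levels mentioned above could be quantified, d-prime can be computed from face/house response counts at each ISI. The mapping of responses to hits and false alarms, the log-linear correction, the intermediate ISI values, and all counts below are assumptions for illustration, not values from the present study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumed mapping for a face/house task: "face" responses on face trials are
    hits, and "face" responses on house trials are false alarms. A log-linear
    correction (add 0.5 to each count) keeps rates of exactly 0 or 1 finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical response counts per ISI (ms):
# (hits, misses, false alarms, correct rejections)
counts_by_isi = {
    0: (50, 50, 50, 50),    # performance at chance
    33: (65, 35, 35, 65),
    67: (80, 20, 20, 80),
    100: (95, 5, 5, 95),    # near-ceiling performance
}

for isi, counts in counts_by_isi.items():
    print(f"ISI {isi:3d} ms: d' = {d_prime(*counts):.2f}")
```

With sensitivity estimated at each ISI in this way, an ERP measure such as the face-minus-house N170 difference could then be examined as a function of d-prime rather than across only the two extreme conditions.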

In summary, our results show that, in the context of unawareness as achieved with sandwich masking, neural indices of face-specific processing are closely associated with behavioral measures of image discriminability, being eliminated when image discriminability and awareness are also eliminated. In addition, our electrophysiological findings indicate that sandwich masking also induces heavy attenuation of early lower level visual processing, prior to the stage at which robust object-category-specific processing activity is typically elicited. The results therefore suggest that the effects of sandwich masking, a widely used approach in the study of visual processing and conscious awareness, derive from an early disruption of the feedforward pathways in or prior to extrastriate visual cortical areas, and thus considerably prior to entry into higher level object discrimination processing. Moreover, the present findings underscore the importance of elucidating the mechanisms by which a given stimulation or task approach may actually accomplish an observed disruption in awareness.

Acknowledgments

This research was supported by NIMH R01 MH60415 and NINDS P01-NS41328 (Project 1).

Figure 1

Stimuli and task. In each trial, five consecutive images were presented in quick succession at fixation. The target image (i.e., face, house, or blank) was always presented in the third temporal position and was either immediately preceded and followed by non-object scrambled masks (masked/subliminal condition) or preceded and followed by non-object scrambled masks with a 100-ms period between images (unmasked/supraliminal condition). (A, B) In the color detection task, participants were required to make a speeded response to the infrequent color-tinted masks. (C, D) In the two-alternative forced choice categorization task, participants were required to identify target object stimuli, masked or unmasked, as being either a face or a house.

Figure 2

Face-selective ERP effects in the subliminal vs. supraliminal condition. Plotted here are the ERP waveforms time-locked to the onset of the object image stimulus and the corresponding topographical distribution maps of the face-minus-house contrast for (A) the supraliminal condition in the color detection task, (B) the subliminal condition in the color detection task, (C) the supraliminal condition in the forced-choice categorization task, and (D) the subliminal condition in the forced-choice categorization task. For both tasks, a bilateral, right lateralized, negative-polarity difference between the ERPs to the faces and to the houses was observed in the supraliminal conditions but not in the subliminal conditions.

Figure 3

Sensory-related ERP effects in the subliminal vs. supraliminal condition. The early sensory evoked ERP effects were extracted by contrasting the responses to trials with object images (faces and houses) with the responses to trials with blank images, separately for the two masking contexts (N = 14). (A) In the supraliminal condition, we observed a robust positive-polarity ERP over bilateral occipital regions peaking at 100 ms, a response highly consistent with the well-studied sensory evoked P1 effect elicited by the object stimuli. (B) In the subliminal condition, a much weaker (∼five times smaller) sensory evoked P1 effect was observed at this latency.