Search asymmetry is a robust phenomenon with various stimuli and is important for understanding determinants of efficiency in visual search. However, its underlying mechanism remains unknown due to the lack of a method for estimating visual features used by human observers. This study used a classification image technique to solve this problem. Standard classification image analyses with an experiment of visual search asymmetry between Q and O revealed that observers used the same features in both search tasks, rejecting a hypothesis incorporating top-down feature selection. More quantitative data analysis and an additional experiment with a singleton search task also rejected target-dependent selective tuning of the common feature. Further model-based analyses revealed that a standard signal detection model with nonlinear signal transduction and multiplicative internal noise is sufficient to account for the classification image data. Contrary to intuitively appealing accounts based on attention and spatial uncertainty, these findings suggest that search asymmetry is a characteristic of elementary visual processing of multiple items by a nonlinear system. The classification image technique is a valuable tool for investigating search behavior beyond mere visualization of visual features.

Introduction

When we search for an object in a natural setting, such as looking for a friend in a crowd, various factors affect the efficiency of our search behavior. Visual search is a typical experimental paradigm to investigate the underlying mechanisms of search behavior (Treisman & Gelade, 1980; Wolfe, 1994), and search efficiency has served as an important index. An important phenomenon known to affect search efficiency is search asymmetry; for example, searching for a Q among O's is easier than searching for an O among Q's (Treisman & Souther, 1985). Despite a simple exchange of target and distractor, search asymmetry shows a large and robust change in search efficiency and is observed with a wide variety of stimulus sets. The fact that search asymmetry occurs with the same stimulus set and thus cannot be attributed to simple stimulus-specific effects has attracted many researchers' attention. Various hypotheses attribute search asymmetry to either top-down control (Duncan & Humphreys, 1989; Navalpakkam & Itti, 2005; Treisman & Gormican, 1988; Treisman & Souther, 1985; Wolfe, 1994) or a stimulus-driven mechanism (Palmer, Verghese, & Pavel, 2000; Rosenholtz, 1999; Rubenstein & Sagi, 1990). Some accounts postulate that search asymmetry reflects the selection of a different set of target-defining features by switching the targets (Wolfe, 1994), implying that search asymmetry is one behavioral manifestation of top-down control of visual attention. In contrast, other accounts propose that search asymmetry reflects some asymmetry in the distribution of stimulus representation. For example, an account based on signal detection theory (SDT) proposes that differential variance between the target and the distractors induces search asymmetry (Palmer et al., 2000; Rubenstein & Sagi, 1990). A similar, more explicit stimulus-driven account is the statistical saliency map model by Rosenholtz (1999), which explains search asymmetry by a change in the covariance structure of feature space.

Due to the lack of clear definition of visual features, however, the underlying mechanisms of search asymmetry remain unknown. Even with a simple case such as Q's and O's, we still do not know what produces search asymmetry. For example, observers may use the presence of a vertical bar, a crossing, or some other feature. The bar and the crossing are conceptually distinct visual features, but standard experimental procedure, in which stimulus manipulation is independent of responses, had serious difficulty in dissociating these two possible features because these two features are inherently correlated in the geometric structure of the stimulus. To estimate visual features, this study utilized the classification image (CI) technique, which estimates stimulus information used by an observer by taking a cross-correlation between the observer's responses and external noises (Abbey & Eckstein, 2002; Beard & Ahumada, 1998; Neri, 2004). First, standard classification images were constructed, and initial statistical analyses without prior hypothesis were performed using tests based on random field theory (Chauvin, Worsley, Schyns, Arguin, & Gosselin, 2005). The initial analyses revealed that an identical feature was utilized in both O- and Q-search conditions. Then, a region of interest (ROI) was defined to investigate properties of human performance and detailed analyses including hypothesis testing were performed. The detailed analyses revealed important properties of human performance, including the signature of search asymmetry in classification images. Finally, model-based analyses elucidated the mechanisms underlying human performance. Overall, stimulus-driven mechanisms alone can produce search asymmetry, and a simple nonlinear transducer model can account for the properties of human performance.

Observers performed a 4-AFC (4 alternative forced choice) target localization task. Each trial began with 500 ms fixation, followed by 400 ms display presentation, and observers judged the location of the target item by a key press. Feedback on response accuracy was not provided. Each task condition had 15 blocks of 400 trials, so each observer performed 12,000 trials in total. Observers switched search conditions every five blocks, and the order of search conditions was counterbalanced across observers. Each observer performed 3–5 blocks a day.

Data analyses

Behavioral data

The main behavioral measure was threshold contrast. For each block of 400 trials, the threshold was estimated by taking the average of signal contrasts excluding the first 10 trials, whose contrasts were high. Accuracy was also calculated for these 390 trials in each block. Search asymmetry was evaluated to determine whether the mean threshold was significantly higher in the O-search condition than in the Q-search condition for each observer. Additionally, accuracy data were used supplementarily to evaluate the asymmetry in cases where accuracies were not properly equated between search conditions by the staircase procedure. This additional analysis was carried out in Experiment 2.

Classification image analyses

The search task is a 4-AFC task; thus, the classification image (CI) was constructed following a method for a 2-AFC task (Abbey & Eckstein, 2002). The signal-present noise image, denoted as n+, can be defined in the same way as in a 2-AFC case, as the noise image of the target item, but the signal-absent noise image, denoted as n−, needs some modifications. In error trials, the noise of the incorrectly selected item was defined as n− because this image is clearly relevant to observers' choice behavior. In correct trials, however, three distractors were not selected by observers, making different definitions of n− possible. Three definitions were tried: (1) randomly picking the noise image of one of the three distractors, (2) taking the average of the noise images of the three distractors, and (3) taking the noise image that is most similar to the target. Because the correct trials had little effect on the classification image, the three definitions above did not show any significant difference. Therefore, the random pickup method was employed as the definition of n− based on its simplicity. The CI was calculated as

c is the observed proportion correct, and nC+, nC−, nIC+, and nIC− are the noise images of a signal-present item in a correct trial, a signal-absent item in a correct trial, a signal-present item in an incorrect trial, and a signal-absent item in an incorrect trial, respectively.
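The selection of noise images described above can be sketched in Python. This is a minimal illustration rather than the code used in the study: the function name `collect_noise_images`, the array layout (trials × 4 items × rows × columns), and the use of NumPy are all assumptions. Only the random pickup definition of n− is shown, and the final combination of the four mean images weighted by c is deliberately left out.

```python
import numpy as np

def collect_noise_images(noise, target, choice, rng):
    """Return mean noise images for the four trial categories.

    noise  : (n_trials, 4, H, W) external noise added to each item
    target : (n_trials,) index of the target item
    choice : (n_trials,) index of the item the observer selected
    rng    : numpy.random.Generator used for the random pickup
    """
    sums = {k: np.zeros(noise.shape[2:]) for k in ("C+", "C-", "IC+", "IC-")}
    counts = {"C": 0, "IC": 0}
    for n, t, r in zip(noise, target, choice):
        if r == t:
            # Correct trial: n- is one randomly picked distractor.
            distractors = [i for i in range(4) if i != t]
            absent = rng.choice(distractors)
            key = "C"
        else:
            # Error trial: n- is the incorrectly selected item.
            absent = r
            key = "IC"
        sums[key + "+"] += n[t]        # n+ is always the target's noise
        sums[key + "-"] += n[absent]
        counts[key] += 1
    # Divide each sum by the number of trials in its category.
    means = {k: sums[k] / max(counts[k.rstrip("+-")], 1) for k in sums}
    c = counts["C"] / len(target)      # observed proportion correct
    return means, c
```

The returned dictionary holds the mean images keyed as "C+", "C-", "IC+", and "IC-", corresponding to nC+, nC−, nIC+, and nIC− in the text.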

Statistical tests for the CI used a method correcting for multiple comparisons using random field theory (Chauvin et al., 2005; Worsley et al., 1996). This technique treats the classification image as a multiple regression problem and tests the presence of a signal hidden in regression coefficients under the null hypothesis of a random N-dimensional process (i.e., random field). Compared with other tests, this test is not as conservative as the Bonferroni test and can be applied when an a priori expectation is absent, unlike the Hotelling test proposed by Abbey and Eckstein (2002).

The top-down control hypothesis postulates that a visual feature is selected by top-down control depending on the target identity. Some versions assume qualitatively different features such as a bar and a cross-section (Duncan & Humphreys, 1989; Navalpakkam & Itti, 2005; Wolfe, 1994). They can be evaluated by simple visualization of estimated visual features used in O- and Q-search trials. Observed CIs (Figure 3) clearly show estimated visual features as a vertical bar in both O- and Q-search tasks. Statistical testing for significant modulation (Chauvin et al., 2005; Worsley et al., 1996) confirmed that the significant feature is the vertical bar alone (Figure 3B). Thus, observers did not use features such as the cross-section, or the arc at the bottom, rejecting the different feature account.

Although not proposed in the literature, another possible form of top-down control is to assume the same visual feature, such as a bar, with differential tuning of its strength. To examine this possibility and elucidate the mechanisms underlying the obtained human data, more quantitative analyses with decomposition of CIs are necessary.

Methods

Overall image analyses

A recent study revealed that, with high stimulus contrast, obtained CIs reflect almost exclusively the noises in incorrect response trials, unlike those of linear observers (Tjan & Nandy, 2006). In such cases, the obtained CI denotes a noise pattern inducing incorrect responses rather than a linear template. To examine whether this was the case with the current experiment, correct and incorrect images were analyzed separately, followed by random field theory tests.

ROI analyses

For detailed quantitative analyses, the ROI (region of interest) was defined as the region of the vertical bar and the horizontal surrounds (25 × 20 pixels, Figure 4A). Two quantitative analyses were conducted: overall pattern of significant regions and evaluation of vertical bar feature strength.

Detailed image analyses for the region of interest (ROI). Definition of ROI. A matrix of 25 × 20 pixels containing the vertical bar of the Q at the center was analyzed. From this matrix, a linear feature vector was constructed by taking an average of pixel values along each column.

Figure 4


To elucidate the mechanism underlying correct and incorrect images, the image was further decomposed into signal-present and -absent images. For each image, the same test as the whole image based on the random field theory was used to construct a color map representing statistically significant regions. The obtained data were evaluated based on the following three different mechanisms regarding linearity: linearity, task-dependent nonlinearity, and stimulus-dependent nonlinearity (Figures 5A–5C). First, the linear observer predicts equal amplitudes and the opposite sign for signal-present and -absent images (Abbey & Eckstein, 2002). Figure 5A shows the expected pattern of results. By contrast, a significant difference in amplitudes of signal-present and -absent images implies nonlinearity. There are two types of nonlinearity, task-dependent and stimulus-dependent nonlinearity, depending on the definition of “signal.” Task-dependent nonlinearity is based on the usual definition of signal, namely, a target-defining feature. Thus, the target item is treated as having a signal, meaning that in the O-search task, O and Q are considered as signal-present and -absent items, respectively, whereas in the Q-search task, the opposite is the case. In contrast, stimulus-dependent nonlinearity is based on a signal inherent to the stimuli, namely, a distinguishing feature. Thus, independent of the task, Q is always considered as the signal-present item.

Task-dependent and stimulus-dependent nonlinearities can account for the obtained incorrect image data in the following way. Note that nonlinearities can take two forms: Noises cancel the signal in the signal-present image (canceled signal), and noises form a “phantom” signal in the signal-absent image (phantom signal). In task-dependent nonlinearity, signal is defined in terms of the task. Therefore, task-dependent nonlinearity predicts the following patterns: either (1) in the case of canceled signal, O- and Q-search conditions show positive and negative modulations in the signal-present images (O and Q), respectively (Figure 5B, bottom), or (2) in the case of phantom signal, O- and Q-search conditions show negative and positive modulations in the signal-absent images (Q and O), respectively (Figure 5B, top). Intuitively, the former means that observers miss the target items, and the latter means that observers mistakenly perceive a nontarget as the target. The account by stimulus-dependent nonlinearity is simpler: either (1) regardless of the task, the signal-present images (Q) show negative modulation (Figure 5C, top) or (2) the signal-absent images (O) show positive modulation (Figure 5C, bottom). According to the stimulus-dependent nonlinearity, observers mistakenly perceive either a Q as an O or an O as a Q, regardless of the search target. The task-dependent nonlinearity implies top-down control. By contrast, stimulus-dependent nonlinearity is determined solely by stimulus properties, which implies a stimulus-driven mechanism.

Bar feature strength analysis

The initial classification image analyses showed that the only significant feature was the vertical bar of the Q stimulus, regardless of the search target. To explore the image structure more quantitatively, a linear feature vector was constructed from the ROI matrix by taking an average of pixel values along each column (Figure 4). This 20-dimensional vector represents the strength of modulation by the bar feature. To test whether the modulation magnitudes in the O- and Q-search conditions were different, Hotelling's two-sample test for differences was conducted with the central four dimensions covering the bar location (Abbey & Eckstein, 2002).
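The column averaging and the two-sample comparison can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis code of the study: the function names are hypothetical, the ROI is assumed to be stored as a 25-row by 20-column array, the bar is assumed to occupy columns 9 to 12 (1-based), and the textbook form of Hotelling's two-sample statistic with a pooled covariance matrix is used. Only the F value and its degrees of freedom are computed; the p value would come from the F distribution.

```python
import numpy as np

BAR_COLUMNS = slice(8, 12)  # columns 9-12 (1-based), assumed bar location

def column_feature(roi):
    """Average a (25, 20) ROI along each column, giving a 20-element vector."""
    return np.asarray(roi, dtype=float).mean(axis=0)

def hotelling_two_sample(x, y):
    """Hotelling's two-sample T^2 statistic, expressed as an F value.

    x, y : (n_x, p) and (n_y, p) arrays, one feature vector per trial.
    Returns (F, df1, df2) with df1 = p and df2 = n_x + n_y - p - 1.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_x, p = x.shape
    n_y = y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled covariance matrix of the two samples.
    pooled = ((n_x - 1) * np.cov(x, rowvar=False)
              + (n_y - 1) * np.cov(y, rowvar=False)) / (n_x + n_y - 2)
    t2 = (n_x * n_y) / (n_x + n_y) * diff @ np.linalg.solve(pooled, diff)
    df2 = n_x + n_y - p - 1
    f_value = t2 * df2 / (p * (n_x + n_y - 2))
    return f_value, p, df2
```

With per-trial feature vectors from the two search conditions stacked in `x` and `y`, the test on the bar location would use `x[:, BAR_COLUMNS]` and `y[:, BAR_COLUMNS]`, so that p = 4.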

Results and discussion

Significant image regions were observed almost exclusively with the incorrect trials (Figure 6). Thus, the dark bar CI in the O-search condition should be interpreted as dark pixel noises at the location of the bar inducing incorrect responses.

Because significant regions were obtained only with incorrect images, ROI analyses are reported only for incorrect trials. Figure 7 shows color maps of the overall pattern analyses. Two findings stand out. First, the data of all three human observers unequivocally support the stimulus-dependent nonlinearity. Color maps of image magnitude normalized for observed standard error showed significant negative modulations of pixels at the bar location for Q stimuli, consistent across three observers (Figure 7). For O stimuli, however, only sporadic weak modulations were observed, and they were inconsistent across observers. The darker pixel noise at a vertical bar of Q induced an error, whereas the lighter noise at the corresponding location of O did not, despite the fact that both were equally likely. Second, the intensity of the negative modulation was stronger in the O-search condition than in the Q-search condition. This is a direct reflection of search asymmetry in the CIs. A linear feature summarizing the modulation magnitude of the vertical bar feature (Figure 8) showed that the amplitude of negative modulation with a Q stimulus was significantly stronger in the O-search condition than in the Q-search condition for all three observers (Figure 8; F(4, 2860) = 3.809, F(4, 3065) = 3.352, F(4, 3153) = 8.020, for YU, TI, and JS, respectively; all p < .01, Hotelling's two-sample test for differences). One should note that the pattern of amplitude difference between O-search and Q-search tasks appeared to be opposite to that in many classification image studies, which show that higher amplitudes correlate with better performance. This reversal likely reflects nonlinearity in human search behavior, and a possible account for the reversal is proposed in the model-based analysis section.

Taken together, external noise only affected observers' incorrect responses, and regardless of the search target, observers made errors by perceiving a bar with dark noise as no-bar, but not vice versa. This higher sensitivity to dark noise on Q than to bright noise on O provides important qualifications on the process of feature detection. This pattern reflects that observers use the erroneous information of the dark noise on Q, but not the information of the bright noise on O, in their judgment, which apparently is inconsistent with an intuitively appealing claim that the presence of a feature is easier to detect than its absence (Treisman & Souther, 1985). According to this claim, the expected pattern of results is opposite to the observed data, such that bright noises on the bar location lead to erroneous bar detection.

One should note that the stimulus-dependent nonlinearity discussed here is not mere physical asymmetry in stimuli. One may argue that bar detection is inherently asymmetrical because a bar is defined as a conjunction of many pixels in a specific configuration, making the judgment of the presence of a bar more difficult than of its absence. However, this fact does not imply the observed pattern of stimulus-dependent nonlinearity at all. Rather, depending on observers' criteria, observed CI structure can take various forms. When the criterion is quite high, such that even a few canceled pixels lead to the judgment of bar absence, the expected result matches the observed data. On the contrary, when the criterion is quite low, such that even a few uncanceled pixels lead to the judgment of bar presence, the expected result is now opposite to the observed data, O being more sensitive to noises than Q. Thus, a particular pattern of stimulus-dependent nonlinearity is a psychophysical property, not simply a physical one, in the sense that it depends on observers' information processing.

One important aspect of the observed stimulus-dependent nonlinearity is the lack of task-dependent effect. Although a particular pattern of CI structure is in principle sensitive to observers' decision criteria, the factor of the search target has no effect, suggesting that search asymmetry does not reflect changes in observers' criteria of detection of the bar feature depending on the target identity. One remaining question is whether differential modulation magnitudes between O- and Q-search tasks, the signature of search asymmetry in CI, reflect top-down control; this will be addressed in Experiment 2.

Experiment 2: Singleton search experiment

Although the stimulus-dependent nonlinearity is at odds with top-down control, the difference in negative modulation magnitudes between O- and Q-search conditions may reflect top-down tuning of the target-defining feature. Observers may use the identical feature but may change its tuning strategically depending on the target identity. To test this possibility, a search experiment of O and Q was conducted in a singleton search setting (Saiki, Koike, Takahashi, & Inoue, 2005). Here, trials with 1 O and 3 Q's and with 1 Q and 3 O's were randomly mixed within each block and observers identified the location of a singleton target item.

Methods

Observers

Three observers (JS, YU, and AK) including the author participated in the experiment.

Materials and procedure

Materials and procedure were identical to those in the target-defined search experiment except for the following changes. Within each block, O- and Q-search trials were randomly mixed and observers judged the location and identity of the singleton target. Observers first judged the location of the target as in Experiment 1, followed by the judgment of target identity with a key press. The number keys 1 and 2 were assigned to O and Q, respectively. There were 30 blocks, each with 200 O-search and 200 Q-search trials.

Data analyses

The same analyses were used as in Experiment 1, except that the analyses were restricted to trials with correct singleton identification.

Unlike in previous experiments, the target stimulus was unpredictable beforehand, so observers were highly unlikely to change feature-tuning depending on target identity. If search asymmetry reflects top-down feature tuning, differential feature modulation would be eliminated or greatly reduced in this experiment. By contrast, if search asymmetry is stimulus driven, the same pattern of asymmetry would be observed.

Overall, the singleton search task showed the same pattern of CI data. The overall CI data in Figure 10 and the ROI analyses for incorrect trials in Figure 11 both show a pattern almost identical to that in Experiment 1. A linear feature analysis (Figure 12) again showed a stronger amplitude of negative modulation with a Q stimulus in the O-search trials than in the Q-search trials for all three observers (F(4, 1541) = 5.020, F(4, 1803) = 7.337, and F(4, 1712) = 6.238 for YU, AK, and JS, respectively; all p < .01, Hotelling's two-sample test for differences), suggesting that the search asymmetry observed here is stimulus driven.

Image structure of ROI for Q and O stimuli in the Q- and O-target trials in Experiment 2. Trials with incorrect target identification were excluded. Analyses and graph format are the same as in Figure 7.

Figure 11


The classification image analyses above were based on trials with correct target identification. What about the classification image for trials with incorrect target identification? Figure 13 shows the results of ROI analysis for incorrect trials with incorrect target identification. The significant structure across three observers was a positive modulation at the vertical bar location with Q-stimulus in O-target trials. This means that when bright noise makes the vertical bar of one of the three Q stimuli more salient, observers tend to select the Q-stimulus mistakenly and to identify it as the singleton target. Consistent with data with correct target identification trials, noises on O-stimuli did not contribute to observers' responses. However, the overall magnitude of this modulation was weaker than the modulations for trials with correct target identification. Thus, if classification images were formed irrespective of target identification performance, the structure with correct target identification trials dominated. Given that the numbers of trials with correct and incorrect target identification within trials with incorrect target localization were comparable, the weaker classification image structure probably reflects that noises with all three distractors jointly contribute to observers' judgment in trials with incorrect identification. In the case of O-target trials, for example, the two unselected Q stimuli should be misidentified as O's when the target was misidentified as Q, meaning that noise information leading to incorrect judgment is more distributed across search items in trials with incorrect identification than in those with correct identification. This fact can partly justify the analyses used in this study excluding trials with incorrect identification.

Image structure of ROI for trials with incorrect target identification in Experiment 2. Analyses and graph format are the same as in Figure 7. Note that the scale of the color map differs from that in previous figures.

Figure 13


Further analyses of the stimulus-driven mechanism with mathematical models

The behavioral data suggest that search asymmetry between O and Q is exclusively determined by stimulus-driven mechanisms. What kinds of stimulus-driven mechanisms are responsible for the search asymmetry? Three aspects of the observed data were particularly important for this question: (1) significantly larger modulation amplitudes with O-target trials than with Q-target trials, which is the signature of search asymmetry; (2) a nearly identical pattern of data between target-defined and singleton search tasks, which is evidence for a stimulus-driven mechanism; and (3) a strong stimulus-dependent nonlinearity.

Simulation models

To evaluate which properties may explain the observed classification image data, I set up a series of mathematical models and conducted numerical simulations. All of the models were based on signal detection theory (Green & Swets, 1966). See Figure 14 for basic architecture and common properties.

A linear filter was applied to a set of search items, and internal noises were added to the filter's outputs, which were sent to the decision mechanism.

Input to the model

For the sake of simplicity, luminance values in the ROI of each item (the 20 × 25 pixel matrix around the vertical bar location, shown in Figure 4) were applied to the model. Mean luminance of the display was set to 0, so that brighter and darker noise pixels had positive and negative values, respectively. Signal intensity was also positive. Input to the model, denoted as I, was a mixture of signal, denoted as S, and Gaussian white noise, denoted as Ne. The values used for each observer were used for simulations.

$$I = S + N_e, \tag{2}$$

$$S = \begin{cases} \alpha S_o, & \text{if stimulus is O} \\ \alpha S_q, & \text{if stimulus is Q} \end{cases} \tag{3}$$

in which $\alpha$ is the signal intensity, which was adjusted to the threshold level in experiments using the QUEST procedure on a trial-by-trial basis. $S_o$ and $S_q$ are signal templates for O and Q stimuli, respectively.

$$N_e \sim N(0, \sigma_e). \tag{4}$$

Linear filter

An ideal bar detector was used. This is simply a filter taking an average of luminance values of pixels covering the bar location. In matrix form, denoted as $F$, the four central columns (columns 9 to 12) are 1 and all other elements are 0. The output of the $i$th item's filter, denoted as $o_i$, is a scalar:

$$o_i = I_i * F', \tag{5}$$

in which $I_i$ is the input data for the $i$th search item.
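Equations 2 to 5 can be sketched in a few lines of Python. This is a hedged illustration only: the templates `S_q` and `S_o` below are hypothetical stand-ins (a uniform bar for Q and an empty ROI for O), σe is treated as a standard deviation, and the filter output is computed as the mean luminance over the pixels covered by F, following the verbal description of the detector as an averaging filter.

```python
import numpy as np

ROWS, COLS = 25, 20            # ROI size (rows x columns)
BAR = slice(8, 12)             # columns 9-12 (1-based)

# Hypothetical signal templates: Q has a bar in the ROI, O has none.
S_q = np.zeros((ROWS, COLS))
S_q[:, BAR] = 1.0
S_o = np.zeros((ROWS, COLS))

# Filter F: ones on the four bar columns, zeros elsewhere.
F = np.zeros((ROWS, COLS))
F[:, BAR] = 1.0

def make_input(stimulus, alpha, sigma_e, rng):
    """I = S + N_e for one search item (Equations 2 to 4)."""
    template = S_q if stimulus == "Q" else S_o
    noise = rng.normal(0.0, sigma_e, size=(ROWS, COLS))
    return alpha * template + noise

def filter_output(I):
    """Scalar o_i: mean luminance over the pixels covered by F."""
    return float((I * F).sum() / F.sum())
```

For example, `filter_output(make_input("Q", 0.5, 0.0, rng))` returns the signal intensity itself when no external noise is added, which is a quick sanity check on the filter.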

Internal noise

Additive and multiplicative noises were used (Dosher & Lu, 2000). Additive noise, denoted as Na, was random Gaussian noise with zero mean and a fixed variance (σa) added to each filter's output independently. Multiplicative noise, denoted as Nm, was another set of random Gaussian noise with zero mean and variance proportional to the mean of four filters' outputs. σa and σm were two free parameters to adjust the model's performance to each human observer. Output of the ith item sent to the decision mechanism, denoted as xi, is

$$x_i = o_i + N_a + N_m, \tag{6}$$

$$N_a \sim N(0, \sigma_a), \tag{7}$$

$$N_m \sim N(0, \sigma'_m), \quad \sigma'_m = \sigma_m \sum_{i=1}^{4} o_i. \tag{8}$$
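The internal noise stage of Equations 6 to 8 might be implemented as below. This is a sketch under assumptions: the function name is hypothetical, σa and σ′m are treated as standard deviations, and the absolute value of the summed outputs is taken so that the noise scale stays non-negative when external noise drives the sum below zero (a guard that is not described in the text).

```python
import numpy as np

def add_internal_noise(o, sigma_a, sigma_m, rng):
    """x_i = o_i + N_a + N_m for the four filter outputs o."""
    o = np.asarray(o, dtype=float)
    # Additive noise: fixed spread, independent for each item.
    n_a = rng.normal(0.0, sigma_a, size=o.shape)
    # Multiplicative noise: spread scales with the summed outputs.
    # abs() is an added guard, since the sum can be negative.
    sigma_m_prime = sigma_m * abs(o.sum())
    n_m = rng.normal(0.0, sigma_m_prime, size=o.shape)
    return o + n_a + n_m
```

Because the multiplicative term depends on the sum over all four items, a display with three feature-present items receives roughly three times the noise spread of a display with one, independent of which item is the target.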

Decision mechanism

The decision mechanism takes xi's from four items and decides which item is the target using the following rule. The decision rule is basically the extreme value rule, which is a standard one in models based on signal detection theory. In the Q-search and O-search conditions, the item with the maximum and the minimum value, respectively, is selected; the selected item is denoted as T.

When T is the target, the model made a correct choice; otherwise, the model's response is incorrect. In incorrect trials, item T corresponds to a signal-absent item, and the item with target stimulus corresponds to a signal-present item. Noises of these items are stored and submitted to the classification image analyses as detailed below.

In the singleton search task, deviations of maximum and minimum from the average were calculated, and the item with the larger deviation was selected as the singleton target. When the maximum had the larger deviation, the singleton target was identified as Q, and when the minimum had the larger deviation, the singleton target was identified as O. One should note that the model for the singleton search task is not optimal. The optimal model would choose the location with maximum likelihood of the presence of a target at one location and the distractors at the other locations, over all locations and both targets, given the outputs at the four locations (for an ideal observer approach to the singleton search task, see Schoonveld, Shimozaki, & Eckstein, 2007).
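The two decision rules can be written compactly. The following is a minimal sketch with hypothetical function names; the handling of an exact tie between the two deviations (resolved here in favor of O) is an arbitrary assumption, since the text does not specify it.

```python
import numpy as np

def decide_target_defined(x, condition):
    """Extreme value rule: max in Q-search, min in O-search."""
    x = np.asarray(x, dtype=float)
    return int(np.argmax(x)) if condition == "Q" else int(np.argmin(x))

def decide_singleton(x):
    """Pick the item deviating most from the mean of the four outputs.

    Returns (location, identity), identity being "Q" or "O".
    """
    x = np.asarray(x, dtype=float)
    up = x.max() - x.mean()      # deviation of the maximum
    down = x.mean() - x.min()    # deviation of the minimum
    if up > down:
        return int(np.argmax(x)), "Q"
    return int(np.argmin(x)), "O"
```

For instance, with outputs `[0.0, 1.0, 0.1, 0.0]` the maximum deviates most from the mean, so the singleton rule selects item 1 and labels it Q.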

Instead of the linear filter at the exact location of the vertical bar, a bank of filters, denoted as FU, was placed. In the simulations, I used a set of nine filters that were located with a horizontal offset of 2 pixels, so that the entire ROI region was covered by filters.

$$F_{Uj} = \begin{cases} \vec{1}, & \text{columns } 2(j-1)+d,\ d = 1, 2, 3, 4 \\ \vec{0}, & \text{all other columns} \end{cases} \tag{11}$$

For each of the four search items, the maximum or the minimum value of the nine filters' outputs was selected in the Q-search or O-search task, respectively. The selected value was sent to the decision mechanism after additive and multiplicative internal noises were added.

$$o_i = \begin{cases} \max_j (I_i * F'_{Uj}), & \text{in Q-search} \\ \min_j (I_i * F'_{Uj}), & \text{in O-search} \end{cases} \tag{12}$$
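A filter bank of this kind can be sketched as follows, assuming a 25-row by 20-column ROI and, as in the single-filter case, treating each filter's output as the mean luminance under the filter. The function names and the normalization are assumptions for illustration.

```python
import numpy as np

ROWS, COLS, WIDTH, STEP, N_FILTERS = 25, 20, 4, 2, 9

def make_filter_bank():
    """Nine filters; filter j (1-based) covers columns 2(j-1)+1 .. 2(j-1)+4."""
    bank = np.zeros((N_FILTERS, ROWS, COLS))
    for j in range(N_FILTERS):
        bank[j, :, STEP * j : STEP * j + WIDTH] = 1.0
    return bank

def uncertain_output(I, bank, condition):
    """o_i: max (Q-search) or min (O-search) over the nine filter outputs."""
    # Mean luminance under each filter (assumed normalization).
    outputs = (bank * I).sum(axis=(1, 2)) / bank.sum(axis=(1, 2))
    return float(outputs.max() if condition == "Q" else outputs.min())
```

With this indexing, the fifth filter (j = 5) covers columns 9 to 12 and therefore coincides with the single bar detector, while the first and ninth filters reach the left and right edges of the ROI.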

Nonlinear transducer model (Figure 15C)

This model is the same as the linear transducer model except for the introduction of nonlinear transformation of the output of the linear filter. The quadratic function transformation preceded by half-wave rectification was applied before the additive and the multiplicative noises were added. The multiplicative noises were proportional to the nonlinearly transformed values.

$$o_i = \left(\lceil I_i * F' \rceil_{+}\right)^2, \tag{13}$$

in which $\lceil \cdot \rceil_{+}$ denotes half-wave rectification.
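The transducer of Equation 13 is a one-line operation on the linear filter output. The sketch below uses a hypothetical function name and takes the linear output as already computed.

```python
import numpy as np

def nonlinear_transducer(linear_output):
    """Half-wave rectify, then square: o_i = (max(0, I_i * F'))^2."""
    rectified = np.maximum(linear_output, 0.0)
    return rectified ** 2
```

Because the slope of the squared output is twice the rectified input, the same perturbation of the linear output changes the transduced value more for an item whose linear output is large than for one whose output is near zero. For example, a perturbation of ±0.1 around 1.0 moves the output between 0.81 and 1.21, whereas the same perturbation around 0 moves it between 0 and 0.01.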

Simulation procedure and data collection

A series of simulations was conducted with each observer's data. A complete set of noises and signal contrast data were used to reconstruct stimuli for all trials. First, the values of two free parameters were estimated to match the model's accuracy with that of an observer as closely as possible. After the best parameter values were found, simulation runs of the entire stimulus set were repeated 10 times for each observer's data. For all observers and search conditions, mean proportions correct of 10 simulation runs were not significantly different from the observed human data.

For each simulation run, classification image analyses were conducted. Given that the model's ideal bar detector was a simplification, pixel-by-pixel distributions of classification images were beyond the scope of the modeling. Thus, a summarized linear feature vector was defined for comparisons between humans and models. Because the bar was four pixels wide, which was also the width of the model's bar detector, I used four columns of pixels as a unit of analysis. Therefore, instead of a 20-dimensional vector holding the average of each column of pixels, four columns were averaged to produce a 5-dimensional vector. The five elements denote averaged noise intensities at far left surround, near left surround, the bar location, near right surround, and far right surround. As in the human data, signal-present and signal-absent images for incorrect trials were constructed based on the model's classification responses. Simulation results shown in Figure 16 denote results of a representative simulation run, in which simulated proportions correct for both Q-search and O-search conditions had less than 0.5% difference from the observed human data. Summary results shown in Figures 17 and 18 are the results from 10 simulation runs.
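The reduction from 20 to 5 dimensions amounts to averaging adjacent groups of four columns. A minimal sketch, with a hypothetical function name and assuming the 20-element column-average vector is already available:

```python
import numpy as np

def summarize_feature(v20):
    """Average a 20-element column vector in adjacent groups of four.

    Returns five values: far left, near left, bar, near right, far right.
    """
    v20 = np.asarray(v20, dtype=float)
    return v20.reshape(5, 4).mean(axis=1)
```

With the bar at columns 9 to 12, the third element of the result corresponds to the bar location.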

The results of one model's 10 simulation runs using noises for one observer can be considered as results from 10 randomly sampled observers with the same parameters. In evaluating a model's behavior, I tested whether human data can be considered as a sample from the same population. If the model properly simulates human data, the human data should not significantly deviate from the simulation data, whereas if the model fails in simulating human data, a significant deviation should be expected. For this purpose, summary statistics of human data for a single observer and those of 10 simulation runs with the same noise stimuli were treated as 11 data points, and Grubbs' (1950) test was used to check whether the human data were an outlier. Grubbs' test is used to detect an outlier in a univariate data set assuming a normal distribution. In comparisons of simulation data with fixed values (e.g., difference in magnitudes is 0), a t test was used.
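The outlier check can be illustrated with the Grubbs statistic computed for the human datum. This sketch is an assumption about how the comparison might be set up, not the procedure as implemented in the study: the function name is hypothetical, and only the statistic is computed. The critical value, which depends on the t distribution with N − 2 degrees of freedom, would be looked up separately.

```python
import numpy as np

def grubbs_statistic(human_value, simulated_values):
    """Grubbs-type statistic for the human datum among all data points.

    G = |x_human - mean| / s, with the mean and the sample SD taken over
    the pooled set (1 human value + the simulation runs).
    """
    pooled = np.append(np.asarray(simulated_values, dtype=float), human_value)
    return abs(human_value - pooled.mean()) / pooled.std(ddof=1)
```

Note that the classical test applies to the most extreme point in the sample, so this statistic matches Grubbs' G only when the human value is the point farthest from the pooled mean.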

Results

To show the characteristics of human data explicitly, the analyses started with the simple linear filter model. A standard signal detection model of visual search with linear signal transducer cannot account for search asymmetry (Eckstein, 1998; Palmer et al., 2000; Verghese, 2001) in terms of signal contrast threshold or accuracy due to its inherent symmetric structure. A simple way to implement asymmetry is to introduce asymmetric internal noise (Palmer et al., 2000), such that the more difficult O-search task has larger internal noises. However, the identical classification image results between target-defined and singleton search tasks impose a constraint that different internal noises should be stimulus driven. Here, the introduction of multiplicative noise proportional to the sum of transducer outputs of four search items satisfied this constraint (Figure 14). In the singleton search, the signal-to-noise ratio depends on stimulus configuration, not task set, and the multiplicative noise implemented this characteristic. Displays with 3 Q's (i.e., O-search condition) have larger noise than displays with 3 O's, giving the O-search condition higher internal noise than the Q-search condition, not because of target identity but because of the number of feature-present items. A signal detection model with display-dependent internal noise, which is the implementation of a differential stimulus variability account of search asymmetry (Palmer et al., 2000; Rosenholtz, 1999; Rubenstein & Sagi, 1990), could successfully simulate the accuracy data.
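The display dependence of the multiplicative noise can be illustrated with a small sketch. Everything specific here is an assumption made for illustration rather than a detail taken from the model in Figure 14: the linear transducer, the unit response to a bar, the gain `k`, the additive baseline `base_sd`, and the rule for combining the additive and multiplicative components.

```python
from math import sqrt

def linear_transducer(x):
    """Linear signal transduction (identity), as in the simple linear filter model."""
    return x

def internal_noise_sd(display, transducer, k=0.5, base_sd=1.0):
    """Internal noise SD for one display.

    Assumed form: a fixed additive component (base_sd) combined with a
    multiplicative component proportional to the summed transducer outputs
    of all items. k and base_sd are hypothetical values.
    """
    summed_output = sum(transducer(x) for x in display)
    return sqrt(base_sd**2 + (k * summed_output) ** 2)

# Hypothetical detector responses: 1.0 for an item with a bar (Q), 0.0 without (O).
Q, O = 1.0, 0.0
q_search_display = [Q, O, O, O]   # Q target among three O's
o_search_display = [O, Q, Q, Q]   # O target among three Q's

sd_q = internal_noise_sd(q_search_display, linear_transducer)
sd_o = internal_noise_sd(o_search_display, linear_transducer)
print(sd_q, sd_o)
```

The noise level depends only on how many feature-present items the display contains, so the same function applies unchanged to target-defined and singleton search.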

Importantly, however, the linear model could not reproduce decomposed CI data in two important respects (Figure 16B). First, the linear transducer is sensitive to both dark noises canceling the bar and bright noises making a “phantom” bar for the O stimulus, which was never observed with humans. Second, a difference between O- and Q-search conditions, the signature of search asymmetry, was observed with CI for the O stimulus, which was contrary to the observed data. Thus, classification image data clearly indicate that the previously proposed differential variability account is insufficient to explain human search behavior.

Another point worth mentioning with the linear observer model is that the predicted asymmetry in modulation amplitude positively correlates with task performance: Higher amplitudes correlate with better performance. This is contrary to the observed classification data, suggesting that the reversal of amplitude difference in the observed data is related to nonlinearity.

The stimulus-dependent nonlinearity could ameliorate the discrepancies between model and data. The stimulus-dependent nonlinearity, according to the literature, may reflect spatial uncertainty and nonlinear signal transduction (Abbey & Eckstein, 2006; Tjan & Nandy, 2006), which are related to the accounts of the lack of a feature map (Treisman & Souther, 1985), and of differential variability of search display (Palmer et al., 2000; Rosenholtz, 1999; Rubenstein & Sagi, 1990), respectively. The accounts of the lack of a feature map can be considered as spatial uncertainty because, logically speaking, a map for the absence of a feature denotes all locations without the feature, which usually covers most of the visual field. Spatial uncertainty in this situation is high, because only a tiny subset of the “feature-absence map” corresponds to locations of objects without the feature. Nonlinear signal transduction will lead to differential variability among search items because the stimuli with higher intensity (e.g., signal plus noise image of the ROI in Q-stimulus) will have larger variance after accelerating nonlinear transduction than those with lower intensity (e.g., signal plus noise image of the ROI in O-stimulus). Thus, the evaluation of nonlinearity is important for understanding the determinant of search asymmetry. First, as Figure 16C shows, a spatial uncertainty model (Manjeshwar & Wilson, 2001; Pelli, 1985) eliminated the “phantom” bar, a positive peak with the O stimulus, reflecting the effect of uncertainty of the bar location. Unlike the human data, however, the model showed significant positive modulation for an O stimulus around the bar location (Figures 16C and 17). Grubbs' (1950) test for outliers revealed that all three observers' data deviated significantly from the uncertainty model in both O- and Q-search conditions (O-search: G = 2.614, G = 2.811, G = 2.790; Q-search: G = 2.729, G = 2.894, G = 2.653, for YU, TI, and JS, respectively; all p < .01). More importantly, differences in modulation magnitude with the Q stimulus between O- and Q-search conditions, the signature of search asymmetry, were not significant (Figure 18, two-tailed t test, t(9) = 0.095, p = .92, t(9) = 2.00, p = .076, and t(9) = 1.87, p = .095 for YU, TI, and JS, respectively). Overall, spatial uncertainty could not account for the observed structure of human CI data, suggesting that hypotheses postulating spatial uncertainty as a critical factor are insufficient to account for the current data.
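The statement above that an accelerating transducer gives higher-intensity stimuli a larger output variance can be checked numerically. The sketch below assumes a quadratic transducer applied to a Gaussian input. The mean and SD values are hypothetical, and the closed form used is the standard result Var(x²) = 4μ²σ² + 2σ⁴ for x ~ N(μ, σ²).

```python
import random

def quadratic_output_variance(mu, sigma):
    """Exact variance of x**2 when x ~ Normal(mu, sigma**2)."""
    return 4 * mu**2 * sigma**2 + 2 * sigma**4

def simulated_output_variance(mu, sigma, n=100_000, seed=0):
    """Monte Carlo estimate of the variance of x**2 (sample variance)."""
    rng = random.Random(seed)
    outputs = [rng.gauss(mu, sigma) ** 2 for _ in range(n)]
    mean = sum(outputs) / n
    return sum((y - mean) ** 2 for y in outputs) / (n - 1)

# Hypothetical detector inputs: same noise SD, different mean intensity.
sigma = 0.5
var_bar_present = quadratic_output_variance(1.0, sigma)  # e.g., ROI of a Q
var_bar_absent = quadratic_output_variance(0.0, sigma)   # e.g., ROI of an O
print(var_bar_present, var_bar_absent)
print(simulated_output_variance(1.0, sigma))
```

Under these assumptions the input variance is identical for the two stimuli, so the difference in output variance arises entirely from the transducer.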

According to the nonlinear transducer model, the reversed amplitude difference with the Q-stimulus is due to the interaction of accelerating transducer and probability summation. First, accelerating transduction reduces modulation with the O-stimulus. Modulation with the Q-stimulus was mainly determined by the number of Q's in the search display. Due to probability summation, the O-search condition with 3 Q's tends to have more extreme noise cancelling the bar than the Q-search condition with 1 Q.
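The probability summation argument rests on a simple property of extreme values: the most negative of several independent noise samples is, on average, more extreme than a single sample. The following sketch illustrates only that property with standard normal samples. It is not the simulation used in the study, and the trial count and seed are arbitrary.

```python
import random

def mean_most_negative(n_items, n_trials=50_000, seed=1):
    """Average, over trials, of the most negative of n_items independent
    standard normal noise samples (one sample per bar-containing item)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        total += min(rng.gauss(0.0, 1.0) for _ in range(n_items))
    return total / n_trials

one_q = mean_most_negative(1)     # Q-search display: one Q
three_qs = mean_most_negative(3)  # O-search display: three Q's
print(one_q, three_qs)
```

For three standard normal samples the expected minimum is -3/(2√π) ≈ -0.85, whereas for a single sample it is 0, so a display with three Q's is more likely to contain strong bar-cancelling noise on at least one item.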

Abbey and Eckstein (2006) proposed a different nonlinear transducer model for their studies. They used a sigmoidal transducer function and implemented early and late nonlinearities to provide a more quantitatively precise account of the human classification image data. In contrast, precise quantitative simulation is beyond the scope of this study; thus, the quadratic transducer proposed in this study should be considered as an example of an accelerating nonlinear function. The phenomena addressed in this study may reflect characteristics of the accelerating portion of the sigmoidal transducer function in Abbey and Eckstein. Further studies are necessary to clarify the system's characteristics in more detail.

Although the simulation results favor the nonlinear transducer model, one may argue that the failure of the uncertainty model is due to the particular model parameters selected for the simulation and does not reflect a general property of the uncertainty model. To address this issue, I examined how the degree of uncertainty affects the model's simulation results, focusing on two aspects: (1) positive modulation for an O stimulus and (2) differences in magnitude of negative modulation for a Q-stimulus between O- and Q-search tasks. A new simulation was conducted with an uncertainty model with higher uncertainty. The region of uncertainty was enlarged from a 25 × 20 pixel matrix to a 33 × 28 pixel matrix, which was covered by a filter bank with 39 filters (horizontal offsets of 2 pixels and vertical offsets of 4 pixels). As for the positive modulation for an O-stimulus, the increase in spatial uncertainty reduced the extent of the positive modulation. Mean positive modulation magnitude was significantly decreased in the high uncertainty model (Figure 19A, two-tailed t test, O-search: t(18) = 2.18, p < .05, t(18) = 4.03, p < .001, and t(18) = 4.44, p < .001; Q-search: t(18) = 4.53, p < .001, t(18) = 3.08, p < .01, and t(18) = 2.25, p < .05 for YU, TI, and JS, respectively). This result was consistent with those reported by Tjan and Nandy (2006). In contrast, the increase in spatial uncertainty did not affect the difference in negative modulation for the Q-stimulus between the two search conditions (Figure 19B, two-tailed t test, t(18) = 0.16, p > .1, t(18) = −0.74, p > .1, and t(18) = 1.20, p > .1 for YU, TI, and JS, respectively). The lack of improvement in simulation with increased uncertainty suggests that the uncertainty model has difficulty in accounting for the critical aspect of classification image data.
Taken together, the human data with the O-stimulus could be accounted for both by nonlinear transducer and by spatial uncertainty models, but the nonlinear transduction appears to be necessary to account for the difference in negative magnitudes with the Q-stimulus, the signature of search asymmetry. This additional analysis, however, did not include the exhaustive search in parameter space; thus, further analyses are necessary to draw a firm conclusion about the models' plausibility.

It is important that this study clarified how the differential variability emerges using empirical data. Because of the lack of a method for estimating visual features and underlying mechanisms of search behavior, various hypotheses about the search asymmetry remained unsettled. The CI analyses in this study unambiguously clarified the mechanism of search asymmetry between O and Q with solid support by the empirical data. In particular, the differential variability is shown to be the product of both internal noise and nonlinear signal transduction, which could be shown only through classification image analyses.

How general is the stimulus-driven mechanism observed with the O and Q search? The findings with the O and Q search can be generalized as a differential signal-to-noise ratio (SNR) of stimuli. As stated above, accelerating nonlinearity leads to higher variability for the Q stimulus than for the O stimulus, making the SNR for the Q-stimulus smaller. Thus, the O-search display has more items with small SNR (three Q's) than the Q-search display (one Q). This differential SNR leads to search asymmetry between O- and Q-search tasks: The O-search task with smaller SNR is more difficult than the Q-search task. This differential SNR account can be applied to various complex stimuli showing asymmetry depending on observers' knowledge, such as letters, faces, and objects (Shen & Reingold, 2001; Wang, Cavanagh, & Green, 1994). These cases of search asymmetry can be summarized as follows: Searching for an unfamiliar item among familiar items is more efficient than searching for a familiar item among unfamiliar items. A perceptual template for an unfamiliar stimulus may be less efficient than that for a familiar stimulus, leading to a smaller SNR for an unfamiliar stimulus. A search display with fewer small-SNR items (i.e., one with an unfamiliar target) is thus searched more efficiently than a display with more small-SNR items (i.e., one with a familiar target).

The same logic can be applied to so-called “standard-deviation” cases. According to the standard-deviation account, search asymmetry with simpler stimuli, such as vertical and tilted lines, is characterized by the finding that searching for a standard stimulus (e.g., vertical) among its deviations (e.g., tilted) is more difficult than vice versa. This result is explained by asymmetric tuning of feature detectors. For example, Treisman and Gormican (1988) assume that the standard vertical detector has wider tuning than the deviated tilt detector. Assuming that observers use vertical and tilt detectors for vertical and tilted targets, the detectors' output will be more differentiated with the narrowly tuned tilt detector than with the broadly tuned vertical detector (Treisman & Gormican, 1988). This original standard-deviation account is inconsistent with the current data showing the use of the same template (detector) for O- and Q-search tasks. The differential SNR idea can provide an alternative account in the following way. With an assumption that templates for deviation stimuli are less efficient, the prediction is a more efficient search for the smaller SNR target (deviation) than for the larger SNR target (standard), consistent with the empirical data. In the literature, familiarity-based search asymmetry with complex stimuli and “standard-deviation” asymmetry with simpler stimuli have been treated as separate types (Treisman & Gormican, 1988; Wolfe, 1994), but the differential SNR account may integrate both of them in terms of the SNR of stimulus templates. More importantly, the CI technique now provides a way to test these hypotheses.

Another issue for further study is to evaluate the current model in more typical visual search settings: tasks with higher uncertainty and with suprathreshold items. The underlying mechanism might be identical to that in the current study if higher spatial uncertainty functions as external noise for suprathreshold items. Alternatively, some different mechanisms might be involved. Classification image techniques can provide concrete hypotheses and methods for testing these alternatives, if care is taken to deal with nonlinearities. To date, neither behavioral measures, such as response time and accuracy, nor functional brain imaging data have been useful in these regards, which makes the CI technique promising for solving these problems.

Conclusion

The current study provides direct evidence bearing on an unresolved issue in the literature on visual cognition: Visual search asymmetry between O and Q emerges from a stimulus-driven mechanism. Differential SNR of stimulus templates produces asymmetry in search efficiency through probability summation in a search display composed of numerous distractors and a single target.

Acknowledgments

I wish to thank H. Yamamoto and M. Nagai for their help and two anonymous reviewers for comments on an earlier draft of this work. This work was supported by PRESTO from JST, Grants-in-Aid for Scientific Research (#19500226) from JMEXT, 21st Century COE (D10 to Kyoto University) from JMEXT, and Global COE (D07 to Kyoto University) from JMEXT.

Figure 4

Detailed image analyses for the region of interest (ROI). Definition of ROI. A matrix of 25 × 20 pixels containing the vertical bar of the Q at the center was analyzed. From this matrix, a linear feature vector was constructed by taking an average of pixel values along each column.

Figure 11

Image structure of ROI for Q and O stimuli in the Q- and O-target trials in Experiment 2. Trials with incorrect target identification were excluded. Analyses and graph format are the same as in Figure 7.

Figure 13

Image structure of ROI for trials with incorrect target identification in Experiment 2. Analyses and graph format are the same as in Figure 7. Note that the scale of the color map differs from that in previous figures.