Affiliation: Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands.

ABSTRACTThe visual world is complex and continuously changing. Yet, our brain transforms patterns of light falling on our retina into a coherent percept within a few hundred milliseconds. Possibly, low-level neural responses already carry substantial information to facilitate rapid characterization of the visual input. Here, we computationally estimated low-level contrast responses to computer-generated naturalistic images, and tested whether spatial pooling of these responses could predict image similarity at the neural and behavioral level. Using EEG, we show that statistics derived from pooled responses explain a large amount of variance between single-image evoked potentials (ERPs) in individual subjects. Dissimilarity analysis on multi-electrode ERPs demonstrated that large differences between images in pooled response statistics are predictive of more dissimilar patterns of evoked activity, whereas images with little difference in statistics give rise to highly similar evoked activity patterns. In a separate behavioral experiment, images with large differences in statistics were judged as different categories, whereas images with little differences were confused. These findings suggest that statistics derived from low-level contrast responses can be extracted in early visual processing and can be relevant for rapid judgment of visual similarity. We compared our results with two other, well- known contrast statistics: Fourier power spectra and higher-order properties of contrast distributions (skewness and kurtosis). Interestingly, whereas these statistics allow for accurate image categorization, they do not predict ERP response patterns or behavioral categorization confusions. These converging computational, neural and behavioral results suggest that statistics of pooled contrast responses contain information that corresponds with perceived visual similarity in a rapid, low-level categorization task.

pcbi-1002726-g003: Methods and experimental design.(A), Experimental set-up of experiment 1 (EEG experiment). Subjects were presented with individual images of dead leaves while EEG was recorded. Single-image evoked responses (ERPs) were computed for each electrode, by averaging two repeated presentations of each individual image. Regression analyses of ERP amplitude on contrast statistics were performed at each time sample and electrode. (B), Representational dissimilarity matrices (RDMs) were computed at each sample of the ERP. A single RDM displays Euclidean distance (red = high, blue = low) between multiple-electrode patterns of ERP amplitude between all pairs of stimuli at a specific moment in time. The (cartoon) inset demonstrates how dissimilarities can cluster by category: all images from one category are in consecutive rows and can be ‘similarly dissimilar’ to other categories. (C), Experimental set-up of experiment 2 (behavioral experiment). On each trial, subjects were presented with a pair of stimuli for 50 ms, followed by a mask after an interval of 100 ms. Subjects were presented 8 times with all possible pairings of stimuli and were instructed to indicate whether stimuli were the same or different. (D), Cartoon example of leave-one-out classification based on contrast statistics. One stimulus is selected in turn, after which the median (thumbnail) of the remaining stimuli of its category is computed, as well as the median of other categories (here, just one). Classification accuracy reflects how many stimuli are closer to the median of other categories instead of its own category in terms of distance in image statistics.

Mentions:
Nineteen subjects took part in this experiment. The dead leaves images were presented on a 19 inch Ilyama monitor, whose resolution was set at 1024×768 pixels with a frame rate of 60 Hz. Subjects were seated 90 cm from the monitor such that stimuli subtended 11×11° of visual angle. During EEG acquisition, a single image was presented in the center of the screen on a grey background for 100 ms, on average every 1500 ms (range 1000–2000 ms; Fig. 3A). Each stimulus was presented twice, in two separate runs. Stimuli were presented intermixed with phase-scrambled versions of grayscale natural images; subjects were instructed to indicate which type of image they were shown. This instruction was intended to ensure that subjects attended to the stimuli: the required discrimination between the dead leaves and phase-scrambled natural images did not correspond to any distinction between the categories of dead leaves themselves. Examples of the two types of images were displayed prior to the experiment. Each run was subdivided in 8 blocks across which response mappings were counterbalanced. Stimuli were presented using the software package Presentation (www.neurobs.com).

pcbi-1002726-g003: Methods and experimental design.(A), Experimental set-up of experiment 1 (EEG experiment). Subjects were presented with individual images of dead leaves while EEG was recorded. Single-image evoked responses (ERPs) were computed for each electrode, by averaging two repeated presentations of each individual image. Regression analyses of ERP amplitude on contrast statistics were performed at each time sample and electrode. (B), Representational dissimilarity matrices (RDMs) were computed at each sample of the ERP. A single RDM displays Euclidean distance (red = high, blue = low) between multiple-electrode patterns of ERP amplitude between all pairs of stimuli at a specific moment in time. The (cartoon) inset demonstrates how dissimilarities can cluster by category: all images from one category are in consecutive rows and can be ‘similarly dissimilar’ to other categories. (C), Experimental set-up of experiment 2 (behavioral experiment). On each trial, subjects were presented with a pair of stimuli for 50 ms, followed by a mask after an interval of 100 ms. Subjects were presented 8 times with all possible pairings of stimuli and were instructed to indicate whether stimuli were the same or different. (D), Cartoon example of leave-one-out classification based on contrast statistics. One stimulus is selected in turn, after which the median (thumbnail) of the remaining stimuli of its category is computed, as well as the median of other categories (here, just one). Classification accuracy reflects how many stimuli are closer to the median of other categories instead of its own category in terms of distance in image statistics.

Mentions:
Nineteen subjects took part in this experiment. The dead leaves images were presented on a 19 inch Ilyama monitor, whose resolution was set at 1024×768 pixels with a frame rate of 60 Hz. Subjects were seated 90 cm from the monitor such that stimuli subtended 11×11° of visual angle. During EEG acquisition, a single image was presented in the center of the screen on a grey background for 100 ms, on average every 1500 ms (range 1000–2000 ms; Fig. 3A). Each stimulus was presented twice, in two separate runs. Stimuli were presented intermixed with phase-scrambled versions of grayscale natural images; subjects were instructed to indicate which type of image they were shown. This instruction was intended to ensure that subjects attended to the stimuli: the required discrimination between the dead leaves and phase-scrambled natural images did not correspond to any distinction between the categories of dead leaves themselves. Examples of the two types of images were displayed prior to the experiment. Each run was subdivided in 8 blocks across which response mappings were counterbalanced. Stimuli were presented using the software package Presentation (www.neurobs.com).

Affiliation:
Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands.

ABSTRACTThe visual world is complex and continuously changing. Yet, our brain transforms patterns of light falling on our retina into a coherent percept within a few hundred milliseconds. Possibly, low-level neural responses already carry substantial information to facilitate rapid characterization of the visual input. Here, we computationally estimated low-level contrast responses to computer-generated naturalistic images, and tested whether spatial pooling of these responses could predict image similarity at the neural and behavioral level. Using EEG, we show that statistics derived from pooled responses explain a large amount of variance between single-image evoked potentials (ERPs) in individual subjects. Dissimilarity analysis on multi-electrode ERPs demonstrated that large differences between images in pooled response statistics are predictive of more dissimilar patterns of evoked activity, whereas images with little difference in statistics give rise to highly similar evoked activity patterns. In a separate behavioral experiment, images with large differences in statistics were judged as different categories, whereas images with little differences were confused. These findings suggest that statistics derived from low-level contrast responses can be extracted in early visual processing and can be relevant for rapid judgment of visual similarity. We compared our results with two other, well- known contrast statistics: Fourier power spectra and higher-order properties of contrast distributions (skewness and kurtosis). Interestingly, whereas these statistics allow for accurate image categorization, they do not predict ERP response patterns or behavioral categorization confusions. These converging computational, neural and behavioral results suggest that statistics of pooled contrast responses contain information that corresponds with perceived visual similarity in a rapid, low-level categorization task.