Cognitive Sciences Stack Exchange is a question and answer site for practitioners, researchers, and students in cognitive science, psychology, neuroscience, and psychiatry.

I've often heard that the process of saccading can be described as a statistical sampling technique. Specifically, the standard textbook definition of the function of saccades seems to be that the field of vision is too complex to be known in its entirety, so we look at a representative sample of it in order to form a generally adequate representation.

But this conceit seems to rest purely on a loose analogy to problems that are much better defined than vision (e.g. anything soluble via Monte Carlo methods), and in the perception literature I haven't found any proposed mechanisms by which this integration of sampled information could occur.
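To make the analogy concrete, here is a minimal sketch of the kind of well-defined sampling problem meant here: estimating a single summary statistic from a small random sample. The "scene" is a hypothetical list of brightness values and `monte_carlo_mean` is an illustrative helper; neither is meant as a model of vision, only as an example of what Monte Carlo estimation looks like when the integration step (here, averaging) is fully specified.

```python
import random


def monte_carlo_mean(values, n_samples, rng):
    """Estimate the mean of `values` from n_samples random draws (with replacement)."""
    draws = [rng.choice(values) for _ in range(n_samples)]
    return sum(draws) / n_samples


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical "scene": 100,000 brightness values in [0, 1). Purely illustrative.
scene = [rng.random() for _ in range(100_000)]

true_mean = sum(scene) / len(scene)
estimate = monte_carlo_mean(scene, 1_000, rng)

print(f"true mean = {true_mean:.3f}, estimate from 1,000 samples = {estimate:.3f}")
```

In this toy case the way samples are combined is trivial and known in advance. For vision, it is exactly that combining step that is unclear.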

So my question is this: If eye saccading is best thought of as a sampling process, then how does the brain integrate the sampled information? And if not, then what are some viable alternatives for how the brain makes use of the sequence of images produced via saccading?

2 Answers

A good place to start for a high-level understanding of all perception and action is Joaquin Fuster's perception-action cycle. As he says, it's a "cybernetic cycle linking the organism to its environment". He describes two moieties of the brain: the posterior sensory moiety and the anterior behavioral moiety. Information cycles between perception, interpretation (in the posterior moiety), action plans (in the anterior moiety), and actual movement. The movement is perceived, and the cycle continues. This is a good frame for thinking about how saccades work.

Your actual question, then, opens a big ol' meaty can of worms. How the brain makes use of the sequence of images produced by saccades is actually two questions. First, how does the brain perceive images? Next, how are saccades relevant to this? In the context of the perception-action cycle, we're then looking at how the posterior moiety interprets visual information, and how the anterior moiety acts on that interpretation. Without writing 10,000 words, I'll try to give a thumbnail sketch of this. Then I'll try to circle back and answer your specific question.

So, how does the brain interpret visual images? Here we only really need to understand that the posterior moiety creates distributed and hierarchical cell assemblies representing increasingly complex sets of features. Distributed, because the cell assemblies are sparse: they consist of many scattered members, each representing a different feature. In some way these are bound together to form the gestalt of the object. Hierarchical, because cortical regions are connected such that successive regions integrate features from previous regions. So, basically, a hierarchical set of cell assemblies is activated, and thus the sensory information is represented in the mind.

One point before moving on: these cell assemblies are temporal. They represent information over time. It's not so obvious in vision, but for other sensory modalities it is. What does a single configuration of cochlear activations mean? Or a single configuration of touch sensors? Not much. It means vastly more in temporal sequence: a melody, or a hand patting your shoulder. Vision is no different. The confounding thing is that each individual configuration carries so much information that it's easy to think of the visual system as static. But it isn't. Saccades themselves help create temporal information by focusing on successive things. If you think in terms of the activity of the retina, instead of the static external scene, you'll see that saccades are an integral part of making visual information temporally dynamic.

Because of this, I don't think it is optimal to think of saccades as a procession of images. At the level of the retina, or V1, this might be true. But throughout the perceptual moiety, these images are bound together. Each saccade doesn't present a radically new image; it elaborates an already existing image within the mind. The higher hierarchical regions bind the information from each saccade into a framework.

Ok, so if the posterior moiety understands the scene, how is that understanding translated into an action (like a saccade)? I believe the answer is the route through the basal ganglia, which help "select" motor behaviors. To put it very simply, the perceptual moiety is the input, and the output goes to the anterior motor moiety. Through mechanisms that are unclear, a transform is performed that disinhibits the saccade that will move the eyes to the part of the scene the brain wants to see next.

I realize I'm shooting kind of wide here, so let me know if you want to know about something more specific. I'm just not sure that your current question can be answered narrowly.

Fuster, J.M. (1990). Prefrontal cortex and the bridging of temporal
gaps in the perception-action cycle. Annals of the New York Academy of
Sciences, 608, 318-336.

Let's begin by being clear about terminology and what is involved in eye movements at a basic level. Saccades are eye movements. Fixations are when your eyes are still. No visual information is gathered during saccades - we are functionally blind when making saccades. The easy way to demonstrate this is to try staring in a mirror and watching your own saccades. You can't do it!

Second, though Preece's response is very detailed from a more neurological perspective, I'd like to offer an answer from the perspective of eye movement researchers. The easy way to get into eye movement research, and the perspective of eye movement researchers, is to read something like the Scholarpedia article on the topic: http://www.scholarpedia.org/article/Eye_movement . If you'd like to read actual papers, head over to some of the reviews by Keith Rayner that are cited in that link. They are very detailed and thorough and will serve as a good starting point.

Next, your point that "the standard textbook definition of the function of saccades seems to be that the field of vision is too complex to be known in its entirety, so we look at a representative sample of it in order to form a generally adequate representation" is only really half of the story, and it's worth being explicit about what this actually means. The reason that we fixate objects in the world around us is to use our fovea to gain a high-resolution, full-colour inspection of those objects. Visual acuity (and colour vision) drops off very rapidly in the periphery.

Now, to get to your question! What you are really asking about is something called trans-saccadic integration or trans-saccadic memory. It's all about how we piece together and integrate the information gained at successive fixations at different points in the world around us. David Irwin and colleagues have done extensive work along these lines, and a good starting point is the Wikipedia page and its references: http://en.wikipedia.org/wiki/Transsaccadic_memory