Local sensory information is often ambiguous, forcing the brain to integrate spatiotemporally separated information for stable conscious perception. Lateral connections between clusters of similarly tuned neurons in the visual cortex are a potential neural substrate for the coupling of spatially separated visual information. Ecological optics suggests that perceptual coupling of visual information is particularly beneficial in occlusion situations. Here we present a novel neural network model and a series of human psychophysical experiments that together can explain the perceptual coupling of kinetic depth stimuli with activity-driven lateral information sharing in the far depth plane. Our most striking finding is the perceptual coupling of an ambiguous kinetic depth cylinder with a coaxially presented, disparity-defined cylinder backside, while a similar frontside fails to evoke coupling. Altogether, our findings are consistent with the idea that clusters of similarly tuned far depth neurons share spatially separated motion information in order to resolve local perceptual ambiguities. The classification of far depth in the facilitation mechanism results from a combination of absolute and relative depth, which suggests a functional role of these lateral connections in the perception of partially occluded objects.

Introduction

Local visual information is massively ambiguous, but fortunately the visual system does not base conscious perception on local information alone. Spatial and temporal contexts are highly effective in disambiguating local visual information, which results in a perceptual system that is relatively stable and able to interpret sensory input more globally. When the brain reconstructs the three-dimensional world from a two-dimensional projection on the retina it uses a multitude of cues such as stereoscopic disparity, occlusion, shading or (relative) motion patterns (for an extensive review see Howard & Rogers, 2002). A nice example of how context shapes the three-dimensional interpretation of two-dimensional images can be found in the famous lithograph ‘Relativity’ by M. C. Escher (Escher, 1953/1992). It depicts a world with multiple gravity sources in which the depth interpretation of a room is disambiguated by the presence of people going up or down a set of stairs.

In the laboratory most of the contextual information is often removed from visual stimuli to study highly specific mechanisms of visual processing (Rust & Movshon, 2005). The inference of three-dimensional structure from contextual cues for example can be studied with stimuli that lack explicit depth cues, but whose motion pattern gives rise to the perception of a three-dimensional object. A vivid example of such a stimulus is the two-dimensional projection of a rotating transparent cylinder covered with points, constructed from two layers of randomly positioned dots moving in opposite directions (e.g. Andersen & Bradley, 1998; Kourtzi, Krekelberg, & van Wezel, 2008) (Figure 1a). In the absence of an explicit depth ordering of the two dot layers, this stimulus is bistable with respect to its rotation direction. Bistable stimuli in general offer equal sensory evidence for two mutually exclusive perceptual interpretations causing conscious perception to alternate between the possible interpretations while the stimulus remains the same (for reviews see: Blake & Logothetis, 2002; Leopold & Logothetis, 1999). In the case of the bistable cylinder this means that upon prolonged viewing the rotation direction is perceived to switch every few seconds (Andersen & Bradley, 1998; Nawrot & Blake, 1989; Treue, Husain, & Andersen, 1991).

Figure 1

a) Schematic representation of a kinetic depth cylinder stimulus. The spatial distribution and speed profile of the dots create the vivid impression of a three-dimensional cylinder rotating around a vertical axis. Without explicit depth cues the rotation direction is ambiguous and bistable. The axis drawn here was not present in the actual stimulus. b) Two coaxially presented stimuli have a strong tendency to be perceived as rotating in the same direction. c) Examples of modal and amodal completion with Kanizsa triangles (Kanizsa, 1979). In the top image, a white triangle appears to float in front of black circles. The illusory triangle surface is constructed through modal completion. The lower image's white triangle is perceived as seen through a set of apertures in a white ‘foreground’ (amodal completion) while the black shapes are perceived as part of an occluded black ‘background.’ d) Amodal spatial facilitation can resolve local ambiguities. An image of an occluded Schröder's staircase, looked at through three apertures. The image in the middle aperture has ambiguous depth information whereas the left and right are disambiguated by contextual information. If the middle aperture is combined with only one of the two flanking apertures, amodal facilitation disambiguates the depth structure in the middle aperture.

Adding context or depth cues to a bistable cylinder can overcome the rotation direction ambiguity and bias the stimulus towards one, more or less, stable perceptual interpretation. These cues can be part of the stimulus itself (e.g. Dosher, Sperling, & Wurst, 1986; Klink, van Ee, & van Wezel, 2008b; van Ee, van Dam, & Erkelens, 2002) acting on a local scale or they can be an added context that influences perception in a global manner. Examples of global contextual influences are center-surround interactions between the cylinders and surrounding motion patterns (Sereno & Sereno, 1999), an apparent friction effect when two spheres rotating around parallel axes appear to touch (Gilroy & Blake, 2004) or the perceptual coupling of multiple coaxially rotating stimuli (Eby, Loomis, & Solomon, 1989; Freeman & Driver, 2006; Grossmann & Dobbins, 2003) (Figure 1b). The last case is particularly interesting, since it shows that even an ambiguous context can have strong rivalry resolving effects. It suggests that the visual system combines spatially separated information to minimize the degree of visual conflict in the scene (e.g. Attneave, 1968; Freeman & Driver, 2006; Ramachandran & Anstis, 1983). It has been shown that the extent of this perceptual coupling is largest for two ambiguous cylinders, but coupling also occurs if one of the two stimuli is rendered less ambiguous by either adding disparity or a luminance gradient (Freeman & Driver, 2006; Grossmann & Dobbins, 2003). However, whereas for a full disparity defined cylinder the coupling persists, it is strongly reduced—or absent—for stimuli with a maximal luminance gradient (Freeman & Driver, 2006). Such a maximal luminance gradient effectively reduces a cylinder to a single layer of dots. This has led to the proposition that perceptual coupling between cylinders depends on the presence of both surface layers of the two cylinders (Freeman & Driver, 2006). 
However, the functional mechanism of perceptual coupling remains unclear.

Here we present an alternative explanation for perceptual coupling that has not been previously considered or studied. We hypothesize that perceptual coupling reflects a more common neural mechanism involved in the perception of partially occluded objects or scenes. The visual system can resolve local ambiguities by combining information from different spatially separated locations (e.g. Georgeson, Yates, & Schofield, 2008; Spillmann & Werner, 1996; van der Smagt & Stoner, 2008; Watanabe & Cole, 1995; Yang & Blake, 1995). In real life situations this is particularly useful when objects are partially occluded. When we encounter occlusion, the brain binds the separate chunks of visual information and we perceive a single occluded object rather than multiple separate objects. This perceptual construction of objects that are partially occluded or seen through an aperture is known as amodal completion as opposed to the construction of illusory contours and surfaces in the foreground, which is termed modal completion (e.g. Anderson, Singh, & Fleming, 2002; Kanizsa, 1979) (Figure 1c). Amodal completion is thought to be a hardwired mechanism by which spatial facilitation resolves locally ambiguous visual information (e.g. Driver, Davis, Russell, Turatto, & Freeman, 2001) (Figure 1d). Amodal completion has been shown in a multitude of species such as domestic chicks (Forkman, 1998), pigeons (Nagasaka & Wasserman, 2008), mice (Kanizsa, Renzi, Conte, Compostela, & Guerani, 1993) and baboons (Fagot, Barbet, Parron, & Deruelle, 2006) as well as for a broad range of stimulus dimensions such as shape (e.g. Anderson et al., 2002), color (Pinna, 2008) or sound (Miller, Dibble, & Hauser, 2001). 
The widespread occurrence of amodal completion combined with the strong contrast between the apparently effortless perception of partially occluded objects and the difficult detection of camouflaged objects suggests that the visual system is better equipped for sharing spatially separated information in the far depth plane (amodal) than in the near depth plane (modal). The findings that human observers are better in judging the relative alignment of two gratings (Anderson et al., 2002) and in recognizing faces (Nakayama, Shimojo, & Silverman, 1989) if they are presented in an amodal rather than modal fashion add further evidence to this suggestion.

In two initial experiments that are added as appendices to this manuscript, we replicate previous findings (e.g. Freeman & Driver, 2006) with our new experimental paradigm and demonstrate: 1) how dot luminance and stereoscopic disparity influence the perceptual interpretation of single kinetic depth cylinders; 2) that perceptual coupling between coaxial cylinders occurs for all disparity biases, but collapses for large luminance gradients; and 3) that the direction of information sharing is not necessarily from the cylinder with depth cues to the ambiguous cylinder, but rather from the ‘more certain’ to the ‘least certain’ representation. The experiments described in the main text of the manuscript further aim to unravel the nature of the perceptual coupling mechanism. Experiment 1 demonstrates perceptual coupling between disparity-defined single surface ‘backsides’ and complete ambiguous stimuli for both cylinders and spheres. These findings demonstrate that spatial facilitation takes place in the background and cannot be simply attributed to surface continuation. Experiment 2 demonstrates that the collapse of perceptual coupling with increasing dot luminance gradients scales with the distance between the two cylinders. This finding supports the idea that the shared information decays over traveled distance and that stronger signals in the background are needed to establish perceptual coupling across larger gaps. Experiment 3 investigates the nature of the spatial facilitation signal with asynchronously presented stimuli and reveals that perceptual coupling must occur on a fast activity-driven, rather than a slow adaptation-driven, timescale. Our fourth experiment aims to unravel the roles of absolute and relative depth in spatial facilitation. In other words, does coupling occur between backsides (relative depth) or ‘far depth’ surfaces (absolute depth)? 
The results of this experiment indicate that the coupling mechanism depends on a mixture of absolute and relative depth that is functionally very well suited to deal with occlusion.

The model and experiments were both developed to test our functional hypothesis that the perceptual coupling of kinetic depth stimuli relies on spatial facilitation in the far depth plane. Even though the two approaches form a coherent argument in favor of this hypothesis they might be read independently of each other. The amount of detail in the neural network section of this paper is not strictly necessary to understand the psychophysical results. The experimental results on the other hand may facilitate a better understanding of the model section, but they are also not strictly necessary for it.

A neural network model

Classic models of bistable stimuli such as the kinetic depth cylinder are based on competing neuronal populations coding for two mutually exclusive perceptual interpretations. These neuronal populations are subject to adaptation and they are generally believed to interact via cross-inhibitory connections. The model we present here is based on a recently developed physiologically plausible, single-stage model of visual competition (Noest et al., 2007). This model was developed specifically to gain insight in the mechanism that selects a conscious percept at the onset of a visual rivalry stimulus. The model describes this selection process as a classic competition between mutually inhibitory, percept-coding neural populations. During dominance of a given percept, the response properties of the neurons coding for this percept are altered in a way that does not immediately revert when dominance ends. These continued altered response characteristics thus carry a memory trace of prior dominance (Brascamp, Pearson, Blake, & van den Berg, 2009). In the percept-choice paradigm, the intermittent presentation of visual rivalry stimuli offers a window on these implicit memory traces. Long interruptions (up to seconds) between stimuli result in sequences of repeated dominant percepts (Klink et al., 2008a; Leopold et al., 2002; Noest et al., 2007), whereas shorter interruptions (less than half a second) result in perceptual alternations on subsequent presentations (Klink et al., 2008a; Noest et al., 2007). The Noest-model can account for these findings with an interaction of a neural baseline (β parameter in the model) with the adaptation dynamics of the percept coding neural populations. This interaction functionally creates a head start in the neural competition for the more adapted population at the next stimulus onset (at the level of its near-threshold field potential). 
If the adaptation levels are high, they will easily overcome the small head start, causing the least adapted neural population to ‘win’ the competition resulting in a classic perceptual alternation. However, if the adaptation levels are too low to overcome the head start, the more adapted neural population will become dominant again at the next stimulus onset, causing perceptual repetitions. The adaptation levels of the competing populations build up during stimulus presentation and decay during the intermittent blank periods. Consequently, short interruptions will allow little decay of adaptation and the resulting high adaptation levels lead to perceptual alternations. Longer interruptions on the other hand, allow much more adaptation decay, resulting in lower adaptation levels at the next stimulus onset and thus in perceptual repetitions.

Kinetic depth

Our current model has the same internal dynamics as the original Noest-model, but for the kinetic depth cylinders we have split up the percept-coding neural populations into surface-coding neural populations (Andersen & Bradley, 1998; Nawrot & Blake, 1991). There is sufficient reason to assume that the percept of kinetic depth cylinders is constructed through the depth ordering of the two dot-layers that constitute the concave/convex front and backside of the cylinder (Klink et al., 2008b; Li & Kingdom, 1999; Nawrot & Blake, 1991; Treue, Andersen, Ando, & Hildreth, 1995). This leaves us with a set of four neural populations, each coding for a combination of depth order and motion direction, eventually giving rise to the percept of a bistable rotating cylinder (Figure 2a, Equations 1 and 2). Classic cross-inhibitory connections are assumed between neural populations coding for opposite directions at the same depth level and same directions at different depth levels. Weak facilitatory connections are assumed between opposite directions at different depth planes for considerations of surface continuity (even though they do not crucially change the model's behavior). Since fully opaque kinetic depth stimuli (only one motion direction visible) are predominantly perceived as convex (see for example our results in Appendix A, Figure A1a), we incorporate a small positive bias for ‘near’ over ‘far’ surfaces. This manipulation is also in agreement with the idea that relatively small stimuli that are surrounded by a uniform, differently colored region are interpreted as ‘figure’ or foreground and thus perceived as closer to the observer (Rubin, 1921/2001). 
The quantitative predominance of neurons tuned for near depth over those tuned for far depth that has been demonstrated in many visual cortical areas (area V2: von der Heydt, Zhou, & Friedman, 2000; area V3: Adams & Zeki, 2001; area V4: Hinkle & Connor, 2005; Watanabe, Tanaka, Uka, & Fujita, 2002; area MT: Bradley & Andersen, 1998; DeAngelis & Uka, 2003; Maunsell & van Essen, 1983; area MST: Gonzalez, Perez, Justo, & Bermudez, 2001; Roy, Komatsu, & Wurtz, 1992; area IT: Uka, Tanaka, Yoshiyama, Kato, & Fujita, 2003) could also be an indication of a bias for near over far surfaces.

In these equations X represents the visual input, i & j represent the two motion directions, m & n represent the two depth levels, α is the strength of neuronal adaptation, β can be regarded as an intraneural baseline (see Noest et al., 2007 for details); γD represents the strength of classic cross-inhibition between depth levels and γM that between motion directions and ɛ represents the strength of a surface continuity facilitation (Figure 2a). The fast ‘local field’ activity (h) of the neural populations is translated into a spike rate by a sigmoid function S and undergoes a slow shunting type adaptation (Equation 2). For more details on the internal dynamics of the neural populations see Noest et al. (2007) or Klink et al. (2008a).
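Equations 1 and 2 are not shown in this section. As a reading aid, the following is a sketch of the form they would take if the four-population model simply extends the two-population dynamics of Noest et al. (2007) with the terms defined above. The arrangement of terms, the symbol A for the adaptation state, the time constant τ of the fast dynamics, and the explicit form of S are assumptions made here for illustration, not a transcription of the published equations.

```latex
% Sketch only: assumed form of the fast (local field) and slow (adaptation) dynamics
% for the population coding motion direction i at depth level m.
\begin{align}
\tau\,\partial_t h_{im} &= X_{im} - (1 + A_{im})\,h_{im} + \beta A_{im}
   - \gamma_D\,S[h_{in}] - \gamma_M\,S[h_{jm}] + \epsilon\,S[h_{jn}] \\
\partial_t A_{im} &= -A_{im} + \alpha\,S[h_{im}] \\
% Assumed sigmoid, following the two-population model:
S[z] &= \begin{cases} z^2 / (1 + z^2) & z > 0 \\ 0 & z \le 0 \end{cases}
\end{align}
```

In this reading, the γD term inhibits from the same direction at the other depth level, the γM term inhibits from the opposite direction at the same depth level, and the ɛ term facilitates from the opposite direction at the other depth level, matching the connections described for Figure 2a.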

Our four-population version of the model reproduces the aforementioned findings about the timing of intermittent presentation that were demonstrated with the original two-population version of the model (Klink et al., 2008a; Noest et al., 2007): short interruptions cause perceptual alternations, longer interruptions cause perceptual repetitions (Figure 2b). The neural populations, that together code for a coherent cylinder percept, modulate their activity in synchrony with the near surfaces having stronger responses than far surfaces. The parameters we used in these simulations were taken from the original publication of the Noest model (Noest et al., 2007). This gave us α = 5, β = 4/15 and τ = 1/50. Since we doubled the number of populations involved in constituting a percept compared to the original Noest-model interpretation, we end up with twice the number of cross-inhibitory connections. To stay in accordance with the original parameter-set we divided the strength of the original cross-inhibitory connections by two, leaving us with γD = γM = 5/3. The small surface continuity facilitation that we propose was set to ɛ = 0.1, but setting it to zero did not significantly change the simulation results. The basic input to the model was set as Xnear = 1, while the advantage of near over far surfaces was incorporated as Xfar = 0.75 * Xnear.
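The parameter set above can be made concrete with a minimal simulation sketch. It assumes the dynamics follow the two-population form of Noest et al. (2007) extended with the depth and motion inhibition and the continuity facilitation described in the text; the sigmoid `S`, the Euler step `dt`, the array layout, and the small initial asymmetry `seed_bias` are assumptions introduced here, not part of the published model. Time is in model units (the adaptation time constant equals 1), and no mapping to seconds is implied.

```python
import numpy as np

# Parameter values quoted in the text.
ALPHA, BETA, TAU = 5.0, 4.0 / 15.0, 1.0 / 50.0
GAMMA_D = GAMMA_M = 5.0 / 3.0
EPSILON = 0.1
X_NEAR = 1.0
X_FAR = 0.75 * X_NEAR


def S(h):
    """Assumed sigmoid: h^2 / (1 + h^2) for h > 0, and 0 otherwise."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0.0, h**2 / (1.0 + h**2), 0.0)


def simulate(duration=2.0, dt=0.001, seed_bias=0.01):
    """Euler-integrate four populations; arrays are indexed [direction, depth].

    direction: 0 = up, 1 = down; depth: 0 = near, 1 = far.
    seed_bias is a hypothetical initial asymmetry that breaks the tie
    between the two cylinder percepts.
    """
    X = np.array([[X_NEAR, X_FAR], [X_NEAR, X_FAR]])
    h = np.zeros((2, 2))
    h[0, 0] = h[1, 1] = seed_bias          # favour near/up with far/down
    A = np.zeros((2, 2))
    n_steps = int(round(duration / dt))
    rates = np.empty((n_steps, 2, 2))
    for t in range(n_steps):
        s = S(h)
        same_dir_other_depth = s[:, ::-1]      # inhibits via GAMMA_D
        other_dir_same_depth = s[::-1, :]      # inhibits via GAMMA_M
        other_dir_other_depth = s[::-1, ::-1]  # facilitates via EPSILON
        dh = (X - (1.0 + A) * h + BETA * A
              - GAMMA_D * same_dir_other_depth
              - GAMMA_M * other_dir_same_depth
              + EPSILON * other_dir_other_depth) / TAU
        dA = -A + ALPHA * s
        h = h + dt * dh
        A = A + dt * dA
        rates[t] = S(h)
    return rates, h, A


rates, h_final, A_final = simulate()
print(rates.mean(axis=0))  # mean rate per [direction, depth] population
```

The sketch covers a single continuous presentation only. Reproducing the alternation and repetition regimes would additionally require switching the input X off during blank periods, with durations expressed in model time units.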

To convert simulated neural responses to percepts we calculated the average activity of all four populations during the entire presentation-period of the stimulus and determined a single combination of dominant front and back directions via a winner-take-all mechanism. By using the average activity over the whole presentation epoch, we mimic the perceptual decision process of our human observers that are also allowed to use the entire presentation duration to reach a decision about their percept. This approach could in principle lead to four different percepts: 1) A rotating cylinder with the front moving upwards, 2) A rotating cylinder with the front moving downwards, 3) Two convex surfaces moving in opposite directions, and 4) Two concave surfaces moving in opposite directions (Hol, Koene, & van Ee, 2003). In our simulations, we only encountered the two consistent cylinder percepts.
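One way to read this decoding step is sketched below. It assumes the winner-take-all decision is taken per motion direction, assigning each direction to the depth level with the higher time-averaged activity, which yields exactly the four possible percepts listed above. The function name, array layout, and percept labels are hypothetical.

```python
import numpy as np

UP, DOWN = 0, 1      # motion direction index
NEAR, FAR = 0, 1     # depth index


def decode_percept(rates):
    """Map activity of shape (time, direction, depth) to one of four percepts."""
    # Average over the whole presentation epoch before deciding.
    mean_rate = np.asarray(rates, dtype=float).mean(axis=0)
    up_depth = int(np.argmax(mean_rate[UP]))      # ties resolve to NEAR
    down_depth = int(np.argmax(mean_rate[DOWN]))
    if up_depth == NEAR and down_depth == FAR:
        return "cylinder, front moving up"
    if up_depth == FAR and down_depth == NEAR:
        return "cylinder, front moving down"
    if up_depth == NEAR and down_depth == NEAR:
        return "two convex surfaces"
    return "two concave surfaces"


# Synthetic example: near/up and far/down dominate on average.
example = np.zeros((100, 2, 2))
example[:, UP, NEAR] = 0.8
example[:, DOWN, FAR] = 0.6
example[:, UP, FAR] = 0.1
example[:, DOWN, NEAR] = 0.2
print(decode_percept(example))
```

Averaging before the decision means that a brief early transient in one population does not determine the reported percept, in line with observers being allowed the full presentation to decide.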

Spatial facilitation

Two cylinders that are presented simultaneously can be modeled with two sets of four neuronal populations, each with their own inhibitory and facilitatory connections as described above. The principles of modal and amodal completion suggest that there may be lateral connections between similarly tuned populations of neurons coding for the different cylinders (Figure 2c, Equation 3 in Appendix C). As we noted in the Introduction, ecological optics (Gibson, 1950) would suggest that the visual system is better equipped to deal with occlusion than with camouflage, leading us to assume that the facilitatory connections are stronger in the far (amodal) than in the near (modal) depth plane (purple lines in Figure 2c).

The only difference between the fast dynamics Equations 1 and 3 is that the newly introduced p & q represent the two coupled sets of neural populations and the λ term indicates the strength of the spatial facilitation. Equations 2 and 4 that denote the adaptation dynamics are identical.
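As an illustration of this description, the coupled fast dynamics might be written as follows. This is a sketch under the assumption that each population of cylinder p receives facilitation from the identically tuned population of cylinder q, scaled by a depth-dependent strength λ; the superscript notation, the symbols A and τ, and the placement of the λ term are assumptions and not a transcription of Equation 3.

```latex
% Sketch only: assumed coupled fast dynamics for cylinder p, with q the other cylinder.
\begin{equation}
\tau\,\partial_t h^{p}_{im} = X^{p}_{im} - (1 + A^{p}_{im})\,h^{p}_{im} + \beta A^{p}_{im}
   - \gamma_D\,S[h^{p}_{in}] - \gamma_M\,S[h^{p}_{jm}] + \epsilon\,S[h^{p}_{jn}]
   + \lambda_m\,S[h^{q}_{im}]
\end{equation}
% \lambda_m takes the value \lambda_{far} for far populations and \lambda_{near} for near ones.
```

With λ set to zero, this expression reduces to the uncoupled dynamics of a single cylinder, which is consistent with the statement that the λ term and the p, q indices are the only additions.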

We developed our model in order to account for the existing experimental data that demonstrated that perceptual coupling occurs between two coaxial ambiguous cylinders and between a disparity defined and an ambiguous cylinder, but not between a fully luminance defined and an ambiguous cylinder (Freeman & Driver, 2006). Simulations with the model of amodal spatial facilitation were performed to investigate the properties of perceptual coupling. Depending on the duration of the blank period, two bistable cylinders either stabilize or alternate together (Figure 3a).

Figure 3

Simulations of the spatial facilitation model (Figure 2c). Colored lines represent the simulated response of the four neural populations and colored shading represents percepts, inferred from the neural responses via a winner-take-all mechanism (see text). a) Perceptual coupling between two ambiguous cylinders. During the first few presentations the cylinders are individually stabilized, but later they couple and they stay coupled. The moment coupling kicks in depends on the strength of the spatial facilitation parameter. b) Perceptual coupling between an alternating disparity biased and an ambiguous cylinder. The ambiguous cylinder no longer stabilizes but follows the alternating disparity-defined percept of the biased cylinder. The strength and direction of the depth cue bias is given as a modulation parameter M (see text) and visualized with the green and gray lines that correspond to the green and gray neural populations in the schematic model icon (corresponding to Figure 2a) next to it. c) No perceptual coupling between an alternating luminance biased and an ambiguous cylinder. The dominant percept of the ambiguous cylinder stabilizes while the luminance-defined percept of the biased cylinder alternates. d) Perceptual coupling between a weakly luminance biased and an ambiguous cylinder. The luminance bias alternates direction on consecutive presentations, but is overruled by the perceptual stabilization that couples from the ambiguous to the biased cylinder. Parameters used in the simulation are: α = 5, β = 4/15, τ = 1/50, Xnear = 1, Xfar = 0.75 * Xnear, γD = γM = 5/3, ɛ = 0.1, λfar = 0.4 and λnear = λfar/5. Modulation in b & c: Xnear{0.5–1}, modulation in d: Xnear{0.9–1}.

In our biased cylinder simulations, a temporal presentation profile was used that would normally give rise to sequences of repeated percepts (1.0 second presentations with 1.5 second blank periods; see also Figure 2b). Depth cues were incorporated in the model by multiplying the input to the neural populations with a modulation factor M. Introducing depth cues to the two cylinders with dot luminance or stereoscopic disparity has different effects on the activity of the surface-coding neuronal populations (Table 1). Dot luminance manipulations result in biases that are based on motion direction only and will thus affect the activity of the two populations coding for the same direction of the manipulated dots regardless of their depth assignment (vertically positioned pairs of populations in Figure 2a). The small positive bias for ‘near’ over ‘far’ surfaces ensures that the brightest dots are perceived as the ‘near’ side of the cylinder. Stereoscopic disparity manipulations, on the other hand, result in biases based on combined motion and depth information and will consequently affect the relative activities of the two pairs of populations coding for a coherent cylinder percept (diagonally positioned pairs of populations in Figure 2a). To visualize the effect of the depth cues, we simulated a switch in cue direction on each consecutive presentation (gray and green M-lines in Figures 3b–3d). If full disparity cues are used, the two coupled cylinders together follow the biased direction (Figure 3b), but with full dot luminance biases perceptual coupling collapses (Figure 3c), which is in agreement with the existing data. When stimulus biases are relatively small, they are no longer the strongest percept-determining feature. 
The perceptual stabilization that arises from the intermittent stimulus presentation with long blank periods (Klink et al., 2008a; Leopold et al., 2002; Maier et al., 2003; Noest et al., 2007; Pearson & Brascamp, 2008) is now more effective and the two cylinders appear coupled but their rotation direction stabilizes (Figure 3d) despite the alternating depth cue biases.

Table 1

The effect of simulated input modulations on the effective input to the neural populations in our model. To account for the preference for single surfaces to be perceived as being near rather than far we state that Xfar = 0.75 * Xnear. Depth cue modulations affecting the different neural populations of the model are denoted as gain factors M1 and M2 (green and gray lines in Figure 3). For simulated luminance manipulations the two populations coding for the same motion direction have the same modulation gains, while for simulated disparity modulations the two populations that code for a consistent cylinder (different depth, opposite directions) receive the same gain factor.

| Population | Ambiguous | Luminance depth cue | Disparity depth cue |
| --- | --- | --- | --- |
| Near/Up | X = Xnear | X = M1 * Xnear | X = M1 * Xnear |
| Near/Down | X = Xnear | X = M2 * Xnear | X = M2 * Xnear |
| Far/Up | X = 0.75 * Xnear | X = M1 * 0.75 * Xnear | X = M2 * 0.75 * Xnear |
| Far/Down | X = 0.75 * Xnear | X = M2 * 0.75 * Xnear | X = M1 * 0.75 * Xnear |

The strengths of spatial facilitation in our simulations were chosen to reproduce the dissociation in coupling between luminance and disparity biases, and to reflect our hypothesis that coupling is stronger in the far than in the near depth field, resulting in λfar = 0.4 and λnear = λfar/5. The depth biases were simulated by multiplying Xnear with a modulation factor M so that Xnear-mod = M * Xnear. To demonstrate the effect of strong luminance and depth cues we modulated M between 0.5 and 1. A weak modulation of M{0.9–1} was used to reveal the occurrence of “reverse coupling” with the two cylinders being stabilized together despite a depth cue that is alternating in direction.
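The input assignments of Table 1 and the facilitation strengths quoted above can be summarized in a short sketch. The function name, the dictionary keys, and the cue labels are hypothetical; only the numerical values and the pairing of gain factors M1 and M2 with populations are taken from the text.

```python
X_NEAR = 1.0
FAR_GAIN = 0.75              # Xfar = 0.75 * Xnear
LAMBDA_FAR = 0.4
LAMBDA_NEAR = LAMBDA_FAR / 5  # weaker coupling in the near depth plane


def effective_inputs(cue, m1=1.0, m2=1.0, x_near=X_NEAR):
    """Return the effective input per population for a given depth cue type."""
    if cue == "ambiguous":
        gains = {"near_up": 1.0, "near_down": 1.0, "far_up": 1.0, "far_down": 1.0}
    elif cue == "luminance":
        # Same gain for both populations sharing a motion direction.
        gains = {"near_up": m1, "far_up": m1, "near_down": m2, "far_down": m2}
    elif cue == "disparity":
        # Same gain for the two populations forming one consistent cylinder.
        gains = {"near_up": m1, "far_down": m1, "near_down": m2, "far_up": m2}
    else:
        raise ValueError(f"unknown cue type: {cue!r}")
    base = {"near_up": x_near, "near_down": x_near,
            "far_up": FAR_GAIN * x_near, "far_down": FAR_GAIN * x_near}
    return {name: gains[name] * base[name] for name in base}


# Strong bias (M between 0.5 and 1) applied as a disparity cue.
print(effective_inputs("disparity", m1=0.5, m2=1.0))
```

Alternating the cue direction on consecutive presentations then amounts to swapping the values passed as `m1` and `m2`.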

In a series of psychophysical experiments we also tested the hypothesis that perceptual coupling is driven by connections in the far depth plane (Experiment 1). If such spatial facilitation through lateral connections exists, it is likely to exhibit a certain decay of signal strength with increasing interstimulus distance. A second experiment (Experiment 2) investigates whether the strength of spatial facilitation is indeed a function of the distance between the two cylinders. Our third and fourth experiments shed light on the nature of the facilitatory mechanism and on the roles of absolute depth (depth relative to the plane of fixation) and relative depth (front or back side of the cylinder), respectively.

Methods

Observers

Five observers participated in Experiments 1 and 4, four observers in Experiments 2 and 3. In each experiment, one of these observers was an author while the others were naive about the purpose of the study. All observers had normal or corrected to normal visual acuity. After we explained the task and showed the stimuli to the observers we obtained their informed consent.

Apparatus

Visual stimuli were generated on a Macintosh computer in MATLAB (Mathworks, Natick, MA) using the Psychtoolbox extensions (Brainard, 1997; Pelli, 1997) and presented on a 22 inch CRT monitor with a resolution of 1600 × 1200 pixels and a refresh rate of 100 Hz. Observers viewed the stimuli through a mirror stereoscope from a distance of 100 cm.

Stimuli

In all experiments, stimuli were kinetic depth cylinders or spheres (only in Experiment 1), consisting of white dots on a black background (∼0 cd/m²), rotating around a horizontal axis at 120 deg/s. Cylinders or spheres were 3 × 3 deg and the individual dots were 0.11 deg in size. Stimuli without disparity cues were presented monocularly to prevent explicit ‘flatness.’ Disparity biases were implemented by horizontally shifting the dots presented to the individual eyes in fractions of the ‘realistic’ disparity (0, 20, 40, 70 and 100%). In the luminance biased condition, the ‘nearest’ dots always had full luminance (69.7 cd/m²) while the other dots' luminance was modulated down to fractions of the full luminance (0, 25, 60, 90 and 100% modulation) depending on their simulated depth. Ambiguous (0% modulation) and disparity defined cylinders thus consisted of dots that were all 69.7 cd/m², whereas e.g. 100% luminance modulated cylinders contained dots ranging in luminance between 0 cd/m² (the ‘farthest’ dots in the middle of the back surface) and 69.7 cd/m² (the ‘nearest’ dots in the middle of the front surface). Stimuli were presented on the screen for one second, separated by an interstimulus interval of 1.5 seconds. Blocks of stimulus presentations lasted 120 seconds and conditions were picked in pseudo-random order. During the entire duration of a block there was a fixation cross (6 × 6 pixels, 69.7 cd/m²) at the center of the screen.
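The luminance bias described above can be illustrated with a minimal sketch. The text fixes only the end points of the gradient (full luminance for the nearest dots, and 0 cd/m² for the farthest dots at 100% modulation); the linear fall-off with depth, the orthographic projection and all function names below are assumptions made for illustration.

```python
import math

FULL_LUMINANCE = 69.7   # cd/m^2, from the text
RADIUS_DEG = 1.5        # half of the 3 deg stimulus size, from the text
SPEED_DEG_PER_S = 120.0 # rotation speed, from the text


def dot_luminance(depth, modulation):
    """Luminance of a dot at normalized depth (0 = nearest, 1 = farthest).

    ASSUMPTION: luminance falls off linearly with depth; the text only
    specifies the luminance of the nearest and farthest dots.
    """
    return FULL_LUMINANCE * (1.0 - modulation * depth)


def dot_state(phase_deg, t):
    """Vertical screen position (deg) and normalized depth of one dot at time t (s).

    The cylinder rotates about a horizontal axis, so rotation displaces dots
    vertically on the screen (orthographic projection is assumed here).
    """
    angle = math.radians(phase_deg + SPEED_DEG_PER_S * t)
    y = RADIUS_DEG * math.sin(angle)
    depth = (1.0 - math.cos(angle)) / 2.0  # 0 at the front middle, 1 at the back middle
    return y, depth


# A dot that starts at the middle of the front surface reaches the middle of
# the back surface after half a revolution (1.5 s at 120 deg/s).
y, depth = dot_state(phase_deg=0.0, t=1.5)
print(round(depth, 3), round(dot_luminance(depth, modulation=1.0), 3))
```

With `modulation=0.0` every dot keeps the full luminance, which corresponds to the ambiguous and disparity defined stimuli.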

Procedures

Experiment 1: Information sharing in the near and far planes. Two coaxial cylinders or spheres were presented spatially separated by a gap of 0.5 degrees. The rightmost stimulus was always completely ambiguous. Only one of the two dot layers of the left stimulus was displayed. This layer could be the far or near side of a cylinder as defined by its luminance gradient or disparity information. We performed a short selection experiment to test whether these cues were sufficient for our observers to impose the specified percept. Only observers that perceived the biases in the veridical direction more than 75% of the time (80 presentations with random bias direction) were selected for this experiment (7 out of 8 observers passed this test). They then performed the experiment in which they only reported the perceived direction of the near/front surface of the full, ambiguous stimulus by pressing a button on the keyboard. The ‘half’ stimulus had a 40% probability of changing its direction on consecutive presentations.
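The stochastic direction changes of the biased stimulus can be sketched as a simple two-state sequence. The 40% switch probability comes from the text; the function name, the seeding, and the count of 48 presentations per block (derived here from a 120 s block divided by a 1 s stimulus plus a 1.5 s interval) are illustrative assumptions.

```python
import random


def bias_direction_sequence(n_presentations, p_switch=0.4, seed=None):
    """Return directions (+1 or -1), where each presentation reverses the
    previous direction with probability p_switch."""
    if n_presentations <= 0:
        return []
    rng = random.Random(seed)
    direction = rng.choice([1, -1])   # random starting direction
    sequence = [direction]
    for _ in range(n_presentations - 1):
        if rng.random() < p_switch:
            direction = -direction
        sequence.append(direction)
    return sequence


# ASSUMPTION: 120 s block / (1 s stimulus + 1.5 s interval) = 48 presentations.
block = bias_direction_sequence(48, seed=1)
n_switches = sum(a != b for a, b in zip(block, block[1:]))
print(len(block), n_switches)
```

Because the direction persists on 60% of consecutive presentations, a percept that simply repeats itself would agree with the biased stimulus more often than not, which is why coupling has to be evaluated against the actual direction sequence rather than against a fixed direction.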

Experiment 2: The spatial decay of perceptual coupling. Two coaxial cylinders were presented on each side of the fixation cross. The distance between the cylinders was variable over blocks (0.25, 0.5, 1.0 or 2.0 degrees). The rightmost cylinder was always completely ambiguous whereas the left could have a luminance bias. Observers indicated the perceived direction of the near/front surface of both cylinders by pressing buttons on the keyboard. Any possible stimulus bias (disparity or luminance) again had a 40% probability of changing its direction on consecutive presentations.

Experiment 3: Asynchronous presentation. Two coaxial cylinders were presented on both sides of a central fixation cross, spatially separated by a gap of 0.5 degrees. The rightmost stimulus was always completely ambiguous while the other was fully disambiguated by stereoscopic disparity (changing direction with a 40% probability). There was a temporal offset of 1.25 seconds between the presentation of the two cylinders, causing each cylinder to be on the screen only during the other cylinder's interstimulus interval (Figure 6a). The presentation of these alternating cylinders thus had a residual true blank period of 250 milliseconds. Observers indicated the perceived direction of the near/front surface of both cylinders by pressing buttons on the keyboard.
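The 250 millisecond blank period follows directly from the timing values given above, as the following sketch shows. The durations are taken from the text; the function names and the choice of three cycles are illustrative.

```python
STIM_DURATION = 1.0     # s, from the text
ISI = 1.5               # s, from the text
TEMPORAL_OFFSET = 1.25  # s, from the text


def presentation_windows(n_cycles, first_onset=0.0):
    """(onset, offset) times in seconds for one cylinder over n_cycles presentations."""
    period = STIM_DURATION + ISI
    return [(first_onset + k * period, first_onset + k * period + STIM_DURATION)
            for k in range(n_cycles)]


def blank_periods(windows_a, windows_b):
    """Durations of the empty-screen gaps between consecutive presentations
    of either cylinder (a negative value would indicate temporal overlap)."""
    merged = sorted(windows_a + windows_b)
    return [next_on - prev_off
            for (_, prev_off), (next_on, _) in zip(merged, merged[1:])]


c1 = presentation_windows(3)                               # disparity defined cylinder
c2 = presentation_windows(3, first_onset=TEMPORAL_OFFSET)  # ambiguous cylinder
gaps = blank_periods(c1, c2)
print(gaps)
```

Every gap is positive and equal, so the two cylinders never share the screen and each one falls entirely within the other's interstimulus interval.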

Experiment 4: Relative vs. absolute depth. Two coaxial cylinders were presented on both sides of a central fixation cross, spatially separated by a gap of 0.5 degrees. The rightmost stimulus was always completely ambiguous while the other was fully disambiguated by stereoscopic disparity (changing direction with a 40% probability). The disparity defined cylinder could either be fully displayed or be restricted to its near or far side. The set of cylinders was defined to have a location in depth with the axis of rotation either one diameter closer to the observer than the plane of fixation or one diameter further away from the observer than the plane of fixation. Their size on the screen was kept the same in both situations. In the plane of fixation we added a framework of three vertical and two horizontal gray bars (25.3 cd/m²) with a width of 0.5 degrees to aid depth discrimination (see schematic representation in Figure 7). This addition caused some of the dots on the left and right sides of the cylinder to be either (partially) occluded by or on top of this null-plane framework, which, combined with the disparity information, resulted in a vivid percept of the cylinders being behind or in front of the plane of fixation. As in Experiment 1, observers reported the perceived direction of the near/front surface of the full, ambiguous stimulus by pressing a button on the keyboard. The ‘half’ stimulus had a 40% probability of changing its direction on consecutive presentations.
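To give a feeling for the depth offsets involved, the sketch below converts "one diameter in front of or behind fixation" into an approximate binocular disparity. The viewing distance (100 cm) and stimulus size (3 deg) are from the text. The interocular distance of 6.5 cm is an assumed typical value that the text does not report, and the simulated diameter is assumed to equal the physical extent of 3 deg at the fixation distance, so the resulting numbers are indicative only.

```python
import math

VIEWING_DISTANCE_CM = 100.0  # from the text
STIMULUS_SIZE_DEG = 3.0      # from the text
IOD_CM = 6.5                 # ASSUMPTION: typical interocular distance, not reported


def cylinder_diameter_cm():
    """Physical extent subtended by the stimulus at the fixation distance."""
    half_angle = math.radians(STIMULUS_SIZE_DEG / 2.0)
    return 2.0 * VIEWING_DISTANCE_CM * math.tan(half_angle)


def vergence_deg(distance_cm):
    """Vergence angle needed to fixate a point straight ahead at distance_cm."""
    return math.degrees(2.0 * math.atan(IOD_CM / (2.0 * distance_cm)))


def disparity_re_fixation_deg(distance_cm):
    """Disparity relative to fixation: positive = crossed (nearer), negative = uncrossed."""
    return vergence_deg(distance_cm) - vergence_deg(VIEWING_DISTANCE_CM)


diameter = cylinder_diameter_cm()
near_axis = disparity_re_fixation_deg(VIEWING_DISTANCE_CM - diameter)
far_axis = disparity_re_fixation_deg(VIEWING_DISTANCE_CM + diameter)
print(f"diameter: {diameter:.2f} cm")
print(f"axis disparity, near condition: {near_axis * 60:.1f} arcmin")
print(f"axis disparity, far condition: {far_axis * 60:.1f} arcmin")
```

The near and far offsets are symmetric in distance but not exactly symmetric in disparity, because vergence changes more steeply at shorter distances.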

Results

Experiment 1: Information sharing in the near and far planes

Our hypothesis for amodal spatial facilitation in perceptual coupling predicts that the difference in perceptual coupling between luminance and disparity depth cues results from the existence of lateral connections between neural populations involved in the representation of the two individual cylinders or spheres (Figure 2c). In particular, we argue that the principle of amodal completion of occluded objects suggests that these facilitatory lateral connections are only present in the far depth plane or in any case much stronger than in the near depth plane. This implies that previous assumptions about the necessity of both dot layers (or ‘sides’) of a kinetic depth cylinder for perceptual coupling (Freeman & Driver, 2006) may have been premature. It could very well be that one dot layer is enough to establish coupling as long as it explicitly constitutes the ‘far half’ of the cylinder. In this experiment we test this hypothesis by using fully biased half cylinders and spheres that are defined by luminance or disparity to be either far or near sides of a kinetic depth stimulus. We included spheres here to investigate whether any possible coupling effect should be attributed solely to surface continuation, which could drive coupling between coaxial cylinders but not between spheres. The results convincingly demonstrate that perceptual coupling can occur between an ambiguous stimulus and a coaxial half stimulus as long as the latter is a disparity defined far side (Figure 4a for cylinders, T-test: p < 0.001; Figure 4b for spheres, T-test: p < 0.02) and because the effect is present for both cylinders and spheres it cannot be solely attributed to surface-continuation. Disparity defined near sides (T-test: pcylinders = 0.36; pspheres = 0.12), luminance defined far (T-test: pcylinders = 0.34; pspheres = 0.50) or near sides (T-test: pcylinders = 0.31; pspheres = 0.09) do not couple with an ambiguous cylinder. 
Furthermore, disparity defined far sides couple significantly better than disparity defined near sides (T-test: pcylinders < 0.01; pspheres < 0.02) or luminance defined far sides (T-test: pcylinders < 0.03; pspheres < 0.01). Luminance defined far sides appear to couple slightly better than luminance defined near sides but this difference was not significant (T-test: pcylinders = 0.30; pspheres = 0.11). It must however be noted that luminance cues on a single surface are not very effective. Even though a luminance gradient can define a convex or concave surface, the dots we used to define the concave backsides were very dim and the general tendency of observers to perceive single surfaces as being the near side of a cylinder appears to dominate the luminance depth cue altogether.

Figure 4

The fraction of perceptual coupling between ‘halves’ and ambiguous kinetic depth stimuli for five observers for cylinders (a) and spheres (b). ‘Half’ stimuli are defined to be the near or far sides of the full stimulus using either full luminance gradients or full disparity biases. For both types of stimuli, the only case in which the fraction of coupling is significantly larger than chance is when there is a disparity defined far side. In those cases there is also significantly more perceptual coupling than in disparity defined near sides or luminance defined far sides. Error bars represent S.E.M.

The main conclusion of this experiment is the demonstration that perceptual coupling can occur between an ambiguous cylinder and a single surface as long as this single surface is a clearly defined cylinder backside.
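The measure used throughout these results can be sketched as follows. The code treats coupling as agreement in rotation direction between the ambiguous stimulus and the context stimulus, and compares per-observer fractions against chance with a one-sample t statistic. Both the exact scoring rule and the form of the T-test are assumptions, since the text does not specify them, and the response values in the example are hypothetical toy numbers rather than data from the experiment.

```python
import math
import statistics


def coupling_fraction(ambiguous_reports, context_directions):
    """Fraction of presentations on which the reported rotation direction of
    the ambiguous stimulus matches that of the context stimulus."""
    if not ambiguous_reports or len(ambiguous_reports) != len(context_directions):
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(r == c for r, c in zip(ambiguous_reports, context_directions))
    return matches / len(ambiguous_reports)


def t_statistic_vs_chance(fractions, chance=0.5):
    """One-sample t statistic (df = n - 1) of per-observer fractions against chance."""
    n = len(fractions)
    sem = statistics.stdev(fractions) / math.sqrt(n)
    return (statistics.mean(fractions) - chance) / sem


# Hypothetical toy responses for a single observer (+1 and -1 code the two directions).
reports = [1, 1, -1, -1, 1, -1, 1, 1]
context = [1, 1, -1, 1, 1, -1, -1, 1]
print(coupling_fraction(reports, context))
```

A fraction near 0.5 means that the ambiguous stimulus was perceived independently of the context stimulus, whereas values approaching 1 indicate coupling.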

Experiment 2: The spatial decay of perceptual coupling

Our explanation of the perceptual coupling phenomenon proposes the existence of lateral connections that are responsible for information sharing between neural pools coding for spatially separated stimuli. It seems legitimate to think that the effectiveness of the information sharing mechanism will depend on the distance that needs to be bridged. In particular, one might expect that strong initial signals will be able to bridge larger distances between stimuli than weak ones. Experiment 2 tests this assumption by measuring the proportion of perceptual coupling as a function of dot luminance bias and gap-size between the cylinders. From the experiments in 2 (and previous work by Freeman & Driver, 2006; Grossmann & Dobbins, 2003), we know that perceptual coupling between a luminance biased cylinder and an ambiguous cylinder will cease to exist when the bias gets too large. Figure 5a demonstrates that with all gap-sizes used there is a near perfect coupling between two ambiguous cylinders and coupling at chance level with full luminance gradients. However, the moment of the drop in perceptual coupling depends not only on the strength of the luminance depth cue, but also on the distance between the cylinders (2-way ANOVA: Flum(4,60) = 58.59, plum < 0.001; Fgap(3,60) = 15.00, pgap < 0.001). A significant interaction between gap-size and luminance bias (Finter(12,60) = 3.19, pinter < 0.01) further demonstrates that when the distance between the cylinders increases, the proportion of perceptual coupling starts to decrease at much smaller luminance biases. This suggests that spatial facilitation over larger distances needs the presence of stronger signals in the far depth plane. Our model predicts that a facilitatory signal from an ambiguous towards a weakly luminance biased cylinder can overcome the luminance bias (Figure 3d). 
If this phenomenon of ‘reversed coupling’ takes place, the biased cylinder will be perceived to couple with the ambiguous cylinder and rotate against its bias. The balance between the strength of the depth cue and the strength of the spatial facilitation determines whether this will happen. If the effect of facilitation indeed scales with the distance between stimuli we would thus expect that the proportion of trials in which a weakly biased cylinder is perceived veridically would be larger for smaller gap-sizes. Figure 5b plots the proportion of veridically perceived biased cylinders as a function of bias strength and gap-size. The effect of bias strength is highly significant (2-way ANOVA, F(4,60), p < 0.002), but the effect of gap-size is not (F(3,60), p = 0.93) nor is the interaction between bias and gap-size (F(12,60), p = 0.99). The gap-size dependency of the ‘reversed coupling’ is however expected to be a relatively subtle effect and our rather noisy data lacks the appropriate resolution to make any strong statements about it.
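The degrees of freedom reported for the two-way ANOVAs can be reconstructed from the design, which is a useful consistency check. The sketch assumes a fixed-effects analysis with one value per observer in each combination of luminance bias (five levels) and gap-size (four levels) for the four observers of Experiment 2. That this is how the analysis was set up is an inference from the reported values, not something the text states.

```python
def two_way_anova_df(levels_a, levels_b, n_per_cell):
    """Degrees of freedom for a balanced two-way fixed-effects ANOVA."""
    df_a = levels_a - 1
    df_b = levels_b - 1
    return {
        "factor_a": df_a,
        "factor_b": df_b,
        "interaction": df_a * df_b,
        "error": levels_a * levels_b * (n_per_cell - 1),
    }


# 5 luminance bias levels x 4 gap-sizes, 4 observers per cell (assumed layout).
df = two_way_anova_df(levels_a=5, levels_b=4, n_per_cell=4)
print(df)
```

Under these assumptions the function returns 4, 3 and 12 numerator degrees of freedom with 60 error degrees of freedom, in agreement with the F(4,60), F(3,60) and F(12,60) values reported above.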

Figure 5

a) The influence of gap-size on perceptual coupling for four observers. The proportion of perceptual coupling is plotted against the strength of a luminance bias. The proportion of perceptual coupling decreases when luminance biases become too large. If the gap between the two cylinders increases the drop in perceptual coupling occurs at smaller luminance biases. b) The influence of luminance bias and gap-size on the proportion of trials in which the observers perceive the biased cylinder in accordance with the bias. This proportion increases fast with stronger biases but is not significantly influenced by gap-size. The open square at bias level zero is a theoretical point at chance level since there is no veridical percept here. Error bars in both plots represent S.E.M.

Experiment 3: Asynchronous presentation

Figure 6b demonstrates that while the proportion of perceptual coupling between disparity defined cylinders and ambiguous cylinders was high when they were presented simultaneously (data from 2), it is completely absent if the two cylinders are presented with a temporal offset (all observers; T-test, p > 0.13; group data, p = 0.50). Here, perceptual coupling was defined as coupling between the disparity-defined cylinder and the subsequent ambiguous cylinder, since the proportion of veridical perception of the disparity defined cylinder was at ceiling level (average over observers was 0.95 ± 0.05 standard deviation; not significantly different from 1.0 as indicated by a T-test, p = 0.47). In our model, the spatial facilitation term acts on the fast h-dynamics representing local field activity (Equation 3) and consequently has little effect on the slower adaptation dynamics (Equation 4). Simulations with our model indeed reproduce the absence of perceptual coupling when the two stimuli are presented asynchronously (Figure 6c).

Figure 6

a) Temporal profile of the presentation of the two cylinders in our control experiment. The left cylinder (C1) was disambiguated by stereoscopic disparity; the right cylinder (C2) was ambiguous. Each cylinder was presented alone for 1.0 seconds, separated by 1.5-second intervals during which the other cylinder was presented. b) Significant perceptual coupling with synchronous presentation (gray bar, data from 2) ceases to exist when the stimuli are presented asynchronously (white bar). Error bars represent S.E.M. c) Simulations with our model reproduce the lack of perceptual coupling with asynchronous presentation. The simulation was performed with the same parameters as in Figure 3b, only now the input to the two sets of neuronal populations was asynchronous.

Experiment 4: Relative vs. absolute depth

This experiment aimed to unravel whether the distinction between near and far sides of a cylinder in perceptual coupling that is demonstrated with Experiment 1 relies on relative or absolute depth. It is important to realize that while our terminology of absolute and relative depth resembles the distinction between absolute and relative disparity (for a review see Roe, Parker, Born, & DeAngelis, 2007 or Parker, 2007), they are in fact significantly different. The absolute depth of the potentially coupling surfaces is defined relative to the plane of fixation and can thus be regarded as an analog of absolute disparity, which describes the angular difference of retinal projections relative to the fovea. However, relative depth in our terminology indicates whether we are talking about a front side or backside of a cylinder and is something totally different from relative disparity, which is taken as the difference in absolute disparity between two points. A more direct analog of relative disparity would be the difference in depth between the two cylinders, but since the ambiguous stimulus is presented monocularly, it lacks an explicit location in depth and relative disparity cannot play a role. The results (Figure 7) demonstrate that perceptual coupling between ‘complete’ disparity-defined cylinders and ambiguous cylinders is maintained in both the near and far condition (T-test, p < 0.01). For both these conditions the far cylinder sides alone also establish a significant fraction of coupling (T-test, p < 0.05) that is not significantly different from the fraction that results from complete cylinders (T-test, p > 0.12). Looking at the near sides of the disparity defined cylinders alone it becomes clear that significant coupling does not occur (T-test, p = 0.95) when the stimuli are closer to the observer than the plane of fixation (matching the results from Experiment 1). 
However, when the stimuli are behind the plane of fixation the near sides can establish a significant fraction of perceptual coupling (T-test, p < 0.02). This fraction is smaller than that for whole cylinders or far sides at the same depth location (T-test, p < 0.05) but nevertheless present. The addition of a framework in the plane of fixation adds a minor depth cue to the display due to the partial occlusion of some of the dots at the edges of the far depth cylinders. Whereas this manipulation greatly enhanced perceptual depth ordering, we believe that it is unlikely to have critically influenced our perceptual coupling results in any other way.

Figure 7

The roles of absolute and relative depth. The fraction of perceptual coupling between disparity defined ‘half’ and complete cylinders and ambiguous cylinders that were either closer to the observer than the plane of fixation (left) or further away than the plane of fixation (right). For the closer set of stimuli the results are comparable to those of Experiment 1 (Figure 4). For the set of stimuli behind fixation the ‘near halves’ of cylinders (rightmost white bar) also cause a significant fraction of perceptual coupling. Error bars represent S.E.M.

Discussion

The visual system uses spatial and temporal context to disambiguate local sensory information and construct a global conscious percept. If two ambiguous kinetic depth spheres or cylinders (Andersen & Bradley, 1998; Nawrot & Blake, 1989; Treue et al., 1991) are presented spatially separated but rotating about a common axis, their rotation directions couple and they switch directions simultaneously (Eby et al., 1989; Freeman & Driver, 2006; Grossmann & Dobbins, 2003). Apparently, even an ambiguous context can disambiguate a visual conflict. Studies investigating this perceptual coupling phenomenon have shown strong coupling both between multiple ambiguous stimuli and between disparity defined and ambiguous cylinders, but not between strong luminance biased and ambiguous cylinders (Freeman & Driver, 2006; Grossmann & Dobbins, 2003). This has led to the suggestion of a visibility constraint on the occurrence of perceptual coupling, stating that both sides of a context cylinder need to be present to effectively couple rotation directions (Freeman & Driver, 2006). This visibility constraint in turn challenges the assumption that the two surfaces of a kinetic depth stimulus are represented in a co-dependent, mutually antagonistic way (Andersen & Bradley, 1998; Klink et al., 2008b; Li & Kingdom, 1999; Nawrot & Blake, 1991; Treue et al., 1995). In the current study we consider an alternative explanation that is based on a general mechanism by which the brain could process partially occluded visual objects.

Our findings suggest that perceptual coupling can occur with single context surfaces but that its effectiveness (or lack thereof) depends on the neural mechanisms of the coupling process. The general extrapolation of spatially separated visual information into a globally consistent percept is known as spatial facilitation. Visual completion is a special case of spatial facilitation in which a single object or surface is perceived while it is only defined by spatially separated chunks of visual information. Completion is termed modal when illusory contours or surfaces are perceived in the foreground and amodal when it leads to the impression of an object or surface that is partially occluded or seen through an aperture (e.g. Anderson et al., 2002; Kanizsa, 1979) (Figure 1c). Even though there is a lively discussion about the extent to which modal and amodal facilitation share a common mechanism (e.g. Bakin, Nakayama, & Gilbert, 2000; Hegdé, Fang, Murray, & Kersten, 2008; Murray, Foxe, Javitt, & Foxe, 2004; Rauschenberger, Liu, Slotnick, & Yantis, 2006; Weigelt, Singer, & Muckli, 2007), it is clear that they both involve the binding of spatially separated visual information. Ecological optics (Gibson, 1950) suggests that occlusion may be a more generally occurring feature than camouflage and amodal spatial binding of visual information (in far depth) should thus be more efficient than modal binding (in near depth). This idea is consistent with the finding that vernier shift discrimination is more accurate for amodally completed gratings than for modally completed ones (Anderson et al., 2002), more accurate face recognition in amodal vs. modal displays (Nakayama et al., 1989), and the demonstration of amodal, not modal, continuation of visual motion behind an occluder (van der Smagt & Stoner, 2008). 
The model and experimental data that we present in this manuscript suggest that a similar amodal spatial facilitation mechanism may be responsible for the perceptual coupling and resulting disambiguation of kinetic depth stimuli.

In the light of our current results, the visibility constraint that was put forward by Freeman and Driver (2006) should be disregarded. Perceptual coupling of kinetic depth stimuli does not necessarily need two surfaces; a single far side surface suffices. In fact, the near side surfaces show very little if any coupling. When a strong luminance gradient is used to bias a kinetic depth cylinder towards a specific interpretation the amount of signal constituting the far side will be relatively small or absent, hence the failure of perceptual coupling. If an additional occluder is positioned between the observer and the cylinders perceptual coupling can also occur between near side surfaces suggesting that the near/far depth assignment results from a combination of absolute and relative depth that is particularly suitable to resolve occlusion in the visual scene. While our assumptions about the functional coupling mechanism are based on amodal visual completion it should be noted that perceptual coupling cannot be attributed to amodal surface completion (Fang & He, 2004). Whereas this explanation would be feasible for coaxial cylinders, it cannot explain why we find similar effects for coaxial spheres. The amodal information sharing is apparently occurring between pools of neurons tuned for combinations of depth and motion direction suggesting a more general mechanism by which neurons tuned to the same depth plane share sensory information. This idea is consistent with the recent finding that depth information propagates between surfaces only when these surfaces are located in the far depth plane (Georgeson et al., 2008).

Another interesting aspect of our experimental findings that is confirmed by model simulations is the existence of coupling against a stimulus bias ( 2). Whereas the existence of a luminance gradient or binocular disparity is the only spatial context from which visual information can be inferred, there is additional temporal context in the presentation paradigm. Our use of the percept-choice paradigm not only has the advantage of being a sensitive measure to detect small imbalances in the activity of underlying neural populations (Noest et al., 2007), it is also a paradigm in which the inter-stimulus interval duration is crucial for the probability at which perception switches on consecutive trials. Because we use relatively long inter-stimulus intervals (1.5 seconds) we see an expected high level of perceptual stabilization when there is only a single stimulus (Klink et al., 2008a; Noest et al., 2007). In the two-stimulus condition with the biased stimulus stochastically changing direction there are thus two sources of contextual information leading to opposite conclusions. Whereas the spatial context signals percept changes, the temporal context signals percept stabilization. As can be seen in our results of 2, the relative strengths of the individual contexts ultimately determine conscious perception whereas perceptual coupling is high for all cases. This means that the information sharing mechanism we introduce is indeed bi-directional rather than only from the biased to the ambiguous cylinder.

The proposed connectivity between pools of neurons coding for similar sensory features at different spatial locations could be established in different ways. The most likely modes of connectivity would be 1) overlapping receptive fields of neurons in the two pools share information through their adaptation states, 2) direct single synapse connections between neurons in the two pools, or 3) an attenuating dilation of neural signal through ‘horizontal connections’ (Roelfsema, 2006) over a multitude of neurons covering the gap between stimuli (Ullman, 1979; van der Smagt & Stoner, 2008; Watanabe & Cole, 1995). Our use of the percept-choice paradigm allowed us to perform a specific control experiment (Experiment 3) that tested whether the information sharing mechanism occurs on the slow timescale of neuronal adaptation or on a fast timescale suggesting a direct activity-driven connection. The results demonstrate that perceptual coupling does not occur when two cylinders are presented with a temporal offset that causes them to be on the screen only during each other's interstimulus intervals. This suggests that the coupling mechanism does not occur on the slow adaptation timescale and should thus result from fast activity-driven lateral connections.

Whereas visual cortex is predominantly vertically organized in columns, horizontal connections with a length up to several millimeters have been demonstrated to connect similarly tuned clusters of neurons (Gilbert & Wiesel, 1979, 1983, 1989; Livingstone & Hubel, 1984; Malach, Schirman, Harel, Tootell, & Malonek, 1997; Martin & Whitteridge, 1984; Rockland & Lund, 1983). These connections are excitatory and the longer ones connect neurons with well-separated receptive fields (Ts'o, Gilbert, & Wiesel, 1986). The number of horizontal connections decreases with increasing distance between connected clusters (Ts'o et al., 1986), which could explain why the proportion of perceptual coupling declines with increasing distance between the stimuli. Recently, lateral connections were discovered in the middle temporal area (MT) of the rhesus macaque (Ahmed et al., 2008). In MT, both depth and motion information are represented (Bradley, Qian, & Andersen, 1995; DeAngelis, Cumming, & Newsome, 1998; Maunsell & Van Essen, 1983; Nadler, Angelaki, & DeAngelis, 2008) and responses are modulated by the three-dimensional structure of spatial context (Duncan, Albright, & Stoner, 2000). Lateral connections between similarly tuned clusters of neurons in MT would be an interesting candidate for our amodal spatial facilitation of kinetic depth stimuli. We are not aware of any existing studies looking into the specific distribution of lateral connections based on the depth selectivity of the neurons they are connecting, but our experiments suggest that if lateral connections are responsible for the perceptual coupling of SFM stimuli, the connections between ‘far-tuned’ neurons should be either stronger or more numerous than those between ‘near-tuned’ neurons.

The decrease in proportion of perceptual coupling with increasing distance between the stimuli is however also consistent with an attenuating dilation of neural signal over multiple cells ‘covering the gap.’ For orientation perception, cells in monkey primary visual cortex have been found that respond specifically to an invisible line segment only if it could be inferred from amodal completion, not when disparity information defined modal completion (Sugita, 1999). Cells in area MT or MST (medial superior temporal) could form such a bridging mechanism, either directly or via feedback from posterior parietal cortex where neural correlates of occluded motion have been demonstrated (Assad & Maunsell, 1995). In the absence of direct sensory stimulation these ‘bridge-neurons’ will not give rise to any percept, but their information-transporting role may cause adaptation that could perhaps be visualized using a subsequent test-stimulus on the location of the gap. A first hint that this might work can be found in a study by Fang and He (2004) that demonstrates a small (probably non-significant) adaptation effect in the non-stimulated gap between two co-rotating disparity defined cylinders (the yellow bars in their Figure 2b). Future experiments specifically designed to unravel the nature of the amodal information-sharing connectivity may be more successful in distinguishing between the two possible mechanisms.

Our last experiment demonstrated that the spatial facilitation mechanism is neither based purely on information about absolute depth (behind or in front of fixation), nor solely on the relative depth of the surfaces constituting the cylinders (front side vs. backside), but rather on a mixture of the two. Whereas this seems to be an excellent functional approach to handling occlusion situations (like occlusion, spatial facilitation occurs at any depth plane that is not nearest to the observer), it somewhat complicates the physiological interpretation. The brain is known to exhibit neural substrates for both absolute and relative disparity (for a review see Roe et al., 2007 or Parker, 2007), but the mechanisms by which these sources of depth information are combined are currently far from clear. As a result, our neural network model is likely to be a serious oversimplification of the actual process of spatial facilitation, but it provides a useful first handle on how the brain uses spatially separated information in the perception of partially occluded objects. It should, however, be kept in mind that the proposed distinction between ‘far’ and ‘near’ tuned neurons should apparently be based on a mixture of absolute depth and depth relative to other parts of the visual scene.

In conclusion, our current findings suggest that the perceptual coupling of bistable stimuli reflects a more common mechanism by which the brain deals with occlusion. Facilitatory connections may exist between similarly tuned far depth neurons, establishing an information sharing mechanism that resolves local ambiguities by integrating spatially separated global information.

Appendix A

Depth cues in a single kinetic depth cylinder

This experiment tests whether there are any qualitative differences in the way that the perception of kinetic depth cylinders is influenced by disparity defined or luminance defined depth cues. A single cylinder was presented at the center of the screen (see Methods for more details) and 7 observers (including 2 authors) indicated the perceived direction of the near/front surface of the cylinder by pressing a button on the keyboard. Any stimulus bias (disparity or luminance) had a 40% probability of changing its direction on consecutive presentations. The results are presented in Figure A1 and demonstrate that the different depth cues have more or less similar qualitative effects. A quantitative comparison is difficult, even if the two are plotted as ‘fraction of full bias.’ First of all, the full bias for luminance depends on the monitor used for displaying the stimuli, and secondly, it is unclear how luminance gradients would compare to ‘realistic disparity.’ When stimuli are fully ambiguous (fraction of bias is zero), our experiments replicate previous findings of perceptual stabilization (Klink et al., 2008a; Leopold et al., 2002; Maier et al., 2003; Noest et al., 2007) (Figure A1b). When a depth cue is introduced, it biases the stimulus towards one particular perceptual interpretation. As these depth cues get stronger, the stimuli are perceived consistently with the bias on a larger proportion of the trials (ANOVA: Fdisp(4,30) = 26.14, pdisp < 0.001; Flum(4,30) = 8.57, plum < 0.001) (Figure A1a). Because the direction of the bias has an alternation probability of 40%, the proportion of perceptual stabilization decreases in accordance with the increasing veridicality (ANOVA: Fdisp(4,30) = 10.57, pdisp < 0.001; Flum(4,30) = 5.16, plum < 0.003) (Figure A1b). Both depth cues reach high proportions of veridical perception and are thus effective determinants of perceptual interpretation.
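The relation between veridicality and stabilization can be illustrated with a short sketch. It is not part of the reported analysis: it assumes, purely for illustration, that on every presentation the percept follows the current bias with a fixed probability `v`, independently of the previous percept. The function names and the `v` parameter are hypothetical; only the 40% alternation probability comes from the text.

```python
import random


def repeat_probability(v, p_switch=0.4):
    """Probability that two consecutive percepts are identical.

    Illustrative assumption: each percept follows the current bias with
    probability v, independently across presentations, and the bias flips
    direction with probability p_switch between presentations.
    """
    same_if_bias_unchanged = v * v + (1 - v) * (1 - v)  # both follow, or both oppose
    same_if_bias_flipped = 2 * v * (1 - v)              # exactly one follows its bias
    return (1 - p_switch) * same_if_bias_unchanged + p_switch * same_if_bias_flipped


def bias_sequence(n_trials, p_switch=0.4, seed=0):
    """Bias directions (+1 or -1) that flip with probability p_switch."""
    rng = random.Random(seed)
    direction = rng.choice([-1, 1])
    sequence = [direction]
    for _ in range(n_trials - 1):
        if rng.random() < p_switch:
            direction = -direction
        sequence.append(direction)
    return sequence
```

Under this assumption a fully veridical observer (`v = 1`) would repeat the previous percept on 60% of presentations, so stabilization is expected to fall towards that value as the cues get stronger. The independence assumption does not hold for weak or absent biases, where percepts stabilize well above the 50% it predicts.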

a) The fraction of trials on which observers (n = 7) perceived the cylinder to rotate in agreement with the bias as a function of bias strength for both disparity and luminance depth cues. The point indicated with the open square is a theoretical starting point, since a stimulus cannot be perceived according to a bias if there is no bias. The effectiveness of both depth cues increases when the biases get larger and both reach high veridicality values. b) In the absence of stimulus biases we see clear perceptual stabilization. When the depth cues become stronger and observers start to perceive the stimulus in accordance with the bias more often (see panel a), stabilization probabilities naturally decrease, since our stimulus biases changed direction with a probability of 40%. Error bars in both panels represent S.E.M.

Figure A1

Appendix B

This experiment investigated the occurrence of perceptual coupling between two spatially separated kinetic depth cylinders rotating about a common axis. It is essentially a replication of the work of Freeman and Driver (2006), but with a different experimental paradigm. In our percept-choice paradigm, stimuli are presented in sequences separated by short blank intervals, whereas Freeman and Driver (2006) presented their stimuli for extended periods of 30 or 40 seconds. In the current experiment two coaxial cylinders were presented, one on each side of a fixation cross (Figure 1b). They were separated by a gap of 0.5 degrees, measured between their closest edges. The rightmost cylinder was always completely ambiguous whereas the left one could have a disparity or luminance bias. Seven observers (including two authors) indicated the perceived direction of the near/front surface of both cylinders by pressing buttons on the keyboard. Any stimulus bias (disparity or luminance) again had a 40% probability of changing its direction on consecutive presentations.

Figure B1a shows the proportion of trials in which the two stimuli were perceived to rotate in the same direction as a function of the depth cue strength. Our findings confirm those of Freeman and Driver (2006). Strong coupling occurs for all values of disparity bias (solid line; no significant differences across disparity cue strengths, ANOVA: F(4,30) = 0.66, p = 0.63). For luminance depth cues there is also clear coupling, except for full luminance gradients (dotted line, ANOVA: F(4,30) = 8.10, p < 0.001). For full depth cue biases the difference between luminance and disparity is highly significant (t-test, p < 0.01), replicating previous findings by Freeman and Driver (2006). The effectiveness of the depth cues in determining perception increases when the cues get stronger (ANOVA: Fdisp(4,30) = 6.19, pdisp < 0.001; Flum(4,30) = 3.79, plum < 0.02) and Figure B1b demonstrates that when veridicality increases, the proportion of stimulus coupling also increases (ANOVA: Fdisp(4,30) = 6.33, pdisp < 0.001; Flum(4,30) = 3.22, plum < 0.03). Stimulus coupling is defined as the fraction of the trials with perceptual coupling in which the rotation direction is consistent with the specified bias direction. Interestingly, for small depth biases the amount of perceptual coupling is very high (Figure B1a) while the proportion of stimulus coupling remains relatively low (Figure B1b), indicating a substantial proportion of trials in which the cylinders were jointly perceived to rotate against the bias.
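The two measures defined above can be sketched as follows. This is a minimal illustration, not the analysis code used for Figure B1; the trial format (a tuple of left percept, right percept and bias direction, each coded as +1 or -1) and the function name are assumptions made for the example.

```python
def coupling_fractions(trials):
    """Return (perceptual_coupling, stimulus_coupling) for a list of trials.

    Each trial is a hypothetical tuple (left_percept, right_percept, bias),
    with every entry coded as +1 or -1 for the two rotation directions.
    """
    trials = list(trials)
    # Perceptual coupling: both cylinders perceived to rotate in the same direction.
    coupled = [t for t in trials if t[0] == t[1]]
    perceptual = len(coupled) / len(trials) if trials else float("nan")
    # Stimulus coupling: among coupled trials, the shared direction follows the bias.
    with_bias = [t for t in coupled if t[0] == t[2]]
    stimulus = len(with_bias) / len(coupled) if coupled else float("nan")
    return perceptual, stimulus


example = [(1, 1, 1), (1, 1, -1), (-1, -1, -1), (1, -1, 1)]
perceptual, stimulus = coupling_fractions(example)
```

In this toy data set three of the four trials are coupled and two of those three follow the bias, which mirrors the pattern described for small biases: perceptual coupling can be high while stimulus coupling stays clearly below one.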

a) The fraction of perceptual coupling as a function of bias strength for seven observers. High fractions of perceptual coupling are present for both luminance and disparity biases over almost the entire range of bias strengths. The significant difference between disparity and luminance cues occurs with full biases. Here there is still coupling between an ambiguous and a disparity defined cylinder but not between an ambiguous and a luminance defined cylinder (gray shaded area). b) The fraction of stimulus coupling as a function of bias strength. The fraction of stimulus coupling is the number of trials on which stimuli were perceptually coupled and consistent with the bias direction, divided by the total number of perceptually coupled trials. With small biases the fraction of stimulus coupling is well below one, meaning that on a substantial number of trials the stimuli were perceptually coupled but rotated against the bias. When the bias gets stronger, the fraction of stimulus coupling also increases. Error bars in both panels represent S.E.M. The point indicated with an open square is a theoretical starting point in the absence of biases.

Figure B1

This work was supported by a VIDI grant from the Netherlands Organization for Scientific Research (NWO) and a High Potential grant from Utrecht University, both awarded to RvW. The authors thank Maarten van der Smagt for valuable discussions and comments on earlier versions of this manuscript.

a) Schematic representation of a kinetic depth cylinder stimulus. The spatial distribution and speed profile of the dots create the vivid impression of a three-dimensional cylinder rotating around a vertical axis. Without explicit depth cues the rotation direction is ambiguous and bistable. The axis drawn here was not present in the actual stimulus. b) Two coaxially presented stimuli have a strong tendency to be perceived as rotating in the same direction. c) Examples of modal and amodal completion with Kanizsa triangles (Kanizsa, 1979). In the top image, a white triangle appears to float in front of black circles. The illusory triangle surface is constructed through modal completion. The lower image's white triangle is perceived as seen through a set of apertures in a white ‘foreground’ (amodal completion) while the black shapes are perceived as part of an occluded black ‘background.’ d) Amodal spatial facilitation can resolve local ambiguities. An image of an occluded Schröder's staircase, viewed through three apertures. The image in the middle aperture has ambiguous depth information whereas the left and right are disambiguated by contextual information. If the middle aperture is combined with only one of the two flanking apertures, amodal facilitation disambiguates the depth structure in the middle aperture.

Figure 1

Simulations of the spatial facilitation model (Figure 2c). Colored lines represent the simulated response of the four neural populations and colored shading represents percepts, inferred from the neural responses via a winner-take-all mechanism (see text). a) Perceptual coupling between two ambiguous cylinders. During the first few presentations the cylinders are individually stabilized, but later they couple and stay coupled. The moment at which coupling sets in depends on the strength of the spatial facilitation parameter. b) Perceptual coupling between an alternating disparity biased and an ambiguous cylinder. The ambiguous cylinder no longer stabilizes but follows the alternating disparity-defined percept of the biased cylinder. The strength and direction of the depth cue bias is given as a modulation parameter M (see text) and visualized with the green and gray lines that correspond to the green and gray neural populations in the schematic model icon (corresponding to Figure 2a) next to it. c) No perceptual coupling between an alternating luminance biased and an ambiguous cylinder. The dominant percept of the ambiguous cylinder stabilizes while the luminance-defined percept of the biased cylinder alternates. d) Perceptual coupling between a weakly luminance biased and an ambiguous cylinder. The luminance bias alternates direction on consecutive presentations, but is overruled by the perceptual stabilization that couples from the ambiguous to the biased cylinder. Parameters used in the simulation are: α = 5, β = 4/15, τ = 1/50, Xnear = 1, Xfar = 0.75 * Xnear, γD = γM = 5/3, ɛ = 0.1, λfar = 0.4 and λnear = λfar/5. Modulation in b & c: Xnear{0.5–1}, modulation in d: Xnear{0.9–1}.

Figure 3

The fraction of perceptual coupling between ‘halves’ and ambiguous kinetic depth stimuli for five observers for cylinders (a) and spheres (b). ‘Half’ stimuli are defined to be the near or far sides of the full stimulus using either full luminance gradients or full disparity biases. For both types of stimuli, the only case in which the fraction of coupling is significantly larger than chance is when there is a disparity defined far side. In those cases there is also significantly more perceptual coupling than in disparity defined near sides or luminance defined far sides. Error bars represent S.E.M.

Figure 4

a) The influence of gap size on perceptual coupling for four observers. The proportion of perceptual coupling is plotted against the strength of a luminance bias. The proportion of perceptual coupling decreases when luminance biases become too large. If the gap between the two cylinders increases, the drop in perceptual coupling occurs at smaller luminance biases. b) The influence of luminance bias and gap size on the proportion of trials in which the observers perceive the biased cylinder in accordance with the bias. This proportion increases rapidly with stronger biases but is not significantly influenced by gap size. The open square at bias level zero is a theoretical point at chance level since there is no veridical percept here. Error bars in both plots represent S.E.M.

Figure 5

a) Temporal profile of the presentation of the two cylinders in our control experiment. The left cylinder (C1) was disambiguated by stereoscopic disparity; the right cylinder (C2) was ambiguous. Each cylinder was presented alone for 1.0 seconds, separated by 1.5-second intervals during which the other cylinder was presented. b) Significant perceptual coupling with synchronous presentation (gray bar, data from 2) ceases to exist when the stimuli are presented asynchronously (white bar). Error bars represent S.E.M. c) Simulations with our model reproduce the lack of perceptual coupling with asynchronous presentation. The simulation was performed with the same parameters as in Figure 3b, only now the input to the two sets of neuronal populations was asynchronous.

Figure 6

The roles of absolute and relative depth. The fraction of perceptual coupling between disparity defined ‘half’ and complete cylinders and ambiguous cylinders that were either closer to the observer than the plane of fixation (left) or further away than the plane of fixation (right). For the closer set of stimuli the results are comparable to those of Experiment 1 (Figure 4). For the set of stimuli behind fixation the ‘near halves’ of cylinders (rightmost white bar) also cause a significant fraction of perceptual coupling. Error bars represent S.E.M.

Figure 7

The effect of simulated input modulations on the effective input to the neural populations in our model. To account for the preference for single surfaces to be perceived as being near rather than far, we set Xfar = 0.75 * Xnear. Depth cue modulations affecting the different neural populations of the model are denoted as gain factors M1 and M2 (green and gray lines in Figure 3). For simulated luminance manipulations the two populations coding for the same motion direction have the same modulation gains, while for simulated disparity modulations the two populations that code for a consistent cylinder (different depth, opposite directions) receive the same gain factor.

Table 1
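The gain assignment described in the table caption can be sketched as below. This is an illustration under stated assumptions rather than the model code: the population names, the function name and the multiplicative application of the gains to Xnear and Xfar are assumptions, as is the choice of which population pair receives M1 and which M2. Only Xfar = 0.75 * Xnear and the pairing rules are taken from the caption.

```python
X_NEAR = 1.0
X_FAR = 0.75 * X_NEAR  # from the caption: far input is 0.75 times near input


def effective_inputs(cue, m1, m2, x_near=X_NEAR, x_far=X_FAR):
    """Effective input to four populations (depth x motion direction).

    Assumption: gains multiply the baseline input of each population.
    """
    base = {
        "near_left": x_near,
        "near_right": x_near,
        "far_left": x_far,
        "far_right": x_far,
    }
    if cue == "luminance":
        # Same motion direction, regardless of depth, shares a gain.
        gains = {"near_left": m1, "far_left": m1, "near_right": m2, "far_right": m2}
    elif cue == "disparity":
        # A consistent cylinder (different depth, opposite directions) shares a gain.
        gains = {"near_left": m1, "far_right": m1, "near_right": m2, "far_left": m2}
    else:
        raise ValueError(f"unknown cue: {cue!r}")
    return {name: gains[name] * base[name] for name in base}


disparity_inputs = effective_inputs("disparity", 1.0, 0.5)
luminance_inputs = effective_inputs("luminance", 1.0, 0.5)
```

With M1 = 1 and M2 = 0.5, the disparity case favors one complete cylinder interpretation (a near surface moving one way and a far surface moving the other), whereas the luminance case favors one motion direction at both depths.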
