Detection of apparent motion in random dot patterns requires correlation across time and space. It has been difficult to study the temporal requirements for the correlation step because motion detection also depends on temporal filtering preceding correlation and on integration at the next levels. To specifically study tuning for temporal interval in the correlation step, we performed an experiment in which prefiltering and postintegration were held constant and in which we used a motion stimulus containing coherent motion for a single interval value only. The stimulus consisted of a sparse random dot pattern in which each dot was presented in two frames only, separated by a specified interval. On each frame, half of the dots were refreshed and the other half was a displaced reincarnation of the pattern generated one or several frames earlier. Motion energy statistics in such a stimulus do not vary from frame to frame, and the directional bias in spatiotemporal correlations is similar for different interval settings. We measured coherence thresholds for left–right direction discrimination by varying motion coherence levels in a Quest staircase procedure, as a function of both step size and interval. Results show that highest sensitivity was found for an interval of 17–42 ms, irrespective of viewing distance. The falloff at longer intervals was much sharper than previously described. Tuning for temporal interval was largely, but not completely, independent of step size. The optimal temporal interval slightly decreased with increasing step size. Similarly, the optimal step size decreased with increasing temporal interval.

Introduction

Two light sparks, placed closely together, evoke the percept of motion when presented sequentially with a small temporal interval (Exner, 1875, 1888). Motion can be perceived for spatial offsets so small that the two sources cannot be spatially resolved. This suggests that motion sensitivity is a fundamental property, based on low-level motion detectors. For motion detection in flies, Reichardt (1961) proposed such a low-level mechanism, in which signals from different receptors are correlated with a time delay for one of the signals. Such a motion detector is tuned for a specific combination of spatial span and temporal delay. Based on Reichardt's proposal, human motion detection has also been modeled by a front-end array of bilocal motion sensors, similar to the Reichardt motion detector (Reichardt, 1961; van Santen & Sperling, 1984, 1985). Equivalent descriptions have also been developed in terms of motion-energy filtering (Adelson & Bergen, 1985; Perrone, 2004; Watson & Ahumada, 1985) or spatiotemporal gradient detection (Fennema & Thompson, 1979; Johnston, McOwan, & Buxton, 1992).

Adopting the framework of bilocal motion sensors naturally divides the process of motion detection into three stages: a spatiotemporal operation generating the motion signal, which is preceded by spatiotemporal luminance filtering and followed by spatial and temporal integration of motion signals. We will refer to the stage generating the motion signals as the “correlation” stage. We use the term in a general sense, without claiming that the operations involved match those of a Reichardt detector (Emerson, Bergen, & Adelson, 1992). In a physiological context, this stage would be the first level at which directional selectivity arises. In primates, in the geniculate–striate pathway, directional selectivity first arises in V1 (De Valois, Cottaris, Mahon, Elfar, & Wilson, 2000; Mikami, Newsome, & Wurtz, 1986; Movshon & Newsome, 1996; Perrone, 2004; Saul, Carras, & Humphrey, 2005; Snowden, Treue, Erickson, & Andersen, 1991). Directionally selective V1 neurons, therefore, implement the presumed correlation stage. Consequently, preprocessing would comprise all spatial and temporal filtering preceding this stage, including the retina and LGN. Most important, these stages transform the stimulus into a band-pass-filtered image in both spatial and temporal domains. Motion signals presumably arise at multiple locations in such receptive fields, which, therefore, also comprise a first step of spatial and temporal integration (Livingstone, Pack, & Born, 2001; Movshon & Newsome, 1996). Additional integration of motion signals is implemented at higher cortical stages, including areas MT and MST (Britten & Heuer, 1999; Dubner & Zeki, 1971; Duffy & Wurtz, 1991a, 1991b; Heuer & Britten, 2004; Livingstone et al., 2001).

Numerous studies used equivalent model frameworks to psychophysically explore the spatiotemporal requirements for coherence detection in random dot displays (Fredericksen, Verstraten, & van de Grind, 1993, 1994a, 1994b, 1994c; Morgan & Ward, 1980; van de Grind, Koenderink, & van Doorn, 1986; van den Berg & van de Grind, 1989; van Doorn & Koenderink, 1982a, 1982b, 1984). A primary aim in many studies was to examine the requirements for spatiotemporal correlation, irrespective of pre- and postprocessing. Especially in the temporal domain, this has been difficult because stimulus manipulations should not affect the temporal frequency content, to rule out prefiltering effects, or the modulation of motion energy from frame to frame, to prevent temporal integration effects. Moreover, measurements of complete tuning curves, rather than just the spatial and temporal limits, require manipulation of motion coherence without affecting other relevant parameters.

To measure complete tuning curves, van Doorn and Koenderink (1982a, 1982b) introduced a luminance signal-to-noise ratio (LSNR) for manipulating the motion strength in moving random pixel arrays. LSNR thresholds characterized motion sensitivity irrespective of the parameters specifying the apparent motion stimulus (i.e., step size and temporal interval). To specifically address the low-level correlation requirements for coherent motion detection, van Doorn and Koenderink used spatial and temporal alternations of patterns moving in opposite direction. They observed and measured critical alternation frequencies, at which directional sensitivity was sharply reduced. For lower temporal frequencies, the two patterns could be seen to alternate in time. For high frequencies, the two patterns no longer segregated in time but, instead, fused into transparency. At intermediate alternation rates, motion was irresolvable. They concluded that these observations were in line with a bilocal motion detection device, with the critical alternation frequency corresponding to the preferred temporal delay of motion detectors. At this interval, they reasoned that no motion could be detected because motion sensors failed to establish the correlation. At longer intervals, motion was detectable within a single interval, and at shorter intervals, motion was detectable through correlation over multiple alternations. Although these observations are in line with bilocal motion detection, one cannot rule out that varying the spatial or temporal frequency of alternations also affected spatial and temporal integration of motion signals.

Fredericksen et al. (1993) combined the manipulation of motion strength, using the LSNR, with a single-step dot lifetime paradigm for controlling the step size and interval content of a motion stimulus. It is a continuous version of previously used two-frame motion stimuli (Baker & Braddick, 1985; Casco et al., 1989; Morgan et al., 1997; Snowden & Braddick, 1989). Although the stimulus of Fredericksen et al. was suitable for measuring spatial tuning, it was inadequate for manipulating motion interval durations without affecting the temporal frequency content of the patterns. The reason is that for intervals longer than the minimal value, patterns remained stationary during the motion interval. Consequently, the temporal frequency content varied with interval duration. Moreover, the temporal integration of motion signals was also likely affected because displacements were presented at different temporal frequencies. In addition, and most important, because patterns remained stationary during the interval between motion steps, the motion comprised a wide range of temporal correlations and, hence, the stimulus failed to isolate unique combinations of step size and interval for correlation. The absence of a falloff at large interval durations in their results contrasts with earlier findings by Morgan and Ward (1980) and might have been due to the presence of correlation across short intervals in all of their stimuli.

Morgan and Ward's (1980) stimulus, on the other hand, effectively confined the motion energy to a single combination of step size and interval. This was achieved by showing corresponding dots for two moments only, separated by a predefined interval. Using an oscilloscope display on which dots were drawn successively, they had full control over temporal interval and were able to determine both minimal and maximal temporal intervals supporting coherence detection. Contrary to findings by Fredericksen et al. (1993), they found spatial and temporal limits to be independent of step size, as well as a sharp limit at large temporal intervals. Numerous methodological differences, however, prevent a direct comparison of these results to those obtained by others: Dot densities were low, stimulus durations were highly variable, and measurements consisted of determining reaction times. Most important, rather than spatial and temporal tuning curves, only the upper and lower limits were determined.

To clarify the discrepancies between previous results, we used an improved version of the stimulus used by Morgan and Ward (1980) and combined it with motion coherence threshold measurements to obtain full spatial and temporal tuning curves. The stimulus was constructed by generating sparse random dot patterns on every frame of a computer monitor and by showing a shifted version of this pattern once again after a specified time interval. Therefore, each frame contained a newly generated pattern and a displaced reincarnation of a previously shown pattern. The interval between corresponding dot patterns could be chosen freely, without affecting the number of steps per second, steps in total, and temporal frequency content. Motion information, that is, a directional imbalance for spatiotemporal correlations present in the stimulus, was confined to a single temporal interval only, and the stimulus contained constant motion energy statistics on every frame of the monitor. Moreover, measurements for different intervals were directly comparable because the level of spatiotemporal correlation at the specified motion parameters and overall spatiotemporal energy content were invariable with interval duration.

Methods

Random dot patterns were generated in real time on a Macintosh G4 personal computer and displayed on a 19-in. Sony monitor (Multiscan E400) at a resolution of 800 × 600 pixels and at a frame rate of 120 Hz. At the standard viewing distance of 125 cm, one monitor pixel corresponded to 0.02° × 0.02°. In all experiments, the patterns consisted of black dots (single monitor pixels) on a white background (mean luminance, 37 cd/m²) displayed in a window of 400 × 400 pixels. The standard number of dots was 5,000, yielding a dot density of about 80 dots/deg². In two control experiments, we tested the effect of different dot densities and different viewing distances on the step size and interval tuning for an optimal interval and step size, respectively.
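The display parameters above fix the units used throughout the paper. The following is a minimal sketch of the conversions, assuming only the values stated in this paragraph (120 Hz, 0.02° per pixel at 125 cm, a 400 × 400 pixel window, 5,000 dots); the function names are illustrative and are not taken from the original software.

```python
# Unit conversions implied by the display parameters in the Methods.
# All function names are hypothetical helpers for illustration.

FRAME_RATE_HZ = 120.0      # monitor frame rate
DEG_PER_PIXEL = 0.02       # visual angle of one pixel at 125 cm
WINDOW_PIXELS = 400        # side of the square stimulus window
N_DOTS = 5000              # standard number of dots


def frame_duration_ms(frame_rate_hz=FRAME_RATE_HZ):
    """Duration of one monitor frame in milliseconds."""
    return 1000.0 / frame_rate_hz


def interval_ms(n_frames, frame_rate_hz=FRAME_RATE_HZ):
    """Temporal interval in milliseconds for an interval given in frames."""
    return n_frames * frame_duration_ms(frame_rate_hz)


def step_arcmin(n_pixels, deg_per_pixel=DEG_PER_PIXEL):
    """Step size in arc minutes for a displacement given in pixels."""
    return n_pixels * deg_per_pixel * 60.0


def dot_density_per_deg2(n_dots=N_DOTS, window_pixels=WINDOW_PIXELS,
                         deg_per_pixel=DEG_PER_PIXEL):
    """Dot density in dots per square degree of visual angle."""
    side_deg = window_pixels * deg_per_pixel
    return n_dots / (side_deg ** 2)


if __name__ == "__main__":
    print(frame_duration_ms())       # one frame, in ms
    print(interval_ms(4))            # a 4-frame interval, in ms
    print(step_arcmin(6))            # a 6-pixel step, in arcmin
    print(dot_density_per_deg2())    # dots/deg^2 for the standard display
```

With these values the window spans 8° × 8°, so 5,000 dots give a density slightly below the rounded figure of 80 dots/deg² quoted in the text.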

Consistent spatiotemporal correlations in the stimulus were limited to a single combination of step size and temporal interval. This was achieved by plotting a dot only twice, separated in time by the required temporal interval. An apparently continuous version of such a motion stimulus was constructed by newly generating 50% of the dots on each frame and by combining this pattern with a displaced version of a pattern presented one or several frames earlier. Each frame thus contained a combination of random noise (refreshed dots) and coherent motion. Step size (number of pixels) and temporal interval (number of frames) between corresponding dot patterns were varied in the experiments to measure the full spatiotemporal tuning profile.

Figure 1 illustrates the type of motion used in the experiments, in the form of space–time diagrams. Dot positions for a single row of the display are shown along the X-axis. The Y-axis represents time, with each line of dots corresponding to a monitor frame. The left-hand column shows examples of dynamic noise and 100% coherent motion, for different combinations of step size and interval. A coherence value of 100% in these stimuli corresponds to coherent displacement of all relocated dots in combination with randomly refreshing the other half of the dots. All combinations (1/1, 2/2, 4/4, and 8/8 pixels/frames) represent the same mean velocity. Coherent motion shows up as an oriented pattern in such space–time plots, with the orientation representing the velocity. In comparison, apparent motion with unlimited dot lifetime would give a noiseless, oriented pattern, with correlations across all time steps in the display. In contrast, a horizontal line in Figure 1 correlates only with a line preceding or following it by the specified interval. There is no correlation bias across smaller or larger intervals. What is important is that the directional correlation bias in the stimulus is the same on every frame of the monitor and does not vary with the specified temporal interval for displacements. As noted previously by Morgan and Ward (1980), apparent motion generated in this way looks surprisingly continuous, irrespective of the specified temporal interval. Observers have the impression of a rigidly moving pattern within dynamic noise rather than a dynamically and discretely changing pattern.

Space–time plots for two-frame, single-step motion stimuli. The left-hand column shows plots for the maximal coherence value (100%), whereas the right-hand column shows the same motion step size and interval parameters for a coherence level of 20%. The top row shows dynamic noise: The pattern is randomly refreshed each time step. The additional four rows show settings for combinations of increasing step size and interval. In all four cases, the mean velocity is equal: In the first of these four rows, the pattern is displaced 1 pixel to the right with an interval of 1 frame. Step size and interval increase by a factor of 2 from one row to the next.

Figure 1


To measure sensitivity for a specific combination of step size and interval, we determined motion coherence thresholds in a two-alternative, forced-choice paradigm. Coherence levels were varied by manipulating the percentage of dots taking part in the coherent motion (Britten, Newsome, Shadlen, Celebrini, & Movshon, 1996). The other dots were randomly displaced in the stimulus window. The right-hand column in Figure 1 shows the effect of reducing the coherence level to 20%. Decreasing the coherence effectively reduces the directional bias in the stimulus.
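The stimulus construction and the coherence manipulation described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names, the wrap-around at the window edge, and the noise filling of the first few frames (before any earlier pattern exists) are assumptions made to keep the example self-contained.

```python
import random


def random_dots(rng, n, size):
    """Draw n dot positions uniformly within a size x size pixel window."""
    return [(rng.randrange(size), rng.randrange(size)) for _ in range(n)]


def make_stimulus(n_frames, interval, step, coherence,
                  n_dots=5000, size=400, direction=1, seed=None):
    """Return a list of frames; each frame is a list of (x, y) dot positions.

    Half of the dots on every frame are newly generated. The other half
    re-presents the new dots of the frame shown `interval` frames earlier:
    a fraction `coherence` of them is displaced by `step` pixels in
    `direction` (+1 right, -1 left), the rest are relocated at random.
    """
    rng = random.Random(seed)
    half = n_dots // 2
    n_coherent = round(coherence * half)
    fresh_history = []
    frames = []
    for k in range(n_frames):
        fresh = random_dots(rng, half, size)
        fresh_history.append(fresh)
        if k >= interval:
            source = fresh_history[k - interval]
            second = []
            for i, (x, y) in enumerate(source):
                if i < n_coherent:
                    # Coherent dot: second and last presentation, displaced.
                    # Wrapping at the window edge is an assumption.
                    second.append(((x + direction * step) % size, y))
                else:
                    # Noise dot: randomly displaced within the window.
                    second.append((rng.randrange(size), rng.randrange(size)))
        else:
            # No earlier pattern available yet: fill with noise (assumption).
            second = random_dots(rng, half, size)
        frames.append(fresh + second)
    return frames


if __name__ == "__main__":
    # 1 s at 120 Hz, 5-pixel step, 4-frame interval, 20% coherence.
    stim = make_stimulus(n_frames=120, interval=4, step=5, coherence=0.2, seed=0)
    print(len(stim), len(stim[0]))
```

Because every dot appears exactly twice, the only consistent correlation in the sequence is at the chosen combination of step size and interval, and the number of dots and motion steps per frame does not depend on the interval.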

Left–right discrimination thresholds were determined in a Quest staircase procedure (Watson & Pelli, 1983). A trial consisted of a single presentation of 1 s in which the pattern moved either to the left or to the right. Subjects indicated the perceived direction of motion by pressing the left or right arrow key on the keyboard. The staircase determined the coherence value at which observers performed at 85% correct. A single threshold measurement consisted of 40–60 trials (depending on experience of the observer), which was repeated three to five times to determine the mean threshold value and its standard error. All staircases were inspected, and a staircase was discarded and repeated if it had not properly converged within the maximum number of trials.

Observers viewed the stimuli binocularly in a dark room at a viewing distance of 125 cm. They were instructed to steadily fixate a small marker (0.08° × 0.08°), which was centered on the display window. Thirteen observers participated in the experiments. Three (R.B., M.L., and S.S.) were experienced observers in motion and other psychophysical experiments. The other observers (M.S., E.S., B.A., C.C., E.F., F.R., A.J., E.J., R.P., and W.S.) were naive concerning the purpose of the experiment. Observers had full control over the pace of the experiment, starting each trial by a key press. No feedback on the correctness of answers was given. Data collection was started after subjects ran a few test staircases to minimize any learning effects. All observers had normal or corrected-to-normal vision.

Results

Figure 2 shows coherence thresholds for various step sizes as a function of the time interval between motion steps. Each plot shows data for one subject, with step size as parameter in the graph. Coherence thresholds range from about 5%, corresponding to the highest sensitivity, to 100% (invisible). Coherence thresholds may be higher than those obtained using standard noisy motion stimuli. This is due to the substantial amount of noise that was present irrespective of coherence setting. Moreover, the motion stimulus used in this study is more selective in triggering motion detectors that contribute to the motion percept, as biases in spatiotemporal correlation in the stimulus were limited to a single combination of step size and temporal interval. The duration of optimal intervals ranged from about 17 to 42 ms (2–5 frames). Sensitivity sharply dropped toward higher interval values. None of our observers could resolve an interval of 133 ms (16 frames). Thresholds also increased for lower intervals, but this falloff was less pronounced. All observers performed fairly well at the minimal temporal interval (8.3 ms) in our setup. At the shortest interval, our stimulus is comparable to the one used by Fredericksen et al. (1993). As expected, the perceptual behavior at short intervals is comparable to their results.

Coherence thresholds for different step sizes plotted as a function of temporal interval. Coherence thresholds correspond to the percentage of coherently moving dots in a Quest staircase procedure at 85% correct responses. Lower thresholds correspond to higher sensitivity; that is, sensitivity is inversely proportional to the coherence threshold. Step sizes are given in arc minutes, and intervals are expressed in milliseconds. A single monitor pixel measured 1.2 arcmin and a single monitor frame lasted 8.3 ms. Error bars show the standard error of the mean for each threshold measurement, based on three to five repetitions.

Figure 2


In general, temporal tuning curves had similar shapes for all step sizes. The curves seem shifted up or down, depending on the step size, but the minimum as well as the high and low interval falloffs were fairly similar. For some observers, especially E.S. and, to a lesser extent, S.S., the shapes of the curves indicate interactions between step size and interval tuning. Tuning for small step sizes tended to reach a minimum at higher interval values, whereas for larger step sizes, the minimum shifted toward slightly lower intervals.

To analyze the interactions between step size and interval tuning in more detail, the data in Figure 2 were replotted in the form of contour plots (left-hand column in Figure 3). Coherence thresholds are given in colors ranging from yellow–white (low thresholds, i.e., high sensitivity) to black (high threshold, low sensitivity). Spatiotemporal interactions were quantitatively assessed by comparing a spatiotemporally separable model to models implementing several different interactions between step size tuning and temporal interval tuning. For fitting models to the data, we first converted the coherence thresholds (T) to a sensitivity measure, in which sensitivity (S) was defined on a logarithmic axis:

$$S = 2 - \log_{10}(T), \tag{1}$$

where the value 2 corresponds to the logarithm of 100%, i.e., the maximum coherence threshold for an indiscriminable stimulus. Low coherence thresholds are thus converted into high sensitivity values. Logarithmic values were used to normalize standard deviations, at different performance levels, to more or less the same size.
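As a small worked example of Equation 1, the sketch below converts coherence thresholds (in percent) to log sensitivity and back. The function names are illustrative; only the formula itself comes from the text.

```python
import math


def sensitivity_from_threshold(threshold_percent):
    """Equation 1: S = 2 - log10(T), with T the coherence threshold in percent."""
    return 2.0 - math.log10(threshold_percent)


def threshold_from_sensitivity(sensitivity):
    """Inverse of Equation 1: T = 10 ** (2 - S), in percent coherence."""
    return 10.0 ** (2.0 - sensitivity)


if __name__ == "__main__":
    for t in (100.0, 50.0, 10.0, 5.0):
        print(t, sensitivity_from_threshold(t))
```

A threshold of 100% (no discrimination) maps to a sensitivity of 0, and each tenfold reduction in threshold adds one unit of sensitivity.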

Contour plots of motion coherence thresholds as a function of step size and interval duration. The first column shows measured data, with each row representing a different observer. The data are the same as those in Figure 2. The second and third columns show fits of a spatiotemporally separable model and a model including third-order (cubic) variations in optimal step size and optimal interval. The fourth and fifth columns show the fit errors for both models. Coherence thresholds are color-scale coded (horizontal bar above the first three columns), where low detection thresholds (high sensitivity) are white/yellow and high thresholds (low sensitivity) are black. Performance below 85% correct responses at the highest coherence level (no sensitivity) was set to a coherence threshold of 100%. The fit error in the fourth and fifth columns is also color-scale coded, where blue shades indicate overestimation of the coherence thresholds and red shades indicate underestimation. The intersections between horizontal and vertical gray or black dotted lines indicate the combinations of step size and interval that were presented (first column) or the combination used for plotting the fit and fit error. The fitting procedure is described in the text.

Figure 3


On the basis of physiological results by Nover, Anderson, and DeAngelis (2005), we used log-normal functions to describe the temporal and spatial tuning curves:

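$$f(s) = \frac{1}{\sigma_S\sqrt{2\pi}} \exp\left(-\frac{\left[\log_{10}(s/\mu_S)\right]^2}{2\sigma_S^2}\right), \tag{2}$$

$$g(t) = \frac{1}{\sigma_T\sqrt{2\pi}} \exp\left(-\frac{\left[\log_{10}(t/\mu_T)\right]^2}{2\sigma_T^2}\right), \tag{3}$$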

where f(s) describes tuning as function of step size (s) and g(t) describes tuning as function of interval (t). μS and μT are the optimal values, and σS and σT represent tuning width.

The separable model is given by h(s,t) and is defined by the product of spatial (f(s)) and temporal (g(t)) factors:

$$h(s,t) = A\,f(s)\,g(t), \tag{4}$$

where A represents global amplitude scaling.
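The separable model of Equations 2 to 4 can be written compactly in code. The sketch below assumes that the squared term in the exponent is the square of the base-10 logarithm, as is usual for a log-normal tuning function; the function names and the parameter values in the usage example are illustrative and are not fitted values from this study.

```python
import math


def log_normal(x, mu, sigma):
    """Log-normal tuning (Equations 2 and 3): Gaussian in log10(x / mu)."""
    z = math.log10(x / mu)
    return math.exp(-(z * z) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def separable_sensitivity(s, t, amp, mu_s, sigma_s, mu_t, sigma_t):
    """Equation 4: h(s, t) = A * f(s) * g(t)."""
    return amp * log_normal(s, mu_s, sigma_s) * log_normal(t, mu_t, sigma_t)


if __name__ == "__main__":
    # Illustrative parameters only: optimum at 7.2 arcmin and 33.3 ms.
    params = dict(amp=1.0, mu_s=7.2, sigma_s=0.4, mu_t=33.3, sigma_t=0.3)
    for t_ms in (8.3, 16.7, 33.3, 66.7, 133.3):
        print(t_ms, separable_sensitivity(7.2, t_ms, **params))
```

In a separable model the shape of the temporal tuning curve is the same for every step size; only its height changes. Any shift of the temporal optimum with step size therefore has to be captured by an interaction term.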

To get a maximum likelihood estimate of the model parameters, we used a nonlinear, least squares fitting routine based on the Gauss–Newton method. A Monte Carlo simulation using a nonparametric bootstrapping procedure was used to assess the variability in the fitted parameters. The simulated data sets were based on the average of two randomly chosen values from the three individual measurements, for each combination of step size and interval. Subsequently, we quantified the goodness of fit by the R² value, which quantifies the fraction of the total squared error that is explained by the model. All procedures were implemented in MATLAB, using custom and standard MATLAB functions.
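The resampling step and the goodness-of-fit measure can be illustrated as follows. This is a sketch in Python rather than the MATLAB code used in the study. It assumes that the two values are drawn without replacement and that R² is computed in the usual way as one minus the ratio of residual to total sum of squares; both points are assumptions, and all names are hypothetical.

```python
import random


def bootstrap_dataset(measurements, rng):
    """Build one simulated data set.

    measurements maps each (step_size, interval) condition to the list of
    individual sensitivity measurements. For every condition, two values
    are drawn at random (without replacement, an assumption) and averaged.
    """
    return {cond: sum(rng.sample(values, 2)) / 2.0
            for cond, values in measurements.items()}


def r_squared(observed, predicted):
    """R^2 = 1 - SS_residual / SS_total (standard definition, assumed here)."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_resid = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up sensitivities for two conditions, for illustration only.
    data = {(6, 4): [1.0, 1.2, 1.4], (6, 8): [0.5, 0.6, 0.4]}
    for _ in range(3):
        print(bootstrap_dataset(data, rng))
```

Refitting the model to many such simulated data sets yields a distribution for each fitted parameter, from which its variability can be read off.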

To compare fitted tuning functions to the measured coherence thresholds, the sensitivity data were transformed back into coherence thresholds. The second column in Figure 3 shows the resulting tuning curves for all observers. A comparison to the measurements (first column) shows that the data can be described quite accurately with independent spatial and temporal tuning curves (R² values were .931, .822, .925, .881, and .841 for subjects R.B., S.S., M.L., M.S., and E.S., respectively). Oval-shaped tuning as for R.B. and asymmetric spatial tuning as for observer M.S. are reproduced accurately. Independent step size and temporal interval tuning, however, fails to capture any oblique effects, such as those observed for S.S. The fourth column in Figure 3 quantifies the differences between experimental data and separable tuning. Differences are given as log values. The maximum difference of about 0.6 log units for observer S.S. corresponds to a factor of 4.0 in sensitivity. Although deviations from independence are relatively small, the data do show a general trend. Sensitivities for combinations of large temporal intervals (16–90 ms) and small step sizes, as well as small temporal intervals and larger step sizes (6–30 arcmin), were underestimated. Combinations of large intervals and large steps, as well as small intervals and small steps, were overestimated.

To assess the nature and significance of spatiotemporal interactions, we extended the model with interaction terms. We used an F test to compare the fits of the two models (http://www.graphpad.com/curvefit). It provides a quantitative estimate of the significance of the increase in R² value, given the change in degrees of freedom due to additional parameters. The test calculates the chance (p value) that the more complicated model would fit the data this much better if the simpler model were, in fact, correct. p values are based on the F ratio, given by

$$F = \frac{(SS_1 - SS_2)/SS_2}{(df_1 - df_2)/df_2}, \tag{5}$$

where SS symbolizes the sum of squares, df denotes the degrees of freedom, and the subscripts identify the simpler model (subscript 1) or the more complicated model (subscript 2). p values below .05 were taken to indicate significant improvements of the model fits.
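Equation 5 is straightforward to evaluate once both models have been fitted. The sketch below uses made-up sums of squares and assumes the common convention that the degrees of freedom equal the number of data points minus the number of fitted parameters; the function name and the numbers are illustrative only.

```python
def f_ratio(ss1, df1, ss2, df2):
    """Equation 5: F ratio for nested models.

    ss1, df1: residual sum of squares and degrees of freedom, simpler model.
    ss2, df2: the same quantities for the more complicated model.
    """
    return ((ss1 - ss2) / ss2) / ((df1 - df2) / df2)


if __name__ == "__main__":
    # Hypothetical example: 40 data points, 5 versus 7 fitted parameters.
    n_points = 40
    df_simple = n_points - 5
    df_complex = n_points - 7
    print(f_ratio(ss1=12.0, df1=df_simple, ss2=8.0, df2=df_complex))
```

The p value then follows from the upper tail of the F distribution with (df1 − df2) and df2 degrees of freedom, which most statistics packages provide.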

We tested several different dependences between spatial and temporal tuning functions. The interaction model that gave the most significant improvement of the fit included a shift of the spatial optimum (μS) with temporal interval, as well as a shift of the temporal optimum (μT) with step size. We compared first-order (linear), second-order (quadratic), and third-order (cubic) shifts of spatial and temporal optima. For all observers, except M.L., increasing the order resulted in significantly better fits. R² values increased from an average value of .880 for no interactions to .922, .941, and .957 for linear, quadratic, and cubic interactions, respectively. Results for fits with a third-order shift of optimal interval as well as optimal step size are shown in Figure 3 (third column). The last column in Figure 3 quantifies the fit errors for the interaction model. In general, temporal optima decreased with increasing step sizes, although the decrement was less for larger step sizes. For observer M.S., the temporal optimum was constant, irrespective of step size. Spatial optima tended to decrease with increasing temporal interval, except for M.L., who showed little interaction.

A variation in the width of the tuning curves did not provide significant improvements. Linearly shifting the tuning curves along the spatial or temporal axes, which is slightly different from varying the μ parameter, did provide significant improvements, but these improvements were smaller than for the shift in μS and μT.

Dot density

One obvious difference between our stimuli and those of Fredericksen et al. (1993) and Morgan and Ward (1980) was dot density. In the experiments described so far, we used 80 dots/deg² per frame, at a monitor refresh rate of 120 Hz, in an 8° × 8° window. Morgan and Ward used a relatively low dot density (13,000 points per 728 ms, in a 2.25° × 2.25° window). A single point subtended 2.4 arcmin and was present for less than 56 μs. Fredericksen et al., on the other hand, used much higher dot densities. Variations in dot density might, therefore, play a role in comparing our data to those of others. To gain insight into the effects of dot density in our stimulus, we performed several control experiments at different dot densities. Figure 4A shows measurements of temporal tuning curves for densities ranging from 5 to 160 dots/deg². Figure 4B shows similar measurements for spatial tuning curves. Temporal tuning was measured at the optimal step size of 7.2 arcmin, and spatial tuning was measured at the optimal temporal interval of 33 ms. Measurements for 80, 40, and 20 dots/deg² did not differ substantially. A dot density of 80 dots/deg² thus seemed sufficient to reach optimal sensitivity. An increase to 160 dots/deg² resulted in a sharper falloff for intervals above 16.7 ms. Yet, for intervals of 16.7 ms and lower, thresholds were similar to the condition with 80 dots/deg². A reduction to 10 and 5 dots/deg² had little effect for intervals beyond 42 ms. Thresholds for intervals smaller than 42 ms were higher than in the condition with 80 dots/deg². Nonetheless, the optimal interval was not affected. Figure 4B shows that step size tuning for 80 and 40 dots/deg² was similar. For higher and lower dot densities, thresholds were raised and optimal step sizes shifted toward lower values. In summary, we conclude that dot density may affect both spatial and temporal tuning. However, the value of 80 dots/deg² did not seem to limit performance in our experiments.

Performance as a function of dot density. Coherence thresholds are shown as a function of temporal interval (A) and step size (B) for different dot densities. Temporal tuning was measured with a step size of 7.2 arcmin, and spatial tuning was measured with an interval of 33.3 ms. Different panels show results for different observers.

Figure 4


Two other obvious differences between our stimuli and those of Fredericksen et al. (1993) and Morgan and Ward (1980) were dot size and viewing distance. Previous reports have shown significant effects of viewing distance on tuning for step size. van de Grind, Koenderink, and van Doorn (1992) showed that this resulted in distance invariance: If step sizes were expressed in monitor pixels, that is, object properties rather than visual angles, different viewing distances gave the same result. Changing the viewing distance affects the spatial frequency content of the stimulus, as well as the step size expressed in visual angles. Together, these effects supposedly caused distance invariance. In a second control experiment, we checked whether this rule also holds for the stimuli used in the present set of experiments and to what extent temporal tuning depended on viewing distance.

Figure 5A illustrates step size tuning measured at three different viewing distances for an interval of 33 ms. The left column shows the data with step sizes given in visual angle, whereas the right column shows the data with step sizes expressed in monitor pixels. The latter curves nearly overlap for different viewing distances, which supports previous findings of distance-invariant coherence detection. Figure 5B shows temporal tuning curves measured at three different viewing distances for an optimal step size of 5 pixels. None of the observers showed a clear change in tuning curve with viewing distance: Optimal intervals as well as the falloff at the high end seemed unaffected.
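The relation between a step expressed in monitor pixels and the same step expressed in visual angle can be made concrete with a short sketch. The pixel pitch used here (`PIXEL_PITCH_CM`) is a hypothetical value chosen for illustration, not a property of the monitor used in the experiments; only the viewing distances and the 5-pixel step come from the text.

```python
import math

def visual_angle_arcmin(size_cm, distance_cm):
    """Visual angle (arcmin) subtended by an extent of size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm))) * 60.0

PIXEL_PITCH_CM = 0.03   # hypothetical pixel pitch, for illustration only
STEP_PIXELS = 5         # step size in monitor pixels, as in the text

for distance_cm in (67.0, 125.0, 250.0):
    step_cm = STEP_PIXELS * PIXEL_PITCH_CM
    angle = visual_angle_arcmin(step_cm, distance_cm)
    print(f"{distance_cm:5.0f} cm: {STEP_PIXELS} pixels = {angle:.2f} arcmin")
```

The step in pixels is fixed by the display, whereas its visual angle scales almost exactly inversely with viewing distance for extents this small. Plotting thresholds against pixels therefore aligns curves that are shifted relative to one another when plotted against visual angle.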

Performance as a function of viewing distance. Coherence thresholds are shown as a function of step size (A) and temporal interval (B), for different viewing distances. Temporal tuning was measured with a step size of 5 pixels, which corresponds to 0.6, 1.2, and 2.4 arcmin at a viewing distance of 67, 125, and 250 cm, respectively. Step size tuning was determined with an interval of 33.3 ms. Different panels show results for different observers. Stimulus conditions were the same for all viewing distances. As a result, step sizes expressed in visual angles varied. The left-hand column shows step sizes expressed in visual angle, whereas the right-hand column shows step sizes expressed in monitor pixels.

Figure 5

We used a continuous apparent motion stimulus with two-frame, single-step pattern lifetime to specifically investigate the requirements for correlation detection in moving random dot patterns. Variations in step size and temporal interval did not affect low-level luminance information or higher level temporal integration of motion signals. Moreover, the total amount of motion energy in the stimulus was independent of step size and interval. Therefore, our results directly reflect the effect of variations in spatiotemporal correlation on low-level motion detection. By combining single-step, two-frame motion with coherence thresholds, we were able to measure optimal values for intervals between motion steps and the decline for longer intervals. We found optimal interval values between 17 and 42 ms, as well as a steep falloff of sensitivity for larger temporal intervals. As a result, we found sharp upper limits for the temporal interval, similar to the findings by Morgan and Ward (1980).
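The stimulus logic summarized above can be sketched in a few lines. This is an illustrative reconstruction, not the code used in the experiments: the function name `make_frames`, the wrap-around at the window edge, and the handling of incoherent dots (replaced by dots at fresh random positions) are all assumptions.

```python
import random

def make_frames(n_frames, n_dots, interval, step, coherence,
                width=100.0, height=100.0, seed=0):
    """Return a list of frames; each frame is a list of (x, y) dot positions.

    On every frame, half of the dots are newly generated and the other half
    are displaced copies of the dots generated `interval` frames earlier,
    so each dot is shown in at most two frames.
    """
    rng = random.Random(seed)
    half = n_dots // 2  # an odd n_dots is rounded down to an even count

    def fresh(count):
        return [(rng.uniform(0.0, width), rng.uniform(0.0, height))
                for _ in range(count)]

    born = []    # born[t] holds the dots newly generated on frame t
    frames = []
    for t in range(n_frames):
        new_dots = fresh(half)
        born.append(new_dots)
        if t >= interval:
            second = []
            for (x, y) in born[t - interval]:
                if rng.random() < coherence:
                    # Coherent dot: reappears displaced by one step.
                    second.append(((x + step) % width, y))
                else:
                    # Assumption: incoherent dots reappear at random positions.
                    second.append(fresh(1)[0])
        else:
            # No earlier pattern is available yet: fill with noise dots.
            second = fresh(half)
        frames.append(new_dots + second)
    return frames

frames = make_frames(n_frames=12, n_dots=40, interval=3, step=2.0,
                     coherence=1.0, width=50.0, height=50.0, seed=1)
print(len(frames), "frames with", len(frames[0]), "dots each")
```

Because the number of dots per frame and the proportion of refreshed dots are the same on every frame, changing `interval` or `step` in this sketch changes only which frame pairs are correlated, not the per-frame statistics.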

Our data clearly differ from those presented by Fredericksen et al. (1993), who reported efficient coherence detection up to much longer intervals. One obvious reason for this discrepancy might have been a difference in mean luminance. It is well documented that both spatial and temporal properties in coherence detection strongly depend on mean luminance and the state of light adaptation (Eagle & Rogers, 1997; Lankheet, van Doorn, & van de Grind, 2002; Lankheet, van Wezel, & van de Grind, 1991; Morgan & Ward, 1980; van de Grind, Koenderink, & van Doorn, 1987). At high luminance levels, low-level (retinal) visual processing speeds up, which results in a shift toward shorter intervals in coherence detection. In this study, we used black dots on a white background (37 cd/m2), which is comparable to the mean luminance in the study by Fredericksen et al. (50 cd/m2). If anything, given our somewhat lower luminance, we might have expected slightly longer optimal intervals and step sizes. The discrepancy cannot be explained by differences in dot density or viewing distance either. Our control experiments showed that the shape of the temporal tuning curves for coherence detection did not vary drastically with viewing distance. Similar to previous reports in which Dmax measurements were found to depend on various stimulus parameters, including dot density, varying the dot density in our experiments did affect both spatial and temporal tuning. However, increasing the dot density sharpened the falloff at long intervals rather than extending sensitivity to them. We conclude, therefore, that low thresholds for long intervals in the study of Fredericksen et al. are a result of the presence of multiple correlations in their stimuli with long intervals. Because patterns remained stationary between displacements in their stimuli, a long temporal interval also introduced correlations at shorter intervals. This might very well explain why observers remained highly sensitive to long intervals in their stimulus, whereas in our stimuli, sensitivity for long time intervals sharply declined.

Spatiotemporal separability

Because we measured full spatial (step size) and temporal (interval) tuning profiles, our data critically test the dependence between spatial and temporal properties for coherence detection. We found that a fully separable model described the data fairly well. However, a variation of spatial tuning with temporal interval and a variation of temporal tuning with step size did provide a significantly better fit to the data. The general trend for five subjects was that the temporal optimum was inversely related to step size and that the spatial optimum was inversely related to interval. The reduction in fit error relative to a fully separable model was, however, relatively small.

The interaction effects were fairly subtle, which probably explains why Morgan and Ward (1980), based on upper and lower limits, found that step size and interval tuning were independent. Baker and Braddick (1985) reached the same conclusion based on two-frame apparent motion experiments. In their data, motion detection in displacements of random dots was separably dependent on the displacement from one exposure to the next and on the time interval between exposures. They found no evidence for velocity-related dependences. Other studies reported more complex tuning properties, where spatial and temporal tuning very much depended on pattern velocity. Anderson and Burr (1985) used sinusoidal gratings to show two types of temporal tuning in motion detection, low-pass and band-pass in the range of 7–13 Hz. Temporal frequency tuning in their study very much depended on spatial frequency. Boulton and Baker (1993) used Gabor function micropatterns to study spatiotemporal properties of motion mechanisms. Depending on SOA, they found fundamentally different behavior. At short SOAs, the mechanisms were more or less linear and could be described by Fourier motion, whereas at larger SOAs, the mechanisms behaved clearly nonlinearly.

It should be noted, though, that step size and interval tuning for moving random pixel arrays cannot directly be compared to spatial and temporal frequencies in gratings or Gabors. Moving random pixel arrays contain a wide range of spatial frequencies moving at the same “speed.” Temporal frequencies vary for different spatial frequencies, and hence, step size for moving patterns translates into different temporal frequencies for different spatial Fourier components. Speed-related spatiotemporal dependences for gratings therefore do not necessarily contradict independent tuning for step size and interval for random dot patterns.
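The point that one step size and interval map onto many temporal frequencies can be illustrated with a small calculation. The sketch treats the stepped displacement as a drift at the mean speed (step size divided by interval) and ignores the additional frequency components introduced by temporal sampling; the function name and the list of spatial frequencies are illustrative choices.

```python
def temporal_frequency_hz(step_arcmin, interval_ms, spatial_freq_cpd):
    """Temporal frequency (Hz) of one spatial Fourier component.

    A component of spatial_freq_cpd cycles/deg drifting at v deg/s has a
    temporal frequency of v * spatial_freq_cpd.
    """
    speed_deg_per_s = (step_arcmin / 60.0) / (interval_ms / 1000.0)
    return speed_deg_per_s * spatial_freq_cpd

# Step size and interval taken from the control experiments in the text.
for sf in (0.5, 1.0, 2.0, 4.0):
    tf = temporal_frequency_hz(7.2, 33.3, sf)
    print(f"{sf:4.1f} c/deg -> {tf:5.2f} Hz")
```

Every spatial frequency in the broadband pattern moves at the same speed, so the temporal frequency grows in proportion to spatial frequency; a single step size and interval therefore does not correspond to a single point in the spatial frequency by temporal frequency plane.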

Several previous studies (Koenderink et al., 1985; van de Grind et al., 1986; van Doorn & Koenderink, 1982a, 1982b) also reported speed-dependent step size and interval tuning for moving random pixel arrays. They measured optimal temporal intervals using temporal alternations of two oppositely moving patterns (see the Introduction section). Optimal step sizes were measured in a comparable experiment, using spatial alternations of two oppositely moving patterns. For wide apertures (orthogonal to the direction of motion), observers perceived both directions of motion segregated into alternating bars. For narrow apertures, the two patterns were perceived to move transparently. At intermediate aperture widths, the motion percept was greatly reduced. Because correlation detection was impossible at this critical bar width, it was interpreted as the optimal step size for motion detection. Both optimal step sizes and optimal intervals were determined as a function of pattern speed. At low speeds, optimal step sizes were constant and optimal temporal intervals varied. At high speeds, this pattern was reversed. Although these speed-related variations in optimal step size and interval seem to contradict separable step size and interval tuning, their results cannot indisputably rule out separable tuning for step size and interval. Drawing imaginary speed lines in the (separable) contour plots of Figure 3 straightforwardly illustrates this notion. Low speeds have a shallow slope and cut the sensitivity surface at combinations of small step sizes and large intervals. High speeds, on the other hand, cut the surface for short intervals and large step sizes. Moreover, the deviations from separability that we found are compatible with the speed-dependent relationship as reported by van de Grind et al. (1986). The important conclusion we can draw from this comparison is that the data of van Doorn and Koenderink (1982a, 1982b) do not necessarily imply the existence of differently tuned detectors. The same pattern could, in principle, result from separable step size and interval tuning, that is, based on a single type of detector that is broadly tuned for step size and for interval. However, given the pattern of spatiotemporal interactions and the regularity of fit errors in our data (Figure 3), it does seem more likely that different combinations do play a role.
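The speed-line argument can be checked numerically with a toy model. The sketch below assumes a separable sensitivity surface built from two log-Gaussian tuning functions; the tuning shape and the widths are hypothetical choices made for illustration, and only the optima (7.2 arcmin, 33.3 ms) are taken from the values reported above. It is not the model fitted in Figure 3.

```python
import math

def log_gaussian(value, optimum, width):
    """Tuning function that is Gaussian on a logarithmic axis."""
    return math.exp(-0.5 * ((math.log(value) - math.log(optimum)) / width) ** 2)

def separable_sensitivity(step, interval, step_opt=7.2, interval_opt=33.3,
                          step_width=1.0, interval_width=1.0):
    """Separable surface: a function of step times a function of interval."""
    return (log_gaussian(step, step_opt, step_width)
            * log_gaussian(interval, interval_opt, interval_width))

def best_point_on_speed_line(speed, n=4001):
    """Most sensitive (step, interval) along the line step = speed * interval.

    speed is in arcmin per ms; intervals from 1 to 1000 ms are searched on a
    logarithmic grid.
    """
    best = None
    for i in range(n):
        interval = 10.0 ** (3.0 * i / (n - 1))
        step = speed * interval
        s = separable_sensitivity(step, interval)
        if best is None or s > best[0]:
            best = (s, step, interval)
    return best[1], best[2]

for speed in (0.05, 0.216, 1.0):
    step, interval = best_point_on_speed_line(speed)
    print(f"speed {speed:5.3f} arcmin/ms: best step {step:5.2f} arcmin, "
          f"best interval {interval:5.1f} ms")
```

Even though this surface is separable by construction, the best step size along a speed line grows with speed and the best interval shrinks, so speed-dependent optima of this kind do not by themselves demonstrate differently tuned detectors.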

Acknowledgments

This research was supported by the Helmholtz Institute, Utrecht University and by the Innovational Research Incentives Scheme (VIDI) of the Netherlands Organization for Scientific Research (NWO).

Space–time plots for two-frame, single-step motion stimuli. The left-hand column shows plots for the maximal coherence value (100%), whereas the right-hand column shows the same motion step size and interval parameters for a coherence level of 20%. The top row shows dynamic noise: The pattern is randomly refreshed each time step. The additional four rows show settings for combinations of increasing step size and interval. In all four cases, the mean velocity is equal: In the first of these rows, the pattern is displaced 1 pixel to the right with an interval of 1 frame. Step size and interval increase by a factor of 2 from one row to the next.

Figure 1

Coherence thresholds for different step sizes plotted as a function of temporal interval. Coherence thresholds correspond to the percentage of coherently moving dots in a Quest staircase procedure at 85% correct responses. Lower thresholds correspond to higher sensitivity; that is, sensitivity is inversely proportional to the coherence threshold. Step sizes are given in arc minutes, and intervals are expressed in milliseconds. A single monitor pixel measured 1.2 arcmin and a single monitor frame lasted 8.3 ms. Error bars show the standard error of the mean for each threshold measurement, based on three to five repetitions.

Figure 2

Contour plots of motion coherence thresholds as a function of step size and interval duration. The first column shows measured data, with each row representing a different observer. The data are the same as those in Figure 2. The second and third columns show fits of a spatiotemporally separable model and a model including third-order (cubic) variations in optimal step size and optimal interval. The fourth and fifth columns show the fit errors for both models. Coherence thresholds are color-scale coded (horizontal bar above the first three columns), where low detection thresholds (high sensitivity) are white/yellow and high thresholds (low sensitivity) are black. Performance below 85% correct responses at the highest coherence level (no sensitivity) was set to a coherence threshold of 100%. The fit error in the fourth and fifth columns is also color-scale coded, where blue shades indicate overestimation of the coherence thresholds and red shades indicate underestimation. The intersections between horizontal and vertical gray or black dotted lines indicate the combinations of step size and interval that were presented (first column) or the combination used for plotting the fit and fit error. The fitting procedure is described in the text.

Figure 3
