Humans are exquisitely sensitive to changes in relative position. A fundamental and long-standing question is how information for position acuity is integrated along the length of the target, and why visual performance deteriorates when the feature separation increases. To address this question, we used a target made of discrete samples, each subjected to binary positional noise, combined with reverse correlation to estimate the behavioral “receptive field” (template), and a novel 10-pass method to quantify the internal noise that limits position acuity. Our results show that human observers weigh individual parts of the stimulus differently and, importantly, that the shape of the template changes markedly with feature separation. Compared to an ideal observer, human performance is limited by a template that becomes less efficient as feature separation increases and by an increase in random internal noise. Although systematic internal noise is thought to be one of the important components limiting detection thresholds, we found that systematic noise is negligible in our position task.

Introduction

Humans can make highly precise judgments about the relative positions of two closely spaced features. Under ideal conditions, this resolution is much finer than the inter-cone distance (Westheimer, 1975). While a number of elegant models have been proposed to explain this extraordinary acuity (Geisler, 1984; Hu, Klein, & Carney, 1993; Klein & Levi, 1985; Levi, Klein, & Carney, 2000; Wilson, 1986), there is a long-standing question, dating back over a century to Ewald Hering (1899), about how information is integrated along the length of the target to provide the high precision of position acuity, and whether this changes when the features are separated (Levi & Klein, 1986; Watt, Morgan, & Ward, 1983; Westheimer & McKee, 1977; Whitaker, 1993; Zeevi & Mangoubi, 1984).

Recently, the noise image classification technique (Ahumada, 2002; Neri, 2004) has been extensively used in the field of human spatial vision. By recording the trial-by-trial effects of noise, it is possible to estimate the observer's internal decision template for a specific visual task. The classification image is a “map” or spatial profile that shows which image components influence the observer's performance. The classification image may be thought of as a behavioral receptive field (Gold, Murray, Bennett, & Sekuler, 2000) and it is the psychophysical analogue of reverse correlation methods used in the physiological mapping of receptive fields (DeAngelis, Ghose, Ohzawa, & Freeman, 1999; Ringach, Hawken, & Shapley, 1997).

Most previous studies used luminance noise, like black and white snow on a TV screen, to look into the neural mechanisms subserving position acuity (Beard & Ahumada, 1998; Levi & Klein, 2002, 2003). The conventional method to estimate the classification images involves thousands of response trials for averaging and classification (Beard & Ahumada, 1998). However, the use of positional noise (Li, Levi, & Klein, 2004) combined with linear multiple regression (Levi & Klein, 2002, 2003) has been recently shown to be useful and very efficient; a reliable classification image can be obtained in hundreds of trials.

In this study, we were interested in how observers integrate the local position information along the length of a target. Thus, we used positional noise (Li et al., 2004), that is, we perturbed the positions of parts of the stimulus rather than obscuring their visibility, as occurs with luminance noise (Beard & Ahumada, 1998; Levi & Klein, 2002, 2003), to explore the underlying mechanisms. The use of a target made up of discrete samples, each subjected to positional noise, combined with reverse correlation, enabled us to estimate the two-dimensional “receptive field” (template) for position acuity at different feature separations.

To anticipate, our results show that human observers weigh individual parts of the stimulus differently, and that the observers' template changes markedly as feature separation increases. Compared to an ideal observer (a machine that uses all of the samples in the stimulus), we found that the inefficient human template could not fully account for human performance. Visual performance (both psychophysical and neuronal) may also be limited by internal noise or variability (Parker & Newsome, 1998). To assess and quantify our observers' internal noise, we used molecular psychophysics (Green, 1964), that is, trial-by-trial analysis of repeated sequences. Specifically, using a carefully designed “10-pass” method (in which each noise combination was presented 10 times), we were able to quantify the observers' internal noise and parse it into random (stimulus independent) and systematic (stimulus dependent) internal noise. Using a simple variance model, we are able to determine the extent to which human performance is limited by the template, by random noise, and by systematic noise over a range of separations.

Methods

Visual stimuli

The stimulus consisted of two horizontal segments with a 7.5 (1.25λ), 20 (3.33λ), or 60 (10λ) arcmin gap between the centers of the two innermost patches. Each segment consisted of five Gabor patches (carrier SF, 10 cpd), and the inter-patch separation was 5 arcmin (the actual stimulus can be seen by viewing Figure 1 from a distance of 1.3 m). The patches were constructed to have a 2/3 aspect ratio: the Gaussian envelope standard deviation was 2.33 and 3.50 arcmin for the horizontal and vertical orientations, respectively. Positional noise was produced by shifting the position of each Gabor patch in the vertical direction around the intended mean line position of the test (right) segment according to a discrete binary probability function. The binary noise amplitude was always 0.67 arcmin, either positive or negative. No noise was added to the reference (left) segment. The stimulus was briefly presented (200 ms) on a flat 21-in. Sony F520 monitor screen at a 90 Hz refresh rate. Subjects were asked to maintain fixation at the center of the monitor screen. The mean luminance of the stimuli was 55 cd/m², and the contrast of each Gabor patch was 84%. The monitor was viewed directly at a distance of 4 m.

Figure 1. Visual stimuli with (a) 1.25λ and (b) 10λ separations. The binary noise of 0.67 arcmin, either upward or downward, was added to the right segment only. On each trial, the test line was presented randomly in one of three positions: aligned with the reference line, one step above it, or one step below it.

On each trial, the test line was presented randomly in one of three positions: aligned (0) with the reference line, one step above (−) it, or one step below (+) it. From session to session, the offset magnitude was varied (in random order) to obtain the classification images for stimuli with a range of detectability (d′) values from 0 to 2. The observers' task was to rate the position of the test line compared to the reference line by giving an integer number from −2 (above) to 2 (below), including 0 (aligned). The observers were instructed to attend to the whole of the jittered line, to determine the average location, and to compare this with the reference line. A rating scale signal detection paradigm was used to calculate d′ for discriminating the direction of offset in noise (Levi et al., 2000). The position offset at which d′ = 1 was taken as threshold. Trial-by-trial verbal feedback was provided.

Observers

Four normal adult observers, all with corrected-to-normal visual acuity (20/16 or better) in each eye, participated. Viewing was monocular. All observers except RL (one of the authors) were naive to the purpose of the experiment.

where wi is the weighting (classification image) of each Gabor patch, xi is the patch position, and the subscript i goes from 1 to 5 for the five Gabor patches (see Figure 1). The sum of the weightings is unity.

Template noise (σtemp) can be calculated from Equation 1 by the following equation:

$$\sigma_{temp} = \sigma_x \sqrt{\sum_i w_i^2}, \tag{2}$$

where σx is the standard deviation of xi (external noise). For an ideal observer, this becomes σx/√N for N patches because wi = 1/N (the same weighting is given to each patch).

The d′ of an observer limited only by the external noise and the internal template is:

$$d'_{temp} = \frac{\mathrm{offset}}{\sigma_{temp}}. \tag{3}$$

Because threshold is taken at d′ = 1, offset threshold is equal to the external noise or template noise for the ideal observer or template observer, respectively.

$$\mathrm{offset}_{th} = \sigma_{temp}. \tag{4}$$

The external noise is 2/3 arcmin for each Gabor patch and we have five independent patches, so the ideal observer threshold is very close to (2/3)/√5 ≈ 0.30 arcmin.
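Equations 2 to 4 can be checked numerically with a minimal sketch. The function name `template_noise`, the example weights, and the efficiency definition (squared ratio of ideal to template threshold) are assumptions made for illustration; they are not taken from the analysis code used in the study.

```python
import math

def template_noise(weights, sigma_x):
    """Equation 2: sigma_temp = sigma_x * sqrt(sum_i w_i^2)."""
    return sigma_x * math.sqrt(sum(w * w for w in weights))

SIGMA_X = 2.0 / 3.0  # external noise per patch in arcmin (from the text)
N_PATCHES = 5

# Ideal observer: equal weight on every patch, so sigma_temp = sigma_x / sqrt(N).
ideal_weights = [1.0 / N_PATCHES] * N_PATCHES
ideal_threshold = template_noise(ideal_weights, SIGMA_X)

# Hypothetical human-like template (illustrative values only; they sum to 1).
example_weights = [0.35, 0.40, 0.20, 0.10, -0.05]
example_threshold = template_noise(example_weights, SIGMA_X)

# Assumed efficiency definition: squared ratio of ideal to template threshold.
efficiency = (ideal_threshold / example_threshold) ** 2

print(f"ideal threshold:    {ideal_threshold:.3f} arcmin")
print(f"template threshold: {example_threshold:.3f} arcmin")
print(f"template efficiency: {efficiency:.2f}")
```

Because threshold is taken at d′ = 1 (Equation 4), the value returned by `template_noise` is itself the offset threshold of the corresponding template observer.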

Step 2: Calculate human threshold

The data relevant for this calculation are the 3 × 5 matrix of the number of times each human response r (−2, −1, 0, +1, and +2) is given to each stimulus s (−2/3, 0, and +2/3 arcmin). The human threshold (d′ = 1) was computed based on signal detection theory (Levi et al., 2000). A nonlinear chi-square minimization was used to fit the data with six parameters: four criteria and two d′s (one between the zero and negative offset stimuli and the other between the zero and positive offset stimuli). Equations 3 and 4 for the case of the human threshold become:

$$Th_{human} = \frac{\mathrm{stim}}{d'_{ave}}, \tag{5}$$

where d′ave is the average of the two d′ values and stim is the stimulus offset.
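The rating-scale model behind Step 2 can be sketched as follows. This assumes an equal-variance Gaussian internal response expressed in d′ units and uses hypothetical criteria and d′ values; the chi-square minimization itself is omitted, so this shows only the forward model and Equation 5.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rating_probabilities(mean, criteria):
    """Probability of each rating (-2..2) for an internal response ~ N(mean, 1).

    `criteria` holds the four ordered category boundaries in d' units.
    """
    edges = [-math.inf] + list(criteria) + [math.inf]
    return [phi(hi - mean) - phi(lo - mean) for lo, hi in zip(edges[:-1], edges[1:])]

def human_threshold(stim_offset, d_neg, d_pos):
    """Equation 5: threshold = stim / d'_ave, with d'_ave the mean of the two d' values."""
    d_ave = 0.5 * (d_neg + d_pos)
    return stim_offset / d_ave

# Hypothetical fitted parameters (illustrative values only).
criteria = [-1.5, -0.5, 0.5, 1.5]
d_neg, d_pos = 1.2, 1.0
stim = 0.5  # stimulus offset in arcmin

probs_positive = rating_probabilities(+d_pos, criteria)
threshold = human_threshold(stim, d_neg, d_pos)
print([round(p, 3) for p in probs_positive])
print(f"threshold: {threshold:.3f} arcmin")
```

In a full fit, the predicted probabilities for the three offset levels would be compared with the observed 3 × 5 response counts, and the four criteria and two d′ values adjusted to minimize chi-square.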

Step 3: Calculate the random noise

Random noise that is completely independent of the external stimulus represents the genuine internal positional noise characterizing the variability of neuronal activity in the visual system. In our experiments, for each of three offset levels (−, 0, and +), there are 2^5 = 32 possible external noise combinations in total (Figure 2; for stimulus numbering, see legend in Figure 3; U [up] and D [down] indicate the individual patch position). For any given one of the 96 stimuli, the data are randomly spread out (see Figures 2 and 3).
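The stimulus set described above can be enumerated in a few lines. The bit-to-patch mapping in `noise_pattern` is an assumption (the actual numbering is given in the Figure 3 legend, which is not reproduced here); it is chosen only so that patterns 0 and 31 are the two uniform configurations, as the text states.

```python
NOISE_AMPLITUDE = 2.0 / 3.0  # arcmin, binary positional noise (from the text)
N_PATCHES = 5

def noise_pattern(index, n_patches=N_PATCHES, amplitude=NOISE_AMPLITUDE):
    """Return the per-patch displacement for noise combination `index` (0..31).

    Assumed coding: bit i of `index` set means patch i is shifted by +amplitude,
    otherwise by -amplitude.
    """
    return [amplitude if (index >> i) & 1 else -amplitude for i in range(n_patches)]

# Three offset levels (-, 0, +) times 32 noise combinations.
stimuli = [
    (offset_sign, k, noise_pattern(k))
    for offset_sign in (-1, 0, 1)
    for k in range(2 ** N_PATCHES)
]

# Stimuli whose five patches all share the same displacement.
uniform = [(sign, k) for sign, k, pattern in stimuli if len(set(pattern)) == 1]

print(len(stimuli))  # number of distinct stimuli
print(uniform)       # the uniform configurations at each offset level
```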

Figure 2. Signal detection. Each of three offset levels (−, 0, and +) consists of 32 different noise combinations (0 to 31), see legend in Figure 3 for stimulus numbering. The classification of the analog template response into a discrete human response (−2 to 2) is based on the observer's internal criteria (c).

Figure 3. An example showing the linear relationship between template response and human response; the stimulus is abutting with a 0.5-arcmin position offset and the noise is 0.67 arcmin. There are a total of 96 data points (32 noise combinations × 3 stimulus levels) in the figure; the red and green triangles show the data for negative (up) and positive (down) offset levels, respectively, and the blue circles show the data for zero offset level. Each data point consists of 10 trials (10-pass) for the stimuli with identical noise configuration. Dotted lines show the 1 and 2 SD confidence intervals of the regression line.

The data that are relevant for the random noise calculation are the 96 × 5 matrix of the number of times the human response r (−2, −1, 0, +1, and +2) is given to each of the 96 stimuli. A nonlinear chi-square minimization was used to fit the data with 99 parameters: 4 criteria, and 95 d′s. Each of the 96 stimuli had 10 trials because of our “10-pass” methodology. Ten repeats were sufficient for pinning down d′ for each stimulus with adequate sensitivity. The sequence of stimuli presented to observers was randomized in each of the 10 runs.

To calculate the random noise in units of minutes of arc, we need to use calibrated stimuli for which the precise template is irrelevant. Six such stimuli are available: these are the stimuli for which all five Gabor patches have the same displacement (stimuli 0 and 31 at each of the three offset levels; see legend in Figure 3). For these six stimuli, the predicted displacement is independent of the template weighting. We can then do a linear regression so that the value of the internal noise produces thresholds that are in best agreement with the six known offsets. We repeated this process for all 96 stimuli (to have greater statistical reliability) and obtained a similar estimate for the noise.

The special aspect of this fit is that no assumptions were made about how higher order nonlinearities that depend on noise patterns affect d′. The d′ values are free to float. The multi-pass nature of the stimuli is critical for being able to obtain an accurate estimate of the internal noise by this procedure.
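The reason the uniform stimuli serve as a calibration can be illustrated with a short sketch. It assumes the template response is the weighted sum of patch positions with weights summing to unity, as stated in the Methods; the two weight vectors and the 0.5-arcmin offset are hypothetical.

```python
import math

def template_response(weights, positions):
    """Weighted sum of patch positions (weights are assumed to sum to 1)."""
    return sum(w * x for w, x in zip(weights, positions))

OFFSET = 0.5       # hypothetical segment offset in arcmin
NOISE = 2.0 / 3.0  # binary noise amplitude in arcmin

# Two very different hypothetical templates, both normalized to sum to 1.
flat_template = [0.2, 0.2, 0.2, 0.2, 0.2]
peaked_template = [0.7, 0.3, 0.05, 0.0, -0.05]

# Uniform stimulus: every patch displaced by the same amount.
uniform_positions = [OFFSET + NOISE] * 5
# Mixed stimulus: patches displaced in different directions.
mixed_positions = [OFFSET + s * NOISE for s in (+1, -1, +1, -1, -1)]

r_flat_uniform = template_response(flat_template, uniform_positions)
r_peaked_uniform = template_response(peaked_template, uniform_positions)
r_flat_mixed = template_response(flat_template, mixed_positions)
r_peaked_mixed = template_response(peaked_template, mixed_positions)

print(math.isclose(r_flat_uniform, r_peaked_uniform))  # same prediction for both templates
print(math.isclose(r_flat_mixed, r_peaked_mixed))      # predictions differ for mixed noise
```

For the uniform stimuli the predicted displacement is simply the common patch position, whatever the weights, which is what allows the internal noise to be expressed in minutes of arc before the template is known.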

Step 4: Calculate the template

The data that are relevant for this calculation are the same as in Step 3 (96 × 5 matrix). The nonlinear chi-square minimization was performed with eight parameters: four criteria and four template weightings, with the internal noise fixed at the value determined from those six calibrated stimuli in Step 3. The fifth template weighting (for the outside Gabor patch) was fixed by making the sum of the five weightings equal unity. As discussed below, for these experiments, the simple linear template (Equation 1) is able to account for the systematic loss of efficiency. Figure 3 illustrates an example showing the linear relationship between template and human responses (r = 0.94).

Note that Steps 3 and 4 are done quite differently from previous papers (Levi & Klein, 2002, 2003; Levi, Klein, & Chen, 2005) using the same type of rating scale, method of constant stimuli data. Previously we calculated the template using linear regression on the integer human rating responses. In the present paper, the linear regression is performed inside the signal detection search program where criteria are also free parameters. The previous method of doing the calculation had additional noise due to the round off of going from the internal response to the external button press.

Step 5: Calculate the systematic noise

Systematic noise is stimulus dependent (Levi et al., 2005); different noise configurations may bias the observer's responses in one direction or another to a certain extent. For example, the perceived offset might be based on the single Gabor patch with the maximum or minimum offset. This inefficient nonlinear judgment would replace the linear template weighting of Equation 1. The systematic noise is calculated by the root mean square deviation of each datum in Figure 3 from the template prediction, after the random noise is subtracted off.
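One way to picture the split between random and systematic noise is the variance decomposition sketched below. It is a simplified illustration on continuous responses, not the rating-scale fit used in the study: it assumes each response equals the template prediction plus a fixed stimulus-specific deviation plus independent random noise, and it subtracts the random variance of the pass-averaged response (random variance divided by the number of passes) from the mean squared deviation. The function name and data are hypothetical.

```python
import math
from statistics import fmean, variance

def parse_internal_noise(responses, template_predictions):
    """Split response variability into random and systematic components.

    responses[s] lists the responses to stimulus s across repeated passes;
    template_predictions[s] is the linear template prediction for stimulus s.
    Returns (random_sd, systematic_sd) in the units of the responses.
    """
    n_pass = len(responses[0])
    # Random noise: variability across passes of the identical stimulus.
    random_var = fmean([variance(r) for r in responses])
    # Mean squared deviation of the pass-averaged response from the template.
    mean_sq_residual = fmean(
        [(fmean(r) - t) ** 2 for r, t in zip(responses, template_predictions)]
    )
    # Remove the part of that deviation expected from random noise alone.
    systematic_var = max(0.0, mean_sq_residual - random_var / n_pass)
    return math.sqrt(random_var), math.sqrt(systematic_var)

# Tiny hypothetical data set: two stimuli, two passes each.
example_responses = [[0.4, 0.6], [0.4, 0.6]]
example_predictions = [0.0, 1.0]
random_sd, systematic_sd = parse_internal_noise(example_responses, example_predictions)
print(f"random SD: {random_sd:.3f}, systematic SD: {systematic_sd:.3f}")
```

With 10 passes per stimulus, as in the experiments, the correction term is small, so the estimate of the systematic component depends mainly on how far the pass-averaged responses fall from the template prediction.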

Results and discussion

Positional thresholds in noise

To determine how the brain weights different samples of the stimulus, we used a two-segment position stimulus (Figure 1) in which each segment consisted of five samples. We introduced positional noise by perturbing the position of the individual patches of the test segment. The observers' task was to judge the position of the test segment relative to the reference segment. We used an optimized version of the stimulus with binary noise, which allowed the number of possible noise combinations to be small (32) and enabled us to obtain highly reliable classification images in less than 1000 trials.
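The logic of recovering a template from binary positional noise can be sketched with a simulated observer. This is a simplified linear-regression variant on a noise-free, continuous response, not the signal detection fit described in Step 4; the observer's weights are hypothetical. Because all 32 binary combinations are used, the five noise regressors are mutually orthogonal, so each regression coefficient reduces to a simple correlation of the response with that patch's noise.

```python
from itertools import product

AMPLITUDE = 2.0 / 3.0  # arcmin, binary positional noise (from the text)

# Full factorial design: all 2^5 = 32 up/down combinations of the five patches.
patterns = [[AMPLITUDE * s for s in signs] for signs in product((-1, 1), repeat=5)]

def estimate_weights(noise_patterns, responses):
    """Per-patch regression coefficients, normalized to sum to 1.

    Valid as a multiple regression only because the balanced binary design
    makes the regressors orthogonal and zero-mean.
    """
    n_patches = len(noise_patterns[0])
    raw = []
    for i in range(n_patches):
        numerator = sum(p[i] * r for p, r in zip(noise_patterns, responses))
        denominator = sum(p[i] ** 2 for p in noise_patterns)
        raw.append(numerator / denominator)
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical observer template (illustrative values only).
true_weights = [0.35, 0.40, 0.20, 0.10, -0.05]

# Noise-free simulated responses: weighted sum of the patch displacements.
simulated = [sum(w * x for w, x in zip(true_weights, p)) for p in patterns]

recovered = estimate_weights(patterns, simulated)
print([round(w, 3) for w in recovered])
```

With a real observer, internal noise and the discrete rating scale add variability, which is why each pattern is presented repeatedly before the weights become reliable.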

In the experiment, we varied the stimulus separation from abutting (1.25λ) to widely spaced (10λ) and found that position threshold in noise increased linearly with separation when plotted on a log–log scale (Figure 4a). For comparison, we also measured the threshold for a stimulus with no noise (red solid squares in Figures 4a and b) in one observer, RL. As expected, thresholds for the noisy stimulus are much higher (by almost a factor of 2) than those for the stimulus with no noise. The thresholds rise with increasing separation. However, the increase in threshold is much shallower than predicted by Weber's law (slope = 1). A power function was used to fit the threshold data: y = 0.41λ^(0.5 ± 0.06) (blue line) and y = 0.27λ^(0.5 ± 0.04) (red line) for the stimuli with and without noise, respectively. Adding noise to the stimulus shifts the threshold line upwards, with no change in the exponent, that is, the slope of the regression line on log–log axes. The slope is about the same magnitude as reported previously, using zero noise Gabor stimuli (Whitaker, Bradley, Barrett, & McGraw, 2002).
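A power-function fit of this kind amounts to a straight-line fit on log–log axes. The sketch below uses ordinary least squares on the three approximate no-noise thresholds quoted in the text (about 0.3, 0.5, and 0.8 arcmin at 1.25λ, 3.33λ, and 10λ); these rounded values and the unweighted fit are assumptions, so the result only approximates the reported fit.

```python
import math

def fit_power_law(x_values, y_values):
    """Fit y = a * x**b by least squares on (log x, log y). Returns (a, b)."""
    log_x = [math.log(x) for x in x_values]
    log_y = [math.log(y) for y in y_values]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    s_xx = sum((lx - mean_x) ** 2 for lx in log_x)
    s_xy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    exponent = s_xy / s_xx
    coefficient = math.exp(mean_y - exponent * mean_x)
    return coefficient, exponent

separations = [1.25, 3.33, 10.0]  # in units of the carrier wavelength
thresholds = [0.3, 0.5, 0.8]      # approximate no-noise thresholds in arcmin

coefficient, exponent = fit_power_law(separations, thresholds)
print(f"y = {coefficient:.2f} * lambda^{exponent:.2f}")
```

An exponent of 1 would correspond to Weber's law; an exponent near 0.5 means that threshold grows roughly with the square root of separation.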

Figure 4. (a) The increase in position threshold in noise with increasing separation. For comparison, we also measured position threshold with no noise in one observer, RL (▪). It is obvious that external noise makes the performance worse. (b) The decrease in contextual effect with stimulus separation. To show the contextual effect, we calculated the threshold for the six lined-up stimuli embedded in noise.

We found an interesting contextual effect (Figure 4b). Alignment threshold is elevated when the “lined-up” stimuli are interleaved with noisy stimuli. For abutting stimuli, the positional threshold for the six stimuli in which all patches are lined-up (stimuli 0 and 31 for each offset level) was elevated (green line) when compared with the threshold with zero noise; approaching the performance with noise (96 stimuli). The context of embedding these noiseless stimuli in noisy stimuli increases their threshold. Possibly, the observer's criteria are less stable when all the noise is present. The contextual effect decreases with increasing separation. An additive contextual noise could account for our findings. The three no-noise thresholds are about 0.3, 0.5, and 0.8 arcmin. A Pythagorean sum with contextual noise of 0.3 arcmin gives with-noise thresholds of 0.42, 0.58, and 0.85 arcmin, in good agreement with the data. For widely spaced stimuli, the threshold for those lined-up stimuli in noise was very close to the threshold with no noise.
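The Pythagorean sum quoted above is easy to verify. The sketch assumes that the contextual noise adds in quadrature to the no-noise threshold, as the text proposes, and uses the approximate values given there.

```python
import math

no_noise_thresholds = [0.3, 0.5, 0.8]  # arcmin, approximate values from the text
contextual_noise = 0.3                 # arcmin, additive contextual noise

# Independent noise sources add in quadrature (root sum of squares).
with_context = [math.hypot(t, contextual_noise) for t in no_noise_thresholds]
rounded = [round(t, 2) for t in with_context]
print(rounded)  # [0.42, 0.58, 0.85]
```

The proportional elevation is largest at the smallest separation (0.3 to 0.42 arcmin) and smallest at the widest (0.8 to 0.85 arcmin), which matches the reported decrease of the contextual effect with separation.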

Behavioral “receptive field”

The classification images (Figure 5) show that human observers place different weights on individual samples of the stimulus (Figure 1a). Note that the weight on each patch was normalized so that the five coefficients add up to 1. We refer to this weighting function as the observers' template. Intuitively, the patch that is closest to the reference segment would seem to have the strongest influence; however, this is not always the case.

Figure 5. Influence function. (a) Classification images for a range of position offsets (subthreshold d′ < 0.75, threshold 0.75 ≤ d′ ≤ 1.25, and suprathreshold d′ > 1.25) at three separations. The regression coefficients are computed for offset stimuli. Note that the template is normalized to sum to 1. The abscissa represents the relative position of each Gabor patch as shown in Figure 1. From session to session, the offset magnitude was varied (in random order) to obtain the classification images for stimuli with a range of d′ values from 0 to 2. Three of our observers completed all three separations (small, medium, and wide); observer LN only completed the small and medium separations. (b) Behavioral receptive field for position acuity. The classification images for all position offsets in Figure 5a are pooled together in a contour map. The z-axis indicates regression coefficients for all location images, and the x- and y-axes indicate the position of each location image in 2D space.

Interestingly, when the separation is small, the second sample often carries as much or more weight than the more proximal first sample in influencing the observers' judgements. For the largest separation, with the exception of KP, the observers tend to use the first sample almost exclusively, with the other samples contributing little to the responses (Figure 5a). This is not a consequence of low visibility of the outer samples. A control experiment using highly visible Gaussian rather than Gabor patches yielded a similar classification image (grey squares and dashed line in the panel RL10λ of Figure 5a).

In some cases, especially for the largest separation, the fifth patch has a weak repulsion effect (negative image). The negative weighting at large separation would be expected if the position judgement involves an extrapolation in the direction of the perceived tilt from one line to the other. If the outer patch is shifted upwards, the perceived tilt would induce a downward perception of offset that would increase with line separation. It is interesting that the observers' template is exceptionally consistent across a range of offset levels.

Similar findings were also reported in an earlier study. Whitaker (1993) varied the separation of pairs of thin strips of grating to represent different parts of the long abutting grating in the measurement of Vernier acuity. He found, in accord with our findings, that the second sample can provide more information than the abutting sample, and that the most distant patches actually interfered with position judgements.

In an elegant study, Beard and Ahumada (1998) used reverse correlation to measure Vernier acuity in luminance noise and demonstrated that the visual system uses spatial filters to detect misalignment for abutting stimuli and local signs for widely separated stimuli. Our use of position noise allows us to further identify the contribution of each target element (or sample) for position processing and to also quantify the observers' internal noise.

To summarize these results, we pooled classification images across all observers to produce a two-dimensional mean behavioral receptive field for all offset levels (Figure 5b). From this map, we can visualize the role of different parts of the stimulus in judging relative position. Each panel, representing data from more than 30,000 trials collected over 30 experimental sessions, shows the two-dimensional template for our position task. The tuning peak gradually shifts from patch position 2 (the leftmost panel) to patch position 1 (the rightmost panel) with increasing separation.

Our modeling reveals that the imperfectly shaped template can explain a substantial part of the variation in threshold (Figure 6a). We compared our human observers' templates to that of an ideal observer (i.e., a machine that knows the precise details of the stimulus and task) to assess the efficiency of the human template. For the ideal observer, identical weighting is assigned to all stimulus samples. However, the templates used by human observers are not ideal (see Figure 5). The human template is indeed very efficient (about 70% efficiency) for abutting stimuli; the template threshold is just slightly higher (worse) than the ideal threshold (Figures 6a and d). The observers' template threshold increases as separation increases. For the widely separated stimuli, human observers adopt a very inefficient template (about 40% efficiency) for position discrimination. The template observer's threshold is considerably higher than that of the ideal observer, but still well below that of the human observers (Figure 6d). While we often consider human position acuity to be remarkably good, compared to the ideal observer it is rather poor—about a factor of 2 worse at small separations, and more than a factor of 4 worse at large separations.

Figure 6. The changes in the three threshold components, (a) template, (b) random noise, and (c) systematic noise with separation. (d) The squared threshold area map showing the relative contribution of each component to position threshold.

Why is human performance so much worse than ideal and why does it change with separation when the ideal thresholds do not? While an inefficient template clearly contributes to the human threshold performance, it does not fully account for the loss of human efficiency, and the template threshold changes only modestly with separation (green area in Figure 6d). Below we examine how internal noise contributes to human performance.

Systematic and random noise

It is well known that internal noise plays an important role in human vision (Barlow, 1956; Pelli, 1990). We used a novel 10-pass paradigm in which the identical stimulus and noise sequences were repeated 10 times (see Methods section) to assay the observers' internal noise. The amount of response disagreement between the 10 independent passes allows the system's noise beyond the template response to be parsed into two components: (1) random noise that is independent across multiple presentations of the identical stimulus, and (2) systematic noise that is perfectly correlated across multiple presentations. Systematic noise is completely dependent on the external noisy stimulus configuration.

We found that, for the abutting stimuli, the random noise was about 0.52 arcmin for each Gabor patch and the systematic noise was about 0.07 arcmin (Figures 6b and c). Note that the calculations of random noise were based on all 96 stimuli. Our measurement of systematic and random noise is very robust, and it is quite independent of the stimulus offset level. Random noise increases more or less linearly as stimulus separation increases; in contrast, systematic noise decreases rapidly to about 0.02 arcmin at 10λ. Thus, the increase in random noise is an important factor in accounting for the change in threshold with increasing separation. The effect of external noise was to produce an upward shift of the power function (Figure 4b), increasing the thresholds by the same proportion for a wide range of separations. This kind of effect is typically attributed to changes in either template tuning or multiplicative internal noise, consistent with our results. Moreover, it implies that the change in random noise with separation is not additive in nature.

We speculate that there might be a close relationship between systematic noise and the contextual effect at small separations we mentioned earlier in Figure 4b. The contextual effect also plays an important role in learning contrast discrimination (Yu, Klein, & Levi, 2004).

The relative contributions of these components are summarized in the squared threshold area map in Figure 6d. The squared threshold map shows how much of the variance in performance can be attributed to the template, systematic, and random noise. The bottom grey area shows the threshold of an ideal observer, in which the observer knows all information about the stimulus and the same weighting is given to each of five patches. The four variances are estimated independently and it is gratifying that Figure 6d shows that their sum is close to the human threshold.
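The bookkeeping behind a squared threshold map can be sketched as a sum of independent variances. The partition below (ideal, template excess over ideal, random, systematic) and all numerical values are assumptions for illustration; they are expressed in threshold units (arcmin) and are not the values measured in the study.

```python
import math

def predicted_threshold(ideal, template, random_noise, systematic_noise):
    """Combine independent threshold components in quadrature.

    All inputs are in threshold units (arcmin). The template component is
    entered as its excess over the ideal observer, so the first two terms
    together equal the squared template threshold.
    """
    components = {
        "ideal": ideal ** 2,
        "template excess": template ** 2 - ideal ** 2,
        "random": random_noise ** 2,
        "systematic": systematic_noise ** 2,
    }
    total = math.sqrt(sum(components.values()))
    return total, components

# Hypothetical component values (illustrative only).
total, components = predicted_threshold(
    ideal=0.30, template=0.36, random_noise=0.28, systematic_noise=0.04
)

for name, squared in components.items():
    share = squared / total ** 2
    print(f"{name:16s} {squared:.4f} arcmin^2 ({share:.0%} of total)")
print(f"predicted threshold: {total:.3f} arcmin")
```

Plotting the cumulative sums of such components against separation gives a stacked area map of the kind described for Figure 6d.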

Weber's law for position

It is well established that position accuracy degrades proportionally with increasing feature separation. Under ideal conditions, when thresholds of broadband stimuli, that is, lines or dots, are plotted as a function of separation on log–log scale, the best fitting line is a power function with an exponent or slope of ≈0.7 to 1 and is termed “Weber's law” for position (Levi, Klein, & Yap, 1988). A considerably shallower slope is reported in studies using narrowband Gabor stimuli (Hess & Hayes, 1993; Whitaker et al., 2002).

There are (at least) three putative mechanisms to explain this threshold change with separation.

Hypothesis 1: Alignment performance could be limited by an orientation cue (Sullivan, Oatley, & Sutherland, 1972; Watt, 1984). For widely separated stimuli, an increase in offset displacement is necessary to maintain a constant detectable orientation cue. The constant orientation hypothesis predicts a slope of 1; however, as noted in the introduction, with narrowband stimuli (like those used here) the slope is considerably lower than 1.

Hypothesis 2: With increasing separation, there is a shift in the spatial scale of analysis such that larger filters are engaged as separation increases (Hess & Hayes, 1993; Whitaker et al., 2002). For narrowband stimuli such as Gabor patches, at small separations thresholds are inversely proportional to the carrier spatial frequency. At large separation, thresholds are determined by spatial scale characteristics of the envelope.

The use of narrowband stimuli in this study makes it unlikely that there is a strong shift in spatial scale (at least of low level linear filters). Moreover, an earlier study using broadband stimuli and spatial frequency masking showed only a modest spatial shift for separations less than 1 deg (Waugh & Levi, 1995). Our classification image data seem to be consistent with these findings. If a larger two-dimensional spatial filter is recruited for processing position information at larger separations, the optimal strategy would be to select those filters covering the entire segment region, and thus a greater weighting would be given to those middle patches. However, it appears from Figure 5a that the localization judgments were mostly based on the first two patches (observers RL, HD, and LN) for all separations, and the template tuning did not “shift to the right” when separation increased up to 10λ (1 deg).

While we can rule out Hypotheses 1 and 2 as the sole explanation, we note that both Hypotheses 1 and 2, when combined with a “floor” imposed by either blur (the Gaussian envelope) or the carrier cue, could result in a lower slope. The elegant study of Whitaker et al. (2002) shows that the envelope cue has a constant threshold of approximately 0.15 times the Gaussian standard deviation (σ) for separations up to ≈15 σ. For our stimuli, the Gaussian envelope standard deviation of 3.50 arcmin (for the vertical orientation) implies a floor of ≈0.52 min. The Whitaker et al. carrier floor for a very closely separated stimulus is approximately 0.3 min (for DW with an 8 c/deg carrier—their Figure 5) similar to RL's “no-noise” thresholds at the smallest separation (our Figure 4). Thus, we cannot completely exclude either mechanism; however, we discuss a more parsimonious alternative below.

As the stimuli are separated, their eccentricity covaries with their separation, and because the cortical filters at large eccentricities are not as closely and regularly packed as in the fovea (Levi et al., 1988; Waugh & Levi, 1995), the position of stimulus features becomes more uncertain when separation/eccentricity increases. Indeed, when position stimuli are presented on an isoeccentric arc, at a given eccentricity the threshold remains constant over a range of separations. Thus, there is no effect of separation (Levi & Klein, 1990; Levi et al., 1988), and Weber's law (for non-isoeccentric stimuli) can be attributed to the effects of eccentricity. Consistent with this notion, our 10-pass results show that random positional noise increases with separation.

Summary

In this study, we examined the factors limiting human position acuity for a range of separations. We show that humans use an inefficient template for position processing that contributes in part to the threshold. By keeping track of the trial-by-trial effects of noise, we were able to reveal the observers' strategy for processing the local sample information along the length of the target for position acuity.

We demonstrated that the human template changes with feature separation. In general, when the feature separation is small, most of the proximal patches are used for localization judgments. When the features are widely separated, only the most proximal patch is weighted, and the most distal patch shows a weak repulsion, as would be expected from tilt uncertainty (see the Results and discussion section). In a previous study, we reported a similar finding: the outermost patches always show a small repulsion effect on shape perception (Levi, Li, & Klein, 2003).

Classification images for position tasks have been derived previously (Beard & Ahumada, 1998; Levi & Klein, 2002, 2003) using luminance noise. Our use of a discretely sampled stimulus pattern, combined with positional noise, enabled us to construct a behavioral receptive field that provides a quantitative measure of the influence of each sample point in 2D space on the observer's responses in making relative position judgments.
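To illustrate the idea of recovering a template from binary positional noise, the following minimal sketch regresses simulated responses on the per-patch noise and normalizes the weights to sum to 1. It is an illustration under stated assumptions, not the analysis code used in the study: the five-patch layout (consistent with 32 noise combinations), the simulated observer, its template values, the internal noise level, and the use of a continuous response instead of discrete ratings are all hypothetical.

```python
import numpy as np


def estimate_template(noise, responses):
    """Least-squares weight of each patch's positional noise on the response.

    noise:     (n_trials, n_patches) array of per-patch positional offsets
    responses: (n_trials,) array of observer responses
    Returns the weights normalized to sum to 1.
    """
    # Append a column of ones so the fit includes an intercept.
    design = np.column_stack([noise, np.ones(len(noise))])
    coef, *_ = np.linalg.lstsq(design, responses, rcond=None)
    weights = coef[:-1]
    return weights / weights.sum()


rng = np.random.default_rng(0)
n_trials, n_patches = 5000, 5  # assumption: 5 noisy patches (2**5 = 32 combinations)

# Binary positional noise: each patch is shifted up or down by 0.67 arcmin.
noise = rng.choice([-0.67, 0.67], size=(n_trials, n_patches))

# Hypothetical observer: heavy weight on proximal patches, weak repulsion distally.
true_template = np.array([0.5, 0.3, 0.15, 0.1, -0.05])
responses = noise @ true_template + rng.normal(0.0, 0.5, n_trials)

estimated = estimate_template(noise, responses)
print(np.round(estimated, 2))
```

With enough trials the estimated weights approach the simulated template; with real rating data the same regression would be applied to the recorded responses.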

By making multiple passes through our stimuli, we were able to accurately quantify the effects of internal noise, both random and systematic, at different separations. The increase in random noise with wider separation requires stronger signals for stimulus localization. Although systematic noise is thought to play an important role in limiting detection tasks (Levi et al., 2005), for our position task, systematic noise is negligible, especially at the largest separation.
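One way to see how repeated passes separate the two noise sources is the generic variance decomposition sketched below. It is an assumption-laden illustration, not the exact procedure of this study: random noise is taken as the response variance across passes of the same stimulus, and systematic noise as the scatter of the mean responses around a regression line that exceeds what random noise alone predicts. The function name, the simulated observer, and all numeric values are hypothetical.

```python
import numpy as np


def decompose_noise(template_response, ratings):
    """Split internal noise into random and systematic variance.

    template_response: (n_stimuli,) predicted template output per stimulus
    ratings:           (n_stimuli, n_passes) responses to repeated stimuli
    """
    n_stimuli, n_passes = ratings.shape

    # Random noise: variability across passes of an identical stimulus.
    random_var = ratings.var(axis=1, ddof=1).mean()

    # Regress the mean response per stimulus on the template response.
    mean_rating = ratings.mean(axis=1)
    slope, intercept = np.polyfit(template_response, mean_rating, 1)
    residual = mean_rating - (slope * template_response + intercept)
    residual_var = residual.var(ddof=2)  # two fitted parameters

    # Random noise alone leaves random_var / n_passes in the mean ratings;
    # anything beyond that is attributed to systematic noise.
    systematic_var = max(residual_var - random_var / n_passes, 0.0)
    return random_var, systematic_var


rng = np.random.default_rng(1)
n_stimuli, n_passes = 96, 10  # 32 noise combinations x 3 offsets, 10 passes
template_response = rng.normal(0.0, 1.0, n_stimuli)

# Hypothetical observer with random noise (SD 0.5) and no systematic noise.
ratings = 0.8 * template_response[:, None] + rng.normal(0.0, 0.5, (n_stimuli, n_passes))

random_var, systematic_var = decompose_noise(template_response, ratings)
print(f"random variance: {random_var:.3f}, systematic variance: {systematic_var:.3f}")
```

For this simulated observer the random variance should come out near 0.25 and the systematic variance near zero, which is the qualitative pattern described for the position task.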

Conclusions

Human observers weigh individual parts of the stimulus differently and, importantly, the shape of the template changes with feature separation.

Compared to an ideal observer, human performance is limited by a template that becomes less efficient as feature separation increases and by an increase in random internal noise.

Systematic noise plays a negligible role in our position task.

Acknowledgment

This work was supported by research grants R01EY01728 (DML) and R01EY03776 (SAK) from the National Eye Institute. The authors thank Peter Neri for his useful comments on an earlier version of the manuscript.

Commercial relationships: none.

Corresponding author: Dennis M. Levi.

Email: dlevi@berkeley.edu.

Address: School of Optometry, University of California, Berkeley, CA, 94720.

Figure 1

Visual stimuli with (a) 1.25λ and (b) 10λ separations. The binary noise of 0.67 arcmin, either upward or downward, was added to the right segment only. On each trial, the test line was presented randomly in one of three positions: aligned with the reference line, one step above it, or one step below it.

Figure 2

Signal detection. Each of three offset levels (−, 0, and +) consists of 32 different noise combinations (0 to 31), see legend in Figure 3 for stimulus numbering. The classification of the analog template response into a discrete human response (−2 to 2) is based on the observer's internal criteria (c).

Figure 3

An example showing the linear relationship between template response and human response; the stimulus is abutting with a 0.5-arcmin position offset and the noise is 0.67 arcmin. There are a total of 96 data points (32 noise combinations × 3 stimulus levels) in the figure; the red and green triangles show the data for negative (up) and positive (down) offset levels, respectively, and the blue circles show the data for zero offset level. Each data point consists of 10 trials (10-pass) for the stimuli with identical noise configuration. Dotted lines show the 1 and 2 SD confidence intervals of the regression line.

Figure 4

(a) The increase in position threshold in noise with increasing separation. For comparison, we also measured position threshold with no noise in one observer, RL (▪). It is obvious that external noise makes the performance worse. (b) The decrease in contextual effect with stimulus separation. To show the contextual effect, we calculated the threshold for the six lined-up stimuli embedded in noise.

Figure 5

Influence function. (a) Classification images for a range of position offsets (subthreshold d′ < 0.75, threshold 0.75 ≤ d′ ≤ 1.25, and suprathreshold d′ > 1.25) at three separations. The regression coefficients are computed for offset stimuli. Note that the template is normalized to sum to 1. The abscissa represents the relative position of each Gabor patch as shown in Figure 1. From session to session, the offset magnitude was varied (in random order) to obtain the classification images for stimuli with a range of d′ values from 0 to 2. Three of our observers completed all three separations (small, medium, and wide); observer LN only completed the small and medium separations. (b) Behavioral receptive field for position acuity. The classification images for all position offsets in Figure 5a are pooled together in a contour map. The z-axis indicates regression coefficients for all location images, and the x- and y-axes indicate the position of each location image in 2D space.

Figure 6

The changes in the three threshold components, (a) template, (b) random noise, and (c) systematic noise with separation. (d) The squared threshold area map showing the relative contribution of each component to position threshold.