People can hit rapidly moving balls with amazing precision. To determine how they manage to do so, we explored how various factors that we could manipulate influenced people's precision when intercepting virtual targets. We found that temporal precision was highest for fast targets that subjects were free to intercept wherever they wished. Temporal precision was much poorer when the point of interception was specified in advance. Examining responses to abrupt perturbations of the target's motion revealed that people adjusted where rather than when they would hit the target if given the choice. A model that combines judging how long it will take to reach the target's path with estimating the target's position at that time from its visually perceived position and velocity could account for the observed precision with reasonable values for all the parameters. The model considers all relevant sources of errors, together with the delays with which the various aspects can be adjusted. Our analysis provides a biologically plausible explanation for how light falling on the eye can guide the hand to intercept a moving ball with such high precision.

Introduction

The best players in cricket (McLeod & Jenkins, 1991; Regan, 1992) and table tennis (Bootsma & van Wieringen, 1990) are reputed to reliably hit balls when the time window within which they must do so is 4 ms or less. McLeod, McLaughlin, and Nimmo-Smith (1985) asked people with no specific training to hit falling balls with a bat and combined the ball's size and speed with the width of the bat to determine the time window for hitting the ball (see also Brouwer, Smeets, & Brenner, 2005; Tresilian & Lonergan, 2002; Tresilian & Plooy, 2006; Tresilian, Plooy, & Carroll, 2004). Relating the percentage of balls that were hit to this time window gives an estimated standard deviation for the timing of the hits of just above 5 ms (throughout this article, we use the standard deviation as our measure of precision). Using a similar task and subjects, we found a temporal precision of just above 6 ms (Brenner, van Dam, Berkhout, & Smeets, 2012).

To fully appreciate how precisely people can intercept moving targets, it is revealing to consider estimates of temporal precision in simpler, related tasks. When people are asked to tap with two hands in synchrony, the variability in the relative timing of the two hands suggests that a desired movement can be produced with a temporal precision of 6 ms at best (Brenner et al., 2012; Doumas & Wing, 2007; Doumas, Wing, & Wood, 2008). Temporal precision is 20 ms at best for simple visual tasks such as judging whether or not two stimuli were presented at the same time (Virsu, Oksanen-Hennah, Vedenpää, Jaatinen, & Lahti-Nuuttila, 2008) or which of them was presented first (without being able to infer this from perceived motion; Baruch, Yeshurun, & Shore, 2013; Brenner & Smeets, 2010; Exner, 1875; Nava, Bottari, Zampini, & Pavani, 2008). The whole film and television industry is based on the fact that we perceive series of flashes with intervals of 17–42 ms as a continuous stream of visual information. Thus, the temporal precision of interception is amazing.

Several issues need to be considered when discussing temporal precision. We know that arm movements are adjusted on the basis of updated sensory information until doing so is no longer possible due to sensorimotor delays (Brenner & Smeets, 2011; Carlton, 1981). During the sensorimotor delay, one must rely on predictions based on the target object's judged position and motion, either to determine where the target object will be at the moment of impact or to determine when the target object will reach the planned interception point. We also know that the hand moves along with the target object to make precise timing of the moment of contact less important (Brenner & Smeets, 2005). Moreover, the hand's speed near the time of impact is adjusted to the target object's speed to optimize the balance between spatial and temporal errors (Brouwer, Brenner, & Smeets, 2000). The balance depends on the object's speed because temporal errors give rise to larger spatial errors if the object is moving faster. Moving faster oneself decreases spatial precision but increases temporal precision (Schmidt, Zelaznik, Hawkins, Frank, & Quinn, 1979).

In this article we examine whether continuously adjusting movements on the basis of the latest sensory information (Brenner, Driesen, & Smeets, 2014; Brenner & Smeets, 2011; Land & McLeod, 2000) while relating the adjustments to the information in a clever manner (Caljouw, van der Kamp, & Savelsbergh, 2004; Rushton & Wann, 1999) could account for the high precision that is found, and if so, how. Since the timing of interception is not always so precise (Brenner & Smeets, 2009; Katsumata & Russell, 2012; Tresilian, Oliver, & Carroll, 2003), we first ran six sessions to determine how various combinations of visual information and movement requirements influence temporal precision. We ran two additional sessions to estimate sensorimotor latencies that we considered to be critical for modeling how visual information guides the hand. Estimates of the propagation of errors revealed that our model could account for the data with reasonable parameters for the various components. The model with these parameters was tested in a final session.

Materials and methods

The task was always to tap with one's right index finger on virtual targets that moved rightward across a screen. We used virtual targets for our study because they make it easy to manipulate the task constraints in many ways. We started with six sessions in which various details of the task that might influence precision were varied. In Sessions 7 and 8, the target could jump 1 cm as soon as the index finger started to move. This occurred on about half the trials and was used to see how subjects adjusted to errors and with what latency. In the final session, three subjects each performed many trials of a set of conditions that was specially selected to test a model based on the results of the first eight sessions.

Table 1 provides a quick overview of what we wanted to examine in each session, which subjects took part in each session, and how many trials in total each subject completed in each session. Only S1 (an author) was explicitly aware of the manipulations under study, but most of the variations were quite evident. In total, there were 19 subjects, of whom five were men and 14 women. The sessions were conducted on separate days, in the order in which they are described. The last session was split into three parts, with breaks of at least 1 h and at most 2 days between the parts. The study was part of a research program that has been approved by the local ethics committee.

Table 1

The nine sessions. Notes: S1 is an author. S1, S8, S9, S18, and S19 are male.

| Session | Studied manipulation (variation in) | Subjects | Trials per subject |
|---|---|---|---|
| 1 | Image rate | S1, S2, S3, S4, S5, S6 | 200 |
| 2 | Target velocity | S1, S2, S3, S4, S5, S6, S7, S8 | 180 |
| 3 | Finger's starting point | S1, S2, S3, S4, S8, S9, S10, S11 | 228 |
| 4 | Target acceleration | S1, S2, S3, S4, S5, S9, S10, S12 | 200 |
| 5 | Viewing time, size of interception region, target velocity | S1, S2, S3, S4, S7, S8, S13, S14 | 240 |
| 6 | Presence of interception region, finger's starting point | S1, S2, S4, S5, S6, S7, S9, S13 | 320 |
| 7 | Horizontal and vertical target jump | S1, S2, S4, S8, S9, S11, S15, S16 | 220 |
| 8 | Horizontal target jump (fixed interception region) | S1, S2, S7, S9, S13, S17, S18, S19 | 220 |
| 9 | Target velocity and starting point, presence of interception region | S1, S2, S9 | 729 |

The setup

The experiment was conducted in a normally illuminated room (fluorescent illumination). Images were projected at 120 Hz (InFocus DepthQ Projector; resolution: 1280 × 768 pixels) from behind onto a 1.25-m × 1.00-m (width by height) acrylic rear-projection screen (Techplex 150) that was tilted backwards by 30°. The image was slightly smaller than the physical screen, so that image resolution was about 1 mm/pixel. Subjects stood in front of the screen and tapped the screen with their right index finger (Figure 1). They were not restrained in any way. An Optotrak 3020 that was placed at about shoulder height to the left of the screen measured the position of an infrared light-emitting diode attached to the nail of the subject's right index finger at 500 Hz.

At the beginning of each session, the diode position was measured when the fingertip was at four indicated positions on the screen (dots presented at the corners of an imaginary 60-cm × 50-cm rectangle at the screen center). This simple four-point calibration was used to relate the position of the fingertip to the projected images, automatically correcting for the fact that the diode was attached to the nail rather than the tip of the finger. The Optotrak also measured the position of a second diode that was attached to the left side of the screen and that stopped emitting infrared light for about 10 ms at 1 ms after light fell on a sensor that was placed in the path of the light directed towards the top left corner of the screen. Flashes were presented at the top left corner of the screen at critical moments during the experiment for temporal calibration. Measuring at 500 Hz both the position of the first diode and whether the second diode stopped emitting infrared light allowed us to determine the position of the finger with respect to the screen every 2 ms and to determine the moments at which images were presented to within the same 2 ms (although new images were only presented every 8.3 ms).

Stimulus and procedure

This section describes the aspects of the stimulus and procedure that were the same in all or most sessions. The details in which some sessions differed are mentioned in the descriptions of the individual sessions in the next three sections.

After the calibration, subjects started each trial by placing their index finger at the starting point (a 1.5-cm diameter gray disk) that was usually 5 cm to the right of and 10 cm below the screen center. Subjects could rest whenever they wanted by not placing their finger at the starting point. Between 2.0 and 2.5 s after the finger was placed at the starting point, a 1.5-cm diameter target disk that was moving at a constant velocity to the right appeared 20 cm to the left of and 15 cm above the screen center. The target was gray in Session 1 and white in all other sessions. If the finger left the starting point before the target appeared, nothing happened until it was placed back at the starting point and remained there for the required time.

Once the target appeared, subjects were expected to lift their finger off the screen and to try to tap on the target. In some sessions, they were free to choose when and where to try to intercept the target. In other sessions, a region within which they were to do so was indicated by a dark-gray rectangle (5 or 10 cm to the right of and 15 cm above the screen center). The rectangle's height was 5 cm. Its length was adjusted to the target's velocity so that the target center would be within the region specified by the rectangle for a specified amount of time. The target passed in front of rather than behind the rectangle, so it was always visible. The rectangle was only present when subjects were required to hit the target at an indicated position. Once a tap was detected (deceleration larger than 250 m/s² while the finger was less than 0.5 cm above the screen and within 2.25 cm of the target's path), the performance was evaluated and feedback was provided for 500 ms.

To determine whether the target had been hit, we compared the position of the fingertip at the moment of the tap with the (interpolated) target position at that moment. If the position of the marker was within the outline of the target at the moment of the tap we considered the target to have been hit. All the delays in our equipment were considered when doing so. If the target was hit, it disappeared. If the target was hit within the interception region, or if there was no interception region, a sound indicated that the hit was successful. If the target was missed, it deflected away from the finger at 1 m/s. For example, if the finger tapped above and to the left of the target, the target moved down and to the right. Although subjects could see their finger and the target throughout, it was extremely difficult to tell whether one had tapped at the right moment without the explicit feedback.
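The hit criterion described above can be sketched as follows. This is a minimal illustration, not the software used in the experiment: the function names are hypothetical, the linear interpolation between displayed frames and the inclusive boundary are assumptions, and the correction for equipment delays is left out.

```python
import math

def interpolate_target_x(t, t0, x0, t1, x1):
    """Linearly interpolate the target's lateral position (cm) at time t,
    given its position on two displayed frames (t0, x0) and (t1, x1).
    Linear interpolation is an assumption of this sketch."""
    return x0 + (x1 - x0) * (t - t0) / (t1 - t0)

def is_hit(finger_xy_cm, target_xy_cm, target_diameter_cm=1.5):
    """Return True if the fingertip lies within the target's circular outline
    at the moment of the tap (boundary treated as inside, by assumption)."""
    dx = finger_xy_cm[0] - target_xy_cm[0]
    dy = finger_xy_cm[1] - target_xy_cm[1]
    return math.hypot(dx, dy) <= target_diameter_cm / 2

# Example: a tap halfway between two frames of a target moving at 40 cm/s.
frame_interval_s = 1 / 120
x_at_tap = interpolate_target_x(frame_interval_s / 2, 0.0, 0.0,
                                frame_interval_s, 40 * frame_interval_s)
print(is_hit((x_at_tap + 0.5, 15.0), (x_at_tap, 15.0)))  # 0.5 cm off centre
```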

The first six sessions

Session 1: Image rate

In Session 1 we varied the presentation rate. Even for our relatively high frame rate (120 Hz), the interval between images on the screen (8.3 ms) is of the same order of magnitude as the precision that we are interested in. To determine to what extent this interval influences performance, we varied the presentation rate by either presenting a target on each frame or doing so on each second, third, fourth, or fifth frame (with blank frames in between), resulting in frame rates of 120, 60, 40, 30, and 24 Hz. Since presenting the same target on fewer frames would reduce the time-averaged target luminance, we scaled the luminance so that the time-averaged target luminance and contrast were the same for all frame rates. Consequently, the presentations with high frame rates looked identical, while those with lower rates looked similar but the target appeared to flicker slightly (as in old motion pictures). Two slightly different target velocities (38 and 42 cm/s) were used, to discourage subjects from moving in the same way on all trials. After 20 practice trials that were not analyzed (two for each of the 10 combinations of frame rate and target velocity), subjects performed 200 trials (20 for each of the 10 combinations of frame rate and target velocity). The trials were presented in random order. The data for the two target velocities were combined for the analysis.
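The relation between the frames on which the target was drawn and the resulting image rate can be checked with a few lines of code. This is only a sketch of the arithmetic; the function names are hypothetical.

```python
PROJECTOR_HZ = 120  # frame rate of the projector

def effective_rate_hz(every_nth_frame):
    """Image rate when the target is drawn on every n-th projector frame."""
    return PROJECTOR_HZ / every_nth_frame

def image_interval_ms(every_nth_frame):
    """Interval between successive images of the target, in milliseconds."""
    return 1000.0 * every_nth_frame / PROJECTOR_HZ

for n in range(1, 6):
    print(n, effective_rate_hz(n), round(image_interval_ms(n), 1))
```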

Session 2: Target velocity

Temporal precision becomes more important as target velocity increases. In Session 2 we therefore varied the target's velocity across a wide range of values (10, 25, 40, 55, 70, and 85 cm/s). In order to make it possible for subjects to hit all such targets within the same region of the screen, the targets with different velocities started at different lateral positions (1.25, 10.625, 20, 29.375, 38.75, and 48.125 cm to the left of the screen center). After 12 practice trials (two for each velocity), subjects performed 180 trials (30 for each of the six velocities) in random order.
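The listed starting positions are consistent with every target reaching the same lateral position after the same time. The sketch below assumes that this reference is 5 cm to the right of the screen center, reached 0.625 s after the target appears; both values are inferred from the listed numbers rather than stated in the text, and the names are hypothetical.

```python
# Assumed reference (inferred from the listed positions, not stated in the text)
REFERENCE_X_CM = 5.0    # lateral position, positive = right of screen centre
TRAVEL_TIME_S = 0.625   # time needed to reach the reference position

def start_position_cm(velocity_cm_s):
    """Lateral starting position of the target (negative = left of centre)."""
    return REFERENCE_X_CM - velocity_cm_s * TRAVEL_TIME_S

for v in (10, 25, 40, 55, 70, 85):
    print(v, start_position_cm(v))
```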

Session 3: Finger's starting point

In Session 3 we varied the required movement rather than visual information from the target. We varied the finger's starting position in two directions. It could be at the original lateral position (5 cm to the right of the screen center) or 10 cm to the left or right of this position. Moreover, it could be at the original distance from the target's path (10 cm below the screen center; 25 cm below the target's path) or 10 or 20 cm closer to the target's path (at or 10 cm above the screen center). After 18 practice trials (two for each of the nine starting positions), subjects performed a set of 90 trials (10 for each starting position) in random order, then a set of 30 blocked trials for the central starting position (5 cm to the right of the screen center) to examine whether not varying the starting point (and therefore the movements that were to be made) across successive trials influenced precision, and finally another set of 90 trials (10 for each starting position) in random order. The two sets of trials with targets presented in random order were combined. The block of trials with a single starting position was analyzed separately. The target always appeared 20 cm to the left of and 15 cm above the screen center, moving at 40 cm/s.

Session 4: Target acceleration

It is known that people are poor at judging acceleration (Brouwer et al., 2002; Gottsdanker, Frick, & Lockard, 1961; Werkhoven, Snippe, & Toet, 1992). In Session 4 we examined whether this makes them very imprecise in intercepting accelerating targets. There were targets that moved at a constant velocity (40 cm/s) as well as targets that accelerated or decelerated at either 20 or 40 cm/s². The accelerating targets started moving at 33.75 or 27.5 cm/s, and the decelerating targets started moving at 46.25 or 52.5 cm/s, so that all targets moved 25 cm during the first 625 ms. To make sure that subjects did not learn to tap at a fixed place at that time, the targets started moving either 15 or 25 cm to the left of the screen center. After 20 practice trials (two for each combination of acceleration and starting position), subjects performed 200 trials (20 for each of the 10 combinations) in random order. The data for the two starting positions were combined for the analysis.
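The initial velocities follow from the requirement that every target covers 25 cm in the first 625 ms. A minimal sketch of that calculation, using constant-acceleration kinematics (the function name is hypothetical):

```python
DISTANCE_CM = 25.0   # distance covered by every target ...
DURATION_S = 0.625   # ... during the first 625 ms

def initial_velocity_cm_s(acceleration_cm_s2):
    """Initial velocity v0 that satisfies d = v0*t + a*t**2/2.
    Negative acceleration values denote deceleration."""
    return DISTANCE_CM / DURATION_S - acceleration_cm_s2 * DURATION_S / 2

for a in (40, 20, 0, -20, -40):
    print(a, initial_velocity_cm_s(a))
```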

Session 5: Viewing time, size of interception region, target velocity

Estimates of the target's speed might improve if the target has been visible for a longer time (de Bruyn & Orban, 1988; Rasche & Gegenfurtner, 2009), so in Session 5 we varied the time for which the targets were visible before being hit. We varied the target velocity across trials, so that relying on the velocity from previous trials would decrease rather than increase precision. In order to control the viewing time, we instructed subjects to intercept the targets within an indicated interception region (5 cm to the right of the screen center). We selected the targets' starting positions for each velocity (30 or 40 cm/s) such that they reached the center of the interception region after a viewing time of 600, 900, or 1200 ms. To vary how precisely subjects had to time their taps in order to hit the target within the region, we adjusted the width of the interception region to the target's velocity so that the center of the target was within the region for either 50 or 100 ms (e.g., for a target moving at 30 cm/s and a duration of 100 ms, the interception region was 3 cm wide). The visual feedback did not depend on whether or not subjects tapped when the target was within the interception region, but the auditory feedback did: If the target was not within that region at the time of the tap, there was no sound to indicate that the hit had been successful. After 24 practice trials (two for each combination of viewing time, target velocity, and width of the interception region), subjects performed 240 trials (20 for each of the 12 combinations) in random order.

Session 6: Presence of interception region, finger's starting point

In Session 6 we directly evaluated the influence of specifying an interception region. We also evaluated whether affecting the variability in positions at which the screen was hit in other ways would have a similar influence on temporal precision. We compared four movement conditions: upward movements with an indicated interception region that the target reached 700 ms after it appeared (10 cm to the right of the screen center), upward movements without an indicated interception region (baseline), leftward movements (opposite the direction of target motion) without an indicated interception region, and movements towards the screen from just above the indicated interception region of the first movement condition (but without an explicitly indicated interception region). We expected to find little variability in where the screen was hit when starting just above the screen and when there was an explicit interception region. We expected to find a lot of variability when the hand was moving in the opposite direction from the target. Since half the trials no longer started below the target's path in this session, and having to keep the hand at a starting position high on the screen would be too tiring, we moved the target's path down to the screen center. In all four conditions, the targets appeared 42.5 cm to the left of the screen center and moved to the right at 75 cm/s. The starting point was either 10 cm to the right and 20 cm below the screen center (in the two upward movement conditions), 30 cm to the right of the screen center (in the leftward movement condition), or 10 cm to the right of the screen center and at a distance of 2 to 4 cm from the screen (in the downward movement condition). In the leftward movement condition, the starting point was on the target's path but subjects were required to lift their finger to tap on the target (and always moved their finger to the left when doing so). 
In the downward movement condition, the position above which the finger was to be held to start a trial was shown on the screen until the finger was at the correct distance (and positioned above the target), at which time it disappeared (informing subjects that their finger was at an adequate position). Each of the four conditions was presented in a separate block of five practice trials followed by 75 trials that were analyzed. The order of the blocks was counterbalanced across subjects.

The two sessions with jumping targets

The 2-cm diameter white target disk appeared 30 cm to the left of and 10 cm above the screen center, moving to the right at 50 cm/s. The starting point (a 1.4-cm diameter red disk) was 10 cm to the right of and 20 cm below the screen center. Taps were identified by the acceleration being larger than 50 m/s² while the finger was less than 0.5 cm above the screen and within 10 cm of the target. The target sometimes jumped as soon as the finger moved 1 cm from its initial position. When it did so, it jumped by 1 cm (while continuing to move).

Session 7: Horizontal and vertical target jump

In Session 7, the target could jump to the left, to the right, up along the screen, or down along the screen. The session started with 20 practice trials without any jumps, after which 100 trials without jumps were randomly interleaved with 100 trials with jumps (25 for each direction of the jump).

Session 8: Horizontal target jump (fixed interception region)

In Session 8, the target jumped only to the left or to the right. It had to be hit within an indicated 2-cm × 2-cm interception region, 10 cm to the right of and 10 cm above the screen center. After 20 practice trials without jumps, there were another 100 trials without jumps randomly interleaved with 100 trials with jumps (50 for each direction of the jump).

A session for testing the model

In the last session, we chose nine conditions for which we expected the temporal precision to differ considerably. These expectations were based on our interpretation of the results of the initial eight sessions (described later in Modeling temporal precision). In all conditions, trials started when the finger was placed on a 1.5-cm diameter gray disk, 10 cm to the right of and 10 cm below the screen center. The trials started with a 1.5-cm diameter white target disk appearing 15 cm above the screen center. From the moment it appeared, the target was moving to the right at one of three different velocities: 10, 18, or 60 cm/s. Each velocity was used in combination with three task configurations.

Session 9: Presence of interception region

In one task configuration, there was an indicated interception region 10 cm to the right of the screen center. The width of the region was adjusted to the target velocity so that the (center of) the target was within this region for 100 ms. Targets appeared 3 cm to the right, 2.6 cm to the left, or 32 cm to the left of the screen center, depending on their velocity, so that they always reached the center of the indicated region after 700 ms. In the other two task configurations, subjects were free to tap on the target wherever they liked. In the second task configuration, the target appeared 10 cm further to the left than when the interception region was indicated (i.e., 7, 12.6, or 42 cm to the left of the screen center, depending on the target's velocity). In the third task configuration, the target started 10 cm further to the right (i.e., 13 or 7.4 cm to the right or 22 cm to the left of the screen center). After 18 practice trials (two for each condition, in random order), there were 225 trials (25 for each of the nine conditions, in random order). Each subject performed three such sessions, so that we had data for 75 trials for each subject and condition.
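The starting positions and region widths for the nine conditions follow from the values given above. The sketch below reproduces them; the configuration labels and function names are hypothetical, and lateral positions are expressed relative to the screen center (negative = left).

```python
REGION_CENTER_CM = 10.0   # centre of the indicated interception region
TIME_TO_REGION_S = 0.7    # target reaches the region centre after 700 ms
TIME_IN_REGION_S = 0.1    # target centre stays within the region for 100 ms

# Shift of the starting position relative to the indicated-region configuration
CONFIG_OFFSETS_CM = {
    "region indicated": 0.0,
    "free, start 10 cm further left": -10.0,
    "free, start 10 cm further right": 10.0,
}

def session9_start_cm(velocity_cm_s, offset_cm=0.0):
    """Lateral starting position of the target for a given velocity."""
    return REGION_CENTER_CM - velocity_cm_s * TIME_TO_REGION_S + offset_cm

def region_width_cm(velocity_cm_s):
    """Width of the indicated region, scaled to the target's velocity."""
    return velocity_cm_s * TIME_IN_REGION_S

for label, offset in CONFIG_OFFSETS_CM.items():
    starts = [round(session9_start_cm(v, offset), 1) for v in (10, 18, 60)]
    print(label, starts)
```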

Analysis

Our main interest was temporal precision: the standard deviation of the timing of the taps relative to the target. To improve our measure of the tapping errors, we did not use the moment of the tap that we determined online based on an acceleration threshold, but redetermined the moment of the tap on the basis of the peak acceleration of the fingertip in the direction orthogonal to the screen (Brenner, Cañal-Bruland, & van Beers, 2013). Acceleration was determined from three consecutive measurements of the distance from the screen (by subtracting the difference between the last two distances from the difference between the first two distances, and assigning the outcome to the moment of the central measurement). This method of determining the moment of the tap is more reliable than the one we used for providing the feedback, but it was not used online because doing so would introduce additional delays in providing the feedback. It also allows us to consider taps that were too gentle to reach the acceleration threshold.
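The three-point estimate of acceleration and the selection of its peak can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code: the function names are hypothetical, the sign convention (positive when the approach towards the screen is abruptly halted) is our choice, and the sample data are made up.

```python
SAMPLE_INTERVAL_S = 0.002  # position sampled at 500 Hz

def accelerations(distances_cm, dt=SAMPLE_INTERVAL_S):
    """Second difference of the distance from the screen, assigned to the
    central sample of each triplet. Units: cm/s^2. Sign convention (an
    assumption): positive when an approaching finger is decelerated."""
    return [
        (distances_cm[i - 1] - 2 * distances_cm[i] + distances_cm[i + 1]) / dt ** 2
        for i in range(1, len(distances_cm) - 1)
    ]

def tap_sample_index(distances_cm):
    """Index of the sample at which the acceleration peaks."""
    acc = accelerations(distances_cm)
    peak = max(range(len(acc)), key=acc.__getitem__)
    return peak + 1  # acc[0] belongs to sample 1

# Made-up trace: the finger approaches at constant speed and stops on the screen.
trace_cm = [4.0, 3.0, 2.0, 1.0, 0.0, 0.0, 0.0]
print(tap_sample_index(trace_cm) * SAMPLE_INTERVAL_S * 1000, "ms")
```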

Once we knew the moment and position of the tap, we could calculate the position of the target and therefore the spatial tapping error. We excluded trials if the subject did not move in time to intercept the target, did not clearly tap the screen, or missed the target by more than 5 cm or more than 100 ms. We divided the component of the spatial error along the target's path by the target's velocity to determine the timing error. We also determined the component of the spatial error orthogonal to the path. We defined movement onset as the moment at which the finger had moved 0.2 cm from where it had been when the target appeared. Reaction time was defined as the time between when a target appeared and movement onset. Movement time was defined as the time between movement onset and when the finger hit the screen. For reasons that will become evident later, we also determined the finger's final lateral velocity and its final velocity towards the screen (these values were determined by averaging the velocity from when the finger was 5 cm from the position that was tapped until it reached 0.5 cm from that position). We determined means and standard deviations for each subject and condition and then averaged across subjects.
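The conversion from spatial error to timing error, and the definition of movement onset, can be sketched as follows. The function names, the sign convention for the error, and the assumption that the first sample coincides with target appearance are illustrative choices, not details taken from the analysis itself.

```python
import math

def timing_error_ms(finger_x_cm, target_x_cm, target_velocity_cm_s):
    """Spatial error along the target's path divided by the target's velocity.
    Sign convention (an assumption): positive when the finger lands ahead
    of the target in its direction of motion."""
    return 1000.0 * (finger_x_cm - target_x_cm) / target_velocity_cm_s

def movement_onset_index(finger_positions_cm, threshold_cm=0.2):
    """Index of the first sample at which the finger is at least
    threshold_cm from its position in the first sample (assumed to be the
    moment the target appeared). Returns None if it never moves that far."""
    origin = finger_positions_cm[0]
    for i, position in enumerate(finger_positions_cm):
        if math.dist(position, origin) >= threshold_cm:
            return i
    return None

# Example: tapping 1 cm ahead of a target moving at 40 cm/s
print(timing_error_ms(1.0, 0.0, 40.0))  # 25.0
```

With samples taken every 2 ms, reaction time is the onset index multiplied by 2 ms, and movement time is the interval from that sample to the moment of the tap.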

For the first six sessions, the consistency of the differences in temporal precision across conditions was evaluated with repeated-measures analyses of variance, separating factors whenever it seemed reasonable to do so (the two directions in Session 3; speed, viewing time, and time window in Session 5). When the main effect was significant and there were more than two options, we conducted pairwise post hoc comparisons using t tests with Bonferroni correction for multiple comparisons. For Session 3, we analyzed the random set in this way and compared the data for movements from the central starting position in the blocked and interleaved trials with a paired t test. For Session 6 we compared the data for the three different starting positions without an indicated interception region in the way just described and compared the data for the same starting positions with and without an indicated interception region with a paired t test. We compared reaction times and movement times across conditions in a similar manner.

For Sessions 7 and 8, we determined the latencies of the responses to the target jumps using the extrapolation method described by Oostwoud Wijdenes, Brenner, and Smeets (2014). We used the mean difference in velocity rather than the mean difference in acceleration (which was too noisy to be used without filtering), so we may be slightly overestimating latency.

Results

No data were excluded for Session 1 or for the sessions with jumping targets (Sessions 7 and 8). In total, one trial was excluded from Session 3, two from Session 5, eight from Session 2, 11 from Session 9, 15 from Session 4, and 63 from Session 6. These were trials in which the finger did not move, a tap was detected very far from the target (criteria mentioned previously in Analysis), or recording the position of the index finger failed (for instance because the subject rotated his or her finger so that the infrared light was no longer detected by the Optotrak). Results that we consider to be essential for understanding the temporal precision of interception are presented in the following sections. Additional information about our subjects' performance can be found in the Appendix.

The first six sessions

Session 1: Image rate

Contrary to what one may expect, temporal precision did not decline when the image rate was decreased (Figure 2.1), F(4, 20) = 0.49, p = 0.75. The standard deviations were about 12 ms, even when the intervals between the images were 41.7 ms so that subjects only saw about 17 images before the tap. The image rate had no significant influence on reaction time (about 360 ms) or movement time (about 355 ms).

Figure 2

Timing precision for all conditions of the first six sessions. The panel numbers correspond with the session numbers. The outlined pictograms show the layout on the screen, with the starting point in red, the moving target in black, and sometimes a region within which the target is to be hit in green. Error bars are standard errors across subjects. (1) Influence of image rate. The red lines in the additional pictograms show the intervals between presentations of images of the target (with line lengths indicating the target's contrast) in relation to the mean distribution of timing errors (represented by green normal distributions), approximately to scale. (2) Influence of target velocity. (3) Influence of the location of the starting point. Light bars represent the randomly interleaved positions. Dark bar represents the block of trials with a fixed starting point (at the central location). (4) Influence of target acceleration. (5) Influence of viewing time for two target velocities (30 and 40 cm/s) and interception regions that require a hit within two time intervals (A: 100 ms, B: 50 ms). (6) Influence of various factors that might affect the variability in where the screen is hit (baseline; starting on the right; starting just above the screen; indicated interception point).

Temporal precision did depend on the target's velocity, F(5, 35) = 65, p < 0.0001. Post hoc comparisons using paired t tests showed that precision for the slowest targets (10 cm/s) was significantly different from that for all others, and that precision for the second-slowest targets (25 cm/s) was significantly different from that for the targets moving at 55 and 70 cm/s. It may seem counterintuitive that precision would increase with target velocity (the number of hits does not; see Appendix). The decrease in the standard deviation of the timing error with target speed is probably caused by the same spatial errors being interpreted as smaller temporal errors if the target is moving fast. We will return to this issue later in Modeling temporal precision. The target's velocity influenced both reaction time, F(5, 35) = 11, p < 0.0001, and movement time, F(5, 35) = 12, p < 0.0001. For a possible explanation of why movement time is shorter for faster targets (Table 2) see Brouwer et al. (2000). One might also have expected reaction time to decrease with increasing target velocity (Smeets & Brenner, 1994), but since the faster targets appeared at a larger eccentricity, the effects of velocity and eccentricity probably influenced reaction times in opposite directions (Tynan & Sekuler, 1982).
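The proposed explanation, that a fixed spatial error translates into a smaller temporal error for a faster target, can be illustrated with a few lines of arithmetic. The sketch below assumes a spatial standard deviation of 3 mm along the target's path (an illustrative value) and ignores every other source of timing variability; the function name is hypothetical.

```python
def temporal_sd_ms(spatial_sd_mm: float, target_speed_cm_s: float) -> float:
    """Timing SD (ms) implied by a given spatial SD (mm) along the target's path."""
    speed_mm_per_ms = target_speed_cm_s * 10.0 / 1000.0  # cm/s -> mm/ms
    return spatial_sd_mm / speed_mm_per_ms


SPATIAL_SD_MM = 3.0  # assumption: illustrative value, not a fitted parameter

# Target velocities of Session 2 (cm/s)
for speed in (10, 25, 40, 55, 70, 85):
    print(f"{speed:2d} cm/s -> {temporal_sd_ms(SPATIAL_SD_MM, speed):5.1f} ms")
```

Under this simplification the implied timing error falls monotonically with target speed, which is the direction of the effect reported for Session 2.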

Table 2

Mean (and standard deviation) reaction times (RTs) and movement times (MTs) across subjects in Session 2.

| Target velocity (cm/s) | RT (ms) | MT (ms) |
|---|---|---|
| 10 | 340 ± 16 | 434 ± 41 |
| 25 | 321 ± 12 | 394 ± 26 |
| 40 | 324 ± 11 | 374 ± 21 |
| 55 | 334 ± 14 | 351 ± 15 |
| 70 | 347 ± 15 | 337 ± 13 |
| 85 | 362 ± 13 | 318 ± 11 |

Session 3: Finger's starting point

The finger's position at the beginning of the trial did not influence the precision of the hit: distance from path: F(2, 14) = 0.16, p = 0.85; lateral position: F(2, 14) = 0.19, p = 0.83; interaction: F(4, 28) = 0.60, p = 0.67. That the best precision was obtained for the block of trials starting at the central starting point suggests that repeating precisely the same movement may improve performance a bit, but the comparison with performance when starting at the central starting point while all nine starting points were interleaved was not statistically significant, t(7) = 1.7, p = 0.14. A slight improvement when repeating the same movement would be consistent with evidence that people rely on information from the previous trial to some extent (as demonstrated for judged target speed by de Lussanet, Smeets, & Brenner, 2001). Reaction time and movement time were both influenced by the position of the starting point, F(8, 56) = 18, p < 0.0001, and F(8, 56) = 26, p < 0.0001, respectively, for the interleaved starting points. Reaction time was shortest when starting the movement at the bottom left (306 ms) and longest when starting at the top right (412 ms). Movement time was shortest when starting the movement at the top left (212 ms) and longest when starting at the bottom right (407 ms).

Session 4: Target acceleration

The target's acceleration influenced temporal precision, F(4, 28) = 51, p < 0.0001. All pairwise comparisons were significant except the comparison between precision for targets moving at a constant velocity (acceleration of 0 cm/s²) and those accelerating at 20 cm/s², between precision for targets moving at a constant velocity and those decelerating at 20 cm/s² (acceleration of −20 cm/s²), and between precision for the targets accelerating at 20 and 40 cm/s². The best precision was obtained with accelerating targets rather than with ones traveling at a constant velocity. We will return to this finding in the Discussion. The target's acceleration influenced both reaction time, F(4, 28) = 28, p < 0.0001, and movement time, F(4, 28) = 22, p < 0.0001. Reaction time was probably longer when acceleration was higher (Table 3) because the accelerating targets were moving more slowly when they appeared, and reaction time is longer for slower targets (Smeets & Brenner, 1994). Movement time was probably shorter when acceleration was higher (Table 3) because the accelerating targets were moving faster at the time of the hit, and people move faster when hitting faster targets (Table 2; Brouwer et al., 2000).

Table 3

Mean (and standard deviation) reaction times (RTs) and movement times (MTs) across subjects in Session 4.

| Target acceleration (cm/s²) | RT (ms) | MT (ms) |
|---|---|---|
| −40 | 297 ± 7 | 432 ± 24 |
| −20 | 299 ± 8 | 421 ± 17 |
| 0 | 306 ± 8 | 390 ± 15 |
| 20 | 309 ± 10 | 379 ± 17 |
| 40 | 318 ± 10 | 365 ± 13 |

Session 5: Viewing time, size of interception region, target velocity

Temporal precision did not improve when we extended the viewing time, F(2, 14) = 0.23, p = 0.79. Neither did the modest variation in the target's velocity affect temporal precision, F(1, 7) = 1.3, p = 0.29. Temporal precision was better when the interception region was larger (bars marked A in Figure 2.5), F(1, 7) = 11, p = 0.01. A striking aspect of this session is that temporal precision was much poorer than it was for about the same target velocities in Sessions 1–4. This is presumably the result of subjects having to hit the target in an indicated interception region, because the effect is larger for the smaller interception region. Note that the fact that precision was poorer when there was an indicated interception region cannot mean that subjects simply did not have enough time to intercept the target within that region, because extending the viewing time did not matter. None of the interactions were significant: viewing time × velocity: F(2, 14) = 0.47, p = 0.64; velocity × region: F(1, 7) = 0.21, p = 0.66; viewing time × region: F(2, 14) = 0.11, p = 0.90; viewing time × velocity × region: F(2, 14) = 0.26, p = 0.78. Reaction time and movement time were influenced by viewing time (significant main effects as well as many significant interactions). Reaction time was also longer, F(1, 7) = 27, p = 0.001, and movement time shorter, F(1, 7) = 46, p = 0.0003, for the faster targets. There were no significant effects on reaction or movement times of the size of the interception region nor of interactions between the size of the interception region and target speed.

Session 6: Presence of interception region, finger's starting point

The standard deviation in the position at which the finger hit the screen was 7 mm when the interception region was indicated explicitly, 15 mm when starting just above the screen, 23 mm in the baseline condition, and 32 mm when starting on the right. Thus, our manipulations influenced the variability in where subjects hit the screen more or less in the manner that we anticipated (see Appendix for further details). In accordance with the notion that precision is poorer when the target has to be hit within an indicated interception region, there was a significant difference in precision between the baseline condition and the condition with an explicit interception region, t(7) = 3.4, p = 0.01 (comparison between leftmost and rightmost bar in Figure 2.6). This suggests that having the freedom to adjust one's movement is critical for achieving high temporal precision.

The comparison between the three starting positions (three leftmost bars in Figure 2.6) was significant, F(2, 14) = 3.7, p = 0.05, but none of the post hoc comparisons were significant. Precision tended to be poorer when starting with the finger just above the target's path, which reduced the variability in where the finger hit the screen with respect to the baseline condition, but it also tended to be poorer when starting on the right, which increased the variability in where the finger hit the screen with respect to the baseline condition; so precision does not appear to be related to the actual variability in the movements. The timing might be slightly less precise when the finger starts just above the screen because of the lower velocity with which the finger approaches the screen (Brenner et al., 2012; Schmidt et al., 1979), and it might be slightly less precise when starting to the right of the target because doing so makes it more difficult for the finger to move along with the target as it approaches the screen (Brenner & Smeets, 2005). Reaction time was longest (490 ± 57 ms), and movement time shortest (158 ± 38 ms), when starting close to the target's path. The mean movement time in the other three conditions was 309 ± 11 ms, with no significant difference across conditions, F(2, 14) = 1.3, p = 0.3. Reaction time was longer when starting on the right (463 ± 58 ms) than in the other two conditions (374 ± 16 and 318 ± 17 ms, with and without the indicated interception region), F(2, 14) = 4.2, p = 0.04, but the post hoc tests were not significant.

Accounting for target speed and acceleration

In order to better understand the origins of the variations in temporal precision, we examined how judgments of the target's velocity and acceleration might influence the standard deviations. In the sessions in which velocity was varied, the conditions were presented in random order. Consequently, if subjects were not only considering the actual velocity on that trial but also relying on values from previous trials, we would expect to see systematic errors (de Lussanet et al., 2001). Similarly, if subjects failed to adequately consider the acceleration (Brouwer et al., 2002; Lee, Young, Reddish, Lough, & Clayton, 1983) in Session 4, we would expect to see systematic errors.

Relying to some extent on the target's velocity on previous trials would make subjects systematically hit behind the center of fast targets and ahead of the center of slow targets. In fact, they mainly had a tendency to hit ahead of the target center (Figure 3A; in line with the results of Brenner et al., 2013). Nevertheless, in Session 2, in which target velocity was varied, there was a tendency to hit further ahead of the target center the slower the target was moving. The slope of the fit line is about −2 ms (−2 mm per 100 cm/s of target velocity; thick line in Figure 3A; standard error across subjects: 1 ms), which is not significantly different from zero, t(7) = 1.75, p = 0.12 (note the opposite trend for the two values of Session 5). A slope of −2 ms corresponds to underestimating the time until the hit by 2 ms or to giving a weight of up to 2% to the mean velocity or to the velocity on the previous trial (assuming that the sensorimotor delay is at least 100 ms). Thus, on average, our subjects used target velocity approximately adequately.
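The unit conversion and the "up to 2%" bound can be checked directly. The sketch below assumes that the velocity used for extrapolation is a weighted mix, v_used = (1 − w)·v + w·v_other, applied over one sensorimotor delay, so that the slope of the position error against target velocity equals −w times the delay. The function names are hypothetical.

```python
def slope_in_ms(slope_mm_per_100_cm_s: float) -> float:
    """Convert a slope in mm per 100 cm/s of target velocity into ms."""
    seconds = slope_mm_per_100_cm_s / 1000.0  # 100 cm/s = 1000 mm/s; mm/(mm/s) = s
    return seconds * 1000.0                   # s -> ms


def weight_of_other_velocity(slope_ms: float, delay_ms: float) -> float:
    """Weight w for which extrapolating over delay_ms produces the given slope.

    Assumes error = (v_used - v) * delay with v_used = (1 - w) * v + w * v_other,
    so that d(error)/dv = -w * delay.
    """
    return abs(slope_ms) / delay_ms


slope = slope_in_ms(-2.0)                      # -2 mm per 100 cm/s
print(slope)                                   # slope expressed in ms
print(weight_of_other_velocity(slope, 100.0))  # weight for a 100-ms delay
```

Because the weight is the slope divided by the delay, any delay longer than 100 ms implies a weight below 2%, which is why the text gives this value as an upper bound.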

Figure 3

Systematic errors: mean lateral positions of the taps with respect to the target center in sessions in which target velocity (A) or acceleration (B) varied. Positive values indicate hitting ahead of (to the right of) the target. Error bars are standard errors across subjects. Thick lines through the filled symbols are based on the average parameters of linear fits to the individual subjects' data.

Ignoring the target's acceleration would make subjects systematically hit behind the center of accelerating targets and ahead of the center of decelerating targets. We see such a tendency in Session 4 (together with an overall tendency to tap ahead of the target center; Figure 3B). Ignoring the acceleration for a time interval t before hitting the screen will give rise to a linear relationship between the target acceleration a and the error x introduced by ignoring the acceleration: x = (1/2)at². The slope of a fit line relating systematic error to target acceleration can therefore be used to determine the duration of time interval t. This was done separately for each subject. The average slope of the fit line (about −7.4 mm per 100 cm/s² of target acceleration; thick line in Figure 3B) corresponds to ignoring the differences in acceleration for 116 ms (standard error across subjects: 12 ms), which is significantly different from zero, t(7) = 8.82, p < 0.0001. A simple way to interpret this is that subjects were using the target velocity 116 ms before the tap to predict the change in the target's position during the last 116 ms. The target moved for considerably longer than 116 ms, but assuming that the tapping movement is continuously adjusted (Brenner & Smeets, 2011), we would only expect to see effects of anything that is ignored during the final part of the movement, when sensorimotor delays prevent direct feedback-based correction. A sensorimotor delay of about 116 ms is reasonably consistent with the literature (Brenner & Smeets, 1997; Carlton, 1981; Oostwoud Wijdenes et al., 2011). Thus the results are consistent with subjects' constantly updating the judged velocity without considering the acceleration.
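The step from a fitted slope to a time interval follows from x = (1/2)at²: the slope of x against a is t²/2, so t is the square root of twice the slope. The sketch below applies this to the average slope quoted above; the function name is hypothetical. In the study the interval was determined separately for each subject and then averaged, so applying the formula once to the average slope need not reproduce the reported 116 ms exactly.

```python
import math


def ignored_interval_ms(slope_mm_per_100_cm_s2: float) -> float:
    """Interval t (ms) over which acceleration is ignored, from x = 0.5 * a * t**2.

    The slope of error against acceleration has magnitude 0.5 * t**2,
    hence t = sqrt(2 * |slope|) once the slope is expressed in s**2.
    """
    slope_s2 = abs(slope_mm_per_100_cm_s2) / 1000.0  # 100 cm/s^2 = 1000 mm/s^2
    return 1000.0 * math.sqrt(2.0 * slope_s2)


# Average slope across subjects: about -7.4 mm per 100 cm/s^2
print(round(ignored_interval_ms(-7.4)))  # about 122 ms for the average slope
```

The small difference from 116 ms is expected, because the square root of an average is at least as large as the average of the square roots.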

Movement speed and spatial precision

It is well known that the speed at which the finger moves influences spatial precision: Faster movements are less precise (e.g., Fitts, 1954). We found quite a few significant differences in movement time between conditions. To separate spatial variability (related to movement speed) from variability related to judging when to tap, we examined tapping errors orthogonal to the target's motion, which are unaffected by errors in judging when to tap. These errors increase with finger speed, as expected (Figure 4).

Figure 4

Speed–accuracy trade-off in the first six sessions. Each symbol shows the mean values for one condition, averaged across subjects (with the standard errors across subjects). The values for the condition in which the finger started just above the screen in Session 6 are outside the range of the figure (average speed is 166 cm/s; standard deviation is 3.4 mm). The vertical values show the spatial precision orthogonal to the target's motion. The horizontal values show the average speed of the finger, which was determined by dividing the distance between the starting point and the position that was tapped by the movement time. In general, variability increases with increasing movement speed.

To better understand why explicitly specifying an interception region decreases temporal precision, we examined how subjects adjust their ongoing movements, and how long it takes them to do so, by having targets jump once the finger started moving. In Session 7, subjects were free to choose where to intercept the target; in Session 8, they had to do so within an indicated interception region. For lateral jumps, if subjects were free to tap on the target wherever they liked (Session 7), they could adjust their movement time by 20 ms and tap on the target at the usual position, adjust the position by 1 cm and tap on the target at the usual time, or adjust both to complementary extents. If the tap had to be within an interception region (Session 8), subjects had to adjust the movement time (shorter for rightward jumps; longer for leftward jumps). For vertical jumps (Session 7), subjects had to adjust the position, but when doing so they could either maintain the movement time or adjust the movement time to the new distance that they had to move (note that we refer to the direction up and down along the screen as vertical, although the screen is actually slanted backwards).
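The trade-off between adjusting where and adjusting when can be written as a one-line relation. The sketch below assumes a target moving to the right at 50 cm/s and a 1-cm lateral jump, as in the description above; the function name and the sign convention (positive values are rightward shifts or longer movement times) are assumptions made for illustration.

```python
def movement_time_change_ms(jump_cm: float,
                            tap_shift_cm: float,
                            target_speed_cm_s: float = 50.0) -> float:
    """Change in movement time needed to still hit the target after a lateral jump.

    jump_cm: signed size of the jump (positive = in the direction of motion).
    tap_shift_cm: how far the tap position is shifted in the same signed direction.
    Whatever part of the jump is not absorbed by shifting the tap position
    has to be absorbed by tapping earlier (negative result) or later (positive).
    """
    return -(jump_cm - tap_shift_cm) / target_speed_cm_s * 1000.0


print(movement_time_change_ms(1.0, 0.0))   # tap position unchanged: 20 ms earlier
print(movement_time_change_ms(1.0, 1.0))   # tap position shifted by the full jump
print(movement_time_change_ms(1.0, 0.5))   # complementary adjustment of both
print(movement_time_change_ms(-1.0, 0.0))  # leftward jump, position unchanged
```

With a fixed interception region the tap shift is forced to zero, so the whole jump has to be compensated through movement time.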

Session 7: Horizontal and vertical target jump

When subjects were free to tap wherever they liked, they mainly adjusted the position at which they hit the target. The mean movement time increased slightly (though not significantly) to adjust to the target's jumping to the left (red bar in Figure 5A), but the main adjustment to targets' jumping laterally was that the position that was tapped shifted in the direction of the jump (red and blue bars in Figure 5B). The response latency for such lateral adjustments was about 116 ms (Figure 6). When the targets jumped vertically there was obviously a vertical adjustment (Figure 5C). Movement time and lateral position were not affected. The latency of the vertical adjustment in response to the target's jumping upward or downward on the screen was about 109 ms.

Figure 5

How subjects adjusted their movements when they were free to adjust the position of the tap (Session 7) and when the position at which they were to try to tap the target was fixed (Session 8). All values are deviations from the condition with no jump. The colors indicate the direction of the jump. Since the target was moving at 50 cm/s to the right, a 1-cm rightward jump could be compensated for by a 20-ms reduction in movement time, and so on. The error bars are 95% confidence intervals (across subjects' mean values).

Figure 6

Difference between lateral or vertical finger velocities for target jumps in opposite directions. Each curve is the average of eight subjects' values. Blue: difference between the lateral velocities after the target jumped to the right and to the left (Session 7; latency: 116 ms). Green: difference between the vertical velocities after the target jumped up and down (Session 7; latency: 109 ms). Red: difference between the vertical velocities after the target jumped to the right and to the left (Session 8; latency: 169 ms). Shaded areas around the curves indicate the mean plus or minus one standard error across subjects. Black lines are used to determine the latencies (see Materials and methods).

When the interception point was fixed, subjects changed their movement time appropriately (Figure 5A), but it clearly took them longer to change their finger's velocity in order to do so (latency of about 169 ms; Figure 6) than it took to adjust the finger's velocity when they adjusted where they would hit the target.

Adjusting where rather than when

Subjects changed their vertical velocity sooner after the jump to adjust where they would tap the target (when the position was free) than to adjust when they would tap the target (when the position was fixed). This is in line with earlier reports that changing one's timing in response to a change in target velocity takes longer than changing one's position in response to a target jump (Brenner et al., 1998). That changing where one is aiming is faster than changing when one intends to make contact can explain why the former dominates when one is free to do both: Fast changes to aiming positions make later adjustments to timing superfluous. Thus, in terms of adjusting ongoing movements, it makes sense to consider that people pick a time to intercept the target and then determine where the target will be at that time (rather than picking a place and then determining when the target will be there, as is often assumed; de Azevedo Neto & Teixeira, 2009; Lee et al., 1983; Tresilian, 1999). Of course, this does not apply to the initial choice of where to intercept the target (and therefore perhaps not to movement onset; López-Moliner & Bonnet, 2002; Marinovic, Plooy, & Tresilian, 2009), because for that one must consider when the target will be within reach.

Modeling temporal precision

With the notion in mind that it takes longer to adjust the moment of the tap than to adjust the position of the tap, we decided to model temporal precision on the basis of first estimating when one will reach the screen and then determining where precisely to do so. Temporal precision in hitting targets is limited by errors in determining what movement to make (due to errors in judging the position and velocity of both the target and one's own hand) as well as errors in executing that movement. Continuously using updated information to adjust the movement removes some of the initial errors, but delays within the pathway from retinal stimulation to the muscles moving the arm limit the extent to which errors can be removed. For a total sensorimotor delay $\tau$, errors at the moment of the tap will depend on the judgments a time $\tau$ before hitting the screen. A target that is at position $x_\tau$ and is moving at velocity $v_\tau$ with acceleration $a$ at that moment will be at position $x_\tau + v_\tau\tau + \frac{1}{2}a\tau^2$ at the moment of the tap.

The person in question will try to estimate this position. If $\hat{\tau}$ is the person's estimate of the time that it will take his or her finger to cover the remaining distance to the screen at time $\tau$ before hitting the screen (we use a hat to indicate that a measure is the value that the person estimates rather than the true value), $\hat{x}_\tau$ and $\hat{v}_\tau$ are the target's judged position and velocity at that moment, and acceleration is ignored, the person will ultimately aim to hit the screen at position $\hat{x}_\tau + \hat{v}_\tau\hat{\tau}$.

Besides misjudging how long it will take to cover the remaining distance to where one considers the screen to be ($\hat{\tau} \neq \tau$), one might also misjudge the distance to the screen. Doing so will give rise to a timing error that is not considered when estimating where to hit the target. Its magnitude depends on how the finger is moving. For a judged distance $\hat{d}$, a true distance $d$, and a finger moving in depth at velocity $v_{Fd}$, this error in timing will be $(d - \hat{d})/v_{Fd}$. In this time, a target that is moving at a velocity of $v_\tau + a\tau$ (the velocity at the time of the hit) will have moved $[(d - \hat{d})/v_{Fd}](v_\tau + a\tau)$ further than was considered for estimating where to hit. (We disregard the fact that for an accelerating target, the velocity is changing during this time.) If the finger was not approaching the screen orthogonally but followed a curved path to move along with the target as it approached the screen (Brenner & Smeets, 2005), misjudging the distance to the screen will also result in hitting the screen at a different position along the path. For a lateral finger velocity (in the direction of the target's motion just before the hit) of $v_{Fl}$, the finger will hit the screen $[(d - \hat{d})/v_{Fd}]v_{Fl}$ further than planned. Combining all these judgments with the fact that the finger may not move exactly as planned, leading to an additional random execution error $e$, the hit error can be summarized as

Assuming that all the included judgments are independent and unbiased (i.e., that the average judged values are equal to the true values), the variance in the hit error will be

The temporal precision (standard deviation in the timing errors; $\sigma_{\text{timing}}$) can be found by dividing the square root of this variance by the target velocity at the time of the hit:
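The relations described in the three preceding sentences can be sketched from the definitions given in the text. What follows is a reconstruction under stated assumptions rather than the published formulas: the hit error is taken as the finger's position minus the target's position along the target's path at the moment of the tap; products of errors are linearized to first order; the uncertainties in judged velocity and time are written as Weber fractions ($\sigma_v = f_v v_\tau$, $\sigma_\tau = f_\tau \tau$); and the positional and execution variances are assumed to combine into the single spatial term $\sigma_s^2 = \sigma_x^2 + \sigma_e^2$. The symbols $\varepsilon_{\text{hit}}$, $\sigma_x$, $\sigma_v$, $\sigma_\tau$, $\sigma_d$, and $\sigma_e$ are introduced here for illustration.

```latex
\begin{align}
\varepsilon_{\text{hit}} &= (\hat{x}_\tau - x_\tau) + (\hat{v}_\tau\hat{\tau} - v_\tau\tau)
  - \tfrac{1}{2}a\tau^2
  - \frac{d - \hat{d}}{v_{Fd}}\,\bigl(v_\tau + a\tau - v_{Fl}\bigr) + e \\[4pt]
\sigma_{\text{hit}}^2 &\approx \sigma_x^2 + \tau^2\sigma_v^2 + v_\tau^2\sigma_\tau^2
  + \left(\frac{v_\tau + a\tau - v_{Fl}}{v_{Fd}}\right)^{2}\sigma_d^2 + \sigma_e^2 \\
 &= \sigma_s^2 + v_\tau^2\tau^2\bigl(f_v^2 + f_\tau^2\bigr)
  + \left(\frac{v_\tau + a\tau - v_{Fl}}{v_{Fd}}\right)^{2}\sigma_d^2 \\[4pt]
\sigma_{\text{timing}} &= \frac{\sqrt{\sigma_s^2 + v_\tau^2\tau^2\bigl(f_v^2 + f_\tau^2\bigr)
  + \left(\dfrac{v_\tau + a\tau - v_{Fl}}{v_{Fd}}\right)^{2}\sigma_d^2}}{v_\tau + a\tau}
\end{align}
```

In this reading the term $\tfrac{1}{2}a\tau^2$ is a constant offset for a given acceleration, so it contributes to the systematic error but not to the variance, and the last line plays the role of the expression that the text refers to as Equation 4.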

When fitting this equation to the data, we used average values for any velocity in Equation 4 that was not held constant within the condition in question. For the value of σs, which is the spatial error along the target's path, we used the individual subjects' standard deviations in the error orthogonal to the target's path for the condition in question (about 3 mm; Figure 4), assuming that the spatial errors are isotropic. We think this approximation is justified, because in our tapping task the finger approaches the screen almost orthogonally. We based the values of τ on the results of the sessions with jumping targets (Figure 6). For the conditions in which subjects could hit the target wherever they liked, we set τ to 116 ms (the value obtained both from our interpretation of Figure 3B and from Figure 6). For the conditions with an explicit interception region, we set τ to 169 ms (the value obtained from Figure 6).

Since we cannot separate the contributions of the two Weber fractions ($f_v$ and $f_\tau$), we are left with two parameters to fit: the combined Weber fraction and the uncertainty in the judgment of distance. We determined the values of these two parameters by minimizing the sum of the squared differences between the average measured standard deviations of the timing errors (the heights of the bars in Figure 2) and the values given by Equation 4 for each condition of each of the six initial sessions. The fit values were 9% for the combined Weber fraction and 2 mm for the uncertainty in the judgment of distance. With these values, Equation 4 reproduces the mean measured standard deviations quite well (Figure 7).
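The fitting procedure can be sketched as a two-parameter least-squares search. The code below is illustrative only: the functional form of `sigma_timing_ms` is an assumed reading of the model described in the text (spatial, velocity-and-time, and depth terms added in quadrature and divided by the target's velocity at the hit), the condition values are hypothetical, and the "measured" values are synthetic, generated from the same function with the reported fit values so that the search has a known answer. A coarse grid search stands in for whatever minimization routine was actually used.

```python
import math


def sigma_timing_ms(v, tau, sigma_s, v_fd, v_fl, weber, sigma_d, a=0.0):
    """Predicted timing SD (ms) under the assumed model form.

    Units: mm, s, mm/s, mm/s^2. weber is the combined Weber fraction;
    sigma_d is the SD of the judged distance to the screen (mm).
    """
    v_hit = v + a * tau                                 # target velocity at the hit
    variance = (sigma_s ** 2                            # spatial term
                + (v * tau * weber) ** 2                # velocity and time judgments
                + ((v_hit - v_fl) / v_fd * sigma_d) ** 2)  # misjudged distance
    return 1000.0 * math.sqrt(variance) / v_hit


# Hypothetical conditions: (v, tau, sigma_s, v_fd, v_fl)
CONDITIONS = [
    (100.0, 0.116, 3.0, 800.0, 50.0),
    (250.0, 0.116, 3.0, 800.0, 100.0),
    (400.0, 0.116, 3.5, 900.0, 100.0),
    (700.0, 0.116, 4.0, 1000.0, 150.0),
    (300.0, 0.169, 3.0, 800.0, 20.0),
    (400.0, 0.169, 3.0, 600.0, 20.0),
]

# Synthetic "measurements" generated with the reported fit values (9%, 2 mm)
synthetic_ms = [sigma_timing_ms(*c, 0.09, 2.0) for c in CONDITIONS]


def sum_sq(params, conditions, measured):
    """Sum of squared differences between predicted and measured timing SDs."""
    weber, sigma_d = params
    return sum((sigma_timing_ms(*c, weber, sigma_d) - m) ** 2
               for c, m in zip(conditions, measured))


def fit(conditions, measured):
    """Grid search over weber (0 to 0.2) and sigma_d (0 to 5 mm)."""
    best = None
    for i in range(41):
        for j in range(51):
            params = (i / 200.0, j / 10.0)
            err = sum_sq(params, conditions, measured)
            if best is None or err < best[0]:
                best = (err, params)
    return best[1]


print(fit(CONDITIONS, synthetic_ms))
```

Because the synthetic data are noise-free, the search recovers the generating values exactly; with real data the depth parameter would be only weakly constrained, since its term is small compared with the other two.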

Each predicted standard deviation in Figure 7 is based on several values that were determined separately for each subject and condition and then averaged across subjects. These values are the vertical spatial variability, the finger's mean velocity in depth just before hitting the screen, the finger's mean lateral velocity just before hitting the screen, and the target's mean velocity one sensorimotor delay before the tap. The predictions also make use of the two values for the sensorimotor delay that we determined from the average data in Sessions 7 and 8 (Figure 6). We varied only two parameters to fit the model predictions to the mean human performance in all the conditions of the first six sessions. We consider the model to fit the data quite well. One way to further evaluate the model's credibility is by considering whether the fit values are realistic.

The fit value for the combined Weber fraction for judging velocity and time is 9%. Standard deviations of judgments of retinal velocity are between 5% and 7% of the velocity for the range of velocities used in our study (de Bruyn & Orban, 1988; McKee & Welch, 1989). Standard deviations for detecting changes in the velocity of pursued targets are about 8% of the target's velocity (Haarmeier & Thier, 2006). Standard deviations of judgments of duration may also be about 8% of the time in question (Westheimer, 1999), although more precise timing has also been reported (e.g., Doumas & Wing, 2007). Thus, the fit value for the combined Weber fraction is plausible.

The fit value for the precision of the estimate of distance of only 2 mm is more difficult to evaluate. We found a similar value for tapping on a surface in a previous study (Brenner et al., 2012). It may seem strange that the fit precision in depth would be better than the measured vertical (and presumably lateral) precision (2.5–4.5 mm; Figure 4), because visual judgments in depth are normally less precise (when expressed in millimeters; Brenner & Smeets, 2000), but note that the fit 2-mm precision in depth represents variability in visual judgments alone, whereas the measured vertical precision includes errors of motor origin.

How important is misjudging depth in the present experiment? Considering that the finger is moving towards the screen at about 80 cm/s just before the tap, the standard deviation of 2 mm corresponds to a standard deviation of 2.5 ms. According to Equation 4, the influence on performance is smaller, because the finger is moving along with the target (on average at about 10 cm/s). The contribution of misjudging the distance is therefore almost negligible. Consequently, the fit is not very reliable, so although this value is credible, it is not a strong test of the model.
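The arithmetic in this paragraph can be made explicit. The first function below reproduces the conversion of a 2-mm depth error into a timing error for a finger approaching the screen at 80 cm/s. The second applies the reduction due to the finger moving along with the target, assuming that the depth term scales with (target velocity − lateral finger velocity) / target velocity for a target at constant velocity; that scaling, the 40 cm/s target speed, and the function names are assumptions for illustration.

```python
def depth_timing_sd_ms(sigma_d_mm: float, finger_speed_in_depth_cm_s: float) -> float:
    """Timing SD (ms) from misjudging the distance to the screen."""
    return sigma_d_mm / (finger_speed_in_depth_cm_s * 10.0) * 1000.0


def moving_along_factor(target_speed_cm_s: float, finger_lateral_speed_cm_s: float) -> float:
    """Fraction of the depth-related timing error that remains when the finger
    moves along with the target (assumed scaling, constant target velocity)."""
    return (target_speed_cm_s - finger_lateral_speed_cm_s) / target_speed_cm_s


raw = depth_timing_sd_ms(2.0, 80.0)               # 2 mm at 80 cm/s
effective = raw * moving_along_factor(40.0, 10.0)  # assumed 40 cm/s target
print(raw, effective)
```

Since independent error sources add in quadrature, a contribution of roughly 2 ms changes a total timing standard deviation of the order of 10 ms only marginally, which is why the fit constrains this parameter weakly.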

The reasonably good fit for the data of Session 2 (blue symbols in Figure 7), in which target velocity was varied, supports our assumption that the spatial errors contributing to σs can be considered to be approximately isotropic. The main reason for the predicted standard deviation being different for different target velocities is that the spatial error does not scale with target velocity, but is divided by target velocity when being converted to a temporal error (see Equation 4). If spatial errors had systematically been considerably larger in the overall direction of the finger's motion than in the orthogonal direction, we would have systematically been overestimating the magnitudes of the spatial errors, and therefore also the range of predicted standard deviations. Figure 7 shows that this is not the case.

Testing the model

Equation 4 fits the average data of the first six sessions quite well, with reasonable values for the fit parameters, but the range of predicted standard deviations is rather limited (except for the large predicted standard deviation for the slowest target of Session 2; rightmost blue dot in Figure 7). Moreover, we varied two parameters to fit the data. We therefore examined whether Equation 4 would also fit a new set of data without changing the two fit parameters (Session 9). We selected conditions that were expected to lead to large differences in performance and had three subjects perform each condition many times. We varied the target velocity, whether or not the position at which the target was to be hit was fixed, and the direction in which we expected the finger to be moving as it hit the screen (by varying the target's starting point). We predicted each subject's precision in each condition on the basis of the target's velocity, the sensorimotor delays that we determined from the results of Sessions 7 and 8, the values of 0.09 and 2 mm for the combined Weber fraction and the uncertainty in the judgment of distance, and the individual subject's values in that condition for the vertical spatial variability ($\sigma_s$), for the finger's velocity towards the screen just before hitting the screen ($v_{Fd}$), and for the finger's lateral velocity just before hitting the screen ($v_{Fl}$). We used the individual subjects' values for the latter measures because there is reason to believe that different subjects may approach the target differently (Cesqui, d'Avella, Portone, & Lacquaniti, 2012).

Session 9: Presence of interception region

The measured standard deviations follow the predictions reasonably well (Figure 8A), considering that there was no new fitting involved in determining these predictions. The idea of having the targets start quite far to the left or right was that this would influence the finger's lateral velocity just before hitting the screen, because if the overall movement of the finger was to the left, the finger would be less likely to be moving rightward as it approached the target than if the overall movement was to the right. For slow target velocities, subjects did indeed tap the screen almost 20 cm further to the right when the target appeared 20 cm further to the right, but for fast target velocities, the difference was only about 5 cm (Figure 8B). Consequently, the finger's lateral velocity as it approached the screen hardly depended on the initial target position for the faster targets (Figure 8C). The finger's lateral velocity did increase with target velocity, even when the position at which it had to hit the target was fixed (gray symbols).

Figure 8. Results of Session 9. (A) Three individual subjects' timing precision as a function of predictions for their performance. Values for subjects S1, S2, and S9 are presented in blue, green, and red, respectively. The predictions are based on Equation 4 with the same values of the fit parameters and of the sensorimotor delays as in Figure 7, and the subject's individual values for the measures for which we used average values across subjects for Figure 7. (B) Lateral position of the tap relative to the screen center (for each condition; averaged across subjects, with standard errors). Values for the task configuration with a fixed interception region, 10 cm to the right of the screen center, are shown in gray. (C) Final lateral velocity of the finger in the same conditions.

We used the same two values of τ and the same values of the two fit parameters for all three subjects. Despite our not considering that the sensorimotor delays, the precision in judging position and velocity, and the precision in judging depth may all differ between subjects, many of the differences between the subjects' performance are in accordance with the predictions, as are many of the differences between the conditions (Pearson's r = 0.87 across subjects and conditions). Thus altogether, these results support the assumptions that underlie Equation 4.

Discussion

This study allows us to draw a number of conclusions, both directly from individual results and indirectly by modeling the entire set of experiments. The foremost conclusion is that we have identified a possible mechanism by which people can achieve their amazing temporal precision in interception.

Virtual targets

An advantage of virtual targets is that we can easily manipulate their motion in ways that we would not be able to do with real objects. However, virtual targets may also have disadvantages. Temporal precision in hitting virtual targets was better than 10 ms for several of our conditions, but it was never as good as in some of the earlier studies with real balls (Brenner et al., 2012; McLeod et al., 1985). This might be due to the more detailed feedback and to the higher velocities and acceleration of the real balls (as will be explained later). It is not due to the intermittent presentation of virtual targets, because image rate makes very little difference (Figure 2.1). Image rate might make so little difference because people pursue the target with their eyes (Brenner & Smeets, 2007, 2009, 2011), so that the target's velocity is mainly judged from eye-movement signals (it is known that pursuit only deteriorates at lower image rates than those that we used; Fetter & Buettner, 1990; Morgan & Turnbull, 1978).

Tapping the screen gives haptic feedback about the time of the tap but not about the target's position relative to the finger at the time of the tap. Subjects could simultaneously see their finger and the target, but it is unlikely that they could judge their timing with respect to the target well enough from that alone (de la Malla, López-Moliner, & Brenner, 2012), so we provided additional information by adjusting the target's motion after the tap to their performance. Such adjustments were based on the online estimates of the time and place of the tap, which were slightly less precise than the estimates used in the analysis. Errors in such feedback might have occasionally led to incorrect adjustments to the taps and thereby to larger variability. With real targets there is never incorrect feedback.

Another possible reason for temporal precision being better when hitting real falling balls than when tapping on our virtual targets is that temporal precision is higher for faster targets (Figure 2.2). According to our model, this is because the effect of misjudging positions becomes negligible in relation to that of misjudging time or velocity if the target moves very fast (see Equation 4). However, even if we ignore all spatial variability (setting both σs and the spatial fit parameter to 0) and consider that the velocity of a ball that is falling under gravitational acceleration is higher at the moment of the hit (vτ + aτ) than when the position was estimated (vτ), the expected temporal precision for a ball moving at 8.7 m/s when it is hit (as in Brenner et al., 2012) is 9 ms. Even if we also assume that we overestimated the sensorimotor delay by relying on the average lateral velocity of the finger rather than its average lateral acceleration (Oostwoud Wijdenes et al., 2014), and that the true delay is 100 ms, the combined Weber fraction would still have to be about 7% (rather than 9%) to achieve a temporal precision of 6 ms. Perhaps people judge the velocity of a real approaching ball more precisely than they do that of virtual targets moving laterally across a screen, possibly due to the additional information from optical expansion and binocular cues (Rushton & Wann, 1999; but see Brenner et al., 2014). They might also judge when a bat that they are swinging with both arms will reach a ball better than they judge when their index finger will tap a screen.
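The arithmetic behind these two estimates can be checked with a short sketch. It assumes, as one reading of the paragraph above rather than a restatement of Equation 4, that with all spatial variability set to zero the timing error reduces to the combined Weber fraction times the displacement predicted over the delay (vτ · τ), divided by the ball's velocity at the moment of the hit (vτ + aτ). The function name, the variable names, and the value of 9.81 m/s² for gravitational acceleration are choices made for this illustration only.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value for this sketch)


def timing_sd_ms(v_hit, delay_s, weber_fraction, accel=G):
    """Approximate timing standard deviation (ms), ignoring spatial variability.

    Assumption of this sketch: the position error equals weber_fraction times
    the displacement predicted from the velocity seen one delay before the
    hit, and that error is converted to time using the velocity at the hit.
    """
    v_seen = v_hit - accel * delay_s              # velocity when last estimated (m/s)
    position_sd = weber_fraction * v_seen * delay_s  # predicted-displacement error (m)
    return 1000.0 * position_sd / v_hit           # convert seconds to milliseconds


# Ball moving at 8.7 m/s when hit, 116-ms delay, 9% combined Weber fraction
print(round(timing_sd_ms(8.7, 0.116, 0.09), 1))
# Same ball, assumed 100-ms delay, 7% combined Weber fraction
print(round(timing_sd_ms(8.7, 0.100, 0.07), 1))
```

Under these assumptions the two calls give values close to the 9 ms and 6 ms mentioned in the text, which suggests that this simplified form captures the reasoning, although the full model contains additional terms.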

Ignoring acceleration

In Session 4, we found systematic errors that correspond with ignoring the acceleration during the last 116 ms before the tap (Figure 3B), which is exactly what one would expect if the change in velocity during the last part of the movement were not considered due to a 116-ms sensorimotor delay (value based on the analysis of Session 7; Figure 6). People probably disregard acceleration when predicting an object's upcoming displacement (see also Lee, Port, & Georgopoulos, 1997; Soechting & Flanders, 2008) because they cannot judge it reliably enough (Brouwer et al., 2002; Gottsdanker et al., 1961; Werkhoven et al., 1992). It is possible that using a poor judgment of acceleration would have been better than using none at all in our Session 4, but the acceleration of objects such as falling balls can probably usually be better anticipated on the basis of experience than judged from visual information (Zago et al., 2004; Zago, McIntyre, Senot, & Lacquaniti, 2009). Moreover, as long as the acceleration does not change all the time, disregarding it can be compensated for by adjustments based on feedback (de la Malla et al., 2012; Gray, 2009). The tendency to aim ahead of targets moving at a constant velocity (Figure 3; see also Brenner et al., 2013) might be the result of often having to deal with objects that accelerate due to gravity in daily life. That we found the best temporal precision for accelerating targets, despite the systematic errors, is probably a consequence of the different velocities at the time of the hit (see mean values given at the top of Figure 2.4).

Spatial adjustments are faster than temporal ones

Indicating where the target was to be hit reduced temporal precision substantially. We attribute this to it taking longer to adjust when the finger will hit the screen (as one must if the position is specified) than to adjust where the finger will hit the screen. The difference was not caused by different muscles and joints producing the adjustments: A similar increase or decrease in the finger's vertical velocity occurred sooner after the target jumped vertically (so that the finger had to move further or less far) than after it jumped laterally but with a fixed point of interception (so that the finger's vertical velocity had to be modified to reach the same position sooner or later).

The strategy

We propose that the highest precision is achieved when people judge the position at the time they expect to hit the target, rather than judging when the target will reach a certain position. Most studies on interception have emphasized the latter judgment and evaluated the information available for making such a judgment (e.g., Lee et al., 1983; López-Moliner & Bonnet, 2002; Tresilian, 1999; Rushton & Wann, 1999). Our study indicates that the strategy of picking an interception point, judging when the target will reach that point, and adjusting one's movements to also reach there at that time (Lee, Georgopoulos, Clark, Craig, & Port, 2001) can indeed be used (the conditions in which the interception point is indicated), but that performance is considerably better when one first estimates the time one needs to reach the target's path and then fine-tunes the precise point of interception throughout the movement. We therefore propose that people use the latter strategy to achieve the amazing precision that is reported for several sports situations (Bootsma & van Wieringen, 1990; McLeod & Jenkins, 1991; Regan, 1992). The reason that this strategy leads to better precision is that the delay in using feedback to update the anticipated point of interception is shorter than the delay for updating the time of interception.

Neuronal considerations

It is difficult to imagine how such precise performance can be achieved by processing based on neuronal firing frequencies in which the minimal interspike intervals are of the same order of magnitude as the achieved temporal precision. Of course, temporal precision is not determined by a single action potential or even a single neuron, but emerges from the activity of many neurons which together activate many muscles. The high temporal resolution of the emerging behavior presumably results from mechanical averaging of the forces on the arm that are exerted by many muscles that are each driven by many neurons.

Besides having to time the response precisely, a person must also avoid systematic errors. One might therefore expect timing to be limited by the temporal response characteristics of the photoreceptors (Kietzman & Sutton, 1968), which make the response latencies in various brain areas depend on luminance, color, and contrast. We can readily observe the consequences of such sensitivity in perceptual tasks (e.g., Thompson, 1982). As with other systematic errors, such as the errors that arise from ignoring acceleration, these kinds of influences could be compensated for by responding to feedback. However, the finding that the temporal precision does not depend on the image rate (Session 1; Figure 2.1) suggests that the temporal response characteristics of the photoreceptors might simply not be critical, because the eyes are pursuing the target (as they have been shown to do in previous studies; Brenner & Smeets, 2009, 2011), so that judgments about the target's position and speed rely heavily on (continuous) oculomotor signals rather than on (intermittent) retinal signals. The fact that even temporal precision in the presence of an indicated interception region is poorer when the eyes are not pursuing the moving target (Brenner & Smeets, 2011) supports the notion that the precision of interception is close to the limits of the neural machinery.

Acknowledgments

This study was partly funded by the Netherlands Organization for Scientific Research (NWO) grant 464-13-169.

This article is primarily about precision, so we only report systematic errors when evaluating the use of target speed and acceleration. We also ignore actual success rates and the variability in where on the screen our subjects tried to hit the target in the first six sessions. Some additional information on these measures is presented here. We report overall means (with standard deviations across subjects) when the differences between conditions were not consistent across subjects (i.e., when they were not statistically significant in repeated-measures analyses of variance). When there were significant differences, we present separate means and standard deviations for the different conditions.

In Session 1, subjects tended to hit slightly further to the right of the target center the higher the image rate (0.6 ± 1.2 mm, 0.7 ± 1.3 mm, 1.3 ± 0.7 mm, 2.3 ± 1.1 mm, and 2.7 ± 0.6 mm for image rates of 24, 30, 40, 60, and 120 Hz), F(4, 20) = 8.36, p = 0.0004. We have no explanation for this. In Session 2, subjects hit 2 ± 1 mm to the right of the target center (Figure 3A; no significant effect of target speed). In Session 3, errors depended systematically on the starting position, F(9, 63) = 3.21, p = 0.003. Pairwise comparisons using t tests (with Bonferroni correction) only revealed significant differences between starting at the lower left (3 ± 2 mm) and upper right (1 ± 2 mm) and between starting at the upper left (4 ± 2 mm) and upper central (1 ± 2 mm) positions. In Session 4, subjects hit further to the right of the target center the more the target decelerated (Figure 3B), F(4, 28) = 19.1, p < 0.0001. In Session 5 there was a significant effect of viewing time, F(2, 14) = 8.97, p = 0.003, and a significant interaction between target speed and viewing time, F(2, 14) = 12.0, p = 0.0009. Subjects hit 0.5 ± 2.0 mm to the left of the target center if they had to hit the target about 600 ms after it appeared and 1.5 ± 3.0 mm and 3.3 ± 2.7 mm to the right of the center if they had to hit it after about 900 and 1200 ms. They might have underestimated the differences between the three viewing times, because the biases were larger for the faster targets (−1.0, 2.9, and 3.7 mm for targets moving at 40 cm/s; −0.1, 0.2, and 2.9 mm for targets moving at 30 cm/s). In Session 6, subjects hit 5 ± 3 mm to the right of the target center.

Tapped positions on the screen

In Session 1, subjects hit 8.7 ± 1.0 cm to the right of the screen center. The variability across individual subjects' trials was larger than the variability across subjects' mean values, and was higher for the highest (2.2 cm) and lowest (2.5 cm) image rates than for the other image rates (all 2.1 cm), F(4, 20) = 2.90, p = 0.05. In Session 2, subjects hit further to the right the faster the targets were moving. They hit 6.8 ± 1.2 cm, 7.5 ± 1.9 cm, 8.1 ± 2.2 cm, 8.4 ± 2.7 cm, 9.4 ± 2.9 cm, and 9.7 ± 3.1 cm to the right of the screen center for targets moving at 10, 25, 40, 55, 70, and 85 cm/s, respectively, F(5, 35) = 6.07, p = 0.0004. For all velocities, the target was 5 cm to the right of the screen center 625 ms after it appeared, so this difference just means that subjects took longer than 625 ms to tap the screen. Individual subjects' standard deviations across trials also tended to increase with velocity. The standard deviations were 1.2, 1.6, 2.0, 2.2, 1.9, and 2.2 cm for targets moving at 10, 25, 40, 55, 70, and 85 cm/s, F(5, 35) = 3.24, p = 0.02.
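The step from tapped positions to tap times in Session 2 can be made explicit with a small sketch. It assumes that each target moved rightward at a constant velocity and that the mean tapped position coincides with the target's position at the moment of the tap (ignoring the small systematic errors reported above). The function and variable names are illustrative, and because the calculation uses means across subjects, the resulting times are only rough indications.

```python
# Mean tapped position (cm to the right of the screen center) for each target
# velocity (cm/s), copied from the Session 2 values reported in the text.
mean_tap_position_cm = {10: 6.8, 25: 7.5, 40: 8.1, 55: 8.4, 70: 9.4, 85: 9.7}

REFERENCE_POSITION_CM = 5.0  # every target was here 625 ms after it appeared
REFERENCE_TIME_MS = 625.0


def implied_tap_time_ms(velocity_cm_s, tap_position_cm):
    """Time after target onset at which the target reached the tapped position."""
    extra_distance_cm = tap_position_cm - REFERENCE_POSITION_CM
    return REFERENCE_TIME_MS + 1000.0 * extra_distance_cm / velocity_cm_s


for velocity, position in mean_tap_position_cm.items():
    print(velocity, "cm/s:", round(implied_tap_time_ms(velocity, position)), "ms")
```

All six implied times exceed 625 ms, and they are longer for the slower targets, which is consistent with the interpretation given above.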

In Session 3, not at all surprisingly, the point at which targets were intercepted depended on the finger's starting point, F(9, 63) = 20.2, p < 0.0001. Subjects hit furthest to the left (1.7 ± 2.2 cm to the right of the screen center) when starting at the top left and furthest to the right (10.2 ± 3.7 cm to the right of the screen center) when starting at the lower right. The standard deviation across trials was about 2.4 cm. In Session 4, performance was quite consistent across subjects and conditions, despite a lower consistency across trials than in previous sessions. Subjects hit 8.1 ± 1.5 cm to the right of the screen center, with a mean individual standard deviation across trials of 4.4 cm. The large variability across trials is undoubtedly the result of combining the data for the two different starting positions.

In Session 5, subjects were no longer free to choose where to hit the target, so the variability in where the target was hit was obviously smaller than in the previous sessions. Nevertheless, subjects hit slightly further to the right the shorter the presentation time, F(2, 14) = 5.73, p = 0.02. The center of the interception region was 5 cm to the right of the screen center. Subjects hit 5.4 ± 0.3 cm to the right of the screen center when the presentation time was 600 ms. They hit 5.1 ± 0.1 cm to the right of the screen center for the other two presentation times. The mean individual standard deviation across trials was also larger for the 600-ms presentation time (0.5 cm) than for the other two presentation times (both 0.4 cm), F(2, 14) = 8.14, p = 0.005. It was also larger for targets moving at 40 cm/s (0.5 cm) than for ones moving at 30 cm/s (0.4 cm), F(1, 7) = 8.26, p = 0.02, and larger when there was 100 ms within which to hit the target (0.5 cm) than when there was only 50 ms within which to do so (0.4 cm), F(1, 7) = 23.6, p = 0.002.

In Session 6, the position at which subjects hit the screen depended on the condition, F(3, 21) = 4.13, p = 0.02. For most conditions it also varied considerably across subjects (Figure A1). Subjects hit close to the center of the interception region (10 cm to the right of the screen center) when such a region was indicated (10.3 ± 0.3 cm; black curves). Most subjects hit further to the right when starting on the right (14.2 ± 9.4 cm; green curves). Most hit further to the left in the baseline condition (5.6 ± 6.2 cm; blue curves). The average position was also further to the left when starting close to the target's path (6.8 ± 5.5 cm; red curves), but this is mainly because two subjects tapped about 2 cm to the left of the screen center in this condition. These subjects quickly moved their finger to the left rather than waiting for the target to arrive below the finger. The other six subjects tapped close to the finger's starting point. The average temporal precision of the two subjects who tapped further to the left was not worse in this condition than in the baseline condition, whereas it was worse for the subjects who waited for the target to reach their finger (so that on average, precision was poorer; Figure 2.6, baseline vs. starting just above the screen). Not only the variability across subjects but also the individual standard deviation across trials was by far the smallest when an interception region was indicated (0.7 cm, as opposed to 1.5 cm when starting just above the screen, 2.3 cm in the baseline condition, and 3.2 cm when starting on the right), F(3, 21) = 12.6, p < 0.0001.

Figure A1. Average paths in Session 6. Each curve shows the average path for one subject in one condition. The different colors indicate the conditions: baseline (blue), starting on the right (green), starting just above the screen (red), and interception region indicated explicitly (black). Each path was divided into 100 segments of equal length, and the positions of the ends of corresponding segments were averaged across trials. The top view is the projection of the finger's path on the screen. The side view is the path as seen from the side. There was considerable variability in where subjects hit the screen except for when the interception region was indicated explicitly.
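The path-averaging procedure described in the caption of Figure A1 could be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes that each trial is a list of sampled (x, y, z) finger positions with at least two samples, and that positions between samples can be linearly interpolated. All function names are hypothetical.

```python
import math


def resample_by_arc_length(path, n_segments=100):
    """Return n_segments + 1 points that split the path into equal-length segments."""
    # Cumulative distance travelled along the sampled path
    cumulative = [0.0]
    for p, q in zip(path, path[1:]):
        cumulative.append(cumulative[-1] + math.dist(p, q))
    total = cumulative[-1]
    if total == 0.0:  # the finger did not move: repeat the single position
        return [tuple(path[0])] * (n_segments + 1)

    points = []
    j = 0  # index of the sample interval that contains the current target distance
    for k in range(n_segments + 1):
        target = total * k / n_segments
        while j < len(cumulative) - 2 and cumulative[j + 1] < target:
            j += 1
        interval = cumulative[j + 1] - cumulative[j]
        fraction = 0.0 if interval == 0.0 else (target - cumulative[j]) / interval
        fraction = min(max(fraction, 0.0), 1.0)
        p, q = path[j], path[j + 1]
        # Linear interpolation between the two neighbouring samples
        points.append(tuple(a + fraction * (b - a) for a, b in zip(p, q)))
    return points


def average_path(trials, n_segments=100):
    """Average the ends of corresponding segments across trials."""
    resampled = [resample_by_arc_length(trial, n_segments) for trial in trials]
    n_trials = len(resampled)
    return [
        tuple(sum(coordinate) / n_trials for coordinate in zip(*corresponding))
        for corresponding in zip(*resampled)
    ]
```

Dividing by path length rather than by time means that trials with different movement times or speed profiles still contribute equally to each part of the average path.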

Figure 2. Timing precision for all conditions of the first six sessions. The panel numbers correspond with the session numbers. The outlined pictograms show the layout on the screen, with the starting point in red, the moving target in black, and sometimes a region within which the target is to be hit in green. Error bars are standard errors across subjects. (1) Influence of image rate. The red lines in the additional pictograms show the intervals between presentations of images of the target (with line lengths indicating the target's contrast) in relation to the mean distribution of timing errors (represented by green normal distributions), approximately to scale. (2) Influence of target velocity. (3) Influence of the location of the starting point. Light bars represent the randomly interleaved positions. Dark bar represents the block of trials with a fixed starting point (at the central location). (4) Influence of target acceleration. (5) Influence of viewing time for two target velocities (30 and 40 cm/s) and interception regions that require a hit within two time intervals (A: 100 ms, B: 50 ms). (6) Influence of various factors that might affect the variability in where the screen is hit (baseline; starting on the right; starting just above the screen; indicated interception point).

Figure 3. Systematic errors: mean lateral positions of the taps with respect to the target center in sessions in which target velocity (A) or acceleration (B) varied. Positive values indicate hitting ahead of (to the right of) the target. Error bars are standard errors across subjects. Thick lines through the filled symbols are based on the average parameters of linear fits to the individual subjects' data.

Figure 4. Speed–accuracy trade-off in the first six sessions. Each symbol shows the mean values for one condition, averaged across subjects (with the standard errors across subjects). The values for the condition in which the finger started just above the screen in Session 6 are outside the range of the figure (average speed is 166 cm/s; standard deviation is 3.4 mm). The vertical values show the spatial precision orthogonal to the target's motion. The horizontal values show the average speed of the finger, which was determined by dividing the distance between the starting point and the position that was tapped by the movement time. In general, variability increases with increasing movement speed.

Figure 5. How subjects adjusted their movements when they were free to adjust the position of the tap (Session 7) and when the position at which they were to try to tap the target was fixed (Session 8). All values are deviations from the condition with no jump. The colors indicate the direction of the jump. Since the target was moving at 50 cm/s to the right, a 1-cm rightward jump could be compensated for by a 20-ms reduction in movement time, and so on. The error bars are 95% confidence intervals (across subjects' mean values).

Figure 6. Difference between lateral or vertical finger velocities for target jumps in opposite directions. Each curve is the average of eight subjects' values. Blue: difference between the lateral velocities after the target jumped to the right and to the left (Session 7; latency: 116 ms). Green: difference between the vertical velocities after the target jumped up and down (Session 7; latency: 109 ms). Red: difference between the vertical velocities after the target jumped to the right and to the left (Session 8; latency: 169 ms). Shaded areas around the curves indicate the mean plus or minus one standard error across subjects. Black lines are used to determine the latencies (see Materials and methods).
