Surfaces that have different disparities in a static stereogram appear to move relative to one another when the observer moves relative to the stereogram. How does a stimulus with no moving parts give rise to apparent motion? At one level of explanation, a "motion from structure" (MFS) inference occurs because, in a real scene, the absence of relative motion (e.g., dynamic occlusion) in the proximal stimulus requires that the surfaces move relative to one another. What mechanism(s) are responsible for this inference? MFS looks smooth and is visible for minute head movements, suggesting that it may be supported by a dedicated mechanism that combines 2D image motion (including zero-velocity motion) with represented depth structure to estimate 3D object motion per se. Extra-retinal signals might also play a role. We conducted experiments in which observers translated their heads (45 cm side-to-side, 0.5 Hz oscillation) and adjusted the speed (gain) of a position-yoked figure that had crossed disparity relative to a stationary background. Stimuli were dense random-dot stereograms (RDS) projected onto a screen at 200 cm (60 Hz per eye, field-sequential). The figure was a 46-cm-wide square presented at eye height; observers stood. For both of two observers, and across four disparities (8, 16, 24, and 32 arcmin, corresponding to 14, 26, 37, and 46 cm in front of the screen, respectively), motion gain settings (on-screen motion / head motion) were consistently close to 50% of the prediction from the geometry specified by binocular disparity. However, apparent depths averaged 83% of the depth specified by disparity, so gain settings were also less than predicted from apparent depth. Consistent with this, real stationary objects positioned in front of the screen appeared to move against the head. Additional experiments presented stimuli against a blank background, or moving relative to a stationary head. No single model fitted all the data, but several lawful principles emerged.
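The geometric prediction above can be sketched numerically. A minimal sketch, using standard stereo-geometry formulas and an assumed interocular distance of 6.2 cm (not stated in the abstract), converts each crossed disparity into a depth in front of the screen and then into the on-screen motion gain needed to render a world-stationary point at that depth:

```python
import math

def depth_from_disparity(disparity_arcmin, screen_dist_cm=200.0, ipd_cm=6.2):
    """Depth (cm) in front of the screen for a given crossed disparity.

    Small-angle stereo geometry: a crossed disparity delta (radians) relative
    to the screen plane at distance D corresponds to a point d cm nearer, with
    d = D^2 * delta / (ipd + D * delta). The 6.2 cm IPD is an assumption.
    """
    delta = math.radians(disparity_arcmin / 60.0)
    D = screen_dist_cm
    return D**2 * delta / (ipd_cm + D * delta)

def geometric_gain(depth_cm, screen_dist_cm=200.0):
    """On-screen motion per unit lateral head motion that keeps a point at
    depth d in front of the screen world-stationary: gain = d / (D - d),
    directed opposite to the head."""
    return depth_cm / (screen_dist_cm - depth_cm)

for arcmin in (8, 16, 24, 32):
    d = depth_from_disparity(arcmin)
    print(f"{arcmin:2d} arcmin -> depth {d:4.1f} cm, geometric gain {geometric_gain(d):.3f}")
```

With these assumptions the computed depths land within about a quarter centimeter of the reported 14, 26, 37, and 46 cm, and the "50% of the geometric prediction" result would correspond to observers setting roughly half of each computed gain.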