Visual figure-ground segregation can rely on differences in luminance, texture, colour, motion, and depth, and also on asynchronous changes in luminance and motion direction. However, since the displays used to demonstrate segregation by asynchrony were suspected to contain global motion and contrast artefacts, we created a new stimulus that continuously shows a 20 by 20 array of randomly oriented dipoles ("colons") flipping (i.e. rotating instantaneously by 90 deg) at a defined frequency. All dipoles flip synchronously within each area, figure and ground, but with a defined delay between the two areas. Human subjects had to localize the targets in a 4-alternative forced-choice task.

Given a high luminance contrast between dipoles and monitor surround, subjects could detect the target up to a threshold frequency of 23 Hz and down to a threshold delay of 14 ms. Replacing the luminance-defined dots by contrast-defined Mexican Hats led to comparable results. Although both tasks are contrast sensitive, the figure can still be detected when the dots are isoluminant, whereas dichoptic presentation does not allow segregation at all.

Visual evoked potential (VEP) and functional magnetic resonance imaging (fMRI) studies showed VEP components and fMRI locations previously identified with form-from-motion stimuli.

In the last study we re-examined form from asynchronous motion reversals using defined frequencies and delays, and demonstrated that segregation can rely on short intervals (15 ms) of opposing motion directions between figure and ground and on longer intervals (40 ms) of differing contrast cues (i.e. motion energy cues), but not on asynchronous motion reversals per se.

In conclusion, we confirm that segregation can rely solely on local changes occurring temporally asynchronously in figure and ground. We propose a segregation mechanism consisting of a set of local (monocular) motion detectors at the front end and a second stage that reads and globally compares their output with a high temporal resolution.
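
The timing logic of the dipole stimulus can be illustrated with a minimal sketch. It is not the authors' stimulus code. It assumes that one 90 deg flip occurs per period of the flip frequency, that the figure's flips lag the ground's by the delay, and that the figure is a square patch of dipoles. The function names, the figure shape and the example parameters are hypothetical. Orientations are taken modulo 180 deg because a dipole of two identical dots looks the same after a half turn.

```python
import math
import random

GRID = 20  # 20 x 20 dipoles, as stated in the abstract


def make_base_orientations(seed=0):
    """Random starting orientation in degrees, in [0, 180), for every dipole."""
    rng = random.Random(seed)
    return [[rng.random() * 180.0 for _ in range(GRID)] for _ in range(GRID)]


def figure_mask(row0, col0, size):
    """Hypothetical square figure region (the abstract does not give its shape)."""
    return {(r, c)
            for r in range(row0, row0 + size)
            for c in range(col0, col0 + size)}


def flips_completed(t_ms, freq_hz, delay_ms=0.0):
    """Number of flips finished by time t_ms.

    Assumption: flips occur at delay_ms + k * period for k = 1, 2, ...
    """
    period_ms = 1000.0 / freq_hz
    return max(0, math.floor((t_ms - delay_ms) / period_ms))


def orientations_at(t_ms, base, figure, freq_hz, delay_ms):
    """Orientation of every dipole at time t_ms.

    Ground dipoles flip on schedule; figure dipoles flip delay_ms later.
    """
    ground_flips = flips_completed(t_ms, freq_hz)
    figure_flips = flips_completed(t_ms, freq_hz, delay_ms)
    frame = []
    for r in range(GRID):
        row = []
        for c in range(GRID):
            n = figure_flips if (r, c) in figure else ground_flips
            # Each flip rotates the dipole by 90 deg; wrap to [0, 180).
            row.append((base[r][c] + 90.0 * n) % 180.0)
        frame.append(row)
    return frame


if __name__ == "__main__":
    base = make_base_orientations(seed=1)
    fig = figure_mask(row0=2, col0=2, size=6)
    # Sample shortly after a ground flip, before the delayed figure flip.
    frame = orientations_at(105.0, base, fig, freq_hz=10.0, delay_ms=14.0)
    print(frame[0][0] - base[0][0], frame[3][3] - base[3][3])
```

In this sketch, figure and ground differ only during the delay window after each ground flip. Outside that window both regions have completed the same number of flips, so no static cue distinguishes them.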