We typically think of vision as the recovery of increasingly rich information about individual objects, but there are also massive amounts of information about relations between objects in space and time. Recent studies of visual statistical learning (VSL) have suggested that this information is implicitly and automatically extracted by the visual system. Here we explore this possibility by evaluating the degree to which VSL of temporal regularities (Fiser & Aslin, 2002) is influenced by attention. Observers viewed a 6-min sequence of geometric shapes, appearing one at a time in the same location every 400 ms. Half of the shapes were red and half were green, with a separate pool of shapes for each color. The sequence of shapes was constructed by randomly intermixing a stream of red shapes with a stream of green shapes. Unbeknownst to observers, each color stream was constructed from sub-sequences (or ‘triplets’) of three shapes that always appeared in succession; these triplets constituted the temporal statistical regularities to be learned. Attention was manipulated by having observers detect shape repetitions in one of the two colors. In a surprise forced-choice familiarity test, triplets from both color streams (now in black) were pitted against foil triplets composed of shapes from the same color. If VSL is preattentive, then observers should be able to pick out the real triplets from both streams equally well. Surprisingly, however, they learned only the temporal regularities in the attended color stream. Further experiments that improved learning of the attended stream failed to elicit commensurate improvements for the unattended stream. We conclude that while VSL is certainly implicit (because it occurred during a secondary task), it is not a completely data-driven process, since it appears to be gated by selective attention. The mechanics of VSL may thus be automatic, with top-down selective attention dictating the populations of stimuli over which VSL operates.
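The stimulus-construction procedure described above can be sketched as follows. This is a minimal illustration, not the authors' actual stimulus code: shape identities are arbitrary labels, the 50/50 interleaving probability is an assumption, and real designs typically add constraints (e.g., avoiding immediate triplet repeats) that are omitted here for brevity.

```python
import random

def make_stream(shapes, n_triplet_reps):
    """Build one color stream: partition the shape pool into triplets,
    then present the triplets repeatedly in random order. Each triplet's
    three shapes always appear in immediate succession."""
    shapes = list(shapes)
    random.shuffle(shapes)
    triplets = [shapes[i:i + 3] for i in range(0, len(shapes), 3)]
    stream = []
    for _ in range(n_triplet_reps):
        for triplet in random.sample(triplets, len(triplets)):
            stream.extend(triplet)  # triplet shown intact, in order
    return stream, triplets

def interleave(red_stream, green_stream):
    """Randomly intermix the two color streams into one presentation
    sequence, preserving the within-stream order of each color."""
    merged, r, g = [], 0, 0
    while r < len(red_stream) or g < len(green_stream):
        take_red = g >= len(green_stream) or (
            r < len(red_stream) and random.random() < 0.5)
        if take_red:
            merged.append(('red', red_stream[r])); r += 1
        else:
            merged.append(('green', green_stream[g])); g += 1
    return merged
```

With, say, six red shapes and six green shapes (two triplets per color), `interleave(*streams)` yields a single sequence in which the triplet structure of each color is intact but hidden by the random intermixing — the regularity a learner must extract.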