Abstract: Computational models of bottom-up attention can perform significantly above chance at predicting eye positions of observers passively viewing static or dynamic images. Nevertheless, much of eye movement behavior (50 percent or more) is unexplained by purely bottom-up models, and is typically attributed to top-down, inter-observer, task-dependent, or random effects. Other studies have qualitatively described such high-level effects in naturalistic interactive visual tasks (e.g., while driving, how often do people fixate other cars, or the road, or road signs); yet the underlying neurocomputational mechanisms remain unknown. Here, we introduce a simple computational model of task-related eye position influences in interactive tasks with dynamic stimuli. This model extracts from each frame a low-dimensional feature signature (``gist''), compares that with a database of eye position training frames, and produces an eye position prediction map. Finally, we combine the task-related and bottom-up maps, and compare the combined maps with observers' actual eye positions across 216,000 frames from 24 five-minute videogame-playing sessions. For analysis, each map was rescaled to have zero mean and unit standard deviation; the average predicted value at human eye position locations was 0.61 +/- 0.1 in the purely bottom-up maps, and 2.42 +/- 0.07 in the combined maps (a random model gives an average value of 0). Thus, this straightforward model of task-dependent effects offers some of the strongest purely computational general-purpose eye movement predictions to date, going significantly beyond what is explained by purely bottom-up effects; yet it relies only on simple visual features, without requiring any high-level semantic scene description.