Two stimuli were presented at varying stimulus onset asynchronies (SOAs), with each stimulus associated with a distinct task. The first stimulus was a tone at one of either two or three frequencies. In two conditions, the task associated with the tone was either a speeded two-alternative discrimination (2AD) or a speeded three-alternative discrimination (3AD) based on the pitch of the tone. In a third condition, subjects were told to ignore the tone. The second stimulus was a briefly exposed study matrix of red and black squares followed by a mask. After a fixed delay, the mask was replaced by a test matrix that was either the same as or different from the study matrix. The task associated with the matrices was to indicate, with no speed pressure, whether the study matrix and the test matrix were the same or different. Results from each speeded AD condition showed that subjects' accuracy in the matrix task decreased as the SOA between the tone and the study matrix decreased. This effect was larger for the 3AD tone task than for the 2AD tone task. In addition, within each speeded AD condition, longer RTs in the tone task were associated with lower accuracy in the matrix task. None of these effects was evident when the subjects were told to ignore the tone. These results suggest that encoding visual information can be subject to significant capacity limitations imposed by cross-modal multitasking.