Clicking on the image below will take you to a short QuickTime movie. Make sure you have your sound turned up, because I’ve recorded a few sentences that play along with the movie. Your job is to determine, as quickly as possible, whether each sentence is grammatically correct while you keep your eyes focused on the animated display.

This demonstration replicates part of an experiment conducted by a group of researchers led by Michael P. Kaschak. The researchers showed similar animations to a group of volunteers and asked them to make similar judgments about spoken language. The question: does our reaction time differ when the animation corresponds to the movement described in language?
In the demonstration you just tried, the first two sentences were distractors. In sentence 1, the motion was “towards,” but the animation was moving down. Sentence 2 was ungrammatical. The two sentences we’re interested in are 3 and 4. Sentence 3, “The leaves fell from the tree,” describes a downward motion, just like the motion in the animation. Sentence 4, “The balloon ascended into the clouds,” describes upward motion, opposite the animation.

Kaschak et al. have good reason to suspect that in the case of these last two sentences, the animation may indeed affect how quickly you can process the sentence. We recently reported on data that suggests that impairment of motor control of the hands may also impair our ability to visualize the same motion. Further, memory for visual objects also appears to make use of the visual system. Kaschak’s team points to other research showing that understanding sentences also involves a “sensorimotor simulation” of the action the sentence describes.

But prior to Kaschak’s team’s experiment, no one had measured how quickly people process language when the motion described in a sentence corresponds to the motion in an animated display they are watching. If viewing motion affects language processing, there are two possibilities for how the two activities interact. One is that watching motion matching the motion in a sentence helps viewers process the language faster (i.e., participants will respond faster to “the leaves fell from the tree” when the animation is moving down). The other is that viewing the animation burdens the same region of the brain needed to process the language, so that when the animation is moving down, sentences describing downward motion will be processed more slowly.

Kaschak et al. showed participants four different animations depicting basic motions: moving lines showing upward or downward motion, and a spinning spiral that could appear to move toward or away from the viewer. During each animation, 10 sentences were read: 2 corresponding to the direction of movement in the animation, 2 describing the opposite direction, and 6 distractors. The test sentences were always grammatically correct so that each participant was performing an equivalent task. The distractors included some ungrammatical sentences (to keep the task realistic) and some grammatical sentences describing movement that did not correspond to the animation (like Sentence 1 above).
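The block structure described above (per animation: 2 matching sentences, 2 mismatching, 6 distractors, presented in a shuffled order) can be sketched in a few lines of Python. This is a hypothetical reconstruction for illustration only — the function name, labels, and shuffling scheme are my assumptions, not the actual stimulus code used by Kaschak et al.:

```python
import random

# Directions used in the four animations described in the study.
OPPOSITE = {"up": "down", "down": "up", "toward": "away", "away": "toward"}

def build_trial_block(direction, rng=random):
    """Build one block of 10 sentence trials for a single animation:
    2 matching the animation's direction, 2 in the opposite direction,
    and 6 distractors (ungrammatical or unrelated-motion sentences)."""
    trials = (
        [("match", direction)] * 2 +
        [("mismatch", OPPOSITE[direction])] * 2 +
        [("distractor", None)] * 6
    )
    rng.shuffle(trials)  # randomize presentation order within the block
    return trials

block = build_trial_block("down")
print(block)  # e.g. a shuffled mix of match/mismatch/distractor trials
```

Only the 4 match/mismatch trials per block contribute to the reaction-time comparison; the distractors exist to keep the grammaticality task believable.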

Respondents took an average of 369 milliseconds to respond to sentences that matched the direction of the animation, but only 330 milliseconds to respond to sentences that described movement in the opposite direction. The difference was statistically significant: people take longer to process sentences that match the movement of an animation than they do to process sentences that don’t match it. Kaschak’s team reasons that we must be using the same region of the brain to process the motion itself as we do to process the language describing that motion.

Note that these results apply only to animations showing a very generic sort of motion. There’s little doubt that if we saw an actual leaf falling, or a balloon ascending, we’d be able to process sentences describing that motion very quickly. Yet the simple depiction of “downward motion” does appear to interfere with our ability to process a simple sentence describing a particular sort of downward motion.

We weren’t able to measure how quickly you processed the sentences in our demonstration, but did you notice anything different about trying to assess the “down” sentence compared to the “up” sentence? Let us know in the comments.

Comments

I wonder what would happen with metaphorical sentences, such as, “The stock market is rising/falling.”

I can see two possible interpretations of the experimental results: (1) When we read a sentence about motion, we must mentally model that motion, in order to fully understand the sentence. (2) When we watch an action and hear an action sentence at the same time, we try to harmonize the two inputs. It seems to me that (1) is a more interesting hypothesis, but (2) is sufficient to explain the experimental results. I wonder how an experiment could be designed to separate (1) and (2).

Another factor might be that it’s much easier to discriminate obvious differences than it is to confirm that two things are really functionally identical. It takes me longer to determine that the little red car in the parking lot is not my little red car than it does to reject the yellow car as a candidate for getting me out of this horrible shopping center.

Interesting study, but I wonder if the image itself is getting in the way. The moving lines are slightly hypnotic and not the kind of motion you would naturally see. The study even discussed the idea of “integrability,” which refers to a direct match between the moving object and the sentence (show a leaf falling, not moving bars).

Maybe it’s just me, but watching the animation while processing the sentence didn’t matter much, because I separated the speech from the image. The generic image had nothing to do with the sentences, so I watched the animation but paid no mind to how it might relate to the speech, and just focused on the sentences. Long story short, the image was irrelevant, so I ignored it.

Now had there been an image that contradicted or supported the sentence, I can see how one would take longer to evaluate the sentence.

“People take longer to process sentences that match the movement of an animation than they do to process sentences that don’t match it. Kaschak’s team reasons that we must be using the same region of the brain to process the motion itself as we do to process the language describing that motion.”

It’s a reasonable conclusion, but it raises the question of *which* brain region we’re talking about. Presumably, viewing downward motion recruits V1 (early visual processing), some other early visual areas, MT for sure, and also presumably any other areas involved in the *concept* of downward motion. There’s also pretty good evidence that people covertly (i.e., unconsciously) activate the names of the things that they see, so language areas would be recruited by viewing visual motion.

So which brain regions overlap between language and visual perception of motion? Certainly the ones involved in representing the concept of motion, and probably any involved in the linguistic representation of motion (assuming these are different). Whether the “purely” perceptual areas are involved in both seems to be an open question.

If you were the leaf falling, the view you would have against a static background would be opposite the animation (the background rising). If you were in the balloon ascending, the animation is exactly what you would see as the background appeared to move downward. Is it possible that in processing the sentences we are taking the perspective of the leaf and the balloon instead of a third-party observer, as your experiment assumes?

In a signed language, it is very difficult for a sentence not to match the visual input, so it’s not surprising we use the same region of the brain to process both. Also, studies show that mismatches between speech and its co-occurring gesture are harder to process, exhibiting something like a McGurk effect (say “went up the stairs” while wiggling fingers downward!).

All this makes it puzzling that in this case it “takes longer to process sentences that match the movement”. It’s well established that processing of linguistic visual info is different from that of graphical visual info, notably so in hemispheric neglect, so maybe an explanation lies in that aspect.

There is an excellent group of experiments by Bergen et al. (2007) that tests for delays in object identification according to object location just after three kinds of sentence stimuli: literal, abstract (testing for vertical-axis conceptual metaphor), and linguistic metaphor. I happened to write an entry on this group of studies the same day as the entry here at Cognitive Daily: http://www.poohsthink.com/future-research-on-simulation-metaphor-conceptual-metaphor/

“…must be using the same region of the brain to process the motion itself as we do to process the language describing that motion.”
How is this possible in a signed sentence, when the language describing the motion IS the motion?

I’m confused by their reasoning here. The same region processes both, and in one case but not the other, it gets overloaded or something? This seems conclusively disproven by cases of hemispheric neglect in signed languages, which impair pantomimed gesture without affecting linguistic use of the same space and motion.

It seems more likely that Lx and non-Lx input go to different regions, and it is easier to sort them out when they are dissimilar.

I think Peter Turney raises an interesting point about trying to “harmonize the two inputs,” and bearing this in mind we should also consider the Western style of reading, i.e., from the top downward. This reading style may play a significant role in confusing the subjects. It would be interesting to see results from people who could comprehend the test but have never learned to read; this might tell us whether the delay is created by our attempt to read what is not there. Or would it be possible to check whether the eyes move left to right as they follow the animation?