Here's what it sounded like to me: the scanner picks up some sort of information from the brain, which is then fed to a computer. The computer is then given a multiple choice of video clips to match the information to. It picks one and superimposes that clip onto the information somehow to create the image we were shown. Not as interesting as it seems at first.

Why are the images all fuzzy? :s

Why can't they just pull up the actual image that they're guessing? Why do they have to "recreate" it?

Abstract:
Quantitative modeling of human brain activity can provide crucial insights about cortical representations [1,2] and can form the basis for brain decoding devices [3,4,5]. Recent functional magnetic resonance imaging (fMRI) studies have modeled brain activity elicited by static visual patterns and have reconstructed these patterns from brain activity [6,7,8]. However, blood oxygen level-dependent (BOLD) signals measured via fMRI are very slow [9], so it has been difficult to model brain activity elicited by dynamic stimuli such as natural movies. Here we present a new motion-energy [10,11] encoding model that largely overcomes this limitation. The model describes fast visual information and slow hemodynamics by separate components. We recorded BOLD signals in occipitotemporal visual cortex of human subjects who watched natural movies and fit the model separately to individual voxels. Visualization of the fit models reveals how early visual areas represent the information in movies. To demonstrate the power of our approach, we also constructed a Bayesian decoder [8] by combining estimated encoding models with a sampled natural movie prior. The decoder provides remarkable reconstructions of the viewed movies. These results demonstrate that dynamic brain activity measured under naturalistic conditions can be decoded using current fMRI technology.
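For anyone who wants the "separate components" idea in concrete terms, here is a rough Python sketch, not the paper's actual model. It assumes the fast visual part has already been reduced to a feature time course (the motion-energy filters themselves are not reproduced), stands in a toy gamma-like bump for the slow hemodynamic response, and uses ridge regression as the per-voxel fit. The names `hrf_kernel`, `blur_with_hrf`, `fit_voxels` and the `alpha` parameter are all made up for illustration.

```python
import numpy as np

def hrf_kernel(n=8):
    # Toy hemodynamic response: a gamma-like bump (an assumption, not the paper's).
    t = np.arange(n, dtype=float)
    h = t**2 * np.exp(-t)
    return h / h.sum()

def blur_with_hrf(features):
    # Slow component: smear each feature time course with the hemodynamic kernel.
    # features has shape (n_timepoints, n_features); the output has the same shape.
    h = hrf_kernel()
    return np.apply_along_axis(
        lambda col: np.convolve(col, h)[: len(col)], 0, features
    )

def fit_voxels(features, bold, alpha=1.0):
    # features: (n_timepoints, n_features), bold: (n_timepoints, n_voxels).
    # Ridge regression solved for all voxels at once; each column of the
    # result is an independent fit for one voxel.
    X = blur_with_hrf(features)
    A = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ bold)  # shape (n_features, n_voxels)
```

Each column of the returned weight matrix belongs to one voxel, which is the sense in which the model is "fit separately to individual voxels".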

Not quite. They are actually predicting based only on neural activity. They associate that neural activity (through a Bayesian method) with elementary motions, and then reconstruct the movie the subject is actually watching from those elements of motion.

They do this by fitting the model to each voxel separately, for each individual subject.

The neurons they're recording are encoding motion, not a static image, so rather than having a bank of colors (which comes standard with every computer nowadays), they need a bank of motions (which does not come standard with computers). They gathered that bank of motions from YouTube. These were not the same clips the subjects actually saw; they are just a bank of elementary motions.

Researchers fed the computer 18 million one-second YouTube clips that the participants had never seen. They asked the computer to predict what brain activity each of those clips would evoke.
Then they asked it to reconstruct the movie clips using the best matches it could find between the YouTube scenes and the participants' brain activity.
The reconstructions are blends of the YouTube snippets, which makes them blurry. Some are better than others. If a human appeared in the original clip, a human form generally showed up in the reconstruction.
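
To make the matching-and-blending step above concrete, here is a minimal Python sketch. It is a simplification under stated assumptions, not the researchers' decoder: it scores each library clip by plain correlation between its predicted activity and the measured activity (the paper describes a Bayesian decoder), and the function name `reconstruct` and the `top_k` parameter are hypothetical.

```python
import numpy as np

def reconstruct(observed, predicted, clips, top_k=100):
    """Blend the library clips whose predicted activity best matches the scan.

    observed:  (n_voxels,) measured activity for one time window
    predicted: (n_clips, n_voxels) model-predicted activity per library clip
    clips:     (n_clips, ...) pixel arrays for the library clips
    """
    # Z-score both sides so the dot product below is a Pearson correlation.
    obs = (observed - observed.mean()) / observed.std()
    pred = predicted - predicted.mean(axis=1, keepdims=True)
    pred = pred / pred.std(axis=1, keepdims=True)
    scores = pred @ obs / obs.size

    # Indices of the best-matching clips, best first.
    best = np.argsort(scores)[::-1][:top_k]

    # Averaging many different clips is what makes the result blurry.
    return clips[best].mean(axis=0), best
```

Because the output is an average of many different clips rather than a single retrieved clip, only what the top matches have in common (rough shapes, positions, motion) survives, which is why the reconstructions look fuzzy.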

Yes, I see from your explanation I misunderstood the "bank" to be much simpler than it actually is, and therefore misunderstood what the result demonstrated. It is more interesting than I thought. Thanks for clarifying.