by Dr Justin Marley

Turning A Person’s Visual Experiences Into A Movie

I’m still trying to come to terms with this paper, which has just been published in the journal Current Biology. The implications are staggering. Whilst it had often been suggested that one day neuroscientists would be able to view a person’s experiences, I had disregarded this possibility*. Instead I thought that we would come to understand the brain in a different way – in a way which still separated mind and brain. What the researchers have done in this study, however, is to overturn many common assumptions that have defined central debates in neuroscience, including discussions about the nature of consciousness and the mind-brain divide. In effect, the researchers have filmed a person’s thoughts as if we were watching a movie. Strictly speaking this is incorrect, but if you hold this thought it becomes easier to get to grips with the paper, and I’ll explain later why it is nearly right but not quite. First, to understand the magnitude of this discovery, take a look at the clip below. The video on the left was presented to the subject, and the video on the right was created solely from that person’s brain scan images, used in conjunction with a database-aided software reconstruction. In other words, the researchers didn’t need to see the original video images to reconstruct them.

Reconstruction of Video Images in Gallant’s Lab

Introduction to fMRI and a Background to the Study

Gallant and colleagues are a group of researchers at the University of California, Berkeley, USA. They produced a 6-page report together with supplementary material containing further details of their methodology. The researchers used functional Magnetic Resonance Imaging (fMRI) to build up a picture of the activity in the brain as a person watches videos. One of the central problems in fMRI research is that the approach has not been thought good at picking up the very fast pace of neural activity. fMRI applies a large magnetic field across the body and detects small changes in that field, which are then used to build up a picture of the brain. However, this picture has to be interpreted – so the next question is: what causes these changes in magnetic field strength? When researchers use BOLD signals they are using Blood Oxygen Level Dependence, a measure of oxyhaemoglobin. When we breathe in oxygen, it binds to haemoglobin, becoming oxyhaemoglobin, the carrier of oxygen. Oxyhaemoglobin is carried around in the bloodstream and the oxygen is used by tissues. When the oxygen is used by the brain, there is a change from oxyhaemoglobin to deoxyhaemoglobin. These two molecules have different magnetic signatures.

Video Detailing the Physiology of Oxygen Transport

Thus when BOLD signals are used, the researchers can view the use of oxygen in the brain in real time. How does this relate to brain activity? The answer is that brain activity is fast while blood supply changes are slow and delayed. So when researchers view the blood supply changes, the neural activity has already happened and they are watching the after-effects. The pairing of this very slow change in blood supply with fast neural activity was thought by many to pose insurmountable problems for the interpretation of neural activity. However, the researchers in this study have demonstrated otherwise. They have done two things which helped them to make this breakthrough.

1. They have studied a well-understood area of the brain – the visual cortex. As this area of the brain has been very well researched, the physiology of this part of the brain has been well detailed even down to the level of how individual brain cells – neurons – can code for the visual information they are receiving.

2. They have developed a sophisticated mathematical model which allows them to interpret changes of blood flow in the brain.

The researchers used 3 subjects in their study. The subjects watched a series of movies. The researchers ensured that the movies occupied a circumscribed area within their visual field and were presented at a frequency of 15 Hz. fMRI recordings were made while the subjects were watching 7200 seconds of colour movies for the training data and 540 seconds of colour movies for the test data. An understanding of this study necessarily involves a look at the details of fMRI statistical analysis.

With fMRI analysis the brain is divided into cubes which are referred to as voxels. The researchers can view the change in signal within each voxel over time. We can see from the above discussion that the changes in these voxels come from the release of oxygen into the brain tissue but the researchers wanted to find out something else – the activity in the brain a few seconds earlier. The researchers focused their analysis on an important part of the visual system – the visual cortex and in particular on areas V1, V2, V3A and V3B (see also here and here).
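The voxel idea can be sketched in code. Below is a minimal, purely illustrative Python example in which the scan is treated as a 4-D array and one voxel’s signal is a slice along the time axis; all dimensions are invented and are not those of the study.

```python
import numpy as np

# Hypothetical scan data: a 4-D array of BOLD signal values indexed
# as (x, y, z, time). The dimensions are invented for illustration,
# not those used in the study.
rng = np.random.default_rng(0)
bold = rng.normal(size=(64, 64, 30, 120))   # 64 x 64 x 30 voxels, 120 time points

# Each voxel is one small cube of brain tissue; slicing along the
# last axis gives the signal the researchers track over time.
voxel_timecourse = bold[32, 20, 15, :]
print(voxel_timecourse.shape)   # (120,)
```

The point of the sketch is simply that each voxel yields its own time series, and it is these time series that the analysis below tries to relate to the visual scene.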

The Human Visual System

The human visual system takes visual information from the outside world and translates it into a form that can be understood and used within the brain. The earliest parts of the visual system are detailed in the video below.

Schematic Overview of the Early Visual System in Humans

V1 is referred to as the Striate Cortex or Primary Visual Cortex while the other areas are referred to as Extrastriate Visual Cortex. These areas are also described by another system of nomenclature – the Brodmann Areas – which were classified in the early twentieth century according to the microscopic properties of the brain which vary subtly from one area to another. The areas of interest to the researchers in this study correspond to Brodmann Areas 17, 18 and 19 and are displayed in the diagrams below.

Researchers have tended to think of the visual cortex in terms of the dorsal and ventral streams. The dorsal stream includes areas V1 and V2 as well as other areas such as V5 and is thought to be involved in processing information about ‘where’ things are. For example, there is evidence that this stream is involved in the processing of motion. The ventral stream includes areas V1 and V3 as well as other areas including the Inferior Temporal Cortex and is thought to be involved in processing information about ‘what’ things are. For instance, there is evidence that this stream is involved in processing the form of objects. Researchers have found abundant evidence that the visual cortex operates in both a hierarchical and a distributed manner. In other words, in order to do certain things like identifying moving objects, the visual cortex processes visual information in a series of steps, each necessary for the next to happen. In this sense the visual cortex is hierarchical. The existence of ventral and dorsal streams running through different ‘circuits’ within the visual cortex also means that the visual cortex works as a parallel processing system. These two properties – parallel and hierarchical processing – are key properties of the brain.

How the Brain ‘Sees’ the Visual World

A fundamental principle of processing in the visual cortex is that the information conveyed in light waves – the amplitude and wavelength of the light – is translated into the firing of neurons. This is central to an understanding of the research paradigm used in this study and can be understood by inspecting the diagrams beneath and comparing them with the all-or-nothing firing response of a neuron.

In order to transmit information, a neuron can alter only whether it fires and the rate at which it fires. The properties of the light wave above must therefore be encoded in the firing rate of the neuron if it is to convey information about the light wave (although the information conveyed by a group of neurons complicates this simplistic interpretation, as a population allows for a continuous rather than discrete set of responses). The reader might ask how neurons can translate the incredibly complex visual information from the environment into a collection of all-or-nothing firing responses so reliably. The visual neuroscience research community uses a mathematical technique known as Fourier analysis (and related techniques) to simulate this property of neurons. Put simply, Fourier analysis is a mathematical technique which can translate information about waveforms in space and across time into frequency information. Without going into too much detail, it takes advantage of the fact that waveforms (sines and cosines) in specific combinations can represent complex waveforms. The example below illustrates how sine and cosine waveforms can approximate a square waveform.
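That square-wave example can be written as a short Python sketch: summing odd sine harmonics reproduces the square waveform ever more closely. The number of harmonics and sample points here are arbitrary choices made for illustration.

```python
import numpy as np

# Partial Fourier series of a square wave of amplitude 1:
# square(x) ≈ (4/pi) * sum over odd k of sin(k*x)/k.
x = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
square = np.sign(np.sin(x))          # the target square waveform

approx = np.zeros_like(x)
for k in range(1, 40, 2):            # odd harmonics 1, 3, 5, ..., 39
    approx += (4 / np.pi) * np.sin(k * x) / k

# Away from the jumps the approximation is already close; adding more
# harmonics tightens it further (apart from the Gibbs ringing right
# at the discontinuities).
mask = (x > 0.5) & (x < np.pi - 0.5)
error = np.max(np.abs(square - approx)[mask])
print(round(float(error), 3))
```

Each sine term here is analogous to one frequency component that a population of neurons might encode.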

What is remarkable is that when we ‘see’ the world around us, the brain is doing something counterintuitive and exceptionally elegant. Indeed this happens without our being consciously aware of it or needing to be concerned with it at all. The brain is doing the maths automatically. Furthermore, since solutions are conserved in evolution, it is likely that most other organisms that process visual information with their nervous systems are ‘performing’ the same maths operations automatically. These same maths operations can be performed consciously and explicitly by only a small percentage of people, mainly in engineering (and of course neuroscience research). This study adds further evidence to support the notion that the natural language of our brain is a mathematical encoding of the world**.

Putting it all Together

In order to figure out the relationship between blood oxygen changes and neural activity, the researchers needed to write software that analysed the video material and translated it into a format that could be mathematically related to the signals in the voxels. The video clips needed to be turned into coded signals for the software to use in the analysis. The researchers used a Fourier-related type of analysis – Gabor wavelet analysis – to analyse the scene.
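As a rough illustration of the idea – and not the study’s actual filters – the sketch below builds a quadrature pair of Gabor filters (a sinusoidal grating windowed by a Gaussian) and shows that their combined ‘energy’ response to an edge is far stronger at the matching orientation than at the orthogonal one. All parameters and the toy image are invented.

```python
import numpy as np

def gabor_pair(size, wavelength, theta, sigma):
    """Return an even (cosine) and odd (sine) Gabor filter: a grating
    windowed by a Gaussian. Squaring and summing the two responses
    gives a phase-invariant 'energy' measure, the basic ingredient of
    energy-style models of visual neurons."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_r = x * np.cos(theta) + y * np.sin(theta)   # coordinate along the grating
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    even = envelope * np.cos(2 * np.pi * x_r / wavelength)
    odd = envelope * np.sin(2 * np.pi * x_r / wavelength)
    return even, odd

def energy(image, theta):
    even, odd = gabor_pair(size=21, wavelength=8, theta=theta, sigma=4.0)
    return np.sum(image * even) ** 2 + np.sum(image * odd) ** 2

# A toy image with a vertical light/dark boundary.
image = np.zeros((21, 21))
image[:, 11:] = 1.0

energy_matched = energy(image, theta=0.0)          # grating varies horizontally
energy_orthogonal = energy(image, theta=np.pi / 2)  # grating varies vertically
print(energy_matched > energy_orthogonal)
```

A bank of such filters at many positions, orientations, scales and (for movies) speeds is the kind of front end the phrase ‘motion-energy filters’ refers to.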

A Video Showing a Software Program Analysing Facial Expressions and Working out the Corresponding Emotion using Gabor Filters to Deconstruct the Visual Scene.

The researchers used the term ‘motion-energy filters’ to describe the filters that they applied to the visual scene. First they simplified the video clips by reducing the range of colours and the number of pixels in the images. The analysis involved many different filters, each detecting a different property of the video footage. For instance, one filter might detect movement in one direction in one part of the video. Each filter receives the input signal from the video and turns it into an output signal. This signal then needs to be translated into a blood flow signal, and another filter is used for this. When this filter receives the signal from the video processing filter, it translates it into the corresponding change in blood flow. Just to recap: when oxyhaemoglobin delivers oxygen to the relevant area of the brain, it becomes deoxyhaemoglobin and produces a signal which follows the brain activity by up to a few seconds. The filter that the researchers used follows the same pattern, producing an output with a time delay relative to the brain activity input.
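That delayed blood-flow filter can be sketched as a convolution with a haemodynamic response function. The gamma-like shape below is a common textbook simplification, not the exact filter the researchers fitted, and the timings are illustrative.

```python
import numpy as np

# A toy haemodynamic response: the blood-flow signal rises, peaks a
# few seconds after the neural event and then decays. This gamma-like
# shape is a standard simplification, not the study's fitted filter.
t = np.arange(0.0, 20.0, 1.0)               # seconds, 1 s sampling
hrf = (t ** 3) * np.exp(-t / 1.5)
hrf /= hrf.sum()                            # normalise to unit area

neural = np.zeros(60)
neural[10] = 1.0                            # a brief burst of activity at t = 10 s

bold = np.convolve(neural, hrf)[:60]        # predicted BOLD time course
peak_time = int(np.argmax(bold))
print(peak_time)                            # a few seconds after the burst at t = 10
```

Convolving any predicted neural signal with such a kernel yields the slow, delayed signal that can then be compared with what the scanner measures.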

So now the researchers had worked out how to program the software to analyse the visual scene in a way similar to that of the brain. The software had lots of filters, just as the brain has lots of neurons dedicated to visual processing. However, rather than producing a one-to-one correlation between the neurons and the filters used in the software, the researchers looked at groups of neurons in each voxel. This makes sense because the fMRI analysis was geared towards a scale relevant to groups of neurons rather than individual neurons: the oxygen changes being measured related to the energy demands of groups of neurons. There are further refinements within their methodology. For instance, the researchers note that of the original 15,000 voxels in the Occipital Cortex that they identified

‘..to obtain optimal reconstructions for each subject we used only the 2000 voxels that produced the most accurate predictions‘

They write further that each voxel is modelled as a collection of thousands of filters. Thus they are producing a many-to-many correlation. Indeed the researchers go on to say that they use another mathematical approach known as regression to map the filter output signal onto the voxel activity signal. What the researchers found by using this regression analysis was that the speed-detection properties of voxels differed according to their position within the visual cortex, which the researchers measured in terms of eccentricity. Voxels that detected higher speeds in the visual scene tended to be more peripheral.
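That regression step can be sketched with simulated data: invented ‘filter outputs’ are mapped onto an invented voxel signal with ridge regression. The study used a regularised fit of this general kind; the data sizes, weights and regularisation value here are all made up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated training data: rows are time points, columns are filter
# outputs; y is one voxel's measured signal. The true weights are
# known here only because the data are simulated.
n_time, n_filters = 200, 10
X = rng.normal(size=(n_time, n_filters))
true_w = rng.normal(size=n_filters)
y = X @ true_w + 0.1 * rng.normal(size=n_time)   # voxel signal plus noise

# Ridge regression: w = (X'X + lambda*I)^-1 X'y; lambda is arbitrary.
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(n_filters), X.T @ y)

print(np.allclose(w, true_w, atol=0.1))   # recovered weights are close to the truth
```

Once the weights for every voxel are fitted, any new video can be pushed through the filters to predict the voxel signals it should produce – the step the reconstruction below depends on.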

Now that the researchers had identified the optimal voxels to include in their modelling and had an effective method for translating a visual scene into a signal that related well to the voxel activity they were detecting, the last stage was to reconstruct the images from the detected activity. The researchers had collected training and test data as described above. The idea here was to test the model’s accuracy in matching voxel activity with the signal produced by processing of the visual scene, and this is relevant to a previous debate on statistical analysis of fMRI research. The researchers trained the software with a huge amount of footage from YouTube – over 18 million seconds of video – in order to make predictions about the BOLD signals that these images would produce. From their bank of images, the researchers were then able to reverse-engineer the images that were being seen when the research subjects (who were the researchers themselves!) viewed a new set of images while being scanned.

Here again things get a little tricky. For instance, the software tended to retrieve image clips that had initial movement followed by a relatively static scene, which interfered with the reconstruction. Therefore the researchers included only the first 5 seconds of the clips they were retrieving. The clips that most closely matched the predicted signal associated with the voxel activity were ordered from most to least likely, and the researchers averaged the top 100 matching clips. This approach is demonstrated in the lab’s video below.
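The ranking-and-averaging step can be sketched as follows, with entirely simulated signals standing in for the clip bank and the measured voxel activity; every number below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented stand-ins: predicted voxel signals for a bank of candidate
# clips, plus a 'measured' signal that is a noisy copy of clip 123.
n_clips, n_voxels = 500, 50
clip_signals = rng.normal(size=(n_clips, n_voxels))
measured = clip_signals[123] + 0.5 * rng.normal(size=n_voxels)

# Score every candidate clip by correlation with the measurement,
# order from most to least likely, then average the best matches.
scores = np.array([np.corrcoef(c, measured)[0, 1] for c in clip_signals])
top = np.argsort(scores)[::-1][:100]          # the 100 best-matching clips
reconstruction = clip_signals[top].mean(axis=0)

print(int(top[0]))    # the true clip should sit near the head of the ranking
```

Averaging the best matches, rather than taking only the single best, is what gives the reconstructed movies their blurred, dream-like quality.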

Video Reconstructions of Clips Presented to 3 Subjects. The average of the best-fit clips is on the left, while the individual best-fit clips are on the right. Each row represents a single subject.

What Do These Study Findings Mean?

Having produced these results, the researchers were able to demonstrate them in a very unusual and yet convincing way. Often fMRI study reports will include pictures of the brain activity averaged across subjects, with the suggestion of a correlate with subjects’ experiences. Here, however, as well as the statistical analysis the researchers have created a very convincing series of videos showing the reconstructed clips next to the originals. For me these clips were so convincing that it was almost as if I didn’t need to see the statistical analysis – the clips were enough to do the convincing. There are various reasons why this should not be the case and why statistics are still necessary, but it does raise an interesting question about whether new techniques for persuading other members of the scientific community about a hypothesis can be developed.

One important question, however, is whether the voxel activity we are ‘seeing’ is really the visual experience of the subject. This is really a question about phenomenology. When the subjects say ‘I can see this person moving in the clip’, is the ‘I am seeing’ bit of the brain in the Occipital Cortex? This might not necessarily be so. There are higher association areas in other parts of the brain where the visual information is combined with information from other parts of the brain, and the information in the visual cortex is reproduced elsewhere. However, it would make sense for the experience to occur in the part of the brain where the processing is taking place, because this would mean that it is experienced as close to the stimulus event as possible – essential in the natural world where the brain’s functions have been shaped. Answering this question is necessary before we can say that the researchers are imaging a person’s visual experiences. We would think it silly to suppose that the visual experience occurs in the retina, or in a single rod or cone, for instance. Where in the hierarchy of visual processing does it start to make sense to say that there is a useful correlation with consciousness?

If this activity is a conscious experience then we can say that the researchers are reproducing that person’s experiences. Remarkably, the subjective is being turned into the objective, opening up many possibilities. Just what are these possibilities? The researchers put it thus

‘as long as we have good measurements of brain activity and good computational models of the brain, it should be possible in principle to decode the visual content of mental processes like dreams, memory, and imagery‘

Could We Ever See a Person’s Hallucinations?

If it holds that the voxel activity that the researchers have identified in the Occipital Cortex correlates with visual perception, then it is entirely possible that, without any change in the methodology, the researchers would be able to reproduce the hallucinatory experiences of subjects with psychosis. Of course this would raise many ethical questions, and people who are very unwell with psychotic experiences may not be able to tolerate the long hours spent in the scanner. However, with improvements in the methodology, including refinements of the software and training sets, this may become more practical and could possibly help us not only to reach a better understanding of psychotic experiences but also to look at the effectiveness of antipsychotics in ways which were unimaginable before the publication of this study. There is one type of hallucination in particular that could be investigated with this approach – the hallucinations that arise following strokes involving the Occipital Lobe. There are many examples of this (see here, here, here, here, here, here, here and here) and one of these includes a case of Charles Bonnet Syndrome. A certain form of dementia – Lewy Body Dementia – is commonly associated with visual hallucinations and in one study has been associated with changes in Occipital Cortex glucose metabolism, suggesting that this approach may be useful in further investigating the Occipital Cortex link. There is also one study in which the researchers found that people with Alzheimer’s Disease were more likely to experience visual hallucinations when there were white matter lesions in the Occipital Cortex. Researchers such as ffytche have summarised the electrophysiological correlates of visual hallucinations in the occipital cortex. Complex visual hallucinations have also been evoked by electrical stimulation of the Occipital Cortex during surgery for certain types of epilepsy.

Could We Reconstruct Out-of-Body Experiences?

One Italian group has reported out-of-body experiences (also known as autoscopy) in association with damage to the Occipital Lobe in one person. Again, the same study methodology could be applied to this phenomenon, provided that it is experienced at the time of scanning. The subject would be presented with a video training set in order for the images to be reconstructed.

Conclusions

The researchers in this study have achieved a remarkable result and no doubt groups around the world will be trying to replicate these findings. These results will change the nature of the mind-brain debate in a fundamental way.

Footnotes

* The argument runs along the following lines. If we assume a connectionist model in which experiences and processing within the brain are represented in the synaptic connections, then it follows that understanding the content of activity within the brain will depend on knowing where those synapses are. This level of detail in turn requires a very subtle level of analysis – using microscopy and understanding small groups of neurons. The macroscopic properties of neurons will not be so useful, as they will be taken out of context.

** A further interesting inference is that we can share this ‘mathematical’ experience with each other by means of a non-mathematical language. Perhaps it is because we share the same mathematical infrastructure in conscious experience that we don’t need to place so many constraints on the language that we use to share experiences. The experiences conveyed by the language can be readily reconstructed because we have many constraints on the interpretation of language. This would also mean that the ability to understand language depends on initially engaging with language experientially, and if correct this may explain the innate ability to impose a ‘correct’ grammatical structure on a language under certain circumstances, e.g. (Pinker, 1994).

References

Steven Pinker. The Language Instinct: The New Science of Language and Mind. Penguin Science, 1994.

An index of the site can be found here. The page contains links to all of the articles in the blog in chronological order. Twitter: You can follow ‘The Amazing World of Psychiatry’ Twitter by clicking on this link. Podcast: You can listen to this post on Odiogo by clicking on this link (there may be a small delay between publishing of the blog article and the availability of the podcast). It is available for a limited period. TAWOP Channel: You can follow the TAWOP Channel on YouTube by clicking on this link. Responses: If you have any comments, you can leave them below or alternatively e-mail justinmarley17@yahoo.co.uk. Disclaimer: The comments made here represent the opinions of the author and do not represent the profession or any body/organisation. The comments made here are not meant as a source of medical advice and those seeking medical advice are advised to consult with their own doctor. The author is not responsible for the contents of any external sites that are linked to in this blog.