Why do humans see in color? According to neuroscientist Mark Changizi, who left academia to run a research institute called 2Ai, it’s so that we could read the emotions of others. In his book, Harnessed, published last summer, Changizi described his theory — along with others about how humans developed the basic capacities to think, speak and read. Below, Changizi explains further.

Why do humans see in color when many animals don’t?
Your dog doesn’t see in full color, but we primates evolved an extra dimension of color vision that other mammals don’t have. For over 100 years it was thought that [this new] ability to see reds and greens was about finding fruit in the forest or seeing young leaves or [something else to do with] eating. What I was able to argue was that our vision was never well optimized for that. My hypothesis was that it was about sensing emotions or health on the skin of others.

Like blushing or going red with anger?
[Yes.] If it’s really about that, then it better be that [our vision is] optimized for sensing hemoglobin and the physiological modulation of that. When you look at how hemoglobin changes as it oxygenates and deoxygenates, it’s in a very particular part of the spectrum that’s subtle and hard to see, unless there are cones [light-sensitive cells in the retina that detect color] just in the spot where primates have a pair.

But don’t most other primates have fur rather than skin?
[Yes.] The other major prediction was that if it’s true [that color vision is] about sensing emotion on bare skin, then primates with naked spots [would have it]. And primates with color vision have bare faces and rumps, and the others have furry faces like typical mammals.

You’ve recently developed some high-tech glasses that improve on this ability.
There are three different technologies. There are two dimensions you can see in color [related to blood under the skin]. You can see the veins in your wrist, that’s almost purely [about] oxygenation — the yellow-blue axis. That’s the variation one set of glasses highlights. The other dimension is if you look at your palm, squeeze it and then let go, it becomes yellower or whiter when you squeeze the blood out. That’s the red-green axis.

Two of the specialized glasses are for medicine. There’s the oxygenation isolator — a vein finder. It’s good for any case where you are interested in just looking at variations in oxygenation, like phlebotomy or to get a line or IV in. The second one kills the ability to see oxygenation and amplifies [the ability to see] whether blood is pooling. You wouldn’t see veins at all, but when you squeeze your fingers, you can see where blood is and isn’t. It [has applications in] dermatology.

So why would amplifying either thing be useful for detecting other people’s emotions?
Each of those could be of interest in detecting emotion and general health. For over 2,000 years, [many] medical diagnoses have mentioned acute pallor of skin. It’s part of the doctors’ toolkit. In some countries you can’t go to medical school if you are colorblind. On the emotional side, it would be interesting to know whether it might be useful for security. Or, it could be that in poker playing, one might be more useful than another, I don’t know.

Might it be able to help autistic people read the emotions of others?
The thought has occurred to us. I don’t know anything about it. There’s a counterview of autistic people not having lower sensitivity to emotion, but greater sensitivity. Someone ought to try it, but I have no idea if it would work.

Your other work has examined the evolution of music and written and spoken language.
We do seem to have an instinct for reading: we read very early with relatively little training. By the time we’re adults, we’re reading more than we’re listening to words all day long. If you were an alien, you’d think surely that was part of natural selection. We even have parts of the brain that look like reading areas. [But all written languages] have all the hallmarks of instinct, not by virtue of natural selection but because culture has shaped the very structure of words across writing systems, so that they look like visible objects in the natural world. That way, you can transform the brain’s object recognition system into a reading system. For example, in Chinese, each symbol stands for an entire word, and the symbols already look like simple drawings of things. The trickier case is writing systems like our own where individual letters stand for speech sounds. If culture is trying to make words look like objects and natural scenes, letters should look like object parts. There are object parts long known within vision science: junctions like L junctions or T junctions.

And these are the sort of low-level building blocks for seeing the world that are used by the brain to process vision?
It’s what you get in a world with opaque objects strewn about. You get Ls and Ts and not Xs. And that’s what’s found in writing systems. You can find lots of Ls and Ts but not that many Xs. In the full set of topological configurations of strokes, there are 36 different shapes that you can work out. Some happen in natural scenes and some don’t. So, [the question is], Are the ones that are found in the natural world that shaped the visual system also the ones that you find in writing? And that’s what you find.

But we do have the letter X.
It’s not that we don’t have any Xs. But if you look at the relative probabilities, the ones that are common in nature are relatively common in writing and those that are relatively rare should be rare, and that’s what you see.

So how does this apply to the evolution of language?
Once you have this view for writing, then you can say, We’ve only been culturally evolved with writing for 2,000 years, but we may have been speaking for 100,000 years. The question was, what could it be harnessing? If writing is harnessing the visual system’s object-recognition system, what could speech be harnessing?

This took a lot of getting grasshoppery about sounds. It really depends on a very fundamental notion of a world of opaque objects strewn about, not [something specific] like the savannah. The world of sounds is mostly sounds of solid objects [interacting]. There’s liquid sounds and the sounds of wind, but for the most part, solid objects turn out to have characteristic patterns of sounds that they make. They hit, they slide, and when they hit or slide they vibrate: they ring. Unlike a gong, a coffee cup will ring [quietly]. But on the basis of these rings, you recognize the objects around you. You have an auditory system that can tell what objects are involved, [even when you can’t see very much].

In language, the most fundamental sounds you find are plosives like puh and guh and tuh that sound like hits. And then you’ve got fricatives like shuh and zuh. They sound like slides. You also have a third category: sonorants or vowels, letters like Y and L and R. They all have a ringlike sound. The three categories of phonemes are effectively the sounds of hits, slides and rings.

What’s been the reaction to your idea that speech and written language harness these properties of the physical world?
Overall, when I give these talks people are very excited because no one has put forth a view like this. People had noticed the observation that among the sounds of speech are all these similarities. But inside the sounds of speech are these fundamental common and natural sounds of solid objects. I have had great reactions among neuroscientists and linguists. But some linguists just don’t care how it evolved, they’re interested in formal logical rules.

The idea is that human gait has the structure of beat and rhythm. The Doppler shift sounds that moving things make transfer to melody. The path that pitches take tells you the path that humans are moving around. You end up with the four major parts of music mapping straightforwardly to the four aspects of sounds of humans that we have to be good at processing. We need to be able to track and understand other people and understand what it means for us.

So, since music is already about people moving, it gets us moving? How might this explain why certain songs get stuck in your head?
I had a student interested in the question of earworms and this is his work. The standard story is that they are really repetitive, so they become stuck in your brain. But most pop songs are repetitive, so his thought was: could it be that the ones that are disproportionately likely to be earworms are disproportionately likely to be like humans moving? And when you look at the top 10 list of earworms, they are disproportionately like the Macarena and have things you do with your body while you [hear them]. He found that the extent to which movement is connected was way more common among earworms than among pop songs in general.

