Cognitive Sciences Stack Exchange is a question and answer site for practitioners, researchers, and students in cognitive science, psychology, neuroscience, and psychiatry.

Readability is often intuitively synthesized. If you see some piece of code, you just know whether it is readable or not. But what are actual psychological, scientific explanations for this?

There are readability metrics for written text, but do they apply to source code as well? For example, I believe the word superiority effect applies to source code. What other effects that arise while programming make code readable?

I am not looking for educated opinions or examples as in here or here, but for neural correlates or a perspective from cognitive sciences.

This question came from our site for professional programmers interested in conceptual questions about software development.

One thing that frequently inhibits readability is deviance from common patterns. When someone iterates over some collection, but starts with the second element, that's easily overlooked and can make the code "feel" more complex than it actually is.
– Joachim Sauer, Apr 9 '13 at 11:46
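A minimal sketch of the situation described in the comment above, in Python. The function names and the idea that the first element is a header value are assumptions made purely for illustration. Both functions sum every element except the first; the first version hides the deviation in the loop's start index, while the second gives it a name.

```python
from itertools import islice


def total_hidden_skip(readings):
    """Sum all but the first reading; the skip hides in the range start."""
    total = 0
    for i in range(1, len(readings)):  # starts at 1, easily misread as 0
        total += readings[i]
    return total


def total_explicit_skip(readings):
    """Same result, but the deviation from the usual loop is named."""
    header_count = 1  # assumption: the first element is a header value
    return sum(islice(readings, header_count, None))
```

Both versions compute the same thing; the difference is only in how visible the unusual starting point is to a reader who expects the common "iterate over everything" pattern.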

Deviance from common patterns also factors in when comparing "my" code to "their" code - you are used to your own patterns.
– MadKeithV, Apr 9 '13 at 11:52

There are no direct references in Code Complete under 34.3 Write Programs for People First, Computers Second. Have you had a look through the extensive bibliography of that book for relevant sources?
– StuperUser, Apr 9 '13 at 12:11

There is research for metrics regarding software source code readability, based on readability metrics for natural languages: Buse and Weimer, Buse and Weimer, Dorn. These don't discuss the psychological roots, however, but may help address your second paragraph about applying readability metrics to source code, how to determine if code is readable, and how to make readable code.
– Thomas Owens, Apr 9 '13 at 13:50

3 Answers
Using the English language, given two sentences that say the same thing, what makes one more readable than the other? Usually terseness while retaining clarity and removing ambiguity.

The exact same things make code more readable. Remove everything that doesn't add anything, but don't remove things that do add information. And avoid ambiguity.
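As a rough illustration in Python (the function names, the data layout, and the age threshold are all made up for this sketch), here are three ways to write the same computation: one that is padded with things that add nothing, one that has been cut so far that information is lost, and one that is terse while staying unambiguous.

```python
def count_adults_verbose(people):
    # Verbose: index bookkeeping, a temporary list, and a redundant
    # comparison with True add text but no information.
    result_list = []
    for index in range(len(people)):
        person = people[index]
        is_adult = person["age"] >= 18
        if is_adult == True:
            result_list.append(person)
    return len(result_list)


def f(p):
    # Too terse: the names say nothing and the 18 is unexplained.
    return sum(x["age"] >= 18 for x in p)


ADULT_AGE = 18  # assumed threshold, purely for illustration


def count_adults(people):
    # Terse but clear: nothing redundant, nothing left to guess.
    return sum(1 for person in people if person["age"] >= ADULT_AGE)
```

All three return the same number for the same input; only the amount of noise and the amount of missing information differ.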

In code, we have the advantage that we can reuse similar pieces of text and replace specific values within them. If only we could do that with books, we could save a small rainforest.

On top of terseness, we have good use of punctuation and spacing. Ever tried to read a blog piece where there are no paragraphs whatsoever? That's what it's like reading code that isn't neatly spaced out.

The psychology of this is simple. Your brain receives all sorts of hints as to what it's reading. Try taking a long piece of text and printing it out (with a serif font) without paragraphs, then without periods and commas, or maybe switch punctuation marks (? => ., ! => :, etc), then in a sans-serif font.

You'll quickly see how much you rely on your brain's instincts when you're reading. Code is exactly the same. And herein lies a problem, because some things are certainly subjective. Or rather they are a matter of habit.

If you're used to seeing underscores preceding member variables and then move to a company where they don't do that, your brain will still expect the same hints and the code will be harder to read. But changing the convention would throw off everybody else on the team. It would be like moving to Spain and telling people to stop using the inverted question mark at the start of a question because it throws you off.

Design patterns are a similar thing. If everyone knows what a Builder Pattern does, then adding Builder to a class name gives everyone a lot of information. But if I have my own Bracksfort Pattern, that information only helps me, at least until other people start to ask me, or I document it and get the community to buy into it.
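To make the naming point concrete, here is a small Python sketch. The class `HttpRequestBuilder` and its methods are hypothetical, and this is a simplified fluent builder rather than the full textbook pattern. A reader who knows the convention can guess from the name alone that they configure the object step by step and finish with `build()`.

```python
class HttpRequestBuilder:
    """Hypothetical class: the 'Builder' suffix signals step-by-step construction."""

    def __init__(self, url):
        self._url = url
        self._headers = {}
        self._timeout = None

    def with_header(self, name, value):
        self._headers[name] = value
        return self  # returning self allows chained calls

    def with_timeout(self, seconds):
        self._timeout = seconds
        return self

    def build(self):
        # Produce the finished request description as plain data.
        return {
            "url": self._url,
            "headers": dict(self._headers),
            "timeout": self._timeout,
        }


request = (
    HttpRequestBuilder("https://example.com")
    .with_header("Accept", "text/html")
    .with_timeout(5)
    .build()
)
```

Had the class been named after a private convention that only its author knows, the same code would give a new reader none of those expectations for free.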

This. I hate it when people say verbosity of code doesn't matter because IDE writes it or some other excuse. What about the reader who has to wade through all those lines while trying to find a signal from the noise?
– Esailija, Apr 9 '13 at 12:11

Great answer, outlining the underlying difficulty of defining what is "readability"
– Clement Herreman, Apr 9 '13 at 12:28

@what: Well, yes. If I took a piece of English text and reduced all the nouns and adjectives to one letter, you might struggle to read that too.
– pdr, Apr 12 '13 at 19:31

Reducing code is a common practice, especially on heavy traffic sites. Look at the JavaScript code on a Google page: all variables are one or two letters, and there is almost no whitespace (no line breaks, no spaces, no tabs). That is unreadable code. So the workflow might be to write readable code first and then reduce its volume by search and replace before you put it online.
– user1196, Apr 13 '13 at 9:55

This is my current area of research (I'm a Ph.D. student in computer science and cognitive science). Like you said, there are a large number of readability/complexity metrics, but very little research trying to quantify what makes a piece of code psychologically complex. For more information on qualitative studies and models, I'd highly recommend the 2001 book Software Design: Cognitive Aspects.

I've been running some experiments recently with the goal of developing a cognitive model that can "read" code like a human. The inspiration for my model largely comes from a 1995 paper by Simon Cant et al. In the paper, they lay down an (incomplete) foundation for a mathematical model that "chunks" small blocks of code into memory and "traces" to other parts of the program in order to resolve dependencies. I have a workshop paper titled "Cognitive architectures: a way forward for the psychology of programming" that suggests we may be able to build such a model on top of a cognitive architecture like ACT-R.
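To give a feel for the "chunk" and "trace" vocabulary, here is a toy Python sketch. It is not the model from the 1995 paper or from my own work, and it is not built on ACT-R. It simply assumes that each top-level function is one chunk and that a call to another function in the same file is a dependency the reader has to trace. All names in it are made up for illustration.

```python
import ast

SAMPLE = '''
def area(w, h):
    return w * h

def volume(w, h, d):
    return area(w, h) * d

def report(w, h, d):
    print(volume(w, h, d))
'''


def call_dependencies(source):
    """Map each top-level function (a 'chunk') to the other chunks it calls."""
    tree = ast.parse(source)
    functions = {
        node.name: node for node in tree.body if isinstance(node, ast.FunctionDef)
    }
    deps = {}
    for name, node in functions.items():
        called = set()
        for sub in ast.walk(node):
            # Only direct calls by plain name to functions defined in this source.
            if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                if sub.func.id in functions and sub.func.id != name:
                    called.add(sub.func.id)
        deps[name] = called
    return deps


def chunks_to_trace(deps, start):
    """Return every other chunk a reader must visit to fully resolve `start`."""
    seen = set()
    stack = [start]
    while stack:
        current = stack.pop()
        for dep in deps.get(current, ()):
            if dep not in seen and dep != start:
                seen.add(dep)
                stack.append(dep)
    return seen


deps = call_dependencies(SAMPLE)
to_trace = chunks_to_trace(deps, "report")
```

In this toy, understanding `report` means tracing through `volume` and then `area`, while `area` can be understood on its own. A real cognitive model would also have to account for memory limits, familiarity, and much more.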

I'm currently writing up the results of one of my experiments. Once it's ready, I'll be posting about it on my blog (synesthesiam.com). In short, we had programmers of all different experience levels predict the output of 10 Python programs (each program had 2 or 3 versions). Some of them did the experiment in front of an eye-tracker, which will be another paper someday (hopefully) soon. The rest were from Mechanical Turk, and we observed interesting speed/accuracy differences that were not always in favor of the more experienced programmers.

Thank you for your answer! I believe that "Cognitive Processing" is too abstract. The speed of the lexical access for example (retrieving what a word means from your mental lexicon) is a well studied construct that explains your answer ($f vs. $first_name) with the word superiority effect. I am looking for constructs like this!
– cessor, Apr 19 '13 at 8:10