Generative models and programming talent

What, if anything, is the essential and unique talent of good programmers?

The best account of this I’ve seen was written by one Stuart Reges in 1992. It has been interestingly corroborated by two more recent papers on CS education. I think Reges was onto something, but he was missing a couple of concepts he needed and therefore could not state his theory as concisely or effectively as he might have. I’m going to restate his idea here, and (I think) do a somewhat better job of tying it into other kinds of knowledge than he does.

Reges tried to write an account of a skill that he called “CS IQ”. I observe that CS IQ has to do with the ability to construct what philosophers of science call “generative theories” (this is the concept Reges is missing).

Here is Reges’s 1992 account, lightly edited for concision and readability (mainly by adding some paragraph breaks) and with my own analysis inserted:

[Good] programmers are able to “play computer” in their head (sometimes requiring the aid of a scrap of paper). In other words, we have a model of exactly what the computer does when it executes each statement. For any given program, we have a mental picture of the state the computer is in when execution begins, and we can simulate how that state changes as each statement executes. This is rather abstract, so let me try to explain by giving a specific example.

In the language of philosophy, Reges is saying that (good) programmers have a generative model of computing. That is, they have a model that is rich in internal causal connections. They can reason forward about how causal effects will ripple through the rest of the model when the state of some part of it changes. They can also reason backward about what sort of state change would be required to produce a specified effect.

Let me tell a story that is typical of those I heard from the TAs who worked for me at the computing center. A student comes up to the TA and says that his program isn’t working. The numbers it prints out are all wrong. The first number is twice what it should be, the second is four times what it should be, and the others are even more screwed up. The student says, “Maybe I should divide this first number by 2 and the second by 4. That would help, right?”

No, it wouldn’t, the TA explains. The problem is not in the printing routine. The problem is with the calculating routine. Modifying the printing routine will produce a program with TWO problems rather than one. But the student doesn’t understand this (I claim because he isn’t reasoning about what state his program should be in as it executes various parts of the program).

The student goes away to work on it. He comes back half an hour later and says he’s closer, but the numbers are still wrong. The TA looks at it and seems puzzled by the fact that the first two numbers are right but the others don’t match. “Oh,” the student explains, “I added those 2 lines of code you suggested to divide the first number by 2 and the second by 4.” The TA points out that he didn’t suggest the lines of code, but the student just shrugs his shoulders and says, “Whatever.”

The TA endeavors to get the student to think about what change is necessary, but the student obviously doesn’t get it. The TA has a long line of similarly confused students, so he suggests that the student go sit down and think through his calculating procedure and exactly what it’s supposed to be doing.

Half an hour later the student is back again. “While I was looking over the calculating procedure, a friend of mine who is a CS major came by and said my loop was all screwed up. I fixed it the way he suggested, but the numbers are still wrong. The first number is half what it’s supposed to be and the second is one-fourth what it’s supposed to be, but the others are okay.”

The TA considers for a moment whether he should bring up the student on an honor code charge for receiving inappropriate help, but decides that it isn’t worth it (especially since that line of similarly confused students is now twice what it was an hour ago). He asks the student whether he still has those lines of code in the printing routine that divide by 2 and 4 before printing. “Oh yeah,” the student exclaims, “those lines you said I should put in. That must be the problem.” The TA once more politely points out that he didn’t suggest the two lines of code, but the student again shrugs and says, “Whatever. Thanks, dude!”

The student in my hypothetical displays the classic mistake of treating symptoms rather than solving problems. The student knows the program doesn’t work, so he tries to find a way to make it appear to work a little better. As in my example, without a proper model of computation, such fixes are likely to make the program worse rather than better. How can the student fix his program if he can’t reason in his head about what it is supposed to do versus what it is actually doing? He can’t.

In the language of philosophy, the failing student does not have (or even seek) a generative model of computing, and therefore has very little ability to see how the facts about his particular program are causally connected. His ability to reason forward from causes like the construction of his loop to effects like the observed output is weak; his ability to reason backwards from desired output to the construction of the loop is nonexistent.
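The failure mode in Reges’s story can be made concrete. The following Python sketch is my own reconstruction, not Reges’s code; the routine names and the exact bug are invented for illustration, chosen so the symptoms match the story:

```python
def calculate(n):
    """Buggy calculating routine: the loop multiplies a spurious running
    factor into every result, so the i-th value comes out 2**(i+1) times
    too big -- first number doubled, second quadrupled, and so on."""
    results = []
    factor = 1
    for i in range(n):
        factor *= 2                  # bug: factor should stay 1
        results.append((i + 1) * factor)
    return results

def patched_print(values):
    """The student's 'fix': divide the first value by 2 and the second
    by 4 before printing.  This masks the symptom for two values and
    leaves the rest wrong -- the program now has two bugs, not one."""
    adjusted = list(values)
    adjusted[0] /= 2
    adjusted[1] /= 4
    return adjusted

def fixed_calculate(n):
    """The calculating loop repaired: results are just 1..n, as intended."""
    return [i + 1 for i in range(n)]

print(calculate(4))                        # [2, 8, 24, 64] -- all wrong
print(patched_print(calculate(4)))         # [1.0, 2.0, 24, 64] -- first two "fixed"
print(patched_print(fixed_calculate(4)))   # [0.5, 0.5, 3, 4] -- loop fixed,
                                           # but the print patch now halves and
                                           # quarters the first two values
```

The last line reproduces the story’s third act exactly: once the loop is repaired, the leftover “fix” in the printing routine makes the first number half and the second one-fourth of what it should be.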

But many people (I dare say most people) simply do not think of their program the way a programmer does. As a result, it is impossible for a programmer to explain to such a person how to find the problem in their code. I’m convinced, after years of patiently trying to explain this to novices, that most are just not used to thinking this way, while a small group of other students seem to think this way automatically, without my having to explain it to them.

What is missing from Reges’s account of a “model of computation” is what the model held in a good programmer’s mind actually consists of: a rich set of causal relationships between possible states of the model. (But I am pretty certain Reges would agree with this elaboration instantly.)

Reges’s implicit theory is this: good programmers, and students who will become good programmers, have a generative model of computing. Poor programmers, and students who will become poor programmers, don’t. CS IQ consists of a talent for building generative models of computing.

Some recent work on the teaching of programming reinforces Reges’s point. See, for example, Improving the Viability of Mental Models Held by Novice Programmers. The authors, like Reges, lack the notion of a generative model. But in considering how students succeed and fail, they note that students may hold different causal theories about the assignment operator in Java — and of course, only one of these theories is correct.

Another recent paper, The camel has two humps, draws a more general conclusion. In it, two British researchers correlated incoming students’ responses on a screening exam with their subsequent performance in CS courses. They boldly claim to be able to predict a student’s success in CS courses before the students have had any contact with any programming language.

Their exam included several questions about how assignments in computer languages will change variables, with multiple-choice answers implying different theories about how assignment works. Theories included the correct one (a = b copies the value of b to a) and several incorrect ones (a = b moves the value of b to a, zeroing b; or a = b swaps the values of a and b).
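The competing theories can be stated precisely. This sketch is mine, not taken from the paper; each function returns the (a, b) pair of values that its theory predicts will result from executing `a = b`:

```python
def copy(a, b):
    # Correct theory: b's value is copied into a; b is unchanged.
    return b, b

def move(a, b):
    # Incorrect theory: b's value moves into a, zeroing b.
    return b, 0

def swap(a, b):
    # Incorrect theory: a and b exchange values.
    return b, a

# Starting from a=3, b=7, each theory predicts a different final state:
print(copy(3, 7))   # (7, 7)
print(move(3, 7))   # (7, 0)
print(swap(3, 7))   # (7, 3)
```

The point of the study was not which of these a student picked, but whether the student applied the *same* function to every question on the exam.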

The result: The test-takers who went on to fail in the CS courses were the ones who applied different theories of assignment to different questions. The ones who succeeded applied the same theory to each question; whether it was the correct theory did not matter!

In other words, the successful CS entrants were the ones who, before learning any CS, responded to the test questions by constructing a generative model of assignment in their heads. The failures did not. Whether the model was correct or incorrect, it was the predisposition to seek a generative theory that predicted success.

I have not linked to Reges’s original post for a reason. It’s because Reges was using his argument about “CS IQ” as a stepping stone to a more contentious theory, which I’ll examine in a future post. In the meantime, Reges’s theory has a consequence; it is, itself, generative. It is exactly the consequence that the authors of The camel has two humps actually observed; successful CS students are those who, given a set of facts, will instinctively seek a consistent generative model to connect them.

I’m going to coin a term. Successful CS students are “model-seekers”. Reges’s “CS IQ” is a measure of model-seeking tendency.

UPDATE: Stuart Reges has not disappeared; he’s now a Senior Lecturer in Computer Science and Engineering at the University of Washington.

UPDATE**2: Reges himself informs me that the authors of The camel has two humps have now withdrawn their strongest claims. More experimental results refuted the theory that model-seeking is the only predictor of success, but they still believe it is an important factor and are looking for possible confounding variables.

34 thoughts on “Generative models and programming talent”

I suppose making mental models is the key to success in nearly every field, not just CS. I read that Nikola Tesla could leave a machine running in his head for weeks, then come back and check it for wear.

I remember seeing the paper you linked to, and I thought that if I was able to make it over assignment statements, I must be doing pretty well as an amateur programmer.

>I suppose making mental models is the key to success in nearly every field, not just CS

Possibly not. I don’t see a lot of use in it for musicians, for example. And I am one! I don’t need to know the physics of a vibrating air column to play a flute. I don’t need a generative model of music, either, though something not unlike one is helpful when I’m composing.

>Possibly not. I don’t see a lot of use in it for musicians, for example.

Brian Wilson supposedly knew the sound of every instrument, plus he claimed to hear vocal arrangements in his head. I don’t think he could have created something as brilliant as “Pet Sounds” without a generative model of music.

Not specific to this topic, but general to your blog: are you aware that whatever you’re doing with referrer headers is completely broken? When I come here from Google Reader, all I get is php error messages. If I click in the location bar and hit enter (i.e. reloading via the URL alone) it works fine.

>Brian Wilson supposedly knew the sound of every instrument, plus he claimed to hear vocal arrangements in his head.

Oh, heck, I can do those things. And no, I’m not saying they’re trivial; they’re not, I have exceptional musical talent and I’m aware of that. Though self-taught, I’ve been a sideman on two albums and a moderately successful composer, even though I’ve given music comparatively little of my energy. (Thus, my skills don’t come anywhere near matching my talent.)

Precisely because I do have strong musical talent, and I’m also a programmer, I know that the generative model I have of programming is a great deal more conscious and detailed than whatever near-functional-equivalent I have for music is. I don’t reason causally about music; I don’t have to. What I do when I improvise or compose is more like paying attention to a pattern generator in my head.

I’ve dabbled in just enough visual arts (I helped design the cover for my last book) to believe something similar holds for them as well. That is, they’re more about paying a special kind of attention than they are about causal reasoning.

I think artists in general are like this. Much less conscious and deductive than programmers are, less aware of their creative process, less invested in causal reasoning. Because they don’t have to be.

Getting back to programming, I once composed and debugged a script in my head while walking home from a bus stop. When I finally typed it into my computer, it actually worked. It was a trivial program but I suppose it would be a better example of a generative model than music.

Would there be certain personality archetypes that naturally “fit” into programming better than others? For example, would an INTJ have an advantage over, say, an ENFP type? If so, how would the latter compensate for the disadvantage?

>It is pretty well known that programming heavily concentrates *NT* types, especially INTs.

I’m an INTJ, and I find the concepts come fairly easily to me. I grokked Unix Zen almost immediately as well. I suppose NT types are more comfortable with abstract thinking, which would be a boon for programming.

Maybe you should add some kind of short quiz similar to the one in the Reges paper to your “How To Become a Hacker” document.

I think Eric should build a quiz tailored to the traits defined in the “How To Become A Hacker” document. The results would reveal which areas are weak and need focus in order to attain the “hacker” mindset.

I believe that Bach and Beethoven would both be examples of musicians who exhibit generative models. Leonard Bernstein as well …

Regarding generative models and programming in general, I can tell you that on large cross-functional teams, data modelers and information modelers tend to show the weakest generative models for dealing with the consequences of decisions, while architects tend to show the strongest. This is actually a huge source of friction on multi-role teams.

Is it possible that there is a correlation between good programmers (if they are indeed model-seekers) and people with Asperger traits, considering that the latter must deliberately seek a model of human behavior?

Woz (of hardware genius Apple fame (eg the apple II floppy controller)) told us the story in oxford last year of how he was forced to learn and do all his early designs in his head because he didn’t have the kit. and pretty quickly he started optimising, then every design became a challenge to do it with the absolute minimum number of parts (he always challenged himself to improve the core design (elegance) so that he could take a part out (my pilot father’s favourite quote: harry hawker’s old maxim: “simplify, and add lightness!”)).
and that he believes that that skill –of solving problems and optimising designs purely in his head– was what distinguished him from other hardware designers and was what drove the raved-about elegance of his hardware designs.

>I was incredulous when I read this observation from Reginald Braithwaite:

>Like me, the author is having trouble with the fact that 199 out of 200 applicants for every programming job can’t write code at all. I repeat: they can’t write any code whatsoever.
…
>After a fair bit of trial and error I’ve discovered that people who struggle to code don’t just struggle on big problems, or even smallish problems (i.e. write an implementation of a linked list). They struggle with tiny problems.

don’t call it “excel”. call it a declarative 4GL with embedded database and 3GL plugin compatibility (including user macros).
it remains the fastest prototyping tool ever invented for codifying poorly-defined and/or “messy” math-based problems, and the bulk of the modern financial markets’ cutting-edge depends upon it utterly for precisely this reason.

like “maths”, much of people’s ability to grok programming depends on their early teachers. i’ve taught both, and been deeply angered by the previous teachers’ incompetence implied in the radical improvements i’ve achieved in my students in disturbingly short periods of time. in one classic case, an entire subject’s entire semester’s exam results were initially rejected by the university senate as implausible.

It is cross platform (Mac and PC). For very simple things, it’s even cross platform to Linux/Unix with OO Calc. I still find I regularly do things with Excel (no macros, no external file references) that cause OO Calc to either lock up or go so slowly that it’s impossible to use.

It seems to me that all this post says is “programming is science, you gotta think like a scientist” – or am I oversimplifying?

Science pretty much means understanding causality in the world. For a while the natural sciences could progress on practical observation alone, but somewhere around the time of Galileo and Newton they had to adopt abstract models; they just couldn’t progress otherwise. Understanding causality via abstract models – what exactly is new about that? I don’t think there is a big difference between the way a physicist thinks and the way a programmer ought to.

The main problem is that this is misrepresented in the education system – schools believe physics is for the intellectual elite but that they can teach any random guy to be a useful Java coder. Which they can’t. All they need to do is mean it when they say “Computer Science”, and the problem is solved. Or am I oversimplifying?

@shenpen: i think you’re missing eric’s key point, which is that there are some people who, in your words, CAN’T “think like a scientist”. ie, that there is a fundamental difference between categories of humans which is relevant to the ability to program.

an unstated hence possibly unobvious implication of my comment re education was that there are a lot of people who apparently can’t, but can in fact develop the ability very rapidly if they are taught differently.

ie, there are those that CAN, those who CAN’T, and great grey mass in-between who tend to be taught that they can’t.

@ken: truly excel is the dog’s bollocks. i’ve been bending it since v.1 (versions 5 and ’98 are the best) and it is a wonderful example of the “principle of least surprise”. other spreadsheets are different technical implementations of the overall concept and i haven’t come across a single one which i regard as clean and hence usable for more than trivial stuff. excel’s technical implementation retains its core elegance no matter how far you push it: like python, you handle the problem, not the tool’s exceptions.

for example, in the original macro language, tied closely to the core concepts of the spreadsheet, metaprogramming was rendered equivalently effortless to python’s due to just 1 or 2 deceptively simple-appearing design decisions. lispy!

@Saltation: yes, there is a fundamental difference between the people who can understand causality in abstract models and those who cannot. The latter type cannot become competent programmers, nor competent physicists, nor competent molecular biologists, etc.

In my experience, people in the physical sciences tend to gravitate toward ugly number-crunching languages like FORTRAN or Matlab and never appreciate the beauty of CS. They’re no good at designing systems larger than a couple KLOC. However, they routinely manage to pick up basic programming by osmosis, with little if any tutoring.

Daniel Franke, there are thankfully some open-source inroads being made into the hard sciences, with free numeric and mathematical software like NumPy, SAGE, and Maxima. Or at least the punk undergrad kids are talking loudly about and promoting these things, and that could be a big first step towards REVOLUTION. I don’t think FORTRAN will ever, ever die, though, and it’s for the same reason Java is proving unkillable in the enterprise space: libraries, libraries, libraries. And speed, but mainly libraries, because you can write some kick-ass C++ these days.

>Is it possible that there exists a correlation between good programmers, if they’re determined to be composed of model-seekers and people with Asperger traits, considering that they are seeking a model of human behavior?

If so, I’m a counterexample or outlier. I’m certainly in the top 5% of ability, yet the test for Asperger’s traits I took a few posts back puts me so close to zero on that scale that my score is probably in the statistical-noise range.

I’ve seen references to studies that claimed programming talent strongly corresponded to short-term memory capacity – the theory proposed was that this puts an upper limit on the complexity of a component that can be modeled (and reduced to interfaces to be used as components in higher-level models, and so on).

That was some years ago, though; I don’t think I ever saw the studies themselves, even assuming I’m remembering the summaries correctly, or that the summaries themselves were correct…

On Asperger and programming: I suspect there is a different kind of correlation. There are the “hard” programmers, the types who enjoy learning math and hard sciences as a hobby, the analytical thinkers who take logical deduction and proofs and details very seriously, who love hard problems, the type ESR or the xkcd guy belong to, and they don’t tend to be Aspies.

I think we Aspies – I’m strongly Aspie – tend to be the “soft” programmers: the synthetic thinkers who often have an intuitive solution first and then look for a logical proof (if needed); who enjoy learning about history, for example, and abhor math; who like to look at the big picture rather than the details; who treat programming as a form of art rather than science and enjoy tackling easy problems either very quickly or very beautifully; who don’t mind solving the same problem a thousand times and rather take pride in solving it quicker and more beautifully each time; and who don’t really like hard problems, because it’s unlikely you’d come up with a perfect, optimal, beautiful solution on the first try, and anything not 100% perfect is a shame, etc. etc.

Shenpen, I think you’re completely off base. I’ve never met a programmer of even passable quality who dislikes math or solving hard problems. As for the rest of your description of the “soft” type, it is in no way mutually exclusive with the “hard” characteristics. I don’t understand your basis for classifying either ESR or Randall Munroe as “hard”, nor is it clear to me which category I, a diagnosed high-functioning autistic, should fall into.

Well I am a programmer of I guess passable quality (I submitted this: http://code.activestate.com/recipes/534150/ ) (though I never claimed to be and don’t even intend to be a hacker, that culture is just too “fact-junkie”, too hung up on details for me) and never liked hard problems or math.

The difference is between the analytical, scientific kind of thinking (details, proofs, math, sci-fi fandom, etc.) and the synthetic, artistic kind of thinking (intuitive, big-picture, spotting large trends or directions, spotting an error because it doesn’t fit the “flow” or “rhythm” or “picture” of the code as if it were a poem, fantasy-literature fandom, history, etc.).

The analytical approach works very well in scientific and technical computing (from simulations to device drivers) or if you consider yourself a general programmer who can program anything, while the synthetic approach is useful if you are a “multi-class” programmer: your primary goal is to be an expert of a given domain (say, manufacturing & warehouse processes, medicine or railroad traffic control) and just design the rules, the big picture of such a system expressed in code (in my case, ERP customization).

I admit that the association of the synthetic type with Aspies might be an overgeneralization. However, this is my general experience: the most introverted guys in high school weren’t the math & science freaks – those were rather the playful type, like Feynman, or I guess ESR. The most introverted types were us synthetic types, who geeked out on history, rhetorical rather than logical philosophy (Nietzsche, not Wittgenstein), fantasy rather than SF, and RPGs and strategy computer games, and got into programming only later on.

There are many exceptions. The guy I know who is both the most autistic and the smartest programs COBOL in a bank, which supports my case; but OTOH he is a math genius and plays synthesizers, which puts him more into the hard/hacker category… And the most interesting synthetic thinker of the last century, Michael Oakeshott, clearly wasn’t an Aspie. So there might indeed be no connection.

In “Programming Pearls”, Jon Bentley has, as a useful pons asinorum, the implementation of a binary search algorithm (i.e. given a sorted list of data and a candidate element, find whether the candidate is in the list and, if so, at which offset). Programmers are given a couple of hours to code the problem in a high-level language of their choice, and then thirty minutes for testing. 90% of “professional programmers” fail. The trick comes in seeing what the algorithm MUST do, and engineering it so that it cannot do otherwise (in this case, the size of the range spanned by the upper and lower sentinel variables strictly decreases every time round the loop). Seeing things like this is a learnable skill to an extent, but I believe there is an innate aptitude in play as well.
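For reference, here is a sketch of the invariant-driven solution Bentley has in mind (my own Python, not Bentley’s code): the half-open range [lo, hi) always contains the target if the target is present at all, and the range strictly shrinks on every iteration, which is exactly the termination argument described above.

```python
def binary_search(sorted_list, target):
    """Return the offset of target in sorted_list, or -1 if absent.

    Loop invariant: if target is in the list, its index lies in the
    half-open range [lo, hi).  The range strictly shrinks every
    iteration, so the loop must terminate."""
    lo, hi = 0, len(sorted_list)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorted_list[mid] < target:
            lo = mid + 1      # target, if present, is right of mid
        elif sorted_list[mid] > target:
            hi = mid          # target, if present, is left of mid
        else:
            return mid        # found it
    return -1                 # range is empty: target not in list
```

The classic failure modes (infinite loop when `lo = mid`, off-by-one when `hi = mid - 1` with a half-open range) are precisely the mistakes that reasoning from the invariant rules out.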

I used to make a living from having a fairly detailed generative model of the Oracle SQL engine – I could look at a query for a few seconds and describe in some detail how it would be processed, and how that processing could be improved.

Now I’m trying to move into a different industry, management consulting. A standard interview format for this industry is the case interview. In a case interview, the interviewee is presented with some background details and a business question (“how can we improve profitability?” / “should company A buy company B?” / “should we launch product C?” &c.)

I’ve listened to good interview performances and bad, and I’ve come to the tentative conclusion that, alongside some style considerations, interviewers are mostly looking for a generative model of business. They want people who understand how all of the pieces fit together, and which levers can be pulled to make which changes.

Sadly, Googling for ‘generative theory of business’ doesn’t bring back anything useful. I wonder if anyone has ever worked on it before?

More to the point, for me at least, is the question of how one builds a generative model. Practice must surely be a part of it. What else? Does anyone have any suggestions as to how I should practice?