Thursday, November 13, 2014

The gene delusion

Metaphors in science are notoriously slippery, but biologists seem particularly poorly attuned to the implications of theirs. The tenacity of the misleading “genes for” picture is one of their legacies.

You might think it’s sheer bad luck to be struck by lightning. But some of us are cursed with a struck-by-lightning (SBL) gene. Sure, as with many genetic conditions, if you have the SBL gene it doesn’t mean you will be struck by lightning, just that your chances are higher (here by a factor of about three or four) than those without it. But that seems a fairly big risk factor to me – and I should know, because I’ve got the gene.

Yet no one is working on a genetic remedy. Scandalous? Not really, because SBL can be identified as the gene better known as SRY, the sex-determining gene on the Y chromosome, which makes an embryo develop into a male. Yes, men get hit by lightning more often, because their behaviour – rushing about on golf courses and football pitches in the rain, that sort of thing – makes it more likely. Call it stereotyping all you like: the statistics don’t lie.

Geneticist Steve Jones has used this example to point to the absurdity of the concept of a “gene for”. If we knew nothing else about what SRY does, and it fell out of a statistical search for genetic associations with being hit by lightning, we might indeed conclude that it warrants the label SBL. But the association with lightning strikes is merely a side-product of the way the gene’s effects play out in a particular environment. SRY could equally be misattributed as a gene for criminality, murder, baldness, watching Top Gear.
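The logic of the SBL example can be made concrete with a toy calculation. The Python sketch below is purely illustrative: every probability in it is invented (none comes from Jones or from any dataset), chosen only so that the overall risk ratio lands in the "three or four" range mentioned above. In this hypothetical model the gene affects how often people are outdoors in a storm, and only being outdoors affects the chance of a strike.

```python
# Toy model of the "SBL gene": the gene changes behaviour (exposure),
# and only exposure changes the chance of a lightning strike.
# ASSUMPTION: every number here is made up for illustration.

P_EXPOSED = {"carrier": 0.35, "non_carrier": 0.10}  # P(outdoors in a storm | group)
P_STRIKE_IF_EXPOSED = 1e-5    # same for everyone who is outdoors
P_STRIKE_IF_SHELTERED = 0.0   # same for everyone who stays indoors


def strike_risk(group: str) -> float:
    """Overall strike risk for a group, averaging over behaviour."""
    p_out = P_EXPOSED[group]
    return p_out * P_STRIKE_IF_EXPOSED + (1 - p_out) * P_STRIKE_IF_SHELTERED


def strike_risk_given_exposure(group: str, exposed: bool) -> float:
    """Strike risk once behaviour is fixed: the group no longer matters."""
    return P_STRIKE_IF_EXPOSED if exposed else P_STRIKE_IF_SHELTERED


# Naive association study: compare carriers with non-carriers.
naive_ratio = strike_risk("carrier") / strike_risk("non_carrier")

# Same comparison among people who behave identically (all outdoors).
matched_ratio = (strike_risk_given_exposure("carrier", True)
                 / strike_risk_given_exposure("non_carrier", True))

print(f"risk ratio, carriers vs non-carriers: {naive_ratio:.1f}")
print(f"risk ratio among the equally exposed: {matched_ratio:.1f}")
```

With these made-up inputs the naive comparison gives a ratio of about 3.5, while the comparison among equally exposed people gives exactly 1. A statistical search would flag the gene, yet in the model it does nothing to lightning risk except through behaviour.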

“The most dangerous word in genetics is ‘for’”, Jones has said. “Only fifteen years ago people expected that they would find genes for cancer, heart disease or diabetes. But medicine’s big secret is that we haven’t found them. And we haven’t found them because they are not there.” Compare that with Bill Clinton promising, next to smiling scientists in 2000, that the decoding of the human genome means “doctors increasingly will be able to cure diseases like Alzheimer's, Parkinson's, diabetes and cancer by attacking their genetic roots.”

What does this mean for the much vaunted age of “personalized medicine” – of health care tailored to our individual genome, which can now be decoded for a few thousand dollars and might soon be as common a feature as blood group and cholesterol index on everyone’s health records? The answer is complicated. Genetic data do reveal a lot about our inherent predispositions to certain medical conditions. But that doesn’t necessarily mean we have the “genes for” those conditions in any meaningful sense – genes that can be considered to lie at the “roots”.

The tendency to assign genes the responsibility for well defined personal attributes doesn’t just muddy the waters of post-genomic medicine. It distorts the whole public discourse around genetics, and arguably around the way genomes are shaped by natural selection. And it takes us down some dark avenues, from the notorious history of eugenics to the recurring minefield of how genes are influenced by race. The furore over the views expressed by former New York Times science reporter Nicholas Wade in his book A Troublesome Inheritance: Genes, Race and Human History is just the latest skirmish in this ongoing saga. Wade suggests that differences in social behaviour and characteristics among human societies may be genetically encoded. It’s an old argument, although expressed less crudely than in the anthropology of yore: the intellectual and economic hegemony of Western culture is down to innate biological differences. Scientists have lined up to savage Wade’s book, but the contentious questions it asks – are differences in, say, intelligence, rationality and social cohesion down to our genes? – won’t go away. Nor should they – but we’re not going to make much headway with them until we get to grips with the distinctions between what genes do and what genes are “for”.

Born that way

Geneticists now gnash their teeth at the bad journalism that proclaims the discovery of a “gene for”. But the burden of guilt for this trope lies with the research community itself. It’s not hard to find both implicit and explicit references to “genes for” in the literature or pronouncements of biologists. They are not always as ill-judged as DNA pioneer James Watson’s suggestion that genetic testing for “gay genes” could offer a woman the opportunity to abort a child that carried them. But the implication that traits such as personality and susceptibility to disease are necessarily determined by one or a few genes permeates the field. Without that sort of functional autonomy, for example, it is hard to see how the notion of selfish genes can be coherent. References to blueprints, lists of parts and instruction manuals during the Human Genome Project carried the same baggage.

It’s understandable how this habit began. As the modern era of genetics dawned and it became possible to probe the effects of particular genes by mutating, adding or silencing them (the latter being called “knockout” experiments) in flies, mice and other laboratory animals, researchers began to find clear links between the presence or absence of a gene variant – for brevity I’ll follow the sloppy convention and just say “gene” – in an organism’s genome and certain traits of the whole organism. Surely it stands to reason that, if you see a particular trait in the presence of a gene but not in its absence, that gene is in some sense a gene “for” the trait?

Well, yes and no. So-called coding genes contain the instructions for making particular proteins: enzymes that comprise the biomolecular machinery, and protein fabrics of the body. That’s the only thing they are really “for”. Spiders have a “gene for silk”; humans have a “gene for digesting the milk sugar lactose”. Mutations of these genes can be responsible for inheritable conditions.

But the lack or malfunction of a particular enzyme due to a genetic mutation can have complex knock-on effects in the body. What’s more, most genes are non-coding: they don’t encode proteins, but instead regulate the activity of other genes, creating complex networks of gene interactions. Most human traits arise out of this network, which blurs any “genes for” picture. As the spurious “SBL gene” shows, it’s then wrong to infer causation from correlation. That’s not just a difficulty of finding the right genes within the network. For some traits, even if they are genetically encoded it can be inappropriate to talk of causative mechanisms and explanations at the genetic level.

Indeed, gene knockout studies tended to undermine the “gene for” picture more than they confirmed it. Time and again geneticists would find that, if they knocked out a gene apparently “for” a feature indispensable to an organism’s vitality, the organism hardly seemed to bat an eyelid. We now know that this is at least partly because of the immense complexity of gene networks, which have redundancy built in. If there’s a failure in one part of the network then, just as with closing a station on the London Underground, there may be an alternative route to the same goal.

Nothing here would surprise engineers. They know that such redundancy and failsafe mechanisms are an essential part of the robustness of any complex system, whether it is a chemicals plant or a computer. There is nothing that need have surprised geneticists either, who have known since the 1960s that genes work in self-regulating networks. All the same, I sat through countless editorial meetings at Nature in the early 1990s in which a newly accepted paper would be described breathlessly as showing that a gene thought to do this had now been shown to do that too. The language remained resolutely that of “genes for”: such genes were just multi-tasking.

One of the most notorious episodes of “genes for” from that period was a 1993 study by a team of geneticists at the US National Cancer Institute, who published in the premier journal Science the claim that with “99.5% certainty there is a gene (or genes) in [a particular] area of the X chromosome that predisposes a male to become a heterosexual” – in other words, in effect a “gay gene”.

Anyone interested in genes was already primed to accept that idea. Geneticists had been talking about a genetic basis for homosexuality since the 1970s, and in his 1982 book The Extended Phenotype Richard Dawkins used the possibility (“for the sake of argument”) to explore the notion of how a gene might exert different effects in different environments. For Dawkins, this environmental influence shows only that we must recognize a contingency about what a gene is “for”, not that the whole idea of it being “for” a particular trait or behaviour may be meaningless.

This complexity in the emerging view of what genes do is tellingly, and perhaps inadvertently, captured in Matt Ridley’s book Genome, published in 1999 as the completion of the Human Genome Project was about to be announced. Ridley offered little portraits of inheritable traits associated with each of the 23 human chromosomes. He began with a confident description of how the gene associated with Huntington’s chorea was tracked down. Here, surely, is a “gene for” – if you are unlucky enough to have the particular mutation, you’ll develop the disease.

But then Ridley gets to asthma, intelligence, homosexuality and “novelty-seeking”. All do seem to have an inherited component. “In the late 1980s, off went various groups of scientists in confident pursuit of the ‘asthma gene’”, Ridley writes. By 1998 they had found not one, but fifteen. Today some researchers admit that hundreds might be involved. In the other cases, Ridley admitted, the jury is still out. But it’s not any more: today, all the candidate “genes for” he mentioned in relation to intelligence, homosexuality and novelty-seeking have been ruled out. Isn’t this odd? There was Ridley, an (unusually well informed) science writer, declaring the futility of quests for specific genes “for” complex personality traits, yet finding himself compelled to report on geneticists’ efforts to find them. So who was to blame?

Intelligence tests

On genes for intelligence, Ridley mentioned the work of Robert Plomin, who in 1998 reported an association between IQ and a gene called IGF2R. The fact that the gene was known to encode a protein responsible for a very routine and mundane cell function might have been a clue that the connection was at best indirect. That the gene had previously been associated with liver cancer might have been another. Still, Ridley said, we’ll have to see. In 2002 we saw: Plomin and others reported (to scant press attention) that they had not been able to replicate the association of IGF2R with IQ. “It doesn’t look like that has panned out,” he said in 2008.

“Anybody who gets evidence of a link between a disease and a gene has a duty to report it”, Ridley wrote. “If it proves an illusion, little harm is done.” Isn’t that just the way science works, after all? Surely – but whether innocent errors and false trails cause harm depends very much on how they are reported. Studies like Plomin’s are well motivated and valuable, and he has deplored the “genes for” picture himself. But there’s little hope that this research will avoid such associations unless biologists can do a better job of correcting the deeply misleading narrative that exists about what genes do, which has flourished amidst their often complacent attitude towards explaining it.

If you want to see the hazards of illusory gene associations, take the recent claim by Michael Gove’s education adviser Dominic Cummings that findings on the inherited, innate aspect of intelligence (in particular the work of Plomin) are being ignored. For a start, the very mention of genetics seemed to send rational argument out of the window. Some on the left sucked their teeth and muttered darkly about eugenics, or declared the idea “incendiary” and outrageous without bothering to explain why. That’s why Jill Boucher, writing in Prospect, had a point when she excoriated the “politically correct” attacks on Cummings’ comments. But unless Boucher can point to an educationalist or teacher who denies that children differ in their innate abilities, or who regards them all as potential Nobel laureates, she is erecting something of a straw man.

A real problem with Cummings’ comments was not that they attribute some of our characteristics to our genes but that they gave the impression of genetics as a fait accompli – if you don’t have the right genes, nothing much will help. This goes against the now accepted consensus that genes exert their effects in interaction with their environment. And the precise extent of inheritability is unclear. While IQ is often quoted as being about 50% inheritable, there is some evidence that the association with genetics is much weaker in children from poor backgrounds: that good genes won’t help you much if the circumstances are against you. (This finding is seemingly robust in the US, but not in Europe, where social inequalities might not be pronounced enough to produce the effect.)

Nonetheless, there’s nothing wrong in principle with Cummings’ suggestion that research to identify “high IQ” genes should be encouraged. But if he were to look a little more deeply into what it has already discovered (and sometimes un-discovered again), he might wonder what it offers education policy. A 2012 study pointed out that most previous claims of an association between intelligence and specific genes don’t stand up to scrutiny. Nor is there much encouragement from ones that do. In September an international consortium led by Daniel Benjamin of Cornell University in New York reported on a search for genes linked to cognitive ability using a new statistical method that overcomes the weaknesses of traditional surveys. The method cross-checks such putative associations against a “proxy phenotype” – a trait that can ‘stand in’ for the one being probed. In this case the proxy for cognitive performance was the number of years that the tens of thousands of test subjects spent in education.

From several intelligence-linked genes claimed in previous work, only three survived this scrutiny. More to the point, those three were able to account for only a tiny fraction of the inheritable differences in IQ. Someone blessed with two copies of all three of the “favourable” gene variants could expect a boost of just 1.8 IQ points relative to someone with none of these variants. As the authors themselves admitted, the three gene variants are “not useful for predicting any particular individual’s performance because the effect sizes are far too small”.
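To see how small that is, here is a back-of-envelope sketch in Python. It rests on two assumptions that are mine rather than the study's as reported here: that the effects simply add up, and that the three variants contribute equally. The 1.8-point total is the only number taken from the text; the function name `predicted_boost` is hypothetical.

```python
# Back-of-envelope additive model for the three "favourable" variants.
# From the article: two copies of all three variants -> about 1.8 IQ points.
# ASSUMPTIONS (not from the article): effects add linearly, and each
# variant has the same per-allele effect.

TOTAL_BOOST = 1.8       # IQ points for the maximum of 6 favourable alleles
N_VARIANTS = 3
MAX_COPIES = 2          # one copy inherited from each parent

PER_ALLELE = TOTAL_BOOST / (N_VARIANTS * MAX_COPIES)


def predicted_boost(copies):
    """Sum the per-allele effects.

    copies: one count (0, 1 or 2) of the favourable allele per variant.
    """
    if len(copies) != N_VARIANTS or any(c not in (0, 1, 2) for c in copies):
        raise ValueError("expected three allele counts, each 0, 1 or 2")
    return sum(c * PER_ALLELE for c in copies)


print(f"per-allele effect: {PER_ALLELE:.2f} IQ points")
print(f"best case (2, 2, 2): {predicted_boost([2, 2, 2]):.1f} points")
print(f"mixed case (1, 0, 2): {predicted_boost([1, 0, 2]):.1f} points")
```

Under these assumptions each favourable allele is worth roughly 0.3 of a point, and a person with a mixed set of alleles gains less than a single point.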

Perhaps, then, the media would be best advised not to call these “IQ genes”. But you could forgive them for doing so, for they’d only have been echoing one of the paper’s authors, the influential cognitive scientist Steven Pinker of Harvard University. The proper response to a study showing that most existing candidates for gene-intelligence associations were wrong, and that the few that weren’t contribute almost negligibly to inheritability, surely isn’t “Here they are at last”, but “Jesus, is this all there is?”

Where, then, is the remainder of the inherited component? It must presumably reside among a host of genes whose effects are too subtle to be detected by current methods. Those genes will surely be involved in other physiological functions, their effects in intelligence being highly indirect. They are in no meaningful sense “genes for intelligence”, any more than SRY is a gene for being struck by lightning.

So it’s not clear, pace Cummings, what this kind of study adds to the conventional view that some kids are more academically able than others. It’s not clear why it should alter the goal of helping all children achieve what they can, to the best of their ability. Such findings offer very dim prospects for Plomin’s hope, laudable in principle, that education might be tailored to the strengths and weaknesses of individual pupils’ genetic endowment.

Race matters

So, then, to Wade’s claims that genetics causes racial differences in traits such as the propensity for violence or the organization of social institutions. As Wade’s book has shown, the issue of race and genes remains as tendentious as ever. On the one hand, of the total genetic variation between random individuals, around 90% is already present in populations on a single continent – Asia, say – and only 10% more would accrue from pooling Europeans, Africans and Asians together. Some biologists argue that this makes the notion of race biologically meaningless. Yet ancestry does leave an imprint in our genomes: for example, lactose intolerance is more common in Africa and Asia, sickle-cell anemia in people of African origin, and cystic fibrosis in white northern Europeans. That’s why the concept of race is useful as a proxy for medical risk assessment and diagnosis. Besides, arguments about statistical clusters of gene variation don’t alter the fact that culturally conventional indicators of race – pigmentation and eye shape, say – are genetically determined.

What you choose to emphasize and ignore in this matter is largely a question of ideology, not science. But arguments like those Wade puts forward draw their strength from the simplistic notions of how genes relate to phenotype. We know that what we can, in this case, reasonably call cystic-fibrosis or sickle-cell genes (because the conditions derive from a single gene mutation) differ in incidence among racial groups. We also know that genetic variation, while gradual, is not geographically uniform. Might it not be that those variations could encompass genes for intelligence, say?

Yet if the genetic constitution of such traits is really so dispersed, this is a little like grabbing a hundred Scrabble tiles from some huge pile and expecting them to spell out this sentence. Ah, but such random grabs are then filtered into meaningful configurations by natural selection, Wade argues: genes producing a predisposition to capitalism or tribalism might be more useful in some populations than others. Setting aside the improbability of those particular genes existing in the first place, this idea relies on the assumption that every inheritable trait can be selected for, because it stems from genes “for” that trait. That’s precisely the fallacy that once supported eugenic arguments for the betterment of the human race: that we can breed out genes for criminality, stupidity, mendacity.

While it has been reassuring to watch Wade’s thesis be comprehensively dismantled by scientists and other knowledgeable commentators, it’s hard not to contrast their response with that to James Watson’s claim in 2007 that the idea that all races share “equal powers of reason” is a delusion. Despite the fact that Watson adduced as “evidence” only the alleged experience of “people who have to deal with black employees”, he was defended as the victim of a witch-hunt by an “illiberal and intolerant thought police”. Even though it is hard to disentangle genuine prejudice from habitual liberal-baiting in Watson’s remarks, all we are really seeing here is one natural endpoint of the “genes for” and “instruction book” mentality underpinning the Human Genome Project that Watson helped establish and initially led.

The dark genome

The dispersed, “polygenic” nature of inheritable intelligence is likely to be the norm in genetics, at least for many traits we care about. Much the same applies to many inheritable medical conditions, such as schizophrenia and multiple sclerosis: like asthma, they seem to arise from the action of many, perhaps even hundreds, of genes, and there’s not one gene, or even a small number, that can be identified as the main culprits. This “missing heritability”, sometimes called the “dark matter” of the genome, is one of the biggest challenges to the promised personalized medicine of the post-genome era. But it should also be seen as challenging our understanding of genetics per se. Jones, who has been energetic about puncturing the worst misunderstandings of the “genes for” picture, admits that he wouldn’t now attempt to explain how genetics really works, in a manner akin to his brilliant The Language of the Genes (1994), because the field has got so damned complicated.

Yet the linguistic analogy – with genes as words and genomes as books – might remain a serviceable one, if only it were taken more seriously. Combing the genome for genes for many (not all) complex traits seems a little like analyzing Hamlet to look for the words in which Hamlet’s indecision resides. Sure, there’s a lot riding on the cluster “To be or not to be”, but excise it and his wavering persists. Meanwhile, “to” does a lot of other work in the play, and is in no meaningful sense an “indecisive Hamlet” word.

The irony is that a study like the latest “IQ genes” report, while showing yet again the inadequacy of the “gene for” picture, is likely to perpetuate it. As Jones has pointed out, such work has the unfortunate side-effect of feeding our fascination with the putative genetic basis of social problems such as discrimination or differences in educational achievement, about which we can do rather little, while distracting us from the often more significant socioeconomic causes, about which we could do a great deal.

3 comments:

[ Sorry, I originally submitted this on your follow up blog post, but it addresses the article above. ]

Hi Philip,

This is indeed a touchy subject. Given your interest, you might find this overview of use: http://arxiv.org/abs/1408.3421

"Gene for" may be OK terminology for Mendelian traits that are controlled by something akin to an on/off switch. But for polygenic traits (also known as quantitative traits), such as height or intelligence, one has to talk about "causal variants" with specific effect sizes. As a former physicist, you might find it striking (as do I) that most of the variation in polygenic traits tends to be linear (additive). That is, to first approximation one can simply sum up individual effects to get a prediction for the phenotype value. Thus, we can talk about individual variants that (on average) increase height or intelligence by a certain fraction of a cm or even IQ point. There are deep evolutionary reasons for this (see section 3 in the paper linked above), as well as quite a bit of experimental evidence.

Finding causal variants is a matter of sample size, or statistical power. At present, almost 1000 such variants are known for height, accounting for about 20% of population variation. Available genomic datasets for which we have IQ scores are much smaller, hence the more limited results. I expect that as sample sizes continue to grow we will eventually capture most of the heritability for these traits; see section 4 and figure 13 in the paper. (Note heritability by itself can be estimated using different methods, from much smaller sample sizes.)

While I think the overall argument is well made, I want to take issue with one aspect. You say: "So, then, to Wade’s claims that genetics causes racial differences in traits such as the propensity for violence or the organization of social institutions. As Wade’s book has shown, the issue of race and genes remains as tendentious as ever. On the one hand, of the total genetic variation between random individuals, around 90% is already present in populations on a single continent – Asia, say – and only 10% more would accrue from pooling Europeans, Africans and Asians together. Some biologists argue that this makes the notion of race biologically meaningless. Yet ancestry does leave an imprint in our genomes: for example, lactose intolerance is more common in Africa and Asia, sickle-cell anemia in people of African origin, and cystic fibrosis in white northern Europeans. That’s why the concept of race is useful as a proxy for medical risk assessment and diagnosis. Besides, arguments about statistical clusters of gene variation don’t alter the fact that culturally conventional indicators of race – pigmentation and eye shape, say – are genetically determined."

The idea that this constitutes "race" in any "culturally conventional" sense is both mistaken and dangerous. Jerry Coyne, one of the biologists I most respect, accepts this notion even as he goes on to say that of course by any of the current "scientific" ideas, there could be anywhere from 5 to 33 "races". Ascribing scientific value to such an indeterminate concept strikes me as mistaken. (Coyne has a number of entries on his blog about the Wade book where he also addresses the scientific validity of the notion of "race"; H. Allen Orr's article on the book is similarly toned.) I think the use of race in this way is also dangerous because it validates race as a "usable" notion, even though race is conceptually inextricable from hierarchical divisions of capability, intelligence, "civility", maturity, moral integrity, etc. I understand the desire to recognize that some phenotypic distinctions are genetically grounded in more or less distinct groupings within the human species as a whole, but that the groupings shift dramatically depending on what one is looking at would seem by itself to militate for a different nomenclature, unless you would suggest that the natural sciences are incapable of establishing their own, unconventional and yet more accurate nomenclature. Especially when what scientists are proposing is a way of comprehending and predicting medically-relevant differences in human biology that are not intended to indicate a hierarchical division within the human species.

In relation to this, I would note that the last sentence of the paragraph takes for granted, naturalizes if you will, "culturally conventional indicators of race", something that ought to be avoided at all costs given the hierarchical and discriminatory "culturally conventional" meanings and politics of race. To put it another way, naturalizing historical or cultural concepts like race is virtually a definition of ideology and to do so is not scientific, but the very opposite of science.

Finally, I would add that pigmentation in Africa is so diverse and covers such a range that it overlaps dramatically with a variety of other "races" (sub-continental Indians, themselves diverse in pigmentation; Australian aboriginals; Sicilians; and so on), as to be meaningless scientifically as a "racial indicator" and the idea of pigmentation as a "culturally conventional indicator of race" is logically and empirically incoherent.