Archive for May, 2010

Hopefully by now the image to the left is familiar to you. It’s from a paper in Human Genetics, Self-reported ethnicity, genetic structure and the impact of population stratification in a multiethnic study. The paper is interesting in and of itself, as it combines a wide set of populations and puts the focus on the extent of disjunction between self-identified ethnic identity and the population clusters which fall out of patterns of genetic variation. In particular, the authors note that the “Native Hawaiian” identification in Hawaii is characterized by a great deal of admixture, and within their sample only ~50% of the ancestral contribution within this population was Polynesian (the balance split between European and Asian). The figure suggests that subjective self-assessment of ancestral quanta is generally accurate, though there is a non-trivial number of outliers. Dienekes points out that the same dynamic holds (less dramatically) for the European and Japanese populations within their data set.

All well and good. And I like these sorts of charts because they’re pithy summations of a lot of relationships in a comprehensible geometrical fashion. But they’re not reality, they’re a stylized representation of a slice of reality, abstractions which distill the shape and processes of reality. More precisely the x-axis is an independent dimension of correlations of variation across genes which can account for ~7% of the total population variance. This is the dimension with the largest magnitude. The y-axis is the second largest dimension, accounting for ~4%. The magnitudes decline precipitously as you descend the rank order of the principal components. The 5th component accounts for ~0.2% of the variance.
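To make "proportion of variance accounted for" concrete, here is a minimal sketch of how those percentages fall out of a principal components analysis. It is not the paper's pipeline: the toy genotype matrix, the two simulated populations, and the function name `explained_variance_ratio` are all assumptions for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def explained_variance_ratio(genotypes):
    """Fraction of total variance captured by each principal component.

    genotypes: (individuals x markers) array of allele counts (0, 1, 2).
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # center each marker
    # Squared singular values of the centered matrix are proportional
    # to the variance along each component (largest first).
    s = np.linalg.svd(X, compute_uv=False)
    var = s ** 2
    return var / var.sum()

# Hypothetical toy data: two populations with slightly different allele frequencies.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=200)
freq_b = np.clip(freq_a + rng.normal(0, 0.1, size=200), 0.05, 0.95)
pop_a = rng.binomial(2, freq_a, size=(50, 200))
pop_b = rng.binomial(2, freq_b, size=(50, 200))

ratios = explained_variance_ratio(np.vstack([pop_a, pop_b]))
print(ratios[:5])  # the first few components, in descending order
```

The x-axis and y-axis of a plot like the one above correspond to the first two entries of `ratios`; the later entries shrink quickly, which is the "precipitous decline" described here.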

The first two components in these sorts of studies usually conform to our intuitions, and add a degree of precision to various population scale relations. Consider this supplement chart from a 2008 paper (I’ve rotated and reedited for clarity):

The post is titled the Chinese Muslims, not the Muslims of China. One may make a semantic distinction here in that the latter connotes the residence of a Muslim community within Chinese society, while the former indicates members of Chinese society who happen to be Muslim. Such black and white dichotomies are naturally artificial, but to a large extent the Uyghurs of Xinjiang fall into the category of a group of Muslims (of Turkic language) who happen to fall within the boundaries of the modern Chinese state (thanks to the Chinese state’s inheritance of the full expanse of the Manchu Empire of the 18th century). On the other hand, the Hui people are arguably more a Chinese people who happen to be Muslim.

For more on the topic, please see my blog post at the Islam in China website. It was submitted a while back, but it only went up recently.

But why has evolutionary genetics stood apart from biology’s resolutely qualitative, rather than quantitative, tradition? Most remarkably, while biomechanics employs the laws of physics, and biochemistry is founded on the quantitative science of chemistry, evolutionary genetics is based on axiomatic foundations that are entirely biological, and yet are capable of precise mathematical formulation. The rules of Mendelian genetics, encapsulated by unbiased inheritance and random mating in a diploid genetic system, predict Hardy-Weinberg frequencies, the binomial sampling of gametes in finite populations determines the properties of genetic drift, and, with a Poisson process of mutation, the complex theory of neutral genetic variation can be established on the basis of very simple assumptions.
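The two simplest pieces of that formal machinery can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the drift simulation uses the standard textbook Wright-Fisher setup (binomial sampling of 2N gametes each generation), which the passage describes but does not name.

```python
import random

def hardy_weinberg(p):
    """Expected genotype frequencies (AA, Aa, aa) under random mating."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q

def wright_fisher(p0, n_diploid, generations, seed=0):
    """Allele frequency trajectory under binomial sampling of 2N gametes."""
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each of the 2N gametes carries the allele with probability p.
        copies = sum(rng.random() < p for _ in range(2 * n_diploid))
        p = copies / (2 * n_diploid)
        trajectory.append(p)
    return trajectory

print(hardy_weinberg(0.3))
print(wright_fisher(0.5, n_diploid=50, generations=20)[-1])
```

No selection or mutation appears anywhere in `wright_fisher`; the frequency wanders purely because a finite number of gametes is sampled, which is all that genetic drift is.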

Background
Genetic admixture is a common caveat for genetic association analysis. Therefore, it is important to characterize the genetic structure of the population under study to control for this kind of potential bias.

Results
In this study we have sampled over 800 unrelated individuals from the population of Spain, and have genotyped them with a genome-wide coverage. We have carried out linkage disequilibrium, haplotype, population structure and copy-number variation (CNV) analyses, and have compared these estimates of the Spanish population with existing data from similar efforts.

Conclusions
In general, the Spanish population is similar to the Western and Northern Europeans, but has a more diverse haplotypic structure. Moreover, the Spanish population is also largely homogeneous within itself, although patterns of micro-structure may be able to predict locations of origin from distant regions. Finally, we also present the first characterization of a CNV map of the Spanish population. These results and original data are made available to the scientific community.

It is probably obvious that I’m not on the internet as much right now. But I’ve been thinking on the topic of this paper for a few days, and plan on putting together a post when I have something interesting to say, and nothing interesting to do off-net.

A few weeks ago I read Peter Heather’s Empires and Barbarians, but I had another book waiting in the wings which I had planned to tackle as a companion volume, Robert Ferguson’s The Vikings: A History. Heather covered the period of one thousand years between Arminius and the close of the Viking Age, but his real focus was on the three centuries between 300 and 600. It is telling that he spent more time on the rise and expansion of the domains of the Slavic speaking peoples than he did on the Viking assaults on Western civilization; an idiosyncratic take from the perspective of someone writing to an audience of English speakers. But within the larger narrative arc of Empires and Barbarians this was logical, the Slavs were far closer to the relevant action in terms of time and space than the Scandinavians who ravaged early medieval rather than post-Roman societies (where the latter bleeds into the former is up for debate). In Heather’s narrative the Viking invasions were a coda to the epoch of migration, the last efflorescence of the barbarian Europe beyond the gates of Rome before the emergence of a unified medieval Christian commonwealth. And these are the very reasons that Robert Ferguson’s narrative is a suitable complement to Peter Heather’s. Ferguson’s story begins after the central body of Heather’s, and most of its dramatic action is outside of the geographical purview of Empires and Barbarians. In The Vikings the post-Roman world has already congealed into the seeds of what we would term the Middle Ages, and it is this world which serves as the canvas upon which the Viking invasions are painted. Aside from what was Gaul the world of old Rome is on the peripheries of Ferguson’s narrative.

… domestic chickens diverged from red junglefowl 58,000±16,000 years ago, well before the archeological dating of domestication, and that their common ancestor in turn diverged from green junglefowl 3.6 million years ago. Several shared haplotypes nonetheless found between green junglefowl and chickens are attributed to recent unidirectional introgression of chickens into green junglefowl. Shared haplotypes are more frequently found between red junglefowl and chickens, which are attributed to both introgression and ancestral polymorphisms. Within each chicken breed, there is an excess of homozygosity, but there is no significant reduction in the nucleotide diversity. Phenotypic modifications of chicken breeds as a result of artificial selection appear to stem from ancestral polymorphisms at a limited number of genetic loci.

I wonder if domesticates in particular exhibit these more complex reticulated patterns in their phylogenies because they spread along human trade routes.

In the comments below a question was asked in regards to “fundamentalist” vs. agnostic Jews. I put the quotations around fundamentalist because the term means different things in different religions. As for the idea of an agnostic Jew, remember that Jews are a nation (ethnicity) as well as a religion, and that religious belief has traditionally been less explicitly emphasized than religious practice.

It wasn’t too hard to find some answers in the GSS. I used the somewhat crude “BIBLE” variable again. Remember that BIBLE asks if the respondent believes that the Bible is the literal and inerrant Word of God, the inspired Word of God, or a book of fables. I reclassified these as Fundamentalist, Moderate, and Liberal, respectively. There are two variables I used in the first chart, JEW and RELIG. The former looks just at Jews, and breaks down by Orthodox, Conservative and Reform. The latter I combined with BIBLE to bracket out Fundamentalists, Moderates and Liberals of each religious group. The vocabulary test scores are from WORDSUM. Remember that they correlate 0.71 with adult IQ. Because the sample size for Jews was so small I included 95% intervals so you can modulate confidence appropriately. I limited the sample to whites.
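As a rough illustration of why small samples call for intervals, here is a sketch of a 95% interval for a mean score using the normal approximation. The scores below are made up, the function name `mean_ci95` is hypothetical, and this is not necessarily how the GSS interface computes its intervals.

```python
import math
import statistics

def mean_ci95(scores):
    """Return (lower, mean, upper) for a normal-approximation 95% interval."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / math.sqrt(n)  # standard error of the mean
    half_width = 1.96 * sem
    return mean - half_width, mean, mean + half_width

# Hypothetical vocabulary scores on a 0-10 scale.
small_sample = [6, 7, 8, 9, 7]
large_sample = small_sample * 20  # same distribution, 20 times the respondents

print(mean_ci95(small_sample))
print(mean_ci95(large_sample))
```

Both samples have the same mean, but the interval around the small sample is several times wider, which is the reason to read the bars for the smaller groups with more caution.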

Beinart offers a condescending glance at the “warmth” and “learning” of Orthodox Jews, but neglects to mention the most startling factoid in Jewish demographics: a third of Jews aged 18 to 34 self-identify as Orthodox. “Secular Jew” is not quite an oxymoron–the Jews are a nation as well as a religion–but in the United States, at least, secular Jews have a fertility barely above 1 and an intermarriage rate of 50 percent, which means their numbers will decline by 75 percent per generation. It is tragic that the Jewish people stand to lose such a large proportion of their numbers, but they are lost to Judaism in general, not only to Zionism. That puts a different light on the matter.
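The 75 percent figure in the quoted passage can be reproduced with simple arithmetic, though only under assumptions the passage does not spell out. The sketch below assumes a replacement fertility of 2.0, and assumes that none of the children of intermarried parents are counted in the next generation; both are simplifications supplied here, not claims from the source.

```python
def generational_retention(fertility, intermarriage_rate, replacement=2.0):
    """Share of the current generation's size that remains one generation on.

    Assumptions (illustrative only): replacement fertility is 2.0, and children
    of intermarried parents are not counted in the next generation.
    """
    size_ratio = fertility / replacement   # shrinkage from low fertility
    retained = 1.0 - intermarriage_rate    # share not lost to intermarriage
    return size_ratio * retained

r = generational_retention(fertility=1.0, intermarriage_rate=0.5)
print(f"decline per generation: {1 - r:.0%}")
```

Relaxing either assumption (fertility "barely above 1" rather than exactly 1, or some children of intermarried couples remaining in the group) gives a smaller decline.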

Well, I would certainly love to be wrong; neither I nor my descendants gain anything out of a world of decline. But it would be useful to go back and look at how 19th-century progressives expected the 20th century to be a wonderland of peace, prosperity and progress. Didn’t quite work out that way. I suspect the truth is that nobody knows anything about tomorrow, and that we can only make our best educated guesses based on history and the wisdom of experience.

Now that the sex lives of Supreme Court justices have become grist for commentators, we are finally free to discuss a question formerly only whispered about in the shadows: Why does Justice Antonin Scalia, by common consent the leading intellectual force on the Court, have nine children? Is this normal? Or should I say “normal,” as some people choose to define it? Can he represent the views of ordinary Americans when he practices such a minority lifestyle? After all, having nine children is far more unusual in this country than, say, being a lesbian.

The GSS can answer this question. Sort of. It turns out that the highest number of children it asks about is “8 or more.” Limiting the sample to 1998-2008 so it has some contemporary relevance, ~1% of respondents in the GSS have 8 or more children. But that’s not quite fair, since many respondents are young adults, or just starting their families. Limiting the sample to those who are 60 years or older you have ~3.5%. Limiting to 70 and above it goes up to ~4.5%. Scalia is 74 years old, so I think it might be appropriate to judge him by his generation, though the relative gerontocracy of the Supreme Court, and American politics in general, might warrant examination. In 2008 the GSS asked about sexual orientation, and ~2% of women stated they were lesbian, gay or homosexual. So whether Scalia is more abnormal than a lesbian measured against the general population depends on the reference population you use. For his generation, probably not, but for this generation, perhaps.

But now I think we’ve turned a corner. It feels, to mix metaphors, that we’ve hit a tipping point. The Human genome project, the mapping and sequencing of the/a human genome from 1990 to 2003, cost approximately 2,700,000,000 dollars (that’s 2.7 billion, I wanted to get all the zeros in). Celera did the genome for 300,000,000. The cost of sequencing an entire human genome has been plummeting ever since. In 2007, the cost of sequencing the genome of James Watson (co-discoverer of the structure of DNA) was about 2,000,000. Today the cost is about 10,000. Complete Genomics and other companies are on the march to quickly reducing the cost of sequencing a genome to under 1,000.
…So, within a year, the cost of sequencing your, my, genome will reach 1,000. If not less. We’ve seen this coming for years now, and it’s upon us. But what does it mean? A lot of data. But data means nothing without context and analysis. Sequencing my genome would be a waste of 1,000 dollars if I gleaned nothing from it.

I can believe that we’ll be able to get a tarball with our own full sequence for a reasonable price in a few years. Cheaper than orthodontia and cosmetic surgery even. Though the utility in prevention and treatment is a different matter. Most people already have a treasure trove of data through family history, and that doesn’t seem to change behavior for many in the short-term. Once the magical power of genomics wears off I suspect that knowing you have variant X with risk Y will be less transformative than not.

Gene Tests For Everyone. Probably not much value in most of these tests for most people right now. Also, many of these common variants have been found in subject populations which are European, so if you are Colored it might not tell you anything relevant (i.e., the SNP which is identified as a risk has only been shown to have an effect in Europeans, or, you know you have a trait but it turns out you don’t have any of the “common” variants, perhaps because your population has different variants which are common).

Return of the Neanderchimps. Complex demographic history for one and all! I do find the genetic isolation between Bonobos and Common Chimpanzees interesting. Apparently the Congo river was an imposing enough barrier to allow for allopatric speciation. I wonder if this can tell us about the fear our own ancestors might have had in traversing water barriers.

Mutations are as you know a double-edged sword. On the one hand mutations are the stuff of evolution; neutral changes on the molecular or phenotypic level are the result of mutations, as are changes which enhance fitness and so are driven to fixation by positive selection. On the other hand mutations also tend to cause problems. In fact, mutations which are deleterious far outnumber those which are positive. It is much easier to break complex systems which are near a fitness optimum than it is to improve upon them through random chance. In fact a Fisherian geometric analogy of the effect of genes on fitness implies that once a genetic configuration nears an optimum mutations of larger effect have a tendency to decrease fitness. Sometimes environments and selection pressures change radically, and large effect mutations may become needful. But despite their short term necessity these mutations still cause major problems because they disrupt many phenotypes due to pleiotropy.

But much of the playing out of evolutionary dynamics is not so dramatic. Instead of very costly mutations for good or ill, most mutations may be of only minimal negative effect, especially if they are masked because of recessive expression patterns. That is, only when two copies of the mutation are present does all hell break loose. And yet even mutations which exhibit recessive expression tend to generate some drag on the fitness of heterozygotes. And if you sum small values together you can obtain a larger value. This gentle rain of small negative effect mutations can be balanced by natural selection, which does not smile upon less fit individuals who have a higher mutational load. Presumably those with “good genes,” fewer deleterious mutations, will have more offspring than those with “bad genes.” Because mutations accrue from one generation to the next, and there is sampling variance of deleterious alleles, a certain set of offspring will always be gifted with fewer deleterious mutations than their siblings. This is a genetics of chance. And so the mutation-selection balance is maintained over time, the latter rising to the fore if the former comes to greater prominence.
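Mutation-selection balance can be sketched with a standard single-locus textbook model. The model below is an assumption supplied for illustration, not something taken from any paper discussed here: genotype fitnesses are 1, 1 - hs and 1 - s, mutation toward the deleterious allele occurs at rate mu, and the parameter values are arbitrary.

```python
import math

def next_frequency(q, mu, s, h):
    """One generation of selection, then mutation, on a deleterious allele.

    q: allele frequency; mu: mutation rate; s: selection coefficient;
    h: dominance (0 = fully recessive). Fitnesses: AA=1, Aa=1-h*s, aa=1-s.
    """
    p = 1.0 - q
    mean_fitness = 1.0 - 2 * p * q * h * s - q * q * s
    q_after_selection = (q * q * (1 - s) + p * q * (1 - h * s)) / mean_fitness
    # New mutations convert a fraction mu of the remaining normal alleles.
    return q_after_selection + mu * (1.0 - q_after_selection)

def equilibrium(mu, s, h, q0=0.01, generations=50_000):
    """Iterate the recursion until the frequency settles."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, mu, s, h)
    return q

mu, s, h = 1e-5, 0.1, 0.1
# Fully recessive: equilibrium is close to sqrt(mu / s).
print(equilibrium(mu, s, h=0.0), math.sqrt(mu / s))
# Slight heterozygote cost: equilibrium is close to mu / (h * s).
print(equilibrium(mu, s, h=h), mu / (h * s))
```

Even a small fitness drag on heterozygotes pulls the equilibrium frequency down by about an order of magnitude relative to the fully recessive case, which illustrates why the "some drag on the fitness of heterozygotes" matters so much.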

In the post below I pointed to various differences in regards to acceptance of evolution by demographic. One of the issues is that just because X correlates with Y, does not entail that X causes Y (and of course, if X correlates with Y, and Y correlates with Z, that does not entail that X correlates with Z). You can use the GSS to run some regressions and see what the strongest predictive variables are. Because of this I know that the variable BIBLE is very predictive of skepticism of evolution. Additionally, even smart people with college educations who have a literal inerrant view of the Bible are skeptical of evolution. To show the power of Biblical fundamentalism I thought it would be useful to plot differences in regards to the Index of Creationism by various demographics for both Fundamentalists and non-Fundamentalists. So below I have a set of charts which have two series, one for Fundamentalists, and one for non-Fundamentalists, of a given demographic. So for example one chart has Fundamentalists and non-Fundamentalists separated by attainment or non-attainment of college educations.

The primary variables are BIBLE & SCITEST4.

BIBLE is:

Which of these statements comes closest to describing your feelings about the Bible? 1. The Bible is the actual word of God and is to be taken literally, word for word. 2. The Bible is the inspired word of God but not everything in it should be taken literally, word for word. 3. The Bible is an ancient book of fables, legends, history, and moral precepts recorded by men.

I recoded so that responses 2 and 3 are classed as non-Fundamentalist.

SCITEST4:

For each statement below, just check the box that comes closest to your opinion of how true it is. In your opinion, how true is this? d. Human beings developed from earlier species of animals.

I created the Index of Creationism = (% “definitely not true”) X 3 + (% “probably not true”) X 2 + (% “probably true”) X 1, from three of the four responses to SCITEST4.

In the charts below the blue squares = Fundamentalists. The red diamonds = non-Fundamentalists. I rescaled so that 1 is the minimum for the Index of Creationism on all charts.
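For anyone who wants to recompute the index from GSS output, the formula above can be sketched as follows. The percentages in the example are invented, the function names are hypothetical, and the rescaling step assumes that "1 is the minimum" means dividing every value by the smallest one; a different rescaling would change the numbers but not the ordering.

```python
def creationism_index(pct_definitely_not, pct_probably_not, pct_probably_true):
    """Weighted sum of SCITEST4 response percentages.

    "Definitely true" gets weight 0, so it does not appear in the sum.
    """
    return 3 * pct_definitely_not + 2 * pct_probably_not + 1 * pct_probably_true

def rescale_to_min_one(values):
    """Assumed rescaling: divide by the smallest value so the minimum is 1."""
    lowest = min(values)
    return [v / lowest for v in values]

# Hypothetical response percentages for two groups.
fundamentalist = creationism_index(50, 20, 20)
non_fundamentalist = creationism_index(10, 15, 45)
print(rescale_to_min_one([fundamentalist, non_fundamentalist]))
```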

Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!