Reply to some self-important dork whining about this talk at Dienekes':

"You simply cannot criticize a new, rapidly-evolving and improving model just based on its trivial, known shortcomings. Such a thing is ludicrous and paints a truly bad picture of the talk presenters."

I'm afraid your effeminate idea of proper protocol has no bearing on actual science. Gray and Atkinson's "innovation" is insisting that Bayesian phylogenetics with limited and sometimes questionable inputs of data can produce highly accurate and precise readouts of linguistic history that supersede all previous linguistic and archaeological knowledge. Their results may dazzle twits like you and appeal to those who find their results politically or ethnically congenial. But the first question a serious person would ask is how closely Gray and Atkinson's attempts at reconstruction recapitulate recent/known linguistic history. That they frequently fail to do so is extremely germane to the question of how much faith one should put in their deeper reconstructions.

Statistical models are not magic. Bayesian tree building is not magic. Even with large corpuses of genetic data, the "most likely" tree is often overwhelmingly likely to be wrong. For genetics, where there's an explosion of data with comparatively few human analysts and little or no historical context, such results are useful, being often the best we have until additional data and further refinements of models appear. On the other hand, in linguistics, where on the PIE question relatively many human analysts have been poring over a comparatively limited corpus for many decades, it's up to Gray and Atkinson to demonstrate they have something useful to contribute. Every indication says they do not.

The Genographic Project is an international effort using genetic data to chart human migratory history. The project is non-profit and non-medical, and through its Legacy Fund supports locally led efforts to preserve indigenous and traditional cultures. In its second phase, the project is focusing on markers from across the entire genome to obtain a more complete understanding of human genetic variation. Although many commercial arrays exist for genome-wide SNP genotyping, they were designed for medical genetic studies and contain medically related markers that are not appropriate for global population genetic studies. GenoChip, the Genographic Project's new genotyping array, was designed to resolve these issues and enable higher-resolution research into outstanding questions in genetic anthropology. We developed novel methods to identify AIMs and genomic regions that may be enriched with alleles shared with ancestral hominins. Overall, we collected and ascertained AIMs from over 450 populations. Containing an unprecedented number of Y-chromosomal and mtDNA SNPs and over 130,000 SNPs from the autosomes and X-chromosome, the chip was carefully vetted to avoid inclusion of medically relevant markers. The GenoChip results were successfully validated. To demonstrate its capabilities, we compared the FST distributions of GenoChip SNPs to those of two commercial arrays for three continental populations. While all arrays yielded similarly shaped (inverse J) FST distributions, the GenoChip autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. The GenoChip is a dedicated genotyping platform for genetic anthropology and promises to be the most powerful tool available for assessing population structure and migration history.
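For readers unfamiliar with the FST comparison in the abstract, here is a minimal sketch of how a per-SNP FST can be computed from allele frequencies. It assumes equally weighted populations and applies no sample-size correction, so it is a textbook illustration and not necessarily the estimator the GenoChip authors used; the function name `fst_biallelic` is my own.

```python
def fst_biallelic(subpop_freqs):
    """Wright's FST for one biallelic SNP.

    subpop_freqs: allele frequency of the same allele in each population.
    Assumption: populations are weighted equally and frequencies are
    treated as exact (no correction for finite sample size).
    """
    if len(subpop_freqs) < 2:
        raise ValueError("need at least two populations")
    n = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / n
    # Expected heterozygosity if all populations were pooled.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # Monomorphic everywhere: no variation to partition.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in subpop_freqs) / n
    return (h_total - h_within) / h_total


# Identical frequencies give no differentiation; fixed differences give 1.
same = fst_biallelic([0.5, 0.5])
fixed = fst_biallelic([0.0, 1.0])
```

A higher mean FST across an array's SNPs means the chosen markers differ more in frequency between populations, which is the property the abstract is advertising.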

Let's be clear: the "most powerful tool available for assessing population structure and migration history" is whole genome sequencing. The Genographic Project, which represents a large fraction of the global spending on its type of population genetics research, unnecessarily hobbled itself from the outset in hopes of pre-emptively appeasing rent-seeking shrill self-appointed advocates for "indigenous peoples". I don't think Spencer Wells and company thought they were giving up much, since the short-sighted original plan was to examine only uniparental markers. In that light, perhaps we can be thankful that they've come up with a way of sidestepping the restrictions they placed on themselves and generating at least some useful autosomal data.

Several steps were taken to ensure that the genetic results would not be exploited for
pharmaceutical, medical, and biotechnology purposes. First, participant samples were
maintained in a completely anonymous status during GenoChip analysis. Second, no phenotypic
or medical data were collected from the participants. Third, we included only SNPs in
noncoding regions without any known functional association, as reported in dbSNP build 132.
Lastly, we filtered our SNP collection against a 1.5 million SNP data set containing all variants
that have potential, known, or suspected associations with diseases.

But however they'd like to spin it, there's nothing ideal about ignoring "functional" variation or limiting the number of SNPs tested. Razib has a bizarre post up at his Discover blog in which he confuses SNP ascertainment and "Ancestry Informative Marker" ascertainment, and I see that the authors of the paper themselves appear to be eliding the distinction. But the overwhelming majority of the "450 populations" from which "AIMs" were "ascertained" for the GenoChip had merely been typed on existing microarrays -- which goes no way towards addressing the issue the Affymetrix Human Origins array was designed to address (putting together SNP panels with known ascertainment, starting by sequencing individuals from multiple populations). Ultimately, the most useful and complete picture of human genetic history will come from whole genome sequencing, which should be cheap enough within a few years for use by the Genographic Project. The question is whether they have permanently handicapped themselves from applying the actual best tool for their stated mission, or whether we will eventually see at least some whole genome data for their 75,000 indigenous samples (no doubt with at minimum coding regions redacted).

The derived proportions of the human hand may provide supportive buttressing that protects the hand from injury when striking with a fist. Flexion of digits 2–5 results in buttressing of the pads of the distal phalanges against the central palm and the palmar pads of the proximal phalanges. Additionally, adduction of the thenar eminence to abut the dorsal surface of the distal phalanges of digits 2 and 3 locks these digits into a solid configuration that may allow a transfer of energy through the thenar eminence to the wrist. To test the hypothesis of a performance advantage, we measured: (1) the forces and rate of change of acceleration (jerk) from maximum effort strikes of subjects striking with a fist and an open hand; (2) the static stiffness of the second metacarpo-phalangeal (MCP) joint in buttressed and unbuttressed fist postures; and (3) static force transfer from digits 2 and 3 to digit 1 also in buttressed and unbuttressed fist postures. We found that peak forces, force impulses and peak jerk did not differ between the closed fist and open palm strikes. However, the structure of the human fist provides buttressing that increases the stiffness of the second MCP joint by fourfold and, as a result of force transfer through the thenar eminence, more than doubles the ability of the proximal phalanges to transmit ‘punching’ force. Thus, the proportions of the human hand provide a performance advantage when striking with a fist. We propose that the derived proportions of hominin hands reflect, in part, sexual selection to improve fighting performance.

Experts have long assumed these features evolved to help our ancestors make and use tools.

But new evidence from the US suggests it was not just dexterity that shaped the human hand, but violence also.

Hands largely evolved through natural selection to form a punching fist, it is claimed.

''The role aggression has played in our evolution has not been adequately appreciated,'' said Professor David Carrier, from the University of Utah.

''There are people who do not like this idea but it is clear that compared with other mammals, great apes are a relatively aggressive group with lots of fighting and violence, and that includes us. We're the poster children for violence.'' [. . .]

''Individuals who could strike with a clenched fist could hit harder without injuring themselves, so they were better able to fight for mates and thus be more likely to reproduce,'' he said. [. . .]

To test the theory Prof Carrier conducted experiments with volunteers aged 22 to 50 who had boxing or martial arts experience.

In one, participants were asked to hit a punchbag as hard as possible from different directions with their hands in a range of shapes, from open palms to closed fists.

The results, published in the Journal of Experimental Biology, show that tightly clenched fists are much more efficient weapons than open or loosely curled hands.

A punch delivers up to three times more force to the same amount of surface area as a slap. And the buttressing provided by a clenched fist increases the stiffness of the knuckles fourfold, while doubling the ability of the fingers to deliver a punching force. [. . .]

''Human-like hand proportions appear in the fossil record at the same time our ancestors started walking upright four million to five million years ago. An alternative possible explanation is that we stood up on two legs and evolved these hand proportions to beat each other.''

Manual dexterity could have evolved without the fingers and palms getting shorter, he said. But he added: ''There is only one way you can have a buttressed, clenched fist: the palms and fingers got shorter at the same time the thumb got longer.''

Prof Carrier cited other evidence pointing to the role of fighting in the evolution of human hands.

:: No ape other than humans hits with a clenched fist.

:: Humans use fists instinctively as threat displays. ''If you are angry, the reflexive response is to form a fist,'' said Prof Carrier. ''If you want to intimidate somebody, you wave your fist.''

:: Sexual dimorphism, or the difference in body size between the sexes, tends to be greater among primates when there is more competition between males. In humans the difference is mainly in the upper body and arms, especially the hands. ''It's consistent with the hand being a weapon,'' said Prof Carrier.

In their paper the professor and colleague Michael Morgan, a University of Utah medical student, ponder on the paradoxical nature of the human hand.

''It is arguably our most important anatomical weapon, used to threaten, beat and sometimes kill to resolve conflict. Yet it is also the part of our musculoskeletal system that crafts and uses delicate tools, plays musical instruments, produces art, conveys complex intentions and emotions, and nurtures,'' they write.

''More than any other part of our anatomy, the hand represents the identity of Homo sapiens. Ultimately, the evolutionary significance of the human hand may lie in its remarkable ability to serve two seemingly incompatible but intrinsically human functions.''

This highlights an intriguing paradox at the heart of human
communication. If language evolved to allow us to exchange
information, how come most people cannot understand what most other
people are saying? This perennial question was famously addressed in
the Old Testament story of the Tower of Babel, which tells of how
humans developed the conceit that they could use their shared
language to cooperate in the building of a tower that would take
them to heaven. God, angered at this attempt to usurp his power,
destroyed the tower and to ensure it would not be rebuilt he
scattered the people and confused them by giving them different
languages. The myth leads to the amusing irony that our separate
languages exist to prevent us from communicating. The surprise is
that this might not be far from the truth. [. . .]

Of course that still leaves the question of why people would want to
form into so many distinct groups. For the myriad biological species
in the tropics, there are advantages to being different because it
allows each to adapt to its own ecological niche. But humans all
occupy the same niche, and splitting into distinct cultural and
linguistic groups actually brings disadvantages, such as slowing the
movement of ideas, technologies and people. It also makes societies
more vulnerable to risks and plain bad luck. So why not have one
large group with a shared language?

An answer to this question is emerging with the realisation that
human history has been characterised by continual battles. Ever
since our ancestors walked out of Africa, beginning around 60,000
years ago, people have been in conflict over territory and
resources. In my book Wired for Culture (Norton/Penguin, 2012) I
describe how, as a consequence, we have acquired a suite of traits
that help our own particular group to outcompete the others. Two
traits that stand out are "groupishness" - affiliating with people
with whom you share a distinct identity - and xenophobia, demonising
those outside your group and holding parochial views towards them.
In this context, languages act as powerful social anchors of our
tribal identity. How we speak is a continual auditory reminder of
who we are and, equally as important, who we are not. Anyone who can
speak your particular dialect is a walking, talking advertisement
for the values and cultural history you share. What's more, where
different groups live in close proximity, distinct languages are an
effective way to prevent eavesdropping or the loss of important
information to a competitor.

In support of this idea, I have found anthropological accounts of
tribes deciding to change their language, with immediate effect, for
no other reason than to distinguish themselves from neighbouring
groups. For example, a group of Selepet speakers in Papua New Guinea
changed its word for "no" from bia to bune to be distinct from other
Selepet speakers in a nearby village. Another group reversed all its
masculine and feminine nouns - the word for he became she, man
became woman, mother became father, and so on. One can only
sympathise with anyone who had been away hunting for a few days when
the changes occurred.

The use of language as identity is not confined to Papua New Guinea.
People everywhere use language to monitor who is a member of their
"tribe". We have an acute, and sometimes obsessive, awareness of how
those around us speak, and we continually adapt language to mark out
our particular group from others. In a striking parallel to the
Selepet examples, many of the peculiar spellings that differentiate
American English from British - such as the tendency to drop the "u"
in words like colour - arose almost overnight when Noah Webster
produced the first American Dictionary of the English Language at
the start of the 19th century. He insisted that: "As an independent
nation, our honor [sic] requires us to have a system of our own, in
language as well as government."

The evidence of the recent NMS semifinalist lists seems the most conclusive of all, given the huge statistical sample sizes involved. As discussed earlier, these students constitute roughly the highest 0.5 percent in academic ability, the top 16,000 high school seniors who should be enrolling at the Ivy League and America’s other most elite academic universities. In California, white Gentile names outnumber Jewish ones by over 8-to-1; in Texas, over 20-to-1; in Florida and Illinois, around 9-to-1. Even in New York, America’s most heavily Jewish state, there are more than two high-ability white Gentile students for every Jewish one. Based on the overall distribution of America’s population, it appears that approximately 65–70 percent of America’s highest ability students are non-Jewish whites, well over ten times the Jewish total of under 6 percent.

Needless to say, these proportions are considerably different from what we actually find among the admitted students at Harvard and its elite peers, which today serve as a direct funnel to the commanding heights of American academics, law, business, and finance. Based on reported statistics, Jews approximately match or even outnumber non-Jewish whites at Harvard and most of the other Ivy League schools, which seems wildly disproportionate. Indeed, the official statistics indicate that non-Jewish whites at Harvard are America’s most under-represented population group, enrolled at a much lower fraction of their national population than blacks or Hispanics, despite having far higher academic test scores. [. . .]

Just as striking as these wildly disproportionate current numbers have been the longer enrollment trends. In the three decades since I graduated Harvard, the presence of white Gentiles has dropped by as much as 70 percent, despite no remotely comparable decline in the relative size or academic performance of that population; meanwhile, the percentage of Jewish students has actually increased. This period certainly saw a very rapid rise in the number of Asian, Hispanic, and foreign students, as well as some increase in blacks. But it seems rather odd that all of these other gains would have come at the expense of whites of Christian background, and none at the expense of Jews.

Furthermore, the Harvard enrollment changes over the last decade have been even more unusual when we compare them to changes in the underlying demographics. Between 2000 and 2011, the relative percentage of college-age blacks enrolled at Harvard dropped by 18 percent, along with declines of 13 percent for Asians and 11 percent for Hispanics, while only whites increased, expanding their relative enrollment by 16 percent. However, this is merely an optical illusion: in fact, the figure for non-Jewish whites slightly declined, while the relative enrollment of Jews increased by over 35 percent, probably reaching the highest level in Harvard’s entire history. Thus, the relative presence of Jews rose sharply while that of all other groups declined, and this occurred during exactly the period when the once-remarkable academic performance of Jewish high school students seemed to suddenly collapse. [. . .]

Each year, the Ivy League colleges enroll almost 10,000 American whites and Asians, of whom over 3000 are Jewish. Meanwhile, each year the NMS Corporation selects and publicly names America’s highest-ability 16,000 graduating seniors; of these, fewer than 1000 are Jewish, while almost 15,000 are non-Jewish whites and Asians. Even if every single one of these high-ability Jewish students applied to and enrolled at the Ivy League—with none going to any of America’s other 3000 colleges—Ivy League admissions officers are obviously still dipping rather deep into the lower reaches of the Jewish ability-pool, instead of easily drawing from some 15,000 other publicly identified candidates of far greater ability but different ethnicity. [. . .]

The situation becomes even stranger when we focus on Harvard, which this year accepted fewer than 6 percent of over 34,000 applicants and whose offers of admission are seldom refused. Each Harvard class includes roughly 400 Jews and 800 Asians and non-Jewish whites; this total represents over 40 percent of America’s highest-ability Jewish students, but merely 5 percent of their equally high-ability non-Jewish peers. It is quite possible that a larger percentage of these top Jewish students apply and decide to attend than similar members from these other groups, but it seems wildly implausible that such causes could account for roughly an eight-fold difference in apparent admissions outcome. Harvard’s stated “holistic” admissions policy explicitly takes into account numerous personal characteristics other than straight academic ability, including sports and musical talent. But it seems very unlikely that any remotely neutral application of these principles could produce admissions results whose ethnic skew differs so widely from the underlying meritocratic ratios.

One datapoint strengthening this suspicion of admissions bias has been the plunge in the number of Harvard’s entering National Merit Scholars, a particularly select ability group, which dropped by almost 40 percent between 2002 and 2011, falling from 396 to 248. This exact period saw a collapse in Jewish academic achievement combined with a sharp rise in Jewish Harvard admissions, which together might easily help to explain Harvard’s strange decline in this important measure of highest student quality. [. . .]

It is important to note that these current rejection rates of top scoring applicants are vastly higher than during the 1950s or 1960s, when Harvard admitted six of every seven such students and Princeton adopted a 1959 policy in which no high scoring applicant could be refused admission without a detailed review by a faculty committee. An obvious indication of Karabel’s obtuseness is that he describes and condemns the anti-meritocratic policies of the past without apparently noticing that they have actually become far worse today. An admissions framework in which academic merit is not the prime consideration may be directly related to the mystery of why Harvard’s ethnic skew differs in such extreme fashion from that of America’s brightest graduating seniors. In fact, Harvard’s apparent preference for academically weak Jewish applicants seems to be reflected in their performance once they arrive on campus.

Our GWAS results did not identify any genetic loci reaching genomewide
significance at p < 5 x 10^-8 among men or women. Among men,
the peak (non-significant) hit was in chromosome 8q12.3
(chr8:63532921 in NKAIN3, p = 7.1 x 10^-8).
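To make the threshold concrete: the conventional genome-wide cutoff of 5 x 10^-8 is what a Bonferroni correction gives for a 0.05 false-positive rate spread over roughly one million independent tests. The sketch below assumes that round number of tests (it is the usual convention, not something stated in the 23andMe abstract) and checks the reported peak hit against it.

```python
FAMILYWISE_ALPHA = 0.05
# Assumption: ~1 million effectively independent common variants,
# the usual convention behind the 5e-8 cutoff.
N_INDEPENDENT_TESTS = 1_000_000
GENOMEWIDE_THRESHOLD = FAMILYWISE_ALPHA / N_INDEPENDENT_TESTS


def is_genomewide_significant(p_value, threshold=GENOMEWIDE_THRESHOLD):
    """Return True if p_value clears the Bonferroni-style cutoff."""
    return p_value < threshold


# Peak hit among men as reported in the quoted abstract.
peak_hit_p = 7.1e-8
peak_hit_significant = is_genomewide_significant(peak_hit_p)
```

The reported peak hit falls just short of the cutoff, which is why the authors describe it as non-significant.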

More interesting (if generally unsurprising) are the phenotypic associations:

We examined the correlation between sexual identity and ~1000
phenotypes already characterized in the 23andMe database
through other surveys. These analyses are preliminary; we have
not checked for outliers or confounders beyond what is listed in
the methods. We replicated previous findings showing a positive
association between lesbians and alcoholism, and between
lesbians and gay men and several psychiatric conditions.

A commenter at the 23andMe blog:

The phenotypic information is interesting if I’m reading it correctly:
Gay men are less likely to have played common US sports, and are more likely to cry easily or to have had liposuction. Lesbians are less likely to shave their legs. Surprisingly, gay men are less likely to be atheist or agnostic.

Despite the limited data available for Z280 and Z93,
some general inferences can be drawn from the geographic
distributions of these two haplogroups. The R1a1-
Z280 subclade is a strong candidate for covering the
R1a1a* (xM458) in Eastern Europe, which was found in
high frequency by Underhill et al. (2010).The tested set of
53 Malaysian Indian samples presented 100% frequency
for the R1a1-Z93 subclade, without co-existence Z280 or
M458 sub-haplogroups. Inner and Central Asia seem to
be the overlap zones for the R1a1-Z280 and R1a1-Z93
chromosomes as both forms were observed at low frequencies.
This is again consistent with the observations
described for R1a1a* spread in Central Asia and in the
Altai region by Underhill et al. (2010). This pattern suggests
that the origin of R1a1-M198 arguably occurred
somewhere between South Asia and Eastern Europe.
Potential candidates could be the Eurasian Steppes
(Ukraine – Southern Russia – Kazakhstan – Caucasus) or
the Middle East. European populations showed higher
M458 and Z280, whereas Asian populations presented
higher Z93 frequencies, indicating that the new markers
can be effectively used to distinguish between the European
and Asian branches of the haplogroup R1a1-M198. [. . .]

The coalescent time calculated by us for R1a1-M458
carriers is consistent with the age calculated by Underhill
et al. (2010) in Europe yielding 7.3 KYA versus 7.9
KYA (thousands of years ago). Underhill et al. (2010)
also noted the potential association of R1a1-M458 with
the Linear Pottery Neolithic culture in the territory of
present-day Hungary—this observation is supported by
our data. The TMRCA calculated for R1a1-Z280 diversification
(10.3 KYA) is approximately in agreement with
the estimation of Underhill et al. (2010) for
R1a1a*(xM458) chromosomes in Eastern Europe (11
KYA). However, the coalescent age of 10.3 KYA for R1a1-
Z93 chromosomes in this study is lower than that of
populations of the Indus Valley (14 KYA) for the STR
associated diversity of R1a1a*(xM458) chromosomes calculated
by Underhill et al. (2010).

Of course, these markers and other markers defining additional layers of structure under M417 have been known for over a year. Budgetary constraints and the magic of peer review combine to render this paper relatively uninformative. One of the authors explains:

I have to agree with all, but those who never tried to push an article through a serious academic journal has no idea how difficult this is. The first version was submitted like 1 year ago, and also contained pedigree rates plus 500+ FTDNA samples from different ethnic groups. But unfortunately the reviewers were so narrow-minded that we had finally to drop all FTDNA samples plus the pedigree calcs.

Personally I also do not consider Zhiv. rate valid, but I had to accept this compromise to get the paper accepted.
Anyway, as Lukasz pointed out, the main goal was to introduce Z93 and Z280 into the "academic circles" so in the future we may have a comprehensive paper from a more wealthy lab. The Budapest forensics are not full of money so we had no chance to have more than 12 markers tested and "low-chance SNPs" like Z284 in Hungary. Actually we submitted the first draft before Z283 was established securely on the FTDNA tree so we could not include it later...

My comments from last year on the dna-forums postings of an academic affiliated with Underhill (whose lab brought us the Zhivotovsky "evolutionary" mutation rates) stand:

Another poster points out: "Dividing by 3 [to bring the estimate more in line with real mutation rates] gives an age of 3300 years, almost exactly the estimate from Nordtvedt's spreadsheet." Someone else recently estimated the TMRCA for L342.2+ at around 3,600 years. So: if current patterns hold, the bulk of South Asian R1a unambiguously falls within European R1a variation. While I fully expect, when we eventually see results for these markers in large academic samples published, the papers will feature evolutionary mutation rates and less than parsimonious attempts to fit the distribution of M417 sublineages to archaeology, it's pretty clear to me Z93 and L342.2 originated on the Steppe within the past 4000 years or so and spread with Indo-Iranian.

Again: the most straightforward interpretation of the evidence is that Z93 is a relatively young branch of an evidently European lineage. Accurate, unbiased dates using SNPs instead of STRs should be here soon enough, definitively settling this and other issues.
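The "dividing by 3" point is easy to see in a sketch. Under a simple stepwise model, STR variance accumulates at about the mutation rate per generation, so an age estimate scales inversely with whatever rate is plugged in. The numbers below are illustrative assumptions on my part: 6.9 x 10^-4 per 25 years for the "evolutionary" rate, about 2.1 x 10^-3 per generation for a pedigree rate, a 25-year generation, and a made-up variance of 0.28. None of them are taken from the paper's data.

```python
def tmrca_years(str_variance, mu_per_generation, years_per_generation=25.0):
    """Variance-based TMRCA under a single-step mutation model.

    str_variance: mean repeat-count variance across loci (hypothetical input).
    mu_per_generation: per-locus mutation rate per generation.
    """
    if mu_per_generation <= 0:
        raise ValueError("mutation rate must be positive")
    generations = str_variance / mu_per_generation
    return generations * years_per_generation


# Assumed rates, for illustration only.
EVOLUTIONARY_RATE = 6.9e-4   # "evolutionary" effective rate per 25 years
PEDIGREE_RATE = 2.1e-3       # rough father-son (pedigree) rate

variance = 0.28  # hypothetical observed STR variance
age_evolutionary = tmrca_years(variance, EVOLUTIONARY_RATE)
age_pedigree = tmrca_years(variance, PEDIGREE_RATE)
ratio = age_evolutionary / age_pedigree
```

With these assumed rates the same data yield an age about three times older under the evolutionary rate, which is the whole of the gap between a roughly 10,000-year and a roughly 3,300-year estimate.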

In September, Moment Magazine got all nerdy and wrote about their Great DNA Experiment, in which they look at the 23andMe results of 15 notable Americans of Jewish ancestry and make some interesting genetic connections. It’s a good illustration of how our DNA can tell us about our interconnectedness.

The piece shows it’s not “six degrees” that separates these individuals from each other, but, in all but one case, no degrees of separation. This means that these individuals are all directly related to one another, albeit in most cases distantly. This was also news to the 15 participants.

All but one of the individuals has Ashkenazi Jewish ancestry. The one exception is Linda Chavez, the political commentator, who descended from Conversos, individuals of Jewish and Muslim ancestry who converted to Catholicism during the Inquisition. Her ancestors eventually settled in New Mexico. But even in her case, although she isn’t directly related to any of the group, she is connected to each of the other individuals focused on in the piece by just one other individual in the 23andMe database.

The connections shown in the article are what prompted The New York Times columnist David Brooks and NPR “All Things Considered” host Robert Siegel to joke about learning they were distant cousins, sharing a common ancestor several generations back. Brooks kidded that he was “most surprised that our ancestors worked together on National Shtetl Radio, on a program called ‘All Pogroms Considered.’” (Maybe the line needs a drum roll to work.)

The article shows the genetic connections between people like Mayim Bialik, the actress on the Big Bang Theory, and Stephen Dubner, co-author of Freakonomics. Or the connections between NPR’s Siegel, and Harvard Law professor, Alan Dershowitz, or his connections to 23andMe’s CEO’s mother Esther Wojcicki, a journalist and teacher. The magazine shows the connections and the amount of shared DNA, measured in centimorgans (cMs), to illustrate the “relatedness” of any two individuals. More closely related individuals share more DNA.

In the full article at the Moment Magazine website we also learn, for example, that Stephen Dubner's "first cousin once removed was Ethel Greenglass, wife of Julius Rosenberg", or that "Esther Wojcicki's maternal-grandparents gave each of the boys among their 13 children a different surname in order to help them avoid conscription in the Russian army".

Our findings are typical of what geneticists would expect from a group of people of mostly Ashkenazi Jewish origin, says 23andMe's Mike Macpherson. Today's Ashkenazi Jews descend from approximately 25,000 ancestors who survived plagues and massacres in the 12th and 13th centuries. Survivors of these "bottlenecks" and other similar occurrences then married one another, sharing their DNA with millions of descendants.

A new video created by Whitehead Institute in collaboration with the genealogical website Geni.com shows the births of millions of people, from the Middle Ages through the early 20th Century, as single dots on a black background. As time advances, those births define the coastlines and countries of Europe and Great Britain, then the Pilgrims’ voyage to the New World, the migration to Australia, the overland expansion of the United States through the Oregon Trail and Gold Rush, and the founding of Johannesburg, South Africa. [. . .]

In the future, Erlich and Daniel MacArthur, a Group Leader in Genetics at Massachusetts General Hospital and the Broad Institute, will be partnering with Geni to delve even deeper into the information submitted to the world’s largest collaborative genealogical website.

DNA tests are expected to take 12 weeks. The team will compare samples from the skeletal remains with the DNA of a direct descendant of the king's sister, Michael Ibsen, 55, a Canadian furniture maker who lives in London.

The skeleton, more than 9,500 years old, has long been at the center of a rift between tribal members and scientists, led by Doug Owsley, a physical anthropologist at the Smithsonian Institution's National Museum of Natural History who spearheaded the legal challenge to gain access to the skeleton for scientific study.

Owsley says the study shows that not only wasn't Kennewick Man Indian, he wasn't even from the Columbia Valley, which was inhabited by prehistoric Plateau tribes. [. . .]

Isotopes in the bones told scientists Kennewick Man was a hunter of marine mammals, such as seals, Owsley said. "They are not what you would expect for someone from the Columbia Valley," he said. "You would have to eat salmon 24 hours a day and you would not reach these values.

"This is a man from the coast, not a man from here. I think he is a coastal man." [. . .]

Pressed by Armand Minthorn of the Umatilla Board of Trustees, who asked Owsley directly, "Is Kennewick Man Native American?" Owsley said no. "There is not any clear genetic relationship to Native American peoples," Owsley said. "I do not look at him as Native American ... I can't see any kind of continuity. He is a representative of a very different people."

His skull, Owsley said, was most similar to an Asian Coastal people whose characteristics are shared with people, later, of Polynesian descent.

And, while tribes want the remains returned for reburial, Owsley said there is still much more to learn from the skeleton, which has largely been inaccessible but for two instances, in which a team of about 15 scientists could study it for a total of about two weeks.

Note: my own understanding is that Kennewick Man is broadly similar to other Paleoindians, and that historical Amerindians probably derive most of their ancestry from Paleoindians (with some later Asian gene flow and evolution in a more Mongoloid direction). On the other hand, it appears W. Eurasian-affiliated ancient Central Asians did contribute significantly to the ancestry of Paleoindians (and, to a lesser extent, to the ancestry of modern E. Eurasians in general), which is what I expect most of the heightened affinity between Northern Europeans and Amerindians found by Reich et al. is attributable to.

Testosterone is known to influence brain development and reproductive physiology but also plays an important role in social behavior [4]–[9]. While most studies have investigated a potential association between testosterone and aggressive behavior, two recent studies suggest that testosterone may also increase prosocial behavior or lead to less selfish behavior in certain situations [6], [9]. We therefore investigate a link between testosterone and self-serving lying. A prominent interpretation of the existing evidence on the role of testosterone in social behavior is that the hormone enhances dominance behavior, i.e., behavior intended to gain high social status [6]–[8], [10]–[14], which in humans can be aggressive or prosocial depending on the context. Recent research suggests that pride may have evolved as an affective mechanism for motivating such status seeking behavior [15]. Pride is indirectly linked to status seeking because it is an inward directed emotion that signals high status or ego. It has been speculated that testosterone helps translate such motivation into action, for example, in acts of heroic altruism [16], [17]. Importantly, an effect of testosterone on behavior via pride should also work if behavior cannot be observed by others and an individual’s status in the eyes of the others may therefore not be directly affected. [. . .]

Our main finding is a lower incidence of self-serving lies in the testosterone group. [. . .]

While we can rule out a belief effect we cannot ultimately conclude whether our findings are driven by a direct influence of testosterone on prosocial preferences or via increased status concerns. A potential interpretation for our findings is that testosterone administration affects a concern for self-image [25], or pride [16], i.e., enhances behavior which will make a subject feel proud and leads to the avoidance of behavior considered “cheap” or dishonorable. Subjects in our testosterone group may therefore lie less. This is intriguing because pride could be an affective mechanism underlying a link between testosterone and dominance behavior. An interpretation of our findings in terms of pride is in line with anecdotal and correlational evidence indicating that testosterone plays a positive part in heroic altruism [17]. It is also in line with reports that high testosterone individuals display more disobedient behavior in prison environments where proud individuals may be less willing to follow the strict rules and comply with orders [26], [27]. Finally, a relation between pride, testosterone, and the willingness to engage in “cheap” behavior also fits the observation that the five inmates with the lowest testosterone levels in a sample of 87 female prison inmates were characterized as “sneaky” and “treacherous” by prison staff members [27]. Further experiments manipulating whether lying is an honorable action (e.g., lying for charity) or not (lying for self) are needed to clarify the role of pride in the effect of testosterone on human social behavior. An alternative interpretation of our results, which we cannot rule out, is that testosterone has a direct effect on prosocial behavior, making people more honest per se.

The researchers compared the results from the testosterone group to those from the control group. "This showed that the test subjects with the higher testosterone levels had clearly lied less frequently than untreated test subjects," reports the economist Prof. Dr. Armin Falk, who is one of the CENS co-directors with Prof. Weber. "This result clearly contradicts the one-dimensional approach that testosterone results in anti-social behavior." He added that it is likely that the hormone increases pride and the need to develop a positive self-image. "Against this background, a few euros are obviously not a sufficient incentive to jeopardize one's feeling of self-worth," Prof. Falk reckons.

As with the Old Testament patriarch who gave birth to a nation, it all began with Abraham, whose forebears were from the town of Duersen in northern Brabant. Known in official documents as “Abraham the miller,” or “Abraham Pieterszen,” as in son of Peter, he landed on the island of “Manatus” some time before February 1627. Nearly 400 years later, he has more than 200,000 descendants over 15 generations scattered across the Americas, according to several genealogical experts who have built on intensive studies of the family over the centuries. In the 1880 census, there were 3,000 heads of household with the name Van Dusen — or Van Deusen, Van Deursen, Van Duzer and other common variants — all, the experts say, traceable back to Abraham the miller.

Theirs is among a small cohort of large, long-running Dutch families — including under-the-radar Rapeljes, with more than a million descendants, and the more prominent Kips and Rikers, with their names on neighborhoods and institutions — whose well-documented histories provide a compelling window into the development of what would become New York and, later, the United States.

Two of Abraham’s progeny — Martin Van Buren, a great-great-great-grandson; and Franklin Delano Roosevelt (add four more greats) — served as presidents of the United States. A third, Eliza Kortright (Generation 7), married one, James Monroe. Egbert Benson (Generation 6) was the first attorney general of postcolonial New York. The Rev. Dr. Henry Pitney Van Dusen, a theologian (Generation 10), made the cover of Time magazine in 1954.

There were family members on both sides of the early border wars between New York and Massachusetts, the War of Independence and the Civil War. At the Battle of Gettysburg, Pvt. William Jackson Raburn of Indiana’s “Fighting 300” died of a gunshot wound on July 2, 1863; a day later, Matthew Henry Van Dusen — Raburn’s fourth cousin twice removed (by marriage) — a “reb” with the fabled Hood’s Texas Brigade, was sidelined with a head injury.

Cornelis Kortright (Generation 5) owned slaves accused of participating in a “Negro plot” in 1741. Jan Van Deusen Jr., Kortright’s second cousin, saved New York’s historical records when the British burned the state’s first capital to the ground in 1777. [. . .]

Phoebe shares her father’s fascination with the family, particularly since she read some of the excerpts from her great-great-great-great-grandfather’s Civil War diary. “It kind of amazed me that I knew someone who was part of what I was studying in school in textbooks,” she said. “A lot of my friends’ parents just came here and don’t speak English yet. And some came here two generations ago. The one who has been here the longest came from Scotland, and that’s only a hundred years.”

Tim Tebow arrives in New Jersey, where the Jets practice and play, as the world’s most famous backup quarterback. It is a homecoming, of sorts, centuries in the making, because Tebow appears to be the great-great-great-great-great-great-great-great-grandson of a man from Hackensack.

MetLife Stadium, home of the Jets and the Giants in East Rutherford, is about 10 miles from where an immigrant, Andries Tebow (spelled variously as Thybaut, Tibout, TeBow and other derivations), settled down after landing from Europe in the late 1600s. One of his children was Pieter, born in Hackensack and baptized there in 1696, records show.

More than 300 years and 10 generations later, Tim Tebow brings the family name full circle, according to the amateur genealogist — and Tebow’s fourth cousin, once removed — Dean Enderlin. [. . .]

It is unclear how much Tebow knows about his genealogy. While his own recent background is well chronicled — born to Christian missionaries in the Philippines, raised in Florida, now a preacher in a championship quarterback’s body — little has been examined about his deeper roots.

But there is no doubt that early generations of Tebows settled in what is now Bergen County, and Tim Tebow appears to be the latest link in a long chain of North Jersey arrivals. [. . .]

Enderlin said that, like many Tebows in the country, he and Tim Tebow can be traced to Andries Tebow, who sailed to the New World out of Bruges, Belgium. Enderlin is unsure where Andries lived — either Belgium or Holland — but he believes his family was Walloon, a French-speaking minority rooted in southern Belgium.

“Belgium was governed by the Catholic rulers of Spain and persecuted Protestants, forcing many to flee,” Myra Vanderpool Gormley wrote in an article for Genealogy Magazine titled, “Belgian Migrations: Walloons Arrived Early in America.”

“Many went to the northern parts of the Netherlands,” she wrote. “It was from their exile in Holland that they emigrated again.”

When responding to the Census, more than five million Americans claim to be of Dutch descent. And they mostly are, at least a little. Now you might wonder how they compare with the Dutch back in the Netherlands: you might wonder about the relative academic or economic success of these two groups, which presumably have a common ancestry. But you would be wrong to do so. You would be comparing apples and House of Orangemen.

There were four or five different Dutch waves of settlement in this country. The first is pretty well-known, the Dutch colony in New York. Of course, it was only about half Dutch in origin: the rest were Walloons and French Huguenots. Lots of people have some ancestry from that group, including people I know. Why, if there was any justice, Henry Harpending would own a fine farm on Manhattan Island right now.

Of course, Henry isn’t all that Dutch. His surname is. He comes from an area of New York State that really did have some Dutch settlement. The thing is, white Protestants in this country have been intermarrying rather freely for several hundred years: it is rare to find someone in that category whose ancestors all come from one ethnicity. I would be surprised if Henry is 1/8th Dutch. In much the same way, my patrilineal lineage is Ulster Scot (who fears mention the battle of the Boyne!?) but the rest includes English, Welsh, Scottish, Green Irish, and a component that, I suspect, only became Dutch in 1918, and was Bavarian before that. We’re talking about ye olde Americans, not Ellis Island types. Not that they haven’t mixed as well, but less so… [. . .]

Most of the people who self-identify as Dutch-Americans are mostly something else. Why? Sometimes a family tradition, or a surname, but more than anything else, fashion.

Fashions change. For example, the fraction of Americans who report English ancestry has dropped drastically since 1980 – so much so that you would have to wonder about secret death camps if you took it seriously. But it’s fashion. I looked at the census numbers for my home county, and then looked at the phone book: Census result was 20% English ancestry, real number was more like 80%. Of course this means that people in the US claiming a particular ethnicity can not only have limited ancestry from that group, but be oddly unrepresentative as well.

I would probably put “Dutch” on a census form if an answer were required. I am either 1/32 or 1/64 Dutch, and worse the supposed Dutch ancestor was a Huguenot or something like that, so I am likely really 0% Dutch. No matter…….

I've commented on this phenomenon before (e.g.), but a periodic reminder is useful. I don't see a problem with someone identifying with his patrilineal national origin for census purposes while remaining aware of his overall ancestry. What I find irritating is the eagerness of some with American ancestry to identify as "Scotch-Irish" after reading a review of Albion's Seed, or "Celtic" in the name of Celtic Southronism, or "German" because they had a German great-grandfather, and then declare themselves at war with or at least safely distinct from evil/culpable "WASPs" / "Anglo-Saxons" (which appellations in reality describe the core of the breeding population from which the newly self-identified Borderer/Celt/German sprung).

The Myth of Random Mating: Evidence of ancestry-related assortative mating across 3 generations in Framingham, MA.
R. Sebro1,2, G. Peloso3,4, J. Dupuis5,6, N. Risch1,7,8
1) Institute for Human Genetics, University of California, San
Francisco, San Francisco, CA; 2) Department of Radiology and Biomedical
Imaging, University of California, San Francisco, San Francisco, CA; 3)
Center for Human Genetic Research, Massachusetts General Hospital,
Boston, MA; 4) Program in Medical and Population Genetics, Broad
Institute, Cambridge, MA; 5) Department of Biostatistics, Boston
University School of Public Health, Boston MA; 6) The National Heart,
Lung and Blood Institute’s Framingham Heart Study, Framingham, MA; 7)
Department of Biostatistics and Epidemiology, University of California,
San Francisco, San Francisco, CA; 8) Division of Research, Kaiser
Permanente, Oakland, CA.

The factors that influence spouse selection are important to
geneticists because the mating pattern determines the genetic structure
of a population. There has been evidence of positive assortative mating
(PAM) related to several phenotypic traits like height.
Ancestrally-related PAM is necessary for genetic population
stratification, which means spouses are more likely to share genes of
common ancestry. Prior studies have shown strong ancestry-related
assortative mating among Latino populations. Here, Caucasian spouse
pairs from the Framingham Heart Study (FHS) Original and Offspring
cohorts (N=885) genotyped on Affymetrix 500K were analyzed using
principal components (PC) analysis. Data from individuals genotyped in
HapMap and the Human Genome Diversity Project (HGDP) were projected onto
these PCs to facilitate interpretation. Based on these and other data,
the first principal component delineates the prominent
northwest-to-southeast European cline. In our data, there was clear
clustering on this axis, probably separating individuals of
English/Irish/German ancestry from those of Italian ancestry. The second
principal component also reveals strong clustering, and likely reveals
individuals of Ashkenazi Jewish ancestry. In the Original (older)
cohort, there is a very strong correlation in PC1 between the spouses
(r=0.73, P=2e-22) and also for PC2 (r=0.80, P=4e-29). In the Offspring
cohort the spouse correlations were lower but still highly significant:
r=0.38, P=3e-28 for PC1 and r=0.45, P=9e-40 for PC2. Examination of
scatter plots for spouse pairs in the two generations reveals both a
reduction in clustering and lower but still evident correlation in the
Offspring cohort. Of genetic impact, we observed highly significant
Hardy-Weinberg disequilibrium (homozygote excess) for SNPs loading
heavily on PC1 and PC2 across 3 generations, and also highly significant
linkage disequilibrium between the same set of SNPs located on
different chromosomes. These results are consistent with demographic
patterns of social homogamy which have existed in Framingham over
several generations, and a general trend of reduced homogamy over time.
While Framingham is not representative of the general US population, its
historic mating patterns serve as a reminder that assumptions of
Hardy-Weinberg and linkage equilibrium need to be made with caution when
applied to genetic loci that are related to ancestry in any population.
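
The homozygote excess described in the abstract above can be illustrated with the textbook limiting case, in which two ancestry groups do not intermarry at all (the Wahlund effect). The following is a minimal sketch, not the authors' analysis: the function names, the two-group setup, and the allele frequencies are hypothetical choices made for illustration. Each group is assumed to be in Hardy-Weinberg equilibrium on its own; pooling them produces fewer heterozygotes than the pooled allele frequency predicts.

```python
def pooled_heterozygosity(p1, p2, w1=0.5):
    """Observed heterozygote frequency when two groups, each in
    Hardy-Weinberg equilibrium, are pooled without intermarriage.

    p1, p2: frequency of the same allele in group 1 and group 2 (hypothetical).
    w1: share of the pooled sample that comes from group 1.
    """
    w2 = 1.0 - w1
    return w1 * 2 * p1 * (1 - p1) + w2 * 2 * p2 * (1 - p2)


def heterozygote_deficit(p1, p2, w1=0.5):
    """F = 1 - observed/expected heterozygosity in the pooled sample.

    F > 0 means a heterozygote deficit, i.e. a homozygote excess.
    """
    w2 = 1.0 - w1
    p_bar = w1 * p1 + w2 * p2             # pooled allele frequency
    expected = 2 * p_bar * (1 - p_bar)    # HWE expectation for the pool
    observed = pooled_heterozygosity(p1, p2, w1)
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Same allele frequency in both groups: no departure from HWE.
    print(heterozygote_deficit(0.3, 0.3))
    # Strongly ancestry-differentiated SNP: clear homozygote excess.
    print(heterozygote_deficit(0.2, 0.8))
```

With partial rather than complete assortative mating the deficit would be smaller, which fits the weaker spouse correlations reported for the Offspring cohort.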

Little is known about the connections between DNA and disease in
African Americans, in part because most genetics research has involved
only those of European ancestry. Greater understanding of such
connections could improve diagnoses and lead to opportunities for more
personalized health care. In 2011 23andMe, Inc., a personal genomics and
research company, launched the Roots into the Future initiative, which
aims to enroll 10,000 African Americans in an innovative research
project. The study seeks to determine whether genetic associations
previously identified in Europeans are relevant to African Americans and
to discover other genetic markers linked to conditions of particular
relevance to the African American community. Currently the 23andMe
cohort includes nearly 10,000 African Americans, over 5700 of whom were
recruited through the Roots into the Future initiative. Each of these
individuals (58% female, 42% male; mean age: 44) has submitted a saliva
sample for genotyping via 23andMe’s custom genotyping array, which
includes approximately 1 million single nucleotide polymorphisms.
Participants are currently contributing information about their health
and traits through online surveys. To date over 6200 participants have
completed an average of 10.6 surveys. Using the genetic data we
estimated the percent African and European ancestry of each participant.
Median estimates were 73% and 23% respectively (with 4% uncertain). As
expected, the higher a person’s proportion of European ancestry, the
greater the chance that person carries variants that are more common
among Europeans than among Africans, such as those linked to
HIV-resistance and alpha-1 antitrypsin deficiency. Furthermore, the
higher a person’s proportion of African ancestry, the more likely that
person reported having curly hair, high blood pressure and type 2
diabetes, and the less likely that person reported having facial
wrinkles, rosacea and Parkinson’s Disease. Based on data for over 8700
individuals likely to self-identify as African American, we replicated
over 25 genetic associations reported previously for African Americans,
including those for body-mass index, type 2 diabetes, lupus, height, and
osteoporosis. For conditions for which we have already accrued at least
500 cases among this cohort, such as asthma, migraines, and uterine
fibroids, we anticipate having power either to replicate associations
identified through previous studies of Europeans or to find new
associations.

Known discoveries from genome-wide association studies have limited
predictive ability for individual traits, but recent estimates of “hidden heritability”
suggest that in the future performance of predictive models can be
potentially enhanced by incorporation of a large number of SNPs each
with individually small effects. We use a novel theoretical model,
discoveries from the largest genome-wide association studies and recent
estimates of hidden heritability to project the predictive performance
of polygenic models for ten complex traits as a function of the number
and distribution of effect sizes for the underlying susceptibility SNPs,
the sample size of the training dataset and the balance of true and
false positives associated with the SNP selection criterion. We project,
for example, that while 45% of the total variance of adult height has
been attributed to common variants, a predictive model built based on as
many as one million people may only explain 33.4% of variance of the
trait in an independent sample. For rare highly familial conditions,
such as Type 1 diabetes and Crohn’s disease, risk models including
family history and optimal polygenic scores built based on current GWAS
can identify a large proportion (e.g. 80-90%) of cases by targeting a
small group of high-risk individuals (e.g. subjects with top 20% risk).
In contrast, for more common conditions with modest familial components,
such as Type 2 diabetes (T2D), coronary heart disease (CAD) and
prostate cancer (PrCA), risk models built based on GWAS with current or
foreseeable sample sizes (e.g. triple in size) can miss a large proportion
(>50%) of cases by targeting a small group of high-risk individuals.
For these common diseases, the proportion of the population that can be
identified to have 2-fold or higher risk than an average person in the
population ranged between 1.1% (CAD) and 7.0% (PrCA) for polygenic
models built based on current GWAS. If the sample size for future
studies could be tripled, these proportions could range between 6.1%
(CAD) and 18.8% (T2D). Our analyses suggest that the predictive utility
of polygenic models depends not only on heritability, but also on
achievable sample sizes, effect-size distribution and information on
other risk-factors, including family history.

The specific factors influencing human sexual partnering are poorly
understood. Arguably, in the pre-modern era, multiple mating may have
been tied to selection for traits related to survival including
resistance to infection and starvation, strength, and certain behaviors.
Recently, we completed a GWAS using the Illumina Omni-Quad microarray
in ~5800 African- and European-American (AA and EA) participants in
genetic studies of alcohol, cocaine, and opioid dependence. Subjects
were interviewed using the Semi-Structured Assessment for Drug
Dependence and Alcoholism (SSADDA) - an instrument that covers all major
DSM-IV diagnoses as well as other numerous psychiatric and lifestyle
traits. One of these is the response to: “How many sexual partners have you had in your life?”
Association of age-adjusted residuals of this variable with more than 3
million SNPs reliably imputed using the 1K Genomes reference panel was
tested in each sex*population subgroup using generalized estimating
equations. Results from subgroup analyses were combined by meta
analysis. SNPs with p-values <1E-06 were genotyped in a replication
sample of ~2300 subjects. Genomewide-significant results were obtained
for 13 SNPs including ones that map to genes coding proteins involved in
reproductive-related functions (e.g., rs74738626 in KCNU1 which encodes
a testes-specific K+ channel [p=1.2E-12], rs78227383 in NME5, a
nucleoside diphosphate kinase which may have a specific function in the
phosphotransfer network involved in spermatogenesis [p=4.0E-11 in EAs
only], and rs76221611 in CCND2 which encodes cyclin D2, shown to be
highly expressed in ovarian and testicular tumors [p=3.3E-11 in AAs
only]), immune response (e.g., rs2709778 in GARS which encodes
glycyl-tRNA synthetase shown to be a target of autoantibodies in human
autoimmune diseases [p=1.0E-10 in males only]), and other genes of
biological interest (e.g., rs10849971 in ALDH2, an alcohol-metabolizing
enzyme that is also an alcohol dependence risk locus [p=9.6E-09 in
females only]). These findings have clear implications with respect to
normal sexual function and potentially for risk of sexually transmitted
disease.

Recent studies in populations of European ancestry have shown that
30-50% of heritability for human complex traits such as height (Yang et
al. 2010) and body mass index (Yang et al. 2011), and common diseases
such as schizophrenia (Lee et al. 2012) and rheumatoid arthritis (Stahl
et al. 2012) can be captured by common SNPs, and that genetic variation
can be attributed to chromosomes, in proportion to their length. Using
genome-wide estimation and partitioning approaches, we analyzed 49 human
quantitative traits, many of which are relevant to human diseases, in
7,170 unrelated Korean individuals genotyped on 326,591 SNPs. For 43 of
the 49 traits, we estimated a significant (P < 0.05) proportion of
variance explained by all SNPs (h2G). On average across 47 of the 49
traits for which the estimate of h2G is non-zero, 13.4% (range of 3.4%
to 31.6%) of phenotypic variance can be explained by all the SNPs being
analysed, or approximately one-third (range of 7.8% to 76.8%) of narrow
sense heritability. In contrast, on average across 25 of the 49 traits,
the top associated SNPs at genome-wide significance level (P < 5e-8)
explain 1.5% (range of 0.5% to 3.8%) of phenotypic variance. The
majority (~92%) of explained variation estimated from all SNPs is
captured by the SNPs with p-values < 0.031 in single SNP association
analyses. Longer genomic segments tend to explain more phenotypic
variation, with a correlation of 0.78 between the estimate of variance
explained by individual chromosomes and their physical length. This
correlation was stronger (0.81) for intergenic regions. Despite the fact
that there are a few SNPs with large effects for most traits, these
results suggest that polygenicity is ubiquitous for most human complex
traits, and that a substantial proportion of heritability is captured by
common SNPs.

Background. Much has been written about the so-called “missing heritability”
for complex traits. Nowhere is this more pertinent than for alcohol and
nicotine dependence (AD, ND) for which there are estimates of
heritability of up to 65% from twin studies, yet few causal variants
have been replicated from GWAS studies, despite large sample sizes,
suggesting that individual effect sizes of SNPs must be very small.
Recently new statistical genetic techniques have been developed which
allow estimation of the total variance associated with all SNPs on a
GWAS chip, but this has yet to be applied to AD and ND. Methods. The
current analysis is based on AD and ND symptom count data from over 8000
participants in our population-based twin-family studies who have used
either alcohol or cigarettes at some stage of their lives. They were
individually genotyped with Illumina 370K or 660K chips and 7.034M
genotypes were imputed from HapMap 3 and 1000-Genomes data. The GCTA
program of Yang, Visscher et al is used first to detect the degree of
relatedness between apparently unrelated subjects, based on a set of
about 300,000 SNPs pruned for LD. Phenotypic similarity is then
regressed on IBS sharing for all possible relative pairs to estimate the
total amount of variance due to SNPs on the chip. Results. Based on
GCTA analysis for other complex traits we expect to find SNP associated
variance accounting for about half the heritability estimated from
conventional genetic epidemiology designs. However, these estimates are
highly sensitive to population stratification so great care will be
taken to remove all traces of population stratification during the
analysis. Conclusions. The gap between the SNP-associated variance
estimated by GCTA and twin and family estimates of heritability is most
likely due to several factors. First, the tag SNPs on the chip are not
in perfect LD with the causal SNPs; for other traits, simulation has
shown that correcting for imperfect LD raises the SNP “heritability” by
about 10%. Another major factor is that commercial chips only
interrogate common SNPs so large effects of rare SNPs are simply not
captured. Reasonable estimates from simulations suggest that this could
account for another 20% of variance. Finally, we recognize that there
are large sections of the genome containing highly repetitive DNA which
are very poorly tagged by current chips, and where substantial
proportions of genetic variance may be hidden.
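
The regression step described in the Methods above (phenotypic similarity regressed on genetic sharing across all pairs) resembles what is usually called Haseman-Elston regression. The following is a minimal sketch under stated assumptions, not GCTA's actual implementation: the function name is hypothetical, "phenotypic similarity" is taken to be the product of standardized phenotypes, and `relatedness` is assumed to be a precomputed symmetric matrix of pairwise genetic sharing. The slope of the regression serves as a rough estimate of the share of phenotypic variance tagged by the SNPs.

```python
from itertools import combinations
from statistics import fmean, pstdev


def haseman_elston(phenotypes, relatedness):
    """Slope from regressing z_i * z_j on relatedness[i][j] over all pairs i < j.

    phenotypes: list of trait values, one per individual.
    relatedness: square matrix (list of lists) of pairwise genetic sharing;
        only the off-diagonal entries are used.
    """
    mean = fmean(phenotypes)
    sd = pstdev(phenotypes)
    z = [(y - mean) / sd for y in phenotypes]   # standardized phenotypes

    xs, ys = [], []
    for i, j in combinations(range(len(z)), 2):
        xs.append(relatedness[i][j])            # genetic sharing for the pair
        ys.append(z[i] * z[j])                  # phenotypic similarity

    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx                            # ordinary least-squares slope
```

In real data the off-diagonal relatedness values among "apparently unrelated" subjects are tiny, so thousands of individuals are needed for a stable slope, and any ancestry structure in the sample will bias it, which is the stratification concern the abstract raises.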

Background: Compared to European Americans (EA),
African-Americans (AA) have stiffer peripheral vessels, reflected in
reduced carotid distensibility coefficient (DC). To determine whether
this racial difference may be genetically determined, we examined the
extent to which the variance in carotid distensibility in AA could be
explained by EA admixture at either a global or a local genomic level. Methods:
We examined data from 344 AA, 62% women, aged 25-76 years, enrolled in a
large study (GeneSTAR) of apparently healthy people with a family
history of early-onset coronary artery disease. DC was assessed using 2D
ultrasound, calculated as 2*(fractional change in diameter from
diastole to systole)/(systolic -diastolic blood pressure). By its
calculation DC is inherently corrected for blood pressure levels. EA
admixture was determined using a panel of 50,000 ancestry informative
markers (deCODE Genetics), and local ancestry was calculated on Illumina
Human 1M genomewide SNP panel using LAMP. Associations of
log-transformed DC were tested using mixed model regressions adjusted
for age, sex, sex*age interaction and within-family correlations. LAMP
models were adjusted for population stratification PCAs derived from the
Illumina 1M SNPs (EIGENSTRAT). Results: The median [interquartile range] of the DC was 0.0017 [0.0012-0.0024] mmHg⁻¹.
Every 10% incremental level of EA admixture was associated with 5%
higher DC (95% CI: 1% to 9%, p=0.005), reflecting more distensibility,
and less stiffness. In genomewide local ancestry analysis adjusted for
sex, age, sex*age interaction, population stratification PCAs and
within-family correlations, of 2756 genome segments in local ancestry
LD, the highest association for local ancestry was found in Chromosome
8, positions 8.3M to 10M (Build 37.3), p=0.0012. On adjusting for local
ancestry in this region, population stratification PCA1 representing
global Caucasian ancestry was no longer significantly associated with DC
(p=0.93). Conclusions: The racial difference in arterial
distensibility between AA and EA is likely to have a basis in genetic
admixture. We have identified a candidate region on chromosome 8 that
may be responsible for this global admixture association.
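
The DC formula quoted in the Methods can be sketched in a few lines. This is only an illustration: it assumes that "fractional change in diameter" is measured relative to the diastolic diameter, and the function name and input values are hypothetical, not taken from the GeneSTAR data.

```python
def distensibility_coefficient(d_diastole_mm, d_systole_mm, sbp_mmhg, dbp_mmhg):
    """DC = 2 * (fractional diameter change) / pulse pressure, in mmHg^-1.

    Assumption: the fractional change is taken relative to the diastolic diameter.
    """
    pulse_pressure = sbp_mmhg - dbp_mmhg
    if pulse_pressure <= 0 or d_diastole_mm <= 0:
        raise ValueError("need systolic > diastolic pressure and a positive diameter")
    fractional_change = (d_systole_mm - d_diastole_mm) / d_diastole_mm
    return 2.0 * fractional_change / pulse_pressure


# Hypothetical measurements: 6.0 mm at diastole, 6.3 mm at systole, 120/70 mmHg.
dc = distensibility_coefficient(6.0, 6.3, 120.0, 70.0)
print(f"DC = {dc:.4f} mmHg^-1")
```

Because pulse pressure sits in the denominator, the measure is scaled by blood pressure by construction, which is the point the abstract makes about DC being inherently corrected for pressure levels.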

Low frequency variants (MAF <5%) likely contribute to
susceptibility for complex traits, but their study is challenging in
admixed populations. We hypothesize that population isolates that have
experienced bottlenecks would have an enrichment of specific low
frequency variants some of which could be predisposing to complex
traits. This enrichment could benefit especially identification of
variants with recessive effects. To test this hypothesis, we studied
homozygous deletions in a prospective birth cohort from an isolated
Northern Finnish population (N=4,931). Because the role of rare deletions
in abnormal neuronal development is clearly established, we constrained
our initial analysis to seven supposedly relevant phenotypes including
diagnosis of schizophrenia, intellectual deficit, learning difficulties,
epilepsy, neonatal convulsion, impaired hearing and cerebral
palsy/perinatal brain damage. The analysis included 32,487 homozygous
deletions in 205 loci of which 11% included exons of one or more genes.
Among the seven traits studied, the strongest association was found with
impaired hearing and a deletion on 15q15.3, overlapping STRC,
previously associated with deafness (p = 10^-4). The largest identified
homozygous deletion was 240 kb on 22q11.22 and was associated with
intellectual deficit (p<0.02). The deletion showed significant
regional enrichment in an internal north-eastern isolate with 3-fold
risk of schizophrenia compared to elsewhere in the country. Follow up of
the deletion in 265 schizophrenia patients and 5140 controls revealed
an allelic association with schizophrenia (p= 0.02, OR = 1.9) and was
further replicated in 9,539 cases and 15,677 controls of European origin
(p = 0.03, OR = 2.1). After screening over 13,106 Finns, we identified
four individuals homozygous for the deletion, all diagnosed with
schizophrenia and/or intellectual disability. The deletion overlaps a
gene encoding TOP3B and was found to downregulate its expression to
half among heterozygous carriers and to zero in homozygous carriers
(p < 10^-10). Our results demonstrate the effect of multiple consecutive
population bottlenecks in the enrichment of sizable deletions
contributing to abnormal neuronal development. In addition the findings
highlight the usefulness of population isolates in studying rare and low
frequency variants in complex traits.

Genome-Wide Expression Profiles (GWEPs) have been assayed in a
growing number of cohort studies, but few attempts have been made to
meta-analyse and cross-validate expression datasets. Consequently, many
expression studies have been underpowered. Therefore, we established a
large-scale multi-cohort GWEP meta-analysis. The aim of this study was
to robustly identify novel gene expression signatures associated with
age and sex, two major risk factors for many diseases. We analyzed 6,993
European-ancestry PAXgene (whole-blood) samples from 6 cohort studies
(RS, FHS, EGCUT, KORA, SHIP, INCHIANTI). GWEPs were quantile-normalized,
log2-transformed, probe-centered and sample-z-transformed prior to
analysis. In the discovery stage we meta-analysed age- and
sex-associated signals for samples hybridized to an Illumina or
Affymetrix array separately. All analyses were adjusted for plate ID,
RNA quality, fasting- and smoking status, and cell counts (when
available). The age analysis was additionally adjusted for sex. All
significant signals were cross-validated between the Illumina and
Affymetrix platforms. We examined the top-associated GWEPs in 3
additional studies: HVH (n=348), GARP (n=134), and NIDDK/PIMA (n=1457).
We identified 396 age-associated transcripts with p<1E-5 and the same direction of effect on both platforms. NELL2,
a protein kinase C-binding protein, was the most significant result
with gene expression levels decreasing with age (Illumina p=8.2E-81,
Affymetrix p=3.2E-64). NELL2 is involved in cell growth
regulation and differentiation, and there is evidence for developmental
fluctuation in puberty. We identified 347 transcripts differentially
expressed between males and females (p<1E-5, same direction on both platforms), of which >200 map to sex chromosomes. The top autosomal gender-differentiated transcript is DACT1, which has higher mRNA levels in females (Illumina p=2.4E-47, Affymetrix p=1.6E-75). DACT1
is an antagonist of beta-catenin and prior work indicates it to be
differentially methylated in testes. It is a biomarker for semen, and DACT1 knockout mice showed developmental defects. Both the NELL2 and the DACT1
signals were replicated in all 3 additional cohorts. With the GWEP
meta-analysis, we gained power relative to individual cohort analyses,
and were able to identify novel replicable significant age- and sex-
associated loci. These loci may have implications for age-related
disease biology, gender biology, and in sample forensics.

Melanin pigmentation plays an important role in shielding the body
from ultraviolet (UV) radiation and may serve as a scavenger for
reactive oxygen species. More than 150 genes have been implicated in
determining pigmentation in mice, including transcription factors, membrane and
structural proteins, enzymes, and several kinds of receptors and their
ligands, most of which have human orthologues. Although many molecular
mechanisms involved in melanin pigmentation are being determined,
relatively little is understood about the genetic component responsible
for variations in skin color within or between human populations. First,
in order to reveal their genetic contribution to skin color, we
examined the association between pigmentation-related gene variants and
variations in the melanin index in members of the general Japanese
population whose skin color was objectively measured by reflectometry.
Multiple regression showed that OCA2 A481T rs74653330 (p = 6.18e-8) and OCA2
H615R rs1800414 (p = 5.72e-6) were strongly associated with the mean of
the melanin index in the female population. Three variants (SLC45A2 T500P rs11568737 p = 0.048, OCA2 T387M p = 0.015, TYR
D125Y rs13312741 p = 0.022) were also significantly associated with
melanin index. However, no significant associations were found between
age and melanin index for variants of MC1R. Second, we evaluated
the associations between pigmentation-related gene variants and the risk
of skin cancer. The statistical analysis revealed that only OCA2
H615R was associated with the risk of all skin cancers, especially
malignant melanoma. We could not find any statistical significance in
the associations of other variants, including OCA2 A481T, or
melanin index with the risk of skin cancer. This is the first report on
the association between the genetic variants in pigmentation genes and
the risk of skin cancer in an East Asian population.

You may contact the first author (during and after the meeting) at
tamsuz@med.id.yamagata-u.ac.jp

The derived (A111T) variant of SLC24A5 is associated with lighter
skin pigmentation compared to the ancestral allele. A111T is fixed or
nearly fixed in most European, North African and Middle Eastern
populations, extending east to Pakistan. In Europeans, a large genomic
region of diminished variation on chromosome 15, nearly 150 kb in
extent, includes SLC24A5. We analyzed the haplotypes in this region
using existing genomic data. Eleven haplotypes, defined on the basis of
16 SNPs that span a 76 kb genomic region in which recombination was
rare, account for 95% of the total. A single haplotype (here called C11)
carries A111T, suggesting that its origin did not long predate the
onset of selection. Haplotype C11 was the product of recombination
between haplotypes C3 and C10, followed by the A111T mutation. C3 and
C10 are both present in East Asia and the New World but virtually absent
in Africa, suggesting that C11 originated outside of Africa, most
likely in the Middle East. The current distribution of A111T is
consistent with the view that it originated after the divergence between
populations that settled Europe and those that settled East Asia.

You may contact the first author (during and after the meeting) at
vac3@psu.edu

Session Descriptions:
Identity by descent (IBD) is fundamental
to genetics and has diverse applications. Recently developed
statistical methods and genome-wide SNP data have made it possible to
detect haplotypes shared identically by descent between individuals with
common ancestry up to 25-50 generations ago. With sequence data, shared
haplotypes from even more distant ancestry can be detected. Patterns of
IBD segment sharing within and between populations reveal important
population demographic features including recent effective population
size and migration patterns. IBD segment sharing is directly relevant to
disease gene mapping and estimation of heritability. Individuals who
share a genetic basis for a trait are more likely to have IBD sharing
compared to randomly chosen individuals, and this forms the basis for
IBD mapping and heritability estimation. Analysis of data from extended
pedigrees was extremely difficult with standard linkage approaches, but
is now possible using approaches based on detected IBD segments.
Detected IBD can be present across pedigrees, which enhances power to
detect association with the trait. Further, in population samples there
is potential to utilize detected IBD segments to improve power to detect
association when multiple variants within a gene influence the trait.
IBD segments can also be used to greatly improve haplotype phase
estimates, which is critical to understanding the functional consequence
of genetic variation. IBD-based long-range phasing has previously been
shown to be effective in isolated populations such as Iceland, but
recent advances have extended its application to large outbred
populations. In this session, we explore these exciting new
developments.

Genome-wide association studies (GWAS) provide a powerful tool to
assess genetic associations between common marker alleles and complex
traits in large numbers of individuals. Typically these studies have
focused on testing the markers in the 22 autosomal chromosomes while the
X-chromosome has been omitted from the analyses. The chromosome X,
however, constitutes approximately 5% of genomic DNA encoding for more
than 1000 genes, and thus also likely contains genetic variation
contributing to common traits and disorders.
We set out to test associations between 560,000 genotyped and imputed SNP markers and eight
anthropometric (BMI, stature, WHR) and biochemical (CRP, HDL, LDL, TC,
TG) traits in 14,710 individuals (7468 males, 7242 females) from five
Finnish cohorts.
A region in chromosome Xq21.1 was associated with adult stature (meta-analysis p-value = 3.32×10^-10).
The lead SNP in the locus explained up to 0.55% of the variance in
height in 31-year-old women corresponding to 1.09 cm difference between
minor and major allele homozygotes. The associated lead variant (MAF =
0.31) is located upstream of ITM2A, a gene encoding a
membrane protein that plays a role in osteo- and chondrogenic
differentiation. As this is among the first studies using the X
chromosome reference haplotypes from the 1000 Genomes project, we are
currently validating the imputation with genotyping methods.
The findings pinpoint the value of including chromosome X in the GWAS of
complex traits to identify further relevant gene regions that also
account for some of the missing heritability. The study illustrates that
the 1000 Genomes reference haplotypes allow for high-resolution
investigations of the genetic variants in chromosome X even with
relatively modest sample sizes compared to current-day GWAS
meta-analyses. Our finding demonstrates that the same analysis strategy
is also likely to be useful in the meta-analyses of the large consortia
with complex traits.
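
As a rough consistency check on the reported effect size, the figures for the lead SNP can be related through the standard additive-model expression for variance explained, 2p(1-p)β². The sketch below is not the authors' calculation: it assumes an additive model with Hardy-Weinberg genotype frequencies in women, treats the per-allele effect as half the reported homozygote difference, and uses hypothetical variable names.

```python
import math


def variance_explained_additive(maf, per_allele_effect):
    """Genotypic variance of an additive biallelic SNP under Hardy-Weinberg: 2p(1-p)*beta^2."""
    return 2.0 * maf * (1.0 - maf) * per_allele_effect ** 2


maf = 0.31                       # minor allele frequency reported in the abstract
homozygote_difference_cm = 1.09  # reported difference between the two homozygotes
per_allele_effect_cm = homozygote_difference_cm / 2.0  # assumption: purely additive

snp_variance_cm2 = variance_explained_additive(maf, per_allele_effect_cm)

# If this SNP explains 0.55% of height variance, the implied phenotypic SD follows.
fraction_explained = 0.0055
implied_sd_cm = math.sqrt(snp_variance_cm2 / fraction_explained)
print(f"SNP variance: {snp_variance_cm2:.3f} cm^2, implied height SD: {implied_sd_cm:.1f} cm")
```

Under these assumptions the implied standard deviation comes out just under 5 cm, so the quoted 0.55% and 1.09 cm are at least mutually plausible for a cohort of same-age women.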

Adult human height is a highly heritable polygenic trait. Previous
genome-wide analyses have identified 180 independent loci explaining an
estimated 1/8th of the heritable component (80%). Our aims were a) to
increase the understanding of the role of common genetic variation in a
model quantitative trait, and b) to help understand the biology of
normal growth and development. Within the GIANT consortium, we performed
a GWAS of ~250,000 individuals of European ancestry. We tested for the
presence of multiple signals at individual loci using an approximate
conditional and joint multiple SNP regression analysis. We identified
698 independent variants associated with height at p<5x10^-8, which
fell in 424 loci (+/-500kb from lead SNP) and altogether explained 1/4
of the inherited component in adult height. Half of the loci contained
multiple signals of association. By applying a novel pathway analysis
approach that uses co-expression data from 80,000 samples to predict the
biological function of poorly annotated genes, we observed enrichment
for novel and biologically relevant pathways in these loci. For example,
for more than 10% of the loci a gene was found in their vicinity with a
predicted "regulation of ossification" function (GO:0030278, WMW P <
10^-34), including newly identified genes such as PRRX1 and SNAI1. Other
genes and pathways newly highlighted by pathway analysis include WNT
(WNT2B, WNT4, WNT7A) and FGF (FGF2, FGF18) signaling and osteoglycin. We
also noted an excess of signals across the entire genome, with the
median test statistic twice that expected under null (lambda = 2.0).
This result is consistent with either a very deep polygenic component to
height that covers most of the genome or population stratification
contributing partly to the results, or a combination of the two.
Encouragingly, initial results from family based analyses and mixed
models that correct for distant relatedness across samples indicate that
a large proportion of the discovered signals are genuine
height-associated variants rather than confounded by stratification. In
conclusion, data from 250,000 individuals show that adult height is
highly polygenic with, typically, multiple signals of association per
locus now accounting for ¼ of heritability. Furthermore, these results
suggest that increasing GWAS sample sizes can continue to uncover
substantial new insights into the aetiological pathways involved in
common human phenotypes.
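
The inflation factor mentioned above (lambda = 2.0) is conventionally the median observed 1-df chi-square test statistic divided by its expected median under the null. A minimal sketch follows, using only the Python standard library; the "test statistics" are deterministic normal quantiles invented for illustration, not GIANT results.

```python
from statistics import NormalDist, median


def genomic_inflation(chi2_stats):
    """Return lambda: observed median 1-df chi-square statistic over its null median."""
    # The median of a 1-df chi-square equals the squared 75th percentile
    # of a standard normal (about 0.455).
    expected_median = NormalDist().inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median


# Deterministic stand-in for null statistics: squared standard-normal quantiles.
n = 1001
null_stats = [NormalDist().inv_cdf((i + 0.5) / n) ** 2 for i in range(n)]
# Doubling every statistic mimics uniform inflation across the genome.
inflated_stats = [2.0 * s for s in null_stats]

lambda_null = genomic_inflation(null_stats)
lambda_inflated = genomic_inflation(inflated_stats)
print(round(lambda_null, 2), round(lambda_inflated, 2))
```

Lambda alone cannot distinguish the two explanations offered in the abstract: a deep polygenic signal and population stratification both raise the median statistic, which is why the family-based and mixed-model analyses are needed.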

External morphological features are by definition visible and are
typically easy to measure. They also generally happen to be highly
heritable. As such, they have played a fundamental role in the
development of the field of genetics. As morphological traits have
frequently been the target of natural selection, their genetics may also
provide clues into our evolutionary history. Many rare diseases include
dysmorphologic features among their symptoms. However, aside from
height and BMI, currently little is known about the genetics of common
variation in human morphology. Here we present a series of genome-wide
association studies across 18 self-reported morphological traits in a
total of over 55,000 people of European ancestry from the customer base
of 23andMe. The phenotypes studied include hair traits (baldness,
unibrow, hair curl, upper and lower back hair, widow’s peak), as well as
many soft tissue and skeletal traits (chin dimple, nose shape, dimples,
earlobe attachment, nose-wiggling ability, the presence of a gap
between the top incisors, joint hypermobility, finger and toe relative
lengths, arch height, foot direction, height-normalized shoe size).
Across the 18 phenotypes, we find a total of 281 genome-wide significant
associations (including 53 for unibrow and 29 each for hair curl and
chin dimple). Almost all of these associations are novel; we believe
this is the largest set of novel associations ever described in a single
report. Many of these SNPs show pleiotropic effects, e.g., a SNP near
GDF5 is associated with hypermobility, arch height, relative toe length,
shoe size, and foot direction; another near AUTS is associated with
both back hair and baldness. Nearby genes are significantly enriched to
be transcription factors (p<1e-14) and to be involved in rare
disorders that cause cleft palate, ear, limb, or skull abnormalities
(p<1e-7). A SNP near ZEB2 is associated with both widow’s peak and
chin dimple; mutations in ZEB2 cause Mowat-Wilson syndrome, which
includes distinctive facial features such as a pronounced chin.
Morphology-associated SNPs are also enriched within regions that have
been identified as undergoing selection since the divergence from
Neanderthals (18 associations in 11 regions, p = 4e-5). The abundance of
these SNPs, which include the ZEB2 and GDF5 associations above, suggests
that physical traits may have played a significant role in driving the
natural selection processes that gave rise to modern humans.

Puberty is a complex trait with large variation in timing and tempo
in the population, and extremes in pubertal timing are a common cause
for referral to pediatric specialists. Recently, large genome-wide
association studies (GWAS) have revealed 42 common variant loci
associated with age at menarche (AAM), and some implicated genes are
known from severe single-gene disorders. However, little remains known
of the genetic architecture underlying normal variation in the onset of
puberty, especially in males.
Tanner staging, a 5-stage scale
assessing female breast and male genital development, is a commonly used
measure of pubertal development. While AAM is a late event during
puberty, Tanner staging during mid-puberty may correlate more closely
with the central activation of puberty. With Tanner scale data at the
comparable ages of 11-12 yrs in girls and 13-14 yrs in boys, we
performed GWAS meta-analyses across 10 cohorts with up to 9,900 samples.
The combined male and female analysis showed evidence for association
near LIN28B (P=1.95x10^-8), previously
implicated in AAM and height growth in both sexes. Our data confirms
that this locus is also important for male pubertal development and may
be part of the pubertal initiation program upstream of sex-specific
mechanisms. A novel signal (P=4.95x10^-8) with a
consistent direction of effect across contributing datasets is located on
chromosome 1 at an intronic transcription factor binding-site cluster
within the gene CAMTA1. Furthermore, the primary analyses revealed suggestive evidence for male-specific loci, e.g. near MKL2 (P=4.68x10^-7),
which may be confirmed by follow-up genotyping. MAGENTA gene-set
enrichment analysis of the combined-gender GWAS results showed
enrichment of genes involved in expected pathways given the known
biology underlying activation of puberty via the HPG axis. Novel genes
near suggestively associated loci may also pinpoint novel regulatory
mechanisms; CAMTA1 is a calmodulin-binding transcriptional activator, while MKL2 is
also a transcriptional activator involved in cell differentiation and
development. These results suggest the presence of multiple real signals
beneath the genome-wide significance threshold, and further exploration
of enriched pathways may reveal new insights into the biology of
pubertal development.

Height has an extremely polygenic pattern of inheritance. Genome-wide
association studies (GWAS) have revealed hundreds of common variants
that are associated with human height at genome-wide levels of
significance. Each of these common variants has a very modest effect,
and only a small fraction of phenotypic variation can be explained by
the aggregate of these common variants. In this large study of
African-American men and women, we genotyped and analyzed 975,519
autosomal SNPs across the entire genome using a variance components
approach, and found that 46.4% of phenotypic variation can be explained
by these SNPs in a sample of 9,779 evidently unrelated individuals. We
noted that in two samples of close relatives defined by probability of
identical-by-descent (IBD) alleles sharing (Pr (IBD=1)>=0.3 and Pr
(IBD=1)>=0.4), the proportion of phenotypic variation explained by
the same set of SNPs increased to 75.5% (se: 14.8%) and 70.3% (26.9%),
respectively. We conclude that the additive component of genetic
variation for height may have been overestimated in earlier studies
(~80%) and argue that this proportion also includes variation from
epistatic effects. Using simulation, we showed that by using common SNPs
that are only weakly correlated with causal SNPs, the model could
explain a large proportion of heritability. We therefore argue that the
heritability estimate from the variance components approach is not
necessarily the variation explained by a given set of SNPs, but also
possibly reflects distant relatedness between nominally unrelated
participants. Finally, we explored the performance of the variance
components approach and concluded that the approach fails when a large
number of independent variables are included in the model as the
structure of the two components becomes similar. Thus some degree of
population stratification seems to be required in order for the method
to perform well for very large numbers of SNPs; however when modest
stratification is present there is a risk of misattribution of effects
of unmeasured (and untagged) variants to measured variants.

There are many known examples of multiple (semi-)independent
associations at individual loci, which may arise either because of true
allelic heterogeneity or imperfect tagging of an unobserved causal
variant. This phenomenon is of great importance in monogenic traits but
has not yet been systematically investigated and quantified in complex
trait GWAS. We describe a multi-SNP association method that estimates
the effect of loci harbouring multiple association signals using GWAS
summary statistics. Applying the method to a large anthropometric GWAS
meta-analysis (GIANT), we show that for height, BMI, and waist-hip-ratio
(WHR) 10%, 9%, and 8% of additional phenotypic variance can be
explained respectively on top of the previously reported 10%, 1.5%, 1%.
The method also substantially increased the number of loci
that replicate in a discovery-validation design. Specifically, we
identified in total 263 loci at which the multi-SNP explains
significantly more variance than the best individual SNP at the locus. A
detailed analysis of multi-SNPs shows that most of the additional
variability explained is derived from SNPs not in LD with the lead SNP
suggesting a major contribution of allelic heterogeneity to the missing
heritability.

Central adiposity and body fat distribution are risk factors for type
2 diabetes and cardiovascular disease and can be measured using waist
circumference (WC), hip circumference (HIP), and waist-to-hip ratio
(WHR). Adjusting for body mass index (BMI) differentiates effects from
those for overall obesity. We performed fixed effects inverse variance
meta-analysis for these traits with 72,919 individuals from 30 studies
in a prior genome-wide association study (GWAS) meta-analysis, 71,139
individuals from 24 additional GWAS, and 67,163 individuals from 28
studies genotyped on Metabochip by the GIANT consortium. We identified
48 independent genome-wide significant (p<5x10^-8)
associations for WHR adjusted for BMI, including all 14 previously
published signals. Twelve signals are located near genes for
transcription factors, including developmental homeobox-containing
proteins. Among them, two are in the HOXC gene cluster near HOXC8 and miR-196a2. HOXC8 is expressed in white adipose tissue and is a regulator of brown adipogenesis, while miR-196a inhibits Hoxc8 expression. Signals are located near PPARG, encoding a transcription factor known to regulate adipocyte differentiation, and near HMGA1 and CEBPA,
encoding transcription factors that act downstream of insulin receptor
and leptin signaling, respectively. Further novel signals are located
near genes involved in angiogenesis (PLXND1, VEGFB, and MEIS1).
Among the other five traits, we estimate that a significant proportion
of the genetic effects for WC and HIP adjusted for BMI are correlated
with height (0.59, p<5x10^-79 and 0.83, p<2x10^-40,
respectively). Despite this strong correlation, an appreciable
proportion of the genetic contributions to these traits will be
independent of height. Association meta-analysis for the five additional
traits identified an additional 148 independent signals (p<5x10^-8),
32 of which have not been reported previously for an anthropometric
trait. These novel signals suggest regulation of adipose gene expression
(KLF14) and transcriptional control of cell patterning and differentiation in early development (HLX, SOX11, ZNF423, and HMGXB4)
affect fat distribution. Meta-analyses for WHR, WC, and HIP, with and
without adjustment for BMI, identified a total of 196 independent loci,
66 novel, affecting fat deposition and body shape, and implicating genes
involved in development, adipose gene expression and tissue
differentiation, response to metabolic signaling, and angiogenesis.
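
The fixed-effects inverse-variance meta-analysis named in this abstract weights each study's effect estimate by the reciprocal of its squared standard error. The sketch below shows the generic textbook calculation, not the GIANT pipeline; the function name and the two study estimates are hypothetical.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Combine per-study effects with inverse-variance weights.

    Returns the pooled effect, its standard error, and the z-score.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Two hypothetical studies with equal precision.
pooled_beta, pooled_se, z_score = fixed_effects_meta([0.10, 0.20], [0.10, 0.10])
print(f"beta = {pooled_beta:.3f}, se = {pooled_se:.4f}, z = {z_score:.2f}")
```

With equal standard errors the pooled effect is the simple average, and the pooled standard error shrinks by a factor of the square root of the number of studies, which is where the power gain from combining cohorts comes from.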

Prediction of complex traits from genetic information is an area of
major clinical and scientific interest. Height is a model trait since it
is highly heritable and easily measured. Substantial strides in
understanding the genetic basis of height have recently been made
through genome-wide association studies (GWAS), and whole-genome
prediction (WGP) which fits thousands of SNPs jointly. Here, we attempt
to gain insight into the genetic architecture of human height by
examining how WGP accuracy is affected by the choice of
single-nucleotide polymorphisms (SNPs). Specifically, we compare the
prediction accuracy of models using: 1) SNPs selected based on the ‘top
hits’ of the GIANT consortium meta-analysis for height at different p-value
thresholds, and 2) SNPs in genomic regions that surround the most
significant ‘top hits’. We use the Framingham Heart Study and GENEVA
datasets, imputed up to 10 million SNPs with 1000 Genomes reference
data. The predictive accuracy of each model was evaluated in
cross-validation. We find that prediction accuracy increases up to a
certain point with the inclusion of more ‘top hits’ from the GIANT
study, that including SNPs from the regions surrounding ‘top hits’
contributes minimally to prediction accuracy, and that prediction
accuracy increases with the size of the training dataset. Finally, we
find that prediction accuracy is greatest for individuals at the
phenotypic extremes of height. Our results suggest that improvement of
genomic prediction models will require the use of information from a
large number of selected SNPs, and that these models may be most useful
at the phenotypic extremes.

Stature is a classical and highly heritable complex trait, with
80-90% of variation explained by genetic factors. In recent years,
genome-wide association studies (GWAS) have successfully identified many
common additive variants influencing human height; however, little
attention has been given to the potential role of recessive genetic
effects. Here, we investigated genome-wide recessive effects by an
analysis of inbreeding depression on adult height in over 35,000 people
from 21 different population samples. We found a highly significant
inverse association between height and genome-wide homozygosity,
equivalent to a height reduction of up to 3 cm in the offspring of first
cousins compared with the offspring of unrelated individuals, an effect
which remained after controlling for the effects of socio-economic
status, an important confounder. There was, however, a high degree of
heterogeneity among populations: whereas the direction of the effect was
consistent across most population samples, the effect size differed
significantly among populations. It is likely that this reflects true
biological heterogeneity: whether or not an effect can be observed will
depend on both the variance in homozygosity in the population and the
chance inheritance of individual recessive genotypes. These results
predict that multiple, rare, recessive variants influence human height.
Although this exploratory work focuses on height alone, the methodology
developed is generally applicable to heritable quantitative traits (QT),
paving the way for an investigation into inbreeding effects, and
therefore genetic architecture, on a range of QT of biomedical
importance.

As the frontier of human genetic studies has shifted from
genome-wide association studies (GWAS) towards whole exome and whole
genome sequencing studies, we have witnessed an explosion of new DNA
variants, especially rare variants. An important but not yet answered
question is the contribution of rare variants to the heritabilities of
complex traits, which determine, in part, the gain in power from rare
variants to discover new disease-associated genes. Here we present
theoretical and empirical results on this question.
Our theoretical study was based upon the distribution of allele frequencies
incorporating mutation, random genetic drift, and the possibility of
purifying selection against susceptibility mutations. It shows that in
most cases rare variants only contribute a small proportion to the
overall genetic variance of a trait, but under certain conditions they
may explain as much as 50% of additive genetic variance when both
susceptible alleles are under purifying selection and the rate of
mutations compensating the susceptible alleles (i.e. repair rate) is
high.
In our empirical study, we estimated the proportion of the total phenotypic
variance contributed by the additive genetic variance (σg2) of rare variants
for six complex traits (BMI, height, LDL-C, HDL-C, triglyceride and total
cholesterol) using whole genome sequences (8x coverage) of 962 European
Americans from the Charge-S study. The results show that the estimated σg2 of rare variants (MAF≤1%)
ranged from 2% to 8% across the six traits. However, the standard
errors (s.e.) of the estimated variance components from rare variants
are relatively large compared to those of common variants. Using HDL-C
as an example, the estimated σg2s are 0.08 (s.e.
0.10), 0.05 (s.e. 0.05) and 0.58 (s.e. 0.05) for rare, low-frequency
(1%<MAF≤5%) and common (MAF>5%) variants, respectively.

Resolving missing heritability, the difference between phenotypic
variance explained by associated SNPs and estimates of narrow-sense
heritability (h2), will inform strategies for disease mapping and
prediction of complex traits. Possible explanations for missing
heritability include rare variants not captured by genotyping arrays, or
biased estimates of h2 due to epistatic interactions [Zuk et al. 2012].
Here, we develop a novel approach to estimating h2 based on sharing of
local ancestry segments between pairs of unrelated individuals in an
admixed population. Unlike recent approaches for estimating the
heritability explained by genotyped markers (h2g) [Yang et al. 2010],
our approach captures the total h2, because local ancestry estimated
from genotyping array data captures the effects of all variants—not just
those on the array. Our approach uses only unrelated individuals, and
is thus not susceptible to biases caused by epistatic interactions or
shared environment that can confound genealogy-based estimates of h2.
Theory and simulations show that the variance explained by local
ancestry (h2γ) is related to h2, Fst, and genome-wide ancestry
proportion (θ): h2γ = h2*2*Fst*θ*(1-θ). Thus, we can estimate h2γ and
then infer h2 from h2γ. We apply our method to 5,040 African Americans
from the CARe cohort and estimate the autosomal h2 for HDL cholesterol
(0.39±0.11), LDL cholesterol (0.18±0.09), and height (0.55±0.13). As
expected, these h2 estimates were higher than estimates of h2g from the
same data using standard approaches: 0.22±0.07, 0.16±0.07 and 0.31±0.07,
respectively, consistent with previous estimates. The difference between h2 and h2g
suggests that rare variants contribute substantial missing heritability
that can be quantified using local ancestry information. Larger sample
sizes will enable h2 estimates with even lower standard
errors, so that the possible contribution of epistasis to previous
estimates of h2 can be precisely quantified. We additionally use local
ancestry to estimate the fraction of phenotypic variance shared between
European and African genomes that is explained by genotyped markers, by
estimating h2g in European segments, h2g in African segments, and h2g
shared between European and African segments. Given that most GWAS to
date have been carried out in individuals of European descent, these
estimates shed light on the importance of collecting data from
non-European populations for mapping disease in those populations.
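
The relation h2γ = h2*2*Fst*θ*(1-θ) quoted in the abstract can be inverted to recover h2 from an estimate of h2γ. The following is a minimal sketch of that arithmetic only, not the authors' estimation procedure. The function names are hypothetical, and the Fst and θ values are illustrative assumptions that do not come from the abstract.

```python
def h2_gamma_from_h2(h2, fst, theta):
    """Variance explained by local ancestry, per the relation in the abstract:
    h2_gamma = h2 * 2 * Fst * theta * (1 - theta)."""
    return h2 * 2.0 * fst * theta * (1.0 - theta)


def h2_from_h2_gamma(h2_gamma, fst, theta):
    """Invert the same relation to infer h2 from an estimate of h2_gamma."""
    scale = 2.0 * fst * theta * (1.0 - theta)
    if scale <= 0.0:
        raise ValueError("Fst must be > 0 and theta must lie strictly between 0 and 1")
    return h2_gamma / scale


# Illustrative assumptions, NOT values reported in the abstract.
FST = 0.16    # assumed differentiation between the two ancestral populations
THETA = 0.2   # assumed genome-wide ancestry proportion

# Forward direction, using the HDL cholesterol h2 reported in the abstract (0.39).
h2_gamma = h2_gamma_from_h2(0.39, FST, THETA)

# Backward direction: recover h2 from h2_gamma.
h2_recovered = h2_from_h2_gamma(h2_gamma, FST, THETA)

print(f"h2_gamma = {h2_gamma:.6f}, recovered h2 = {h2_recovered:.2f}")
```

Under these assumed inputs the scaling factor 2*Fst*θ*(1-θ) is small, so h2γ is a small fraction of h2. Any error in the h2γ estimate is therefore magnified when dividing back, which fits the abstract's remark that larger samples are needed for lower standard errors.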

It is well-known that body fat distribution differs between men and
women, a circumstance that may be due to innate, genetic differences
between sexes. Previously, we performed a large-scale meta-analysis of
GWAS of waist-to-hip ratio adjusted for BMI (WHR), a measure of body fat
distribution independent of overall adiposity, and found that of the 14
loci established in men and women combined, seven showed a significant
sex-difference. In a subsequent genome-wide analysis that was
specifically tailored to detect sex-differential genetic effects for
WHR, we identified two additional loci with significant sex-difference.
Despite these findings, the genetic basis affecting the sexual
dimorphism of WHR as well as the genetic architecture of WHR in general
are still poorly understood. We therefore conducted sex-combined and
sex-stratified meta-analyses comprising >210,000 individuals
(>116,000 women; >94,000 men) of European ancestry from 57 GWAS
studies and 28 studies genotyped on the MetaboChip within the GIANT
consortium. The sex-combined analysis yielded 39 loci with genome-wide
significant association (P<5×10⁻⁸), of which 11 loci showed
significant sex-difference (Bonferroni-corrected P<0.05/39). Six of
these loci influence WHR in women only without any effect in men (near COBLL1, LYPLAL1, PPARG, PLXND1, MACROD1, FAM13A); four loci have an effect in women and a less pronounced effect in men (near VEGFA, ADAMTS9, HOXC13, RSPO3); and one locus has only an effect in men (near GDF5).
The sex-stratified analyses identified nine additional female-specific
loci that had been missed in the sex-combined analysis due to the lack
of effect in men (near MAP3K1, BCL2, TNFAIP8, CMIP, NKX3-1, NMU, SFXN2, HMGA1, KCNJ2).
No additional loci were identified in the male-specific analysis. We
confirmed all previously established sexually dimorphic variants for
WHR. Of particular interest is the PPARG region, which is a
well-known target in type 2 diabetes treatments and shows a
female-specific association with WHR. The enrichment of female-specific
associations, i.e. 19 of the 20 sexually dimorphic loci, is consistent
with the heritability of WHR as estimated in the Framingham Heart study;
we found that WHR is more heritable in women (h2~46%) than in men
(h2~19%). Our results highlight the importance of sex-stratified
analyses and can help to better understand the genetics underpinning the
sex-differences of body fat distribution.
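
The counts and the Bonferroni threshold in this abstract can be checked with a few lines of arithmetic. This is a sketch that only re-derives figures stated in the text. The variable and function names are hypothetical, and the sex-difference test that produces the P values is not described in the abstract and is not reproduced here.

```python
ALPHA = 0.05
N_LOCI_TESTED = 39  # genome-wide significant loci from the sex-combined analysis

# Bonferroni-corrected threshold for the sex-difference test: P < 0.05/39
bonferroni_threshold = ALPHA / N_LOCI_TESTED

# Sexually dimorphic loci found among the 39 sex-combined loci
sex_combined_dimorphic = {
    "women_only": 6,         # effect in women, none in men
    "stronger_in_women": 4,  # effect in women, less pronounced in men
    "men_only": 1,           # effect in men only (near GDF5)
}

# Additional female-specific loci from the sex-stratified analysis
sex_stratified_female = 9

dimorphic_total = sum(sex_combined_dimorphic.values()) + sex_stratified_female
female_biased = (
    sex_combined_dimorphic["women_only"]
    + sex_combined_dimorphic["stronger_in_women"]
    + sex_stratified_female
)


def passes_sex_difference_threshold(p_value):
    """True if a sex-difference P value survives the Bonferroni correction."""
    return p_value < bonferroni_threshold


print(f"threshold = {bonferroni_threshold:.5f}")
print(f"{female_biased} of {dimorphic_total} dimorphic loci are female-biased")
```

The tally reproduces the "19 of the 20 sexually dimorphic loci" quoted above: 11 loci from the sex-combined analysis plus 9 from the sex-stratified analysis, with only the GDF5 locus specific to men.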