Wednesday, March 20, 2013

First-degree relationships and partly directed networks

I have noted before that a pedigree is a network, not a tree, and specifically that it is a hybridization network (see the earlier post: Family trees, pedigrees and hybridization networks). That is, in sexually reproducing species, every offspring is the hybrid of two parents. If we include both parents in the pedigree, plus all of their relatives, then the pedigree will form a complex network every time inbreeding occurs.

This situation can be generalized to groups of closely related individuals, such as cultivated plants and domesticated animals, where human-mediated inbreeding has resulted in the formation of new breeds and cultivars with limited genetic diversity. In the extreme case, the network will consist entirely of first-degree relationships, where each edge connects either a parent and its offspring or a pair of siblings.

An example of this is provided by the work on the genetics of grape cultivars by Myles et al. (Myles S, Boyko AR, Owens CL, Brown PJ, Grassi F, Aradhya MK, Prins B, Reynolds A, Chia JM, Ware D, Bustamante CD, Buckler ES. 2011. Genetic structure and domestication history of the grape. Proceedings of the National Academy of Sciences of the USA 108: 3530-3535).

The genotype data were generated from a custom microarray, which assayed 5,387 SNPs genotyped in 583 unique Vitis vinifera samples from the US Department of Agriculture (USDA) germplasm collection. Estimates of identity-by-descent (IBD) were calculated based on linkage analysis for all pairwise comparisons of samples. These IBD values were calibrated against known pedigree relationships (i.e. confirmed parent-offspring relationships), and the calibration was then used to differentiate between parent-offspring and other pedigree relationships. For each cultivar that was related to at least two other cultivars by an estimated parent-offspring relationship, the proportion of SNPs consistent with Mendelian inheritance was used to determine the two parents.
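
The final step can be illustrated with a minimal sketch. This is not the authors' code; it assumes biallelic SNPs coded as 0, 1 or 2 copies of one allele, with None for a missing call, and all of the function names and the toy genotypes are hypothetical. For a candidate offspring and each pair of candidate parents, it computes the proportion of SNPs at which the offspring genotype could have arisen from one gamete from each parent, and then picks the best-scoring pair.

```python
from itertools import combinations

# Alleles (0 or 1) that a parent with a given genotype can transmit.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendelian_consistent(child, parent_a, parent_b):
    """True if the child genotype can be formed from one gamete per parent."""
    return any(a + b == child
               for a in GAMETES[parent_a]
               for b in GAMETES[parent_b])


def consistency_proportion(child, parent_a, parent_b):
    """Proportion of SNPs (ignoring missing calls) consistent with the trio."""
    scored = [(c, a, b) for c, a, b in zip(child, parent_a, parent_b)
              if None not in (c, a, b)]
    if not scored:
        return float("nan")
    consistent = sum(is_mendelian_consistent(c, a, b) for c, a, b in scored)
    return consistent / len(scored)


def best_parent_pair(child_name, candidates, genotypes):
    """Return (pair, score) for the best-scoring pair of candidate parents.

    `candidates` are the samples with an estimated parent-offspring
    relationship to the child; returns None if there are fewer than two.
    """
    scores = {
        pair: consistency_proportion(genotypes[child_name],
                                     genotypes[pair[0]],
                                     genotypes[pair[1]])
        for pair in combinations(sorted(candidates), 2)
    }
    if not scores:
        return None
    return max(scores.items(), key=lambda item: item[1])


# Toy data: four SNPs, one offspring and three candidate parents.
toy_genotypes = {
    "child": [1, 0, 2, 1],
    "a": [2, 0, 1, 0],
    "b": [0, 0, 2, 2],
    "c": [2, 2, 0, 0],
}
print(best_parent_pair("child", ["a", "b", "c"], toy_genotypes))
```

In real data, genotyping error means that even the true parents will rarely score exactly 1, so in practice some threshold on the proportion would be needed; the sketch simply takes the maximum.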

The authors found that 75% of the grape cultivars were related to at least one other cultivar by a first-degree relationship. The first figure (above) shows the frequency histogram of these first-degree relationships, along with the resulting complex pedigree structure, which can be visualized as a set of undirected networks. This set is dominated by a single network containing 58% of the cultivars, each related to at least one other cultivar by a first-degree relationship.
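
The "set of undirected networks" here is simply the set of connected components of the graph whose nodes are cultivars and whose edges are first-degree relationships. As a rough sketch (with hypothetical function names and toy sample labels, not the grape data), the two percentages quoted above correspond to the following summary statistics.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return the connected components of an undirected graph, largest first."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search from an unvisited node collects one component.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def summarize(edges, all_samples):
    """Fraction of samples with any first-degree link, and in the largest network."""
    components = connected_components(edges)
    total = len(all_samples)
    linked = sum(len(component) for component in components)
    largest = len(components[0]) if components else 0
    return {
        "fraction_linked": linked / total,
        "fraction_in_largest": largest / total,
    }


# Toy example: eight samples, three of which have no first-degree relatives.
toy_edges = [("a", "b"), ("b", "c"), ("d", "e")]
toy_samples = ["a", "b", "c", "d", "e", "f", "g", "h"]
print(summarize(toy_edges, toy_samples))
```

Samples without any first-degree relative never appear in the edge list, which is why the full sample list is passed separately as the denominator.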

The authors inferred that about half of the first-degree relationships were likely to be parent-offspring, with the other half being labeled "sibling or equivalent" (because complex crossing schemes can generate IBD values that are indistinguishable from sibling relationships). By evaluating Mendelian inconsistencies, they assigned parentage for 83 triplets of cultivars. The second figure shows a directed hybridization network of some well-known grape cultivars that includes several resolved triplets.

Note that the hybridization network is only partly directed: quite a few of the edges do not have a uniquely identified direction, based on the SNP data. This is an issue that I have not seen directly addressed in the literature. Practitioners tend to treat phylogenetic networks (and trees) as either directed or undirected, rather than as a mixture of both, because directedness is usually determined by the presence or absence of a root node. However, in the grape case there is no root identifiable based on the cultivar SNP data. (There is a scenario for the origin of modern grape cultivars from Vitis sylvestris around the eastern Mediterranean, but even this is complicated by hypothesized later gene flow between V. sylvestris and V. vinifera.)
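
To make the idea of a partly directed network concrete, here is one possible way to represent it, as a minimal sketch rather than an established data structure. The class name, its methods and the placeholder cultivar labels are all my own assumptions. Resolved parentage is stored as directed (parent, offspring) edges, unresolved first-degree relationships are stored as undirected edges, and an undirected edge is replaced by a directed one whenever a triplet is resolved.

```python
class PartlyDirectedNetwork:
    """A mixed graph: some edges are directed, the rest are undirected."""

    def __init__(self):
        self.directed = set()    # (parent, offspring) tuples
        self.undirected = set()  # frozenset({u, v}) pairs of unknown direction

    def add_undirected(self, u, v):
        """Record a first-degree relationship whose direction is unknown."""
        if (u, v) not in self.directed and (v, u) not in self.directed:
            self.undirected.add(frozenset((u, v)))

    def orient(self, parent, offspring):
        """Record a parent-offspring edge, replacing any undirected version."""
        self.undirected.discard(frozenset((parent, offspring)))
        self.directed.add((parent, offspring))

    def add_triplet(self, offspring, parent_a, parent_b):
        """Record a resolved triplet: an offspring and both of its parents."""
        self.orient(parent_a, offspring)
        self.orient(parent_b, offspring)

    def parents(self, node):
        """Known parents of a node (zero, one or two of them)."""
        return {p for p, child in self.directed if child == node}

    def directed_fraction(self):
        """Proportion of all edges that have a known direction."""
        total = len(self.directed) + len(self.undirected)
        return len(self.directed) / total if total else 0.0


# Placeholder labels, not real cultivars.
network = PartlyDirectedNetwork()
for u, v in [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]:
    network.add_undirected(u, v)
network.add_triplet("C", "A", "B")  # C resolved as the offspring of A and B

print(sorted(network.parents("C")))
print(network.directed_fraction())
```

In this toy example the relationships A-B and C-D remain undirected after the triplet is resolved, which is exactly the situation in the grape network: the data identify the direction of some edges but say nothing about the others.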