Compatibility of pedigree-based and marker-based relationships for single-step genomic prediction

Department of Molecular Biology and Genetics - Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Science and Technology, Aarhus University

Abstract:

Single-step methods for genomic prediction have recently become popular because they are conceptually simple and because, in practice, such a method can completely replace a pedigree-based method for routine genetic evaluation. One issue with single-step methods is compatibility between the marker-based relationship matrix and the pedigree-based relationship matrix. This issue involves the choice of allele frequencies used in the marker-based relationship matrix, and also the need to adjust this matrix to the pedigree-based relationship matrix. In addition, it has been overlooked that it may be important for a single-step method to be based on a model conditional on the observed markers. When data come from routine evaluation systems, selection affects the allele frequencies, and therefore both the observed markers and the observed phenotypes contain information about allele frequencies in the base population. Here, two ideas are explored. The first idea is to instead adjust the pedigree-based relationship matrix to be compatible with the marker-based relationship matrix, whereas the second idea is to include the likelihood for the observed markers. A single-step method is used in which the marker-based relationship matrix is constructed assuming all allele frequencies equal to 0.5, and the pedigree-based relationship matrix is constructed using the unusual assumption that animals in the base population are related and inbred, with relationship coefficient alpha and inbreeding coefficient alpha/2. The parameter alpha should be determined from the markers, but because there is selection in routine evaluation systems, the phenotypes in principle also provide information about this parameter. The likelihood function used for inference contains two terms. The first term is the REML likelihood for the phenotypes conditional on the observed markers, whereas the second term is the likelihood for the observed markers. The performance of the proposed method is studied on simulated data examples.
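
To make the two relationship matrices described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes genotypes coded 0/1/2, a VanRaden-style scaling of the marker-based matrix (the abstract only states that all allele frequencies are set to 0.5), and a tabular construction of the pedigree-based matrix in which parents are listed before their offspring and each animal has either both parents known or both unknown. The function names `marker_relationship` and `pedigree_relationship` and the small pedigree in the usage lines are hypothetical.

```python
import numpy as np


def marker_relationship(genotypes):
    """Marker-based matrix with every allele frequency fixed at 0.5.

    Assumption: 0/1/2 genotype coding and VanRaden-style scaling.
    """
    M = np.asarray(genotypes, dtype=float)  # animals x markers
    Z = M - 1.0                             # centre by 2p with p = 0.5
    return Z @ Z.T / (0.5 * M.shape[1])     # 2 * sum p(1 - p) = m / 2


def pedigree_relationship(sire, dam, alpha=0.0):
    """Tabular pedigree-based matrix with related, inbred base animals.

    Base animals (both parents coded -1) have mutual relationship alpha
    and inbreeding coefficient alpha / 2. Assumption: parents precede
    offspring, and parents are both known or both unknown.
    """
    n = len(sire)
    A = np.zeros((n, n))
    for i in range(n):
        if (sire[i] < 0) != (dam[i] < 0):
            raise ValueError("sketch assumes both parents known or both unknown")
        base_i = sire[i] < 0
        for j in range(i):
            if not base_i:
                # average of the parents' relationships with animal j
                a = 0.5 * (A[sire[i], j] + A[dam[i], j])
            elif sire[j] < 0:
                # two base animals are related by alpha
                a = alpha
            else:
                # base animal i versus an earlier non-base animal j
                a = 0.5 * (A[i, sire[j]] + A[i, dam[j]])
            A[i, j] = A[j, i] = a
        # diagonal is 1 + inbreeding coefficient
        A[i, i] = 1.0 + (alpha / 2.0 if base_i else 0.5 * A[sire[i], dam[i]])
    return A


# Hypothetical example: three base animals, two half sibs, and their offspring.
sire = [-1, -1, -1, 0, 0, 3]
dam = [-1, -1, -1, 1, 2, 4]
A_adjusted = pedigree_relationship(sire, dam, alpha=0.2)
G = marker_relationship([[0, 2], [2, 0], [1, 1]])
```

Under these assumptions the adjusted matrix equals (1 - alpha/2) times the usual pedigree-based matrix plus alpha added to every element, which gives a quick check of the recursion. How alpha is estimated from the two-term likelihood is not covered by this sketch.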