Modeling correlated marker effects in genome-wide prediction via Gaussian concentration graph models.

  • Martínez, Carlos Alberto (1)
  • Khare, Kshitij (2)
  • Rahman, Syed (2)
  • Elzo, Mauricio A (3)
  • (1) Department of Animal Sciences, University of Florida, Gainesville, FL, USA. Electronic address: [email protected]
  • (2) Department of Statistics, University of Florida, Gainesville, FL, USA.
  • (3) Department of Animal Sciences, University of Florida, Gainesville, FL, USA.
Published Article
Journal of Theoretical Biology
Publication Date
Jan 21, 2018
DOI: 10.1016/j.jtbi.2017.10.017
PMID: 29055677


In genome-wide prediction, marker allele substitution effects are typically assumed to be independent; however, it has been recognized since the early development of this technology that these effects are likely to be correlated. In statistics, graphical models have proven to be a powerful tool for covariance estimation in high-dimensional problems, and the area has expanded greatly in recent years. In particular, Gaussian concentration graph models (GCGM) have been widely studied. In these models, the distribution of a set of random variables (here, the marker effects) is assumed to be Markov with respect to an undirected graph G. In this paper, Bayesian (Bayes G and Bayes G-D) and frequentist (GML-BLUP) methods adapting the theory of GCGM to genome-wide prediction were developed. Different approaches to defining the graph G from domain-specific knowledge were proposed, and two propositions and a corollary establishing conditions for finding decomposable graphs were proven. The methods were applied to small simulated and real datasets. The simulations considered scenarios in which correlations among allele substitution effects were expected to arise from various causes, and graphs were defined on the basis of physical marker positions. Accounting for partially correlated allele substitution effects improved both the correlation between phenotypes and predicted additive genetic values and the accuracy of predicted additive genetic values. Extensions to the case of multiallelic loci were described, and possible refinements incorporating more flexible priors in the Bayesian setting were discussed. The models are promising because they allow biological information to be incorporated into the prediction process, and because they are more flexible and general than previously proposed models that account for correlated marker effects.
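
To make the Markov property concrete, the following is a minimal illustrative sketch, not the authors' implementation. It builds an undirected graph from physical marker positions, checks whether the graph is decomposable (chordal), and constructs a precision (concentration) matrix whose zeros match the missing edges. The function names, the distance-window rule, the example positions, and the off-diagonal value of -0.3 are all assumptions chosen for illustration; the abstract states only that graphs were defined from physical marker positions.

```python
import numpy as np

def position_graph(positions, window):
    """Adjacency matrix: markers i and j are joined when their
    positions differ by at most `window` (hypothetical rule)."""
    pos = np.asarray(positions, dtype=float)
    adj = np.abs(pos[:, None] - pos[None, :]) <= window
    np.fill_diagonal(adj, False)  # no self-loops
    return adj

def is_decomposable(adj):
    """Return True if the graph is chordal, using greedy elimination
    of simplicial vertices (vertices whose neighbours form a clique)."""
    remaining = list(range(adj.shape[0]))
    while remaining:
        for v in remaining:
            nbrs = [u for u in remaining if adj[v, u]]
            if all(adj[a, b] for i, a in enumerate(nbrs) for b in nbrs[i + 1:]):
                remaining.remove(v)
                break
        else:
            return False  # no simplicial vertex left: not chordal
    return True

def precision_on_graph(adj, off_diag=-0.3):
    """Symmetric precision matrix with nonzero off-diagonals only on
    edges; strict diagonal dominance makes it positive definite."""
    omega = np.where(adj, off_diag, 0.0)
    np.fill_diagonal(omega, np.abs(omega).sum(axis=1) + 1.0)
    return omega

# Five hypothetical markers: three close together, two further away.
positions = [0.0, 1.0, 1.5, 4.0, 4.2]
adj = position_graph(positions, window=1.0)
omega = precision_on_graph(adj)
sigma = np.linalg.inv(omega)  # implied covariance of marker effects

print("decomposable:", is_decomposable(adj))
print("precision entry (0, 2):", omega[0, 2])
print("covariance entry (0, 2):", sigma[0, 2])
```

In this sketch, markers 0 and 2 share no edge, so their precision entry is zero and their effects are conditionally independent given the remaining markers, yet their marginal covariance is nonzero because both are linked to marker 1. This is the sense in which a concentration graph encodes partial, rather than marginal, correlation.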
