Evolutionary models accounting for layers of selection in protein-coding genes and their impact on the inference of positive selection.

Authors
  • Rubinstein, Nimrod D
  • Doron-Faigenboim, Adi
  • Mayrose, Itay
  • Pupko, Tal
Type
Published Article
Journal
Molecular Biology and Evolution
Publisher
Oxford University Press
Publication Date
Dec 01, 2011
Volume
28
Issue
12
Pages
3297–3308
Identifiers
DOI: 10.1093/molbev/msr162
PMID: 21690564
Source
Medline
License
Unknown

Abstract

The selective forces acting on a protein-coding gene are commonly inferred using evolutionary codon models by contrasting the rate of nonsynonymous substitutions with the rate of synonymous substitutions. These models usually assume that the synonymous substitution rate, Ks, is homogeneous across all sites, which is justified if synonymous sites are free from selection. However, a growing body of evidence indicates that the DNA and RNA levels of protein-coding genes are subject to varying degrees of selective constraints due to various biological functions encoded at these levels. In this paper, we develop evolutionary models that account for these layers of selection by allowing for both among-site variability of substitution rates at the DNA/RNA level (which leads to Ks variability among protein-coding sites) and among-site variability of substitution rates at the protein level (Ka variability). These models are constructed so that positive selection is either allowed or not. This enables statistical testing of positive selection when variability in the DNA/RNA substitution rate is accounted for. Using this methodology, we show that variability of the baseline DNA/RNA substitution rate is a widespread phenomenon in coding sequence data of mammalian genomes, most likely reflecting varying degrees of selection at the DNA and RNA levels. Additionally, we use simulations to examine the impact that accounting for the variability of the baseline DNA/RNA substitution rate has on the inference of positive selection. Our results show that ignoring this variability results in a high rate of erroneous positive-selection inference. Our newly developed model, which accounts for this variability, does not suffer from this problem and hence provides a likelihood framework for the inference of positive selection on a background of variability in the baseline DNA/RNA substitution rate.
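
The intuition behind the erroneous inferences can be illustrated with a toy calculation. The sketch below is not the authors' codon model or their simulation setup. It is a minimal arithmetic example under stated assumptions: every site shares the same true Ka/Ks ratio of 0.5 (purifying selection, no positive selection), the per-site baseline rate is drawn from a gamma distribution with mean 1, and the function name `apparent_omegas` and all parameter values are hypothetical. It compares the per-site ratio obtained when Ks is assumed homogeneous with the ratio obtained when Ks is allowed to vary with the baseline rate.

```python
import random


def apparent_omegas(true_omega, baseline_rates):
    """Return per-site Ka/Ks ratios under two assumptions about Ks.

    naive:     Ks is assumed equal to the mean baseline rate at every site.
    corrected: Ks at each site equals that site's own baseline rate.
    """
    mean_rate = sum(baseline_rates) / len(baseline_rates)
    naive, corrected = [], []
    for rate in baseline_rates:
        ka = true_omega * rate  # nonsynonymous rate scales with the baseline rate
        ks = rate               # synonymous rate equals the baseline rate
        naive.append(ka / mean_rate)  # homogeneous-Ks assumption
        corrected.append(ka / ks)     # site-specific Ks
    return naive, corrected


# Hypothetical settings: 1000 sites, gamma-distributed baseline rates (mean 1).
rng = random.Random(0)
rates = [max(rng.gammavariate(0.5, 2.0), 1e-9) for _ in range(1000)]

naive, corrected = apparent_omegas(0.5, rates)
naive_hits = sum(w > 1 for w in naive)          # sites that look positively selected
corrected_hits = sum(w > 1 for w in corrected)  # none, since true omega is 0.5

print(f"sites with apparent Ka/Ks > 1, homogeneous Ks: {naive_hits}")
print(f"sites with apparent Ka/Ks > 1, variable Ks:    {corrected_hits}")
```

In this toy setting, sites with a fast baseline rate appear to have Ka/Ks above 1 only because their elevated Ka is divided by an average Ks rather than their own. A real analysis would fit codon models by maximum likelihood rather than compare raw ratios.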
