TY - JOUR
T1 - The influence of rate heterogeneity among sites on the time dependence of molecular rates
AU - Soubrier, Julien
AU - Steel, Mike
AU - Lee, Mike
AU - Der Sarkissian, Clio
AU - Guindon, Stephane
AU - Ho, Simon
AU - Cooper, Alan
PY - 2012/11
Y1 - 2012/11
N2 - Molecular evolutionary rate estimates have been shown to depend on the time period over which they are estimated. Factors such as demographic processes, calibration errors, purifying selection, and the heterogeneity of substitution rates among sites (RHAS) are known to affect the accuracy with which rates of evolution are estimated. We use mathematical modeling and Bayesian analyses of simulated sequence alignments to explore how mutational hotspots can lead to time-dependent rate estimates. Mathematical modeling shows that underestimation of molecular rates over increasing time scales is inevitable when RHAS is ignored. Although a gamma distribution is commonly used to model RHAS, we show that when the actual RHAS deviates from a gamma-like distribution, rates can either be under- or overestimated in a time-dependent manner. Simulations performed under different scenarios of RHAS confirm the mathematical modeling and demonstrate the impacts of time-dependent rates on estimates of divergence times. Most notably, erroneous rate estimates can have narrow credibility intervals, leading to false confidence in biased estimates of rates and node ages. Surprisingly, large errors in estimates of overall molecular rate do not necessarily generate large errors in divergence time estimates. Finally, we illustrate the correlation between time-dependent rate patterns and differential saturation between quickly and slowly evolving sites. Our results suggest that data partitioning or simple nonparametric mixture models of RHAS significantly improve the accuracy with which node ages and substitution rates can be estimated.
AB - Molecular evolutionary rate estimates have been shown to depend on the time period over which they are estimated. Factors such as demographic processes, calibration errors, purifying selection, and the heterogeneity of substitution rates among sites (RHAS) are known to affect the accuracy with which rates of evolution are estimated. We use mathematical modeling and Bayesian analyses of simulated sequence alignments to explore how mutational hotspots can lead to time-dependent rate estimates. Mathematical modeling shows that underestimation of molecular rates over increasing time scales is inevitable when RHAS is ignored. Although a gamma distribution is commonly used to model RHAS, we show that when the actual RHAS deviates from a gamma-like distribution, rates can either be under- or overestimated in a time-dependent manner. Simulations performed under different scenarios of RHAS confirm the mathematical modeling and demonstrate the impacts of time-dependent rates on estimates of divergence times. Most notably, erroneous rate estimates can have narrow credibility intervals, leading to false confidence in biased estimates of rates and node ages. Surprisingly, large errors in estimates of overall molecular rate do not necessarily generate large errors in divergence time estimates. Finally, we illustrate the correlation between time-dependent rate patterns and differential saturation between quickly and slowly evolving sites. Our results suggest that data partitioning or simple nonparametric mixture models of RHAS significantly improve the accuracy with which node ages and substitution rates can be estimated.
KW - among-site rate variation
KW - divergence times
KW - molecular clock
KW - substitution rate
KW - time-dependent rates
UR - http://www.scopus.com/inward/record.url?scp=84867750911&partnerID=8YFLogxK
U2 - 10.1093/molbev/mss140
DO - 10.1093/molbev/mss140
M3 - Article
VL - 29
SP - 3345
EP - 3358
JO - Molecular Biology and Evolution
JF - Molecular Biology and Evolution
SN - 0737-4038
IS - 11
ER -