Ghent University Academic Bibliography

Inference of genome duplications from age distributions revisited

Kevin Vanneste, Yves Van de Peer (UGent) and Steven Maere (UGent) (2013) Molecular Biology and Evolution. 30(1). p. 177-190
abstract
Whole-genome duplications (WGDs), thought to facilitate evolutionary innovations and adaptations, have been uncovered in many phylogenetic lineages. WGDs are frequently inferred from duplicate age distributions, where they manifest themselves as peaks against a small-scale duplication background. However, the interpretation of duplicate age distributions is complicated by the use of K_S, the number of synonymous substitutions per synonymous site, as a proxy for the age of paralogs. Two particular concerns are the stochastic nature of synonymous substitutions leading to increasing uncertainty in K_S with increasing age since duplication and K_S saturation caused by the inability of evolutionary models to fully correct for the occurrence of multiple substitutions at the same site. K_S stochasticity is expected to erode the signal of older WGDs, whereas K_S saturation may lead to artificial peaks in the distribution. Here, we investigate the consequences of these effects on K_S-based age distributions and WGD inference by simulating the evolution of duplicated sequences according to predefined real age distributions and re-estimating the corresponding K_S distributions. We show that, although K_S estimates can be used for WGD inference far beyond the commonly accepted K_S threshold of 1, K_S saturation effects can cause artificial peaks at higher ages. Moreover, K_S stochasticity and saturation may lead to confounded peaks encompassing multiple WGD events and/or saturation artifacts. We argue that K_S effects need to be properly accounted for when inferring WGDs from age distributions and that the failure to do so could lead to false inferences.
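The two effects named in the abstract, growing uncertainty with age and saturation, can be illustrated with a toy simulation. The sketch below is not the authors' method: it assumes a simple Jukes-Cantor nucleotide model instead of a codon substitution model, and the function names, the number of sites (300) and the number of simulated pairs are all hypothetical choices made for illustration. It draws the number of observed differences for a pair of sequences at a known true distance, applies the Jukes-Cantor correction, and reports how the spread of the estimates and the fraction of uncorrectable (saturated) pairs change with distance.

```python
import math
import random


def expected_p(d):
    """Expected fraction of differing sites under Jukes-Cantor at true distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_distance(p):
    """Jukes-Cantor corrected distance; None when p >= 0.75 (correction undefined)."""
    if p >= 0.75:
        return None
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def simulate_estimates(d, n_sites=300, n_pairs=1000, seed=0):
    """Re-estimate the distance for n_pairs simulated pairs at true distance d.

    Each site differs independently with probability expected_p(d).
    Returns (list of finite estimates, number of saturated pairs).
    """
    rng = random.Random(seed)
    p_true = expected_p(d)
    estimates, saturated = [], 0
    for _ in range(n_pairs):
        diffs = sum(1 for _ in range(n_sites) if rng.random() < p_true)
        est = jc_distance(diffs / n_sites)
        if est is None:
            saturated += 1
        else:
            estimates.append(est)
    return estimates, saturated


def stdev(xs):
    """Sample standard deviation."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))


if __name__ == "__main__":
    for d in (0.2, 1.0, 2.0, 3.0):
        ests, sat = simulate_estimates(d)
        print(f"true d={d:.1f}  sd of estimates={stdev(ests):.3f}  saturated pairs={sat}")
```

Under these assumptions the spread of the re-estimated distances widens as the true distance grows, and a growing share of pairs cannot be corrected at all, which is the qualitative behaviour the abstract describes for K_S.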
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-3131683
type
journalArticle (original)
publication status
published
keyword
FLOWERING PLANTS, SYNONYMOUS NUCLEOTIDE DIVERGENCE, PHYLOGENETIC ANALYSIS, simulated sequence evolution, K-S saturation, duplicate age distribution, gene and genome duplication, WHOLE-GENOME, EVOLUTIONARY ANALYSIS, ARABIDOPSIS-THALIANA, MOLECULAR EVOLUTION, CODON SUBSTITUTION, PROTEIN FAMILIES, GENE DUPLICATION
journal title
MOLECULAR BIOLOGY AND EVOLUTION
Mol. Biol. Evol.
volume
30
issue
1
pages
177 - 190
Web of Science type
Article
Web of Science id
000312888600019
JCR category
BIOCHEMISTRY & MOLECULAR BIOLOGY
JCR impact factor
14.308 (2013)
JCR rank
6/291 (2013)
JCR quartile
1 (2013)
ISSN
0737-4038
DOI
10.1093/molbev/mss214
project
Bioinformatics: from nucleotides to networks (N2N)
language
English
UGent publication?
yes
classification
A1
copyright statement
I have transferred the copyright for this publication to the publisher
id
3131683
handle
http://hdl.handle.net/1854/LU-3131683
date created
2013-02-14 15:36:27
date last changed
2016-12-19 15:44:57
@article{3131683,
  abstract     = {Whole-genome duplications (WGDs), thought to facilitate evolutionary innovations and adaptations, have been uncovered in many phylogenetic lineages. WGDs are frequently inferred from duplicate age distributions, where they manifest themselves as peaks against a small-scale duplication background. However, the interpretation of duplicate age distributions is complicated by the use of $K_S$, the number of synonymous substitutions per synonymous site, as a proxy for the age of paralogs. Two particular concerns are the stochastic nature of synonymous substitutions leading to increasing uncertainty in $K_S$ with increasing age since duplication and $K_S$ saturation caused by the inability of evolutionary models to fully correct for the occurrence of multiple substitutions at the same site. $K_S$ stochasticity is expected to erode the signal of older WGDs, whereas $K_S$ saturation may lead to artificial peaks in the distribution. Here, we investigate the consequences of these effects on $K_S$-based age distributions and WGD inference by simulating the evolution of duplicated sequences according to predefined real age distributions and re-estimating the corresponding $K_S$ distributions. We show that, although $K_S$ estimates can be used for WGD inference far beyond the commonly accepted $K_S$ threshold of 1, $K_S$ saturation effects can cause artificial peaks at higher ages. Moreover, $K_S$ stochasticity and saturation may lead to confounded peaks encompassing multiple WGD events and/or saturation artifacts. We argue that $K_S$ effects need to be properly accounted for when inferring WGDs from age distributions and that the failure to do so could lead to false inferences.},
  author       = {Vanneste, Kevin and Van de Peer, Yves and Maere, Steven},
  issn         = {0737-4038},
  journal      = {Molecular Biology and Evolution},
  keyword      = {FLOWERING PLANTS,SYNONYMOUS NUCLEOTIDE DIVERGENCE,PHYLOGENETIC ANALYSIS,simulated sequence evolution,K-S saturation,duplicate age distribution,gene and genome duplication,WHOLE-GENOME,EVOLUTIONARY ANALYSIS,ARABIDOPSIS-THALIANA,MOLECULAR EVOLUTION,CODON SUBSTITUTION,PROTEIN FAMILIES,GENE DUPLICATION},
  language     = {eng},
  number       = {1},
  pages        = {177--190},
  title        = {Inference of genome duplications from age distributions revisited},
  url          = {http://dx.doi.org/10.1093/molbev/mss214},
  volume       = {30},
  year         = {2013},
}

Chicago
Vanneste, Kevin, Yves Van de Peer, and Steven Maere. 2013. “Inference of Genome Duplications from Age Distributions Revisited.” Molecular Biology and Evolution 30 (1): 177–190.
APA
Vanneste, K., Van de Peer, Y., & Maere, S. (2013). Inference of genome duplications from age distributions revisited. Molecular Biology and Evolution, 30(1), 177–190.
Vancouver
1. Vanneste K, Van de Peer Y, Maere S. Inference of genome duplications from age distributions revisited. Mol Biol Evol. 2013;30(1):177–90.
MLA
Vanneste, Kevin, Yves Van de Peer, and Steven Maere. “Inference of Genome Duplications from Age Distributions Revisited.” Molecular Biology and Evolution 30.1 (2013): 177–190. Print.