
Assessment of Methods for Amino Acid Matrix Selection and Their Use on Empirical Data Shows That Ad Hoc Assumptions for Choice of Matrix Are Not Justified

Overview
Journal BMC Evol Biol
Publisher BioMed Central
Specialty Biology
Date 2006 Mar 28
PMID 16563161
Citations 478
Abstract

Background: In recent years, model-based approaches such as maximum likelihood have become the methods of choice for constructing phylogenies. Several authors have shown that adequate substitution models are important for producing accurate phylogenies. Many empirical models of amino acid substitution have been derived using a variety of methods and protein datasets. These matrices are normally used as surrogates, rather than deriving a maximum likelihood model from the dataset being examined. With few exceptions, selection between alternative matrices has been carried out in an ad hoc manner.

Results: We start by highlighting the potential dangers of arbitrarily choosing protein models, using an empirical example in which a single alignment produces two topologically different and strongly supported phylogenies under two arbitrarily chosen amino acid substitution models. We demonstrate that in simple simulations, statistical methods of model selection are indeed robust and likely to be useful for protein model selection. We investigated patterns of amino acid substitution among homologous sequences from the three domains of life, and our results show that no single amino acid matrix is optimal for any of the datasets. Perhaps most interestingly, we demonstrate that for two large datasets derived from the proteobacteria and archaea, one of the most favored models in both datasets is a model that was originally derived from retroviral Pol proteins.

Conclusion: This demonstrates that choosing protein models based on their source or method of construction may not be appropriate.
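The abstract refers to "statistical methods of model selection" without naming a criterion. As a minimal sketch, assuming an information criterion such as AIC or BIC is used (an assumption, not something stated in this abstract), candidate matrices could be ranked as below. The function names, model labels, log-likelihoods, and alignment length are all hypothetical and for illustration only; in practice the log-likelihoods would come from fitting each model to the same alignment and tree with a phylogenetics package.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def rank_models(fits, n_sites, criterion="aic"):
    """Sort candidate models from best to worst score.

    fits maps a model label to (log_likelihood, n_extra_params).
    Parameters shared by every candidate (e.g. branch lengths on a
    fixed topology) add the same constant to each score, so only the
    extra parameters per model need to be counted for ranking.
    """
    scored = []
    for name, (lnl, k) in fits.items():
        if criterion == "aic":
            score = aic(lnl, k)
        else:
            score = bic(lnl, k, n_sites)
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1])


# Made-up log-likelihoods for one alignment; NOT results from the paper.
# "+G" marks one extra parameter for gamma-distributed rate variation.
example_fits = {
    "JTT": (-12450.3, 0),
    "WAG": (-12410.8, 0),
    "WAG+G": (-12180.2, 1),
    "rtREV+G": (-12175.9, 1),
}

ranking = rank_models(example_fits, n_sites=350)
for name, score in ranking:
    print(f"{name}: {score:.1f}")
```

The point of the sketch is only that the choice is driven by fit to the data at hand rather than by where a matrix came from, which is the abstract's conclusion.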

Citing Articles

Phylogeography Analysis Reveals Rabies Epidemiology, Evolution, and Transmission in the Philippines.

Zhang L, Cruz J, Tian Y, Wang Y, Jiang J, Gonzales R. Mol Biol Evol. 2025; 42(2).

PMID: 39936582 PMC: 11815495. DOI: 10.1093/molbev/msaf007.


High astrovirus diversity in an endemic bat species suggests multiple spillovers from synanthropic rodents and birds.

Leong R, Hoarau A, Carcauzon V, Koster M, Dietrich M, Tortosa P. J Virol. 2025; 99(2):e0135724.

PMID: 39840948 PMC: 11853114. DOI: 10.1128/jvi.01357-24.


Morphological vs. molecular identification of trematode species infecting the edible cockle across Europe.

Stout L, Daffe G, Chambouvet A, Correia S, Culloty S, Freitas R. Int J Parasitol Parasites Wildl. 2024; 25:101019.

PMID: 39687765 PMC: 11648788. DOI: 10.1016/j.ijppaw.2024.101019.


Intergrative Taxonomic Study of the Complex with a Modern Circumscription of the Section (Frullaniaceae, Marchantiphyta).

Mamontov Y, Vilnet A, Atwood J, Konstantinova N. Plants (Basel). 2024; 13(17).

PMID: 39273882 PMC: 11397712. DOI: 10.3390/plants13172397.


Mapping disparities in viral infection rates using highly multiplexed serology.

Pina A, Elko E, Caballero R, Metrailer M, Mulrow M, Quan D. mSphere. 2024; 9(9):e0012724.

PMID: 39162531 PMC: 11423740. DOI: 10.1128/msphere.00127-24.

