
Parametric and Non-parametric Masking of Randomness in Sequence Alignments Can Be Improved and Leads to Better Resolved Trees

Overview
Journal Front Zool
Publisher Biomed Central
Specialty Biology
Date 2010 Apr 2
PMID 20356385
Citations 80
Abstract

Background: Alignment masking, the technique of excluding alignment blocks prior to tree reconstruction, has been successful in improving the signal-to-noise ratio in sequence alignments. However, the lack of formally well-defined methods to identify randomness in sequence alignments has prevented a routine application of alignment masking. In this study, we compared the effects on tree reconstruction of the most commonly used profiling method (GBLOCKS), which uses a predefined set of rules in combination with alignment masking, with those of a new profiling approach (ALISCORE) based on Monte Carlo resampling within a sliding window, using different data sets and alignment methods. While the GBLOCKS approach excludes variable sections above a certain threshold, the choice of which is left arbitrary, the ALISCORE algorithm is free of a priori rating of parameter space and is therefore more objective.

Results: ALISCORE was successfully extended to amino acids using a proportional model and empirical substitution matrices to score randomness in multiple sequence alignments. A complex bootstrap resampling leads to an even distribution of scores of randomly similar sequences, against which the randomness of the observed sequence similarity is assessed. When tested on real data, both masking methods, GBLOCKS and ALISCORE, helped to improve tree resolution. The sliding window approach was less sensitive to different alignments of identical data sets and performed equally well on all data sets. In addition, ALISCORE is capable of dealing with different substitution patterns and heterogeneous base composition. ALISCORE and the most relaxed GBLOCKS gap parameter setting performed best on all data sets. Correspondingly, Neighbor-Net analyses showed the greatest decrease in conflict.

Conclusions: Alignment masking improves the signal-to-noise ratio in multiple sequence alignments prior to phylogenetic reconstruction. Given the robust performance of alignment profiling, alignment masking should routinely be used to improve tree reconstructions. Parametric methods of alignment profiling can easily be extended to more complex likelihood-based models of sequence evolution, which opens the possibility of further improvements.
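
The abstract describes scoring randomness by Monte Carlo resampling within a sliding window and then masking the affected alignment blocks. The following is a minimal illustrative sketch of that general idea only, not the ALISCORE implementation. The +1/-1 match/mismatch score, the composition-based random draws, the 95% cutoff, the window size, and all function names (`window_score`, `random_cutoff`, `profile_pair`, `mask_columns`) are assumptions made for illustration.

```python
import random


def window_score(a, b):
    """Assumed toy score: +1 per matching position, -1 per mismatch."""
    return sum(1 if x == y else -1 for x, y in zip(a, b))


def random_cutoff(a, b, window, n_resamples, rng, quantile=0.95):
    """Monte Carlo cutoff: score random window pairs drawn from each
    sequence's own character composition and return the chosen quantile."""
    scores = []
    for _ in range(n_resamples):
        ra = rng.choices(a, k=window)  # resample characters of sequence a
        rb = rng.choices(b, k=window)  # resample characters of sequence b
        scores.append(window_score(ra, rb))
    scores.sort()
    return scores[min(int(quantile * n_resamples), n_resamples - 1)]


def profile_pair(a, b, window=6, n_resamples=200, seed=0):
    """Slide a window along two aligned sequences. Each window votes +1 on
    its columns if it scores above the random cutoff, otherwise -1."""
    rng = random.Random(seed)
    cutoff = random_cutoff(a, b, window, n_resamples, rng)
    profile = [0] * len(a)
    for start in range(len(a) - window + 1):
        observed = window_score(a[start:start + window], b[start:start + window])
        vote = 1 if observed > cutoff else -1
        for i in range(start, start + window):
            profile[i] += vote
    return profile


def mask_columns(alignment, **kwargs):
    """Sum the pairwise profiles over all sequence pairs and return the
    indices of columns with a positive total (the columns that are kept)."""
    total = [0] * len(alignment[0])
    for i in range(len(alignment)):
        for j in range(i + 1, len(alignment)):
            pair_profile = profile_pair(alignment[i], alignment[j], **kwargs)
            for k, vote in enumerate(pair_profile):
                total[k] += vote
    return [k for k, v in enumerate(total) if v > 0]


if __name__ == "__main__":
    # Hypothetical toy alignment: a conserved first half, a fully divergent second half.
    seq_a = "ACGTACGTACGT" + "AAAAAAAAAAAA"
    seq_b = "ACGTACGTACGT" + "CCCCCCCCCCCC"
    print(mask_columns([seq_a, seq_b]))
```

In this sketch the cutoff is derived from the data by resampling rather than set by hand, which mirrors the contrast the abstract draws with rule-based thresholds. The window size and quantile remain free choices here and would need justification in any real analysis.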

Citing Articles

Orthoptera-specific target enrichment (OR-TE) probes resolve relationships over broad phylogenetic scales.

Shin S, Baker A, Enk J, McKenna D, Foquet B, Vandergast A Sci Rep. 2024; 14(1):21377.

PMID: 39271747 PMC: 11399444. DOI: 10.1038/s41598-024-72622-6.


Endless forms most frustrating: disentangling species boundaries in the group (), with the description of six new species and a key to the group.

Blazquez M, Perez-Vargas I, Garrido-Benavent I, Villar-dePablo M, Turegano Y, Frias-Lopez C Persoonia. 2024; 52:44-93.

PMID: 39161630 PMC: 11319839. DOI: 10.3767/persoonia.2024.52.03.


The evolutionary origins and ancestral features of septins.

Delic S, Shuman B, Lee S, Bahmanyar S, Momany M, Onishi M Front Cell Dev Biol. 2024; 12:1406966.

PMID: 38994454 PMC: 11238149. DOI: 10.3389/fcell.2024.1406966.


The Evolutionary Origins and Ancestral Features of Septins.

Delic S, Shuman B, Lee S, Bahmanyar S, Momany M, Onishi M bioRxiv. 2024; .

PMID: 38585751 PMC: 10996617. DOI: 10.1101/2024.03.25.586683.


Potential Contribution of Ancient Introgression to the Evolution of a Derived Reproductive Strategy in Ricefishes.

Flury J, Meusemann K, Martin S, Hilgers L, Spanke T, Bohne A Genome Biol Evol. 2023; 15(8).

PMID: 37493080 PMC: 10465105. DOI: 10.1093/gbe/evad138.

