
Accuracy of Multiple Sequence Alignment Methods in the Reconstruction of Transposable Element Families

Overview
Specialty: Biology
Date: 2022 May 20
PMID: 35591887
Abstract

The construction of a high-quality multiple sequence alignment (MSA) from copies of a transposable element (TE) is a critical step in the characterization of a new TE family. Most studies of MSA accuracy have been conducted on protein or RNA sequence families, where structural features and strong signals of selection may assist with alignment. Less attention has been given to the quality of alignments of neutrally evolving DNA sequences, such as those resulting from TE replication. TE sequences are challenging to align because of their wide divergence ranges, fragmentation, and predominantly neutral mutation patterns. To gain insight into the effects of these properties on MSA accuracy, we developed a simulator of TE sequence evolution and used it to generate a benchmark, with which we evaluated the MSA predictions produced by several popular aligners, along with Refiner, a method we developed in the context of our RepeatModeler software. We find that MAFFT and Refiner generally outperform other aligners on simulated sequences of low to medium divergence, while Refiner is uniquely effective when tasked with aligning highly divergent and fragmented instances of a family.
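To make the benchmark idea concrete, the following is a minimal sketch of how neutrally diverged, fragmented copies of an ancestral element might be generated. It is not the authors' simulator: the function names (`mutate`, `fragment`, `simulate_family`) and the parameters `divergence` and `min_fraction` are hypothetical, and the model is deliberately simplified (independent substitutions only, no indels, no phylogenetic structure among copies).

```python
import random

BASES = "ACGT"

def mutate(seq, divergence, rng):
    """Substitute each base independently with probability `divergence`
    (a toy neutral model: no selection, no indels)."""
    out = []
    for base in seq:
        if rng.random() < divergence:
            # Replace with one of the three other bases, chosen uniformly.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def fragment(seq, min_fraction, rng):
    """Keep a random contiguous piece covering at least `min_fraction` of seq."""
    min_len = max(1, int(len(seq) * min_fraction))
    length = rng.randint(min_len, len(seq))
    start = rng.randint(0, len(seq) - length)
    return seq[start:start + length]

def simulate_family(ancestor, n_copies, divergence, min_fraction, seed=0):
    """Return `n_copies` mutated, fragmented copies of `ancestor`."""
    rng = random.Random(seed)
    return [
        fragment(mutate(ancestor, divergence, rng), min_fraction, rng)
        for _ in range(n_copies)
    ]
```

Because each copy's true relationship to the ancestor is known by construction, a set generated this way could in principle be handed to an aligner and the predicted MSA compared with the known correspondence. Raising `divergence` or lowering `min_fraction` would mimic the harder high-divergence, fragmented cases described in the abstract.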

Citing Articles

HiTE: a fast and accurate dynamic boundary adjustment approach for full-length transposable element detection and annotation.

Hu K, Ni P, Xu M, Zou Y, Chang J, Gao X. Nat Commun. 2024; 15(1):5573.

PMID: 38956036 PMC: 11219922. DOI: 10.1038/s41467-024-49912-8.


The good, the bad and the ugly of transposable elements annotation tools.

Loreto E, Melo E, Wallau G, Gomes T. Genet Mol Biol. 2024; 46(3 Suppl 1):e20230138.

PMID: 38373163 PMC: 10876081. DOI: 10.1590/1678-4685-GMB-2023-0138.


An immune-suppressing protein in human endogenous retroviruses.

Zhang H, Ni S, Frith M. Bioinform Adv. 2023; 3(1):vbad013.

PMID: 36818731 PMC: 9927554. DOI: 10.1093/bioadv/vbad013.


Recent Advances in Antibiotic-Free Markers; Novel Technologies to Enhance Safe Human Food Production in the World.

Mmbando G. Mol Biotechnol. 2022; 65(7):1011-1022.

PMID: 36443619 DOI: 10.1007/s12033-022-00609-7.


Insights from analyses of low complexity regions with canonical methods for protein sequence comparison.

Jarnot P, Ziemska-Legiecka J, Grynberg M, Gruca A. Brief Bioinform. 2022; 23(5).

PMID: 35914952 PMC: 9487646. DOI: 10.1093/bib/bbac299.

