MAVID: Constrained Ancestral Alignment of Multiple Sequences

Overview
Journal Genome Res
Specialty Genetics
Date 2004 Apr 3
PMID 15060012
Citations 119
Abstract

We describe a new global multiple-alignment program capable of aligning a large number of genomic regions. Our progressive-alignment approach incorporates the following ideas: maximum-likelihood inference of ancestral sequences, automatic guide-tree construction, protein-based anchoring of ab-initio gene predictions, and constraints derived from a global homology map of the sequences. We have implemented these ideas in the MAVID program, which can accurately align multiple genomic regions up to megabases long. MAVID can also effectively align divergent sequences, as well as incomplete or unfinished sequences. We demonstrate the capabilities of the program on the benchmark CFTR region, which consists of 1.8 Mb of human sequence and 20 orthologous regions in marsupials, birds, fish, and mammals. Finally, we describe two large MAVID alignments: an alignment of all the available HIV genomes and a multiple alignment of the entire human, mouse, and rat genomes.
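
To make the phrase "maximum-likelihood inference of ancestral sequences" concrete, the following is a minimal sketch and not MAVID's implementation. It assumes a Jukes-Cantor substitution model, a single ancestor with exactly two descendants, gap-free input that is already aligned, and branch lengths measured in expected substitutions per site. All function names (`jc_prob`, `ml_ancestor`, `infer_ancestral_sequence`) are hypothetical and chosen only for illustration.

```python
import math

BASES = "ACGT"


def jc_prob(a, b, t):
    """Jukes-Cantor probability that base a is observed as base b after
    a branch of length t (expected substitutions per site)."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if a == b else 0.25 - 0.25 * decay


def ml_ancestor(left, right, t_left, t_right):
    """Most likely ancestral base for one aligned column with two leaves.

    Each candidate ancestor x is scored by prior(x) * P(x -> left) * P(x -> right),
    using a uniform prior of 1/4 (an assumption of this sketch).
    """
    scores = {
        x: 0.25 * jc_prob(x, left, t_left) * jc_prob(x, right, t_right)
        for x in BASES
    }
    return max(scores, key=scores.get)


def infer_ancestral_sequence(seq_left, seq_right, t_left, t_right):
    """Column-by-column ancestral inference for two aligned, gap-free sequences."""
    if len(seq_left) != len(seq_right):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        ml_ancestor(a, b, t_left, t_right) for a, b in zip(seq_left, seq_right)
    )


if __name__ == "__main__":
    # The left leaf sits on a shorter branch, so it dominates at mismatches.
    print(infer_ancestral_sequence("ACGA", "ACGG", 0.1, 0.5))
```

In a progressive aligner, an inferred ancestral sequence of this kind would stand in for its subtree when aligning further up the guide tree; handling gaps, larger trees, and the anchoring and homology-map constraints named in the abstract is beyond this sketch.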

Citing Articles

Multiple genome alignment in the telomere-to-telomere assembly era.

Kille B, Balaji A, Sedlazeck F, Nute M, Treangen T Genome Biol. 2022; 23(1):182.

PMID: 36038949 PMC: 9421119. DOI: 10.1186/s13059-022-02735-6.


Comparative analyses of the Hymenoscyphus fraxineus and Hymenoscyphus albidus genomes reveals potentially adaptive differences in secondary metabolite and transposable element repertoires.

Elfstrand M, Chen J, Cleary M, Halecker S, Ihrmark K, Karlsson M BMC Genomics. 2021; 22(1):503.

PMID: 34217229 PMC: 8254937. DOI: 10.1186/s12864-021-07837-2.


Multiple Alignment of Promoter Sequences from the L. Genome.

Korotkov E, Suvorova Y, Kostenko D, Korotkova M Genes (Basel). 2021; 12(2).

PMID: 33494278 PMC: 7909805. DOI: 10.3390/genes12020135.


Progressive Cactus is a multiple-genome aligner for the thousand-genome era.

Armstrong J, Hickey G, Diekhans M, Fiddes I, Novak A, Deran A Nature. 2020; 587(7833):246-251.

PMID: 33177663 PMC: 7673649. DOI: 10.1038/s41586-020-2871-y.


Draft genomes of two outcrossing wild rice, and , reveal genomic features associated with mating-system evolution.

Li W, Zhang Q, Zhu T, Tong Y, Li K, Shi C Plant Direct. 2020; 4(6):e00232.

PMID: 32537559 PMC: 7287411. DOI: 10.1002/pld3.232.

