
MAFFT-DASH: Integrated Protein Sequence and Structural Alignment

Overview
Specialty Biochemistry
Date 2019 May 8
PMID 31062021
Citations 371
Abstract

Here, we describe a web server that integrates structural alignments with the MAFFT multiple sequence alignment (MSA) tool. For this purpose, we have prepared a web-based Database of Aligned Structural Homologs (DASH), which provides structural alignments at the domain and chain levels for all proteins in the Protein Data Bank (PDB), and can be queried interactively or by a simple REST-like API. MAFFT-DASH integration can be invoked with a single flag on either the web (https://mafft.cbrc.jp/alignment/server/) or command-line versions of MAFFT. In our benchmarks using 878 cases from the BAliBASE, HomFam, OXFam, Mattbench and SISYPHUS datasets, MAFFT-DASH showed 10-20% improvement over standard MAFFT for MSA problems with weak similarity, in terms of Sum-of-Pairs (SP), a measure of how well a program succeeds at aligning input sequences in comparison to a reference alignment. When MAFFT alignments were supplemented with homologous sequences, further improvement was observed. Potential applications of DASH beyond MSA enrichment include functional annotation through detection of remote homology and assembly of template libraries for homology modeling.
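
To make the Sum-of-Pairs (SP) measure concrete, the following is a minimal Python sketch of one common formulation: the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. The function names and the toy sequences are illustrative assumptions, not taken from the paper, and published benchmarks may restrict scoring to core columns or to a subset of reference sequences, which this sketch does not do.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    Each residue is identified as (sequence index, ungapped residue index).
    """
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all rows must have the same length")

    counters = [0] * len(alignment)  # next ungapped index for each sequence
    pairs = set()
    for col in range(n_cols):
        residues = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != "-":
                residues.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        # Every pair of residues in this column counts as one aligned pair.
        pairs.update(combinations(residues, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference-aligned residue pairs reproduced by `test`.

    Assumes both alignments contain the same sequences in the same order.
    """
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Hypothetical toy alignments of the same two sequences.
reference = ["AC-GT", "ACAGT"]
test = ["ACG-T", "ACAGT"]
print(sum_of_pairs_score(test, reference))  # 0.75
```

In the toy case the test alignment misplaces one of the four reference pairs, so the score is 0.75. An alignment identical to the reference scores 1.0.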

Citing Articles

IMA GENOME - F20 A draft genome assembly of , , , , and genomic resources for and .

D'Angelo D, Sorrentino R, Nkomo T, Zhou X, Vaghefi N, Sonnekus B IMA Fungus. 2025; 16:e141732.

PMID: 40052082 PMC: 11882029. DOI: 10.3897/imafungus.16.141732.


Assembly and comparative analysis of the complete mitochondrial genome of (Polyporaceae, Basidiomycota), contributing to understanding fungal evolution and ecological functions.

Ma J, Li H, Jin C, Wang H, Tang L, Si J IMA Fungus. 2025; 16:e141288.

PMID: 40052081 PMC: 11882022. DOI: 10.3897/imafungus.16.141288.


Virome specific to tick genus with distinct ecogeographical distribution.

Tian D, Ye R, Li Y, Wang N, Gao W, Wang B Microbiome. 2025; 13(1):57.

PMID: 40022268 PMC: 11869668. DOI: 10.1186/s40168-025-02061-6.


Comparative Chloroplast Genomics and Codon Usage Bias Analysis in Genus.

Yang Y, Liu X, He L, Li Z, Yuan B, Fang F Genes (Basel). 2025; 16(2).

PMID: 40004530 PMC: 11855534. DOI: 10.3390/genes16020201.


The complete mitochondrial genome of Wang, Du Liu, 2008 (Neuroptera: Osmylidae: Spilosmylinae) with phylogenetic analysis.

Zhang R, Tian S, Liu X, Wang Y Mitochondrial DNA B Resour. 2025; 10(3):218-223.

PMID: 39968217 PMC: 11834781. DOI: 10.1080/23802359.2025.2466579.

