
MSAProbs: Multiple Sequence Alignment Based on Pair Hidden Markov Models and Partition Function Posterior Probabilities

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2010 Jun 26
PMID: 20576627
Citations: 89
Abstract

Motivation: Multiple sequence alignment is of central importance to bioinformatics and computational biology. Although a large number of algorithms for computing a multiple sequence alignment have been designed, the efficient computation of highly accurate multiple alignments is still a challenge.

Results: We present MSAProbs, a new and practical multiple alignment algorithm for protein sequences. MSAProbs combines pair hidden Markov models and partition functions to calculate posterior probabilities. Two further techniques, weighted probabilistic consistency transformation and weighted profile-profile alignment, are incorporated to improve alignment accuracy. Assessed on the popular benchmarks BAliBASE, PREFAB, SABmark and OXBENCH, MSAProbs achieves statistically significant accuracy improvements over existing top-performing aligners, including ClustalW, MAFFT, MUSCLE, ProbCons and Probalign. MSAProbs is also optimized for multi-core CPUs through a multi-threaded design, which gives execution times competitive with other aligners.
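
To illustrate the idea behind a weighted probabilistic consistency transformation, the following is a minimal Python sketch, not the MSAProbs C++ implementation. The abstract does not state the exact formula, so the weighting scheme used here is an assumption: the direct posterior matrix for a pair (x, y) is weighted by w_x + w_y, each path through a third sequence z is weighted by w_z, and the result is normalized by the sum of all weights. The function names (`weighted_consistency`, `matmul`, `transpose`) and the dictionary layout of the posterior matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def weighted_consistency(post, weights):
    """Re-estimate pairwise posteriors using evidence from third sequences.

    post[(x, y)] with x < y is a len(x)-by-len(y) matrix of residue-pair
    posterior probabilities; weights[i] is the weight of sequence i.
    The weighting scheme is an assumption made for this sketch.
    """
    n = len(weights)
    total = sum(weights)

    def get(x, y):
        # Only x < y is stored; the reverse direction is the transpose.
        return post[(x, y)] if x < y else transpose(post[(y, x)])

    out = {}
    for x in range(n):
        for y in range(x + 1, n):
            # Direct evidence from the pair itself.
            acc = [[(weights[x] + weights[y]) * v for v in row]
                   for row in post[(x, y)]]
            # Indirect evidence: x aligned to z and z aligned to y.
            for z in range(n):
                if z == x or z == y:
                    continue
                via = matmul(get(x, z), get(z, y))
                for i, row in enumerate(via):
                    for j, v in enumerate(row):
                        acc[i][j] += weights[z] * v
            out[(x, y)] = [[v / total for v in row] for row in acc]
    return out


# Toy example: three one-residue sequences. Sequences 0 and 1 have no direct
# support, but both align confidently to sequence 2.
example = {(0, 1): [[0.0]], (0, 2): [[1.0]], (1, 2): [[1.0]]}
result = weighted_consistency(example, [1, 1, 2])
print(result[(0, 1)])  # [[0.5]]
```

In the toy example, the pair (0, 1) gains probability purely from agreement with the third sequence, which is the effect a consistency transformation is meant to capture.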

Availability: The source code of MSAProbs, written in C++, is freely and publicly available from http://msaprobs.sourceforge.net.

Citing Articles

A conserved acidic residue drives thyroxine synthesis within thyroglobulin and other protein precursors.

Stejskalova C, Arrigoni F, Albanesi R, Bertini L, Mollica L, Coscia F. J Biol Chem. 2024; 301(1):108026.

PMID: 39608720 PMC: 11730217. DOI: 10.1016/j.jbc.2024.108026.


The hagfish genome and the evolution of vertebrates.

Marletaz F, Timoshevskaya N, Timoshevskiy V, Parey E, Simakov O, Gavriouchkina D. Nature. 2024; 627(8005):811-820.

PMID: 38262590 PMC: 10972751. DOI: 10.1038/s41586-024-07070-3.


Multiple sequence alignment based on deep reinforcement learning with self-attention and positional encoding.

Liu Y, Yuan H, Zhang Q, Wang Z, Xiong S, Wen N. Bioinformatics. 2023; 39(11).

PMID: 37856335 PMC: 10628385. DOI: 10.1093/bioinformatics/btad636.


Analysis of the sea urchin genome highlights contrasting trends of genomic and regulatory evolution in deuterostomes.

Marletaz F, Couloux A, Poulain J, Labadie K, Da Silva C, Mangenot S. Cell Genom. 2023; 3(4):100295.

PMID: 37082140 PMC: 10112332. DOI: 10.1016/j.xgen.2023.100295.


Experimental and computational analysis of the ancestry of an evolutionary young enzyme from histidine biosynthesis.

Kinateder T, Drexler L, Straub K, Merkl R, Sterner R. Protein Sci. 2022; 32(1):e4536.

PMID: 36502290 PMC: 9798254. DOI: 10.1002/pro.4536.