
An Expectation Maximization Algorithm for Training Hidden Substitution Models

Overview
Journal: J Mol Biol
Publisher: Elsevier
Date: 2002 Apr 17
PMID: 11955022
Citations: 32
Abstract

We derive an expectation maximization algorithm for maximum-likelihood training of substitution rate matrices from multiple sequence alignments. The algorithm can be used to train hidden substitution models, where the structural context of a residue is treated as a hidden variable that can evolve over time. We used the algorithm to train hidden substitution matrices on protein alignments in the Pfam database. When the accuracy of multiple alignment algorithms is measured against BAliBASE (a database of structural reference alignments), our substitution matrices consistently outperform the PAM series, with the improvement steadily increasing as up to four hidden site classes are added. We discuss several applications of this algorithm in bioinformatics.
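
As a rough illustration of the kind of EM update the abstract describes, the sketch below fits a substitution rate matrix to endpoint observations (start state, end state, elapsed time) rather than to full multiple alignments on a tree. It is not the paper's derivation or code: the function names, the toy two-state data, and the use of a block-matrix exponential to obtain expected dwell times and transition counts are all assumptions made for this example.

```python
import math

def mat_mul(A, B):
    """Plain-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_exp(A, squarings=8, terms=18):
    """exp(A) by scaling-and-squaring with a truncated Taylor series."""
    n = len(A)
    S = [[a / 2.0 ** squarings for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, S)]
        result = [[r + x for r, x in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def transition_probs(Q, t):
    """P(t) = exp(Q t) for rate matrix Q."""
    return mat_exp([[q * t for q in row] for row in Q])

def endpoint_integral(Q, t, i, j):
    """M[a][b] = integral over s in [0, t] of P(s)[a][i] * P(t - s)[j][b].

    Read off the upper-right block of exp([[Q, E_ij], [0, Q]] * t).
    """
    n = len(Q)
    B = [[0.0] * (2 * n) for _ in range(2 * n)]
    for r in range(n):
        for c in range(n):
            B[r][c] = Q[r][c] * t
            B[n + r][n + c] = Q[r][c] * t
    B[i][n + j] = t
    return [row[n:] for row in mat_exp(B)[:n]]

def log_likelihood(Q, pairs):
    """Log-probability of the end states given the start states."""
    return sum(math.log(transition_probs(Q, t)[a][b]) for a, b, t in pairs)

def em_step(Q, pairs):
    """One EM iteration: expected path statistics, then re-estimated rates."""
    n = len(Q)
    dwell = [0.0] * n                      # expected time spent in state i
    jumps = [[0.0] * n for _ in range(n)]  # expected number of i -> j events
    for a, b, t in pairs:
        p_ab = transition_probs(Q, t)[a][b]
        for i in range(n):
            for j in range(n):
                w = endpoint_integral(Q, t, i, j)[a][b] / p_ab
                if i == j:
                    dwell[i] += w
                else:
                    jumps[i][j] += Q[i][j] * w
    # M-step: rate = expected transitions / expected dwell time.
    new_Q = [[jumps[i][j] / dwell[i] if i != j else 0.0 for j in range(n)]
             for i in range(n)]
    for i in range(n):
        new_Q[i][i] = -sum(new_Q[i])
    return new_Q

# Toy data (hypothetical): (start state, end state, elapsed time).
pairs = [(0, 0, 0.5), (0, 1, 0.5), (1, 1, 0.5), (1, 0, 1.0),
         (0, 0, 1.0), (1, 1, 1.0), (0, 1, 2.0), (0, 0, 0.2)]
Q = [[-1.0, 1.0], [1.0, -1.0]]  # arbitrary starting guess
for _ in range(10):
    Q = em_step(Q, pairs)
print(Q, log_likelihood(Q, pairs))
```

Each call to `em_step` should leave the log-likelihood no lower than before, which is the usual EM guarantee. A hidden site class could in principle be handled by enlarging the state space to (class, residue) pairs and summing over the unobserved class at the endpoints; that extension is not shown here.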

Citing Articles

Next-generation development and application of codon model in evolution.

Gupta M, Vadde R. Front Genet. 2023; 14:1091575.

PMID: 36777719 PMC: 9911445. DOI: 10.3389/fgene.2023.1091575.


Mirage: estimation of ancestral gene-copy numbers by considering different evolutionary patterns among gene families.

Fukunaga T, Iwasaki W. Bioinform Adv. 2023; 1(1):vbab014.

PMID: 36700099 PMC: 9710636. DOI: 10.1093/bioadv/vbab014.


BML: a versatile web server for bipartite motif discovery.

Vahed M, Vahed M, Garmire L. Brief Bioinform. 2022; 23(1).

PMID: 34974623 PMC: 8769915. DOI: 10.1093/bib/bbab536.


Maximum Likelihood Estimation of Symmetric Group-Based Models via Numerical Algebraic Geometry.

Kosta D, Kubjas K. Bull Math Biol. 2018; 81(2):337-360.

PMID: 30357599 PMC: 6342846. DOI: 10.1007/s11538-018-0523-2.


Computational methods for birth-death processes.

Crawford F, Ho L, Suchard M. Wiley Interdiscip Rev Comput Stat. 2018; 10(2).

PMID: 29942419 PMC: 6014701. DOI: 10.1002/wics.1423.