
Using Multiple Interdependency to Separate Functional from Phylogenetic Correlations in Protein Alignments

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2003 Apr 15
PMID: 12691987
Citations: 56
Abstract

Motivation: Multiple sequence alignments of homologous proteins are useful for inferring their phylogenetic history and for revealing functionally important regions in the proteins. Functional constraints may lead to co-variation of two or more amino acids in the sequence, such that a substitution at one site is accompanied by compensatory substitutions at another site. Finding the statistical correlations between sites in the alignment is not sufficient, because these may result from several undetermined causes. In particular, phylogenetic clustering will lead to many strong correlations.
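
The point about phylogenetic clustering can be illustrated with a small sketch. The code below is not the paper's procedure: it assumes mutual information as a generic measure of correlation between two alignment columns, and the toy alignment and the names `column` and `mutual_information` are hypothetical. The four toy sequences fall into two identical groups, as two clades would, so the first two columns score as perfectly correlated even though nothing is known about any functional interaction between them.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical toy alignment: two "clades" of two identical sequences each.
toy_alignment = [
    "AKL",
    "AKL",
    "GEL",
    "GEL",
]

mi_01 = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
mi_02 = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
print(mi_01, mi_02)  # 1.0 0.0
```

Columns 0 and 1 reach the maximum of 1 bit purely because the sequences cluster into two groups, while the fully conserved column 2 carries no signal. A raw pairwise score of this kind cannot distinguish shared ancestry from functional coupling.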

Results: A procedure is developed to detect statistical correlations stemming from functional interaction by removing the strong phylogenetic signal that leads to the correlations of each site with many others in the sequence. Our method relies upon the accuracy of the alignment, but it does not require any assumptions about the phylogeny or the substitution process. The effectiveness of the method was verified using computer simulations, and the method was then applied to predict functional interactions between amino acids in the Pfam database of alignments.
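
The abstract does not spell out the procedure, so the sketch below does not attempt to reproduce it. As an illustration of the general idea of removing a signal that makes each site correlate with many others, it uses a different and simpler technique, the average product correction, which subtracts from each pairwise score a background term built from the two sites' mean scores. The function name and the toy score matrix are hypothetical.

```python
def average_product_correction(scores):
    """Subtract a per-pair background from a symmetric score matrix.

    scores[i][j] is a pairwise correlation score (for example mutual
    information) between sites i and j; the diagonal is ignored.
    """
    n = len(scores)
    # Mean score of each site against every other site.
    site_mean = [
        sum(scores[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    # Mean over all off-diagonal pairs.
    overall_mean = sum(site_mean) / n
    corrected = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            background = 0.0
            if overall_mean != 0:
                background = site_mean[i] * site_mean[j] / overall_mean
            corrected[i][j] = scores[i][j] - background
    return corrected


# Hypothetical scores: a uniform background of 0.5 between all sites,
# plus one pair (sites 0 and 1) with extra correlation.
toy_scores = [
    [0.0, 0.9, 0.5, 0.5],
    [0.9, 0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0, 0.5],
    [0.5, 0.5, 0.5, 0.0],
]
corrected = average_product_correction(toy_scores)
```

After the correction, the shared background is largely removed and the pair (0, 1) stands out from the rest. A purely uniform matrix is reduced to zeros.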

Citing Articles

Identification of coevolving positions by ancestral reconstruction.

Nelson M, Talavera D. Commun Biol. 2025; 8(1):329.

PMID: 40021815 PMC: 11871020. DOI: 10.1038/s42003-025-07676-x.


Motifs in SARS-CoV-2 evolution.

Barrett C, Bura A, He Q, Huang F, Li T, Reidys C. RNA. 2023; 30(1):1-15.

PMID: 37903545 PMC: 10726165. DOI: 10.1261/rna.079557.122.


Thirteen dubious ways to detect conserved structural RNAs.

Gao W, Yang A, Rivas E. IUBMB Life. 2022; 75(6):471-492.

PMID: 36495545 PMC: 11234323. DOI: 10.1002/iub.2694.


General strategies for using amino acid sequence data to guide biochemical investigation of protein function.

Kennedy E, Foster C, Barr S, Bourret R. Biochem Soc Trans. 2022; 50(6):1847-1858.

PMID: 36416676 PMC: 10257402. DOI: 10.1042/BST20220849.


Information Theory in Computational Biology: Where We Stand Today.

Chanda P, Costa E, Hu J, Sukumar S, Van Hemert J, Walia R. Entropy (Basel). 2020; 22(6).

PMID: 33286399 PMC: 7517167. DOI: 10.3390/e22060627.