
Separation of Phylogenetic and Functional Associations in Biological Sequences by Using the Parametric Bootstrap

Overview
Specialty Science
Date 2000 Mar 22
PMID 10725404
Citations 48
Abstract

Quantitative analyses of biological sequences generally proceed under the assumption that individual DNA or protein sequence elements vary independently. However, this assumption is not biologically realistic, because sequence elements often vary in a concerted manner as a result of common ancestry and structural or functional constraints. We calculated intersite associations among aligned protein sequences by using mutual information. To discriminate associations resulting from common ancestry from those resulting from structural or functional constraints, we used a parametric bootstrap algorithm to construct replicate data sets. These replicate data are expected to have intersite associations resulting solely from phylogeny. By comparing the distribution of our association statistic for the replicate data against that calculated for empirical data, we were able to assign a probability that two sites covaried because of structural or functional constraint rather than phylogeny. We tested our method by using an alignment of 237 basic helix-loop-helix (bHLH) protein domains. Comparison of our results against a solved three-dimensional structure confirmed the identification of several sites important to function and structure of the bHLH domain. This analytical procedure has broad utility as a first step in the identification of sites that are important to biological macromolecular structure and function when a solved structure is unavailable.
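The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`mutual_information`, `column`, `bootstrap_p_value`) are hypothetical, and the replicate alignments are assumed to have been simulated elsewhere along an estimated tree under a substitution model, which is the parametric bootstrap step itself. Handling of gaps and any normalization of the statistic are not specified in the abstract and are left out here.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_a)
    freq_a = Counter(col_a)                # residue counts at the first site
    freq_b = Counter(col_b)                # residue counts at the second site
    freq_ab = Counter(zip(col_a, col_b))   # joint counts of residue pairs
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written in terms of raw counts
        ratio = count * n / (freq_a[a] * freq_b[b])
        mi += p_ab * log2(ratio)
    return mi


def column(alignment, site):
    """Extract one column (site) from a list of aligned sequences."""
    return [seq[site] for seq in alignment]


def bootstrap_p_value(alignment, replicates, i, j):
    """Fraction of replicate alignments whose MI at sites (i, j) is at least
    as large as the observed MI.

    Assumption: `replicates` were simulated under phylogeny alone, so they
    form the null distribution for the association statistic.
    """
    observed = mutual_information(column(alignment, i), column(alignment, j))
    null = [
        mutual_information(column(rep, i), column(rep, j))
        for rep in replicates
    ]
    exceed = sum(1 for value in null if value >= observed)
    return observed, exceed / len(null)
```

A small fraction returned by `bootstrap_p_value` would suggest that the observed association between two sites is larger than phylogeny alone tends to produce in the simulated data. In practice many replicates are needed for the fraction to be meaningful.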

Citing Articles

Identification of coevolving positions by ancestral reconstruction.

Nelson M, Talavera D Commun Biol. 2025; 8(1):329.

PMID: 40021815 PMC: 11871020. DOI: 10.1038/s42003-025-07676-x.


Detection and Phylogenetic Analysis of Extended-Spectrum β-Lactamase (ESBL)-Genetic Determinants in Gram-Negative Fecal-Microbiota of Wild Birds and Chicken Originated at Trimmu Barrage.

Saeed M, Khan A, Ehtisham-Ul-Haque S, Waheed U, Qamar M, Rehman A Antibiotics (Basel). 2023; 12(9).

PMID: 37760673 PMC: 10525410. DOI: 10.3390/antibiotics12091376.


Cross-Sectional Study for Detection and Risk Factor Analysis of ESBL-Producing Avian Pathogenic Escherichia coli Associated with Backyard Chickens in Pakistan.

Saeed M, Saqlain M, Waheed U, Ehtisham-Ul-Haque S, Khan A, Rehman A Antibiotics (Basel). 2023; 12(5).

PMID: 37237837 PMC: 10215362. DOI: 10.3390/antibiotics12050934.


General strategies for using amino acid sequence data to guide biochemical investigation of protein function.

Kennedy E, Foster C, Barr S, Bourret R Biochem Soc Trans. 2022; 50(6):1847-1858.

PMID: 36416676 PMC: 10257402. DOI: 10.1042/BST20220849.


Extracting phylogenetic dimensions of coevolution reveals hidden functional signals.

Colavin A, Atolia E, Bitbol A, Huang K Sci Rep. 2022; 12(1):820.

PMID: 35039514 PMC: 8764114. DOI: 10.1038/s41598-021-04260-1.

