Protein Sequence Similarity Searches Using Patterns As Seeds

Overview
Specialty: Biochemistry
Date: 1998 Aug 15
PMID: 9705509
Citations: 114
Abstract

Protein families often are characterized by conserved sequence patterns or motifs. A researcher frequently wishes to evaluate the significance of a specific pattern within a protein, or to exploit knowledge of known motifs to aid the recognition of greatly diverged but homologous family members. To assist in these efforts, the pattern-hit initiated BLAST (PHI-BLAST) program described here takes as input both a protein sequence and a pattern of interest that it contains. PHI-BLAST searches a protein database for other instances of the input pattern, and uses those found as seeds for the construction of local alignments to the query sequence. The random distribution of PHI-BLAST alignment scores is studied analytically and empirically. In many instances, the program is able to detect statistically significant similarity between homologous proteins that are not recognizably related using traditional single-pass database search methods. PHI-BLAST is applied to the analysis of CED4-like cell death regulators, HS90-type ATPase domains, archaeal tRNA nucleotidyltransferases and archaeal homologs of DnaG-type DNA primases.
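The seeding step described in the abstract (locate instances of the input pattern, then use them as anchors for local alignment) can be sketched as follows. This is a minimal illustration, not the PHI-BLAST implementation: it assumes a simplified PROSITE-style pattern syntax, the helper names `prosite_to_regex` and `find_seeds` are hypothetical, the sequences are invented toy data, and the alignment-extension and statistics steps are omitted entirely.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simplified PROSITE-style pattern into a Python regex.

    Supported elements (an assumed subset of the syntax):
      A        a single residue
      [ST]     any one of the listed residues
      {P}      any residue except those listed
      x        any residue
      e(n) or e(n,m)   repetition of element e
    Elements are separated by '-'.
    """
    parts = []
    for element in pattern.split("-"):
        # Split an element into its core and an optional (n) or (n,m) repeat.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        # Single letters and [..] classes are already valid regex.
        if lo is not None:
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_seeds(regex: str, sequence: str):
    """Return (start, matched_text) for every pattern occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    return [(m.start(), m.group(1))
            for m in re.finditer("(?=(" + regex + "))", sequence)]


# Toy data (invented for illustration only).
pattern = "[AG]-x(4)-G-K-[ST]"
query = "MKTAGPSGVGKSTLLNA"
database = {
    "toy_hit": "MRRALIVGAPNSGKTTLFN",
    "toy_miss": "MKVLAAGIVDELLKRS",
}

regex = prosite_to_regex(pattern)
query_seeds = find_seeds(regex, query)

# Each (query seed, database seed) pair would anchor one local alignment.
seed_pairs = [
    (name, q_start, d_start)
    for name, seq in database.items()
    for d_start, _ in find_seeds(regex, seq)
    for q_start, _ in query_seeds
]
```

In a full search, each seed pair would then be extended in both directions into a local alignment and scored; that part is deliberately left out of this sketch.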

Citing Articles

Characterizing phenotype variants of Cercosporidium personatum, causal agent of peanut late leaf spot disease, their morphology, genetics and metabolites.

Arias R, Cantonwine E, Orner V, Walk T, Massa A, Stewart J. Sci Rep. 2025; 15(1):1405.

PMID: 39789282 PMC: 11718120. DOI: 10.1038/s41598-025-85953-9.


A novel in-silico model explores LanM homologs among Hyphomicrobium spp.

Valdes J, Petrash D, Konhauser K. Commun Biol. 2024; 7(1):1539.

PMID: 39562649 PMC: 11576760. DOI: 10.1038/s42003-024-07258-3.


Evolution of pH-sensitive transcription termination in during adaptation to repeated long-term starvation.

Worthan S, McCarthy R, Delaleau M, Stikeleather R, Bratton B, Boudvillain M. Proc Natl Acad Sci U S A. 2024; 121(39):e2405546121.

PMID: 39298488 PMC: 11441560. DOI: 10.1073/pnas.2405546121.


Evolution of pH-sensitive transcription termination during adaptation to repeated long-term starvation.

Worthan S, McCarthy R, Delaleau M, Stikeleather R, Bratton B, Boudvillain M. bioRxiv. 2024.

PMID: 38464051 PMC: 10925284. DOI: 10.1101/2024.03.01.582989.


Slc11 Synapomorphy: A Conserved 3D Framework Articulating Carrier Conformation Switch.

Cellier M. Int J Mol Sci. 2023; 24(20).

PMID: 37894758 PMC: 10606218. DOI: 10.3390/ijms242015076.

