Protein Family Classification Based on Searching a Database of Blocks

Overview
Journal: Genomics
Specialty: Genetics
Date: 1994 Jan 1
PMID: 8188249
Citations: 99
Abstract

The most highly conserved regions of proteins can be represented as "blocks" of locally aligned sequence segments. Previously, an automated system was introduced to generate a database of blocks that is searched for local similarities using a sequence query. Here, we describe a method for searching this database that can also reveal significant global similarities. Local and global alignments are scored independently, so they can be used in concert to infer homology. A set of 7082 diverse sequences not represented in the database provided queries for testing this approach. The resulting distributions of scores led to guidelines for interpretation of search data and to the classification of 289 uncatalogued sequences into known groups. Thirty-eight of these relationships appear to be new discoveries. We also show how searching a database of blocks can be used to detect repeated domains and to find distinct cross-family relationships that were missed in searches of sequence databases.
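The idea of searching a block with a sequence query can be illustrated with a minimal sketch. This is not the scoring scheme of the paper: it assumes a generic position-specific log-odds matrix built from the block's columns, a uniform background frequency, and a simple pseudocount. The function names `block_to_pssm` and `best_local_hit`, the toy block, and the toy query are all hypothetical and exist only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def block_to_pssm(block, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """Convert ungapped, equal-length aligned segments into per-column log-odds scores.

    Assumption: uniform background frequencies and an additive pseudocount,
    chosen for simplicity rather than taken from the paper.
    """
    width = len(block[0])
    if any(len(segment) != width for segment in block):
        raise ValueError("all segments in a block must have the same length")
    background = 1.0 / len(alphabet)
    total = len(block) + pseudocount * len(alphabet)
    pssm = []
    for col in range(width):
        counts = Counter(segment[col] for segment in block)
        pssm.append({
            aa: math.log2(((counts.get(aa, 0) + pseudocount) / total) / background)
            for aa in alphabet
        })
    return pssm


def best_local_hit(query, pssm):
    """Slide the block along the query and return (best_score, offset)."""
    width = len(pssm)
    if len(query) < width:
        raise ValueError("query is shorter than the block")
    best_score, best_offset = float("-inf"), -1
    for offset in range(len(query) - width + 1):
        # Residues outside the alphabet (e.g. 'X') contribute a neutral 0.0.
        score = sum(pssm[i].get(query[offset + i], 0.0) for i in range(width))
        if score > best_score:
            best_score, best_offset = score, offset
    return best_score, best_offset


# Toy data, invented for illustration only.
toy_block = ["GKSTL", "GKTTL", "GKSSL"]
toy_query = "MAAGKSTLVV"
score, offset = best_local_hit(toy_query, block_to_pssm(toy_block))
print(f"best window starts at offset {offset} with score {score:.2f}")
```

In a real search, each block in the database would be scored this way against the query and the hits ranked. Judging which scores are significant requires calibrated score distributions, such as those the abstract describes, which this sketch does not attempt.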

Citing Articles

Remote homology search with hidden Potts models.

Wilburn G, Eddy S. PLoS Comput Biol. 2020; 16(11):e1008085.

PMID: 33253143 PMC: 7728182. DOI: 10.1371/journal.pcbi.1008085.


Evolutionary history of the human multigene families reveals widespread gene duplications throughout the history of animals.

Pervaiz N, Shakeel N, Qasim A, Zehra R, Anwar S, Rana N. BMC Evol Biol. 2019; 19(1):128.

PMID: 31221090 PMC: 6585022. DOI: 10.1186/s12862-019-1441-0.


Identification of a novel potassium channel (GiK) as a potential drug target in : Computational descriptions of binding sites.

Palomo-Ligas L, Gutierrez-Gutierrez F, Ochoa-Maganda V, Cortes-Zarate R, Charles-Nino C, Castillo-Romero A. PeerJ. 2019; 7:e6430.

PMID: 30834181 PMC: 6397635. DOI: 10.7717/peerj.6430.


Determinants of Base-Pair Substitution Patterns Revealed by Whole-Genome Sequencing of DNA Mismatch Repair Defective .

Foster P, Niccum B, Popodi E, Townes J, Lee H, MohammedIsmail W. Genetics. 2018; 209(4):1029-1042.

PMID: 29907647 PMC: 6063221. DOI: 10.1534/genetics.118.301237.


The Protein Data Bank: Current Status and Future Challenges.

Abola E, Manning N, Prilusky J, Stampf D, Sussman J. J Res Natl Inst Stand Technol. 1996; 101(3):231-241.

PMID: 27805161 PMC: 4963140. DOI: 10.6028/jres.101.025.