
Intermediate Sequences Increase the Detection of Homology Between Sequences

Overview
Journal: J Mol Biol
Publisher: Elsevier
Date: 1997 Nov 21
PMID: 9367767
Citations: 61
Abstract

Two homologous sequences that have diverged beyond the point where their homology can be recognised by a simple direct comparison can be related through a third sequence that is suitably intermediate between the two. High scores for the sequence match between the first and third sequences, and between the second and third sequences, imply that the first and second sequences are related even though their own match score is low. We have tested the usefulness of this idea using a database (PDB40D) that contains the sequences of 971 protein domains whose structures are known and whose residue identities with each other are some 40% or less. On the basis of sequence and structural information, 2143 pairs of these sequences are known to have an evolutionary relationship. FASTA, in an all-against-all comparison of the sequences in the database, detected 320 (15%) of these relationships as well as three false positives (i.e. a 1% error rate). Using intermediate sequences found by FASTA matches of PDB40D sequences to those in the large non-redundant OWL database, we could detect 550 evolutionary relationships with an error rate of 1%. This means that the intermediate sequence procedure increases the ability to recognise the evolutionary relationships amongst the PDB40D sequences by 70%.
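The linking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not run FASTA: the function name `linked_by_intermediate`, the sequence identifiers, the scores, and the threshold are all hypothetical, and the sketch simply assumes that a higher score means a stronger match. Two query sequences are reported as related when both match at least one common intermediate at or above the threshold.

```python
from itertools import combinations


def linked_by_intermediate(hits, threshold):
    """Pair up query sequences that share a strongly matching intermediate.

    hits: dict mapping each query ID to {intermediate ID: match score}.
    threshold: minimum score for a match to count (hypothetical cutoff).
    Returns a dict mapping (query_a, query_b) to the sorted list of
    intermediates that both queries match at or above the threshold.
    """
    # Keep only the intermediates each query matches strongly.
    strong = {
        query: {target for target, score in targets.items() if score >= threshold}
        for query, targets in hits.items()
    }
    links = {}
    for a, b in combinations(sorted(strong), 2):
        shared = strong[a] & strong[b]
        if shared:
            links[(a, b)] = sorted(shared)
    return links


# Made-up scores for three domains against three database sequences.
hits = {
    "domA": {"int1": 120.0, "int2": 35.0},
    "domB": {"int1": 95.0, "int3": 110.0},
    "domC": {"int3": 40.0},
}
links = linked_by_intermediate(hits, threshold=90.0)
print(links)  # {('domA', 'domB'): ['int1']}

# Arithmetic check of the figures quoted in the abstract.
known_pairs, direct, via_intermediates = 2143, 320, 550
direct_fraction = direct / known_pairs               # about 0.149
intermediate_fraction = via_intermediates / known_pairs  # about 0.257
relative_gain = via_intermediates / direct - 1       # about 0.72
```

With the counts from the abstract, direct comparison recovers about 15% of the 2143 known relationships and the intermediate procedure about 26%, a relative gain of roughly 72%, which is consistent with the reported figure of 70%.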

Citing Articles

Investigation of protein family relationships with deep learning.

Ponamareva I, Andreeva A, Bileschi M, Colwell L, Bateman A. Bioinform Adv. 2024; 4(1):vbae132.

PMID: 39399373 PMC: 11467057. DOI: 10.1093/bioadv/vbae132.


Profiles of Natural and Designed Protein-Like Sequences Effectively Bridge Protein Sequence Gaps: Implications in Distant Homology Detection.

Kumar G, Srinivasan N, Sandhya S. Methods Mol Biol. 2022; 2449:149-167.

PMID: 35507261 DOI: 10.1007/978-1-0716-2095-3_5.


Remote homology clustering identifies lowly conserved families of effector proteins in plant-pathogenic fungi.

Jones D, Moolhuijzen P, Hane J. Microb Genom. 2021; 7(9).

PMID: 34468307 PMC: 8715435. DOI: 10.1099/mgen.0.000637.


Computational Structural Genomics Unravels Common Folds and Novel Families in the Secretome of Fungal Phytopathogen .

Seong K, Krasileva K. Mol Plant Microbe Interact. 2021; 34(11):1267-1280.

PMID: 34415195 PMC: 9447291. DOI: 10.1094/MPMI-03-21-0071-R.


Sequence alignment generation using intermediate sequence search for homology modeling.

Makigaki S, Ishida T. Comput Struct Biotechnol J. 2020; 18:2043-2050.

PMID: 32802276 PMC: 7415839. DOI: 10.1016/j.csbj.2020.07.012.