
Searching for Hypothetical Proteins: Theory and Practice Based Upon Original Data and Literature

Overview
Journal: Prog Neurobiol
Specialty: Neurology
Date: 2005 Nov 8
PMID: 16271823
Citations: 50
Abstract

A large part of mammalian proteomes consists of hypothetical proteins (HP), i.e. proteins predicted from nucleic acid sequences only and protein sequences of unknown function. Databases are far from complete, and errors are to be expected. The many HP await experiments demonstrating their existence at the protein level, and subsequent bioinformatic handling is mandatory to assign them a tentative function. Two-dimensional gel electrophoresis with subsequent mass spectrometric identification of protein spots is an appropriate tool to search for HP in high-throughput mode. Spots are identified by MS or MS/MS measurements (MALDI-TOF, MALDI-TOF-TOF) followed by analysis with software such as Mascot or ProFound. In many cases proteins can thus be unambiguously identified and characterised; if not, de novo sequencing or Q-TOF analysis is warranted. If the protein is still not identified, its sequence is submitted to databases for BLAST searches to determine identity, similarity or homology to known proteins. If no significant identity to known structures is observed, the protein sequence is examined for functional domains (databases PROSITE, PRINTS, InterPro, ProDom, Pfam and SMART) and searched for motifs (ELM); finally, protein-protein interaction databases (InterWeaver, STRING) are consulted or predictions from conformations are performed. Here we provide information about hypothetical proteins in terms of protein chemical analysis, which is independent of antibody availability and specificity, and of bioinformatic handling, to contribute to the extension and completion of protein databases. We include original work on HP in the brain to illustrate the processes of HP identification and functional assignment.
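
The stepwise workflow described in the abstract can be read as a decision cascade. The following is a minimal sketch of that cascade only, not the authors' software: the names `SpotEvidence`, `NextStep` and `next_step` are hypothetical, and the boolean flags stand in for results that would in practice come from the instruments and databases named above. No database or search engine is actually queried.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NextStep(Enum):
    """Stages of the cascade, in the order given in the abstract."""
    IDENTIFIED = "identified by MS or MS/MS and search software"
    DE_NOVO = "de novo sequencing or Q-TOF analysis"
    BLAST = "BLAST search against sequence databases"
    ANNOTATE_BY_HOMOLOGY = "tentative function from a significant BLAST hit"
    DOMAINS = "functional domain search (PROSITE, PRINTS, InterPro, ProDom, Pfam, SMART)"
    MOTIFS = "motif search (ELM)"
    INTERACTIONS = "interaction databases (InterWeaver, STRING) or conformation-based prediction"


@dataclass
class SpotEvidence:
    """Hypothetical record of what is known so far about one gel spot."""
    ms_identified: bool = False                # unambiguous MS or MS/MS identification
    sequence_available: bool = False           # sequence obtained, e.g. by de novo sequencing
    blast_significant: Optional[bool] = None   # None means BLAST has not been run yet
    domains_checked: bool = False
    motifs_checked: bool = False


def next_step(evidence: SpotEvidence) -> NextStep:
    """Return the next stage for a spot, following the order in the abstract."""
    if evidence.ms_identified:
        return NextStep.IDENTIFIED
    if not evidence.sequence_available:
        return NextStep.DE_NOVO
    if evidence.blast_significant is None:
        return NextStep.BLAST
    if evidence.blast_significant:
        return NextStep.ANNOTATE_BY_HOMOLOGY
    if not evidence.domains_checked:
        return NextStep.DOMAINS
    if not evidence.motifs_checked:
        return NextStep.MOTIFS
    return NextStep.INTERACTIONS


if __name__ == "__main__":
    # A spot with a sequence but no significant BLAST hit moves on to domain searches.
    spot = SpotEvidence(sequence_available=True, blast_significant=False)
    print(next_step(spot).value)
```

Modelling the stages as an ordered series of checks keeps the sketch faithful to the sequence in the abstract: each later, more speculative step is reached only when the earlier, more direct evidence has failed to identify the protein.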
