A Novel Method for Automatic Functional Annotation of Proteins
Motivation: Coping with the increasing amount of sequence data requires reliable automatic annotation tools. Together with SWISS-PROT, the TrEMBL database contains nearly all publicly available protein sequences, but unlike SWISS-PROT it carries only limited functional annotation. To improve this situation, we developed an automatic annotation method that produces highly reliable functional predictions using the language and syntax of SWISS-PROT.
Results: An algorithm was developed and successfully applied to the automatic annotation of a test set of unknown proteins. The predicted information included description, function, catalytic activity, cofactors, pathway, subcellular location, quaternary structure, similarity to other proteins, active sites, and keywords. The algorithm showed low coverage (10%) but high specificity and reliability.
Availability: The results can be obtained by anonymous FTP from ftp.ebi.ac.uk/pub/databases/sp_tr_nrdb. The source code is available from the authors on request.
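The trade-off reported in the Results (low coverage, high reliability) can be illustrated with a minimal sketch. The abstract does not define these measures, so the definitions here are assumptions: coverage is taken as the fraction of test proteins that receive any prediction, and reliability as the fraction of emitted predictions that agree with a reference annotation. The `Prediction` class, the function names, and the toy data are hypothetical and are not taken from the authors' code or results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    protein_id: str
    predicted: Optional[str]  # None means the method declined to annotate
    curated: str              # reference annotation (hypothetical)


def coverage(preds: List[Prediction]) -> float:
    """Assumed definition: fraction of test proteins given any prediction."""
    if not preds:
        return 0.0
    annotated = sum(1 for p in preds if p.predicted is not None)
    return annotated / len(preds)


def reliability(preds: List[Prediction]) -> float:
    """Assumed definition: fraction of emitted predictions matching the reference."""
    emitted = [p for p in preds if p.predicted is not None]
    if not emitted:
        return 0.0
    correct = sum(1 for p in emitted if p.predicted == p.curated)
    return correct / len(emitted)


# Toy test set (made up for illustration): one confident, correct prediction
# and nine proteins left unannotated.
test_set = [Prediction("P0", "Kinase", "Kinase")] + [
    Prediction(f"P{i}", None, "Unknown") for i in range(1, 10)
]

print(f"coverage:    {coverage(test_set):.0%}")
print(f"reliability: {reliability(test_set):.0%}")
```

Under these assumed definitions, a method that annotates only when it is confident scores low on coverage while keeping every emitted prediction correct, which matches the qualitative pattern the abstract describes.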