
Literature Mining and Database Annotation of Protein Phosphorylation Using a Rule-based System

Overview
Journal Bioinformatics
Specialty Biology
Date 2005 Apr 9
PMID 15814565
Citations 33
Abstract

Motivation: A large volume of experimental data on protein phosphorylation is buried in the fast-growing PubMed literature. Although this information is of great value, little of it reaches databases because literature-based curation is laborious. Computational literature mining holds promise to facilitate database curation.

Results: A rule-based system, RLIMS-P (Rule-based LIterature Mining System for Protein Phosphorylation), was used to extract protein phosphorylation information from MEDLINE abstracts. An annotation-tagged literature corpus developed at PIR was used to evaluate the system on two tasks: finding phosphorylation papers, and extracting phosphorylation objects (kinases, substrates and sites) from abstracts. RLIMS-P achieved a precision of 91.4% and a recall of 96.4% for paper retrieval, and a precision of 97.9% and a recall of 88.0% for extraction of substrates and sites. By coupling high recall for paper retrieval with high precision for information extraction, RLIMS-P facilitates literature mining and database annotation of protein phosphorylation.
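
As a reading aid for the figures above, the sketch below shows how precision and recall are conventionally computed when a system's output is compared against an annotated corpus. It is a minimal illustration, not the evaluation code used for RLIMS-P: the function name precision_recall and the identifiers in the example are hypothetical and do not come from the PIR corpus.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a set of predictions against a gold standard.

    precision = correct predictions / all predictions
    recall    = correct predictions / all gold-standard items
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: made-up paper identifiers, not real corpus entries.
gold_papers = {101, 102, 103, 104, 105}   # papers annotated as relevant
retrieved_papers = {101, 102, 103, 104}   # papers the system returned

p, r = precision_recall(retrieved_papers, gold_papers)
print(f"precision={p:.3f} recall={r:.3f}")
```

In this toy case every returned paper is relevant, so precision is perfect, but one relevant paper is missed, so recall is lower. The abstract reports the same kind of trade-off for extraction of substrates and sites.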

Citing Articles

Integrating Multi-Omics Data to Construct Reliable Interconnected Models of Signaling, Gene Regulatory, and Metabolic Pathways.

Kumar K, Bhowmik D, Mandloi S, Gautam A, Lahiri A, Biswas N. Methods Mol Biol. 2023; 2634:139-151.

PMID: 37074577 DOI: 10.1007/978-1-0716-3008-2_6.


Text Mining and Machine Learning Protocol for Extracting Human-Related Protein Phosphorylation Information from PubMed.

Arumugam K, Shanker R. Methods Mol Biol. 2022; 2496:159-177.

PMID: 35713864 DOI: 10.1007/978-1-0716-2305-3_9.


Humans and machines in biomedical knowledge curation: hypertrophic cardiomyopathy molecular mechanisms' representation.

Glavaski M, Velicki L. BioData Min. 2021; 14(1):45.

PMID: 34600580 PMC: 8487578. DOI: 10.1186/s13040-021-00279-2.


Utilizing image and caption information for biomedical document classification.

Li P, Jiang X, Zhang G, Trelles Trabucco J, Raciti D, Smith C. Bioinformatics. 2021; 37(Suppl_1):i468-i476.

PMID: 34252939 PMC: 8346654. DOI: 10.1093/bioinformatics/btab331.


ANDDigest: a new web-based module of ANDSystem for the search of knowledge in the scientific literature.

Ivanisenko T, Saik O, Demenkov P, Ivanisenko N, Savostianov A, Ivanisenko V. BMC Bioinformatics. 2020; 21(Suppl 11):228.

PMID: 32921303 PMC: 7488989. DOI: 10.1186/s12859-020-03557-8.