Recognizing Scientific Artifacts in Biomedical Literature

Overview
Publisher Sage Publications
Date 2013 May 7
PMID 23645987
Abstract

Today's search engines and digital libraries offer little or no support for discovering the scientific artifacts (hypotheses, supporting or contradicting statements, and findings) that form the core of written scientific communication. Consequently, we currently have no means of identifying central themes within a domain, or of detecting gaps between accepted knowledge and newly emerging knowledge as a means of tracking the evolution of hypotheses from incipient phases to maturity or decline. We present a hybrid machine learning approach, using an ensemble of four classifiers, for recognizing scientific artifacts (i.e., hypotheses, background, motivation, objectives, and findings) within biomedical research publications, as a precursory step toward the general goal of automatically creating argumentative discourse networks that span multiple publications. The performance achieved by the classifiers ranges from 15.30% to 78.39%, depending on the target class. The set of features used for classification has led to promising results. Furthermore, because the features are computed strictly within the scope of a single publication, i.e., without aggregating corpus-wide statistics, the ensemble is more versatile and can be applied directly without retraining.
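
To make the idea of an ensemble over publication-local features concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the four classifiers are combined or which features they use, so the majority vote, the three features, and the four toy rule-based classifiers below are all assumptions chosen for illustration. The one property taken from the abstract is that every feature is computed from a single publication, with no corpus-wide statistics.

```python
from collections import Counter
from typing import Callable, Dict, List

Features = Dict[str, float]
Classifier = Callable[[Features], str]

# Hypothetical hedging cue words; not taken from the paper.
HEDGES = {"may", "might", "suggest", "hypothesize"}


def local_features(sentence: str, index: int, total: int) -> Features:
    """Compute features from one sentence and its position in ONE publication.

    Nothing here depends on corpus-wide counts (e.g. no IDF weights),
    so the same function applies unchanged to any new publication.
    """
    tokens = [t.strip(".,;") for t in sentence.lower().split()]
    return {
        "relative_position": index / max(total - 1, 1),
        "length": float(len(tokens)),
        "has_hedge": float(any(t in HEDGES for t in tokens)),
    }


# Four toy classifiers (assumed rules standing in for trained models).
def by_hedge(f: Features) -> str:
    return "hypothesis" if f["has_hedge"] else "finding"


def by_position(f: Features) -> str:
    return "background" if f["relative_position"] < 0.3 else "finding"


def by_length(f: Features) -> str:
    return "finding" if f["length"] > 5 else "background"


def by_hedge_and_position(f: Features) -> str:
    early = f["relative_position"] < 0.8
    return "hypothesis" if f["has_hedge"] and early else "finding"


ENSEMBLE: List[Classifier] = [by_hedge, by_position, by_length, by_hedge_and_position]


def majority_vote(classifiers: List[Classifier], features: Features) -> str:
    """Return the label predicted by the largest number of classifiers."""
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]


def classify_publication(sentences: List[str]) -> List[str]:
    """Label every sentence of a single publication with the ensemble."""
    total = len(sentences)
    return [
        majority_vote(ENSEMBLE, local_features(s, i, total))
        for i, s in enumerate(sentences)
    ]
```

Calling `classify_publication` on the sentences of one article returns one label per sentence. Because the features never look outside that article, trained classifiers in the same arrangement could be applied to unseen publications without recomputing any global statistics, which is the versatility the abstract describes.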
