
Desiderata for Ontologies to Be Used in Semantic Annotation of Biomedical Documents

Overview
Journal: J Biomed Inform
Publisher: Elsevier
Date: 2010 Oct 26
PMID: 20971216
Citations: 7
Abstract

A wealth of knowledge valuable to the translational research scientist is contained within the vast biomedical literature, but this knowledge is typically in the form of natural language. Sophisticated natural-language-processing systems are needed to translate text into unambiguous formal representations grounded in high-quality consensus ontologies, and these systems in turn rely on gold-standard corpora of annotated documents for training and testing. To this end, we are constructing the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-text biomedical journal articles that are being manually annotated with the entire sets of terms from select vocabularies, predominantly from the Open Biomedical Ontologies (OBO) library. Our efforts in building this corpus have illuminated infelicities of these ontologies with respect to the semantic annotation of biomedical documents, and we propose desiderata whose implementation could substantially improve their utility in this task. These include the integration of overlapping terms across OBOs, the resolution of OBO-specific ambiguities, the integration of the BFO with the OBOs and the use of mid-level ontologies, the inclusion of noncanonical instances, and the expansion of relations and realizable entities.

Citing Articles

A comprehensive study of mobility functioning information in clinical notes: Entity hierarchy, corpus annotation, and sequence labeling.

Thieu T, Maldonado J, Ho P, Ding M, Marr A, Brandt D Int J Med Inform. 2021; 147:104351.

PMID: 33401169 PMC: 8104034. DOI: 10.1016/j.ijmedinf.2020.104351.


Development of a cardiac-centered frailty ontology.

Doing-Harris K, Bray B, Thackeray A, Shah R, Shao Y, Cheng Y J Biomed Semantics. 2019; 10(1):3.

PMID: 30658684 PMC: 6339414. DOI: 10.1186/s13326-019-0195-3.


Automated annotation of functional imaging experiments via multi-label classification.

Turner M, Chakrabarti C, Jones T, Xu J, Fox P, Luger G Front Neurosci. 2014; 7:240.

PMID: 24409112 PMC: 3864256. DOI: 10.3389/fnins.2013.00240.


Event extraction across multiple levels of biological organization.

Pyysalo S, Ohta T, Miwa M, Cho H, Tsujii J, Ananiadou S Bioinformatics. 2012; 28(18):i575-i581.

PMID: 22962484 PMC: 3436834. DOI: 10.1093/bioinformatics/bts407.


Concept annotation in the CRAFT corpus.

Bada M, Eckert M, Evans D, Garcia K, Shipley K, Sitnikov D BMC Bioinformatics. 2012; 13:161.

PMID: 22776079 PMC: 3476437. DOI: 10.1186/1471-2105-13-161.