
Biomedical Named Entity Recognition Using Deep Neural Networks with Contextual Information

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2019 Dec 29
PMID: 31881938
Citations: 27
Abstract

Background: In biomedical text mining, named entity recognition (NER) is an important task used to extract information from biomedical articles. Previously proposed NER methods fall into dictionary-based, rule-based, and machine learning approaches. These traditional approaches rely heavily on large-scale dictionaries, target-specific rules, or well-constructed corpora, and they have been superseded by deep learning-based approaches that are independent of hand-crafted features. However, although such deep learning methods add conditional random fields (CRF) to capture important correlations between neighboring labels, they often do not incorporate all the contextual information from the text into the deep learning layers.
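To illustrate how a CRF layer lets neighboring labels influence each other, the following is a minimal sketch of Viterbi decoding over per-token label scores plus label-to-label transition scores. It is not the authors' implementation: the function name `viterbi`, the three-label B/I/O scheme, and all score values are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float],
            labels: List[str]) -> List[str]:
    """Return the highest-scoring label sequence.

    emissions[t][y]     : score of label y at token t (e.g. produced by a BiLSTM)
    transitions[(a, b)] : score of label a being followed by label b
                          (missing pairs are treated as 0.0)
    """
    if not emissions:
        return []
    # best[t][y] = score of the best path that ends in label y at token t
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t-1][y] = best previous label for y at token t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels,
                       key=lambda p: best[-1][p] + transitions.get((p, y), 0.0))
            scores[y] = (best[-1][prev]
                         + transitions.get((prev, y), 0.0)
                         + emissions[t][y])
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical scores: taken token by token, the top labels would be O then I,
# which is not a valid BIO sequence. Penalizing the O -> I transition makes
# the decoder prefer B, I instead.
labels = ["B", "I", "O"]
emissions = [{"B": 0.9, "I": 0.0, "O": 1.0},
             {"B": 0.0, "I": 1.0, "O": 0.8}]
transitions = {("O", "I"): -10.0}
decoded = viterbi(emissions, transitions, labels)
```

In this toy case the per-token argmax alone would pick an inconsistent sequence; the transition score is what ties the decision at one token to the decision at its neighbor.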

Results: We propose an NER system for biomedical entities that incorporates n-grams with bi-directional long short-term memory (BiLSTM) and CRF; we refer to this system as a contextual long short-term memory network with CRF (CLSTM). We assess the CLSTM model on three corpora: the disease corpus of the National Center for Biotechnology Information (NCBI), the BioCreative II Gene Mention corpus (GM), and the BioCreative V Chemical Disease Relation corpus (CDR). Our framework was compared with several deep learning approaches, such as BiLSTM, BiLSTM with CRF, GRAM-CNN, and BERT. On the NCBI corpus, our model recorded an F-score of 85.68% for the NER of diseases, an improvement of 1.50% over previous methods. Moreover, although BERT uses transfer learning from more than 2.5 billion words, our system performed comparably to BERT, with an F-score of 81.44% for gene NER on the GM corpus and a higher F-score of 86.44% for the NER of chemicals and diseases on the CDR corpus. We conclude that our method significantly improves performance on biomedical NER tasks.
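For readers unfamiliar with how an NER F-score is typically computed, the sketch below scores predicted entity spans against gold spans. The abstract does not state the matching criterion, so exact-match, entity-level scoring over BIO tags is an assumption here, and the names `extract_spans` and `precision_recall_f1` as well as the example tags are hypothetical.

```python
def extract_spans(tags):
    """Collect (start, end, type) spans from a BIO tag sequence.

    `end` is exclusive. An I- tag that does not continue an open span of the
    same type is ignored (strict interpretation, an assumption).
    """
    spans = set()
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def precision_recall_f1(gold_tags, pred_tags):
    """Entity-level precision, recall, and F-score with exact span matching."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the prediction finds the disease mention but misses
# the chemical mention, so precision is 1.0 and recall is 0.5.
gold = ["B-Disease", "I-Disease", "O", "B-Chemical"]
pred = ["B-Disease", "I-Disease", "O", "O"]
precision, recall, f1 = precision_recall_f1(gold, pred)
```

The F-score is the harmonic mean of precision and recall, so a system that finds few entities very accurately, or many entities inaccurately, is penalized either way.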

Conclusion: The proposed approach is robust in recognizing biological entities in text.

Citing Articles

Construction, evaluation, and application of an electronic medical record corpus for cerebral palsy rehabilitation.
Xiao M, Pang Q, Zhu Y, Shuai L, Jin G. Digit Health. 2024; 10:20552076241286260.
PMID: 39347507. PMC: 11437554. DOI: 10.1177/20552076241286260.

Utility analysis and demonstration of real-world clinical texts: A case study on Japanese cancer-related EHRs.
Yada S, Nishiyama T, Wakamiya S, Kawazoe Y, Imai S, Hori S. PLoS One. 2024; 19(9):e0310432.
PMID: 39259727. PMC: 11389901. DOI: 10.1371/journal.pone.0310432.

Application of machine reading comprehension techniques for named entity recognition in materials science.
Huang Z, He L, Yang Y, Li A, Zhang Z, Wu S. J Cheminform. 2024; 16(1):76.
PMID: 38956728. PMC: 11220966. DOI: 10.1186/s13321-024-00874-5.

Advancing entity recognition in biomedicine via instruction tuning of large language models.
Keloth V, Hu Y, Xie Q, Peng X, Wang Y, Zheng A. Bioinformatics. 2024; 40(4).
PMID: 38514400. PMC: 11001490. DOI: 10.1093/bioinformatics/btae163.

A Combined Manual Annotation and Deep-Learning Natural Language Processing Study on Accurate Entity Extraction in Hereditary Disease Related Biomedical Literature.
Huang D, Zeng Q, Xiong Y, Liu S, Pang C, Xia M. Interdiscip Sci. 2024; 16(2):333-344.
PMID: 38340264. PMC: 11289304. DOI: 10.1007/s12539-024-00605-2.

