
Biomedical Named Entity Recognition Based on Multi-cross Attention Feature Fusion

Overview
Journal PLoS One
Date 2024 May 28
PMID 38805478
Abstract

In biomedical named entity recognition, character features are typically extracted with either a CharCNN (character-level convolutional neural network) or a CharRNN (character-level recurrent neural network), each used on its own. This approach ignores the complementary strengths of the two extractors, and it merely concatenates the character features with the word features, discarding the feature information that arises while the two are integrated. To address this, this paper proposes a multi-cross attention feature fusion method. First, word features from DistilBioBERT are fused separately with character features from CharCNN and from CharLSTM through cross-attention (word-char fusion of word features and character features). Then, the two feature vectors obtained from these fusions are fused again through cross-attention to obtain the final feature vector. Next, a BiLSTM with a multi-head attention mechanism is introduced to strengthen the model's focus on key information features and further improve performance. Finally, the output layer produces the final result. Experimental results show that the proposed model achieves best F1 values of 90.76%, 89.79%, 94.98%, 80.27% and 88.84% on the NCBI-Disease, BC5CDR-Disease, BC5CDR-Chem, JNLPBA and BC2GM biomedical datasets, respectively. This indicates that the model can capture richer semantic features and improve entity recognition.
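The two-stage fusion described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes standard scaled dot-product attention without learned projection matrices, it assumes that word features act as queries and character features as keys and values (the abstract does not state the roles), and it uses made-up toy vectors in place of DistilBioBERT, CharCNN and CharLSTM outputs. The names `cross_attention`, `word_feats`, `char_cnn_feats` and `char_lstm_feats` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows.

    Assumption: no learned projections; a real model would apply trainable
    linear maps to queries, keys and values first.
    """
    dim = len(keys[0])
    out_dim = len(values[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(out_dim)]
        )
    return fused


# Toy features for a 3-token sentence, 2 dimensions each (illustrative values only).
word_feats = [[0.2, 0.9], [0.7, 0.1], [0.5, 0.5]]       # stand-in for DistilBioBERT
char_cnn_feats = [[0.1, 0.3], [0.8, 0.2], [0.4, 0.6]]   # stand-in for CharCNN
char_lstm_feats = [[0.3, 0.7], [0.6, 0.4], [0.9, 0.1]]  # stand-in for CharLSTM

# Stage 1: two separate word-char fusions (assumed roles: word = query).
word_cnn = cross_attention(word_feats, char_cnn_feats, char_cnn_feats)
word_lstm = cross_attention(word_feats, char_lstm_feats, char_lstm_feats)

# Stage 2: fuse the two fused representations again through cross-attention.
final_feats = cross_attention(word_cnn, word_lstm, word_lstm)

for token_vector in final_feats:
    print([round(x, 4) for x in token_vector])
```

In this sketch each output row is a weighted average of the value rows, so the final vectors stay within the range of the CharLSTM stand-in features. In the described model, the final feature vectors would then pass to the BiLSTM with multi-head attention and the output layer.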
