
BERT-based Language Model for Accurate Drug Adverse Event Extraction from Social Media: Implementation, Evaluation, and Contributions to Pharmacovigilance Practices

Overview
Specialty Public Health
Date 2024 May 8
PMID 38716250
Abstract

Introduction: Social media platforms are a valuable channel through which users share health-related information, and this content can support drug safety surveillance by helping to monitor adverse events linked to medications and treatments. However, extracting drug-related adverse events from social media accurately and efficiently remains a challenge for both natural language processing research and pharmacovigilance.

Method: Because detailed implementations and evaluations of Bidirectional Encoder Representations from Transformers (BERT)-based models for drug adverse event extraction on social media are lacking, we developed a BERT-based language model tailored to identifying drug adverse events in this context. Our model used publicly available labeled adverse event data from the ADE-Corpus-V2. Constructing the model involved optimizing key hyperparameters, such as the number of training epochs, batch size, and learning rate. Across ten hold-out evaluations on ADE-Corpus-V2 data and on external social media datasets, the model consistently demonstrated high accuracy in drug adverse event detection.
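The repeated hold-out procedure mentioned above can be illustrated with a minimal sketch. The abstract does not state the split ratio, the seeding, or how the model was trained inside each split, so the 80/20 ratio, the `seed` parameter, and the `train_and_score` callback below are assumptions made for illustration only, not the authors' implementation.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def repeated_holdout(
    examples: Sequence,
    train_and_score: Callable[[List, List], float],  # hypothetical callback
    n_repeats: int = 10,          # the abstract reports ten hold-out evaluations
    test_fraction: float = 0.2,   # assumed; the abstract does not give the ratio
    seed: int = 0,                # assumed; used only to make splits reproducible
) -> Tuple[float, List[float]]:
    """Run n_repeats random train/test splits; return (mean score, all scores)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        # Shuffle a fresh copy so the caller's data is never reordered.
        shuffled = list(examples)
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_fraction))
        test, train = shuffled[:n_test], shuffled[n_test:]
        # The callback trains on `train` and returns a score measured on `test`.
        scores.append(train_and_score(train, test))
    return mean(scores), scores
```

In such a setup, `train_and_score` would wrap fine-tuning and scoring of the model for one split, and the averaged score is what gets reported across the repeats.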

Result: The hold-out evaluations resulted in average F1 scores of 0.8575, 0.9049, and 0.9813 for detecting words of adverse events, words in adverse events, and words not in adverse events, respectively. External validation using human-labeled adverse event tweet data from SMM4H further substantiated the effectiveness of our model, yielding F1 scores of 0.8127, 0.8068, and 0.9790 for detecting words of adverse events, words in adverse events, and words not in adverse events, respectively.
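For readers unfamiliar with per-class F1 on word-level labels, the sketch below shows one common way to compute it: each label is scored one-vs-rest from true positives, false positives, and false negatives. The function name and the idea of passing flat label sequences are assumptions for illustration; the abstract does not describe the scoring code or the exact tag names used.

```python
from typing import Dict, Sequence


def per_label_f1(gold: Sequence[str], pred: Sequence[str]) -> Dict[str, float]:
    """Compute one-vs-rest F1 for every label seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs.
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores
```

Each label gets its own score, which is why the abstract reports three F1 values per dataset rather than a single number.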

Discussion: This study not only demonstrates the effectiveness of BERT-based language models in accurately identifying drug-related adverse events in the dynamic landscape of social media data, but also addresses the need for a comprehensive study design and evaluation. In doing so, we contribute to the advancement of pharmacovigilance practices and methodologies in the context of emerging information sources such as social media.
