
Mining FDA Drug Labels Using an Unsupervised Learning Technique: Topic Modeling

Overview
Publisher BioMed Central
Specialty Biology
Date 2011 Dec 15
PMID 22166012
Citations 41
Abstract

Background: Drug labels approved by the Food and Drug Administration (FDA) contain a broad array of information, ranging from adverse drug reactions (ADRs) to drug efficacy, risk-benefit considerations, and more. However, the labeling language used to describe this information is free text that often contains ambiguous semantic descriptions, which makes it difficult to retrieve useful information from the labeling text consistently and accurately for comparative analysis across drugs. Consequently, this task has largely relied on manual reading of the full text by experts, which is time-consuming and labor-intensive.

Method: In this study, a novel unsupervised text mining method called topic modeling was applied to drug labeling with the goal of discovering "topics" that group together drugs with similar safety concerns and/or therapeutic uses. A total of 794 FDA-approved drug labels were used. First, three labeling sections (Boxed Warning, Warnings and Precautions, and Adverse Reactions) of each drug label were processed with the Medical Dictionary for Regulatory Activities (MedDRA) to convert the free text of each label to standard ADR terms. Next, topic modeling with latent Dirichlet allocation (LDA) was applied to generate 100 topics, each associated with a set of drugs grouped together based on probability analysis. Lastly, the performance of the topic modeling was evaluated against known information about the therapeutic uses and safety data of the drugs.
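The three steps described in the Method paragraph (term standardization, LDA, grouping drugs by topic) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say which LDA software or inference procedure was used, so a small collapsed Gibbs sampler is assumed here. The term list, the drug names, the label snippets, and the hyperparameters `alpha` and `beta` are all hypothetical, and plain substring matching stands in for real MedDRA coding.

```python
import random
from collections import Counter, defaultdict

# Hypothetical stand-in for a MedDRA term list; the real dictionary is far larger.
ADR_TERMS = ["hepatotoxicity", "jaundice", "renal failure", "nephrotoxicity", "nausea"]


def extract_terms(text, terms=ADR_TERMS):
    """Map free text to dictionary terms by substring match, one entry per occurrence."""
    lowered = text.lower()
    found = []
    for term in terms:
        found.extend([term] * lowered.count(term))
    return found


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampler for LDA; returns per-document topic probabilities."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Random initial topic for every term occurrence.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Remove the current assignment before resampling it.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed topic proportions for each document (each row sums to 1).
    return [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


def group_by_topic(names, theta):
    """Assign each drug to its most probable topic."""
    groups = defaultdict(list)
    for name, probs in zip(names, theta):
        best = max(range(len(probs)), key=probs.__getitem__)
        groups[best].append(name)
    return dict(groups)


# Hypothetical label snippets; the study used 794 labels and 100 topics.
LABELS = {
    "drug_a": "Hepatotoxicity and jaundice were reported. Jaundice may be severe.",
    "drug_b": "Cases of jaundice and hepatotoxicity; nausea is common.",
    "drug_c": "Renal failure and nephrotoxicity have occurred.",
    "drug_d": "Nephrotoxicity, renal failure, and nausea were observed.",
}

docs = [extract_terms(text) for text in LABELS.values()]
theta = fit_lda(docs, num_topics=2)
groups = group_by_topic(list(LABELS), theta)
print(groups)
```

Assigning each drug only to its single most probable topic is a simplification chosen for brevity; a probability threshold would let one drug belong to several topics.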

Results: The results demonstrate that drugs grouped by topic are associated with the same safety concerns and/or therapeutic uses with statistical significance (P<0.05). The identified topics have distinct contexts that can be directly linked to specific adverse events (e.g., liver injury or kidney injury) or therapeutic applications (e.g., anti-infectives for systemic use). We were also able to use the topics to identify potential adverse events that might arise from specific medications.
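The abstract does not name the statistical test behind the reported P<0.05. One common way to check whether a topic's drug group is enriched for a given annotation (for example, a known liver injury concern) is a one-sided hypergeometric test, sketched below as an assumption. The function name and the example counts are hypothetical; only the total of 794 labels comes from the study.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated drugs in a group of n,
    drawn without replacement from N drugs of which K carry the annotation."""
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical counts: a topic groups 10 of the 794 drugs, 6 of which carry an
# annotation that 60 drugs have overall.
p = hypergeom_pvalue(k=6, n=10, K=60, N=794)
print(f"p = {p:.3g}", "significant" if p < 0.05 else "not significant")
```

With 100 topics tested against several annotations, a multiple-testing correction would normally be applied to such p-values; the abstract does not say whether one was used.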

Conclusions: The successful application of topic modeling to FDA drug labeling demonstrates its potential as a hypothesis-generation tool for inferring hidden relationships among concepts in biomedical documents, such as, in this study, drug safety and therapeutic use.

Citing Articles

Classifying Free Texts Into Predefined Sections Using AI in Regulatory Documents: A Case Study with Drug Labeling Documents.

Gray M, Xu J, Tong W, Wu L Chem Res Toxicol. 2023; 36(8):1290-1299.

PMID: 37487037 PMC: 10445280. DOI: 10.1021/acs.chemrestox.3c00028.


Psychosocial Needs of Gynecological Cancer Survivors: Mixed Methods Study.

Adams E, Tallman D, Haynam M, Nekhlyudov L, Lustberg M J Med Internet Res. 2022; 24(9):e37757.

PMID: 36125848 PMC: 9533206. DOI: 10.2196/37757.


Comparison of the Erectile Dysfunction Drugs Sildenafil and Tadalafil Using Patient Medication Reviews: Topic Modeling Study.

Kim M, Noh Y, Yamada A, Hong S JMIR Med Inform. 2022; 10(2):e32689.

PMID: 35225813 PMC: 8922152. DOI: 10.2196/32689.


Using topic modelling for unsupervised annotation of electronic health records to identify an outbreak of disease in UK dogs.

Noble P, Appleton C, Radford A, Nenadic G PLoS One. 2021; 16(12):e0260402.

PMID: 34882714 PMC: 8659617. DOI: 10.1371/journal.pone.0260402.


Mining Early Life Risk and Resiliency Factors and Their Influences in Human Populations from PubMed: A Machine Learning Approach to Discover DOHaD Evidence.

Tewari S, Toledo Margalef P, Kareem A, Abdul-Hussein A, White M, Wazana A J Pers Med. 2021; 11(11).

PMID: 34834416 PMC: 8621659. DOI: 10.3390/jpm11111064.

