
Topic Modeling Based Classification of Clinical Reports

Overview
Date 2023 Oct 3
PMID 37786783
Abstract

Electronic health records (EHRs) contain important clinical information about patients. Some of these data are free text and must be preprocessed before they can be used in automated systems. Efficient and effective use of these data could be vital to the speed and quality of health care. As a case study, we analyzed the classification of CT imaging reports into binary categories. In addition to regular text classification, we applied topic modeling to the entire dataset in several ways. Topic modeling of the corpora provides interpretable themes that exist in these reports. Representing reports by their topic distributions is more compact than a bag-of-words representation and can be processed faster than raw text in subsequent automated processes. A binary topic model was also built as an unsupervised classification approach, under the assumption that each topic corresponds to a class. Finally, an aggregate topic classifier was built in which reports are classified based on a single discriminative topic that is determined from the training dataset. Our proposed topic-based classifier system is shown to be competitive with existing text classification techniques and provides a more efficient and interpretable representation.
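
The two topic-based classifiers described in the abstract can be illustrated with a minimal Python sketch. It assumes each report has already been reduced to a topic distribution by some topic model (not shown here). The abstract does not say how the discriminative topic is selected, so choosing the topic and threshold with the highest training accuracy is an assumption made for illustration, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# (topic index, threshold, class predicted when weight >= threshold)
Rule = Tuple[int, float, int]


def fit_single_topic_rule(
    topic_dists: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> Rule:
    """Pick the single topic and threshold with the best training accuracy.

    Assumption: this exhaustive search stands in for whatever criterion
    the paper uses to determine the discriminative topic.
    """
    n_topics = len(topic_dists[0])
    best: Rule = (0, 0.0, 1)
    best_acc = -1.0
    for k in range(n_topics):
        weights = [d[k] for d in topic_dists]
        for thr in sorted(set(weights)):
            for polarity in (1, 0):
                preds = [polarity if w >= thr else 1 - polarity for w in weights]
                acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
                if acc > best_acc:  # strict '>' keeps the first best rule
                    best_acc = acc
                    best = (k, thr, polarity)
    return best


def predict_single_topic(
    topic_dists: Sequence[Sequence[float]], rule: Rule
) -> List[int]:
    """Classify each report from the weight of the one selected topic."""
    k, thr, polarity = rule
    return [polarity if d[k] >= thr else 1 - polarity for d in topic_dists]


def predict_binary_topic_model(
    topic_dists: Sequence[Sequence[float]],
) -> List[int]:
    """Unsupervised variant: with two topics, the dominant topic is the class.

    Which topic index maps to which clinical class is arbitrary and has to
    be decided by inspecting the topics.
    """
    return [0 if d[0] >= d[1] else 1 for d in topic_dists]


# Toy three-topic distributions for four training reports (made-up numbers).
train = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5], [0.1, 0.5, 0.4]]
train_labels = [1, 1, 0, 0]
rule = fit_single_topic_rule(train, train_labels)
new_reports = [[0.65, 0.25, 0.10], [0.15, 0.45, 0.40]]
predictions = predict_single_topic(new_reports, rule)
```

Because the rule uses one topic weight per report, classifying a new report is a single comparison, which reflects the compactness and interpretability the abstract attributes to topic-based representations.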

Citing Articles

Prediction of complications in diabetes mellitus using machine learning models with transplanted topic model features.

Han B, Kim J, Choi J Biomed Eng Lett. 2024; 14(1):163-171.

PMID: 38186952 PMC: 10769946. DOI: 10.1007/s13534-023-00322-7.


Detecting Hypoglycemia Incidents Reported in Patients' Secure Messages: Using Cost-Sensitive Learning and Oversampling to Reduce Data Imbalance.

Chen J, Lalor J, Liu W, Druhl E, Granillo E, Vimalananda V J Med Internet Res. 2019; 21(3):e11990.

PMID: 30855231 PMC: 6431826. DOI: 10.2196/11990.


Biomedical Text Categorization Based on Ensemble Pruning and Optimized Topic Modelling.

Onan A Comput Math Methods Med. 2018; 2018:2497471.

PMID: 30140300 PMC: 6081524. DOI: 10.1155/2018/2497471.


Comment Topic Evolution on a Cancer Institution's Facebook Page.

Tang C, Zhou L, Plasek J, Rozenblum R, Bates D Appl Clin Inform. 2017; 8(3):854-865.

PMID: 28832069 PMC: 6220692. DOI: 10.4338/ACI-2017-04-RA-0055.


An overview of topic modeling and its current applications in bioinformatics.

Liu L, Tang L, Dong W, Yao S, Zhou W Springerplus. 2016; 5(1):1608.

PMID: 27652181 PMC: 5028368. DOI: 10.1186/s40064-016-3252-8.
