Extracting Principal Diagnosis, Co-morbidity and Smoking Status for Asthma Research: Evaluation of a Natural Language Processing System

Overview
Publisher BioMed Central
Date 2006 Jul 29
PMID 16872495
Citations 186
Abstract

Background: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease.

Methods: The principal diagnosis, co-morbidity and smoking status extracted by HITEx from a set of 150 discharge summaries were compared to an expert-generated gold standard.

Results: The accuracy of HITEx was 82% for principal diagnosis, 87% for co-morbidity, and 90% for smoking status extraction, when cases labeled "Insufficient Data" by the gold standard were excluded.
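
The scoring rule described above, accuracy computed only over cases the gold standard does not label "Insufficient Data", can be illustrated with a minimal sketch. The function name, the label strings other than "Insufficient Data", and the sample records are hypothetical and are not taken from HITEx or the study data.

```python
from typing import Sequence

# Gold-standard label that marks a case as unscorable (from the abstract).
INSUFFICIENT = "Insufficient Data"


def accuracy_excluding_insufficient(gold: Sequence[str],
                                    predicted: Sequence[str]) -> float:
    """Fraction of exact label matches, skipping cases the gold
    standard marks as 'Insufficient Data'."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Keep only the pairs whose gold label is scorable.
    scorable = [(g, p) for g, p in zip(gold, predicted) if g != INSUFFICIENT]
    if not scorable:
        raise ValueError("no scorable cases after exclusion")

    correct = sum(1 for g, p in scorable if g == p)
    return correct / len(scorable)


# Hypothetical smoking-status labels for five discharge summaries.
gold = ["Current Smoker", "Non-Smoker", "Insufficient Data",
        "Past Smoker", "Non-Smoker"]
predicted = ["Current Smoker", "Non-Smoker", "Non-Smoker",
             "Non-Smoker", "Non-Smoker"]

# Four cases are scorable and three of them match.
print(accuracy_excluding_insufficient(gold, predicted))  # 0.75
```

Excluding the "Insufficient Data" cases shrinks the denominator, so the reported figures describe performance only on summaries where the experts could assign a definite label.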

Conclusion: We consider the results promising, given the complexity of the discharge summaries and the extraction tasks.

Citing Articles

Current Applications of Artificial Intelligence in Billing Practices and Clinical Plastic Surgery.

Zhu C, Attaluri P, Wirth P, Shaffrey E, Friedrich J, Rao V Plast Reconstr Surg Glob Open. 2024; 12(7):e5939.

PMID: 38957712 PMC: 11216662. DOI: 10.1097/GOX.0000000000005939.


Using electronic health records for clinical pharmacology research: Challenges and considerations.

Jafari E, Blackman M, Karnes J, Van Driest S, Crawford D, Choi L Clin Transl Sci. 2024; 17(7):e13871.

PMID: 38943244 PMC: 11213823. DOI: 10.1111/cts.13871.


A Comprehensive Review on Synergy of Multi-Modal Data and AI Technologies in Medical Diagnosis.

Xu X, Li J, Zhu Z, Zhao L, Wang H, Song C Bioengineering (Basel). 2024; 11(3).

PMID: 38534493 PMC: 10967767. DOI: 10.3390/bioengineering11030219.


Reasons for multiple biologic and targeted synthetic DMARD switching and characteristics of treatment refractory rheumatoid arthritis.

McDermott G, DiIorio M, Kawano Y, Jeffway M, MacVicar M, Dahal K Semin Arthritis Rheum. 2024; 66:152421.

PMID: 38457949 PMC: 11088978. DOI: 10.1016/j.semarthrit.2024.152421.


The validity of electronic health data for measuring smoking status: a systematic review and meta-analysis.

Haque M, Gedara M, Nickel N, Turgeon M, Lix L BMC Med Inform Decis Mak. 2024; 24(1):33.

PMID: 38308231 PMC: 10836023. DOI: 10.1186/s12911-024-02416-3.

