Using Natural Language Processing on the Free Text of Clinical Documents to Screen for Evidence of Homelessness Among US Veterans

Overview
Date 2014 Feb 20
PMID 24551356
Citations 44
Abstract

Information retrieval algorithms based on natural language processing (NLP) of the free text of medical records have been used to find documents of interest in databases. Homelessness is a high-priority non-medical diagnosis that is noted in the electronic medical records of Veterans at Veterans Affairs (VA) facilities. Using a human-reviewed reference standard corpus of clinical documents from Veterans with and without evidence of homelessness, an open-source NLP tool (Automated Retrieval Console v2.0, ARC) was trained to classify documents. The best-performing model, based on a document-level workflow, performed well on a test set (Precision 94%, Recall 97%, F-Measure 96%). Processing a naïve set of 10,000 randomly selected VA documents with this model yielded 463 documents flagged as positive, indicating a 4.7% prevalence of homelessness. Human review found a precision of 70% for these flags, resulting in an adjusted prevalence of homelessness of 3.3%, which matches current VA estimates. Further refinements are underway to improve performance. We demonstrate an effective and rapid lifecycle for using an off-the-shelf NLP tool to screen medical records for targets of interest.
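
The arithmetic behind the reported figures can be sketched as follows. This is a minimal illustration, not code from the study or from ARC: the function names `f_measure` and `adjusted_prevalence` are hypothetical, the F-measure is assumed to be the balanced F1 (harmonic mean of precision and recall), and the inputs are the rounded values quoted in the abstract, so the results can differ from the published numbers in the last digit.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def adjusted_prevalence(flagged: int, total: int, precision: float) -> float:
    """Scale the raw flag rate by the precision found on human review.

    Assumption: only false positives are corrected for; documents the
    model missed (false negatives) are not added back.
    """
    return (flagged / total) * precision


# Rounded figures quoted in the abstract.
f1 = f_measure(0.94, 0.97)
raw = 463 / 10_000
adjusted = adjusted_prevalence(463, 10_000, 0.70)

print(f"F-measure from rounded inputs: {f1:.1%}")
print(f"Raw flag rate: {raw:.1%}")
print(f"Precision-adjusted prevalence: {adjusted:.1%}")
```

With these rounded inputs the F-measure comes out at roughly 95.5%, and the adjusted prevalence at roughly 3.2%, close to the 96 and 3.3% stated in the abstract. The adjustment multiplies the flag rate by precision, which removes the estimated false positives but leaves any missed documents uncounted.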

Citing Articles

The ENACT network is acting on housing instability and the unhoused using the open health natural language processing toolkit.

Harris D, Fu S, Wen A, Corbeau A, Henderson D, Hilsman J. J Clin Transl Sci. 2024; 8(1):e98.

PMID: 39655040 PMC: 11626605. DOI: 10.1017/cts.2024.543.


Natural Language Processing Algorithm to Extract Multiple Myeloma Stage From Oncology Notes in the Veterans Affairs Healthcare System.

Goryachev S, Yildirim C, DuMontier C, La J, Dharne M, Gaziano J. JCO Clin Cancer Inform. 2024; 8:e2300197.

PMID: 39038255 PMC: 11371094. DOI: 10.1200/CCI.23.00197.


Identifying social determinants of health from clinical narratives: A study of performance, documentation ratio, and potential bias.

Yu Z, Peng C, Yang X, Dang C, Adekkanattu P, Gopal Patra B. J Biomed Inform. 2024; 153:104642.

PMID: 38621641 PMC: 11141428. DOI: 10.1016/j.jbi.2024.104642.


Social and Behavioral Determinants of Health in the Era of Artificial Intelligence with Electronic Health Records: A Scoping Review.

Bompelli A, Wang Y, Wan R, Singh E, Zhou Y, Xu L. Health Data Sci. 2024; 2021:9759016.

PMID: 38487504 PMC: 10880156. DOI: 10.34133/2021/9759016.


Using electronic health record metadata to predict housing instability amongst veterans.

Zamora-Resendiz R, Oslin D, Hooshyar D, Crivelli S. Prev Med Rep. 2024; 37:102505.

PMID: 38261912 PMC: 10796937. DOI: 10.1016/j.pmedr.2023.102505.

