
Using Natural Language Processing to Improve Efficiency of Manual Chart Abstraction in Research: the Case of Breast Cancer Recurrence

Overview
Journal: Am J Epidemiol
Specialty: Public Health
Date: 2014 Feb 4
PMID: 24488511
Citations: 82
Abstract

The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process electronic clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. We developed and evaluated the system using clinical notes from 1,472 patients receiving EHR-documented care in an integrated health care system in the Pacific Northwest. A separate study provided the patient-level reference standard for recurrence status and date. The NLP-based system correctly identified 92% of recurrences and estimated diagnosis dates within 30 days for 88% of these. Specificity was 96%. The NLP-based system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. The NLP-based system identified 5 other recurrences incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction.
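The reported sensitivity can be checked from the counts given in the abstract (65 reference-standard recurrences, 5 of them overlooked). The following is a minimal sketch of that arithmetic only; the function and variable names are illustrative assumptions and are not taken from the study's software. Specificity is not recomputed here because the abstract does not give the number of nonrecurrent patients in the evaluation set.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of reference-standard positive cases that were detected."""
    total_positives = true_positives + false_negatives
    if total_positives == 0:
        raise ValueError("sensitivity is undefined with no positive cases")
    return true_positives / total_positives


# Counts stated in the abstract.
total_recurrences = 65   # recurrences in the reference standard
missed = 5               # recurrences the NLP-based system overlooked

# Derived: recurrences the system did identify.
detected = total_recurrences - missed

sens = sensitivity(detected, missed)
print(f"Sensitivity: {sens:.1%}")  # prints "Sensitivity: 92.3%"
```

The result, 60 of 65, rounds to the 92% stated in the abstract.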

Citing Articles

Automated Identification of Breast Cancer Relapse in Computed Tomography Reports Using Natural Language Processing.

Lee J, Zepeda A, Arbour G, Isaac K, Ng R, Nichol A. JCO Clin Cancer Inform. 2024; 8:e2400107.

PMID: 39705642 PMC: 11670918. DOI: 10.1200/CCI.24.00107.


Artificial intelligence methods available for cancer research.

Murmu A, Gyorffy B. Front Med. 2024; 18(5):778-797.

PMID: 39115792 DOI: 10.1007/s11684-024-1085-3.


Advancing equity in breast cancer care: natural language processing for analysing treatment outcomes in under-represented populations.

Park J, Park J, Zhang K, Kim D. BMJ Health Care Inform. 2024; 31(1).

PMID: 38955389 PMC: 11218025. DOI: 10.1136/bmjhci-2023-100966.


Development of an Automatic Rule-Based Algorithm for the Detection of Ovarian Cancer Recurrence From Electronic Health Records.

Lee S, Kim J, Ha H, Lim M, Cho H. JCO Clin Cancer Inform. 2024; 8:e2300150.

PMID: 38442323 PMC: 10927333. DOI: 10.1200/CCI.23.00150.


Toward Efficient, Sustainable, and Scalable Methods of Treatment Characterization: An Investigation of Coding Clinical Practice from Chart Notes.

Isenberg B, Becker K, Wu E, Park H, Chu W, Keenan-Miller D. Adm Policy Ment Health. 2023; 51(1):103-122.

PMID: 38032421 DOI: 10.1007/s10488-023-01316-4.

