
Automated Extraction of Post-stroke Functional Outcomes from Unstructured Electronic Health Records

Overview
Journal Eur Stroke J
Date 2025 Jan 22
PMID 39838914
Abstract

Purpose: Population-level tracking of post-stroke functional outcomes is critical to guide interventions that reduce the burden of stroke-related disability. However, functional outcomes are often missing or documented only in unstructured notes. We developed a natural language processing (NLP) model that reads electronic health record (EHR) notes to automatically determine the modified Rankin Scale (mRS) score.

Method: We included consecutive patients (⩾18 years) with acute stroke admitted to our center (2015-2024). mRS scores were obtained from the Get With the Guidelines registry and from clinical notes (if documented), and served as the gold standard against which NLP-generated scores were compared. We used text-based features from notes, along with age, sex, discharge status, and outpatient follow-up, to train a logistic regression model for prediction of good (0-2) versus poor (3-6) mRS, and a linear regression model for the full range of mRS scores. The models were trained to predict mRS at hospital discharge and post-discharge, and were externally validated in a dataset of patients with brain injuries from a different healthcare center.
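The good-versus-poor split described above can be illustrated with a minimal Python sketch. This is not the authors' code; the function name `dichotomize_mrs` and the string labels are hypothetical, and only the 0-2 / 3-6 cut-off comes from the abstract.

```python
def dichotomize_mrs(score: int) -> str:
    """Map a modified Rankin Scale score (0-6) to a binary outcome label.

    The cut-off follows the abstract: 0-2 is "good", 3-6 is "poor".
    The function name and label strings are illustrative assumptions.
    """
    if not isinstance(score, int) or not 0 <= score <= 6:
        raise ValueError(f"mRS must be an integer from 0 to 6, got {score!r}")
    return "good" if score <= 2 else "poor"


if __name__ == "__main__":
    # Print the label assigned to every possible mRS score.
    for mrs in range(7):
        print(mrs, dichotomize_mrs(mrs))
```

A binary label of this kind would be the target of the logistic regression model, while the linear regression model would be fitted to the raw 0-6 scores.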

Findings: We included 5307 patients, 5006 in the training and test sets and 301 in the validation set; average age was 69 (SD 15) and 65 (SD 17) years, respectively; 47% were female. The logistic regression model achieved an area under the receiver operating characteristic curve (AUROC) of 0.94 [CI 0.93-0.95] (test) and 0.94 [0.91-0.96] (validation), and the linear model a root mean squared error (RMSE) of 0.91 [0.87-0.94] (test) and 1.17 [1.06-1.28] (validation).
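For readers less familiar with the two reported metrics, the following Python sketch shows how AUROC and RMSE are defined. It is a plain-Python illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, the toy inputs are invented for demonstration, and coding the positive class as 1 is an assumption the abstract does not specify.

```python
import math


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half). Labels are assumed to be 0 or 1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the study.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    print(rmse([0, 3, 6], [1, 3, 4]))
```

On the mRS scale, an RMSE close to 1 means that predicted scores deviate from the recorded scores by roughly one scale point in the root-mean-square sense.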

Discussion and Conclusion: The NLP-based model is suitable for use in large-scale phenotyping of stroke functional outcomes and population health research.
