
Performance of a Natural Language Processing (NLP) Tool to Extract Pulmonary Function Test (PFT) Reports from Structured and Semistructured Veteran Affairs (VA) Data

Overview
Journal EGEMS (Wash DC)
Publisher Ubiquity Press
Date 2016 Jul 5
PMID 27376095
Citations 10
Abstract

Introduction/objective: Pulmonary function tests (PFTs) provide objective estimates of lung function but are not reliably stored as structured data within the Veteran Health Affairs data systems. The aim of this study was to validate the natural language processing (NLP) tool we developed, which extracts spirometric values and responses to bronchodilator administration, against expert review, and to estimate the number of additional spirometric tests it identifies beyond those in the structured data.

Methods: All patients at seven Veteran Affairs Medical Centers with a diagnostic code for asthma between January 1, 2006 and December 31, 2012 were included. Evidence of spirometry with a bronchodilator challenge (BDC) was extracted from structured data as well as from clinical documents. The NLP tool's performance was compared against a human reference standard using a random sample of 1,001 documents.
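As a rough illustration of what extracting pre- and post-bronchodilator values from a clinical document can involve, the sketch below pulls value pairs out of free text with a regular expression. It is not the tool validated in this study: the note layout, the pattern, and the function name `extract_pairs` are all hypothetical, and real PFT reports vary far more than this example allows.

```python
import re

# Hypothetical note layout, e.g. "FVC: pre 3.10 L, post 3.45 L".
# This pattern is an assumption for illustration, not the study's method.
PATTERN = re.compile(
    r"\b(?P<measure>FVC|FEV1)\b\D*?pre\D*?(?P<pre>\d+(?:\.\d+)?)"
    r"\D*?post\D*?(?P<post>\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_pairs(text):
    """Return {measure: (pre_value, post_value)} for each pair found in text."""
    pairs = {}
    for match in PATTERN.finditer(text):
        measure = match.group("measure").upper()
        pairs[measure] = (float(match.group("pre")), float(match.group("post")))
    return pairs


note = "FVC: pre 3.10 L, post 3.45 L\nFEV1: pre 2.40 L, post 2.75 L"
print(extract_pairs(note))
```

A pattern this loose would also misread ratio lines such as "FEV1/FVC", so a practical extractor would need additional rules and validation against expert review, as the study describes.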

Results: In the validation set, NLP demonstrated a precision of 98.9 percent (95 percent confidence interval (CI): 93.9 percent, 99.7 percent), recall of 97.8 percent (95 percent CI: 92.2 percent, 99.7 percent), and an F-measure of 98.3 percent for the forced vital capacity (FVC) pre- and post-bronchodilator pairs, and a precision of 100 percent (95 percent CI: 96.6 percent, 100 percent), recall of 100 percent (95 percent CI: 96.6 percent, 100 percent), and an F-measure of 100 percent for the forced expiratory volume in one second (FEV1) pre- and post-bronchodilator pairs. Application of the NLP increased the proportion identified with a complete bronchodilator challenge by 25 percent.
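The reported F-measures can be checked from the precision and recall figures, assuming the abstract uses the balanced F-measure (F1), the harmonic mean of precision and recall. The abstract does not name the variant, so that is an assumption; the function name `f_measure` is illustrative only.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
fvc_f = f_measure(0.989, 0.978)   # forced vital capacity pairs
fev1_f = f_measure(1.0, 1.0)      # forced expiratory volume in one second pairs

print(f"FVC pairs:  {fvc_f:.1%}")   # 98.3%
print(f"FEV1 pairs: {fev1_f:.1%}")  # 100.0%
```

Under this assumption both results match the published values of 98.3 percent and 100 percent.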

Discussion/conclusion: This technology can improve identification of PFTs for epidemiologic research. Caution must be taken in assuming that a single domain of clinical data can completely capture the scope of a disease, treatment, or clinical test.

Citing Articles

Development and validation of a pulmonary function test data extraction tool for the US department of veterans affairs electronic health record.

Rabin A, Weinstein J, Seelye S, Whittington T, Hogan C, Prescott H. BMC Res Notes. 2024; 17(1):115.

PMID: 38654333 PMC: 11039415. DOI: 10.1186/s13104-024-06770-3.


Extracting forced vital capacity from the electronic health record through natural language processing in rheumatoid arthritis-associated interstitial lung disease.

England B, Roul P, Yang Y, Hershberger D, Sayles H, Rojas J. Pharmacoepidemiol Drug Saf. 2023; 33(1):e5744.

PMID: 38112272 PMC: 10872496. DOI: 10.1002/pds.5744.


Salience of Medical Concepts of Inside Clinical Texts and Outside Medical Records for Referred Cardiovascular Patients.

Moon S, Liu S, Chen D, Wang Y, Wood D, Chaudhry R. J Healthc Inform Res. 2022; 3(2):200-219.

PMID: 35415427 PMC: 8982748. DOI: 10.1007/s41666-019-00044-5.


Searching the PDF Haystack: Automated Knowledge Discovery in Scanned EHR Documents.

Kostrinsky-Thomas A, Hisama F, Payne T. Appl Clin Inform. 2021; 12(2):245-250.

PMID: 33763846 PMC: 7990572. DOI: 10.1055/s-0041-1726103.


Expert artificial intelligence-based natural language processing characterises childhood asthma.

Seol H, Rolfes M, Chung W, Sohn S, Ryu E, Park M. BMJ Open Respir Res. 2020; 7(1).

PMID: 33371009 PMC: 7011897. DOI: 10.1136/bmjresp-2019-000524.

