
Weakly Supervised Natural Language Processing for Assessing Patient-centered Outcome Following Prostate Cancer Treatment

Overview
Journal JAMIA Open
Date 2019 Apr 30
PMID 31032481
Citations 21
Abstract

Background: Population-based assessment of patient-centered outcomes (PCOs) has been limited by the difficulty of collecting these data efficiently and accurately. Natural language processing (NLP) pipelines can determine whether a clinical note within an electronic medical record contains evidence of these outcomes. We present and demonstrate the accuracy of an NLP pipeline that assesses the presence, absence, or risk discussion of two important PCOs following prostate cancer treatment: urinary incontinence (UI) and bowel dysfunction (BD).

Methods: We propose a weakly supervised NLP approach that annotates electronic medical record clinical notes without requiring manual chart review. A weighted function of neural word embeddings was used to create a sentence-level vector representation of relevant expressions extracted from the clinical notes. Sentence vectors were used as input to a multinomial logistic model whose output was presence, absence, or risk discussion of UI/BD. The classifier was trained on sentences annotated automatically using only domain-specific dictionaries (weak supervision).
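
The three steps described in the Methods (dictionary-based labeling, weighted sentence vectors, multinomial logistic classification) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the study: the term lists, the rule order in `weak_label`, the frequency-based weight `a / (a + p(w))`, the random vectors standing in for trained word embeddings, and the plain gradient-descent training loop.

```python
import math
import random
from collections import Counter

# Hypothetical mini-dictionaries (assumptions; the study's term lists are not given here).
TARGET = {"incontinence", "leakage"}
NEGATION = {"no", "denies", "without"}
RISK = {"risk", "may", "possible"}
LABELS = ["presence", "absence", "risk"]
DIM = 8  # toy embedding size


def weak_label(sentence):
    """Label a sentence from dictionary hits alone; None if no target term occurs."""
    tokens = set(sentence.lower().split())
    if not tokens & TARGET:
        return None
    if tokens & RISK:
        return "risk"
    if tokens & NEGATION:
        return "absence"
    return "presence"


def sentence_vector(sentence, embeddings, freq, a=0.1):
    """Weighted average of word vectors; rarer words get larger weights (assumed scheme)."""
    tokens = [t for t in sentence.lower().split() if t in embeddings]
    total = sum(freq.values())
    vec = [0.0] * DIM
    for t in tokens:
        weight = a / (a + freq[t] / total)
        for i, value in enumerate(embeddings[t]):
            vec[i] += weight * value
    return [v / max(len(tokens), 1) for v in vec]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def predict_proba(weights, x):
    """Class probabilities from a multinomial logistic (softmax) model with a bias term."""
    features = x + [1.0]
    return softmax([sum(w * f for w, f in zip(row, features)) for row in weights])


def train(X, y, epochs=200, lr=0.5):
    """Fit the softmax model with plain stochastic gradient descent."""
    weights = [[0.0] * (DIM + 1) for _ in LABELS]
    for _ in range(epochs):
        for x, label in zip(X, y):
            probs = predict_proba(weights, x)
            features = x + [1.0]
            for k, row in enumerate(weights):
                grad = probs[k] - (1.0 if LABELS[k] == label else 0.0)
                for i, f in enumerate(features):
                    row[i] -= lr * grad * f
    return weights


# Made-up example sentences, not real clinical text.
notes = [
    "patient reports urinary incontinence at night",
    "denies any leakage since surgery",
    "discussed risk of incontinence after prostatectomy",
    "follow up in six months",
]
labeled = [(s, weak_label(s)) for s in notes]
labeled = [(s, lab) for s, lab in labeled if lab is not None]

# Random vectors stand in for embeddings that would normally be trained on a corpus.
rng = random.Random(0)
freq = Counter(t for s in notes for t in s.split())
embeddings = {t: [rng.uniform(-1, 1) for _ in range(DIM)] for t in freq}

X = [sentence_vector(s, embeddings, freq) for s, _ in labeled]
y = [lab for _, lab in labeled]
model = train(X, y)
probs = predict_proba(model, X[0])
```

The point of the sketch is that no sentence is labeled by hand: `weak_label` supplies the training targets, and the classifier learns from the vector representation instead of from the dictionary rules directly.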

Results: The model achieved an average F1 score of 0.86 for the sentence-level, three-tier classification task (presence/absence/risk) for both UI and BD. The model also outperformed a pre-existing rule-based model for note-level annotation of UI by a significant margin.
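
For readers unfamiliar with the metric, the sketch below shows one way an average F1 score over three classes can be computed. It assumes an unweighted (macro) average of per-class F1 scores; the abstract does not state which averaging was used, and the labels below are made-up toy data, not study results.

```python
LABELS = ["presence", "absence", "risk"]


def f1_for_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    return sum(f1_for_class(y_true, y_pred, lab) for lab in LABELS) / len(LABELS)


# Toy labels for illustration only.
y_true = ["presence", "presence", "absence", "absence", "risk", "risk"]
y_pred = ["presence", "absence", "absence", "absence", "risk", "presence"]

per_class = {lab: f1_for_class(y_true, y_pred, lab) for lab in LABELS}
average = macro_f1(y_true, y_pred)
```

A macro average treats the three classes equally regardless of how many sentences fall into each, which matters when one class (for example risk discussion) is much rarer than the others.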

Conclusions: We demonstrate a machine learning method for categorizing clinical notes by important PCOs. The method trains a classifier on sentence vector representations labeled with a domain-specific dictionary, which eliminates the need for manual engineering of linguistic rules or manual chart review to extract the PCOs. Compared with rule-based algorithms, the weakly supervised NLP pipeline showed promising sensitivity and specificity for identifying important PCOs in unstructured clinical notes.

Citing Articles

Automated labelling of radiology reports using natural language processing: Comparison of traditional and newer methods.

Chng S, Tern P, Kan M, Cheng L. Health Care Sci. 2024; 2(2):120-128.

PMID: 38938764 PMC: 11080679. DOI: 10.1002/hcs2.40.


Natural language processing pipeline to extract prostate cancer-related information from clinical notes.

Nakai H, Suman G, Adamo D, Navin P, Bookwalter C, LeGout J. Eur Radiol. 2024; 34(12):7878-7891.

PMID: 38842692 DOI: 10.1007/s00330-024-10812-6.


Natural language processing systems for extracting information from electronic health records about activities of daily living. A systematic review.

Wieland-Jorna Y, van Kooten D, Verheij R, de Man Y, Francke A, Oosterveld-Vlug M. JAMIA Open. 2024; 7(2):ooae044.

PMID: 38798774 PMC: 11126158. DOI: 10.1093/jamiaopen/ooae044.


Using natural language processing to analyze unstructured patient-reported outcomes data derived from electronic health records for cancer populations: a systematic review.

Sim J, Huang X, Horan M, Baker J, Huang I. Expert Rev Pharmacoecon Outcomes Res. 2024; 24(4):467-475.

PMID: 38383308 PMC: 11001514. DOI: 10.1080/14737167.2024.2322664.


Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review.

Sim J, Huang X, Horan M, Stewart C, Robison L, Hudson M. Artif Intell Med. 2023; 146:102701.

PMID: 38042599 PMC: 10693655. DOI: 10.1016/j.artmed.2023.102701.

