Leveraging Deep Representations of Radiology Reports in Survival Analysis for Predicting Heart Failure Patient Mortality

Overview
Journal Proc Conf
Date 2022 Apr 25
PMID 35463193
Abstract

Utilizing clinical texts in survival analysis is difficult because they are largely unstructured. Current automatic extraction models fail to capture textual information comprehensively since their labels are limited in scope. Furthermore, they typically require a large amount of data and high-quality expert annotations for training. In this work, we present a novel method of using BERT-based hidden layer representations of clinical texts as covariates for proportional hazards models to predict patient survival outcomes. We show that hidden layers yield notably more accurate predictions than predefined features, outperforming the previous baseline model by 5.7% on average across C-index and time-dependent AUC. We make our work publicly available at https://github.com/bionlplab/heart_failure_mortality.
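As a rough illustration of the setup the abstract describes, the sketch below treats a fixed-length text representation as the covariate vector of a proportional hazards model and scores the resulting risk ranking with Harrell's C-index. It is a minimal sketch, not the authors' implementation: the two-dimensional `embeddings`, the coefficients `beta`, the toy survival data, and both function names are hypothetical. In the paper's setting the vectors would be BERT hidden-layer representations and the coefficients would be fitted from data.

```python
from itertools import combinations


def linear_risk(embedding, beta):
    """Log-risk score beta . x, as in the hazard h(t | x) = h0(t) * exp(beta . x)."""
    return sum(b * x for b, x in zip(beta, embedding))


def concordance_index(times, events, risk_scores):
    """Harrell's C-index over comparable pairs.

    A pair is comparable when the patient with the shorter observed time
    had an event (was not censored). Pairs with tied times are skipped
    here for simplicity; ties in risk count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier time is censored, so the order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: 2-d stand-ins for report embeddings.
embeddings = [[0.9, 0.1], [0.2, 0.4], [0.5, 0.5], [0.1, 0.8]]
beta = [1.0, -1.0]          # assumed coefficients, not fitted
times = [2, 12, 5, 8]       # follow-up time per patient
events = [1, 1, 0, 1]       # 1 = death observed, 0 = censored

risks = [linear_risk(x, beta) for x in embeddings]
print(concordance_index(times, events, risks))
```

A C-index of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of patients by risk. Time-dependent AUC, the other metric named in the abstract, is not covered by this sketch.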

Citing Articles

Evaluating progress in automatic chest X-ray radiology report generation.

Yu F, Endo M, Krishnan R, Pan I, Tsai A, Reis E. Patterns (N Y). 2023; 4(9):100802.

PMID: 37720336 PMC: 10499844. DOI: 10.1016/j.patter.2023.100802.


Exploring optimal granularity for extractive summarization of unstructured health records: Analysis of the largest multi-institutional archive of health records in Japan.

Ando K, Okumura T, Komachi M, Horiguchi H, Matsumoto Y. PLOS Digit Health. 2023; 1(9):e0000099.

PMID: 36812582 PMC: 9931252. DOI: 10.1371/journal.pdig.0000099.


Quality Management of Pulmonary Nodule Radiology Reports Based on Natural Language Processing.

Fei X, Chen P, Wei L, Huang Y, Xin Y, Li J. Bioengineering (Basel). 2022; 9(6).

PMID: 35735487 PMC: 9220149. DOI: 10.3390/bioengineering9060244.
