Adaptation of Autoencoder for Sparsity Reduction From Clinical Notes Representation Learning

Overview

Journal IEEE J Transl Eng Health Med

Specialty Biomedical Engineering

Date 2023 Oct 11

PMID 37817825

Authors

Thanh-Dung Le

Rita Noumeir

Jerome Rambaud

Guillaume Sans

Philippe Jouvet

Abstract

Goal: Our aim is to assess an alternative approach that tackles sparsity by compressing the clinical representation feature space, so that limited French clinical notes can also be handled effectively.

Methods: This study proposed an autoencoder learning algorithm to reduce sparsity in the clinical note representation. The motivation was to determine how to compress sparse, high-dimensional data by reducing the dimension of the clinical note representation feature space. The classifiers were then evaluated on the learned, compressed feature space.
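The compression step described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the single hidden layer, the tanh activation, the layer size, the learning rate, and the function names `train_autoencoder` and `encode` are all assumptions made for illustration, and the input is a synthetic sparse binary matrix standing in for a bag-of-words style representation of clinical notes.

```python
import numpy as np


def train_autoencoder(X, n_hidden=8, lr=0.05, epochs=200, seed=0):
    """Train a one-hidden-layer autoencoder with full-batch gradient descent.

    Hypothetical helper for illustration only. Returns the learned
    parameters and the reconstruction loss recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W1 = rng.normal(0.0, 0.1, (n_features, n_hidden))  # encoder weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_features))  # decoder weights
    b2 = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)   # compressed (dense) representation
        X_hat = H @ W2 + b2        # linear reconstruction of the input
        err = X_hat - X
        # Loss: squared reconstruction error per sample, averaged over samples.
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Backpropagate the loss through decoder and encoder.
        g_out = 2.0 * err / n_samples
        gW2 = H.T @ g_out
        gb2 = g_out.sum(axis=0)
        g_pre = (g_out @ W2.T) * (1.0 - H ** 2)  # derivative of tanh
        gW1 = X.T @ g_pre
        gb1 = g_pre.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}, losses


def encode(X, params):
    """Map inputs to the low-dimensional feature space learned above."""
    return np.tanh(X @ params["W1"] + params["b1"])


if __name__ == "__main__":
    # Synthetic stand-in data: 200 "notes", 100 features, about 5% non-zero.
    rng = np.random.default_rng(0)
    X = (rng.random((200, 100)) < 0.05).astype(float)
    params, losses = train_autoencoder(X)
    Z = encode(X, params)
    print("input shape:", X.shape, "compressed shape:", Z.shape)
    print("loss decreased:", losses[-1] < losses[0])
```

In a pipeline like the one the abstract describes, the compressed features `Z` would then be passed to a downstream classifier in place of the original sparse features; which classifiers the study used is not stated in the abstract.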

Results: The proposed approach provided overall performance gains of up to 3% for each test set evaluation. The final classifier achieved 92% accuracy, 91% recall, 91% precision, and a 91% F1-score in detecting the patient's condition. Furthermore, the compression mechanism and the autoencoder prediction process were demonstrated by applying the information bottleneck theoretical framework.

Clinical and Translational Impact Statement: An autoencoder learning algorithm effectively tackles the problem of sparsity in the representation feature space from a small clinical narrative dataset. Notably, it can learn the best representation of the training data because of its lossless compression capacity compared to other approaches. Consequently, its downstream classification ability can be significantly improved, which cannot be done using deep learning models.
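For readers unfamiliar with the information bottleneck framework mentioned in the Results, its standard formulation is sketched below. The abstract does not give the exact form used in the study, so the symbols are assumptions: X denotes the input representation, T the compressed code, Y the prediction target, \hat{X} the reconstruction, and \beta a trade-off parameter.

```latex
% Standard information bottleneck objective: compress X into T
% while keeping the information T carries about Y.
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)

% For an autoencoder, the Markov chain X \to T \to \hat{X}
% gives the data processing inequality:
I(X;T) \;\ge\; I(X;\hat{X})
```

The first expression trades compression of the input against retained predictive information; the second states that the reconstruction cannot carry more information about the input than the code does.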
