Bias Analysis in Healthcare Time Series (BAHT) Decision Support Systems from Meta Data

Overview
Date 2023 Jun 28
PMID 37377633
Abstract

One of the hindrances to the widespread acceptance of deep learning-based decision support systems in healthcare is bias. Bias in its many forms occurs in the datasets used to train and test deep learning models and is amplified when models are deployed in the real world, leading to challenges such as model drift. Recent advancements in deep learning have led to the deployment of automated healthcare diagnosis decision support systems in hospitals as well as in telemedicine through IoT devices. Research has focused primarily on the development and improvement of these systems, leaving a gap in the analysis of their fairness. The domain of FAccT ML (fairness, accountability, and transparency) accounts for the analysis of these deployable machine learning systems. In this work, we present a framework for bias analysis in healthcare time series (BAHT) signals such as the electrocardiogram (ECG) and electroencephalogram (EEG). BAHT provides a graphical, interpretive analysis of bias in training and testing datasets in terms of protected variables, and of bias amplification by the trained supervised learning model, for time series healthcare decision support systems. We thoroughly investigate three prominent time series ECG and EEG healthcare datasets used for model training and research. We show that the extensive presence of bias in these datasets leads to potentially biased or unfair machine learning models. Our experiments also demonstrate the amplification of the identified bias, with an observed maximum of 66.66%. We investigate the effect of model drift due to unanalyzed bias in datasets and algorithms. Bias mitigation, though prudent, is a nascent area of research. We present experiments on, and analyze, the most widely accepted bias mitigation strategies: under-sampling, oversampling, and the use of synthetic data to balance the dataset through augmentation. Healthcare models, datasets, and bias mitigation strategies must be properly analyzed to ensure fair, unbiased delivery of service.
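For concreteness, the sketch below illustrates the kind of metadata audit and bias-amplification measurement the abstract describes. It is a minimal Python sketch, not the paper's actual BAHT implementation: the record schema, the field names (`sex`, `label`), and the amplification definition (per-group gap between the model's predicted positive rate and the dataset's true positive rate) are illustrative assumptions.

```python
# Minimal sketch of a dataset-level bias audit in the spirit of BAHT:
# tally protected-variable representation in the metadata, then compare
# per-group positive rates of a trained model against the data itself.
# Field names (sex, label) are illustrative, not the paper's schema.
from collections import Counter

def group_representation(metadata, protected_key):
    """Fraction of records belonging to each protected group."""
    counts = Counter(rec[protected_key] for rec in metadata)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def positive_rate(records, label_key="label"):
    """Share of records carrying the positive diagnosis label."""
    if not records:
        return 0.0
    return sum(rec[label_key] for rec in records) / len(records)

def amplification(metadata, predictions, protected_key):
    """Per-group gap between predicted and true positive rates.

    A positive gap means the model over-predicts the condition for
    that group relative to the data, i.e. it amplifies dataset bias.
    """
    gaps = {}
    for g in sorted({rec[protected_key] for rec in metadata}):
        idx = [i for i, rec in enumerate(metadata) if rec[protected_key] == g]
        true_rate = positive_rate([metadata[i] for i in idx])
        pred_rate = sum(predictions[i] for i in idx) / len(idx)
        gaps[g] = pred_rate - true_rate
    return gaps

# Toy usage with fabricated metadata for two protected groups.
meta = [{"sex": "F", "label": 1}, {"sex": "F", "label": 0},
        {"sex": "M", "label": 1}, {"sex": "M", "label": 1}]
preds = [1, 1, 1, 1]  # model predictions aligned with `meta`
print(group_representation(meta, "sex"))  # {'F': 0.5, 'M': 0.5}
print(amplification(meta, preds, "sex"))  # {'F': 0.5, 'M': 0.0}
```

In this toy run the model over-predicts the positive label for group F by 0.5, the same style of per-group amplification the abstract reports (observed maximum of 66.66%).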
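Likewise, a hedged sketch of two of the three mitigation strategies the abstract names, random under-sampling and over-sampling across a protected variable; the third strategy, synthetic-data augmentation, would replace the duplication step below with generated signals and is outside this sketch. All names are illustrative.

```python
# Sketch of resampling-based bias mitigation: balance a protected
# variable by trimming groups to the smallest (under-sampling) or
# duplicating records up to the largest (over-sampling).
import random

def _by_group(records, protected_key):
    groups = {}
    for rec in records:
        groups.setdefault(rec[protected_key], []).append(rec)
    return groups

def undersample(records, protected_key, seed=0):
    """Trim every protected group down to the size of the smallest."""
    rng = random.Random(seed)
    groups = _by_group(records, protected_key)
    target = min(len(v) for v in groups.values())
    balanced = []
    for group_recs in groups.values():
        balanced.extend(rng.sample(group_recs, target))
    return balanced

def oversample(records, protected_key, seed=0):
    """Duplicate records until every group matches the largest."""
    rng = random.Random(seed)
    groups = _by_group(records, protected_key)
    target = max(len(v) for v in groups.values())
    balanced = []
    for group_recs in groups.values():
        balanced.extend(group_recs)
        balanced.extend(rng.choices(group_recs, k=target - len(group_recs)))
    return balanced

# Toy usage: 3 records in group F, 1 in group M.
recs = [{"sex": "F"}] * 3 + [{"sex": "M"}]
print(len(undersample(recs, "sex")))  # 2 (1 per group)
print(len(oversample(recs, "sex")))   # 6 (3 per group)
```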

Citing Articles

Synthetic data generation methods in healthcare: A review on open-source tools and methods.

Pezoulas V, Zaridis D, Mylona E, Androutsos C, Apostolidis K, Tachos N. Comput Struct Biotechnol J. 2024; 23:2892-2910.

PMID: 39108677; PMC: 11301073; DOI: 10.1016/j.csbj.2024.07.005.
