
Data Engineering for Machine Learning in Women's Imaging and Beyond

Overview
Specialties: Oncology, Radiology
Date: 2019 Feb 20
PMID: 30779668
Citations: 2
Abstract

Data engineering is the foundation of effective machine learning model development and research. The accuracy and clinical utility of machine learning models fundamentally depend on the quality of the data used for model development. This article aims to provide radiologists and radiology researchers with an understanding of the core elements of data preparation for machine learning research. We cover key concepts from an engineering perspective, including databases, data integrity, and characteristics of data suitable for machine learning projects, and from a clinical perspective, including the Health Insurance Portability and Accountability Act (HIPAA), patient consent, avoidance of bias, and ethical concerns related to the potential to magnify health disparities. The focus of this article is women's imaging; nonetheless, the principles described apply to all domains of medical imaging. Machine learning research is inherently interdisciplinary, so effective collaboration is critical for success. In medical imaging, radiologists possess knowledge essential for data engineers to develop useful datasets for machine learning model development.

Citing Articles

Detailed Image Data Quality and Cleaning Practices for Artificial Intelligence Tools for Breast Cancer.

Wu D, Fang Y, Vo D, Spangler A, Seiler S. JCO Clin Cancer Inform. 2024; 8:e2300074.

PMID: 38552191 PMC: 10994436. DOI: 10.1200/CCI.23.00074.
Artificial Intelligence (AI) in Breast Imaging: A Scientometric Umbrella Review.

Tan X, Cheor W, Lim L, Ab Rahman K, Bakrin I. Diagnostics (Basel). 2022; 12(12).

PMID: 36553119 PMC: 9777253. DOI: 10.3390/diagnostics12123111.
