
Training Data Distribution Significantly Impacts the Estimation of Tissue Microstructure with Machine Learning

Overview
Journal Magn Reson Med
Publisher Wiley
Specialty Radiology
Date 2021 Sep 21
PMID 34545955
Citations 14
Abstract

Purpose: Supervised machine learning (ML) provides a compelling alternative to traditional model fitting for parameter mapping in quantitative MRI. The aim of this work is to demonstrate and quantify the effect of different training data distributions on the accuracy and precision of parameter estimates when supervised ML is used for fitting.

Methods: We fit two- and three-compartment biophysical models to diffusion measurements from the in vivo human brain, as well as to simulated diffusion data, using both traditional model fitting and supervised ML. For supervised ML, we train several artificial neural networks, as well as random forest regressors, on different distributions of ground truth parameters. We compare the accuracy and precision of parameter estimates obtained from the different estimation approaches using synthetic test data.
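To make the idea of "different distributions of ground truth parameters" concrete, the following minimal sketch builds two synthetic training sets for a toy two-compartment signal model. It is an illustration under stated assumptions, not the authors' pipeline: the b-values, the fixed diffusivities, the Gaussian noise, the narrow "healthy-like" distribution, and all function names are hypothetical choices made for this example.

```python
import math
import random

# Assumed acquisition and tissue constants (illustrative, not from the paper).
B_VALUES = [0.0, 0.5, 1.0, 2.0, 3.0]  # b-values in ms/um^2
D_FAST, D_SLOW = 2.0, 0.3             # fixed diffusivities in um^2/ms


def simulate_signal(f, sigma, rng):
    """Noisy two-compartment signal for slow-compartment fraction f."""
    return [
        f * math.exp(-b * D_SLOW)
        + (1.0 - f) * math.exp(-b * D_FAST)
        + rng.gauss(0.0, sigma)  # Gaussian noise is a simplifying assumption
        for b in B_VALUES
    ]


def sample_uniform(rng):
    """Draw f uniformly from the whole plausible range [0, 1]."""
    return rng.uniform(0.0, 1.0)


def sample_healthy_like(rng):
    """Draw f from a narrow distribution (hypothetical stand-in for healthy tissue)."""
    return min(1.0, max(0.0, rng.gauss(0.6, 0.05)))


def make_training_set(sampler, n, sigma, seed):
    """Return (signals, fractions): n simulated signals and their ground truth f."""
    rng = random.Random(seed)
    fractions = [sampler(rng) for _ in range(n)]
    signals = [simulate_signal(f, sigma, rng) for f in fractions]
    return signals, fractions


uniform_signals, uniform_f = make_training_set(sample_uniform, 1000, 0.02, seed=0)
healthy_signals, healthy_f = make_training_set(sample_healthy_like, 1000, 0.02, seed=1)
```

Either training set can then be passed to any supervised regressor. Only the sampler differs between the two, which isolates the effect of the training distribution.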

Results: When the distribution of parameter combinations in the training set matches that observed in healthy human data sets, estimates are highly precise but inaccurate for atypical parameter combinations. In contrast, when training data are sampled uniformly from the entire plausible parameter space, estimates tend to be more accurate for atypical parameter combinations but may have lower precision for typical parameter combinations.
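The qualitative pattern described above can be reproduced in a toy setting. The sketch below is hedged throughout. It swaps in a k-nearest-neighbour regressor as a simple stand-in for the neural networks and random forests used in the study, and it uses a one-parameter two-compartment model. The b-values, diffusivities, noise level, and the "healthy-like" distribution are all assumptions made for illustration.

```python
import math
import random
import statistics

# Assumed constants (illustrative, not from the paper).
B_VALUES = [0.0, 0.5, 1.0, 2.0, 3.0]  # b-values in ms/um^2
D_FAST, D_SLOW = 2.0, 0.3             # fixed diffusivities in um^2/ms
SIGMA = 0.02                          # Gaussian noise standard deviation


def simulate_signal(f, rng):
    """Noisy two-compartment signal for slow-compartment fraction f."""
    return [
        f * math.exp(-b * D_SLOW) + (1.0 - f) * math.exp(-b * D_FAST)
        + rng.gauss(0.0, SIGMA)
        for b in B_VALUES
    ]


def make_training_set(sampler, n, rng):
    """List of (signal, fraction) pairs with fractions drawn from sampler."""
    fractions = [sampler(rng) for _ in range(n)]
    return [(simulate_signal(f, rng), f) for f in fractions]


def knn_predict(training_set, signal, k=10):
    """Average the k training fractions whose signals lie closest to signal."""
    nearest = sorted(training_set, key=lambda pair: math.dist(pair[0], signal))[:k]
    return statistics.fmean(f for _, f in nearest)


def bias_and_sd(training_set, true_f, repeats, rng):
    """Bias (mean error) and standard deviation over repeated noisy test signals."""
    estimates = [
        knn_predict(training_set, simulate_signal(true_f, rng))
        for _ in range(repeats)
    ]
    return statistics.fmean(estimates) - true_f, statistics.stdev(estimates)


rng = random.Random(0)
training_sets = {
    "uniform": make_training_set(lambda r: r.uniform(0.0, 1.0), 1000, rng),
    "healthy-like": make_training_set(
        lambda r: min(1.0, max(0.0, r.gauss(0.6, 0.05))), 1000, rng
    ),
}

results = {}
for name, train in training_sets.items():
    for true_f in (0.6, 0.1):  # typical vs atypical fraction (assumed values)
        bias, sd = bias_and_sd(train, true_f, 100, rng)
        results[(name, true_f)] = (bias, sd)
        print(f"{name:>12}  f={true_f:.1f}  bias={bias:+.3f}  sd={sd:.3f}")
```

In this toy setup the regressor trained on the narrow distribution cannot return values far outside the range it has seen, so its estimate for the atypical fraction is pulled toward the training range while its spread stays small.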

Conclusion: This work highlights that estimation of model parameters using supervised ML depends strongly on the training-set distribution. We show that the high precision obtained using ML may mask strong bias, and that visual assessment of the parameter maps is not sufficient for evaluating the quality of the estimates.
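The statement that high precision can mask strong bias can be read through the standard decomposition of mean squared error. The notation below is an assumption made for this note (the abstract itself gives no formulas): \(\theta\) is the true parameter value and \(\hat{\theta}\) its estimate over repeated noisy measurements.

```latex
\[
\operatorname{bias}(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta,
\qquad
\operatorname{MSE}(\hat{\theta})
  = \mathbb{E}\!\left[(\hat{\theta} - \theta)^{2}\right]
  = \operatorname{bias}(\hat{\theta})^{2} + \operatorname{Var}(\hat{\theta}).
\]
```

A small variance (high precision) produces smooth-looking parameter maps, but the total error can still be large when the bias term dominates.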
