
Study on the Impact of Partition-induced Dataset Shift on K-fold Cross-validation

Overview
Date 2014 May 9
PMID 24807526
Citations 27
Abstract

Cross-validation is a widely used technique for evaluating classifier performance. However, it can introduce dataset shift, a harmful factor that is often overlooked and can lead to inaccurate performance estimates. This paper analyzes the prevalence and impact of partition-induced covariate shift on different k-fold cross-validation schemes. The experimental results show that the degree of partition-induced covariate shift depends on the cross-validation scheme used. Consequently, worse schemes may harm the correctness of a single-classifier performance estimate and also increase the number of cross-validation repetitions needed to reach a stable performance estimate.
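
To make the idea of partition-induced covariate shift concrete, the following is a minimal sketch and not the paper's method or experimental setup. It assumes a single synthetic Gaussian feature and uses the gap between test-fold and training-set feature means as a crude proxy for covariate shift. The function names (`random_kfold`, `interleaved_kfold`, `mean_shift`) and the round-robin partitioning scheme are hypothetical and were chosen only for illustration.

```python
import random
import statistics


def random_kfold(n, k, rng):
    """Plain k-fold: shuffle the indices and deal them into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def interleaved_kfold(x, k):
    """Sort by feature value and deal round-robin, so every fold spans
    the full feature range. Illustrative only; not a scheme from the paper."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    return [order[i::k] for i in range(k)]


def mean_shift(x, folds):
    """Average absolute gap between each test fold's feature mean and the
    mean of its training set (assumed proxy for covariate shift)."""
    gaps = []
    for f, test in enumerate(folds):
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        test_mean = statistics.fmean([x[i] for i in test])
        train_mean = statistics.fmean([x[i] for i in train])
        gaps.append(abs(test_mean - train_mean))
    return statistics.fmean(gaps)


rng = random.Random(0)                      # fixed seed for repeatability
x = [rng.gauss(0.0, 1.0) for _ in range(50)]  # synthetic one-dimensional feature
k = 5

folds_random = random_kfold(len(x), k, rng)
folds_interleaved = interleaved_kfold(x, k)

shift_random = mean_shift(x, folds_random)
shift_interleaved = mean_shift(x, folds_interleaved)
print("random k-fold shift:     ", shift_random)
print("interleaved k-fold shift:", shift_interleaved)
```

With small samples, one would generally expect the distribution-aware partition to show a smaller train/test gap than the purely random one, although this is not guaranteed for any single draw. Repeating the comparison over many seeds gives a more reliable picture.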

Citing Articles

Comparison of principal component analysis algorithms for imputation in agrometeorological data in high dimension and reduced sample size.

de Souza V, Rodrigues S, Filho L. PLoS One. 2024; 19(12):e0315574.

PMID: 39739837 PMC: 11687751. DOI: 10.1371/journal.pone.0315574.


Automatic 3D pelvimetry framework in CT images and its validation.

Shao J, Wu Q, Zhang Y, Liu C, Huo X, Wang C. Sci Rep. 2024; 14(1):21431.

PMID: 39271720 PMC: 11399230. DOI: 10.1038/s41598-024-72123-6.


Design optimization of large-scale bifacial photovoltaic module frame using deep learning surrogate model.

Han D, Kim S. Sci Rep. 2024; 14(1):14592.

PMID: 38918445 PMC: 11199489. DOI: 10.1038/s41598-024-64594-4.


Integrating attention mechanism and multi-scale feature extraction for fall detection.

Chen H, Gu W, Zhang Q, Li X, Jiang X. Heliyon. 2024; 10(10):e31614.

PMID: 38831825 PMC: 11145491. DOI: 10.1016/j.heliyon.2024.e31614.


Performance of deep learning algorithms to distinguish high-grade glioma from low-grade glioma: A systematic review and meta-analysis.

Sun W, Song C, Tang C, Pan C, Xue P, Fan J. iScience. 2023; 26(6):106815.

PMID: 37250800 PMC: 10209541. DOI: 10.1016/j.isci.2023.106815.