
Relative Performance of Machine Learning and Linear Regression in Predicting Quality of Life and Academic Performance of School Children in Norway: Data Analysis of a Quasi-Experimental Study

Overview
Publisher: JMIR Publications
Date: 2021 May 19
PMID: 34009128
Citations: 2
Abstract

Background: Machine learning techniques are increasingly being applied in health research. It is not clear how useful these approaches are for modeling continuous outcomes. Child quality of life is associated with parental socioeconomic status and physical activity and may be associated with aerobic fitness and strength. It is unclear whether diet or academic performance is associated with quality of life.

Objective: The purpose of this study was to compare the predictive performance of machine learning techniques with that of linear regression in examining the extent to which continuous outcomes (physical activity, aerobic fitness, muscular strength, diet, and parental education) are predictive of academic performance and quality of life and whether academic performance and quality of life are associated.

Methods: We modeled data from children attending 9 schools in a quasi-experimental study. We split data randomly into training and validation sets. Curvilinear, nonlinear, and heteroscedastic variables were simulated to examine the performance of machine learning techniques compared to that of linear models, with and without imputation.

Results: We included data for 1711 children. Regression models explained 24% of academic performance variance in the real complete-case validation set, and up to 15% in quality of life. While machine learning techniques explained high proportions of variance in training sets, in validation, machine learning techniques explained approximately 0% of academic performance and 3% to 8% of quality of life. With imputation, machine learning techniques improved to 15% for academic performance. Machine learning outperformed regression for simulated nonlinear and heteroscedastic variables. The best predictors of academic performance in adjusted models were the child's mother having a master-level education (P<.001; β=1.98, 95% CI 0.25 to 3.71), increased television and computer use (P=.03; β=1.19, 95% CI 0.25 to 3.71), and dichotomized self-reported exercise (P=.001; β=2.47, 95% CI 1.08 to 3.87). For quality of life, self-reported exercise (P<.001; β=1.09, 95% CI 0.53 to 1.66) and increased television and computer use (P=.002; β=-0.95, 95% CI -1.55 to -0.36) were the best predictors. Adjusted academic performance was associated with quality of life (P=.02; β=0.12, 95% CI 0.02 to 0.22).
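The gap between training and validation performance reported above can be illustrated with a small sketch. This is not the study's data or its models: the simulated predictor, the noise level, the 70/30 split, and the use of 1-nearest-neighbour regression as a stand-in for a flexible machine learning technique are all assumptions made for illustration, and every function name is hypothetical.

```python
import random


def r_squared(y_true, y_pred):
    """Proportion of variance explained: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def fit_linear(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_linear(model, x):
    intercept, slope = model
    return [intercept + slope * xi for xi in x]


def predict_knn(x_train, y_train, x_new, k=1):
    """k-nearest-neighbour regression: mean outcome of the k closest training points."""
    preds = []
    for xn in x_new:
        nearest = sorted(zip(x_train, y_train), key=lambda p: abs(p[0] - xn))[:k]
        preds.append(sum(yv for _, yv in nearest) / k)
    return preds


def simulate(n, seed=0):
    """Assumed data: a weak linear signal buried in heavy noise."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [2.0 + 0.5 * xi + rng.gauss(0, 3.0) for xi in x]
    return x, y


def split(x, y, train_fraction=0.7, seed=1):
    """Random split into training and validation sets."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    tr, va = idx[:cut], idx[cut:]
    return ([x[i] for i in tr], [y[i] for i in tr],
            [x[i] for i in va], [y[i] for i in va])


def compare(n=400):
    """R-squared of both models on the training and validation sets."""
    x, y = simulate(n)
    x_tr, y_tr, x_va, y_va = split(x, y)
    lin = fit_linear(x_tr, y_tr)
    return {
        "linear_train": r_squared(y_tr, predict_linear(lin, x_tr)),
        "linear_valid": r_squared(y_va, predict_linear(lin, x_va)),
        "knn_train": r_squared(y_tr, predict_knn(x_tr, y_tr, x_tr, k=1)),
        "knn_valid": r_squared(y_va, predict_knn(x_tr, y_tr, x_va, k=1)),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.2f}")
```

In this sketch the nearest-neighbour model reproduces its training data exactly, so its training R-squared says nothing about how it will perform on held-out children; only the validation figures are comparable between the two models.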

Conclusions: Linear regression was less prone to overfitting and outperformed commonly used machine learning techniques. Imputation improved the performance of machine learning, but not sufficiently to outperform regression. Machine learning techniques outperformed linear regression for modeling nonlinear and heteroscedastic relationships and may be of use in such cases. Regression with splines performed almost as well in nonlinear modeling. Lifestyle variables, including physical exercise, television and computer use, and parental education, are predictive of academic performance or quality of life. Academic performance is associated with quality of life after adjusting for lifestyle variables and may offer another promising intervention target to improve quality of life in children.
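The conclusion about nonlinear relationships can be sketched in the same spirit. The example below assumes a simulated curvilinear (U-shaped) predictor and compares validation R-squared for a straight-line regression, a regression on a linear spline basis, and a k-nearest-neighbour model. The data-generating function, the knot positions, the choice of k = 15, and all function names are illustrative assumptions, not the study's simulation design or its models.

```python
import random


def r_squared(y_true, y_pred):
    """Proportion of variance explained: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def least_squares(rows, y):
    """Solve the normal equations (X'X)b = X'y by Gauss-Jordan elimination."""
    p = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for col in range(p):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def line_basis(x):
    """Intercept and slope only."""
    return [1.0, x]


def spline_basis(x, knots=(2.5, 5.0, 7.5)):
    """Linear spline: a hinge term max(0, x - knot) lets the slope change at each knot."""
    return [1.0, x] + [max(0.0, x - k) for k in knots]


def fit_predict(basis, x_train, y_train, x_new):
    coef = least_squares([basis(v) for v in x_train], y_train)
    return [sum(c * b for c, b in zip(coef, basis(v))) for v in x_new]


def predict_knn(x_train, y_train, x_new, k=15):
    """k-nearest-neighbour regression: mean outcome of the k closest training points."""
    preds = []
    for xn in x_new:
        nearest = sorted(zip(x_train, y_train), key=lambda p: abs(p[0] - xn))[:k]
        preds.append(sum(yv for _, yv in nearest) / k)
    return preds


def compare(n=400, seed=0):
    """Validation R-squared for each model on an assumed U-shaped relationship."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [(xi - 5.0) ** 2 + rng.gauss(0, 3.0) for xi in x]
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(0.7 * n)
    x_tr, y_tr = [x[i] for i in idx[:cut]], [y[i] for i in idx[:cut]]
    x_va, y_va = [x[i] for i in idx[cut:]], [y[i] for i in idx[cut:]]
    return {
        "line": r_squared(y_va, fit_predict(line_basis, x_tr, y_tr, x_va)),
        "spline": r_squared(y_va, fit_predict(spline_basis, x_tr, y_tr, x_va)),
        "knn": r_squared(y_va, predict_knn(x_tr, y_tr, x_va)),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.2f}")
```

The spline model remains an ordinary linear regression in its coefficients, which is one reason it can follow a curved relationship while keeping the interpretability of regression.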

Citing Articles

Predicting the time to get back to work using statistical models and machine learning approaches.

Bouliotis G, Underwood M, Froud R BMC Med Res Methodol. 2024; 24(1):295.

PMID: 39614191 PMC: 11606207. DOI: 10.1186/s12874-024-02390-4.


Relations between the levels of moderate to vigorous physical activity, BMI, dietary habits, cognitive functions and attention problems in 8 to 9 years old pupils: network analysis (PACH Study).

Raudeniece J, Vanags E, Justamente I, Skara D, Fredriksen P, Brownlee I BMC Public Health. 2024; 24(1):544.

PMID: 38383413 PMC: 10882845. DOI: 10.1186/s12889-024-18055-2.
