Perspective: Big Data and Machine Learning Could Help Advance Nutritional Epidemiology
Overview
The field of nutritional epidemiology faces challenges posed by measurement error, diet as a complex exposure, and residual confounding. The objective of this perspective article is to highlight how developments in big data and machine learning can help address these challenges. New methods of collecting 24-h dietary recalls and recording diet could enable larger samples and more repeated measures to increase statistical power and measurement precision. In addition, use of machine learning to automatically classify pictures of food could become a useful complementary method to help improve precision and validity of dietary measurements. Diet is complex due to thousands of different foods that are consumed in varying proportions, fluctuating quantities over time, and differing combinations. Current dietary pattern methods may not integrate sufficient dietary variation, and most traditional modeling approaches have limited incorporation of interactions and nonlinearity. Machine learning could help better model diet as a complex exposure with nonadditive and nonlinear associations. Last, novel big data sources could help avoid unmeasured confounding by offering more covariates, including both omics and features derived from unstructured data with machine learning methods. These opportunities notwithstanding, application of big data and machine learning must be approached cautiously to ensure quality of dietary measurements, avoid overfitting, and confirm accurate interpretations. Greater use of machine learning and big data would also require substantial investments in training, collaborations, and computing infrastructure. Overall, we propose that judicious application of big data and machine learning in nutrition science could offer new means of dietary measurement, more tools to model the complexity of diet and its relations with diseases, and additional potential ways of addressing confounding.