PMID: 40012886

TAME 2.0: Expanding and Improving Online Data Science Training for Environmental Health Research

Abstract

Introduction: Data science training has the potential to propel environmental health research into untapped territory and holds immense promise for changing our understanding of human health and the environment. Although data science training resources are expanding, they remain limited in public accessibility, user-friendliness, breadth of content, tangibility through real-world examples, and applicability to environmental health science.

Methods: To fill this gap, we developed an environmental health data science training resource, the inTelligence And Machine lEarning (TAME) Toolkit, version 2.0 (TAME 2.0).

Results: TAME 2.0 is a publicly available website that includes training modules organized into seven chapters. Training topics were prioritized based on ongoing engagement with trainees, feedback from professional colleagues, and emerging topics in environmental health research (e.g., artificial intelligence and machine learning). TAME 2.0 significantly expands upon the original TAME pilot training resource. Its training is organized into the following chapters: (1) Data management to enable scientific collaborations; (2) Coding in R; (3) Basics of data analysis and visualizations; (4) Converting wet lab data into dry lab analyses; (5) Machine learning; (6) Applications in toxicology and exposure science; and (7) Environmental health database mining. Also new to TAME 2.0 are "Test Your Knowledge" activities at the end of each training module, in which participants are asked additional module-specific questions about the example datasets and apply skills introduced in the module to answer them. The effectiveness of TAME 2.0 was evaluated through participant surveys administered during graduate-level workshops and coursework, as well as undergraduate-level summer research training events; suggested edits were incorporated, and overall metrics of effectiveness were quantified.

Discussion: Collectively, TAME 2.0 now serves as a valuable resource to address the growing demand for data science training in environmental health research. TAME 2.0 is publicly available at: https://uncsrp.github.io/TAME2/.
