Do We Need Flexible Machine-learning Algorithms to Assess the Effect of Long-term Exposure to Fine Particulate Matter on Mortality?: An Example from a Canadian National Cohort
Background: Evidence suggests a nonlinear relationship between long-term exposure to fine particulate matter (PM2.5) and mortality, and methods that flexibly incorporate this nonlinearity can be improved. To heuristically evaluate whether machine-learning algorithms are necessary, we compared the estimated benefit of reducing long-term PM2.5 exposure on mortality across three analytical methods of varying flexibility and complexity.
Methods: Using a cohort of Canadian Community Health Survey respondents followed from 2005 through 2014, we obtained consenting respondents' baseline characteristics, time-varying annual average PM2.5 over the previous 3 years, yearly income and neighborhood characteristics, and vital status. We estimated the 10-year cumulative mortality rate under both the natural course of exposure and a hypothetical dynamic intervention that would set a respondent's exposure to 8.8 μg/m³ (the current Canadian annual PM2.5 standard) whenever it exceeded that level. We compared the estimates from three analytical methods, along with their mean squared errors under a range of hypothetical true values.
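To illustrate the least flexible of these approaches, the sketch below shows a single-time-point parametric g-computation with the threshold intervention (cap exposure at 8.8 μg/m³). It is a minimal sketch only, not the study's longitudinal implementation: the column names (pm25, died) and covariate list are hypothetical, and the actual analysis handled time-varying exposure and covariates over the 10-year follow-up.

```python
# Minimal sketch (not the study's implementation): single-time-point parametric
# g-computation with the threshold intervention described above.
# Column names (pm25, died) and covariates are hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def g_computation_rd(df: pd.DataFrame, covariates: list[str], cap: float = 8.8) -> float:
    """Mortality rate difference (per 1000) for capping PM2.5 at `cap`
    versus the observed (natural-course) exposure."""
    X = df[["pm25"] + covariates]
    y = df["died"]

    # Step 1: fit an outcome model for P(death | exposure, covariates).
    outcome_model = LogisticRegression(max_iter=1000).fit(X, y)

    # Step 2: predict risk under the natural course (observed exposures).
    risk_natural = outcome_model.predict_proba(X)[:, 1].mean()

    # Step 3: predict risk under the dynamic intervention
    # "set exposure to the cap if higher".
    X_capped = X.copy()
    X_capped["pm25"] = np.minimum(X_capped["pm25"], cap)
    risk_capped = outcome_model.predict_proba(X_capped)[:, 1].mean()

    # Step 4: contrast the two standardized risks, scaled per 1000 participants.
    return 1000.0 * (risk_capped - risk_natural)
```

The study's other two estimators build on this idea: targeted minimum loss-based estimation adds a targeting step that also uses an exposure model (making the estimator doubly robust), and the SuperLearner variant replaces the parametric regressions with a cross-validated weighted ensemble of the candidate algorithms.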
Results: Among 62,365 participants, the 10-year cumulative mortality rate differences per 1000 participants were -0.23 (95% confidence interval: -0.46, 0.00) for parametric g-computation, -0.83 (-1.24, -0.43) for the targeted minimum loss-based estimator (TMLE) with parametric models, and -0.67 (-1.27, -0.06) for TMLE with SuperLearner and six highly flexible candidate algorithms. Changing the hyperparameters did not meaningfully change the estimates or the algorithm weights.
Conclusions: All three methods indicated that reducing long-term exposure to PM2.5 would yield tangible public health benefits in Canada, where PM2.5 levels are among the lowest worldwide. However, the advantage of employing machine-learning algorithms with a doubly robust estimator appeared minimal, especially considering the bias-variance tradeoff.