PMID: 36474414

Exploring the Impact of Selection Bias in Observational Studies of COVID-19: a Simulation Study

Abstract

Background: Non-random selection of analytic subsamples could introduce selection bias in observational studies. We explored the potential presence and impact of selection in studies of SARS-CoV-2 infection and COVID-19 prognosis.

Methods: We tested the association of a broad range of characteristics with selection into COVID-19 analytic subsamples in the Avon Longitudinal Study of Parents and Children (ALSPAC) and UK Biobank (UKB). We then conducted empirical analyses and simulations to explore the potential presence, direction and magnitude of bias due to this selection (relative to our defined UK-based adult target populations) when estimating the association of body mass index (BMI) with SARS-CoV-2 infection and death-with-COVID-19.

Results: In both cohorts, a broad range of characteristics was related to selection, sometimes in opposite directions (e.g. more-educated people were more likely to have data on SARS-CoV-2 infection in ALSPAC, but less likely in UKB). Higher BMI was associated with higher odds of SARS-CoV-2 infection and death-with-COVID-19. We found non-negligible bias in many simulated scenarios.

Conclusions: Analyses using COVID-19 self-reported or national registry data may be biased due to selection. The magnitude and direction of this bias depend on the outcome definition, the true effect of the risk factor and the assumed selection mechanism; these are likely to differ between studies with different target populations. Bias due to sample selection is a key concern in COVID-19 research based on national registry data, especially as countries end free mass testing. The framework we have used can be applied by other researchers assessing the extent to which their results may be biased for their research question of interest.
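The dependence of the bias on the assumed selection mechanism can be illustrated with a minimal sketch. It is not the authors' simulation design. It assumes a binary exposure (standing in for "higher BMI"), a binary outcome (infection) and selection probabilities that depend on both; all function names and numeric values are hypothetical and chosen only for illustration. Under these assumptions the odds ratio in the selected subsample equals the true odds ratio multiplied by the cross-product ratio of the four selection probabilities, so a truly null association can look protective or harmful.

```python
import math
import random


def odds_ratio(table):
    """Odds ratio from a 2x2 table keyed by (exposure, outcome)."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])


def selection_bias_factor(p_select):
    """Cross-product ratio of selection probabilities.

    Multiplies the true odds ratio to give the odds ratio expected
    in the selected subsample.
    """
    return (p_select[(1, 1)] * p_select[(0, 0)]) / (
        p_select[(1, 0)] * p_select[(0, 1)]
    )


def simulate(n, p_exposed, p_outcome_unexposed, true_or, p_select, seed=0):
    """Simulate a target population and a non-randomly selected subsample.

    Returns (odds ratio in the full population, odds ratio in the subsample).
    """
    rng = random.Random(seed)
    base_odds = p_outcome_unexposed / (1 - p_outcome_unexposed)
    full = {(x, y): 0 for x in (0, 1) for y in (0, 1)}
    selected = dict(full)
    for _ in range(n):
        x = 1 if rng.random() < p_exposed else 0
        odds = base_odds * (true_or if x else 1.0)
        y = 1 if rng.random() < odds / (1 + odds) else 0
        full[(x, y)] += 1
        # Selection depends on both exposure and outcome (hypothetical values).
        if rng.random() < p_select[(x, y)]:
            selected[(x, y)] += 1
    return odds_ratio(full), odds_ratio(selected)


if __name__ == "__main__":
    # Hypothetical selection probabilities, keyed by (exposure, outcome).
    p_select = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.5, (1, 1): 0.6}
    or_full, or_sel = simulate(200_000, 0.3, 0.2, 1.0, p_select, seed=1)
    print("expected bias factor:", selection_bias_factor(p_select))
    print("OR in target population:", round(or_full, 3))
    print("OR in selected subsample:", round(or_sel, 3))
```

Changing `true_or` or the entries of `p_select` shows how the same data-generating process yields different bias under different selection assumptions, which is the kind of sensitivity the conclusions describe.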

Citing Articles

Releasing synthetic data from the Avon Longitudinal Study of Parents and Children (ALSPAC): Guidelines and applied examples.

Major-Smith D, Kwong A, Timpson N, Heron J, Northstone K. Wellcome Open Res. 2025; 9:57.

PMID: 39931104 PMC: 11809151. DOI: 10.12688/wellcomeopenres.20530.2.


Twenty-Five Years of Evolution and Hurdles in Electronic Health Records and Interoperability in Medical Research: Comprehensive Review.

Shen Y, Yu J, Zhou J, Hu G. J Med Internet Res. 2025; 27:e59024.

PMID: 39787599 PMC: 11757985. DOI: 10.2196/59024.


Accounting for bias due to outcome data missing not at random: comparison and illustration of two approaches to probabilistic bias analysis: a simulation study.

Kawabata E, Major-Smith D, Clayton G, Shapland C, Morris T, Carter A. BMC Med Res Methodol. 2024; 24(1):278.

PMID: 39538117 PMC: 11558901. DOI: 10.1186/s12874-024-02382-4.


Air Pollution in Relation to COVID-19 Morbidity and Mortality: A Large Population-Based Cohort Study in Catalonia, Spain (COVAIR-CAT).

Tonne C, Ranzani O, Alari A, Ballester J, Basagana X, Chaccour C. Res Rep Health Eff Inst. 2024; (220):1-48.

PMID: 39468856 PMC: 11525941.


Longitudinal investigation of a single variant SARS-CoV-2-outbreak in the immunologically naïve population of Ulvik, Norway.

Mortensen N, Wensaas K, Solem U, Sivertsen A, Grewal H, Rortveit G. BMC Infect Dis. 2024; 24(1):1161.

PMID: 39407116 PMC: 11481362. DOI: 10.1186/s12879-024-09856-2.

