
Protecting Against Researcher Bias in Secondary Data Analysis: Challenges and Potential Solutions

Overview
Journal: Eur J Epidemiol
Specialty: Public Health
Date: 2022 Jan 13
PMID: 35025022
Abstract

Analysis of secondary data sources (such as cohort studies, survey data, and administrative records) has the potential to provide answers to science and society's most pressing questions. However, researcher biases can lead to questionable research practices in secondary data analysis, which can distort the evidence base. While pre-registration can help to protect against researcher biases, it presents challenges for secondary data analysis. In this article, we describe these challenges and propose novel solutions and alternative approaches. Proposed solutions include approaches to (1) address bias linked to prior knowledge of the data, (2) enable pre-registration of non-hypothesis-driven research, (3) help ensure that pre-registered analyses will be appropriate for the data, and (4) address difficulties arising from reduced analytic flexibility in pre-registration. For each solution, we provide guidance on implementation for researchers and data guardians. The adoption of these practices can help to protect against researcher bias in secondary data analysis, to improve the robustness of research based on existing data.

Citing Articles

Exploring the Intersection of Mental and Reproductive Health Among Women Living with HIV in Spain: A Qualitative Secondary Data Analysis.

Huertas-Zurriaga A, Gimenez-Diez D, Leyva-Moral J. Healthcare (Basel). 2025; 13(2).

PMID: 39857194 PMC: 11764562. DOI: 10.3390/healthcare13020168.


A guide for planning triangulation studies to investigate complex causal questions in behavioural and psychiatric research.

Treur J, Lukas E, Sallis H, Wootton R. Epidemiol Psychiatr Sci. 2024; 33:e61.

PMID: 39506622 PMC: 7616800. DOI: 10.1017/S2045796024000623.


Long-Term Randomized Controlled Trials of Diet Intervention Reports and Their Impact on Cancer: A Systematic Review.

Sauter E, Butera G, Agurs-Collins T. Cancers (Basel). 2024; 16(19).

PMID: 39409915 PMC: 11475623. DOI: 10.3390/cancers16193296.


The Aging of Polymers under Electromagnetic Radiation.

Maraveas C, Kyrtopoulos I, Arvanitis K, Bartzanas T. Polymers (Basel). 2024; 16(5).

PMID: 38475374 PMC: 10934588. DOI: 10.3390/polym16050689.


Open Science Practices in Psychiatric Genetics: A Primer.

Kepinska A, Johnson J, Huckins L. Biol Psychiatry Glob Open Sci. 2024; 4(1):110-119.

PMID: 38298792 PMC: 10829621. DOI: 10.1016/j.bpsgos.2023.08.007.

