
Measuring Concordance of Data Sources Used for Infectious Disease Research in the USA: a Retrospective Data Analysis

Overview
Journal: BMJ Open
Specialty: General Medicine
Date: 2023 Feb 28
PMID: 36854597
Abstract

Objectives: As highlighted by the COVID-19 pandemic, researchers are eager to make use of a wide variety of data sources, both government-sponsored and alternative, to characterise the epidemiology of infectious diseases. The objective of this study is to investigate the strengths and limitations of sources currently being used for research.

Design: Retrospective descriptive analysis.

Primary and secondary outcome measures: Yearly number of national-level and state-level disease-specific case counts and disease clusters for three diseases (measles, mumps and varicella) during a 5-year study period (2013-2017) across four different data sources: Optum (health insurance billing claims data), HealthMap (online news surveillance data), Morbidity and Mortality Weekly Reports (official government reports) and National Notifiable Disease Surveillance System (government case surveillance data).

Results: Our study demonstrated drastic differences in reported infectious disease incidence across data sources. When compared with the other three sources of interest, Optum data showed substantially higher, implausible standardised case counts for all three diseases. Although there was some concordance in identified state-level case counts and disease clusters, all four sources identified variations in state-level reporting.
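The abstract does not state how case counts were standardised. As a rough illustration only, one common approach is to express each source's yearly count per 100,000 people in the population that source covers, and then compare the resulting rates. The sketch below assumes that approach; the function names and all input numbers are hypothetical and are not taken from the study.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Return cases per 100,000 people in the covered population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


def rate_ratio(rate_a: float, rate_b: float) -> float:
    """Return how many times larger rate_a is than rate_b."""
    if rate_b == 0:
        raise ValueError("reference rate must be non-zero")
    return rate_a / rate_b


# Hypothetical inputs for illustration; NOT values from the study.
# A claims source covers only its enrollees, while a national
# surveillance system covers the whole population, so the
# denominators differ by source.
claims_rate = rate_per_100k(cases=50, population=1_000_000)
surveillance_rate = rate_per_100k(cases=20, population=10_000_000)

print(f"claims: {claims_rate:.2f} per 100k")
print(f"surveillance: {surveillance_rate:.2f} per 100k")
print(f"ratio: {rate_ratio(claims_rate, surveillance_rate):.1f}x")
```

Putting every source on a shared per-100,000 scale is what makes a comparison like "substantially higher" meaningful when the sources cover populations of very different sizes.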

Conclusions: Researchers should consider data source limitations when attempting to characterise the epidemiology of infectious diseases. Some data sources, such as billing claims data, may be unsuitable for epidemiological research within the infectious disease context.

Citing Articles

What makes an epidemic a disaster: the future of epidemics within the EM-DAT International Disaster Database.

Tonnelier M, Delforge D, Below R, Munguia J, Saegerman C, Wathelet V. BMC Public Health. 2025; 25(1):21.

PMID: 39754094 PMC: 11697923. DOI: 10.1186/s12889-024-21026-2.


Benchmarking commercial healthcare claims data.

Dahlen A, Deng Y, Charu V. medRxiv. 2024.

PMID: 39228744 PMC: 11370529. DOI: 10.1101/2024.08.19.24312249.
