Modeling and Analyzing Respondent-driven Sampling As a Counting Process
Respondent-driven sampling (RDS) is an approach to sampling design and analysis that uses chain-referral along the social relationships connecting members of the target population. RDS typically oversamples participants with many acquaintances, so naïve estimators such as the sample average are biased towards the state of the most highly connected individuals. Current methodology cannot estimate population size from RDS, and promotes inverse-probability-weighted estimators for population parameters such as HIV prevalence. We propose to use the timing of recruitment, which is typically collected and then discarded, to estimate the population size via a counting process model. Once population size and degree frequencies are available, prevalence estimates can be debiased in a post-stratified framework. We adapt methods developed for inference in epidemiology and software reliability to estimate the population size, degree counts, and degree frequencies. A fundamental advantage of our approach is that it makes the assumptions of the sampling design explicit, which enables verification of the assumptions, maximum likelihood estimation, extension with covariates, and model selection. We develop large-sample theory, proving consistency and asymptotic normality. We further compare our estimators to other estimators in the RDS literature, using both simulated and real-world data, and in both cases find that our estimators outperform current methods. The likelihood problem in the model we present is separable, and thus efficiently solvable. We implement these estimators in an accompanying R package, chords, available on CRAN.
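To make the contrast between the estimators mentioned above concrete, here is a minimal Python sketch, not the chords implementation. It compares the naïve sample average, a simple inverse-degree-weighted estimator, and a post-stratified estimator that weights each degree stratum by its population share. The function names, the toy sample, and the degree counts are all hypothetical; in particular, the degree counts are simply assumed to be given here, whereas in the paper they would come from the counting process model.

```python
from collections import defaultdict

def naive_mean(sample):
    """Plain sample average of the outcome; ignores unequal inclusion."""
    return sum(y for _, y in sample) / len(sample)

def inverse_degree_weighted(sample):
    """Weight each respondent by 1/degree, assuming (as an illustration)
    that inclusion probability is proportional to degree."""
    weights = [1.0 / d for d, _ in sample]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, sample))
    return weighted_sum / sum(weights)

def post_stratified(sample, degree_counts):
    """Average the within-degree sample means, weighting each degree
    stratum by its (assumed known) share of the population.
    Strata with no sampled respondents are dropped and the remaining
    shares renormalized."""
    by_degree = defaultdict(list)
    for d, y in sample:
        by_degree[d].append(y)
    covered = {d: n for d, n in degree_counts.items() if d in by_degree}
    total = sum(covered.values())
    return sum(
        (n / total) * (sum(by_degree[d]) / len(by_degree[d]))
        for d, n in covered.items()
    )

# Hypothetical toy data: (degree, outcome) pairs, outcome 1 = positive.
# High-degree respondents are overrepresented, as RDS tends to produce.
sample = [(1, 0), (2, 0), (2, 1), (4, 1), (4, 1), (4, 0), (4, 1), (4, 1)]

# Hypothetical population degree counts N_k (assumed, not estimated here).
degree_counts = {1: 500, 2: 300, 4: 200}

print("naive:", naive_mean(sample))
print("inverse-degree weighted:", inverse_degree_weighted(sample))
print("post-stratified:", post_stratified(sample, degree_counts))
```

In this toy example the naïve average is pulled towards the prevalence among high-degree respondents, while the two adjusted estimators give more weight to the low-degree strata. The post-stratified version depends directly on the quality of the degree counts supplied to it.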