Cluster Randomised Trials with a Binary Outcome and a Small Number of Clusters: Comparison of Individual and Cluster Level Analysis Method

Overview

Journal BMC Med Res Methodol

Publisher Biomed Central

Specialties General Medicine
Health Services

Date 2022 Aug 12

PMID 35962318

Authors

Jennifer A Thompson

Clemence Leyrat

Katherine L Fielding

Richard J Hayes


Abstract

Background: Cluster randomised trials (CRTs) are often designed with a small number of clusters, but it is not clear which analysis methods are optimal when the outcome is binary. This simulation study aimed to determine (i) whether cluster-level analysis (CL), generalised linear mixed models (GLMM), and generalised estimating equations with sandwich variance (GEE) maintain acceptable type-one error, including under non-normal cluster effects and low outcome prevalence, and, if so, (ii) which methods have the greatest power.

Methods: We simulated CRTs with 8-30 clusters, varying the cluster size, outcome prevalence, intracluster correlation coefficient, and cluster-effect distribution. We analysed each dataset with weighted and unweighted CL; GLMM with adaptive quadrature and restricted pseudolikelihood; and GEE with Kauermann-and-Carroll and Fay-and-Graubard sandwich variance using independent and exchangeable working correlation matrices. P-values were taken from a t-distribution with degrees of freedom (DoF) equal to the number of clusters minus the number of cluster-level parameters; GLMM pseudolikelihood was also assessed with Satterthwaite and Kenward-Roger DoF.
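The kind of dataset described above can be illustrated with a minimal sketch. It is not the authors' simulation code: the function name `simulate_crt`, its parameters, the normal cluster effect on the logit scale, and the normal draw for cluster sizes are all assumptions made for illustration. Note that `prevalence` here is the control-arm prevalence for a cluster with zero random effect, not the marginal prevalence.

```python
import math
import random

def simulate_crt(n_clusters=12, mean_size=50, size_cv=0.0, prevalence=0.3,
                 sigma_u=0.5, log_odds_ratio=0.0, seed=1):
    """Simulate one two-arm CRT with a binary outcome (hypothetical helper).

    Returns a list of (arm, n_events, cluster_size) tuples, one per cluster.
    """
    rng = random.Random(seed)
    base_logit = math.log(prevalence / (1 - prevalence))

    # Allocate clusters to arms as evenly as possible, in random order.
    arms = [0, 1] * (n_clusters // 2) + [0] * (n_clusters % 2)
    rng.shuffle(arms)

    clusters = []
    for arm in arms:
        # Cluster size: normal around mean_size with SD = size_cv * mean_size,
        # rounded and floored at 2 (an assumption, not the paper's scheme).
        size = max(2, round(rng.gauss(mean_size, size_cv * mean_size)))
        # Cluster-level random effect on the logit scale (assumed normal).
        u = rng.gauss(0.0, sigma_u)
        p = 1 / (1 + math.exp(-(base_logit + log_odds_ratio * arm + u)))
        events = sum(rng.random() < p for _ in range(size))
        clusters.append((arm, events, size))
    return clusters

if __name__ == "__main__":
    for arm, events, size in simulate_crt(n_clusters=8, size_cv=0.4):
        print(arm, events, size)
```

Setting `log_odds_ratio=0.0` generates data under the null hypothesis, which is the setting needed to estimate type-one error by repeating the simulation and analysis many times.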

Results: Unweighted CL, GLMM pseudolikelihood, and Fay-and-Graubard GEE with an independent or exchangeable working correlation matrix controlled type-one error in > 97% of scenarios when DoF were set to clusters minus parameters. Cluster-effect distribution and outcome prevalence did not usually affect the performance of the analysis methods. GEE had the least power. With 20-30 clusters, GLMM had greater power than CL when cluster size varied but similar power otherwise; with fewer clusters, GLMM had lower power with a common cluster size, similar power with medium variation, and greater power with large variation in cluster size.

Conclusion: We recommend that CRTs with ≤ 30 clusters and a binary outcome use an unweighted CL or a restricted pseudolikelihood GLMM, both with DoF equal to the number of clusters minus the number of cluster-level parameters.
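As an illustration of the first recommended approach, the sketch below runs an unweighted cluster-level analysis in plain Python. It assumes the cluster summary is the observed proportion, the effect measure is a risk difference, and the two cluster-level parameters are the intercept and the treatment effect, so DoF are clusters minus 2; the abstract does not state the summary scale, so the paper may differ. The function names are hypothetical.

```python
import math
from statistics import mean, variance

def unweighted_cl_test(clusters):
    """Unweighted cluster-level t-test on cluster proportions.

    clusters: iterable of (arm, n_events, cluster_size), arm coded 0 or 1,
    with at least two clusters per arm.
    Returns (risk_difference, t_statistic, dof).
    """
    props = {0: [], 1: []}
    for arm, events, size in clusters:
        # Each cluster contributes one equally weighted summary.
        props[arm].append(events / size)
    k0, k1 = len(props[0]), len(props[1])
    diff = mean(props[1]) - mean(props[0])
    dof = k0 + k1 - 2  # clusters minus cluster-level parameters (assumed 2)
    pooled_var = ((k0 - 1) * variance(props[0])
                  + (k1 - 1) * variance(props[1])) / dof
    se = math.sqrt(pooled_var * (1 / k0 + 1 / k1))
    return diff, diff / se, dof

def t_two_sided_p(t_stat, dof, steps=2000):
    """Two-sided p-value from a t-distribution via Simpson integration."""
    c = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    pdf = lambda x: c * (1 + x * x / dof) ** (-(dof + 1) / 2)
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1 - 2 * total * h / 3)

if __name__ == "__main__":
    # Made-up data: 4 clusters per arm, 50 participants each.
    data = [(0, 10, 50), (0, 15, 50), (0, 20, 50), (0, 15, 50),
            (1, 20, 50), (1, 25, 50), (1, 30, 50), (1, 25, 50)]
    diff, t_stat, dof = unweighted_cl_test(data)
    print(diff, t_stat, dof, t_two_sided_p(t_stat, dof))
```

Because every cluster counts equally regardless of its size, this analysis ignores cluster-size variation; a weighted CL would instead weight each proportion, for example by cluster size.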

Citing Articles

Design of field trials for the evaluation of transmissible vaccines in animal populations.

Sheen J, Kennedy-Shaffer L, Levy M, Metcalf C PLoS Comput Biol. 2025; 21(2):e1012779.

PMID: 39899630 PMC: 11790233. DOI: 10.1371/journal.pcbi.1012779.

A cluster randomized trial assessing the effect of a digital health algorithm on quality of care in Tanzania (DYNAMIC study).

Tan R, Kavishe G, Kulinkina A, Renggli S, Luwanda L, Mangu C PLOS Digit Health. 2024; 3(12):e0000694.

PMID: 39715234 PMC: 11666054. DOI: 10.1371/journal.pdig.0000694.

Re-analysis of data from cluster randomised trials to explore the impact of model choice on estimates of odds ratios: study protocol.

Hemming K, Thompson J, Taljaard M, Watson S, Kasza J, Thompson J Trials. 2024; 25(1):818.

PMID: 39695707 PMC: 11653799. DOI: 10.1186/s13063-024-08653-1.

Demystifying estimands in cluster-randomised trials.

Kahan B, Blette B, Harhay M, Halpern S, Jairath V, Copas A Stat Methods Med Res. 2024; 33(7):1211-1232.

PMID: 38780480 PMC: 11348634. DOI: 10.1177/09622802241254197.

Analysing cluster randomised controlled trials using GLMM, GEE1, GEE2, and QIF: results from four case studies.

Offorha B, Walters S, Jacques R BMC Med Res Methodol. 2023; 23(1):293.

PMID: 38093221 PMC: 10717070. DOI: 10.1186/s12874-023-02107-z.
