Power Analysis in Randomized Clinical Trials Based on Item Response Theory

Overview

Journal Control Clin Trials

Publisher Elsevier

Specialties General Medicine, Pharmacology

Date 2003 Jul 17

PMID 12865034

Citations 15

Authors

Rebecca Holman

Cees A W Glas

Rob J de Haan

Abstract

Patient-relevant outcomes, measured using questionnaires, are becoming increasingly popular endpoints in randomized clinical trials (RCTs). Recently, interest in the use of item response theory (IRT) to analyze the responses to such questionnaires has increased. In this paper, we used a simulation study to examine the small-sample behavior of a test statistic designed to examine the difference in average latent trait level between two groups when the two-parameter logistic IRT model for binary data is used. The simulation study was extended to examine the relationship between the number of patients required in each arm of an RCT, the number of items used to assess them, and the power to detect minimal, moderate, and substantial treatment effects. The results show that the number of patients required in each arm of an RCT varies with the number of items used to assess the patients. However, as long as at least 20 items are used, the number of items barely affects the number of patients required in each arm of an RCT to detect effect sizes of 0.5 and 0.8 with a power of 80%. By contrast, the number of items used has more effect on the number of patients required to detect an effect size of 0.2 with a power of 80%. For instance, if only five randomly selected items are used, it is necessary to include 950 patients in each arm, but if 50 items are used, only 450 are required in each arm. These results indicate that if an RCT is to be designed to detect small effects, it is inadvisable to use very short instruments analyzed using IRT. Finally, the SF-36, SF-12, and SF-8 instruments were considered in the same framework. Since these instruments consist of items scored in more than two categories, slightly different results were obtained.
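
As a rough point of reference for the patient numbers quoted in the abstract, the sketch below computes the textbook normal-approximation sample size per arm for a two-group comparison in which the latent trait is observed without measurement error. It also shows how binary responses could be generated under a two-parameter logistic (2PL) model. This is not the authors' simulation code. The two-sided significance level of 0.05, the standard normal latent trait distribution, and all function names and item parameters are assumptions made for illustration; the abstract does not state them.

```python
import math
import random
from statistics import NormalDist


def n_per_arm(effect_size, power=0.80, alpha=0.05):
    """Normal-approximation patients per arm for a two-sided, two-group test.

    Assumes the outcome is measured without error; alpha=0.05 is an
    assumption, not a value taken from the abstract.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def p_positive_2pl(theta, a, b):
    """2PL probability of a positive response (discrimination a, difficulty b)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def simulate_arm(n_patients, items, mean_theta, rng):
    """Draw latent traits from N(mean_theta, 1) and binary 2PL responses.

    `items` is a list of (a, b) pairs; these are hypothetical parameters.
    """
    responses = []
    for _ in range(n_patients):
        theta = rng.gauss(mean_theta, 1.0)
        row = [int(rng.random() < p_positive_2pl(theta, a, b)) for a, b in items]
        responses.append(row)
    return responses


# Error-free baseline for the three effect sizes named in the abstract.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_arm(d))  # 393, 63 and 25 patients per arm

# Five hypothetical items, control arm versus an arm shifted by 0.2 SD.
rng = random.Random(1)
items = [(1.0, b) for b in (-1.0, -0.5, 0.0, 0.5, 1.0)]
control = simulate_arm(10, items, 0.0, rng)
treated = simulate_arm(10, items, 0.2, rng)
```

Under these assumptions the error-free baseline for an effect size of 0.2 is 393 patients per arm, so the 450 patients reported for 50 items sit close to that floor, while the 950 reported for five items reflect the extra patients needed to offset measurement error in a short instrument.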

Citing Articles

Bayesian item response theory to estimate power in clinical trials with patient-reported outcomes as endpoints.

Mei X, Cappelleri J, Hu J Qual Life Res. 2025.

PMID: 39776338 DOI: 10.1007/s11136-024-03874-y.

Power Analysis for the Wald, LR, Score, and Gradient Tests in a Marginal Maximum Likelihood Framework: Applications in IRT.

Zimmer F, Draxler C, Debelak R Psychometrika. 2022; 88(4):1249-1298.

PMID: 36029390 PMC: 10656348. DOI: 10.1007/s11336-022-09883-5.

Item Response Theory Modeling of the International Prostate Symptom Score in Patients with Lower Urinary Tract Symptoms Associated with Benign Prostatic Hyperplasia.

Lyauk Y, Jonker D, Lund T, Hooker A, Karlsson M AAPS J. 2020; 22(5):115.

PMID: 32856168 PMC: 7452927. DOI: 10.1208/s12248-020-00500-w.

Measuring physical and mental health during pregnancy and postpartum in an Australian childbearing population - validation of the PROMIS Global Short Form.

Slavin V, Gamble J, Creedy D, Fenwick J, Pallant J BMC Pregnancy Childbirth. 2019; 19(1):370.

PMID: 31640626 PMC: 6805680. DOI: 10.1186/s12884-019-2546-6.

Some recommendations for developing multidimensional computerized adaptive tests for patient-reported outcomes.

Smits N, Paap M, Bohnke J Qual Life Res. 2018; 27(4):1055-1063.

PMID: 29476312 PMC: 5874279. DOI: 10.1007/s11136-018-1821-8.