Sample Size Requirements Are Not Being Considered in Studies Developing Prediction Models for Binary Outcomes: A Systematic Review

Overview

Journal: BMC Med Res Methodol

Publisher: BioMed Central

Specialties: General Medicine, Health Services

Date: 2023 Aug 19

PMID: 37598153

Authors

Paula Dhiman

Jie Ma

Cathy Qi

Garrett Bullock

Jamie C Sergeant

Richard D Riley

Gary S Collins

Abstract

Background: Having an appropriate sample size is important when developing a clinical prediction model. We aimed to review how sample size is considered in studies developing a prediction model for a binary outcome.

Methods: We searched PubMed for studies published between 1 July 2020 and 30 July 2020 and reviewed the sample size calculations used to develop the prediction models. Using the available information, we calculated the minimum sample size needed to estimate overall risk and minimise overfitting in each study and summarised the difference between the calculated and actual sample sizes.
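The abstract does not state which formulas were used for the minimum sample size. As a minimal sketch, assuming the closed-form criteria published by Riley et al. (Stat Med, 2018) for binary outcomes, the two targets named above (precise estimation of overall risk, and limiting overfitting through a target shrinkage factor) could be computed as follows. The function names, the 0.05 margin of error, the 0.9 shrinkage target, and the example inputs are illustrative assumptions, not values taken from the review.

```python
import math


def n_for_overall_risk(outcome_proportion, margin=0.05, z=1.96):
    """Sample size to estimate the overall outcome risk within +/- margin.

    Assumed form: n = (z / margin)^2 * phi * (1 - phi).
    """
    phi = outcome_proportion
    return math.ceil((z / margin) ** 2 * phi * (1 - phi))


def n_for_shrinkage(parameters, r2_cs, shrinkage=0.9):
    """Sample size targeting an expected uniform shrinkage factor.

    Assumed form: n = P / ((S - 1) * ln(1 - R2_cs / S)), where P is the
    number of candidate predictor parameters and R2_cs is the anticipated
    Cox-Snell R-squared of the model.
    """
    return math.ceil(
        parameters / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage))
    )


def minimum_sample_size(outcome_proportion, parameters, r2_cs):
    """Largest of the two criteria, plus the implied number of events."""
    n = max(
        n_for_overall_risk(outcome_proportion),
        n_for_shrinkage(parameters, r2_cs),
    )
    return n, n * outcome_proportion


# Hypothetical study: 10% outcome prevalence, 10 candidate predictor
# parameters, anticipated Cox-Snell R-squared of 0.05.
required_n, required_events = minimum_sample_size(0.10, 10, 0.05)
print(required_n, required_events)
```

With these hypothetical inputs the shrinkage criterion dominates: it asks for about 1,750 participants, against 139 for estimating overall risk alone. The sketch omits the further criterion in that paper concerning optimism in the apparent Nagelkerke R-squared.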

Results: A total of 119 studies were included, of which nine (8%) provided a sample size justification. The recommended minimum sample size could be calculated for 94 studies: 73% (95% CI: 63-82%) used sample sizes lower than required to estimate overall risk and minimise overfitting, including 26% of studies that used sample sizes lower than required to estimate overall risk only. A similar proportion of studies did not meet the ≥ 10 events per variable (EPV) criterion (75%, 95% CI: 66-84%). The median deficit in the number of events used to develop a model was 75 (IQR: 234 lower to 7 higher), which reduced to 63 (IQR: 225 lower to 7 higher) if the total available data (before any data splitting) were used. Studies that met the minimum required sample size had a median c-statistic of 0.84 (IQR: 0.80 to 0.9), and studies that did not had a median c-statistic of 0.83 (IQR: 0.75 to 0.9). Studies that met the ≥ 10 events per predictor parameter (EPP) criterion had a median c-statistic of 0.80 (IQR: 0.73 to 0.84).
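To make the summary measures above concrete, the following sketch shows one way the per-study comparison could be tabulated: the difference between events used and events required, its median and quartiles, and a count of studies below 10 events per predictor parameter. The five studies are invented purely for illustration and are not data from the review; the quartile method is also an assumption.

```python
from statistics import median, quantiles

# Hypothetical studies: (events used, events required, predictor parameters).
studies = [
    (40, 175, 10),
    (120, 110, 8),
    (60, 300, 25),
    (500, 260, 12),
    (90, 150, 15),
]


def event_difference(events_used, events_required):
    """Negative values are a deficit, positive values a surplus."""
    return events_used - events_required


def events_per_parameter(events_used, parameters):
    """Events per candidate predictor parameter (EPP)."""
    return events_used / parameters


differences = [event_difference(used, req) for used, req, _ in studies]

# Studies whose development data fell short of the required number of events.
below_required = sum(d < 0 for d in differences)

# Studies failing the ">= 10 EPP" rule of thumb.
below_10_epp = sum(
    events_per_parameter(used, params) < 10 for used, _, params in studies
)

# Median and quartiles of the difference (inclusive method assumed).
median_difference = median(differences)
q1, _, q3 = quantiles(differences, n=4, method="inclusive")

print(f"median difference: {median_difference} (IQR: {q1} to {q3})")
print(f"below required: {below_required}, below 10 EPP: {below_10_epp}")
```

A negative median indicates that the typical study had fewer events than required, which is how the "lower" and "higher" bounds of the reported IQR can be read.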

Conclusions: Prediction models are often developed with no sample size calculation; as a consequence, many are developed with samples too small to precisely estimate the overall risk. We encourage researchers to justify, perform and report sample size calculations when developing a prediction model.
