
Developing Risk Models for Multicenter Data Using Standard Logistic Regression Produced Suboptimal Predictions: A Simulation Study

Overview
Journal: Biom J
Specialty: Public Health
Date: 2020 Jan 21
PMID: 31957077
Citations: 11
Abstract

Although multicenter data are common, many prediction model studies ignore this during model development. The objective of this study was to evaluate the predictive performance of regression methods for developing clinical risk prediction models using multicenter data, and to provide guidelines for practice. We compared the predictive performance of standard logistic regression, generalized estimating equations, random intercept logistic regression, and fixed effects logistic regression. First, we presented a case study on the diagnosis of ovarian cancer. Subsequently, a simulation study investigated the performance of the different models as a function of the amount of clustering, development sample size, distribution of center-specific intercepts, the presence of a center-predictor interaction, and the presence of a dependency between center effects and predictors. The results showed that when sample sizes were sufficiently large, conditional models yielded calibrated predictions, whereas marginal models yielded miscalibrated predictions. This miscalibration was worse with more heavily clustered data. Small sample sizes led to overfitting and unreliable predictions. Calibration of random intercept logistic regression was better than that of standard logistic regression even when center-specific intercepts were not normally distributed, a center-predictor interaction was present, center effects and predictors were dependent, or when the model was applied in a new center. Therefore, to make reliable predictions in a specific center, we recommend random intercept logistic regression.
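
The contrast between marginal and conditional models described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the number of centers, sample sizes, intercept spread, and coefficients are all assumed values chosen for illustration. It also uses fixed effects logistic regression (one intercept per center) as the conditional model, because it can be fitted with plain NumPy, whereas the abstract's recommendation is random intercept logistic regression. It compares how well each model's mean predicted risk matches the observed event rate within each center on held-out patients from the same centers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=50, tol=1e-8):
    """Maximum-likelihood logistic regression via Newton-Raphson.
    X must already contain whatever intercept columns are wanted."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        w = p * (1.0 - p)
        step = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def center_calibration_error(center, y, p):
    """Mean over centers of |observed event rate - mean predicted risk|."""
    ids = np.unique(center)
    return float(np.mean([abs(y[center == c].mean() - p[center == c].mean())
                          for c in ids]))

# --- Assumed simulation settings (illustrative only, not from the paper) ---
rng = np.random.default_rng(0)
n_centers, n_per_center, sd_center = 20, 1000, 1.0

center = np.repeat(np.arange(n_centers), n_per_center)
a = rng.normal(0.0, sd_center, n_centers)       # center-specific intercepts
x = rng.normal(size=center.size)                # one patient-level predictor
y = rng.binomial(1, sigmoid(-1.0 + a[center] + 1.0 * x)).astype(float)

# Random split into development and validation patients within the same centers
dev = rng.random(center.size) < 0.5
val = ~dev

# Standard logistic regression: one global intercept, center ignored
X_std = np.column_stack([np.ones(center.size), x])
# Fixed effects logistic regression: one intercept (dummy) per center
dummies = (center[:, None] == np.arange(n_centers)[None, :]).astype(float)
X_fe = np.column_stack([dummies, x])

beta_std = fit_logistic(X_std[dev], y[dev])
beta_fe = fit_logistic(X_fe[dev], y[dev])

err_std = center_calibration_error(center[val], y[val], sigmoid(X_std[val] @ beta_std))
err_fe = center_calibration_error(center[val], y[val], sigmoid(X_fe[val] @ beta_fe))

print(f"standard LR, center-level calibration error:      {err_std:.3f}")
print(f"fixed effects LR, center-level calibration error: {err_fe:.3f}")
```

Under these assumed settings, the model that ignores center should show the larger center-level calibration error, because a single intercept cannot track event rates that differ between centers. Reducing `n_per_center` makes the per-center intercepts noisier, which is one way to explore the small-sample overfitting mentioned in the abstract.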

Citing Articles

Comparing methods for risk prediction of multicategory outcomes: dichotomized logistic regression vs. multinomial logit regression.

Li L, Rysavy M, Bobashev G, Das A. BMC Med Res Methodol. 2024; 24(1):261.

PMID: 39482630 PMC: 11526521. DOI: 10.1186/s12874-024-02389-x.


Detecting anteriorly displaced temporomandibular joint discs using super-resolution magnetic resonance imaging: a multi-center study.

Li Y, Li W, Wang L, Wang X, Gao S, Liao Y. Front Physiol. 2024; 14:1272814.

PMID: 38250655 PMC: 10796555. DOI: 10.3389/fphys.2023.1272814.


Personalising monitoring for chemotherapy patients through predicting deterioration in renal and hepatic function.

Chambers P, Watson M, Bridgewater J, Forster M, Roylance R, Burgoyne R. Cancer Med. 2023; 12(17):17856-17865.

PMID: 37610318 PMC: 10524043. DOI: 10.1002/cam4.6418.


Development and validation of prediction models for the discharge destination of elderly patients with aspiration pneumonia.

Hirota Y, Shin J, Sasaki N, Kunisawa S, Fushimi K, Imanaka Y. PLoS One. 2023; 18(2):e0282272.

PMID: 36827320 PMC: 9955922. DOI: 10.1371/journal.pone.0282272.


Outcome prediction in newborn infants: Past, present, and future.

Shukla V, Rysavy M, Das A, Tyson J, Bell E, Ambalavanan N. Semin Perinatol. 2022; 46(7):151641.

PMID: 35850743 PMC: 10969981. DOI: 10.1016/j.semperi.2022.151641.


References
1. Strobl A, Vickers A, Van Calster B, Steyerberg E, Leach R, Thompson I. Improving patient prostate cancer risk assessment: Moving from static, globally-applied to dynamic, practice-specific risk calculators. J Biomed Inform. 2015; 56:87-93. PMC: 4532612. DOI: 10.1016/j.jbi.2015.05.001.

2. Timmerman D, Van Calster B, Testa A, Guerriero S, Fischerova D, Lissoni A. Ovarian cancer prediction in adnexal masses using ultrasound-based logistic regression models: a temporal and external validation study by the IOTA group. Ultrasound Obstet Gynecol. 2010; 36(2):226-34. DOI: 10.1002/uog.7636.

3. Falconieri N, Van Calster B, Timmerman D, Wynants L. Developing risk models for multicenter data using standard logistic regression produced suboptimal predictions: A simulation study. Biom J. 2020; 62(4):932-944. PMC: 7383814. DOI: 10.1002/bimj.201900075.

4. Neuhaus J, McCulloch C, Boylan R. Estimation of covariate effects in generalized linear mixed models with a misspecified distribution of random intercepts and slopes. Stat Med. 2012; 32(14):2419-29. DOI: 10.1002/sim.5682.

5. Kahan B, Harhay M. Many multicenter trials had few events per center, requiring analysis via random-effects models or GEEs. J Clin Epidemiol. 2015; 68(12):1504-11. PMC: 4845666. DOI: 10.1016/j.jclinepi.2015.03.016.