
Assessing the Goodness of Fit of Logistic Regression Models in Large Samples: A Modification of the Hosmer-Lemeshow Test

Overview
Journal: Biometrics
Specialty: Public Health
Date: 2020 Mar 6
PMID: 32134502
Citations: 59
Abstract

Evaluating the goodness of fit of logistic regression models is crucial to ensure the accuracy of the estimated probabilities. Unfortunately, such evaluation is problematic in large samples. Because the power of traditional goodness of fit tests increases with the sample size, practically irrelevant discrepancies between estimated and true probabilities are increasingly likely to cause the rejection of the hypothesis of perfect fit in larger and larger samples. This phenomenon has been widely documented for popular goodness of fit tests, such as the Hosmer-Lemeshow test. To address this limitation, we propose a modification of the Hosmer-Lemeshow approach. By standardizing the noncentrality parameter that characterizes the alternative distribution of the Hosmer-Lemeshow statistic, we introduce a parameter that measures the goodness of fit of a model but does not depend on the sample size. We provide the methodology to estimate this parameter and construct confidence intervals for it. Finally, we propose a formal statistical test to rigorously assess whether the fit of a model, albeit not perfect, is acceptable for practical purposes. The proposed method is compared in a simulation study with a competing modification of the Hosmer-Lemeshow test, based on repeated subsampling. We provide a step-by-step illustration of our method using a model for postneonatal mortality developed in a large cohort of more than 300 000 observations.
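The sample-size effect described in the abstract can be sketched numerically. The block below is a minimal illustration, not the authors' implementation: it computes the usual decile-based Hosmer-Lemeshow statistic and then a sample-size-standardized quantity, sqrt(max(C - df, 0) / n). That formula is an assumption made here for illustration, since the abstract does not state the estimator. The simulated data, the slope values, and the function names (hosmer_lemeshow, standardized_noncentrality, simulate) are likewise hypothetical, and df = G - 2 is the conventional choice for a model assessed on its development sample.

```python
import math
import random


def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow statistic for binary outcomes y and predicted probabilities p."""
    pairs = sorted(zip(p, y))  # order observations by predicted probability
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        # Equal-sized groups ("deciles of risk" when groups=10).
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        expected = sum(p_i for p_i, _ in chunk)
        observed = sum(y_i for _, y_i in chunk)
        p_bar = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * p_bar * (1.0 - p_bar))
    return stat


def standardized_noncentrality(stat, n, groups=10):
    """ASSUMED standardization: moment estimate of the noncentrality, divided by n."""
    df = groups - 2  # conventional degrees of freedom (assumption, see lead-in)
    lam_hat = max(stat - df, 0.0)  # truncate at zero when the statistic is below df
    return math.sqrt(lam_hat / n)


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, rng, true_slope=1.15):
    """Hypothetical data: the model uses slope 1.0, the truth uses true_slope."""
    y, p = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p.append(expit(-1.0 + x))  # slightly miscalibrated prediction
        y.append(1 if rng.random() < expit(-1.0 + true_slope * x) else 0)
    return y, p


rng = random.Random(0)
for n in (2_000, 200_000):
    y, p = simulate(n, rng)
    c = hosmer_lemeshow(y, p)
    print(n, round(c, 1), round(standardized_noncentrality(c, n), 3))
```

With the same mild miscalibration at both sample sizes, the raw statistic should grow roughly in proportion to n, while the standardized quantity stays on a comparable scale. This is the property the abstract attributes to the proposed parameter.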

Citing Articles

The combined manifestations of dramatically sore throat, congested and edematous mucosa, no-swelling tonsil are specific in acute Omicron pharyngitis.

Zhou L, Zhang L, Xu F. BMC Infect Dis. 2025; 25(1):29.

PMID: 39762748 PMC: 11702273. DOI: 10.1186/s12879-024-10364-6.


Development and construction of a cataract risk prediction model based on biochemical indices: the National Health and Nutrition Examination Survey, 2005-2008.

Wang G, Yi X. Front Med (Lausanne). 2024; 11:1452756.

PMID: 39497845 PMC: 11532035. DOI: 10.3389/fmed.2024.1452756.


Severity of illness scores in the pediatric intensive care unit: a practical guide.

Arias Lopez M, Prata-Barbosa A, Lima-Setta F. Crit Care Sci. 2024; 36:e20240205en.

PMID: 39442137 PMC: 11554295. DOI: 10.62675/2965-2774.20240205-en.


The association between arterial stiffness and socioeconomic status: a cross-sectional study using estimated pulse wave velocity.

Kim H, Kwon S, Joh H, Lim W, Seo J, Kim S. Clin Hypertens. 2024; 30(1):26.

PMID: 39350219 PMC: 11443864. DOI: 10.1186/s40885-024-00284-7.


Evaluating the cost-effectiveness of polygenic risk score-stratified screening for abdominal aortic aneurysm.

Kelemen M, Danesh J, Di Angelantonio E, Inouye M, O'Sullivan J, Pennells L. Nat Commun. 2024; 15(1):8063.

PMID: 39277617 PMC: 11401842. DOI: 10.1038/s41467-024-52452-w.