A Comparison of Classical and Modern Measures of Internal Consistency

Overview

Journal Front Psychol

Date 2019 Dec 24

PMID 31866905

Citations 31

Authors

Pasquale Anselmi

Daiana Colledani

Egidio Robusto

Abstract

Three measures of internal consistency are considered: Kuder-Richardson Formula 20 (KR20), Cronbach's alpha (α), and person separation reliability (R). KR20 and α are common measures in classical test theory, whereas R was developed in modern test theory and, more precisely, in Rasch measurement. All three measures express the observed variance as the sum of true variance and error variance, but they differ in the way these quantities are obtained. KR20 uses the error variance of an "average" respondent from the sample, which overestimates the error variance of respondents with high or low scores. Conversely, R uses the actual average error variance of the sample. KR20 and α use respondents' test scores to calculate the observed variance. This is potentially misleading because test scores are not linear representations of the underlying variable, whereas the calculation of variance requires linearity. By contrast, if the data fit the Rasch model, the measures estimated for each respondent are on a linear scale and are therefore numerically suitable for calculating the observed variance. Given these differences, R is expected to be a better index of internal consistency than KR20 and α. The present work compares the three measures on simulated data sets with dichotomous and polytomous items. It is shown that all the estimates of internal consistency decrease as the skewness of the score distribution increases, with R decreasing to a larger extent. Thus, R is more conservative than KR20 and α, and prevents test users from believing that a test has better measurement characteristics than it actually has. In addition, it is shown that Rasch-based infit and outfit person statistics can be used for handling data sets with random responses. Two options are described. The first consists of computing a more conservative estimate of internal consistency. The second consists of detecting individuals with random responses. When there are a few individuals with a substantial number of random responses, infit and outfit correctly detect almost all of them. Once these individuals are removed, a "cleaned" data set is obtained that can be used for computing a less biased estimate of internal consistency.
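As a rough illustration of the quantities named in the abstract, the following Python sketch computes KR20, α, R, and the infit and outfit person statistics using their standard textbook definitions. It is not taken from the article. The function names are hypothetical, and the person measures, standard errors, and item difficulties are assumed to have been estimated beforehand with Rasch software; the sketch does not fit a Rasch model itself. The fit statistics are written for dichotomous items only.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for a persons-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    # Population variances (ddof=0) are used for items and totals alike.
    item_var_sum = scores.var(axis=0, ddof=0).sum()
    total_var = scores.sum(axis=1).var(ddof=0)
    return k / (k - 1) * (1 - item_var_sum / total_var)


def kr20(scores):
    """KR20 for a persons-by-items matrix of 0/1 responses."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    p = scores.mean(axis=0)  # proportion of 1s per item
    total_var = scores.sum(axis=1).var(ddof=0)
    return k / (k - 1) * (1 - (p * (1 - p)).sum() / total_var)


def person_separation_reliability(measures, std_errors):
    """R = (observed variance - mean error variance) / observed variance.

    `measures` are Rasch person measures (logits) and `std_errors` their
    standard errors; both are assumed to come from a prior Rasch analysis.
    """
    measures = np.asarray(measures, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    observed_var = measures.var(ddof=0)
    error_var = np.mean(std_errors ** 2)  # average error variance of the sample
    return (observed_var - error_var) / observed_var


def person_fit(responses, theta, b):
    """Infit and outfit mean squares per person (dichotomous Rasch model).

    `theta` (person measures) and `b` (item difficulties) are assumed known.
    """
    x = np.asarray(responses, dtype=float)
    theta = np.asarray(theta, dtype=float)
    b = np.asarray(b, dtype=float)
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))  # expected score
    w = p * (1.0 - p)                                        # model variance
    sq_resid = (x - p) ** 2
    outfit = (sq_resid / w).mean(axis=1)            # unweighted mean square
    infit = sq_resid.sum(axis=1) / w.sum(axis=1)    # information-weighted
    return infit, outfit
```

For dichotomous data the two classical indices coincide when the same variance convention is used throughout: on a five-person, four-item Guttman pattern with total scores 4, 3, 2, 1, 0, both `kr20` and `cronbach_alpha` return approximately 0.8. Infit and outfit values well above 1 indicate response patterns that are noisier than the model expects, which is the basis on which they could be used to flag respondents answering at random; any cutoff applied to them is a choice for the analyst and is not specified here.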
