
Evaluating Measurement Equivalence Using the Item Response Theory Log-likelihood Ratio (IRTLR) Method to Assess Differential Item Functioning (DIF): Applications (with Illustrations) to Measures of Physical Functioning Ability and General Distress

Overview
Journal Qual Life Res
Date 2007 May 8
PMID 17484039
Citations 37
Abstract

Background: This paper illustrates methods based on item response theory (IRT) for examining differential item functioning (DIF). An IRT-based approach to DIF detection was applied to physical function and general distress item sets, and DIF was examined with respect to gender, age and race. DIF was detected using the item response theory log-likelihood ratio (IRTLR) approach. DIF magnitude was measured using differences in expected item scores, expressed as unsigned probability differences and calculated using the non-compensatory DIF index (NCDIF). Finally, impact was assessed using expected scale scores, expressed as group differences in the total test (measure) response functions.
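The magnitude and impact measures named above can be illustrated with a minimal sketch. It assumes a graded response model and takes NCDIF to be the mean squared difference between the focal-group and reference-group expected item scores, averaged over focal-group ability values (the usual definition in the DFIT framework). The abstract does not specify the IRT model, and the function names, item parameters and simulated ability values below are hypothetical, chosen only for illustration.

```python
import math
import random


def grm_expected_score(theta, a, b):
    """Expected item score under a graded response model (an assumed model).

    Categories are scored 0..len(b), so E[X | theta] equals the sum over
    thresholds of P(X >= k | theta).
    """
    return sum(1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b)


def ncdif(thetas_focal, ref_params, focal_params):
    """Mean squared expected-item-score difference over focal-group thetas."""
    squared = [
        (grm_expected_score(t, *focal_params) - grm_expected_score(t, *ref_params)) ** 2
        for t in thetas_focal
    ]
    return sum(squared) / len(squared)


def expected_scale_score(theta, items):
    """Expected total score: the sum of expected item scores at theta."""
    return sum(grm_expected_score(theta, a, b) for a, b in items)


# Hypothetical parameters: same discrimination, thresholds shifted by 0.3
# for the focal group (uniform DIF).
reference = (1.5, [-1.0, 0.0, 1.0])
focal = (1.5, [-0.7, 0.3, 1.3])

rng = random.Random(0)
thetas_focal = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # simulated abilities

print("NCDIF:", ncdif(thetas_focal, reference, focal))
print("Scale-score gap at theta = 0:",
      expected_scale_score(0.0, [reference]) - expected_scale_score(0.0, [focal]))
```

An item would then be treated as showing larger-magnitude DIF when its NCDIF exceeds a chosen cutoff; the abstract does not state the cutoff value, so none is assumed here.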

Methods: The illustrative data came from a study of 1,714 patients with cancer or HIV/AIDS. The measure contained 23 items measuring physical functioning ability and 15 items addressing general distress, scored in the positive direction.

Results: Substantively, the DIF detected was of relatively small magnitude. In total, six items showed relatively larger magnitude DIF (expected item score differences greater than the cutoff) with respect to physical function across the three comparisons: "trouble with a long walk" (race), "vigorous activities" (race, age), "bending, kneeling, stooping" (age), "lifting or carrying groceries" (race), "limited in hobbies, leisure" (age), and "lack of energy" (race). None of the general distress items evidenced high magnitude DIF, although "worrying about dying" showed some DIF with respect to both age and race after adjustment.

Conclusions: The fact that many physical function items showed DIF with respect to age, even after adjustment for multiple comparisons, indicates that the instrument may be performing differently for these groups. While the magnitude and impact of DIF at the item and scale level were minimal, caution should be exercised in the use of subsets of these items, as might occur with selection for clinical decisions or computerized adaptive testing. The selection of anchor items and the criteria for DIF detection, including the integration of significance and magnitude measures, remain issues requiring investigation. Further research is needed regarding the criteria and guidelines appropriate for DIF detection in the context of health-related items.
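The significance side of DIF detection can be sketched as follows. In a log-likelihood ratio test, a model that constrains the studied item's parameters to be equal across groups is compared with one that frees them, and G² = -2(logL_constrained - logL_free) is referred to a chi-square distribution with degrees of freedom equal to the number of freed parameters. The abstract does not name the multiple-comparison adjustment used, so the Bonferroni rule below is an assumption, and the item labels and log-likelihood values are hypothetical rather than results from the study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df, standard library only."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)             # Q(1, h) = exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))  # Q(1/2, h) = erfc(sqrt(h))
    while a < df / 2.0:
        # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


def lr_dif_test(loglik_constrained, loglik_free, df):
    """Return (G2, p-value) for one studied item."""
    g2 = -2.0 * (loglik_constrained - loglik_free)
    return g2, chi2_sf(g2, df)


# Hypothetical fits: (constrained log-likelihood, free log-likelihood, df).
fits = {
    "item_a": (-10234.6, -10228.1, 2),
    "item_b": (-10234.6, -10233.9, 2),
}

alpha = 0.05
threshold = alpha / len(fits)  # Bonferroni adjustment (an assumption)

flagged = {}
for name, (ll_constrained, ll_free, df) in fits.items():
    g2, p = lr_dif_test(ll_constrained, ll_free, df)
    flagged[name] = p < threshold
    print(f"{name}: G2={g2:.2f}, p={p:.4f}, flagged={flagged[name]}")
```

A flag from this test indicates statistical significance only; how it should be combined with a magnitude measure is one of the open issues the conclusions raise.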

Citing Articles

Exposome Burden Scores to Summarize Environmental Chemical Mixtures: Creating a Fair and Common Scale for Cross-study Harmonization, Report-back and Precision Environmental Health.

Liu S, Manz K, Buckley J, Feuerstahler L. Curr Environ Health Rep. 2025; 12(1):13.

PMID: 39964568 DOI: 10.1007/s40572-024-00467-2.


Measuring visual ability in linguistically diverse populations.

Hooper M, Tomarken A, Gauthier I. Behav Res Methods. 2024; 57(1):36.

PMID: 39738819 PMC: 11685244. DOI: 10.3758/s13428-024-02579-x.


Applying Latent Variable Models to Estimate Cumulative Exposure Burden to Chemical Mixtures and Identify Latent Exposure Subgroups: A Critical Review and Future Directions.

Liu S, Chen Y, Kuiper J, Ho E, Buckley J, Feuerstahler L. Stat Biosci. 2024; 16(2):482-502.

PMID: 39494216 PMC: 11529820. DOI: 10.1007/s12561-023-09410-9.


Pre-natal and early life lead exposure and childhood inhibitory control: an item response theory approach to improve measurement precision of inhibitory control.

Liu S, Chen Y, Bellinger D, de Water E, Horton M, Tellez-Rojo M. Environ Health. 2024; 23(1):71.

PMID: 39232724 PMC: 11375946. DOI: 10.1186/s12940-023-01015-5.


Exposure to per- and polyfluoroalkyl substances and alterations in plasma microRNA profiles in children.

Li Y, Baumert B, Stratakis N, Goodrich J, Wu H, Liu S. Environ Res. 2024; 259:119496.

PMID: 38936497 PMC: 11847561. DOI: 10.1016/j.envres.2024.119496.

