
Categorical Data Analysis: Away from ANOVAs (Transformation or Not) and Towards Logit Mixed Models

Overview
Journal: J Mem Lang
Publisher: Elsevier
Date: 2009 Nov 4
PMID: 19884961
Citations: 784
Abstract

This paper identifies several serious problems with the widespread use of ANOVAs for the analysis of categorical outcome variables such as forced-choice variables, question-answer accuracy, and choice in production (e.g., in syntactic priming research). I show that even after applying the arcsine-square-root transformation to proportional data, ANOVA can yield spurious results. I discuss conceptual issues underlying these problems and alternatives provided by modern statistics. Specifically, I introduce ordinary logit models (i.e., logistic regression), which are well suited to analyzing categorical data and offer many advantages over ANOVA. Unfortunately, ordinary logit models do not include random effect modeling. To address this issue, I describe mixed logit models (Generalized Linear Mixed Models for binomially distributed outcomes; Breslow & Clayton, 1993), which combine the advantages of ordinary logit models with the ability to account for random subject and item effects in one step of analysis. Throughout the paper, I use a psycholinguistic data set to compare the different statistical methods.
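As a rough illustration of the two scales the abstract contrasts, the sketch below compares the arcsine-square-root transformation with the logit (log-odds) transformation. It is not taken from the paper and does not use its data set: the function names, the baseline proportions, and the effect size of 1.0 log-odds are all assumptions chosen for illustration.

```python
import math


def arcsine_sqrt(p):
    """Arcsine-square-root transform of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def logit(p):
    """Log-odds of a proportion p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Map a log-odds value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def shift_in_log_odds(p, delta):
    """Proportion obtained by adding delta to the log-odds of p.

    Hypothetical helper: it models an effect that is constant on the
    logit scale, which is the assumption behind a logit model.
    """
    return inverse_logit(logit(p) + delta)


if __name__ == "__main__":
    # Illustrative baselines (assumed values, not data from the paper).
    for baseline in (0.50, 0.80, 0.95):
        shifted = shift_in_log_odds(baseline, 1.0)
        change_raw = shifted - baseline
        change_asin = arcsine_sqrt(shifted) - arcsine_sqrt(baseline)
        print(
            f"baseline={baseline:.2f} shifted={shifted:.3f} "
            f"raw change={change_raw:.3f} arcsine-sqrt change={change_asin:.3f}"
        )
```

Under these assumptions, the same effect in log-odds produces a smaller change in raw proportions the closer the baseline is to ceiling. This is one way to see why analyses on the proportion scale and on the logit scale can disagree. The arcsine-square-root scale is also bounded (it cannot exceed π/2), whereas the logit scale is unbounded.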

Citing Articles

Exploring the development of face recognition across childhood via logistic mixed-effects modelling of the standardised Cambridge Face Memory Test.

Ewing L, Althaus N, Farran E, Papasavva M, Mares I, Smith M. Behav Res Methods. 2025; 57(4):113.

PMID: 40064748 PMC: 11893692. DOI: 10.3758/s13428-025-02629-y.


Testing the Interface Hypothesis: Evidence from processing directions of possession transfer in double object constructions by L1-Mandarin Chinese L2-English learners.

Li Y, Zeng T, Liu Z. PLoS One. 2025; 20(2):e0313965.

PMID: 39937790 PMC: 11819539. DOI: 10.1371/journal.pone.0313965.


Analogical reasoning in first and second languages.

Ikuta M, Miwa K. PLoS One. 2025; 20(2):e0318348.

PMID: 39932969 PMC: 11813118. DOI: 10.1371/journal.pone.0318348.


On the interaction between implicit statistical learning and the alternation advantage: Evidence from manual and oculomotor serial reaction time tasks.

Compostella A, Tagliani M, Vender M, Delfitto D. PLoS One. 2025; 20(2):e0318638.

PMID: 39913613 PMC: 11801591. DOI: 10.1371/journal.pone.0318638.


Exploring short-term memory and listening effort in two-talker conversations: The influence of soft and moderate background noise.

Mohanathasan C, Ermert C, Fels J, Kuhlen T, Schlittmeier S. PLoS One. 2025; 20(2):e0318821.

PMID: 39913505 PMC: 11801578. DOI: 10.1371/journal.pone.0318821.


References
1. Haldane J. The estimation and significance of the logarithm of a ratio of frequencies. Ann Hum Genet. 1956; 20(4):309-11. DOI: 10.1111/j.1469-1809.1955.tb01285.x.

2. Lorch Jr R, Myers J. Regression analyses of repeated measures data in cognitive research. J Exp Psychol Learn Mem Cogn. 1990; 16(1):149-57. DOI: 10.1037//0278-7393.16.1.149.

3. Lindstrom M, Bates D. Nonlinear mixed effects models for repeated measures data. Biometrics. 1990; 46(3):673-87.

4. Gart J, Zweifel J. On the bias of various estimators of the logit and its variance with application to quantal bioassay. Biometrika. 1967; 54(1):181-7.