
The English Sublexical Toolkit: Methods for Indexing Sound-spelling Consistency

Overview
Publisher Springer
Specialty Social Sciences
Date 2024 Apr 9
PMID 38594441
Abstract

This work introduces the English Sublexical Toolkit, a suite of tools that uses an experience-dependent learning framework of sublexical knowledge to extract regularities from the English lexicon. The Toolkit quantifies the empirical regularity of sublexical units in both the reading and spelling directions (i.e., grapheme-to-phoneme and phoneme-to-grapheme) and at multiple grain sizes (i.e., phoneme/grapheme and onset/rime unit size). It can extract multiple experience-dependent regularity indices for words or pseudowords, including both frequency indices (e.g., grapheme frequency) and conditional probability indices (e.g., grapheme-to-phoneme probability). These tools provide (1) estimates of the regularities that better reflect the complexity of the sublexical system than previously published indices and (2) novel indices of sublexical units such as phonographeme frequency (i.e., the frequency of combined phoneme-grapheme units, which are independent of processing direction). We demonstrate that measures from the Toolkit explain significant amounts of variance in empirical data (naming of real words and lexical decision), and either outperform or are comparable to the best available consistency measures. The flexibility of the Toolkit is further demonstrated by its ability to readily index the probability of different pseudoword pronunciations, and we report that the measures account for the majority of variance in these empirically observed probabilities. Overall, this work provides a framework and resources that can be flexibly used to identify optimal corpus-based consistency measures that help explain reading and spelling behaviors for both real words and pseudowords.
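To make the distinction between frequency indices and conditional probability indices concrete, the following is a minimal sketch in Python. It is not the Toolkit's implementation: the four-word lexicon, the grapheme-phoneme alignments, the phoneme symbols, and the function names (`count_units`, `g2p_probability`, `p2g_probability`) are all hypothetical, and the counts are simple type counts with no weighting by word frequency or position.

```python
from collections import Counter

# Hypothetical toy lexicon: each word is a list of (grapheme, phoneme) pairs.
# Alignments and phoneme symbols are illustrative assumptions, not Toolkit data.
TOY_LEXICON = {
    "beat":  [("b", "b"), ("ea", "i"), ("t", "t")],
    "bread": [("b", "b"), ("r", "r"), ("ea", "E"), ("d", "d")],
    "team":  [("t", "t"), ("ea", "i"), ("m", "m")],
    "bed":   [("b", "b"), ("e", "E"), ("d", "d")],
}


def count_units(lexicon):
    """Count graphemes, phonemes, and grapheme-phoneme pairs (type counts)."""
    pair_counts = Counter()      # direction-independent pair ("phonographeme") counts
    grapheme_counts = Counter()  # grapheme frequency
    phoneme_counts = Counter()   # phoneme frequency
    for pairs in lexicon.values():
        for grapheme, phoneme in pairs:
            pair_counts[(grapheme, phoneme)] += 1
            grapheme_counts[grapheme] += 1
            phoneme_counts[phoneme] += 1
    return pair_counts, grapheme_counts, phoneme_counts


def g2p_probability(grapheme, phoneme, pair_counts, grapheme_counts):
    """Reading direction: P(phoneme | grapheme)."""
    total = grapheme_counts[grapheme]
    return pair_counts[(grapheme, phoneme)] / total if total else 0.0


def p2g_probability(grapheme, phoneme, pair_counts, phoneme_counts):
    """Spelling direction: P(grapheme | phoneme)."""
    total = phoneme_counts[phoneme]
    return pair_counts[(grapheme, phoneme)] / total if total else 0.0


if __name__ == "__main__":
    pairs, graphemes, phonemes = count_units(TOY_LEXICON)
    print("frequency of grapheme 'ea':", graphemes["ea"])
    print("frequency of pair ('ea', 'i'):", pairs[("ea", "i")])
    print("P(i | ea):", g2p_probability("ea", "i", pairs, graphemes))
    print("P(ea | E):", p2g_probability("ea", "E", pairs, phonemes))
```

In this toy lexicon the grapheme "ea" maps to two different phonemes, so its grapheme-to-phoneme probability is below 1 in the reading direction. The pair count itself is the same whichever direction is considered, which is the sense in which a combined phoneme-grapheme unit is independent of processing direction.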
