Word Categorization from Distributional Information: Frames Confer More Than the Sum of Their (Bigram) Parts
Grammatical categories, such as noun and verb, are the building blocks of syntactic structure and the components that govern the grammatical patterns of language. However, in many languages words are not explicitly marked with their category information; hence, a critical part of acquiring a language is categorizing the words. Computational analyses of child-directed speech have shown that distributional information (information about how words pattern with one another in sentences) could be a useful source of initial category information. Yet questions remain as to whether learners use this kind of information, and if so, what kinds of distributional patterns facilitate categorization. In this paper we investigated how adults exposed to an artificial language use distributional information to categorize words. We compared training situations in which target words occurred in frames (i.e., surrounded by two words that frequently co-occur) against situations in which target words occurred in simpler bigram contexts (where an immediately adjacent word provides the context for categorization). We found that learners categorized words together when they occurred in similar frame contexts, but not when they occurred in similar bigram contexts. These findings are particularly relevant because they accord with computational investigations showing that frame contexts provide accurate category information cross-linguistically. We discuss these findings in the context of prior research on distribution-based categorization and the broader implications for the role of distributional categorization in language acquisition.
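To make the contrast between the two context types concrete, the following is a minimal sketch of how frame and bigram contexts might be collected from a corpus. It is an illustration only, not the procedure used in the study: the function names, the English toy corpus, and the choice of the preceding word as the bigram context are all assumptions, and the sketch omits the requirement that the two words of a frame co-occur frequently.

```python
from collections import defaultdict


def bigram_contexts(sentences):
    """Group target words by the single word that immediately precedes them.

    Assumption: the preceding word is used as the bigram context; a
    following-word variant would be equally valid for illustration.
    """
    contexts = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.split()
        for prev, target in zip(tokens, tokens[1:]):
            contexts[prev].add(target)
    return dict(contexts)


def frame_contexts(sentences):
    """Group target words by the pair of words that surround them.

    Assumption: every (preceding, following) pair counts as a frame here;
    no frequency threshold is applied.
    """
    contexts = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.split()
        for prev, target, nxt in zip(tokens, tokens[1:], tokens[2:]):
            contexts[(prev, nxt)].add(target)
    return dict(contexts)


# Hypothetical toy corpus, not the artificial language from the experiments.
toy_corpus = [
    "you want it",
    "you see it",
    "you are happy",
    "the dog ran",
    "the cat ran",
]

bigrams = bigram_contexts(toy_corpus)
frames = frame_contexts(toy_corpus)

print(sorted(bigrams["you"]))
print(sorted(frames[("you", "it")]))
```

In this toy corpus the bigram context "you" groups "want", "see", and "are" together, whereas the frame ("you", "it") groups only "want" and "see". The frame imposes a constraint on both sides of the target word, so it yields a narrower grouping than the single adjacent word does.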