Statistical Properties of Infant-directed Versus Adult-directed Speech: Insights from Speech Recognition
Previous studies have shown that infant-directed speech ('motherese') exhibits overemphasized acoustic properties which may facilitate the acquisition of phonetic categories by infant learners. It has been suggested that the use of infant-directed data for training automatic speech recognition systems might also enhance the automatic learning and discrimination of phonetic categories. This study investigates the properties of infant-directed vs. adult-directed speech from the point of view of the statistical pattern recognition paradigm underlying automatic speech recognition. Isolated-word speech recognizers were trained on adult-directed vs. infant-directed data sets and were tested on both matched and mismatched data. Results show that recognizers trained on infant-directed speech did not always exhibit better recognition performance; however, their relative loss in performance on mismatched data was significantly less severe than that of recognizers trained on adult-directed speech and presented with infant-directed test data. An analysis of the statistical distributions of a subset of phonetic classes in both data sets showed that this pattern is caused by larger class overlaps in infant-directed speech. This finding has implications for both automatic speech recognition and theories of infant speech perception.
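The abstract attributes the result to larger class overlaps in infant-directed speech but does not say how overlap was measured. As an illustration only, the sketch below uses the Bhattacharyya coefficient between two univariate Gaussian class models, which is one common overlap measure and not necessarily the one used in the study. The function name and all numeric values are hypothetical and are not taken from the study's data.

```python
import math


def bhattacharyya_coefficient(mu1, sigma1, mu2, sigma2):
    """Overlap between two 1-D Gaussians: 1.0 for identical, near 0.0 for well separated.

    Assumption: each phonetic class is summarized by a single Gaussian over one
    acoustic feature. This is a simplification chosen for illustration.
    """
    var_sum = sigma1 ** 2 + sigma2 ** 2
    # Bhattacharyya distance for univariate Gaussians:
    # a mean-separation term plus a variance-mismatch term.
    distance = (
        0.25 * (mu1 - mu2) ** 2 / var_sum
        + 0.5 * math.log(var_sum / (2.0 * sigma1 * sigma2))
    )
    return math.exp(-distance)


# Hypothetical numbers: the same distance between class means, different spread.
narrow_overlap = bhattacharyya_coefficient(0.0, 0.5, 2.0, 0.5)  # tight classes
wide_overlap = bhattacharyya_coefficient(0.0, 1.0, 2.0, 1.0)    # broad classes

print(f"narrow classes: {narrow_overlap:.3f}")
print(f"wide classes:   {wide_overlap:.3f}")
```

With the class means held fixed, the broader classes yield the higher coefficient. That is the sense in which more variable speech can show larger class overlap even when the class centers are equally far apart.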