Dataset Size Considerations for Robust Acoustic and Phonetic Speech Encoding Models in EEG

Overview
Specialty Neurology
Date 2023 Feb 6
PMID 36741776
Abstract

In many experiments that investigate auditory and speech processing in the brain using electroencephalography (EEG), the experimental paradigm is lengthy and tedious. Typically, the experimenter errs on the side of including more data and more trials, and therefore conducts a longer task, to ensure that the data are robust and effects are measurable. Recent studies have used naturalistic stimuli to investigate the brain's response to individual speech features, or to combinations of them, using system identification techniques such as multivariate temporal receptive field (mTRF) analyses. The neural data collected from such experiments must be divided into a training set and a test set to fit and validate the mTRF weights. While a good strategy is clearly to collect as much data as is feasible, it is unclear how much data are needed to achieve stable results. Furthermore, it is unclear whether the specific stimulus used for mTRF fitting and the choice of feature representation affect how much data are required for robust and generalizable results. Here, we used previously collected EEG data from our lab using sentence stimuli and movie stimuli, as well as EEG data from an open-source dataset using audiobook stimuli, to better understand how much data need to be collected for naturalistic speech experiments measuring acoustic and phonetic tuning. We found that the EEG receptive field structure tested here stabilizes after collecting a training dataset of approximately 200 s of TIMIT sentences, approximately 600 s of movie trailers, or approximately 460 s of audiobook data. Thus, we provide suggestions on the minimum amount of data that would be necessary for fitting mTRFs from naturalistic listening data. Our findings are motivated by highly practical concerns when working with children, patient populations, or others who may not tolerate long study sessions. These findings will aid future researchers who wish to study naturalistic speech processing in healthy and clinical populations while minimizing participant fatigue and retaining signal quality.
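
The question the abstract poses (how test-set performance changes as the training set grows) can be illustrated with a minimal sketch. The code below is not the authors' pipeline: it assumes a plain time-lagged ridge regression as the encoding model, a single EEG channel, and synthetic data. The function names, the sampling rate, the lag count, the ridge parameter `alpha`, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np


def lag_matrix(stim, n_lags):
    """Stack delayed copies of stim (time x features) into a design matrix.

    Column block k holds the stimulus delayed by k samples, so each row
    contains the current sample and the n_lags - 1 samples before it.
    """
    n_t, n_f = stim.shape
    X = np.zeros((n_t, n_f * n_lags))
    for k in range(n_lags):
        X[k:, k * n_f:(k + 1) * n_f] = stim[:n_t - k]
    return X


def fit_trf(X, y, alpha):
    """Ridge solution w = (X'X + alpha * I)^-1 X'y."""
    n_cols = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_cols), X.T @ y)


def score_by_training_duration(stim, eeg, fs, durations_s, n_lags, alpha, test_s):
    """Fit on the first d seconds, score on a fixed held-out final segment.

    Returns {duration_in_seconds: Pearson r between predicted and actual EEG}.
    """
    X = lag_matrix(stim, n_lags)
    n_test = int(test_s * fs)
    X_test, y_test = X[-n_test:], eeg[-n_test:]
    scores = {}
    for d in durations_s:
        n_train = int(d * fs)
        if n_train + n_test > len(eeg):
            raise ValueError("training and test segments would overlap")
        w = fit_trf(X[:n_train], eeg[:n_train], alpha)
        scores[d] = np.corrcoef(X_test @ w, y_test)[0, 1]
    return scores


# Synthetic demonstration (all values below are assumptions, not study data).
rng = np.random.default_rng(0)
fs = 64                                     # sampling rate in Hz
n_lags = 16                                 # 250 ms of lags at 64 Hz
stim = rng.standard_normal((fs * 150, 1))   # 150 s of a one-feature stimulus
true_w = np.hanning(n_lags)                 # made-up "true" response kernel
eeg = lag_matrix(stim, n_lags) @ true_w + 5.0 * rng.standard_normal(len(stim))

scores = score_by_training_duration(
    stim, eeg, fs, durations_s=[5, 20, 80], n_lags=n_lags, alpha=1.0, test_s=30
)
for d, r in scores.items():
    print(f"train {d:>3d} s -> test r = {r:.3f}")
```

Holding the test segment fixed while only the training duration varies keeps the scores comparable across durations; a stability analysis would then look for the duration beyond which the score (or the fitted weights) stops changing appreciably.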

Citing Articles

Neural tracking of natural speech: an effective marker for post-stroke aphasia.

De Clercq P, Kries J, Mehraram R, Vanthornhout J, Francart T, Vandermosten M. Brain Commun. 2025; 7(2):fcaf095.

PMID: 40066108 PMC: 11891514. DOI: 10.1093/braincomms/fcaf095.


A comparison of EEG encoding models using audiovisual stimuli and their unimodal counterparts.

Desai M, Field A, Hamilton L. PLoS Comput Biol. 2024; 20(9):e1012433.

PMID: 39250485 PMC: 11412666. DOI: 10.1371/journal.pcbi.1012433.
