Computational Models of Auditory Scene Analysis: A Review
Overview
Auditory scene analysis (ASA) refers to the process(es) of parsing the complex acoustic input into auditory perceptual objects representing either physical sources or temporal sound patterns, such as melodies, which contributed to the sound waves reaching the ears. A number of new computational models accounting for some of the perceptual phenomena of ASA have been published recently. Here we provide a theoretically motivated review of these computational models, aiming to relate their guiding principles to the central issues of the theoretical framework of ASA. Specifically, we ask how they achieve the grouping and separation of sound elements and whether they implement some form of competition between alternative interpretations of the sound input. We also consider the extent to which they include predictive processes, as important current theories suggest that perception is inherently predictive, and how they have been evaluated. We conclude that current computational models of ASA are fragmentary in the sense that, rather than providing general competing interpretations of ASA, they focus on assessing the utility of specific processes (or algorithms) for finding the causes of the complex acoustic signal. This leaves open the possibility of integrating complementary aspects of the models into a more comprehensive theory of ASA.