
Monkeys Can Identify Pictures from Words

Overview
Journal: PLoS One
Date: 2025 Feb 12
PMID: 39937708
Abstract

Humans learn and incorporate cross-modal associations between auditory and visual objects (e.g., between a spoken word and a picture) into language. However, whether nonhuman primates can learn cross-modal associations between words and pictures remains uncertain. We trained two rhesus macaques in a delayed cross-modal match-to-sample task to determine whether they could learn associations between sounds and pictures of different types. In each trial, the monkeys listened to a brief sound (e.g., a monkey vocalization or a human word) and retained information about the sound to match it with one of 2 to 4 pictures presented on a touchscreen after a 3-second delay. We found that the monkeys learned more than a dozen associations and performed proficiently on them. In addition, to test their ability to generalize, we exposed them to sounds uttered by different individuals. We found that their hit rate remained high but became more variable, suggesting that they perceived the new sounds as equivalent, though not identical. We conclude that rhesus monkeys can learn cross-modal associations between objects of different types, retain information in working memory, and generalize the learned associations to new objects. These findings position rhesus monkeys as an ideal model for future research on the brain pathways of cross-modal associations between auditory and visual objects.
