Reconstructing the Spectrotemporal Modulations of Real-life Sounds from fMRI Response Patterns

Overview

Journal Proc Natl Acad Sci U S A

Specialty Science

Date 2017 Apr 20

PMID 28420788

Citations 45

Authors

Roberta Santoro

Michelle Moerel

Federico De Martino

Giancarlo Valente

Kamil Ugurbil

Essa Yacoub

Elia Formisano

Abstract

Ethological views of brain functioning suggest that sound representations and computations in the auditory neural system are optimized finely to process and discriminate behaviorally relevant acoustic features and sounds (e.g., spectrotemporal modulations in the songs of zebra finches). Here, we show that modeling of neural sound representations in terms of frequency-specific spectrotemporal modulations enables accurate and specific reconstruction of real-life sounds from high-resolution functional magnetic resonance imaging (fMRI) response patterns in the human auditory cortex. Region-based analyses indicated that response patterns in separate portions of the auditory cortex are informative of distinctive sets of spectrotemporal modulations. Most relevantly, results revealed that in early auditory regions, and progressively more in surrounding regions, temporal modulations in a range relevant for speech analysis (∼2-4 Hz) were reconstructed more faithfully than other temporal modulations. In early auditory regions, this effect was frequency-dependent and only present for lower frequencies (<∼2 kHz), whereas for higher frequencies, reconstruction accuracy was higher for faster temporal modulations. Further analyses suggested that auditory cortical processing optimized for the fine-grained discrimination of speech and vocal sounds underlies this enhanced reconstruction accuracy. In sum, the present study introduces an approach to embed models of neural sound representations in the analysis of fMRI response patterns. Furthermore, it reveals that, in the human brain, even general purpose and fundamental neural processing mechanisms are shaped by the physical features of real-world stimuli that are most relevant for behavior (i.e., speech, voice).

Citing Articles

Ultra high density imaging arrays in diffuse optical tomography for human brain mapping improve image quality and decoding performance.

Markow Z, Trobaugh J, Richter E, Tripathy K, Rafferty S, Svoboda A. Sci Rep. 2025; 15(1):3175.

PMID: 39863633 PMC: 11762274. DOI: 10.1038/s41598-025-85858-7.

Speech prosody enhances the neural processing of syntax.

Degano G, Donhauser P, Gwilliams L, Merlo P, Golestani N. Commun Biol. 2024; 7(1):748.

PMID: 38902370 PMC: 11190187. DOI: 10.1038/s42003-024-06444-7.

A hierarchy of processing complexity and timescales for natural sounds in human auditory cortex.

Rupp K, Hect J, Harford E, Holt L, Ghuman A, Abel T. bioRxiv. 2024; .

PMID: 38826304 PMC: 11142240. DOI: 10.1101/2024.05.24.595822.

The human auditory system uses amplitude modulation to distinguish music from speech.

Chang A, Teng X, Assaneo M, Poeppel D. PLoS Biol. 2024; 22(5):e3002631.

PMID: 38805517 PMC: 11132470. DOI: 10.1371/journal.pbio.3002631.

Linguistic modulation of the neural encoding of phonemes.

Kim S, De Martino F, Overath T. Cereb Cortex. 2024; 34(4.

PMID: 38687241 PMC: 11059272. DOI: 10.1093/cercor/bhae155.
