A Self-supervised Domain-general Learning Framework for Human Ventral Stream Representation

Overview

Journal: Nat Commun

Specialty: Biology

Date: 2022 Jan 26

PMID: 35078981

Authors

Talia Konkle

George A Alvarez

Abstract

Anterior regions of the ventral visual stream encode substantial information about object categories. Are top-down category-level forces critical for arriving at this representation, or can this representation be formed purely through domain-general learning of natural image structure? Here we present a fully self-supervised model which learns to represent individual images, rather than categories, such that views of the same image are embedded nearby in a low-dimensional feature space, distinctly from other recently encountered views. We find that category information implicitly emerges in the local similarity structure of this feature space. Further, these models learn hierarchical features which capture the structure of brain responses across the human ventral visual stream, on par with category-supervised models. These results provide computational support for a domain-general framework guiding the formation of visual representation, where the proximate goal is not explicitly about category information, but is instead to learn unique, compressed descriptions of the visual world.
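The learning objective described in the abstract (views of the same image embedded nearby, distinct from other recently encountered views) can be illustrated with a generic instance-level contrastive loss. The following is only a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names, the InfoNCE-style formulation, the temperature of 0.07, and the synthetic embeddings are all illustrative choices that do not come from the source.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def instance_contrastive_loss(view_a, view_b, recent_views, temperature=0.07):
    """InfoNCE-style loss (an assumed, generic formulation).

    Row i of view_a should match row i of view_b (another view of the same
    image) rather than any row of recent_views (embeddings of other recently
    encountered views).

    view_a, view_b: (n, d) arrays; recent_views: (m, d) array.
    Returns the mean loss over the n images.
    """
    a = l2_normalize(view_a)
    b = l2_normalize(view_b)
    q = l2_normalize(recent_views)
    pos = np.sum(a * b, axis=1, keepdims=True) / temperature  # (n, 1)
    neg = a @ q.T / temperature                                # (n, m)
    logits = np.concatenate([pos, neg], axis=1)                # (n, 1 + m)
    # Numerically stable log-softmax; the positive pair sits in column 0.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob_pos = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob_pos.mean())

# Synthetic stand-ins for network outputs (hypothetical data, not real images).
rng = np.random.default_rng(0)
images = rng.normal(size=(8, 16))
view_a = images + 0.05 * rng.normal(size=images.shape)
view_b = images + 0.05 * rng.normal(size=images.shape)
recent = rng.normal(size=(32, 16))

aligned_loss = instance_contrastive_loss(view_a, view_b, recent)
# Reversing the rows pairs every view with a view of a different image.
shuffled_loss = instance_contrastive_loss(view_a, view_b[::-1], recent)
print(aligned_loss, shuffled_loss)
```

In this toy setup the loss is low when paired views come from the same image and higher when the pairing is broken. No category labels enter the objective at any point, which is the sense in which the abstract calls the framework self-supervised and domain-general.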

Citing Articles

Individual variation in the functional lateralization of human ventral temporal cortex: Local competition and long-range coupling.

Blauch N, Plaut D, Vin R, Behrmann M. Imaging Neurosci (Camb). 2025; 3.

PMID: 40078535 PMC: 11894816. DOI: 10.1162/imag_a_00488.

How Can Deep Neural Networks Inform Theory in Psychological Science?

McGrath S, Russin J, Pavlick E, Feiman R. Curr Dir Psychol Sci. 2025; 33(5):325-333.

PMID: 39949337 PMC: 11824574. DOI: 10.1177/09637214241268098.

Human-like face pareidolia emerges in deep neural networks optimized for face and object recognition.

Gupta P, Dobs K. PLoS Comput Biol. 2025; 21(1):e1012751.

PMID: 39869654 PMC: 11790231. DOI: 10.1371/journal.pcbi.1012751.

A computational deep learning investigation of animacy perception in the human brain.

Duyck S, Costantino A, Bracci S, Op de Beeck H. Commun Biol. 2024; 7(1):1718.

PMID: 39741161 PMC: 11688457. DOI: 10.1038/s42003-024-07415-8.

Teaching deep networks to see shape: Lessons from a simplified visual world.

Jarvers C, Neumann H. PLoS Comput Biol. 2024; 20(11):e1012019.

PMID: 39527647 PMC: 11581402. DOI: 10.1371/journal.pcbi.1012019.
