Learning Dynamics of Deep Linear Networks with Multiple Pathways

Overview
Date 2024 Jan 30
PMID 38288081
Abstract

Not only have deep networks become standard in machine learning, but they are also increasingly of interest in neuroscience as models of cortical computation that capture relationships between structural and functional properties. In addition, they are a useful target of theoretical research into the properties of network computation. Deep networks typically have a serial or approximately serial organization across layers, and this organization is often mirrored in models that purport to represent computation in mammalian brains. There are, however, multiple examples of parallel pathways in mammalian brains. In some cases, such as the mouse, the entire visual system appears arranged in a largely parallel, rather than serial, fashion. While these pathways may be formed by differing cost functions that drive different computations, here we present a new mathematical analysis of learning dynamics in networks that have parallel computational pathways driven by the same cost function. We use the approximation of deep linear networks with large hidden layer sizes to show that, as the depth of the parallel pathways increases, different features of the training set (defined by the singular values of the input-output correlation) will typically concentrate in one of the pathways. This result is derived analytically and demonstrated with numerical simulations of both linear and non-linear networks. Thus, rather than sharing stimulus and task features across multiple pathways, parallel network architectures learn to produce sharply diversified representations with specialized and specific pathways, a mechanism that may hold important consequences for codes in both biological and artificial systems.
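
To make the concentration result concrete, below is a minimal numpy sketch (not the authors' code) of the setting the abstract describes: two parallel deep linear pathways trained by gradient descent on a shared squared-error cost, followed by a projection of each pathway's end-to-end map onto the singular modes of the input-output correlation Sigma_yx. The whitened-input assumption (Sigma_xx = I), layer sizes, pathway depth, initialization scale, learning rate, and step count are all illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out, n_hidden = 8, 8, 64
depth, n_paths = 3, 2

# With whitened inputs (Sigma_xx = I) the squared-error loss depends only on
# the input-output correlation Sigma_yx; its singular modes are the
# "features of the training set" referred to in the abstract.
Sigma_yx = rng.standard_normal((n_out, n_in))

def prod(mats):
    """End-to-end map of a list of layer matrices (input-to-output order)."""
    out = mats[0]
    for m in mats[1:]:
        out = m @ out
    return out

# Small random initialization keeps each pathway near the origin, the regime
# in which the deep linear learning dynamics are analytically tractable.
paths = [[0.05 * rng.standard_normal(
              (n_out if l == depth - 1 else n_hidden,
               n_in if l == 0 else n_hidden))
          for l in range(depth)]
         for _ in range(n_paths)]

lr = 2e-3
for _ in range(50_000):
    # Gradient descent on L = 0.5 * ||sum_p prod(W_p) - Sigma_yx||_F^2,
    # i.e. both pathways are driven by the same cost function.
    err = sum(prod(W) for W in paths) - Sigma_yx
    for W in paths:
        for l in range(depth):
            pre = prod(W[:l]) if l > 0 else np.eye(n_in)
            post = prod(W[l + 1:]) if l < depth - 1 else np.eye(n_out)
            W[l] -= lr * post.T @ err @ pre.T

# Project each pathway's end-to-end map onto the singular modes of Sigma_yx
# to see how much of each mode each pathway carries.
U, S, Vt = np.linalg.svd(Sigma_yx)
for p, W in enumerate(paths):
    strengths = np.diag(U.T @ prod(W) @ Vt.T)
    print(f"pathway {p} mode strengths:", np.round(strengths, 2))
print("target singular values:   ", np.round(S, 2))
```

Under these assumptions, the printed per-pathway mode strengths typically show each singular value of Sigma_yx carried almost entirely by a single pathway rather than split between them, which is the sharp diversification the abstract reports.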

Citing Articles

Flexible task abstractions emerge in linear networks with fast and bounded units.

Sandbrink K, Bauer J, Proca A, Saxe A, Summerfield C, Hummos A. ArXiv. 2025.

PMID: 39876939. PMC: 11774440.


Transition to chaos separates learning regimes and relates to measure of consciousness in recurrent neural networks.

Mastrovito D, Liu Y, Kusmierz L, Shea-Brown E, Koch C, Mihalas S. bioRxiv. 2024.

PMID: 38798582. PMC: 11118502. DOI: 10.1101/2024.05.15.594236.
