How Cortico-basal Ganglia-thalamic Subnetworks Can Shift Decision Policies to Maximize Reward Rate

Overview

Journal bioRxiv

Date 2024 Jun 3

PMID 38826315

Authors

Jyotika Bahuguna

Timothy Verstynen

Jonathan E Rubin


Abstract

All mammals exhibit flexible decision policies that depend, at least in part, on the cortico-basal ganglia-thalamic (CBGT) pathways. Yet understanding how the complex connectivity, dynamics, and plasticity of CBGT circuits translate into experience-dependent shifts of decision policies represents a longstanding challenge in neuroscience. Here we present the results of a computational approach to address this problem. Specifically, we simulated decisions driven by CBGT circuits under baseline, unrewarded conditions using a spiking neural network, and fit an evidence accumulation model to the resulting behavior. Using canonical correlation analysis, we then replicated the identification of three control ensembles (responsiveness, pliancy, and choice) within CBGT circuits, with each of these subnetworks mapping to a specific configuration of the evidence accumulation process. We subsequently simulated learning in a simple two-choice task with one optimal (i.e., rewarded) target and found that feedback-driven dopaminergic plasticity on cortico-striatal synapses effectively manages the speed-accuracy tradeoff so as to increase reward rate over time. The learning-related changes in the decision policy can be decomposed in terms of the contributions of each control ensemble, whose influence is driven by sequential reward prediction errors on individual trials. Our results provide a clear and simple mechanism for how dopaminergic plasticity shifts subnetworks within CBGT circuits so as to maximize reward rate by strategically modulating how evidence is used to drive decisions.
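The link the abstract draws between an evidence accumulation process, the speed-accuracy tradeoff, and reward rate can be illustrated with a minimal sketch. It assumes a generic two-boundary drift-diffusion process, which is one common form of evidence accumulation model; it is not the authors' spiking network or their fitted model. All function names, parameter names, and default values (`drift`, `boundary`, `onset_time`, `inter_trial_interval`, and so on) are hypothetical choices made for illustration.

```python
import random


def simulate_ddm_trial(drift, boundary, rng, dt=0.002, noise=1.0,
                       onset_time=0.2, max_time=5.0):
    """Simulate one two-choice trial of a generic drift-diffusion process.

    Evidence starts at 0 and accumulates until it reaches +boundary
    (the rewarded choice) or -boundary (the unrewarded choice).
    Returns (rewarded, reaction_time_in_seconds).
    """
    evidence = 0.0
    elapsed = 0.0
    step_sd = noise * dt ** 0.5  # standard deviation of one noisy increment
    while abs(evidence) < boundary and elapsed < max_time:
        evidence += drift * dt + step_sd * rng.gauss(0.0, 1.0)
        elapsed += dt
    # A trial that times out without reaching +boundary counts as unrewarded.
    rewarded = evidence >= boundary
    return rewarded, onset_time + elapsed


def reward_rate(drift, boundary, n_trials=300, seed=0,
                inter_trial_interval=1.0):
    """Estimate rewards earned per second over a block of simulated trials."""
    rng = random.Random(seed)
    n_rewarded = 0
    total_time = 0.0
    for _ in range(n_trials):
        rewarded, reaction_time = simulate_ddm_trial(drift, boundary, rng)
        n_rewarded += int(rewarded)
        total_time += reaction_time + inter_trial_interval
    return n_rewarded / total_time


if __name__ == "__main__":
    # Illustrative settings only: a weak versus a strong drift toward the
    # rewarded target, with the same decision boundary.
    for drift in (0.2, 2.0):
        print(f"drift={drift}: reward rate = {reward_rate(drift, 1.0):.3f} per s")
```

In this sketch, a larger drift toward the rewarded target makes decisions both faster and more accurate, so the estimated reward rate rises. Changing `boundary` instead trades speed against accuracy, which is the kind of policy adjustment the abstract attributes to learning.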
