
Policy Complexity Suppresses Dopamine Responses

Overview
Journal: bioRxiv
Date: 2024 Sep 30
PMID: 39345642
Abstract

Limits on information-processing capacity impose limits on task performance. We show that animals achieve performance on a perceptual decision task that is near-optimal given their capacity limits, as measured by policy complexity (the mutual information between states and actions). This behavioral profile could be achieved by reinforcement learning with a penalty on high-complexity policies, realized through modulation of dopaminergic learning signals. In support of this hypothesis, we find that policy complexity suppresses midbrain dopamine responses to reward outcomes, thereby reducing behavioral sensitivity to these outcomes. Our results suggest that policy compression shapes basic mechanisms of reinforcement learning in the brain.
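The abstract defines policy complexity as the mutual information between states and actions, I(S;A). As a minimal sketch (the function name, input format, and toy distributions below are illustrative, not from the paper), this quantity can be computed directly from a state distribution p(s) and a stochastic policy π(a|s):

```python
import math

def policy_complexity(p_state, policy):
    """Mutual information I(S;A) in bits between states and actions.

    p_state: list of state probabilities p(s)
    policy:  policy[s][a] = pi(a|s), one row of action probabilities per state
    """
    n_actions = len(policy[0])
    # Marginal action distribution: p(a) = sum_s p(s) * pi(a|s)
    p_action = [sum(p_state[s] * policy[s][a] for s in range(len(p_state)))
                for a in range(n_actions)]
    mi = 0.0
    for s, ps in enumerate(p_state):
        for a in range(n_actions):
            pa_s = policy[s][a]
            if ps > 0 and pa_s > 0:
                mi += ps * pa_s * math.log2(pa_s / p_action[a])
    return mi

# A state-agnostic policy carries zero complexity; a deterministic one-to-one
# mapping over two equiprobable states carries log2(2) = 1 bit.
uniform = policy_complexity([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])  # 0.0 bits
determ  = policy_complexity([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])  # 1.0 bit
```

A capacity-limited agent in this framework would trade expected reward against this quantity, preferring policies that reuse the same actions across states when the reward cost of doing so is small.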
