
Dopamine Transients Encode Reward Prediction Errors Independent of Learning Rates

Overview
Journal bioRxiv
Date 2024 Apr 25
PMID 38659861
Abstract

Biological accounts of reinforcement learning posit that dopamine encodes reward prediction errors (RPEs), which are multiplied by a learning rate to update state or action values. These values are thought to be represented in synaptic weights in the striatum, and updated by dopamine-dependent plasticity, suggesting that dopamine release might reflect the product of the learning rate and RPE. Here, we leveraged the fact that animals learn faster in volatile environments to characterize dopamine encoding of learning rates in the nucleus accumbens core (NAcc). We trained rats on a task with semi-observable states offering different rewards, and rats adjusted how quickly they initiated trials across states using RPEs. Computational modeling and behavioral analyses showed that learning rates were higher following state transitions, and scaled with trial-by-trial changes in beliefs about hidden states, approximating normative Bayesian strategies. Notably, dopamine release in the NAcc encoded RPEs independent of learning rates, suggesting that dopamine-independent mechanisms instantiate dynamic learning rates.
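The update described in the first sentence of the abstract, an RPE multiplied by a learning rate, can be illustrated with a generic delta rule. The following is a minimal sketch under stated assumptions, not the authors' computational model: the function names, the scalar value estimate, and the per-trial learning rates are all hypothetical and chosen only for illustration.

```python
def delta_rule_update(value, reward, learning_rate):
    """One delta-rule step (illustrative; not the paper's model).

    Returns the updated value estimate and the reward prediction error.
    """
    rpe = reward - value                      # reward prediction error (RPE)
    new_value = value + learning_rate * rpe   # update scaled by the learning rate
    return new_value, rpe


def run_trials(rewards, learning_rates, initial_value=0.0):
    """Apply the delta rule across trials with a per-trial learning rate.

    A per-trial learning rate is an assumption standing in for a "dynamic"
    learning rate, e.g. one that is higher after a state transition.
    """
    value = initial_value
    history = []
    for reward, alpha in zip(rewards, learning_rates):
        value, rpe = delta_rule_update(value, reward, alpha)
        history.append((rpe, value))          # (RPE on this trial, value after update)
    return history
```

In this sketch the RPE (`rpe`) and the size of the value update (`learning_rate * rpe`) are separate quantities: the same RPE yields a larger update when the learning rate is higher. The abstract's claim is that NAcc dopamine release tracked the former rather than the latter.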
