
Reward Bases: A Simple Mechanism for Adaptive Acquisition of Multiple Reward Types

Overview
Specialty: Biology
Date: 2024 Nov 19
PMID: 39561186
Abstract

Animals can adapt their preferences for different types of reward according to physiological state, such as hunger or thirst. To explain this ability, we employ a simple multi-objective reinforcement learning model that learns separate values for different reward dimensions, such as food or water. We show that by weighting these learned values according to current needs, behaviour can be flexibly adapted to present preferences. This model predicts that individual dopamine neurons should encode the errors associated with some reward dimensions more than with others. To provide a preliminary test of this prediction, we reanalysed a small dataset obtained from a single primate in an experiment that, to our knowledge, is the only published study in which the responses of dopamine neurons to stimuli predicting distinct types of rewards were recorded. We observed that, in addition to subjective economic value, dopamine neurons encode a gradient of reward dimensions; some neurons respond most to stimuli predicting food rewards, while others respond more to stimuli predicting fluids. We also propose a possible implementation of the model in the basal ganglia network and demonstrate how the striatal system can learn values in multiple dimensions even when dopamine neurons encode mixtures of prediction errors from different dimensions. Additionally, the model reproduces the instant generalisation to new physiological states seen in dopamine responses and in behaviour. Our results demonstrate how a simple neural circuit can flexibly guide behaviour according to animals' needs.
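The core mechanism described in the abstract lends itself to a compact illustration. The sketch below assumes a tabular TD-learning setting; the toy task, function names (td_update, combined_value), and parameter values are illustrative assumptions rather than the authors' implementation. Each reward dimension maintains its own value function, updated by its own prediction error, and behaviour is guided by a need-weighted sum of the learned values.

```python
import numpy as np

# A minimal sketch of the reward-bases idea in a tabular TD-learning
# setting. The toy task, names, and parameters are illustrative
# assumptions, not the authors' implementation.

n_states, n_dims = 3, 2        # reward dimensions: 0 = food, 1 = water
alpha, gamma = 0.1, 0.9        # learning rate and discount factor
V = np.zeros((n_dims, n_states))  # one learned value function per dimension

def td_update(s, s_next, rewards):
    """Update every dimension's value with its own prediction error."""
    for i in range(n_dims):
        # dimension-specific TD error: delta_i = r_i + gamma * V_i(s') - V_i(s)
        delta = rewards[i] + gamma * V[i, s_next] - V[i, s]
        V[i, s] += alpha * delta

def combined_value(s, needs):
    """Weight the learned values by current physiological needs.
    Changing `needs` re-weights preferences instantly, with no relearning."""
    return float(needs @ V[:, s])

# Toy task: cue state 1 is followed by food, cue state 2 by water;
# state 0 is a neutral terminal state.
for _ in range(500):
    td_update(1, 0, rewards=[1.0, 0.0])   # food after cue 1
    td_update(2, 0, rewards=[0.0, 1.0])   # water after cue 2

hungry  = np.array([1.0, 0.2])            # need weights when food-deprived
thirsty = np.array([0.2, 1.0])            # need weights when water-deprived
print(combined_value(1, hungry)  > combined_value(2, hungry))   # True: food cue preferred
print(combined_value(1, thirsty) < combined_value(2, thirsty))  # True: water cue preferred
```

Because the per-dimension values are learned independently of the weighting, changing the needs vector re-ranks the options immediately, mirroring the instant generalisation to new physiological states described above.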

Citing Articles

The curious case of dopaminergic prediction errors and learning associative information beyond value.

Kahnt T, Schoenbaum G. Nat Rev Neurosci. 2025; 26(3):169-178.

PMID: 39779974. DOI: 10.1038/s41583-024-00898-8.


Dopaminergic responses to identity prediction errors depend differently on the orbitofrontal cortex and hippocampus.

Takahashi Y, Zhang Z, Kahnt T, Schoenbaum G. bioRxiv. 2025.

PMID: 39763911. PMC: 11702580. DOI: 10.1101/2024.12.11.628003.
