Individual Differences and the Neural Representations of Reward Expectation and Reward Prediction Error

Overview

Journal: Soc Cogn Affect Neurosci

Specialties: Neurology, Social Sciences

Date: 2007 Aug 22

PMID: 17710118

Citations: 38

Authors

Michael X Cohen

Abstract

Reward expectation and reward prediction errors are thought to be critical for dynamic adjustments in decision-making and reward-seeking behavior, but little is known about their representation in the brain during uncertainty and risk-taking. Furthermore, little is known about what role individual differences might play in such reinforcement processes. This study shows that behavioral and neural responses during a decision-making task can be characterized by a computational reinforcement learning model and that individual differences in the model's learning parameters are critical for elucidating these processes. In the fMRI experiment, subjects chose between high- and low-risk rewards. A computational reinforcement learning model computed the expected values and prediction errors that each subject might experience on each trial. These outputs predicted subjects' trial-to-trial choice strategies and neural activity in several limbic and prefrontal regions during the task. Individual differences in estimated reinforcement learning parameters proved critical for characterizing these processes: models that incorporated individual learning parameters explained significantly more variance in the fMRI data than did a model using fixed learning parameters. These findings suggest that the brain engages a reinforcement learning process during risk-taking and that individual differences play a crucial role in modeling this process.
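The abstract does not give the model's equations, so the following is only a minimal sketch of how trial-by-trial expected values and prediction errors could be generated, assuming a generic delta-rule (Rescorla-Wagner style) update with a single learning rate. The function name `simulate_learner`, the parameter `alpha`, and the example trial data are illustrative assumptions, not the study's actual model or data.

```python
def simulate_learner(choices, rewards, alpha, n_options=2, initial_value=0.0):
    """Run a generic delta-rule learner over a sequence of trials.

    Assumed (hypothetical) form, not taken from the paper:
        prediction_error = reward - expected_value
        expected_value  += alpha * prediction_error

    Returns two lists with one entry per trial: the expected value of the
    chosen option before feedback, and the prediction error after feedback.
    """
    values = [initial_value] * n_options
    expected_values, prediction_errors = [], []
    for choice, reward in zip(choices, rewards):
        expected = values[choice]          # expectation before the outcome
        error = reward - expected          # reward prediction error
        values[choice] = expected + alpha * error
        expected_values.append(expected)
        prediction_errors.append(error)
    return expected_values, prediction_errors


# Made-up trials: option 0 = low-risk, option 1 = high-risk (assumption).
choices = [0, 1, 1, 0, 1]
rewards = [1.0, 2.5, 0.0, 1.0, 2.5]

# The same trial history yields different regressors for different learners.
slow_ev, slow_pe = simulate_learner(choices, rewards, alpha=0.1)
fast_ev, fast_pe = simulate_learner(choices, rewards, alpha=0.7)
```

Under these assumptions, two subjects with identical choices and outcomes but different learning rates would produce different expected-value and prediction-error time courses, which is one way to see why per-subject parameters might fit neural data better than a single fixed value.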

Citing Articles

Roles and interplay of reinforcement-based and error-based processes during reaching and gait in neurotypical adults and individuals with Parkinson's disease.

Roth A, Buggeln J, Hoh J, Wood J, Sullivan S, Ngo T. PLoS Comput Biol. 2024; 20(10):e1012474.

PMID: 39401183 PMC: 11472932. DOI: 10.1371/journal.pcbi.1012474.

A novel technique for delineating the effect of variation in the learning rate on the neural correlates of reward prediction errors in model-based fMRI.

Chase H. Front Psychol. 2024; 14:1211528.

PMID: 38187436 PMC: 10768009. DOI: 10.3389/fpsyg.2023.1211528.

A neural and behavioral trade-off between value and uncertainty underlies exploratory decisions in normative anxiety.

Aberg K, Toren I, Paz R. Mol Psychiatry. 2021; 27(3):1573-1587.

PMID: 34725456 DOI: 10.1038/s41380-021-01363-z.

Reward and fictive prediction error signals in ventral striatum: asymmetry between factual and counterfactual processing.

Santo-Angles A, Fuentes-Claramonte P, Argila-Plaza I, Guardiola-Ripoll M, Almodovar-Paya C, Munuera J. Brain Struct Funct. 2021; 226(5):1553-1569.

PMID: 33839955 DOI: 10.1007/s00429-021-02270-3.

The Prisoner's Dilemma paradigm provides a neurobiological framework for the social decision cascade.

Thompson K, Nahmias E, Fani N, Kvaran T, Turner J, Tone E. PLoS One. 2021; 16(3):e0248006.

PMID: 33735226 PMC: 7971531. DOI: 10.1371/journal.pone.0248006.
