
Policy Adjustment in a Dynamic Economic Game

Overview
Journal PLoS One
Date 2006 Dec 22
PMID 17183636
Citations 29
Abstract

Making sequential decisions to harvest rewards is a notoriously difficult problem. One difficulty is that the real world is not stationary and the reward expected from a contemplated action may depend in complex ways on the history of an animal's choices. Previous functional neuroimaging work combined with principled models has detected brain responses that correlate with computations thought to guide simple learning and action choice. Those works generally employed instrumental conditioning tasks with fixed action-reward contingencies. For real-world learning problems, the history of reward-harvesting choices can change the likelihood of rewards collected by the same choices in the near-term future. We used functional MRI to probe brain and behavioral responses in a continuous decision-making task where reward contingency is a function of both a subject's immediate choice and his choice history. In these more complex tasks, we demonstrated that a simple actor-critic model can account for both the subjects' behavioral and brain responses, and identified a reward prediction error signal in ventral striatal structures active during these non-stationary decision tasks. However, a sudden introduction of new reward structures engages more complex control circuitry in the prefrontal cortex (inferior frontal gyrus and anterior insula) and is not captured by a simple actor-critic model. Taken together, these results extend our knowledge of reward-learning signals into more complex, history-dependent choice tasks. They also highlight the important interplay between striatum and prefrontal cortex as decision-makers respond to the strategic demands imposed by non-stationary reward environments more reminiscent of real-world tasks.
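The abstract names an actor-critic model and a reward prediction error but gives no equations, so the following is only a minimal sketch of that general technique, not the model fitted in the study. It assumes a two-option task with a single state, a critic that tracks expected reward, and an actor that updates softmax action preferences from the same error signal. The payoff rule `reward_from_history`, the window length, the learning rates, and every other name and constant are hypothetical and chosen purely for illustration.

```python
import math
import random


def prob_choose_a(pref_a, pref_b, temperature=1.0):
    """Softmax probability of choosing A given two action preferences."""
    return 1.0 / (1.0 + math.exp(-(pref_a - pref_b) / temperature))


def reward_from_history(choice, history, window=20):
    """Hypothetical payoff rule (an assumption, not the task from the study).

    The payoff of each option depends on the fraction of recent choices
    allocated to A, so past choices change what the same choice earns now.
    """
    recent = history[-window:]
    frac_a = sum(1 for c in recent if c == "A") / len(recent) if recent else 0.5
    if choice == "A":
        return 0.8 - 0.6 * frac_a   # A pays less the more it was chosen recently
    return 0.2 + 0.4 * frac_a       # B pays more the more A was chosen recently


def run_actor_critic(n_trials=200, alpha_critic=0.1, alpha_actor=0.1, seed=0):
    """Single-state actor-critic driven by a reward prediction error."""
    rng = random.Random(seed)
    value = 0.0                     # critic: running estimate of expected reward
    prefs = {"A": 0.0, "B": 0.0}    # actor: preference for each action
    history, errors = [], []
    for _ in range(n_trials):
        p_a = prob_choose_a(prefs["A"], prefs["B"])
        choice = "A" if rng.random() < p_a else "B"
        reward = reward_from_history(choice, history)
        delta = reward - value              # reward prediction error
        value += alpha_critic * delta       # critic update
        prefs[choice] += alpha_actor * delta  # actor update for the chosen action
        history.append(choice)
        errors.append(delta)
    return history, errors, value


if __name__ == "__main__":
    history, errors, value = run_actor_critic()
    print("fraction of A choices:", history.count("A") / len(history))
    print("final critic value:", round(value, 3))
```

In a sketch like this, the per-trial `errors` sequence is the kind of quantity that could be used as a regressor against imaging data, while the choice sequence could be compared with behavior. A multi-state version would add a discounted next-state value to the error term.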

Citing Articles

Meta-Analysis Reveals That Explore-Exploit Decisions are Dissociable by Activation in the Dorsal Lateral Prefrontal Cortex and the Dorsal Anterior Cingulate Cortex.

Sazhin D, Dachs A, Smith D. bioRxiv. 2023.

PMID: 37961286 PMC: 10634720. DOI: 10.1101/2023.10.21.563317.


Learning under social versus nonsocial uncertainty: A meta-analytic approach.

Martinez-Saito M, Gorina E. Hum Brain Mapp. 2022; 43(13):4185-4206.

PMID: 35620870 PMC: 9374892. DOI: 10.1002/hbm.25948.


To learn or to gain: neural signatures of exploration in human decision-making.

Zhen S, Yaple Z, Eickhoff S, Yu R. Brain Struct Funct. 2021; 227(1):63-76.

PMID: 34596757 DOI: 10.1007/s00429-021-02389-3.


Separable Influences of Reward on Visual Processing and Choice.

Soltani A, Rakhshan M, Schafer R, Burrows B, Moore T. J Cogn Neurosci. 2020; 33(2):248-262.

PMID: 33166195 PMC: 8240750. DOI: 10.1162/jocn_a_01647.


Signals of anticipation of reward and of mean reward rates in the human brain.

Viviani R, Dommes L, Bosch J, Steffens M, Paul A, Schneider K. Sci Rep. 2020; 10(1):4287.

PMID: 32152378 PMC: 7062891. DOI: 10.1038/s41598-020-61257-y.

