
Working Memory Load Strengthens Reward Prediction Errors

Overview
Journal J Neurosci
Specialty Neurology
Date 2017 Mar 22
PMID 28320846
Citations 44
Abstract

Reinforcement learning (RL) in simple instrumental tasks is usually modeled as a monolithic process in which reward prediction errors (RPEs) are used to update the expected values of choice options. This approach ignores the distinct contributions of the multiple memory and decision-making systems thought to support even simple learning. In an fMRI experiment, we investigated how working memory (WM) and incremental RL processes interact to guide human learning. WM load was manipulated by varying the number of stimuli to be learned across blocks. Behavioral results and computational modeling confirmed that learning was best explained as a mixture of two mechanisms: a fast, capacity-limited, and delay-sensitive WM process together with slower RL. Model-based analysis of fMRI data showed that striatum and lateral prefrontal cortex were sensitive to RPE, as shown previously, but, critically, these signals were reduced when the learning problem was within the capacity of WM. The degree of this neural interaction related to individual differences in the use of WM to guide behavioral learning. These results indicate that the two systems do not process information independently, but rather interact during learning.

Reinforcement learning (RL) theory has been remarkably productive at improving our understanding of instrumental learning as well as dopaminergic and striatal network function across many mammalian species. However, this neural network is only one contributor to human learning, and other mechanisms, such as prefrontal cortex working memory, also play a key role. Our results show that these other players interact with the dopaminergic RL system, interfering with its key computation of reward prediction errors.
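The two-system mixture described in the abstract can be sketched as a simple simulation: a slow delta-rule RL learner combined with a fast, one-shot, capacity-limited, decaying WM store, with the WM weight shrinking as set size exceeds capacity. The parameter names (alpha, beta, rho, K, phi) and the specific update rules below are illustrative assumptions for a minimal sketch, not the paper's exact model:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def simulate_block(set_size, alpha=0.1, beta=8.0, rho=0.9, K=3,
                   phi=0.05, n_actions=3, reps=10):
    """Simulate one learning block as a weighted mixture of slow RL
    (delta-rule updates driven by reward prediction errors) and a fast,
    capacity-limited, delay-sensitive WM process. Returns mean accuracy.
    All parameter values are illustrative assumptions."""
    n_trials = reps * set_size
    correct = rng.integers(n_actions, size=set_size)      # hidden S-R mapping
    Q = np.full((set_size, n_actions), 1.0 / n_actions)   # RL values
    wm = np.full((set_size, n_actions), 1.0 / n_actions)  # WM traces
    w = rho * min(1.0, K / set_size)   # WM reliance shrinks with load
    hits = 0.0
    for _ in range(n_trials):
        s = rng.integers(set_size)
        # Policy: mixture of WM-based and RL-based softmax policies
        p = w * softmax(beta * wm[s]) + (1 - w) * softmax(beta * Q[s])
        a = rng.choice(n_actions, p=p)
        r = 1.0 if a == correct[s] else 0.0
        hits += r
        Q[s, a] += alpha * (r - Q[s, a])        # RPE-driven RL update
        if r == 1.0:                            # one-shot WM encoding
            wm[s] = np.eye(n_actions)[a]
        wm += phi * (1.0 / n_actions - wm)      # decay toward uniform
    return hits / n_trials
```

Averaged over many simulated blocks, accuracy per stimulus repetition is higher at low set sizes (where the WM weight is large) than at high set sizes, reproducing the behavioral signature of a capacity-limited WM contribution alongside slower RL.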

Citing Articles

Policy Complexity Suppresses Dopamine Responses.

Gershman S, Lak A. J Neurosci. 2025; 45(9).

PMID: 39788740 PMC: 11866995. DOI: 10.1523/JNEUROSCI.1756-24.2024.


Working memory gating in obesity is moderated by striatal dopaminergic gene variants.

Herzog N, Hartmann H, Janssen L, Kanyamibwa A, Waltmann M, Kovacs P. Elife. 2024; 13.

PMID: 39431987 PMC: 11493406. DOI: 10.7554/eLife.93369.


Neural and Computational Mechanisms of Motivation and Decision-making.

Yee D. J Cogn Neurosci. 2024; 36(12):2822-2830.

PMID: 39378176 PMC: 11602011. DOI: 10.1162/jocn_a_02258.


Policy complexity suppresses dopamine responses.

Gershman S, Lak A. bioRxiv. 2024.

PMID: 39345642 PMC: 11429712. DOI: 10.1101/2024.09.15.613150.


Altered learning from positive feedback in adolescents with anorexia nervosa.

Uniacke B, van den Bos W, Wonderlich J, Ojeda J, Posner J, Steinglass J. J Int Neuropsychol Soc. 2024; 30(7):651-659.

PMID: 39291440 PMC: 11773347. DOI: 10.1017/S1355617724000237.


References
1. Poldrack R, Clark J, Shohamy D, Creso Moyano J, Myers C, Gluck M. Interactive memory systems in the human brain. Nature. 2001; 414(6863):546-50. DOI: 10.1038/35107080.

2. Schultz W. Getting formal with dopamine and reward. Neuron. 2002; 36(2):241-63. DOI: 10.1016/s0896-6273(02)00967-4.

3. Frank M, Seeberger L, O'Reilly R. By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science. 2004; 306(5703):1940-3. DOI: 10.1126/science.1102941.

4. Ranganath C, Blumenfeld R. Doubts about double dissociations between short- and long-term memory. Trends Cogn Sci. 2005; 9(8):374-80. DOI: 10.1016/j.tics.2005.06.009.

5. Daw N, Doya K. The computational neurobiology of learning and reward. Curr Opin Neurobiol. 2006; 16(2):199-204. DOI: 10.1016/j.conb.2006.03.006.