A Novel Technique for Delineating the Effect of Variation in the Learning Rate on the Neural Correlates of Reward Prediction Errors in Model-based fMRI

Overview

Journal Front Psychol

Date 2024 Jan 8

PMID 38187436

Authors

Henry W Chase

Abstract

Introduction: Computational models play an increasingly important role in describing variation in neural activation in human neuroimaging experiments, including the evaluation of individual differences in psychiatric neuroimaging. In particular, reinforcement learning (RL) techniques have been widely adopted to examine neural responses to reward prediction errors and to stimulus or action values, and how these responses might vary as a function of clinical status. However, there is no consensus on how much the precision of free parameter estimation matters for these methods, particularly with regard to the learning rate. In the present study, I introduce a novel technique that can be used within a general linear model (GLM) to model the effect of mis-estimation of the learning rate on reward prediction error (RPE)-related neural responses.

Methods: Simulations employed a simple RL algorithm to generate the hypothetical neural activations that would be expected in functional magnetic resonance imaging (fMRI) studies of RL. Similar RL models were then incorporated into a GLM-based analysis method that included derivative regressors. Individual differences in the resulting GLM-derived beta parameters were evaluated with respect to the free parameters of the RL model or submitted to other validation analyses.
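The abstract does not give the equations of the "simple RL algorithm", so the following is only a minimal sketch under an assumption: a Rescorla-Wagner-style update in which the learning rate (alpha) sets how quickly the value estimate changes and the reinforcement efficacy (lambda) scales the experienced reward. The function name `simulate_rpe`, the reward probability, and the trial count are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_rpe(rewards, alpha, lam=1.0, v0=0.0):
    """Trial-wise reward prediction errors from an assumed Rescorla-Wagner rule.

    delta_t = lam * r_t - V_t
    V_{t+1} = V_t + alpha * delta_t
    """
    rewards = np.asarray(rewards, dtype=float)
    rpe = np.empty_like(rewards)
    value = v0
    for t, r in enumerate(rewards):
        rpe[t] = lam * r - value   # prediction error on trial t
        value += alpha * rpe[t]    # value update for the next trial
    return rpe

# Hypothetical task: 200 trials with a 70% reward probability.
rng = np.random.default_rng(0)
rewards = rng.binomial(1, 0.7, size=200)

rpe_slow = simulate_rpe(rewards, alpha=0.1)             # slow learner
rpe_fast = simulate_rpe(rewards, alpha=0.5)             # fast learner
rpe_strong = simulate_rpe(rewards, alpha=0.1, lam=2.0)  # higher efficacy
```

With an initial value of zero, this assumed rule makes the RPE series scale exactly with lambda, whereas alpha changes the shape of the series across trials rather than its overall amplitude.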

Results: Initial simulations demonstrated that the conventional approach to fitting RL models to RPE responses is more likely to reflect individual differences in a reinforcement efficacy construct (lambda) than in the learning rate (alpha). The proposed method, which adds a derivative regressor to the GLM, provides a second regressor that reflects the learning rate. Validation analyses included a comparison with another, comparable method, which yielded highly similar results, and a demonstration of the method's sensitivity in the presence of fMRI-like noise.
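One way to picture the derivative-regressor idea is sketched below. It rests on assumptions not stated in the abstract: the derivative is taken with respect to alpha, it is approximated by a central finite difference, the RL rule is the Rescorla-Wagner-style update used for illustration, and haemodynamic convolution is omitted so that the GLM operates on trial-wise values. The names `simulate_rpe` and `fit_derivative_glm` are hypothetical.

```python
import numpy as np

def simulate_rpe(rewards, alpha, lam=1.0):
    """Assumed Rescorla-Wagner RPEs: delta_t = lam * r_t - V_t."""
    rewards = np.asarray(rewards, dtype=float)
    rpe = np.empty_like(rewards)
    value = 0.0
    for t, r in enumerate(rewards):
        rpe[t] = lam * r - value
        value += alpha * rpe[t]
    return rpe

def fit_derivative_glm(y, rewards, alpha0, eps=1e-3):
    """Ordinary least squares on [intercept, RPE(alpha0), dRPE/dalpha at alpha0]."""
    rpe = simulate_rpe(rewards, alpha0)
    # Central finite difference as a stand-in for the analytic derivative.
    d_rpe = (simulate_rpe(rewards, alpha0 + eps)
             - simulate_rpe(rewards, alpha0 - eps)) / (2 * eps)
    X = np.column_stack([np.ones_like(rpe), rpe, d_rpe])
    betas, *_ = np.linalg.lstsq(X, y, rcond=None)
    return betas  # [intercept, beta_rpe, beta_deriv]

rng = np.random.default_rng(1)
rewards = rng.binomial(1, 0.7, size=300)
alpha0 = 0.3  # learning rate assumed by the analyst for every subject

# Two hypothetical "subjects" with the same lambda but different true alphas.
y_fast = simulate_rpe(rewards, alpha=0.4, lam=2.0) + rng.normal(0, 0.1, size=300)
y_slow = simulate_rpe(rewards, alpha=0.2, lam=2.0) + rng.normal(0, 0.1, size=300)

betas_fast = fit_derivative_glm(y_fast, rewards, alpha0)
betas_slow = fit_derivative_glm(y_slow, rewards, alpha0)
```

Under a first-order Taylor expansion around `alpha0`, the RPE beta in this sketch tracks lambda, while the sign and size of the derivative beta track how far the true learning rate lies above or below `alpha0`. This is an illustration of the general logic only, not a reproduction of the paper's analysis.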

Conclusion: Overall, the findings underscore the importance of the lambda parameter for interpreting individual differences in RPE-coupled neural activity, and validate a novel neural metric of the modulation of such activity by individual differences in the learning rate. The method is expected to find application in understanding aberrant reinforcement learning across psychiatric patient groups, including those with major depression and substance use disorder.
