
A Comparison of Reinforcement Learning Models of Human Spatial Navigation

Overview
Journal: Sci Rep
Specialty: Science
Date: 2022 Aug 17
PMID: 35978035
Abstract

Reinforcement learning (RL) models have been influential in characterizing human learning and decision making, but few studies have applied them to human spatial navigation, and fewer still have systematically compared RL models under different navigation requirements. Because RL can quantify a navigator's learning strategies on a continuous scale, along with the consistency with which those strategies are used, it offers a novel and important perspective for understanding the marked individual differences in human navigation and for disentangling navigation strategies from navigation performance. One hundred and fourteen participants completed wayfinding tasks in a virtual environment in which different phases manipulated navigation requirements. We compared the performance of five RL models (three model-free, one model-based, and one "hybrid") at fitting navigation behaviors in the different phases. Supporting implications from the prior literature, the hybrid model provided the best fit regardless of navigation requirements, suggesting that most participants rely on a blend of model-free (route-following) and model-based (cognitive mapping) learning in such navigation scenarios. Furthermore, consistent with a key prediction, the hybrid model showed a correlation between the weight on model-based learning (i.e., navigation strategy) and the navigator's exploration vs. exploitation tendency (i.e., the consistency of using that strategy), which was modulated by navigation task requirements. Together, we not only show how computational findings from RL align with the spatial navigation literature, but also reveal how the relationship between navigation strategy and a person's consistency in using that strategy changes as navigation requirements change.
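The abstract does not reproduce the model equations, but hybrid models of this kind are conventionally formalized as a weighted mixture of model-free and model-based action values passed through a softmax choice rule, where the weight w captures the strategy balance and the inverse temperature beta captures exploitation vs. exploration. The following is a minimal illustrative sketch under that standard formulation; the function names and all numeric values are hypothetical, not taken from the paper.

```python
import math

def softmax(values, beta):
    """Convert action values into choice probabilities.

    Higher beta -> more exploitation (consistent choices);
    lower beta -> more exploration (noisier choices).
    """
    exps = [math.exp(beta * v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def hybrid_values(q_mf, q_mb, w):
    """Blend model-free and model-based action values.

    w = 1 -> pure model-based (cognitive mapping);
    w = 0 -> pure model-free (route-following).
    """
    return [w * mb + (1 - w) * mf for mf, mb in zip(q_mf, q_mb)]

# Hypothetical values for two candidate routes at a junction:
q_mf = [0.8, 0.2]   # model-free (route-following) values
q_mb = [0.3, 0.9]   # model-based (cognitive-mapping) values

q_h = hybrid_values(q_mf, q_mb, w=0.7)   # weight favors model-based learning
probs = softmax(q_h, beta=3.0)           # choice probabilities over the two routes
```

With w = 0.7 the model-based values dominate, so the second route receives the higher choice probability even though the model-free system favors the first; fitting w and beta per participant is what lets models of this family separate strategy from choice consistency.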

Citing Articles

Spatially organized striatal neuromodulator release encodes trajectory errors.

Brown E, Zi Y, Vu M, Bouabid S, Lindsey J, Godfrey-Nwachukwu C. bioRxiv. 2024.

PMID: 39185163 PMC: 11343099. DOI: 10.1101/2024.08.13.607797.


Collaborative robots can augment human cognition in regret-sensitive tasks.

Schlafly M, Prabhakar A, Popovic K, Schlafly G, Kim C, Murphey T. PNAS Nexus. 2024; 3(2):pgae016.

PMID: 38725525 PMC: 11079486. DOI: 10.1093/pnasnexus/pgae016.


The neural correlates of memory integration in value-based decision-making during human spatial navigation.

He Q, Liu J, Eschapasse L, Zagora A, Brown T. Neuropsychologia. 2023; 193:108758.

PMID: 38103679 PMC: 11867550. DOI: 10.1016/j.neuropsychologia.2023.108758.
