
A Recurrent Neural Network Framework for Flexible and Adaptive Decision Making Based on Sequence Learning

Overview
Specialty: Biology
Date: 2020 Nov 3
PMID: 33141824
Citations: 5
Abstract

The brain makes flexible and adaptive responses in a complicated and ever-changing environment for an organism's survival. To achieve this, the brain needs to understand the contingencies between its sensory inputs, actions, and rewards. This is analogous to the statistical inference that has been extensively studied in the natural language processing (NLP) field, where recent developments in recurrent neural networks have found many successes. We ask whether these neural networks, gated recurrent unit (GRU) networks in particular, reflect how the brain solves the contingency problem. Therefore, we build a GRU network framework inspired by the statistical learning approach of NLP and test it with four exemplar behavior tasks previously used in empirical studies. The network models are trained to predict future events based on past events, both comprising sensory, action, and reward events. We show that the networks successfully reproduce animal and human behavior. The networks generalize beyond their training, perform Bayesian inference in novel conditions, and adapt their choices when event contingencies vary. Importantly, units in the networks encode task variables and exhibit activity patterns that match previous neurophysiology findings. Our results suggest that the neural network approach based on statistical sequence learning may reflect the brain's computational principle underlying flexible and adaptive behaviors and serve as a useful approach for understanding the brain.
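As a concrete illustration of the sequence-learning setup described above (a GRU trained to predict upcoming events from past sensory, action, and reward events), the following is a minimal PyTorch sketch. It is not the authors' code: the vocabulary size, network width, and random toy sequences are placeholder assumptions made purely to show the next-event prediction objective.

```python
# Minimal sketch (not the paper's implementation): a GRU that predicts the
# next discrete event token given the preceding sensory/action/reward events.
import torch
import torch.nn as nn

N_EVENTS = 8      # hypothetical event vocabulary (cues, actions, reward/no-reward)
HIDDEN = 64       # hypothetical hidden-state size
SEQ_LEN = 20
BATCH = 32

class EventGRU(nn.Module):
    def __init__(self, n_events=N_EVENTS, hidden=HIDDEN):
        super().__init__()
        self.embed = nn.Embedding(n_events, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, n_events)

    def forward(self, tokens, h0=None):
        x = self.embed(tokens)        # (batch, time, hidden)
        out, h = self.gru(x, h0)      # hidden state carries the task context
        return self.readout(out), h   # logits over the next event at each step

model = EventGRU()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Placeholder data: in the paper, sequences come from the behavioral tasks.
seq = torch.randint(0, N_EVENTS, (BATCH, SEQ_LEN))
inputs, targets = seq[:, :-1], seq[:, 1:]   # predict event t+1 from events <= t

optimizer.zero_grad()
logits, _ = model(inputs)
loss = loss_fn(logits.reshape(-1, N_EVENTS), targets.reshape(-1))
loss.backward()
optimizer.step()
```

In the paper's setting, the same next-event prediction objective would be applied to event sequences generated by the four behavioral tasks rather than to random tokens, and behavior is read out from the network's predictions of action and reward events.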

Citing Articles

Sequential neuronal processing of number values, abstract decision, and action in the primate prefrontal cortex.

Viswanathan P, Stein A, Nieder A. PLoS Biol. 2024; 22(2):e3002520.

PMID: 38364194. PMC: 10871863. DOI: 10.1371/journal.pbio.3002520.


Increasing comprehensiveness and reducing workload in a systematic review of complex interventions using automated machine learning.

Uthman O, Court R, Enderby J, Al-Khudairy L, Nduka C, Mistry H. Health Technol Assess. 2022.

PMID: 36562494. PMC: 10068584. DOI: 10.3310/UDIR6682.


Category learning in a recurrent neural network with reinforcement learning.

Zhang Y, Pan X, Wang Y. Front Psychiatry. 2022; 13:1008011.

PMID: 36387007. PMC: 9640766. DOI: 10.3389/fpsyt.2022.1008011.


An Evaluation of 3D-Printed Materials' Structural Properties Using Active Infrared Thermography and Deep Neural Networks Trained on the Numerical Data.

Szymanik B. Materials (Basel). 2022; 15(10).

PMID: 35629753. PMC: 9146560. DOI: 10.3390/ma15103727.


Gated recurrence enables simple and accurate sequence prediction in stochastic, changing, and structured environments.

Foucault C, Meyniel F. eLife. 2021; 10.

PMID: 34854377. PMC: 8735865. DOI: 10.7554/eLife.71801.
