RTify: Aligning Deep Neural Networks with Human Behavioral Decisions

Overview

Journal arXiv

Date 2025 Jan 7

PMID 39764401

Authors

Yu-Ang Cheng

Ivan Felipe Rodriguez

Sixuan Chen

Kohitij Kar

Takeo Watanabe

Thomas Serre

Abstract

Current neural network models of primate vision focus on replicating overall levels of behavioral accuracy, often neglecting the rich, dynamic nature of perceptual decisions. Here, we introduce a novel computational framework to model the dynamics of human behavioral choices by learning to align the temporal dynamics of a recurrent neural network (RNN) to human reaction times (RTs). We describe an approximation that allows us to use human RTs to constrain the number of time steps an RNN takes to solve a task. The approach is extensively evaluated against various psychophysics experiments. We also show that the approximation can be used to optimize an "ideal-observer" RNN model to achieve an optimal tradeoff between speed and accuracy without human data. The resulting model is found to account well for human RT data. Finally, we use the approximation to train a deep learning implementation of the popular Wong-Wang decision-making model. The model is integrated with a convolutional neural network (CNN) model of visual processing and evaluated using both artificial and natural image stimuli. Overall, we present a novel framework that helps align current vision models with human behavior, bringing us closer to an integrated model of human vision.
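To make the idea of "the number of time steps an RNN takes to solve a task" concrete, the following is a minimal sketch of one common way to read a decision time out of a recurrent process: accumulate evidence over steps and stop at the first threshold crossing. It is an illustration only, not the authors' implementation. The function name `first_crossing_step`, the scalar `evidence` trace, and the `threshold` parameter are all hypothetical. Note that the returned step count is an integer, so it is not directly differentiable, which is presumably why an approximation of the kind the abstract mentions is needed before such a quantity can be fit to human RTs by gradient descent.

```python
import numpy as np

def first_crossing_step(evidence, threshold):
    """Return the 1-indexed step at which accumulated evidence first
    reaches `threshold` in absolute value.

    `evidence` is a hypothetical per-step scalar output of a recurrent
    model. If the threshold is never reached, the total number of steps
    is returned (the model "times out").
    """
    total = np.cumsum(np.asarray(evidence, dtype=float))  # running sum over time
    crossed = np.abs(total) >= threshold                  # True from the crossing onward
    if crossed.any():
        return int(np.argmax(crossed)) + 1                # index of first True, 1-indexed
    return len(total)

# A strong stimulus crosses the threshold early; a weak one never does.
strong = first_crossing_step([0.5, 0.6, 0.4, 0.3], threshold=1.0)
weak = first_crossing_step([0.1, 0.1, 0.1, 0.1], threshold=1.0)
print(strong, weak)
```

In this toy setting, stronger evidence yields fewer steps, mirroring the speed-accuracy relationship that the abstract refers to: raising the hypothetical `threshold` slows decisions but makes them less sensitive to noise in individual steps.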
