RTify: Aligning Deep Neural Networks with Human Behavioral Decisions

Overview
Journal ArXiv
Date 2025 Jan 7
PMID 39764401
Abstract

Current neural network models of primate vision focus on replicating overall levels of behavioral accuracy, often neglecting the rich, dynamic nature of perceptual decisions. Here, we introduce a novel computational framework that models the dynamics of human behavioral choices by learning to align the temporal dynamics of a recurrent neural network (RNN) with human reaction times (RTs). We describe an approximation that allows us to use human RTs to constrain the number of time steps an RNN takes to solve a task. The approach is extensively evaluated against various psychophysics experiments. We also show that the approximation can be used to optimize an "ideal-observer" RNN model to achieve an optimal tradeoff between speed and accuracy without human data. The resulting model is found to account well for human RT data. Finally, we use the approximation to train a deep learning implementation of the popular Wong-Wang decision-making model. The model is integrated with a convolutional neural network (CNN) model of visual processing and evaluated using both artificial and natural image stimuli. Overall, we present a novel framework that helps align current vision models with human behavior, bringing us closer to an integrated model of human vision.
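The idea of treating the number of recurrent time steps as a stand-in for reaction time can be pictured with a toy evidence accumulator. The following is a minimal sketch under stated assumptions, not the approximation described in the abstract: the function name `steps_to_threshold`, its `threshold` and `leak` parameters, and the fixed-threshold stopping rule are all hypothetical and chosen only for illustration.

```python
def steps_to_threshold(evidence, threshold=1.0, leak=0.0):
    """Toy two-choice accumulator (hypothetical helper, not the paper's method).

    evidence  : iterable of signed momentary evidence values; positive values
                favour choice +1, negative values favour choice -1.
    threshold : decision bound on the absolute accumulated evidence.
    leak      : fraction of the accumulated evidence lost at every step.

    Returns (choice, n_steps). choice is 0 if the bound is never reached,
    in which case n_steps is the number of steps that were available.
    """
    total = 0.0
    n = 0
    for n, e in enumerate(evidence, start=1):
        # Leaky integration: decay the running total, then add new evidence.
        total = (1.0 - leak) * total + e
        if abs(total) >= threshold:
            # The step count at the first bound crossing plays the role of an RT.
            return (1 if total > 0 else -1), n
    return 0, n


if __name__ == "__main__":
    # Stronger evidence (an "easier" stimulus) reaches the bound in fewer steps.
    for strength in (0.5, 0.25, 0.125):
        choice, n_steps = steps_to_threshold([strength] * 20)
        print(f"evidence per step = {strength}: choice = {choice}, steps = {n_steps}")
```

In this toy setting, weaker evidence takes more steps to reach the bound, which mirrors the qualitative pattern of slower responses on harder trials. A learned model would replace the fixed threshold and hand-set evidence with quantities computed by the network.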
