Reinforcement Learning: Computational Theory and Biological Mechanisms
Reinforcement learning is a computational framework in which an active agent learns behaviors on the basis of a scalar reward signal. The agent can be an animal, a human, or an artificial system such as a robot or a computer program. The reward can be food, water, money, or any other measure of the agent's performance. The theory of reinforcement learning, which was developed in the artificial intelligence community with intuitions from animal learning theory, now gives a coherent account of the function of the basal ganglia. It also serves as a "common language" in which biologists, engineers, and social scientists can exchange problems and findings. This article reviews the basic theoretical framework of reinforcement learning and discusses its recent and future contributions to the understanding of animal behavior and human decision making.