
Actions, Action Sequences and Habits: Evidence That Goal-directed and Habitual Action Control Are Hierarchically Organized

Overview
Specialty Biology
Date 2013 Dec 17
PMID 24339762
Citations 88
Abstract

Behavioral evidence suggests that instrumental conditioning is governed by two forms of action control: a goal-directed and a habit learning process. Model-based reinforcement learning (RL) has been argued to underlie the goal-directed process; however, the way in which it interacts with habits and the structure of the habitual process has remained unclear. According to a flat architecture, the habitual process corresponds to model-free RL, and its interaction with the goal-directed process is coordinated by an external arbitration mechanism. Alternatively, the interaction between these systems has recently been argued to be hierarchical, such that the formation of action sequences underlies habit learning and a goal-directed process selects between goal-directed actions and habitual sequences of actions to reach the goal. Here we used a two-stage decision-making task to test predictions from these accounts. The hierarchical account predicts that, because they are tied to each other as an action sequence, selecting a habitual action in the first stage will be followed by a habitual action in the second stage, whereas the flat account predicts that the statuses of the first and second stage actions are independent of each other. We found, based on subjects' choices and reaction times, that human subjects combined single actions to build action sequences and that the formation of such action sequences was sufficient to explain habitual actions. Furthermore, based on Bayesian model comparison, a family of hierarchical RL models, assuming a hierarchical interaction between habit and goal-directed processes, provided a better fit of the subjects' behavior than a family of flat models. Although these findings do not rule out all possible model-free accounts of instrumental conditioning, they do show such accounts are not necessary to explain habitual actions and provide a new basis for understanding how goal-directed and habitual action control interact.
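The contrast between the two predictions can be illustrated with a toy simulation. This is only a sketch under strong assumptions and is not the authors' model or task: the function `simulate`, the parameter `p_habit`, and the fully deterministic coupling in the hierarchical case are hypothetical simplifications introduced here. Each trial labels the first-stage action as habitual or goal-directed; under the hierarchical account the second-stage action inherits that status because both belong to one action sequence, whereas under the flat account the second-stage status is drawn independently.

```python
import random


def simulate(account, n_trials=10000, p_habit=0.5, seed=0):
    """Toy illustration (hypothetical, not the paper's model).

    Returns a dict mapping the stage-1 status (True = habitual,
    False = goal-directed) to the fraction of those trials in which
    the stage-2 action was habitual.
    """
    rng = random.Random(seed)
    # stage-1 status -> [number of trials, number with habitual stage 2]
    counts = {True: [0, 0], False: [0, 0]}
    for _ in range(n_trials):
        stage1_habitual = rng.random() < p_habit
        if account == "hierarchical":
            # Assumption: a habitual sequence runs as a unit, so the
            # stage-2 status simply copies the stage-1 status.
            stage2_habitual = stage1_habitual
        elif account == "flat":
            # Assumption: stage-2 status is drawn independently of stage 1.
            stage2_habitual = rng.random() < p_habit
        else:
            raise ValueError("account must be 'hierarchical' or 'flat'")
        counts[stage1_habitual][0] += 1
        counts[stage1_habitual][1] += int(stage2_habitual)
    # Guard against an empty group (possible only for extreme p_habit).
    return {
        status: (hits / total if total else float("nan"))
        for status, (total, hits) in counts.items()
    }


if __name__ == "__main__":
    for account in ("hierarchical", "flat"):
        result = simulate(account)
        print(account,
              "P(stage 2 habitual | stage 1 habitual) =", result[True],
              "P(stage 2 habitual | stage 1 goal-directed) =", result[False])
```

In this sketch the hierarchical case yields conditional fractions of 1.0 and 0.0 by construction, while the flat case yields two fractions that are both close to `p_habit`. The study's actual analyses relied on subjects' choices, reaction times, and Bayesian comparison of RL model families, none of which this toy example reproduces.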

Citing Articles

Maturation of striatal dopamine supports the development of habitual behavior through adolescence.

Petrie D, Parr A, Sydnor V, Ojha A, Foran W, Tervo-Clemmens B. bioRxiv. 2025.

PMID: 39829737 PMC: 11741407. DOI: 10.1101/2025.01.06.631527.


Environmental complexity modulates information processing and the balance between decision-making systems.

Mugan U, Hoffman S, Redish A. Neuron. 2024; 112(24):4096-4114.e10.

PMID: 39476843 PMC: 11659045. DOI: 10.1016/j.neuron.2024.10.004.


Sequence termination cues drive habits via dopamine-mediated credit assignment.

Magnard R, Cheng Y, Zhou J, Province H, Thiriet N, Janak P. bioRxiv. 2024.

PMID: 39463939 PMC: 11507917. DOI: 10.1101/2024.10.16.618735.


Craving money? Evidence from the laboratory and the field.

Payzan-LeNestour E, Doran J. Sci Adv. 2024; 10(2):eadi5034.

PMID: 38215199 PMC: 10786414. DOI: 10.1126/sciadv.adi5034.


Exploring the steps of learning: computational modeling of initiatory-actions among individuals with attention-deficit/hyperactivity disorder.

Katabi G, Shahar N. Transl Psychiatry. 2024; 14(1):10.

PMID: 38191535 PMC: 10774270. DOI: 10.1038/s41398-023-02717-7.

