Reinforcement Learning During Locomotion

Overview

Journal: eNeuro

Specialty: Neurology

Date: 2024 Mar 4

PMID: 38438263

Authors

Jonathan M Wood

Hyosub E Kim

Susanne M Morton


Abstract

When learning a new motor skill, people often must use trial and error to discover which movement is best. In the reinforcement learning framework, this concept is known as exploration and has been linked to increased movement variability in motor tasks. For locomotor tasks, however, increased variability decreases upright stability. As such, exploration during gait may jeopardize balance and safety, making reinforcement learning less effective. Therefore, we set out to determine if humans could acquire and retain a novel locomotor pattern using reinforcement learning alone. Young healthy male and female participants walked on a treadmill and were provided with binary reward feedback (indicated by a green checkmark on the screen) that was tied to a fixed monetary bonus, to learn a novel stepping pattern. We also recruited a comparison group who walked with the same novel stepping pattern but did so by correcting for target error, induced by providing real-time veridical visual feedback of steps and a target. In two experiments, we compared learning, motor variability, and two forms of motor memories between the groups. We found that individuals in the binary reward group did, in fact, acquire the new walking pattern by exploring (increasing motor variability). Additionally, while reinforcement learning did not increase implicit motor memories, it resulted in more accurate explicit motor memories compared with the target error group. Overall, these results demonstrate that humans can acquire new walking patterns with reinforcement learning and retain much of the learning over 24 h.
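The link between binary reward and exploration described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' analysis or model: the function name, the update rule (keep the last rewarded action as the new aim), the noise levels, the target, and the tolerance are all hypothetical values chosen for illustration, in arbitrary units.

```python
import random


def simulate_binary_reward_learning(target=5.0, tolerance=2.0, n_trials=500,
                                    motor_sd=1.0, explore_sd=3.0, seed=0):
    """Toy binary-reward learner (illustrative assumption, not the study's model).

    The learner holds an internal aim. Each trial it produces an action equal
    to the aim plus noise. After an unrewarded trial it adds exploration noise
    (higher variability). A rewarded action becomes the new aim.
    """
    rng = random.Random(seed)
    aim = 0.0                 # hypothetical baseline pattern
    rewarded_prev = True      # no extra exploration on the first trial
    actions, rewards = [], []
    for _ in range(n_trials):
        # Variability is larger after failure: motor noise plus exploration noise.
        if rewarded_prev:
            sd = motor_sd
        else:
            sd = (motor_sd ** 2 + explore_sd ** 2) ** 0.5
        action = aim + rng.gauss(0.0, sd)
        # Binary feedback: success only if the action lands in the reward zone.
        rewarded = abs(action - target) <= tolerance
        if rewarded:
            aim = action      # retain the action that earned the reward
        actions.append(action)
        rewards.append(rewarded)
        rewarded_prev = rewarded
    return actions, rewards


if __name__ == "__main__":
    actions, rewards = simulate_binary_reward_learning()
    late_rate = sum(rewards[-100:]) / 100
    print(f"reward rate over the last 100 trials: {late_rate:.2f}")
```

In this sketch the learner receives no information about the direction or size of its error, only success or failure, so it can reach the reward zone only by varying its actions. That mirrors the contrast drawn in the abstract between the binary reward group and the target error group, although the actual experimental parameters are not given here.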

Citing Articles

The dual timescales of gait adaptation: initial stability adjustments followed by subsequent energetic cost adjustments.

Brinkerhoff S, Sanchez N, Culver M, Murrah W, Robinson A, McCullough J. J Exp Biol. 2024; 227(23).

PMID: 39422307 PMC: 11883409. DOI: 10.1242/jeb.249217.

Roles and interplay of reinforcement-based and error-based processes during reaching and gait in neurotypical adults and individuals with Parkinson's disease.

Roth A, Buggeln J, Hoh J, Wood J, Sullivan S, Ngo T. PLoS Comput Biol. 2024; 20(10):e1012474.

PMID: 39401183 PMC: 11472932. DOI: 10.1371/journal.pcbi.1012474.
