Unconscious Reinforcement Learning of Hidden Brain States Supported by Confidence

Overview

Journal: Nat Commun

Specialty: Biology

Date: 2020 Sep 2

PMID: 32868772

Citations: 13

Authors

Aurelio Cortese

Hakwan Lau

Mitsuo Kawato

Abstract

Can humans be trained to make strategic use of latent representations in their own brains? We investigate how human subjects can derive reward-maximizing choices from intrinsic high-dimensional information represented stochastically in neural activity. Reward contingencies are defined in real-time by fMRI multivoxel patterns; optimal action policies thereby depend on multidimensional brain activity taking place below the threshold of consciousness, by design. We find that subjects can solve the task within two hundred trials and errors, as their reinforcement learning processes interact with metacognitive functions (quantified as the meaningfulness of their decision confidence). Computational modelling and multivariate analyses identify a frontostriatal neural mechanism by which the brain may untangle the 'curse of dimensionality': synchronization of confidence representations in prefrontal cortex with reward prediction errors in basal ganglia supports exploration of latent task representations. These results may provide an alternative starting point for future investigations into unconscious learning and functions of metacognition.
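For readers unfamiliar with the term, a reward prediction error is the difference between the reward received and the reward expected. The sketch below illustrates that idea with a generic tabular value-learning rule and softmax action selection. It is not the authors' computational model: the two latent states, the reward rule, and the parameters `alpha` and `beta` are all assumptions chosen for illustration. It also simplifies the task in one important way: the simulated learner is handed the state label directly, whereas in the study the relevant state is hidden in brain activity.

```python
import math
import random


def softmax_choice(q_values, beta, rng):
    """Sample an action with probability proportional to exp(beta * Q)."""
    weights = [math.exp(beta * q) for q in q_values]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for action, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return action
    return len(q_values) - 1  # guard against floating-point rounding


def run_q_learning(n_trials=200, alpha=0.2, beta=5.0, seed=0):
    """Simulate a toy learner; all parameter values are illustrative."""
    rng = random.Random(seed)
    # q[state][action]: two hypothetical latent states, two actions.
    q = [[0.0, 0.0], [0.0, 0.0]]
    rewards = []
    for _ in range(n_trials):
        # Stand-in for a decoded brain state (assumption: drawn at random).
        state = rng.randrange(2)
        action = softmax_choice(q[state], beta, rng)
        # Assumed reward rule: the action matching the state is rewarded.
        reward = 1.0 if action == state else 0.0
        # Reward prediction error: received minus expected reward.
        rpe = reward - q[state][action]
        q[state][action] += alpha * rpe
        rewards.append(reward)
    return q, rewards


if __name__ == "__main__":
    q, rewards = run_q_learning()
    print("learned values:", q)
    print("mean reward, last 100 trials:", sum(rewards[100:]) / 100)
```

The trial count of 200 echoes the figure in the abstract; it makes no claim about how quickly human subjects actually learned in the experiment.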

Citing Articles

Touch-driven advantages in reaction time but not in performance in a cross-sensory comparison of reinforcement learning.

Sun W, Ripp I, Borrmann A, Moll M, Fairhurst M. Heliyon. 2025; 11(1):e41330.

PMID: 39839521 PMC: 11748724. DOI: 10.1016/j.heliyon.2024.e41330.

Time-dependent neural arbitration between cue associative and episodic fear memories.

Cortese A, Ohata R, Alemany-Gonzalez M, Kitagawa N, Imamizu H, Koizumi A. Nat Commun. 2024; 15(1):8706.

PMID: 39433735 PMC: 11494204. DOI: 10.1038/s41467-024-52733-4.

Decoding and modifying dynamic attentional bias in gaming disorder.

Oka T, Kubo T, Kobayashi N, Murakami M, Chiba T, Cortese A. Philos Trans R Soc Lond B Biol Sci. 2024; 379(1915):20230090.

PMID: 39428882 PMC: 11491851. DOI: 10.1098/rstb.2023.0090.

Mechanisms of brain self-regulation: psychological factors, mechanistic models and neural substrates.

Sitaram R, Sanchez-Corzo A, Vargas G, Cortese A, El-Deredy W, Jackson A. Philos Trans R Soc Lond B Biol Sci. 2024; 379(1915):20230093.

PMID: 39428875 PMC: 11491850. DOI: 10.1098/rstb.2023.0093.

Interaction between the prefrontal and visual cortices supports subjective fear.

Taschereau-Dumouchel V, Cote M, Manuel S, Valevicius D, Cushing C, Cortese A. Philos Trans R Soc Lond B Biol Sci. 2024; 379(1908):20230245.

PMID: 39005034 PMC: 11444220. DOI: 10.1098/rstb.2023.0245.
