
Pupil Dilation and Response Slowing Distinguish Deliberate Explorative Choices in the Probabilistic Learning Task

Overview
Publisher: Springer
Date: 2022 Apr 1
PMID: 35359274
Abstract

This study examined whether pupil size and response time distinguish directed exploration from random exploration and exploitation. Eighty-nine participants performed a two-choice probabilistic learning task while their pupil size and response time were continuously recorded. Using linear mixed-effects model (LMM) analysis, we estimated differences in pupil size and response time between advantageous and disadvantageous choices as a function of learning success, i.e., whether or not a participant had learned the probabilistic contingency between choices and their outcomes. We proposed that before the true value of each choice became known to a decision-maker, both advantageous and disadvantageous choices represented random exploration of two options with equally uncertain outcomes, whereas the same choices after learning manifested exploitation and directed exploration strategies, respectively. We found that disadvantageous choices were associated with increases in both response time and pupil size, but only after the participants had learned the choice-reward contingencies. For pupil size, this effect was strongly amplified for disadvantageous choices that immediately followed gains rather than losses in the preceding choice. Pupil size modulations were evident during the behavioral choice rather than during the pretrial baseline. These findings suggest that occasional disadvantageous choices, which violate the acquired internal utility model, represent directed exploration. This exploratory strategy shifts choice priorities in favor of information seeking. Its autonomic and behavioral concomitants are mainly driven by the conflict between the behavioral plan of the intended exploratory choice and its strong alternative, which has already proven to be more rewarding.
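
The labeling scheme proposed in the abstract can be written as a small decision rule. The sketch below is only illustrative: the function name, the enum, and the two boolean per-trial flags are assumptions made here, and the abstract does not say how learning success was actually determined for each participant.

```python
from enum import Enum


class Strategy(Enum):
    RANDOM_EXPLORATION = "random exploration"
    EXPLOITATION = "exploitation"
    DIRECTED_EXPLORATION = "directed exploration"


def classify_choice(advantageous: bool, learned: bool) -> Strategy:
    """Label one trial following the scheme described in the abstract.

    `advantageous` and `learned` are hypothetical per-trial flags; how
    learning success is operationalized is not given in the abstract.
    """
    if not learned:
        # Before learning, both options have equally uncertain outcomes,
        # so either choice counts as random exploration.
        return Strategy.RANDOM_EXPLORATION
    # After learning, the advantageous choice is exploitation and the
    # disadvantageous choice is treated as directed exploration.
    if advantageous:
        return Strategy.EXPLOITATION
    return Strategy.DIRECTED_EXPLORATION


# All four combinations of (advantageous, learned).
trials = [(True, False), (False, False), (True, True), (False, True)]
labels = [classify_choice(a, l).value for a, l in trials]
print(labels)
```

Under this rule, only disadvantageous choices made after learning are labeled directed exploration, which is the condition in which the abstract reports response slowing and pupil dilation.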

Citing Articles

Losses resulting from deliberate exploration trigger beta oscillations in frontal cortex.

Chernyshev B, Pultsina K, Tretyakova V, Miasnikova A, Prokofyev A, Kozunova G. Front Neurosci. 2023; 17:1152926.

PMID: 37250414 PMC: 10211346. DOI: 10.3389/fnins.2023.1152926.


Value-driven modulation of visual perception by visual and auditory reward cues: The role of performance-contingent delivery of reward.

Antono J, Vakhrushev R, Pooresmaeili A. Front Hum Neurosci. 2023; 16:1062168.

PMID: 36618995 PMC: 9816136. DOI: 10.3389/fnhum.2022.1062168.
