Task learnability modulates surprise but not valence processing for reinforcement learning in probabilistic choice tasks
- Author
- Franz Wurm, Wioleta Walentowska (UGent), Benjamin Ernst, Mario Carlo Severo, Gilles Pourtois (UGent) and Marco Steinhauser
- Abstract
- The goal of temporal difference (TD) reinforcement learning is to maximize outcomes and improve future decision-making. It does so by utilizing a prediction error (PE), which quantifies the difference between the expected and the obtained outcome. In gambling tasks, however, decision-making cannot be improved because of the lack of learnability. On the basis of the idea that TD utilizes two independent bits of information from the PE (valence and surprise), we asked which of these aspects is affected when a task is not learnable. We contrasted behavioral data and ERPs in a learning variant and a gambling variant of a simple two-armed bandit task, in which outcome sequences were matched across tasks. Participants were explicitly informed that feedback could be used to improve performance in the learning task but not in the gambling task, and we predicted a corresponding modulation of the aspects of the PE. We used a model-based analysis of ERP data to extract the neural footprints of the valence and surprise information in the two tasks. Our results revealed that task learnability modulates reinforcement learning via the suppression of surprise processing but leaves the processing of valence unaffected. On the basis of our model and the data, we propose that task learnability can selectively suppress TD learning as well as alter behavioral adaptation based on a flexible cost–benefit arbitration.
- Keywords
- Cognitive Neuroscience, FEEDBACK-RELATED NEGATIVITY, BAYESIAN MODEL SELECTION, EVENT-RELATED POTENTIALS, PREDICTION ERROR, ANTERIOR CINGULATE, REWARD PREDICTION, BRAIN POTENTIALS, NEURAL BASES, DOPAMINE, BEHAVIOR
Downloads
- Wurm etal JoCN2021.pdf
- full text (Accepted manuscript)
- open access
- 419.07 KB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-8723736
- MLA
- Wurm, Franz, et al. “Task Learnability Modulates Surprise but Not Valence Processing for Reinforcement Learning in Probabilistic Choice Tasks.” JOURNAL OF COGNITIVE NEUROSCIENCE, vol. 34, no. 1, 2021, pp. 34–53, doi:10.1162/jocn_a_01777.
- APA
- Wurm, F., Walentowska, W., Ernst, B., Severo, M. C., Pourtois, G., & Steinhauser, M. (2021). Task learnability modulates surprise but not valence processing for reinforcement learning in probabilistic choice tasks. JOURNAL OF COGNITIVE NEUROSCIENCE, 34(1), 34–53. https://doi.org/10.1162/jocn_a_01777
- Chicago author-date
- Wurm, Franz, Wioleta Walentowska, Benjamin Ernst, Mario Carlo Severo, Gilles Pourtois, and Marco Steinhauser. 2021. “Task Learnability Modulates Surprise but Not Valence Processing for Reinforcement Learning in Probabilistic Choice Tasks.” JOURNAL OF COGNITIVE NEUROSCIENCE 34 (1): 34–53. https://doi.org/10.1162/jocn_a_01777.
- Chicago author-date (all authors)
- Wurm, Franz, Wioleta Walentowska, Benjamin Ernst, Mario Carlo Severo, Gilles Pourtois, and Marco Steinhauser. 2021. “Task Learnability Modulates Surprise but Not Valence Processing for Reinforcement Learning in Probabilistic Choice Tasks.” JOURNAL OF COGNITIVE NEUROSCIENCE 34 (1): 34–53. doi:10.1162/jocn_a_01777.
- Vancouver
- 1. Wurm F, Walentowska W, Ernst B, Severo MC, Pourtois G, Steinhauser M. Task learnability modulates surprise but not valence processing for reinforcement learning in probabilistic choice tasks. JOURNAL OF COGNITIVE NEUROSCIENCE. 2021;34(1):34–53.
- IEEE
- [1] F. Wurm, W. Walentowska, B. Ernst, M. C. Severo, G. Pourtois, and M. Steinhauser, “Task learnability modulates surprise but not valence processing for reinforcement learning in probabilistic choice tasks,” JOURNAL OF COGNITIVE NEUROSCIENCE, vol. 34, no. 1, pp. 34–53, 2021.
@article{8723736,
  abstract = {{The goal of temporal difference (TD) reinforcement learning is to maximize outcomes and improve future decision-making. It does so by utilizing a prediction error (PE), which quantifies the difference between the expected and the obtained outcome. In gambling tasks, however, decision-making cannot be improved because of the lack of learnability. On the basis of the idea that TD utilizes two independent bits of information from the PE (valence and surprise), we asked which of these aspects is affected when a task is not learnable. We contrasted behavioral data and ERPs in a learning variant and a gambling variant of a simple two-armed bandit task, in which outcome sequences were matched across tasks. Participants were explicitly informed that feedback could be used to improve performance in the learning task but not in the gambling task, and we predicted a corresponding modulation of the aspects of the PE. We used a model-based analysis of ERP data to extract the neural footprints of the valence and surprise information in the two tasks. Our results revealed that task learnability modulates reinforcement learning via the suppression of surprise processing but leaves the processing of valence unaffected. On the basis of our model and the data, we propose that task learnability can selectively suppress TD learning as well as alter behavioral adaptation based on a flexible cost–benefit arbitration.}},
  author = {{Wurm, Franz and Walentowska, Wioleta and Ernst, Benjamin and Severo, Mario Carlo and Pourtois, Gilles and Steinhauser, Marco}},
  issn = {{0898-929X}},
  journal = {{JOURNAL OF COGNITIVE NEUROSCIENCE}},
  keywords = {{Cognitive Neuroscience,FEEDBACK-RELATED NEGATIVITY,BAYESIAN MODEL SELECTION,EVENT-RELATED POTENTIALS,PREDICTION ERROR,ANTERIOR CINGULATE,REWARD PREDICTION,BRAIN POTENTIALS,NEURAL BASES,DOPAMINE,BEHAVIOR}},
  language = {{eng}},
  number = {{1}},
  pages = {{34--53}},
  title = {{Task learnability modulates surprise but not valence processing for reinforcement learning in probabilistic choice tasks}},
  url = {{http://doi.org/10.1162/jocn_a_01777}},
  volume = {{34}},
  year = {{2021}},
}