Fig. 6: Adaptive reinforcement learning models replicate neuro-behavioral results and predict potential benefits for GPe stimulation under PCP administration. | Nature Communications

From: Basal ganglia deep brain stimulation restores cognitive flexibility and exploration-exploitation balance disrupted by NMDA-R antagonism

Fig. 6

a, b Adaptive forgetful model. a Top: normalized firing rate (FR, solid line) recorded in the external segment of the globus pallidus (GPe) of the non-human primates (NHPs), compared with the normalized surprise measure \(\alpha_t\) (dashed line) calculated from the NHPs’ choices; r and p indicate the Pearson correlation coefficient and its p-value. Middle: learning curves of the model and the NHPs. Bottom: \(\alpha_t\) compared with the model’s switch probability. b Simulating the phencyclidine (PCP) state by increasing the model’s forgetfulness (\(\phi\)). Simulated parameter values are color graded from black to red (\(\phi\) from 0.1 to 1). Top: learning criterion. Middle: \(\alpha_t\) after an unsuccessful trial compared with the probability of directed exploration. Bottom: \(\alpha_t\) after a successful trial compared with the probability of random exploration. c The same as (b), but increasing the value of \(C\), the parameter that modulates \(\alpha_t\). d–f The same as (a–c), for the adaptive combined (WM + RL) model. WM working memory, RL reinforcement learning.
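
The comparison in panel a (top) rests on a Pearson correlation between two normalized traces. The following is a minimal sketch of that kind of calculation, not the authors' analysis code: the helper names `min_max_normalize` and `pearson_r`, the choice of min-max normalization, and the numeric values are all assumptions made for illustration, and the p-value reported in the figure is not computed here.

```python
import math


def min_max_normalize(values):
    """Rescale a series to the range [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Hypothetical illustrative values, NOT data from the figure.
firing_rate = [12.0, 15.5, 21.0, 18.0, 14.0, 13.0]  # e.g. spikes per second
surprise = [0.10, 0.25, 0.60, 0.45, 0.20, 0.12]     # stand-in for alpha_t

r_raw = pearson_r(firing_rate, surprise)
r_norm = pearson_r(min_max_normalize(firing_rate), min_max_normalize(surprise))
```

Because Pearson's r is unchanged by positive linear rescaling, `r_raw` and `r_norm` agree up to floating-point error; under this assumed scheme, normalization would affect only how the two traces overlay on a shared axis, not the reported correlation.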