• Optogenetic Blockade of Dopamine Transients Prevents Learning Induced by Changes in Reward Features

      Chang, C.Y.; Gardner, M.; Di, Tillio, M.G. (Cell Press, 2017)
      Prediction errors are critical for associative learning [1, 2]. Transient changes in dopamine neuron activity correlate with positive and negative reward prediction errors and can mimic their effects [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. However, although causal studies show that dopamine transients of 1–2 s are sufficient to drive learning about reward, these studies do not address whether they are necessary (but see [11]). Further, the precise nature of this signal is not yet fully established. Although it has been equated with the cached-value error signal proposed to support model-free reinforcement learning, cached-value errors are typically confounded with errors in the prediction of reward features [16]. Here, we used optogenetic and transgenic approaches to prevent transient changes in midbrain dopamine neuron activity during the critical error-signaling period of two unblocking tasks. In one, learning was unblocked by increasing the number of rewards, a manipulation that induces errors in predicting both value and reward features. In another, learning was unblocked by switching from one to another equally valued reward, a manipulation that induces errors only in reward feature prediction. Preventing dopamine neurons in the ventral tegmental area from firing for 5 s beginning before and continuing until after the changes in reward prevented unblocking of learning in both tasks. A similar duration suppression did not induce extinction when delivered during an expected reward, indicating that it did not act independently as a negative prediction error. This result suggests that dopamine transients play a general role in error signaling rather than being restricted to only signaling errors in value. Copyright 2017