Researchers from UNIGE, Harvard and McGill have shown that the brain’s ventral tegmental area (VTA) encodes not only whether a reward is expected but also precisely when, a finding that refines our understanding of motivation and learning.
A new study by researchers from the University of Geneva (UNIGE), Harvard University and McGill University unveils a sophisticated role of the brain’s ventral tegmental area (VTA) in processing rewards. Published in the journal Nature, the study demonstrates that the VTA not only predicts rewards but also encodes the precise moment they are expected to occur, a finding made possible by a machine learning algorithm.
The VTA, a small region in the brain, is critical in motivation and the brain’s reward circuitry. It produces dopamine, a neuromodulator that helps predict future rewards based on contextual cues.
Remarkably, the new findings show that the VTA’s predictions are more detailed and temporally specific than previously thought.
“Initially, the VTA was thought to be merely the brain’s reward center. But in the 1990s, scientists discovered that it doesn’t encode reward itself, but rather the prediction of reward,” Alexandre Pouget, a full professor in the Department of Basic Neurosciences at UNIGE’s Faculty of Medicine who led the research, said in a news release.
Previous animal studies demonstrated that when a reward consistently follows a signal, the VTA releases dopamine in response to the signal rather than to the reward itself. This response thus encodes the prediction of the reward.
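The shift of the response from the reward to the signal can be illustrated with temporal-difference (TD) learning, the standard textbook model of reward prediction, which the article itself does not name. The following is a minimal sketch under stated assumptions, not the study's algorithm: the function name, the five-step delay, the discount factor and the learning rate are all hypothetical choices made for illustration.

```python
def run_trial(values, reward=1.0, gamma=0.9, alpha=0.1):
    """Run one cue-then-reward trial with TD(0) updates.

    values[s] is the learned prediction s steps after the cue.
    Returns (prediction error at the cue, prediction error at the reward).
    """
    # Assumption: the cue arrives unexpectedly, so the prediction before it
    # is 0 and the error at cue onset is the discounted first post-cue value.
    cue_error = gamma * values[0]
    last = len(values) - 1
    reward_error = 0.0
    for s in range(len(values)):
        r = reward if s == last else 0.0            # reward only at the final step
        next_value = values[s + 1] if s < last else 0.0
        delta = r + gamma * next_value - values[s]  # TD prediction error
        values[s] += alpha * delta                  # move prediction toward target
        if s == last:
            reward_error = delta
    return cue_error, reward_error


values = [0.0] * 5  # hypothetical: reward arrives 5 steps after the cue
first_cue_error, first_reward_error = run_trial(values)
for _ in range(500):
    last_cue_error, last_reward_error = run_trial(values)
```

On the first trial the error appears only at the reward. After repeated pairings the reward is fully predicted, so the error at the reward vanishes and the error at the cue carries the prediction instead.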
The recent study reveals the VTA’s coding to be even more sophisticated.
“Rather than predicting a weighted sum of future rewards, the VTA predicts their temporal evolution. In other words, each gain is represented separately, with the precise moment at which it is expected,” added Pouget.
The study shows that different neurons in the VTA prioritize rewards on various time scales. Some neurons focus on rewards expected in a few seconds, others on those anticipated in a minute, and some on even more distant timeframes.
This diversity enables the VTA to encode reward timing with great flexibility, adapting its predictions to maximize immediate or delayed rewards, depending on individual goals and priorities.
“While we knew that VTA neurons prioritized rewards close in time over the ones further in the future — on the principle of ‘a bird in the hand is worth two in the bush’ — we discovered that different neurons do so on different time scales, with some focused on the reward possible in a few seconds’ time, others on the reward expected in a minute’s time, and others on more distant horizons. This diversity is what allows the encoding of reward timing,” Pouget added.
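Why a diversity of time scales allows timing to be encoded can be shown with a small sketch. It assumes, purely for illustration, that each neuron discounts a single future reward exponentially with its own discount factor and that the signals are noise-free; the function names and numbers are hypothetical and are not taken from the study.

```python
import math


def cue_values(reward, delay, gammas):
    """Discounted value of one future reward, one entry per discount factor."""
    return [reward * g ** delay for g in gammas]


def decode_delay_and_reward(v_a, v_b, g_a, g_b):
    """Recover (delay, reward) from two values with different discount factors.

    Since v = reward * g ** delay, the ratio v_a / v_b equals
    (g_a / g_b) ** delay, which can be solved for the delay.
    """
    delay = math.log(v_a / v_b) / math.log(g_a / g_b)
    reward = v_a / g_a ** delay
    return delay, reward


# One time scale alone is ambiguous: a large late reward and a small early
# reward can produce exactly the same discounted value.
ambiguous_a = cue_values(reward=2.0, delay=5, gammas=[0.5])[0]
ambiguous_b = cue_values(reward=1.0, delay=4, gammas=[0.5])[0]

# Several time scales together resolve the ambiguity.
gammas = [0.5, 0.9, 0.99]  # hypothetical short, medium and long horizons
values = cue_values(reward=2.0, delay=5, gammas=gammas)
delay, reward = decode_delay_and_reward(values[0], values[2], gammas[0], gammas[2])
```

With a single discount factor the two scenarios are indistinguishable, whereas comparing neurons with different horizons recovers both the size of the reward and the moment it is expected.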
The study exemplifies the synergy between artificial intelligence and neuroscience. Pouget developed a mathematical algorithm incorporating the timing of reward processing, while Harvard researchers collected extensive neurophysiological data on VTA activity in animals.
“They then applied our algorithm to their data and found that the results matched perfectly with their empirical findings,” added Pouget.
This collaboration underscores how machine learning techniques inspired by the brain can also elucidate complex neurophysiological mechanisms.
Source: University of Geneva