Imagine teaching a dog to play fetch. You toss a ball, the dog dashes after it, grabs it, and races back. The owner then often rewards the dog with a well-deserved treat.
Yet, the true challenge for the dog lies in deciphering which part of this sequence merits the coveted reward.
Scientists refer to this as the “credit assignment problem” within the brain—a fundamental inquiry into discerning which actions contribute to the positive outcomes we encounter.
Dopamine, a pivotal neurotransmitter in the brain, is recognized for its crucial role in this cognitive process. However, the precise mechanism by which the brain associates specific actions with the release of dopamine has remained elusive.
A recent publication in Nature, by researchers from the Allen Institute, Columbia University’s Zuckerman Mind Brain Behavior Institute, the Champalimaud Centre for the Unknown, and Seattle Children’s Research Institute, sheds new light on this enigma.
The study illuminates how dopamine not only signals rewards but also directs animals to pinpoint the exact behaviors that lead to these rewards through a process of trial and error.
Interestingly, the research also demonstrates that the brain’s reward system can rapidly and dynamically modify the entire spectrum of an animal’s movements and behaviors.
This underscores a sophisticated learning strategy wherein behaviors are not merely reinforced but actively molded and refined through experience, as elucidated by Rui Costa, D.V.M., Ph.D., the senior author of the study.
When you reinforce behavior, we often think it’s just that action. But no: you’re changing the entire behavioral structure. And what was really surprising was how rapid it was.
Rui Costa, Study Senior Author and CEO, Allen Institute
Decoding How Dopamine Shapes Learning
To uncover these insights, the team collaborated with engineers and neuroscientists at the Champalimaud Centre for the Unknown, devising an innovative “closed loop” system that links specific actions by mice to real-time dopamine release.
The researchers equipped mice with wireless sensors to monitor their movements in a controlled space. They then fed these data into a machine learning algorithm, which categorized the actions into distinct groups.
Employing optogenetics, a technique for controlling neurons with light, the researchers stimulated dopamine neurons when the mice executed predefined “target actions.”
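The closed-loop idea can be illustrated with a minimal Python sketch. It is not the study's actual pipeline: the nearest-centroid classifier, the two-number feature vectors, and the action names (`rear`, `turn`, `groom`) are all assumptions made for illustration. The sketch only shows the general logic of labeling each sensor reading and triggering stimulation whenever the label matches the target action.

```python
import math
from typing import Dict, List, Sequence, Tuple

Features = Tuple[float, float]  # hypothetical two-number summary of one sensor reading


def classify(sample: Features, centroids: Dict[str, Features]) -> str:
    """Assign a sensor reading to the nearest (hypothetical) action cluster."""
    return min(centroids, key=lambda name: math.dist(sample, centroids[name]))


def closed_loop_triggers(
    samples: Sequence[Features],
    centroids: Dict[str, Features],
    target: str,
) -> List[int]:
    """Return the time steps at which a stimulation pulse would be triggered."""
    triggers = []
    for t, sample in enumerate(samples):
        # Stimulate only when the current action is classified as the target.
        if classify(sample, centroids) == target:
            triggers.append(t)
    return triggers


# Illustrative cluster centers and sensor readings (made-up values).
centroids = {"rear": (0.0, 1.0), "turn": (1.0, 0.0), "groom": (1.0, 1.0)}
samples = [(0.1, 0.9), (0.9, 0.1), (0.0, 1.1), (1.0, 0.9)]

triggers = closed_loop_triggers(samples, centroids, target="rear")
print(triggers)  # -> [0, 2]
```

Here the first and third readings fall closest to the `rear` cluster, so only those time steps would trigger stimulation.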
The results revealed that mice swiftly adjusted their behavior in response to dopamine release. Initially, they increased the frequency of the target action, of similar actions, and of actions occurring a few seconds before dopamine release.
Meanwhile, actions dissimilar to the target rapidly decreased. Over time, this refinement became more precise, with the mice increasingly focusing on the exact action leading to dopamine release.
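The early stage of this pattern, in which the target and similar actions rise while dissimilar actions fall, can be mimicked with a toy model. The Python sketch below is an assumption-laden illustration, not the model used in the study: actions are placed on a made-up one-dimensional similarity scale, and a Gaussian weighting (with hypothetical parameters `lr` and `sigma`) spreads credit from the target to its neighbors.

```python
import math
from typing import List


def reinforce(
    propensities: List[float],
    target: int,
    lr: float = 1.0,
    sigma: float = 1.0,
) -> List[float]:
    """Add credit to every action, weighted by its Gaussian similarity to the target."""
    return [
        p + lr * math.exp(-((i - target) ** 2) / (2 * sigma ** 2))
        for i, p in enumerate(propensities)
    ]


def to_probabilities(propensities: List[float]) -> List[float]:
    """Normalize propensities so they sum to 1."""
    total = sum(propensities)
    return [p / total for p in propensities]


# Five hypothetical actions, equally likely at first; action 2 is the target.
initial = [1.0] * 5
before = to_probabilities(initial)
after = to_probabilities(reinforce(initial, target=2))

for i, (b, a) in enumerate(zip(before, after)):
    print(f"action {i}: {b:.3f} -> {a:.3f}")
```

In this toy version the target and its immediate neighbors gain probability, while the most distant actions lose it because the probabilities must sum to one. It does not capture the later sharpening toward the exact target action that the study reports.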
The study delved into how mice learn a sequence of actions, unveiling a crucial process akin to rewinding time to comprehend what leads to a reward. When actions triggering dopamine occurred further apart, the mice learned more slowly.
This underscores that longer intervals between actions make it harder for mice to associate the sequence with the reward. Essentially, actions right before the reward are quickly grasped and improved upon, while earlier actions are refined more gradually. This “rewinding” process reinforces the behavior, helping mice progressively identify the precise actions and sequences that yield the reward.
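One common way to express this kind of time-dependent credit in reinforcement learning is an exponentially decaying "eligibility trace." The Python sketch below uses that device purely as an illustration. The exponential form, the time constant `tau`, and the action names and timings are assumptions, not quantities taken from the study.

```python
import math
from typing import Dict


def credit_by_recency(
    action_times: Dict[str, float],
    reward_time: float,
    tau: float = 2.0,
) -> Dict[str, float]:
    """Give each earlier action a credit that decays with its delay to the reward.

    Credit is exp(-delay / tau), so an action at the moment of reward gets 1.0
    and older actions get progressively less. Actions after the reward are skipped.
    """
    return {
        name: math.exp(-(reward_time - t) / tau)
        for name, t in action_times.items()
        if t <= reward_time
    }


# Hypothetical action times in seconds, with the reward arriving at 6.0 s.
action_times = {"approach": 0.0, "rear": 4.0, "turn": 5.5}
credit = credit_by_recency(action_times, reward_time=6.0)

for name, value in credit.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the action closest to the reward (`turn`) receives the most credit and the earliest action (`approach`) the least, which matches the qualitative picture of recent actions being learned faster than earlier ones.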
Lead author Jonathan Tang, Ph.D., an Assistant Professor at the University of Washington Medicine – Pediatrics, Seattle Children’s Research Institute, highlighted the potential impact of these findings on diverse fields like education and artificial intelligence (AI).
For instance, incorporating exploration, mistakes, and gradual refinement in the classroom aligns more with the human brain's innate learning processes. In AI, these insights could lead to more sophisticated and efficient learning systems, better replicating biological learning processes and enhancing adaptability to new data and situations.
This study provides deeper insights into how the brain learns and adapts through trial and error, whether the learner is a scientist or a pup.
We take a lot of stuff for granted about how things work, including credit assignment. But it’s when you really start diving in that you realize the complexity. This is why people do science: to home in on the truth of the matter.
Jonathan Tang, Study Lead Author and Assistant Professor, University of Washington
Journal reference:
Tang, J. C. Y., et al. (2023) Dynamic behaviour restructuring mediates dopamine-dependent credit assignment. Nature. doi.org/10.1038/s41586-023-06941-5.