Secret computations of the hidden brain 1: How brain reward works and why it matters … in higher education
Brandy Eggan and Jim Stellar
Many of our blogs start with the idea that unconscious decision-making works by balancing positive and negative experiences, like a choice between possible majors in college based on an internship. Many of you have asked me (JS) how that decision process works in the brain. What is the underlying brain mechanism or mechanisms, as far as we know today? For those who are interested, this blog is for you.
We imagine a series of these blogs, first on different aspects of the neuroscience of reward and then on the negative side of the unconscious decision-making process and the integration between them. We may tie in some neuroeconomics thinking. As with all of our blogs, we hope this blog post is accessible to the layperson, even though we will use a few scientific references (you can ignore them).
Our ultimate focus is on how experiential education is natural to the brain and contributes powerfully to enhanced higher education. Experiential education certainly seems to be all over the current higher education meetings (e.g. AAC&U, ACE in the last 6 months). We see it as the core of “High Impact Practices,” driving employability, and leading to students who are stronger in innovation, creativity, and critical thinking.
Reward in history
The notion that behavior is driven by reward and punishment has been around for thousands of years and is seen in the earliest teachings of Greek philosophers. Plato thought that after death in the underworld a panel of judges would mete out justice; those who lived a noble life would be rewarded, while those who lived an evil life would be punished. Today, the concept of reward has shifted to behaviors that are first related to survival, whether unlearned (e.g. eating food when hungry) or learned (e.g. responding to cues that lead to getting that food). Behavioral neuroscientist Wolfram Schultz (2016) has defined three criteria by which rewards typically function. First, rewards are typically positive reinforcers (although lessening pain works too) and act to induce learning and repetition of the behaviors that produce the reward. Of course, the repetition concept actually goes back to B.F. Skinner, probably one of the most famous psychologists, and even earlier in history. Second, rewards have an economic utility; that is, the organism will work for the desired object to an extent that is based on the assumed amount of reward received once the object is obtained. Third, rewards are frequently associated with positive emotions such as desire or pleasure.
What we would add here from our perspective is that while we humans are aware of the conscious spreadsheet type of analysis that might lead to a stock purchase, much of decision-making, even in economics, occurs outside consciousness. For work on this kind of decision-making, Daniel Kahneman won the Nobel Prize in Economics in 2002 and published an important book in 2011, Thinking, Fast and Slow, in which the thinking fast part happens without conscious awareness except as an emotional sense that something feels right or good about a decision. This field is now called behavioral economics.
Why do we think the brain chemical dopamine underlies reward?
Neurotransmitters are chemicals that are released by one nerve cell when it is active and that rapidly spread to a neighboring nerve cell, where they activate specific receptors for that chemical to continue the chain of communication in a nerve cell circuit. Dopamine was suggested to be a neurotransmitter in the late 1950s based on its distribution pattern in the brain (Carlsson et al, 1958; Carlsson, 1959). Selective drugs that block the dopamine receptor, called antagonists, were then developed and used in behavioral studies to determine the role of dopamine receptors in mediating reward. Some of the earliest studies, using a whole-body injection of a dopamine antagonist, showed that drug-seeking behavior (a laboratory rat pressing a lever) for a drug of abuse was significantly decreased compared to animals that did not receive the dopamine antagonist (Yokel & Wise, 1976; de Wit & Wise, 1977). Further studies were then conducted to see if these dopamine receptor antagonists would decrease the self-stimulation of some previously mapped dopamine-rich “pleasure centers” in the brain, and indeed the drugs did just that (Fouriezos & Wise, 1976; Fouriezos et al, 1978; Corbett & Wise, 1980; Wise & Rompre, 1989). These findings were interpreted to mean that dopamine receptors were critical for the rewarding efficacy of the drug of abuse and that dopamine acted in specific brain regions as a substrate of reward. This is a big field. At the time JS published a book on the subject in 1985, some 10,000 papers had been published. Today that number would be hundreds of times higher. Obviously, we have just given you a taste of the research above.
With dopamine emerging as a reward neurotransmitter, researchers then began to look at different dopamine-rich areas in the brain as the next step to understanding the brain’s reward system. The simplest method at this time to determine the importance of brain regions in mediating a behavior was to surgically enter the brain and lesion (inactivate) the area of interest. The results of these studies revealed that the dopamine-rich nucleus accumbens was critical for reward (Roberts et al, 1977) in that animals with a non-functional accumbens lost their drive to actively press on a lever for a drug. Researchers soon identified the source of these accumbens dopamine neurons to be the ventral tegmental area. Activating these ventral tegmental area dopamine neurons in turn caused a subsequent activation of the dopamine-containing terminal regions in the nucleus accumbens (Gysling & Wang, 1983). These data together were the early footholds of an amazingly vast field of research into reward, dopamine, and this hallmark reward pathway that we now know plays a critical role in decision-making behavior with respect to rewarding stimuli.
Fun fact about dopamine release and reward – it shifts
With the simplified introduction presented thus far, or for those of you who have studied reward in an introduction to neuroscience or psychology course, you probably have heard that dopamine is reward. This transmitter has even earned the title of the brain’s “pleasure transmitter.” A simple scenario would be eating a piece of your favorite chocolate; you consume the sweet treat, dopamine is released in your reward pathway, specifically in that nucleus accumbens, and you feel happy. You enjoyed the chocolate. This has been demonstrated many times in the lab and is shown in the associated figure (top panel in the graph below; Figure source: Schultz et al., 1997), where each firing of dopamine neurons is shown as a dot on one of the thin horizontal lines stacked on top of each other as the experiment is repeated. At the top is the cumulative result summed vertically across all of the dots. You see firing just after a reward (R) is presented, just to the right of the vertical solid line. With no conditioned stimulus (CS, explained later), the dopamine neurons that project to the nucleus accumbens fire after the reward is received, just as you would expect.
But it’s not that simple. Scientists have uncovered evidence that dopamine is more than a pleasure transmitter; it is also a hallmark of event predictability. With regard to that piece of chocolate (R), it has been shown that if a food reward is preceded by a stimulus such as someone handing you the chocolate, that stimulus configuration becomes what psychologists call a conditioned stimulus (CS), a concept made famous by the Russian physiologist Ivan Pavlov, who taught dogs to salivate to the ringing of a bell. The CS predicts the occurrence of reward. What is amazing is that your brain will no longer release dopamine in response to consuming the chocolate. Rather, the release will shift to the sight of your chocolate-bearing co-worker. This is shown in the middle panel of the above figure (Ljungberg et al, 1992; Mirenowicz & Schultz, 1994).
The opposite of this situation has also proven to be true, namely that dopamine cell firing is depressed by the omission of a predicted reward. Studies have shown that if an animal fails to be rewarded (no R) when a reward is expected following the presentation of the predicting stimulus (CS; i.e. your co-worker enters the room but does not provide you with the piece of chocolate you were expecting), the activity of dopamine neurons in the reward pathway is actually depressed. This is shown in the bottom panel of the figure (Hollerman & Schultz, 1996; Ljungberg et al, 1991; Schultz et al, 1993).
Together these data suggest that the traditional view of dopamine as a “reward transmitter” falls short in reflecting the true capacities of this brain chemical. Rather, dopamine seems more to be a monitor of reward and reflects expectation based on an internal clock that tracks the details surrounding a predicted reward.
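For readers who like to see an idea run, here is a minimal sketch of how a prediction signal of this kind could behave. It is a toy temporal-difference learning model in the general spirit of the prediction account discussed in Schultz et al. (1997), not a reconstruction of any experiment cited here. The function name run_trial, the time steps, the learning rate, and the reward size of 1.0 are all illustrative assumptions, and an untrained model stands in for the "no CS" case in the top panel.

```python
# Toy temporal-difference (TD) model of a reward prediction error.
# All names and numbers below are illustrative assumptions, not values
# taken from the studies cited in this post.

def run_trial(values, cs_time, reward_time, rewarded=True, alpha=0.2, learn=True):
    """Run one trial and return the prediction error at each time step.

    values[t] is the learned prediction of upcoming reward at step t.
    Steps before the CS always predict nothing, because the CS itself
    arrives unpredictably.
    """
    n_steps = len(values)
    errors = [0.0] * n_steps
    for t in range(n_steps - 1):
        # Reward (size 1.0) is delivered on arrival at reward_time.
        reward = 1.0 if (rewarded and t + 1 == reward_time) else 0.0
        v_now = values[t] if t >= cs_time else 0.0
        v_next = values[t + 1] if t + 1 >= cs_time else 0.0
        # Prediction error: what just happened minus what was expected.
        delta = reward + v_next - v_now
        errors[t + 1] = delta
        if learn and t >= cs_time:
            values[t] += alpha * delta  # nudge the prediction toward the outcome
    return errors


N_STEPS, CS_TIME, REWARD_TIME = 9, 2, 6
values = [0.0] * N_STEPS

# Before any learning: the reward is a surprise (like the top panel).
errors_before = run_trial(values, CS_TIME, REWARD_TIME, learn=False)

# Pair the CS with the reward many times.
for _ in range(500):
    run_trial(values, CS_TIME, REWARD_TIME)

# After learning: probe with a rewarded trial and with an omitted reward.
errors_after = run_trial(values, CS_TIME, REWARD_TIME, learn=False)
errors_omitted = run_trial(values, CS_TIME, REWARD_TIME, rewarded=False, learn=False)

for label, errs in [("before", errors_before), ("after", errors_after), ("omitted", errors_omitted)]:
    print(label, "CS:", round(errs[CS_TIME], 3), "R:", round(errs[REWARD_TIME], 3))
```

In this sketch the error signal starts out at the time of the reward, moves to the time of the CS after repeated pairing, and dips below zero at the expected reward time when the reward is withheld, which is the same qualitative pattern as the three panels described above.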
Tying this back to higher education
Dopamine is then more of a monitor for reward and not the reward itself. Even more importantly, the experience of getting the reward and the cues that precede it cause a shift in the response of the neural circuits to the cue. To us this may indicate how experience builds implicit knowledge into the system. If the brain can change in a little chocolate experiment, think what it can do after a student does an internship in a field in which they are majoring. In future blog posts, we will look at other aspects of the brain’s circuitry and how this unconscious decision-making circuit can change, drive our decisions, and even communicate with the conscious decision-making process … all from this neuroscience perspective.
Carlsson, A. (1959) The occurrence, distribution and physiological role of catecholamines in the nervous system. Pharmacol Rev. 11: 490-493.
Carlsson, A., Waldeck, B. (1958) A fluorimetric method for the determination of dopamine (3-hydroxytyramine). Acta Physiol Scand. 44: 293-298.
Corbett, D., Wise, R.A. (1980) Intracranial self-stimulation in relation to the ascending dopaminergic systems of the midbrain: A movable electrode mapping study. Brain Res. 185: 1-15.
De Wit, H., Wise, R.A. (1977) Blockade of cocaine reinforcement in rats with the dopamine receptor blocker pimozide, but not with the noradrenergic blockers phentolamine or phenoxybenzamine. Can J Psychol. 31: 195-203.
Fouriezos, G., Hansson, P., Wise, R.A. (1978) Neuroleptic-induced attenuation of brain stimulation reward in rats. J Comp Physiol Psychol. 92: 661-671.
Fouriezos, G., Wise, R.A. (1976) Pimozide-induced extinction in rats: Stimulus control of responding rules out motor deficit. Brain Res. 11: 71-75.
Hollerman, R.J., Schultz, W. (1996) Activity of dopamine neurons during learning in a familiar task context. Soc Neurosci Abstr. 22: 1388.
Ljungberg, T., Apicella, P., Schultz, W. (1991) Responses of monkey midbrain dopamine neurons during delayed alternation performance. Brain Res. 586: 337-341.
Ljungberg, T., Apicella, P., Schultz, W. (1992) Responses of monkey dopamine neurons during learning of behavioral reactions. J Neurophysiol. 69: 145-163.
Mirenowicz, J., Schultz, W. (1996) Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature. 379: 449-451.
Schultz, W., Apicella, P., Ljungberg, T. (1993) Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J Neurosci. 13: 900-913.
Schultz, W., Dayan, P., Montague, P.R. (1997) A neural substrate of prediction and reward. Science. 275: 1593-1599.
Schultz, W. (2016) Dopamine reward prediction-error signalling: a two-component response. Nature Rev Neurosci. 17: 183-195.
Yokel, R.A., Wise, R.A. (1976) Attenuation of intravenous amphetamine reinforcement by central dopamine blockade in rats. Psychopharmacol. 48: 311-318.
Wise, R.A., Rompre, P.P. (1989) Brain dopamine and reward. Annu Rev Psychol. 40: 191-225.