Infralimbic cortex plays a similar role in the punishment and extinction of instrumental behavior

Punishment learning has generated interest as a model of how animals learn to stop responding (Bouton and Broomer, 2023, Jean-Richard-Dit-Bressel et al., 2018, Marchant et al., 2019). In punishment, an animal learns to stop performing an instrumental response because the response now produces an aversive stimulus (e.g., an electrical shock) in addition to the original reinforcer (e.g., a food pellet). On a superficial level, punishment appears different from extinction, where an animal learns to stop responding because the response now produces nothing at all. However, evidence suggests that punishment and extinction can be seen as manifestations of a common interference process in which newer learning interferes with the expression of the original conditioning in a context-dependent way.

Punishment and extinction are both specific to the context in which they are learned. When an instrumental response is trained in one context (A) then either punished or extinguished in another context (B), responding is suppressed during a subsequent test in Context B, but recovers robustly when tested in Context A. This ABA renewal effect has been demonstrated in separate punishment (Bouton and Schepers, 2015, Marchant et al., 2013) and extinction (Bernal-Gamboa et al., 2017a; Bouton et al., 2011, Nakajima et al., 2000; see Bouton et al., 2012) experiments, as well as in an experiment designed to compare renewal in the two paradigms directly (Broomer & Bouton, 2023a). Notably, Broomer and Bouton observed nearly identical degrees of renewal following punishment and extinction. Additional experiments by Broomer and Bouton (2023a) manipulating temporal context (i.e., spontaneous recovery; Bernal-Gamboa et al., 2017b; Rescorla, 1997a) and reinforcer context (i.e., reacquisition; Woods & Bouton, 2007; see Bouton et al., 2021) similarly indicated that punishment and extinction are specific to the context in which they are learned.

There is a second similarity between the two paradigms. Additional evidence from Todd (2013) and Bouton and Schepers (2015) suggests that both forms of behavioral suppression are specific to the response. Rats learned to perform one response in one context (R1 in A) and another response in another context (R2 in B). Both responses were then either punished (Bouton & Schepers, 2015) or extinguished (Todd, 2013) in the context opposite to the one in which they were trained (R1 in B, R2 in A). When subsequently tested in each context, each response was selectively suppressed in the context where it had been punished or extinguished. In other words, rats learned to stop making a specific response in a specific context. The finding implies an underlying associative structure in which the punishment or extinction context acts as a stimulus that directly inhibits the response that was punished or extinguished there, e.g., as captured by an inhibitory stimulus–response (S-R) association (Bouton et al., 2016, Bouton et al., 2021, Broomer and Bouton, 2023b, Todd, 2013; see also Rescorla, 1993, Rescorla, 1997b).

The results of the various renewal procedures (Bouton and Schepers, 2015, Bouton et al., 2011, Broomer and Bouton, 2023a, Todd, 2013) suggest that punishment and extinction are both instances of retroactive interference, in which newer instrumental learning retroactively interferes with the expression of the original conditioning in a context- and response-specific way (Bouton, 1993, Bouton, 2002, Bouton, 2019, Bouton and Swartzentruber, 1991, Bouton et al., 2021, Miller and Escobar, 2002, Miller et al., 1986). The common behavioral mechanism suggests a similar possible overlap in neural circuitry. One brain region that might support such overlap is the infralimbic cortex (IL). IL is known to be necessary for the consolidation and retrieval of extinction learning across both Pavlovian and instrumental paradigms with a variety of food and drug reinforcers (e.g., Eddy et al., 2016, Gutman et al., 2017, LaLumiere et al., 2010, Laurent and Westbrook, 2009, Lingawi et al., 2017, Milad and Quirk, 2002, Nett et al., 2023, Rhodes and Killcross, 2004, Rhodes and Killcross, 2007, Peters et al., 2008; but see Bossert et al., 2011). For example, Eddy et al. (2016) found that pharmacological inactivation of IL prior to a test of ABA renewal significantly increased responding in the extinction context, suggesting impaired retrieval or expression of extinction learning.

Laurent, Westbrook, Lingawi, and colleagues suggest that IL supports an inhibitory memory of a Pavlovian conditioned stimulus presented alone, based on evidence for IL involvement in both Pavlovian extinction (Laurent & Westbrook, 2009) and latent inhibition (e.g., Lingawi et al., 2017). However, much of the work cited above has focused on IL’s role in instrumental, rather than Pavlovian, learning. In another important part of the instrumental learning literature, IL has also been implicated in instrumental habit learning (Coutureau and Killcross, 2003, Killcross and Coutureau, 2003, Shipman et al., 2018, Smith et al., 2012). Specifically, an instrumental behavior is thought to transition from goal-directed to habitual control with extensive training, as evidenced by the loss of sensitivity to changes in the value of the outcome (see Adams, 1982, Dickinson, 1985). IL inactivation following extensive training of an instrumental response restores this sensitivity, suggesting a loss of habitual control and a switch back to goal-direction (Coutureau and Killcross, 2003, Killcross and Coutureau, 2003, Shipman et al., 2018, Smith et al., 2012). Recently, Steinfeld and Bouton (2021) suggested that habit retroactively interferes with goal-directed control in a context-specific way, much as extinction retroactively interferes with the original conditioning. Thus, IL seems to play a similar role in both extinction and habit, controlling expression of newer learning that retroactively interferes with the old in a context-dependent way. It is also worth noting that habit, like instrumental extinction, is characterized as an S-R association between context and response (Adams and Dickinson, 1981, Dickinson, 1985).

The similar involvement of IL in habit and extinction has influenced some theoretical frameworks of IL function. Barker et al. (2014) argue that IL suppresses original action-outcome-based learning (goal-direction and conditioning) and facilitates the emergence of new learning (habit and extinction). Green and Bouton (2021) propose that IL is important in a range of interference paradigms such as extinction, habit, and punishment. They suggest that IL may be capable of bidirectionally switching between behaviors; that is, in situations where multiple conflicting associations have been learned, IL may simply promote whichever association is currently active. Similarly, but in frameworks that explicitly accommodate both instrumental and Pavlovian learning, Roughley and Killcross (2021) suggest that IL is necessary for selection between competing learned associations with a single Pavlovian cue or instrumental response, and Nett and LaLumiere (2021) propose that IL is broadly involved in modulating changing relationships between Pavlovian cues, instrumental responses, and environmental stimuli. Although these frameworks vary in their precise accounts of IL function, they suggest that IL may be broadly involved in interference-based learning.

Although the idea of IL involvement in punishment is compatible with the frameworks described above, and consistent with the role of IL in extinction, there is currently little evidence to support a role for IL in punishment. For example, Jean-Richard-dit-Bressel and McNally (2016) trained rats to respond on two different levers for the same food outcome, then punished responses on one of the levers with footshock (the other lever served as an unpunished control). During an aversive choice test in which both response levers were available, IL inactivation reduced the latency to press both levers, although it did not significantly affect relative performance on either. Similarly, Pelloux et al. (2013) observed no difference between IL-lesioned and sham-lesioned rats when a cocaine-reinforced seeking-taking behavioral chain was punished.

In sum, the involvement of IL in punishment—and in a general mechanism of interference in instrumental learning—remains unclear. Thus, the present experiments had two goals. The first was to test the hypothesis that IL would be similarly involved in context-specific response suppression following either punishment or extinction (Experiment 1). The second goal was to further characterize the associative content of the learning controlled by IL in both punishment and extinction. Specifically, we tested whether IL controlled an inhibitory S-R association between a context and response that was either punished or extinguished there.
