

Decision making improves across adolescent development in the rat: implications for orbitofrontal circuit development

N. Moin Afshar, AJ Keip¹, D Lee², JR Taylor¹, and SM Groman¹
1. Department of Psychiatry, Yale University; 2. Department of Neuroscience, Johns Hopkins University

Introduction

Adolescence is a dynamic phase of brain development associated with a decline in synaptic density, an increase in myelination and a strengthening of neural circuits. This refinement in neural systems is believed to improve the efficiency of the brain and enhance the speed of information flow across neural networks, both of which are critical for optimal decision making. The neurobiological mechanisms that mediate these changes in decision making, however, are unknown.

We investigated how decision-making processes, which are controlled by OFC circuits, change across adolescent development in male and female rats.

Methods

Training on the Reversal Learning (RL) task

Long-Evans rats (N=43; 21 F, 22 M) bred in our facility were trained on a three-choice reversal learning (RL) task at either postnatal day (PND) 30 (N=12), 50 (N=12), 70 (N=7) or 90 (N=12). Rats were first trained to make operant responses (e.g., nose port entry) to receive an oral delivery of sweetened condensed milk (10% w/v) in 12 h overnight sessions. Rats were then trained to discriminate between three spatial locations using a deterministic schedule of reinforcement. Each time rats reached a performance criterion (e.g., choosing the highest reinforced option 21 times in the last 30 trials), the reward contingencies changed. Sessions terminated when rats received 151 rewards or 12 h had elapsed.

After completing 3 overnight sessions on the deterministic RL task, decision making was assessed in a probabilistically reinforced RL task. Reinforcement probabilities (70%, 30%, and 10%) were pseudorandomly assigned to the noseports at the start of each session. The reward contingencies changed each time the performance criterion was met. Sessions terminated when rats achieved 151 rewards or 12 h had elapsed. Rats completed 3 overnight sessions on the probabilistic RL task and were sacrificed immediately after the last session.
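The session logic described above can be sketched in a few lines of Python. This is a minimal illustration rather than the task code used in the study. Several details are assumptions: the deterministic schedule is taken to be reward probabilities of (1.0, 0.0, 0.0), a reversal is modeled by reshuffling the probabilities until the best noseport changes, `max_trials` is a hypothetical stand-in for the 12 h time limit, and `win_stay_lose_shift` is a toy agent included only so the example runs.

```python
import random
from collections import deque

REWARD_CAP = 151            # session ends after this many rewards (from the text)
WINDOW, REQUIRED = 30, 21   # example criterion: best option chosen 21 times in last 30 trials


def win_stay_lose_shift(history):
    """Toy agent (hypothetical): repeat a rewarded choice, otherwise try the next port."""
    if not history:
        return 0
    last_action, last_reward = history[-1]
    return last_action if last_reward else (last_action + 1) % 3


def run_session(choose, probs=(0.7, 0.3, 0.1), max_trials=2000, seed=0):
    """Simulate one session; returns (reversals, rewards, trials).

    `choose(history)` returns a noseport index (0-2) given past (action, reward) pairs.
    `max_trials` is an assumed proxy for the 12 h limit.
    """
    rng = random.Random(seed)
    probs = list(probs)
    rng.shuffle(probs)               # pseudorandom assignment at session start
    recent = deque(maxlen=WINDOW)    # 1 if the best option was chosen, else 0
    history, rewards, reversals = [], 0, 0
    for _ in range(max_trials):
        if rewards >= REWARD_CAP:
            break
        action = choose(history)
        reward = int(rng.random() < probs[action])
        history.append((action, reward))
        rewards += reward
        recent.append(int(probs[action] == max(probs)))
        if sum(recent) >= REQUIRED:
            # Assumed reversal rule: reshuffle until a different port is best.
            old_best = probs.index(max(probs))
            while probs.index(max(probs)) == old_best:
                rng.shuffle(probs)
            recent.clear()
            reversals += 1
    return reversals, rewards, len(history)


if __name__ == "__main__":
    print(run_session(win_stay_lose_shift, probs=(1.0, 0.0, 0.0)))  # deterministic
    print(run_session(win_stay_lose_shift))                          # probabilistic
```

Clearing the criterion window after each reversal is also a design choice made for this sketch; the text does not say whether trials before a reversal count toward the next criterion.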

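Reinforcement learning model

Trial-by-trial choice data in the RL task were fit with the following reinforcement learning model, which contained four free parameters:

$$
Q_i(t+1) =
\begin{cases}
\gamma_C\, Q_i(t) + \Delta_+ & \text{if } a(t) = i \text{ and } r(t) = 1 \\
\gamma_C\, Q_i(t) + \Delta_0 & \text{if } a(t) = i \text{ and } r(t) = 0 \\
\gamma_U\, Q_i(t) & \text{if } a(t) \neq i
\end{cases}
$$

Here a(t) is the option chosen on trial t and r(t) indicates whether that choice was rewarded.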
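The update rule above can be written as a short function. This is a sketch of the value update only: it does not include a choice rule or the fitting procedure, neither of which is specified here. The function name and the parameter values in the usage example are hypothetical and chosen purely for illustration.

```python
def update_values(q, action, reward, gamma_c, gamma_u, delta_pos, delta_zero):
    """Apply one trial of the four-parameter update to a list of action values.

    q          : current values, one per option
    action     : index of the chosen option, a(t)
    reward     : 1 if rewarded, 0 otherwise, r(t)
    gamma_c    : decay rate for the chosen option
    gamma_u    : decay rate for unchosen options
    delta_pos  : appetitive strength of a rewarded outcome
    delta_zero : aversive strength of a no-reward outcome
    """
    new_q = []
    for i, q_i in enumerate(q):
        if i == action:
            outcome = delta_pos if reward == 1 else delta_zero
            new_q.append(gamma_c * q_i + outcome)
        else:
            new_q.append(gamma_u * q_i)
    return new_q


if __name__ == "__main__":
    # Illustrative parameter values only (not estimates from the study).
    params = dict(gamma_c=0.5, gamma_u=0.9, delta_pos=1.0, delta_zero=-0.2)
    q = [0.0, 0.0, 0.0]
    q = update_values(q, action=0, reward=1, **params)  # rewarded choice of option 0
    q = update_values(q, action=1, reward=0, **params)  # unrewarded choice of option 1
    print(q)
```

Returning a new list rather than mutating `q` keeps each trial's values available for inspection, which is convenient when stepping through a choice sequence by hand.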

The model was fit separately to the choice data collected in the deterministic and probabilistic RL tasks, and the parameter estimates were then averaged across the two schedules.

Adolescence-related changes in signaling pathways

Brain tissue was collected immediately following the last RL session. Tissue was homogenized and underwent tryptic digestion to generate peptide fragments. Digested peptides were submitted to the Yale-NIDA Neuroproteomics Core, where they will be separated on an ultra-high-pressure liquid chromatography (LC) system and analyzed by LC-MS/MS. Peptide precursors were isolated and fragmented to produce a measure of peptide abundance. We will compare protein abundance across adolescent development and examine the relationship between protein expression and decision making.

Conclusions

These data demonstrate that improvements in flexible decision making that occur during adolescence are related to reward-mediated updating. Based on our previous work demonstrating that action value updating following rewards is controlled by the amygdala→OFC circuit, we hypothesize that maturation of the amygdala→OFC circuit may be critical for the age-related improvements in decision making we observed here. Our ongoing proteomic studies seek to identify the signaling pathways that are responsible for these decision-making improvements.

Future directions

We have found that individual differences in the Δ+ parameter prior to any drug use predict future drug-taking behaviors. We hypothesize, therefore, that developmental disruptions in the amygdala→OFC circuit enhance addiction-like susceptibility. Our ongoing work is using a viral approach to characterize the amygdala→OFC circuit across development to determine if differences in circuit formation in adolescence predict drug-taking behaviors in adulthood.

Funding sources: These studies were supported by a NARSAD Young Investigator Award (SMG), a Yale/NIDA Neuroproteomics Pilot Award (P30 DA018343) and the State of Connecticut.

Reversal learning (RL) task

Decision making improves across adolescence

Value updating changes across adolescent development

Improvements in RL are specific to reward-mediated updating

γC: decay rate for chosen options
γU: decay rate for unchosen options
Δ+: appetitive strength of rewarded outcome
Δ0: aversive strength of no-reward outcome

[Figure: number of reversals achieved, and number of reversals achieved / trials completed, by postnatal day (30, 50, 70, 90), with panels for the deterministic schedule and the probabilistic schedule. Age effects printed on the panels, in order: F(3,39)=3.68, p=0.01; F(3,37)=7.36, p<0.001; F(3,37)=6.69, p=0.001; F(3,39)=6.86, p<0.001.]

OFC circuits influence distinct reinforcement-learning steps

Experimental design
- 2 h exposure to 10% sweetened condensed milk (SCM)
- 12 h food restriction
- Overnight operant training (1-2 days)
- Overnight deterministic RL test (3 days)
- Overnight probabilistic RL test (3 days)

[Figure: parameter estimates (γC, Δ+, γU, Δ0) for PND30, PND50, PND70 and PND90.]

Omnibus, age × parameter: χ²=28.12; p=0.001
γC, age: χ²=3.83; p=0.28
γU, age: χ²=2.49; p=0.48
Δ+, age: χ²=13.30; p=0.004
Δ0, age: χ²=10.95; p=0.012

[Figure: number of reversals achieved under the deterministic schedule plotted against number of reversals achieved under the probabilistic schedule; R²=0.36, p<0.001.]

Groman et al., 2019; Neuron.

Age × outcome: χ²=12.53; p=0.006
p(stay | reward & correct), age: χ²=22.55; p<0.001
p(shift | unreward & incorrect), age: χ²=2.04; p=0.57

[Figure: trials to criterion for the 1st and 2nd reversal, by postnatal day (30, 50, 70, 90); two panels.]

After the last RL test, rats were sacrificed and fresh tissue was collected from the OFC, nucleus accumbens and amygdala. Tissue underwent tryptic digestion and peptides were submitted to the Yale-NIDA Neuroproteomics Core for tandem mass spectrometry. Proteins whose abundance relates to decision making, or changes across adolescent development, will be identified.

[Figure: number of cocaine infusions (0.5 mg/kg/infusion) plotted against the Δ+ parameter estimate; R²=0.32, p=0.03.]

[Figure: reinforcement schedules for the deterministic RL task and the probabilistic RL task, shown as p(reward | NPx) by trial for noseports NP1, NP2 and NP3.]