Seminar talk, 2008


DESCRIPTION

A seminar talk I gave in 2008 to the University of Toronto graduate student seminar.

TRANSCRIPT

Learning to Forage: Rules, rules, everywhere a rule.

Steven Hamblin - Dept. of Biology, UQÀM

The road ahead...

Some background:

Components of the problem: learning, foraging, optima.

Producer-Scrounger game.

Learning rules.

Our approach:

Simulations and genetic algorithms.

Results.

Next steps.

Learning

ESS: A strategy which, if adopted by a population, cannot be invaded by a rare mutant strategy.
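In symbols (Maynard Smith's standard conditions, added here for reference rather than taken from the slide): writing $E(\sigma, \tau)$ for the payoff to an individual playing $\sigma$ against a population playing $\tau$, a strategy $\sigma^*$ is an ESS when

```latex
% ESS conditions: sigma* resists invasion by any rare mutant sigma.
\forall\, \sigma \neq \sigma^*:\quad
E(\sigma^*, \sigma^*) > E(\sigma, \sigma^*)
\quad\text{or}\quad
\bigl[\, E(\sigma^*, \sigma^*) = E(\sigma, \sigma^*)
\ \text{and}\ E(\sigma^*, \sigma) > E(\sigma, \sigma) \,\bigr].
```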

[Slide diagram: social foraging, equilibrium behaviour, learning.]

Evolution of Learning Rules

Producer-Scrounger Game

[Slide illustrations: producers and scroungers in a foraging group.]

[Excerpt from Animal Behaviour, 29(2), p. 544:]

Where the two pay-off curves intersect, both types fare equally well: to one side of the intersection producers do better, to the other, scroungers do better. We can call this the ESS point in accordance with the principle of evolutionarily stable strategies (Maynard Smith 1974; Dawkins 1976). The ESS point represents the stable mixture of producers and scroungers in selective terms to which groups which contain the two types should converge (Fig. 1b).

However, the situation is unlikely to be as straightforward as that. Because both frequency-dependent and density-dependent factors are likely to operate with changes in group size, pay-offs to producers and scroungers are more accurately represented as pay-off surfaces (Fig. 1c). The same principles apply to the surfaces as to the curves in Fig. 1a, except now the intersection between the surfaces for producers and scroungers produces a line rather than a single point. The line of intersection can be mapped as an ESS line onto the two-dimensional surface between the producer/scrounger axes (Fig. 1d), and groups should now 'track' the line rather than converge to a single point. A new and important implication arising from the idea of an ESS line is that the ratio of producers to scroungers at equilibrium is likely to depend on group size. Depending on the shape of the two intersecting surfaces, the ESS line in the horizontal plane can describe a wide variety of curves, all of which, except for straight lines through the origin, show a group size effect. The precise shapes of the surfaces may vary depending on the nature of the producer/scrounger relationship. In guarder/'sneak' relationships during mating, for example, the pay-off to guarders (producers) might decrease mono- [...]

[Figure residue omitted; the original caption follows.]

Fig. 1. (a) Pay-off to individual producers and scroungers as a function of the producer:scrounger ratio in the group (here arbitrarily set at six individuals). The intersection of the two curves is a point representing equal pay-offs to producers and scroungers; when strategies are conditional it is the point at which it would not pay any individual to change strategy. (b) The ESS corresponding to the pay-offs shown in (a). (c) The pay-off to individual producers and scroungers as a function of the number of scroungers at a site yields two surfaces. The intersection of the surfaces is a line giving the ESS for each group size. (d) The projection of these ESSs onto the horizontal plane, giving the ESS line as a function of the number of producers and the number of scroungers.

General note: For simplicity the ESS line has been drawn as if non-integer numbers of producers and scroungers were possible. Restriction to integers gives a line to the right of that shown, usually as close as possible. The integer ESS for a given flock size gives a ratio of scroungers to producers such that if any one changed strategy he would do worse.

Barnard & Sibly, 1981.

[Slide figure: a producer-scrounger mixture axis running from 100% producer / 0% scrounger to 0% producer / 100% scrounger, with the midpoint at 50% producer / 50% scrounger.]

Do they learn?

Yes:

Mottley & Giraldeau, 2000.

Katsnelson et al., 2008.

ISBE, 2008.

Individual-based model (a.k.a. agent-based model).

Rules tested in isolation; stability test was questionable.

Rules

Relative Payoff Sum

Perfect Memory

Linear Operator

Relative Payoff Sum

$S_i(t) = x\,S_i(t-1) + (1-x)\,r_i + P_i(t)$

where $0 < x < 1$ is a memory factor,

$r_i > 0$ is the residual value associated with alternative $i$,

$P_i(t)$ is the payoff to alternative $i$ at time $t$, and

$S_i(t)$ is the value that the animal places on the behavioural alternative $i$ at time $t$.
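As a concrete reading of the update, here is a minimal Python sketch. The update line follows the equation above; the probability-matching choice rule is an assumption, since the slide does not say how $S_i$ maps to behaviour (it is, though, the pairing Harley originally proposed for this rule):

```python
import random

def rps_update(S, P, x, r):
    """One Relative Payoff Sum step: S_i(t) = x*S_i(t-1) + (1-x)*r_i + P_i(t).

    S: current values per alternative; P: payoffs this step;
    x: memory factor (0 < x < 1); r: residuals (each r_i > 0).
    """
    return [x * s + (1 - x) * ri + pi for s, ri, pi in zip(S, r, P)]

def choose(S):
    # Probability matching: pick alternative i with probability S_i / sum(S).
    # (Assumption: the choice rule is not stated on the slide.)
    total = sum(S)
    return random.choices(range(len(S)), weights=[s / total for s in S])[0]

# Example with two alternatives (0 = produce, 1 = scrounge):
S = [1.0, 1.0]
S = rps_update(S, P=[0.0, 2.0], x=0.9, r=[0.5, 0.5])
tactic = choose(S)
```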

Perfect Memory

$S_i(t) = \alpha + R_i(t) / (\beta + N_i(t))$

where $R_i(t)$ is the cumulative payoff from alternative $i$ up to time $t$,

$N_i(t)$ is the number of time periods from the beginning in which the option was selected, and

$\alpha$ and $\beta$ are parameters.
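The same kind of sketch for Perfect Memory, which tracks cumulative payoffs and choice counts rather than a decayed value (using the definitions above; a sketch, not the talk's code):

```python
def pm_value(R_i, N_i, alpha, beta):
    """Perfect Memory: S_i(t) = alpha + R_i(t) / (beta + N_i(t)).

    R_i: cumulative payoff from alternative i so far;
    N_i: number of periods in which alternative i was selected.
    """
    return alpha + R_i / (beta + N_i)

# Example: an alternative chosen 10 times for 12.5 total payoff.
print(pm_value(R_i=12.5, N_i=10, alpha=0.1, beta=2.0))  # 0.1 + 12.5/12
```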

Linear Operator

$S_i(t) = x\,S_i(t-1) + (1-x)\,P_i(t)$

where $0 < x < 1$ is a memory factor,

$P_i(t)$ is the payoff to alternative $i$ at time $t$, and

$S_i(t)$ is the value that the animal places on the behavioural alternative $i$ at time $t$.
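And the Linear Operator, which is the Relative Payoff Sum update without the residual term; a one-line sketch to make the contrast explicit:

```python
def lo_update(S_i, P_i, x):
    """Linear Operator: S_i(t) = x*S_i(t-1) + (1-x)*P_i(t)."""
    # With P_i = 0 this value decays geometrically toward 0,
    # whereas the RPS value decays toward the residual r_i instead.
    return x * S_i + (1 - x) * P_i
```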

Relative Payoff Sum? $S_i(t) = x\,S_i(t-1) + (1-x)\,r_i + P_i(t)$

Perfect Memory? $S_i(t) = \alpha + R_i(t)/(\beta + N_i(t))$

Linear Operator? $S_i(t) = x\,S_i(t-1) + (1-x)\,P_i(t)$

Multiple stable rules with multiple parameters?

[Flowchart: one time step of the agent loop. An agent at a patch with food feeds. Otherwise it chooses a tactic. Produce: move randomly in search of a patch. Scrounge: if any conspecifics are feeding, move toward the closest one (re-checking that it is still feeding); on arrival, feed if the patch still contains food.]
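A minimal sketch of one such time step, assuming agents and patches live on a toroidal grid; every name and data structure here is illustrative, not the talk's implementation (in particular, the scrounger jumps to the closest feeder in one move, a simplification of the "there yet?" loop):

```python
import random

def agent_step(agent, agents, food, grid_size):
    """One time step of the flowchart above. `agent` is a dict with
    'pos' (x, y) and 'tactic'; `food` maps patch positions to seed counts."""
    pos = agent["pos"]
    if food.get(pos, 0) > 0:
        food[pos] -= 1                       # at a patch with food: feed
        return
    if agent["tactic"] == "produce":
        x, y = pos                           # move randomly (4 cardinal directions)
        dx, dy = random.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        agent["pos"] = ((x + dx) % grid_size, (y + dy) % grid_size)
    else:                                    # scrounge
        feeding = [a for a in agents
                   if a is not agent and food.get(a["pos"], 0) > 0]
        if feeding:
            closest = min(feeding, key=lambda a: abs(a["pos"][0] - pos[0])
                                               + abs(a["pos"][1] - pos[1]))
            agent["pos"] = closest["pos"]    # simplification: jump to closest feeder
            if food.get(agent["pos"], 0) > 0:
                food[agent["pos"]] -= 1      # still food in patch: feed
```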

Simulation notes...

Foraging grid is a variable-sized square grid with movement in the 4 cardinal directions.

Number of patches and number of agents kept to 20% and 10% of grid size, respectively.

Thus: a 40x40 grid (1,600 cells) would have 320 patches and 160 agents.

Genetic Algorithms

Algorithms that simulate evolution to solve optimization problems.

[Cycle: initial population → measure fitness → select for reproduction → mutate → back to measure fitness; exit after n generations.]
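A sketch of that loop for real-valued rule parameters, for instance an RPS genome of memory factor and residuals. The fitness function, assumed non-negative, would be total food intake from the foraging simulation; the selection and mutation operators here are generic choices, not necessarily the ones used in the talk:

```python
import random

def genetic_algorithm(new_genome, fitness, pop_size=100, n_gens=500, mut_sd=0.1):
    """Evolve real-valued parameter vectors: measure fitness, select for
    reproduction in proportion to it, mutate, repeat for n generations.
    Assumes fitness(genome) returns a non-negative score (e.g. food eaten)."""
    pop = [new_genome() for _ in range(pop_size)]
    for _ in range(n_gens):
        scores = [fitness(g) for g in pop]
        parents = random.choices(pop, weights=scores, k=pop_size)   # selection
        pop = [[v + random.gauss(0.0, mut_sd) for v in g]
               for g in parents]                                    # mutation
    return pop

# e.g. an RPS genome: [memory factor x, producer residual, scrounger residual]
```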

One final wrinkle.

Environmental vs. frequency-dependent variance in payoff.

Environmental variation.

Manipulating patch density.

N changes, with greater N meaning greater variation.
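One way to realize this, stated as an assumption about the setup rather than a detail given on the slide: redraw the patch count each period so that its spread grows with its mean N:

```python
import math
import random

def redraw_patch_count(mean_n):
    """Redraw the number of food patches for a new period. The spread scales
    with the mean (sd = sqrt(mean_n)), so greater N means greater variation.
    This sampling scheme is an assumption; the talk states only the effect."""
    return max(0, round(random.gauss(mean_n, math.sqrt(mean_n))))

# Environment held constant within each period, redrawn between periods:
patch_counts = [redraw_patch_count(320) for period in range(10)]
```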

Foraging / Learning rule simulation.

Genetic algorithm to optimize parameters and simulate population dynamics.

Sources of variation

Problem: Rules tested in isolation.
Solution: Simulation population randomly generated, using all rule types.

Problem: Parameter values arbitrarily chosen; few values tested.
Solution: Genetic algorithm to optimize across the whole parameter space.

Problem: Will rules converge on an ESS? Are they ES learning rules?
Solution: Genetic algorithm to simulate population dynamics.

Results to date

[Plots: three panels showing counts of each rule type (Relative Payoff Sum, Perfect Memory, Linear Operator) in the population over 500 generations.]

[Plots: evolved parameter values (0 to 5) against group size (10, 40, 90, 160, 360, 1000), building up across slides to show the producer residual, scrounger residual, and memory factor.]

Relative Payoff Sum: $S_i(t) = x\,S_i(t-1) + (1-x)\,r_i + P_i(t)$

$r_p \gg r_s$ for large population sizes.

[Plot: evolved scrounger residual against producer residual.]

[Plot: value assigned to a behaviour against time without payoff to that behaviour.]

[Plot: proportion of specialists against group size (10, 40, 90, 160, 360, 1000); annotated means of 0.981 and 0.008.]

[Plot: mean proportion of scrounging against periods of environmental variability (2 to 10); values roughly 0.245 to 0.260.]

[Plot: mean proportion of specialists against periods of environmental variability (2 to 10); values roughly 0.52 to 0.58.]

What does that mean?

Under the assumptions of this model, the Relative Payoff Sum rule is optimal.

Differences in residuals give a prediction for empirical tests.

Small but consistent effect of environmental variability.

Learning is selected against.

Next steps?

Questions?

Thanks to:

The Giraldeau Lab.

Guy Beauchamp.

Maria Modanu and Steve Walker, for the invitation.

Evolution of learning rule form.

Relative Payoff Sum? $S_i(t) = x\,S_i(t-1) + (1-x)\,r_i + P_i(t)$

Perfect Memory? $S_i(t) = \alpha + R_i(t)/(\beta + N_i(t))$

Linear Operator? $S_i(t) = x\,S_i(t-1) + (1-x)\,P_i(t)$

[Cycle: initial population → measure fitness → select for reproduction → mutate → back to measure fitness; exit after n generations.]

Foraging / Learning rule simulation.

Genetic algorithm to optimize parameters and simulate population dynamics.

Genetic programming to optimize rule structure.
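Genetic programming evolves the form of the rule rather than just its parameters, for example by representing candidate updates as expression trees that mutation and crossover can rearrange; a toy sketch, not the talk's implementation:

```python
# A candidate learning rule as a nested tuple expression over
# S (previous value), P (payoff), r (residual), and constants.
RULE = ("add", ("mul", 0.9, "S"), ("mul", 0.1, ("add", "r", "P")))

def evaluate(node, env):
    """Recursively evaluate an expression tree against variable bindings."""
    if isinstance(node, (int, float)):
        return node
    if isinstance(node, str):
        return env[node]
    op, left, right = node
    a, b = evaluate(left, env), evaluate(right, env)
    return a + b if op == "add" else a * b

# RULE here encodes an RPS-like update (though (1-x) also scales P):
print(evaluate(RULE, {"S": 1.0, "P": 2.0, "r": 0.5}))  # 1.15
```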

Learning

[Excerpt from Mottley & Giraldeau (2000):]

[...] housed in flocks of six in common cages (59 × 32 cm and 46 cm high) made of galvanized wire mesh and kept on a 12:12 h light:dark cycle at 27°C (±2°). They were fed ad libitum on a mixture of white and red millet seeds and offered ad libitum water. Each bird was marked with a unique combination of two coloured leg bands. In addition, the tail and neck feathers of each individual were coloured with acrylic paint to allow individual identification from a distance.

Apparatus

The purpose of the experimental apparatus was to constrain subjects to act as either producers or scroungers in order to manipulate the frequency of each tactic within a flock. The apparatus consisted of an indoor cage (273 × 102 cm and 104 cm high) with a producer and a scrounger compartment divided by a series of 22 patches, of which every second one contained seeds (Fig. 2a). An opaque barrier placed length-wise from ceiling to floor prevented birds from moving between the producer and scrounger compartments (Fig. 2a).

Each patch consisted of a seed container and a string that prevented the seeds from falling out. Pulling the string caused the seeds to fall into a 2 × 2 cm collecting dish located directly below the seed container. Once in the collecting dish the seeds were available to the individual that pulled the string from the producer compartment and all individuals within the scrounger [...]

[Figure labels omitted: barrier, producer and scrounger sides, seed container, division, collecting dish, string, perch.]

Figure 2. Top view of the experimental apparatus (a) and foraging patch (b). Individuals could search for seed-containing patches by pulling the string associated with each patch. Strings were available only in the producer compartment. Birds in the scrounger compartment searched for individuals feeding from produced patches. When the top portion of an opaque barrier was in place, the birds in one compartment could not move into the other compartment. A close-up view of the patch (b) shows that producers had to sit on a perch directly in front of a patch to pull the string associated with that patch, and if seeds were present, they were released into the collecting dish. From the perch, a producer could reach the collecting dish by stretching its neck through a small hole in the division placed between compartments. The arrow indicates the direction in which the string had to be pulled to release the seeds.

Mottley & Giraldeau, 2000.
