Learning Classifier Systems

Upload: kiera-dame

Post on 15-Dec-2015

Learning Classifier Systems (LCS)

• The system has three layers:
– A performance system that interacts with the environment,
– An apportionment of credit algorithm that rates rules by their usefulness,
– A rule discovery algorithm that generates plausible new rules to replace less useful rules.

Performance System Cycles

• A message is posted in the message list from the input interface.
• Each rule is matched against the message list.
• All matching rules compete to post in the next message list via a bidding process; the winning rule posts in the new message list.
• The output interface checks the new message and produces an effector action.
• The new message list replaces the previous one.
• Repeat.
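The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not the system's actual implementation: the `Rule` class, the function names, the bid formula with `beta = 0.2`, and the choice to let exactly one winner post are all assumptions made for the sketch. The `#` symbols in an action are posted as-is here rather than being resolved against the matched message.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    condition: str   # string over {0, 1, #}; '#' matches either bit
    action: str      # message the rule posts if it wins
    strength: float  # usefulness estimate, used when bidding


def matches(rule, message):
    # A rule matches when every position is '#' or equals the message bit.
    return all(c in ("#", m) for c, m in zip(rule.condition, message))


def bid(rule, beta=0.2):
    # Assumed bid: beta * (fraction of non-# positions) * strength.
    specific = sum(c != "#" for c in rule.condition)
    return beta * specific / len(rule.condition) * rule.strength


def run_cycle(rules, message_list):
    """One cycle: match, compete, and return the new message list."""
    candidates = [r for r in rules
                  if any(matches(r, m) for m in message_list)]
    if not candidates:
        return []                      # nothing matched, nothing is posted
    winner = max(candidates, key=bid)  # highest bidder wins
    return [winner.action]             # the new list replaces the old one


rules = [
    Rule("#1##", "#010", 100),
    Rule("1#01", "1###", 50),
    Rule("0#0#", "0011", 100),
    Rule("1###", "1010", 1000),
    Rule("101#", "0111", 1000),
]
new_message_list = run_cycle(rules, ["0100"])
```

Calling `run_cycle` repeatedly, with the output interface reading each new message list, gives the "Repeat" step.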

Overview of LCS

Rule format

• Rule
– Condition = {0,1,#}^k
– Action = message to be posted in the message list
– Strength = rule’s usefulness to the system

Example (Wolf or Grandmother?)

Encoding (attributes: kind, ears, num. of legs, smart, scream, runaway, kiss, teeth)

Wolf: 1 0 1 1 # 1 1 0
GrandMa: 0 1 0 0 # 0 0 1

Matching

Message List: 0 1 0 0

[N]
Condition   Action   Strength
# 1 # #     #010     100
1 # 0 1     1###     50
0 # 0 #     0011     100
1 # # #     1010     1000
1 0 1 #     0111     1000

[M]
Condition   Action   Strength
# 1 # #     #010     100
0 # 0 #     0011     100
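As a rough sketch of how [M] is obtained from [N], the check below treats `#` as a wildcard and compares every other position with the message bit. The function names and the tuple layout `(condition, action, strength)` are assumptions made for illustration.

```python
def matches(condition, message):
    """True if every condition symbol is '#' or equals the message bit."""
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message))


def match_set(population, message):
    """Return the rules in the population whose condition matches."""
    return [rule for rule in population if matches(rule[0], message)]


# The five rules of [N], as (condition, action, strength).
population = [
    ("#1##", "#010", 100),
    ("1#01", "1###", 50),
    ("0#0#", "0011", 100),
    ("1###", "1010", 1000),
    ("101#", "0111", 1000),
]

print(match_set(population, "0100"))
# [('#1##', '#010', 100), ('0#0#', '0011', 100)]
```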

Bidding Process

Bid(R,t) = β × specificity(R) × Strength(R,t)
Specificity(R) = number of non-# / k

[M]
Rule id   Condition   Action   Strength
r1        # 1 # #     #010     100
r3        0 # 0 #     0011     100

β = 0.2

Bid(r1) = 0.2 × ¼ × 100 = 5
Bid(r3) = 0.2 × ½ × 100 = 10

r3 posts its message in the new message list.
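The bid formula translates directly into code. The sketch below reproduces the two bids computed above; the function names and the dictionary layout are illustrative assumptions.

```python
def specificity(condition):
    """Fraction of positions in the condition that are not '#'."""
    return sum(symbol != "#" for symbol in condition) / len(condition)


def bid(condition, strength, beta=0.2):
    """Bid(R, t) = beta * specificity(R) * Strength(R, t)."""
    return beta * specificity(condition) * strength


# The match set [M]: rule id -> (condition, strength).
match_set = {"r1": ("#1##", 100), "r3": ("0#0#", 100)}

bids = {rule_id: bid(cond, strength)
        for rule_id, (cond, strength) in match_set.items()}
winner = max(bids, key=bids.get)  # the highest bidder posts its message
```

With the values above, `winner` is `"r3"`, in line with the worked example: the more specific rule outbids the more general one when strengths are equal.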

Credit assignment: Bucket Brigade

[Figure: r3 (bucket 10) is coupled to r5 (bucket 150); r5 is executed in the Environment; reward 200]

Credit assignment: Bucket Brigade

[Figure: r3 (bucket 10), r5 (bucket 150), Environment, reward 200]
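One common reading of the bucket brigade is that each winning rule pays its bid to the rule that fired just before it, and the last rule in the chain collects the environmental reward. The sketch below follows that reading, which is an assumption and not spelled out on the slides. The starting strengths (100 for r3, 1000 for r5) are taken from the rule table, assuming the rules are numbered in the order listed. The function name is hypothetical, and the first rule's payment simply leaves the system here.

```python
def bucket_brigade(strengths, bids, chain, reward):
    """Return updated strengths after a chain of coupled rules fires.

    chain lists rule ids in firing order. Each rule pays its bid to the
    rule that fired just before it; the last rule collects the reward.
    """
    updated = dict(strengths)
    previous = None
    for rule_id in chain:
        updated[rule_id] -= bids[rule_id]       # the winner pays its bid
        if previous is not None:
            updated[previous] += bids[rule_id]  # payment goes to its supplier
        previous = rule_id
    if chain:
        updated[chain[-1]] += reward            # environment pays the last rule
    return updated


strengths = {"r3": 100, "r5": 1000}
bids = {"r3": 10, "r5": 150}
print(bucket_brigade(strengths, bids, ["r3", "r5"], reward=200))
# {'r3': 240, 'r5': 1050}
```

Under these assumptions r3 ends up stronger even though it never receives the reward directly, because r5's payment rewards it for setting up the rewarded action.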

Genetic Algorithms

• Fitness = rule strength
• Parents: strong classifiers (best, roulette wheel, etc.)
• Mutation: alter parts of parent’s string
• Crossover: exchange parts of parents’ strings
• Offspring replaces a weak rule.
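Roulette-wheel selection picks a parent with probability proportional to its strength. The sketch below is one straightforward way to do that and to find the weak rule that an offspring would replace; the function names are illustrative, and replacing the single weakest rule is an assumption about what "a weak rule" means.

```python
import random


def roulette_select(strengths, rng):
    """Pick an index with probability proportional to its strength."""
    total = sum(strengths)
    pick = rng.random() * total        # a point on the wheel, in [0, total)
    running = 0.0
    for index, strength in enumerate(strengths):
        running += strength
        if pick < running:
            return index
    return len(strengths) - 1          # guard against floating-point rounding


def weakest(strengths):
    """Index of the rule with the lowest strength."""
    return min(range(len(strengths)), key=strengths.__getitem__)


rng = random.Random(0)                 # seeded so runs are repeatable
strengths = [100, 50, 100, 1000, 1000]
parent = roulette_select(strengths, rng)
victim = weakest(strengths)            # slot an offspring would take over
```

With these strengths the two rules of strength 1000 are selected most of the time, while the rule of strength 50 is the one replaced.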

Genetic Algorithms (cont.)

Crossover

Parent 1: 0 0 1 0 1 1 # #
Parent 2: 1 0 1 0 0 1 0 0

Offspring (parts after the crossover point exchanged):
0 0 1 0 1 1 0 0
1 0 1 0 0 1 # #

Mutation

Parent 1: 0 0 1 0 1 1 # #
Parent 2: 1 0 1 0 0 1 0 0

After mutation:
0 0 1 0 1 0 # #
1 0 1 0 0 1 0 0
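Both operators are short string manipulations. The sketch below reproduces the crossover and mutation examples; the function names are illustrative, and the crossover point of 6 (counting from the left) is an assumption that is consistent with the offspring shown.

```python
def one_point_crossover(parent1, parent2, point):
    """Exchange the parts of the two strings after the crossover point."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def mutate(rule, position, symbol):
    """Replace the symbol at one position (0-based) with a new one."""
    return rule[:position] + symbol + rule[position + 1:]


print(one_point_crossover("001011##", "10100100", 6))
# ('00101100', '101001##')

print(mutate("001011##", 5, "0"))
# 001010##
```

In a full system the crossover point, the mutated position, and the new symbol would normally be chosen at random; they are fixed here so the result matches the example.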

Maze Environment

[Figure: agent A in the maze Environment]

Message List (Signal smell-ahead bump heading score location):
40 5 f N 5 (1,2)

Condition            Action   Strength
# >0 # # # #         GF       1000
# <0 # # # # ∧ TL    TL       1000
# <0 # # # # ∧ TR    TR       1000

Output: GF

References

• A Mathematical Framework for Studying Learning in Classifier Systems, John H. Holland, Physica D, Vol. 22, No. 1-3, 1986, pp. 307-317

• A First Order Logic Classifier System, Drew Mellor, GECCO ’05