Post on 20-Dec-2015
02<Heuristic Search>-1
Lecture 02 Heuristic Search
• Topics
– Basics
– Hill Climbing
– Simulated Annealing
– Best First Search
– Genetic Algorithms
– Game Search
Basics
• Heuristics
– General rules derived from past experience of solving a problem
• Problem solving by heuristic search
– Solving a problem by reasoning about how to search a solution space using heuristics
• Solution space
– A space containing (partial) solution states
• Search
– Exploring solution spaces by generating solution states
Hill Climbing
• Search solution spaces using gradient as the heuristic
• Always search in the direction with the greatest gradient
[Figure: a hill-shaped solution landscape with states A, B, C]
Current state: A => Next state: B
Hill Climbing
• Solving 8-puzzle by hill climbing using the Manhattan distance heuristic
– Manhattan distance

Sgoal = ((1 2 3) (4 5 6) (7 8 #))    Sinitial = ((4 6 7) (1 # 2) (5 3 8))

– Heuristic: H(S) = −Σi MD(T_si)
– MD(T_si) = ||(x_si, y_si), (x_gi, y_gi)|| = |x_si − x_gi| + |y_si − y_gi|, where (x_si, y_si) and (x_gi, y_gi) are tile i's positions in S and Sgoal
– Example: MD(T_s1) = ||(2, 1), (1, 1)|| = |2 − 1| + |1 − 1| = 1
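As a minimal runnable sketch, the heuristic above can be computed directly. The representation is an assumption for illustration: 0 stands for the blank '#', and positions are 1-indexed as on the slide.

```python
# Manhattan-distance heuristic for the 8-puzzle, following the slide's
# convention H(S) = -sum_i MD(T_si), so that larger H is better.
GOAL = ((1, 2, 3), (4, 5, 6), (7, 8, 0))  # 0 stands for the blank '#'

def positions(state):
    """Map each tile to its (row, col) position, 1-indexed."""
    return {tile: (r + 1, c + 1)
            for r, row in enumerate(state)
            for c, tile in enumerate(row)}

def h(state, goal=GOAL):
    """H(S) = -sum of Manhattan distances of all tiles (blank excluded)."""
    ps, pg = positions(state), positions(goal)
    return -sum(abs(ps[t][0] - pg[t][0]) + abs(ps[t][1] - pg[t][1])
                for t in ps if t != 0)

S_INITIAL = ((4, 6, 7), (1, 0, 2), (5, 3, 8))
print(h(S_INITIAL))  # -16, matching the slide
print(h(GOAL))       # 0
```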
Hill Climbing
• Solving 8-puzzle by hill climbing using the Manhattan distance heuristic

Sinitial = ((4 6 7)(1 # 2)(5 3 8)) [H=-16]
├─ move # up:    ((4 # 7)(1 6 2)(5 3 8)) [H=-15]  <= selected
│  ├─ move # left:  ((# 4 7)(1 6 2)(5 3 8)) [H=-17]
│  └─ move # right: ((4 7 #)(1 6 2)(5 3 8)) [H=-14]  <= selected
├─ move # left:  ((4 6 7)(# 1 2)(5 3 8)) [H=-17]
├─ move # down:  ((4 6 7)(1 3 2)(5 # 8)) [H=-15]
└─ move # right: ((4 6 7)(1 2 #)(5 3 8)) [H=-15]
Hill Climbing
Hill-climbing(Sinitial) {
  Scurrent ← Sinitial;
  H(Scurrent) ← Heuristics-eval(Scurrent);
  loop {
    if ?Goal(Scurrent) = true, exit;
    New-states ← Legal-operators(Scurrent);
    if New-states = ∅, exit;
    for each Si ∈ New-states
      H(Si) ← Heuristics-eval(Si);
    Smax ← Argmax_Si {H(Si)};
    if H(Smax) ≦ H(Scurrent), exit;
    Scurrent ← Smax;
  }
  return Scurrent;
}
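The pseudocode above can be sketched as a short runnable program. The landscape F is a hypothetical 1-D example chosen to contain a local maximum, so it also previews the problem discussed on the next slide; neighbors of a state x are x-1 and x+1.

```python
# Minimal runnable version of the Hill-climbing pseudocode, maximizing a
# toy function over states 0..5.
F = [0, 2, 1, 4, 8, 3]  # global maximum at state 4, local maximum at state 1

def neighbors(x):
    return [n for n in (x - 1, x + 1) if 0 <= n < len(F)]

def hill_climbing(s):
    while True:
        best = max(neighbors(s), key=lambda n: F[n])
        if F[best] <= F[s]:   # no neighbor improves: stop
            return s
        s = best

print(hill_climbing(1))  # 1: stuck at the local maximum
print(hill_climbing(3))  # 4: reaches the global maximum
```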
Hill Climbing
• Problems with hill climbing
– Local maxima
– Plateaus
– Ridges
[Figure: a landscape showing a hill top, a plateau, and a ridge]
Simulated Annealing
• Annealing refers to the process of cooling a molten metal or glass back down to a solid state, so that the solid material settles into an optimal (low-energy) state.
• The temperature must be controlled carefully during annealing so that at any point the system is approximately in thermodynamic equilibrium, which implies that equilibrium entropy is reached.
• Given equilibrium entropy, a system contains the maximal possible number of states according to the second law of thermodynamics.
– Boltzmann defined entropy S as the logarithm of W, the number of microstates in the system, times a constant k; hence S = k log W.
Simulated Annealing
• The probability of each (energy) state E at some temperature T is given by the Boltzmann factor p(E):

p(E) = C e^(−E/kT)

• k: Boltzmann constant
[Figure: p(E) at a fixed temperature decreases with energy; P(E1) > P(E2) > P(E3) for E1 < E2 < E3]
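A quick numeric check of the relation in the figure; units are arbitrary (k = 1) and the energies and temperature are illustrative values.

```python
import math

# Boltzmann factors exp(-E / (k*T)) for three energies at a fixed temperature.
def boltzmann(E, T, k=1.0):
    return math.exp(-E / (k * T))

T = 2.0
E1, E2, E3 = 1.0, 2.0, 3.0
p1, p2, p3 = (boltzmann(E, T) for E in (E1, E2, E3))
print(p1 > p2 > p3)  # True: lower-energy states are more probable
```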
Simulated Annealing
• Annealing schedule: the mechanism that controls the annealing temperature; it includes the initial temperature as well as how the temperature is decreased
• Simulated annealing
– Simulates the annealing process, exploring possible states in proportion to their state probability, to reach the optimal final state
Simulated Annealing
• Simulated annealing uses the Monte Carlo method to simulate the annealing process
– For a given state E, random() simulates the minimal probability that the state is promising
– If the Boltzmann factor of state E satisfies the following condition, it is considered a choice for further exploration:

p(E) ≧ random()

• Simulated annealing provides a mechanism to escape from local maxima
• Simulated annealing does not, however, guarantee optimal final solutions
• Annealing schedule
– Initial annealing temperature: T0
– Temperature change: Ti+1 = λ·Ti (0 < λ < 1)
Simulated Annealing
• Compared to Hill Climbing
[Figure: two landscapes (a) and (b) with states A, B, C, D; hill climbing stops at B, while simulated annealing may move on to C]
Current state: B => Next state: C if p(C) ≧ random()
Simulated Annealing
Simulated-annealing(Sinitial) {
  Scurrent ← Sinitial;
  H(Scurrent) ← Heuristics-eval(Scurrent);
  Tnext ← Annealing-schedule();
  loop {
    New-states ← Legal-operators(Scurrent);
    if New-states = ∅, exit;
    for each Si ∈ New-states
      H(Si) ← Heuristics-eval(Si);
    Snew ← Argmax_Si {H(Si)};
    if H(Snew) > H(Scurrent) then Scurrent ← Snew;
    else loop {
      Snew ← Random-select(New-states);
      if Boltzmann-factor(Snew, Tnext) > Random(1), then { Scurrent ← Snew; exit; }
    }
    if ?End-annealing-schedule() = true, exit;
    else Tnext ← Annealing-schedule();
  }
  return Scurrent;
}
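A runnable sketch of the pseudocode above, on the same hypothetical 1-D landscape used for hill climbing (local maximum at state 1, global maximum at state 4). The schedule parameters (T0 = 2.0, geometric cooling with λ = 0.95) and step count are illustrative choices, not values from the slides.

```python
import math
import random

F = [0, 2, 1, 4, 8, 3]  # toy landscape: local max at 1, global max at 4

def neighbors(x):
    return [n for n in (x - 1, x + 1) if 0 <= n < len(F)]

def simulated_annealing(s, t0=2.0, lam=0.95, steps=300, seed=0):
    rng = random.Random(seed)
    t, best = t0, s
    for _ in range(steps):
        n = rng.choice(neighbors(s))
        dh = F[n] - F[s]
        # Accept improvements outright; accept downhill moves with
        # probability exp(dH / T), the Boltzmann factor.
        if dh > 0 or rng.random() < math.exp(dh / t):
            s = n
        if F[s] > F[best]:
            best = s
        t *= lam  # Ti+1 = lambda * Ti
    return best

print(simulated_annealing(1))  # best state found; often the global maximum
```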
Best First Search
• Go global to cope with local maxima
• Use a priority queue Qp to contain all nodes waiting for exploration
• Qp represents the wave front (the frontier of the exploration process)
• Always choose the best state from Qp to explore next
• Completeness of Best First Search
– Can always find optimal solutions
Best First Search
• Solving 8-puzzle using Best First Search

Sinitial = ((4 6 7)(1 # 2)(5 3 8)) [H=-16]
├─ move # up:    ((4 # 7)(1 6 2)(5 3 8)) [H=-15]
│  ├─ move # left:  ((# 4 7)(1 6 2)(5 3 8)) [H=-17]
│  └─ move # right: ((4 7 #)(1 6 2)(5 3 8)) [H=-14]
├─ move # left:  ((4 6 7)(# 1 2)(5 3 8)) [H=-17]
├─ move # down:  ((4 6 7)(1 3 2)(5 # 8)) [H=-15]
└─ move # right: ((4 6 7)(1 2 #)(5 3 8)) [H=-15]
Best First Search
• Compared to Hill Climbing
[Figure: landscapes (a) and (b) with states A, B, C, D and the priority queue Qp]
Current state: B => Next state: D, if, after B is explored, D becomes the best in Qp
Best First Search
Best-first-search(Sinitial) {
  H(Sinitial) ← Heuristics-eval(Sinitial);
  Add-pqueue(Qp, Sinitial, H(Sinitial));
  loop {
    Smax ← Pop-pqueue(Qp);
    if Smax = nil, fail;
    else Scurrent ← Smax;
    if ?Goal(Scurrent) = true, exit;
    New-states ← Legal-operators(Scurrent);
    if New-states = ∅, exit;
    for each Si ∈ New-states {
      H(Si) ← Heuristics-eval(Si);
      Add-pqueue(Qp, Si, H(Si));
    }
  }
  return Scurrent;
}
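As a runnable sketch of the pseudocode above, greedy best-first search can solve the slide's 8-puzzle instance with the Manhattan-distance heuristic. The encoding (0 for the blank) and the visited set are implementation choices; `heapq` pops the smallest key, so the distance itself (that is, -H) is used as the priority.

```python
import heapq

GOAL = ((1, 2, 3), (4, 5, 6), (7, 8, 0))  # 0 is the blank '#'

def md(state):
    """Sum of Manhattan distances of all tiles (the priority, i.e. -H)."""
    goal_pos = {t: (r, c) for r, row in enumerate(GOAL) for c, t in enumerate(row)}
    return sum(abs(r - goal_pos[t][0]) + abs(c - goal_pos[t][1])
               for r, row in enumerate(state) for c, t in enumerate(row) if t)

def successors(state):
    """All states reachable by sliding the blank one step."""
    g = [list(row) for row in state]
    r, c = next((r, c) for r in range(3) for c in range(3) if g[r][c] == 0)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            g[r][c], g[nr][nc] = g[nr][nc], g[r][c]
            yield tuple(map(tuple, g))
            g[r][c], g[nr][nc] = g[nr][nc], g[r][c]

def best_first(start):
    qp, seen = [(md(start), start)], {start}   # Qp as a heap
    while qp:
        _, s = heapq.heappop(qp)               # best state in Qp
        if s == GOAL:
            return s
        for n in successors(s):
            if n not in seen:
                seen.add(n)
                heapq.heappush(qp, (md(n), n))
    return None

print(best_first(((4, 6, 7), (1, 0, 2), (5, 3, 8))) == GOAL)  # True
```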
Best First Search
• A* algorithm
– A Best First Search process with a heuristic that insists the estimated cost must never be larger than the real optimal cost
– That is, h(S) ≦ h*(S), given the heuristic evaluation function f(S) = g(S) + h(S), where h(S) and h*(S) represent the estimated and optimal cost, respectively, from the current state to the target state
• Optimality of A*
– Can always find optimal solutions through optimal paths
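A small A* sketch on a hypothetical weighted graph, with an admissible heuristic (here h happens to equal the true remaining cost h*), ordering the queue by f(S) = g(S) + h(S). The graph and costs are made up for illustration.

```python
import heapq

EDGES = {'A': [('B', 1), ('C', 4)],
         'B': [('C', 1), ('G', 5)],
         'C': [('G', 1)],
         'G': []}
H = {'A': 3, 'B': 2, 'C': 1, 'G': 0}  # admissible: h(S) <= h*(S)

def a_star(start, goal):
    qp = [(H[start], 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while qp:
        f, g, node, path = heapq.heappop(qp)
        if node == goal:
            return g, path
        for nxt, w in EDGES[node]:
            ng = g + w
            if ng < best_g.get(nxt, float('inf')):
                best_g[nxt] = ng
                heapq.heappush(qp, (ng + H[nxt], ng, nxt, path + [nxt]))
    return None

print(a_star('A', 'G'))  # (3, ['A', 'B', 'C', 'G']): the optimal path
```

Note that greedy best-first (h only) would be tempted by the direct edge A-C, while A*'s g + h ordering recovers the cheaper detour through B.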
Genetic Algorithms
• Simulate the natural evolution process
– The fittest survive
– Chromosomes vs. solutions
– Genetic operators vs. search
– Fitness value vs. objective function
• Genetic operators
– Selection
– Crossover
– Mutation
Genetic Algorithms
• Selection
– Selection strategy
• Elitism
• Variety
– Elitism
• Selection probabilities proportional to fitness values
• Crossover
– Crossover rate
– Crossover point
• Mutation
– Single- vs. multiple-point mutation
– Mutation rate
Genetic Algorithms
• Example: initial population selection
– Begin with a collection of initial solutions
– Solution example: X = 155, Y = 124, Z = 228
– Transform each solution into a bit string:

(155, 124, 228) → (100110110111110011100100)
                   [X: 10011011][Y: 01111100][Z: 11100100]
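The encoding above can be sketched directly: pack (X, Y, Z) into a 24-bit string, 8 bits per variable.

```python
def encode(x, y, z):
    """Concatenate three 8-bit binary representations into one chromosome."""
    return ''.join(format(v, '08b') for v in (x, y, z))

def decode(bits):
    """Split a 24-bit chromosome back into (X, Y, Z)."""
    return tuple(int(bits[i:i + 8], 2) for i in (0, 8, 16))

s = encode(155, 124, 228)
print(s)          # 100110110111110011100100, as on the slide
print(decode(s))  # (155, 124, 228)
```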
Genetic Algorithms
• Selection
– Select two parents from the population according to a cumulative fitness distribution formed as a roulette wheel

(155, 124, 228) → (100110110111110011100100)
(116,   4, 195) → (011101000000010011000011)
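A sketch of the roulette wheel: the wheel is the cumulative distribution of fitness values, and a random number r in [0, 1) picks an individual with probability proportional to its fitness. The fitness values below are illustrative.

```python
import random

def roulette_select(fitnesses, r):
    """Return the index selected by spin position r on the roulette wheel."""
    total = sum(fitnesses)
    cum = 0.0
    for i, f in enumerate(fitnesses):
        cum += f / total
        if r < cum:
            return i
    return len(fitnesses) - 1  # guard against rounding near r ~ 1.0

fits = [1.0, 2.0, 3.0]              # cumulative wheel: [1/6, 1/2, 1]
print(roulette_select(fits, 0.10))  # 0
print(roulette_select(fits, 0.40))  # 1
print(roulette_select(fits, 0.90))  # 2
print(roulette_select(fits, random.random()))  # a fitness-biased pick
```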
Genetic Algorithms
• Crossover
– Determine whether the two selected individuals qualify for crossover (crossover rate)
– Identify locations for crossover
– Crossovers need not be at gene boundaries
– Exchange the trailing portions of the two bit strings to create two new children

Parents:  (155, 124, 228) → (100110110111110011100100)
          (116,   4, 195) → (011101000000010011000011)
Children: (100110000000110011100111) → (152, 12, 231)
          (011101110111010011000000) → (119, 116, 192)
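Single-point crossover at the bit level can be sketched as below. The crossover point (bit 6) is an illustrative choice, not necessarily the one used in the slide's example.

```python
def crossover(p1, p2, point):
    """Exchange the trailing portions of two bit strings after `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

p1 = '100110110111110011100100'  # (155, 124, 228)
p2 = '011101000000010011000011'  # (116, 4, 195)
c1, c2 = crossover(p1, p2, 6)
print(c1)  # 100110000000010011000011
print(c2)  # 011101110111110011100100
```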
Genetic Algorithms
• Mutation
– Mutation points
• Point mutation: select a random bit in the string and change it
• Complex mutation: mutate a pattern or sequence of bits
– Determine whether the selected bit deserves mutation (mutation rate)

(152, 12, 231) → (100110000000110011100111)
(152, 76, 231) → (100110000100110011100111)
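Point mutation flips one bit of the string; flipping bit 9 (0-indexed) reproduces the slide's example, (152, 12, 231) → (152, 76, 231).

```python
def mutate(bits, i):
    """Flip the bit at index i (0-indexed)."""
    flipped = '1' if bits[i] == '0' else '0'
    return bits[:i] + flipped + bits[i + 1:]

def decode(bits):
    return tuple(int(bits[j:j + 8], 2) for j in (0, 8, 16))

s = '100110000000110011100111'   # (152, 12, 231)
m = mutate(s, 9)
print(m)          # 100110000100110011100111
print(decode(m))  # (152, 76, 231)
```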
Genetic Algorithms
Genetic-algorithm(P, Y, N, L) { /* P: initial population, Y: max generations, N: population size, L: chromosome length */
  Pcurrent ← P;
  gen ← 1;
  fold ← Average(Fitness-value(Pcurrent));
  loop until gen > Y {
    PD ← Selection-prob(Pcurrent);
    RW ← Produce-cum-prob(PD);
    Pnew ← ∅; i ← 1;
    loop until i > N {
      d1 ← Select(Pcurrent, RW); d2 ← Select(Pcurrent, RW);
      <dc1, dc2> ← Crossover(d1, d2, pc, L);
      dm1 ← Mutate(dc1, pm, L); Add(dm1, Pnew);
      dm2 ← Mutate(dc2, pm, L); Add(dm2, Pnew);
      i ← i + 2;
    }
    Pcurrent ← Pnew;
    fnew ← Average(Fitness-value(Pcurrent));
    if |fnew − fold| ≦ ε, exit;
    else {
      fold ← fnew;
      gen ← gen + 1;
    }
  }
  return Pcurrent and the fittest chromosome in Pcurrent;
}
Genetic Algorithms
Select(P, RW) {
  i ← Roulette-wheel(RW, P, Random());
  return i;
}

Crossover(d1, d2, pc, L) {
  if pc > Random() {
    b ← Random(L);
    dc1 ← Substring(d1, 1, b) + Substring(d2, b+1, L);
    dc2 ← Substring(d2, 1, b) + Substring(d1, b+1, L);
  }
  else <dc1, dc2> ← <d1, d2>;
  return <dc1, dc2>;
}

Mutate(d, pm, L) {
  b ← 1;
  loop until b > L {
    if pm > Random(), change the b-th bit of d;
    b ← b + 1;
  }
  return d;
}
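Putting the operators together, here is an end-to-end sketch of the GA pseudocode on the OneMax problem (fitness = number of 1-bits), a hypothetical objective chosen for easy checking. The parameters (pc, pm, N, L, generation count) are illustrative, and a simple fixed generation count stands in for the |fnew - fold| ≦ ε stopping test.

```python
import random

L_BITS, N, GENS, PC, PM = 16, 20, 40, 0.7, 0.01
rng = random.Random(42)

def fitness(d):
    return d.count('1')

def select(pop):
    """Roulette-wheel selection over the population's fitness values."""
    total = sum(fitness(d) for d in pop)
    r, cum = rng.random() * total, 0
    for d in pop:
        cum += fitness(d)
        if r < cum:
            return d
    return pop[-1]

def crossover(d1, d2):
    if PC > rng.random():
        b = rng.randrange(1, L_BITS)
        return d1[:b] + d2[b:], d2[:b] + d1[b:]
    return d1, d2

def mutate(d):
    return ''.join(('1' if c == '0' else '0') if PM > rng.random() else c
                   for c in d)

pop = [''.join(rng.choice('01') for _ in range(L_BITS)) for _ in range(N)]
best0 = max(fitness(d) for d in pop)
for _ in range(GENS):
    new = []
    while len(new) < N:
        c1, c2 = crossover(select(pop), select(pop))
        new += [mutate(c1), mutate(c2)]
    pop = new
best = max(fitness(d) for d in pop)
print(best0, best)  # fitness of the best individual tends to increase
```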
Genetic Algorithms
• Why do genetic algorithms find optimal solutions?
– Schema theorem: schemata with higher fitness values, shorter defining lengths, and fewer defining genes grow exponentially during the evolution process
• Higher fitness value: higher selection probability
• Shorter defining length: lower probability of disruption by crossover
• Fewer defining genes: lower probability of disruption by mutation
Genetic Algorithms
Gene positions: 12345678
Chromosome schema: **0**1**
Chromosomes included: 01010110, 00010110, …
Defining length: 6 − 3 = 3 (the position of the 0 is 3 and that of the 1 is 6)
Defining genes: 2 (the 0 and the 1)
Game Search
• Adversarial search
• MiniMax strategy
– Select maximally evaluated values from my moves
– Select minimally evaluated values from the opponent's moves

MiniMax-Value(n) =
  Utility(n)                                      if n is a terminal node
  max of MiniMax-Value(s) over s ∈ Successors(n)  if n is a MAX node
  min of MiniMax-Value(s) over s ∈ Successors(n)  if n is a MIN node

[Figure: a two-ply game tree (look ahead 2 steps). MAX root A has MIN children B, C, D with terminal values (3, 12, 8), (2, 4, 6), (14, 5, 2); Min(3, 12, 8) = 3, Min(2, 4, 6) = 2, Min(14, 5, 2) = 2; the root takes Max(3, 2, 2) = 3]
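The definition above can be sketched directly on the two-ply example tree, representing an internal node as a list of children and a terminal as its utility value.

```python
def minimax(node, is_max):
    """MiniMax-Value(n): utility at terminals, max/min over successors."""
    if isinstance(node, int):          # terminal node: return Utility(n)
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # MIN children B, C, D of MAX root A
print(minimax(TREE, True))  # 3
```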
Game Search
• MiniMax on Tic-tac-toe
Game Search
• MiniMax on Tic-tac-toe (cont.)
Game Search
• MiniMax Algorithm
1. Generate the game tree down to the terminal nodes.
2. Apply the utility function to the terminal nodes.
3. For a set S of sibling nodes, pass up to the parent:
1) the lowest value in S if the parent is a MIN node
2) the largest value in S if the parent is a MAX node
4. Recursively do the above until the backed-up values reach the initial state.
5. The value of the initial state is the minimum score that Max can guarantee.
Game Search
• Alpha-Beta Pruning
– In general, if m is better than n for Player, the subtree leading to n need never be explored
[Figure: a game tree alternating Player and Opponent levels, with value m available at node A and value n reachable deeper through node B]
Game Search
• α-Cutoff
[Figure: the Player (MAX) root already has a child of value 2, so its value is ≥ 2; once the next Opponent (MIN) child reveals a leaf of value 1, that child's value is ≤ 1 < 2 and its remaining leaves are pruned (leaf values shown: 7, 8, 1)]
Game Search
• β-Cutoff
[Figure: an Opponent (MIN) node already has a child of value 5, so its value is ≤ 5; once a deeper Player (MAX) node reaches a value ≥ 7 > 5, its remaining successors are pruned (leaf values shown: 2, 5, 7)]
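Both cutoffs can be sketched in one alpha-beta routine, run on the same two-ply tree used for minimax above; a counter shows that it evaluates fewer terminal nodes than plain minimax while returning the same value.

```python
evaluated = 0  # counts terminal-node evaluations

def alphabeta(node, is_max, alpha=float('-inf'), beta=float('inf')):
    global evaluated
    if isinstance(node, int):          # terminal node
        evaluated += 1
        return node
    if is_max:
        v = float('-inf')
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, v)
            if v >= beta:              # beta-cutoff
                break
        return v
    v = float('inf')
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta))
        beta = min(beta, v)
        if v <= alpha:                 # alpha-cutoff
            break
    return v

TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value = alphabeta(TREE, True)
print(value)      # 3, the same value as plain minimax
print(evaluated)  # 7 of the 9 terminals: C's leaves 4 and 6 are pruned
```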