Post on 20-Dec-2015
02<Heuristic Search>-1
Lecture 02 Heuristic Search
• Topics
– Basics
– Hill Climbing
– Simulated Annealing
– Best First Search
– Genetic Algorithms
– Game Search
Basics
• Heuristics
– General rules derived from past experience of solving a problem
• Problem solving by heuristic search
– Solving a problem by reasoning about how to search a solution space using heuristics
• Solution space
– A space containing (partial) solution states
• Search
– Exploring solution spaces by generating solution states
Hill Climbing
• Search solution spaces using gradient as the heuristic
• Always search in the direction with the greatest gradient
[Figure: a hill-shaped solution landscape with states A, B, C]
Current state: A => Next state: B
Hill Climbing
• Solving 8-puzzle by hill climbing using the Manhattan distance heuristic
– Manhattan distance

Sgoal = ((1 2 3) (4 5 6) (7 8 #))    Sinitial = ((4 6 7) (1 # 2) (5 3 8))

– Heuristic: H(S) = −Σi MD(T_si)
– MD(T_si) = ||(x_si, y_si), (x_gi, y_gi)|| = |x_si − x_gi| + |y_si − y_gi|, where (x_si, y_si) and (x_gi, y_gi) are tile i's positions in S and Sgoal
– Example: MD(T_s1) = ||(2, 1), (1, 1)|| = |2 − 1| + |1 − 1| = 1
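As a minimal runnable sketch, the heuristic above can be computed directly. The representation is an assumption for illustration: 0 stands for the blank '#', and positions are 1-indexed as on the slide.

```python
# Manhattan-distance heuristic for the 8-puzzle, following the slide's
# convention H(S) = -sum_i MD(T_si), so that larger H is better.
GOAL = ((1, 2, 3), (4, 5, 6), (7, 8, 0))  # 0 stands for the blank '#'

def positions(state):
    """Map each tile to its (row, col) position, 1-indexed."""
    return {tile: (r + 1, c + 1)
            for r, row in enumerate(state)
            for c, tile in enumerate(row)}

def h(state, goal=GOAL):
    """H(S) = -sum of Manhattan distances of all tiles (blank excluded)."""
    ps, pg = positions(state), positions(goal)
    return -sum(abs(ps[t][0] - pg[t][0]) + abs(ps[t][1] - pg[t][1])
                for t in ps if t != 0)

S_INITIAL = ((4, 6, 7), (1, 0, 2), (5, 3, 8))
print(h(S_INITIAL))  # -16, matching the slide
print(h(GOAL))       # 0
```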
Hill Climbing
• Solving 8-puzzle by hill climbing using the Manhattan distance heuristic

Sinitial = ((4 6 7)(1 # 2)(5 3 8)) [H=-16]
├─ move # up:    ((4 # 7)(1 6 2)(5 3 8)) [H=-15]  <= selected
│  ├─ move # left:  ((# 4 7)(1 6 2)(5 3 8)) [H=-17]
│  └─ move # right: ((4 7 #)(1 6 2)(5 3 8)) [H=-14]  <= selected
├─ move # left:  ((4 6 7)(# 1 2)(5 3 8)) [H=-17]
├─ move # down:  ((4 6 7)(1 3 2)(5 # 8)) [H=-15]
└─ move # right: ((4 6 7)(1 2 #)(5 3 8)) [H=-15]
Hill Climbing
Hill-climbing(Sinitial) {
  Scurrent ← Sinitial;
  H(Scurrent) ← Heuristics-eval(Scurrent);
  loop {
    if ?Goal(Scurrent) = true, exit;
    New-states ← Legal-operators(Scurrent);
    if New-states = ∅, exit;
    for each Si ∈ New-states
      H(Si) ← Heuristics-eval(Si);
    Smax ← Argmax_Si {H(Si)};
    if H(Smax) ≦ H(Scurrent), exit;
    Scurrent ← Smax;
  }
  return Scurrent;
}
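The pseudocode above can be sketched as a short runnable program. The landscape F is a hypothetical 1-D example chosen to contain a local maximum, so it also previews the problem discussed on the next slide; neighbors of a state x are x-1 and x+1.

```python
# Minimal runnable version of the Hill-climbing pseudocode, maximizing a
# toy function over states 0..5.
F = [0, 2, 1, 4, 8, 3]  # global maximum at state 4, local maximum at state 1

def neighbors(x):
    return [n for n in (x - 1, x + 1) if 0 <= n < len(F)]

def hill_climbing(s):
    while True:
        best = max(neighbors(s), key=lambda n: F[n])
        if F[best] <= F[s]:   # no neighbor improves: stop
            return s
        s = best

print(hill_climbing(1))  # 1: stuck at the local maximum
print(hill_climbing(3))  # 4: reaches the global maximum
```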
Hill Climbing
• Problems with hill climbing
– Local maxima
– Plateaus
– Ridges
[Figure: a landscape showing a hill top, a plateau, and a ridge]
Simulated Annealing
• Annealing refers to the process of cooling a molten metal or glass back down to a solid state, so that the solid material settles into an optimal (low-energy) state.
• The temperature must be controlled carefully during annealing so that at any point the system is approximately in thermodynamic equilibrium, which implies that equilibrium entropy is reached.
• Given equilibrium entropy, a system contains the maximal possible number of states according to the second law of thermodynamics.
– Boltzmann defined entropy S as the logarithm of W, the number of microstates in the system, times a constant k; hence S = k log W.
Simulated Annealing
• The probability of each (energy) state E at some temperature T is given by the Boltzmann factor p(E):

p(E) = C e^(−E/kT)

• k: Boltzmann constant
[Figure: p(E) at a fixed temperature decreases with energy; P(E1) > P(E2) > P(E3) for E1 < E2 < E3]
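A quick numeric check of the relation in the figure; units are arbitrary (k = 1) and the energies and temperature are illustrative values.

```python
import math

# Boltzmann factors exp(-E / (k*T)) for three energies at a fixed temperature.
def boltzmann(E, T, k=1.0):
    return math.exp(-E / (k * T))

T = 2.0
E1, E2, E3 = 1.0, 2.0, 3.0
p1, p2, p3 = (boltzmann(E, T) for E in (E1, E2, E3))
print(p1 > p2 > p3)  # True: lower-energy states are more probable
```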
Simulated Annealing
• Annealing schedule: the mechanism that controls the annealing temperature; it includes the initial temperature as well as how the temperature is decreased
• Simulated annealing
– Simulates the annealing process, exploring possible states in proportion to their state probability, to reach the optimal final state
Simulated Annealing
• Simulated annealing uses the Monte Carlo method to simulate the annealing process
– For a given state E, random() simulates the minimal probability that the state is promising
– If the Boltzmann factor of state E satisfies the following condition, it is considered a choice for further exploration:

p(E) ≧ random()

• Simulated annealing provides a mechanism to escape from local maxima
• Simulated annealing does not, however, guarantee optimal final solutions
• Annealing schedule
– Initial annealing temperature: T0
– Temperature change: Ti+1 = λ·Ti (0 < λ < 1)
Simulated Annealing
• Compared to Hill Climbing
[Figure: two landscapes (a) and (b) with states A, B, C, D; hill climbing stops at B, while simulated annealing may move on to C]
Current state: B => Next state: C if p(C) ≧ random()
Simulated Annealing
Simulated-annealing(Sinitial) {
  Scurrent ← Sinitial;
  H(Scurrent) ← Heuristics-eval(Scurrent);
  Tnext ← Annealing-schedule();
  loop {
    New-states ← Legal-operators(Scurrent);
    if New-states = ∅, exit;
    for each Si ∈ New-states
      H(Si) ← Heuristics-eval(Si);
    Snew ← Argmax_Si {H(Si)};
    if H(Snew) > H(Scurrent) then Scurrent ← Snew;
    else loop {
      Snew ← Random-select(New-states);
      if Boltzmann-factor(Snew, Tnext) > Random(1), then { Scurrent ← Snew; exit; }
    }
    if ?End-annealing-schedule() = true, exit;
    else Tnext ← Annealing-schedule();
  }
  return Scurrent;
}
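A runnable sketch of the pseudocode above, on the same hypothetical 1-D landscape used for hill climbing (local maximum at state 1, global maximum at state 4). The schedule parameters (T0 = 2.0, geometric cooling with λ = 0.95) and step count are illustrative choices, not values from the slides.

```python
import math
import random

F = [0, 2, 1, 4, 8, 3]  # toy landscape: local max at 1, global max at 4

def neighbors(x):
    return [n for n in (x - 1, x + 1) if 0 <= n < len(F)]

def simulated_annealing(s, t0=2.0, lam=0.95, steps=300, seed=0):
    rng = random.Random(seed)
    t, best = t0, s
    for _ in range(steps):
        n = rng.choice(neighbors(s))
        dh = F[n] - F[s]
        # Accept improvements outright; accept downhill moves with
        # probability exp(dH / T), the Boltzmann factor.
        if dh > 0 or rng.random() < math.exp(dh / t):
            s = n
        if F[s] > F[best]:
            best = s
        t *= lam  # Ti+1 = lambda * Ti
    return best

print(simulated_annealing(1))  # best state found; often the global maximum
```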
Best First Search
• Go global to cope with local maxima
• Use a priority queue Qp to contain all nodes waiting for exploration
• Qp represents the wave front (the frontier of the exploration process)
• Always choose the best state from Qp to explore next
• Completeness of Best First Search
– Can always find optimal solutions
Best First Search
• Solving 8-puzzle using Best First Search

Sinitial = ((4 6 7)(1 # 2)(5 3 8)) [H=-16]
├─ move # up:    ((4 # 7)(1 6 2)(5 3 8)) [H=-15]
│  ├─ move # left:  ((# 4 7)(1 6 2)(5 3 8)) [H=-17]
│  └─ move # right: ((4 7 #)(1 6 2)(5 3 8)) [H=-14]
├─ move # left:  ((4 6 7)(# 1 2)(5 3 8)) [H=-17]
├─ move # down:  ((4 6 7)(1 3 2)(5 # 8)) [H=-15]
└─ move # right: ((4 6 7)(1 2 #)(5 3 8)) [H=-15]
Best First Search
• Compared to Hill Climbing
[Figure: landscapes (a) and (b) with states A, B, C, D and the priority queue Qp]
Current state: B => Next state: D, if, after B is explored, D becomes the best in Qp
Best First Search
Best-first-search(Sinitial) {
  H(Sinitial) ← Heuristics-eval(Sinitial);
  Add-pqueue(Qp, Sinitial, H(Sinitial));
  loop {
    Smax ← Pop-pqueue(Qp);
    if Smax = nil, fail;
    else Scurrent ← Smax;
    if ?Goal(Scurrent) = true, exit;
    New-states ← Legal-operators(Scurrent);
    if New-states = ∅, exit;
    for each Si ∈ New-states {
      H(Si) ← Heuristics-eval(Si);
      Add-pqueue(Qp, Si, H(Si));
    }
  }
  return Scurrent;
}
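As a runnable sketch of the pseudocode above, greedy best-first search can solve the slide's 8-puzzle instance with the Manhattan-distance heuristic. The encoding (0 for the blank) and the visited set are implementation choices; `heapq` pops the smallest key, so the distance itself (that is, -H) is used as the priority.

```python
import heapq

GOAL = ((1, 2, 3), (4, 5, 6), (7, 8, 0))  # 0 is the blank '#'

def md(state):
    """Sum of Manhattan distances of all tiles (the priority, i.e. -H)."""
    goal_pos = {t: (r, c) for r, row in enumerate(GOAL) for c, t in enumerate(row)}
    return sum(abs(r - goal_pos[t][0]) + abs(c - goal_pos[t][1])
               for r, row in enumerate(state) for c, t in enumerate(row) if t)

def successors(state):
    """All states reachable by sliding the blank one step."""
    g = [list(row) for row in state]
    r, c = next((r, c) for r in range(3) for c in range(3) if g[r][c] == 0)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            g[r][c], g[nr][nc] = g[nr][nc], g[r][c]
            yield tuple(map(tuple, g))
            g[r][c], g[nr][nc] = g[nr][nc], g[r][c]

def best_first(start):
    qp, seen = [(md(start), start)], {start}   # Qp as a heap
    while qp:
        _, s = heapq.heappop(qp)               # best state in Qp
        if s == GOAL:
            return s
        for n in successors(s):
            if n not in seen:
                seen.add(n)
                heapq.heappush(qp, (md(n), n))
    return None

print(best_first(((4, 6, 7), (1, 0, 2), (5, 3, 8))) == GOAL)  # True
```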
Best First Search
• A* algorithm
– A Best First Search process with a heuristic that insists the estimated cost must never be larger than the real optimal cost
– That is, h(S) ≦ h*(S), given the heuristic evaluation function f(S) = g(S) + h(S), where h(S) and h*(S) represent the estimated and optimal cost, respectively, from the current state to the target state
• Optimality of A*
– Can always find optimal solutions through optimal paths
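A small A* sketch on a hypothetical weighted graph, with an admissible heuristic (here h happens to equal the true remaining cost h*), ordering the queue by f(S) = g(S) + h(S). The graph and costs are made up for illustration.

```python
import heapq

EDGES = {'A': [('B', 1), ('C', 4)],
         'B': [('C', 1), ('G', 5)],
         'C': [('G', 1)],
         'G': []}
H = {'A': 3, 'B': 2, 'C': 1, 'G': 0}  # admissible: h(S) <= h*(S)

def a_star(start, goal):
    qp = [(H[start], 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while qp:
        f, g, node, path = heapq.heappop(qp)
        if node == goal:
            return g, path
        for nxt, w in EDGES[node]:
            ng = g + w
            if ng < best_g.get(nxt, float('inf')):
                best_g[nxt] = ng
                heapq.heappush(qp, (ng + H[nxt], ng, nxt, path + [nxt]))
    return None

print(a_star('A', 'G'))  # (3, ['A', 'B', 'C', 'G']): the optimal path
```

Note that greedy best-first (h only) would be tempted by the direct edge A-C, while A*'s g + h ordering recovers the cheaper detour through B.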
Genetic Algorithms
• Simulate the natural evolution process
– The fittest survive
– Chromosomes vs. solutions
– Genetic operators vs. search
– Fitness value vs. objective function
• Genetic operators
– Selection
– Crossover
– Mutation
Genetic Algorithms
• Selection
– Selection strategy
• Elitism
• Variety
– Elitism
• Selection probabilities proportional to fitness values
• Crossover
– Crossover rate
– Crossover point
• Mutation
– Single- vs. multiple-point mutation
– Mutation rate
Genetic Algorithms
• Example: initial population selection
– Begin with a collection of initial solutions
– Solution example: X = 155, Y = 124, Z = 228
– Transform each solution into a bit string:

(155, 124, 228) → (100110110111110011100100)
                   [X: 10011011][Y: 01111100][Z: 11100100]
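The encoding above can be sketched directly: pack (X, Y, Z) into a 24-bit string, 8 bits per variable.

```python
def encode(x, y, z):
    """Concatenate three 8-bit binary representations into one chromosome."""
    return ''.join(format(v, '08b') for v in (x, y, z))

def decode(bits):
    """Split a 24-bit chromosome back into (X, Y, Z)."""
    return tuple(int(bits[i:i + 8], 2) for i in (0, 8, 16))

s = encode(155, 124, 228)
print(s)          # 100110110111110011100100, as on the slide
print(decode(s))  # (155, 124, 228)
```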
Genetic Algorithms
• Selection
– Select two parents from the population according to a cumulative fitness distribution formed as a roulette wheel

(155, 124, 228) → (100110110111110011100100)
(116,   4, 195) → (011101000000010011000011)
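A sketch of the roulette wheel: the wheel is the cumulative distribution of fitness values, and a random number r in [0, 1) picks an individual with probability proportional to its fitness. The fitness values below are illustrative.

```python
import random

def roulette_select(fitnesses, r):
    """Return the index selected by spin position r on the roulette wheel."""
    total = sum(fitnesses)
    cum = 0.0
    for i, f in enumerate(fitnesses):
        cum += f / total
        if r < cum:
            return i
    return len(fitnesses) - 1  # guard against rounding near r ~ 1.0

fits = [1.0, 2.0, 3.0]              # cumulative wheel: [1/6, 1/2, 1]
print(roulette_select(fits, 0.10))  # 0
print(roulette_select(fits, 0.40))  # 1
print(roulette_select(fits, 0.90))  # 2
print(roulette_select(fits, random.random()))  # a fitness-biased pick
```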
Genetic Algorithms
• Crossover
– Determine whether the two selected individuals qualify for crossover (crossover rate)
– Identify locations for crossover
– Crossovers need not be at gene boundaries
– Exchange the trailing portions of the two bit strings to create two new children

Parents:  (155, 124, 228) → (100110110111110011100100)
          (116,   4, 195) → (011101000000010011000011)
Children: (100110000000110011100111) → (152, 12, 231)
          (011101110111010011000000) → (119, 116, 192)
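Single-point crossover at the bit level can be sketched as below. The crossover point (bit 6) is an illustrative choice, not necessarily the one used in the slide's example.

```python
def crossover(p1, p2, point):
    """Exchange the trailing portions of two bit strings after `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

p1 = '100110110111110011100100'  # (155, 124, 228)
p2 = '011101000000010011000011'  # (116, 4, 195)
c1, c2 = crossover(p1, p2, 6)
print(c1)  # 100110000000010011000011
print(c2)  # 011101110111110011100100
```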
Genetic Algorithms
• Mutation
– Mutation points
• Point mutation: select a random bit in the string and change it
• Complex mutation: mutate a pattern or sequence of bits
– Determine whether the selected bit deserves mutation (mutation rate)

(152, 12, 231) → (100110000000110011100111)
(152, 76, 231) → (100110000100110011100111)
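Point mutation flips one bit of the string; flipping bit 9 (0-indexed) reproduces the slide's example, (152, 12, 231) → (152, 76, 231).

```python
def mutate(bits, i):
    """Flip the bit at index i (0-indexed)."""
    flipped = '1' if bits[i] == '0' else '0'
    return bits[:i] + flipped + bits[i + 1:]

def decode(bits):
    return tuple(int(bits[j:j + 8], 2) for j in (0, 8, 16))

s = '100110000000110011100111'   # (152, 12, 231)
m = mutate(s, 9)
print(m)          # 100110000100110011100111
print(decode(m))  # (152, 76, 231)
```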
Genetic Algorithms
Genetic-algorithm(P, Y, N, L) { /* P: initial population, Y: max generations, N: population size, L: chromosome length */
  Pcurrent ← P;
  gen ← 1;
  fold ← Average(Fitness-value(Pcurrent));
  loop until gen > Y {
    PD ← Selection-prob(Pcurrent);
    RW ← Produce-cum-prob(PD);
    Pnew ← ∅; i ← 1;
    loop until i > N {
      d1 ← Select(Pcurrent, RW); d2 ← Select(Pcurrent, RW);
      <dc1, dc2> ← Crossover(d1, d2, pc, L);
      dm1 ← Mutate(dc1, pm, L); Add(dm1, Pnew);
      dm2 ← Mutate(dc2, pm, L); Add(dm2, Pnew);
      i ← i + 2;
    }
    Pcurrent ← Pnew;
    fnew ← Average(Fitness-value(Pcurrent));
    if |fnew − fold| ≦ ε, exit;
    else {
      fold ← fnew;
      gen ← gen + 1;
    }
  }
  return Pcurrent and the fittest chromosome in Pcurrent;
}
Genetic Algorithms
Select(P, RW) {
  i ← Roulette-wheel(RW, P, Random());
  return i;
}

Crossover(d1, d2, pc, L) {
  if pc > Random() {
    b ← Random(L);
    dc1 ← Substring(d1, 1, b) + Substring(d2, b+1, L);
    dc2 ← Substring(d2, 1, b) + Substring(d1, b+1, L);
  }
  else <dc1, dc2> ← <d1, d2>;
  return <dc1, dc2>;
}

Mutate(d, pm, L) {
  b ← 1;
  loop until b > L {
    if pm > Random(), change the b-th bit of d;
    b ← b + 1;
  }
  return d;
}
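Putting the operators together, here is an end-to-end sketch of the GA pseudocode on the OneMax problem (fitness = number of 1-bits), a hypothetical objective chosen for easy checking. The parameters (pc, pm, N, L, generation count) are illustrative, and a simple fixed generation count stands in for the |fnew - fold| ≦ ε stopping test.

```python
import random

L_BITS, N, GENS, PC, PM = 16, 20, 40, 0.7, 0.01
rng = random.Random(42)

def fitness(d):
    return d.count('1')

def select(pop):
    """Roulette-wheel selection over the population's fitness values."""
    total = sum(fitness(d) for d in pop)
    r, cum = rng.random() * total, 0
    for d in pop:
        cum += fitness(d)
        if r < cum:
            return d
    return pop[-1]

def crossover(d1, d2):
    if PC > rng.random():
        b = rng.randrange(1, L_BITS)
        return d1[:b] + d2[b:], d2[:b] + d1[b:]
    return d1, d2

def mutate(d):
    return ''.join(('1' if c == '0' else '0') if PM > rng.random() else c
                   for c in d)

pop = [''.join(rng.choice('01') for _ in range(L_BITS)) for _ in range(N)]
best0 = max(fitness(d) for d in pop)
for _ in range(GENS):
    new = []
    while len(new) < N:
        c1, c2 = crossover(select(pop), select(pop))
        new += [mutate(c1), mutate(c2)]
    pop = new
best = max(fitness(d) for d in pop)
print(best0, best)  # fitness of the best individual tends to increase
```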
Genetic Algorithms
• Why do genetic algorithms find optimal solutions?
– Schema theorem: schemata with higher fitness values, shorter defining lengths, and fewer defining genes grow exponentially during the evolution process
• Higher fitness value: higher selection probability
• Shorter defining length: lower probability of disruption by crossover
• Fewer defining genes: lower probability of disruption by mutation
Genetic Algorithms
Gene positions: 12345678
Chromosome schema: **0**1**
Chromosomes included: 01010110, 00010110, …
Defining length: 6 − 3 = 3 (the position of the 0 is 3 and that of the 1 is 6)
Defining genes: 2 (the 0 and the 1)
Game Search
• Adversarial search
• MiniMax strategy
– Select maximally evaluated values from my moves
– Select minimally evaluated values from the opponent's moves

MiniMax-Value(n) =
  Utility(n)                                      if n is a terminal node
  max of MiniMax-Value(s) over s ∈ Successors(n)  if n is a MAX node
  min of MiniMax-Value(s) over s ∈ Successors(n)  if n is a MIN node

[Figure: a two-ply game tree (look ahead 2 steps). MAX root A has MIN children B, C, D with terminal values (3, 12, 8), (2, 4, 6), (14, 5, 2); Min(3, 12, 8) = 3, Min(2, 4, 6) = 2, Min(14, 5, 2) = 2; the root takes Max(3, 2, 2) = 3]
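The definition above can be sketched directly on the two-ply example tree, representing an internal node as a list of children and a terminal as its utility value.

```python
def minimax(node, is_max):
    """MiniMax-Value(n): utility at terminals, max/min over successors."""
    if isinstance(node, int):          # terminal node: return Utility(n)
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # MIN children B, C, D of MAX root A
print(minimax(TREE, True))  # 3
```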
Game Search
• MiniMax on Tic-tac-toe
Game Search
• MiniMax on Tic-tac-toe (cont.)
Game Search
• MiniMax Algorithm
1. Generate the game tree down to the terminal nodes.
2. Apply the utility function to the terminal nodes.
3. For a set S of sibling nodes, pass up to the parent:
1) the lowest value in S if the parent is a MIN node
2) the largest value in S if the parent is a MAX node
4. Recursively do the above until the backed-up values reach the initial state.
5. The value of the initial state is the minimum score that Max can guarantee.
Game Search
• Alpha-Beta Pruning
– In general, if m is better than n for Player, the subtree leading to n need never be explored
[Figure: a game tree alternating Player and Opponent levels, with value m available at node A and value n reachable deeper through node B]
Game Search
• α-Cutoff
[Figure: the Player (MAX) root already has a child of value 2, so its value is ≥ 2; once the next Opponent (MIN) child reveals a leaf of value 1, that child's value is ≤ 1 < 2 and its remaining leaves are pruned (leaf values shown: 7, 8, 1)]
Game Search
• β-Cutoff
[Figure: an Opponent (MIN) node already has a child of value 5, so its value is ≤ 5; once a deeper Player (MAX) node reaches a value ≥ 7 > 5, its remaining successors are pruned (leaf values shown: 2, 5, 7)]
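Both cutoffs can be sketched in one alpha-beta routine, run on the same two-ply tree used for minimax above; a counter shows that it evaluates fewer terminal nodes than plain minimax while returning the same value.

```python
evaluated = 0  # counts terminal-node evaluations

def alphabeta(node, is_max, alpha=float('-inf'), beta=float('inf')):
    global evaluated
    if isinstance(node, int):          # terminal node
        evaluated += 1
        return node
    if is_max:
        v = float('-inf')
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, v)
            if v >= beta:              # beta-cutoff
                break
        return v
    v = float('inf')
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta))
        beta = min(beta, v)
        if v <= alpha:                 # alpha-cutoff
            break
    return v

TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value = alphabeta(TREE, True)
print(value)      # 3, the same value as plain minimax
print(evaluated)  # 7 of the 9 terminals: C's leaves 4 and 6 are pruned
```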