State space representations and search strategies - 2
Spring 2007, Juris Vīksna
Search strategies - A*
<S,P,I,G,W> - state space
h*(x) - the minimum path weight from x to a goal state
h(x) - heuristic estimate of h*(x)
g*(x) - the minimum path weight from I to x
g(x) - estimate of g*(x) (i.e. the minimal weight found so far)
f*(x) = g*(x) + h*(x)
f(x) = g(x) + h(x)
Search strategies - A*
[Adapted from J.Pearl]
Search strategies - A*
A*Search(state space = <S,P,I,G,W>, h)
  Open ← {<h(I), 0, I>}
  Closed ← ∅
  while Open ≠ ∅ do
    <fx, gx, x> ← ExtractMin(Open)   [minimum for fx]
    if Goal(x) then return x
    Insert(<fx, gx, x>, Closed)
    for y ∈ Child(x) do
      gy = gx + W(x, y)
      fy = gy + h(y)
      if there is no <f, g, y> ∈ Closed with f ≤ fy then
        Insert(<fy, gy, y>, Open)   [replace existing <f, g, y>, if present]
  return fail
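The pseudocode above can be sketched in Python. This is a minimal illustration, not the slides' reference implementation: the 4-node state space and heuristic values are hypothetical, Open is a heap of <f, g, x> triples, and the explicit Closed check is replaced by lazily skipping stale heap entries.

```python
# A minimal A* sketch. The example graph and heuristic are illustrative
# assumptions, not taken from the slides.
import heapq

def a_star(children, weight, start, is_goal, h):
    open_heap = [(h(start), 0, start)]         # Open <- {<h(I), 0, I>}
    best_g = {}                                # Closed: best g per expanded node
    while open_heap:
        f, g, x = heapq.heappop(open_heap)     # ExtractMin (minimum for f)
        if x in best_g and best_g[x] <= g:
            continue                           # stale entry, already expanded better
        if is_goal(x):
            return g, x
        best_g[x] = g
        for y in children(x):
            gy = g + weight(x, y)
            heapq.heappush(open_heap, (gy + h(y), gy, y))
    return None                                # fail

edges = {'I': {'a': 1, 'b': 4}, 'a': {'G': 5}, 'b': {'G': 1}, 'G': {}}
hvals = {'I': 3, 'a': 5, 'b': 1, 'G': 0}       # admissible: h(x) <= h*(x)
cost, goal = a_star(lambda x: edges[x], lambda x, y: edges[x][y],
                    'I', lambda x: x == 'G', lambda x: hvals[x])
print(cost, goal)                              # → 5 G
```

Here the cheaper path I-b-G (weight 5) is found even though b initially looks worse by g alone.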
Complete search
Definition
An algorithm is said to be complete if it terminates with a solution when one exists.
Admissible search
Definition
An algorithm is admissible if it is guaranteed to return an optimal solution (with minimal possible path weight from the start state to a goal state) whenever a solution exists.
Dominant search
Definition
An algorithm A is said to dominate algorithm B if every node expanded by A is also expanded by B. Similarly, A strictly dominates B if A dominates B and B does not dominate A. We will also use the phrase “more efficient than” interchangeably with “dominates”.
Optimal search
Definition
An algorithm is said to be optimal over a class of algorithms if it dominates all members of that class.
Locally finite state spaces
Definition
A state space <S,P,I,G,W> is locally finite, if
• for every x ∈ S, there is only a finite number of y ∈ S such that (x,y) ∈ P
• there exists δ > 0 such that for all (x,y) ∈ P we have W(x,y) ≥ δ.
Completeness of A*
Theorem
A* algorithm is complete on locally finite state spaces.
Admissibility of A*
Definition
A heuristic function h is said to be admissible if
0 ≤ h(n) ≤ h*(n) for all n ∈ S.
Admissibility of A*
Theorem
An A* algorithm which uses an admissible heuristic function is admissible on locally finite state spaces.
Admissibility of A*
Lemma
If A* uses an admissible heuristic function h, then at any time before A* terminates there exists a node n' in Open such that f(n') ≤ f*(I).
Admissibility of A*
Theorem
An A* algorithm which uses an admissible heuristic function is admissible on locally finite state spaces.
Informedness of heuristic functions
Definition
A heuristic function h2 is said to be more informed than h1 if both h1 and h2 are admissible and
h2(n) > h1(n) for every non-goal node n ∈ S.
Similarly, an A* algorithm using h2 is said to be more informed than that using h1.
Dominance of A*
Theorem
If A2* is more informed than A1*, then A2* dominates A1*.
Dominance of A*
Lemma
Any node expanded by A* cannot have an f value exceeding f*(I), i.e.
f(n) ≤ f*(I) for all nodes expanded.
Dominance of A*
Lemma
Every node n on Open for which f(n) < f*(I) will eventually be expanded by A*.
C-bounded paths
Definition
We say that a path P is C-bounded if every node along this path satisfies gP(n) + h(n) ≤ C. Similarly, if a strict inequality holds for every n along P, we say that P is strictly C-bounded. When it becomes necessary to identify which heuristic was used, we will use the notation C(h)-bounded.
C-bounded paths
Theorem
A sufficient condition for A* to expand a node n is that there exists some strictly f*(I)-bounded path P from I to n.
C-bounded paths
Theorem
A necessary condition for A* to expand a node n is that there exists an f*(I)-bounded path P from I to n.
Dominance of A*
Theorem
If A2* is more informed than A1*, then A2* dominates A1*.
Consistent heuristic functions
Definition
A heuristic function h is said to be consistent, if
h(n) ≤ k(n,n') + h(n') for all nodes n and n'
(where k(n,n') denotes the weight of the cheapest path from n to n').
Monotone heuristic functions
Definition
A heuristic function h is said to be monotone, if
h(n) ≤ W(n,n') + h(n') for all (n,n') ∈ P.
Monotonicity and consistency
Theorem
Monotonicity and consistency are equivalent properties.
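On a finite state space the monotonicity condition can be checked exhaustively, arc by arc. The graph and both heuristic tables below are illustrative assumptions of mine, not examples from the slides.

```python
# Exhaustive check of h(n) <= W(n,n') + h(n') over every arc of a finite graph.
def is_monotone(edges, h):
    return all(h[n] <= w + h[m]
               for n, succ in edges.items()
               for m, w in succ.items())

edges = {'I': {'a': 1, 'b': 4}, 'a': {'G': 5}, 'b': {'G': 1}, 'G': {}}
h_good = {'I': 2, 'a': 4, 'b': 1, 'G': 0}
h_bad = {'I': 3, 'a': 0, 'b': 1, 'G': 0}   # violated on arc (I,a): 3 > 1 + 0
print(is_monotone(edges, h_good))          # → True
print(is_monotone(edges, h_bad))           # → False
```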
Monotonicity and admissibility
Theorem
Every monotone heuristic is also admissible.
A* with monotone heuristic
Theorem
An A* algorithm with monotone heuristic finds optimal paths to all expanded nodes, i.e.
g(n) = g*(n) for all <fn, gn, n> ∈ Closed.
Some terminology
A* - we just discussed that
A - basically the same as A*, but we check whether we have reached a goal state already at the time when nodes are generated
Z* - a generalization of A*: instead of f(x) = g(x) + h(x) it uses a more general function f(x') = F(E(x), f(x), h(x'))
Z - related to Z* similarly as A to A*
Implementation issues
A*Search(state space = <S,P,I,G,W>, h)
  Open ← {<h(I), 0, I>}
  Closed ← ∅
  while Open ≠ ∅ do
    <fx, gx, x> ← ExtractMin(Open)   [minimum for fx]
    if Goal(x) then return x
    Insert(<fx, gx, x>, Closed)
    for y ∈ Child(x) do
      gy = gx + W(x, y)
      fy = gy + h(y)
      if there is no <f, g, y> ∈ Closed with f ≤ fy then
        Insert(<fy, gy, y>, Open)   [replace existing <f, g, y>, if present]
  return fail
Implementation issues - Heaps
• They are binary trees with all levels complete, except the lowest one, which may be incomplete on the right side
• They satisfy the so-called heap property - for each subtree of the heap, the key at the root of the subtree must be no larger than the keys at its (left and right) children (a min-heap, as required by ExtractMin)
Implementation issues - Heaps
[Figure: example of a min-heap]
Implementation issues - Heaps
Insert
[Figure: inserting a key into a min-heap]
T(n) = Θ(h) = Θ(log n)
Implementation issues - Heaps
Delete
[Figure: deleting a key from a min-heap]
T(n) = Θ(h) = Θ(log n)
Implementation issues - Heaps
ExtractMin
[Figure: extracting the minimum key from a min-heap]
T(n) = Θ(h) = Θ(log n)
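The three heap operations map directly onto Python's heapq module, which maintains a binary min-heap inside a list; each operation costs O(log n). The keys are just an illustrative sample.

```python
# Insert / ExtractMin on a binary min-heap via heapq.
import heapq

heap = []
for key in [2, 45, 12, 3, 1, 13]:
    heapq.heappush(heap, key)        # Insert: sift the new key up

smallest = heapq.heappop(heap)       # ExtractMin: remove root, sift down
next_smallest = heapq.heappop(heap)
print(smallest, next_smallest)       # → 1 2
```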
Implementation issues - BST
T is a binary search tree, if
• it is a binary tree (with a key associated with each node)
• for each node x in T, the keys at all nodes of the left subtree of x are not larger than the key at node x, and the keys at all nodes of the right subtree of x are not smaller than the key at node x
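A minimal sketch of a BST following this definition; the key sequence is an illustrative assumption, not the tree drawn on the slides.

```python
# A minimal binary search tree: smaller keys go left, larger or equal go right.
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def contains(root, key):
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False

root = None
for k in [7, 3, 12, 5, 11]:
    root = insert(root, k)
print(contains(root, 11), contains(root, 4))   # → True False
```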
Implementation issues - BST
[Figure: example of a binary search tree]
Implementation issues - BST
Insert
[Figure: inserting a key into a binary search tree]
Implementation issues - BST
Delete
[Figure: deleting a key from a binary search tree]
Implementation issues - BST
Delete
[Figure: deleting a key from a binary search tree (continued)]
Implementation issues - BST
Delete
[Figure: deleting a key from a binary search tree (continued)]
Implementation issues - AVL trees
T is an AVL tree, if
• it is a binary search tree
• for each node x in T we have
Height(LC(x)) – Height(RC(x)) ∈ {–1, 0, 1}
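The AVL balance condition can be verified recursively. The sketch below checks only the balance condition (not the BST ordering), on trees given as nested (key, left, right) tuples; both example trees are illustrative assumptions.

```python
# Check Height(LC(x)) - Height(RC(x)) in {-1, 0, 1} at every node.
# None is an empty subtree, with height -1 by convention.
def height(t):
    return -1 if t is None else 1 + max(height(t[1]), height(t[2]))

def is_avl(t):
    if t is None:
        return True
    return (height(t[1]) - height(t[2]) in (-1, 0, 1)
            and is_avl(t[1]) and is_avl(t[2]))

balanced = (7, (3, None, (5, None, None)), (12, None, None))
unbalanced = (7, (3, (2, (1, None, None), None), None), None)
print(is_avl(balanced), is_avl(unbalanced))   # → True False
```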
Implementation issues - skip lists
[Figure: example of a “perfect” skip list]
How to choose a heuristic?
Original problem P: defined by a set of constraints; P is complex.
Relaxed problem P': obtained by removing one or more constraints; P' becomes simpler.
Use the cost of a best solution path from n in P' as h(n) for P.
Admissibility: h ≤ h*, since the cost of the best solution in P ≥ the cost of the best solution in P'.
[Figure: the solution space of P is contained in the solution space of P']
How to choose a heuristic - 8-puzzle
Example: 8-puzzle
– Constraints: to move a tile from cell A to cell B
  cond1: there is a tile on A
  cond2: cell B is empty
  cond3: A and B are adjacent (horizontally or vertically)
– Removing cond2: h2 (sum of Manhattan distances of all misplaced tiles)
– Removing cond2 and cond3: h1 (number of misplaced tiles)
– Removing cond3: h3, a new heuristic function
How to choose a heuristic - 8-puzzle
h3:
repeat
  if the current empty cell A is to be occupied by tile x in the goal, move x to A;
  otherwise, move into A any arbitrary misplaced tile
until the goal is reached
h2 ≥ h3 ≥ h1
h1(start) = 7, h2(start) = 18, h3(start) = 7
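The relaxed-problem heuristics h1 and h2 are easy to sketch. A state is a tuple of 9 entries read row by row with 0 for the empty cell; the start state below is a hypothetical example of mine, not necessarily the configuration drawn on the slide.

```python
# h1 and h2 for the 8-puzzle on 3x3 states stored as flat 9-tuples.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def h1(state, goal=GOAL):
    """Number of misplaced tiles (the empty cell does not count)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2(state, goal=GOAL):
    """Sum of Manhattan distances of all misplaced tiles."""
    total = 0
    for i, tile in enumerate(state):
        if tile != 0:
            j = goal.index(tile)
            total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total

start = (8, 6, 7, 2, 5, 4, 3, 0, 1)
print(h1(start), h2(start))   # → 7 21
```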
How to choose a heuristic - TSP
• Example: TSP. A legal tour is a (Hamiltonian) circuit.
– It is a connected second-degree graph (each node has exactly two adjacent edges).
Removing the connectivity constraint leads to h1: find the cheapest second-degree graph of the given graph (with O(n^3) complexity).
[Figure: the given complete graph, a legal tour, and other second-degree graphs]
How to choose a heuristic - TSP
– A legal tour is a spanning tree (when an edge is removed) with the constraint that each node has at most 2 adjacent edges.
Removing the constraint leads to h2: find the cheapest minimum spanning tree of the given graph (with O(n^2/log n) complexity).
[Figure: the given graph, a legal tour, and other MSTs]
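The MST relaxation can be sketched with Prim's algorithm on a distance matrix: the MST weight is a lower bound on the cheapest tour, since every tour minus one edge is a spanning tree. The 4-city distance matrix is an illustrative assumption.

```python
# Weight of a minimum spanning tree (Prim's algorithm, O(n^2) on a matrix),
# used as an admissible lower bound for TSP.
def mst_weight(dist, nodes):
    nodes = list(nodes)
    in_tree = {nodes[0]}
    total = 0
    while len(in_tree) < len(nodes):
        # cheapest edge from the tree to a node outside it
        w, v = min((dist[u][x], x) for u in in_tree
                   for x in nodes if x not in in_tree)
        in_tree.add(v)
        total += w
    return total

dist = [[0, 1, 4, 3],
        [1, 0, 2, 5],
        [4, 2, 0, 6],
        [3, 5, 6, 0]]
print(mst_weight(dist, range(4)))   # → 6
```

Any tour on these four cities costs at least 6, so the bound never overestimates.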
How complicated a heuristic should we choose?
[Adapted from R.Shinghal]
Relaxing optimality requirements
• Is f = g + h the best choice if we want to minimize search effort, not solution cost?
• Even if solution cost is important, an admissible f can lead to a non-terminating A*. Can speed be gained by decreasing solution quality?
• It may be hard to find a good admissible heuristic. What happens if we do not require admissibility?
Relaxing optimality requirements
Weighted evaluation function
fw(n) = (1–w)g(n) + w h(n)
w = 0 - uniform cost
w = 1/2 - A*
w = 1 - BestFirst
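A tiny illustration of how fw interpolates between the three strategies; the g and h values are arbitrary assumptions.

```python
# Weighted evaluation function fw(n) = (1 - w) * g(n) + w * h(n).
def fw(g, h, w):
    return (1 - w) * g + w * h

g, h = 4.0, 2.0
print(fw(g, h, 0.0))   # uniform cost: orders nodes by g        → 4.0
print(fw(g, h, 0.5))   # A*: (g + h)/2, same ordering as g + h  → 3.0
print(fw(g, h, 1.0))   # BestFirst: orders nodes by h           → 2.0
```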
Relaxing optimality requirements
Bounded decrease in solution quality?
Relaxing optimality requirements
Dynamic Weighting
f(n) = g(n) + h(n) + ε·(1 – d(n)/N)·h(n)
d(n) - depth of node n
N - anticipated depth of a goal node
Relaxing optimality requirements
Dynamic Weighting
f(n) = g(n) + h(n) + ε·(1 – d(n)/N)·h(n)
Theorem
If h is admissible, then the algorithm is ε-admissible, i.e. it finds a path with a cost of at most (1+ε)·C*.
Relaxing optimality requirements
Aε* algorithm
Uses 2 lists - Open and Focal
Focal is a sublist of Open containing nodes that do not deviate from the lowest-f node by a factor greater than 1+ε.
Aε* selects the node from Focal with the lowest hF value.
hF(n) - a second heuristic function estimating the computational effort to complete the search starting from n
Relaxing optimality requirements
Aε* algorithm
Theorem
If h is admissible, then the Aε* algorithm is ε-admissible, i.e. it finds a path with a cost of at most (1+ε)·C*.
Note: hF does not need to be admissible.
Relaxing optimality requirements
[Adapted from J.Pearl]
Relaxing optimality requirements
Theorem
If h(n) – h*(n) ≤ ε·h*(n) for all n, then the A* algorithm is ε-admissible, i.e. it finds a path with a cost of at most (1+ε)·C*.
Relaxing optimality requirements
Example
Consider a shortest-path problem (SP) with arc costs uniformly drawn from [0,1].
N - number of arcs between n and a goal
h*(n) tends to be close to N/2 for large N
The only admissible heuristic is h(n) = 0
Relaxing optimality requirements
[Adapted from J.Pearl]
Relaxing optimality requirements
[Adapted from J.Pearl]
Relaxing optimality requirements
R* algorithm selects node from Open with the lowest C(n) value
Three common risk measures:
R1 - the worst case risk
R2 - the probability of suboptimal termination
R3 - the expected risk
Relaxing optimality requirements
[Adapted from J.Pearl]
Relaxing optimality requirements
[Adapted from J.Pearl]
Relaxing optimality requirements
R* algorithm selects node from Open with the lowest C(n) value
Theorem
For risk measures R1, R2, R3 the algorithm R* is ε-risk admissible, i.e. it terminates with a solution cost C such that R(C) ≤ ε for all nodes left in Open.
Some performance examples
[Adapted from J.Pearl]
Some performance examples
[Adapted from J.Pearl]
Some performance examples
[Adapted from J.Pearl]
Performance of search strategy
T - number of nodes in the search graph
D - number of nodes in the solution path
We define penetrance P as follows:
P = D/T
We have 0 < P ≤ 1
Performance of search strategy
T - number of nodes in the search graph
D - number of nodes in the solution path
We define branching factor B as follows:
T = B + B^2 + ... + B^D
T = B·(B^D – 1)/(B – 1)
We have B ≥ 1
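Given measured T and D, the effective branching factor B can be recovered numerically, since T = B + B^2 + ... + B^D is increasing in B. The values T = 14, D = 3 are an illustrative assumption (B = 2, since 2 + 4 + 8 = 14); the sketch assumes D < T so that B > 1.

```python
# Solve T = B + B^2 + ... + B^D for B by bisection on [1, T].
def branching_factor(T, D, eps=1e-9):
    f = lambda B: sum(B ** i for i in range(1, D + 1)) - T
    lo, hi = 1.0, float(T)
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return lo

print(round(branching_factor(14, 3), 6))   # → 2.0
```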
Performance of search strategy
[Adapted from R.Shinghal]
Performance of search strategy
[Adapted from R.Shinghal]