
Page 1:

Informed search algorithms

Chapter 4

Slides derived in part from www.cs.berkeley.edu/~russell/slides/chapter04a.pdf, converted to PowerPoint by Min-Yen Kan, National University of Singapore, and from www.cs.umbc.edu/671/fall03/slides/c5-6_inf_search.ppt, Marie DesJardins, University of Maryland Baltimore County.

Page 2:

Review: Tree search

• A search strategy is defined by picking the order of node expansion

Page 3:

Heuristic Search

• Basic tree search is generic; the choice of which node to expand depends on the shape of the tree and the strategy chosen for node expansion.

• Often we have some domain knowledge that can help make a better decision.

• For the Romania problem, for instance, eyeballing the map leads us to examine certain cities first because they "look closer" to where we are going.

• If that domain knowledge can be captured in a heuristic, search performance can be improved by using that heuristic.

• This gives us an informed search strategy.

Page 4:

So What's a Heuristic?

Webster's Revised Unabridged Dictionary (1913):

Heuristic \Heu*ris"tic\, a. [Gr. heuriskein, to discover.] Serving to discover or find out.

The Free On-line Dictionary of Computing (15Feb98)

heuristic 1. <programming> A rule of thumb, simplification or educated guess that reduces or limits the search for solutions in domains that are difficult and poorly understood. Unlike algorithms, heuristics do not guarantee feasible solutions and are often used with no theoretical guarantee. 2. <algorithm> approximation algorithm.

From WordNet (r) 1.6:

heuristic adj 1: (computer science) relating to or using a heuristic rule 2: of or relating to a general formulation that serves to guide investigation [ant: algorithmic] n: a commonsense rule (or set of rules) intended to increase the probability of solving some problem [syn: heuristic rule, heuristic program]

Page 5:

Heuristics

• All domain knowledge used in the search is encoded in the heuristic function h.
• Examples:
  – Missionaries and Cannibals: number of people on the starting river bank
  – 8-puzzle: number of tiles out of place
  – 8-puzzle: sum of distances from goal, for each tile
• In general:
  – h(n) >= 0 for all nodes n
  – h(n) = 0 implies that n is a goal node
  – h(n) = infinity implies that n is a dead end from which a goal cannot be reached

Page 6:

Best-first search

• Order nodes on the nodes list by increasing value of an evaluation function, f(n), that incorporates domain-specific information in some way.

• This is a generic way of referring to the class of informed methods.

Page 7:

Best-first search

• Idea: use an evaluation function f(n) for each node
  – an estimate of "desirability"
  – expand the most desirable unexpanded node
• Implementation: order the nodes in the fringe in decreasing order of desirability (equivalently, increasing f; see the sketch below)
• Special cases:
  – greedy best-first search
  – A* search
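To make this concrete, here is a minimal Python sketch of best-first tree search built on a priority queue. The names (best_first_search, successors, is_goal) are illustrative, not from the slides; the priority function f decides which node comes off the frontier first.

```python
import heapq
import itertools

def best_first_search(start, successors, is_goal, f):
    """Generic best-first tree search.

    successors(state) yields (step_cost, next_state) pairs;
    f(g, state) scores a node given its path cost g; the node
    with the lowest f is expanded first.
    """
    tie = itertools.count()  # tie-breaker so the heap never compares states
    frontier = [(f(0, start), next(tie), 0, start, [start])]
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, g
        for step_cost, nxt in successors(state):
            g2 = g + step_cost
            heapq.heappush(frontier, (f(g2, nxt), next(tie), g2, nxt, path + [nxt]))
    return None  # frontier exhausted without reaching a goal
```

Plugging in different priority functions gives the special cases above: f(g, n) = h(n) yields greedy best-first search, and f(g, n) = g + h(n) yields A*.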

Page 8:

Romania with step costs in km

[Map figure not preserved.]

Page 9:

Greedy best-first search

• Evaluation function f(n) = h(n) (heuristic) = estimate of cost from n to goal

• e.g., hSLD(n) = straight-line distance from n to Bucharest

• Greedy best-first search expands the node that appears to be closest to goal
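As a hedged illustration using the best_first_search sketch from the Page 7 section, here is a fragment of the Romania map with the straight-line distances; the roads and h_sld dictionaries are reconstructed from the standard textbook figure, not from these slides.

```python
# A fragment of the Romania map (step costs in km) and straight-line
# distances to Bucharest, per the standard textbook figure.
roads = {
    'Arad':           [(140, 'Sibiu'), (118, 'Timisoara'), (75, 'Zerind')],
    'Sibiu':          [(99, 'Fagaras'), (80, 'Rimnicu Vilcea')],
    'Fagaras':        [(211, 'Bucharest')],
    'Rimnicu Vilcea': [(97, 'Pitesti')],
    'Pitesti':        [(101, 'Bucharest')],
    'Timisoara': [], 'Zerind': [], 'Bucharest': [],
}
h_sld = {'Arad': 366, 'Sibiu': 253, 'Timisoara': 329, 'Zerind': 374,
         'Fagaras': 176, 'Rimnicu Vilcea': 193, 'Pitesti': 100, 'Bucharest': 0}

# Greedy best-first: f(n) = h(n), ignoring the cost already paid.
path, cost = best_first_search(
    'Arad', lambda s: roads[s], lambda s: s == 'Bucharest',
    f=lambda g, s: h_sld[s])
# path -> Arad, Sibiu, Fagaras, Bucharest (450 km; not the 418 km optimum)
```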

Pages 10-13:

Greedy best-first search example

[Four figure slides showing successive steps of the greedy best-first search example; images not preserved.]

Page 14:

Properties of greedy best-first search

• Complete? No – can get stuck in loops, e.g., Iasi → Neamt → Iasi → Neamt → …

• Time? O(b^m), but a good heuristic can give dramatic improvement

• Space? O(b^m) – keeps all nodes in memory

• Optimal? No

• Remember: time and space complexity are measured in terms of
  – b: maximum branching factor of the search tree
  – d: depth of the least-cost solution
  – m: maximum depth of the state space (may be infinite)

Page 15:

A* search

• Idea: avoid expanding paths that are already expensive

• Evaluation function f(n) = g(n) + h(n)

• g(n) = cost so far to reach n

• h(n) = estimated cost from n to goal

• f(n) = estimated total cost of path through n to goal
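Using the same best_first_search sketch and Romania fragment introduced earlier (both assumptions, not the slides' own code), A* differs from greedy only in its priority function:

```python
# A*: f(n) = g(n) + h(n); with the admissible h_sld it finds the optimum.
path, cost = best_first_search(
    'Arad', lambda s: roads[s], lambda s: s == 'Bucharest',
    f=lambda g, s: g + h_sld[s])
# path -> Arad, Sibiu, Rimnicu Vilcea, Pitesti, Bucharest (418 km)
```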

Pages 16-21:

A* search example

[Six figure slides showing successive steps of the A* search example; images not preserved.]

Page 22:

Admissible heuristics

• A heuristic h(n) is admissible if for every node n, h(n) <= h*(n), where h*(n) is the true cost to reach the goal state from n.

• An admissible heuristic never overestimates the cost to reach the goal, i.e., it is optimistic.

• Example: hSLD(n) (never overestimates the actual road distance)

Page 23:

Admissible heuristics

E.g., for the 8-puzzle:
• h1(n) = number of misplaced tiles
• h2(n) = total Manhattan distance (i.e., number of squares each tile is from its desired location)

For the start state S in the figure (image not preserved):
• h1(S) = ?
• h2(S) = ?

Page 24:

Admissible heuristics

E.g., for the 8-puzzle:
• h1(n) = number of misplaced tiles
• h2(n) = total Manhattan distance (i.e., number of squares each tile is from its desired location)

• h1(S) = ? 8
• h2(S) = ? 3+1+2+2+2+3+3+2 = 18
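Both heuristics are short to compute. Here is a sketch that assumes a state is a 9-tuple read row by row, with 0 for the blank, and an assumed goal layout of (1, ..., 8, 0); the slide's actual state S and goal are in the lost figure, so these functions reproduce the 8 and 18 above only if the same layouts are used.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # assumed goal layout; 0 is the blank

def h1(state):
    """Number of misplaced tiles (the blank is not counted)."""
    return sum(1 for i, tile in enumerate(state)
               if tile != 0 and tile != GOAL[i])

def h2(state):
    """Total Manhattan distance of each tile from its goal square."""
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        g = GOAL.index(tile)
        total += abs(i // 3 - g // 3) + abs(i % 3 - g % 3)
    return total
```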

Page 25:

Properties of A*

• If h(n) is admissible:

• Complete? Yes (unless there are infinitely many nodes with f ≤ f(G))

• Time? Exponential in [relative error in h × length of solution]

• Space? Keeps all nodes in memory

• Optimal? Yes – A* expands nodes in order of increasing f value, so it cannot expand the f_{i+1} contour until the f_i contour is finished.

Page 26:

Some observations on A*

• Perfect heuristic: If h(n) = h*(n) for all n, then only the nodes on the optimal solution path will be expanded, so no extra work will be performed.
• Null heuristic: If h(n) = 0 for all n, then h is admissible and A* acts like uniform-cost search.
• Better heuristic: If h1(n) < h2(n) <= h*(n) for all non-goal nodes, then h2 is a better heuristic than h1.
  – If A1* uses h1 and A2* uses h2, then every node expanded by A2* is also expanded by A1*.
  – In other words, A1* expands at least as many nodes as A2*.
  – We say that A2* is better informed than A1*, or that A2* dominates A1*.
• The closer h is to h*, the fewer extra nodes will be expanded.

Page 27:

What’s a good heuristic?

• If h1(n) < h2(n) <= h*(n) for all n, then both are admissible and h2 is better than (dominates) h1.

• Relaxing the problem: remove constraints to create a (much) easier problem; use the solution cost for this problem as the heuristic function

• Combining heuristics: take the max of several admissible heuristics: the result is still admissible, and it's at least as good as each component (see the snippet after this list)

• Identify good features, then use a learning algorithm to find a heuristic function: may lose admissibility
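The "combining heuristics" point is a one-liner in code, shown here reusing the hypothetical h1 and h2 from the earlier 8-puzzle sketch:

```python
def h_max(state):
    # The pointwise max of admissible heuristics is still admissible
    # (it still never overestimates) and dominates each component.
    return max(h1(state), h2(state))
```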

Page 28:

Relaxed problems

• A problem with fewer restrictions on the actions is called a relaxed problem

• The cost of an optimal solution to a relaxed problem is an admissible heuristic for the original problem

• If the rules of the 8-puzzle are relaxed so that a tile can move anywhere, then h1(n) gives the shortest solution

• If the rules are relaxed so that a tile can move to any adjacent square, then h2(n) gives the shortest solution

Page 29:

Some Examples of Heuristics?

• 8-puzzle?

• Mapquest driving directions?

• Minesweeper?

• Crossword puzzle?

• Making a medical diagnosis?

• ??

Page 30:

Iterative improvement search

• Another approach to search involves starting with an initial guess at a solution and gradually improving it until a solution is reached.

• Some examples:

– Hill Climbing

– Simulated Annealing

– Constraint satisfaction

Page 31:

Example: n-queens

• Put n queens on an n × n board with no two queens on the same row, column, or diagonal

Page 32:

Hill-climbing search

• If there exists a successor s for the current state n such that
  – h(s) < h(n)
  – h(s) <= h(t) for all successors t of n,
  then move from n to s; otherwise, halt at n.
• Looks one step ahead to determine if any successor is better than the current state; if there is, move to the best successor (see the sketch below).
• Similar to greedy search in that it uses h, but it does not allow backtracking or jumping to an alternative path, since it doesn't "remember" where it has been.
• Not complete, since the search will terminate at "local minima," "plateaus," and "ridges."
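A minimal sketch of the rule just described, assuming h is to be minimized and successors(state) enumerates neighbor states (names are illustrative):

```python
def hill_climbing(state, successors, h):
    """Steepest-descent hill climbing: move to the best successor only
    if it strictly improves h; otherwise halt at the current state."""
    while True:
        neighbors = list(successors(state))
        if not neighbors:
            return state
        best = min(neighbors, key=h)
        if h(best) >= h(state):
            return state  # goal, local minimum, plateau, or ridge
        state = best
```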

Page 33:

Hill-climbing search

• "Like climbing Everest in thick fog with amnesia"

Page 34:

Hill climbing example

[Figure not preserved: a hill-climbing trace on the 8-puzzle from a start state to the goal, scoring each state with f(n) = -(number of tiles out of place); the values shown on successive states and their siblings range from -5 to 0 (the goal).]

Page 35:

Drawbacks of hill climbing

• Problems:
  – Local maxima: peaks that aren't the highest point in the space
  – Plateaus: the space has a broad flat region that gives the search algorithm no direction (random walk)
  – Ridges: flat like a plateau, but with dropoffs to the sides; steps to the north, east, south, and west may go down, but a step to the northwest may go up
• Remedy: random restart
• Some problem spaces are great for hill climbing and others are terrible.

Page 36:

Hill-climbing search

• Problem: depending on initial state, can get stuck in local maxima

Page 37:

Example of a local maximum

[Figure not preserved: an 8-puzzle state (1 2 5 / 7 _ 4 / 8 6 3) scoring -3 whose successors all score -4, so hill climbing halts there even though the goal (score 0) is only a few moves away.]

Page 38:

Simulated annealing

• Simulated annealing (SA) exploits an analogy between the way in which a metal cools and freezes into a minimum-energy crystalline structure (the annealing process) and the search for a minimum [or maximum] in a more general system.
• SA can avoid becoming trapped at local minima.
• SA uses a random search that accepts changes that increase the objective function f, as well as some that decrease it.
• SA uses a control parameter T, which by analogy with the original application is known as the system "temperature."
• T starts out high and gradually decreases toward 0.

Page 39:

Simulated annealing search

• Idea: escape local maxima by allowing some "bad" moves, but gradually decrease their frequency (see the sketch below).
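A minimal sketch of that idea, assuming we are maximizing value and that schedule(t) returns the temperature at step t (all names illustrative): a downhill move of size delta < 0 is accepted with probability e^(delta/T), which shrinks as T cools.

```python
import itertools
import math
import random

def simulated_annealing(state, successors, value, schedule):
    """Accept every uphill move; accept a downhill move with
    probability exp(delta / T), so "bad" moves get rarer as T falls."""
    for t in itertools.count():
        T = schedule(t)
        if T <= 0:
            return state
        nxt = random.choice(list(successors(state)))  # assumes >= 1 successor
        delta = value(nxt) - value(state)
        if delta > 0 or random.random() < math.exp(delta / T):
            state = nxt

# e.g., a simple linear cooling schedule:
# simulated_annealing(s0, successors, value, schedule=lambda t: 1.0 - t / 10_000)
```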

Page 40:

Properties of simulated annealing search

• One can prove: If T decreases slowly enough, then simulated annealing search will find a global optimum with probability approaching 1

• Widely used in VLSI layout, airline scheduling, etc.

Page 41:

Genetic algorithms

• A successor state is generated by combining two parent states

• Start with k randomly generated states (population)

• A state is represented as a string over a finite alphabet (often a string of 0s and 1s)

• Evaluation function (fitness function): higher values for better states.

• Produce the next generation of states by selection, crossover, and mutation (see the sketch below)
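A sketch of the generation loop just outlined, assuming individuals are equal-length strings or tuples and fitness is nonnegative (helper names are illustrative):

```python
import random

def genetic_algorithm(population, fitness, mutate, generations=100):
    """Fitness-proportional selection, one-point crossover,
    and occasional mutation, as outlined above."""
    for _ in range(generations):
        weights = [fitness(ind) for ind in population]
        next_gen = []
        for _ in range(len(population)):
            mom, dad = random.choices(population, weights=weights, k=2)
            cut = random.randrange(1, len(mom))   # one-point crossover
            child = mom[:cut] + dad[cut:]
            if random.random() < 0.1:             # small mutation rate
                child = mutate(child)
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)
```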

Page 42:

Genetic algorithms

• Fitness function: number of non-attacking pairs of queens (min = 0, max = 8 × 7/2 = 28)

• Selection probability is proportional to fitness; e.g., for four boards with fitnesses 24, 23, 20, and 11: 24/(24+23+20+11) ≈ 31%, 23/(24+23+20+11) ≈ 29%, etc. (a fitness sketch follows below)
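The fitness function named above is straightforward to write; a sketch assuming a board is a sequence where board[c] gives the row of the queen in column c:

```python
def nonattacking_pairs(board):
    """Number of queen pairs that do NOT attack each other.
    For n = 8 the maximum is 8 * 7 / 2 = 28, as on the slide."""
    n = len(board)
    attacking = sum(1 for i in range(n) for j in range(i + 1, n)
                    if board[i] == board[j]                 # same row
                    or abs(board[i] - board[j]) == j - i)   # same diagonal
    return n * (n - 1) // 2 - attacking
```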

Page 43:

Genetic algorithms

[Figure slide; image not preserved.]

Page 44:

Summary: Informed search

• Best-first search is general search where the minimum-cost nodes (according to some measure) are expanded first.
• Greedy search uses the minimal estimated cost h(n) to the goal state as its measure. This reduces search time, but the algorithm is neither complete nor optimal.
• A* search combines uniform-cost search and greedy search: f(n) = g(n) + h(n). A* handles state repetitions, and h(n) never overestimates.
  – A* is complete, optimal, and optimally efficient, but its space complexity is still bad.
  – Its time complexity depends on the quality of the heuristic function.
• Hill-climbing algorithms keep only a single state in memory, but can get stuck on local optima.
• Simulated annealing escapes local optima, and is complete and optimal given a "long enough" cooling schedule.