popular search algorithms

Prof. Mrs. M. P. Atre

PVGCOET, SPPU

1. Problem solving

2. Problem solving agents

3. Example problems

4. Searching for solutions

5. Uninformed search strategies

6. Avoiding repeated states

7. Searching with partial information

Searching is problem solving

There are some single-player games such

as Tile games, Sudoku, Crossword, etc.

The search algorithms help you to search

for a particular position in such games.

Games such as the 3×3 eight-tile, 4×4 fifteen-tile, and 5×5 twenty-four-tile puzzles are single-agent pathfinding challenges.

They consist of a matrix of tiles with one blank space.

The player arranges the tiles by sliding a tile vertically or horizontally into the blank space, with the aim of reaching a target arrangement.

Other examples of single-agent pathfinding problems are:

• Travelling Salesman Problem,

• Rubik’s Cube, and

• Theorem Proving.

The travelling salesman problem (TSP),

or in recent years, the travelling

salesperson problem, asks the following

question: "Given a list of cities and the

distances between each pair of cities, what

is the shortest possible route that visits

each city exactly once and returns to the

origin city?"

A TSP tour in the graph is 1-2-4-3-1. The cost of the tour is 10+25+30+15 which is 80.

The problem is a famous NP-hard problem. No polynomial-time algorithm is known for it.

http://www.geeksforgeeks.org/np-completeness-set-1/

Automated theorem proving (also known

as ATP or automated deduction) is a

subfield of automated reasoning and

mathematical logic dealing with proving

mathematical theorems by computer

programs.

Problem Space − It is the environment in

which the search takes place. (A set of

states and set of operators to change

those states)

Problem Instance − It is Initial state +

Goal state.

Problem Space Graph − It represents the problem space. States are shown by nodes and operators are shown by edges.

Depth of a problem − Length of a shortest

path or shortest sequence of operators

from Initial State to goal state.

Space Complexity − The maximum

number of nodes that are stored in

memory.

Time Complexity − The maximum

number of nodes that are created.

Admissibility − A property of an algorithm

to always find an optimal solution.

Branching Factor − The average number

of child nodes in the problem space graph.

Depth − Length of the shortest path from

initial state to goal state.

Problem-solving agents find a sequence of actions that achieves a goal.

Problem-Solving Steps:

• Goal formulation: a goal is a set of acceptable states.

• Problem formulation: choose the operators and the state space.

• Search

• Execute solution

Types of problems:

• Single state problems: the state is always known with certainty.

• Multi state problems: the agent knows only which states it might be in.

• Contingency problems: plans are constructed with conditional parts based on sensors.

• Exploration problems: the agent must learn the effect of actions.

Formal definition of a problem:

• Initial state (or set of states)

• Set of operators

• Goal test on states

• Path cost

Measuring performance: • Does it find a solution?

• What is the search cost?

• What is the total cost?

• (total cost = path cost + search cost)

Choosing states and actions:

Abstraction: remove unnecessary

information from representation;

makes it cheaper to find a solution.

The agent will first formulate its goal, then

it will formulate a problem whose solution

is a path (sequence of actions) to the goal,

and then it will solve the problem using

search.

Often the first step in problem-solving is to simplify the performance measure that the agent is trying to maximize.

Formally, a "goal" is a set of desirable world-states.

"Goal formulation" means ignoring all other aspects of the current state and the performance measure, and choosing a goal.

Example: if you are in Arad (Romania) and your visa will expire tomorrow, your goal is to reach Bucharest airport.

Be sure to notice and understand the

difference between a "goal" and a

description OF a goal.

Technically "reach Bucharest airport" is a

description of a goal.

You can apply this description to particular

states and decide yes/no whether they

belong to the set of goal states.

After goal formulation, the agent must do problem formulation.

This means choosing a relevant set of states, operators for moving from one state to another, the goal test function and the path cost function.

The relevant set of states should include the current state, which is the initial state, and (at least one!) goal state.

The operators correspond to "imaginary" actions that the agent might take.

The goal test function is a function which determines if a single state is a goal state.

The path cost is the sum of the cost of individual actions along a path from one state to another.

Single state problems

Consider the vacuum cleaner world.

Imagine that our intelligent agent is a robot

vacuum cleaner.

Let's suppose that the world has just two

rooms.

The robot can be in either room and there

can be dirt in zero, one, or two rooms.

Goal formulation:

intuitively, we want all the dirt cleaned up. Formally, the goal is { state 7, state 8 }.

Note that the { } notation indicates a set.

Problem formulation:

we already know what the set of all possible states is.

The operators are "move left", "move right", and "vacuum".

Suppose that the robot has no sensor that

can tell it which room it is in and it doesn't

know where it is initially.

Then it must consider sets of possible

states.
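The two-room world is small enough to write out in full. The following is a minimal sketch, assuming a state is a tuple (robot location, dirt in left room, dirt in right room) and that every action is deterministic; the names `result`, `is_goal`, and `run_plan` are hypothetical and do not come from the slides. It applies a fixed plan to the set of all eight states, which is what a robot without sensors has to reason about.

```python
from itertools import product

# A state is (location, dirt_left, dirt_right); location is "L" or "R".
STATES = set(product("LR", [True, False], [True, False]))

def result(state, action):
    """Deterministic effect of one action on one state (assumed model)."""
    loc, dirt_left, dirt_right = state
    if action == "left":
        return ("L", dirt_left, dirt_right)
    if action == "right":
        return ("R", dirt_left, dirt_right)
    if action == "vacuum":
        if loc == "L":
            return (loc, False, dirt_right)
        return (loc, dirt_left, False)
    raise ValueError(f"unknown action: {action}")

def is_goal(state):
    """Goal test: no dirt in either room."""
    _, dirt_left, dirt_right = state
    return not dirt_left and not dirt_right

def run_plan(belief, plan):
    """Apply a plan to every state the robot might currently be in."""
    for action in plan:
        belief = {result(s, action) for s in belief}
    return belief

# With no sensors, the robot starts with all eight states as possibilities.
final_belief = run_plan(STATES, ["right", "vacuum", "left", "vacuum"])
print(final_belief)
```

Under these assumptions the plan shrinks the set of possible states to a single clean state, so a fixed plan is enough as long as the actions behave reliably.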

Suppose that the "vacuum" action sometimes actually deposits dirt on the carpet--but only if the carpet is already clean!

Now [right,vacuum,left,vacuum] is NOT a correct plan, because one room might be clean originally, but then become dirty.

[right,vacuum,vacuum,left,vacuum,vacuum] doesn't work either, and so on.

There doesn't exist any FIXED plan that always works.

So far we have assumed that the robot is ignorant of which rooms are dirty today, but that the robot knows how many rooms there are and what the effect of each available action is.

Suppose the robot is completely ignorant. Then it must take actions for the purpose of acquiring knowledge about their effects, NOT just for their contribution towards achieving a goal.

This is called "exploration" and the agent must do learning about the environment.

An initial state is the description of the

starting configuration of the agent

An action or an operator takes the agent

from one state to another state which is

called a successor state.

A state can have a number of successor

states.

A plan is a sequence of actions.

The cost of a plan is referred to as the path

cost.

The path cost is a positive number, and a

common path cost may be the sum of the

costs of the steps in the path.

• S: the full set of states

• s0: the initial state

• A: S→S is a set of operators

• G: the set of final states. Note that G ⊆ S

The search problem is to find a sequence of actions which transforms the agent from the

initial state to a goal state g∈G.

A search problem is represented by a 4-tuple {S, s0, A, G}.

• S: set of states

• s0 ∈ S: initial state

• A: S → S: operators/actions that transform one state to another state

• G: goal, a set of states. G ⊆ S

This sequence of actions is called a solution plan.

It is a path from the initial state to a goal state.

A plan P is a sequence of actions.

P = {a0, a1, …, aN}, which leads to traversing the states {s0, s1, …, sN+1}, where sN+1 ∈ G.

A sequence of states is called a path.

The cost of a path is a positive number.

In many cases the path cost is computed by taking

the sum of the costs of each action.

A search problem is represented using a

directed graph.

The states are represented as nodes.

The allowed actions are represented as

arcs.

Do until a solution is found or the state

space is exhausted.

1. Check the current state

2. Execute allowable actions to find the

successor states.

3. Pick one of the new states.

4. Check if the new state is a solution state. If it is not, the new state becomes the current state and the process is repeated.

We will now illustrate the searching

process with the help of an example.

S0 is the initial state.

The successor states are the adjacent states in the graph.

There are three goal states.

The grey nodes define the search tree.

Usually the search tree is extended one

node at a time.

The order in which the search tree is

extended depends on the search strategy.

Uninformed search (blind search): all we know about a problem is its definition.

Informed search (heuristic search): besides the problem definition, we have knowledge that lets us estimate which action brings us closer to the goal than the others.

We have 3 pegs (needles) and 3 disks.

Operators: one may move the topmost disk on any needle to the topmost position on any other needle.

In the goal state all the disks are on needle B, as shown in the figure.

In this section we use a map as an example. Each node represents a city, and the number on the edge connecting two cities is the cost of travelling between them.

Brute-Force Search Strategies

Informed (Heuristic) Search Strategies

Local Search Algorithms

Brute-force strategies are the simplest, as they do not need any domain-specific knowledge. They work well when the number of possible states is small.

Requirements −

• State description

• A set of valid operators

• Initial state

• Goal state description

Breadth-First Search

Depth-First Search

Bidirectional Search

Uniform Cost Search

Iterative Deepening Depth-First Search

Breadth-first search starts from the root node, explores the neighboring nodes first, and then moves to the next-level neighbors. It generates the tree one level at a time until the solution is found. It can be implemented using a FIFO queue data structure. This method finds a shortest path (fewest steps) to the solution.
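The FIFO-queue idea can be sketched as follows. This is an illustrative sketch only: the graph `GRAPH` and the function name `breadth_first_search` are hypothetical, and the queue stores whole paths so that the route can be returned directly.

```python
from collections import deque

def breadth_first_search(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    frontier = deque([[start]])   # FIFO queue of paths
    visited = {start}             # states that have already been generated
    while frontier:
        path = frontier.popleft() # oldest path first, so shallow nodes go first
        node = path[-1]
        if node == goal:
            return path
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                frontier.append(path + [child])
    return None

# Hypothetical example graph (not taken from the slides).
GRAPH = {
    "S": ["A", "B"],
    "A": ["C"],
    "B": ["C", "G"],
    "C": ["G"],
}
bfs_path = breadth_first_search(GRAPH, "S", "G")
print(bfs_path)  # ['S', 'B', 'G']
```

The `visited` set is one simple way to avoid generating the same state twice.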

If branching factor (average number of child nodes for a given node) = b and depth = d, then the number of nodes at level d = b^d.

The total number of nodes created in the worst case is b + b^2 + b^3 + … + b^d.
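The worst-case total is a geometric series, so for b > 1 it has a closed form. For example, with b = 2 and d = 3 the sum is 2 + 4 + 8 = 14.

```latex
\[
\sum_{i=1}^{d} b^{i}
  \;=\; b + b^{2} + \dots + b^{d}
  \;=\; \frac{b\,(b^{d}-1)}{b-1}
  \;=\; O(b^{d}), \qquad b > 1 .
\]
```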

Disadvantage − Since each level of nodes

is saved for creating next one, it consumes

a lot of memory space. Space requirement

to store nodes is exponential.

Its complexity depends on the number of

nodes. It can check duplicate nodes.

Depth-first search is implemented using recursion or an explicit LIFO stack. It creates the same set of nodes as the breadth-first method, only in a different order.

As only the nodes on a single path from the root to a leaf are stored in each iteration, the space requirement is linear. With branching factor b and maximum depth m, the storage space is b × m.

Disadvantage − This algorithm may not terminate and may go on infinitely along one path. The solution to this issue is to choose a cut-off depth. If the ideal cut-off is d and the chosen cut-off is less than d, the algorithm may fail. If the chosen cut-off is more than d, execution time increases.

Its complexity depends on the number of paths. It cannot check duplicate nodes.
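A minimal sketch of depth-first search with a cut-off depth, using an explicit LIFO stack. The graph and the names `depth_first_search` and `cutoff` are hypothetical. As an added assumption, the sketch skips any child that already appears on the current path, which is enough to avoid looping in the small cyclic example.

```python
def depth_first_search(graph, start, goal, cutoff):
    """Iterative DFS with an explicit LIFO stack and a depth cut-off."""
    stack = [[start]]                 # each entry is a path from the start
    while stack:
        path = stack.pop()            # LIFO: take the most recent path
        node = path[-1]
        if node == goal:
            return path
        if len(path) - 1 >= cutoff:
            continue                  # do not expand below the cut-off depth
        # Push children in reverse so the first child is explored first.
        for child in reversed(graph.get(node, [])):
            if child not in path:     # skip repeats on the current path
                stack.append(path + [child])
    return None

# Hypothetical example graph with a cycle C -> S.
GRAPH = {
    "S": ["A", "B"],
    "A": ["C"],
    "B": ["G"],
    "C": ["S"],
}
print(depth_first_search(GRAPH, "S", "G", cutoff=2))  # ['S', 'B', 'G']
print(depth_first_search(GRAPH, "S", "G", cutoff=1))  # None
```

The goal lies at depth 2 in this graph, so a cut-off of 1 fails, as described above for a cut-off smaller than d.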

Bidirectional search searches forward from the initial state and backward from the goal state until the two searches meet at a common state.

The path from the initial state is concatenated with the inverse of the path from the goal state.

Each search is done only up to half of the total path.

Uniform cost search orders nodes by increasing cost of the path to each node. It always expands the least-cost node. It is identical to breadth-first search if every transition has the same cost.

It explores paths in increasing order of cost.

Disadvantage − There can be many long paths with cost ≤ C*, where C* is the cost of the optimal solution. Uniform cost search must explore them all.
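Ordering by path cost maps naturally onto a priority queue. The following is a sketch under assumptions: the weighted graph `GRAPH` and the name `uniform_cost_search` are hypothetical, and Python's `heapq` module stands in for the priority queue.

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Return (cost, path) for a cheapest path, or None if unreachable."""
    frontier = [(0, [start])]          # priority queue ordered by path cost
    best_cost = {start: 0}             # cheapest known cost to each node
    while frontier:
        cost, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue                   # stale entry: a cheaper path exists
        for child, step in graph.get(node, {}).items():
            new_cost = cost + step
            if new_cost < best_cost.get(child, float("inf")):
                best_cost[child] = new_cost
                heapq.heappush(frontier, (new_cost, path + [child]))
    return None

# Hypothetical weighted graph: graph[u][v] is the cost of the step u -> v.
GRAPH = {
    "S": {"A": 1, "B": 5},
    "A": {"B": 2, "G": 9},
    "B": {"G": 3},
}
ucs_result = uniform_cost_search(GRAPH, "S", "G")
print(ucs_result)  # (6, ['S', 'A', 'B', 'G'])
```

The cheapest route here uses three steps, while a two-step route through A costs 10, which shows why the fewest steps and the lowest cost can differ.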

Iterative deepening depth-first search performs a depth-first search to level 1, starts over, executes a complete depth-first search to level 2, and continues in this way until a solution is found.

It never creates a node until all shallower nodes have been generated. It only saves a stack of nodes. The algorithm ends when it finds a solution at depth d. The number of nodes created at depth d is b^d and at depth d-1 is b^(d-1).
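The restart-with-a-deeper-limit loop can be sketched as below. The helper names `depth_limited` and `iterative_deepening`, the `max_depth` safety bound, and the example graph are all assumptions made for illustration. The sketch starts at limit 0 so that a start state that is already a goal is also handled.

```python
def depth_limited(graph, node, goal, limit, path):
    """Recursive DFS that does not descend below the given limit."""
    path = path + [node]
    if node == goal:
        return path
    if limit == 0:
        return None
    for child in graph.get(node, []):
        if child not in path:          # skip repeats on the current path
            found = depth_limited(graph, child, goal, limit - 1, path)
            if found is not None:
                return found
    return None

def iterative_deepening(graph, start, goal, max_depth=20):
    """Run depth-limited DFS with limits 0, 1, 2, ... until a path is found."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, start, goal, limit, [])
        if found is not None:
            return limit, found
    return None

# Hypothetical graph: plain DFS would try the longer branch S-A-C-G first.
GRAPH = {
    "S": ["A", "B"],
    "A": ["C"],
    "C": ["G"],
    "B": ["G"],
}
iddfs_result = iterative_deepening(GRAPH, "S", "G")
print(iddfs_result)  # (2, ['S', 'B', 'G'])
```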

Criterion | Breadth First | Depth First | Bidirectional | Uniform Cost | Iterative Deepening
Time | b^d | b^m | b^(d/2) | b^d | b^d
Space | b^d | b × m | b^(d/2) | b^d | b × d
Optimality | Yes | No | Yes | Yes | Yes
Completeness | Yes | No | Yes | Yes | Yes

To solve large problems with large number

of possible states, problem-specific

knowledge needs to be added to increase

the efficiency of search algorithms.

Heuristic Evaluation Functions

Pure Heuristic Search

A* Search

Greedy Best First Search

Heuristic evaluation functions estimate the cost of an optimal path between two states.

A heuristic function for sliding-tile games is computed by counting the number of moves each tile is away from its goal position and adding these counts over all tiles.
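This tile-by-tile count is commonly called the Manhattan distance. A minimal sketch for the 3×3 puzzle follows, assuming a state is a tuple of nine numbers read row by row with 0 standing for the blank; the function name and the example states are hypothetical.

```python
def manhattan_distance(state, goal, width=3):
    """Sum over tiles of |row difference| + |column difference|.

    The blank (0) is ignored, as it is not a tile.
    """
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        goal_index = goal.index(tile)
        total += abs(index // width - goal_index // width)  # rows apart
        total += abs(index % width - goal_index % width)    # columns apart
    return total

GOAL = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)
STATE = (1, 2, 3,
         4, 5, 6,
         0, 7, 8)
print(manhattan_distance(STATE, GOAL))  # 2
```

Tiles 7 and 8 are each one column away from their goal positions, so the estimate is 2, which here equals the true number of moves.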

A heuristic function is a way to inform the search about the direction of a goal.

It provides an informed way to guess which neighbour of a node will lead to a goal.

Pure heuristic search expands nodes in the order of their heuristic values.

It creates two lists: a closed list for the already expanded nodes and an open list for the created but unexpanded nodes.

In each iteration, the node with the minimum heuristic value is expanded, all its child nodes are created, and the expanded node is placed in the closed list.

Then the heuristic function is applied to the child nodes, and they are placed in the open list according to their heuristic values.

The shorter paths are saved and the longer ones are discarded.

A* is the best-known form of best-first search. It avoids expanding paths that are already expensive, and expands the most promising paths first.

f(n) = g(n) + h(n), where

• g(n) is the cost (so far) to reach the node

• h(n) is the estimated cost to get from the node to the goal

• f(n) is the estimated total cost of the path through n to the goal

It is implemented using a priority queue ordered by increasing f(n).

We saw that Uniform Cost Search was

optimal in terms of cost for a weighted

graph.

Now our aim will be to improve the

efficiency of the algorithm with the help of

heuristics.

Particularly, we will be using admissible

heuristics for A* Search

A* Search also makes use of a priority

queue just like Uniform Cost Search with

the element stored being the path from the

start state to a particular node, but the

priority of an element is not the same.

In Uniform Cost Search we used the actual

cost of getting to a particular node from the

start state as the priority.

For A*, we use the cost of getting to a node plus the heuristic at that point as the priority.

Let n be a particular node, then we define g(n) as the cost of getting to the node from the start state and h(n) as the heuristic at that node.

The priority thus is f(n) = g(n) + h(n). The priority is maximum when the f(n) value is least.

We use this priority queue in the following

algorithm, which is quite similar to the

Uniform Cost Search algorithm

Insert the root node into the queue
While the queue is not empty
    Dequeue the element with the highest priority
    (If priorities are the same, the alphabetically smaller path is chosen)
    If the path ends in the goal state, print the path and exit
    Else
        Insert all the children of the dequeued element, with f(n) as the priority
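The queue-based procedure can be sketched in Python as follows. This is an illustration under assumptions: the graph `GRAPH`, the heuristic table `H` (chosen to be admissible for this graph), and the name `a_star` are hypothetical. Queue entries are (f, path, g) tuples, so ties on f fall back to comparing paths alphabetically. The `best_g` bookkeeping, which skips paths that reach a node at a higher cost than one already found, is an addition not present in the pseudocode.

```python
import heapq

def a_star(graph, h, start, goal):
    """Return (cost, path) found by A*, or None if the goal is unreachable."""
    frontier = [(h[start], [start], 0)]    # entries are (f, path, g)
    best_g = {start: 0}                    # cheapest known g for each node
    while frontier:
        f, path, g = heapq.heappop(frontier)   # smallest f = highest priority
        node = path[-1]
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue                       # a cheaper route to node exists
        for child, step in graph.get(node, {}).items():
            new_g = g + step
            if new_g < best_g.get(child, float("inf")):
                best_g[child] = new_g
                new_f = new_g + h[child]   # f(n) = g(n) + h(n)
                heapq.heappush(frontier, (new_f, path + [child], new_g))
    return None

# Hypothetical weighted graph and heuristic estimates of the cost to "G".
GRAPH = {
    "S": {"A": 1, "B": 4},
    "A": {"B": 2, "G": 12},
    "B": {"G": 5},
}
H = {"S": 7, "A": 6, "B": 5, "G": 0}
astar_result = a_star(GRAPH, H, "S", "G")
print(astar_result)  # (8, ['S', 'A', 'B', 'G'])
```

Each value in `H` is no larger than the true remaining cost, which is the admissibility condition mentioned above.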

Greedy best-first search expands the node that is estimated to be closest to the goal. It expands nodes based on f(n) = h(n).

It is implemented using a priority queue.

Disadvantage − It can get stuck in loops. It is not optimal.

Local search algorithms start from a prospective solution and then move to a neighboring solution.

They can return a valid solution even if they are interrupted at any time before they end.

Hill-Climbing Search

Local Beam Search

Simulated Annealing

Travelling Salesman Problem

Hill climbing is an iterative algorithm that starts with an arbitrary solution to a problem and attempts to find a better solution by incrementally changing a single element of the solution.

If the change produces a better solution, the changed solution is taken as the new solution.

This process is repeated until there are no further improvements.

function Hill-Climbing (problem), returns a state that is a local maximum.
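The body of the function is not given here, so the following is only a sketch of one common variant (steepest ascent: move to the best neighbor while it improves). The one-dimensional `LANDSCAPE` and the helper names `value` and `neighbors` are hypothetical.

```python
def hill_climbing(start, neighbors, value):
    """Move to the best neighbor until no neighbor is better (local maximum)."""
    current = start
    while True:
        best = max(neighbors(current), key=value, default=current)
        if value(best) <= value(current):
            return current            # no improving neighbor: stop here
        current = best

# Hypothetical objective values indexed by position 0..5.
LANDSCAPE = [1, 3, 2, 5, 9, 4]

def value(i):
    return LANDSCAPE[i]

def neighbors(i):
    # A state changes by one step left or right, staying inside the list.
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]

print(hill_climbing(0, neighbors, value))  # 1
print(hill_climbing(2, neighbors, value))  # 4
```

Starting at position 0 the search stops at position 1 (value 3), while the best value 9 sits at position 4. The result depends on the starting point.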

Disadvantage of Hill climbing:

This algorithm is neither complete, nor

optimal.

Local beam search holds k states at any given time.

At the start, these states are generated randomly.

The successors of these k states are computed with the help of objective function.

If any of these successors is the maximum value of the objective function, then the algorithm stops.

Otherwise, the initial k states and their k successors (2k states in total) are placed in a pool.

The pool is then sorted numerically. The highest k states are selected as the new initial states.

This process continues until a maximum value is reached.

function BeamSearch( problem, k), returns a solution state.
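A sketch of the pooling step follows, with several assumptions that differ from the description above: the starting states are passed in explicitly instead of being generated randomly (so the run is repeatable), all neighbors of each state go into the pool, and the loop stops when the best k states no longer change, since the maximum value is usually not known in advance. All names and the `LANDSCAPE` data are hypothetical.

```python
def local_beam_search(initial_states, neighbors, value, k, max_iters=100):
    """Keep the k best states from the pool of current states and successors."""
    def top_k(pool):
        # Sort by value, then by state, so ties are broken deterministically.
        return sorted(pool, key=lambda s: (value(s), s), reverse=True)[:k]

    states = top_k(set(initial_states))
    for _ in range(max_iters):
        pool = set(states)
        for s in states:
            pool.update(neighbors(s))     # add successors to the pool
        best = top_k(pool)
        if best == states:
            break                         # no change: stop
        states = best
    return states[0]

# Hypothetical objective values indexed by position 0..7.
LANDSCAPE = [1, 3, 2, 5, 9, 4, 0, 7]

def value(i):
    return LANDSCAPE[i]

def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]

beam_result = local_beam_search([0, 6], neighbors, value, k=2)
print(beam_result)  # 4
```

With k = 2 the search started from positions 0 and 6 reaches position 4 (value 9). A single climber started at position 0 would stop at position 1.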

Annealing is the process of heating and

cooling a metal to change its internal

structure for modifying its physical

properties.

When the metal cools, its new structure is fixed, and the metal retains its newly obtained properties.

In simulated annealing process, the

temperature is kept variable.

We initially set the temperature high and

then allow it to ‘cool' slowly as the

algorithm proceeds.

When the temperature is high, the

algorithm is allowed to accept worse

solutions with high frequency.

Start
Initialize k = 0; L = integer number of variables;
From i → j, search the performance difference ∆.
If ∆ <= 0 then accept, else if exp(-∆/T(k)) > random(0,1) then accept;
Repeat steps 1 and 2 for L(k) steps.
k = k + 1;
Repeat steps 1 through 4 till the criterion is met.
End
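The acceptance rule can be sketched as follows. The cooling schedule is not specified above, so this sketch assumes a geometric schedule (the temperature is multiplied by `cooling` after each batch of steps); the parameter values, the toy cost function, and all names are hypothetical. A fixed seed makes the run repeatable.

```python
import math
import random

def simulated_annealing(start, neighbor, cost, t0=10.0, cooling=0.95,
                        steps_per_temp=20, t_min=1e-3, seed=0):
    """Minimize cost; worse moves are accepted with probability exp(-delta/T)."""
    rng = random.Random(seed)
    current = best = start
    t = t0
    while t > t_min:
        for _ in range(steps_per_temp):
            candidate = neighbor(current, rng)
            delta = cost(candidate) - cost(current)
            # Always accept improvements; accept worse moves with some chance.
            if delta <= 0 or math.exp(-delta / t) > rng.random():
                current = candidate
                if cost(current) < cost(best):
                    best = current        # remember the best state seen
        t *= cooling                      # assumed geometric cooling
    return best

def cost(x):
    return (x - 3) ** 2                   # toy cost with its minimum at x = 3

def neighbor(x, rng):
    return x + rng.choice([-1, 1])        # step one unit left or right

sa_result = simulated_annealing(20, neighbor, cost)
print(sa_result, cost(sa_result))
```

At high temperature `exp(-delta / t)` is close to 1, so worse moves are accepted often. As `t` falls, the rule behaves more and more like plain hill climbing.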

In this algorithm, the objective is to find a

low-cost tour that starts from a city, visits

all cities en-route exactly once and ends at

the same starting city.

Start
Find all (n - 1)! possible solutions, where n is the total number of cities.
Determine the minimum cost by finding the cost of each of these (n - 1)! solutions.
Finally, keep the one with the minimum cost.
End
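The enumeration can be sketched with `itertools.permutations`. The distance table below is partly an assumption: the costs 10, 25, 30 and 15 are the ones used in the tour 1-2-4-3-1 earlier, while the distances 1-4 = 20 and 2-3 = 35 are made up for illustration. The function names are hypothetical.

```python
from itertools import permutations

def tour_cost(tour, dist):
    """Sum the distances between consecutive cities on the tour."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

def brute_force_tsp(dist, start):
    """Try all (n - 1)! orderings of the other cities and keep the cheapest."""
    others = [city for city in dist if city != start]
    best_tour, best_cost = None, float("inf")
    for perm in permutations(others):
        tour = [start, *perm, start]      # leave from and return to start
        cost = tour_cost(tour, dist)
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour, best_cost

# Symmetric distances; 1-4 and 2-3 are assumed values (see the note above).
DIST = {
    1: {2: 10, 3: 15, 4: 20},
    2: {1: 10, 3: 35, 4: 25},
    3: {1: 15, 2: 35, 4: 30},
    4: {1: 20, 2: 25, 3: 30},
}
tsp_result = brute_force_tsp(DIST, 1)
print(tsp_result)  # ([1, 2, 4, 3, 1], 80)
```

With four cities there are only 3! = 6 orderings to check, but the count grows factorially, so this approach is practical only for very small n.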
