
Planning (Ch. 10)


Post on 22-Mar-2020


Page 1: Planning (Ch. 10) · Mutexes C G Q M W D T F C ┐C G ┐G Q ┐Q P None of the pairs are in mutex at level 1 This is our heuristic ... process: building new & interesting algorithms

Planning (Ch. 10)

Page 2

GraphPlan: states

Let's consider this problem:

Page 3

[Figure: the planning graph for this problem, with literals C/┐C, G/┐G, Q/┐Q, P and nodes M, W, D, T, F across levels]

Mutexes

[Figure: the same planning graph with mutex links marked]

Blue mutexes disappear

Pink = new mutex

Page 4

GraphPlan as heuristic

GraphPlan is optimistic, so if any pair of goal states is in mutex, the goal is impossible

3 basic ways to use GraphPlan as a heuristic:
(1) Maximum level of all goals
(2) Sum of levels of all goals (not admissible)
(3) Level where no pair of goals is in mutex

(1) and (2) do not require any mutexes, but are less accurate (quick 'n' dirty)

Page 5

GraphPlan as heuristic

For heuristics (1) and (2), we relax as such:
1. Multiple actions per step, so it can only take fewer steps to reach the same result
2. Never remove any states, so the number of possible states only increases

This is a valid simplification of the problem, but it is often too simplistic to use directly

Page 6

GraphPlan as heuristic

Heuristic (1) directly uses this relaxation and finds the first time when all 3 goals appear at a state level

(2) tries to sum the levels of each individual first appearance, which is not admissible (but works well if they are independent parts)

Our problem: goal = {Food, ┐Garbage, Present}
First appearance: F=1, ┐G=1, P=1

Page 7

GraphPlan: states

[Figure: the planning graph expanded from level 0 to level 1]

Heuristic (1): Max(1,1,1) = 1

Heuristic (2): 1+1+1 = 3
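Heuristics (1) and (2) can be sketched in a few lines of Python, assuming we have already recorded the first level at which each goal literal appears (the dictionary below just encodes the worked example; `not_G` stands for ┐G):

```python
# First level at which each goal literal appears in the relaxed graph
# (values from the worked example: F, ┐G, and P all appear at level 1).
first_level = {"F": 1, "not_G": 1, "P": 1}

def h_max(goals, first_level):
    """Heuristic (1): maximum first-appearance level over all goals (admissible)."""
    return max(first_level[g] for g in goals)

def h_sum(goals, first_level):
    """Heuristic (2): sum of first-appearance levels (not admissible)."""
    return sum(first_level[g] for g in goals)

goals = ["F", "not_G", "P"]
print(h_max(goals, first_level))  # 1
print(h_sum(goals, first_level))  # 3
```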

Page 8

GraphPlan as heuristic

Often the problem is too trivial with just those two simplifications

So we add in mutexes to keep track of invalid pairs of states/actions

This is still a simplification, as only impossible state/action pairs in the original problem are in mutex in the relaxation

Page 9

GraphPlan as heuristic

Heuristic (3) looks to find the first time none of the goal pairs are in mutex

For our problem, the goal states are: (Food, ┐Garbage, Present)

So all pairs that need to have no mutex: (F, ┐G), (F, P), (┐G, P)
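Heuristic (3) can be sketched similarly, assuming we have the set of mutex pairs at each level; the data here is illustrative, matching the example where no goal pair is in mutex at level 1:

```python
from itertools import combinations

def h_mutex_level(goals, mutexes_by_level):
    """Heuristic (3): first level where no pair of goals is in mutex."""
    pairs = [frozenset(p) for p in combinations(goals, 2)]
    for level, mutexes in enumerate(mutexes_by_level):
        if not any(p in mutexes for p in pairs):
            return level
    return None  # the graph may need further expansion

# Illustrative data: at level 0 the goal pairs are still mutex,
# at level 1 none of (F,┐G), (F,P), (┐G,P) are in mutex.
mutexes_by_level = [
    {frozenset({"F", "not_G"}), frozenset({"F", "P"}), frozenset({"not_G", "P"})},
    set(),
]
print(h_mutex_level(["F", "not_G", "P"], mutexes_by_level))  # 1
```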

Page 10

Mutexes

[Figure: the mutexes at level 1 of the planning graph]

None of the pairs are in mutex at level 1

This is our heuristic estimate

Page 11

Finding a solution

GraphPlan can also be used to find a solution:
(1) Converting to a Constraint Satisfaction Problem
(2) Backwards search

Both of these ways can be run once GraphPlan has all goal pairs not in mutex (or converges)

Additionally, you might need to extend the graph out a few more levels to find a solution (as GraphPlan underestimates)

Page 12

GraphPlan as CSP

Variables = states, Domains = actions out of each state, Constraints = mutexes & preconditions

from Do & Kambhampati
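As a rough sketch of the CSP view (the tiny instance below is hypothetical and not the Do & Kambhampati encoding verbatim): variables are goal propositions, each domain is the set of actions that can produce that proposition, and mutex action pairs become constraints:

```python
from itertools import product

# Hypothetical tiny instance: each goal proposition (variable) can be
# produced by some actions (its domain); some action pairs are mutex.
domains = {
    "F": ["cook", "buy"],
    "P": ["wrap"],
    "not_G": ["carry", "dolly"],
}
mutex_actions = {frozenset({"cook", "carry"})}  # e.g. conflicting effects

def solve(domains, mutex_actions):
    """Brute-force the CSP: pick one supporting action per goal, avoiding mutexes."""
    names = list(domains)
    for choice in product(*(domains[n] for n in names)):
        if all(frozenset({a, b}) not in mutex_actions
               for i, a in enumerate(choice) for b in choice[i+1:]):
            return dict(zip(names, choice))
    return None

print(solve(domains, mutex_actions))
```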

Page 13

Finding a solution

For backward search, attempt to find arrows back to the initial state (without conflict/mutex)

Start by finding actions that satisfy all goal conditions, then recursively try to satisfy all of the selected actions' preconditions

If this fails to find a solution, mark this level and all the goals not satisfied as: (level, goals)
If (level, goals) stops changing, there is no solution

Page 14

Graph Plan

Remember this...

Page 15

Graph Plan

[Figure: planning graph over three levels with literals D/┐D, S/┐S, M/┐M and actions Sc, J, P]

Ask: ┐D ^ S ^ ┐M. Find first level with no mutex...

Page 16

Graph Plan

[Figure: the same planning graph with a first backward-search attempt (numbered steps) marked]

Ask: ┐D ^ S ^ ┐M ... then back-search

Error! States of 1 & 4 in mutex

Page 17

Graph Plan

[Figure: the same planning graph with a different backward path (numbered steps) marked]

Ask: ┐D ^ S ^ ┐M, try a different back path...

Error: actions 3 & 4 in mutex

Page 18

Graph Plan

[Figure: the same planning graph with the successful backward path marked]

Ask: ┐D ^ S ^ ┐M, found a solution!

Page 19

Finding a solution

Formally, the algorithm is:

graph = initial
noGoods = empty table (hash)
for level = 0 to infinity:
    if all goal pairs not in mutex:
        solution = recursive search with noGoods
        if success, return paths
    if graph & noGoods converged, return fail
    graph = expand graph

Page 20

[Figure: the planning graph and mutex diagram from before]

You try it!

Page 21

Neural networks (Ch. 12)

Page 22

Biology: brains

Computer science is fundamentally a creative process: building new & interesting algorithms

As with other creative processes, this involves mixing ideas together from various places

Neural networks get their inspiration from how brains work at a fundamental level (simplification... of course)

Page 23

Biology: brains

(Disclaimer: I am not a neuroscience person) Brains receive small chemical signals at the "input" side; if there are enough inputs to "activate" it, it signals an "output"

Page 24

Biology: brains

An analogy is sleeping: when you are asleep, minor sounds will not wake you up

However, specific sounds in combination with their volume will wake you up

Page 25

Biology: brains

Other sounds might help you go to sleep (my majestic voice?)

Many babies tend to sleep better with "white noise" and some people like the TV/radio on

Page 26

Neural network: basics

Neural networks are connected nodes, which can be arranged into layers (more on this later)

First is an example of a perceptron, the simplest NN; a single node on a single layer

Page 27

Neural network: basics

Neural networks are connected nodes, whichcan be arranged into layers (more on this later)

First is an example of a perceptron, the mostsimple NN; a single node on a single layer

inputsoutput

activation function

Page 28

Mammals

Let's do an example with mammals...

First the definition of a mammal (wikipedia):

Mammals [possess]: (1) a neocortex (a region of the brain), (2) hair, (3) three middle ear bones, and (4) mammary glands

Page 29

Mammals

Common mammal misconceptions:
(1) Warm-blooded
(2) Does not lay eggs

Let's talk dolphins for one second.http://mentalfloss.com/article/19116/if-dolphins-are-mammals-and-all-mammals-have-hair-why-arent-dolphins-hairy

Dolphins have hair (technically) for the first week after birth, then lose it for the rest of their lives... I will count this as "not covered in hair"

Page 30

Perceptrons

Consider this example: we want to classify whether or not an animal is a mammal via a perceptron (weighted evaluation)

We will evaluate on:
1. Warm blooded? (WB) Weight = 2
2. Lays eggs? (LE) Weight = -2
3. Covered in hair? (CH) Weight = 3

Page 31

Perceptrons

Consider the following animals:
Humans {WB=y, LE=n, CH=y}, mam=y
Bat {WB=sorta, LE=n, CH=y}, mam=y

What about these?
Platypus {WB=y, LE=y, CH=y}, mam=y
Dolphin {WB=y, LE=n, CH=n}, mam=y
Fish {WB=n, LE=y, CH=n}, mam=n
Birds {WB=y, LE=y, CH=n}, mam=n
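These evaluations can be checked with a small sketch, assuming yes = 1, no = 0, "sorta" = 0.5, and guessing mammal when the weighted sum is positive (the encoding and the > 0 threshold are assumptions; the slides do not state them):

```python
# Weights from the slides: WB = 2, LE = -2, CH = 3.
WEIGHTS = {"WB": 2, "LE": -2, "CH": 3}

def perceptron(animal):
    """Guess mammal if the weighted sum of features is positive (assumed threshold)."""
    return sum(WEIGHTS[f] * v for f, v in animal.items()) > 0

animals = {
    "human":    ({"WB": 1,   "LE": 0, "CH": 1}, True),
    "bat":      ({"WB": 0.5, "LE": 0, "CH": 1}, True),
    "platypus": ({"WB": 1,   "LE": 1, "CH": 1}, True),
    "dolphin":  ({"WB": 1,   "LE": 0, "CH": 0}, True),
    "fish":     ({"WB": 0,   "LE": 1, "CH": 0}, False),
    "bird":     ({"WB": 1,   "LE": 1, "CH": 0}, False),
}
for name, (features, is_mammal) in animals.items():
    print(name, perceptron(features) == is_mammal)  # all True
```

Note that birds score exactly 0 (2 - 2 + 0), so the strict > 0 threshold is what keeps them out.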

Page 32

Perceptrons

But wait... what is the general form of: w1*x1 + w2*x2 + w3*x3 > 0?

Page 33

Perceptrons

But wait... what is the general form of: w1*x1 + w2*x2 + w3*x3 > 0?

This is simply one side of a plane in 3D, so this is trying to classify all possible points using a single plane...

Page 34

Perceptrons

If we had only 2 inputs, it would be everything above a line in 2D, but consider XOR on the right

There is no way a line can possibly classify this (a limitation of perceptrons)
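The XOR limitation can be demonstrated by brute force: searching a grid of candidate lines (the weight and threshold grid values below are arbitrary) never classifies all four XOR points correctly:

```python
# Brute-force check: no single line w1*x + w2*y > b classifies XOR.
points = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def correct(w1, w2, b):
    """How many of the 4 XOR points this line classifies correctly."""
    return sum((w1*x + w2*y > b) == bool(t) for (x, y), t in points.items())

grid = [i / 4 for i in range(-8, 9)]  # candidate values from -2 to 2
best = max(correct(w1, w2, b) for w1 in grid for w2 in grid for b in grid)
print(best)  # 3 -- at most 3 of the 4 XOR points, never all 4
```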

Page 35

Neural network: feed-forward

Today we will look at feed-forward NNs, where information flows in a single direction

Recurrent networks can have the outputs of one node loop back as inputs to previous nodes

This can cause the NN to not converge on an answer (ask it the same question and it will respond differently) and it also has to maintain some "initial state" (all around messy)

Page 36

Neural network: feed-forward

Let's expand our mammal classification to 5 nodes in 3 layers (weights on edges):

[Figure: inputs WB, LE, CH feed nodes N1, N2, N3, which feed N4 and N5, with weights on the edges]

if Output(Node 5) > 0, guess mammal

Page 37

Neural network: feed-forward

You try Bat on this: {WB=0, LE=-1, CH=1}

[Figure: the same 5-node network with weights on the edges]

if Output(Node 5) > 0, guess mammal

Assume (for now) output = sum of inputs

Page 38

Neural network: feed-forward

The output is -7, so bats are not mammals... Oops...

[Figure: the network evaluated on Bat, with the intermediate node values filled in; node 5 outputs -7]

if Output(Node 5) > 0, guess mammal

Page 39

Neural network: feed-forward

In fact, this is no better than our 1-node NN

This is because we simply output a linear combination of weights into a linear function (i.e. if f(x) and g(x) are linear... then g(x)+f(x) is also linear)
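This collapse can be checked directly: composing two layers of weighted sums equals a single layer whose weights are the matrix product (the weights below are made up for illustration):

```python
def linear_layer(weights, inputs):
    """Each output node is a weighted sum of its inputs (no activation function)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

W1 = [[2, -1, 3], [1, 0, -2]]   # made-up weights: 3 inputs -> 2 hidden nodes
W2 = [[1, 2]]                   # 2 hidden nodes -> 1 output

# Combined single-layer weights: the matrix product W2 @ W1
combined = [[sum(W2[0][k] * W1[k][j] for k in range(2)) for j in range(3)]]

x = [0.5, -1.0, 2.0]
two_layer = linear_layer(W2, linear_layer(W1, x))
one_layer = linear_layer(combined, x)
print(two_layer, one_layer)  # identical outputs
```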

Ideally, we want an activation function that has a limited range, so large signals do not always dominate

Page 40

Neural network: feed-forward

One commonly used function is the sigmoid: S(x) = 1 / (1 + e^(-x))
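A minimal sketch of the sigmoid, matching the S(...) notation used in the worked example that follows:

```python
import math

def S(x):
    """Sigmoid activation: squashes any input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(S(0))                  # 0.5
print(round(S(0.3775), 5))   # 0.59327 (the hidden-node output used later)
print(S(10), S(-10))         # large signals saturate near 1 and 0
```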

Page 41

Back-propagation

The neural network is only as good as its structure and the weights on its edges

Structure we will ignore (more complex), but there is an automated way to learn weights

Whenever a NN incorrectly answers a problem, the weights play a "blame game"...
- Weights that have a big impact on the wrong answer are reduced

Page 42

Back-propagation

To do this blaming, we have to find how much each weight influenced the final answer

Steps:
1. Find total error
2. Find derivative of error w.r.t. weights
3. Penalize each weight by an amount proportional to this derivative

Page 43

Back-propagation

Consider this example: 4 nodes, 2 layers

[Figure: a 4-node, 2-layer network. Inputs in1 and in2 feed the first layer of nodes (weights w1-w4), which feeds the second layer (weights w5-w8), producing out1 and out2; a node with a constant value of 1 contributes biases b1 and b2]

Page 44

Back-propagation

[Figure: the same network with in1 = 0.05, in2 = 0.1, weights w1-w8 = .15, .2, .25, .3, .4, .45, .5, .55, and biases 0.35 and 0.6]

Node 1: 0.15*0.05 + 0.2*0.1 + 0.35 (bias) = 0.3775 as its input, thus it outputs S(0.3775) = 0.59327 (along all its out edges)

Page 45

Back-propagation

[Figure: the same network with the computed node outputs filled in]

Eventually we get: out1 = 0.606, out2 = 0.630

Suppose we wanted: out1 = 0.01, out2 = 0.99

Page 46

Back-propagation

We will define the error as: Error = Σ ½(want - out)² (you will see why shortly)

Suppose we want to find how much w5 is to blame for our incorrectness

We then need to find ∂Error/∂w5. Apply the chain rule:
∂Error/∂w5 = (∂Error/∂out1)(∂out1/∂net1)(∂net1/∂w5), where net1 is the weighted input sum at out1's node
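This chain-rule computation can be checked numerically. The sketch below assumes the network from the figure (inputs 0.05 and 0.1, weights .15 through .55, biases 0.35 and 0.6, sigmoid activations, and target out1 = 0.01):

```python
import math

def S(x):
    return 1.0 / (1.0 + math.exp(-x))

i1, i2 = 0.05, 0.1
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.6
target1 = 0.01

# Forward pass
out_h1 = S(w1*i1 + w2*i2 + b1)        # S(0.3775) = 0.59327
out_h2 = S(w3*i1 + w4*i2 + b1)
out_o1 = S(w5*out_h1 + w6*out_h2 + b2)

# Chain rule: dError/dw5 = dE/dout_o1 * dout_o1/dnet_o1 * dnet_o1/dw5
d_w5 = (out_o1 - target1) * out_o1 * (1 - out_o1) * out_h1
print(round(d_w5, 5))                 # 0.08217, matching the slide

alpha = 0.5
print(round(w5 - alpha * d_w5, 4))    # 0.3589, the updated w5
```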

Page 47

Back-propagation

Page 48

Back-propagation

In a picture we did this:

Now that we know w5 is 0.08217 part responsible, we update the weight by:
w5 ← w5 - α * 0.08217 = 0.3589 (from 0.4)

α is the learning rate, set to 0.5

Page 49

Back-propagation

Updating w5 through w8 this way gives:
w5 = 0.3589
w6 = 0.4067
w7 = 0.5113
w8 = 0.5614

For other weights, you need to consider all possible ways in which they contribute

Page 50

Back-propagation

For w1 it would look like:

(the book describes how to compute this with dynamic programming)

Page 51

Back-propagation

Specifically for w1 you would get:

Next we have to break down the top equation...

Page 52

Back-propagation

Page 53

Back-propagation

Similarly for Error2 we get:

You might notice this is small... This is an issue with neural networks: the deeper the network, the less the earlier nodes update
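The vanishing effect can be illustrated with the sigmoid's derivative: S'(x) = S(x)(1 - S(x)) is at most 0.25 (at x = 0), so, ignoring the weight factors, each extra sigmoid layer multiplies the back-propagated gradient by at most 0.25:

```python
import math

def S(x):
    return 1.0 / (1.0 + math.exp(-x))

def S_prime(x):
    """Derivative of the sigmoid: S(x) * (1 - S(x)), peaking at 0.25."""
    return S(x) * (1 - S(x))

print(S_prime(0))  # 0.25, the maximum possible value
for n in (1, 3, 5, 10):
    # Best-case gradient factor after passing back through n sigmoid layers
    print(n, 0.25 ** n)
```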