
Making Decisions Under Uncertainty

What an agent should do depends on:

The agent’s ability: what options are available to it.

The agent’s beliefs: the ways the world could be, given the agent’s knowledge. Sensing updates the agent’s beliefs.

The agent’s preferences: what the agent wants and the tradeoffs when there are risks.

Decision theory specifies how to trade off the desirability and probabilities of the possible outcomes for competing actions.

© D. Poole and A. Mackworth 2019, Artificial Intelligence, Lecture 9.2


Decision Variables

Decision variables are like random variables that an agent gets to choose a value for.

A possible world specifies a value for each decision variable and each random variable.

For each assignment of values to all decision variables, there is a probability distribution over random variables.

The probability of a proposition is undefined unless the agent conditions on the values of all decision variables.


Decision Tree for Delivery Robot

The robot can choose to wear pads to protect itself or not. The robot can choose to go the short way past the stairs or a long way that reduces the chance of an accident. There is one random variable of whether there is an accident.

[Figure: decision tree. Each path through the tree ends in one of eight possible worlds:]

wear pads, short way, accident: w0 - moderate damage
wear pads, short way, no accident: w1 - quick, extra weight
wear pads, long way, accident: w2 - moderate damage
wear pads, long way, no accident: w3 - slow, extra weight
don’t wear pads, short way, accident: w4 - severe damage
don’t wear pads, short way, no accident: w5 - quick, no weight
don’t wear pads, long way, accident: w6 - severe damage
don’t wear pads, long way, no accident: w7 - slow, no weight

Square boxes represent decisions that the robot can make. Circles represent random variables that the robot can’t observe before making its decision.
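As a rough illustration of how the eight possible worlds arise, the sketch below enumerates every assignment to the two decision variables and the one random variable. The pairing of world names with branches is inferred from the leaf labels (pads mean extra weight, the short way is quick), so treat it as an assumption rather than something the slide states explicitly.

```python
from itertools import product

# Two decision variables (chosen by the robot) and one random variable.
wear_pads_options = ["wear pads", "don't wear pads"]
which_way_options = ["short way", "long way"]
accident_outcomes = ["accident", "no accident"]

# Leaf labels from the decision tree, in the assumed order w0..w7.
outcomes = [
    "moderate damage", "quick, extra weight",  # w0, w1
    "moderate damage", "slow, extra weight",   # w2, w3
    "severe damage", "quick, no weight",       # w4, w5
    "severe damage", "slow, no weight",        # w6, w7
]

# product() varies its last argument fastest, which matches the w0..w7 numbering.
worlds = {
    f"w{i}": (pads, way, acc, outcomes[i])
    for i, (pads, way, acc) in enumerate(
        product(wear_pads_options, which_way_options, accident_outcomes)
    )
}

for name, world in worlds.items():
    print(name, world)
```

Each world fixes a value for every variable, which is why 2 x 2 x 2 = 8 worlds appear.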


Single decisions

Single decisions: the agent makes all decisions before acting.

The agent can choose a value for each decision variable.

Let’s combine all decision variables into a single variable D.

The expected utility of decision D = d_i is

$$E(u \mid D = d_i) = \sum_{\omega \in \Omega} P(\omega \mid D = d_i) \times u(\omega)$$

where u(·) is the utility function and Ω is the set of all worlds.

An optimal single decision is a decision D = d_max whose expected utility is maximal:

$$E(u \mid D = d_{max}) = \max_{d_i \in domain(D)} E(u \mid D = d_i).$$


Single-stage decision networks

Extend belief networks with:

Decision nodes, whose values the agent chooses. The domain is the set of possible actions. Drawn as a rectangle.

A utility node, whose parents are the variables on which the utility depends. Drawn as a diamond.

[Figure: decision network for the delivery robot, with nodes Which Way, Wear Pads, Accident, and Utility.]

This shows explicitly which nodes affect whether there is an accident.


A single-stage decision network consists of:

A DAG with three sorts of nodes: decision, random, and utility. Random nodes are the same as the nodes in a belief network.

A domain for each decision variable and each random variable.

A unique utility node. The utility node has no children and no domain.

A single-stage decision network has the factors:

A utility function, which is a factor on the parents of the utility node.

A conditional probability for each random variable given its parents.

(No tables associated with the decision nodes.)
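One possible way to hold such a network in code is sketched below. The `Node` class, the `check_single_stage` helper, and the node names are hypothetical, and the parent sets for the delivery robot are an assumption read off the variables that appear in the example factors. The check covers only the conditions on the utility node and the domains; it does not test that the graph is acyclic.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    kind: str            # "decision", "random", or "utility"
    parents: tuple = ()
    domain: tuple = ()   # left empty for the utility node


def check_single_stage(nodes):
    """Return a list of violated structural conditions (empty if none)."""
    problems = []
    utility_nodes = [n for n in nodes if n.kind == "utility"]
    if len(utility_nodes) != 1:
        problems.append("there must be exactly one utility node")
    for u in utility_nodes:
        if u.domain:
            problems.append("the utility node has no domain")
        if any(u.name in n.parents for n in nodes):
            problems.append("the utility node has no children")
    for n in nodes:
        if n.kind in ("decision", "random") and not n.domain:
            problems.append(f"{n.name} needs a domain")
    return problems


# Assumed structure of the delivery robot network.
robot = [
    Node("WhichWay", "decision", domain=("short", "long")),
    Node("WearPads", "decision", domain=(True, False)),
    Node("Accident", "random", parents=("WhichWay",), domain=(True, False)),
    Node("Utility", "utility", parents=("WhichWay", "Accident", "WearPads")),
]

print(check_single_stage(robot))
```

Only the random node and the utility node would carry tables; the decision nodes carry just a domain, matching the remark that no tables are associated with them.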


Finding an optimal decision

Suppose the random variables are X_1, ..., X_n, and the utility depends on X_{i_1}, ..., X_{i_k}.

$$E(u \mid D) = \sum_{X_1, \dots, X_n} P(X_1, \dots, X_n \mid D) \times u(X_{i_1}, \dots, X_{i_k})$$

$$= \sum_{X_1, \dots, X_n} \prod_{i=1}^{n} P(X_i \mid \text{parents}(X_i)) \times u(X_{i_1}, \dots, X_{i_k})$$

To find an optimal decision:

- Create a factor for each conditional probability and for the utility.
- Sum out all of the random variables.
- This creates a factor on D that gives the expected utility for each value in the domain of D.
- Choose the D with the maximum value in the factor.
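The steps above can be sketched with a small table-based factor class. This is a minimal illustration, not the lecture's implementation: the `Factor` class, the helper names, and the numbers in the example are all made up, and for simplicity it multiplies every factor together before summing out, which is fine for tiny networks but not efficient in general.

```python
from itertools import product


class Factor:
    """A table mapping assignments of `variables` to numbers."""

    def __init__(self, variables, table):
        self.variables = tuple(variables)
        self.table = dict(table)  # key: tuple of values, in variable order


def multiply(f, g, domains):
    """Pointwise product of two factors over the union of their variables."""
    variables = f.variables + tuple(v for v in g.variables if v not in f.variables)
    table = {}
    for values in product(*(domains[v] for v in variables)):
        row = dict(zip(variables, values))
        f_key = tuple(row[v] for v in f.variables)
        g_key = tuple(row[v] for v in g.variables)
        table[values] = f.table[f_key] * g.table[g_key]
    return Factor(variables, table)


def sum_out(f, var):
    """Sum a variable out of a factor."""
    idx = f.variables.index(var)
    variables = f.variables[:idx] + f.variables[idx + 1:]
    table = {}
    for values, x in f.table.items():
        key = values[:idx] + values[idx + 1:]
        table[key] = table.get(key, 0.0) + x
    return Factor(variables, table)


def optimal_decision(factors, random_vars, domains):
    """Multiply all factors, sum out the random variables, pick the best row."""
    joint = factors[0]
    for f in factors[1:]:
        joint = multiply(joint, f, domains)
    for var in random_vars:
        joint = sum_out(joint, var)
    best = max(joint.table, key=joint.table.get)
    return dict(zip(joint.variables, best)), joint.table[best]


# Hypothetical network: one decision D, one random variable X.
domains = {"D": ("a", "b"), "X": (True, False)}
p_x = Factor(("D", "X"), {("a", True): 0.5, ("a", False): 0.5,
                          ("b", True): 0.1, ("b", False): 0.9})
u = Factor(("X", "D"), {(True, "a"): 10, (False, "a"): 2,
                        (True, "b"): 0, (False, "b"): 5})

decision, value = optimal_decision([p_x, u], ["X"], domains)
print(decision, value)
```

After the random variables are summed out, the remaining factor is indexed only by the decision variables, so taking the maximum row is the final step of the procedure.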


Example Initial Factors

Conditional probability factor for Accident given Which Way:

Which Way   Accident   Value
long        true       0.01
long        false      0.99
short       true       0.2
short       false      0.8

Utility factor:

Which Way   Accident   Wear Pads   Value
long        true       true        30
long        true       false       0
long        false      true        75
long        false      false       80
short       true       true        35
short       true       false       3
short       false      true        95
short       false      false       100


After summing out Accident

Which Way   Wear Pads   Value
long        true        74.55
long        false       79.2
short       true        83.0
short       false       80.6
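The values in this factor can be reproduced directly from the two initial factors. The sketch below is one way to do the arithmetic; the variable names are illustrative, and the numbers are copied from the example tables.

```python
# P(Accident | Which Way), from the first initial factor.
p_accident = {
    ("long", True): 0.01, ("long", False): 0.99,
    ("short", True): 0.2, ("short", False): 0.8,
}

# Utility(Which Way, Accident, Wear Pads), from the second initial factor.
utility = {
    ("long", True, True): 30, ("long", True, False): 0,
    ("long", False, True): 75, ("long", False, False): 80,
    ("short", True, True): 35, ("short", True, False): 3,
    ("short", False, True): 95, ("short", False, False): 100,
}

# Sum out Accident: for each (way, pads), add P(accident | way) * utility.
expected_utility = {
    (way, pads): sum(p_accident[way, acc] * utility[way, acc, pads]
                     for acc in (True, False))
    for way in ("long", "short")
    for pads in (True, False)
}

for (way, pads), value in expected_utility.items():
    print(way, pads, round(value, 2))

# The optimal single decision is the row with the largest expected utility.
best = max(expected_utility, key=expected_utility.get)
print("optimal decision:", best)
```

For example, the entry for the long way with pads is 0.01 × 30 + 0.99 × 75 = 74.55. The largest entry is 83.0, so the optimal decision is to take the short way and wear pads.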


Decision Networks

flat or modular or hierarchical

explicit states or features or individuals and relations

static or finite stage or indefinite stage or infinite stage

fully observable or partially observable

deterministic or stochastic dynamics

goals or complex preferences

single agent or multiple agents

knowledge is given or knowledge is learned

perfect rationality or bounded rationality


Sequential Decisions

An intelligent agent doesn’t carry out a multi-step plan ignoring information it receives between actions.

A more typical scenario is where the agent: observes, acts, observes, acts, ...

Subsequent actions can depend on what is observed. What is observed depends on previous actions.

Often the sole reason for carrying out an action is to provide information for future actions. For example: diagnostic tests, spying.


Sequential decision problems

A sequential decision problem consists of a sequence of decision variables D_1, ..., D_n.

Each D_i has an information set of variables parents(D_i), whose values will be known at the time decision D_i is made.


Decision Networks

A decision network is a graphical representation of a finite sequential decision problem, with 3 types of nodes:

A random variable is drawn as an ellipse. Arcs into the node represent probabilistic dependence.

A decision variable is drawn as a rectangle. Arcs into the node represent information available when the decision is made.

A utility node is drawn as a diamond. Arcs into the node represent variables that the utility depends on.


Umbrella Decision Network

[Figure: decision network with random nodes Weather and Forecast, decision node Umbrella, and a Utility node.]

The agent has to decide whether to take its umbrella.

It observes the forecast.

It doesn’t observe the weather directly.

The forecast is a noisy sensor of the weather.

The utility depends on the weather and whether the agent takes the umbrella.


Decision Network for the Alarm Problem

[Figure: decision network with nodes Tampering, Fire, Alarm, Leaving, Report, Smoke, SeeSmoke, CheckSmoke, Call, and Utility.]


No-forgetting

A no-forgetting decision network is a decision network where:

The decision nodes are totally ordered. This is the order the actions will be taken.

All decision nodes that come before D_i are parents of decision node D_i. Thus the agent remembers its previous actions.

Any parent of a decision node is a parent of subsequent decision nodes. Thus the agent remembers its previous observations.
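The two remembering conditions can be checked mechanically once the decision order and the parent sets are known. The sketch below is a hypothetical helper; the node names come from the alarm network, but the parent sets are assumptions for illustration, since the arcs of that figure are not recoverable from the text.

```python
def is_no_forgetting(decision_order, parents):
    """Check the no-forgetting conditions.

    decision_order: decision node names, in the order the actions are taken.
    parents: dict mapping each decision node to the set of its parents.
    """
    for i, d in enumerate(decision_order):
        for earlier in decision_order[:i]:
            # Earlier decisions must be parents (the agent remembers its actions).
            if earlier not in parents[d]:
                return False
            # Parents of earlier decisions must also be parents
            # (the agent remembers its observations).
            if not parents[earlier] <= parents[d]:
                return False
    return True


# Assumed parent sets: CheckSmoke is decided first, then Call.
remembering = {
    "CheckSmoke": {"Report"},
    "Call": {"Report", "CheckSmoke", "SeeSmoke"},
}
forgetful = {
    "CheckSmoke": {"Report"},
    "Call": {"SeeSmoke"},
}

print(is_no_forgetting(["CheckSmoke", "Call"], remembering))
print(is_no_forgetting(["CheckSmoke", "Call"], forgetful))
```

In the second parent assignment, Call sees neither the earlier decision nor the report it was based on, so the network would not satisfy no-forgetting.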


What should an agent do?

What an agent should do at any time depends on what it will do in the future.

What an agent does in the future depends on what it did before.

Policies

A decision function for decision node Di is a function πi that specifies what the agent does for each assignment of values to the parents of Di. When it observes O, it does πi(O).

A policy is a sequence of decision functions, one for each decision node.
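As a small illustration of these definitions, a decision function can be stored as a lookup table from parent values to actions. This sketch assumes the umbrella network, where the decision Umbrella has the single parent Forecast. The names `take_if_cloudy`, `policy`, and `act` are hypothetical, and the policy is keyed by decision name rather than stored as a sequence purely for readability.

```python
# A decision function maps each assignment of the parents to an action.
# Here Umbrella has the single parent Forecast.
take_if_cloudy = {"sunny": "leave", "cloudy": "take", "rainy": "leave"}

# A policy has one decision function per decision node.
policy = {"Umbrella": take_if_cloudy}


def act(policy, decision, observation):
    """Look up what the policy does for `decision` given what was observed."""
    return policy[decision][observation]


print(act(policy, "Umbrella", "cloudy"))  # take
```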

Umbrella Decision Network

[Figure: decision network with nodes Weather, Forecast, Umbrella, and Utility.]

domain(Forecast) = {sunny, cloudy, rainy}

domain(Umbrella) = {take, leave}

Some policies:

take if cloudy else leave

always take

always leave

There are $2^3 = 8$ policies.
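The count of 8 can be confirmed by listing every policy. This is a minimal sketch that represents a policy as a dictionary from forecast value to action, which is an illustrative choice rather than a representation given in the slides.

```python
from itertools import product

forecasts = ["sunny", "cloudy", "rainy"]
actions = ["take", "leave"]

# One policy per way of choosing an action for every forecast value.
policies = [dict(zip(forecasts, choice))
            for choice in product(actions, repeat=len(forecasts))]

print(len(policies))  # 8
for p in policies:
    print(p)
```

Each of the three forecast values independently gets one of two actions, which is where the exponent 3 comes from.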

Decision Network for the Alarm Problem

[Figure: decision network with nodes Tampering, Fire, Alarm, Leaving, Report, Smoke, SeeSmoke, CheckSmoke, Call, and Utility.]

All variables are Boolean. Some policies:

Never check. Call iff report.

Check iff report. Call iff report and see smoke.

Always check. Always call.

There are $2^2 \times 2^8 = 1024$ policies.
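The figure 1024 can be reproduced by enumeration. This sketch assumes that CheckSmoke has the single parent Report and that Call has the parents Report, CheckSmoke, and SeeSmoke. These parent sets are an assumption consistent with the count on the slide, and the helper name `decision_functions` is hypothetical.

```python
from itertools import product

BOOL = [False, True]


def decision_functions(num_parents):
    """All Boolean-valued functions of `num_parents` Boolean parents, as dicts."""
    contexts = list(product(BOOL, repeat=num_parents))
    return [dict(zip(contexts, outputs))
            for outputs in product(BOOL, repeat=len(contexts))]


check_fns = decision_functions(1)  # assumed parent: Report
call_fns = decision_functions(3)   # assumed parents: Report, CheckSmoke, SeeSmoke
print(len(check_fns), len(call_fns), len(check_fns) * len(call_fns))  # 4 256 1024

# The policy "Check iff report. Call iff report and see smoke."
check_iff_report = {(r,): r for r in BOOL}
call_iff_report_and_smoke = {(r, c, s): r and s
                             for r, c, s in product(BOOL, repeat=3)}
print(check_iff_report in check_fns)          # True
print(call_iff_report_and_smoke in call_fns)  # True
```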

Expected Utility of a Policy

Possible world ω satisfies policy π if ω assigns the value to each decision node that the policy specifies.

The expected utility of policy π is

$$E(u \mid \pi) = \sum_{\omega \text{ satisfies } \pi} u(\omega) \times P(\omega)$$

An optimal policy is one with the highest expected utility.
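The sum can be evaluated directly for a small network. The sketch below uses the umbrella numbers from the "Initial factors for the Umbrella Decision" slide. The dictionary layout and the function name `expected_utility` are illustrative assumptions.

```python
P_weather = {"norain": 0.7, "rain": 0.3}
P_forecast = {  # P(Forecast | Weather)
    "norain": {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "rain": {"sunny": 0.15, "cloudy": 0.25, "rainy": 0.6},
}
utility = {
    ("norain", "take"): 20, ("norain", "leave"): 100,
    ("rain", "take"): 70, ("rain", "leave"): 0,
}


def expected_utility(policy):
    """Sum u(w) * P(w) over the worlds w that satisfy the policy.

    `policy` maps each forecast value to an umbrella action, so a world
    (weather, forecast, umbrella) satisfies it iff umbrella == policy[forecast].
    """
    total = 0.0
    for weather, p_w in P_weather.items():
        for forecast, p_f in P_forecast[weather].items():
            umbrella = policy[forecast]
            total += utility[(weather, umbrella)] * p_w * p_f
    return total


always_take = {"sunny": "take", "cloudy": "take", "rainy": "take"}
always_leave = {"sunny": "leave", "cloudy": "leave", "rainy": "leave"}
print(round(expected_utility(always_take), 2))   # 35.0
print(round(expected_utility(always_leave), 2))  # 70.0
```

Only worlds that agree with the policy contribute, so the loop never needs to enumerate the umbrella value separately.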

Finding an optimal policy

Create a factor for each conditional probability table and a factor for the utility.

Repeat:

- Sum out random variables that are not parents of a decision node.

- Let D be the last decision variable; D is only in a factor f with (some of) its parents.

- Eliminate D by maximizing. This returns:
    - an optimal decision function for D: $\arg\max_D f$
    - a new factor: $\max_D f$

until there are no more decision nodes.

Sum out the remaining random variables. Multiply the factors: this is the expected utility of an optimal policy.

Initial factors for the Umbrella Decision

Weather   Value
norain    0.7
rain      0.3

Weather   Fcast    Value
norain    sunny    0.7
norain    cloudy   0.2
norain    rainy    0.1
rain      sunny    0.15
rain      cloudy   0.25
rain      rainy    0.6

Weather   Umb     Value
norain    take    20
norain    leave   100
rain      take    70
rain      leave   0
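Weather is the only random variable that is not a parent of the decision, so it is summed out first. The sketch below multiplies the three factors above and sums out Weather, leaving a factor on Forecast and Umbrella. The dictionary layout and the name `f` are illustrative assumptions.

```python
P_weather = {"norain": 0.7, "rain": 0.3}
P_forecast = {  # P(Forecast | Weather)
    "norain": {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "rain": {"sunny": 0.15, "cloudy": 0.25, "rainy": 0.6},
}
utility = {
    ("norain", "take"): 20, ("norain", "leave"): 100,
    ("rain", "take"): 70, ("rain", "leave"): 0,
}

# f(Forecast, Umbrella) = sum over Weather of P(W) * P(F | W) * U(W, Umb)
f = {}
for forecast in ["sunny", "cloudy", "rainy"]:
    for umbrella in ["take", "leave"]:
        f[(forecast, umbrella)] = sum(
            P_weather[w] * P_forecast[w][forecast] * utility[(w, umbrella)]
            for w in P_weather
        )

for (forecast, umbrella), value in f.items():
    print(forecast, umbrella, round(value, 2))
```

Rounded to two decimals, the six values are 12.95, 49.0, 8.05, 14.0, 14.0, and 7.0.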

Eliminating By Maximizing

f:

Fcast    Umb     Val
sunny    take    12.95
sunny    leave   49.0
cloudy   take    8.05
cloudy   leave   14.0
rainy    take    14.0
rainy    leave   7.0

$\max_{Umb} f$:

Fcast    Val
sunny    49.0
cloudy   14.0
rainy    14.0

$\arg\max_{Umb} f$:

Fcast    Umb
sunny    leave
cloudy   leave
rainy    take
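The maximization step can be sketched as follows, starting from the factor f in the table above. The function name `maximize_out` is hypothetical, and it assumes a factor keyed by (parent value, action) pairs.

```python
f = {
    ("sunny", "take"): 12.95, ("sunny", "leave"): 49.0,
    ("cloudy", "take"): 8.05, ("cloudy", "leave"): 14.0,
    ("rainy", "take"): 14.0, ("rainy", "leave"): 7.0,
}


def maximize_out(factor):
    """Eliminate the decision (second key component) by maximizing.

    Returns (max_factor, decision_function), both keyed by the parent value.
    """
    max_factor, decision_function = {}, {}
    for (parent, action), value in factor.items():
        if parent not in max_factor or value > max_factor[parent]:
            max_factor[parent] = value
            decision_function[parent] = action
    return max_factor, decision_function


max_f, policy = maximize_out(f)
print(max_f)                # {'sunny': 49.0, 'cloudy': 14.0, 'rainy': 14.0}
print(policy)               # {'sunny': 'leave', 'cloudy': 'leave', 'rainy': 'take'}
print(sum(max_f.values()))  # 77.0
```

Summing out the remaining variable Forecast from the new factor gives 77.0, the expected utility of the optimal policy.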

Exercise

[Figure: decision network with nodes Disease, Symptoms, Test, Test Result, Treatment, Outcome, and Utility.]

What are the factors?

Which random variables get summed out first?

Which decision variable is eliminated? What factor is created?

Then what is eliminated (and how)?

What factors are created after maximization?

Complexity of finding an optimal policy

Decision D has k binary parents and b possible actions:

There are $2^k$ assignments of values to the parents.

There are $b^{2^k}$ different decision functions.

To optimize D, the algorithm does $2^k$ optimizations. The time complexity to optimize D is $O(b \cdot 2^k)$.

If the decision variables are $D_1, \ldots, D_n$ and decision $D_i$ has $k_i$ binary parents and $b_i$ possible actions:

There are $\prod_{i=1}^{n} b_i^{2^{k_i}}$ policies.

Optimizing in the variable elimination algorithm takes $O\left(\sum_{i=1}^{n} b_i \cdot 2^{k_i}\right)$ time.

The dynamic programming algorithm is much more efficient than searching through policy space.
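To make the gap concrete, the two quantities can be computed side by side. This sketch describes each decision by a pair (b_i, k_i). The function names and the parent counts used for the alarm network (one binary parent for CheckSmoke, three for Call) are assumptions for illustration.

```python
from math import prod


def num_policies(decisions):
    """Number of policies: product over decisions of b_i ** (2 ** k_i)."""
    return prod(b ** (2 ** k) for b, k in decisions)


def maximization_cost(decisions):
    """Maximization work in variable elimination: sum of b_i * 2 ** k_i."""
    return sum(b * 2 ** k for b, k in decisions)


# Each decision is (number of actions, number of binary parents).
alarm = [(2, 1), (2, 3)]
print(num_policies(alarm))       # 1024
print(maximization_cost(alarm))  # 20
```

The policy count is doubly exponential in the number of parents, while the maximization cost is singly exponential, which is why eliminating decisions one at a time beats enumerating policies.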

Value of Information

The value of information X for decision D is the utility of the network with an arc from X to D (+ no-forgetting arcs) minus the utility of the network without the arc.

The value of information is always non-negative.

It is positive only if the agent changes its action depending on X.

The value of information provides a bound on how much an agent should be prepared to pay for a sensor. How much is a better weather forecast worth?

We need to be careful when adding an arc would create a cycle. E.g., how much would it be worth knowing whether the fire truck will arrive quickly when deciding whether to call them?
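As a worked illustration, the value of the information Forecast for the decision Umbrella can be computed from the factor f on the "Eliminating By Maximizing" slide. With the arc, the agent picks the best action for each forecast. Without it, one action has to serve for every forecast. The variable names are illustrative assumptions.

```python
f = {
    ("sunny", "take"): 12.95, ("sunny", "leave"): 49.0,
    ("cloudy", "take"): 8.05, ("cloudy", "leave"): 14.0,
    ("rainy", "take"): 14.0, ("rainy", "leave"): 7.0,
}
forecasts = ["sunny", "cloudy", "rainy"]
actions = ["take", "leave"]

# With the arc Forecast -> Umbrella: maximize per forecast, then sum.
eu_with = sum(max(f[(fc, a)] for a in actions) for fc in forecasts)

# Without the arc: sum out Forecast first, then maximize over the action.
eu_without = max(sum(f[(fc, a)] for fc in forecasts) for a in actions)

print(round(eu_with, 2), round(eu_without, 2), round(eu_with - eu_without, 2))
# 77.0 70.0 7.0
```

The difference is positive here because the optimal action differs across forecasts: the agent takes the umbrella only when the forecast is rainy.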

Value of Control

The value of control of a variable X is the value of the network when X is a decision variable (and add no-forgetting arcs) minus the value of the network when X is a random variable.

You need to be explicit about what information is available when you control X.

If you control X without observing, controlling X can be worse than observing X. E.g., controlling a thermometer.

If you keep the parents the same, the value of control is always non-negative.
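A hypothetical illustration, not taken from the slides: suppose the agent could control Weather in the umbrella network. Weather has no parents, so they stay the same, and by no-forgetting the Umbrella decision then also sees the chosen weather. The sketch below compares the two networks using the umbrella factors given earlier.

```python
P_weather = {"norain": 0.7, "rain": 0.3}
P_forecast = {  # P(Forecast | Weather)
    "norain": {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "rain": {"sunny": 0.15, "cloudy": 0.25, "rainy": 0.6},
}
utility = {
    ("norain", "take"): 20, ("norain", "leave"): 100,
    ("rain", "take"): 70, ("rain", "leave"): 0,
}
weathers = ["norain", "rain"]
forecasts = ["sunny", "cloudy", "rainy"]
actions = ["take", "leave"]

# Weather as a random variable: Umbrella observes only Forecast.
eu_random = sum(
    max(sum(P_weather[w] * P_forecast[w][fc] * utility[(w, a)] for w in weathers)
        for a in actions)
    for fc in forecasts
)

# Weather as a decision: Umbrella observes Forecast and the chosen Weather.
eu_control = max(
    sum(P_forecast[w][fc] * max(utility[(w, a)] for a in actions)
        for fc in forecasts)
    for w in weathers
)

print(round(eu_random, 2), round(eu_control, 2), round(eu_control - eu_random, 2))
# 77.0 100.0 23.0
```

Under these assumptions the value of control of Weather is 23, which is non-negative as the last point above predicts.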
