Transcript
Page 1: CHAPTER 15 SECTION 1 – 2 Markov Models. Outline Probabilistic Inference Bayes Rule Markov Chains

CHAPTER 15 SECTION 1 – 2

Markov Models

Page 2:

Outline

Probabilistic Inference
Bayes’ Rule
Markov Chains

Page 3:

Probabilistic Inference

Probabilistic inference: compute a desired probability from other known probabilities (e.g. conditional from joint)

We generally compute conditional probabilities:
P(on time | no reported accidents) = 0.90
These represent the agent’s beliefs given the evidence

Probabilities change with new evidence:
P(on time | no accidents, 5 a.m.) = 0.95
P(on time | no accidents, 5 a.m., raining) = 0.80
Observing new evidence causes beliefs to be updated

Page 4:

Bayes’ Rule

Two ways to factor a joint distribution over two variables:
P(x, y) = P(x | y) P(y) = P(y | x) P(x)

Dividing, we get Bayes’ rule:
P(x | y) = P(y | x) P(x) / P(y)

Why is this at all helpful?
Lets us build a conditional from its reverse
Often one conditional is tricky but the other one is simple
Foundation of many systems we’ll see later

Page 5:

Terminology

Marginal Probability: P(x), the probability of one variable on its own, obtained by summing out the others
Joint Probability: P(x, y), the probability of an assignment to several variables at once
Conditional Probability: P(x | y) = P(x, y) / P(y), the probability of x given that y is observed

Page 6:

Inference by enumeration

P(sun)?

Page 7:

Inference by enumeration

P(sun | winter)? P(sun | winter, hot)?

Page 8:

Inference by enumeration

General case:
Evidence variables: E1 ... Ek = e1 ... ek
Query variable: Q
Hidden variables: H1 ... Hr

We want: P(Q | e1 ... ek)
First, select the entries consistent with the evidence
Second, sum out H to get the joint of the query and the evidence:
P(Q, e1 ... ek) = Σ over h1 ... hr of P(Q, h1 ... hr, e1 ... ek)
Finally, normalize the remaining entries to conditionalize:
P(Q | e1 ... ek) = P(Q, e1 ... ek) / P(e1 ... ek)

Page 9:

The product rule

Sometimes we have conditional distributions but want the joint:
P(x, y) = P(x | y) P(y)


Page 11:

The chain rule

More generally, we can always write any joint distribution as an incremental product of conditional distributions:
P(x1, x2, ..., xn) = P(x1) P(x2 | x1) P(x3 | x1, x2) ... = Π over i of P(xi | x1, ..., x(i−1))
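The chain-rule identity can be checked numerically on any joint table: each conditional is just a ratio of marginals, and the product telescopes back to the joint. A minimal sketch; the three-coin joint below is an illustrative assumption.

```python
# Chain rule sanity check: any joint factors as an incremental product
# of conditionals. The joint over three fair coins is an illustrative assumption.
from itertools import product

# P(X1, X2, X3) for binary variables, stored as a dict.
joint = {bits: 1 / 8 for bits in product([0, 1], repeat=3)}

def marginal(assignment):
    """P(X1..Xk = assignment), summing out the remaining variables."""
    k = len(assignment)
    return sum(p for bits, p in joint.items() if bits[:k] == assignment)

for bits in joint:
    # P(x1) * P(x2 | x1) * P(x3 | x1, x2), each conditional as a ratio of marginals
    chain = (marginal(bits[:1])
             * marginal(bits[:2]) / marginal(bits[:1])
             * marginal(bits[:3]) / marginal(bits[:2]))
    assert abs(chain - joint[bits]) < 1e-12
print("chain rule verified")
```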

Page 12:

Bayes’ Rule

Two ways to factor a joint distribution over two variables:
P(x, y) = P(x | y) P(y) = P(y | x) P(x)

Dividing, we get Bayes’ rule:
P(x | y) = P(y | x) P(x) / P(y)

Why is this at all helpful?
Lets us build a conditional from its reverse
Often one conditional is tricky but the other one is simple
Foundation of many systems we’ll see later

In the running for most important AI equation!

Page 13:

Inference with Bayes’ Rule

Example: Diagnostic probability from causal probability:
P(cause | effect) = P(effect | cause) P(cause) / P(effect)

Example: m is meningitis, s is stiff neck

What is the probability that you have meningitis given that you have a stiff neck?
P(m | s) = P(s | m) P(m) / P(s) = 0.0008

Note: the posterior probability of meningitis is still very small
Note: you should still get stiff necks checked out! Why?
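The computation is one application of Bayes' rule. A minimal sketch: the slide's input numbers are not in the transcript, so the values below are assumptions, chosen only so that they reproduce the slide's stated answer of 0.0008.

```python
# Bayes' rule: diagnostic probability from causal probability.
# Input values are assumptions consistent with the slide's answer of 0.0008.
p_s_given_m = 0.8     # P(s | m): causal direction, easy to estimate
p_m = 0.0001          # P(m): prior probability of meningitis
p_s = 0.1             # P(s): overall probability of a stiff neck

p_m_given_s = p_s_given_m * p_m / p_s   # Bayes' rule
print(p_m_given_s)  # ≈ 0.0008
```

Note the pattern: even with a strong causal link P(s | m) = 0.8, the tiny prior P(m) keeps the posterior very small.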

Page 14:

Reasoning over Time or Space

Often, we want to reason about a sequence of observations:
Speech recognition
Robot localization
User attention
Medical monitoring

Need to introduce time (or space) into our models

Page 15:

Markov Models (Markov Chains)

A Markov model is:
a Markov Decision Process (MDP) with no actions (and no rewards)
a chain-structured Bayesian Network (BN)

A Markov model includes:
Random variables Xt for all time steps t (the state)
Parameters, called transition probabilities or dynamics, that specify how the state evolves over time (also, initial state probabilities)

Page 16:

Markov Models (Markov Chains)

A Markov model defines a joint probability distribution:
P(X1, X2, ..., XT)

One common inference problem:
Compute the marginals P(Xt) for all time steps t

Page 17:

Conditional independence

Basic conditional independence:
Past and future are independent given the present
Each time step only depends on the previous one
This is called the (first order) Markov property

Note that the chain is just a (growable) BN: We can always use generic BN reasoning on it if we truncate the chain at a fixed length

Page 18:

Example: Markov Chain

Weather:
States: X = {rain, sun}
Transitions: (transition diagram on the slide, not captured in the transcript)
Initial distribution: 1.0 sun

What’s the probability distribution after one step?

Page 19:

Markov Chain Inference

Question: what is the probability of being in state x at time t?

Slow answer:
Enumerate all sequences of length t which end in x
Add up their probabilities

Page 20:

Joint distribution of a Markov Model

Joint distribution:
P(X1, X2, X3, X4) = P(X1) P(X2 | X1) P(X3 | X2) P(X4 | X3)

More generally:
P(X1, X2, ..., XT) = P(X1) P(X2 | X1) P(X3 | X2) ... P(XT | XT−1) = P(X1) Π over t = 2..T of P(Xt | Xt−1)
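The product above makes the joint probability of any particular state sequence a one-line computation. A minimal sketch for the weather chain; the transition numbers (0.9/0.1 from sun, 0.3/0.7 from rain) are illustrative assumptions, since the slide's table is not in the transcript.

```python
# Joint probability of a state sequence under a Markov chain:
# P(x1 .. xT) = P(x1) * product over t of P(xt | x(t-1)).
# Initial and transition probabilities are illustrative assumptions.
init = {"sun": 1.0, "rain": 0.0}
trans = {"sun":  {"sun": 0.9, "rain": 0.1},
         "rain": {"sun": 0.3, "rain": 0.7}}

def joint_prob(seq):
    """Multiply the initial probability by one transition factor per step."""
    p = init[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= trans[prev][cur]
    return p

print(joint_prob(["sun", "sun", "rain", "rain"]))  # 1.0 * 0.9 * 0.1 * 0.7
```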

Page 21:

Markov Models Recap

Consequently, the joint distribution can be written as:
P(X1, X2, ..., XT) = P(X1) Π over t = 2..T of P(Xt | Xt−1)

Implied conditional independencies:
Past is independent of future given the present

Additional explicit assumption:
P(Xt | Xt−1) is the same for all t (the dynamics do not change over time)

Page 22:

Mini-Forward Algorithm

Question: What’s P(X) on some day t?
We don’t need to enumerate every sequence! Push the distribution forward one step at a time:
P(xt) = Σ over x(t−1) of P(xt | x(t−1)) P(x(t−1))
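Rather than summing over all sequences, the mini-forward algorithm repeatedly pushes the state distribution through the transition model. A minimal sketch; the transition probabilities are the same illustrative assumptions as above, not values from the slides.

```python
# Mini-forward algorithm: compute P(X_t) by the recursion
# P(x_t) = sum over x_(t-1) of P(x_t | x_(t-1)) * P(x_(t-1)),
# avoiding enumeration over all length-t sequences.
# Transition probabilities are illustrative assumptions.
trans = {"sun":  {"sun": 0.9, "rain": 0.1},
         "rain": {"sun": 0.3, "rain": 0.7}}

def forward(init, t):
    """Distribution over states after t steps, starting from `init`."""
    dist = dict(init)
    for _ in range(t):
        dist = {x: sum(trans[prev][x] * p for prev, p in dist.items())
                for x in trans}
    return dist

print(forward({"sun": 1.0, "rain": 0.0}, 1))   # one step from all-sun
print(forward({"sun": 0.0, "rain": 1.0}, 50))  # long run from all-rain
```

Running it from different initial distributions shows the behavior the next slides discuss: after many steps the answers converge to the same distribution.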

Page 23:

Example Run of Mini-Forward Algorithm

From initial observations of sun:

From initial observations of rain:

Page 24:

Example Run of Mini-Forward Algorithm

From yet another initial distribution P(X1):

Page 25:

Stationary Distributions

For most chains:
The influence of the initial distribution gets less and less over time
The distribution we end up in is independent of the initial distribution

Stationary distribution:
The distribution we end up with is called the stationary distribution P∞ of the chain
It satisfies: P∞(X) = Σ over x of P(X | x) P∞(x)
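The stationary distribution is the fixed point of the forward update, so one way to find it is simply to iterate that update until it stops changing. A minimal sketch, again using the assumed transition numbers rather than values from the slides.

```python
# Stationary distribution as the fixed point of
# pi(x) = sum over x' of P(x | x') * pi(x'),
# found by iterating the update until it converges.
# Transition probabilities are illustrative assumptions.
trans = {"sun":  {"sun": 0.9, "rain": 0.1},
         "rain": {"sun": 0.3, "rain": 0.7}}

pi = {"sun": 0.5, "rain": 0.5}   # any starting distribution works
while True:
    new = {x: sum(trans[prev][x] * p for prev, p in pi.items()) for x in trans}
    if max(abs(new[x] - pi[x]) for x in trans) < 1e-12:
        break                     # fixed point reached (to tolerance)
    pi = new

print(pi)  # for these transitions, pi(sun) = 0.75, pi(rain) = 0.25
```

For a two-state chain you can check the answer by hand: the fixed-point equation 0.1 · π(sun) = 0.3 · π(rain) together with normalization gives π(sun) = 0.75.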

Page 26:

Example: Stationary Distributions

Question: What’s P(X) at time t = infinity?

Page 27:

Application of Stationary Distribution: Web Link Analysis

PageRank over a web graph:
Each web page is a state
Initial distribution: uniform over pages
Transitions:
With prob. c, uniform jump to a random page (dotted lines, not all shown)
With prob. 1 − c, follow a random outlink (solid lines)

Stationary distribution:
Will spend more time on highly reachable pages
E.g. many ways to get to the Acrobat Reader download page
Somewhat robust to link spam
Google 1.0 returned the set of pages containing all your keywords in decreasing rank; now all search engines use link analysis along with many other factors (rank is actually getting less important over time)
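PageRank is exactly the stationary distribution of this random-surfer chain, which can be computed by the same iterate-until-convergence idea. A minimal sketch; the three-page link graph and the jump probability c = 0.15 are illustrative assumptions.

```python
# PageRank sketch: stationary distribution of a random surfer who, with
# probability c, jumps to a uniform random page and otherwise follows a
# random outlink. The tiny link graph is an illustrative assumption.
c = 0.15
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
pages = list(links)
n = len(pages)

rank = {p: 1.0 / n for p in pages}     # initial distribution: uniform over pages
for _ in range(100):                   # iterate the transition update
    new = {p: c / n for p in pages}    # random-jump mass, spread uniformly
    for p, outs in links.items():
        for q in outs:                 # follow a random outlink from p
            new[q] += (1 - c) * rank[p] / len(outs)
    rank = new

print(rank)  # page C, reachable from both A and B, ends up ranked highest
```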

Page 28:

References

CSE473: Introduction to Artificial Intelligence http://courses.cs.washington.edu/courses/cse473/

