


Today’s Topics

• Graded HW1 in Moodle (Testbeds used for grading are linked to class home page)

• HW2 due (but can still use 5 late days) at 11:55pm tonight

• HW3 out tomorrow but not due until THURSDAY Nov 5 (no later than 11/12)

• Midterm next THURSDAY (10/22); Reviews 10/20 (me) & 10/21 (Dmitry)

• Basics of Probability

– We’re covering probabilistic reasoning next so that the final programming HW can be undertaken right after the midterm

– Random Events and Random Variables

– Probability Distributions

– Conditional Probabilities

– Some Rules of Probability

– Independence of Random Events

• Joint Probability Distributions (our focus)

10/6/15 CS 540 - Fall 2015 (Shavlik©), Lecture 14, Week 6

Do Problem 1, HW3 before the midterm

(Lecture 14 is end of midterm material)


Random Events

• Prob(event) or P(event)

– The probability that the event will happen

– Eg, prob(fair coin comes up heads) = 0.5

– A random variable represents a random event

• Probability Distributions

– Assume an event has k possible disjoint outcomes

– Assign a prob to each outcome; ∑ᵢ p(outcomeᵢ) = 1

– In cs540, we’ll ignore real-valued outcomes (and probability density functions), so we’ll SUM rather than INTEGRATE


Sample Probability Distribution

[Figure: bar chart with Probability on the vertical axis (up to 1.0) and Color on the horizontal axis (RED, GREEN, BLUE)]

Values of COLOR assumed disjoint and complete (ie, all objects have one and only one color)
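A minimal Python sketch of such a distribution. The three probabilities are made-up values for illustration (the bar heights in the chart are not recoverable from this transcript), and the helper name is hypothetical; the check simply encodes "each prob is in [0, 1] and the probs sum to 1".

```python
import math

# Hypothetical probabilities for illustration; NOT taken from the slide's chart.
color_dist = {"RED": 0.5, "GREEN": 0.3, "BLUE": 0.2}

def is_valid_distribution(dist):
    """True if every prob lies in [0, 1] and the probs sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    return in_range and math.isclose(sum(dist.values()), 1.0)

print(is_valid_distribution(color_dist))
```

Dropping one color from the dictionary makes the check fail, which is the "complete" half of "disjoint and complete".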


What if Values of a Random Variable are NOT Disjoint and Complete?

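Assume some var has values x, y, and z, but these are neither disjoint nor complete

Create new values, one for each of the 2^3 = 8 ways that x, y, and z can each hold or not hold

a = x ˄ y ˄ z

b = x ˄ y ˄ ¬z

c = x ˄ ¬y ˄ z

d = x ˄ ¬y ˄ ¬z

e = ¬x ˄ y ˄ z

f = ¬x ˄ y ˄ ¬z

g = ¬x ˄ ¬y ˄ z

h = ¬x ˄ ¬y ˄ ¬z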
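The construction can be sketched in Python as below. The function name and the assignment of the labels a through h in enumeration order are assumptions made for illustration; the only point is that each original value is either asserted or negated in every new value, so the new values are disjoint and complete.

```python
from itertools import product

names = ["x", "y", "z"]

def disjoint_values(names):
    """One conjunction per true/false assignment, e.g. 'x ˄ ¬y ˄ z'."""
    values = []
    for signs in product([True, False], repeat=len(names)):
        literals = [n if s else "¬" + n for n, s in zip(names, signs)]
        values.append(" ˄ ".join(literals))
    return values

# Label the 2^3 = 8 new values a..h (labeling order is an assumption).
for label, value in zip("abcdefgh", disjoint_values(names)):
    print(label, "=", value)
```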


Conditional Prob’s

Prob(A | B) - the probability that event A happens given that event B occurred

Eg,

prob(you take bus to work) = 0.1

prob(you take bus to work | it is raining) = 0.9

NOTE: P(A ˄ B | C ˄ B) is shorthand for P( (A ˄ B) | (C ˄ B) ), ie, the ‘|’ binds more loosely than the logical connectives


Some Rules of Prob

1) 0 ≤ prob(A) ≤ 1

2) prob(false) ≡ 0 and prob(true) ≡ 1

3) prob(A ˅ B) = prob(A) + prob(B) - prob(A ˄ B)

4) prob(A ˄ B) = prob(A | B) × prob(B)

= prob(B | A) × prob(A)

5) prob(¬A) = 1 – prob(A)

[Figure: Venn diagram of overlapping events A and B]


Prob(A | B) = Prob(A ˄ B) / Prob(B)

[Figure: Venn diagram of overlapping events A and B]

Given we’re in B, what fraction is also in A?

Prob(A) can be small, say 0.10, and Prob(B) can be smaller, say 0.05, yet Prob(A | B) can be large, say 0.75
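The slide's three numbers can be pushed through the rules of prob in a few lines of Python. The variable names are illustrative; the values 0.10, 0.05, and 0.75 are the ones on the slide, and everything else is derived from them.

```python
p_a = 0.10          # Prob(A), from the slide
p_b = 0.05          # Prob(B), from the slide
p_a_given_b = 0.75  # Prob(A | B), from the slide

p_a_and_b = p_a_given_b * p_b      # P(A ˄ B) = P(A | B) × P(B)
p_b_given_a = p_a_and_b / p_a      # P(B | A) = P(A ˄ B) / P(A)
p_a_or_b = p_a + p_b - p_a_and_b   # P(A ˅ B) = P(A) + P(B) - P(A ˄ B)

print(p_a_and_b, p_b_given_a, p_a_or_b)
```

The overlap comes out to roughly 0.0375, which is no larger than either Prob(A) or Prob(B), so the three numbers are consistent with one another.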


Notational Complexity

• We really should state both variables and their values

P(A = valueA | B = valueB)

Eg, P(color = yellow | size = big)

• When we mean ‘whatever value A and B have’ we should say something like

P(A = ? | B = ?)

• However, this gets complicated, so for clarity we’ll often just write P(A | B) in place of P(A = ? | B = ?)

• In a SPECIFIC calculation with Boolean-valued vars, P(A | B) means P(A=true | B=true)


Useful Equation for Manipulating Conditional Probs

P(A ˄ B | C) = P(A | B ˄ C) × P(B | C)

Derivation

1) P(A | B ˄ C) ≡ P(A ˄ B ˄ C) / P(B ˄ C) // Using P(X ˄ Y) ≡ P(X | Y) × P(Y)

2) P(B | C) ≡ P(B ˄ C) / P(C)

3) P(A | B ˄ C) × P(B | C) = P(A ˄ B ˄ C) / P(C) // Combine 1 and 2

4) P(A ˄ B ˄ C) / P(C) ≡ P(A ˄ B | C)

5) P(A ˄ B | C) = P(A | B ˄ C) × P(B | C) // Combine 3 and 4
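A numeric sanity check of the equation, sketched in Python. The joint table here is randomly generated (an assumption for illustration, not course data), which is the point: the identity P(A ˄ B | C) = P(A | B ˄ C) × P(B | C) should hold for any joint table with non-zero cells. The helper names `prob` and `cond` are hypothetical.

```python
import random
from itertools import product

random.seed(0)  # arbitrary seed; any strictly positive joint table works
weights = [random.random() + 0.01 for _ in range(8)]
total = sum(weights)
# Keys are (A, B, C) truth values; values are normalized to sum to 1.
joint = {w: x / total for w, x in zip(product([False, True], repeat=3), weights)}

def prob(event):
    """Sum the cells where event(a, b, c) is true."""
    return sum(p for world, p in joint.items() if event(*world))

def cond(event, given):
    """P(event | given) = P(event ˄ given) / P(given)."""
    return prob(lambda a, b, c: event(a, b, c) and given(a, b, c)) / prob(given)

lhs = cond(lambda a, b, c: a and b, lambda a, b, c: c)        # P(A ˄ B | C)
rhs = (cond(lambda a, b, c: a, lambda a, b, c: b and c)       # P(A | B ˄ C)
       * cond(lambda a, b, c: b, lambda a, b, c: c))          # × P(B | C)
print(lhs, rhs)
```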


Independence

If A and B are independent events, then

prob(A | B) = prob(A) // Knowing B happened tells us nothing about A

and

prob(B | A) = prob(B) // Ditto

So if independence holds (or is assumed),

prob(A ˄ B) = prob(A) × prob(B)

since prob(A ˄ B) = prob(A | B) × prob(B)


I have two coins. If I flip them, what is the prob that one comes up heads and the other tails?

I never promised you independent coins!
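A small Python sketch of why the answer depends on the joint distribution. Both tables below are illustrative assumptions: in each one, every coin is individually fair, but only the first table treats the coins as independent. The second models two coins that always land the same way.

```python
# Joint tables over (coin1, coin2); both are made-up examples.
independent = {("H", "H"): 0.25, ("H", "T"): 0.25, ("T", "H"): 0.25, ("T", "T"): 0.25}
glued = {("H", "H"): 0.5, ("H", "T"): 0.0, ("T", "H"): 0.0, ("T", "T"): 0.5}

def prob_one_head_one_tail(joint):
    """Sum the cells where the two coins show different faces."""
    return sum(p for (c1, c2), p in joint.items() if c1 != c2)

def prob_first_is_heads(joint):
    """Marginal prob that coin 1 shows heads."""
    return sum(p for (c1, _), p in joint.items() if c1 == "H")

print(prob_one_head_one_tail(independent), prob_one_head_one_tail(glued))
```

With independent fair coins the answer is 0.5; with the "glued" coins it is 0, even though each coin still comes up heads half the time.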


Conditional Independence

• If A and B are independent events given C, then

prob(A ˄ B | C) = prob(A | C) × prob(B | C)

We say “A and B are conditionally independent given C”

• A variant: “A is independent of B given C”

prob(A | B ˄ C) = prob(A | C)

// Assuming we know the value of C, knowing B happened tells us nothing more about A
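To make the distinction concrete, here is a Python sketch that builds a joint table in which A and B are conditionally independent given C by construction, then checks that they are not independent once C is ignored. All numbers and helper names are hypothetical and chosen only for illustration.

```python
import math
from itertools import product

# Hypothetical numbers, chosen only for illustration.
p_c = 0.3                              # P(C=true)
p_a_given = {True: 0.9, False: 0.2}    # P(A=true | C=c)
p_b_given = {True: 0.8, False: 0.1}    # P(B=true | C=c)

def bern(p_true, value):
    """Prob of a Boolean value when P(true) = p_true."""
    return p_true if value else 1.0 - p_true

# Build P(A, B, C) = P(C) × P(A | C) × P(B | C), keyed by (A, B, C).
joint = {
    (a, b, c): bern(p_c, c) * bern(p_a_given[c], a) * bern(p_b_given[c], b)
    for a, b, c in product([False, True], repeat=3)
}

def prob(event):
    return sum(p for world, p in joint.items() if event(*world))

p_c_true = prob(lambda a, b, c: c)
p_ab_given_c = prob(lambda a, b, c: a and b and c) / p_c_true
p_a_given_c = prob(lambda a, b, c: a and c) / p_c_true
p_b_given_c = prob(lambda a, b, c: b and c) / p_c_true

cond_indep = math.isclose(p_ab_given_c, p_a_given_c * p_b_given_c)
marg_indep = math.isclose(prob(lambda a, b, c: a and b),
                          prob(lambda a, b, c: a) * prob(lambda a, b, c: b))
print(cond_indep, marg_indep)
```

In this table, knowing B does tell us something about A until C is known, so conditional independence does not imply plain independence.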


Joint Prob Distributions (probs for compound events)

• Assume we have three Boolean-valued random variables; so 2^3 = 8 possible combo’s

Combination                   Prob of this Combination
A=false, B=false, C=false     0.50
A=false, B=false, C=true      0.20
A=false, B=true,  C=false     0.15
A=false, B=true,  C=true      0.01
A=true,  B=false, C=false     0.02
A=true,  B=false, C=true      0.03
A=true,  B=true,  C=false     0.04
A=true,  B=true,  C=true      ? [hint: probs sum to 1]
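Following the hint, the missing entry is whatever is left after the seven known cells are subtracted from 1. A short Python sketch (the variable names are illustrative):

```python
# The seven known cells, in the order listed in the table.
known = [0.50, 0.20, 0.15, 0.01, 0.02, 0.03, 0.04]

# Probs over disjoint, complete combinations must sum to 1.
missing = 1.0 - sum(known)
print(round(missing, 2))
```

The known cells total 0.95, so the A=true, B=true, C=true cell is 0.05.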


Visually – There are 8 Distinct Regions

[Figure: Venn diagram of three overlapping circles A, B, and C, which divide the space into 8 regions]


Calculating Prob of Any Expression

Given any (arbitrarily complex) logical expression, we can calculate its probability by summing the probs in the cells of the joint prob table where the expression is true

This is our key calculation! The bulk of this topic in cs540 will address making this calculation feasible because these tables get too big
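A minimal Python sketch of this key calculation, using the three-variable table from the Joint Prob Distributions slide (with the last cell taken as 0.05, inferred from the hint that the probs sum to 1). The names `VARS`, `JOINT`, and `prob_of` are hypothetical; the expression is passed as a function over a full world state.

```python
from itertools import product

VARS = ("A", "B", "C")
# Cell probs in table order: (A, B, C) from false,false,false to true,true,true.
# The last cell (0.05) is inferred from "probs sum to 1".
PROBS = [0.50, 0.20, 0.15, 0.01, 0.02, 0.03, 0.04, 0.05]
JOINT = list(zip(product([False, True], repeat=len(VARS)), PROBS))

def prob_of(expression):
    """Sum the probs of the cells (world states) where expression is true.

    expression takes a dict such as {"A": True, "B": False, "C": True}.
    """
    return sum(p for values, p in JOINT if expression(dict(zip(VARS, values))))

# Example: P(A ˄ B) adds the two cells with A=true and B=true (0.04 + 0.05).
print(prob_of(lambda w: w["A"] and w["B"]))
```

Because the loop visits every cell, the cost grows with the size of the table, which is why large tables make this calculation infeasible.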


Examples

1) P(A) = ?

2) P(A ˄ ¬C) = ?

3) P(A ˅ B) = ?

4) P(¬C) = ?

5) P(B ˅ (A ˄ C)) = ?

Here, vars = true unless NOT sign present

Combination                   Prob of this Combo
A=false, B=false, C=false     0.50
A=false, B=false, C=true      0.20
A=false, B=true,  C=false     0.15
A=false, B=true,  C=true      0.01
A=true,  B=false, C=false     0.02
A=true,  B=false, C=true      0.03
A=true,  B=true,  C=false     0.04
A=true,  B=true,  C=true      0.05

Answers

1) 0.14

2) 0.06

3) 0.30

4) 0.71

5) 0.28
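The arithmetic behind the five answers, written out cell by cell. The logical operators in the example expressions were lost in this transcript; the readings used here (¬C in examples 2 and 4, ˅ in example 3, and B ˅ (A ˄ C) in example 5) are inferred as the ones that reproduce the stated answers from the table.

```latex
\begin{align*}
P(A) &= 0.02 + 0.03 + 0.04 + 0.05 = 0.14 \\
P(A \wedge \neg C) &= 0.02 + 0.04 = 0.06 \\
P(A \vee B) &= 0.15 + 0.01 + 0.02 + 0.03 + 0.04 + 0.05 = 0.30 \\
P(\neg C) &= 0.50 + 0.15 + 0.02 + 0.04 = 0.71 \\
P(B \vee (A \wedge C)) &= 0.15 + 0.01 + 0.03 + 0.04 + 0.05 = 0.28
\end{align*}
```

Each line sums exactly the table cells in which the expression is true.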


Then vs. Now

• In 1980s: from where do the probs come?

• Today: we have tons of data!

• We can (conceptually) fill a huge joint prob table simply by counting, then can answer any possible question about the ‘random variables’ in our table!
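Filling a joint table by counting can be sketched in Python as follows. The eight observations are made-up data for illustration, and the sketch does no smoothing, so any combination that never appears in the data gets probability 0.

```python
from collections import Counter

# Hypothetical observations of three Boolean vars (A, B, C); made-up data.
observations = [
    (False, False, False), (False, False, False), (False, False, True),
    (False, True, False), (True, True, True), (False, False, False),
    (True, False, True), (False, False, True),
]

# Count each complete combination, then divide by the number of examples.
counts = Counter(observations)
joint = {world: n / len(observations) for world, n in counts.items()}

def prob(event):
    """Sum the estimated probs of the observed combinations where event holds."""
    return sum(p for world, p in joint.items() if event(*world))

print(prob(lambda a, b, c: c))  # estimated P(C)
```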


Evidence, Hidden, and Query Vars

• In general, we ask about some QUERY variables CONDITIONED on some EVIDENCE variables, but don’t mention some HIDDEN variables

Prob(A ˄ C ˄ E | S ˄ P ˄ Y) = ?

Evidence vars: S, P and Y

Hidden vars: B, D, F, G, …, N, O, Q, R, T, U, … X, & Z

Query vars: A, C, and E


10/13/15 CS 540 - Fall 2015 (Shavlik©), Lecture 15, Week 6

More Experience with Joint Prob Tables

• Recall that a Joint Probability Table specifies the probability of each (discrete) “complete world state”

• A “complete world state” specifies the value of each random variable used to represent the world we’re modeling

• If N variables each with M possible values, there are M^N different states of the world

one Petabyte = 10^15 bytes

if N = 50 and M = 2, then M^N ≈ 10^15
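The size comparison is easy to reproduce in Python. The function name is hypothetical; the values N = 50, M = 2, and one petabyte = 10^15 bytes are from the slide.

```python
def num_world_states(num_vars, values_per_var):
    """Number of cells in a full joint table: M ** N."""
    return values_per_var ** num_vars

PETABYTE = 10 ** 15  # bytes, as on the slide

cells = num_world_states(num_vars=50, values_per_var=2)
print(cells, cells / PETABYTE)
```

So 50 Boolean variables already need a bit more than 10^15 cells, even at one byte per cell.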


Marginalizing (“Summing Out”)

A method to answer questions involving partial world states

P(Y) = ∑_Z P(Y, Z) // Eq 13.6

The comma means AND

Z ranges over all possible ‘conjunctive’ settings for the vars not set by Y

Ex: Assume we have vars A, B, C, and D, and Y = A ˄ ¬C

Then Z ∈ { B ˄ D, B ˄ ¬D, ¬B ˄ D, ¬B ˄ ¬D }

We did this earlier when we added up all cells where Y was true
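A Python sketch of summing out for the example Y = A ˄ ¬C over vars A, B, C, and D. No four-variable table appears in these slides, so the joint table below is a placeholder in which every world state is equally likely (an assumption); the function name `marginal` is also hypothetical.

```python
from itertools import product

VARS = ("A", "B", "C", "D")
# Placeholder joint table: all 16 world states equally likely (an assumption).
joint = {world: 1 / 16 for world in product([False, True], repeat=4)}

def marginal(y):
    """P(Y) = sum over Z of P(Y, Z); y fixes some vars, e.g. {"A": True, "C": False}."""
    hidden = [v for v in VARS if v not in y]            # vars not set by Y
    total = 0.0
    for z_values in product([False, True], repeat=len(hidden)):
        setting = {**y, **dict(zip(hidden, z_values))}  # Y ˄ Z is a full world state
        total += joint[tuple(setting[v] for v in VARS)]
    return total

# Y = A ˄ ¬C, so Z ranges over the four settings of B and D.
print(marginal({"A": True, "C": False}))
```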


Conditional Forms

P(Y) = ∑_Z P(Y | Z) × P(Z) // Eq 13.8, called ‘conditioning’

- uses P(A ˄ B) = P(A | B) P(B)

- a bit like a weighted sum

P(Y | X) ≡ P(Y ˄ X) / P(X)

= [ ∑_Z1 P(Y ˄ X ˄ Z1) ] / [ ∑_Z2 P(X ˄ Z2) ]

- Z1: sum over all vars not set by Y or X

- Z2: sum over all vars not set by X
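The second form, a conditional prob computed entirely from full world states, can be sketched in Python with the three-variable table from the earlier Examples slide. The helper names are hypothetical; the numerator sums over the vars not set by Y or X and the denominator over the vars not set by X.

```python
from itertools import product

VARS = ("A", "B", "C")
# Table from the Examples slide, keyed by (A, B, C) truth values.
PROBS = [0.50, 0.20, 0.15, 0.01, 0.02, 0.03, 0.04, 0.05]
joint = dict(zip(product([False, True], repeat=3), PROBS))

def marginal(fixed):
    """Sum the cells consistent with the partial assignment `fixed`."""
    return sum(p for world, p in joint.items()
               if all(world[VARS.index(v)] == val for v, val in fixed.items()))

def conditional(y, x):
    """P(Y | X) = P(Y ˄ X) / P(X), both computed by summing out the other vars."""
    return marginal({**y, **x}) / marginal(x)

# P(B | A) = (0.04 + 0.05) / (0.02 + 0.03 + 0.04 + 0.05)
print(conditional({"B": True}, {"A": True}))
```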


A ‘Weighted Sum’ Example

P(takeBus)

= P(takeBus | weather=sunny) × P(weather=sunny)

+ P(takeBus | weather=cloudy) × P(weather=cloudy)

+ P(takeBus | weather=rainy) × P(weather=rainy)

+ P(takeBus | weather=snowy) × P(weather=snowy)

The weather values are assumed disjoint and complete
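With numbers plugged in, the weighted sum looks like this in Python. Every probability below is a hypothetical value chosen for illustration; none comes from the slides.

```python
# Hypothetical numbers for illustration only.
p_weather = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.15, "snowy": 0.05}
p_bus_given_weather = {"sunny": 0.05, "cloudy": 0.1, "rainy": 0.9, "snowy": 0.8}

# P(takeBus) = sum over weather values w of P(takeBus | w) × P(w)
p_take_bus = sum(p_bus_given_weather[w] * p_weather[w] for w in p_weather)
print(p_take_bus)
```

The weights P(w) must sum to 1, which is exactly the "disjoint and complete" assumption on the weather values.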


Some Worked Examples

Assume we have a joint prob table for A, …, E

Express the following using table entries (ie, full world states)

P(B ˄ D ˄ E) =

P(B | D ˄ E) =

P(A ˅ C) =

Basic idea

1) create probs that only involve AND and NOT

2) “AND in” the remaining vars in all possible (conjunctive) ways

3) look up fully specified ‘world states’

4) do the arithmetic

Here, vars = true unless NOT sign present


Solutions for Prev Slide

P(B ˄ D ˄ E) = P(B ˄ D ˄ E ˄ A ˄ C) + P(B ˄ D ˄ E ˄ A ˄ ¬ C) +

P(B ˄ D ˄ E ˄ ¬ A ˄ C) + P(B ˄ D ˄ E ˄ ¬ A ˄ ¬ C)

P(B | D ˄ E) = P(B ˄ D ˄ E) / P(D ˄ E) = …

- numerator done in first question

- denominator involves EIGHT terms

P(A ˅ C) = P(A) + P(C) - P(A ˄ C) = …

- first TWO terms each involve summing SIXTEEN terms

- last term involves EIGHT
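The term counts above can be checked by enumerating the full world states that each partial assignment expands into. A Python sketch, with a hypothetical helper name and no probabilities needed:

```python
from itertools import product

VARS = ("A", "B", "C", "D", "E")

def world_states(fixed):
    """All full world states (as dicts) consistent with the partial assignment `fixed`."""
    free = [v for v in VARS if v not in fixed]
    return [{**fixed, **dict(zip(free, vals))}
            for vals in product([True, False], repeat=len(free))]

print(len(world_states({"B": True, "D": True, "E": True})))  # terms for P(B ˄ D ˄ E)
print(len(world_states({"D": True, "E": True})))             # terms for P(D ˄ E)
print(len(world_states({"A": True})))                        # terms for P(A), same for P(C)
print(len(world_states({"A": True, "C": True})))             # terms for P(A ˄ C)
```

Each var left unset doubles the number of table entries to look up, giving 4, 8, 16, and 8 terms respectively.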



Next

• What if too few examples to sufficiently populate all the cells in the joint table?

• What if joint prob table too large for memory?

• Bayesian Networks (Bayes Nets for short) provide a popular/successful answer
