Page 1: Naïve Bayes classifier - Biola University

Naïve Bayes classifier

A small weather data set of past records of

(i) weather conditions and

(ii) whether a certain event happened (i.e. whether a certain activity was “played”)

A new case for prediction:

Play=?

E: The evidence (observations) we have:

E1 = , E2 = , E3 = , E4 =

E = (E1, E2, E3, E4) =

H: whether the event happens:

Two possible predictions (hypotheses):

H = “Play=yes” or

H = “Play=no”

The basic probabilistic approach:

Compare two conditional probabilities

Pr( H = “Play=yes” | E = ) versus

Pr( H = “Play=no” | E = ) .

It is hard to directly estimate these two probabilities from the data set.

(Why?)
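One way to see the difficulty is to count how many distinct evidence combinations there could be. The sketch below assumes the attribute values of the standard 14-record weather data set (three outlooks, three temperatures, two humidity levels, two wind settings). The slides' table is not reproduced in this transcript, so these value lists are an assumption.

```python
from math import prod

# Assumed attribute values (standard weather data set; not taken from the slides).
ATTRIBUTE_VALUES = {
    "outlook": ["sunny", "overcast", "rainy"],
    "temperature": ["hot", "mild", "cool"],
    "humidity": ["high", "normal"],
    "windy": [True, False],
}
N_RECORDS = 14  # consistent with the 9/14 and 5/14 priors quoted in the slides

# Number of distinct evidence vectors E = (E1, E2, E3, E4).
n_combinations = prod(len(values) for values in ATTRIBUTE_VALUES.values())

print(n_combinations)              # 36 possible evidence vectors
print(N_RECORDS < n_combinations)  # True: fewer records than combinations
```

With fewer records than combinations, most exact combinations appear once or never, so a direct estimate of Pr(H | E) would rest on one record or none.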

The Naïve Bayes approach:

(I) Use Bayes' rule:

Pr(H | E) = Pr(E | H) * Pr(H) / Pr(E)

The Naïve Bayes approach:

Just calculate the numerator of Bayes' rule, Pr(E | H) * Pr(H), as the likelihood of each hypothesis, and compare

Pr(E = | H = “Play=yes”) *

Pr(H = “Play=yes”)

with

Pr(E = | H = “Play=no”) *

Pr(H = “Play=no”).

There is no need to worry about the denominator Pr(E), since it is the same on the right-hand side of Bayes' rule for both hypotheses.
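A minimal sketch of why the denominator can be ignored: dividing both scores by the same positive Pr(E) rescales them without changing which one is larger. The function names and the score values below are hypothetical placeholders, not numbers from the data set.

```python
from fractions import Fraction

def posteriors(scores):
    """Turn unnormalized scores Pr(E | H) * Pr(H) into posteriors Pr(H | E)."""
    # Pr(E) is the sum of the scores over all (mutually exclusive) hypotheses.
    evidence = sum(scores.values())
    return {h: s / evidence for h, s in scores.items()}

def predict(scores):
    """Pick the hypothesis with the largest score."""
    return max(scores, key=scores.get)

# Hypothetical scores, for illustration only.
scores = {"Play=yes": Fraction(1, 100), "Play=no": Fraction(3, 100)}
post = posteriors(scores)

print(post["Play=yes"], post["Play=no"])  # 1/4 3/4
print(predict(scores) == predict(post))   # True: same winner either way
```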

The Naïve Bayes approach:

(II) Use the naïve independence assumption: given the underlying event H, the individual pieces of evidence (E1, E2, E3, E4, …) are independent of one another.

Pr(E | H) = Pr(E1 | H) * Pr(E2 | H) * … * Pr(En | H)

It is much easier to estimate the conditional probabilities

Pr(E1 | H), Pr(E2 | H), Pr(E3 | H), and Pr(E4 | H) from the data set. (Why?)
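The independence assumption can be sketched as a plain product over per-feature conditional probabilities. The function name naive_likelihood is hypothetical; the four fractions are the Pr(Ei | H = “Play=no”) estimates quoted in the slides.

```python
from fractions import Fraction
from math import prod

def naive_likelihood(conditionals):
    """Pr(E | H) under the naive assumption: the product of the Pr(Ei | H)."""
    return prod(conditionals)

# Pr(E1 | no), Pr(E2 | no), Pr(E3 | no), Pr(E4 | no), as quoted in the slides.
conditionals_no = [Fraction(3, 5), Fraction(1, 5), Fraction(4, 5), Fraction(3, 5)]

print(naive_likelihood(conditionals_no))  # 36/625
```

Each factor depends on a single attribute, so it can be estimated from all records of one class rather than only from records that match the whole evidence vector.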

How to estimate Pr(E1 | H), Pr(E2 | H), Pr(E3 | H), Pr(E4 | H) ?

Estimate Pr(E1 | H) and Pr(E2 | H)

Pr(E1 = | H = “Play=yes” ) : 2/9 (why?)

Pr(E1 = | H = “Play=no” ) : 3/5 (why?)

Pr(E2 = | H = “Play=yes” ) : 3/9 (why?)

Pr(E2 = | H = “Play=no” ) : 1/5 (why?)

Estimate Pr(E3 | H) and Pr(E4 | H)

Pr(E3 = | H = “Play=yes” ) : 3/9 (why?)

Pr(E3 = | H = “Play=no” ) : 4/5 (why?)

Pr(E4 = | H = “Play=yes” ) : 3/9 (why?)

Pr(E4 = | H = “Play=no” ) : 3/5 (why?)

Estimate Pr(H):

Pr(H= “Play=yes” ) = 9/14 (why?)

Pr(H= “Play=no” ) = 5/14 (why?)
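The "(why?)" prompts are answered by counting. The sketch below assumes the slides use the standard 14-record weather data set and that the new case is (sunny, cool, high, windy). The transcript shows neither the table nor the evidence values, so both are assumptions, chosen because they reproduce every fraction quoted above. The names RECORDS, prior, and conditional are hypothetical.

```python
from fractions import Fraction

# Assumed data: the standard 14-record weather data set
# (outlook, temperature, humidity, windy, play). Not copied from the slides.
RECORDS = [
    ("sunny", "hot", "high", False, "no"),
    ("sunny", "hot", "high", True, "no"),
    ("overcast", "hot", "high", False, "yes"),
    ("rainy", "mild", "high", False, "yes"),
    ("rainy", "cool", "normal", False, "yes"),
    ("rainy", "cool", "normal", True, "no"),
    ("overcast", "cool", "normal", True, "yes"),
    ("sunny", "mild", "high", False, "no"),
    ("sunny", "cool", "normal", False, "yes"),
    ("rainy", "mild", "normal", False, "yes"),
    ("sunny", "mild", "normal", True, "yes"),
    ("overcast", "mild", "high", True, "yes"),
    ("overcast", "hot", "normal", False, "yes"),
    ("rainy", "mild", "high", True, "no"),
]
FEATURES = ("outlook", "temperature", "humidity", "windy")

def prior(label):
    """Pr(H = label): the fraction of all records carrying that label."""
    return Fraction(sum(r[-1] == label for r in RECORDS), len(RECORDS))

def conditional(feature, value, label):
    """Pr(feature = value | H = label): counted within that label's records."""
    i = FEATURES.index(feature)
    in_class = [r for r in RECORDS if r[-1] == label]
    return Fraction(sum(r[i] == value for r in in_class), len(in_class))

print(prior("yes"), prior("no"))               # 9/14 5/14
print(conditional("outlook", "sunny", "yes"))  # 2/9
print(conditional("humidity", "high", "no"))   # 4/5
```

For instance, 9 of the 14 records are “Play=yes”, and 2 of those 9 have a sunny outlook, which gives 2/9.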

Just calculate the likelihood

for H = “Play=yes” and for H = “Play=no”

For example,

Pr(E = | H = “Play=yes”) *

Pr(H = “Play=yes”)

= 2/9 * 3/9 * 3/9 * 3/9 * 9/14 ≈ 0.0053

The results: about 0.0053 for “Play=yes” and about 0.0206 for “Play=no”.

The prediction: “Play=no”
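The whole comparison can be checked with exact fractions. This sketch uses only the estimates quoted in the slides; the variable names are hypothetical.

```python
from fractions import Fraction
from math import prod

# Estimates quoted in the slides, in the order E1, E2, E3, E4.
conditionals = {
    "Play=yes": [Fraction(2, 9), Fraction(3, 9), Fraction(3, 9), Fraction(3, 9)],
    "Play=no": [Fraction(3, 5), Fraction(1, 5), Fraction(4, 5), Fraction(3, 5)],
}
priors = {"Play=yes": Fraction(9, 14), "Play=no": Fraction(5, 14)}

# Score each hypothesis: Pr(E | H) * Pr(H), with Pr(E | H) factorized naively.
scores = {h: prod(conditionals[h]) * priors[h] for h in priors}
prediction = max(scores, key=scores.get)

for h, s in scores.items():
    print(h, s, round(float(s), 4))  # Play=yes 1/189 0.0053, Play=no 18/875 0.0206
print("prediction:", prediction)     # prediction: Play=no
```

The “Play=no” score is roughly four times the “Play=yes” score, which is why the prediction is “Play=no”.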