Naive Bayes Text Classification
Jaime Arguello
INLS 613: Text Data Mining
September 17, 2014
Wednesday, September 17, 14
Outline
Basic Probability and Notation
Bayes Law and Naive Bayes Classification
Smoothing
Class Prior Probabilities
Naive Bayes Classification
Summary
Crash Course in Basic Probability
Discrete Random Variable
• A is a discrete random variable if:
‣ A describes an event with a finite number of possible outcomes (discrete vs. continuous)
‣ A describes an event whose outcome has some degree of uncertainty (random vs. pre-determined)
Discrete Random Variables: Examples
• A = the outcome of a coin-flip
‣ outcomes: heads, tails
• A = it will rain tomorrow
‣ outcomes: rain, no rain
• A = you have the flu
‣ outcomes: flu, no flu
• A = your final grade in this class
‣ outcomes: F, L, P, H
• A = the color of a ball pulled out of a bag
‣ outcomes: RED, BLUE, ORANGE
Probabilities
• Let P(A=X) denote the probability that the outcome of event A equals X
• For simplicity, we often express P(A=X) as P(X)
• Ex: P(RAIN), P(NO RAIN), P(FLU), P(NO FLU), ...
Probability Distribution
• A probability distribution gives the probability of each possible outcome of a random variable
• P(RED) = probability of pulling out a red ball
• P(BLUE) = probability of pulling out a blue ball
• P(ORANGE) = probability of pulling out an orange ball
• For it to be a probability distribution, two conditions must be satisfied:
‣ the probability assigned to each possible outcome must be between 0 and 1 (inclusive)
‣ the sum of probabilities assigned to all outcomes must equal 1
0 ≤ P(RED) ≤ 1
0 ≤ P(BLUE) ≤ 1
0 ≤ P(ORANGE) ≤ 1
P(RED) + P(BLUE) + P(ORANGE) = 1
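The two conditions above can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function name `is_probability_distribution` and the tolerance are assumptions chosen for illustration.

```python
import math

def is_probability_distribution(dist, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    # Condition 1: each outcome's probability lies between 0 and 1 (inclusive).
    in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    # Condition 2: the probabilities sum to 1 (within a small tolerance).
    sums_to_one = math.isclose(sum(dist.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

bag = {"RED": 0.5, "BLUE": 0.25, "ORANGE": 0.25}
print(is_probability_distribution(bag))
print(is_probability_distribution({"RED": 0.7, "BLUE": 0.7}))
```

The tolerance is there only because floating-point sums are rarely exactly 1.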
Probability Distribution: Estimation
• Let’s estimate these probabilities based on what we know about the contents of the bag (the counts below imply 20 balls: 10 red, 5 blue, and 5 orange)
• P(RED) = 10/20 = 0.5
• P(BLUE) = 5/20 = 0.25
• P(ORANGE) = 5/20 = 0.25
• P(RED) + P(BLUE) + P(ORANGE) = 1.0
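The estimate above is just a relative frequency: count each outcome and divide by the total. A small sketch, assuming the bag holds 10 red, 5 blue, and 5 orange balls as implied by the fractions on the slide; `estimate_distribution` is a hypothetical helper name.

```python
from collections import Counter

def estimate_distribution(observations):
    """Estimate P(outcome) as count(outcome) / total number of observations."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {outcome: count / total for outcome, count in counts.items()}

# Bag contents implied by 10/20, 5/20, and 5/20.
bag_contents = ["RED"] * 10 + ["BLUE"] * 5 + ["ORANGE"] * 5
dist = estimate_distribution(bag_contents)
print(dist)
```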
Probability Distribution: assigning probabilities to outcomes
• Given a probability distribution, we can assign probabilities to different outcomes
• I reach into the bag and pull out an orange ball. What is the probability of that happening?
• I reach into the bag and pull out two balls: one red, one blue. What is the probability of that happening?
• What about three orange balls?
P(RED) = 0.5, P(BLUE) = 0.25, P(ORANGE) = 0.25
What can we do with a probability distribution?
• If we assume that each outcome is independent of previous outcomes, then the probability of a sequence of outcomes is calculated by multiplying the individual probabilities
• Note: we’re assuming that when you take out a ball, you put it back in the bag before taking another
P(RED) = 0.5, P(BLUE) = 0.25, P(ORANGE) = 0.25
• P( ) = 0.25
• P( ) = 0.5
• P( ) = 0.25 x 0.25 x 0.25
• P( ) = 0.25 x 0.25 x 0.25
• P( ) = 0.25 x 0.50 x 0.25
• P( ) = 0.25 x 0.50 x 0.25 x 0.50
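Under the independence assumption (each ball is put back before the next draw), the probability of an ordered sequence is the product of the individual probabilities. The sketch below illustrates this with the bag distribution from the slides; `sequence_probability` is a hypothetical helper, and `math.prod` needs Python 3.8 or later.

```python
from math import prod

BAG = {"RED": 0.5, "BLUE": 0.25, "ORANGE": 0.25}

def sequence_probability(sequence, dist):
    """Probability of an ordered sequence of independent draws (with replacement)."""
    return prod(dist[outcome] for outcome in sequence)

one_orange = sequence_probability(["ORANGE"], BAG)
red_then_blue = sequence_probability(["RED", "BLUE"], BAG)
three_orange = sequence_probability(["ORANGE"] * 3, BAG)
print(one_orange, red_then_blue, three_orange)
```

Note that the function scores one specific order of draws; it does not sum over the different orders in which the same set of balls could appear.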
Conditional Probability
• P(A,B): the probability that event A and event B both occur
• P(A|B): the probability of event A occurring given prior knowledge that event B occurred
Bag A: P(RED) = 0.50, P(BLUE) = 0.25, P(ORANGE) = 0.25
Bag B: P(RED) = 0.50, P(BLUE) = 0.00, P(ORANGE) = 0.50
• P( | A) = 0.25
• P( | A) = 0.50
• P( | A) = 0.016
• P( | B) = 0.00
• P( | B) = 0.25
• P( | B) = 0.00
Chain Rule
• P(A, B) = P(A|B) x P(B)
• Example:
‣ P(A, B): probability that it will rain today (B) and tomorrow (A)
‣ P(B): probability that it will rain today (B)
‣ P(A|B): probability that it will rain tomorrow (A) given that it will rain today (B)
Independence
• If A and B are independent, then P(A|B) = P(A), so P(A, B) = P(A|B) x P(B) = P(A) x P(B)
• Example:
‣ P(A, B): probability that it will rain today (B) and tomorrow (A)
‣ P(B): probability that it will rain today (B)
‣ P(A|B): probability that it will rain tomorrow (A) given that it will rain today (B)
‣ P(A): probability that it will rain tomorrow (A)
P( | A) > P( )
P( | A) = P( )
Bayes’ Law
P(A|B) = P(B|A) x P(A) / P(B)
Derivation of Bayes’ Law
P(A, B) = P(A, B)   (always true!)
P(A|B) x P(B) = P(B|A) x P(A)   (chain rule!)
P(A|B) = P(B|A) x P(A) / P(B)   (divide both sides by P(B)!)
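The derivation can be sanity-checked numerically. The sketch below uses a made-up joint distribution over "rain tomorrow" (A) and "rain today" (B); the numbers and helper names are assumptions for illustration only, not values from the lecture.

```python
import math

# Hypothetical joint probabilities P(A, B), keyed by (A, B).
joint = {
    (True, True): 0.2,    # rain tomorrow and rain today
    (True, False): 0.1,   # rain tomorrow, no rain today
    (False, True): 0.1,   # no rain tomorrow, rain today
    (False, False): 0.6,  # no rain on either day
}

def p_a(a):
    """Marginal P(A = a): sum the joint over all values of B."""
    return sum(p for (x, _), p in joint.items() if x == a)

def p_b(b):
    """Marginal P(B = b): sum the joint over all values of A."""
    return sum(p for (_, y), p in joint.items() if y == b)

def p_a_given_b(a, b):
    """Chain rule rearranged: P(A|B) = P(A, B) / P(B)."""
    return joint[(a, b)] / p_b(b)

def p_b_given_a(b, a):
    """Chain rule rearranged: P(B|A) = P(A, B) / P(A)."""
    return joint[(a, b)] / p_a(a)

direct = p_a_given_b(True, True)
via_bayes = p_b_given_a(True, True) * p_a(True) / p_b(True)
print(direct, via_bayes)
```

Both routes should agree up to floating-point rounding, which is why the comparison uses a tolerance rather than exact equality.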
Naive Bayes Classification
example: positive/negative movie reviews
Bayes’ Rule: P(A|B) = P(B|A) x P(A) / P(B)
• Confidence of POS prediction given instance D: P(POS|D) = P(D|POS) x P(POS) / P(D)
• Confidence of NEG prediction given instance D: P(NEG|D) = P(D|NEG) x P(NEG) / P(D)
• Given instance D, predict positive (POS) if: P(POS|D) ≥ P(NEG|D)
• Otherwise, predict negative (NEG)
• Given instance D, predict positive (POS) if: P(D|POS) x P(POS) / P(D) ≥ P(D|NEG) x P(NEG) / P(D)
• Otherwise, predict negative (NEG)
• Are the P(D) denominators necessary? No: both sides are divided by the same P(D), so dropping it does not change which side is larger
• Given instance D, predict positive (POS) if: P(D|POS) x P(POS) ≥ P(D|NEG) x P(NEG)
• Otherwise, predict negative (NEG)
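To see that the denominator does not affect the decision, the sketch below compares the rule with and without P(D). The likelihoods and priors are made-up numbers, and P(D) is computed here as P(D|POS) x P(POS) + P(D|NEG) x P(NEG), which is an addition to the slides (the law of total probability over the two classes).

```python
def predict_with_evidence(lik_pos, lik_neg, prior_pos, prior_neg):
    """Decision rule using the full posteriors P(POS|D) and P(NEG|D)."""
    # P(D) via total probability over the two classes (not shown on the slides).
    evidence = lik_pos * prior_pos + lik_neg * prior_neg
    post_pos = lik_pos * prior_pos / evidence
    post_neg = lik_neg * prior_neg / evidence
    return "POS" if post_pos >= post_neg else "NEG"

def predict_without_evidence(lik_pos, lik_neg, prior_pos, prior_neg):
    """Same rule with the shared denominator P(D) dropped."""
    return "POS" if lik_pos * prior_pos >= lik_neg * prior_neg else "NEG"

# Hypothetical (P(D|POS), P(D|NEG), P(POS), P(NEG)) tuples.
cases = [
    (0.02, 0.01, 0.5, 0.5),
    (0.001, 0.01, 0.5, 0.5),
    (0.02, 0.01, 0.2, 0.8),
]
for case in cases:
    print(predict_with_evidence(*case), predict_without_evidence(*case))
```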
• Our next goal is to estimate these parameters from the training data!
• P(NEG) = fraction of training set documents that are NEG (easy!)
• P(POS) = fraction of training set documents that are POS (easy!)
• P(D|NEG) = ?? (not so easy!)
• P(D|POS) = ?? (not so easy!)
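Estimating the class priors only requires counting labels. A minimal sketch with a made-up list of training labels; `class_priors` is a hypothetical helper name.

```python
from collections import Counter

def class_priors(labels):
    """P(class) = number of training documents with that class / total documents."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

# Hypothetical training labels: 2 positive and 3 negative reviews.
labels = ["POS", "NEG", "NEG", "POS", "NEG"]
priors = class_priors(labels)
print(priors)
```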
Remember Conditional Probability?
Bag A: P(RED) = 0.50, P(BLUE) = 0.25, P(ORANGE) = 0.25
Bag B: P(RED) = 0.50, P(BLUE) = 0.00, P(ORANGE) = 0.50
• P( | A) = 0.25
• P( | A) = 0.50
• P( | A) = 0.25
• P( | B) = 0.00
• P( | B) = 0.50
• P( | B) = 0.50
Training Instances
• Positive Training Instances: P(D|POS) = ??
• Negative Training Instances: P(D|NEG) = ??
w_1 w_2 w_3 w_4 w_5 w_6 w_7 w_8 ... w_n sentiment
1   0   1   0   1   0   0   1   ... 0   positive
0   1   0   1   1   0   1   1   ... 0   positive
0   1   0   1   1   0   1   0   ... 0   positive
0   0   1   0   1   1   0   1   ... 1   positive
... ... ... ... ... ... ... ... ... ... ...
1   1   0   1   1   0   0   1   ... 1   positive
• We have a problem! What is it?
• Assuming n binary features, the number of possible combinations is 2^n
• 2^1000 = 1.071509e+301
• And in order to estimate the probability of each combination, we would require multiple occurrences of each combination in the training data!
• We could never have enough training data to reliably estimate P(D|NEG) or P(D|POS)!
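The size of this number is easy to confirm, since Python integers have arbitrary precision. A quick check of the figure quoted above, with n = 1000 as on the slide:

```python
n = 1000                      # number of binary features
combinations = 2 ** n         # every feature can be 0 or 1 independently

# Scientific notation with six decimals, as quoted on the slide.
formatted = f"{combinations:.6e}"
print(formatted)

# Number of decimal digits in the exact integer.
digit_count = len(str(combinations))
print(digit_count)
```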
• Assumption: given a particular class value (i.e., POS or NEG), the value of a particular feature is independent of the value of other features
• In other words, the value of a particular feature is only dependent on the class value
• Example: we have seven features and D = 1011011
• P(1011011|POS) = P(w1=1|POS) x P(w2=0|POS) x P(w3=1|POS) x P(w4=1|POS) x P(w5=0|POS) x P(w6=1|POS) x P(w7=1|POS)
• P(1011011|NEG) = P(w1=1|NEG) x P(w2=0|NEG) x P(w3=1|NEG) x P(w4=1|NEG) x P(w5=0|NEG) x P(w6=1|NEG) x P(w7=1|NEG)
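Under the independence assumption, P(D|class) becomes a product with one factor per feature. The sketch below computes it for D = 1011011; the per-feature probabilities are made-up values, and `naive_likelihood` is a hypothetical helper name.

```python
from math import prod

def naive_likelihood(document, p_one_given_class):
    """P(D | class) as a product of per-feature probabilities.

    document: string of '0'/'1' feature values, e.g. "1011011".
    p_one_given_class[i]: assumed estimate of P(w_(i+1) = 1 | class).
    """
    return prod(
        p if bit == "1" else 1.0 - p   # P(w=1|class) or P(w=0|class)
        for bit, p in zip(document, p_one_given_class)
    )

# Hypothetical values of P(w_i = 1 | POS) for the seven features.
p_one_given_pos = [0.75, 0.25, 0.5, 0.5, 0.5, 0.5, 0.5]
likelihood_pos = naive_likelihood("1011011", p_one_given_pos)
print(likelihood_pos)
```

Only n probabilities per class are needed here, instead of one probability for each of the 2^n possible feature combinations.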
• Question: How do we estimate P(w1=1|POS) ?
w_1 w_2 w_3 w_4 w_5 w_6 w_7 w_8 ... w_n sentiment
1   0   1   0   1   0   0   1   ... 0   positive
0   1   0   1   1   0   1   1   ... 0   negative
0   1   0   1   1   0   1   0   ... 0   negative
0   0   1   0   1   1   0   1   ... 1   positive
... ... ... ... ... ... ... ... ... ... ...
1   1   0   1   1   0   0   1   ... 1   negative
• Question: How do we estimate P(w1=1/0|POS/NEG) ?
• a, b, c, and d count the training documents in each cell:
         POS   NEG
w1 = 1   a     b
w1 = 0   c     d
P(w1=1|POS) = a / (a + c)
P(w1=0|POS) = c / (a + c)
P(w1=1|NEG) = b / (b + d)
P(w1=0|NEG) = d / (b + d)
• The same four estimates are computed for every other feature; for w2, for example, P(w2=1|POS) = a / (a + c), with a, b, c, and d counted in the table for w2
• The values of a, b, c, and d would be different for different features w1, w2, w3, w4, w5, ..., wn
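The counts a, b, c, and d can be read directly off a binary feature matrix. The sketch below does this for one feature at a time; the five toy rows are illustrative data (loosely modeled on the first four columns of the table above), and `feature_given_class` is a hypothetical helper. It assumes both classes appear in the training data, otherwise a denominator would be zero.

```python
def feature_given_class(rows, labels, feature_index):
    """Unsmoothed estimates of P(w = 1/0 | POS/NEG) for one feature."""
    a = b = c = d = 0
    for row, label in zip(rows, labels):
        present = row[feature_index] == 1
        if label == "POS":
            a += present          # feature = 1 and POS
            c += not present      # feature = 0 and POS
        else:
            b += present          # feature = 1 and NEG
            d += not present      # feature = 0 and NEG
    return {
        "P(w=1|POS)": a / (a + c),
        "P(w=0|POS)": c / (a + c),
        "P(w=1|NEG)": b / (b + d),
        "P(w=0|NEG)": d / (b + d),
    }

# Toy training data: four binary features per document.
rows = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
]
labels = ["POS", "NEG", "NEG", "POS", "NEG"]

w1 = feature_given_class(rows, labels, 0)
w2 = feature_given_class(rows, labels, 1)
print(w1)
print(w2)
```

In this toy data w2 never occurs in a positive document, so its unsmoothed estimate P(w=1|POS) is exactly 0.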
• Given instance D, predict positive (POS) if: P(POS) x ∏(i=1..n) P(wi = Di|POS) ≥ P(NEG) x ∏(i=1..n) P(wi = Di|NEG)
• Otherwise, predict negative (NEG)
• Given instance D = 1011011, predict positive (POS) if:
P(w1=1|POS) x P(w2=0|POS) x P(w3=1|POS) x P(w4=1|POS) x P(w5=0|POS) x P(w6=1|POS) x P(w7=1|POS) x P(POS)
≥
P(w1=1|NEG) x P(w2=0|NEG) x P(w3=1|NEG) x P(w4=1|NEG) x P(w5=0|NEG) x P(w6=1|NEG) x P(w7=1|NEG) x P(NEG)
• Otherwise, predict negative (NEG)
• We still have a problem! What is it?
• What if a feature value never occurs with a class in the training data? Its estimated probability is then 0, and the whole product becomes 0
Smoothing Probability Estimates
• When estimating probabilities, we tend to ...
‣ Over-estimate the probability of observed outcomes
‣ Under-estimate the probability of unobserved outcomes
• The goal of smoothing is to ...
‣ Decrease the probability of observed outcomes
‣ Increase the probability of unobserved outcomes
• It’s usually a good idea
• You probably already know this concept!
• YOU: Are there mountain lions around here?
• YOUR FRIEND: Nope.
• YOU: How can you be so sure?
• YOUR FRIEND: Because I’ve been hiking here five times before and have never seen one.
• MOUNTAIN LION: You should have learned about smoothing by taking INLS 613. Yum!
Add-One Smoothing
• Question: How do we estimate P(w2=1/0|POS/NEG) ?
• Add 1 to the count in every cell:
         POS     NEG
w2 = 1   a + 1   b + 1
w2 = 0   c + 1   d + 1
P(w2=1|POS) = (a + 1) / (a + c + 2)
P(w2=0|POS) = (c + 1) / (a + c + 2)
P(w2=1|NEG) = (b + 1) / (b + d + 2)
P(w2=0|NEG) = (d + 1) / (b + d + 2)
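The add-one formulas translate directly into code. In the sketch below the counts are made up so that the feature never appears in a negative document (b = 0), which is the case where the unsmoothed estimate would be 0; `smoothed_estimates` is a hypothetical helper name.

```python
def smoothed_estimates(a, b, c, d):
    """Add-one smoothed estimates from the four cell counts of one feature."""
    return {
        "P(w=1|POS)": (a + 1) / (a + c + 2),
        "P(w=0|POS)": (c + 1) / (a + c + 2),
        "P(w=1|NEG)": (b + 1) / (b + d + 2),
        "P(w=0|NEG)": (d + 1) / (b + d + 2),
    }

# Hypothetical counts: the feature occurs in 3 of 4 POS documents
# and in 0 of 4 NEG documents.
est = smoothed_estimates(a=3, b=0, c=1, d=4)
unsmoothed_w1_neg = 0 / (0 + 4)   # b / (b + d) without smoothing
print(unsmoothed_w1_neg, est["P(w=1|NEG)"])
```

The denominator grows by 2 because one count is added to each of the two possible feature values, so the two estimates for a class still sum to 1.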
• Given instance D, predict positive (POS) if: P(POS) x ∏(i=1..n) P(wi = Di|POS) ≥ P(NEG) x ∏(i=1..n) P(wi = Di|NEG)
• Otherwise, predict negative (NEG)
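Putting the pieces together, the sketch below trains and applies a small binary-feature Naive Bayes classifier with add-one smoothing. It is an illustration under stated assumptions, not the course's reference implementation: the training data is made up, both classes must appear in it, and the scores are compared as sums of logarithms rather than raw products. The log form is an addition to the slides; it gives the same decision because the logarithm is increasing, and it avoids underflow when there are many features.

```python
import math

def train(rows, labels):
    """Estimate P(class) and add-one smoothed P(w_i = 1 | class) for each class."""
    model = {}
    n_features = len(rows[0])
    for cls in set(labels):
        class_rows = [row for row, label in zip(rows, labels) if label == cls]
        prior = len(class_rows) / len(rows)
        p_one = [
            (sum(row[i] for row in class_rows) + 1) / (len(class_rows) + 2)
            for i in range(n_features)
        ]
        model[cls] = (prior, p_one)
    return model

def log_score(row, prior, p_one):
    """log P(class) + sum over features of log P(w_i = D_i | class)."""
    score = math.log(prior)
    for value, p in zip(row, p_one):
        score += math.log(p if value == 1 else 1.0 - p)
    return score

def predict(model, row):
    """Predict POS if its score is at least the NEG score, otherwise NEG."""
    pos = log_score(row, *model["POS"])
    neg = log_score(row, *model["NEG"])
    return "POS" if pos >= neg else "NEG"

# Hypothetical training set with three binary features.
rows = [[1, 0, 1], [1, 0, 0], [1, 1, 1], [0, 1, 0], [0, 1, 1], [0, 0, 0]]
labels = ["POS", "POS", "POS", "NEG", "NEG", "NEG"]

model = train(rows, labels)
print(predict(model, [1, 0, 1]))
print(predict(model, [0, 1, 0]))
```

Because of smoothing, every estimated probability lies strictly between 0 and 1, so the logarithms are always defined.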
Naive Bayes Classification
• Naive Bayes classifiers are simple, effective, robust, and very popular
• They assume that feature values are conditionally independent given the target class value
• This assumption does not hold in natural language
• Even so, NB classifiers are very powerful
• Smoothing is necessary in order to avoid zero probabilities