Machine learning practice session 1 · 2018. 2. 16.
"Machine learning: The Art and Science of Algorithms that Make Sense of Data" by Peter Flach (2012)
✤ Practice sessions support understanding of lectures and add practical material
✤ Main programming language is Python
✤ 6 homeworks: 36 points; at least 50% required
✤ Practice sessions help you understand the homework tasks and discuss solutions
Tennis Dataset
Day  Outlook   Temp  Humidity  Wind    PlayTennis
D1   Sunny     Hot   High      Weak    No
D2   Sunny     Hot   High      Strong  No
D3   Overcast  Hot   High      Weak    Yes
D4   Rain      Mild  High      Weak    Yes
D5   Rain      Cool  Normal    Weak    Yes
D6   Rain      Cool  Normal    Strong  No
D7   Overcast  Cool  Normal    Strong  Yes
D8   Sunny     Mild  High      Weak    No
D9   Sunny     Cool  Normal    Weak    Yes
D10  Rain      Mild  Normal    Weak    Yes
D11  Sunny     Mild  Normal    Strong  Yes
D12  Overcast  Mild  High      Strong  Yes
D13  Overcast  Hot   Normal    Weak    Yes
D14  Rain      Mild  High      Strong  No
Shall we play tennis today?
PlayTennis (D1 to D14): No, No, Yes, Yes, Yes, No, Yes, No, Yes, Yes, Yes, Yes, Yes, No
P(Yes) = 9/14 = 0.64
P(No) = 5/14 = 0.36
P(Yes) > P(No) -> Yes
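The counting on this slide can be sketched in a few lines of Python. This is only an illustration: the variable names (`play_tennis`, `priors`, `prediction`) are made up here, and the labels are typed in by hand from the PlayTennis column.

```python
from collections import Counter

# PlayTennis labels for days D1..D14, copied from the Tennis Dataset.
play_tennis = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
               "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

counts = Counter(play_tennis)
total = len(play_tennis)

# The relative frequency of each class is our estimate of its probability.
priors = {label: n / total for label, n in counts.items()}

# With no other information, predict the more probable (majority) class.
prediction = max(priors, key=priors.get)

for label, p in priors.items():
    print(f"P({label}) = {counts[label]}/{total} = {p:.2f}")
print("Prediction:", prediction)
```

Note that this baseline ignores the weather completely; it always gives the same answer.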
It’s windy today. Tennis, anyone?
Wind    PlayTennis
Weak    No
Strong  No
Weak    Yes
Weak    Yes
Weak    Yes
Strong  No
Strong  Yes
Weak    No
Weak    Yes
Weak    Yes
Strong  Yes
Strong  Yes
Weak    Yes
Strong  No
P(Weak) = 8/14 = 0.57
P(Strong) = 6/14 = 0.43
P(Yes | Strong) = 3/6 = 0.5
P(No | Strong) = 3/6 = 0.5
P(Yes | Weak) = 6/8 = 0.75
P(No | Weak) = 2/8 = 0.25
Definition of conditional probability: P(Yes | Strong) = P(Yes, Strong)/P(Strong)
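As a rough sketch, the definition of conditional probability can be applied directly to the Wind and PlayTennis columns. The helper names (`p_wind`, `p_joint`, `p_cond`) are hypothetical and chosen only for this example.

```python
# Wind and PlayTennis columns for days D1..D14, copied from the Tennis Dataset.
wind = ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
        "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"]
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

rows = list(zip(wind, play))
n = len(rows)


def p_wind(w):
    """P(Wind = w), estimated as a relative frequency."""
    return sum(1 for x, _ in rows if x == w) / n


def p_joint(label, w):
    """P(PlayTennis = label, Wind = w): both conditions hold on the same day."""
    return sum(1 for x, y in rows if x == w and y == label) / n


def p_cond(label, w):
    """P(PlayTennis = label | Wind = w) = P(label, w) / P(w)."""
    return p_joint(label, w) / p_wind(w)


for w in ("Strong", "Weak"):
    for label in ("Yes", "No"):
        print(f"P({label} | {w}) = {p_cond(label, w):.2f}")
```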
More attributes
Humidity  Wind    PlayTennis
High      Weak    No
High      Strong  No
High      Weak    Yes
High      Weak    Yes
Normal    Weak    Yes
Normal    Strong  No
Normal    Strong  Yes
High      Weak    No
Normal    Weak    Yes
Normal    Weak    Yes
Normal    Strong  Yes
High      Strong  Yes
Normal    Weak    Yes
High      Strong  No
P(High, Weak) = 4/14 = 0.29
P(Yes | High, Weak) = 2/4 = 0.5
P(No | High, Weak) = 2/4 = 0.5
P(High, Strong) = 3/14 = 0.21
P(Yes | High, Strong) = 1/3 = 0.33
P(No | High, Strong) = 2/3 = 0.67
…
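The same counting works with two attributes. The sketch below uses exact fractions so that the results can be compared with hand calculations; the function names `joint` and `conditional` are illustrative only.

```python
from fractions import Fraction

# (Humidity, Wind, PlayTennis) for days D1..D14, copied from the Tennis Dataset.
rows = [
    ("High", "Weak", "No"), ("High", "Strong", "No"),
    ("High", "Weak", "Yes"), ("High", "Weak", "Yes"),
    ("Normal", "Weak", "Yes"), ("Normal", "Strong", "No"),
    ("Normal", "Strong", "Yes"), ("High", "Weak", "No"),
    ("Normal", "Weak", "Yes"), ("Normal", "Weak", "Yes"),
    ("Normal", "Strong", "Yes"), ("High", "Strong", "Yes"),
    ("Normal", "Weak", "Yes"), ("High", "Strong", "No"),
]


def joint(humidity, wind):
    """P(Humidity, Wind): share of all days with this attribute combination."""
    matches = [r for r in rows if r[:2] == (humidity, wind)]
    return Fraction(len(matches), len(rows))


def conditional(label, humidity, wind):
    """P(PlayTennis = label | Humidity, Wind): share within the matching days."""
    matches = [r for r in rows if r[:2] == (humidity, wind)]
    hits = [r for r in matches if r[2] == label]
    return Fraction(len(hits), len(matches))


for h, w in [("High", "Weak"), ("High", "Strong")]:
    print(f"P({h}, {w}) = {float(joint(h, w)):.2f}")
    for label in ("Yes", "No"):
        print(f"  P({label} | {h}, {w}) = {float(conditional(label, h, w)):.2f}")
```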
Classifier
1. Estimate from data: P(Class | X1,X2,X3,…)
2. For a given instance (X1,X2,X3,…), predict the class whose conditional probability is greater:
P(C1 | X1,X2,X3,…) > P(C2 | X1,X2,X3,…) -> predict C1
For example: P(No | High, Strong) > P(Yes | High, Strong) -> predict No
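A minimal sketch of these two steps, assuming only Humidity and Wind as features. The names `table`, `posterior` and `predict` are invented for this example. The conditional probabilities are read straight off a count table, so an attribute combination that never occurs in the data cannot be classified (here `predict` would raise an error), and ties are broken arbitrarily.

```python
from collections import Counter, defaultdict

# (Humidity, Wind, PlayTennis) for days D1..D14, copied from the Tennis Dataset.
rows = [
    ("High", "Weak", "No"), ("High", "Strong", "No"),
    ("High", "Weak", "Yes"), ("High", "Weak", "Yes"),
    ("Normal", "Weak", "Yes"), ("Normal", "Strong", "No"),
    ("Normal", "Strong", "Yes"), ("High", "Weak", "No"),
    ("Normal", "Weak", "Yes"), ("Normal", "Weak", "Yes"),
    ("Normal", "Strong", "Yes"), ("High", "Strong", "Yes"),
    ("Normal", "Weak", "Yes"), ("High", "Strong", "No"),
]

# Step 1: estimate P(Class | Humidity, Wind) by counting classes
# separately for every attribute combination.
table = defaultdict(Counter)
for humidity, wind, label in rows:
    table[(humidity, wind)][label] += 1


def posterior(humidity, wind):
    """Return {class: P(class | humidity, wind)} for a combination seen in the data."""
    counts = table[(humidity, wind)]
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Step 2: predict the class with the greatest conditional probability.
def predict(humidity, wind):
    post = posterior(humidity, wind)
    return max(post, key=post.get)


print(posterior("High", "Strong"))
print(predict("High", "Strong"))
```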
It’s windy today. Tennis, anyone?
P(Yes | Strong) = 3/6 = 0.5
P(Yes | Strong) = P(Yes, Strong)/P(Strong)
P(Strong | Yes) = P(Strong, Yes)/P(Yes)
P(Strong, Yes) = P(Yes, Strong)
P(Yes, Strong) = P(Yes | Strong)P(Strong) = P(Strong | Yes)P(Yes)
P(Yes | Strong)P(Strong) = P(Strong | Yes)P(Yes)
P(Yes | Strong) = P(Strong | Yes)P(Yes)/P(Strong)
You have just derived Bayes’ rule!
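Bayes’ rule can be checked numerically on the Wind column. The sketch below computes P(Yes | Strong) twice, once by direct counting and once through P(Strong | Yes)P(Yes)/P(Strong); the variable names are chosen for this example only.

```python
from fractions import Fraction

# Wind and PlayTennis columns for days D1..D14, copied from the Tennis Dataset.
wind = ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
        "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"]
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]
pairs = list(zip(wind, play))
n = len(pairs)

n_yes = sum(1 for _, p in pairs if p == "Yes")
n_strong = sum(1 for w, _ in pairs if w == "Strong")
n_strong_and_yes = sum(1 for w, p in pairs if w == "Strong" and p == "Yes")

p_yes = Fraction(n_yes, n)                             # P(Yes)
p_strong = Fraction(n_strong, n)                       # P(Strong)
p_strong_given_yes = Fraction(n_strong_and_yes, n_yes) # P(Strong | Yes)

# Left-hand side: count Yes days among the Strong days.
direct = Fraction(n_strong_and_yes, n_strong)

# Right-hand side: Bayes' rule.
via_bayes = p_strong_given_yes * p_yes / p_strong

print("Direct count:", direct)
print("Bayes' rule: ", via_bayes)
```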
The Bayes Classifier
1. Estimate from data: P(Class | X1,X2,X3,…)
2. For a given instance (X1,X2,X3,…), predict the class whose conditional probability is greater:
P(No | High, Strong) > P(Yes | High, Strong) -> predict No
The Bayes classifier with Bayes’ rule
P(Yes | Strong) > P(No | Strong) -> predict Yes
Apply Bayes’ rule to both sides:
P(Strong | Yes)P(Yes)/P(Strong) > P(Strong | No)P(No)/P(Strong) -> predict Yes
P(Strong) is the same on both sides and cancels:
P(Strong | Yes)P(Yes) > P(Strong | No)P(No) -> predict Yes
P(Strong | Yes)/P(Strong | No) > P(No)/P(Yes) -> predict Yes
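The ratio form of the rule can be tried out on the Wind column. In this sketch the function names (`p_class`, `p_wind_given`, `likelihood_ratio`, `decide`) and the explicit "tie" outcome are additions made for illustration. On this data the rule is exactly balanced for Strong wind, which matches P(Yes | Strong) = P(No | Strong) = 0.5.

```python
from fractions import Fraction

# Wind and PlayTennis columns for days D1..D14, copied from the Tennis Dataset.
wind = ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
        "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"]
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]
pairs = list(zip(wind, play))


def p_class(label):
    """P(PlayTennis = label)."""
    return Fraction(sum(1 for _, p in pairs if p == label), len(pairs))


def p_wind_given(w, label):
    """P(Wind = w | PlayTennis = label)."""
    in_class = [x for x, p in pairs if p == label]
    return Fraction(in_class.count(w), len(in_class))


def likelihood_ratio(w):
    """P(w | Yes) / P(w | No)."""
    return p_wind_given(w, "Yes") / p_wind_given(w, "No")


def decide(w):
    """Predict Yes when the likelihood ratio exceeds P(No)/P(Yes)."""
    threshold = p_class("No") / p_class("Yes")
    ratio = likelihood_ratio(w)
    if ratio > threshold:
        return "Yes"
    if ratio < threshold:
        return "No"
    return "tie"


for w in ("Weak", "Strong"):
    print(w, likelihood_ratio(w), decide(w))
```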
What about the Naïve Bayes classifier?
Assume (naively) that features X1,X2,X3,… are independent
Then: P(X1,X2) = P(X1)P(X2) and P(X1 | X2) = P(X1)
Applied within each class (conditional independence given the class), this gives us tools to use:
P(X1 | X2,X3,C1) = P(X1 | C1)
P(X2 | X3,C1) = P(X2 | C1)
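The independence assumption is called naive because real data rarely satisfies it exactly. As a rough check (not part of the original slides), the sketch below compares the joint probability of High humidity and Strong wind among the No days with the product of the two separate probabilities. The helper `p_given_no` is a made-up name.

```python
from fractions import Fraction

# (Humidity, Wind, PlayTennis) for days D1..D14, copied from the Tennis Dataset.
rows = [
    ("High", "Weak", "No"), ("High", "Strong", "No"),
    ("High", "Weak", "Yes"), ("High", "Weak", "Yes"),
    ("Normal", "Weak", "Yes"), ("Normal", "Strong", "No"),
    ("Normal", "Strong", "Yes"), ("High", "Weak", "No"),
    ("Normal", "Weak", "Yes"), ("Normal", "Weak", "Yes"),
    ("Normal", "Strong", "Yes"), ("High", "Strong", "Yes"),
    ("Normal", "Weak", "Yes"), ("High", "Strong", "No"),
]

no_rows = [r for r in rows if r[2] == "No"]


def p_given_no(condition):
    """Share of the No days for which `condition(row)` is true."""
    return Fraction(sum(1 for r in no_rows if condition(r)), len(no_rows))


# Exact joint probability P(High, Strong | No), counted from the data.
joint = p_given_no(lambda r: r[0] == "High" and r[1] == "Strong")

# Naive approximation: P(High | No) * P(Strong | No).
product = (p_given_no(lambda r: r[0] == "High")
           * p_given_no(lambda r: r[1] == "Strong"))

print("P(High, Strong | No)      =", joint)
print("P(High | No)P(Strong | No) =", product)
```

The two values are close but not equal, so Naïve Bayes works with an approximation here. In exchange, it only needs one small count table per feature instead of one entry per combination of all features.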
Naïve Bayes Classifier
P(C1 | X1,X2,X3,…) > P(C2 | X1,X2,X3,…) -> predict C1
[P(X1 | C1)P(X2 | C1)…]/[P(X1 | C2)P(X2 | C2)…] > P(C2)/P(C1) -> predict C1
With a single feature this is the rule from before: P(Strong | Yes)/P(Strong | No) > P(No)/P(Yes) -> predict Yes
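For reference, the steps between the two inequalities can be written out as follows. This is a sketch that combines Bayes’ rule with the naive assumption that the features are independent given the class; all probabilities are assumed to be non-zero so that the divisions are allowed.

```latex
\begin{align*}
P(C_1 \mid X_1, X_2, \dots) &> P(C_2 \mid X_1, X_2, \dots) \\
\frac{P(X_1, X_2, \dots \mid C_1)\,P(C_1)}{P(X_1, X_2, \dots)}
  &> \frac{P(X_1, X_2, \dots \mid C_2)\,P(C_2)}{P(X_1, X_2, \dots)}
  && \text{(Bayes' rule)} \\
P(C_1)\prod_i P(X_i \mid C_1) &> P(C_2)\prod_i P(X_i \mid C_2)
  && \text{(cancel the denominator, naive assumption)} \\
\frac{\prod_i P(X_i \mid C_1)}{\prod_i P(X_i \mid C_2)} &> \frac{P(C_2)}{P(C_1)}
\end{align*}
```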
Naïve Bayes Classifier
1. Estimate from data: P(Class | X1,X2,X3,…)
2. For a given instance (X1,X2,X3,…) predict class whose conditional probability is greater:
P(C1 | X1,X2,X3,…) > P(C2 | X1,X2,X3,…) —> predict C1
Assume X1,X2,X3,… are independent
Use Bayes’ rule to calculate
We have a new day: Outlook=Rain; Temp=Mild; Humidity=Normal; Wind=Strong
Shall we play tennis?
P(C1 | X1,X2,X3,…) > P(C2 | X1,X2,X3,…) -> predict C1
[P(X1 | C1)P(X2 | C1)…]/[P(X1 | C2)P(X2 | C2)…] > P(C2)/P(C1) -> predict C1
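One possible way to answer the question in Python is sketched below. It is a plain counting implementation of Naïve Bayes with no smoothing, so a feature value never seen with a class would give that class a score of zero. The names `class_counts`, `feature_counts` and `score` are invented for this example and do not come from any library.

```python
from collections import Counter, defaultdict

# (Outlook, Temp, Humidity, Wind, PlayTennis) for days D1..D14.
data = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]

# Count how often each class occurs, and how often each feature value
# occurs within each class: feature_counts[label][i][value].
class_counts = Counter(row[-1] for row in data)
feature_counts = defaultdict(lambda: defaultdict(Counter))
for *features, label in data:
    for i, value in enumerate(features):
        feature_counts[label][i][value] += 1


def score(instance, label):
    """Unnormalised posterior: P(label) * product over i of P(x_i | label)."""
    result = class_counts[label] / len(data)
    for i, value in enumerate(instance):
        result *= feature_counts[label][i][value] / class_counts[label]
    return result


new_day = ("Rain", "Mild", "Normal", "Strong")
scores = {label: score(new_day, label) for label in class_counts}
prediction = max(scores, key=scores.get)

for label, s in scores.items():
    print(f"score({label}) = {s:.4f}")
print("Prediction:", prediction)
```

The scores are not probabilities because the common denominator P(X1,X2,X3,…) is left out; dividing each score by their sum gives the normalised posterior. Only the comparison matters for the prediction.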