CS 2750: Machine Learning Review
Changsheng Liu, University of Pittsburgh
April 4, 2016
Source: kovashka/cs2750_sp16/chl_ml.pdf · 2016. 4. 7.


TRANSCRIPT

Page 1

CS 2750: Machine Learning Review

Changsheng Liu, University of Pittsburgh

April 4, 2016

Page 2

Plan for today

• Review some questions from HW 3
• Density Estimation
• Mixtures of Gaussians
• Naïve Bayes

Page 3

HW 3

• Please see whiteboard

Page 4

Density Estimation

• Maximum likelihood
• Maximum a posteriori estimation

Page 5

Density Estimation

• A set of random variables X = {X1, X2, …, Xd}
• A model of the distribution over variables in X with parameters Θ: P(X | Θ)
• Data D = {D1, D2, …, Dn}
• Objective: find the parameters Θ for which P(X | Θ) fits the data D best
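For a set of independent, identically distributed samples (the standard assumption here, though not spelled out on the transcribed slide), the quantity being fit is the likelihood:

    \[ L(\Theta) = P(D \mid \Theta) = \prod_{i=1}^{n} P(D_i \mid \Theta) \]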

Page 6

Density Estimation

• Maximum likelihood
  • Maximize P(D | Θ, ξ)
• Maximum a posteriori probability (MAP)
  • Maximize the posterior P(Θ | D, ξ)
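Written out (this is standard notation rather than a formula from the transcribed slide; ξ denotes the background knowledge carried along in the course slides), the two estimates are

    \[ \Theta_{ML} = \arg\max_{\Theta} P(D \mid \Theta, \xi), \qquad
       \Theta_{MAP} = \arg\max_{\Theta} P(\Theta \mid D, \xi)
                    = \arg\max_{\Theta} \frac{P(D \mid \Theta, \xi)\, P(\Theta \mid \xi)}{P(D \mid \xi)} . \]

Since P(D | ξ) does not depend on Θ, the MAP estimate maximizes likelihood times prior, which is what the coin example on the next slides illustrates.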

Page 7

A coin example

• A biased coin, with probability of a head θ
• Data: HHTTHHTHTHTTTHTHHHHTHHHHT
• Heads: 15
• Tails: 10
• What is a good estimate of θ?

Slide from Milos

Page 8

Maximum likelihood

• Use the frequency of occurrences: 15/25
• This is the maximum likelihood estimate
• The likelihood of the data
• Maximum likelihood

Slide from Milos
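The last two bullets label formulas that appear only in the slide image; for the Bernoulli coin model they are the standard ones:

    \[ P(D \mid \theta) = \theta^{N_H} (1 - \theta)^{N_T}, \qquad
       \theta_{ML} = \arg\max_{\theta} P(D \mid \theta) = \frac{N_H}{N_H + N_T} = \frac{15}{25} = 0.6 . \]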

Page 9

Maximum likelihood

Slide from Milos

Page 10

Maximum a posteriori estimate

Slide from Milos

Page 11

Maximum a posteriori estimate

• Choose the prior from the same family, for convenience

Slide from Milos
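"Same family" refers to a conjugate prior: for the Bernoulli/binomial coin model the conjugate prior is a Beta distribution, so the posterior is again a Beta. The formulas live on the slide images; a sketch of the standard result:

    \[ P(\theta) = \mathrm{Beta}(\theta \mid \alpha, \beta) \propto \theta^{\alpha - 1} (1 - \theta)^{\beta - 1} \]
    \[ P(\theta \mid D) \propto \theta^{N_H} (1 - \theta)^{N_T} \cdot \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}
       = \mathrm{Beta}(\theta \mid \alpha + N_H, \; \beta + N_T) \]
    \[ \theta_{MAP} = \frac{N_H + \alpha - 1}{N_H + N_T + \alpha + \beta - 2} \]

With a uniform prior (α = β = 1) this reduces to the maximum likelihood estimate 15/25.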

Page 12

Maximum a posteriori estimate

Slide from Bishop

Page 13

Prior · Likelihood = Posterior

Slide from Bishop

Page 14

The Gaussian Distribution

Slide from Bishop
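The density itself appears only in the slide image; for reference, the D-dimensional Gaussian with mean μ and covariance Σ is

    \[ \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma})
       = \frac{1}{(2\pi)^{D/2} \, |\boldsymbol{\Sigma}|^{1/2}}
         \exp\!\left( -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\mathsf{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right) . \]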

Page 15

The Gaussian Distribution

(Figure: contours of the density for a diagonal covariance matrix and for a covariance matrix proportional to the identity matrix.)

Slide from Bishop
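The two special cases named in the figure caption correspond to the following forms (standard, not spelled out in the transcribed text):

    \[ \boldsymbol{\Sigma} = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_D^2) \quad \text{(axis-aligned elliptical contours)}, \qquad
       \boldsymbol{\Sigma} = \sigma^2 \mathbf{I} \quad \text{(circular contours)} . \]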

Page 16

Mixtures of Gaussians (1)

(Figure: the Old Faithful data set modelled with a single Gaussian vs. a mixture of two Gaussians.)

Slide from Bishop
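A minimal sketch of the comparison in the figure, fitting one Gaussian against a mixture of two with scikit-learn; this is not from the slides, and since the Old Faithful data is not bundled here, synthetic two-cluster data stands in for it:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Synthetic stand-in for the Old Faithful data: two well-separated
    # clusters of (eruption duration, waiting time) pairs.
    rng = np.random.default_rng(0)
    X = np.vstack([
        rng.normal(loc=[2.0, 55.0], scale=[0.3, 6.0], size=(150, 2)),
        rng.normal(loc=[4.3, 80.0], scale=[0.4, 6.0], size=(150, 2)),
    ])

    # Fit a single Gaussian (K = 1) and a mixture of two Gaussians (K = 2),
    # then compare how well each explains the data.
    for k in (1, 2):
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              random_state=0).fit(X)
        print(f"K={k}: avg. log-likelihood per point = {gmm.score(X):.2f}, "
              f"mixing coefficients = {np.round(gmm.weights_, 3)}")

The two-component fit should report a clearly higher average log-likelihood, which is the point the figure makes.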

Page 17

Mixtures of Gaussians (2)

• Combine simple models into a complex model:
(Equation on the slide, annotated with "component" and "mixing coefficient"; the plotted example uses K = 3.)

Slide from Bishop
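The annotated equation is the standard mixture density,

    \[ p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),
       \qquad 0 \le \pi_k \le 1, \quad \sum_{k=1}^{K} \pi_k = 1, \]

where each N(x | μk, Σk) is a component and the πk are the mixing coefficients.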

Page 18

Mixtures of Gaussians (3)

Slide from Bishop

Page 19

Bayesian Networks

• Directed Acyclic Graph (DAG)
• Nodes are random variables
• Edges indicate causal influences

(Example network: Burglary and Earthquake each point to Alarm; Alarm points to JohnCalls and MaryCalls.)

Slide credit: Ray Mooney

Page 20

Conditional Probability Tables

• Each node has a conditional probability table (CPT) that gives the probability of each of its values given every possible combination of values for its parents (conditioning case).
• Roots (sources) of the DAG that have no parents are given prior probabilities.

(Same burglary network: Burglary and Earthquake point to Alarm; Alarm points to JohnCalls and MaryCalls.)

P(B) = .001
P(E) = .002

B E | P(A)
T T | .95
T F | .94
F T | .29
F F | .001

A | P(M)
T | .70
F | .01

A | P(J)
T | .90
F | .05

Slide credit: Ray Mooney
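As a worked example (not on the slide), the CPTs define the full joint distribution through the factorization P(B, E, A, J, M) = P(B) P(E) P(A | B, E) P(J | A) P(M | A). For instance, the probability that both neighbours call, the alarm sounds, and there is neither a burglary nor an earthquake is

    \[ P(J, M, A, \neg B, \neg E)
       = P(\neg B) \, P(\neg E) \, P(A \mid \neg B, \neg E) \, P(J \mid A) \, P(M \mid A)
       = 0.999 \times 0.998 \times 0.001 \times 0.9 \times 0.7 \approx 6.3 \times 10^{-4} . \]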

Page 21

Conditional Independence

a is independent of b given c · Equivalently · Notation
(The formulas are in the slide image; the standard statements are reproduced below.)

Slide from Bishop
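For reference, the statements these labels refer to are

    \[ p(a \mid b, c) = p(a \mid c), \qquad \text{equivalently} \qquad
       p(a, b \mid c) = p(a \mid c) \, p(b \mid c), \qquad \text{notation:} \quad a \perp\!\!\!\perp b \mid c . \]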

Page 22

Conditionally independent via D-separation

• D-separation in the graph: let X, Y, and Z be three sets of nodes. If X and Y are d-separated by Z, then X and Y are conditionally independent given Z.
• D-separation: A is d-separated from B given C if every undirected path between them is blocked by C.
• A path is blocked by C if it passes through a node at which either (a) the arrows meet head-to-tail or tail-to-tail and the node is in C, or (b) the arrows meet head-to-head and neither the node nor any of its descendants is in C.

Slide from Milos
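As a quick check of these rules on the burglary network from the earlier slides, here is a small sketch (not from the slides); it assumes a NetworkX version recent enough to provide nx.d_separated:

    import networkx as nx

    # The burglary network as a DAG.
    G = nx.DiGraph([
        ("Burglary", "Alarm"), ("Earthquake", "Alarm"),
        ("Alarm", "JohnCalls"), ("Alarm", "MaryCalls"),
    ])

    # Burglary vs. Earthquake: the only path meets head-to-head at Alarm,
    # which is unobserved, so the path is blocked.
    print(nx.d_separated(G, {"Burglary"}, {"Earthquake"}, set()))        # True
    # Observing Alarm unblocks the head-to-head node ("explaining away").
    print(nx.d_separated(G, {"Burglary"}, {"Earthquake"}, {"Alarm"}))    # False
    # JohnCalls vs. MaryCalls: the path is tail-to-tail at Alarm, which is
    # observed, so it is blocked.
    print(nx.d_separated(G, {"JohnCalls"}, {"MaryCalls"}, {"Alarm"}))    # True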

Page 23

D-separation

Slide from Milos

Page 24

Exercise

Slide from Milos

Page 25

Naïve Bayes as a Bayes Net

• Naïve Bayes is a simple Bayes Net
(Graph: class variable Y with edges to the features X1, X2, …, Xn.)
• Priors P(Y) and conditionals P(Xi | Y) for Naïve Bayes provide the CPTs for the network.

Slide credit: Ray Mooney
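Spelled out (standard Naïve Bayes, not written on the transcribed slide), the network encodes the joint distribution and the resulting classifier:

    \[ P(Y, X_1, \ldots, X_n) = P(Y) \prod_{i=1}^{n} P(X_i \mid Y), \qquad
       \hat{y} = \arg\max_{y} \; P(Y = y) \prod_{i=1}^{n} P(X_i = x_i \mid Y = y) . \]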