machine learning for data science (cs4786) lecture 5 · 2017. 9. 7. · k-means: pitfalls • looks...
TRANSCRIPT
![Page 1: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/1.jpg)
Machine Learning for Data Science (CS4786)Lecture 5
Gaussian Mixture Models
Course Webpage :http://www.cs.cornell.edu/Courses/cs4786/2017fa/
![Page 2: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/2.jpg)
Clustering demo
![Page 3: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/3.jpg)
Two elongated ellipses
![Page 4: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/4.jpg)
Iris dataset: Flowers
Iris-Setosa Iris-versicolor Iris-virginica
![Page 5: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/5.jpg)
Smiley dataset
![Page 6: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/6.jpg)
Wisconsin Breast Cancer dataset
![Page 7: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/7.jpg)
K-means: pitfalls
![Page 8: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/8.jpg)
K-means: pitfalls
![Page 9: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/9.jpg)
K-means: pitfalls
![Page 10: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/10.jpg)
K-means: pitfalls
![Page 11: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/11.jpg)
K-means: pitfalls
![Page 12: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/12.jpg)
K-means: pitfalls
![Page 13: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/13.jpg)
K-means: pitfalls
![Page 14: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/14.jpg)
K-means: pitfalls
![Page 15: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/15.jpg)
K-means: pitfalls
![Page 16: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/16.jpg)
K-means: pitfalls
![Page 17: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/17.jpg)
K-means: pitfalls
![Page 18: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/18.jpg)
K-means: pitfalls
• Looks for spherical clusters
• Of same radius
• And with roughly equal number of points
![Page 19: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/19.jpg)
• Can we design algorithm that can address these shortcomings?
K-means: pitfalls
![Page 20: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/20.jpg)
Ellipsoid
⌃ =1
|Cj |X
t2Cj
(xt � rj)(xt � rj)>
rjx
(x� rj)>⌃�1(x� rj)
![Page 21: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/21.jpg)
t
K-MEANS CLUSTERING
For all j ∈ [K], initialize cluster centroids r
0j randomly and set m = 1
Repeat until convergence (or until patience runs out)1 For each t ∈ {1, . . . ,n}, set cluster identity of the point
cm(xt) = argminj∈[K]
�xt − r
m−1j �
2 For each j ∈ [K], set new representative as
r
mj = 1�Cm
j � �xt∈Cmj
xt
3 m← m + 1
![Page 22: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/22.jpg)
ELLIPSOIDAL CLUSTERING
For all j ∈ [K], initialize cluster centroids r
0j and ellipsoids ⌃0
jrandomly and set m = 1Repeat until convergence (or until patience runs out)
1 For each t ∈ {1, . . . ,n}, set cluster identity of the point
cm(xt) = argminj∈[K]
(xt − r
m−1j )� �⌃m−1�−1 (xt − r
m−1j )
2 For each j ∈ [K], set new representative as
r
mj = 1�Cm
j � �xt∈Cmj
xt ⌃m = 1�Cj� �t∈Cj
(xt − r
mj )(xt − r
mj )�
3 m← m + 1
![Page 23: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/23.jpg)
K-means: pitfalls
• Looks for spherical clusters
• Of same radius
• And with roughly equal number of points
![Page 24: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/24.jpg)
HARD GAUSSIAN MIXTURE MODEL
For all j ∈ [K], initialize cluster centroids r
0j , ellipsoids ⌃0
j andinitial proportions ⇡0 randomly and set m = 1Repeat until convergence (or until patience runs out)
1 For each t ∈ {1, . . . ,n}, set cluster identity of the point
cm(xt) = argminj∈[K]
(xt − r
m−1j )� �⌃m−1�−1 (xt − r
m−1j ) − log(⇡m−1
j )2 For each j ∈ [K], set new representative as
r
mj = 1�Cm
j � �xt∈Cmj
xt ⌃m = 1�Cj� �t∈Cj
(xt − r
mj )(xt − r
mj )� ⇡m
j = �Cmj �
n
3 m← m + 1
![Page 25: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/25.jpg)
Multivariate Gaussian• Two parameters:
• Mean
• Covariance matrix of size dxd
µ 2 Rd
⌃
p(x;µ,⌃) = (2⇡)
d/2det(⌃)
�1/2exp
✓�1
2
(x� µ)
>⌃
�1(x� µ)
◆
![Page 26: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/26.jpg)
Multivariate Gaussian• Two parameters:
• Mean
• Covariance matrix of size dxd
µ 2 Rd
⌃
32
10
x
-1
exp(-(10 x2 +y2)/2)
-2-3-3
-2
-1
0
y
1
2
3
1
0.8
0.6
0.4
0.2
0
p(x;µ,⌃) = (2⇡)
d/2det(⌃)
�1/2exp
✓�1
2
(x� µ)
>⌃
�1(x� µ)
◆
![Page 27: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/27.jpg)
HARD GAUSSIAN MIXTURE MODEL
For all j ∈ [K], initialize cluster centroids r
0j , ellipsoids ⌃0
j andinitial proportions ⇡0 randomly and set m = 1Repeat until convergence (or until patience runs out)
1 For each t ∈ {1, . . . ,n}, set cluster identity of the point
cm(xt) = argminj∈[K]
p(xt, rm−1j , ⌃m−1) × ⇡m(j)
2 For each j ∈ [K], set new representative as
r
mj = 1�Cm
j � �xt∈Cmj
xt ⌃m = 1�Cj� �t∈Cj
(xt − r
mj )(xt − r
mj )� ⇡m
j = �Cmj �
n
3 m← m + 1
![Page 28: Machine Learning for Data Science (CS4786) Lecture 5 · 2017. 9. 7. · K-means: pitfalls • Looks for spherical clusters • Of same radius • And with roughly equal number of](https://reader035.vdocuments.site/reader035/viewer/2022081407/604f079c1718b70df7080c18/html5/thumbnails/28.jpg)
(SOFT) GAUSSIAN MIXTURE MODEL
For all j ∈ [K], initialize cluster centroids r
0j and ellipsoids ⌃0
jrandomly and set m = 1Repeat until convergence (or until patience runs out)
1 For each t ∈ {1, . . . ,n}, set cluster identity of the point
Qmt (j) = p(xt, r
m−1j , ⌃m−1) × ⇡m(j)
2 For each j ∈ [K], set new representative as
r
mj = ∑
nt=1 Qt(j)xt∑n
t=1 Qt(j) ⌃m = ∑nt=1 Qt(j)(xt − r
mj )(xt − r
mj )�
∑nt=1 Qt(j)
⇡mj = ∑
nt=1 Qt(j)
n
3 m← m + 1