![Page 1: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/1.jpg)
PARTITIONAL CLUSTERING
Deniz ÜSTÜN
![Page 2: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/2.jpg)
CONTENT
WHAT IS CLUSTERING ?
WHAT IS PARTITIONAL CLUSTERING ?
THE ALGORITHMS USED IN PARTITIONAL CLUSTERING
![Page 3: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/3.jpg)
What is Clustering? Clustering is the process of classifying objects that are similar to one another and of organizing the data into groups.
Clustering techniques are among the unsupervised learning methods.
![Page 4: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/4.jpg)
What is Partitional Clustering? Partitional clustering algorithms separate similar objects into clusters.
Partitional clustering algorithms are successful at determining center-based clusters.
Partitional clustering algorithms divide n objects into k clusters, where k is a user-supplied parameter.
Partitional clustering techniques start from a randomly chosen clustering and then optimize it according to some accuracy measure.
![Page 5: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/5.jpg)
The Algorithms Used in Partitional Clustering
K-MEANS ALGORITHM
K-MEDOIDS ALGORITHM
FUZZY C-MEANS ALGORITHM
![Page 6: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/6.jpg)
K-MEANS ALGORITHM
The K-MEANS algorithm was introduced by J.B. MacQueen in 1967 as one of the simplest unsupervised learning algorithms for solving clustering problems (MacQueen, 1967).
The K-MEANS algorithm allows each data point to belong to exactly one cluster.
Therefore, it is a hard (crisp) clustering algorithm.
Suppose N samples are given in an n-dimensional space.
![Page 7: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/7.jpg)
K-MEANS ALGORITHM
This space is partitioned into K clusters {C1, C2, ..., CK}. The mean vector $M_k$ of cluster $C_k$ is given as (Kantardzic, 2003):

$$M_k = \frac{1}{n_k} \sum_{i=1}^{n_k} X_{ik}$$

where $X_{ik}$ is the i-th sample belonging to $C_k$.
The square error for $C_k$ is given as:

$$e_k^2 = \sum_{i=1}^{n_k} \lVert X_{ik} - M_k \rVert^2$$
![Page 8: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/8.jpg)
K-MEANS ALGORITHM The square error for $C_k$ is called the within-cluster variation. The square error over all the clusters is the sum of the within-cluster variations:

$$E_K^2 = \sum_{k=1}^{K} e_k^2$$

The aim of the square-error method is to find, for the given value of K, the K clusters that minimize the value of $E_K^2$.
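The mean update and the square-error objective above can be sketched in a few lines of Python (a minimal illustration of the procedure on the slides; `kmeans` and `squared_error` are my own names, and the initial means are supplied by the caller rather than chosen at random):

```python
def squared_error(points, mean):
    """e_k^2: sum of squared distances of a cluster's points to its mean M_k."""
    return sum((x - mean[0]) ** 2 + (y - mean[1]) ** 2 for x, y in points)

def kmeans(points, means):
    """Assign points to the nearest mean, recompute means, repeat until stable."""
    labels = None
    while True:
        new_labels = [
            min(range(len(means)),
                key=lambda k: (x - means[k][0]) ** 2 + (y - means[k][1]) ** 2)
            for x, y in points
        ]
        if new_labels == labels:  # memberships unchanged: converged
            break
        labels = new_labels
        for k in range(len(means)):  # recompute M_k over cluster C_k
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                means[k] = (sum(x for x, _ in members) / len(members),
                            sum(y for _, y in members) / len(members))
    total = sum(squared_error([p for p, lab in zip(points, labels) if lab == k],
                              means[k]) for k in range(len(means)))
    return means, labels, total  # total is E^2, the sum of all e_k^2
```

On the three-point example worked out later in the deck, `kmeans([(3, 2), (2, 3), (7, 8)], [(5.0, 5.0), (2.0, 3.0)])` ends with means (7, 8) and (2.5, 2.5).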
![Page 9: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/9.jpg)
K-MEANS ALGORITHM EXAMPLE-1

[Figure: the three observations plotted on a grid from 0 to 10.]

| Observation | Variable 1 | Variable 2 | Cluster Membership |
| --- | --- | --- | --- |
| X1 | 3 | 2 | C1 |
| X2 | 2 | 3 | C2 |
| X3 | 7 | 8 | C1 |

$$M_1 = \left(\frac{3+7}{2}, \frac{2+8}{2}\right) = (5, 5) \qquad M_2 = (2, 3)$$
![Page 10: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/10.jpg)
K-MEANS ALGORITHM EXAMPLE-1

$$e_1^2 = (3-5)^2 + (2-5)^2 + (7-5)^2 + (8-5)^2 = 26$$

$$e_2^2 = (2-2)^2 + (3-3)^2 = 0$$

$$E^2 = e_1^2 + e_2^2 = 26 + 0 = 26$$
![Page 11: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/11.jpg)
K-MEANS ALGORITHM EXAMPLE-1

$$d(M_1, X_1) = \sqrt{(5-3)^2 + (5-2)^2} = 3.61 \qquad d(M_2, X_1) = \sqrt{(2-3)^2 + (3-2)^2} = 1.41$$

$$d(M_1, X_2) = \sqrt{(5-2)^2 + (5-3)^2} = 3.61 \qquad d(M_2, X_2) = \sqrt{(2-2)^2 + (3-3)^2} = 0$$

$$d(M_1, X_3) = \sqrt{(5-7)^2 + (5-8)^2} = 3.61 \qquad d(M_2, X_3) = \sqrt{(2-7)^2 + (3-8)^2} = 7.07$$

| Observation | d(M1) | d(M2) | Cluster Membership |
| --- | --- | --- | --- |
| X1 | 3.61 | 1.41 | C2 |
| X2 | 3.61 | 0 | C2 |
| X3 | 3.61 | 7.07 | C1 |

Since $d(M_2, X_1) < d(M_1, X_1)$, X1 is reassigned to cluster C2.
![Page 12: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/12.jpg)
K-MEANS ALGORITHM EXAMPLE-1

[Figure: the updated clusters plotted on a grid from 0 to 10.]

| Observation | Variable 1 | Variable 2 | Cluster Membership |
| --- | --- | --- | --- |
| X1 | 3 | 2 | C2 |
| X2 | 2 | 3 | C2 |
| X3 | 7 | 8 | C1 |

$$M_1 = (7, 8) \qquad M_2 = \left(\frac{3+2}{2}, \frac{2+3}{2}\right) = (2.5, 2.5)$$
![Page 13: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/13.jpg)
K-MEANS ALGORITHM EXAMPLE-1

$$e_1^2 = (7-7)^2 + (8-8)^2 = 0$$

$$e_2^2 = (3-2.5)^2 + (2-2.5)^2 + (2-2.5)^2 + (3-2.5)^2 = 1$$

$$E^2 = e_1^2 + e_2^2 = 0 + 1 = 1$$
![Page 14: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/14.jpg)
K-MEANS ALGORITHM EXAMPLE-1

$$d(M_1, X_1) = \sqrt{(7-3)^2 + (8-2)^2} = 7.21 \qquad d(M_2, X_1) = \sqrt{(2.5-3)^2 + (2.5-2)^2} = 0.71$$

$$d(M_1, X_2) = \sqrt{(7-2)^2 + (8-3)^2} = 7.07 \qquad d(M_2, X_2) = \sqrt{(2.5-2)^2 + (2.5-3)^2} = 0.71$$

$$d(M_1, X_3) = \sqrt{(7-7)^2 + (8-8)^2} = 0 \qquad d(M_2, X_3) = \sqrt{(2.5-7)^2 + (2.5-8)^2} = 7.11$$

| Observation | d(M1) | d(M2) | Cluster Membership |
| --- | --- | --- | --- |
| X1 | 7.21 | 0.71 | C2 |
| X2 | 7.07 | 0.71 | C2 |
| X3 | 0 | 7.11 | C1 |

The memberships are unchanged, so the algorithm stops.

[Figure: the final clusters C1 and C2 plotted on a grid from 0 to 10.]
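As a cross-check, the distance tables of the worked example can be recomputed with a short Python snippet (a throwaway sketch; `dist` and the table variables are my own names):

```python
import math

def dist(m, x):
    """Euclidean distance d(M, X) between a center and an observation."""
    return math.sqrt((m[0] - x[0]) ** 2 + (m[1] - x[1]) ** 2)

X1, X2, X3 = (3, 2), (2, 3), (7, 8)

# Iteration 1: M1 = (5, 5), M2 = (2, 3); X1 is closer to M2, so it moves to C2.
M1, M2 = (5.0, 5.0), (2.0, 3.0)
table1 = {x: (round(dist(M1, x), 2), round(dist(M2, x), 2)) for x in (X1, X2, X3)}

# Iteration 2: M1 = (7, 8), M2 = (2.5, 2.5); memberships stay fixed, so stop.
M1, M2 = (7.0, 8.0), (2.5, 2.5)
table2 = {x: (round(dist(M1, x), 2), round(dist(M2, x), 2)) for x in (X1, X2, X3)}
```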
![Page 15: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/15.jpg)
K-MEANS ALGORITHM EXAMPLE-2

| Dataset | Number of Instances | Number of Features | Number of Classes |
| --- | --- | --- | --- |
| Synthetic | 1200 | 2 | 4 |

[Figure: the synthetic dataset plotted on the unit square.]
![Page 16: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/16.jpg)
K-MEANS ALGORITHM EXAMPLE-2

K=2

[Figure: the clustering result for K=2 on the unit square.]
![Page 17: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/17.jpg)
K-MEANS ALGORITHM EXAMPLE-2

K=3

[Figure: the clustering result for K=3 on the unit square.]
![Page 18: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/18.jpg)
K-MEANS ALGORITHM EXAMPLE-2

K=4

[Figure: the clustering result for K=4 on the unit square.]
![Page 19: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/19.jpg)
K-MEDOIDS ALGORITHM The aim of the K-MEDOIDS algorithm is to find K representative objects (Kaufman and Rousseeuw, 1987).
Each cluster in the K-MEDOIDS algorithm is represented by an actual object in the cluster.
The K-MEANS algorithm determines the clusters by a mean operation; the K-MEDOIDS algorithm instead finds each cluster's most centrally located object, the medoid $O_k$:

$$e_k^2 = \sum_{i=1}^{n_k} \lVert X_{ik} - O_k \rVert^2$$
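A minimal sketch of this procedure in Python (my own illustrative names; the medoid update is an exhaustive search within each cluster, as on the slides that follow, rather than the PAM swap heuristic):

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.dist(a, b)

def kmedoids(points, medoids):
    """Alternate assignment and medoid update until the medoids stabilize."""
    while True:
        # Allocate each point to the closest medoid.
        clusters = [[] for _ in medoids]
        for p in points:
            k = min(range(len(medoids)), key=lambda k: dist(p, medoids[k]))
            clusters[k].append(p)
        # Determine the new medoid of each cluster: the member that
        # minimizes the sum of distances to the other members.
        new_medoids = [
            min(c, key=lambda m: sum(dist(m, q) for q in c)) for c in clusters
        ]
        if new_medoids == medoids:  # medoids unchanged: stop the process
            return medoids, clusters
        medoids = new_medoids
```

Because each representative is a data point, the result is less sensitive to outliers than a mean-based center, at the cost of the per-cluster search.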
![Page 20: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/20.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-1

[Figure: the example dataset.]
![Page 21: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/21.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-1

Randomly select K medoids.
![Page 22: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/22.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-1

Allocate each point to the closest medoid.
![Page 23: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/23.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-1

Allocate each point to the closest medoid.
![Page 24: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/24.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-1

Allocate each point to the closest medoid.
![Page 25: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/25.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-1

Determine the new medoid of each cluster.
![Page 26: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/26.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-1

Determine the new medoid of each cluster.
![Page 27: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/27.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-1

Allocate each point to the closest medoid.
![Page 28: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/28.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-1

Stop the process.
![Page 29: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/29.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-2

| Dataset | Number of Instances | Number of Features | Number of Classes |
| --- | --- | --- | --- |
| Synthetic | 2000 | 2 | 3 |

[Figure: the synthetic dataset plotted on the unit square.]
![Page 30: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/30.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-2

K=2

[Figure: the clustering result for K=2 on the unit square.]
![Page 31: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/31.jpg)
K-MEDOIDS ALGORITHM EXAMPLE-2

K=3

[Figure: the clustering result for K=3 on the unit square.]
![Page 32: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/32.jpg)
FUZZY C-MEANS ALGORITHM The Fuzzy C-MEANS algorithm is the best known and most widely used fuzzy clustering method.
The Fuzzy C-MEANS algorithm was introduced by Dunn in 1973 and improved by Bezdek in 1981 [Höppner vd, 2000].
Fuzzy C-MEANS lets objects belong to two or more clusters.
The total membership value of a data point over all the clusters is equal to one.
However, the membership value for the cluster that contains the object is higher than for the other clusters.
The algorithm uses the least-squares method [Höppner vd, 2000].
![Page 33: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/33.jpg)
FUZZY C-MEANS ALGORITHM

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \lVert x_i - c_j \rVert^2, \qquad 1 \le m < \infty$$

The algorithm starts from a randomly initialized membership matrix (U) and then calculates the center vectors [Höppner vd, 2000]:

$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} x_i}{\sum_{i=1}^{N} u_{ij}^{m}}$$
![Page 34: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/34.jpg)
FUZZY C-MEANS ALGORITHM According to the calculated center vectors, the membership matrix (U) is recomputed as:

$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{2/(m-1)}}$$

The new membership matrix (U_new) is compared with the old membership matrix (U_old), and the process continues until the difference between them is smaller than the value of ε.
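The whole loop, center update followed by membership update until U stabilizes, can be sketched in Python (one-dimensional data for brevity; `fcm` and its defaults are my own choices, and coincident points and centers are handled crudely with a small distance floor rather than the textbook special case):

```python
import random

def fcm(xs, c, m=2.0, eps=1e-6, seed=0):
    """Fuzzy C-MEANS on scalar data xs with c clusters and fuzzifier m."""
    rng = random.Random(seed)
    # Random membership matrix U: each row sums to one.
    u = []
    for _ in xs:
        row = [rng.random() for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    while True:
        # Center update: c_j = sum_i u_ij^m x_i / sum_i u_ij^m
        centers = [
            sum(u[i][j] ** m * xs[i] for i in range(len(xs)))
            / sum(u[i][j] ** m for i in range(len(xs)))
            for j in range(c)
        ]
        # Membership update: u_ij = 1 / sum_k (|x_i - c_j| / |x_i - c_k|)^(2/(m-1))
        new_u = []
        for x in xs:
            d = [abs(x - cj) or 1e-12 for cj in centers]  # avoid division by zero
            new_u.append([
                1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for j in range(c)
            ])
        # Stop once U_new differs from U_old by less than eps.
        diff = max(abs(new_u[i][j] - u[i][j])
                   for i in range(len(xs)) for j in range(c))
        u = new_u
        if diff < eps:
            return centers, u
```

On two well-separated scalar groups, the centers settle near the group means, while every row of U still sums to one.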
![Page 35: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/35.jpg)
FUZZY C-MEANS ALGORITHM EXAMPLE

| Dataset | Number of Instances | Number of Features | Number of Classes |
| --- | --- | --- | --- |
| Synthetic | 2000 | 2 | 3 |

[Figure: the synthetic dataset plotted on the unit square.]
![Page 36: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/36.jpg)
FUZZY C-MEANS ALGORITHM EXAMPLE

[Figure: the resulting fuzzy clustering on the unit square.]

C = 3, m = 5, ε = 1e-6
![Page 37: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/37.jpg)
Results

Compared with K-MEANS and FUZZY C-MEANS, the K-MEDOIDS algorithm gives the best clustering results.
However, the K-MEDOIDS algorithm is suitable only for small datasets.
The K-MEANS algorithm is the most appropriate in terms of running time.
In the FUZZY C-MEANS algorithm, an object can belong to one or more clusters.
However, an object can belong to only one cluster in the other two algorithms.
![Page 38: PARTITIONAL CLUSTERING](https://reader035.vdocuments.site/reader035/viewer/2022081422/56815d8d550346895dcb9bb5/html5/thumbnails/38.jpg)
References

- [MacQueen, 1967] J.B. MacQueen, "Some Methods for Classification and Analysis of Multivariate Observations", Proc. Symp. Math. Statist. and Probability (5th), 281-297, (1967).
- [Kantardzic, 2003] M. Kantardzic, "Data Mining: Concepts, Methods and Algorithms", Wiley, (2003).
- [Kaufman and Rousseeuw, 1987] L. Kaufman, P.J. Rousseeuw, "Clustering by Means of Medoids", in Statistical Data Analysis Based on the L1-Norm and Related Methods, edited by Y. Dodge, North-Holland, 405-416, (1987).
- [Kaufman and Rousseeuw, 1990] L. Kaufman, P.J. Rousseeuw, "Finding Groups in Data: An Introduction to Cluster Analysis", John Wiley and Sons, (1990).
- [Höppner vd, 2000] F. Höppner, F. Klawonn, R. Kruse, T. Runkler, "Fuzzy Cluster Analysis", John Wiley & Sons, Chichester, (2000).
- [Işık and Çamurcu, 2007] M. Işık, A.Y. Çamurcu, "K-MEANS, K-MEDOIDS ve Bulanık C-MEANS Algoritmalarının Uygulamalı olarak Performanslarının Tespiti" [Applied Performance Evaluation of the K-MEANS, K-MEDOIDS and Fuzzy C-MEANS Algorithms], İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi, No. 11, 31-45, (2007).