Fast modified global k-means algorithm for incremental cluster construction
Adil M. Bagirov, Julien Ugon, Dean Webb. Pattern Recognition (PR), 2011. Presented by Wen-Chung Liao, 2011/01/05.
![Page 1: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/1.jpg)
Intelligent Database Systems Lab
National Yunlin University of Science and Technology
Fast modified global k-means algorithm for incremental cluster construction
Adil M. Bagirov, Julien Ugon, Dean Webb. Pattern Recognition (PR), 2011
Presented by Wen-Chung Liao, 2011/01/05
![Page 2: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/2.jpg)
Outline
Motivation
Objectives
Methodology
Experiments
Conclusions
Comments
![Page 3: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/3.jpg)
Motivation
The global k-means algorithm and the modified global k-means algorithm are incremental clustering algorithms.
─ They allow one to find a global or near-global minimizer of the cluster (or error) function.
However, these algorithms are memory demanding.
─ They require the storage of the affinity matrix.
Alternatively, this matrix can be recomputed at each iteration; however, this increases the computational time significantly.
![Page 4: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/4.jpg)
Objectives
A new version of the modified global k-means algorithm is proposed:
─ An auxiliary cluster function is applied to generate a set of starting points lying in different parts of the dataset.
─ The best solution is selected as the starting point for the next cluster center.
─ Information gathered in previous iterations of the incremental algorithm is used to avoid computing the whole affinity matrix.
─ The triangle inequality for distances is used to avoid unnecessary computations.
![Page 5: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/5.jpg)
Methodology
Modified global k-means algorithm [1]
─ Starts with the computation of one cluster center and attempts to optimally add one new cluster center at each iteration.
─ An auxiliary cluster function uses the k-1 cluster centers from the (k-1)-th iteration to compute the starting point for the k-th center.
─ The k-means algorithm is applied starting from this point to find the k-partition of the dataset.
Fast modified global k-means algorithm
─ An auxiliary cluster function is used to generate a set of starting points.
─ The best solution is selected.
─ Computing the whole affinity matrix is avoided.
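The incremental scheme described on this slide can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the starting point for each new center is found by a brute-force search over the data points (an assumption made here for brevity), and the names `incremental_kmeans`, `kmeans`, `mean` and `sq_dist` are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans(data, centers, max_iter=100):
    """Plain k-means (Lloyd) iterations starting from the given centers."""
    centers = list(centers)
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for a in data:
            j = min(range(len(centers)), key=lambda j: sq_dist(a, centers[j]))
            groups[j].append(a)
        # An empty cluster keeps its previous center.
        new = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    return centers

def incremental_kmeans(data, k_max):
    """Add one center per iteration, reusing the previous k-1 centers."""
    centers = [mean(data)]  # the 1-partition problem is solved by the centroid
    for _ in range(2, k_max + 1):
        # Squared distance of each point to its nearest existing center.
        d_prev = [min(sq_dist(a, c) for c in centers) for a in data]

        def aux(y):
            # Auxiliary cluster function (up to a constant factor):
            # only the new center y is free, the old centers stay fixed.
            return sum(min(d, sq_dist(y, a)) for a, d in zip(data, d_prev))

        # Assumption: candidates are restricted to the data points themselves.
        start = min(data, key=aux)
        centers = kmeans(data, centers + [start])
    return centers

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = incremental_kmeans(data, 2)
```

On this toy dataset the two centers end up at the centroids of the two well-separated groups. The brute-force search evaluates all pairwise distances, which is exactly the cost the fast variant tries to avoid.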
![Page 6: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/6.jpg)
Modified global k-means algorithm
Cluster function
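The formula itself did not survive in this transcript. A common way to write the cluster (error) function for k centers is given below, as an assumption rather than a quotation from the slide: A = {a^1, ..., a^m} denotes the dataset, x^1, ..., x^k the cluster centers, and distances are squared Euclidean. The 1/m normalization is consistent with the experiments slide, which reports the function value multiplied by m.

```latex
f_k(x^1,\dots,x^k) \;=\; \frac{1}{m}\sum_{i=1}^{m}\,\min_{j=1,\dots,k}\,\lVert x^j - a^i\rVert^2
```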
![Page 7: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/7.jpg)
![Page 8: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/8.jpg)
Modified global k-means algorithm
Auxiliary cluster function:
x1, ..., x(k-1): the solution to the (k-1)-partition problem
[Figure: centers x1, x2, x3, a point y, and the set S(y)]
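A minimal sketch of the auxiliary cluster function follows, under assumed definitions that are not spelled out in this transcript: d_i is the squared distance from data point a_i to its nearest center among the k-1 existing ones, the auxiliary function at y is the mean of min(d_i, ||y - a_i||^2), and S(y) is the set of points that lie closer to y than to their current center. All function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_sq_dists(data, centers):
    """d_i: squared distance from each point to its nearest existing center."""
    return [min(sq_dist(a, c) for c in centers) for a in data]

def auxiliary_value(y, data, d_prev):
    """Assumed auxiliary function: mean over i of min(d_i, ||y - a_i||^2)."""
    return sum(min(d, sq_dist(y, a)) for a, d in zip(data, d_prev)) / len(data)

def attracted_set(y, data, d_prev):
    """Assumed S(y): points strictly closer to y than to their current center."""
    return [a for a, d in zip(data, d_prev) if sq_dist(y, a) < d]

data = [(0, 0), (1, 0), (4, 0), (5, 0)]
d_prev = nearest_sq_dists(data, [(0, 0)])       # one existing center at the origin
value = auxiliary_value((5, 0), data, d_prev)   # candidate new center y = (5, 0)
members = attracted_set((5, 0), data, d_prev)
```

Here the candidate y = (5, 0) attracts the two right-hand points, and only those points lower the function value relative to the one-center solution.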
![Page 9: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/9.jpg)
Modified global k-means algorithm
![Page 10: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/10.jpg)
Fast modified global k-means algorithm
[Figure: two panels, u = 0.2 and u = 1.0, each showing centers x1, x2, x3 and a point y, with the sets S0.2(y) and S1.0(y) respectively]
![Page 11: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/11.jpg)
Reduction of computational effort
[Figure: center x1, data points ai and aj, and the set S(ai)]
![Page 12: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/12.jpg)
Reduction of computational effort
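One way the triangle inequality can save distance evaluations is sketched below. It is a generic pruning rule chosen for illustration and may differ from the exact schemes on the original slides. The assumption is that, for each data point a_j, its current center c and the distance r = ||a_j - c|| are already known. Since ||a_i - a_j|| >= ||a_i - c|| - r, a point a_j cannot be closer to a_i than to its own center whenever ||a_i - c|| >= 2r, so the distance ||a_i - a_j|| need not be computed. The function name and argument layout are hypothetical.

```python
import math

def attracted_indices_pruned(i, data, centers, assign, dist_to_center):
    """Indices j with ||a_i - a_j|| < ||a_j - c_j||, skipping provable misses.

    assign[j] is the index of the center of a_j, dist_to_center[j] the
    (non-squared) distance from a_j to that center.
    Returns the member indices and the number of point-to-point distances computed.
    """
    a_i = data[i]
    # k-1 cheap distances from the candidate a_i to the existing centers.
    to_centers = [math.dist(a_i, c) for c in centers]
    members, evaluations = [], 0
    for j, a_j in enumerate(data):
        r = dist_to_center[j]
        # Triangle inequality lower bound: ||a_i - a_j|| >= ||a_i - c|| - r >= r.
        if to_centers[assign[j]] >= 2 * r:
            continue
        evaluations += 1
        if math.dist(a_i, a_j) < r:
            members.append(j)
    return members, evaluations

data = [(0, 0), (1, 0), (10, 0), (11, 0)]
centers = [(0.5, 0), (10.5, 0)]
assign = [0, 0, 1, 1]
dist_to_center = [0.5, 0.5, 0.5, 0.5]
members, evaluations = attracted_indices_pruned(0, data, centers, assign, dist_to_center)
```

In this example the two points belonging to the far cluster are skipped without computing their distance to the candidate, so only two of the four point-to-point distances are evaluated.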
![Page 13: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/13.jpg)
Computational complexity
The modified global k-means algorithm
─ $O(mk^2T + km^2 + kmt)$
The fast modified global k-means algorithm
─ $O(p(mk^2T + km^2 + kmt))$ (without complexity reduction schemes)
─ $O(p(mk^2T + km_1^2 + km_1t))$ (with complexity reduction schemes)
$T$: the number of iterations by Algorithm 2
$t$: the number of iterations by Algorithm 1
$m_1$: the number of data points in the set $P(u) \cap A$, with $m_1 \ll m$
![Page 14: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/14.jpg)
Numerical experiments
k: the number of clusters
fopt: the best known value of the cluster function, multiplied by m
E: the error in %
α: the number of Euclidean norm evaluations
t: the CPU time
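The slide does not define E explicitly. A conventional definition, assumed here, is the relative deviation of the value found by an algorithm from the best known value, where \bar{f} (a symbol introduced only for this illustration) denotes the cluster function value obtained by the algorithm:

```latex
E \;=\; \frac{\bar{f} - f_{\mathrm{opt}}}{f_{\mathrm{opt}}} \times 100\%
```

For example, a found value of 105 against a best known value of 100 would give E = 5%.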
![Page 15: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/15.jpg)
[Figures: the ratios fFMGKM/fGKM and fFMGKM/fMGKM, the number of norm evaluations α, and the CPU time]
![Page 16: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/16.jpg)
Dunn's validity index
The Davies–Bouldin cluster validity measure
• Show a similar pattern.
• Generate similar cluster structures.
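For readers unfamiliar with the two validity measures, a small sketch of common textbook definitions follows. Several variants of both indices exist, and the slides do not say which were used, so the choices here are assumptions: the Dunn index is taken as the smallest distance between points of different clusters divided by the largest cluster diameter (higher is better), and the Davies–Bouldin measure uses the mean distance to the centroid as cluster scatter (lower is better).

```python
import math

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def dunn_index(clusters):
    """Smallest between-cluster point distance over largest cluster diameter."""
    min_between = min(
        math.dist(p, q)
        for i, ci in enumerate(clusters)
        for cj in clusters[i + 1:]
        for p in ci
        for q in cj
    )
    max_diameter = max(math.dist(p, q) for c in clusters for p in c for q in c)
    return min_between / max_diameter

def davies_bouldin(clusters):
    """Mean over clusters of the worst (s_i + s_j) / d(c_i, c_j) ratio."""
    cents = [centroid(c) for c in clusters]
    # Scatter s_i: mean distance of a cluster's points to its centroid.
    scatter = [sum(math.dist(p, z) for p in c) / len(c)
               for c, z in zip(clusters, cents)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k) if j != i
        )
    return total / k

clusters = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
dunn = dunn_index(clusters)
db = davies_bouldin(clusters)
```

Both functions expect at least two clusters, and `dunn_index` additionally needs at least one cluster with two distinct points so that the largest diameter is non-zero.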
![Page 17: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/17.jpg)
Conclusions
Developed a new version of the modified global k-means algorithm
─ Uses the k-1 cluster centers from the previous iteration to solve the k-partition problem.
─ Does not rely on the affinity matrix to compute the starting point.
─ Uses more than one starting point to minimize the auxiliary function.
─ Two schemes reduce the amount of computational effort.
─ There is no guarantee that it will converge to the global solution.
![Page 18: Fast modified global k-means algorithm for incremental cluster construction](https://reader035.vdocuments.site/reader035/viewer/2022062804/56814b54550346895db84cb7/html5/thumbnails/18.jpg)
Comments
Advantages
─ Schemes to reduce computational effort.
Shortcomings
─ Determining the set U is not easy.
Applications
─ Clustering