CLUSTERING: DENSITY-BASED METHODS
Elsayed Hemayed
Data Mining Course
Outline
Density-based Clustering Methods
Density-Based Clustering Methods
  Density-Based Clustering: Background
  Terminology
  How does DBSCAN find clusters?
  DBSCAN
Clustering Methods
Partitioning methods
  K-Means
Hierarchical methods
  Agglomerative hierarchical clustering
  Divisive hierarchical clustering
Density-based methods
  DBSCAN: Density-Based Spatial Clustering of Applications with Noise
Grid-based methods
  STING: A Statistical Information Grid Approach to Spatial Data Mining
Model-based methods
  Expectation-Maximization
  Neural Network Approach
High-dimensional data clustering
  CLIQUE: A Dimension-Growth Subspace Clustering Method
DBSCAN
Density-Based Clustering Methods
Clusters are defined by density (for example, as sets of density-connected points) rather than directly by a distance criterion.
Cluster = set of “density-connected” points.
Major features:
  Discover clusters of arbitrary shape
  Handle noise
  Need “density parameters” as a termination condition (a cluster stops growing when no new objects can be added to it)
Examples:
  DBSCAN (Ester et al., 1996)
  OPTICS (Ankerst et al., 1999)
  DENCLUE (Hinneburg & Keim, 1998)
Density-Based Clustering: Background
Eps-neighborhood: the neighborhood within a radius Eps of a given object.
MinPts: the minimum number of points required in the Eps-neighborhood of an object.
Core object: an object whose Eps-neighborhood contains at least MinPts points.
Directly density-reachable: a point p is directly density-reachable from a point q wrt. Eps, MinPts if
  1) p is within the Eps-neighborhood of q, and
  2) q is a core object.
[Figure: points p and q, with MinPts = 5 and Eps = 1]
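The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample coordinates are hypothetical, distance is taken to be Euclidean, and a point is counted as a member of its own Eps-neighborhood (a common convention, but one the slides do not state).

```python
import math

def eps_neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i].
    Assumption: the point itself is included in its own neighborhood."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def is_core(points, i, eps, min_pts):
    """A core object has at least min_pts points in its Eps-neighborhood."""
    return len(eps_neighborhood(points, i, eps)) >= min_pts

def directly_density_reachable(points, p, q, eps, min_pts):
    """True if points[p] is directly density-reachable from points[q]:
    1) p lies in the Eps-neighborhood of q, and 2) q is a core object."""
    return p in eps_neighborhood(points, q, eps) and is_core(points, q, eps, min_pts)

# Hypothetical sample data: four points on a line plus one far-away point.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
EPS, MIN_PTS = 1.0, 3

print(eps_neighborhood(points, 1, EPS))                        # neighbors of point 1
print(is_core(points, 1, EPS, MIN_PTS))                        # point 1 is core
print(directly_density_reachable(points, 0, 1, EPS, MIN_PTS))  # 0 from 1
print(directly_density_reachable(points, 1, 0, EPS, MIN_PTS))  # 1 from 0
```

Note that direct density-reachability is not symmetric in this data: point 0 is directly density-reachable from the core point 1, but not the other way round, because point 0 is not a core object.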
Density Reachability and Density Connectivity
M, P, O and R are core objects, since the Eps-neighborhood of each contains at least 3 points.
[Figure: MinPts = 3; Eps = radius of the circles]
Directly density-reachable
Q is directly density-reachable from M.
M is directly density-reachable from P and vice versa.
Indirectly density-reachable
Q is indirectly density-reachable from P, since Q is directly density-reachable from M and M is directly density-reachable from P.
However, P is not density-reachable from Q, since Q is not a core object.
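The asymmetry described above can be checked with a small sketch. The point names P, M and Q follow the slide, but the coordinates, the extra point A, and the function names are hypothetical and chosen only so that P and M are core objects and Q is not (with MinPts = 3, Eps = 1, Euclidean distance, and each point counted in its own neighborhood).

```python
import math
from collections import deque

def neighbors(points, name, eps):
    """Names of all points within eps of the named point (itself included)."""
    return {other for other, xy in points.items()
            if math.dist(points[name], xy) <= eps}

def density_reachable(points, target, source, eps, min_pts):
    """True if `target` is density-reachable from `source`, i.e. there is a
    chain of directly density-reachable steps starting at `source`."""
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        nbrs = neighbors(points, current, eps)
        if len(nbrs) < min_pts:
            continue  # not a core object: the chain cannot continue through it
        if target in nbrs:
            return True
        for n in nbrs - seen:
            seen.add(n)
            queue.append(n)
    return False

# Hypothetical layout: A - P - M - Q on a line, one unit apart.
points = {"A": (-1, 0), "P": (0, 0), "M": (1, 0), "Q": (2, 0)}
EPS, MIN_PTS = 1.0, 3

print(density_reachable(points, "Q", "P", EPS, MIN_PTS))  # Q from P, via M
print(density_reachable(points, "P", "Q", EPS, MIN_PTS))  # P from Q
```

In this layout Q is reached from P through the core object M, while the search starting at Q stops immediately because Q has only two points in its Eps-neighborhood.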
Core, border, and noise points
DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise.
Density = number of points within a specified radius (Eps).
A point is a core point if it has at least MinPts points within Eps. Core points lie in the interior of a cluster.
A border point has fewer than MinPts points within Eps, but is in the neighborhood of a core point.
A noise point is any point that is neither a core point nor a border point.
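A minimal sketch of this three-way classification follows. The function name `classify_points` and the sample data are hypothetical, and the sketch assumes Euclidean distance and that each point counts toward its own Eps-neighborhood.

```python
import math

def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border', or 'noise'."""
    n = len(points)
    # Eps-neighborhood of each point (naive all-pairs comparison).
    nbrs = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
            for i in range(n)]
    core = {i for i in range(n) if len(nbrs[i]) >= min_pts}
    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nbrs[i]):
            labels.append("border")  # not core, but within Eps of a core point
        else:
            labels.append("noise")
    return labels

# Hypothetical sample data: four points on a line plus one isolated point.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
print(classify_points(points, eps=1.0, min_pts=3))
# → ['border', 'core', 'core', 'border', 'noise']
```

The two end points of the line have only two points within Eps, so they are border points; the isolated point has no core point nearby and is noise.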
How does DBSCAN find clusters?
DBSCAN searches for clusters by checking the Eps-neighborhood of each point in the database.
If the Eps-neighborhood of a point p contains at least MinPts points, a new cluster with p as a core object is created.
DBSCAN then iteratively collects directly density-reachable objects from these core objects, which may involve merging a few density-reachable clusters.
The process terminates when no new point can be added to any cluster.
DBSCAN Algorithm
Arbitrarily select a point p.
Retrieve all points density-reachable from p wrt. Eps and MinPts.
If p is a core point, a cluster is formed.
If p is a border point, no points are density-reachable from p, and DBSCAN visits the next point of the database.
Continue the process until all of the points have been processed.
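The steps above can be sketched as a short, unoptimized Python function. This is an illustrative sketch rather than a reference implementation. It assumes Euclidean distance, counts each point in its own Eps-neighborhood, uses a naive neighborhood search (quadratic in the number of points), and marks non-core points as noise at first, relabeling them as border points if a later cluster reaches them. The names `dbscan`, `region_query` and `NOISE` are hypothetical.

```python
import math

NOISE = -1  # label for points that end up in no cluster

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    n = len(points)
    labels = [None] * n  # None means "not yet visited"

    def region_query(i):
        # Eps-neighborhood of point i, including i itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster_id = 0
    for i in range(n):
        if labels[i] is not None:
            continue                      # already processed
        neighbors = region_query(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE             # not core; may become border later
            continue
        labels[i] = cluster_id            # i is a core point: start a cluster
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):             # grow the cluster from its seeds
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster_id    # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(j)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors) # j is core: keep expanding through it
        cluster_id += 1
    return labels

# Hypothetical sample data: two dense groups on a line and one isolated point.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (20, 0), (21, 0), (22, 0)]
print(dbscan(points, eps=1.0, min_pts=3))
# → [0, 0, 0, 0, -1, 1, 1, 1]
```

In practice the neighborhood search is usually backed by a spatial index so that each query does not scan every point.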
DBSCAN Summary
DBSCAN is a density-based clustering method based on connected regions with sufficiently high density.
The algorithm grows regions with sufficiently high density into clusters and discovers clusters of arbitrary shape in spatial databases with noise.
It defines a cluster as a maximal set of density-connected points, so cluster membership is decided by density-connectivity rather than by a distance measure between clusters, as in hierarchical methods.
Summary
Density-Based Clustering Methods
  Density-Based Clustering: Background
  Terminology
  How does DBSCAN find clusters?
  DBSCAN