Graph-based Clustering (CSE 802 lecture slides: cse.msu.edu/~cse802/S17/slides/Lec_20_21_22_Clustering.pdf)



Graph-based Clustering

● Transform the data into a graph representation
  – Vertices are the data points to be clustered
  – Edges are weighted based on the similarity between data points
⇒ Graph partitioning
  – Each connected component is a cluster
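The data-to-graph step is not spelled out on this slide; below is a minimal sketch, assuming a Gaussian (RBF) similarity with a threshold that decides which edges to keep (`sigma` and `threshold` are illustrative parameters, not values from the slides).

```python
import numpy as np

def similarity_graph(X, sigma=1.0, threshold=1e-2):
    """Build a weighted similarity graph from data points.

    Vertices are the rows of X; edge weights are Gaussian similarities,
    and weights below `threshold` are dropped (no edge).
    """
    diff = X[:, None, :] - X[None, :, :]      # pairwise differences, shape (n, n, d)
    dist2 = (diff ** 2).sum(-1)               # squared Euclidean distances
    W = np.exp(-dist2 / (2 * sigma ** 2))     # Gaussian (RBF) similarity
    np.fill_diagonal(W, 0.0)                  # no self-loops
    W[W < threshold] = 0.0                    # sparsify: drop weak edges
    return W

# Two well-separated groups of 2D points -> two connected components
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
W = similarity_graph(X)
```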


Clustering as Graph Partitioning

● Two things are needed:
  1. An objective function to determine the best way to "cut" the edges of the graph
  2. An algorithm to find the optimal partition (optimal according to the objective function)


Objective Function for Partitioning

● Suppose we want to partition the set of vertices V into two sets, V1 and V2
  – One possible objective function is to minimize the graph cut:

$$\mathrm{Cut}(V_1, V_2) = \sum_{i \in V_1,\, j \in V_2} w_{ij}$$

[Figure: two ways to partition a six-node graph (v1, ..., v6) with edge weights between 0.1 and 0.3. Left partition: Cut = 0.2. Right partition: Cut = 0.4]

wij is the weight of the edge between nodes i and j


Objective Function for Partitioning

● Limitation of minimizing graph cut:

– The optimal solution might be to split up a single node from the rest of the graph! Not a desirable solution

[Figure: the minimum-cut solution (Cut = 0.1) splits a single node away from the rest of the six-node graph]


Objective Function for Partitioning

● We should not only minimize the graph cut but also look for "balanced" clusters

$$\text{Ratio cut}(V_1, V_2) = \frac{\mathrm{Cut}(V_1, V_2)}{|V_1|} + \frac{\mathrm{Cut}(V_1, V_2)}{|V_2|}$$

$$\text{Normalized cut}(V_1, V_2) = \frac{\mathrm{Cut}(V_1, V_2)}{d_{V_1}} + \frac{\mathrm{Cut}(V_1, V_2)}{d_{V_2}}, \qquad \text{where } d_{V_i} = \sum_{j \in V_i} d_j,\quad d_j = \sum_k w_{jk}$$

V1 and V2 are the sets of nodes in partitions 1 and 2; |Vi| is the number of nodes in partition Vi
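A small sketch of the three objectives in code. The exact edge weights of the slide's six-node example are not fully recoverable from the transcript, so the weight matrix below is a hypothetical stand-in.

```python
import numpy as np

def cut(W, V1, V2):
    """Total weight of edges crossing between node sets V1 and V2."""
    return W[np.ix_(V1, V2)].sum()

def ratio_cut(W, V1, V2):
    c = cut(W, V1, V2)
    return c / len(V1) + c / len(V2)

def normalized_cut(W, V1, V2):
    d = W.sum(axis=1)                # node degrees d_j = sum_k w_jk
    c = cut(W, V1, V2)
    return c / d[V1].sum() + c / d[V2].sum()

# Hypothetical symmetric weight matrix for a 6-node graph
W = np.array([[0, .3, .1,  0,  0,  0],
              [.3, 0, .2,  0,  0,  0],
              [.1, .2, 0, .1,  0,  0],
              [0,  0, .1,  0, .3, .2],
              [0,  0,  0, .3,  0, .1],
              [0,  0,  0, .2, .1,  0]])
print(cut(W, [0, 1, 2], [3, 4, 5]))             # 0.1
print(ratio_cut(W, [0, 1, 2], [3, 4, 5]))       # 0.1/3 + 0.1/3 ≈ 0.067
print(normalized_cut(W, [0, 1, 2], [3, 4, 5]))  # 0.1/1.3 + 0.1/1.3 ≈ 0.154
```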


Example

Partition that splits off a single node:
  Cut = 0.1
  Ratio cut = 0.1/1 + 0.1/5 = 0.12
  Normalized cut = 0.1/0.1 + 0.1/1.5 = 1.07

Balanced partition:
  Cut = 0.2
  Ratio cut = 0.2/3 + 0.2/3 = 0.13
  Normalized cut = 0.2/1 + 0.2/0.6 = 0.53

Although the balanced partition has a larger cut, it has the smaller ratio cut and normalized cut.

[Figure: the same six-node weighted graph under the two partitions; the balanced partition is preferred by the ratio and normalized cut objectives]


Example

Cut = 1

Ratio cut = 1/1 + 1/5 = 1.2

Normalized cut = 1/1 + 1/9 = 1.11

Cut = 2

Ratio cut = 1/3 + 1/3 = 0.67

Normalized cut = 1/5 + 1/5 = 0.2

[Figure: the same two partitions on an unweighted version of the six-node graph (all edge weights equal to 1)]

If the graph is unweighted (or all edges have the same weight)


Algorithm for Graph Partitioning

● How to minimize the objective function?
  – We can use a heuristic (greedy) approach
    ◦ Example: METIS graph partitioning (http://www.cs.umn.edu/~metis)
  – An elegant way to optimize the function is to use ideas from spectral graph theory
    ◦ This leads to a class of algorithms known as spectral clustering


Spectral Clustering

● Spectral properties of a graph
  – The spectral properties (eigenvalues/eigenvectors) of the adjacency matrix can be used to represent a graph
● There exists a relationship between the spectral properties of a graph and the graph partitioning problem


Spectral Properties of a Graph

● Start with a similarity/adjacency matrix, W, of a graph

● Define a diagonal matrix D

– If W is a binary 0/1 matrix, then Dii represents the degree of node i

$$D_{ij} = \begin{cases} \sum_{k=1}^{n} w_{ik} & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$


Preliminaries

The example graph has two connected components, {v1, v2, v3} (edges v1-v3 and v2-v3) and {v4, v5, v6} (edges v4-v5 and v4-v6), with all edge weights equal to 1. Two clusters give two block-diagonal matrices:

$$W = \begin{bmatrix} 0&0&1&0&0&0\\ 0&0&1&0&0&0\\ 1&1&0&0&0&0\\ 0&0&0&0&1&1\\ 0&0&0&1&0&0\\ 0&0&0&1&0&0 \end{bmatrix}, \qquad D = \begin{bmatrix} 1&0&0&0&0&0\\ 0&1&0&0&0&0\\ 0&0&2&0&0&0\\ 0&0&0&2&0&0\\ 0&0&0&0&1&0\\ 0&0&0&0&0&1 \end{bmatrix}, \qquad D_{ij} = \begin{cases} \sum_{k=1}^{n} w_{ik} & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$


Graph Laplacian Matrix

For the same two-component example graph, the Laplacian is L = D - W:

$$L = D - W = \begin{bmatrix} 1&0&-1&0&0&0\\ 0&1&-1&0&0&0\\ -1&-1&2&0&0&0\\ 0&0&0&2&-1&-1\\ 0&0&0&-1&1&0\\ 0&0&0&-1&0&1 \end{bmatrix}$$

The Laplacian also has a block structure, one block per cluster.
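A short numerical check of this construction with NumPy, using the same two-component example graph (edges v1-v3, v2-v3, v4-v5, v4-v6):

```python
import numpy as np

# Adjacency matrix of the two-component example graph (0-indexed nodes)
W = np.zeros((6, 6))
for i, j in [(0, 2), (1, 2), (3, 4), (3, 5)]:
    W[i, j] = W[j, i] = 1.0

D = np.diag(W.sum(axis=1))          # diagonal degree matrix
L = D - W                           # graph Laplacian

print(np.round(np.linalg.eigvalsh(L), 3))
# Two zero eigenvalues -> two connected components (clusters)
```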


Properties of Graph Laplacian

● L = (D - W) is a symmetric matrix
● L is a positive semi-definite matrix
  – Consequence: all eigenvalues of L are ≥ 0


Spectral Clustering

Consider a data set with N data points:
1. Construct an N × N similarity matrix, W
2. Compute the N × N Laplacian matrix, L = D - W
3. Compute the k "smallest" eigenvectors of L
   a) Each eigenvector vi is an N × 1 column vector
   b) Create a matrix V containing eigenvectors v1, v2, ..., vk as columns (you may exclude the first eigenvector)
4. Cluster the rows of V into K clusters using k-means or another clustering algorithm
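A compact sketch of steps 1-4 (unnormalized spectral clustering), assuming a Gaussian similarity for step 1 and scikit-learn's KMeans for step 4; `sigma` and `k` are illustrative parameters.

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(X, k, sigma=1.0):
    # 1. N x N similarity matrix (Gaussian kernel)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # 2. N x N Laplacian L = D - W
    L = np.diag(W.sum(axis=1)) - W

    # 3. k eigenvectors with the smallest eigenvalues (eigh sorts ascending)
    _, eigvecs = np.linalg.eigh(L)
    V = eigvecs[:, :k]               # N x k matrix; one row per data point

    # 4. Cluster the rows of V with k-means
    return KMeans(n_clusters=k, n_init=10).fit_predict(V)
```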


Example


Summary

● Spectral properties of a graph (i.e., eigenvalues and eigenvectors) contain information about clustering structure

● To find k clusters, apply k-means or other algorithms to the first k eigenvectors of the graph Laplacian matrix


Minimum Spanning Tree

● Given the MST of the data points, remove the longest (most inconsistent) edge, then the next longest edge, and so on
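A minimal sketch of this idea with SciPy, assuming Euclidean distances: cut the k-1 longest MST edges and read the clusters off as connected components.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components
from scipy.spatial.distance import pdist, squareform

def mst_clustering(X, k):
    """Cluster by removing the k-1 longest edges of the Euclidean MST."""
    D = squareform(pdist(X))                      # dense pairwise distances
    mst = minimum_spanning_tree(D).toarray()      # n-1 MST edges as weights
    if k > 1:
        longest = np.sort(mst[mst > 0])[-(k - 1):]
        for v in longest:                         # cut the longest edges
            mst[mst == v] = 0.0                   # (ties are cut together)
    _, labels = connected_components(mst, directed=False)
    return labels
```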


● One useful statistic that can be estimated from the MST is the edge-length distribution
● For instance, in the case of 2 dense clusters immersed in a sparse set of points, the short within-cluster edges and the long edges through the sparse background appear as separate modes of the edge-length distribution


Cluster Validity

● Which clustering method is appropriate for a particular data set?
● How does one determine whether the results of a clustering method truly characterize the data?
● How do you know when you have a good set of clusters?
● Is it unusual to find a cluster as compact and isolated as the observed clusters?
● How to guard against elaborate interpretation of randomly distributed data?

Cluster Validity

● Clustering algorithms find clusters, even if there are no natural clusters in the data
  – It is easy to design new methods, but difficult to validate them
  – [Figure: K-means with K = 3 applied to 100 uniformly distributed 2D points still returns three "clusters"]
● Cluster stability: perturb the data by bootstrapping. How do the clusters change over the ensemble?
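A hedged sketch of the bootstrap-stability idea, using KMeans and the adjusted Rand index to compare labelings across resampled data sets (the agreement metric is my choice, not specified on the slide).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def bootstrap_stability(X, k, n_boot=20, seed=0):
    """Mean pairwise agreement of k-means labelings over bootstrap samples.

    Each bootstrap model is applied back to the full data set so the
    labelings are comparable; values near 1 suggest stable clusters.
    """
    rng = np.random.default_rng(seed)
    labelings = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), size=len(X))     # resample with replacement
        km = KMeans(n_clusters=k, n_init=10).fit(X[idx])
        labelings.append(km.predict(X))                # labels for all points
    scores = [adjusted_rand_score(a, b)
              for i, a in enumerate(labelings) for b in labelings[i + 1:]]
    return float(np.mean(scores))
```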


Hierarchical Clustering

• Hierarchical clustering is a method of cluster analysis which seeks to build a hierarchy of clusters. Two approaches:

• Agglomerative ("bottom up"): each point starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy; this is the more popular approach

• Divisive ("top down"): all points start in one cluster, and splits are performed recursively as one moves down the hierarchy

How to define similarity between two clusters or a point and a cluster?


Agglomerative Clustering Example

• Cluster six elements {a}, {b}, {c}, {d}, {e} and {f} in 2D; use Euclidean distance as the measure of (dis)similarity

• Build the hierarchy from the individual elements by progressively merging clusters

• Which elements to merge in a cluster? Usually, merge the two closest elements, according to the chosen distance


Suppose we have merged the two closest elements b and c to obtain the clusters {a}, {b, c}, {d}, {e} and {f}

To merge further, we need the distance between {a} and {b, c}. Two common ways to define the distance between two clusters:

• The maximum distance between elements of each cluster (also called complete-linkage clustering): max{d(x, y) : x ∈ A, y ∈ B}

• The minimum distance between elements of each cluster (single-linkage clustering): min{d(x, y) : x ∈ A, y ∈ B}

Stop clustering either when the clusters are too far apart to be merged or when there is a sufficiently small number of clusters

[Figure: Single-link vs. Complete-link Hierarchical Clustering]
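A short sketch of both linkages with SciPy; the six 2D points are hypothetical stand-ins for the elements a-f.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical coordinates for elements a, b, c, d, e, f
X = np.array([[0.0, 0.0], [1.0, 0.1], [1.1, 0.0],
              [4.0, 4.0], [4.2, 4.1], [5.0, 5.0]])

single = linkage(X, method='single')       # min distance between clusters
complete = linkage(X, method='complete')   # max distance between clusters

# Stop at 2 clusters by cutting each dendrogram
print(fcluster(single, t=2, criterion='maxclust'))
print(fcluster(complete, t=2, criterion='maxclust'))
```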

[Figure: 2D PCA Projection of Iris Data]

[Figure: Minimum Spanning Tree Clustering of 2D PCA Projection of Iris Data]

[Figure: K-Means Clustering of Iris Data (Cluster Assignments shown on 2D PCA Projection)]

[Figure: Single-link Clustering of Iris Data]

[Figure: Complete-link Clustering of Iris Data]
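The Iris figures above can be approximated with a few lines of scikit-learn and SciPy; a sketch with plotting omitted:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster

X = load_iris().data
X2 = PCA(n_components=2).fit_transform(X)     # 2D PCA projection for display

kmeans_labels   = KMeans(n_clusters=3, n_init=10).fit_predict(X)
single_labels   = fcluster(linkage(X, method='single'),   t=3, criterion='maxclust')
complete_labels = fcluster(linkage(X, method='complete'), t=3, criterion='maxclust')
# Each labeling would be shown as point colors on the 2D projection X2
```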


Angkor Wat

Hindu temple built by a Khmer king around 1150 AD; the Khmer kingdom declined in the 15th century; French explorers discovered the hidden ruins in the late 1800s


Apsaras of Angkor Wat

• Angkor Wat contains a unique gallery of ~2,000 women depicted in detailed full-body portraits

• What facial types are represented in these portraits?


Clustering of Apsara Faces

[Figure: shape alignment of the Apsara face images using 127 facial landmarks, and the resulting single-link clusters (labeled 1-10)]

How to validate the clusters or groups?


Ground Truth

Khmer Dance and Cultural Center


Exploratory Data Analysis

Clustering with large weights assigned to chin and nose

Example devata faces from the clusters differ largely in chin and nose, thereby reflecting the weights chosen for the similarity measure

[Figure: 2D MDS projection of the similarity matrix]


Exploratory Data Analysis

[Figure: 3D MDS projection of the similarity matrix]


Spectral Clustering & Graph Partitioning

● We have shown that the spectral properties of the graph are related to the clusters
  – How are they related to minimizing the graph cut?


Graph Partitioning

● Recall the following objective of graph partitioning

$$\text{Ratio cut}(V_1, V_2) = \frac{\mathrm{Cut}(V_1, V_2)}{|V_1|} + \frac{\mathrm{Cut}(V_1, V_2)}{|V_2|}$$

$$\text{Normalized cut}(V_1, V_2) = \frac{\mathrm{Cut}(V_1, V_2)}{d_{V_1}} + \frac{\mathrm{Cut}(V_1, V_2)}{d_{V_2}}$$

$$\text{where } d_{V_i} = \sum_{j \in V_i} d_j, \qquad d_j = \sum_k w_{jk}, \qquad \mathrm{Cut}(V_1, V_2) = \sum_{i \in V_1,\, j \in V_2} w_{ij}$$


Ratio Cut

● Let xi indicate the membership of node vi in a cluster:

$$x_i = \begin{cases} \sqrt{|V_2|/|V_1|} & \text{if } v_i \in V_1 \\ -\sqrt{|V_1|/|V_2|} & \text{if } v_i \in V_2 \end{cases}$$

● Also:

$$x^T L x = \frac{1}{2}\sum_{i,j} w_{ij}(x_i - x_j)^2 = \frac{1}{2}\left[\sum_{i \in V_1,\, j \in V_2} w_{ij}(x_i - x_j)^2 + \sum_{i \in V_2,\, j \in V_1} w_{ij}(x_i - x_j)^2\right]$$

  – where L is the graph Laplacian matrix (the within-cluster terms vanish because xi = xj there)


Ratio Cut

Substituting the definition of x:

$$x^T L x = \frac{1}{2}\left[ \sum_{i \in V_1,\, j \in V_2} w_{ij}\left( \sqrt{\tfrac{|V_2|}{|V_1|}} + \sqrt{\tfrac{|V_1|}{|V_2|}} \right)^{\!2} + \sum_{i \in V_2,\, j \in V_1} w_{ij}\left( \sqrt{\tfrac{|V_1|}{|V_2|}} + \sqrt{\tfrac{|V_2|}{|V_1|}} \right)^{\!2} \right]$$

$$= \mathrm{Cut}(V_1, V_2)\left( \frac{|V_2|}{|V_1|} + \frac{|V_1|}{|V_2|} + 2 \right)$$

$$= \mathrm{Cut}(V_1, V_2)\left( \frac{|V_1| + |V_2|}{|V_1|} + \frac{|V_1| + |V_2|}{|V_2|} \right)$$

$$= \left(|V_1| + |V_2|\right)\left( \frac{\mathrm{Cut}(V_1, V_2)}{|V_1|} + \frac{\mathrm{Cut}(V_1, V_2)}{|V_2|} \right)$$

$$= |V| \times \mathrm{RatioCut}(V_1, V_2)$$


Ratio Cut

● Therefore:

$$\min_{V_1, V_2} \mathrm{RatioCut}(V_1, V_2) \;\Leftrightarrow\; \min_{x} x^T L x$$

(|V| is a constant, so minimizing xᵀLx also minimizes the ratio cut)

  – Thus, we have related the ratio cut to the Laplacian matrix L
  – But there is one issue:
    ◦ The trivial solution is x equal to the vector of all zeros
    ◦ We need to look for a non-trivial solution
  – Look for constraints that must be satisfied by x:

$$x^T \mathbf{1} = \sum_{i=1}^{n} x_i = |V_1|\sqrt{\frac{|V_2|}{|V_1|}} - |V_2|\sqrt{\frac{|V_1|}{|V_2|}} = \sqrt{|V_1||V_2|} - \sqrt{|V_1||V_2|} = 0$$

The solution x must be orthogonal to the vector of all 1s


Ratio Cut

Another constraint that must be satisfied by x:

$$x^T x = \sum_{i=1}^{n} x_i^2 = \sum_{i \in V_1} \frac{|V_2|}{|V_1|} + \sum_{i \in V_2} \frac{|V_1|}{|V_2|} = |V_2| + |V_1| = n$$

As before, minimizing the ratio cut corresponds to minimizing xᵀLx, now subject to these constraints.


Ratio Cut

● This is a constrained optimization problem:

$$\min_{x}\; x^T L x \quad \text{subject to: } x^T x = n, \qquad x_i = \begin{cases} \sqrt{|V_2|/|V_1|} & \text{if } v_i \in V_1 \\ -\sqrt{|V_1|/|V_2|} & \text{if } v_i \in V_2 \end{cases}$$

● Instead, we solve a relaxation of the problem, dropping the discrete constraint on xi and enforcing xᵀx = n with a Lagrange multiplier:

$$F = x^T L x - \lambda\,(x^T x - n), \qquad \frac{\partial F}{\partial x} = 2(Lx - \lambda x) = 0 \;\Rightarrow\; Lx = \lambda x$$


Putting It All Together

● We have shown that
  – Minimizing the (ratio) graph cut is equivalent to finding the x that solves

$$\min_{x}\; x^T L x \quad \text{such that } x^T x = n$$

  – The solution for x is given by the eigenvectors of L
  – Thus, the spectral decomposition of the graph Laplacian is equivalent to the solution of the graph partitioning problem


Spectral Clustering with Ratio Cut

● Because Lx = λx implies xᵀLx = λ xᵀx, the minimum is attained at the smallest eigenvalue:

$$\min_{x} x^T L x = \lambda_{\min}\, x^T x$$

● But λmin = 0, with eigenvector 1 = (1 1 1 … 1)ᵀ
● Since we want a solution where xᵀ1 = 0, we need x ≠ 1
● Instead of the smallest eigenvalue, we look for the eigenvector corresponding to the next smallest eigenvalue
● In summary, finding the eigenvector that corresponds to the second smallest eigenvalue is a relaxation of the ratio cut graph partitioning problem (for k = 2)
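A brief sketch of this bipartition on the bridged six-node example graph used later in these slides: take the eigenvector of the second smallest eigenvalue (the Fiedler vector) and split the nodes by its sign.

```python
import numpy as np

# Six-node graph: two groups joined by a single bridge edge v3-v4
W = np.zeros((6, 6))
for i, j in [(0, 2), (1, 2), (2, 3), (3, 4), (3, 5)]:
    W[i, j] = W[j, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
fiedler = eigvecs[:, 1]                # eigenvector of the 2nd smallest eigenvalue
print(fiedler > 0)                     # sign split: {v1, v2, v3} vs {v4, v5, v6}
```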


Properties of Graph Laplacian

● L = (D - W) is a symmetric matrix
● L is a positive semi-definite matrix
  – For all real-valued vectors x: xᵀLx ≥ 0
  – Consequence: all eigenvalues of L are ≥ 0

$$x^T L x = x^T (D - W)\,x = x^T D x - x^T W x = \sum_i d_i x_i^2 - \sum_{i,j} w_{ij}\, x_i x_j$$

$$= \frac{1}{2}\left( \sum_i d_i x_i^2 - 2\sum_{i,j} w_{ij}\, x_i x_j + \sum_j d_j x_j^2 \right) = \frac{1}{2}\sum_{i,j=1}^{N} w_{ij}\,(x_i - x_j)^2 \;\ge\; 0, \qquad \text{where } d_i = \sum_j w_{ij}$$


Properties of Laplacian Matrix

Suppose e = [1 1 … 1]ᵀ. Then

$$We = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & & & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} \sum_j w_{1j} \\ \sum_j w_{2j} \\ \vdots \\ \sum_j w_{nj} \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}$$

$$De = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}$$

Therefore De = We.


Properties of Laplacian Matrix

$$De = We \;\Rightarrow\; (D - W)e = 0 \;\Rightarrow\; Le = 0 = 0 \cdot e \quad \text{(eigenvalue equation } Le = \lambda e\text{)}$$

● Since e ≠ [0 … 0]ᵀ, λ = 0 is an eigenvalue of L, with corresponding eigenvector e = [1 1 1 … 1]ᵀ
  – Furthermore, since L is positive semi-definite, 0 is the smallest eigenvalue of L
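A one-line numerical check of this fact, reusing the bridged six-node graph from the earlier sketch:

```python
import numpy as np

W = np.zeros((6, 6))
for i, j in [(0, 2), (1, 2), (2, 3), (3, 4), (3, 5)]:
    W[i, j] = W[j, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

e = np.ones(6)
print(L @ e)                         # ~[0 0 0 0 0 0]: Le = 0, so e is an eigenvector for eigenvalue 0
print(np.linalg.eigvalsh(L).min())   # smallest eigenvalue is 0 (up to rounding)
```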


Properties of Laplacian Matrix

● More generally, if the graph has k connected components, then L is block diagonal with one block per component:

$$L = \begin{bmatrix} L_1 & 0 & \cdots & 0 \\ 0 & L_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & L_k \end{bmatrix}$$

  – Then
    ◦ There are k eigenvalues of L which have the value 0
    ◦ The corresponding eigenvectors are:

$$\begin{bmatrix} e \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ e \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad \begin{bmatrix} 0 \\ 0 \\ \vdots \\ e \end{bmatrix}$$

where e is [1 1 … 1]ᵀ


Properties of Laplacian Matrix

For the two-component example graph (with Laplacian L as above):

Eigenvalues of L:

$$\Lambda = \mathrm{diag}(0,\; 0,\; 1,\; 1,\; 3,\; 3)$$

Eigenvectors of L: the two eigenvectors with eigenvalue 0 are (up to scaling) the indicator vectors of the two components, approximately (0.58, 0.58, 0.58, 0, 0, 0)ᵀ and (0, 0, 0, 0.58, 0.58, 0.58)ᵀ; the remaining eigenvectors (entries ±0.41, ±0.71, ±0.82) are each supported on a single component



If we cluster the data using only the first 2 eigenvectors, we get the two desired clusters


Properties of Laplacian Matrix

Now add an edge (weight 1) between v3 and v4, joining the two components:

$$W = \begin{bmatrix} 0&0&1&0&0&0\\ 0&0&1&0&0&0\\ 1&1&0&1&0&0\\ 0&0&1&0&1&1\\ 0&0&0&1&0&0\\ 0&0&0&1&0&0 \end{bmatrix}, \qquad \text{Laplacian } L = D - W = \begin{bmatrix} 1&0&-1&0&0&0\\ 0&1&-1&0&0&0\\ -1&-1&3&-1&0&0\\ 0&0&-1&3&-1&-1\\ 0&0&0&-1&1&0\\ 0&0&0&-1&0&1 \end{bmatrix}$$

Clusters are no longer well separated (L no longer has an exact block-diagonal structure)


Properties of Laplacian Matrix

For this bridged graph:

Eigenvalues of L:

$$\Lambda = \mathrm{diag}(0,\; 0.44,\; 1,\; 1,\; 3,\; 4.56)$$

Eigenvectors of L: the first eigenvector is the constant vector (every entry ≈ 0.41); the second eigenvector is approximately (0.46, 0.46, 0.26, -0.26, -0.46, -0.46)ᵀ, so its sign still separates {v1, v2, v3} from {v4, v5, v6}


Properties of Laplacian Matrix

Eigenvalues of the graph Laplacian: 0, 0.5505, 0.5505, 3, 3, 3, 3, 5.4495, 5.4495

Eigenvectors of the Laplacian: [9 × 9 eigenvector matrix; the first column is the constant vector with every entry 0.33]

These eigenvectors can be used to obtain 3 clusters
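To close the loop, a sketch of how the first k eigenvectors would be used for k = 3; this mirrors step 4 of the spectral clustering algorithm. The 9 × 9 similarity matrix W that produced the eigenvalues above is not given in the transcript, so it is an assumed input here.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_from_laplacian(W, k):
    """Cluster graph nodes using the k smallest eigenvectors of L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    _, eigvecs = np.linalg.eigh(L)     # columns ordered by ascending eigenvalue
    V = eigvecs[:, :k]                 # one row of V per node
    return KMeans(n_clusters=k, n_init=10).fit_predict(V)

# labels = cluster_from_laplacian(W, k=3)   # W: assumed 9 x 9 similarity matrix
```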