

Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Jérôme Kunegis    Stephan Schmidt    Andreas Lommatzsch
Jürgen Lerner    Ernesto W. De Luca    Şahin Albayrak

DAI-Labor, Technische Universität Berlin, Germany · Universität Konstanz, Germany

    Abstract

We study the application of spectral clustering, prediction and visualization methods to graphs with negatively weighted edges. We show that several characteristic matrices of graphs can be extended to graphs with positively and negatively weighted edges, giving signed spectral clustering methods, signed graph kernels and network visualization methods that apply to signed graphs. In particular, we review a signed variant of the graph Laplacian. We derive our results by considering random walks, graph clustering, graph drawing and electrical networks, showing that they all result in the same formalism for handling negatively weighted edges. We illustrate our methods using examples from social networks with negative edges and bipartite rating graphs.

    1 Introduction

Many graph-theoretic data mining problems can be solved by spectral methods, which consider matrices associated with a given network and compute their eigenvalues and eigenvectors. Common examples are spectral clustering, graph drawing, and graph kernels used for link prediction and link weight prediction. In the usual setting, only graphs with unweighted or positively weighted edges are supported. Some data mining problems, however, apply to signed graphs, i.e. graphs that contain positively and negatively weighted edges. In this paper, we show that many spectral machine learning methods can be extended to the signed case using specially defined characteristic matrices.

Intuitively, a positive edge in a graph denotes proximity or similarity, and a negative edge denotes dissimilarity or distance. Negative edges are found in many types of networks: social networks may contain not just 'friend' but also 'foe' links, and connections between users and products may denote 'like' and 'dislike'. In other cases, negative edges may be introduced explicitly, as in the case of constrained clustering, where some pairs of points must or must not be in the same cluster. These problems can all be modeled with signed graphs.

Spectral graph theory is the study of graphs using methods of linear algebra [4]. To study a given graph, its edge set is represented by an adjacency matrix, whose eigenvectors and eigenvalues are then used. Alternatively, the Laplacian matrix or one of several normalized adjacency matrices is used. In this paper, we are concerned with spectral graph mining algorithms that apply to signed graphs. In order to do this, we define variants of the Laplacian and related matrices that result in signed variants of spectral clustering, prediction and visualization methods.

We begin the study in Section 2 by considering the problem of drawing a graph with negative edges and derive the signed combinatorial graph Laplacian. In Section 3, we give the precise definitions of the various signed graph matrices, and derive basic results about their spectrum in Section 4. In Section 5 we derive the normalized and unnormalized signed Laplacian from the problem of signed spectral clustering, using signed extensions of the ratio cut and normalized cut functions. In Section 6, we define several graph kernels that apply to signed graphs and show how they can be applied to link sign prediction in signed unipartite and bipartite networks. In Section 7 we give a derivation of the signed Laplacian using electrical networks in which resistances admit negative values. We conclude in Section 8.

    2 Signed Graph Drawing

To motivate signed spectral graph theory, we consider the problem of drawing signed graphs, and show how it naturally leads to the signed normalized Laplacian. Proper definitions are given in the next section. The Laplacian matrix turns up in graph drawing when we try to find an embedding of a graph into the plane in a way that adjacent nodes are drawn near to each other [1]. The drawing of signed graphs is covered for instance in [2], but using the adjacency matrix instead of the Laplacian. In our approach, we stipulate that the endpoints of negative edges should be drawn as far from each other as possible.

2.1 Positively Weighted Graphs We now describe the general method for generating an embedding of the nodes of a graph into the plane using the Laplacian matrix.

Given a connected graph G = (V, E, A) with positively weighted edges, its adjacency matrix $(A_{ij})$ gives the positive edge weights when the vertices i and j are connected, and is zero otherwise. We now want to find a two-dimensional drawing of the graph in which each vertex is drawn near to its neighbors. This requirement gives rise to the following vertex equation, which states that every vertex is placed at the mean of its neighbors' coordinates, weighted by the weight of the connecting edges. For each node i, let $X_i \in \mathbb{R}^2$ be its coordinates in the drawing; then

$$X_i = \Big(\sum_{j \sim i} A_{ij}\Big)^{-1} \sum_{j \sim i} A_{ij} X_j. \qquad (2.1)$$

Rearranging and aggregating the equation for all i, we arrive at

$$DX = AX \;\Longleftrightarrow\; LX = 0, \qquad (2.2)$$

where D is the diagonal matrix defined by $D_{ii} = \sum_j A_{ij}$ and $L = D - A$ is the combinatorial Laplacian of G. In other words, X should belong to the null space of L, which leads to the degenerate solution of X containing constant vectors, as the constant vector is an eigenvector of L with eigenvalue 0. To exclude that solution, we additionally require that the column vectors of X be orthogonal to the constant vector, leading to X consisting of the eigenvectors associated with the two smallest eigenvalues of L different from zero. This solution results in a well-known satisfactory embedding of positively weighted graphs. Such an embedding is related to the resistance distance (or commute time distance) between nodes of the graph [1].
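To make the procedure concrete, here is a minimal sketch (our illustration in Python/NumPy, not code from the paper) that computes such a Laplacian embedding for a small positively weighted graph, using the eigenvectors of the two smallest nonzero eigenvalues as coordinates:

```python
import numpy as np

# Adjacency matrix of a small positively weighted graph (a 4-cycle with a chord).
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 1.],
              [0., 1., 0., 1.],
              [1., 1., 1., 0.]])

D = np.diag(A.sum(axis=1))   # degree matrix, D_ii = sum_j A_ij
L = D - A                    # combinatorial Laplacian L = D - A

# np.linalg.eigh returns eigenvalues in ascending order; the first
# eigenvector is the constant vector with eigenvalue 0, which we skip.
eigvals, eigvecs = np.linalg.eigh(L)
X = eigvecs[:, 1:3]          # 2D coordinates from the two smallest nonzero eigenvalues
print(np.round(eigvals, 3))
print(X)
```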

2.2 Signed Graphs We now extend the graph drawing method described in the previous section to graphs with positive and negative edge weights.

To adapt (2.1) for negative edge weights, we interpret a negative edge as an indication that two vertices should be placed on opposite sides of the drawing. Therefore, we take the opposite coordinates $-X_j$ of vertices j adjacent to i through a negative edge, and use the absolute value of edge weights to compute the mean, as pictured in Figure 1. We will call this construction antipodal proximity. This leads to the vertex equation

$$X_i = \Big(\sum_{j \sim i} |A_{ij}|\Big)^{-1} \sum_{j \sim i} A_{ij} X_j, \qquad (2.3)$$

resulting in a signed Laplacian matrix $\bar L = \bar D - A$ in which we take $\bar D_{ii} = \sum_j |A_{ij}|$:

$$\bar D X = AX \;\Longleftrightarrow\; \bar L X = 0. \qquad (2.4)$$

Figure 1: Drawing a vertex at the mean coordinates of its neighbors by proximity and antipodal proximity. (a) In unsigned graphs, a vertex u is placed at the mean of its neighbors v1, v2, v3. (b) In signed graphs, a vertex u is placed at the mean of its positive neighbors v1, v2 and the antipodal point of its negative neighbor v3.

As with L, we must now choose eigenvectors of $\bar L$ for eigenvalues close to zero as coordinates. As we will see in the next section, $\bar L$ is always positive-semidefinite, and is positive-definite for graphs that are unbalanced, i.e. that contain cycles with an odd number of negative edges.

To obtain a graph drawing from $\bar L$, we can thus distinguish three cases, assuming that G is connected:

If all edges are positive, $\bar L = L$ and we arrive at the solution given by the ordinary Laplacian matrix.

If the graph is unbalanced, $\bar L$ is positive-definite and we can use the eigenvectors of the two smallest eigenvalues as coordinates.

If the graph is balanced, its spectrum is equivalent to that of the corresponding unsigned Laplacian matrix, up to signs of the eigenvector components. Using the eigenvectors of the two smallest eigenvalues (including zero), we arrive at a graph drawing with all points being placed on two parallel lines, reflecting the perfect 2-clustering present in the graph.
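The signed case changes only the degree matrix. A minimal sketch of the unbalanced case (ours; the triangle with one negative edge is a hypothetical example), where the two smallest eigenvectors can be used directly:

```python
import numpy as np

# Signed adjacency matrix: a triangle with one negative edge, i.e. a
# cycle with an odd number of negative edges, so the graph is unbalanced.
A = np.array([[ 0.,  1., -1.],
              [ 1.,  0.,  1.],
              [-1.,  1.,  0.]])

D_bar = np.diag(np.abs(A).sum(axis=1))  # signed degrees: row sums of |A_ij|
L_bar = D_bar - A                       # signed Laplacian

eigvals, eigvecs = np.linalg.eigh(L_bar)
assert eigvals[0] > 1e-10               # unbalanced => positive-definite

# No zero eigenvalue, so the eigenvectors of the two smallest
# eigenvalues can be used directly as drawing coordinates.
X = eigvecs[:, 0:2]
print(np.round(eigvals, 3))
```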


2.3 Toy Examples Figure 2 shows example graphs with positive edges drawn in green and negative edges in red. All edges have weight ±1, and all graphs contain cycles with an odd number of negative edges. Column (a) shows all graphs drawn using the eigenvectors of the two largest eigenvalues of the adjacency matrix A. Column (b) shows the unsigned Laplacian embedding of the graphs, obtained by setting all edge weights to +1. Column (c) shows the signed Laplacian embedding.


Figure 2: Small toy graphs drawn using the eigenvectors of three graph matrices. (a) The adjacency matrix A. (b) The unsigned Laplacian D − A. (c) The signed Laplacian $\bar D - A$. All graphs shown contain negative cycles, and their signed Laplacian matrices are positive-definite. Edges have weights ±1, shown in green (+1) and red (−1).

The embedding given by the eigenvectors of A is clearly not satisfactory for graph drawing. As expected, the graphs drawn using the ordinary Laplacian matrix place nodes connected by a negative edge near to each other. The signed Laplacian matrix produces a graph embedding where negative links span large distances across the drawing, as required.

    3 Definitions

In this section, we give the definition of the combinatorial and normalized signed Laplacian matrices of a graph and derive their basic properties.

The combinatorial Laplacian matrix of signed graphs with edge weights restricted to ±1 is described in [12], where it is called the Kirchhoff matrix (of a signed graph). A different Laplacian matrix for signed graphs that is not positive-semidefinite is used in the context of knot theory [19]. That same Laplacian matrix is used in [16] to draw graphs with negative edge weights. For graphs where all edges have negative weights, the matrix D + A may be considered, which corresponds to the Laplacian of the underlying unsigned graph [5].

Let G = (V, E, A) be an undirected graph with vertex set V, edge set E, and nonzero edge weights described by the adjacency matrix $A \in \mathbb{R}^{V \times V}$. We will denote an edge between nodes i and j as (i, j) and write $i \sim j$ to signify that two nodes are adjacent. If (i, j) is not an edge of the graph, we set $A_{ij} = 0$. Otherwise, $A_{ij} > 0$ denotes a positive edge and $A_{ij} < 0$ denotes a negative edge. Unless otherwise noted, we assume G to be connected.

3.1 Laplacian Matrix Given a graph G with only positively weighted edges, its ordinary Laplacian matrix is a symmetric $|V| \times |V|$ matrix that, in a general sense, captures relations between individual nodes of the graph. The Laplacian matrix is positive-semidefinite, and its Moore–Penrose pseudoinverse can be interpreted as a forest count [3] and used to compute the resistance distance between any two nodes [14].

Definition 1. The Laplacian matrix $L \in \mathbb{R}^{V \times V}$ of a graph G with nonnegative adjacency matrix A is given by

$$L = D - A, \qquad (3.5)$$

with the diagonal degree matrix $D \in \mathbb{R}^{V \times V}$ given by

$$D_{ii} = \sum_{j \sim i} A_{ij}. \qquad (3.6)$$

The multiplicity of the Laplacian's eigenvalue zero equals the number of connected components in G. If G is connected, the eigenvalue zero has multiplicity one, and the second-smallest eigenvalue is known as the algebraic connectivity of the graph [4]. The eigenvector corresponding to that second-smallest eigenvalue is called the Fiedler vector, and has been used successfully for clustering the nodes of G [6].

3.2 Signed Laplacian Matrix If applied to signed graphs, the Laplacian of Equation (3.5) results in an indefinite matrix, and thus cannot be used as the basis for graph kernels [19]. Therefore, we use a modified degree matrix $\bar D$ [12].

Definition 2. The signed Laplacian matrix $\bar L \in \mathbb{R}^{V \times V}$ of a graph G with adjacency matrix A is given by

$$\bar L = \bar D - A, \qquad (3.7)$$

where the signed degree matrix $\bar D \in \mathbb{R}^{V \times V}$ is the diagonal matrix given by

$$\bar D_{ii} = \sum_{j \sim i} |A_{ij}|. \qquad (3.8)$$

We prove in Section 4 that the signed Laplacian matrix is positive-semidefinite.

Two different matrices are usually called the normalized Laplacian, and both can be extended to signed graphs. We follow [20] and call these the random walk and symmetric normalized Laplacian matrices.

3.3 Random Walk Normalized Laplacian When modeling random walks on an unsigned graph G, the transition probability from node i to node j is given by the entries of the stochastic matrix $D^{-1}A$. This matrix also arises from Equation (2.2) as the eigenvalue equation $x = D^{-1}Ax$. The matrix $L_{\mathrm{rw}} = I - D^{-1}A$ is called the random walk Laplacian and is positive-semidefinite. Its signed counterpart is given by $\bar L_{\mathrm{rw}} = I - \bar D^{-1}A$.

The random walk normalized Laplacian arises when considering random walks, but also when drawing graphs and when clustering using normalized cuts, as explained in Section 5.

3.4 Symmetric Normalized Laplacian Another signed Laplacian is given by $\bar L_{\mathrm{sym}} = I - \bar D^{-1/2} A \bar D^{-1/2}$. As with the unnormalized Laplacian matrix, we define an ordinary and a signed variant:

$$L_{\mathrm{sym}} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}$$
$$\bar L_{\mathrm{sym}} = \bar D^{-1/2} \bar L \bar D^{-1/2} = I - \bar D^{-1/2} A \bar D^{-1/2}$$

The ordinary variant only applies to graphs with positive edge weights. The normalized Laplacian matrices can be used instead of the combinatorial Laplacian matrices in most settings, with good results reported for graphs with very skewed degree distributions [8].
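To summarize the definitions of this section, the following sketch (our illustration, assuming a graph with no isolated vertices so that all signed degrees are nonzero) constructs the signed combinatorial and normalized Laplacians from a signed adjacency matrix:

```python
import numpy as np

def signed_laplacians(A):
    """Build the matrices of Section 3 from a signed adjacency matrix A.

    Returns (L_bar, L_rw_bar, L_sym_bar):
      L_bar     = D_bar - A                          (Definition 2)
      L_rw_bar  = I - D_bar^{-1} A                   (random walk)
      L_sym_bar = I - D_bar^{-1/2} A D_bar^{-1/2}    (symmetric)
    Assumes every vertex has at least one edge (nonzero signed degree).
    """
    d = np.abs(A).sum(axis=1)
    I = np.eye(A.shape[0])
    L_bar = np.diag(d) - A
    L_rw = I - A / d[:, None]                # D_bar^{-1} A scales the rows
    s = 1.0 / np.sqrt(d)
    L_sym = I - s[:, None] * A * s[None, :]  # symmetric normalization
    return L_bar, L_rw, L_sym

A = np.array([[ 0., 2., -1.],
              [ 2., 0.,  1.],
              [-1., 1.,  0.]])
L_bar, L_rw, L_sym = signed_laplacians(A)
```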

    4 Spectral Analysis

In this section, we prove that the signed Laplacian matrix $\bar L$ is positive-semidefinite, characterize the graphs for which it is positive-definite, and give the relationship between the eigenvalue decomposition of the signed Laplacian matrix and the eigenvalue decomposition of the corresponding unsigned Laplacian matrix. The characterization of the smallest eigenvalue of $\bar L$ in terms of graph balance can be found in [12].

    4.1 Positive-semidefiniteness

Theorem 4.1. The signed Laplacian matrix $\bar L$ is positive-semidefinite for any graph G.

Proof. We write the signed Laplacian matrix as a sum over the edges of G:

$$\bar L = \sum_{(i,j) \in E} \bar L^{(i,j)},$$

where $\bar L^{(i,j)} \in \mathbb{R}^{V \times V}$ contains the four following nonzero entries:

$$\bar L^{(i,j)}_{ii} = \bar L^{(i,j)}_{jj} = |A_{ij}|, \qquad \bar L^{(i,j)}_{ij} = \bar L^{(i,j)}_{ji} = -A_{ij}.$$

Let $x \in \mathbb{R}^V$ be a vertex vector. By considering the bilinear form $x^T \bar L^{(i,j)} x$, we see that $\bar L^{(i,j)}$ is positive-semidefinite:

$$x^T \bar L^{(i,j)} x = |A_{ij}|\, x_i^2 + |A_{ij}|\, x_j^2 - 2 A_{ij} x_i x_j = |A_{ij}| \big(x_i - \mathrm{sgn}(A_{ij})\, x_j\big)^2 \geq 0. \qquad (4.9)$$

We now consider the bilinear form $x^T \bar L x$:

$$x^T \bar L x = \sum_{(i,j) \in E} x^T \bar L^{(i,j)} x \geq 0.$$

It follows that $\bar L$ is positive-semidefinite.

Another way to prove that $\bar L$ is positive-semidefinite consists of expressing it using the incidence matrix of G. Assume that for each edge (i, j), an arbitrary orientation is chosen. Then we define the incidence matrix $S \in \mathbb{R}^{V \times E}$ of G by

$$S_{i,(i,j)} = +\sqrt{|A_{ij}|}, \qquad S_{j,(i,j)} = -\mathrm{sgn}(A_{ij}) \sqrt{|A_{ij}|}.$$

We now consider the product $SS^T \in \mathbb{R}^{V \times V}$:

$$(SS^T)_{ii} = \sum_{j \sim i} |A_{ij}|, \qquad (SS^T)_{ij} = -A_{ij}$$


Figure 3: The nodes of a graph without negative cycles can be partitioned into two sets such that all edges inside each group are positive and all edges between the two groups are negative. We call such a graph balanced, and the eigenvalue decomposition of its signed Laplacian matrix can be expressed as a modified eigenvalue decomposition of the corresponding unsigned graph's Laplacian.

for diagonal and off-diagonal entries, respectively. Therefore $SS^T = \bar L$, and it follows that $\bar L$ is positive-semidefinite. This result is independent of the orientation chosen for S.
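This factorization is straightforward to check numerically. The sketch below (ours, with an arbitrary example matrix) builds S for one edge orientation and verifies that $SS^T$ equals the signed Laplacian:

```python
import numpy as np

A = np.array([[ 0.,  1., -2.],
              [ 1.,  0.,  1.],
              [-2.,  1.,  0.]])
n = A.shape[0]
edges = [(i, j) for i in range(n) for j in range(i + 1, n) if A[i, j] != 0]

# One column per (arbitrarily oriented) edge:
#   S[i, e] = +sqrt(|A_ij|),  S[j, e] = -sgn(A_ij) * sqrt(|A_ij|)
S = np.zeros((n, len(edges)))
for e, (i, j) in enumerate(edges):
    w = np.sqrt(abs(A[i, j]))
    S[i, e] = w
    S[j, e] = -np.sign(A[i, j]) * w

L_bar = np.diag(np.abs(A).sum(axis=1)) - A
assert np.allclose(S @ S.T, L_bar)   # SS^T = L_bar, for any orientation
```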

4.2 Positive-definiteness We now show that, unlike the ordinary Laplacian matrix, the signed Laplacian matrix is strictly positive-definite for some graphs, including most real-world networks.

As with the ordinary Laplacian matrix, the spectrum of the signed Laplacian matrix of a disconnected graph is the union of the spectra of its connected components. This can be seen by noting that the Laplacian matrix of a disconnected graph has block-diagonal form, with each diagonal block being the Laplacian matrix of a single component. Therefore, we will restrict ourselves to connected graphs. First, we define balanced graphs [11].

Definition 3. A connected graph with nonzero signed edge weights is balanced when its vertices can be partitioned into two groups such that all positive edges connect vertices within the same group, and all negative edges connect vertices of different groups.

Figure 3 shows a balanced graph partitioned into two vertex sets. Equivalently, unbalanced graphs can be defined as those graphs containing a cycle with an odd number of negative edges, as shown in Figure 4.

To prove that the balanced graphs are exactly those that do not contain cycles with an odd number of negative edges, consider that any cycle in a balanced graph has to cross sides an even number of times. On the other hand, any balanced graph can be partitioned into two vertex sets by depth-first traversal while assigning each vertex to a partition such that the balance property is fulfilled. Any inconsistency that arises during such a labeling leads to a cycle with an odd number of negative edges.

Figure 4: An unbalanced graph contains a cycle with an odd number of negatively weighted edges. Negatively weighted edges are shown in red, positively weighted edges in green, and edges that are not part of the cycle in black. The presence of such cycles results in a positive-definite signed Laplacian matrix.

Using this definition, we can characterize the graphs for which the signed Laplacian matrix is positive-definite.

Theorem 4.2. The signed Laplacian matrix of an unbalanced graph is positive-definite.

Proof. We show that if the bilinear form $x^T \bar L x$ is zero for some $x \neq 0$, then a bipartition of the vertices as described above exists.

Let $x^T \bar L x = 0$. We have seen that $x^T \bar L^{(i,j)} x \geq 0$ for every $\bar L^{(i,j)}$ and any x. Therefore, we have for every edge (i, j):

$$x^T \bar L^{(i,j)} x = 0 \;\Longrightarrow\; |A_{ij}| \big(x_i - \mathrm{sgn}(A_{ij})\, x_j\big)^2 = 0 \;\Longrightarrow\; x_i = \mathrm{sgn}(A_{ij})\, x_j.$$

In other words, two components of x are equal if the corresponding vertices are connected by a positive edge, and opposite to each other if the corresponding vertices are connected by a negative edge. Because the graph is connected, it follows that all $|x_i|$ must be equal. We can exclude the solution $x_i = 0$ for all i because x is not the zero vector. Without loss of generality, we assume that $|x_i| = 1$ for all i. Therefore, x gives a bipartition into vertices with $x_i = +1$ and vertices with $x_i = -1$, with the property that two vertices with the same value of $x_i$ are in the same partition and two vertices with opposite signs of $x_i$ are in different partitions; therefore G is balanced.


4.3 Balanced Graphs We now show how the spectrum and eigenvectors of the signed Laplacian of a balanced graph arise from the spectrum and eigenvectors of the corresponding unsigned graph by multiplication of eigenvector components by −1.

Let G = (V, E, A) be a balanced graph with positive and negative edge weights and $\bar G = (V, E, \bar A)$ the corresponding graph with positive edge weights given by $\bar A_{ij} = |A_{ij}|$. Since G is balanced, there is a vector $x \in \{-1, +1\}^V$ such that for all edges (i, j), $\mathrm{sgn}(A_{ij}) = x_i x_j$.

Theorem 4.3. If $\bar L$ is the signed Laplacian matrix of the balanced graph G with bipartition x and eigenvalue decomposition $\bar L = \bar U \Lambda \bar U^T$, then the eigenvalue decomposition of the Laplacian matrix L of the corresponding unsigned graph $\bar G$ of G is given by $L = U \Lambda U^T$, where

$$U_{ij} = x_i \bar U_{ij}. \qquad (4.10)$$

Proof. To see that $L = U \Lambda U^T$, note that for diagonal elements, $U_i \Lambda U_i^T = x_i^2\, \bar U_i \Lambda \bar U_i^T = \bar U_i \Lambda \bar U_i^T = \bar L_{ii} = L_{ii}$. For off-diagonal elements, we have $U_i \Lambda U_j^T = x_i x_j\, \bar U_i \Lambda \bar U_j^T = \mathrm{sgn}(A_{ij})\, \bar L_{ij} = -\mathrm{sgn}(A_{ij})\, A_{ij} = -|A_{ij}| = -\bar A_{ij} = L_{ij}$.

We now show that $U \Lambda U^T$ is an eigenvalue decomposition of L by showing that U is orthogonal. To see that the columns of U are indeed orthogonal, note that for any two column indices $m \neq n$, we have $U_m^T U_n = \sum_{i \in V} U_{im} U_{in} = \sum_{i \in V} x_i^2\, \bar U_{im} \bar U_{in} = \bar U_m^T \bar U_n = 0$ because $\bar U$ is orthogonal. Changing signs in $\bar U$ does not change the norm of each column vector, and thus $L = U \Lambda U^T$ is a proper eigenvalue decomposition.

As shown in Section 4.2, the signed Laplacian matrix of an unbalanced graph is positive-definite and therefore its spectrum is different from that of the corresponding unsigned graph. Aggregating Theorems 4.2 and 4.3, we arrive at our main result.

Theorem 4.4. The signed Laplacian matrix of a graph is positive-definite if and only if the graph is unbalanced.

Proof. The theorem follows directly from Theorems 4.2 and 4.3.
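Theorem 4.4 yields a simple numerical test for balance; a sketch (our illustration on two hypothetical toy graphs):

```python
import numpy as np

def is_balanced(A, tol=1e-9):
    """A connected signed graph is balanced iff the smallest eigenvalue
    of its signed Laplacian is zero (Theorem 4.4)."""
    L_bar = np.diag(np.abs(A).sum(axis=1)) - A
    return np.linalg.eigvalsh(L_bar)[0] < tol

# Balanced: two positive cliques {0,1} and {2,3} joined by negative edges.
A_bal = np.array([[ 0.,  1., -1., -1.],
                  [ 1.,  0., -1., -1.],
                  [-1., -1.,  0.,  1.],
                  [-1., -1.,  1.,  0.]])

# Unbalanced: a triangle with exactly one negative edge.
A_unbal = np.array([[ 0.,  1., -1.],
                    [ 1.,  0.,  1.],
                    [-1.,  1.,  0.]])

print(is_balanced(A_bal))    # True:  smallest eigenvalue is 0
print(is_balanced(A_unbal))  # False: L_bar is positive-definite
```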

The spectra of several large unipartite and bipartite signed networks are plotted in Figure 5. Table 1 gives the smallest Laplacian eigenvalues for some large networks. Jester is a bipartite user–item network of joke ratings [9]. MovieLens is the bipartite user–item network of signed movie ratings from GroupLens Research¹, released in three different sizes.

¹ http://www.grouplens.org/node/73

Figure 5: The Laplacian spectra of several large signed networks: (a) Slashdot Zoo, (b) Epinions, (c) Jester, (d) MovieLens 100k, (e) MovieLens 1M, (f) MovieLens 10M.

Table 1: Smallest Laplacian eigenvalue λ̄₁ > 0 of large signed networks. Smaller values indicate a more balanced network; the list runs from most conflict-laden to most balanced.

    Network          λ̄₁
    MovieLens 100k   0.4285
    MovieLens 1M     0.3761
    Jester           0.06515
    MovieLens 10M    0.04735
    Slashdot Zoo     0.006183
    Epinions         0.004438

The Slashdot Zoo is a social network with friend and foe links extracted from the technology news website Slashdot [17]. Epinions is a signed user–user opinion network [21]. The smallest Laplacian eigenvalue is a characteristic value of the network denoting its balance: the higher the value, the more unbalanced the network [12]. All these large networks are unbalanced, and we can for instance observe that the social networks are more balanced than the rating networks.
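For networks of this size, the eigenvalue is best computed from a sparse representation. A sketch of how this might be done with SciPy (our illustration; `A` is assumed to be a symmetric sparse signed adjacency matrix with zero diagonal):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def smallest_signed_laplacian_eigenvalue(A):
    """Smallest eigenvalue of L_bar = D_bar - A for a sparse signed
    adjacency matrix A (scipy.sparse). Smaller values indicate a more
    balanced network (Table 1)."""
    d = np.asarray(abs(A).sum(axis=1)).ravel()  # signed degrees sum_j |A_ij|
    L_bar = sp.diags(d) - A
    # 'SA' requests the smallest algebraic eigenvalue of the symmetric
    # positive-semidefinite matrix L_bar.
    return eigsh(L_bar, k=1, which='SA', return_eigenvectors=False)[0]
```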

    5 Clustering

One of the main application areas of the graph Laplacian is clustering problems. In spectral clustering, the eigenvectors of matrices associated with a graph are used to partition the vertices of the graph into well-connected groups. In this section, we show that in a graph with negatively weighted edges, spectral clustering algorithms correspond to finding clusters of vertices connected by positive edges, but not connected by negative edges.

Spectral clustering algorithms are usually derived by formulating a minimum cut problem which is then relaxed. The choice of the cut function results in different spectral clustering algorithms. In all cases, the vertices of a given graph are mapped into the space spanned by the eigenvector components of a matrix associated with the graph.

We derive a signed extension of the ratio cut, which leads to clustering with the signed combinatorial Laplacian $\bar L$. Analogous derivations are possible for the normalized Laplacians. We restrict our proofs to the case of 2-clustering. Higher-order clusterings can be derived analogously.

5.1 Unsigned Graphs We first review the derivation of the ratio cut in unsigned graphs, which leads to a clustering based on the eigenvectors of L.

Let G = (V, E, A) be an unsigned graph. A cut of G is a partition of the vertices $V = X \cup Y$. The weight of a cut is given by

$$\mathrm{cut}(X, Y) = \sum_{i \in X,\, j \in Y} A_{ij}.$$

The cut measures how well two clusters are connected. Since we want to find two distinct groups of vertices, the cut must be minimized. Minimizing cut(X, Y), however, leads in most cases to solutions separating very few vertices from the rest of the graph. Therefore, the cut is usually divided by the size of the clusters, giving the ratio cut:

$$\mathrm{RatioCut}(X, Y) = \mathrm{cut}(X, Y) \left( \frac{1}{|X|} + \frac{1}{|Y|} \right)$$

To get a clustering, we then solve the following optimization problem:

$$\min_{X \subset V}\ \mathrm{RatioCut}(X, V \setminus X)$$

Let $Y = V \setminus X$; then this problem can be solved by expressing it in terms of the characteristic vector u of X, defined by:

$$u_i = \begin{cases} +\sqrt{|Y|/|X|} & \text{when } i \in X \\ -\sqrt{|X|/|Y|} & \text{when } i \in Y \end{cases} \qquad (5.11)$$

We observe that $u^T L u = 2|V| \cdot \mathrm{RatioCut}(X, Y)$ and that $\sum_i u_i = 0$, i.e. u is orthogonal to the constant vector. Denoting by $\mathcal{U}$ the set of vectors u of the form given in Equation (5.11), we have

$$\min_{u \in \mathbb{R}^V}\ u^T L u \quad \text{s.t.}\ u \perp \mathbf{1},\ u \in \mathcal{U}.$$

This can be relaxed by removing the constraint $u \in \mathcal{U}$, giving as solution the eigenvector of L having the smallest nonzero eigenvalue [20]. The next subsection gives an analogous derivation for signed graphs.

5.2 Signed Graphs We now give a derivation of the ratio cut for signed graphs. Let G = (V, E, A) be a signed graph. We will write $A^+$ and $A^-$ for the adjacency matrices containing only positive and negative edges. In other words, $A^+_{ij} = \max(0, A_{ij})$ and $A^-_{ij} = \max(0, -A_{ij})$.

For convenience, we define positive and negative cuts that only count positive and negative edges, respectively:

$$\mathrm{cut}^+(X, Y) = \sum_{i \in X,\, j \in Y} A^+_{ij}, \qquad \mathrm{cut}^-(X, Y) = \sum_{i \in X,\, j \in Y} A^-_{ij}$$

In these definitions, we allow X and Y to overlap. For a vector $u \in \mathbb{R}^V$, we consider the bilinear form $u^T \bar L u$. As shown in Equation (4.9), this can be written in the following way:

$$u^T \bar L u = \sum_{i \sim j} |A_{ij}| \big( u_i - \mathrm{sgn}(A_{ij})\, u_j \big)^2$$

For a given partition (X, Y), let $u \in \mathbb{R}^V$ be the following vector:

$$u_i = \begin{cases} +\frac{1}{2}\left( \sqrt{|Y|/|X|} + \sqrt{|X|/|Y|} \right) & \text{when } i \in X \\ -\frac{1}{2}\left( \sqrt{|Y|/|X|} + \sqrt{|X|/|Y|} \right) & \text{otherwise.} \end{cases} \qquad (5.12)$$

The corresponding bilinear form then becomes:

$$u^T \bar L u = \sum_{i \sim j} |A_{ij}| \big( u_i - \mathrm{sgn}(A_{ij})\, u_j \big)^2 = |V| \left( 2\, \mathrm{cut}^+(X, Y) + \mathrm{cut}^-(X, X) + \mathrm{cut}^-(Y, Y) \right) \left( \frac{1}{|X|} + \frac{1}{|Y|} \right)$$

This leads us to define the following signed cut of (X, Y):

$$\mathrm{scut}(X, Y) = 2\, \mathrm{cut}^+(X, Y) + \mathrm{cut}^-(X, X) + \mathrm{cut}^-(Y, Y)$$

and to define the signed ratio cut as follows:

$$\mathrm{SignedRatioCut}(X, Y) = \mathrm{scut}(X, Y) \left( \frac{1}{|X|} + \frac{1}{|Y|} \right)$$

Therefore, the following minimization problem solves the signed clustering problem, where $\mathcal{U}$ denotes the set of vectors of the form given in Equation (5.12):

$$\min_{X \subset V}\ \mathrm{SignedRatioCut}(X, V \setminus X)$$
$$\min_{u \in \mathbb{R}^V}\ u^T \bar L u \quad \text{s.t.}\ u \in \mathcal{U}$$


Note that we lose the orthogonality of u to the constant vector. This can be explained by the fact that when G contains negative edges, the eigenvector of the smallest eigenvalue can always be used for clustering: if G is balanced, the smallest eigenvalue is zero and its eigenvector is the bipartition vector (±1), which gives the two clusters separated by negative edges; if G is unbalanced, the smallest eigenvalue is larger than zero, so the constant vector plays no role.

The signed cut scut(X, Y) counts the number of positive edges that connect the two groups X and Y, and the number of negative edges that remain inside each of these groups. Thus, minimizing the signed cut leads to clusterings where the two groups are connected by few positive edges and contain few negative edges inside each group. This signed ratio cut generalizes the ratio cut of unsigned graphs and justifies the use of the signed Laplacian $\bar L$ for spectral clustering of signed graphs.
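As an illustration of this derivation (our sketch on a hypothetical six-node graph, not the paper's code), a 2-clustering can be obtained by thresholding the eigenvector of the smallest eigenvalue of $\bar L$ at zero and evaluating the resulting signed cut:

```python
import numpy as np

def signed_spectral_2clustering(A):
    """Relaxed signed ratio cut: threshold the eigenvector of the
    smallest eigenvalue of the signed Laplacian at zero."""
    L_bar = np.diag(np.abs(A).sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L_bar)
    return vecs[:, 0] >= 0              # boolean cluster membership

def scut(A, x):
    """scut(X, Y) = 2 cut+(X, Y) + cut-(X, X) + cut-(Y, Y)."""
    Ap, Am = np.maximum(A, 0), np.maximum(-A, 0)
    X, Y = x, ~x
    return (2 * Ap[np.ix_(X, Y)].sum()
            + Am[np.ix_(X, X)].sum() + Am[np.ix_(Y, Y)].sum())

# Two positive triangles joined by a negative edge plus one positive
# 'conflict' edge, which makes the graph unbalanced.
A = np.zeros((6, 6))
for i, j, w in [(0, 1, 1), (1, 2, 1), (0, 2, 1), (3, 4, 1),
                (4, 5, 1), (3, 5, 1), (2, 3, -1), (0, 5, 1)]:
    A[i, j] = A[j, i] = w

labels = signed_spectral_2clustering(A)
print(labels, scut(A, labels))
```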

5.3 Normalized Laplacian When instead of normalizing with the number of vertices |X| we normalize with the volume vol(X), i.e. the total weight of edges incident to X, the result is a spectral clustering algorithm based on the eigenvectors of $D^{-1}A$, introduced by Shi and Malik [24]. The cuts normalized by vol(X) are called normalized cuts. In the signed case, the eigenvectors of $\bar D^{-1}A$ lead to the signed normalized cut:

$$\mathrm{SignedNormalizedCut}(X, Y) = \mathrm{scut}(X, Y) \left( \frac{1}{\mathrm{vol}(X)} + \frac{1}{\mathrm{vol}(Y)} \right)$$

A similar derivation can be made for normalized cuts based on $\bar D^{-1/2} A \bar D^{-1/2}$, generalizing the spectral clustering method of Ng, Jordan and Weiss [22]. The following section gives an example of clustering a small signed graph.

5.4 Anthropological Example As an application of signed spectral clustering to real-world data, we present the dataset of [23]. This dataset describes the relations between sixteen tribal groups of the Eastern Central Highlands of New Guinea [10]. Relations between tribal groups in the Gahuku–Gama alliance structure can be friendly ('rova') or antagonistic ('hina'). We model the dataset as a graph with edge weights +1 for friendship and −1 for enmity.

The resulting graph contains cycles with an odd number of negative edges, and therefore its signed Laplacian matrix is positive-definite. We use the eigenvectors of the two smallest eigenvalues (1.04 and 2.10) to embed the graph into the plane. The result is shown in Figure 6. We observe that indeed the positive (green) edges are short and the negative (red) edges are long. Looking at only the positive edges, the drawing makes the two connected components easy to see. Looking at only the negative edges, we recognize that the tribal groups can be clustered into three groups, with no negative edges inside any group. These three groups correspond indeed to a higher-order grouping in the Gahuku–Gama society [10]. An example on a larger network is shown in the next section, using the genres of movies in a user–item graph with positive and negative ratings.

Figure 6: The tribal groups of the Eastern Central Highlands of New Guinea from the study of Read [23], drawn using the signed Laplacian graph embedding. Individual tribes are shown as vertices of the graph, with friendly relations drawn as green edges and antagonistic relations as red edges. The three higher-order groups described by Hage & Harary in [10] are linearly separable.

    6 Graph Kernels and Link Prediction

In this section, we describe Laplacian graph kernels that apply to graphs with negatively weighted edges. We first review ordinary Laplacian graph kernels, and then extend these to signed graphs.

A kernel is a function of two variables k(x, y) that is symmetric, positive-semidefinite, and represents a measure of similarity in the space of its arguments. Graph kernels are defined on the vertices of a given graph, and are used for link prediction, classification and other machine learning problems.

6.1 Unsigned Laplacian Kernels We briefly review three graph kernels based on the ordinary Laplacian matrix.

Commute time kernel The simplest graph kernel using L is the Moore–Penrose pseudoinverse $L^+$. This kernel is called the commute time kernel due to its connection to random walks [7]. Alternatively, it can be described as the resistance distance kernel, based on interpreting the graph as a network of electrical resistances [14].

Regularized Laplacian kernel Regularizing the Laplacian kernel results in the regularized Laplacian graph kernel $(I + \alpha L)^{-1}$, which is always positive-definite [13]; $\alpha$ is the regularization parameter.

Heat diffusion kernel By considering a process of heat diffusion on a graph, one arrives at the heat diffusion kernel $\exp(-\alpha L)$ [15].

6.2 Signed Laplacian Kernels To apply the Laplacian graph kernels to signed graphs, we replace the ordinary Laplacian L by the signed Laplacian $\bar L$.

Signed resistance distance In graphs with negative edge weights, the commute time kernel can be extended to $\bar L^+$. As noted in Section 4, $\bar L$ is positive-definite for certain graphs, in which case the pseudoinverse reduces to the ordinary matrix inverse. The signed Laplacian graph kernel can also be interpreted as the signed resistance distance kernel [18]. A separate derivation of the signed resistance distance kernel is given in the next section.

Regularized Laplacian kernel For signed graphs, we define the regularized Laplacian kernel as $(I + \alpha \bar L)^{-1}$.

Heat diffusion kernel We extend the heat diffusion kernel to signed graphs, giving $\exp(-\alpha \bar L)$.

Because $\bar L$ is positive-semidefinite, all three kernels are also positive-semidefinite and are therefore proper kernels.

In the following two subsections, we perform experiments in which we apply these signed Laplacian graph kernels to link sign prediction in unipartite and bipartite networks, respectively. Since these networks contain negatively weighted edges, we cannot apply the ordinary Laplacian graph kernels to them. Therefore, we compare the signed Laplacian graph kernels to other kernels defined for networks with negative edges.
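A sketch of the three signed kernels (ours; the value of the parameter α is a user choice, not specified in the text):

```python
import numpy as np
from scipy.linalg import expm, pinvh

def signed_kernels(A, alpha=1.0):
    """The three signed Laplacian graph kernels of Section 6.2."""
    L_bar = np.diag(np.abs(A).sum(axis=1)) - A
    n = A.shape[0]
    K_commute = pinvh(L_bar)                           # signed resistance distance
    K_reg = np.linalg.inv(np.eye(n) + alpha * L_bar)   # regularized Laplacian
    K_heat = expm(-alpha * L_bar)                      # heat diffusion
    return K_commute, K_reg, K_heat
```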

6.3 Social Network Analysis In this section, we apply the signed Laplacian kernels to the task of link sign prediction in the social network of user relationships in the Slashdot Zoo. The Slashdot Zoo consists of the relationships between users of the technology news site Slashdot. On Slashdot, users can tag other users as 'friends' and 'foes', giving rise to a network of users connected by two edge types [17]. We model friend links as having the weight +1 and foe links as having the weight −1. We ignore link direction, resulting in an undirected graph. The resulting network has 71,523 nodes and 488,440 edges, of which 24.1% are negative.

6.3.1 Evaluation Setup As the task for evaluation, we choose the prediction of link sign. We split the edge set into a training and a test set, the test set containing 33% of all edges. In contrast to the usual task of link prediction, we do not try to predict whether an edge is present in the test set, but only its weight.

As baseline kernels, we take the symmetric adjacency matrix A of the training set, compute its reduced eigenvalue decomposition $A_{(k)} = U \Lambda U^T$ of rank k = 9, and use it to compute the following three link sign prediction functions:

Rank reduction The reduced eigenvalue decomposition $A_{(k)}$ itself can be used as a rank-k approximation of A. We use this approximation as the link sign prediction.

Power sum We use a polynomial of degree 4 in $A_{(k)}$, the coefficients of which are determined empirically by cross validation to give the most accurate link sign prediction.

Matrix exponential We compute the exponential graph kernel $\exp(A_{(k)}) = U \exp(\Lambda) U^T$ and use it for link prediction.

To evaluate the signed spectral approach, we use three graph kernels based on the signed Laplacian matrix $\bar L$. These kernels are computed with the reduced eigenvalue decomposition $\bar L_{(k)} = U \Lambda U^T$, in which only the smallest k = 9 eigenvalues are kept.

Signed resistance distance The pseudoinverse of the signed Laplacian, $\bar L^+ = U \Lambda^+ U^T$, gives the signed resistance distance kernel. In this dataset, cycles with an odd number of negative edges exist, so $\bar L$ is positive-definite and we can use all eigenvectors of $\bar L$.

Signed regularized Laplacian By regularization we arrive at $(I + \alpha \bar L)^{-1} = U (I + \alpha \Lambda)^{-1} U^T$, the regularized Laplacian graph kernel.

Signed heat diffusion We compute the heat diffusion kernel given by $\exp(-\alpha \bar L) = U \exp(-\alpha \Lambda) U^T$.
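A sketch of how such reduced-rank signed kernels might be computed (our illustration; the rank k = 9 follows the text, while the parameter α and the function names are ours):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def reduced_signed_kernel(A, k=9, alpha=1.0, kind="reg"):
    """Rank-k signed Laplacian kernel for link sign prediction, keeping
    only the k smallest eigenpairs of L_bar (k = 9 in the text)."""
    d = np.asarray(abs(A).sum(axis=1)).ravel()
    L_bar = sp.diags(d) - A
    lam, U = eigsh(L_bar, k=k, which='SA')   # k smallest eigenpairs
    if kind == "commute":
        f = 1.0 / lam                        # signed resistance distance
    elif kind == "reg":
        f = 1.0 / (1.0 + alpha * lam)        # regularized Laplacian
    else:
        f = np.exp(-alpha * lam)             # heat diffusion
    return (U * f) @ U.T                     # U diag(f(lam)) U^T

# The predicted sign of an edge (i, j) is then sign(K[i, j]).
```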

6.3.2 Evaluation Results As a measure of prediction accuracy, we use the root mean squared error (RMSE) [7]. The evaluation results are summarized in Figure 7. We observe that all three signed Laplacian graph kernels perform better than the baseline methods. The best performance is achieved by the signed regularized Laplacian kernel.

Figure 7: The link sign prediction accuracy of different graph kernels measured by the root mean squared error (RMSE).

    Kernel                         RMSE
    Rank reduction                 0.838
    Power sum                      0.840
    Matrix exponential             0.839
    Signed resistance distance     0.812
    Signed regularized Laplacian   0.778
    Signed heat diffusion          0.789

6.4 Collaborative Filtering To evaluate the signed Laplacian graph kernel on a bipartite network, we chose the task of collaborative filtering on the MovieLens 100k corpus, as published by GroupLens Research. In a collaborative filtering setting, the network consists of users, items and ratings. Each rating is represented by an edge between a user and an item. The MovieLens 100k corpus of movie ratings contains 6,040 users, 3,706 items, and about a hundred thousand ratings.

Figure 8 shows the users and items of the MovieLens 100k corpus drawn with the method described in Section 2.2. We observe that the signed Laplacian graph drawing method places movies on lines passing through the origin. These lines correspond to clusters in the underlying graph, as illustrated by the placement of the movies in the genre 'adventure', shown in red.

6.4.1 Rating Prediction The task of collaborative filtering consists of predicting the value of unknown ratings. To do this, we split the set of ratings into a training set and a test set, the test set containing 2% of all known ratings. Using the graph consisting of the edges in the training set, we compute the signed Laplacian matrix and use its inverse as a kernel for rating prediction.

6.4.2 Algorithms In the setting of collaborative filtering, the graph G = (V, E, A) is bipartite, containing user vertices, item vertices, and edges between them. In the MovieLens 100k corpus, ratings are given on an integer scale from 1 to 5. We scale these ratings to be evenly distributed around zero by subtracting the mean of the user and item ratings. Let $R_{ij} \in \{1, 2, 3, 4, 5\}$ be the rating by user i of item j; then we define

$$A_{ij} = A_{ji} = R_{ij} - (\mu_i + \mu_j)/2,$$

where $\mu_i$ and $\mu_j$ are the mean ratings given by user i and received by item j, averaged over the $d_i$ and $d_j$ ratings of each. The resulting adjacency matrix A is symmetric. We now describe the rating prediction algorithms we evaluate.

Mean As the first baseline algorithm, we use $(\mu_i + \mu_j)/2$ as the prediction.

Rank reduction As another baseline algorithm, we perform rank reduction on the symmetric matrix A by truncating the eigenvalue decomposition $A = U \Lambda U^T$ to the first k eigenvectors and eigenvalues, giving $A_{(k)} = U_{(k)} \Lambda_{(k)} U_{(k)}^T$. We vary the value of k between 1 and 120. As the prediction for the user–item pair (i, j), we use the corresponding value in $A_{(k)}$. To that value, we add $(\mu_i + \mu_j)/2$ to scale the value back to the MovieLens rating scale.

Figure 8: The users and items of the MovieLens 100k corpus plotted using (a) the eigenvectors of A with largest eigenvalue and (b) the eigenvectors of $\bar L$ with smallest eigenvalue. Users are shown in yellow, movies in blue, and movies in the genre 'adventure' in red. Edges are omitted for clarity. In the Laplacian plot, movies are clustered along lines through the origin, justifying the usage of the cosine similarity as described below.

Signed resistance distance kernel We use the signed Laplacian kernel $K = \bar L^+$ for prediction. We compute the kernel on a rank-reduced eigenvalue decomposition of $\bar L$. As observed in Figure 8, clusters of vertices tend to form lines through the origin. Therefore, we suspect that the distance of a single point to the origin does not matter for rating prediction, and we use the cosine similarity in the space spanned by the square root of the inverted Laplacian. This corresponds to established usage for the unsigned Laplacian graph kernel [7].

Given the signed Laplacian matrix $\bar L$ with eigendecomposition $\bar L = U \Lambda U^T$, we set $M = U \Lambda^{-1/2}$, noting that $K = M M^T$. Then we normalize the rows of M to unit length to get the matrix N, using $N_i = M_i / |M_i|$. The cosine similarity between any two vertices i and j is then given by $N_i N_j^T$, and $N N^T$ is our new kernel. As with simple rank reduction, we scale the result to the MovieLens rating scale by adding $(\mu_i + \mu_j)/2$.

Figure 9: Root mean squared error of the rating prediction algorithms (mean baseline, dimensionality reduction, and signed Laplacian kernel) as a function of the reduced rank k. Lower values denote better prediction accuracy.
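A sketch of this cosine construction (ours; `U` and `lam` are assumed to hold the reduced eigendecomposition of $\bar L$, and `mu_user`, `mu_item` the rating means from Section 6.4.2):

```python
import numpy as np

def cosine_signed_kernel(U, lam):
    """Cosine similarity in the space spanned by the square root of the
    inverted signed Laplacian: M = U diag(lam)^{-1/2}, rows normalized."""
    M = U / np.sqrt(lam)                              # U Lambda^{-1/2}
    N = M / np.linalg.norm(M, axis=1, keepdims=True)  # unit-length rows
    return N @ N.T                                    # kernel N N^T

def predict_rating(K, i, j, mu_user, mu_item):
    # Scale the kernel value back to the MovieLens rating scale.
    return K[i, j] + (mu_user[i] + mu_item[j]) / 2.0
```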

The evaluation results are shown in Figure 9. We observe that the signed Laplacian kernel has higher rating prediction accuracy than simple rank reduction and than the baseline algorithm. We also note that simple rank reduction achieves better accuracy for small values of k. All kernels display overfitting.

    7 Electrical Networks

In this section, we complete the work of [18] by giving a derivation of the signed Laplacian matrix $\bar L$ from electrical networks with negative edges.

The resistance distance and heat diffusion kernels defined in the previous section are justified by physical interpretations. In the presence of signed edge weights, these interpretations still hold using the following formalism, which we describe in terms of electrical resistance networks.

A positive electrical resistance indicates that the potentials of two connected nodes will tend toward each other: the smaller the resistance, the more both potentials approach each other. If the edge has negative weight, we can interpret the connection as consisting of a resistor of the corresponding positive weight in series with an inverting amplifier that guarantees its ends to have opposite voltage, as depicted in Figure 10. In other words, two nodes connected by a negative edge will tend to opposite voltages.

Figure 10: An edge with a negative weight is interpreted as a positive resistor in series with an inverting component.

Thus, a positive edge with weight +a is modeled by a resistor of resistance a, and a negative edge with weight −a is modeled by a resistor of resistance a in series with a (hypothetical) electrical component that assures its ends have opposite electrical potential.

Electrical networks with such edges can be analysed in the following way. Let i be any node of an unsigned network. Then its electric potential is given by the mean of its neighbors' potentials, weighted by edge weights:

$$v_i = \Big(\sum_{j \sim i} A_{ij}\Big)^{-1} \sum_{j \sim i} A_{ij} v_j$$

If edges have negative weights, the inversion of potentials results in the following equation:

$$v_i = \Big(\sum_{j \sim i} |A_{ij}|\Big)^{-1} \sum_{j \sim i} A_{ij} v_j$$

Thus, electrical networks give the equation $\bar D v = A v$ and the matrices $\bar D^{-1} A$ and $\bar D - A$.

Because such an inverting amplifier needs the definition of a specific value of zero voltage, the resulting model loses one degree of freedom, explaining why in the general case $\bar L$ has rank one greater than L.

    8 Discussion

The various example applications in this paper show that signed graphs appear in many diverse spectral graph mining applications, and that they can be approached by defining a variant of the Laplacian matrix, $\bar L$. Although we used the notation $\bar L = \bar D - A$ throughout this paper, we propose simply writing L = D − A for signed graphs as well, with D understood to contain absolute degrees, since this does not interfere with unsigned graphs (in which $\bar D = D$) and because the matrix D − A with signed degrees is less useful for data mining applications². This signed extension exists not only for the combinatorial Laplacian L but also for the two normalized Laplacians $I - \bar D^{-1} A$ and $\bar D^{-1/2} \bar L \bar D^{-1/2}$.

² Although D − A is used for signed graphs in knot theory.

Spectral clustering of signed graphs is thus indeed possible using the intuitive measure of 'cut' which counts positive edges between two clusters and negative edges inside each cluster. As for unsigned spectral clustering, the different Laplacian matrices correspond to ratio cuts and normalized cuts. For link prediction, we saw that the signed Laplacians can be used as kernels (since they are positive-semidefinite), and can replace graph kernels based on the adjacency matrix. This is especially true when the sign of edges is to be predicted. Finally, we derived the signed Laplacian from the applications of graph drawing and electrical networks. These derivations confirm that the definition $\bar L = \bar D - A$ is to be preferred over D − A for signed graphs.

In all cases, we observed that when the graph is unbalanced, zero is not an eigenvalue of the signed Laplacian, and thus the eigenvector of the smallest eigenvalue can be used directly, unlike in the unsigned case, where the eigenvector of the least eigenvalue is trivial and must be ignored. For graph drawing, this results in the loss of translational invariance of the drawing, i.e. the drawing is placed relative to a fixed zero point. For electrical networks, this results in the loss of invariance under addition of a constant to the electrical potentials, since the inverting amplification depends on the chosen zero voltage.

    References

[1] M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Advances in Neural Information Processing Systems, pages 585–591, 2002.

[2] U. Brandes, D. Fleischer, and J. Lerner. Summarizing dynamic bipolar conflict structures. Trans. on Visualization and Computer Graphics, 12(6):1486–1499, 2006.

[3] P. Chebotarev and E. V. Shamis. On proximity measures for graph vertices. Automation and Remote Control, 59(10):1443–1459, 1998.

[4] F. Chung. Spectral Graph Theory. American Mathematical Society, 1997.

[5] M. Desai and V. Rao. A characterization of the smallest eigenvalue of a graph. Graph Theory, 18(2):181–194, 1994.

[6] I. S. Dhillon, Y. Guan, and B. Kulis. Kernel k-means: Spectral clustering and normalized cuts. In Proc. Int. Conf. on Knowledge Discovery and Data Mining, pages 551–556, 2004.

[7] F. Fouss, A. Pirotte, J.-M. Renders, and M. Saerens. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. Trans. on Knowledge and Data Engineering, 19(3):355–369, 2007.

[8] C. Gkantsidis, M. Mihail, and E. Zegura. Spectral analysis of Internet topologies. In Proc. Joint Conf. of the IEEE Computer and Communications Societies, pages 364–374, 2003.

[9] K. Goldberg, T. Roeder, D. Gupta, and C. Perkins. Eigentaste: A constant time collaborative filtering algorithm. Information Retrieval, 4(2):133–151, 2001.

[10] P. Hage and F. Harary. Structural Models in Anthropology. Cambridge University Press, 1983.

[11] F. Harary. On the notion of balance of a signed graph. Michigan Math. J., 2(2):143–146, 1953.

[12] Y. Hou. Bounds for the least Laplacian eigenvalue of a signed graph. Acta Mathematica Sinica, 21(4):955–960, 2005.

[13] T. Ito, M. Shimbo, T. Kudo, and Y. Matsumoto. Application of kernels to link analysis. In Proc. Int. Conf. on Knowledge Discovery in Data Mining, pages 586–592, 2005.

[14] D. J. Klein and M. Randić. Resistance distance. Mathematical Chemistry, 12(1):81–95, 1993.

[15] R. Kondor and J. Lafferty. Diffusion kernels on graphs and other discrete structures. In Proc. Int. Conf. on Machine Learning, pages 315–322, 2002.

[16] Y. Koren, L. Carmel, and D. Harel. ACE: A fast multiscale eigenvectors computation for drawing huge graphs. In Symposium on Information Visualization, pages 137–144, 2002.

[17] J. Kunegis, A. Lommatzsch, and C. Bauckhage. The Slashdot Zoo: Mining a social network with negative edges. In Proc. Int. World Wide Web Conf., pages 741–750, 2009.

[18] J. Kunegis, S. Schmidt, C. Bauckhage, M. Mehlitz, and S. Albayrak. Modeling collaborative similarity with the signed resistance distance kernel. In Proc. European Conf. on Artificial Intelligence, pages 261–265, 2008.

[19] M. Lien and W. Watkins. Dual graphs and knot invariants. Linear Algebra and its Applications, 306(1):123–130, 2000.

[20] U. von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416, 2007.

[21] P. Massa and P. Avesani. Controversial users demand local trust metrics: An experimental study on the epinions.com community. In Proc. American Association for Artificial Intelligence Conf., pages 121–126, 2005.

[22] A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In Advances in Neural Information Processing Systems, pages 849–856, 2001.

[23] K. E. Read. Cultures of the Central Highlands, New Guinea. Southwestern J. of Anthropology, 10(1):1–43, 1954.

[24] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.