
Faculty of Science
Master of Science in Mathematics - profile: Education

Similarity on Graphs and Hypergraphs

Graduation thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Mathematics - profile: Education

Filip Moons

Promotor: Prof. Dr. P. Cara

August 2015


Acknowledgements

First of all, I would like to thank my promotor, Philippe Cara, for his incredible attention to detail, for the patience he had with me and for the many reminders to keep working hard. He also showed me the way to this beautiful topic for a Master's thesis. Besides his role as promotor, I will never forget him as the professor of the Linear Algebra course in the very first term of the BSc in Mathematics, which was without any doubt the most memorable course of my college years: the hard study and the stress I felt before entering the exam room are memories I will still be telling about in my old age.

Secondly, I would like to thank Roger Van Nieuwenhuyze, my mathematics lecturer during the teacher training I followed for one year at the Hogeschool-Universiteit Brussel. Without his encouragement, I would never have dared to leave and study Mathematics at university level.

I also thank all the professors and teaching assistants who taught me a lot during the past five years. A special thanks goes to Reen Tallon, who always managed to overcome any administrative burden caused by my combination of study programs.

Last but not least, I want to thank my parents for their angry looks whenever they felt I did not study enough; it certainly kept me on track. My partner Sacha, my classmates, my friends from university and the Classical Liberal Student Union made my college years unforgettable for the rest of my life.


Abstract

This thesis is about similarity on graphs and hypergraphs. In the first chapter, we give an overview of all the preliminary knowledge needed to understand the following chapters. All the similarity methods we discuss eventually amount to solving an eigenvalue problem, so it is no surprise that we prove the Perron-Frobenius theorem and introduce the power method in this first chapter; the power method is a numerical algorithm to calculate the largest eigenvalue of a matrix together with a corresponding eigenvector.

In Chapter 2, we give a complete overview of the research done on similarity of graphs. The notion of similarity originated from the HITS algorithm, which was developed in the late nineties to sort the results of a search engine in a meaningful way. The algorithm uses the hyperlink structure of a set of webpages to assign to each webpage in the set a score for how well it refers to other pages in the set (the hub score) and a score for how authoritative it is compared to the other pages in the set (the authority score). So you get two groups in the set of webpages: some pages are good hubs, others are good authorities. Obviously, these two roles are strongly related: pages that are good hubs contain a lot of links to authoritative pages and, conversely, pages are good authorities when they have a lot of incoming links from good hubs. This algorithm was very innovative at the time, because back then search engine results were sorted based only on the number of incoming links or on the number of occurrences of the search query.

The HITS algorithm was generalized in 2004 to a notion of similarity between graphs. A set of hyperlinked webpages can indeed be seen as a graph. This generalization leads to a method for comparing two graphs by assigning a similarity score to every pair consisting of a node of the first graph and a node of the second graph. This similarity score indicates how similar these nodes are: two nodes will have a high similarity score when their adjacent nodes also have a high similarity score.

Once this node similarity is introduced on graphs, we extend the notion to a similarity method that also returns similarity scores between the edges of two graphs. We also discuss a similarity method that deals with colored graphs. With all these methods, we have a complete overview of the most recent studies on graph similarity. It is remarkable that the convergence of the resulting algorithm of each of these methods can be proved with the same convergence theorem (Theorem 2.2.10). Every solution is in fact just an eigenvalue problem: every extension is a variant of the power method for calculating eigenvectors.

An important remark is that the methods return slightly different results. At the moment, 'the' similarity score between two vertices of a graph does not exist, simply because it depends on the method you are using: in some cases the results are equal (up to a constant), in other cases the results differ because slightly different criteria are used in the calculation. In that case, we can explicitly derive the difference. Still, this is not a serious problem: it is important to remember that similarity scores are mainly interesting when compared to


other similarity scores within a graph and with one method. From that point of view, every method gives the same ranking. This is also justified by the applications: all of them use similarity scores in comparison to other similarity scores.

In the last chapter, we explore a new field: we try to transfer the concept of similarity to the more general structure of hypergraphs. We begin this chapter with a reflection on the results of the previous chapter: when we try to develop a method for similarity on hypergraphs, which conditions must be fulfilled? Are these conditions also met by the graph similarity methods of the previous chapter? First, we look at the graph representations of hypergraphs and try to obtain a good similarity method by using them. It will become clear that the incidence graph of a hypergraph gives the best results when used for similarity. After that, we also develop a method based on the incidence matrix; this method fulfills all conditions as well. We conclude the thesis by proving that both methods are equal up to a constant.


Summary (Samenvatting)

This thesis discusses similarity on graphs and hypergraphs. In the first chapter, we give an overview of all the prior knowledge needed to get through this thesis. Almost all the methods discussed in this thesis eventually come down to solving an eigenvalue problem: it is therefore no surprise that we prove the Perron-Frobenius theorem and introduce the power method for computing eigenvectors numerically.

The notion of similarity on graphs, introduced in Chapter 2, has its origin in the HITS algorithm: this algorithm was developed at the end of the nineties to sort the results of an internet search engine in a meaningful way. The algorithm examines the relations between the hyperlinks of a set of webpages: some webpages are labeled as good hubs, while other webpages are considered an authoritative source for the search term. Of course, these two roles are strongly related: pages that are good hubs will contain many links to pages that are regarded as authoritative sources and, conversely, pages will be regarded as authoritative sources when they receive many incoming links from good hubs. This algorithm was very innovative at the time, since until then search results were sorted only by the number of incoming links (without checking the source page) or by the number of times the search term occurred on the page.

Around 2004 this method was generalized to a notion of similarity between graphs. A set of mutually linked webpages, as used in the HITS algorithm, can indeed be regarded as a graph. This generalization leads to a method where we can attach a score to how much two nodes of two different graphs resemble each other. This 'similarity score' will be higher when the adjacent nodes of these two nodes are also similar.

Once we have introduced similarity on the nodes of graphs, we extend this method further to similarity on nodes and edges of graphs and to similarity between colored graphs. This gives us a complete overview of the most recent studies on the subject. It is striking that the convergence of the resulting algorithms of all these extensions can be proved with the same convergence theorem (Theorem 2.2.10). Every extension also comes down to solving an eigenvalue problem: in the end, every extension is a variant of the power method for computing eigenvectors.

An important remark here is that these different extensions yield slightly different results. There is no such thing as 'the' similarity score between two nodes of two graphs, simply because every extension yields a slightly different result: in some cases the results are equal up to a constant, in other cases the variations use slightly different criteria to determine similarity (e.g. the method of similarity on the nodes of graphs versus similarity on the nodes and the edges), so that the results necessarily differ. In that case, we can however explicitly write out the difference


between the methods. It is important to remember that similarity scores are mainly interesting in relation to the other scores: for instance, we can easily determine which node of one graph most resembles a node of another graph simply by looking at the largest similarity score. This will also become apparent from the short overview of applications added at the end of Chapter 2, where a similarity score is likewise only considered in relation to the other similarity scores.

In the last chapter, finally, we explore new territory: we try to transfer the concept of similarity to the more general structure of hypergraphs. This starts with an extensive reflection on the results of the previous chapter: which criteria must such a similarity method for hypergraphs satisfy? Do the graph methods of the previous chapter also satisfy them? Hypergraphs can be represented by several graph representations, and we first try to arrive at a good method along this route. It will turn out that the incidence graph representation of a hypergraph yields the best results. Afterwards, we also develop a method based on the incidence matrix; this method too satisfies all the conditions. Finally, we can prove that the method with the incidence graph and the method with the incidence matrix yield the same results up to a constant.


Contents

1 Preliminaries and notations
   1.1 Some families of matrices
      1.1.1 Permutation matrices
      1.1.2 Nonnegative and primitive matrices
      1.1.3 Irreducible nonnegative matrices
   1.2 Perron-Frobenius Theorem
      1.2.1 Spectral radii of nonnegative matrices
      1.2.2 Proof
      1.2.3 Example
   1.3 Norms
      1.3.1 Vector norms (p-norms; Maximum norm; Norm equivalence)
      1.3.2 Matrix norms (Frobenius norm; p-norms; Maximum norm)
      1.3.3 Spectral radius formula
   1.4 Numerical analysis
      1.4.1 Bachmann-Landau notations (Big O; Small O; Asymptotical Equality)
      1.4.2 The Power Method (Algorithm 1; Computational cost and usage; Example)
   1.5 Graphs
      1.5.1 General definitions (Product graphs; Colored graphs; Adjacency matrix; Incidence matrix)
      1.5.2 Strong connectivity
   1.6 Hypergraphs
      1.6.1 General definitions (k-hypergraphs; Directed hypergraphs)
      1.6.2 Incidence matrix

2 Similarity on graphs
   2.1 The HITS algorithm of Kleinberg
      2.1.1 History
      2.1.2 Motivation
      2.1.3 Constructing relevant graphs of webpages
      2.1.4 Hubs and Authorities
      2.1.5 Convergence of the algorithm
      2.1.6 Examples (Searching for math professors at the VUB; Predictors in the Eurovision Song Contest 2009-2015)
      2.1.7 Final reflection
   2.2 Node similarity (Introduction)
      2.2.1 Convergence theorem
      2.2.2 Similarity matrices
      2.2.3 Algorithm 5 (Equivalence with the power method; Computational cost)
      2.2.4 Special cases (HITS algorithm; Central scores; Self-similarity of a graph; Undirected graphs)
   2.3 Node-edge similarity
      2.3.1 Coupled node-edge similarity scores
      2.3.2 Algorithm 6 and Algorithm 7
      2.3.3 Difference with node similarity
      2.3.4 Example
   2.4 Colored graphs
      2.4.1 Colored nodes (Method; Algorithm 8 and Algorithm 9; Example)
      2.4.2 Colored edges (Method; Algorithm; Example)
      2.4.3 Fully colored graphs
   2.5 Applications
      2.5.1 Synonym Extraction
      2.5.2 Graph matching

3 Similarity on hypergraphs
   3.1 Introduction
      3.1.1 Motivation
      3.1.2 What would be a good hypergraph similarity method?
   3.2 Similarity through corresponding graph representations
      3.2.1 Line-graphs (General definitions and properties; Algorithm for similarity; Examples; Interpretation; Conclusion)
      3.2.2 2-section of a hypergraph (General definitions and properties; Algorithm for similarity; Examples; Interpretation; Conclusion)
      3.2.3 Extended 2-section of a hypergraph (General definitions and properties; Algorithm for similarity; Example; Interpretation; Conclusion)
      3.2.4 The incidence graph of a hypergraph (General definitions and properties; Algorithm for similarity; Examples; Interpretation; Conclusion)
   3.3 Similarity by using the incidence matrix
      3.3.1 Compact form
      3.3.2 The algorithm
      3.3.3 Examples
      3.3.4 Concluding theorem
      3.3.5 Interpretation

Appendix A Listings

Appendix B Results of the Eurovision Song Contest 2009-2015


Chapter 1

Preliminaries and notations

In this chapter, we give an overview of all the important preliminary knowledge needed to understand the following chapters. We start by introducing different kinds of matrices, and next we prove the Perron-Frobenius theorem. The Perron-Frobenius theorem states that a real square matrix with nonnegative entries has a unique largest real eigenvalue (i.e. with algebraic multiplicity 1) with an eigenvector that has only positive entries. The theorem was proved by Oskar Perron (1880-1975) in 1907 for strictly positive entries and extended by Ferdinand Georg Frobenius (1849-1917) to irreducible matrices with nonnegative entries. Next, we introduce vector and matrix norms and we prove the equivalence of norms and the spectral radius formula. After that, we introduce the power method. We conclude with the introduction of graphs and hypergraphs.

1.1 Some families of matrices

In this section, we first introduce different kinds of matrices. Note that all matrices in this master thesis have real entries, unless otherwise stated. We start with permutation matrices and their uses. With permutation matrices, we can introduce irreducible matrices. Nonnegative and primitive square matrices are also presented. After defining those, we look at the Perron-Frobenius theorem.

1.1.1 Permutation matrices

Definition 1.1.1. Given a permutation π of n elements:

    π : {1, . . . , n} → {1, . . . , n},

with:

    π = ( 1    2    · · ·  n
          π(1) π(2) · · ·  π(n) ),

the associated permutation matrix P_π is the n × n-matrix obtained by permuting the rows of the identity matrix I_n according to π. So:

    P_π = ( e_{π(1)}^T
            e_{π(2)}^T
            ...
            e_{π(n)}^T ),


where e_j is the j-th column of I_n, the identity matrix.

Example 1.1.2. The permutation matrix P_π corresponding to the permutation

    π = ( 1 2 3 4
          1 4 2 3 )

is:

    P_π = ( 1 0 0 0
            0 0 0 1
            0 1 0 0
            0 0 1 0 ).

Note that p_ij = 1 if and only if π(i) = j.

Property 1.1.3. A permutation matrix P satisfies:

    P P^T = I_n.

Proof. By direct computation, we get with P = (p_ij):

    (P P^T)_ij = Σ_{k=1}^n p_ik (P^T)_kj = Σ_{k=1}^n p_ik p_jk.

Assume i ≠ j. Then for each k we have p_ik p_jk = 0: the k-th column of P contains only one nonzero entry, and since i ≠ j, p_ik and p_jk cannot both be that nonzero entry. So (P P^T)_ij = 0 when i ≠ j.

When i = j, there exists a k' ∈ {1, . . . , n} with p_ik' p_jk' = 1; since the i-th row contains only one nonzero entry, this k' is unique, which results in Σ_{k=1}^n p_ik p_jk = (P P^T)_ij = 1. In other words,

    (P P^T)_ij = 1 if i = j, and 0 otherwise,

which is exactly the formula for the entries of the identity matrix.

Corollary 1.1.4. The transpose of a permutation matrix P is its inverse:

    P^T = P^{−1}.

This can also be concluded more directly from the fact that a permutation matrix is clearly an orthogonal matrix (a real n × n-matrix with orthonormal columns).

1.1.2 Nonnegative and primitive matrices

Definition 1.1.5. Let A and B be two real n × r-matrices. Then A ≥ B (respectively A > B) if a_ij ≥ b_ij (respectively a_ij > b_ij) for all 1 ≤ i ≤ n, 1 ≤ j ≤ r.

Definition 1.1.6. A real n × r-matrix A is nonnegative if A ≥ 0, with 0 the n × r zero matrix.

Definition 1.1.7. A real n × r-matrix A is positive if A > 0, with 0 the n × r zero matrix.


Since column vectors are n × 1-matrices, we shall use the terms nonnegative and positive vector throughout.

Notation 1.1.8. Let B be an arbitrary complex n × r-matrix; then |B| denotes the matrix with entries |b_ij|. This is not to be confused with the determinant of a square matrix B, which we denote by det(B).

Definition 1.1.9. A nonnegative square matrix A is called primitive if there is a k ∈ N_0 such that all entries of A^k are positive.

1.1.3 Irreducible nonnegative matrices

In developing the Perron-Frobenius theory, we shall first establish a series of theorems and lemmas on nonnegative irreducible square matrices.

Definition 1.1.10. A square matrix A is called reducible if there is a permutation matrix P such that

    P A P^T = ( B 0
                C D ),

where B and D are square matrices, each of size at least one, and 0 is a zero matrix. A square matrix A is called irreducible if it is not reducible.

It follows immediately that a 1 × 1-matrix is always irreducible by definition. We now show a useful property to identify a reducible matrix.

Property 1.1.11. Let A be an n × n-matrix with n ≥ 2. Consider a nonempty, proper subset S of {1, . . . , n} with a_ij = 0 for i ∈ S, j ∉ S. Then A is reducible.

Proof. Let S = {i_1, i_2, . . . , i_k}, where we assume, without loss of generality, that i_1 < i_2 < · · · < i_{k−1} < i_k. Let S^c be the complement of S, consisting of the ordered set of elements j_1 < j_2 < · · · < j_{n−k}. Consider the permutation σ of {1, 2, . . . , n} given by

    σ = ( 1   2   . . .  k   k+1  k+2  . . .  n
          i_1 i_2 . . .  i_k j_1  j_2  . . .  j_{n−k} ).

σ can be represented by the permutation matrix P_σ = (p_ij), where p_rs = 1 if σ(r) = s. We prove that

    P A P^T = ( B 0
                C D ),

where B and D are square matrices and 0 is a k × (n − k) zero matrix. Consider row c and column d, where 1 ≤ c ≤ k and k + 1 ≤ d ≤ n:

    (P A P^T)_cd = Σ_i Σ_j p_ci a_ij p_dj.     (1.1)

It is enough to show that each term in the summation is zero. Suppose p_ci = p_dj = 1, thus σ(c) = i and σ(d) = j. Since 1 ≤ c ≤ k, we have i ∈ {i_1, i_2, . . . , i_k}; similarly, since k + 1 ≤ d ≤ n, we have j ∈ {j_1, j_2, . . . , j_{n−k}}. By assumption, for such a pair i, j, we have a_ij = 0. That completes the proof.


We now prove some equivalent characterizations of a nonnegative, irreducible square matrix.

Theorem 1.1.12. Let A ≥ 0 be a nonnegative n × n-matrix. Then the following conditions are equivalent:

(1) A is irreducible.

(2) (I + A)^{n−1} > 0.

(3) For any pair (i, j), with 1 ≤ i, j ≤ n, there is a positive integer k = k(i, j) ≤ n such that (A^k)_ij > 0.

Proof. (1) ⇒ (2): Let x ≥ 0, x ≠ 0 be an arbitrary vector in R^n. If a coordinate of x is positive, the same coordinate is positive in x + Ax = (I + A)x as well. We claim that (I + A)x has fewer zero coordinates than x as long as x has a zero coordinate. If this claim is not true, then the number of zero coordinates must be at least equal, which means that for each coordinate j with x_j = 0 we would have x_j + (Ax)_j = 0. Let J = {j : x_j > 0}. For any j ∉ J and r ∈ J, we have (Ax)_j = Σ_k a_jk x_k = 0 and x_r > 0, so it must be that a_jr = 0. It follows from Property 1.1.11 that A is reducible, which is a contradiction, and the claim is proved. Thus (I + A)x has at most n − 2 zero coordinates. Continuing in this manner, we conclude that (I + A)^{n−1}x > 0. Let x = e_i; then the corresponding column of (I + A)^{n−1} must be positive. Thus (2) holds.

(2) ⇒ (3): We have (I + A)^{n−1} > 0 and A ≥ 0, so A ≠ 0 and

    A(I + A)^{n−1} = Σ_{k=1}^{n} (n−1 choose k−1) A^k > 0.

Thus for any i, j at least one of the matrices A, A^2, . . . , A^n has its (i, j)-th entry positive.

(3) ⇒ (1): Suppose A is reducible. Then for some permutation matrix P,

    P A P^T = ( B_1 0
                C_1 D_1 ),

where B_1 and D_1 are square matrices. Furthermore, we know from Property 1.1.3 that P A P^T P A P^T = P A^2 P^T, hence for some square matrices B_2, D_2 and some matrix C_2 we have:

    P A^2 P^T = ( B_2 0
                  C_2 D_2 ).

More generally, for some matrix C_t and square matrices B_t and D_t,

    P A^t P^T = ( B_t 0
                  C_t D_t ).

Thus (P A^t P^T)_rs = 0 for t = 1, 2, . . . and for any r, s corresponding to an entry of the zero submatrix in P A P^T. Now, for t = 1, . . . , n:

    0 = (P A^t P^T)_rs = Σ_k Σ_l p_rk a^{(t)}_{kl} p_ls,

where a^{(t)}_{kl} denotes the (k, l)-th entry of A^t. By the same reasoning as in (1.1), choose r, s so that p_rk = p_ls = 1. Then a^{(t)}_{kl} = 0 for all t, contradicting the hypothesis. This completes the proof.
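As an aside, criterion (2) yields a simple computational test for irreducibility. The following sketch (ours, in Python with NumPy; the function name and the trick of working only on the 0/1 pattern are our own choices, not part of the cited literature) checks whether (I + A)^{n−1} > 0:

    import numpy as np

    def is_irreducible(A):
        # Condition (2) of Theorem 1.1.12: (I + A)^(n-1) > 0.
        # Only the zero/nonzero pattern of A matters, so we work with
        # 0/1 matrices and clip after every product to avoid huge entries.
        n = A.shape[0]
        M = (np.eye(n, dtype=int) + (np.asarray(A) != 0)).clip(0, 1)
        P = np.eye(n, dtype=int)
        for _ in range(n - 1):
            P = (P @ M).clip(0, 1)  # zero/nonzero pattern of (I + A)^k
        return bool((P > 0).all())

For the 2 × 2 matrices discussed in the example of Section 1.2.3 below, this test accepts, for instance, (0 1; 1 0) and rejects (1 1; 0 1).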


Corollary 1.1.13. If A is irreducible then I +A is primitive.

Corollary 1.1.14. AT is irreducible whenever A is irreducible.

Property 1.1.15. No row or column of an irreducible matrix A can vanish. This means thatA cannot have a row or a column of zeros.

Proof. Suppose that A has a zero row, then it could be permuted to

PAP T =

0 0 . . . 0c1...cn

D

by some permutation matrix P . It follows from Definition 1.1.10 that A is reducible. Similarly,if A has zero column, it can be permuted to

PAP T =

B

0...0

c1 . . . cn 0

,

again from Definition 1.1.10 we conclude that A is reducible.


1.2 Perron-Frobenius Theorem

We now prove the Perron-Frobenius theorem. The results of this section are mainly based on the book 'Elementary Matrix Theory' by H.W. Eves [? ]. Although the proof of the Perron-Frobenius theorem is very straightforward in that work, it was sometimes necessary to extend the reasoning in some proofs; the books [? ] and [? ] were very helpful in those cases.

1.2.1 Spectral radii of nonnegative matrices

Definition 1.2.1. Let A be an n × n-matrix with real entries and eigenvalues λ_i, 1 ≤ i ≤ n. Then:

    ρ(A) = max_{1≤i≤n} |λ_i|

is called the spectral radius of the matrix A.

Geometrically, if all the eigenvalues λ_i of A are plotted in the complex plane, then ρ(A) is the radius of the smallest disk |z| ≤ R, centered at the origin, which includes all the eigenvalues of the matrix A.

We now establish a series of lemmas on nonnegative irreducible square matrices. These lemmas will allow us to prove the Perron-Frobenius theorem at the end of this section.

If A ≥ 0 is an irreducible n × n-matrix and x a vector of size n with 0 ≠ x ≥ 0, let

    r_x = min_{i : x_i > 0} { (Σ_{j=1}^n a_ij x_j) / x_i },     (1.2)

where the minimum is taken over all i for which x_i > 0. Clearly, r_x is a nonnegative real number and is the supremum of all p ≥ 0 for which

    Ax ≥ p x.     (1.3)

We now consider the nonnegative quantity r defined by

    r = sup_{x≥0, x≠0} {r_x}.     (1.4)

As r_x and r_{αx} have the same value for any scalar α > 0, we need consider only the set B of vectors x ≥ 0 with ‖x‖ = 1, and we correspondingly let Q be the set of all vectors y = (I + A)^{n−1}x where x ∈ B. From Theorem 1.1.12, Q consists only of positive vectors. Multiplying both sides of the inequality Ax ≥ r_x x by (I + A)^{n−1}, we obtain

    Ay ≥ r_x y for y = (I + A)^{n−1}x,

and we conclude from (1.3) that r_y ≥ r_x. Therefore, the quantity r of (1.4) can be defined equivalently as:

    r = sup_{y∈Q} {r_y}.     (1.5)


As B is a compact set (in the usual topology) of vectors, so is Q, and as r_y is a continuous function on Q, we know from the extreme value theorem that there necessarily exists a positive vector z for which:

    Az ≥ rz,     (1.6)

and no vector w ≥ 0 exists for which Aw > rw.

Definition 1.2.2. We call all nonnegative, nonzero vectors z satisfying (1.6) extremal vectors of the matrix A.

Lemma 1.2.3. If A ≥ 0 is an irreducible n× n-matrix, the quantity r of (1.4) is positive.

Proof. If x is the positive vector whose coordinates are all unity, then since the matrix A is irreducible, we know from Property 1.1.15 that no row of A can vanish, and consequently no component of Ax can vanish. Thus r_x > 0, proving that r > 0.

Lemma 1.2.4. If A ≥ 0 is an irreducible n × n-matrix, each extremal vector z is a positive eigenvector of A with corresponding eigenvalue r of (1.4), i.e., Az = rz and z > 0.

Proof. Let z be an extremal vector and write Az − rz = t, so t ≥ 0. If t ≠ 0, then some coordinate of t is positive; multiplying through by the matrix (I + A)^{n−1}, we get:

    Aw − rw > 0, with w = (I + A)^{n−1}z,

and from Theorem 1.1.12 we know that w > 0. It would then follow that r_w > r, contradicting the definition of r in (1.5). Thus Az = rz, and since w > 0 and w = (1 + r)^{n−1}z, we have z > 0, completing the proof.

Lemma 1.2.5. Let A ≥ 0 be an irreducible n × n-matrix, and let B be an n × n complex matrix with |B| ≤ A. If β is any eigenvalue of B, then

    |β| ≤ r,     (1.7)

where r is the positive quantity of (1.4). Moreover, equality holds in (1.7), i.e., β = re^{iφ}, if and only if |B| = A and B has the form:

    B = e^{iφ}DAD^{−1},     (1.8)

where D is a diagonal matrix whose diagonal entries have modulus unity.

Proof. If βy = By where y ≠ 0, then

    βy_i = Σ_{j=1}^n b_ij y_j, with 1 ≤ i ≤ n.

Using the hypotheses of the lemma and the notation |B| of Notation 1.1.8, it follows that:

    |β||y| ≤ |B||y| ≤ A|y|,     (1.9)

which implies that |β| ≤ r_{|y|} ≤ r, proving (1.7). If |β| = r, then |y| is an extremal vector of A. Therefore, from Lemma 1.2.4, |y| is a positive eigenvector of A corresponding to the positive eigenvalue r. Thus,

    r|y| = |B||y| = A|y|,     (1.10)

and since |y| > 0, we conclude from (1.10) and the hypothesis |B| ≤ A that

    |B| = A.     (1.11)

For the vector y, where |y| > 0, let

    D = diag{ y_1/|y_1|, . . . , y_n/|y_n| }.

It is clear that the diagonal entries of D have modulus unity, and

    y = D|y|.     (1.12)

Setting β = re^{iφ}, By = βy can be written as:

    C|y| = r|y|,     (1.13)

where

    C = e^{−iφ}D^{−1}BD.     (1.14)

From (1.10) and (1.13), equating the terms equal to r|y|, we have

    C|y| = |B||y| = A|y|.     (1.15)

From the definition of the matrix C in (1.14), |C| = |B|. Combining this with (1.11), we have:

    |C| = |B| = A.     (1.16)

Thus, from (1.15) we conclude that C|y| = |C||y|, and as |y| > 0, it follows that C = |C| and thus C = A by (1.16). Combining this result with (1.14) gives the desired result that B = e^{iφ}DAD^{−1}. Conversely, it is obvious that if B has the form in (1.8), then |B| = A, and B has an eigenvalue β with |β| = r, which completes the proof.

Corollary 1.2.6. If A ≥ 0 is an irreducible n × n-matrix, then the positive eigenvalue r of Lemma 1.2.4 equals the spectral radius ρ(A) of A.

Proof. Setting B = A in Lemma 1.2.5 immediately gives us this result.

In other words, if A ≥ 0 is an irreducible n × n-matrix, its spectral radius ρ(A) is positive, and the intersection in the complex plane of the circle |z| = ρ(A) with the positive real axis is an eigenvalue of A.

Definition 1.2.7. A principal square submatrix of an n × n-matrix A is any matrix obtained by crossing out any j rows and the corresponding j columns of A, with 1 ≤ j ≤ n.

Lemma 1.2.8. If A ≥ 0 is an irreducible n × n-matrix, and B is any principal square submatrix of A, then ρ(B) < ρ(A).


Proof. If B is any principal square submatrix of A, then there is an n × n-permutation matrix P such that B = A_11, where

    P A P^T = ( A_11 A_12
                A_21 A_22 )   and we set   C = ( A_11 0
                                                 0    0 ).     (1.17)

Here, A_11 and A_22 are, respectively, m × m and (n − m) × (n − m) principal square submatrices of P A P^T, 1 ≤ m ≤ n. Clearly 0 ≤ C ≤ P A P^T and ρ(C) = ρ(B) = ρ(A_11), but as C = |C| ≠ P A P^T, the conclusion follows immediately from Lemma 1.2.5 and Corollary 1.2.6.

The following lemma is used to prove that ρ(A) is a simple eigenvalue of A in the Perron-Frobenius theorem. The proof uses the extension of the product rule of differentiation to multilinear functions M(a_1, . . . , a_k). Suppose x_1, . . . , x_k are differentiable vector functions; then M(x_1, . . . , x_k) is differentiable and:

    (d/dt) M(x_1, . . . , x_k) = M((d/dt)x_1, x_2, . . . , x_k) + M(x_1, (d/dt)x_2, . . . , x_k) + · · · + M(x_1, x_2, . . . , (d/dt)x_k).

The most important application of this rule is the derivative of the determinant (as a multilinear function of the rows or columns):

    (d/dt) det(x_1, . . . , x_k) = det((d/dt)x_1, x_2, . . . , x_k) + det(x_1, (d/dt)x_2, . . . , x_k) + · · · + det(x_1, x_2, . . . , (d/dt)x_k).

Lemma 1.2.9. Let A be an n × n-matrix over the complex numbers and let φ(A, λ) = det(λI_n − A) be the characteristic polynomial of A. Let B_i be the principal submatrix of A formed by deleting the i-th row and column of A and let φ(B_i, λ) be the characteristic polynomial of B_i. Then:

    φ'(A, λ) = dφ(A, λ)/dλ = Σ_i φ(B_i, λ).

Proof. The proof is done by direct computation:

    φ(A, λ) = det ( λ−a_{1,1}  −a_{1,2}   . . .  −a_{1,n}
                    −a_{2,1}   λ−a_{2,2}  . . .  −a_{2,n}
                    . . .
                    −a_{n,1}   −a_{n,2}   . . .  λ−a_{n,n} ).

Each row of λI_n − A depends on λ, and its derivative with respect to λ is the corresponding standard basis row vector. Using the extension of the product rule of differentiation for multilinear functions, row by row:

    φ'(A, λ) = det ( 1         0          . . .  0
                     −a_{2,1}  λ−a_{2,2}  . . .  −a_{2,n}
                     . . .
                     −a_{n,1}  −a_{n,2}   . . .  λ−a_{n,n} )

             + det ( λ−a_{1,1}  −a_{1,2}  . . .  −a_{1,n}
                     0          1         . . .  0
                     . . .
                     −a_{n,1}   −a_{n,2}  . . .  λ−a_{n,n} )

             + · · ·

             + det ( λ−a_{1,1}  −a_{1,2}   . . .  −a_{1,n}
                     −a_{2,1}   λ−a_{2,2}  . . .  −a_{2,n}
                     . . .
                     0          0          . . .  1 )

             = Σ_i φ(B_i, λ),

where the last equality follows by expanding the i-th determinant along its i-th row, which leaves exactly det(λI_{n−1} − B_i) = φ(B_i, λ).


1.2.2 Proof

We now collect the above results into the following main theorem: we have finally arrived at the Perron-Frobenius theorem.

Theorem 1.2.10 (Perron-Frobenius theorem). Let A ≥ 0 be an irreducible n × n-matrix. Then,

1. A has a positive real eigenvalue equal to its spectral radius.

2. To ρ(A) there corresponds an eigenvector x > 0.

3. ρ(A) increases when any entry of A increases.

4. ρ(A) is a simple eigenvalue of A.

5. If Ax = ρ(A)x where x > 0 and x is a normalized vector, then x is unique.

Proof. (1) and (2) follow immediately from Lemma 1.2.4 and Corollary 1.2.6.

(3) Suppose we increase some entry of the matrix A, giving us a new irreducible matrix Ã with Ã ≥ A and Ã ≠ A. Applying Lemma 1.2.5 (with the roles of A and B played by Ã and A), we conclude that ρ(Ã) > ρ(A).

(4) To see that ρ(A) is a simple eigenvalue of A, i.e., that ρ(A) is a zero of multiplicity one of the characteristic polynomial φ(λ) = det(λI_n − A), we make use of Lemma 1.2.9: φ'(λ) is the sum of the determinants of the principal (n − 1) × (n − 1) submatrices of λI − A. If A_i is any such principal submatrix of A, then from Lemma 1.2.8, det(λI − A_i) (with I the identity matrix of the same size as A_i) cannot vanish for any λ ≥ ρ(A). From this it follows that:

    det(ρ(A)I − A_i) > 0,

and thus

    φ'(ρ(A)) > 0.

Consequently, ρ(A) cannot be a zero of φ(λ) of multiplicity greater than one, and thus ρ(A) is a simple eigenvalue of A.

(5) Suppose Ax = ρ(A)x where x > 0 and ‖x‖ = 1 (‖ · ‖ denotes the standard Euclidean norm). Since ρ(A) is a simple eigenvalue by (4), every eigenvector of A for ρ(A) is a scalar multiple sx of x, so the normalized eigenvector x is uniquely determined.

With the previous proof in mind, the following definition comes as no surprise:

Definition 1.2.11. If a matrix A has an eigenvalue equal to the spectral radius ρ(A), this eigenvalue is called the Perron root; the corresponding eigenvector x such that:

    Ax = ρ(A)x and ‖x‖_1 = 1

is called the Perron vector.

Corollary 1.2.12. Being a strictly positive square matrix (A > 0) is a sufficient condition for applying the Perron-Frobenius theorem.

Proof. This follows immediately from Theorem 1.1.12: a positive square matrix is always irreducible.
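To make the theorem concrete, its conclusions can be verified numerically for a small irreducible matrix. A minimal sketch (ours, using NumPy; the example matrix is an arbitrary choice):

    import numpy as np

    # An irreducible nonnegative matrix: its directed graph 1->2, 2->1,
    # 2->3, 3->2 is strongly connected.
    A = np.array([[0.0, 2.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 3.0, 0.0]])

    eigvals, eigvecs = np.linalg.eig(A)
    i = np.argmax(np.abs(eigvals))       # eigenvalue of largest modulus
    rho = eigvals[i].real                # the Perron root, here sqrt(5)
    x = np.abs(eigvecs[:, i].real)
    x /= x.sum()                         # normalize so that ||x||_1 = 1

    print(rho > 0)                       # True: positive real eigenvalue
    print(np.all(x > 0))                 # True: strictly positive Perron vector
    print(np.allclose(A @ x, rho * x))   # True: Ax = rho(A) x

Here ρ(A) = √5 is indeed positive and simple, and the corresponding Perron vector is strictly positive.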


1.2.3 Example

To check whether a matrix with nonnegative entries is primitive, irreducible or neither, we just have to replace all nonzero entries by 1, since this does not affect the classification. The matrix

    ( 1 1
      1 1 )

is strictly positive and thus primitive. The matrices

    ( 1 0          ( 1 1
      1 1 )   and    0 1 )

both have 1 as a double eigenvalue and hence cannot be irreducible. The matrix

    ( 1 1
      1 0 )

satisfies:

    ( 1 1
      1 0 )²  =  ( 2 1
                   1 1 )

and hence is primitive. The same goes for

    ( 0 1
      1 1 ).

The matrix

    ( 0 1
      1 0 ),

on the other hand, is irreducible but not primitive: its eigenvalues are 1 and −1, and its powers alternate between itself and the identity, so no power is positive.
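These classifications are easy to check by machine. A small sketch of ours (Python/NumPy): the primitivity test relies on Wielandt's classical bound, which says that a primitive n × n-matrix A already satisfies A^k > 0 for k = (n − 1)² + 1, so inspecting that single power is a complete test.

    import numpy as np

    def is_primitive(A):
        # Definition 1.1.9 on the 0/1 pattern of A: some power A^k is positive.
        # By Wielandt's bound, k = (n - 1)**2 + 1 suffices.
        n = A.shape[0]
        M = (np.asarray(A) != 0).astype(int)
        P = np.eye(n, dtype=int)
        for _ in range((n - 1) ** 2 + 1):
            P = (P @ M).clip(0, 1)       # keep only the zero/nonzero pattern
        return bool((P > 0).all())

    print(is_primitive(np.array([[1, 1], [1, 1]])))  # True  (strictly positive)
    print(is_primitive(np.array([[1, 0], [1, 1]])))  # False (reducible)
    print(is_primitive(np.array([[1, 1], [1, 0]])))  # True
    print(is_primitive(np.array([[0, 1], [1, 0]])))  # False (irreducible, not primitive)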


1.3 Norms

If one has several vectors in R^n or several matrices in R^{n×m}, how do we measure that some of them are 'large' and some of them are 'small'? One way to answer this question is to study norms, which are basically functions that assign a positive 'size' to a vector in a vector space. The norms that we define in this master thesis are limited to R^n and R^{n×m}, and we call them respectively vector norms (norms on R^n) and matrix norms (norms on R^{n×m}). This section is mainly based on chapter 2 from [? ] and chapter 5 from [? ].

1.3.1 Vector norms

Definition 1.3.1. A vector norm on R^n is a function ‖ · ‖ : R^n → R with the following properties:

1. ‖x‖ ≥ 0, for all x ∈ R^n, with equality if and only if x = 0.

2. ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ R^n.

3. ‖αx‖ = |α|‖x‖ for all α ∈ R, x ∈ R^n.

We now give some well known vector norms:

p-norms

Definition 1.3.2. The Hölder or p-norms are defined by:

    ‖x‖_p = (|x_1|^p + · · · + |x_n|^p)^{1/p} = (Σ_{i=1}^n |x_i|^p)^{1/p},

for x ∈ R^n.

The 1-norm is also known as the Manhattan norm:

    ‖x‖_1 = |x_1| + · · · + |x_n|.

The 2-norm is also known as the standard Euclidean norm:

    ‖x‖_2 = (|x_1|² + · · · + |x_n|²)^{1/2} = (x^T x)^{1/2}.

Notice that the 2-norm is invariant under orthogonal transformations, for if Q^T Q = I with Q ∈ R^{n×n} and x ∈ R^n:

    ‖Qx‖_2² = (Qx)^T (Qx) = x^T Q^T Q x = x^T x = ‖x‖_2².

Maximum norm

Finally, when p → ∞ we get the maximum norm:

    ‖x‖_∞ = max(|x_1|, . . . , |x_n|).

We will prove this:


Theorem 1.3.3. Let x ∈ R^n. Then:

    lim_{p→∞} ‖x‖_p = ‖x‖_∞ = max(|x_1|, . . . , |x_n|).

Proof. The case x = 0 is trivial, so assume x ≠ 0 and rewrite ‖x‖_p as:

    ‖x‖_p = (Σ_{i=1}^n |x_i|^p)^{1/p} = ‖x‖_∞ (Σ_{i=1}^n (|x_i|/‖x‖_∞)^p)^{1/p}.

Note that |x_i|/‖x‖_∞ ≤ 1 for every i, with equality at least once and at most n times, so:

    ‖x‖_∞ ≤ ‖x‖_p ≤ ‖x‖_∞ · n^{1/p}.

Because n > 0, we get lim_{p→∞} n^{1/p} = 1, and therefore:

    lim_{p→∞} ‖x‖_p = ‖x‖_∞.
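A quick numerical illustration of this convergence (our sketch; the vector and the exponents are arbitrary choices):

    import numpy as np

    x = np.array([3.0, -4.0, 1.5])
    for p in (1, 2, 4, 8, 16, 64):
        norm_p = (np.abs(x) ** p).sum() ** (1.0 / p)  # Definition 1.3.2
        print(p, norm_p)
    print("max norm:", np.abs(x).max())  # the p-norms decrease towards 4.0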

Norm equivalence

One very important property of norms on R^n is that they are all equivalent, meaning that when two vectors have about the same size according to one vector norm, they will also have more or less the same size (up to a constant) according to another vector norm.

Theorem 1.3.4. All norms on R^n are equivalent, i.e., if ‖ · ‖_α and ‖ · ‖_β are norms on R^n, then there exist positive constants c_1, c_2 ∈ R^+ such that for all x ∈ R^n:

    c_1‖x‖_α ≤ ‖x‖_β ≤ c_2‖x‖_α.

Proof. We will prove that for an arbitrary norm ‖ · ‖_α there are c_1 > 0, c_2 > 0 such that for all x ≠ 0:

    c_1‖x‖_α ≤ ‖x‖_2 ≤ c_2‖x‖_α.

From this, the theorem easily follows (notice that for x = 0 this inequality holds trivially by the definition of a norm). First let

    x = x_1e_1 + x_2e_2 + · · · + x_ne_n,

where e_1, e_2, . . . , e_n is the standard basis of R^n. By the triangle inequality:

    ‖x‖_α ≤ Σ_{j=1}^n |x_j| ‖e_j‖_α.

By Cauchy's inequality, applied to the standard inner product of the vectors (|x_1|, . . . , |x_n|) and (‖e_1‖_α, . . . , ‖e_n‖_α),

    Σ_{j=1}^n |x_j| ‖e_j‖_α ≤ (Σ_{j=1}^n x_j²)^{1/2} (Σ_{j=1}^n ‖e_j‖²_α)^{1/2},


so:

    c_1‖x‖_α ≤ ‖x‖_2, where c_1 = 1/(Σ_{j=1}^n ‖e_j‖²_α)^{1/2}.

Now consider the function x ↦ ‖x‖_α on the set K = {x | ‖x‖_2 = 1}. The set K is closed and bounded, so the theorem of Heine-Borel (see Theorem 3.5.2 in [? ]) shows that it is compact. Moreover, the inequality just proved shows that the function x ↦ ‖x‖_α is continuous with respect to ‖ · ‖_2 (if ‖x_j − x‖_2 → 0 then ‖x_j‖_α → ‖x‖_α), so this function attains its minimum m on K; m is strictly greater than 0 because a norm is always greater than or equal to zero and 0 ∉ K. Now let x ≠ 0 be any vector in R^n and let u = x/‖x‖_2 ∈ R^n. Then ‖u‖_2 = 1, so ‖u‖_α ≥ m. Hence:

    ‖x‖_α ≥ m‖x‖_2, or ‖x‖_2 ≤ c_2‖x‖_α, where c_2 = 1/m.

For example, for any x ∈ R^n we have:

    ‖x‖_2 ≤ ‖x‖_1 ≤ √n ‖x‖_2,
    ‖x‖_∞ ≤ ‖x‖_2 ≤ √n ‖x‖_∞,
    ‖x‖_∞ ≤ ‖x‖_1 ≤ n ‖x‖_∞.
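These three pairs of bounds are easy to sanity-check numerically; a small sketch of ours using random vectors and NumPy's built-in norms:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10
    for _ in range(1000):
        x = rng.standard_normal(n)
        n1 = np.linalg.norm(x, 1)
        n2 = np.linalg.norm(x, 2)
        ninf = np.linalg.norm(x, np.inf)
        assert n2 <= n1 <= np.sqrt(n) * n2
        assert ninf <= n2 <= np.sqrt(n) * ninf
        assert ninf <= n1 <= n * ninf
    print("all inequalities verified")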

1.3.2 Matrix norms

The analysis of algorithms where matrices are involved requires that we are able to assess the size of matrices. We therefore introduce matrix norms. The definition is completely analogous, but with an extra condition: for square matrices, a matrix norm must be submultiplicative:

Definition 1.3.5. A matrix norm on R^{n×m} is a function ‖ · ‖ : R^{n×m} → R with the following properties:

1. ‖A‖ ≥ 0, for all A ∈ R^{n×m}, with equality if and only if A is a matrix with only zero entries.

2. ‖A + B‖ ≤ ‖A‖ + ‖B‖ for all A, B ∈ R^{n×m}.

3. ‖αA‖ = |α|‖A‖ for all α ∈ R, A ∈ R^{n×m}.

4. For all A, B ∈ R^{n×n} : ‖AB‖ ≤ ‖A‖‖B‖.

Frobenius norm

One of the most frequently used matrix norms for a matrix A ∈ R^{n×m} is the so-called Frobenius norm:

    ‖A‖_F = (Σ_{i=1}^n Σ_{j=1}^m a_ij²)^{1/2} = trace(A^T A)^{1/2}.

If A ∈ R^{1×m}, then the Frobenius norm equals the 2-norm.


p-norms

We can also define p-norms on matrices:

    ‖A‖_p = sup_{x≠0} ‖Ax‖_p / ‖x‖_p.

Maximum norm

First notice that the norm defined for A ∈ R^{n×n} as:

    ‖A‖_max = max_{1≤i,j≤n} |a_ij|

is a norm on the vector space R^{n×n}, but is not a matrix norm. Consider the matrix

    J = ( 1 1
          1 1 )

and compute J² = 2J. So ‖J‖_max = 1 and ‖J²‖_max = ‖2J‖_max = 2‖J‖_max = 2. Hence it is not the case that ‖J²‖_max ≤ ‖J‖²_max, so ‖ · ‖_max is not a submultiplicative norm. Therefore we define the maximum norm as follows:

Definition 1.3.6. Let ‖ · ‖ be a vector norm on R^n. Define the maximum norm |||·||| on R^{n×n} by:

    |||A||| = max_{‖x‖=1} ‖Ax‖.

It is easy to see these equivalences:

    |||A||| = max_{‖x‖=1} ‖Ax‖ = max_{‖x‖≤1} ‖Ax‖ = max_{x≠0} ‖Ax‖/‖x‖.

Because we will need in the next paragraph the fact that the maximum norm is indeed a matrix norm, we prove this:

Theorem 1.3.7. The function |||·|||, also called the maximum norm, is a matrix norm on R^{n×n}.

Proof. Let A be an n × n-matrix. Axiom (1) of matrix norms (Definition 1.3.5) follows from the fact that |||A||| is the maximum of a nonnegative valued function. That this maximum exists follows from the extreme value theorem of Bolzano (see Theorem 5.5.1 in [? ]), because ‖Ax‖ is a continuous function of x on the unit sphere {x : ‖x‖ = 1}, which is a compact set. That |||A||| = 0 occurs only when A is the zero matrix follows from the fact that Ax = 0_{n×1} for all x is only possible when A = 0_{n×n}.

To see axiom (2), we notice that the triangle inequality is inherited from the vector norm ‖ · ‖, since:

    |||A + B||| = max_{‖x‖=1} ‖(A + B)x‖
                = max_{‖x‖=1} ‖Ax + Bx‖
                ≤ max_{‖x‖=1} (‖Ax‖ + ‖Bx‖)
                ≤ max_{‖x‖=1} ‖Ax‖ + max_{‖x‖=1} ‖Bx‖
                = |||A||| + |||B|||.

Axiom (3) of matrix norms follows from the calculation:

    |||αA||| = max_{‖x‖=1} ‖αAx‖ = max_{‖x‖=1} |α|‖Ax‖ = |α| max_{‖x‖=1} ‖Ax‖ = |α| |||A|||.

The submultiplicative axiom (4) follows from the fact that (terms with Bx = 0 contribute zero and can be ignored in the maximum):

    |||AB||| = max_{x≠0} ‖ABx‖/‖x‖
             = max_{x≠0} (‖ABx‖/‖Bx‖) · (‖Bx‖/‖x‖)
             ≤ max_{y≠0} ‖Ay‖/‖y‖ · max_{x≠0} ‖Bx‖/‖x‖
             = |||A|||·|||B|||.

Again, we can show that all matrix norms are equivalent, with the same reasoning as in Theorem 1.3.4.
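As an illustration of Definition 1.3.6: for the vector 1-norm, the induced norm |||A|||_1 has the well-known closed form 'maximum absolute column sum' (the same expression as (1.19) in the next section). A sketch of ours comparing the closed form with a direct maximization over random unit vectors:

    import numpy as np

    A = np.array([[1.0, -2.0],
                  [3.0,  0.5]])

    # Closed form of the induced 1-norm: maximum absolute column sum.
    induced_1 = np.abs(A).sum(axis=0).max()      # here 4.0

    # Approximate the maximum of ||Ax||_1 over the unit sphere ||x||_1 = 1.
    rng = np.random.default_rng(0)
    best = 0.0
    for _ in range(20000):
        x = rng.standard_normal(2)
        x /= np.linalg.norm(x, 1)                # now ||x||_1 = 1
        best = max(best, np.linalg.norm(A @ x, 1))

    print(induced_1, best)  # best approaches (and never exceeds) induced_1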

1.3.3 Spectral radius formula

In this section, we prove the spectral radius formula, also known as Gelfand's formula. This formula expresses the spectral radius of a square matrix as a limit of matrix norms. The formula will play an important role in Chapter 2, where it will be used to obtain some more advanced consequences of the Perron-Frobenius theorem for nonnegative, symmetric matrices.

Lemma 1.3.8. Let A be a square matrix. Then

    ρ(A)^k = ρ(A^k).

Proof. This follows immediately from the fact that if A has eigenvalues λ_1, λ_2, . . . , λ_n, then λ_1^k, λ_2^k, . . . , λ_n^k are all the eigenvalues of A^k (see 1.2 in [? ]), so:

    ρ(A)^k = (max_{1≤i≤n} |λ_i|)^k = max_{1≤i≤n} |λ_i^k| = ρ(A^k).


Lemma 1.3.9. Let A be a square matrix and let ‖ · ‖ be a matrix norm. Then:

    ρ(A) ≤ ‖A‖.

Proof. Let λ be an eigenvalue of A and let x ≠ 0 be a corresponding eigenvector (x ∈ R^n). From Ax = λx, we get:

    AX = λX, where X = (x · · · x) ∈ R^{n×n} \ {0_{n×n}}

is the matrix whose n columns all equal x. It follows from properties (3) and (4) of matrix norms that:

    |λ|‖X‖ = ‖λX‖ = ‖AX‖ ≤ ‖A‖‖X‖,

and dividing by ‖X‖ (‖X‖ > 0 by matrix norm property (1)) gives:

    |λ| ≤ ‖A‖.

This holds for all eigenvalues of A, so in particular for one of maximum modulus. Hence the result follows.

Before we prove the following lemma, recall the well-known theorem about the Jordan canonical form of a square matrix A (see Theorem 2.3.1 in [? ]).

Theorem 1.3.10. For each square matrix A there exists an invertible matrix P such that

    P A P^{−1} = J'.

J' is called the Jordan normal form of A; it is the block diagonal matrix

    J' = diag(J_1, . . . , J_p),

where each block J_i is a square matrix of the form

    J_i = ( λ_i 1
                λ_i 1
                    ·  ·
                       λ_i 1
                           λ_i ),

with the λ_i eigenvalues of A (all entries are zero except the λ_i on the diagonal and the 1's on the superdiagonal).

Lemma 1.3.11. Let A be an n × n-matrix and let ε > 0. There exists a matrix norm ‖ · ‖_ρ such that:

    ‖A‖_ρ ≤ ρ(A) + ε.

Proof. The Jordan canonical form of A is:

    A = S · diag(J_{n_1}(λ_1), J_{n_2}(λ_2), . . . , J_{n_k}(λ_k)) · S^{−1},

where S is an invertible matrix, λ_1, . . . , λ_k are the eigenvalues of A and n_1 + · · · + n_k = n. Let:

    D(η) = diag(D_{n_1}(η), D_{n_2}(η), . . . , D_{n_k}(η)), with D_m(η) = diag(η, η², . . . , η^m).

Since left multiplication by D_m(1/ε) multiplies the i-th row by 1/ε^i and right multiplication by D_m(ε) multiplies the j-th column by ε^j, we calculate:

    D(1/ε) S^{−1} A S D(ε) = diag(B_{n_1}(λ_1, ε), B_{n_2}(λ_2, ε), . . . , B_{n_k}(λ_k, ε)),

with

    B_m(λ, ε) = D_m(1/ε) J_m(λ) D_m(ε),

i.e. the Jordan block J_m(λ) with the 1's on the superdiagonal replaced by ε's. We now define the matrix norm for M ∈ R^{n×n} by:

    ‖M‖_ρ = max_{‖x‖_1=1} ‖D(1/ε) S^{−1} M S D(ε) x‖_1     (1.18)
          = max_{l∈[1:n]} Σ_{k=1}^n |(D(1/ε) S^{−1} M S D(ε))_{k,l}|.     (1.19)

The conditions for being a matrix norm are met because max_{‖x‖_1=1} ‖Ax‖_1 is a matrix norm by Theorem 1.3.7, and the transformation M ↦ D(1/ε)S^{−1}MSD(ε) is a similarity. Finally, by (1.19), ‖A‖_ρ is the maximum absolute column sum of D(1/ε)S^{−1}ASD(ε); every column of this matrix contains at most one entry λ_i and at most one entry ε, so

    ‖A‖_ρ ≤ max_i |λ_i| + ε = ρ(A) + ε.

Theorem 1.3.12 (Spectral radius formula). Let A be an n × n-matrix and let ‖ · ‖ be a matrix norm. Then:

    ρ(A) = lim_{k→∞} ‖A^k‖^{1/k}.

Proof. Given k ≥ 0, we use Lemmas 1.3.8 and 1.3.9 to write:

    ρ(A)^k = ρ(A^k) ≤ ‖A^k‖,

so:

    ρ(A) ≤ ‖A^k‖^{1/k}.

Taking the limit as k → ∞ gives ρ(A) ≤ lim_{k→∞} ‖A^k‖^{1/k}. To establish the reverse inequality, we need to prove that, for any ε > 0, there exists a K ≥ 0 such that ‖A^k‖^{1/k} ≤ ρ(A) + ε for all k ≥ K. From Lemma 1.3.11, we know that there exists a matrix norm ‖ · ‖_ρ with ‖A‖_ρ ≤ ρ(A) + ε. Moreover, by the equivalence of norms on R^{n×n} (see Theorem 1.3.4), we know that there exists some constant C > 0 such that ‖M‖ ≤ C‖M‖_ρ for all M ∈ R^{n×n}. Then, for any k ≥ 0,

    ‖A^k‖ ≤ C‖A^k‖_ρ ≤ C‖A‖^k_ρ ≤ C(ρ(A) + ε)^k.

So:

    ‖A^k‖^{1/k} ≤ C^{1/k}(ρ(A) + ε),

and since C^{1/k} → 1 as k → ∞, this implies the existence of a K ≥ 0 such that ‖A^k‖^{1/k} ≤ ρ(A) + ε for all k ≥ K, as desired.
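A numerical illustration of Gelfand's formula (our sketch; the matrix is an arbitrary choice and the Frobenius norm stands in for ‖ · ‖):

    import numpy as np

    A = np.array([[0.5, 2.0],
                  [0.0, 0.3]])
    rho = max(abs(np.linalg.eigvals(A)))   # spectral radius, here 0.5

    for k in (1, 2, 4, 8, 16, 32, 64):
        Ak = np.linalg.matrix_power(A, k)
        print(k, np.linalg.norm(Ak, 'fro') ** (1 / k))  # tends to rho as k grows
    print("rho(A) =", rho)

Note that for k = 1 the estimate lies far above ρ(A), in accordance with Lemma 1.3.9 being only an upper bound.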


1.4 Numerical analysis

1.4.1 Bachmann-Landau notations

For comparing the computational cost of algorithms, it is important to know the Bachmann-Landau notations. These notations describe the limiting behavior of a function in terms of simpler functions, and they are used a lot in computer science to classify algorithms by how their number of steps depends on the input size. We are only interested in the behavior for really large input sizes, so constant factors play no role in the classification.

Big O

Definition 1.4.1 (Big O). The big O of a function g is the set of all functions f that are bounded above by g asymptotically (up to a constant factor):

    O(g(n)) = {f | ∃c, n_0 ≥ 0 : ∀n ≥ n_0 : 0 ≤ f(n) ≤ c·g(n)}.

We now prove a very simple lemma to show that constant factors indeed don't matter for the big O:

Lemma 1.4.2. ∀k > 0 : O(k·g(n)) = O(g(n)).

Proof.

    O(k·g(n)) = {f | ∃c, n_0 ≥ 0 : ∀n ≥ n_0 : 0 ≤ f(n) ≤ c·k·g(n)}
              = {f | ∃c', n_0 ≥ 0 : ∀n ≥ n_0 : 0 ≤ f(n) ≤ c'·g(n)}   (with c' = k·c)
              = O(g(n)).

Small O

Definition 1.4.3 (Small o). The small o of a function g is the set of all functions f that are dominated by g asymptotically:

    o(g(n)) = {f | ∀ε > 0 : ∃n_0 : ∀n ≥ n_0 : f(n) ≤ ε·g(n)}.

Note that the small-o notation is a much stronger statement than the corresponding big-O notation: every function that is in the small o of g is also in its big O, but the converse isn't necessarily true. Intuitively, f(x) ∈ o(g(x)) means that g(x) grows much faster than f(x).

Asymptotical Equality

Definition 1.4.4 (Asymptotically Equal). Let f and g be real functions; then f is asymptotically equal to g iff lim_{x→+∞} f(x)/g(x) = 1. Notation: f ≈ g.

Asymptotic equality can equivalently be defined as an equivalence relation: f ≈ g ⇔ (f − g) ∈ o(g). It is trivially clear that f ≈ g ⇒ f ∈ O(g).


1.4.2 The Power Method

We now introduce the classical power method, also called the Von Mises iteration ([? ]). This numerical algorithm is the core of the whole thesis: in the next chapters we will see a lot of different algorithms that are in fact just adaptations of this iterative method.

The power method is an eigenvalue algorithm that, given a diagonalizable matrix A, finds the eigenvalue λ with the greatest magnitude and a corresponding eigenvector v such that:

Av = λv.

The power method is special because it doesn't use any matrix decomposition technique for obtaining results, making it suitable for very large matrices. On the other hand, it only finds one eigenvalue with a corresponding eigenvector, and the iterative process might converge very slowly.

There are plenty of variations of the power method available that overcome these limitations, but we limit our discussion here to the very basic method, which is why we call it the classical power method.

We first introduce some needed definitions, theorems and notations.

Definition 1.4.5. Consider a real n × n-matrix A with (not necessarily different) eigenvalues λ_1, λ_2, ..., λ_n. When

|λ_1| > |λ_2| ≥ |λ_3| ≥ ··· ≥ |λ_n|,

λ_1 is called the dominant eigenvalue.

Corollary 1.4.6. If a real, n× n-matrix A has a dominant eigenvalue λ1, then λ1 is real.

Proof. Recall that the eigenvalues of a real matrix A are in general complex and occur in conjugate pairs. So if λ_1 were complex, the complex conjugate of λ_1 would also be an eigenvalue with the same modulus. Because λ_1 must be the only eigenvalue of strictly largest modulus, this is impossible.

Notation 1.4.7. Let x(i) denote vector x at iteration step i.

Algorithm 1

Let A ∈ ℝ^{n×n} be a diagonalizable matrix with dominant eigenvalue λ_1. We know that A has an eigenbasis V = {v_1, v_2, ..., v_n}. Every vector x^(0) ∈ ℝ^n can be written as a linear combination of elements in V, because V spans the space ℝ^n. So:

x^(0) = Σ_{i=1}^n ξ_i v_i.

Now construct the sequence of vectors x^(k):

x^(k) = Ax^(k−1) = A^k x^(0)


Now:

A^k x^(0) = Σ_{i=1}^n ξ_i A^k v_i = Σ_{i=1}^n ξ_i λ_i^k v_i = λ_1^k { ξ_1 v_1 + ξ_2 (λ_2/λ_1)^k v_2 + ··· + ξ_n (λ_n/λ_1)^k v_n }

Because |λ_i| < |λ_1| for i > 1, we have that (λ_i/λ_1)^k → 0 for k → ∞, so:

x^(k) = λ_1^k (ξ_1 v_1 + o(1)) for k → ∞   (1.20)

This means that for k large enough, x^(k+1) is almost equal to λ_1 times x^(k). So when the ratio between the corresponding vector entries of x^(k+1) and x^(k) becomes constant after k iteration steps, this ratio will be equal to the dominant eigenvalue λ_1. The vector x^(k) will be a corresponding eigenvector because it is proportional to v_1.

The start value x^(0) must only satisfy the condition that ξ_1 ≠ 0; in other words, x^(0) must have a non-zero component in the direction of the dominant eigenvector. Almost every randomly chosen x^(0) fulfills this requirement, and even if we are so unlucky as to pick a starting vector which doesn't, subsequent x^(k) will again fulfill the requirement, because rounding errors sustained during the iteration will have a component in this direction.

A practical problem arises when one of the components of x^(k) is equal to zero: taking the ratio between the corresponding components of x^(k+1) and x^(k) would then require a division by zero. We can solve this by axiom (3) of vector norms (see Definition 1.3.1):

‖λ_1 x^(k)‖ = |λ_1| ‖x^(k)‖.

Because x^(k) ≠ 0 we have that ‖x^(k+1)‖ ≈ ‖λ_1 x^(k)‖, so we calculate λ_1 in the power method by:

|λ_1| = lim_{k→∞} ‖x^(k+1)‖ / ‖x^(k)‖

To decide on the sign of λ_1 (λ_1 is always real, see Corollary 1.4.6), just divide two corresponding non-zero components of x^(k+1) and x^(k).

Another issue to address is that the components of x^(k) = A^k x^(0) can become too large or too small, which can cause an overflow or underflow in the real number representation of computers. To avoid this, we use normed versions of the x^(k)-vectors: we start with a vector y^(0) with ‖y^(0)‖ = 1. Subsequently, we calculate for k = 0, 1, ...:

z^(k+1) = Ay^(k),  µ_{k+1} = ‖z^(k+1)‖,  y^(k+1) = z^(k+1) / µ_{k+1}.

The vectors y^(k) all have norm 1 and the components of z^(k) are bounded because:

‖z^(k)‖ = ‖Ay^(k−1)‖ ≤ ‖A‖ ‖y^(k−1)‖ = ‖A‖


and when A is invertible we have ‖y^(k−1)‖ = 1 ≤ ‖A⁻¹‖ ‖z^(k)‖, thus ‖z^(k)‖ ≥ 1/‖A⁻¹‖. If we want to calculate the eigenvalue we can use:

A^k y^(0) = A^(k−1) A y^(0) = A^(k−1) z^(1) = A^(k−1) µ_1 y^(1)
          = A^(k−2) µ_1 A y^(1) = A^(k−2) µ_1 z^(2) = A^(k−2) µ_1 µ_2 y^(2)
          = ···
          = µ_1 µ_2 ··· µ_k y^(k)

So:

|λ_1| = lim_{k→∞} ‖A^(k+1) y^(0)‖ / ‖A^k y^(0)‖   (1.21)
      = lim_{k→∞} (µ_1 µ_2 ··· µ_{k+1} ‖y^(k+1)‖) / (µ_1 µ_2 ··· µ_k ‖y^(k)‖)
      = lim_{k→∞} µ_{k+1}

Because µ_{k+1} converges, a good choice for a stop condition for our numerical algorithm is

|µ_k − µ_{k−1}| < TOL,

which guarantees an estimation error of at most TOL (usually TOL = 10⁻⁵) for the approximation of the dominant eigenvalue. With all this information, we construct Algorithm 1. For the sake of an understandable algorithm, we use the Euclidean norm.

Data:
A: a matrix,
y^(0): a start vector with ‖y^(0)‖₂ = 1,
TOL: a tolerance for the estimation error.
Result:
y^(k): an estimation of a dominant eigenvector,
µ_k: an estimation of the dominant eigenvalue.
begin power method(A, y^(0), TOL)
    k = 1;
    repeat
        z^(k) = Ay^(k−1);
        µ_k = ‖z^(k)‖₂;
        y^(k) = z^(k)/µ_k;
        k = k + 1;
    until k > 2 and |µ_k − µ_{k−1}| < TOL;
    if the components of y^(k) and y^(k−1) have a different sign then
        µ_k = −µ_k;
    end
    return y^(k), µ_k;
end
Algorithm 1: The power method
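A minimal Python sketch of Algorithm 1 could look as follows (our own illustration, not code from the thesis; NumPy and the names power_method, tol and max_iter are our choices, and the Euclidean norm is assumed as in the algorithm):

import numpy as np

def power_method(A, y0, tol=1e-5, max_iter=10_000):
    """Sketch of Algorithm 1; assumes A is diagonalizable with a
    dominant eigenvalue (Definition 1.4.5)."""
    y = np.asarray(y0, dtype=float)
    y /= np.linalg.norm(y)                # enforce ||y^(0)||_2 = 1
    mu_prev = np.inf
    for _ in range(max_iter):
        z = A @ y                         # z^(k) = A y^(k-1)
        mu = np.linalg.norm(z)            # mu_k = ||z^(k)||_2
        y_next = z / mu                   # y^(k) = z^(k) / mu_k
        if abs(mu - mu_prev) < tol:
            if np.dot(y_next, y) < 0:     # sign flipped: lambda_1 < 0
                mu = -mu
            return y_next, mu
        y, mu_prev = y_next, mu
    raise RuntimeError("no convergence within max_iter iterations")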


Computational cost and usage

The computational cost of the algorithm is determined by the speed at which the o(1) term in (1.20) goes to zero. This is dictated by the slowest converging term (λ_2/λ_1)^k, so the algorithm converges slowly when there is an eigenvalue close in magnitude to the dominant eigenvalue. We get the following expression for the approximation µ_k of λ_1:

|µ_k − λ_1| = O(|λ_2/λ_1|^k)

In our algorithm we accepted an estimation error of 10⁻⁵, so the number of steps n can be computed from:

|λ_2/λ_1|^n ≈ TOL

For example, for TOL = 10⁻⁵ we get n = −5/log₁₀(λ_2/λ_1) = 5/(log₁₀(λ_1) − log₁₀(λ_2)), so:

power method ∈ O(1/(log(λ_1) − log(λ_2)))

Note that this O holds for any estimation error TOL of the form 10⁻ᵉ with e ∈ ℕ.

When λ_2 ≈ λ_1 we see that the power method needs (almost) infinitely many steps. Since we do not know the eigenvalues of A, this means we cannot know in advance whether the power method will converge or not. Recall that the eigenvalues of a real matrix A are in general complex and occur in conjugate pairs. This means that when λ_1 is not real, the power method will certainly fail. Therefore, it is a good idea to apply the power method only to matrices whose eigenvalues are known to be real. The only thing that can go wrong with those matrices is that the dominant eigenvalue has an algebraic multiplicity larger than 1.

From the Perron-Frobenius theorem in 1.2.10, we also get another good choice: the irreducible matrices, or any matrix with strictly positive entries. Indeed, the Perron-Frobenius theorem tells us that these have a unique dominant eigenvalue.

Example

Example 1.4.8. Consider the matrix:

A =
( 1 −3  5)
(−1 −7 11)
(−1 −9 13)

A has a dominant eigenvalue 3 and a double eigenvalue 2. A corresponding dominant eigenvector is (1, 1, 1). We now use the classical power method with start vector y^(0) = (1, 0, 0) and TOL = 10⁻⁵, and we obtain the values in Table 1.1. Because we know the eigenvalues of A, we can predict the number of steps: n = −5/log₁₀(λ_2/λ_1) = −5/log₁₀(2/3) ≈ 28. Because λ_2/λ_1 = 2/3 is not that small, the convergence here is also not that fast. For the approximation of a corresponding eigenvector we get

y^(29) = (−0.577348, −0.577352, −0.577351),

which is (more or less) proportional to (1, 1, 1).


k    µ_k       k    µ_k
0    1.00000   15   3.00459
1    1.73205   16   3.00305
2    4.12311   17   3.00204
3    4.06564   18   3.00136
4    3.58774   19   3.00090
5    3.34047   20   3.00060
6    3.20743   21   3.00040
7    3.13055   22   3.00026
8    3.08385   23   3.00017
9    3.05457   24   3.00011
10   3.03582   25   3.00008
11   3.02363   26   3.00005
12   3.01564   27   3.00003
13   3.01037   28   3.00002
14   3.00689   29   3.00001

Table 1.1: The iteration values µ_k of Example 1.4.8.
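Example 1.4.8 can be replayed with the power_method sketch given after Algorithm 1 (again an illustration; the exact iteration count may differ slightly from Table 1.1 depending on how the stop condition is implemented):

import numpy as np

# Matrix and start vector of Example 1.4.8.
A = np.array([[ 1, -3,  5],
              [-1, -7, 11],
              [-1, -9, 13]], dtype=float)
v, mu = power_method(A, np.array([1.0, 0.0, 0.0]), tol=1e-5)
print(mu)  # ~3.00001, in line with the last rows of Table 1.1
print(v)   # ~ +/-(0.57735, 0.57735, 0.57735), proportional to (1, 1, 1)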

1.5 Graphs

After introducing different kinds of matrices and proving the Perron-Frobenius theorem, we now take a closer look at graphs. Here too, we'll look at different families of graphs and prove some relevant properties about them. We also link the concept of graphs with different kinds of matrices, deepening our insight into some theorems of the previous section.

The definitions and results in this section are mainly based on the course ‘Discrete Math-ematics’ by P. Cara [? ].

1.5.1 General definitions

Definition 1.5.1. A graph is an ordered pair (V, →) where V is a set and → is a relation. The elements of V are called vertices or nodes and → is called the adjacency relation. Let u, v ∈ V; then an ordered pair (u, v) belonging to → is called an arc or edge and we write u → v. We also say that u is adjacent to v. When v → v (with v ∈ V) we say that the graph has a loop at v. A graph (V, →) is most of the time denoted by calligraphic letters G, H, ...

With this definition, we defined directed graphs. When the relation → is symmetric, we call the graph undirected; in this case we often write ↔ instead of →.

Example 1.5.2. The graph

[Figure: an undirected triangle on the vertices v1, v2, v3]


is an undirected graph with vertices v1, v2, v3. The adjacency relation ↔ equals {(v1, v2), (v2, v1), (v2, v3), (v3, v2), (v3, v1), (v1, v3)}.

There is a small problem with our definition, because not all graphs are taken into account. For example, the graph below is not a graph following our definition, because you cannot define multiple edges between vertices in a relation.

[Figure: a multigraph on the vertices v1, v2, v3 with repeated edges between some vertices]

Therefore, we define multisets and make a remark that introduces the concept of multiplicity of an edge using these multisets. Multisets are a generalization of the notion of a set in which elements are allowed to appear more than once.

Definition 1.5.3. A multiset A = (S, µ) is an ordered pair with S a set and µ : S → ℕ₀ a function that gives the multiplicity of an element of S. A multiset can be written as a set in the following way:

A = (S, µ) = {(s, µ(s)) : s ∈ S}

The cardinality of a multiset is defined as

|A| = Σ_{s∈S} µ(s).

Consider as an example the multiset {a, a, b, b, b, c} (with a, b, c different elements), which can be denoted as a set as {(a, 2), (b, 3), (c, 1)}.

We now introduce the following important remark using multisets:

Remark 1.5.4. Although we will keep writing a graph as G = (V, →), this definition doesn't allow repeated edges. A graph that can have multiple edges between two vertices is often called a multigraph, but in this master thesis we call a multigraph just a graph. To define this in a mathematically correct way, define G as an ordered pair (V, →) where V is a finite set and → is a multiset consisting of elements of the cartesian product V × V.

Definition 1.5.5. The neighbourhood of a vertex v of a graph G = (V, →) is the induced subgraph G_v with vertex set V′ consisting of all vertices adjacent to v, without v itself, and with the multiplicity function µ′, which is the restriction of µ to the vertices in V′. A vertex with a neighbourhood equal to the empty graph (a graph with an empty set of vertices) is called isolated.

Definition 1.5.6. Let G = (V, →) be a graph with vertices v1, v2, ..., vn ∈ V and edges (defined as ordered pairs) e1, ..., em ∈ →. Let u → v be an edge between u, v ∈ V. We call u the source node and v the terminal node of the edge. s_G(i) denotes the source node of edge i, and t_G(i) denotes the terminal node of edge i.

Definition 1.5.7. The order of a finite graph G is the number of vertices of G and is denoted by |G|.


Definition 1.5.8. The indegree of a vertex v in a graph G is the number of times v is a terminal node of an edge.

Definition 1.5.9. The outdegree of a vertex v in a graph G is the number of times v is a source node of an edge.

Definition 1.5.10. The degree of a vertex v in a graph G is the sum of the indegree and outdegree of v.

Definition 1.5.11. A walk in a graph G is a sequence of vertices

a0, a1, ..., ak

such that a_{i−1} → a_i for each i ∈ {1, ..., k}. The length of the walk is k, one less than the number of vertices.

Definition 1.5.12. If all edges are distinct in a walk in a graph G , we call the walk a path.

Definition 1.5.13. A cycle is a walk from v0 to v0 in which all vertices except v0 are distinct.

Definition 1.5.14. A simple graph is an undirected graph G = (V, →) containing no loops and with at most one edge between any two vertices vi, vj ∈ V.

Definition 1.5.15. A clique in a graph G = (V, →) is a subset C of V such that every two distinct vertices in C are adjacent.

Definition 1.5.16. A bipartite graph is a graph G = (V, →) whose vertices can be divided into two disjoint sets U and T such that every edge connects a vertex in U and a vertex in T. There are no edges between vertices in U or between vertices in T.

Product graphs

Definition 1.5.17. Take two graphs G = (U, →), H = (V, →′). The product graph G × H is the graph with |G|·|H| vertices that has an edge between vertices (ui, vj) and (uk, vl) if there is an edge between ui and uk in G and there is an edge between vj and vl in H.
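In terms of adjacency matrices this product has a convenient description which is not stated in the thesis itself but is standard: with the vertex ordering (u_i, v_j) ↦ i·|H| + j, the adjacency matrix of G × H is the Kronecker product of the adjacency matrices of G and H. A small sketch (the example matrices are our own):

import numpy as np

# Product graph via the Kronecker product: (A_G ⊗ A_H) at position
# ((i,j),(k,l)) equals A_G[i,k] * A_H[j,l], matching Definition 1.5.17.
A_G = np.array([[0, 1],
                [1, 0]])     # one undirected edge u1 -- u2
A_H = np.array([[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]])  # the triangle of Example 1.5.2
A_prod = np.kron(A_G, A_H)   # 6 x 6 adjacency matrix of G x H
print(A_prod)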

Colored graphs

Definition 1.5.18. A node colored graph G is a quadruple (V, →, C, a) with V a set of vertices, → an adjacency relation, C a set of colors and a a surjective function a : V → C that assigns to each vertex one color.

Definition 1.5.19. In a node colored graph G = (V, →, C, a), c_G(V, i) denotes the number of vertices of color i. So:

c_G(V, i) = |{(i, vj) ∈ C × V | a(vj) = i}|

Definition 1.5.20. An edge colored graph G is a quadruple (V, →, C, b) with V a set of vertices, → an adjacency relation, C a set of colors and b a surjective function b : (→) → C that assigns to each edge one color.

Definition 1.5.21. In an edge colored graph G = (V, →, C, b), c_G(→, i) denotes the number of edges of color i. So:

c_G(→, i) = |{(i, ej) ∈ C × (→) | b(ej) = i}|


Definition 1.5.22. A node-edge colored graph or fully colored graph G is a 5-tuple (V, →, C, a, b) with V a set of vertices, → an adjacency relation, C a set of colors, a a function a : V → C that assigns to each vertex one color and b a function b : (→) → C that assigns to each edge one color, with as condition that a(V) ∪ b(→) = C.

Adjacency matrix

We now represent a finite graph in the form of an adjacency matrix. This matrix gives a lot of useful information about the graph, and vice versa.

Definition 1.5.23. Let G = (V, →) be a graph of order n and define a numbering on the vertices v1, ..., vn. Then the adjacency matrix A_G of G is the real n × n-matrix with a_ij equal to µ(vi, vj).

Corollary 1.5.24. The adjacency matrix of an undirected graph G = (V, ↔) is a symmetric matrix.

Proof. This is trivial by the definition of an undirected graph.

Theorem 1.5.25. Let k > 0. The element in place (i, j) of A^k_G is the number of walks of length k from vi to vj in the graph G = (V, →).

Proof. By induction on k. For k = 1 we count the walks of length 1. These are edges, and the result follows immediately from the definition of A_G.

Let vl be a vertex of G. If there are b_il walks of length k from vi to vl and a_lj walks of length 1 (edges) from vl to vj, then there are b_il·a_lj walks of length k + 1 from vi to vj passing through vertex vl. Therefore, the number of walks of length k + 1 between vi and vj is equal to:

Σ_{l∈V} b_il a_lj =: c_ij.

By the induction hypothesis we know that b_il equals the element in place (i, l) of A^k_G, so c_ij is exactly the element in place (i, j) of the matrix product

A^k_G · A_G = A^{k+1}_G.

Example 1.5.26. The adjacency matrix of the graph in Example 1.5.2 is:

A =
(0 1 1)
(1 0 1)
(1 1 0)
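Theorem 1.5.25 is easy to check numerically on this small example (a sketch of our own; NumPy is our choice of tool):

import numpy as np

# Entry (i, j) of A^k counts walks of length k in the triangle graph
# of Examples 1.5.2 and 1.5.26.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])
print(np.linalg.matrix_power(A, 2))        # diagonal entries are 2:
                                           # e.g. v1->v2->v1 and v1->v3->v1
print(np.linalg.matrix_power(A, 3)[0, 0])  # 2 closed walks of length 3
                                           # at v1: both ways around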

Incidence matrix

A graph can also be represented as an incidence matrix by numbering the vertices and the edges. The resulting incidence matrix will have −1, 1 or 0 as entries.


Definition 1.5.27. The incidence matrix of a directed graph G = (V, →) with vertices v1, ..., vn ∈ V and edges e1, ..., em is an n × m-matrix A where the rows represent the vertices and the columns represent the edges, such that:

(A)ij = 1 if vi is the source node of ej,
       −1 if vi is the terminal node of ej,
        0 otherwise.

In the case of an undirected graph G = (V, ↔) with vertices v1, ..., vn ∈ V and edges e1, ..., em we define:

(A)ij = 1 if vi is a node of ej, and 0 otherwise.

1.5.2 Strong connectivity

In this section, we take a closer look at directed graphs and introduce the concept of connectivity.

Definition 1.5.28. An undirected graph G = (V, ↔) is connected if it is possible to establish a path from any vertex to any other vertex.

Definition 1.5.29. A directed graph G = (V, →) is connected if the underlying undirected graph (remove all arrows on the edges) is connected. The directed graph G is strongly connected if there is a path in each direction between each pair of vertices of the graph.

In the next theorem, we prove the equivalence of the matrix property of irreducibility (Definition 1.1.10) with strong connectivity of the directed graph associated with a matrix:

Theorem 1.5.30. Let G be a (directed) graph with adjacency matrix A. Then G is strongly connected if and only if A is irreducible.

Proof. From Theorem 1.5.25 we know that a graph is strongly connected if and only if for every pair of indices i and j there is an integer k such that (A^k)ij > 0; from Theorem 1.1.12 we know this means that A is irreducible, and vice versa.
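This equivalence also gives a practical test. The sketch below is our own illustration and relies on the well-known fact (not stated in the thesis) that a nonnegative n × n matrix A is irreducible if and only if every entry of (I + A)^(n−1) is positive:

import numpy as np

def is_strongly_connected(A):
    """Test strong connectivity of the graph with adjacency matrix A
    via irreducibility: A is irreducible iff (I + A)^(n-1) > 0."""
    n = A.shape[0]
    M = np.linalg.matrix_power(np.eye(n) + A, n - 1)
    return bool(np.all(M > 0))

# A directed 3-cycle is strongly connected; dropping one arc breaks it.
C = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
print(is_strongly_connected(C))           # True
print(is_strongly_connected(np.triu(C)))  # False: v1 -> v2 -> v3 only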


1.6 Hypergraphs

After introducing graphs, we now look at hypergraphs. Intuitively, a hypergraph is a generalization of a graph in which an edge can connect any number of vertices. The definitions presented are mainly from [? ] and [? ].

1.6.1 General definitions

Definition 1.6.1. A hypergraph is an ordered pair (V, E) with V a finite set and E ⊆ 2^V \ {∅} with E ≠ ∅, where 2^V is the power set of V. The elements of V are called the vertices and the elements of E are called the edges. A hypergraph (V, E) will be denoted by the calligraphic letters G, H, ...

Remark 1.6.2. As with graphs, the classic definition of a hypergraph doesn't allow multiple edges that cover the same vertices. Repeated vertices within an edge, often called hyperloops, are also not allowed by the above definition. A hypergraph that can have multiple edges connecting the same vertices, and hyperloops, is called a multi-hypergraph, but in this master thesis we call a multi-hypergraph just a hypergraph. So a hypergraph G is an ordered pair (V, E) where V is a finite set and E is a multiset consisting of multi-subsets of V (a multi-subset is just a subset of a set where the elements are allowed to appear more than once).

Example 1.6.3. The hypergraph G = (V, E) below consists of 7 vertices and 5 edges. The edges are equal to the multiset {{v1, v2, v3}, {v2, v3}, {v3, v5, v6}, {v3, v5, v6}, {v4}}. Note that the colors don't have any meaning to the hypergraph (the hypergraph is not an edge colored hypergraph), but only serve to clarify the drawing.

[Figure: the hypergraph of Example 1.6.3, with vertices v1, ..., v7 and edges E1, ..., E5]

Definition 1.6.4. In a hypergraph G = (V, E), two vertices vi, vj ∈ V are called adjacent if there is an edge Ei ∈ E that contains both vertices. Two edges Ek, El are called adjacent if their intersection is not empty.

Definition 1.6.5. The order of a finite hypergraph G is the number of vertices of G and is denoted by |G|.


We now define the degree of a vertex in a hypergraph. Note that we don't define the indegree and outdegree of a vertex, as the hypergraphs we consider are undirected.

Definition 1.6.6. The degree of a vertex v in a hypergraph G is the number of times v is contained in an edge.

Definition 1.6.7. A path in a hypergraph G = (V, E) is a sequence p = (a0, A1, a1, ..., Ak, ak), k ≥ 1, where the ai are pairwise distinct vertices, the Ai are pairwise distinct edges and a_{i−1}, a_i ∈ A_i for 1 ≤ i ≤ k. The path p is said to join a0 and ak. The length of the path is k.

Definition 1.6.8. A hypergraph is connected if for each vertex there is a path to any other vertex.

k-hypergraphs

In most applications, the edges of a hypergraph connect a fixed number of vertices.

Definition 1.6.9. A hypergraph is called a k-uniform hypergraph, for a natural number k ≥ 2, if for all Ei ∈ E the cardinality |Ei| is equal to k. (The cardinality of multisets is defined in Definition 1.5.3.) The term k-graph is often used instead of k-uniform hypergraph. The edges in a k-graph are sometimes called k-edges.

Notice that the 2-uniform hypergraphs are just the undirected graphs we defined in the previous section.

Directed hypergraphs

An important difference with the previous section is that all hypergraphs we have defined are undirected: there is no specific order in which an edge connects different vertices. In a graph, directed edges arise naturally: some vertex is the source node, some vertex is the terminal node, and no other nodes are connected by an edge of a graph. For edges of hypergraphs this concept is not straightforward to generalize: one option is to see edges as paths connecting vertices in a specific order; another option is to partition the vertices connected by an edge into a set of source nodes and a set of terminal nodes. This last notion is studied in [? ]. We will not discuss this topic in detail: all the hypergraphs in this master thesis are undirected.

1.6.2 Incidence matrix

A hypergraph can be represented as an incidence matrix by numbering the vertices and the edges. The resulting incidence matrix will be a boolean matrix with only 1 and 0 as entries. An alternative way to represent a hypergraph is by using adjacency tensors.

Definition 1.6.10. The incidence matrix of a hypergraph G = (V, E) with vertices v1, ..., vn ∈ V and edges e1, ..., em ∈ E is an n × m-matrix A where the rows represent the vertices and the columns represent the edges, such that:

(A)ij = 1 if vi ∈ ej, and 0 if vi ∉ ej.


Example 1.6.11. The corresponding incidence matrix of the hypergraph of Example 1.6.3 is:

(1 0 0 0 0)
(1 1 0 0 0)
(1 1 1 1 0)
(0 0 0 0 1)
(0 0 1 1 0)
(0 0 1 1 0)
(0 0 0 0 0)
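This matrix is easy to build programmatically from the edge multiset; the following sketch (our own illustration) reproduces Example 1.6.11:

import numpy as np

# Incidence matrix of the hypergraph of Example 1.6.3: vertex v_i is
# row i-1, edge E_j is column j-1; entry 1 iff the vertex lies in the edge.
edges = [{1, 2, 3}, {2, 3}, {3, 5, 6}, {3, 5, 6}, {4}]
n_vertices = 7
A = np.zeros((n_vertices, len(edges)), dtype=int)
for j, edge in enumerate(edges):
    for v in edge:
        A[v - 1, j] = 1
print(A)  # matches the matrix of Example 1.6.11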


Chapter 2

Similarity on graphs

In the previous chapter all the basic terminology and results were introduced. Now we take an extensive look at the concept of similarity on graphs. Similarity on graphs is a fairly new concept to compare the nodes of two graphs. The concept arose from the research on algorithms for web search engines (like Google, Yahoo, ...) in the late nineties. More specifically, Jon M. Kleinberg introduced in his paper 'Authoritative Sources in a Hyperlinked Environment' [? ] the famous 'HITS algorithm' for extracting information from the link structure of websites. The method leads to an iterative algorithm where graphs represent the link structure of a collection of websites on a specific topic. Because this paper formed the basis of later research on similarity on graphs, the whole idea and algorithm of Kleinberg is introduced in the first section of this chapter. In 2004, V.D. Blondel et al. [? ] generalized the algorithm of Kleinberg, introducing the notion of similarity on directed graphs. This similarity is covered in the second section. With this similarity on directed graphs, there is a much wider scope of applications than just search algorithms. The method of Blondel only returns node similarity scores, which are in fact a measurement of how similar two nodes of two graphs are to each other; in the next section, we extend the method of Blondel to both node and edge similarity scores. We conclude this chapter with a notion of similarity on colored graphs.

2.1 The HITS algorithm of Kleinberg

2.1.1 History

Back in the nineties, the internet became more and more popular. The popular search engines back then were Altavista and Yahoo, but they weren't as advanced as search engines today. The main pitfall of the first search engines was that the search results were purely based on the number of occurrences of a word in a webpage. This was a pitfall for many reasons. The first reason was the growing popularity of the internet: as more and more webpages were put online, simply getting the relevant pages for a search query in this text-based manner was a process that could possibly return millions of relevant pages. Content similarity was also an issue: a website owner can easily cheat in a text-based search system by just adding and repeating some very popular search words, making his website appear in the results of a large number of search queries. Two possible solutions were simultaneously invented in 1997 and 1998. The first one was the PageRank system developed by Larry Page and Sergey Brin ([? ]). The PageRank system led to the foundation of the immensely popular Google search engine. Meanwhile, Jon Kleinberg came up with his own solution: the HITS algorithm (hyperlink-induced topic search). At that time, he was both a professor in the Computer Science Department at Cornell University and a researcher for IBM. The algorithm is used today, inter alia, by the Ask search engine (www.ask.com). Both these algorithms use the hyperlinks between webpages to rank search results. Because this master thesis is about similarity, and this concept is introduced on graphs as a generalization of the HITS algorithm, we don't go into further detail about the PageRank algorithm. In the following paragraphs, the HITS algorithm is extensively explained.

2.1.2 Motivation

Kleinberg’s work originates in the problems that arise with text-based searching the WWW.Text-based searching just counts all the occurrences of a given search query on webpages andreturns a set of webpages ordered by decreasing occurency. When a user supplies a searchquery, we probably face an abundance problem with this method: the number of pages thatcould reasonably be returned as relevant is far too large for a human user to digest. To provideeffective search results under these conditions, we need to filter the ‘authoritative’ ones. Weface some complications when we want to filter the ‘authoritative’ webpages in a text-basedsystem. For example, if we search for ‘job offers in Flanders ’ the most authoritativepage and expected first result in a search engine would be www.vdab.be. Unfortunately, thequery ‘job offers’ is used in over a million pages on the internet and www.vdab.be is notthe one using the term most often. Therefore, there is no way to favor www.vdab.be in atext-based ranking function. This a recurring phenomenon, e.g., if you search for the query‘computer brands’, there is no reason at all to be sure that the website of Apple or Toshibaeven contains this search term.

The HITS algorithm solves these difficulties by analyzing the hyperlink structure among webpages. The idea is that hyperlinks encode a sort of human judgment and that this judgment is crucial to formulate a notion of authority. Specifically, when a page p includes a link to page q, p confers authority on q. Again we face difficulties, because this conferred authority doesn't hold for every link. Links are created for a wide variety of reasons; for example, a large number of links are created for navigation within a website (e.g. "Return to homepage") and these have of course nothing to do with a notion of authority.

The HITS method is based on the relationship between the authorities for a topic and the pages that link to many related authorities, called hubs. Page p is called an authority for the query 'smartphone brand' if it contains valuable information on the subject. In our example, websites of smartphone manufacturers such as www.apple.com, www.samsung.com, ... would be good authorities for this search query. These are also the results a user would expect from a search engine.

A hub is a second category of pages needed to find good authorities. Their role is to advertise authoritative pages: hubs contain useful links toward these authorities. In our example, consumer websites with reviews on smartphones, websites of smartphone shops, ... would be good hubs. In fact, hubs point the search process in the 'right direction'.

To really grasp the idea, we make an analogy with everyday life. If you tell a friend that you are thinking of buying a new smartphone, he might tell you his experiences with smartphones and he will probably share some opinions he got from other friends. He might suggest some good models and good brands. Now, you are more inclined to buy a smartphone that


Data:
σ: a query string,
E: a text-based search engine,
t: a natural number (usually initialized to 200),
d: a natural number (usually initialized to 50).
Result: A page set Sσ satisfying all the properties of our wish list.
begin create graph(σ, E, t, d)
    Let Rσ denote the top t results of E on σ;
    Set Sσ := Rσ;
    for each page p ∈ Rσ do
        Let Γ+(p) denote the set of all pages p points to;
        Let Γ−(p) denote the set of all pages pointing to p;
        Add all pages in Γ+(p) to Sσ;
        if |Γ−(p)| ≤ d then
            Add all pages in Γ−(p) to Sσ;
        else
            Add an arbitrary set of d pages from Γ−(p) to Sσ;
        end
    end
    return Sσ;
end
Algorithm 2: Algorithm to construct Sσ.

your friend suggested. Well, this idea is used in the HITS method: your friend served as ahub, the brands and models he suggested are good authorities.

2.1.3 Constructing relevant graphs of webpages

Any collection of hyperlinked pages can be transformed into a directed graph G = (V, →): the nodes correspond to the pages, and if there is a link from page p to page q, there is an arc p → q. Suppose a search query is performed, specified by a query string σ. We wish to determine the authoritative pages by an analysis of the link structure. But first we have to construct a subgraph of the internet on which our algorithm will operate. We want to make the computational effort as efficient as possible, so we restrict the subgraph to the set Qσ of all pages where the query σ occurs. For this, we could use any already existing text-based search engine. But for our algorithm, Qσ is possibly much too big: it may contain millions of pages, making it impossible for any computer to perform the algorithm. Moreover it is, as explained in the motivation in 2.1.2, possible that Qσ does not contain some of the most important authorities, because they never use the query string σ on their website.

Therefore, we wish to transform the set Qσ to a set Sσ of pages following this 'wish list' of properties:

1. Sσ is relatively small,

2. Sσ is rich in relevant pages,

3. Sσ contains most of the strongest authorities.


By keeping Sσ small, the computational cost of performing non-trivial algorithms can be kept under control. By the property of being rich in relevant pages, it will be easier to find good authorities.

To construct Sσ, we first construct a root set Rσ with the t highest-ranked pages for σ using a text-based search engine (such engines sort results based on the occurrence of σ). Typically, t is set to about 200. Rσ complies with properties 1 and 2 of our wish list, but because Rσ ⊂ Qσ, it may fail to satisfy property 3. Now we use the root set Rσ to create the set Sσ satisfying our complete wish list. When a strong authority is not in Rσ, it is very likely that at least one of the pages in Rσ points to this authority. Hence, starting from the pages in Rσ, we can expand it to Sσ by looking at the links that enter and leave Rσ. This gives Algorithm 2.

Thus, we obtain Sσ by expanding Rσ to include any page pointed to by a page in Rσ. We also add up to d pages that point to a page in Rσ; d is usually initialized to 50. The parameter d is crucial to stay in accordance with property 1 of our wish list. Indeed, a webpage can be pointed to by many thousands of other webpages, and we don't want to include them all if we want to keep Sσ relatively small. Some experiments in [? ] showed that this algorithm resulted in an Sσ with a size in the range of 1000 to 5000 web pages. Property 3 of our wish list is usually met, because a strong authority need only be referenced once in the t pages of the root set Rσ to be added to Sσ.

Denote the resulting graph of the page set Sσ by G[Sσ]. Note that G[Sσ] will contain a lot of links serving only navigational purposes within a website. As mentioned before, these links have nothing to do with the notion of authority, and they must be removed from our final graph if we want a good determination of the authoritative pages by an analysis of the link structure. A very simple heuristic can be used to derive a subgraph of G[Sσ] leaving out all the navigational links: we make a distinction between transverse links and intrinsic links. Transverse links are links between different domain names (e.g. a link between www.vub.ac.be and www.ua.ac.be) and intrinsic links are links within the same domain name (e.g. a link between www.vub.ac.be and dwis.vub.ac.be). Intrinsic links exist to allow navigation within a website and they tell us very little about the authority of the pages they point to. Therefore, we delete all intrinsic links from G[Sσ], keeping only the arcs corresponding to transverse links.

Our graph still contains some meaningless links in the context of page authority. Suppose a large number of pages from the same domain name have a transverse link to the same page p. Most of the time, this indicates a form of advertisement (for example 'Website created by ...' at the bottom of each page). It is useful to only allow m pages (m is usually initialized to 6) from the same domain name to have a transverse link to the same page. If m is exceeded, all these transverse links are deleted from the graph. Note, however, that not all links to advertisements will be erased, because on most web pages advertisements change on every page, which avoids exceeding m.

Applying the two heuristics described above to G[Sσ], we get a new graph G′σ, which is exactly what we need to perform our link analysis.

2.1.4 Hubs and Authorities

A very simple approach would now be to order the pages in G′σ by their indegree. Although this approach can sometimes return good search results, this heuristic is often too simple, because Sσ will probably contain some web pages with a lot of incoming links without being very relevant to the search query σ (e.g. advertisements). With these incoming links, those web pages are ranked high in the final search result, which we want to avoid.

Do we have to return to a text-based approach to avoid irrelevant web pages being on top of the search results? No: the link structure of G′σ can tell us a lot more than it may seem at first glance. Authoritative pages relevant to query σ should indeed have a large indegree, but there should also be a considerable overlap in the sets of pages that point to authoritative pages. The pages that point to many authoritative pages are called hubs. Hubs have links to several authoritative pages and they sort of "concentrate" all the authorities on query σ. Figure 2.1.1 shows what this means conceptually.

[Figure 2.1.1: The concept of hubs and authorities — hubs on the left point to authorities on the right; an irrelevant page of large indegree stands apart]

So, for each page j we assign two scores: an authority score, which estimates the value of the content of the page, and a hub score, which estimates the value of its outgoing links to other pages. We now get a dichotomy: a good hub is a page pointing to many good authorities; a good authority is a page that is pointed to by many good hubs. This leads us to a mutually reinforcing relation, resulting in an iterative method to break this circularity.

So let G′σ = (V, →) and let h_j and a_j be the hub and authority scores of vertex v_j (corresponding with page j). These scores must be initialized with some positive start values and then updated simultaneously for all vertices. This leads to a mutually reinforcing relation in which the hub score of v_j is set equal to the sum of the authority scores of all vertices pointed to by v_j, and in the same manner the authority score of v_j is set equal to the sum of the hub scores of all vertices pointing to v_j:

h_j := Σ_{i:(v_j,v_i)∈→} a_i,
a_j := Σ_{i:(v_i,v_j)∈→} h_i.

The basic operations in which hubs and authorities reinforce one another are depicted inFigure 2.1.2.


[Figure 2.1.2: The basic operations in the reinforcing relation between hubs and authorities — a_j is the sum of h_l over all pages l pointing to j; h_j is the sum of a_k over all pages k pointed to by j]

Let B be the adjacency matrix of G′σ, let a denote the authority vector with coordinates (a_1, a_2, ..., a_n) (with n = |G′σ|, the number of pages) and let h denote the hub vector. The mutually reinforcing relation can now be rewritten as:

( h )^(k+1)   ( 0   B ) ( h )^(k)
( a )       = ( Bᵀ  0 ) ( a )      ,  k = 0, 1, ...

In compact form, we denote

x^(k+1) = M x^(k),  k = 0, 1, ...,   (2.1)

where

x^(k) = ( h )^(k)        ( 0   B )
        ( a )      , M = ( Bᵀ  0 ).

After each iteration, we have to normalize h_j and a_j. Indeed, we want to obtain the authority and hub weights for each page, and in order to compare these after each iteration step they must be normalized: only the relative differences matter, otherwise the whole procedure would be meaningless. Pages with larger a_j-scores are viewed as better authorities; pages with larger h_j-scores are better hubs.

We get the following sequence (with x^(0) some positive start value) of normalized vectors:

z^(0) = x^(0) > 0,  z^(k+1) = M z^(k) / ‖M z^(k)‖₂,  k = 0, 1, ...,   (2.2)


Data:
G: a graph of n linked pages,
k: a natural number.
Result: A vector (h, a) containing the hub and authority scores after k steps.
begin hits(G, k)
    Set a^(0) = (1, 1, ..., 1) ∈ ℝ^n;
    Set h^(0) = (1, 1, ..., 1) ∈ ℝ^n;
    for i = 1, 2, ..., k do
        Calculate h′^(i) = ( Σ_{m:(v_1,v_m)∈→} a^(i−1)_m, Σ_{m:(v_2,v_m)∈→} a^(i−1)_m, ..., Σ_{m:(v_n,v_m)∈→} a^(i−1)_m );
        Normalize h′^(i), obtaining h^(i);
        Calculate a′^(i) = ( Σ_{m:(v_m,v_1)∈→} h^(i)_m, Σ_{m:(v_m,v_2)∈→} h^(i)_m, ..., Σ_{m:(v_m,v_n)∈→} h^(i)_m );
        Normalize a′^(i), obtaining a^(i);
    end
    return (h^(k), a^(k));
end
Algorithm 3: The iterative HITS algorithm.
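When the graph is given by its adjacency matrix B, Algorithm 3 can be expressed very concisely. The sketch below is our own illustration (not code from the thesis) and normalizes with the Euclidean norm, as in the convergence analysis that follows:

import numpy as np

def hits(B, k):
    """Sketch of Algorithm 3 for a graph with adjacency matrix B:
    k alternating, normalized hub/authority updates."""
    n = B.shape[0]
    h = np.ones(n)                  # h^(0) = (1, ..., 1)
    a = np.ones(n)                  # a^(0) = (1, ..., 1)
    for _ in range(k):
        h = B @ a                   # h_j = sum of a_m over v_j -> v_m
        h /= np.linalg.norm(h)
        a = B.T @ h                 # a_j = sum of h_m over v_m -> v_j
        a /= np.linalg.norm(a)
    return h, a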

How do we decide on x^(0)? We will see that any positive vector in ℝ^{2n} is a good choice, but for the sake of simplicity we make the natural choice¹ 1 ∈ ℝ^{2n}. The limit to which the sequence converges results in 'definitive' hub and authority scores for each page in the graph G′σ.

To compute the iterative algorithm, we update the hub and authority scores in an alternating fashion (at each step we have to normalize the scores). Because we will prove that the sequence converges, theoretically we can keep on iterating until a fixed point is approximated. But in most practical settings, we choose a fixed number of steps k to reduce the computational cost, because we cannot know beforehand how large k has to be to reach the limit. Of course, it is still extremely important to know that the method converges anyway. Let x^(i) denote vector x at iteration step i as in Notation 1.4.7; we get Algorithm 3.

To filter the top c hubs and the top c authorities, one can use the trivial Algorithm 4.

How do we decide on the values of k and c? It is immediately clear that c and k must be proportional: for low c values, a lower value for the number of iteration steps k is appropriate, and vice versa. Experiments in [? ] showed that k set to 20 is sufficient for stable results when finding the 5 best hubs and authorities, thus for c = 5.

2.1.5 Convergence of the algorithm

We now want to prove that for arbitrarily large values of k, the sequence z^(k) converges to a limit (h′, a′). Before proving the convergence, note that adjacency matrices are nonnegative by definition, and thus the matrix M is nonnegative too. M is also clearly a symmetric n′ × n′-matrix (with n′ = 2n) with nonnegative, real entries. We prove that such matrices have n′ (not necessarily different) real eigenvalues and that we can diagonalize M. This is the first condition of the power method we introduced in Section 1.4.2. If we can also prove the second condition (having a unique dominant eigenvalue), convergence follows immediately from the power method.

¹ 1 is a matrix, or vector, whose entries are all equal to 1.


Data:
G: a graph of n linked pages,
k: a natural number,
c: a natural number.
Result: A vector ((H1, H2, ..., Hc), (A1, A2, ..., Ac)) containing exactly the nodes of the c top hubs and the c top authorities.
begin filter(G, k, c)
    (h, a) = hits(G, k);
    Sort the pages with the c largest values in h, resulting in a vector of nodes (H1, H2, ..., Hc);
    Sort the pages with the c largest values in a, resulting in a vector of nodes (A1, A2, ..., Ac);
    return ((H1, H2, ..., Hc), (A1, A2, ..., Ac));
end
Algorithm 4: Returning the top c hubs and authorities.

However, there is a problem here: we cannot prove that nonnegative symmetric matrices have a unique dominant eigenvalue (a unique dominant eigenvalue means the largest eigenvalue has multiplicity 1), simply because this is not true in general.² In the original paper [? ], Kleinberg solves this issue by simply imposing that the matrix M has a unique dominant eigenvalue, and he doesn't pay any further attention to this problem. He presents it as 'a small, technical assumption for the sake of simplicity'.

Is this justified in practice? Actually it is, because one can prove with probability theory that a random matrix Cn has, with probability tending to 1, no repeated eigenvalues as the size of the matrix goes to infinity (see for example Theorem 2.2.3 in [? ]). One can also defend this differently: the only reason why we can't use the Perron-Frobenius theorem (see 1.2.10) here is that M will have zero entries (not all pages in Sσ will be linked to each other; the graph G′σ is not strongly connected in general). But it is intuitively clear that by adding 1 to each entry of M, the final results of the algorithm (a sorted vector with the best hubs and authorities) will not change at all, because pages with larger indegrees and outdegrees will continue to get better hub and authority scores (note, however, that the relative hub and authority scores can fluctuate a bit and the algorithm will converge more slowly because of the lack of zero entries). So, by adding 1 to each entry of M, the matrix becomes a positive, real matrix, and we know from the Perron-Frobenius theorem that these matrices have a unique dominant eigenvalue. So yes, the 'small, technical assumption' in the paper of Kleinberg is justified.

Now that this problem is solved, we present the relevant theorems below. With the preceding explanation in mind, we also impose on the matrix M that it has a unique dominant eigenvalue. Remember that we will generalize the idea of the HITS algorithm to introduce similarity on graphs. Therefore, we will reconsider the convergence of the generalized algorithm in the following section. There we will prove that a limit also exists even when the matrix M has no unique dominant eigenvalue. We don't present this result immediately, because we want to present the results as authentically as possible and show the evolution of the ideas in the successive papers.

² The 2 × 2 identity matrix is a simple counterexample of a symmetric, nonnegative, real matrix that has no unique dominant eigenvalue.


Theorem 2.1.1. If A is a symmetric, real n × n-matrix, then it has n (not necessarily different) real eigenvalues corresponding to real eigenvectors.

Proof. First, treat A as a complex matrix. The characteristic polynomial det(A − λI) has n roots in ℂ and each root is an eigenvalue of A. Let λ ∈ ℂ be any eigenvalue and v ∈ ℂ^n a corresponding eigenvector of A. We have:

Av = λv.

As A = Aᵗ, we also get:

vᵗA = λvᵗ.

Taking the complex conjugate of both sides, we get (A is a real matrix):

v̄ᵗA = λ̄v̄ᵗ.

We get:

v̄ᵗAv = (v̄ᵗA)v = (λ̄v̄ᵗ)v = λ̄v̄ᵗv.

We also have:

v̄ᵗAv = v̄ᵗ(Av) = λv̄ᵗv.

Hence:

λv̄ᵗv = λ̄v̄ᵗv.

We conclude that λ = λ̄ for v ≠ 0, which proves that every eigenvalue of A is real. If λ is an eigenvalue of A, then the matrix A − λI is a real matrix which is not invertible, so a vector s ∈ ℝ^n exists with

(A − λI)s = 0,

proving that a corresponding eigenvector can also be chosen real.

Theorem 2.1.2 (Symmetric Schur Decomposition). Let A be a real symmetric matrix; then there exists an orthogonal matrix P such that:

(i) P⁻¹AP = D, a diagonal matrix,

(ii) the diagonal entries of D are the eigenvalues of A,

(iii) the column vectors of P are eigenvectors corresponding to the eigenvalues of A.

Proof. By induction on the order of the matrix. For n = 1 the theorem is trivial. Let A be a symmetric n × n-matrix. A has at least one eigenvalue λ_1 by the previous theorem. Let x_1 be a corresponding eigenvector with ‖x_1‖ = 1 and Ax_1 = λ_1x_1. By the Gram-Schmidt procedure, we construct an orthonormal basis V_1 = {x_1, v_2, ..., v_n} of ℝ^n. Let:

S_1 = [x_1, v_2, ..., v_n];

since S_1 is orthogonal, we get S_1ᵗ = S_1⁻¹. Consider the matrix S_1⁻¹AS_1. We have:

(S_1⁻¹AS_1)ᵗ = (S_1ᵗAS_1)ᵗ = S_1ᵗAᵗS_1 = S_1⁻¹AS_1.


Thus S_1⁻¹AS_1 is a symmetric matrix. Since S_1e_1 = x_1, we get:

S_1⁻¹AS_1e_1 = (S_1⁻¹A)(x_1) = S_1⁻¹(λ_1x_1) = λ_1(S_1⁻¹x_1) = λ_1e_1.

So we get:

S_1⁻¹AS_1 =
( λ_1  0  )
( 0ᵗ   A_1 ),

with 0 a zero vector of size n − 1 and A_1 an (n − 1) × (n − 1) symmetric matrix. We know by induction that there exists an (n − 1) × (n − 1) orthogonal matrix S_2 such that S_2⁻¹A_1S_2 = D′ with D′ an (n − 1) × (n − 1) diagonal matrix. Let:

S_2′ =
( 1   0  )
( 0ᵗ  S_2 );

S_2′ is also an orthogonal matrix, and we get:

(S_2′)⁻¹ S_1⁻¹AS_1 S_2′ = ( 1 0 ; 0ᵗ S_2ᵗ )( λ_1 0 ; 0ᵗ A_1 )( 1 0 ; 0ᵗ S_2 )
                        = ( λ_1 0 ; 0ᵗ S_2ᵗA_1S_2 )
                        = ( λ_1 0 ; 0ᵗ D′ ).

Thus, if we put

P = S_1S_2′  and  D = ( λ_1 0 ; 0ᵗ D′ ),

we have proved (i). From the definition of diagonalizable matrices and the fact that a square matrix is diagonalizable if and only if it has an eigenbasis (a basis consisting of linearly independent eigenvectors), (ii) and (iii) immediately follow.

Theorem 2.1.3. Given a graph G with n linked pages, the sequence defined in the previous paragraph,

z^(0) = 1 ∈ ℝ^{2n},  z^(k+1) = M z^(k) / ‖M z^(k)‖₂,  k = 0, 1, ...,

converges when M has a unique dominant eigenvalue.


Proof. Since (1) M is diagonalizable as a symmetric matrix by Theorem 2.1.2 and (2) M has a unique dominant eigenvalue, it follows from the power method that the sequence converges to a corresponding dominant eigenvector (h′, a′). This eigenvector contains the hub and authority scores.

We conclude with a nice corollary.

Corollary 2.1.4. The second power of the matrix M has the form:

M² =
( BBᵀ  0   )
( 0    BᵀB ),

and the normalized hub and authority scores are given by the dominant eigenvectors of BBᵀ and BᵀB.

Proof. By the compact form given in equation (2.1), we see that h_k is proportional to (BBᵀ)^{k−1}Ba_0 and a_k is proportional to (BᵀB)^k a_0. Let a_0 be 1 ∈ ℝ^n. From the previous theorem we also know that:

lim_{k→∞} h_k = h  and  lim_{k→∞} a_k = a,

and from the previous proof we know that (h, a) is the dominant eigenvector of M. It follows immediately that h is also the dominant eigenvector of BBᵀ and a is the dominant eigenvector of BᵀB.

2.1.6 Examples

Searching for math professors at the VUB

Example 2.1.5. We conclude this section with a fictitious example of the HITS algorithm. Suppose you are looking for math professors vub with a text-based search engine and you get the following results:

• The website of the mathematics department of the VUB,

• The website of the faculty of science of the VUB,

• The websites of 4 math professors,

• The websites of 10 PhD students at the mathematics department of the VUB.

Let's take a look at the link structure of these web pages (remember that this is a fictitious example):

• The website of the mathematics department at the VUB links to the websites of all the 4 professors, the 10 PhD students and the faculty of science,

• The website of the faculty of science of the VUB links to the websites of all the 4 math professors and the mathematics department,

• The websites of the 4 math professors link to the website of the mathematics department and the faculty of science,


• The websites of the 10 PhD students at the VUB link to the website of their promotor. One professor has 4 PhD students; the other 3 professors have 2 PhD students each.

We can now construct the graph Gσ (of course, this graph is not completely made according to Algorithm 2) and write down its adjacency matrix, with the rows (and corresponding columns) numbered as follows:

• Row 1: website of the mathematics department,

• Row 2: website of the faculty of science,

• Row 3: website of the professor with the 4 PhD students,

• Row 4, 5, 6: websites of the professors with the 2 PhD students,

• Row 7, 8, 9, 10: websites of the 4 PhD students of the professor on row 3,

• Row 11, 12: websites of the 2 PhD students of the professor on row 4,

• Row 13, 14: websites of the 2 PhD students of the professor on row 5,

• Row 15, 16: websites of the 2 PhD students of the professor on row 6.

This leads to:

B =
(0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)
(1 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0)
(1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0)
(1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0)
(1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0)
(1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0)
(0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0)
(0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0)
(0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0)
(0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0)
(0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0)
(0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0)
(0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0)
(0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0)
(0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0)
(0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0)

Intuitively, we expect that the professor with his 4 PhD students will have the largest authority score, immediately followed by the other 3 professors. The website of the mathematics department is clearly the best hub in this example and should get the largest hub score. The website of the faculty of science should also get a high hub score.

We now apply the HITS method by calculating the dominant eigenvector of BBT (thisreturns the hub scores) and the dominant eigenvector of BTB (this returns the authorityscores) with the power method (see 1.4.2). We get:


a = (0.1979, 0.3162, 0.3688, 0.3231, 0.3231, 0.3231, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029)ᵗ

and

h = (0.8645, 0.3605, 0.1207, 0.1207, 0.1207, 0.1207, 0.0866, 0.0866, 0.0866, 0.0866, 0.0758, 0.0758, 0.0758, 0.0758, 0.0758, 0.0758)ᵗ

We see that the websites of the 4 math professors are indeed the best authorities for the search query math professors vub and that the website of the mathematics department is an extremely good hub (this is quite logical, because it links to all the other relevant websites). The professor with his 4 PhD students would be ranked first in the search results (he has the highest authority score); the other professors would appear just underneath him. Obviously, a hub score of 0.8645 is so high that it would be quite exceptional in a graph containing a lot more websites (it is very unlikely to find a website containing links to all the other pages/nodes in the graph). Nevertheless, we conclude that the HITS algorithm returns the results we intuitively wanted.
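The scores above can be approximated with the hits() sketch given after Algorithm 3 (an illustration of our own; only the matrix B and the expected scores come from the example):

import numpy as np

# Adjacency matrix B of Example 2.1.5, built from the link structure.
B = np.zeros((16, 16))
B[0, 1:] = 1                      # dept -> faculty, professors, students
B[1, [0, 2, 3, 4, 5]] = 1         # faculty -> dept and the 4 professors
B[2:6, 0] = B[2:6, 1] = 1         # each professor -> dept and faculty
promotors = [2, 2, 2, 2, 3, 3, 4, 4, 5, 5]
for student, prof in zip(range(6, 16), promotors):
    B[student, prof] = 1          # each PhD student -> his/her promotor

h, a = hits(B, k=100)
print(np.round(a, 4))  # should approximate the authority vector a above
print(np.round(h, 4))  # should approximate the hub vector h above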

Predictors in the Eurovision Song Contest 2009-2015

The Eurovision Song Contest is an annual competition between countries whose public broadcaster is part of the EBU network. The contest is the biggest music competition in the world, reaching about 200 million viewers annually.

The contest consists of three shows: 2 semi-finals and 1 grand final. From each semi-final, 10 countries proceed to the grand final. Italy, Germany, Spain, the United Kingdom and France are always qualified for the grand final because they are the main funders of the event. Also the winner of the previous year participates automatically in the final. Each country, also those who dropped out during the semi-finals, gives points during the voting of the grand final. The voting during the grand final takes place after all the countries have performed their song. Each country is called and awards 12 points to their favorite song, 10 points to their second favorite, and then points from 8 down to 1 to eight other songs. Countries cannot vote for themselves.

The voting system is in fact a positional voting system that is very similar to the Borda count method (see [? ] for a scientific explanation of Borda count): the list of points of a country represents the ranking of the 10 best countries in the voting of that country. So the points are values on an ordinal scale.

The complete voting procedure during the Eurovision Song Contest can be seen as a directed graph: all the participating countries are the nodes and the edges represent the points between the countries (when country a assigns 3 points to country b, then there are 3 edges from a to b).

Let A be the adjacency matrix of the voting during a song contest (A will in fact just be a points table). If we take A as input for the HITS algorithm, we expect that the country with the highest authority score will be the winner of the competition. Actually, we expect a lot more: when we order the countries based on their authority score, we expect that this ordering will be practically equal to the final ranking of the contest. This is based on the simple fact that Borda count just sums up points, and we only expect very small differences when the difference in points between two countries is low. These small differences are then caused by the algorithm. Remember that the HITS algorithm does not simply give a high authority score to nodes with a large indegree, but also takes the hub scores into account. We will see that the hub scores are low (compared to some authority scores) in this example, so their influence will be limited.

But what is very interesting now is the role that the hub scores are playing. In fact, these scores can be seen as a kind of ‘predictive value’ of a country: a country with a high hub score will have assigned its points in such a way that it is seen as a reliable source, meaning that the points of that country match well with the final result of the contest.

Let’s take the Eurovision Song Contest 2014 as an example.The final results of the contest where (the complete result table can be found in Appendix

B):

1. Austria (290 points)2. The Netherlands (238 points)3. Sweden (218 points)4. Armenia (174 points)5. Hungary (143 points)6. Ukraine (113 points)7. Russia (89 points)8. Norway (88 points)9. Denmark (74 points)

10. Spain (74 points)11. Finland (72 points)12. Romania (72 points)13. Switzerland (64 points)14. Poland (62 points)15. Iceland (58 points)16. Belarus (43 points)17. United Kingdom (40 points)18. Germany (39 points)19. Montenegro (37 points)20. Greece (35 points)

Page 56: Similarity on Graphs and Hypergraphs · 2016-02-15 · Similarity on Graphs and Hypergraphs ... The notion of similarity originated from the HITS algorithm: this algorithm was developed

CHAPTER 2. SIMILARITY ON GRAPHS 55

21. Italy (33 points)22. Azerbaijan (33 points)23. Malta (32 points)24. San Marino (14 points)25. Slovenia (9 points)26. France (2 points)

Now we calculate the hub and authority scores based on the full scoreboard; the results are presented in Table 2.1.

Notice that ordering the countries by their authority scores indeed returns the final ranking of the contest. The countries with an authority score equal to 0 are the countries that didn't make it to the final and therefore couldn't receive any points. Also complete losers receiving the famous ‘nul points’ would have an authority score equal to 0, but that didn't occur during the final of 2014. When looking at the hub scores, it appears that Portugal ‘predicted’ the final ranking the best. When we look at the points awarded by Portugal during the final, this is indeed true:

• 12 points: Austria
• 10 points: The Netherlands
• 8 points: Sweden
• 7 points: Switzerland
• 6 points: Hungary
• 5 points: Denmark
• 4 points: Armenia
• 3 points: Norway
• 2 points: Russia
• 1 point: Romania

No less than 8 of the 10 countries in the points of Portugal achieved the final top 10; the top 3 of Portugal is even equal to the final top 3! Romania and Switzerland are the only 2 countries that Portugal awarded points to but that didn't achieve the top 10, and even they are placed 12th and 13th in the final ranking. So it is absolutely not surprising that Portugal is the country with the highest hub score.

Note that countries who scored very well during the final usually have a moderate hub score: this is due to the fact that countries cannot vote for themselves. The average hub score is, in general, also lower than the average authority score: countries can award points to only 10 countries, but (qualified) countries can receive points from every country except themselves. The ‘predictive value’ of a country is therefore limited to only 10 countries, while the voting produces a complete final ranking of 26 countries.

Of course, it is tempting to research the predictive value of countries over a number of years. We opted for the period 2009-2015, because during these last seven editions the voting procedure remained unchanged: the points awarded by each country are based on 50% televoting and 50% jury vote. So we take the average of the hub scores of the last seven years


Participant            Authority score
Austria                0.285110029
The Netherlands        0.235720093
Sweden                 0.212949392
Armenia                0.156091783
Hungary                0.124453384
Ukraine                0.095410297
Norway                 0.086504385
Denmark                0.074474911
Finland                0.070675128
Spain                  0.068432379
Russia                 0.065352465
Romania                0.062560768
Switzerland            0.055868228
Iceland                0.053973263
Poland                 0.052246035
United Kingdom         0.037978889
Germany                0.029958317
Belarus                0.027945852
Malta                  0.026911408
Italy                  0.023817428
Montenegro             0.023425296
Azerbaijan             0.022613716
Greece                 0.021816155
San Marino             0.009315756
Slovenia               0.005843162
France                 0.002167387
Albania                0
Belgium                0
Estonia                0
FYR Macedonia          0
Georgia                0
Ireland                0
Israel                 0
Latvia                 0
Lithuania              0
Moldova                0
Portugal               0

Participant            Hub score
Portugal               0.173610946
Finland                0.172574335
Belgium                0.169584694
Latvia                 0.166295198
Spain                  0.165764986
Hungary                0.165699760
Iceland                0.165062483
Estonia                0.162056401
Denmark                0.160549539
Lithuania              0.159920120
Greece                 0.159278464
Norway                 0.156676045
Slovenia               0.156533823
Sweden                 0.155725338
Romania                0.155212094
France                 0.153993935
Switzerland            0.153864143
Israel                 0.153059334
Ireland                0.147584917
The Netherlands        0.145211066
United Kingdom         0.144342619
Austria                0.137956312
Germany                0.136869443
Ukraine                0.135769452
Italy                  0.121638547
Malta                  0.119063784
Georgia                0.117423087
Moldova                0.115874904
Poland                 0.112296570
FYR Macedonia          0.107531180
Russia                 0.102149330
San Marino             0.098204303
Montenegro             0.097193444
Albania                0.091897617
Belarus                0.089227164
Azerbaijan             0.068005452
Armenia                0.050899422

Table 2.1: The authority and hub scores of the countries during the Final of the Eurovision Song Contest 2014

of each country. The result is presented in Table 2.2. The complete results of each year can be found in Appendix B.

We have to put a condition on the results in Table 2.2: countries should have participated 6 out of 7 times, as more than 1 absence would possibly make the average misleading. In that case, the best predictors are:

1. Hungary

2. Cyprus

3. The Netherlands

4. Belgium

5. Spain

And the worst predictors are:

1. Armenia

2. Azerbaijan

3. FYR Macedonia

4. Georgia

5. Albania

We see that the best predictors are indeed not the most successful countries at the contest itself: Hungary and Cyprus never reached the top 10 in the last seven years, The Netherlands only reached the final twice, and Belgium qualified three times. Spain is, as a main funder, directly qualified for the grand final, but was placed lower than 20th in 5 of the 7 editions. The bottom is also not very surprising: in 2010, 2012 and 2013 Albania did not receive enough televotes, so the jury decided their points (see [? ] for more information), making their judging process vary from one year to another. Azerbaijan and Armenia scored very high in the last five years (e.g. Azerbaijan reached the top 5 every year except 2014 and 2015, won in 2011 and became second in 2013) and, due to their dispute about the Nagorno-Karabakh region, they never exchanged a single point during the last five years. Cultural differences are probably also part of the explanation: Georgia, Azerbaijan and Armenia are located in the Caucasus, a remote corner of Europe with many Asian influences (the region is sometimes referred to as Eurasia).

Table 2.2: The average of the hub scores between 2009-2015

#   Participant             2015      2014      2013      2012      2011      2010      2009      Average
1.  Australia               0.157516  -         -         -         -         -         -         0.157516
2.  Hungary                 0.146121  0.165700  0.150320  0.151291  0.130650  -         0.158260  0.150390
3.  Cyprus                  0.145637  -         0.161413  0.140385  0.145691  0.144175  0.146567  0.147311
4.  The Netherlands         0.155502  0.145211  0.133926  0.147638  0.137637  0.139085  0.163395  0.146056
5.  Belgium                 0.145478  0.169585  0.159189  0.153462  0.118487  0.147362  0.122234  0.145114
6.  Spain                   0.158995  0.165765  0.153609  0.133174  0.114255  0.162626  0.122739  0.144452
7.  Israel                  0.146691  0.153059  0.157744  0.139948  0.131677  0.118970  0.157031  0.143589
8.  Latvia                  0.149527  0.166295  0.134627  0.138436  0.121329  0.142837  0.143788  0.142406
9.  Slovakia                -         -         -         0.138399  0.141130  0.145328  0.142350  0.141802
10. Estonia                 0.145329  0.162056  0.141096  0.131495  0.143254  0.136098  0.131401  0.141533
11. Lithuania               0.118747  0.159920  0.141552  0.144121  0.127425  0.144634  0.152601  0.141286
12. Denmark                 0.160507  0.160550  0.112496  0.139704  0.104782  0.144174  0.156365  0.139797
13. Malta                   0.145890  0.119064  0.146482  0.122804  0.162192  0.138073  0.139682  0.139170
14. Poland                  0.159925  0.112297  -         -         0.127825  0.159433  0.135973  0.139090
15. Slovenia                0.135649  0.156534  0.141375  0.148665  0.118682  0.137277  0.133178  0.138766
16. Iceland                 0.145817  0.165062  0.149906  0.129060  0.130358  0.137492  0.113493  0.138741
17. Austria                 0.150454  0.137956  0.127058  0.153796  0.129429  -         0.132217  0.138485
18. Germany                 0.157865  0.136869  0.126650  0.157167  0.103942  0.115909  0.155134  0.136220
19. Croatia                 -         -         0.169310  0.132023  0.129980  0.114014  0.134540  0.135974
20. Romania                 0.153499  0.155212  0.141742  0.119829  0.131675  0.128084  0.119466  0.135644
21. France                  0.142272  0.153994  0.135187  0.151435  0.137268  0.118194  0.109034  0.135341
22. Greece                  0.125544  0.159278  0.144943  0.124054  0.145912  0.095629  0.148054  0.134774
23. Finland                 0.149564  0.172574  0.108008  0.141472  0.101891  0.133330  0.132773  0.134231
24. Ireland                 0.141492  0.147585  0.137790  0.132864  0.103704  0.147411  0.128126  0.134139
25. United Kingdom          0.153125  0.144343  0.143615  0.121922  0.096917  0.136172  0.133630  0.132818
26. Russia                  0.128412  0.102149  0.130277  0.129907  0.143360  0.141861  0.152992  0.132708
27. Sweden                  0.130248  0.155725  0.122816  0.099063  0.106864  0.153858  0.160329  0.132701
28. Bulgaria                -         -         0.136342  0.150772  0.110776  0.136208  0.122794  0.131378
29. Norway                  0.144331  0.156676  0.098933  0.154850  0.099012  0.155455  0.107546  0.130972
30. Bosnia & Herzegovina    -         -         -         0.135192  0.127186  0.139054  0.119158  0.130147
31. Portugal                0.144985  0.173611  -         0.116836  0.130723  0.107674  0.105899  0.129955
32. Ukraine                 -         0.135769  0.106438  0.123998  0.120992  0.139278  0.142826  0.128217
33. Belarus                 0.147342  0.089227  0.143351  0.120914  0.138934  0.104108  0.149588  0.127638
34. Serbia                  0.128584  -         0.165969  0.113907  0.095025  0.132455  0.123080  0.126503
35. Switzerland             0.146739  0.153864  0.117585  0.125128  0.092621  0.127077  0.117733  0.125821
36. Moldova                 0.130904  0.115875  0.139519  0.119439  0.126125  0.119696  0.119161  0.124388
37. Turkey                  -         -         -         0.113535  0.145608  0.137704  0.094715  0.122890
38. San Marino              0.131533  0.098204  0.092521  0.130841  0.158559  -         -         0.122332
39. Italy                   0.146217  0.121639  0.134389  0.109725  0.098845  -         -         0.122163
40. Montenegro              0.085579  0.097193  0.156165  0.137301  -         -         0.126144  0.120476
41. Albania                 0.124384  0.091898  0.102282  0.098185  0.142108  0.140538  0.141493  0.120127
42. Georgia                 0.111867  0.117423  0.147386  0.123067  0.132744  0.086281  -         0.119795
43. FYR Macedonia           0.085414  0.107531  0.138521  0.132612  0.116197  0.127626  0.124228  0.118876
44. Czech Republic          0.130862  -         -         -         -         -         0.103917  0.117390
45. Azerbaijan              0.125808  0.068005  0.109553  0.113396  0.119173  0.120818  0.109357  0.109444
46. Armenia                 0.119244  0.050899  0.125792  -         0.128531  0.101509  0.109782  0.105960


2.1.7 Final reflection

The HITS algorithm is one of the few algorithms that has the ability to rank pages according to a specific search query. Also, the computational cost of the HITS algorithm, which equals the cost of the power method (see 1.4.2), is not excessive and is feasible for most servers. The result of the HITS algorithm for popular queries will also be cached by most search engines, which reduces the computational cost even more, because the saved results can be served directly to the user without any new calculations.

The biggest disadvantage of the HITS algorithm is that it suffers from topic drift: the graph G′σ could contain nodes which have high authority scores for the query but are completely irrelevant. E.g. Facebook is nowadays a universally popular website: almost every website contains a ‘like’ or ‘share’ button linking to Facebook, and Facebook itself contains tons of posts linking to other webpages. This means that Facebook has a great chance to appear in almost any G′σ and to receive a high authority score, because the original HITS algorithm as presented here cannot detect such ‘universally popular’ websites. The same goes for other social media websites and some advertisements.

Nowadays, we know that Ask.com uses this algorithm. In fact, most search engines are very secretive about their search algorithm (e.g. Google), to make profit and to avoid cheating by webmasters. Still, chances are that other search engines use some variant of the algorithm as well, in combination with a lot of other procedures.


2.2 Node similarity

This section provides a detailed overview of the paper ‘A Measure of Similarity between Graph Vertices: Applications to Synonym Extraction and Web Searching’ [? ] by V. D. Blondel and others. The paper generalizes the HITS algorithm, leading to the concept of (node) similarity on graphs. In the following chapters, we will often call this method the ‘method of Blondel’. The method is explained in detail and will serve as a framework for the following sections and chapters. It allows a mathematically rigorous approach, leading to a very important convergence theorem. Thanks to its generality, this convergence theorem is extremely powerful and will be used to prove convergence of all the presented algorithms in the next sections and chapters. The theorem also solves the issue from the previous section, where the matrix M had to have a unique dominant eigenvalue. Once we have constructed the algorithm, proved its convergence and introduced the compact form, we also look at some special cases of node similarity between graphs.

Introduction

In this introduction, we explain the main idea behind the method of Blondel. Because we are focussing on the intuitive ideas here, we don't prove any theorems in this paragraph yet. The theoretical results are presented in the following paragraphs. Notice that all the graphs in this section are directed graphs, although this is not always explicitly stated. However, the method works for undirected graphs too, often leading to much easier formulas, since in that case the graphs have symmetric adjacency matrices. We will give explicit attention to undirected graphs in the section about ‘Special cases’ (see 2.2.4).

The method of Blondel is a pure generalization of the HITS algorithm. Remember from the previous section that we constructed a graph G′σ and calculated hub and authority scores for each vertex. Well, the authors of [? ] found a way to replace these hub and authority scores by just any other graph, allowing us to compare G′σ to any other graph H. We will call the graph that replaces the hub and authority scores the structure graph.

How does this work? For G′σ, the authority score of vertex vj of G′σ can be thought of as a score between vj and the vertex denoted as authority of the graph:

hub → authority

and, conversely, the hub score of vertex vj of G′σ can be thought of as a score between vj and the vertex denoted as hub. We can replace G′σ with any other graph G. We call the hub-authority graph a structure graph, and we already know the resulting iterative method from the previous section. We will call these scores similarity scores.

The central question is now: which mutually reinforcing relation do we get when using another structure graph, different from the hub-authority structure graph? If we take, for example, as structure graph a path graph with three vertices v1, v2, v3:

v1 → v2 → v3


Let G(W,→) be a graph. With each vertex wi of G we now associate three scores zi1, zi2 and zi3, one for each vertex of the structure graph. We initialize these scores with a positive value and then update them according to the mutually reinforcing relation:

zi1 := ∑_{j:(wi,wj)∈→} zj2,
zi2 := ∑_{j:(wj,wi)∈→} zj1 + ∑_{j:(wi,wj)∈→} zj3,
zi3 := ∑_{j:(wj,wi)∈→} zj2,

or, in matrix form (zj denotes the column vector with entries zij, and B is the adjacency matrix of graph G):

( z1 )(k+1)   ( 0    B    0 ) ( z1 )(k)
( z2 )      = ( B^T  0    B ) ( z2 )
( z3 )        ( 0    B^T  0 ) ( z3 )

which we, again, can denote by

z(k+1) = Mz(k). (2.3)

The principle is now exactly the same as in the previous example with hubs and authorities. The matrix M is symmetric and nonnegative, and again the result is the limit of the normalized vector sequence:

s(0) = z(0) > 0,   s(k+1) = Ms(k) / ‖Ms(k)‖2,   k = 0, 1, . . .   (2.4)

Remember that the HITS algorithm assumed that M has a unique dominant eigenvalue, but we don't want to make this assumption in this section, because we want a concept that can be applied to all kinds of directed graphs. We will see that without this assumption, the sequence (2.4) does not always converge but oscillates between the limits

seven = lim_{k→∞} s(2k)   and   sodd = lim_{k→∞} s(2k+1).

The limit vectors seven and sodd do in general depend on the initial vector z(0). We will prove later on that the vector z(0) = J, with J a matrix of ones, is a good choice.

The extremal limit seven(J) will be defined as the similarity matrix. The element sij is called the similarity score between vertex wi of G and vertex vj of the structure graph.

We now give a numerical example.

Example 2.2.1. Take as structure graph again the path graph with three vertices v1, v2, v3:

v1 → v2 → v3

Let G (W,→) be the following graph:

Page 64: Similarity on Graphs and Hypergraphs · 2016-02-15 · Similarity on Graphs and Hypergraphs ... The notion of similarity originated from the HITS algorithm: this algorithm was developed

CHAPTER 2. SIMILARITY ON GRAPHS 63

[Figure: the graph G on vertices w1, . . . , w5; its edges can be read off from the adjacency matrix B below.]

Then the adjacency matrix B is:

B =
( 0 1 1 0 0 )
( 0 0 1 1 1 )
( 0 0 1 1 0 )
( 0 0 0 0 0 )
( 0 0 0 0 0 )

By using the described mutually reinforcing updating iteration we get the following similarity matrix (a numerical algorithm to calculate this is presented later on in this section):

S =
( 0.3557 0.1265 0      )
( 0.3102 0.3451 0.0557 )
( 0.2732 0.4619 0.4115 )
( 0      0.1579 0.3557 )
( 0      0.0840 0.1521 )

The similarity score of w4 with v2 of the structure graph is equal to 0.1579.

We now construct the general case. Take two (directed) graphs G = (U,→) and H = (V,→′) with nG and nH the orders of the graphs. We think of G as the structure graph (such as the graph hub → authority and the graph 1 → 2 → 3 in the previous paragraphs). We get the following mutually reinforcing updating iteration, with updating equations:

z(k+1)ij := ∑_{p:(vp,vi)∈→′, q:(uq,uj)∈→} z(k)pq + ∑_{p:(vi,vp)∈→′, q:(uj,uq)∈→} z(k)pq   (2.5)

Consider the product graph G × H (see Definition 1.5.17). The above updating equation is equivalent to replacing the scores of all vertices of the product graph by the sum of the scores of the vertices linked by an incoming or outgoing edge.

Equation (2.5) can be rewritten in a more compact matrix form. Let Z(k) be the nH × nG matrix of entries zij at iteration k, and let A and B be the adjacency matrices of G and H. Then the updating equations can be written as:

Z(k+1) = BZ(k)A^T + B^TZ(k)A,   k = 0, 1, . . . ,   (2.6)


We will prove that the normalized even and odd iterates of this updating equation converge when using as start matrix J, the matrix of ones. This normalized result will be denoted by S, the similarity matrix. This compact form also shows the true meaning of similarity between nodes: the similarity score between two vertices is large when the similarity scores of the adjacent vertices are large. We will dive deeper into the meaning of the algorithm in Chapter 3. The following example shows a calculated similarity matrix of two directed graphs.

Example 2.2.2. Let H (V,→) be the following graph:

[Figure: the graph H on vertices v1, . . . , v4.]

Let G (V ′,→′) be the following graph:

[Figure: the graph G on vertices v′1, . . . , v′6.]

This results in the following similarity matrix (a numerical algorithm to calculate this matrix is introduced later in this section):

S =
( 0.2636 0.2786 0.2723 0.1289 )
( 0.1286 0.1286 0.0624 0.1268 )
( 0.2904 0.3115 0.2825 0.1667 )
( 0.1540 0.1701 0.2462 0      )
( 0.0634 0.0759 0.1018 0      )
( 0.3038 0.3011 0.2532 0.1999 )


We see, for example, that vertex v2 of H is most similar to vertex v′3 in G, because the similarity score s32 is the highest among all the similarity scores of v2.

2.2.1 Convergence theorem

In the introduction, we already mentioned that the sequence in Equation (2.4) converges for even and odd iterates. We will prove this at the end of this subsection. But before we arrive there, we first need some results on the eigenvectors and eigenvalues of nonnegative matrices. The Perron-Frobenius theorem applies only to nonnegative, irreducible matrices, but since in practice one is confronted with nonnegative matrices that are not necessarily irreducible, we extend the Perron-Frobenius theorem and see what remains without this assumption. We will prove that the spectral radius ρ(M) of a nonnegative matrix M is an eigenvalue of M, in this case also called the Perron root (see Definition 1.2.11). Moreover, there exists an associated nonnegative eigenvector x ≥ 0 (x ≠ 0), the Perron vector, such that Mx = ρx. We will also investigate more specific results when M is not only nonnegative, but also symmetric.

Lemma 2.2.3. Let A, B be n × n-matrices. If |A| ≤ B, then ρ(A) ≤ ρ(|A|) ≤ ρ(B). (See Definition 1.1.8 for the definition of | · |.)

Proof. For every m = 1, 2, . . . we have

|A^m| ≤ |A|^m ≤ B^m

by using some trivial properties of the absolute value function. Let ‖·‖2 be the matrix 2-norm induced by the Euclidean vector norm: for any matrix M, we have ‖M‖2 = max_{‖x‖2=1} ‖Mx‖2. For this matrix norm it is easy to see that if |M| ≤ |M′| (see Definition 1.1.5), it follows that ‖M‖2 ≤ ‖M′‖2, and also that ‖M‖2 = ‖ |M| ‖2. We get:

‖A^m‖2 ≤ ‖ |A|^m ‖2 ≤ ‖B^m‖2

and

‖A^m‖2^(1/m) ≤ ‖ |A|^m ‖2^(1/m) ≤ ‖B^m‖2^(1/m)

for all m = 1, 2, . . . If we now let m → ∞ and apply the spectral radius formula from Theorem 1.3.12, we get:

ρ(A) ≤ ρ(|A|) ≤ ρ(B).

Theorem 2.2.4. If A ≥ 0 is an n × n-matrix, then ρ(A) is an eigenvalue of A and there is a nonnegative vector x ≥ 0, x ≠ 0, such that Ax = ρ(A)x.

Proof. For any ε > 0 define A(ε) = (aij + ε) > 0, so in A(ε) we add ε to each entry of A. Denote by x(ε) the Perron vector of A(ε); this exists by the Perron-Frobenius theorem, as A(ε) is a strictly positive matrix (see Theorem 1.2.10), so x(ε) > 0 and ∑_{i=1}^n x(ε)i = 1. Since the set of vectors {x(ε) : ε > 0} is contained in the compact set {x : x ∈ Rn, ‖x‖1 ≤ 1}, we know from the Bolzano-Weierstrass theorem (see Theorem 2.1 in [? ]) that there is a monotone decreasing sequence ε1 > ε2 > . . . with lim_{k→∞} εk = 0 such that lim_{k→∞} x(εk) = x exists. Since x(εk) > 0 for all k = 1, 2, . . . , it must be that x = lim_{k→∞} x(εk) ≥ 0; x = 0 is impossible because:

∑_{i=1}^n xi = lim_{k→∞} ∑_{i=1}^n x(εk)i = 1


By Lemma 2.2.3, ρ(A(εk)) ≥ ρ(A(εk+1)) ≥ · · · ≥ ρ(A) for all k = 1, 2, . . . , so the sequence {ρ(A(εk))}k=1,2,... is a monotone decreasing sequence. Thus, ρ = lim_{k→∞} ρ(A(εk)) exists and ρ ≥ ρ(A). From the fact that

Ax = lim_{k→∞} A(εk)x(εk) = lim_{k→∞} ρ(A(εk))x(εk) = lim_{k→∞} ρ(A(εk)) lim_{k→∞} x(εk) = ρx,

and the fact that x ≠ 0, we conclude that ρ is an eigenvalue of A. But then ρ ≤ ρ(A), so it must be that ρ = ρ(A).

Now that we know that any nonnegative matrix M has its spectral radius as an eigenvalue and that there exists an associated nonnegative eigenvector, we will see if we can get more specific results when handling nonnegative, symmetric matrices. First we introduce the orthogonal projection.

Definition 2.2.5. The projection of a vector x on a vector y is defined as

proj_y x = (⟨y,x⟩ / ‖y‖²) y,

where ⟨x,y⟩ = ∑_{i=1}^n xiyi denotes the standard inner product of two vectors in Rn. The projection of x on a subspace Y with orthonormal basis {y1, y2, . . . , ym} is defined as the sum of projections:

proj_Y x = ∑_{i=1}^m ⟨yi,x⟩ yi

(the denominators ‖yi‖² disappear because the basis vectors have norm 1).

On our way to a version of the Perron-Frobenius theorem for nonnegative, symmetric matrices, we first describe the orthogonal projection in matrix form.

Theorem 2.2.6. Let V be a linear subspace of Rn with orthonormal basis {v1, v2, . . . , vm}. Arrange the column vectors vi in a matrix V and let x ∈ Rn. The orthogonal projection of x on V is then given by:

Πx = V V^T x,

where the matrix Π = V V^T is the orthogonal projector. Projectors have the property that Π² = Π.

Proof. We use the connection between transposes and the standard inner product to find the matrix of the orthogonal projection on the subspace V:

proj_V x = ∑_{i=1}^m ⟨vi,x⟩ vi = ∑_{i=1}^m vi ⟨vi,x⟩

Remember that ⟨x,y⟩ = x^T y, so:

= ∑_{i=1}^m vi (vi^T x) = ∑_{i=1}^m (vi vi^T) x = V V^T x

Proving that Π² = Π is also easy; remember that the columns of V are orthonormal, so V^T V = Im:

Π² = (V V^T)² = V V^T V V^T = V Im V^T = V V^T = Π

Lemma 2.2.7. Let Π be an orthogonal projector, then:

Π^T = Π

Proof. Π^T = (V V^T)^T = (V^T)^T V^T = V V^T = Π
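As a quick numerical sanity check of Theorem 2.2.6 and Lemma 2.2.7 (our own sketch, not part of the thesis), one can build an orthonormal basis of a random subspace with a QR factorization and verify that Π = V V^T is both idempotent and symmetric:

import numpy as np

rng = np.random.default_rng(0)
# Orthonormal basis of a random 3-dimensional subspace of R^6.
V, _ = np.linalg.qr(rng.standard_normal((6, 3)))
Pi = V @ V.T                      # orthogonal projector on span(V)

assert np.allclose(Pi @ Pi, Pi)   # Theorem 2.2.6: Pi^2 = Pi
assert np.allclose(Pi.T, Pi)      # Lemma 2.2.7: Pi^T = Pi
x = rng.standard_normal(6)
assert np.allclose(V.T @ (x - Pi @ x), 0)   # the residual is orthogonal to the subspace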

We already introduced the Jordan normal form in Theorem 1.3.10, but we also need the following theorem. The proof of this theorem is highly technical and falls beyond the scope of this master thesis; the separate results can be found in Theorems 1.3.15, 1.3.16 and 1.3.17 in [? ].

Theorem 2.2.8. Let A be an n × n-matrix with s distinct eigenvalues λ1, . . . , λs. Let each λi have algebraic multiplicity mi and geometric multiplicity µi. The matrix A is similar to a block diagonal Jordan matrix

J = ( J1
          ⋱
             Jµ )

where

1. µ = µ1 + µ2 + · · · + µs,

2. for each λi, the number of Jordan blocks in J with value λi is equal to µi,

3. λi appears on the diagonal of J exactly mi times.

Further, the matrix J is unique, up to re-ordering the Jordan blocks on the diagonal.


We are now ready to generalize the Perron-Frobenius theorem to nonnegative, symmetric matrices.

Theorem 2.2.9. Let M be a symmetric nonnegative matrix with spectral radius ρ. Then the algebraic and geometric multiplicity of the Perron root ρ are equal; there is a nonnegative matrix X whose columns span the eigenspace associated with the Perron root; and the elements of the orthogonal projector Π on the vector space associated with the Perron root of M are all nonnegative.

Proof. We know that any symmetric nonnegative matrix M is similar to a Jordan canonical form J. So from the previous Theorem 2.2.8, we know that ρ will appear exactly mρ (= the algebraic multiplicity of ρ) times on the diagonal, and we also know that there will be exactly µρ (= the geometric multiplicity of ρ) Jordan blocks with value ρ on their diagonal. But from Theorem 2.1.2 we already know that J is in fact a diagonal matrix (all Jordan blocks have size 1 × 1) with all eigenvalues on the diagonal, so mρ = µρ. To construct the nonnegative matrix X, we use Theorem 2.1.2 again: P⁻¹MP = D, and we know from that theorem that the column vectors of P are the eigenvectors corresponding to the eigenvalues of M. Looking at the proof of Theorem 2.1.2, we know that the eigenvector xi corresponding to the eigenvalue dii (the i'th diagonal element of D) can be found in the i'th column of P. We know that these eigenvectors are linearly independent because of the definition of diagonalizable matrices (an n × n matrix A is diagonalizable if and only if it has an eigenbasis). So if we select all the columns of P corresponding to ρ, we find a basis of the eigenspace of the Perron root.

To see that the resulting matrix X is nonnegative, notice that the eigenspace of ρ can also be found by taking the Perron vectors of the (1 × 1) Jordan blocks Ji that have ρ on the diagonal, appropriately padded with zeros. All these blocks Ji are irreducible and nonnegative, and it follows from the Perron-Frobenius theorem that the corresponding eigenvectors are nonnegative.

To see that the orthogonal projector Π associated with the eigenspace of ρ is nonnegative, notice that normalizing the vectors in X gives an orthonormal basis: by Theorem 2.1.2 we know that P is orthogonal, and we picked some columns of P to form our eigenbasis of ρ. Normalizing doesn't affect the sign of the vectors, so the normalized X′ is also nonnegative, and hence Π = X′X′^T will also be nonnegative.

Theorem 2.2.10. Let M be an n × n symmetric, nonnegative matrix of spectral radius ρ. Let s(0) > 0 and consider the sequence

s(k+1) = Ms(k) / ‖Ms(k)‖2,   k = 0, 1, . . .

Two convergence cases can occur, depending on whether or not −ρ is an eigenvalue of M. When −ρ is not an eigenvalue of M, the sequence s(k) simply converges to Πs(0)/‖Πs(0)‖2, where Π is the orthogonal projector on the eigenspace associated with the Perron root ρ. When −ρ is an eigenvalue of M, the subsequences s(2k) and s(2k+1) converge to the limits

seven(s(0)) = lim_{k→∞} s(2k) = Πs(0)/‖Πs(0)‖2   and   sodd(s(0)) = lim_{k→∞} s(2k+1) = ΠMs(0)/‖ΠMs(0)‖2,

where Π is the orthogonal projector on the sum of the invariant subspaces associated with ρ and −ρ. In both cases the set of all possible limits is given by:

{seven(s(0)), sodd(s(0)) : s(0) > 0} = {Πs/‖Πs‖2 : s > 0}

and the vector seven(1) is the unique vector of largest possible Manhattan norm in this set.

Proof. We only prove the case where −ρ is an eigenvalue; the other case is an easy adaptation. Denote the invariant subspaces of M corresponding to ρ, −ρ and to the rest of the spectrum by Vρ, V−ρ and Vµ, respectively. From the previous theorem, we know that Vρ and V−ρ are certainly nontrivial: ρ and −ρ have at least multiplicity 1, so each eigenspace contains at least one nonzero vector. We also suppose that Vµ is nontrivial (if Vµ were trivial, the proof would only get easier). Let Vρ be the n × u-matrix whose columns span Vρ, V−ρ the n × v-matrix spanning V−ρ, and Vµ the n × w-matrix spanning Vµ. We now have:

MVρ = ρVρ,   MV−ρ = −ρV−ρ,

and for M we can write:

MVµ = VµMµ,

where Mµ is a w × w-matrix with spectral radius µ strictly less than ρ, because Vµ is nontrivial by assumption.

Remember from Theorem 2.1.2 that P⁻¹MP = D, with D a diagonal matrix; this can be rewritten as M = PDP⁻¹ or M = PDP^T (P is an orthogonal matrix). In this case we can rewrite it as (the so-called eigendecomposition for symmetric matrices):

M = ( Vρ  V−ρ  Vµ ) diag( ρI, −ρI, Mµ ) ( Vρ  V−ρ  Vµ )^T
  = ρ VρVρ^T − ρ V−ρV−ρ^T + Vµ Mµ Vµ^T.

Let A = ( Vρ  V−ρ  Vµ ); it follows that (remember that A is orthogonal by Theorem 2.1.2):

M² = A diag( ρI, −ρI, Mµ ) A^T A diag( ρI, −ρI, Mµ ) A^T
   = A diag( ρI, −ρI, Mµ ) A⁻¹ A diag( ρI, −ρI, Mµ ) A^T
   = A diag( ρ²I, (−ρ)²I, Mµ² ) A^T
   = ρ² VρVρ^T + ρ² V−ρV−ρ^T + Vµ Mµ² Vµ^T
   = ρ²Π + Vµ Mµ² Vµ^T,

where

Π = VρVρ^T + V−ρV−ρ^T

is the orthogonal projector on the invariant subspace Vρ ⊕ V−ρ of M² corresponding to ρ². Analogously, we also have:

M^(2k) = ρ^(2k) Π + Vµ Mµ^(2k) Vµ^T,   (2.7)

and since Mµ has spectral radius µ, strictly below ρ, we can multiply (2.7) by s(0):

M^(2k) s(0) = ρ^(2k) Πs(0) + Vµ Mµ^(2k) Vµ^T s(0)

⇔ M^(2k)s(0) / ‖M^(2k)s(0)‖2 = ( ρ^(2k)Πs(0) + VµMµ^(2k)Vµ^T s(0) ) / ‖ρ^(2k)Πs(0) + VµMµ^(2k)Vµ^T s(0)‖2

⇔ s(2k) = ( Πs(0) + ρ^(−2k)VµMµ^(2k)Vµ^T s(0) ) / ‖Πs(0) + ρ^(−2k)VµMµ^(2k)Vµ^T s(0)‖2

⇔ s(2k) = Πs(0)/‖Πs(0)‖2 + O((µ/ρ)^(2k))   (2.8)

⇔ lim_{k→∞} s(2k) = Πs(0)/‖Πs(0)‖2   (2.9)

Analogously, when multiplying (2.7) by Ms(0), we have:

lim_{k→∞} s(2k+1) = ΠMs(0)/‖ΠMs(0)‖2   (2.10)

The expressions (2.9) and (2.10) bring up the question whether ‖Πs(0)‖2 and ‖ΠMs(0)‖2 are always nonzero. From Definition 1.3.2 we know that the squares of these norms can be calculated as

(Πs(0))^T (Πs(0))   and   (ΠMs(0))^T (ΠMs(0));

since Π² = Π and Π^T = Π, this equals

s(0)^T Πs(0)   and   s(0)^T MΠMs(0).

These quantities are both nonzero, since s(0) > 0 and since we know from the previous Theorem 2.2.9 that both Π and MΠ are nonnegative and nonzero.

From the fact that M is nonnegative and from the formulas for seven(s(0)) and sodd(s(0)), we conclude that both limits lie in {Πs/‖Πs‖2 : s > 0}. We now show that every element s̄ ∈ {Πs/‖Πs‖2 : s > 0} can be obtained as seven(s(0)) for some s(0) > 0. Since the entries of Π are nonnegative, so are those of s̄. This vector may have some zero entries. From s̄ we construct s(0) by adding ε > 0 to all the zero entries of s̄. For ε → 0, the vector s(0) − s̄ is clearly orthogonal to Vρ ⊕ V−ρ and will vanish in the iteration of M². Thus we have seven(s(0)) = s̄ for such an s(0) > 0, proving our statement.

We now prove the last statement. We actually want to prove that

‖ Π1/‖Π1‖2 ‖1 ≥ ‖ Πs(0)/‖Πs(0)‖2 ‖1   for all s(0) > 0.

To prove this, we rewrite both sides. Remember that Π is nonnegative; with the norm properties we get:

‖ Π1/‖Π1‖2 ‖1 = (1/‖Π1‖2) ‖Π1‖1
= ‖Π1‖1 / √(1^T Π² 1)
= √(1^T Π² 1) ‖Π1‖1 / ( √(1^T Π² 1) √(1^T Π² 1) )
= √(1^T Π² 1) ‖Π1‖1 / (1^T Π 1)
= √(1^T Π² 1) ( |(Π1)1| + |(Π1)2| + · · · + |(Π1)n| ) / (1^T Π 1)
= √(1^T Π² 1) ( (Π1)1 + (Π1)2 + · · · + (Π1)n ) / ( (Π1)1 + (Π1)2 + · · · + (Π1)n )
= √(1^T Π² 1)

and also:

‖ Πs(0)/‖Πs(0)‖2 ‖1 = ‖Πs(0)‖1 / ‖Πs(0)‖2
= ( |(Πs(0))1| + |(Πs(0))2| + · · · + |(Πs(0))n| ) / √(s(0)^T Π² s(0))
= ( (Πs(0))1 + (Πs(0))2 + · · · + (Πs(0))n ) / √(s(0)^T Π² s(0))
= 1^T Πs(0) / √(s(0)^T Π² s(0))
= 1^T Π² s(0) / √(s(0)^T Π² s(0))

We now apply the Cauchy-Schwarz inequality to Π1 and Πs(0). Remember that Π = WW^T for some W, so Π^T = (WW^T)^T = (W^T)^T W^T = WW^T = Π, that the matrix Π and all vectors involved are nonnegative, and that Π² = Π:

|⟨Π1, Πs(0)⟩| ≤ ‖Π1‖2 · ‖Πs(0)‖2
⇔ |(Π1)^T Πs(0)| ≤ √(1^T Π² 1) · √(s(0)^T Π² s(0))
⇔ |1^T Π^T Πs(0)| ≤ √(1^T Π² 1) · √(s(0)^T Π² s(0))
⇔ |1^T Π² s(0)| ≤ √(1^T Π² 1) · √(s(0)^T Π² s(0))
⇔ |1^T Π² s(0)| / √(s(0)^T Π² s(0)) ≤ √(1^T Π² 1)
⇔ 1^T Π² s(0) / √(s(0)^T Π² s(0)) ≤ √(1^T Π² 1)

The last equivalence holds since Πs(0) and Π1 are both real and nonnegative.
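The two-limit behaviour of Theorem 2.2.10 is easy to observe numerically. In the small experiment below (our own toy example), M is symmetric, nonnegative and bipartite, so −ρ is an eigenvalue and the full sequence s(k) does not converge; the even and odd subsequences, however, each settle on their own limit after a few steps:

import numpy as np

B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
# A bipartite M has a symmetric spectrum: -rho is an eigenvalue along with rho.
M = np.block([[np.zeros((2, 2)), B],
              [B.T, np.zeros((2, 2))]])

s = np.array([1.0, 2.0, 3.0, 4.0])
s /= np.linalg.norm(s)
for k in range(1, 9):
    s = M @ s
    s /= np.linalg.norm(s)
    print(k, np.round(s, 4))   # even and odd iterates approach two different limits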


2.2.2 Similarity matrices

We now come to the formal definition of the similarity matrix of two graphs G = (U,→) and H = (V,→′), by proving that the mutually reinforcing relation is given by (A and B are the adjacency matrices of G and H):

Z(k+1) = BZ(k)A^T + B^TZ(k)A,   k = 0, 1, . . . ,   (2.11)

which we already introduced in (2.6). To prove this relation, we take a detour via a nice property of the Kronecker product and the vec-operator. The vec-operator is a convenient way of stacking the columns of a matrix left to right. With this operator we can rewrite (2.11) as z(k+1) = Mz(k), with M equal to a sum of Kronecker products. This will allow us to apply Theorem 2.2.10.

Definition 2.2.11. With each m × n-matrix A = [aij] we can associate the vector vec(A) ∈ R^(mn) defined by:

vec(A) = ( a11 . . . am1  a12 . . . am2  . . .  a1n . . . amn )^T

Definition 2.2.12. The Kronecker product (or tensor product) of an m × n-matrix A = (aij) and a p × q-matrix B = (bij) is denoted by A ⊗ B and is defined by the matrix:

A ⊗ B = ( a11B . . . a1nB
          ⋮      ⋱   ⋮
          am1B . . . amnB ) ∈ R^(mp×nq)

Lemma 2.2.13. Consider an m × n-matrix A = (aij) and a p × q-matrix B = (bij). We have:

A^T ⊗ B^T = (A ⊗ B)^T

Proof. By direct computation, we get:

(A ⊗ B)^T = ( a11B . . . a1nB )^T     ( a11B^T . . . am1B^T )
            ( ⋮      ⋱   ⋮    )    =  ( ⋮        ⋱   ⋮      )  = A^T ⊗ B^T,
            ( am1B . . . amnB )       ( a1nB^T . . . amnB^T )

since transposing a block matrix transposes the pattern of the blocks as well as each block itself.

Theorem 2.2.14. Let A be an m × n-matrix, B a p × q-matrix and C an m × q-matrix. The matrix equation

AXB = C,

with X an unknown n × p-matrix, is equivalent to the system of qm equations in np unknowns given by:

(B^T ⊗ A) vec(X) = vec(C),

so:

vec(AXB) = (B^T ⊗ A) vec(X)

Proof. Let Qk denote the kth column of a given matrix Q. We get:

(AXB)k = A(XB)k = AXBk = A ∑_{i=1}^p bik Xi
       = Ab1kX1 + Ab2kX2 + · · · + AbpkXp
       = ( b1kA  b2kA  . . .  bpkA ) vec(X)
       = (Bk^T ⊗ A) vec(X)

We get:

vec(AXB) = ( B1^T ⊗ A )
           ( ⋮         ) vec(X)
           ( Bq^T ⊗ A )

But then it follows immediately that

vec(AXB) = (B^T ⊗ A) vec(X),

because the transpose of a column of B is a row of B^T.
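Theorem 2.2.14 can be checked numerically in a few lines. The sketch below (ours) verifies vec(AXB) = (B^T ⊗ A) vec(X) on random matrices; note that the vec-operator stacks columns, which in NumPy corresponds to reshaping in Fortran order:

import numpy as np

rng = np.random.default_rng(1)
m, n, p, q = 3, 4, 5, 2
A = rng.standard_normal((m, n))
X = rng.standard_normal((n, p))
B = rng.standard_normal((p, q))

def vec(M):
    return M.reshape(-1, order='F')   # stack the columns left to right

lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
assert np.allclose(lhs, rhs)          # Theorem 2.2.14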

Theorem 2.2.15. Let G and H be two graphs with adjacency matrices A = [aij] ∈ R^(n×n) and B = [bij] ∈ R^(m×m), let S(0) > 0 be an initial positive matrix, and define:

S(k+1) = (BS(k)A^T + B^TS(k)A) / ‖BS(k)A^T + B^TS(k)A‖F,   k = 0, 1, . . .

Then the matrix subsequences S(2k) and S(2k+1) converge to Seven and Sodd, and among all the matrices in the set of all possible limits

{Seven(S(0)), Sodd(S(0)) : S(0) > 0},

the matrix Seven(J), with J the all-one matrix, is the unique matrix of largest 1-norm.


Proof. We first rewrite the compact form (2.11):

Z(k+1) = BZ(k)A^T + B^TZ(k)A
⇔ vec(Z(k+1)) = vec(BZ(k)A^T + B^TZ(k)A)
⇔ vec(Z(k+1)) = vec(BZ(k)A^T) + vec(B^TZ(k)A)
⇔ vec(Z(k+1)) = ((A^T)^T ⊗ B) vec(Z(k)) + (A^T ⊗ B^T) vec(Z(k))
⇔ vec(Z(k+1)) = (A ⊗ B + A^T ⊗ B^T) vec(Z(k))

If we define z(k) = vec(Z(k)) and M = A ⊗ B + A^T ⊗ B^T, we get

z(k+1) = Mz(k),

exactly the same compact form as introduced in the beginning of this section in (2.3). If we can prove that M is symmetric and nonnegative, then we can apply Theorem 2.2.10 and the result follows. M is, of course, nonnegative, because the adjacency matrices A and B are always nonnegative. To see that M is symmetric, first notice that M can be rewritten using Lemma 2.2.13 as

M = A ⊗ B + (A ⊗ B)^T;   (2.12)

because A ⊗ B is an nm × nm square matrix, we know that M is symmetric, since it is the sum of a square matrix and its transpose. In order to stay consistent with the Euclidean norm appearing in Theorem 2.2.10, we use the Frobenius norm as matrix norm: we know from section 1.3.2 that for vectors the Frobenius norm equals the 2-norm, so ‖Z(k)‖F = ‖vec(Z(k))‖2. We can now apply Theorem 2.2.10 and the result follows.

Definition 2.2.16. Let G, H be two graphs; then the unique matrix

S = Seven(J) = lim_{k→+∞} S(2k)

(with the notations of the previous theorem) is the similarity matrix between G and H.

Notice that it follows from (2.11) that the similarity matrix between H and G is the transpose of the similarity matrix between G and H.


2.2.3 Algorithm 5

Data:
    A: the n × n adjacency matrix of a directed graph G
    B: the m × m adjacency matrix of a directed graph H
    TOL: tolerance for the estimation error
Result:
    S: the similarity matrix between G and H
begin similarity matrix(A, B, TOL)
    k = 0;
    Z(0) = J (the m × n matrix with all entries equal to 1);
    µ = the m × n matrix with all entries equal to TOL;
    repeat
        k = k + 1;
        Z(k) = (BZ(k−1)A^T + B^TZ(k−1)A) / ‖BZ(k−1)A^T + B^TZ(k−1)A‖F;
    until k is even and |Z(k) − Z(k−2)| < µ;
    return Z(k);
end

Algorithm 5: Algorithm for calculating the similarity matrix between G and H

The compact form introduced in the previous section leads directly to approximation Algorithm 5. Note that the algorithm must iterate an even number of times, and that only the even iteration steps are considered when checking whether the tolerance limit is reached, because the results oscillate between the even and odd iterates. This is of course an obvious consequence of the definition of the similarity matrix and Theorem 2.2.10. Remember that the absolute value that is mentioned is the same as in Notation 1.1.8. A Matlab implementation of the algorithm can be found in Listing A.2 in Appendix A.
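A direct transcription of Algorithm 5 into Python/NumPy could look as follows (a sketch in the spirit of the Matlab code in Listing A.2, with our own variable names; the stopping rule compares even iterates only, as the algorithm prescribes). Applied to the graphs of Example 2.2.1, it should reproduce the similarity matrix S shown there up to the chosen tolerance.

import numpy as np

def similarity_matrix(A, B, tol=1e-12, max_iter=10000):
    # Similarity matrix between G (n x n adjacency A) and H (m x m adjacency B):
    # the even-iterate limit S_even(J) of Theorem 2.2.15.
    def step(Z):
        Z = B @ Z @ A.T + B.T @ Z @ A
        return Z / np.linalg.norm(Z)   # np.linalg.norm is the Frobenius norm here
    Z = np.ones((B.shape[0], A.shape[0]))   # Z(0) = J
    for _ in range(max_iter):
        Z2 = step(step(Z))             # advance two steps: compare even iterates only
        if np.max(np.abs(Z2 - Z)) < tol:
            return Z2
        Z = Z2
    return Z

# Example 2.2.1: structure graph v1 -> v2 -> v3 and the graph G on w1, ..., w5.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])
B = np.array([[0, 1, 1, 0, 0],
              [0, 0, 1, 1, 1],
              [0, 0, 1, 1, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0]])
print(np.round(similarity_matrix(A, B), 4))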

Equivalence with the power method

Algorithm 5 is in fact completely equivalent to the power method of section 1.4.2. To see this, recall that we are in fact calculating the expression of Theorem 2.2.10:

seven(s(0)) = lim_{k→∞} s(2k) = Πs(0)/‖Πs(0)‖2.

When we compare this with equation (1.21) from the power method section, we see that in both cases we are in fact calculating the dominant eigenvector of a matrix, with the difference that here Π is constructed in such a way that the iteration oscillates and the even iterates always converge. The attentive reader will note that Π doesn't appear directly in Algorithm 5, but this is, of course, a consequence of the vectorization defined in Definition 2.2.11. With this operation, we are able to work directly with matrices while proving convergence through Theorem 2.2.10. With this in mind, it is clear that Algorithm 5 can be seen as a matrix analogue of the power method.

This brings us to one of the main ideas of this master thesis: when we use the important Theorem 2.2.10 to prove convergence of some algorithm (all the algorithms that follow use this theorem for that purpose), we know for sure that the resulting algorithm will somehow be equivalent to the power method. Indeed, this theorem shows that the algorithm


essentially comes down to finding the largest eigenvalue of some orthogonal projector Π, which might, however, not appear directly in the algorithm thanks to vectorization. Depending on the possible characteristics of Π, this makes it possible to derive the computational cost of the constructed algorithm.

Also note that in this case the equivalence is very clear when looking at the (pseudo)code for both algorithms, but as this master thesis proceeds, the algorithms will get more complicated and the equivalence with the power method will not be as clear.

Computational cost

We already showed in Theorem 2.2.10 that the convergence of the even iterates in the above recurrence relation is linear: after n even iteration steps the error behaves like (µ/ρ)^(2n), where ρ is the spectral radius of M and µ is the spectral radius of the matrix Mµ, the restriction of M to the invariant subspace spanned by all eigenvectors except those of ρ (and of −ρ, when −ρ is an eigenvalue of M).

With the accuracy level TOL, we write:

|µ/ρ|^(2n) ≈ TOL.

So, for example, for TOL = 10⁻⁵ we get n = −5/(2 log(µ/ρ)), and we obtain:

similarity matrix ∈ O( 1 / (log µ − log ρ) ).

Note that this O holds for any estimation error TOL of the form 10⁻ᵉ with e ∈ N. Also note that this is completely analogous to the computational cost of the power method, which comes as no surprise since the algorithms are similar. Remember from Theorem 2.2.4 that we cannot encounter a situation in which µ = ρ, as µ is the spectral radius of the spectrum without ρ and −ρ (if −ρ is an eigenvalue). When Mµ is trivial (meaning that M has only the eigenvalues ρ and possibly −ρ), we conclude from equation (2.8) in Theorem 2.2.10 that in this case |1/ρ|^(2n) ≈ TOL, so:

similarity matrix ∈ O( 1 / (log 1 − log ρ) ).

We conclude that the algorithm always terminates in a finite number of steps.

2.2.4 Special cases

We now consider some special cases of similarity scores between vertices of graphs.

HITS algorithm

Remember that the HITS algorithm assumed that

M = ( 0    B
      B^T  0 )

has a unique dominant eigenvalue to calculate the hub and authority scores of nodes in a graph Gσ with adjacency matrix B. This assumption is a direct result of the power method,


because it is one of the conditions to apply the algorithm. One of our goals was to drop this assumption. We have finally arrived at this result, which also returns a solution when the Perron root has multiplicity larger than 1. To prove this result, we need the singular value decomposition of real matrices. A comprehensive overview of this decomposition can be found in [? ]; we restrict ourselves to the main result of the decomposition, without the intuitive explanations, otherwise we would deviate too much from the main subject.

Theorem 2.2.17. (Singular value decomposition) If A is a real m × n-matrix, then there exist orthogonal matrices

U = [u1, . . . , um] ∈ R^(m×m)   and   V = [v1, . . . , vn] ∈ R^(n×n)

such that

U^T A V = Σ = diag(σ1, . . . , σp) ∈ R^(m×n),

where Σ is a so-called pseudo-diagonal matrix: its first p diagonal elements are the singular values σ1, . . . , σp, with p = min(m, n) and σ1 ≥ · · · ≥ σp, and all other elements of Σ are zero. Equivalently:

A = UΣV^T.

The columns of V are right singular vectors of A, and those of U are left singular vectors.

Proof. Let x and y be unit vectors in Rn and Rm, respectively, and consider the bilinear form

z = y^T Ax.

The set

S = {x, y | x ∈ Rn, y ∈ Rm, ‖x‖2 = ‖y‖2 = 1}

is compact, so the scalar function z(x, y) must achieve a maximum value on S (possibly at more than one point). Let u1, v1 be two unit vectors in Rm and Rn respectively where this maximum is achieved, and let σ1 be the corresponding value of z:

max_{‖x‖2=‖y‖2=1} y^T Ax = u1^T Av1 = σ1.

We want to show that u1 is parallel to the vector Av1: if this were not the case, the inner product u1^T Av1 could be increased by rotating u1 towards the direction of Av1, thereby contradicting the fact that u1^T Av1 is a maximum. Similarly, notice that

u1^T Av1 = v1^T A^T u1

and, repeating the argument above, we see that v1 is parallel to A^T u1. The vectors u1 and v1 can be extended into orthonormal bases for Rm and Rn, respectively. Collect these orthonormal basis vectors into orthogonal matrices U1 and V1. Then:

U1^T A V1 = S1 = ( σ1  0^T
                   0   A1 ).

In fact, the first column of AV1 is Av1 = σ1u1, so the first entry of U1^T AV1 is u1^T σ1u1 = σ1, and the other entries of the first column are uj^T Av1 = 0, because Av1 is parallel to u1 and therefore orthogonal,


by construction, to u2, . . . , um. A similar argument shows that the entries after the first entry of the first row of S1 are zero: the row vector u1^T A is parallel to v1^T and therefore orthogonal to v2, . . . , vn, so that u1^T Av2 = · · · = u1^T Avn = 0. The matrix A1 has one fewer row and column than A. We can repeat the same construction on A1 and write

U2^T A1 V2 = S2 = ( σ2  0^T
                    0   A2 )

so that

( 1  0^T  )           ( 1  0^T  )   ( σ1  0   0^T )
( 0  U2^T ) U1^T AV1  ( 0  V2^T ) = ( 0   σ2  0^T )
                                    ( 0   0   A2  )

This procedure can be repeated until Ak vanishes, to obtain

U^T A V = Σ,

where U and V are orthogonal matrices obtained by multiplying together all the orthogonal matrices used in the procedure, and

Σ = diag(σ1, . . . , σp).

Since the matrices U and V are orthogonal, we can multiply with U and V^T to obtain

A = UΣV^T,

which is the desired result. It remains to show that the elements on the diagonal of Σ are arranged in decreasing order. To see that σ1 ≥ · · · ≥ σp, where p = min(m, n), we can observe that the successive maximization problems that yield σ1, . . . , σp are performed on a sequence of sets each of which contains the next.

Theorem 2.2.18. Let G be a graph with adjacency matrix B. The normalized hub and authority scores of the vertices are given by the normalized dominant eigenvectors of the matrices BB^T and B^TB, provided the corresponding Perron root is of multiplicity 1. Otherwise, they are given by the normalized projection of the vector 1 on the eigenspace of the Perron root.

Proof. The corresponding matrix M is

M = ( 0    B
      B^T  0 ),

so

M² = ( BB^T  0
       0     B^TB ),

and the result follows from Theorem 2.2.10, under the condition that the matrix M² has a dominant root ρ². This can be seen as follows: let V and U be the dominant right and left singular vectors of B, so

BV = ρU,   B^TU = ρV;

then clearly V and U are also dominant eigenvectors of B^TB and BB^T, since

B^TBV = ρ²V,   BB^TU = ρ²U.

The projectors associated with the dominant eigenvalues of BB^T and B^TB are, respectively,

Πu = UU^T   and   Πv = V V^T;

the projector Π of M² is then

Π = diag(Πu, Πv),

and hence the subvectors of Π1 are the vectors Πu1 and Πv1, which can be computed with BB^T and B^TB.

Central scores

As for the hub and authority scores, we can give an explicit expression for the similarity score with vertex 2, which we will call the central score.

Theorem 2.2.19. Let G be a graph with adjacency matrix B. The normalized similarity scores of vertex 2 of the path graph 1 → 2 → 3 with the vertices of graph G are called the central scores and are given by the normalized dominant eigenvector of the matrix B^TB + BB^T, provided the corresponding Perron root is of multiplicity 1. Otherwise, they are given by the normalized projection of the vector 1 on the dominant invariant subspace.

Proof. Let

M = ( 0    B    0
      B^T  0    B
      0    B^T  0 ),

so:

M² = ( BB^T    0             BB
       0       B^TB + BB^T   0
       B^TB^T  0             B^TB ).

Because M is nonnegative and symmetric, the result follows from Theorem 2.2.10 under the condition that the central matrix B^TB + BB^T has a dominant root ρ² of M². We can state this condition otherwise, because M can be permuted to

M = P^T ( 0    E
          E^T  0 ) P,   where E = ( B
                                    B^T ).

Now let V and U be the dominant right and left singular vectors of E:

EV = ρU,   E^TU = ρV;

then clearly V and U are also dominant eigenvectors of E^TE and EE^T, since:

E^TEV = ρ²V,   EE^TU = ρ²U.

Moreover,

PM²P^T = ( EE^T  0
           0     E^TE ),

and let

Πu = UU^T   and   Πv = V V^T

be the projectors associated with the dominant eigenvalues of EE^T and E^TE. The projector Π of M² is then equal to

Π = P^T diag(Πu, Πv) P,

and it follows that the subvectors of Π1 are the vectors Πu1 and Πv1, which can be computed from EE^T or E^TE. Since E^TE = B^TB + BB^T, the central vector Πv1 is the central part of Π1.

Example 2.2.20. In order to illustrate the intuitive meaning of calculating a similarity matrix where the path graph 1 → 2 → 3 is the structure graph, consider the following directed bow tie graph:

[Figure: a directed bow tie graph with m left vertices, a center vertex and n right vertices; every left vertex points to the center, and the center points to every right vertex.]

Label the center vertex first, then the m left vertices and finally the n right vertices, then theadjacency matrix of the bow tie graph becomes:


B = \begin{pmatrix} 0 & 0_{1×m} & 1_{1×n} \\ 1_{m×1} & 0_{m×m} & 0_{m×n} \\ 0_{n×1} & 0_{n×m} & 0_{n×n} \end{pmatrix}

By direct computation, we get that the matrix B^TB + BB^T is equal to:

B^TB + BB^T = \begin{pmatrix} m+n & 0 & 0 \\ 0 & 1_{m×m} & 0 \\ 0 & 0 & 1_{n×n} \end{pmatrix}

where 1_{m×m} and 1_{n×n} denote all-one blocks.

By Theorem 2.2.19, the Perron root of M is equal to ρ = √(m+n): the central matrix B^TB + BB^T has a dominant root ρ^2 of M^2, and the characteristic polynomial of B^TB + BB^T clearly yields m + n as Perron root. The dominant eigenvector of B^TB + BB^T is equal to the similarity score s_2 by Theorem 2.2.19. It is now easy to see that:

s_2 = \begin{pmatrix} 1 \\ 0_m \\ 0_n \end{pmatrix}

If we see vertex 2 of the path graph 1 → 2 → 3 as the center, a vertex through which much information is passed, then it is not surprising that s_2 indicates that vertex 1 of the directed bow tie graph is the only one that looks like a center. The left vertices of the bow tie graph look like vertex 1 of the path graph and the right vertices look like vertex 3. This is a beautiful example because it confirms our intuition.
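The example is also easy to verify numerically; a minimal MATLAB sketch (the values of m and n below are chosen arbitrarily):

% Central scores for the bow tie graph of Example 2.2.20.
% Vertex order: center first, then the m left, then the n right vertices.
m = 3; n = 2;
B = zeros(1+m+n);
B(2:1+m, 1) = 1;             % each left vertex points to the center
B(1, 2+m:1+m+n) = 1;         % the center points to each right vertex
C = B'*B + B*B';             % the central matrix of Theorem 2.2.19
[s2, rho2] = eigs(C, 1);     % dominant eigenpair; rho2 equals m + n
s2 = abs(s2) / norm(s2)      % only the center obtains a nonzero score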

Self-similarity of a graph

When we compute the similarity matrix of two equal graphs G = H, the similarity matrix S is a square matrix whose entries are the similarity scores between vertices of G; we call S the self-similarity matrix of G in this case.

Intuitively, we expect that vertices have the highest similarity scores with themselves, which means that the largest entries are located on the diagonal of S. We prove in the next theorem that the largest entry of a self-similarity matrix always appears on the diagonal and that, except for some special cases, the diagonal elements of a self-similarity matrix are nonzero. This doesn't mean that the diagonal elements are always larger than the other elements on the same row or column, and we conclude this paragraph with some easy examples that show this.

That the similarity matrix of a graph with itself is not always diagonally dominant is a serious limitation on the intuitive concept of similarity between graphs! It shows that the algorithm is sometimes not capable of detecting the essential link between a vertex and itself, which is one of the reasons why the presented algorithm of similarity is still not totally satisfactory.

To prove these statements, we first need to dive a little into the world of symmetric, positive semidefinite matrices, with some definitions and some proofs.


Definition 2.2.21. Let A be an n×n-matrix; the determinant of a k×k principal submatrix (see Definition 1.2.7) is called a principal minor of order k.

Definition 2.2.22. An n×n symmetric matrix A is called positive semidefinite if all eigenvalues of A are nonnegative.

Theorem 2.2.23. For an n×n symmetric matrix A, the following properties are equivalent:

1. A is positive semidefinite,

2. A = UTU for some matrix U ,

3. xTAx ≥ 0 for every x ∈ Rn,

4. All principal minors of A are nonnegative.

Proof. (1) ⇒ (2) Write A = UDU^T by Theorem 2.1.2, where U is orthogonal and D diagonal. We know from that theorem that the entries on the diagonal of D are the eigenvalues of A (which are nonnegative), so we may write D = C^2 where C is also a diagonal matrix. Then we have

A = UCCU^T = UC(UC)^T,

as desired (with (UC)^T playing the role of U in (2)).

(2) ⇒ (3) For every x ∈ R^n we have

x^TAx = x^TU^TUx = (Ux)^T(Ux) ≥ 0.

(3) ⇒ (1) If v is an eigenvector of A with eigenvalue λ, then 0 ≤ v^TAv = λv^Tv, which implies that λ ≥ 0.

(2) ⇒ (4) Let B be a principal submatrix of A, formed by deleting the rows and columns with indices in the set S. Then modify U to form V by deleting the columns with indices in the set S, so that B = V^TV. By the implication (2) ⇒ (3), applied to B, we get x^TBx ≥ 0 for every x, and by (3) ⇒ (1) all eigenvalues of B are nonnegative; hence det(B) ≥ 0.

(4) ⇒ (1) By contraposition, we may assume that A has a unit eigenvector v with eigenvalue λ < 0. If λ is the only eigenvalue ≤ 0, then det(A) < 0 and we are done. Otherwise, choose a unit eigenvector u orthogonal to v with eigenvalue µ ≤ 0. Now choose α ∈ R so that the vector w = v + αu has at least one zero coordinate, say w_i. If A′ is the matrix obtained from A by removing row i and column i, and w′ is obtained by removing coordinate i of w, then we have

(w′)^TA′w′ = w^TAw = λ + α^2µ < 0,

so A′ is not positive semidefinite, since we already demonstrated the equivalence between (3) and (1), and the result follows by induction on n.

Lemma 2.2.24. A sum of two positive semidefinite matrices, scaled by nonnegative coefficients, is again a positive semidefinite matrix.

Proof. Assume that A and B are n×n positive semidefinite matrices and α, β ∈ R^+; then for all x ∈ R^n we have x^TAx ≥ 0 and x^TBx ≥ 0, so:

x^T(αA + βB)x = α(x^TAx) + β(x^TBx) ≥ 0.


Theorem 2.2.25. The self-similarity matrix of a graph G is positive semidefinite. The largest entry of the matrix appears on the diagonal, and if a diagonal entry is equal to zero, all the entries of the corresponding row and column are also equal to zero.

Proof. Since A = B, the compact form of Theorem 2.2.15 becomes:

S^{(k+1)} = (AS^{(k)}A^T + A^TS^{(k)}A) / ‖AS^{(k)}A^T + A^TS^{(k)}A‖_F,   S^{(0)} = J.

First notice that S^{(k+1)} is in this case always symmetric, because s_{ij} = s_{ji} (the similarity score of the vertices v_i and v_j is the same as the similarity score of the vertices v_j and v_i). The all-one matrix J is clearly positive semidefinite as it has only nonnegative principal minors. So S^{(0)} is positive semidefinite and it can be decomposed as S^{(0)} = W^TW. We write for some vector x ∈ R^n:

x^TA^TS^{(0)}Ax = x^TA^TW^TWAx = ‖WAx‖^2 ≥ 0,

similarly, x^TAS^{(0)}A^Tx ≥ 0; thus, by induction on k, all matrices S^{(k)} are positive semidefinite, because the (scaled) sum of two positive semidefinite matrices is also positive semidefinite. Now, the rest of the proof follows from the following observation: let α be a real number. If x = αe_i − e_j then:

x^TS^{(k)}x = α^2 s^{(k)}_{ii} − 2α s^{(k)}_{ij} + s^{(k)}_{jj}.   (2.13)

To prove that the largest entry of the matrix S^{(k)} appears on the diagonal, assume, contrapositively, that the strictly largest entry is s^{(k)}_{ij} = s^{(k)}_{ji} with i ≠ j. Taking α = 1, equation (2.13) becomes:

x^TS^{(k)}x = s^{(k)}_{ii} − 2s^{(k)}_{ij} + s^{(k)}_{jj} = (s^{(k)}_{ii} − s^{(k)}_{ij}) + (s^{(k)}_{jj} − s^{(k)}_{ij}) < 0,

which contradicts the positive semidefiniteness of S^{(k)}. To prove the last statement, suppose s^{(k)}_{ii} = 0 but s^{(k)}_{ij} ≠ 0 for some j; then (2.13) becomes:

x^TS^{(k)}x = −2α s^{(k)}_{ij} + s^{(k)}_{jj},

which is negative for α large enough, again a contradiction.

Example 2.2.26. The self-similarity matrix of the path graph v_1 → v_2 → v_3 is equal to:

\begin{pmatrix} 0.5774 & 0 & 0 \\ 0 & 0.5774 & 0 \\ 0 & 0 & 0.5774 \end{pmatrix}

Example 2.2.27. The self-similarity matrix of the graph:


[Figure: a graph on the vertices v_1, v_2, v_3, v_4.]

is equal to:

\begin{pmatrix} 0.250 & 0.250 & 0.250 & 0.250 \\ 0.250 & 0.250 & 0.250 & 0.250 \\ 0.250 & 0.250 & 0.250 & 0.250 \\ 0.250 & 0.250 & 0.250 & 0.250 \end{pmatrix}

Example 2.2.28. The self-similarity matrix of the graph:

[Figure: a graph on the vertices v_1, v_2, v_3, v_4.]

is equal to:

\begin{pmatrix} 0.4082 & 0 & 0 & 0 \\ 0 & 0.4082 & 0 & 0 \\ 0 & 0 & 0.4082 & 0.4082 \\ 0 & 0 & 0.4082 & 0.4082 \end{pmatrix}
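These examples are easy to reproduce with the iteration from the proof of Theorem 2.2.25; a minimal MATLAB sketch (only the even iterates are guaranteed to converge, so we take two steps per pass):

% Self-similarity matrix of a digraph.
A = [0 1 0; 0 0 1; 0 0 0];   % the path graph v1 -> v2 -> v3 (Example 2.2.26)
S = ones(size(A));           % S(0) = J
for pass = 1:100             % two steps per pass = one even iterate
    for step = 1:2
        S = A*S*A' + A'*S*A;
        S = S / norm(S, 'fro');
    end
end
S                            % approaches 0.5774*eye(3), as in Example 2.2.26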

Undirected graphs

In the case of undirected graphs, the compact form of Theorem 2.2.15 only becomes easier:

Theorem 2.2.29. Let G and H be two undirected graphs with adjacency matrices A = [a_{ij}] ∈ R^{n×n} and B = [b_{ij}] ∈ R^{m×m}, let S^{(0)} > 0 be an initial positive matrix, and define:

S^{(k+1)} = BS^{(k)}A / ‖BS^{(k)}A‖_F,   k = 0, 1, . . . .


Then the matrix subsequences S^{(2k)} and S^{(2k+1)} converge to S_even and S_odd, and among all the matrices in the set of all possible limits:

{S_even(S^{(0)}), S_odd(S^{(0)}) : S^{(0)} > 0}

the matrix S_even(J) is the unique matrix of largest 1-norm.

Proof. This follows easily from the fact that in the case of undirected graphs, A and B are symmetric:

S^{(k+1)} = (BS^{(k)}A^T + B^TS^{(k)}A) / ‖BS^{(k)}A^T + B^TS^{(k)}A‖_F
          = (BS^{(k)}A + BS^{(k)}A) / ‖BS^{(k)}A + BS^{(k)}A‖_F
          = 2BS^{(k)}A / ‖2BS^{(k)}A‖_F
          = BS^{(k)}A / ‖BS^{(k)}A‖_F.
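A minimal MATLAB sketch of this simplified iteration (the two example graphs are ours):

% Similarity between two undirected graphs via Theorem 2.2.29.
A = [0 1 0; 1 0 1; 0 1 0];                  % path on 3 vertices
B = [0 1 1 1; 1 0 0 0; 1 0 0 0; 1 0 0 0];   % star on 4 vertices
S = ones(size(B,1), size(A,1));             % S(0) = J
for pass = 1:100                            % keep an even number of steps
    S = B*S*A;  S = S / norm(S, 'fro');
    S = B*S*A;  S = S / norm(S, 'fro');
end
S                                           % = Seven(J)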


2.3 Node-edge similarity

The paper of Blondel et al. [? ] caused a flow of successive papers which build upon the concept of similarity between two graphs. For instance, the paper 'Graph similarity scoring and matching' of Laura Zager and George Verghese [? ], presented in 2006, expands the notion of node similarity presented in the previous section to similarity between the edges of two graphs. Intuitively, an edge of a graph G is similar to an edge of graph H if their source and terminal nodes (Definition 1.5.6) are similar. As a consequence, the notion of similarity between edges introduces a coupling between edge and node similarity scores. The algorithm presented in this paper is therefore an extension of the algorithm presented in the previous section.

2.3.1 Coupled node-edge similarity scores

We now present the extended algorithm, allowing us to calculate not only node similarity scores, but also edge similarity scores. The algorithm uses a new sort of matrices to represent a graph.

Definition 2.3.1. Let G = (V,→) be a graph with adjacency matrix A, numbered vertices v_1, v_2, . . . , v_n ∈ V and numbered edges e_1, e_2, . . . , e_m ∈ →. We define the source-edge matrix A_S as an n×m matrix with entries:

(A_S)_{ij} = 1 if s_G(e_j) = v_i, and 0 otherwise;

the notation A_S is derived from the adjacency matrix A.

Definition 2.3.2. Let G = (V,→) be a graph with adjacency matrix A, numbered vertices v_1, v_2, . . . , v_n ∈ V and numbered edges e_1, e_2, . . . , e_m ∈ →. We define the terminus-edge matrix A_T as an n×m matrix with entries:

(A_T)_{ij} = 1 if t_G(e_j) = v_i, and 0 otherwise;

the notation A_T is derived from the adjacency matrix A.

Property 2.3.3. Let G = (V,→) be a graph; then A_SA_S^T is a diagonal matrix whose i-th diagonal entry is equal to the outdegree of vertex v_i.

Proof. By direct computation, we get:

(A_SA_S^T)_{ij} = Σ_{k=1}^{m} (A_S)_{ik}(A_S^T)_{kj} = Σ_{k=1}^{m} (A_S)_{ik}(A_S)_{jk}.

Assume i ≠ j. Then for each k, (A_S)_{ik}(A_S)_{jk} = 0, since the vertices v_i and v_j cannot both be the source node of edge e_k.

When i = j, then for each k, (A_S)_{ik}(A_S)_{jk} = ((A_S)_{ik})^2 equals 0 or 1, depending on whether v_i is the source of e_k or not. So 1 is added to (A_SA_S^T)_{ii} each time an edge 'departs' from v_i; this is exactly the outdegree of vertex v_i.


In an analogous way, we prove:

Property 2.3.4. Let G = (V,→) be a graph; then A_TA_T^T is a diagonal matrix whose i-th diagonal entry is equal to the indegree of vertex v_i.

Property 2.3.5. Let G = (V,→) be a graph; then the adjacency matrix A is equal to A_SA_T^T.

Proof. By direct computation, we get:

(A_SA_T^T)_{ij} = Σ_{k=1}^{m} (A_S)_{ik}(A_T^T)_{kj} = Σ_{k=1}^{m} (A_S)_{ik}(A_T)_{jk}.

The term (A_S)_{ik}(A_T)_{jk} equals 1 if edge e_k goes from v_i to v_j, and since the sum is taken over all edges we conclude:

(A_SA_T^T)_{ij} = A_{ij}.
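These three properties are easy to check numerically; a minimal MATLAB sketch (the small digraph and its edge numbering are ours):

% Sanity check of Properties 2.3.3-2.3.5.
% Edges: e1: v1 -> v2, e2: v1 -> v3, e3: v2 -> v3.
AS = [1 1 0;                % (AS)ij = 1 iff vi is the source of ej
      0 0 1;
      0 0 0];
AT = [0 0 0;                % (AT)ij = 1 iff vi is the terminus of ej
      1 0 0;
      0 1 1];
outdeg = diag(AS*AS')       % Property 2.3.3: outdegrees [2; 1; 0]
indeg  = diag(AT*AT')       % Property 2.3.4: indegrees [0; 1; 2]
A      = AS*AT'             % Property 2.3.5: the adjacency matrix of the graph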

Let G = (V,→) and H = (U,→′) be two (directed) graphs; G has n_G vertices and m_G edges, and H has n_H vertices and m_H edges. Remember the following updating equation from (2.5), which returns the (node) similarity score between the vertices u_i from H and v_j from G:

x^{(k+1)}_{ij} = Σ_{r:(u_r,u_i)∈→′, w:(v_w,v_j)∈→} x^{(k)}_{rw} + Σ_{r:(u_i,u_r)∈→′, w:(v_j,v_w)∈→} x^{(k)}_{rw},

if we number the edges of G and H as e_1, e_2, . . . , e_{m_G} ∈ → and e′_1, e′_2, . . . , e′_{m_H} ∈ →′, this can be rewritten as:

x^{(k+1)}_{ij} = Σ_{t_H(e′_p)=u_i, t_G(e_q)=v_j} x^{(k)}_{s_H(e′_p)s_G(e_q)} + Σ_{s_H(e′_p)=u_i, s_G(e_q)=v_j} x^{(k)}_{t_H(e′_p)t_G(e_q)}.

We now extend this mutually reinforcing relation with the notion of edge similarity. x_{ij} denotes again the node similarity score between vertex u_i from H and vertex v_j from G, and y_{pq} denotes the edge similarity score between edge p from H and edge q from G. The update equations for edge and node similarity scores now take the following form:

y^{(k+1)}_{pq} = x^{(k)}_{s_H(e′_p)s_G(e_q)} + x^{(k)}_{t_H(e′_p)t_G(e_q)}   (2.14)

x^{(k+1)}_{ij} = Σ_{t_H(e′_r)=u_i, t_G(e_w)=v_j} y^{(k)}_{rw} + Σ_{s_H(e′_r)=u_i, s_G(e_w)=v_j} y^{(k)}_{rw}   (2.15)

With the same reasoning as presented in the previous section, these scores can be assembled into matrices Y^{(k)} and X^{(k)} by using the source-edge matrices A_S, B_S and the terminus-edge matrices A_T, B_T. Let A be the adjacency matrix of G and B be the adjacency matrix of H; let X^{(k)} be the n_H × n_G matrix with entries x_{ij}, the node similarity scores at iteration step k, and let Y^{(k)} be the m_H × m_G matrix with entries y_{pq}, the edge similarity scores at step k. The equations (2.14) and (2.15) can be rewritten as:

Y′^{(k+1)} = B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T   (2.16)
X′^{(k+1)} = B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T   (2.17)


for k = 0, 1, . . .

Of course, we want to customize these equations in a way that allows us to apply Theorem 2.2.10 to prove convergence. This will be completely analogous to Theorem 2.2.15, but we can achieve a slightly better result: not only the even and odd iterates will converge, the iteration converges as a whole. Lastly, remember that one of the conditions of Theorem 2.2.10 is to normalize the results at each iteration step. Therefore, the following theorem comes as no surprise:

Theorem 2.3.6. Let G and H be two graphs with adjacency matrices A and B, where G has n_G vertices and m_G edges and H has n_H vertices and m_H edges, and define:

Y^{(k+1)} = (B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T) / ‖B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T‖_F   (2.18)

X^{(k+1)} = (B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T) / ‖B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T‖_F   (2.19)

for k = 0, 1, . . .. Then the matrix subsequences X^{(2k)}, Y^{(2k)} and X^{(2k+1)}, Y^{(2k+1)} converge to X_even, Y_even and X_odd, Y_odd. If we take:

X^{(0)} = J ∈ R^{n_H×n_G}
Y^{(0)} = J ∈ R^{m_H×m_G}

as initial matrices, then X_even(J) = X_odd(J) and Y_even(J) = Y_odd(J) are the unique matrices of largest 1-norm among all possible limits with positive initial matrices, and the matrix sequence X^{(k)}, Y^{(k)} converges as a whole.

Proof. By Theorem 2.2.14 we can rewrite (2.16) as follows:

Y′^{(k+1)} = B_S^TX′^{(k)}A_S + B_T^TX′^{(k)}A_T
⇔ vec(Y′^{(k+1)}) = vec(B_S^TX′^{(k)}A_S + B_T^TX′^{(k)}A_T)
⇔ vec(Y′^{(k+1)}) = vec(B_S^TX′^{(k)}A_S) + vec(B_T^TX′^{(k)}A_T)
⇔ vec(Y′^{(k+1)}) = (A_S^T ⊗ B_S^T) vec(X′^{(k)}) + (A_T^T ⊗ B_T^T) vec(X′^{(k)})
⇔ vec(Y′^{(k+1)}) = (A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T) vec(X′^{(k)})

Completely analogously, we can also rewrite (2.17):

vec(X′^{(k+1)}) = (A_S ⊗ B_S + A_T ⊗ B_T) vec(Y′^{(k)}).

Define y^{(k)} = vec(Y′^{(k)}) and x^{(k)} = vec(X′^{(k)}); we get:

y^{(k+1)} = (A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T) x^{(k)}
x^{(k+1)} = (A_S ⊗ B_S + A_T ⊗ B_T) y^{(k)}

If we define G = A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T, then with Lemma 2.2.13 and well-known properties of transpose matrices:

G^T = (A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T)^T
    = ((A_S ⊗ B_S)^T + (A_T ⊗ B_T)^T)^T
    = ((A_S ⊗ B_S)^T)^T + ((A_T ⊗ B_T)^T)^T
    = A_S ⊗ B_S + A_T ⊗ B_T


So we get:

y^{(k+1)} = Gx^{(k)}   (2.20)
x^{(k+1)} = G^Ty^{(k)}   (2.21)

G is an m_Gm_H × n_Gn_H matrix, and the previous expressions can be concatenated to a single matrix update equation (we define the matrix M and z^{(k+1)}):

z^{(k+1)} = \begin{pmatrix} x \\ y \end{pmatrix}^{(k+1)} = \begin{pmatrix} 0_{n_Gn_H×n_Gn_H} & G^T \\ G & 0_{m_Gm_H×m_Gm_H} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}^{(k)} = Mz^{(k)}.

M is clearly nonnegative, because G and G^T consist of sums of Kronecker products of A_S, B_S, A_T and/or B_T, all matrices with entries equal to zero or one. M is, as an (n_Gn_H + m_Gm_H) × (n_Gn_H + m_Gm_H) matrix, clearly symmetric, so the result follows immediately from Theorem 2.2.10. The appearance of the Perron norm can be explained in the same way as in Theorem 2.2.15; note that we write Y^{(k)} and X^{(k)} for the normalized results at iteration step k. We still have to prove that odd and even iterates are the same and that the relevant matrices have only nonnegative eigenvalues, meaning that the sequence converges to Πz^{(0)}/‖Πz^{(0)}‖_2. This can be done easily by expanding the matrix equation:

by expanding the matrix equation:

x^{(k)} = (G^TG)^{k/2} x^{(0)}  (k even),   x^{(k)} = (G^TG)^{(k−1)/2} G^T y^{(0)}  (k odd),
y^{(k)} = (GG^T)^{k/2} y^{(0)}  (k even),   y^{(k)} = (GG^T)^{(k−1)/2} G x^{(0)}  (k odd).

The matrix G is the sum of the two matrices (A_S^T ⊗ B_S^T) and (A_T^T ⊗ B_T^T). Now, A_S^T is an m_G × n_G matrix and has in each row a single '1' entry, simply because an edge has exactly one source node. The same holds for B_S^T. Therefore, taking the Kronecker product of those two matrices results in the matrix A_S^T ⊗ B_S^T, which also has just a single '1' entry in each row. With the same reasoning, we conclude that A_T^T ⊗ B_T^T also has just a single '1' entry in each row. Taking the sum of (A_S^T ⊗ B_S^T) and (A_T^T ⊗ B_T^T) thus results in the matrix G with row sums equal to two. If we now choose the initial conditions as x^{(0)} = 1 ∈ R^{n_Hn_G} and y^{(0)} = 1 ∈ R^{m_Hm_G}, we conclude:

Gx^{(0)} = 2y^{(0)},

we get:

x^{(k)} = (G^TG)^{(k−2)/2} G^TGx^{(0)} = 2(G^TG)^{(k−2)/2} G^Ty^{(0)}   (k even),
x^{(k)} = (G^TG)^{(k−1)/2} G^Ty^{(0)}   (k odd),

y^{(k)} = (GG^T)^{k/2} y^{(0)}   (k even),
y^{(k)} = (GG^T)^{(k−1)/2} Gx^{(0)} = 2(GG^T)^{(k−1)/2} y^{(0)}   (k odd).

First, observe that the matrices GG^T and G^TG, besides being symmetric (for example, (GG^T)^T = GG^T) and nonnegative, are also positive semidefinite and thus have only nonnegative eigenvalues. Note that the scalar factor 2 that distinguishes the even from the odd iterates is eliminated by the normalization in each step. So in the limit as k → ∞, the even and odd iterates are the same.

We now define X_even(1) as the node similarity matrix and Y_even(1) as the edge similarity matrix.


2.3.2 Algorithm 6 and Algorithm 7

Data:
A_S: the n_G × m_G source-edge matrix of a graph G
A_T: the n_G × m_G terminal-edge matrix of a graph G
B_S: the n_H × m_H source-edge matrix of a graph H
B_T: the n_H × m_H terminal-edge matrix of a graph H
TOL: tolerance for the estimation error.
Result:
X: the node similarity matrix between G and H
Y: the edge similarity matrix between G and H
begin node_edge_similarity_matrix(A_S, A_T, B_S, B_T, TOL)
    k = 1;
    X^(0) = 1 (n_H × n_G matrix with all entries equal to 1);
    Y^(0) = 1 (m_H × m_G matrix with all entries equal to 1);
    µ_X = n_H × n_G matrix with all entries equal to TOL;
    µ_Y = m_H × m_G matrix with all entries equal to TOL;
    repeat
        Y^(k) = (B_S^T X^(k−1) A_S + B_T^T X^(k−1) A_T) / ‖B_S^T X^(k−1) A_S + B_T^T X^(k−1) A_T‖_F;
        X^(k) = (B_S Y^(k) A_S^T + B_T Y^(k) A_T^T) / ‖B_S Y^(k) A_S^T + B_T Y^(k) A_T^T‖_F;
        k = k + 1;
    until |X^(k) − X^(k−1)| < µ_X and |Y^(k) − Y^(k−1)| < µ_Y;
    return X^(k), Y^(k);
end
Algorithm 6: Algorithm for calculating the node and edge similarity matrices X and Y between G and H.

The algorithm to calculate the node and edge similarity scores is presented in pseudocode in Algorithm 6. Remember that the absolute value that is mentioned is the same as in Notation 1.1.8. Besides the different calculations, the main difference with the algorithm of the previous section (Algorithm 5) is that the whole sequence converges, not only the even (or odd) iterates, making this algorithm twice as fast. This is because in Algorithm 5 we take the limit of the even iterates, but we need the calculations of the odd iterates too to achieve a result. In this implementation, the algorithm will not oscillate and both the even and odd iterates converge to the same limit, so the tolerance level will be satisfied with half the number of steps we would need for Algorithm 5. Also notice that we implement a sequential update: when Y^{(k)} is calculated, the result is immediately used in X^{(k)}. This is not according to Theorem 2.3.6, which suggests simultaneous updating equations: in that case X^{(k)} uses Y^{(k−1)} in its calculations. It is not difficult to see, by an argument analogous to Theorem 2.3.6, that both approaches yield the same set of similarity scores. Numerical analysts, however, will always prefer the sequential update procedure because it performs slightly better (see section 3.2 in [? ]), as you directly use a more accurate result for Y^{(k)} in the calculation of X^{(k)}.
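This sequential update fits in a few lines of MATLAB; the sketch below is a minimal version (the function name and interface are ours):

% Sequential node-edge similarity iteration (Algorithm 6).
function [X, Y] = node_edge_similarity(AS, AT, BS, BT, TOL)
    X = ones(size(BS,1), size(AS,1));   % X(0) = J, an nH x nG matrix
    Y = ones(size(BS,2), size(AS,2));   % Y(0) = J, an mH x mG matrix
    while true
        Yn = BS'*X*AS + BT'*X*AT;   Yn = Yn / norm(Yn, 'fro');
        Xn = BS*Yn*AS' + BT*Yn*AT'; Xn = Xn / norm(Xn, 'fro');
        done = max(abs(Xn(:)-X(:))) < TOL && max(abs(Yn(:)-Y(:))) < TOL;
        X = Xn; Y = Yn;
        if done, break; end
    end
end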

A full Matlab implementation can be found in Listing A.3 in Appendix A. Because giving source-edge matrices and terminal-edge matrices as input is quite unnatural, in Listing A.4


and Listing A.5 some Matlab code is also presented to transform an adjacency matrix into a source-edge matrix and a terminal-edge matrix. The resulting matrices represent an edge numbering left-to-right, based on the entries of the adjacency matrix. The algorithm to calculate the source-edge matrix based on the adjacency matrix is also written in pseudocode in Algorithm 7; the calculation of the terminal-edge matrix is completely analogous. A Matlab implementation of Algorithm 6 that takes two adjacency matrices as input can be found in Listing A.6.

Data:
A: the n × n adjacency matrix of a graph G
Result:
A_S: the source-edge matrix of graph G
begin source_edge_matrix(A)
    m = sum of all entries of A;
    A_S = an n × m matrix with all entries equal to 0;
    current_edge = 1;
    for i : 1 to n do
        for j : 1 to n do
            if (A)_{ij} > 0 then
                for e : 1 to (A)_{ij} do
                    (A_S)_{i, current_edge} = 1;
                    current_edge = current_edge + 1;
                end
            end
        end
    end
    return A_S;
end
Algorithm 7: Algorithm for calculating the source-edge matrix A_S based on the adjacency matrix A of a graph G.

2.3.3 Difference with node similarity

It is clear that the node-edge similarity algorithm is different from Algorithm 5 from the previous section. We will make this a little more explicit and show the difference in the resulting node similarity matrix. We first need another property of the Kronecker product.

Property 2.3.7. (mixed-product property of Kronecker products) Let A ∈ R^{m×n}, B ∈ R^{r×s}, C ∈ R^{n×p} and D ∈ R^{s×t}; then:

(A ⊗ B)(C ⊗ D) = AC ⊗ BD


Proof.

(A ⊗ B)(C ⊗ D) = \begin{pmatrix} a_{11}B & ⋯ & a_{1n}B \\ ⋮ & ⋱ & ⋮ \\ a_{m1}B & ⋯ & a_{mn}B \end{pmatrix} \begin{pmatrix} c_{11}D & ⋯ & c_{1p}D \\ ⋮ & ⋱ & ⋮ \\ c_{n1}D & ⋯ & c_{np}D \end{pmatrix}

= \begin{pmatrix} Σ_{k=1}^{n} a_{1k}c_{k1}BD & ⋯ & Σ_{k=1}^{n} a_{1k}c_{kp}BD \\ ⋮ & ⋱ & ⋮ \\ Σ_{k=1}^{n} a_{mk}c_{k1}BD & ⋯ & Σ_{k=1}^{n} a_{mk}c_{kp}BD \end{pmatrix}

= AC ⊗ BD.
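The property is also easy to check numerically in MATLAB (the sizes below are chosen arbitrarily):

% Numeric check of the mixed-product property.
A = rand(2,3); B = rand(4,2); C = rand(3,5); D = rand(2,3);
norm(kron(A,B)*kron(C,D) - kron(A*C, B*D))   % of the order of 1e-15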

Now take equations (2.20) and (2.21) and consider only the even iterates. We consider only the even iterates because Algorithm 5 does; remember that in Algorithm 6 the even and odd iterates yield the same set of similarity scores. We get:

x^{(k)} = (G^TG)x^{(k−2)}
= ((A_S ⊗ B_S + A_T ⊗ B_T)(A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T))x^{(k−2)}
= ((A_S ⊗ B_S)(A_S^T ⊗ B_S^T) + (A_S ⊗ B_S)(A_T^T ⊗ B_T^T) + (A_T ⊗ B_T)(A_S^T ⊗ B_S^T) + (A_T ⊗ B_T)(A_T^T ⊗ B_T^T))x^{(k−2)}
= (A_SA_S^T ⊗ B_SB_S^T + A_SA_T^T ⊗ B_SB_T^T + A_TA_S^T ⊗ B_TB_S^T + A_TA_T^T ⊗ B_TB_T^T)x^{(k−2)}
= (A_SA_S^T ⊗ B_SB_S^T + A ⊗ B + A^T ⊗ B^T + A_TA_T^T ⊗ B_TB_T^T)x^{(k−2)}
= (A ⊗ B + A^T ⊗ B^T + A_SA_S^T ⊗ B_SB_S^T + A_TA_T^T ⊗ B_TB_T^T)x^{(k−2)},

where A, B are the adjacency matrices of G, H (Property 2.3.5), A_SA_S^T, B_SB_S^T are the diagonal matrices with the outdegrees of the vertices on the diagonal (Property 2.3.3) and A_TA_T^T, B_TB_T^T are the diagonal matrices with the indegrees of the vertices on the diagonal (Property 2.3.4). The iteration suggested in Theorem 2.2.15 in the previous section has the following form (see equation (2.12)):

x^{(k)} = (A ⊗ B + A^T ⊗ B^T)x^{(k−2)},

thus Algorithm 6 differs from Algorithm 5 in three important ways:

• In Algorithm 6, the inclusion of additional diagonal terms serves to amplify, in the node similarity matrix, the scores of nodes that are highly connected,

• Algorithm 6 returns a node similarity matrix and an edge similarity matrix; Algorithm 5 returns only a node similarity matrix,

• The whole sequence in Algorithm 6 converges, not only the even and odd iterates.

2.3.4 Example

Example 2.3.8. We retake Example 2.2.2 and number the edges as follows. Let G = (V,→) be:


[Figure: the path graph v_1 → v_2 → v_3, with edge 1 from v_1 to v_2 and edge 2 from v_2 to v_3.]

Let H = (W,→) be the following graph:

[Figure: a directed graph on the vertices w_1, . . . , w_5 with seven numbered edges.]

Then the node similarity matrix is:

X = \begin{pmatrix} 0.2338 & 0.0718 & 0 \\ 0.2472 & 0.3230 & 0.0128 \\ 0.1841 & 0.7553 & 0.3185 \\ 0 & 0.0935 & 0.2338 \\ 0 & 0.0441 & 0.0576 \end{pmatrix},

and the edge similarity matrix equals:

Y = \begin{pmatrix} 0.2166 & 0.0329 \\ 0.3847 & 0.1518 \\ 0.3899 & 0.2495 \\ 0.1325 & 0.2166 \\ 0.1133 & 0.1480 \\ 0.3653 & 0.4176 \\ 0.1080 & 0.3847 \end{pmatrix}

If you look at the structure of H and compare it with the structure of G, then intuitively it is not surprising that edge 3 of H is most similar to edge 1 of G and edge 6 of H is most similar to edge 2 of G: they are all very central edges with source nodes and terminal nodes that are very similar (they have high similarity scores). Also note that w_3 is considered the most 'central' node (see subsection 2.2.4), as it has the largest similarity score with v_2. This can be explained by the fact that w_3 has 2 incoming edges and 2 outgoing edges, and no other vertex in H has this.

2.4 Colored graphs

In this last section, we extend the notion of similarity to colored graphs. The paper [? ] was written in 2009 by two Belgian professors, Paul Van Dooren and Catherine Fraikin, of the Catholic University of Louvain.


Graph coloring was already introduced in paragraph 1.5.1, and we will construct a method that returns similarity matrices for two graphs where the coloring is on the nodes or on the edges of both graphs. If you compare the original paper to the explanation in this section, you will see many differences. This has two reasons: first, the previous sections already give a broad overview of similarity on graphs, so we can present the results in more detail; second, the paper had to be rewritten to make the notation uniform throughout this thesis.

2.4.1 Colored nodes

Method

We first extend the node similarity introduced in section 2.2 to node colored graphs. Take two node colored graphs G = (V,→, C, a) and H = (U,→′, C, a′) with |C| different colors (remember that a, a′ are surjective) and assume that the nodes in both graphs are renumbered such that those of color 1 come first, then those of color 2, . . . The adjacency matrices A (of graph G) and B (of graph H) can then be partitioned as follows:

A = \begin{pmatrix} A_{11} & A_{12} & ⋯ & A_{1|C|} \\ A_{21} & A_{22} & ⋯ & A_{2|C|} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ A_{|C|1} & A_{|C|2} & ⋯ & A_{|C||C|} \end{pmatrix} and B = \begin{pmatrix} B_{11} & B_{12} & ⋯ & B_{1|C|} \\ B_{21} & B_{22} & ⋯ & B_{2|C|} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ B_{|C|1} & B_{|C|2} & ⋯ & B_{|C||C|} \end{pmatrix}

Remember that c_G(V, i) denotes the number of vertices of color i in graph G; then each block A_{ij} ∈ R^{c_G(V,i)×c_G(V,j)} and B_{ij} ∈ R^{c_H(U,i)×c_H(U,j)} describes the adjacency relations from the nodes of color i to the vertices of color j in A and B, respectively. In fact, by just renumbering the vertices such that the nodes with the same color succeed each other, you immediately get such a partitioning. If you see this renumbering as a permutation on the nodes, then performing on the original adjacency matrix a left and right multiplication with the corresponding permutation matrix (see Definition 1.1.1) leads to this adjusted adjacency matrix. To make the idea more comprehensible, we give an example.

Example 2.4.1. Let G be the following graph:

[Figure: a node colored graph on the vertices v_1, v_2, v_3, v_4.]

The adjacency matrix equals:

A = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix}.

If we order the colors as {green, blue, yellow} (so color 1 = green, color 2 = blue, color 3 = yellow), we can renumber the vertices and get the following graph:


[Figure: the same graph with the vertices renumbered by color.]

The adjacency matrix equals:

A′ = PAP = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 2 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},

which can be partitioned in blocks as follows (we have 3 colors: 1 = green, 2 = blue, 3 = yellow):

A′ = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} = \begin{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} & \begin{pmatrix} 1 \\ 1 \end{pmatrix} & \begin{pmatrix} 0 \\ 2 \end{pmatrix} \\ \begin{pmatrix} 0 & 2 \end{pmatrix} & (0) & (0) \\ \begin{pmatrix} 0 & 0 \end{pmatrix} & (0) & (1) \end{pmatrix}.

Now, we will only compare the nodes of the same color in both graphs, so we define similarity matrices S_{ii} ∈ R^{c_H(U,i)×c_G(V,i)}, with i = 1, . . . , |C|, which we can put in a block-diagonal similarity matrix:

S = \begin{pmatrix} S_{11} & 0 & ⋯ & 0 \\ 0 & S_{22} & ⋯ & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & ⋯ & S_{|C||C|} \end{pmatrix}

Theorem 2.4.2. Let G = (V,→, C, a) and H = (U,→′, C, a′) be two node colored graphs and define:

Z^{(k+1)}_1 = Σ_{i∈{1,...,|C|}} B_{1i}S^{(k)}_{ii}A_{1i}^T + B_{i1}^TS^{(k)}_{ii}A_{i1}

Z^{(k+1)}_2 = Σ_{i∈{1,...,|C|}} B_{2i}S^{(k)}_{ii}A_{2i}^T + B_{i2}^TS^{(k)}_{ii}A_{i2}

⋮

Z^{(k+1)}_{|C|} = Σ_{i∈{1,...,|C|}} B_{|C|i}S^{(k)}_{ii}A_{|C|i}^T + B_{i|C|}^TS^{(k)}_{ii}A_{i|C|}

(S_{11}, S_{22}, . . . , S_{|C||C|})^{(k+1)} = (Z^{(k+1)}_1, Z^{(k+1)}_2, . . . , Z^{(k+1)}_{|C|}) / ‖(Z^{(k+1)}_1, Z^{(k+1)}_2, . . . , Z^{(k+1)}_{|C|})‖_F

for k = 0, 1, . . . Then the matrix subsequences Z^{(2k)}_j for every j ∈ {1, . . . , |C|} and (S_{11}, S_{22}, . . . , S_{|C||C|})^{(2k)} converge to Z^{even}_j and (S_{11}, S_{22}, . . . , S_{|C||C|})^{even}. Also the odd subsequences converge. If we take:

S^{(0)}_{jj} = J ∈ R^{c_H(U,j)×c_G(V,j)}


as initial matrices, then the resulting S^{even}_{jj}(1) are the unique matrices of largest 1-norm among all possible limits with positive initial matrices.

Proof. By induction on |C|. Remember from Definition 1.5.18 that the function a is surjective, meaning that C only consists of colors that are actually used.

For |C| = 1, we have a graph with all vertices having the same color, which can be seen as an uncolored graph. This is just the normal case, as proved in Theorem 2.2.15.

Although redundant, it is instructive to prove the case |C| = 2 separately, because the generalization in the induction step is then immediately clear. So consider the partitioned adjacency matrices:

\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} and \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix},

the equations of the theorem are in this case:

Z′^{(k+1)}_1 = B_{11}S^{(k)}_{11}A_{11}^T + B_{11}^TS^{(k)}_{11}A_{11} + B_{12}S^{(k)}_{22}A_{12}^T + B_{21}^TS^{(k)}_{22}A_{21}

Z′^{(k+1)}_2 = B_{21}S^{(k)}_{11}A_{21}^T + B_{12}^TS^{(k)}_{11}A_{12} + B_{22}S^{(k)}_{22}A_{22}^T + B_{22}^TS^{(k)}_{22}A_{22}

(S_{11}, S_{22})^{(k+1)} = (Z′^{(k+1)}_1, Z′^{(k+1)}_2) / ‖(Z′^{(k+1)}_1, Z′^{(k+1)}_2)‖_F

We can write with Theorem 2.2.14:

Z′^{(k+1)}_1 = B_{11}Z′^{(k)}_1A_{11}^T + B_{11}^TZ′^{(k)}_1A_{11} + B_{12}Z′^{(k)}_2A_{12}^T + B_{21}^TZ′^{(k)}_2A_{21}
⇔ vec(Z′^{(k+1)}_1) = vec(B_{11}Z′^{(k)}_1A_{11}^T + B_{11}^TZ′^{(k)}_1A_{11} + B_{12}Z′^{(k)}_2A_{12}^T + B_{21}^TZ′^{(k)}_2A_{21})
⇔ vec(Z′^{(k+1)}_1) = (A_{11} ⊗ B_{11}) vec(Z′^{(k)}_1) + (A_{11}^T ⊗ B_{11}^T) vec(Z′^{(k)}_1) + (A_{12} ⊗ B_{12}) vec(Z′^{(k)}_2) + (A_{21}^T ⊗ B_{21}^T) vec(Z′^{(k)}_2)
⇔ vec(Z′^{(k+1)}_1) = (A_{11} ⊗ B_{11} + A_{11}^T ⊗ B_{11}^T) vec(Z′^{(k)}_1) + (A_{12} ⊗ B_{12} + A_{21}^T ⊗ B_{21}^T) vec(Z′^{(k)}_2)

In a similar way, we can write Z′^{(k+1)}_2, which is the non-normalized version of Z^{(k+1)}_2, as:

vec(Z′^{(k+1)}_2) = (A_{21} ⊗ B_{21} + A_{12}^T ⊗ B_{12}^T) vec(Z′^{(k)}_1) + (A_{22} ⊗ B_{22} + A_{22}^T ⊗ B_{22}^T) vec(Z′^{(k)}_2)

If we define M and z^{(k+1)} as follows, the previous expressions concatenate to a single matrix update equation:

z^{(k+1)} = \begin{pmatrix} vec(Z′_1) \\ vec(Z′_2) \end{pmatrix}^{(k+1)} = \begin{pmatrix} A_{11}⊗B_{11} + A_{11}^T⊗B_{11}^T & A_{12}⊗B_{12} + A_{21}^T⊗B_{21}^T \\ A_{21}⊗B_{21} + A_{12}^T⊗B_{12}^T & A_{22}⊗B_{22} + A_{22}^T⊗B_{22}^T \end{pmatrix} \begin{pmatrix} vec(Z′_1) \\ vec(Z′_2) \end{pmatrix}^{(k)} = Mz^{(k)}

Notice that the diagonal blocks in M are related to links between nodes of the same color, while the off-diagonal blocks refer to links between nodes of different colors. As always,


we want to use Theorem 2.2.10 to get the result. M is clearly nonnegative, because every block A_{ij} and B_{ij} is nonnegative (each is derived from the nonnegative adjacency matrices A and B). Proving that M is symmetric is a bit trickier, but notice that A_{11}⊗B_{11} + (A_{11}⊗B_{11})^T is a symmetric c_G(1)c_H(1) × c_G(1)c_H(1) matrix (because it is the sum of a matrix with its transpose), and A_{22}⊗B_{22} + (A_{22}⊗B_{22})^T is a symmetric c_G(2)c_H(2) × c_G(2)c_H(2) matrix. If we define G = A_{21}⊗B_{21} + A_{12}^T⊗B_{12}^T, then G is a c_G(2)c_H(2) × c_G(1)c_H(1) matrix. Now, notice the following relation between the off-diagonal blocks:

G^T = (A_{21} ⊗ B_{21} + A_{12}^T ⊗ B_{12}^T)^T
    = (A_{21} ⊗ B_{21})^T + ((A_{12} ⊗ B_{12})^T)^T
    = A_{12} ⊗ B_{12} + A_{21}^T ⊗ B_{21}^T

G^T is a c_G(1)c_H(1) × c_G(2)c_H(2) matrix. So we can rewrite M as:

M = \begin{pmatrix} A_{11}⊗B_{11} + A_{11}^T⊗B_{11}^T & G^T \\ G & A_{22}⊗B_{22} + A_{22}^T⊗B_{22}^T \end{pmatrix}

M is a (c_G(1)c_H(1) + c_G(2)c_H(2)) × (c_G(1)c_H(1) + c_G(2)c_H(2)) matrix; we want to prove that (M)_{ij} = (M)_{ji}, and to do so we distinguish all possible cases:

• If i ≤ c_G(1)c_H(1) and j ≤ c_G(1)c_H(1), then (M)_{ij} and (M)_{ji} are both entries of A_{11}⊗B_{11} + A_{11}^T⊗B_{11}^T, and this submatrix is symmetric.

• If i ≤ c_G(1)c_H(1) and j > c_G(1)c_H(1), then (M)_{ij} is an entry of G^T and (M)_{ji} is an entry of G, so they are equal.

• If i > c_G(1)c_H(1) and j > c_G(1)c_H(1), then (M)_{ij} and (M)_{ji} are both entries of A_{22}⊗B_{22} + A_{22}^T⊗B_{22}^T, and this submatrix is symmetric.

• If i > c_G(1)c_H(1) and j ≤ c_G(1)c_H(1), then (M)_{ij} is an entry of G and (M)_{ji} is an entry of G^T, so they are equal.

The result immediately follows from Theorem 2.2.10. We motivated the usage of the Frobenius norm already in the proof of Theorem 2.2.15. Notice that the normalization after each iteration step happens 'together', by dividing by ‖(Z^{(k+1)}_1, Z^{(k+1)}_2)‖_F, because this is in accordance with the conditions of Theorem 2.2.10. Normalizing Z^{(k+1)}_1 and Z^{(k+1)}_2 separately after the expressions are calculated is a bad idea: it gives an iterative process that is different from the one described in Theorem 2.2.10, and we cannot prove convergence in that case.

We now prove the induction step |C| = n−1 ⇒ |C| = n. The only crucial thing to prove is that M is again symmetric; the rest of the steps consist of an easy expansion of the case |C| = 2. M is in this case equal to:

M = \begin{pmatrix}
A_{11}⊗B_{11} + A_{11}^T⊗B_{11}^T & ⋯ & A_{1(n−1)}⊗B_{1(n−1)} + A_{(n−1)1}^T⊗B_{(n−1)1}^T & A_{1n}⊗B_{1n} + A_{n1}^T⊗B_{n1}^T \\
⋮ & ⋱ & ⋮ & ⋮ \\
A_{(n−1)1}⊗B_{(n−1)1} + A_{1(n−1)}^T⊗B_{1(n−1)}^T & ⋯ & A_{(n−1)(n−1)}⊗B_{(n−1)(n−1)} + A_{(n−1)(n−1)}^T⊗B_{(n−1)(n−1)}^T & A_{(n−1)n}⊗B_{(n−1)n} + A_{n(n−1)}^T⊗B_{n(n−1)}^T \\
A_{n1}⊗B_{n1} + A_{1n}^T⊗B_{1n}^T & ⋯ & A_{n(n−1)}⊗B_{n(n−1)} + A_{(n−1)n}^T⊗B_{(n−1)n}^T & A_{nn}⊗B_{nn} + A_{nn}^T⊗B_{nn}^T
\end{pmatrix}


which can be seen as:

M = \begin{pmatrix} M′ & \begin{matrix} A_{1n}⊗B_{1n} + A_{n1}^T⊗B_{n1}^T \\ ⋮ \\ A_{(n−1)n}⊗B_{(n−1)n} + A_{n(n−1)}^T⊗B_{n(n−1)}^T \end{matrix} \\ \begin{matrix} A_{n1}⊗B_{n1} + A_{1n}^T⊗B_{1n}^T & ⋯ & A_{n(n−1)}⊗B_{n(n−1)} + A_{(n−1)n}^T⊗B_{(n−1)n}^T \end{matrix} & A_{nn}⊗B_{nn} + A_{nn}^T⊗B_{nn}^T \end{pmatrix}

From the induction hypothesis we know that M′ is symmetric. It is clear that the blocks in the last column of M are the transposes of the blocks in the last row. It follows that M is again symmetric.

Algorithm 8 and Algorithm 9

We present an algorithmic implementation of the described method. First, we have to find a way to easily calculate a partitioned adjacency matrix. This is presented in Algorithm 8. This algorithm expects as input an adjacency matrix where the vertices are arranged by color, and an ordered list that tells how many vertices belong to each color. In Algorithm 9 we calculate the similarity matrix of two node colored graphs, based on the adjacency matrix partitioning algorithm. A Matlab implementation of both algorithms can be found in Listing A.7 and Listing A.8 in Appendix A.
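Note that in MATLAB the partitioning step of Algorithm 8 is essentially a single call to mat2cell; the sketch below (ours, not Listing A.7) reproduces the partitioning of Example 2.4.1:

% Partition an adjacency matrix by color counts.
C = [2 1 1];                                % vertices per color (Example 2.4.1)
A = [0 1 1 0; 0 0 1 2; 0 2 0 0; 0 0 0 1];   % the renumbered matrix A'
Z = mat2cell(A, C, C);                      % Z{i,j} is the block A_ij
Z{1,1}                                      % = [0 1; 0 0]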


Data:
C: a list with the number of vertices of each color in G (both the vertices and the colors must be ordered appropriately)
A: the 'normal' adjacency matrix of the graph G (the vertices must be ordered appropriately: the first vertices must belong to the first color)
Result:
Z: a 2-dimensional list where the element on place (i, j) is the block A_{ij} of the partitioned adjacency matrix, i.e. the adjacencies between the vertices of color i and the vertices of color j
begin colored_node_adjacency_matrix_partitioning(C, A)
    for i : 1 to |C| do
        for j : 1 to |C| do
            C_i = the number of vertices of color i (= C(i));
            C_j = the number of vertices of color j (= C(j));
            B = a C_i × C_j matrix with all entries equal to 0;
            start_vertex_i = 0;
            start_vertex_j = 0;
            for k : 1 to i − 1 do
                start_vertex_i = start_vertex_i + C(k);
            end
            for k : 1 to j − 1 do
                start_vertex_j = start_vertex_j + C(k);
            end
            for r : 1 to C_i do
                for k : 1 to C_j do
                    B(r, k) = A(start_vertex_i + r, start_vertex_j + k);
                end
            end
            Z(i, j) = B;
        end
    end
    return Z;
end
Algorithm 8: Algorithm that takes an ordered list with the number of vertices per color and a normal adjacency matrix as input, and returns a partitioned adjacency matrix.


Data:
CA: a list with the number of vertices of each color in G,
A: the 'normal' adjacency matrix of the graph G,
CB: a list with the number of vertices of each color in H,
B: the 'normal' adjacency matrix of the graph H,
TOL: tolerance for the estimation error.
Result:
S: a 1-dimensional list with on place (i) the similarity matrix S_{ii} of the vertices of color i.
begin colored_node_similarity_matrix(CA, A, CB, B, TOL)
    AP = colored_node_adjacency_matrix_partitioning(CA, A);
    BP = colored_node_adjacency_matrix_partitioning(CB, B);
    k = 1;
    for i : 1 to |CA| do
        S(i) = a CB(i) × CA(i) matrix with all entries equal to 1;
        µ(i) = a CB(i) × CA(i) matrix with all entries equal to TOL;
    end
    S_previous = S; S_previous_even = S;
    while True do
        norm = 0;
        for i : 1 to |CA| do
            Z(i) = Σ_{j∈{1,...,|CA|}} BP(i, j) S_previous(j) AP(i, j)^T + BP(j, i)^T S_previous(j) AP(j, i);
            norm = norm + trace(Z(i)^T Z(i));
        end
        S = Z / √norm;
        if k even then
            if |S − S_previous_even| < µ then
                break;
            else
                S_previous_even = S;
            end
        end
        S_previous = S; k = k + 1;
    end
    return S;
end
Algorithm 9: Algorithm that calculates the similarity matrices S_{ii} of two node colored graphs, based on the partitioning of Algorithm 8.

Example

Example 2.4.3. Let G be the graph:

[Figure: a node colored graph on the vertices v_1, . . . , v_5.]


and H be the graph:

[Figure: a node colored graph on the vertices w_1, . . . , w_5.]

The similarity matrix is given by:

S = \begin{pmatrix} 0.5969 & 0.3791 & 0 & 0 & 0 \\ 0 & 0.3791 & 0.5969 & 0 & 0 \\ 0 & 0 & 0 & 0.5986 & 0 \\ 0 & 0 & 0 & 0.3764 & 0.3764 \\ 0 & 0 & 0 & 0 & 0.5986 \end{pmatrix}

As one would expect, the highest similarity scores (about 0.59) are obtained for the nodes that are the transitions between two types of nodes: the pairs (v_1, w_1), (v_3, w_2), (v_4, w_3) and (v_5, w_5) all share this similarity score.

2.4.2 Colored edges

Method

We now extend the node-edge similarity method from Section 2.3 to edge colored graphs. Take two edge colored graphs G = (V,→, C, b) and H = (U,→′, C, b′) with |C| different colors (remember that b, b′ are surjective). The edges can be renumbered such that those of the same color are next to each other in the source-edge and terminus-edge matrices, so A_S, A_T from G and B_S, B_T from H can be partitioned as follows:

A_S = (A_{S_1} ⋯ A_{S_{|C|}}) and A_T = (A_{T_1} ⋯ A_{T_{|C|}})
B_S = (B_{S_1} ⋯ B_{S_{|C|}}) and B_T = (B_{T_1} ⋯ B_{T_{|C|}})

Again, this can easily be achieved by multiplying the original A_S by the permutation matrix that represents the renumbering of the edges. The blocks A_{S_i}, A_{T_i} ∈ R^{n_G×c_G(→,i)}, with n_G the number of vertices of G and c_G(→,i) the number of edges of color i. So for H we have: B_{S_i}, B_{T_i} ∈ R^{n_H×c_H(→′,i)}. We give a small example.

Example 2.4.4. Let G be the following graph:

[Figure: an edge colored graph on the vertices v_1, . . . , v_4 with eight numbered edges e_1, . . . , e_8, where e_4 and e_5 are parallel edges.]

When we calculate the source-edge matrix with Algorithm 7 (the resulting matrices represent an edge numbering left-to-right, the same as indicated in the graph) and the terminal-edge


matrix in an analogous way, we get:

A_S = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} and A_T = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{pmatrix}

If we order the colors as {green, blue, orange} (so color 1 = green, color 2 = blue, color 3 = orange), we can renumber the edges:

[Figure: the same graph with the edges renumbered by color.]

A′S and A′T are now:

A′_S = A_SP = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}

A′_T = A_TP = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{pmatrix} P = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 \end{pmatrix}

which can be partitioned in blocks as follows (we have 3 colors: 1 = green, 2 = blue, 3 = orange):

A′_S = (A′_{S_1} A′_{S_2} A′_{S_3}) = \left( \begin{pmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \right)

A′_T = (A′_{T_1} A′_{T_2} A′_{T_3}) = \left( \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} \right)

Just like in the colored node method, the edge similarity matrix will be block diagonal, because we compare only the edges of the same color. The edge similarity matrix thus has a block diagonal structure with blocks Y_{ii} ∈ R^{c_H(→′,i)×c_G(→,i)}:

Y = \begin{pmatrix} Y_{11} & 0 & ⋯ & 0 \\ 0 & Y_{22} & ⋯ & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & ⋯ & Y_{|C||C|} \end{pmatrix}

The node similarity matrix X, on the other hand, is no different from the one of Theorem 2.3.6. To adapt the method of Theorem 2.3.6 to colored edges, we have to rewrite the equations (2.18) and (2.19) in a decoupled form such that X^{(k)} and Y^{(k)} can be calculated independently of each other. To make this paragraph more readable, we write our original equations again:

Y^{(k+1)} = (B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T) / ‖B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T‖_F

X^{(k+1)} = (B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T) / ‖B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T‖_F

Remember that G = A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T and G^T = A_S ⊗ B_S + A_T ⊗ B_T. From equations (2.20) and (2.21) we can write (see the proof of Theorem 2.3.6):

x^{(k+1)} = G^TG x^{(k)} / ‖G^TG x^{(k)}‖_F and y^{(k+1)} = GG^T y^{(k)} / ‖GG^T y^{(k)}‖_F   (2.22)

Remember that x^{(k)} = vec(X^{(k)}); with Lemma 2.2.13 we can rewrite this in decoupled notation (see the first part of the proof of Theorem 2.3.6):

X^{(k+1)} = (B_S(B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T)A_S^T + B_T(B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T)A_T^T) / ‖B_S(B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T)A_S^T + B_T(B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T)A_T^T‖_F

Y^{(k+1)} = (B_S^T(B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T)A_S + B_T^T(B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T)A_T) / ‖B_S^T(B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T)A_S + B_T^T(B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T)A_T‖_F

To keep the notation understandable, we will keep on using the decoupled equations of (2.22). We are now ready for the theorem that describes the method of edge similarity on colored edges:


Theorem 2.4.5. Let G = (V,→, C, b) and H = (U,→′, C, b′) be two edge colored graphs and define:

X′^{(k+1)} = Σ_{i∈{1,...,|C|}} B_{S_i}Y^{(k)}_{ii}A_{S_i}^T + B_{T_i}Y^{(k)}_{ii}A_{T_i}^T

Y′^{(k+1)}_{11} = B_{S_1}^TX^{(k)}A_{S_1} + B_{T_1}^TX^{(k)}A_{T_1}
⋮
Y′^{(k+1)}_{ii} = B_{S_i}^TX^{(k)}A_{S_i} + B_{T_i}^TX^{(k)}A_{T_i}
⋮
Y′^{(k+1)}_{|C||C|} = B_{S_{|C|}}^TX^{(k)}A_{S_{|C|}} + B_{T_{|C|}}^TX^{(k)}A_{T_{|C|}}

(X, Y_{11}, . . . , Y_{|C||C|})^{(k+1)} = (X′^{(k+1)}, Y′^{(k+1)}_{11}, . . . , Y′^{(k+1)}_{|C||C|}) / ‖(X′^{(k+1)}, Y′^{(k+1)}_{11}, . . . , Y′^{(k+1)}_{|C||C|})‖_F

for k = 0, 1, . . . Then the matrix subsequences (X, Y_{11}, . . . , Y_{|C||C|})^{(2k)} converge to (X, Y_{11}, . . . , Y_{|C||C|})^{even}; also the odd subsequences converge. If we take:

X^{(0)} = J ∈ R^{n_H×n_G}

Y^{(0)}_{jj} = J ∈ R^{c_H(→′,j)×c_G(→,j)}

as initial matrices, then the resulting X^{even}(1) and Y^{even}_{jj}(1) are the unique matrices of largest 1-norm among all possible limits with positive initial matrices.

Proof. By induction on |C|. Remember from Definition 1.5.20 that the function b is surjective, meaning that C only consists of colors that are actually used.

For |C| = 1, we have a graph with all edges having the same color, which can be seen as an uncolored graph. This is just the normal case, as proved in Theorem 2.3.6.

Although again redundant, it is instructive to prove the case |C| = 2 separately, because the generalization in the induction step is then immediately clear. So consider the partitioned source-edge and terminus-edge matrices:

A_S = (A_{S_1} A_{S_2}) and A_T = (A_{T_1} A_{T_2})
B_S = (B_{S_1} B_{S_2}) and B_T = (B_{T_1} B_{T_2})

the equations of the theorem are in this case:

X′^{(k+1)} = B_{S_1}Y′^{(k)}_{11}A_{S_1}^T + B_{T_1}Y′^{(k)}_{11}A_{T_1}^T + B_{S_2}Y′^{(k)}_{22}A_{S_2}^T + B_{T_2}Y′^{(k)}_{22}A_{T_2}^T

Y′^{(k+1)}_{11} = B_{S_1}^TX′^{(k)}A_{S_1} + B_{T_1}^TX′^{(k)}A_{T_1}

Y′^{(k+1)}_{22} = B_{S_2}^TX′^{(k)}A_{S_2} + B_{T_2}^TX′^{(k)}A_{T_2}

which can be rewritten using Theorem 2.2.14 as:

X′^{(k+1)} = B_{S_1}Y′^{(k)}_{11}A_{S_1}^T + B_{T_1}Y′^{(k)}_{11}A_{T_1}^T + B_{S_2}Y′^{(k)}_{22}A_{S_2}^T + B_{T_2}Y′^{(k)}_{22}A_{T_2}^T
⇔ vec(X′^{(k+1)}) = (A_{S_1} ⊗ B_{S_1} + A_{T_1} ⊗ B_{T_1}) vec(Y′^{(k)}_{11}) + (A_{S_2} ⊗ B_{S_2} + A_{T_2} ⊗ B_{T_2}) vec(Y′^{(k)}_{22})

Page 106: Similarity on Graphs and Hypergraphs · 2016-02-15 · Similarity on Graphs and Hypergraphs ... The notion of similarity originated from the HITS algorithm: this algorithm was developed

CHAPTER 2. SIMILARITY ON GRAPHS 105

Y′^{(k)}_{11} and Y′^{(k)}_{22} can also be rewritten:

vec(Y′^{(k+1)}_{11}) = (A_{S_1}^T ⊗ B_{S_1}^T + A_{T_1}^T ⊗ B_{T_1}^T) vec(X′^{(k)})

vec(Y′^{(k+1)}_{22}) = (A_{S_2}^T ⊗ B_{S_2}^T + A_{T_2}^T ⊗ B_{T_2}^T) vec(X′^{(k)})

We define z^{(k+1)} and M as follows, and again the previous expressions concatenate to a single matrix equation:

z^{(k+1)} = \begin{pmatrix} vec(X′) \\ vec(Y′_{11}) \\ vec(Y′_{22}) \end{pmatrix}^{(k+1)} = \begin{pmatrix} 0 & A_{S_1}⊗B_{S_1} + A_{T_1}⊗B_{T_1} & A_{S_2}⊗B_{S_2} + A_{T_2}⊗B_{T_2} \\ A_{S_1}^T⊗B_{S_1}^T + A_{T_1}^T⊗B_{T_1}^T & 0 & 0 \\ A_{S_2}^T⊗B_{S_2}^T + A_{T_2}^T⊗B_{T_2}^T & 0 & 0 \end{pmatrix} \begin{pmatrix} vec(X′) \\ vec(Y′_{11}) \\ vec(Y′_{22}) \end{pmatrix}^{(k)} = Mz^{(k)}

M is clearly nonnegative, as it consists of zero blocks and sums of Kronecker products of nonnegative matrices. To see that M is symmetric, rewrite M with G and G^T:

G = \begin{pmatrix} A_{S_1}^T⊗B_{S_1}^T + A_{T_1}^T⊗B_{T_1}^T \\ A_{S_2}^T⊗B_{S_2}^T + A_{T_2}^T⊗B_{T_2}^T \end{pmatrix}

With Lemma 2.2.13 we calculate G^T:

G^T = (A_{S_1}⊗B_{S_1} + A_{T_1}⊗B_{T_1}   A_{S_2}⊗B_{S_2} + A_{T_2}⊗B_{T_2})

G is a (c_G(→,1)c_H(→′,1) + c_G(→,2)c_H(→′,2)) × n_Gn_H matrix, so:

M = \begin{pmatrix} 0_{n_Gn_H} & G^T \\ G & 0_{c_G(→,1)c_H(→′,1)+c_G(→,2)c_H(→′,2)} \end{pmatrix}

which is clearly a symmetric matrix, and the result follows.

We now prove the induction step |C| = n−1 ⇒ |C| = n. The only thing to show is that M stays symmetric (nonnegativity is clear), but this is obvious, as G can be rewritten as:

G = \begin{pmatrix} A_{S_1}^T⊗B_{S_1}^T + A_{T_1}^T⊗B_{T_1}^T \\ A_{S_2}^T⊗B_{S_2}^T + A_{T_2}^T⊗B_{T_2}^T \\ ⋮ \\ A_{S_{|C|}}^T⊗B_{S_{|C|}}^T + A_{T_{|C|}}^T⊗B_{T_{|C|}}^T \end{pmatrix}

and you can rewrite M with G and G^T.


Algorithm

With the same reasoning as for colored nodes, we programmed an algorithm that takes a source-edge matrix or a terminus-edge matrix as input, together with an ordered list with the number of edges belonging to each color. Under the condition that the edges are arranged by color in the source-edge or terminus-edge matrix, this algorithm returns a partitioned source-edge or terminus-edge matrix. Based on this algorithm, we can construct an algorithm that returns the node similarity matrix X and the (colored) edge similarity matrix Y. The Matlab listings of both algorithms can be found in Listing A.9 and Listing A.10 in Appendix A. Because the algorithms closely resemble the previous algorithms, we did not write them in pseudocode.
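As an indication, a minimal MATLAB sketch (ours, not Listing A.9) of the partitioning step for a source-edge matrix:

% Partition a source-edge matrix AS by color, given a list C with the
% number of edges of each color (edges already arranged by color).
function ASP = colored_edge_partitioning(AS, C)
    ASP = cell(1, numel(C));                       % ASP{i} = block AS_i
    first = 1;
    for i = 1:numel(C)
        ASP{i} = AS(:, first : first + C(i) - 1);  % columns of color i
        first = first + C(i);
    end
end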

Example

Example 2.4.6. Let G be the graph:

[Figure: an edge colored graph on the vertices w_1, . . . , w_5 with edges e′_1, e′_2, e′_3, e′_4.]

and H be the graph:

[Figure: an edge colored graph on the vertices v_1, v_2, v_3 with edges e_1 and e_2.]

The node and edge similarity matrices are given by:

X = \begin{pmatrix} 0.2295 & 0 & 0 & 0 & 0 \\ 0 & 0.8903 & 0 & 0 & 0 \\ 0 & 0 & 0.2284 & 0.2284 & 0.2284 \end{pmatrix}

Y = \begin{pmatrix} 0.5893 & 0 & 0 & 0 \\ 0 & 0.5812 & 0.5812 & 0 \end{pmatrix}

In fact these two graphs are very similar, in the sense that node v_3 is replaced by the three nodes w_3, w_4, w_5. This is also detected by the node and edge similarity matrices, which give equal similarity scores for (e′_2, e_2) and (e′_3, e_2); the score of (e′_4, e_2) is equal to zero because e′_4 has a different color.


2.4.3 Fully colored graphs

We can also consider the combination of colored nodes and colored edges. This is in fact the same method as for colored edges; in this case both matrices X and Y will be block diagonal (possibly with a different number of blocks). The iteration matrix for Y will be completely equal to the iteration matrix for colored edges, and it is easy to adapt the iteration matrix for X so it can handle colored nodes. We will not discuss this case in detail, as it would be an extensive repetition of the previous subsections.


2.5 Applications

Similarity on graphs is a fairly new concept, so not many applications have been developed yet. In this small section we give a concise overview of the existing applications. We just give an intuitive idea, to give the reader a notion of which kinds of applications are already possible. For the more detailed, mathematical approach we refer to the relevant articles.

2.5.1 Synonym Extraction

In [? ], the structure graph 1 → 2 → 3 is used to extract synonyms from graphs constructed from a dictionary. The method is based on the assumption that synonyms have many words in common in their definitions and appear together in the definition of many words. So to construct a dictionary graph G, each word of the dictionary is a vertex and there is an edge from v_i to v_j if v_i appears in the definition of v_j. Now, given a word w, we construct a neighbourhood graph G′, which is the subgraph of G whose vertices are pointed to by w or are pointing to w. Finally, we rank the words by decreasing central score, and the words with the highest central scores are likely to be good synonyms.
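A minimal MATLAB sketch of this idea (our own illustration, not the code of the cited paper; A is the adjacency matrix of the dictionary graph and w the index of the query word):

% Rank the neighbourhood of word w by decreasing central score.
function ranked = synonym_candidates(A, w)
    nbh = unique([w; find(A(w,:))'; find(A(:,w))]); % w, its children and parents
    B = A(nbh, nbh);                % adjacency matrix of the neighbourhood graph
    [s2, ~] = eigs(B'*B + B*B', 1); % central scores (Theorem 2.2.19)
    [~, order] = sort(abs(s2), 'descend');
    ranked = nbh(order);            % word indices, best candidates first
end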

2.5.2 Graph matching

The main usage of the current graph similarity algorithms, as presented in this chapter, is to find some kind of optimal matching between the elements (vertices and edges) of two graphs. Graph matching has received a lot of attention because it is useful for data mining. For example, in [? ] a brute-force algorithm is used that calculates the similarity scores between a graph G and every possible subgraph of G. By putting some conditions on the desired result, it is possible to detect the subgraph that is most similar to the original graph. Practical uses of graph matching are:

• Comparing two databases,

• Finding clusters in a set of data,

• Aligning sequences of DNA,

• Comparing networks,

• Comparing connections in social networks,

• ...


Chapter 3

Similarity on hypergraphs

3.1 Introduction

3.1.1 Motivation

In this section, we explore similarity on hypergraphs. This exploration is new: no papers can be found on the subject. A motivation to generalize the concept of similarity to hypergraphs can be found in the applications of hypergraphs in other sciences. A very clear application can be found in chemistry: some of the graph models used for the representation of molecules are not sufficient to grasp the whole complexity of the molecule structure (for example, molecules with delocalized polycentric bonds cannot be represented as graphs without losing a lot of information about the structure). Representing these molecules as hypergraphs is often a good and more illustrative solution [? ]. An established similarity method for hypergraphs may potentially detect equivalences in the structure of such molecules. Also in computer science, some research has been done on representing data structures as hypergraphs. For example, to speed up algorithms in parallel databases, a representation with hypergraphs can be used [? ]. Also in this case, a hypergraph similarity method may lead to an algorithm that efficiently compares two parallel databases.

3.1.2 What would be a good hypergraph similarity method?

In order to generalize this concept of similarity successfully and give it a correct meaning, we need to formulate some conditions that a possible method for similarity on hypergraphs has to fulfill. Intuitively, all methods from the previous chapter work in the same way: in the first iteration step only the adjacency relations of vertices (or edges) in the two graphs are used, and in each following iteration step the relationships between these adjacency relations are calculated, which results in high similarity scores (compared to the others in the similarity matrix) for a vertex when the adjacent vertices have a high similarity score. When calculating an edge similarity matrix, an edge will have a high similarity score (compared to others) if the vertices that the edge connects have a high similarity score.

Based on this, we now present our conditions. These conditions must ensure that a similarity method for hypergraphs has the same behaviour as the similarity methods for graphs:

(C1) When the method is applied to two undirected graphs, it must return the same similarity scores (up to a constant) as the methods from Section 2.


(C2) Adding a non-isolated node to one of the hypergraphs must influence the similarity scores.

(C3) Adding an edge that is not a hyperloop must influence the similarity scores.

(C4) The similarity score of a vertex in a hypergraph is large (compared to others) when the similarity scores of the adjacent vertices in the hypergraph are large.

(C5) The similarity score of an edge is large (compared to others) when it connects vertices with large similarity scores.

(C6) When two vertices (edges) have the same relationships to all other vertices (edges), we say that these vertices (edges) are structurally equivalent. Structurally equivalent vertices and edges must have the same similarity scores because, intuitively, they play exactly the same role in the hypergraph structure. Actually, we can give a more general definition of structural equivalence in hypergraphs:

Definition 3.1.1. Two vertices v_i, v_j of a hypergraph G are structurally equivalent if for each edge of size k + 1 that connects v_i to v_{p_1}, v_{p_2}, . . . , v_{p_k}, there is an edge of size k + 1 (possibly the same) that connects v_j to v_{p_1}, v_{p_2}, . . . , v_{p_k}.

When calculating an edge similarity matrix, structurally equivalent edges (edges that have the same size and connect exactly the same vertices) must also have the same similarity scores.

(C7) The cardinality of an edge E in a hypergraph (|E|) must influence the edge similarity scores: edges with a high cardinality must have a higher similarity score than edges with a lower cardinality.

(C8) A vertex can only have a similarity score equal to zero when the hypergraph is not connected. In that case, it can occur that a group of connected vertices dominates all vertices that are not adjacent to this group. Likewise, an edge can only have a similarity score equal to zero when the hypergraph is not connected. An isolated vertex (one that is not contained in a hyperloop) will always have a similarity score equal to zero.

The attentive reader may wonder how we collected this set of conditions. These conditions were established in a heuristic way, based on observations from various tryouts of the similarity methods on graphs. To give these conditions a better foundation, we now explain how the methods of similarity on graphs of Section 2 meet each condition for undirected graphs. Although most conditions are also met for directed graphs, we only have to consider undirected graphs because hypergraphs themselves are undirected. Each explanation is either a proof of a simple theorem or a more heuristic argument. The explanations are numbered (E1, . . . , E8) in the same way as the conditions because we will often refer to these explanations in the rest of this chapter.

(E1) Undirected graphs are exactly 2-hypergraphs. It is evident that, in the case of a 2-hypergraph, a similarity method for hypergraphs must return the same results (up to a constant) as the graph methods.


(E2) It is easy to see that (C2) holds for every (node) similarity method on graphs, as adding an extra node to a graph generates an additional row and column in the adjacency matrix. This row and column are thus included in the calculation of the similarity matrix S and, after the calculation, similarity scores for the added node compared to all the other nodes of the other graph are obtained.

(E3) This holds with the same reasoning as (E2), but now applied to the (edge) similarity method.

(E4) We can explain this in a heuristic way: for example, in the method of Blondel, the compact form equals (A is the adjacency matrix of GA and B is the adjacency matrix of GB):

\[
S^{(k+1)} = \frac{BS^{(k)}A^{T} + B^{T}S^{(k)}A}{\left\| BS^{(k)}A^{T} + B^{T}S^{(k)}A \right\|_F}, \qquad k = 0, 1, \ldots
\]

First note that the entry (i, j) of BS(k)AT + BTS(k)A is in fact the sum of the similarities of the children and parents of vertex vi of GB and vertex vj of GA. We see that si,j , the similarity score between vi of GB and vj of GA, is equal to a scalar times the element (i, j) of BS(k)AT + BTS(k)A. The same can be said about the even iterates and hence about the complete similarity matrix S. So this implicit relation means that the similarity score of a vertex in a graph is large (compared to others) when the similarity scores of the adjacent vertices in the graph are large. (A short Matlab sketch of this iteration follows below.)
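To make this update concrete, here is a minimal Matlab sketch of the compact-form iteration, assuming A and B are the adjacency matrices of GA and GB; the function name and the fixed (even) number of steps are our own illustration and not part of the method itself.

    % Minimal sketch of the compact-form iteration of the method of Blondel.
    % A, B: adjacency matrices of GA and GB; nsteps: an even number of steps.
    function S = blondel_similarity_sketch(A, B, nsteps)
        S = ones(size(B,1), size(A,1));   % S(0): all-ones matrix
        for k = 1:nsteps
            S = B*S*A' + B'*S*A;          % compact-form update
            S = S / norm(S, 'fro');       % normalization in the Frobenius norm
        end
    end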

(E5) This holds with the same reasoning as (E4), but now applied to the node-edge similarity method (Section 2.3).

(E6) To see that (C6) also holds in graphs, we define structural equivalent vertices in graphs as follows:

Definition 3.1.2. In a graph G, two vertices vi, vj are structural equivalent if they have exactly the same adjacency structure, meaning that for each edge that connects vi to a vertex vq, there must also be an edge that connects vj to vq. In a directed graph, the same holds in the sense that for each edge vi → vq there must exist an edge vj → vq, and for each edge vp → vi there must exist an edge vp → vj.

Notice that all isolated vertices in a graph are always structural equivalent. We will show that (C6) holds for the method of Blondel on graphs by proving the following theorem.

Theorem 3.1.3. Let GA be a graph with structural equivalent vertices and GB another graph. Then, in the similarity matrix between GA and GB, the structural equivalent vertices of GA will have the same similarity scores with respect to every vertex of GB.

Proof. Let GA = (V,→), GB = (W,→′) with |V | = n, |W | = m. First, it is easy to see that structural equivalence defines an equivalence relation ∼ on V , with vp ∼ vq if and only if vp and vq are structural equivalent. Take v̄p, an equivalence class with more than one vertex (this exists as GA has structural equivalent vertices). Now, from the definition of structural equivalent vertices in graphs, we conclude that two vertices vp and vq are structural equivalent if and only if for all i ∈ {1, . . . , n} it holds that aip = aiq and that api = aqi. This also holds for directed graphs; in the undirected case we even have aip = api = aqi = aiq. So all vertices in v̄p have the same entries in the adjacency matrix A of GA. The similarity scores of v̄p and any other vertex wj of GB at iteration step k + 1 can be calculated as (A = (aij), B = (bij), S = (sij)):

\[
s^{(k+1)}_{jp} = \frac{\sum_{f=1}^{m}\sum_{g=1}^{n} b_{jf}\,s^{(k)}_{fg}\,a^{T}_{gp} + b^{T}_{jf}\,s^{(k)}_{fg}\,a_{gp}}{\left\|BS^{(k)}A^{T} + B^{T}S^{(k)}A\right\|_F} \tag{3.1}
\]
\[
\phantom{s^{(k+1)}_{jp}} = \frac{\sum_{f=1}^{m}\sum_{g=1}^{n} b_{jf}\,s^{(k)}_{fg}\,a_{pg} + b_{fj}\,s^{(k)}_{fg}\,a_{gp}}{\left\|BS^{(k)}A^{T} + B^{T}S^{(k)}A\right\|_F} \tag{3.2}
\]

So, considering this iteratively, we get that the similarity score (at iteration step k) between any vertex in v̄p and a vertex wj of GB is equal to s(k)jp , because for all g ∈ {1, . . . , n}, apg is the same for all vertices in v̄p, and the same holds for the agp. So all the vertices in v̄p undergo the same calculations with the entries bjf , bfj (f ∈ {1, . . . , m}). Note that we start with s(0)jp equal to 1.

Because we proved that for any equivalence class of ∼ (with more than one element) the similarity scores are equal at each iteration step k, the theorem follows.

Remark 3.1.4. The notion of structural equivalence on graphs was introduced into the network literature by Lorrain and White in 1971 [? ] using category theory. Later on, in 1988, the notion of automorphic equivalence was proposed by Chris Winship in [? ]. An automorphism of a graph G = (V,→) is a permutation π of the vertex set V such that for any two vertices vi, vj it holds that π(vi) → π(vj) if and only if vi → vj. Now, two vertices vi, vj of a graph G are automorphic equivalent if and only if there exists an automorphism π such that π(vi) = vj. Automorphic equivalence is a natural generalization of structural equivalence, but it is easy to see that it is a weaker condition. Just like structural equivalence, automorphic equivalence can easily be generalized to nodes and edges of hypergraphs (see [? ]). When we developed these conditions for a good hypergraph similarity method, we long considered imposing the condition that two automorphic equivalent nodes (or edges) of a hypergraph must have the same similarity score. In the end, we did not impose this condition because our experiments showed that even the similarity methods for graphs fail to detect automorphic equivalence (in the sense that they should return the same similarity scores for automorphic equivalent vertices) for large and complex graphs. The methods we propose for hypergraphs will in general also fail to detect automorphic equivalent nodes or edges. Only when the considered hypergraphs are very simple might the methods return equal scores for automorphic equivalent nodes or edges. One may wonder whether this is a drawback of the current graph similarity methods or not. This depends on the practical setting, but in general we argue that this is normal behaviour that we would expect from a similarity method. We give an example to make our vision clear; consider the following graph:

[Figure: a tree with root A; children B, C, D; E and G below B; F below C; H and I below D.]

This graph describes the organizational structure of an organization, with A the CEO; B, C, D managers; E, G workers for manager B; H, I workers for manager D; and F the lone worker for manager C. It is easy to see that the structural equivalent pairs are {E,G} and {H, I}. They have exactly the same working relations with the rest of the organization. Manager B and manager D are not structural equivalent (they do have the same CEO, but not the same workers), but they are automorphic equivalent (of course, E, G, H and I are automorphic equivalent as well). This is also very intuitive: if we swapped them, together with their workers, the organizational structure would be the same. Still, when we would compare this organizational structure with another organizational structure using a graph similarity method, we would argue that it is justified to give B and D different similarity scores, as they are managers of different workers. Moreover, if one indeed needs a method to detect automorphic equivalence, plenty of algorithms are already available for this purpose (see [? ]).

(E7) This condition arose from intuition. A hypergraph that is not a k-hypergraph has at least two edges of different sizes. Now, intuitively, when an edge Ei in one hypergraph is compared to an edge E′i in another hypergraph, it is somehow clear that they must have a higher (edge) similarity score when they connect more vertices. Of course, this only holds in connected hypergraphs (see condition (C8)). One may wonder whether graphs too meet this condition, but in a graph a different edge size only arises in the pathological case where the graph contains a loop. We can show that comparing two loops will normally result in a lower edge similarity score than comparing two edges that aren't loops. We say 'normally' because, of course, (C4) or (C8) can always interfere in some graphs. To show this, look at the node-edge similarity method, where the edge similarity matrix Y and the node similarity matrix X are calculated as:

\[
Y^{(k+1)} = \frac{B_S^{T} X^{(k)} A_S + B_T^{T} X^{(k)} A_T}{\left\|B_S^{T} X^{(k)} A_S + B_T^{T} X^{(k)} A_T\right\|_F} \tag{3.3}
\]
\[
X^{(k+1)} = \frac{B_S Y^{(k)} A_S^{T} + B_T Y^{(k)} A_T^{T}}{\left\|B_S Y^{(k)} A_S^{T} + B_T Y^{(k)} A_T^{T}\right\|_F} \tag{3.4}
\]


for k = 0, 1, . . .. Element-wise, we get:

\[
y^{(k+1)}_{ji} = \frac{\sum_{f=1}^{n_H}\sum_{g=1}^{n_G} (B_S^{T})_{jf}\,x^{(k)}_{fg}\,(A_S)_{gi} + (B_T^{T})_{jf}\,x^{(k)}_{fg}\,(A_T)_{gi}}{\left\|B_S^{T} X^{(k)} A_S + B_T^{T} X^{(k)} A_T\right\|_F}
\]
\[
x^{(k+1)}_{ji} = \frac{\sum_{f=1}^{m_H}\sum_{g=1}^{m_G} (B_S)_{jf}\,y^{(k)}_{fg}\,(A_S^{T})_{gi} + (B_T)_{jf}\,y^{(k)}_{fg}\,(A_T^{T})_{gi}}{\left\|B_S Y^{(k)} A_S^{T} + B_T Y^{(k)} A_T^{T}\right\|_F}
\]

which is equal to:

\[
y^{(k+1)}_{ij} = \frac{\sum_{f=1}^{n_H}\sum_{g=1}^{n_G} (B_S)_{fi}\,x^{(k)}_{fg}\,(A_S)_{gj} + (B_T)_{fi}\,x^{(k)}_{fg}\,(A_T)_{gj}}{\left\|B_S^{T} X^{(k)} A_S + B_T^{T} X^{(k)} A_T\right\|_F}
\]
\[
x^{(k+1)}_{ij} = \frac{\sum_{f=1}^{m_H}\sum_{g=1}^{m_G} (B_S)_{if}\,y^{(k)}_{fg}\,(A_S)_{jg} + (B_T)_{if}\,y^{(k)}_{fg}\,(A_T)_{jg}}{\left\|B_S Y^{(k)} A_S^{T} + B_T Y^{(k)} A_T^{T}\right\|_F}
\]

Now let ej of G be a loop on vertex vp and e′i of H a loop on vertex v′q; the edge similarity score between ej and e′i then equals:

\[
y^{(k+1)}_{ij} = \frac{(B_S)_{qi}\,x^{(k)}_{qp}\,(A_S)_{pj} + (B_T)_{qi}\,x^{(k)}_{qp}\,(A_T)_{pj}}{\left\|B_S^{T} X^{(k)} A_S + B_T^{T} X^{(k)} A_T\right\|_F} \tag{3.5}
\]
\[
\Leftrightarrow\quad y^{(k+1)}_{ij} = \frac{2\,x^{(k)}_{qp}}{\left\|B_S^{T} X^{(k)} A_S + B_T^{T} X^{(k)} A_T\right\|_F} \tag{3.6}
\]
\[
x^{(k+1)}_{qp} = \frac{\sum_{f=1}^{m_H}\sum_{g=1}^{m_G} (B_S)_{qf}\,y^{(k)}_{fg}\,(A_S)_{pg} + (B_T)_{qf}\,y^{(k)}_{fg}\,(A_T)_{pg}}{\left\|B_S Y^{(k)} A_S^{T} + B_T Y^{(k)} A_T^{T}\right\|_F} \tag{3.7}
\]

where we indeed see that the result for yij is based only on xqp, which will be high if both vp and v′q are heavily connected to other vertices (see (C4)). In the case of em of GA connecting the vertices vp and vo, and e′n of GB connecting the vertices v′q, v′r, we get the following edge similarity score ynm (GA and GB are undirected):

\[
y^{(k+1)}_{nm} = \frac{2\left(x^{(k)}_{qp} + x^{(k)}_{qo} + x^{(k)}_{rp} + x^{(k)}_{ro}\right)}{\left\|B_S^{T} X^{(k)} A_S + B_T^{T} X^{(k)} A_T\right\|_F},
\]

which will normally be higher than (3.6) (keep in mind that conditions (C4) and (C8) can interfere).

(E8) To see (C8) on undirected graphs, we first notice that isolated vertices indeed always have similarity scores equal to 0 in the method of Blondel (also in the case of directed graphs). For example, let wj of H, with adjacency matrix B, be an isolated vertex and calculate the similarity scores with the vertices of G; element-wise we write:

\[
s^{(k+1)}_{jp} = \frac{\sum_{f=1}^{m}\sum_{g=1}^{n} b_{jf}\,s^{(k)}_{fg}\,a_{pg} + b_{fj}\,s^{(k)}_{fg}\,a_{gp}}{\left\|BS^{(k)}A^{T} + B^{T}S^{(k)}A\right\|_F}
\]


But since wj is an isolated vertex, we know that all the bjf and bfj are equal to zero, so:

\[
s^{(k+1)}_{jp} = \frac{\sum_{f=1}^{m}\sum_{g=1}^{n} 0\cdot s^{(k)}_{fg}\,a_{pg} + 0\cdot s^{(k)}_{fg}\,a_{gp}}{\left\|BS^{(k)}A^{T} + B^{T}S^{(k)}A\right\|_F} = 0
\]

The condition about connectivity is easy to see: when a graph is not connected, there are zeros in the adjacency matrix. Each zero decreases the similarity score; if this results in a very low similarity score in the first iteration steps, it may happen that in the limit, caused by the zeros in the adjacency matrix, the similarity scores only get lower, which finally results in a similarity score equal to zero.


3.2 Similarity through corresponding graph representations

In this section, we explore representations of hypergraphs as (classical) graphs and use them to calculate the similarity between two hypergraphs by simply using the methods of the previous chapter. Intuitively, we want a characteristic graph representation of a hypergraph that is faithful (meaning that a graph represents only one hypergraph), because only a faithful characteristic graph preserves all the information that the structure of a hypergraph contains. When the characteristic graph of a hypergraph is not faithful, the resulting similarity methods can do a fair job under certain circumstances, but we will see that we cannot fulfill all the conditions at the same time because some information of the original hypergraph is lost. To prove that several graph representations of hypergraphs meet certain conditions, we will often refer to the explanations (Ei) in the following way: we explain or prove that the graph representation preserves a certain characteristic of the hypergraph and then we simply use the explanations for graphs.

3.2.1 Line-graphs

General definitions and properties

Definition 3.2.1. Let H = (V,E) be a hypergraph with E ≠ ∅ and E = {E1, . . . , Em}. The line-graph of H is the undirected graph denoted by ℓ(H) = (V ′,↔) with:

1. V ′ = E,

2. Ei ↔ Ej if and only if Ei ∩ Ej ≠ ∅ and i ≠ j.

It is immediately clear that line-graphs are simple graphs (see Definition 1.5.14). An important property of a hypergraph can be seen on its line-graph:

Property 3.2.2. If H = (V,E) is connected, then the line-graph ℓ(H) is also connected.

Proof. By contraposition: if ℓ(H) is not connected, take Ep, Eq, two vertices of ℓ(H) that do not have a path between them. This means that there is no sequence

Ep = Ek0 , Ek1 , . . . , Ekl−1 , Eq = Ekl

such that Eki−1 ∩ Eki ≠ ∅ for i ∈ {1, . . . , l}. If vp ∈ Ep and vq ∈ Eq, with vp, vq vertices of H, this means that no path exists between vp and vq, and therefore H is not connected.

The following example shows that the characteristic line-graph of a hypergraph is not faithful:

Example 3.2.3. Let H1 be the following hypergraph:


[Figure: hypergraph H1 on vertices v1, . . . , v8 with edges E1, E2, E3.]

Let H2 be the following hypergraph:

[Figure: hypergraph H2 on vertices v1, . . . , v4 with edges E1, E2, E3.]

And let H3 be the following hypergraph (H3 is in fact also an undirected graph):

[Figure: hypergraph H3 on vertices v1, v2 with edges E1, E2, E3.]

All of H1, H2 and H3 lead to the same line-graph G :

[Figure: the common line-graph G on the vertices E1, E2, E3.]

Algorithm for similarity

We want to apply Algorithm 5 to produce a similarity matrix between two hypergraphs. This will produce an edge similarity matrix, because the vertices are not represented in the line-graph representation.

To use Algorithm 5 from Section 2.2, we first take a hypergraph as input and calculate the adjacency matrix of the corresponding line-graph. The Matlab implementation of this step can be found in Appendix A in Listing A.11. By applying this algorithm to two hypergraphs, we can use Algorithm 5.
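For concreteness, here is a minimal Matlab sketch of this conversion step (a sketch, not the thesis's Listing A.11), assuming the hypergraph is given by its n × m incidence matrix C with C(i, j) = 1 if and only if vertex vi is contained in edge Ej; the function name is our own.

    % Sketch: adjacency matrix of the line-graph of a hypergraph.
    % C is the n x m incidence matrix of the hypergraph.
    function L = linegraph_adjacency_sketch(C)
        I = C' * C;               % I(i,j) = number of vertices shared by Ei and Ej
        L = double(I > 0);        % Ei <-> Ej iff the edges intersect...
        L = L - diag(diag(L));    % ...and i ~= j (no loops)
    end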


Examples

We now give some examples of the similarity of two hypergraphs by using line-graphs.

Example 3.2.4. Let H1 be the following hypergraph:

[Figure: hypergraph H1 on vertices v1, . . . , v8 with edges E1, E2, E3.]

Let H2 be the following hypergraph:

[Figure: hypergraph H2 on vertices v1, . . . , v12 with edges E′1, E′2, E′3, E′4.]

By applying the algorithm we get the line-graph adjacency matrices A1 for H1 and A2 for H2:

\[
A_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\quad\text{and}\quad
A_2 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}
\]

Now we use Algorithm 5 with A1 and A2 and get the following similarity matrix:

\[
S = \begin{pmatrix}
0.2887 & 0.2887 & 0.2887 \\
0.2887 & 0.2887 & 0.2887 \\
0.2887 & 0.2887 & 0.2887 \\
0.2887 & 0.2887 & 0.2887
\end{pmatrix}
\]


Example 3.2.5. H1 is the same as in the previous example, but now H2 is the following hypergraph:

[Figure: hypergraph H2 on vertices v1, . . . , v12 with edges E′1, E′2, E′3.]

The similarity matrix of these two hypergraphs using their line-graph representation becomes:

\[
S = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\]

Example 3.2.6. H1 is the same as in the previous example, but now H2 is the following hypergraph:

[Figure: hypergraph H2 on vertices v1, . . . , v12 with edges E′1, E′2, E′3.]

The similarity matrix of these two hypergraphs using their line-graph representation becomes:

\[
S = \begin{pmatrix}
0.4082 & 0.4082 & 0.4082 \\
0.4082 & 0.4082 & 0.4082 \\
0 & 0 & 0
\end{pmatrix}
\]

Example 3.2.7. H1 is the same as in the previous example, but now H2 is the following hypergraph:


[Figure: hypergraph H2 on vertices v1, . . . , v8 with edges E′1, E′2, E′3, E′4.]

The similarity matrix of these two hypergraphs using their line-graph representation becomes:

\[
S = \begin{pmatrix}
0.4082 & 0.4082 & 0 & 0.4082 \\
0.4082 & 0.4082 & 0 & 0.4082 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]

Interpretation

We now discuss the conditions from the introduction:

(C1) Not fulfilled: first, the method will only return an edge similarity matrix, and second, the line-graph of an undirected graph is not necessarily the undirected graph itself:

[Figure: the path graph on vertices v1, v2, v3 with edges e1 = {v1, v2} and e2 = {v2, v3}.]

This graph has as line-graph:

[Figure: its line-graph, the single edge e1 ↔ e2.]

which will clearly not result in the same edge similarity scores, as no information about the vertex adjacency is saved.

(C2) Not fulfilled: the vertices of the hypergraph don't play a role in the line-graph. No information about them is saved in the line-graph representation.

(C3) Fulfilled: adding an edge to one of the two hypergraphs will result in an extra vertex in the line-graph. Adding this vertex will take the edge into account when calculating the similarity scores, and therefore we conclude on a heuristic basis that this condition is fulfilled.

(C4) Not fulfilled: in Examples 3.2.4, 3.2.5, 3.2.6 and 3.2.7 all positive similarity scores are the same, regardless of the adjacency structure of the hypergraph.


(C5) Not fulfilled: there is no information about the vertices saved in the line-graph representation of a hypergraph.

(C6) Fulfilled: from Example 3.2.7 we see that E′1 and E′4 are structural equivalent and indeed have the same similarity scores. We will prove this.

Theorem 3.2.8. The line-graph representation of a hypergraph preserves structural equivalent edges.

Proof. Two edges are structural equivalent in a hypergraph if they contain exactly the same vertices. This gives an equivalence relation ∼ on E, where Ei ∼ Ej if they are structural equivalent. Let Ēi be the class of edges structural equivalent with Ei, and assume that this class contains at least 2 edges. Now from Definition 3.2.1 we see that Ei ↔ Ej in the line-graph representation if and only if Ei ∩ Ej ≠ ∅ and i ≠ j, thus all the edges in Ēi will have exactly the same adjacency structure, making them structural equivalent as vertices in the line-graph representation too.

Because the line-graph representation preserves structural equivalent edges and we use the method of Blondel, for which we already proved in (E6) that this condition holds, the result follows.

(C7) Not fulfilled: no information on the number of vertices an edge contains is preserved by the line-graph representation.

(C8) Fulfilled: but it is also the only thing this representation tells us when used for the calculation of similarity: two edges of the two line-graphs have a positive similarity score when these edges are connected to other edges, meaning that for any Ep, Eq there is a sequence

Ep = Ek0 , Ek1 , . . . , Ekl−1 , Eq = Ekl

such that Eki−1 ∩ Eki ≠ ∅. This follows immediately from Property 3.2.2. The connectivity of edges is indeed represented in the line-graph representation by definition.

Conclusion

The line-graph fails to satisfy many of the conditions and is therefore not a good representation to calculate similarity between two hypergraphs. Similarity between hypergraphs through line-graphs only allows us to discover groups of connected edges in both hypergraphs. The fact that the line-graph representation is a bit disappointing was also predictable, as a lot of hypergraphs share the same line-graph representation.

3.2.2 2-section of a hypergraph

General definitions and properties

We now look at another graph representation of a hypergraph. In contrast to the line-graph representation, the 2-section saves information about the vertices, which gives a more sophisticated way to say something about the similarity between two hypergraphs.


Definition 3.2.9. The 2-section of a hypergraph H = (V,E) is the (undirected) graph denoted by H2 = (V,↔) with:

• The same vertex set as the hypergraph,

• vi ↔ vj if and only if vi, vj ∈ Ek for some Ek ∈ E and i ≠ j.

Example 3.2.10. The 2-section of the following hypergraph H is drawn on top:

[Figure: hypergraph H on vertices v1, . . . , v8 with edges E1, E2, E3, with its 2-section drawn on top.]

The 2-section of a hypergraph is not a faithful graph characteristic of a hypergraph either, as the following example shows:

Example 3.2.11. The 2-section of the following hypergraph H′ is drawn on top and is the same as in the previous example:

[Figure: hypergraph H′ on vertices v1, . . . , v8 with edges E1, . . . , E5; its 2-section, drawn on top, is the same as in Example 3.2.10.]


Algorithm for similarity

In Listing A.12 in Appendix A, we introduce an algorithm that takes a hypergraph as input and returns the adjacency matrix of the corresponding 2-section of the hypergraph. The resulting adjacency matrix can then be used in the node similarity method from Algorithm 5.
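Here too we give a minimal Matlab sketch (a sketch, not the thesis's Listing A.12), again assuming the hypergraph is given by its n × m incidence matrix C; the function name is our own.

    % Sketch: adjacency matrix of the 2-section of a hypergraph.
    % C is the n x m incidence matrix of the hypergraph.
    function A = twosection_adjacency_sketch(C)
        P = C * C';               % P(i,j) = number of edges containing both vi and vj
        A = double(P > 0);        % vi <-> vj iff some edge contains both...
        A = A - diag(diag(A));    % ...and i ~= j (simple graph, no loops)
    end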

Examples

We now use the same examples as in the previous subsection and look at what the similarity of two hypergraphs becomes by using their 2-sections.

Example 3.2.12. Take the same H1 and H2 as in Example 3.2.4. By calculating the adjacency matrices of the 2-sections, we can apply Algorithm 5, which returns the similarity matrix:

\[
S = \begin{pmatrix}
0.1347 & 0.1479 & 0.0261 & 0.1388 & 0.1479 & 0.0833 & 0.1347 & 0.1479 \\
0.1347 & 0.1479 & 0.0261 & 0.1388 & 0.1479 & 0.0833 & 0.1347 & 0.1479 \\
0.0263 & 0.0288 & 0.0051 & 0.0271 & 0.0288 & 0.0162 & 0.0263 & 0.0288 \\
0.0147 & 0.0161 & 0.0028 & 0.0151 & 0.0161 & 0.0091 & 0.0147 & 0.0161 \\
0.0790 & 0.0867 & 0.0153 & 0.0814 & 0.0867 & 0.0489 & 0.0790 & 0.0867 \\
0.1610 & 0.1768 & 0.0312 & 0.1659 & 0.1768 & 0.0996 & 0.1610 & 0.1768 \\
0.1610 & 0.1768 & 0.0312 & 0.1659 & 0.1768 & 0.0996 & 0.1610 & 0.1768 \\
0.0890 & 0.0977 & 0.0172 & 0.0917 & 0.0977 & 0.0551 & 0.0890 & 0.0977 \\
0.0263 & 0.0288 & 0.0051 & 0.0271 & 0.0288 & 0.0162 & 0.0263 & 0.0288 \\
0.1347 & 0.1479 & 0.0261 & 0.1388 & 0.1479 & 0.0833 & 0.1347 & 0.1479 \\
0.1347 & 0.1479 & 0.0261 & 0.1388 & 0.1479 & 0.0833 & 0.1347 & 0.1479 \\
0.0263 & 0.0288 & 0.0051 & 0.0271 & 0.0288 & 0.0162 & 0.0263 & 0.0288
\end{pmatrix}
\]

Example 3.2.13. Let H1 and H2 be the same as in Example 3.2.5; the similarity matrix using the 2-sections of the hypergraphs becomes:

\[
S = \begin{pmatrix}
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682 \\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682 \\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682 \\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]

Example 3.2.14. Let H1 and H2 be the same as in Example 3.2.6; the similarity matrix using the 2-sections of the hypergraphs becomes:

\[
S = \begin{pmatrix}
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.0397 & 0.0436 & 0.0077 & 0.0409 & 0.0436 & 0.0246 & 0.0397 & 0.0436 \\
0.0397 & 0.0436 & 0.0077 & 0.0409 & 0.0436 & 0.0246 & 0.0397 & 0.0436 \\
0.1623 & 0.1782 & 0.0314 & 0.1673 & 0.1782 & 0.1004 & 0.1623 & 0.1782 \\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]

Example 3.2.15. Let H2 be the same as in the previous example (Example 3.2.14), but take for H1 the following hypergraph:

[Figure: hypergraph H1 on vertices v1, . . . , v8 with edges E1, E2, E3, E4.]

The similarity matrix using the 2-sections then becomes the same as in the previous example:

\[
S = \begin{pmatrix}
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.0397 & 0.0436 & 0.0077 & 0.0409 & 0.0436 & 0.0246 & 0.0397 & 0.0436 \\
0.0397 & 0.0436 & 0.0077 & 0.0409 & 0.0436 & 0.0246 & 0.0397 & 0.0436 \\
0.1623 & 0.1782 & 0.0314 & 0.1673 & 0.1782 & 0.1004 & 0.1623 & 0.1782 \\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]


Interpretation

(C1) Not fulfilled: an undirected graph with multiple edges between two vertices vp, vq will 'lose' these edges in its 2-section. Loops are not taken into consideration either. Take for example the following graph G :

[Figure: a multigraph G on vertices v1, v2 with edges e1, e2, e3.]

G will have as 2-section:

[Figure: the 2-section of G : a single edge e1 between v1 and v2.]

(C2) Fulfilled: the number of vertices is preserved in the 2-section of a hypergraph, so each vertex is taken into account when calculating the similarity matrix. We conclude with the same reasoning as (E2) that the condition is fulfilled.

(C3) Not fulfilled: introducing an edge in a hypergraph that connects vertices which are already connected doesn't change anything in the 2-section of the hypergraph (by definition). For instance, we see in Example 3.2.15 that we get the same similarity matrix as in Example 3.2.14, as introducing edge E4 doesn't influence the similarity scores at all because v1, v4, v7 were already adjacent to each other in the 2-section of H1.

(C4) Fulfilled: take Example 3.2.12; we see that the largest similarity scores occur between vertices v6, v7 of H2 and vertices v2, v5 and v8 of H1. This can be explained from the fact that v6, v7 are indeed adjacent to all the vertices in E2, and that E2 is the most central set of vertices in the whole hypergraph. More generally, this follows from the fact that the 2-section preserves the adjacency relations between vertices of the hypergraph by definition (namely: a vertex that is adjacent to another vertex in the hypergraph will also be adjacent to it in the 2-section). Because these relations are preserved, we can use the information in (E4) to conclude that this condition is fulfilled.

(C5) This condition does not apply to this method because we don't calculate edge similarity scores.

(C6) Fulfilled: all structural equivalent vertices have the same similarity scores in Examples 3.2.12, 3.2.13 and 3.2.14. This can be explained from the fact that structural equivalent vertices of a hypergraph are also structural equivalent vertices in the 2-section graph. We will prove this:

Theorem 3.2.16. The 2-section of a hypergraph preserves structural equivalent vertices.

Proof. Structural equivalence partitions the vertex set V of a hypergraph G into equivalence classes. Take an equivalence class v̄i with |v̄i| > 1 (we assume that G has at least two structural equivalent vertices). This means that all vertices in v̄i have exactly the same adjacency structure in the hypergraph G. So, when a vertex in v̄i is adjacent to a vertex vp, all vertices in v̄i are adjacent to vp in the hypergraph G. By the definition of the 2-section, this means that all vertices in v̄i will also have an edge connecting them to vp in the 2-section. So in the 2-section, all vertices in v̄i will be adjacent to the same vertices, making them also structural equivalent by the definition of structural equivalent vertices in graphs.

Because the 2-section preserves structural equivalent vertices and we use the method of Blondel, for which we already proved in (E6) that this condition holds, the result follows.

(C7) Not fulfilled: the 2-section doesn't save any information about the number of vertices in each edge.

(C8) Fulfilled: isolated vertices in a hypergraph will also be isolated in the 2-section by definition, and the 2-section preserves connectivity by definition. Because we already know from (E8) that this condition holds for the method of Blondel, we conclude that the condition is fulfilled.

Conclusion

The 2-section of a hypergraph is a rich structure that saves a lot more information compared to the line-graph representation. The biggest drawback of this method is that adding an edge to a hypergraph can sometimes have no effect at all. This happens when the added edge connects vertices which were already connected. This is bad, because adding an edge should always have an impact on the similarity scores, as it expresses an additional bond between vertices. As a consequence, adjacent vertices that aren't structural equivalent can still have the same similarity scores. We saw in Example 3.2.11 that the 2-section of a hypergraph is not unique, meaning that we lose certain information about the hypergraph. In this case, we can lose information about the edges, as some edges will not be represented in the 2-section, and we also lose all information about the number of vertices contained in each edge. This means that the number of vertices contained in an edge doesn't play any role when calculating the similarity scores.

We can resolve the problems with conditions (C1) and (C3) by allowing multiple edges between vertices as well as loops: every edge in the hypergraph is then also represented in this extended 2-section.

3.2.3 Extended 2-section of a hypergraph

General definitions and properties

Definition 3.2.17. The extended 2-section of a hypergraph H = (V,E) is the (undirected) graph denoted by H′2 = (V,↔) with:

• The same vertex set as the hypergraph,

• for every Ei ∈ E with |Ei| > 1: vk ↔ vl for every vk, vl ∈ Ei with k ≠ l,

• for every Ei ∈ E with |Ei| = 1 : vk ↔ vk for vk ∈ Ei.


Algorithm for similarity

We introduce Algorithm 10, which takes a hypergraph as input and returns the adjacency matrix of the corresponding extended 2-section of the hypergraph. A Matlab implementation can be found in Listing A.13 in Appendix A.

Data:
n: the number of vertices of hypergraph H
E: a set of subsets Ei of {1, . . . , n} that represent the edges of hypergraph H
Result:
A: the adjacency matrix of the corresponding extended 2-section
begin hypergraph to extended2section(n, E)
    A = initialize an n × n matrix with all entries equal to 0;
    m = number of edges;
    for i : 1 to m do
        if |Ei| = 1 then
            k = the vertex in Ei;
            (A)kk = (A)kk + 1;
        else
            for each vertex vj ∈ Ei do
                for each vertex vl ∈ Ei \ {vj} do
                    (A)jl = (A)jl + 1;
                end
            end
        end
    end
    return A;
end
Algorithm 10: Algorithm to calculate the adjacency matrix of the extended 2-section of a hypergraph.
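A minimal Matlab sketch of Algorithm 10 (Listing A.13 in Appendix A is the reference implementation), assuming the hypergraph is given by its number of vertices n and a cell array E of vertex-index vectors; the function name is our own.

    % Sketch of Algorithm 10: adjacency matrix of the extended 2-section.
    function A = extended_twosection_sketch(n, E)
        A = zeros(n);
        for i = 1:numel(E)
            e = E{i};                        % e: row vector of vertex indices
            if numel(e) == 1
                A(e, e) = A(e, e) + 1;       % an edge of size 1 becomes a loop
            else
                for j = e
                    for l = setdiff(e, j)
                        A(j, l) = A(j, l) + 1;  % one parallel edge per hyperedge
                    end
                end
            end
        end
    end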

Example

We take an example that really shows the power of extended 2-sections:

Example 3.2.18. Take H1 as in Example 3.2.4 and H2:


[Figure: hypergraph H2 on vertices v1, . . . , v8 with edges E1, E2, E3, E4.]

The extended 2-sections have as adjacency matrices A for H1 and B for H2:

\[
A = \begin{pmatrix}
0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 & 2 & 1 & 1 & 2 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 1 & 0 & 1 & 1 \\
1 & 2 & 0 & 1 & 0 & 1 & 1 & 2 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 0 & 0 & 1 \\
1 & 2 & 0 & 1 & 2 & 1 & 1 & 0
\end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix}
0 & 1 & 0 & 2 & 1 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 & 2 & 1 & 1 & 2 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
2 & 1 & 1 & 0 & 1 & 0 & 1 & 1 \\
1 & 2 & 0 & 1 & 0 & 1 & 1 & 2 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 0 & 0 & 1 \\
1 & 2 & 0 & 1 & 2 & 1 & 1 & 0
\end{pmatrix}
\]

Remember that the 'normal' 2-sections of H1 and H2 would return the following adjacency matrices:

\[
A' = \begin{pmatrix}
0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 1 & 1 & 0
\end{pmatrix}
\quad\text{and}\quad
B' = \begin{pmatrix}
0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 1 & 1 & 0
\end{pmatrix}
\]

The node similarity matrix with the extended 2-section is:

\[
S = \begin{pmatrix}
0.1108 & 0.1651 & 0.0174 & 0.1132 & 0.1651 & 0.0763 & 0.1108 & 0.1651 \\
0.1409 & 0.2099 & 0.0222 & 0.1438 & 0.2099 & 0.0970 & 0.1409 & 0.2099 \\
0.0168 & 0.0250 & 0.0026 & 0.0171 & 0.0250 & 0.0116 & 0.0168 & 0.0250 \\
0.1128 & 0.1680 & 0.0177 & 0.1151 & 0.1680 & 0.0776 & 0.1128 & 0.1680 \\
0.1409 & 0.2099 & 0.0222 & 0.1438 & 0.2099 & 0.0970 & 0.1409 & 0.2099 \\
0.0629 & 0.0937 & 0.0099 & 0.0643 & 0.0937 & 0.0433 & 0.0629 & 0.0937 \\
0.0962 & 0.1433 & 0.0151 & 0.0982 & 0.1433 & 0.0663 & 0.0962 & 0.1433 \\
0.1409 & 0.2099 & 0.0222 & 0.1438 & 0.2099 & 0.0970 & 0.1409 & 0.2099
\end{pmatrix}
\]


Conversely, the node similarity matrix with the ‘normal’ 2-section would return:

\[
S' = \begin{pmatrix}
0.1409 & 0.1547 & 0.0273 & 0.1452 & 0.1547 & 0.0872 & 0.1409 & 0.1547 \\
0.1547 & 0.1698 & 0.0299 & 0.1594 & 0.1698 & 0.0957 & 0.1547 & 0.1698 \\
0.0273 & 0.0299 & 0.0053 & 0.0281 & 0.0299 & 0.0169 & 0.0273 & 0.0299 \\
0.1452 & 0.1594 & 0.0281 & 0.1496 & 0.1594 & 0.0898 & 0.1452 & 0.1594 \\
0.1547 & 0.1698 & 0.0299 & 0.1594 & 0.1698 & 0.0957 & 0.1547 & 0.1698 \\
0.0872 & 0.0957 & 0.0169 & 0.0898 & 0.0957 & 0.0539 & 0.0872 & 0.0957 \\
0.1409 & 0.1547 & 0.0273 & 0.1452 & 0.1547 & 0.0872 & 0.1409 & 0.1547 \\
0.1547 & 0.1698 & 0.0299 & 0.1594 & 0.1698 & 0.0957 & 0.1547 & 0.1698
\end{pmatrix}
\]

The most important thing to notice here is the difference in the similarity scores of vertices v1, v7 of H2: in the extended 2-section these vertices have different similarity scores, which is correct as v1 is also contained in edge E4 and therefore v1 and v7 are not structural equivalent. Conversely, the 'normal' 2-section doesn't take E4 into account, leading to the same similarity scores for v1, v7.

Example 3.2.19. Take H1 as in Example 3.2.4 and H2 as:

[Figure: hypergraph H2 on vertices v1, . . . , v4 with edges E1, E2, E3, E4.]

The similarity matrix becomes:

\[
S = \begin{pmatrix}
0.1566 & 0.2332 & 0.0246 & 0.1599 & 0.2332 & 0.1078 & 0.1566 & 0.2332 \\
0.1566 & 0.2332 & 0.0246 & 0.1599 & 0.2332 & 0.1078 & 0.1566 & 0.2332 \\
0.1566 & 0.2332 & 0.0246 & 0.1599 & 0.2332 & 0.1078 & 0.1566 & 0.2332 \\
0.1566 & 0.2332 & 0.0246 & 0.1599 & 0.2332 & 0.1078 & 0.1566 & 0.2332
\end{pmatrix}
\]

This is a beautiful example showing that sometimes (for simple hypergraphs), automorphic equivalent vertices can share the same similarity scores (see Remark 3.1.4).

Interpretation

(C1) Fulfilled: it is trivial to see that, by definition of the extended 2-section, an undirected graph will have exactly the same vertices and exactly the same edges in its extended 2-section.

(C2) Fulfilled: the number of vertices is preserved in the extended 2-section of a hypergraph, so each vertex is taken into account when calculating the similarity matrix. We conclude with the same reasoning as (E2) that the condition is fulfilled (equal to the 'normal' 2-section).


(C3) Fulfilled: the number of edges is preserved in the extended 2-section of a hypergraph, so each edge is taken into account when calculating the similarity matrix. We conclude with the same reasoning as (E3) that the condition is fulfilled.

(C4) Fulfilled: the extended 2-section preserves the adjacency relations between vertices of the hypergraph by definition (namely: a vertex that is adjacent to another vertex in the hypergraph will also be adjacent to it in the extended 2-section). Because these relations are preserved, we can use the information in (E4) to conclude that this condition is fulfilled (equal to the 'normal' 2-section).

(C5) This condition does not apply to this method because we don't calculate edge similarity scores.

(C6) Fulfilled, because:

Theorem 3.2.20. The extended 2-section of a hypergraph preserves structural equivalent vertices.

Proof. Structural equivalence partitions the vertex set V of a hypergraph G into equivalence classes. Take an equivalence class v̄i with |v̄i| > 1 (we assume that G has at least two structural equivalent vertices). This means that all vertices in v̄i have exactly the same adjacency structure in the hypergraph G. So, when a vertex in v̄i is adjacent to a vertex vp with k edges that connect both vertices (the sizes of these edges don't matter), all vertices in v̄i are adjacent with k edges to vp in the hypergraph G. By the definition of the extended 2-section, this means that all vertices in v̄i will also have k edges connecting them to vp in the extended 2-section. So in the extended 2-section, all vertices in v̄i will be adjacent to the same vertices, each connected with the same number of edges to these vertices, making them also structural equivalent in the extended 2-section by the definition of structural equivalent vertices in graphs.

Because the extended 2-section preserves structural equivalent vertices and we use the method of Blondel, for which we already proved in (E6) that this condition holds, the result follows (equal to the 'normal' 2-section).

(C7) Not fulfilled: the extended 2-section doesn't save any information about the number of vertices contained in an edge.

(C8) Fulfilled: isolated vertices in a hypergraph will also be isolated in the extended 2-section by definition, and the extended 2-section preserves connectivity by definition. Because we already know from (E8) that this condition holds for the method of Blondel, we conclude that the condition is fulfilled (equal to the 'normal' 2-section).

Conclusion

The extended 2-section solves all the issues with (C1) and (C3), as it preserves all the information about the number of edges connecting vertices and the adjacency of the vertices. Still, this representation is not unique, as no information about the cardinality of the edges is saved; therefore, the method fails (C7). The method thus performs much better than the normal 2-section, but it is unreliable in detecting differences between edges. Still, if one is willing to accept this limitation, for example in an application that is not focused on detecting differences in the number of vertices in the edges, the extended 2-section can be an option. Also note that it is impossible to calculate edge similarity scores with the extended 2-section: an edge of a hypergraph is spread over multiple edges of the 2-section, so it would be impossible to satisfy (C5) and (C7).

3.2.4 The incidence graph of a hypergraph

General definitions and properties

Definition 3.2.21. Let H = (V,E) be a hypergraph, then the incidence graph Gi of H is the undirected graph with:

1. V ′ = V ∪ E,

2. ∀vi ∈ V,∀Ej ∈ E : vi ↔ Ej if vi ∈ Ej.

Because all edges in Gi are between an element of V and an element of E, Gi is a bipartite graph and we write Gi = ((V,E),↔).

Example 3.2.22. Take the same hypergraph H as in Example 3.2.10; the incidence graph Gi equals:

[Figure: the bipartite incidence graph Gi with vertex classes {v1, . . . , v8} and {E1, E2, E3}.]

We can prove that the incidence graph Gi = (W = (V,E),↔) of a hypergraph is a faithful characteristic graph representation if we know its bipartite structure.

Theorem 3.2.23. The incidence graph Gi = (W = (V,E),↔) of a hypergraph is a faithful representation: the incidence graph represents only one hypergraph under the condition that we know the bipartite structure, which means that for the vertex set W we know which disjoint subset represents the vertices V of the hypergraph and which represents the edges E of the hypergraph.

Proof. Suppose that G and H are two hypergraphs that share the same incidence graph Gi = ((V,E),↔). Then G and H have the same set of vertices V and the same set of edges E. Also, the incidence relations in G and H are the same by the edge set ↔ of Gi. We conclude that G = H.

Algorithm for similarity

In Listing A.14 in Appendix A, we introduce an algorithm that takes a hypergraph as input and returns the adjacency matrix of the corresponding incidence graph. The resulting adjacency matrix can then be used in the node similarity method from Algorithm 5.
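A minimal Matlab sketch of this conversion (a sketch, not the thesis's Listing A.14): given the n × m incidence matrix C of the hypergraph, the incidence graph has an (n + m) × (n + m) block adjacency matrix. The function name is our own.

    % Sketch: adjacency matrix of the incidence graph of a hypergraph.
    % C is the n x m incidence matrix; the vertices come first, then the edges.
    function A = incidencegraph_adjacency_sketch(C)
        [n, m] = size(C);
        A = [zeros(n), C; C', zeros(m)];   % bipartite block structure
    end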


An important note has to be made here: this algorithm will return both the node and edge similarity scores in the same matrix. Similarity scores comparing an edge to a node (and vice versa) will also be available. Since we cannot give a correct meaning to such similarity scores, we regard them as redundant, intermediate results. In the examples, we will always draw lines in order to clearly separate the node similarity submatrix and the edge similarity submatrix.

Examples

Example 3.2.24. Take H1 and H2 as in Example 3.2.4; the similarity matrix with the incidence graph representation becomes:

\[
\left(\begin{array}{cccccccc|ccc}
0.0631 & 0.1075 & 0.0101 & 0.0732 & 0.1075 & 0.0444 & 0.0631 & 0.1075 & 0.0170 & 0.1065 & 0.0748 \\
0.0631 & 0.1075 & 0.0101 & 0.0732 & 0.1075 & 0.0444 & 0.0631 & 0.1075 & 0.0170 & 0.1065 & 0.0748 \\
0.0129 & 0.0220 & 0.0021 & 0.0150 & 0.0220 & 0.0091 & 0.0129 & 0.0220 & 0.0035 & 0.0217 & 0.0153 \\
0.0081 & 0.0138 & 0.0013 & 0.0094 & 0.0138 & 0.0057 & 0.0081 & 0.0138 & 0.0022 & 0.0137 & 0.0096 \\
0.0517 & 0.0880 & 0.0082 & 0.0599 & 0.0880 & 0.0363 & 0.0517 & 0.0880 & 0.0139 & 0.0871 & 0.0612 \\
0.1067 & 0.1817 & 0.0170 & 0.1237 & 0.1817 & 0.0750 & 0.1067 & 0.1817 & 0.0287 & 0.1799 & 0.1265 \\
0.1067 & 0.1817 & 0.0170 & 0.1237 & 0.1817 & 0.0750 & 0.1067 & 0.1817 & 0.0287 & 0.1799 & 0.1265 \\
0.0565 & 0.0961 & 0.0090 & 0.0655 & 0.0961 & 0.0397 & 0.0565 & 0.0961 & 0.0152 & 0.0952 & 0.0669 \\
0.0129 & 0.0220 & 0.0021 & 0.0150 & 0.0220 & 0.0091 & 0.0129 & 0.0220 & 0.0035 & 0.0217 & 0.0153 \\
0.0631 & 0.1075 & 0.0101 & 0.0732 & 0.1075 & 0.0444 & 0.0631 & 0.1075 & 0.0170 & 0.1065 & 0.0748 \\
0.0631 & 0.1075 & 0.0101 & 0.0732 & 0.1075 & 0.0444 & 0.0631 & 0.1075 & 0.0170 & 0.1065 & 0.0748 \\
0.0129 & 0.0220 & 0.0021 & 0.0150 & 0.0220 & 0.0091 & 0.0129 & 0.0220 & 0.0035 & 0.0217 & 0.0153 \\
\hline
0.0123 & 0.0209 & 0.0020 & 0.0143 & 0.0209 & 0.0086 & 0.0123 & 0.0209 & 0.0033 & 0.0207 & 0.0146 \\
0.0958 & 0.1632 & 0.0153 & 0.1111 & 0.1632 & 0.0674 & 0.0958 & 0.1632 & 0.0258 & 0.1616 & 0.1136 \\
0.0196 & 0.0333 & 0.0031 & 0.0227 & 0.0333 & 0.0138 & 0.0196 & 0.0333 & 0.0053 & 0.0330 & 0.0232 \\
0.0661 & 0.1126 & 0.0106 & 0.0767 & 0.1126 & 0.0465 & 0.0661 & 0.1126 & 0.0178 & 0.1115 & 0.0784
\end{array}\right)
\]

Example 3.2.25. Take H1 and H2 as in Example 3.2.5; the similarity matrix with the incidence graph representation becomes:

\[
\left(\begin{array}{cccccccc|ccc}
0.0920 & 0.1566 & 0.0147 & 0.1067 & 0.1566 & 0.0647 & 0.0920 & 0.1566 & 0.0247 & 0.1551 & 0.1090 \\
0.0920 & 0.1566 & 0.0147 & 0.1067 & 0.1566 & 0.0647 & 0.0920 & 0.1566 & 0.0247 & 0.1551 & 0.1090 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0920 & 0.1566 & 0.0147 & 0.1067 & 0.1566 & 0.0647 & 0.0920 & 0.1566 & 0.0247 & 0.1551 & 0.1090 \\
0.0920 & 0.1566 & 0.0147 & 0.1067 & 0.1566 & 0.0647 & 0.0920 & 0.1566 & 0.0247 & 0.1551 & 0.1090 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0920 & 0.1566 & 0.0147 & 0.1067 & 0.1566 & 0.0647 & 0.0920 & 0.1566 & 0.0247 & 0.1551 & 0.1090 \\
0.0920 & 0.1566 & 0.0147 & 0.1067 & 0.1566 & 0.0647 & 0.0920 & 0.1566 & 0.0247 & 0.1551 & 0.1090 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
\hline
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0920 & 0.1566 & 0.0147 & 0.1067 & 0.1566 & 0.0647 & 0.0920 & 0.1566 & 0.0247 & 0.1551 & 0.1090 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000
\end{array}\right)
\]


Example 3.2.26. Take H1 and H2 as in Example 3.2.6; the similarity matrix with the incidence graph representation becomes:

\[
\left(\begin{array}{cccccccc|ccc}
0.0839 & 0.1428 & 0.0134 & 0.0972 & 0.1428 & 0.0589 & 0.0839 & 0.1428 & 0.0226 & 0.1414 & 0.0994 \\
0.0839 & 0.1428 & 0.0134 & 0.0972 & 0.1428 & 0.0589 & 0.0839 & 0.1428 & 0.0226 & 0.1414 & 0.0994 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0254 & 0.0432 & 0.0041 & 0.0294 & 0.0432 & 0.0178 & 0.0254 & 0.0432 & 0.0068 & 0.0428 & 0.0301 \\
0.0254 & 0.0432 & 0.0041 & 0.0294 & 0.0432 & 0.0178 & 0.0254 & 0.0432 & 0.0068 & 0.0428 & 0.0301 \\
0.1092 & 0.1860 & 0.0174 & 0.1267 & 0.1860 & 0.0768 & 0.1092 & 0.1860 & 0.0294 & 0.1842 & 0.1295 \\
0.0839 & 0.1428 & 0.0134 & 0.0972 & 0.1428 & 0.0589 & 0.0839 & 0.1428 & 0.0226 & 0.1414 & 0.0994 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0839 & 0.1428 & 0.0134 & 0.0972 & 0.1428 & 0.0589 & 0.0839 & 0.1428 & 0.0226 & 0.1414 & 0.0994 \\
0.0839 & 0.1428 & 0.0134 & 0.0972 & 0.1428 & 0.0589 & 0.0839 & 0.1428 & 0.0226 & 0.1414 & 0.0994 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
\hline
0.0302 & 0.0514 & 0.0048 & 0.0350 & 0.0514 & 0.0212 & 0.0302 & 0.0514 & 0.0081 & 0.0509 & 0.0358 \\
0.0997 & 0.1697 & 0.0159 & 0.1156 & 0.1697 & 0.0701 & 0.0997 & 0.1697 & 0.0268 & 0.1681 & 0.1181 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000
\end{array}\right)
\]

Example 3.2.27. Take H1 and H2 as in Example 3.2.7; the similarity matrix with the incidence graph representation becomes:

\[
\left(\begin{array}{cccccccc|ccc}
0.0565 & 0.0961 & 0.0090 & 0.0655 & 0.0961 & 0.0397 & 0.0565 & 0.0961 & 0.0152 & 0.0952 & 0.0669 \\
0.0944 & 0.1608 & 0.0151 & 0.1095 & 0.1608 & 0.0664 & 0.0944 & 0.1608 & 0.0254 & 0.1592 & 0.1119 \\
0.0253 & 0.0431 & 0.0040 & 0.0293 & 0.0431 & 0.0178 & 0.0253 & 0.0431 & 0.0068 & 0.0427 & 0.0300 \\
0.0818 & 0.1392 & 0.0130 & 0.0948 & 0.1392 & 0.0575 & 0.0818 & 0.1392 & 0.0220 & 0.1379 & 0.0969 \\
0.0944 & 0.1608 & 0.0151 & 0.1095 & 0.1608 & 0.0664 & 0.0944 & 0.1608 & 0.0254 & 0.1592 & 0.1119 \\
0.0379 & 0.0646 & 0.0061 & 0.0440 & 0.0646 & 0.0267 & 0.0379 & 0.0646 & 0.0102 & 0.0640 & 0.0450 \\
0.0565 & 0.0961 & 0.0090 & 0.0655 & 0.0961 & 0.0397 & 0.0565 & 0.0961 & 0.0152 & 0.0952 & 0.0669 \\
0.0944 & 0.1608 & 0.0151 & 0.1095 & 0.1608 & 0.0664 & 0.0944 & 0.1608 & 0.0254 & 0.1592 & 0.1119 \\
\hline
0.0237 & 0.0403 & 0.0038 & 0.0275 & 0.0403 & 0.0166 & 0.0237 & 0.0403 & 0.0064 & 0.0399 & 0.0281 \\
0.1057 & 0.1800 & 0.0169 & 0.1226 & 0.1800 & 0.0743 & 0.1057 & 0.1800 & 0.0284 & 0.1783 & 0.1253 \\
0.0710 & 0.1210 & 0.0113 & 0.0824 & 0.1210 & 0.0499 & 0.0710 & 0.1210 & 0.0191 & 0.1198 & 0.0842 \\
0.0237 & 0.0403 & 0.0038 & 0.0275 & 0.0403 & 0.0166 & 0.0237 & 0.0403 & 0.0064 & 0.0399 & 0.0281
\end{array}\right)
\]

Example 3.2.28. Take H1 and H2 as in Example 3.2.19; the similarity matrix becomes:

\[
\left(\begin{array}{cccccccc|ccc}
0.1034 & 0.1761 & 0.0165 & 0.1199 & 0.1761 & 0.0727 & 0.1034 & 0.1761 & 0.0278 & 0.1744 & 0.1226 \\
0.0794 & 0.1352 & 0.0127 & 0.0921 & 0.1352 & 0.0558 & 0.0794 & 0.1352 & 0.0214 & 0.1339 & 0.0941 \\
0.0794 & 0.1352 & 0.0127 & 0.0921 & 0.1352 & 0.0558 & 0.0794 & 0.1352 & 0.0214 & 0.1339 & 0.0941 \\
0.0794 & 0.1352 & 0.0127 & 0.0921 & 0.1352 & 0.0558 & 0.0794 & 0.1352 & 0.0214 & 0.1339 & 0.0941 \\
\hline
0.1034 & 0.1761 & 0.0165 & 0.1199 & 0.1761 & 0.0727 & 0.1034 & 0.1761 & 0.0278 & 0.1744 & 0.1226 \\
0.0794 & 0.1352 & 0.0127 & 0.0921 & 0.1352 & 0.0558 & 0.0794 & 0.1352 & 0.0214 & 0.1339 & 0.0941 \\
0.0794 & 0.1352 & 0.0127 & 0.0921 & 0.1352 & 0.0558 & 0.0794 & 0.1352 & 0.0214 & 0.1339 & 0.0941 \\
0.0794 & 0.1352 & 0.0127 & 0.0921 & 0.1352 & 0.0558 & 0.0794 & 0.1352 & 0.0214 & 0.1339 & 0.0941
\end{array}\right)
\]

Interpretation

(C1) Fulfilled: we will prove later, in Corollary 3.3.7, that using this method with two undirected graphs returns the same results as the node-edge similarity matrix of Section 2.3.

(C2) Fulfilled: the number of vertices is preserved in the incidence graph of a hypergraph, so each vertex is taken into account when calculating the similarity matrix. We conclude with the same reasoning as (E2) that the condition is fulfilled.

(C3) Fulfilled: all edges are represented in the incidence graph of the hypergraph, so each edge is taken into account when calculating the similarity matrix. We conclude with the same reasoning as (E3) that the condition is fulfilled.

(C4) Fulfilled: the adjacency relations of the original hypergraph are represented in its incidence graph, and we use Algorithm 5, for which this statement already holds by (E4). An example can be found in Example 3.2.24, where v2, v5, v8 of H1 have the largest similarity scores with v6, v7 of H2. This is not surprising, as v6, v7 are contained in all possible edges.

(C5) Fulfilled: the adjacency relations of the original hypergraph are represented in its incidence graph, the edges are treated as normal vertices, and we use Algorithm 5, for which this statement already holds by (E4). An example can be found in Example 3.2.24, where E2 of H1 has the largest similarity score with E′2 of H2. This is not surprising, as these edges are almost equal.

(C6) Fulfilled: an example of structural equivalent edges can be found in Example 3.2.27, where E′1 and E′4 of H2 have the same similarity scores. In the same example, the vertices v1, v7 of H2 have the same similarity scores, as do v2, v5, v8. We now prove that the incidence graph preserves structural equivalence of vertices and edges:

Theorem 3.2.29. The incidence graph of a hypergraph preserves structural equivalentvertices.

Proof. Structural equivalence partitions the vertex set V of a hypergraph G into equivalence classes. Take an equivalence class v̄i with |v̄i| > 1 (we assume that G has at least two structural equivalent vertices). This means that all vertices in v̄i have exactly the same adjacency structure in the hypergraph G. So, when a vertex in v̄i is adjacent to a vertex vp, all vertices in v̄i are adjacent via the same edge Ek to vp in the hypergraph G. By the definition of the incidence graph, this means that all vertices in v̄i will be connected to the edge Ek connecting them to vp in the incidence graph. So in the incidence graph, all vertices in v̄i will be adjacent to the same vertices and the same edges, making them also structural equivalent by the definition of structural equivalent vertices in graphs. With the same reasoning, we conclude that structural equivalent edges are preserved as well.

(C7) Fulfilled: the adjacency relations in the incidence graph are determined by the number of vertices contained in an edge of the hypergraph.


(C8) Fulfilled. An example is Example 3.2.26, where the vertices v3, v8, v9 and v12 all have similarity scores equal to zero, as they form a clique that is not connected to the other vertices. The incidence graph also preserves the adjacency relations and thus connectivity, so we know from (E8) that this condition holds.

Conclusion

We conclude that using the incidence graph returns similarity scores that satisfy all the conditions, and it can be seen as a very good way to calculate the similarity scores of a hypergraph. By Theorem 3.2.23 this is not so surprising: the incidence graph is a faithful representation. Intuitively, this means that this representation saves all information of the represented hypergraph, so that it is possible to reconstruct the hypergraph from its incidence graph. As a result, no information of the hypergraph is lost when using Algorithm 5, so it is possible to satisfy all the conditions.


3.3 Similarity by using the incidence matrix

We already showed that using the incidence graph representation of a hypergraph is a very good way to calculate similarity between hypergraphs. It might seem logical to stop searching for similarity methods for hypergraphs now, but there is one very natural generalization that is worth mentioning: the node-edge similarity method described in Section 2.3 uses a source-edge matrix and a terminus-edge matrix. When we use this method (see Theorem 2.3.6) with undirected graphs G and H, the source-edge matrix and terminus-edge matrix are the same and are equal to the incidence matrices of G and H. The question is now: is there any problem if we enter not the incidence matrices of two graphs, but of two hypergraphs (see Definition 1.6.10)? The only difference is that each column can have more than 2 entries equal to 1. The answer is no.

Even better, we will be able to prove that this method is equal to the method with the incidence graph up to a constant! By this 'concluding theorem', we know that this method also meets all conditions imposed in the introduction.

We first prove the compact form of the method:

3.3.1 Compact form

Theorem 3.3.1. Let G = (V,E) and H = (V ′, E′) be two hypergraphs, where G has nG vertices and mG edges and H has nH vertices and mH edges. Let A and B be the incidence matrices of G and H and define:

\[
Y^{(k+1)} = \frac{B^{T}X^{(k)}A}{\left\|B^{T}X^{(k)}A\right\|_F} \tag{3.8}
\]
\[
X^{(k+1)} = \frac{BY^{(k)}A^{T}}{\left\|BY^{(k)}A^{T}\right\|_F} \tag{3.9}
\]

for k = 0, 1, . . .. Then the matrix subsequences X(2k), Y (2k) and X(2k+1), Y (2k+1) converge to Xeven, Yeven and Xodd, Yodd. If we take

\[
X^{(0)} = J \in \mathbb{R}^{n_H \times n_G}, \qquad Y^{(0)} = J \in \mathbb{R}^{m_H \times m_G}
\]

as initial matrices, then Xeven(J) = Xodd(J), Yeven(J) = Yodd(J) are the unique matrices of largest 1-norm among all possible limits with positive initial matrices, and the matrix sequences X(k), Y (k) converge as a whole.

Proof. The only thing we have to prove is that we can construct a matrix M that is nonnegative and symmetric; the rest of the proof is completely analogous to the proof of Theorem 2.3.6.

So, by Theorem 2.2.14 we can rewrite (2.16) as follows:

Y ′(k+1) = BTX(k)A

⇔ vec(Y ′(k+1)) = vec(BTX ′(k)A)⇔ vec(Y ′(k+1)) = (AT ⊗BT ) vec(X ′(k))

Page 138: Similarity on Graphs and Hypergraphs · 2016-02-15 · Similarity on Graphs and Hypergraphs ... The notion of similarity originated from the HITS algorithm: this algorithm was developed

CHAPTER 3. SIMILARITY ON HYPERGRAPHS 137

Completely analogous we can also rewrite (2.17):

vec(X ′(k+1)) = (A⊗B) vec(Y ′k),

define y(k) = vec(Y ′(k+1)) and x(k) = vec(X ′(k+1)), we get:

y(k+1) = (AT ⊗BT )x(k)

x(k+1) = (A⊗B)y(k)

If we define G = AT ⊗BT , then with Lemma 2.2.13:

GT = (AT ⊗BT )T

= A⊗B

So we get:

y(k+1) = Gx(k) (3.10)x(k+1) = GTy(k), (3.11)

G is an m_G m_H × n_G n_H matrix, so the previous expressions can be concatenated into a single matrix update equation (defining the matrix M and the vector z^(k+1)):

z^(k+1) = ( x ; y )^(k+1) = ( 0_{n_G n_H}   G^T ; G   0_{m_G m_H} ) ( x ; y )^(k) = M z^(k),

where ( x ; y ) stacks x on top of y and 0_d denotes the d × d zero matrix. M is clearly nonnegative because G and G^T are nonnegative, and M is also clearly symmetric, so the result follows immediately from Theorem 2.2.10. The rest of the proof is completely analogous to the proof of Theorem 2.3.6.
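As a quick numerical sanity check of the vec/Kronecker identity used in this proof, the following Matlab sketch (with assumed small incidence matrices) verifies that vec(B^T X A) = (A^T ⊗ B^T) vec(X):

A = [1 0; 1 1; 0 1];             % assumed incidence matrix of a small hypergraph G
B = [1 1 0; 0 1 1; 1 0 0];       % assumed incidence matrix of a small hypergraph H
X = rand(size(B,1), size(A,1));  % an arbitrary iterate X'
lhs = reshape(B' * X * A, [], 1);  % vec(B^T X A)
rhs = kron(A', B') * X(:);         % (A^T (x) B^T) vec(X)
disp(norm(lhs - rhs));             % numerically zero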

We now define X_even(J) as the node similarity matrix and Y_even(J) as the edge similarity matrix.

Note that the equations in compact form in the node-edge similarity method were defined as:

Y^(k+1) = (B_S^T X^(k) A_S + B_T^T X^(k) A_T) / ‖B_S^T X^(k) A_S + B_T^T X^(k) A_T‖_F

X^(k+1) = (B_S Y^(k) A_S^T + B_T Y^(k) A_T^T) / ‖B_S Y^(k) A_S^T + B_T Y^(k) A_T^T‖_F

But since A_S = A_T = A and B_S = B_T = B in this case, each sum is just twice the single term B^T X^(k) A (respectively B Y^(k) A^T), and this factor 2 is eliminated by the normalization in each step. This shows that the compact form as presented in the previous theorem is not different from the compact form of Theorem 2.3.6.
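The following minimal sketch (with assumed matrices) illustrates this cancellation numerically:

A = [1 0; 1 1; 0 1];        % assumed incidence matrix, so A_S = A_T = A
B = [1 1 0; 0 1 1; 1 0 0];  % assumed incidence matrix, so B_S = B_T = B
X = ones(size(B,1), size(A,1));
Y1 = (B'*X*A + B'*X*A) / norm(B'*X*A + B'*X*A, 'fro');  % doubled sum
Y2 = (B'*X*A) / norm(B'*X*A, 'fro');                    % single term
disp(norm(Y1 - Y2, 'fro'));  % zero: the factor 2 is removed by the normalization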


3.3.2 The algorithm

The algorithm is completely analogous to Algorithm 6. A Matlab implementation can be found in Listing A.16 in Appendix A. We also present Algorithm 12, an algorithm that takes a hypergraph as input and returns the incidence matrix of the hypergraph. A Matlab implementation of this algorithm can be found in Listing A.15.

Data:
A: the n_G × m_G incidence matrix of a hypergraph G
B: the n_H × m_H incidence matrix of a hypergraph H
TOL: tolerance for the estimation error.
Result:
X: the node similarity matrix between G and H
Y: the edge similarity matrix between G and H
begin node_edge_similarity_matrix_hypergraphs(A, B, TOL)
    k = 1;
    X^(0) = 1 (n_H × n_G matrix with all entries equal to 1);
    Y^(0) = 1 (m_H × m_G matrix with all entries equal to 1);
    µ_X = n_H × n_G matrix with all entries equal to TOL;
    µ_Y = m_H × m_G matrix with all entries equal to TOL;
    repeat
        Y^(k) = B^T X^(k−1) A / ‖B^T X^(k−1) A‖_F;
        X^(k) = B Y^(k) A^T / ‖B Y^(k) A^T‖_F;
        k = k + 1;
    until |X^(k) − X^(k−1)| < µ_X and |Y^(k) − Y^(k−1)| < µ_Y;
    return X^(k), Y^(k);
end
Algorithm 11: Algorithm for calculating the node and edge similarity matrices X and Y between G and H.


Data:
n: the number of vertices of hypergraph H
E: a set of subsets E_i of {1, . . . , n} that represent the edges of hypergraph H
Result:
A: the incidence matrix of the hypergraph
begin hypergraph_to_incidencematrix(n, E)
    m = |E|;
    A = initialize an n × m matrix with all entries equal to 0;
    for E_j ∈ E do
        for v_i ∈ E_j do
            (A)_ij = 1;
        end
    end
    return A;
end
Algorithm 12: Algorithm to calculate the incidence matrix of a hypergraph.

3.3.3 Examples

We now recompute the examples of the previous section with this method:

Example 3.3.2. Let H1, H2 be the same hypergraphs as in Example 3.2.4. We get as incidence matrix A of H1 and B of H2:

A =
    0 1 0
    0 1 1
    1 0 0
    1 1 0
    0 1 1
    0 0 1
    0 1 0
    0 1 1

and B =
    0 1 0 0
    0 1 0 0
    0 0 1 0
    1 0 0 0
    1 0 0 1
    0 1 0 1
    0 1 0 1
    0 0 1 1
    0 0 1 0
    0 1 0 0
    0 1 0 0
    0 0 1 0

These matrices can be used to apply Algorithm 11, which results in the node similarity matrix X and the edge similarity matrix Y:

X =
    0.0838 0.1428 0.0134 0.0972 0.1428 0.0589 0.0838 0.1428
    0.0838 0.1428 0.0134 0.0972 0.1428 0.0589 0.0838 0.1428
    0.0171 0.0291 0.0027 0.0198 0.0291 0.0120 0.0171 0.0291
    0.0108 0.0183 0.0017 0.0125 0.0183 0.0076 0.0108 0.0183
    0.0686 0.1168 0.0109 0.0796 0.1168 0.0482 0.0686 0.1168
    0.1417 0.2413 0.0226 0.1643 0.2413 0.0996 0.1417 0.2413
    0.1417 0.2413 0.0226 0.1643 0.2413 0.0996 0.1417 0.2413
    0.0750 0.1277 0.0120 0.0869 0.1277 0.0527 0.0750 0.1277
    0.0171 0.0291 0.0027 0.0198 0.0291 0.0120 0.0171 0.0291
    0.0838 0.1428 0.0134 0.0972 0.1428 0.0589 0.0838 0.1428
    0.0838 0.1428 0.0134 0.0972 0.1428 0.0589 0.0838 0.1428
    0.0171 0.0291 0.0027 0.0198 0.0291 0.0120 0.0171 0.0291

Y =
    0.0134 0.0840 0.0590
    0.1045 0.6549 0.4603
    0.0213 0.1337 0.0940
    0.0721 0.4519 0.3177
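These scores can be reproduced with the listings of Appendix A. The sketch below (with the edge sets read off from the columns of A and B above, an assumed tolerance of 1e-8 and the functions of Listings A.15 and A.16) computes X and Y for this example:

E1 = {[3 4], [1 2 4 5 7 8], [2 5 6 8]};               % edges of H1 (columns of A)
E2 = {[4 5], [1 2 6 7 10 11], [3 8 9 12], [5 6 7 8]}; % edges of H2 (columns of B)
A = hypergraph_to_incidencematrix(8, E1);             % Listing A.15
B = hypergraph_to_incidencematrix(12, E2);            % Listing A.15
[X, Y] = node_edge_similarity_matrix_hypergraphs(A, B, 1e-8);  % Listing A.16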

Example 3.3.3. Take H1 and H2 as in Example 3.2.5; the node similarity matrix X and the edge similarity matrix Y become:

X =
    0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
    0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
    0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
    0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
    0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
    0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

Y =
    0.0000 0.0000 0.0000
    0.1294 0.8112 0.5702
    0.0000 0.0000 0.0000

Example 3.3.4. Take H1 and H2 as in Example 3.2.6; the node similarity matrix X and the edge similarity matrix Y become:

X =
    0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
    0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
    0.0326 0.0555 0.0052 0.0378 0.0555 0.0229 0.0326 0.0555
    0.0326 0.0555 0.0052 0.0378 0.0555 0.0229 0.0326 0.0555
    0.1401 0.2386 0.0224 0.1625 0.2386 0.0985 0.1401 0.2386
    0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
    0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
    0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
    0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

Y =
    0.0375 0.2351 0.1652
    0.1239 0.7764 0.5457
    0.0000 0.0000 0.0000

Example 3.3.5. Take H1 and H2 as in Example 3.2.7; the node similarity matrix X and the edge similarity matrix Y become:

X =
    0.0778 0.1326 0.0124 0.0903 0.1326 0.0547 0.0778 0.1326
    0.1302 0.2216 0.0208 0.1509 0.2216 0.0915 0.1302 0.2216
    0.0349 0.0594 0.0056 0.0404 0.0594 0.0245 0.0349 0.0594
    0.1127 0.1919 0.0180 0.1307 0.1919 0.0792 0.1127 0.1919
    0.1302 0.2216 0.0208 0.1509 0.2216 0.0915 0.1302 0.2216
    0.0523 0.0891 0.0083 0.0607 0.0891 0.0368 0.0523 0.0891
    0.0778 0.1326 0.0124 0.0903 0.1326 0.0547 0.0778 0.1326
    0.1302 0.2216 0.0208 0.1509 0.2216 0.0915 0.1302 0.2216

Y =
    0.0233 0.1459 0.1025
    0.1039 0.6512 0.4577
    0.0698 0.4376 0.3076
    0.0233 0.1459 0.1025

3.3.4 Concluding theorem

Before heading to an interpretation of the examples and the method, we now present a ‘concluding theorem’ showing that the similarity method using the incidence matrix is, up to a constant, equal to the method using the incidence graph of the hypergraph. This gives us an immediate guarantee that all conditions of the introduction are also met for the method with the incidence matrix.

Theorem 3.3.6. The similarity method using the incidence graphs of two hypergraphs G, H returns the same results as the method with the incidence matrices of G, H, up to a constant, so:

X^(k) = α S^(k)[1, . . . , n_H ; 1, . . . , n_G]
Y^(k) = β S^(k)[n_H + 1, . . . , n_H + m_H ; n_G + 1, . . . , n_G + m_G]


with X, Y the node and edge similarity matrices of the method with the incidence matrices of H, G; S the similarity matrix of the method of Blondel used with the incidence graphs of H, G; α, β ∈ R and k even. S[R ; C] denotes the submatrix of S with rows R and columns C.

Proof. To calculate the similarity matrix using the incidence graph representation with the method of Blondel, an element s_ij at iteration step k is calculated as follows (remember that A and B, here the adjacency matrices of the incidence graphs, are symmetric):

s_ij^(k+1) = ( Σ_{f=1}^{n_H+m_H} Σ_{g=1}^{n_G+m_G} b_{if} s_{fg}^(k) a_{gj} ) / ‖B S^(k) A^T‖_F    (3.12)

Now, remember that the incidence graph is a bipartite graph; thus the adjacency matrices A and B have the following property (remember our convention to order the rows and columns so that the vertices come first, followed by the edges):

b_{if} = 0 when 1 ≤ i, f ≤ n_H or n_H < i, f ≤ n_H + m_H    (3.13)
a_{gj} = 0 when 1 ≤ g, j ≤ n_G or n_G < g, j ≤ n_G + m_G    (3.14)

We are only interested in the cases where s_ij is a comparison between two nodes or between two edges, as the method with the incidence matrix also only calculates these, and it is rather hard to give a correct meaning to a comparison between an edge and a node.

For the rest of the proof we only consider the calculation of node similarity; the calculation of edge similarity is completely analogous. In this case we have 1 ≤ i ≤ n_H and 1 ≤ j ≤ n_G, and with property (3.13), (3.12) becomes:

s_ij^(k+1) = ( Σ_{f=n_H+1}^{n_H+m_H} Σ_{g=n_G+1}^{n_G+m_G} b_{if} s_{fg}^(k) a_{gj} ) / ‖B S^(k) A^T‖_F    (3.15)

So in the case of calculating similarities between nodes, only the entries s_{fg}^(k) belonging to the edges play a role!

Now, remember the compact form of the method using the incidence matrices (B′ is the incidence matrix of H, A′ that of G; the rows of an incidence matrix represent the vertices, the columns the edges):

X^(k+1) = B′ Y^(k) A′^T / ‖B′ Y^(k) A′^T‖_F    (3.16)
Y^(k+1) = B′^T X^(k) A′ / ‖B′^T X^(k) A′‖_F    (3.17)

Elementwise, we write:

x_ij^(k+1) = ( Σ_{p=1}^{m_H} Σ_{q=1}^{m_G} b′_{ip} y_{pq}^(k) a′_{jq} ) / ‖B′ Y^(k) A′^T‖_F    (3.18)

y_cd^(k+1) = ( Σ_{h=1}^{n_H} Σ_{e=1}^{n_G} b′_{hc} x_{he}^(k) a′_{ed} ) / ‖B′^T X^(k) A′‖_F    (3.19)

Now every non-zero element of the adjacency matrices A, B can also be found in the incidence matrices A′, B′ (we assume that the vertices and edges have the same numbering in both matrices):

a_{ij} = a′_{i, j−n_G}  if 1 ≤ i ≤ n_G and n_G < j ≤ n_G + m_G
a_{ij} = a′_{j, i−n_G}  if n_G < i ≤ n_G + m_G and 1 ≤ j ≤ n_G

and analogously for the entries b_{if} of B in terms of B′ and n_H.


So we rewrite (3.15) and we get (remember that 1 ≤ i ≤ n_H and 1 ≤ j ≤ n_G):

s_ij^(k+1) = ( Σ_{f=n_H+1}^{n_H+m_H} Σ_{g=n_G+1}^{n_G+m_G} b′_{i, f−n_H} s_{fg}^(k) a′_{j, g−n_G} ) / ‖B S^(k) A^T‖_F    (3.20)

           = ( Σ_{f=1}^{m_H} Σ_{g=1}^{m_G} b′_{i,f} s_{f+n_H, g+n_G}^(k) a′_{j,g} ) / ‖B S^(k) A^T‖_F    (3.21)

The equivalence of (3.21) and (3.18) is now clear: (3.21) depends only on the edge similarity scores (to calculate the node similarity scores), and every edge similarity score is multiplied with exactly the same entries of the incidence matrices. The result follows by proving that also the edge similarity scores y_cd are equal (up to a constant) to the s_ij with n_H < i ≤ n_H + m_H and n_G < j ≤ n_G + m_G, which is possible with the exact same reasoning. The difference up to a constant is caused by the normalization in each iteration. We conclude that the theorem is true for each even k.
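A minimal numerical sketch of the theorem, with two assumed small hypergraphs given by their incidence matrices (two steps of the Blondel iteration on the incidence graphs correspond to one full (Y, X) update of the incidence-matrix method):

A = [1 0; 1 1; 0 1];               % assumed incidence matrix of G: 3 vertices, 2 edges
B = [1 1 0; 0 1 1; 1 0 0; 0 0 1];  % assumed incidence matrix of H: 4 vertices, 3 edges
CG = [zeros(3) A; A' zeros(2)];    % adjacency matrix of the incidence graph of G
CH = [zeros(4) B; B' zeros(3)];    % adjacency matrix of the incidence graph of H
S = ones(7, 5); X = ones(4, 3);
for k = 1:20
    S = CH*S*CG; S = S/norm(S, 'fro');  % Blondel step (CG and CH are symmetric)
    S = CH*S*CG; S = S/norm(S, 'fro');  % second step: one full (Y, X) update
    Y = B'*X*A;  Y = Y/norm(Y, 'fro');  % iteration (3.8)
    X = B*Y*A';  X = X/norm(X, 'fro');  % iteration (3.9)
end
SVV = S(1:4, 1:3);                           % node-node block of S
disp(norm(SVV/norm(SVV, 'fro') - X, 'fro')); % numerically zero: equal up to a constant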

Corollary 3.3.7. The node-edge similarity method (see 2.3) returns the same results for undirected graphs G, H as the node similarity method of Blondel (see 2.1) using the incidence graphs of G, H, up to a constant.

Proof. This follows immediately from the fact that in the node-edge similarity method forundirected graphs, the source-edge matrix and terminal-edge matrix are the same and areboth equal to the incidence matrix for undirected graphs (see Definition 1.5.27).

Remark 3.3.8. One may wonder whether the node-edge similarity method of 2.3 also returns the same results for directed graphs G, H as the node similarity method of Blondel using the incidence graphs of G, H. This is the case, but it is a little harder to prove. First, we use the following definition for the incidence graph of a directed graph:

Definition 3.3.9. Let G = (V,→) be a directed graph; then the incidence graph G′_i = (V′,→′) of G is the directed graph with:

1. V ′ = V ∪ →,

2. ∀vi ∈ V,∀ej ∈→: vi → ej if vi is the source node of ej,

3. ∀vi ∈ V,∀ej ∈→: ej → vi if vi is the terminal node of ej.

Starting from the node-edge similarity method described in Section 2.3, we notice that this method uses source-edge and terminal-edge matrices. In the undirected case, the source-edge and terminal-edge matrices of a graph are both equal to the incidence matrix of the graph, but this is not true in the directed case. Indeed, both the source-edge and the terminal-edge matrix have nonnegative entries, and therefore the node-edge similarity method returns nonnegative (node or edge) similarity scores, while the incidence matrix for directed graphs also contains negative entries (see Definition 1.5.27). Still, we can prove that using the method of Blondel with the adjacency matrices A, B of the incidence graphs of the directed graphs G, H returns the same results (up to a constant) as using the node-edge similarity method with the source-edge and terminal-edge matrices of G, H. The proof follows exactly the same reasoning as the proof of Theorem 3.3.6, but it is more extensive, as one has to consider the equalities with both the source-edge matrix and the terminal-edge matrix.
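As a small illustration, the sketch below (with an assumed directed 3-cycle and the helper functions of Listings A.4 and A.5) assembles the adjacency matrix of the incidence graph of Definition 3.3.9 from the source-edge and terminal-edge matrices:

A = [0 1 0; 0 0 1; 1 0 0];        % assumed adjacency matrix of a directed 3-cycle
AS = source_edge_matrix(A);       % Listing A.4: the n x m source-edge matrix
AT = terminal_edge_matrix(A);     % Listing A.5: the n x m terminal-edge matrix
n = size(A,1); m = size(AS,2);
C = [zeros(n) AS; AT' zeros(m)];  % v_i -> e_j iff v_i is the source node of e_j,
disp(C);                          % e_j -> v_i iff v_i is the terminal node of e_j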


3.3.5 Interpretation

By the concluding theorem, we know that the similarity method for hypergraphs using the incidence matrix is the same (up to a constant) as the method using the incidence graph representation of hypergraphs. Since the latter method meets all the conditions, this method satisfies all the requirements too.


Appendix A

Listings

Listing A.1: The MatLab code for the power method described in Algorithm 1.

function [y, mu] = power_method(A, y, TOL)
k = 1;
mu = 0;
while true
    z = A*y;
    mu_previous = mu;
    mu = norm(z,2);   % estimate of the dominant eigenvalue
    y_previous = y;
    y = z/mu;         % normalized eigenvector estimate
    k = k + 1;
    if and(k > 2, abs(mu - mu_previous) < TOL)
        break;
    end
end
if (y(1) == -y_previous(1))   % sign flip: the dominant eigenvalue is negative
    mu = -mu;
end
return;
end

Listing A.2: The MatLab code for Algorithm 5.

function [Z] = similarity_matrix(A, B, TOL)
Z_0 = ones(size(B,1), size(A,1));
mu(1:size(B,1), 1:size(A,1)) = TOL;
Z = Z_0;
Z_previouseven = Z_0;
k = 1;
while true
    Y = norm((B*Z*transpose(A) + transpose(B)*Z*A), 'fro');
    X = B*Z*transpose(A) + transpose(B)*Z*A;
    Z = X/Y;                       % normalized update
    if mod(k,2) == 0               % compare only even iterates
        difference = abs(Z - Z_previouseven);
        disp(difference);          % debug output
        Z_previouseven = Z;
        if (difference < mu)       % true when all entries are below TOL
            break;
        end
    end
    k = k + 1;
end
return;
end

Listing A.3: The MatLab code for Algorithm 6.

function [X, Y] = node_edge_similarity_matrices(AS, AT, BS, BT, TOL)
X = ones(size(BS,1), size(AS,1));
Y = ones(size(BS,2), size(AS,2));
mu_X(1:size(BS,1), 1:size(AS,1)) = TOL;
mu_Y(1:size(BS,2), 1:size(AS,2)) = TOL;
X_previous = X;   % initialized here so the first comparison is well defined
Y_previous = Y;
while true
    Y = (transpose(BS)*X*AS + transpose(BT)*X*AT) ...
        / norm(transpose(BS)*X*AS + transpose(BT)*X*AT, 'fro');
    X = (BS*Y*transpose(AS) + BT*Y*transpose(AT)) ...
        / norm(BS*Y*transpose(AS) + BT*Y*transpose(AT), 'fro');
    difference_X = abs(X - X_previous);
    difference_Y = abs(Y - Y_previous);
    X_previous = X;
    Y_previous = Y;
    if difference_X < mu_X
        if difference_Y < mu_Y
            break;
        end
    end
end
return;
end

Listing A.4: The MatLab code to convert an adjacency matrix to a source-edge matrix (edges are numbered left-to-right based on the adjacency matrix).

function [AS] = source_edge_matrix(A)
number_of_edges = sum(A(:));
AS = zeros(size(A,1), number_of_edges);  % initialize the source-edge matrix A_S
current_edge = 1;
for i = 1:size(A,1)
    for j = 1:size(A,2)
        if A(i,j) > 0
            for e = 1:A(i,j)             % one column per parallel edge
                AS(i, current_edge) = 1; % vertex i is the source of this edge
                current_edge = current_edge + 1;
            end
        end
    end
end
end

Listing A.5: The MatLab code to convert an adjacency matrix to a terminal-edge matrix (edges are numbered left-to-right based on the adjacency matrix).

function [AT] = terminal_edge_matrix(A)
number_of_edges = sum(A(:));
AT = zeros(size(A,1), number_of_edges);  % initialize the terminal-edge matrix A_T
current_edge = 1;
for i = 1:size(A,1)
    for j = 1:size(A,2)
        if A(i,j) > 0
            for e = 1:A(i,j)
                AT(j, current_edge) = 1; % vertex j is the terminal node of this edge
                current_edge = current_edge + 1;
            end
        end
    end
end
end

Listing A.6: The MatLab code for Algorithm 6, but with adjacency matrices as input (edges are numbered left-to-right based on the adjacency matrices).

function [X, Y] = node_edge_similarity_matrices_with_adjacency_matrix(A, B, TOL)
AS = source_edge_matrix(A);
AT = terminal_edge_matrix(A);
BS = source_edge_matrix(B);
BT = terminal_edge_matrix(B);
[X, Y] = node_edge_similarity_matrices(AS, AT, BS, BT, TOL);
end

Listing A.7: The MatLab code for Algorithm 8 that takes an ordered list with the number of vertices of the same color and a normal adjacency matrix as input and returns a partitioned adjacency matrix.

function [Z] = colored_node_adjacency_matrix_partitioning(colored_vertices, A)
number_of_colors = size(colored_vertices, 2);
for i = 1:number_of_colors
    for j = 1:number_of_colors
        c_i = colored_vertices(i);
        c_j = colored_vertices(j);
        B = zeros(c_i, c_j);
        start_vertex_i = 0;
        start_vertex_j = 0;
        for k = 1:i-1
            start_vertex_i = start_vertex_i + colored_vertices(k);
        end
        for k = 1:j-1
            start_vertex_j = start_vertex_j + colored_vertices(k);
        end
        for r = 1:c_i
            for k = 1:c_j
                B(r,k) = A(start_vertex_i + r, start_vertex_j + k);
            end
        end
        Z{i,j} = B;  % block of A with the rows of color i and the columns of color j
    end
end
return;
end

Listing A.8: The MatLab code for Algorithm 9 that calculates the node similarity matrix for colored nodes.

function [Z] = colored_node_similarity_matrix(colored_verticesA, A, colored_verticesB, B, TOL)
partitionedA = colored_node_adjacency_matrix_partitioning(colored_verticesA, A);
partitionedB = colored_node_adjacency_matrix_partitioning(colored_verticesB, B);
number_of_colors = size(colored_verticesA, 2);
k = 1;
for i = 1:number_of_colors
    Z{i} = ones(colored_verticesB(i), colored_verticesA(i));
end
Z_previouseven = Z;
while true
    normZ = 0;
    for i = 1:number_of_colors
        Z_temp = 0;
        for l = 1:number_of_colors
            Z_temp = Z_temp + partitionedB{i,l}*Z{l}*transpose(partitionedA{i,l}) ...
                + transpose(partitionedB{l,i})*Z{l}*partitionedA{l,i};
        end
        Z{i} = Z_temp;
        normZ = normZ + trace(transpose(Z{i})*(Z{i}));  % squared Frobenius norm of the block
    end
    normZ = sqrt(normZ);  % Frobenius norm of all blocks together
    for i = 1:number_of_colors
        Z{i} = Z{i}/normZ;
    end
    if mod(k,2) == 0
        have_to_stop = 1;
        for i = 1:number_of_colors
            difference_i = abs(Z{i} - Z_previouseven{i});
            if not(all(difference_i(:) < TOL))
                have_to_stop = 0;
            end
        end
        if (have_to_stop == 1)
            break;
        else
            Z_previouseven = Z;
        end
    end
    k = k + 1;
end
end

Listing A.9: The MatLab code for the algorithm that takes an ordered list with the number of edges of the same color and a source-edge matrix or terminal-edge matrix as input and returns a partitioned source-edge or terminal-edge matrix.

function [Z] = colored_edge_matrix_partitioning(colored_edges, A)
number_of_colors = size(colored_edges, 2);
number_of_nodes = size(A,1);
for i = 1:number_of_colors
    c_i = colored_edges(i);
    B = zeros(number_of_nodes, c_i);
    start_edge_i = 0;
    for k = 1:i-1
        start_edge_i = start_edge_i + colored_edges(k);
    end
    for r = 1:number_of_nodes
        for k = 1:c_i
            B(r,k) = A(r, start_edge_i + k);
        end
    end
    Z{i} = B;  % the columns of A belonging to the edges of color i
end
return;
end

Listing A.10: The MatLab code for the algorithm that calculates the node and (colored) edge similarity matrix for colored edges.

function [S] = colored_edge_similarity_matrix(colored_edgesA, AS, AT, colored_edgesB, BS, BT, TOL)
partitionedAS = colored_edge_matrix_partitioning(colored_edgesA, AS);
partitionedAT = colored_edge_matrix_partitioning(colored_edgesA, AT);
partitionedBS = colored_edge_matrix_partitioning(colored_edgesB, BS);
partitionedBT = colored_edge_matrix_partitioning(colored_edgesB, BT);
number_of_colors = size(colored_edgesA, 2);
k = 1;
for i = 1:number_of_colors
    Z{i,i} = ones(colored_edgesB(i), colored_edgesA(i));
end
Z_previouseven = Z;
while true
    for i = 1:number_of_colors
        Z_temp = 0;
        for l = 1:number_of_colors
            Z_temp = Z_temp + partitionedBS{l}*Z{l,l}*transpose(partitionedAS{l}) ...
                + partitionedBT{l}*Z{l,l}*transpose(partitionedAT{l});
        end
        Z{i,i} = Z_temp/norm(Z_temp, 'fro');  % normalized update of the diagonal block
    end
    if mod(k,2) == 0
        have_to_stop = 1;
        for i = 1:number_of_colors
            difference_i = abs(Z{i,i} - Z_previouseven{i,i});
            if not(all(difference_i(:) < TOL))
                have_to_stop = 0;
            end
        end
        if (have_to_stop == 1)
            break;
        else
            Z_previouseven = Z;
        end
    end
    k = k + 1;
end
disp(Z);  % debug output
for i = 1:number_of_colors
    for j = 1:number_of_colors
        if not(i == j)
            Z{i,j} = zeros(size(Z{i,i},1), size(Z{j,j},2));  % off-diagonal blocks are zero
        end
    end
end
S = cell2mat(Z);
return;
end

Listing A.11: The MatLab code to calculate the adjacency matrix of the representing line-graph of a hypergraph.

function [A] = hypergraph_to_linegraph(n, E)
m = numel(E);
A = zeros(m,m);
for i = 1:m
    for j = 1:m
        if or(i == j, isempty(intersect(E{i}, E{j})))
            A(i,j) = 0;
        else
            A(i,j) = 1;  % edges i and j share at least one vertex
        end
    end
end
end

Listing A.12: The MatLab code to calculate the adjacency matrix of the 2-section of a hypergraph.

function [A] = hypergraph_to_2section(n, E)
A = zeros(n,n);
for i = 1:n
    for j = 1:n
        for idx = 1:numel(E)
            if i == j
                A(i,j) = 0;
                break;
            elseif not(or(isempty(intersect(E{idx}, [i])), isempty(intersect(E{idx}, [j]))))
                A(i,j) = 1;  % some edge contains both i and j
                break;
            else
                A(i,j) = 0;
            end
        end
    end
end
end

Listing A.13: The MatLab code for the algorithm to calculate the adjacency matrix of the extended 2-section of a hypergraph.

function [A] = hypergraph_to_extended2section(n, E)
A = zeros(n,n);
for k = 1:numel(E)
    if size(E{k},2) == 1  % a loop: an edge containing a single vertex
        A(cell2mat(E(k)), cell2mat(E(k))) = A(cell2mat(E(k)), cell2mat(E(k))) + 1;
    else
        p = perms(E{k});
        already_done = [];
        for i = 1:size(p,1)
            if isempty(intersect([p(i,1)], already_done))
                for j = 2:size(p,2)
                    A(p(i,1), p(i,j)) = A(p(i,1), p(i,j)) + 1;
                end
                already_done = [already_done, p(i,1)];
            end
        end
        disp(A);  % debug output
    end
end
return;
end

Listing A.14: The MatLab code for the algorithm to calculate the adjacency matrix of the incidence graph of a hypergraph.

function [A] = hypergraph_to_incidencegraph(n, E)
m = numel(E);
A = zeros(n+m, n+m);  % vertices first, then edges
for idx = 1:numel(E)
    edge = cell2mat(E(idx));
    for idx_2 = 1:numel(edge)
        A(edge(idx_2), n + idx) = 1;
        A(n + idx, edge(idx_2)) = 1;
    end
end
end

Listing A.15: The MatLab code for Algorithm 12 to calculate the incidence matrix of a hypergraph.

function [A] = hypergraph_to_incidencematrix(n, E)
m = numel(E);
A = zeros(n, m);
for idx = 1:numel(E)
    edge = cell2mat(E(idx));
    for idx_2 = 1:numel(edge)
        A(edge(idx_2), idx) = 1;
    end
end
end

Listing A.16: The MatLab code for Algorithm 11 to calculate the node-edge similarity scores of a hypergraph.

function [X, Y] = node_edge_similarity_matrix_hypergraphs(A, B, TOL)
X = ones(size(B,1), size(A,1));  % X^(0): n_H x n_G all-ones matrix
Y = ones(size(B,2), size(A,2));  % Y^(0): m_H x m_G all-ones matrix
X_previous = X;
Y_previous = Y;
while true
    Y = (transpose(B)*X*A) / norm(transpose(B)*X*A, 'fro');  % update (3.8)
    X = (B*Y*transpose(A)) / norm(B*Y*transpose(A), 'fro');  % update (3.9)
    difference_X = abs(X - X_previous);
    difference_Y = abs(Y - Y_previous);
    X_previous = X;
    Y_previous = Y;
    if and(all(difference_X(:) < TOL), all(difference_Y(:) < TOL))
        break;
    end
end
end


Appendix B

Results of the Eurovision Song Contest 2009-2015

This appendix contains, for every edition of the Eurovision Song Contest from 2009 to 2015, a table with the points each country received from every voting country, the total number of points, the final place, and the authority and hub scores computed with the HITS algorithm.

Table B.1: Eurovision Song Contest 2009
Table B.2: Eurovision Song Contest 2010
Table B.3: Eurovision Song Contest 2011
Table B.4: Eurovision Song Contest 2012
Table B.5: Eurovision Song Contest 2013
Table B.6: Eurovision Song Contest 2014
Table B.7: Eurovision Song Contest 2015

[The rotated tables did not survive text extraction; only their captions are kept here.]