1
Survey on Evolving Graphs Research
Speaker: Chenghui Ren
Supervisors: Prof. Ben Kao, Prof. David Cheung
2
Motivation
Evolving graphs are everywhere
- Social networks
  - Users join social networks
  - Friendships are established
- The Web
  - New Web pages are created
  - Hyperlinks are established
3
Motivation
Evolving graphs are everywhere
- P2P networks
  - New routers appear
  - Routing table size (vertex degree) changes
- Spatial networks
  - Transportation cost (edge weight) changes
4
Research branches
- Evolution of graphs
  - How do graphs evolve over time?
  - Example: networks become denser over time, with the average degree increasing [J. Leskovec 2007]
- Querying evolving graphs
  - Apply queries on evolving graphs to extract information
  - Example: how to update PageRank efficiently as graphs evolve?
5
Roadmap
- Motivation
  - Why we are interested in evolving graphs
- Evolution of graphs
  - How graphs evolve over time
  - Macroscopic evolution
  - Microscopic evolution
- Querying evolving graphs
  - How to process queries on evolving graphs
  - Incremental computation
  - Key moment detection
  - Find-verify-fix framework
6
Evolution of graphs
- Macroscopic evolution of graphs
  - How do global properties (e.g., degree distribution, diameter) evolve?
- Microscopic evolution of graphs
  - Example: how does a user link to other users?
- Microscopic node behavior results in macroscopic behavior
7
Macroscopic evolution
Stable degree distributions [R. Albert 1999]
- Power law distribution: P(degree = k) is proportional to 1/k^a
- The major hubs are closely followed by smaller ones
- The nodes tend to form communities
- Examples
  - Social networks, including collaboration networks. An example that has been studied extensively is the collaboration of movie actors in films.
  - Protein-protein interaction networks.
  - Sexual partners in humans, which affects the dispersal of sexually transmitted diseases.
  - Many kinds of computer networks, including the Internet and the World Wide Web.
  - Semantic networks.
  - Airline networks.
8
Macroscopic evolution
Densification and shrinking diameters [J. Leskovec 2007]
- Densification formula: E(t) is proportional to N(t)^a (1 < a < 2), where E(t) and N(t) are the numbers of edges and nodes at time t
- Shrinking diameters
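To make the densification formula concrete, the exponent a can be estimated from a series of snapshots with a least-squares fit on a log-log scale, since log E(t) = a * log N(t) + const. The sketch below is only an illustration: the function name and the synthetic snapshot counts are assumptions, not part of the talk.

```python
import math

def fit_densification_exponent(node_counts, edge_counts):
    """Estimate a in E(t) ~ N(t)^a by a least-squares fit of log E on log N."""
    xs = [math.log(n) for n in node_counts]
    ys = [math.log(e) for e in edge_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope of the fitted line is the densification exponent.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic snapshots (assumed data) that follow E = 2 * N^1.5 exactly.
nodes = [100, 200, 400, 800]
edges = [2 * n ** 1.5 for n in nodes]
exponent = fit_densification_exponent(nodes, edges)
```

An exponent above 1 means the average degree 2E(t)/N(t) grows with N(t), which is the densification effect described above.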
9
Microscopic evolution
Preferential attachment model [R. Albert 1999]
- New vertices attach preferentially to sites that are already well connected
- Obeys the power law distribution
- Global model: new vertices can connect to any vertex in the whole network
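A minimal sketch of preferential attachment, assuming an undirected graph in which each newcomer adds m edges and the process starts from a small clique. These choices and the function name are assumptions made for illustration, not details given in the talk.

```python
import random

def preferential_attachment(n, m, seed=None):
    """Grow an undirected graph to n vertices; each new vertex adds m edges."""
    rng = random.Random(seed)
    # Assumed starting point: a clique on m + 1 vertices.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Each vertex appears once per incident edge, so a uniform draw from
    # this list picks a vertex with probability proportional to its degree.
    endpoints = [x for edge in edges for x in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for target in targets:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges

graph_edges = preferential_attachment(50, 2, seed=1)
```

Because targets are drawn from the whole endpoint list, a newcomer may connect to any vertex in the network, which is what makes this a global model.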
10
Microscopic evolution
Forest fire model [J. Leskovec 2007]
- Intuition: how do authors identify references?
  - Find a first paper and cite it
  - Copy a few citations from the first paper
  - Continue recursively
  - From time to time, use bibliographic tools (e.g., CiteSeer) and chase back-links
11
Microscopic evolution
Forest fire model [J. Leskovec 2007]
- A node arrives
- It randomly chooses an “ambassador”
- It starts burning nodes (with probability p) and adds links to the burned nodes
- The “fire” spreads recursively, with exponential decay
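The burning process can be sketched as follows. This is a simplified, undirected variant written for illustration: each neighbor of a burned node catches fire independently with probability p, whereas the published model distinguishes forward and backward links. The function name and the representation are assumptions.

```python
import random
from collections import deque

def forest_fire_graph(n, p, seed=None):
    """Simplified forest fire growth on an undirected graph with n nodes."""
    rng = random.Random(seed)
    neighbors = {0: set()}  # start from a single node
    for new in range(1, n):
        ambassador = rng.randrange(new)  # uniform choice among existing nodes
        burned = {ambassador}
        frontier = deque([ambassador])
        while frontier:
            node = frontier.popleft()
            for nb in neighbors[node]:
                # The fire crosses each link with probability p, so reaching
                # a node d hops away becomes exponentially less likely in d.
                if nb not in burned and rng.random() < p:
                    burned.add(nb)
                    frontier.append(nb)
        # The newcomer links to every burned node, including the ambassador.
        neighbors[new] = set()
        for b in burned:
            neighbors[new].add(b)
            neighbors[b].add(new)
    return neighbors
```

With p = 0 every newcomer links only to its ambassador and the result is a tree; with p = 1 the fire always burns the whole graph and the result is a complete graph.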
12
Microscopic evolution
Forest fire model [J. Leskovec 2007]
- Obeys densification, shrinking diameters, and the power law distribution
- Local model: a newcomer will have many links near the community of his/her ambassador, a few links beyond this, and significantly fewer farther away
13
Roadmap
- Motivation
  - Why we are interested in evolving graphs
- Evolution of graphs
  - How graphs evolve over time
  - Macroscopic evolution
  - Microscopic evolution
- Querying evolving graphs
  - How to process queries on evolving graphs
  - Incremental computation
  - Key moment detection
  - Find-verify-fix framework
14
Querying evolving graphs
A number of queries appear in the literature
- PageRank queries
- Diameter queries
- Minimum spanning tree (MST) queries
- Shortest path queries
- Centrality queries
- …
15
Querying evolving graphs
Methodologies
- Incremental computation
  - PageRank queries
  - Diameter queries
- Key moment detection
  - Minimum spanning tree queries
- Our work: find-verify-fix framework
  - Shortest path queries
  - Centrality queries
16
Incremental computation
Typically, the difference between two consecutive snapshots G1 and G2 is small
Compute the solution for G2 based on the solution for G1
The incremental algorithms are expected to be fast
17
PageRank queries
Rank of a web page depends on the rank of the web pages pointing to it
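As a reference point, a basic (non-incremental) PageRank computation by power iteration might look like the sketch below. The damping factor of 0.85, the handling of pages without out-links, and the function name are assumptions that do not come from the talk.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power iteration; out_links maps every page to the pages it links to."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                # A page passes its rank on to the pages it points to.
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumed convention: a page with no out-links spreads evenly.
                share = damping * rank[v] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank

ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this small example page c is pointed to by both a and b and ends up with the highest rank, while b, which has no incoming links, ends up with the lowest.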
18
PageRank queries
Computing PageRank for large graphs at each time instance is expensive
Incremental algorithms have been proposed [P. Desikan 2005]
Main idea: the PageRank of a page depends only on the pages that point to it and is independent of the pages it points to
19
PageRank queries
- Detect the changed portion of the graph
- Partition the graph into a scalable partition P and a non-scalable partition Q such that there are no incoming links from Q to P
- Compute PageRank for Q
- Merge the rankings of the two independent partitions
- PageRank values of partition P are obtained by simple scaling, with scaling factor n(G1)/n(G2)
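The partitioning and scaling steps can be sketched as below. This is an illustration under stated assumptions, not the published algorithm: the caller supplies the set of changed nodes, Q is taken to be everything reachable from them along out-links (which guarantees that no link leaves Q for P), and n(G) is read as the number of nodes of G. Recomputing PageRank inside Q is left out.

```python
from collections import deque

def partition_and_scale(new_out_links, changed_nodes, old_rank, n_old):
    """Split G2 into unaffected P and affected Q, and rescale the ranks of P."""
    # Q: the changed nodes plus everything reachable from them in G2.
    q = set(changed_nodes)
    frontier = deque(changed_nodes)
    while frontier:
        node = frontier.popleft()
        for target in new_out_links.get(node, ()):
            if target not in q:
                q.add(target)
                frontier.append(target)
    # P: the remaining nodes; none of them receives a link from Q.
    p = set(new_out_links) - q
    n_new = len(new_out_links)
    # Old ranks of P are reused after scaling by n(G1)/n(G2).
    scaled = {v: old_rank[v] * n_old / n_new for v in p}
    return p, q, scaled

# Assumed toy data: node "e" is new and "d" gained a link to it.
g2 = {"a": ["b"], "b": [], "c": ["d"], "d": ["e"], "e": []}
old = {"a": 0.2, "b": 0.3, "c": 0.2, "d": 0.3}
p_part, q_part, scaled_ranks = partition_and_scale(g2, ["d", "e"], old, 4)
```

Only the nodes in Q need a fresh PageRank computation, which is where the savings come from when the changed portion is small.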
20
Diameter queries
In a P2P network, an important and fundamental question is how many neighbors should a computer have, i.e., what size the routing table should be
Network diameter corresponds to the number of hops a query needs to travel in the worst case
If the diameter is large, the routing table size should be increased
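For reference, the diameter of a small unweighted, connected graph can be computed by running a breadth-first search from every node, as in the sketch below. This brute-force baseline is an assumption added for illustration and is not the method discussed in the talk; it is too slow for large or frequently changing graphs.

```python
from collections import deque

def eccentricity(neighbors, source):
    """Largest hop count from source to any node (graph assumed connected)."""
    dist = {source: 0}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        for nb in neighbors[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                frontier.append(nb)
    return max(dist.values())

def diameter(neighbors):
    """Worst-case number of hops between any two nodes."""
    return max(eccentricity(neighbors, v) for v in neighbors)

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

Here `diameter(path)` is 3 and `diameter(ring)` is 2: adding one entry to the routing tables of nodes 0 and 3 shortens the worst-case query route.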
21
Diameter queries
G-Scale [Y. Fujiwara 2011]
- First study to address the diameter detection problem that guarantees exactness and efficiency on both a single big graph and evolving graphs
- Weak point: it assumes that one node and its connected edges are added to a time-evolving graph at each time tick. General edge insertions and edge deletions are not considered
22
Key moment detection
Given an evolving graph and a query, a key moment detection algorithm tries to detect those moments at which the solution to the query changes
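A naive baseline makes the definition concrete: evaluate the query on every snapshot and report the indices at which the answer differs from the previous one. The function name and the toy query are assumptions. A real key moment detection algorithm aims to avoid this full recomputation.

```python
def key_moments(snapshots, query):
    """Indices t >= 1 at which query(snapshot t) differs from snapshot t - 1."""
    moments = []
    previous = None
    for t, graph in enumerate(snapshots):
        answer = query(graph)
        if t > 0 and answer != previous:
            moments.append(t)
        previous = answer
    return moments

# Assumed toy example: snapshots are edge sets, the query counts edges.
snaps = [{(0, 1)}, {(0, 1)}, {(0, 1), (1, 2)}, {(0, 1), (1, 2)}, {(1, 2)}]
changes = key_moments(snaps, len)
```

For these snapshots the edge count is 1, 1, 2, 2, 1, so the key moments are 2 and 4.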
23
MST queries
MSTs can be used to solve energy-efficiency problems in spatial networks
A time aggregated graph is a graph in which each edge is associated with an edge weight function
A time-sub-interval is defined as a maximal sub-interval of the time horizon that has a unique MST
An efficient solution to determine time-sub-intervals is available [V. Gunturi 2010]
24
MST queries
Methodology [V. Gunturi et al. 2010]
- Edge-order-interval: a sub-interval of the time horizon during which there is a clear ordering of the edge weight functions, i.e., none of them intersect each other
- Main idea: an edge-order-interval has a unique MST
- Inspired by Prim’s algorithm
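The idea can be illustrated with a small sketch that assumes linear edge weight functions w(t) = slope * t + intercept. It cuts the time horizon at every pairwise intersection and evaluates the MST inside an interval. Note the assumptions: the linear form, the function names, and the use of Kruskal's algorithm (chosen here for brevity, whereas the cited work is described as inspired by Prim's algorithm) are not taken from the talk.

```python
from itertools import combinations

def edge_order_intervals(weight_fns, t_start, t_end):
    """weight_fns maps edge (u, v) -> (slope, intercept) of its weight line."""
    cuts = {t_start, t_end}
    for (a1, b1), (a2, b2) in combinations(weight_fns.values(), 2):
        if a1 != a2:  # parallel lines never change their relative order
            t = (b2 - b1) / (a1 - a2)
            if t_start < t < t_end:
                cuts.add(t)
    points = sorted(cuts)
    return list(zip(points, points[1:]))

def mst_at(weight_fns, t):
    """Kruskal's algorithm on the weights evaluated at time t."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = set()
    by_weight = sorted(weight_fns,
                       key=lambda e: weight_fns[e][0] * t + weight_fns[e][1])
    for u, v in by_weight:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.add((u, v))
    return tree

# Assumed toy triangle: two constant weights and one that grows with time.
fns = {("a", "b"): (0, 1), ("b", "c"): (0, 2), ("a", "c"): (1, 0)}
intervals = edge_order_intervals(fns, 0, 3)
trees = [mst_at(fns, (lo + hi) / 2) for lo, hi in intervals]
```

In this example the horizon splits into three edge-order-intervals, but the first two share the same MST. Edge-order-intervals can therefore be finer than time-sub-intervals, and consecutive ones with equal MSTs can be merged.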
25
MST queries
[Figure: an edge-order-interval]
26
MST queries
V. Gunturi et al. proposed methods to efficiently determine the moments at which the time horizon is partitioned into edge-order-intervals
They also provided methods to incrementally compute the MST based on the MST for the preceding edge-order-interval
27
Our find-verify-fix framework
Given an evolving graph (G1, G2, G3, …, Gn), FVF will
- Find representative solutions (RS’s) for G1~Gn
- Verify whether these RS’s are indeed the solution for each individual snapshot
- If the verification fails, try to fix the RS’s
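The three steps can be illustrated for shortest path queries on unweighted graphs. The sketch below is one plausible instantiation and an assumption of this write-up, not necessarily the authors' design: the candidate path is found on the edges common to all snapshots, verified against a lower bound from the union of all snapshots, and fixed by per-snapshot recomputation. Snapshots are assumed to be sets of undirected edges stored as (u, v) tuples in a consistent orientation.

```python
from collections import deque

def bfs_path(edges, source, target):
    """Shortest path by hop count in an undirected edge set, or None."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    parent = {source: None}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in neighbors.get(node, ()):
            if nb not in parent:
                parent[nb] = node
                frontier.append(nb)
    return None

def fvf_shortest_paths(snapshots, source, target):
    common = set.intersection(*snapshots)
    union = set.union(*snapshots)
    # Find: a path in the common edges exists in every snapshot.
    candidate = bfs_path(common, source, target)
    # Verify: no snapshot can beat the shortest path in the union graph.
    bound = bfs_path(union, source, target)
    if candidate is not None and len(candidate) == len(bound):
        return [candidate] * len(snapshots)
    # Fix: fall back to recomputing each snapshot separately.
    return [bfs_path(g, source, target) for g in snapshots]

stable = [{(0, 1), (1, 2), (2, 3)}, {(0, 1), (1, 2), (2, 3), (3, 4)}]
shortcut = [{(0, 1), (1, 2), (2, 3)}, {(0, 1), (1, 2), (2, 3), (0, 3)}]
```

For `stable` the candidate passes verification and one search serves both snapshots; for `shortcut` the edge (0, 3) in the second snapshot makes verification fail, so the fix step recomputes each snapshot.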
28
Our find-verify-fix framework
FVF can now handle:
- Exact shortest path (SP) queries on unweighted evolving graphs
- Approximate SP queries on weighted evolving graphs
- Approximate centrality queries
29
Future work
- Find more interesting queries
- Incorporate the ideas of incremental algorithms and key moment detection algorithms into the FVF framework
30
Thanks!