Doubling Dimension in Real-World Graphs

Melitta Lorraine Geistdoerfer Andersen

Post on 22-Dec-2015


Page 1: Doubling Dimension in Real-World Graphs

Melitta Lorraine Geistdoerfer Andersen

Page 2: Recap: Definition

A metric space is a set X together with a distance function d that gives a non-negative distance between any 2 points in X and satisfies 3 properties:

d(x, y) = 0 if and only if x = y

d(x, y) = d(y, x)

The triangle inequality holds: d(x, y) + d(y, z) ≥ d(x, z)

The doubling dimension of a metric space (X, d) is the least k such that any ball of radius R can be covered by 2^k balls of radius R/2.

So the doubling dimension is log_2 of the maximum, over all centers and all radii, of the number of balls of half the radius it takes to cover the ball with that center and radius.
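As a rough illustration of the definition above, the sketch below estimates the doubling dimension of a small unweighted graph. It rests on several assumptions that are not from the talk: the graph is an adjacency dictionary with integer node labels, distances are hop counts, half-radius balls are centered on graph nodes, and the cover is picked greedily, so the count is an upper bound on the minimum cover rather than the exact value. The function names are hypothetical.

```python
import math
from collections import deque


def ball(adj, center, radius):
    """Return the set of nodes within `radius` hops of `center` (plain BFS)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand past the ball boundary
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def greedy_cover_size(adj, center, radius):
    """Count half-radius balls, chosen greedily, that cover ball(center, radius).

    Greedy choice is an assumption of this sketch; it may use more balls
    than an optimal cover would.
    """
    uncovered = ball(adj, center, radius)
    count = 0
    while uncovered:
        c = min(uncovered)  # deterministic pick; any uncovered node would do
        uncovered -= ball(adj, c, radius // 2)
        count += 1
    return count


def doubling_dimension_estimate(adj, radii):
    """log2 of the largest greedy cover size over all centers and given radii."""
    worst = max(greedy_cover_size(adj, c, r) for c in adj for r in radii)
    return math.log2(worst)


# Toy example: a path on five nodes, 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
estimate = doubling_dimension_estimate(path, radii=[2])
```

Only even radii are used here so that `radius // 2` is exactly half the radius.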

Page 3: An Example with a Set of Points

In this case, all of the points can be covered by 2^k balls of radius R/2, where k = 2.

Each of the balls also has a doubling dimension of 2, and each of those contains no more than 2^2 points.

When the doubling dimension is a constant (i.e. bounded), the metric is called a doubling metric.

Page 4: Some Uses of Doubling Dimension

Chan, Gupta, Maggs, and Zhou proved that for any network that has a metric with a bounded doubling dimension, a hierarchical routing structure can be imposed on it.

With this structure, the network can be addressed in such a way as to be able to get routing information from the addresses of the source and the destination.

This routing also achieves minimum or near-minimum path length.

There are also efficient nearest-neighbor algorithms that work with a graph of low doubling dimension.

Page 5: Now We Can Apply It To A Graph

We found a 200,000-node router-level graph of the Internet at http://www.caida.org/tools/measurement/skitter/router_topology/.

This was an adjacency graph, so we treated all edges as unit distances.

The doubling dimension was ~14.

Page 6: Average Covering for Each Radius

[Chart: Average Number of Balls of Half Radius to Cover a Ball of Radius R. x-axis: Radius (R = 2, 4, 8, 16, 32, 64); y-axis: Number of Balls (log scale, 1 to 100000). Data labels on the chart include 3446, 10695, 2053, 67, 1, and 667.]

Plotted on a log scale (because the x axis is also on a log scale), the average number of balls increased nearly linearly until it reached radius 8.

One interpretation of the downturn is the finite nature of the graph.

At R = 64, only one ball of radius 32 is required to cover the entire ball. Hence every node is within 32 hops of a single center, and the diameter of the graph is at most 64.
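The per-radius averages discussed on this slide could be gathered along the following lines. This is a sketch, not the procedure used in the talk: it assumes an adjacency dictionary with integer node labels, hop-count distances, a greedy cover (an upper bound on the minimum), and a random sample of centers, since trying every center of a 200,000-node graph is expensive. All names and the sampling step are assumptions.

```python
import random
from collections import deque


def ball(adj, center, radius):
    """Nodes within `radius` hops of `center`, found by BFS."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # stop expanding at the ball boundary
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def cover_size(adj, center, radius):
    """Greedy count of half-radius balls covering ball(center, radius)."""
    uncovered = ball(adj, center, radius)
    count = 0
    while uncovered:
        uncovered -= ball(adj, min(uncovered), radius // 2)
        count += 1
    return count


def average_cover_by_radius(adj, radii, samples=100, seed=0):
    """Average greedy cover size per radius over a sample of centers."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    centers = rng.sample(nodes, min(samples, len(nodes)))
    return {
        r: sum(cover_size(adj, c, r) for c in centers) / len(centers)
        for r in radii
    }


# Toy example: a cycle on eight nodes, every node used as a center.
cycle = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
averages = average_cover_by_radius(cycle, radii=[4, 8], samples=8)
```

On the toy cycle the average drops to 1 once half the radius already spans the whole graph, which is the finite-graph effect described above.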

Page 7: But What About Latencies?

This was all well and good for an adjacency graph, but for routing you actually want to know the fastest route. So we needed a weighted graph.

http://www.cs.cornell.edu/People/egs/meridian/data.php yielded a graph that measured latencies between 2,500 sites.

The doubling dimension of this weighted graph was ~9.
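For a weighted graph given as pairwise latencies, the same covering count can be run directly on a distance matrix. The sketch below assumes a symmetric matrix with zeros on the diagonal, stored as a list of lists, and a greedy cover centered on the measured sites. The layout of the actual data set is not described in the talk, so the matrix format, the function names, and the toy numbers are all assumptions.

```python
import math


def ball(dist, center, radius):
    """Indices of all sites whose distance from `center` is at most `radius`."""
    return {j for j, d in enumerate(dist[center]) if d <= radius}


def cover_size(dist, center, radius):
    """Greedy count of half-radius balls covering ball(center, radius)."""
    uncovered = ball(dist, center, radius)
    count = 0
    while uncovered:
        # The chosen site is at distance 0 from itself, so each round
        # removes at least one site and the loop terminates.
        uncovered -= ball(dist, min(uncovered), radius / 2)
        count += 1
    return count


def dimension_estimate(dist, radii):
    """log2 of the largest greedy cover size over all centers and radii."""
    worst = max(
        cover_size(dist, c, r) for c in range(len(dist)) for r in radii
    )
    return math.log2(worst)


# Toy latency matrix: four sites at positions 0, 1, 10, 11 on a line.
positions = [0, 1, 10, 11]
latency = [[abs(a - b) for b in positions] for a in positions]
estimate = dimension_estimate(latency, radii=[16, 32])
```

Measured latencies need not satisfy the triangle inequality, so on real data this is an estimate on a near-metric rather than a strict doubling dimension.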

Page 8: Covering for a Weighted Graph

Plotted on a log scale, the average number of balls formed a more symmetric curve than the unweighted graph.

There were few nodes within range for the lower radii, and at the higher radii, we again saw the effects of a finite graph.

One thing of note is the spike of 2 after 1 had already been reached.

Average Number of Balls of Half Radius it Takes to Cover Ball (chart, Number of Balls on a log scale)

Radius      Number of Balls
256         1.00
512         1.00
1024        1.02
2048        7.04
4096        29.47
8192        142.42
16384       278.33
32768       188.65
65536       42.71
131072      11.35
262144      1.00
524288      2.00
1048576     1.00
2097152     1.00
4194304     1.00

Page 9: A Possible Explanation

One thing that could cause the spike is a 2-cluster graph.

Everything within a ball of a certain size can be covered by a ball of half the radius, for both clusters.

But when you double that radius, you run into the other cluster, so 2 balls are required to cover the whole thing.
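The two-cluster explanation can be checked on a toy example. The sketch below places two tight clusters on a line, about 100 units apart, and counts half-radius balls for a doubling sequence of radii. The coordinates, the radii, and the greedy cover centered on data points are all made up for illustration and are not taken from the latency data.

```python
def cover_size(points, center, radius):
    """Greedy count of radius/2 balls, centered on data points, that cover
    every point within `radius` of `center`."""
    uncovered = {p for p in points if abs(p - center) <= radius}
    count = 0
    while uncovered:
        c = min(uncovered)
        # Drop everything the half-radius ball around c reaches.
        uncovered = {p for p in uncovered if abs(p - c) > radius / 2}
        count += 1
    return count


# Two tight clusters on a line, roughly 100 units apart (made-up values).
points = [0.0, 0.5, 1.0, 100.0, 100.5, 101.0]
radii = [2, 4, 8, 16, 32, 64, 128, 256, 512]

averages = {
    r: sum(cover_size(points, c, r) for c in points) / len(points)
    for r in radii
}

for r in radii:
    print(r, averages[r])
```

The average stays at 1 while a ball only sees its own cluster, rises to 2 at radius 128 when the ball reaches the other cluster but a half-radius ball cannot span both, and returns to 1 at radius 256.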

Page 10: Infinite Graphs?

Another thing to note is that the doubling dimension is finite because the graph is finite.

If this were a section of an infinite doubling metric, the doubling dimension would eventually flatten out and become constant.

Though the graph does start to flatten out at the peak, we don’t know if this merely indicates that the finite nature of the graph is affecting it.

[Chart repeated from Page 8: Average Number of Balls of Half Radius it Takes to Cover Ball. x-axis: Radius (256 to 4194304); y-axis: Number of Balls (log scale, 1 to 1000).]

Page 11: Other Graphs

We had so much fun with doubling dimension on these graphs, we wanted to find other graphs to play with. But what other interesting graphs are out there?

The Citation Graph connects authors of papers by references. An edge indicates that one author cited a paper by the other author in one of their papers.

People use these graphs to study nearest-neighbor algorithms.

The doubling dimension of this graph is ~12.

Page 12: The Citation Graph

This graph looks similar to the router graph.

The Citation Graph also has unit distances for the edges, so this similarity makes sense.

The earlier downward turn could be due to the high degree of each node. Many authors write many papers, and cite a large number of papers in them.

[Chart: Citation Graph Average Covers. x-axis: Radius (R = 2, 4, 8, 16, 32, 64); y-axis: Number of Balls (log scale, 1 to 10000). Data labels on the chart include 1527, 1626, 431, 29, 1, and 1.]

Page 13: More Graphs

Doubling dimension can give us information about many types of graphs.

For instance, using the Internet Movie Database, a graph of actors can be created with edges connecting two actors who were in the same movie.

The doubling dimension of this graph is ~14.
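The construction of the actor graph described above can be sketched as follows. The input format (a dictionary from movie title to cast list) and the sample names are invented for illustration; the talk does not say how the Internet Movie Database data was stored or parsed.

```python
from itertools import combinations


def actor_graph(casts):
    """Build an undirected graph with an edge between every pair of actors
    who appear in the same movie. `casts` maps a movie title to its cast."""
    adj = {}
    for cast in casts.values():
        members = set(cast)
        for actor in members:
            adj.setdefault(actor, set())  # keep actors with no co-stars too
        for a, b in combinations(members, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


# Made-up sample data.
casts = {
    "Movie A": ["Ann", "Bob", "Cy"],
    "Movie B": ["Cy", "Dee"],
}
graph = actor_graph(casts)
```

Every cast of size m contributes m(m-1)/2 edges, which is one reason a graph built this way tends to be dense.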

Page 14: Yet Another Signature Graph

This graph started its downward trend right away.

One possible explanation is that this graph is much denser than the router graph, so the balls of radius 2 cover many points that may not be within 1 hop of each other.

[Chart: Covering for Dense Unit Graph (Actor Graph). x-axis: Radius (R = 2, 4, 8, 16, 32, 64); y-axis: Average Number of Balls (log scale, 1 to 1000).]

Page 15: The Effects of Scaling

The actor graph had 400,000 nodes. This made it an interesting graph for experimentation with scaling. If we included only a portion of the nodes, what would that do to the dimension?

[Chart: Effects of Scaling on Doubling Dimension. x-axis: Radii (R = 2, 4, 8, 16, 32, 64); y-axis: Average Number of Balls (log scale, 1 to 10000). One series per graph size, from 8 Nodes to 65536 Nodes, doubling at each step.]

Page 16: Doubling Dimensions

Plotted on a log scale, the doubling dimension increases logarithmically with the number of nodes until the maximum doubling dimension is reached.

[Chart: Doubling Dimensions. x-axis: Nodes in Graph (8 to 131072, doubling at each tick); y-axis: Dimension (log scale, 1 to 100). Data labels on the chart include 1.00, 2.81, 4.75, 6.09, 7.15, 8.31, 9.38, 10.69, 11.82, 12.39, 13.13, and 13.59.]
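A scaling experiment of this kind could be sketched as below: take the subgraph induced by the first n nodes for a doubling sequence of n, and estimate the dimension of each. How the subsets were actually chosen is not stated in the talk, so taking the first n nodes in sorted order is an assumption, as are the greedy cover, the integer node labels, and the function names.

```python
import math
from collections import deque


def ball(adj, center, radius):
    """Nodes within `radius` hops of `center`, found by BFS."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # stop expanding at the ball boundary
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def dimension_estimate(adj, radii):
    """log2 of the largest greedy half-radius cover over centers and radii."""
    worst = 1
    for center in adj:
        for r in radii:
            uncovered = ball(adj, center, r)
            count = 0
            while uncovered:
                uncovered -= ball(adj, min(uncovered), r // 2)
                count += 1
            worst = max(worst, count)
    return math.log2(worst)


def induced_subgraph(adj, keep):
    """Keep only the nodes in `keep` and the edges between them."""
    keep = set(keep)
    return {u: [v for v in adj[u] if v in keep] for u in adj if u in keep}


def dimension_by_size(adj, sizes, radii):
    """Estimate the dimension of the subgraph on the first n nodes, per n."""
    order = sorted(adj)
    return {
        n: dimension_estimate(induced_subgraph(adj, order[:n]), radii)
        for n in sizes
        if n <= len(order)
    }


# Toy example: a path on 16 nodes, truncated to 2, 4, and 8 nodes.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 16] for i in range(16)}
result = dimension_by_size(path, sizes=[2, 4, 8], radii=[2, 4])
```

Even on this toy path the estimate grows with the number of nodes kept, which is the kind of relationship the chart above tracks on the actor graph.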

Page 17: Conclusions

Finite graphs have bounded doubling dimensions.

Different types of graphs have different signature cover graphs.

The number of nodes in a graph has some relation to the doubling dimension.

I like playing with graphs.

Page 18: Future Work

Actually implementing the routing algorithm on a graph.

Measuring latencies of adjacent routers to get a more accurate picture to work with.

Figuring out bounds on how scaling affects doubling dimension, possibly working with some infinite graphs.