
Shortest path algorithms based on

Component Hierarchies

Anne de Koning

May 25th, 2007

University of Utrecht, Department of Mathematics, in cooperation with ORTEC

Supervisor University of Utrecht: Dr R. Bisseling
Supervisor ORTEC: Dr J. dos Santos Gromicho


CONTENTS

1. Introduction

2. Historic developments
   2.1 Graph theory and Single Source Shortest Paths
   2.2 Dijkstra's algorithm
       2.2.1 Complexity of Dijkstra's algorithm
   2.3 Improvements preceding Component Hierarchies

3. Component-free sketch of the idea

4. Component Hierarchy
   4.1 General
   4.2 Proof of correctness of the Component Hierarchy method
       4.2.1 Why minimality suffices as condition
   4.3 Queries: Traversing the Component Tree
       4.3.1 Correctness of Queries
   4.4 Differences between Undirected and Directed Graphs

5. Experimental Results
   5.1 My Results
   5.2 Literature

6. Discussion
   6.1 Future improvements

7. Conclusion

Appendix

A. Preparation
   A.1 ExampleGraph
   A.2 DelaunayGraph
   A.3 Thesis example
   A.4 Real network
   A.5 RandomQuery
   A.6 RealQuery
   A.7 saveMatrix
   A.8 saveTree
   A.9 GraphToLatex
   A.10 TreeToLatex
   A.11 show
   A.12 Dijkstra

B. Building the component tree
   B.1 componentTree
   B.2 squareSize
   B.3 transClosure
   B.4 subGraph

C. Component Hierarchy Query
   C.1 ComponentHierQuery
   C.2 Visit
   C.3 Expand
   C.4 nicefloor
   C.5 GoToChild


1. INTRODUCTION

Route calculations are an intriguing field of applied mathematics. Single Source Shortest Path problems deal with finding shortest paths from one origin to all other vertices in a network. Route planners become increasingly demanding as calculations need to be performed faster and faster on maps with a growing number of roads and locations. Dijkstra's original method [1] has already been improved a number of times; these improvements, however, do not include a linear-time algorithm. Thorup [2] introduced the Component Hierarchy method for undirected graphs, which runs in linear time. It is a complex method, and the proof of correctness takes quite some perseverance to read. Building on the work of Thorup, a method for directed graphs was suggested by Hagerup [3]. This algorithm was not supported by a proof for directed graphs. The goals of this thesis are:

• to survey recent developments concerning single source shortest path problems (SSSPs);

• to give a more readable proof of Thorup's algorithm while proving the correctness of Hagerup's algorithm;

• to prototype an implementation of these algorithms;

• to determine whether Component Hierarchies are useful for implementation by ORTEC, considering correctness and calculation speed.

Developments preceding the Component Hierarchy methods are described in chapter 2. The readable proof for the undirected method and the new proof for the directed method are integrated in chapter 4. The proof for both methods, but mainly the directed case, is supported by an implementation of the Component Hierarchy method in Matlab 6.5.

The Component Hierarchy algorithm works on a RAM (Random Access Machine). An addressable word or memory unit has length ω bits, corresponding to a capacity of 2^ω. This quantity will mainly be used to bound edge lengths.


2. HISTORIC DEVELOPMENTS

2.1 Graph theory and Single Source Shortest Paths

Recall the following basics of graph theory. A graph G is a combination of a set of vertices V and a set of edges E connecting the vertices, written as G = (V, E), where the number of vertices is |V| = n and the number of edges is |E| = m. A connected graph contains at least one path from every vertex v to every other vertex u. If the graph is edge-weighted, there exists a positive edge-weight function ℓ(v, w) describing the edge lengths. If two vertices v and w are not connected by an edge e ∈ E, then define ℓ(v, w) to be ∞. We assume that the graph is strongly connected and contains no edges with zero length. If such edges do exist, their endpoints are easily contracted to single vertices in O(m) time.

This thesis assumes the edge weights to be integers. Furthermore, note that the term "shortest" paths is somewhat misleading. Any proper function ℓ : E → R≥0 may be used as objective function. This does not have to be geometric distance: the most common applications of shortest path calculations require the fastest route. It is also possible to minimize cost, which is a weighted function of, for instance, distance (fuel usage), time (truck driver's payment), and possibly toll and ferry fees. Following convention, no matter what the objective is, minimizing it will simply be referred to as finding the 'shortest' path throughout this thesis, and the edge weight function is called the distance function. This is done without loss of generality.

For the Component Hierarchy method to work properly, there has to be a feasible solution. Therefore the graph must be strongly connected, meaning there is a path from every vertex to every other vertex. The Single Source Shortest Path problem (SSSP) deals with finding the optimal routes from one point in a graph to all others with respect to some objective function that we call the distance function. The SSSP problem was solved in 1959 by Edsger Dijkstra [1]. He proposed an algorithm that finds an optimal solution in O(n^2) time. Over the last two decades, as practical demands became increasingly hard, adaptations to the original algorithm were proposed. These algorithms achieved time bounds of O(m + n log n), which is a great improvement, but the biggest improvement to date is the invention of the Component Hierarchy approach by Mikkel Thorup [2]. For undirected graphs, his approach runs in linear time O(m).

This chapter gives a description of how Dijkstra's original algorithm solved shortest paths and an overview of the methods that have been proposed to speed up the calculation.

2.2 Dijkstra’s algorithm

Dijkstra’s algorithm works as follows. Starting from a vertex s, the starting point or source, all verticesare labeled with the distance it takes to get there from the source. This is done by first relaxing theoutgoing edges of s which means the end-points are labeled with a temporary label D(v). When a vertexis first given a label (that is not necessarily the correct, minimum label for that vertex) we call the vertexreached. The notation d(v) is used to describe the distance from s and thus also the label a vertex vwill eventually get. When it is certain that there is no shorter way of traveling to the vertex, the labelbecomes permanent and is referred to as d(v) and the vertex is called visited.A label is guaranteed to be optimal if it is the smallest temporary label, by Lemma 2.2.1. Vertices thatare permanently labeled are stored in S, meaning that for v ∈ S, d(v) = D(v).

Lemma 2.2.1. If v ∈ V \S minimizes D(v), then D(v) = d(v).

Proof. Suppose it is possible to get a temporary label D(v)* that is smaller than D(v) by taking a different, shorter path. Then the previous vertex u on this shortest path from s to v must have a label D(u)* ≤ D(v)* < D(v), and since D(v) was the minimum temporary label, u must have been permanently labeled, D(u) = d(u). But when u was labeled permanently, its outgoing edges were relaxed, so v would already have received the better label D(v)*, a contradiction.

The permanent labels spread out over the graph from the starting point s like a wavefront in water. The process of labeling can be seen in the example in figures 2.1 through 2.18 and is presented in Algorithm 1. In the algorithm, no distinction is made between temporary and permanent labels; both are stored in D. The difference can only be told by looking at S.

Fig. 2.1: Starting step in Table 2.1. This is the complete graph. Start in vertex s = i.

Fig. 2.2: Step 1 in Table 2.1. s is appointed a distance label d(s) = 0. The yellow dashed edges have been "relaxed", making the vertices at their end points "reached". Reached vertices get a temporary label D(v) containing the shortest distance found so far to reach this vertex from s.

Fig. 2.3: Step 2. At each step, a vertex with the smallest temporary label receives a permanent label (here d = 7) and its outgoing edges are relaxed. The edge that has to be traversed to get to the permanently labeled vertex is thick and green. Again, the yellow dashed edges have been "relaxed", making the vertices at their end points "reached".

Fig. 2.4: Step 3. Permanently label the vertex with the minimum temporary label D = 7, relax its outgoing edge.

Fig. 2.5: Step 4. Again, permanently label a vertex with the minimum temporary label, relax its outgoing edges. The vertical edge with length 1 reaches a vertex that already had a temporary label D = 8 which cannot be improved, so this vertical edge will not be used in any shortest paths.

Fig. 2.6: Step 5. And again, label and relax. The vertical edge with length 6 reaches a vertex that already had a temporary label D = 10 which cannot be improved, so this edge will not be used in any shortest paths. In the example, this means the edge will not become green and thick.

Fig. 2.7: Step 6. Label the only vertex with D = 9 and relax its outgoing edges.

Fig. 2.8: Step 7. ... and again for the only vertex with D = 10.

Fig. 2.9: Step 8. Choose a vertex with label 11; by alphabet, l comes first.

Fig. 2.10: Step 9. Label the only vertex with D = 11; again a relaxable edge has no effect, as the temporary label of o is already D = 13 and would not be improved by D = 14.

Fig. 2.11: Step 10. Label and relax.

Fig. 2.12: Step 11. Label and relax.

Fig. 2.13: Step 12. And again...

Fig. 2.14: Step 13. ... and again...

Fig. 2.15: Step 14. ... and again...

Fig. 2.16: Step 15. ... and again...

Fig. 2.17: Step 16. ... and again.

Fig. 2.18: Step 17. The last vertex is labeled permanently (with label d = 14). There are no more vertices with a temporary label, so Dijkstra terminates.

Algorithm 1 Dijkstra’s algorithm.

1: Input: requires a start vertex s and a distance matrix L where L(i, j) contains the distance from i to j. If there is no direct edge from i to j, L(i, j) = ∞
2: Output: returns distances D(v) from s to all other vertices v ∈ V
3: for every vertex v ∈ V do
4:   D(v) := ∞
5: end for
6: D(s) := 0
7: S := ∅
8: while there are vertices that have not been labeled permanently, V\S ≠ ∅ do
9:   u := argmin{D(v) | v ∈ V\S}
10:  S := S ∪ u
11:  for each edge (u, v) do
12:    if D(u) + L(u, v) < D(v) then
13:      D(v) := D(u) + L(u, v)
14:    end if
15:  end for
16: end while

A description of the algorithm is as follows. The permanently labeled vertices are stored in the set S. Permanently label the source s with distance d(s) = 0, since dist(s, s) = 0. Traverse the outgoing edges of s. Give each reached vertex v a temporary label D(v) containing the distance from the source; during the first round this is simply the length ℓ(s, v) of the edge from s to v. Find the vertex u with the smallest temporary label D(u) that has not been permanently labeled, D(u) = min_{v ∈ V\S} D(v), or denoted more compactly D(u) = min D(V\S). Since there is no shorter path to get to this vertex, its label becomes permanent. From u, traverse the outgoing edges. Set the label of a reached vertex v to min(D(u) + ℓ(u, v), D(v)). If v has not been reached yet, its temporary label is set to D(u) + ℓ(u, v).

After Dijkstra's algorithm is finished, all vertices are labeled with the shortest distance that needs to be traversed to get there from s. These quantities are used to calculate back from the end-point vertex called the sink or terminus t: the vertex x with an edge to t that satisfies d(x) = d(t) − ℓ(x, t) must be the previous vertex on a shortest path to t. Repeating this for internal vertices on the shortest path, the path can be recovered very efficiently.
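As an illustration, the following Matlab fragment is a minimal sketch of Algorithm 1 under the stated assumptions: L is an n-by-n matrix with L(i, j) the length of edge (i, j) and Inf where no edge exists, and s is the source. The function name dijkstraSimple and the predecessor array prev (added to support the path recovery just described) are illustrative choices and are not taken from the thesis appendix.

% Sketch of Algorithm 1. Assumes L(i,j) = edge length, Inf if no edge.
function [D, prev] = dijkstraSimple(L, s)
n = size(L, 1);
D = inf(1, n);          % temporary labels D(v) (lines 3-5)
D(s) = 0;               % line 6
prev = zeros(1, n);     % previous vertex on a shortest path (0 = unknown)
inS = false(1, n);      % inS(v) is true once v is permanently labeled
for step = 1:n
    Dtmp = D;  Dtmp(inS) = Inf;
    [dmin, u] = min(Dtmp);          % line 9: smallest temporary label
    if isinf(dmin), break; end      % remaining vertices are unreachable
    inS(u) = true;                  % line 10: label u permanently
    for v = find(isfinite(L(u, :))) % lines 11-15: relax outgoing edges of u
        if ~inS(v) && D(u) + L(u, v) < D(v)
            D(v) = D(u) + L(u, v);
            prev(v) = u;
        end
    end
end

Calling [D, prev] = dijkstraSimple(L, s) and repeatedly following prev from a sink t back to s reproduces the path-recovery step described above.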

2.2.1 Complexity of Dijkstra’s algorithm

There are two actions in Dijkstra's algorithm (Algorithm 1) that determine the complexity: looking up the smallest temporary label (line 9), and relaxing its outgoing edges (line 11). To find the smallest label (once), at most all n vertices have to be checked, which has O(n) complexity. This action has to be repeated at every step of the algorithm, until all vertices have been permanently labeled, giving O(n^2). All edges are relaxed once, adding O(m) to the complexity, so the simplest approach leads to an overall complexity of O(n^2 + m). Since O(m) is at worst equal to O(n^2), the overall complexity is O(n^2). Apparently, selection of the smallest temporary label is the bottleneck. It would be more efficient if, instead of scanning all candidates, we were able to keep track of the vertices ordered by their temporary labels.

Dial's algorithm (see [4]) does so by storing the vertices in buckets according to their temporary label, so bucket b contains all vertices with temporary label D(v) = b. The buckets can be implemented using a doubly linked list. Call C = max_{e∈E} ℓ(e). In the worst case the maximum temporary label that can be assigned is δ_0 = Σ_{e∈E} ℓ(e); since a shortest path uses at most n − 1 edges, labels are also bounded by nC, so the list must be at least that long. Initially the temporary labels are set to infinity, so the bucketing starts as a vertex is first reached and the first temporary label is assigned. Every time a vertex receives a new, improved temporary label, it is moved to the appropriate bucket. When looking for a vertex to label permanently, we do not have to search all temporary labels; instead we only look up the first non-empty bucket.


Tab. 2.1: Buckets are filled and emptied as the algorithm runs (buckets 1 through 6 stay empty and are omitted).

step  | 0 | 7   | 8     | 9 | 10 | 11  | 12    | 13    | 14
start | s |     |       |   |    |     |       |       |
1     |   | a,j |       |   |    |     |       |       |
2     |   | j   | b,c   |   |    |     |       |       |
3     |   |     | b,c   |   | k  |     |       |       |
4     |   |     | c     | d | k  |     |       |       |
5     |   |     |       | d | k  |     |       |       |
6     |   |     |       |   | k  |     | e,f   | o     |
7     |   |     |       |   |    | l,n | e,f   | o     |
8     |   |     |       |   |    | l   | e,f,m | o     |
9     |   |     |       |   |    |     | e,f,m | o     |
10    |   |     |       |   |    |     | f,m   | o,g   |
11    |   |     |       |   |    |     | m     | o,g,h |
12    |   |     |       |   |    |     |       | o,g,h |
13    |   |     |       |   |    |     |       | o,h   |
14    |   |     |       |   |    |     |       | o     |
15    |   |     |       |   |    |     |       |       | p,q
16    |   |     |       |   |    |     |       |       | q
17    |   |     |       |   |    |     |       |       |

As the vertices in this bucket are labeled permanently, their outgoing edges are relaxed, and the vertices that receive a lower temporary label are placed in the new appropriate bucket. Table 2.1 visualizes how the buckets are filled and emptied as the algorithm is executed. The vertices are denoted by the letters a, b, c, . . .
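To make the bucket bookkeeping concrete, here is a minimal Matlab sketch of Dial's variant under the same assumptions as before (integer edge lengths in an n-by-n matrix L, Inf where no edge exists). The stale-entry check and the function name dialSSSP are illustrative choices, not taken from [4] or from the thesis appendix.

% Sketch of Dial's bucket variant. Bucket b is stored in cell B{b+1}.
function D = dialSSSP(L, s)
n = size(L, 1);
C = max(L(isfinite(L) & L > 0));   % largest finite edge length
nb = n * C;                        % labels never exceed (n-1)*C <= n*C
B = cell(1, nb + 1);               % buckets 0..nb
D = inf(1, n);  D(s) = 0;
inS = false(1, n);
B{1} = s;                          % s starts in bucket 0
for b = 0:nb
    while ~isempty(B{b+1})
        u = B{b+1}(1);  B{b+1}(1) = [];
        if inS(u) || D(u) ~= b, continue; end   % stale entry, skip it
        inS(u) = true;                          % permanent: d(u) = b
        for v = find(isfinite(L(u, :)))         % relax outgoing edges
            if ~inS(v) && D(u) + L(u, v) < D(v)
                D(v) = D(u) + L(u, v);
                B{D(v) + 1}(end + 1) = v;       % drop v into its new bucket
            end
        end
    end
end

Instead of removing a vertex from its old bucket when its label improves, this sketch simply skips entries whose stored label no longer matches; either form of bookkeeping gives the O(m + nC) bound discussed below.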

Now the complexity depends not only on the number of vertices remaining, but also on the number of empty buckets we need to scan. For selection of a vertex to label permanently, we scan all buckets until we find a nonempty one. Assume that checking whether a bucket is empty can be done in constant time O(1). Suppose the first nonempty bucket is bucket k. Then all vertices in k have the minimum temporary label, and one by one they are removed from the bucket and labeled permanently, relaxing their outgoing edges and placing vertices that receive a lower temporary label in the appropriate bucket. After the bucket is emptied, resume scanning for a nonempty bucket. Assume that deleting an element from a bucket and adding one to a bucket can also be done in constant time. Then the algorithm spends O(1) time on one label update, giving O(m) time for all label updates. Scanning nC + 1 buckets takes O(nC) time, giving an overall complexity of O(m + nC). For memory efficiency purposes, the implementation can be adapted to use only C + 1 buckets, but that does not modify the complexity.

The complexity can be improved by using the heap data structure. The universal operations on a heap are:

create-heap(H): create an empty heap
find-min(i, H): return an object i of minimum key
insert(i, H): insert a new object i with a predefined key
decrease-key(value, i, H): reduce the key of object i from its current value to value; value must be smaller than before
delete-min(i, H): delete an object i of minimum key

Using a heap, Dijkstra's algorithm can be described as in Algorithm 2. Various types of heaps give different complexities:

1. Binary heap. A binary heap data structure requires only O(log n) for the operations insert, decrease-key and delete-min, and constant time for the other operations. That makes the complexity of Dijkstra implemented using a binary heap O(m log n).

2. d-Heap. Here d ≥ 2 is a parameter that influences which logarithm appears in the cost of the heap's operations. The d-heap requires O(log_d n) time to perform the insert and decrease-key operations, and O(d log_d n) time for delete-min. All other heap operations require only O(1).


Algorithm 2 Dijkstra’s algorithm using a heap. (See [4], page 115)

1: Input: requires a start vertex s and a distance matrix L
2: Output: returns distances D(v) from s to v ∈ V
3: for every vertex v ∈ V do
4:   D(v) := ∞
5: end for
6: D(s) := 0
7: S := ∅
8: while there are vertices that have not been labeled permanently, buckets ≠ ∅ do
9:   for all u ∈ smallest non-empty bucket do
10:    S := S ∪ u
11:    remove u from bucket
12:    for each edge (u, v) do
13:      if D(u) + L(u, v) < D(v) then
14:        D(v) := D(u) + L(u, v)
15:        place v in the correct bucket
16:      end if
17:    end for
18:  end for
19: end while

Delete-min is called once for all vertices, adding O(n·d·log_d n) to the complexity; insert and decrease-key are called at most m times, adding O(m log_d n), giving a total complexity of O(n·d·log_d n + m log_d n). This can be optimized by choosing a good d; a possible choice is sketched in the note after this list.

3. Fibonacci heap. A Fibonacci heap data structure requires only O(1) (amortized) time for all operations except delete-min, which takes O(log n). Every edge relaxation adds a constant amount of work, so the edge relaxations add O(m) to the complexity, and every vertex is deleted from the heap once, giving O(n log n), combining to O(m + n log n).
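A standard way to choose d for the d-heap, not made explicit here, is d = max(2, ⌈m/n⌉). For a connected graph n·d = O(m), so the total cost O(n·d·log_d n + m·log_d n) collapses to O(m·log_d n) = O(m log n / log(m/n)) whenever m ≥ 2n; for example, if m = Θ(n^(1+ε)) for a constant ε > 0, this is O(m). For sparse graphs (m = O(n)) the d-heap behaves like the binary heap.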

The overall impression the reader should have gotten is that it is not possible to achieve a linear-time algorithm based on the Dijkstra approach: Dijkstra's method visits the vertices in order of increasing distance, so it effectively sorts n numbers, and comparison-based sorting is known to require Ω(n log n) comparisons. Thus it will remain impossible to improve on these time bounds simply with Dijkstra-based methods.

2.3 Improvements preceding Component Hierarchies

For dense networks Dijkstra's algorithm works well, but for sparse networks improvements can be made that exploit this sparseness. Many improvements on this complexity have been suggested and implemented since 1959. When we are not interested in the shortest paths from source s to all other vertices in the graph, but only from one distinct s to one distinct t, Dijkstra can be aborted as soon as t is permanently labeled. For relatively short paths, this is a very intuitive adaptation of the original algorithm. Over the last twenty years various algorithms have been proposed that improve on Dijkstra's method in calculation speed. Some methods are based on searching a smart part of the graph, some require preprocessing.

The bidirectional Dijkstra approach assumes that both a starting point or source s and a destination or sink t are specified. Dijkstra is started from both s and t. Two spheres grow around these points, and eventually they will touch. Intuitively, one can see that this requires fewer edges to be relaxed than expanding only one sphere, if the edge density may be assumed to be proportional to the sphere's area.

A* is an adaptation where lower bounds are determined for the remaining path length from each vertex with a temporary label. Bad partial solutions need not be investigated further.

Highway Hierarchies is a method that requires preprocessing. Travelers will generally search on local roads near their start and end point, and try to use highways in between. These highways are roads with a higher speed limit, but have no special meaning in a mathematical sense. Unlike the edges that travelers would call highways, the preprocessing for Highway Hierarchies determines which edges are used in relatively long shortest paths, characterized by a neighborhood size H that is a parameter chosen in advance. The neighborhood of a vertex v is the set of vertices that are reached in the first H steps of Dijkstra's algorithm starting from v. All edges that are on shortest paths from some start s to some endpoint t and lie outside the neighborhood of both s and t are marked as highways. Queries need to be started simultaneously forward from s and backward from t. When during a query the path leaves the neighborhoods, only highway edges are used from then on. This drastically decreases the number of edges that needs to be relaxed. Preprocessing is done by starting Dijkstra from all vertices in the graph. The above approach is repeated for the ensuing highway network, which is called the first highway level. From this smaller network, the same method can be repeated to form highways of a second level. This can be repeated until the entire graph, on for instance level 6, consists of only H vertices.

These methods, however, are all still based on Dijkstra, and do not avoid sorting.


3. COMPONENT-FREE SKETCH OF THE IDEA

The main drawback of all previously mentioned Dijkstra-based algorithms was the necessity of searching for a minimum, causing a lower bound on the complexity of those algorithms. Thorup presented a way of avoiding this searching bottleneck by proposing a new method that is guaranteed to correctly identify vertices that have reached the state where D(v) = d(v). Instead of finding the absolute smallest D(v), it finds a selection of vertices that can all be labeled directly, in any order. This is intuitively possible because, as the original Dijkstra algorithm runs, vertices in several directions may be fit to be labeled permanently. But Dijkstra does not know how to find these specific vertices. In order to make use of this possibility, we need a criterion to correctly identify them. As can be seen in Lemma 3.0.1, in order to permanently label v it is not necessary for D(v) to be equal to the absolute smallest min D(V\S). It suffices to have a v ∈ V_i\S for which the temporary label is already within a small enough range from the absolute minimum to guarantee d(v) = D(v). Here, V_i is a subset sufficiently far away from the other reached vertices.

Lemma 3.0.1. Suppose the vertex set V can be divided into disjoint subsets V_1, . . . , V_k and that all edges between subsets have length at least δ. Further suppose for some i there is a v ∈ V_i\S for which D(v) = min D(V_i\S) ≤ min D(V\S) + δ. Then d(v) = D(v).

Proof. Let v ∈ V_i\S with D(v) = min D(V_i\S) ≤ min D(V\S) + δ. The fact that v has a label means that the previous vertex on a path to v has been labeled permanently; otherwise v would not have been reached. To prove that v can be labeled permanently, we need to show that this path is a shortest path, i.e., there is no shorter path from s to v than the one already found. Therefore it suffices to prove that there is no possibility to get to v quicker via another vertex. Suppose there is a vertex u that is the first vertex outside S on a shortest path from s to v. Then d(u) = D(u), since it was defined to be on a shortest path.

Suppose u ∈ V_i; then D(v) ≥ d(v) ≥ d(u) = D(u) ≥ min D(V_i\S) = D(v), proving d(v) = D(v).

Suppose u ∉ V_i. Since all edges from V\V_i to V_i are of length ≥ δ, we have D(v) ≥ d(v) ≥ d(u) + δ = D(u) + δ. And the condition from the lemma guarantees that D(u) + δ ≥ min D(V\S) + δ ≥ min D(V_i\S) = D(v), proving that d(v) = D(v) once again.

In the example in figure 3.1, the absolute minimum temporary label is D = 11, and it occurs twice in subset V_4. If we can find a temporary label D(v) that is the smallest in its subset and for which D(v) ≤ min D(V\S) + δ = 11 + 3 = 14, then that label can be made permanent as well. As luck would have it, subsets V_5 and V_6 contain such temporary labels. All these vertices can therefore be labeled permanently.

To guarantee that a vertex set V divides into disjoint subsets separated by edges of length ≥ δ, consider the following. Simply remove all edges with length ≥ δ. The maximal connected subgraphs become the disjoint subsets.

Based on Lemma 3.0.1, a first simple SSSP bucketing algorithm can be formalized. For the purpose of calculation speed, choose δ to be 2^α. This makes it possible to use operations like bitshift, which is a very fast way of multiplying or dividing by a power of 2, and it guarantees that for a graph with rather badly balanced edge lengths the number of buckets will not grow too fast. Write ⌊x/2^i⌋ as ⌊x⌋_i for more readable formulas. Thorup's original work [2] uses the notation x ≫ i, to indicate that it may be calculated by simply shifting the i least significant bits out to the right. Then min D(V_i\S) < min D(V\S) + δ with δ = 2^α is implied by ⌊min D(V_i\S)⌋_α ≤ ⌊min D(V\S)⌋_α. We take an array B of buckets where we store all integers i ∈ {1, . . . , k} corresponding with subsets V_i, according to ⌊min D(V_i\S)⌋_α, so integer i will be found in bucket B(⌊min D(V_i\S)⌋_α). Note that ⌊min D(V\S)⌋_α = min_i(⌊min D(V_i\S)⌋_α).

Fig. 3.1: Circles represent disjoint subsets; edges between subsets have length ≥ 3.

So, if i is in the lowest nonempty bucket, then ⌊min D(V_i\S)⌋_α = ⌊min D(V\S)⌋_α. Algorithm 3 shows how the component-free approach could be implemented. The complexity of this algorithm is O(m + ∆_α), plus the cost of storing min D(V_i\S) for each i. Here ∆_α = Σ_{e∈E} ⌊ℓ(e)⌋_α, and ∆_α can be kept small by choosing α large.


Algorithm 3 Component-free algorithm.

1: Input: requires a start vertex s, a distance matrix L and a partitioning of V into subsets V_1, . . . , V_k such that all edges between subsets have length ≥ 2^α
2: Output: returns distances D(v) from s to all other vertices v ∈ V
3: for every vertex v ∈ V do
4:   D(v) := ∞
5: end for
6: ∆_α := Σ_{e∈E} ⌊ℓ(e)⌋_α
7: for ix ← 0, 1, . . . , ∆_α, ∞ do
8:   B(ix) ← ∅
9: end for
10: D(s) := 0
11: S := ∅
12: B(0) ← s
13: for ix ← 0 to ∆_α do
14:   while B(ix) ≠ ∅ do
15:     pick i ∈ B(ix)
16:     pick v ∈ V_i\S minimising D(v)
17:     for all outgoing edges (v, w) ∈ E, w ∉ S do
18:       let V_j be the component containing w
19:       D(w) ← min{D(w), D(v) + ℓ(v, w)}
20:       if this decreases ⌊D(V_j\S)⌋_α then
21:         B(⌊D(w)⌋_α) ← j
22:       end if
23:     end for
24:     S := S ∪ v
25:   end while
26: end for
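A runnable Matlab version of Algorithm 3 might look as follows. It is a sketch under assumptions: L is the n-by-n length matrix (Inf where no edge), comp(v) gives the subset V_i containing v (for example computed as sketched above), and all edges between subsets have length ≥ 2^α. The explicit re-bucketing of subsets, which the pseudocode leaves implicit, and the generous bound on the number of buckets are choices made here, not prescribed by the thesis.

function D = componentFreeSSSP(L, s, comp, alpha)
% Sketch of Algorithm 3: bucket the subsets V_i by floor(min D(V_i\S)/2^alpha).
n = size(L, 1);
comp = comp(:)';                     % make sure comp is a row vector
scale = 2^alpha;
E = isfinite(L) & L > 0;
maxLabel = sum(L(E)) + max(L(E));    % generous upper bound on any label
nb = floor(maxLabel / scale);        % highest bucket index ever needed
B = cell(1, nb + 1);                 % buckets 0..nb, stored 1-based
cur = nan(1, max(comp));             % current bucket of each subset (NaN = none)
D = inf(1, n);  D(s) = 0;
inS = false(1, n);
B{1} = comp(s);  cur(comp(s)) = 0;   % the subset of s starts in bucket 0
for ix = 0:nb
    while ~isempty(B{ix+1})
        i = B{ix+1}(1);
        cand = find(comp == i & ~inS);          % unlabeled vertices of V_i
        [dmin, p] = min(D(cand));  v = cand(p); % v minimises D over V_i\S
        inS(v) = true;                          % label v permanently
        for w = find(isfinite(L(v, :)))         % relax outgoing edges of v
            if ~inS(w) && D(v) + L(v, w) < D(w)
                D(w) = D(v) + L(v, w);
                j = comp(w);
                bj = floor(min(D(comp == j & ~inS)) / scale);
                if isnan(cur(j)) || bj < cur(j)            % V_j moves down
                    if ~isnan(cur(j))
                        B{cur(j)+1}(B{cur(j)+1} == j) = [];
                    end
                    B{bj+1}(end+1) = j;  cur(j) = bj;
                end
            end
        end
        rest = D(comp == i & ~inS);             % re-bucket V_i itself
        if isempty(rest) || ~isfinite(min(rest))
            B{ix+1}(B{ix+1} == i) = [];  cur(i) = NaN;
        elseif floor(min(rest) / scale) > ix
            B{ix+1}(B{ix+1} == i) = [];
            cur(i) = floor(min(rest) / scale);
            B{cur(i)+1}(end+1) = i;
        end
    end
end

By Lemma 3.0.1 the labels made permanent here are already optimal, so the output coincides with that of plain Dijkstra; only the order in which vertices become permanent differs.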


4. COMPONENT HIERARCHY

This chapter describes how to use components to determine which vertices can be permanently labeled, and provides proof that this method works for both undirected and directed graphs.

4.1 General

First the necessary definitions and a description of what is done in preprocessing.

Definition 4.1.1. A component hierarchy is defined as follows. By G_i we denote the subgraph of G = (V, E) whose edge set consists of the edges e ∈ E with length ℓ(e) < 2^i. That means that G_0 is the collection of singleton vertices. Since the word length ω bounds the length of an edge to be < 2^ω, G_ω = G.

Definition 4.1.2. On level i in the component hierarchy, the components are the maximal connected subgraphs of G_i. (See Figure 4.1 for an example of how a graph is divided into components.) For directed graphs, use weakly connected components. Weakly connected components are the maximal connected subgraphs without respect for direction. This means that there is no guarantee that there is a path from every vertex in the component to every other vertex in the same component. In contrast, for strongly connected components we do have such a guarantee, but the Component Hierarchy method does not work on strongly connected components. A component on level i containing vertex v is denoted as [v]_i. Note that a component can be named after any vertex in the component.

Definition 4.1.3. An obvious way to represent these components and how they relate to each other is in a component tree T. Every component in the component hierarchy is represented by a node in the tree. The leaves of the tree are level 0 and represent the singleton vertices, since a component at level 0 allows only edges with length ℓ < 2^0 = 1 and there are no edges of zero length. On the highest level, the entire graph is in one component since all edges are allowed, so the root of the tree corresponds to the entire graph. We skip tree nodes where [v]_i = [v]_{i−1}; therefore there are no tree nodes with one child, and the number of tree nodes is ≤ 2n − 1. The parent of a tree node [v]_i is its nearest ancestor with at least two children in the component hierarchy. The component tree is uniquely defined by the graph.

The component tree for the component hierarchy in Figure 4.1 is given in Figure 4.2.

Building a component tree from a graph can be done in various ways. The Matlab implementation given in the appendix is slow, but will do for our application. In [3], Hagerup suggests a method for generating the component tree in linear time O(n + m). This method for constructing a component tree works for both directed and undirected graphs. It is as follows. Work up from the singleton vertices; those will become the leaves in the resulting component tree. Initially each of the n vertices in the original graph corresponds to a single-node tree. Call the collection of these trees a forest F. The trees in the forest merge to form the desired component tree by iterating the following procedure (a small sketch in Matlab follows the enumeration below). Start with a graph N_{−1} that contains all the vertices V but no edges. At each iteration j we

1. insert the edges of level j in the previous graph N_{j−1} to form a new graph N_j;

2. compute the weakly connected components of N_j;

3. contract the weakly connected components that contain more than one vertex to a single vertex.
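A small Matlab sketch of this iteration, using a union-find structure for the contraction bookkeeping, is given below. It only records, for every level, which component each original vertex belongs to; turning this into the component tree (and skipping levels where nothing merges) is left out. The function names componentLevels and findRoot are illustrative and differ from the componentTree routine in the thesis appendix.

function levels = componentLevels(L)
% levels(i+1, v) = representative vertex of the component of v at level i.
n = size(L, 1);
[I, J] = find(isfinite(L) & L > 0);          % edges; direction is ignored below
len = L(sub2ind([n n], I, J));
omega = floor(log2(max(len))) + 1;           % highest level: all edges < 2^omega
parent = 1:n;                                % union-find parent pointers
levels = zeros(omega + 1, n);
levels(1, :) = 1:n;                          % level 0: singleton components
for i = 1:omega
    for e = find(len < 2^i)'                 % edges allowed at level i
        ra = findRoot(parent, I(e));
        rb = findRoot(parent, J(e));
        if ra ~= rb, parent(rb) = ra; end    % contract: merge the two components
    end
    for v = 1:n
        levels(i+1, v) = findRoot(parent, v);
    end
end

function r = findRoot(parent, v)
% Follow parent pointers to the representative of v's component.
while parent(v) ~= v
    v = parent(v);
end
r = v;

Re-scanning every edge of length < 2^i at each level is wasteful compared to inserting only the new level-j edges, but it keeps the sketch short and does not change the result.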

Fig. 4.1: Circles represent the components. Components on level 0 are not drawn; these are the singleton vertices. Components V1, V2, V3 and V4 are level 1 (all edge lengths in the component < 2^1), components V5 and V6 are level 2 (all edge lengths in the component < 2^2), and the whole graph becomes connected at level 3.

The children of [v]_i are the components at level i − 1 containing one or more elements of [v]_i, so [w]_{i−1} is a child of [v]_i if [w]_i = [v]_i. Note that, of course, for a component containing more than one vertex, it does not matter which vertex it is named after.

Fig. 4.2: The tree corresponding to example 4.1.

Fig. 4.3: The complete graph.

Fig. 4.4: Each graph vertex corresponds to a single-node tree; the set of trees is called a forest F.

The components on the previous level are contracted to single nodes (which does not make any difference at level 0). Now all edges with length ℓ(e) < 2^1 = 2 are taken into account. The maximal connected subgraphs are the components at level i = 1.

Fig. 4.5: The circles represent components. In the next graph each component is contracted to a single vertex.

Fig. 4.6: The graph component corresponds to a new tree node. The children of the new tree node correspond to the elements of the graph component.

The elements of a component are contracted. The new graph is smaller, making it easier to search.

The contraction of weakly connected components corresponds to the merging of trees in the forest F by adding a shared parent. See illustrations 4.3 through 4.10.

Fig. 4.7: The contracted network shows new components.

Fig. 4.8: Contracting these new components corresponds to the above forest.

Fig. 4.9: There is only one component left; that means that the component tree is finished.

Fig. 4.10: The component containing the complete graph becomes the root of the tree. The tree is completed.

From the way components are formed, the following Lemma 4.1.4 follows.

Lemma 4.1.4. If [v]_i ≠ [w]_i, then dist(v, w) ≥ 2^i.

Proof. By definition of maximal connectedness, all edges from one ith-level component to another have length ≥ 2^i. Since [v]_i ≠ [w]_i, a path from [v]_i to [w]_i must contain at least one edge of length ≥ 2^i.

We will denote [v]_i\S as [v]_i^−. Note that [v]_i^− may be disconnected. Denote [v]_i ∩ S as [v]_i^+.

Definition 4.1.5. Define [v]_i to be a min-child of [v]_{i+1} if ⌊min D([v]_i^−)⌋_i = ⌊min D([v]_{i+1}^−)⌋_i. See also example 4.11. A min-child can be seen as a child containing the minimum of its parent, or a value close enough to the parent's minimum.

Fig. 4.11: V6 has minimum temporary label D = 9 and is of level 2. Its min-childs must therefore obey ⌊D(V6)⌋_1 = ⌊D(child)⌋_1. Here, ⌊D(V6)⌋_1 = ⌊9⌋_1 = 4. Only component V4, containing the absolute minimum, satisfies this condition. The highest-level component V7, of level 3, has minimum temporary label D = 9, meaning its min-childs must obey ⌊D(V7)⌋_2 = ⌊9⌋_2 = 2 = ⌊D(child)⌋_2. Both components V6 and V5 satisfy this condition. When looking for a minimal node, we find that V7 has min-childs V6 and V5. V6 has min-child V4; V4 has 2 min-childs, with D = 9. V5 has min-child V2; V2 has 2 min-childs with D = 11.

Let b be the maximal non-trivial level that occurs in the component tree, or equivalently let b be such that for all v, [v]_b = [v]_ω = G.

Definition 4.1.6. Define [v]_i to be minimal if [v]_i^− ≠ ∅ and for j = i, . . . , b − 1, [v]_j is a min-child of [v]_{j+1}. See also example 4.11.

The requirement that [v]_i^− ≠ ∅ is necessary for the situation where all vertices have been given a permanent label, V = S. If all vertices have been labeled permanently, min D([v]_i^−) = ∞ for all [v]_i, and all [v]_i would be minimal without this requirement. With the requirement, no components are minimal if V = S.

4.2 Proof of correctness of the Component Hierarchy method

Before we trust an algorithm to generate correct distance labels for an SSSP problem, we need to see proof that the method works. The following series of lemmas and proofs aims not only to prove correctness, but also to provide insight into how this is guaranteed. In order to achieve this, we follow the mathematical proof of Thorup in [2].

The concluding statement that needs to be proven is

Theorem 4.2.1. The only requirement to guarantee D(v) = d(v) is that [v]_0 is minimal:

[v]_0 minimal ⟹ D(v) = d(v).

To do this directly would give a rather unreadable proof, so the building blocks will be presented first, and in Lemma 4.2.6 they all come together.


Note that Dijkstra's original condition fits within the component hierarchy condition. If v ∈ V\S minimizes D, Dijkstra tells us that D(v) = d(v). But v ∈ V\S minimizing D(v) also means that [v]_0 is minimal, because if v ∈ V\S minimizes D(v), then for all higher-level components containing v we have ∀i : ⌊min D([v]_i^−)⌋_i = ⌊min D([v]_{i+1}^−)⌋_i = ⌊D(v)⌋_i.

A fact we will need further on is a lemma about vertices on a shortest path.

Lemma 4.2.2. Let there be a shortest path from s to t, and let x be a vertex on that path, such that the path is {s, . . . , p_prev, x, p_next, . . . , t}. Then the subpaths p_1 = {s, . . . , p_prev, x} and p_2 = {x, p_next, . . . , t} are shortest paths from respectively s to x and x to t.

Proof. Suppose paths p_1 or p_2 are not shortest paths from s to x and x to t, respectively. Then for one or both there must be a shorter route possible, which we will call p′_1 and p′_2. Then there would exist a shorter path from s to t, being either {p′_1, p_2} or {p_1, p′_2}. This is in contradiction with the assumption that {p_1, p_2} was a shortest path.

4.2.1 Why minimality suffices as condition

We start with an insight about minimality.

Lemma 4.2.3. If v ∉ S, [v]_i is minimal, and i ≤ j ≤ ω, then ⌊min D([v]_i^−)⌋_{j−1} = ⌊min D([v]_j^−)⌋_{j−1}.

Proof. When j = i, the equality is trivial. For j > i we have, inductively, ⌊min D([v]_i^−)⌋_{j−2} = ⌊min D([v]_{j−1}^−)⌋_{j−2}, which implies ⌊min D([v]_i^−)⌋_{j−1} = ⌊min D([v]_{j−1}^−)⌋_{j−1}. The minimality of [v]_i means that [v]_{j−1} is a min-child of [v]_j, so ⌊min D([v]_{j−1}^−)⌋_{j−1} = ⌊min D([v]_j^−)⌋_{j−1}. Combining the two equalities proves the claim.

Next, we try to understand the strength of components. By the definition of components, the following lemmas hold. These lemmas will be necessary for the next steps.

Lemma 4.2.4. Suppose v ∉ S and there is a shortest path from s to v on which the first vertex u outside S is in [v]_i. Then d(v) ≥ min D([v]_i^−).

Proof. Since u is the first vertex outside S on the shortest path to v, D(u) = d(u) by Lemma 4.2.2. Of course d(u) is a lower bound on the path length to v, so d(v) ≥ d(u) = D(u). Moreover, since u ∈ [v]_i^−, D(u) ≥ min D([v]_i^−).

Lemma 4.2.5. Suppose v ∉ S and let i be such that [v]_{i+1} is minimal. If there is no shortest path to v where the first vertex outside S is in [v]_i, then ⌊d(v)⌋_i > ⌊min D([v]_{i+1}^−)⌋_i.

Proof. Assume there is no shortest path to v where the first vertex outside S is in [v]_i. If there is more than one shortest path to v, pick one path P such that the first vertex u outside S is in [v]_k with k minimized. Then k > i since u ∉ [v]_i, and d(v) = ℓ(P) = D(u) + dist(u, v). The proof of the lemma is now by induction on ω − i, and is split up in two parts.

If u ∉ [v]_{i+1}, we have i + 1 < ω. We know by induction that ⌊d(v)⌋_{i+1} > ⌊min D([v]_{i+2}^−)⌋_{i+1}. Minimality of [v]_{i+1} gives ⌊min D([v]_{i+2}^−)⌋_{i+1} = ⌊min D([v]_{i+1}^−)⌋_{i+1}. Thus ⌊d(v)⌋_{i+1} > ⌊min D([v]_{i+1}^−)⌋_{i+1}, implying ⌊d(v)⌋_i > ⌊min D([v]_{i+1}^−)⌋_i.

If u ∈ [v]_{i+1}, then ⌊D(u)⌋_i ≥ ⌊min D([v]_{i+1}^−)⌋_i. Moreover, since u ∉ [v]_i, by Lemma 4.1.4 dist(u, v) ≥ 2^i. Hence ⌊d(v)⌋_i = ⌊D(u) + dist(u, v)⌋_i ≥ ⌊min D([v]_{i+1}^−)⌋_i + 1.

To understand Lemma 4.2.5, it should be helpful to consider the following. In order for d(v) to be in the same 2^i interval (or bucket) as min D([v]_{i+1}^−), it is necessary for the first vertex u outside S to be in [v]_i, since if u were to lie in a min-child [w]_i of [v]_{i+1}, the distance from [w]_i to [v]_i would be ≥ 2^i.

Now we can prove that the minimality of [v]_0 implies D(v) = d(v).


Lemma 4.2.6. If v ∉ S and [v]_i is minimal, then min D([v]_i^−) = min d([v]_i^−). In particular, D(v) = d(v) if [v]_0 = {v} is minimal.

Proof. Since D(w) ≥ d(w) for all w, min D([v]_i^−) ≥ min d([v]_i^−). Viewing v as an arbitrary vertex in [v]_i^−, it remains to show that d(v) ≥ min D([v]_i^−). If there is more than one shortest path to v, pick one P such that the first vertex u outside S is in [v]_i, if possible.

If this is possible, so u ∈ [v]_i, the claim follows directly from Lemma 4.2.4. If u ∉ [v]_i, Lemma 4.2.5 gives ⌊d(v)⌋_i > ⌊min D([v]_{i+1}^−)⌋_i. Since [v]_i is a min-child of [v]_{i+1}, ⌊min D([v]_{i+1}^−)⌋_i = ⌊min D([v]_i^−)⌋_i. Thus ⌊d(v)⌋_i > ⌊min D([v]_i^−)⌋_i, implying d(v) > min D([v]_i^−), which is even stronger than what we need.

This proves Theorem 4.2.1.

4.3 Queries: Traversing the Component Tree

An algorithm must be built that visits minimal vertices, which are components at level 0 ([v]_0 = {v}). The query starts from the root of the tree. For this component, determine the min-childs. To avoid searching, each component keeps its children in a bucket array B. Buckets are filled according to the minimal distance label the child component contains, and this quantity is scaled. Notation: B([v]_i, D([w]_h)) means that [w]_h is a child of [v]_i and that it is in [v]_i's bucket ⌊D([w]_h)⌋_{i−1}. This way, the min-childs are exactly those components in the minimum non-empty bucket! Recursively visit all components in this bucket. If [v]_i is not a direct child of [v]_{i+1} in the tree (but of, for instance, [v]_{i+2} or [v]_{i+3}), more than one bucket of [v]_i may contain minimal children. This can be understood as follows: if there were a component [v]_{i+1} in between, then the minimal bucket of [v]_{i+1} would have been larger, and all elements therein would have been acceptable as min-childs. In that case, a number of buckets of [v]_i is emptied. All components in the lowest non-empty bucket are minimal, and all minimal components will be in the lowest bucket before we stop, so all and only minimal leaves will be visited.

In [2] the algorithm is described as in Algorithms 4, 5 and 6. The main algorithm (Algorithm 4) is called, and it calls "Visit" (Algorithm 6). Initially only a tree and a start vertex are defined, meaning that there are no components in any buckets. Those are initialized by the "Expand" method in Algorithm 5. When a component is first visited (Algorithm 6), because its parent concluded that it was minimal, Expand (Algorithm 5) is called and the current component determines in which buckets it must keep its tree-children. Many of those may not yet have non-trivial distance labels, which means that the distance for such a component is ∞.

When a leaf is visited, the corresponding graph vertex is labeled permanently and its outgoing edges are relaxed as in Dijkstra's method. The reached vertices can hereby get an improved label; this can decrease the minimum distance of components they are members of. All components that see a decrease in their minimum distance label must be placed in the correct bucket of their parent. The bookkeeping is kept up to date at all times.


Algorithm 4 Finding Shortest path-lengths.

1: Input: requires a start vertex s, a distance matrix L where L(i, j) contains the distance from i to j, and a tree containing the component relations.
2: Output: returns distances D(v) from s to all other vertices v ∈ V
3: S ← {s}
4: D(s) ← 0, for all v ≠ s: D(v) ← ℓ(s, v)
5: Visit([s]_ω)
6: return D

Algorithm 5 Expand([v]_i)

1: Input: an unvisited component [v]_i
2: Output: allocates memory for the bucket array (assuming that a bucket array will not use more than the sum of all edge lengths in the component) and initializes them as empty buckets. For undirected graphs, this amount may not suffice and can be expanded. The children of [v]_i are put in the appropriate bucket; if a child has no temporary labels (other than ∞), one can choose for instance to put it in bucket −1 or max+1. In this algorithm, unlike the Matlab implementation, we assume that we only initialize buckets from ix_0, which is the lowest non-empty bucket.
3: ix_0([v]_i) ← ⌊min D([v]_i^−)⌋_{i−1}
4: ix_∞([v]_i) ← ix_0([v]_i) + ⌈Σ_{e∈[v]_i} ℓ(e)/2^{i−1}⌉
5: for q = ix_0([v]_i) to ix_∞([v]_i) do
6:   B([v]_i, q) ← ∅
7: end for
8: for all children [w]_h of [v]_i do
9:   bucket [w]_h in B([v]_i, ⌊min D([w]_h^−)⌋_{i−1})
10: end for
11: return ix_0([v]_i)

To show how the Component Hierarchy would work for the example we used before, see figures 4.12 through 4.35.


Algorithm 6 Visit([v]_i)

1: Input: a component [v]_i. Requires access to preliminary labels D, matrix L, tree T.
2: Goal: If [v]_i is a leaf: label it permanently, update the temporary labels at the end points of its outgoing edges, and update bookkeeping for the tree-ancestors of [v]_i. If [v]_i is an internal node, visit its min-childs.
3: Output: Distances are improved and made permanent in the process.
4: call [v]_j the tree-parent of [v]_i (j may be > i + 1)
5: if i = 0 then
6:   %%% [v]_0 is a leaf %%%
7:   S ← S ∪ {v}
8:   for all (v, w) ∈ E do
9:     if D(v) + ℓ(v, w) < D(w) then
10:      determine [w]_h = the highest unvisited ancestor of [w]_0 %%% this is the first component that is already in a bucket, so the first that may need updating %%%
11:      determine [w]_k = the (visited) parent of [w]_h in T
12:      if ⌊D(v) + ℓ(v, w)⌋_{k−1} < ⌊D([w]_h^−)⌋_{k−1} then
13:        move [w]_h from B([w]_k, ⌊D([w]_h^−)⌋_{k−1}) to B([w]_k, ⌊D(v) + ℓ(v, w)⌋_{k−1})
14:        if this decreases the distance of higher-level components then
15:          move those to the appropriate bucket as well
16:        end if
17:      end if
18:      D(w) ← D(v) + ℓ(v, w)
19:    end if
20:  end for
21:  remove [v]_i from the bucket B([v]_j, D([v]_i)) where its parent kept it
22:  return
23: else
24:  %%% [v]_i is an internal node %%%
25:  if [v]_i has not been visited before then
26:    Expand([v]_i) (this returns ix([v]_i))
27:  else
28:    ix([v]_i) ← the minimum non-empty bucket number
29:  end if
30:  %%% start visiting all min-childs of [v]_i %%%
31:  loopstart = ix([v]_i)
32:  loopend = (⌊ix([v]_i)⌋_{j−i} + 1) × 2^{j−i} %%% this gives exactly the slack caused by [v]_i possibly not being a direct child of [v]_j %%%
33:  for counter = loopstart to loopend do
34:    while B([v]_i, counter) ≠ ∅ (so until [v]_i^− = ∅ or ⌊ix([v]_i)⌋_{j−i} is increased) do
35:      let [w]_h ∈ B([v]_i, counter)
36:      Visit([w]_h) %%% visit all min-childs of [v]_i %%%
37:    end while
38:  end for
39:  %%% clean up after [v]_i itself %%%
40:  if [v]_i is not the root of T then
41:    remove [v]_i from the bucket where its parent kept it, B([v]_j, ·)
42:    if [v]_i^− ≠ ∅ then
43:      place [v]_i in B([v]_j, ⌊min D([v]_i^−)⌋_{j−1})
44:    end if
45:  end if
46: end if

Fig. 4.12: Start from vertex i. i is permanently labeled with d(i) = 0 and added to S, its outgoing edges are relaxed, and a and j receive a temporary label.

Fig. 4.13: The temporary labels of a and j mean that all higher-level components they are in also get this label (since it is the minimum temporary label). In an implementation, these numbers are recorded as ⌊D(v)⌋_{i−1}.

From the root, start visiting min-childs. This search is depth-first, so we drop to the lowest level before visiting all min-childs of the root! This means the first visited vertex is a.

a

b

c

de

f

g

h

i

j

k

l m

n

op

q

33 2

6

3

7

73

4

Fig. 4.14: a is labeled permanently, b and c arereached.

a

Sb8

c

8d

8

e f g h iS

j

7k l m n o p q

8 7

7

Fig. 4.15: The minima of components are updated,the left branch is no longer minimal sinceb7c

26= b8c

2, so from the root, we visit

the right branch of the tree, visiting itsonly minimal descendant j.

ab

c

de

f

g

h

i

j

k

l m

n

op

q

33 2

6

3

7

73

4

Fig. 4.16: j is labeled permanently, k is reached.

a

Sb8

c

8d

8

e f g h iS

j

Sk10

l m n

10

o p q

8 10

8

Fig. 4.17: Minima updated, the right branch of thetree is still minimal (b8c

2= b10c

2) so we

visit k next.

28

Page 29: Shortest path algorithms based on Component Hierarchies

a

b

c

de

f

g

h

i

j

k

l m

n

op

q

33 2

6

3

7

73

4

Fig. 4.18: k is labeled permanently, l and n arereached.

a

Sb8

c

8d

8

e f g h iS

j

SkS

l11

m n

11

11

o p q

8 11

8

Fig. 4.19: Update minima, right branch is still min-imal, visit both l and n.

ab

c

de

f

g

h

i

j

k

l m

n

op

q

33 2

6

3

7

73

4

Fig. 4.20: l and n visited, m and o reached.

a

Sb8

c

8d

8

e f g h iS

j

SkS

lS

m

12

n

S

12

o

14

p q

14

8 12

8

Fig. 4.21: Update, m gets label 12, and the rightbranch is no longer minimal. Visit b andc.

a

b

c

de

f

g

h

i

j

k

l m

n

op

q

33 2

6

3

7

73

4

Fig. 4.22: b and c visited, d reached.

a

SbS

c

Sd9

9

e f g h iS

j

SkS

lS

m

12

n

S

12

o

14

p q

14

9 12

9

Fig. 4.23: The only minimal vertex is d.

29

Page 30: Shortest path algorithms based on Component Hierarchies

ab

c

de

f

g

h

i

j

k

l m

n

op

q

33 2

6

3

7

73

4

Fig. 4.24: d visited, e and f reached and o im-proved.

a

SbS

c

SdS

e

12f12

g h

12

iS

j

SkS

lS

m

12

n

S

12

o

13

p q

13

12 12

12

Fig. 4.25: The left branch remains minimal, so visitits minimal elements, e and f .

a

b

c

de

f

g

h

i

j

k

l m

n

op

q

33 2

6

3

7

73

4

Fig. 4.26: e and f visited, g and h reached.

a

SbS

c

SdS

e

SfS

g

13h13

13

iS

j

SkS

lS

m

12

n

S

12

o

13

p q

13

13 12

12

Fig. 4.27: The left branch still remains minimal, sovisit its minimal elements, g and h.

a

b

c

de

f

g

h

i

j

k

l m

n

op

q

33 2

6

3

7

73

4

Fig. 4.28: g and h visited, no outgoing edges to re-lax.

a

SbS

c

SdS

e

SfS

g

ShS

iS

j

SkS

lS

m

12

n

S

12

o

13

p q

13

12

12

Fig. 4.29: The left branch is now empty, so visit theonly minimal element of the remainingright branch, m.

30

Page 31: Shortest path algorithms based on Component Hierarchies

ab

c

de

f

g

h

i

j

k

l m

n

op

q

33 2

6

3

7

73

4

Fig. 4.30: m visited, no outgoing edges to relax.


Fig. 4.31: Visit the only minimal element o.


Fig. 4.32: Visit o, reach p and q.


Fig. 4.33: Only two elements remain unvisited; these are both minimal, visit p and q.


Fig. 4.34: Visit p and q.


Fig. 4.35: And we’re done!


4.3.1 Correctness of Queries

We have seen in section 4.2.1 that in order to guarantee that a vertex [v]_0 can be labeled permanently, it only needs to be minimal. Now we need a guarantee that the algorithm selects all and only those vertices that are minimal. Lemma 4.3.1 shows that only minimal vertices are visited, and Lemma 4.3.2 provides the proof that all vertices are visited.

Lemma 4.3.1. If v ∉ S and [v]_i is not minimal, but [v]_{i+1} is minimal, then ⌊min d([v]_i^-)⌋_i > ⌊min D([v]_{i+1}^-)⌋_i.

Proof. Consider any w ∈ [v]_i^-. If there is no shortest path to w where the first vertex outside S is in [v]_i, then Lemma 4.2.5 says that ⌊d(w)⌋_i > ⌊min D([v]_{i+1}^-)⌋_i.
Otherwise, by Lemma 4.2.4, d(w) ≥ min D([v]_i^-). The non-minimality of [v]_i implies that ⌊min D([v]_i^-)⌋_i > ⌊min D([v]_{i+1}^-)⌋_i. Combining these leads to ⌊d(w)⌋_i > ⌊min D([v]_{i+1}^-)⌋_i for all w ∈ [v]_i^-.

From a certain component (parent) in the tree, we only visit components (tree children) that are in its lowest non-empty bucket, as long as the parent is minimal. By Lemma 4.3.1, only minimal components are in this bucket. Since singleton vertices are simply components on level 0, they too are visited only when minimal.

For the following lemma, remember that [v]_i^+ = [v]_i ∩ S.

Lemma 4.3.2. For all [v]_i, ⌊max d([v]_i^+)⌋_{i-1} ≤ ⌊min d([v]_i^-)⌋_{i-1}.

The maximal permanent label that has been assigned in component [v]_i so far is max d([v]_i^+). We show that within a component, the maximal permanent label is always in the same bucket as the minimum unlabeled vertex will be, or in a lower bucket. In other words, a bucket will always be emptied before labeling is started on the next bucket.

Proof. We continue to select elements from the lowest non-empty bucket (initially ⌊min d([v]_i^-)⌋_{i-1}) until the parent component is no longer minimal. [w]_0 is minimal just before being visited, so Lemma 4.2.3 gives ⌊D(w)⌋_{i-1} = ⌊min D([w]_0^-)⌋_{i-1} = ⌊min D([w]_i^-)⌋_{i-1}. Combined with Lemma 4.2.6, which says D(w) = d(w) and min D([w]_i^-) = min d([w]_i^-), we conclude that ⌊d(w)⌋_{i-1} = ⌊min d([v]_i^-)⌋_{i-1}. This holds every time a vertex w is visited.

From this lemma we can conclude that any component that contains vertices that have not been labeled permanently has to be in the same or a higher bucket than the permanent labels that have been assigned. In other words, it is impossible to skip a value for temporary labels and continue labeling in the next bucket. This translates to: if there are vertices in the graph that have not been permanently labeled, then the components they are in are in buckets that still need to be emptied. The graph is assumed to be connected, so all vertices will get a distance label < ∞. Since the algorithm does not terminate while there are still buckets to be emptied, it will continue to permanently label vertices until they are all permanently labeled.

4.4 Differences between Undirected and Directed Graphs

As seen, the approach for undirected and directed graphs is very similar. Where for undirected graphs there is no question about what should be defined as a component, for directed graphs we have the choice of strongly connected components or weakly connected components. Because we require the condition of Lemma 4.1.4 about distances between vertices in distinct components, we need to choose the weakly connected components.

Unfortunately, this means that the space that needs to be reserved for the buckets cannot be estimated (bounded) by the sum of the edge lengths within the component. This is a setback for the memory usage, and it affects complexity as well, since more buckets need to be scanned before a non-empty bucket is found. However, Hagerup [3] states that the complexity for queries on directed graphs is O(n + m log(ω)).
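A minimal sketch of this choice, using the transClosure routine of Appendix B.3 and assuming G is a sparse directed weight matrix: with strongly set to false the routine symmetrizes G internally, so the closure it returns describes the weakly connected components.

CG = transClosure(G, true, false);             % 'directed' = true, 'strongly' = false: weak closure
isWeaklyConnected = isempty(find(CG == 0, 1)); % a zero entry would mean two vertices lie in
                                               % different weakly connected components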


5. EXPERIMENTAL RESULTS

The implementation given in the appendix is meant to support the proof that the Component Hierarchy method generates correct answers. It is not optimized for speed or memory usage. Section 5.1 describes the calculation times for this implementation. In section 5.2, experimental results by others are described.

5.1 My Results

The map used is a TIGER/Line map [6]. The smallest map (with the fewest vertices) is Washington DC, with 9559 vertices and 14909 edges. Of this map I selected subgraphs of size at most n, where I chose n = {125, 250, 500, 1000, 2000, 4000}, by taking only the first n vertices of the map and the edges connecting only those n vertices.
The TIGER/Line data is undirected; I adapted the data to get a directed graph by deleting 20% of the reverse edges. This generally results in a graph that is not strongly connected. We use the TransClosure routine to find the largest strongly connected subgraph. For n = 125 this results in a graph of only 63 vertices.
The TIGER/Line data gives both spatial distance and travel time; I used only spatial distance as the edge weight function.
This approach gives undue attention to the vertices with lower indices, but a choice had to be made. The resulting graphs are appropriate enough to represent real road networks. All queries generated exact solutions, as checked with Dijkstra's algorithm.
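As an illustration of this preparation (a sketch only; the actual loading, including the deletion of 20% of the reverse edges, is done by readFiles in Appendix A.4, and Matrix and Coords are assumed to hold the full DC map):

n = 1000;                                    % requested number of vertices
Matrix = Matrix(1:n, 1:n);                   % keep only edges between the first n vertices
Coords = Coords(1:n, :);

closure = transClosure(Matrix, true, true);  % strongly connected closure (Appendix B.3)
[sz, index] = max(sum(closure));             % a vertex of the largest strongly connected part
selection = find(closure(index, :));         % all vertices mutually reachable with it
Matrix = Matrix(selection, selection);       % restrict the graph to that subgraph
Coords = Coords(selection, :);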

Figures 5.1 through 5.12 show the graphs and the resulting trees. The axes on the maps give the longitude and latitude. Showing all graphs as directed graphs takes too much space, therefore only the smallest graph is shown directed.


Fig. 5.1: map for 125

Fig. 5.2: tree for 125

The first 125 vertices of the DC map were selected and TransClosure was used to get the largest possible strongly connected subgraph. Apparently the first 125 vertices of the original DC graph did not form a connected region: the strongly connected subgraph is a graph of only 63 vertices and 167 edges.


Fig. 5.3: map for 250 Fig. 5.4: tree for 250

When we ask for the first 250 vertices, TransClosure returned a graph with 192 vertices and 535 edges.


Fig. 5.5: map for 500 Fig. 5.6: tree for 500

Asking for 500 vertices returned 477 vertices and 1358 edges.


Fig. 5.7: map for 1000 Fig. 5.8: tree for 1000

Here we asked for 1000 vertices; calculating the strongly connected subgraph returned 935 vertices and 2679 edges.


Fig. 5.9: map for n = 2000 Fig. 5.10: corresponding tree

The largest strongly connected subgraph for the first 2000 vertices is a graph with 1916 vertices and 5374 edges.


Fig. 5.11: map for n = 4000 Fig. 5.12: corresponding tree

Asking for 4000 vertices returned 3829 vertices and 11066 edges.


Tab. 5.1: Calculation results.

# vertices   # edges   calculation time (s)   order = n + m log(ω)   time/order   Dijkstra time (s)   Dijkstra time/n^2 (×10^-5)
        63       167                 0.2370                   1840     0.0001287              0.0070                       0.1764
       192       535                 1.0021                   5890     0.0001702              0.0086                       0.0233
       477      1358                 5.5405                  14940     0.0003710              0.0809                       0.0356
       935      2679                19.5294                  29460     0.0006630              0.1739                       0.0199
      1916      5374                81.3326                  59130     0.0013754              0.6322                       0.0172
      3829     11066               433.5346                 121650     0.0035639              2.3687                       0.0162

For all these graphs, component trees were determined in a preprocessing step. Since this is not part of the query, those calculation times are not reported. The calculation times for the queries are averaged over 100 distinct start vertices where possible; for the smallest graph the average is over 60 start vertices. The longest path in the largest graph has length 42058, so this value is used in Table 5.1 as ω. For completeness, the calculation times for Dijkstra's algorithm are also given.

5.2 Literature

The number of authors that extend the Component Hierarchy method is modest. The thesis "Shortest Path Problems and Experimental Results" by Joe Crobak [7] reports experimental results comparing a smart Component Hierarchy implementation with Dijkstra's and Goldberg's approach. For various types of graphs, the CH approach performs worse than Dijkstra and/or Goldberg. Only on graphs where the number of components remains small does the Component Hierarchy method score better.


6. DISCUSSION

Chapter 2, and mainly section 2.3, gives an overview of the methods that were available before Component Hierarchies. The Highway Hierarchy method in particular stands out in calculation speed.

Literature research showed that there was no guarantee that for directed graphs there exists a method that correctly solves single source shortest path problems in linear time O(m). The challenge was to prove that the Component Hierarchy method for undirected graphs could be extended to work on directed graphs.

To make this possible, it was necessary to clarify the proof for the undirected Component Hierarchy method and to set up an analogous proof for directed graphs. These two are intertwined in Chapter 4.

We may conclude that it is indeed proven that the Component Hierarchy method works on directed graphs. This conclusion is supported by a functioning Matlab implementation.

Dijkstra's algorithm approximately follows the quadratic behavior we expected. To determine whether the Component Hierarchy method is fit for implementation, a prerequisite is that it generates exact solutions. This has been proven, and indeed all experiments confirm correctness. To decide whether an implementation of Component Hierarchies meets our demands, some runs were performed and calculation times recorded, as shown in section 5.1.

These results do not match the order time bound; they even appear to be worse by a linear factor. Partially this could be excused by the fact that the real profit is made on large graphs. However, it is more likely that the implementation has a negative effect on the results. Memory use and the constant swapping of large matrices is very expensive and costs a disproportionate amount of time. Implementing the algorithm roughly the same way in a different programming language, where it is not necessary to move the matrix around, would probably make a lot of difference. Also, the buckets are implemented far from optimally, in a number of ways.

• As the calculation progresses, the index of the first non-empty bucket increases. Some buckets are even initialized with a lot of free space at the beginning. This not only costs memory, but makes searches slower as well.

• Currently, the buckets are implemented as a three-dimensional matrix containing a lot of nothing (sparsity is exploited), with only one 1 per column, indicating what bucket the component corresponding to the column is in.

• Their size is set directly at the beginning. This means that generally far too much space is allocated.

I have not been able to back up the statement by Hagerup [3] that the Component Hierarchy method for directed graphs takes O(n + m log(ω)). The Matlab implementation mainly shows that the method works.

6.1 Future improvements

As mentioned, the presented Matlab implementation is not optimal. Here are some points of interest that can be improved.

• Make it possible to perform queries on larger road networks.

• The preprocessing phase is currently quite unpolished. For starters, some quick gains can probably be made by polishing the current work.


• Preprocessing according to Hagerup's paper [3], based on first finding the minimum spanning tree. The minimum spanning tree with the original edge lengths has the same components as the complete graph (by definition of a minimum spanning tree). In a minimum spanning tree, however, components can be found much more efficiently.

• The amount of memory allocated for the buckets can be reduced greatly, by initially allocating an amount for instance equal to the sum of the edge lengths within the weakly connected component. If this turns out to be insufficient, allocate twice as much memory. This way, you never allocate more than twice as much memory as necessary, while keeping the number of times you need to allocate low (a sketch follows after this list).

• A Fibonacci heap seems to be the heap of choice for implementing the buckets.
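A minimal sketch of the doubling strategy mentioned above (a hypothetical helper; ensureBucketCapacity and neededIndex are illustrative names and not part of the current implementation):

function Buckets = ensureBucketCapacity(Buckets, component, neededIndex)
% Sketch: grow the bucket matrix of one component by doubling its number of
% rows until 'neededIndex' fits; at most twice the required space is ever
% allocated and the number of re-allocations stays logarithmic.
old = Buckets{component};
rows = max(size(old, 1), 1);
while rows < neededIndex
    rows = 2*rows;
end
if rows > size(old, 1)
    grown = sparse(rows, size(old, 2));   % new, larger all-zero bucket matrix
    grown(1:size(old, 1), :) = old;       % copy the existing bucket contents
    Buckets{component} = grown;
end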


7. CONCLUSION

The Component Hierarchy method works for both undirected and directed graphs. This is proven in Chapter 4 and supported by the fact that the Matlab implementation generates correct answers. However, my conclusion is that currently the Component Hierarchy method is not polished enough to be an alternative to, for example, the Highway Hierarchy method. Although the theoretical complexity for Component Hierarchies is better than for Highway Hierarchies, apparently the constants in the complexity order are disastrous.


APPENDIX


A. PREPARATION

The first few Matlab files are not necessary for performing a Component Hierarchy query, but generate random graphs, load real road networks, perform queries, or save the matrix and tree for future re-use. GraphToLatex and TreeToLatex generate LaTeX-readable files describing the graph or tree. show visualizes a graph and corresponding tree, and the optimality of a solution can be checked with Dijkstra's method.
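A typical way to drive these routines (a usage sketch only, assuming all files of this appendix are on the Matlab path and the TIGER/Line text files for the DC map are present):

% Random graph: 50 vertices, query from vertex 1, compare against Dijkstra,
% without writing LaTeX output.
[printable, err] = RandomQuery(1, 50, 10, true, false);

% Real road network: load the DC map, build the component tree and query from vertex 1.
[err2, Distances, Tree, Treei, Matrix, Coords] = RealQuery('DC', 1);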

A.1 ExampleGraph

function [matrix, coords] = ExampleGraph(n, maximum, directed)

% ExampleGraph(n, maximum) generates a random graph, with n nodes in
% the interval (0 to fieldwidth*maximum, 0 to fieldwidth*maximum). For
% this graph, coords contains the coordinates of the nodes, and matrix is
% a sparse matrix containing only those edges with euclidean distances
% smaller than maximum.

fieldwidth = 3;
coords = (fieldwidth*maximum/sqrt(2)).*rand(n,2);
matrix = sqrt((ones(n,1)*coords(:,1)'-coords(:,1)*ones(1,n)).^2 + (ones(n,1)*coords(:,2)'-coords(:,2)*ones(1,n)).^2);

matrix = round( matrix );
[x, y, v] = find( matrix );
for l = 1:length(v)
    if v(l) > maximum
        v(l) = 0;
    end
    if v(l) > maximum/2
        if rand > 0.9              % randomly delete about 10% of the longest edges
            v(l) = 0;
        end
    end
end
matrix = sparse(x, y, v, n, n);
if ~directed
    matrix = max(matrix, matrix');
end

test = transClosure( matrix, directed, 1 );
if ~isempty( find( test == 0 ) )   % check strong connectedness
    [matrix, coords] = ExampleGraph(n, maximum, directed);
end


A.2 DelaunayGraph

function [matrix, coords] = DelaunayGraph(n, interval, directed)

% [matrix, coords] = DelaunayGraph(n, interval, directed) generates a Delaunay
% triangulated graph with n nodes, on interval [0,interval]^2. If no
% interval is specified, interval is set to 30. If no boolean 'directed' is
% specified, generate an undirected graph.
% This generator is based on [5]

if nargin == 1
    interval = 30;
end
if nargin < 3
    directed = false;
end

coords = rand(n,2)*interval;
tri = delaunayn(coords,{'QJ'});
edges = [tri(:,1),tri(:,2);
         tri(:,2),tri(:,3);
         tri(:,3),tri(:,1)];
% remove repeated edges
edges = unique(sort(edges')','rows');
edges = [edges; edges(:,2),edges(:,1)];    % symmetrize
if directed
    selection = rand(1, length(edges));
    selection = find(selection > 0.3);     % randomly select 70% of the edges
    edges = edges(selection, :);
end
matrix = sparse(edges(:,1),edges(:,2),1,size(coords,1),size(coords,1));

edgeLengths = sqrt((ones(n,1)*coords(:,1)'-coords(:,1)*ones(1,n)).^2 + (ones(n,1)*coords(:,2)'-coords(:,2)*ones(1,n)).^2);
edgeLengths = round( edgeLengths );

matrix = edgeLengths .* matrix;

test = transClosure( matrix, directed, 1 );
if ~isempty( find( test == 0 ) )           % check strong connectedness
    matrixFailed = 1
    [matrix, coords] = DelaunayGraph(n, interval, directed);
end


A.3 Thesis example

function [G,x] = Thesis;

% [G,x] = Thesis returns the example graph used

a=1;  b=2;  c=3;  d=4;  e=5;  f=6;  g=7;  h=8;  i=9;
j=10; k=11; l=12; m=13; n=14; o=15; p=16; q=17;

G = sparse( 17, 17 );

G(a,b) = 1;
G(a,c) = 1;
G(b,c) = 1;
G(b,d) = 1;
G(c,d) = 1;
G(d,e) = 3;
G(d,f) = 3;
G(e,f) = 1;
G(e,g) = 1;
G(f,h) = 1;
G(g,h) = 1;
G(c,k) = 6;
G(k,n) = 1;
G(k,l) = 1;
G(l,m) = 1;
G(n,m) = 1;
G(j,k) = 3;
G(i,a) = 7;
G(i,j) = 7;
G(n,o) = 3;
G(d,o) = 4;
G(o,p) = 1;
G(o,q) = 1;
G(p,q) = 1;

x(a,:) = [0 0];
x(b,:) = [0.866 0.5];
x(c,:) = [0.866 -0.5];
x(d,:) = [1.732 0];
x(e,:) = [4.6900 0.5];
x(f,:) = [4.6900 -0.5];
x(g,:) = [5.556 1];
x(h,:) = [5.556 -1];
x(i,:) = [-6.3245 -3];
x(j,:) = [-1.6172 -8.1808];
x(k,:) = [0.866 -6.5];
x(l,:) = [0.866 -7.5];
x(m,:) = [1.866 -7.5];
x(n,:) = [1.866 -6.5];
x(o,:) = [3.1496 -3.7885];
x(p,:) = [4.0156 -4.2885];
x(q,:) = [4.0156 -3.2885];

G = G + G';   % symmetric matrix


A.4 Real network

function [matrix, coords] = readFiles(map, max)

if nargin < 2
    max = inf;
    if nargin < 1
        map = 'DC';
    end
end
filenamecoords = strcat( 'coords' , map, '.txt');
filenamematrix = strcat( 'matrix' , map, '.txt');

coordsfile = fopen(filenamecoords,'rt');
if (coordsfile < 0)
    error('could not open coords file');
end;
longVector = fscanf(coordsfile,'%f');
fclose(coordsfile);

if max > length(longVector)/3
    max = round(length(longVector)/3);
end

for i = 1:max;
    coords(i,1) = longVector(3*i-1);
    coords(i,2) = longVector(3*i);
end
matrix = sparse(max, max);

matrixfile = fopen(filenamematrix,'rt');
if (matrixfile < 0)
    error('could not open matrix file');
end;
longVector = fscanf(matrixfile,'%f');
fclose(matrixfile);

for i = 1:length(longVector)/5;
    from = longVector(5*i-4)+1;
    to = longVector(5*i-3)+1;
    if from <= max && to <= max
        dist = longVector(5*i-2);
        matrix(from, to) = round(dist);
        if mod(i,5) ~= 0                   % leave out 20% of the back-edges
            matrix(to, from) = round(dist * 1.05);
        end
        if matrix(from, to) == 0
            matrix(from, to) = 1;
        end
    end
end

mytime = fix(clock);
output = strcat('matrix loaded ', int2str(mytime(1)), '-', int2str(mytime(2)), '-', int2str(mytime(3)), ' on ', int2str(mytime(4)), ':', int2str(mytime(5)), ':', int2str(mytime(6)))

% this matrix may not be strongly connected. Correct this here.
closure = transClosure( matrix, 1, 1 );
mytime = fix(clock);
output = strcat('transclosure finished ', int2str(mytime(1)), '-', int2str(mytime(2)), '-', int2str(mytime(3)), ' on ', int2str(mytime(4)), ':', int2str(mytime(5)), ':', int2str(mytime(6)))

if ~isempty( find( closure == 0 ) )
    [size, index] = max(sum(closure));
    selection = find(closure(index,:));
    matrix = matrix(selection, :);
    matrix = matrix(:, selection);
    coords = coords(selection, :);
end


A.5 RandomQuery

function [printable, error] = RandomQuery(startnode, n, maximum, directed, write)

% [printable, error] = RandomQuery(startnode, n, maximum, directed, write)
% creates a random graph, computes the tree, does queries,
% computes exact solution using Dijkstra, calculates the error, writes the
% matrix to a latex-readable file if write is true, and shows the graph
% and tree.

if nargin < 5
    write = false;
    if nargin < 4
        directed = false;
    end
end

if nargin == 1
    [Matrix, Coords] = Thesis;
    n = 17;
    maximum = 8;
else
    [Matrix, Coords] = ExampleGraph(n, maximum, directed);
    % [Matrix, Coords] = DelaunayGraph(n, maximum, directed);
end

[Tree, Treei] = componentTree(Matrix);

if write
    GraphToLatex(Matrix, Coords);
    TreeToLatex(Tree, Treei);
end
saveMatrix(Matrix, Coords);

[Distances, S, printable] = ComponentHierQuery(startnode, Matrix, Tree, Treei, write);
Distances2 = Dijkstra( Matrix, startnode );
error = Distances - Distances2;
error = error*error';

show(Matrix, Coords, Tree, Treei)


A.6 RealQuery

function [error, Distances, Tree, Treei, Matrix, Coords, tijd] = RealQuery(map, startnode)

% [error, Distances, Tree, Treei, Matrix, Coords, tijd] = RealQuery(map, startnode)
% reads the required map, computes the tree, does queries, computes exact solution
% using Dijkstra, calculates the error and shows the graph and tree.

if nargin < 2
    startnode = 1;
    if nargin < 1
        map = 'Thesis';
    end
end

tijd = fix(clock);
output = strcat('Process started on ', int2str(tijd(1)), '-', int2str(tijd(2)), '-', int2str(tijd(3)), ' on ', int2str(tijd(4)), ':', int2str(tijd(5)), ':', int2str(tijd(6)), '. Start reading network')

[Matrix, Coords] = readFiles(map);

tijd = fix(clock);
output = strcat('Network read on ', int2str(tijd(1)), '-', int2str(tijd(2)), '-', int2str(tijd(3)), ' on ', int2str(tijd(4)), ':', int2str(tijd(5)), ':', int2str(tijd(6)), '. Start calculation component tree')

[Tree, Treei] = componentTree(Matrix);

tijd = fix(clock);
output = strcat('Tree finished on ', int2str(tijd(1)), '-', int2str(tijd(2)), '-', int2str(tijd(3)), ' on ', int2str(tijd(4)), ':', int2str(tijd(5)), ':', int2str(tijd(6)), '. Start query Component Hierarchy')

[Distances, S] = ComponentHierQuery(startnode, Matrix, Tree, Treei);

tijd = fix(clock);
output = strcat('Query finished on ', int2str(tijd(1)), '-', int2str(tijd(2)), '-', int2str(tijd(3)), ' on ', int2str(tijd(4)), ':', int2str(tijd(5)), ':', int2str(tijd(6)), '. Calculate exact solutions using Dijkstra')

Distances2 = Dijkstra( Matrix, startnode );

tijd = fix(clock);
output = strcat('Dijkstra finished on ', int2str(tijd(1)), '-', int2str(tijd(2)), '-', int2str(tijd(3)), ' on ', int2str(tijd(4)), ':', int2str(tijd(5)), ':', int2str(tijd(6)))

error = Distances - Distances2;
error = error*error';

show(Matrix, Coords, Tree, Treei)


A.7 saveMatrix

function saveMatrix(Matrix, coords)

% saveMatrix(Matrix, coords)
% Make a matlab readable outputfile Matrix.m, containing the matrix and
% coords for future re-use.

n = squareSize(Matrix);
filename = strcat('Matrix' , int2str(n) , '.m');
fileId = fopen(filename, 'wt');
[x,y,s] = find(Matrix);
fprintf(fileId, strcat('function [A, coords] = Matrix', int2str(n), ' \n \n'));
fprintf(fileId, strcat('%% [A, coords] = Matrix', int2str(n), ' returns the last saved matrix with corresponding coords \n'));
for i = 1:length(x)
    fprintf(fileId, 'A(%g, %g) = %g; \n', x(i), y(i), s(i));
end
fprintf(fileId, 'A = sparse(A); \n');
for i = 1:n
    fprintf(fileId,'coords(%g, %g) = %g; \n', i, 1, coords(i, 1));
    fprintf(fileId,'coords(%g, %g) = %g; \n', i, 2, coords(i, 2));
end

fclose(fileId);


A.8 saveTree

function saveTree(Tree, Treei, max)

% saveTree(Tree, Treei)
% make a matlab readable outputfile Tree.m, containing the Tree and Treei
% for future re-use.

if nargin < 3
    max = '';
else
    max = int2str(max);
end
filename = strcat('Tree' , max , '.m');
fileId = fopen(filename, 'wt');
fprintf(fileId, strcat('function [Tree, Treei] = Tree' , max , ' \n \n'));
fprintf(fileId, strcat('%% [Tree, Treei] = Tree' , max , ' returns the last saved Tree with corresponding Treei \n'));
fprintf(fileId, 'Tree = zeros(1,%g); \n', length(Tree));
for i = 1:length(Tree)
    fprintf(fileId, 'Tree(%g) = %g; \n', i, Tree(i));
end
fprintf(fileId, 'Treei = zeros(1,%g); \n', length(Treei));
for i = 1:length(Treei)
    fprintf(fileId, 'Treei(%g) = %g; \n', i, Treei(i));
end

fclose(fileId);


A.9 GraphToLatex

function GraphToLatex(Matrix, coords)

% GraphToLatex(Matrix, coords)
% Make a LaTeX integrable outputfile Graph.tex, containing the code for
% a graph corresponding with the matrix and coords,
% and a Matrix.m containing the matrix and coords in a matlab-readable
% file for future re-use.

minx = min(coords(:,1));
maxx = max(coords(:,1));
miny = min(coords(:,2));
maxy = max(coords(:,2));
n = squareSize(Matrix);
offset(1,1) = 30/(maxx-minx);
offset(1,2) = -20/(maxy-miny);

% first, write the graph-vertices
fileId = fopen('Graph.tex', 'wt');
fprintf(fileId, '\\begin{figure} \n');
fprintf(fileId, '\\psset{unit= %g cm} \n', 14*0.88/(maxx-minx));
fprintf(fileId, '\\begin{pspicture}(%g, %g)(%g, %g) \n', minx, miny, maxx, maxy);
for i = 1:n
    fprintf(fileId, '\\cnodeput(%g,%g){node%g}{%g} \n', coords(i,:), i, i);
end
[x,y,s] = find(Matrix);

% for symmetric matrix
if Matrix == Matrix'
    for i = 1:length(x);
        fprintf(fileId, '\\ncline{node%g}{node%g} \\mput*{%g}\n', x(i), y(i), s(i));
    end
else
    fprintf(fileId, '\\psset{nodesep=1pt}\n');
    for i = 1:length(x);
        fprintf(fileId, '\\ncarc{->}{node%g}{node%g} \\mput*{%g}\n', x(i), y(i), s(i));
    end
end
fprintf(fileId, '\\end{pspicture}\n');
fprintf(fileId, '\\caption{ include caption } \\label{ include label }\n');
fprintf(fileId, '\\end{figure} \n');
fprintf(fileId, '\\ \\\\ \n');

fclose(fileId);

% then, save the matrix
saveMatrix(Matrix, coords);


A.10 TreeToLatex

function TreeToLatex(Tree, Treei, Treeinfo)

% TreeToLatex(Tree, Treei, Treeinfo) writes a file Tree.tex, containing
% the LaTeX code for displaying the tree (and with correct levels)
% Tree and Treei are obligatory, Treeinfo is what is used for output.

nTree = length(Tree);
baseNodes = [];
baseNodes = GoToChild(Tree, Treei, nTree, baseNodes);
n = length(baseNodes);
place = zeros(1,nTree);        % contains the x position, retrieve level from Treei
place(baseNodes) = 1:n;        % place baseNodes in correct order

for i = (n+1):nTree
    children = find(Tree == i);
    place(i) = sum(place(children))/length(children);
end

fileName = 'Tree';
if nargin > 2
    for i = 1:nTree
        fileName = strcat(fileName , int2str(Treeinfo(i)) );
    end
end
fileName = strcat(fileName, '.tex');
largerPicture = 3*(nargin - 2);   % to stretch picture, so more info can be added at the bottom
fileId = fopen(fileName, 'wt');
fprintf(fileId, '\\begin{figure}[htbp] \n');
fprintf(fileId, '\\psset{unit= %g cm} \n', 12/(n+2));
fprintf(fileId, '\\begin{pspicture}(%g, %g)(%g, %g) \n', -1, -1-largerPicture, n+1, nTree+1);

% to get a tree that feels a bit balanced, a factor between height and width is necessary
heightfactor = 2.5;
% place nodes
for i = 1:nTree;
    if i > n
        fprintf(fileId, '\\cnodeput(%g,%g){node%g}{c%g} \n', place(i), Treei(i)*heightfactor, i, i-n);
    else
        fprintf(fileId, '\\cnodeput(%g,%g){node%g}{%g} \n', place(i), Treei(i)*heightfactor, i, i);
    end
end

% place edges
for i = 1:nTree-1
    fprintf(fileId, '\\ncline{node%g}{node%g} \n', i, Tree(i));
end

% place labels next to nodes
if nargin > 2
    for i = 1:nTree
        if i > n
            fprintf(fileId, '\\rput(%g,%g){%g} \n', place(i)+1, Treei(i)*heightfactor, Treeinfo(i));
        else
            fprintf(fileId, '\\rput(%g,%g){%g} \n', place(i), -heightfactor/2, Treeinfo(i));
        end
    end
end

fprintf(fileId, '\\end{pspicture}\n');
fprintf(fileId, '\\caption{ include caption } \\label{ include label }\n');
fprintf(fileId, '\\end{figure} \n');
fprintf(fileId, '\\ \\\\ \n');

fclose(fileId);


A.11 show

function show(matrix, coords, Tree, Treei)

% show(matrix, coords) Of this graph, a component tree is built and both
% the graph and the component tree are shown in figures
% show(matrix, coords, Tree, Treei) also plots the graph and the tree, but
% assumes the tree is input.

close all;

if nargin == 0
    [matrix, coords] = DelaunayGraph(20);
end

if nargin < 3
    [Tree, Treei] = componentTree(matrix);
end

% show USA coordinates as they are
coords(:,1) = abs(coords(:,1));

figure(1);
hold on
gplot( matrix, coords, 'o-k');
gplot( subGraph ( matrix, 16), coords, 'b');
gplot( subGraph ( matrix, 8), coords, 'r');
gplot( subGraph ( matrix, 4), coords, 'g');
gplot( subGraph ( matrix, 2), coords, 'm');
hold off

figure(2);
treeplot(Tree);
axis off


A.12 Dijkstra

function [Distances, elapsedTime] = Dijkstra( Matrix, startnode )

% [Distances] = Dijkstra( Matrix, startnode ) returns the distances to the
% n nodes that the matrix connects, starting from the startnode. Works on
% directed graphs as well.

n = squareSize(Matrix);

Distances = Inf*ones(1,n);
Distances(startnode) = 0;
S = ones(1,n);    % S(v) becomes inf when v is permanently labeled

t = clock;
for i = 1:n
    minimum = min(Distances.*S);
    fixed = find(Distances.*S == minimum);   % which labels become permanent
    fixed = fixed(1);                        % not very pretty, but easy, just take the first
    S(fixed) = Inf;                          % make this one permanent
    Outgoing = find(Matrix(fixed, :));       % determine non-zero outgoing edges
    for j = Outgoing
        if Distances(fixed) + Matrix(fixed,j) < Distances(j)
            Distances(j) = Distances(fixed) + Matrix(fixed,j);
        end
    end
end

elapsedTime = etime(clock,t);


B. BUILDING THE COMPONENT TREE

The component tree is built by componentTree; the other files are helper routines needed by this method.

B.1 componentTree

function [Tree, Treei] = componentTree( G, strongly )

% [Tree, Treei] = componentTree( G ) returns a component tree after
% receiving input graph G. Tree contains the plottable tree-components,
% Treei contains the levels of these components.

n = squareSize(G);                 % size of the matrix, n x n

if nargin < 2
    strongly = false;
end

if sum(sum( (G - G').^2)) == 0
    directed = false;
else
    directed = true;
end

[i,j,c] = find(G);
C = max(c);
last = n;
comp = 1:n;
Tree = sparse([], [], [], 1, 2*n);
Treei = sparse([], [], [], 1, 2*n);

CG = transClosure( G, directed, strongly );
if ~isempty( find( CG == 0 ) )
    error( 'Error, the graph is not (strongly or weakly) connected' );
end

complete = false;
l = 1;
while ~complete                    % for all levels of the resulting tree (at most floor(log2(C))+1)
    d = 2^l;
    CG = subGraph( G, d );
    CG = transClosure( CG, directed, strongly );
    if isempty( find( CG == 0 ) )
        complete = true;
    end
    idx = 1:n;
    while idx
        i = find( CG(idx(1),:) );
        % with what nodes is i in a component? Save in the tree on the
        % positions of those nodes, that they point to the same component
        % 'last', that is thereby declared.
        components = unique(comp(i));
        if components(2:end)
            last = last + 1;
            idx = setdiff( idx, i );
            Tree(components) = last;
            comp( i ) = last;
            Tree( last ) = 0;
            Treei( last ) = l;
        else
            idx = idx(2:end);
        end
    end
    l = l+1;
end
Tree = Tree(1:last);
Treei = Treei(1:last);


B.2 squareSize

function n = squareSize(G)
% Returns size n of an n by n-matrix, error if the matrix is not square.

n = size( G );
if n(1) ~= n(2)
    error( 'Please give me a square matrix!' );
end

n = n(1);


B.3 transClosure

function G = transClosure(G, directed, strongly)

% G = transClosure(G, directed, strongly) returns the transitive closure,
% meaning all possible connectable points are connected. For undirected
% graphs, this is enough to form components. For directed graphs, the
% choice must be made to return weakly connected components.
%
% This result can also be used to determine the largest strongly
% connected subgraph for real road networks.

n = squareSize(G);

if directed && ~strongly
    G = G + transpose(G);
end

G = G + eye(n);
s = round(sqrt(n)+1)+1;
for i = 1:s
    G = G*G;
    G(G>0.5) = 1;
end

if directed && strongly
    G = G + G';
    G(G<2) = 0;
    G(G>0) = 1;
end


B.4 subGraph

function G = subGraph( G, d )

% G = subGraph(G, d) returns only the elements of the graph G that are smaller
% than d in a sparse matrix

n = squareSize(G);

[i,j,c] = find(G);
idx = c < d;
i = i(idx);
j = j(idx);
c = c(idx);

G = sparse(i,j,c,n,n);


C. COMPONENT HIERARCHY QUERY

ComponentHierQuery performs a single source shortest path calculation; Visit and, to a lesser extent, Expand do the main part of the work.
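As a minimal end-to-end sketch of the query phase (using the example graph of Appendix A.3; vertex 9 is vertex i, the start vertex of the walkthrough in Figures 4.12 through 4.35):

[G, x] = Thesis;                                          % example graph of Chapter 4
[Tree, Treei] = componentTree(G);                         % preprocessing: build the component tree
[Distances, S] = ComponentHierQuery(9, G, Tree, Treei);   % query from vertex i (index 9)
check = Dijkstra(G, 9);                                   % exact reference solution
maxDeviation = max(abs(Distances - check));               % should be zero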

C.1 ComponentHierQuery

function [Distances, S, printable, elapsed_time_in_CHQuery] = ComponentHierQuery(startnode, Matrix, Tree, Treei, Write)
% Main function
% [Distances, S] = ComponentHierQuery(startnode, Matrix, Tree, Treei)
% requires as input a connection matrix Matrix, a tree Tree and a startnode
% (between 1 and n) and returns a vector with distances from s to all nodes.

if nargin < 5
    Write = false;
    if nargin < 3
        if nargin < 2
            [Matrix, x] = Thesis;
        end
        [Tree, Treei] = componentTree(Matrix);
    end
end

mytime = fix(clock);
output = strcat('Query started on ', int2str(mytime(1)), '-', int2str(mytime(2)), '-', int2str(mytime(3)), ' on ', int2str(mytime(4)), ':', int2str(mytime(5)), ':', int2str(mytime(6)), '. (inside query now)')

n = squareSize(Matrix);
nTree = length(Tree);

Distances(1:n) = Inf;      % Distances(v) will contain distance from startnode to v
S = zeros(1,nTree);        % S(v)=0 means v is not in S, S(v)=1 means v is in S
Buckets = cell(1,nTree);   % Buckets{v} will contain a matrix telling us in what bucket the children of v are found

% fill 'Children' such that Children(component) contains a vector with all leaves of that component
% so if 'component' is itself a leaf, then Children(component) = [component]
Children(1:nTree) = struct('children', []);
for i = 1:n
    Children(i).children = i;
    parent = Tree(i);
    while parent ~= 0
        Children(parent).children = [ Children(parent).children , i];
        parent = Tree(parent);
    end
end

t = clock;
% permanently label the startnode
Distances(startnode) = 0;
S(startnode) = 1;

% appoint temporary labels to the nodes that can be reached from the node
% that we just labeled permanently
Outgoing = find(Matrix(startnode, :));
Distances(Outgoing) = Matrix(startnode, Outgoing);

% main recursive call:
% Visit is called for the root of the component tree, nTree
[Distances, Buckets, S] = Visit(Matrix, Tree, Treei, Children, Distances, S, Buckets, nTree, Write);

elapsed_time_in_CHQuery = etime(clock,t);
mytime = fix(clock);
output = strcat('Process finished on ', int2str(mytime(1)), '-', int2str(mytime(2)), '-', int2str(mytime(3)), ' on ', int2str(mytime(4)), ':', int2str(mytime(5)), ':', int2str(mytime(6)), '. (leaving query)')


C.2 Visit

function [Distances, Buckets, S] = Visit(Matrix, Tree, Treei, Children, Distances, S, Buckets, component, Write)

% Visit has two very separate parts
% if Visit is called for an internal tree-node, its minchilds are visited
% if Visit is called for a leaf, apparently this leaf is minimal and
% therefore permanently labeled.

if Treei(component) == 0
    % register as permanently labeled
    S(component) = 1;

    % update all relevant tree-ancestors
    parent = component;
    while Tree(parent) ~= 0
        child = parent;
        parent = Tree(child);
        % remove component from higher bucket, either permanently or replace it later
        Buckets{parent}(:,child) = 0;
        if ~isempty(Buckets{child})
            % smallest bucket in use, correction of 1
            newInterval = min(find(sum(Buckets{child} , 2))) - 1;
            if newInterval < size(Buckets{child},1)
                newInterval = nicefloor(newInterval, 2^(Treei(parent) - Treei(child)));
            else
                newInterval = size(Buckets{parent},1) - 1;
            end
            % and place back on correct position
            Buckets{parent}(newInterval+1, child) = 1;
        end
    end

    % generate a tree for latex
    if Write
        Treeinfo = Distances;
        Treeinfo(length(Tree)) = 0;
        TreeToLatex(Tree, Treei, Treeinfo);
        clear Treeinfo;
    end

    % give improved temporary labels to reachable nodes
    positions = find(Matrix(component, :));
    for i = positions
        if Distances(component) + Matrix(component, i) < Distances(i)
            Distances(i) = Distances(component) + Matrix(component, i);
            % find buckets that need to be updated because of this improved label
            parent = i;
            while isempty(Buckets{parent})
                child = parent;
                parent = Tree(child);
            end
            while true
                Dwh = find(Buckets{parent}(:,child)) - 1;             % shift up, for offset
                DwhCheck = nicefloor(Distances(i), 2^(Treei(parent)-1));
                if DwhCheck < Dwh
                    Buckets{parent}(:,child) = 0;
                    Buckets{parent}(DwhCheck+1,child) = 1;
                end
                if Tree(parent) == 0
                    break
                end
                child = parent;
                parent = Tree(child);
            end
        end
    end

else % visit my minchilds
    n = squareSize(Matrix);
    % check whether this component has been visited before, if not, expand.
    % either way, calculate ix, for determining minchilds. Note that ix has
    % an offset of +1 when compared to the algorithm
    if isempty(Buckets{component})
        [Buckets{component},ix] = Expand(Matrix, Tree, Treei, Children, Distances, S, component);
    else
        ix = min(find(sum(Buckets{component} , 2)));
    end

    % shiftJminI is 2^(the level difference between component and its
    % parent), used to determine what interval guarantees minchild
    if Tree(component) == 0
        shiftJminI = 2;   % should be a lot! Shouldn't matter actually
    else
        shiftJminI = 2^(Treei(Tree(component))-Treei(component));
    end
    if Write
        Treeinfo = Distances;
        Treeinfo(component) = ix;
        Treeinfo(length(Treei)) = 1;
        TreeToLatex(Tree, Treei, Treeinfo);
        clear Treeinfo;
    end

    i = ix;
    if isempty(i)
        correctBucket = 0;
    else
        correctBucket = nicefloor(ix-1,shiftJminI);   % what bucket are we emptying of components
        % parent; if component and its parent differ more than just one level, this functions as
        % if the current component was called more than once
    end
    while ((nicefloor(i-1,shiftJminI) == correctBucket) || ((Tree(component) == 0) && (sum(S) < n) && ~isempty(i)))
        if i > size(Buckets{component},1)
            % this is allowed, as buckets are somewhat realistic, and
            % the i can get more slack than the buckets can use.
            break
        end

        % find the minchilds of component; these must be in bucket i
        childs = find(Buckets{component}(i,:));
        while childs
            % take the first minchild, visit that child, determine again what the minchilds of component are
            j = childs(1);
            [Distances, Buckets, S] = Visit(Matrix, Tree, Treei, Children, Distances, S, Buckets, j, Write);
            childs = find(Buckets{component}(i,:));
        end
        i = i+1;
    end

    % visiting component (and its minchilds) has removed relatively small
    % temporary labels to make them permanent.
    % this may have increased the minimum of tree-ancestors, these must be
    % put in the new correct bucket.
    parent = component;
    while Tree(parent) ~= 0
        child = parent;
        parent = Tree(child);
        % remove component from parent's bucket
        Buckets{parent}(:,child) = 0;
        if ~isempty(Buckets{child})
            % smallest bucket in use, correction of 1
            newInterval = min(find(sum(Buckets{child} , 2))) - 1;
            if newInterval < size(Buckets{child},1)
                newInterval = nicefloor(newInterval, 2^(Treei(parent) - Treei(child)));
            else
                newInterval = size(Buckets{parent},1);
            end
            Buckets{parent}(newInterval+1, child) = 1;
        end
    end
end


C.3 Expand

function [bucket,ix0] = Expand(Matrix, Tree, Treei, Children, Distances, S, component)

% assumes that Visit(component) has just been called for the first time.
% Bucket the children of component in B(component, ...) with offset +1,
% because matlab counts from one and not from zero as intended

children = Children(component).children;
% minimal bucket in use
ix0 = nicefloor( min(Distances(intersect(children,find(1-S)))) , 2^(Treei(component)-1) );
% maximal bucket that could possibly be used, if all edges in the component were used
ixInf = ceil(sum(sum(Matrix))/2^(Treei(component)-1)) + ix0;
bucket = sparse(ixInf+1+2,length(Tree));

directChilds = find(Tree == component);
directChilds = intersect( directChilds, find(1-S) );

% bucket the children in the component's correct bucket
for i = directChilds
    values = Distances(intersect( Children(i).children , find(1-S) ));
    floorminimum = nicefloor(min(values) , 2^(Treei(component)-1));
    bucket( 1 + min( floorminimum, ixInf+2) , i) = 1;
end

% returning the minimal non-empty bucket also with offset
ix0 = ix0 + 1;


C.4 nicefloor

function z = nicefloor(x,y)

% a more adequate version of floor, tailor-made for use with Comp.Hier.
% x must be positive, y must be strictly positive

z = 0;
if y < 1
    error('Please give a divider greater than 1.');
end
if x < 0
    error('Please give positive input.');
end
if isempty(x)
    z = inf;
else
    x = x + 0.5;
    if isinf(x)
        z = x;
    else
        while x > y
            x = x - y;
            z = z + 1;
        end
    end
end


C.5 GoToChild

function [baseNodes] = GoToChild(Tree, Treei, node, baseNodes)

% [baseNodes] = GoToChild(Tree, Treei, node, baseNodes) steps from 'node'
% to a lower level, and looks at all children there. If node is on the lowest
% level, it is added to baseNodes and returned.

% This info is also contained in Children, but that variable is not always accessible

if Treei(node) == 0
    baseNodes = [baseNodes, node];
end

% find the positions in the tree that point to this node; these are your children
children = find(Tree == node);
for i = children
    baseNodes = GoToChild(Tree, Treei, i, baseNodes);
end


BIBLIOGRAPHY

[1] Dijkstra, E. W. 1959. A note on two problems in connection with graphs. Numer. Math. 1, p. 269-271.

[2] Thorup, M. 1999. Undirected Single Source Shortest Paths with Positive Integer Weights in Linear Time. Journal of the ACM, vol. 46, no. 3, p. 362-394.

[3] Hagerup, T. 2000. Improved Shortest Paths on the Word RAM. In ICALP 2000, LNCS 1853, p. 61-72.

[4] Ahuja, R., Magnanti, T. & Orlin, J. 1993. Network Flows: Theory, Algorithms, and Applications. Prentice Hall, New Jersey.

[5] Matlab implementation of Delaunay graphs, found at: http://www.ece.ucsb.edu/~hespanha/software/graph-tools.html

[6] Real map data from UA Census 2000 TIGER/Line Files, U.S. Census Bureau, Washington, DC, downloaded from: http://www.dis.uniroma1.it/~challenge9/data/tiger/

[7] Crobak, J. 2006. Shortest Path Problems and Experimental Results. Senior Honors Thesis, Lafayette College.