
Traveling Salesman Problem

Isja Mannens

July 12, 2017

Bachelor Thesis

Supervisor: dr. Viresh Patel

Korteweg-de Vries Instituut voor Wiskunde

Faculteit der Natuurwetenschappen, Wiskunde en Informatica

Universiteit van Amsterdam

Abstract

The Traveling Salesman Problem (TSP) asks for a minimal cost tour along all vertices of an edge-weighted graph. In this thesis we will discuss different versions of the problem and some of the algorithms designed to solve it. We will discuss an exact algorithm for general TSP, found in (Bjorklund, 2010). The algorithm uses an algebraic approach, reducing Hamiltonicity to Labeled Cycle Cover Sum. We also discuss an approximation algorithm for Euclidean TSP, found in (Arora, 1998). This algorithm divides the graph up using a quad-tree, which then allows for a dynamic programming algorithm on the resulting subgraphs. Both algorithms have a random element, and Markov's inequality is used to find an upper bound on the resulting chance of failure.

Title: Traveling Salesman Problem
Authors: Isja Mannens, [email protected], 10730346
Supervisor: dr. Viresh Patel
Date: July 12, 2017

Korteweg-de Vries Instituut voor Wiskunde
Universiteit van Amsterdam
Science Park 904, 1098 XH Amsterdam
http://www.science.uva.nl/math


Contents

1 Introduction

2 Prior Knowledge
2.1 Complexity of an algorithm
2.2 Dynamic Programming
2.3 Markov's Inequality

3 Exact TSP
3.1 Framework
3.1.1 Labeled Cycle Cover Sum
3.1.2 Calculating the Labeled Cycle Cover Sum
3.1.3 Hamiltonicity Reduction
3.1.4 Extension to General TSP
3.2 The algorithm
3.2.1 Description
3.2.2 Complexity
3.2.3 False negatives

4 Approximate TSP
4.1 Algorithm description
4.1.1 Making the graph well rounded
4.1.2 Building the shifted quad-tree
4.1.3 Applying the dynamic programming algorithm
4.2 Calculating the complexity of the algorithm
4.3 Proof of correctness
4.4 Additions
4.4.1 Patch Lemma
4.4.2 Reasons behind the shifted quad-tree

5 Conclusions

6 Popular Summary

7 Bibliography


1 Introduction

Some problems have achieved an almost mythical status in mathematics. They are the kind of problems which go unsolved for decades, sometimes even centuries. These problems tend to be the focal point for a lot of research and as a result they can be a great source of innovation. One of these problems is the Traveling Salesman Problem, or more specifically, the search for a polynomial time algorithm to solve it.

The Traveling Salesman Problem (TSP) is a textbook example of a real life problem turned into an abstract one. Given a weighted graph, the problem asks for the cheapest route along all vertices of the graph, in terms of the edge weights. The problem has been proven to be NP-hard, meaning that a polynomial time solution to TSP would provide polynomial time solutions to every problem in NP. This means that finding such a solution would also solve the 'P vs. NP' problem, one of the millennium problems.

Improvements to the runtime, even if they are still superpolynomial, can be very useful, since TSP has many applications. There are obvious applications like planning delivery routes and bus routes. There are also less obvious applications, like optimizing fuel use in telescope targeting¹ and even computing DNA sequences².

In this project we will attempt to learn more about TSP by examining two fairly recent papers on TSP. Each paper describes an algorithm for a certain version of TSP. The first paper discusses an algorithm for the general version and the second paper discusses an approximation algorithm for the Euclidean variant of TSP. Our main goal will be to understand these algorithms. Through this understanding we hope to get an idea of some of the techniques that are being used to tackle this problem.

I would like to thank dr. Viresh Patel for helping me with this project. His guidance has not only helped me to understand the subject matter but has also allowed me to find a reasonable focus in such a widely documented problem.

¹ http://www.math.uwaterloo.ca/tsp/apps/starlight.html
² http://www.math.uwaterloo.ca/tsp/apps/dna.html


2 Prior Knowledge

2.1 Complexity of an algorithm

In this thesis we will discuss the complexity of a number of algorithms. The complexity of an algorithm is usually one of its most important characteristics. In this section we will give a short introduction to the complexity of an algorithm and the notations used.

Broadly speaking, the complexity of an algorithm gives an indication of the rate at which the runtime of an algorithm increases, as a function of the problem size. Here we interpret runtime as the number of computations¹. Complexity is important, because it gives an indication of the input sizes which are feasible to compute using the algorithm. For example, the runtime of multiplying two $n$-digit numbers, as a function of $n$, can be bounded by some multiple of $n^2$. However, the complexity of factorizing a number into its prime factors is assumed to be much higher². This fact is used in cryptographic methods like RSA cryptography, which depend on the fact that it is easy to calculate the product of two large primes, but it is very hard to find those primes, given their product.

When we have found such a function $f$, which gives an upper bound on the runtime of an algorithm, we say that the algorithm has a runtime of $O(f(n))$. A function $g(n)$, which would describe the runtime of an algorithm in our case, is $O(f(n))$ if there is some constant $c > 0$ such that $g(n) \le c f(n)$ for all sufficiently large $n$.

Finally there are some classes of algorithms, known as complexity classes. For example the complexity of the multiplication mentioned earlier can be described by a polynomial and thus we say that it is a polynomial time algorithm. An algorithm with complexity $O(e^{\mathrm{poly}(n)})$ would be considered exponential time.

This definition of complexity is sometimes referred to as time complexity, since we may use the same principles to describe other aspects of an algorithm, most notably the amount of memory it would use on a computer.

¹ Some computations take more time than others. However, we will find that, as long as we have some upper bound on the time a computation takes, this has no impact on the resulting definition of complexity.

² There are no polynomial time algorithms known.


2.2 Dynamic Programming

Dynamic programming is a common technique used in algorithmics. The basic idea is to start at one or more trivial versions of the main problem and gradually build a lookup table of solutions, by combining solutions for smaller problems into solutions for bigger problems.

A simple example of this technique is the calculation of the $n$-th Fibonacci number. A recursive algorithm would calculate $\mathrm{fib}(n)$ by $\mathrm{fib}(n) = \mathrm{fib}(n-1) + \mathrm{fib}(n-2)$, meaning that the running time roughly doubles from $n-1$ to $n$, since the algorithm will execute itself for both $\mathrm{fib}(n-1)$ and $\mathrm{fib}(n-2)$. This gives it a running time of $O(2^n)$. A dynamic programming algorithm for the same problem would build a list of all Fibonacci numbers, starting at $\mathrm{fib}(0)$, and would use previous entries in the list to calculate each new entry. This means that the increase in running cost from $n$ to $n+1$ is only the cost of a single addition, making the running cost of this algorithm $O(n)$.
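The two approaches can be put side by side in a short Python sketch. The function names are chosen here for illustration only; the first function follows the recursive definition literally, the second fills the lookup table from the bottom up.

```python
def fib_recursive(n):
    """Naive recursion: the same subproblems are recomputed many times."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_table(n):
    """Dynamic programming: build the list fib(0), fib(1), ..., fib(n)."""
    table = [0, 1]
    for i in range(2, n + 1):
        # Each new entry costs a single addition of two earlier entries.
        table.append(table[i - 1] + table[i - 2])
    return table[n]
```

Both functions return the same values, but `fib_recursive` becomes impractically slow for moderately large n, while `fib_table` only performs about n additions.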

2.3 Markov’s Inequality

Both algorithms discussed in this thesis use Markov's inequality. This inequality is a useful tool for finding upper bounds on probabilities.

Given a non-negative random variable $X$ and $a > 0$, Markov's inequality tells us that

\[
P(X \ge a) \le \frac{E(X)}{a}
\]

where $P(Y)$ indicates the chance of event $Y$ taking place and $E(X)$ indicates the expected value of $X$.

For example the chance of $X$ being at least $2 \cdot E(X)$ is at most $\frac{E(X)}{2 \cdot E(X)} = \frac{1}{2}$.
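As a small sanity check, the inequality can be compared with exact tail probabilities of a simple distribution. The sketch below assumes X is the outcome of a fair six-sided die (an example chosen here, not taken from the papers) and uses exact fractions.

```python
from fractions import Fraction


def markov_check(values):
    """For X uniform on `values`, list (a, P(X >= a), E(X)/a) for each value a."""
    n = len(values)
    mean = Fraction(sum(values), n)
    rows = []
    for a in sorted(set(values)):
        tail = Fraction(sum(1 for v in values if v >= a), n)
        rows.append((a, tail, mean / a))
    return rows


die = [1, 2, 3, 4, 5, 6]
rows = markov_check(die)
```

In every row the exact tail probability stays below the Markov bound, although the bound is often far from tight.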


3 Exact TSP

A common technique for finding algorithms is to reduce the problem to another problem, by describing a way to translate a solution for the second problem into a solution of the original problem. This can often simplify the problem. In this section we discuss an exact algorithm for General TSP, found in (Bjorklund, 2010).

Definition 1 (General TSP). Given a weighted graph $G = (V,E,w)$ with $w : E \to \mathbb{R}_{\ge 0}$, find a cycle $\pi$, such that all nodes in $V$ are visited by $\pi$ and the sum of the weights of the edges of $\pi$ is minimal over all possible choices of $\pi$.

The paper first gives an algorithm which calculates the so-called Labeled Cycle Cover Sum and then reduces the Hamiltonicity Problem to the Labeled Cycle Cover Sum.

Definition 2 (Hamiltonicity). Given a graph $G = (V,E)$, determine whether there exists a cycle $\pi$ in $G$, such that all nodes in $V$ are visited by $\pi$.

The algorithm is then extended to an algorithm for General TSP.

3.1 Framework

3.1.1 Labeled Cycle Cover Sum

In this section we will discuss the labeled cycle cover sum and some lemmas which will help with the reduction in the next section.

Definition 3 (Labeled Cycle Cover). Given a directed graph $D = (V,A)$, a cycle cover is a subset $C \subseteq A$, such that for all $v_0 \in V$ there is exactly one $v_0v_1 \in C$ and exactly one $v_2v_0 \in C$.¹ Given a label set $L$, a labeled cycle cover is a cycle cover $C$, together with a surjective function $g : L \twoheadrightarrow C$. We denote the set of all labeled cycle covers $(C, g)$ on $D$ as $\mathrm{lcc}(D,L)$ and the set of labeled Hamiltonian cycle covers $(H, g)$ on $D$ as $\mathrm{lhc}(D,L)$.²

Intuitively, a cycle cover is a disjoint collection of cycles on a graph, which cover all vertices of the graph. The function $g$ simply distributes the labels in $L$ over the cycle cover. Using this definition, we can define the labeled cycle cover sum.

Definition 4 (Labeled Cycle Cover Sum). Given a directed graph $D = (V,A)$, a label set $L$ and a function $f : A \times (2^L \setminus \{\emptyset\}) \to R$ for some ring $R$, the labeled cycle cover sum is defined as

\[
\Lambda(D,L,f) = \sum_{(C,g) \in \mathrm{lcc}(D,L)} \; \prod_{a \in C} f(a, g^{-1}(a))
\]

¹ We assume that $D$ has no arcs from a node to itself and thus $v_1 \ne v_0 \ne v_2$.

² Note that this set is not the same as labeled Hamiltonian cycles, since the cycles have a direction.


Later on, we will use this sum to determine Hamiltonicity. The following definition and lemma show how we can remove all non-Hamiltonian cycle covers from the sum.

Definition 5 (s-oriented Mirror Function). Given a bidirected graph $D = (V,A)$, a finite set $L$ and a special node $s \in V$, an s-oriented mirror function is a function $f : A \times (2^L \setminus \{\emptyset\}) \to R$, such that $f(uv, Z) = f(vu, Z)$ for all $Z \in 2^L \setminus \{\emptyset\}$ and $u \ne s \ne v$.

Lemma 1. Given a bidirected graph $D = (V,A)$, a finite set $L$ and a special node $s \in V$, let $f$ be an s-oriented mirror function with a codomain ring of characteristic two. Then

\[
\Lambda(D,L,f) = \sum_{(H,g) \in \mathrm{lhc}(D,L)} \; \prod_{a \in H} f(a, g^{-1}(a))
\]

Proof (sketch). Since $f(a, g^{-1}(a))$ is in a ring of characteristic two, equal elements will cancel out. This means that if we can pair each labeled non-Hamiltonian cycle cover $(C, g)$ with another, unique labeled non-Hamiltonian cycle cover with the same value of the product

\[
\prod_{a \in C} f(a, g^{-1}(a))
\]

they will cancel out and we will be left with just the Hamiltonian cycle covers. The way we construct this pairing is by choosing an ordering on the possible subcycles. We then find an element's partner by reversing the direction of the first subcycle which does not contain $s$.³ Since $f$ is an s-oriented mirror function, reversing the direction of the subcycle does not change the cycle cover's contribution to the sum and thus the pairs cancel out.

3.1.2 Calculating the Labeled Cycle Cover Sum

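In this section we will discuss a method for calculating the labeled cycle cover sum over a ring of characteristic two, which will prove to be relatively fast. This method is the main factor in determining the runtime of the algorithm and thus is crucial to the effectiveness of the overall algorithm.

Let $D = (V, F)$ be a directed graph with arc weights $\omega$. If we define a matrix $A$ as

\[
A_{i,j} =
\begin{cases}
\omega(ij) & \text{if } ij \in F \\
0 & \text{otherwise}
\end{cases}
\]

we find that

\[
\mathrm{per}(A) = \sum_{\sigma : V \to V \text{ bijective}} \; \prod_{i=1}^{|V|} \omega(i\sigma(i)) = \sum_{C \in \mathrm{cc}(D)} \; \prod_{ij \in C} \omega(ij)
\]

where $\mathrm{cc}(D)$ denotes the set of cycle covers of $D$.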

Furthermore, we use the fact that in a ring of characteristic two, det(A) = per(A).
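The identity det(A) = per(A) holds because the two expansions differ only in the signs of their terms, and -1 = 1 in characteristic two. The sketch below checks this on one small integer matrix, reduced modulo 2 (the integers modulo 2 are one ring of characteristic two, chosen here for illustration; the function names are not from the paper).

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def expand(matrix, signed):
    """Sum over all permutations; with signs it is det, without it is per."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm) if signed else 1
        for i in range(n):
            term *= matrix[i][perm[i]]
        total += term
    return total


def determinant(matrix):
    return expand(matrix, signed=True)


def permanent(matrix):
    return expand(matrix, signed=False)


example = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
```

Over the integers the two values differ, but their residues modulo 2 agree. The point of the identity is that determinants can be computed much faster than this brute-force expansion.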

³ We will not go into detail on how to construct this ordering, since it does not add much to the understanding of the proof.


We will use this fact to construct a polynomial in $x$, out of determinants of certain matrices, for which the coefficient of $x^{|L|}$ will equal the labeled cycle cover sum of $D$.

First we will define the aforementioned matrices as follows. For $Z \subseteq L$, with $L$ some label set, let

\[
M_f(Z)_{i,j} =
\begin{cases}
f(ij, Z) & \text{if } ij \in F,\ Z \ne \emptyset \\
0 & \text{otherwise}
\end{cases}
\]

We then define the polynomial as

\[
p(f, x) = \sum_{Y \subseteq L} \det\Big( \sum_{Z \subseteq Y} x^{|Z|} M_f(Z) \Big)
\]

The following lemma then tells us how this polynomial relates to the labeled cycle cover sum.

Lemma 2. For a directed graph $D$, a label set $L$ and $f : A \times (2^L \setminus \{\emptyset\}) \to GF(2^k)$,

\[
[x^{|L|}]\, p(f, x) = \Lambda(D, L, f)
\]

where $[x^k]p(x)$ indicates the coefficient of $x^k$ in $p(x)$.

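Proof. Using the equality of determinants and permanents in rings of characteristic two, we write $p(f, x)$ as

\[
p(f, x) = \sum_{Y \subseteq L} \mathrm{per}\Big( \sum_{Z \subseteq Y} x^{|Z|} M_f(Z) \Big) = \sum_{Y \subseteq L} \; \sum_{C \in \mathrm{cc}(D)} \; \prod_{ij \in C} \; \sum_{Z \subseteq Y} x^{|Z|} f(ij, Z)
\]

Now note that for any set of numbers $(a_{i,j})$ with $1 \le i \le n$ and $1 \le j \le m$

\[
\prod_{i=1}^{n} \sum_{j=1}^{m} a_{i,j} = \sum_{q : \{1,\dots,n\} \to \{1,\dots,m\}} \; \prod_{i=1}^{n} a_{i,q(i)}
\]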

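Applying this to the polynomial we find

\begin{align*}
p(f, x) &= \sum_{Y \subseteq L} \; \sum_{C \in \mathrm{cc}(D)} \; \sum_{q : C \to 2^Y} \; \prod_{ij \in C} x^{|q(ij)|} f(ij, q(ij)) \\
&= \sum_{C \in \mathrm{cc}(D)} \; \sum_{Y \subseteq L} \; \sum_{q : C \to 2^Y} \; \prod_{ij \in C} x^{|q(ij)|} f(ij, q(ij)) \\
&= \sum_{C \in \mathrm{cc}(D)} \; \sum_{q : C \to 2^L} \; \sum_{\substack{Y \subseteq L \\ \bigcup_{ij \in C} q(ij) \subseteq Y}} \; \prod_{ij \in C} x^{|q(ij)|} f(ij, q(ij))
\end{align*}

There are $2^{|L \setminus \bigcup_{ij \in C} q(ij)|}$ different $Y$, such that $\bigcup_{ij \in C} q(ij) \subseteq Y$. This means that if $\bigcup_{ij \in C} q(ij) \ne L$, there are an even number of equal $\prod_{ij \in C} x^{|q(ij)|} f(ij, q(ij))$ terms in the final sum. Since we work in a ring of characteristic two, these cancel out, leaving us with

\[
p(f, x) = \sum_{C \in \mathrm{cc}(D)} \; \sum_{\substack{q : C \to 2^L \\ \bigcup_{ij \in C} q(ij) = L}} x^{\sum_{ij \in C} |q(ij)|} \prod_{ij \in C} f(ij, q(ij))
\]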

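The exponent $\sum_{ij \in C} |q(ij)|$ equals $|L|$ precisely when the sets $q(ij)$, whose union is $L$, are pairwise disjoint. From this we find that

\[
[x^{|L|}]\, p(f, x) = \sum_{C \in \mathrm{cc}(D)} \; \sum_{\substack{q : C \to 2^L \\ \bigcup_{ij \in C} q(ij) = L \\ \forall a \ne b :\, q(a) \cap q(b) = \emptyset}} \; \prod_{ij \in C} f(ij, q(ij))
\]

Because of this we may now 'reverse' $q$, by $g(a) := b$ where $a \in q(b)$. This results in

\[
[x^{|L|}]\, p(f, x) = \sum_{C \in \mathrm{cc}(D)} \; \sum_{g : L \twoheadrightarrow C} \; \prod_{ij \in C} f(ij, g^{-1}(ij)) = \sum_{(C,g) \in \mathrm{lcc}(D,L)} \; \prod_{ij \in C} f(ij, g^{-1}(ij))
\]

which is exactly the definition of the labeled cycle cover sum.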
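Lemma 2 can be checked by brute force on a tiny instance. The sketch below is an illustration under simplifying assumptions that are not in the paper: it works over GF(2) (the case k = 1), takes D to be the complete directed graph without loops, draws f at random, and computes the permanent by enumerating permutations rather than with a fast determinant algorithm. Polynomials in x over GF(2) are stored as integer bitmasks, and all function names are chosen here.

```python
import itertools
import random


def clmul(a, b):
    """Multiply two polynomials over GF(2), stored as bitmasks (bit i = x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def nonempty_subsets(items):
    items = sorted(items)
    for size in range(1, len(items) + 1):
        for combo in itertools.combinations(items, size):
            yield frozenset(combo)


def random_f(n, labels, seed):
    """A random 0/1-valued f on the complete digraph without loops."""
    rng = random.Random(seed)
    return {
        ((i, j), z): rng.randint(0, 1)
        for i in range(n) for j in range(n) if i != j
        for z in nonempty_subsets(labels)
    }


def direct_sum(n, labels, f):
    """Lambda(D, L, f) over GF(2), enumerating every labeled cycle cover."""
    total = 0
    for sigma in itertools.permutations(range(n)):
        if any(sigma[i] == i for i in range(n)):
            continue  # D has no arcs from a node to itself
        cover = [(i, sigma[i]) for i in range(n)]
        # choice[t] is the index of the arc that receives labels[t]
        for choice in itertools.product(range(n), repeat=len(labels)):
            preimages = [
                frozenset(lab for lab, idx in zip(labels, choice) if idx == a)
                for a in range(n)
            ]
            if not all(preimages):
                continue  # g has to be surjective
            term = 1
            for arc, z in zip(cover, preimages):
                term &= f[arc, z]
            total ^= term
    return total


def coefficient_via_polynomial(n, labels, f):
    """Coefficient of x^|L| in p(f, x), with per = det in characteristic two."""
    p = 0
    for y in nonempty_subsets(labels):  # Y = empty set contributes zero
        matrix = [[0] * n for _ in range(n)]
        for z in nonempty_subsets(y):
            for i in range(n):
                for j in range(n):
                    if i != j and f[(i, j), z]:
                        matrix[i][j] ^= 1 << len(z)  # add x^|Z| * f(ij, Z)
        for sigma in itertools.permutations(range(n)):
            term = 1
            for i in range(n):
                term = clmul(term, matrix[i][sigma[i]])
            p ^= term
    return (p >> len(labels)) & 1
```

For example, `direct_sum(3, (0, 1, 2), f)` and `coefficient_via_polynomial(3, (0, 1, 2), f)` agree for every f produced by `random_f`.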

3.1.3 Hamiltonicity Reduction

In this section we will reduce the Hamiltonicity problem to the Labeled Cycle Cover Sum. The main idea is to divide the vertex set into two equal halves, using one half as labels and the other half as vertices in a new graph. We will find that the Labeled Cycle Cover Sum of this new graph is zero, precisely when the original graph is non-Hamiltonian.

Let $G = (V,E)$ be any graph on $n$ vertices. We divide $V$ into two halves $V_1$ and $V_2$ such that $|V_1| = |V_2|$.⁴ We call an edge in $G$ unlabeled by $V_2$ if it has no end points in $V_2$ and call the remaining edges labeled by $V_2$. We will denote by $U(G)$ and $L(G)$ the sets of unlabeled and labeled edges respectively. We also define $\mathrm{lhc}^m_{V_2}(G)$ as the set of oriented Hamiltonian cycles $(H,\sigma)$ with precisely $m$ edges in $U(G)$, where $\sigma : L_m \to U(G)$ labels these $m$ edges.

In order to use lemma 1 we will need to choose a label set, define a bidirected graph $D$ and define an s-oriented mirror function $f$ over a ring of characteristic two. We define $D = (V_1, F)$ as a complete graph and use the label set $V_2 \cup L_m$ for some $0 \le m \le n-1$ and $L_m$ an arbitrary label set of size $m$. We will later find that we have to check multiple values of $m$. We also choose some $s \in V_1$.

Before we can define our s-oriented mirror function $f$ we have to introduce some variables. For every edge $uv \in L(G)$, we introduce variables $x_{uv}$ and $x_{vu}$. We set $x_{uv} = x_{vu}$, precisely when $v \ne s \ne u$, which will help ensure that the function $f$ is s-oriented. For every edge $uv \in U(G)$ and every $d \in L_m$ we introduce variables $x_{uv,d}$ and $x_{vu,d}$. We again set $x_{uv,d} = x_{vu,d}$, precisely when $v \ne s \ne u$. Finally let $P_{u,v}(X)$ be the set of paths in $G$ from $u$ to $v$ whose internal vertices are exactly the vertices in $X$.

⁴ If $|V|$ is odd, we choose the halves such that $|V_1| = |V_2| + 1$.

We now define $f$ as follows. For $uv \in F$ and $\emptyset \ne X \subseteq V_2$, we set

\[
f(uv, X) = \sum_{P \in P_{u,v}(X)} \; \prod_{wz \in P} x_{wz}
\]

For $uv \in F$ such that $uv \in U(G)$ and $d \in L_m$, we set

\[
f(uv, \{d\}) = x_{uv,d}
\]

Everywhere else, we set $f$ to zero. We will interpret these variables as polynomials over $GF(2^k)$ for some $k$. We will choose the value of $k$ in section 3.2.3.

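Lemma 3. With $G, D, V_2, U, L, m, L_m, f$ and $\mathrm{lhc}^m_{V_2}$ defined as above, and with $\mathrm{hc}^m_{V_2}(G)$ the set of oriented Hamilton cycles underlying $\mathrm{lhc}^m_{V_2}(G)$,

(i)
\[
\Lambda(D, V_2 \cup L_m, f) = \sum_{(H,\sigma) \in \mathrm{lhc}^m_{V_2}(G)} \; \prod_{uv \in U(H)} x_{uv,\sigma^{-1}(uv)} \prod_{uv \in L(H)} x_{uv}
\]

(ii) $\Lambda(D, V_2 \cup L_m, f)$ is the zero polynomial, precisely when $\mathrm{hc}^m_{V_2}(G) = \emptyset$.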

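Proof (i). Using lemma 1 we find that

\[
\Lambda(D, V_2 \cup L_m, f) = \sum_{(H,g) \in \mathrm{lhc}(D, V_2 \cup L_m)} \; \prod_{a \in H} f(a, g^{-1}(a))
\]

Since $f(a, g^{-1}(a)) = 0$ when $g^{-1}(a)$ intersects $L_m$ and is not a singleton, we may assume that

\[
g(a) =
\begin{cases}
g'(a) & \text{if } a \in V_2 \\
\sigma_g(a) & \text{if } a \in L_m
\end{cases}
\]

where $g'$ is surjective and $\sigma_g$ is bijective for codomains $H_{g'}$, $H_{\sigma_g}$, such that $H_{g'} \cup H_{\sigma_g} = H$ and $\mathrm{im}(g') \cap \mathrm{im}(\sigma_g) = \emptyset$.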

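Using this we write the sum as

\begin{align*}
\Lambda(D, V_2 \cup L_m, f) &= \sum_{(H,g) \in \mathrm{lhc}(D, V_2 \cup L_m)} \Big( \prod_{\substack{uv \in H \\ g^{-1}(uv) \subseteq V_2}} \; \sum_{P \in P_{u,v}(g^{-1}(uv))} \; \prod_{wz \in P} x_{wz} \Big) \Big( \prod_{\substack{uv \in H \\ g^{-1}(uv) \subseteq L_m}} x_{uv, g^{-1}(uv)} \Big) \\
&\overset{(1)}{=} \sum_{(H,g) \in \mathrm{lhc}(D, V_2 \cup L_m)} \Big( \sum_{S \in S_{H,g}} \; \prod_{wz \in P \in S} x_{wz} \Big) \Big( \prod_{\substack{uv \in H \\ g^{-1}(uv) \subseteq L_m}} x_{uv, g^{-1}(uv)} \Big) \\
&\overset{(2)}{=} \sum_{(H,g) \in \mathrm{lhc}(D, V_2 \cup L_m)} \; \sum_{S \in S_{H,g}} \Big( \prod_{wz \in P \in S} x_{wz} \Big) \Big( \prod_{\substack{uv \in H \\ \sigma_g^{-1}(uv) \subseteq L_m}} x_{uv, \sigma_g^{-1}(uv)} \Big) \\
&\overset{(3)}{=} \sum_{(H,\sigma) \in \mathrm{lhc}^m_{V_2}(G)} \Big( \prod_{wz \in L(H)} x_{wz} \Big) \Big( \prod_{uv \in U(H)} x_{uv, \sigma^{-1}(uv)} \Big)
\end{align*}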


In step (1) we define $S_{H,g} := \prod_{uv \in H,\ g^{-1}(uv) \subseteq V_2} P_{u,v}(g^{-1}(uv))$ as the set of all possible combinations of paths $P$, and rewrite the expression accordingly. In step (2) we only use distributivity in rings to move the last product into the summation over $S_{H,g}$ and replace $g$ by $\sigma_g$. In step (3) we reinterpret edges in the paths $P$ as labeled by $V_2$ and the others as unlabeled. We also note that each Hamilton cycle in $\mathrm{lhc}^m_{V_2}(G)$ is precisely an ordering of $V_1$, together with a choice of paths between all but $m$ pairs of consecutive vertices in this ordering. This translates into a choice of $g$, to determine which vertices of $V_2$ are between which vertices in the ordering and to determine which $m$ pairs of consecutive vertices will be connected directly. Given $g$, we only need to determine the paths between consecutive vertices in the ordering, which is precisely a choice of $S \in S_{H,g}$.

(ii). If $\mathrm{hc}^m_{V_2}(G) = \emptyset$, then clearly the sum in (i) evaluates to zero.

For the other direction, we will argue that each oriented Hamilton cycle in $\mathrm{hc}^m_{V_2}(G)$ contributes $m!$ different monomials to the sum, one for each $\sigma$. If two oriented Hamilton cycles use different edges, then these edges contribute different variables and thus result in different monomials. If two different oriented Hamilton cycles use the same edges, then they must be the same cycle with opposite orientation. In this case the asymmetry of the variables around $s$ ensures that the edges around $s$ contribute different variables and thus the two Hamilton cycles contribute different monomials.

Since each Hamilton cycle contributes a unique set of monomials, the sum can only be zero if $\mathrm{hc}^m_{V_2}(G) = \emptyset$.

3.1.4 Extension to General TSP

In this section we will describe a way to expand the algorithm to instances of TSP with bounded integer weights. The main idea is to keep track of the weight of the edges in the Hamilton cycles, by introducing a new variable $y$, whose power will represent the edge weights. We then need a way to find the smallest $l$, such that $y^l$ has a nonzero coefficient.

More precisely, given a weight function $w : E \to \mathbb{Z}_{\ge 0}$, we define a function $f_y$ as

\[
f_y(uv, X) :=
\begin{cases}
\displaystyle\sum_{P \in P_{u,v}(X)} \; \prod_{st \in P} y^{w(st)} x_{st} & \text{if } uv \in F \text{ and } \emptyset \ne X \subseteq V_2 \\
y^{w(uv)} x_{uv,d} & \text{if } uv \in U(G) \text{ and } X = \{d\} \subseteq L_m \\
0 & \text{otherwise}
\end{cases}
\]

We essentially replace each variable $x_{uv}$ and $x_{uv,d}$ by $y^{w(uv)} x_{uv}$ and $y^{w(uv)} x_{uv,d}$. In order to keep the exponent from wrapping around, i.e. to avoid an exponent larger than the size of the ring in which we work, we will choose $k$ such that $|GF(2^k)|$ is larger than the weight of the heaviest Hamilton cycle. We can use any upper bound, such as $w_{total}$, the sum of the weights of all edges.

We now choose some generator $g$ of the multiplicative group in $GF(2^k)$ and define

\[
T(l) := \sum_{i=0}^{m_{max}} \Lambda(D, V_2 \cup L_i, f_y) \quad \text{where } y = g^l
\]


We then compute the inverse Fourier transform of $T$

\[
t(j) := \sum_{l=0}^{2^k-2} g^{-jl}\, T(l)
\]

which gives us the coefficient of $y^j$ in $\sum_{i=0}^{m_{max}} \Lambda(D, V_2 \cup L_i, f_y)$, since

\[
\sum_{l=0}^{2^k-2} g^{-jl} g^{il} =
\begin{cases}
1 & \text{if } i = j \\
0 & \text{if } i \ne j
\end{cases}
\]

Now it is only a matter of checking $t(j)$ for $j = 0, \dots, w_{total}$ and returning the smallest $j$ for which we find a nonzero coefficient.
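The coefficient extraction can be tried out on a toy example. The sketch below assumes the field GF(2^3), built from the irreducible polynomial x^3 + x + 1, with the element x as generator g (any element other than 0 and 1 generates the group, since its order 7 is prime). It evaluates a made-up polynomial in y at all powers of g and recovers the coefficients with the inverse transform; it does not use a fast Fourier transform, and none of the names come from the paper.

```python
K = 3
MODULUS = 0b1011          # x^3 + x + 1, irreducible over GF(2)
ORDER = (1 << K) - 1      # size of the multiplicative group, 2^k - 1
G = 0b010                 # the element x, used as generator g


def gf_mul(a, b):
    """Multiply two elements of GF(2^K), stored as bitmasks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << K):
            a ^= MODULUS  # reduce modulo the field polynomial
    return result


def gf_pow(a, e):
    """a^e for nonzero a; negative exponents are reduced modulo the group order."""
    result = 1
    for _ in range(e % ORDER):
        result = gf_mul(result, a)
    return result


def evaluate(coeffs, y):
    """Evaluate sum_i coeffs[i] * y^i in GF(2^K)."""
    total = 0
    for i, c in enumerate(coeffs):
        total ^= gf_mul(c, gf_pow(y, i))
    return total


def inverse_transform(samples):
    """t(j) = sum_l g^(-j*l) * T(l) for j = 0, ..., 2^k - 2."""
    result = []
    for j in range(ORDER):
        t = 0
        for l in range(ORDER):
            t ^= gf_mul(gf_pow(G, -j * l), samples[l])
        result.append(t)
    return result


coeffs = [0, 0, 3, 0, 5, 1, 0]                          # hypothetical polynomial in y
samples = [evaluate(coeffs, gf_pow(G, l)) for l in range(ORDER)]   # T(l) at y = g^l
recovered = inverse_transform(samples)
smallest = next(j for j, c in enumerate(recovered) if c)
```

Here `recovered` equals `coeffs`, and `smallest` is the lowest power of y with a nonzero coefficient, which in the algorithm corresponds to the weight of the cheapest tour that was found.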

3.2 The algorithm

In this section we will describe the steps of the actual algorithm. We will refer to definitions and lemmas from the previous section.

3.2.1 Description

Given a graph $G = (V,E)$, the main idea of the algorithm is to construct everything used in lemma 3, choosing $V_1$ and $V_2$ at random. This allows us to define cycle cover sums, which are nonzero, precisely when the graph has a Hamilton cycle with $m$ edges unlabeled by $V_2$. We then evaluate these labeled cycle cover sums in some randomly chosen set of values⁵. We will use the method described in section 3.1.2, which calculates the cycle cover sum by constructing a polynomial which has the value of $\Lambda(D, V_2 \cup L_m, f)$ as one of its coefficients. We do this for values of $m$ from 0 to $m_{max}$ for some value of $m_{max}$. This procedure is called a run. We will perform $r$ runs, where $r$ and $m_{max}$ will be determined in section 3.2.2, and if any of the cycle cover sums evaluate to a non-zero value, we conclude that the graph is Hamiltonian. If all evaluations return zero, we assume that the graph is non-Hamiltonian.

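In order to quickly compute the value of $f$ on arguments $(uv, X)$ with $X \subseteq V_2$, we use the following recursion

\[
\sum_{P \in P_{u,v}(X)} \; \prod_{sz \in P} x_{sz} = \sum_{\substack{w \in X \\ uw \in E}} x_{uw} \Big( \sum_{P \in P_{w,v}(X \setminus \{w\})} \; \prod_{sz \in P} x_{sz} \Big)
\]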
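The recursion translates directly into a memoized function. The sketch below is an illustration only: it uses ordinary integers as edge variables instead of elements of GF(2^k), it assumes the base case that for empty X the sum is x_{uv} if uv is an edge and zero otherwise (the text does not spell this out), and the example graph and all names are made up here. A brute-force enumeration of all orderings of X is included for comparison.

```python
from functools import lru_cache
from itertools import permutations


def path_sum(edges, weight, u, v, X):
    """Sum over all u-v paths through exactly the vertices X of the product of edge values."""

    @lru_cache(maxsize=None)
    def rec(a, rest):
        if not rest:
            # Assumed base case: a single edge from a to v.
            return weight[a, v] if (a, v) in edges else 0
        total = 0
        for w in rest:
            if (a, w) in edges:
                total += weight[a, w] * rec(w, rest - {w})
        return total

    return rec(u, frozenset(X))


def path_sum_bruteforce(edges, weight, u, v, X):
    """The same sum, by trying every ordering of X."""
    total = 0
    for order in permutations(X):
        path = (u, *order, v)
        term = 1
        for a, b in zip(path, path[1:]):
            if (a, b) not in edges:
                term = 0
                break
            term *= weight[a, b]
        total += term
    return total


# Hypothetical undirected example graph on the vertices 0..4.
undirected = {(0, 1): 2, (0, 2): 3, (1, 2): 5, (1, 3): 7, (2, 3): 11, (3, 4): 13, (2, 4): 17}
weight = {}
for (a, b), value in undirected.items():
    weight[a, b] = value
    weight[b, a] = value
edges = set(weight)
```

The memoized version visits each pair of a start vertex and a subset of X only once, which gives the 2^{|X|} behaviour used in the complexity analysis, whereas the brute-force version tries |X|! orderings.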

In the case of general TSP, we use the function $f_y$ as defined in section 3.1.4. We then calculate $t(j)$ and search for the smallest $j$ for which $t(j) \ne 0$. This forms a run, which we then repeat $r$ times, using the smallest found tour as our output.⁶

⁵ The cycle cover sum is a polynomial in this case.

⁶ Note that we still use the method from section 3.1.2, in order to calculate $T(l) = \sum_{i=0}^{m_{max}} \Lambda(D, V_2 \cup L_i, f_y)$.


3.2.2 Complexity

There are a number of factors which contribute to the complexity of the algorithm. First of all there is the calculation of $f(uv,X)$ for $X \subseteq V_2$. However, since $|X| \le |V_2| = \frac{n}{2}$, this contributes $O(2^{n/2})$ to the runtime⁷. We will find that we may safely ignore this contribution.

Next, the number of runs also affects the runtime, by a factor of $r \cdot m_{max}$. We will find that setting $m_{max} = \frac{n}{4}$ and $r = n^2$ will give a good runtime, while still keeping the chance of false negatives at $e^{-\Omega(n)}$.

Finally the method described in section 3.1.2 also adds to the complexity and we will find that this method forms the main contribution to the runtime. The following lemma will show exactly how.

Lemma 4. The labeled cycle cover sum $\Lambda(D,L,f)$ for a function $f$ with codomain $GF(2^k)$ on a directed graph $D$ on $n$ vertices, and with $2^k > |L|n$, can be computed in $O((|L|^2 n + |L| n^{1+\omega}) 2^{|L|} + |L|^2 n^2)$ operations, where $\omega$ is the square matrix multiplication exponent.

Proof. In order to find the coefficient of $x^{|L|}$ in $p(f, x)$, we will use a number of existing techniques. We will evaluate $p(f, x)$ in $|L|n$ different values of $x$ and apply Lagrange interpolation to find the desired coefficient⁸. This step will take $O(|L|^2 n^2)$ operations and accounts for the second term in the runtime expression.

In order to apply the interpolation, we need to calculate $p(f, x_i)$ for all $|L|n$ choices of $x_i$. As a reminder we have

\[
p(f, x) = \sum_{Y \subseteq L} \det\Big( \sum_{Z \subseteq Y} x^{|Z|} M_f(Z) \Big)
\]

We will calculate

\[
\sum_{Z \subseteq Y} x_i^{|Z|} M_f(Z)
\]

for all $Y \subseteq L$ using Yates' fast zeta transform (Yates, 1937), using $O(|L| 2^{|L|})$ operations. We then use the determinant algorithm by Bunch and Hopcroft (Bunch and Hopcroft, 1974), to calculate $p(f, x_i)$ in $O(n^{\omega} 2^{|L|})$ operations, where $\omega$ is the square matrix multiplication exponent. Doing this for all $|L|n$ choices of $x_i$, we find a runtime of

\[
O\big((|L| 2^{|L|} + n^{\omega} 2^{|L|}) |L| n\big) = O\big((|L|^2 n + |L| n^{1+\omega}) 2^{|L|}\big)
\]

which accounts for the first term in the runtime expression.
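The fast zeta transform computes all subset sums at once, instead of summing over the subsets of every Y separately. The sketch below illustrates the idea on plain integers rather than on the matrices $x_i^{|Z|} M_f(Z)$; subsets of the label set are encoded as bitmasks, and the function names are chosen here.

```python
def zeta_transform(values, n_labels):
    """Return out with out[Y] = sum of values[Z] over all Z that are subsets of Y."""
    out = list(values)
    for bit in range(n_labels):
        for mask in range(1 << n_labels):
            if mask & (1 << bit):
                # Add the contribution of the subsets that do not contain `bit`.
                out[mask] += out[mask ^ (1 << bit)]
    return out


def zeta_bruteforce(values, n_labels):
    """The same sums, computed separately for every Y."""
    size = 1 << n_labels
    return [
        sum(values[sub] for sub in range(size) if sub & mask == sub)
        for mask in range(size)
    ]


values = [3, 1, 4, 1, 5, 9, 2, 6]   # one value per subset of a 3-element label set
```

The transform performs $|L| 2^{|L|}$ additions, compared with $3^{|L|}$ for the direct computation over all pairs Z ⊆ Y.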

⁷ $O(f(n))$ suppresses all polylogarithmic functions in $f(n)$, meaning that $O(\log(f(n))^k f(n)) = O(f(n))$ for any constant $k$.

⁸ Note that the maximum degree that $p(f, x)$ can achieve is equal to $|L|n$, from the determinant of $\sum_{Z \subseteq L} x^{|Z|} M_f(Z)$.


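Since $|L| = |V_2| + |L_m| \le \frac{n}{2} + m_{max}$, we find a total complexity of

\begin{align*}
\text{Runtime} &= O\Big(r \cdot m_{max} \Big( \big( (\tfrac{n}{2} + m_{max})^2 n + (\tfrac{n}{2} + m_{max}) n^{1+\omega} \big) 2^{\frac{n}{2} + m_{max}} + (\tfrac{n}{2} + m_{max})^2 n^2 \Big)\Big) \\
&= O\big(n^2 \cdot m_{max} \big( n^5 2^{\frac{n}{2} + m_{max}} + n^4 \big)\big) \\
&= O\big(\tfrac{n^3}{4} \big( n^5 2^{\frac{n}{2} + \frac{n}{4}} + n^4 \big)\big) \\
&= O\big(n^8 2^{\frac{3n}{4}} + n^7\big) \\
&= O\big(n^8 2^{\frac{3n}{4}}\big) \\
&= O\big(2^{\frac{3n}{4}}\big)
\end{align*}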

In the case of general TSP, the runtime is increased by a factor of $w_{total}$, by the calculation of $T(l)$ for each $l = 0, \dots, w_{total}$. The Fast Fourier Transform allows us to calculate $t(j)$ for all $j$, in only $O(w_{total} \log(w_{total}))$ operations. Since this is only done once per run, we add $r \cdot w_{total} \log(w_{total})$ to the runtime, which can be neglected in most cases⁹.

3.2.3 False negatives

Since the algorithm evaluates a polynomial in a random point, in order to determine whether it is the zero polynomial, there is a chance to receive false negatives. If the algorithm happens to choose one of the roots of a nonzero polynomial, it will wrongfully label it as the zero polynomial. If this happens every time $\Lambda(D, V_2 \cup L_m, f)$ is evaluated, the algorithm will return non-Hamiltonicity, when the graph in question is actually Hamiltonian. In order to find an upper bound on the chance of this happening, we use the following, well-known lemma.

Lemma 5 (Schwartz-Zippel). Let $P(x_1, x_2, \dots, x_n)$ be a nonzero $n$-variate polynomial of total degree $d$ over a field $F$. For $r_1, r_2, \dots, r_n \in F$ (uniformly) randomly chosen, we have

\[
P\big(P(r_1, r_2, \dots, r_n) = 0\big) \le \frac{d}{|F|}
\]
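For a small field the bound of the lemma can be compared with an exact count of the zeros. The sketch below assumes the prime field of integers modulo 7 and a made-up polynomial of total degree 3 in two variables; the algorithm itself works over $GF(2^k)$, so this is only an illustration of the statement, with names chosen here.

```python
from itertools import product


def count_zeros(poly, n_vars, p):
    """Count the points of (Z/pZ)^n_vars at which `poly` vanishes modulo p."""
    zeros = sum(
        1
        for point in product(range(p), repeat=n_vars)
        if poly(*point) % p == 0
    )
    return zeros, p ** n_vars


def example_poly(x, y):
    """x^2 * y + 3x + 1, a nonzero polynomial of total degree 3."""
    return x * x * y + 3 * x + 1


zeros, total = count_zeros(example_poly, 2, 7)
degree, field_size = 3, 7
```

The fraction `zeros / total` is the exact chance that a uniformly random point is a root, and it stays below `degree / field_size`, as the lemma predicts.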

If we choose $k$ such that $|GF(2^k)| > cn$ for some $c > 1$, we find that

\[
P\big(P(r_1, r_2, \dots, r_n) = 0\big) \le \frac{n}{cn} = \frac{1}{c}
\]

if the graph is Hamiltonian.

Another factor which may cause false negatives is the fact that only some values of $m$ are checked. It is possible that all Hamilton cycles have at least $m_{max} + 1$ edges unlabeled by $V_2$, given the chosen $V_1$ and $V_2$. Given the expected number of unlabeled edges in a Hamilton cycle, we can use Markov's inequality to find an upper bound for the chance of this happening.

⁹ Of course, if $w_{total}$ happens to be very large, this will have an effect on the runtime, but if the weight of the individual edges is assumed to be bounded by some finite value $M$, then we find that $w_{total}$ is $O(n^2)$ and we may neglect it.

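\begin{align*}
P\Big( \sum_{m=0}^{n/4} |\mathrm{lhc}^m_{V_2}(G)| = 0 \Big) &\le P\Big( |U(H)| \ge \frac{n}{4} + 1 \Big) && \text{for a given } H \\
&\le \frac{E(|U(H)|)}{\frac{n+4}{4}} && \text{(Markov's inequality)} \\
&= \frac{\frac{n}{4}}{\frac{n+4}{4}} = \frac{n}{n+4}
\end{align*}

In the third step we use the fact that $E(|\{uv \mid u \in V_1, v \in V_2\}|) = \frac{n}{2}$ and $E(|U(H)|) = \frac{n - E(|\{uv \mid u \in V_1, v \in V_2\}|)}{2} = \frac{n}{4}$.

We find that the chance of a false negative, after each run is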

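\begin{align*}
P(\text{False negative, one run}) &= P\Big( \sum_{m=0}^{n/4} |\mathrm{lhc}^m_{V_2}(G)| = 0 \Big) + \Big( 1 - P\Big( \sum_{m=0}^{n/4} |\mathrm{lhc}^m_{V_2}(G)| = 0 \Big) \Big) \frac{1}{c} \\
&\le \frac{n}{n+4} + \Big( 1 - \frac{n}{n+4} \Big) \frac{1}{c} \\
&= \frac{n}{n+4} + \frac{4}{c(n+4)} = \frac{cn+4}{c(n+4)}
\end{align*}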

and thus after $n^2$ runs, the chance of false negatives is

\[
P(\text{False negative, } n^2 \text{ runs}) \le \Big( \frac{cn+4}{c(n+4)} \Big)^{n^2} = \Big( \Big( \frac{cn+4}{c(n+4)} \Big)^{n} \Big)^{n} \le \Big( \frac{c+4}{5c} \Big)^{n}
\]

In the last step, we use the fact that

\[
\Big( \frac{cn+4}{c(n+4)} \Big)^{n} = \Big( 1 - \frac{4c-4}{c(n+4)} \Big)^{n} = \Big( 1 - \frac{c'}{n+4} \Big)^{n}
\]

for $c' = \frac{4c-4}{c}$. Since $0 < c' < 4$, this expression is monotonically decreasing in $n$, so it is bounded by its value at $n = 1$. This tells us that

\[
\Big( \frac{cn+4}{c(n+4)} \Big)^{n} \le \Big( \frac{1 \cdot c + 4}{c(1+4)} \Big)^{1} = \frac{c+4}{5c}
\]

for $n \ge 1$. We find that the chance of false negatives is exponentially small in $n$.

In the case of general TSP, the same upper bound holds for the chance of returning a non-optimal salesman tour. The reason we can use this upper bound is that in the derivation for it, we only assumed the existence of a single Hamilton cycle. If we substitute the optimal salesman tour for this Hamilton cycle, the exact same derivation still holds.
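The bounds from this section are easy to evaluate for concrete numbers. The sketch below uses exact fractions and an assumed constant c = 2 (any integer c > 1 works the same way); it compares the bound after $n^2$ runs with the closed form $\big(\frac{c+4}{5c}\big)^n$. The function names are chosen here.

```python
from fractions import Fraction


def one_run_bound(n, c):
    """Upper bound (cn + 4) / (c(n + 4)) on a false negative in a single run."""
    return Fraction(c * n + 4, c * (n + 4))


def all_runs_bound(n, c):
    """Upper bound on a false negative in all n^2 independent runs."""
    return one_run_bound(n, c) ** (n * n)


def closed_form_bound(n, c):
    """The simpler bound ((c + 4) / (5c))^n."""
    return Fraction(c + 4, 5 * c) ** n


c = 2
comparison = [(n, all_runs_bound(n, c), closed_form_bound(n, c)) for n in range(1, 31)]
```

For n = 1 the two bounds coincide, and for larger n the bound after $n^2$ runs drops well below the closed form, which itself shrinks geometrically in n.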


4 Approximate TSP

In this section we will discuss a polynomial time approximation algorithm for Euclidean TSP, found in (Arora, 1998). The precise statement of the problem is as follows:

Definition 6. (PTAS for Euclidean TSP) Given a set of $n$ nodes in $\mathbb{R}^d$ and a constant $c > 0$, let $OPT$ be the length of the shortest cycle visiting all nodes. A 'Polynomial Time Approximation Scheme for Euclidean TSP' is a polynomial time algorithm which outputs a cycle which visits all nodes and is of length at most $(1 + \frac{1}{c})OPT$.

In the following sections we will discuss the algorithm in detail, a proof of its correctness and some additions to the paper.

4.1 Algorithm description

In this section we will discuss the algorithm, as it would be executed in $\mathbb{R}^2$. The main idea behind the algorithm is to build what is known as a quad-tree on the given graph, a certain series of subdivisions starting with a square around the graph. The quad-tree is built on a randomly shifted dissection¹. We will then apply a dynamic programming algorithm on the resulting figure. Each step will be discussed in more detail in the following sections.

4.1.1 Making the graph well rounded

In order to make the graph easier to work with, the nodes are moved around slightly,such that they meet the following criteria:

• All nodes have integral coordinates

• The distance between two nodes is at least 8 (unless two nodes are moved to the same set of coordinates)

• The maximum distance between nodes is $O(n)$

The first criterion will allow us to make certain assumptions about the nodes. For example, none of the nodes will lie on a grid-line. The second criterion will allow us to find an upper bound on the number of times a salesman tour crosses a grid-line. This is related to a more technical argument we will discuss later on, but on an intuitive level we may understand it as a way to make sure that there cannot be a situation in which we jump back and forth between nodes which are relatively close but still make us cross a large number of grid-lines. The last criterion gives us a way to relate the size of the bounding box around the graph to the number of nodes $n$ in the graph, which will allow us to express the runtime in terms of $n$.

¹ The reason for this will be discussed in section 4.4.2.

This so-called 'perturbation' is executed as follows. Place a square around the graph, called the bounding box, and place a grid in it. The width of each square in the grid is equal to $\frac{L_0}{8nc}$, where $L_0$ is the width of the bounding box and $c$ is the constant mentioned in the statement of the problem. We then move each node to the nearest grid-point and scale up by a factor of $\frac{64nc}{L_0}$, making the distance between consecutive grid-lines 8.

Note that the perturbation increases the length of each edge in a salesman path by at most $2 \cdot \frac{L_0}{8nc}$ and thus the cost of the tour is increased by at most $2n \cdot \frac{L_0}{8nc} = \frac{L_0}{4c} < \frac{OPT}{4c}$, since the width of the bounding box is determined by the largest horizontal or vertical distance between two nodes² and thus $OPT > L_0$. Also note that the width $L$ of the new bounding box is $L = L_0 \cdot \frac{64nc}{L_0} = 64nc = O(nc)$.
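The perturbation step can be sketched in a few lines of Python. The sketch assumes that the input contains at least two distinct points, anchors the grid at the lower left corner of the bounding box, and merges snapping and scaling into one step by multiplying the grid index by 8; these are implementation choices made here, not details from the paper.

```python
def perturb(points, c):
    """Snap points to a grid of cell width L0 / (8nc) and rescale so cells have width 8."""
    n = len(points)
    min_x = min(x for x, _ in points)
    min_y = min(y for _, y in points)
    # L0: the largest horizontal or vertical distance between two nodes.
    width = max(
        max(x for x, _ in points) - min_x,
        max(y for _, y in points) - min_y,
    )
    cell = width / (8 * n * c)
    perturbed = []
    for x, y in points:
        i = round((x - min_x) / cell)   # index of the nearest grid-line
        j = round((y - min_y) / cell)
        perturbed.append((8 * i, 8 * j))
    return perturbed


example_points = [(0.0, 0.0), (1.0, 0.3), (0.52, 0.97), (0.5, 0.99)]
snapped = perturb(example_points, c=2)
```

All resulting coordinates are integers between 0 and 64nc, and two nodes that are not merged into the same grid-point end up at distance at least 8.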

4.1.2 Building the shifted quad-tree

We will now build a quad-tree on the the graph. We will assume L is a power of 2. Ifnot, we will increase the bounding box until L is a power of 2, thus increasing L by atmost a factor of 2. The conclusion that L = O(nc) still holds in this case.

Building a quad-tree is a recursive process. At each step, we determine the numberof nodes in the current square. If there are two or more nodes in the square, we dividethe square into four equally sized squares and apply the same process to each square. Ifthere are one or no nodes in the square, we do nothing and the recursion stops.

Figure 4.1: An example of a quad-tree on a Euclidean graph and a (1,1)-shifted quad-tree. The original bounding box is shown in red. Note that the resulting quad-tree is very different from the original quad-tree, which had a (0,0)-shift.

² Technically $L_0$ should be 1 more than this distance, in order to make sure none of the nodes are on the outer edges; however, since OPT has to start and finish at the same node, we also get $\mathrm{OPT} > 2L_0$.

A shifted quad-tree is constructed by first shifting the bounding box, horizontally and vertically, by integer values. An (a,b)-shift is a shift of a distance a horizontally and a distance b vertically. We then construct the quad-tree as normal, where we interpret squares at the edge to (possibly) wrap around to the other side. In the example above, the first dissection in the (1,1)-shifted quad-tree would consist of the thick black lines.
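One way to realise the wrap-around in code is to leave the bounding box in place and move the nodes instead: shifting the box by (a,b) is treated as translating every node by (−a,−b) modulo L, after which an ordinary quad-tree can be built. This equivalence is our own reading of the construction, and the names `shift_points` and `random_shift` are hypothetical.

```python
import random

def shift_points(points, a, b, L):
    """Coordinates of the nodes relative to an (a,b)-shifted box of width L."""
    # The modulo makes squares at the edge wrap around to the other side.
    return [((x - a) % L, (y - b) % L) for x, y in points]

def random_shift(L, rng=random):
    """Draw an integer shift (a, b) uniformly from {0, ..., L-1}^2."""
    return rng.randrange(L), rng.randrange(L)
```

With L = 8 and a (1,1)-shift, a node at the origin ends up at (7,7), that is, on the far side of the box.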

4.1.3 Applying the dynamic programming algorithm

The general idea behind the dynamic programming step is to solve a general version of the problem for each square of the shifted quad-tree. These solutions can then be combined to find solutions for larger squares and so on, until we find the solution for the square which contains the whole graph.

In order to limit the number of solutions for each square, we will only permit the path to cross the quad-tree at certain points and only a certain number of times. More specifically we use the following definitions.

Definition 7. An m-regular set of portals on a quad-tree is a set of points on the edges of the squares in the quad-tree. Each square has one point on each corner and m equally spaced points on each edge.

Definition 8. An (m,r)-light salesman path, with respect to a quad-tree, is a salesman path which only crosses the quad-tree at an m-regular set of portals on the quad-tree, using ≤ r portals on each edge of each square of the quad-tree³, where we count multiple uses of the same portal as multiple portals. The edges in this path are allowed to ‘bend’ to reach a portal, meaning that an edge between two nodes doesn’t have to be a straight line.
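As an illustration of Definition 7, the following sketch places the portals on a single edge of a square. We assume here that ‘equally spaced’ means that the m interior portals, together with the two corners, divide the edge into m + 1 equal parts. The names `portals_on_edge` and `nearest_portal` are ours and do not appear in (Arora, 1998).

```python
def portals_on_edge(start, end, m):
    """The two corner portals plus m equally spaced portals between them."""
    (x0, y0), (x1, y1) = start, end
    return [(x0 + (x1 - x0) * k / (m + 1), y0 + (y1 - y0) * k / (m + 1))
            for k in range(m + 2)]

def nearest_portal(point, portals):
    """Portal closest to the given point (squared Euclidean distance)."""
    px, py = point
    return min(portals, key=lambda q: (q[0] - px) ** 2 + (q[1] - py) ** 2)
```

For an edge of length 8 and m = 3 the portals are 2 apart, so a crossing has to be moved by at most 1 to reach a portal.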

The dynamic programming step will find the optimal (m,r)-light salesman path, with respect to the shifted quad-tree. In section 4.3 we will show that this path is sufficiently close in cost to the optimal salesman path, with chance at least $\frac{1}{2}$, w.r.t. the random shift of the quad-tree.

The dynamic programming algorithm will use the following observations. Let S be a square in the quad-tree, excluding the outer square. Then the section of the optimal (m,r)-light salesman path $\mathrm{OPT}_{m,r}$ in S is a sequence of p paths such that:

• Let $a_1, \dots, a_{2p}$ be the sequence of portals which $\mathrm{OPT}_{m,r}$ crosses, in the order it crosses them. Then the $i$-th path connects $a_{2i-1}$ to $a_{2i}$.

• Each node in S is visited by one of the paths.

• The collection of paths uses at most r portals on each edge, where we count multiple uses of the same portal as multiple portals.

On each square S in the quad-tree, the section of $\mathrm{OPT}_{m,r}$ in S will be the ‘cheapest’ set of paths with these properties. If this weren’t the case, we could switch the set of paths with a cheaper set and still have a salesman path, but with a difference in cost equal to the difference between the sets of paths. Using these properties we may define the generalized version of the (m,r)-light salesman path, an optimal (m,r)-multipath, as follows.

³ Here the term square refers only to squares in the quad-tree which have not been further divided.


Definition 9. Let S be a non-empty square in the shifted quad-tree. Let M be a multiset of ≤ r portals on each side of S, such that the total number of portals in M, including multiplicity, is even⁴. Let $\{p_1, p_2\}, \{p_3, p_4\}, \dots, \{p_{2t-1}, p_{2t}\}$ be a pairing between the portals in M. The optimal (m,r)-multipath is the ‘cheapest’ set of paths, which connect $p_{2i-1}$ to $p_{2i}$, for $i = 1, \dots, t$ and collectively visit all nodes in S.

The optimal (m,r)-multipaths will form the entries of our lookup table, for all squares except the bounding box. The entry for the bounding box will be $\mathrm{OPT}_{m,r}$. For more information on the use of lookup tables in dynamic programming algorithms, see section 2.2.

In order to find the entry of a larger square S, using smaller squares $S_1, \dots, S_4$, we must have a way of combining smaller multipaths into larger multipaths. We do this by first choosing a set of ≤ r portals on each of the four inner edges of the ‘children’ of S. We then choose an ordering of the inner portals and a way to distribute them over the paths in an (m,r)-multipath of S. We can then simply look up the four entries corresponding to the choice of portals and their ordering on each $S_i$. The minimum over all possible choices will give us the table entry for S.⁵

Once we have the entries of the four children of the bounding box, we can find the (m,r)-light salesman path by checking each multiset and (reasonable) ordering of the inner portals, thus omitting the necessity for the inner portals to lie on a certain path.

4.2 Calculating the complexity of the algorithm

The smallest squares in the table contain one node and use ≤ 4r portals, so for each choice of portals computing the entry takes O(r) time, by placing the node in each of the O(r) paths. Each time we combine existing solutions, we have to try at most $(m+4)^{4r}$ multisets of portals on the inside edges, at most $(4r)!$ orderings of the portals and at most $(4r)^{4r}$ ways to distribute the portals over the given paths. We find that the calculation of each entry costs $O((m+4)^{4r}(4r)^{4r}(4r)!)$ time.
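The O(r) base case can be made concrete with a small sketch. We assume that the portal pairs are joined by straight segments and that exactly one of these paths makes a detour through the single node; the function name `leaf_entry` and the representation of a pairing as a list of point pairs are our own, hypothetical choices.

```python
import math

def leaf_entry(node, pairs):
    """Cost of the cheapest multipath in a square containing one node.

    `pairs` is a list of (portal, portal) tuples; every pair is connected by
    a path, and exactly one of the paths has to visit `node`.
    """
    if not pairs:
        return math.inf  # without any path the node cannot be visited
    direct = [math.dist(p, q) for p, q in pairs]
    # Try the node on each path in turn and keep the smallest extra cost.
    detour = min(math.dist(p, node) + math.dist(node, q) - d
                 for (p, q), d in zip(pairs, direct))
    return sum(direct) + detour
```

With two paths of length 8 and a node at (4,3), routing the node over the lower path costs 5 + 5 instead of 8, for a total of 18.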

There are $T = O(n \log(n))$ squares in the quad-tree. For each square there are at most $(m+4)^{4r}$ ways to choose the portals and $(4r)!$ ways to order the portals. This means that every square contributes at most $(m+4)^{4r}(4r)!$ entries to the table, giving us a total of $O(T(m+4)^{4r}(4r)!)$ entries.

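Factoring in the time for each entry we find a running time of $O(T(m+4)^{8r}(4r)^{4r}((4r)!)^2)$. By choosing $m = O(c \log(n))$ and $r = O(c)$ we find:

$$
\begin{aligned}
T_{\mathrm{run}} &= O\big(T(m+4)^{8r}(4r)^{4r}((4r)!)^2\big) \\
&= O\big(n \log(n)\,(c \log(n) + 4)^{8c}(4c)^{4c}((4c)!)^2\big) \\
&= O\big(n \log(n)\,(c \log(n))^{O(c)}(4c)^{O(c)}c^{O(c)}\big) \\
&= O\big(n\,(c \log(n))^{O(c)}c^{O(c)}\big) \\
&=^{*} O\big(n\,(\log(n))^{O(c)}\big)
\end{aligned}
$$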

⁴ Note that the use of a multiset implies the possibility of some portals appearing multiple times.

⁵ Note that explicitly placing the inner portals on paths keeps cycles from appearing.


(*) Since c is a constant, we find $O(c^{O(c)}) = O(1)$.

We see that the algorithm indeed runs in polynomial time, with the exponent of the $\log(n)$ factor depending on the desired precision.

4.3 Proof of correctness

The algorithm as described earlier computes the optimal (m,r)-light salesman path. The original problem, however, was to find an approximation of the optimal salesman path. In this section we will discuss a proof for the following theorem, which relates these two problems. We will not give the full proof and will focus on the important aspects instead.

Theorem 1 (Structure Theorem). Let c > 0 be any constant. Let the minimum nonzero internode distance in a Euclidean graph be 8 and let L be the size of the bounding box. Then for a randomly chosen (a,b)-shifted quad-tree and $m = O(c \log(L))$, $r = O(c)$, there is a chance of at least $\frac{1}{2}$ that the optimal (m,r)-light salesman path is at most $1 + \frac{1}{c}$ times as expensive as the optimal salesman path.

In the proof we will use the following two lemmas.

Lemma 6 (Patching Lemma). There is a constant g > 0 such that the following is true. Let S be any line segment of length s and π be a closed path that crosses S at least three times. Then there exist line segments on S whose total length is at most g · s and whose addition to π changes it into a closed path π′ that crosses S at most twice.

Lemma 7. For a grid where the distance between lines is of unit length, for π a salesman path and l a line in the grid, let t(π, l) denote the number of times π crosses l. If the minimum internode distance is at least 4 and T is the length of π, then

$$\sum_{l\ \text{vertical}} t(\pi, l) + \sum_{l\ \text{horizontal}} t(\pi, l) \le 2T.$$

For a proof of the first lemma, resulting in g = 4, see section 4.4.1. The second lemma can be proven by looking at a single line segment in π and showing that the number of lines it crosses is at most twice its length⁶.
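One possible way to fill in the single-segment argument is sketched below. We assume that a segment of π is a straight line of length $\ell \ge 4$ with horizontal extent $\Delta x$ and vertical extent $\Delta y$; this notation is ours. The segment crosses at most $|\Delta x| + 1$ vertical and $|\Delta y| + 1$ horizontal grid-lines, and $|\Delta x| + |\Delta y| \le \sqrt{2}\,\ell$.

```latex
\begin{align*}
\#\text{crossings}
  &\le (|\Delta x| + 1) + (|\Delta y| + 1) \\
  &\le \sqrt{2}\,\ell + 2 \\
  &\le 2\ell
  \qquad \text{whenever } \ell \ge \frac{2}{2 - \sqrt{2}} = 2 + \sqrt{2} \approx 3.41 .
\end{align*}
```

Since every segment has length at least 4, the last condition is met, and summing over all segments of π gives the bound 2T.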

Proof (sketch). In the proof we will use constants $s = 12gc$ and $r = s + 4$. In order to find the difference in cost between the optimal salesman path π and the optimal (m,r)-light salesman path, we will modify π into an (m,r)-light salesman path and show that with chance $\ge \frac{1}{2}$ it is at most $1 + \frac{1}{c}$ times as expensive as π.

Let l be a line in the quad-tree of the graph. We say that a line is at level i if it borders a square in the quad-tree which was formed after i divisions. For example, the vertical lines on the outer borders of the left quad-tree in figure 4.1 are at level 0, 1 and 2, whereas the vertical line through the middle of the square is only at level 1 and 2.

⁶ In order for this inequality to hold, we must use the assumption that the internode distance is ≥ 4.


We must first modify π, such that it is (m,r)-light at l. We do this by going through all subsections of l and applying lemma 6 to each subsection that has more than s crossings⁷. We do this using a bottom-up approach, starting with the edges of the smallest squares and gradually dealing with larger sections⁸. Rewriting some sums⁹ and using the fact that we have applied the lemma at most $\frac{t(\pi,l)}{s-3}$ times¹⁰, we find that the expected cost increase for the line l, over the random shift of l, is at most $\frac{2g\,t(\pi,l)}{s-3}$.

We then move each crossing of π with l to the nearest portal of an m-regular set of portals, where $m \ge 2s \log(L)$, by adding two line segments on l. The expected increase for this procedure is at most

$$
\begin{aligned}
E(\text{Cost of moving portals}) &= \sum_{i=1}^{\log(L)} P(\text{level } l = i) \cdot t(\pi, l) \cdot 2 \cdot \text{distance to portal} \\
&= \sum_{i=1}^{\log(L)} \frac{2^i}{L} \cdot t(\pi, l) \cdot \frac{L}{2^i m} \\
&= \frac{t(\pi, l) \log(L)}{m} \\
&\le \frac{t(\pi, l)}{2s} \qquad (\text{since } m \ge 2s \log(L))
\end{aligned}
$$

Assuming that $s > 15$ and $g > 1$ we find

$$\frac{2g\,t(\pi, l)}{s - 3} + \frac{t(\pi, l)}{2s} \le \frac{3g\,t(\pi, l)}{s}.$$
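The two assumptions enter this inequality as follows. This is a short verification of our own, not taken from (Arora, 1998): $s > 15$ gives $s - 3 > \frac{4}{5}s$, and $g > 1$ lets us bound the second term by a multiple of $g$.

```latex
\begin{align*}
\frac{2g\,t(\pi,l)}{s-3} + \frac{t(\pi,l)}{2s}
  &\le \frac{2g\,t(\pi,l)}{\tfrac{4}{5}s} + \frac{g\,t(\pi,l)}{2s} \\
  &= \frac{5}{2}\cdot\frac{g\,t(\pi,l)}{s} + \frac{1}{2}\cdot\frac{g\,t(\pi,l)}{s} \\
  &= \frac{3g\,t(\pi,l)}{s}.
\end{align*}
```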

By adding the expected cost increase of all lines together and using lemma 7 we find

$$\sum_{l\ \text{vertical}} \frac{3g\,t(\pi, l)}{s} + \sum_{l\ \text{horizontal}} \frac{3g\,t(\pi, l)}{s} \le \frac{6g\,\mathrm{OPT}}{s} = \frac{\mathrm{OPT}}{2c},$$

where OPT is the length of π. A direct application of Markov’s inequality now gives us that the total cost increase is less than $\frac{\mathrm{OPT}}{c}$ with a chance of at least $\frac{1}{2}$.
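Written out, the final step reads as follows, where X denotes the total cost increase (a non-negative random variable over the random shift; the symbol X is our notation). The first line uses the choice $s = 12gc$.

```latex
\begin{align*}
E(X) &\le \frac{6g\,\mathrm{OPT}}{s} = \frac{6g\,\mathrm{OPT}}{12gc} = \frac{\mathrm{OPT}}{2c}, \\
P\!\left(X \ge \frac{\mathrm{OPT}}{c}\right)
  &\le \frac{E(X)}{\mathrm{OPT}/c}
   \le \frac{\mathrm{OPT}/(2c)}{\mathrm{OPT}/c} = \frac{1}{2}, \\
P\!\left(X < \frac{\mathrm{OPT}}{c}\right) &\ge \frac{1}{2}.
\end{align*}
```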

⁷ The reason why we apply the lemma for more than s crossings is that we may also get more crossings when we apply the lemma on a line l′, perpendicular to l. We can limit this to 2 extra crossings per line, using the patching lemma, giving us at most 4 extra crossings per section, resulting in s + 4 = r crossings.

⁸ We will not discuss exactly how this approach works.

⁹ We will not discuss these derivations.

¹⁰ Since each application replaces at least s + 1 crossings by at most 4 crossings.


4.4 Additions

4.4.1 Patch Lemma

In this section we will give an alternate proof of the patching lemma, which was used to prove the correctness of the algorithm. We will find a value of g = 4, instead of the g = 6 which was found in (Arora, 1998).

Lemma 8 (Patching Lemma). There is a constant g > 0 such that the following is true. Let S be any line segment of length s and π be a closed path that crosses S at least three times. Then there exist line segments on S whose total length is at most g · s and whose addition to π changes it into a closed path π′ that crosses S at most twice.

Proof. Like the proof in (Arora, 1998), we will use the well-known result that any connected graph with all even degrees has a closed Eulerian path. In order to illustrate the proof, we will apply it to the example path below.

Figure 4.2: An example closed path π, crossing a line S at the points $v_1, \dots, v_6$.

For ease of exposition, we will change the visualization of the given path. We will break the path up into arcs, at the points where it crosses S. In the example, this is at $v_1, v_2, \dots, v_6$. We will then represent the crossing points as vertical lines and the arcs as horizontal lines between them. This results in the following, ‘layered’ representation.

Figure 4.3: The ‘layered’ representation of the path π.


From here on, we will interpret the path as a graph, with the crossing points as vertices and the arcs between them as edges. Note that the layers only connect their endpoints, meaning that $v_1$ and $v_2$ are not connected. Also note that the vertical position of the ‘layers’ has no extra meaning. Our goal is to turn the graph into an Eulerian graph, on both sides of S, by adding edges along S. This is sufficient, because we can find the closed path π′ by first walking along an Eulerian path on the top side of S, starting and ending at any of the original crossing points, then crossing and doing the same on the bottom side of S, again crossing at the same point.

We construct these Eulerian graphs by first connecting each pair of consecutive crossing points. This guarantees that the graph on $v_1, v_2, \dots, v_6$, with the horizontal lines as edges, is connected. We now add a second edge between two consecutive crossing points if the number of ‘layers’ between them, counting the edge just added, is odd. This results in the following graph.

Figure 4.4: The ‘layered’ representation of the path π′. The added edges on S are indicated in red.

Figure 4.5: The top half of π′, as a multi-graph.

This graph has all even degrees, because the degree $d_i$ at crossing point $v_i$ is $d_i = E_{L,i} + E_{R,i}$, where $E_{L,i}$ is the number of edges starting at $v_i$ and moving to the left and $E_{R,i}$ is defined similarly but to the right. We find that $E_{L,i} + E_{R,i} = L_{L,i} + L_{R,i} - 2L_{\bullet,i}$, where $L_{L,i}$ is the number of layers slightly to the left of $v_i$, $L_{R,i}$ the number slightly to the right and $L_{\bullet,i}$ the number of layers that move along $v_i$ without ending or beginning at it. For example, on the top side of the example graph, we have $E_{L,3} = 3$, $E_{R,3} = 1$, $L_{L,3} = 4$, $L_{R,3} = 2$ and $L_{\bullet,3} = 1$. We find that $d_i \equiv L_{L,i} + L_{R,i} \equiv 0 + 0 \mod 2$ and thus $v_i$ has even degree. This makes the graph Eulerian at both sides of S.

Finally we have to find the value of g. We add at most 2 edges between each pair of consecutive crossing points and thus add at most 2s worth of edges. We do this on both sides of S, giving us a final value of g = 4.
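The construction on one side of S can be checked with a small sketch. The representation is our own assumption: crossing points are numbered 0, ..., k−1 along S, and each arc on the chosen side is given as a pair of indices. We also assume that the parity of the layers over a gap is counted including the connecting edge, so that the total number of layers over every gap ends up even. The names `patch_edges` and `degrees` are hypothetical.

```python
def patch_edges(num_points, arcs):
    """Number of edges added along S in each gap between consecutive points."""
    added = []
    for k in range(num_points - 1):
        # Layers over gap k: arcs that start at or before v_k and end after it.
        layers = sum(1 for i, j in arcs if min(i, j) <= k < max(i, j))
        # One connecting edge always; a second one if the total would be odd.
        added.append(1 if (layers + 1) % 2 == 0 else 2)
    return added

def degrees(num_points, arcs, added):
    """Degree of every crossing point in the resulting multigraph."""
    deg = [0] * num_points
    for i, j in arcs:
        deg[i] += 1
        deg[j] += 1
    for k, edges in enumerate(added):
        deg[k] += edges
        deg[k + 1] += edges
    return deg
```

For six crossing points with arcs (0,1), (2,5) and (3,4), every gap receives one or two edges and all degrees come out even, as the parity argument predicts. Since no gap receives more than two edges, the added length on this side is at most 2s.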

This lemma gives us an upper bound of g, but we can also find a lower bound of g, by looking at the following path.


Figure 4.6: A closed path for which $g \ge 2$, with crossing points $v_1$ and $v_2$¹¹.

Note that for any choice of crossing points of π′, we can find a pair of crossing points, which lie on crossing points of π and are at least as efficient¹². Thus we may assume that the crossing points of π′ lie on $v_1$, $v_2$.

If both crossing points lie on the same point then π′ has to go back and forth on the top side of S, giving us $g \ge 2$. If the crossing points lie on different points, one on $v_1$ and one on $v_2$, π′ has to go along S once on the top side and once on the bottom side, since the arcs on the bottom side also go along the entirety of S and, in order to get to the other side of S, the path has to traverse S an odd number of times. This again gives $g \ge 2$.

4.4.2 Reasons behind the shifted quad-tree

The randomly shifted quad-tree is probably the most mystifying part of this algorithm. In this section we will discuss its role in more detail.

As part of the proof of correctness, we find an upper bound on the expected cost increase from making a salesman tour (m,r)-light. Since the structure of each shifted quad-tree can be completely different, the number of times an optimal salesman tour crosses one of the lines in the quad-tree can vary significantly between different quad-trees. Since the number of crossings is loosely related to the aforementioned cost increase, it is not surprising that some shifted quad-trees will result in much better approximations than others. The proof of correctness really just tells us that at least half of the shifted quad-trees will meet our requirements.

This explains the need for a random shift, but not the need for a quad-tree. First of all, the quad-tree decreases the runtime by roughly a factor of n, since a quad-tree has $O(n \log(n))$ squares which have to be computed in the dynamic programming step, where a full dissection has $O(n^2 \log(n))$ squares. In addition, the quad-tree limits the amount of unnecessary crossings by not dividing up empty squares. This also factors into some of the derivations which were left out in section 4.3.

¹¹ Strictly speaking there should be four crossing points, but for the sake of clarity overlapping crossing points are left out.

¹² This is not hard to prove: simply consider all possible situations for a certain crossing point and you will see that the claim holds for each.


5 Conclusions

In this thesis we have examined two algorithms, each of which solves a different version of TSP.

We have discussed an algorithm for general TSP as seen in (Bjorklund, 2010). This algorithm used some algebraic methods to reduce the problem to a simpler one, which was then solved using a combination of existing methods. One of the most notable techniques used was the use of rings of characteristic two to have certain, unwanted elements cancel out. Another notable technique was the use of Markov’s inequality and the Schwartz-Zippel lemma to justify the assumption that, with high probability, a certain polynomial is zero, if it is zero for some test input.

We have also discussed an approximation algorithm for Euclidean TSP as seen in (Arora, 1998). This algorithm uses a more geometric approach, combined with dynamic programming. One of the most interesting results is the manner in which the problem is divided up into smaller subproblems, in order to allow for a dynamic programming approach. Another surprising result is the fact that what seems like a very convoluted method turns out to be rather fast. Like the first algorithm, this algorithm also depends on randomized elements to function, also using Markov’s inequality to deal with the probabilities. It is not at all unlikely that we will see more algorithms incorporate random elements in the future.

Of course, there is still a lot of progress to be made on the Traveling Salesman Problem. In regard to the papers discussed in this thesis, one avenue of research may be the optimization of these algorithms. In (Bjorklund, 2010) there is a section discussing a more optimal choice of some of the parameters. It may be possible to improve the efficiency or the chance of correctness by further tweaking these variables. This may also be the case for (Arora, 1998). In the case of (Arora, 1998) it may also be possible to find a more efficient dissection than a quad-tree, or perhaps one which doesn’t require randomness to achieve a polynomial runtime.

Or perhaps there is some completely different approach which could prove effective.


6 Popular Summary

We’ve all been there: it’s a Saturday, you’re in the city centre with a long list of errands to run. You have to pick up your prescription from the pharmacist, buy groceries for dinner, check five different stores to find a birthday gift for a friend, and so on, ending up back at the car park you started at. Of course, like most rational people, you strongly dislike shopping and want to finish as fast as possible. This raises the question, which order of the locations results in the shortest shopping trip? Most people will see this as an average Saturday, but a mathematician will immediately find an interesting scheduling problem in this situation. What we have here is an instance of the famous Traveling Salesman Problem, a problem which has been the focal point of a large amount of mathematical research. The problem asks for the fastest route along a collection of points, which starts and ends at the same point. Like a lot of famous problems in mathematics, stating the problem is not very complex, but finding good solutions has proven to be quite difficult.

Figure 6.1: An example of a possible tour along a collection of cities.

In this case a ‘good’ solution consists of a fast algorithm which finds the fastest tour for a given instance of the Traveling Salesman Problem. Currently, the fastest known algorithms run in time exponential in the number of points. This means that, if there are n points to visit, the runtime will be around $k^n$, for some constant k. This means that adding a single extra point will multiply the run time by a factor of k.

Part of the difficulty of this problem is the sheer number of possible tours we need to check. If we fix one of the points as the starting point, we can pick any of the remaining n − 1 points as the next point, then any of the remaining n − 2, etc. This gives us a total of $(n-1)! = (n-1) \cdot (n-2) \cdots 3 \cdot 2 \cdot 1$ possible routes to check!

Luckily, putting certain restrictions on the problem can make it easier to solve. The assumption that the distances between points are ‘Euclidean’ in nature, meaning that the points behave as points in a 2-dimensional plane, allows for much faster algorithms. When we also allow approximate solutions, we can push the runtime down even further.

In my thesis I discuss two algorithms, one for the general case of the Traveling Salesman Problem and one for the approximate, Euclidean variant. I discuss the workings of each algorithm and also give a proof of correctness for each. I also try to add as much original work as I can, by giving alternative proofs for lemmas from the papers and other such additions.

One of the reasons this is such a famous problem is that it is ‘NP-hard’. Essentially what this means is that finding a fast solution to this problem will allow you to then find fast solutions to a whole class of problems. Among these problems are things like prime factorization, which is vital to the functioning of some widely used cryptographic methods. This means that finding a fast enough solution to the Traveling Salesman Problem could potentially endanger the privacy of a large portion of internet traffic.


7 Bibliography

Sanjeev Arora. 1998. Polynomial Time Approximation Schemes for Euclidean Traveling Salesman and Other Geometric Problems. J. ACM 45, 5 (Sept. 1998), 753–782. DOI: http://dx.doi.org/10.1145/290179.290180

Andreas Bjorklund. 2010. Determinant Sums for Undirected Hamiltonicity. CoRR abs/1008.0541 (2010). http://arxiv.org/abs/1008.0541

James R. Bunch and John E. Hopcroft. 1974. Triangular factorization and inversion by fast matrix multiplication. Math. Comp. 28, 125 (1974), 231–236. http://www.ams.org/jourcgi/jour-getitem?pii=S0025-5718-1974-0331751-8

F. Yates. 1937. The Design and Analysis of Factorial Experiments. Imperial Bureau of Soil Science. https://books.google.nl/books?id=YW1OAAAAMAAJ
