Submitted to Operations Research; manuscript no. 2

An O(log n / log log n)-approximation Algorithm for the Asymmetric Traveling Salesman Problem
Arash Asadpour, NYU Stern School of Business, Department of Information, Operations, and Management Sciences, [email protected]
Michel X. Goemans, Massachusetts Institute of Technology, Department of Mathematics, [email protected]
Aleksander Mądry, Massachusetts Institute of Technology, Department of Electrical Engineering & Computer Science, [email protected]
Shayan Oveis Gharan, University of Washington, Department of Computer Science and Engineering, [email protected]
Amin Saberi, Stanford University, Department of Management Science and Engineering, [email protected]
We present a randomized O(log n / log log n)-approximation algorithm for the asymmetric traveling salesman problem (ATSP). This provides the first asymptotic improvement over the long-standing Θ(log n)-approximation bound stemming from the work of Frieze et al. [FGM82].
The key ingredient of our approach is a new connection between the approximability of the ATSP and the
notion of so-called thin trees. To exploit this connection, we employ maximum entropy rounding – a novel
method of randomized rounding of LP relaxations of optimization problems. We believe that this method
might be of independent interest.
Key words : maximum entropy; thin tree; Held-Karp relaxation; circulation network.
Subject classifications : Analysis of algorithms: performance guarantee; Linear programming: randomized
rounding; Traveling salesman: approximation algorithm.
Area of review : Optimization.
1. Introduction
The traveling salesman problem is one of the most celebrated and intensively studied problems in
combinatorial optimization [LSKL85, ABCC07, Coo12]. It has found applications in logistics, plan-
ning, manufacturing and testing microchips [LK74], as well as DNA sequencing [PTW01]. The roots
of this problem go back as far as the first half of the 19th century, to the works of Hamilton [Coo12] on
chess knight movement. However, its most popular formulation is in the context of traveling through
a collection of cities: given a list of cities and their pairwise distances, the task is to find a shortest
possible tour that visits each city exactly once.
The asymmetric (or general) traveling salesman problem (ATSP) concerns a situation when the
distances between the cities are asymmetric. Formally, we are given a set V of n points and an
(asymmetric) cost function c : V ×V →R+. For the rest of this paper, we represent the problem by
a directed graph G, where V (resp. V ×V ) refers to the set of vertices (resp. arcs) of G. The goal is
to find a minimum cost tour that visits every vertex exactly once. As is standard in the literature,
throughout the paper, we assume that the cost function is a metric, i.e., it satisfies the
triangle inequality c_{ik} ≤ c_{ij} + c_{jk} for all vertices i, j, and k. Note that if we are allowed to visit a
vertex more than once, then by substituting the cost of each arc (u, v) with the cost of the shortest
path from u to v, we automatically ensure that the triangle inequality holds for all triplets.
In the very important special case of the problem where the cost function is symmetric, i.e., when
for every u, v ∈ V we have c(u, v) = c(v,u), there is a celebrated factor 3/2-approximation algorithm
due to Christofides [Chr76]. This algorithm first computes a minimum cost spanning tree T on V ; then
finds the minimum cost Eulerian augmentation of that tree; and finally shortcuts the corresponding
Eulerian walk into a tour.
In this paper, we are concerned with the general, asymmetric version and give an O(log n / log log n)-
approximation algorithm for it. This finally improves the thirty-year-old Θ(log n)-approximation
ratio stemming from the work of Frieze et al. [FGM82] and its subsequent improvements to
0.999 log_2 n, 0.842 log_2 n, and 0.666 log_2 n [Bla02, KLSS05, FS07]. This paper is a revised and extended
version of the conference version by Asadpour et al. [AGM+10].
1.1. The Roadmap of the Algorithm
Our algorithm begins by formulating the asymmetric TSP via the Held-Karp linear programming
relaxation. By solving the LP relaxation, the algorithm gets an optimal fractional solution x ∗. The
algorithm then symmetrizes the solution x* and scales it down by a factor of (n−1)/n, in order to get
a solution z*. We show that z* is in the spanning tree polytope of the undirected version of G.
See Section 3 for the details about the Held-Karp relaxation and the transformation of x ∗ to z ∗.
The main step of our approach is to round the fractional solution z ∗ to a spanning tree (i.e., a
corner point in the spanning tree polytope) with some specific properties: we want to find a spanning
tree T that, on one hand, has relatively small cost compared to OPTHK (the value of the optimum
solution of the Held-Karp relaxation of the asymmetric TSP), and on the other hand, has the crucial
property of having (almost) the same number of edges in any cut as z*. In order to do so, we
introduce the notion of thinness. More specifically, as we define in Definition 4.1, a spanning tree
T is called "(α, s)-thin" with respect to a given point z in the spanning tree polytope if, for each
subset of vertices ∅ ≠ U ⊂ V, the number of edges of T with exactly one endpoint in U is at most
α · Σ_{u∈U, v∈V\U} z_{uv}, and moreover, the cost of T is at most s times the cost of z. The central fact motivating
our interest in such a tree is that, as we show in Theorem 4.1, if a spanning tree T is (α,s)-thin
with respect to z ∗, then one can find an Eulerian augmentation of T with a total cost of at most
(2α+ s)OPTHK (and thus, within the same factor of the optimum solution of the asymmetric TSP).
The idea of the augmentation is based on Hoffman's circulation theorem. We emphasize that the
importance of the thin tree definition is that it disregards the orientation of the arcs. This helps us
eliminate the inherent asymmetry of the asymmetric TSP. See Section 4 for the details.
We remark that two similar notions of thinness have been studied before in the literature. In
Section 4 of [God], a notion of thinness is defined to study Jaeger's conjecture. This definition can
be thought of as a special case of ours, when z is integral. Also, in [GHJS09] the notion of "nearly-balanced"
graphs was introduced, where instead of bounding the number of edges in every cut by
some factor of the size of the cut, the authors require that the number of out-edges and in-edges in every
cut be close to each other.
Note that we are building an Eulerian graph by augmenting a spanning tree. In this regard, one
can think of our approach as the generalization of Christofides’ work to the asymmetric TSP.
In the light of the above, the technical core of our paper comprises a way to find an (α, s)-thin tree
with respect to z*, for α = O(log n / log log n) and s = O(1). We achieve this by sampling a spanning tree
from a certain distribution over the spanning trees of the undirected version of G. This distribution
is the one that maximizes the entropy among all the distributions that (approximately) preserve the
edge marginals imposed by z*. The crucial property of such an entropy-maximizing distribution is that
the events corresponding to edges being present in the sampled tree are negatively correlated. This
means that the well-known Chernoff bounds for the upper tail of independent settings still hold (see
Panconesi and Srinivasan [PS97]). See Section 5 for the details of maximum entropy sampling in the
asymmetric TSP.
By using this tail bound together with the union-bounding technique of Karger [Kar93], we are
able to establish the desired (O(log n / log log n), O(1))-thinness of the sampled tree (with high probability)
in Theorem 6.1. Note that even without a polynomial-time algorithm to sample from the maximum
entropy distribution, this theorem establishes that the Held-Karp LP relaxation has an integrality
gap no worse than O(log n / log log n). See Section 6 for more details.
All that remains, in order to obtain a polynomial-time algorithm with an O(log n / log log n)-
approximation factor for the asymmetric TSP, is to give a polynomial-time algorithm to sample from
the maximum entropy distribution. There are two main ingredients for this part. The first ingredient is
a succinct representation for the maximum entropy distribution as a product-form distribution, which
emerges as the solution of a convex program that can be efficiently solved. The other ingredient is the
fact that sampling from product-form distributions over spanning trees can be reduced to counting
the number of spanning trees, which is well-known to admit efficient algorithms. See Sections 5.2
and 5.3 for how to sample a spanning tree from product-form distributions, and Sections 7 and 8 for
how to solve the corresponding convex program.
We believe that this general, maximum entropy–based approach to rounding is also of indepen-
dent interest. In particular, after the publication of [AGM+10], this approach was used to obtain
an improved approximation algorithm for the graphic variant of the symmetric TSP [OSS11]. Maxi-
mum entropy distributions have a long history in information theory and statistical mechanics (refer
to [CT06] for some references and examples). In the context of randomized rounding, it has been
observed in [AS10] that the specific martingale–based dependent randomized rounding method used
in their algorithm for the fair allocation of indivisible goods coincides with a distribution that maxi-
mizes entropy on the bipartite matching polytope. In this paper, we introduce rounding by sampling
from the maximum entropy distribution as a potential technique for randomized rounding in the
presence of a combinatorial structure.
Algorithm 1 provides a high-level description of our approach. Also, the proof of our main
theorem (Theorem 6.2) gives a formal overview of the algorithm. In the rest of the introduction
we provide the high-level intuition behind our maximum entropy–based approach for finding a thin
spanning tree.
1.2. Rounding By Sampling From Maximum Entropy Distribution
At a high level, one can view our sampling procedure as a randomized rounding approach. It operates
on z ∗, a symmetrized and slightly scaled down version of our solution x ∗ to the Held-Karp relaxation.
As we will show in Section 3 (see Lemma 3.1), z* is a point in the (relative interior of the) spanning tree
polytope of an undirected graph whose edge set (defined by {{u, v} : x*_{uv} + x*_{vu} > 0}) is the support
of x*. In other words, z* can be seen as a "fractional spanning tree".
In the light of this observation, our goal is to round a given fractional spanning tree z ∗ to an
integral spanning tree (i.e., a corner point of the spanning tree polytope) while roughly preserving
some of its quantitative properties, namely, making it (α,s)-thin for some values of α and s that are
sufficiently small for our purpose.
The independent randomized rounding technique of Raghavan and Thompson [RT87] is one of
the most well-known and widely used techniques for rounding a fractional vector z = (z_1, z_2, …, z_m)
with 0 ≤ z_i ≤ 1 for each i. In this rounding technique, the fractional vector z is rounded to a 0/1
vector Z = (Z_1, Z_2, …, Z_m), where Pr[Z_i = 1] = z_i, independently for each i. This method has two
very convenient properties:
very convenient properties:
(Property 1) The resulting distribution preserves marginals, i.e., E[Z_i] = z_i for every 1 ≤ i ≤ m.
Hence, due to the linearity of expectation, every quantity that is linear in z (such as the
Algorithm 1: An O(log n / log log n)-approximation Algorithm for the ATSP.
Input: A set V consisting of n points and a cost function c : V × V → R+ satisfying the triangle
inequality.
Output: An O(log n / log log n)-approximation to the asymmetric TSP instance described by V and c.
1. Solve the Held-Karp LP relaxation of the ATSP instance to get an optimum extreme point
solution x*. Define z* by z*_{u,v} := (n−1)/n · (x*_{uv} + x*_{vu}), for all u, v ∈ V, making it a symmetrized
and scaled down version of x ∗. Vector z ∗ can be viewed as a point in the spanning tree
polytope of the undirected graph defined by the edges corresponding to the support of x ∗
after disregarding the directions of arcs. (See Section 3.)
2. Let E be the support graph of z* when the directions of the arcs are disregarded. Find weights
{γ_e}_{e∈E} such that the product-form distribution over spanning trees defined by p(T) ∝
exp(Σ_{e∈T} γ_e) approximately preserves the marginals imposed by z*, i.e., for any edge e ∈ E,

   Σ_{T : T ∋ e} p(T) ≤ (1 + ε) z*_e,

for a small enough value of ε. (In this paper, we show that ε = 0.2 suffices for our purpose. See
Sections 7 and 8 for a description of how to compute such a distribution.)
3. Sample 2⌈log n⌉ spanning trees T_1, …, T_{2⌈log n⌉} from p(·). (See Sections 5.2 and 5.3 for how
to sample a spanning tree from p(·) efficiently.) For each of these trees, orient all its edges so
as to minimize its cost with respect to our (asymmetric) cost function c. Let T* be the tree
whose resulting cost is the lowest among all the sampled trees.
4. Find a minimum cost integral circulation that contains the oriented tree ~T ∗. Shortcut this
circulation to a tour and output it. (See Section 4.)
total cost of the tree Σ_e c_e z_e, or the number of edges crossing a given cut) remains the
same in expectation. In other words, for any vector b ∈ R^m, we have E[bᵀZ] = bᵀz.
(Property 2) All the variables are rounded independently, which allows us to use strong concentration
bounds such as Chernoff bounds.
However, the problem with this approach is that the independent randomized rounding ignores
any underlying combinatorial structures of the solution. For example, in our case, it is not hard to
see that independent rounding of variables associated with each edge of our underlying graph will not
only most likely fail to produce a tree, but – even more critically – the resulting graph will be almost
surely disconnected. (We should note that one can show that oversampling the edges by a factor of
Ω(log n) would ensure connectivity here. Nevertheless, the expected cost of the sampled graph would
increase by the same factor, which is too big for our purpose of producing (O(log n / log log n), O(1))-thin
spanning trees.)
Therefore, our rounding procedure needs to be more careful. In particular, to ensure connectivity,
we will restrict ourselves only to sampling from distributions over spanning trees.
Note that since z is in the spanning tree polytope, it can be expressed as a convex combination of
spanning trees. Namely, we can write z in the following form:

   z = α_1 T^(1) + α_2 T^(2) + ⋯ + α_k T^(k),

for some α_j ≥ 0 with Σ_j α_j = 1, where each T^(j) = (T^(j)_1, T^(j)_2, …, T^(j)_m) is a 0/1 vector representing
a spanning tree with edge set {e : T^(j)_e = 1}. One can see that there can be many ways to express a
fractional spanning tree z in such a manner, and each one of these convex combinations defines a
distribution over spanning trees where each T^(j) is sampled with probability α_j. However, all such
distributions preserve the edge marginals imposed by z, i.e., if T is the sampled tree, then E[T_i] = z_i
for 1 ≤ i ≤ m. Hence, the choice of any such convex combination automatically ensures that Property 1
of the independent
randomized rounding technique still holds.
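The exactness of the preserved marginals can be checked on a tiny example. The sketch below (a hypothetical triangle instance, not from the paper) writes z = (2/3, 2/3, 2/3) as the uniform convex combination of the three spanning trees of a triangle and verifies E[T_e] = z_e exactly:

```python
from fractions import Fraction

# Spanning trees of the triangle on {0, 1, 2}: each tree omits one edge.
edges = [(0, 1), (1, 2), (0, 2)]
trees = [
    {(0, 1), (1, 2)},
    {(1, 2), (0, 2)},
    {(0, 1), (0, 2)},
]
alphas = [Fraction(1, 3)] * 3           # convex combination weights
z = {e: Fraction(2, 3) for e in edges}  # the fractional spanning tree

# Property 1 holds exactly: E[T_e] = sum_j alpha_j * [e in T^(j)] = z_e.
for e in edges:
    marginal = sum(a for a, t in zip(alphas, trees) if e in t)
    assert marginal == z[e]
```

Exact rational arithmetic (`Fraction`) makes the equality literal rather than approximate.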
Unfortunately, Property 2 (namely, independence) is impossible to satisfy, as the underlying struc-
ture of spanning trees imposes inevitable dependencies between the random variables. This is prob-
lematic, as without the ability to resort to some strong concentration phenomena we cannot hope
that all the (exponentially many) quantities that we are interested in preserving (specifically, the
number of edges crossing each cut) are indeed simultaneously close to their desired expected value.
However, there is a way to get around this difficulty. The crucial observation is that we do not really
need these events (corresponding to edges being a part of the sampled tree) to be fully independent. It
actually suffices if they are negatively correlated. As was observed first by Panconesi and Srinivasan
[PS97], negative correlation is sufficient for the upper tail of the Chernoff bound to hold. As we then
show, this concentration combined with the union-bounding technique of Karger [Kar93] will be sufficient
to obtain the desired (O(log n / log log n), O(1)) bound for the thinness of the sampled tree. See Sections 5.4
and 6 for details.
Now, in the light of the above discussion, it remains to devise a way of obtaining (and efficiently
sampling from) a marginal-preserving distribution over spanning trees that has such a negative
correlation property. Our idea is to employ maximum entropy rounding: among all the marginal-preserving
distributions over spanning trees, we take the one that maximizes the entropy.
Intuitively, this distribution maximizes the “randomness” while preserving the structural properties
of the fractional solution z . In other words, it avoids inflicting any correlation upon the edges other
than what is imposed by the spanning tree structure.
To complete our approach, first we need to devise an algorithm to find and sample from the
marginal-preserving maximum entropy distribution, and second we need to prove the negative corre-
lation property. We note that finding the maximum entropy distribution and sampling from it may
seem intractable at first, because this distribution is supported over all (possibly exponentially many)
spanning trees of our graph. The key to answering both of these questions is the close connection between
maximum entropy distributions and product-form distributions. Product-form distributions can be
viewed as a generalization of uniform distributions. In general, any set of weights γ_e assigned to the edges
of a graph defines a product-form distribution over spanning trees where the probability of each tree
T is proportional to exp(Σ_{e∈T} γ_e). Note that having equal values of γ_e results in the uniform
distribution over spanning trees. It was known before the publication of [AGM+10] that maximum
entropy distributions are product-form distributions and any product-form distribution is a maximum
entropy distribution for its own marginals (see, e.g., [BV06, Section 5.2.4]). This has several implications.
First, the maximum entropy distribution can be described concisely just by representing the
values γ_e for all edges. Second, sampling from the maximum entropy distribution reduces to the problem
of sampling uniform spanning trees, which has been studied for many years [Gue83, Kul90, CMN96].
Third, the maximum entropy distribution satisfies negative correlation because any product-form
distribution over spanning trees satisfies negative correlation [LP14, Chapter 4].
It remains to find the maximum entropy distribution over spanning trees that preserves the marginal
probabilities imposed by z. This problem can be cast as a simple convex optimization
problem with exponentially many variables, corresponding to the probability of each spanning tree
of our graph. We then give two polynomial-time algorithms that find product-form distributions
that approximately (with sufficiently high precision) preserve the marginals of a given vector z in
the spanning tree polytope. The first one is a simple combinatorial algorithm based on the idea
of multiplicative weight updates. The second one uses the Ellipsoid method to find a near optimal
solution to the dual of the maximum entropy convex program. For more information on the Ellipsoid
method, we refer the reader to [GLS93]. See Sections 7 and 8 for more details.
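The interplay between product-form weights and marginals can be sketched on a toy instance. The code below is not the paper's algorithm from Sections 7 and 8; it is a minimal brute-force sketch that enumerates all 16 spanning trees of K4 explicitly (feasible only at toy scale) and fits weights λ_e = exp(γ_e) to hypothetical target marginals z by iterative proportional fitting, rescaling one edge weight at a time so its marginal matches exactly:

```python
import itertools
import math

# All 16 spanning trees of K4, found by brute force (Cayley: 4^{4-2} = 16).
V = range(4)
edges = [(u, v) for u in V for v in V if u < v]

def is_spanning_tree(t):
    parent = list(range(4))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in t:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True

trees = [t for t in itertools.combinations(edges, 3) if is_spanning_tree(t)]

def marginals(lam):
    """Edge marginals of the product-form distribution p(T) ~ prod_{e in T} lam[e]."""
    weight = {t: math.prod(lam[e] for e in t) for t in trees}
    total = sum(weight.values())
    return {e: sum(weight[t] for t in trees if e in t) / total for e in edges}

# Hypothetical target marginals in the relative interior of the spanning tree
# polytope (each tree has 3 edges, so the marginals sum to 3).
z = {e: 0.5 for e in edges}
z[(0, 1)], z[(2, 3)] = 0.7, 0.3

# Iterative proportional fitting: rescale lam[e] so that the marginal of e
# matches z[e] exactly, given the other weights; then move to the next edge.
lam = {e: 1.0 for e in edges}
for _ in range(300):
    for e in edges:
        p_e = marginals(lam)[e]
        lam[e] *= z[e] * (1 - p_e) / ((1 - z[e]) * p_e)

p = marginals(lam)
assert max(abs(p[e] - z[e]) for e in edges) < 1e-4
```

The fitted λ describe the (approximate) maximum entropy distribution concisely, exactly as the first implication above states; a polynomial-time version must of course avoid enumerating trees.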
1.3. Structure of the Paper
The rest of the paper is organized as follows. In Section 2, we define the required notation. In
Section 3, we discuss the Held-Karp linear programming relaxation for ATSP and how to derive a
solution z ∗ in the spanning tree polytope of the undirected version of G from the optimal fractional
solution x ∗ of the Held-Karp LP relaxation. Our main proof starts afterwards. In Section 4, we
formally define thin trees and we reduce the approximation of the asymmetric TSP to the problem
of finding a thin tree. In Section 5, we formally define the maximum entropy sampling method and
the maximum entropy convex program that preserves the marginals of z*, derived from the optimal
solution of the Held-Karp relaxation. We also prove that the optimum solution of this program is a
product-form distribution. Finally, we prove our main result in Section 6 and show that we can
efficiently find an O(log n / log log n)-approximate solution for the asymmetric TSP. This is done by proving
that a spanning tree sampled from the maximum entropy distribution with respect to the marginal
probabilities z* is sufficiently thin. The last two sections are dedicated to providing two different
algorithms for finding a product-form distribution that (approximately) preserves the marginals of
a given point z in the spanning tree polytope; namely, in Section 7 we provide a combinatorial
algorithm and in Section 8, we use the Ellipsoid method to solve the dual of the maximum entropy
convex program.
2. Notation
Throughout this paper, we use a = (u, v) to denote the arc (directed edge) from u to v and e = {u, v}
to denote an undirected edge. We will use A (resp. E) to denote the set of arcs (resp. edges) of a
directed (resp. undirected) graph we are working with. (This graph will always be clear from the
context. See the comments after (3.5) as an example.)
Given a function f : A → R on the arcs of a graph, we define the cost of f to be c(f) :=
Σ_{a∈A} c(a) f(a), where c : V × V → R+ is the cost function of the asymmetric traveling salesman
problem.
Also, for a subset S ⊆ A of arcs, we denote by f(S) := Σ_{a∈S} f(a) the sum of the values of f on
this subset. We use an analogous notation for a function defined on the edge set E of an undirected
graph, or for a vector whose entries correspond to the elements of A or E.
For a given subset of vertices U ⊆ V, we also define

   δ+(U) := {a = (u, v) ∈ A : u ∈ U, v ∉ U},
   δ−(U) := δ+(V \ U),
   A(U) := {(u, v) ∈ A : u ∈ U, v ∈ U},

to be the set of arcs that are, respectively, leaving, entering, and contained in U. Also, with a slight
abuse of notation, for a vertex v ∈ V we sometimes use δ+(v) and δ−(v) instead of δ+({v}) and
δ−({v}), respectively. Similarly, for an undirected graph, δ(U) denotes the set of edges with exactly
one endpoint in U, and E(U) denotes the set of edges with both endpoints in U.
Finally, for the rest of this paper, we use log to denote the natural logarithm.
3. The Held-Karp Relaxation
We start by presenting the Held-Karp relaxation [HK70] for the asymmetric traveling salesman
problem. In this relaxation, given an instance of ATSP with cost function c : V ×V →R+, we consider
the following linear program defined on the complete digraph over the vertex set V :
   min  Σ_a c(a) x_a                                      (3.1)
   s.t. x(δ+(U)) ≥ 1          ∀U ⊂ V with U ≠ ∅,          (3.2)
        x(δ+(v)) = x(δ−(v)) = 1    ∀v ∈ V,                (3.3)
        x_a ≥ 0               ∀a.
As shown in [HK70], the cost OPTHK := c(x ∗) of the optimal solution x ∗ for the preceding linear
program is a lower bound on the cost OPT of the optimal solution to the input instance of ATSP. To
observe this, note that every subset of arcs (e.g., every tour) can be represented by assigning binary
values to xa-s. More specifically, for every arc a, the value of xa can be set to 1 if a is included in the
tour and 0 otherwise. Note that the cost of the tour will be the corresponding value of the objective
function for this solution. Also, for every tour that we are interested in (i.e., the tours that visit
every city exactly once and return to the original vertex), the corresponding solution satisfies (3.2)
(due to being connected) as well as (3.3) (due to entering and exiting every city exactly once).
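The feasibility argument above can be verified mechanically on a toy instance. The sketch below (a hypothetical 4-city tour, not from the paper) builds the 0/1 incidence vector of a tour and checks constraints (3.2) and (3.3) by enumerating all cuts:

```python
from itertools import chain, combinations

# Hypothetical toy instance: 4 cities, tour 0 -> 1 -> 2 -> 3 -> 0.
n = 4
V = range(n)
tour_arcs = {(0, 1), (1, 2), (2, 3), (3, 0)}
x = {(u, v): 1 if (u, v) in tour_arcs else 0
     for u in V for v in V if u != v}

def delta_plus(U):
    """Arcs leaving U."""
    return [(u, v) for (u, v) in x if u in U and v not in U]

def delta_minus(U):
    """Arcs entering U."""
    return [(u, v) for (u, v) in x if u not in U and v in U]

# Constraint (3.3): every city is entered and left exactly once.
for v in V:
    assert sum(x[a] for a in delta_plus({v})) == 1
    assert sum(x[a] for a in delta_minus({v})) == 1

# Constraint (3.2): every nonempty proper subset of cities is left at
# least once (the tour is connected).
subsets = chain.from_iterable(combinations(V, k) for k in range(1, n))
for U in subsets:
    assert sum(x[a] for a in delta_plus(set(U))) >= 1
```

Since every tour yields a feasible point, the optimum LP value OPT_HK is indeed a lower bound on OPT.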
Although the Held-Karp relaxation has an exponential number of constraints (in terms of |V|), it is
well-known that an optimal solution x* to it can be computed in polynomial time (either by employing
the Ellipsoid method or by reformulating it as an LP with a polynomial number of constraints).
Furthermore, we can assume that x ∗ is an extreme point of the corresponding polytope.
Now, observe that (3.3) implies that any feasible solution x to the Held-Karp relaxation satisfies
x (δ+(U)) = x (δ−(U)), (3.4)
for any U ⊆ V . In other words, for any subset U ⊆ V of vertices, the (fractional) number of arcs
leaving U in x is equal to the (fractional) number of arcs entering it.
Our particular interest will be in a symmetrized and slightly scaled down version of x*. Namely,
let us define

   z*_{u,v} := (n−1)/n · (x*_{uv} + x*_{vu}).             (3.5)
Let us also denote by A the support of x*, i.e., A = {(u, v) : x*_{uv} > 0}, and by E the support of
z*. For every edge e = {u, v} of E, we define its cost as min{c(a) : a ∈ {(u, v), (v, u)} ∩ A}, which
corresponds to choosing the cheaper of the possible orientations of that edge. We denote this new cost
of edge e by c(e). This implies, in particular, that c(z*) ≤ c(x*).
The main purpose of the scaling factor in (3.5) is to make z ∗ belong to the spanning tree polytope P
of the graph (V,E). This ensures that z ∗ can be viewed as a convex combination of incidence vectors
of spanning trees. In fact, as stated in Lemma 3.1, this makes z ∗ belong to the relative interior of P .
We will later use this fact in proving Theorem 5.1. We emphasize that E is defined as the support
of z*, i.e., E = {{u, v} : z*_{u,v} > 0}. The proof of Lemma 3.1 can be found in the appendix.
Lemma 3.1. The vector z ∗ defined by (3.5) belongs to relint(P ), the relative interior of the spanning
tree polytope P of the graph (V,E).
Finally, we note that as x* is an extreme point (i.e., basic feasible) solution, it is known that its
support A has at most 3n − 2 arcs [VY99]. (Also, see [Goe06] for a bound of 3n − 4.) In addition,
since x* can be expressed as the unique solution of an invertible system with only 0–1 coefficients,
by using the result of [Had93], we have that every entry x*_a is rational with integral numerator and
denominator bounded by 2^{O(n log n)}. In particular, z*_min := min_{e∈E} z*_e > 2^{−O(n log n)}.
Remark 3.1. All our results still hold if we take any optimal solution x* (not necessarily an extreme
point) where the denominators of all the entries x*_a are bounded by 2^{n^{O(1)}}. However, for the
rest of the paper we will focus on the aforementioned solution x* and its corresponding z*, where
min_{e∈E} z*_e > 2^{−O(n log n)}.
4. Thin Trees and the Asymmetric Traveling Salesman Problem
The key element of our approach is a connection between the approximability of the asymmetric
traveling salesman problem and the notion of thin trees. To describe this connection, let us first
formally define thin trees.
Definition 4.1. Given a point z in the spanning tree polytope of the graph (V, E) and an α ≥ 1,
we say that a spanning tree T is α-thin with respect to z iff, for each subset U ⊂ V of vertices,
|E(T) ∩ δ(U)| ≤ α · z(δ(U)). Also, we say that T is (α, s)-thin with respect to z iff it is α-thin and
moreover c(T) ≤ s · c(z), where c(T) := Σ_{{u,v}∈T} min{c((u, v)), c((v, u))} denotes the cost of T after
directing each of its edges according to the orientation that yields the smaller cost.
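The cut condition in this definition can be checked by brute force on small instances. The sketch below (a hypothetical toy example, not from the paper) takes K4 with z_e = 1/2 on every edge (a point in the spanning tree polytope, since each tree uses 3 of the 6 edges) and computes the smallest α for which the star tree rooted at vertex 0 is α-thin:

```python
from itertools import chain, combinations

# Hypothetical toy example: K4, z_e = 1/2, and the star tree rooted at 0.
V = range(4)
edges = [(u, v) for u in V for v in V if u < v]
z = {e: 0.5 for e in edges}
T = {(0, 1), (0, 2), (0, 3)}

def crossing(edge_set, U):
    """Edges of edge_set with exactly one endpoint in U."""
    return [(u, v) for (u, v) in edge_set if (u in U) != (v in U)]

def thinness(tree, z):
    """Smallest alpha such that the tree is alpha-thin w.r.t. z (Definition 4.1)."""
    alpha = 0.0
    cuts = chain.from_iterable(combinations(V, k) for k in range(1, len(V)))
    for U in map(set, cuts):
        tree_cut = len(crossing(tree, U))
        z_cut = sum(z[e] for e in crossing(edges, U))
        alpha = max(alpha, tree_cut / z_cut)
    return alpha

# The cut U = {0} is the bottleneck: 3 tree edges against z(delta(U)) = 1.5.
assert thinness(T, z) == 2.0
```

Enumerating all 2^n cuts is of course only viable at toy scale; the point of Theorem 6.1 is to certify thinness without such enumeration.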
Note that the requirement of α-thinness only concerns the graph structure and not the weight
of the edges. Now, let us consider x ∗ to be an optimal solution to the Held-Karp relaxation, as
described in Section 3, and z ∗ to be the symmetrized and scaled down version of x ∗ defined in (3.5).
By Lemma 3.1, we know that z ∗ belongs to the (relative interior of) the corresponding spanning tree
polytope. The crucial observation that we make is that finding an (α,s)-thin tree with respect to
z ∗, for some α and s, translates directly to obtaining a (2α+ s)-approximation for the asymmetric
traveling salesman problem.
Theorem 4.1. Let x ∗ be an optimal solution to the Held-Karp relaxation and z ∗ be the correspond-
ing point in the spanning tree polytope as defined in (3.5). Given T ∗, an (α,s)-thin spanning tree
with respect to z* for some α and s, we can find in polynomial time an Eulerian tour whose cost is
at most (2α + s) c(x*) = (2α + s) OPT_HK ≤ (2α + s) OPT.
Our proof for Theorem 4.1 relies on a certain classical result on flow circulations. To state this
result, let us recall that a circulation is any function f : A→R such that f(δ+(v)) = f(δ−(v)) for each
vertex v ∈ V . The following theorem (see, e.g., [Sch03, Theorem 11.2] for a proof) gives a necessary
and sufficient condition for the existence of a circulation subject to the lower and upper bounds (or
capacities) on arcs.
Theorem 4.2 (Hoffman’s circulation theorem). Given lower and upper capacities l, u : A→R,
there exists a circulation f satisfying l(a)≤ f(a)≤ u(a) for all a∈A iff
1. l(a)≤ u(a) for all a∈A and
2. l(δ−(U))≤ u(δ+(U)), for all subsets U ⊆ V .
Furthermore, if l and u are integer-valued, f can be chosen to be integer-valued too.
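On instances small enough to enumerate, the equivalence stated by the theorem can be verified directly. The sketch below (hypothetical toy digraphs, not from the paper) brute-forces the existence of an integral circulation within the bounds and compares it against Hoffman's condition:

```python
from itertools import chain, combinations, product

def circulation_exists(V, arcs, l, u):
    """Brute-force search for an integral circulation with l <= f <= u."""
    ranges = [range(l[a], u[a] + 1) for a in arcs]
    for values in product(*ranges):
        f = dict(zip(arcs, values))
        # Flow conservation at every vertex: out-flow equals in-flow.
        if all(sum(f[a] for a in arcs if a[0] == v) ==
               sum(f[a] for a in arcs if a[1] == v) for v in V):
            return True
    return False

def hoffman_condition(V, arcs, l, u):
    """Check l(a) <= u(a) and l(delta^-(U)) <= u(delta^+(U)) for all U."""
    if any(l[a] > u[a] for a in arcs):
        return False
    subsets = chain.from_iterable(
        combinations(V, k) for k in range(1, len(V) + 1))
    for U in map(set, subsets):
        l_in = sum(l[a] for a in arcs if a[0] not in U and a[1] in U)
        u_out = sum(u[a] for a in arcs if a[0] in U and a[1] not in U)
        if l_in > u_out:
            return False
    return True

# Feasible toy instance: a directed triangle with one unit forced on each arc.
V = [0, 1, 2]
arcs = [(0, 1), (1, 2), (2, 0)]
l = {a: 1 for a in arcs}
u = {a: 2 for a in arcs}
assert hoffman_condition(V, arcs, l, u) and circulation_exists(V, arcs, l, u)

# Infeasible toy instance: the cut U = {0} demands more flow in than can leave.
V2, arcs2 = [0, 1], [(0, 1), (1, 0)]
l2, u2 = {(0, 1): 0, (1, 0): 2}, {(0, 1): 1, (1, 0): 3}
assert not hoffman_condition(V2, arcs2, l2, u2)
assert not circulation_exists(V2, arcs2, l2, u2)
```

In both instances the brute-force answer agrees with Hoffman's condition, as the theorem guarantees.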
We proceed now to the proof of Theorem 4.1.
Proof of Theorem 4.1: Let us first orient each edge {u, v} of T* according to arg min{c(a) :
a ∈ {(u, v), (v, u)} ∩ A}, and denote the resulting directed tree by ~T*. Observe that by definition of
our undirected cost function, we have c(~T*) = c(T*).
Let us now consider a minimum cost augmentation of ~T ∗ into an Eulerian directed (multi)graph.
Finding such an augmentation can be formulated as a minimum cost circulation problem with integral
lower capacities and infinite upper capacities. One just needs to set
   l(a) := 1 if a ∈ ~T*, and l(a) := 0 otherwise,

and consider the minimum cost circulation problem

   min{c(f) : f is a circulation and f(a) ≥ l(a) for all a ∈ A}.    (4.1)
It is well-known (see, e.g., [Sch03, Corollary 12.2a]) that an optimum solution f∗ to Problem (4.1)
is polynomial-time computable and can be assumed to be integral, since l is integral. This integral
circulation f∗ can be viewed as a directed (multi)graph H that contains ~T ∗ and is Eulerian, i.e.,
every vertex has in-degree equal to its out-degree in H.
Hence, as H is also weakly connected (since it contains ~T ∗ as its subgraph), we can take an Eulerian
walk of H and shortcut it to obtain a Hamiltonian cycle of cost at most c(f∗). (We are using here
the fact that the costs satisfy the triangle inequality.)
To complete the proof, it remains to bound the cost of f∗. That is, our goal is to show that
c(f∗)≤ (2α+ s)c(x ∗). To this end, let us define
u(a) := 1 + 2αx∗a if a ∈ ~T∗, and u(a) := 2αx∗a otherwise. (4.2)
We claim that there exists a circulation g satisfying l(a) ≤ g(a) ≤ u(a) for every a ∈ A. Note that this claim would imply that

c(f∗) ≤ c(g) ≤ c(u) = c(~T∗) + 2αc(x∗) ≤ (2α + s)c(x∗).

Indeed, since g is a feasible solution to (4.1), we have c(f∗) ≤ c(g), and the last inequality uses the s-thinness of T∗, i.e., c(~T∗) = c(T∗) ≤ s·c(x∗). This establishes the desired bound on the cost of f∗.
In light of this, it only remains to establish the claim. To this end, observe that the α-thinness of T∗ implies that, for any subset U ⊆ V of vertices, the number of arcs of ~T∗ in δ−(U) is at most αz∗(δ(U)), regardless of how T∗ is oriented to achieve ~T∗. As a consequence,

l(δ−(U)) ≤ αz∗(δ(U)) ≤ 2αx∗(δ−(U)).

Note that the second inequality holds because (3.5) implies that z∗(δ(U)) = ((n−1)/n)[x∗(δ−(U)) + x∗(δ+(U))]. However, x∗ itself is a circulation due to (3.4) and hence x∗(δ−(U)) = x∗(δ+(U)).
On the other hand, due to the definition of u(·) in (4.2), we have u(δ+(U))≥ 2αx ∗(δ+(U)). Thus,
u(δ+(U))≥ 2αx ∗(δ+(U)) = 2αx ∗(δ−(U))≥ l(δ−(U)).
Therefore, we can conclude that
l(δ−(U))≤ u(δ+(U)),
for any U ⊆ V and thus, by Theorem 4.2, the circulation g indeed exists. This concludes the proof
of the theorem.
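The final step of the proof (taking an Eulerian walk of H and shortcutting it) can be sketched as follows. The Eulerian multigraph H below is a hypothetical stand-in for the integral circulation f∗, and Hierholzer's algorithm is one standard way, not prescribed by the text, to extract the Eulerian walk; under the triangle inequality, skipping repeated vertices never increases the cost.

```python
from collections import defaultdict

def eulerian_circuit(multi_arcs, start):
    """Hierholzer's algorithm on a connected Eulerian directed multigraph."""
    out = defaultdict(list)
    for u, v in multi_arcs:
        out[u].append(v)
    stack, walk = [start], []
    while stack:
        u = stack[-1]
        if out[u]:
            stack.append(out[u].pop())
        else:
            walk.append(stack.pop())
    return walk[::-1]          # visits every arc exactly once

def shortcut(walk):
    """Skip already-visited vertices; valid for metric (triangle-inequality)
    costs, where a shortcut never costs more than the path it replaces."""
    seen, tour = set(), []
    for v in walk:
        if v not in seen:
            seen.add(v)
            tour.append(v)
    tour.append(tour[0])
    return tour

# Eulerian multigraph on 4 vertices containing the directed tree 0->1, 0->2, 2->3.
H = [(0, 1), (1, 0), (0, 2), (2, 3), (3, 2), (2, 0)]
walk = eulerian_circuit(H, 0)
tour = shortcut(walk)
assert sorted(tour[:-1]) == [0, 1, 2, 3] and tour[0] == tour[-1]
```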
5. Maximum Entropy Sampling and Concentration Bounds
In the light of the connection established in the previous section (see Theorem 4.1), our goal is to
develop a way of producing a tree that is sufficiently thin with respect to the point z ∗ (see (3.5)).
We will achieve that by devising an efficient procedure for sampling from an appropriate probability
distribution over spanning trees.
To this end, recall that by Lemma 3.1 we know that z∗ belongs to the relative interior of the spanning tree polytope P that corresponds to the optimal solution x∗ to our LP relaxation from Section 3. This means not only that z∗ can be expressed as a convex combination of spanning trees
on the support E of z ∗, but also it can be done in such a way that all coefficients are positive. In
fact, there can be many ways to express z ∗ in this manner.
Furthermore, observe that each such convex combination naturally defines a distribution over
spanning trees that preserves the marginal probabilities imposed by z ∗, i.e., it is the case that
PrT [e ∈ T ] = z∗e , for every edge e ∈ E. It is, therefore, tempting to use such a distribution for our
thin tree sampling procedure. (After all, one can interpret the thinness condition – cf. Definition 4.1
– as a statement about approximate preservation of the respective cost- and cut-based marginals.)
However, as we already mentioned, there are many possible ways of representing z ∗ as a probability
distribution. Which one should we choose? As it turns out, a good choice is to take the distribution
that maximizes the entropy among all such marginal-preserving distributions. We formalize this
notion as well as derive some of its crucial properties in the following section.
5.1. Maximum Entropy Distribution
Let T be the collection of all the spanning trees of G= (V,E) and let z be an arbitrary point in the
corresponding spanning tree polytope P of G. The maximum entropy distribution p∗(·) with respect
to the marginal probabilities imposed by z is the optimum solution of the following convex program.
inf ∑_{T∈T} p(T) log p(T) (5.1)

s.t. ∑_{T∋e} p(T) = ze ∀e ∈ E,

p(T) ≥ 0 ∀T ∈ T.
It is not hard to see that this convex program is feasible since z belongs to the spanning tree polytope
P. Note that since the support T of the distribution is finite (in fact, |T| ≤ n^{n−2}, due to Cayley's formula [Cay89]), the objective function is bounded in absolute value by log |T| ≤ (n−2) log n. Also, note
that the feasible region is compact (closed and bounded). Hence, the infimum is attained and there
exists an optimum solution p∗(·). Furthermore, since the objective function is strictly convex, this
maximum entropy distribution p∗(·) is unique. Let OPTEnt denote the optimum value of this convex
program CP (5.1).
The value p∗(T ) determines the probability of sampling any tree T in the maximum entropy
rounding scheme. Note that it is implicit in the constraints of this convex program that, for any
feasible solution p(·), we have ∑_T p(T) = 1, since

n − 1 = ∑_{e∈E} ze = ∑_{e∈E} ∑_{T∋e} p(T) = (n−1) ∑_T p(T),

where the last equality uses that every spanning tree has exactly n−1 edges.
We now show that if the vector z is in the relative interior of the spanning tree polytope of (V,E), then p∗(T) > 0 for every T ∈ T and p∗(·) is a product-form distribution. (Observe that our vector z∗ indeed satisfies this assumption.)
For this purpose, we write the Lagrange dual to the convex program CP (5.1) (see, e.g., [BTN01]
for the relevant background). For every e∈E, we associate a Lagrange multiplier δe to the constraint
corresponding to the marginal ze, and define the Lagrange function as
L(p, δ) = ∑_{T∈T} p(T) log p(T) − ∑_{e∈E} δe (∑_{T∋e} p(T) − ze).
We can then rewrite it as
L(p, δ) = ∑_{e∈E} δe ze + ∑_{T∈T} (p(T) log p(T) − p(T) ∑_{e∈T} δe).
The Lagrange dual to CP (5.1) is now
sup_δ inf_{p≥0} L(p, δ). (5.2)
The inner infimum in this dual is easy to solve. Namely, as the contributions of each p(T ) are
separable, we have that, for every T ∈ T , p(T ) must minimize the convex function
p(T ) log p(T )− p(T )δ(T ),
where, as usual, δ(T) = ∑_{e∈T} δe. Setting the partial derivative with respect to p(T) to zero, we derive that 1 + log p(T) − δ(T) = 0, or

p(T) = e^{δ(T)−1}. (5.3)
Thus,
inf_{p≥0} L(p, δ) = ∑_{e∈E} δe ze − ∑_{T∈T} e^{δ(T)−1}.
Using the change of variables γe = δe − 1/(n−1) for e ∈ E, the Lagrange dual (5.2) can thus be rewritten as

sup_γγγ [1 + ∑_{e∈E} ze γe − ∑_{T∈T} e^{γγγ(T)}]. (5.4)
Now, our assumption that the vector z is in the relative interior of the spanning tree polytope
translates to satisfying the Slater condition for the primal program (5.1) [BV06]. This, together
with convexity, implies that the supremum in (5.4) is attained by some vector γγγ∗, and the Lagrange
dual value equals the optimum value OPTEnt of our convex program. Furthermore, we have that the
(unique) primal optimum solution p∗ and any dual optimum solution γγγ∗ must satisfy
L(p,γγγ∗)≥L(p∗,γγγ∗)≥L(p∗,γγγ), (5.5)
for any p≥ 0 and any γγγ, where we have implicitly redefined L due to our change of variables from δ
to γγγ. Therefore, p∗ is the unique minimizer of L(p,γγγ∗) and from (5.3), we have that
p∗(T ) = eγγγ∗(T ). (5.6)
In summary, the following theorem holds.
Theorem 5.1. Given a vector z in the relative interior of the spanning tree polytope P of G =
(V,E), there exist γ∗e , for all e ∈ E, such that if we sample a spanning tree T of G according to
p∗(T ) := exp(γγγ∗(T )), then Pr[e∈ T ] = ze for every e∈E.
It is worth noting that the requirement that z is in the relative interior of the spanning tree
polytope (as opposed to being just in this polytope) is crucial. (This has already been observed before; see Exercise 4.19 in [LP14].) To see this, consider G being a triangle and z being the vector (1/2, 1/2, 1). In this case, one can verify that z is in the polytope (but not in its relative interior) and there are no γ∗e -s that would satisfy the statement of the theorem. (However, one can get arbitrarily close to ze for all e ∈ E.)
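To make the dual (5.4) concrete, the following sketch solves it by plain gradient ascent on the triangle from the example above, but now with marginals z = (0.5, 0.7, 0.8) strictly inside the polytope (assumed example values; the step size and iteration count are ad hoc, not from the paper). The gradient of the dual in γe is ze − qe(γγγ), so a fixed point matches the marginals exactly, as Theorem 5.1 predicts for interior z.

```python
import math
from itertools import combinations

# Triangle graph: edges 0, 1, 2; its spanning trees are all 2-edge subsets.
edges = [0, 1, 2]
trees = [set(t) for t in combinations(edges, 2)]
z = [0.5, 0.7, 0.8]   # interior marginals: sum = n-1 = 2, every tree has mass

def marginals(gamma):
    """q_e(gamma): probability of e under p(T) proportional to exp(gamma(T))."""
    w = [math.exp(sum(gamma[e] for e in T)) for T in trees]
    total = sum(w)
    return [sum(w[i] for i, T in enumerate(trees) if e in T) / total
            for e in edges]

gamma = [0.0, 0.0, 0.0]
for _ in range(20000):     # gradient ascent on the concave Lagrange dual (5.4)
    q = marginals(gamma)
    gamma = [g + 0.5 * (ze - qe) for g, ze, qe in zip(gamma, z, q)]

q = marginals(gamma)
assert all(abs(qe - ze) < 1e-4 for qe, ze in zip(q, z))
```

With the boundary vector (1/2, 1/2, 1) instead, the iterates would drive the γ of the forced edge to +∞ without ever attaining the marginals exactly, which is the phenomenon the triangle example illustrates.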
In Sections 7 and 8 we show how to efficiently find γγγ-s that approximately satisfy the conditions of
Theorem 5.1. The method proposed in Section 7 is combinatorial, as opposed to the Ellipsoid-based
method of Section 8. However, as we will see, the running time of the former grows polynomially
with the inverse of the desired error, while for the latter the running time grows polylogarithmically.
The results of Sections 7 and 8 can be formalized as the following theorem, whose result we use in the rest of the paper. For our application to the asymmetric traveling salesman problem, we set zmin to be 2^{−O(n log n)} and ε to be 1/5.
Theorem 5.2. Given z in the spanning tree polytope of G = (V,E) and some ε > 0, there is an
algorithm that finds values γe for all e ∈ E so that, for the product-form distribution p defined by

p(T) := exp(∑_{e∈T} γe) / ∑_{T′∈T} exp(∑_{e∈T′} γe) ∀T ∈ T,

we have, for every edge e ∈ E,

z̃e := ∑_{T∈T : T∋e} p(T) ≤ (1 + ε)ze.
Furthermore, the running time of the algorithm is polynomial in n= |V |, − log zmin and 1/ε. (For
the same theorem with running time polynomial in n= |V |, − log zmin and log(1/ε), see Section 8.)
Note that Theorem 5.2 does not require z to be in the relative interior of the spanning tree
polytope.
5.2. Maximum Entropy Distribution and λλλ-Random Trees
Interestingly, the distributions over spanning trees considered in Theorem 5.2 turn out to be closely
related to the notion of λλλ-random (spanning) trees. Given λe ≥ 0 for e ∈ E, a λλλ-random tree T of G is a tree T chosen from the set of all spanning trees of G with probability proportional to ∏_{e∈T} λe.
The notion of λλλ-random trees has been extensively studied (see, e.g., Ch.4 of [LP14]) – note that
in case of all λe-s being equal, a λλλ-random tree is just a uniform spanning tree of G. Many of the
results for uniform spanning trees carry over to λλλ-random spanning trees, since, for rational λe-s, a λλλ-random spanning tree in G corresponds to a uniform spanning tree in the multigraph obtained from G by letting the multiplicity of each edge e be proportional to λe.
Furthermore, observe that a tree T sampled from a probability distribution p(·) as given in The-
orem 5.2 is λλλ-random for λe := exp(γe) for all e ∈ E. As a result, we can use the tools developed
for λλλ-random trees to obtain an efficient sampling procedure, see Section 5.3, and to derive sharp
concentration bounds for the distribution p(·), see Section 6.
5.3. Sampling a λλλ-Random Tree
There is a host of results (see [Gue83, Kul90, CMN96, Ald90, Bro89, Wil96, KM09] and references
therein) on obtaining polynomial-time algorithms for generating a uniform spanning tree, i.e., a λλλ-
random tree for the case of all λe-s being equal. Almost all of them can be easily modified to allow arbitrary rational values of λe. However, many of those algorithms do not guarantee a polynomial running time in our situation, where the λe-s can take very different values. Therefore, we resort to an
iterative approach similar to the one of Kulkarni [Kul90] that runs in polynomial-time for arbitrary
rational values of λe. We note that “polynomial-time” here means that the algorithm takes a poly-
nomial number of bit operations in terms of the size of the input (which translates to the number of
bits required to represent the vector λλλ).
The basic idea of this approach is to order the edges e1, . . . , em of G arbitrarily and process them
one by one, deciding probabilistically whether to add a given edge to the final tree or to discard it.
To be precise, we define Dj := 1{ej ∈ T} as the decision we make for edge ej. We include the first edge, e1, in the tree with probability p1 = ze1. When we process the j-th edge ej for j > 1, we decide to add it to the final spanning tree T with probability pj(D1, . . . , Dj−1), which is the probability that ej is in a λλλ-random tree conditioned on the decisions that were made for edges e1, . . . , ej−1 in earlier iterations. In other words, pj(D1, . . . , Dj−1) = Pr[ej ∈ T | D1, . . . , Dj−1]. Clearly, this procedure generates a λλλ-
random tree, and its running time is polynomial as long as the computation of the probabilities pj
can be done in polynomial time.
To compute these probabilities efficiently we note that, by definition, p1 = ze1 . Now, if we choose
to include e1 in the tree (i.e., D1 = 1), then:
p2(1) = Pr[e2 ∈ T | e1 ∈ T] = (∑_{T′∋e1,e2} ∏_{e∈T′} λe) / (∑_{T′∋e1} ∏_{e∈T′} λe) = (∑_{T′∋e1,e2} ∏_{e∈T′\{e1}} λe) / (∑_{T′∋e1} ∏_{e∈T′\{e1}} λe).
As one can see, the probability that e2 ∈ T conditioned on the event that e1 ∈ T is equal to the
probability that e2 is in a λλλ-random tree of a graph obtained from G by contracting the edge e1.
Similarly, if we choose to discard e1, the probability p2 is equal to the probability that e2 is in a
λλλ-random tree of a graph obtained from G by removing e1. In general, pj(D1, · · · ,Dj−1) is equal to
the probability that ej is included in a λλλ-random tree of a graph G′ obtained from G by contracting
all edges that we have already decided to add to the tree, and deleting all edges that we have already
decided to discard. In other words, G′ = (G − {ei : Di = 0})/{ei : Di = 1}, where the symbol “/” stands for the contraction operation.
Therefore, to obtain each pj we just need to be able to compute efficiently, for a given (multi)graph G′ and values of the λe-s, the probability pG′[λλλ, f] that some edge f is in a λλλ-random tree of G′. It is well-known how to perform such a computation. For this purpose, one can evaluate ∑_{T∈T} ∏_{e∈T} λe for both G′ and G′/f (in which edge f is contracted) using Kirchhoff's matrix tree theorem (see [Bol02]).
Theorem 5.3 (Kirchhoff's Matrix Tree Theorem [Kir47]). Let G = (V,E) be an undirected graph with edge weights λe for e ∈ E, and let T be the set of all spanning trees of G. Then ∑_{T∈T} ∏_{e∈T} λe is equal to the absolute value of any cofactor of the weighted Laplacian L, where

L_{i,j} = −λe if i ≠ j and e = {i, j} ∈ E; L_{i,i} = ∑_{e∈δ(i)} λe; and L_{i,j} = 0 otherwise.
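Theorem 5.3 can be verified numerically on a small example. The sketch below uses K4 with arbitrary assumed rational weights 1..6, computes the cofactor of the weighted Laplacian with exact rational arithmetic, and compares it against brute-force enumeration of the spanning trees.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Exact determinant over the rationals via Gaussian elimination."""
    M = [row[:] for row in M]
    n, sign, prod = len(M), 1, Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            sign = -sign
        prod *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return sign * prod

def weighted_tree_sum(n, lam):
    """Sum over spanning trees of the product of edge weights, via a
    cofactor of the weighted Laplacian (Theorem 5.3)."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), w in lam.items():
        L[i][j] -= w
        L[j][i] -= w
        L[i][i] += w
        L[j][j] += w
    return abs(det([row[:-1] for row in L[:-1]]))  # delete last row and column

def is_spanning_tree(n, es):
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j in es:
        ri, rj = find(i), find(j)
        if ri == rj:
            return False
        parent[ri] = rj
    return True

# K4 with (assumed, arbitrary) weights 1..6: cofactor matches brute force.
n = 4
lam = {e: Fraction(k + 1) for k, e in enumerate(combinations(range(n), 2))}
brute = Fraction(0)
for es in combinations(lam, n - 1):
    if is_spanning_tree(n, es):
        term = Fraction(1)
        for e in es:
            term *= lam[e]
        brute += term
assert weighted_tree_sum(n, lam) == brute

# Unit weights recover Cayley's count n^(n-2) = 16 for K4.
assert weighted_tree_sum(n, {e: Fraction(1) for e in lam}) == 16
```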
An alternative approach to computing pG′ [λλλ,f ] is to use the fact (see, e.g., Ch. 4 of [LP14]) that
pG′ [λλλ,f ] is equal to λf times the effective resistance of f in G′ treated as an electrical circuit with
conductances of edges given by λλλ. The effective resistance can be expressed by an explicit linear-
algebraic formula whose computation boils down to inverting a certain matrix that can be easily
derived from the Laplacian of G′ (see, e.g., Section 2.4 of [GBS08] for details).
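The sequential procedure of this subsection can be sketched end to end as follows. Here the brute-force `tree_sum` is a stand-in for the Kirchhoff-based computation described above (so the sketch is exponential, not polynomial), the input is an assumed toy instance (K4 with unit weights), and exact rational arithmetic keeps the conditional probabilities exact, so the output is always a spanning tree.

```python
import random
from fractions import Fraction
from itertools import combinations

def _find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def tree_sum(n_comp, pairs, weights):
    """Brute-force sum over spanning trees of a contracted multigraph of the
    product of edge weights; Theorem 5.3 replaces this with a determinant."""
    if n_comp == 1:
        return Fraction(1)
    total = Fraction(0)
    for es in combinations(range(len(pairs)), n_comp - 1):
        parent, ok, w = {}, True, Fraction(1)
        for k in es:
            u, v = pairs[k]
            parent.setdefault(u, u)
            parent.setdefault(v, v)
            ru, rv = _find(parent, u), _find(parent, v)
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
            w *= weights[k]
        if ok:
            total += w
    return total

def sample_lambda_random_tree(n, edges, lam, rng):
    """Process edges one by one (Section 5.3): include e_j with its exact
    conditional probability and contract it; otherwise discard it."""
    parent = list(range(n))
    chosen = []
    for j, (u, v) in enumerate(edges):
        ru, rv = _find(parent, u), _find(parent, v)
        if ru == rv:
            continue                       # e_j became a self-loop: discard
        # Remaining multigraph: undecided edges with contracted endpoints.
        rest = []
        for k in range(j, len(edges)):
            a, b = _find(parent, edges[k][0]), _find(parent, edges[k][1])
            if a != b:
                rest.append((k, a, b))
        comps = len({_find(parent, x) for x in range(n)})
        total = tree_sum(comps, [(a, b) for _, a, b in rest],
                         [lam[k] for k, _, _ in rest])
        # Trees through e_j: contract it and weight the count by lam[j].
        c_pairs, c_w = [], []
        for k, a, b in rest:
            if k == j:
                continue
            a2 = rv if a == ru else a
            b2 = rv if b == ru else b
            if a2 != b2:
                c_pairs.append((a2, b2))
                c_w.append(lam[k])
        p = lam[j] * tree_sum(comps - 1, c_pairs, c_w) / total
        if rng.random() < p:               # p is an exact Fraction
            chosen.append(j)
            parent[ru] = rv
    return chosen

# K4 with uniform weights: the result is always a spanning tree.
n = 4
edges = list(combinations(range(n), 2))
lam = [Fraction(1)] * len(edges)
tree = sample_lambda_random_tree(n, edges, lam, random.Random(0))
assert len(tree) == n - 1
assert tree_sum(n, [edges[k] for k in tree], [Fraction(1)] * (n - 1)) == 1
```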
5.4. Negative Correlation and Concentration Bounds
We now derive a concentration bound that will be instrumental in establishing the thinness of our
sampled tree. The following theorem states that for any arbitrary subset C of edges in the graph,
the number of edges in C that are included in a sampled λλλ-random tree is with high probability
concentrated around its expected value.
Theorem 5.4. For each edge e, let Xe be an indicator random variable associated with the event
[e ∈ T ], where T is a sampled λλλ-random tree. Also, for any subset C of the edges of G, define
X(C) = ∑_{e∈C} Xe. Then we have

Pr[X(C) ≥ (1 + δ)E[X(C)]] ≤ [e^δ / (1 + δ)^{1+δ}]^{E[X(C)]}.
Usually, when we want to obtain such concentration bounds, we prove that the variables {Xe}_{e∈E} are independent and use the Chernoff bound. Unfortunately, in our case, the variables {Xe}_{e∈E} are not independent. However, it is well-known that since our distribution is in product form, they are negatively correlated, i.e., for any subset F ⊆ E, Pr[∀e ∈ F : Xe = 1] ≤ ∏_{e∈F} Pr[Xe = 1]; see, e.g., Chapter 4 of [LP14].

Lemma 5.1. The random variables {Xe}_{e∈E} are negatively correlated.
Before proceeding further, we should note that Lyons and Peres prove this fact only in the case
of T being a uniform spanning tree, i.e., when all λe-s are equal. However, as mentioned before, for
rational λe-s it is enough to replace each edge e with Cλe copies of edge e (for an appropriate choice
of C to obtain all integral numbers, such as the lowest common denominator of all λe-s) and consider
a uniform spanning tree in the corresponding multigraph. This will make the probability of each tree
T proportional to∏e∈T Cλe =Cn−1
∏e∈T λe, which is proportional to
∏e∈T λe. The irrational case
follows from a straightforward limit argument where C→∞.
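Negative correlation can also be verified exactly on a small weighted example. The sketch below enumerates the spanning trees of K4 under arbitrary assumed rational weights 1..6 and checks the pairwise inequality of Lemma 5.1 with exact arithmetic.

```python
from fractions import Fraction
from itertools import combinations

def spanning_trees(n, edges):
    """All spanning trees of (V, edges), as sets of edge indices."""
    trees = []
    for es in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        ok = True
        for k in es:
            ru, rv = find(edges[k][0]), find(edges[k][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            trees.append(frozenset(es))
    return trees

# K4 with assumed rational weights; Pr[T] is proportional to the product
# of lambda_e over e in T.
n = 4
edges = list(combinations(range(n), 2))
lam = [Fraction(k + 1) for k in range(len(edges))]
trees = spanning_trees(n, edges)
weight = {T: Fraction(1) for T in trees}
for T in trees:
    for k in T:
        weight[T] *= lam[k]
Z = sum(weight.values())

def pr(required):
    """Probability that every edge index in `required` is in the tree."""
    return sum(w for T, w in weight.items() if required <= T) / Z

# Pairwise negative correlation, checked exactly (Lemma 5.1).
for e, f in combinations(range(len(edges)), 2):
    assert pr({e, f}) <= pr({e}) * pr({f})
```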
Now that we have established negative correlation between the Xe-s, Theorem 5.4 follows directly from Theorem 3.4 of Panconesi and Srinivasan [PS97], which shows that the upper-tail part of the Chernoff bound requires only negative correlation (or an even weaker notion; see [PS97]) and not the full independence of the random variables.
Finally, it is worth pointing out that, since the initial publication of our work [AGM+10], another
way of producing negatively correlated marginal-preserving probability distributions over spanning
trees (or, more generally, on matroid bases) has been proposed [CVZ10]. This approach can also be
used in the framework developed in this paper.
6. Establishing the Thinness of the Sampled Tree
In this section, we focus on the product-form distribution p(·) that we obtain by applying the algo-
rithm of Theorem 5.2 to z ∗. We show that the tree sampled from the distribution p(·) is thin with
high probability. Refer to Theorem 6.1 for the formal statement.
The following lemma relates the number of edges of the sampled tree across a particular cut to the
sum of the fractions of the edges across that cut in the fractional solution. The proof can be found in
the appendix. In the rest of this paper and for the sake of simplicity, we define β = 4 logn/ log logn.
Lemma 6.1. Given z in the spanning tree polytope of G = (V,E) for n = |V | ≥ 5, let p(·) be the
probability distribution obtained by applying Theorem 5.2 on z for ε ≤ 0.2. Then for any set U ⊂ V,

Pr[|T ∩ δ(U)| > βz(δ(U))] ≤ n^{−2.5 z(δ(U))},
where T is a spanning tree sampled from p(·).
Now, we are ready to combine this concentration result with Karger’s bound on the number of
near-minimum cuts in a graph [Kar93] in order to establish the desired thinness of our sampled tree.
Theorem 6.1. Consider an instance of ATSP defined by the cost function c : V ×V →R+ for n=
|V | ≥ 5, and let x ∗ be the optimum solution of its Held-Karp linear programming relaxation. Also,
let z ∗ be a point in the spanning tree polytope of the underlying undirected graph G= (V,E) derived
from x∗ by (3.5). Let T1, . . . , T⌈2 log n⌉ be ⌈2 log n⌉ independent samples from a distribution p(·) as given by Theorem 5.2 applied on z∗ for ε = 0.2. Finally, let T∗ be the tree among these samples that minimizes the cost c(Tj), for 1 ≤ j ≤ ⌈2 log n⌉. Then, with probability at least 1 − 2/(n−1), the spanning tree T∗ is (4 log n/ log log n, 2)-thin with respect to z∗.
Proof: We start by showing that for any 1 ≤ j ≤ ⌈2 log n⌉, Tj is β-thin with high probability for β = 4 log n/ log log n. From Lemma 6.1 we know that the probability of some particular cut δ(U) violating the β-thinness of Tj is at most n^{−2.5 z∗(δ(U))}. Now, we use a result of Karger [Kar93]
that shows that there are at most n^{2l} cuts of size at most l times the minimum cut value, for any half-integer l ≥ 1. Since, by the definitions of the Held-Karp relaxation and of z∗, we know that z∗(δ(U)) ≥ 2(1 − 1/n), this means there are at most n^l cuts δ(U) with z∗(δ(U)) ≤ l(1 − 1/n), for any
integer l ≥ 2. Therefore, by applying the union bound (and n ≥ 5), we derive that the probability
that there exists some cut δ(U) with |Tj ∩ δ(U)| > βz∗(δ(U)) is at most

∑_{i=3}^∞ n^i n^{−2.5(i−1)(1−1/n)},

where each term is an upper bound on the probability that there exists a violating cut δ(U) with z∗(δ(U)) in [(i−1)(1−1/n), i(1−1/n)]. For n ≥ 5, this simplifies to

∑_{i=3}^∞ n^i n^{−2.5(i−1)(1−1/n)} ≤ ∑_{i=3}^∞ n^{−i+2} = 1/(n−1).

Thus, the probability that there exists a cut for which the β-thinness property is violated by T∗ is at most 1/(n−1).
Now, the expected cost of Tj is

E[c(Tj)] = ∑_{e∈E} c(e)z̃e ≤ (1 + ε) ∑_{e∈E} c(e)z∗e ≤ (1 + ε)((n−1)/n) ∑_{a∈A} c(a)x∗a ≤ (1 + ε)OPTHK.

So, by Markov's inequality, for any j, the probability that c(Tj) > 2OPTHK is at most (1 + ε)/2. Thus, with probability at most ((1 + ε)/2)^{⌈2 log n⌉} < 1/n for ε = 0.2, we have c(T∗) > 2OPTHK.
Now, taking a union bound over the event that β-thinness is violated in some cut and the event that the cost exceeds 2OPTHK, we conclude that the probability that T∗ is not (β, 2)-thin is at most 1/(n−1) + 1/n ≤ 2/(n−1). This completes the proof of the theorem.
We note that the preceding proof establishes a thinness probability of 1 − O(1/n^k), for any k > 1, by increasing the value of β by a factor of k.
After proving the thinness result of Theorem 6.1, we can put it together with the developments
of Section 4 to establish the main result of the paper.
Theorem 6.2. Algorithm 1 finds a (2 + 8 log n/ log log n)-approximate solution to the Asymmetric Traveling Salesman Problem with probability at least 1 − 2/(n−1), in time that is polynomial in the size of the input.
Proof: The algorithm starts by finding an optimal extreme-point solution x ∗ to the Held-Karp
LP relaxation of ATSP of value OPTHK. Next, using the algorithm of Theorem 5.2 on z ∗ (which
is defined by (3.5)) with ε = 0.2, we obtain γe-s that define the probability distribution p(T) ∝ exp(∑_{e∈T} γe). Since x∗ was an extreme point, we know that z∗min ≥ e^{−O(n log n)}; thus, the algorithm
of Theorem 5.2 indeed runs in polynomial time.
Next, we use the polynomial-time sampling procedure of Subsection 5.3 to sample ⌈2 log n⌉ trees Tj from the distribution p(·), and take T∗ to be the one among them that minimizes the cost c(Tj). By Theorem 6.1, we know that T∗ is (4 log n/ log log n, 2)-thin with probability at least 1 − 2/(n−1).
Finally, we use Theorem 4.1 to obtain, in polynomial time, a (2 + 8 logn/ log logn)-approximation
of our ATSP instance.
The proof also shows that the integrality gap of the Held-Karp relaxation for the asymmetric TSP
is bounded above by 2 + 8 logn/ log logn. The best known lower bound on the integrality gap is only
2, as shown in [CGK06]. Closing this gap is a challenging open question.
Corollary 6.1. Showing the existence of a (C1,C2)-thin spanning tree, where C1 and C2 are uni-
versal constants, would establish a constant integrality gap for the Held-Karp LP relaxation of the
asymmetric TSP. Moreover, an efficient algorithm to find such spanning trees would lead to an effi-
cient constant factor approximation algorithm for the asymmetric TSP.
We believe that finding such thin trees could be the key to improving the approximation factor for ATSP even further. Indeed, since the publication of an earlier version of this paper [AGM+10], thin trees were used by [OS11] to develop constant-factor approximation algorithms for ATSP on planar graphs and graphs with bounded genus, and by [AO15] to obtain a polyloglog(n) bound on the integrality gap of the Held-Karp relaxation. The former shows the existence of constant-thin trees for planar graphs and graphs with bounded genus; the latter shows the existence of trees whose thinness is bounded by a polynomial of log log(n) for all graphs, but stops short of giving a polynomial-time algorithm for finding them. We conjecture that constant-thin trees exist, that they can be found in polynomial time, and that, consequently, constant-factor approximation algorithms for ATSP exist.
Conjecture 6.1. There exist universal constants C1, C2 > 0 such that for any feasible solution of the Held-Karp relaxation of the asymmetric TSP, there exists a (C1, C2)-thin spanning tree, and it can be found in polynomial time.
We refer the reader to [OS11, Corollary 5.1] for a slightly more general variant of the conjecture
that does not depend on the cost of the arcs.
Conversely, one can approach a lower bound on the integrality gap of the Held-Karp relaxation by studying families of feasible solutions that do not admit thin trees. Goddyn [God] showed a direct relationship between thin trees and Jaeger's conjecture [Jae84] on the existence of nowhere-zero flows (a conjecture recently proved by Thomassen [Tho12]). Currently, Conjecture 6.1
has only been proved for planar and bounded genus graphs [OS11] (see also [ES13, HO14]).
7. Solving the Maximum Entropy Convex Program: A Combinatorial Approach
In this section, we provide a combinatorial algorithm to efficiently find γe-s that approximately
preserve the marginal probabilities given by z and therefore, prove Theorem 5.2. As an alternative,
in Section 8 we show that the maximum entropy convex program can also be solved via the Ellipsoid
method. The advantage of the latter approach is that the resulting running-time dependence on ε is
only polylogarithmic.
Given a vector γγγ, for each edge e, define

qe(γγγ) := (∑_{T∋e} e^{γγγ(T)}) / (∑_T e^{γγγ(T)}), (7.1)

where γγγ(T) = ∑_{f∈T} γf. For notational convenience, we have dropped the condition T ∈ T in these summations; this should not lead to any confusion. Restated, qe(γγγ) is the probability that edge e will
be included in a spanning tree T that is chosen with probability proportional to exp(γγγ(T )). We will
discuss the computational issues of deriving the values of qe(γγγ)-s at the end of this section.
We compute γγγ using the following simple algorithm. Starting with all γe-s equal to each other, as
long as the marginal qe(γγγ) for some edge e is more than (1 + ε)ze, we decrease γe appropriately to
decrease qe(γγγ) to (1 + ε/2)ze. A formal description can be found in Algorithm 2.
Algorithm 2 A Combinatorial Algorithm to Find γγγ as Defined in Theorem 5.2
Input: The vector of marginal probabilities z.
Set γγγ = 0.
while there exists an edge e with qe(γγγ) > (1 + ε)ze do
    Compute δ such that if we define γγγ′ by γ′e = γe − δ and γ′f = γf for all f ∈ E \ {e},
    then qe(γγγ′) = (1 + ε/2)ze.
    Set γγγ ← γγγ′.
end while
return γγγ.
Clearly, if the described procedure terminates then the resulting γγγ satisfies the requirement of
Theorem 5.2. Therefore, what we need to show is that this algorithm terminates in time polynomial
in n, − log zmin and 1/ε, and that each iteration can be implemented in polynomial time.
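A toy run of Algorithm 2 can be sketched as follows. Exact enumeration of the spanning trees replaces the matrix-tree computation of the qe(γγγ)-s, the update uses the δ formula that follows from Lemma 7.1(2), and the input is the triangle with assumed interior marginals; the iteration cap is a safety guard, not part of the algorithm.

```python
import math
from itertools import combinations

def spanning_trees(n, edges):
    trees = []
    for es in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        ok = True
        for k in es:
            ru, rv = find(edges[k][0]), find(edges[k][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            trees.append(frozenset(es))
    return trees

def marginals(gamma, trees, m):
    """q_e(gamma) computed by direct enumeration, cf. (7.1)."""
    w = {T: math.exp(sum(gamma[k] for k in T)) for T in trees}
    Z = sum(w.values())
    return [sum(w[T] for T in trees if k in T) / Z for k in range(m)]

def algorithm2(z, trees, m, eps, max_iter=100000):
    gamma = [0.0] * m
    for _ in range(max_iter):
        q = marginals(gamma, trees, m)
        bad = [k for k in range(m) if q[k] > (1 + eps) * z[k]]
        if not bad:
            return gamma
        k = bad[0]
        # Lemma 7.1(2): 1/q_e(gamma') - 1 = e^delta (1/q_e(gamma) - 1);
        # choose delta so the new marginal equals (1 + eps/2) z_e.
        target = (1 + eps / 2) * z[k]
        delta = math.log(q[k] * (1 - target) / ((1 - q[k]) * target))
        gamma[k] -= delta
    raise RuntimeError("did not terminate within max_iter")

# Triangle with interior marginals (assumed example values).
n, edges = 3, [(0, 1), (0, 2), (1, 2)]
trees = spanning_trees(n, edges)
z = [0.5, 0.7, 0.8]
eps = 0.1
gamma = algorithm2(z, trees, len(edges), eps)
q = marginals(gamma, trees, len(edges))
assert all(qk <= (1 + eps) * zk for qk, zk in zip(q, z))
```

On this instance only the first edge starts above its threshold, so a single update of γ0 already brings every marginal below (1 + ε)ze.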
We start by bounding the number of iterations; we will show that it is O((1/ε)|E|^2 [|V| log(|V|) − log(εzmin)]). The next lemma derives an equation for δ and proves that decreasing γe does not decrease qf(·) for any f ≠ e. We refer the reader to the appendix for the proof.
Lemma 7.1. Fix a δ ≥ 0 and an edge e ∈ E. Define γγγ′ by γ′e := γe − δ and γ′f := γf for all f ≠ e. Then,
1. for all f ∈ E \ {e}, qf(γγγ′) ≥ qf(γγγ), and
2. qe(γγγ′) satisfies 1/qe(γγγ′) − 1 = e^δ (1/qe(γγγ) − 1).
In particular, in the main loop of the algorithm, since qe(γγγ) > (1 + ε)ze and we want qe(γγγ′) = (1 + ε/2)ze, we get

δ = log [qe(γγγ)(1 − (1 + ε/2)ze)] / [(1 − qe(γγγ))(1 + ε/2)ze] > log [(1 + ε)/(1 + ε/2)] > ε/4

for ε ≤ 1. For larger values of ε, we can simply decrease ε to 1.
We continue with some basic definitions and results regarding spanning trees and forests.
Definition 7.1. A subgraph H of a given undirected graph G is called a spanning forest of G if it
consists of a spanning subtree in each connected component of G.
In other words, a spanning forest of G is a maximal acyclic subgraph of G. Now, consider some subset Q of the edges of G. Let r be the size of a maximum spanning forest of (V, Q). By the definition of r, it is clear that |T ∩ Q| ≤ r for any spanning tree T of G. We distinguish two types of spanning trees in G by defining T=^{(r)} := {T ∈ T : |T ∩ Q| = r} and T<^{(r)} := {T ∈ T : |T ∩ Q| < r}. The following lemma establishes some structural properties of the spanning trees of G. For the sake of simplicity, we may drop the superscript (r) and work with T= and T< instead; the value of r will be clear from the context.
Lemma 7.2. Let G = (V,E) be a graph with weights γe for e ∈ E. Let Q ⊂ E be such that for all f ∈ Q and e ∈ E \ Q, we have γf > γe + ∆ for some ∆ ≥ 0.
1. Any spanning tree T ∈ T= can be generated by taking the union of a spanning forest F (of cardinality r) of the graph (V, Q) and a spanning tree (of cardinality n − r − 1) of the graph G/Q in which the edges of Q have been contracted.
2. Let Tmax be a maximum-weight spanning tree of G with respect to the weights γγγ, i.e., Tmax = arg max_{T∈T} γγγ(T). Then, for any T ∈ T<, we have γγγ(T) < γγγ(Tmax) − ∆.
Proof: The first property follows from the matroidal properties of spanning trees. To prove the
second property, consider any T ∈ T<. Since |T ∩ Q| < r, there exists an edge f ∈ (Tmax ∩ Q) \ T such that (T ∩ Q) ∪ {f} is a forest of G. Thus, the unique circuit in T ∪ {f} contains an edge e ∉ Q. Hence, T′ = T ∪ {f} \ {e} is a spanning tree. Our assumption on Q implies that

γγγ(Tmax) ≥ γγγ(T′) = γγγ(T) − γe + γf > γγγ(T) + ∆,
which yields the desired inequality.
We proceed to bounding the number of iterations.
Lemma 7.3. Algorithm 2 executes at most O((1/ε)|E|^2 [|V| log(|V|) − log(εzmin)]) iterations of the main loop.
Proof: Let n = |V| and m = |E|. Assume for the sake of contradiction that the algorithm executes more than

τ := (4/ε) m^2 [n log n − log(εzmin)]

iterations. Let γγγ be the vector of γe-s computed at iteration τ + 1. For brevity, let us define qe := qe(γγγ) for all edges e. We first provide an upper bound for the minimum entry of γγγ.

Claim 7.1. There exists some e∗ ∈ E such that γe∗ < −ετ/(4m).
Proof: Note that there are m edges in the graph, and Lemma 7.1 implies that each “while”
iteration of Algorithm 2 decreases γe of one of these edges by δ, which is at least ε/4 (refer to the
discussion after the statement of Lemma 7.1). Moreover, all entries of γγγ are initially set to 0. Thus,
we know that, after more than τ iterations, there exists e∗ for which γe∗ is as desired.
Now, we provide a lower bound for the maximum entry of γγγ.
Claim 7.2. There exists some f∗ ∈E such that γf∗ = 0.
Proof: Note that Algorithm 2 never decreases γe for edges e with qe(·) smaller than (1 + ε)ze. Moreover, Lemma 7.1 shows that reducing γf of an edge f ≠ e can only increase qe(·). Therefore, all the edges with a negative value of γe must satisfy qe ≥ (1 + ε/2)ze. In other words, all edges e such
that qe < (1 + ε/2)ze satisfy γe = 0. However, a simple averaging argument establishes that∑
e qe =∑e
∑T3e exp(γγγ(T ))∑T exp(γγγ(T ))
=∑T∑e∈T exp(γγγ(T ))∑T exp(γγγ(T ))
= n− 1. Consequently,∑
e qe = n− 1< (1 + ε/2)(n− 1) = (1 +
ε/2)∑
e ze. Hence, there exists at least one edge f∗ with qf∗ < (1+ε/2)zf∗ and thus, having γf∗ = 0.
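The averaging identity used in the proof of Claim 7.2 can be checked directly on a toy instance. The sketch below is our own illustration (not part of the paper's algorithm): it enumerates the spanning trees of K4 by brute force and verifies that the γγγ-induced marginals q_e sum to n − 1 for an arbitrary choice of γγγ.

```python
import itertools
import math

def spanning_trees(n, edges):
    """Yield each spanning tree of the graph as a tuple of edge indices."""
    for combo in itertools.combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        ok = True
        for i in combo:
            ru, rv = find(edges[i][0]), find(edges[i][1])
            if ru == rv:          # adding this edge would close a cycle
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield combo

n = 4
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # K4
gamma = [0.3, -0.7, 0.1, 0.0, -1.2, 0.5]                   # arbitrary gamma vector

trees = list(spanning_trees(n, edges))
assert len(trees) == 16                                    # Cayley: 4^(4-2)

Z = sum(math.exp(sum(gamma[i] for i in T)) for T in trees)
q = [sum(math.exp(sum(gamma[i] for i in T)) for T in trees if e in T) / Z
     for e in range(len(edges))]

# each spanning tree has n - 1 edges, so the marginals must sum to n - 1
assert abs(sum(q) - (n - 1)) < 1e-9
```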
Note that, as a result of Claims 7.1 and 7.2, the difference between the highest and the lowest entry of γγγ is at least ετ/(4m). This can be used to prove the existence of a gap in the values of the entries of γγγ. The following claim establishes this result.
Claim 7.3. There exists ∅ ≠ Q ⊊ E such that, for all e ∈ E \ Q and f ∈ Q, γ_e + ετ/(4m²) < γ_f.

Proof: We construct Q as follows. We set threshold values Γ_i = −ετi/(4m²) for i ≥ 0, and define Q_i = {e ∈ E | γ_e ≥ Γ_i}. Let Q = Q_j, where j is the first index such that Q_j = Q_{j+1}. First, note that, due to the finiteness of E, such an index j always exists (and is at most ⌈−(4m²/(ετ)) min_{e∈E} γ_e⌉). Moreover, it is clear from the construction that the gap property is satisfied: Q_j = Q_{j+1} means that no edge has γ_e ∈ [Γ_{j+1}, Γ_j), so every e ∉ Q satisfies γ_e < Γ_{j+1} = Γ_j − ετ/(4m²) ≤ γ_f − ετ/(4m²) for any f ∈ Q. Also, Q is non-empty, since Claim 7.2 implies that f* ∈ Q_0 ⊆ Q_j = Q. Finally, by the pigeonhole principle, since we have m different edges, we know that j < m. Thus, for each e ∈ Q we have γ_e ≥ Γ_j > Γ_m = −ετ/(4m). This means that e* ∉ Q and, thus, Q ≠ E.
Observe that the subset Q corresponding to Claim 7.3 satisfies the hypothesis of Lemma 7.2 with ∆ = ετ/(4m²). Thus, for any T ∈ T_<, we have
\[
\gamma(T_{\max}) > \gamma(T) + \frac{\varepsilon\tau}{4m^2}, \tag{7.2}
\]
where T_max and r are as defined in Lemma 7.2.
Let Ḡ be the graph G/Q obtained by contracting all the edges in Q. So, Ḡ consists only of the edges not in Q (some of them can be self-loops). Let T̄ be the set of all spanning trees of Ḡ, and, for any given edge e ∉ Q, let
\[
\bar{q}_e := \frac{\sum_{\bar T \in \bar{\mathcal T},\, \bar T \ni e} \exp(\gamma(\bar T))}{\sum_{\bar T \in \bar{\mathcal T}} \exp(\gamma(\bar T))}
\]
be the probability that edge e is included in a random spanning tree T̄ of Ḡ, where each tree T̄ is chosen with probability proportional to exp(γ(T̄)). Since spanning trees of Ḡ have n − r − 1 edges, we have
\[
\sum_{e \in E \setminus Q} \bar{q}_e = n - r - 1. \tag{7.3}
\]
Note that z, as the input to Algorithm 2, is a point in the spanning tree polytope. Hence, z satisfies z(E) = n − 1 and z(Q) ≤ r (by the definition of r; see Lemma 7.2, part 1). As a result, we have that z(E \ Q) ≥ n − r − 1. Therefore, Eq. (7.3) implies that there must exist e′ ∉ Q such that q̄_{e′} ≤ z_{e′}.
Our final step is to show that, for any e ∉ Q, q_e < q̄_e + εz_min/2. Note that, once we establish this, we know that q_{e′} < q̄_{e′} + εz_min/2 ≤ (1 + ε/2)z_{e′}, and thus it must be the case that γ_{e′} = 0. But this contradicts the fact that e′ ∉ Q, as by construction all e with γ_e = 0 must be in Q. Thus, we obtain a contradiction that concludes the proof of the lemma.
It remains to prove that, for any e ∉ Q, q_e < q̄_e + εz_min/2. We have that
\[
q_e = \frac{\sum_{T \in \mathcal T:\, e \in T} e^{\gamma(T)}}{\sum_{T \in \mathcal T} e^{\gamma(T)}}
= \frac{\sum_{T \in \mathcal T_=:\, e \in T} e^{\gamma(T)} + \sum_{T \in \mathcal T_<:\, e \in T} e^{\gamma(T)}}{\sum_{T \in \mathcal T} e^{\gamma(T)}}
\le \frac{\sum_{T \in \mathcal T_=:\, e \in T} e^{\gamma(T)}}{\sum_{T \in \mathcal T_=} e^{\gamma(T)}}
+ \frac{\sum_{T \in \mathcal T_<:\, e \in T} e^{\gamma(T)}}{\sum_{T \in \mathcal T} e^{\gamma(T)}}
\le \frac{\sum_{T \in \mathcal T_=:\, e \in T} e^{\gamma(T)}}{\sum_{T \in \mathcal T_=} e^{\gamma(T)}}
+ \sum_{T \in \mathcal T_<:\, e \in T} \frac{e^{\gamma(T)}}{e^{\gamma(T_{\max})}}, \tag{7.4}
\]
where the first inequality follows from replacing T with T_= in the first denominator, and the second inequality follows from keeping only a single term (namely, the one corresponding to T_max) in the second denominator. Using (7.2) and the fact that the number of spanning trees is at most n^{n−2}, the second term is bounded by
\[
\sum_{T \in \mathcal T_<:\, e \in T} \frac{e^{\gamma(T)}}{e^{\gamma(T_{\max})}}
\le n^{n-2}\, e^{-\varepsilon\tau/4m^2}
< \frac{1}{2}\, n^{n}\, e^{-\varepsilon\tau/4m^2}
= \frac{\varepsilon z_{\min}}{2}, \tag{7.5}
\]
by the definition of τ. To handle the first term of (7.4), we can use part 2 of Lemma 7.2 and factorize
\[
\sum_{T \in \mathcal T_=} e^{\gamma(T)} = \Big( \sum_{\bar T \in \bar{\mathcal T}} e^{\gamma(\bar T)} \Big) \Big( \sum_{F \in \mathcal F} e^{\gamma(F)} \Big),
\]
where F is the set of all maximal spanning forests of (V, Q), each consisting of exactly r edges of Q. Similarly, we can write
\[
\sum_{T \in \mathcal T_=,\, T \ni e} e^{\gamma(T)} = \Big( \sum_{\bar T \in \bar{\mathcal T},\, \bar T \ni e} e^{\gamma(\bar T)} \Big) \Big( \sum_{F \in \mathcal F} e^{\gamma(F)} \Big).
\]
As a result, the first term of (7.4) reduces to
\[
\frac{\big( \sum_{\bar T \in \bar{\mathcal T},\, \bar T \ni e} e^{\gamma(\bar T)} \big) \big( \sum_{F \in \mathcal F} e^{\gamma(F)} \big)}{\big( \sum_{\bar T \in \bar{\mathcal T}} e^{\gamma(\bar T)} \big) \big( \sum_{F \in \mathcal F} e^{\gamma(F)} \big)}
= \frac{\sum_{\bar T \in \bar{\mathcal T},\, \bar T \ni e} e^{\gamma(\bar T)}}{\sum_{\bar T \in \bar{\mathcal T}} e^{\gamma(\bar T)}} = \bar{q}_e.
\]
Together with (7.4) and (7.5), this gives
\[
q_e < \bar{q}_e + \frac{\varepsilon z_{\min}}{2},
\]
which completes the proof.
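The exact cancellation behind (7.5), namely that the definition of τ makes (1/2)n^n e^{−ετ/(4m²)} equal to εz_min/2, can be spot-checked numerically. The sketch below is our own verification aid; the instance parameters are arbitrary.

```python
import math

def tau(eps, m, n, zmin):
    """tau := (4/eps) * m^2 * [n log n - log(eps * zmin)], as in the proof of Lemma 7.3."""
    return (4.0 / eps) * m * m * (n * math.log(n) - math.log(eps * zmin))

for eps, n, zmin in [(0.2, 6, 1e-3), (0.05, 8, 1e-5)]:
    m = n * (n - 1) // 2                  # number of edges of K_n
    lhs = 0.5 * n ** n * math.exp(-eps * tau(eps, m, n, zmin) / (4 * m * m))
    rhs = eps * zmin / 2
    assert abs(lhs - rhs) <= 1e-9 * rhs   # equal up to floating-point error
```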
To complete the analysis of the algorithm, we need to argue that each iteration can be implemented in polynomial time. First, for any given vector γγγ, if we can calculate the exact values of e^{γ_e} (see Remark 7.1 below for a discussion), then we can efficiently compute the sums ∑_T exp(γ(T)) and ∑_{T∋e} exp(γ(T)) for any edge e. This enables us to compute all the q_e(γγγ)'s. This can be done using Kirchhoff's matrix tree theorem (see [Bol02]), as discussed in Section 5.3 for sampling λ-random spanning trees, with λ_e defined to be exp(γ_e). Observe that we can bound all entries of the weighted Laplacian matrix in terms of the input size, since the proof of Lemma 7.3 actually shows that −ετ/(4|E|) ≤ γ_e ≤ 0 for all e ∈ E at any iteration of the algorithm. Therefore, we can compute these cofactors efficiently, in time polynomial in n, −log z_min, and 1/ε. Finally, δ can be computed efficiently from Lemma 7.1.
Remark 7.1. The terms e^{γ_e} cannot be calculated exactly due to irrationality issues. However, any of the terms e^{γ_e} can be calculated within a factor of (1 + ξ) of its actual value in time polynomial in log(1/ξ). Hence, all q_e(γγγ) values can be calculated within any given factor (1 + nξ) in time polynomial in log(1/ξ). Now, in order to prove Theorem 4 for a given value ε, it is enough to calculate all values of the q_e's with a relative error of ε/(3n) and run Algorithm 2 for ε/3. Since (1 + ε/3)² ≤ 1 + ε for ε ≤ 1/5, at the end of Algorithm 2 all the actual values of q_e(γγγ) will be bounded from above by the corresponding (1 + ε)z_e, as desired.
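The matrix-tree computation described above can be sketched in a few lines. The following is our own toy illustration (the function names tree_sum and marginal are ours): the weighted spanning-tree count is a cofactor of the weighted Laplacian, and q_e is obtained as λ_e times the ratio of the tree count of the graph with e contracted to that of the original graph. Exact rational arithmetic stands in for the (1 + ξ)-approximate evaluation of e^{γ_e} discussed in Remark 7.1.

```python
from fractions import Fraction

def det(M):
    """Exact determinant via Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n = len(M)
    d = Fraction(1)
    for i in range(n):
        piv = next((r for r in range(i, n) if M[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            M[i], M[piv] = M[piv], M[i]
            d = -d
        d *= M[i][i]
        inv = Fraction(1) / M[i][i]
        for r in range(i + 1, n):
            factor = M[r][i] * inv
            for c in range(i, n):
                M[r][c] -= factor * M[i][c]
    return d

def tree_sum(n, edges, lam):
    """Weighted spanning-tree count: any cofactor of the weighted Laplacian."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for (u, v), w in zip(edges, lam):
        L[u][u] += w; L[v][v] += w
        L[u][v] -= w; L[v][u] -= w
    minor = [row[1:] for row in L[1:]]      # delete row and column 0
    return det(minor)

def marginal(n, edges, lam, e):
    """q_e = lam_e * tree_sum(G/e) / tree_sum(G), via contracting edge e."""
    u, v = edges[e]
    cmap = [i if i != v else u for i in range(n)]          # merge v into u
    relabel = {x: i for i, x in enumerate(sorted(set(cmap)))}
    cedges, clam = [], []
    for j, (a, b) in enumerate(edges):
        if j == e:
            continue
        a2, b2 = relabel[cmap[a]], relabel[cmap[b]]
        if a2 != b2:                                       # drop self-loops
            cedges.append((a2, b2)); clam.append(lam[j])
    return lam[e] * tree_sum(n - 1, cedges, clam) / tree_sum(n, edges, lam)

edges = [(0, 1), (0, 2), (1, 2)]            # triangle
lam = [Fraction(1)] * 3                     # gamma = 0, so lam_e = 1
qs = [marginal(3, edges, lam, e) for e in range(3)]
assert qs == [Fraction(2, 3)] * 3           # 3 trees, each edge in 2 of them
```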
8. Solving the Maximum Entropy Convex Program via the Ellipsoid Method
In this section we design an algorithm to find a λ-random spanning tree distribution that preserves
the marginal probability of all the edges within multiplicative error of 1+ε using the Ellipsoid method
as an alternative to the combinatorial approach provided in Section 7. The following is our main
theorem.
Theorem 8.1. There is an algorithm that, for a given z in the relative interior of the spanning tree polytope of G = (V, E) and ε ≤ 1/8, outputs a vector γγγ such that, for λλλ = exp(γγγ), the λλλ-random spanning tree distribution μ satisfies
\[
\sum_{T \in \mathcal T:\, T \ni e} \Pr_{\mu}[T] \le (1 + \varepsilon) z_e, \quad \forall e \in E. \tag{8.1}
\]
Furthermore, the running time of the algorithm is polynomial in n = |V|, −log z_min, and log(1/ε).
Remark 8.1. Theorem 8.1 assumes that z is in the relative interior of the spanning tree polytope. However, if z is on a facet of the spanning tree polytope, then one can always find a point z̄ in the relative interior of the spanning tree polytope such that z̄_e ≤ (1 + ε/3)z_e for every edge e. Note that, for ε ≤ 1/5, we have (1 + ε/3)² ≤ 1 + ε. Hence, applying Theorem 8.1 to z̄ and ε/3 guarantees that (8.1) holds for z and ε.
Subsequent to [AGM+10], Singh and Vishnoi [SV14] generalized and improved the above theorem. They showed that, for any family M of discrete objects on a ground set E of elements, and any given marginal probability vector in the interior of the convex hull of the characteristic vectors corresponding to the objects of M, one can efficiently compute a product distribution that approximately preserves the marginal probabilities if and only if there is an efficient algorithm that approximately counts the weighted sum of all the objects of M for a given vector γγγ, i.e., an efficient algorithm that approximates ∑_{M∈M} exp(γ(M)). For example, since there is an efficient algorithm that approximately counts the weighted sum of all of the perfect matchings of a bipartite graph for a given γγγ [JSV04], one can approximately compute the maximum entropy distribution defined over the perfect matchings with respect to any given marginal vector in the relative interior of the perfect matching polytope (see [SV14] for more details).
First, we show that, to prove the preceding theorem, it is enough to design an algorithm that approximately solves the dual program (5.4).

Theorem 8.2. There is an algorithm that, for a given z in the relative interior of the spanning tree polytope of G = (V, E) and ε ≤ 1/8, outputs γγγ such that
\[
1 + \sum_{e \in E} z_e \gamma_e - \sum_{T \in \mathcal T} e^{\gamma(T)} \ge \mathrm{OPT}_{\mathrm{Ent}} - \varepsilon.
\]
Furthermore, the running time of the algorithm is polynomial in n and log(1/ε).
We can use the preceding theorem to prove Theorem 8.1.
Proof of Theorem 8.1: Using Theorem 8.2 with accuracy ε̃ (to be fixed later), we can find γ̃ such that, for any γγγ,
\[
\sum_{e \in E} z_e \tilde{\gamma}_e - \sum_{T \in \mathcal T} e^{\tilde{\gamma}(T)} \ge -\tilde{\varepsilon} + \sum_{e \in E} z_e \gamma_e - \sum_{T \in \mathcal T} e^{\gamma(T)}. \tag{8.2}
\]
To prove the theorem, it is enough to choose ε̃ such that
\[
\frac{\sum_{T:\, e \in T} e^{\tilde{\gamma}(T)}}{\sum_{T \in \mathcal T} e^{\tilde{\gamma}(T)}} \le z_e (1 + \varepsilon), \quad \forall e \in E. \tag{8.3}
\]
To prove the above inequality, we first lower bound the denominator (of the LHS) and then upper bound the numerator. Let δ > 0 be a parameter that we fix later. Firstly, for any e ∈ E, let γ_e = γ̃_e + δ/(n − 1). Then, by (8.2),
\[
(e^{\delta} - 1) \sum_{T \in \mathcal T} e^{\tilde{\gamma}(T)} = \sum_{T \in \mathcal T} \big( e^{\gamma(T)} - e^{\tilde{\gamma}(T)} \big) \ge -\tilde{\varepsilon} + \sum_{e \in E} z_e (\gamma_e - \tilde{\gamma}_e) = -\tilde{\varepsilon} + \sum_{e \in E} z_e \frac{\delta}{n-1} = \delta - \tilde{\varepsilon}. \tag{8.4}
\]
Secondly, fix an edge f ∈ E; let γ_e = γ̃_e for all e ≠ f and γ_f = γ̃_f − δ. Then, by (8.2),
\[
(1 - e^{-\delta}) \sum_{T:\, f \in T} e^{\tilde{\gamma}(T)} = \sum_{T:\, f \in T} \big( e^{\tilde{\gamma}(T)} - e^{\gamma(T)} \big) \le \tilde{\varepsilon} + \sum_{e \in E} z_e (\tilde{\gamma}_e - \gamma_e) = \tilde{\varepsilon} + \delta \cdot z_f. \tag{8.5}
\]
Therefore, to prove (8.3), it is enough to choose δ and ε̃ such that
\[
\frac{\delta - \tilde{\varepsilon}}{e^{\delta} - 1} \ge 1 - \varepsilon/4, \qquad \frac{\delta z_f + \tilde{\varepsilon}}{1 - e^{-\delta}} \le z_f (1 + \varepsilon/2).
\]
Letting δ = ε/8 and ε̃ = z_min δ², and using the fact that e^x ≤ 1 + x + x² for any −1 ≤ x ≤ 1, proves the above inequalities. It follows that
\[
\frac{\sum_{T:\, f \in T} e^{\tilde{\gamma}(T)}}{\sum_{T \in \mathcal T} e^{\tilde{\gamma}(T)}}
\le \frac{\;\dfrac{\delta z_f + \tilde{\varepsilon}}{1 - e^{-\delta}}\;}{\;\dfrac{\delta - \tilde{\varepsilon}}{e^{\delta} - 1}\;}
\le \frac{z_f (1 + \varepsilon/2)}{1 - \varepsilon/4} \le z_f (1 + \varepsilon).
\]
The first inequality follows from (8.4) and (8.5).
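The choice δ = ε/8 and ε̃ = z_min δ² made in the proof above can be sanity-checked numerically. The sketch below is our own verification aid (the function name is ours); it checks both sufficient conditions over a small grid of parameters. The second condition is hardest at z_f = z_min, since the coefficient of z_f in its rearranged form is negative, so checking it there suffices.

```python
import math

def conditions_hold(eps, zmin):
    """Check the two sufficient conditions from the proof of Theorem 8.1
    for delta = eps/8 and eps_tilde = zmin * delta**2."""
    delta = eps / 8.0
    epst = zmin * delta ** 2
    cond1 = (delta - epst) / math.expm1(delta) >= 1 - eps / 4
    # hardest case of the second condition: z_f = zmin
    zf = zmin
    cond2 = (delta * zf + epst) / (1 - math.exp(-delta)) <= zf * (1 + eps / 2)
    return cond1 and cond2

ok = all(conditions_hold(eps, zmin)
         for eps in (0.125, 0.05, 0.01)
         for zmin in (1e-6, 1e-3, 0.5))
assert ok
```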
In the rest of this section we prove Theorem 8.2. Consider a concave function g(·) that is differ-
entiable everywhere. We use ∇g to denote the gradient of g. One can use the Ellipsoid method to
(approximately) find the maximum of g(·) given a first-order oracle for g(·), i.e., a routine that for a
given vector γγγ returns g(γγγ) and ∇g(γγγ). The statement is easily derivable from [BN12, thm 8.1.2].
Theorem 8.3. There is an algorithm that, for any given ε > 0 and R > 0, returns a vector γ̃ such that
\[
g(\tilde{\gamma}) \ge \sup_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) - \varepsilon \Big( \sup_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) - \inf_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) \Big),
\]
where ‖γγγ‖_∞ is the largest coordinate, in absolute value, of the vector γγγ. Furthermore, the number of calls to the first-order oracle is upper bounded by a polynomial in the dimension of γγγ, log R, and log(1/ε).
We use the above theorem to prove Theorem 8.2. Let
\[
g(\gamma) = 1 + \sum_{e} \gamma_e z_e - \sum_{T \in \mathcal T} e^{\gamma(T)}.
\]
It is easy to see that g(·) is concave and differentiable everywhere. In addition, we can use the matrix tree theorem (see [Bol02]) to obtain a first-order oracle for g(·). To compute g(γγγ), it is enough to compute a cofactor of the Laplacian of G as defined in Section 5.3, where λ_e = exp(γ_e). If ‖γγγ‖_∞ < R, then all entries of the weighted Laplacian matrix are at most exp(R); therefore, we can compute a cofactor efficiently, in time polynomial in n and R. To compute entry e of ∇g(γγγ), i.e., ∂g(γγγ)/∂γ_e, it is enough to compute the function g(·) on the graph where e is contracted.
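Since ∂g/∂γ_e = z_e − ∑_{T∋e} e^{γ(T)}, the first-order oracle can be checked against finite differences on a toy instance. The following sketch is our own illustration: it evaluates g and ∇g on K4 by brute-force enumeration of the 16 spanning trees (whereas the oracle of the paper would use Laplacian cofactors) and compares the gradient to a central finite difference.

```python
import itertools
import math

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # K4
n = 4
z = [0.5] * 6        # uniform-spanning-tree marginals of K4; z(E) = 3 = n - 1

def spanning_trees():
    for combo in itertools.combinations(range(6), 3):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        ok = True
        for i in combo:
            ru, rv = find(edges[i][0]), find(edges[i][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield combo

TREES = list(spanning_trees())

def g(gamma):
    """g(gamma) = 1 + sum_e gamma_e z_e - sum_T e^{gamma(T)} (brute force)."""
    return (1 + sum(ge * ze for ge, ze in zip(gamma, z))
              - sum(math.exp(sum(gamma[i] for i in T)) for T in TREES))

def grad(gamma):
    """dg/dgamma_e = z_e - sum over trees containing e of e^{gamma(T)}."""
    return [z[e] - sum(math.exp(sum(gamma[i] for i in T)) for T in TREES if e in T)
            for e in range(6)]

gamma = [-0.4, 0.2, 0.0, -1.0, 0.3, -0.2]
h = 1e-5
for e in range(6):
    gp = gamma[:]; gp[e] += h
    gm = gamma[:]; gm[e] -= h
    fd = (g(gp) - g(gm)) / (2 * h)
    assert abs(fd - grad(gamma)[e]) < 1e-6   # analytic gradient matches
```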
Lemma 8.1. For any ε < 1, there exists R = O(n³ log n + n² log(1/ε)) such that
\[
\sup_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) \ge \mathrm{OPT}_{\mathrm{Ent}} - \varepsilon.
\]
Before proving the above lemma, let us use its result to prove Theorem 8.2.
Proof of Theorem 8.2: The proof simply follows by an application of Theorem 8.3. As mentioned above, the function g(·) is concave, everywhere differentiable, and has a first-order oracle. Choose R as in Lemma 8.1 such that
\[
\sup_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) \ge \mathrm{OPT}_{\mathrm{Ent}} - \varepsilon/2. \tag{8.6}
\]
Using Theorem 8.3 with R and accuracy ε̃, we get a point γ̃ where
\[
g(\tilde{\gamma}) \ge \sup_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) - \tilde{\varepsilon} \Big( \sup_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) - \inf_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) \Big)
\ge \mathrm{OPT}_{\mathrm{Ent}} - \frac{\varepsilon}{2} - \tilde{\varepsilon} \Big( \sup_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) - \inf_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) \Big). \tag{8.7}
\]
The second inequality follows from (8.6). For any γγγ with ‖γγγ‖_∞ ≤ R, we have |g(γγγ)| ≤ 1 + nR + |T|e^{nR}. Since any graph with n vertices has at most n^{n−2} spanning trees ([Cay89]), |T| ≤ n^{n−2}, so (using e ≤ n)
\[
\sup_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) - \inf_{\gamma:\, \|\gamma\|_\infty \le R} g(\gamma) \le 2\big( 1 + nR + n^{\,n + nR} \big).
\]
Letting ε̃ = (ε/2)/(2(1 + nR + n^{n+nR})), by (8.7) we get g(γ̃) ≥ OPT_Ent − ε. The theorem follows from the fact that R is polynomial in n and log(1/ε), and that the number of oracle calls of the algorithm of Theorem 8.3 is polynomial in log(1/ε̃) and log R. Furthermore, each oracle call takes time polynomial in n and R.
The last, and perhaps the most technical part of the proof is the proof of Lemma 8.1. We refer the
interested readers to [SV14] for a generalization of Lemma 8.1 which is one of the major contributions
of [SV14].
Proof of Lemma 8.1: Let us start by defining the concept of a "gap". We say that a vector γγγ has a "gap at an edge f ∈ E" if, for every e ∈ E, γ_e ∉ (γ_f, γ_f + gap], where gap is a parameter that we fix later. Observe that, for any γγγ, the number of gaps of γγγ is at most |E| (at most one gap per edge), which is at most $\binom{n}{2}$.
In the following claim we show that, if γγγ has a gap at an edge, then we can construct another vector γ̄ with fewer gaps such that g(γ̄) ≈ g(γγγ). The proof is very similar in nature to the proof of Lemma 7.3, but, for the sake of completeness, we provide it here.

Claim 8.1. For any γγγ with at least one gap, there exists γ̄ with at least one fewer gap such that ∑_e z_e γ̄_e ≥ ∑_e z_e γ_e and
\[
\sum_{T \in \mathcal T} e^{\bar{\gamma}(T)} \le \big( 1 + |\mathcal T|\, e^{-\mathrm{gap}} \big) \sum_{T \in \mathcal T} e^{\gamma(T)}. \tag{8.8}
\]
Proof of Claim 8.1: Suppose that γγγ has a gap at an edge f ∈ E. Let F := {e ∈ E : γ_e > γ_f}, and let k = rank(F) be the size of a maximum spanning forest of F. For any edge e ∈ E, let
\[
\bar{\gamma}_e =
\begin{cases}
\gamma_e + \dfrac{k\Delta}{n-1} & \text{if } e \notin F, \\[4pt]
\gamma_e - \Delta + \dfrac{k\Delta}{n-1} & \text{if } e \in F,
\end{cases}
\]
where ∆ = min_{e∈F} γ_e − γ_f − gap. Note that, by the assumption of the claim, we have ∆ > 0. By the definition of γ̄, it has fewer gaps than γγγ.
Recall that any spanning tree of G has at most k edges from F, so z(F) ≤ k. Using this, we get
\[
\sum_{e \in E} z_e \bar{\gamma}_e = \sum_{e \in E} z_e \gamma_e + \frac{k\Delta}{n-1} \sum_{e \in E} z_e - z(F)\Delta \ge \sum_{e} z_e \gamma_e + k\Delta - k\Delta = \sum_{e} z_e \gamma_e.
\]
It remains to prove (8.8). First, observe that, if a spanning tree T has exactly k edges from F, then γ̄(T) = γ(T), and exp(γ̄(T)) = exp(γ(T)). We will show that the contribution of the rest of the trees to the LHS of (8.8) is negligible.

Let T_max be the maximum weight spanning tree of (V, E, γγγ). By Lemma 7.2, T_max has exactly k edges of F. Since γ̄(T) = γ(T) for any tree with |T ∩ F| = k, T_max is also a maximum weight spanning tree of (V, E, γ̄). Consider a spanning tree T with |T ∩ F| < k. It follows from the matroid exchange property (see, e.g., [CVZ10]) that there are edges e₁ ∈ (T_max ∩ F) \ T and e₂ ∈ T \ F such that T′ = T ∪ {e₁} \ {e₂} is a spanning tree. By the definition of γ̄ (using that γ_{e₁} ≥ γ_f + gap + ∆ while γ_{e₂} ≤ γ_f),
\[
\bar{\gamma}(T_{\max}) = \gamma(T_{\max}) \ge \bar{\gamma}(T') = \bar{\gamma}(T) - \bar{\gamma}_{e_2} + \bar{\gamma}_{e_1} \ge \bar{\gamma}(T) + \mathrm{gap}. \tag{8.9}
\]
Therefore,
\[
\sum_{T \in \mathcal T} e^{\bar{\gamma}(T)}
= \sum_{T:\, |F \cap T| = k} e^{\bar{\gamma}(T)} + \sum_{T:\, |F \cap T| < k} e^{\bar{\gamma}(T)}
\le \sum_{T:\, |F \cap T| = k} e^{\gamma(T)} + \sum_{T:\, |F \cap T| < k} e^{\bar{\gamma}(T_{\max}) - \mathrm{gap}}
\le \big( 1 + |\mathcal T|\, e^{-\mathrm{gap}} \big) \sum_{T \in \mathcal T} e^{\gamma(T)},
\]
which implies (8.8). This completes the proof of Claim 8.1.
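The construction of Claim 8.1 can be exercised on a small instance. The sketch below is our own illustration: on K4 it builds a vector γγγ with a gap, forms γ̄ exactly as in the proof, and checks both guarantees of the claim by brute-force tree enumeration. All names and the specific numbers are ours.

```python
import itertools
import math

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # K4
n = 4
gap = 1.0
gamma = [0.0, 0.0, 0.0, 2.0, 2.5, 0.0]   # has a gap at f = edge 0 (gamma_f = 0)
z = [0.5] * 6                            # a point in the spanning tree polytope of K4

f = 0
F = [e for e in range(6) if gamma[e] > gamma[f]]           # here F = {3, 4}

def rank(edge_idxs):
    """Size of a maximum spanning forest of the given edge set (union-find)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    r = 0
    for i in edge_idxs:
        ru, rv = find(edges[i][0]), find(edges[i][1])
        if ru != rv:
            parent[ru] = rv
            r += 1
    return r

k = rank(F)
delta = min(gamma[e] for e in F) - gamma[f] - gap
assert delta > 0                                           # the gap assumption

gbar = [gamma[e] + k * delta / (n - 1) - (delta if e in F else 0.0)
        for e in range(6)]

def spanning_trees():
    for combo in itertools.combinations(range(6), 3):
        if rank(combo) == 3:                               # acyclic and spanning
            yield combo

TREES = list(spanning_trees())

def tree_sum(gv):
    return sum(math.exp(sum(gv[i] for i in T)) for T in TREES)

# the two guarantees of Claim 8.1
assert (sum(ze * ge for ze, ge in zip(z, gbar))
        >= sum(ze * ge for ze, ge in zip(z, gamma)) - 1e-12)
assert tree_sum(gbar) <= (1 + len(TREES) * math.exp(-gap)) * tree_sum(gamma)
```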
Let γγγ* be an optimum solution of (5.4). If γγγ* does not have any gap, we let γ̄ = γγγ* and we are done. Otherwise, we repeatedly apply Claim 8.1 to remove all of the gaps, and we get a vector γ̄ such that ∑_e z_e γ̄_e ≥ ∑_e z_e γ*_e and
\[
\sum_{T \in \mathcal T} e^{\bar{\gamma}(T)}
\le \big( 1 + |\mathcal T|\, e^{-\mathrm{gap}} \big)^{|E|} \sum_{T \in \mathcal T} e^{\gamma^*(T)}
= \big( 1 + |\mathcal T|\, e^{-\mathrm{gap}} \big)^{|E|}
\le 1 + n^n e^{-\mathrm{gap}} \le 1 + \varepsilon, \tag{8.10}
\]
where the first inequality follows from the fact that γγγ* has at most |E| gaps, the equality follows from the fact that γγγ* is the optimal solution of (5.4) (and hence ∑_{T∈T} e^{γ*(T)} = 1), the second inequality follows from the fact that any graph on n vertices has at most n^{n−2} spanning trees [Cay89], and the last inequality follows by letting gap = log(1/ε) + n log n. Therefore,
\[
g(\bar{\gamma}) \ge g(\gamma^*) - \varepsilon = \mathrm{OPT}_{\mathrm{Ent}} - \varepsilon.
\]
So, to prove the lemma, it is enough to show that ‖γ̄‖_∞ ≤ R. Since γ̄ does not have any gap,
\[
\max_e \bar{\gamma}_e - \min_e \bar{\gamma}_e \le |E| \cdot \mathrm{gap}.
\]
So, it is sufficient to lower bound max_e γ̄_e and upper bound min_e γ̄_e. Firstly, since OPT_Ent ≥ log(1/|T|) ≥ −log(n^{n−2}) ≥ −n log n, we have
\[
-n \log n \le \mathrm{OPT}_{\mathrm{Ent}} = \sum_{e \in E} z_e \gamma^*_e \le \sum_{e \in E} z_e \bar{\gamma}_e \le n \cdot \max_{e \in E} \bar{\gamma}_e.
\]
On the other hand, by (8.10), e^{γ̄(T)} ≤ 2 for any tree T, so min_e γ̄_e ≤ 1. Therefore, ‖γ̄‖_∞ ≤ n log n + |E| · gap, as desired. This completes the proof of Lemma 8.1.
9. Concluding Remarks
In this paper, we provided the first asymptotic improvement over the long-standing Θ(logn)-
approximation ratio by Frieze, Galbiati and Maffioli [FGM82] for the asymmetric traveling salesman
problem (ATSP). Our main ideas are the concepts of “thin spanning trees” and “maximum entropy
rounding”.
Using ideas from circulation networks, we showed that finding thin spanning trees leads to good approximation algorithms for ATSP. We believe that finding such thin trees could be the key to improving the approximation factor for ATSP even further. We conjecture that constantly thin spanning trees exist and that, consequently, a constant-factor approximation algorithm for the ATSP can be found.
We also developed a new randomized rounding technique called maximum entropy rounding to
produce thin trees. We used this method to round a fractional spanning tree (derived from the
Held-Karp relaxation of the ATSP) to an integral one such that the linear quantities defined on
the tree remain approximately the same. The main intuition behind our approach is that among all
marginal-preserving and structure-preserving ways of performing randomized rounding, maximum
entropy rounding tries to lower the dependencies among the variables as much as possible. This
helped us to derive strong concentration bounds on values of cuts in the rounded solution.
As we discussed before, we believe that this method could be useful in other settings and for other combinatorial structures as well, and we expect to see further applications in the future. In particular, after the publication of [AGM+10], the maximum entropy rounding approach was used to obtain an improved approximation algorithm for the graphic variant of the symmetric TSP, where [OSS11] provided a (3/2 − ε₀)-approximation algorithm for some positive constant ε₀.
Also, [SV14] recently showed that maximum entropy distributions over a discrete set of objects
subject to observed marginals can be computed, as long as there exists an efficient algorithm for
counting the underlying discrete set. Prime examples of such sets are spanning trees (as discussed in
our paper) and perfect matchings.
We note that the main property of the maximum entropy rounding over spanning trees that we
used in our algorithm was the fact that it imposed negative correlation over the edges. In [CVZ10],
another way of producing negatively correlated marginal-preserving probability distributions over
spanning trees (or, more generally, on matroid bases) has been proposed. This approach, which is based on the idea of swap rounding, can also be used in the framework developed in our paper.
Finally, there have been recent developments on improving the integrality gap of the Held-Karp relaxation for the asymmetric traveling salesman problem. In particular, [Sve15] provides a polynomial-time constant-factor approximation algorithm for the problem when restricted to shortest path metrics of node-weighted digraphs. Also, [AO15] shows that the integrality gap of the Held-Karp relaxation is, in general, bounded by (log log n)^{O(1)}. However, the algorithm of [AO15] is based on solving an exponential-size convex program that is not known to be solvable in polynomial time. Therefore, improving the O(log n/ log log n)-approximation factor of our paper still remains an open question.
Acknowledgments
The authors would like to thank the associate editor of the journal and the three anonymous referees for their
highly valuable comments. The work of the second and the third author was supported by NSF contract
CCF-0829878 and by ONR grant N00014-05-1-0148.
References

[ABCC07] David L. Applegate, Robert E. Bixby, Vašek Chvátal, and William J. Cook. The Traveling Salesman Problem: A Computational Study. Princeton University Press, 2007.
[AGM+10] Arash Asadpour, Michel X. Goemans, Aleksander Mądry, Shayan Oveis Gharan, and Amin Saberi. An O(log n/ log log n)-approximation algorithm for the asymmetric traveling salesman problem. In ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 379–389, 2010.
[Ald90] David J. Aldous. A random walk construction of uniform spanning trees and uniform labelled trees. SIAM Journal on Discrete Mathematics, 3(4):450–465, 1990.
[AO15] Nima Anari and Shayan Oveis Gharan. Effective-resistance-reducing flows and asymmetric TSP. In IEEE Annual Symposium on Foundations of Computer Science, FOCS, pages 20–39, 2015.
[AS10] Arash Asadpour and Amin Saberi. An approximation algorithm for max-min fair allocation of indivisible goods. SIAM Journal on Computing, 39(7):2970–2989, 2010.
[Bla02] Markus Bläser. A new approximation algorithm for the asymmetric TSP with triangle inequality. In ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 638–645, 2002.
[BN12] Aharon Ben-Tal and Arkadi Nemirovski. Optimization I–II: Convex Analysis, Nonlinear Programming Theory, Nonlinear Programming Algorithms. Lecture notes, 2012.
[Bol02] Béla Bollobás. Modern Graph Theory. Springer-Verlag, 2002.
[Bro89] Andrei Z. Broder. Generating random spanning trees. In IEEE Symposium on Foundations of Computer Science, FOCS, pages 442–447, 1989.
[BTN01] Aharon Ben-Tal and Arkadi Semenovich Nemirovski. Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2001.
[BV06] Stephen P. Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2006.
[Cay89] Arthur Cayley. A theorem on trees. Quarterly Journal of Mathematics, 23:376–378, 1889.
[CGK06] Moses Charikar, Michel X. Goemans, and Howard J. Karloff. On the integrality ratio for the asymmetric traveling salesman problem. Mathematics of Operations Research, 31:245–252, 2006.
[Chr76] Nicos Christofides. Worst case analysis of a new heuristic for the traveling salesman problem. Report 388, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh, PA, 1976.
[CMN96] Charles J. Colbourn, Wendy J. Myrvold, and Eugene Neufeld. Two algorithms for unranking arborescences. Journal of Algorithms, 20(2):268–281, 1996.
[Coo12] William J. Cook. In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation. Princeton University Press, 2012.
[CT06] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing). Wiley-Interscience, 2006.
[CVZ10] Chandra Chekuri, Jan Vondrák, and Rico Zenklusen. Dependent randomized rounding via exchange properties of combinatorial structures. In IEEE Symposium on Foundations of Computer Science, FOCS, pages 575–584, 2010.
[Edm71] Jack Edmonds. Matroids and the greedy algorithm. Mathematical Programming, pages 127–136, 1971.
[ES13] Jeff Erickson and Anastasios Sidiropoulos. A near-optimal approximation algorithm for asymmetric TSP on embedded graphs. CoRR, abs/1304.1810, 2013.
[FGM82] Alan M. Frieze, Giulia Galbiati, and Francesco Maffioli. On the worst-case performance of some algorithms for the asymmetric traveling salesman problem. Networks, 12:23–39, 1982.
[FS07] Uriel Feige and Mohit Singh. Improved approximation ratios for traveling salesman tours and paths in directed graphs. In APPROX, pages 104–118, 2007.
[GBS08] Arpita Ghosh, Stephen Boyd, and Amin Saberi. Minimizing effective resistance of a graph. SIAM Review, 50(1):37–66, 2008.
[GHJS09] Michel X. Goemans, Nicholas J. A. Harvey, Kamal Jain, and Mohit Singh. A randomized rounding algorithm for the asymmetric traveling salesman problem. CoRR, abs/0909.0941, 2009.
[GLS93] Martin Grötschel, László Lovász, and Alexander Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2 of Algorithms and Combinatorics. Springer, 1993.
[God] Luis A. Goddyn. Some open problems I like. Available at http://www.math.sfu.ca/~goddyn/Problems/problems.html.
[Goe06] Michel X. Goemans. Minimum bounded degree spanning trees. In IEEE Symposium on Foundations of Computer Science, FOCS, pages 273–282, 2006.
[Gue83] Alain Guénoche. Random spanning tree. Journal of Algorithms, 4:214–220, 1983.
[Had93] Jacques Hadamard. Résolution d'une question relative aux déterminants. Bulletin des Sciences Mathématiques, 17:30–31, 1893.
[HK70] Michael Held and Richard M. Karp. The traveling salesman problem and minimum spanning trees. Operations Research, 18:1138–1162, 1970.
[HO14] Nicholas J. A. Harvey and Neil Olver. Pipage rounding, pessimistic estimators and matrix concentration. In ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 926–945, 2014.
[Jae84] François Jaeger. On circular flows in graphs. In Finite and Infinite Sets, Colloquia Mathematica Societatis János Bolyai 37, pages 391–402, 1984.
[JSV04] Mark Jerrum, Alistair Sinclair, and Eric Vigoda. A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries. Journal of the ACM, 51(4):671–697, 2004.
[Kar93] David R. Karger. Global min-cuts in RNC, and other ramifications of a simple min-cut algorithm. In ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 21–30, 1993.
[Kir47] Gustav Kirchhoff. Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Verteilung galvanischer Ströme geführt wird. Annalen der Physik und Chemie, 72:497–508, 1847.
[KLSS05] Haim Kaplan, Moshe Lewenstein, Nira Shafrir, and Maxim Sviridenko. Approximation algorithms for asymmetric TSP by decomposing directed regular multigraphs. Journal of the ACM, pages 602–626, 2005.
[KM09] Jonathan Kelner and Aleksander Mądry. Faster generation of random spanning trees. In IEEE Symposium on Foundations of Computer Science, FOCS, pages 13–21, 2009.
[Kul90] Vidyadhar G. Kulkarni. Generating random combinatorial objects. Journal of Algorithms, 11:185–207, 1990.
[LK74] Jan K. Lenstra and Alexander H. G. Rinnooy Kan. Some simple applications of the Traveling Salesman Problem. Operations Research Quarterly, 26:717–733, 1974.
[LP14] Russell Lyons and Yuval Peres. Probability on Trees and Networks. Cambridge University Press, 2014. In preparation; current version available at http://mypage.iu.edu/~rdlyons/.
[LSKL85] Eugene L. Lawler, David B. Shmoys, Alexander H. G. Rinnooy Kan, and Jan K. Lenstra. The Traveling Salesman Problem. John Wiley & Sons, 1985.
[OS11] Shayan Oveis Gharan and Amin Saberi. The asymmetric traveling salesman problem on graphs with bounded genus. In ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 967–975, 2011.
[OSS11] Shayan Oveis Gharan, Amin Saberi, and Mohit Singh. A randomized rounding approach to the traveling salesman problem. In IEEE Symposium on Foundations of Computer Science, FOCS, pages 267–276, 2011.
[PS97] Alessandro Panconesi and Aravind Srinivasan. Randomized distributed edge coloring via an extension of the Chernoff-Hoeffding bounds. SIAM Journal on Computing, 26:350–368, 1997.
[PTW01] Pavel A. Pevzner, Haixu Tang, and Michael S. Waterman. An Eulerian path approach to DNA fragment assembly. Proceedings of the National Academy of Sciences, 98(17):9748–9753, 2001.
[RT87] Prabhakar Raghavan and C. D. Thompson. Randomized rounding: A technique for provably good algorithms and algorithmic proofs. Combinatorica, 7(4):365–374, 1987.
[Sch03] Alexander Schrijver. Combinatorial Optimization, volume 2 of Algorithms and Combinatorics. Springer, 2003.
[SV14] Mohit Singh and Nisheeth K. Vishnoi. Entropy, optimization and counting. In Symposium on Theory of Computing, STOC, pages 50–59, 2014.
[Sve15] Ola Svensson. Approximating ATSP by relaxing connectivity. In IEEE Annual Symposium on Foundations of Computer Science, FOCS, pages 1–19, 2015.
[Tho12] Carsten Thomassen. The weak 3-flow conjecture and the weak circular flow conjecture. Journal of Combinatorial Theory, Series B, 102(2):521–529, 2012.
[VY99] Santosh Vempala and Mihalis Yannakakis. A convex relaxation for the asymmetric TSP. In ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 975–976, 1999.
[Wil96] David Bruce Wilson. Generating random spanning trees more quickly than the cover time. In ACM Symposium on Theory of Computing, STOC, pages 296–303, 1996.
Appendix. Omitted Proofs
Proof of Lemma 3.1
From Edmonds’ characterization of the base polytope of a matroid [Edm71], it follows that the
spanning tree polytope P is defined by the following inequalities (see [Sch03, Corollary 50.7c]):
P = { z ∈ R^E :
  z(E) = |V| − 1,  (.1)
  z(E(U)) ≤ |U| − 1  ∀ U ⊂ V, U ≠ ∅,  (.2)
  z_e ≥ 0  ∀ e ∈ E }.  (.3)

The relative interior of P corresponds to those z ∈ P satisfying all inequalities (.2) and (.3) strictly. Clearly, z* satisfies (.1), since
\[
x^*(\delta^+(v)) = 1 \;\; \forall v \in V \;\Rightarrow\; x^*(A) = n = |V| \;\Rightarrow\; z^*(E) = n - 1 = |V| - 1.
\]
Consider any non-empty set U ⊂ V. We have
\[
\sum_{v \in U} x^*(\delta^+(v)) = |U| = x^*(A(U)) + x^*(\delta^+(U)) \ge x^*(A(U)) + 1.
\]
Since x* satisfies (3.2) and (3.3), we have
\[
z^*(E(U)) = \frac{n-1}{n}\, x^*(A(U)) < x^*(A(U)) \le |U| - 1,
\]
showing that z* satisfies (.2) strictly. Since E is the support of z*, (.3) is also satisfied strictly by z*. This shows that z* is in the relative interior of P.
Proof of Lemma 6.1
Note that, by definition, for all edges e ∈ E, z̄_e ≤ (1 + ε)z_e, where ε = 0.2 is the desired accuracy of the approximation of z by z̄, as in Theorem 5.2. Hence,
\[
\mathbb{E}\big[ |T \cap \delta(U)| \big] = \bar{z}(\delta(U)) \le (1 + \varepsilon)\, z(\delta(U)).
\]
Applying Theorem 5.4 with
\[
1 + \delta = \frac{\beta\, z(\delta(U))}{\bar{z}(\delta(U))} \ge \frac{\beta}{1 + \varepsilon},
\]
we derive that Pr[|T ∩ δ(U)| > βz(δ(U))] can be bounded from above by
\[
\Pr\big[ |T \cap \delta(U)| > (1 + \delta)\, \mathbb{E}[|T \cap \delta(U)|] \big]
\le \left( \frac{e^{\delta}}{(1+\delta)^{1+\delta}} \right)^{\bar{z}(\delta(U))}
\le \left( \frac{e}{1+\delta} \right)^{(1+\delta)\bar{z}(\delta(U))}
= \left( \frac{e}{1+\delta} \right)^{\beta z(\delta(U))}
\le \left[ \left( \frac{e(1+\varepsilon)}{\beta} \right)^{\beta} \right]^{z(\delta(U))}
\le n^{-2.5\, z(\delta(U))}.
\]
Note that, in the last inequality, we have used that
\[
\log \left[ \left( \frac{e(1+\varepsilon)}{\beta} \right)^{\beta} \right]
= \frac{4 \log n}{\log\log n} \big[ 1 + \log(1+\varepsilon) - \log 4 - \log\log n + \log\log\log n \big]
\le -4 \log n \left( 1 - \frac{\log\log\log n}{\log\log n} \right)
\le -4 \left( 1 - \frac{1}{e} \right) \log n \le -2.5 \log n,
\]
since e(1 + ε) < 4 and (log log log n)/(log log n) ≤ 1/e for all n ≥ 5.
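The final logarithmic estimate above, with β = 4 log n/ log log n and ε = 0.2, can be checked numerically over a wide range of n. The sketch below is our own verification aid (function names are ours).

```python
import math

def beta(n):
    """beta = 4 log n / log log n, as in the thinness parameter of Lemma 6.1."""
    return 4 * math.log(n) / math.log(math.log(n))

def lhs(n, eps=0.2):
    """log[(e(1+eps)/beta)^beta] = beta * (1 + log(1+eps) - log(beta))."""
    b = beta(n)
    return b * (1 + math.log(1 + eps) - math.log(b))

# the proof claims this is at most -2.5 log n for all relevant n
assert all(lhs(n) <= -2.5 * math.log(n)
           for n in [5, 10, 100, 10**4, 10**6, 10**9, 10**12, 10**15])
```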
Proof of Lemma 7.1
Let us consider some f ∈ E \ {e}. We have
\[
q_f(\gamma') = \frac{\sum_{T \in \mathcal T:\, f \in T} e^{\gamma'(T)}}{\sum_{T \in \mathcal T} e^{\gamma'(T)}}
= \frac{\sum_{T:\, e \in T,\, f \in T} e^{\gamma'(T)} + \sum_{T:\, e \notin T,\, f \in T} e^{\gamma'(T)}}{\sum_{T \ni e} e^{\gamma'(T)} + \sum_{T:\, e \notin T} e^{\gamma'(T)}}
= \frac{e^{-\delta} \sum_{T:\, e \in T,\, f \in T} e^{\gamma(T)} + \sum_{T:\, e \notin T,\, f \in T} e^{\gamma(T)}}{e^{-\delta} \sum_{T \ni e} e^{\gamma(T)} + \sum_{T:\, e \notin T} e^{\gamma(T)}}
= \frac{e^{-\delta} a + b}{e^{-\delta} c + d},
\]
with a, b, c, d appropriately defined. On the other hand, q_f(γγγ) = (a + b)/(c + d) by the definition (see Eq. (7.1)). But, for general a, b, c, d ≥ 0, if a/c ≤ (a + b)/(c + d), then (xa + b)/(xc + d) ≥ (a + b)/(c + d) for x ≤ 1. Since
\[
\frac{a}{c} = \frac{\sum_{T \in \mathcal T:\, e \in T,\, f \in T} e^{\gamma(T)}}{\sum_{T \in \mathcal T:\, e \in T} e^{\gamma(T)}} \le q_f(\gamma) = \frac{a + b}{c + d}
\]
by negative correlation (since a/c represents the conditional probability that f is present given that e is present), we get that q_f(γγγ′) ≥ q_f(γγγ) for δ ≥ 0.
Now, we proceed to deriving the equation for δ. By the definition of q_e(γγγ), we have
\[
\frac{1}{q_e(\gamma)} - 1 = \frac{\sum_{T:\, e \notin T} e^{\gamma(T)}}{\sum_{T \ni e} e^{\gamma(T)}}.
\]
Hence,
\[
\frac{1}{q_e(\gamma')} - 1 = \frac{\sum_{T:\, e \notin T} e^{\gamma'(T)}}{\sum_{T \ni e} e^{\gamma'(T)}}
= \frac{\sum_{T:\, e \notin T} e^{\gamma(T)}}{e^{-\delta} \sum_{T \ni e} e^{\gamma(T)}}
= e^{\delta} \left( \frac{1}{q_e(\gamma)} - 1 \right),
\]
which concludes the proof.
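Both conclusions of Lemma 7.1 can be observed on a small instance. The sketch below is our own illustration: it computes the marginals on K4 by brute-force tree enumeration, lowers γ_e for a single edge, and checks that the other marginals do not decrease while the marginal of e obeys the closed-form update derived above.

```python
import itertools
import math

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # K4
n = 4

def spanning_trees():
    for combo in itertools.combinations(range(6), 3):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        ok = True
        for i in combo:
            ru, rv = find(edges[i][0]), find(edges[i][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield combo

TREES = list(spanning_trees())

def marginals(gamma):
    Z = sum(math.exp(sum(gamma[i] for i in T)) for T in TREES)
    return [sum(math.exp(sum(gamma[i] for i in T)) for T in TREES if e in T) / Z
            for e in range(6)]

gamma = [0.4, -0.3, 0.1, 0.0, 0.7, -0.5]
e, delta = 2, 0.8
gp = gamma[:]
gp[e] -= delta                       # gamma' differs from gamma only at edge e

q, qp = marginals(gamma), marginals(gp)

# monotonicity: lowering gamma_e can only raise the other marginals
assert all(qp[f] >= q[f] - 1e-12 for f in range(6) if f != e)

# the closed-form update for edge e itself
assert abs((1 / qp[e] - 1) - math.exp(delta) * (1 / q[e] - 1)) < 1e-9
```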