approximation algorithm

1

Approximation Algorithm

Instructor: YE, [email protected]

2

Dealing with Hard Problems

What to do if: Divide and conquer Dynamic programmingGreedyLinear Programming/Network Flows…

does not give a polynomial time algorithm?

3


Solution I: Ignore the problemCan’t do it ! There are thousands of problems for which we do not know polynomial time algorithms

For example: Traveling Salesman Problem (TSP) Set Cover

4

Traveling Salesman Problem

Traveling SalesmanProblem (TSP)

Input: undirected graph with lengths on edges Output: shortest cycle that visits each vertex exactly once

Best known algorithm: O(n 2n) time.

5

The vertex-cover problem

A vertex cover of an undirected graph G = (V, E) is a subset V ' ⊆ V such that if (u, v) ∈ E, then u ∈ V ' or v ∈ V ' (or both). A vertex cover for G is a set of vertices that covers all the edges in E.As a decision problem, we define

VERTEX-COVER = { 〈 G, k 〉 : graph G has a vertex cover of size k}.

Best known algorithm: O(kn + 1.274k)

6


Exponential time algorithms for small inputs. E.g., (100/99)n time is not bad for n < 1000. Polynomial time algorithms for some (e.g., average-case) inputs Polynomial time algorithms for all inputs, but which return approximate solutions

7

Approximation Algorithms

An algorithm A is ρ-approximate, if, on any inputof size n:

The cost CA of the solution produced by the algorithm, and The cost COPT of the optimal solution are such that CA ≤ ρ COPT

We will see: 2-approximation algorithm for TSP in the plane 2-approximation algorithm for Vertex Cover

8

Comments on Approximation

“CA ≤ ρ COPT ” makes sense only for minimization problems For maximization problems, replace by

COPT ≤ ρ CA

Additive approximation “CA ≤ ρ + COPT “ also makes sense, although difficult to achieve

9

The Vertex-cover problem

10

The vertex-cover problem

A vertex cover of an undirected graph G = (V, E)

is a subset V' ⊆ V such that if (u, v) ∈ E, then u ∈ V' or v ∈ V' (or both). A vertex cover for G is a set of vertices that covers all the edges in E.The goal is to find a vertex cover of minimum size in a given undirected graph G.

11

Naive Algorithm

APPROX-VERTEX-COVER(G) 1 C ← Ø 2 E′ ← E[G] 3 while E′ ≠ Ø 4 do let (u, v) be an arbitrary edge of E′ 5 C ← C {∪ u, v} 6 remove from E′ every edge incident on either u or v 7 return C

12

Illustration of Naive algorithm

Input Edge bc is chosenSet C = {b, c}

Edge ef is chosen

Optimal solution{b, e, d}

Naive algorithmC={b,c,d,e,f,g}

13

Approximation 2Theorem. APPROX-VERTEX-COVER is a 2-approximation algorithm.Pf. let A denote the set of edges that were picked by APPROX-VERTEX-COVER.To cover the edges in A, any vertex cover, in particular, an optimal cover C* must include at least one endpoint of each edge in A. No two edges in A share an endpoint.Thus no two edges in A are covered by the same vertex from C*, and we have the lower bound

C* ≥ |A|On the other hand, the algorithm picks an edge for which neither of its endpoints is already in C.

|C| = 2|A|Hence, |C| = 2|A| ≤ 2|C*|.

14

Vertex cover: summary

No better constant-factor approximation is known!!More precisely, minimum vertex cover is known to be approximable within (for a given |V|≥2) (ADM85)

but cannot be approximated within 7/6 (Hastad STOC97) for any sufficiently large vertex degree, Dinur Safra (STOC02)1.36067

log log | |22log | |

VV

15

Vertex cover: summary

Eran Halperin, Improved Approximation Algorithms for the Vertex Cover Problem in Graphs and Hypergraphs, SIAM Journal on Computing, 31/5 (2002): 1608 - 1623 .Tomokazu Imamura , Kazuo Iwama, Approximating vertex cover on dense graphs, Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms 2005

http://portal.acm.org/author_page.cfm?id=81100622064&coll=GUIDE&dl=GUIDE&trk=0&CFID=68633874&CFTOKEN=86197831

http://portal.acm.org/citation.cfm?id=1070513&dl=GUIDE&coll=GUIDE&CFID=68633874&CFTOKEN=86197831




16

The Traveling Salesman Problem

Traveling SalesmanProblem (TSP)

Input: undirected graph G = (V, E) with edges cost c(u, v) associated with each edge (u, v)

∈ E Output: shortest cycle that visits each vertex exactly onceTriangle inequality if for all vertices u, v, w ∈ V,c(u, w) ≤ c(u, v) + c(v, w).

u

v

w

17

2-approximation for TSP with triangle inequality

Compute MST T An edge between any pair of pointsWeight = distance between endpoints

Compute a tree-walk W of T Each edge visited twice Convert W into a cycle H using shortcuts

18

Algorithm

APPROX-TSP-TOUR(G, c) 1 select a vertex r ∈ V [G] to be a "root" vertex 2 compute a minimum spanning tree T for G from root r using MST-PRIM(G, c, r) 3 let L be the list of vertices visited in a preorder tree walk of T 4 return the hamiltonian cycle H that visits the vertices in the order L

19

Preorder Traversal

Preorder: (root-left-right)Visit the root first; and then traverse the left subtree; and then traverse the right subtree.

Example:

Order: A,B,C,D,E,F,G,H,I

20

Illustration

A full walk of the tree visits the vertices in the order a, b, c, b, h, b, a, d, e, f, e, g, e, d, a.

MST Tree walk W

preorder walk (Final solution H) OPT solution

21

2-approximation

Theorem. APPROX-TSP-TOUR is a polynomial-time 2-approximation algorithm for the traveling-salesman problem with the triangle inequality. Pf. Let COPT be the optimal cycle Cost(T) ≤ Cost(COPT)

Removing an edge from H gives a spanning tree, T is a spanning tree of minimum cost

Cost(W) = 2 Cost(T) Each edge visited twice

Cost(H) ≤ Cost(W) Triangle inequality

Cost(H) ≤ 2 Cost(COPT )

22

Load Balancing

Input. m identical machines; n jobs, job j has processing time tj.

Job j must run contiguously on one machine.A machine can process at most one job at a time.

Def. Let J(i) be the subset of jobs assigned to machine i. Theload of machine i is Li = j J(i) tj.

Def. The makespan is the maximum load on any machine L = maxi Li.

Load balancing. Assign each job to a machine to minimize makespan.

23

List-scheduling algorithm. Consider n jobs in some fixed order. Assign job j to machine whose load is smallest so far.

Implementation. O(n log n) using a priority queue.

Load Balancing: List Scheduling

List-Scheduling(m, n, t1,t2,…,tn) { for i = 1 to m { Li 0 J(i) }

for j = 1 to n { i = argmink Lk

J(i) J(i) {j} Li Li + tj

}}

jobs assigned to machine iload on machine i

machine i has smallest loadassign job j to machine iupdate load of machine i

24

Load Balancing: List Scheduling Analysis

Theorem. [Graham, 1966] Greedy algorithm is a (2-1/m)-approximation.

First worst-case analysis of an approximation algorithm. Need to compare resulting solution with optimal makespan L*.

Lemma 1. The optimal makespan L* maxj tj. Pf. Some machine must process the most time-consuming job. ▪

Lemma 2. The optimal makespan Pf.

The total processing time is j tj . One of m machines must do at least a 1/m fraction of total

work. ▪

L * 1m t jj .

25


Theorem. Greedy algorithm is a (2-1/m)-approximation.Pf. Consider load Li of bottleneck machine i.

Let j be last job scheduled on machine i. When job j assigned to machine i, i had smallest load. Its load

before assignment is Li - tj Li - tj Lk for all 1 k m.

j

0 L = LiLi - tj

machine i

blue jobs scheduled before j

26


Theorem. Greedy algorithm is a (2-1/m)- approximation.Pf. Consider load Li of bottleneck machine i.

Let j be last job scheduled on machine i. When job j assigned to machine i, i had smallest load. Its load

before assignment is Li - tj Li - tj Lk for all 1 k m. Sum inequalities over all k and divide by m:

Now ▪

Lemma 1

Lemma 2

1 ( )

* /

(1 1/ ) *

i j k jk

j

L t L tmL t m

m L

Lemma 1

*(1 1/ ) *

( ) (2 1/ ) *i i j j

Lm L

L L t t m L

27


Q. Is our analysis tight?A. Essentially yes. Indeed, LS algorithm has tight bound 2- 1/m

Ex: m machines, m(m-1) jobs length 1 jobs, one job of length m

machine 2 idlemachine 3 idlemachine 4 idlemachine 5 idlemachine 6 idlemachine 7 idlemachine 8 idlemachine 9 idle

machine 10 idle

list scheduling makespan = 19

m = 10

28


Q. Is our analysis tight?A. Essentially yes. Indeed, LS algorithm has tight bound 2- 1/m

Ex: m machines, m(m-1) jobs length 1 jobs, one job of length m

m = 10

optimal makespan = 10

29

Machine 2

Machine 1a d f

b c e gyes

Load Balancing on 2 Machines

Claim. Load balancing is hard even if only 2 machines.Pf. NUMBER-PARTITIONING P LOAD-BALANCE.

a d

f

b c

ge

length of job f

Time L0

machine 1

machine 2

30

Load Balancing: LPT Rule

Longest processing time (LPT). Sort n jobs in descending order of processing time, and then run list scheduling algorithm.

LPT-List-Scheduling(m, n, t1,t2,…,tn) { Sort jobs so that t1 ≥ t2 ≥ … ≥ tn

for i = 1 to m { Li 0 J(i) }

for j = 1 to n { i = argmink Lk

J(i) J(i) {j} Li Li + tj

}}

jobs assigned to machine iload on machine i

machine i has smallest loadassign job j to machine iupdate load of machine i

31


Observation. If at most m jobs, then list-scheduling is optimal.Pf. Each job put on its own machine. ▪

Lemma 3. If there are more than m jobs, L* 2 tm+1.Pf.

Consider first m+1 jobs t1, …, tm+1. Since the ti's are in descending order, each takes at least tm+1

time. There are m+1 jobs and m machines, so by pigeonhole

principle, at least one machine gets two jobs. ▪

Theorem. LPT rule is a 3/2 approximation algorithm.Pf. Same basic approach as for list scheduling.

▪

L i (Li t j ) L*

t j

12 L*

32 L *.

Lemma 3( by observation, can assume number of jobs > m )

32


Q. Is our 3/2 analysis tight?A. No.

Theorem. [Graham, 1969] LPT rule is a (4/3 – 1/(3m))-approximation.Pf. More sophisticated analysis of same algorithm.

Q. Is Graham's (4/3 – 1/(3m))- analysis tight?A. Essentially yes.

Ex: m machines, n = 2m+1 jobs, 2 jobs of length m+1, m+2, …, 2m-1 and three jobs of length m.

33

LPTProof. Jobs are indexed t1 ≥ t2 ≥ … ≥ tn.

If n ≤ m, already optimal (one machine processes one job).If n> 2m, then tn ≤ L*/3. Similar as the analysis of LS algorithm.Suppose total 2m – h jobs, 0 ≤ h < m Check that LPT is already optimal solution

1

hh+1h+2 n-1

n

Time

34

Approximation Scheme

NP-complete problems allow polynomial-time approximation algorithms that can achieve increasingly smaller approximation ratios by using more and more computation time Tradeoff between computation time and the quality of the approximation For any fixed >0, An ∈ approximation scheme for an optimization problem is an (1 + )-∈approximation algorithm.

35

PTAS and FPTAS

We say that an approximation scheme is a polynomial-time approximation scheme (PTAS) if for any fixed ∈ > 0, the scheme runs in time polynomial in the size n of its input instance.

Example: O(n2/∈). an approximation scheme is a fully polynomial-time approximation scheme (FPTAS) if it is an approximation scheme and its running time is polynomial both in 1/∈ and in the size n of the input instance

Example: O((1/ )∈ 2n3).

36

The Subset Sum

Input. A pair (S, t), where S is a set {x1, x2, ..., xn} of positive integers and t is a positive integer Output. A subset S′ of SGoal. Maximize the sum of S′ but its value is not larger than t.

37

An exponential-time exact algorithm

If L is a list of positive integers and x is another positive integer, then we let L + x denote the list of integers derived from L by increasing each element of L by x. For example, if L = 〈 1, 2, 3, 5, 9 〉 , then L + 2 = 〈 3, 4, 5, 7, 11 〉 . We also use this notation for sets, so that

S + x = {s + x : s ∈ S}.

38

Exact algorithm

MERGE-LISTS(L, L′): returns the sorted list that is the merge of its two sorted input lists L and L′ with duplicate values removed.EXACT-SUBSET-SUM(S, t) 1 n ← |S| 2 L0 ← 〈 0 〉3 for i ← 1 to n 4 do Li ← MERGE-LISTS(Li-1, Li-1 + xi) 5 remove from Li every element that is greater than t 6 return the largest element in Ln

39

Example

For example, if S = {1, 4, 5}, thenP1 ={0, 1} ,P2 ={0, 1, 4, 5} ,P3 ={0, 1, 4, 5, 6, 9, 10} .Given the identity

Since the length of Li can be as much as 2i, it is an exponential-time algorithm .

1 ( )i i i iP P P x

40

The Subset-sum problem: FPTAS

Trimming or rounding: if two values in L are close to each other, then for the purpose of finding an approximate solution there is no reason to maintain both of them explicitly.Let δ such that 0 < δ < 1.L′ is the result of trimming L, for every element y that was removed from L, there is an element z still in L′ that approximates y, that is

1y z y

41

ExampleFor example, if δ = 0.1 andL = 〈 10, 11, 12, 15, 20, 21, 22, 23, 24, 29 〉 ,then we can trim L to obtainL′ = 〈 10, 12, 15, 20, 23, 29 〉 ,TRIM(L, δ) 1 m ← |L| 2 L′ ← 〈 y1 〉3 last ← y1 4 for i ← 2 to m 5 do if yi > last · (1 + δ) ▹ yi ≥ last because L is sorted 6 then append yi onto the end of L′ 7 last ← yi 8 return L′

42

(1+ ∈)-Approximation algorithm

APPROX-SUBSET-SUM(S, t, ) ∈1 n ← |S| 2 L0 ← 〈 0 〉3 for i ← 1 to n 4 do Li ← MERGE-LISTS(Li-1, Li-1 + xi) 5 Li ← TRIM(Li, /2n∈ )6 remove from Li every element that is greater than t 7 let z* be the largest value in Ln 8 return z*

43

FPTAS

Theorem. APPROX-SUBSET-SUM is a fully polynomial-time approximation scheme for the subset-sum problem.Pf. The operations of trimming Li in line 5 and removing from Li every element that is greater than t maintain the property that every element of Li is also a member of Pi. Therefore, the value z* returned in line 8 is indeed the sum of some subset of S.

44

Pf. Con.

Pf. Let y* ∈ Pn denote an optimal solution to the subset-sum problem. we know that z* ≤ y*. We need to show that y*/z* ≤ 1 + ∈. By induction on i, it can be shown that for every element y in Pi that is at most t, there is a z ∈ Li such that

Thus, there is a z ∈ Ln , such that (1 / 2 )i

y z yn

**

(1 / 2 )n

y z yn

45

Pf. Con.

And thus,

Since there is a z ∈ Ln

Hence,

*

(1 / 2 )ny nz

*

* (1 / 2 )ny nz

/ 2

2

(1 / 2 )

1 / 2 ( )1

nn e

O

46

Pf. Con.

To show FPTAS, we need to bound Li.

After trimming, successive elements z and z′ of Li must have the relationship z′/z > 1+∈/2n

Each list, therefore, contains the value 0, possibly the value 1, and up to log⌊ 1+∈/2n t⌋ additional values

1 / 2lnlog

ln (1 / 2 )2 (1 / 2 ) ln

4 ln

ntt

nn n t

n t

approximation algorithm

Documents

vertexcover problem

vertex cover of size

minimum vertex cover

subset v v

optimal cover c

vertex cover of minimum

naive algorithmapproxvertex

return c