
Contents

14 Integer Programming and Algorithmic Geometry of Numbers
   Friedrich Eisenbrand
   14.1 Lattices, integer programming and the geometry of numbers
   14.2 Informal introduction to basis reduction
   14.3 The Hermite normal form
   14.4 Minkowski's theorem
   14.5 The LLL algorithm
   14.6 Kannan's shortest vector algorithm
   14.7 A randomized simply exponential algorithm for shortest vector
   14.8 Integer programming in fixed dimension
   14.9 The integer linear optimization problem
   14.10 Diophantine approximation and strongly polynomial algorithms
   14.11 Parametric integer programming
   References


Chapter 14
Integer Programming and Algorithmic Geometry of Numbers
A tutorial

Friedrich Eisenbrand

Abstract This chapter surveys a selection of results from the interplay of integer programming and the geometry of numbers. Apart from being a survey, the text is also intended as an entry point into the field. I therefore added exercises at the end of each section to invite the reader to delve deeper into the presented material.

14.1 Lattices, integer programming and the geometry of numbers

The central objects of study of the geometry of numbers are lattices. A lattice is a set Λ = {y ∈ R^n : y = Ax, x ∈ Z^n}, where A ∈ R^{n×n} is a nonsingular matrix. We say that the lattice is generated by A and write Λ = Λ(A). The matrix A is called a basis of the lattice Λ. If A is a rational matrix, i.e., A ∈ Q^{n×n}, then Λ is a rational lattice.

A very important problem, which has also received a lot of attention in computer science and optimization, is the shortest vector problem with respect to the ℓ_p-norm for some p ∈ N_+ ∪ {∞}. It is as follows.

Given a rational lattice basis A ∈ Q^{n×n}, compute a nonzero vector v ∈ Λ(A) with minimal norm ‖v‖_p.

If the norm is not specified, we implicitly assume the ℓ_2-norm and denote the length of a shortest vector w.r.t. the ℓ_2-norm as SV(Λ). The shortest vector problem can (in fixed dimension) be solved efficiently with lattice basis reduction. In varying dimension the shortest vector problem is NP-hard under randomized reductions [2]. Still, the fastest algorithms [43, 3] for this problem rely on basis reduction.

Friedrich Eisenbrand, Department of Mathematics, EPFL, Lausanne, Switzerland, e-mail: [email protected]


Fig. 14.1 The lattice generated by (2,0) and (3,2).

Lenstra [56] has shown that integer programming can be solved in polynomial time if the dimension is fixed. His algorithm is based on lattice basis reduction. Why do lattices and the algorithmic geometry of numbers come into play in integer programming?

The greatest common divisor of two integers a_0, a_1 ∈ Z, where not both a_0 and a_1 are zero, is the largest integer that divides both a_0 and a_1. It is denoted by gcd(a_0, a_1). The greatest common divisor can be efficiently computed with the Euclidean algorithm, see, e.g., [51, 1]. It computes the remainder sequence a_0, a_1, . . . , a_{k−1}, a_k ∈ N_+, where a_i, i ≥ 2, is given by a_{i−2} = a_{i−1}·q_{i−1} + a_i, q_i ∈ N, 0 < a_i < a_{i−1}, and a_k divides a_{k−1} exactly. Then a_k = gcd(a_0, a_1). The Euclidean algorithm can be interpreted as a reduction algorithm and we will see that the 2-dimensional basis reduction algorithm of Lagrange (see Section 14.2) works along the same lines.
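To make this concrete, here is a minimal Python sketch (my own illustration, not part of the chapter) of the extended Euclidean algorithm; the name exggT mirrors the subroutine used later in Algorithm 1, but this particular implementation is only an assumption of how it might look.

def exggT(a, b):
    """Extended Euclidean algorithm: return (g, x, y) with g = gcd(a, b) = x*a + y*b."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q = a // b                       # a_{i-2} = a_{i-1}*q_{i-1} + a_i
        a, b = b, a - q * b              # shift the remainder sequence
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return (a, x0, y0) if a >= 0 else (-a, -x0, -y0)

# exggT(2, 3) == (1, -1, 1), i.e. 1 = (-1)*2 + 1*3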

Now, the connection between integer linear programming and algorithmic number theory reveals itself already in the following theorem, which is proved at the beginning of every course on elementary number theory, see, e.g., [65].

Theorem 14.1. Let a, b ∈ Z be integers that are not both equal to zero. The greatest common divisor gcd(a,b) is the smallest integer in the set

{ax + by : x, y ∈ Z, ax + by ≥ 1}.   (14.1)

The problem to find the minimum in (14.1) can be modeled as an integer program in two variables and one constraint.

min ax + by
    ax + by ≥ 1
    x, y ∈ Z.

This already explains why efficient methods for integer programming with a fixed number of variables incorporate reduction techniques, which in a basic form appear already in the Euclidean algorithm. Such reduction techniques are the focus of this chapter.


14.2 Informal introduction to basis reduction

Informally, lattice basis reduction is about transforming a lattice basis B into a lattice basis B′ that generates the same lattice, Λ(B) = Λ(B′), and from which a shortest vector can (in fixed dimension) be easily determined. Before we make this more precise, we have to understand what valid transformations are, i.e., when Λ(B) = Λ(B′) holds. Recall that an integer matrix U ∈ Z^{n×n} is called unimodular if det(U) = ±1. Thus U ∈ Z^{n×n} is unimodular if and only if U is non-singular and U^{−1} is a matrix with integer entries, see Exercise 1.

Lemma 14.1. Let B, B′ ∈ R^{n×n} be two rational non-singular matrices. One has Λ(B) = Λ(B′) if and only if there exists a unimodular matrix U ∈ Z^{n×n} with B′ = B·U.

Proof. Suppose Λ(B) = Λ(B′). Then every column of B is in Λ(B′), which implies that there exists an integral matrix U ∈ Z^{n×n} with B = B′·U. Similarly, there exists an integral matrix V ∈ Z^{n×n} with B′ = B·V. From this it follows that B = B·V·U and since B is non-singular, this implies that V·U = I_n, where I_n is the n×n identity matrix. This implies that U is unimodular, since 1 = det(V·U) = det(V)·det(U) and since both det(V) and det(U) are integers, one has det(U) = ±1.

On the other hand, if B′ = B·U with an integral U, then Λ(B′) ⊆ Λ(B). If U is unimodular, then B = B′·U^{−1} and U^{−1} is integral, which implies Λ(B) ⊆ Λ(B′).

⊓⊔

Fig. 14.2 A lattice, two different bases and the lattice determinant, which is depicted as the volume of the parallelepipeds defined by the bases respectively. The bases shown are {(2,0), (3,2)} and {(2,0), (1,2)}.

Lemma 14.1 implies that the absolute value |det(B)| of the determinant of a basis B of Λ is an invariant of Λ. This value is called the determinant of Λ. The set Π(B) = {Bλ : λ ∈ R^n, 0 ≤ λ < 1} is called the parallelepiped spanned by the columns of B. The volume of this parallelepiped is the absolute value of the determinant of B. Thus the lattice determinant is the volume of the parallelepiped defined by the basis elements of any basis, see Figure 14.2.

The result of the multiplication of B by a unimodular matrix U can also be obtained by a sequence of elementary column operations on B:

i) swap of two columns,
ii) multiplication of a column with −1,
iii) addition of an integral multiple of one column to another column.

Now that we know which operations to apply, let us consider an example of lattice basis reduction. Suppose that we want to determine the shortest vector in the lattice which is generated by the following two column vectors v1 and v2.

[Figure: the basis vectors v1 and v2.]

It is difficult to guess what the shortest vector of this lattice could be. We subtract v1 from v2 and replace v2 with the outcome of this operation. The result is a new basis.

[Figure: the new basis after replacing v2 by v2 − v1.]

Still, it is not easy to determine the shortest vector. Next we subtract 3 times v2 from v1 and replace v1 with the outcome of this operation.

[Figure: the resulting basis, consisting of two orthogonal vectors.]

Now the shortest vector reveals itself easily. Since the obtained basis of our lattice consists of two orthogonal vectors, the shortest vector of the lattice is the shortest vector of the basis itself. In fact, we have traced above the reduction algorithm of Lagrange which was also described by Gauß [29].
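To make the two-dimensional procedure concrete, here is a small Python sketch of Lagrange's reduction (my own illustration, assuming vectors are given as coordinate pairs; the function name lagrange_reduce is hypothetical). It repeatedly subtracts the best integral multiple of the shorter basis vector from the longer one, just as traced above.

def lagrange_reduce(v1, v2):
    """Reduce a 2-dimensional lattice basis; on return v1 is a shortest nonzero lattice vector."""
    def norm2(v):
        return v[0] * v[0] + v[1] * v[1]
    if norm2(v1) > norm2(v2):
        v1, v2 = v2, v1
    while True:
        # integral multiple of v1 that makes v2 as short as possible
        q = round((v1[0] * v2[0] + v1[1] * v2[1]) / norm2(v1))
        v2 = (v2[0] - q * v1[0], v2[1] - q * v1[1])
        if norm2(v2) >= norm2(v1):
            return v1, v2
        v1, v2 = v2, v1

# lagrange_reduce((2, 0), (3, 2)) returns ((2, 0), (-1, 2)); (2, 0) is a shortest vector.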

Intuitively it seems clear that the shortest vector problem should still be easy if the basis is almost orthogonal. We will deliver a precise definition of this in Section 14.5, where we describe the LLL algorithm.

Exercises

1) Prove that U ∈ Z^{n×n} is unimodular if and only if U is non-singular and U^{−1} is a matrix with integer entries.


2) Let B ∈ Q^{n×n} be a lattice basis that consists of pairwise orthogonal vectors. Prove that the shortest vector of Λ(B) is the shortest column vector of B.

3) Let Λ, Λ′ ⊆ Z^n be lattices and suppose that Λ ⊇ Λ′. Show that det(Λ) divides det(Λ′).

4) Consider three points v1, v2, v3 ∈ Z^2 that are not co-linear, i.e., there is no line containing all three points. Show that the triangle spanned by v1, v2 and v3 does not contain an integer point apart from v1, v2 and v3 itself, if and only if the matrix (v2−v1, v3−v2) is unimodular.
A similar statement cannot be made in R^3. Provide an example of linearly independent integer vectors v1, v2, v3 ∈ Z^3 such that the simplex conv{0, v1, v2, v3} does not contain an integer point apart from 0, v1, v2 and v3 and det(v1, v2, v3) ≠ ±1.

5) (Pick's formula) The convex hull P = conv{v1, . . . , vn} of integer points v_i ∈ Z^2, i = 1, . . . , n, is a convex lattice polygon. Let A, I and B be the area, the number of integer points in the interior, and the number of integer points on the boundary of P respectively. Prove Pick's formula

A = I + B/2 − 1.

Hint: Exercise 4).

14.3 The Hermite normal form

This section is devoted to the algorithmic solution of the following problem:

Given a rational matrix A ∈ Q^{m×n} of full row rank and a rational vector b ∈ Q^m, decide whether there exists an integral vector x ∈ Z^n with Ax = b.

This problem is solved with the Hermite normal form of a rational matrix, which is central to this chapter. We will also see that the set {Ax : x ∈ Z^n} is a lattice, namely the lattice that is generated by the Hermite normal form of A.

We motivate the Hermite normal form first via certificates of unsolvability of rational linear equations. Suppose we are given a system Ax = b of linear equations over the reals. How can one certify that this system does not have a solution? The following well-known theorem from linear algebra provides a certificate.

Theorem 14.2. Let A ∈ R^{m×n} be a matrix and b ∈ R^m be a vector. The system Ax = b does not have a solution x ∈ R^n if and only if there exists a λ ∈ R^m with λ^T A = 0 and λ^T b ≠ 0.

Proof. If x is a solution to the system, then λ^T A = 0 implies λ^T b = λ^T Ax = 0^T x = 0.
On the other hand, if Ax = b does not have a solution, then b is not in the vector space Z = {z ∈ R^m : z = Ax, x ∈ R^n}. The vector space Z is the kernel of a matrix C ∈ R^{k×m},

Z = {z ∈ R^m : Cz = 0}.

We have CA = 0 and Cb ≠ 0. We can choose a y with y^T Cb ≠ 0. We then have y^T CA = 0, so λ^T = y^T C serves our purpose. ⊓⊔

Now let A ∈ Q^{m×n} be a rational matrix and let b ∈ Q^m be a rational vector. What happens if we ask whether a system Ax = b has an integer solution x ∈ Z^n?

Clearly Ax = b has an integer solution if and only if (α·A)x = α·b has an integer solution for α ∈ R − {0}. This means that we can multiply A and b with the least common multiple of the denominators of their entries and obtain an integral matrix and an integral vector.

The simplest nontrivial system of rational equations over the integers is one equation and two variables. This case was already considered by Gauß [29], who mentions the following theorem. Recall the notation u | v, which stands for u divides v for integers u and v.

Theorem 14.3. Let a, b and c be integers where a or b are nonzero. The system

ax + by = c   (14.2)

has a solution with integers x and y if and only if gcd(a,b) | c.

Proof. Let d be a common divisor of a and b, i.e., d·a′ = a and d·b′ = b with integers a′ and b′. If ax + by = c, then

d(a′x + b′y) = c,

which implies that d | c. Thus if (14.2) has an integral solution, then gcd(a,b) | c.
On the other hand, let gcd(a,b)·k = c for some integer k. Recall that there exist integers x′ and y′ with ax′ + by′ = gcd(a,b). An integral solution of (14.2) is then x = kx′ and y = ky′. ⊓⊔

In other words, (14.2) does not have an integral solution if and only if there exists an α ∈ R such that α·(a,b) ∈ Z^2 and α·c ∉ Z. It turns out that this simple observation can be generalized.

Theorem 14.4. Let A ∈ Q^{m×n} and b ∈ Q^m. The system

Ax = b   (14.3)

does not have an integral solution if and only if there exists a λ ∈ R^m with λ^T A ∈ Z^n and λ^T b ∉ Z.

To prove this theorem, we now introduce the Hermite normal form. A rational matrix A ∈ Q^{m×n} of full row rank is said to be in Hermite normal form (HNF) if it has the form [B | 0], where B is a nonsingular, lower triangular matrix with non-negative entries, in which each row has a unique maximal entry, located on the diagonal. The following is an example of a matrix in HNF:


( 2 0 0 )
( 1 3 0 )

Theorem 14.5. Each rational matrix A ∈ Q^{m×n} of full row rank can be brought into Hermite normal form by a finite series of elementary column operations.

Proof. By scaling all entries of A with a positive common multiple α of the denominators, we obtain an integral matrix A′. If [H′ | 0] is an HNF of A′, then (1/α)[H′ | 0] is an HNF of A. Therefore we can assume without loss of generality that A is integral.

Algorithm 1 computes the Hermite normal form and a corresponding unimodular matrix of an integral matrix with full row rank with the use of the extended Euclidean algorithm. Here the term "update columns i and j" with a 2×2 matrix (a b; c d) means that the new columns c_i and c_j result from their old values by multiplying (c_i, c_j) with this matrix, i.e., (c_i, c_j) := (c_i, c_j)·(a b; c d).

The function exggT(a,b) implements the extended Euclidean algorithm, which computes the triple (g,x,y), where g = gcd(a,b) and x and y are integers with g = xa + yb, see, e.g., [51]. Since the matrix (x −b/g; y a/g) is unimodular, the updating step in the algorithm corresponds to a sequence of elementary column operations.

Algorithm 1 HNF of A

Input: A ∈ Z^{m×n} of full row rank
Return: [H | 0] ∈ Z^{m×n}, the HNF of A, and a unimodular matrix U ∈ Z^{n×n} with A·U = [H | 0]

H ← A
U ← I_n
for i = 1 to m do
  for j = i+1 to n do
    if H_{i,j} ≠ 0 then
      (g,x,y) := exggT(H_{i,i}, H_{i,j})
      update columns i and j of H and U with (x −H_{i,j}/g; y H_{i,i}/g)
    end if
  end for
  for j = 1 to i−1 do
    write H_{i,j} = q·H_{i,i} + r (division with remainder, r non-negative)
    update columns j and i of H and U with (1 0; −q 1)   {reduce the entries to the left of the diagonal element H_{i,i}}
  end for
end for

⊓⊔

It is an easy exercise to show that the following invariant holds after each "update columns" operation of Algorithm 1:

A·U = H.


Let us trace the algorithm when it is executed on the matrix

A = ( 2 3 4; 2 4 6 ).

The greatest common divisor of (2,3) is 1 = (−1)·2 + 1·3. Thus we update columns 1 and 2 with the matrix (−1 −3; 1 2), obtaining ( 1 0 4; 2 2 6 ). The transforming matrix U becomes (−1 −3 0; 1 2 0; 0 0 1). Eliminating 4 yields H = ( 1 0 0; 2 2 −2 ) and U = (−1 −3 4; 1 2 −4; 0 0 1). Eliminating −2 yields H = ( 1 0 0; 2 2 0 ) and U = (−1 −4 1; 1 4 −2; 0 −1 1). Now reducing the 2 in the lower left corner yields the Hermite normal form H = ( 1 0 0; 0 2 0 ) and the unimodular transformation matrix U = ( 3 −4 1; −3 4 −2; 1 −1 1 ) with A·U = H.
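The following Python sketch (my own illustration; names such as hnf and update_columns are not from the text) carries out the column operations of Algorithm 1 with exact integer arithmetic. It is meant for small instances like the 2×3 example traced above, not as the polynomial-time variant discussed below.

def exggT(a, b):
    """Extended Euclid: return (g, x, y) with g = gcd(a, b) = x*a + y*b and g >= 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return (a, x0, y0) if a >= 0 else (-a, -x0, -y0)

def update_columns(M, i, j, a, b, c, d):
    """(col_i, col_j) := (col_i, col_j) * (a b; c d), applied to M in place."""
    for row in M:
        ci, cj = row[i], row[j]
        row[i], row[j] = a * ci + c * cj, b * ci + d * cj

def hnf(A):
    """Return (H, U) with A*U = H in Hermite normal form (A integral, full row rank)."""
    m, n = len(A), len(A[0])
    H = [row[:] for row in A]
    U = [[int(i == j) for j in range(n)] for i in range(n)]
    for i in range(m):
        for j in range(i + 1, n):                 # create zeros to the right of H[i][i]
            if H[i][j] != 0:
                g, x, y = exggT(H[i][i], H[i][j])
                a, b, c, d = x, -H[i][j] // g, y, H[i][i] // g
                update_columns(H, i, j, a, b, c, d)
                update_columns(U, i, j, a, b, c, d)
        for j in range(i):                        # reduce entries left of the diagonal
            q = H[i][j] // H[i][i]
            update_columns(H, j, i, 1, 0, -q, 1)
            update_columns(U, j, i, 1, 0, -q, 1)
    return H, U

# hnf([[2, 3, 4], [2, 4, 6]]) reproduces the trace above: H = [[1, 0, 0], [0, 2, 0]].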

Proof (of Theorem 14.4). Suppose Ax = b has an integral solution x ∈ Z^n. Then with λ^T A ∈ Z^n one has λ^T b = λ^T Ax ∈ Z.

Suppose now that Ax = b does not have an integral solution. If Ax = b is not solvable in R^n, then there exists a λ ∈ R^m with λ^T A = 0 and λ^T b ≠ 0, and thus a λ with λ^T A = 0 and λ^T b = 1/2. Thus we can assume that Ax = b has a solution in R^n and therefore we can assume that A has full row rank.

There exists a unimodular matrix U ∈ Z^{n×n} such that A·U = [B | 0] is in HNF. Thus B^{−1}A is an integral matrix (since B^{−1}A = [I_m | 0]·U^{−1}). We claim that B^{−1}b is not an integral vector and thus that there exists a row λ^T of B^{−1} with λ^T A ∈ Z^n and λ^T b ∉ Z.

If B^{−1}b is integral, then

A·U·(B^{−1}b; 0) = [B | 0]·(B^{−1}b; 0) = b

shows that x = U·(B^{−1}b; 0) is an integral solution of Ax = b, which is a contradiction.

⊓⊔

The next theorem shows that the HNF is unique. For this we extend the notation Λ(A) = {Ax : x ∈ Z^n} also to the case where A ∈ Q^{m×n} is a rational matrix of full row rank which is not necessarily a basis. Such a set is also a lattice, since Λ(A) = Λ(B), where [B | 0] is the HNF of A and B is nonsingular.

Theorem 14.6. Let A and A′ be integral matrices of full row rank, with HNF [B | 0] and [B′ | 0] respectively. Then Λ(A) = Λ(A′) if and only if B = B′.

Proof. If B = B′, then clearly Λ(A) = Λ(A′).
It remains to prove that if B and B′ are different, then Λ(B) ≠ Λ(B′). So assume that B and B′ nevertheless generate the same lattice Λ. Let i be the smallest row index such that row i of B differs from row i of B′. Without loss of generality assume that B_{i,j} > B′_{i,j} holds for some 1 ≤ j ≤ i. Observe that the i-th components of elements of Λ whose first i−1 components are 0 are integral multiples of B_{i,i}.
Notice that the first i−1 components of the vector β = B(j) − B′(j) ∈ Λ are 0, where B(j) and B′(j) denote the j-th column of B and B′ respectively. But the i-th component of β is strictly between 0 and B_{i,i}. This is a contradiction. ⊓⊔

Corollary 14.1. The Hermite normal form of a rational matrix with full row rank is unique.


Clearly, Algorithm 1 requires O(m·n) extended gcd computations and O(m^2·n) arithmetic operations. But is it a polynomial algorithm? This cannot be argued, since the binary encoding length of the numbers could grow exponentially. The size of an integer z is the number of bits which are required to encode z. We define size(z) = 1 + ⌈log_2(|z|+1)⌉. Likewise, the size of a matrix A ∈ Z^{m×n} is the number of bits needed to encode A, i.e., size(A) = mn + ∑_{i,j} size(a_{ij}), see [75, p. 29]. In fact, Fang and Havas [23] provide examples of pivoting schemes where the size of the numbers in the intermediate steps of the HNF algorithm above grows exponentially.

Nevertheless, it can be shown that the HNF of a rational matrix can be computed in polynomial time, see [47, 76]. The key to Schrijver's method [76] is the following lemma.

Lemma 14.2. Let A ∈ Z^{m×n} be a matrix of full row rank, let d = det(Λ(A)) be the lattice determinant of Λ(A) and let D be a multiple of d. Then

Λ(A) = Λ([A | D·I_m]).

Proof. Clearly Λ(A) ⊆ Λ([A | D·I_m]). Let Λ denote Λ(A). For the reverse inclusion we simply have to show that D·e_i ∈ Λ for each unit vector e_i, i = 1, . . . , m. Let B ∈ Z^{m×m} be a basis of Λ and let B* ∈ Z^{m×m} be the adjugate of B. All column vectors of B·B* are elements of Λ. But Cramer's rule says B^{−1} = (1/det(B))·B*. Therefore B·B* = det(B)·I_m. Since d = |det(B)| we see that d·e_i ∈ Λ for each unit vector e_i, i = 1, . . . , m. Since d | D we also have that D·e_i ∈ Λ. ⊓⊔

Corollary 14.2. Let [H | 0] be the HNF of A ∈ Z^{m×n}, where A has full row rank, and let d = det(Λ(A)) be the determinant of Λ(A). Let [H′ | 0] be the HNF of [A | d·I_m] ∈ Z^{m×(n+m)}. Then H′ = H.

Theorem 14.7. There exists a polynomial time algorithm which computes the HNF of a rational matrix of full row rank.

Proof. Since we can scale A with the product of the denominators of A, we can assume without loss of generality that A is an integral matrix. We start by identifying m linearly independent columns of A and by computing the absolute value D of the determinant of the submatrix formed by these columns. This can be done with Gaussian elimination in polynomial time [17]. The fact that the encoding length of D is polynomial follows from the Hadamard bound (14.9) which we discuss later. Exercise 14.2.3) shows that det(Λ(A)) | D.

We compute the HNF of [A | D·I_m] as in Algorithm 1, but keep the numbers in a row reduced modulo D until we eliminate the entry D in this row stemming from D·I_m.

The important observation is this. The first i rows of H remain unchanged after the i-th run through the first for loop. The remaining rows of H can be kept reduced modulo D, since the last m−i columns of H are of the form (0; D·I_{m−i}). This procedure requires a polynomial amount of extended gcd computations and arithmetic operations on numbers of size at most size(D). ⊓⊔

Theorem 14.8. There exists a polynomial algorithm which computes the HNF [H | 0] of a rational matrix A ∈ Q^{m×n} and the corresponding unimodular matrix U ∈ Z^{n×n} with A·U = [H | 0].


Proof. Select m linearly independent columns of A using Gaussian elimination. Assume without loss of generality that those are the first m columns. The matrix [H | 0] is the HNF of A if and only if the HNF of the nonsingular matrix

A′ = ( A )
     ( 0 | I_{n−m} )

is of the form

H′ = ( H | 0 )
     (   B   )

with some matrix B. Compute H′ with the polynomial modification of Algorithm 1. We have A′·U = H′ and since A′ is nonsingular we have U = A′^{−1}·H′. Clearly A·U = [H | 0]. ⊓⊔

The proof of the next theorem is left as an exercise.

Theorem 14.9. There exists a polynomial algorithm for the following problem:

Given an integral matrix A ∈ Z^{m×n} and an integral vector b ∈ Z^m, compute a solution x ∈ Z^n of Ax = b or establish that no such x exists.

Notes

Showing that Gaussian elimination is in fact a polynomial-time algorithm is not trivial. Edmonds [17] has shown that, if one keeps the numerators and denominators in the intermediate steps of the algorithm gcd-free, then the size of the numbers remains polynomial. Von zur Gathen and Sieveking [82] showed that systems of rational equations over the integers can be solved in polynomial time.

In fact, there is an arsenal of ways to do linear algebra in polynomial time, one of them being a modular method again. If one wants to compute the determinant of an integral matrix A ∈ Z^{n×n} for example, one could apply the Hadamard bound (14.9) to compute a D ∈ Z with 2·|det(A)| < D. If we now know a number x ∈ Z, 0 ≤ x < D, with x ≡ det(A) (mod D), then we can retrieve det(A) from x. In fact, if x ≤ D/2, then det(A) = x, and if x > D/2, then det(A) = x−D. One computes with the sieve of Eratosthenes the first k prime numbers p_1, . . . , p_k such that p_1 · · · p_k > D holds. The prime number theorem guarantees that the value of p_k is polynomial in the encoding length of D. One can then compute the value det(A) (mod p_i) for each p_i with Gaussian elimination in the field Z_{p_i} and reconstruct x with the Chinese remainder theorem, see, e.g., [81].
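A minimal Python sketch of this modular scheme (my own illustration; the helper names det_mod_p and det_by_crt are hypothetical, and pow(·, -1, p) assumes Python 3.8 or later). It computes det(A) modulo several small primes by Gaussian elimination over Z_p and recombines the residues with the Chinese remainder theorem; the Hadamard bound supplies D.

from math import isqrt, prod

def det_mod_p(A, p):
    """Determinant of the integer matrix A modulo the prime p."""
    A = [[x % p for x in row] for row in A]
    n, det = len(A), 1
    for i in range(n):
        pivot = next((r for r in range(i, n) if A[r][i]), None)
        if pivot is None:
            return 0
        if pivot != i:
            A[i], A[pivot] = A[pivot], A[i]
            det = -det
        det = det * A[i][i] % p
        inv = pow(A[i][i], -1, p)
        for r in range(i + 1, n):
            f = A[r][i] * inv % p
            A[r] = [(a - f * b) % p for a, b in zip(A[r], A[i])]
    return det % p

def det_by_crt(A):
    """Reconstruct det(A) from residues modulo primes whose product exceeds 2|det(A)|."""
    bound = prod(isqrt(sum(x * x for x in row)) + 1 for row in A)   # Hadamard bound
    D, primes, p = 1, [], 2
    while D <= 2 * bound:
        if all(p % q for q in primes):            # p is prime (trial division)
            primes.append(p)
            D *= p
        p += 1
    x = 0
    for q in primes:                              # Chinese remaindering
        M = D // q
        x = (x + det_mod_p(A, q) * M * pow(M, -1, q)) % D
    return x if x <= D // 2 else x - D            # symmetric lift recovers the sign

# det_by_crt([[2, 3], [1, 4]]) == 5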

Exercises

1) Compute by hand the HNF of ( 3 5 7 2; 2 4 9 3 ).

2) Write a computer program which implements Algorithm 1.
3) Prove Theorem 14.9.
4) Let D ∈ N_+ be an integer which is not necessarily prime and let A ∈ Z^{m×n} and b ∈ Z^m be an integer matrix and vector respectively. Show that there is a polynomial algorithm which computes a solution to the system

Ax ≡ b (mod D), x ∈ Z^n


or asserts that the system does not have a solution.
5) Describe a unimodular matrix G such that the step "update columns j and i of H and U with (1 0; −q 1)" in Algorithm 1 corresponds to the multiplication of H and U with G from the right.
6) Modify Algorithm 1 such that it detects on the fly whether A has full row rank and possibly determines a row which is in the span of the other rows.

14.4 Minkowski’s theorem

At the end of the 19th century, Minkowski opened the stage for the geometry of numbers with his famous book Geometrie der Zahlen [61]. One of his main results is described in this section. He used geometric methods to prove upper bounds on numbers which are representable by positive definite quadratic forms. In the setting of lattices his result means that a lattice Λ ⊆ R^n contains a nonzero lattice point whose norm is bounded roughly by √(2n/(π e))·det(Λ)^{1/n}. The simplicity and elegance of his approach is stunning. His result was preceded by a sequence of bounds which we briefly discuss. Lagrange and Gauß [29] proved that a 2-dimensional lattice has a nonzero lattice point whose norm is bounded by (4/3)^{1/4}·det(Λ)^{1/2}. In his Disquisitiones Arithmeticae [29] Gauß proved that a 3-dimensional lattice has a nonzero lattice vector of norm bounded by (4/3)^{1/2}·det(Λ)^{1/3}. Hermite [37] generalized this result to arbitrary dimension and provided the upper bound (4/3)^{(n−1)/4}·det(Λ)^{1/n}. All these results were algorithmic, in the sense that they provided algorithms computing nonzero lattice vectors achieving these bounds. It is remarkable that these algorithms run in polynomial time, if the dimension is fixed. The bound obtained by Minkowski is much stronger than the one of Hermite. It is however not known how to compute a nonzero lattice vector satisfying this bound in polynomial time, if the dimension is not fixed.

A convex body is a compact and full-dimensional convex set K ⊆ R^n. In its simplest form, Minkowski's theorem is as follows.

Theorem 14.10. Let K ⊆ R^n be a convex body which is symmetric around the origin (x ∈ K implies −x ∈ K). If vol(K) ≥ 2^n, then K contains a nonzero integral vector v ∈ Z^n \ {0}.

Proof. We first argue that it suffices to assume that vol(K) > 2^n holds. If vol(K) = 2^n, then, since K is compact, there exists an ε > 0 such that the distance to K of each integer point that is not contained in K is at least ε. This means that there exists a constant δ > 0 such that the sets (1+δ)·K and K contain the same integer points. The volume of (1+δ)·K however is strictly larger than 2^n.

Suppose now that the theorem does not hold. Consider the set S = 1/2·K and the translates of S with integer vectors

S + v, v ∈ Z^n.


Fig. 14.3 An illustration of the proof of Minkowski's theorem, where the dark rectangle stands for K and the light rectangles for the sets S+v. The inner square represents the points x ∈ R^2 with ‖x‖_∞ ≤ M and the outer square represents the set (14.5), being the points x ∈ R^2 with ‖x‖_∞ ≤ M+D.

If (S+v1) ∩ (S+v2) ≠ ∅ for some integral vectors v1 ≠ v2, then there exist k1, k2 ∈ K such that

0 ≠ v1 − v2 = 1/2(k2 − k1).

Due to the symmetry of K around the origin we have −k1 ∈ K and due to the convexity of K we have 1/2(k2−k1) ∈ K, which is a nonzero integral point. This shows that the translates S+v, v ∈ Z^n, do not intersect.

Now we consider the volume

V_M = vol( ⋃_{v ∈ Z^n, ‖v‖_∞ ≤ M} (S+v) )

for some M ∈ N. Since the S+v do not intersect we have


V_M = ∑_{v ∈ Z^n, ‖v‖_∞ ≤ M} vol(S) = (2·M+1)^n·vol(S).   (14.4)

Since S is bounded, S has finite diameter D ∈ R. This means that the union

⋃_{v ∈ Z^n, ‖v‖_∞ ≤ M} (S+v)

is a subset of

{x ∈ R^n : ‖x‖_∞ ≤ M+D}.   (14.5)

The volume of the set in (14.5) is 2^n·(M+D)^n. Therefore the inequality

2^n·(M+D)^n ≥ (2·M+1)^n·vol(S)

must hold. As M tends to infinity, the expression

(2·M+2·D)^n / (2·M+1)^n

tends to one. This is a contradiction, since vol(S) > 1. ⊓⊔

Minkowski's theorem also has a version for general lattices. Let Λ(B) ⊆ R^n be a lattice and K ⊆ R^n be a convex set which is symmetric around the origin with vol(K) ≥ 2^n·det(Λ). The mapping φ(x) = B^{−1}x maps Λ to Z^n and φ(K) is a convex body which is symmetric around the origin. The volume of φ(K) is vol(φ(K)) = (1/det(Λ))·vol(K) and thus vol(φ(K)) ≥ 2^n. Theorem 14.10 implies that φ(K) contains a nonzero integer vector, or equivalently K contains a nonzero lattice vector from Λ.

Theorem 14.11 (Minkowski's convex body theorem [61]). Let Λ ⊆ R^n be a lattice and let K ⊆ R^n be a convex body of volume vol(K) ≥ 2^n·det(Λ) that is symmetric about the origin. Then K contains a nonzero lattice point.

As announced in the introduction of this section, Minkowski's theorem can be used to derive an upper bound on the length of a shortest vector of a lattice Λ in terms of the determinant of Λ. Let V_n be the volume of the n-dimensional unit ball. By scaling the unit ball with α ∈ R_+ one obtains a ball of volume α^n·V_n. This is greater than or equal to 2^n·det(Λ) for α ≥ 2·(det(Λ)/V_n)^{1/n}. This has the following consequence.

Theorem 14.12. A lattice Λ ⊆ R^n has a nonzero lattice point of length less than or equal to 2·(det(Λ)/V_n)^{1/n}.

The formula for V_n is

V_n = π^{⌊n/2⌋}·2^{⌈n/2⌉} / ∏_{0 ≤ 2i < n} (n−2i).


Using Stirling's formula (n! ≈ √(2πn)·(n/e)^n) one sees that this is roughly (2πe/n)^{n/2}. The bound of Theorem 14.12 is thus roughly √(2n/(π e))·det(Λ)^{1/n}.
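As a quick numerical illustration of these formulas, here is a small Python sketch (my own, not from the text) that evaluates V_n and the bound 2·(det(Λ)/V_n)^{1/n} of Theorem 14.12:

from math import pi, prod

def unit_ball_volume(n):
    """V_n = pi^floor(n/2) * 2^ceil(n/2) / (n*(n-2)*(n-4)*...)."""
    return pi ** (n // 2) * 2 ** ((n + 1) // 2) / prod(range(n, 0, -2))

def minkowski_bound(n, det_lattice):
    """Upper bound of Theorem 14.12 on the length of a shortest nonzero lattice vector."""
    return 2 * (det_lattice / unit_ball_volume(n)) ** (1 / n)

# unit_ball_volume(2) == pi and unit_ball_volume(3) == 4*pi/3 (up to rounding).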

Exercises

1) Show that a lattice Λ has a nonzero lattice point v with ‖v‖_∞ ≤ det(Λ)^{1/n}.
2) Show that Minkowski's theorem holds also for convex sets that are full-dimensional and closed, i.e., the boundedness condition is not really necessary.
3) Let K ⊆ R^n be a convex body of volume vol(K) ≥ k·2^n. Show that K contains at least 2·k nonzero integer points.
4) (Two squares theorem) In this exercise you will prove that a prime number p with p ≡ 1 (mod 4) can be written as the sum of two square numbers, p = a^2 + b^2 for a, b ∈ N.

a) Show that the equation q^2 ≡ −1 (mod p) has a solution.
b) Consider the lattice Λ generated by (1 0; q p) and the disk of radius √(2p) − ε around 0 for a small ε > 0.
   i) Show that ‖v‖^2 is divisible by p for each v ∈ Λ.
   ii) Show that there exists a v ∈ Λ \ {0} with ‖v‖^2 = p.
   iii) Conclude that p is the sum of two squares.

Hints: Wilson's theorem, see, e.g., [65], states (p−1)! ≡ −1 (mod p). If q^2 ≡ −1 does not have a solution, then Z*_p can be paired (perfectly matched) into (p−1)/2 pairs, where the product of each pair is congruent to −1. But (p−1)/2 is even (p ≡ 1 (mod 4)), implying (p−1)! ≡ 1 (mod p), contradicting Wilson's theorem.

14.5 The LLL algorithm

Recall the intuition from the introductory section on basis reduction. We want to transform a lattice basis into an equivalent one that is almost orthogonal. The hope is that a shortest vector can be determined from such a basis more easily. For a clear understanding of what almost orthogonal should mean, we first recall the Gram-Schmidt procedure.

Let U ⊆ R^n be a subspace of R^n and let v ∈ R^n. The projection of v onto the orthogonal complement of U is the vector v* = v − h, where h ∈ U and ⟨v−h, u⟩ = 0 for each u ∈ U. If U = ⟨x_1, . . . , x_k⟩ is the subspace generated by x_1, . . . , x_k, then v−h is also called the projection of v onto the orthogonal complement of x_1, . . . , x_k.

The Gram-Schmidt procedure computes a set of vectors b*_1, . . . , b*_n such that the following conditions hold.

i) The vectors b_1, . . . , b_k span the same subspace as b*_1, . . . , b*_k for each k = 1, . . . , n.


Fig. 14.4 The projection b*_2 of b_2 onto the orthogonal complement of the subspace generated by b_1.

ii) The vectors b*_1, . . . , b*_n are pairwise orthogonal.

The procedure is described below. It is easy to see that i) and ii) hold for B*.

Algorithm 2 Gram-Schmidt orthogonalization

Input: B = (b_1, . . . , b_n) ∈ R^{n×n} nonsingular
Return: B* = (b*_1, . . . , b*_n) ∈ R^{n×n} satisfying conditions i) and ii)

b*_1 ← b_1
for j = 2, . . . , n do
  b*_j ← b_j − ∑_{i=1}^{j−1} µ_{ji} b*_i, where µ_{ji} = ⟨b_j, b*_i⟩ / ‖b*_i‖^2
end for
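A direct Python transcription of Algorithm 2 (my own sketch; it also returns the coefficients µ_{ji}, which are needed for the LLL algorithm below):

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(B):
    """B is a list of basis vectors. Return (Bstar, mu) with b*_j = b_j - sum_i mu[j][i]*b*_i."""
    n = len(B)
    Bstar = [list(map(float, b)) for b in B]
    mu = [[0.0] * n for _ in range(n)]
    for j in range(1, n):
        for i in range(j):
            mu[j][i] = dot(B[j], Bstar[i]) / dot(Bstar[i], Bstar[i])
            Bstar[j] = [x - mu[j][i] * y for x, y in zip(Bstar[j], Bstar[i])]
    return Bstar, mu

# gram_schmidt([[2, 0], [3, 2]]) yields B* = [[2, 0], [0, 2]] and mu[1][0] = 1.5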

The Gram-Schmidt orthogonalization procedure decomposes the matrix B = (b_1, . . . , b_n) into

B = B*·R   (14.6)

where B* is a matrix with pairwise orthogonal columns b*_1, . . . , b*_n and

R = ( 1        µ )
    (    ⋱       )
    ( 0        1 )

is an upper triangular matrix with diagonal elements 1. The decomposition (14.6) is the Gram-Schmidt orthogonalization (GSO) of B.

From the Gram-Schmidt orthogonalization procedure, we can also deduce the so-called Hadamard bound. Notice that

det(B) = det(B*·R) = det(B*).   (14.7)

Since det(B*·B*^T) = ‖b*_1‖^2 · · · ‖b*_n‖^2 and since det(B*·B*^T) = det(B*)^2 we conclude that

|det(B)| = ∏_{i=1}^{n} ‖b*_i‖.   (14.8)

Since we have ‖b_i‖ ≥ ‖b*_i‖ for i = 1, . . . , n we obtain the Hadamard bound


|det(B)| ≤ ∏_{i=1}^{n} ‖b_i‖.   (14.9)

We can use equation (14.9) now to measure how far a lattice basis B deviates from being orthogonal. The orthogonality defect of the lattice basis B is the number γ such that

γ·|det(B)| = ∏_{i=1}^{n} ‖b_i‖   (14.10)

holds. The columns of B are pairwise orthogonal if and only if γ = 1. With the next theorem, we connect the computation of a shortest vector to this orthogonality defect. It implies that a shortest vector can be found among (2·γ+1)^n candidates.

Theorem 14.13. Let B ∈ Q^{n×n} be a lattice basis with orthogonality defect γ. A shortest nonzero vector v ∈ Λ(B) is of the form

v = ∑_{i=1}^{n} x_i·b_i, with x_i ∈ Z, −γ ≤ x_i ≤ γ.   (14.11)

Proof. Since ‖b_j‖ ≥ ‖b*_j‖ for j = 1, . . . , n and since

‖b_1‖ · · · ‖b_n‖ = γ·‖b*_1‖ · · · ‖b*_n‖,

we can conclude that

‖b_n‖ ≤ γ·‖b*_n‖   (14.12)

holds. Consider now v = B·x in (14.11). Since B·x = B*·R·x, where B* is the Gram-Schmidt orthogonalization of B, and since the last component of R·x is equal to x_n, we conclude that

‖B·x‖ ≥ |x_n|·‖b*_n‖ ≥ (|x_n|/γ)·‖b_n‖.   (14.13)

From this we can conclude that, if v is a shortest vector, then x_n satisfies −γ ≤ x_n ≤ γ.
The orthogonality defect is invariant under the permutation of columns of B. Therefore each vector in the basis can play the role of being the last vector in the above argument. This implies that each component x_i in (14.11) satisfies −γ ≤ x_i ≤ γ, which implies the assertion. ⊓⊔
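Theorem 14.13 immediately suggests a brute-force search once a basis with small orthogonality defect is available. A minimal Python sketch of this enumeration (my own illustration; it is exponential in the dimension and takes an integer upper bound on γ):

from itertools import product

def shortest_vector_by_enumeration(B, gamma):
    """Enumerate v = sum_i x_i*b_i with |x_i| <= gamma and return a shortest nonzero one.
    B is a list of basis vectors; gamma is the orthogonality defect, rounded up."""
    n = len(B)
    best, best_norm2 = None, None
    for x in product(range(-gamma, gamma + 1), repeat=n):
        v = [sum(x[i] * B[i][k] for i in range(n)) for k in range(len(B[0]))]
        norm2 = sum(c * c for c in v)
        if norm2 > 0 and (best_norm2 is None or norm2 < best_norm2):
            best, best_norm2 = v, norm2
    return best

# shortest_vector_by_enumeration([[2, 0], [3, 2]], gamma=2) returns a vector of norm 2.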

This implies that the shortest vector problem can be solved in fixed dimension if we can compute a lattice basis B′ of Λ(B) whose orthogonality defect is bounded by a constant depending on the dimension only. Therefore we call a lattice basis B ∈ Q^{n×n} reduced if its orthogonality defect is bounded by a constant γ_n, depending only on the dimension n.

The LLL algorithm [55] is an algorithm which reduces a lattice basis in polynomial time. The orthogonality defect of an LLL-reduced basis can be bounded by 2^{n(n−1)/4}. Consequently, the shortest vector problem can be solved in time 2^{O(n^3)} times a polynomial in the binary encoding length of the input.


Normalization

Let B = B*·R be the GSO of B. If R were the identity matrix, then B would be B* and thus orthogonal. The first step of the LLL algorithm is normalization: one applies elementary column operations to transform the upper triangular matrix R into a matrix that is as close as possible to the identity matrix. Normalization is in the literature also sometimes referred to as size reduction.

Let r_{ij} be the j-th entry of the i-th row of R. By subtracting ⌊r_{ij}⌉ times the i-th column of R from the j-th column, the new entry r′_{ij} at position ij will satisfy −1/2 < r′_{ij} ≤ 1/2. Notice that the entries in a row below the i-th row of R remain unchanged. Thus, working our way from the last to the first row, we obtain a basis B′ = B*·R′ with

−1/2 < r′_{ij} ≤ 1/2, for 1 ≤ i < j ≤ n.   (14.14)

This procedure is called a normalization step.

Swapping

The LLL algorithm iterates normalization and swapping steps. More precisely, it normalizes the basis and then searches for two consecutive basis elements which should be swapped. This is continued until a certain condition holds.

Algorithm 3 LLL algorithm

Repeat the following two steps as long as there exists a j, 1 ≤ j ≤ n−1, with

‖b*_{j+1} + µ_{j+1,j} b*_j‖^2 < 3/4·‖b*_j‖^2 :   (14.15)

Normalize B
Swap b_j and b_{j+1}

Notice that the condition (14.15) is invariant under normalization, since B* is left untouched by the normalization procedure. Let us shed some light on the swapping operation and on the condition (14.15) that has to hold when swapping is applied. The vector b*_{j+1} + µ_{j+1,j} b*_j is the new j-th vector of B* after the swap because

b*_{j+1} = b_{j+1} − ∑_{i=1}^{j} µ_{j+1,i} b*_i.   (14.16)

Thus the vector b*_{j+1} + µ_{j+1,j} b*_j is the projection of b_{j+1} onto the orthogonal complement of b_1, . . . , b_{j−1}.

The condition (14.15) ensures that the norm of this new j-th column has decreased by a factor of at least 3/4. Since the vectors b*_µ for µ ≠ j, j+1 remain the same (see Exercise 14.5.2), the only side effect is an increase of the norm of the (j+1)-st column of B*. The rest of the GSO remains unchanged.


More precisely, if the norm of the j-th column decreases by a factor of α then the (j+1)-st column increases by a factor of 1/α, since the product of the norms of the columns of B* is equal to |det(B)|, which is left invariant by a swap of columns in B.
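For concreteness, here is a compact Python sketch of the algorithm just described (my own illustration with the classical parameter 3/4; it recomputes the GSO from scratch in every round, which is wasteful but keeps the code close to the text, and it uses exact rational arithmetic):

from fractions import Fraction

def dot(u, v):
    return sum(Fraction(x) * Fraction(y) for x, y in zip(u, v))

def gso(B):
    """Gram-Schmidt orthogonalization of the integer basis B (list of vectors)."""
    n = len(B)
    Bstar = [[Fraction(x) for x in b] for b in B]
    mu = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        for i in range(j):
            mu[j][i] = dot(B[j], Bstar[i]) / dot(Bstar[i], Bstar[i])
            Bstar[j] = [x - mu[j][i] * y for x, y in zip(Bstar[j], Bstar[i])]
    return Bstar, mu

def lll(B, delta=Fraction(3, 4)):
    """LLL reduction of the basis B (list of integer vectors)."""
    B = [list(b) for b in B]
    n = len(B)
    while True:
        Bstar, mu = gso(B)
        for j in range(n):                      # normalization (size reduction)
            for i in reversed(range(j)):
                q = round(mu[j][i])
                if q != 0:
                    B[j] = [x - q * y for x, y in zip(B[j], B[i])]
                    Bstar, mu = gso(B)
        swapped = False
        for j in range(n - 1):                  # look for a pair violating (14.15)
            lhs = dot(Bstar[j + 1], Bstar[j + 1]) + mu[j + 1][j] ** 2 * dot(Bstar[j], Bstar[j])
            if lhs < delta * dot(Bstar[j], Bstar[j]):
                B[j], B[j + 1] = B[j + 1], B[j]
                swapped = True
                break
        if not swapped:
            return B

# lll([[2, 0], [3, 2]]) returns the reduced basis [[2, 0], [-1, 2]].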

Analysis

The above observation allows us now to show that the algorithm runs in polynomial time. The potential of a lattice basis B is defined as

φ(B) = ‖b*_1‖^{2n}·‖b*_2‖^{2(n−1)}·‖b*_3‖^{2(n−2)} · · · ‖b*_n‖^2.   (14.17)

A normalization step does not change the potential, since B* remains unchanged. How does a swap affect the potential? Let B′ be the basis after a swap operation, where the j-th and (j+1)-st column are swapped. We have (see Exercise 2)

φ(B)/φ(B′) = ( ‖b*_j‖^{2(n−j+1)}·‖b*_{j+1}‖^{2(n−j)} ) / ( ‖b′*_j‖^{2(n−j+1)}·‖b′*_{j+1}‖^{2(n−j)} )
           = ( ‖b*_j‖·‖b*_{j+1}‖ )^{2(n−j)} / ( ‖b′*_j‖·‖b′*_{j+1}‖ )^{2(n−j)} · ‖b*_j‖^2/‖b′*_j‖^2
           = ‖b*_j‖^2/‖b′*_j‖^2
           ≥ 4/3.   (14.18)

This shows that the potential drops at every iteration by a factor of at least 3/4. Next we show that the potential of a lattice basis B ∈ Z^{n×n} is an integer.

Let B_i be the matrix consisting of the first i columns of B. Then we have

det(B_i^T·B_i) = ‖b*_1‖^2 · · · ‖b*_i‖^2 ∈ N.   (14.19)

Consequently we have

φ(B) = ∏_{i=1}^{n} det(B_i^T·B_i) ∈ N.   (14.20)

This shows that the potential is bounded from below by 1 and that the algorithm terminates in O(log φ(B)) steps. Clearly φ(B) ≤ det(B)^{2·n} ≤ (√n·M)^{2·n^2}, where M is an upper bound on the absolute value of an entry in B.

We thus have the following theorem.

Theorem 14.14. The LLL algorithm terminates in O(n^2(log n + s)) iterations, where s is the largest binary encoding length of a coefficient of B ∈ Z^{n×n}.

In order to conclude that the LLL algorithm runs in polynomial time, we also have to bound the binary encoding length of the numbers which occur in the intermediate steps of the algorithm. This is very important but a bit tedious and we do not do it here; we refer, for example, to [55]. We conclude with the following theorem.

Theorem 14.15. The LLL algorithm runs in polynomial time in the input encoding length of the initial lattice basis.


Orthogonality defect and approximation of the shortest vector

The next theorem bounds the length of a shortest vector from below by means of the GSO. Here, SV(Λ) denotes the length of a shortest nonzero vector of Λ.

Theorem 14.16. Let B be a lattice basis and let B* = (b*_1, . . . , b*_n) be its Gram-Schmidt orthogonalization. Then SV(Λ(B)) ≥ min_{i=1,...,n} ‖b*_i‖.

Proof. Let 0 ≠ v ∈ Λ, then v = Bx, and let k ≤ n be the largest index with x_k ≠ 0. With B = B*·R we have

v = B*·R·x = x_k b*_k + ∑_{i=1}^{k−1} λ_i b*_i, for some λ_i ∈ R.

This implies ‖v‖ ≥ |x_k|·‖b*_k‖ and the theorem follows. ⊓⊔

With this lower bound, we can show that the first vector of an LLL-reduced basis is an approximation of the shortest vector which is exponential in the dimension only.

Upon termination of the LLL algorithm we have that each µ^2_{j+1,j} ≤ 1/4. Since also ‖b*_{j+1} + µ_{j+1,j} b*_j‖^2 ≥ 3/4·‖b*_j‖^2 and since ‖b*_{j+1} + µ_{j+1,j} b*_j‖^2 = ‖b*_{j+1}‖^2 + µ^2_{j+1,j}·‖b*_j‖^2, we have

‖b*_{j+1}‖^2 ≥ 1/2·‖b*_j‖^2   (14.21)

for each j = 1, . . . , n−1. Thus it follows by induction that

‖b*_j‖^2 ≥ 2^{i−j}·‖b*_i‖^2, for 1 ≤ i < j ≤ n.   (14.22)

From this we can conclude that

‖b_1‖^2 = ‖b*_1‖^2 ≤ 2^{n−1}·SV(Λ(B))^2.   (14.23)

Also, since

b_j = b*_j + ∑_{i=1}^{j−1} µ_{ji} b*_i   (14.24)

and since each µ_{ji} has absolute value at most 1/2, we have

‖b_j‖^2 ≤ ‖b*_j‖^2 + 1/4·∑_{i=1}^{j−1} ‖b*_i‖^2   (14.25)

and by applying (14.22) we obtain

‖b_j‖^2 ≤ ‖b*_j‖^2·(1 + 1/4·∑_{i=1}^{j−1} 2^{j−i}) ≤ 2^{j−1}·‖b*_j‖^2.   (14.26)

This implies the following bound for the orthogonality defect:

‖b_1‖ · · · ‖b_n‖ ≤ 2^{n(n−1)/4}·‖b*_1‖ · · · ‖b*_n‖ = 2^{n(n−1)/4}·|det(B)|.   (14.27)

Together with Theorem 14.13 we thus have the next theorem.

Theorem 14.17. A shortest vector of an integral lattice can be computed in time 2^{O(n^3)} times a polynomial in the length of the input encoding.

Notes

The LLL algorithm requires O(n^5 log B) arithmetic operations on rational numbers of size O(n + log B). Here B is an upper bound on the norm of the basis vectors. Schnorr [72] presented a floating-point variant of the LLL algorithm which requires O(n^4 log B) arithmetic operations on rationals of size O(n + log B). Improvements on the algorithm itself and its analysis were given by Kaltofen [40] and Storjohann [79] among others. Using naive arithmetic, the analysis of the LLL algorithm presented here amounts to a bit complexity of O(n^4 log B (n + log B)^2). Recently, Nguyen and Stehlé [63] have presented a floating-point variant of the LLL algorithm which requires O(n^5(n + log B) log B) bit operations. In fixed dimension, this matches the bit complexity of the Euclidean algorithm. The greatest common divisor of two s-bit integers can be computed with O(M(s) log s) bit operations [73], where M(s) is the number of bit operations required for the multiplication of two s-bit numbers. For a long time, the fastest method for integer multiplication was the one by Schönhage and Strassen [74], requiring O(s log s log log s) bit operations. Martin Fürer [26] improved this complexity recently to O(s log s·2^{O(log* s)}). It is an interesting open problem whether a shortest vector in fixed dimension can be computed with O(M(s) log s) bit operations. Eisenbrand and Rote [20] showed that one can compute a shortest vector in fixed dimension n using O(M(s) log^{n−1} s) bit operations. Recently Gama and Nguyen [27] proved that, using a shortest vector oracle in dimension k, one can compute a ((1+ε)γ_k)^{(n−k)/(k−1)} approximation of the shortest vector, where γ_k is the so-called Hermite constant.

Exercises

1. Prove that ‖b_i‖ ≥ ‖b*_i‖ holds for each i = 1, . . . , n for the GSO of B.
2. Let

B = (b_1, . . . , b_{i−1}, b_i, b_{i+1}, b_{i+2}, . . . , b_n)

and

C = (b_1, . . . , b_{i−1}, b_{i+1}, b_i, b_{i+2}, . . . , b_n)

be two lattice bases. Notice that C originates from B via swapping the i-th and (i+1)-st column. Prove that B* and C* only differ in the i-th and (i+1)-st column. Show further that ‖b*_i‖·‖b*_{i+1}‖ = ‖c*_i‖·‖c*_{i+1}‖ holds.


3. Let B ∈ R^{n×n} be a matrix. Prove that |det(B)| ≤ (√n·M)^n, where M is an upper bound on the absolute values of the entries in B. Hint: Hadamard bound!
4. Estimate the total number of arithmetic operations which are performed by the LLL algorithm in terms of φ(B).
5. Let α be a fixed constant and let Λ = Λ(A), A ∈ Q^{n×n}, be a rational lattice in fixed dimension n.

a) Prove that one can enumerate all vectors v ∈ Λ(A) with ‖v‖ ≤ α·SV(Λ) in polynomial time.
b) Let ‖·‖ be any fixed norm. Show that a shortest vector of Λ w.r.t. ‖·‖ can be computed in polynomial time, if ‖v‖ can be evaluated in polynomial time in the binary input encoding of v for any v ∈ Λ.

14.6 Kannan’s shortest vector algorithm

As we have mentioned above, one can compute a shortest vector of a lattice that is represented by an LLL-reduced basis b_1, . . . , b_n in 2^{O(n^3)} steps via enumerating the candidates ∑_{j=1}^{n} λ_j b_j, where |λ_j| ≤ 2^{n(n−1)/4}, and choosing the shortest nonzero vector from this set.

Kannan [42, 43] provided an algorithm for the shortest vector problem whose dependence on the dimension is 2^{O(n log n)}. Helfrich [36] improved Kannan's algorithm. Recently, Ajtai, Kumar and Sivakumar [3] presented a randomized algorithm for the shortest vector problem with an expected dependence of 2^{O(n)} on the dimension, which is the subject of the next section. In this section, we describe Kannan's algorithm.

Korkine-Zolotareff reduction

A lattice basis b_1, . . . , b_n is Korkine-Zolotareff reduced, or K-Z reduced for short, if the following conditions hold.

i) The vector b_1 is a shortest vector of the lattice generated by b_1, . . . , b_n.
ii) The numbers µ_{jk} in the Gram-Schmidt orthogonalization of b_1, . . . , b_n satisfy |µ_{jk}| ≤ 1/2.
iii) If b′_2, . . . , b′_n denote the projections of b_2, . . . , b_n onto the orthogonal complement of the space generated by b_1, then b′_2, . . . , b′_n is Korkine-Zolotareff reduced.

A two-dimensional lattice basis that is K-Z reduced is also called Gauß reduced, see [29]. The algorithm of Kannan computes a Korkine-Zolotareff reduced basis in dimension n by first computing a partially Korkine-Zolotareff reduced lattice basis, from which a shortest vector is among 2^{O(n log n)} candidates. The basis is partially Korkine-Zolotareff reduced with the help of an algorithm for Korkine-Zolotareff reduction in dimension n−1.


With a shortest vector at hand, one can then compute a fully K-Z reduced basis by K-Z reducing the projection along the orthogonal complement of this shortest vector. A lattice basis b_1, . . . , b_n is partially Korkine-Zolotareff reduced, or partially K-Z reduced for short, if it satisfies the following properties.

1. If b′_2, . . . , b′_n denotes the projection of b_2, . . . , b_n onto the orthogonal complement of the space generated by b_1, then b′_2, . . . , b′_n is Korkine-Zolotareff reduced.
2. The numbers µ_{jk} in the Gram-Schmidt orthogonalization of b_1, . . . , b_n satisfy |µ_{jk}| ≤ 1/2.
3. ‖b′_2‖ ≥ 1/2·‖b_1‖.

Notice that, once Conditions 1 and 3 hold, Condition 2 can be satisfied via a normalization step. Normalization does not destroy Conditions 1 and 3. Condition 1 can be satisfied by applying Kannan's algorithm for full K-Z reduction to b′_2, . . . , b′_n, and applying the transformation to the original vectors b_2, . . . , b_n. If Condition 3 is then not satisfied, Helfrich [36] has proposed to replace b_1 and b_2 with the Gauß reduction of this pair, or equivalently its K-Z reduction. Clearly, if b_1, b_2 is Gauß reduced, which means that ‖b_1‖ ≤ ‖b_2‖ and the angle enclosed by b_1 and b_2 is at least 60° and at most 120°, then Condition 3 holds.

The following algorithm computes a partially K-Z reduced basis from a given input basis b_1, . . . , b_n. It uses as a subroutine an algorithm to K-Z reduce the lattice basis b′_2, . . . , b′_n.

Algorithm 4 Partial K-Z reduction

1. Apply the LLL algorithm to b_1, . . . , b_n.
2. K-Z reduce b′_2, . . . , b′_n and apply the corresponding transformation to b_2, . . . , b_n.
3. Perform a normalization step on b_1, . . . , b_n.
4. If ‖b′_2‖ < 1/2·‖b_1‖, then replace b_1, b_2 by its Gauß reduction and go to Step 2.

We show in a moment that we can extract a shortest vector from a partially K-Z reduced basis in 2^{O(n log n)} steps, but before that, we analyze the running time of the algorithm.

Theorem 14.18 ([36]). Step 4 of Algorithm 4 is executed at most log n + 6 times.

Proof. Let v be a shortest vector, let b_1, . . . , b_n be the lattice basis immediately before Step 4 of Algorithm 4, and let b′_2, . . . , b′_n denote the projection of b_2, . . . , b_n onto the orthogonal complement of b_1.
If Step 4 is executed, then v is not equal to b_1. Then clearly the projection of v onto the orthogonal complement of b_1 is nonzero. Since b′_2, . . . , b′_n is K-Z reduced, it follows that ‖v‖ ≥ ‖b′_2‖ holds. Denote the Gauß reduction of b_1, b_2 by b̄_1, b̄_2. The determinant of Λ(b_1, b_2) is equal to ‖b_1‖·‖b′_2‖. After the Gauß reduction in Step 4, we have therefore

‖b̄_1‖ ≤ 2·√(‖b_1‖·‖b′_2‖)   (14.28)
      ≤ 2·√(‖b_1‖·‖v‖).   (14.29)

Dividing this inequality by ‖v‖ gives

‖b̄_1‖/‖v‖ ≤ 2·√(‖b_1‖/‖v‖).

Thus, if b_1^{(i)} denotes the first basis vector after the i-th execution of Step 4, one has

‖b_1^{(i)}‖/‖v‖ ≤ 4·( ‖b_1^{(0)}‖/‖v‖ )^{(1/2)^i}.   (14.30)

Since we start with an LLL-reduced basis, we know that ‖b_1^{(0)}‖/‖v‖ ≤ 2^{(n−1)/2} holds, and consequently that ‖b_1^{(log n)}‖/‖v‖ ≤ 8. Each further Gauß reduction decreases the length of the first basis vector by at least a factor of 3/4. Therefore the number of runs through Step 4 is bounded by log n + 6. ⊓⊔

Extracting a shortest vector

We now argue that with such a partially K-Z reduced basis b_1, . . . , b_n at hand, one only needs to check n^{O(n)} candidates for the shortest vector. Let v = ∑_{j=1}^{n} λ_j b_j be a shortest vector. After rewriting each b_j in terms of the Gram-Schmidt orthogonalization one obtains

v = ∑_{j=1}^{n} ∑_{k=1}^{j} λ_j µ_{jk} b*_k = ∑_{k=1}^{n} ( ∑_{j=k}^{n} λ_j µ_{jk} ) b*_k,

where the µ_{jk} are as in Algorithm 2. The length of v satisfies

‖v‖^2 = ∑_{k=1}^{n} ( ∑_{j=k}^{n} λ_j µ_{jk} )^2 ‖b*_k‖^2.   (14.31)

Consider the coefficient c_n = |λ_n µ_{nn}| = |λ_n| of ‖b*_n‖ in (14.31). We can bound this absolute value by |λ_n| ≤ ‖v‖/‖b*_n‖ ≤ ‖b_1‖/‖b*_n‖. This leaves us 1 + 2·‖b_1‖/‖b*_n‖ possibilities for λ_n. Suppose now that we have picked λ_n, . . . , λ_{j+1} and inspect the coefficient c_j of ‖b*_j‖ in (14.31), which is

c_j = | ∑_{k=j}^{n} λ_k µ_{kj} | = | λ_j + ∑_{k=j+1}^{n} λ_k µ_{kj} |.

Since the inequality c_j ≤ ‖b_1‖/‖b*_j‖ must hold, this leaves only 1 + 2·‖b_1‖/‖b*_j‖ possibilities to pick λ_j. Thus, by choosing the coefficients λ_n, . . . , λ_1 in this order, one has at most ∏_{j=1}^{n} (1 + 2·‖b_1‖/‖b*_j‖) candidates.

Suppose ‖b*_j‖ > ‖b_1‖ for some j. Then b_j can never have a nonzero coefficient λ_j in a shortest vector representation v = ∑_{j=1}^{n} λ_j b_j. Because in that case, v has a nonzero component in its projection onto the orthogonal complement of b_1 R + . . . + b_{j−1} R, and since b′_2, . . . , b′_n is K-Z reduced, this implies that ‖v‖ ≥ ‖b*_j‖ > ‖b_1‖, which is impossible. Thus we can assume that ‖b*_j‖ ≤ ‖b_1‖ holds for all j = 1, . . . , n. Otherwise, b_j can be discarded. Therefore the number of candidates N for the tuples (λ_1, . . . , λ_n) satisfies

N ≤ ∏_{j=1}^{n} (1 + 2·‖b_1‖/‖b*_j‖) ≤ ∏_{j=1}^{n} (3·‖b_1‖/‖b*_j‖) = 3^n·‖b_1‖^n / det(Λ).

Next we give an upper bound for ‖b_1‖. If b_1 is a shortest vector, then Minkowski's theorem (Theorem 14.11) guarantees that ‖b_1‖ ≤ √n·det(Λ)^{1/n} holds. If b_1 is not a shortest vector, then the shortest vector v has a nonzero projection onto the orthogonal complement of b_1 R. Since b′_2, . . . , b′_n is K-Z reduced, this implies that ‖v‖ ≥ ‖b′_2‖ ≥ 1/2·‖b_1‖, since the basis is partially K-Z reduced. In any case we have ‖b_1‖ ≤ 2·√n·det(Λ)^{1/n} and thus N ≤ 6^n·n^{n/2}.

Now it is clear how to compute a K-Z reduced basis and thus a shortest vector. With an algorithm for K-Z reduction in dimension n−1, one uses Algorithm 4 to partially K-Z reduce the basis and then one checks all possible candidates for a shortest vector. Then one performs K-Z reduction on the basis for the projection onto the orthogonal complement of the shortest vector. Kannan [43] has shown that this procedure for K-Z reduction requires n^{O(n)}·ϕ operations, where ϕ is the binary encoding length of the initial basis and where the operands during the execution of the algorithm have at most O(n^2·ϕ) bits.

Theorem 14.19 ([43]). Let B be a lattice basis of binary encoding length ϕ. There exists an algorithm which computes a K-Z reduced basis and requires n^{O(n)}·p(ϕ) arithmetic operations on rationals of size O(n^2·ϕ), where p(·) is a fixed polynomial.


Notes

Kannan [44] also developed an algorithm for the closest vector problem whose running time is 2^{O(n log n)} times a polynomial in the encoding length of the input. Here, one is given a rational vector x ∈ Q^n and a lattice Λ ⊆ Q^n. The task is to compute a lattice point v ∈ Λ which is closest to x, i.e., minimizing ‖v−x‖. Kannan also showed that the integer programming feasibility problem can be solved within this complexity bound. Furthermore he showed that one can compute an approximate solution of the closest vector problem with a polynomial number of queries to an oracle which solves the shortest vector problem of an (n+1)-dimensional lattice. Blömer [9] showed that there exists an algorithm for the closest vector problem which runs in time n! times a polynomial in the input encoding length. This means an exponential improvement over Kannan's algorithm [44] and its subsequent improvement by Helfrich [36]. Hanrot and Stehlé [33] improved this further to n^{0.5·n+o(n)} and showed that the shortest vector problem can be solved in time n^{0.184·n+o(n)} times a polynomial in the input length. Stehlé and Pujol [66] improve the arithmetic complexity of Kannan's algorithm.

Exercises

1. Prove that a 2-dimensional lattice basis is Gauß reduced if and only if it is LLL-reduced.
2. Prove that the shortest vector can be extracted in time 2^{O(n^2)} out of an LLL-reduced basis by adapting the arguments given in this section.

14.7 A randomized simply exponential algorithm for shortest vector

Ajtai, Kumar and Sivakumar [3] described a randomized method which outputs a shortest vector of a lattice with very high probability and has running time 2^{O(n)} times a polynomial in the binary input encoding length of the lattice basis. We follow the description of their algorithm in Oded Regev's excellent lecture notes [68]. At first sight, the algorithm is quite different from the shortest vector algorithms that are based on reduction and candidate trial, such as Kannan's algorithm.

Our task is to compute the shortest vector of a lattice Λ(B), for a nonsingular B ∈ Q^{n×n}. We can assume that 2 ≤ SV(Λ) < 3 holds, see Exercise 14.7.1. The idea is to sample points in R^n and to translate these samples into lattice points. How is this translation done? A point x ∈ R^n can be represented as a linear combination of the basis vectors

x = ∑_{i=1}^{n} λ_i b_i,   (14.32)


for some λ_i ∈ R. If we round all λ_i down, we obtain a lattice point

∑_{i=1}^{n} ⌊λ_i⌋·b_i ∈ Λ(B).   (14.33)

Fig. 14.5 The remainder rem_B(x) of a point x w.r.t. a basis B.

The remainder of x modulo the basis B is the vector

rem_B(x) = ∑_{i=1}^{n} (λ_i − ⌊λ_i⌋)·b_i.

The lattice point in (14.33) is x − rem_B(x) and we associate it to x. Clearly one has

rem_B(x) = rem_B(x+v)   (14.34)

for each v ∈ Λ(B).
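A small Python sketch of this translation step (my own illustration; solve and rem_B are hypothetical helper names). It solves B·λ = x by Gaussian elimination and keeps only the fractional parts of λ:

from math import floor

def solve(B, x):
    """Solve B*lam = x, where B is a list of column vectors of a nonsingular matrix."""
    n = len(x)
    M = [[B[j][i] for j in range(n)] + [x[i]] for i in range(n)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(n):
            if r != i:
                f = M[r][i] / M[i][i]
                M[r] = [a - f * b for a, b in zip(M[r], M[i])]
    return [M[i][n] / M[i][i] for i in range(n)]

def rem_B(B, x):
    """Remainder of x modulo the basis B; x - rem_B(B, x) is the associated lattice point."""
    lam = solve(B, x)
    frac = [t - floor(t) for t in lam]
    return [sum(frac[i] * B[i][k] for i in range(len(B))) for k in range(len(x))]

# With B = [[2, 0], [3, 2]] (columns (2,0) and (3,2)) and x = [4.5, 1.0]:
# lam = (1.5, 0.5), rem_B(B, x) == [2.5, 1.0], and x - rem_B(x) = (2, 0) is the lattice point.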

Sieving

What we discussed above suggests the following approach. We sample a number of points x_1, . . . , x_k uniformly at random from B_2(0), the ball of radius 2 around 0, and compute their remainders y_i = rem_B(x_i). The norms of these remainders are bounded by R = ∑_{i=1}^{n} ‖b_i‖.
If we sample enough points (and the next lemma provides a bound on k), then there will be two such sampled points x_i, x_j such that their remainders y_i and y_j have distance at most R/2. The length of the lattice vector x_i − y_i − (x_j − y_j) is then bounded by R/2 + 4.

Lemma 14.3 (Sieving Lemma). Let y_1, . . . , y_k ∈ R^n be points contained in the ball B_R(0). There is a polynomial algorithm that computes a mapping τ : {1, . . . , k} → {1, . . . , k} with the following properties:

Page 29: Integer Programming and Algorithmic Geometry of Numbers - EPFL

14 Integer Programming and Algorithmic Geometry of Numbers 27

i) The cardinality of the image ofτ is bounded by5n.ii) For each i= 1, . . . ,k one has‖yi−yτ(i)‖6 R/2.

In other words, we can identify6 5n centersamong they1, . . . ,yk such that the ballsof radiusR/2 around these centers cover all of the remainingyi . The mappingτassociates theyi to a center of a ball that containsyi .

Proof (of Lemma 14.3). We describe the simple procedure. In the beginning, all points are colored black. We iterate the following steps until there are no black points left: Choose any black point y_i and define τ(j) = i for each j such that y_j is black and ‖y_i − y_j‖ ≤ R/2 holds. Color all points at distance at most R/2 from y_i red.

Call the points y_i with τ(i) = i centers. The number of centers is the size of the image of τ. Two centers have distance at least R/2 from each other. This means that the balls of radius R/4 around the centers do not intersect. On the other hand, these balls all fit into the ball of radius R + R/4. This implies that the number of centers is bounded by

vol(B_{5R/4}(0)) / vol(B_{R/4}(0)) = 5^n,

and the assertion follows. ⊓⊔
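The procedure in this proof is short enough to transcribe directly; the following sketch (my own illustration, assuming the points are given as numpy vectors) returns the mapping τ as a list of indices.

```python
import numpy as np

def sieve(points, R):
    """Sieving procedure from the proof of Lemma 14.3: repeatedly pick a still-uncovered
    ("black") point as a center and map every point within distance R/2 of it to that
    center.  A direct transcription of the proof, quadratic in the number of points."""
    k = len(points)
    tau = [None] * k
    black = list(range(k))
    while black:
        c = black[0]                                   # choose any black point as a center
        still_black = []
        for j in black:
            if np.linalg.norm(points[j] - points[c]) <= R / 2:
                tau[j] = c                             # j is covered by the ball around y_c
            else:
                still_black.append(j)
        black = still_black
    return tau
```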

The randomized algorithm for shortest vector

The sieving lemma provides a procedure to generate lattice vectors whose length is roughly R/2. This idea is now iterated in the random sampling algorithm for shortest vector.

Algorithm 5 Randomized algorithm for shortest vector

1. R_0 ← ∑_{i=1}^{n} ‖b_i‖
   Choose N = 2^{8n} log R_0 points x_1, ..., x_N uniformly at random from B_2(0)
   Initialize a set of tuples L = {(x_i, y_i) : y_i = rem_B(x_i), i = 1, ..., N}
   R ← R_0

2. While R > 6 do
   Apply the sieving algorithm to the vectors y_i for each (x_i, y_i) ∈ L
   Remove from L all tuples (x_i, y_i), where y_i is a center of the sieving procedure
   Replace each of the remaining (x_j, y_j) with (x_j, y_j − (y_{τ(j)} − x_{τ(j)}))
   R ← R/2 + 2

3. For any two pairs (x_i, y_i), (x_j, y_j) in L compute the difference x_i − y_i − (x_j − y_j) and output the shortest nonzero vector among these differences
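To make the control flow of Algorithm 5 concrete, here is a toy sketch (my own illustration, reusing rem_B and sieve from the sketches above). It uses far fewer samples than the 2^{8n} log R_0 required by the analysis, so it only illustrates the steps on tiny instances and assumes 2 ≤ SV(Λ(B)) < 3.

```python
import numpy as np

def random_point_in_ball(n, radius=2.0):
    """Uniform sample from the ball of the given radius around 0 (standard trick)."""
    u = np.random.normal(size=n)
    u /= np.linalg.norm(u)
    return radius * np.random.random() ** (1.0 / n) * u

def aks_sketch(B, num_samples=2000):
    """Toy sketch of Algorithm 5; needs rem_B and sieve defined above."""
    n = B.shape[0]
    R = sum(np.linalg.norm(B[:, i]) for i in range(n))
    L = [(x, rem_B(B, x)) for x in (random_point_in_ball(n) for _ in range(num_samples))]
    while R > 6:
        tau = sieve([y for (_, y) in L], R)
        centers = {i for i in range(len(L)) if tau[i] == i}
        new_L = []
        for j, (xj, yj) in enumerate(L):
            if j in centers:
                continue                          # tuples whose y was a center are removed
            xi, yi = L[tau[j]]
            new_L.append((xj, yj - (yi - xi)))
        L = new_L
        R = R / 2 + 2
    diffs = [(xi - yi) - (xj - yj)
             for a, (xi, yi) in enumerate(L)
             for b, (xj, yj) in enumerate(L) if a < b]
    nonzero = [d for d in diffs if np.linalg.norm(d) > 1e-9]
    return min(nonzero, key=np.linalg.norm) if nonzero else None
```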

Lemma 14.4. At the beginning of each iteration of the while loop one has for each (x_i, y_i) ∈ L

i) x_i − y_i ∈ Λ(B)
ii) ‖y_i‖ ≤ R.


Proof. At the beginning of the first iteration, these invariants hold, as we discussed above.

We now show that these conditions hold after the instructions of the while-loop have been executed, if they were true at the beginning of that iteration. We only need to consider a tuple (x_j, y_j), where y_j is not a center in the sieving procedure. Let y_i be the center of y_j, i.e., τ(j) = i. Since x_j − y_j ∈ Λ(B) and x_i − y_i ∈ Λ(B) we conclude that x_j − y_j − (x_i − y_i) ∈ Λ(B), which implies condition i).

Since y_i is the center of y_j we have ‖y_j − y_i‖ ≤ R/2. From this we conclude

‖y_j − (y_i − x_i)‖ ≤ ‖y_j − y_i‖ + ‖x_i‖ ≤ R/2 + 2.

⊓⊔

Lemma 14.4 and Exercise 14.7.4 imply that the algorithm runs in time 2^{O(n)} times a polynomial in the input encoding of the lattice basis B. Furthermore, at the end of the algorithm, there are at least

2^{8n} log R_0 − 5^n log R_0 ≥ 2^{7n} log R_0   (14.35)

tuples (x_i, y_i) in L. The lattice vectors x_i − y_i are short, i.e., ‖x_i − y_i‖ ≤ 2 + 6 = 8. But how can we be sure that they are not all zero? Exercise 3 shows that, even when L is initialized, roughly half of these lattice vectors could be zero.

It turns out that the following observation, which follows from (14.34), is crucial.

Consider a tuple (x_i, y_i) in L before Step 3 of the algorithm. The algorithm would behave just the same until this point if x_i was replaced by x_i + v_i after the initialization (Step 1), where v_i ∈ Λ(B) is an arbitrary lattice vector.

When do we have to really know a point x_i and not just y_i = rem_B(x_i)? The value of x_i is needed only when y_i was a center of the sieving procedure and tuples (x_j, y_j) are replaced by (x_j, y_j − (y_i − x_i)). Now we shed some light on why we sample the points x_1, ..., x_N from B_2(0) and not from a ball of some other radius. This is connected to the fact that 2 ≤ SV(Λ(B)) < 3 and this crucial observation from above.

Let v ∈ Λ(B) be a shortest vector. Consider the set C_1 = B_2(0) ∩ B_2(v) and the set C_2 = B_2(0) ∩ B_2(−v).

Fig. 14.6 The sets C_1 and C_2.


The mapping

f(x) = x − v, if x ∈ C_1;  f(x) = x + v, if x ∈ C_2;  f(x) = x, if x ∈ B_2(0) \ (C_1 ∪ C_2)   (14.36)

is a bijection of B_2(0) which maps C_1 to C_2 and vice-versa. A point x ∈ B_2(0) which is chosen uniformly at random remains uniformly distributed after we toss a fair coin and map x to f(x) in the case that the coin shows heads.

This shows, together with the observation above, that we could, just before we enter Step 3 of the algorithm, toss a fair coin for each x_i with (x_i, y_i) ∈ L and replace x_i by f(x_i) if the coin shows heads.

The next theorem implies that a shortest vector of Λ(B) is returned with high probability.

Theorem 14.20. At the end of Step 2 there exists with high probability a subset L′ ⊆ L of size 2^n which satisfies the following conditions

a) x_i − y_i = x_j − y_j for each (x_i, y_i), (x_j, y_j) ∈ L′.
b) x_i ∈ C_1 ∪ C_2 for each (x_i, y_i) ∈ L′.

Before we prove the theorem we derive the main result of this section.

Theorem 14.21. The random sampling algorithm outputs a shortest vector with high probability.

Proof. Consider the set L′ from Theorem 14.20. For each (x_i, y_i) ∈ L′, we toss a coin and replace x_i with f(x_i) if the coin shows heads. If we consider a difference

x_i − y_i − (x_j − y_j), where (x_i, y_i) ≠ (x_j, y_j) ∈ L′,

then the probability that this difference becomes ±v is exactly 1/2. (One coin shows heads and the other shows tails.) Since the cardinality of L′ is 2^n with high probability, the algorithm will output a shortest vector of Λ(B). ⊓⊔

We now prove Theorem 14.20. Recall that the set L contains at least 2^{7n} tuples at the end of Step 2. The next lemma shows that the expected number of these tuples whose first component belongs to C_1 ∪ C_2 is at least 2^{5n+1}.

Lemma 14.5. Consider the sets C_1 and C_2 described above. One has

vol(C_1)/vol(B_2(0)) = vol(C_2)/vol(B_2(0)) ≥ 2^{−2n}.   (14.37)

Lemma 14.5 shows that the expected number of tuples (x_i, y_i) with x_i ∈ C_1 ∪ C_2 is at least 2^{6n+1} log R_0. The probability that this number of tuples is smaller than 2^{6n} log R_0 at the end of Step 1 is exponentially small. Since we delete in each iteration 5^n tuples and perform at most log R_0 many iterations, L contains at the end of Step 2 at least 2^{5n} tuples (x_i, y_i) with x_i ∈ C_1 ∪ C_2 with high probability. Notice that at this point, a difference x_i − y_i is a vector whose length is bounded by 8.


Lemma 14.6. |Λ(B) ∩ B_8(0)| ≤ 2^{4n}.   (14.38)

Lemma 14.6 and the previous discussion show that there exists a lattice vector w ∈ Λ(B) such that x_i − y_i = w and x_i ∈ C_1 ∪ C_2 for at least 2^n of these tuples. This shows Theorem 14.20.

Notes

Ajtai et al. [4] also showed that there is a randomized algorithm to compute K-Z reduced bases in time 2^{O(n)}. Together with the block-reduction algorithm of Schnorr [72], this implies the existence of a randomized polynomial time approximation algorithm for shortest vector, with an approximation factor of 2^{n log log n / log n}. Schnorr [72] previously showed that one can approximate the shortest vector within a factor of 2^{n log log^2 n / log n}.

We did not mention how the sampling of points in B_2(0) is done. A more general procedure for this task (sampling in convex bodies) was presented by Dyer, Frieze and Kannan [16]. Here the authors show how to sample points whose statistical distance to the uniform distribution is exponentially small. This serves the purpose of Algorithm 5, see also exercise 7.

Blömer and Naewe [10] modified the sampling procedure described above to compute shortest vectors which are outside a given subspace. Under some length conditions on the target vector, they achieve a simply exponential algorithm for closest vector.

The shortest vector problem is also very interesting from the viewpoint of computational complexity. Van Emde Boas [22] proved that the shortest vector problem with respect to the ℓ∞ norm is NP-hard, and he conjectured that it is NP-hard with respect to the Euclidean norm. In the same paper he proved that the closest vector problem is NP-hard for any ℓp norm. Ajtai [2] proved that the shortest vector problem with respect to the ℓ2-norm is NP-hard for randomized problem reductions. This means that the reduction makes use of results of a probabilistic algorithm. Ajtai also showed that approximating the length of a shortest vector in a given lattice within a factor 1 + 1/2^{n^c} is NP-hard for some constant c. The non-approximability factor was improved to (1 + 1/n^ε) by Cai and Nerurkar [12]. Micciancio [60] improved this factor substantially by showing that it is NP-hard to approximate the shortest vector in a given lattice within any constant factor less than √2 for randomized problem reductions, and that the same result holds for deterministic problem reductions (the "normal" type of reductions used in an NP-hardness proof) under the condition that a certain number theoretic conjecture holds. Goldreich and Goldwasser [30] proved that it is not NP-hard to approximate the shortest vector, or the closest vector, within a factor √n unless the polynomial-time hierarchy collapses. Dinur [15] showed that it is NP-hard to approximate shortest vectors w.r.t. ℓ∞ within almost polynomial factors. Khot [50] showed that the shortest vector problem w.r.t. the ℓp-norm is hard to approximate within any constant if p > 1. Using norm-embeddings and a reduction to the ℓ2 norm, Regev and Rosen [69] showed that this also holds for p = 1 and p = ∞. The currently strongest inapproximability result for shortest vector is due to Haviv and Regev [34].

Exercises

1. Justify the assumption 2 ≤ SV(Λ) ≤ 3. Hint: Suppose that B ∈ Z^{n×n} and consider the lattices Λ((3/2)^i / ‖b_1‖ · B).

2. Show that ‖rem_B(x)‖ is bounded by ∑_{i=1}^{n} ‖b_i‖.

3. Suppose that x is chosen uniformly at random from B_2(0). Show that the probability that

x − rem_B(x) = 0   (14.39)

holds is at most 1/2. For each fixed n, provide a family of lattice bases B_{nk} with 2 ≤ SV(Λ(B_{nk})) < 3 such that the probability for the event (14.39) tends to 1/2 for k → ∞.

4. Prove that the number of iterations through the while loop of Algorithm 5 is bounded by log R_0 − 1.

5. Prove Lemma 14.5.

6. Prove Lemma 14.6. Hint: Packing argument!

7. How many bits does one have to know of each sampled point x_i in Algorithm 5? Show that the number of bits which are necessary to represent the numbers in the intermediate steps of Algorithm 5 is polynomial in the input encoding of the lattice basis.

8*. A prominent research question is whether there exists a deterministic algorithm for shortest vector with running time 2^{O(n)} times a polynomial in the input encoding.

9*. Is there a (randomized) 2^{O(n)}-algorithm which computes a shortest nonnegative lattice vector, i.e., a v ∈ Λ − {0} with v ≥ 0 and ‖v‖ minimal?

14.8 Integer programming in fixed dimension

In this section, we will describe Lenstra's algorithm [56] which solves the following integer feasibility problem.

Given A ∈ Z^{m×n} and b ∈ Z^m, decide whether there exists an integer point that satisfies the inequality system Ax ≤ b.

The algorithm runs in polynomial time if the dimension n is fixed. The integer feasibility problem is in NP [11]. In other words, if there exists an integer solution of Ax ≤ b, then there exists one whose size is polynomial in the size of the largest absolute value of a coefficient of the system Ax ≤ b and in n, see, e.g., [76].

We start with the flatness theorem, which is the key concept of any polynomial time algorithm for integer programming in fixed dimension. The flatness theorem shows that the feasibility problem can be decided with a variant of branch-and-bound and in particular that the feasibility problem can be solved in polynomial time, if the dimension n is fixed.

Let K ⊆ R^n be a nonempty closed subset of R^n and let d ∈ R^n be a vector. The width of K along d is the number

w_d(K) = max{d^T x : x ∈ K} − min{d^T x : x ∈ K}.

If the maximum or minimum does not exist, then w_d(K) is defined to be infinity. Suppose now that K ⊆ R^n is a convex body which does not contain an integer point. The flatness theorem, attributed to Khinchin 1948, ensures that there exists a nonzero integer vector d ∈ Z^n such that w_d(K) is bounded by a constant. The width of K is defined as

w(K) = min_{d ∈ Z^n \ {0}} w_d(K).

A d ∈ Z^n \ {0} which minimizes w_d(K) is called a flat direction of K.

Theorem 14.22 (Khinchine's flatness theorem [49]). Let K ⊆ R^n be a convex body. Either K contains an integer point, or w(K) ≤ ω(n), where ω(n) is a constant depending on the dimension only.

Fig. 14.7 Branching on a flat direction. The vector d ≠ 0 is integral and the grey lines represent the lines d^T x = δ where δ ∈ Z and max_{x∈P} d^T x ≥ δ ≥ min_{x∈P} d^T x. The feasibility of P can be checked by checking the feasibility of the five 1-dimensional polytopes which are obtained from the intersection of P with the grey lines.

If w_d(P) > ω(n) for a flat direction d of P, then the flatness theorem guarantees the existence of an integer point in P. If, on the other hand, w_d(P) ≤ ω(n), then an integer point x* ∈ P ∩ Z^n must lie on one of the constant number of hyperplanes

d^T x = δ, where δ ∈ Z, and min{d^T x : x ∈ P} ≤ δ ≤ max{d^T x : x ∈ P}.
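As a small illustration of this branching step (not the chapter's code), the following sketch lists the candidate values δ for a given polytope and a given integer direction d; it assumes scipy's LP solver is available.

```python
import numpy as np
from math import ceil, floor
from scipy.optimize import linprog

def branching_values(A, b, d):
    """The integers delta with min_{x in P} d^T x <= delta <= max_{x in P} d^T x, i.e. the
    hyperplanes d^T x = delta on which any integer point of P must lie once w_d(P) <= omega(n).
    P = {x : Ax <= b} is assumed to be a nonempty polytope."""
    lo = linprog(d, A_ub=A, b_ub=b, bounds=(None, None))    # min  d^T x
    hi = linprog(-d, A_ub=A, b_ub=b, bounds=(None, None))   # min -d^T x, i.e. -max d^T x
    assert lo.success and hi.success
    return list(range(ceil(lo.fun), floor(-hi.fun) + 1))

# Example: the triangle {x >= 0, y >= 0, x + y <= 2.5} with direction d = (1, 0)
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 2.5])
print(branching_values(A, b, np.array([1.0, 0.0])))          # [0, 1, 2]
```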

Later on in this section, we prove the flatness theorem. A reader who has understood how the flatness theorem is proved can also solve exercises 6 and 7. These exercises ask you to prove that the width and a flat direction of a rational polyhedron in fixed dimension can be computed in polynomial time.

Algorithm 6 Lenstra's algorithm

Input: A ∈ Z^{m×n} and b ∈ Z^m with P = {x ∈ R^n : Ax ≤ b} full-dimensional
Return: "Yes" if P is integer feasible and "No" otherwise.

Compute w(P) and a flat direction d of P
if w(P) > ω(n) then
   return "Yes"
end if
for δ = ⌈min_{x∈P} d^T x⌉, ..., ⌊max_{x∈P} d^T x⌋ do
   Compute C ∈ Z^{m×(n−1)} and f ∈ Z^m such that Cx ≤ f is integer feasible if and only if Ax ≤ b, d^T x = δ is integer feasible (see exercise 2)
   Recursively solve the integer feasibility problem for Cx ≤ f
end for

Exercises 2, 6 and 7 show that Lenstra's algorithm runs in polynomial time in the input encoding length.

Theorem 14.23. Lenstra's algorithm runs in polynomial time in the input encoding length if the dimension is fixed.

Proving the flatness theorem

Let us first discuss how to compute a flat direction of an ellipsoid. An ellipsoid is the image f(B_n) of the n-dimensional unit ball B_n = {x ∈ R^n : ‖x‖ ≤ 1} under an affine map f(x) = A^{−1}x + a, where A ∈ R^{n×n} is a non-singular matrix, and a ∈ R^n is a vector. This ellipsoid can also be described as E(A, a) = {x ∈ R^n : ‖A(x − a)‖ ≤ 1}.

Consider now an ellipsoid E(A, a) and a direction d ∈ Z^n \ {0}. We first want to compute the width of E(A, a) along d and then discuss how to find an integral vector d ≠ 0 along which the width is minimal.

Since the width is invariant under translation, we can assume that a = 0 holds. The width of E(A, 0) along d is

max_{x_1, x_2 ∈ E(A,0)} d^T (x_1 − x_2).   (14.40)

We have d^T x_1 − d^T x_2 = d^T A^{−1}(A x_1 − A x_2) and A x_1 and A x_2 are contained in the unit ball if and only if x_1, x_2 ∈ E(A, 0). Consequently (14.40) is simply 2 · ‖d^T A^{−1}‖.
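This formula is a one-liner to evaluate; the following numpy snippet (my own illustration) computes the width of an ellipsoid along a given integer direction and checks it on an axis-parallel ellipse.

```python
import numpy as np

def ellipsoid_width(A, d):
    """Width of E(A, a) = {x : ||A(x - a)|| <= 1} along d, namely 2 * ||d^T A^{-1}||;
    it is independent of the center a."""
    return 2.0 * np.linalg.norm(d @ np.linalg.inv(A))

A = np.array([[1.0, 0.0],
              [0.0, 4.0]])                    # semi-axes 1 and 1/4
print(ellipsoid_width(A, np.array([1, 0])))   # 2.0
print(ellipsoid_width(A, np.array([0, 1])))   # 0.5
```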


Keep in mind that we want to compute a flat direction of E(A, 0) or, in other words, that we are interested in an integral d such that 2 · ‖d^T A^{−1}‖ is as small as possible. This is a shortest vector problem in the lattice Λ((A^{−1})^T).

The dual lattice Λ* of a lattice Λ ⊆ R^n is defined as

Λ* = {x ∈ R^n : x^T y ∈ Z for all y ∈ Λ}.

Exercise 5 shows that Λ*(A) = Λ((A^{−1})^T) holds. Our discussion above implies the following theorem.

Theorem 14.24. A nonzero vector d ∈ Z^n is a flat direction of the ellipsoid E(A, a) if and only if (A^{−1})^T d is a shortest nonzero vector of Λ*(A). A flat direction of an ellipsoid can be computed in polynomial time in fixed dimension. If the dimension is varying, one can compute an integer vector d ∈ Z^n \ {0} in polynomial time such that w_d(E(A, a)) ≤ 2^{n−1} w(E(A, a)) holds.

The covering radius µ(Λ) of a lattice Λ ⊆ R^n is the smallest number α such that the balls of radius α centered at the lattice points cover R^n. Alternatively, the covering radius is the largest distance of a point x ∈ R^n to the lattice Λ.

The packing radius ρ(Λ) of Λ is the largest number β such that the balls of radius β around the lattice points do not properly intersect. Alternatively, the packing radius is SV(Λ)/2.

We are now very close to proving the flatness theorem for ellipsoids. What if E(A, a) does not contain an integer point? Inspecting the definition of E(A, a) = {x ∈ R^n : ‖A(x − a)‖ ≤ 1} we see that this is the case if and only if the distance of Aa to the lattice Λ(A) is larger than 1; in particular µ(Λ(A)) > 1. If we can now infer from this that the packing radius of Λ*(A) is bounded by a constant depending only on the dimension, we are done, since the width of E(A, a) is exactly 2 · SV(Λ*(A)) = 4 · ρ(Λ*(A)). This is established in the next theorem via basis reduction.

Theorem 14.25. Let Λ ⊆ R^n be a lattice and Λ* be its dual. One has

µ(Λ) · ρ(Λ*) ≤ f(n),

where f(n) is a constant depending only on n which satisfies f(n) ≤ (n/4) · 2^{n(n−1)/4}.

Proof. Suppose that B = (b_1, ..., b_n) ∈ Q^{n×n} is a basis of the rational lattice Λ(B) with orthogonality defect γ, i.e., one has

‖b_1‖ ⋯ ‖b_n‖ = γ · |det(B)|.   (14.41)

Assume that the longest basis vector is b_n, in other words that ‖b_n‖ ≥ ‖b_j‖ for j = 1, ..., n−1. This assumption can be made without loss of generality, since the orthogonality defect is invariant under swapping of columns.

Let u ∈ R^n be a point whose distance to Λ is µ(Λ). Since B is a basis of R^n we can write u = ∑_{i=1}^{n} λ_i b_i, with λ_i ∈ R. The vector v = ∑_{i=1}^{n} ⌊λ_i⌉ b_i belongs to the lattice Λ(B), where ⌊λ_i⌉ denotes the closest integer to λ_i. Clearly ‖v − u‖ ≥ µ(Λ).


We also have

‖v − u‖ = ‖∑_{i=1}^{n} (⌊λ_i⌉ − λ_i) b_i‖ ≤ ∑_{i=1}^{n} ‖(⌊λ_i⌉ − λ_i) b_i‖ ≤ (1/2) ∑_{i=1}^{n} ‖b_i‖ ≤ (n/2) ‖b_n‖,   (14.42)

where the last inequality in (14.42) follows from the fact that the last basis vector b_n is the longest one in the basis. Since ‖v − u‖ ≥ µ(Λ) we therefore have

‖b_n‖ ≥ 2µ(Λ)/n.   (14.43)

Let B = B* · R be the GSO of B. The following facts (see exercise 14.5.1)

‖b_1‖ ⋯ ‖b_n‖ = γ · ‖b*_1‖ ⋯ ‖b*_n‖  and  ‖b_j‖ ≥ ‖b*_j‖, j = 1, ..., n,

imply ‖b_n‖ ≤ γ · ‖b*_n‖, which together with (14.43) implies

‖b*_n‖ ≥ 2µ(Λ)/(n · γ).

Now b*_n is orthogonal to the vectors b*_j, j = 1, ..., n−1. Since R is an upper triangular matrix with only 1's on its diagonal, it follows that (b*_n)^T B* · R = (0, ..., 0, ‖b*_n‖^2), from which we can conclude that

d = b*_n / ‖b*_n‖^2 ∈ Λ*(B).

The norm of d is the reciprocal of the norm of b*_n and thus satisfies

‖d‖ ≤ n · γ / (2µ(Λ)).

This bounds the length of a shortest vector of Λ*(B) and consequently one has ρ(Λ*) ≤ n · γ / (4µ(Λ)), implying

ρ(Λ*) µ(Λ) ≤ n · γ / 4.

By using the LLL-bound (14.27) on γ we obtain

ρ(Λ*) µ(Λ) ≤ (n/4) 2^{n(n−1)/4}.   ⊓⊔

Theorem 14.26 (Flatness theorem for ellipsoids). Let E(A, a) ⊆ R^n be an ellipsoid which does not contain an integer point, then w(E(A, a)) ≤ 4 · f(n).

Proof. If E(A, a) does not contain an integer point, then the covering radius of Λ(A) is at least one. Since µ(Λ(A)) · ρ(Λ*(A)) ≤ f(n) it thus follows that ρ(Λ*(A)) ≤ f(n) and consequently that w(E(A, a)) ≤ 4 · f(n). ⊓⊔

The flatness theorem 14.22 for general convex bodies K ⊆ R^n follows from the fact that K can be well approximated by an ellipsoid. John [39] showed that there exists an ellipsoid E containing K such that if E is centrally scaled by 1/n, then it is contained in K. In other words one has

E/n ⊆ K ⊆ E.

Fig. 14.8 A John-pair of ellipsoids.

This implies the flatness theorem with ω(n) ≤ 4 · n · f(n).

Notes

The bound f(n) ≤ (n/4) 2^{n(n−1)/4} of Theorem 14.25 is far from optimal. Lagarias, Lenstra and Schnorr [54] proved that f(n) = O(n^{3/2}). Banaszczyk [5] proved that f(n) = O(n) holds.

Grötschel, Lovász and Schrijver [32] have shown that there exists a polynomial time algorithm which computes an ellipsoid E such that 1/(n(n+1)) · E ⊆ K ⊆ E holds, if K is a convex body which is equipped with a weak separation oracle. Nesterov and Nemirovski [62] describe a polynomial algorithm which computes E such that ((1+ε)n)^{−1} E ⊆ P ⊆ E holds, where P ⊆ R^n is a full-dimensional polytope. Here ε > 0 is a parameter of the algorithm and the algorithm runs in polynomial time in the binary encoding length of P (rational data) and the binary encoding length of ε. A faster algorithm was provided by Khachiyan [48].

Finding a maximum volume ellipsoid inscribed in a polyhedron is a convex programming problem of LP-type. This means that such an ellipsoid can be computed in linear time in the RAM-model of computation if the dimension is fixed, see [59]. In the RAM-model, all numbers have input length one and basic arithmetic operations take constant time. The binary encoding length of the numbers in the input does not contribute to the input length. In fact those could even be real numbers. Gärtner [28] provided a subexponential algorithm for problems of LP-type in varying dimension, and thus also a subexponential algorithm to compute the maximum volume ellipsoid of a polytope. His algorithm is a generalization of the linear programming algorithm of Matoušek, Sharir and Welzl [59].


If the dimension is fixed, then, using the algorithms for the largest enclosed ellipsoid by Matoušek et al. [59], Lenstra's algorithm requires O(m·s) arithmetic operations on rational numbers of size O(s), where s is the largest binary encoding length of a coefficient of Ax ≤ b. If the number of constraints m is fixed, then this matches the running time of the Euclidean algorithm, see, e.g., [51]. If one uses the floating point variant of the LLL-algorithm of Nguyen and Stehlé [64], then the bit-complexity for fixed dimension and number of constraints is O(s^2), which also matches the bit-complexity of the Euclidean algorithm.

Exercises

1. Show how to compute a feasible point x ∈ Z^n of Ax ≤ b if it exists, by using a polynomial number of calls to an oracle solving an integer feasibility problem on an input which is of polynomial size in size(A), size(b) and n.

2. Consider a system of inequalities

Ax ≤ b, d^T x = β,   (14.44)

where A ∈ Z^{m×n}, b ∈ Z^m, d ∈ Z^n and β ∈ Z. Show how to reduce the feasibility problem of (14.44) to a feasibility problem in dimension n−1 involving m inequalities. This new system should be of polynomial size in the size of the system (14.44). Hint: Hermite normal form.

3. Consider the polyhedron P = {x ∈ R^n : Ax ≤ b}, where A ∈ Z^{m×n} and b ∈ Z^m. Suppose that P has a flat direction whose first component is nonzero. Formulate a mixed integer program to compute the width and a flat direction of P. Conclude that the encoding length of a flat direction is polynomial in the encoding length of Ax ≤ b.

4. Let d ∈ R^n \ {0} and let B_n be the n-dimensional unit ball. Show that w_d(B_n) = 2 · ‖d‖.

5. Prove that the dual lattice of Λ(A) is Λ((A^{−1})^T), i.e., Λ*(A) = Λ((A^T)^{−1}).

6. Let A ∈ Q^{n×n} be a nonsingular rational matrix and a ∈ Q^n be a rational vector. Show that there exists a polynomial algorithm which outputs either an integer vector in E(A, a), or a nonzero integer vector d ∈ Z^n with w_d(E(A, a)) ≤ n · 2^{n(n−1)/4}.

7. Let P = {x ∈ R^n : Ax ≤ b} be a rational polyhedron in fixed dimension n. Show that the width and a flat direction of P can be computed in polynomial time. Hint: Exercise 5 of Section 14.5.


14.9 The integer linear optimization problem

In this section we want to consider the integer optimization problem in fixed dimension

max{c^T x : Ax ≤ b, x ∈ Z^n}.   (14.45)

In the analysis of the algorithm that follows, we use the parameters m and s, where m is the number of inequalities of the system Ax ≤ b and s is an upper bound on the binary encoding length of a coefficient of A, b and c.

The greatest common divisor of two integers a and b can be computed with the Euclidean algorithm with O(s) arithmetic operations, where s is an upper bound on the binary encoding length of the integers a and b. On the other hand we have the following well known formula

gcd(a, b) = min{a x_1 + b x_2 : a x_1 + b x_2 ≥ 1, x_1, x_2 ∈ Z}.   (14.46)

This implies that the greatest common divisor can be computed with an algorithm for the integer optimization problem in dimension 2 with one constraint.
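Both views of the gcd are easy to check on small numbers. The sketch below (my own illustration) contrasts the Euclidean algorithm with a brute-force evaluation of formula (14.46); the search box is a hypothetical bound used only to keep the enumeration finite.

```python
from math import gcd

def euclid(a, b):
    """Euclidean algorithm: O(s) arithmetic operations on numbers of encoding length s."""
    while b != 0:
        a, b = b, a % b
    return abs(a)

def gcd_via_ip(a, b, box=50):
    """Formula (14.46): gcd(a, b) is the least positive value of a*x1 + b*x2 with x1, x2 integer.
    Brute force over a small box of multipliers; purely illustrative, exponential in general."""
    return min(a * x1 + b * x2
               for x1 in range(-box, box + 1)
               for x2 in range(-box, box + 1)
               if a * x1 + b * x2 >= 1)

assert euclid(12, 18) == gcd(12, 18) == gcd_via_ip(12, 18) == 6
```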

The integer optimization problem can be reduced to the integer feasibility problem with binary search. One has a complexity of O(m + s) for the integer feasibility problem in fixed dimension. This follows from an analysis of Lenstra's algorithm in combination with an efficient algorithm to compute a Löwner-John ellipsoid, see [59, 83]. With binary search for an optimal point, one obtains a running time of O(m·s + s^2). If, in addition to the dimension, also the number of constraints is fixed, this results in an O(s^2) algorithm for the integer optimization problem, which is in contrast to the linear running time of the Euclidean algorithm.

Clarkson [13] has shown that the integer optimization problem with m constraints can be solved with an expected number of O(m) arithmetic operations and O(log m) calls to an oracle solving the integer optimization problem on a constant size subset of the input constraints. Therefore we concentrate now on the integer optimization problem with a fixed number of constraints. In this section we outline an algorithm which solves the integer optimization problem in fixed dimension with a fixed number of constraints with O(s) arithmetic operations on rational numbers of size O(s). The algorithm relies on the LLL-algorithm.

The first step is to reduce the integer optimization problem over a full-dimensional polytope with a fixed number of facets to a disjunction of integer optimization problems over a constant number of two-layer simplices. A two-layer simplex is a full-dimensional simplex whose vertices can be partitioned into two sets V and W, such that the objective function values of the elements in each of the sets V and W agree, i.e., for all v_1, v_2 ∈ V one has c^T v_1 = c^T v_2 and for all w_1, w_2 ∈ W one has c^T w_1 = c^T w_2.

How can one reduce the integer optimization problem over a polytope P to a sequence of integer optimization problems over two-layer simplices? Simply consider the hyperplanes c^T x = c^T v for each vertex v of P. If the number of constraints defining P is fixed, then these hyperplanes partition P into a constant number of polytopes, whose vertices can be grouped into two groups, according to their objective function value. Thus we can assume that the vertices of P can be partitioned into two sets V and W, such that the objective function values of the elements in each of the sets V and W agree. Carathéodory's theorem, see Schrijver [76, p. 94], implies that P is covered by the simplices that are spanned by the vertices of P. These simplices are two-layer simplices. Therefore, the integer optimization problem in fixed dimension with a fixed number of constraints can be reduced to a constant number of integer optimization problems over a two-layer simplex by applying a constant number of arithmetic operations.

The key idea is then to let the objective function slide into the two-layer simplex, until the width of the truncated simplex exceeds the flatness bound. In this way, one can be sure that the optimum of the integer optimization problem lies in the truncation, which is still flat. Thereby one has reduced the integer optimization problem in dimension n to a constant number of integer optimization problems in dimension n−1 and binary search can be avoided.

How do we determine a parameter π such that the truncated two-layer simplex Σ ∩ (c^T x ≥ π) just exceeds the flatness bound? We explain the idea with the help of the 3-dimensional example in Figure 14.9.

Fig. 14.9 Reduction to the parametric lattice width problem. (The figure shows the two-layer simplex with vertices 0, v_1, w_1, w_2, its truncation, and the simplex spanned by 0, v_1, µw_1, µw_2.)

Here we have a two-layer simplex Σ in 3-space. The set V consists of the points 0 and v_1 and W consists of w_1 and w_2. The objective is to find a highest point in the vertical direction. The picture on the left describes a particular point in time, where the objective function slid into Σ. So we consider the truncation Σ ∩ (c^T x ≥ π) for some π ≥ c^T w_1. This truncation is the convex hull of the points

0, v_1, µw_1, µw_2, (1−µ)v_1 + µw_1, (1−µ)v_1 + µw_2,   (14.47)

where µ = π/c^T w_1. Now consider the simplex Σ_{V,µW}, which is spanned by the points 0, v_1, µw_1, µw_2. This simplex is depicted on the right in Figure 14.9. If this simplex is scaled by 2, then it contains the truncation Σ ∩ (c^T x ≥ π). This is easy to see, since the scaled simplex contains the points 2(1−µ)v_1, 2µw_1 and 2µw_2. So we have the condition Σ_{V,µW} ⊆ Σ ∩ (c^T x ≥ π) ⊆ 2Σ_{V,µW}. From this we can infer the important observation

w(Σ_{V,µW}) ≤ w(Σ ∩ (c^T x ≥ π)) ≤ 2 w(Σ_{V,µW}).   (14.48)

This means that we essentially determine the correct π by determining a µ > 0, such that the width of the simplex Σ_{V,µW} just exceeds the flatness bound. The width of Σ_{V,µW} is roughly (up to a constant factor) the length of the shortest vector of the lattice L(A_µ), where A_µ is the matrix with rows

A_µ = ( µ w_1^T ;  µ w_2^T ;  v_1^T ).

Thus we have to find a parameter µ, such that the shortest vector of L(A_µ) is sandwiched between ω(n)+1 and γ · (ω(n)+1) for some constant γ. This problem can be understood as a parametric shortest vector problem.

To describe this problem let us introduce some notation. We define for an n×n matrix A = (a_{ij}) the matrix A^{µ,k} = (a^{µ,k}_{ij}) as

a^{µ,k}_{ij} = µ · a_{ij} if i ≤ k, and a^{µ,k}_{ij} = a_{ij} otherwise.   (14.49)

In other words, the matrix A^{µ,k} results from A by scaling the first k rows with µ. The parametric shortest vector problem is now defined as follows.

Given a nonsingular matrix A ∈ Z^{n×n} and some U ∈ N, find a parameter p ∈ N such that U ≤ SV(L(A^{p,k})) ≤ 2^{n+1/2} · U or assert that SV(L) > U.
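A naive way to attack this problem (not the cascaded-LLL method discussed below, and much slower) is to double µ until the shortest vector is long enough: scaling the first k rows by 2 at most doubles the length of every lattice vector, so the first power of two at which SV reaches U already satisfies U ≤ SV ≤ 2U ≤ 2^{n+1/2}·U. The sketch below illustrates this; `sv_length` is a hypothetical shortest-vector oracle (e.g. exact enumeration for tiny n), not a library routine.

```python
import numpy as np

def scale_first_rows(A, mu, k):
    """The matrix A^{mu,k} of (14.49): A with its first k rows scaled by mu."""
    A_scaled = np.array(A, dtype=float)
    A_scaled[:k, :] *= mu
    return A_scaled

def parametric_search(A, k, U, sv_length, max_doublings=64):
    """Doubling search for the parametric shortest vector problem (toy version).
    `sv_length` is an assumed oracle returning SV(L(.)) of a basis matrix."""
    if sv_length(np.array(A, dtype=float)) > U:
        return None                       # already too large at mu = 1: assert SV(L(A)) > U
    mu = 1
    for _ in range(max_doublings):
        if sv_length(scale_first_rows(A, mu, k)) >= U:
            return mu                     # U <= SV(L(A^{mu,k})) <= 2U by the doubling argument
        mu *= 2
    return None
```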

It turns out that the parametric shortest vector problem can be solved in linear time when the dimension is fixed with a cascaded LLL-algorithm. From this, it follows that the integer optimization problem in fixed dimension with a fixed number of constraints can be solved in linear time. Together with Clarkson's result we obtain the following.

Theorem 14.27 ([18]). The integer optimization problem (14.45) can be solved with an expected number of O(m + s log m) arithmetic operations on rationals of size O(s).

Notes

A polynomial time algorithm for the two-variable integer programming problem was presented by Hirschberg and Wong [38] and Kannan [46] for special cases and by Scarf [70, 71] for the general case. Then, Lenstra [56] proved that integer programming in arbitrary fixed dimension can be solved in polynomial time. Afterwards, various authors were looking for faster algorithms for the two-dimensional case [24, 85, 41]. A linear-time algorithm in the arithmetic model was presented by Eisenbrand and Laue [19].


Exercises

(These are not really "exercises" but rather research questions.)

1*) Can one solve an integer program in fixed dimension with a fixed number of constraints in quadratic time in the bit-model of complexity? Recent results on the bit-complexity of the LLL-algorithm [63] could help answering this question.

2*) Is the shortest vector problem solvable in time O(M(s) log s) if the dimension is fixed? See also the Notes section of Chapter 14.5.

14.10 Diophantine approximation and strongly polynomial algorithms

In this section we review a very important application of the LLL-algorithm in optimization, namely the rounding algorithm of Frank and Tardos [25]. The input of a combinatorial optimization problem usually consists of a combinatorial structure, like a graph or a subset system, and some numbers, which reflect weights on the edges, costs of subsets, or the like. An algorithm is strongly polynomial if

A) it consists only of basic arithmetic operations +, −, ∗, / and comparisons <, >, =;

B) the number of such basic operations which are carried out is polynomial in the input length, where the numbers in the input have length one;

C) the encoding lengths of the numbers in the intermediate steps of the algorithm are polynomial in the binary encoding length of the input, where numbers in the input account with their binary encoding length.

An algorithm is weakly polynomial if it satisfies A) and C) but instead of B) satisfies the weaker condition that the number of basic operations is bounded by a polynomial in the binary encoding length of the input.

Algorithms which are based on bit-scaling of the involved numbers are usually only weakly polynomial. It is an outstanding open problem whether linear programming, which can be solved in weakly polynomial time, can also be solved in strongly polynomial time.

A 0/1-optimization problem is an integer program where the integer variables are restricted to be either 0 or 1. The result that we survey now shows a facet of the tremendous importance of the LLL-algorithm in the area of algorithms and complexity. In fact it holds in greater generality, see [25, 31]. We treat a less general version here, to keep the exposition as simple as possible.

Theorem 14.28 ([25]). If a 0/1 optimization problem can be solved in weakly polynomial time, then it can be solved in strongly polynomial time.



The starting point of the method leading to this result is Dirichlet's theorem. Suppose that you have an n-dimensional vector α ∈ R^n which is represented by n real numbers α_1, ..., α_n ∈ R and you are interested in a vector p ∈ Z^n with integer components of small absolute value which encloses a small angle with α. Such a p could be a candidate to replace the linear objective function α^T x with the linear objective function p^T x, hoping that the optimal 0/1-solutions are the same.

Theorem 14.29 (Dirichlet). Let Q > 1 be a real number and let α_1, ..., α_n ∈ R. There exists an integer q and integers p_1, ..., p_n with

i) 1 ≤ q ≤ Q^n and
ii) |q · α_i − p_i| ≤ 1/Q for i = 1, ..., n.

Proof. Consider the lattice Λ which is generated by the following matrix

( 1            α_1        )
(    1         α_2        )
(       ⋱      ⋮          )
(          1   α_n        )
(              1/Q^{n+1}  )   (14.50)

The determinant of Λ is 1/Q^{n+1}. The cube K = {x ∈ R^{n+1} : −1/Q ≤ x ≤ 1/Q} is symmetric around 0 and has volume 2^{n+1} · 1/Q^{n+1}. Minkowski's theorem implies that this set contains a nonzero lattice point²

( −p_1 + q·α_1, ..., −p_n + q·α_n, q/Q^{n+1} )^T.

We can assume q to be positive, since the negative of this vector is also contained in K. The integers p_i, i = 1, ..., n, and q are as stated in the theorem. ⊓⊔

² Because, if this cube did not contain a nonzero lattice point, then one could scale it by a factor larger than one and it would still not contain a nonzero lattice point. The volume of this scaled cube is strictly larger than 2^{n+1} · 1/Q^{n+1}. This would contradict Minkowski's theorem.
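The lattice used in this proof is easy to write down explicitly. The following small numpy helper (my own illustration) builds the basis matrix (14.50), with the α_i in the last column.

```python
import numpy as np

def dirichlet_basis(alpha, Q):
    """Basis matrix (14.50): identity in the first n columns and the column
    (alpha_1, ..., alpha_n, 1/Q^{n+1}) as the last column.  Its columns generate
    the lattice used in the proof of Theorem 14.29."""
    n = len(alpha)
    B = np.eye(n + 1)
    B[:n, n] = alpha
    B[n, n] = 1.0 / Q ** (n + 1)
    return B

B = dirichlet_basis([0.3, 0.7], Q=10)
# det(B) = 1/Q^{n+1}; the lattice point from the proof is B @ (-p_1, ..., -p_n, q).
```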

If Q in the Dirichlet theorem is a positive integer, then one can replace condition ii) by the slightly stronger condition

ii') |q · α_i − p_i| < 1/Q for i = 1, ..., n,

see exercise 1). Suppose now that we want to solve a 0/1 optimization problem

max_{x ∈ F} α^T x   (14.51)


where F ⊆ {0,1}^n. The set F could, for example, be the characteristic vectors of matchings of a graph G = (V, E) or characteristic vectors of other combinatorial structures.

Our goal is to replace α by an integer vector v whose binary encoding length is polynomial in the dimension n such that the optimal points in F w.r.t. the objective functions α^T x and v^T x are the same.

To this end, consider an optimal solution x* of problem (14.51), i.e., an x* with

α^T (x* − x) ≥ 0 for all x ∈ F.

The points x* − x above have components in {0, ±1}. Our integer vector v ∈ Z^n will have the following property.

For each y ∈ {0, ±1}^n one has

α^T y ≥ 0 if and only if v^T y ≥ 0.

If this holds, then we can safely replace α by v in (14.51). Let us assume for a moment that Dirichlet's theorem has an efficient implementation, i.e., that the q and p_i satisfying conditions i) and ii) can be computed in polynomial time. Assume further that the largest absolute value of a component of α is one, or in other words that ‖α‖∞ = 1. If this did not hold, then we could simply scale α by 1/‖α‖∞. Set Q in Dirichlet's theorem to Q := n and let q and p be the resulting integer and integer vector respectively.

How large are the absolute values of the components of p? Since 1 ≤ q ≤ n^n (i) and |q α_i − p_i| < 1/n (ii'), it follows that ‖p‖∞ ≤ q ‖α‖∞ ≤ n^n. The binary encoding length of p is therefore polynomial in n, which is one of the objectives that we want an approximation of α to satisfy.

How close does p come to being a valid replacement v of α? The next lemma shows that this vector already comes very close to what we want. Let sign(x) denote the sign of a real number x, i.e.,

sign(x) = 1 if x > 0,  sign(x) = −1 if x < 0,  and sign(x) = 0 if x = 0.

Lemma 14.7. Let y ∈ {0, ±1}^n be a vector. If sign(p^T y) ≠ 0, then sign(p^T y) = sign(α^T y).

Proof. Suppose that p^T y > 0. Since p and y are integer vectors, it follows that p^T y ≥ 1 holds. Property ii') of the Dirichlet theorem implies |q α_i − p_i| < 1/n for each i. Thus |q α^T y − p^T y| = |(q α − p)^T y| < n/n = 1. In other words, the distance of p^T y to q α^T y is less than one, implying that q α^T y > 0 and consequently α^T y > 0.

The case p^T y < 0 is analogous. ⊓⊔

If sign(p^T y) = 0, then we cannot infer anything about sign(α^T y). The idea is now to apply recursion. If sign(p^T y) = 0, then clearly

sign(α^T y) = sign((q · α − p)^T y).

Since one component of α is 1, say component i, we clearly have p_i = q and thus the vector q · α − p has at least one more zero component than α. Let v′ be an integer vector which has been recursively computed and satisfies for each y ∈ {0, ±1}^n

sign(v′^T y) = sign((q · α − p)^T y).

Let M be a large weight. The above discussion shows that

∀ y ∈ {0, ±1}^n : sign(α^T y) = sign((M · p + v′)^T y).

Thus we set v := M · p + v′. Let us deduce a bound on M. This number M has to be large enough so that M · p^T y dominates the sign of (M · p + v′)^T y if p^T y ≠ 0. This clearly holds if M is at least as large as the absolute value of v′^T y. To this end, let t(n) be an upper bound on the largest absolute value of a component of v which is constructed by the above described recursive procedure. The absolute value of v′^T y is then bounded by n · t(n−1). Thus we set M to this value. How large is ‖v‖∞? For this we have the recursive formula

t(n) ≤ n · t(n−1) · n^n + t(n−1),

which shows a crude bound of t(n) ≤ (n+1)^{n^2+n}. The binary encoding length of v is thus O(n^2 log n).
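The recursion just described is short enough to prototype. The following toy sketch (my own illustration, not the Frank–Tardos implementation) replaces the Dirichlet/LLL step by an exhaustive search and, as a deviation from the text, uses Q = n+1 instead of n so that the oracle's non-strict bound ≤ 1/Q still yields |(qα − p)^T y| ≤ n/(n+1) < 1 in the Lemma 14.7 argument; M is chosen as any value exceeding n·‖v′‖∞.

```python
from fractions import Fraction
import itertools

def dirichlet_approx(alpha, Q):
    """Brute-force stand-in for the Dirichlet/LLL step (Lemma 14.9): search q = 1, ..., Q^n
    for a q such that the nearest integers p_i satisfy |q*alpha_i - p_i| <= 1/Q.
    Exponential in n; only used to keep this sketch self-contained."""
    n = len(alpha)
    for q in range(1, Q ** n + 1):
        p = [round(q * a) for a in alpha]
        if all(abs(q * a - pi) <= Fraction(1, Q) for a, pi in zip(alpha, p)):
            return q, p
    raise AssertionError("contradicts Dirichlet's theorem")

def frank_tardos_round(alpha):
    """Recursive construction v = M*p + v' described above.  Input: a list of Fractions.
    Output: an integer vector v with sign(v^T y) = sign(alpha^T y) for all y in {0,+-1}^n."""
    n = len(alpha)
    if all(a == 0 for a in alpha):
        return [0] * n
    scale = max(abs(a) for a in alpha)
    alpha = [a / scale for a in alpha]               # now ||alpha||_inf = 1; signs unchanged
    q, p = dirichlet_approx(alpha, Q=n + 1)
    residual = [q * a - pi for a, pi in zip(alpha, p)]   # has at least one more zero entry
    v_prime = frank_tardos_round(residual)
    M = n * max([abs(c) for c in v_prime] + [1]) + 1     # M > |v'^T y| for all y in {0,+-1}^n
    return [M * pi + vi for pi, vi in zip(p, v_prime)]

# Sanity check on a tiny example: v preserves the sign of alpha^T y for all y in {0,+-1}^n.
alpha = [Fraction(1), Fraction(71, 226), Fraction(-1, 7)]
v = frank_tardos_round(alpha)
sgn = lambda t: (t > 0) - (t < 0)
assert all(sgn(sum(vi * yi for vi, yi in zip(v, y))) ==
           sgn(sum(ai * yi for ai, yi in zip(alpha, y)))
           for y in itertools.product([-1, 0, 1], repeat=3))
```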

What about our assumption from the beginning that Dirichlet's theorem has an efficient implementation? In fact, it is an outstanding open problem whether the q and the p_i as in Dirichlet's theorem can be found in polynomial time. However, we have the LLL-algorithm. The proofs of the next two lemmas are left as an exercise.

Lemma 14.8. Let Q > 1 be an integer and α_1, ..., α_n ∈ Q be rational numbers. There exists a polynomial time algorithm that computes an integer q and integers p_1, ..., p_n with

I) 1 ≤ q ≤ √(n+1) · 2^{(n−1)/2} · Q^n and
II) |q · α_i − p_i| < √(n+1) · 2^{(n−1)/2} · 1/Q for i = 1, ..., n.

Lemma 14.9. Let Q > 1 be an integer and α_1, ..., α_n ∈ Q be rational numbers. There exists a polynomial algorithm that computes an integer q and integers p_1, ..., p_n with

a) 1 ≤ q ≤ 2^{n^2} Q^n and
b) |q · α_i − p_i| < 1/Q for i = 1, ..., n.

If we replace the guarantees of Dirichlet's theorem with the ones obtained in Lemma 14.9, then the binary encoding length of v is still polynomial in n. This means that we have an algorithm to round an objective function vector α ∈ Q^n to an equivalent integer vector v ∈ Z^n. The only trouble is that the LLL-algorithm is not strongly polynomial and our goal is to develop a strongly polynomial time algorithm. This is finally achieved by rounding the α_i first individually before the algorithm behind Lemma 14.9 is applied. The rounding operation ⌊·⌋ is not part of the standard set of operations on which a strongly polynomial time algorithm can rely. Let β ∈ R be a real number with 0 < β ≤ 1 and let K ≥ 2 be a positive integer. In exercise 4, you are asked to compute ⌊K · β⌋ with a linear (in the binary encoding length of K) number of elementary operations (+, −, ∗, /, <, >, =).

Theorem 14.30. There exists a strongly polynomial algorithm that, given n real numbers α_1, ..., α_n with ‖α‖∞ = 1, computes integers p_1, ..., p_n and q with

|q · α_i − p_i| < 1/n, i = 1, ..., n, and 1 ≤ q ≤ 2^{n^2+n} n^n.

Proof. For each i compute

α′_i = ⌊α_i · 2^{n^2+n+1} n^{n+1}⌋ / (2^{n^2+n+1} n^{n+1})

in strongly polynomial time (see exercise 4) and run the algorithm behind Lemma 14.9 on input α′ and Q′ = 2·n. This algorithm returns q and p.

We now have to show the following claim.

|q · α_i − p_i| ≤ 1/n, i = 1, ..., n, and q ≤ 2^{n^2+n} n^n.   (14.52)

Clearly one has 1 ≤ q ≤ 2^{n^2} (2n)^n = 2^{n^2+n} n^n. On the other hand we have

|q α_i − p_i| ≤ |q α_i − q α′_i| + |q α′_i − p_i| < 2^{n^2+n} n^n / (2^{n^2+n+1} n^{n+1}) + 1/(2n) = 1/n,

which shows the claim. ⊓⊔

If we adapt the argument from before with the new parameters of Theorem 14.30, we obtain the result of Frank and Tardos [25].

Theorem 14.31. There exists a strongly polynomial algorithm that, given α ∈ R^n, computes a v ∈ Z^n whose binary encoding length is polynomial in n such that sign(v^T y) = sign(α^T y) for all y ∈ {0, ±1}^n.

Consequently, a 0/1-optimization problem can be solved in polynomial time if and only if it can be solved in strongly polynomial time.

Notes

The result of Frank and Tardos [25] is more general than the case restricted to 0/1-problems described above. If one wants to optimize a linear function over the convex hull of a set of rational points, then the linear function can be replaced by an integer linear objective function whose binary encoding length is polynomial in the largest binary encoding length of a point in the set, see also [58, 31]. Tardos [80] has shown that a linear program max{c^T x : Ax ≤ b} can be solved within a number of elementary operations bounded by a polynomial in the binary encoding length of A. This can also be shown with a rounding argument, see [31]. Diophantine approximation is also discussed in a recent survey of Lenstra [57].

Exercises

1. Show that condition ii) can be strengthened to |q · α_i − p_i| < 1/Q for i = 1, ..., n.

2. Prove Lemma 14.8. Hint: The shortest vector of the lattice generated by the matrix (14.50) has ℓ2-norm at most √(n+1)/Q and the LLL-algorithm finds a vector x which is a 2^{(n−1)/2}-approximation of this shortest vector. Bound the ℓ∞-norm of x.

3. Prove Lemma 14.9. Hint: Set Q′ := ⌈√(n+1) · 2^{(n−1)/2} · Q⌉ and apply Lemma 14.8 using α and Q′. The bound a) holds for all but a finite number of n. For this fixed number of n one can afford to find a shortest vector w.r.t. the ℓ∞-norm (see exercise 14.5.5).

4. Let β ∈ R be a real number with 0 < β ≤ 1 and let K ≥ 2 be a positive integer. Show how to compute ⌊K · β⌋ with a linear (in the binary encoding length of K) number of elementary RAM operations (+, −, ∗, /, <, >, =).

5. Show that for a 0/1-optimization problem (14.51) there exists a v ∈ Z^n with ‖v‖∞ = 2^{O(n log n)} such that an x ∈ F is optimal w.r.t. α^T x if and only if x is optimal w.r.t. v^T x. Hint: Use linear programming duality and the Hadamard bound.

14.11 Parametric integer programming

The Frobenius problem is as follows. Given n integers a_1, ..., a_n whose greatest common divisor is one, compute the largest t ∈ N which cannot be written as

x_1 · a_1 + ⋯ + x_n · a_n = t,  x_1, ..., x_n ∈ N_0.

The Frobenius problem is NP-complete [67]. Kannan [45] showed that it can be solved in polynomial time if n is fixed. In fact, Kannan showed something more general. The Frobenius problem can be solved with binary search, if the validity of the following forall/exist-statement can be decided in polynomial time.

∀ y ∈ Z, y > N  ∃ x_1, ..., x_n ∈ N_0 : y = x_1 · a_1 + ⋯ + x_n · a_n   (14.53)
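For very small instances the Frobenius number can of course be computed directly. The following pseudo-polynomial dynamic program (my own illustration of the problem, not Kannan's polynomial-time method) scans upward until it has seen min(a) consecutive representable numbers, after which every larger number is representable as well.

```python
from math import gcd
from functools import reduce

def frobenius_number(a):
    """Largest t that is NOT a nonnegative integer combination of the a_i.
    Assumes gcd of the a_i is 1 and min(a) >= 2.  Small inputs only."""
    assert reduce(gcd, a) == 1 and min(a) >= 2
    rep = [True]               # rep[t]: is t representable?  rep[0] = True (empty sum)
    largest_non_rep = None
    run, t = 1, 0              # length of the current run of consecutive representable t's
    while run < min(a):        # after min(a) consecutive hits, all larger t are representable
        t += 1
        rep.append(any(t >= ai and rep[t - ai] for ai in a))
        if rep[t]:
            run += 1
        else:
            run, largest_non_rep = 0, t
    return largest_non_rep

assert frobenius_number([6, 10, 15]) == 29   # classical example
```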


Kannan’s result is a polynomial time procedure to decide forall/exist statementsof the following form, whereQ⊆ Rm is a polyhedron,A ∈ Zm×n is a matrix andt ∈ N is an integer.

∀ b ∈ (Q ∩ (R^{m−t} × Z^t)) :  Ax ≤ b is IP-feasible.   (14.54)

More precisely, he showed that such statements can be decided in polynomial time, if n, t and the affine dimension of Q are fixed.

Theorem 14.32 ([45]). If n, t and the affine dimension of Q are fixed, then ∀∃-statements can be decided in polynomial time.

In this section we review the basic ideas of his procedure and the generalization of Eisenbrand and Shmonin [21], which only assumes n and t to be fixed whereas the dimension of Q can vary.

Theorem 14.33 ([21]). If n and t are fixed, then ∀∃-statements can be decided in polynomial time.

Suppose that the width of a polyhedron P ⊆ R^n is the width along the integer direction c, i.e., w(P) = w_c(P) with c ∈ Z^n. Let β = min{c^T x : x ∈ P}. If w(P) > ω(n), then we can scale and translate P to obtain a polyhedron P′ which is sandwiched between the hyperplanes c^T x = β and c^T x = β + ω(n), see Figure 14.10.

Theorem 14.34. If w(P) > ω(n), then there exists an integer point in

P ∩ (β ≤ c^T x ≤ β + ω(n)).

The width of P′ is exactly ω(n) and by the flatness theorem (Theorem 14.22), P′ contains an integer point. From this it follows that there exists an integer point on one of the constant number of lower dimensional polyhedra

P ∩ (c^T x = ⌈β⌉ + i), for i = 0, ..., ω(n).   (14.55)

Theorem 14.35 (Strengthening of the flatness theorem). Let P ⊆ R^n be a polyhedron with w(P) = w_c(P) for an integer vector c ∈ Z^n \ {0}. The polyhedron P contains an integer point if and only if at least one of the polyhedra

P ∩ (c^T x = ⌈β⌉ + i),  i = 0, ..., ω(n),

contains an integer point.

We continue with a description of Kannan's ideas in dimension 2. Suppose that A ∈ Z^{m×2} and Q ⊆ R^m and that we wish to decide the ∀∃-statement (14.54) with t = 0. We then would have to decide whether there exists a b ∈ Q such that Ax ≤ b is integer infeasible. We denote the polyhedron {x ∈ R^2 : Ax ≤ b} by P_b. Let us make the following simplifying assumptions.

I) The flat direction of P_b is always the first unit vector e_1.

Fig. 14.10 An illustration of the strengthening of the flatness theorem: P′ ⊆ P lies between the hyperplanes c^T x = β and c^T x = β + ω(n). There must be an integer point on one of the dashed lines c^T x = ⌈β⌉ + i, i = 1, ..., ω(n).

II) The optimum basis of the linear program min{e_1^T x : x ∈ R^2, Ax ≤ b} is the same for all b ∈ Q.

It follows from II) that the optimum solution of the linear program min{e_1^T x : x ∈ P_b} is of the form G · b for a fixed matrix G ∈ R^{2×m}.

Fig. 14.11 Illustration in dimension 2: the polyhedron P_b, the flat direction e_1, the point Gb, the lines x_1 = ⌈e_1 Gb⌉ and x_1 = ⌈e_1 Gb⌉ + i, and the constraint a_{τ(i)} x = b_{τ(i)}.

The strengthening of the flatness theorem tells us that P_b contains an integral point if and only if there exists an integral point on the lines x_1 = ⌈e_1 Gb⌉ + j for j = 0, ..., ω(2). The intersections of these lines with the polyhedron P_b are 1-dimensional polyhedra. Some of the constraints of Ax ≤ b are "pointing upwards", i.e., their normal vector a satisfies a e_2 < 0, where e_2 is the second unit vector. Let a_1 x_1 + a_2 x_2 ≤ β be a constraint pointing upwards such that the intersection point (⌈e_1 Gb⌉ + j, y) of a_1 x_1 + a_2 x_2 = β with the line x_1 = ⌈e_1 Gb⌉ + j has the largest second component. The line x_1 = ⌈e_1 Gb⌉ + j contains an integral point in P_b if and only if (⌈e_1 Gb⌉ + j, ⌈y⌉) is contained in P_b. This point is illustrated in Figure 14.11. We assume that this "highest upwards constraint" for the line x_1 = z + i is always fixed, let's say it is the constraint a_{τ(i)} x ≤ b_{τ(i)} for a fixed mapping τ : {0, ..., ω(2)} → {1, ..., m}. We make this assumption explicit.

III) The highest intersection point of a line a_j x = b_j with the line x_1 = z + i is the intersection point of the line a_{τ(i)} x = b_{τ(i)}, where τ : {0, ..., ω(2)} → {1, ..., m} is a fixed mapping.

We now formulate a mixed-integer program from which we can extract the integer solution of P_b on the line x_1 = z + i, assuming that I)–III) hold.

(Gb)_1 ≤ z < (Gb)_1 + 1
x^i_1 = z + i
y = (b_{τ(i)} − a^{τ(i)}_1 x^i_1) / a^{τ(i)}_2
y ≤ x^i_2 < y + 1
x^i_1, x^i_2, z ∈ Z.   (14.56)

For a given b ∈ Q, the line x_1 = z + i contains an integer point in P_b if and only if x^i = (x^i_1, x^i_2) ∈ P_b.

Recall that we want to decide the forall/exist-statement (14.54) by searching a b ∈ Q such that P_b does not contain an integer point. In other words, we are looking for a b ∈ Q for which each of the x^i for i = 0, ..., ω(2) violates some constraint of Ax ≤ b. To do that, we again fix a mapping µ : {0, ..., ω(2)} → {1, ..., m}, where a_{µ(i)} x ≤ b_{µ(i)} is supposed to be the constraint of Ax ≤ b which is violated by x^i. The number of such mappings µ is m^{ω(2)+1}, which is polynomial in m, since ω(2) is a constant.

We enumerate all these mappings µ and append to each of the ω(2)+1 MIPs (14.56) the constraint

a_{µ(i)} x^i > b_{µ(i)}.   (14.57)

The conjunction of all these MIPs is a MIP with a constant number of integer variables, which can be solved with Lenstra's algorithm [56] in polynomial time. This shows that we can decide forall/exist statements, given that the conditions I)–III) hold.

We now justify condition I); at its heart lies the following theorem.

Theorem 14.36. Let n be fixed and A ∈ Z^{m×n} be a matrix of full column rank. There exists a set D ⊆ Z^n \ {0} of polynomial (in the encoding length of A) cardinality such that for each b ∈ R^m there exists a d ∈ D with w(P_b) = w_d(P_b). Such a set D can be computed in polynomial time.


Fig. 14.12 The flat direction c and the two cones C_1 and C_2 (spanned at the vertices F_{N_1} b and F_{N_2} b of P_b).

Proof. Suppose that P_b is non-empty and let c ∈ Z^n be its width direction. Then there are two bases N_1 and N_2 such that

max{cx : Ax ≤ b} = c F_{N_1} b  and  min{cx : Ax ≤ b} = c F_{N_2} b   (14.58)

and c belongs to the cones C_1 and C_2 defined by the optimal bases w.r.t. max{c^T x : x ∈ P_b} and min{c^T x : x ∈ P_b}, see Figure 14.12. Here F_{N_1} and F_{N_2} are the inverses of the basis matrices of N_1 and N_2 respectively.

In fact, equations (14.58) hold for any vector c in C_1 ∩ C_2. Thus, the lattice width of P_b is equal to the optimum value of the following optimization problem:

min{c (F_{N_1} − F_{N_2}) b : c ∈ C_1 ∩ C_2 ∩ Z^n \ {0}}.   (14.59)

The latter can be viewed as an integer programming problem. Indeed, the cones C_1 and C_2 can be represented by some systems of linear inequalities, say c D_1 ≤ 0 and c D_2 ≤ 0, respectively, where D_1, D_2 ∈ Z^{n×n}. The minimum (14.59) is taken over all integral vectors c satisfying c D_1 ≤ 0 and c D_2 ≤ 0, except the origin. Since both cones C_1 and C_2 are simplicial, i.e., generated by n linearly independent vectors, the origin is a vertex of C_1 ∩ C_2 and therefore can be cut off by a single inequality, for example, c D_1 1 ≤ −1, where 1 denotes the n-dimensional all-one vector. It is important that all other integral vectors c in C_1 ∩ C_2 satisfy this inequality and therefore remain feasible. Thus, the problem (14.59) can be rewritten as

min{c (F_{N_1} − F_{N_2}) b : c D_1 ≤ 0, c D_2 ≤ 0, c D_1 1 ≤ −1, c ∈ Z^n}.

For a given b, this is an integer programming problem. Therefore, the optimum value of (14.59) is attained at some vertex of the integer hull of the underlying polyhedron

{c : c D_1 ≤ 0, c D_2 ≤ 0, c D_1 1 ≤ −1}.   (14.60)


Shevchenko [77] and Hayes and Larman [35] proved that the number of vertices of the integer hull of a rational polyhedron is polynomial in fixed dimension. Tight bounds for this number were presented by Cook et al. [14] and Bárány et al. [6]. ⊓⊔

To justify conditions I), II) and III), one then partitions the parameter polyhedron Q into a polynomial number of polyhedra such that for the b in one part, the width direction is always invariant. Via a unimodular transformation one can assume that this flat direction is the first unit vector, see exercises 1 and 2 for further details.

Notes

The problem to decide forall/exist statements belongs to the second level of the polynomial hierarchy and is Π^p_2-complete, see [78, 84]. Barvinok and Woods [7] extended the counting algorithm of Barvinok [8] such that it computes a short rational generating function for an integer projection of the set of integer points in a polytope. The algorithm is based on Kannan's partitioning lemma. The variant of the partitioning theorem which we described was used and further refined in an implementation of the algorithm [7] by Köppe et al. [52]. Köppe and Verdoolaege [53] presented an algorithm which computes a formula for the number of integer points in a parameterized polyhedron.

Exercises

1. Consider the polyhedra P_b = {x ∈ R^n : Ax ≤ b} for b ∈ Q and let d ∈ Z^n \ {0}, where A ∈ Z^{m×n} has full column rank and n is fixed. Write down a polynomial number of inequalities which describe those b ∈ Q such that w(P_b) = w_d(P_b). Hint: Associate to each d ∈ D from Theorem 14.36 an "entering basis" and a "leaving basis".

2. Write down a linear program which partitions Q further into parts such that condition III) holds.

3*. Is there a polynomial algorithm which, given a matrix A ∈ Z^{m×n}, where n is fixed, and a polyhedron Q ⊆ R^m, determines a b ∈ Q such that the number of integer points in P_b is minimal?

Acknowledgments

I am grateful to Damien Stehlé for numerous comments and suggestions which helped me a lot to improve this manuscript. I also want to thank Johannes Blömer for several discussions on shortest and closest vectors.


References

1. A. V. Aho, J. E. Hopcroft, and J. D. Ullman,The design and analysis of computer algorithms,Addison-Wesley, Reading, 1974.

2. M. Ajtai, The shortest vector problem in L2 is NP-hard for randomized reductions, Proceed-ings of the 30th Annual ACM Symposium on Theory of Computing (STOC-98) (New York),ACM Press, 1998, pp. 10–19.

3. M. Ajtai, R. Kumar, and D. Sivakumar,A sieve algorithm for the shortest lattice vector prob-lem, Proceedings of the thirty-third annual ACM symposium on Theory of computing, ACMPress, 2001, pp. 601–610.

4. M. Ajtai, R. Kumar, and D. Sivakumar,Sampling short lattice vectors and the closest latticevector problem, Proceedings of the 17th IEEE Annual Conference on Computational Com-plexity (Montreal, Canada), 2002, pp. 53–67.

5. W. Banaszczyk,Inequalities for convex bodies and polar reciprocal lattices inRn. II. Appli-cation of K-convexity, Discrete Comput. Geom.16 (1996), no. 3, 305–311. MR MR1410163(97h:11073)

6. I. Barany, R. Howe, and L. Lovasz,On integer points in polyhedra: A lower bound, Combina-torica12 (1992), no. 2, 135–142.

7. A. Barvinok and K. Woods,Short rational generating functions for lattice point problems,Journal of the American Mathematical Society16 (2003), no. 4, 957–979 (electronic). MRMR1992831 (2004e:05009)

8. A. I. Barvinok,A polynomial time algorithm for counting integral points inpolyhedra whenthe dimension is fixed, Mathematics of Operations Research19 (1994), no. 4, 769–779. MRMR1304623 (96c:52026)

9. J. Blomer,Closest vectors, successive minima, and dual HKZ-bases of lattices, Automata,Languages and Programming, 27th International Colloquium, ICALP 2000, Geneva, Switzer-land, July 9-15, 2000, Proceedings (U. Montanari, J. D. P. Rolim, and E. Welzl, eds.), LectureNotes in Computer Science, vol. 1853, Springer, 2000, pp. 248–259.

10. J. Blomer and S. Naewe,Sampling methods for shortest vectors, closest vectors andsucces-sive minima, Proceedings of the 34th International Colloquium on Automata, Languages andProgramming, ICALP 2007 (L. Arge, C. Cachin, T. Jurdzinski,and A. Tarlecki, eds.), LectureNotes in Computer Science, vol. 4596, Springer, 2007, pp. 65–77.

11. I. Borosh and L. B. Treybig,Bounds on positive integral solutions of linear diophantine equa-tions, Proceedings of the American Mathematical Society55 (1976), 299–304.

12. J.-Y. Cai and A. P. Nerurkar, Approximating the SVP to within a factor (1 + 1/dim^ε) is NP-hard under randomized reductions, Proceedings of the 38th IEEE Conference on Computational Complexity (Pittsburgh), IEEE Computer Society Press, 1998, pp. 46–55.

13. K. L. Clarkson, Las Vegas algorithms for linear and integer programming when the dimension is small, Journal of the Association for Computing Machinery 42 (1995), 488–499.

14. W. Cook, M. E. Hartmann, R. Kannan, and C. McDiarmid, On integer points in polyhedra, Combinatorica 12 (1992), no. 1, 27–37.

15. I. Dinur, Approximating SVP∞ to within almost-polynomial factors is NP-hard, Theoretical Computer Science 285 (2002), no. 1, 55–71, Algorithms and complexity (Rome, 2000). MR MR1925787 (2003f:68048)

16. M. Dyer, A. Frieze, and R. Kannan, A random polynomial-time algorithm for approximating the volume of convex bodies, Journal of the ACM 38 (1991), no. 1, 1–17.

17. J. Edmonds, Systems of distinct representatives and linear algebra, Journal of Research of the National Bureau of Standards 71B (1967), 241–245.

18. F. Eisenbrand, Fast integer programming in fixed dimension, Proceedings of the 11th Annual European Symposium on Algorithms, ESA 2003 (G. D. Battista and U. Zwick, eds.), LNCS, vol. 2832, Springer, 2003, pp. 196–207.

19. F. Eisenbrand and S. Laue, A linear algorithm for integer programming in the plane, Mathematical Programming 102 (2005), no. 2, Ser. A, 249–259. MR MR2124445 (2005k:90080)

20. F. Eisenbrand and G. Rote, Fast reduction of ternary quadratic forms, Cryptography and Lattices Conference, CALC 2001 (J. Silverman, ed.), LNCS, vol. 2146, Springer, 2001, pp. 32–44.

21. F. Eisenbrand and G. Shmonin, Parametric integer programming in fixed dimension, Mathematics of Operations Research (2008), to appear.

22. P. van Emde Boas, Another NP-complete partition problem and the complexity of computing short vectors in a lattice, Tech. Report MI-UvA-81-04, Mathematical Institute, University of Amsterdam, Amsterdam, 1981.

23. X. G. Fang and G. Havas, On the worst-case complexity of integer Gaussian elimination, Proceedings of the 1997 International Symposium on Symbolic and Algebraic Computation (Kihei, HI) (New York), ACM, 1997, pp. 28–31 (electronic). MR MR1809966

24. S. D. Feit, A fast algorithm for the two-variable integer programming problem, Journal of the Association for Computing Machinery 31 (1984), no. 1, 99–113.

25. A. Frank and É. Tardos, An application of simultaneous Diophantine approximation in combinatorial optimization, Combinatorica 7 (1987), 49–65.

26. M. Fürer, Faster integer multiplication, Proceedings of the 39th Annual ACM Symposium on Theory of Computing, STOC 2007 (D. S. Johnson and U. Feige, eds.), ACM, 2007, pp. 57–66.

27. N. Gama and P. Q. Nguyen, Finding short lattice vectors within Mordell’s inequality, Proceedings of the 40th Annual ACM Symposium on Theory of Computing, Victoria, British Columbia, Canada, STOC 2008 (R. E. Ladner and C. Dwork, eds.), ACM, 2008, pp. 207–216.

28. B. Gärtner, A subexponential algorithm for abstract optimization problems, SIAM J. Comput. 24 (1995), no. 5, 1018–1035. MR MR1350756 (96m:68067)

29. C. F. Gauß, Disquisitiones arithmeticae, Gerh. Fleischer Iun., 1801.

30. O. Goldreich and S. Goldwasser, On the limits of non-approximability of lattice problems, Proceedings of the 30th Annual ACM Symposium on Theory of Computing (New York), ACM Press, 1998, pp. 1–9.

31. M. Grötschel, L. Lovász, and A. Schrijver, Geometric algorithms and combinatorial optimization, Algorithms and Combinatorics, vol. 2, Springer, 1988.

32. M. Grötschel, L. Lovász, and A. Schrijver, Geometric methods in combinatorial optimization, Progress in Combinatorial Optimization (W. R. Pulleyblank, ed.), Academic Press, Toronto, 1984, pp. 167–183. MR 86d:90127

33. G. Hanrot and D. Stehlé, Improved analysis of Kannan’s shortest lattice vector algorithm (extended abstract), Advances in Cryptology – CRYPTO 2007, Lecture Notes in Comput. Sci., vol. 4622, Springer, Berlin, 2007, pp. 170–186. MR MR2419600

34. I. Haviv and O. Regev, Tensor-based hardness of the shortest vector problem to within almost polynomial factors, STOC’07 – Proceedings of the 39th Annual ACM Symposium on Theory of Computing, ACM, New York, 2007, pp. 469–477. MR MR2402472 (2009e:68036)

35. A. C. Hayes and D. G. Larman, The vertices of the knapsack polytope, Discrete Applied Mathematics 6 (1983), no. 2, 135–138.

36. B. Helfrich, Algorithms to construct Minkowski reduced and Hermite reduced lattice bases, Theoretical Computer Science 41 (1985), no. 2-3, 125–139 (1986). MR MR847673 (87h:11124)

37. C. Hermite, Extraits de lettres de M. Ch. Hermite à M. Jacobi sur différents objets de la théorie des nombres, Journal für die reine und angewandte Mathematik 40 (1850).

38. D. S. Hirschberg and C. K. Wong, A polynomial algorithm for the knapsack problem in two variables, Journal of the Association for Computing Machinery 23 (1976), no. 1, 147–154.

39. F. John, Extremum problems with inequalities as subsidiary conditions, Studies and Essays Presented to R. Courant on his 60th Birthday, January 8, 1948, Interscience Publishers, Inc., New York, N. Y., 1948, pp. 187–204. MR MR0030135 (10,719b)

40. E. Kaltofen, On the complexity of finding short vectors in integer lattices, Computer Algebra: Proceedings of EUROCAL ’83 (J. van Hulzen, ed.), Lecture Notes in Computer Science, vol. 162, Springer-Verlag, Berlin, 1983, pp. 236–244. MR 86m:11101

41. N. Kanamaru, T. Nishizeki, and T. Asano, Efficient enumeration of grid points in a convex polygon and its application to integer programming, International Journal of Computational Geometry & Applications 4 (1994), no. 1, 69–85.

42. R. Kannan, Improved algorithms for integer programming and related problems, Proceedings of the 15th Annual ACM Symposium on Theory of Computing (New York), ACM Press, 1983, pp. 193–206.

43. R. Kannan, Minkowski’s convex body theorem and integer programming, Mathematics of Operations Research 12 (1987), no. 3, 415–440.

44. R. Kannan, Minkowski’s convex body theorem and integer programming, Mathematics of Operations Research 12 (1987), no. 3, 415–440.

45. R. Kannan, Lattice translates of a polytope and the Frobenius problem, Combinatorica 12 (1992), no. 2, 161–177.

46. R. Kannan, A polynomial algorithm for the two-variable integer programming problem, Journal of the Association for Computing Machinery 27 (1980), no. 1, 118–122.

47. R. Kannan and A. Bachem, Polynomial algorithms for computing the Smith and Hermite normal forms of an integer matrix, SIAM Journal on Computing 8 (1979), no. 4, 499–507.

48. L. G. Khachiyan and M. J. Todd, On the complexity of approximating the maximal inscribed ellipsoid for a polytope, Math. Programming 61 (1993), no. 2, Ser. A, 137–159. MR MR1240454 (94m:90071)

49. A. Khinchine, A quantitative formulation of Kronecker’s theory of approximation (in Russian), Izvestiya Akademii Nauk SSSR Seriya Matematika 12 (1948), 113–122.

50. S. Khot, Hardness of approximating the shortest vector problem in high ℓp norms, Journal of Computer and System Sciences 72 (2006), no. 2, 206–219. MR MR2205284 (2007b:68066)

51. D. Knuth, The art of computer programming, vol. 2, Addison-Wesley, 1969.

52. M. Köppe, S. Verdoolaege, and K. Woods, An implementation of the Barvinok–Woods integer projection algorithm, Tech. report, Katholieke Universiteit Leuven, 2008.

53. M. Köppe and S. Verdoolaege, Computing parametric rational generating functions with a primal Barvinok algorithm, Electron. J. Combin. 15 (2008), no. 1, Research Paper 16, 19. MR MR2383436

54. J. Lagarias, H. Lenstra, and C. Schnorr, Korkin-Zolotarev bases and successive minima of a lattice and its reciprocal lattice, Combinatorica 10 (1990), no. 4, 333–348.

55. A. K. Lenstra, H. W. Lenstra, and L. Lovász, Factoring polynomials with rational coefficients, Math. Annalen 261 (1982), 515–534.

56. H. W. Lenstra, Integer programming with a fixed number of variables, Mathematics of Operations Research 8 (1983), no. 4, 538–548.

57. H. W. Lenstra, Jr., Lattices, Algorithmic number theory: lattices, number fields, curves and cryptography, Math. Sci. Res. Inst. Publ., Cambridge Univ. Press, Cambridge, 2008, pp. 127–181. MR MR2467546

58. L. Lovász, An algorithmic theory of numbers, graphs and convexity, SIAM, 1986.

59. J. Matoušek, M. Sharir, and E. Welzl, A subexponential bound for linear programming, Algorithmica 16 (1996), no. 4-5, 498–516. MR 97f:90052

60. D. Micciancio, The shortest vector in a lattice is hard to approximate to within some constant, Proceedings of the 39th Annual Symposium on Foundations of Computer Science (Los Alamitos, CA), IEEE Computer Society, 1998, pp. 92–98.

61. H. Minkowski, Geometrie der Zahlen, Teubner, Leipzig, 1896.

62. Y. E. Nesterov and A. S. Nemirovski, Self-concordant functions and polynomial-time methods in convex programming, Tech. report, Central Economic and Mathematical Institute, USSR Academy of Science, Moscow, USSR, 1989.

63. P. Q. Nguyen and D. Stehlé, Floating-point LLL revisited, Proceedings of the 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT 05) (R. Cramer, ed.), Lecture Notes in Computer Science, vol. 3494, Springer, 2005, pp. 215–233.

64. P. Q. Nguyen and D. Stehlé, Floating-point LLL revisited, Advances in Cryptology – EUROCRYPT 2005, Lecture Notes in Comput. Sci., vol. 3494, Springer, Berlin, 2005, pp. 215–233. MR MR2352190

65. I. Niven, H. S. Zuckerman, and H. L. Montgomery, An introduction to the theory of numbers, fifth edition, Wiley, 1991.

66. X. Pujol and D. Stehlé, Rigorous and efficient short lattice vectors enumeration, 14th International Conference on the Theory and Application of Cryptology and Information Security (ASIACRYPT 2008), 2008, pp. 390–405.

67. J. L. Ramírez-Alfonsín, Complexity of the Frobenius problem, Combinatorica 16 (1996), no. 1, 143–147. MR MR1394516 (97d:11058)

68. O. Regev, Lattices in computer science, Lecture notes, Tel Aviv University, 2004, http://www.cs.tau.ac.il/~odedr/teaching/lattices_fall_2004/index.html.

69. O. Regev and R. Rosen, Lattice problems and norm embeddings, STOC’06: Proceedings of the 38th Annual ACM Symposium on Theory of Computing, ACM, New York, 2006, pp. 447–456. MR MR2277170 (2007i:68034)

70. H. E. Scarf, Production sets with indivisibilities. Part I: generalities, Econometrica 49 (1981), 1–32.

71. H. E. Scarf, Production sets with indivisibilities. Part II: The case of two activities, Econometrica 49 (1981), 395–423.

72. C.-P. Schnorr, A hierarchy of polynomial time lattice basis reduction algorithms, Theoretical Computer Science 53 (1987), no. 2-3, 201–224. MR 89h:11085

73. A. Schönhage, Schnelle Berechnung von Kettenbruchentwicklungen, Acta Informatica 1 (1971), 139–144.

74. A. Schönhage and V. Strassen, Schnelle Multiplikation großer Zahlen, Computing 7 (1971), 281–292.

75. A. Schrijver, Theory of linear and integer programming, John Wiley & Sons, Chichester, 1986.

76. A. Schrijver, Theory of linear and integer programming, John Wiley, 1986.

77. V. N. Shevchenko, On the number of extreme points in integer programming, Kibernetika (1981), no. 2, 133–134 (Russian).

78. L. J. Stockmeyer, The polynomial-time hierarchy, Theoretical Computer Science 3 (1976), no. 1, 1–22.

79. A. Storjohann, Faster algorithms for integer lattice reduction, Tech. Report 249, Department of Computer Science, ETH Zürich, 1996.

80. É. Tardos, A strongly polynomial algorithm to solve combinatorial linear programs, Operations Research 34 (1986), no. 2, 250–256. MR MR861043 (87i:90151)

81. J. von zur Gathen and J. Gerhard, Modern computer algebra, Cambridge University Press, 1999.

82. J. von zur Gathen and M. Sieveking, Weitere zum Erfüllungsproblem polynomial äquivalente kombinatorische Aufgaben, Komplexität von Entscheidungsproblemen: ein Seminar (E. Specker and V. Strassen, eds.), LNCS, vol. 43, Springer, 1976, pp. 49–71.

83. E. Welzl, Smallest enclosing disks (balls and ellipsoids), New results and new trends in computer science (Graz, 1991), Lecture Notes in Comput. Sci., vol. 555, Springer, Berlin, 1991, pp. 359–370. MR MR1254721

84. C. Wrathall, Complete sets and the polynomial-time hierarchy, Theoretical Computer Science 3 (1976), no. 1, 23–33.

85. L. Y. Zamanskij and V. D. Cherkasskij, A formula for determining the number of integral points on a straight line and its application, Ehkon. Mat. Metody 20 (1984), 1132–1138 (in Russian).