
Discrepancy and SDPs

Nikhil Bansal

(TU Eindhoven, Netherlands)

August 24, ISMP 2012, Berlin

Outline

Discrepancy Theory
• What is it
• Applications
• Basic results (non-constructive)

SDP connection
• Algorithms
• Lower bounds

Discrepancy Theory: What is it?

Study of discrepancy between self-perception and reality


Discrepancy: What is it?

Study of irregularities in approximating the continuous by the discrete.

Historical motivation: numerical integration / sampling. How well can you approximate a region by discrete points?

Discrepancy: What is it?

Problem: How uniformly can you distribute n points in an n^{1/2} × n^{1/2} grid?
"Uniform": for every axis-parallel rectangle R, |(# points in R) − (Area of R)| should be low.

Discrepancy: max over rectangles R of |(# points in R) − (Area of R)|.

Distributing points in a grid

(n = 64 points in each picture)

Uniform grid placement: n^{1/2} discrepancy.
Random placement: ≈ n^{1/2}(log log n)^{1/2} discrepancy.
Van der Corput set: O(log n) discrepancy!

Quasi-Monte Carlo Methods

n random samples (Monte Carlo): error ∝ n^{-1/2}.

Quasi-Monte Carlo methods*: error governed by the discrepancy of the sample points, e.g. ∝ (log n)/n with Van der Corput-type point sets.

Extensive research area.

*Different constant of proportionality
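To make the contrast concrete, here is a minimal Python sketch (not from the talk; the integrand and sample size are illustrative assumptions) comparing a plain Monte Carlo estimate with a quasi-Monte Carlo estimate built from the base-2 Van der Corput sequence:

    import random

    def van_der_corput(n, base=2):
        # First n points of the base-2 Van der Corput sequence (radical inverse of k).
        points = []
        for k in range(n):
            x, denom, q = 0.0, 1.0, k
            while q > 0:
                q, r = divmod(q, base)
                denom *= base
                x += r / denom
            points.append(x)
        return points

    def estimate(f, points):
        # The average of f over a point set approximates its integral on [0,1].
        return sum(f(x) for x in points) / len(points)

    f = lambda x: x * x                                      # true integral on [0,1] is 1/3
    n = 4096
    mc = estimate(f, [random.random() for _ in range(n)])    # Monte Carlo: error ~ n^(-1/2)
    qmc = estimate(f, van_der_corput(n))                     # quasi-Monte Carlo: error ~ (log n)/n
    print("MC error: ", abs(mc - 1 / 3))
    print("QMC error:", abs(qmc - 1 / 3))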

Discrepancy: Example 2

Input: n points placed arbitrarily in a grid. Color them red/blue so that each axis-parallel rectangle is colored as evenly as possible.

Discrepancy: max over rectangles R of |(# red in R) − (# blue in R)|.

Continuous: color each element ½ red and ½ blue (0 discrepancy).

Discrete: a random coloring gives about O(n^{1/2} log^{1/2} n); one can achieve O(log^{2.5} n).

Why do we care?

Combinatorial Discrepancy

Universe: U = {1, …, n}.   Subsets: S_1, S_2, …, S_m.

Color the elements red/blue so that each set is colored as evenly as possible.

Find χ: [n] → {−1, +1} to minimize  ‖χ(S)‖_∞ = max_S |Σ_{i∈S} χ(i)|.

Example: U = {1,2,3} and sets S_1, …, S_4 (shown in the original figure); one choice of sets admits disc = 0, another forces disc = 2.
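As a concrete illustration of the definition (not part of the slides; the example sets are arbitrary), a brute-force computation of disc for a tiny set system:

    from itertools import product

    def disc(n, sets):
        # min over colorings chi: [n] -> {-1,+1} of max over S of |sum_{i in S} chi(i)|
        best = float("inf")
        for chi in product([-1, 1], repeat=n):
            worst = max(abs(sum(chi[i] for i in S)) for S in sets)
            best = min(best, worst)
        return best

    # U = {0,1,2}; the odd-size set {0,1,2} forces discrepancy at least 1.
    print(disc(3, [{0, 1}, {1, 2}, {0, 1, 2}]))   # prints 1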

Combinatorial Discrepancy

If A is an m × n matrix:
disc(A) = min_{x ∈ {−1,+1}^n} ‖Ax‖_∞

Set system: A = {0,1} incidence matrix (rows = sets, columns = elements).

Applications

CS: Computational Geometry, Comb. Optimization, Monte-Carlo simulation, Machine Learning, Complexity, Pseudo-Randomness, …

Math: Dynamical Systems, Combinatorics, Mathematical Finance, Number Theory, Ramsey Theory, Algebra, Measure Theory, …

Hereditary Discrepancy

Discrepancy is a useful measure of the complexity of a set system, but by itself it is not robust: take sets A_1, …, A_m on elements 1, 2, …, n, add a disjoint copy A'_1, …, A'_m on elements 1', 2', …, n', and let S_i = A_i ∪ A'_i. Coloring each element +1 and its copy −1 gives discrepancy 0, no matter how complicated the A_i are.

Hereditary discrepancy is the robust version:

herdisc(U, S) = max_{U' ⊆ U} disc(U', S|_{U'})

Two Applications


Rounding

Lovász-Spencer-Vesztergombi'86: Given any matrix A and any x ∈ R^n, one can round x to an integer vector x̃ (each coordinate rounded up or down) such that ‖A(x − x̃)‖_∞ < herdisc(A).

Proof: Round the bits of x one by one, least significant bit first. At each step, the coordinates whose current bit is 1 form a sub-system; a coloring of that sub-system with discrepancy ≤ herdisc(A) tells us which of those bits to round up (+1) and which to round down (−1). Clearing the k-th bit changes Ax by at most herdisc(A)·2^{-k}, so

Error ≤ herdisc(A)·(½ + ¼ + ⅛ + …) < herdisc(A).

Key point: a low-discrepancy coloring guides our updates!
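Here is a minimal sketch of the bit-by-bit rounding idea (an illustration, not the LSV construction verbatim): the low-discrepancy coloring is found by brute force, standing in for whatever bound herdisc(A) actually guarantees, and the example matrix and vector are assumptions.

    from itertools import product

    def low_disc_coloring(A, cols):
        # Brute-force +/-1 coloring of the given columns minimizing max_j |sum_i A[j][i]*chi_i|.
        # (Stand-in for a coloring achieving the herdisc(A) guarantee.)
        best, best_chi = float("inf"), None
        for chi in product([-1, 1], repeat=len(cols)):
            err = max(abs(sum(row[i] * c for i, c in zip(cols, chi))) for row in A)
            if err < best:
                best, best_chi = err, chi
        return dict(zip(cols, best_chi))

    def lsv_round(A, x, bits=20):
        # Clear the bits of x one by one, least significant first; each pass changes
        # Ax by at most herdisc(A) * 2^(-k), so the total error telescopes.
        x = [round(xi * 2 ** bits) / 2 ** bits for xi in x]      # truncate to `bits` binary digits
        for k in range(bits, 0, -1):
            scale = 2.0 ** (-k)
            active = [i for i in range(len(x)) if int(round(x[i] / scale)) % 2 == 1]
            if not active:
                continue
            chi = low_disc_coloring(A, active)
            for i in active:                                     # round the k-th bit up or down
                x[i] += scale * chi[i]
        return [int(round(xi)) for xi in x]

    A = [[1, 1, 0], [0, 1, 1]]        # a tiny 0/1 "set system" matrix
    x = [0.3, 0.6, 0.4]
    print(lsv_round(A, x))            # an integer rounding of x with small ||A(x - x~)||_inf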

Rounding

The LSV'86 result guarantees the existence of a good rounding.

How to find it efficiently? Nothing was known until recently.

Thm [B'10]: Such a rounding can be found efficiently, with error bounded by the hereditary discrepancy.

Discrepancy and optimization

Corollary (LSV'86): If A is an integer matrix with herdisc(A) = 1, then A is TU.
(Totally unimodular: the polytope Ax ≤ b is integral for all integer vectors b.)

Ghouila-Houri test for TU matrices.

Open: Can you characterize matrices with herdisc(A) = 2?

Bin Packing: OPT ≤ LP + O(1)?

[Eisenbrand, Palvolgyi, Rothvoss'11]: Yes, for constant item sizes, if the k-permutation conjecture is true.
(Recently, Newman-Nikolov'11 disproved the k-permutation conjecture.)

Refined further by Rothvoss'12 (entropy rounding method).

Dynamic Data Structures

N weighted points in a 2-d region.

Weights updated over time.

Query: Given an axis-parallel rectangle R,

determine the total weight of the points in R.

Goal: Preprocess (in a data structure)

1) Low query time

2) Low update time (upon weight change)


Example: interval queries on a line.

Trivial: Query time = O(n), Update time = 1.

Table of all interval sums W[a,b]: Query time = 1, Update time = O(n²).

Prefix sums (W[a,b] = W[0,b] − W[0,a]): Query time = 2, Update time = O(n).

Query = O(log n), Update = O(log n).

Recursively for 2-d.
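The slides do not name a structure for the O(log n)/O(log n) trade-off; a Fenwick (binary indexed) tree is one standard way to get it, sketched here for weighted points on a line:

    class Fenwick:
        def __init__(self, n):
            self.n = n
            self.tree = [0.0] * (n + 1)

        def update(self, i, delta):
            # Add delta to the weight of point i (0-indexed). O(log n).
            i += 1
            while i <= self.n:
                self.tree[i] += delta
                i += i & (-i)

        def prefix(self, i):
            # Total weight of points 0..i-1. O(log n).
            s = 0.0
            while i > 0:
                s += self.tree[i]
                i -= i & (-i)
            return s

        def query(self, a, b):
            # Total weight of points in the interval [a, b).
            return self.prefix(b) - self.prefix(a)

    f = Fenwick(8)
    f.update(3, 2.5)      # point 3 gets weight 2.5
    f.update(5, 1.0)
    print(f.query(2, 6))  # 3.5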

What about other queries?

Circles, arbitrary rectangles, aligned triangles, …

Turns out these require polynomially large query or update time.

Reason: the set system S formed by the query ranges and the points has large discrepancy (about n^{1/4}).

[Larsen'11]

Bounding Discrepancy


General set system

What is the discrepancy of a general system of m sets?

Useful fact: after n coin tosses,
E[# Heads] = n/2, and
# Heads = n/2 ± O(n^{1/2}) with high probability.

In general: a sum of n independent "nice" random variables deviates from its mean by about n^{1/2} (the standard-deviation scale) with high probability.

(Previous) Best Algorithm

Random: Color each element i independently x(i) = +1 or −1 with prob. ½ each.

Thm: Discrepancy = O(n log m)^{1/2}.

Pf: For each set, expect O(n^{1/2}) discrepancy.
Standard tail bounds: Pr[ |Σ_{i∈S} x(i)| ≥ c·n^{1/2} ] ≈ e^{-c²}.
Union bound + choose c ≈ (log m)^{1/2}.

Tight: a random coloring cannot do better.
For the m = n case: random gives Θ((n log n)^{1/2}).
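A quick experiment (illustrative only; the random set system and its size are assumptions) showing that a random coloring indeed lands around the (n log m)^{1/2} scale:

    import math, random

    n, m = 200, 200
    sets = [[i for i in range(n) if random.random() < 0.5] for _ in range(m)]
    chi = [random.choice([-1, 1]) for _ in range(n)]
    disc = max(abs(sum(chi[i] for i in S)) for S in sets)
    print("observed discrepancy: ", disc)
    print("(n ln m)^(1/2) scale:  ", round(math.sqrt(n * math.log(m)), 1))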

Better Colorings Exist!

[Spencer'85] (Six standard deviations suffice):
Any system of n sets has discrepancy ≤ 6·n^{1/2}.
(In general, for arbitrary m: discrepancy = O(n^{1/2} log(m/n)^{1/2}).)
Tight: for m = n, cannot beat 0.5·n^{1/2} (Hadamard matrix).

Inherently non-constructive proof (counting): the powerful entropy method.

Question: Can we find such a coloring algorithmically?
Certain algorithms do not work [Spencer].
Conjecture [Alon-Spencer]: May not be possible.

(Figure: the space of colorings.)

Results

Thm: Can get Spencer's bound constructively, i.e. O(n^{1/2}) discrepancy for m = n sets.

Thm: For any set system, can find a coloring with discrepancy ≤ hereditary discrepancy.

Corollary: Rounding with error ≤ herdisc(A).

General technique: k-permutation problem [Spencer, Srinivasan, Tetali], geometric problems, Beck-Fiala setting (Srinivasan's bound), …

SDPs

Vector program view:

Variables: vectors v_1, …, v_n (in arbitrary dimension).

Constraints: arbitrary linear constraints on the inner products v_i · v_j,
e.g. |v_i|² = 1, or |Σ_{i∈S} v_i|² ≤ λ² for a set S.

Relaxations: LPs and SDPs

Not clear how to use. A linear program is useless: it can color each element ½ red and ½ blue, making the discrepancy of every set 0!

In general, if x is a good coloring then so is −x; but their average, the all-zero fractional coloring, is always LP-feasible.

SDP:
|Σ_{i∈S} v_i|² ≤ n   for every set S
|v_i|² = 1           for every element i

Intended solution: v_i = (+1, 0, …, 0) or (−1, 0, …, 0).
Trivially feasible: v_i = e_i (all v_i's orthogonal).

Yet, SDPs will be a major tool.
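For concreteness, the vector program above can be written as a semidefinite program over the Gram matrix X with X[i,j] = v_i · v_j. A minimal sketch using cvxpy (the solver setup and the tiny example set system are assumptions, not from the talk):

    import numpy as np
    import cvxpy as cp

    n = 10
    sets = [[0, 1, 2, 3], [2, 3, 4, 5], [0, 5, 6, 7, 8, 9]]    # assumed example sets

    X = cp.Variable((n, n), PSD=True)            # Gram matrix, X[i, j] = v_i . v_j
    constraints = [cp.diag(X) == 1]              # |v_i|^2 = 1 for every element
    for S in sets:
        # |sum_{i in S} v_i|^2 = sum_{i,j in S} v_i . v_j <= n
        constraints.append(sum(X[i, j] for i in S for j in S) <= n)

    prob = cp.Problem(cp.Minimize(0), constraints)   # pure feasibility problem
    prob.solve()

    # Recover vectors v_i in R^n as rows of V, where V V^T ~ X.
    w, U = np.linalg.eigh(X.value)
    V = U * np.sqrt(np.clip(w, 0, None))
    print(prob.status, np.abs(V @ V.T - X.value).max())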

Punch line

The SDP is very helpful if we have "tighter" bounds (≤ λ) for some sets.

But why does it work for Spencer's setting? An additional idea is needed.

The algorithm constructs the coloring over time, using several SDPs.

Algorithm (at high level)

Think of colorings as points of the cube {−1,+1}^n: each dimension is an element, each vertex is a coloring.

Algorithm: a "sticky" random walk, starting at the center and finishing at a vertex. Each step is generated by rounding a suitable SDP; the moves in different dimensions are correlated, e.g. γ_1^t + γ_2^t ≈ 0.

Analysis: Few steps suffice to reach a vertex (the walk has high variance), while disc(S_i) does a random walk with low variance.

An SDP

Hereditary discrepancy ≤ λ ⇒ the following SDP is feasible.

SDP (low discrepancy):
|Σ_{i∈S_j} v_i|² ≤ λ²   for each set S_j
|v_i|² = 1              for each element i

Solving it gives vectors v_i ∈ R^n. Perhaps they can guide how to update element i?

Trouble: v_i is a vector, and we need a real number. Perhaps project onto some vector g? (i.e., for each i, consider γ_i = g · v_i.) Seems promising.

Idea

Which vector g should we project onto? Pick a random Gaussian vector g = (g_1, …, g_n) ∈ R^n, with each g_i i.i.d. N(0,1).

Lemma: If g ∈ R^n is a random Gaussian vector, then for any v ∈ R^n, g · v is distributed as N(0, |v|²).

Pf: N(0, a²) + N(0, b²) = N(0, a² + b²), so g · v = Σ_i v(i) g_i ~ N(0, Σ_i v(i)²).
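A quick numerical sanity check of the lemma (illustrative, using numpy):

    import numpy as np

    rng = np.random.default_rng(0)
    v = np.array([3.0, 4.0])                           # |v|^2 = 25
    samples = rng.standard_normal((100000, 2)) @ v     # g . v for many random Gaussian g
    print(samples.mean(), samples.var())               # approx 0 and approx 25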

Properties of Rounding

Recall: γ_i = g · v_i, where the SDP guarantees |v_i|² = 1 and |Σ_{i∈S} v_i|² ≤ λ².

1. Each γ_i ~ N(0, 1).
2. For each set S, Σ_{i∈S} γ_i = g · (Σ_{i∈S} v_i) ~ N(0, σ²) with σ ≤ λ (standard deviation at most λ).

The γ's will guide our updates to x.

Algorithm Overview

Construct the coloring iteratively.

Initially: start with the coloring x_0 = (0, 0, …, 0) at t = 0.

At time t: update the coloring as x_t = x_{t−1} + ε(γ_1^t, …, γ_n^t)   (ε tiny: 1/n suffices).

So x_t(i) = ε(γ_i^1 + γ_i^2 + … + γ_i^t).

Color of element i: does a random walk over time with step size ≈ ε·N(0,1); it is fixed once it reaches −1 or +1.

Set S: x_t(S) = Σ_{i∈S} x_t(i) does a random walk with step ε·N(0, ≤ λ²).
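A minimal sketch of the sticky walk (illustrative). The function solve_vectors below is a placeholder returning the trivially feasible orthonormal solution, so it only demonstrates the walk mechanics and the freezing rule; the actual algorithm would solve the low-discrepancy SDP on the floating elements at each step.

    import numpy as np

    def solve_vectors(floating):
        # Placeholder for the SDP solve: the trivially feasible solution v_i = e_i,
        # one unit vector per floating element.
        return {i: e for i, e in zip(floating, np.eye(len(floating)))}

    def sticky_walk(n, eps=0.02, seed=0):
        rng = np.random.default_rng(seed)
        x = np.zeros(n)
        alive = set(range(n))
        while alive:
            floating = sorted(alive)
            vecs = solve_vectors(floating)
            g = rng.standard_normal(len(floating))        # random Gaussian g
            for i in floating:
                x[i] += eps * float(g @ vecs[i])          # gamma_i = g . v_i drives the step
                if abs(x[i]) >= 1:                        # sticky: freeze the color at -1 or +1
                    x[i] = np.sign(x[i])
                    alive.remove(i)
        return x.astype(int)

    n = 32
    sets = [list(range(i, i + 8)) for i in range(0, n - 7, 4)]
    chi = sticky_walk(n)
    print("discrepancies:", [int(abs(sum(chi[i] for i in S))) for S in sets])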

Analysis

Consider time T = O(1/ε²).

Claim 1: With prob. ½, an element reaches −1 or +1.
Pf: Each element does a random walk (a martingale) with step size ≈ ε. Recall: a random walk with step size 1 is ≈ t^{1/2} away from the origin after t steps.

Claim 2: Each set has O(λ) discrepancy in expectation.
Pf: For each S, x_t(S) does a random walk with step size ≈ ελ.

At time T = O((log n)/ε²):
1. Prob. that an element is still floating < 1/(10n).
2. Expected discrepancy of a set = O(λ (log n)^{1/2}); by Chernoff-type bounds, all sets have discrepancy O(λ log n).

Recap

At each step of the walk, formulate an SDP on the floating variables; the SDP solution guides the walk.

Properties of the walk:
High variance for elements → quick convergence to a vertex.
Low variance for the discrepancy of sets → low discrepancy.

Refinements

Spencer's six-standard-deviations result: recall we want O(n^{1/2}) discrepancy, but a random coloring gives n^{1/2}(log n)^{1/2}.

The previous approach seems useless: the expected discrepancy of a set is O(n^{1/2}), but some of the random walks will deviate by up to a (log n)^{1/2} factor.

Fix: tune down the variance of the dangerous sets (there are not too many); the entropy method shows the SDP is still feasible.

(Figure: danger zones at discrepancy thresholds 20n^{1/2}, 30n^{1/2}, 35n^{1/2}, …, labeled Danger 1, Danger 2, Danger 3, …)

Further Developments

Can be derandomized [Bansal-Spencer'11].

Our algorithm still uses the entropy method, so it gives no new proof of Spencer's result. Is there a purely constructive proof?

Lovett-Meka'12: Yes. Gaussian random walks + linear algebra.

Matousek Lower Bound

Thm (Lovász-Spencer-Vesztergombi'86): herdisc(A) ≥ ½·detlb(A), where

detlb(A) := max_k  max over k×k submatrices B of A of |det(B)|^{1/k}.

Conjecture (LSV'86): herdisc ≤ O(1)·detlb.

Remark: For TU matrices, herdisc(A) = 1 and detlb = 1
(every square submatrix has determinant −1, 0, or +1).
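For intuition, detlb can be computed by brute force on tiny matrices (an illustrative sketch; exponential time, of course):

    from itertools import combinations
    import numpy as np

    def detlb(A):
        # detlb(A) = max over k and k x k submatrices B of |det(B)|^(1/k), by brute force.
        A = np.asarray(A, dtype=float)
        m, n = A.shape
        best = 0.0
        for k in range(1, min(m, n) + 1):
            for rows in combinations(range(m), k):
                for cols in combinations(range(n), k):
                    d = abs(np.linalg.det(A[np.ix_(rows, cols)]))
                    best = max(best, d ** (1.0 / k))
        return best

    print(detlb([[1, 1, 0], [0, 1, 1], [1, 0, 1]]))   # 2^(1/3) ~ 1.26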


Detlb

Hoffman: detlb(A) can be strictly smaller than herdisc(A).

Palvolgyi'11: the gap can be even larger.

Matousek'11: herdisc(A) exceeds detlb(A) by at most a polylogarithmic (roughly log n) factor.

Idea: Our algorithm ⇒ the SDP relaxation is not too weak.
SDP duality ⇒ a dual witness for large herdisc(A).
Dual witness ⇒ a submatrix with a large determinant.

Other implications: …

In Conclusion

Various basic questions remain open in discrepancy.

Algorithmic questions:
– Conjecture (Matousek'11): disc(A) = O(hervecdisc(A))
  (would imply the tight bound herdisc(A) = O(log m)·detlb(A)).
– A constructive Banaszczyk bound (O((t log n)^{1/2})) for Beck-Fiala.
– Approximation for hereditary discrepancy?

Various other non-constructive methods: counting, topological, fixed points, …
What can be made constructive is not so well understood.

Thank you for your attention
