living on the edge - california institute of...

36
Living on the Edge Phase Transitions in Convex Programs with Random Data Joel A. Tropp Michael B. McCoy Computing + Mathematical Sciences California Institute of Technology Joint with Dennis Amelunxen and Martin Lotz (Manchester) Research supported in part by ONR, AFOSR, DARPA, and the Sloan Foundation 1

Upload: hakhanh

Post on 11-Jun-2018

229 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Living on the Edge¦

Phase Transitions in Convex Programs with Random Data

Joel A. TroppMichael B. McCoy

Computing + Mathematical Sciences

California Institute of Technology

Joint with Dennis Amelunxen and Martin Lotz (Manchester)

Research supported in part by ONR, AFOSR, DARPA, and the Sloan Foundation 1

Page 2: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Convex Programs with Random Data

Examples...

§ Stat and ML. Random data models; fit model via optimization

§ Sensing. Collect random measurements; reconstruct via optimization

§ Coding. Random channel models; decode via optimization

Motivations...

§ Average-case analysis. Randomness describes “typical” behavior

§ Fundamental bounds. Opportunities and limits for convex methods

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 2

Page 3: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Research Challenge...

Understand and predictprecise behavior

of random convex programs

References: Donoho–Maleki–Montanari 2009, Donoho–Johnstone–Montanari 2011, Donoho–Gavish–Montanari 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 3

Page 4: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

A Theory Emerges...

§ Vershik & Sporyshev, “An asymptotic estimate for the average number of steps...” 1986

§ Donoho, “High-dimensional centrally symmetric polytopes...” 2/2005

§ Rudelson & Vershynin, “On sparse reconstruction...” 2/2006

§ Donoho & Tanner, “Counting faces of randomly projected polytopes...” 5/2006

§ Xu & Hassibi, “Compressed sensing over the Grassmann manifold...” 9/2008

§ Stojnic, “Various thresholds for `1 optimization...” 7/2009

§ Bayati & Montanari, “The LASSO risk for gaussian matrices” 8/2010

§ Oymak & Hassibi, “New null space results and recovery thresholds...” 11/2010

§ Chandrasekaran, Recht, et al., “The convex geometry of linear inverse problems” 12/2010

§ McCoy & Tropp, “Sharp recovery bounds for convex demixing...” 5/2012

§ Bayati, Lelarge, & Montanari, “Universality in polytope phase transitions...” 7/2012

§ Chandrasekaran & Jordan, “Computational & statistical tradeoffs...” 10/2012

§ Amelunxen, Lotz, McCoy, & Tropp, “Living on the edge...” 3/2013

§ Stojnic, various works 3/2013

§ Foygel & Mackey, “Corrupted sensing: Novel guarantees...” 5/2013

§ Oymak & Hassibi, “Asymptotically exact denoising...” 5/2013

§ McCoy & Tropp, “From Steiner formulas for cones...” 8/2013

§ McCoy & Tropp, “The achievable performance of convex demixing...” 9/2013

§ Oymak, Thrampoulidis, & Hassibi, “The squared-error of generalized LASSO...” 11/2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 4

Page 5: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

The Core Question

How big is a cone?

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 5

Page 6: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

.

RegularizedDenoising

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 6

Page 7: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Denoising a Piecewise Smooth Signal

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−4

−3

−2

−1

0

1

2

3

4Piecewise smooth function + additive white noise

Time

Val

ue

Original + noise

Original

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−4

−3

−2

−1

0

1

2

3Denoised piecewise smooth function

Time

Valu

e

By wavelet shrinkageOriginal

§ Observation: z = x\ + σg where g ∼ normal(0, I)

§ Denoise via wavelet shrinkage = convex optimization:

minimize1

2‖x− z‖22 + λ ‖Wx‖1

Reference: Donoho & Johnstone, early 1990s

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 7

Page 8: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Setup for Regularized Denoising

§ Let x\ ∈ Rd be “structured” but unknown

§ Let f : Rd → R be a convex function that reflects “structure”

§ Observe z = x\ + σg where g ∼ normal(0, I)

§ Remove noise by solving the convex program*

minimize1

2‖x− z‖22 subject to f(x) ≤ f(x\)

§ Hope: The minimizer x approximates x\

*We assume the side information f(x\) is available. This is equivalent** to knowing the

optimal choice of Lagrange multiplier for the constraint.

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 8

Page 9: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Geometry of Denoising I

{x : f(x) ≤ f(x\)}

zσg

x

x\

References: Chandrasekaran & Jordan 2012, Oymak & Hassibi 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 9

Page 10: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Descent Cones

Definition. The descent cone of a function f at a point x is

D(f,x) := {h : f(x+ εh) ≤ f(x) for some ε > 0}

{h : f(x+ h) ≤ f(x)}

D(f,x)

0{y : f(y) ≤ f(x)}

x+ D(f,x)

x

References: Rockafellar 1970, Hiriary-Urruty & Lemarechal 1996

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 10

Page 11: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Geometry of Denoising II

{x : f(x) ≤ f(x\)}

x\ +K

zσg

xΠK(σg)

x\

K = D(f,x\)

References: Chandrasekaran & Jordan 2012, Oymak & Hassibi 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 11

Page 12: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

The Risk of Regularized Denoising

Theorem 1. [Oymak & Hassibi 2013] Assume

§ We observe z = x\ + σg where g is standard normal

§ The vector x solves

minimize1

2‖z − x‖22 subject to f(x) ≤ f(x\)

Then

supσ>0

E ‖x− x\‖22σ2

= E ‖ΠK(g)‖22

where K = D(f,x\) and ΠK is the Euclidean metric projector onto K.

Related: Bhaskar–Tang–Recht 2012, Donoho–Johnstone–Montanari 2012, Chandrasekaran & Jordan 2012

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 12

Page 13: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

.

Statistical.Dimension

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 13

Page 14: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Statistical Dimension: The Motion Picture

K

g

ΠK(g)

0

small cone

K

g

ΠK(g)

0

big cone

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 14

Page 15: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

The Statistical Dimension of a Cone

Definition. The statistical dimension δ(K) of a closed,

convex cone K is the quantity

δ(K) := E[‖ΠK(g)‖22

].

where

§ ΠK is the Euclidean metric projector onto K

§ g ∼ normal(0, I) is a standard normal vector

References: Rudelson & Vershynin 2006, Stojnic 2009, Chandrasekaran et al. 2010, Chandrasekaran & Jordan 2012, Amelunxen

et al. 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 15

Page 16: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Basic Statistical Dimension Calculations

Cone Notation Statistical Dimension

j-dim subspace Lj j

Nonnegative orthant Rd+ 12d

Second-order cone Ld+1 12(d+ 1)

Real psd cone Sd+ 14d(d− 1)

Complex psd cone Hd+ 12d

2

References: Chandrasekaran et al. 2010, Amelunxen et al. 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 16

Page 17: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Circular Cones

00

1/4

1/2

3/4

1

1

References: Amelunxen et al. 2013, Mu et al. 2013, McCoy & Tropp 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 17

Page 18: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Descent Cone of `1 Norm at Sparse Vector

0 1/4 1/2 3/4 10

1/4

1/2

3/4

1

1

References: Stojnic 2009, Donoho & Tanner 2010, Chandrasekaran et al. 2010, Amelunxen et al. 2013, Mackey & Foygel 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 18

Page 19: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Descent Cone of S1 Norm at Low-Rank Matrix

0 1/4 1/2 3/4 1

1/4

1/2

3/4

1

0

References: Oymak & Hassibi 2010, Chandrasekaran et al. 2010, Amelunxen et al. 2013, Foygel & Mackey 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 19

Page 20: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

.

Regularized LinearInverse Problems

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 20

Page 21: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Example: The Random Demodulator

Time (μ s)

Freq

uenc

y (M

Hz)

40.08 80.16 120.23 160.31 200.39

0.01

0.02

0.04

0.05

0.06

0.07

Time (μ s)

Freq

uenc

y (M

Hz)

40.08 80.16 120.23 160.31 200.39

0.01

0.02

0.04

0.05

0.06

0.07

PseudorandomNumberGenerator

Seed

Input to sensor Reconstruct(e.g., convex optimization)

Linear data acquisition system Output from sensor

Reference: Tropp et al. 2010, ...

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 21

Page 22: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Setup for Linear Inverse Problems

§ Let x\ ∈ Rd be a structured, unknown vector

§ Let f : Rd → R be a convex function that reflects structure

§ Let A ∈ Rm×d be a measurement operator

§ Observe z = Ax\

§ Find estimate x by solving convex program

minimize f(x) subject to Ax = z

§ Hope: x = x\

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 22

Page 23: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Geometry of Linear Inverse Problems

x\ + null(A)

{x : f(x) ≤ f(x\)}x\

x\ + D(f,x\)

Success!

x\ + null(A)

{x : f(x) ≤ f(x\)}x\

x\ + D(f,x\)

Failure!

References: Candes–Romberg–Tao 2005, Rudelson–Vershynin 2006, Chandrasekaran et al. 2010, Amelunxen et al. 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 23

Page 24: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Linear Inverse Problems with Random Data

Theorem 2. [CRPW10; ALMT13] Assume

§ The vector x\ ∈ Rd is unknown

§ The observation z = Ax\ where A ∈ Rm×d is standard normal

§ The vector x solves

minimize f(x) subject to Ax = z

Then

m & δ(D(f,x\)

)=⇒ x = x\ whp [CRPW10; ALMT13]

m . δ(D(f,x\)

)=⇒ x 6= x\ whp [ALMT13]

References: Stojnic 2009, Chandrasekaran et al. 2010, Amelunxen et al. 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 24

Page 25: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Sparse Recovery via `1 Minimization

0 25 50 75 1000

25

50

75

100

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 25

Page 26: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Low-Rank Recovery via S1 Minimization

0 10 20 300

300

600

900

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 26

Page 27: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

.

DemixingStructured Signals

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 27

Page 28: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Example: Demixing Spikes + Sines

0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18!1

!0.5

0

0.5

1

Time (seconds)

axis([A(1:2),1,1]);

Uy0: Signal with sparse DCT

0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18!1

!0.5

0

0.5

1

x0: Sparse noise

Am

plitu

deA

mpl

itude

0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18−1

−0.5

0

0.5

1 Noisy Signal

Am

plitu

de

Time (seconds)

Observation: z = x\ +Uy\ where U is the DCT

References: Starck–Donoho–Candes 2002, Starck–Elad–Donoho 2004

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 28

Page 29: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Convex Demixing Yields...

0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18−1

−0.5

0

0.5

1

Time (seconds)

DemixedOriginal

Am

plitu

de

minimize ‖x‖1 + λ ‖y‖1subject to z = x+Uy

References: Starck–Donoho–Candes 2002, Starck–Elad–Donoho 2004

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 29

Page 30: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Setup for Demixing Problems

§ Let x\ ∈ Rd and y\ ∈ Rd be structured, unknown vectors

§ Let f, g : Rd → R be convex functions that reflect structure

§ Let U ∈ Rd×d be a known orthogonal matrix

§ Observe z = x\ +Uy\

§ Demix via convex program

minimize f(x) subject to g(y) ≤ g(y\)

x+Uy = z

§ Hope: (x, y) = (x\,y\)

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 30

Page 31: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Geometry of Demixing Problems

x\

x\ + D(f,x\)

x\ −UD(g,y\)

Success!

x\

x\ + D(f,x\)

x\ −UD(g,y\)

Failure!

References: McCoy & Tropp 2012, Amelunxen et al. 2013, McCoy & Tropp 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 31

Page 32: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Demixing Problems with Random Incoherence

Theorem 3. [MT12, ALMT13] Assume

§ The vectors x\ ∈ Rd and y\ ∈ Rd are unknown

§ The observation z = x\ +Qy\ where Q is random orthogonal

§ The pair (x, y) solves

minimize f(x) subject to g(y) ≤ g(y\)

x+Qy = z

Then

δ(D(f,x\)

)+ δ(D(g,y\)

). d =⇒ (x, y) = (x\,y\) whp

δ(D(f,x\)

)+ δ(D(g,y\)

)& d =⇒ (x, y) 6= (x\,y\) whp

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 32

Page 33: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Sparse + Sparse via `1 + `1 Minimization

0 25 50 75 1000

25

50

75

100

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 33

Page 34: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Low-Rank + Sparse via S1 + `1 Minimization

0 7 14 21 28 350

245

490

735

980

1225

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 34

Page 35: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

Statistical Dimension & Phase Transitions

§ Key Question: When do two randomly oriented cones strike?

§ Intuition: When do two randomly oriented subspaces strike?

The Approximate Kinematic Formula

Let C and K be closed convex cones in Rd

δ(C) + δ(K) . d =⇒ P {C ∩QK = {0}} ≈ 1

δ(C) + δ(K) & d =⇒ P {C ∩QK = {0}} ≈ 0

where Q is a random orthogonal matrix

References: Amelunxen et al. 2013, McCoy & Tropp 2013

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 35

Page 36: Living on the Edge - California Institute of Technologyusers.cms.caltech.edu/~jtropp/slides/Tro14-Living-Edge-Talk.pdfLiving on the Edge ƒ Phase Transitions in Convex Programs with

To learn more...

E-mail: [email protected]

Web: http://users.cms.caltech.edu/~jtropp

Main Papers Discussed:

§ Chandrasekaran, Recht, et al., “The convex geometry of linear inverse problems.” FOCM 2012§ MT, “Sharp recovery bounds for convex deconvolution, with applications.” FOCM 2014§ ALMT, “Living on the edge: A geometric theory of phase transitions in convex optimization.” I&I 2014§ Oymak & Hassibi, “Asymptotically exact denoising in relation to compressed sensing,” arXiv cs.IT

1305.2714§ MT, “From Steiner formulas for cones to concentration of intrinsic volumes.” DCG 2014§ MT, “The achievable performance of convex demixing,” arXiv cs.IT 1309.7478§ More to come!

Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014 36