Minimizing general submodular functions
CVPR 2015 Tutorial
Stefanie Jegelka, MIT

TRANSCRIPT

Page 1: Minimizing general submodular functions

CVPR 2015 Tutorial
Stefanie Jegelka, MIT

Page 2: The set function view

F(S) = cost of buying the items in S together, or utility, or probability, …

We will assume:
• F(∅) = 0
• a black-box "oracle" to evaluate F

Page 3: Set functions and energy functions

Any set function F : 2^V → R with F(∅) = 0 … is a function on binary vectors!

[Figure: the set A = {a, b} ⊆ V = {a, b, c, d} corresponds to the indicator vector (1, 1, 0, 0).]

binary labeling problems = subset selection problems!
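To make the identification concrete, here is a minimal sketch; the ground set and helper names are illustrative, not from the slides:

```python
# A set S ⊆ V viewed as a binary indicator vector, so a set function
# F(S) is the same object as a pseudo-boolean function on {0,1}^n.

V = ['a', 'b', 'c', 'd']

def indicator(S, V=V):
    """Binary vector x with x_i = 1 iff element i is in S."""
    return [1 if v in S else 0 for v in V]

def support(x, V=V):
    """Inverse map: the subset selected by a binary vector."""
    return {v for v, xi in zip(V, x) if xi == 1}

assert indicator({'a', 'b'}) == [1, 1, 0, 0]
assert support([1, 1, 0, 0]) == {'a', 'b'}
```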

Page 4: Discrete Labeling

[Slide reproduces a page of Int J Comput Vis (2009) 82: 302–324. Fig. 1, "Incorporating higher order potentials for object segmentation": (a) an image from the MSRC-21 dataset; (b), (c), (d) unsupervised segmentations produced with different parameter values of the mean-shift algorithm (Comaniciu and Meer 2002); (e) segmentation from the unary TextonBoost likelihood potentials; (f) inference in a pairwise CRF; (g) segmentation from augmenting the pairwise CRF with higher order potentials defined on the segments in (b), (c), (d); (h) the rough hand-labelled segmentations from the MSRC data set. The higher order potentials yield a significant improvement; for instance, the branches of the tree are much better segmented. Class labels: sky, tree, house, grass.]


Page 5: Summarization

Page 6: Influential subsets

Page 7: Submodularity

extra cost: one drink
extra cost: free refill
→ diminishing marginal costs

Page 8: The big picture

submodular functions touch:
• matroid theory (Whitney 1935)
• game theory (Shapley 1970)
• stochastic processes (Macchi 1975, Borodin 2009)
• graph theory (Frank 1993)
• electrical networks (Narayanan 1997)
• combinatorial optimization
• computer vision & machine learning

[Photos: G. Choquet, J. Edmonds, L.S. Shapley, L. Lovász]

Page 9: Examples

sensing: F(S) = information gained from locations S

Page 10: Example: cover

Page 11: Maximizing influence

(Kempe, Kleinberg & Tardos 2003)

Page 12: Submodular set functions

F : 2^V → R is submodular if either of the following equivalent conditions holds:

• Diminishing gains: for all A ⊆ B ⊆ V and e ∉ B,
  F(A ∪ {e}) − F(A) ≥ F(B ∪ {e}) − F(B)

• Union-Intersection: for all S, T ⊆ V,
  F(S) + F(T) ≥ F(S ∪ T) + F(S ∩ T)

[Figure: adding e to the smaller set A gains at least as much as adding e to the superset B.]

A brute-force check of the diminishing-gains condition is sketched below.
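A minimal sketch of such a check for tiny ground sets; it enumerates all subset pairs, so it is exponential in |V|, and the coverage function at the end is an illustrative example, not from the slides:

```python
from itertools import combinations

def is_submodular(F, V):
    """Check F(A ∪ {e}) - F(A) >= F(B ∪ {e}) - F(B) for all A ⊆ B, e ∉ B."""
    subsets = [frozenset(c) for r in range(len(V) + 1)
               for c in combinations(V, r)]
    for B in subsets:
        for A in subsets:
            if not A <= B:
                continue
            for e in set(V) - B:
                gain_A = F(A | {e}) - F(A)
                gain_B = F(B | {e}) - F(B)
                if gain_A < gain_B - 1e-12:
                    return False
    return True

# Example: a coverage-style function F(S) = |union of covered items|.
cover = {'a': {1, 2}, 'b': {2, 3}, 'c': {3, 4}}
F = lambda S: len(set().union(*(cover[v] for v in S))) if S else 0
print(is_submodular(F, list(cover)))  # True: coverage is submodular
```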

Page 13: Submodularity: boolean & sets

Page 14: Graph cuts

• Cut of a single edge {a, b} with weight w, as a set function of the selected endpoints:
  F(∅) = 0, F({a}) = F({b}) = w, F({a, b}) = 0
• The cut of one edge is submodular: F({a}) + F({b}) = 2w ≥ 0 = F({a, b}) + F(∅).
• Large graph: the cut is a sum over edges.

Useful property: a sum of submodular functions is submodular. (A sketch of this construction follows below.)
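A minimal sketch of a graph cut built as a sum of one-edge cut functions; the graph and weights are illustrative:

```python
# Each one-edge term is submodular, and sums of submodular functions
# are submodular, so the whole cut function is submodular.

edges = {('a', 'b'): 1.0, ('b', 'c'): 2.0, ('a', 'c'): 0.5}

def edge_cut(u, v, w):
    """One-edge cut: w if exactly one endpoint is inside S, else 0."""
    return lambda S: w if (u in S) != (v in S) else 0.0

terms = [edge_cut(u, v, w) for (u, v), w in edges.items()]
cut = lambda S: sum(t(S) for t in terms)

print(cut({'a'}))       # edges leaving {a}: (a,b) and (a,c) -> 1.5
print(cut({'a', 'b'}))  # edges leaving {a,b}: (b,c) and (a,c) -> 2.5
```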

Pages 15–17: Closedness properties

F submodular on V. The following are also submodular:
• Restriction: F′(S) = F(S ∩ W), for a fixed W ⊆ V
• Conditioning: F′(S) = F(S ∪ W), for a fixed W ⊆ V
• Reflection: F′(S) = F(V \ S)
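As a small illustration, the three operations as set-function transformers (function names are ours); each result can be fed back into a submodularity check like the one after Page 12:

```python
def restrict(F, W):
    return lambda S: F(S & W)        # restriction: F'(S) = F(S ∩ W)

def condition(F, W):
    return lambda S: F(S | W)        # conditioning: F'(S) = F(S ∪ W)

def reflect(F, V):
    return lambda S: F(set(V) - S)   # reflection: F'(S) = F(V \ S)
```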

Page 18: Submodular optimization

• subset selection: min / max F(S)
• minimizing submodular functions: next
• maximizing submodular functions: afternoon

convex … and concave aspects!

Page 19: Minimizing submodular functions

Why?
• energy minimization
• variational inference (marginals)
• structured sparse estimation …

How?
• graph cuts: fast, but not always possible
• convex relaxations: can be fast, always possible
• …

Page 20: Submodularity & convexity

Any set function F : 2^V → R with F(∅) = 0 … is a function on binary vectors: a pseudo-boolean function.

[Figure: as on Page 3, the set A = {a, b} ⊆ {a, b, c, d} corresponds to the binary vector (1, 1, 0, 0).]

Page 21: Relaxation: idea

Page 22: A relaxation (extension)

We have F(S), defined on the vertices {0,1}^n of the cube; we want an extension f(x) on all of [0,1]^n that agrees with F at the vertices.

Idea: write x as a convex combination of indicator vectors and interpolate; for a point with sorted coordinates 1.0 ≥ 0.5 ≥ 0.2, the weights are (1.0 − 0.5), (0.5 − 0.2), and 0.2.

Page 23: The Lovász extension

We have F; we want an extension f : [0,1]^n → R. Sort the coordinates of x in decreasing order, x_{σ(1)} ≥ … ≥ x_{σ(n)}, let S_k = {σ(1), …, σ(k)}, and (with x_{σ(n+1)} = 0) define

f(x) = Σ_{k=1}^{n} (x_{σ(k)} − x_{σ(k+1)}) · F(S_k)

A sketch of this computation follows below.
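A minimal sketch of evaluating f by the sorting formula above; the truncation function at the end is an illustrative test case, not from the slides:

```python
def lovasz_extension(F, V, x):
    """f(x) = sum_k (x_(k) - x_(k+1)) * F(S_k), coordinates sorted descending."""
    order = sorted(range(len(V)), key=lambda i: -x[i])
    xs = [x[i] for i in order] + [0.0]           # append x_(n+1) = 0
    value, S = 0.0, set()
    for k, i in enumerate(order):
        S.add(V[i])
        value += (xs[k] - xs[k + 1]) * F(S)      # weight times F(S_k)
    return value

# Sanity check: on a vertex 1_S of the cube, f(1_S) = F(S).
F = lambda S: min(len(S), 2)                     # truncation, submodular
V = ['a', 'b', 'c']
print(lovasz_extension(F, V, [1.0, 1.0, 0.0]))   # = F({a, b}) = 2
print(lovasz_extension(F, V, [1.0, 0.5, 0.2]))   # fractional point: 1.5
```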

Page 24: Examples

• truncation
• cut function: its Lovász extension is the "total variation"
  f(x) = Σ_{(i,j) ∈ E} w_{ij} |x_i − x_j|

Page 25: Alternative characterization

Theorem (Lovász, 1983): the Lovász extension f is convex ⟺ F is submodular.

If F is submodular, this is equivalent to f(x) = max_{s ∈ B(F)} sᵀx, the support function of the base polytope B(F) defined on the next slide.

Page 26: Submodular polyhedra

submodular polyhedron: P(F) = { s ∈ Rⁿ : s(A) ≤ F(A) for all A ⊆ V }, where s(A) = Σ_{a ∈ A} s_a

Base polytope: B(F) = { s ∈ P(F) : s(V) = F(V) }

[Figure: 2D build-up with coordinates s_a, s_b. The constraints s_a ≤ F({a}) and s_b ≤ F({b}) carve out P(F); intersecting with s_a + s_b = F(V) gives the face B(F).]

Page 27: Base polytope

The base polytope has exponentially many constraints!

Edmonds 1970, the "magic": the greedy algorithm computes argmax_{s ∈ B(F)} sᵀx in O(n log n): sort the coordinates of x in decreasing order and set s_{σ(k)} = F(S_k) − F(S_{k−1}).

This is the basis of (almost all) submodular optimization: separation oracle, subgradient, … (a sketch follows below)
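A minimal sketch of the greedy step, in the same conventions as the Lovász-extension sketch above; the returned vertex satisfies sᵀx = f(x), so it is also a subgradient of f at x:

```python
def greedy_vertex(F, V, x):
    """argmax_{s in B(F)} s·x via one sort and n oracle calls."""
    order = sorted(range(len(V)), key=lambda i: -x[i])
    s, S, prev = [0.0] * len(V), set(), 0.0
    for i in order:
        S.add(V[i])
        s[i], prev = F(S) - prev, F(S)   # marginal gain F(S_k) - F(S_{k-1})
    return s

F = lambda S: min(len(S), 2)
V = ['a', 'b', 'c']
s = greedy_vertex(F, V, [1.0, 0.5, 0.2])
print(s)                                 # [1.0, 1.0, 0.0]
print(sum(si * xi for si, xi in zip(s, [1.0, 0.5, 0.2])))  # f(x) = 1.5
```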

Page 28: Base polytopes

[Figure: P(F) and B(F) in 2D (2 elements, axes s₁, s₂) and in 3D (3 elements, axes s₁, s₂, s₃); B(F) is the face of P(F) on which s(V) = F(V).]

Page 29: Convex relaxation

1. relaxation: min_{x ∈ [0,1]ⁿ} f(x) is convex optimization (non-smooth)
2. the relaxation is exact: min_{S ⊆ V} F(S) = min_{x ∈ [0,1]ⁿ} f(x)

⇒ submodular minimization in polynomial time! (Grötschel, Lovász, Schrijver 1981)

Page 30: Submodular minimization

• minimize the relaxation min_{x ∈ [0,1]ⁿ} f(x)
  – subgradient descent (see the sketch below)
  – smoothing (special cases)
• solve the dual: combinatorial algorithms
  – foundations: Edmonds, Cunningham
  – first poly-time algorithms: (Iwata-Fujishige-Fleischer 2001, Schrijver 2000)
  – many more after that …
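A minimal end-to-end sketch of the subgradient route: projected subgradient descent on the Lovász extension over [0,1]ⁿ, with rounding by thresholding. Step sizes, iteration count, and the example function are illustrative choices, not from the slides:

```python
def greedy_subgradient(F, V, x):
    """Edmonds' greedy vertex of B(F): a subgradient of f at x."""
    order = sorted(range(len(V)), key=lambda i: -x[i])
    s, S, prev = [0.0] * len(V), set(), 0.0
    for i in order:
        S.add(V[i])
        s[i], prev = F(S) - prev, F(S)
    return s

def minimize_submodular(F, V, iters=300):
    n, x = len(V), [0.5] * len(V)
    best_S, best_val = set(), F(set())          # F(empty) = 0 by assumption
    for t in range(1, iters + 1):
        g = greedy_subgradient(F, V, x)
        # projected subgradient step with step size 1/sqrt(t)
        x = [min(1.0, max(0.0, xi - gi / t**0.5)) for xi, gi in zip(x, g)]
        for theta in x:                         # round by thresholding
            S = {V[i] for i in range(n) if x[i] >= theta}
            if F(S) < best_val:
                best_S, best_val = S, F(S)
    return best_S, best_val

# Example: submodular F = concave-of-cardinality minus a modular part.
w = {'a': 1.2, 'b': 1.1, 'c': -0.1}
F = lambda S: min(len(S), 2) - sum(w[v] for v in S)
print(minimize_submodular(F, list(w)))          # ({'a', 'b'}, -0.3) up to float error
```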

Page 31: Minimum-norm-point algorithm

proximal problem: min_x f(x) + ½‖x‖²   (f = Lovász extension)
dual: the minimum-norm problem min_{s ∈ B(F)} ‖s‖²

The solution s* rounds directly: S* = {i : s*_i < 0} minimizes F!
(Fujishige '91, Fujishige & Isotani '11)

[Figure: 2D example on elements a and b with the minimum-norm point marked; its negative coordinate (s_a = −1) identifies the minimizer.]

Page 32: The bigger story

projection ↔ proximal ↔ parametric problems
divide-and-conquer
thresholding
(Fujishige & Isotani 11, Nagano, Gallo-Grigoriadis-Tarjan 06, Hochbaum 01, Chambolle & Darbon 09, …)

Page 33: Minimum-norm-point algorithm

[Figure: min-norm point s* = (−0.5, −0.5, 0.8, 1.0) on elements a, b, c, d; the negative coordinates select S* = {a, b}.]

1. optimization: find s* = argmin_{s ∈ B(F)} ‖s‖²
2. rounding: S* = {i : s*_i < 0}

How to solve step 1? The polytope has exponentially many inequalities / faces, BUT we can do linear optimization over B(F) (greedy!)
⇒ Frank-Wolfe or Fujishige-Wolfe algorithm

Page 34: Frank-Wolfe: main idea
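A minimal sketch of plain Frank-Wolfe for the minimum-norm point, using Edmonds' greedy as the linear minimization oracle; the Fujishige-Wolfe algorithm uses smarter affine-hull steps but the same oracle. The step rule 2/(t+2) and the example F are illustrative:

```python
def greedy_vertex(F, V, x):
    # same greedy as before, repeated so the sketch is self-contained
    order = sorted(range(len(V)), key=lambda i: -x[i])
    s, S, prev = [0.0] * len(V), set(), 0.0
    for i in order:
        S.add(V[i])
        s[i], prev = F(S) - prev, F(S)
    return s

def min_norm_point(F, V, iters=2000):
    s = greedy_vertex(F, V, [0.0] * len(V))        # any vertex of B(F)
    for t in range(iters):
        # gradient of ||s||^2 is 2s; LMO: argmin_{v in B(F)} <s, v>
        # = argmax <-s, v>, i.e. greedy on -s
        v = greedy_vertex(F, V, [-si for si in s])
        gamma = 2.0 / (t + 2.0)
        s = [(1 - gamma) * si + gamma * vi for si, vi in zip(s, v)]
    return s

w = {'a': 1.2, 'b': 1.1, 'c': -0.1}
F = lambda S: min(len(S), 2) - sum(w[v] for v in S)
s = min_norm_point(F, list(w))
print({list(w)[i] for i in range(3) if s[i] < 0})  # {'a', 'b'} minimizes F
```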

Page 35: Empirically

[Figure from Bach, 2012: convergence of the relaxation vs. convergence of the discrete solution S for the min-norm-point algorithm.]

Page 36: Recap: links to convexity

• submodular function F(S)
• convex extension f(x): we can compute it!
• submodular minimization as convex optimization: we can solve it!
• What can we do with it?

Page 37: Links to convexity

What can we do with it?
• MAP inference / energy minimization (out of the box)
• variational inference (Djolonga & Krause 2014)
• structured sparsity (Bach 2010)
• decomposition & parallel algorithms

Page 38: Structured sparsity and submodularity

Page 39: Sparse reconstruction

Assumption: x is sparse. Recovering its support is subset selection, e.g. S = {1, 3, 4, 7}.

• discrete regularization on the support S of x
• relax to the convex envelope

But the sparsity pattern is often not random …

Page 40: Structured sparsity

[Figure: measurement model y ≈ M x, with a structured-sparse x.]

Assumption: the support of x has structure. Express it by a set function!

Page 41: Preference for trees

Set function: F(T) < F(S) whenever T is a tree and S is not, with |S| = |T|.

Use it as a regularizer?

Page 42: Sparsity

discrete regularization on the support S of x; relax to the convex envelope:
• x sparse: F(S) = |S|
• x structured sparse: a submodular function F, relaxed via its Lovász extension

Optimization: submodular minimization (min-norm point) (Bach 2010). A sketch of the resulting regularizer follows below.
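A minimal sketch of the resulting regularizer Ω(x) = f(|x|), the Lovász extension evaluated at the absolute values: with F(S) = |S| this reduces to the ℓ₁ norm, while other submodular F give structured norms. The second F below is an illustrative alternative, not from the slides:

```python
def lovasz_extension(F, n, z):
    order = sorted(range(n), key=lambda i: -z[i])
    zs = [z[i] for i in order] + [0.0]
    val, S = 0.0, set()
    for k, i in enumerate(order):
        S.add(i)
        val += (zs[k] - zs[k + 1]) * F(S)
    return val

def omega(F, x):
    """Structured-sparsity regularizer: Lovász extension at |x|."""
    return lovasz_extension(F, len(x), [abs(xi) for xi in x])

x = [0.5, -0.2, 0.0, 1.0]
print(omega(lambda S: len(S), x))          # 1.7 = the l1 norm of x
print(omega(lambda S: min(len(S), 2), x))  # a structured alternative
```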

Page 43: Special case

• minimize a sum of submodular functions, F(S) = Σᵢ Fᵢ(S), where each Fᵢ is "easy"
• combinatorial algorithms (Kolmogorov 12, Fix-Joachims-Park-Zabih 13, Fix-Wang-Zabih 14)
• convex relaxations

Page 44: Relaxation

• convex Lovász extension: minₓ Σᵢ fᵢ(x) is a tight relaxation
• dual decomposition: parallel algorithms (Komodakis-Paragios-Tziritas 11, Savchynskyy-Schmidt-Kappes-Schnörr 11, Jegelka-Bach-Sra 13); a simple way to exploit the sum structure is sketched below
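This is not the dual-decomposition algorithms cited above, just the simplest use of the decomposition: each easy piece Fᵢ yields its own greedy subgradient, computable independently (hence in parallel), and the pieces sum to a subgradient of f = Σᵢ fᵢ:

```python
def greedy_vertex(F, V, x):
    order = sorted(range(len(V)), key=lambda i: -x[i])
    s, S, prev = [0.0] * len(V), set(), 0.0
    for i in order:
        S.add(V[i])
        s[i], prev = F(S) - prev, F(S)
    return s

def sum_subgradient(Fs, V, x):
    """Subgradient of sum_i f_i at x; each term could run in parallel."""
    parts = [greedy_vertex(Fi, V, x) for Fi in Fs]
    return [sum(p[i] for p in parts) for i in range(len(V))]
```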

Page 45: Results: dual decomposition

[Figure: log₁₀(duality gap) vs. iteration (20–100) for subgrad, BCD, DR, fista-smooth, dual-dec, and primal-smoothed; panels show convergence of the discrete problem, relaxation I, and relaxation II. Smooth vs. non-smooth duals: the smooth-dual parallel algorithms are faster.]

(Jegelka, Bach, Sra 2013; Nishihara, Jegelka, Jordan 2014)

Page 46: Summary

• Submodular functions: diminishing returns / costs
• convex relaxations:
  – exact relaxation
  – structured norms
  – fast algorithms
• more soon:
  – constraints
  – maximization: diversity, information