CS898 Topic 4 Robust multi-label regularization


Page 1:

CS898

Topic 4

Robust multi-label regularization

Page 2:

Topic overview

Multi-label problems:

• Stereo, restoration, texture synthesis, multi-object segmentation

Types of pair-wise pixel interactions

• Convex interactions

• Discontinuity preserving interactions

Energy minimization algorithms:

• Ishikawa (convex)

• α-expansions (robust metric interactions)

• ICM, simulated annealing, message passing, etc. (general)

Extra Reading: Szeliski Ch 3.7

Page 3:

Example of a binary labeling problem

Object/background segmentation (topic 2) is an example of a binary labeling problem.

feasible labels S_p ∈ {0,1} at any pixel p

[figure: segmentation into object S = {p | S_p = 1} and background {p | S_p = 0}]

Page 4:

Example of a binary labeling problem

Object/background segmentation (topic 2) is an example of a binary labeling problem.

feasible labels L_p ∈ {0,1} at any pixel p

labeling of image pixels: L = (L_1, ..., L_n), or equivalently L = {L_p | p}

[figure: object S = {p | L_p = 1} and background {p | L_p = 0}]

For convenience this topic uses L_p (label at p) instead of S_p (segment at p).

Page 5:

Example of a multi-label problem

Stereo (topic 3) is an example of an image labeling problem with non-binary labels.

feasible disparities L_p ∈ {0, 1, 2, 3, ..., n} at any pixel p

labeling of image pixels: L = (L_1, ..., L_n), or equivalently L = {L_p | p}

[figure: depth map L]

In topic 3 we used the equivalent notation d = {d_p | p}.

Page 6:

Remember:

stereo with s-t graph cuts [Roy&Cox’98]


Page 7:

Remember:

stereo with s-t graph cuts [Roy&Cox’98]

[figure: s-t graph over pixels p (axes x, y) and disparity labels; a "cut" between terminals s and t defines the disparity L(p) at each pixel p]

Page 8:

Multi-label energy minimization

with s-t graph cuts [Ishikawa 1998, 2003]

Exact optimization for convex pairwise potentials V(ΔL), ΔL = L_p − L_q.

The graph construction for linear interactions extends to "convex" interactions:

E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p − L_q)

Works only for 1D labels L_p ∈ R¹.

Page 9:

Q: Are convex regularization models good enough for general labeling problems in vision?

A: No

(see the following discussion)

Page 10:

Reconstruction in Vision: (a basic example)

observed noisy image I = {I_1, I_2, ..., I_n}
image labeling L = {L_1, L_2, ..., L_n} (restored intensities)

[figure: profiles of I and L along one scan line in the image]

How to compute L from I?

Page 12:

Energy minimization (discrete approach)

Markov Random Fields (MRF) framework

• weak membrane model (Geman&Geman'84, Blake&Zisserman'83,87)

E(L) = Σ_p (L_p − I_p)² + Σ_{pq∈N} V(L_p, L_q)   for labeling L : Z² → Z

(the first term is data fidelity, the second is spatial regularization)

[figure: discontinuity preserving potentials V with threshold T (Blake&Zisserman'83,87); neighboring labels L_p, L_q; piece-wise smooth labeling]

Page 13:

Basic pairwise potentials V(α,β):

E(L) = Σ_p (L_p − I_p)² + Σ_{pq∈N} V(L_p, L_q)

Convex regularization
• gradient descent works
• exact polynomial algorithms (Ishikawa)

TV (total variation) regularization (extreme case of convex)
• a bit harder (non-differentiable)
• global minima algorithms (Ishikawa, Hochbaum, Nikolova et al.)

Robust regularization ("discontinuity-preserving")
• bounded potentials (e.g. truncated convex)
• NP-hard, many local minima
• good approximations (message passing, α-expansion)

Page 14:

Robust pairwise regularization

Robust regularization ("discontinuity-preserving")
• bounded potentials (e.g. truncated convex)
• NP-hard, many local minima
• good approximations (message passing, α-expansion)

E(L) = Σ_p (L_p − I_p)² + Σ_{pq∈N} V(L_p, L_q)

[figure: neighboring labels L_p, L_q; piece-wise smooth labeling]

Page 15:

Robust pairwise regularization

Robust regularization ("discontinuity-preserving")
• bounded potentials (e.g. Ising or Potts model)
• NP-hard, many local minima
• provably good approximations (α-expansion) via max-flow/min-cut algorithms

E(L) = Σ_p (L_p − I_p)² + Σ_{pq∈N} V(L_p, L_q)

[figure: weak membrane, piece-wise smooth labeling (L_p, L_q) vs. piece-wise constant labeling into segments {p : L_p = 0}, {p : L_p = 1}, {p : L_p = 2} ("perceptual grouping")]

Page 16:

Potts model (piece-wise constant labeling)

Robust regularization ("discontinuity-preserving")
• bounded potentials (e.g. Ising or Potts model)
• NP-hard, many local minima
• provably good approximations (α-expansion) via max-flow/min-cut algorithms

E(L) = Σ_p (L_p − I_p)² + Σ_{pq∈N} V(L_p, L_q)

Page 17:

Potts model (piece-wise constant labeling)

[figure: left eye image, right eye image, depth layers]

E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p, L_q)

Robust regularization ("discontinuity-preserving")
• bounded potentials (e.g. Ising or Potts model)
• NP-hard, many local minima
• provably good approximations (α-expansion) via max-flow/min-cut algorithms

Page 18:

Potts model (piece-wise constant labeling)

V(L_p, L_q) = C · [L_p ≠ L_q]   (0 for equal labels, constant C otherwise)

E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p, L_q)

Robust regularization ("discontinuity-preserving")
• bounded potentials (e.g. Ising or Potts model)
• NP-hard, many local minima
• provably good approximations (α-expansion) via max-flow/min-cut algorithms

Page 19:

Pairwise interactions V: "convex" vs. "discontinuity-preserving"

"convex" interactions V(ΔL), ΔL = L_p − L_q:
• "quadratic" model: smooth labeling
• "linear" model (TV): smooth labeling (with some discontinuity robustness)

robust "discontinuity preserving" interactions V(ΔL):
• bounded models (truncated convex): piecewise smooth labeling
• Potts model: piecewise constant labeling

see comparison in the next slides

Page 20:

Pairwise interactions V: "convex" vs. "discontinuity-preserving"

NOTE: optimization of the restoration energy with a quadratic regularization term

E(L) = Σ_p (L_p − I_p)² + Σ_{pq∈N} (L_p − L_q)²

relates to noise reduction via mean-filtering.

Indeed, the optimum labeling L satisfies ∂E/∂L_p = 0 for all p, i.e.

(L_p − I_p) + Σ_{q∈N_p} (L_p − L_q) = 0   =>   L_p = (I_p + Σ_{q∈N_p} L_q) / (1 + |N_p|)

Can solve this linear system, of course. One approach - fixed point iterations:

L_p^{t+1} = (I_p + Σ_{q∈N_p} L_q^t) / (1 + |N_p|)

That is, start at L⁰ = I and iteratively update each pixel's label to a weighted average of the observed intensity I_p and the mean current label in its neighborhood N_p.
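Below is a minimal NumPy sketch of these fixed point iterations on a 1D scan line; the chain neighborhood and the explicit smoothness weight lam are illustrative assumptions (the formulas above correspond to lam = 1):

import numpy as np

def restore_quadratic(I, lam=1.0, iters=200):
    # fixed point iterations for E(L) = sum_p (L_p - I_p)^2 + lam * sum_pq (L_p - L_q)^2
    # on a 1D chain: each update sets L_p to a weighted average of I_p and its neighbors
    L = I.astype(float).copy()              # start at L0 = I
    for _ in range(iters):
        nsum = np.zeros_like(L)             # sum of neighboring labels
        nsum[1:] += L[:-1]
        nsum[:-1] += L[1:]
        ndeg = np.full_like(L, 2.0)         # interior pixels have 2 neighbors
        ndeg[0] = ndeg[-1] = 1.0
        L = (I + lam * nsum) / (1.0 + lam * ndeg)
    return L

# a noisy step edge: quadratic regularization visibly over-smooths the jump
I = np.concatenate([np.zeros(50), np.ones(50)]) + 0.1 * np.random.randn(100)
L = restore_quadratic(I, lam=2.0)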

Page 22:

Pairwise interactions V: "convex" vs. "discontinuity-preserving"

[figure: blurred edge of I along one scan line in the image; quadratic restoration result L]

quadratic (convex) may over-smooth (similarly to mean filtering)

NOTE: minimizing the sum of quadratic differences Σ_{pq∈N} (L_p − L_q)² prefers to split one large jump into many small ones.

Page 23:

Pairwise interactions V: "convex" vs. "discontinuity-preserving"

[figure: blurred edge of I along one scan line; linear (TV) restoration result L, shown with histogram equalization]

quadratic (convex) may over-smooth (similarly to mean filtering)

linear (TV) may create "stair-case", but has better robustness (similar to median vs. mean)

NOTE: minimizing the sum of absolute differences Σ_{pq∈N} |L_p − L_q| does not care how a large jump is split (the sum does not change) => no over-smoothing.

Page 24:

Pairwise interactions V: "convex" vs. "discontinuity-preserving"

[figure: blurred edge of I along one scan line; Potts restoration result L, shown with histogram equalization]

quadratic (convex) may over-smooth (similarly to mean filtering)

linear (TV) may create "stair-case"

bounded (e.g. Potts) may create "false banding" (next slide)

NOTE: minimizing the sum of bounded differences Σ_{pq∈N} [L_p ≠ L_q] prefers one large jump to splitting into smaller ones => restores sharp boundaries.

Page 25:

Pairwise Regularization Models (comparison)

restoration of noisy images with:

model                | labeling type      | typical artifacts
linear (TV)          | smooth             | stair-casing
quadratic            | smooth             | over-smoothing
truncated linear     | piecewise smooth   | stair-casing
truncated quadratic  | piecewise smooth   | stair-casing
Potts                | piecewise constant | banding

(convex models: linear/TV, quadratic; discontinuity-preserving models: truncated linear, truncated quadratic, Potts)

common artifacts: over-smoothing, stair-casing, banding

Page 26:

Optimization for "discontinuity preserving" models

NP-hard problem (3 or more labels)
• two labels can be solved via s-t cuts

α-expansion approximation algorithm [BVZ 1998] (lots of code available)
• guaranteed approximation quality [Veksler, 2001]
  - within a factor of 2 from the global minimum (Potts model)

Many other (small or large) move making algorithms
- α/β-swap, jump moves, range moves, fusion moves, etc.

LP relaxations, mean-field approximation, message passing
- e.g. LBP [Weiss&Freeman], TRWS [Kolmogorov & Wainwright]

Other MRF techniques (simulated annealing, ICM)

Variational formulations (continuous)
- e.g. convex approaches [Chambolle, Pock, Cremers, Darbon]

Page 27:

α-expansion move

[figure: in an α-expansion move, label α grabs an arbitrary region from the other labels]

Basic idea is motivated by methods for the multi-way cut problem (similar to Potts model):

Break computation into a sequence of binary s-t cuts.

Page 28:

α-expansion (binary move) optimizes a submodular set function

L - current labeling {L_p | p}

Any "expansion" of label α corresponds to some subset S of pixels (shaded area): pixels in S switch to α, the rest keep their current labels,

L_p(S_p) = α if S_p = 1,   L_p(S_p) = L_p if S_p = 0.

The energy of the move is a set function with unary and pairwise terms E_p(S_p) and E_pq(S_p, S_q):

Ê(S) = Σ_p E_p(S_p) + Σ_{pq∈N} E_pq(S_p, S_q)

Page 29:

α-expansion (binary move) optimizes a submodular set function

L - current labeling {L_p | p}

Ê(S) = Σ_p E_p(S_p) + Σ_{pq∈N} E_pq(S_p, S_q)

unary terms E_p(S_p):

S_p:     0          1
E_p:   E_p(L_p)   E_p(α)

Page 30:

α-expansion (binary move) optimizes a submodular set function

L - current labeling {L_p | p}

Ê(S) = Σ_p E_p(S_p) + Σ_{pq∈N} E_pq(S_p, S_q)

pairwise terms E_pq(S_p, S_q):

                S_q = 0           S_q = 1
S_p = 0      E_pq(L_p, L_q)    E_pq(L_p, α)
S_p = 1      E_pq(α, L_q)      E_pq(α, α)

Page 31:

α-expansion (binary move) optimizes a submodular set function

L - current labeling {L_p | p}

pairwise terms E_pq(S_p, S_q):

                S_q = 0           S_q = 1
S_p = 0      E_pq(L_p, L_q)    E_pq(L_p, α)
S_p = 1      E_pq(α, L_q)      E_pq(α, α)

The set function Ê(S) is submodular if

Ê_pq(0,0) + Ê_pq(1,1) ≤ Ê_pq(0,1) + Ê_pq(1,0)

Page 32:

α-expansion (binary move) optimizes a submodular set function

L - current labeling {L_p | p}

The set function Ê(S) is submodular if, for every pair pq ∈ N,

E_pq(L_p, L_q) + E_pq(α, α) ≤ E_pq(L_p, α) + E_pq(α, L_q)

Since E_pq(α, α) = 0, this is the triangle inequality for E(a,b) = ||a − b||.

Page 33:

α-expansion (binary move) optimizes a submodular set function

L - current labeling {L_p | p}

α-expansion moves are submodular if E_pq(a, b) is a metric on the space of labels.

[BVZ, PAMI 2001]
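A small Python check of this condition on the 2x2 move tables; the helper and the example potentials are illustrative, not from the slides:

def expansion_is_submodular(V, Lp, Lq, alpha):
    # submodularity of one alpha-expansion move:
    # E_pq(L_p, L_q) + E_pq(alpha, alpha) <= E_pq(L_p, alpha) + E_pq(alpha, L_q)
    return V(Lp, Lq) + V(alpha, alpha) <= V(Lp, alpha) + V(alpha, Lq)

potts = lambda a, b: 0.0 if a == b else 1.0
trunc_linear = lambda a, b: min(abs(a - b), 2.0)    # truncated L1: still a metric
quadratic = lambda a, b: (a - b) ** 2               # not a metric

print(expansion_is_submodular(potts, 3, 5, 7))          # True
print(expansion_is_submodular(trunc_linear, 0, 4, 2))   # True
print(expansion_is_submodular(quadratic, 0, 4, 2))      # False: 16 + 0 > 4 + 4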

Page 34:

Examples of metric pairwise interactions

The linear (TV) potential V(a,b) = |a − b| is a metric: just check the triangle inequality. But this is only a special case of the L2 metric for 1D labels; in general, for labels in Rⁿ this metric is defined as V(a,b) = ||a − b||.

FACT (easy to prove): any truncated metric is also a metric. In particular, truncated L2 is also a metric.

Potts is another important example of a metric.

Quadratic (squared L2) and truncated quadratic potentials are not metrics. In such cases other very good approximation algorithms are available (e.g. TRWS, Kolmogorov & Wainwright 2006).

Note: unlike Ishikawa, α-expansion and other methods (LBP, TRWS, etc.) apply to labels in arbitrary spaces (e.g. Rⁿ), not only 1D.

Page 35:

α-expansion algorithm

1. Start with any initial solution
2. For each label "α" in any (e.g. random) order:
   a. Compute the optimal α-expansion move (s-t graph cuts)
   b. Decline the move if there is no energy decrease
3. Stop when no expansion move would decrease the energy
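A minimal Python sketch of this outer loop; solve_expansion_move is a hypothetical solver returning the best binary move for label alpha (in practice computed with an s-t cut / max-flow library), so only the control flow below follows the slide:

def alpha_expansion(L, labels, energy, solve_expansion_move):
    # L: current labeling, labels: candidate labels, energy: callable E(L),
    # solve_expansion_move(L, alpha): best labeling reachable by expanding alpha
    improved = True
    while improved:                          # stop when no move decreases energy
        improved = False
        for alpha in labels:                 # any (e.g. random) order
            L_new = solve_expansion_move(L, alpha)
            if energy(L_new) < energy(L):    # decline non-improving moves
                L, improved = L_new, True
    return L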

Page 36:

α-expansion moves

[figure: initial solution followed by a sequence of expansion moves for different labels (α-expansion, β-expansion, γ-expansion, ...)]

In each α-expansion a given label "α" grabs space from other labels.

For each move we choose the expansion that gives the largest decrease in the energy: a binary optimization problem.

Page 37:

Multi-way graph cuts: stereo vision

[figure: original pair of "stereo" images (left image, right image) and computed depth maps: ground truth, BVZ 1998, KZ 2002]

Page 38:

α-expansions vs. ICM

Iterated Conditional Modes (ICM): a basic general alternative [Besag, JRSS'86]

Example: consider the pairwise energy

E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V_pq(L_p, L_q)

Unlike α-expansion, ICM optimizes only over a single pixel at a time => local (pixel-wise) optimization.

ICM algorithm:
- Consider any fixed current labeling L^t and any given pixel p.
- Treat the label L_p as the only optimization variable x ∈ {0,1,2,...,n}, keeping the other labels L_q, q ≠ p, fixed.
- This reduces the energy to E_p(x) = D_p(x) + Σ_{q∈N_p} V_pq(x, L_q).
- Select the optimal x by enumeration and set the new label L_p = x.
- Iterate over all pixels (in any fixed or random order) until no pixel operation reduces the energy.
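A compact NumPy sketch of ICM on a 1D chain; the quadratic data term D_p(x) = (x − I_p)² and the Potts example are illustrative choices:

import numpy as np

def icm(I, labels, V, sweeps=10):
    # greedy single-pixel relabeling: L_p <- argmin_x (x - I_p)^2 + sum_q V(x, L_q)
    L = I.copy()
    for _ in range(sweeps):
        changed = False
        for p in range(len(L)):
            nbrs = [L[q] for q in (p - 1, p + 1) if 0 <= q < len(L)]
            costs = [(x - I[p]) ** 2 + sum(V(x, Lq) for Lq in nbrs) for x in labels]
            best = labels[int(np.argmin(costs))]
            if best != L[p]:
                L[p], changed = best, True
        if not changed:                  # no pixel operation reduced the energy
            break
    return L

V = lambda a, b: 0.0 if a == b else 1.5          # Potts potential
I = np.array([0, 0, 1, 0, 0, 5, 5, 4, 5, 5])
print(icm(I, labels=list(range(6)), V=V))        # smooths the two outliers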

Page 39:

α-expansions vs. simulated annealing

Simulated annealing (SA) [Geman&Geman 1984] incorporates the following randomization strategy into ICM (see the ICM algorithm on the previous slide): at each pixel p the label L_p = x is updated randomly according to the probabilities

Pr(x) ∝ exp(−E_p(x) / T)   for x ∈ {0, 1, 2, ..., n}     (related to "soft-max")

NOTE 1: lower energy E_p(x) gives x more chances to be selected.
NOTE 2: higher temperature parameter T means more randomness; as T → 0, SA reduces to ICM (the optimal x is always selected).

Typical SA starts with a high T and gradually reduces T to zero.

Unlike α-expansion, SA optimizes only over a single pixel at a time => local (pixel-wise) optimization.

(Szeliski: appendix B.5.1)
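A sketch of this per-pixel randomized update, reusing the local energies E_p(x) from the ICM example (names are illustrative):

import numpy as np

def gibbs_update(Ep, T, rng=None):
    # sample a label index with Pr(x) ~ exp(-E_p(x)/T); as T -> 0 this
    # concentrates on argmin E_p(x), i.e. the deterministic ICM update
    rng = rng or np.random.default_rng()
    logits = -np.asarray(Ep, dtype=float) / T
    logits -= logits.max()               # stabilize the soft-max
    probs = np.exp(logits)
    return rng.choice(len(probs), p=probs / probs.sum())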

Page 40:

α-expansions vs. simulated annealing: small vs. large moves

[figure: initial labeling; one-pixel local move (ICM or SA); large move (α-β swap); large move (α-expansion)]

α-expansion and ICM/SA are greedy iterative methods converging to different kinds of "local" minima:

- an ICM/SA solution cannot be improved by changing the label of any one pixel to any given label α
- an α-expansion solution cannot be improved by changing any subset of pixels to any given label α

Page 41:

α-expansions vs. simulated annealing

[figure: depth maps; normalized correlation (start for annealing): 24.7% err; α-expansions (BVZ 98,01): 90 seconds, 5.8% err]

Page 42:

α-expansions vs. simulated annealing

[figure: depth maps; simulated annealing: 19 hours, 20.3% err; α-expansions (BVZ 98,01): 90 seconds, 5.8% err]

[plot: smoothness energy vs. time in seconds (1 to 100000, log scale) for annealing and α-expansion ("our method") [BVZ, 2001]]

NOTE 1: ICM and SA are general methods applicable to arbitrary non-metric and high-order energies.
NOTE 2: nowadays there are other general methods based on graph cuts, message passing, relaxations, etc.

Page 43:

Other applications

Graph-cut textures (Kwatra, Schodl, Essa, Bobick 2003), similar to "image-quilting" (Efros & Freeman, 2001)

[figure: texture patches A-J stitched along graph-cut seams]

Page 44:

Other applications

Graph-cut textures (Kwatra, Schodl, Essa, Bobick 2003)

Page 45:

Other applications

Multi-object Extraction

Obvious generalization of the binary object extraction technique [BJ'01]

Page 46:

Some computational photography applications

Image compositing (Agarwala et al. 2004, see Szeliski Sec 9.3.2)

Page 47:

Color model fitting (multi-label version of Chan-Vese)

Block-coordinate descent alternating α-expansion (for segmentation L) and fitting colors I^i:

E(L, I⁰, I¹, ...) = Σ_{p: L_p=0} (I_p − I⁰)² + Σ_{p: L_p=1} (I_p − I¹)² + ... + Σ_{pq∈N} w_pq [L_p ≠ L_q]

where the last sum is the Potts model.
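A sketch of this block-coordinate descent; the segmentation step is delegated to a hypothetical segment_step (e.g. α-expansion on the Potts energy), while the model step refits each color to the mean of its segment:

import numpy as np

def fit_colors(I, L, k):
    # model step: optimal I^i is the mean intensity over segment {p : L_p = i}
    return np.array([I[L == i].mean() if np.any(L == i) else 0.0 for i in range(k)])

def color_model_fitting(I, L0, k, segment_step, iters=10):
    # alternate fitting colors I^i and re-segmenting L (multi-label Chan-Vese)
    L = L0.copy()
    for _ in range(iters):
        colors = fit_colors(I, L, k)     # fix L, fit the color models
        L = segment_step(I, colors)      # fix colors, optimize segmentation L
    return L, colors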

Page 48:

Stereo via piece-wise constant plane fitting [Birchfield & Tomasi 1999]

Block-coordinate descent alternating α-expansion (for segmentation L) and fitting affine transforms T^i.

Models T = parameters of affine transformations T(p) = A p + b, with a 2x2 matrix A and a 2x1 vector b:

E(L, T⁰, T¹, ...) = Σ_{p: L_p=0} (I′_{T⁰(p)} − I_p)² + Σ_{p: L_p=1} (I′_{T¹(p)} − I_p)² + ... + Σ_{pq∈N} w_pq [L_p ≠ L_q]

where I′ is the second image and the last sum is the Potts model.

Page 49:

Piece-wise smooth local plane fitting [Olsson et al. 2013]

Block-coordinate descent alternating α-expansion (for segmentation L) and fitting planes T^i:

E(L, T⁰, T¹, ...) = Σ_{p: L_p=0} (I′_{T⁰(p)} − I_p)² + Σ_{p: L_p=1} (I′_{T¹(p)} − I_p)² + ... + Σ_{pq∈N} w_pq V(L_p, L_q)

where the pairwise potentials V are truncated angle-differences between the fitted planes: non-metric interactions that need other optimization.

Page 50:

Signboard segmentation [Milevsky 2013]

Block-coordinate descent alternating α-expansion (for segmentation L) and fitting color space planes C^i.

Labels = planes in RGBXY space: C(p) = A p + b, with a 3x2 matrix A and a 3x1 vector b:

E(L, C⁰, C¹, ...) = Σ_{p: L_p=0} ||C⁰(p) − I_p||² + Σ_{p: L_p=1} ||C¹(p) − I_p||² + ... + Σ_{pq∈N} w_pq [L_p ≠ L_q]

where the last sum is the Potts model.

Page 51:

Signboard segmentation [Milevsky 2013]

Goal: detection of characters, then text line fitting and translation.

Page 52:

Vessel extraction [Marin et al. ICCV 2015]

3D reconstruction of heart vessel center lines:

E(L) = Σ_p ||L_p − p||² + Σ_{pq∈N} V(L_p, L_q)

with truncated angle-differences as the pairwise potentials V.

Page 53:

Learning pair-wise potentials

Structure detection (Kumar and Hebert 2006, see Szeliski Sec 3.7)

[figure: results with standard (hand-tuned) pairwise potentials vs. learned pairwise potentials]

Page 54:

MRF/CRF models in CNN segmentation

- Postprocessing of CNN results
  - e.g. improves edge alignment lost due to reduced resolution in CNNs

- Integrated as trainable or pass-through layers (RNNs, GraphNNs)
  - typically mimicking convolution-like local optimization operations (e.g. message passing)
  - currently limited to relatively weak regularization models/optimization (e.g. dense-CRF due to its better amenability to simpler optimizers)

- Weakly supervised and unsupervised training
  - proposal generation
  - loss functions
  - stronger algorithms may be used in loss optimization

Page 55:

Next

Dense CRF, mean-field approximation?

Relaxations?

Geometric model fitting? Multi-part object fitting?

Single-view reconstruction?

Intro to learning, detection, CNN segmentation?

Student presentations

Page 56:

Multi-label Problems:

Linear/Unary, Quadratic, and other

Approximations and Relaxations

Page 57:

Approximations and Relaxations: unary/linear, quadratic, etc.

Observation: as discussed earlier in Topic 2, the arity of an energy potential corresponds to the order of the polynomial expressing such a potential via indicator variables. For example, unary potentials can be interpreted as linear functions.

Indeed,

Σ_p D_p(L_p) = Σ_p Σ_k D_p(k) x_p^k,   where x_p^k = [L_p = k] is an indicator of p being assigned label k,

and letting each vector x_p = (x_p^0, ..., x_p^n) range over the probability simplex (so that x_p^k become "probabilities" of labels at point p) yields a linear relaxation over the simplex.

However, an objective/energy/loss may have equivalent representations or relaxations of different order.

Page 58:

Approximations and Relaxations: unary/linear, quadratic, etc.

global "linearization"
- Schlesinger LP
- QPBO
- TRWS

local "linearization"
- gradient descent
- parallel ICM

probabilistic "linearization"
- unary mean field approximation
- deterministic annealing

higher-order approaches
- quadratic IP
- quadratic relaxations

Due to simplicity, linear approximations are particularly common.

Page 59:

local linearization: first-order approximation (gradient descent)

Example: pairwise (or second-order) energy

E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p, L_q)

Assume real-valued labels and some continuous (differentiable) functions D and V. Then the components of the gradient are

∂E/∂L_p = D_p′(L_p) + Σ_{q∈N_p} ∂V(L_p, L_q)/∂L_p

where the neighbor terms act as "messages" from neighboring nodes.

Page 60:

local linearization: first-order approximation (gradient descent)

For the pairwise energy E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p, L_q), with real-valued labels and continuous functions D and V, steepest descent iterates

L^{t+1} = L^t − Δt ∇E(L^t)

Questions: how to choose the step size Δt; labels may not be real-valued; relaxations of functions D and V may not be obvious.

Page 61:

local linearization: Parallel ICM

Example: pairwise (or second-order) energy E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p, L_q).

Note: for simplicity, here we assume the sum is over directed pairs.

Consider the decomposition into the same local energies as in ICM (see earlier), E_p(x) = D_p(x) + Σ_{q∈N_p} V(x, L_q). At any given current labeling L^t this gives a unary approximation of the energy:

E(L) ≈ Σ_p E_p(L_p), with the neighbors inside each E_p frozen at L^t.

Question: Minimizing a unary approximation is easy, but in what sense is this approximation good? In fact, the original energy E may go up after parallel ICM (minimizing the right-hand side). Why? (See the sketch below.)
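A vectorized sketch contrasting parallel ICM with the sequential version shown earlier: every pixel is re-labeled simultaneously from the same frozen snapshot L^t, which is exactly why the energy can oscillate or increase (the data term and potentials below are illustrative):

import numpy as np

def parallel_icm_step(L, I, labels, V):
    # minimize each unary approximation E_p(x) = (x - I_p)^2 + sum_q V(x, L_q^t)
    # for all pixels at once, with neighbors frozen at the snapshot L^t
    costs = []
    for x in labels:
        c = (x - I).astype(float) ** 2
        c[1:] += V(x, L[:-1])            # left neighbors from snapshot L^t
        c[:-1] += V(x, L[1:])            # right neighbors from snapshot L^t
        costs.append(c)
    return np.array(labels)[np.argmin(costs, axis=0)]

V = lambda a, b: 1.5 * (a != b)          # elementwise Potts potential
I = np.array([0, 0, 1, 0, 5, 5, 4, 5])
L = parallel_icm_step(I, I, [0, 1, 4, 5], V)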

Page 62:

“probabilistic linearization”: Mean-Field Approximation

Example: pairwise (or second-order) energy E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p, L_q).

GOAL: find a simpler (e.g. unary) energy U "approximating" E.

One general approximation technique: mean-field approximation [see more in Bishop, Chapter 10.1, on variational inference].

Find U minimizing the KL-divergence between the Gibbs distributions corresponding to the energies E and U:

G(L) = exp(−E(L)) / Z   and   G_u(L) = exp(−U(L)) / Z_u

Comments:
- a Gibbs distribution G(L) defines the probability of state L with given energy E(L); Z and Z_u are the corresponding normalization constants
- KL-divergence is a distance measure between two distributions (already used in Topic 2); in this case it also implicitly defines the quality of approximation of energy E by U.

Page 63:

“probabilistic linearization”: Mean-Field Approximation

Mean field approximation: U(L) = Σ_p U_p(L_p)

NOTE: a unary energy U corresponds to a "factorized" Gibbs distribution G_u(L) = Π_p b_p(L_p) with independent variables L_p, where the beliefs b_p are the "soft-max" of the potentials U_p:

b_p(L_p) = exp(−U_p(L_p)) / Σ_k exp(−U_p(k))     (probabilities / beliefs)

The optimal distributions b_p that give the minimum KL divergence between G_u and G are also called pseudo-marginals for the joint distribution G. They are often called simply "marginals" or "marginal distributions", even though they are not.

Page 64:

“probabilistic linearization”: Mean-Field Approximation

Mean field approximation: U(L) = Σ_p U_p(L_p), i.e. G_u(L) = Π_p b_p(L_p)

A common approximate algorithm optimizing the factorized distribution b for any given energy E is greedy coordinate descent over the factors:

1. optimize over b_1 keeping the other factors fixed
2. optimize over b_2 keeping the other factors fixed
3. ...

NOTE: ICM also uses greedy coordinate descent, but w.r.t. the labels L_p rather than the distributions b_p.

Page 65:

“probabilistic linearization”: Mean-Field Approximation

Mean field approximation: consider one iteration optimizing over b_p at any given point p. Up to a constant, the KL divergence equals the so-called Mean Field free energy

F(b) = E_b[E(L)] − H(b)

i.e. the mean energy E w.r.t. the distribution G_u plus the (negative) entropy of G_u.

Page 67:

“probabilistic linearization”: Mean-Field Approximation

One iteration optimizing over b_p at a given point p minimizes the Mean Field free energy F(b) = E_b[E(L)] − H(b) (mean energy w.r.t. G_u plus the negative entropy of G_u).

The entropy of a joint distribution of independent variables equals the sum of their entropies (an easy-to-prove standard fact): H(b) = Σ_q H(b_q).

Page 68:

“probabilistic linearization”: Mean-Field Approximation

One iteration over b_p: using H(b) = Σ_q H(b_q), all terms of the free energy not involving b_p are constant, so

F(b) = Σ_{L_p} b_p(L_p) Ē_p(L_p) − H(b_p) + const

where Ē_p(L_p) is the mean energy for label L_p w.r.t. the current beliefs at the other nodes.

Page 69:

“probabilistic linearization”: Mean-Field Approximation

One iteration over b_p minimizes

F(b_p) = Σ_{L_p} b_p(L_p) Ē_p(L_p) − H(b_p)

over probability distributions b_p (remember from probability: H(b_p) = −Σ_{L_p} b_p(L_p) ln b_p(L_p)).

Page 70:

“probabilistic linearization”: Mean-Field Approximation

One iteration over b_p: the smallest value of the resulting KL divergence term (zero) is achieved when

b_p(L_p) ∝ exp(−Ē_p(L_p))

or equivalently ln b_p(L_p) = −Ē_p(L_p) + const [Bishop, eq.(10.9)].

Page 71:

“probabilistic linearization”: Mean-Field Approximation

General coordinate descent formulas for estimating the mean-field approximation [Bishop] (a.k.a. variational inference):

b_p(L_p) ∝ exp(−Ē_p(L_p))

i.e. update the belief b_p at p for each label L_p according to the mean energy for L_p w.r.t. the current beliefs at the other nodes; equivalently ln b_p(L_p) = −Ē_p(L_p) + const [Bishop, eq.(10.9)].

This view is useful if one is directly interested in approximating (factorizing) complex distributions.

Page 72:

“probabilistic linearization”: Mean-Field Approximation

The same general coordinate descent update, b_p(L_p) ∝ exp(−Ē_p(L_p)), is useful if one is directly interested in approximating (factorizing) complex distributions.

Remember our original goal, however: find a simpler (unary) energy U approximating E.

Page 73:

“probabilistic linearization”: Mean-Field Approximation

In terms of the unary energy U, the coordinate descent update sets

U_p(L_p) = Ē_p(L_p)     (the mean energy for L_p w.r.t. the current beliefs at the other nodes)

with the beliefs given by the soft-max b_p(L_p) ∝ exp(−U_p(L_p)).

Page 74:

“probabilistic linearization”: Mean-Field Approximation

Example: pairwise (or second-order) energy E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p, L_q).

Mean-field approximation updates for pairwise energies:

b_p(L_p) ∝ exp( −D_p(L_p) − Σ_{q∈N_p} Σ_{L_q} b_q(L_q) V(L_p, L_q) )

i.e. sequential updates for the beliefs b according to "messages" from neighboring nodes; each update sets the belief b_p at p for each label L_p according to the mean energy for L_p w.r.t. the current beliefs at the other nodes.
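A small NumPy sketch of these sequential updates on a 1D chain, assuming a unary cost table D (pixels x labels) and a pairwise table V (labels x labels); names are illustrative:

import numpy as np

def mean_field(D, V, sweeps=20):
    # sequential mean-field updates on a 1D chain for
    # E(L) = sum_p D[p, L_p] + sum_pq V[L_p, L_q]; returns beliefs b (pixels x labels)
    n, k = D.shape
    b = np.full((n, k), 1.0 / k)              # start from uniform beliefs
    for _ in range(sweeps):
        for p in range(n):
            Ebar = D[p].copy()                # mean energy for each label L_p
            for q in (p - 1, p + 1):
                if 0 <= q < n:
                    Ebar += V @ b[q]          # message: sum_{L_q} V(L_p, L_q) b_q(L_q)
            w = np.exp(-(Ebar - Ebar.min()))  # stabilized soft-max
            b[p] = w / w.sum()
    return b

D = np.random.rand(8, 3)                      # unary costs (pixels x labels)
V = 0.5 * (1 - np.eye(3))                     # Potts-like pairwise table
labels = mean_field(D, V).argmax(axis=1)      # e.g. estimate labels as argmax beliefs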

Page 75:

“probabilistic linearization”: Mean-Field Approximation

Mean-field updates for pairwise energies: sequential updates for the beliefs b according to messages from neighboring nodes.

NOTE: this converges to a "fixed point", but may not converge if the updates are made in parallel.

Question: once the (approximately) optimal factorized distributions {b_p} are known, how can one estimate the labels {L_p}?

Page 76:

Messages for beliefs and labels

Example: pairwise (or second-order) energy E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p, L_q).

mean-field approximation: updates current beliefs b_p; locally averages the energy for each L_p w.r.t. the neighbors' current beliefs (a table with an element for each L_p)

ICM: updates current labels L_p; locally optimizes the label L_p given the current neighbor labels L_q.

Page 77:

Messages for beliefs and labels

Example: squared deviation potentials V(L_p, L_q) = (L_p − L_q)², e.g. in restoration.

mean-field approximation: updates current beliefs b_p; locally averages the energy for each L_p (a table with an element for each L_p)

ICM: updates current labels L_p; here it e.g. locally averages the labels, using the current neighbor labels L_q.

Page 78:

Dense CRF (see Topic 2)

Densely connected pairwise Potts model.

Due to graph density, graph cut is not practical; instead, mean-field approximation with bilateral filtering gives a significant speed-up [Krähenbühl & Koltun, NIPS 2011].

NOTE: dense CRF is a weaker regularization model, but it is an easier objective, amenable to greedy approximate optimization methods such as the mean field technique.

Page 79:

Deterministic Annealing

Mean field approximation: introduce an additional temperature parameter T into the free energy

F_T(b) = E_b[E(L)] − T · H(b)

(the average energy E w.r.t. the distribution G_u plus T times the negative entropy of G_u).

Observation: finding optimal beliefs is easier for larger T and harder for smaller T. Why? The (negative) entropy term is convex (easy), while the average energy term is non-convex (hard); a large T makes the convex term dominate.

Page 80:

Deterministic Annealing

Mean field approximation with the temperature parameter T, going from T large to T small.

General Idea for Deterministic Annealing:
- start from a large T and uniform beliefs b
- gradually decrease the temperature while updating the beliefs (e.g. using mean-field)
- eventually the Gibbs distribution G_T should be concentrated around a globally optimal labeling L, and the (approximate) beliefs b_T "may" reflect that (no guarantees).

Page 81:

Loopy belief propagation (loopy BP)

Technical operations inside many optimization algorithms can be represented via "local messages", e.g. in gradient descent, ICM, graph cuts, ..., including dynamic programming (e.g. Viterbi) on chains/trees.

One class of iterative message-passing inference methods (BP) uses messages derived from dynamic programming on non-loopy graphs (chains, trees):
- exact optimization on non-loopy graphs
- sum-product and max-product variants
- no guarantees on loopy graphs (may oscillate), but works well in some applications.
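For the non-loopy case the slide refers to, here is a minimal min-sum (max-product in the negative-log domain) dynamic programming sketch on a chain; D and V are cost tables as in the earlier examples:

import numpy as np

def min_sum_chain(D, V):
    # exact minimization of sum_p D[p, L_p] + sum_p V[L_p, L_{p+1}] on a chain
    # via forward "messages" and backtracking (Viterbi-style dynamic programming)
    n, k = D.shape
    M = np.zeros((n, k))                  # M[p, x] = best cost of a prefix with L_p = x
    arg = np.zeros((n, k), dtype=int)
    M[0] = D[0]
    for p in range(1, n):
        tot = M[p - 1][:, None] + V       # rows: previous label, cols: current label
        arg[p] = tot.argmin(axis=0)
        M[p] = tot.min(axis=0) + D[p]
    L = np.zeros(n, dtype=int)
    L[-1] = int(M[-1].argmin())
    for p in range(n - 1, 0, -1):         # backtrack the optimal labels
        L[p - 1] = arg[p, L[p]]
    return L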

Page 82:

global linearization: Linear Programming (LP) Relaxations

Example: pairwise (or second-order) energy E(L) = Σ_p D_p(L_p) + Σ_{pq∈N} V(L_p, L_q).

Introduce binary indicators:
x_p^k ∈ {0,1} indicates if p is assigned label k (as in Topic 1)
x_pq^km ∈ {0,1} indicates if pair p,q is assigned labels k,m

and first formulate an equivalent Integer Linear Program (ILP) as follows:

minimize  Σ_p Σ_k D_p(k) x_p^k + Σ_{pq∈N} Σ_{k,m} V(k,m) x_pq^km
subject to  Σ_k x_p^k = 1,   Σ_m x_pq^km = x_p^k,   Σ_k x_pq^km = x_q^m.
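A toy sketch of this LP for a single edge (two nodes, two labels) using scipy.optimize.linprog; the variable ordering and the cost numbers are illustrative:

import numpy as np
from scipy.optimize import linprog

# variables: x_p^0, x_p^1, x_q^0, x_q^1, x_pq^00, x_pq^01, x_pq^10, x_pq^11
D = np.array([[0.0, 2.0],                # unary costs D_p(k)
              [1.5, 0.0]])               # unary costs D_q(k)
V = 1.0 * (1 - np.eye(2))                # Potts pairwise costs V(k, m)
c = np.concatenate([D.ravel(), V.ravel()])

A_eq = [[1, 1, 0, 0, 0, 0, 0, 0],        # sum_k x_p^k = 1
        [0, 0, 1, 1, 0, 0, 0, 0],        # sum_k x_q^k = 1
        [-1, 0, 0, 0, 1, 1, 0, 0],       # sum_m x_pq^0m = x_p^0
        [0, -1, 0, 0, 0, 0, 1, 1],       # sum_m x_pq^1m = x_p^1
        [0, 0, -1, 0, 1, 0, 1, 0],       # sum_k x_pq^k0 = x_q^0
        [0, 0, 0, -1, 0, 1, 0, 1]]       # sum_k x_pq^k1 = x_q^1
b_eq = [1, 1, 0, 0, 0, 0]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
print(res.x)      # on a tree (here a single edge) the relaxed solution is integral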

Page 83:

global linearization: Linear Programming (LP) Relaxations

Example: pairwise (or second-order) energy, with the indicators x_p^k (p assigned label k, as in Topic 1) and x_pq^km (pair p,q assigned labels k,m).

Relaxing the integrality constraints (allowing x ∈ [0,1]) gives the Schlesinger LP relaxation (1972).

Page 84:

global linearization: Linear Programming (LP) Relaxations

• binary labeling case: Quadratic Pseudo-Boolean Optimization (QPBO) [Hammer et al. 1984, Boros 2002]
  - the relaxation has half-integral solutions x_p^k ∈ {0, ½, 1}
  - submodular problems yield a fully integral solution (global optimum)

• arbitrary number of labels
  - typical algorithms focus on a dual LP
  - block-coordinate ascent, message passing
  - tree-structured block-coordinate ascent: TRWS [Kolmogorov & Wainwright 2006]
  - guaranteed global optima for a class of "multi-label submodular" problems

Page 85:

Global vs Local Linearization

Global Linearization:
- Schlesinger LP
- QPBO
- TRWS

Local Linearization:
- parallel ICM
- gradient descent

Page 86:

Quadratic Relaxations and Approximations

Example: squared deviations image restoration energy

E(L) = Σ_p (L_p − I_p)² + Σ_{pq∈N} (L_p − L_q)²

This objective is already quadratic and can easily work with real-valued labels, if desired.

Page 87:

Quadratic Relaxations and Approximations

Example: pairwise Potts energy. With indicator vectors x_p = (x_p^0, ..., x_p^n), where x_p^k indicates if p is assigned label k (as before),

[L_p ≠ L_q] = 1 − ⟨x_p, x_q⟩

so each Potts term equals a quadratic function of the indicators built from "saddle" functions f(x,y) = −xy (the equality holds for integer indicator vectors).

Lots of other options exist for relaxing the Potts model.
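A small numeric check of the identity above (illustrative):

import numpy as np

def onehot(k, n):
    e = np.zeros(n)
    e[k] = 1.0
    return e

for a in range(3):
    for b in range(3):
        assert (a != b) == (1 - onehot(a, 3) @ onehot(b, 3))   # [L_p != L_q] = 1 - <x_p, x_q>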

Page 88:

Quadratic Relaxations and Approximations

- Natural for labeling problems with labels in Rⁿ (e.g. restoration, stereo, optical flow)
- Quadratic relaxations for the Potts model (segmentation)
- Submodular approximations