ECCV 2008: MAP Estimation Algorithms in Computer Vision - Part 1
MAP Estimation Algorithms in Computer Vision - Part I
M. Pawan Kumar, University of Oxford
Pushmeet Kohli, Microsoft Research
Aim of the Tutorial
• Description of some successful algorithms
• Computational issues
• Enough details to implement
• Some proofs will be skipped :-(
• But references to them will be given :-)
A Vision Application: Binary Image Segmentation
How?
Cost function: models our knowledge about natural images
Optimize the cost function to obtain the segmentation
Object - white, Background - green/grey
Graph G = (V,E)
Each vertex corresponds to a pixel
Edges define a 4-neighbourhood grid graph
Assign a label to each vertex from L = {obj, bkg}
A Vision Application: Binary Image Segmentation
Graph G = (V,E)
Cost of a labelling f : V → L
Per Vertex Cost
Cost of label 'obj' low; cost of label 'bkg' high
A Vision Application: Binary Image Segmentation
Graph G = (V,E)
Cost of a labelling f : V → L
Cost of label 'obj' high; cost of label 'bkg' low
Per Vertex Cost
UNARY COST
Object - white, Background - green/grey
A Vision Application: Binary Image Segmentation
Graph G = (V,E)
Cost of a labelling f : V → L
Per Edge Cost
Cost of same label low; cost of different labels high
A Vision Application: Binary Image Segmentation
Graph G = (V,E)
Cost of a labelling f : V → L
Cost of same label high; cost of different labels low
Per Edge Cost
PAIRWISE COST
Object - white, Background - green/grey
A Vision Application: Binary Image Segmentation
Graph G = (V,E)
Problem: Find the labelling f* with minimum cost
Object - white, Background - green/grey
Another Vision Application: Object Detection using Parts-based Models
How?
Once again, by defining a good cost function
Graph G = (V,E)
Each vertex corresponds to a part - 'Head', 'Torso', 'Legs'
Edges define a TREE
Assign a label to each vertex from L = {positions}
H T
L1 L2 L3 L4
Another Vision Application: Object Detection using Parts-based Models
Graph G = (V,E)
Cost of a labelling f : V → L
Unary cost: how well does the part match the image patch?
Pairwise cost: encourages valid configurations
Find best labelling f*
H T
L1 L2 L3 L4
Yet Another Vision Application: Stereo Correspondence
Disparity Map
How? Minimizing a cost function
Graph G = (V,E)
Each vertex corresponds to a pixel
Edges define a grid graph
L = {disparities}
Cost of labelling f: Unary cost + Pairwise cost
Find minimum cost f*
The General Problem
Graph G = (V,E) with vertices {a, b, c, d, e, f}
Discrete label set L = {1, 2, …, h}
Assign a label to each vertex: f : V → L
Cost of a labelling: Q(f) = Unary Cost + Pairwise Cost
Find f* = arg min Q(f)
Outline
• Problem Formulation
– Energy Function
– MAP Estimation
– Computing min-marginals
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
Energy Function
Va Vb Vc Vd
Da Db Dc Dd
Random Variables V = {Va, Vb, …}
Labels L = {l0, l1, …}
Data D
Labelling f : {a, b, …} → {0, 1, …}
Energy Function
Va Vb Vc Vd
Da Db Dc Dd
Q(f) = ∑a θa;f(a)
Unary Potential
[Figure: unary potentials - Va: l0 = 5, l1 = 2; Vb: l0 = 2, l1 = 4; Vc: l0 = 3, l1 = 6; Vd: l0 = 7, l1 = 3]
Easy to minimize
Neighbourhood
Energy Function
Va Vb Vc Vd
Da Db Dc Dd
E : (a,b) ∈ E iff Va and Vb are neighbours
E = { (a,b), (b,c), (c,d) }
Energy Function
Va Vb Vc Vd
Da Db Dc Dd
Q(f) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
Pairwise Potential
[Figure: pairwise potentials θab;ik on the edges (a,b), (b,c), (c,d), one value per label pair]
Energy Function
Va Vb Vc Vd
Da Db Dc Dd
Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
θ is the parameter: the vector of all unary and pairwise potentials
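The energy function above can be evaluated directly. A minimal Python sketch on the four-variable chain from these slides; the per-edge pairwise values are not uniquely recoverable from the transcript, so the numbers below are one choice that reproduces every entry of the 16-labelling energy table shown later in the deck:

```python
# Chain Va - Vb - Vc - Vd with labels {l0, l1} = {0, 1}.
# Unary potentials are read off the slides; the pairwise potentials are one
# reconstruction consistent with the tabulated energies (an assumption).
UNARY = {'a': (5, 2), 'b': (2, 4), 'c': (3, 6), 'd': (7, 3)}   # (theta_v;0, theta_v;1)
PAIR = {('a', 'b'): ((0, 1), (1, 0)),    # theta_ab;ik, indexed [i][k]
        ('b', 'c'): ((1, 3), (2, 0)),
        ('c', 'd'): ((0, 1), (4, 1))}

def energy(f):
    """Q(f; theta) = sum_a theta_a;f(a) + sum_(a,b) theta_ab;f(a)f(b)."""
    q = sum(UNARY[v][f[v]] for v in UNARY)
    q += sum(PAIR[a, b][f[a]][f[b]] for (a, b) in PAIR)
    return q

print(energy({'a': 1, 'b': 0, 'c': 0, 'd': 1}))  # 13, as on the slides
print(energy({'a': 0, 'b': 0, 'c': 1, 'd': 0}))  # 27
```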
Outline
• Problem Formulation
– Energy Function
– MAP Estimation
– Computing min-marginals
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
MAP Estimation
Va Vb Vc Vd
Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
[Figure: same unary and pairwise potentials as above, labels l0 and l1]
MAP Estimation
Va Vb Vc Vd
Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
For f = (1, 0, 0, 1): Q = 2 + 1 + 2 + 1 + 3 + 1 + 3 = 13
MAP Estimation
Va Vb Vc Vd
Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
For f = (0, 0, 1, 0): Q = 5 + 1 + 4 + 0 + 6 + 4 + 7 = 27
MAP Estimation
Va Vb Vc Vd
Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
f* = arg min Q(f; θ)
q* = min Q(f; θ) = Q(f*; θ)
MAP Estimation
f(a) f(b) f(c) f(d) Q(f; θ)
0 0 0 0 18
0 0 0 1 15
0 0 1 0 27
0 0 1 1 20
0 1 0 0 22
0 1 0 1 19
0 1 1 0 27
0 1 1 1 20
16 possible labellings
f(a) f(b) f(c) f(d) Q(f; θ)
1 0 0 0 16
1 0 0 1 13
1 0 1 0 25
1 0 1 1 18
1 1 0 0 18
1 1 0 1 15
1 1 1 0 23
1 1 1 1 16
f* = {1, 0, 0, 1}, q* = 13
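The exhaustive search over the 16 labellings can be written in a few lines. A sketch, using the same reconstructed potentials as before (the pairwise values are one choice consistent with the slides' energy table):

```python
from itertools import product

# Brute-force MAP estimation over the 2^4 = 16 labellings of the chain.
UNARY = [(5, 2), (2, 4), (3, 6), (7, 3)]                       # per-variable (l0, l1)
PAIR = [((0, 1), (1, 0)), ((1, 3), (2, 0)), ((0, 1), (4, 1))]  # per-edge, indexed [i][k]

def energy(f):
    return (sum(UNARY[v][f[v]] for v in range(4)) +
            sum(PAIR[e][f[e]][f[e + 1]] for e in range(3)))

f_star = min(product((0, 1), repeat=4), key=energy)
q_star = energy(f_star)
print(f_star, q_star)  # (1, 0, 0, 1) 13
```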
Computational Complexity
Segmentation: 2^|V| labellings; |V| = number of pixels ≈ 320 × 480 = 153600
Detection: |L|^|V| labellings; |L| = number of candidate positions ≈ 153600
Stereo: |L|^|V| labellings; |V| = number of pixels ≈ 153600
Can we do better than brute-force?
MAP Estimation is NP-hard !!
Computational Complexity
Exact algorithms do exist for special cases
Good approximate algorithms for the general case
But first … two important definitions
Outline
• Problem Formulation
– Energy Function
– MAP Estimation
– Computing min-marginals
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
Min-Marginals
Va Vb Vc Vd
[Figure: same unary and pairwise potentials as before, labels l0 and l1]
Min-marginal: qa;i = min Q(f; θ) such that f(a) = i
Not a marginal (no summation)
Min-Marginals: 16 possible labellings, qa;0 = 15
f(a) f(b) f(c) f(d) Q(f; θ)
0 0 0 0 18
0 0 0 1 15
0 0 1 0 27
0 0 1 1 20
0 1 0 0 22
0 1 0 1 19
0 1 1 0 27
0 1 1 1 20
f(a) f(b) f(c) f(d) Q(f; θ)
1 0 0 0 16
1 0 0 1 13
1 0 1 0 25
1 0 1 1 18
1 1 0 0 18
1 1 0 1 15
1 1 1 0 23
1 1 1 1 16
Min-Marginals: 16 possible labellings, qa;1 = 13
f(a) f(b) f(c) f(d) Q(f; θ)
1 0 0 0 16
1 0 0 1 13
1 0 1 0 25
1 0 1 1 18
1 1 0 0 18
1 1 0 1 15
1 1 1 0 23
1 1 1 1 16
f(a) f(b) f(c) f(d) Q(f; θ)
0 0 0 0 18
0 0 0 1 15
0 0 1 0 27
0 0 1 1 20
0 1 0 0 22
0 1 0 1 19
0 1 1 0 27
0 1 1 1 20
Min-Marginals and MAP
• Minimum min-marginal of any variable = energy of MAP labelling
mini qa;i = mini ( minf Q(f; θ) s.t. f(a) = i ) = minf Q(f; θ)
(Va has to take one label)
Summary
Energy Function: Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
MAP Estimation: f* = arg min Q(f; θ)
Min-marginals: qa;i = min Q(f; θ) s.t. f(a) = i
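Min-marginals can be computed by the same brute-force enumeration, restricted to labellings that fix one variable. A sketch on the reconstructed chain example, verifying the values qa;0 = 15 and qa;1 = 13 read off the tables above:

```python
from itertools import product

# q_a;i = min over labellings f with f(a) = i of Q(f; theta).
UNARY = [(5, 2), (2, 4), (3, 6), (7, 3)]
PAIR = [((0, 1), (1, 0)), ((1, 3), (2, 0)), ((0, 1), (4, 1))]

def energy(f):
    return (sum(UNARY[v][f[v]] for v in range(4)) +
            sum(PAIR[e][f[e]][f[e + 1]] for e in range(3)))

def min_marginal(v, i):
    return min(energy(f) for f in product((0, 1), repeat=4) if f[v] == i)

print(min_marginal(0, 0), min_marginal(0, 1))  # 15 13
# Minimum min-marginal of any variable = energy of the MAP labelling.
print(min(min_marginal(0, i) for i in (0, 1)))  # 13
```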
Outline• Problem Formulation
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
Reparameterization
Va Vb
[Figure: unary potentials θa = (5, 2), θb = (2, 4); pairwise θab;00 = 0, θab;01 = 1, θab;10 = 1, θab;11 = 0]
f(a) f(b) Q(f; θ)
0 0 7
0 1 10
1 0 5
1 1 6
Add a constant (here 2) to all θa;i
Subtract that constant from all θb;k
Reparameterization
f(a) f(b) Q(f; θ')
0 0 7 + 2 - 2
0 1 10 + 2 - 2
1 0 5 + 2 - 2
1 1 6 + 2 - 2
Add a constant to all θa;i
Subtract that constant from all θb;k
Q(f; θ') = Q(f; θ)
Reparameterization
Va Vb
[Figure: same potentials; a constant 3 is moved between θb;1 and θab;i1]
f(a) f(b) Q(f; θ)
0 0 7
0 1 10
1 0 5
1 1 6
Add a constant to one θb;k
Subtract that constant from θab;ik for all 'i'
Reparameterization
f(a) f(b) Q(f; θ')
0 0 7
0 1 10 - 3 + 3
1 0 5
1 1 6 - 3 + 3
Q(f; θ') = Q(f; θ)
Add a constant to one θb;k
Subtract that constant from θab;ik for all 'i'
Reparameterization
[Figure: three reparameterization examples on the edge (a,b)]
General form, for constants Mab;k and Mba;i:
θ'a;i = θa;i + Mba;i
θ'b;k = θb;k + Mab;k
θ'ab;ik = θab;ik - Mab;k - Mba;i
Q(f; θ') = Q(f; θ)
Reparameterization
θ' is a reparameterization of θ, iff
Q(f; θ') = Q(f; θ), for all f
Equivalently:
θ'a;i = θa;i + Mba;i
θ'b;k = θb;k + Mab;k
θ'ab;ik = θab;ik - Mab;k - Mba;i
Kolmogorov, PAMI, 2006
Recap
MAP Estimation: f* = arg min Q(f; θ), with Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
Min-marginals: qa;i = min Q(f; θ) s.t. f(a) = i
Reparameterization: θ' ≡ θ iff Q(f; θ') = Q(f; θ), for all f
Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties
• Tree-reweighted Message Passing
Belief Propagation
• Remember, some MAP problems are easy: chains and trees
• Belief Propagation gives exact MAP for chains
• Exact MAP for trees
• It is just a clever reparameterization
Two Variables
Va Vb
[Figure: θa = (5, 2), θb = (2, 4), θab;00 = 0, θab;01 = 1, θab;10 = 1, θab;11 = 0]
Choose the right constant, so that θ'b;k = qb;k
Add a constant to one θb;k; subtract that constant from θab;ik for all 'i'
Mab;0 = min { θa;0 + θab;00 = 5 + 0, θa;1 + θab;10 = 2 + 1 } = 3
θ'b;0 = 2 + 3 = 5 = qb;0, achieved by f(a) = 1
Potentials along the red path add up to 0
Mab;1 = min { θa;0 + θab;01 = 5 + 1, θa;1 + θab;11 = 2 + 0 } = 2
Two Variables
θ'b;1 = 4 + 2 = 6 = qb;1, achieved by f(a) = 1
We get all the min-marginals of Vb
Minimum of min-marginals = MAP estimate: f*(b) = 0, and following the recorded choice, f*(a) = 1
Recap: we only need to know two sets of equations
General form of reparameterization:
θ'a;i = θa;i + Mba;i
θ'b;k = θb;k + Mab;k
θ'ab;ik = θab;ik - Mab;k - Mba;i
Reparameterization of (a,b) in Belief Propagation:
Mab;k = mini { θa;i + θab;ik }
Mba;i = 0
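The two equations above are all that is needed. A sketch of the BP reparameterization of a single edge, on the slides' two-variable example, checking that the new unary potentials of Vb are exactly its min-marginals:

```python
# BP message for edge (a,b): M_ab;k = min_i { theta_a;i + theta_ab;ik },
# then theta'_b;k = theta_b;k + M_ab;k.
theta_a, theta_b = [5, 2], [2, 4]
theta_ab = [[0, 1], [1, 0]]

M = [min(theta_a[i] + theta_ab[i][k] for i in range(2)) for k in range(2)]
theta_b_new = [theta_b[k] + M[k] for k in range(2)]

print(M)            # [3, 2]
print(theta_b_new)  # [5, 6] -- exactly the min-marginals q_b;0, q_b;1

# Check against brute force: q_b;k = min_i { theta_a;i + theta_ab;ik + theta_b;k }.
for k in range(2):
    q = min(theta_a[i] + theta_ab[i][k] + theta_b[k] for i in range(2))
    assert theta_b_new[k] == q
```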
Three Variables
Chain Va - Vb - Vc, labels l0 and l1
[Figure: unary and pairwise potentials on edges (a,b) and (b,c)]
Reparameterize the edge (a,b) as before; record the minimizing choices (f(a) = 1 for both labels of Vb)
Potentials along the red path add up to 0
Reparameterize the edge (b,c) as before; record the choices f(b) = 1 (for l0) and f(b) = 0 (for l1)
θ'c;0 = qc;0 and θ'c;1 = qc;1
Three Variables
Read off the MAP labelling by following the recorded choices: f*(c) = 0, f*(b) = 0, f*(a) = 1
Generalizes to a chain of any length
This is only Dynamic Programming
Why Dynamic Programming?
3 variables = 2 variables + book-keeping
n variables = (n-1) variables + book-keeping
Start from the left, go to the right
Reparameterize the current edge (a,b):
Mab;k = mini { θa;i + θab;ik }
θ'b;k = θb;k + Mab;k, θ'ab;ik = θab;ik - Mab;k
Repeat
The constants Mab;k are called messages, and the procedure is message passing
Why stop at dynamic programming?
Three Variables
Reparameterize the edge (c,b) as before: θ'b;i = qb;i
Reparameterize the edge (b,a) as before: θ'a;i = qa;i
Forward Pass + Backward Pass
All min-marginals are computed
Belief Propagation on Chains
Start from the left, go to the right
Reparameterize the current edge (a,b):
Mab;k = mini { θa;i + θab;ik }
θ'b;k = θb;k + Mab;k, θ'ab;ik = θab;ik - Mab;k
Repeat till the end of the chain
Then start from the right, go to the left; repeat till the end of the chain
• A way of computing reparameterization constants
• Generalizes to chains of any length
• Forward pass (start to end): MAP estimate; min-marginals of the final variable
• Backward pass (end to start): all other min-marginals (won't need this, but good to know)
Computational Complexity
• Each constant takes O(|L|)
• Number of constants: O(|E||L|)
• Total time: O(|E||L|^2)
• Memory required: O(|E||L|)
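The full forward-backward pass fits in a short script. A sketch on the four-variable chain used throughout (pairwise values reconstructed as before); each variable's min-marginal is its unary potential plus the messages arriving from both sides:

```python
# Min-sum belief propagation on a chain: forward pass sends messages
# left-to-right, backward pass right-to-left; then
# q_v;i = theta_v;i + fwd_v;i + bwd_v;i.
UNARY = [(5, 2), (2, 4), (3, 6), (7, 3)]
PAIR = [((0, 1), (1, 0)), ((1, 3), (2, 0)), ((0, 1), (4, 1))]
n, L = 4, 2

fwd = [[0, 0] for _ in range(n)]   # message arriving from the left
bwd = [[0, 0] for _ in range(n)]   # message arriving from the right
for v in range(1, n):
    fwd[v] = [min(UNARY[v-1][i] + fwd[v-1][i] + PAIR[v-1][i][k]
                  for i in range(L)) for k in range(L)]
for v in range(n - 2, -1, -1):
    bwd[v] = [min(UNARY[v+1][k] + bwd[v+1][k] + PAIR[v][i][k]
                  for k in range(L)) for i in range(L)]

q = [[UNARY[v][i] + fwd[v][i] + bwd[v][i] for i in range(L)] for v in range(n)]
print(q)  # min-marginals of every variable; the min of each row is q* = 13
```

This is O(|E||L|^2) time, matching the count above: each of the O(|E||L|) messages is a minimum over |L| terms.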
Belief Propagation on Trees
[Figure: tree rooted at Va, with internal vertices Vb, Vc and leaves Vd, Ve, Vg, Vh]
Forward Pass: Leaf → Root
Backward Pass: Root → Leaf
All min-marginals are computed
Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties
• Tree-reweighted Message Passing
Belief Propagation on Cycles
Va Vb
Vd Vc
Where do we start? Arbitrarily
[Figure: 4-cycle with unary potentials θa, θb, θc, θd over labels 0 and 1]
Reparameterize (a,b), then (b,c), then (c,d), then (d,a); the potentials along the red path add up to 0
θ'a;0 - θa;0 = qa;0 and θ'a;1 - θa;1 = qa;1
(we subtract θa;i because the pass around the cycle returns to Va and counts its original unary potential again)
Pick the minimum min-marginal. Follow the red path.
Belief Propagation on Cycles
Make a second pass around the cycle, starting from the chosen label
Potentials along the red path add up to 0
θ'a;1 - θa;1 = qa;1, and θ'a;0 - θa;0 ≤ qa;0: Problem Solved
But if instead θ'a;0 - θa;0 ≥ qa;0: Problem Not Solved
Belief Propagation on Cycles
Reparameterize (a,b) again
But doesn't this overcount some potentials?
Yes. But we will do it anyway
Keep reparameterizing edges in some order
Hope for convergence and a good solution
Belief Propagation
• Generalizes to any arbitrary random field
• Complexity per iteration: O(|E||L|^2)
• Memory required: O(|E||L|)
Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties
• Tree-reweighted Message Passing
Computational Issues of BP
Complexity per iteration: O(|E||L|^2)
Special pairwise potentials: θab;ik = wab d(|i - k|)
where d is the Potts, Truncated Linear, or Truncated Quadratic distance on |i - k|
Then the complexity drops to O(|E||L|): Felzenszwalb & Huttenlocher, 2004
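For the Potts case the speed-up is easy to see directly: since the pairwise cost is either 0 (same label) or w (any different label), the message minimum splits into two candidates. A sketch comparing the naive O(|L|^2) message with the O(|L|) shortcut (the unary values here are made up for illustration):

```python
# Potts pairwise potential theta_ab;ik = w * [i != k]. The message
# M_ab;k = min_i { theta_a;i + w * [i != k] } collapses to
# min(theta_a;k, min_i theta_a;i + w): O(|L|) per message instead of O(|L|^2).
theta_a, w = [3, 0, 2, 5], 2
L = len(theta_a)

naive = [min(theta_a[i] + w * (i != k) for i in range(L)) for k in range(L)]
m = min(theta_a)
fast = [min(theta_a[k], m + w) for k in range(L)]

print(naive, fast)  # [2, 0, 2, 2] [2, 0, 2, 2]
assert naive == fast
```

The truncated linear and quadratic cases admit similar O(|L|) schemes via distance transforms, which is the technique of Felzenszwalb & Huttenlocher, 2004.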
Computational Issues of BP
Memory requirements: O(|E||L|); half of original BP: Kolmogorov, 2006
Some approximations exist (Yu, Lin, Super and Tan, 2007; Lasserre, Kannan and Winn, 2007), but memory still remains an issue
Order of reparameterization: in some fixed order; randomly; or the one that results in maximum change (Residual Belief Propagation, Elidan et al., 2006)
Theoretical Properties of BP
Exact for trees: Pearl, 1988
What about a general random field? Run BP and assume it converges. Then:
• Choose variables in a tree and change their labels: the value of the energy does not decrease
• Choose variables in a cycle and change their labels: the value of the energy does not decrease
• For a single cycle, if BP converges then it is the exact MAP: Weiss and Freeman, 2001
Results: Object Detection (Felzenszwalb and Huttenlocher, 2004)
[Result images omitted]
Parts: H, T, A1, A2, L1, L2
Labels: poses of parts
Unary potentials: fraction of foreground pixels
Pairwise potentials: favour valid configurations
Results: Binary Segmentation (Szeliski et al., 2008)
Labels: {foreground, background}
Unary potentials: -log(likelihood) using learnt fg/bg models
Pairwise potentials: 0, if same labels; 1 - exp(|Da - Db|), if different labels
[Result images: Belief Propagation vs. Global optimum]
Results: Stereo Correspondence (Szeliski et al., 2008)
Labels: {disparities}
Unary potentials: similarity of pixel colours
Pairwise potentials: 0, if same labels; 1 - exp(|Da - Db|), if different labels
[Result images: Belief Propagation vs. Global optimum]
Summary of BP
• Exact for chains
• Exact for trees
• Approximate MAP for general cases
• Not even convergence guaranteed
So can we do something better?
Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties
TRW Message Passing
• A different look at the same problem
• Convex (not combinatorial) optimization
• A similar solution
We will look at the most general MAP estimation: not just trees, and no assumptions on the potentials
Things to Remember
• BP is exact for trees
• Every iteration provides a reparameterization
• The forward pass computes the min-marginals of the root
• Basics of mathematical optimization
Mathematical Optimization
min g0(x)
subject to gi(x) ≤ 0, i = 1, …, N
• Objective function: g0
• Constraints: gi
• Feasible region = {x | gi(x) ≤ 0}
• Optimal solution: x* = arg min
• Optimal value: g0(x*)
Integer Programming
Same problem with the additional constraint xk ∈ Z
The feasible region contains only integer points
Generally NP-hard to optimize
Linear Programming
min cTx
subject to Ax ≤ b
• Linear objective function
• Linear constraints
• Feasible region = {x | Ax ≤ b}: a polytope
Polynomial-time solution
The optimal solution lies on a vertex (the objective function is linear)
Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties
Integer Programming Formulation
Va Vb
Unary potentials: θa;0 = 5, θa;1 = 2, θb;0 = 2, θb;1 = 4; labels l0, l1
Labelling f(a) = 1, f(b) = 0 corresponds to
ya;0 = 0, ya;1 = 1
yb;0 = 1, yb;1 = 0
Any f(.) has equivalent boolean variables ya;i
Find the optimal variables ya;i
Integer Programming Formulation
Sum of unary potentials: ∑a ∑i θa;i ya;i
ya;i ∈ {0,1}, for all Va, li
∑i ya;i = 1, for all Va
Pairwise potentials: θab;00 = 0, θab;01 = 1, θab;10 = 1, θab;11 = 0
Sum of pairwise potentials: ∑(a,b) ∑ik θab;ik ya;i yb;k
Introducing yab;ik = ya;i yb;k, this becomes ∑(a,b) ∑ik θab;ik yab;ik
Integer Programming Formulation
min ∑a ∑i θa;i ya;i + ∑(a,b) ∑ik θab;ik yab;ik = min θTy
ya;i ∈ {0,1}
∑i ya;i = 1
yab;ik = ya;i yb;k
θ = [ … θa;i … ; … θab;ik … ]
y = [ … ya;i … ; … yab;ik … ]
One variable, two labels:
θ = [ θa;0 θa;1 ], y = [ ya;0 ya;1 ]
ya;0 ∈ {0,1}, ya;1 ∈ {0,1}, ya;0 + ya;1 = 1
Two variables, two labels:
θ = [ θa;0 θa;1 θb;0 θb;1 θab;00 θab;01 θab;10 θab;11 ]
y = [ ya;0 ya;1 yb;0 yb;1 yab;00 yab;01 yab;10 yab;11 ]
ya;0 ∈ {0,1}, ya;1 ∈ {0,1}, ya;0 + ya;1 = 1
yb;0 ∈ {0,1}, yb;1 ∈ {0,1}, yb;0 + yb;1 = 1
yab;00 = ya;0 yb;0, yab;01 = ya;0 yb;1
yab;10 = ya;1 yb;0, yab;11 = ya;1 yb;1
In General: the Marginal Polytope
θ ∈ R^(|V||L| + |E||L|^2)
y ∈ {0,1}^(|V||L| + |E||L|^2)
Number of constraints: |V||L| + |V| + |E||L|^2
ya;i ∈ {0,1}, ∑i ya;i = 1, yab;ik = ya;i yb;k
Integer Programming Formulation
min θTy
ya;i ∈ {0,1}
∑i ya;i = 1
yab;ik = ya;i yb;k
Solve to obtain the MAP labelling y*
But we can't solve it in general
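Before relaxing the program, it is worth checking that the objective really encodes the energy: for the y vector of any labelling, θTy equals Q(f; θ). A sketch on the two-variable example above:

```python
from itertools import product

# For each labelling f, set y_a;i = [f(a) = i] and y_ab;ik = y_a;i * y_b;k;
# then theta^T y reproduces Q(f; theta).
theta = [5, 2, 2, 4, 0, 1, 1, 0]   # [a;0 a;1 b;0 b;1 ab;00 ab;01 ab;10 ab;11]
theta_a, theta_b = theta[0:2], theta[2:4]
theta_ab = [theta[4:6], theta[6:8]]

def y_vector(fa, fb):
    ya = [int(fa == i) for i in range(2)]
    yb = [int(fb == k) for k in range(2)]
    yab = [ya[i] * yb[k] for i in range(2) for k in range(2)]
    return ya + yb + yab

for fa, fb in product((0, 1), repeat=2):
    Q = theta_a[fa] + theta_b[fb] + theta_ab[fa][fb]
    assert sum(t * y for t, y in zip(theta, y_vector(fa, fb))) == Q
print("theta^T y == Q(f; theta) for all four labellings")
```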
Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties
Linear Programming Relaxation
min θTy
ya;i ∈ {0,1}, ∑i ya;i = 1, yab;ik = ya;i yb;k
Two reasons why we can't solve this: the integrality constraint and the quadratic constraint
Relax ya;i ∈ {0,1} to ya;i ∈ [0,1]: one reason remains
Replace yab;ik = ya;i yb;k by its marginalization:
∑k yab;ik = ya;i ∑k yb;k = ya;i (since ∑k yb;k = 1)
min θTy
ya;i ∈ [0,1]
∑i ya;i = 1
∑k yab;ik = ya;i
No reason why we can't solve this*
*except memory requirements and time complexity
One variable, two labels:
ya;0 ∈ [0,1], ya;1 ∈ [0,1], ya;0 + ya;1 = 1
Two variables, two labels:
ya;0 ∈ [0,1], ya;1 ∈ [0,1], ya;0 + ya;1 = 1
yb;0 ∈ [0,1], yb;1 ∈ [0,1], yb;0 + yb;1 = 1
yab;00 + yab;01 = ya;0
yab;10 + yab;11 = ya;1
In General: the Local Polytope, a relaxation of the Marginal Polytope
y ∈ [0,1]^(|V||L| + |E||L|^2)
Number of constraints: |V||L| + |V| + |E||L|
Linear Programming Relaxation
min θTy
ya;i ∈ [0,1]
∑i ya;i = 1
∑k yab;ik = ya;i
No reason why we can't solve this
Extensively studied:
Optimization: Schlesinger, 1976; Koster, van Hoesel and Kolen, 1998
Theory: Chekuri et al., 2001; Archer et al., 2004
Machine Learning: Wainwright et al., 2001
Linear Programming Relaxation: many interesting properties
• Global optimal MAP for trees: Wainwright et al., 2001
• Preserves the solution under reparameterization
But we are interested in NP-hard cases:
• Large class of problems: Metric Labelling, Semi-metric Labelling
• Most likely, provides the best possible integrality gap: Manokaran et al., 2008
• A computationally useful dual: optimal value of dual = optimal value of primal, and easier to solve
Dual of the LP Relaxation (Wainwright et al., 2001)
[Figure: 3×3 grid of variables Va … Vi decomposed into row trees 1, 2, 3 and column trees 4, 5, 6]
Fix tree weights ρi ≥ 0; the tree potentials must satisfy ∑i ρi θi = θ (a reparameterization constraint)
Each tree i has MAP energy q*(θi)
Dual of LP: max ∑i ρi q*(θi) subject to ∑i ρi θi ≡ θ, ρi ≥ 0
I can easily compute q*(θi)
I can easily maintain the reparameterization constraint
So can I easily solve the dual?
Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties
TRW Message Passing (Kolmogorov, 2006)
[Figure: grid decomposed into row trees 1, 2, 3 and column trees 4, 5, 6]
Dual: max ∑i ρi q*(θi) s.t. ∑i ρi θi ≡ θ
Pick a variable Va
Va belongs to tree 1 (chain Vc - Vb - Va, with potentials θ1) and tree 4 (chain Va - Vd - Vg, with potentials θ4)
The objective contains ρ1 θ1 + ρ4 θ4 + rest; the dual value is ρ1 q*(θ1) + ρ4 q*(θ4) + K
Reparameterize trees 1 and 4 to obtain the min-marginals of Va
TRW Message Passing (Kolmogorov, 2006)
One pass of Belief Propagation per tree gives θ'1 and θ'4 with θ'1a;i = q1a;i and θ'4a;i = q4a;i
The dual value ρ1 q*(θ'1) + ρ4 q*(θ'4) + K remains the same
= ρ1 min{θ'1a;0, θ'1a;1} + ρ4 min{θ'4a;0, θ'4a;1} + K
Compute the weighted average of the min-marginals of Va
TRW Message Passing (Kolmogorov, 2006)
θ''a;0 = (ρ1 θ'1a;0 + ρ4 θ'4a;0) / (ρ1 + ρ4)
θ''a;1 = (ρ1 θ'1a;1 + ρ4 θ'4a;1) / (ρ1 + ρ4)
Replace Va's unary potential by θ''a in both trees (still a valid reparameterization)
New dual value ≥ (ρ1 + ρ4) min{θ''a;0, θ''a;1} + K
Using min {p1 + p2, q1 + q2} ≥ min {p1, q1} + min {p2, q2}:
(ρ1 + ρ4) min{θ''a;0, θ''a;1} ≥ ρ1 min{θ'1a;0, θ'1a;1} + ρ4 min{θ'4a;0, θ'4a;1}
The objective function increases or remains constant
TRW Message Passing (Kolmogorov, 2006)
Initialize the θⁱ, taking care of the reparameterization constraint.
REPEAT:
• Choose a random variable Va
• Compute the min-marginals of Va for all trees
• Node-average the min-marginals
Can also do edge-averaging.
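The loop above can be sketched for the simplest possible case: two single-edge trees sharing one variable Va. Everything here (potentials, weights, variable layout) is illustrative, not the tutorial's example; it only shows that reparameterizing onto the shared unary and then node-averaging never decreases the dual bound.

```python
import random

random.seed(0)
L = 2  # labels per variable

# Two single-edge trees sharing Va: tree 1 = (Va, Vb), tree 2 = (Va, Vc).
def rand_vec(n):
    return [random.uniform(0, 5) for _ in range(n)]

trees = [{"theta_a": rand_vec(L),                  # unary at shared Va
          "theta_o": rand_vec(L),                  # unary at the other variable
          "theta_p": [rand_vec(L) for _ in range(L)]}  # pairwise (Va, other)
         for _ in range(2)]
rho = [0.5, 0.5]  # tree weights

def tree_min(t):
    # exact minimum energy of a single-edge tree
    return min(t["theta_a"][i] + t["theta_p"][i][j] + t["theta_o"][j]
               for i in range(L) for j in range(L))

def dual_bound():
    return sum(r * tree_min(t) for r, t in zip(rho, trees))

bounds = [dual_bound()]
for _ in range(5):
    # 1. Reparameterize: fold each tree's message into Va's unary, so the
    #    unary becomes the min-marginal of Va in that tree (energy unchanged).
    for t in trees:
        for i in range(L):
            delta = min(t["theta_p"][i][j] + t["theta_o"][j] for j in range(L))
            t["theta_a"][i] += delta
            for j in range(L):
                t["theta_p"][i][j] -= delta
    # 2. Node-average: the rho-weighted average keeps
    #    rho1*theta1 + rho2*theta2 unchanged.
    avg = [sum(r * t["theta_a"][i] for r, t in zip(rho, trees)) / sum(rho)
           for i in range(L)]
    for t in trees:
        t["theta_a"] = avg[:]
    bounds.append(dual_bound())

# The dual bound never decreases.
assert all(b2 >= b1 - 1e-9 for b1, b2 in zip(bounds, bounds[1:]))
print("dual bounds:", [round(b, 3) for b in bounds])
```

In the full algorithm the trees are chains over the grid and the min-marginals come from belief propagation, but the reparameterize-then-average step is exactly this.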
Example 1 (ρ₁ = ρ₂ = ρ₃ = 1)
[Figure: a cycle of three variables Va, Vb, Vc over labels l0, l1, decomposed into three single-edge trees (Va,Vb), (Vb,Vc), (Vc,Va), each shown with its unary and pairwise potentials. The per-tree bounds start at 5, 6, 7.]
• Pick variable Va. Reparameterize.
• Average the min-marginals of Va: the per-tree bounds become 7, 6, 7, so the dual increases from 18 to 20.
• Pick variable Vb. Reparameterize.
• Average the min-marginals of Vb: the per-tree bounds become 6.5, 6.5, 7. The value of the dual does not increase.
• Maybe it will increase for Vc? NO.
All trees now agree on a single labelling: f¹(a) = 0, f¹(b) = 0, f²(b) = 0, f²(c) = 0, f³(c) = 0, f³(a) = 0.
Strong Tree Agreement
Exact MAP Estimate
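For cycles this small, the labelling that strong tree agreement recovers can be sanity-checked by exhaustive enumeration. A sketch with illustrative potentials (not the numbers from the example above):

```python
import itertools

# Hypothetical potentials on the cycle Va-Vb-Vc with labels {0, 1}.
unary = {"a": [0, 1], "b": [2, 0], "c": [1, 1]}
pair = {("a", "b"): [[0, 2], [2, 0]],
        ("b", "c"): [[0, 2], [2, 0]],
        ("c", "a"): [[0, 2], [2, 0]]}

def energy(f):
    # sum of unary terms plus pairwise terms over the three edges
    e = sum(unary[v][f[v]] for v in unary)
    e += sum(pair[(u, v)][f[u]][f[v]] for (u, v) in pair)
    return e

best = min((dict(zip("abc", fs))
            for fs in itertools.product([0, 1], repeat=3)),
           key=energy)
print(best, energy(best))  # prints {'a': 1, 'b': 1, 'c': 1} 2
```

Enumeration is exponential in the number of variables, which is precisely why the message-passing machinery above is needed on real grids.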
Example 2 (ρ₁ = ρ₂ = ρ₃ = 1)
[Figure: the same three-variable cycle over labels l0, l1 with different potentials. The per-tree bounds start at 4, 0, 4.]
• Pick variable Va. Reparameterize.
• Average the min-marginals of Va: the per-tree bounds remain 4, 0, 4. The value of the dual does not increase.
• Maybe it will decrease for Vb or Vc? NO.
The trees agree only weakly. Tree 1: f¹(a) = 1, f¹(b) = 1. Tree 2: both (f²(b) = 1, f²(c) = 0) and (f²(b) = 0, f²(c) = 1) are optimal. Tree 3: f³(c) = 1, f³(a) = 1.
Weak Tree Agreement
Not the exact MAP estimate, but a convergence point of TRW.
Obtaining the Labelling (Meltzer et al., 2006)
TRW only solves the dual. Primal solutions?
θ' = Σᵢ ρᵢθⁱ
[Figure: 3×3 grid of variables Va, Vb, Vc / Vd, Ve, Vf / Vg, Vh, Vi.]
Fix the label of Va, then the label of Vb, and continue in some fixed order.
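The fixing step can be sketched as a single greedy pass over the grid: each variable picks the label minimizing its unary plus the pairwise terms to neighbours that are already fixed. The unaries here stand in for the reparameterized θ' and are randomly generated, and the Potts pairwise term is only an illustration:

```python
import random

random.seed(0)
L, rows, cols = 2, 3, 3
# unary[(r, c)][i]: stand-in for the reparameterized theta' at each pixel
unary = {(r, c): [random.uniform(0, 1) for _ in range(L)]
         for r in range(rows) for c in range(cols)}
lam = 0.5  # illustrative Potts pairwise strength

labels = {}
# Fix labels in scan order (Va, Vb, ...): each variable minimizes its unary
# plus pairwise terms to already-labelled neighbours (up and left).
for r in range(rows):
    for c in range(cols):
        def cost(i):
            total = unary[(r, c)][i]
            for nb in [(r - 1, c), (r, c - 1)]:
                if nb in labels:
                    total += lam * (labels[nb] != i)
            return total
        labels[(r, c)] = min(range(L), key=cost)
print(labels)
```

The scan order matters: different orders can yield different primal labellings from the same dual solution.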
Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
  – Integer Programming Formulation
  – Linear Programming Relaxation and its Dual
  – Convergent Solution for Dual
  – Computational Issues and Theoretical Properties
Computational Issues of TRW
The basic component is belief propagation.
• Speed-ups for some pairwise potentials (Felzenszwalb & Huttenlocher, 2004)
• Memory requirements cut down by half (Kolmogorov, 2006)
• Further speed-ups using monotonic chains (Kolmogorov, 2006)
Theoretical Properties of TRW
• Always converges, unlike BP (Kolmogorov, 2006)
• Strong tree agreement implies exact MAP (Wainwright et al., 2001)
• Optimal MAP for two-label submodular problems (Kolmogorov and Wainwright, 2005):
  θab;00 + θab;11 ≤ θab;01 + θab;10
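The two-label submodularity condition translates directly into code. A minimal check (the function name is ours):

```python
def is_submodular(theta_ab):
    # theta_ab[i][j]: pairwise potential for labels (i, j) of edge (a, b)
    # in a two-label problem
    return theta_ab[0][0] + theta_ab[1][1] <= theta_ab[0][1] + theta_ab[1][0]

assert is_submodular([[0, 1], [1, 0]])      # Potts-style smoothing: submodular
assert not is_submodular([[1, 0], [0, 1]])  # rewards disagreement: not submodular
print("submodularity checks passed")
```

Checking every edge this way tells you in advance whether TRW-S (or graph cuts) will return the exact MAP for a given two-label energy.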
Results: Binary Segmentation (Szeliski et al., 2008)
Labels: {foreground, background}
Unary potentials: -log(likelihood) using learnt fg/bg models
Pairwise potentials: 0 if the labels are the same; 1 - exp(|Da - Db|) if they differ
[Figure: segmentation results produced by TRW and by Belief Propagation.]
Results: Stereo Correspondence (Szeliski et al., 2008)
Labels: {disparities}
Unary potentials: similarity of pixel colours
Pairwise potentials: 0 if the labels are the same; 1 - exp(|Da - Db|) if they differ
[Figure: disparity maps produced by TRW and by Belief Propagation.]
Results: Non-submodular Problems (Kolmogorov, 2006)
30×30 grid, K50
[Figure: energy plots for BP and TRW-S.]
BP outperforms TRW-S
Summary
• Trees can be solved exactly: BP
• No guarantee of convergence otherwise: BP
• Strong tree agreement: TRW-S
• Submodular energies solved exactly: TRW-S
• TRW-S solves an LP relaxation of MAP estimation
• Loopier graphs give worse results (Rother and Kolmogorov, 2006)
Related New(er) Work
• Solving the Dual: Globerson and Jaakkola, 2007; Komodakis, Paragios and Tziritas, 2007; Weiss et al., 2006; Schlesinger and Giginyak, 2007
• Solving the Primal: Ravikumar, Agarwal and Wainwright, 2008
• More complex relaxations: Sontag and Jaakkola, 2007; Komodakis and Paragios, 2008; Kumar, Kolmogorov and Torr, 2007; Werner, 2008; Sontag et al., 2008; Kumar and Torr, 2008
Questions on Part I?
Code + Standard Data: http://vision.middlebury.edu/MRF