Discrete Optimization Lecture 4 – Part 3, M. Pawan Kumar, pawan.kumar@ecp.fr, slides available online
![Page 1: Discrete Optimization Lecture 4 – Part 3 M. Pawan Kumar pawan.kumar@ecp.fr Slides available online](https://reader036.vdocuments.site/reader036/viewer/2022062320/56649c745503460f94927269/html5/thumbnails/1.jpg)
Discrete Optimization
Lecture 4 – Part 3
M. Pawan Kumar
Slides available online: http://cvn.ecp.fr/personnel/pawan/
Recap
Loopy Belief Propagation
Initialize all messages to 1
In some order of edges, update messages
$M_{ab;k} = \sum_i \psi_a(l_i)\,\psi_{ab}(l_i,l_k) \prod_{n \neq b} M_{na;i}$
Until convergence: rate of change in messages < threshold
Convergence is not guaranteed!
Loopy Belief Propagation
$B'_a(i) = \psi_a(l_i) \prod_n M_{na;i}$
$B'_{ab}(i,j) = \psi_a(l_i)\,\psi_b(l_j)\,\psi_{ab}(l_i,l_j) \prod_{n \neq b} M_{na;i} \prod_{n \neq a} M_{nb;j}$
Normalize to compute beliefs $B_a(i)$, $B_{ab}(i,j)$
At convergence, $\sum_j B_{ab}(i,j) = B_a(i)$
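The message update and the node beliefs can be sketched in plain Python. This is only an illustration, not the lecture's implementation: the function name `loopy_bp`, the dictionary layout of the potentials, the sequential edge order, and the rescaling of each message to sum to 1 are all assumptions made here.

```python
import math

def loopy_bp(unary, pairwise, n_iters=50, tol=1e-9):
    """unary[a][i] = psi_a(l_i); pairwise[(a, b)][i][k] = psi_ab(l_i, l_k)."""
    n_labels = len(unary[0])
    msgs = {}                                  # msgs[(a, b)][k] plays the role of M_ab;k
    nbrs = {a: [] for a in range(len(unary))}
    for (a, b) in pairwise:
        msgs[(a, b)] = [1.0] * n_labels        # initialize all messages to 1
        msgs[(b, a)] = [1.0] * n_labels
        nbrs[a].append(b)
        nbrs[b].append(a)

    def psi(a, b, i, k):
        # psi_ab(l_i, l_k), whichever way the edge was stored
        if (a, b) in pairwise:
            return pairwise[(a, b)][i][k]
        return pairwise[(b, a)][k][i]

    for _ in range(n_iters):
        change = 0.0
        for (a, b) in list(msgs):              # some fixed order of directed edges
            new = []
            for k in range(n_labels):
                total = 0.0
                for i in range(n_labels):
                    incoming = math.prod(msgs[(n, a)][i] for n in nbrs[a] if n != b)
                    total += unary[a][i] * psi(a, b, i, k) * incoming
                new.append(total)
            s = sum(new)
            new = [m / s for m in new]         # rescaling is an assumption, not on the slide
            change = max(change, max(abs(x - y) for x, y in zip(new, msgs[(a, b)])))
            msgs[(a, b)] = new
        if change < tol:                       # change in messages below threshold
            break

    beliefs = []
    for a in range(len(unary)):
        raw = [unary[a][i] * math.prod(msgs[(n, a)][i] for n in nbrs[a])
               for i in range(n_labels)]
        z = sum(raw)
        beliefs.append([x / z for x in raw])   # normalize B'_a to get B_a
    return beliefs
```

On a tree the returned beliefs should match the exact marginals; on a graph with loops the loop may stop only because `n_iters` is exhausted.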
Outline
• Free Energy
• Mean-Field Approximation
• Bethe Approximation
• Kikuchi Approximation
Yedidia, Freeman and Weiss, 2000
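Exponential Family
$P(v) = \exp\{-\sum_a \sum_i \theta_{a;i} I_{a;i}(v_a) - \sum_{a,b} \sum_{i,k} \theta_{ab;ik} I_{ab;ik}(v_a,v_b) - A(\theta)\}$
$A(\theta) = \log Z$
Probability $P(v) = \frac{1}{Z} \prod_a \psi_a(v_a) \prod_{(a,b)} \psi_{ab}(v_a,v_b)$
$\psi_a(l_i) = \exp(-\theta_a(i))$, $\psi_{ab}(l_i,l_k) = \exp(-\theta_{ab}(i,k))$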
Exponential Family
Energy $Q(v) = \sum_a \theta_a(v_a) + \sum_{a,b} \theta_{ab}(v_a,v_b)$
Probability $P(v) = \frac{1}{Z} \prod_a \psi_a(v_a) \prod_{(a,b)} \psi_{ab}(v_a,v_b) = \frac{\exp(-Q(v))}{Z}$
Exponential Family
Probability $P(v) = \frac{1}{Z} \prod_a \psi_a(v_a) \prod_{(a,b)} \psi_{ab}(v_a,v_b) = \frac{\exp(-Q(v))}{Z}$
Approximate probability distribution $B(v)$
$B(v)$ has a simpler form than $P(v)$
Minimize the KL divergence between $B(v)$ and $P(v)$
Kullback-Leibler Divergence
$D = \sum_v B(v) \log \frac{B(v)}{P(v)}$
$D = \sum_v B(v) \log B(v) - \sum_v B(v) \log P(v)$
$D = \sum_v B(v) \log B(v) + \sum_v B(v) Q(v) - (-\log Z)$
$-\log Z$ is the Helmholtz free energy, constant with respect to $B$
Kullback-Leibler Divergence
$\sum_v B(v) \log B(v) + \sum_v B(v) Q(v)$
First term, $\sum_v B(v) \log B(v)$: negative entropy $-S(B)$
Second term, $\sum_v B(v) Q(v)$: average energy $U(B)$
Sum of the two terms: Gibbs free energy
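The decomposition D = (negative entropy + average energy) - (-log Z) can be checked numerically by brute-force enumeration on a tiny model. The sketch below assumes a hypothetical data layout (`theta_un[a][i]`, `theta_pw[(a, b)][i][k]`, and `B` as a dictionary over full labellings); none of these names come from the lecture.

```python
import math
from itertools import product

def energy(v, theta_un, theta_pw):
    """Q(v) = sum_a theta_a(v_a) + sum_(a,b) theta_ab(v_a, v_b)."""
    q = sum(theta_un[a][v[a]] for a in range(len(v)))
    q += sum(t[v[a]][v[b]] for (a, b), t in theta_pw.items())
    return q

def kl_and_free_energies(B, theta_un, theta_pw, n_labels):
    """B maps every labelling v (a tuple) to its probability B(v)."""
    labellings = list(product(range(n_labels), repeat=len(theta_un)))
    Z = sum(math.exp(-energy(v, theta_un, theta_pw)) for v in labellings)
    P = {v: math.exp(-energy(v, theta_un, theta_pw)) / Z for v in labellings}
    neg_entropy = sum(B[v] * math.log(B[v]) for v in labellings if B[v] > 0)
    avg_energy = sum(B[v] * energy(v, theta_un, theta_pw) for v in labellings)
    kl = sum(B[v] * math.log(B[v] / P[v]) for v in labellings if B[v] > 0)
    gibbs = neg_entropy + avg_energy      # Gibbs free energy of B
    helmholtz = -math.log(Z)              # constant with respect to B
    return kl, gibbs, helmholtz
```

For any `B`, the returned values should satisfy `kl == gibbs - helmholtz` up to rounding, so minimizing the Gibbs free energy over `B` is the same as minimizing the KL divergence.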
Outline
• Free Energy
• Mean-Field Approximation
• Bethe Approximation
• Kikuchi Approximation
Simpler Distribution
One-node marginals $B_a(i)$
Joint probability $B(v) = \prod_a B_a(v_a)$
Average Energy
$\sum_v B(v) Q(v)$
$= \sum_v B(v) \left(\sum_a \theta_a(v_a) + \sum_{a,b} \theta_{ab}(v_a,v_b)\right)$ *
$= \sum_a \sum_i B_a(i)\theta_a(i) + \sum_{a,b} \sum_{i,k} B_a(i) B_b(k) \theta_{ab}(i,k)$
(* = simplified on the board)
Negative Entropy
$\sum_v B(v) \log B(v)$ *
$= \sum_a \sum_i B_a(i) \log B_a(i)$
Mean-Field Free Energy
$\sum_a \sum_i B_a(i)\theta_a(i) + \sum_{a,b} \sum_{i,k} B_a(i) B_b(k) \theta_{ab}(i,k) + \sum_a \sum_i B_a(i) \log B_a(i)$
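As a sanity check on the simplification, the mean-field free energy can be compared with the Gibbs free energy computed by enumerating every labelling under the product distribution. The following is a minimal sketch; the function names and the nested-list layout of `B`, `theta_un`, and `theta_pw` are assumptions for illustration.

```python
import math
from itertools import product

def mean_field_free_energy(B, theta_un, theta_pw):
    """B[a][i] are one-node marginals; theta_pw[(a, b)][i][k] = theta_ab(i, k)."""
    n, n_labels = len(B), len(B[0])
    avg_energy = sum(B[a][i] * theta_un[a][i]
                     for a in range(n) for i in range(n_labels))
    avg_energy += sum(B[a][i] * B[b][k] * t[i][k]
                      for (a, b), t in theta_pw.items()
                      for i in range(n_labels) for k in range(n_labels))
    neg_entropy = sum(B[a][i] * math.log(B[a][i])
                      for a in range(n) for i in range(n_labels) if B[a][i] > 0)
    return avg_energy + neg_entropy

def gibbs_free_energy_brute_force(B, theta_un, theta_pw):
    """Same quantity, summing over all labellings with B(v) = prod_a B_a(v_a)."""
    n, n_labels = len(B), len(B[0])
    total = 0.0
    for v in product(range(n_labels), repeat=n):
        prob = math.prod(B[a][v[a]] for a in range(n))
        if prob > 0:
            q = sum(theta_un[a][v[a]] for a in range(n))
            q += sum(t[v[a]][v[b]] for (a, b), t in theta_pw.items())
            total += prob * (math.log(prob) + q)
    return total
```

The first function costs time linear in the number of edges, while the brute-force version is exponential in the number of variables; the two should agree for any valid `B`.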
Optimization Problem
$\min_B \sum_a \sum_i B_a(i)\theta_a(i) + \sum_{a,b} \sum_{i,k} B_a(i) B_b(k) \theta_{ab}(i,k) + \sum_a \sum_i B_a(i) \log B_a(i)$
s.t. $\sum_i B_a(i) = 1$ *
KKT Condition
$\log B_a(i) = -\theta_a(i) - \sum_b \sum_k B_b(k)\theta_{ab}(i,k) + \lambda_a - 1$
$B_a(i) = \exp\left(-\theta_a(i) - \sum_b \sum_k B_b(k)\theta_{ab}(i,k)\right) / Z_a$
Optimization
Initialize $B_a$ (random, uniform, domain knowledge)
Set all random variables to unprocessed
Repeat:
Pick an unprocessed random variable $V_a$
Update $B_a(i) = \exp\left(-\theta_a(i) - \sum_b \sum_k B_b(k)\theta_{ab}(i,k)\right) / Z_a$
If $B_a$ changes, set neighbors to unprocessed
Until convergence (guaranteed!)
Tutorial: Jaakkola, 2000 (one of several)
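The update schedule above might look as follows in Python. This is a sketch under stated assumptions rather than a reference implementation: the name `mean_field`, uniform initialization, the change threshold `tol`, the `max_updates` cap, and the subtraction of the maximum score before exponentiating (for numerical stability) are choices made here.

```python
import math

def mean_field(theta_un, theta_pw, max_updates=10000, tol=1e-10):
    """theta_un[a][i] = theta_a(i); theta_pw[(a, b)][i][k] = theta_ab(i, k)."""
    n, n_labels = len(theta_un), len(theta_un[0])
    nbrs = {a: [] for a in range(n)}
    for (a, b), t in theta_pw.items():
        nbrs[a].append((b, t))                                # indexed [own label][other label]
        nbrs[b].append((a, [list(col) for col in zip(*t)]))   # transpose for the other endpoint
    B = [[1.0 / n_labels] * n_labels for _ in range(n)]       # uniform initialization
    unprocessed = set(range(n))
    for _ in range(max_updates):
        if not unprocessed:
            break                                             # nothing left to update
        a = unprocessed.pop()
        scores = [-theta_un[a][i]
                  - sum(B[b][k] * t[i][k] for b, t in nbrs[a] for k in range(n_labels))
                  for i in range(n_labels)]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]         # shift does not change B_a
        Za = sum(weights)
        new = [w / Za for w in weights]
        if max(abs(x - y) for x, y in zip(new, B[a])) > tol:
            unprocessed.update(b for b, _ in nbrs[a])         # B_a changed: revisit neighbors
        B[a] = new
    return B
```

Each update minimizes the mean-field free energy with respect to a single $B_a$ while the others stay fixed, so the free energy never increases; the result is in general a local minimum that depends on the initialization.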
Outline
• Free Energy
• Mean-Field Approximation
• Bethe Approximation
• Kikuchi Approximation
Simpler Distribution
One-node marginals $B_a(i)$
Two-node marginals $B_{ab}(i,k)$
Joint probability hard to write down, but not for trees (Pearl, 1988):
$B(v) = \frac{\prod_{(a,b)} B_{ab}(v_a,v_b)}{\prod_a B_a(v_a)^{n(a)-1}}$
$n(a)$ = number of neighbors of $V_a$
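The tree factorization can be verified by brute force on a small model: compute the exact joint, take its one-node and two-node marginals, and rebuild the joint from them. The sketch below uses a hypothetical function name and data layout chosen for illustration.

```python
import math
from itertools import product

def tree_reparameterisation(theta_un, theta_pw, n_labels):
    """Return the exact joint P and the joint rebuilt from its marginals."""
    n = len(theta_un)
    labellings = list(product(range(n_labels), repeat=n))
    w = {v: math.exp(-sum(theta_un[a][v[a]] for a in range(n))
                     - sum(t[v[a]][v[b]] for (a, b), t in theta_pw.items()))
         for v in labellings}
    Z = sum(w.values())
    P = {v: x / Z for v, x in w.items()}
    # One-node and two-node marginals of P.
    Ba = [[sum(p for v, p in P.items() if v[a] == i) for i in range(n_labels)]
          for a in range(n)]
    Bab = {(a, b): [[sum(p for v, p in P.items() if v[a] == i and v[b] == k)
                     for k in range(n_labels)] for i in range(n_labels)]
           for (a, b) in theta_pw}
    degree = [sum(a in e for e in theta_pw) for a in range(n)]   # n(a)
    rebuilt = {}
    for v in labellings:
        num = math.prod(Bab[(a, b)][v[a]][v[b]] for (a, b) in theta_pw)
        den = math.prod(Ba[a][v[a]] ** (degree[a] - 1) for a in range(n))
        rebuilt[v] = num / den
    return P, rebuilt
```

On a chain the two dictionaries should agree entry by entry; on a graph with a cycle they generally differ, which is why the formula is only an approximation there.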
Average Energy
$\sum_v B(v) Q(v)$
$= \sum_v B(v) \left(\sum_a \theta_a(v_a) + \sum_{a,b} \theta_{ab}(v_a,v_b)\right)$ *
$= \sum_a \sum_i B_a(i)\theta_a(i) + \sum_{a,b} \sum_{i,k} B_{ab}(i,k)\theta_{ab}(i,k)$ *
$= -\sum_a (n(a)-1) \sum_i B_a(i)\theta_a(i) + \sum_{a,b} \sum_{i,k} B_{ab}(i,k)\left(\theta_a(i)+\theta_b(k)+\theta_{ab}(i,k)\right)$
$n(a)$ = number of neighbors of $V_a$
Negative Entropy
$\sum_v B(v) \log B(v)$ *
$-\sum_a (n(a)-1) \sum_i B_a(i) \log B_a(i) + \sum_{a,b} \sum_{i,k} B_{ab}(i,k) \log B_{ab}(i,k)$
Exact for trees
Approximate for general MRFs
Bethe Free Energy
$-\sum_a (n(a)-1) \sum_i B_a(i)\left(\theta_a(i)+\log B_a(i)\right) + \sum_{a,b} \sum_{i,k} B_{ab}(i,k)\left(\theta_a(i)+\theta_b(k)+\theta_{ab}(i,k)+\log B_{ab}(i,k)\right)$
Exact for trees
Approximate for general MRFs
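The claim that the Bethe free energy is exact on trees can be illustrated numerically: with the exact marginals of a tree-structured model plugged in, it should equal $-\log Z$. The helper names and data layout below are assumptions made for this sketch, and the marginals are obtained by brute-force enumeration rather than by message passing.

```python
import math
from itertools import product

def exact_marginals(theta_un, theta_pw, n_labels):
    """Brute-force one-node and two-node marginals, plus the partition function Z."""
    n = len(theta_un)
    w = {v: math.exp(-sum(theta_un[a][v[a]] for a in range(n))
                     - sum(t[v[a]][v[b]] for (a, b), t in theta_pw.items()))
         for v in product(range(n_labels), repeat=n)}
    Z = sum(w.values())
    Ba = [[sum(x for v, x in w.items() if v[a] == i) / Z for i in range(n_labels)]
          for a in range(n)]
    Bab = {(a, b): [[sum(x for v, x in w.items() if v[a] == i and v[b] == k) / Z
                     for k in range(n_labels)] for i in range(n_labels)]
           for (a, b) in theta_pw}
    return Ba, Bab, Z

def bethe_free_energy(Ba, Bab, theta_un, theta_pw):
    n, n_labels = len(Ba), len(Ba[0])
    degree = [sum(a in e for e in theta_pw) for a in range(n)]   # n(a)
    node_term = -sum((degree[a] - 1) * Ba[a][i] * (theta_un[a][i] + math.log(Ba[a][i]))
                     for a in range(n) for i in range(n_labels) if Ba[a][i] > 0)
    edge_term = sum(Bab[(a, b)][i][k]
                    * (theta_un[a][i] + theta_un[b][k] + t[i][k] + math.log(Bab[(a, b)][i][k]))
                    for (a, b), t in theta_pw.items()
                    for i in range(n_labels) for k in range(n_labels)
                    if Bab[(a, b)][i][k] > 0)
    return node_term + edge_term
```

For a chain, `bethe_free_energy(Ba, Bab, ...)` evaluated at the output of `exact_marginals` should match `-math.log(Z)` up to rounding.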
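Optimization Problem
$\min_B -\sum_a (n(a)-1) \sum_i B_a(i)\left(\theta_a(i)+\log B_a(i)\right) + \sum_{a,b} \sum_{i,k} B_{ab}(i,k)\left(\theta_a(i)+\theta_b(k)+\theta_{ab}(i,k)+\log B_{ab}(i,k)\right)$ *
s.t. $\sum_k B_{ab}(i,k) = B_a(i)$
$\sum_{i,k} B_{ab}(i,k) = 1$
$\sum_i B_a(i) = 1$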
KKT Condition
$\log B_{ab}(i,k) = -\left(\theta_a(i)+\theta_b(k)+\theta_{ab}(i,k)\right) + \lambda_{ab}(k) + \lambda_{ba}(i) + \mu_{ab} - 1$
$\lambda_{ab}(k) = \log M_{ab;k}$
Optimization
BP tries to optimize the Bethe free energy (its fixed points are stationary points of the Bethe free energy)
But it may not converge
Convergent alternatives exist (Yuille and Rangarajan, 2003)
Outline
• Free Energy
• Mean-Field Approximation
• Bethe Approximation
• Kikuchi Approximation
Local Free Energy
[Figure: 2×2 grid MRF over V1, V2, V3, V4 with edges (1,2), (1,3), (2,4), (3,4)]
Cluster of variables $c$
$G_c = \sum_{v_c} B_c(v_c)\left(\log B_c(v_c) + \sum_{d \subseteq c} \theta_d(v_d)\right)$
$G_{12} = \sum_{v_1,v_2} B_{12}(v_1,v_2)\left(\log B_{12}(v_1,v_2) + \theta_1(v_1) + \theta_2(v_2) + \theta_{12}(v_1,v_2)\right)$
$G_1 = \sum_{v_1} B_1(v_1)\left(\log B_1(v_1) + \theta_1(v_1)\right)$
$G_{1234} = \sum_{v_1,v_2,v_3,v_4} B_{1234}(v_1,v_2,v_3,v_4)\left(\log B_{1234}(v_1,v_2,v_3,v_4) + \theta_1(v_1) + \theta_2(v_2) + \theta_3(v_3) + \theta_4(v_4) + \theta_{12}(v_1,v_2) + \theta_{13}(v_1,v_3) + \theta_{24}(v_2,v_4) + \theta_{34}(v_3,v_4)\right)$
Sum of Local Free Energies
[Figure: 2×2 grid MRF over V1, V2, V3, V4]
Sum of free energies of all pairwise clusters
$G_{12} + G_{13} + G_{24} + G_{34}$
Overcounts $G_1$, $G_2$, $G_3$, $G_4$ once each!
$G_{12} + G_{13} + G_{24} + G_{34} - G_1 - G_2 - G_3 - G_4$
Bethe approximation!
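The cluster view can be made concrete with a small routine for the local free energy $G_c$ of an arbitrary cluster. In this sketch the potentials are stored in a dictionary keyed by tuples of 0-based variable indices (so V1..V4 become 0..3), and the beliefs are taken to be exact marginals obtained by enumeration; all function names and this layout are assumptions made for illustration.

```python
import math
from itertools import product

def lookup(table, d, labels_of):
    """theta_d(v_d): index a nested list by the labels of the variables in d."""
    for a in d:
        table = table[labels_of[a]]
    return table

def local_free_energy(cluster, B_c, theta):
    """G_c = sum_{v_c} B_c(v_c) * (log B_c(v_c) + sum_{d subset of c} theta_d(v_d))."""
    g = 0.0
    for labels, belief in B_c.items():
        if belief <= 0.0:
            continue
        labels_of = dict(zip(cluster, labels))
        e = sum(lookup(t, d, labels_of) for d, t in theta.items()
                if all(a in labels_of for a in d))
        g += belief * (math.log(belief) + e)
    return g

def exact_joint(theta, n_vars, n_labels):
    w = {}
    for v in product(range(n_labels), repeat=n_vars):
        labels_of = dict(enumerate(v))
        w[v] = math.exp(-sum(lookup(t, d, labels_of) for d, t in theta.items()))
    Z = sum(w.values())
    return {v: x / Z for v, x in w.items()}, Z

def marginal(P, cluster):
    out = {}
    for v, p in P.items():
        key = tuple(v[a] for a in cluster)
        out[key] = out.get(key, 0.0) + p
    return out

def bethe_from_local(P, theta):
    """Pairwise clusters minus single nodes. Subtracting each node once is
    only right when every variable has exactly two neighbors, as in the 4-cycle."""
    pairs = [d for d in theta if len(d) == 2]
    singles = [d for d in theta if len(d) == 1]
    total = sum(local_free_energy(d, marginal(P, d), theta) for d in pairs)
    total -= sum(local_free_energy(d, marginal(P, d), theta) for d in singles)
    return total
```

With the single cluster containing all four variables and the exact joint as its belief, `local_free_energy` should return $-\log Z$, whereas `bethe_from_local` on the same 4-cycle gives only an approximation to it.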
Kikuchi Approximations
Use bigger clusters
[Figure: 2×2 grid of variables V1, V2, V3, V4]
$G_{1234}$
[Figure: 2×3 grid of variables V1, V2, V3 (one row) and V4, V5, V6 (other row)]
$G_{1245} + G_{2356} - G_{25}$ (the shared cluster $\{V_2, V_5\}$ is counted twice, so it is subtracted once)
Derive message passing using KKT conditions!
Generalized Belief Propagation
[Figure: 2×3 grid of variables V1, V2, V3 (one row) and V4, V5, V6 (other row)]
Use bigger clusters: $G_{1245} + G_{2356} - G_{25}$
Derive message passing using KKT conditions!