Lecture 16: Earth-Mover Distance - MIT
Administrivia, Plan
• Administrivia:
  – NO CLASS next Tuesday 11/3 (holiday)
• Plan:
  – Earth-Mover Distance
• Scriber?
Earth-Mover Distance
• Definition:
  – Given two sets $A, B$ of points in a metric space
  – $EMD(A, B)$ = min cost bipartite matching between $A$ and $B$
• Which metric space?
  – Can be plane, $\ell_2$, $\ell_1$, …
• Applications in image vision
Images courtesy of Kristen Grauman
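To make the definition concrete, here is a minimal brute-force sketch that tries every bijection between two equal-size point sets. The function name `emd_bruteforce`, the Euclidean ground metric, and the sample points are assumptions for illustration only; the running time is factorial, so this is only usable on tiny inputs.

```python
import itertools
import math

def emd_bruteforce(A, B):
    """Cost of the min-cost perfect matching between equal-size point lists (Euclidean)."""
    if len(A) != len(B):
        raise ValueError("this sketch assumes |A| == |B|")
    best = math.inf
    # Each permutation of B's indices is one candidate bipartite matching.
    for perm in itertools.permutations(range(len(B))):
        cost = sum(math.dist(a, B[j]) for a, j in zip(A, perm))
        best = min(best, cost)
    return best

# Hypothetical example points.
A = [(0, 0), (0, 1), (2, 2)]
B = [(0, 0), (1, 1), (2, 1)]
print(emd_bruteforce(A, B))
```

A polynomial-time solver (for example the Hungarian algorithm) would replace the permutation loop in practice.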
Embedding EMD into $\ell_1$
• Why $\ell_1$?
• At least as hard as $\ell_1$
  – Can embed $\{0,1\}^d$ into EMD with distortion 1
• $\ell_1$ is richer than $\ell_2$
• Will focus on the integer grid $[\Delta]^2$.
Embedding EMD into $\ell_1$
• Theorem: Can embed EMD over $[\Delta]^2$ into $\ell_1$ with distortion $O(\log \Delta)$. In fact, will construct a randomized $f: 2^{[\Delta]^2} \to \ell_1$ such that:
  – for any $A, B \subset [\Delta]^2$: $EMD(A, B) \le \mathbb{E}\big[\|f(A) - f(B)\|_1\big] \le O(\log \Delta) \cdot EMD(A, B)$
  – time to embed a set of $s$ points: $O(s \log \Delta)$.
• Consequences:
  – Nearest Neighbor Search: $O(c \log \Delta)$ approximation with $O(s \cdot n^{1+1/c})$ space, and $O(n^{1/c} \cdot s \log \Delta)$ query time.
  – Computation: $O(\log \Delta)$ approximation in $O(s \log \Delta)$ time
    • Best known: $1 + \epsilon$ approximation in $\tilde{O}(s)$ time [AS'12]
[Charikar'02, Indyk-Thaper'03]
What if $|A| \ne |B|$?
• Suppose:
  – $|A| = a$
  – $|B| = b < a$
• Define
  $EMD_\Delta(A, B) = \Delta \cdot (a - b) + \min_{A', \pi} \sum_{p \in A'} d(p, \pi(p))$
where
  – $A'$ ranges over all subsets of $A$ of size $b$
  – $\pi: A' \to B$ ranges over all 1-to-1 mappings
For an optimal $A'$, call the points $p \in A \setminus A'$ unmatched; each of them pays $\Delta$.
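The definition of $EMD_\Delta$ can be sketched by brute force as below. The name `emd_delta`, the Euclidean ground metric, and the sample points are assumptions; swapping the roles of the two sets when $|A| < |B|$ is also an assumption made here for symmetry, since the slide only states the case $|B| < |A|$.

```python
import itertools
import math

def emd_delta(A, B, delta):
    """EMD_Delta: best partial matching cost plus `delta` per unmatched point."""
    if len(A) < len(B):
        A, B = B, A  # assumption: by symmetry, let A be the larger set
    extra = len(A) - len(B)
    best = math.inf
    # An ordered choice of |B| points of A covers every subset A' and every bijection pi.
    for chosen in itertools.permutations(range(len(A)), len(B)):
        cost = sum(math.dist(A[i], b) for i, b in zip(chosen, B))
        best = min(best, cost)
    return delta * extra + best

# Hypothetical example: one point of A must stay unmatched and pays delta = 3.
A = [(0, 0), (1, 0), (2, 2)]
B = [(0, 0), (2, 1)]
print(emd_delta(A, B, delta=3))
```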
Embedding EMD over small grid
• Suppose $\Delta = 3$
• $f(A)$ has nine coordinates, counting # points at each integer point
  – $f(A) = (2,1,1,0,0,0,1,0,0)$
  – $f(B) = (1,1,0,0,2,0,0,0,1)$
• Claim: $2\sqrt{2}$ distortion embedding
High level embedding
• Set in $[\Delta]^2$ box
• Embedding of set $A$:
  – take a quad-tree
    • grid of cell size $\Delta/3$
    • partition each cell in 3x3
    • recurse until of size 3x3
  – randomly shift it
  – Each cell gives a coordinate: $f(A)_c$ = # points in the cell $c$
• Want to prove $\mathbb{E}\big[\|f(A) - f(B)\|_1\big] \approx EMD(A, B)$
[Figure: randomly shifted quad-tree over the box, with the number of points in each cell]
$f(A)$ = …2210…0002…0011…0100…0000…
$f(B)$ = …1202…0100…0011…0000…1100…
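The construction above can be sketched as a sparse vector with one coordinate per (cell side, cell index). The names `grid_embedding` and `l1_sparse` are hypothetical. The slide counts points per cell; the optional `weighted` flag, which scales a cell of side $3^j$ by $3^j$, is an assumption motivated by the $k \cdot EMD_{\Delta/k}$ term of the decomposition lemma, not something stated on this slide.

```python
from collections import Counter

def grid_embedding(points, delta, shift, k=3, weighted=False):
    """Sparse f(A): coordinate (side, cell) holds the (optionally weighted) point count."""
    sx, sy = shift
    vec = Counter()
    side = 1
    while side < delta:  # cell sides 1, k, k^2, ..., up to delta / k
        w = side if weighted else 1
        for x, y in points:
            cell = ((x + sx) // side, (y + sy) // side)
            vec[(side, cell)] += w
        side *= k
    return vec

def l1_sparse(u, v):
    """l1 distance between two sparse vectors stored as Counters."""
    return sum(abs(u[c] - v[c]) for c in set(u) | set(v))

# Hypothetical usage: a fixed shift stands in for the random one.
fA = grid_embedding([(0, 0), (0, 0), (4, 5), (8, 8)], delta=9, shift=(1, 2))
fB = grid_embedding([(0, 0), (1, 0), (4, 5), (8, 8)], delta=9, shift=(1, 2))
print(len(fA), l1_sparse(fA, fB))
```

Each point touches one cell per level, so a set of $s$ points produces at most $s$ non-zero coordinates per level.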
Main idea: intuition
• Decompose EMD over $[\Delta]^2$ into EMDs over smaller grids
• Recursively reduce to $\Delta = O(1)$
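Decomposition Lemma
• For randomly-shifted cut-grid $G$ of side length $k$, will prove:
  1) $EMD_\Delta(A, B) \le EMD_k(A_1, B_1) + EMD_k(A_2, B_2) + \cdots + k \cdot EMD_{\Delta/k}(A_G, B_G)$
  2) $EMD_\Delta(A, B) \ge \frac{1}{3}\, \mathbb{E}\big[EMD_k(A_1, B_1) + EMD_k(A_2, B_2) + \cdots\big]$
  3) $EMD_\Delta(A, B) \ge \mathbb{E}\big[k \cdot EMD_{\Delta/k}(A_G, B_G)\big]$
• Here $A_i, B_i$ are the points falling in the $i$-th cell of $G$ (a $[k]^2$ mini-grid), and $A_G, B_G$ are the points viewed on the coarse $[\Delta/k]^2$ grid of cells
• The distortion will follow by applying the lemma recursively to $(A_G, B_G)$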
Claim 1 (lower bound)
• Claim 1: for a randomly-shifted cut-grid $G$ of side length $k$:
  $EMD_\Delta(A, B) \le EMD_k(A_1, B_1) + EMD_k(A_2, B_2) + \cdots + k \cdot EMD_{\Delta/k}(A_G, B_G)$
• Construct a matching $\pi$ for $EMD_\Delta(A, B)$ from the matchings on RHS as follows
• For each $a \in A$ (suppose $a \in A_i$) it is either:
  1) matched in $EMD_k(A_i, B_i)$ to some $b \in B_i$ (if $a \in A_i'$)
    • then $\pi(a) = b$
  2) or $a \notin A_i'$, and then it is matched in $EMD_{\Delta/k}(A_G, B_G)$ to some $b \in B_j$ ($j \ne i$)
    • then $\pi(a) = b$
• Cost?
  1) paid by $EMD_k(A_i, B_i)$
  2) Move $a$ to the center of its cell
    • Charge to $EMD_k(A_i, B_i)$, where the unmatched $a$ already pays $k$
    Move from cell $i$ to cell $j$
    • Charge $k$ per unit of coarse-grid distance to $k \cdot EMD_{\Delta/k}(A_G, B_G)$
• If $|A| > |B|$, each of the extra $|A| - |B|$ points pays $k \cdot \frac{\Delta}{k} = \Delta$ on both LHS & RHS
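As a sanity check of Claim 1, the sketch below evaluates both sides of the inequality by brute force for every integer shift of a cut-grid with $k = 3$ on a tiny instance. The helper names, the Euclidean ground metric, the sample points, and the reading of $A_G, B_G$ as "each point replaced by the index of its cell" are assumptions for illustration.

```python
import itertools
import math
from collections import defaultdict

def emd(A, B, penalty):
    """Brute-force EMD: Euclidean matching cost plus `penalty` per unmatched point."""
    if len(A) < len(B):
        A, B = B, A
    best = min(
        sum(math.dist(A[i], b) for i, b in zip(chosen, B))
        for chosen in itertools.permutations(range(len(A)), len(B))
    )
    return penalty * (len(A) - len(B)) + best

def claim1_sides(A, B, delta, k, shift):
    """Return (LHS, RHS) of Claim 1 for one shift of the cut-grid."""
    sx, sy = shift

    def cell(p):
        return ((p[0] + sx) // k, (p[1] + sy) // k)

    cells = defaultdict(lambda: ([], []))
    for p in A:
        cells[cell(p)][0].append(p)
    for p in B:
        cells[cell(p)][1].append(p)
    # Mini-grid instances: unmatched points pay k.
    local = sum(emd(Ai, Bi, k) for Ai, Bi in cells.values())
    # Coarse instance on cell indices: unmatched points pay delta / k.
    coarse = emd([cell(p) for p in A], [cell(p) for p in B], delta / k)
    return emd(A, B, delta), local + k * coarse

# Hypothetical example points in [9]^2.
A = [(0, 0), (1, 4), (5, 5), (8, 2)]
B = [(0, 1), (3, 4), (6, 6), (7, 7)]
for shift in itertools.product(range(3), repeat=2):
    lhs, rhs = claim1_sides(A, B, delta=9, k=3, shift=shift)
    print(shift, round(lhs, 3), round(rhs, 3), lhs <= rhs + 1e-9)
```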
Claims 2 & 3 (upper bound)
• Claims 2, 3: for a randomly-shifted cut-grid $G$ of side length $k$, we have:
  2) $EMD_\Delta(A, B) \ge \frac{1}{3}\, \mathbb{E}\big[EMD_k(A_1, B_1) + EMD_k(A_2, B_2) + \cdots\big]$
  3) $EMD_\Delta(A, B) \ge \mathbb{E}\big[k \cdot EMD_{\Delta/k}(A_G, B_G)\big]$
• Fix a matching $\pi$ minimizing $EMD_\Delta(A, B)$
  – Will construct matchings for each EMD on RHS
• Uncut pairs $(a, b) \in \pi$ are matched in respective $(A_i, B_i)$
• Cut pairs $(a, b) \in \pi$:
  – are unmatched in their mini-grids
  – are matched in $(A_G, B_G)$
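Claim 2: Cost
• Claim 2: $3 \cdot EMD_\Delta(A, B) \ge \mathbb{E}\big[EMD_k(A_1, B_1) + EMD_k(A_2, B_2) + \cdots\big]$
• Uncut pairs $(a, b)$ are matched in respective $(A_i, B_i)$
  – Total contribution from uncut pairs $\le EMD_\Delta(A, B)$
• Consider a cut pair $(a, b)$ at distance $a - b = (d_x, d_y)$
  – $(a, b)$ can contribute to RHS as they may be unmatched in their own mini-grids
  – $\Pr[(a, b) \text{ cut}] = 1 - \left(1 - \frac{d_x}{k}\right)_+ \left(1 - \frac{d_y}{k}\right)_+ \le \frac{d_x}{k} + \frac{d_y}{k} = \frac{1}{k} \|a - b\|_1$
  – Expected contribution of $(a, b)$ to RHS: $\le \Pr[(a, b) \text{ cut}] \cdot 2k \le 2 \|a - b\|_1$
  – Total expected cost contributed to RHS: $2 \cdot EMD_\Delta(A, B)$
• Total (cut & uncut pairs): $3 \cdot EMD_\Delta(A, B)$
• Here $\|a - b\|_1 = d_x + d_y$ is the distance when the grid metric is $\ell_1$; under $\ell_2$ the same bounds hold up to a factor $\sqrt{2}$, since $d_x + d_y \le \sqrt{2}\, \|a - b\|_2$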
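The cut-probability formula can be checked exactly on integer points by enumerating all $k^2$ integer shifts. This is a sketch under the assumption that the shift is uniform over $\{0, \dots, k-1\}^2$; the function names and the sample pair are hypothetical.

```python
from fractions import Fraction
from itertools import product

def cut_probability(a, b, k):
    """Exact Pr[a and b land in different cells] over all integer shifts in [0, k)^2."""
    cut = sum(
        ((a[0] + sx) // k, (a[1] + sy) // k) != ((b[0] + sx) // k, (b[1] + sy) // k)
        for sx, sy in product(range(k), repeat=2)
    )
    return Fraction(cut, k * k)

def cut_formula(a, b, k):
    """1 - (1 - dx/k)_+ (1 - dy/k)_+ from the slide."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    keep_x = max(Fraction(0), 1 - Fraction(dx, k))
    keep_y = max(Fraction(0), 1 - Fraction(dy, k))
    return 1 - keep_x * keep_y

# Hypothetical pair with (dx, dy) = (2, 1) and k = 3.
a, b, k = (2, 5), (4, 6), 3
print(cut_probability(a, b, k), cut_formula(a, b, k))
```

Both values agree, and both stay below the union bound $d_x/k + d_y/k$.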
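Claim 3: Cost
• Claim: $EMD_\Delta(A, B) \ge \mathbb{E}\big[k \cdot EMD_{\Delta/k}(A_G, B_G)\big]$
• Uncut pairs: contribute zero to RHS!
• Cut pair: $(a, b) \in \pi$ with $a - b = (d_x, d_y)$
  – if $d_x = x k + r_x$ and $d_y = y k + r_y$ (with $0 \le r_x, r_y < k$), then
  – expected cost contribution to $k \cdot EMD_{\Delta/k}(A_G, B_G)$: $\le \left(x + \frac{r_x}{k}\right) \cdot k + \left(y + \frac{r_y}{k}\right) \cdot k = d_x + d_y = \|a - b\|_1$
• Total expected cost $\le EMD_\Delta(A, B)$ (for the $\ell_1$ grid metric; up to a factor $\sqrt{2}$ for $\ell_2$)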
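The expectation in Claim 3 can also be checked exactly: averaged over all integer shifts, a pair crosses $x + r_x/k$ vertical and $y + r_y/k$ horizontal grid lines. The sketch below assumes a uniform integer shift and measures the coarse distance between cells in $\ell_1$; the function name and the sample pair are hypothetical.

```python
from fractions import Fraction
from itertools import product

def expected_coarse_cost(a, b, k):
    """E over integer shifts of k * (l1 distance between the cells of a and b)."""
    total = 0
    for sx, sy in product(range(k), repeat=2):
        cx = abs((a[0] + sx) // k - (b[0] + sx) // k)
        cy = abs((a[1] + sy) // k - (b[1] + sy) // k)
        total += k * (cx + cy)
    return Fraction(total, k * k)

# Hypothetical pair: dx = 7 = 2*3 + 1, dy = 1, with k = 3.
print(expected_coarse_cost((2, 5), (9, 6), 3))
```

The result equals $d_x + d_y$, matching the bound on the slide with equality for the $\ell_1$ cell distance.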
Recurse on decomposition
• For randomly-shifted cut-grid $G$ of side length $k$, we have:
  1) $EMD_\Delta(A, B) \le EMD_k(A_1, B_1) + EMD_k(A_2, B_2) + \cdots + k \cdot EMD_{\Delta/k}(A_G, B_G)$
  2) $EMD_\Delta(A, B) \ge \frac{1}{3}\, \mathbb{E}\big[EMD_k(A_1, B_1) + EMD_k(A_2, B_2) + \cdots\big]$
  3) $EMD_\Delta(A, B) \ge \mathbb{E}\big[k \cdot EMD_{\Delta/k}(A_G, B_G)\big]$
• We apply the decomposition recursively for $k = 3$
  – Choose randomly-shifted cut-grid $G_1$ on $[\Delta]^2$
  – Obtain many grids $[3]^2$, and a big grid $[\Delta/3]^2$
  – Then choose randomly-shifted cut-grid $G_2$ on $[\Delta/3]^2$
  – Obtain more grids $[3]^2$, and another big grid $[\Delta/9]^2$
  – Then choose randomly-shifted cut-grid $G_3$ on $[\Delta/9]^2$
  – …
• Then, embed each of the small grids $[3]^2$ into $\ell_1$, using the $O(1)$ distortion embedding, and concatenate the embeddings
  – Each $[3]^2$ grid occupies 9 coordinates in the $\ell_1$ embedding
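Proving recursion works
• Claim: embedding contracts distances by $O(1)$:
  $EMD_\Delta(A, B)$
  $\le \sum_i EMD_k(A_i, B_i) + k \cdot EMD_{\Delta/k}(A_{G_1}, B_{G_1})$
  $\le \sum_i EMD_k(A_i, B_i) + k \sum_i EMD_k(A_{G_1,i}, B_{G_1,i}) + k^2 \cdot EMD_{\Delta/k^2}(A_{G_2}, B_{G_2})$
  $\le \cdots \le$ sum of $EMD_3$ costs of the $3 \times 3$ instances (an instance at depth $j$ of the recursion carries the factor $k^j$)
  $\le O(1) \cdot \|f(A) - f(B)\|_1$, by the $2\sqrt{2}$-distortion embedding of each $3 \times 3$ grid (with the coordinates of $f$ scaled by the same factors)
• Claim: embedding distorts distances by $O(\log \Delta)$ in expectation:
  $(3 \log_k \Delta) \cdot EMD_\Delta(A, B)$
  $\ge 3 \cdot EMD_\Delta(A, B) + \left(3 \log_k \frac{\Delta}{k}\right) \cdot EMD_\Delta(A, B)$
  $\ge \mathbb{E}\big[\sum_i EMD_k(A_i, B_i) + \left(3 \log_k \frac{\Delta}{k}\right) \cdot k \cdot EMD_{\Delta/k}(A_{G_1}, B_{G_1})\big]$
  $\ge \cdots \ge \mathbb{E}\big[\text{sum of } EMD_3 \text{ costs of the } 3 \times 3 \text{ instances}\big]$
  $\ge \mathbb{E}\big[\|f(A) - f(B)\|_1\big]$, up to the constant factor of the $3 \times 3$ embedding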
Final theorem
• Theorem: can embed EMD over $[\Delta]^2$ into $\ell_1$ with $O(\log \Delta)$ distortion in expectation.
• Notes:
  – Dimension required: $O(\Delta^2)$, but a set $A$ of size $s$ maps to a vector that has only $O(s \cdot \log \Delta)$ non-zero coordinates.
  – Time: can compute in $O(s \cdot \log \Delta)$
  – By Markov's inequality, it's $O(\log \Delta)$ distortion with 90% probability
• Applications:
  – Can approximate $EMD(A, B)$ within $O(\log \Delta)$ in time $O(s \cdot \log \Delta)$
  – NNS: $O(c \cdot \log \Delta)$ approximation, with $O(n^{1+1/c} \cdot s)$ space, and $O(n^{1/c} \cdot s \cdot \log \Delta)$ query time.
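The Markov step in the notes can be spelled out as follows. The constant $C$ is an assumed name for the hidden constant in the expectation bound $\mathbb{E}\,\|f(A) - f(B)\|_1 \le C \log \Delta \cdot EMD(A, B)$; the contraction side needs no probability argument, since it holds for every shift.

```latex
\Pr\Big[\|f(A)-f(B)\|_1 > 10\, C \log\Delta \cdot EMD(A,B)\Big]
  \;\le\; \frac{\mathbb{E}\,\|f(A)-f(B)\|_1}{10\, C \log\Delta \cdot EMD(A,B)}
  \;\le\; \frac{1}{10}
```

So with probability at least 90% the expansion is at most $10\,C \log \Delta = O(\log \Delta)$ for a fixed pair $A, B$.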