Image Parsing: Unifying Segmentation, Detection, and Recognition
Shai Bagon, Oren Boiman
Image Understanding
• A long standing goal of Computer Vision
• Consists of understanding:
– Objects and visual patterns
– Context
– State / Actions of objects
– Relations between objects
– Physical layout
– Etc.
A picture is worth a
thousand words…
Natural Language Understanding
• Very far from being solved
• Even NL parsing (syntax) is problematic
• Ambiguities require high-level (semantic) knowledge
Image Parsing
• Decomposition to constituent visual patterns
– Edge Detection
– Segmentation
– Object Recognition
Image Parsing Framework
Low-Level Tasks: Segmentation, Edge Detection
High-Level Tasks: Object Recognition, Classification
Generic Framework: I → S
Inference:
Top-down (Generative): $P(S \mid I) \propto P(I \mid S)\,P(S)$
– Constellation, Star-Model etc.
+ Consistent Solutions
- Slow
Bottom-up (Discriminative): $q(\cdot \mid \mathrm{Tests}_j(I))$
– SVM, Boosting, Neural Nets etc.
+ Fast
- Possibly Inconsistent
Approach used in “Image Parsing”: I → S
Coming up next…
• Define a (Monstrous) Generative model for Image Parsing
• How to perform s-l-o-w inference on such models (MCMC)
• How to accelerate inference using bottom-up cues (DDMCMC)
$P(S \mid I) \propto P(I \mid S)\,P(S)$
Image Parsing Generative Model
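– No. of regions $K$
– Region shapes $L_i$ and types $\zeta_i$
– Region parameters $\Theta_i$
Prior $P(S)$, with factors labeled: Uniform, Uniform, exp, exp
Likelihood $P(I \mid S) = \prod_{i=1}^{K} p(I_{R_i} \mid L_i, \zeta_i, \Theta_i)$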
Generic Regions
Constant up to Gaussian noise: $gl \sim N(\mu, \sigma^2)$
Gray level histogram: $gl \sim (h_1, \ldots, h_G)$
Quadratic form: $gl(x, y) \sim N(\mu(x, y), \sigma^2)$, with $\mu(x, y) = ax^2 + bxy + cy^2 + dx + ey + f$
Faces
• Use a PCA model (Eigen-faces)
• Estimate Cov. $\Sigma$ and prin. comp. $V_1, \ldots, V_n$
$F \sim N(c_1 V_1 + \cdots + c_n V_n, \Sigma)$
Text region shapes
• Use Spline templates
• Allow Affine transformation
• Allow small deformations of control points
• Shading intensity model
Problem Formulation
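• Now we can compute $P(S \mid I) \propto P(I \mid S)\,P(S)$
• We’d like to optimize $\arg\max_S P(S \mid I)$
• over the space of parse graphs
$P(S) = p(K) \prod_{i=1}^{K} p(L_i)\,p(\zeta_i \mid L_i)\,p(\Theta_i \mid \zeta_i)$
$P(I \mid S) = \prod_{i=1}^{K} p(I_{R_i} \mid L_i, \zeta_i, \Theta_i)$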
Optimizing P(S|I) is not easy…
• Hybrid State Space: Continuous & Discrete
• Enormous number of local maxima
• Graphical model structure is not pre-determined
Rules out gradient methods
Rules out Belief propagation
Optimize by Sampling!
• Monte Carlo Principle
– Use random samples to optimize!
– Let’s say we’re given N samples from P(S|I): S1,…,SN
– Compute P(Si|I)
• Given Si it is easy to compute P(Si|I)
– Choose the best Si!
Detour: Sampling methods
• How to sample from a (very) complex probability space
• Sampling algorithm
• Why is Markov Chained in Monte Carlo?
Example
• Sample from $p(x)$, a density with two exponential terms (one of them in $|x - 4|$)
Markov Chain
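• A sequence of Random Variables: $X_t \in \{s_1, s_2, s_3\},\ t = 1, 2, \ldots$
• Markov property: $p(x_{t+1} \mid x_1, \ldots, x_t) = p(x_{t+1} \mid x_t)$
– Given the present, the future is independent of the past
• Transition: $p_{t+1} = p_t K$, with $K_{i,j} = K(s_i \to s_j)$
$K = \begin{pmatrix} 0 & 1 & 0 \\ 0 & .1 & .9 \\ .6 & .4 & 0 \end{pmatrix}$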
Markov Chain – cont.
• Under certain conditions the MC converges to a unique distribution
• Stationary distribution – first eigen-vector of K: $\hat{p} = \hat{p}K$ (eigenvalue 1)
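The two bullets above can be checked numerically. The sketch below assumes the 3-state transition matrix of the Markov Chain slide reads row by row as (0, 1, 0), (0, .1, .9), (.6, .4, 0); the function names `iterate_chain` and `stationary_by_eig` are illustrative only. It iterates p_{t+1} = p_t K from an arbitrary start and compares the result with the left eigenvector of K for eigenvalue 1.

```python
import numpy as np

# Assumed reading of the slide's transition matrix; row i holds K(s_i -> s_j).
K = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.1, 0.9],
              [0.6, 0.4, 0.0]])

def iterate_chain(p0, K, steps):
    """Apply p_{t+1} = p_t K for a fixed number of steps."""
    p = np.asarray(p0, dtype=float)
    for _ in range(steps):
        p = p @ K
    return p

def stationary_by_eig(K):
    """Left eigenvector of K for the eigenvalue closest to 1, scaled to sum to 1."""
    vals, vecs = np.linalg.eig(K.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

p_iter = iterate_chain([1.0, 0.0, 0.0], K, steps=200)
p_eig = stationary_by_eig(K)
print(p_iter, p_eig)
```

Both routes should agree, and starting from a different initial distribution should give the same limit, which is what "converges to a unique distribution" means here.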
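Markov Chain Monte Carlo
• Reminder: the stationary distribution satisfies $\hat{p} = \hat{p}K$, and $p_1 K^t \to \hat{p}$
• Had we wanted a sample from $\hat{p}$: take the value of $X_t$, $t \to \infty$
• How to make our $p$ the stationary distribution of the MC ($p = \hat{p}$)?
• How to guarantee convergence?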
Markov Chain convergence
• Irreducibility:
– The walk can reach any state starting at any state
• Non-periodicity:
– Stationary distribution cannot depend on t
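How to make p(x) Stationary
• Detailed Balance: $p(x) = \hat{p}$ (stationary distribution), if $p(x)\,K(x \to x^*) = p(x^*)\,K(x^* \to x)$
– Forward step = Backward step, with the same distribution p(.)
• Summing over $x$: $\sum_{x \in S} p(x)\,K(x \to x^*) = p(x^*) \sum_{x \in S} K(x^* \to x) = p(x^*)$
– $p(x^*)$ is independent of $x$, and the probabilities sum to 1
• Written as matrix product: $pK = p$
• Sufficient condition to converge to p(x)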
Kernel Selection
• Detailed Balance requires Kernel: $K(x \to x^*)$
• Metropolis-Hastings Kernel:
– Proposal: where to go next, $q(x^* \mid x)$
– Acceptance: should we go, $A(x, x^*) = \min\left(1, \frac{p(x^*)\,q(x \mid x^*)}{p(x)\,q(x^* \mid x)}\right)$
• MH Kernel provides detailed balance
Among the ten most influential algorithms in science and engineering
Metropolis Hastings
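• Sample $x^* \sim q(x^* \mid x_t)$
• Compute acceptance probability $A(x_t, x^*) = \min\left(1, \frac{p(x^*)\,q(x_t \mid x^*)}{p(x_t)\,q(x^* \mid x_t)}\right)$
• If rand < A: $x_{t+1} = x^*$
• Else: $x_{t+1} = x_t$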
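The loop on this slide can be sketched in a few lines. The target below is a hypothetical unnormalized bimodal density chosen for illustration (it is not the density from the Example slide), the proposal is assumed to be a symmetric Gaussian random walk, and `metropolis_hastings` and its parameters are illustrative names.

```python
import numpy as np

def log_target(x):
    """Unnormalized log-density of a hypothetical bimodal target (assumption)."""
    return np.logaddexp(-0.5 * (x + 2.0) ** 2, -0.5 * (x - 4.0) ** 2)

def metropolis_hastings(log_p, x0, n_steps, step=3.0, seed=0):
    rng = np.random.default_rng(seed)
    x = x0
    samples = np.empty(n_steps)
    accepted = 0
    for t in range(n_steps):
        # Sample x* ~ q(x* | x_t): Gaussian centered at the current state.
        x_star = x + step * rng.normal()
        # q is symmetric, so q(x|x*) / q(x*|x) = 1 and A = min(1, p(x*) / p(x)).
        log_a = min(0.0, log_p(x_star) - log_p(x))
        if np.log(rng.random()) < log_a:   # "If rand < A": move to x*
            x = x_star
            accepted += 1
        samples[t] = x                     # otherwise x_{t+1} = x_t
    return samples, accepted / n_steps

samples, acceptance_rate = metropolis_hastings(log_target, x0=0.0, n_steps=50000)
print(samples[5000:].mean(), acceptance_rate)
```

Only the ratio p(x*)/p(x) is ever evaluated, so the normalization constant of the target never has to be known. For this symmetric two-bump target the long-run sample mean should sit near 1, midway between the modes.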
Can we use any q(.) ?
1. Easy to sample from:
– we sample from q(.) instead of p(.)
Can we use any q(.) ?
2. Supports p(x): $p(x) > 0 \Rightarrow q(x) > 0$
[Figure: p(x) and q(x)]
Can we use any q(.) ?
3. Explores p(x) wisely:
– Too narrow q(.): q(x*|x) ~ N(x, .1)
– Too wide q(.): q(x*|x) ~ N(0,20)
[Figure: p(x) and q(x)]
Can we use any q(.) ?
1. Easy to sample from:
– we sample from q(.) instead of p(.)
2. Supports p(x):
– $p(x) > 0 \Rightarrow q(x) > 0$
3. Explores p(x) wisely:
– q(.) too narrow
– q(.) too wide -> low acceptance
• The best q(.) is p(.), but we can’t sample p(.) directly.
Combining Kernels
• Suppose we have $K_i(x \to x^*),\ i = 1, \ldots, m$
– Satisfying detailed balance with the same $p(x)$: $p(x)\,K_i(x \to x^*) = p(x^*)\,K_i(x^* \to x)$
• Then $K = \sum_i w_i K_i(x \to x^*)$ also satisfies detailed balance.
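The claim on this slide can be verified on a small discrete chain. The sketch below is an illustration under stated assumptions: a hypothetical 3-state target `p`, two hypothetical proposal matrices, Metropolis-Hastings kernels built from each, and mixing weights that are non-negative and sum to 1. The names `mh_kernel` and `satisfies_detailed_balance` are not from the slides.

```python
import numpy as np

def mh_kernel(p, Q):
    """Metropolis-Hastings transition matrix for target p and proposal matrix Q."""
    n = len(p)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and Q[i, j] > 0:
                a = min(1.0, p[j] * Q[j, i] / (p[i] * Q[i, j]))
                K[i, j] = Q[i, j] * a
        K[i, i] = 1.0 - K[i].sum()   # rejected or self proposals stay at state i
    return K

def satisfies_detailed_balance(p, K, tol=1e-12):
    flow = p[:, None] * K            # flow[i, j] = p(i) K(i -> j)
    return np.allclose(flow, flow.T, atol=tol)

p = np.array([0.2, 0.5, 0.3])                 # hypothetical target distribution
Q_uniform = np.full((3, 3), 1.0 / 3.0)        # hypothetical proposal 1
Q_skewed = np.array([[0.10, 0.60, 0.30],      # hypothetical proposal 2
                     [0.30, 0.40, 0.30],
                     [0.50, 0.25, 0.25]])
K1 = mh_kernel(p, Q_uniform)
K2 = mh_kernel(p, Q_skewed)
w = np.array([0.7, 0.3])                      # mixing weights, sum to 1
K_mix = w[0] * K1 + w[1] * K2
print(satisfies_detailed_balance(p, K_mix))
```

Because detailed balance is linear in K, any convex combination of kernels that share the same p inherits it, which is why sub-kernels with very different proposals can be mixed freely.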
Combining MH Kernels
• The same applies to Metropolis Hastings Kernels:
– Combining MH Kernels with different proposals
– MC will converge to $p(x)$
Example Revisited
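• Proposal distribution: $q(x^* \mid x)$, a Gaussian centered at the current state $x$
• Acceptance: $A = \min\left(1, \frac{p(x^*)\,q(x \mid x^*)}{p(x)\,q(x^* \mid x)}\right)$, with $p$ written out in the example's $N$, $L$ and $C$ terms
– Given x, it is easy to compute p(x)
– Normalization factor cancels out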
Example – cont.
MAP Estimation
• Converge to $\arg\max_x p(x)$
• Simulated Annealing: $p_i(x) \propto p(x)^{1/T_i}$
– explore less, exploit more!
• As $T_i \to 0$ the density is peaked at the global maxima
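A minimal sketch of annealed Metropolis sampling follows. The target is a hypothetical two-bump density whose taller bump sits at x = 4, the geometric cooling schedule and all parameter values are assumptions, and `anneal` is an illustrative name. At step i the chain targets p(x)^(1/T_i), so the log acceptance ratio is simply divided by the temperature.

```python
import numpy as np

def log_p(x):
    """Hypothetical unnormalized log-density; the global maximum is near x = 4."""
    return np.logaddexp(np.log(0.3) - 0.5 * (x + 2.0) ** 2,
                        np.log(0.7) - 0.5 * (x - 4.0) ** 2)

def anneal(log_p, x0, n_steps=5000, t_start=5.0, t_end=0.01, step=2.0, seed=0):
    rng = np.random.default_rng(seed)
    temps = np.geomspace(t_start, t_end, n_steps)   # assumed cooling schedule
    x, best = x0, x0
    for T in temps:
        x_star = x + step * rng.normal()            # symmetric Gaussian proposal
        # Metropolis ratio for the tempered target p(x)^(1/T).
        log_a = (log_p(x_star) - log_p(x)) / T
        if np.log(rng.random()) < log_a:
            x = x_star
        if log_p(x) > log_p(best):                  # remember the best state seen
            best = x
    return x, best

x_final, x_best = anneal(log_p, x0=-2.0)
print(x_final, x_best)
```

At high temperature the tempered density is nearly flat and the chain explores both bumps; as T shrinks, downhill moves are almost never accepted and the chain behaves like a stochastic hill climber.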
Annealing - example
• As $T_i \to 0$ the density is peaked at the global maxima
Model Selection
• Dimensionality variation in our space
– Varying number of regions
– Varying types of explanations per region
• Cannot directly compare density of different states!
Jump across dimensions
• Pair-wise common measure
Reversible Jumps
• Common measure
– Sample extensions u and u* s.t. dim(u) + dim(x) = dim(u*) + dim(x*)
– Use common dimension for comparison using invertible deterministic functions h and h’: $h(x, u) = h'(x^*, u^*)$
– Explicitly allow reversible jumps x* ↔ x
MCMC Summary
• Sample p(x) using Markov Chain
• Proposal q(x*|x)
– Supports p(x)
– Guides the sampling
• Detailed balance
– MH Kernel ensures convergence to p(x)
• Reversible Jumps
– Comparing across models and dimensions
MCMC – Take home message
If you want to make a new sample,
You should first learn how to propose.
Acceptance is random
Eventually you’ll get trapped in endless chains
until you become stationary.
Some say it is better to do reversible jumps between models.
Back to image parsing
• A state is a parse tree
• Moves between possible parses of the image
– Varying number of regions
– Different region types: Text, Face and Generic
– Varying number of parameters
MCMC Moves
• Birth / Death of a Face / Text
• Split / Merge of a generic region
• Model switching for a region
• Region boundary evolution
Moves -> Kernel
• Text Sub-Kernel: Text Birth, Text Death
• Face Sub-Kernel: Face Birth, Face Death
• Generic Sub-Kernel: Split Region, Merge Region
• Model Switching
• Boundary Evolution
Combined kernel: $K(S^* \mid S; I)$
Dimensionality change: must allow reversible jump
Using bottom-up cues
• So far we haven’t stated the proposal probabilities q(.)
• If q(.) is uninformed of the image, convergence can be painfully slow
• Solution: use the image to propose moves
Face birth kernel
Data Driven MCMC
• Define proposal probabilities q(x*|x;I)
• The proposal probabilities will depend on discriminative tests
– Face detection
– Text detection
– Edge detection
– Parameter clustering
• Generative model with Discriminative proposals
Face/Text Detection
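• Bottom-up cues: AdaBoost
– hard classification: $H_{\mathrm{Ada}} = \mathrm{sign}\left(\sum_i \alpha_i h_i(I)\right) = \mathrm{sign}\,\langle \alpha, \mathrm{Tst}_{\mathrm{Ada}}(I) \rangle$
– Estimate posterior instead: $q_{\mathrm{Ada}}(l \mid I) \propto e^{\,l \langle \alpha, \mathrm{Tst}_{\mathrm{Ada}}(I) \rangle},\ l = \pm 1$
– Run on sliding windows at several scales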
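The difference between the hard decision and the soft posterior can be sketched as follows. The weights and weak-classifier outputs are made-up numbers for a single window, the posterior is assumed to be proportional to exp(l * score) and normalized over the two labels, and the function names are illustrative.

```python
import math

def adaboost_score(alphas, weak_outputs):
    """Weighted vote <alpha, Tst(I)>, where weak_outputs[i] = h_i(I) in {-1, +1}."""
    return sum(a * h for a, h in zip(alphas, weak_outputs))

def hard_label(score):
    """Hard classification: sign of the weighted vote."""
    return 1 if score >= 0 else -1

def soft_posterior(score, label):
    """q(l | I) proportional to exp(l * score), normalized over l in {-1, +1}."""
    return math.exp(label * score) / (math.exp(score) + math.exp(-score))

alphas = [0.8, 0.5, 0.3]       # hypothetical weak-classifier weights
weak_outputs = [+1, -1, +1]    # hypothetical h_i(I) for one window
score = adaboost_score(alphas, weak_outputs)
print(hard_label(score), soft_posterior(score, +1), soft_posterior(score, -1))
```

The soft value is what makes the detector usable as a proposal: windows with a weak but positive response still receive some probability mass instead of being discarded or accepted outright.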
Edge Map
• Canny edge detection at several scales
• Only these edges are used for split / merge
Parameters clustering
• Estimate likely parameter settings in the image
• Cluster using Mean-Shift
How to propose?
• q(S*|S,I) should approximate p(S*|I)
• Choose one sub-kernel at random
– (e.g., create face)
• Use bottom-up cues to generate proposals: S1,S2,…
• Weight proposal according to p(Si|I)
• Sample from discrete distribution
Generic region – split/merge
• Split/merge according to edge map
• Dimensionality change – reversible
S ↔ S’
Generic region – split/merge
• Splitting k into i,j: $S_k \to S_{ij}$
• Proposals are weighted: $w_{split} = \frac{P(S_{ij} \mid I)}{P(S_k \mid I)} = \frac{P(I \mid S_{ij})\,P(S_{ij})}{P(I \mid S_k)\,P(S_k)}$
• Normalize weights to probabilities
• Sample
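The last two bullets (normalize, then sample) amount to drawing from a discrete distribution over the candidate splits. The sketch below assumes each candidate comes with a log weight, log P(S_ij | I) - log P(S_k | I); the numbers are invented and `sample_proposal` is an illustrative helper.

```python
import numpy as np

def sample_proposal(log_weights, rng):
    """Pick one candidate with probability proportional to exp(log_weights)."""
    logw = np.asarray(log_weights, dtype=float)
    w = np.exp(logw - logw.max())      # subtract the max for numerical stability
    probs = w / w.sum()                # normalize weights to probabilities
    idx = int(rng.choice(len(probs), p=probs))
    return idx, probs

rng = np.random.default_rng(0)
candidate_log_weights = [-1.0, 0.5, 0.0]   # hypothetical values for three splits
idx, probs = sample_proposal(candidate_log_weights, rng)
print(idx, probs)
```

Working with log weights avoids underflow when the posterior ratios span many orders of magnitude, which is common when the likelihood is a product over pixels.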
Faces sub-kernel
• Adding a face: S -> S’
• Take AdaBoost proposals
• Compute weights wi = P(S’|I)/P(S|I)
• Normalize weights to probabilities
• Sample
• Reversible kernel – add/remove face kernel
Accept / Reject
• We have the proposal q(S’|S;I)
• Check Metropolis Hastings acceptance: $A(S, S') = \min\left(1, \frac{q(S \mid S'; I)\,p(S' \mid I)}{q(S' \mid S; I)\,p(S \mid I)}\right)$
Full diagram
Input Image
Discriminative: Text Detection, Face Detection, Edge Detection, Parameter Clustering
Generative: combined kernel $K(S^* \mid S; I)$
• Text Sub-Kernel: Text Birth, Text Death
• Face Sub-Kernel: Face Birth, Face Death
• Generic Sub-Kernel: Split Region, Merge Region
• Model Switching
• Boundary Evolution
Results
Limitations
• Scaling to a large number of objects
– Algorithm design complexity
– Convergence speed
– Dealing with complex objects
• Good Synthesis / Detection but not so good segmentation
Extensions
Summary
• Image Parsing
– Decomposition to constituent visual patterns
• Top-down Generative Model for Parse Graphs
• Optimization using DDMCMC
– MCMC
– Discriminative bottom-up proposals
References
• Zhuowen Tu, Xiangrong Chen, Alan L. Yuille, Song-Chun Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. International Journal of Computer Vision, 2005.
• Z. Tu and S. Zhu. Image Segmentation by DDMCMC. IEEE Trans. Pattern Analysis and Machine Intelligence, 2002.
• Zhuowen Tu, Xiangrong Chen, A.L. Yuille and S.C. Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. IEEE International Conference on Computer Vision, 2003.
• C. Andrieu, N. de Freitas, A. Doucet and M. Jordan. An introduction to MCMC for machine learning. Machine Learning, vol. 50, pp. 5-43, Jan.-Feb. 2003.
Backups
Example
• Compute posterior for a simple GMM: $p(M \mid x) \propto p(x \mid M)\,p(M)$
– Given one X, what component of the mixture generated it?
– Exhaustive search
– What if larger space?
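For a small mixture the exhaustive search is a direct application of Bayes' rule. The sketch below uses a hypothetical two-component 1-D Gaussian mixture; the weights, means, standard deviations and the query point are illustrative values, as are the function names.

```python
import math

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def component_posterior(x, weights, means, stds):
    """p(M | x) for every component M, by exhaustive enumeration."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)                 # p(x), the normalization factor
    return [j / total for j in joint]

post = component_posterior(3.5, weights=[0.5, 0.5], means=[-2.0, 4.0], stds=[1.0, 1.0])
print(post)
```

Enumeration costs one likelihood evaluation per hypothesis, which is fine for two components but not for the space of parse graphs; that gap is what motivates sampling.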
Example revisited
Binarization
• Extracting text boundaries
• Adaptive thresholding
– Threshold Thr set per window from mean(Window) and a multiple of std(Window)
What’s so special about Text?
• Information lies in boundary
– AdaBoost: suggests region
– Adaptive binarization: boundary refinement
Model selection
• Union of model subspaces
• How can we compare densities across dimensions?
[Figure: union (∪) of model subspaces, plotted on axes from -5 to 5]
Parameter clustering
• Each cluster in the parameter set induces a saliency map
Shading
Gray level
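Generic region – split/merge
• Splitting k into i,j or merging i,j into k
• Suggestions are weighted:
$w^{split}_{k} = \frac{q(R_i, R_j)}{p(R_k \mid L_k, \zeta_k, \Theta_k)} \cdot \frac{q(L_i)\,q(\zeta_i, \Theta_i)\,q(L_j)\,q(\zeta_j, \Theta_j)}{p(L_k, \zeta_k, \Theta_k)}$
$w^{merge}_{ij} = \frac{q(R_i, R_j)}{p(R_i \mid L_i, \zeta_i, \Theta_i)\,p(R_j \mid L_j, \zeta_j, \Theta_j)} \cdot \frac{q(L_k)\,q(\zeta_k, \Theta_k)}{p(L_i, \zeta_i, \Theta_i)\,p(L_j, \zeta_j, \Theta_j)}$
– Region Affinity: $q(R_i, R_j)$
– Shape Prior: $q(L)$
– Parameter Clustering: $q(\zeta, \Theta)$
– Current Region Probability: $p(R \mid L, \zeta, \Theta)$
– Current parameters Probability: $p(L, \zeta, \Theta)$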
Switching node’s attributes
• No dimensionality change
• Weighting the proposals by $w^{change}_{i} = \frac{q(L_i')\,q(\zeta_i', \Theta_i')}{p(I_i \mid L_i, \zeta_i, \Theta_i)\,p(L_i, \zeta_i, \Theta_i)}$
Boundary Evolution Kernel
• Does not change dimensionality
• For two adjacent regions:
– Log likelihood ratio: $\log \frac{p(I(v); \Theta_i)}{p(I(v); \Theta_j)}$
– Changes in area
– Boundary curvature
– Deviation from control points (text)
– Brownian noise