plan of this talk - university of california, los angeles
TRANSCRIPT
![Page 1: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/1.jpg)
1
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Context Sensitive Graph Grammar and Top-Down/Bottom-up Inference
Song-Chun Zhu
Statistics and Computer ScienceUniversity of California, Los Angles
Joint work with four students: Hong Chen, Zijian Xu, Feng Han, Ziqiang Liu
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Plan of this talk
Issue 1: Context-sensitive graph grammar to generate “composite graphical templates” for cloth modelingwhich are reconfigurable Markov random fields.
Issue 2: Bottom-up / top-down inference with graph grammarsintegrating discriminative and generative models.
Issue 3: Information scaling and entropy rateWhat is the continuum spectrum for feature selection?
![Page 2: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/2.jpg)
2
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Issue I: Object Modeling with context-sensitive graph grammar
Motivation
Objects have large within-category variations in configurations1. Vehicles --- sedan, hunchback, van, truck, SUV, …2. Clothes --- jacket, T-shirt, sweater, ….3. Furniture --- desk, chair, dresser, …
Configuration changes more in scene categoriesA party scene, a living room, an office, a street scene, …
Each object should be represented by a set of graphical modelsor a graphical model that is “re-configurable”.
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
![Page 3: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/3.jpg)
3
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Integrating Markov random fields with graph grammars
Why graph grammar?
1. They are known to generate a large set of configurationsusing a small number of primitives and production rules.
2. They can be context sensitive --- an essential property.
Objectives:
Not just for classification, but to understand the whole object.Also for rendering, e.g. human portrait, sketches, and cartoon.
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Review: grammars and syntactic recognition
A stochastic grammar is often a 5-tuple G =<VN, VT, R, p, Σ >VN --- non-terminal nodes,VT --- terminal nodes,R --- production rules,Σ --- the language (a set of valid sentences)p --- the probability
})V(Vw w,s :p(w)) (w, { *TN
*R ∪∈→=Σ
![Page 4: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/4.jpg)
4
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Example of graph grammar from K.S. Fu in 70-80s
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Example of string grammar from K.S. Fu
![Page 5: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/5.jpg)
5
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Examples of diagram grammar(Rekers and Schurr 96)
Using a production ruleto expand the graph.
The shaded verticesare the neighborhoodOr environment.
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
![Page 6: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/6.jpg)
6
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Data processing: decomposing the artist’s sketch
The artist sketch is represented by a 2D attribute graph (like primal sketch Guo,Zhu,Wu iccv03)we decompose the graph into a number of 2D sub-graphs
![Page 7: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/7.jpg)
7
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Each part has a set of sub-templates (graphical sub-configurations)
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
An And-Or graph for composing template configurations
The And-Or graph was used in heuristicAI search (Pearl, 1984). Like the 12-counterfeit coin problem.
It was not used for modeling, but forproblem solving in a divide-and-conquerstrategy. Pearl didn’t use the horizontallinks for context.
Chen, Xu, Liu, and Zhu, 2005
![Page 8: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/8.jpg)
8
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Composing the sub-templatesEach sub-template is a vertex in a composite template, and vertices are connectedthrough “bonds”. The ideas of bonds and address variables were proposed by Grenander in his book and Fridman and Mumford.
Intuitively, each sub-template is like a class in C++.
![Page 9: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/9.jpg)
9
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Examples of new configurations (synthesis)
Note that the number of sub-templates and their connections change.Each is a possible “configuration”.
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Some results for recognition and sketch
Chen, Xu, Liu, and Zhu, 2005
![Page 10: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/10.jpg)
10
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Formulation
An And-Or graph represents a graph grammar for object class in a 5-tuple
is a set of terminal nodes. A node is a subgraph.
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Formulation
is a set of OR-nodes
is a set of AND-nodes
![Page 11: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/11.jpg)
11
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Formulation
is a set of valid configurations (Grenander called “global regularity”)
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Probability model
We can put a joint probability (prior) on the composite templates.We imagine a physical system with dynamic connectivities.
The first term alone stands for a SCFG. The second and third terms are potentials on the graph G.
![Page 12: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/12.jpg)
12
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
The likelihood model is the primal sketch model
sketching pursuit process
sketch imagesynthesized textures
org image
syn image
+=
sketches
(Guo, Zhu and Wu, 2003)
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Simple examples of the image primitive(Guo, Zhu and Wu, 2003)Learned texton dictionary
![Page 13: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/13.jpg)
13
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
A simpler and more flexible graph grammarHan and Zhu 2005
One terminal sub-template--- a planar rectangle in 3-space
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Six grammar rules which can be used recursively
r1S ::= S
m
r2::=A A
A1m
scene
line
r3::=A A
A11mxn
mesh
AmA2
r4::=A A
A1
nesting
A2
r6::=A A
cube
A1
A2A3
r5::=A
instanceA
A1A2
A3
line production rule
A1 A2
nesting production rule
A1A2
A3
cube production rule
rectangleA1 A2 Am A1m
A2m a
![Page 14: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/14.jpg)
14
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Two configuration examples
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Issue 2: Top-down / Bottom-up InferenceIntegrating generative and discriminative methods
![Page 15: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/15.jpg)
15
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Generative
Discriminative
marginal posterior
Generative vs. Discriminative Algorithms
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
face text region model switching
Markov kernel
deathbirth deathbirth split merge
input image
face detection text detection edge detection model clustering
+
generativeinference
discriminativeinference
weighted particles
Diagram for Integrating Top-down generative andBottom-up discriminativeMethods.
![Page 16: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/16.jpg)
16
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Bottom-up vs. Top-Down:It is essentially an ordering problem
Measuring the power of a discriminative Test
Measuring the power of sub-kernels
Both bottom-up tests and top-down kernels can be evaluated by the amount of information gains.
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Bottom-up detection (proposal) of rectangles
independent rectangles
line segments in three groups
Each rectangle consists of two pairs of line segments that share a vanish point.
![Page 17: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/17.jpg)
17
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Bottom-up/top-down Inference with grammar
top-down proposals
bottom-up proposals
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Each grammar rule has a list of particles
......
............
......
......r6:
r2:
r3:
r4:
A particle is a production rule partially matched, its probability measuresan approximated posterior probability ratio.
By analogy, a production rule is like a image base in matching pursuit.
![Page 18: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/18.jpg)
18
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Example of top-down / bottom-up inference
a) edge map b) bottom-up proposal c) current state
d) proposed by rule 3 e) proposed by rule 6 f) proposed by rule 4
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Results
Edge map
Rectangles inferred
(Han and Zhu, 05)
![Page 19: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/19.jpg)
19
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Results
Edge map
Rectangles inferred
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Results
Edge map
Rectangles inferred
![Page 20: Plan of this talk - University of California, Los Angeles](https://reader033.vdocuments.site/reader033/viewer/2022042702/626509839cf62021de1c925d/html5/thumbnails/20.jpg)
20
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Results
Edge map Rectangles inferred
MSRI workshop on object recognition, March 2005, Song-Chun Zhu
Summary on issues 1-2
1. Context are modeled intrinsically by Markov random fieldsOne can measure the context information quantitatively by the minimax entropy learning scheme.
2. Grammars (switches) and address variables (hooks) are ways tore-configure the random field structures and to generatecomposite graphical templates.
3. Top-down / bottom-up inference comes naturally with grammarsThe information gains of bottom-up tests and top-down kernelsshould be measured quantitatively. Then algorithm design is an ordering problem.