plan of this talk - university of california, los angeles

20
MSRI workshop on object recognition, March 2005, Song-Chun Zhu Context Sensitive Graph Grammar and Top-Down/Bottom-up Inference Song-Chun Zhu Statistics and Computer Science University of California, Los Angles Joint work with four students: Hong Chen, Zijian Xu, Feng Han, Ziqiang Liu MSRI workshop on object recognition, March 2005, Song-Chun Zhu Plan of this talk Issue 1: Context-sensitive graph grammar to generate composite graphical templates” for cloth modeling which are reconfigurable Markov random fields. Issue 2: Bottom-up / top-down inference with graph grammars integrating discriminative and generative models. Issue 3: Information scaling and entropy rate What is the continuum spectrum for feature selection?

Upload: others

Post on 24-Apr-2022

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Plan of this talk - University of California, Los Angeles

1

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Context Sensitive Graph Grammar and Top-Down/Bottom-up Inference

Song-Chun Zhu

Statistics and Computer ScienceUniversity of California, Los Angles

Joint work with four students: Hong Chen, Zijian Xu, Feng Han, Ziqiang Liu

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Plan of this talk

Issue 1: Context-sensitive graph grammar to generate “composite graphical templates” for cloth modelingwhich are reconfigurable Markov random fields.

Issue 2: Bottom-up / top-down inference with graph grammarsintegrating discriminative and generative models.

Issue 3: Information scaling and entropy rateWhat is the continuum spectrum for feature selection?

Page 2: Plan of this talk - University of California, Los Angeles

2

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Issue I: Object Modeling with context-sensitive graph grammar

Motivation

Objects have large within-category variations in configurations1. Vehicles --- sedan, hunchback, van, truck, SUV, …2. Clothes --- jacket, T-shirt, sweater, ….3. Furniture --- desk, chair, dresser, …

Configuration changes more in scene categoriesA party scene, a living room, an office, a street scene, …

Each object should be represented by a set of graphical modelsor a graphical model that is “re-configurable”.

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Page 3: Plan of this talk - University of California, Los Angeles

3

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Integrating Markov random fields with graph grammars

Why graph grammar?

1. They are known to generate a large set of configurationsusing a small number of primitives and production rules.

2. They can be context sensitive --- an essential property.

Objectives:

Not just for classification, but to understand the whole object.Also for rendering, e.g. human portrait, sketches, and cartoon.

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Review: grammars and syntactic recognition

A stochastic grammar is often a 5-tuple G =<VN, VT, R, p, Σ >VN --- non-terminal nodes,VT --- terminal nodes,R --- production rules,Σ --- the language (a set of valid sentences)p --- the probability

})V(Vw w,s :p(w)) (w, { *TN

*R ∪∈→=Σ

Page 4: Plan of this talk - University of California, Los Angeles

4

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Example of graph grammar from K.S. Fu in 70-80s

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Example of string grammar from K.S. Fu

Page 5: Plan of this talk - University of California, Los Angeles

5

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Examples of diagram grammar(Rekers and Schurr 96)

Using a production ruleto expand the graph.

The shaded verticesare the neighborhoodOr environment.

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Page 6: Plan of this talk - University of California, Los Angeles

6

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Data processing: decomposing the artist’s sketch

The artist sketch is represented by a 2D attribute graph (like primal sketch Guo,Zhu,Wu iccv03)we decompose the graph into a number of 2D sub-graphs

Page 7: Plan of this talk - University of California, Los Angeles

7

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Each part has a set of sub-templates (graphical sub-configurations)

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

An And-Or graph for composing template configurations

The And-Or graph was used in heuristicAI search (Pearl, 1984). Like the 12-counterfeit coin problem.

It was not used for modeling, but forproblem solving in a divide-and-conquerstrategy. Pearl didn’t use the horizontallinks for context.

Chen, Xu, Liu, and Zhu, 2005

Page 8: Plan of this talk - University of California, Los Angeles

8

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Composing the sub-templatesEach sub-template is a vertex in a composite template, and vertices are connectedthrough “bonds”. The ideas of bonds and address variables were proposed by Grenander in his book and Fridman and Mumford.

Intuitively, each sub-template is like a class in C++.

Page 9: Plan of this talk - University of California, Los Angeles

9

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Examples of new configurations (synthesis)

Note that the number of sub-templates and their connections change.Each is a possible “configuration”.

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Some results for recognition and sketch

Chen, Xu, Liu, and Zhu, 2005

Page 10: Plan of this talk - University of California, Los Angeles

10

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Formulation

An And-Or graph represents a graph grammar for object class in a 5-tuple

is a set of terminal nodes. A node is a subgraph.

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Formulation

is a set of OR-nodes

is a set of AND-nodes

Page 11: Plan of this talk - University of California, Los Angeles

11

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Formulation

is a set of valid configurations (Grenander called “global regularity”)

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Probability model

We can put a joint probability (prior) on the composite templates.We imagine a physical system with dynamic connectivities.

The first term alone stands for a SCFG. The second and third terms are potentials on the graph G.

Page 12: Plan of this talk - University of California, Los Angeles

12

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

The likelihood model is the primal sketch model

sketching pursuit process

sketch imagesynthesized textures

org image

syn image

+=

sketches

(Guo, Zhu and Wu, 2003)

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Simple examples of the image primitive(Guo, Zhu and Wu, 2003)Learned texton dictionary

Page 13: Plan of this talk - University of California, Los Angeles

13

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

A simpler and more flexible graph grammarHan and Zhu 2005

One terminal sub-template--- a planar rectangle in 3-space

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Six grammar rules which can be used recursively

r1S ::= S

m

r2::=A A

A1m

scene

line

r3::=A A

A11mxn

mesh

AmA2

r4::=A A

A1

nesting

A2

r6::=A A

cube

A1

A2A3

r5::=A

instanceA

A1A2

A3

line production rule

A1 A2

nesting production rule

A1A2

A3

cube production rule

rectangleA1 A2 Am A1m

A2m a

Page 14: Plan of this talk - University of California, Los Angeles

14

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Two configuration examples

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Issue 2: Top-down / Bottom-up InferenceIntegrating generative and discriminative methods

Page 15: Plan of this talk - University of California, Los Angeles

15

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Generative

Discriminative

marginal posterior

Generative vs. Discriminative Algorithms

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

face text region model switching

Markov kernel

deathbirth deathbirth split merge

input image

face detection text detection edge detection model clustering

+

generativeinference

discriminativeinference

weighted particles

Diagram for Integrating Top-down generative andBottom-up discriminativeMethods.

Page 16: Plan of this talk - University of California, Los Angeles

16

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Bottom-up vs. Top-Down:It is essentially an ordering problem

Measuring the power of a discriminative Test

Measuring the power of sub-kernels

Both bottom-up tests and top-down kernels can be evaluated by the amount of information gains.

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Bottom-up detection (proposal) of rectangles

independent rectangles

line segments in three groups

Each rectangle consists of two pairs of line segments that share a vanish point.

Page 17: Plan of this talk - University of California, Los Angeles

17

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Bottom-up/top-down Inference with grammar

top-down proposals

bottom-up proposals

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Each grammar rule has a list of particles

......

............

......

......r6:

r2:

r3:

r4:

A particle is a production rule partially matched, its probability measuresan approximated posterior probability ratio.

By analogy, a production rule is like a image base in matching pursuit.

Page 18: Plan of this talk - University of California, Los Angeles

18

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Example of top-down / bottom-up inference

a) edge map b) bottom-up proposal c) current state

d) proposed by rule 3 e) proposed by rule 6 f) proposed by rule 4

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Results

Edge map

Rectangles inferred

(Han and Zhu, 05)

Page 19: Plan of this talk - University of California, Los Angeles

19

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Results

Edge map

Rectangles inferred

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Results

Edge map

Rectangles inferred

Page 20: Plan of this talk - University of California, Los Angeles

20

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Results

Edge map Rectangles inferred

MSRI workshop on object recognition, March 2005, Song-Chun Zhu

Summary on issues 1-2

1. Context are modeled intrinsically by Markov random fieldsOne can measure the context information quantitatively by the minimax entropy learning scheme.

2. Grammars (switches) and address variables (hooks) are ways tore-configure the random field structures and to generatecomposite graphical templates.

3. Top-down / bottom-up inference comes naturally with grammarsThe information gains of bottom-up tests and top-down kernelsshould be measured quantitatively. Then algorithm design is an ordering problem.