1

Ecological Statistics and Perceptual Organization

Charless Fowlkes

work with David Martin and Jitendra Malik

at University of California at Berkeley

2

“I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of color. Do I have 327? No. I have sky, house, and trees.”

3


010011010....

4


Laws of Organization in Perceptual Forms Max Wertheimer (1923)

5

Perceptual Organization

• Grouping

• Figure/Ground

7

Grouping by proximity

8

Grouping by similarity

9

Grouping by similarity (of shape)

10

Size and Surroundedness

11

turnyourhead.com

Familiarity / Meaningfulness

12

Convexity

13

Perceptual organization as a computational theory of vision?

14

• How do these cues apply to real world images?

• How are different cues combined?

• Why does the visual system use these cues?

15

Ecological Validity

• Brunswik & Kamiya 1953: Gestalt rules reflect the structure of the natural world

• Attempted to validate the grouping rule of proximity of similars

• Brunswik was ahead of his time… we now have the tools.

Egon Brunswik (1903-1955)

16

Strategy

1. Collect high-level ground-truth annotations for a large collection of images

2. Develop computational models of cues for perceptual organization calibrated to ground-truth training data

3. Measure cue statistics and evaluate the relative “power” of different cues

18

• 30 subjects, age 19-23

• 1,458 person hours over 8 months

• 1,020 Corel images

• 11,595 segmentations

– color, gray, inverted/negated

“You will be presented a photographic image. Divide the image into some number of segments, where the segments represent ‘things’ or ‘parts of things’ in the scene. The number of segments is up to you, as it depends on the image. Something between 2 and 30 is likely to be appropriate. It is important that all of the segments have approximately equal importance.”

19

Berkeley Segmentation DataSet [BSDS]

20

[Figure, panels (a), (b), (c): region-label diagrams for one image. Labels: Scene, Background, Sky, Trees, Shore, Water, Foreground, Mermaid, Rocks, Base, Land, Small, Top, L, R]

21

Overview

• Grouping

– Local Boundary Detection

– Local Human Performance

• Figure/Ground

– Local Figure/Ground Cues

– Local Human Performance

• Discussion

22

[Figure: example image patches in two columns, Non-Boundaries and Boundaries; panels labeled T, B, C]

23

Gradient Features

• Brightness Gradient (BG)

– Difference of brightness distributions

• Color Gradient (CG)

– Difference of color distributions

• Texture Gradient (TG)

– Difference of distributions of V1-like filter responses

1976 CIE L*a*b* color space

Distributions are represented by smoothed histograms

r(x,y)

$$\chi^2(g,h) = \frac{1}{2} \sum_i \frac{(g_i - h_i)^2}{g_i + h_i}$$
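The histogram comparison on this slide appears to be the chi-squared difference between two binned distributions g and h. A minimal sketch, assuming g and h are histograms of the same feature (brightness, color, or filter responses) taken from two neighboring image regions. The function names and toy bin counts are illustrative, and the histogram smoothing mentioned on the slide is left out:

```python
def normalize(counts):
    """Turn raw bin counts into a distribution that sums to 1."""
    total = float(sum(counts))
    return [c / total for c in counts]


def chi_squared(g, h):
    """Chi-squared difference: 0.5 * sum_i (g_i - h_i)^2 / (g_i + h_i)."""
    if len(g) != len(h):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for gi, hi in zip(g, h):
        denom = gi + hi
        if denom > 0:  # a bin that is empty in both histograms contributes nothing
            total += (gi - hi) ** 2 / denom
    return 0.5 * total


# Toy (made-up) bin counts for the two regions being compared.
left = normalize([8, 2, 0, 0])
right = normalize([0, 0, 3, 7])

same = chi_squared(left, left)    # identical distributions
diff = chi_squared(left, right)   # distributions with no overlapping bins
```

For normalized histograms the value runs from 0 (identical distributions) to 1 (no shared bins), which makes the gradient easy to read as a boundary-strength feature.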

24

Local Boundary Detection

[Diagram labels: Image; Boundary Cues (Brightness, Color, Texture); Cue Combination; Model; Pb]

• Use training data to learn the posterior probability of a boundary, P(b=1|x,y,θ), from local gradient information

• Logistic regression to combine cues
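Cue combination by logistic regression can be sketched as follows. This is an illustration only, assuming each pixel is described by three gradient values (BG, CG, TG) and a human-derived label of 1 (boundary) or 0 (non-boundary). The training rows are made up, and the plain gradient-descent fit stands in for whatever fitting procedure was actually used:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def pb(x, w, b):
    """Posterior probability of a boundary given cue vector x."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_feat = len(features[0])
    n = len(features)
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_feat
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = pb(x, w, b) - y  # derivative of the log loss w.r.t. the logit
            for j in range(n_feat):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Made-up training rows: (BG, CG, TG) per pixel, with boundary labels.
X = [(0.9, 0.8, 0.7), (0.8, 0.6, 0.9), (0.7, 0.9, 0.6),
     (0.1, 0.2, 0.1), (0.2, 0.1, 0.3), (0.3, 0.2, 0.2)]
y = [1, 1, 1, 0, 0, 0]

w, b = fit_logistic(X, y)
strong = pb((0.9, 0.9, 0.9), w, b)  # all cues fire
weak = pb((0.1, 0.1, 0.1), w, b)    # no cue fires
```

The fitted weights give a rough sense of how much each cue contributes, since the model is linear in the cues before the sigmoid.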

25

[Figure columns: Image | Canny | Pb | Human]

26

[Figure columns: Image | Canny | Pb | Humans]

28

Goal

Fewer False Positives

Fewer Misses

29

Recall = P(Pb > t | H = 1)

Precision = P(H = 1 | Pb > t)
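The two definitions can be estimated by counting over labeled pixels. A minimal sketch, assuming one Pb value and one human label (H = 1 for boundary) per pixel. The values below are made up, and the comparison is strictly pixelwise with no tolerance for small localization errors, which is a simplification of this sketch:

```python
def precision_recall(pb_values, human_labels, t):
    """Estimate P(H=1 | Pb>t) and P(Pb>t | H=1) by counting."""
    detected = [p > t for p in pb_values]
    true_pos = sum(1 for d, h in zip(detected, human_labels) if d and h == 1)
    n_detected = sum(detected)
    n_boundary = sum(1 for h in human_labels if h == 1)
    # Convention for this sketch: an empty denominator yields 1.0.
    precision = true_pos / n_detected if n_detected else 1.0
    recall = true_pos / n_boundary if n_boundary else 1.0
    return precision, recall


# Made-up detector outputs and human labels for six pixels.
pb_values = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
human_labels = [1, 1, 0, 1, 0, 0]

# Sweeping the threshold t traces out a precision-recall curve.
curve = [(t, *precision_recall(pb_values, human_labels, t))
         for t in (0.2, 0.5, 0.9)]
```

Raising t trades misses for false positives: on this toy data, t = 0.2 gives full recall at precision 0.6, while t = 0.9 gives precision 1.0 at recall 1/3.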

30

How good are humans locally?

[Figure columns: Off-Boundary | On-Boundary]

• Algorithm: r = 9, Humans: r = {5,9,18}

• Fixation (2s) -> Patch (200ms) -> Mask (1s)

31

Man versus Machine:

32

Findings

• Texture gradient information is important for natural scenes

• Optimal local cue combination is achievable with a simple linear model

• A local boundary detection algorithm that performs nearly as well as humans shown the same local patches (and better than traditional edge detectors).

33

Overview

• Grouping

– Local Boundary Detection

– Local Human Performance

• Figure/Ground

– Local Figure/Ground Cues

– Local Human Performance

• Discussion

34

Local Cues for Figure/Ground

• Assume we have a perfect segmentation

• Can we predict which region a contour belongs to based on its local shape?

– Size

– Convexity

– Lower Region

35

Figure-Ground Labeling

- start with 200 segmented images of natural scenes

- boundaries labeled by at least 2 different human subjects

- subjects agree on 88% of contours labeled

36

Size(p) = log(AreaF / AreaG)

Size and Surroundedness [Rubin 1921]

[Figure labels: G, F, p]
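A minimal sketch of the size cue at a contour point p. It assumes the two regions are given as a small label grid ('F' or 'G' per pixel) and that the areas are counted inside a circular window around p. The window, the grid, and the function names are assumptions made for this illustration:

```python
import math


def local_areas(labels, p, radius):
    """Count F and G pixels within `radius` of p = (row, col)."""
    area_f = area_g = 0
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if (r - p[0]) ** 2 + (c - p[1]) ** 2 <= radius ** 2:
                if lab == "F":
                    area_f += 1
                elif lab == "G":
                    area_g += 1
    return area_f, area_g


def size_cue(area_f, area_g):
    """Size(p) = log(AreaF / AreaG); negative when F is the smaller region."""
    if area_f <= 0 or area_g <= 0:
        raise ValueError("both regions must have positive area in the window")
    return math.log(area_f / area_g)


# Toy label grid: a small F region surrounded by G.
labels = [
    "GGGGG",
    "GGFFG",
    "GGFFG",
    "GGGGG",
    "GGGGG",
]
areas = local_areas(labels, p=(2, 2), radius=2)
size = size_cue(*areas)
```

Taking the log of the ratio makes the cue antisymmetric: swapping the roles of F and G flips its sign.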

37

Convexity(p) = log(ConvF / ConvG)

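ConvG = percentage of straight line segments between pairs of points in region G that lie completely within G (ConvF is defined the same way for region F)

[Figure labels: p, G, F]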

Convexity [Metzger 1953, Kanizsa and Gerbino 1976]
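One way to approximate the convexity measure on a pixel grid is to test every pair of pixels in a region and check whether sampled points along the connecting segment stay inside the region. This is a sketch under those assumptions. The sampling scheme, the set-of-pixels representation, and the function names are illustrative, and any restriction to a local window around p is omitted:

```python
import math
from itertools import combinations


def segment_inside(region, a, b, steps=20):
    """True if points sampled along segment a-b all round to pixels in region."""
    for k in range(steps + 1):
        t = k / steps
        r = round(a[0] + t * (b[0] - a[0]))
        c = round(a[1] + t * (b[1] - a[1]))
        if (r, c) not in region:
            return False
    return True


def conv(region):
    """Fraction of pixel pairs whose connecting segment stays inside region."""
    pairs = list(combinations(sorted(region), 2))
    if not pairs:
        return 1.0
    inside = sum(1 for a, b in pairs if segment_inside(region, a, b))
    return inside / len(pairs)


def convexity_cue(region_f, region_g):
    """Convexity(p) = log(ConvF / ConvG)."""
    conv_f, conv_g = conv(region_f), conv(region_g)
    if conv_f <= 0 or conv_g <= 0:
        raise ValueError("convexity fractions must be positive")
    return math.log(conv_f / conv_g)


# Toy regions as sets of (row, col) pixels: a filled square and an L shape.
square = {(r, c) for r in range(3) for c in range(3)}
l_shape = {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)}

cue = convexity_cue(square, l_shape)
```

The filled square scores 1.0 because every segment stays inside it, while the L shape scores lower because segments that cut across its corner leave the region.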

38

LowerRegion(p) = θG

Lower Region [Vecera, Vogel & Woodman 2002]

θ

center of mass
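The slide does not spell out how θ is measured. The sketch below assumes it is the angle between the upward vertical and the vector from the contour point p to a region's center of mass, so a region lying below p gets a large angle. That convention, the pixel-set representation, and the function names are all assumptions of this illustration:

```python
import math


def center_of_mass(region):
    """Mean (row, col) of a collection of pixels."""
    n = len(region)
    return (sum(r for r, _ in region) / n, sum(c for _, c in region) / n)


def angle_from_up(p, region):
    """Angle in [0, pi] between 'up' and the vector from p to the region's center of mass."""
    cr, cc = center_of_mass(region)
    dy = p[0] - cr  # image rows grow downward, so 'up' means decreasing row
    dx = cc - p[1]
    return math.atan2(abs(dx), dy)


# Toy example: contour point in the middle, one region above, one below.
p = (2, 2)
region_above = {(0, 1), (0, 2), (0, 3)}
region_below = {(4, 1), (4, 2), (4, 3)}

theta_above = angle_from_up(p, region_above)
theta_below = angle_from_up(p, region_below)
```

Under this convention the angle is 0 for a region directly above p, pi/2 for a region directly to the side, and pi for a region directly below.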

39

Size

LowerRegion

Convexity

40

Figural regions tend to lie below ground regions

41

Figural regions tend to be convex

42

Figural regions tend to be small

43

“Upper Bounding” Local Performance

• Present human subjects with local shapes, seen through an aperture.

[Figure panels: Configuration | Configuration + Content]

47

Findings

• Convexity, size and lower-region are ecologically valid.

• Boundary configuration is relatively weak compared to luminance content.

• Local judgments based on luminance content can be quite accurate.

48

• How do these cues apply to real world images?

• How are different cues combined?

• Why does the visual system use these cues?

Perceptual organization as a computational theory of vision

49

How do ideas from perceptual organization relate to natural scenes?

50

How do ideas from perceptual organization relate to natural scenes?

51

THE END
