eecs 274 computer vision segmentation by clustering

EECS 274 Computer Vision

Segmentation by Clustering

Segmentation and grouping

• Motivation: Obtain a compact representation from an image/motion sequence/set of tokens

• Should support application• Broad theory is absent at

present

• Reading: FP Chapter 9-10, S Chapter 5

• Grouping (or clustering)– collect together tokens

that “belong together”• Fitting

– associate a model with tokens

– issues• which model?• which token goes to

which element?• how many elements in

the model?

Why do these tokens belong together (from human vision)?• because they form surfaces?• because they form the same texture?

How do they assemble?

Segmentation: examples

• Summarizing videos: shots• Finding matching parts• Finding people• Finding buildings in satellite images• Searching a collection of images• Object detection• Object recognition

Models problems

• Forming image segments: superpixels, image regions

• Fitting lines to edge points• Fitting a fundamental matrix to a set

of feature points:– If the correspondence is right, there is a

fundamental matrix connecting the points

Segmentation as clustering

• Partitioning– Decompose into regions that have coherent color and

texture– Decompose into extended blobs, consisting of regions

that have coherent color, texture, and motion – Take a video sequence and decompose it into shots

• Grouping– Collecting together tokens that, taken together, form a

line– Collecting together tokens that seem to share a

fundamental matrix

• What is the appropriate representation?

General ideas

• Tokens– whatever we need to group (pixels, points,

surface elements, etc., etc.)

• Top down segmentation– tokens belong together because they lie on the

same object

• Bottom up segmentation– tokens belong together because they are

locally coherent

• These two are not mutually exclusive

Grouping and Gestalt

• Context affects how things are perceived

• Gestalt school of psychology– Studying shape as a whole

instead of just a collection of simple and curves

– Grouping means the tendency of visual system to assemble components together and perceive them together

Muller-Lyer illusion

Some Gestalt properties

emergence reification

Figure and ground

• (Left) White circle/black rectangle: which is figure (foreground) and which is ground (background)?

• (Right) What object is in this picture?

Basic ideas of grouping in humans• Figure-ground

discrimination– grouping can be seen in

terms of allocating some elements to a figure, some to ground

– impoverished theory

• Gestalt properties– elements in a collection of

elements can have properties that result from relationships (Muller-Lyer effect)

• gestalt qualitat (form quality)

– A series of factors affect whether elements should be grouped together

• Gestalt factors

Gestalt movement

Examples of Gestalt factors

Example

Occlusion as a cue

Grouping phenomena

Illusory contours

Tokens appear to be grouped together as they provide a cue to the presence of an occluding object whose contour has no contrast

Background subtraction

• If we know what the background looks like, it is easy to identify “interesting bits”

• Applications– person in an office– tracking cars on a road– surveillance

• Approach:– use a moving average to

estimate background image

– subtract from current frame

– large absolute values are interesting pixels

• trick: use morphological operations to clean up pixels

Results using 80 x 60 frames (a) Average of all 120 frames. (b) thresholding. (c) results with low threshold (with missing

pixels) . (d) background computed using the EM-based algorithm. (e) background subtraction results (with missing pixels).

Results using 160 x 120 frames (a) Average of all 120 frames. (b) thresholding. (c) results with low threshold (with missing

pixels) . (d) background computed using the EM-based algorithm. (e) background subtraction results (with missing pixels).

Shot boundary detection

• Find the shots in a sequence of video– shot boundaries usually

result in big differences between succeeding frames

– allows for objects to move within a given shot

• Strategy:– compute interframe

distances– declare a boundary where

these are big

• Possible distances– frame differences– histogram differences– block comparisons– edge differences

• Applications:– representation for movies,

or video sequences • find shot boundaries• obtain “most

representative” frame– supports search

Interactive segmenation

Grabcut

Image matting

Superpixels

Segmentation as clustering

• Cluster together (pixels, tokens, etc.) that belong together

• Agglomerative clustering– attach closest to cluster it

is closest to– repeat

• Divisive clustering– split cluster along best

boundary– repeat

• Point-Cluster distance– single-link clustering

(distance between two closest elements)

– complete-link clustering (maximum distance between two elements)

– group-average clustering

• Dendrograms (tree diagram)– yield a picture of output as

clustering process continues

A data set A dendrogram

K-Means

• Choose a fixed number of clusters

• Choose cluster centers and point-cluster allocations to minimize error

• Can’t do this by search, because there are too many possible allocations.

• Algorithm– fix cluster centers; allocate

points to closest cluster– fix allocation; compute

best cluster centers

• x could be any set of features for which we can compute a distance (careful about scaling)

clusters

2

clusterth - of elements

),(i

ijij

xdataclusters

K-means clustering using intensity alone and color alone(using 5 clusters)How many clusters do we need in an image?

Image Clusters on intensity Clusters on color

K-means using color alone, 11 segments(6 kinds of vegetables)

Image Clusters on color

K-means usingcolor alone,11 segments(not using texture or position)- segments may be disconnected and scattered

K-means using color andposition, 20 segments- peppers are now separatedbut, cabbage is still broken up

Graph-theoretic clustering

• Represent tokens using a weighted graph.– affinity matrix– similar have large weights

• Cut up this graph to get connected components – with strong interior links– by cutting edges with low weights

• A graph is a set of vertices V and edges E, G={V,E}

• A weighted graph is one in which each edge is associated with a weight

• Every graph consists of a disjoint set of connected components

35

Flow networks

• A flow network G=(V,E): a directed graph, where each edge (u,v)E has a nonnegative capacity c(u,v)>=0.

• If (u,v)E, we assume that c(u,v)=0.• Two distinct vertices: a source s and a sink t.

s t

16

12

20

10 4 9 7

413

14

Maximum flow minimum cut

• Also known as s-t cut• Several deterministic and

randomized algorithms

Resulting Flow = 23

Weighted graph

9 points/pixels 9 x 9 Weighted matrix that looks like block diagonal

A cut with 2 connected components

BjAi

ijwBAcut,

),(

(Left) a set of points (Right) affinity matrix computed with a decaying exponential distance (large values bright pixels, and small values dark pixels)

Graph cut and normalized cut

),(),(),(

),(

),(

),(

),(),(

),(,

BAcutAAassocVAassoc

VBassoc

BAcut

VAassoc

BAcutBANcut

wBAcutBjAi

ij

cut(A,B): sum of weights of all edges between A and B

assoc(A,V):sum of all the weights associated with nodes in A

Cut a graph V into 2 connected components with minimal cost

Affinity

Intensity

Texture

Distance

2

2 )()(2

1exp),( yIxIyxaff

I

2

22

1exp),( yxyxaff

d

2

2 )()(2

1exp),( ycxcyxaff

c

Scale affects affinity

σ=0.1 σ=0.2 σ=1

Extracting a single good cluster• Simplest idea: we want a

vector a giving the association between each element and a cluster

• We want elements within this cluster to, on the whole, have strong affinity with one another

• We could maximize

• But need the constraint • Solving constrained

optimization with Lagrange multiplier

• This is an eigenvalue problem - choose the eigenvector of A with largest eigenvalue

• Cluster weights are the elements of the eigenvectors

1nT

n

nT

n

ww

Aww

)1( nT

nnT

n wwAww

nn wAw

association of element i with cluster n ×affinity between i and j × association of element j and cluster n

Example eigenvector

points

matrix

eigenvector

Clustering with eigenvectors

• If we reorder the nodes, the rows and columns of A are shuffled• We expect to deal with a few clusters with large association • We could shuffle the rows and columns of M to form a matrix

that is roughly block diagonal• Shuffling M simply shuffle the elements of its eigenvectors so • We can reason about the eigenvectors by working with a

shuffled version of M

Exampledata =

1 8 7 2 9 3 2 7

affinity =

1.0000 0.0144 0.0089 0.4931 0.0144 1.0000 0.3269 0.0291 0.0089 0.3269 1.0000 0.0178 0.4931 0.0291 0.0178 1.0000

nEigVec =

0.4959 -0.5122 -0.5029 -0.4869 -0.5252 -0.4839 0.4747 -0.5162

Exampledata =

1 8 2 7 9 3 7 2

affinity =

1.0000 0.4931 0.0089 0.0144 0.4931 1.0000 0.0178 0.0291 0.0089 0.0178 1.0000 0.3269 0.0144 0.0291 0.3269 1.0000

nEigVec =

0.4959 0.5122 0.4747 0.5162 -0.5252 0.4839 -0.5029 0.4869

Eigenvectors and clusters

• Eigenvectors of block-diagonal matrices consist of eigenvectors of the blocks padded with zeros

• Each block has an eigenvector corresponding to a rather large eigenvalue

• Expect that, if there are c significant clusters, the eigenvectors corresponding to the c largest eigenvalue

Large eigenvectors and clusters

Eigen gap and clusters σd=0.1 σd=0.2 σd=1

More than two segments

• Two options– Recursively split each side to get a tree,

continuing till the eigenvalue are too small

– Use the other eigenvectors

Normalized cuts

• Current criterion evaluates within cluster similarity, but not across cluster difference

• Can also use Cheeger constant

• Instead, we would like to maximize the within cluster similarity compared to the across cluster difference

• Write graph as V, one cluster as A and the other as B

• Maximize

• i.e. construct A, B such that their within cluster similarity is high compared to their association with the rest of the graph

cut(A,B): sum of weights of all edges between A and B

assoc(A,V):sum of all the weights associated with nodes in A

Normalized cuts

• Write a vector y whose elements are 1 if item is in A, -b if it’s in B

• Write the matrix of the graph as W, and the matrix which has the row sums of W on its diagonal as D, 1 is the vector with all ones.

• Criterion becomes

• and we have a constraint

• This is hard to do, because y ’s values are quantized

WDL

Dyy

Lyy

Dyy

yWDyT

T

yT

T

y

min

)(min

01 DyT

Normalized cuts and spectral graph theory

• Instead, solve the generalized eigenvalue problem

• which gives

• Now look for a quantization threshold that maximizes the criterion --- i.e all components of y above that threshold go to one, all below go to -b

1..

max))((max

Dyyts

LyyyWDyT

Ty

Ty

DyLy

DyyWD

)(

Using affinity based on texture and intensity. Figure from “Image and video segmentation: the normalized cut framework”, by Shi and Malik, copyright IEEE, 1998

Using spatio-temoral affinity metric.

Figure from “Normalized cuts and image segmentation,” Shi and Malik, copyright IEEE, 2000

Spectral clustering

Ng, Jordan and Weiss, NIPS 01

Comparisons

Ng, Jordan and Weiss, NIPS 01

8 clusters 3 clusters 2 clusters




Spectral graph theory

DyLy

WDDIDWDDL

WD

WDj

ijii

2/12/12/12/1 )( :Laplaciangraph normalized

:Laplaciangraph

• See EECS 275 Lecture 23 for more detail• See also Laplacian eigenmap, Laplace Beltrami operator,Laplacian eigenmap, spectral clustering, graph cut, minimal-cut maximum-flow, normalized cut, random walk, heat kernel, diffusion map

generalized eigenvalue problem

More information

• G. Scott and H. Longuet-Higgins, Feature grouping by relocalisation of eigenvectors of the proximity matrix, BMVC, 1990

• F. Chung, Spectral graph theory, AMS 1997• Y. Weiss, Segmentation using eigenvectors: a

unifying view, ICCV 1999• L. Saul et al., Spectral methods for dimensionality

reduction, in Semisupervised learning, 2006• U von Luxburg, A tutorial on spectral clustering,

Statistics and Computing, 2007

Related topics

• Watershed• Multiscale segmentation• Mean-shift segmentation• Parsing tree/stochastic grammar• Graph cut• Objet cut• Grab cut• Random walk• Energy-based method

eecs 274 computer vision segmentation by clustering

Documents

collection of elements

surface elements

ground background

coherent color

collection of simple

fundamental matrixwhat

tokensissueswhich model

interesting pixelstrick