CAP5415-Lecture 2 (mtappen/cap5415/lecs/lec10.pdf). Author: Khurram Hassan Shafique. Created: 9/23/2010 11:55:58 AM

Lecture 10: Segmentation by Clustering Marshall Tappen


TRANSCRIPT

Page 1:

Lecture 10: Segmentation by Clustering

Marshall Tappen

Page 2:

Region Segmentation

Page 3:

What is the idea underlying segmentation?

We want to group pixels together that “belong” together.

Computer vision researchers aren't the first ones to think about this.

Grouping was also studied by the Gestalt school of psychologists.

Page 4:

Grouping

To perceive the image, the elements must be perceived as a whole.

The Gestalt psychologists studied how elements could be grouped together, and identified a group of factors that led to elements being grouped together.

I'm mentioning these ideas because they often come up in discussions of segmentation and grouping in computer vision. Here are some examples.

Page 5:

Proximity

Things that are nearby tend to be grouped together

(Figure from Forsyth and Ponce)

Page 6:

Similarity

Similar things tend to be grouped together

(Figure from Forsyth and Ponce)

Page 7:

Common Region

Tokens that lie in the same region tend to be grouped together

(Figure from Forsyth and Ponce)

Page 8:

Parallelism

Parallel lines or tokens tend to be grouped together

(Figure from Forsyth and Ponce)

Page 9:

Symmetry

We prefer groupings that lead to symmetric groups

(Figure from Forsyth and Ponce)

Page 10:

Closure

Tokens that lead to closed curves tend to be grouped together

(Figure from Forsyth and Ponce)

Page 11:

Grouping can lead to interesting effects

This is called the Kanizsa Triangle.

Grouping is causing you to see illusory contours.

Page 12:

Back to Pixels

Our goal is to group pixels.

We won't be able to incorporate all of the Gestalt cues, so we will have to focus on simpler cues:

RGB similarity

Proximity

Page 13:

Simple idea

Let's find three clusters in this dataThese points could represent RGB triplets in 3D

Page 14:

Simple idea

Begin by guessing where the “center” of each cluster is

Page 15:

Simple idea

Now assign each point to the closest cluster

Page 16:

Simple idea

Now move each cluster center to the center of the points assigned to it.

Repeat this process until it converges.

Page 17:

Mathematically, what's going on?

Each cluster will be described by a center $\mu_j$.

Each point $x_i$ will be assigned to one cluster.

Call this assignment $c(i)$. Our goal is to find the assignments and centers that minimize the total squared distance between each point and its assigned center.
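The objective on this slide was an image in the original deck; reconstructed here as the standard k-means cost:

$$ J(c, \mu) = \sum_{i} \left\| x_i - \mu_{c(i)} \right\|^2 $$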

Page 18:

How do we do this?

Optimizing $c(i)$ and $\mu_j$ jointly is too difficult.

But! What if I know $\mu_j$ already?

How do I minimize this?
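With the centers fixed, the cost decomposes over points, so the minimizer simply assigns each point to its nearest center:

$$ c(i) = \arg\min_{j} \left\| x_i - \mu_j \right\|^2 $$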

Page 19:

How do we do this?

What if I know c(i) already?

Do you see why it's called k-means?
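Holding the assignments fixed and setting the derivative of the cost with respect to $\mu_j$ to zero gives the mean of the points assigned to cluster $j$, which is where the name k-means comes from:

$$ \mu_j = \frac{1}{|\{i : c(i) = j\}|} \sum_{i : c(i) = j} x_i $$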

Page 20:

K-Means
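The algorithm box on this slide did not survive extraction; here is a minimal NumPy sketch of the same two-step loop (my reconstruction, not the lecture's code):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Cluster the rows of X into k groups by alternating the two steps above."""
    rng = np.random.default_rng(seed)
    # Begin by guessing centers: pick k random data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 1: assign each point to the closest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Step 2: move each center to the mean of the points assigned to it.
        new_centers = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):  # converged
            break
        centers = new_centers
    return assign, centers
```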

Page 21:

How does this translate to images?

(From Comaniciu and Meer)

Page 22:

Image Segmentation by K-Means

1) Select a value of K.
2) Select a feature vector for every pixel (color, texture, position, or a combination of these, etc.).
3) Define a similarity measure between feature vectors (usually Euclidean distance).
4) Apply the k-means algorithm.
5) Apply a connected-components algorithm.
6) Merge any component of size less than some threshold into the adjacent component that is most similar to it.

(A sketch of this pipeline follows the list.)
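A hedged sketch of that pipeline, assuming scikit-learn's KMeans and SciPy's connected-components labeling; the small-component merge in step 6 is omitted to keep it short:

```python
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def segment_image(img, k=11):
    """img: (H, W, 3) float RGB array. Returns an integer label map."""
    h, w, _ = img.shape
    # Step 2: one feature vector per pixel (here: color only).
    features = img.reshape(-1, 3)
    # Steps 3-4: k-means with Euclidean distance on the features.
    clusters = KMeans(n_clusters=k, n_init=10).fit_predict(features)
    cluster_map = clusters.reshape(h, w)
    # Step 5: connected components within each cluster.
    labels = np.zeros((h, w), dtype=int)
    next_label = 0
    for c in range(k):
        comp, n = ndimage.label(cluster_map == c)
        labels[comp > 0] = comp[comp > 0] + next_label
        next_label += n
    # Step 6 (merging small components) is omitted in this sketch.
    return labels
```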

Page 23:

K-means clustering using intensity alone and color alone

Image

Clusters on intensity

Clusters on color

Page 24:

K-means using color alone, 11 segments

Image

Clusters on color

Page 25:

K-means using color alone, 11 segments.

Page 26:

Probabilistic Point of View

We'll take a generative point of viewHow to generate a data point:1)Choose a cluster,z, from (1 .... N)2)Sample that point from the distribution associated with that cluster

1D Example
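As a concrete 1D instance of this two-step generative process, a short NumPy sketch; all parameter values are made up for illustration, not taken from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustration-only parameters for a 1D, 3-component mixture.
weights = np.array([0.5, 0.3, 0.2])   # p(z = k), must sum to 1
means = np.array([-2.0, 0.0, 3.0])
stds = np.array([0.5, 1.0, 0.8])

def sample_point():
    z = rng.choice(3, p=weights)          # 1) choose a cluster
    return rng.normal(means[z], stds[z])  # 2) sample from that cluster's distribution

data = np.array([sample_point() for _ in range(1000)])
```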

Page 27:

Called a Mixture Model

z indicates which cluster is chosen.

$p(z = k)$ is the probability of choosing cluster k, and $p(x \mid z = k)$ is the probability of x given that the cluster is k.
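Reconstructed from the labels above, the mixture model sums these two pieces over all clusters:

$$ p(x) = \sum_{k=1}^{K} p(z = k)\, p(x \mid z = k) $$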

Page 28:

To make it a Mixture of Gaussians

The weight on each Gaussian, $\pi_k = p(z = k)$, is called a mixing coefficient.
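Putting Gaussians in for the per-cluster distributions gives the mixture-of-Gaussians density:

$$ p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \sum_{k=1}^{K} \pi_k = 1, \quad \pi_k \ge 0 $$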

Page 29:

Brief Review of Gaussians
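The review itself did not survive extraction; the standard d-dimensional Gaussian density it covered is:

$$ \mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu)\right) $$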

Page 30:

Mixture of Gaussians

Page 31:

In Context of Our Previous Model

Now, we have means and covariances

Page 32:

How does this help with clustering?

If we had the parameters of the clusters, it would be easy to assign points to clusters

How do we get the cluster parameters? We'll maximize the likelihood of the data

Page 33:

Mathematically, this means choosing the parameters that make all of the data points as likely as possible.
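Reconstructing the formula the slide showed, with $\theta$ collecting all mixture parameters:

$$ \theta^{*} = \arg\max_{\theta} \prod_{i=1}^{N} p(x_i \mid \theta), \qquad \theta = \{\pi_k, \mu_k, \Sigma_k\}_{k=1}^{K} $$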

Page 34:

Mixture Model and Log of Mixture Model
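The two formulas the slide labels, reconstructed: the mixture model for one point, and the log-likelihood of the whole dataset:

$$ p(x_i) = \sum_{k=1}^{K} \pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k), \qquad \log p(X) = \sum_{i=1}^{N} \log \sum_{k=1}^{K} \pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k) $$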

Page 35:

Now we run into a problem

This is hard to maximize (the log of a sum does not decompose nicely). But we can lower bound it. If the lower bound is easy to work with, we can maximize it. That should push the true function up.

Page 36:

Lower Bounding

We use a theorem called Jensen's inequality

We introduce weights $q_{ik}$ for each point; these have to add up to one over the clusters.
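Multiplying and dividing by the weights $q_{ik}$ (nonnegative, $\sum_k q_{ik} = 1$ for each point) and applying Jensen's inequality to the concave log gives the lower bound:

$$ \log \sum_{k} \pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k) = \log \sum_{k} q_{ik}\,\frac{\pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{q_{ik}} \;\ge\; \sum_{k} q_{ik} \log \frac{\pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{q_{ik}} $$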

Page 37:

This looks familiar

This looks a lot like using Bayes rule to find the probability of that point's cluster
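Choosing the weights to be the cluster posterior makes the bound tight; this is exactly Bayes' rule for the probability of that point's cluster:

$$ q_{ik} = p(z = k \mid x_i) = \frac{\pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j} \pi_j\,\mathcal{N}(x_i \mid \mu_j, \Sigma_j)} $$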

Page 38:

Now life is easier. We can now differentiate to find the parameters. This is called the M-step; the previous step is called the E-step. You are always increasing a lower bound.

Complete set of steps: find the mean, covariance, and mixing coefficients (the updates are given below).
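The resulting M-step updates, reconstructed in standard form (with $N_k$ the effective number of points in cluster k):

$$ N_k = \sum_{i} q_{ik}, \qquad \mu_k = \frac{1}{N_k}\sum_{i} q_{ik}\,x_i, \qquad \Sigma_k = \frac{1}{N_k}\sum_{i} q_{ik}\,(x_i - \mu_k)(x_i - \mu_k)^{\top}, \qquad \pi_k = \frac{N_k}{N} $$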

Page 39:

Where this comes from

Let's differentiate with respect to \mu_k

Page 40:

Mixing Coefficients
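This slide's derivation was an image; the standard reconstruction maximizes the $\pi$ terms of the bound subject to the sum-to-one constraint via a Lagrange multiplier:

$$ \max_{\pi}\ \sum_{i}\sum_{k} q_{ik}\log\pi_k + \lambda\Big(\sum_{k}\pi_k - 1\Big) \;\;\Rightarrow\;\; \pi_k = \frac{\sum_i q_{ik}}{N} $$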

Page 41:

EM Algorithm

E-step: compute the cluster posteriors $q_{ik}$ for every point.

M-step: using these estimates of $q_{ik}$, maximize the rest of the parameters: find the mean, covariance, and mixing coefficients.

(A compact implementation sketch follows.)
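A minimal NumPy/SciPy sketch of the full EM loop for a Gaussian mixture, assembled from the E- and M-step formulas above (a reconstruction under those formulas, not the lecture's code):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, k, n_iters=50, seed=0):
    """EM for a Gaussian mixture. X: (n, d) data. Returns parameters and posteriors."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    # Initialize: random data points as means, data covariance everywhere, uniform weights.
    mus = X[rng.choice(n, size=k, replace=False)]
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(k)])
    pis = np.full(k, 1.0 / k)
    for _ in range(n_iters):
        # E-step: posterior q[i, j] = p(z = j | x_i) via Bayes' rule.
        dens = np.column_stack([pis[j] * multivariate_normal.pdf(X, mus[j], covs[j])
                                for j in range(k)])
        q = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted mean, covariance, and mixing coefficients.
        Nk = q.sum(axis=0)
        mus = (q.T @ X) / Nk[:, None]
        for j in range(k):
            diff = X - mus[j]
            covs[j] = (q[:, j, None] * diff).T @ diff / Nk[j] + 1e-6 * np.eye(d)
        pis = Nk / n
    return pis, mus, covs, q
```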

Page 42:

Back to clustering

Now we have the posterior $q_{ik} = p(z = k \mid x_i)$ for every point. This can be seen as a soft clustering: each point has a weighted membership in every cluster.

Page 43:

How many clusters?

Remember the line problem?

Page 44:

Basic Idea

We want to fit the data well, but we don't want a model that is too complex

We are balancing two issues: fitting the data, and model complexity (here, that is the number of lines).

There are three popular criteria for evaluating this trade-off.

Page 45:

AIC – An Information Criterion

L is the squared error in our predictions (there is also a probabilistic interpretation involving the log-likelihood).

The variable p is the number of parameters.
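The formula itself was an image on the slide; a hedged reconstruction in the squared-error form described here (the preferred model minimizes this):

$$ \text{AIC} = L + 2p $$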

Page 46:

BIC – Bayes Information Criterion

L is the squared error in our predictions (there is also a probabilistic interpretation involving the log-likelihood).

The variable N is the number of data points.

This criterion is also called MDL (Minimum Description Length).
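Again a hedged reconstruction in the same squared-error form (p parameters, N data points); the $\log N$ factor penalizes complexity more heavily than AIC as the dataset grows:

$$ \text{BIC} = L + p \log N $$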

Page 47:

Page 48:

It doesn't always work (But it's close)

Page 49:

Another Clustering Application

Page 50:

Another Clustering Application

In this case, we have a video and we want to segment out what's moving or changing

from C. Stauffer and W. Grimson

Page 51:

Easy Solution

Average a bunch of frames to get a “background” image.

Compute the difference between the background and the current frame to find the foreground. (A short sketch appears below.)
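A minimal NumPy sketch of that idea; the threshold value is an arbitrary illustration, not from the lecture:

```python
import numpy as np

def foreground_mask(frames, new_frame, thresh=30.0):
    """frames: (T, H, W) grayscale stack; new_frame: (H, W) current frame."""
    background = frames.mean(axis=0)                     # average a bunch of frames
    diff = np.abs(new_frame.astype(float) - background)  # difference from background
    return diff > thresh                                 # True where something changed
```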

Page 52:

The difficulty with this approach

The background changes

(From Stauffer and Grimson)

Page 53:

Solution

Fit a mixture model to the background, i.e., a background pixel could have multiple colors.
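This is the idea behind Stauffer and Grimson's adaptive background mixture model; OpenCV ships an implementation of this family of methods, so a hedged usage sketch looks like this (the video filename is hypothetical):

```python
import cv2

cap = cv2.VideoCapture("surveillance.mp4")         # hypothetical input file
subtractor = cv2.createBackgroundSubtractorMOG2()  # per-pixel Gaussian mixture background
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)  # nonzero where the pixel fits no background mode
cap.release()
```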

Page 54:

Can use this to track in surveillance

Page 55:

Advantages/Disadvantages

Advantages:

Easy to code!

Flexible: you can easily incorporate cues like proximity by including more features. Be careful about scaling! (Why?)

Monotonic optimization: each iteration never makes the objective worse.

Page 56:

Advantages/Disadvantages

Disadvantages:

Only converges to a local minimum.

You still need to initialize it, and that could have a big impact on the quality of the results.