cs654: digital image analysis lecture 30: clustering based segmentation slides are adapted from:...

CS654: Digital Image Analysis

Lecture 30: Clustering based Segmentation

Slides are adapted from: http://www.wisdom.weizmann.ac.il/~vision/

Recap of Lecture 26

• Thresholding

• Otsu’s method

• Region based segmentation

• Region growing, split-merge, quad-tree

• Clustering, K-means

Outline of Lecture 27

• Overview about Mean Shift Segmentation

• Mean Shift algorithm• Kernel density estimation • Kernel

• Mean shift segmentation

• Applications of Mean Shift • Clustering• Filtering

Segmentation methods – overview

• Edge based

• Thresholding

• K-means clustering• The k-means algorithm is an algorithm to cluster n objects

based on attributes into k partitions, k < n

• Mean-shift algorithm

• Normalized cuts

K-Means pros and cons

• Pros• Simple and fast• Easy to implement

• Cons• Need to choose K• Sensitive to outliers

Mean-shift – motivation and intuitive description

• Given a distribution of points, mean shift is a procedure for finding the densest region.

• Example for simple 2D case

1. Start from arbitrary point in the distribution

2. Region of interest is a circle centered in this point

3. On each iteration find the center of the mass for the region of

interest

4. Move the circle to the center of the mass

5. Continue the iterations until convergence

Intuitive Description

Distribution of identical billiard balls

Region ofinterest

Center ofmass

Mean Shiftvector

Objective : Find the densest region

Region ofinterest

Center ofmass

Mean Shiftvector


Distribution of identical billiard balls Objective : Find the densest region

Region ofinterest

Center ofmass


Distribution of identical billiard balls Objective : Find the densest region

Mean-shift – algorithm formal definition

• The Basic Mean Shift Algorithm is formulated according to the following paper:

D. Comaniciu, P. Meer: Mean Shift Analysis and Applications, IEEE Int. Conf. Computer Vision (ICCV'99), Kerkyra , Greece , 1197-1203, 1999

Mean-shift – algorithm formal definition

Given:

• A set of n points in the d-dimensional space

Model:

• Assume non-parametric statistical model,

• There is a probability density function (PDF) associated with the set of points,

• Without any assumptions on its parameters.

Goal:

• For any given point find closest local mode of the density function.

PDF in feature space

• Color space• Scale space• Actually any feature space you can conceive• …

What is Mean Shift ?

Non-parametricDensity Estimation

Non-parametricDensity GRADIENT Estimation

(Mean Shift)

Data

Discrete PDF Representation

PDF Analysis

A tool for:Finding modes in a set of data samples, manifesting an underlying probability density function (PDF) in RN

Non-Parametric Density Estimation

Assumption : The data points are sampled from an underlying PDF

Assumed Underlying PDF Real Data Samples

Data point density implies PDF value !

Parametric Density Estimation

Assumption : The data points are sampled from an underlying PDF

2

2

( )

2

i

PDF( ) = i

iic e

x-μ

x

Estimate


Kernel Density Estimation

1

1( ) ( )

n

ii

P Kn

x x - xA function of some finite number of data pointsx1…xn

DataIn practice one uses the forms:

1

( ) ( )d

ii

K c k x

x or ( )K ckx x

Same function on each dimension Function of vector length only

Parzen Windows - Function Forms

Profile

Various Kernels

Examples:

• Epanechnikov Kernel

• Uniform Kernel

• Normal Kernel

21 1

( ) 0 otherwise

E

cK

x xx

1( )

0 otherwiseU

cK

xx

21( ) exp

2NK c

x x

Data

1

1( ) ( )

n

ii

P Kn

x x - xA function of some finite number of data pointsx1…xn

Kernel Density Estimation

Gradient

1

1 ( ) ( )

n

ii

P Kn

x x - x Give up estimating the PDF !Estimate ONLY the gradient

2

( ) iiK ck

h

x - xx - xUsing the

Kernel form:

Size of window

𝛻 𝑃 (𝒙 )= 1𝑛∑

𝑖=1

𝑛

𝛻 {𝑐𝑘(‖𝒙−𝒙 𝒊

h ‖2

)}

Kernel Density Gradient Estimation

𝛻 𝑃 (𝒙 )= 1𝑛∑

𝑖=1

𝑛

𝛻 {𝑐𝑘(‖𝒙−𝒙 𝒊

h ‖2

)}⇒𝛻𝑃 (𝒙 )=𝑐

𝑛∑𝑖=1

𝑛

𝑘 ′(‖𝒙− 𝒙𝒊

h ‖2

) .2.(𝒙− 𝒙𝒊 )h2

⇒𝛻𝑃 (𝒙 )= 2𝑐𝑛 h2 ∑

𝑖=1

𝑛

( 𝒙−𝒙 𝒊 )𝑘 ′ (‖𝒙−𝒙 𝒊

h ‖2

)Let,

𝛻 𝑃 (𝒙 )= 2𝑐𝑛h2∑

𝑖=1

𝑛

(𝒙 𝒊− 𝒙 )𝑔(‖𝒙−𝒙 𝒊

h ‖2

)

Kernel Density Gradient Estimation

𝛻 𝑃 (𝒙 )= 2𝑐𝑛h2∑

𝑖=1

𝑛

(𝒙 𝒊− 𝒙 )𝑔(‖𝒙−𝒙 𝒊

h ‖2

)⇒𝛻𝑃 (𝒙 )= 2𝑐

𝑛 h2 [∑𝑖=1

𝑛

𝒙𝒊𝑔 (‖𝒙− 𝒙 𝒊

h ‖2

)−∑𝑖=1

𝑛

𝒙𝑔 (‖𝒙− 𝒙 𝒊

h ‖2

)]

⇒𝛻𝑃 (𝒙 )= 2𝑐𝑛 h2 ∑

𝑖=1

𝑛

𝑔(‖𝒙−𝒙 𝒊

h ‖2

) [∑𝑖=1

𝑛

𝒙 𝒊𝑔 (‖𝒙 −𝒙 𝒊

h ‖2

)∑𝑖=1

𝑛


h ‖2

)−𝒙 ]

⇒𝛻𝑃 (𝒙 )= 2𝑐𝑛 h2 ∑

𝑖=1

𝑛


h ‖2

) [∑𝑖=1

𝑛


h ‖2

)∑𝑖=1

𝑛


h ‖2

)−𝒙 ]

Yet another Kernel density estimation !

Simple Mean Shift procedure:• Compute mean shift vector

•Translate the Kernel window by m(x)

2

1

2

1

( )

ni

ii

ni

i

gh

gh

x - xx

m x xx - x

Computing the Mean Shift

Mean Shift Procedure

At density maxima

⇒𝛻𝑃 (𝒙 )= 2𝑐𝑛 h2 ∑

𝑖=1

𝑛


h ‖2

) [∑𝑖=1

𝑛


h ‖2

)∑𝑖=1

𝑛


h ‖2

)−𝒙 ]=0

mean shift vector — must be 0 at optimum

• Automatic convergence speed – the mean shift vector size depends on the gradient itself.

• Near maxima, the steps are small and refined

• Convergence is guaranteed for infinitesimal steps only infinitely convergent, (therefore set a lower bound)

• For Uniform Kernel ( ), convergence is achieved in a finite number of steps

• Normal Kernel ( ) exhibits a smooth trajectory, but is slower than Uniform Kernel ( ).

AdaptiveGradient Ascent

Mean Shift Properties

Tessellate the space with windows Run the procedure in parallel

Real Modality analysis

Attraction basin

• Capture theorem

• Attraction basin:

• The region for which all trajectories lead to the same mode

• Cluster: all data points in the attraction basin of a mode

Clustering: Real Example

L*u*v space representation

( )s r

s rs r

K C k kh h

x xx

Clustering: Real Example

Not all trajectoriesin the attraction basinreach the same mode

2D (L*u) space representation

Final clusters

Mean shift clustering

The mean shift algorithm seeks modes of the given set of points

1. Choose kernel and bandwidth

2. For each point:a) Center a window on that pointb) Compute the mean of the data in the search windowc) Center the search window at the new mean locationd) Repeat (b,c) until convergence

3. Assign points that lead to nearby modes to the same cluster

Segmentation by Mean Shift

1. Find features (color, gradients, texture, etc)

2. Set kernel size for features Kr and position Ks

3. Initialize windows at individual pixel locations

4. Perform mean shift for each window until convergence

5. Merge windows that are within width of Kr and Ks

Feature space : Joint domain = spatial coordinates + color space

( )s r

s rs r

K C k kh h

x xx

http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

Mean shift segmentation results

Application of Mean Shift: Filtering

• Image filtering is a process by which we can enhance, modify or multilate images.

• Filtering reduces the influence from noise to mode detection.

• Mean shift filtering can work with binary, gray scale, RGB and arbitrary multichanel images.

• Filtering is the first step of mean shift segmentation process.

• A second step is the clustering of filtered data point

Mean Shift Filtering

• Let the d-dimensional input image be the filtered image

Algorithm

1. Initialize , and

2. Compute till convergence

3. Assign

Spatial location will have range component of the point of convergence

Examples

Comparison with Gaussian

Mean-shift: other issues

• Speedups• Uniform kernel (much faster but not as good)

• Binning or hierarchical methods

• Approximate nearest neighbor search

• Methods to adapt kernel size depending on data density

Mean shift pros and cons

• Pros• Good general-practice segmentation• Finds variable number of regions• Robust to outliers

• Cons• Have to choose kernel size in advance• Original algorithm doesn’t deal well with high dimensions

• When to use it• Over-segmentatoin• Multiple segmentations• Other tracking and clustering applications

Watershed algorithm

Watershed segmentation

Image Gradient Watershed boundaries

Meyer’s watershed segmentation

1. Choose local minima as region seeds

2. Add neighbors to priority queue, sorted by value

3. Take top priority pixel from queue1. If all labeled neighbors have same label, assign to pixel2. Add all non-marked neighbors

4. Repeat step 3 until finished

Matlab: seg = watershed(bnd_im)

Simple trick• Use median filter to

reduce number of regions

Watershed pros and cons

• Pros• Fast (< 1 sec for 512x512 image)• Among best methods for hierarchical segmentation

• Cons• Only as good as the soft boundaries• Not easy to get variety of regions for multiple segmentations• No top-down information

• Usage• Preferred algorithm for hierarchical segmentation

Thank youNext Lecture: Graph based approach

cs654: digital image analysis lecture 30: clustering based segmentation slides are adapted from:...

Documents

mean shift segmentation

mean shift analysis

meanshift motivation

segmentation region

data points

based segmentation slides

pdf value

region of interestmove