CSCE 643 Computer Vision: Extractions of Image Features Jinxiang Chai

Page 1:

CSCE 643 Computer Vision: Extractions of Image Features

Jinxiang Chai

Page 2:

Good Image Features

• What are we looking for?
  – Strong features
  – Invariant to changes (affine and perspective, occlusion, illumination, etc.)

Page 3:

Feature Extraction

Why do we need to detect features?

- Features correspond to important points in both the world and image spaces

- Object detection/recognition

- Solve the problem of correspondence

• Locate an object in multiple images (e.g., in video)

• Track the path of the object, infer 3D structures, object and camera movement

Page 4:

Outline

Image Features

- Corner detection

- SIFT extraction

Page 6:

What are Corners?

Point features

Where two edges come together

Where the image gradient has significant components in the x and y direction

We will establish corners from the gradient rather than the edge images

Page 10:

Basic Ideas

What are gradients along x and y directions?

How to measure corners based on the gradient images?

- two major axes in the local window!

Page 12:

How to Find Two Major Axes?

• Principal component analysis (PCA)

The relative lengths of the two major axes depend on the ratio of the eigenvalues (λ1/λ2).

Page 13:

Corner Detection Algorithm

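1. Compute the image gradients

$$I_x = \frac{\partial I(x,y)}{\partial x}, \qquad I_y = \frac{\partial I(x,y)}{\partial y}$$

2. Define a neighborhood size as an area of interest around each pixel

[Figure: 5x5 grid of values with a 3x3 neighborhood highlighted]

61 60 53 19 18
58 55 53 15 13
55 55 50 13 13
10 10 10 11 11
10 12 12 11 10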
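Step 1 can be sketched in a few lines of Python. This is only an illustration: the slides do not say which derivative filter is used, so the central-difference scheme, the function name `image_gradients`, and the choice to leave border pixels at zero are all assumptions. The 5x5 grid from the slide is treated here as image intensities.

```python
def image_gradients(img):
    """Return (Ix, Iy) using central differences; border pixels are left at 0 (assumption)."""
    rows, cols = len(img), len(img[0])
    Ix = [[0.0] * cols for _ in range(rows)]
    Iy = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            Ix[r][c] = (img[r][c + 1] - img[r][c - 1]) / 2.0  # change along x (columns)
            Iy[r][c] = (img[r + 1][c] - img[r - 1][c]) / 2.0  # change along y (rows)
    return Ix, Iy


# The 5x5 grid of values shown on the slide, treated as intensities.
patch = [
    [61, 60, 53, 19, 18],
    [58, 55, 53, 15, 13],
    [55, 55, 50, 13, 13],
    [10, 10, 10, 11, 11],
    [10, 12, 12, 11, 10],
]
Ix, Iy = image_gradients(patch)
```

At the center pixel both gradients are large in magnitude, because the bright upper-left block ends in both the x and the y direction there.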

Page 14:

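Corner Detection Algorithm (cont’d)

3. For each image pixel (i,j), construct the following matrix from it and its neighborhood values

$$C(i,j) = \sum \begin{bmatrix} I_x \\ I_y \end{bmatrix} \begin{bmatrix} I_x \\ I_y \end{bmatrix}^T = \begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix}$$

where each sum runs over the neighborhood of (i,j).

e.g., for the $I_x$ values

61 60 53 19 18
58 55 53 15 13
55 55 50 13 13
10 10 10 11 11
10 12 12 11 10

the 3x3 neighborhood centered at (3,3) gives

$$C_{(3,3)}[1,1] = \sum I_x^2 = 55^2 + 53^2 + 15^2 + 55^2 + 50^2 + 13^2 + 10^2 + 10^2 + 11^2$$

Similar to the covariance matrix of the gradient vectors (Ix, Iy)^T!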
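As a rough sketch of step 3, the matrix C can be accumulated directly from the two gradient images. The function name `structure_matrix` and the `half` parameter (window half-width) are illustrative names, not from the slides. The usage example takes the 5x5 grid of values from the slide as I_x and, purely as an assumption for the demonstration, sets I_y to zero, because the slide's worked example only shows the I_x² entry.

```python
def structure_matrix(Ix, Iy, i, j, half=1):
    """Sum the outer products [Ix, Iy]^T [Ix, Iy] over the (2*half+1)^2 window at (i, j)."""
    sxx = sxy = syy = 0.0
    for r in range(i - half, i + half + 1):
        for c in range(j - half, j + half + 1):
            gx, gy = Ix[r][c], Iy[r][c]
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    return [[sxx, sxy], [sxy, syy]]


# Values from the slide, read as Ix. Iy is set to zero only for this demo (assumption).
Ix_demo = [
    [61, 60, 53, 19, 18],
    [58, 55, 53, 15, 13],
    [55, 55, 50, 13, 13],
    [10, 10, 10, 11, 11],
    [10, 12, 12, 11, 10],
]
Iy_demo = [[0] * 5 for _ in range(5)]

# Slide pixel (3,3) in 1-based indexing is (2, 2) in 0-based indexing.
C = structure_matrix(Ix_demo, Iy_demo, 2, 2)
```

With these inputs the top-left entry of `C` is the sum of the nine squared I_x values in the 3x3 window, which matches the sum written out on the slide.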

Page 15:

Corner Detection Algorithm (cont’d)

4. For each matrix C(i,j), determine the 2 eigenvalues λ(i,j) = [λ1, λ2].

Simple case (C is diagonal, with λ1 and λ2 on the diagonal):

- This means the dominant gradient directions align with the x or y axis.

- If either λ1 or λ2 is close to zero, then this is not a corner.

Page 16:

Corner Detection Algorithm (cont’d)

4. For each matrix C(i,j), determine the 2 eigenvalues λ(i,j) = [λ1, λ2].

Simple case:

Interior Region: λ1, λ2 = 0
Edge: large λ1 and small λ2
Corner: large λ1 and large λ2
Isolated pixels: small λ1 and small λ2

Page 17:

Corner Detection Algorithm (cont’d)

4. For each matrix C(i,j), determine the 2 eigenvalues λ(i,j) = [λ1, λ2].

General case:

- This is just a rotated version of the one on the last slide

- If either λ1 or λ2 is close to zero, then this is not a corner.

- The eigenvalues are invariant to 2D rotation

Page 18:

Eigen-values and Corner

- λ1 is large

- λ2 is large

Page 19:

Eigen-values and Corner

- λ1 is large

- λ2 is small

Page 20:

Eigen-values and Corner

- λ1 is small

- λ2 is small

Page 21:

Corner Detection Algorithm (cont’d)

4. For each matrix C(i,j), determine the 2 eigenvalues λ(i,j) = [λ1, λ2].

5. If both λ1 and λ2 are big, we have a corner (Harris also checks that the ratio of the λs is not too high)

ISSUE: The corners obtained will be a function of the threshold!
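Steps 4 and 5 can be sketched with the closed-form eigenvalues of a symmetric 2x2 matrix. This is a minimal illustration of the "both eigenvalues big" test described above, not Harris's exact response function; the names `eigenvalues_2x2` and `is_corner` and the rule "smaller eigenvalue must exceed the threshold" are assumptions made for the example.

```python
import math


def eigenvalues_2x2(C):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = C[0][0], C[0][1], C[1][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mean + radius, mean - radius


def is_corner(C, threshold):
    """Treat the pixel as a corner when the smaller eigenvalue exceeds the threshold."""
    _, lam_min = eigenvalues_2x2(C)
    return lam_min > threshold


# A diagonal matrix and a 45-degree rotated version share the same eigenvalues.
simple = [[2.0, 0.0], [0.0, 1.0]]
rotated = [[1.5, 0.5], [0.5, 1.5]]

# Hypothetical matrix with eigenvalues 30000 and 8000: the decision depends on the threshold.
candidate = [[30000.0, 0.0], [0.0, 8000.0]]
corner_at_5000 = is_corner(candidate, 5000)
corner_at_10000 = is_corner(candidate, 10000)
```

The `candidate` matrix passes at a threshold of 5,000 but fails at 10,000, which is the threshold dependence flagged as the issue above.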

Page 22:

Image Gradients

Page 23:

Image Gradients

Closeup of image orientation at each pixel

Page 24:

The Orientation Field

Corners are detected where both λ1 and λ2 are big

Page 26:

Corner Detection Sample Results

[Figure: detected corners for Threshold=25,000, Threshold=10,000, and Threshold=5,000]

Page 27:

Outline

Image Features

- Corner detection

- SIFT extraction

Page 28:

Scale Invariant Feature Transform (SIFT)

• Choosing features that are invariant to image scaling and rotation

• Also, partially invariant to changes in illumination and 3D camera viewpoint

Page 29:

Motivation for SIFT

• Earlier Methods
  – Harris corner detector
    • Sensitive to changes in image scale
    • Finds locations in image with large gradients in two directions
  – No method was fully affine invariant
    • Although the SIFT approach is not fully invariant it allows for considerable affine change
    • SIFT also allows for changes in 3D viewpoint

Page 30:

Invariance

• Illumination

• Scale

• Rotation

• Affine

Page 31:

Readings

• David G. Lowe, "Object recognition from local scale-invariant features," International Conference on Computer Vision (ICCV), 1999

• David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110

Page 32:

SIFT Algorithm Overview

1. Scale-space extrema detection

2. Keypoint localization

3. Orientation Assignment

4. Generation of keypoint descriptors.

Page 33:

Scale Space

• Different scales are appropriate for describing different objects in the image, and we may not know the correct scale/size ahead of time.

Page 34:

Scale space (Cont.)

• Looking for features (locations) that are stable (invariant) across all possible scale changes
  – use a continuous function of scale (scale space)

• Which scale-space kernel will we use?
  – The Gaussian Function

Page 39:

Scale-Space of Image

$$L(x, y, k\sigma) = G(x, y, k\sigma) * I(x, y)$$

- $G(x, y, k\sigma)$: variable-scale Gaussian

- $I(x, y)$: input image

• To detect stable keypoint locations, find the scale-space extrema in the difference-of-Gaussian function

$$D(x, y, \sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$$

Look familiar?

- bandpass filter!

Page 41:

Difference of Gaussian

1. A = Convolve image with vertical and horizontal 1D Gaussians, σ=sqrt(2)

2. B = Convolve A with vertical and horizontal 1D Gaussians, σ=sqrt(2)

3. DOG (Difference of Gaussian) = A – B

4. Downsample B with bilinear interpolation with pixel spacing of 1.5 (linear combination of 4 adjacent pixels)
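Steps 1 to 3 above can be sketched in plain Python as two separable Gaussian blurs followed by a subtraction. The helper names (`gaussian_kernel`, `convolve_rows`, `blur`, `difference_of_gaussian`), the truncation of the kernel at about 3σ, and the clamping of indices at the image border are assumptions made for this illustration; the slides only specify σ = sqrt(2) and the A – B difference. Step 4 (resampling) is not included here.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma (assumption)."""
    radius = int(math.ceil(3 * sigma))
    k = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]


def convolve_rows(img, kernel):
    """Convolve every row with a symmetric kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    cols = len(img[0])
    return [
        [
            sum(kernel[t + radius] * row[min(max(c + t, 0), cols - 1)]
                for t in range(-radius, radius + 1))
            for c in range(cols)
        ]
        for row in img
    ]


def transpose(img):
    return [list(col) for col in zip(*img)]


def blur(img, sigma):
    """Separable blur: horizontal 1D pass, then vertical 1D pass."""
    kernel = gaussian_kernel(sigma)
    horizontal = convolve_rows(img, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))


def difference_of_gaussian(img, sigma=math.sqrt(2)):
    """Return (A, B, DOG) with A = blur(img), B = blur(A), DOG = A - B."""
    A = blur(img, sigma)
    B = blur(A, sigma)
    dog = [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]
    return A, B, dog
```

On a constant image the DOG output is zero everywhere, while an isolated bright pixel gives a positive response at its center, consistent with the bandpass behavior of the difference-of-Gaussian function.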

Page 42:

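Difference of Gaussian Pyramid

[Figure: the Input Image passes through repeated Blur and Downsample stages to produce the smoothed pairs (A1, B1), (A2, B2), and (A3, B3); each level's difference gives DOG1 = A1-B1, DOG2 = A2-B2, and DOG3 = A3-B3]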

Page 45:

Other issues

• Initial smoothing ignores highest spatial frequencies of images

- expand the input image by a factor of 2, using bilinear interpolation, prior to building the pyramid

• How to do downsampling with bilinear interpolations?

Page 46:

Bilinear Filter

Weighted sum of four neighboring pixels

[Figure: a sample position (u, v) on the pixel grid, with axes x and y]

Page 48:

Bilinear Filter

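Sampling at S(x,y), with a and b the fractional offsets of the sample point from pixel (i,j) along the j and i directions (0 ≤ a, b ≤ 1):

[Figure: sample point (u, v) inside the cell formed by pixels (i,j), (i,j+1), (i+1,j), and (i+1,j+1), with axes x and y]

S(x,y) = (1-a)*(1-b)*S(i,j) + (1-a)*b*S(i+1,j)

+ a*(1-b)*S(i,j+1) + a*b*S(i+1,j+1)

To optimize the above, do the following

Si = S(i,j) + a*(S(i,j+1)-S(i,j))

Sj = S(i+1,j) + a*(S(i+1,j+1)-S(i+1,j))

S(x,y) = Si+b*(Sj-Si)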
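The two forms of the bilinear filter can be checked against each other with a short sketch. It assumes that a and b are the fractional offsets of the sample point from pixel (i,j) along the j and i directions; if they are instead measured from the opposite corner, the weights a and 1-a (and b and 1-b) swap roles. The function names and the `resample` helper, which applies the filter with a pixel spacing of 1.5, are illustrative.

```python
def bilinear_expanded(S, i, j, a, b):
    """Weighted sum of the four neighbors (four multiplications per neighbor pair)."""
    return ((1 - a) * (1 - b) * S[i][j] + a * (1 - b) * S[i][j + 1]
            + (1 - a) * b * S[i + 1][j] + a * b * S[i + 1][j + 1])


def bilinear_lerp(S, i, j, a, b):
    """Same result with three linear interpolations."""
    Si = S[i][j] + a * (S[i][j + 1] - S[i][j])              # along row i
    Sj = S[i + 1][j] + a * (S[i + 1][j + 1] - S[i + 1][j])  # along row i+1
    return Si + b * (Sj - Si)                               # between the two rows


def resample(S, spacing=1.5):
    """Downsample S by sampling every `spacing` pixels with bilinear interpolation."""
    rows = int((len(S) - 1) / spacing) + 1
    cols = int((len(S[0]) - 1) / spacing) + 1
    out = []
    for r in range(rows):
        y = r * spacing
        i = min(int(y), len(S) - 2)       # keep i+1 inside the image
        b = y - i
        out_row = []
        for c in range(cols):
            x = c * spacing
            j = min(int(x), len(S[0]) - 2)
            a = x - j
            out_row.append(bilinear_lerp(S, i, j, a, b))
        out.append(out_row)
    return out


# A linear ramp is reproduced exactly by bilinear interpolation.
ramp = [[10 * r + c for c in range(4)] for r in range(4)]
small = resample(ramp)
```

The interpolated form needs three multiplications per sample instead of eight, which is the optimization the slide refers to.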

Page 49:

Bilinear Filter

[Figure: the four neighboring pixels (i,j), (i,j+1), (i+1,j), and (i+1,j+1) on the pixel grid, with axes x and y]

Page 50:

Pyramid Example

[Figure: example pyramid images A1, B1, DOG1; A2, B2, DOG2; A3, B3, DOG3]

Page 51:

Feature Detection

• Find maxima and minima of scale space

• For each point on a DOG level:
  – Compare to 8 neighbors at same level
  – If max/min, identify corresponding point at pyramid level below
  – Determine if the corresponding point is max/min of its 8 neighbors
  – If so, repeat at pyramid level above

• Repeat for each DOG level

• Those that remain are key points
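The search above can be sketched as follows. Two simplifying assumptions are made and should be treated as such: all DOG levels have already been resampled to the same size, so the "corresponding point" on the level below or above has the same (row, column), and a point counts as an extremum when it is strictly larger or strictly smaller than its 8 same-level neighbors and the 9 pixels around the corresponding point on each adjacent level. The function names are illustrative.

```python
def is_extremum(dogs, level, r, c):
    """True if dogs[level][r][c] is a strict max or min over its 26 scale-space neighbors."""
    value = dogs[level][r][c]
    neighbors = []
    for lv in (level - 1, level, level + 1):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if lv == level and dr == 0 and dc == 0:
                    continue  # skip the candidate itself
                neighbors.append(dogs[lv][r + dr][c + dc])
    return value > max(neighbors) or value < min(neighbors)


def find_keypoints(dogs):
    """Return (level, row, col) for every interior extremum with a level above and below."""
    keys = []
    for level in range(1, len(dogs) - 1):
        rows, cols = len(dogs[level]), len(dogs[level][0])
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                if is_extremum(dogs, level, r, c):
                    keys.append((level, r, c))
    return keys


# Tiny example: three 3x3 levels with a single peak in the middle level.
zeros = [[0.0] * 3 for _ in range(3)]
peak = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
keys = find_keypoints([zeros, peak, zeros])
```

In a real pyramid with a pixel spacing of 1.5 between levels, the corresponding point would have to be mapped through the resampling before the comparison.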

Page 52:

Identifying Max/Min

[Figure: a point on DOG level L compared with its neighbors on levels DOG L-1, DOG L, and DOG L+1]

Page 53:

Refining Key List: Illumination

• For all levels, use the “A” smoothed image to compute
  – Gradient Magnitude

• Threshold gradient magnitudes:
  – Remove all key points with Mij less than 0.1 times the max gradient value

• Motivation: Low contrast is generally less reliable than high for feature points
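A small sketch of this refinement step follows. The slides do not give the formula for Mij, so the pixel-difference gradient used here, the function names, and the representation of key points as (i, j) tuples are assumptions for illustration.

```python
import math


def gradient_magnitude(A):
    """Pixel-difference gradient magnitude M[i][j]; the last row and column stay at 0."""
    rows, cols = len(A), len(A[0])
    M = [[0.0] * cols for _ in range(rows)]
    for i in range(rows - 1):
        for j in range(cols - 1):
            M[i][j] = math.hypot(A[i][j] - A[i + 1][j], A[i][j] - A[i][j + 1])
    return M


def filter_low_contrast(keys, M, fraction=0.1):
    """Keep key points whose magnitude is at least `fraction` times the maximum."""
    max_m = max(max(row) for row in M)
    return [(i, j) for (i, j) in keys if M[i][j] >= fraction * max_m]


# Smoothed image "A" with one bright pixel; (0, 0) lies in a flat, low-contrast area.
A_img = [
    [0.0, 0.0, 0.0],
    [0.0, 10.0, 0.0],
    [0.0, 0.0, 0.0],
]
M = gradient_magnitude(A_img)
kept = filter_low_contrast([(1, 1), (0, 0)], M)
```

Only the key point on the bright pixel survives; the one in the flat region falls below 0.1 times the maximum gradient magnitude and is removed.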

Page 54:

SIFT Feature Orientation?

• We have now obtained the location and scale of the SIFT features

• How can we obtain the orientation of features?

Page 55:

Assigning Canonical Orientation

• For each remaining key point:
  – Choose surrounding N x N window at DOG level it was detected

[Figure: DOG image]

Page 56:

Assigning Canonical Orientation

• For all levels, use the “A” smoothed image to compute
  – Gradient Orientation

[Figure: Gaussian Smoothed Image, with its Gradient Orientation and Gradient Magnitude]

Page 57:

Assigning Canonical Orientation

• Gradient magnitude weighted by 2D Gaussian

[Figure: Gradient Magnitude * 2D Gaussian = Weighted Magnitude]

Page 58:

Assigning Canonical Orientation

• Accumulate in histogram based on orientation

• Histogram has 36 bins with 10° increments

[Figure: Weighted Magnitude and Gradient Orientation images feeding a histogram; x-axis: Gradient Orientation, y-axis: Sum of Weighted Magnitudes]

Page 59:

Assigning Canonical Orientation

• Identify peak and assign orientation and sum of magnitude to key point

[Figure: the same histogram with the peak marked; x-axis: Gradient Orientation, y-axis: Sum of Weighted Magnitudes]
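The orientation assignment can be sketched as a 36-bin histogram of Gaussian-weighted gradient magnitudes. The function name `canonical_orientation`, the `sigma` parameter of the weighting Gaussian, and the choice to report the center of the peak bin are assumptions for this illustration; the slides specify only the 36 bins of 10° and the use of the peak.

```python
import math


def canonical_orientation(magnitude, orientation_deg, sigma):
    """Return (peak angle in degrees, peak sum, histogram) for an N x N window.

    `magnitude` and `orientation_deg` are N x N lists centered on the key point.
    """
    n = len(magnitude)
    center = (n - 1) / 2.0
    hist = [0.0] * 36
    for r in range(n):
        for c in range(n):
            # 2D Gaussian weight centered on the key point
            w = math.exp(-((r - center) ** 2 + (c - center) ** 2) / (2 * sigma ** 2))
            bin_index = int((orientation_deg[r][c] % 360.0) / 10.0) % 36
            hist[bin_index] += w * magnitude[r][c]
    peak = max(range(36), key=lambda k: hist[k])
    return peak * 10 + 5, hist[peak], hist


# 3x3 window: eight pixels point near 45 degrees, one corner pixel points at 200 degrees.
mags = [[1.0] * 3 for _ in range(3)]
oris = [[45.0] * 3 for _ in range(3)]
oris[0][0] = 200.0
angle, peak_sum, hist = canonical_orientation(mags, oris, sigma=1.5)
```

Here `angle` is the center of the 40° to 50° bin, and `peak_sum` is the sum of weighted magnitudes that the slide assigns to the key point along with the orientation.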

Page 60:

Local Image Description

• SIFT keys each assigned:
  – Location
  – Scale (analogous to level it was detected)
  – Orientation (assigned in previous canonical orientation steps)

• Now: Describe local image region invariant to the above transformations

Page 61:

SIFT key example

Page 62:

Local Image Description

For each key point:

1. Identify 8x8 neighborhood (from DOG level it was detected)

2. Align orientation to x-axis

Page 63:

Local Image Description

3. Calculate gradient magnitude and orientation map

4. Weight by Gaussian

Page 64:

Local Image Description

5. Calculate histogram of each 4x4 region. 8 bins for gradient orientation. Tally weighted gradient magnitude.

Page 65:

Local Image Description

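6. This histogram array is the image descriptor. (The example here is a vector of length 8*4 = 32: four 4x4 regions with 8 orientation bins each. Best suggestion: a 128-element vector for a 16x16 neighborhood, i.e. sixteen 4x4 regions with 8 bins each.)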
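Steps 5 and 6 can be sketched as below. The sketch assumes the neighborhood has already been rotated to the key point's orientation and Gaussian-weighted (steps 2 to 4), so it receives a weighted-magnitude window and an orientation window in degrees. The function name `sift_descriptor` and its parameters are illustrative.

```python
def sift_descriptor(weighted_mag, orientation_deg, region=4, bins=8):
    """Concatenate one `bins`-bin orientation histogram per region x region block."""
    n = len(weighted_mag)
    bin_width = 360.0 / bins
    descriptor = []
    for r0 in range(0, n, region):
        for c0 in range(0, n, region):
            hist = [0.0] * bins
            for r in range(r0, r0 + region):
                for c in range(c0, c0 + region):
                    b = int((orientation_deg[r][c] % 360.0) / bin_width) % bins
                    hist[b] += weighted_mag[r][c]  # tally weighted gradient magnitude
            descriptor.extend(hist)
    return descriptor


# 8x8 neighborhood -> 4 regions x 8 bins = 32 values.
mag8 = [[1.0] * 8 for _ in range(8)]
ori8 = [[0.0] * 8 for _ in range(8)]
d32 = sift_descriptor(mag8, ori8)

# 16x16 neighborhood -> 16 regions x 8 bins = 128 values.
mag16 = [[1.0] * 16 for _ in range(16)]
ori16 = [[90.0] * 16 for _ in range(16)]
d128 = sift_descriptor(mag16, ori16)
```

The descriptor length follows directly from the neighborhood size: 32 for the 8x8 example on the slide and 128 for the suggested 16x16 neighborhood.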

Page 66:

Applications: Image Matching

• Find all key points identified in source and target image
  – Each key point will have 2D location, scale and orientation, as well as invariant descriptor vector

• For each key point in source image, search corresponding SIFT features in target image.

• Find the transformation between two images using epipolar geometry constraints or affine transformation.

Page 67:

Image matching via SIFT features

Feature detection

Page 68:

Image matching via SIFT features

• Image matching via nearest neighbor search

- if the ratio of the closest distance to the 2nd closest distance is greater than 0.8, then reject the match as false.

• Remove outliers using epipolar line constraints.
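The nearest-neighbor search with the 0.8 ratio test can be sketched with a brute-force comparison. The function name `match_features`, the use of Euclidean distance between descriptors, and the exhaustive search are assumptions for this illustration; outlier removal with epipolar constraints is not shown.

```python
import math


def match_features(source, target, max_ratio=0.8):
    """Return (source index, target index) pairs that pass the distance ratio test."""
    matches = []
    for si, s in enumerate(source):
        # Distances from this source descriptor to every target descriptor, closest first.
        dists = sorted((math.dist(s, t), ti) for ti, t in enumerate(target))
        if len(dists) < 2:
            continue  # the ratio test needs a second-closest neighbor
        (d1, best), (d2, _) = dists[0], dists[1]
        # Reject when closest / second-closest is greater than max_ratio.
        if d2 > 0 and d1 / d2 <= max_ratio:
            matches.append((si, best))
    return matches


# Toy 2D "descriptors": the first source point has one clear match, while the
# second is equally close to two targets and is rejected as ambiguous.
source = [[0.0, 0.0], [5.0, 5.0]]
target = [[0.0, 0.1], [10.0, 10.0], [5.0, 6.0], [5.0, 4.0]]
matches = match_features(source, target)
```

Real SIFT descriptors would be 128-element vectors, and an approximate nearest-neighbor structure would usually replace the exhaustive loop for large feature sets.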

Page 69:

Image matching via SIFT features

Page 70:

Summary

• SIFT features are reasonably invariant to rotation, scaling, and illumination changes.

• We can use them for image matching and object recognition among other things.

• Efficient on-line matching and recognition can be performed in real time.