
EE 168 Handout # Introduction to Digital Image Processing February 28, 2012

HOMEWORK 6 – SOLUTIONS

Before presenting the solutions for the individual problems, it is helpful to review some of the related ideas presented in the lectures. The solutions of the problems follow this review.

Review of material:

Convolution between two discrete space-domain signals h1(n,m) and h2(n,m), [0 ≤ n,m ≤ W – 1], can be performed in two different ways:

1) Evaluating the convolution sum in the space domain, i.e. computing the following sum:

h_3(n,m) = \sum_{k=0}^{W-1} \sum_{l=0}^{W-1} h_1(k,l) \, h_2(n-k,\, m-l)

where h3(n,m), 0 ≤ n,m ≤ 2*(W – 1) is the resultant convolution sum.

2) Using the Fourier Transforms:

i) Finding the Discrete Fourier Transforms (DFTs), H1(k,l), [0 ≤ k,l ≤ W – 1], and H2(k,l), [0 ≤ k,l ≤ W – 1], of the two signals h1(n,m) and h2(n,m) as:

H_1(k,l) = \sum_{n=0}^{W-1} \sum_{m=0}^{W-1} h_1(n,m) \, e^{-j 2\pi (n k + m l)/W},   0 ≤ k, l ≤ W – 1

H_2(k,l) = \sum_{n=0}^{W-1} \sum_{m=0}^{W-1} h_2(n,m) \, e^{-j 2\pi (n k + m l)/W},   0 ≤ k, l ≤ W – 1

ii) Computing the DFT of h3*(n,m) using the convolution theorem:

H_3^*(k,l) = H_1(k,l) \cdot H_2(k,l),   0 ≤ k, l ≤ W – 1

iii) Inverting H_3^*(k,l) to obtain h_3^*(n,m):

h_3^*(n,m) = \frac{1}{W^2} \sum_{k=0}^{W-1} \sum_{l=0}^{W-1} H_3^*(k,l) \, e^{j 2\pi (n k + m l)/W},   0 ≤ n, m ≤ W – 1
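The two procedures can be summarized in a few lines of MATLAB. The following is a minimal sketch (the array names and the size W = 8 are illustrative only, not part of the assignment):

    W  = 8;
    h1 = rand(W, W);                    % any W-by-W test signals
    h2 = rand(W, W);

    % Method 1: space-domain convolution sum; the full result h3 is (2W-1)-by-(2W-1)
    h3 = conv2(h1, h2);

    % Method 2: W-point DFTs and the convolution theorem; this gives the
    % W-by-W circular convolution h3*, which is not h3 in general
    H1      = fft2(h1);                 % step (i)
    H2      = fft2(h2);
    H3_star = H1 .* H2;                 % step (ii)
    h3_star = real(ifft2(H3_star));     % step (iii)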


Two important points, which are also emphasized in the problems, need to be noted. The first is related to the computational efficiency of the two methods, while the second concerns the accuracy of the second method.

Computational Efficiency:

The second method is computationally more efficient than the first. In the first method, W^2 multiplications are needed to compute each value of h3(n,m). Since h3 is of size W^2, a total of W^4 multiplications are needed. Although the number of multiplications in the second method is also on the order of W^4 when the DFTs are evaluated directly, the second method involves only DFT computations, which can be reduced significantly through the FFT approach. In particular, a 2-D DFT of a W by W function can be computed using only W^2 log2(W^2) operations. Hence the second method, which involves a total of three DFT computations (steps i and iii) and one product (step ii), requires a total of 3 W^2 log2(W^2) + W^2 operations.
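For example, with W = 1024 as in Problems 1 and 2, the direct sum needs W^4 = 1024^4 ≈ 1.1 × 10^12 multiplications, while the FFT route needs about 3 · 1024^2 · 20 + 1024^2 ≈ 6.4 × 10^7 operations, a savings of roughly four orders of magnitude.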

Accuracy of Second Method:

A careful look at the two methods above reveals that the 2(W – 1)-length (in both dimensions) signal h3 is replaced by the W-length (again in both dimensions) signal h3* in the second method. The difference in lengths is due to the fact that the product of the two DFTs of h1(n,m) and h2(n,m) does not yield the DFT of h3, the (linear) convolution of h1 and h2, but rather the DFT of their circular convolution. The crucial point is that the desired linear convolution h3 and the circular convolution h3* are equivalent if h1 and h2 are each zero over an interval (in both dimensions) at least as long as their non-zero extent. You may want to read Chapter 6 of Ludeman (used in EE103) or Chapter 10 of Bracewell (used in EE261) for more information on this.
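A quick 1-D illustration of this point (the pulse widths below are chosen only for illustration and are not from the assignment):

    W = 16;
    f = zeros(1, W);  f(1:4)  = 1;        % short pulse: 12 trailing zeros
    g = zeros(1, W);  g(1:14) = 1;        % long pulse: only 2 trailing zeros

    lin  = conv(f, g);                    % linear convolution, length 2W-1
    circ = real(ifft(fft(f) .* fft(g)));  % W-point circular convolution

    [lin(1:3); circ(1:3)]                 % circ(1) differs: the tail of lin has wrapped around

    % Zero-padding the DFTs to length 2W-1 (or more) removes the wrap-around:
    N     = 2*W - 1;
    circ2 = real(ifft(fft(f, N) .* fft(g, N)));
    max(abs(circ2 - lin))                 % on the order of machine precision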


Problem 1: Convolution of Simple Shapes

Note that this problem asks us to compute the convolution sum using the first method reviewed above. This requires W^4 = 1024^4 multiplications. Using Matlab's tic and toc functions, this takes approximately 55 seconds on a typical Dell Precision Workstation (equipped with an Intel 4 processor) in the ISE lab. Figure 1-1 shows the result of the convolution and illustrates two cuts through the origin, one horizontal and the other diagonal.

Since h1(n,m) and h2(n,m) are squares of size 1024 by 1024, they can be represented as the product of two 1-D rectangles of width 1024. Letting the 1024-width rectangles be denoted f(n) and g(m), we write:

h_1(n,m) = f(n) \, g(m)   and   h_2(n,m) = f(n) \, g(m)

Using the separability theorem, their DFTs can be represented as:

H_1(k,l) = F(k) \, G(l)   and   H_2(k,l) = F(k) \, G(l)

where F(k) and G(l) are the 1-D DFTs of f(n) and g(m), respectively. Finally, the convolution theorem is used to obtain:

H_3(k,l) = H_1(k,l) \cdot H_2(k,l) = F(k) G(l) \cdot F(k) G(l) = \big(F(k) F(k)\big) \, \big(G(l) G(l)\big) = A(k) \, B(l)

where A(k) and B(l) are the 1-D DFTs of f(n)*f(n) and g(m)*g(m), respectively. Applying separability once more, we conclude that h_3(n,m) = a(n) \, b(m), where a(n) and b(m) are triangles of width 2W. (Recall from HW-5 that the convolution of two squares of width W is a triangle of width 2W.) You can check that the product of the two 1-D triangles has the same values as the convolution sum computed above.
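A quick numerical check of this separability argument, with a smaller illustrative size (W = 64 rather than the assignment's 1024):

    W      = 64;
    f      = ones(1, W);                 % 1-D rectangle
    tri    = conv(f, f);                 % 1-D triangle of width 2W-1
    sep    = tri' * tri;                 % outer product a(n) b(m)

    rect2d = ones(W, W);                 % the 2-D square f(n) g(m)
    full2d = conv2(rect2d, rect2d);      % direct 2-D convolution

    max(abs(full2d(:) - sep(:)))         % 0: the two constructions agree exactly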

[Figure 1-1: Convolving a square function with itself in the space domain. Panels: the original image, the convolution result, a horizontal cut, and a diagonal cut through the origin.]


Problem 2: Convolution with the Convolution Theorem

This problem asks us to compute the convolution of the two signals using the second method above, which requires on the order of W^2 log2(W^2) = 20*1024^2 multiplications per FFT. Using Matlab's tic and toc functions, this takes approximately 0.6 seconds on a typical Dell Precision Workstation (equipped with an Intel 4 processor) in the ISE lab. Figure 2-1 shows the result of the convolution. Note that since the square function has a sufficiently long interval of zeros in both the x and y dimensions, the circular convolution (obtained by inverting the product of the DFTs) is equivalent to the desired (linear) convolution. This is not always the case, as demonstrated in the following problem.
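A sketch of that comparison, using a smaller 256-by-256 version of the same centered square (sizes and names are illustrative):

    N  = 256;
    im = zeros(N);  im(N/2-32:N/2+32, N/2-32:N/2+32) = 1;

    direct = conv2(im, im, 'same');
    im_F   = fftshift(fft2(fftshift(im)));
    viaFFT = abs(fftshift(ifft2(fftshift(im_F .* im_F))));

    max(abs(direct(:) - viaFFT(:)))   % near machine precision: the wide zero margin
                                      % makes the circular and linear results agree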

[Figure 2-1: Convolving a square function with itself using FFTs. Panels: the original image, the convolution result using FFTs, a horizontal cut, and a diagonal cut.]

Problem 3: Convolution and Effects

Note that h2(n,m) is non-zero over an interval of length 180 (in the x-direction), while it is zero over an interval of only length 76. Since 76 is less than 180, there is wrap-around onto non-zero elements, and the circular convolution obtained by inverting the product of the two DFTs differs from the desired (linear) convolution. The comparison of the direct (linear) convolution and the circular convolution through DFTs is shown in Figure 3-1. The first 20 columns are zero for the direct convolution, as they should be, since the points that fall off the end of the array are clipped (remember we are looking at the 'same' convolution size parameter). However, for the DFT-produced convolution, the off-the-end points reappear as non-zero values in the first 15 columns of the image. This is evident in the center-row cuts plotted in Figure 3-2. The wrap-around effect of the DFT approach is due to the periodic nature of DFTs.
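One way to avoid the wrap-around entirely is to zero-pad the DFTs to at least N + M - 1 points in each dimension before multiplying. A sketch using this problem's 256-by-256 arrays (the padded size 512 and the comparison against conv2 are illustrative choices, not part of the assignment):

    center = 256/2 + 1;
    imA = zeros(256);  imA(center-32:center+31, center-32:center+31) = 1;
    imB = zeros(256);  imB(center-32:center+31, 61:240) = 1;

    P      = 512;                                 % >= 256 + 256 - 1, so no wrap-around
    padded = real(ifft2(fft2(imA, P, P) .* fft2(imB, P, P)));
    linear = conv2(imA, imB);                     % full linear convolution, 511-by-511

    max(max(abs(padded(1:511, 1:511) - linear)))  % on the order of machine precision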


[Figure 3-1: Convolving two rectangles of different widths in the space and spatial-frequency domains. Panels: the convolution result using conv2(), the convolution result using FFTs, the first 20 columns of the direct convolution (all zeros), and the first 20 columns of the convolution using DFTs.]

[Figure 3-2: Cuts through the center of the two convolution results, computed in the space and spatial-frequency domains. Panels: cut through the convolution result using conv2(), and cut through the convolution result using DFTs.]


Problem 4: Autocorrelation of a Circular Disk

The autocorrelation h4(n,m) of a real, W-length discrete-space signal h1(n,m) is defined as:

h_4(n,m) = \sum_{k=0}^{W-1} \sum_{l=0}^{W-1} h_1(k,l) \, h_1(k+n,\, l+m)

We can then express the autocorrelation h4(n,m) in convolution notation as:

h_4(n,m) = h_1(n,m) * h_1(-n,-m)

Noting that the circle is symmetric about both the x-axis and the y-axis, i.e. h1(n,m) = h1(-n,-m), we conclude that for a circle the autocorrelation is identical to the convolution of the circle with itself. Hence, we can use the conv2() function in Matlab to evaluate the autocorrelation. The autocorrelation is illustrated in Figure 4-1. We get the same result if we use Matlab's xcorr2() function, and likewise if we use DFT techniques, as shown in the attached code.
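A small check of this symmetry argument, on an odd-sized grid so that the disk is exactly centered (the sizes are illustrative, smaller than the assignment's 512-by-512 image):

    N = 65;  radius = 8;
    [x, y] = meshgrid(-(N-1)/2:(N-1)/2);
    disk = double(x.^2 + y.^2 <= radius^2);    % disk(n,m) = disk(-n,-m) about its center

    acorr_conv  = conv2(disk, disk);           % convolution of the disk with itself
    acorr_xcorr = xcorr2(disk);                % autocorrelation by its general definition
    max(abs(acorr_conv(:) - acorr_xcorr(:)))   % 0: identical for this symmetric input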

[Figure 4-1: The autocorrelation of a circular disk. Panels: the original circular disk image, the autocorrelation of the disk, a zoom of the autocorrelation, and a cut through the peak of the autocorrelation result.]


Problem 5: Where, o where has my dog gone?

The cross-correlation between the main image and the dog image can be used to locate the dogs in the main image. We use the Matlab function xcorr2(), which evaluates the 2-D cross-correlation of the two images (Figure 5-1). We get the same result using DFT methods, as shown in Figure 5-2. To understand why the cross-correlation technique performs well in finding the dogs, assume that the 64 by 64 dog image is placed on top of a larger uniform background. Because we have made the dog image have a mean of zero, the output of the cross-correlation for the uniform background is zero, while the output is maximum when the dog image sits exactly on top of a dog in the original image. When the dog images are aligned and the cross-correlation is maximum, we see a bright spot in the cross-correlation image. Note that variation in the background (i.e. a non-uniform background) causes the cross-correlation to be less than the maximum, since the corresponding dog-image points no longer match exactly. This is why some of the bright cross-correlation spots are brighter than others.

We can consequently conclude that dogs located on more uniform regions (such as those on the water) are easier to identify than those positioned over a more varied background (such as leaves). The intensities of the correlation peaks reflect this, as shown in Figure 5-1. We can identify a total of 10 dogs in the cross-correlation image, but perhaps only 7 with the eye directly.
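A rough sketch of turning the correlation image into automatic detections, starting from the im_crosscorr array computed in the attached code (the 3-by-3 local-maximum test via ordfilt2 and the 0.7 threshold are illustrative choices, not part of the assignment, and the resulting count depends on the threshold):

    peak_val = max(im_crosscorr(:));
    is_peak  = (im_crosscorr == ordfilt2(im_crosscorr, 9, ones(3))) ...   % 3x3 local maxima
               & (im_crosscorr > 0.7 * peak_val);                         % strong peaks only
    [rows, cols] = find(is_peak);
    num_detections = numel(rows)       % one (row, col) pair per candidate dog location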

[Figure 5-1: Using cross-correlation to locate objects in an image. Panels: the original 512x512 image, and the cross-correlation between the image and the dog.]

[Figure 5-2: Cross-correlation done in the space domain and the FFT domain. Panels: the cross-correlation with xcorr2(), and the cross-correlation with FFT methods.]


Problem 6: Computer Vision

In part ii, it's pretty cool that this method seems to work to find the sign!

In part iii, we see that this method isn't particularly robust. Though we as humans can see that a small "one way" sign is the same as a large "one way" sign, when the template does not exactly match the size, shape, rotation angle, etc. of the target in the background image, the correlation is not very strong, since correlation is based on pixel-by-pixel similarity between the two images.


In computer vision, there are methods of correlating with scaled, rotated, and tilted templates that make tasks like sign detection easier, but over time this correlation-based approach has largely been replaced by convolutional neural networks! This question was adapted from a homework assignment for AA274A: Principles of Robotic Autonomy. For more on this topic, take that class or CS231N: CNNs for Computer Vision.
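As a simple example of the scaled-template idea, one can correlate the background against the template at several scales and keep the strongest peak. The sketch below reuses the sign and background arrays from the attached code; the scale set and the imresize-based resampling are illustrative choices, not part of the original assignment:

    scales   = [0.25 0.5 0.75 1.0];
    bestPeak = -Inf;
    for s = scales
        tmpl = imresize(sign, s);              % scaled copy of the template
        c    = normxcorr2(tmpl, background);
        [peakVal, idx] = max(c(:));
        if peakVal > bestPeak
            bestPeak = peakVal;
            [bestRow, bestCol] = ind2sub(size(c), idx);
            bestSize  = size(tmpl);
            bestScale = s;
        end
    end
    % As in part ii, (bestRow, bestCol) marks the bottom-right corner of the best
    % match, and bestSize/bestScale tell us which template size produced it.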


MATLAB code:

Problem 1: Convolution of Simple Shapes

% 'tic' and 'toc' functions are used to measure the execution time.
% Note that conv2() is used with the argument 'same' so that the output has
% the same length and width as the 2 inputs.
im = zeros(512, 512);
im((256-64):(256+64), (256-64):(256+64)) = 1;
tic
im_conv = conv2(im, im, 'same');
toc
cut_xaxis = im_conv(256,:);
for i = 1:512, cut_45deg(i) = im_conv(i,i); end
x = [-256:255];
y = [-256:255];
figure; colormap(gray);
subplot(2,2,1); imagesc(x, y, im); axis('image'); title('The original image');
subplot(2,2,2); imagesc(x, y, im_conv); axis('image'); title('The convolution result')
subplot(2,2,3); plot(x, cut_xaxis); title('Horizontal cut')
subplot(2,2,4); plot(x, cut_45deg); title('Diagonal cut')

Problem 2: Convolution with the Convolution Theorem

% Note that the fftshift needs to be used to shift the center to
% be at the 'corners' of the image before applying the fft2.
im = zeros(512, 512);
im((256-64):(256+64), (256-64):(256+64)) = 1;
tic
im_F = fftshift(fft2(fftshift(im)));
im_conv_F = im_F .* im_F;
im_conv = abs(fftshift(ifft2(fftshift(im_conv_F))));
toc
cut_xaxis = im_conv(256,:);
for i = 1:512, cut_45deg(i) = im_conv(i,i); end
x = [-256:255];
y = [-256:255];
figure; colormap(gray);
subplot(2,2,1); imagesc(x, y, im); axis('image')
title('The Original Image')
subplot(2,2,2); imagesc(x, y, abs(im_conv)); axis('image');
title('The Convolution result using FFTs');
subplot(2,2,3); plot(x, cut_xaxis); title('Horizontal cut')
subplot(2,2,4); plot(x, cut_45deg); title('Diagonal cut')


Problem 3: Convolution and Effects

% Choose center to be 256/2+1 (what fftshift maps to 0) with 32 on either side,
% so that it is symmetric about "zero".
center = 256/2 + 1;
imA = zeros(256, 256);
imA((center-32):(center+31), (center-32):(center+31)) = 1;   % this is 64x64
imB = zeros(256, 256);
imB((center-32):(center+31), 61:240) = 1;
im_conv = conv2(imA, imB, 'same');
imA_F = fftshift(fft2(fftshift(imA)));
imB_F = fftshift(fft2(fftshift(imB)));
im_conv_F = imA_F .* imB_F;
im_conv_F_inv = abs(fftshift(ifft2(fftshift(im_conv_F))));
figure; colormap(gray);
subplot(2,2,3); imagesc([-128:-97], [-128:127], abs(im_conv(:,1:20)));
title('Direct Convolution - First 20 columns (all zeros)');
subplot(2,2,4); imagesc([-128:-97], [-128:127], abs(im_conv_F_inv(:,1:20)));
title('Convolution using DFTs - first 20 columns');
subplot(2,2,1); imagesc([-128:127], [-128:127], abs(im_conv));
title('Convolution result using conv2()')
subplot(2,2,2); imagesc([-128:127], [-128:127], abs(im_conv_F_inv));
title('Convolution result using FFTs')
cut_FFT = abs(im_conv_F_inv(128,:));
cut_conv = abs(im_conv(128,:));
figure;
subplot(1,2,1); plot(cut_conv); title('Cut through the convolution result using conv2()')
subplot(1,2,2); plot(cut_FFT); title('Cut through the convolution result using DFTs')

Problem 4: Autocorrelation of a Circular Disk

% Note that for a real, symmetric function, one can use conv2() to compute the
% autocorrelation; the 'full' convolution gives the full autocorrelation.
im_disk = zeros(512, 512);
radius = 32;
circle = ones(2*radius+1, 2*radius+1);
for i = 1:2*radius+1,
    for j = 1:2*radius+1,
        if (sqrt((i-radius-1)^2 + (j-radius-1)^2) > radius+1), circle(i,j) = 0; end
    end
end
row_circle = repmat([256-radius:256+radius], size(circle,1), 1);
col_circle = repmat([256-radius:256+radius]', 1, size(circle,2));
im_disk(sub2ind([512, 512], row_circle, col_circle)) = circle;
im_acorr = conv2(im_disk, im_disk, 'full');   % 'full' convolution = autocorrelation
im_acorr_2 = xcorr2(im_disk);
cut_peak = im_acorr(round(size(im_acorr,1)/2), :);
im_disk_F = fft2(fftshift(im_disk));
corr_F = conj(im_disk_F) .* im_disk_F;
im_corrF = fftshift(ifft2(corr_F));


figure; colormap(gray(256));
imagesc([-100:100], [-100:100], abs(im_corrF));
title('The Autocorrelation of the disk with FFT methods'); axis('image');
figure; colormap(gray);
subplot(2,2,1); imagesc([-511:511], [-511:511], im_disk);
title('The original circular disk image'); axis('image')
subplot(2,2,2); imagesc(im_acorr);
title('The Autocorrelation of the disk'); axis('image')
subplot(2,2,4); imagesc([-100:100], [-100:100], im_acorr([511-100:511+100], [511-100:511+100]));
title('Zoom of Autocorrelation'); axis('image')
subplot(2,2,3); plot([-511:511], cut_peak);
title('Cut through the peak of the autocorrelation result')

Problem 5: Where, o where has my dog gone?

% xcorr2() computes the 2-D cross-correlation between images.
fid = fopen('lab6prob5data', 'rb');
im = fread(fid, [512 512], 'uint8');
fclose(fid);
im = im';
fid = fopen('lab6prob5dog', 'rb');
dog = fread(fid, [64 64], 'uint8');
fclose(fid);
dog = dog';
dog = dog - mean(mean(dog));
im_crosscorr = xcorr2(im, dog);
center = floor(size(im_crosscorr,1)/2) + 1;
dog_zeropad = zeros(512, 512);
dog_zeropad(256-31:256+32, 256-31:256+32) = dog;
dog_F = fft2(fftshift(dog_zeropad));
im_F = fft2(im);
corr_F = conj(dog_F) .* im_F;
im_corrF = ifft2(corr_F);
figure; colormap(gray(256));
subplot(1,2,1); imagesc([-256:255], [-256:255], im);
title('The original 512x512 image'); axis('image');
subplot(1,2,2); imagesc([-250:250], [-220:220], ...
    abs(im_crosscorr((center-200):(center+200), (center-220):(center+220))));
title('The cross-correlation between image and dog'); axis('image');
figure; colormap(gray(256));
subplot(1,2,1); imagesc(abs(im_crosscorr));
title('The cross-correlation with xcorr2()'); axis('image');
subplot(1,2,2); imagesc(abs(im_corrF));
title('The cross-correlation with FFT methods'); axis('image');


Problem 6: Computer Vision

%% Load and display the images (part i)
background = rgb2gray(imread('lab6prob6data1.jpg'));
sign = rgb2gray(imread('lab6prob6data2.png'));
figure
colormap('gray')
imagesc(background)
axis image
title('Street View')
figure
colormap('gray')
imagesc(sign)
axis image
title('One Way')

%% Correlating the images and locating the sign (part ii)
corr = normxcorr2(sign, background);
% Find location of maximum value
% (Maximum cross-correlation point should correspond to the bottom-right
% corner of the matched region)
[ssr, snd] = max(corr(:));
[ij, ji] = ind2sub(size(corr), snd);
figure
subplot(2,1,1)
colormap gray
imagesc(corr);
title('Sign and Background Correlation');
axis image
subplot(2,1,2)
imagesc(background)
colormap gray
axis image
title('Sign Detected')
hold on;
rectangle('Position', [ji-size(sign,2)+1 ij-size(sign,1)+1 size(sign,2) size(sign,1)], ...
    'LineWidth', 2, 'EdgeColor', 'r')

%% Downscaling (part iii)
quarterSign = sign(1:4:end, 1:4:end);
corr = normxcorr2(quarterSign, background);
% Find location of maximum value
% (Maximum cross-correlation point should correspond to the bottom-right
% corner of the matched region)
[ssr, snd] = max(corr(:));
[ij, ji] = ind2sub(size(corr), snd);
figure
subplot(2,1,1)
colormap gray
imagesc(corr);
axis image
title('Quarter Size Sign and Background Correlation');
subplot(2,1,2)


imagesc(background)
colormap gray
axis image
title('Sign Detected')
hold on;
rectangle('Position', [ji-size(quarterSign,2)+1 ij-size(quarterSign,1)+1 size(quarterSign,2) size(quarterSign,1)], ...
    'LineWidth', 2, 'EdgeColor', 'r')
title('Quarter Size Sign Matching')