


EE368/CS232 - Midterm Exam

24-hour take-home exam, with three slots between Feb 28 and Mar 3. Each slot begins at 5:00pm and ends at 5:00pm the next day.

Please respect the Stanford Honor Code. There are a total of 5 problems in this midterm exam. You may use the lecture notes, your own notes, or any book to solve the exam. However, you may not discuss the exam with anyone else, including on Piazza or any other online forum, until March 3 at 5:00pm, when everyone in the class will have taken the exam. The only exception is that you can ask the course staff for clarification by emailing the staff email address ([email protected]). Emails sent to other email addresses may not be answered. We have tried hard to make the exam unambiguous and clear, so we are unlikely to say much.

Note: We expect you to submit solutions to 4 out of the 5 problems. Please solve Problems 1-3 and then select either Problem 4 or Problem 5. Do NOT submit solutions to both problems. If solutions to both problems are submitted, only one of them will be graded at random.

Submission guidelines:

• All the files needed for the problems are located in the archive midterm_data.zip, which you can download at: https://web.stanford.edu/class/ee368/S6Iq2VIiRN/midterm_data.zip

• Submissions must be made electronically on Gradescope. Please submit your answers as a single PDF file to the “Exam MM/DD” assignment, with relevant MATLAB code and figures, where MM/DD corresponds to the date you received the exam. The assignment for a given date will close at 5pm the next day, and late submissions will not be accepted.

• If you experience issues submitting to Gradescope, please email your solution to the staff email address ([email protected]).

Problem 1: Short Answer Questions (14 points)

Please provide a brief explanation for all your answers. Please answer concisely; a typical answer should be less than 5 sentences. No programming is required for Problem 1.

Part A (3 points): Alice wants to know the response of an unknown linear shift-invariant filter to the (infinite) checkerboard pattern shown below. Each number below represents the value of an individual pixel.

… … … … … …

… 10 50 10 50 …

… 50 10 50 10 …

… 10 50 10 50 …

… 50 10 50 10 …

… … … … … …

Bob has access to the real-valued impulse response, but he doesn’t want to share that information and only agrees to give Alice some values of the frequency response F(e^{jω_x}, e^{jω_y}), one at a time. What is the smallest subset of values of F(e^{jω_x}, e^{jω_y}) that Alice needs from Bob to calculate the response of the filter to the checkerboard pattern?

Part B (2 points): A light source emits an additive mixture of two monochromatic lights: one at wavelength 580 nm and the other at wavelength λ. We can independently adjust the intensities of each of these two components in order to match the appearance of the mixture with a spectrally flat white light (chromaticity coordinates: x = y = 1/3). For what (approximate) wavelength λ, or what range of wavelengths λ, is it possible for a viewer with normal trichromatic vision to exactly match the color of the light source with the reference white light?

Part C (2 points): People affected by deuteranopia have either defective or missing M-cones in their retina. As a result, they are unable to distinguish certain colors. This is shown in the CIE chromaticity diagram below as confusion lines: all the colors on one given line are indistinguishable to a deuteranope. Spectrally flat white is shown as the point W in the diagram.

[Figure: CIE chromaticity (x, y) diagram with deuteranope confusion lines and the white point W. Figure reference: Pitts' data (1935) from Le Grand, Y., Light, Color and Vision, 2nd ed. London: Chapman and Hall, 1968.]

We repeat the experiment from part B with a deuteranope. For what (approximate) wavelength λ, or what range of wavelengths λ, is it possible for a deuteranope to exactly match the color of the light source with the reference white light?

Part D (3 points): We wish to filter an image with the filter whose impulse response is shown below.

-1  -1  -1
-1 [9]  -1
-1  -1  -1

(The bracketed entry marks the origin.)

(1) Qualitatively, how do you expect this filter to modify the input image?

(2) What is the least number of additions, subtractions, multiplications, and divisions per output pixel that are required to implement this filter? Assume that all operations count equally. Please justify your answer.
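For concreteness, the kernel can be applied in MATLAB as follows (a minimal sketch; img is a placeholder name for any grayscale input image, and im2double plus 'replicate' boundary handling are arbitrary choices, not part of the problem statement):

    h = [-1 -1 -1; -1 9 -1; -1 -1 -1];               % impulse response above
    out = imfilter(im2double(img), h, 'replicate');  % symmetric kernel, so
                                                     % correlation = convolution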

Part E (4 points): Consider the following modification of Gaussian filtering.

!!"#[!, !] = ! !, ! ![!, !, !, !]!,!

![!, !, !, !]!,!

Where I is the input image, I[x, y] denotes a pixel of I at location x, y, Iout is the filtered output image, and w[i, j, k, l] denotes the weights that are computed as follows

![!, !, !, !] = exp − (!!!)!!(!!!)!!!!!

− ![!,!]!![!,!] !!!!!

,

where !! and !! are constant parameters. A sample image with intensity range of [0, 1] and the output of the above filter using (!! = 4, !! = 0.25) is shown below. For comparison, we have also shown the output of a standard Gaussian filter with ! = 4.
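A direct, unoptimized MATLAB sketch of this filter (illustrative only; it assumes a grayscale double image I with values in [0, 1] and truncates the window at ±3σ_d around each pixel, which is an arbitrary cutoff):

    sigma_d = 4; sigma_r = 0.25;              % parameters from the text
    r = ceil(3*sigma_d);                      % window radius (arbitrary cutoff)
    [H, W] = size(I);
    Iout = zeros(H, W);
    for x = 1:H
        for y = 1:W
            ks = max(1, x-r):min(H, x+r);     % neighborhood rows
            ls = max(1, y-r):min(W, y+r);     % neighborhood columns
            [K, L] = ndgrid(ks, ls);
            P = I(ks, ls);                    % neighborhood intensities
            w = exp(-((x-K).^2 + (y-L).^2)/(2*sigma_d^2) ...
                    - (I(x,y) - P).^2/(2*sigma_r^2));
            Iout(x,y) = sum(w(:).*P(:)) / sum(w(:));
        end
    end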

[Figure: noisy input image; output of our filter; output of the standard Gaussian filter.]

Based on this information, please answer the following:

(1) How is the above filter different from Gaussian blurring?
(2) Is this filter linear?
(3) Is it shift-invariant?
(4) Is this filter separable?

Problem 2: Reverse Engineering (16 points)

In this problem, we ask you to deduce as much detail as possible about two different “black boxes,” IPA1 and IPA2, by observing input and output images. The following two images are the inputs going into the two unknown image processing algorithms:

• zoneplate.tif • comic_noisy.tif

Part A (8 points): A first image processing algorithm IPA1 is applied to both input images. The resulting images can be found here:

• zoneplate_IPA1.tif • comic_noisy_IPA1.tif

Please describe as clearly as possible what processing steps are contained in IPA1. Please justify your conclusions.

Part B (8 points): A second image processing algorithm IPA2 is applied to both input images. The resulting images can be found here:

• zoneplate_IPA2.tif

• comic_noisy_IPA2.tif

Please describe as clearly as possible what processing steps are contained in IPA2. Please justify your conclusions.

Note: If you use MATLAB, please include any relevant code.
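For instance, one generic way to probe a black box is to compare each output against its input both directly and in the frequency domain (a sketch, assuming you work in MATLAB; the display scalings are arbitrary choices):

    in  = im2double(imread('zoneplate.tif'));
    out = im2double(imread('zoneplate_IPA1.tif'));
    figure, imshow(out - in + 0.5);                          % difference image, re-centered
    figure, imshow(log(1 + abs(fftshift(fft2(in)))),  []);   % log-magnitude spectrum, input
    figure, imshow(log(1 + abs(fftshift(fft2(out)))), []);   % log-magnitude spectrum, output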

Problem 3: Morphological Image Processing (20 points)

In this problem, we investigate continuous-space morphological operations. The original shape is shown in the following figure. White represents “1” or “Foreground” and black represents “0” or “Background.” Assume that values outside of the image boundaries are part of the background.

The original shape is altered, by either a single dilation or a single erosion, to produce the resulting shapes shown below. For each resulting shape, state whether it was the result of dilation or erosion of the original shape, and give a structuring element that produces the resulting shape. Show the origin of each structuring element clearly. The red outline corresponds to the outline of the original shape and is included only for reference. You should not need MATLAB for this problem.

Note: It can be helpful to solve parts B, C and D in order.
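As a reminder (standard continuous-space definitions, not part of the problem statement), with B_z = { b + z : b ∈ B } denoting the translate of the structuring element B by z:

    dilation:  A ⊕ B = { a + b : a ∈ A, b ∈ B }
    erosion:   A ⊖ B = { z : B_z ⊆ A }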

Part A (5 points):

[Figure: resulting shape for Part A, with dimensions marked as 2s, s, s, and 2s; the red outline shows the original shape.]

Part B (5 points):

Part C (5 points):

Part D (5 points):

Problem 4: QR Code Reader (30 points)

Please select either Problem 4 or Problem 5. Do NOT submit solutions to both problems. If solutions to both problems are submitted, only one of them will be graded at random.

A QR code is a 2D barcode that can be used to encode various types of information (text, numbers, etc.). In this question, we want to design a system that can read QR codes under certain conditions.

[Figure: example of a QR code we want to be able to read; it encodes the text “EE368 midterm”.]

Although QR codes exist in various sizes, for the purpose of this problem we only consider codes of size 21x21, like the one shown above. Each cell in the 21x21 square grid can take one of two colors: black or white. We will also use the fact that the distinctive square patterns at the top-left, top-right, and bottom-left corners are fixed.

In this problem we give you 9 noisy 1200x1200 test images, qr_test_1.png, …, qr_test_9.png, each containing one QR code. The position and orientation of the code in the image are randomized, and we want our system to automatically localize and decode it. (Only an in-plane rotation is applied; you do not need to consider affine or perspective transformations.)

[Figure: first test image.]

Part A (6 points): In this part, we first want to coarsely localize the QR code within a test image and crop the image accordingly, to reduce the area taken up by noisy pixels that do not convey any valuable information. Using a method of your choice, propose one way to find the approximate location of the center of the QR code. Crop a 340x340 square region centered on that location (you can safely assume that this is within the boundaries of the original test image). Submit the cropped output for the first test image.

Part B (6 points): In order to get a relatively clean binary edge map, you will first perform a rough cleanup of the noise in the image from part A using a simple operation of your choice, followed by binarization. Justify why the operation you are using is suitable. Note that your process may not remove all the noise in the image, which is fine. Then, extract the binary edge map from that cleaned-up image. We recommend performing Canny edge detection. Submit the cleaned-up image both before and after binarization, as well as the binary edge map, for the first test image.

Part C (6 points): Apply the Hough transform to the edge map and extract peaks (see the sketch after this part for one possible tool chain). Given the grid pattern of the QR codes, describe the (ideal) expected shape of the histogram of the angle values for these peaks. From this, describe a robust way to extract the orientation angle of the grid, and implement it. Explain why this orientation can be described by an angle in [0°, 90°], and submit the value you get for the first test image. Note: at this stage we only care about the orientation of the grid, but the actual QR code may be rotated additionally by 90°, 180° or 270°.
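A minimal sketch of one possible tool chain for parts A–C (illustrative only: the localization heuristic, filter sizes, thresholds, and Hough parameters are all assumptions to be tuned, and the test images are assumed to load as grayscale):

    img = im2double(imread('qr_test_1.png'));
    % Part A: one coarse-localization heuristic - heavily smooth the absolute
    % deviation from the mean gray level and take the strongest response.
    resp = imgaussfilt(abs(img - mean(img(:))), 30);
    [~, idx] = max(resp(:));
    [cy, cx] = ind2sub(size(resp), idx);
    crop = img(cy-169:cy+170, cx-169:cx+170);        % 340x340 crop
    % Part B: rough denoising (here a median filter), then binarization
    % and Canny edge detection.
    clean = medfilt2(crop, [5 5]);
    bw    = imbinarize(clean);
    edges = edge(clean, 'canny');
    % Part C: Hough transform, peak extraction, and the angle histogram.
    [H, theta, rho] = hough(edges);
    peaks = houghpeaks(H, 20, 'Threshold', 0.3*max(H(:)));
    histogram(theta(peaks(:,2)), -90:5:90);          % inspect the peak angles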

Part D (6 points): Now that we know the orientation of the grid of the QR code, we still need to find the location of its outline in the image. Fortunately, the square markers at the corners create strong lines along the outline. This guarantees that, with a proper choice of parameters, we will extract Hough peaks for the four lines corresponding to the outline of the code. By considering the peaks and the orientation found in part C, find the (ρ, θ) values corresponding to these 4 lines. You may need to revisit the parameters used for the Hough transform in part C to ensure that a peak is detected for each of these 4 lines.

We consider two distinct intersecting lines parameterized by (ρ1, θ1) and (ρ2, θ2), and call their intersection I. Write the formula that relates the coordinates (xI, yI) of I to (ρ1, θ1) and (ρ2, θ2). Using this formula, find the locations of the 4 corners in the test image. For the parameterization, we use the same conventions as MATLAB: https://www.mathworks.com/help/images/ref/hough.html#buwgokq-6. For the first test image, submit a version of the cleaned-up non-binarized image (from part B) with the locations of the corners of the QR code overlaid.

Finally, rotate and crop/scale the cleaned-up non-binarized image from part B so that the QR code fills the image. To do that, you can use the file rectify_image.m, which contains a function that will perform this operation for you if you give as input the angle of your (counter-clockwise) rotation and the coordinates of two opposite corners. Submit the output for the first test image.

Part E (6 points): At this stage you should have a 210x210 grayscale image whose edges are aligned with the edges of the QR code. Average the values in each 10x10 tile and compare the tile average to a pre-defined threshold so as to get a 21x21 binary matrix (one possible implementation is sketched at the end of this problem). We use the convention that the black pixels in the QR code should correspond to a 0 and the white ones to a 1. By design, QR codes are robust to some amount of noise, so as long as enough values are correct, it is fine if some of them are not.

Before decoding, we first need to ensure that the orientation is correct. For this purpose, use the square markers and rotate your matrix to put the corner that does not contain a square marker at the bottom right. Make sure that your method is robust to some erroneous values. Submit an image corresponding to your decoded input for the first test image. You can use the following command to turn the 21x21 matrix back into a 210x210 image you can easily include in your write-up:

kron(qr_binary, ones(10))

Now, you can use the java utility we provide to decode the matrix. You need to copy the files qr_code.jar and decode_qr_code.m to your current working directory. To make sure that the functions can be called, please add the following line at some point in your MATLAB script (for example at the beginning):

javaaddpath('qr_code.jar')

The MATLAB wrapper function decode_qr_code we provide takes your 21x21 binary matrix as an input. If the orientation is incorrect or if too many values are flipped, the function may return an error. Otherwise, it will return a string that contains the encoded information. You can make sure this function works on your system by using the test input matrix qr_test_input.mat (it corresponds to the values from the QR code shown above) and running the following commands:

load('qr_test_input')
result = decode_qr_code(qr_test_input)

The variable result will contain the encoded text “EE368 midterm”. Decode the 9 test QR codes in order and concatenate the strings to read the secret message.

Note: Please include any relevant MATLAB code.
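For reference, here is one way the 10x10 tile averaging from part E could be implemented (a sketch; rect is assumed to be the 210x210 rectified image as a double array in [0, 1], and the 0.5 threshold is an assumption):

    % Reshape so that dims 1 and 3 run inside each 10x10 tile, then average.
    tiles = reshape(rect, 10, 21, 10, 21);
    tile_means = squeeze(mean(mean(tiles, 1), 3));   % 21x21 tile averages
    qr_binary = tile_means > 0.5;                    % white -> 1, black -> 0
    imshow(kron(qr_binary, ones(10)));               % blow back up for display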

Problem 5: Image Stitching (30 points)

Please select either Problem 4 or Problem 5. Do NOT submit solutions to both problems. If solutions to both problems are submitted, only one of them will be graded at random.

Panoramic or wide field-of-view images are often obtained by stitching together several smaller images. In this problem, you will implement a relatively simple panoramic stitching algorithm. You are provided with 5 input images, named im_1.png through im_5.png, shown below. You can assume that these images are provided to you in order, from left to right. Your goal is to stitch these images together to generate an output panorama. As a reward for your hard work, if everything is done correctly, you will end up in possession of a rather beautiful panoramic image of the Palace of Fine Arts in San Francisco.

To keep the task simple, we have already rectified the images so that all of them have the same scale and the correct orientation. The images can thus be aligned simply by horizontal and vertical shifts; you can safely assume that you do not need to scale or rotate any image.

This is an open design problem, which means that you have complete freedom in designing the stitching algorithm, with the aim of getting back a good-quality, natural-looking panorama. Your solution will be graded on the quality of the synthesized image, the efficiency of your algorithm, and the elegance of the proposed solution. You will gain points for any processing done to enhance the final image quality and lose points for any visible artifacts in the generated panorama (wrong alignment, ghosting, and visible seams, to mention a few).

Please provide a detailed description highlighting each step of your algorithm, with a brief justification for why you chose to use or not use some technique. Also attach working MATLAB code that will accept the input images and generate a stitched panorama. We also expect you to time your code and report the total execution time (MATLAB functions tic, toc). Most importantly, remember to enjoy the process! If all goes well, you should see something similar to the sample panorama shown below.

Note: Please include any relevant MATLAB code.
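As one concrete starting point for the alignment step (a sketch only, not a prescribed method: the 200-pixel strip width and the use of normalized cross-correlation are assumptions, and rgb2gray assumes RGB inputs; drop it for grayscale):

    a = im2double(rgb2gray(imread('im_1.png')));
    b = im2double(rgb2gray(imread('im_2.png')));
    strip = b(:, 1:200);                  % left strip of the second image
    c = normxcorr2(strip, a);             % correlate the strip against image 1
    [~, idx] = max(c(:));
    [py, px] = ind2sub(size(c), idx);
    row0 = py - size(strip, 1) + 1;       % top-left corner of the strip in a
    col0 = px - size(strip, 2) + 1;
    % So image 2 sits (col0 - 1) pixels right and (row0 - 1) pixels down
    % relative to image 1; repeat for each consecutive pair, then blend.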