High Level Computer Vision SS2015
Exercise 2: Object Identification (released on 8th May, due on 15th May; send your solution to [email protected], adding [hlcv] to the subject line)



High Level Computer Vision SS2015 | Tutorial for Exercise 2

Question 1: Image Representations, Histogram Distances

• normalized_hist.m : ‣ Return the normalized histogram of pixel intensities for a gray image ‣ For a color image, remember to convert it first with rgb2gray ‣ Use hist (a MATLAB built-in function) to count the intensities ‣ Remember the normalization
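The exercise itself is in MATLAB; the logic of normalized_hist.m can be sketched in NumPy like this (function and parameter names mirror the exercise's naming — this is a sketch, not the reference solution):

```python
import numpy as np

def normalized_hist(img_gray, num_bins):
    """Normalized histogram of pixel intensities for a gray image.

    img_gray: 2-D array of intensities in [0, 255].
    Returns a vector of length num_bins that sums to 1.
    """
    counts, _ = np.histogram(img_gray, bins=num_bins, range=(0, 255))
    h = counts.astype(float)
    return h / h.sum()   # the normalization step the slide reminds you of

# tiny synthetic 2x2 "image": two dark pixels, two bright pixels
h = normalized_hist(np.array([[0, 0], [128, 255]]), num_bins=2)
```

For a color image, convert to gray first (the MATLAB code would call rgb2gray; in NumPy a weighted channel mean serves the same purpose).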


Question 1: Image Representations, Histogram Distances
Color Histograms: rgb_hist.m, rg_hist.m

• rgb_hist.m : ‣ Compute the 3D histogram H(R, G, B) = #(pixels with color (R, G, B)) ‣ Normalize H(R, G, B), then return it as a vector of size (num_bins)^3

• rg_hist.m : ‣ Instead of the R, G, B values, use the chromatic representation ‣ Use only r and g to build a histogram of size (num_bins)^2 ‣ Similarly, normalize and return it as a vector
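Both functions can be sketched in NumPy, assuming img is an H×W×3 array with values in [0, 255] (again mirroring the exercise's MATLAB naming rather than reproducing its reference code):

```python
import numpy as np

def rgb_hist(img, num_bins):
    """3-D histogram H(R, G, B), normalized and flattened to length num_bins**3."""
    pixels = img.reshape(-1, 3).astype(float)
    H, _ = np.histogramdd(pixels, bins=num_bins, range=[(0, 255)] * 3)
    H = H.ravel()
    return H / H.sum()

def rg_hist(img, num_bins):
    """2-D histogram over the chromatic coordinates r and g, length num_bins**2."""
    pixels = img.reshape(-1, 3).astype(float)
    s = pixels.sum(axis=1) + 1e-10          # guard against black pixels (R+G+B = 0)
    r = pixels[:, 0] / s
    g = pixels[:, 1] / s
    H, _, _ = np.histogram2d(r, g, bins=num_bins, range=[(0, 1), (0, 1)])
    H = H.ravel()
    return H / H.sum()
```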

High Level Computer Vision - April 29 2015, slide 82

Color Histograms

• Color statistics ‣ Given: the tri-stimulus values R, G, B for each pixel ‣ Compute the 3D histogram H(R, G, B) = #(pixels with color (R, G, B))

[Swain & Ballard, 1991]


High Level Computer Vision - April 29 2015, slide 84

Color

• One component of the 3D color space is intensity ‣ If a color vector is multiplied by a scalar, the intensity changes, but not the color itself. ‣ This means colors can be normalized by the intensity. ‣ Intensity is given by I = R + G + B ‣ "Chromatic representation":

r = R / (R + G + B),  g = G / (R + G + B),  b = B / (R + G + B)

High Level Computer Vision - April 29 2015, slide 85

Color

• Observation: ‣ Since r + g + b = 1, only 2 parameters are necessary ‣ E.g. one can use r and g ‣ and obtain b = 1 − r − g

[Figure: the plane R + G + B = 1 inside the (R, G, B) color cube]
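The redundancy the slide points out is easy to verify in code (plain Python; the pixel values are purely illustrative):

```python
# one RGB pixel (illustrative values)
R, G, B = 120.0, 60.0, 20.0

I = R + G + B                    # intensity I = R + G + B
r, g, b = R / I, G / I, B / I    # chromatic representation

# since r + g + b = 1, b is fully determined by r and g
b_from_rg = 1 - r - g
```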


Question 1: Image Representations, Histogram Distances
Histogram of Gaussian Partial Derivatives: dxdy_hist.m

• First compute the Gaussian partial derivatives in the x and y directions

• How do we determine the numerical ranges for the histogram bins? ‣ As we learnt in Exercise 1, Dx can be obtained by first Gaussian filtering along the y axis, then Gaussian derivative filtering along the x axis ‣ Assuming σ = 6.0 for the Gaussian here, convolving an image with this Gaussian derivative filter in the extreme case gives a maximum value of ~33.5420 ‣ Therefore, distributing the histogram bins within [-34, 34] is a good choice

[Figure: gray image, Dx, Dy]


Question 1: Image Representations, Histogram Distances
Histogram of Gaussian Partial Derivatives: dxdy_hist.m (continued)

• Since we have to assign both Dx and Dy to bins, the histogram has size (num_bins)^2 in total

• Please still remember to do the normalization, so that the histogram sums to 1
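Putting the two slides together, dxdy_hist can be sketched in NumPy: separable Gaussian-derivative filtering, then a joint 2-D histogram over (Dx, Dy) clipped to the suggested [-34, 34] range. This is a sketch under the stated assumptions, not the reference MATLAB implementation:

```python
import numpy as np

def gauss_kernels(sigma):
    """1-D Gaussian and its derivative, sampled on [-3*sigma, 3*sigma]."""
    radius = 3 * int(np.ceil(sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    dg = -x / sigma**2 * g           # derivative of the normalized Gaussian
    return g, dg

def _sep_filter(img, kx, ky):
    """Separable filtering: ky along columns (y), then kx along rows (x)."""
    tmp = np.apply_along_axis(lambda c: np.convolve(c, ky, mode='same'), 0, img)
    return np.apply_along_axis(lambda r: np.convolve(r, kx, mode='same'), 1, tmp)

def dxdy_hist(img_gray, num_bins, sigma=6.0, lim=34.0):
    g, dg = gauss_kernels(sigma)
    Dx = _sep_filter(img_gray, dg, g)    # smooth in y, differentiate in x
    Dy = _sep_filter(img_gray, g, dg)    # smooth in x, differentiate in y
    H, _, _ = np.histogram2d(Dx.ravel(), Dy.ravel(),
                             bins=num_bins, range=[(-lim, lim), (-lim, lim)])
    H = H.ravel()                        # vector of length num_bins**2
    return H / H.sum()                   # normalize so the histogram sums to 1
```

Note that the image must be larger than the kernel (with σ = 6.0 the kernel spans 37 samples); a small σ is enough to see the mechanics on a toy ramp image.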


Question 1: Image Representations, Histogram Distances
Histogram Distances: dist_intersect.m, dist_l2.m, dist_chi2.m

• dist_intersect.m : ‣ the common part between histograms

• dist_l2.m : ‣ Euclidean distance

• dist_chi2.m : ‣ Chi-square

• Please check pages 88, 89 and 90 of the lecture slides (CV-SS15-04-29-filtering-instance) for their properties

High Level Computer Vision - April 29 2015, slide 88

Histogram Comparison

• Comparison measures ‣ Intersection

• Motivation ‣ Measures the common part of both histograms ‣ Range: [0, 1] ‣ For unnormalized histograms, use the formula given on the slide

High Level Computer Vision - April 29 2015, slide 89

Histogram Comparison

• Comparison measures ‣ Euclidean distance

• Motivation ‣ Focuses on the differences between the histograms ‣ Range: [0, ∞) ‣ All cells are weighted equally ‣ Not very discriminant

High Level Computer Vision - April 29 2015, slide 90

Histogram Comparison

• Comparison measures ‣ Chi-square

• Motivation ‣ Statistical background: tests whether two distributions are different, and makes it possible to compute a significance score ‣ Range: [0, ∞) ‣ Cells are not weighted equally! - therefore more discriminant - may have problems with outliers (therefore assume that each cell contains at least a minimum number of samples)
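The three distances can be sketched in NumPy for normalized histograms q and v. This follows the common definitions; the exercise's exact conventions (e.g. whether dist_l2 returns the squared distance, or how intersection is turned into a distance) may differ:

```python
import numpy as np

def dist_intersect(q, v):
    """Intersection similarity is sum(min(q_i, v_i)); as a distance: 1 - similarity."""
    return 1.0 - np.minimum(q, v).sum()

def dist_l2(q, v):
    """Euclidean distance; weights all cells equally."""
    return float(np.sqrt(((q - v) ** 2).sum()))

def dist_chi2(q, v):
    """Chi-square distance; the epsilon guards nearly-empty cells (the outlier issue above)."""
    return float((((q - v) ** 2) / (q + v + 1e-10)).sum())
```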


Question 2: Object Identification

• find_best_match.m : query-by-example scenario ‣ Note that the model and query folders are arranged such that the ground-truth match of the i-th query image is the i-th model image

• Use the histogram and distance functions from Question 1 to find the matches of the query images

• Rank the similarity of all model images w.r.t. each query image ‣ i.e., compute a distance matrix between all pairs of model and query images

[Figure: distance matrix, model images vs. query images]
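The matching loop can be sketched as follows (names follow the exercise; dist_fn is any of the Question 1 distances). With the data arrangement above, the recognition rate is simply the fraction of queries whose nearest model has the same index:

```python
import numpy as np

def find_best_match(model_hists, query_hists, dist_fn):
    """Returns (best_match, D): per-query index of the nearest model image,
    plus the full distance matrix D[i, j] = dist(query i, model j)."""
    D = np.array([[dist_fn(q, m) for m in model_hists] for q in query_hists])
    return D.argmin(axis=1), D

# toy example: two "histograms" per set; ground truth is the identity pairing
models  = [np.array([0.9, 0.1]), np.array([0.1, 0.9])]
queries = [np.array([0.8, 0.2]), np.array([0.2, 0.8])]
l2 = lambda a, b: float(np.sqrt(((a - b) ** 2).sum()))
best, D = find_best_match(models, queries, l2)
recognition_rate = float(np.mean(best == np.arange(len(queries))))
```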


Question 3: Precision and Recall

[http://en.wikipedia.org/wiki/Precision_and_recall]
[Some figures are from Prof. William H. Press, The University of Texas at Austin, CS 395T, Spring 2008]

A (binary) classifier classifies data points as + or −

If we also know the true classification, the performance of the classifier is a 2×2 contingency table, in this application usually called a confusion matrix.

[Confusion-matrix figure: TP and TN are good; FP is a Type I error; FN is a Type II error]

As we saw, this kind of table has many other uses: treatment vs. outcome, clinical test vs. diagnosis, etc.


Most classifiers have a “knob” or threshold that you can adjust: how certain do they have to be before they classify a “+”? To get more TPs, you have to let in some FPs!

Notice there is just one free parameter; think of it as TP, since

- FP(TP) = [given by the algorithm]
- TP + FN = P (the fixed number of actual positives, a column marginal)
- FP + TN = N (the fixed number of actual negatives, a column marginal)

So all scalar measures of performance are functions of one free parameter (i.e., curves).

And the points on any such curve are in 1-to-1 correspondence with those on any other such curve.

If you ranked some classifiers by how good they are, you might get different rankings at different points on the scale.

On the other hand, one classifier might dominate another at all points on the scale.

[Figure (cartoon, not literal): four classifier confusion matrices (TP, FP / FN, TN) at thresholds ranging from more conservative to more liberal; the threshold sets how certain the classifier must be before it outputs a “+”]

Question 3: Precision and Recall (continued)

[Confusion-matrix figures, highlighting the row and column used by each measure]

- true positive rate (TPR) ≡ sensitivity ≡ recall = TP / (TP + FN)
- positive predictive value (PPV) ≡ precision = TP / (TP + FP)
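As a worked check of the two definitions (plain Python; the counts are illustrative):

```python
def precision_recall(tp, fp, fn):
    """precision = PPV = TP / (TP + FP); recall = TPR = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# e.g. 8 true positives, 2 false positives, 2 false negatives
prec, reca = precision_recall(8, 2, 2)
```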

Precision-recall curve

Precision-Recall curves overcome this issue by comparing TP with FN and FP.

Continuing our toy example (P = 100 actual positives, N = 9900 actual negatives; note that P and N now enter):

prec = tpr*100./(tpr*100+fpr*9900);
prec(1) = prec(2); % fix up 0/0
reca = tpr;
plot(reca,prec)

[Plot: precision vs. recall for the toy example; precision is never better than ~0.13; axis annotation: 0.01]

By the way, this “cliff” shape is what the ROC convexity constraint looks like in a precision-recall plot. It’s not very intuitive.

plot_rpc.m : use 1) the distance matrix between model and query images and 2) different thresholds to plot the precision/recall curve.
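The threshold sweep inside plot_rpc.m can be sketched as follows (NumPy; rpc_curve is a hypothetical name, and the plotting call is left out). A query/model pair counts as a predicted match when its distance falls below the threshold, and with the Question 2 arrangement the true matches lie on the diagonal of the distance matrix:

```python
import numpy as np

def rpc_curve(D, thresholds):
    """D[i, j] = distance(query i, model j); ground truth: query i matches model i.
    Returns (recalls, precisions), one point per threshold."""
    gt = np.eye(D.shape[0], dtype=bool)        # true pairs on the diagonal
    recalls, precisions = [], []
    for t in thresholds:
        pos = D < t                            # predicted matches at this threshold
        tp = np.count_nonzero(pos & gt)
        fp = np.count_nonzero(pos & ~gt)
        fn = np.count_nonzero(~pos & gt)
        precisions.append(tp / max(tp + fp, 1))
        recalls.append(tp / max(tp + fn, 1))
    return np.array(recalls), np.array(precisions)

# toy 2x2 distance matrix: correct matches are clearly closer
D = np.array([[0.1, 0.9],
              [0.8, 0.2]])
reca, prec = rpc_curve(D, thresholds=[0.5, 1.0])
```

Plotting prec against reca for a dense grid of thresholds gives the curve plot_rpc.m is asked to produce.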