COS 429 PS5: Finding Nemo


Princeton University

COS429 Computer Vision

Problem Set 5: Finding Nemo

In this assignment, we will study a full pipeline of a sliding window approach for 2D object detection. Specifically, we will implement an Exemplar-SVM algorithm. We will use all the steps described in [4] except the calibration step in Section 3.1 (i.e., for simplicity, we assume that no calibration between different exemplars is needed, which is not true in practice). Your task is to train an object detector for Nemo from a set of training images, and then run the detector to find Nemo in a set of test images that are different from the training set.

Figure 1: Exemplar-SVM for detecting Nemo.

Problem 1 Describe Steps

Describe the main steps for both training and testing an Exemplar-SVM object detector.

Problem 2 Pyramid

Why do we need to compute an image pyramid in the object detection pipeline?

Problem 3 Non-Maximum Suppression

Why do we need to do Non-Maximum Suppression (NMS) in the object detection pipeline?

Exemplar-SVM

• Still a rigid template, but train a separate SVM for each positive instance

• Each category can have exemplars with different sizes and aspect ratios

Problem 1 Describe Steps

• Training: to train one exemplar SVM, what do we do?
• Testing: for each image, what do we do?

In this assignment we use 6 images for training and the others for testing.

Problem 2 Why Use an Image Pyramid?

When we compute HOG for each image, we compute it at several scales of the image.

• What problem are we trying to solve here?
• Is there any alternative solution? Why is it not as good as this one?
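As a rough illustration of what "several scales" can mean, the sketch below lists the resize factors an image pyramid might use. It is written in Python rather than the assignment's MATLAB, and the function name `pyramid_scales`, the `levels_per_octave` parameter, and the stopping rule are all assumptions made for this example, not the provided code.

```python
def pyramid_scales(image_size, min_size, levels_per_octave=10):
    """Return resize factors for an image pyramid (hypothetical helper).

    image_size: (height, width) of the original image in pixels.
    min_size: stop once the shorter image side would drop below this many pixels.
    levels_per_octave: number of levels between each halving of the image size.
    """
    height, width = image_size
    step = 2 ** (-1.0 / levels_per_octave)  # shrink factor between adjacent levels
    scales = []
    scale = 1.0
    while min(height, width) * scale >= min_size:
        scales.append(scale)
        scale *= step
    return scales


if __name__ == "__main__":
    # A fixed-size template applied at every scale can match objects of many sizes.
    for s in pyramid_scales((480, 640), min_size=200, levels_per_octave=2):
        print(round(s, 3))
```

A fixed-size template evaluated at each of these scales effectively searches for the object at several sizes, which is one way to think about the question on this slide.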

Problem 3 Why Non-Maximum Suppression?

Look at the code and find where we do non-maximum suppression (the call to the function nmsMe).

• What does it do?
• Why do we do this?
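For intuition, here is a minimal Python sketch of one common form of non-maximum suppression: greedy suppression based on intersection over union. The behavior of the provided nmsMe function is not shown in these slides, so this is not a description of it; the names `box_iou` and `greedy_nms`, the box layout, and the 0.5 overlap threshold are assumptions.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, overlap_threshold=0.5):
    """Keep the highest-scoring box in each group of heavily overlapping boxes.

    boxes: list of (x1, y1, x2, y2, score) tuples.
    overlap_threshold: assumed value; a box is dropped if its IoU with an
    already kept box exceeds it.
    """
    ranked = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    for box in ranked:
        if all(box_iou(box, k) <= overlap_threshold for k in kept):
            kept.append(box)
    return kept


if __name__ == "__main__":
    detections = [
        (10, 10, 50, 50, 0.9),      # strongest response on an object
        (12, 12, 52, 52, 0.8),      # near-duplicate of the first box
        (100, 100, 140, 140, 0.7),  # a separate detection
    ]
    print(greedy_nms(detections))
```

The near-duplicate box is discarded because a sliding window fires on many slightly shifted windows around the same object, and only one of them is useful in the final result.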

Problem 4 Visualizing Exemplars

• Visualize the exemplar image: Model.posRgb
• Visualize the exemplar HoG feature: showHoG(Model.posfeature)
• Visualize the trained SVM weights: showHoG(Model.svm.w)

(a) Image of a positive exemplar (b) HOG feature of the exemplar (c) Trained SVM weights

Figure 2: Visualization of a positive training exemplar.

Problem 4 Visualizing Exemplars

Download the code and data we provided and open it in MATLAB. The code comes with three pre-trained exemplar detectors. For each detector, the pre-trained model contains the image area that we used for training the exemplar (e.g. Figure 2(a)), the HOG feature vector computed from the image area (e.g. Figure 2(b)), and the trained linear SVM weights corresponding to each HOG feature dimension (e.g. Figure 2(c)).

Your task is to visualize these three components for each of the three pre-trained models to generate a figure that looks like Figure 2. Write a script to load each pre-trained model, visualize the HOG feature by calling the function “showHOG” that we provided, visualize the SVM weights by calling “showHOG” as well, save the three images for the three visualizations, and include all of them in the report PDF file as the answer to this question. Then, look at the visualizations and try to understand the correspondences among them. Describe the patterns you see in the HOG and SVM weight visualizations.

Problem 5 Testing Procedure

Use the pre-trained models to implement the testing procedure for the Exemplar-SVM detector. We have provided essentially all the functions you need to call. Your task is to put them together to slide one exemplar detector over the image to find Nemo. Write all your code inside the function “bbdet = testfeaturePym(imageRgb,Model)”. Look at the comments in the code, which list the major steps.

Problem 6 Draw the Detection Result

Use the function “draw 2d” that we provide to draw your detection results, and include the figures in your report.

Problem 7 Training Procedure

We have provided almost all the code for training an Exemplar-SVM detector in the file “TrainingMain hog.m”, except the hard negative mining step. Your task is as follows:

1. Complete the hard negative mining step by reusing the code you have written for the testing procedure.

• Try to understand the correspondences among the three visualizations.

• Describe the patterns in the HOG and SVM weight visualizations.

Problem 5 Testing Procedure

• Goal: complete the function bbdet = testfeaturePym(imageRgb,Model)
• What it does: slide one exemplar detector over the image to find Nemo
• Paste your code in the report.

Problem 5 Testing Procedure

Basic steps (listed in the code comments):
1. Compute the feature pyramid.
2. For each level of the feature pyramid:
– Do convolution to get the detection score for each window
– Convert the score to a bounding box in the image domain
3. Do non-maximum suppression.

Convolution to Get the Detection Score

• Dot product between the trained SVM weights and the feature for each window
• score = fconv(one level of feature, {weight}, 1, 1);
• The score is stored at the top-left corner of the window in the feature domain.
• If the image at this level is too small, we do not do convolution on this level or any smaller level.
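To make the "dot product for each window" idea concrete, the following Python sketch computes a score map by brute force. It only illustrates the arithmetic and is not the provided fconv routine; the function name `score_map`, the nested-list data layout, and the omission of any SVM bias term are assumptions.

```python
def score_map(features, weights):
    """Slide a weight template over one feature level (hypothetical helper).

    features: rows x cols grid, each cell a list of feature values.
    weights:  template_rows x template_cols grid with the same cell length.
    Returns a grid where entry [r][c] is the dot product of the template with
    the window whose top-left cell is (r, c). Returns an empty list when the
    level is smaller than the template.
    """
    f_rows, f_cols = len(features), len(features[0])
    w_rows, w_cols = len(weights), len(weights[0])
    if f_rows < w_rows or f_cols < w_cols:
        return []  # level too small for this template
    scores = []
    for r in range(f_rows - w_rows + 1):
        row_scores = []
        for c in range(f_cols - w_cols + 1):
            total = 0.0
            for dr in range(w_rows):
                for dc in range(w_cols):
                    cell = features[r + dr][c + dc]
                    w = weights[dr][dc]
                    total += sum(f * wi for f, wi in zip(cell, w))
            row_scores.append(total)
        scores.append(row_scores)
    return scores


if __name__ == "__main__":
    level = [[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]]
    template = [[[1], [1]], [[1], [1]]]
    print(score_map(level, template))
```

Each output entry sits at the top-left cell of its window, which matches the convention stated on the slide.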

Convert the Score to a Bounding Box in the Image Domain

• Use:
– the scale of this level
– the HOG bin size, 8x8 pixels
– the size of this exemplar
• Example on the whiteboard
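Since the whiteboard example is not part of this transcript, here is a small Python sketch of how the three quantities above can combine. It assumes 0-based cell indices, a scale defined as (resized size / original size), and no padding or border offset in the HOG computation; the provided MATLAB code may use different conventions, and the name `window_to_box` is hypothetical.

```python
def window_to_box(row, col, scale, template_cells, sbin=8):
    """Map a window in the feature domain to a box in the original image.

    row, col: top-left HOG cell of the window (0-based, assumed convention).
    scale: resize factor of this pyramid level (resized / original).
    template_cells: (rows, cols) size of the exemplar template in HOG cells.
    sbin: HOG bin size in pixels (8 in this assignment).
    Returns (x1, y1, x2, y2) in original-image pixels.
    """
    t_rows, t_cols = template_cells
    x1 = col * sbin / scale
    y1 = row * sbin / scale
    x2 = (col + t_cols) * sbin / scale
    y2 = (row + t_rows) * sbin / scale
    return (x1, y1, x2, y2)


if __name__ == "__main__":
    # A 4x6-cell template found at cell (2, 3) on a half-resolution level.
    print(window_to_box(2, 3, 0.5, (4, 6)))
```

Multiplying by the bin size converts cells to pixels at that level, and dividing by the scale maps those pixels back to the original image.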

 

Problem 6 Draw Boxes

• Paste the results in the report
• Can you find Nemo? When does it fail?

Problem 7 Training Procedure

• After you finish the function testfeaturePym, congratulations! You can train your own detectors.
• Run rumMe.m with step.train = 1 to train 6 exemplars and test on the images
• What is the difference in the results?

[Extra Credit] Detecting Other Objects

• Find an interesting object to detect
• Get images
• Get ground-truth boxes by labeling them yourself

[Extra Credit] Evaluation: Precision & Recall

Precision = TP / (TP + FP)   Recall = TP / (TP + FN)

                          Truth: Object     Truth: No Object
Prediction: Object        True Positive     False Positive
Prediction: No Object     False Negative    True Negative

Intersection over Union = area(Truth ∩ Prediction) / area(Truth ∪ Prediction)
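The definitions on this slide can be sketched in a few lines of Python. This is only an illustration: the matching rule (each ground-truth box may be matched once, by the detection with the highest overlap) and the 0.5 IoU threshold are assumptions borrowed from common detection benchmarks, and the function names are hypothetical.

```python
def intersection_over_union(truth, prediction):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(truth[2], prediction[2]) - max(truth[0], prediction[0]))
    inter_h = max(0.0, min(truth[3], prediction[3]) - max(truth[1], prediction[1]))
    inter = inter_w * inter_h
    area_t = (truth[2] - truth[0]) * (truth[3] - truth[1])
    area_p = (prediction[2] - prediction[0]) * (prediction[3] - prediction[1])
    union = area_t + area_p - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, ground_truth, iou_threshold=0.5):
    """Count TP, FP, FN and return (precision, recall).

    A detection is a true positive if it overlaps a not-yet-matched
    ground-truth box with IoU >= iou_threshold (assumed rule).
    """
    matched = set()
    tp = 0
    for det in detections:
        best_iou, best_idx = 0.0, None
        for i, gt in enumerate(ground_truth):
            if i in matched:
                continue
            overlap = intersection_over_union(gt, det)
            if overlap > best_iou:
                best_iou, best_idx = overlap, i
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp += 1
    fp = len(detections) - tp      # detections with no matching object
    fn = len(ground_truth) - tp    # objects that were never detected
    precision = tp / (tp + fp) if detections else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    return precision, recall


if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
    found = [(1, 1, 11, 11), (50, 50, 60, 60), (100, 100, 110, 110)]
    print(precision_recall(found, truth))
```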

[Extra Credit] Evaluation

• X: recall = true positives / total ground truth
• Y: precision = true positives / total detections
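One way to trace such a curve is to sort the detections by score and recompute both quantities after each one, as in the Python sketch below. It assumes every detection has already been labeled as a true or false positive; the function name `precision_recall_curve` and the input layout are hypothetical.

```python
def precision_recall_curve(scored_detections, num_ground_truth):
    """Return a list of (recall, precision) points.

    scored_detections: list of (score, is_true_positive) pairs (assumed input).
    num_ground_truth: total number of ground-truth objects.
    """
    ranked = sorted(scored_detections, key=lambda d: d[0], reverse=True)
    curve = []
    true_positives = 0
    for rank, (_score, is_true_positive) in enumerate(ranked, start=1):
        if is_true_positive:
            true_positives += 1
        recall = true_positives / num_ground_truth   # X axis
        precision = true_positives / rank            # Y axis
        curve.append((recall, precision))
    return curve


if __name__ == "__main__":
    labeled = [(0.9, True), (0.8, False), (0.7, True)]
    for point in precision_recall_curve(labeled, num_ground_truth=2):
        print(point)
```

Each point corresponds to keeping only the detections above one score threshold, so moving along the curve trades precision against recall.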

Calibration

• Idea: make different exemplar detectors' scores compatible.
• Good exemplar vs. bad exemplar

1. On the validation set: sort detections by score and classify each as a true or false detection.
[Figure: detections ranked by detection score, with examples of a good positive (similar to the exemplar) and a negative]

2. Sigmoid function
[Plot: target score (1 to -1) versus detection score]

• All detectors' scores range from 0 to 1.
• The calibrated score reflects each detector's precision.
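The two steps can be sketched in Python as a Platt-style sigmoid fit. This is an illustration under assumptions, not the procedure from [4] or from the course code: labels are encoded as 1 (true detection) and 0 (false detection), the parameters are fitted by plain gradient descent on the log loss, and the names `calibrate` and `fit_sigmoid` are hypothetical.

```python
import math


def calibrate(score, a, b):
    """Map a raw detection score to (0, 1) with a sigmoid."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


def fit_sigmoid(scores, labels, steps=2000, learning_rate=0.1):
    """Fit sigmoid parameters (a, b) for one detector (assumed procedure).

    scores: raw detection scores on a validation set.
    labels: 1 for a true detection, 0 for a false detection.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            error = calibrate(s, a, b) - y  # derivative of log loss w.r.t. (a*s + b)
            grad_a += error * s
            grad_b += error
        a -= learning_rate * grad_a / n
        b -= learning_rate * grad_b / n
    return a, b


if __name__ == "__main__":
    # Made-up validation scores for one exemplar detector.
    val_scores = [-1.2, -0.9, -0.7, -0.4, -0.1, 0.2]
    val_labels = [0, 0, 1, 0, 1, 1]
    a, b = fit_sigmoid(val_scores, val_labels)
    for s in val_scores:
        print(s, round(calibrate(s, a, b), 3))
```

Fitting one such sigmoid per exemplar puts every detector's output on the same 0 to 1 range, which is what makes scores from different exemplars comparable.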

Extra Credit Label Transfer

Hints: see the paper and code at http://www.cs.cmu.edu/~tmalisie/projects/iccv11/

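iHoG

• Webpage: http://web.mit.edu/vondrick/ihog/
• They have an online demo and code for download

>> feat = features(im, 8);    % compute HOG features with a bin size of 8
>> ihog = invertHOG(feat);    % invert the HOG features back to an image
>> imagesc(ihog);             % display the inverted image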