Visual Saliency: Learning to Detect Salient Objects
![Page 1: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/1.jpg)
Visual Attention: Detecting Saliency on Images
Vicente Ordonez
Department of Computer Science
State University of New York
Stony Brook, NY 11790
![Page 2: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/2.jpg)
I will be working mainly on the following paper:
• Learning to Detect a Salient Object. T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum (Xi'an Jiaotong University and Microsoft Research Asia). CVPR 2007.
http://research.microsoft.com/en-us/um/people/jiansun/papers/SalientDetection_CVPR07.pdf
![Page 3: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/3.jpg)
What is Saliency? What is Visual Attention?
“Everyone knows what attention is...”—William James, 1890
![Page 4: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/4.jpg)
This is a problem of…
• Arbitrary object detection?
• Background / Foreground segmentation?
• Modeling Visual Attention?
![Page 5: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/5.jpg)
The Method
• Features:
– Multiscale contrast (Done!)
– Center-surround histogram (Done!)
– Color spatial distribution (Done!)
• Supervised learning using Conditional Random Fields to determine the parameters that combine the features obtained above. (Done!) [I will use a labeled dataset of 5000 images provided by Microsoft Research Asia!]
![Page 6: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/6.jpg)
Multiscale Contrast Function
• Generate the Gaussian pyramid for the input image.
– For each level in the pyramid:
  • Do Gaussian blurring
  • Do resampling
I'm using a six-level Gaussian pyramid for each RGB channel.
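A minimal sketch of the blur-and-resample loop for one channel, assuming NumPy and a 5-tap binomial kernel as the Gaussian approximation. The kernel, the edge handling, and the function names `blur` and `gaussian_pyramid` are illustrative assumptions; the slides do not say which ones were used.

```python
import numpy as np

# 5-tap binomial kernel, a common Gaussian approximation (an assumption here).
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(channel):
    """Separable blur of a 2-D array, replicating edge pixels at the borders."""
    h, w = channel.shape
    padded = np.pad(channel, 2, mode="edge")
    # Horizontal pass over the padded rows, then vertical pass.
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))


def gaussian_pyramid(channel, levels=6):
    """Return `levels` arrays; level 0 is the input, each next level is half size."""
    pyramid = [np.asarray(channel, dtype=float)]
    for _ in range(levels - 1):
        smoothed = blur(pyramid[-1])
        pyramid.append(smoothed[::2, ::2])  # keep every second row and column
    return pyramid
```

For an RGB image, the same function would be called once per channel, for example `[gaussian_pyramid(img[:, :, c]) for c in range(3)]`.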
![Page 7: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/7.jpg)
What a Gaussian pyramid looks like
Figure from David Forsyth
![Page 8: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/8.jpg)
Multiscale Contrast Function
• Generate contrast maps for each level of the pyramid.
• Sum all of the results to produce the final multiscale contrast map.
– The two steps mentioned above are described in this formula:
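The formula itself did not survive in this transcript. As I read the cited paper, the contrast at a pixel is the sum, over pyramid levels, of squared differences between that pixel and its neighbors in a small window. The sketch below follows that reading for a single channel; the window radius, the nearest-neighbor upsampling, and the function names are assumptions made for illustration.

```python
import numpy as np


def local_contrast(level, radius=1):
    """Sum of squared differences between each pixel and its window neighbors."""
    level = np.asarray(level, dtype=float)
    h, w = level.shape
    padded = np.pad(level, radius, mode="edge")
    out = np.zeros((h, w))
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += (level - padded[dy:dy + h, dx:dx + w]) ** 2
    return out


def resize_nearest(img, shape):
    """Nearest-neighbor resize of a 2-D array to `shape` = (rows, cols)."""
    rows = np.arange(shape[0]) * img.shape[0] // shape[0]
    cols = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[np.ix_(rows, cols)]


def multiscale_contrast(pyramid):
    """Sum per-level contrast maps at full resolution, then scale to [0, 1]."""
    shape = pyramid[0].shape
    total = sum(resize_nearest(local_contrast(lvl), shape) for lvl in pyramid)
    peak = total.max()
    return total / peak if peak > 0 else total
```

Here `pyramid` is any list of 2-D arrays ordered from finest to coarsest, such as the levels of a Gaussian pyramid.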
![Page 9: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/9.jpg)
Input image
![Page 10: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/10.jpg)
Contrast maps
![Page 11: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/11.jpg)
Contrast maps
Panels: original image; contrast maps at levels 1, 4, and 6.
![Page 12: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/12.jpg)
Multiscale Contrast Map Output
![Page 13: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/13.jpg)
Center Surround Histogram Feature
For each pixel in the image
  For each possible rectangle with a reasonable size and aspect ratio
    Create a surrounding rectangle and calculate the histogram of the rectangle and the surrounding area.
  Pick and record the rectangle that maximizes the Chi-Square distance between the two histograms calculated above, and also record the Chi-Square distance.
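The inner step of the loop above, scoring one candidate rectangle against its surround, could be sketched as follows. This assumes the image has already been quantized into integer histogram bins, and it defines the surround by a fixed pixel `margin` around the rectangle, which is a simplification introduced here rather than the rule used in the paper. Both histograms are normalized so that the two regions need not have equal area.

```python
import numpy as np


def chi_square(h1, h2):
    """0.5 * sum((a - b)^2 / (a + b)) over bins, skipping bins empty in both."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    denom = h1 + h2
    mask = denom > 0
    return 0.5 * np.sum((h1[mask] - h2[mask]) ** 2 / denom[mask])


def region_histogram(bins, top, left, bottom, right, n_bins):
    """Histogram of the quantized values in bins[top:bottom, left:right]."""
    return np.bincount(bins[top:bottom, left:right].ravel(), minlength=n_bins)


def center_surround_distance(bins, top, left, bottom, right, margin, n_bins):
    """Chi-square distance between a rectangle and the ring of pixels around it."""
    h, w = bins.shape
    o_top, o_left = max(top - margin, 0), max(left - margin, 0)
    o_bottom, o_right = min(bottom + margin, h), min(right + margin, w)
    center = region_histogram(bins, top, left, bottom, right, n_bins)
    outer = region_histogram(bins, o_top, o_left, o_bottom, o_right, n_bins)
    surround = outer - center  # ring = enclosing rectangle minus the center
    center = center / max(center.sum(), 1)
    surround = surround / max(surround.sum(), 1)
    return chi_square(center, surround)
```

The full feature would call `center_surround_distance` for every candidate rectangle around every pixel and keep the maximum, which is why the brute-force version is slow.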
![Page 14: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/14.jpg)
Center Surround Histogram Feature
![Page 15: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/15.jpg)
Center Surround Histogram Feature
• The algorithm as described before is computationally expensive…
• To make it practical, I use a technique called the Integral Histogram, which allows fast calculation of the histogram of any rectangular region of an image.
• The algorithm was introduced in:
– "Integral Histogram: A Fast Way to Extract Histograms in Cartesian Spaces" by Fatih Porikli, Mitsubishi Electric Research Lab, in CVPR 2005.
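A minimal sketch of the integral histogram idea, assuming the image has been quantized into integer bin indices: one cumulative-sum pass builds a table from which the histogram of any rectangle is obtained with four lookups. The function names are illustrative and not taken from the cited paper.

```python
import numpy as np


def integral_histogram(bins, n_bins):
    """Build ih where ih[y, x, b] counts pixels with bin b in bins[:y, :x]."""
    h, w = bins.shape
    one_hot = np.zeros((h, w, n_bins), dtype=np.int64)
    one_hot[np.arange(h)[:, None], np.arange(w)[None, :], bins] = 1
    ih = np.zeros((h + 1, w + 1, n_bins), dtype=np.int64)
    ih[1:, 1:] = one_hot.cumsum(axis=0).cumsum(axis=1)
    return ih


def rect_histogram(ih, top, left, bottom, right):
    """Histogram of bins[top:bottom, left:right] using four table lookups."""
    return ih[bottom, right] - ih[top, right] - ih[bottom, left] + ih[top, left]
```

After the table is built, each rectangle query costs a constant number of operations per bin, regardless of the rectangle's size.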
![Page 16: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/16.jpg)
Center Surround Histogram Feature
• Use the map of Chi-Square distances and the map of the most salient rectangle per pixel to generate the Center Surround Histogram feature using the following formula:
![Page 17: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/17.jpg)
Center Surround Histogram
Results Using my Implementation (15.2 sec, size = 245x384)
Results Reported in the Paper
![Page 18: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/18.jpg)
Center Surround Histogram
Results Reported in the Paper
Results Using my Implementation (13.6 sec, size = 247x346)
![Page 19: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/19.jpg)
Center Surround Histogram
Results Using my Implementation (10.2 sec, size = 248x277)
![Page 20: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/20.jpg)
More Results
![Page 21: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/21.jpg)
More Results
![Page 22: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/22.jpg)
More Results
![Page 23: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/23.jpg)
More Results
![Page 24: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/24.jpg)
More Results
![Page 25: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/25.jpg)
More Results
![Page 26: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/26.jpg)
More Results
![Page 27: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/27.jpg)
More Results
![Page 28: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/28.jpg)
More Results
![Page 29: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/29.jpg)
More Results
![Page 30: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/30.jpg)
More Results
![Page 31: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/31.jpg)
Color Spatial Distribution
![Page 32: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/32.jpg)
Color Spatial Distribution
• Make an initial clustering of the colors in the image using k-means.
• Further refine the clusters using a Gaussian Mixture Model, whose parameters are estimated with the EM algorithm.
• I am using 5 clusters (5 colors) per image. The results look similar to those presented in the paper, with an execution time of around 17 seconds per image.
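The two clustering steps above can be sketched in plain NumPy as a k-means initialization followed by EM. This is a simplified illustration, not the implementation behind the slides: it assumes diagonal covariances, a fixed number of iterations, and hypothetical function names `kmeans` and `fit_gmm`.

```python
import numpy as np


def kmeans(points, k, iters=10, seed=0):
    """Plain k-means; returns the cluster centers and a label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return centers, labels


def fit_gmm(points, k, em_iters=20, seed=0):
    """Diagonal-covariance GMM fitted with EM, initialized from k-means."""
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    means, labels = kmeans(points, k, seed=seed)
    weights = np.bincount(labels, minlength=k) / n
    variances = np.ones((k, d)) * points.var(axis=0) + 1e-6
    for _ in range(em_iters):
        # E-step: responsibilities from log(weight * Gaussian density).
        diff2 = (points[:, None, :] - means[None, :, :]) ** 2
        log_p = (np.log(weights + 1e-12)
                 - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                 - 0.5 * np.sum(diff2 / variances, axis=2))
        log_p -= log_p.max(axis=1, keepdims=True)  # for numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        mass = resp.sum(axis=0) + 1e-12
        weights = mass / n
        means = (resp.T @ points) / mass[:, None]
        diff2 = (points[:, None, :] - means[None, :, :]) ** 2
        variances = np.einsum("nk,nkd->kd", resp, diff2) / mass[:, None] + 1e-6
    return weights, means, variances, resp
```

For an image, `points` would be the pixel colors reshaped to one row per pixel, and `resp` (from the final E-step) gives each pixel's soft assignment to the k color clusters.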
![Page 33: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/33.jpg)
Color Spatial Distribution
• Calculate the variance of the horizontal positions of the pixels in each cluster, and then the same for the vertical positions. Sum the two variances and use this value to give more weight to clusters with less spatial variance.
• Penalize clusters that have the majority of their pixels away from the center of the image.
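The two weighting steps above could be sketched as follows, assuming soft cluster assignments per pixel are already available as an array `resp[c, y, x]`. Normalizing both the spatial variance and the mean distance to the image center to [0, 1] across clusters is my reading of the approach, and the function names are illustrative.

```python
import numpy as np


def normalize01(v):
    """Rescale a vector to [0, 1]; a constant vector maps to zeros."""
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)


def color_spatial_distribution(resp):
    """resp[c, y, x]: soft assignment of pixel (y, x) to color cluster c."""
    k, h, w = resp.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    mass = resp.sum(axis=(1, 2)) + 1e-12
    # Weighted mean and variance of the pixel positions in each cluster.
    mean_x = (resp * xs).sum(axis=(1, 2)) / mass
    mean_y = (resp * ys).sum(axis=(1, 2)) / mass
    var_x = (resp * (xs - mean_x[:, None, None]) ** 2).sum(axis=(1, 2)) / mass
    var_y = (resp * (ys - mean_y[:, None, None]) ** 2).sum(axis=(1, 2)) / mass
    spread = normalize01(var_x + var_y)
    # Mean distance of each cluster's pixels from the image center.
    dist = np.hypot(xs - (w - 1) / 2.0, ys - (h - 1) / 2.0)
    off_center = normalize01((resp * dist).sum(axis=(1, 2)) / mass)
    # Compact, centered clusters get weights near 1; others near 0.
    cluster_weight = (1.0 - spread) * (1.0 - off_center)
    return np.tensordot(cluster_weight, resp, axes=1)
```

The returned array has one value per pixel: the sum of the pixel's cluster memberships, each scaled by how compact and how central that cluster is.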
![Page 34: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/34.jpg)
Color Spatial Distribution
![Page 35: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/35.jpg)
Color Spatial Distribution
![Page 36: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/36.jpg)
Color Spatial Distribution
![Page 37: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/37.jpg)
Color Spatial Distribution
![Page 38: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/38.jpg)
Color Spatial Distribution
![Page 39: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/39.jpg)
Color Spatial Distribution
![Page 40: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/40.jpg)
Color Spatial Distribution
![Page 41: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/41.jpg)
Color Spatial Distribution
![Page 42: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/42.jpg)
Combine Features Together
![Page 43: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/43.jpg)
Conditional Random Field Training and Inference
• Accelerated Training of Conditional Random Fields with Stochastic Meta-Descent. S. V. N. Vishwanathan, N. Schraudolph, M. Schmidt, K. Murphy. ICML 2006 (International Conference on Machine Learning).
• I did the training using the toolbox that accompanies the above paper:
• http://people.cs.ubc.ca/~murphyk/Software/CRF/crf.html
![Page 44: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/44.jpg)
Mask outputs using CRF inference
Panels: Input, M-Contrast-map, Center Surr. Hist., Color Spatial Var.; Input, Combined features, Ground truth
![Page 45: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/45.jpg)
Mask outputs using CRF inference
Panels: Input, M-Contrast-map, Center Surr. Hist., Color Spatial Var.; Input, Combined features, Ground truth
![Page 46: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/46.jpg)
Mask outputs using CRF inference
Panels: Input, M-Contrast-map, Center Surr. Hist., Color Spatial Var.; Input, Combined features, Ground truth
![Page 47: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/47.jpg)
Mask outputs using CRF inference
Panels: Input, M-Contrast-map, Center Surr. Hist., Color Spatial Var.; Input, Combined features, Ground truth
![Page 48: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/48.jpg)
Precision / Recall obtained
Bar chart (vertical axis from 0 to 0.9): Precision, Recall, and F-measure for Multiscale Contrast, Center Surround Histogram, Color Spatial Variance, Combination (Training Dataset = 100), and Combination (Larger Dataset = 2000).
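For reference, the three scores can be computed from a predicted binary mask and a ground-truth mask as in the sketch below. The F-measure weighting is exposed as a hypothetical `beta2` parameter because the slides do not state which weighting was used; `beta2=1.0` gives the usual harmonic mean of precision and recall.

```python
import numpy as np


def precision_recall_f(pred, truth, beta2=1.0):
    """Pixel-wise precision, recall, and F-measure of a binary mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()  # salient pixels correctly detected
    precision = tp / pred.sum() if pred.sum() else 0.0
    recall = tp / truth.sum() if truth.sum() else 0.0
    denom = beta2 * precision + recall
    f = (1 + beta2) * precision * recall / denom if denom else 0.0
    return precision, recall, f
```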
![Page 49: Visual Saliency: Learning to Detect Salient Objects](https://reader034.vdocuments.site/reader034/viewer/2022052619/555dc3e1d8b42aec698b47ed/html5/thumbnails/49.jpg)
Some Conclusions
• The results of the original research paper on computing the visual features have been replicated to a considerable extent.
• The Conditional Random Field framework used in this project turned out to perform well for this task.
• The center-surround histogram map turned out to be the feature that gave the highest precision.
• The time required to compute each individual feature is on the order of several seconds.