The Impact of Visual Saliency Prediction in Image Classification
Eric Arazo Sánchez
Advisors: Kevin McGuinness, Eva Mohedano, Xavier Giró-i-Nieto
Introduction - Computer vision
Classical computer vision: Handcrafted descriptors → Classifier → “guitar”
Deep Learning: Learned descriptors → Classifier → “guitar”
[Diagram: blocks labeled "Trainable" mark the stages learned from data]
Introduction - Imagenet
Russakovsky, Olga, et al. “Imagenet large scale visual recognition challenge”. International Journal of Computer Vision (2015).
Imagenet
Images:
● 1.2 M train
● 50,000 test
● 1,000 categories
Evaluation dataset unpublished before the competition
Metrics:
● Top-1 accuracy
● Top-5 accuracy
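The two metrics above can be illustrated with a minimal sketch. The scores, labels and the helper name `top_k_accuracy` are hypothetical and only serve the example; a prediction counts as correct when the true class is among the k highest-scoring classes.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for class_scores, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(class_scores)),
                        key=lambda c: class_scores[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data (hypothetical): 3 images, 6 classes.
scores = [
    [0.10, 0.50, 0.05, 0.20, 0.10, 0.05],  # true class 1 has the highest score
    [0.30, 0.05, 0.25, 0.20, 0.15, 0.05],  # true class 4 is ranked fourth
    [0.02, 0.03, 0.05, 0.10, 0.20, 0.60],  # true class 0 is ranked last
]
labels = [1, 4, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top5 = top_k_accuracy(scores, labels, k=5)
print(f"Top-1: {top1:.2f}, Top-5: {top5:.2f}")
```

Top-5 is always at least as high as Top-1, since a Top-1 hit is also a Top-5 hit.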
Introduction - Imagenet
ILSVRC - Evolution since 2010
Slide credit: Kaiming He (FAIR)
Some models have already reached human-level performance.
Still the Olympic Games of computer vision?
2012 (-9.4%): introduction of Convolutional Neural Networks (CNNs) in the competition with AlexNet
Introduction - AlexNet
Ref: Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. NIPS 2012.
● 5 Convolutional Layers
● 3 Fully Connected Layers
● 1000-way softmax (object class)
Introduction - CNN
LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
CNNs are very useful in computer vision:
● Reduction of parameters (shared filters)
● Spatial coherence
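A rough sketch of the parameter reduction from shared filters: the layer sizes below are hypothetical and not taken from the slides. It compares a fully connected mapping between two feature volumes with a convolution that reuses the same small filters at every position.

```python
def dense_params(in_h, in_w, in_c, out_h, out_w, out_c):
    """Weights plus biases of a fully connected layer between two feature volumes."""
    n_in = in_h * in_w * in_c
    n_out = out_h * out_w * out_c
    return n_in * n_out + n_out


def conv_params(kernel, in_c, out_c):
    """Weights plus biases of a convolution: one filter bank shared across all positions."""
    return kernel * kernel * in_c * out_c + out_c


# Hypothetical layer: 32x32 RGB input mapped to a 32x32 output with 16 channels.
conv = conv_params(kernel=3, in_c=3, out_c=16)
dense = dense_params(32, 32, 3, 32, 32, 16)
print(f"convolution: {conv} parameters, fully connected: {dense} parameters")
```

The convolution's parameter count does not depend on the image size, which is why the same filters can be applied to larger inputs.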
Tasks addressed with CNNs: image captioning, image segmentation, saliency prediction
Introduction - Saliency prediction
Images → CNN model → Saliency maps
CNN for image classification
Objective
● Explore whether saliency maps can improve other computer vision tasks
Outline
● Introduction
● Objective
● State-of-the-art
● Methodology
● Conclusions
● Future work
State-of-the-art - Saliency prediction
SalNet
Pan, Junting, et al. "Shallow and Deep Convolutional Networks for Saliency Prediction." CVPR 2016.
Trained on SALICON
Saliency prediction
Application of saliency:
● In image retrieval
○ Finding the last appearance of an object.
● Object recognition
○ Health care
Ref: Reyes, Cristian et al. Where is my Phone? Personal Object Retrieval from Egocentric Images (2016)
Ref: Pérez de San Roman, Philippe et al. Saliency Driven Object recognition in egocentric videos with deep CNN. 2016
Saliency prediction - our approach
AlexNet*SalNet
Methodology
RGB - The Baseline
● RGB images
● 1.2 M images
● 227 x 227
9 days to train on a computation cluster
[Figure labels: 9 days, 5 days, 1.5 days]
How to introduce saliency predictions?
● Multiplication
● Fan-in Network
● Concatenation
[Diagram labels: Alexnet (three times), CNN, "Where?"]
It makes sense to use the baseline, which is already trained (pre-trained CNN).
Multiplication vs. Concatenation
Three strategies for each of them:
● RGBS
● RGB-1S-2S
● RGBS-1S-2S
[Diagram, repeated for each strategy: network with inputs RGB and Saliency; layers Conv 1 to Conv 5, Batch Norm. (x2), Max-Pooling (x3), FC 1, FC 2, FC 3 - Output, Drop Out (x2)]
The best option is concatenation:
● RGBS
● RGB-1S-2S
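The two ways of combining the inputs compared above can be sketched as follows. This is an illustration under assumptions, not the authors' code: it assumes RGBS means stacking the saliency map as a fourth input channel, and that multiplication weights each RGB channel pixel-wise at the input. The function names and the 2x2 toy image are hypothetical.

```python
def concatenate(rgb, saliency):
    """RGBS-style input (assumed): append the saliency map as a fourth channel."""
    return rgb + [saliency]


def multiply(rgb, saliency):
    """Weight every RGB channel pixel-wise by the saliency map."""
    return [
        [[pixel * weight for pixel, weight in zip(row, sal_row)]
         for row, sal_row in zip(channel, saliency)]
        for channel in rgb
    ]


# Hypothetical 2x2 image: three channels (R, G, B) and a saliency map in [0, 1].
rgb = [
    [[0.2, 0.4], [0.6, 0.8]],
    [[0.1, 0.3], [0.5, 0.7]],
    [[0.0, 1.0], [1.0, 0.0]],
]
saliency = [[1.0, 0.5], [0.0, 1.0]]

rgbs = concatenate(rgb, saliency)   # 4 channels: the network sees RGB and saliency
masked = multiply(rgb, saliency)    # 3 channels: non-salient pixels are attenuated
print(len(rgbs), len(masked))
```

Concatenation keeps the original RGB values and lets the first layer learn how to use the extra channel, whereas multiplication discards information wherever the saliency is close to zero.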
Fan-in architecture
Three strategies:
● Fan-in C1.1: additional branch with Conv 1, Batch Norm. and Max-Pooling
● Fan-in C2.1: additional branch with Conv 1 and Conv 2, each followed by Batch Norm. and Max-Pooling
● Fan-in C2: additional branch with Conv 1, Batch Norm. and Max-Pooling; the diagram shows no Conv 2 in the main network
[Diagram, repeated for each strategy: network with inputs RGB and Saliency; layers Conv 1 to Conv 5, Batch Norm., Max-Pooling, FC 1, FC 2, FC 3 - Output, Drop Out]
The best options are:
● Fan-in C2.1
● Fan-in C2
Surprising result for Fan-in C2, since it has fewer parameters than the baseline
More experiments
12.4%
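A toy sketch of the fan-in idea: each input goes through its own branch before the two feature maps are merged. The merge by element-wise multiplication follows the "Concatenating instead of multiplying" item listed under future work. Everything else is an assumption for illustration: the branch is a stand-in (scaling, ReLU and 2x2 max-pooling instead of a real convolution), and the weights and inputs are hypothetical.

```python
def branch(feature_map, weight):
    """Stand-in for a Conv + Max-Pooling branch: scale, apply ReLU, then 2x2 max-pool."""
    activated = [[max(0.0, weight * v) for v in row] for row in feature_map]
    pooled = []
    for i in range(0, len(activated) - 1, 2):
        pooled_row = []
        for j in range(0, len(activated[i]) - 1, 2):
            pooled_row.append(max(activated[i][j], activated[i][j + 1],
                                  activated[i + 1][j], activated[i + 1][j + 1]))
        pooled.append(pooled_row)
    return pooled


def fan_in(image_features, saliency_features):
    """Merge the two branches by element-wise multiplication."""
    return [[a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(image_features, saliency_features)]


# Hypothetical single-channel 4x4 inputs.
image = [[1.0, 2.0, 0.0, 1.0],
         [0.0, 1.0, 3.0, 1.0],
         [2.0, 0.0, 1.0, 0.0],
         [1.0, 1.0, 0.0, 2.0]]
saliency = [[1.0, 1.0, 0.0, 0.0],
            [1.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 0.5, 0.5],
            [0.0, 0.0, 0.5, 1.0]]

image_branch = branch(image, weight=0.5)        # weight stands in for learned filters
saliency_branch = branch(saliency, weight=2.0)  # the saliency branch has its own weights
merged = fan_in(image_branch, saliency_branch)
print(merged)
```

Because the saliency branch has its own trainable weights, the network can learn how strongly the saliency map should modulate the image features.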
RGB-C2 (128x128)
[Figure labels: Fan-in C2 (Fan-in Network), RGB-C2, RGB (baseline)]
How to introduce saliency predictions?
● Multiplication
● Fan-in Network: Fan-in C2.1, Fan-in C2
● Concatenation: RGBS, RGB-1S-2S
Analysis of per-class improvements

| Class | Increase of accuracy |
| --- | --- |
| Acoustic guitar | 25 % |
| Volleyball | 23 % |
| Wrecker, tow car | -23 % |
| Entertainment center | -18 % |
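A per-class comparison of this kind could be computed as in the sketch below. It assumes the increase is the difference between the per-class accuracy of the saliency model and that of the baseline; the slides do not state the exact definition. The labels and predictions are hypothetical toy data.

```python
from collections import defaultdict


def per_class_accuracy(labels, predictions):
    """Accuracy computed separately for every ground-truth class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for label, prediction in zip(labels, predictions):
        total[label] += 1
        if prediction == label:
            correct[label] += 1
    return {cls: correct[cls] / total[cls] for cls in total}


def accuracy_change(labels, baseline_preds, saliency_preds):
    """Per-class accuracy of the saliency model minus that of the baseline."""
    base = per_class_accuracy(labels, baseline_preds)
    with_saliency = per_class_accuracy(labels, saliency_preds)
    return {cls: with_saliency[cls] - base[cls] for cls in base}


# Toy data (hypothetical): four validation images per class.
labels = ["guitar"] * 4 + ["wrecker"] * 4
baseline = ["guitar", "guitar", "banjo", "banjo",
            "wrecker", "wrecker", "wrecker", "truck"]
with_sal = ["guitar", "guitar", "guitar", "banjo",
            "wrecker", "wrecker", "truck", "truck"]

change = accuracy_change(labels, baseline, with_sal)
for cls, delta in sorted(change.items(), key=lambda item: item[1], reverse=True):
    print(f"{cls}: {delta:+.0%}")
```

Sorting the classes by the change makes the largest gains and losses easy to spot.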
Conclusions
● CNNs trained to predict saliency maps can be used to improve other computer vision tasks, such as image classification.
● The best way to introduce saliency maps into a CNN is a Fan-in architecture, which gives the network freedom to decide how to use them.
● Experiments on downsampled images (128 x 128) accurately predict the improvements of the CNN on larger images (227 x 227).
Future work
● Several experiments:
○ Fan-in:
■ Fan-in C2 without saliency maps
■ Concatenating instead of multiplying
○ Concatenation only in the first convolutional layer
○ Multiplication and training from scratch
● Once we have a reasonable model, try other saliency models
Thank you