TRANSCRIPT
Last update: 26‐09‐2019 · a.holzinger@human‐centered.ai
Seminar Explainable AI, Module 8
Selected Methods Part 4: Sensitivity – Gradients
Andreas Holzinger
Human‐Centered AI Lab (Holzinger Group)
Institute for Medical Informatics/Statistics, Medical University Graz, Austria
and
Explainable AI‐Lab, Alberta Machine Intelligence Institute, Edmonton, Canada
00‐FRONTMATTER
This is the version for printing and reading. The lecture version is didactically different.
Remark
00 Reflection
01 Sensitivity Analysis
02 Gradients: General Overview
03 Gradients: DeepLIFT
04 Gradients: Grad‐CAM
05 Integrated Gradients
AGENDA
01 Sensitivity Analysis
Sensitivity analysis (SA) is a classic, versatile and broad field with a long tradition, and can be used for a variety of purposes, including:
- Robustness testing (very important for ML)
- Understanding the relationship between input and output
- Reducing uncertainty
What is Sensitivity Analysis?
Andrea Saltelli, Stefano Tarantola, Francesca Campolongo & Marco Ratto 2004. Sensitivity analysis in practice: a guide to assessing scientific models. Chichester, England.
Remember: neural networks are nonlinear function approximators that use gradient descent to minimize the error of the function approximation.
To students this seems to be "new" – but it has a long history:
- The chain rule (= back‐propagation) was invented by Leibniz (1676) and L'Hôpital (1696)
- Calculus and algebra have long been used to solve optimization problems, and gradient descent was introduced by Cauchy (1847)
- This was used to fuel machine learning in the 1940s > perceptron – but it was limited to linear functions, therefore
- Learning nonlinear functions required the development of the multilayer perceptron and methods to compute the gradient through such a model
- This was elaborated by LeCun (1985), Parker (1985), and Rumelhart, Hinton & Williams (1986)
Overview > review Ch. 6, p. 167ff of Goodfellow, Bengio, Courville 2016
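The idea above can be made concrete with a minimal gradient-descent sketch; the quadratic objective and its analytic gradient are illustrative assumptions, not from the slides:

```python
# Minimal gradient-descent sketch: minimize f(w) = (w - 3)^2 using its
# analytic gradient f'(w) = 2(w - 3).
def gradient_descent(grad, w0, lr=0.1, steps=100):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)          # step against the gradient (Cauchy, 1847)
    return w

w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(round(w_star, 4))            # converges towards the minimum at w = 3
```

The same update rule, applied to the loss of a multilayer perceptron with gradients obtained by the chain rule, is exactly back-propagation training.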
What are Saliency Maps?
Zhuwei Qin, Fuxun Yu, Chenchen Liu & Xiang Chen 2018. How convolutional neural network see the world-A survey of convolutional neural network visualization methods. arXiv preprint arXiv:1804.11191.
What are Saliency Maps?
Karen Simonyan, Andrea Vedaldi & Andrew Zisserman 2013. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv:1312.6034.
Problem: How to cope with local non‐linearities?
David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen & Klaus-Robert Mueller 2010. How to explain individual classification decisions. Journal of machine learning research (JMLR), 11, (6), 1803-1831.
Let us consider a function f, a data point x, and the prediction f(x). SA measures the local variation of the function along each input dimension:

S_i(x) = (∂f/∂x_i)²

In other words, SA produces local explanations for the prediction of a differentiable function f using the squared norm of its gradient w.r.t. the inputs x.
The saliency map S produced with this method describes the extent to which variations in the input would produce a change in the output.
Principle of Sensitivity Analysis (SA)
Muriel Gevrey, Ioannis Dimopoulos & Sovan Lek 2003. Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecological modelling, 160, (3), 249‐264.
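A minimal sketch of this principle, assuming a toy differentiable function in place of a trained model and estimating the partial derivatives by central finite differences:

```python
import numpy as np

# Sensitivity analysis sketch: the relevance of input dimension i is the
# squared partial derivative (df/dx_i)^2, estimated numerically here.
def sensitivity(f, x, eps=1e-5):
    x = np.asarray(x, dtype=float)
    S = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        dfdx = (f(x + e) - f(x - e)) / (2 * eps)  # central difference for df/dx_i
        S[i] = dfdx ** 2                          # squared gradient per dimension
    return S

f = lambda x: 3.0 * x[0] + x[1] ** 2              # toy stand-in for a model
print(sensitivity(f, [1.0, 2.0]))                 # ~[9, 16]
```

For a neural network the derivatives would instead be obtained by back-propagation, which is what the gradient-based methods in the following sections do.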
Given an image classification ConvNet, we aim to answer two questions: What does a class model look like? What makes an image belong to a class?
To this end, we visualise:
- A canonical image of a class
- A class saliency map for a given image and class
Both visualisations are based on the derivative of the class score w.r.t. the input image (computed using back‐prop)
What do we want to know?
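A minimal numeric sketch of the class-saliency idea, using a toy linear classifier (the weights W and input x are made-up values) where the class-score gradient ∂S_c/∂x can be written down directly; a real ConvNet obtains the same derivative by back-propagation:

```python
import numpy as np

# For a linear class score S_c(x) = w_c . x, the saliency of each input
# is |dS_c/dx_i| = |w_c[i]| (Simonyan-style saliency, toy setting).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))    # 3 classes, 8 input "pixels" (toy weights)
x = rng.normal(size=8)         # toy flattened input image

c = int(np.argmax(W @ x))      # predicted class
saliency = np.abs(W[c])        # |dS_c/dx| for the linear score
print(saliency.shape)          # one saliency value per input pixel
```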
Simonyan, Vedaldi, Zisserman: Visualizing Saliency Maps
Karen Simonyan, Andrea Vedaldi& Andrew Zisserman 2013. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv:1312.6034.
Simonyan & Zisserman (2014):
Karen Simonyan & Andrew Zisserman 2014. Very deep convolutional networks for large‐scale image recognition. arXiv:1409.1556.
Relation to de‐convolutional networks
02 Gradients: General Overview
Topographical maps, level curves, gradients
https://mathinsight.org/applet/gradient_directional_derivative_mountain
How to find a local minimum?
Example
Finding the GLOBAL minimum is difficult!
https://www.khanacademy.org/math/multivariable‐calculus/multivariable‐derivatives/partial‐derivative‐and‐gradient‐articles/a/the‐gradient
Gradients > Sensitivity Analysis > Heatmapping
Karen Simonyan, Andrea Vedaldi & Andrew Zisserman 2013. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv:1312.6034.
Gradients
David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen & Klaus‐Robert Mueller 2010. How to explain individual classification decisions. Journal of machine learning research (JMLR), 11, (6), 1803‐1831.
03 Gradients: DeepLIFT
Deep Learning Important FeaTures
Typical Recommender Systems Scenario: Black‐Box

[Figure: a black‐box model outputs a 55% probability of not being able to pay back the loan > no loan]

Why?! Sorry, the computer said no.
Motivation: The Saturation Problem
Avanti Shrikumar, Peyton Greenside & Anshul Kundaje 2017. Learning important features through propagating activation differences. arXiv:1704.02685.
First idea: perturbation
How can we find the important parts of the input for a given prediction?
[Figure: feed‐forward network; yellow = inputs, top node = output]

Drawbacks:
1) Computational efficiency: requires one forward prop for each perturbation
2) Saturation

Examples:
1) Zeiler & Fergus, 2013
2) LIME (Ribeiro et al., 2016)
3) Zintgraf et al., 2017
Avanti Shrikumar, Peyton Greenside & Anshul Kundaje 2017. Learning important features through propagating activation differences. arXiv:1704.02685.
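A minimal occlusion-style sketch of the perturbation idea; the toy model f and the zero-out perturbation are illustrative assumptions:

```python
import numpy as np

# Perturbation sketch: the importance of input i is the drop in the model
# output when that input is "occluded" (set to zero). One forward pass is
# needed per perturbation, which is the efficiency drawback named above.
def occlusion_importance(f, x):
    x = np.asarray(x, dtype=float)
    base = f(x)                         # unperturbed output
    imp = np.zeros_like(x)
    for i in range(x.size):
        x_pert = x.copy()
        x_pert[i] = 0.0                 # occlude one input
        imp[i] = base - f(x_pert)       # output drop = importance
    return imp

f = lambda x: 2.0 * x[0] + 0.5 * x[1]   # toy model
print(occlusion_importance(f, [1.0, 1.0]))  # [2.0, 0.5]
```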
Saturation problem illustrated

[Figure: inputs i1, i2 with unit weights feeding output y, next to a plot of y against i1 + i2 that rises to 1 and then stays flat]

Avoiding saturation means perturbing combinations of inputs > increased computational cost
Avanti Shrikumar, Peyton Greenside & Anshul Kundaje 2017. Learning important features through propagating activation differences. arXiv:1704.02685.
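The flat region can be checked numerically; y = min(i1 + i2, 1) below is an assumed closed form of the slide's toy network (two unit-weight inputs into a saturating unit):

```python
# Saturation problem: y = min(i1 + i2, 1) saturates once i1 + i2 >= 1,
# so the gradient w.r.t. each input is 0 at i1 = i2 = 1, even though
# the inputs clearly still determine the output.
def y(i1, i2):
    return min(i1 + i2, 1.0)

eps = 1e-6
i1, i2 = 1.0, 1.0
g1 = (y(i1 + eps, i2) - y(i1 - eps, i2)) / (2 * eps)  # numerical dy/di1

print(g1)                        # 0.0: gradient relevance vanishes here
print(y(i1, i2) - y(0.0, 0.0))   # 1.0: yet the inputs change the output
```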
Second idea: backpropagate importance

How can we find the important parts of the input for a given prediction?

[Figure: feed‐forward network; yellow = inputs, top node = output]

Examples:
- Gradients (Simonyan et al.)
- Deconvolutional Networks (Zeiler & Fergus)
- Guided Backpropagation (Springenberg et al.)
- Layerwise Relevance Propagation (Bach et al.)
- Integrated Gradients (Sundararajan et al.)
- DeepLIFT (Learning Important FeaTures): https://github.com/kundajelab/deeplift
Avanti Shrikumar, Peyton Greenside & Anshul Kundaje 2017. Learning important features through propagating activation differences. arXiv:1704.02685.
Saturation revisited

When (i1 + i2) >= 1, the gradient is 0.

[Figure: the same network (inputs i1, i2, unit weights, output y) with the flat region of y highlighted]

Affects:
- Gradients
- Deconvolutional Networks
- Guided Backpropagation
- Layerwise Relevance Propagation
[Figure: the same saturating network, now annotated with a reference input]

Reference: i1⁰ = 0 and i2⁰ = 0, so y⁰ = 0 as (i1⁰ + i2⁰) = 0.
With (i1 + i2) = 2, the "difference from reference" (Δy) is +1, NOT 0.
With Δi1 = Δi2 = 1, the contribution scores are C_Δi1Δy = C_Δi2Δy = 0.5.
See paper for detailed backpropagation rules
Avanti Shrikumar, Peyton Greenside & Anshul Kundaje 2017. Learning important features through propagating activation differences. arXiv:1704.02685.
DeepLIFT
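The slide's numbers (Δy = +1, contributions of 0.5 each) can be reproduced with a minimal sketch of DeepLIFT's difference-from-reference idea; the proportional split below is a hand-rolled stand-in for the paper's Rescale rule on this particular toy network, not the library implementation:

```python
# DeepLIFT-style "difference from reference" sketch for the toy network
# y = min(i1 + i2, 1) with reference i1 = i2 = 0 (so y_ref = 0).
# Delta-y is distributed over the inputs in proportion to their deltas,
# giving nonzero scores even where the gradient has saturated to 0.
def deeplift_linear(i1, i2, ref=(0.0, 0.0)):
    y = lambda a, b: min(a + b, 1.0)
    dy = y(i1, i2) - y(*ref)            # difference from reference, Delta-y
    d1, d2 = i1 - ref[0], i2 - ref[1]   # Delta-i1, Delta-i2
    total = d1 + d2
    if total == 0.0:
        return (0.0, 0.0)               # nothing to distribute
    return (dy * d1 / total, dy * d2 / total)   # C_di1_dy, C_di2_dy

print(deeplift_linear(1.0, 1.0))        # (0.5, 0.5), matching the slide
```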
Choice of reference matters!
[Figure: Original image | Reference | DeepLIFT scores; CIFAR10 model, class = "ship"]

Suggestions on how to pick a reference:
- MNIST: all zeros (background)
- Consider using a distribution of references
- E.g. multiple references generated by shuffling a genomic sequence
Example for Interpretation: which features are important?
Avanti Shrikumar, Peyton Greenside & Anshul Kundaje 2017. Learning important features through propagating activation differences. arXiv:1704.02685.
https://pypi.org/project/deeplift
https://vimeo.com/238275076
https://github.com/kundajelab/deeplift
https://www.youtube.com/watch?v=v8cxYjNZAXc&list=PLJLjQOkqSRTP3cLB2cOOi_bQFw6KPGKML
04 Gradients: Grad‐CAM
CAM relies on heatmaps highlighting image pixels for a particular class, and uses global average pooling (GAP) in CNNs.
A class activation map for a particular category indicates the discriminative image regions used by the CNN to identify that exact category (see figure below and see next slide for the procedure).
GAP outputs the spatial average of the feature map of each unit at the last layer of the CNN, and a weighted sum of these averages is used to generate the final output. Similarly, the class activation map is computed as a weighted sum of the feature maps of the last convolutional layer.
Class Activation Mapping (CAM)
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva & Antonio Torralba 2016. Learning deep features for discriminative localization. Proceedings of the IEEE conference on computer vision and pattern recognition, 2921‐2929.
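A minimal numeric sketch of the CAM computation described above, with made-up feature maps and class weights standing in for a trained CNN:

```python
import numpy as np

# CAM sketch: the class activation map M_c is a weighted sum of the last
# convolutional feature maps f_k, using the dense-layer weights w_k^c.
rng = np.random.default_rng(0)
fmaps = rng.random((4, 7, 7))   # last-layer feature maps f_k (toy values)
w_c = rng.random(4)             # dense-layer weights w_k^c for one class

gap = fmaps.mean(axis=(1, 2))           # GAP: spatial average per unit
class_score = w_c @ gap                 # weighted sum of GAP outputs
cam = np.tensordot(w_c, fmaps, axes=1)  # M_c(x, y) = sum_k w_k^c * f_k(x, y)

# Sanity check: the class score equals the spatial mean of the CAM,
# which is why the CAM localizes exactly what drives the score.
assert np.isclose(class_score, cam.mean())
print(cam.shape)                        # (7, 7): one relevance value per location
```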
The drawback of CAM is that it requires changing the network structure and then retraining it. This also implies that architectures which do not have the final "convolutional layer > global average pooling > linear dense layer" structure cannot directly employ this heat‐mapping technique. The technique is constrained to visualising the final convolutional layers, i.e. the latter stages of image classification.
Disadvantages
https://jacobgil.github.io/deeplearning/class‐activation‐maps
http://cnnlocalization.csail.mit.edu
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva & Antonio Torralba. Learning deep features for discriminative localization. Proceedings of the IEEE conference on computer vision and pattern recognition, 2016. 2921-2929.
https://www.youtube.com/watch?v=COjUB9Izk6E
More information (additional material for the interested student)
https://www.youtube.com/watch?v=6wcs6szJWMY
Grad‐CAM (first paper)
Ramprasaath R. Selvaraju, Abhishek Das, Ramakrishna Vedantam, Michael Cogswell, Devi Parikh & Dhruv Batra 2016. Grad‐CAM: Why did you say that? arXiv:1611.07450.
Grad‐CAM big picture
Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh & Dhruv Batra. Grad‐CAM: Visual Explanations from Deep Networks via Gradient‐Based Localization. ICCV, 2017. 618‐626.
Grad‐CAM is a generalization of CAM
Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh & Dhruv Batra. Grad‐CAM: Visual Explanations from Deep Networks via Gradient‐Based Localization. ICCV, 2017. 618‐626.
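A minimal sketch of the Grad-CAM computation, with made-up feature maps and gradients; in practice ∂y_c/∂A^k is obtained by back-propagation through the network:

```python
import numpy as np

# Grad-CAM sketch: neuron-importance weights alpha_k are the global average
# of the class-score gradients over each feature map; the localization map
# is the ReLU of the alpha-weighted sum of the feature maps A^k.
rng = np.random.default_rng(1)
A = rng.random((4, 7, 7))            # feature maps A^k (toy values)
dYc_dA = rng.normal(size=(4, 7, 7))  # dy_c/dA^k (toy stand-in for backprop)

alpha = dYc_dA.mean(axis=(1, 2))     # alpha_k: GAP over each gradient map
heat = np.maximum(0.0, np.tensordot(alpha, A, axes=1))  # ReLU(sum_k alpha_k A^k)
print(heat.shape)                    # (7, 7): coarse localization map
```

Unlike CAM, no architectural change or retraining is needed: the dense-layer weights are replaced by gradient-derived weights, which is why Grad-CAM generalizes CAM.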
05 Integrated Gradients
Integrated Gradients combines the implementation invariance of plain gradients with the sensitivity of techniques such as LRP or DeepLIFT.
Formally, suppose we have a function F : Rⁿ → [0, 1] that represents a deep network. Specifically, let x ∈ Rⁿ be the input at hand, and x′ ∈ Rⁿ be the baseline input. For image networks, the baseline could be the black image, while for text models it could be the zero embedding vector.
We consider the straight‐line path (in Rⁿ) from the baseline x′ to the input x, and compute the gradients at all points along the path. Integrated gradients are obtained by accumulating these gradients: the integrated gradient along dimension i is defined as the path integral

IntegratedGrads_i(x) = (x_i − x′_i) · ∫₀¹ ∂F(x′ + α(x − x′))/∂x_i dα
Integrated Gradients
Mukund Sundararajan, Ankur Taly & Qiqi Yan. Axiomatic attribution for deep networks. Proceedings of the 34th International Conference on Machine Learning, 2017. JMLR, 3319‐3328.
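The path integral can be approximated with a Riemann sum; the toy function F and its gradient below are assumptions standing in for a network and back-propagation:

```python
import numpy as np

def integrated_gradients(grad_F, x, baseline, steps=200):
    """Midpoint Riemann approximation of the straight-line path integral."""
    x = np.asarray(x, dtype=float)
    b = np.asarray(baseline, dtype=float)
    alphas = (np.arange(steps) + 0.5) / steps        # midpoints in (0, 1)
    # average gradient along the path from baseline to input
    avg_grad = np.mean([grad_F(b + a * (x - b)) for a in alphas], axis=0)
    return (x - b) * avg_grad                        # (x_i - x'_i) * avg gradient

# Toy F(x) = x0^2 + 3*x1 with gradient [2*x0, 3].
grad_F = lambda x: np.array([2.0 * x[0], 3.0])
ig = integrated_gradients(grad_F, x=[1.0, 1.0], baseline=[0.0, 0.0])
print(ig)  # ~[1.0, 3.0]; the attributions sum to F(x) - F(baseline)
```

The final comment illustrates the completeness axiom from the paper: the attributions add up to the difference between the output at the input and at the baseline.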
Gradients of counterfactuals
Mukund Sundararajan, Ankur Taly & Qiqi Yan 2016. Gradients of counterfactuals. arXiv:1611.02639.
Thank you!