sanity checks for saliency maps - neurips05... · sanity check2: networks trained with true and...
TRANSCRIPT
![Page 1: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/1.jpg)
Sanity Checks for Saliency Maps
Julius Adebayo*+, Justin Gilmer#, Michael Muelly#, Ian Goodfellow#, Moritz Hardt^#, Been Kim# *Work was done during the Google AI residency program, +MIT, ^UC Berkeley, #Google Brain.
![Page 2: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/2.jpg)
Interpretability
To use machine learning more responsibly.
![Page 3: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/3.jpg)
Investigating post-training interpretability methods.
!3
Given a fixed model, find the evidence of prediction.
![Page 4: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/4.jpg)
!4
Junco Bird-ness
A trained machine learning model
(e.g., neural network)
Given a fixed model, find the evidence of prediction.
Why was this a Junco bird?
Investigating post-training interpretability methods.
![Page 5: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/5.jpg)
One of the most popular techniques:
Saliency maps
!5
Caaaaan do!
A trained machine learning model
(e.g., neural network)
The promise: these pixels are the
evidence of prediction.
Junco Bird-ness
![Page 6: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/6.jpg)
The promise: these pixels are the
evidence of prediction.
Sanity check question.
!6
A trained machine learning model
(e.g., neural network)Junco Bird-ness
![Page 7: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/7.jpg)
Sanity check question.
!7
A trained machine learning model
(e.g., neural network)
If so, when prediction changes, the explanation should change.
Extreme case: If prediction is random, the explanation should
REALLY change.
The promise: these pixels are the
evidence of prediction.
Junco Bird-ness
![Page 8: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/8.jpg)
Sanity check: When prediction changes, do explanations change?
Saliency map
![Page 9: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/9.jpg)
Randomized weights! Network now makes garbage predictions.
Saliency map
Sanity check: When prediction changes, do explanations change?
![Page 10: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/10.jpg)
Saliency map
!!!!!???!?
Sanity check: When prediction changes, do explanations change?
Randomized weights! Network now makes garbage predictions.
![Page 11: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/11.jpg)
Saliency map
!!!!!???!?
Sanity check: When prediction changes, do explanations change?
Randomized weights! Network now makes garbage predictions.
the evidence of prediction?????
![Page 12: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/12.jpg)
Before After
Gui
ded
Ba
ckpr
op
Inte
grat
ed
Gra
dien
t
Sanity check1: When prediction changes, do explanations change?
No!
![Page 13: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/13.jpg)
!13
Networks trained with….
Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages?
No!
![Page 14: Sanity Checks for Saliency Maps - NeurIPS05... · Sanity check2: Networks trained with true and random labels, Do explanations deliver different messages? No! Conclusion!14 Poster](https://reader034.vdocuments.site/reader034/viewer/2022042600/5f58b6145f4681344711cee0/html5/thumbnails/14.jpg)
Conclusion
!14
Poster #30 10:45am - 12:45pm
@Room 210
• Confirmation bias: Just because it “makes sense” to humans, doesn’t mean it reflects the evidence for prediction.
• Do sanity checks for your interpretability methods! (e.g., TCAV [K. et al ’18])
• Others who independently reached the same conclusions: [Nie, Zhang, Patel ’18] [Ulyanov, Vedaldi, Lempitsky ’18]
• Some of these methods have been shown to be useful for humans. Why? More studies needed.