Probabilistic Model-Agnostic Meta-Learning (cbfinn/_files/icml2018_pgm_platipus.pdf)
TRANSCRIPT
[Page 1]
Probabilistic Model-Agnostic Meta-Learning
Chelsea Finn*, Kelvin Xu*, Sergey Levine
*equal contribution
[Page 2]
Deep learning requires a lot of data.
Few-shot learning / meta-learning: incorporate prior knowledge from previous tasks for fast learning of new tasks
Few-shot image classification: given one example of each of 5 classes, classify new examples.
[Page 3]
Ambiguity in Few-Shot Learning
Important for:
- learning to actively learn
- safety-critical few-shot learning (e.g. medical imaging)
- learning to explore in meta-RL
Can we learn to generate hypotheses about the underlying function?
[Page 4]
Background: Model-Agnostic Meta-Learning — optimize for pretrained parameters.
Key idea: train over many tasks to learn a parameter vector θ that transfers.
Meta-test time: adapt with a gradient step, φ = θ − α ∇_θ L(θ, D^tr).
MAML objective: min_θ Σ_i L(θ − α ∇_θ L(θ, D_i^tr), D_i^test)
Finn, Abbeel, Levine ICML ’17
How to reason about ambiguity?
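The MAML setup just described can be sketched in a few lines. Below is a minimal first-order NumPy illustration on toy 1-D linear-regression tasks — a sketch of the idea, not the paper's implementation (all names and hyperparameters here are illustrative):

```python
import numpy as np

# Toy MAML: tasks are 1-D linear regressions y = w * x with different slopes w.
# The model is y_hat = theta * x, so theta is a single scalar parameter.

def loss(theta, x, y):
    # Squared error of the linear model.
    return np.mean((theta * x - y) ** 2)

def grad(theta, x, y):
    # d/d_theta of the squared-error loss above.
    return np.mean(2 * (theta * x - y) * x)

def maml_step(theta, tasks, inner_lr=0.1, outer_lr=0.01):
    # One meta-update: adapt theta per task with an inner gradient step,
    # then move theta to reduce the post-adaptation (query-set) loss.
    meta_grad = 0.0
    for x_tr, y_tr, x_te, y_te in tasks:
        theta_adapted = theta - inner_lr * grad(theta, x_tr, y_tr)
        # First-order approximation: ignore second derivatives through
        # the inner step (as in first-order MAML).
        meta_grad += grad(theta_adapted, x_te, y_te)
    return theta - outer_lr * meta_grad / len(tasks)

rng = np.random.default_rng(0)
theta = 0.0
for _ in range(500):
    tasks = []
    for _ in range(4):
        w = rng.uniform(0.5, 1.5)  # each task = a slope drawn from a range
        x = rng.normal(size=10)
        tasks.append((x[:5], w * x[:5], x[5:], w * x[5:]))
    theta = maml_step(theta, tasks)
# theta ends up close to the mean task slope (~1.0): a good initialization
# from which one gradient step reaches any task in the distribution.
```

The point of the sketch: the meta-objective is evaluated *after* the inner adaptation step, so the learned θ is not the best single model but the best starting point for fast adaptation.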
[Page 5]
Modeling ambiguity
Can we sample classifiers?
Intuition of our approach: learn a prior where a random kick can put us in different modes
[Figure: sampled classifiers pick out different attribute combinations, e.g. "smiling, hat" vs. "smiling, young"]
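The "random kick" intuition above can be illustrated with a toy experiment (a hypothetical setup, not the paper's exact model): sample parameter vectors from a learned Gaussian prior, adapt each on the support set, and check how spread out the adapted hypotheses remain. Ambiguous data leaves them in different modes; informative data collapses them.

```python
import numpy as np

# Illustrative sketch: model y_hat = theta * x, prior over theta assumed
# to be a standard Gaussian that meta-training has produced.

def adapt(theta, x, y, lr=0.2, steps=20):
    # MAML-style inner-loop gradient steps on the support set.
    for _ in range(steps):
        theta = theta - lr * np.mean(2 * (theta * x - y) * x)
    return theta

rng = np.random.default_rng(0)
prior_samples = rng.normal(0.0, 1.0, size=10)  # hypothetical learned prior

# Ambiguous support set: the single point (0, 0) is fit by ANY slope.
x_ambig, y_ambig = np.array([0.0]), np.array([0.0])
# Informative support set: these points pin the slope to exactly 1.
x_clear, y_clear = np.array([1.0, 2.0]), np.array([1.0, 2.0])

spread_ambig = np.std([adapt(t, x_ambig, y_ambig) for t in prior_samples])
spread_clear = np.std([adapt(t, x_clear, y_clear) for t in prior_samples])
# Ambiguity keeps the sampled hypotheses diverse; clear data collapses them.
```

Each prior sample lands in a different hypothesis when the data cannot disambiguate — which is exactly the behavior one wants for active learning and safety-critical few-shot settings.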
[Page 6]
Meta-learning with ambiguity
[Page 7]
Sampling parameter vectors
Approximate with MAP: this is extremely crude, but extremely convenient!
Training is harder. We use amortized variational inference.
(Santos ’92, Grant et al. ICLR ’18)
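The amortized variational training objective can be sketched as a standard per-task evidence lower bound for a hierarchical model with per-task parameters $\phi_i$ (notation reconstructed here, not copied from the slides):

$$
\log p\big(\mathcal{D}_i^{\text{test}} \mid \mathcal{D}_i^{\text{tr}}\big)
\;\geq\;
\mathbb{E}_{\phi_i \sim q}\!\left[\log p\big(\mathcal{D}_i^{\text{test}} \mid \phi_i\big)\right]
\;-\;
D_{\mathrm{KL}}\!\left(q\big(\phi_i \mid \mathcal{D}_i^{\text{tr}}, \mathcal{D}_i^{\text{test}}\big)\,\big\|\,p\big(\phi_i \mid \mathcal{D}_i^{\text{tr}}\big)\right)
$$

The inference network $q$ is amortized across tasks, and the crude-but-convenient MAP step above plays the role of sampling from the prior term at meta-test time, when only $\mathcal{D}_i^{\text{tr}}$ is available.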
[Page 8]
PLATIPUS: Probabilistic LATent model for Incorporating Priors and Uncertainty in few-Shot learning
Ambiguous 5-shot regression:
Ambiguous 1-shot classification:
[Page 9]
PLATIPUS: Probabilistic LATent model for Incorporating Priors and Uncertainty in few-Shot learning
[Page 10]
Collaborators
Questions?
Sergey Levine, Kelvin Xu
Takeaways
Hybrid inference in a hierarchical Bayesian model enables scalable, uncertainty-aware meta-learning.
Handling ambiguity is particularly relevant in few-shot learning.
[Page 11]
Related concurrent work
Kim et al. “Bayesian Model-Agnostic Meta-Learning”: uses Stein variational gradient descent for sampling parameters
Gordon et al. “Decision-Theoretic Meta-Learning”: unifies a number of meta-learning algorithms under a variational inference framework