![Page 1: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/1.jpg)
Towards Visual Recognition in the Wild:
Long-Tailed Sources & Open Compound Targets
Boqing Gong
![Page 2: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/2.jpg)
CVPR 2009
50 classes85 attributes
![Page 3: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/3.jpg)
2011-2015
Kernel Methods for
Unsupervised Domain
Adaptation
10~100 classes
![Page 4: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/4.jpg)
ILSVRC 2010-2017
~1000 classes
Bottom image credit: http://www.thegreenmedium.com/blog/2019/5/24/why-robots-will-help-you-rather-than-try-to-take-over-the-world-a-brief-history-of-ai
![Page 5: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/5.jpg)
ICML 2014
Deep features!
![Page 6: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/6.jpg)
Object recognition in the wild
5k~8k classes
![Page 7: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/7.jpg)
in the wild
![Page 8: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/8.jpg)
Right image credit: https://natureneedsmore.org/the-elephant-in-the-room/
in the wild
![Page 9: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/9.jpg)
CVPR 2019 (oral), improving neural architectures
![Page 10: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/10.jpg)
Long-tailed ImageNet (1000 classes)
Long-tailed Places-365
Long-tailed MS1M ArcFace (74.5k ids)
A memory bank to enhance tail classes
CVPR 2019 (oral), improving neural architectures
![Page 11: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/11.jpg)
An old AI problem
A new AI problem (meta-learning,
transfer learning, zero-shot learning)
Acknowledgement: Matthew Brown @Google
![Page 12: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/12.jpg)
Existing workClass-wise weighting, over/under-sampling, etc.
[CVPR’18] Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
[CVPR’19] Class-Balanced Loss Based on Effective Number of Samples
[NeurIPS’19] Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss
[ICLR’20] Decoupling Representation and Classifier for Long-Tailed Recognition
![Page 13: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/13.jpg)
Classes
Freq
uenc
y
![Page 14: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/14.jpg)
Existing workClass-wise weighting, over/under-sampling, etc.
[CVPR’18] Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
[CVPR’19] Class-Balanced Loss Based on Effective Number of Samples
[NeurIPS’19] Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss
[ICLR’20] Decoupling Representation and Classifier for Long-Tailed Recognition
Existing work assumes 𝝐=0
… as domain adaptationTarget
Source
![Page 15: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/15.jpg)
Existing work assumes 𝝐=0
… as domain adaptationMany training images in a head class: 𝝐=0
Few-shot training images in a tail class: 𝝐≠0
Head vs. tail
![Page 16: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/16.jpg)
CVPR 2020 (oral), long-tailed recognition ⩰ domain adaptation
![Page 17: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/17.jpg)
Our approachEstimating both &
by unifying [CVPR’19] & an improved meta-learning method
SOTA on six datasets
○ CIFAR-LT-10○ CIFAR-LT-100○ ImageNet-LT○ Places-LT○ iNaturalist 2017○ iNaturalist 2018
![Page 18: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/18.jpg)
Long-tailed visual recognition (LTVR)
Emerging challenge as the datasets grow in scale
Timely topic
Datasets: iNaturalist, LVIS, ImageNet, COCO, etc.
Tasks: almost all
… as domain adaptationNew perspective to LTVR
New powerhouse of methods
Domain-invariant representation learning
Curriculum domain adaptation
Adversarial learning
Classifier discrepancy
Data augmentation & synthesis, etc.
Diff: no access to target data
![Page 19: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/19.jpg)
in the wild
![Page 20: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/20.jpg)
Open compound test cases (target)
![Page 21: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/21.jpg)
Open compound test cases (target)
![Page 22: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/22.jpg)
Open compound domain adaptation
Training:
Labeled source domain data
Unlabeled data of the compound target
Testing:
in the compound target domain and
in previously unseen domains
Liu, Ziwei, Zhongqi Miao, Xingang Pan, Xiaohang Zhan, Stella X. Yu, Dahua Lin, and Boqing Gong. "Compound domain adaptation in an open world." CVPR 2020. (oral)
![Page 23: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/23.jpg)
Experiments
![Page 24: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/24.jpg)
Our approach to break the compound target domaininto a series of bi-domain adaptation problems by “domain distances” between the source and latent domains in the target (curriculum training)
Source
Latent domain 1
Latent domain 2
Latent domain 3
![Page 25: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/25.jpg)
into a series of bi-domain adaptation problems by “domain distances” between the source and latent domains in the target (curriculum training)
Source
Latent domain 1
Latent domain 2
Latent domain 3
Our approach to break the compound target domain
![Page 26: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/26.jpg)
into a series of bi-domain adaptation problems by “domain distances” between the source and latent domains in the target (curriculum training)
Source
Latent domain 1
Latent domain 2
Latent domain 3
Our approach to break the compound target domain
![Page 27: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/27.jpg)
into a series of bi-domain adaptation problems by “domain distances” between the source and latent domains in the target (curriculum training)
Source
Latent domain 1
Latent domain 2
Latent domain 3
Our approach to break the compound target domain
![Page 28: Towards Visual Recognition in the Wild Boqing Gong Long ...valser.org/webinar/slide/slides/20200722/Visual Recognition in the … · 22.07.2020 · Unlabeled data of the compound](https://reader035.vdocuments.site/reader035/viewer/2022071220/605b010b98a4d6286d32e0d2/html5/thumbnails/28.jpg)
Pushing the boundary of visual recognition
Long-tailed source domains
The elephant in the room as we scale up classes / study the wild data
Memory bank to enhance tail classes (CVPR’19, oral)
Domain adaptation: a new powerhouse of techniques (CVPR’20, oral)
Improved meta-learning for long-tailed recognition (undergoing)
Open compound target domains (CVPR’20, oral)
Learning from unlabeled, noisy data in the wild (undergoing)