count, crop and recognise: fine-grained recognition in the ... · fine-grained recognition in the...

34
Count, Crop and Recognise: Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani, Daniel Schofield, Andrew Zisserman Visual Geometry Group, University of Oxford

Upload: others

Post on 21-May-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Count, Crop and Recognise:Fine-Grained Recognition in the Wild

Max Bain, Arsha Nagrani, Daniel Schofield, Andrew Zisserman

Visual Geometry Group, University of Oxford

Page 2: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Recognising Animal Individuals in a Video

• The aim: label animal individuals in every frame of a video

King Kong(King Kong 2005)

George(Rampage 2018)

Jambo(Durell Wildlife Park, France)

Page 3: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Recognising Animal Individuals in a Video

• The aim: label animal individuals in every frame of a video

Page 4: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Recognising Animal Individuals in a Video

• The aim: label animal individuals in every frame of a video

Page 5: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Current Methods

Chimpanzee face recognition from videos in the wild using deep learningD. Schofield, A. Nagrani, A. Zisserman, M. Hayashi, T. Matsuzawa, D. Biro, S. CarvalhoScience Advances, 2019www.robots.ox.ac.uk/~vgg/research/ChimpanzeeFaces

Page 6: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Current Methods

Page 7: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Current Methods: Training Pipeline

Page 8: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Pre-trained Detector

(SSD, YOLO, etc.)

Current Methods: Detection

Page 9: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Finetune Detector

Evaluate

Current Methods: Detection

Page 10: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Current Methods: Tracking

Fully-Convolutional Siamese Networks for Object TrackingL. Bertinetto, J. Valmadre, J. F. Henriques, A. Vedaldi, P. H. S. TorrCVPR 2017

Page 11: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

1. Acquire identity labels from expert

2. Train Identity Recognition CNN• ResNet• CE weighted loss (class imbalance)

Current Methods: Recognition

Page 12: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Current Methods: Challenges

Face often turned away Bodies prone to heavy occlusion and overlap

Page 13: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Current Methods: Challenges

Face often turned away Bodies prone to heavy occlusion and overlap

Page 14: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Current Methods: Challenges

Lack of contextual information

Page 15: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Current Methods: Challenges

Lack of contextual information

Page 16: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Current Methods: Challenges

Lack of contextual information

Page 17: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

What if we recognised without explicit detection?

Page 18: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Dataset (Publicly Available)

Page 19: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Frame-Level Recognition

• Mutli-label classifier on raw frames• ResNet18, Sigmoid + Weighted BCE loss

Page 20: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Coarse-grainedcounting

CNNImage

Fine-grained recognition

CNN

Region Proposal

JEJE, PELEY

Prediction

Count, Crop and Recognise (CCR)

Page 21: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Count, Crop and Recognise (CCR)

Coarse-grainedcounting

CNNCount = 2

• Count labels are for free• Trained as classification task (ResNet18, CE loss)• Bin count of N+ into the same class

Page 22: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Count, Crop and Recognise (CCR)

Region Proposal

Page 23: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Count, Crop and Recognise (CCR)

Region Proposal

B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning Deep Features for Discriminative Localization. CVPR'16 (arXiv:1512.04150, 2015).

Australian terrier ...

CONV

CONV

CONV

CONV

CONV

GAP ...

w1

w2

wn

w1 * + w2 * + … + wn * Class Activation Map

(Australian terrier)

=

CONV

Class Activation Mapping

Page 24: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Count, Crop and Recognise (CCR)

Region Proposal

Page 25: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for
Page 26: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Results

Page 27: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Count, Crop and Recognise (CCR)

• Recognise

Fine-grained recognition

CNNJEJE, PELEY

• Multi-label classification (ResNet18)• Sigmoid + BCE Loss

Page 28: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Results

Page 29: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

ResultsIndividual: JIRE

Method AP

Face 31.2

Body 42.3

Baseline 82.3

CCR 86.4

Page 30: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

ResultsIndividual: JIRE

Method AP

Face 31.2

Body 42.3

Baseline 82.3

CCR 86.4

Page 31: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Conclusions

• Detect, Track and Recognise pipelines limited by detector performance• Body > Face• Frame-level recognition offers an alternative, more research needed

Page 32: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

An quirky post-hoc application

FANA JEJE PAMA FOAF

Top-Down Neural Attention by Excitation Backprop, ECCV 2016J. Zhang, Z. Lin, J. Brandt, X. Shen and S. Sclaroff

Page 33: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Annotation Tools

Abhishek Dutta and Andrew Zisserman. 2019.

The VIA Annotation Software for Images, Audio and Video.

In Proceedings of the 27th ACM International Conference on Multimedia (MM ’19)

Object Annotator (bounding boxes, keypoints, pose) Temporal segmentation (presence, actions, speech)

Page 34: Count, Crop and Recognise: Fine-Grained Recognition in the ... · Fine-Grained Recognition in the Wild Max Bain, Arsha Nagrani ... Tracking Fully-Convolutional Siamese Networks for

Thank you to the organisers for arranging

this workshop!

Paper, Dataset and Code at:www.robots.ox.ac.uk/~vgg/research/ccr