Shuicheng YAN, NUS PASCAL VOC Classification: Local Features vs. Deep Features


Page 1: PASCAL VOC Classification: Local Features vs. Deep …jiechen/ACCV2014_Workshop_RoLoD.files/ACCV14... · PASCAL VOC Classification: Local Features vs. Deep Features. ... probabilities

Shuicheng YAN, NUS

PASCAL VOC Classification: Local Features vs. Deep Features

PASCAL VOC

PASCAL VOC: Visual Object Classes challenges
Held yearly, 2007 – 2012
Tens of teams from universities and industry participated, including INRIA, Berkeley, Oxford, NEC, etc.
Became “the dataset” for visual object recognition research

Main tasks: object classification, detection and segmentation
Other tasks: person layout, action recognition, etc.

Data: 20 object classes, ~23,000 images with fine labeling

[Figure: visual object recognition split into object classification (e.g. "Person, Horse, Barrier, Table, etc."), object detection and object segmentation]

Why valuable? Multi-label, real scenarios!

PASCAL VOC: 2010-2014

NUS-(PSL) team results
2014: classification MAP raised to 0.91
2012, 2011, 2010: winner of the object classification task (cls)
2012: winner of the object segmentation task (seg)
2010: honorable mention in the object detection task (det)

NUS-(PSL) architecture: a joint learning of cls-det-seg
Cls: global information
Det: local information
Seg: fine-detailed information
Target: visual object recognition

PASCAL VOC: 2010-2014

[Figure: classification MAP by year]
2010: 73.8%
2011: 78.7%
2012: 82.2%
2013: 79.0%
2014: 83.2%
2014: 91.4%
Method labels in the figure: LLC, Context-SVM, GHM, Sub-category, Deep feature, HCP
Annotated gain: 25%


I. Spring of Local Features: 2010-2012

Pipeline

Feature representation: low-level features, feature encoding, feature pooling
Model learning: classifier learning, context modeling

GHM [2]: Generalized Hierarchical Matching (GHM) for object-centric problems. Object-centric pooling.

Subcategory mining [1]: Automatically mining the visual subcategories based on ambiguity modeling.

Contextualization [3]: Mutual contextualization for object classification and detection tasks. Great performance improvement.

[1] Jian Dong, Qiang Chen, Jiashi Feng, Wei Xia, Zhongyang Huang, Shuicheng Yan. Subcategory-aware Object Classification. In CVPR'13.
[2] Qiang Chen, Zheng Song, Yang Hua, Zhongyang Huang, Shuicheng Yan. Hierarchical Matching with Side Information for Image Classification. In CVPR'12.
[3] Zheng Song*, Qiang Chen*, Zhongyang Huang, Yang Hua, Shuicheng Yan. Contextualizing Object Detection and Classification. In CVPR'11.

Framework – NUS-PSL 2010

[Diagram; block labels recovered from the slide]
Visual Features: local feature extraction, feature coding, feature pooling (SPM)
Kernel: nonlinear kernel, linear kernel
Classification: SVM, regression
Post processing: kernel regression, confidence refinement with exclusive prior
Other blocks: detection results, max pooling
Example class: chair

Framework – NUS-PSL 2012

[Diagram; block labels recovered from the slide]
Visual Features: local feature extraction, feature coding (FK), feature pooling (SPM, GHM)
Kernel: nonlinear kernel, linear kernel, nonlinear + linear kernel
Classification: SVM, regression
Post processing: kernel regression, confidence refinement with exclusive prior
Other blocks: detection results, subcategory detection results, max pooling, subcategory mining, flipping
Example class: chair

Highlighted components:
I. Contextualized Object Classification and Detection
II. Generalized Hierarchical Matching
III. Subcategory Mining

Outline for VOC: 2010-2012

Context model: Contextualized Object Classification and Detection

Feature pooling: Generalized Hierarchical Matching/Pooling

Subcategory learning: Sub-Category Aware Detection & Classification

Contextualized Object Classification and Detection

Cls: global probabilities to contain objects
Det: local patches with matched local shape/texture
Can Cls and Det exchange information?

[Figure label: occurrence probability]

Observations

Object classification and detection are mutually complementary. Each subject task serves as the context task for the other.

Context is not robust for the subject task, so use it only when necessary.

Scene/global-level information is not stable for object detection.

False alarms of object detection harm object classification.

[Figure label: person]

Contextualized SVM - Formulation

• Adaptive contextualization
• Configurable model complexity: low-rank constraint
• Parameter dimension: n x m → R x (n + m)
• Easy to solve and kernelize if [symbol lost in extraction] is fixed

[Equation not preserved in the transcript; its annotations were:]
Sample-specific classification
Original classification hyperplane
Adaptive embedding of context features
Context model (dim m)
Selection of ambiguous samples (dim n)
n: feature dim, m: context dim
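
The low-rank constraint on this slide can be illustrated with a minimal sketch. It assumes, as one reading of the slide's annotations and not as the paper's exact notation, that the sample-specific hyperplane has the form w0 + U (V^T c), where c is the m-dim context feature, U is n x R and V is m x R. The names `w0`, `U`, `V`, `R` and all toy values are hypothetical.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def sample_specific_score(x, c, w0, U, V):
    """Score feature x under the hyperplane w0 + U (V^T c).

    Assumed shapes: x, w0 are n-dim; c is m-dim; U is n x R; V is m x R.
    """
    Vt = [list(col) for col in zip(*V)]   # R x m
    z = matvec(Vt, c)                     # R-dim embedding of the context
    offset = matvec(U, z)                 # n-dim, sample-specific shift
    w = [a + b for a, b in zip(w0, offset)]
    return sum(wi * xi for wi, xi in zip(w, x))


def param_counts(n, m, R):
    """Parameters of a full n x m context matrix vs. its rank-R factors."""
    return n * m, R * (n + m)


# Toy values, chosen only for illustration.
score = sample_specific_score(
    x=[1.0, 2.0, 3.0], c=[1.0, 0.0],
    w0=[0.5, 0.0, -0.5], U=[[1.0], [0.0], [2.0]], V=[[2.0], [1.0]])
full, low_rank = param_counts(n=1000, m=20, R=2)
print(score, full, low_rank)  # 13.0 20000 2040
```

With n = 1000 and m = 20, a rank-2 factorization needs 2040 parameters instead of 20000, which is the "n x m versus R x (n + m)" trade-off the slide lists.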

Contextualized SVM - Formulation

Ambiguity modeling: define the ambiguity degree of a sample as the hinge loss of the subject task. [equation not preserved in the transcript]

Learn the Ambiguity-guided Mixture Model (AMM) through EM to maximize the following objective. [equation not preserved in the transcript]

The multi-mode ambiguity term is defined as the posterior of each mixture component r. [equation not preserved in the transcript]
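
As a small illustration of the first step, the ambiguity degree of a sample can be computed as the standard hinge loss max(0, 1 - y f(x)) of the subject-task classifier. This is only a sketch: the labels in {-1, +1}, the scores and the function name `ambiguity` are assumptions for the example, and the mixture model itself is not reproduced.

```python
def ambiguity(label, score):
    """Hinge loss max(0, 1 - y * f(x)) with y in {-1, +1}."""
    return max(0.0, 1.0 - label * score)


# (label, subject-task score) pairs; toy values only.
samples = [(+1, 2.0), (+1, 0.3), (-1, 0.5), (-1, -1.5)]
degrees = [ambiguity(y, s) for y, s in samples]

# Confidently correct samples get zero ambiguity; samples inside the
# margin or on the wrong side get a positive value.
ambiguous = [i for i, d in enumerate(degrees) if d > 0]
print(ambiguous)  # [1, 2]
```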

Iterative Co-training of Detection and Classification

[Diagram: classification pipeline and detection pipeline, trained alternately]
Learn to classify: classification feature + context from detection → ContextSVM
Learn to detect: detection feature + context from classification → ContextSVM
a) Initial model
b) 1st iteration of ContextSVM: context from the initial classification and the initial detection
c) 2nd iteration of ContextSVM: context from the 1st classification and the 1st detection
…

Results

Iterative contextualization: mean AP values of 20 classes on VOC 2010 train/val

Results

Comparison with the state of the art on VOC 2010

Exemplar results

Representative examples of the baseline (without contextualization) and Context-SVM for the classification task.

Outline for VOC: 2010-2012

Context model: Contextualized Object Classification and Detection

Feature pooling: Generalized Hierarchical Matching/Pooling

Subcategory learning: Sub-Category Aware Detection & Classification

Generalized Hierarchical Matching/Pooling

Traditional pooling: SPM = approximate geometric constraint
Not optimal for object recognition due to misalignment

[Figure: (a) images, (b) SPM partitions, (c) object confidence map partitions]

Hierarchical Pooling for Image Classification

Design a general form of hierarchical matching with side information.

Represent the image with a hierarchical structure.

Hierarchical Matching Kernel

The image similarity kernel is defined as the weighted sum of the per-cluster kernels.

General form of SPM, PMK, etc.
Flexible enough to integrate other side information.
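
A minimal sketch of such a weighted-sum kernel follows. It assumes each image is already represented by one pooled descriptor per cluster, that clusters are matched by index, and that a linear kernel is used inside each cluster; the weights and descriptors are toy values, not taken from the talk.

```python
def cluster_kernel(u, v):
    """Linear kernel between two pooled cluster descriptors."""
    return sum(a * b for a, b in zip(u, v))


def image_kernel(pooled_x, pooled_y, weights):
    """Weighted sum of per-cluster kernels; clusters are matched by index."""
    return sum(w * cluster_kernel(u, v)
               for w, u, v in zip(weights, pooled_x, pooled_y))


# Three clusters per image, two-dimensional descriptors (toy values).
pooled_x = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
pooled_y = [[1.0, 1.0], [0.5, 0.0], [1.0, 1.0]]
weights = [0.25, 0.25, 0.5]
similarity = image_kernel(pooled_x, pooled_y, weights)
print(similarity)  # 1.3125
```

If the clusters are the cells of a spatial pyramid, this reduces to an SPM-style kernel, which is the sense in which the slide calls it a general form.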

Generalized Hierarchical Matching/Pooling

Utilize side information to hierarchically pool local features

[Figure: encoded local features vs. side information]
(a) Side information and image
(b) Hierarchical clustering by side information: level 1 (top), 2 (mid), 3 (bottom)
(c) Hierarchical structure representation
(d) Matching/pooling within each cluster
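
The idea of pooling within clusters defined by side information can be sketched as follows. This is an illustrative simplification: side information is assumed to be a scalar in [0, 1] per local feature (for example a detection confidence at the feature's location), and the hierarchy is built by uniform quantization into 1, 2 and 4 bins, which stands in for whatever clustering the authors actually use. All names are hypothetical.

```python
def max_pool(codes, dim):
    """Element-wise max over code vectors; zeros if the cluster is empty."""
    if not codes:
        return [0.0] * dim
    return [max(col) for col in zip(*codes)]


def side_info_pooling(codes, side, bins_per_level=(1, 2, 4)):
    """Pool codes within clusters obtained by quantizing side info in [0, 1]."""
    dim = len(codes[0])
    pooled = []
    for bins in bins_per_level:
        for b in range(bins):
            lo, hi = b / bins, (b + 1) / bins
            members = [c for c, s in zip(codes, side)
                       if lo <= s < hi or (b == bins - 1 and s == 1.0)]
            pooled.append(max_pool(members, dim))
    return pooled


# Four encoded local features with their side-information values (toy data).
codes = [[1.0, 0.0], [0.2, 0.9], [0.5, 0.5], [0.0, 0.3]]
side = [0.9, 0.1, 0.6, 0.3]
pooled = side_info_pooling(codes, side)
print(len(pooled))  # 7 clusters: 1 + 2 + 4
```

Unlike SPM, the clusters here follow the side-information values rather than fixed image coordinates, so features on the object are pooled together even when the object is not centered.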

Side Information - Detection Confidence Map

Side information design
[Diagram: images → sliding window → sub-window → shape model and appearance model → score voted back to image → fusing → object confidence maps]


Results

VOC

Outline for VOC: 2010-2012

Context model: Contextualized Object Classification and Detection

Feature pooling: Generalized Hierarchical Matching/Pooling

Subcategory learning: Sub-Category Aware Detection & Classification

Sub-Category Mining

Ambiguity Guided Subcategory Mining
[Figure: example categories Sofa, Chair, Diningtable]

Subcategory-aware Object Classification
[Diagram: Subcategory Model 1, Subcategory Model 2, …, Subcategory Model N → Fusion Model]

Sub-Category Mining

Subcategory mining based on both similarity & ambiguity

Calculate the sample intra-class similarity
Calculate the sample inter-class ambiguity
Detect dense subgraphs by the graph shift algorithm [1]
Map subgraphs to subcategories

[Diagram: ambiguous categories (Chair, Sofa) → similarity and ambiguity → instance affinity graph → graph shift → detected subgraphs → corresponding subcategories (visualization)]

[1] Hairong Liu, Shuicheng Yan. Robust Graph Mode Seeking by Graph Shift. ICML 2010.
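
To make the "dense subgraph" step concrete, here is a small sketch that finds one dense subgraph of an instance affinity graph with plain replicator dynamics. This is a simplified stand-in, not the graph shift algorithm of [1], and it assumes the affinity matrix (already combining similarity and ambiguity) is given. The matrix values are invented for illustration.

```python
def replicator_dense_subgraph(A, iters=200, keep=1e-3):
    """Return indices of a dense subgraph of the symmetric affinity matrix A."""
    n = len(A)
    x = [1.0 / n] * n                      # start from uniform membership
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(xi * ai for xi, ai in zip(x, Ax))
        if total == 0:
            break
        # Members with above-average affinity to the current group grow.
        x = [xi * ai / total for xi, ai in zip(x, Ax)]
    return [i for i, xi in enumerate(x) if xi > keep]


# Instances 0-2 are strongly connected; 3-4 form a weaker pair (toy affinities).
A = [
    [0.00, 0.90, 0.90, 0.05, 0.05],
    [0.90, 0.00, 0.90, 0.05, 0.05],
    [0.90, 0.90, 0.00, 0.05, 0.05],
    [0.05, 0.05, 0.05, 0.00, 0.40],
    [0.05, 0.05, 0.05, 0.40, 0.00],
]
subcategory = replicator_dense_subgraph(A)
print(subcategory)  # [0, 1, 2]
```

Instances left outside every detected subgraph would correspond to outliers, and each detected subgraph to one candidate subcategory.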

Sub-Category Aware Detection & Classification

[Diagram; block labels recovered from the slide]
Testing image
Detection branch: sliding/selective window search → feature extraction → detection model → subcategory detection results 1 … N
Classification branch: local feature extraction and coding → GHM pooling → image representation → classification model → subcategory classification results 1 … N
Subcategory models 1 … N → fusion model → category-level result

Sub-Category Mining Result

[Figure: mined subcategories and outliers for Bus and Chair]

Summary of VOC results

Per-class results (%); "Ours" = our best, "Others" = other teams' best

Class        2010 Ours  2010 Others  2011 Ours  2011 Others  2012 Ours  2012 Others
aeroplane    93         93.3         95.5       94.5         97.3       92
bicycle      79         77           81.1       82.6         84.2       74.2
bird         71.6       69.9         79.4       79.4         80.8       73
boat         77.8       77.2         82.5       80.7         85.3       77.5
bottle       54.3       53.7         58.2       57.8         60.8       54.3
bus          85.2       85.9         87.7       87.8         89.9       85.2
car          78.6       80.4         84.1       85.5         86.8       81.9
cat          78.8       79.4         83.1       83.9         89.3       76.4
chair        64.5       62.9         68.5       66.6         75.4       65.2
cow          64         66.2         74.7       74.2         77.8       63.2
diningtable  62.9       61.1         68.5       69.4         75.1       68.5
dog          69.6       71.1         76.4       75.2         83         68.9
horse        82         76.7         83.3       83           87.5       78.2
motorbike    84.4       81.7         87.5       88.1         90.1       81
person       91.6       90.2         92.8       93.5         95         91.6
pottedplant  48.6       53.3         56.5       58.7         57.8       55.9
sheep        65.4       66.3         77.7       75.5         79.2       69.4
sofa         59.6       58           67         66.3         73.4       65.4
train        89.4       87.5         91.2       90           94.5       86.7
tvmonitor    77.2       76.2         77.5       77.2         80.7       77.4
MAP          73.8                    78.7                    82.2


II. Spring of Deep Feature: 2013-2014

CNN: Single-label Image Classification

Definition: assign one and only one label from a pre-defined set to an image

Explicit assumption: the object is roughly aligned

AlexNet [1] made a great breakthrough in single-label classification in ILSVRC 2012 (with a 10% gain over the previous methods)

[1] A. Krizhevsky, I. Sutskever, G. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012.

CNN: Multi-label Image Classification

Definition: assign multiple labels from a pre-defined set to an image

Challenges
Foreground objects are not roughly aligned
Interactions between different objects, e.g. partial visibility and occlusion
A large number of training images are required
The label space is expanded from n to 2^n

[Figure: single-label images vs. multi-label images]

Direct CNN training is unreasonable and unreliable!
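
For scale, the growth of the label space from n to 2^n can be worked out for the 20 VOC classes mentioned earlier in the talk. The snippet below is plain arithmetic; the variable names are only illustrative.

```python
n_classes = 20                       # PASCAL VOC object classes

single_label_outcomes = n_classes    # exactly one label per image
multi_label_outcomes = 2 ** n_classes  # any subset of labels per image

print(single_label_outcomes, multi_label_outcomes)  # 20 1048576
```

With roughly a million possible label subsets, most subsets have few or no training images, which is one way to read the "large number of training images" challenge.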

Hypotheses-CNN-Pooling (HCP)

Our framework

[Diagram: hypotheses (assumption: single-labeled) → shared convolutional neural network → scores for individual hypotheses → max pooling → multi-label output, e.g. dog, person, sheep]
Shared CNN as labeled in the diagram: filters of size 11, 5, 3, 3, 3 with 96, 256, 384, 384, 256 channels; feature maps of size 55, 27, 13, 13, 13; max pooling; two 4096-d layers; output of dimension c
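
The pooling step of this framework can be sketched in a few lines: each hypothesis gets a vector of per-class scores from the shared network, and the image-level prediction keeps, for every class, the maximum over hypotheses. The scores, the class list and the 0.5 threshold below are assumptions for the example, not values from the talk.

```python
def cross_hypothesis_max_pool(hypothesis_scores):
    """Element-wise max over hypotheses: one score per class for the image."""
    return [max(col) for col in zip(*hypothesis_scores)]


def predict_labels(image_scores, classes, threshold=0.5):
    """Keep every class whose pooled score reaches the (assumed) threshold."""
    return [c for c, s in zip(classes, image_scores) if s >= threshold]


classes = ["dog", "person", "sheep", "car"]
# One row per hypothesis, one column per class (toy scores).
hypothesis_scores = [
    [0.90, 0.05, 0.03, 0.02],   # a box tightly around the dog
    [0.10, 0.80, 0.05, 0.05],   # a box around the person
    [0.20, 0.10, 0.60, 0.10],   # a box around the sheep
    [0.30, 0.30, 0.20, 0.20],   # a noisy background box
]
image_scores = cross_hypothesis_max_pool(hypothesis_scores)
labels = predict_labels(image_scores, classes)
print(labels)  # ['dog', 'person', 'sheep']
```

Because a noisy hypothesis rarely produces the highest score for any class, the max operation largely ignores it.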

Characteristics of Our Framework

No ground-truth bounding box information is required for training on the multi-label image dataset

The proposed HCP infrastructure is robust to noisy and/or redundant hypotheses

No explicit hypothesis label is required for training

The shared CNN can be well pre-trained with a large-scale single-label image dataset

The HCP outputs are naturally multi-label prediction results

Training of HCP

Hypotheses extraction

Initialization of HCP
Pre-training on a large-scale single-label image set, e.g. ImageNet
Image-fine-tuning on a multi-label image set

Hypotheses-fine-tuning

Hypotheses Extraction

Criteria:
High object detection recall rate
Small number of hypotheses
High computational efficiency

Solution: BING [2] + boxes clustering

[2] M.-M. Cheng, Z. Zhang, W.-Y. Lin, and P. H. S. Torr. BING: Binarized normed gradients for objectness estimation at 300fps. CVPR 2014.
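
The slide does not say how the boxes are clustered, so the following is only a hedged sketch of one way to reduce many proposals to a few hypotheses: group boxes greedily by intersection over union (IoU) and keep the best-scoring box in each group. This NMS-like procedure is a stand-in chosen for brevity; the box format, scores and the 0.5 threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_hypotheses(boxes, scores, iou_thresh=0.5):
    """Greedily group boxes by IoU and keep the best-scoring box per group."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) < iou_thresh for k in kept):
            kept.append(i)
    return [boxes[k] for k in kept]


# Toy proposals with objectness scores.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 9, 9)]
scores = [0.9, 0.8, 0.7, 0.6]
hypotheses = select_hypotheses(boxes, scores)
print(hypotheses)  # [(0, 0, 10, 10), (50, 50, 60, 60)]
```

This keeps the number of hypotheses small while still covering each distinct image region, in line with the criteria listed on the slide.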


Hypotheses Extraction

Initialization of HCP

Step 1: pre-training on single-label images (e.g. ImageNet)
Parameters transferring
Step 2: image-fine-tuning on multi-label images (e.g. Pascal VOC)


Hypotheses-fine-tuning

Experimental Results

A subset of the ILSVRC 2013 detection dataset is used for BING training

Experimental Results

Performance on PASCAL VOC 2007
[Results table not preserved in the transcript; label: New]

Experimental Results

Performance on PASCAL VOC 2012
[Results table not preserved in the transcript; labels: New-1, New-2]

Experimental Results

Complementary Analysis: Hand-crafted features vs. Deep features

Experimental Results

One test sample from VOC2007
500 hypotheses for each image, 1~1.5s

Generate hypotheses
Feed into the shared CNN
Cross-hypothesis max-pooling

[Figure labels: car, horse, person]

New Result: “Network in Network” (NIN)

NIN: CNN with non-linear filters, yet without the final fully-connected NN layer

Intuitively less overfitting globally, and more discriminative locally (not used in our final submission due to the surgery of our main team member, but very effective)

With fewer parameters

[Figure: CNN vs. NIN; cites [4]]

[4] Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron C. Courville, Yoshua Bengio: Maxout Networks. ICML (3) 2013: 1319-1327

Better Local Abstraction

Motivation: better local abstraction!

Each local patch is projected to its feature vector using a small network.

Cascaded Cross Channel Parametric Pooling (CCCP)

Lin, Min, Qiang Chen, and Shuicheng Yan. "Network In Network." ICLR 2014.


CCCP ≈ Cascaded 1x1 Convolution in Implementation
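
The equivalence stated above can be seen in a small sketch: a 1x1 convolution applies the same channel-mixing linear map at every pixel, and cascading several of them (with a nonlinearity in between) gives a tiny per-pixel network. ReLU as the nonlinearity, the layer sizes and all weights are assumptions made for this illustration.

```python
def relu(v):
    return [max(0.0, a) for a in v]


def conv1x1(feature_map, weights, bias):
    """Apply one channel-mixing linear map + ReLU at every pixel.

    feature_map: H x W x C_in nested lists; weights: C_out x C_in; bias: C_out.
    """
    return [[relu([sum(w * x for w, x in zip(row_w, pixel)) + b
                   for row_w, b in zip(weights, bias)])
             for pixel in row]
            for row in feature_map]


def cccp(feature_map, layers):
    """Cascade several 1x1 convolutions, each given as (weights, bias)."""
    for weights, bias in layers:
        feature_map = conv1x1(feature_map, weights, bias)
    return feature_map


# A 1 x 2 feature map with 2 channels, passed through 2 -> 2 -> 1 channels.
fmap = [[[1.0, 2.0], [3.0, -1.0]]]
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-1.0]),
]
out = cccp(fmap, layers)
print(out)  # [[[0.5], [4.0]]]
```

No spatial neighbours are mixed in these layers; only the channels at each location are recombined, which is why the operation is described as cross-channel pooling.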

Global Average Pooling

Saves tons of parameters

[Figure: CNN vs. NIN; NIN produces a confidence map for each category]
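
A minimal sketch of global average pooling over per-category confidence maps is given below, together with a weight count for the fully connected layer it would replace. The map values and the 256 x 6 x 6 → 4096 layer shape are illustrative assumptions, not figures from the talk.

```python
def global_average_pool(confidence_maps):
    """Average each per-category confidence map (H x W) to a single score."""
    return [sum(sum(row) for row in m) / (len(m) * len(m[0]))
            for m in confidence_maps]


def fc_weight_count(channels, height, width, outputs):
    """Weights of a fully connected layer on a flattened C x H x W input."""
    return channels * height * width * outputs


# Two categories, each with a 2 x 2 confidence map (toy values).
maps = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
category_scores = global_average_pool(maps)
saved = fc_weight_count(256, 6, 6, 4096)
print(category_scores, saved)  # [4.0, 0.5] 37748736
```

The pooling itself has no learnable weights, so every weight counted by `fc_weight_count` is a parameter that an averaging layer does not need.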

NIN in ILSVRC 2014

To avoid hyper-parameter tuning, we put cccp layers directly on the convolution layers of ZFNet (Network in ZFNet).

Without cccp layers
Layer   Details
Conv1   stride = 2, kernel = 7x7, channel_out = 96
Conv2   stride = 2, kernel = 5x5, channel_out = 256
Conv3   stride = 1, kernel = 3x3, channel_out = 512
Conv4   stride = 1, kernel = 3x3, channel_out = 1024
Conv5   stride = 1, kernel = 3x3, channel_out = 512
Fc1     output = 4096
Fc2     output = 4096
Fc3     output = 1000

With cccp layers
Layer   Details
Conv1   stride = 2, kernel = 7x7, channel_out = 96
Cccp1   output = 96
Conv2   stride = 2, kernel = 5x5, channel_out = 256
Cccp2   output = 256
Conv3   stride = 1, kernel = 3x3, channel_out = 512
Cccp3   output = 256
Conv4   stride = 1, kernel = 3x3, channel_out = 1024
Cccp4   output = 512
Cccp5   output = 384
Conv5   stride = 1, kernel = 3x3, channel_out = 512
Cccp6   output = 256
Fc1     output = 4096
Fc2     output = 4096
Fc3     output = 1000

(10.91%) With 256xN training and 3 view test

Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." Computer Vision – ECCV 2014. Springer International Publishing, 2014. 818-833.

NIN in HCP

[Diagram: hypotheses → shared NIN → scores for individual hypotheses → max pooling → multi-label output of dimension c, e.g. dog, person, sheep]

Comparison with the State of the Art on VOC 2012

Category  NUS-PSL [1]  PRE-1000C [2]  PRE-1512 [2]  Chatfield et al. [3]  HCP-NIN  HCP-NIN+NUS-PSL
plane     97.3         93.5           94.6          96.8                  98.4     99.5
bicycle   84.2         78.4           82.9          82.5                  89.5     93.7
bird      80.8         87.7           88.2          91.5                  96.2     96.8
boat      85.3         80.9           84.1          88.1                  91.7     94.0
bottle    60.8         57.3           60.3          62.1                  72.5     77.7
bus       89.9         85.0           89.0          88.3                  91.1     95.3
car       86.8         81.6           84.4          81.9                  87.2     92.4
cat       89.3         89.4           90.7          94.8                  97.1     98.2
chair     75.4         66.9           72.1          70.3                  73.0     86.1
cow       77.8         73.8           86.8          80.2                  89.5     91.3
table     75.1         62.0           69.0          76.2                  75.1     83.5
dog       83.0         89.5           92.1          92.9                  96.3     97.3
horse     87.5         83.2           93.4          90.3                  93.0     96.8
motor     90.1         87.6           88.6          89.3                  90.5     96.3
person    95.0         95.8           96.1          95.2                  94.8     95.8
plant     57.8         61.4           64.3          57.4                  66.5     72.2
sheep     79.2         79.0           86.6          83.6                  90.3     91.5
sofa      73.4         54.3           62.3          66.4                  65.8     81.1
train     94.5         88.0           91.1          93.5                  95.6     97.6
tv        80.7         78.3           79.8          81.9                  82.0     90.0
MAP       82.2         78.7           82.8          83.2                  86.8     91.4

[1] S. Yan, J. Dong, Q. Chen, Z. Song, Y. Pan, W. Xia, H. Zhongyang, Y. Hua, and S. Shen. Generalized hierarchical matching for subcategory aware object classification. In Visual Recognition Challenge workshop, ECCV, 2012.

[2] M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Learning and transferring mid-level image representations using convolutional neural networks. CVPR, 2014.

[3] K. Chatfield, K. Simonyan, A. Vedaldi, A. Zisserman. Return of the Devil in the Details: Delving Deep into Convolutional Nets. BMVC, 2014.

From 81.7% | < 90.3%
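
The MAP row is the plain mean of the 20 per-class values in a column. As a quick check, the sketch below recomputes it for the HCP-NIN+NUS-PSL column, with the values copied from the comparison table on this slide; the variable names are only illustrative.

```python
# Per-class AP (%) for HCP-NIN+NUS-PSL on VOC 2012, in table order.
hcp_nin_plus_nus_psl = [
    99.5, 93.7, 96.8, 94.0, 77.7, 95.3, 92.4, 98.2, 86.1, 91.3,
    83.5, 97.3, 96.8, 96.3, 95.8, 72.2, 91.5, 81.1, 97.6, 90.0,
]

mean_ap = sum(hcp_nin_plus_nus_psl) / len(hcp_nin_plus_nus_psl)
print(round(mean_ap, 1))  # 91.4
```

The result matches the 91.4 reported in the MAP row, which confirms that the flattened table was read back in the right column order.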


Demo

Online Demo

Highest and Lowest Score Five Images for Each Class

[Image galleries for: Aeroplane, Bicycle, Bird, Boat, Bottle, Bus, Car, Cat, Chair, Cow, Dining table, Dog, Horse, Motorbike, Person, Pottedplant, Sheep, Sofa, Train, TV monitor]

What’s next?

Better deep features?
Better local features?
More extra data?
Better solution for small/occluded objects?

[Figure: classification MAP by year]
2009: 66.5%
2010: 73.8%
2011: 78.7%
2012: 82.2%
2014: 83.2%
2014: 91.4%
Method labels in the figure: LLC, Context-SVM, GHM, Sub-category, Deep feature, HCP
Annotated gain: 25%


Shuicheng YAN
