Deep Learning for Computer Vision: Image Classification (UPC 2016)

Day 1 Lecture 2

Classification

[course site]

Eva Mohedano

Outline

Image Classification

Train/Test Splits

Metrics

Models

Image Classification

Set of predefined categories [e.g. table, apple, dog, giraffe]

Binary classification [1, 0]

Image Classification pipeline (example output label: Dog)

Slide credit: Jose M Àlvarez

Learned Representation

Part I: End-to-end learning (E2E)

Image Classification: Example Datasets

training set of 60,000 examples

test set of 10,000 examples

Train/Test Splits

Training set: N training examples (rows) × D features (columns)

2.1 3.2 4.8 0.1 0.0 2.6
3.1 1.4 2.5 0.2 1.0 2.0
1.0 2.3 3.2 9.3 6.4 0.3
2.0 5.0 3.2 1.0 6.9 9.1
9.0 3.5 5.4 5.5 3.2 1.0

[Diagram]
Dataset → shuffle → Shuffled data → split → Training data (70%) and Test data (30%)
Training data → Learning algorithm: fit(X, y)
Test data → Prediction algorithm: predict(X) → Predictions
Predictions → Compute error/accuracy: score(X, y) → Out-of-sample error estimate
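The shuffle, split, fit, predict and score steps can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper `shuffle_split`, the `NearestCentroid` class and the toy data are hypothetical names and values chosen here, and the nearest-centroid rule is a stand-in for whichever learning algorithm is actually used.

```python
import random


def shuffle_split(X, y, train_fraction=0.7, seed=0):
    """Shuffle the row indices once, then cut them into train and test parts."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_train = int(round(train_fraction * len(indices)))
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, y_train, X_test, y_test


class NearestCentroid:
    """Hypothetical stand-in model exposing fit / predict / score."""

    def fit(self, X, y):
        # Average the feature vectors of each class to get one centroid per class.
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids_ = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        # Assign each row to the class whose centroid is closest.
        def sq_dist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: sq_dist(row, self.centroids_[c]))
            for row in X
        ]

    def score(self, X, y):
        # Accuracy: fraction of rows whose prediction matches the label.
        predictions = self.predict(X)
        return sum(p == t for p, t in zip(predictions, y)) / len(y)


# Toy data (made up for illustration): two well-separated classes.
X = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 5.0] for i in range(10)]
y = [0] * 10 + [1] * 10

X_train, y_train, X_test, y_test = shuffle_split(X, y, train_fraction=0.7, seed=0)
model = NearestCentroid().fit(X_train, y_train)   # the test rows are never seen here
test_accuracy = model.score(X_test, y_test)       # out-of-sample estimate
```

The test rows are only touched in the final `score` call, which is what makes the number an out-of-sample estimate.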

Split your dataset into train and test at the very start
● Usually good practice to shuffle data (exception: time series)

Do not look at test data (data snooping)!
● Lock it away at the start to prevent contamination

NB: Never ever train on the test data!
● You have no way to estimate error if you do
● Your model could easily overfit the test data and have poor generalization; you have no way of knowing without test data
● Model may fail in production

Data hygiene

Metrics

Confusion matrices provide a by-class comparison between the results of the automatic classification and the ground truth annotations.

Correct classifications appear on the diagonal, while the remaining cells correspond to errors.

                        Prediction
Ground truth    Class 1    Class 2    Class 3
Class 1         x(1,1)     x(1,2)     x(1,3)
Class 2         x(2,1)     x(2,2)     x(2,3)
Class 3         x(3,1)     x(3,2)     x(3,3)
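A confusion matrix with this layout (rows are ground truth, columns are predictions) can be built by counting label pairs. The sketch below is illustrative only: the function name `confusion_matrix` and the example labels are assumptions made here, not part of the lecture.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Count (ground truth, prediction) pairs; rows = truth, columns = prediction."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


# Made-up labels for illustration.
classes = ["Class 1", "Class 2", "Class 3"]
y_true = ["Class 1", "Class 1", "Class 2", "Class 2", "Class 3", "Class 3"]
y_pred = ["Class 1", "Class 2", "Class 2", "Class 2", "Class 3", "Class 1"]

cm = confusion_matrix(y_true, y_pred, classes)
# The diagonal entries cm[i][i] are the correct classifications.
```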

Special case: binary classifiers, expressed in terms of “Positive” vs “Negative”.

                        Prediction
Ground truth    Positive               Negative
Positive        True positive (TP)     False negative (FN)
Negative        False positive (FP)    True negative (TN)

The “accuracy” measures the proportion of correct classifications, not distinguishing between classes.
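In terms of a confusion matrix, accuracy is the sum of the diagonal divided by the sum of all cells; in the binary case this is (TP + TN) / (TP + FN + FP + TN). A minimal sketch, where the function name and the example counts are assumptions chosen for illustration:

```python
def accuracy_from_confusion(matrix):
    """Accuracy = correct classifications (diagonal) / all classifications."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Multi-class example (made-up counts): rows = ground truth, columns = prediction.
multi_class = [[1, 1, 0],
               [0, 2, 0],
               [1, 0, 1]]
multi_accuracy = accuracy_from_confusion(multi_class)   # 4 correct out of 6

# Binary example (made-up counts): [[TP, FN], [FP, TN]].
binary = [[40, 10],
          [5, 45]]
binary_accuracy = accuracy_from_confusion(binary)       # (40 + 45) / 100
```

Because every class contributes to a single number, a high accuracy can hide poor performance on a rare class.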

Given a reference class, its Precision (P) and Recall (R) are complementary measures of relevance.

                        Prediction
Ground truth    Positives               Negatives
Positives       True positive (TP)      False negative
Negatives       False positives (FP)

"Precisionrecall" by Walber - Own work. Licensed under Creative Commons Attribution-Share Alike 4.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Precisionrecall.svg#mediaviewer/File:Precisionrecall.svg

Example: Relevant class is “Positive” in a binary classifier.
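With “Positive” as the reference class, the standard definitions are precision = TP / (TP + FP) and recall = TP / (TP + FN). The sketch below uses made-up counts; returning 0.0 when a denominator is zero is a convention assumed here, not something the slides specify.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall for the reference (positive) class."""
    # Precision: of everything predicted positive, how much is truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything truly positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up counts for illustration.
precision, recall = precision_recall(tp=40, fp=5, fn=10)
```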

Binary classification results often depend on a parameter (e.g. a decision threshold) whose value directly impacts precision and recall.

For this reason, in many cases a Receiver Operating Characteristic curve (ROC curve) is provided as a result.
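An ROC curve plots the true positive rate (which is the same quantity as recall) against the false positive rate as the decision threshold is swept. The sketch below computes those points for a handful of made-up scores; the function name `roc_points` and the data are assumptions for illustration, and the code assumes both classes are present in `labels`.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping the threshold over every score.

    labels are 1 for positive and 0 for negative; a sample is predicted
    positive when its score is >= the current threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Made-up classifier scores and ground-truth labels.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
points = roc_points(scores, labels)
```

Each point corresponds to one choice of threshold, so the curve summarizes the classifier across all operating points instead of a single one.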

Models

Image Classification pipeline (example output label: Dog)

Mapping function to predict a score for the class label

Linear Models

$f(x, w) = w^{T}x + b$

CS231n: Convolutional Neural Networks for Visual Recognition

Logistic Regression

$f(x, w) = g(w^{T}x + b)$

Activation function $g$ (sigmoid): turns the score into a probability

Neuron
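The linear score and the sigmoid activation can be written out directly. This is a minimal sketch: the sigmoid is the standard g(z) = 1 / (1 + e^(-z)), while the function names and the example weights are assumptions chosen here for illustration.

```python
import math


def linear_score(x, w, b):
    """Linear model: w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sigmoid(z):
    """g(z) = 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def neuron(x, w, b):
    """Logistic regression / single neuron: g(w^T x + b)."""
    return sigmoid(linear_score(x, w, b))


# Made-up input and parameters.
x = [1.0, 2.0]
w = [0.5, -0.25]
b = 0.0

score = linear_score(x, w, b)     # 0.5 * 1.0 + (-0.25) * 2.0 + 0.0
probability = neuron(x, w, b)     # sigmoid of that score
```

A score of zero sits exactly on the decision boundary, so the sigmoid maps it to a probability of 0.5.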

Summary

Image classification aims to build a model to automatically predict the content of an image.

When framed as a supervised problem, the data should be split into train and test sets to learn and evaluate the model.

It is important to have a way to assess the performance of the models.

The simplest classification models are linear models. A linear model with a non-linearity on top represents the “neuron”, the basic element of neural networks.
