
Incremental Classifier and Representation Learning
Sylvestre-Alvise Rebuffi†,∗, Alexander Kolesnikov∗, Georg Sperl∗, Christoph H. Lampert∗

∗ IST Austria † University of Oxford

Abstract

iCaRL can learn classifiers and a representation incrementally over a long period of time where other methods quickly fail. Catastrophic forgetting is avoided thanks to a combination of a nearest-mean-of-exemplars classifier, herding for adaptive exemplar selection, and distillation for representation learning.

1) Motivation

Situation: class-incremental learning

Problem:
- classes appear sequentially
- train on new classes without forgetting the old ones
- no retraining from scratch: too expensive
- cannot keep all the data around

2) Class-Incremental Learning

We want to / we can (a protocol sketch follows below):
- for any number of observed classes, t, learn a multi-class classifier
- store a certain number, K, of images (a few hundred or thousand)

We do not want to / we cannot:
- add more and more resources: fixed-size network
- store all training examples (could be millions)

The dilemma:
- fixing the data representation: suboptimal results on new classes
- continuously improving the representation: classifiers for earlier classes deteriorate over time ("catastrophic forgetting/interference") [McCloskey, Cohen. 1989]
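To make the protocol above concrete, here is a minimal Python sketch of the class-incremental loop under the stated constraints (fixed-size model, at most K stored images). The callables `train_fn` and `select_exemplars_fn`, the data layout, and the budget handling are illustrative placeholders, not the paper's actual interface.

```python
from collections import defaultdict

K = 2000  # total exemplar budget, shared by all classes seen so far

def class_incremental_loop(class_batches, train_fn, select_exemplars_fn):
    """Run the class-incremental protocol: classes arrive in batches, the model
    is updated on the new data plus a bounded exemplar memory, and it is never
    retrained from scratch. `train_fn` and `select_exemplars_fn` stand in for
    the representation-learning and exemplar-selection steps."""
    model = None
    exemplars = defaultdict(list)          # class label -> stored images (ranked)
    for batch in class_batches:            # batch: dict {class_label: [images]}
        seen = list(exemplars) + list(batch)
        # 1) update the representation on new images plus old exemplars
        old_data = [(x, c) for c, xs in exemplars.items() for x in xs]
        new_data = [(x, c) for c, xs in batch.items() for x in xs]
        model = train_fn(model, new_data + old_data)
        # 2) shrink per-class storage so the total never exceeds K images
        m = K // len(seen)
        for c in exemplars:
            exemplars[c] = exemplars[c][:m]           # keep highest-ranked exemplars
        # 3) select exemplars for the newly seen classes
        for c, xs in batch.items():
            exemplars[c] = select_exemplars_fn(model, xs, m)
    return model, exemplars
```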

3) Existing Approaches

Fixed data representation:
- represent classes by mean feature vectors [Mensink et al., 2012], [Ristin et al., 2014]

Learning the data representation:
- grow the neural network incrementally, fixing the parts responsible for earlier class decisions [Mandziuk, Shastri. 1998], . . . , [Rusu et al., 2016]

- multi-task setting: preserve network activations by distillation [Li, Hoiem. 2016]

4) iCaRL

iCaRL component 1: exemplar-based classification (see the sketch below).
- nearest-mean-of-exemplars classifier
- automatic adjustment to representation change
- more robust than network outputs
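A minimal sketch of a nearest-mean-of-exemplars rule, assuming a `feature_fn` that maps an image to a feature vector (in iCaRL this would be the network's feature extractor) and a dictionary of stored exemplars per class; the names are illustrative.

```python
import numpy as np

def nearest_mean_of_exemplars(x, feature_fn, exemplars_by_class):
    """Classify x as the class whose exemplar-mean feature is closest.
    `feature_fn`: image -> feature vector (e.g. the network's penultimate layer).
    `exemplars_by_class`: class label -> list of stored exemplar images."""
    def normalize(v):
        return v / (np.linalg.norm(v) + 1e-12)

    phi_x = normalize(feature_fn(x))
    best_class, best_dist = None, np.inf
    for c, exemplars in exemplars_by_class.items():
        # class prototype = mean of the (normalized) exemplar features
        mu = normalize(np.mean([normalize(feature_fn(p)) for p in exemplars], axis=0))
        d = np.linalg.norm(phi_x - mu)
        if d < best_dist:
            best_class, best_dist = c, d
    return best_class
```

Since the class prototypes are recomputed from the stored exemplars with the current network, they move together with the representation, which is what makes the classifier adjust automatically when the representation changes.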

iCaRL component 2: representation learning (see the sketch below).
- add a distillation term to the loss function
- stabilizes outputs
- limited overhead: only one copy of the network is needed
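The sketch below shows one way such a distillation term can be added, assuming a network with one sigmoid output per class as described in the paper; the function names, target layout, and equal weighting of the two terms are assumptions of this sketch.

```python
import copy
import torch
import torch.nn.functional as F

def snapshot(net):
    """Keep a frozen copy of the network before training on a new class batch."""
    old_net = copy.deepcopy(net).eval()
    for p in old_net.parameters():
        p.requires_grad_(False)
    return old_net

def classification_plus_distillation_loss(net, old_net, images, one_hot_new, n_old):
    """Sigmoid/binary-cross-entropy loss with a distillation term: old-class
    outputs are pulled toward the frozen copy's outputs, new-class outputs
    toward the 0/1 targets in `one_hot_new` (columns for new classes only)."""
    logits = net(images)                                  # (batch, n_old + n_new)
    with torch.no_grad():
        old_targets = torch.sigmoid(old_net(images))[:, :n_old]
    # distillation term: stabilize outputs for previously learned classes
    distill = F.binary_cross_entropy_with_logits(logits[:, :n_old], old_targets)
    # classification term: ordinary targets for the new classes
    classif = F.binary_cross_entropy_with_logits(logits[:, n_old:], one_hot_new)
    return classif + distill
```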

iCaRL component 3: exemplar selection (see the sketch below).
- greedy selection procedure by herding
- number of exemplars per class decreases as the number of classes increases
- ability to remove exemplars on the fly through a ranking of the exemplars
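A minimal sketch of the greedy herding step for a single class, assuming `features` is a NumPy array of already extracted feature vectors; the function name and array layout are illustrative.

```python
import numpy as np

def herding_selection(features, m):
    """Greedy herding: `features` is an (n, d) array of feature vectors for one
    class; returns indices of m exemplars, ranked so that any prefix keeps the
    mean of the selected features close to the true class mean."""
    mu = features.mean(axis=0)
    selected, running_sum = [], np.zeros_like(mu)
    for k in range(1, m + 1):
        # pick the example whose addition brings the selected mean closest to mu
        candidate_means = (running_sum + features) / k        # shape (n, d)
        distances = np.linalg.norm(mu - candidate_means, axis=1)
        distances[selected] = np.inf                          # do not reuse examples
        best = int(np.argmin(distances))
        selected.append(best)
        running_sum += features[best]
    return selected
```

Because the returned indices are ranked, shrinking a class's exemplar set later (as the per-class budget decreases) amounts to dropping entries from the end of the list.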

5) Experiments

CIFAR-100:
- 100 classes, in batches of 10
- 32-layer ResNet [He et al., 2015]
- evaluated by top-1 accuracy
- number of exemplars: 2000

ImageNet ILSVRC 2012:
- 1000 classes, in batches of 10
- 18-layer ResNet [He et al., 2015]
- evaluated by top-5 accuracy
- number of exemplars: 20000

Baselines:
- fixed representation: freeze the representation after the first batch of classes (sketched after this list)
- finetuning: ordinary NN learning, no freezing
- LwF: "Learning without Forgetting" [Li, Hoiem. 2016], use the network itself to classify
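As an illustration only, here is one way the "fixed representation" baseline could be set up in PyTorch: freeze everything except the output layer after the first batch of classes, and grow that layer as new classes arrive. The torchvision ResNet-18 and the attribute names are stand-ins; the paper uses its own 32-layer and 18-layer ResNets.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

# Sketch of the "fixed representation" baseline: after training on the first
# batch of classes, freeze the feature extractor and from then on only train
# the (growing) output layer.
net = resnet18(num_classes=10)          # first batch: 10 classes

def freeze_representation(net):
    for name, param in net.named_parameters():
        if not name.startswith("fc."):  # keep only the final linear layer trainable
            param.requires_grad_(False)

def grow_classifier(net, n_new):
    """Add output units for a new batch of classes while keeping the old weights."""
    old_fc = net.fc
    new_fc = nn.Linear(old_fc.in_features, old_fc.out_features + n_new)
    with torch.no_grad():
        new_fc.weight[: old_fc.out_features] = old_fc.weight
        new_fc.bias[: old_fc.out_features] = old_fc.bias
    net.fc = new_fc
```

The finetuning baseline corresponds to the same loop without the freezing step, i.e. all parameters keep being updated on each new batch of classes.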

6) Results

[Figure: results on CIFAR-100 with 2, 5 and 10 classes per batch]

[Figure: results on ILSVRC 2012 (small) and ILSVRC 2012 (full)]

Discussion:
- as expected, fixed representation and finetuning do not work well
- iCaRL keeps good classification accuracy over many iterations
- "Learning without Forgetting" starts to forget earlier

Confusion matrices for CIFAR-100

Discussion:
- iCaRL: predictions spread homogeneously
- LwF.MC: prefers recently seen classes → long-term memory loss
- fixed representation: prefers the first batch of classes → lack of neural plasticity
- finetuning: predicts only classes from the last batch → catastrophic forgetting
