
Page 1: Invariance and Stability of Deep Convolutional Representations

Invariance and Stability of Deep Convolutional Representations

Alberto Bietti, Julien Mairal

Univ. Grenoble Alpes, Inria

Presented by Liqun Chen

Jan 11th, 2019

1

Page 2: Invariance and Stability of Deep Convolutional Representations

Outline

1 Introduction

2 Notation and basic mathematical tools

3 Construction of the Multilayer Convolutional Kernel Network (CKN)
      Patch extraction operator
      Kernel mapping operator
      Pooling operator
      Multilayer construction

4 Stability to deformations

5 Link with CNN

2

Page 3: Invariance and Stability of Deep Convolutional Representations

Introduction

Outline

1 Introduction

2 Notation and basic mathematical tools

3 Construction of the Multilayer Convolutional Kernel Network (CKN)
      Patch extraction operator
      Kernel mapping operator
      Pooling operator
      Multilayer construction

4 Stability to deformations

5 Link with CNN

3

Page 4: Invariance and Stability of Deep Convolutional Representations

Introduction

Introduction

Motivation

Understanding the geometry of the functional spaces induced by deep convolutional architectures is a fundamental question.

Representations that are stable to small deformations lead to robust models that can exploit these invariances, potentially with reduced sample complexity.

Related work

The scattering transform is a recent attempt to characterize convolutional multilayer architectures based on wavelets.

Scattering transform networks do not involve “learning”, since the filters of the networks are pre-defined.

4

Page 5: Invariance and Stability of Deep Convolutional Representations

Introduction

Contribution of this work

This paper studies the translation-invariance of the kernel representation and its stability to the action of diffeomorphisms, obtaining guarantees similar to those of the scattering transform while preserving signal information.

5

Page 6: Invariance and Stability of Deep Convolutional Representations

Notation and basic mathematical tools

Outline

1 Introduction

2 Notation and basic mathematical tools

3 Construction of the Multilayer Convolutional Kernel Network (CKN)
      Patch extraction operator
      Kernel mapping operator
      Pooling operator
      Multilayer construction

4 Stability to deformations

5 Link with CNN

6

Page 7: Invariance and Stability of Deep Convolutional Representations

Notation and basic mathematical tools

Notation and basic mathematical tools (I)

1 A positive definite kernel K that operates on a set X implicitly defines a reproducing kernel Hilbert space (RKHS) H of functions from X to R, along with a mapping φ : X → H;

2 A predictive model associates to every point z in X a label in R. It consists of a linear function f in H such that f(z) = 〈f, φ(z)〉H, where φ(z) is the data representation.

3 Given two points z, z′ ∈ X, the Cauchy–Schwarz inequality allows us to control the variation of the model f: |f(z) − f(z′)| ≤ ‖f‖H ‖φ(z) − φ(z′)‖H. If φ(z) and φ(z′) are close to each other in the RKHS norm, the model outputs similar predictions, provided the model f has a reasonably small norm in H.
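As a quick numerical sanity check of this bound (my own toy sketch, not from the slides), take a Gaussian kernel and a kernel model f = Σi αi K(xi, ·); then ‖f‖H² = αᵀKα and ‖φ(z) − φ(z′)‖H² = K(z,z) + K(z′,z′) − 2K(z,z′), so the inequality can be verified directly. All data and the bandwidth below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b, gamma=0.5):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# A kernel model f = sum_i alpha_i K(x_i, .) with random anchors and weights.
X = rng.normal(size=(20, 5))
alpha = rng.normal(size=20)
norm_f = np.sqrt(alpha @ rbf(X, X) @ alpha)            # ||f||_H

z, zp = rng.normal(size=(2, 5))
lhs = abs(alpha @ rbf(X, z[None]) - alpha @ rbf(X, zp[None])).item()  # |f(z) - f(z')|

# ||phi(z) - phi(z')||_H computed through the kernel trick.
dphi = np.sqrt(rbf(z[None], z[None]) + rbf(zp[None], zp[None])
               - 2 * rbf(z[None], zp[None])).item()

print(lhs <= norm_f * dphi + 1e-12)                    # the Cauchy-Schwarz bound holds
```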

7

Page 8: Invariance and Stability of Deep Convolutional Representations

Notation and basic mathematical tools

Notation and basic mathematical tools (II)

1 A signal x is a function in L2(Ω, H), where Ω is a subset of R^d representing spatial coordinates.

2 Given a linear operator T : L2(Ω, H) → L2(Ω, H′), the operator norm is defined as ‖T‖_{L2(Ω,H)→L2(Ω,H′)} := sup_{‖x‖_{L2(Ω,H)} ≤ 1} ‖Tx‖_{L2(Ω,H′)}.

3 For simplicity, | · | denotes the Euclidean norm on R^d and ‖ · ‖ the Hilbert space norm.
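A small illustration of the sup definition (my own addition): for a finite-dimensional linear operator, the operator norm equals the largest singular value, and the supremum over random unit vectors can only approach it from below.

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.normal(size=(6, 4))                    # a linear operator R^4 -> R^6

# Operator norm = sup_{||x|| <= 1} ||T x||, i.e. the largest singular value of T.
op_norm = np.linalg.svd(T, compute_uv=False)[0]

xs = rng.normal(size=(10000, 4))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)        # random unit vectors
empirical_sup = np.linalg.norm(xs @ T.T, axis=1).max() # sup over the finite sample

print(empirical_sup <= op_norm + 1e-12)                # never exceeds the true norm
```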

8

Page 9: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN)

Outline

1 Introduction

2 Notation and basic mathematical tools

3 Construction of the Multilayer Convolutional Kernel Network (CKN)
      Patch extraction operator
      Kernel mapping operator
      Pooling operator
      Multilayer construction

4 Stability to deformations

5 Link with CNN

9

Page 10: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN)

Model

[Figure 1 (not reproduced): the multilayer construction, alternating patch extraction Pk, kernel mapping Mk, and pooling Ak.]

10

Page 11: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN)

Framework of the model

As shown in Figure 1, a new map xk is built from the previous one xk–1 by applying successively three operators that perform patch extraction (Pk), kernel mapping (Mk) in a new RKHS Hk, and linear pooling (Ak), respectively. When going up in the hierarchy, the points xk(u) carry information from larger signal neighborhoods centered at u in Ω, with more invariance, as we will formally show.

11

Page 12: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN): Patch extraction operator

Patch extraction operator

12

Page 13: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN): Patch extraction operator

Patch extraction operator

Given the layer xk–1, we consider a patch shape Sk, defined as a compact centered subset of Ω, e.g., a box.

We define the Hilbert space Pk := L2(Sk, Hk–1), equipped with the norm ‖z‖² = ∫_{Sk} ‖z(u)‖² dνk(u) for every z in Pk, where dνk is the normalized uniform measure on Sk.

We define the (linear) patch extraction operator Pk : L2(Ω, Hk–1) → L2(Ω, Pk) such that, for all u in Ω,

Pk xk–1(u) = (v ↦ xk–1(u + v))_{v∈Sk} ∈ Pk.

Note that, because Pk is equipped with a normalized measure, Fubini's theorem gives ‖Pk xk–1‖ = ‖xk–1‖, and hence Pk xk–1 is in L2(Ω, Pk).
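To make this concrete, here is a toy numpy sketch of Pk (my own illustration, not the authors' code) for a 1-D discrete signal with H = R: patches of size s are extracted around each position with periodic boundary conditions, and the 1/s weight plays the role of the normalized measure νk, so that ‖Pk x‖ = ‖x‖ as in the Fubini argument above.

```python
import numpy as np

def extract_patches(x, s):
    """P x: at each position u, the centered patch (x[u+v])_{v in S} of size s,
    with periodic boundaries and each coordinate weighted by sqrt(1/s)."""
    n = len(x)
    offsets = np.arange(s) - s // 2                # centered patch shape S
    idx = (np.arange(n)[:, None] + offsets) % n
    return x[idx] / np.sqrt(s)

rng = np.random.default_rng(0)
x = rng.normal(size=128)
Px = extract_patches(x, s=9)

# Norm preservation: sum_u (1/s) sum_v x[u+v]^2 = ||x||^2 (Fubini with the
# normalized uniform measure on S).
print(np.allclose(np.sum(Px ** 2), np.sum(x ** 2)))
```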

13

Page 14: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN): Kernel mapping operator

Kernel mapping operator

14

Page 15: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN): Kernel mapping operator

Kernel mapping operator

Then, we map each patch of xk–1 to an RKHS Hk using the kernel mapping φk : Pk → Hk associated with a positive definite kernel Kk that operates on patches.

We can define the non-linear pointwise operator Mk such that for all u in Ω,

MkPkxk–1(u) := φk(Pkxk–1(u)) ∈ Hk.

This paper uses homogeneous dot-product kernels of the form:

Kk(z, z′) = ‖z‖ ‖z′‖ κk( 〈z, z′〉 / (‖z‖ ‖z′‖) ) = 〈φk(z), φk(z′)〉,   (1)

where κk(u) = Σ_{j=0}^{∞} bj u^j with bj ≥ 0 and κk(1) = 1, which ensures that ‖Mk Pk xk–1(u)‖ = ‖Pk xk–1(u)‖ and that Mk Pk xk–1 is in L2(Ω, Hk).
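For illustration (my own sketch), a valid choice is κk(u) = exp(u − 1): its Taylor coefficients bj = e^{−1}/j! are nonnegative and κk(1) = 1, so the resulting homogeneous kernel satisfies Kk(z, z) = ‖z‖², which is exactly the norm-preservation property used above.

```python
import numpy as np

def homogeneous_kernel(z, zp, kappa=lambda u: np.exp(u - 1.0)):
    """K(z, z') = ||z|| ||z'|| kappa(<z, z'> / (||z|| ||z'||)), for nonzero patches."""
    nz, nzp = np.linalg.norm(z), np.linalg.norm(zp)
    return nz * nzp * kappa(np.dot(z, zp) / (nz * nzp))

rng = np.random.default_rng(0)
z, zp = rng.normal(size=(2, 25))                      # two flattened patches

print(np.isclose(homogeneous_kernel(z, z), np.dot(z, z)))  # K(z, z) = ||z||^2
print(homogeneous_kernel(z, zp))                           # similarity between patches
```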

15

Page 16: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN): Kernel mapping operator

Kernel mapping operator

Convolutional Kernel Networks approximation

Approximate φk(z) by its projection onto span(φk(z1), ..., φk(zp))

Leads to tractable, p-dimensional representation ψk(z)

Anchor points z1, . . . , zp can be learned from data (K-means or backprop)
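A minimal sketch of this projection step, assuming the standard Nyström construction (not necessarily the exact recipe of the slides): with anchor points Z = {z1, ..., zp}, the projection of φ(z) onto span(φ(z1), ..., φ(zp)) has coordinates ψ(z) = K_ZZ^{−1/2} kZ(z) in an orthonormal basis of that span, so that 〈ψ(z), ψ(z′)〉 approximates K(z, z′). A Gaussian kernel and random anchors are used here purely for illustration.

```python
import numpy as np

def kernel(A, B, gamma=0.5):
    """Toy Gaussian kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
Z = rng.normal(size=(10, 5))                 # p = 10 anchor points (e.g. from K-means)

# Inverse square root of K_ZZ via its eigendecomposition (K_ZZ is PSD).
w, V = np.linalg.eigh(kernel(Z, Z) + 1e-8 * np.eye(len(Z)))
Kzz_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def psi(x):
    """Finite-dimensional feature map psi(x) = K_ZZ^{-1/2} k_Z(x)."""
    return Kzz_inv_sqrt @ kernel(Z, x[None]).ravel()

x, y = rng.normal(size=(2, 5))
print(psi(x) @ psi(y), kernel(x[None], y[None]).item())   # Nystrom approximation of K(x, y)
```

In the convolutional kernel network papers, the anchor points play the role of the learned filters; as the bullet above notes, they can come from K-means on patches or from backpropagation.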

16

Page 17: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN): Pooling operator

Pooling operator

17

Page 18: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN): Pooling operator

Pooling operator

The last step to build the layer xk consists of pooling neighboring values to achieve local shift-invariance.

We apply a linear convolution operator Ak with a Gaussian filter at scale σk, hσk(u) := σk^{−d} h(u/σk), where h(u) = (2π)^{−d/2} exp(−|u|²/2).

Then, for all u in Ω,

xk(u) = Ak Mk Pk xk–1(u) = ∫_{R^d} hσk(u − v) Mk Pk xk–1(v) dv ∈ Hk,   (2)

Applying Schur's test, we obtain ‖Ak‖ ≤ 1. Thus, xk is in L2(Ω, Hk), with ‖xk‖ = ‖Ak Mk Pk xk–1‖ ≤ ‖Mk Pk xk–1‖.
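A discrete 1-D check of this non-expansiveness (my own sketch): a Gaussian pooling filter normalized to sum to one defines a circular convolution whose operator norm is at most 1, so pooling can only shrink the signal norm, mirroring what Schur's test gives in the continuous setting.

```python
import numpy as np

def gaussian_filter(sigma):
    """Discrete Gaussian pooling filter h_sigma, normalized to sum to 1."""
    r = int(4 * sigma)
    u = np.arange(-r, r + 1)
    h = np.exp(-u ** 2 / (2 * sigma ** 2))
    return h / h.sum()

def pool(x, sigma):
    """A x: circular convolution of x with the normalized Gaussian filter."""
    h = gaussian_filter(sigma)
    r = len(h) // 2
    offsets = np.arange(-r, r + 1)
    idx = (np.arange(len(x))[:, None] - offsets[None, :]) % len(x)
    return x[idx] @ h

rng = np.random.default_rng(0)
x = rng.normal(size=256)
print(np.linalg.norm(pool(x, sigma=3.0)) <= np.linalg.norm(x))   # ||A x|| <= ||x||
```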

18

Page 19: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN): Pooling operator

Recap

19

Page 20: Invariance and Stability of Deep Convolutional Representations

Construction of the Multilayer Convolutional Kernel Network (CKN): Multilayer construction

Multilayer construction

Finally, we obtain a multilayer representation by composing the previous operators multiple times. In order to increase invariance with each layer, the size of the patch Sk and the pooling scale σk grow exponentially with k, with σk and the patch size sup_{c∈Sk} |c| of the same order. With n layers, the map xn may then be written

Φn(x0) := xn = An Mn Pn An–1 Mn–1 Pn–1 · · · A1 M1 P1 x0 ∈ L2(Ω, Hn).   (3)
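Putting the three operators together, here is a toy end-to-end sketch (my own illustration, not the authors' code) of two layers xk = Ak Mk Pk xk−1 on a 1-D signal, where the exact kernel map φk is replaced by a Nyström feature map built on the homogeneous kernel with κ(u) = exp(u − 1); all sizes, anchors, and scales below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_patches(z, s):
    """P_k: centered patches of size s (periodic), flattened over channels, weight 1/sqrt(s)."""
    n = len(z)
    idx = (np.arange(n)[:, None] + (np.arange(s) - s // 2)) % n
    return z[idx].reshape(n, -1) / np.sqrt(s)

def hom_kernel(A, B):
    """Homogeneous dot-product kernel with kappa(u) = exp(u - 1), between rows of A and B."""
    na, nb = np.linalg.norm(A, axis=1), np.linalg.norm(B, axis=1)
    cos = (A @ B.T) / np.outer(na, nb)
    return np.outer(na, nb) * np.exp(cos - 1.0)

def kernel_map(patches, anchors):
    """M_k: Nystrom features psi(z) = K_ZZ^{-1/2} k_Z(z), applied to every patch."""
    w, V = np.linalg.eigh(hom_kernel(anchors, anchors) + 1e-8 * np.eye(len(anchors)))
    return hom_kernel(patches, anchors) @ (V @ np.diag(1.0 / np.sqrt(w)) @ V.T)

def gaussian_pool(z, sigma):
    """A_k: channel-wise circular convolution with a normalized Gaussian filter."""
    r = int(4 * sigma)
    u = np.arange(-r, r + 1)
    h = np.exp(-u ** 2 / (2 * sigma ** 2)); h /= h.sum()
    idx = (np.arange(len(z))[:, None] - u[None, :]) % len(z)
    return np.einsum('uvc,v->uc', z[idx], h)

def layer(x, patch_size, anchors, sigma):
    return gaussian_pool(kernel_map(extract_patches(x, patch_size), anchors), sigma)

x0 = rng.normal(size=(128, 1))                                      # input signal, one channel
x1 = layer(x0, patch_size=5, anchors=rng.normal(size=(8, 5)), sigma=2.0)
x2 = layer(x1, patch_size=9, anchors=rng.normal(size=(16, 9 * 8)), sigma=4.0)
print(x1.shape, x2.shape)                                           # (128, 8) then (128, 16)
```

Note how the patch size and pooling scale grow from the first layer to the second, echoing the exponential growth prescribed above.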

20

Page 21: Invariance and Stability of Deep Convolutional Representations

Stability to deformations

Outline

1 Introduction

2 Notation and basic mathematical tools

3 Construction of the Multilayer Convolutional Kernel Network (CKN)
      Patch extraction operator
      Kernel mapping operator
      Pooling operator
      Multilayer construction

4 Stability to deformations

5 Link with CNN

21

Page 22: Invariance and Stability of Deep Convolutional Representations

Stability to deformations

Stability to deformations: Definition

C¹ diffeomorphism: τ : Ω → Ω

action operator: Lτ x(u) = x(u − τ(u))

Representation Φ(·) is stable if:

‖Φ(Lτ x) − Φ(x)‖ ≤ (c1 ‖∇τ‖∞ + c2 ‖τ‖∞) ‖x‖,

where c1, c2 are two constants, ∇τ is the Jacobian of τ, ‖∇τ‖∞ = sup_{u∈Ω} ‖∇τ(u)‖, and ‖τ‖∞ = sup_{u∈Ω} |τ(u)|.

translation invariance: recovered when the second term vanishes (c2 → 0)
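To make the action operator concrete, a toy 1-D sketch (mine, using linear interpolation on a periodic grid) that applies Lτ x(u) = x(u − τ(u)) for a smooth deformation field τ and estimates ‖τ‖∞ and ‖∇τ‖∞ on the grid; the signal and τ are made up.

```python
import numpy as np

n = 256
u = np.linspace(0.0, 1.0, n, endpoint=False)
x = np.sin(2 * np.pi * 3 * u) + 0.5 * np.cos(2 * np.pi * 7 * u)   # a toy signal on [0, 1)

tau = 0.01 * np.sin(2 * np.pi * u)            # small, smooth deformation field
Ltau_x = np.interp(u - tau, u, x, period=1.0) # L_tau x(u) = x(u - tau(u)), periodic interp

tau_inf = np.max(np.abs(tau))                        # ||tau||_inf
grad_tau_inf = np.max(np.abs(np.gradient(tau, u)))   # ||grad tau||_inf (finite differences)
print(tau_inf, grad_tau_inf)                         # here ||grad tau||_inf < 1/2
```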

22

Page 23: Invariance and Stability of Deep Convolutional Representations

Stability to deformations

Stability results

Theorem

Let Φ(x) be a representation given by Φ(x) = Φn(A0x). If ‖∇τ‖∞ ≤ 1/2,

we have:

‖Φ(Lτ x) − Φ(x)‖ ≤ ( c1 (1 + n) ‖∇τ‖∞ + (c2/σn) ‖τ‖∞ ) ‖x‖.

Here we assume that the input signal is x0 = A0x, where A0 is an initial pooling operator used to control high frequencies. σn is the pooling scale of the last layer (reminder: it grows exponentially with the number of layers n). In particular, for a pure translation τ(u) = c we have ∇τ = 0, so the bound reduces to (c2/σn) |c| ‖x‖, which vanishes as σn grows: the representation becomes near translation-invariant.
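The bound's main message can be seen numerically with a crude one-layer analogue (my own toy, not the paper's representation): pooling a nonlinear pointwise feature of the signal at scale σ, a pure translation (so ∇τ = 0) moves the output less and less as σ grows.

```python
import numpy as np

def gaussian_pool(x, sigma):
    """Circular convolution with a normalized Gaussian filter of scale sigma."""
    r = int(4 * sigma)
    v = np.arange(-r, r + 1)
    h = np.exp(-v ** 2 / (2 * sigma ** 2)); h /= h.sum()
    idx = (np.arange(len(x))[:, None] - v[None, :]) % len(x)
    return x[idx] @ h

rng = np.random.default_rng(0)
x = rng.normal(size=512)
shift = 4                                       # a pure translation: grad(tau) = 0

for sigma in (1.0, 4.0, 16.0):
    phi = gaussian_pool(np.abs(x), sigma)                  # crude one-layer "representation"
    phi_shifted = gaussian_pool(np.abs(np.roll(x, shift)), sigma)
    rel = np.linalg.norm(phi_shifted - phi) / np.linalg.norm(x)
    print(f"sigma={sigma:5.1f}  relative distance={rel:.4f}")   # shrinks as sigma grows
```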

23

Page 24: Invariance and Stability of Deep Convolutional Representations

Link with CNN

Outline

1 Introduction

2 Notation and basic mathematical tools

3 Construction of the Multilayer Convolutional Kernel Network (CKN)
      Patch extraction operator
      Kernel mapping operator
      Pooling operator
      Multilayer construction

4 Stability to deformations

5 Link with CNN

24

Page 25: Invariance and Stability of Deep Convolutional Representations

Link with CNN

Link with CNN

CNN map construction:

CNN function fσ, with input image x0 ∈ L2(Ω, R^{p0}) with p0 channels.

feature maps at layer k, represented as a function zk ∈ L2(Ω, R^{pk})

a set of filters (w_k^i)_{i=1,...,pk} and a pointwise activation function δ

intermediate feature maps (before the pooling operation) zk, with i-th channel

z_k^i(u) = nk(u) δ( 〈w_k^i, Pk zk−1(u)〉 / nk(u) ).

Here Pk is the patch extraction operator and nk(u) = ‖Pk zk−1(u)‖.

Homogeneous activations: i.e., z ↦ ‖z‖ δ(〈g, z〉/‖z‖) for all g in Pk (see the sketch below).
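A minimal numpy sketch of one such layer on a 1-D, multi-channel signal (my own illustration, with δ = ReLU and made-up filter sizes): each output channel i equals nk(u) δ(〈w_k^i, Pk zk−1(u)〉 / nk(u)), which is 1-homogeneous in the patch.

```python
import numpy as np

def extract_patches(z, s):
    """P_k z: centered patches of size s (periodic boundaries), flattened over channels."""
    n = len(z)
    idx = (np.arange(n)[:, None] + (np.arange(s) - s // 2)) % n
    return z[idx].reshape(n, -1)

def homogeneous_cnn_layer(z, W, delta=lambda t: np.maximum(t, 0.0)):
    """z_k^i(u) = n_k(u) * delta(<w_k^i, P_k z(u)> / n_k(u)), with n_k(u) = ||P_k z(u)||."""
    P = extract_patches(z, s=W.shape[1] // z.shape[1])
    n_k = np.linalg.norm(P, axis=1, keepdims=True) + 1e-12    # guard against zero patches
    return n_k * delta((P @ W.T) / n_k)

rng = np.random.default_rng(0)
z0 = rng.normal(size=(64, 3))            # 1-D "image" with p0 = 3 channels
W1 = rng.normal(size=(8, 5 * 3))         # p1 = 8 filters over patches of size 5
z1 = homogeneous_cnn_layer(z0, W1)       # intermediate maps before pooling
print(z1.shape)                          # (64, 8)

# Homogeneity check: scaling the input by t > 0 scales the output by t.
print(np.allclose(homogeneous_cnn_layer(3.0 * z0, W1), 3.0 * z1))
```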

25