nuno vasconcelos ece depp,artment, ucsdpositive definite kernels it is not hard to show that all dot...

37
Dot-product kernels Nuno Vasconcelos ECE Department, UCSD

Upload: others

Post on 18-Apr-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Dot-product kernels

Nuno Vasconcelos ECE Department, UCSDp ,

Page 2: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Classificationa classification problem has two types of variables

• X - vector of observations (features) in the worldX - vector of observations (features) in the world• Y - state (class) of the world

Perceptron: classifier implements the linear decision rulep p

with [ ]g(x)sgn )( =xh bxwxg T +=)( w

appropriate when the classes arelinearly separable

b

x

xg )(

to deal with non-linear separabilitywe introduce a kernel

wb w

2

Page 3: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Kernel summary1. D not linearly separable in X, apply feature transformation Φ:X → Z,

such that dim(Z) >> dim(X)

2. computing Φ(x) too expensive:• write your learning algorithm in dot-product form• instead of Φ(xi) we only need Φ(xi)TΦ(xj) ∀ijinstead of Φ(xi), we only need Φ(xi) Φ(xj) ∀ij

3. instead of computing Φ(xi)TΦ(xj) ∀ij, define the “dot-product kernel”

)()(),( zxzxK T ΦΦ=

and compute K(xi,xj) ∀ij directly• note: the matrix

)()()(

⎥⎤

⎢⎡ M

is called the “kernel” or Gram matrix

⎥⎥⎥

⎦⎢⎢⎢

=M

LL ),( ji xxKK

3

4. forget about Φ(x) and use K(x,z) from the start!

Page 4: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Polynomial kernelsthis makes a significant difference when K(x,z) is easier to compute that Φ(x)TΦ(z)p ( ) ( )e.g., we have seen that

( ) TT zxzxzxK )()(),( 2ΦΦ== ( )

x

)()(),(

1 ⎞⎜⎛

ℜ→ℜΦ : with2dd

( )Tddddd

d

xxxxxxxxxxxxx

,,,,,,,, 2112111 LLLM →⎟⎟

⎠⎜⎜⎜

while K(x,z) has complexity O(d), Φ(x)TΦ(z) is O(d2)for K(x,z) = (xTz)k we go from O(d) to O(dk)

4

( , ) ( ) g ( ) ( )

Page 5: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Questionwhat is a good dot-product kernel?• intuitively a good kernel is one that maximizes the margin γ in• intuitively, a good kernel is one that maximizes the margin γ in

range space• however, nobody knows how to do this effectively

in practice:• pick a kernel from a library of known kernels• we talked about• we talked about

• the linear kernel K(x,z) = xTz• the Gaussian family 2zx −

• the polynomial family

σ),(zx

ezxK−

=

( )k

5

( ) { }L,2,1,1),( ∈+= kzxzxK kT

Page 6: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Question“this problem of mine is really asking for the kernel k’(x,z) = ...”( , )• how do I know if this is a dot-product kernel?

let’s start by the definitionDefinition: a mapping

k: X x X → ℜxxx

xx

xxx

xxxx

ooo

ooooo

oooo

x2 X

(x,y) → k(x,y)is a dot-product kernel if and only if

ooo x1

xxx

xx

xxx

xxxx

ooo

oo oo

ox1

Φ

H

k(x,y) = <Φ(x),Φ(y)>where Φ: X → H, H is a vector space and, <.,.> a dot-product in H

o o oo oox1

x3 x2xn

6

product in H

Page 7: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Vector spacesnote that both H and <.,.> can be abstract, not necessarily ℜdyDefinition: a vector space is a set H where • addition and scalar multiplication are defined and satisfy:

1) x+(x’+x’’) = (x+x’)+x” 5) λx ∈ H2) x+x’ = x’+x ∈ H 6) 1x = x3) 0 ∈ H, 0 + x = x 7) λ(λ’ x) = (λλ’)x) , ) ( ) ( )4) –x ∈ H, -x + x = 0 8) λ(x+x’) = λx + λx’

9) (λ+λ’)x = λx + λ’x

the canonical example is ℜd with standard vector additionthe canonical example is ℜ with standard vector addition and scalar multiplicationanother example is the space of mappings X → ℜ with

7

(f+g)(x) = f(x) + g(x) (λf)(x) = λf(x)

Page 8: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Bilinear formsto define dot-product we first need to recall the notion of a bilinear formDefinition: a bilinear form on a vector space H is a mapping

Q: H x H ℜQ: H x H → ℜ(x,x’) → Q(x,x’)

such that ∀ x,x’,x’’∈ Hsuch that ∀ x,x ,x ∈ H

i) Q[(λx+λx’),x”] = λQ(x,x”) + λ’Q(x’,x”)ii) Q[x”,(λx+λx’)] = λQ(x”,x) + λ’Q(x”,x’)) [ ( )] ( ) ( )

in ℜ d the canonical bilinear form is Q(x,x’) = xTAx’

8

if Q(x,x’) = Q(x’,x) ∀ x,x’∈ H, the form is symmetric

Page 9: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Dot productsDefinition: a dot-product on a vector space H is a symmetric bilinear formy

<.,.>: H x H → ℜ(x,x’) → <x,x’>

h th tsuch that

i) <x,x> ≥ 0, ∀ x∈ Hii) <x x> = 0 if and only if x = 0ii) <x,x> = 0 if and only if x = 0

note that for the canonical bilinear form in ℜ dnote that for the canonical bilinear form in ℜ<x,x> = xTAx

this means that A must be positive definite

9

this means that A must be positive definitexTAx > 0, ∀ x ≠ 0

Page 10: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Positive definite matricesrecall that (e.g. Linear Algebra and Applications, Strang)

Definition: each of the following is a necessary andDefinition: each of the following is a necessary and sufficient condition for a real symmetric matrix A to be (semi) positive definite:

i) TA ≥ 0 0i) xTAx ≥ 0, ∀ x ≠ 0 ii) all eigenvalues of A satisfy λi ≥ 0iii) all upper-left submatrices Ak have non-negative determinanti ) there is a matri R ith independent ro s s ch thativ) there is a matrix R with independent rows such that

A = RTR

l ft b t iupper left submatrices:

L 322212

3,12,11,1

32,11,1

2111 ⎥⎥⎤

⎢⎢⎡

=⎥⎤

⎢⎡

== aaaaaa

Aaa

AaA

10

3,32,31,3

3,22,21,232,21,2

21,11

⎥⎥⎦⎢

⎢⎣

⎥⎦

⎢⎣ aaa

aa

Page 11: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Positive definite matricesproperty iv) is particularly interesting• in ℜ d <x x> = xTAx is a dot-product kernel if and only if A is• in ℜ , <x,x> = x Ax is a dot-product kernel if and only if A is

positive definite• from iv) this holds if and only if there is R such that A = RTR• hence

<x,y> = xTAy = (Rx)T(Ry) = Φ(x)TΦ(y)with

Φ: ℜ d → ℜ d

x → Rx

i.e. the dot-product kernelp

k(x,z) = xTAz, (A positive definite)is the standard dot-product in the range space of the

11

is the standard dot product in the range space of the mapping Φ(x) = Rx

Page 12: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Notethere are positive semidefinite matrices

xTAx ≥ 0x Ax ≥ 0and positive definite matrices

xTAx > 0

we will work with semidefinite but, to simplify, will call definiteif we really need > 0 we will say “strictly positive definite”

12

Page 13: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Positive definite kernelshow do we define a positive definite function?Definition: a function k(x y) is a positive definite kernel onDefinition: a function k(x,y) is a positive definite kernel on X xX if ∀ l and ∀ {x1, ..., xl}, xi∈ X, the Gram matrix

⎤⎡ M

⎥⎥⎥

⎢⎢⎢

⎡=

M

LL

M

),( ji xxkK

is positive definite.Note: this implies that

⎥⎦⎢⎣ M

Note: this implies that• k(x,x) ≥ 0 ∀ x∈ X

• PD ∀ x,y∈ X⎤⎡ ),(),( yxkxxk (*)

13

yetc...⎥

⎤⎢⎣

⎡),(),(),(),(

yykxyky ( )

Page 14: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Positive definite kernelsthis proves some simple properties• a PD kernel is symmetrica PD kernel is symmetric

Xyxxykyxk ∈∀= ,),,(),( Proof:

since PD means symmetric (*) implies k(x,y) = k(y,x) ∀ x,y∈ X

• Cauchy-Schwarz inequality for kernels: if k(x,y) is a PD kernel, y q y ( y)then

Xyxyykxxkyxk ∈∀≤ ,),,(),(),( 2

Proof:from (*), and property iii) of PD matrices, the determinant of the 2x2 matrix of (*) is non-negative This means that

14

2x2 matrix of ( ) is non negative. This means that

k(x,x)k(y,y) – k(x,y)2 ≥ 0

Page 15: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Positive definite kernelsit is not hard to show that all dot product kernels are PDLemma 1: Let k(x y) be a dot product kernel Then k(x y)Lemma 1: Let k(x,y) be a dot-product kernel. Then k(x,y)is positive definiteproof: p• k(x,y) dot product kernel implies that• ∃Φ and some dot product <.,.> such that

k( ) <Φ( ) Φ( )>k(x,y) = <Φ(x),Φ(y)> • this implies that if:

• we pick any l, and any sequence {x1, ..., xl}, ⎤⎡ My y { 1 l}• and let K be the associated Gram matrix • then, for ∀ c ≠ 0 ⎥

⎥⎥

⎢⎢⎢

⎡=

M

LL

M

),( ji xxkK

15

Page 16: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Positive definite kernels

( ),= ∑ij

jijiT xxkccKcc

( ) ( ),ΦΦ= ∑ij

jiji

j

xxcc (k is dot product)

( ) ( ), ΦΦ= ∑∑j

jji

ii xcxc (<.,.> is a bilinear form)

( ) 02

≥Φ= ∑ ii

j

xc (from def of dot product)i product)

16

Page 17: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Positive definite kernelsthe converse is also true but more difficult to proveLemma 2: Let k(x y) x y∈ X be a positive definiteLemma 2: Let k(x,y), x,y∈ X, be a positive definite kernel. Then k(x,y) is a dot product kernelproof:p• we need to show that there is a transformation Φ, a vector space

H = Φ(X), and a dot product <.,.>* in H such that k(x y) <Φ(x) Φ(y)> x2k(x,y) = <Φ(x),Φ(y)>*

• we proceed in three steps1. construct a vector space H

x xx

x

xxx

x

xxx

x

ooo

o

oooo

oooo

x1

x2

Φ

X

p H

2. define the dot-product <.,.>* on H

3. show that k(x,y) = <Φ(x),Φ(y)>* holdsx xx

x

xxx

x

xxx

x

ooo

ooo

oox1

Φ

H

17

ooo

oo ox1

x3 x2xn

Page 18: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The vector space Hwe define H as the space spanned by linear combinations of k(.,xi)( , i)

H = ( )⎭⎬⎫

⎩⎨⎧

∈∀∀= ∑=

m

iiii Xxmxkff

1,,.,(.)|(.) α

notation: by k(.,xi) we mean a function of g(y) = k(y,xi) of y, xi is fixed.homework: check that H is a vector space• e.g. 2) ( )

⎪⎪⎫

= ∑ .,(.) xkfm

iiα ( )

( )∈+=+

⎪⎪⎭

⎪⎪⎬

= ∑

∑= (.)(.)'(.)'(.)

'.,(.)'

.,(.)

'

1

1 ffffxkf

xkf

m

ijj

iii

β

αH

18

⎪⎭=1i

Page 19: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Examplewhen we use the Gaussian kernel

2. ix−

k( ) i G i t d ith i I

2)(., σi

i exK−

=

k(.,xi) is a Gaussian centered on xi with covariance σIand

⎪⎫⎪⎧−m x i.

H =

is the space of all linear⎪⎭

⎪⎬⎫

⎪⎩

⎪⎨⎧

∀∀= ∑=

−m

iii xmeff

1

,,(.)|(.) 2 σα

is the space of all linear combinations of Gaussiansnote that these are not mixtures

e.g.

19

but close

Page 20: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The operator <.,.>*

if f(.) and g(.) ∈ H, with'mm

( ) ( )∑∑==

==11

'.,(.).,(.)m

jjj

m

iii xkgxkf βα (**)

we define the operator <.,.>* as

'

( )∑∑= =

=m

iji

m

jji xxkgf

1

'

1*

',, βα (***)

20

Page 21: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Examplewhen we use the Gaussian kernel

2. ix−

th t < > i i ht d f G i t

2)(., σi

i exK−

=

the operator <.,.>* is a weighted sum of Gaussian terms

∑∑−

−m xxm ji

f''

2

2

β

you can look at this as either:

∑∑= =

=i j

ji egf1 1

*

2, σβα

you can look at this as either:• a dot product in H (still need to prove this)• a non-linear measure of similarity in X, somewhat related to

21

ylikelihoods

Page 22: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The operator <.,.>*

important note: for f(.) and g(.) ∈ H, the operator <.,.>*

m m '

( )∑∑= =

=m

iji

m

jji xxkgf

1

'

1*

',, βα

has the property

( )xxkxkxk ')'()( = (****)

proof: just make

( )jiji xxkxkxk ,)(.,),(.,*

= ( )

p j

⎩⎨⎧

≠∀==≠∀==

jkik

kj

ki 0,10,1

ββαα

22

⎩ ≠∀ jkkj 0,1 ββ

Page 23: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The operator <.,.>*

assume that <.,.>* is a dot product in H (proof in moments))since

( )jiji xxkxkxk ,)(.,),(.,*

=

then, clearly

( )*

)(),(, jiji xxxxk ΦΦ=

with

( )*

)(),(, jiji

Φ: X → Hx → k( x)

i.e. the kernel is a dot-product on H, which results from the feature transformation Φ

x → k(.,x)

23

this proves Lemma 2

Page 24: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Examplewhen we use the Gaussian kernel• the point xi ∈ ℜd is mapped into the Gaussian G(x xi σI)

σ

2

),(ixx

i exxK−

−=

• the point xi ∈ ℜ is mapped into the Gaussian G(x,xi,σI)• H is the space of all functions that are linear combinations of

Gaussians• this has infinite dimension• the kernel is a dot product in H, and a non-linear

similarity on X

xx

x

x

xx

x

xx2

ℜd H*

x

xx

x

oo

o

o

o

o oo

o oo

ox1

xx

x

x

x

xx

x

xx

x

x

o

o

ooo

o Φ

*

24

o o

x3x2

xn

x oo

o

ooo

x1*

Page 25: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

In summaryto show that k(x,y), x,y∈ X, positive definite ⇒ k(x,y) is a dot product kernelpwe need to show that

( )m m '

( )∑∑= =

=i

jij

ji xxkgf1 1

*',, βα

is a dot product on

H = ( ) ⎬⎫

⎨⎧

∈∀∀= ∑m

Xxmxkff ( )|( ) αH =

this reduces to verifying the dot product conditions

( )⎭⎬

⎩⎨ ∈∀∀= ∑

=iiii Xxmxkff

1,,.,(.)|(.) α

25

y g p

Page 26: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The operator <.,.>*

1) is <.,.>* a bilinear form on H ?

by definition of f( ) and g( ) in (**)by definition of f(.) and g(.) in ( )

( ) ( )'

*'.,,.,, j

m

j

m

ii xkxkgf ∑∑= βα

on the other hand,*11 ji

∑∑==

m m '

( )

( )

∑∑= =

=

m m

m

iji

m

jji xxkgf

'

1 1*

',, βα from (***)

equality of the two left hand sides is the definition of

( ) ( )∑∑= =

=m

iji

m

jji xkxk

1*

1'.,,.,βα from (****)

26

q ybilinearity

Page 27: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The operator <.,.>*

2) is <.,.>* symmetric?note thatnote that

( )*

1

'

1*

,,', gfxxkfgm

iij

m

jji == ∑∑

= =

βα

if and only if k(xi,xj’) = k(xj’,xi) for all xi, xj’.b t thi f ll f th iti d fi it f k( )

1 1i j

but this follows from the positive definiteness of k(x,y)we have seen that a PD kernel is always symmetrich i t ihence, <.,.>* is symmetric

27

Page 28: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The operator <.,.>*

3) is <f,f>* ≥ 0, ∀ f ∈ H ?

by definition of f( ) in (**)by definition of f(.) in ( )

( ) αααα Kxxkff Tm

ji

m

ji == ∑∑*,,

where α ∈ ℜm and K is the Gram matrixsince k(x y) is positive definite K is positive definite by

i j= =1 1

since k(x,y) is positive definite, K is positive definite by definition and <f,f>* ≥ 0the only non-trivial part of the proof is to show that

(x)

<f,f>* = 0 ⇒ f = 0we need two more results

28

Page 29: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The operator <.,.>*

Lemma 3: <.,.>* is itself a positive definite kernel on H x H

proof:proof:• consider any sequence {f1, ..., fm}, fi ∈ H

• then

=∑ ∑∑*

*,, ffff

ij jjj

iiijiji γγγγ (by bilinearity of <.,.>*)

0

=*21

*

,ggj j

(for some g1,g2 ∈ H )

(by (x) )

• hence the Gram matrix is always PD and the kernel <.,.>* is PD

( y ( ) )

29

Page 30: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The operator <.,.>*

Lemma 4: ∀ f ∈ H, <k(.,x),f(.)>* = f(x)

proof:proof:

)(.,),(.,(.)),(.,*

xkxkfxk ii= ∑α (by (**) )

)(.,),(.,*

*

xkxki

ii

i

= ∑α (by bilinearity of <.,.>*)

)(),( xfxxki

ii == ∑α (by (****) )

30

Page 31: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

The operator <.,.>*

4) we are now ready to prove that <f,f>* = 0 ⇒ f = 0proof:proof:• since <.,.>* is a PD kernel (lemma 3) we can apply Cauchy-

Schwarz Xyxyykxxkyxk ∈∀≤ ,),,(),(),( 2 • using k(.,x) as x and f(.) as y this becomes

( )2

***),(.,,)(.,),(., fxkffxkxk ≥

• and using lemma 4

)(,),( 2*

xfffxxk ≥

• from which <f,f>* = 0 ⇒ f = 0

31

Page 32: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

In summarywe have shown that

( )∑∑= =

=m

iji

m

jji xxkgf

1

'

1*

',, βα

is a dot product on

( ) ⎬⎫

⎨⎧

∀∀∑m

Xxmxkff ( )|( )H =

and this shows that if k(x y) x y∈ X is a positive definite

( )⎭⎬

⎩⎨ ∈∀∀= ∑

=iiii Xxmxkff

1,,.,(.)|(.) α

and this shows that if k(x,y), x,y∈ X, is a positive definite kernel, then k(x,y) is a dot product kernel.since we had initially proven the converse, we have the

32

y p ,following theorem.

Page 33: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Dot product kernelsTheorem: k(x,y), x,y∈ X, is a dot-product kernel if and only if it is a positive definite kernely p

this is interesting because it allows us to check whether a k l i d t d t t!kernel is a dot product or not!• check if the Gram matrix is positive definite for all possible

sequences {x1, ..., xl}, xi∈ X

but the proof is much more interesting than this result aloneit actually gives us insight on what the kernel is doinglet’s summarize

33

Page 34: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Dot product kernelsa dot product kernel k(x,y), x,y∈ X:• applies a feature transformationapplies a feature transformation

Φ: X → Hx → k(.,x)

• to the vector space

H =

x → k(.,x)

( )⎭⎬⎫

⎩⎨⎧

∈∀∀= ∑m

iii Xxmxkff ,,.,(.)|(.) α

• where the kernel implements the dot product

( )⎭⎬

⎩⎨ ∑

=iiii

1,,,( )|( )

m m '

( )∑∑= =

=m

iji

m

jji xxkgf

1 1*

',, βα

34

Page 35: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

Dot product kernelsthe dot product

m m '

h th d i t

( )∑∑= =

=m

iji

m

jji xxkgf

1

'

1*

',, βα

has the reproducing property

)((.)),(.,*

xffxk =you can think of this as analog to the convolution with a Dirac deltawe will talk about this a lot in the coming lecturesfinally, <.,.>* is itself a positive definite kernel on H x H

35

Page 36: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

A good picture to rememberwhen we use the Gaussian kernel• the point xi ∈ ℜd is mapped into the Gaussian G(x xi σI)

σ

2

),(ixx

i exxK−

−=

• the point xi ∈ ℜ is mapped into the Gaussian G(x,xi,σI)• H is the space of all functions that are linear combinations of

Gaussiansth k l i d t d t i• the kernel is a dot product in H

• the dot product with one of the Gaussianshas the reproducing property

xx

x

x

xx

x

xx2

ℜd H*

x

xx

x

oo

o

o

o

o oo

o oo

ox1

xx

x

x

x

xx

x

xx

x

x

o

o

ooo

o Φ

*

36

o o

x3x2

xn

x oo

o

ooo

x1*

Page 37: Nuno Vasconcelos ECE Depp,artment, UCSDPositive definite kernels it is not hard to show that all dot product kernels are PD Lemma 1:Lemma 1: Let k(x y)k(x,y) be a dotbe a dot-product

37