Dynamics and generalization of LVQ, Birmingham, 09-12-05
3) Vector Quantization (VQ)
and Learning Vector Quantization (LVQ)
References
M. Biehl, A. Freking, G. Reents: Dynamics of on-line competitive learning. Europhysics Letters 38 (1997) 73-78.
M. Biehl, A. Ghosh, B. Hammer: Dynamics and generalization ability of LVQ algorithms. Journal of Machine Learning Research 8 (2007) 323-360.
and references in the latter.
Vector Quantization (VQ)
aim: representation of large amounts of data by (few) prototype vectors
example: identification and grouping of similar data in clusters
assignment of a feature vector ξ to the closest prototype w
(similarity or distance measure, e.g. Euclidean distance)
unsupervised competitive learning
• initialize K prototype vectors
• present a single example
• identify the closest prototype, i.e. the so-called winner
• move the winner even closer towards the example
intuitively clear, plausible procedure:
- places prototypes in areas with high density of data
- identifies the most relevant combinations of features
- (stochastic) on-line gradient descent with respect to the cost function
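The four steps above can be sketched in a few lines of plain Python (a minimal illustration, not taken from the slides; the function name, the cluster location and the fixed learning rate are my own choices):

```python
import random

def vq_step(prototypes, xi, eta):
    """One Winner-Takes-All step: move the closest prototype towards xi."""
    # squared Euclidean distance of xi to every prototype
    dists = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, xi)) for w in prototypes]
    s = dists.index(min(dists))                      # identify the winner
    # move the winner even closer towards the example
    prototypes[s] = [w_i + eta * (x_i - w_i) for w_i, x_i in zip(prototypes[s], xi)]
    return s

random.seed(0)
prototypes = [[random.gauss(0, 1) for _ in range(2)] for _ in range(3)]  # K = 3
for _ in range(500):                                 # present single examples
    xi = [random.gauss(2.0, 0.3), random.gauss(0.0, 0.3)]
    vq_step(prototypes, xi, eta=0.05)
# the winning prototype drifts into the high-density region around (2, 0)
```

Repeating the step with independently drawn examples is exactly the stochastic on-line descent referred to above.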
quantization error

H_VQ = Σ_{μ=1}^{P} Σ_{j=1}^{K} d(w_j, ξ^μ) Π_{k≠j} Θ( d(w_k, ξ^μ) - d(w_j, ξ^μ) )

i.e. each data point ξ^μ contributes its distance to the winner w_j; here: (squared) Euclidean distance d(w, ξ) = (ξ - w)²
aim: faithful representation (in general ≠ clustering)
Result depends on
- the number of prototype vectors
- the distance measure / metric used
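Since the product of Θ-functions simply selects the winner, the quantization error is the sum, over all data points, of the distance to the closest prototype. A direct transcription (helper name is mine):

```python
def quantization_error(prototypes, data):
    """H_VQ: sum over data points of the squared Euclidean distance
    to the closest (winning) prototype."""
    H = 0.0
    for xi in data:
        # min over prototypes realizes the Theta-product winner selection
        H += min(sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, xi)) for w in prototypes)
    return H

# two prototypes, three data points in 2-dim; each point lies at
# squared distance 1 from its winner, so H_VQ = 3
H = quantization_error([(0.0, 0.0), (4.0, 0.0)],
                       [(0.0, 1.0), (4.0, 1.0), (5.0, 0.0)])
```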
Learning Vector Quantization (LVQ)
classification: assignment of a vector ξ to the class of the closest prototype w
aim: generalization ability, i.e. classification of novel data after learning from examples
∙ identification of prototype vectors from labelled example data
∙ distance-based classification (e.g. Euclidean, Manhattan, …)
basic heuristic LVQ scheme, LVQ1 [Kohonen]:
• initialize prototype vectors for different classes
• present a single example
• identify the closest prototype, i.e. the so-called winner
• move the winner
  - closer towards the data (same class)
  - away from the data (different class)
update in the N-dim. feature space: w(t+1) = w(t) ± η (ξ(t) - w(t))
→ piecewise linear decision boundaries
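The LVQ1 prescription differs from unsupervised VQ only in the sign of the winner's update, which is set by comparing labels (a sketch; function and variable names are mine, not from the slides):

```python
def lvq1_step(prototypes, labels, xi, xi_label, eta):
    """LVQ1: move the winner towards xi if the labels agree, away otherwise."""
    dists = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, xi)) for w in prototypes]
    s = dists.index(min(dists))                       # the winner
    sign = 1.0 if labels[s] == xi_label else -1.0     # same class: attract
    prototypes[s] = [w_i + sign * eta * (x_i - w_i)
                     for w_i, x_i in zip(prototypes[s], xi)]
    return s

# one prototype per class; an example of class +1 whose winner carries label -1
prototypes = [[0.0, 0.0], [1.0, 0.0]]
labels = [+1, -1]
winner = lvq1_step(prototypes, labels, [1.0, 1.0], +1, eta=0.5)
# winner is prototype 1, which is pushed away from (1, 1) to [1.0, -0.5]
```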
LVQ algorithms
- frequently applied in a variety of practical problems
- plausible, intuitive, flexible
- fast, easy to implement
- often based on heuristic arguments, or on cost functions with unclear relation to generalization
- limited theoretical understanding of
  - dynamics and convergence properties
  - achievable generalization ability
here: analysis of LVQ algorithms w.r.t.
- dynamics of the learning process
- performance, i.e. generalization ability
- typical properties in a model situation
Model situation: two clusters of N-dimensional data
random vectors ξ ∈ ℝ^N drawn from a mixture of two Gaussians:

P(ξ) = Σ_{σ=±1} p_σ P(ξ | σ),   P(ξ | σ) = (2π v_σ)^{-N/2} exp[ -(ξ - ℓ B_σ)² / (2 v_σ) ]

prior weights of the classes: p_+, p_- with p_+ + p_- = 1
orthonormal center vectors: B_+, B_- ∈ ℝ^N with (B_±)² = 1, B_+ · B_- = 0
cluster distance ∝ ℓ
independent components with mean ⟨ξ_j⟩_σ = ℓ (B_σ)_j and variance ⟨ξ_j²⟩_σ - ⟨ξ_j⟩_σ² = v_σ
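A sample from this model can be drawn coordinate-wise, since the components are independent Gaussians. A sketch (I pick B_+ = e_1 and B_- = e_2 as a concrete orthonormal pair; the parameter values are the ones quoted on the next slide):

```python
import random

def draw_example(N, ell, p_plus, v_plus, v_minus):
    """Draw (xi, sigma) from the two-cluster model: class sigma = +/-1 with
    prior p_sigma, mean ell * B_sigma and isotropic variance v_sigma,
    where B_+ = e_1 and B_- = e_2 are orthonormal unit vectors."""
    sigma = +1 if random.random() < p_plus else -1
    v = v_plus if sigma == +1 else v_minus
    mean = [0.0] * N
    mean[0 if sigma == +1 else 1] = ell              # ell * B_sigma
    xi = [random.gauss(m, v ** 0.5) for m in mean]
    return xi, sigma

random.seed(1)
xi, sigma = draw_example(N=200, ell=1.0, p_plus=0.4, v_plus=1.44, v_minus=0.64)
# for large N the squared length concentrates around v_sigma * N
```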
high-dimensional data (formally N → ∞)
Figure: example data ξ^μ ∈ ℝ^N (N=200, ℓ=1, p_+=0.4, v_+=1.44, v_-=0.64); the projections y_± = B_± · ξ^μ into the plane of the center vectors B_+, B_- display the cluster structure, whereas the projections x_{1,2} = w_{1,2} · ξ^μ on two independent random directions w_{1,2} do not.
Dynamics of on-line training
sequence of new, independent random examples ξ^μ, σ^μ (μ = 1, 2, 3, …), drawn according to P(ξ | σ^μ) with prior weights p_σ
update of the two prototype vectors w_+, w_-:

w_s^μ = w_s^{μ-1} + (η/N) f_s(d_+^μ, d_-^μ, σ^μ) (ξ^μ - w_s^{μ-1}),   s = ±1

η: learning rate / step size; the modulation function f_s encodes the competition, the direction of the update (towards or away from the current data), etc.
example: LVQ1, original formulation [Kohonen], a Winner-Takes-All (WTA) algorithm:

f_s = Θ(d_{-s}^μ - d_s^μ) s σ^μ   with distances d_s^μ = (ξ^μ - w_s^{μ-1})²
Mathematical analysis of the learning dynamics
1. description in terms of a few characteristic quantities (here: ℝ^{2N} → ℝ^7)

   R_{sσ}^μ = w_s^μ · B_σ,   Q_{st}^μ = w_s^μ · w_t^μ

   projections into the (B_+, B_-)-plane; lengths and relative position of the prototypes
   → the algorithm translates into recursions for {R_{sσ}, Q_{st}}
2. average over the current example ξ^μ, a random vector according to P(ξ | σ) with avg. squared length ⟨ξ²⟩_σ ≈ v_σ N
   in the thermodynamic limit N → ∞, the projections

   x_s = w_s^{μ-1} · ξ^μ,   y_σ = B_σ · ξ^μ

   are correlated Gaussian random quantities, completely specified in terms of first and second moments (indices μ omitted):

   ⟨x_s⟩_σ = ℓ R_{sσ},   ⟨y_τ⟩_σ = ℓ δ_{τσ}
   ⟨x_s x_t⟩_σ - ⟨x_s⟩_σ ⟨x_t⟩_σ = v_σ Q_{st}
   ⟨x_s y_τ⟩_σ - ⟨x_s⟩_σ ⟨y_τ⟩_σ = v_σ R_{sτ}
   ⟨y_ρ y_τ⟩_σ - ⟨y_ρ⟩_σ ⟨y_τ⟩_σ = v_σ δ_{ρτ}
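The reduction from ℝ^{2N} to ℝ^7 is just a handful of dot products; a sketch (dictionary keys and the concrete B_± vectors are my own convention):

```python
def order_parameters(w_plus, w_minus, B_plus, B_minus):
    """Characteristic quantities of the two-prototype system:
    R_{s,sigma} = w_s . B_sigma and Q_{s,t} = w_s . w_t
    (7 numbers in total, since Q_{+-} = Q_{-+})."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    R = {(s, sig): dot(w, B)
         for s, w in ((+1, w_plus), (-1, w_minus))
         for sig, B in ((+1, B_plus), (-1, B_minus))}
    Q = {(+1, +1): dot(w_plus, w_plus),
         (-1, -1): dot(w_minus, w_minus),
         (+1, -1): dot(w_plus, w_minus)}
    return R, Q

B_plus, B_minus = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]     # orthonormal centers
R, Q = order_parameters([0.5, 0.0, 0.0], [0.0, 0.25, 0.0], B_plus, B_minus)
# R[(+1, +1)] = 0.5, R[(+1, -1)] = 0.0, Q[(-1, -1)] = 0.0625
```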
3. self-averaging property of the characteristic quantities
   the averaged recursions close in {R_{sσ}, Q_{st}}:
   - the quantities depend on the random sequence of example data
   - but their fluctuations vanish with N → ∞
   → the learning dynamics is completely described in terms of averages
   computer simulations (LVQ1), mean and variance of R_{++}(α=10) vs. 1/N:
   - mean results approach the theoretical prediction
   - the variance vanishes as N → ∞
4. continuous learning time α = μ/N: the number of examples, counted as learning steps per degree of freedom
   stochastic recursions → deterministic ODE; integration yields the evolution of the projections R_{sσ}(α), Q_{st}(α)
5. learning curve: generalization error ε_g(α) after training with αN examples,
   i.e. the probability for misclassification of a novel example:

   ε_g = p_+ ⟨Θ(d_+ - d_-)⟩_+ + p_- ⟨Θ(d_- - d_+)⟩_-
       = Σ_{σ=±1} p_σ Φ( [Q_{σσ} - Q_{-σ-σ} - 2ℓ (R_{σσ} - R_{-σσ})] / [2 √(v_σ) (Q_{++} - 2Q_{+-} + Q_{--})^{1/2}] )

   with the cumulative Gaussian distribution Φ.
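Given the order parameters, the learning curve follows by evaluating the misclassification probability along the ODE solution. A sketch of that evaluation (Φ is the standard normal CDF; the particular sign convention is my reading of the garbled slide and should be checked against the JMLR paper):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def eps_g(R, Q, ell, p_plus, v_plus, v_minus):
    """Generalization error eps_g = sum_sigma p_sigma * Phi(arg_sigma),
    expressed in the order parameters R[(s, sigma)], Q[(s, t)]."""
    denom_sq = Q[(+1, +1)] - 2.0 * Q[(+1, -1)] + Q[(-1, -1)]
    out = 0.0
    for sigma, p, v in ((+1, p_plus, v_plus), (-1, 1.0 - p_plus, v_minus)):
        num = (Q[(sigma, sigma)] - Q[(-sigma, -sigma)]
               - 2.0 * ell * (R[(sigma, sigma)] - R[(-sigma, sigma)]))
        out += p * Phi(num / (2.0 * sqrt(v) * sqrt(denom_sq)))
    return out

# sanity check: prototypes sitting exactly on the cluster centers,
# w_+ = ell*B_+ and w_- = ell*B_-, with ell = 1 and unit variances
R = {(+1, +1): 1.0, (+1, -1): 0.0, (-1, +1): 0.0, (-1, -1): 1.0}
Q = {(+1, +1): 1.0, (-1, -1): 1.0, (+1, -1): 0.0}
e = eps_g(R, Q, ell=1.0, p_plus=0.5, v_plus=1.0, v_minus=1.0)
# both class terms reduce to Phi(-1/sqrt(2)) ~ 0.24
```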
LVQ1: "The winner takes it all"
winner w_s: only the winner is updated, according to the class label:

w_s^μ = w_s^{μ-1} + (η/N) Θ(d_{-s}^μ - d_s^μ) s σ^μ (ξ^μ - w_s^{μ-1})

theory and simulation (N=100): p_+=0.8, v_+=4, v_-=9, ℓ=2.0, η=1.0, averaged over 100 indep. runs; initialization w_s(0) ≈ 0
projections Q_{++}, Q_{--}, Q_{+-} and R_{sσ} vs. α
self-averaging property: mean and variances of R_{++}(α=10) scale with 1/N
LVQ1 trajectories in the (B_+, B_-)-plane
Figure: projections R_{s±} of the prototypes w_+, w_- relative to the cluster centers ℓ B_+, ℓ B_-; (•) positions at α = 20, 40, 140; the optimal decision boundary and the asymptotic position of the LVQ1 boundary are marked.
theory and simulation (N=100): p_+=0.8, v_+=4, v_-=9, ℓ=2.0, η=1.0, averaged over 100 indep. runs; initialization w_s(0) ≈ 0
Learning curve
ε_g(α) for η = 2.0, 1.0, 0.2 (p_+ = 0.2, ℓ = 1.0, v_+ = v_- = 1.0)
- suboptimal, non-monotonic behavior for small η
- stationary state: ε_g(α → ∞) grows linearly with η
- well-defined asymptotics for η → 0, α → ∞ with (η α) → ∞
achievable generalization error: ε_g vs. p_+ for v_+ = v_- = 1.0 and for v_+ = 0.25, v_- = 0.81
― LVQ1 vs. the best linear boundary
LVQ2.1 [Kohonen]: here, update both the correct and the wrong winner:

w_s^μ = w_s^{μ-1} + (η/N) s σ^μ (ξ^μ - w_s^{μ-1}),   s = ±1

theory and simulation (N=100): p_+=0.8, ℓ=1, v_+=v_-=1, η=0.5, averages over 100 independent runs
the projections R_{sσ}, Q_{st} grow linearly with α; only certain combinations of them remain finite
problem: instability of the algorithm due to the repulsion of wrong prototypes
→ trivial classification for α → ∞: ε_g = min(p_+, p_-)
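The instability is easy to reproduce: since the wrong-class prototype is repelled on every step, nothing bounds its norm when the classes are unbalanced. A sketch with made-up data (all names and parameter values are mine):

```python
import random

def lvq21_step(prototypes, labels, xi, xi_label, eta):
    """LVQ2.1-like step: update both prototypes on every example,
    attracting the correct-class prototype and repelling the other."""
    for s, w in enumerate(prototypes):
        sign = 1.0 if labels[s] == xi_label else -1.0
        prototypes[s] = [w_i + sign * eta * (x_i - w_i) for w_i, x_i in zip(w, xi)]

random.seed(0)
prototypes = [[0.0, 0.0], [0.0, 0.0]]
labels = [+1, -1]
for _ in range(200):                       # unbalanced classes, p_+ = 0.8
    sigma = +1 if random.random() < 0.8 else -1
    xi = [random.gauss(sigma * 1.0, 1.0), random.gauss(0.0, 1.0)]
    lvq21_step(prototypes, labels, xi, sigma, eta=0.1)

norm = lambda w: sum(x * x for x in w) ** 0.5
# the minority-class prototype is repelled on ~80% of the steps and runs away,
# while the majority-class prototype stays near its cluster
```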
suggested strategy: selection of data in a window close to the current decision boundary
slows down the repulsion, but the system remains unstable
Early stopping: end the training process at minimal ε_g (idealized)
ε_g(α) for η = 2.0, 1.0, 0.5:
- pronounced minimum in ε_g(α), which depends on initialization and cluster geometry
- here, the lowest minimum value is reached for η → 0
ε_g vs. p_+ (v_+ = 0.25, v_- = 0.81): ― LVQ1, __ early stopping
Learning From Mistakes (LFM)
LVQ2.1-type update, but only if the current classification is wrong:

w_s^μ = w_s^{μ-1} + (η/N) s σ^μ Θ(d_{σ^μ}^μ - d_{-σ^μ}^μ) (ξ^μ - w_s^{μ-1})

crisp limit version of Soft Robust LVQ [Seo and Obermayer, 2003]
projected trajectory relative to ℓ B_+, ℓ B_- with projections R_{s+}, R_{s-} (p_+=0.8, ℓ=3.0, v_+=4.0, v_-=9.0)
learning curves ε_g(α) for η = 2.0, 1.0, 0.5: η-independent asymptotic ε_g (p_+=0.8, ℓ=1.2, v_+=v_-=1.0)
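LFM can be obtained from the LVQ2.1 step by gating it with the current nearest-prototype classification (a sketch; as before, the names are mine):

```python
def lfm_step(prototypes, labels, xi, xi_label, eta):
    """Learning From Mistakes: perform the LVQ2.1-like update of both
    prototypes only if the current nearest-prototype classification of xi
    is wrong; otherwise leave everything unchanged."""
    dists = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, xi)) for w in prototypes]
    winner = dists.index(min(dists))
    if labels[winner] == xi_label:          # correctly classified: no update
        return False
    for s, w in enumerate(prototypes):
        sign = 1.0 if labels[s] == xi_label else -1.0
        prototypes[s] = [w_i + sign * eta * (x_i - w_i) for w_i, x_i in zip(w, xi)]
    return True

# an example of class +1 whose winner carries label -1 triggers an update
prototypes = [[0.0, 0.0], [1.0, 0.0]]
labels = [+1, -1]
updated = lfm_step(prototypes, labels, [0.9, 0.0], +1, eta=0.5)
# both prototypes move: [0.0, 0.0] -> [0.45, 0.0] and [1.0, 0.0] -> [1.05, 0.0]
```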
Comparison: achievable generalization ability
ε_g vs. p_+ for equal cluster variances (v_+ = v_- = 1.0) and for unequal variances (v_+ = 0.25, v_- = 0.81)
― LVQ1, --- LVQ2.1 (early stopping), ·-· LFM, together with the best linear boundary and ― trivial classification
Vector Quantization
competitive learning, w_s is the winner:

w_s^μ = w_s^{μ-1} + (η/N) Θ(d_{-s}^μ - d_s^μ) (ξ^μ - w_s^{μ-1})

class membership is unknown, or identical for all data
numerical integration for w_s(0) ≈ 0 (p_+=0.2, ℓ=1.0, η=1.2)
Figure: ε_g(α) for VQ, LVQ+ and LVQ1; projections R_{++}, R_{+-}, R_{-+}, R_{--} vs. α (up to α = 300)
the system is invariant under exchange of the prototypes → weakly repulsive fixed points
interpretations:
- VQ: unsupervised learning, unlabelled data
- LVQ+: two prototypes of the same class, identical labels
- LVQ: different classes, but the labels are not used in training
asymptotics (η → 0, ηα → ∞): ε_g vs. p_+; for p_+ ≈ 0 (p_- ≈ 1):
- low quantization error
- but high generalization error ε_g
Summary
• a model scenario of LVQ training: two clusters, two prototypes; dynamics of on-line training
• comparison of algorithms (within the model):
  LVQ1: original formulation of LVQ, with close to optimal asymptotic generalization
  LVQ2.1: intuitive extension, creates instability → trivial (stationary) classification; + early stopping: potentially good performance, but practical difficulties, depends on initialization
  LFM: crisp limit of Soft Robust LVQ; stable behavior, but far from optimal generalization
  VQ: description of in-class competition
Outlook
• Self-Organizing Maps (SOM): neighborhood preserving SOM, Neural Gas (distance rank based)
• Generalized Relevance LVQ [e.g. Hammer & Villmann]: adaptive metrics, e.g. the distance measure

  d_λ(w_s, ξ) = Σ_{i=1}^{N} λ_i (w_{s,i} - ξ_i)²

  with relevances λ_i adapted during training
• applications
• multi-class, multi-prototype problems
• optimized procedures: learning rate schedules, variational approach, Bayes optimal on-line learning
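The adaptive metric replaces the Euclidean distance by a relevance-weighted one; a minimal sketch (the relevance values here are arbitrary, and their adaptation rule during training is not shown):

```python
def relevance_distance(w, xi, lam):
    """d_lambda(w, xi) = sum_i lambda_i * (w_i - xi_i)^2, with non-negative
    relevances lambda_i (typically normalized to sum_i lambda_i = 1)
    that are themselves adapted during training."""
    return sum(l * (w_i - x_i) ** 2 for l, w_i, x_i in zip(lam, w, xi))

w, xi = [0.0, 0.0], [1.0, 2.0]
# equal relevances reproduce the (scaled) Euclidean distance: 0.5*1 + 0.5*4 = 2.5
d_equal = relevance_distance(w, xi, [0.5, 0.5])
# relevance concentrated on the first feature ignores the second one entirely
d_first = relevance_distance(w, xi, [1.0, 0.0])
```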
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization (VQ)
aim
representation of large amounts
of data by (few) prototype vectors
example
identification and grouping
in clusters of similar data
assignment of feature vector to the closest prototype w
(similarity or distance measure
eg Euclidean distance )
Dynamics and generalization of LVQ Birmingham 09-12- 05
unsupervised competitive learning
bull initialize K prototype vectors
bull present a single example
bull identify the closest prototype ie the so-called winner
bull move the winner even closer towards the example
intuitively clear plausible procedure
- places prototypes in areas with high density of data
- identifies the most relevant combinations of features
- (stochastic) on-line gradient descent with respect to
the cost function
Dynamics and generalization of LVQ Birmingham 09-12- 05
quantization error
μj
μk
K
jk
P
1μj
μK
1jVQ ddΘ
2 wξH
μjdprototypes data wj is the winner
here
Euclidean distance
aim faithful representation (in general ne clustering )
Result depends on - the number of prototype vectors - the distance measure metric used
Dynamics and generalization of LVQ Birmingham 09-12- 05
bull identify the closest prototype ie the so-called winner
bull initialize prototype vectors for different classes
bull present a single example
bull move the winner - closer towards the data (same class)
- away from the data (different class)
classification
assignment of a vector to the class of the closest
prototype w
aim generalization ability
classification of novel data
after learning from examples
∙ identification of prototype vectors from labelled example data
∙ distance based classification (eg Euclidean Manhattan hellip)
basic heuristic LVQ scheme LVQ1 [Kohonen]
piecewise linear decision boundaries
Learning Vector Quantization
(t)wξ(t)w1tw (t)w
η
N-dimfeature space
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ algorithms
- frequently applied in a variety
of practical problems
- plausible intuitive flexible
- fast easy to implement
- often based on heuristic arguments
or cost functions with unclear relation to generalization
- limited theoretical understanding of
- dynamics and convergence properties
- achievable generalization ability
here analysis of LVQ algorithms wrt
- dynamics of the learning process
- performance ie generalization ability
- typical properties in a model situation
Dynamics and generalization of LVQ Birmingham 09-12- 05
Model situation two clusters of N-dimensional data
random vectors isin ℝN according to σ)P(p )P(1σ
σ ξξ
2σ
σN2
σ
- v 2
1exp
v 2π
1σ)P( Βξξ mixture of two Gaussians
orthonormal center vectors
B+ B- isin ℝN ( B )2 =1 B+ B- =0
prior weights of classes p+ p-
p+ + p- = 1
B+
B-
(p+)
(p-)
cluster distance prop ℓ ℓ
jj Bσσξ
σσσvξξ
22jj
indep components with
and variance
ℝN
Dynamics and generalization of LVQ Birmingham 09-12- 05
high-dimensional data (formally Ninfin)
ξμ isinℝN N=200 ℓ=1 p+=04 v+=144 v-=064μ
B
yξ
( 240)( 160)
projections into the plane of center vectors B+ B-
μ By ξ
μ 2
2xξ
w
projections on two independent random directions w12
μ 11x ξw
Dynamics and generalization of LVQ Birmingham 09-12- 05
Dynamics of on-line training
sequence of new independent random examples 123μσμμ ξ
drawn according to μμσ σPp μ ξ
learning ratestep size
competitiondirection ofupdate etc
change of prototypetowards or away from the current data
example
LVQ1 original formulation [Kohonen]
Winner-Takes-All (WTA) algorithm
μs
μs
μs d d σS f
1-μs
μμμs-
μss
1-μs
μs σSddf
N
ηwξww 21
μs
μμs
μ
d
1σS
wξ
update of two prototype vectors w+ w-
Dynamics and generalization of LVQ Birmingham 09-12- 05
algorithm recursions
Mathematical analysis of the learning dynamics
11 σtsμt
μs
μstσ
μs
μsσ QBR www
projections into the (B+ B- )-plane
length and relativeposition of prototypes
1 description in terms of a few characteristic quantitities
( here ℝ2N ℝ7 )
2 average over the current example
random vector according to avg lengthσ)|P( μξ 22 vN σσ
ξ
in the thermodynamic limit N μμ
μ1-μs
μs
By
wx
ξ
ξ
correlated Gaussian random quantities
completely specified in terms of first and second moments (wo indices μ)
sσσ
N
1jjsσs R x
jw stσσtσsσt s Qv xx- xx
sσσσsσ s Rv yx- yx σσσσv yy- yy
sσσ y
Dynamics and generalization of LVQ Birmingham 09-12- 05
averaged recursions closed in p σ1σ
σ
μsσ
μst R Q
- depend on the random sequence of example data
- their fluctuations vanish with N
learning dynamics is completely described in terms of averages
3 self-averaging property of characteristic quantities
μsσ
μst R Q
1N
(mean and variance)
R++ (α=10) computer simulations (LVQ1)
- mean results approach theoretical prediction- variance vanishes as N
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
unsupervised competitive learning
bull initialize K prototype vectors
bull present a single example
bull identify the closest prototype ie the so-called winner
bull move the winner even closer towards the example
intuitively clear plausible procedure
- places prototypes in areas with high density of data
- identifies the most relevant combinations of features
- (stochastic) on-line gradient descent with respect to
the cost function
Dynamics and generalization of LVQ Birmingham 09-12- 05
quantization error
μj
μk
K
jk
P
1μj
μK
1jVQ ddΘ
2 wξH
μjdprototypes data wj is the winner
here
Euclidean distance
aim faithful representation (in general ne clustering )
Result depends on - the number of prototype vectors - the distance measure metric used
Dynamics and generalization of LVQ Birmingham 09-12- 05
bull identify the closest prototype ie the so-called winner
bull initialize prototype vectors for different classes
bull present a single example
bull move the winner - closer towards the data (same class)
- away from the data (different class)
classification
assignment of a vector to the class of the closest
prototype w
aim generalization ability
classification of novel data
after learning from examples
∙ identification of prototype vectors from labelled example data
∙ distance based classification (eg Euclidean Manhattan hellip)
basic heuristic LVQ scheme LVQ1 [Kohonen]
piecewise linear decision boundaries
Learning Vector Quantization
(t)wξ(t)w1tw (t)w
η
N-dimfeature space
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ algorithms
- frequently applied in a variety
of practical problems
- plausible intuitive flexible
- fast easy to implement
- often based on heuristic arguments
or cost functions with unclear relation to generalization
- limited theoretical understanding of
- dynamics and convergence properties
- achievable generalization ability
here analysis of LVQ algorithms wrt
- dynamics of the learning process
- performance ie generalization ability
- typical properties in a model situation
Dynamics and generalization of LVQ Birmingham 09-12- 05
Model situation two clusters of N-dimensional data
random vectors isin ℝN according to σ)P(p )P(1σ
σ ξξ
2σ
σN2
σ
- v 2
1exp
v 2π
1σ)P( Βξξ mixture of two Gaussians
orthonormal center vectors
B+ B- isin ℝN ( B )2 =1 B+ B- =0
prior weights of classes p+ p-
p+ + p- = 1
B+
B-
(p+)
(p-)
cluster distance prop ℓ ℓ
jj Bσσξ
σσσvξξ
22jj
indep components with
and variance
ℝN
Dynamics and generalization of LVQ Birmingham 09-12- 05
high-dimensional data (formally Ninfin)
ξμ isinℝN N=200 ℓ=1 p+=04 v+=144 v-=064μ
B
yξ
( 240)( 160)
projections into the plane of center vectors B+ B-
μ By ξ
μ 2
2xξ
w
projections on two independent random directions w12
μ 11x ξw
Dynamics and generalization of LVQ Birmingham 09-12- 05
Dynamics of on-line training
sequence of new independent random examples 123μσμμ ξ
drawn according to μμσ σPp μ ξ
learning ratestep size
competitiondirection ofupdate etc
change of prototypetowards or away from the current data
example
LVQ1 original formulation [Kohonen]
Winner-Takes-All (WTA) algorithm
μs
μs
μs d d σS f
1-μs
μμμs-
μss
1-μs
μs σSddf
N
ηwξww 21
μs
μμs
μ
d
1σS
wξ
update of two prototype vectors w+ w-
Dynamics and generalization of LVQ Birmingham 09-12- 05
algorithm recursions
Mathematical analysis of the learning dynamics
11 σtsμt
μs
μstσ
μs
μsσ QBR www
projections into the (B+ B- )-plane
length and relativeposition of prototypes
1 description in terms of a few characteristic quantitities
( here ℝ2N ℝ7 )
2 average over the current example
random vector according to avg lengthσ)|P( μξ 22 vN σσ
ξ
in the thermodynamic limit N μμ
μ1-μs
μs
By
wx
ξ
ξ
correlated Gaussian random quantities
completely specified in terms of first and second moments (wo indices μ)
sσσ
N
1jjsσs R x
jw stσσtσsσt s Qv xx- xx
sσσσsσ s Rv yx- yx σσσσv yy- yy
sσσ y
Dynamics and generalization of LVQ Birmingham 09-12- 05
averaged recursions closed in p σ1σ
σ
μsσ
μst R Q
- depend on the random sequence of example data
- their fluctuations vanish with N
learning dynamics is completely described in terms of averages
3 self-averaging property of characteristic quantities
μsσ
μst R Q
1N
(mean and variance)
R++ (α=10) computer simulations (LVQ1)
- mean results approach theoretical prediction- variance vanishes as N
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
quantization error
μj
μk
K
jk
P
1μj
μK
1jVQ ddΘ
2 wξH
μjdprototypes data wj is the winner
here
Euclidean distance
aim faithful representation (in general ne clustering )
Result depends on - the number of prototype vectors - the distance measure metric used
Dynamics and generalization of LVQ Birmingham 09-12- 05
bull identify the closest prototype ie the so-called winner
bull initialize prototype vectors for different classes
bull present a single example
bull move the winner - closer towards the data (same class)
- away from the data (different class)
classification
assignment of a vector to the class of the closest
prototype w
aim generalization ability
classification of novel data
after learning from examples
∙ identification of prototype vectors from labelled example data
∙ distance based classification (eg Euclidean Manhattan hellip)
basic heuristic LVQ scheme LVQ1 [Kohonen]
piecewise linear decision boundaries
Learning Vector Quantization
(t)wξ(t)w1tw (t)w
η
N-dimfeature space
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ algorithms
- frequently applied in a variety
of practical problems
- plausible intuitive flexible
- fast easy to implement
- often based on heuristic arguments
or cost functions with unclear relation to generalization
- limited theoretical understanding of
- dynamics and convergence properties
- achievable generalization ability
here analysis of LVQ algorithms wrt
- dynamics of the learning process
- performance ie generalization ability
- typical properties in a model situation
Dynamics and generalization of LVQ Birmingham 09-12- 05
Model situation two clusters of N-dimensional data
random vectors isin ℝN according to σ)P(p )P(1σ
σ ξξ
2σ
σN2
σ
- v 2
1exp
v 2π
1σ)P( Βξξ mixture of two Gaussians
orthonormal center vectors
B+ B- isin ℝN ( B )2 =1 B+ B- =0
prior weights of classes p+ p-
p+ + p- = 1
B+
B-
(p+)
(p-)
cluster distance prop ℓ ℓ
jj Bσσξ
σσσvξξ
22jj
indep components with
and variance
ℝN
Dynamics and generalization of LVQ Birmingham 09-12- 05
high-dimensional data (formally Ninfin)
ξμ isinℝN N=200 ℓ=1 p+=04 v+=144 v-=064μ
B
yξ
( 240)( 160)
projections into the plane of center vectors B+ B-
μ By ξ
μ 2
2xξ
w
projections on two independent random directions w12
μ 11x ξw
Dynamics and generalization of LVQ Birmingham 09-12- 05
Dynamics of on-line training
sequence of new independent random examples 123μσμμ ξ
drawn according to μμσ σPp μ ξ
learning ratestep size
competitiondirection ofupdate etc
change of prototypetowards or away from the current data
example
LVQ1 original formulation [Kohonen]
Winner-Takes-All (WTA) algorithm
μs
μs
μs d d σS f
1-μs
μμμs-
μss
1-μs
μs σSddf
N
ηwξww 21
μs
μμs
μ
d
1σS
wξ
update of two prototype vectors w+ w-
Mathematical analysis of the learning dynamics

1. Description in terms of a few characteristic quantities (here: ℝ^{2N} → ℝ^7), namely the projections into the (B_+, B_-)-plane and the lengths and relative position of the prototypes:

R_{sσ}^μ = w_s^μ · B_σ,  Q_{st}^μ = w_s^μ · w_t^μ  (s, t, σ = ±1)

2. Average over the current example ξ^μ, a random vector according to P(ξ|σ) with average squared length ⟨(ξ^μ)²⟩ ≈ v_σ N in the thermodynamic limit N → ∞.

The projections x_s^μ = w_s^{μ-1} · ξ^μ and y_σ^μ = B_σ · ξ^μ are correlated Gaussian random quantities, completely specified in terms of their first and second moments (indices μ omitted):

⟨x_s⟩_σ = ℓ R_{sσ},  ⟨y_τ⟩_σ = ℓ δ_{τσ}
⟨x_s x_t⟩_σ - ⟨x_s⟩_σ ⟨x_t⟩_σ = v_σ Q_{st}
⟨x_s y_τ⟩_σ - ⟨x_s⟩_σ ⟨y_τ⟩_σ = v_σ R_{sτ}
⟨y_ρ y_τ⟩_σ - ⟨y_ρ⟩_σ ⟨y_τ⟩_σ = v_σ δ_{ρτ}
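The order parameters are plain inner products; a quick helper (my own convenience code, with a dictionary keyed by the signs s, t, σ = ±1):

```python
import numpy as np

def order_parameters(w, B):
    """Projections R_{s,sigma} = w_s . B_sigma and overlaps Q_{s,t} = w_s . w_t
    that summarize the 2N prototype components by a few numbers
    (Q is symmetric, so Q_{+-} = Q_{-+} and there are 7 independent values)."""
    R = {(s, t): float(w[s] @ B[t]) for s in (+1, -1) for t in (+1, -1)}
    Q = {(s, t): float(w[s] @ w[t]) for s in (+1, -1) for t in (+1, -1)}
    return R, Q

# example: w_+ = B_+ and w_- = 0.5 B_- give R_{++} = 1 and Q_{--} = 0.25
N = 10
B = {+1: np.eye(N)[0], -1: np.eye(N)[1]}
w = {+1: B[+1].copy(), -1: 0.5 * B[-1]}
R, Q = order_parameters(w, B)
```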
The averaged recursions close in the quantities {R_{sσ}, Q_{st}}.

3. Self-averaging property of the characteristic quantities: R_{sσ}^μ and Q_{st}^μ
- depend on the random sequence of example data,
- but their fluctuations vanish with N → ∞.
The learning dynamics is therefore completely described in terms of averages.

[Figure: computer simulations of LVQ1, mean and variance of R_{++} at α = 10 plotted vs. 1/N; the mean results approach the theoretical prediction and the variance vanishes as N → ∞.]
4. Continuous learning time α = μ/N, the number of examples (learning steps) per degree of freedom: the stochastic recursions become deterministic ODE, and integration yields the evolution of the projections R_{sσ}(α), Q_{st}(α).

5. Learning curve: the generalization error ε_g(α) after training with αN examples, i.e. the probability for misclassification of a novel example,

ε_g = p_+ ⟨Θ(d_+ - d_-)⟩_+ + p_- ⟨Θ(d_- - d_+)⟩_-
    = Σ_{σ=±1} p_σ Φ( [Q_{σσ} - Q_{-σ-σ} - 2ℓ (R_{σσ} - R_{-σσ})] / [2 √(v_σ) √(Q_{++} - 2Q_{+-} + Q_{--})] ),

where Φ denotes the distribution function of the standard normal density.
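The closed-form expression for ε_g follows because d_σ - d_{-σ} is Gaussian for the model data, and it can be cross-checked by Monte Carlo. A sketch (Python; the dictionary-based interface is my choice, not from the talk):

```python
import numpy as np
from math import erf, sqrt

def Phi(x):
    """Distribution function of the standard normal density."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def eps_g(R, Q, ell, p, v):
    """Generalization error in terms of the order parameters:
    eps_g = sum_sigma p_sigma * Phi( [Q_ss - Q_-s-s - 2 ell (R_ss - R_-s,s)]
                                     / [2 sqrt(v_sigma (Q_++ - 2 Q_+- + Q_--))] )."""
    cross = Q[(+1, +1)] - 2.0 * Q[(+1, -1)] + Q[(-1, -1)]
    err = 0.0
    for s in (+1, -1):
        num = Q[(s, s)] - Q[(-s, -s)] - 2.0 * ell * (R[(s, s)] - R[(-s, s)])
        err += p[s] * Phi(num / (2.0 * sqrt(v[s] * cross)))
    return err

# symmetric example configuration: w_+ = 0.8 B_+, w_- = 0.8 B_-,
# with ell = 1, v_+ = v_- = 1, p_+ = p_- = 1/2
R = {(+1, +1): 0.8, (+1, -1): 0.0, (-1, +1): 0.0, (-1, -1): 0.8}
Q = {(+1, +1): 0.64, (+1, -1): 0.0, (-1, +1): 0.0, (-1, -1): 0.64}
value = eps_g(R, Q, 1.0, {+1: 0.5, -1: 0.5}, {+1: 1.0, -1: 1.0})
```

For this symmetric configuration the argument of Φ reduces to -1/√2 for both classes, and a direct Monte Carlo estimate of the nearest-prototype error agrees with `value`.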
LVQ1: "The winner takes it all"

Only the winner w_s is updated, according to the class label:

w_s^μ = w_s^{μ-1} + (η/N) Θ(d_{-s}^μ - d_s^μ) s σ^μ (ξ^μ - w_s^{μ-1})

[Figure: theory and simulation (N = 100) for p_+ = 0.8, v_+ = 4, v_- = 9, ℓ = 2.0, η = 1.0, initialization w_s(0) = 0, averaged over 100 independent runs: the order parameters Q_{++}, Q_{--}, Q_{+-} and R_{sσ} vs. α. Self-averaging property: mean and variances of R_{++} at α = 10 vs. 1/N.]
LVQ1: "The winner takes it all" (trajectories)

[Figure: projected trajectories of the prototypes w_+ and w_- in the (B_+, B_-)-plane (•: positions at intermediate values of α), together with the cluster centers ℓB_+, ℓB_-, the optimal decision boundary, and the asymptotic prototype positions. Theory and simulation (N = 100) for p_+ = 0.8, v_+ = 4, v_- = 9, ℓ = 2.0, η = 1.0, initialization w_s(0) ≈ 0, averaged over 100 independent runs; also shown: Q_{++}, Q_{--}, Q_{+-} and R_{sσ} vs. α.]
Learning curve

[Figure: ε_g(α) for η = 2.0, 1.0, 0.2, with p_+ = 0.2, ℓ = 1.0, v_+ = v_- = 1.0.]
- suboptimal, non-monotonic behavior for small η
- the stationary value ε_g(α → ∞) grows linearly with η
- well-defined asymptotics for η → 0, α → ∞ with (η α) → ∞

Achievable generalization error
[Figure: asymptotic ε_g vs. p_+ for v_+ = v_- = 1.0 and for v_+ = 0.25, v_- = 0.81: LVQ1 compared with the best linear boundary.]
LVQ 2.1 [Kohonen], here: update both the correct and the wrong winner,

w_s^μ = w_s^{μ-1} + (η/N) s σ^μ (ξ^μ - w_s^{μ-1}),  s = ±1

[Figure: theory and simulation (N = 100) for p_+ = 0.8, ℓ = 1, v_+ = v_- = 1, η = 0.5, averages over 100 independent runs: R_{sσ}(α) and Q_{st}(α), which diverge with α; only certain combinations remain finite.]

Problem: instability of the algorithm due to the repulsion of the wrong prototypes; for α → ∞ this yields the trivial classification with

ε_g = min {p_+, p_-}.
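The instability is easy to reproduce in a simulation. A sketch (Python/NumPy; names are mine): with unbalanced priors, the prototype of the weaker class is repelled far more often than it is attracted, so its norm grows without bound, while the other prototype stays bounded.

```python
import numpy as np

def lvq21_step(w, xi, sigma, eta):
    """LVQ 2.1: update both prototypes on every example; the prototype with the
    correct label (s = sigma) moves towards xi, the other one away from it."""
    N = xi.size
    for s in (+1, -1):
        w[s] = w[s] + (eta / N) * s * sigma * (xi - w[s])
    return w

# unbalanced priors as on the slide (p_+ = 0.8): w_- is mostly repelled
rng = np.random.default_rng(3)
N, ell, eta, p_plus = 50, 1.0, 0.5, 0.8
B = {+1: np.eye(N)[0], -1: np.eye(N)[1]}
w = {+1: np.zeros(N), -1: np.zeros(N)}
norms = []
for _ in range(5000):
    sigma = +1 if rng.random() < p_plus else -1
    xi = ell * B[sigma] + rng.standard_normal(N)
    w = lvq21_step(w, xi, sigma, eta)
    norms.append(float(w[-1] @ w[-1]))
```

The repulsive step multiplies w_- by a factor (1 + η/N), so its squared length grows roughly exponentially in the number of examples.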
Suggested strategy: selection of data in a window close to the current decision boundary. This slows down the repulsion, but the system remains unstable.

Early stopping: end the training process at minimal ε_g (idealized).
- pronounced minimum in ε_g(α), which depends on the initialization and the cluster geometry
- here, the lowest minimum value is reached for η → 0
[Figure: ε_g(α) for η = 2.0, 1.0, 0.5; asymptotic ε_g vs. p_+ for v_+ = 0.25, v_- = 0.81, LVQ1 compared with early stopping.]
Learning From Mistakes (LFM): perform the LVQ 2.1 update only if the current classification is wrong,

w_s^μ = w_s^{μ-1} + (η/N) Θ(d_{σ^μ}^μ - d_{-σ^μ}^μ) s σ^μ (ξ^μ - w_s^{μ-1}),

the crisp limit version of Soft Robust LVQ [Seo and Obermayer, 2003].

[Figure: projected trajectory in the (B_+, B_-)-plane with the cluster centers ℓB_±, and ε_g(α) for η = 2.0, 1.0, 0.5, with p_+ = 0.8, ℓ = 3.0, v_+ = 4.0, v_- = 9.0. Learning curves for p_+ = 0.8, ℓ = 1.2, v_+ = v_- = 1.0: the asymptotic ε_g is η-independent.]
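The LFM modulation only gates the LVQ 2.1 update by the current classification. A sketch (Python/NumPy; names and the tie-breaking convention are mine, added so that training can start from identical prototypes):

```python
import numpy as np

def lfm_step(w, xi, sigma, eta):
    """Learning From Mistakes: the LVQ 2.1 pair update is performed only when
    the nearest-prototype classification of xi is currently wrong
    (ties count as mistakes, so training can start from w_+ = w_-)."""
    N = xi.size
    d = {s: float(np.sum((xi - w[s]) ** 2)) for s in (+1, -1)}
    if d[sigma] >= d[-sigma]:              # the winner carries the wrong label
        for s in (+1, -1):
            w[s] = w[s] + (eta / N) * s * sigma * (xi - w[s])
    return w

# usage sketch on well-separated two-cluster toy data
rng = np.random.default_rng(4)
N, ell, eta = 50, 2.0, 1.0
B = {+1: np.eye(N)[0], -1: np.eye(N)[1]}
w = {+1: np.zeros(N), -1: np.zeros(N)}
for _ in range(5000):
    sigma = +1 if rng.random() < 0.8 else -1
    xi = ell * B[sigma] + 0.5 * rng.standard_normal(N)
    w = lfm_step(w, xi, sigma, eta)
```

Because updates stop as soon as examples are classified correctly, the prototypes remain bounded, in contrast to plain LVQ 2.1.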
Comparison: achievable generalization ability

[Figure: asymptotic ε_g vs. p_+ for equal cluster variances (v_+ = v_- = 1.0) and for unequal variances (v_+ = 0.25, v_- = 0.81): best linear boundary, LVQ1, LVQ 2.1 with early stopping, LFM, and the trivial classification.]
Vector Quantization

Unsupervised competitive learning: the class membership is unknown, or identical for all data; only the winner w_s is updated, without reference to any label:

w_s^μ = w_s^{μ-1} + (η/N) Θ(d_{-s}^μ - d_s^μ) (ξ^μ - w_s^{μ-1})

[Figure: numerical integration for w_s(0) ≈ 0 (p_+ = 0.2, ℓ = 1.0, η = 1.2): ε_g(α) for VQ, LVQ+, and LVQ1, and the projections R_{++}, R_{+-}, R_{-+}, R_{--} vs. α.]

The system is invariant under exchange of the prototypes; this symmetry gives rise to weakly repulsive fixed points of the dynamics.
Interpretations:
- VQ: unsupervised learning from unlabelled data
- LVQ+: two prototypes of the same class, i.e. identical labels
- LVQ: different classes, but the labels are not used in training

[Figure: asymptotic ε_g (η → 0) vs. p_+.] For strongly unbalanced priors, e.g. p_+ ≈ 0 (p_- ≈ 1): low quantization error, but high generalization error ε_g.
Summary

• a model scenario of LVQ training: two clusters, two prototypes; dynamics of on-line training
• comparison of algorithms (within the model):
  - LVQ1: the original formulation of LVQ, with close to optimal asymptotic generalization
  - LVQ 2.1: intuitive extension, but it creates an instability and a trivial (stationary) classification; with early stopping potentially good performance, yet practical difficulties remain (depends on the initialization)
  - LFM: the crisp limit of Soft Robust LVQ; stable behavior, but far from optimal generalization
  - VQ: description of in-class competition
Outlook

• Self-Organizing Maps (SOM): neighborhood-preserving SOM, Neural Gas (distance-rank based)
• Generalized Relevance LVQ [e.g. Hammer & Villmann]: adaptive metrics, e.g. the distance measure
  d_λ(w_s, ξ) = Σ_{i=1}^N λ_i (w_{s,i} - ξ_i)², with the relevances λ_i adapted during training
• applications
• multi-class and multi-prototype problems
• optimized procedures: learning-rate schedules, variational approach, Bayes-optimal on-line learning
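The relevance-weighted distance from the outlook is a one-liner; a small sketch (Python/NumPy, my function name):

```python
import numpy as np

def weighted_distance(w, xi, lam):
    """Relevance-weighted squared Euclidean distance
    d_lambda(w, xi) = sum_i lambda_i (w_i - xi_i)^2, as used in Relevance LVQ;
    the relevances lambda_i (lambda_i >= 0, normalized to sum 1) are adapted
    during training alongside the prototypes."""
    return float(np.sum(lam * (w - xi) ** 2))

# a relevance profile concentrated on the first component makes the distance
# ignore differences in all remaining dimensions
w = np.array([0.0, 5.0, -3.0])
xi = np.array([2.0, 0.0, 0.0])
lam = np.array([1.0, 0.0, 0.0])
d = weighted_distance(w, xi, lam)   # only (0 - 2)^2 contributes
```

With uniform relevances λ_i = 1/N the measure reduces to the ordinary squared Euclidean distance divided by N.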
Dynamics and generalization of LVQ Birmingham 09-12- 05
bull identify the closest prototype ie the so-called winner
bull initialize prototype vectors for different classes
bull present a single example
bull move the winner - closer towards the data (same class)
- away from the data (different class)
classification
assignment of a vector to the class of the closest
prototype w
aim generalization ability
classification of novel data
after learning from examples
∙ identification of prototype vectors from labelled example data
∙ distance based classification (eg Euclidean Manhattan hellip)
basic heuristic LVQ scheme LVQ1 [Kohonen]
piecewise linear decision boundaries
Learning Vector Quantization
(t)wξ(t)w1tw (t)w
η
N-dimfeature space
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ algorithms
- frequently applied in a variety
of practical problems
- plausible intuitive flexible
- fast easy to implement
- often based on heuristic arguments
or cost functions with unclear relation to generalization
- limited theoretical understanding of
- dynamics and convergence properties
- achievable generalization ability
here analysis of LVQ algorithms wrt
- dynamics of the learning process
- performance ie generalization ability
- typical properties in a model situation
Dynamics and generalization of LVQ Birmingham 09-12- 05
Model situation two clusters of N-dimensional data
random vectors isin ℝN according to σ)P(p )P(1σ
σ ξξ
2σ
σN2
σ
- v 2
1exp
v 2π
1σ)P( Βξξ mixture of two Gaussians
orthonormal center vectors
B+ B- isin ℝN ( B )2 =1 B+ B- =0
prior weights of classes p+ p-
p+ + p- = 1
B+
B-
(p+)
(p-)
cluster distance prop ℓ ℓ
jj Bσσξ
σσσvξξ
22jj
indep components with
and variance
ℝN
Dynamics and generalization of LVQ Birmingham 09-12- 05
high-dimensional data (formally Ninfin)
ξμ isinℝN N=200 ℓ=1 p+=04 v+=144 v-=064μ
B
yξ
( 240)( 160)
projections into the plane of center vectors B+ B-
μ By ξ
μ 2
2xξ
w
projections on two independent random directions w12
μ 11x ξw
Dynamics and generalization of LVQ Birmingham 09-12- 05
Dynamics of on-line training
sequence of new independent random examples 123μσμμ ξ
drawn according to μμσ σPp μ ξ
learning ratestep size
competitiondirection ofupdate etc
change of prototypetowards or away from the current data
example
LVQ1 original formulation [Kohonen]
Winner-Takes-All (WTA) algorithm
μs
μs
μs d d σS f
1-μs
μμμs-
μss
1-μs
μs σSddf
N
ηwξww 21
μs
μμs
μ
d
1σS
wξ
update of two prototype vectors w+ w-
Dynamics and generalization of LVQ Birmingham 09-12- 05
algorithm recursions
Mathematical analysis of the learning dynamics
11 σtsμt
μs
μstσ
μs
μsσ QBR www
projections into the (B+ B- )-plane
length and relativeposition of prototypes
1 description in terms of a few characteristic quantitities
( here ℝ2N ℝ7 )
2 average over the current example
random vector according to avg lengthσ)|P( μξ 22 vN σσ
ξ
in the thermodynamic limit N μμ
μ1-μs
μs
By
wx
ξ
ξ
correlated Gaussian random quantities
completely specified in terms of first and second moments (wo indices μ)
sσσ
N
1jjsσs R x
jw stσσtσsσt s Qv xx- xx
sσσσsσ s Rv yx- yx σσσσv yy- yy
sσσ y
Dynamics and generalization of LVQ Birmingham 09-12- 05
averaged recursions closed in p σ1σ
σ
μsσ
μst R Q
- depend on the random sequence of example data
- their fluctuations vanish with N
learning dynamics is completely described in terms of averages
3 self-averaging property of characteristic quantities
μsσ
μst R Q
1N
(mean and variance)
R++ (α=10) computer simulations (LVQ1)
- mean results approach theoretical prediction- variance vanishes as N
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ algorithms
- frequently applied in a variety
of practical problems
- plausible intuitive flexible
- fast easy to implement
- often based on heuristic arguments
or cost functions with unclear relation to generalization
- limited theoretical understanding of
- dynamics and convergence properties
- achievable generalization ability
here analysis of LVQ algorithms wrt
- dynamics of the learning process
- performance ie generalization ability
- typical properties in a model situation
Dynamics and generalization of LVQ Birmingham 09-12- 05
Model situation two clusters of N-dimensional data
random vectors isin ℝN according to σ)P(p )P(1σ
σ ξξ
2σ
σN2
σ
- v 2
1exp
v 2π
1σ)P( Βξξ mixture of two Gaussians
orthonormal center vectors
B+ B- isin ℝN ( B )2 =1 B+ B- =0
prior weights of classes p+ p-
p+ + p- = 1
B+
B-
(p+)
(p-)
cluster distance prop ℓ ℓ
jj Bσσξ
σσσvξξ
22jj
indep components with
and variance
ℝN
Dynamics and generalization of LVQ Birmingham 09-12- 05
high-dimensional data (formally Ninfin)
ξμ isinℝN N=200 ℓ=1 p+=04 v+=144 v-=064μ
B
yξ
( 240)( 160)
projections into the plane of center vectors B+ B-
μ By ξ
μ 2
2xξ
w
projections on two independent random directions w12
μ 11x ξw
Dynamics and generalization of LVQ Birmingham 09-12- 05
Dynamics of on-line training
sequence of new independent random examples 123μσμμ ξ
drawn according to μμσ σPp μ ξ
learning ratestep size
competitiondirection ofupdate etc
change of prototypetowards or away from the current data
example
LVQ1 original formulation [Kohonen]
Winner-Takes-All (WTA) algorithm
μs
μs
μs d d σS f
1-μs
μμμs-
μss
1-μs
μs σSddf
N
ηwξww 21
μs
μμs
μ
d
1σS
wξ
update of two prototype vectors w+ w-
Dynamics and generalization of LVQ Birmingham 09-12- 05
algorithm recursions
Mathematical analysis of the learning dynamics
11 σtsμt
μs
μstσ
μs
μsσ QBR www
projections into the (B+ B- )-plane
length and relativeposition of prototypes
1 description in terms of a few characteristic quantitities
( here ℝ2N ℝ7 )
2 average over the current example
random vector according to avg lengthσ)|P( μξ 22 vN σσ
ξ
in the thermodynamic limit N μμ
μ1-μs
μs
By
wx
ξ
ξ
correlated Gaussian random quantities
completely specified in terms of first and second moments (wo indices μ)
sσσ
N
1jjsσs R x
jw stσσtσsσt s Qv xx- xx
sσσσsσ s Rv yx- yx σσσσv yy- yy
sσσ y
Dynamics and generalization of LVQ Birmingham 09-12- 05
averaged recursions closed in p σ1σ
σ
μsσ
μst R Q
- depend on the random sequence of example data
- their fluctuations vanish with N
learning dynamics is completely described in terms of averages
3 self-averaging property of characteristic quantities
μsσ
μst R Q
1N
(mean and variance)
R++ (α=10) computer simulations (LVQ1)
- mean results approach theoretical prediction- variance vanishes as N
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
Model situation two clusters of N-dimensional data
random vectors isin ℝN according to σ)P(p )P(1σ
σ ξξ
2σ
σN2
σ
- v 2
1exp
v 2π
1σ)P( Βξξ mixture of two Gaussians
orthonormal center vectors
B+ B- isin ℝN ( B )2 =1 B+ B- =0
prior weights of classes p+ p-
p+ + p- = 1
B+
B-
(p+)
(p-)
cluster distance prop ℓ ℓ
jj Bσσξ
σσσvξξ
22jj
indep components with
and variance
ℝN
Dynamics and generalization of LVQ Birmingham 09-12- 05
high-dimensional data (formally Ninfin)
ξμ isinℝN N=200 ℓ=1 p+=04 v+=144 v-=064μ
B
yξ
( 240)( 160)
projections into the plane of center vectors B+ B-
μ By ξ
μ 2
2xξ
w
projections on two independent random directions w12
μ 11x ξw
Dynamics and generalization of LVQ Birmingham 09-12- 05
Dynamics of on-line training
sequence of new independent random examples 123μσμμ ξ
drawn according to μμσ σPp μ ξ
learning ratestep size
competitiondirection ofupdate etc
change of prototypetowards or away from the current data
example
LVQ1 original formulation [Kohonen]
Winner-Takes-All (WTA) algorithm
μs
μs
μs d d σS f
1-μs
μμμs-
μss
1-μs
μs σSddf
N
ηwξww 21
μs
μμs
μ
d
1σS
wξ
update of two prototype vectors w+ w-
Dynamics and generalization of LVQ Birmingham 09-12- 05
algorithm recursions
Mathematical analysis of the learning dynamics
11 σtsμt
μs
μstσ
μs
μsσ QBR www
projections into the (B+ B- )-plane
length and relativeposition of prototypes
1 description in terms of a few characteristic quantitities
( here ℝ2N ℝ7 )
2 average over the current example
random vector according to avg lengthσ)|P( μξ 22 vN σσ
ξ
in the thermodynamic limit N μμ
μ1-μs
μs
By
wx
ξ
ξ
correlated Gaussian random quantities
completely specified in terms of first and second moments (wo indices μ)
sσσ
N
1jjsσs R x
jw stσσtσsσt s Qv xx- xx
sσσσsσ s Rv yx- yx σσσσv yy- yy
sσσ y
Dynamics and generalization of LVQ Birmingham 09-12- 05
averaged recursions closed in p σ1σ
σ
μsσ
μst R Q
- depend on the random sequence of example data
- their fluctuations vanish with N
learning dynamics is completely described in terms of averages
3 self-averaging property of characteristic quantities
μsσ
μst R Q
1N
(mean and variance)
R++ (α=10) computer simulations (LVQ1)
- mean results approach theoretical prediction- variance vanishes as N
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
high-dimensional data (formally Ninfin)
ξμ isinℝN N=200 ℓ=1 p+=04 v+=144 v-=064μ
B
yξ
( 240)( 160)
projections into the plane of center vectors B+ B-
μ By ξ
μ 2
2xξ
w
projections on two independent random directions w12
μ 11x ξw
Dynamics and generalization of LVQ Birmingham 09-12- 05
Dynamics of on-line training
sequence of new independent random examples 123μσμμ ξ
drawn according to μμσ σPp μ ξ
learning ratestep size
competitiondirection ofupdate etc
change of prototypetowards or away from the current data
example
LVQ1 original formulation [Kohonen]
Winner-Takes-All (WTA) algorithm
μs
μs
μs d d σS f
1-μs
μμμs-
μss
1-μs
μs σSddf
N
ηwξww 21
μs
μμs
μ
d
1σS
wξ
update of two prototype vectors w+ w-
Dynamics and generalization of LVQ Birmingham 09-12- 05
algorithm recursions
Mathematical analysis of the learning dynamics
11 σtsμt
μs
μstσ
μs
μsσ QBR www
projections into the (B+ B- )-plane
length and relativeposition of prototypes
1 description in terms of a few characteristic quantitities
( here ℝ2N ℝ7 )
2 average over the current example
random vector according to avg lengthσ)|P( μξ 22 vN σσ
ξ
in the thermodynamic limit N μμ
μ1-μs
μs
By
wx
ξ
ξ
correlated Gaussian random quantities
completely specified in terms of first and second moments (wo indices μ)
sσσ
N
1jjsσs R x
jw stσσtσsσt s Qv xx- xx
sσσσsσ s Rv yx- yx σσσσv yy- yy
sσσ y
Dynamics and generalization of LVQ Birmingham 09-12- 05
averaged recursions closed in p σ1σ
σ
μsσ
μst R Q
- depend on the random sequence of example data
- their fluctuations vanish with N
learning dynamics is completely described in terms of averages
3 self-averaging property of characteristic quantities
μsσ
μst R Q
1N
(mean and variance)
R++ (α=10) computer simulations (LVQ1)
- mean results approach theoretical prediction- variance vanishes as N
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve

[Figure: $\varepsilon_g(\alpha)$ for $\eta = 2.0,\, 1.0,\, 0.2$; $p_+ = 0.2$, $\ell = 1.0$, $v_+ = v_- = 1.0$]
• suboptimal, non-monotonic behavior for small $\eta$
• stationary state: $\varepsilon_g(\alpha \to \infty)$ grows linearly with $\eta$
• well-defined asymptotics for $\eta \to 0$, $\alpha \to \infty$ with $(\eta\, \alpha) \to \infty$

achievable generalization error:
[Figure: $\varepsilon_g$ vs. $p_+$ for $v_+ = v_- = 1.0$ and for $v_+ = 0.25$, $v_- = 0.81$; best linear boundary vs. ― LVQ1]
LVQ 2.1 [Kohonen]: here, both the correct and the wrong winner are updated

$\mathbf{w}_s^\mu = \mathbf{w}_s^{\mu-1} + \frac{\eta}{N}\, s\, \sigma^\mu \left( \boldsymbol{\xi}^\mu - \mathbf{w}_s^{\mu-1} \right), \quad s = \pm 1$

the order parameters $R_{s\sigma}(\alpha)$ and $Q_{st}(\alpha)$ diverge with $\alpha$, while only certain combinations of them remain finite

[Figure: $R$ and $Q$ vs. $\alpha \in [0, 10]$, values ranging over roughly $[-6, 6]$; theory and simulation ($N = 100$), $p_+ = 0.8$, $\ell = 1$, $v_+ = v_- = 1$, $\eta = 0.5$, averages over 100 independent runs]

problem: instability of the algorithm due to the repulsion of wrong prototypes
→ trivial classification for $\alpha \to \infty$: $\varepsilon_g = \min\{ p_+,\, p_- \}$
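The instability can be made visible in a few lines: with unbalanced priors, the prototype of the weaker class is repelled more often than it is attracted and drifts without bound. A sketch of the LVQ2.1 update under illustrative parameters (not the lecture's):

```python
import numpy as np

# LVQ2.1 moves the correct prototype toward every example and the wrong
# one away from it; we watch the norm of the minority-class prototype.
rng = np.random.default_rng(3)
N, eta, ell, p_plus = 50, 0.5, 1.0, 0.8
B = np.zeros((2, N)); B[0, 0] = 1.0; B[1, 1] = 1.0
w = 1e-3 * rng.normal(size=(2, N))            # w+, w- near 0
labels = np.array([1.0, -1.0])

norms = []
for t in range(40 * N):                        # up to alpha = 40
    sigma = 1.0 if rng.random() < p_plus else -1.0
    xi = ell * (B[0] if sigma > 0 else B[1]) + rng.normal(size=N)
    # both prototypes updated: sign +1 if label matches sigma, -1 otherwise
    for s in (0, 1):
        w[s] += (eta / N) * labels[s] * sigma * (xi - w[s])
    if t % N == 0:
        norms.append(float(np.linalg.norm(w[1])))   # |w-|
print(norms[0], norms[-1])   # the repelled prototype's norm grows strongly
```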
suggested strategy: selection of data in a window close to the current decision boundary
→ slows down the repulsion, but the system remains unstable

Early stopping: end the training process at minimal $\varepsilon_g$ (idealized)
[Figure: $\varepsilon_g(\alpha)$ for $\eta = 2.0,\, 1.0,\, 0.5$]
• pronounced minimum in $\varepsilon_g(\alpha)$; its position depends on initialization and cluster geometry
• here, the lowest minimum value is reached for $\eta \to 0$
[Figure: $\varepsilon_g$ vs. $p_+$ for $v_+ = 0.25$, $v_- = 0.81$; ― LVQ1, __ early stopping]
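Idealized early stopping amounts to monitoring the error during training and keeping the prototypes at its minimum. A sketch using a held-out validation set as a stand-in for $\varepsilon_g$; the setup and all parameter values are illustrative assumptions:

```python
import numpy as np

# Early stopping for LVQ2.1: track validation error, keep the best state.
rng = np.random.default_rng(4)
N, eta, ell, p_plus = 50, 1.0, 1.5, 0.8
B = np.zeros((2, N)); B[0, 0] = 1.0; B[1, 1] = 1.0

def sample(M):
    sigma = np.where(rng.random(M) < p_plus, 1.0, -1.0)
    centers = np.where((sigma > 0)[:, None], B[0], B[1])
    return ell * centers + rng.normal(size=(M, N)), sigma

def error(w, xi, sigma):
    d = ((xi[:, None, :] - w[None, :, :]) ** 2).sum(axis=2)
    pred = np.where(d[:, 0] < d[:, 1], 1.0, -1.0)
    return float(np.mean(pred != sigma))

val_xi, val_sigma = sample(4000)              # held-out validation set
w = 1e-3 * rng.normal(size=(2, N))
labels = np.array([1.0, -1.0])
best_err, best_w, errs = 1.0, w.copy(), []
for t in range(30 * N):
    xi_b, sig_b = sample(1)
    xi_t, sig_t = xi_b[0], float(sig_b[0])
    for s in (0, 1):                          # LVQ2.1: update both
        w[s] += (eta / N) * labels[s] * sig_t * (xi_t - w[s])
    if t % 25 == 0:
        e = error(w, val_xi, val_sigma)
        errs.append(e)
        if e < best_err:
            best_err, best_w = e, w.copy()
final_err = error(w, val_xi, val_sigma)
print(best_err, final_err)
```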
Learning From Mistakes (LFM)

$\mathbf{w}_s^\mu = \mathbf{w}_s^{\mu-1} + \frac{\eta}{N}\, \Theta\!\left( d_{\sigma}^\mu - d_{-\sigma}^\mu \right) s\, \sigma^\mu \left( \boldsymbol{\xi}^\mu - \mathbf{w}_s^{\mu-1} \right), \quad s = \pm 1$

i.e. the LVQ2.1 update is performed only if the current classification is wrong;
crisp limit version of Soft Robust LVQ [Seo and Obermayer, 2003]

[Figure: projected trajectory in the $(\mathbf{B}_+, \mathbf{B}_-)$-plane, axes $R_{s+}$, $R_{s-}$, cluster centers at $\ell \mathbf{B}_\pm$; learning curves $\varepsilon_g(\alpha)$ for $p_+ = 0.8$, $\ell = 3.0$, $v_+ = 4.0$, $v_- = 9.0$, $\eta = 2.0,\, 1.0,\, 0.5$]
• $\eta$-independent asymptotic $\varepsilon_g$ ($p_+ = 0.8$, $\ell = 1.2$, $v_+ = v_- = 1.0$)
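The LFM rule described above can be sketched as a single update step; prototype values and examples below are illustrative:

```python
import numpy as np

# Learning From Mistakes: apply the LVQ2.1-type update only when the
# current prototypes misclassify the example.
def lfm_step(w, labels, xi, sigma, eta):
    d = ((xi - w) ** 2).sum(axis=1)
    s = int(np.argmin(d))
    w = w.copy()
    if labels[s] != sigma:                    # update only on a mistake
        N = xi.size
        for t in (0, 1):
            w[t] += (eta / N) * labels[t] * sigma * (xi - w[t])
    return w

w = np.array([[1.0, 0.0], [-1.0, 0.0]])       # prototypes w+, w-
labels = np.array([1.0, -1.0])

# correctly classified example from class +1: no update at all
w1 = lfm_step(w, labels, np.array([0.8, 0.1]), sigma=1.0, eta=0.5)
# misclassified example from class +1: w+ attracted, w- repelled
w2 = lfm_step(w, labels, np.array([-0.4, 0.0]), sigma=1.0, eta=0.5)
```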
Comparison: achievable generalization ability

[Figure: $\varepsilon_g$ vs. $p_+$; left panel: equal cluster variances ($v_+ = v_- = 1.0$); right panel: unequal variances ($v_+ = 0.25$, $v_- = 0.81$); curves: best linear boundary, ― LVQ1, --- LVQ2.1 (early stopping), ·-· LFM, ― trivial classification]
Vector Quantization

competitive learning:
$\mathbf{w}_s^\mu = \mathbf{w}_s^{\mu-1} + \frac{\eta}{N}\, \Theta\!\left( d_{-s}^\mu - d_s^\mu \right) \left( \boldsymbol{\xi}^\mu - \mathbf{w}_s^{\mu-1} \right)$, \quad $\mathbf{w}_s$: winner

class membership is unknown, or identical for all data

[Figure: numerical integration for $\mathbf{w}_s(0) \approx 0$ ($p_+ = 0.2$, $\ell = 1.0$, $\eta = 1.2$): $\varepsilon_g$ vs. $\alpha$ for VQ, LVQ+, LVQ1; $R_{++}$, $R_{+-}$, $R_{-+}$, $R_{--}$ vs. $\alpha \in [0, 300]$]

the system is invariant under exchange of the prototypes → weakly repulsive fixed points
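The unsupervised competitive-learning rule above can be sketched directly: the winner moves toward every example, labels are ignored. Data and parameters are illustrative choices:

```python
import numpy as np

# Unsupervised Winner-Takes-All VQ on a symmetric two-cluster mixture.
rng = np.random.default_rng(5)
N, eta, ell = 50, 1.0, 1.5
B = np.zeros((2, N)); B[0, 0] = 1.0; B[1, 1] = 1.0
w = 1e-3 * rng.normal(size=(2, N))            # two prototypes near 0

for _ in range(100 * N):                       # alpha = 100
    center = B[0] if rng.random() < 0.5 else B[1]
    xi = ell * center + rng.normal(size=N)
    d = ((xi - w) ** 2).sum(axis=1)
    s = int(np.argmin(d))
    w[s] += (eta / N) * (xi - w[s])            # winner moves toward xi

R = w @ B.T   # projections of the prototypes onto the cluster directions
print(np.round(R, 2))
```

Whether the two prototypes have already specialized to separate clusters at this $\alpha$, or still sit near the common mean (the weakly repulsive fixed point mentioned on the slide), depends on the noise realization; in either case they stay inside the span of the cluster centers.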
interpretations:
• VQ: unsupervised learning, unlabelled data
• LVQ+: two prototypes of the same class, identical labels
• LVQ: different classes, but the labels are not used in training

[Figure: $\varepsilon_g$ vs. $p_+$, asymptotics ($\eta \to 0$, $\alpha \to \infty$)]
for $p_+ \approx 0$, $p_- \approx 1$: low quantization error, but high generalization error $\varepsilon_g$
Summary
• a model scenario of LVQ training: two clusters, two prototypes; dynamics of on-line training
• comparison of algorithms (within the model):
  LVQ 1: the original formulation of LVQ, with close to optimal asymptotic generalization
  LVQ 2.1: an intuitive extension that creates instability and yields trivial (stationary) classification; with early stopping: potentially good performance, but practical difficulties, since the outcome depends on the initialization
  LFM: crisp limit of Soft Robust LVQ; stable behavior, but far from optimal generalization
  VQ: description of the in-class competition
Outlook
• Self-Organizing Maps (SOM): neighborhood-preserving SOM; Neural Gas (distance-rank based)
• Generalized Relevance LVQ [e.g. Hammer & Villmann]: adaptive metrics, e.g. the distance measure
  $d_\lambda(\mathbf{w}_s, \boldsymbol{\xi}) = \sum_{i=1}^N \lambda_i \left( w_i - \xi_i \right)^2$
  with the relevances $\lambda_i$ adapted during training
• applications
• multi-class, multi-prototype problems
• optimized procedures: learning rate schedules, variational approach / Bayes-optimal on-line learning
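The relevance-weighted distance from the Outlook is straightforward to compute; the normalization of the relevances to unit sum is a common convention and an assumption here, as are the example vectors:

```python
import numpy as np

# Relevance-weighted squared distance:
#   d_lambda(w, xi) = sum_i lambda_i * (w_i - xi_i)^2,  lambda_i >= 0.
def d_lambda(w, xi, lam):
    return float(np.sum(lam * (w - xi) ** 2))

w = np.array([1.0, 0.0, 0.0])
xi = np.array([0.0, 0.0, 2.0])

lam_uniform = np.ones(3) / 3.0               # plain (scaled) Euclidean
lam_focus = np.array([1.0, 0.0, 0.0])        # only feature 0 is relevant

print(d_lambda(w, xi, lam_uniform))          # (1 + 0 + 4) / 3 = 5/3
print(d_lambda(w, xi, lam_focus))            # 1.0
```

Adapting the $\lambda_i$ during training lets the classifier suppress irrelevant feature dimensions, as the second call illustrates.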
Dynamics of on-line training

sequence of new, independent random examples $\boldsymbol{\xi}^\mu$, $\mu = 1, 2, 3, \ldots$, drawn according to $P(\boldsymbol{\xi}^\mu) = \sum_{\sigma^\mu = \pm 1} p_{\sigma^\mu}\, P(\boldsymbol{\xi}^\mu \mid \sigma^\mu)$

generic update, moving the prototype towards or away from the current data:
$\mathbf{w}_s^\mu = \mathbf{w}_s^{\mu-1} + \frac{\eta}{N}\, f_s \left( \boldsymbol{\xi}^\mu - \mathbf{w}_s^{\mu-1} \right)$
with learning rate (step size) $\eta$ and a modulation function $f_s$ encoding the competition, the direction of the update, etc.

example — LVQ1, the original formulation [Kohonen], a Winner-Takes-All (WTA) algorithm:
$\mathbf{w}_s^\mu = \mathbf{w}_s^{\mu-1} + \frac{\eta}{N}\, \Theta\!\left( d_{-s}^\mu - d_s^\mu \right) s\, \sigma^\mu \left( \boldsymbol{\xi}^\mu - \mathbf{w}_s^{\mu-1} \right), \quad s = \pm 1$
with $d_s^\mu = \left( \boldsymbol{\xi}^\mu - \mathbf{w}_s^{\mu-1} \right)^2$: update of the two prototype vectors $\mathbf{w}_+$, $\mathbf{w}_-$
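The data model feeding this on-line dynamics can be sampled directly. A sketch assuming orthonormal cluster axes along the first two coordinates; all numeric parameters are illustrative:

```python
import numpy as np

# Draw i.i.d. examples: with probability p_sigma the label is sigma and
# xi ~ N(ell * B_sigma, v_sigma * identity).
def draw_examples(M, N, ell, p_plus, v_plus, v_minus, rng):
    B_plus = np.zeros(N); B_plus[0] = 1.0    # orthonormal cluster axes
    B_minus = np.zeros(N); B_minus[1] = 1.0
    sigma = np.where(rng.random(M) < p_plus, 1, -1)
    centers = np.where((sigma > 0)[:, None], B_plus, B_minus)
    v = np.where(sigma > 0, v_plus, v_minus)[:, None]
    xi = ell * centers + np.sqrt(v) * rng.normal(size=(M, N))
    return xi, sigma

rng = np.random.default_rng(6)
xi, sigma = draw_examples(20_000, 100, ell=2.0, p_plus=0.8,
                          v_plus=4.0, v_minus=9.0, rng=rng)
# class-+ mean along B+ is close to ell; mean squared length per
# dimension is close to p+ v+ + p- v- (plus an O(1/N) correction)
print(xi[sigma > 0, 0].mean())
print((xi ** 2).sum(axis=1).mean() / 100)
```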
Mathematical analysis of the learning dynamics: from the algorithm to recursions

1) description in terms of a few characteristic quantities (here: $\mathbb{R}^{2N} \to \mathbb{R}^{7}$)

$R_{s\sigma}^\mu = \mathbf{w}_s^\mu \cdot \mathbf{B}_\sigma$, \quad $Q_{st}^\mu = \mathbf{w}_s^\mu \cdot \mathbf{w}_t^\mu$
— projections into the $(\mathbf{B}_+, \mathbf{B}_-)$-plane; lengths and relative positions of the prototypes

2) average over the current example

$\boldsymbol{\xi}^\mu$ is a random vector according to $P(\boldsymbol{\xi} \mid \sigma)$, with average $\ell \mathbf{B}_\sigma$ and squared length $\left\langle \boldsymbol{\xi}^2 \right\rangle_\sigma \approx v_\sigma N$

in the thermodynamic limit $N \to \infty$, the projections
$x_s^\mu = \mathbf{w}_s^{\mu-1} \cdot \boldsymbol{\xi}^\mu$, \quad $y_\tau^\mu = \mathbf{B}_\tau \cdot \boldsymbol{\xi}^\mu$
become correlated Gaussian random quantities, completely specified in terms of the first and second moments listed above
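The dimensionality reduction in step 1 can be made concrete: however large $N$ is, the theory only ever needs seven numbers. A sketch with an illustrative random prototype configuration:

```python
import numpy as np

# The 2N-dimensional prototype configuration (w+, w-) enters the theory
# only via R_{s,sigma} = w_s . B_sigma and the symmetric Q_{st} = w_s . w_t:
# 4 R entries + 3 independent Q entries = 7 order parameters.
rng = np.random.default_rng(7)
N = 1000
B = np.zeros((2, N)); B[0, 0] = 1.0; B[1, 1] = 1.0   # B+, B-
w = rng.normal(size=(2, N)) / np.sqrt(N)             # w+, w- (O(1) norm)

R = w @ B.T          # 2x2 matrix, R[s, tau] = w_s . B_tau
Q = w @ w.T          # 2x2 symmetric matrix, Q[s, t] = w_s . w_t

order = np.array([R[0, 0], R[0, 1], R[1, 0], R[1, 1],
                  Q[0, 0], Q[0, 1], Q[1, 1]])
print(order.shape)   # (7,)
```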
Dynamics and generalization of LVQ Birmingham 09-12- 05
averaged recursions closed in p σ1σ
σ
μsσ
μst R Q
- depend on the random sequence of example data
- their fluctuations vanish with N
learning dynamics is completely described in terms of averages
3 self-averaging property of characteristic quantities
μsσ
μst R Q
1N
(mean and variance)
R++ (α=10) computer simulations (LVQ1)
- mean results approach theoretical prediction- variance vanishes as N
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
algorithm recursions
Mathematical analysis of the learning dynamics
11 σtsμt
μs
μstσ
μs
μsσ QBR www
projections into the (B+ B- )-plane
length and relativeposition of prototypes
1 description in terms of a few characteristic quantitities
( here ℝ2N ℝ7 )
2 average over the current example
random vector according to avg lengthσ)|P( μξ 22 vN σσ
ξ
in the thermodynamic limit N μμ
μ1-μs
μs
By
wx
ξ
ξ
correlated Gaussian random quantities
completely specified in terms of first and second moments (wo indices μ)
sσσ
N
1jjsσs R x
jw stσσtσsσt s Qv xx- xx
sσσσsσ s Rv yx- yx σσσσv yy- yy
sσσ y
Dynamics and generalization of LVQ Birmingham 09-12- 05
averaged recursions closed in p σ1σ
σ
μsσ
μst R Q
- depend on the random sequence of example data
- their fluctuations vanish with N
learning dynamics is completely described in terms of averages
3 self-averaging property of characteristic quantities
μsσ
μst R Q
1N
(mean and variance)
R++ (α=10) computer simulations (LVQ1)
- mean results approach theoretical prediction- variance vanishes as N
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
averaged recursions closed in p σ1σ
σ
μsσ
μst R Q
- depend on the random sequence of example data
- their fluctuations vanish with N
learning dynamics is completely described in terms of averages
3 self-averaging property of characteristic quantities
μsσ
μst R Q
1N
(mean and variance)
R++ (α=10) computer simulations (LVQ1)
- mean results approach theoretical prediction- variance vanishes as N
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
4 continuous learning time
N
μ α of examples
of learning stepsper degree of freedom
) α (R ) α (Q sσst integration yields evolution of projections
stochastic recursions deterministic ODE
probability for misclassification of a novel example
ddpddp gε
]2
]
]2
][
2
1
2
1
QQ[Qv
R[R2QQ
QQ[Q v
RR2QQpp
5 learning curve
generalization error εg(α) after training with α N examples
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
initialization ws(0)=0
theory and simulation (N=100)p+=08 v+=4 p+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
self-averaging property
(mean and variances)
1N
R++ (α=10)
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning From Mistakes (LFM)
1-μs
μμμσ-
μσ
1-μs
μs Sσdd
N
ημμ wξww
LVQ21 updateonly if the current classification is wrong
crisp limit version of Soft Robust LVQ [Seo and Obermayer 2003]
projected trajetory
ℓ B-
ℓ B+
RS+
RS-
εg
p+=08 ℓ=30
v+=40 v-=90
η= 20 10 05
Learning curves
η-independent asymptotic εg p+=08 ℓ= 12 v+=v-=10
Dynamics and generalization of LVQ Birmingham 09-12- 05
εg
p+
equal cluster variances
p+
unequal variances
best linear boundary
― LVQ1
--- LVQ21 (early stopping)middot-middot LFM
Comparison achievable generalization ability
v+=025 v-=081v+=v-=10
― trivial classification
Dynamics and generalization of LVQ Birmingham 09-12- 05
Vector Quantization
competitive learning 1-μs
μμS
μS
1-μs
μs dd
N
ηwξww
ws winner
class membership is unknown
or identical for all data
numerical integration for ws(0)asymp0 ( p+=02 ℓ=10 =12 )
εg
α
VQ
LVQ+
LVQ1
αα
R++
R+-
R-+
R--
100 200 3000
0
10system is invariant under
exchange of the prototypes
weakly repulsive fixed
points
Dynamics and generalization of LVQ Birmingham 09-12- 05
interpretations
- VQ unsupervised learning unlabelled data
- LVQ two prototypes of the same class identical labels
- LVQ different classes but labels are not used in training
εg
p+
asymptotics (0 )
p+asymp0
p-asymp1
- low quantization error- high gen error εg
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Dynamics and generalization of LVQ Birmingham 09-12- 05
Outlook
bullSelf-Organizing Maps (SOM)
neighborhood preserving SOM Neural Gas (distance rank based)
bull Generalized Relevance LVQ [eg Hammer amp Villmann]
adaptive metrics eg distance measure
N
i
iii w1
2)( sλ ξξwd
training
bullapplications
bull multi-class multi-prototype problems
bull optimized procedures learning rate schedules
variational approach Bayes optimal on-line
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ1 The winner takes it all
winner ws 1
1-μs
μμμS
μS
1-μs
μs Sσdd
N
ηwξww
only the winner is updated according to the class label
w-
w+
ℓ B-
ℓ B+
RS- w+
RS+
Trajectories in the (B+B- )-plane
(bull) =2040140 optimal decision boundary ____ asymptotic position
initialization ws(0)asymp0
theory and simulation (N=100)p+=08 v+=4 v+=9 ℓ=20 =10
averaged over 100 indep runs
Q++
Q--
Q+-
α
RSσ
tsst
σssσ
Q
BR
ww
w
Dynamics and generalization of LVQ Birmingham 09-12- 05
Learning curve
η= 201002
- suboptimal non-monotonic behavior for small η
εg (αinfin) grows linearly with η- stationary state
η 0 αinfin (η α ) infin
- well-defined asymptotics
η
εgp+ = 02 ℓ=10
v+ = v- = 10
achievable generalization error
εgεg
p+ p+
v+ = v- =10 v+ =025 v-=081
best linear boundary― LVQ1
Dynamics and generalization of LVQ Birmingham 09-12- 05
LVQ 21 [Kohonen] here update correct and wrong winner
1-μs
μ1-μs
μs Sσ
N
ηwξww
αQQRR
Q R R
with
finite remain
Q R R
R Q
Q R
α 102 4 86
6-
0
6theory and simulation (N=100)
p+=08 ℓ=1 v+=v-=1 =05
averages over 100 independent runs
problem instability of the algorithm
due to repulsion of wrong prototypes
trivial classification for αinfin
εg = min p+p- RS+
RS-
Dynamics and generalization of LVQ Birmingham 09-12- 05
suggested strategy
selection of data in a window close to the current decision boundary
slows down the repulsion system remains instable
Early stopping end training process at minimal εg (idealized)
εg
η= 20 10 05
η
- pronounced minimum in εg (α) depends on initialization and cluster geometry
- here lowest minimum value reached for η0
v+ =025 v-=081εg
p+
― LVQ1__ early stopping
Learning From Mistakes (LFM)

LVQ2.1 update, applied only if the current classification is wrong:

w_S^μ = w_S^{μ−1} + (η/N) Θ(d_σ^μ − d_{−σ}^μ) S σ^μ (ξ^μ − w_S^{μ−1})

crisp limit version of Soft Robust LVQ [Seo and Obermayer, 2003]

[Figure: projected trajectory in the (ℓB+, ℓB−)-plane, with R_S+, R_S− and ε_g; p+ = 0.8, ℓ = 3.0, v+ = 4.0, v− = 9.0; η = 2.0, 1.0, 0.5]

Learning curves: η-independent asymptotic ε_g (p+ = 0.8, ℓ = 1.2, v+ = v− = 1.0)
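A sketch of the LFM rule (illustrative, not from the slides): the LVQ2.1 step is carried out only when the nearest-prototype classification of the current example is wrong, i.e. when Θ(d_σ − d_{−σ}) = 1.

```python
def lfm_step(w_plus, w_minus, xi, sigma, eta):
    """Learning From Mistakes: perform the LVQ2.1 update only if the
    current nearest-prototype classification of xi is wrong.
    Returns True if an update was made."""
    d_plus = sum((a - b) ** 2 for a, b in zip(w_plus, xi))
    d_minus = sum((a - b) ** 2 for a, b in zip(w_minus, xi))
    pred = 1 if d_plus < d_minus else -1
    if pred == sigma:                 # correctly classified: Theta = 0, no update
        return False
    for w, s in ((w_plus, 1), (w_minus, -1)):
        f = eta * s * sigma           # correct prototype attracted, wrong repelled
        for i in range(len(w)):
            w[i] += f * (xi[i] - w[i])
    return True
```

Gating the update on mistakes is what stabilizes the dynamics relative to plain LVQ2.1: once the boundary classifies the bulk of the data correctly, most examples trigger no update at all.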
Comparison: achievable generalization ability

[Figure: ε_g vs. p+ for equal cluster variances (v+ = v− = 1.0) and for unequal variances (v+ = 0.25, v− = 0.81); curves: best linear boundary, ― LVQ1, --- LVQ2.1 (early stopping), ·−· LFM, ― trivial classification]
Vector Quantization

competitive learning:

w_S^μ = w_S^{μ−1} + (η/N) Θ(d_{−S}^μ − d_S^μ) (ξ^μ − w_S^{μ−1}),  with w_S the winner

class membership is unknown, or identical for all data

[Figure: numerical integration for w_S(0) ≈ 0 (p+ = 0.2, ℓ = 1.0, η = 1.2): ε_g(α) for VQ, LVQ+, and LVQ1; order parameters R++, R+−, R−+, R−− for α up to 300]

The system is invariant under exchange of the prototypes; weakly repulsive fixed points.
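The unsupervised competitive-learning step can be sketched analogously (illustrative code, not from the slides): for K prototypes, no labels enter, and the winner is simply pulled towards the example.

```python
def vq_step(prototypes, xi, eta):
    """Unsupervised winner-take-all VQ step: move only the closest
    prototype towards the example; no class labels are involved."""
    dists = [sum((wi - x) ** 2 for wi, x in zip(w, xi)) for w in prototypes]
    s = dists.index(min(dists))
    prototypes[s] = [wi + eta * (x - wi) for wi, x in zip(prototypes[s], xi)]
    return s
```

This is the LVQ1 step with the S σ^μ factor removed, which is why the same dynamics describes VQ, LVQ+ (identical labels), and LVQ with labels ignored.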
interpretations:
- VQ: unsupervised learning, unlabelled data
- LVQ+: two prototypes of the same class, identical labels
- LVQ: different classes, but labels are not used in training

asymptotics (η → 0, ηα → ∞):

[Figure: asymptotic ε_g vs. p+; for p+ ≈ 0 (p− ≈ 1), both prototypes represent the stronger cluster]

- low quantization error, but high generalization error ε_g
Dynamics and generalization of LVQ Birmingham 09-12- 05
Summary
bulla model scenario of LVQ training
two clusters two prototypes
dynamics of online training
bullcomparison of algorithms (within the model)
LVQ 1 original formulation of LVQ
with close to optimal asymptotic generalization
LVQ 21 intuitive extension creates instability
trivial (stationary) classification
+ stopping potentially good performance
practical difficulties depends on initialization
LFM crisp limit of Soft Robust LVQ stable behavior
far from optimal generalization
VQ description of in-class competition
Outlook

• Self-Organizing Maps (SOM): neighborhood-preserving SOM, Neural Gas (distance-rank based)
• Generalized Relevance LVQ [e.g. Hammer & Villmann]: adaptive metrics, e.g. the distance measure

  d_λ(w_S, ξ) = Σ_{i=1}^N λ_i (ξ_i − w_{S,i})²,  with training of the relevances λ_i

• applications
• multi-class, multi-prototype problems
• optimized procedures: learning rate schedules, variational approach, Bayes optimal on-line
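The adaptive distance above can be sketched as a relevance-weighted squared Euclidean distance (illustrative code; in Generalized Relevance LVQ the relevances λ_i are themselves adapted during training, which is omitted here).

```python
def relevance_distance(w, xi, lam):
    """GRLVQ-style adaptive metric: d_lambda(w, xi) = sum_i lambda_i (xi_i - w_i)^2,
    with relevance factors lambda_i >= 0 (typically normalized to sum to one).
    Setting all lambda_i = 1 recovers the plain squared Euclidean distance."""
    return sum(l * (x - wi) ** 2 for l, x, wi in zip(lam, xi, w))
```

A dimension with λ_i = 0 is ignored entirely, which is how the adapted metric suppresses irrelevant input features.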