Bregman Information Bottleneck

Koby Crammer, Hebrew University of Jerusalem; Noam Slonim, Princeton University. NIPS’03, Whistler, December 2003.


Page 1: Bregman  Information Bottleneck

Bregman Information Bottleneck

NIPS’03, Whistler, December 2003

Koby Crammer, Hebrew University of Jerusalem

Noam Slonim, Princeton University

Page 2: Bregman  Information Bottleneck

Motivation

• Extend the IB for a broad family of representations

• Relation to the Exponential family

Hello, world

Multinomial distribution

Vectors

Page 3: Bregman  Information Bottleneck

Outline

• Rate-Distortion Formulation

• Bregman Divergences

• Bregman IB

• Statistical Interpretation

• Summary

Page 4: Bregman  Information Bottleneck

Information Bottleneck

[Figure: the variables X, T and Y; each x is represented by the vector [p(y=1|x) … p(y=n|x)] and each t by the vector [p(y=1|t) … p(y=n|t)]]

Page 5: Bregman  Information Bottleneck

Rate-Distortion Formulation

• Input

• Variables

• Distortion
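For reference, a sketch of the standard rate-distortion form of the IB, which the three bullets presumably refer to. The notation (p, d, β) is assumed and is not transcribed from the slide:

```latex
\begin{align*}
\text{Input:}      \quad & p(x, y) \\
\text{Variables:}  \quad & p(t \mid x) \\
\text{Distortion:} \quad & d(x, t) = D_{\mathrm{KL}}\big(p(y \mid x) \,\|\, p(y \mid t)\big) \\
\text{Functional:} \quad & \min_{p(t \mid x)} \; I(X;T) + \beta \sum_{x,t} p(x)\, p(t \mid x)\, d(x, t)
\end{align*}
```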

Page 6: Bregman  Information Bottleneck

Self-Consistent Equations

• Boltzmann Distribution:

• Markov + Bayes

• Marginal
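The three bullets match the usual fixed-point equations of the IB. A sketch in assumed notation (Z is the normalizer, β the trade-off parameter), not a transcription of the slide:

```latex
\begin{align*}
\text{Boltzmann distribution:} \quad
  & p(t \mid x) = \frac{p(t)}{Z(x, \beta)}
    \exp\!\Big(-\beta\, D_{\mathrm{KL}}\big(p(y \mid x) \,\|\, p(y \mid t)\big)\Big) \\
\text{Markov + Bayes:} \quad
  & p(y \mid t) = \frac{1}{p(t)} \sum_{x} p(y \mid x)\, p(t \mid x)\, p(x) \\
\text{Marginal:} \quad
  & p(t) = \sum_{x} p(x)\, p(t \mid x)
\end{align*}
```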

Page 7: Bregman  Information Bottleneck

Bregman Divergences

$f: S \to \mathbb{R}$

[Figure: a convex function f with the points (u, f(u)) and (v, f(v)), and the point (v, f(u) + f'(u)(v - u)) on the tangent to f at u]

$B_f(v\|u) = f(v) - \big(f(u) + f'(u)(v-u)\big)$
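The definition is the gap between f(v) and the tangent to f at u, evaluated at v. A minimal Python sketch of that definition follows, using the two generators listed under Special Cases. The function names are illustrative and the vector (gradient) form is an assumed extension of the scalar formula on the slide:

```python
import math

def bregman(f, grad_f, v, u):
    """B_f(v||u) = f(v) - (f(u) + <grad f(u), v - u>), vectors as lists."""
    tangent = f(u) + sum(g * (vi - ui) for g, vi, ui in zip(grad_f(u), v, u))
    return f(v) - tangent

# Generator f(x) = sum_i (x_i log x_i - x_i); needs strictly positive entries.
def neg_entropy(x):
    return sum(xi * math.log(xi) - xi for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) for xi in x]

# Generator f(x) = 0.5 * ||x||^2.
def half_sq(x):
    return 0.5 * sum(xi * xi for xi in x)

def half_sq_grad(x):
    return list(x)

def kl(p, q):
    """Kullback-Leibler divergence between two strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p, q = [0.8, 0.2], [0.6, 0.4]
d_kl = bregman(neg_entropy, neg_entropy_grad, p, q)             # agrees with kl(p, q)
d_sq = bregman(half_sq, half_sq_grad, [1.0, 2.0], [3.0, 5.0])   # 0.5 * (4 + 9)
```

On the simplex the first generator reproduces the KL divergence, and the second gives half the squared Euclidean distance.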

Page 8: Bregman  Information Bottleneck

Bregman IB: Rate-Distortion Formulation

• Functional

• Bregman Function

• Input

• Variables

• Distortion
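One plausible reading of these bullets replaces the KL distortion of the IB with a Bregman divergence between an input vector and a prototype. The symbols v_x and v_t are assumptions introduced here for illustration and do not come from the slide:

```latex
\begin{align*}
\text{Bregman function:} \quad & f : S \to \mathbb{R} \ \text{(strictly convex)} \\
\text{Input:}      \quad & p(x), \ \text{vectors } v_x \in S \\
\text{Variables:}  \quad & p(t \mid x), \ \text{prototypes } v_t \in S \\
\text{Distortion:} \quad & d(x, t) = B_f(v_x \,\|\, v_t) \\
\text{Functional:} \quad & \min \; I(X;T) + \beta \sum_{x,t} p(x)\, p(t \mid x)\, B_f(v_x \,\|\, v_t)
\end{align*}
```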

Page 9: Bregman  Information Bottleneck

Self-Consistent Equations

• Boltzmann Distribution:

• Prototypes: convex combination of input vectors

• Marginal
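The three updates can be iterated in turn. The following Python sketch assumes the updates take the form p(t|x) ∝ p(t) exp(-β d(x,t)), prototypes as p(x|t)-weighted means, and p(t) = Σ_x p(x) p(t|x). The function names, the toy data and the choice of the squared Euclidean divergence are illustrative and do not come from the talk:

```python
import math

def sq_euclid(v, u):
    """Bregman divergence of f(x) = 0.5 * ||x||^2."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(v, u))

def bregman_ib_step(xs, px, protos, pt, beta, div):
    """One round of the three updates; returns (p(t|x), prototypes, p(t))."""
    n, k, dim = len(xs), len(protos), len(xs[0])
    # Boltzmann distribution: p(t|x) proportional to p(t) * exp(-beta * d(x, t)).
    ptx = []
    for x in xs:
        w = [pt[t] * math.exp(-beta * div(x, protos[t])) for t in range(k)]
        z = sum(w)
        ptx.append([wi / z for wi in w])
    # Marginal: p(t) = sum_x p(x) p(t|x).
    new_pt = [sum(px[i] * ptx[i][t] for i in range(n)) for t in range(k)]
    # Prototypes: convex combination of the inputs with weights p(x|t).
    new_protos = []
    for t in range(k):
        wts = [px[i] * ptx[i][t] / new_pt[t] for i in range(n)]
        new_protos.append(
            [sum(wts[i] * xs[i][d] for i in range(n)) for d in range(dim)]
        )
    return ptx, new_protos, new_pt

# Toy data: two well-separated pairs of points with uniform p(x).
xs = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
px = [0.25] * 4
protos, pt = [[0.0, 0.5], [4.0, 4.0]], [0.5, 0.5]
for _ in range(20):
    ptx, protos, pt = bregman_ib_step(xs, px, protos, pt, 2.0, sq_euclid)
```

With this divergence the loop behaves like soft k-means. The sketch does not guard against a cluster whose p(t) reaches zero or against underflow at very large β.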

Page 10: Bregman  Information Bottleneck

Special Cases

• Information Bottleneck
  Bregman function: f(x) = x log(x) - x
  Domain: Simplex
  Divergence: Kullback-Leibler

• Soft K-means
  Bregman function: f(x) = (1/2) x^2
  Domain: R^n
  Divergence: Squared Euclidean distance [Still, Bialek, Bottou, NIPS 2003]
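Both special cases follow by substituting the Bregman function into the definition of the divergence. A short worked derivation, written for vectors with the generators summed over coordinates (an assumed but standard extension of the scalar forms):

```latex
\begin{align*}
f(x) = \sum_i \big(x_i \log x_i - x_i\big): \quad
B_f(v \| u) &= \sum_i v_i \log \frac{v_i}{u_i} - \sum_i v_i + \sum_i u_i \\
            &= D_{\mathrm{KL}}(v \| u)
               \quad \text{when } \textstyle\sum_i v_i = \sum_i u_i = 1, \\[4pt]
f(x) = \tfrac{1}{2} \|x\|^2: \quad
B_f(v \| u) &= \tfrac{1}{2}\|v\|^2 - \tfrac{1}{2}\|u\|^2 - \langle u, v - u \rangle \\
            &= \tfrac{1}{2} \|v - u\|^2 .
\end{align*}
```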

Page 11: Bregman  Information Bottleneck

Bregman IB

Information Bottleneck

Bregman Clustering

Rate-Distortion

Exponential Family

Page 12: Bregman  Information Bottleneck

Exponential Family

• Expectation parameters:

• Examples (single dimension): Normal, Poisson

Page 13: Bregman  Information Bottleneck

Exponential Family and Bregman Divergences

• Expectation parameters:

• Properties:
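The known correspondence between exponential families and Bregman divergences can be summarized as follows. The notation (θ, ψ, μ, p_0) is assumed here and may differ from the slide:

```latex
\begin{align*}
p(x \mid \theta) &= p_0(x)\, \exp\big(\langle \theta, x \rangle - \psi(\theta)\big),
  \qquad \mu = \nabla \psi(\theta) = \mathbb{E}_{\theta}[x], \\
f(\mu) &= \sup_{\theta} \big\{ \langle \theta, \mu \rangle - \psi(\theta) \big\}
  \quad \text{(convex conjugate of } \psi\text{)}, \\
\log p(x \mid \theta) &= -B_f(x \,\|\, \mu) + f(x) + \log p_0(x) .
\end{align*}
```

Since the last two terms do not depend on the parameter, maximizing the likelihood over μ is the same as minimizing the Bregman divergence from the observation to μ.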

Page 14: Bregman  Information Bottleneck

Illustration

Page 15: Bregman  Information Bottleneck

Exponential Family and Bregman Divergences

• Expectation parameters:

• Properties:

Page 16: Bregman  Information Bottleneck

Back to Distributional Clustering

• Distortion:

• Data vectors and prototypes: expectation parameters

• Question: Which exponential-family distribution yields this distortion?

Answer: Poisson
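The Poisson answer can be checked numerically. The sketch below assumes the distortion in question is the Bregman divergence of f(x) = x log(x) - x, the IB case. It verifies that the Poisson log-likelihood equals the negative divergence up to a term that does not depend on the mean. The function names are illustrative:

```python
import math

def gen_kl(x, mu):
    """Bregman divergence of f(x) = x log x - x for scalars x, mu > 0."""
    return x * math.log(x / mu) - x + mu

def poisson_logpmf(x, mu):
    """log of the Poisson probability of count x with mean mu."""
    return x * math.log(mu) - mu - math.lgamma(x + 1)

# log p(x | mu) + B_f(x || mu) should be the same for every mu.
x = 7
offsets = [poisson_logpmf(x, mu) + gen_kl(x, mu) for mu in (0.5, 3.0, 7.0, 20.0)]
expected = x * math.log(x) - x - math.lgamma(x + 1)   # f(x) - log x!
```

Every entry of `offsets` equals `expected` up to rounding, so the divergence and the negative log-likelihood differ only by a function of the observation.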

Page 17: Bregman  Information Bottleneck

Illustration

[Figure: the sequence a a b a a a b a a a shown as a multinomial distribution with Pr(a) = .8 and Pr(b) = .2, next to a product of Poisson distributions with the values 60 for a and 40 for b]

Page 18: Bregman  Information Bottleneck

Back to Distributional Clustering

• Information Bottleneck: Distributional clustering of Poisson distributions

• (Soft) k-means: (Soft) Clustering of Normal distributions

Page 19: Bregman  Information Bottleneck

Maximum Likelihood Perspective

• Distortion

• Input: Observations

• Output: Parameters of the distribution

• IB functional: EM [Elidan & Friedman, before]

Page 20: Bregman  Information Bottleneck

Back to Self-Consistent Equations

• Posterior:

• Partition Function:

Weighted β-norm of the Likelihood

• β → ∞: the most likely cluster governs

• β → 0: clusters collapse into a single prototype
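The two limits can be seen in a small numerical sketch. It assumes the posterior has the form p(t|x) ∝ p(t) L_t^β, where L_t is the likelihood of x under cluster t and β is the exponent in the Boltzmann distribution. The prior and the log-likelihood values are made up for illustration:

```python
import math

def posterior(prior, log_lik, beta):
    """p(t|x) proportional to p(t) * L_t**beta, computed in log space."""
    scores = [math.log(p) + beta * ll for p, ll in zip(prior, log_lik)]
    m = max(scores)                      # subtract the max to avoid underflow
    w = [math.exp(s - m) for s in scores]
    z = sum(w)
    return [wi / z for wi in w]

prior = [0.5, 0.3, 0.2]
log_lik = [-4.0, -1.0, -2.5]             # cluster 1 is the most likely one

at_zero = posterior(prior, log_lik, 0.0)      # ignores x: equals the prior
at_large = posterior(prior, log_lik, 200.0)   # concentrates on cluster 1
```

At β = 0 the assignment does not depend on x, so every prototype becomes the same average of the data. At large β the cluster with the highest likelihood takes essentially all of the mass, regardless of the prior.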

Page 21: Bregman  Information Bottleneck

Summary

• Bregman Information Bottleneck: clustering/compression for many representations and divergences

• Statistical interpretation: clustering of distributions from the exponential family; EM-like formulation

• Current work: algorithms; characterizing distortion measures which also yield Boltzmann distributions; general distortion measures