concentration inequalities and tail bounds€¦ · concentration inequalities and tail bounds john...

24
Concentration inequalities and tail bounds John Duchi Prof. John Duchi

Upload: others

Post on 14-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Concentration inequalities and tail bounds

John Duchi

Prof. John Duchi

Page 2: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Outline

I Basics and motivation1 Law of large numbers2 Markov inequality3 Cherno↵ bounds

II Sub-Gaussian random variables1 Definitions2 Examples3 Hoe↵ding inequalities

III Sub-exponential random variables1 Definitions2 Examples3 Cherno↵/Bernstein bounds

Prof. John Duchi

Page 3: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Motivation

I Often in this class, goal is to argue that sequence of random(vectors) X1, X2, . . . satisfies

1

n

nX

i=1

X

i

p! E[X].

I Law of large numbers: if E[kXk] < 1, then

P

lim

n!1

1

n

nX

i=1

X

i

6= E[X]

!= 0.

Prof. John Duchi

Page 4: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Markov inequalities

Theorem (Markov’s inequality)

Let X be a non-negative random variable. Then

P(X � t) E[X]

t

.

Prof. John Duchi

Page 5: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Chebyshev inequalitiesTheorem (Chebyshev’s inequality)

Let X be a real-valued random variable with E[X2] < 1. Then

P(X � E[X] � t) E[(X � E[X])

2]

t

2=

Var(X)

t

2.

Example: i.i.d. sampling

Prof. John Duchi

Page 6: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Cherno↵ bounds

Moment generating function: for random variable X, the MGF is

M

X

(�) := E[e�X ]

Example: Normally distributed random variables

Prof. John Duchi

Page 7: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Cherno↵ bounds

Theorem (Cherno↵ bound)

For any random variable and t � 0,

P(X �E[X] � t) inf

��0M

X�E[X](�)e��t

= inf

��0E[e�(X�E[X])

]e

��t

.

Prof. John Duchi

Page 8: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Sub-Gaussian random variables

Definition (Sub-Gaussianity)

A mean-zero random variable X is �2-sub-Gaussian if

Ehe

�X

i exp

✓�

2�

2

2

◆for all � 2 R

Example: X ⇠ N(0,�2)

Prof. John Duchi

Page 9: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Properties of sub-Gaussians

Proposition (sums of sub-Gaussians)

Let X

i

be independent, mean-zero �

2i

-sub-Gaussian. ThenPn

i=1Xi

is

Pn

i=1 �2i

-sub-Gaussian.

Prof. John Duchi

Page 10: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Concentration inequalities

TheoremLet X be �

2-sub-Gaussian. Then for t � 0,

P(X � E[X] � t) exp

✓� t

2

2�

2

P(X � E[X] �t) exp

✓� t

2

2�

2

Prof. John Duchi

Page 11: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Concentration: convergence of an independent sum

CorollaryLet X

i

be independent �

2i

-sub-Gaussian. Then for t � 0,

P 1

n

nX

i=1

X

i

� t

! exp

� nt

2

2

1n

Pn

i=1 �2i

!

Prof. John Duchi

Page 12: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Example: bounded random variables

PropositionLet X 2 [a, b], with E[X] = 0. Then

E[e�X ] e

�2(b�a)2

8.

Prof. John Duchi

Page 13: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Maxima of sub-Gaussian random variables (in probability)

Emax

jn

X

j

�p2�

2log n

Prof. John Duchi

Page 14: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Maxima of sub-Gaussian random variables (in expectation)

P✓max

jn

X

j

�p2�

2(log n+ t)

◆ e

�t

.

Prof. John Duchi

Page 15: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Hoe↵ding’s inequality

If Xi

are bounded in [a

i

, b

i

] then for t � 0,

P 1

n

nX

i=1

(X

i

� E[Xi

]) � t

! exp

� 2nt

2

1n

Pn

i=1(bi � a

i

)

2

!

P 1

n

nX

i=1

(X

i

� E[Xi

]) �t

! exp

� 2nt

2

1n

Pn

i=1(bi � a

i

)

2

!.

Prof. John Duchi

Page 16: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Equivalent definitions of sub-Gaussianity

TheoremThe following are equivalent (up to constants)

i E[exp(X2/�

2)] e

ii E[|X|k]1/k �

pk

iii P(|X| � t) exp(� t

2

2�2 )

If in addition X is mean-zero, then this is also equivalent to i–iii

above

iv X is �

2-sub-Gaussian

Prof. John Duchi

Page 17: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Sub-exponential random variables

Definition (Sub-exponential)

A mean-zero random variable X is (⌧2, b)-sub-Exponential if

E [exp (�X)] exp

✓�

2⌧

2

2

◆for |�| 1

b

.

Example: Exponential RV, density p(x) = �e

��x for x � 0

Prof. John Duchi

Page 18: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Sub-exponential random variables

Example: �2-random variable. Let Z ⇠ N(0,�2) and X = Z

2.Then

E[e�X ] =

1

[1� 2��

2]

12+

.

Prof. John Duchi

Page 19: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Concentration of sub-exponentials

TheoremLet X be (⌧

2, b)-sub-exponential. Then

P(X � E[X]+t) (e

� t2

2⌧2if 0 t ⌧

2

b

e

� t2b

if t � ⌧

2

b

= max

⇢e

� t2

2⌧2, e

� t2b

�.

Prof. John Duchi

Page 20: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Sums of sub-exponential random variables

Let Xi

be independent (⌧2i

, b

i

)-sub-exponential random variables.Then

Pn

i=1Xi

is (P

n

i=1 ⌧2i

, b⇤)-sub-exponential, whereb⇤ = max

i

b

i

Corollary: If Xi

satisfy above, then

P �����

1

n

nX

i=1

X

i

� E[Xi

]

����� � t

! 2 exp

�min

(nt

2

2

1n

Pn

i=1 ⌧2i

,

nt

2b⇤

)!.

Prof. John Duchi

Page 21: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Bernstein conditions and sub-exponentialsSuppose X is mean-zero with

|E[Xk

]| 1

2

k!�

2b

k�2

Then

E[e�X ] exp

✓�

2�

2

2(1� b|�|)

Prof. John Duchi

Page 22: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Johnson-Lindenstrauss and high-dimensional embedding

Question: Let u1, . . . , um 2 Rd be arbitrary. Can we find amapping F : Rd ! Rn, n ⌧ d, such that

(1� �)

��u

i � u

j

��22��F (u

i

)� F (u

j

)

��22 (1 + �)

��u

i � u

j

��22

Theorem (Johnson-Lindenstrauss embedding)

For n & 1✏

2 logm such a mapping exists.

Prof. John Duchi

Page 23: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Proof of Johnson-Lindenstrauss continued

P �����

kXuk22n kuk22

� 1

����� � t

! 2 exp

✓�nt

2

8

◆for t 2 [0, 1].

Prof. John Duchi

Page 24: Concentration inequalities and tail bounds€¦ · Concentration inequalities and tail bounds John Duchi Prof. John Duchi. Outline I Basics and motivation 1 Law of large numbers 2

Reading and bibliography

1. S. Boucheron, O. Bousquet, and G. Lugosi. Concentrationinequalities.In O. Bousquet, U. Luxburg, and G. Ratsch, editors, AdvancedLectures in Machine Learning, pages 208–240. Springer, 2004

2. V. Buldygin and Y. Kozachenko. Metric Characterization of

Random Variables and Random Processes, volume 188 ofTranslations of Mathematical Monographs.American Mathematical Society, 2000

3. M. Ledoux. The Concentration of Measure Phenomenon.American Mathematical Society, 2001

4. S. Boucheron, G. Lugosi, and P. Massart. ConcentrationInequalities: a Nonasymptotic Theory of Independence.Oxford University Press, 2013

Prof. John Duchi