PAC Learning and the VC Dimension


Page 1: PAC Learning and The VC Dimension
Page 2: PAC Learning and The VC Dimension

Fix a rectangle (unknown to you):

From An Introduction to Computational Learning Theory by Kearns and Vazirani

Page 3: PAC Learning and The VC Dimension

Draw points from some fixed unknown distribution:

Page 4: PAC Learning and The VC Dimension

You are told the points and whether they are in or out:

Page 5: PAC Learning and The VC Dimension

You propose a hypothesis:

Page 6: PAC Learning and The VC Dimension

Your hypothesis is tested on points drawn from the same distribution:

Page 7: PAC Learning and The VC Dimension

We want an algorithm that:
◦ With high probability will choose a hypothesis that is approximately correct (hence “Probably Approximately Correct”, or PAC).

Page 8: PAC Learning and The VC Dimension

Choose the minimum area rectangle containing all the positive points:

[Figure: tightest-fit rectangle hypothesis h around the positive points]
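A minimal sketch of this tightest-fit learner (the function names are mine, not the slides'):

```python
# The hypothesis h is the smallest axis-aligned rectangle containing
# every positive training point (assumes at least one positive example).
import numpy as np

def fit_rectangle(X, y):
    """X: (m, 2) points; y: (m,) labels in {+1, -1}. Returns (lo, hi) corners of h."""
    pos = X[y == 1]
    return pos.min(axis=0), pos.max(axis=0)

def predict(h, X):
    lo, hi = h
    return np.where(np.all((X >= lo) & (X <= hi), axis=1), 1, -1)
```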

Page 9: PAC Learning and The VC Dimension

Derive a PAC bound:

For fixed:
◦ R : Rectangle
◦ D : Data Distribution
◦ ε : Test Error
◦ δ : Probability of failing
◦ m : Number of Samples

[Figure: target rectangle R with hypothesis h inside it]

Page 10: PAC Learning and The VC Dimension

We want to show that with high probability the mass (measured with respect to D) of the error region between h and R is bounded by ε:

[Figure: hypothesis h inside target R; the shaded region between them has mass < ε]

Page 11: PAC Learning and The VC Dimension

It suffices to split the error region into four strips (top, bottom, left, right) and show that, with high probability, each strip has mass (with respect to D) at most ε/4:

[Figure: the top strip of the error region between h and R has mass < ε/4]

Page 12: PAC Learning and The VC Dimension

Define T to be the region that contains exactly ε/4 of the mass in D sweeping down from the top of R.

Let T′ be the top strip of the actual error region (the part of R above the top edge of h). Then:

p(T′) > ε/4 = p(T) iff T′ contains T

T′ contains T iff none of our m samples are from T

What is the probability that all samples miss T?

[Figure: strips T′ and T sweeping down from the top of R, with p(T) = ε/4]

Page 13: PAC Learning and The VC Dimension

What is the probability that all m samples miss T? Since p(T) = ε/4 and the samples are drawn independently from D, it is (1 − ε/4)^m.

What is the probability that we miss any of the four strips?
◦ Union Bound: at most 4(1 − ε/4)^m

[Figure: strips T′ and T at the top of R, with p(T) = ε/4]
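Spelled out as a worked bound (the notation is mine, not the slides': E_i is the event that all m samples miss strip T_i, for the four strips):

```latex
\Pr[\mathrm{err}_D(h) > \epsilon]
  \le \Pr[E_1 \cup E_2 \cup E_3 \cup E_4]
  \le \sum_{i=1}^{4} \Pr[E_i]
  = 4\left(1 - \frac{\epsilon}{4}\right)^{m}
```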

Page 14: PAC Learning and The VC Dimension

[Figure: union bound illustration, p(A ∪ B) ≤ p(A) + p(B) for events A and B]

Page 15: PAC Learning and The VC Dimension

What is the probability that all m samples miss T: (1 − ε/4)^m

What is the probability that we miss any of the four strips?
◦ Union Bound: 4(1 − ε/4)^m

[Figure: strip T at the top of R, with p(T) = ε/4]

Page 16: PAC Learning and The VC Dimension

The probability that any strip retains weight greater than ε/4 after m samples is at most 4(1 − ε/4)^m.

If we fix m such that 4(1 − ε/4)^m ≤ δ, then with probability 1 − δ we achieve an error rate of at most ε.

[Figure: strip T at the top of R, with p(T) = ε/4]
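To see the guarantee in action, here is a quick Monte Carlo check; it is a sketch under assumptions of my own (D uniform on [0,1]², target R = [0.2, 0.8]², the helper names), none of which come from the slides:

```python
# Monte Carlo check of the PAC guarantee for the tightest-fit rectangle
# learner. Assumed setup: D uniform on [0,1]^2, target R = [0.2, 0.8]^2.
import numpy as np

rng = np.random.default_rng(0)
eps, delta = 0.1, 0.05
m = int(np.ceil(4 / eps * np.log(4 / delta)))   # m satisfying 4(1 - eps/4)^m <= delta

lo_R, hi_R = np.array([0.2, 0.2]), np.array([0.8, 0.8])
area_R = np.prod(hi_R - lo_R)                   # mass of R under the uniform D

failures, trials = 0, 1000
for _ in range(trials):
    X = rng.random((m, 2))
    pos = X[np.all((X >= lo_R) & (X <= hi_R), axis=1)]
    if len(pos) == 0:
        err = area_R                            # empty hypothesis misses all of R
    else:
        lo_h, hi_h = pos.min(axis=0), pos.max(axis=0)
        err = area_R - np.prod(hi_h - lo_h)     # h lies inside R, so the error
                                                # region is R minus h
    failures += err > eps
print(f"m = {m}, empirical failure rate = {failures / trials:.3f} (bound: delta = {delta})")
```

The empirical failure rate comes out far below δ, which is expected: the union bound is loose.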

Page 17: PAC Learning and The VC Dimension

Common Inequality: (1 − x) ≤ e^(−x)

We can show: 4(1 − ε/4)^m ≤ 4e^(−mε/4) ≤ δ

Obtain a lower bound on the samples: m ≥ (4/ε) ln(4/δ)
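As a sanity check on the arithmetic, a minimal sketch (the helper name is mine):

```python
import math

def pac_sample_bound(eps, delta):
    """Smallest integer m with (4 / eps) * ln(4 / delta) <= m, which
    guarantees 4 * (1 - eps/4)**m <= delta."""
    return math.ceil(4 / eps * math.log(4 / delta))

print(pac_sample_bound(0.1, 0.05))   # 176 samples suffice for eps=0.1, delta=0.05
```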

Page 18: PAC Learning and The VC Dimension

The VC dimension provides a measure of the complexity of a “hypothesis space”, or the “power” of a “learning machine”.

Higher VC dimension implies the ability to represent more complex functions

The VC dimension is the maximum number of points that can be arranged so that f shatters them.

What does it mean to shatter?

Page 19: PAC Learning and The VC Dimension

A classifier f can shatter a set of points if and only if, for every assignment of labels to those points, f achieves zero training error.

Example: f(x,b) = sign(x·x − b), which labels as positive exactly the points outside the circle of radius √b centered at the origin.
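A brute-force shattering check makes the definition concrete; this is a sketch with my own function names and a grid of candidate thresholds b:

```python
# f(x, b) = sign(x . x - b) shatters a point set iff every +/-1 labeling
# of the points is realized by some threshold b.
import itertools
import numpy as np

def f(X, b):
    return np.where(np.sum(X * X, axis=1) - b > 0, 1, -1)

def shatters(X, bs):
    labelings = {tuple(f(X, b)) for b in bs}
    return all(y in labelings
               for y in itertools.product([-1, 1], repeat=len(X)))

bs = np.linspace(-1.0, 10.0, 1000)
print(shatters(np.array([[1.0, 0.0]]), bs))              # True: one point
print(shatters(np.array([[1.0, 0.0], [2.0, 0.0]]), bs))  # False: the labeling
                                                         # (+, -) would need b < 1
                                                         # and b >= 4 at once
```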

Page 20: PAC Learning and The VC Dimension

What is the VC Dimension of the classifier:
◦ f(x,b) = sign(x·x − b)

Page 21: PAC Learning and The VC Dimension

Conjecture: VC-Dim = 1

Easy Proof (lower bound): a single point x with x·x = 1 is shattered, since b = 0 labels it positive and b = 2 labels it negative.

Page 22: PAC Learning and The VC Dimension

Harder Proof (upper bound): no two points can be shattered. Take x_1, x_2 with x_1·x_1 ≤ x_2·x_2; the labeling (+, −) is unachievable, since x_1·x_1 > b forces x_2·x_2 > b. Hence VC-Dim = 1.

Page 23: PAC Learning and The VC Dimension

VC Dimension Conjecture: 4 (axis-aligned rectangles)
◦ Lower bound: four points arranged in a diamond can be shattered by axis-aligned rectangles.

Page 24: PAC Learning and The VC Dimension

VC Dimension Conjecture: 4

Upper bound (more difficult): no five points can be shattered. Among any five points, pick a topmost, bottommost, leftmost, and rightmost point; any axis-aligned rectangle containing those four must also contain the fifth, so labeling the four positive and the fifth negative is unachievable (see the brute-force check below).
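A brute-force check of both directions (a sketch; the diamond point set and the grid of candidate rectangles are my own choices):

```python
# Axis-aligned rectangles shatter 4 points arranged in a diamond, but
# adding a 5th point at the center makes shattering impossible.
import itertools
import numpy as np

def rect_label(lo, hi, X):
    return tuple(np.where(np.all((X >= lo) & (X <= hi), axis=1), 1, -1))

def shattered_by_rectangles(X, grid):
    labelings = set()
    for lo in itertools.product(grid, grid):
        for hi in itertools.product(grid, grid):
            labelings.add(rect_label(np.array(lo), np.array(hi), X))
    return all(y in labelings
               for y in itertools.product([-1, 1], repeat=len(X)))

grid = np.linspace(-2.0, 2.0, 9)
diamond = np.array([[0, 1], [0, -1], [1, 0], [-1, 0]], dtype=float)
print(shattered_by_rectangles(diamond, grid))     # True: all 16 labelings
five = np.vstack([diamond, [[0.0, 0.0]]])
print(shattered_by_rectangles(five, grid))        # False: the center point
                                                  # can't be excluded alone
```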

Page 25: PAC Learning and The VC Dimension

What is the VC Dimension of:
◦ f(x,{w,b}) = sign(w · x + b)
◦ x in R^d

Proof (lower bound):
◦ Pick point locations {x_1, …, x_n}: take the origin together with the standard basis vectors e_1, …, e_d (so n = d + 1).
◦ The adversary gives label assignments {y_1, …, y_n} and you choose the weights w_1, …, w_d and b: setting b to the origin's label and w_i = 2y_i gives sign(w · x_i + b) = y_i for every point, so these d + 1 points are shattered.
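A small sketch of this construction (variable names are mine):

```python
# Shatter {origin, e_1, ..., e_d} with f(x) = sign(w . x + b):
# set b = label(origin) and w_i = 2 * label(e_i).
import numpy as np

def fit_labels(y_origin, y_basis):
    w = 2.0 * np.asarray(y_basis, dtype=float)
    b = float(y_origin)
    return w, b

d = 3
y_origin, y_basis = -1, [1, -1, 1]
w, b = fit_labels(y_origin, y_basis)
X = np.vstack([np.zeros(d), np.eye(d)])         # the origin plus e_1..e_d
print(np.sign(X @ w + b).astype(int).tolist())  # [-1, 1, -1, 1], as requested
```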

Page 26: PAC Learning and The VC Dimension
Page 27: PAC Learning and The VC Dimension

Proof (upper bound): VC-Dim = d + 1
◦ Take any d + 2 points; observe that the last point can always be expressed as an affine combination of the other d + 1:
x_{d+2} = Σ_i a_i x_i, with Σ_i a_i = 1

Page 28: PAC Learning and The VC Dimension

Proof (upper bound), continued:
◦ Since w · x_{d+2} + b = Σ_i a_i (w · x_i + b), choosing the labels y_i = sign(a_i) (for a_i ≠ 0) forces sign(w · x_{d+2} + b) = +1. The labeling that additionally sets y_{d+2} = −1 therefore cannot be realized, so no d + 2 points are shattered and VC-Dim = d + 1.

Page 29: PAC Learning and The VC Dimension