Vapnik-Chervonenkis Dimension Part I: Definition and Lower Bound
Post on 15-Jan-2016
![Page 1: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/1.jpg)
Vapnik-Chervonenkis Dimension
Part I: Definition and Lower bound
![Page 2: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/2.jpg)
PAC Learning model
• There exists a distribution D over domain X
• Examples: <x, c(x)>
  – use c for target function (rather than c_t)
• Goal:
  – With high probability (1-δ)
  – find h in H such that
  – error(h,c) < ε
  – ε arbitrarily small.
![Page 3: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/3.jpg)
VC: Motivation
• Handle infinite classes.
• VC-dim “replaces” finite class size.
• Previous lecture (on PAC):
  – specific examples
  – rectangle.
  – interval.
• Goal: develop a general methodology.
![Page 4: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/4.jpg)
Definitions: Projection
• Given a concept c over X
  – associate it with a set (all positive examples)
• Projection (sets)
  – For a concept class C and subset S
  – C(S) = { c ∩ S | c ∈ C }
• Projection (vectors)
  – For a concept class C and S = {x_1, … , x_m}
  – C(S) = { <c(x_1), … , c(x_m)> | c ∈ C }
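The two views of the projection can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the toy "prefix" class are hypothetical, and a concept is represented by its set of positive examples.

```python
def projection_sets(concepts, S):
    """C(S) as a collection of sets: every distinct intersection c ∩ S."""
    S = frozenset(S)
    return {frozenset(c) & S for c in concepts}


def projection_vectors(concepts, S):
    """C(S) as label vectors <c(x_1), ..., c(x_m)>, with S given as an ordered list."""
    return {tuple(int(x in c) for x in S) for c in concepts}


# Hypothetical toy class over X = {1, 2, 3, 4}: the "prefix" sets {1, ..., k}, k = 0..4.
C = [set(range(1, k + 1)) for k in range(0, 5)]
S = [2, 4]

sets_view = projection_sets(C, S)        # distinct intersections c ∩ S
vectors_view = projection_vectors(C, S)  # the same information as 0/1 vectors
print(len(sets_view), len(vectors_view))
```

Both views always have the same number of elements, since a label vector on S records exactly which points of S lie in c.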
![Page 5: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/5.jpg)
Definition: VC-dim
• Clearly |C(S)| ≤ 2^m for |S| = m
• C shatters S if |C(S)| = 2^m
• VC dimension of a class C:
  – The size d of the largest set S that is shattered by C.
  – Can be infinite.
• For a finite class C
  – VC-dim(C) ≤ log2 |C|
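For a finite class over a finite domain, the definition can be checked by brute force. The sketch below is illustrative only: `shatters` and `vc_dim` are hypothetical helper names, concepts are sets of positive examples, and the search is exponential in the domain size.

```python
from itertools import combinations
from math import log2


def shatters(concepts, S):
    """True if the projection of the class onto S contains all 2^|S| labelings."""
    patterns = {tuple(x in c for x in S) for c in concepts}
    return len(patterns) == 2 ** len(S)


def vc_dim(concepts, domain):
    """Size of the largest shattered subset of a finite domain."""
    d = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(concepts, S) for S in combinations(domain, k)):
            d = k
        else:
            break  # subsets of shattered sets are shattered, so no larger set can work
    return d


domain = [1, 2, 3, 4]
prefixes = [set(range(1, k + 1)) for k in range(0, 5)]                       # 5 concepts
all_subsets_of_123 = [set(s) for k in range(4) for s in combinations([1, 2, 3], k)]  # 8 concepts

print(vc_dim(prefixes, domain), log2(len(prefixes)))
print(vc_dim(all_subsets_of_123, domain), log2(len(all_subsets_of_123)))
```

The second class meets the bound VC-dim(C) ≤ log2 |C| with equality; the first is far below it.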
![Page 6: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/6.jpg)
Example 1: Interval
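[Figure: the interval [0,1] split at a threshold z, labeled 1 to the left of z and 0 to the right]
C1 = { c_z | z ∈ [0,1] }
c_z(x) = 1 ⇔ x ≤ z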
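A small check of this example, assuming the concept is c_z(x) = 1 iff x ≤ z (the direction of the inequality is an assumption about the slide). The helper name `threshold_patterns` is hypothetical. Only thresholds at 0 and at the sample points themselves can produce different labelings, so those are the only candidates tried.

```python
def threshold_patterns(points):
    """Labelings of `points` realizable by c_z(x) = 1 iff x <= z, with z in [0, 1]."""
    candidates = [0.0] + sorted(points)  # other thresholds repeat one of these labelings
    return {tuple(int(x <= z) for x in points) for z in candidates}


one_point = threshold_patterns([0.5])
two_points = threshold_patterns([0.3, 0.7])

print(one_point)   # both labelings of a single point appear
print(two_points)  # the labeling (0, 1) is missing
```

A single point is shattered, while for x1 < x2 no threshold labels x1 with 0 and x2 with 1, so under this reading the VC dimension is 1.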
![Page 7: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/7.jpg)
Example 2: line
C2 = { c_w | w = (a,b,c) }
c_w(x,y) = 1 ⇔ ax + by ≥ c
![Page 8: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/8.jpg)
Example 3: Parallel Rectangle
![Page 9: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/9.jpg)
Example 4: Finite union of intervals
![Page 10: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/10.jpg)
Example 5: Parity
• n Boolean input variables
• T ⊆ {1, …, n}
• f_T(x) = ⊕_{i∈T} x_i
• Lower bound: n unit vectors
• Upper bound
  – Number of concepts
  – Linear dependency
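Both bounds can be verified by enumeration for a small n. This is an illustrative sketch with hypothetical names; indices run from 0 to n-1 rather than 1 to n. The upper bound here follows from counting: there are only 2^n parity functions, so no set of n+1 points can receive all 2^(n+1) labelings.

```python
from itertools import combinations, product

n = 3
points = list(product([0, 1], repeat=n))
# One concept per subset T of the variable indices: 2^n parity functions.
concepts = [T for k in range(n + 1) for T in combinations(range(n), k)]


def parity(T, x):
    """f_T(x): XOR of the bits of x indexed by T."""
    return sum(x[i] for i in T) % 2


def shattered(S):
    patterns = {tuple(parity(T, x) for x in S) for T in concepts}
    return len(patterns) == 2 ** len(S)


unit_vectors = [tuple(int(i == j) for j in range(n)) for i in range(n)]

lower_bound_holds = shattered(unit_vectors)  # f_T(e_i) = 1 iff i is in T
upper_bound_holds = not any(shattered(S) for S in combinations(points, n + 1))
print(lower_bound_holds, upper_bound_holds)
```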
![Page 11: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/11.jpg)
Example 6: OR
• n Boolean input variables
• P and N subsets of {1, …, n}
• f_{P,N}(x) = (∨_{i∈P} x_i) ∨ (∨_{i∈N} ¬x_i)
• Lower bound: n unit vectors
• Upper bound
  – Trivial 2n
  – Use ELIM (get n+1)
  – Show second vector removes 2 (get n)
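The lower bound and one reading of the "trivial" upper bound can be checked by enumeration for n = 3. The sketch assumes the trivial bound comes from counting concepts (the log of the class size is at most 2n); that interpretation, like the variable names, is an assumption and not stated on the slide.

```python
from itertools import product
from math import log2

n = 3
points = list(product([0, 1], repeat=n))


def truth_table(P, N):
    """f_{P,N} evaluated on every point: OR of x_i for i in P and (not x_i) for i in N."""
    return tuple(int(any(x[i] for i in P) or any(1 - x[i] for i in N)) for x in points)


# Per variable: 0 = absent, 1 = in P only, 2 = in N only, 3 = in both P and N.
tables = set()
for status in product(range(4), repeat=n):
    P = [i for i in range(n) if status[i] in (1, 3)]
    N = [i for i in range(n) if status[i] in (2, 3)]
    tables.add(truth_table(P, N))

unit_vectors = [tuple(int(i == j) for j in range(n)) for i in range(n)]
idx = [points.index(e) for e in unit_vectors]
patterns = {tuple(t[k] for k in idx) for t in tables}

print(len(patterns) == 2 ** n)      # unit vectors are shattered (take N empty)
print(log2(len(tables)) <= 2 * n)   # counting bound on the VC dimension
```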
![Page 12: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/12.jpg)
Example 7: Convex polygons
![Page 13: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/13.jpg)
Example 7: Convex polygons
![Page 14: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/14.jpg)
Example 8: Hyper-plane
• VC-dim(C8) = d+1
• Lower bound
  – unit vectors and zero vector
• Upper bound!
C8 = { c_{w,c} | w ∈ R^d }
c_{w,c}(x) = 1 ⇔ <w,x> ≥ c
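The lower bound can be made explicit: for every labeling of the zero vector and the d unit vectors there is a hyper-plane that realizes it. The construction below is one possible choice of (w, c), assuming the convention that the label is 1 iff <w,x> ≥ c; the function names are hypothetical.

```python
from itertools import product


def separating_halfspace(labels):
    """labels[0] is the label of the zero vector, labels[i] the label of unit vector e_i.

    Returns (w, c) such that the label of x is 1 iff <w, x> >= c.
    """
    c = -0.5 if labels[0] == 1 else 0.5           # <w, 0> = 0 >= c only when c < 0
    w = [c + 0.5 if y == 1 else c - 0.5 for y in labels[1:]]  # <w, e_i> = w_i
    return w, c


def predict(w, c, x):
    return int(sum(wi * xi for wi, xi in zip(w, x)) >= c)


d = 3
points = [tuple([0] * d)] + [tuple(int(i == j) for j in range(d)) for i in range(d)]

all_labelings_realized = all(
    [predict(*separating_halfspace(labels), x) for x in points] == list(labels)
    for labels in product([0, 1], repeat=d + 1)
)
print(all_labelings_realized)
```

Since all 2^(d+1) labelings of these d+1 points are realized, VC-dim(C8) ≥ d+1.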
![Page 15: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/15.jpg)
Radon Theorem
• Definitions:
  – Convex set.
  – Convex hull: conv(S)
• Theorem:
  – Let T be a set of d+2 points in R^d
  – There exists a subset S of T such that
  – conv(S) ∩ conv(T \ S) ≠ ∅
• Proof!
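The standard proof is constructive, and a sketch of it fits in a short program. Any d+2 points in R^d satisfy a nontrivial relation Σ a_i x_i = 0 with Σ a_i = 0; taking S to be the points with positive coefficients gives the partition. The code below is an illustration under that outline, using exact rational arithmetic; `null_vector` and `radon_partition` are hypothetical helper names.

```python
from fractions import Fraction


def null_vector(A):
    """Nonzero solution a of A a = 0, for a matrix with more columns than rows."""
    A = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(A), len(A[0])
    pivots, r = [], 0
    for c in range(cols):  # Gauss-Jordan elimination to reduced row echelon form
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        A[r] = [v / A[r][c] for v in A[r]]
        for i in range(rows):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [vi - f * vr for vi, vr in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    free = next(c for c in range(cols) if c not in pivots)
    a = [Fraction(0)] * cols
    a[free] = Fraction(1)  # set one free variable to 1, solve for the pivot variables
    for row_idx, pc in enumerate(pivots):
        a[pc] = -A[row_idx][free]
    return a


def radon_partition(points):
    """Split d+2 points in R^d into index sets whose convex hulls share a point."""
    d = len(points[0])
    assert len(points) == d + 2
    # d rows force sum a_i x_i = 0; the final row of ones forces sum a_i = 0.
    A = [[p[k] for p in points] for k in range(d)] + [[1] * len(points)]
    a = null_vector(A)
    pos = [i for i, v in enumerate(a) if v > 0]
    rest = [i for i, v in enumerate(a) if v <= 0]
    total = sum(a[i] for i in pos)
    common = tuple(sum(a[i] * points[i][k] for i in pos) / total for k in range(d))
    return pos, rest, common


pos, rest, common = radon_partition([(0, 0), (1, 0), (0, 1), (1, 1)])
print(pos, rest, common)
```

For the four corners of the unit square the two diagonals are separated into different parts, and the common point is their intersection.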
![Page 16: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/16.jpg)
Hyper-plane: Finishing the proof
• Assume d+2 points T can be shattered.
• Use Radon Theorem to find S such that
  – conv(S) ∩ conv(T \ S) ≠ ∅
• Assign points in S label 1
  – points not in S label 0
• There is a separating hyper-plane
• How will it label conv(S) ∩ conv(T \ S)?
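The contradiction behind the closing question can be written out as follows. This is a sketch assuming the convention that a point x gets label 1 iff ⟨w,x⟩ ≥ c; the coefficients λ and μ are notation introduced here for a point p in both convex hulls.

```latex
% A point p in conv(S) \cap conv(T \setminus S), as two convex combinations:
p = \sum_{s \in S} \lambda_s\, s = \sum_{t \in T \setminus S} \mu_t\, t,
\qquad \lambda_s, \mu_t \ge 0, \quad \sum_{s} \lambda_s = \sum_{t} \mu_t = 1 .

% Points of S have label 1, so \langle w, s \rangle \ge c for every s \in S:
\langle w, p \rangle = \sum_{s \in S} \lambda_s \langle w, s \rangle \ge c .

% Points of T \setminus S have label 0, so \langle w, t \rangle < c for every t:
\langle w, p \rangle = \sum_{t \in T \setminus S} \mu_t \langle w, t \rangle < c .
```

The two bounds on ⟨w,p⟩ are incompatible, so no hyper-plane realizes this labeling and no set of d+2 points is shattered.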
![Page 17: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/17.jpg)
Lower bounds: Setting
• Static learning algorithm:
  – asks for a sample S of size m(ε,δ)
  – Based on S selects a hypothesis
![Page 18: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/18.jpg)
Lower bounds: Setting
• Theorem:
  – if VC-dim(C) = ∞ then C is not learnable.
• Proof:
  – Let m = m(0.1,0.1)
  – Find 2m points which are shattered (set T)
  – Let D be the uniform distribution on T
  – Set c_t(x_i)=1 with probability ½.
• Expected error ¼.
• Finish proof!
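The ¼ figure can be checked numerically. With D uniform on 2m points and m independent samples, a fixed point is unseen with probability (1 - 1/(2m))^m ≥ ½, and on an unseen point any hypothesis errs with probability ½ because its label is a fair coin. The function below (a hypothetical name) computes the resulting lower bound on the expected error; it is a sketch of the calculation, not the slide's full proof.

```python
def expected_error_unseen(m):
    """Lower bound on the expected error: (1/2) * P(a fixed point is not in the sample).

    D is uniform over 2m shattered points and the sample has m i.i.d. draws.
    """
    p_unseen = (1 - 1 / (2 * m)) ** m
    return 0.5 * p_unseen


for m in (1, 10, 100, 1000):
    print(m, expected_error_unseen(m))
```

The values never drop below ¼, whatever the sample size m that the learner requested.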
![Page 19: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/19.jpg)
Lower Bound: Feasible
• Theorem
  – If VC-dim(C) = d+1, then m(ε,δ) = Ω(d/ε)
• Proof:
  – Let T = {z_0, z_1, …, z_d} be a set of d+1 points which is shattered.
  – D samples:
    • z_0 with prob. 1-8ε
    • z_i with prob. 8ε/d
![Page 20: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/20.jpg)
Continue
– Set c_t(z_0)=1 and c_t(z_i)=1 with probability ½
• Expected error 2ε
• Bound confidence
  – for accuracy ε
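The expected-error figure can be reproduced with a short calculation. Each z_i (i ≥ 1) has probability 8ε/d, is missed by all m samples with probability (1 - 8ε/d)^m, and is then mislabeled with probability ½. The sketch below assumes a sample size of at most d/(16ε); that constant is an assumption made here for illustration and does not appear on the slide.

```python
def expected_error_lower_bound(d, eps, m):
    """Sum over z_1..z_d of P(z_i) * P(z_i unseen in m draws) * 1/2."""
    p_point = 8 * eps / d
    p_unseen = (1 - p_point) ** m
    return d * p_point * p_unseen * 0.5


d, eps = 10, 0.01
m = int(d / (16 * eps))  # assumed sample size, at most d / (16 eps)
print(m, expected_error_lower_bound(d, eps, m), 2 * eps)
```

With this sample size, (1 - 8ε/d)^m ≥ 1 - 8εm/d ≥ ½, so the expected error is at least 4ε · ½ = 2ε.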
![Page 21: Vapnik-Chervonenkis Dimension Part I: Definition and Lower bound](https://reader035.vdocuments.site/reader035/viewer/2022062409/56649d495503460f94a25ff5/html5/thumbnails/21.jpg)
Lower Bound: Non-Feasible
• Theorem
  – For two hypoth. m(ε,δ) = Ω((log 1/δ)/ε²)
• Proof:
  – Let H={h_0, h_1}, where h_b(x)=b
  – Two distributions:
  – D_0: Prob. <x,1> is ½ - ε and <y,0> is ½ + ε
  – D_1: Prob. <x,1> is ½ + ε and <y,0> is ½ - ε
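To get a feel for the sample sizes involved, one can compute how often a natural rule, majority vote over the observed labels, picks the wrong one of the two hypotheses. The sketch below takes the gap from ½ to be ε (an assumption about the symbol in the two distributions) and uses hypothetical function names. It illustrates the scaling for one particular rule; it is not the lower-bound proof itself.

```python
from math import comb


def majority_error(m, eps):
    """P(majority of m labels is 0) when each label is 1 with probability 1/2 + eps.

    m is assumed odd so that there are no ties.
    """
    p = 0.5 + eps
    return sum(comb(m, k) * p**k * (1 - p) ** (m - k) for k in range((m - 1) // 2 + 1))


def min_samples(eps, delta):
    """Smallest odd m for which majority vote errs with probability at most delta."""
    m = 1
    while majority_error(m, eps) > delta:
        m += 2
    return m


for eps in (0.2, 0.1, 0.05):
    print(eps, min_samples(eps, 0.1))
```

Halving ε increases the required number of samples several times over, in line with the 1/ε² dependence in the theorem.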