Computational Learning Theory: PAC, IID, VC Dimension, SVM
Kunstmatige Intelligentie / RuG
Marius Bulacu


TRANSCRIPT

Page 1: Computational Learning Theory  PAC  IID  VC Dimension  SVM

Computational Learning Theory
• PAC
• IID
• VC Dimension
• SVM

Kunstmatige Intelligentie / RuG

Marius Bulacu

Page 2: Computational Learning Theory  PAC  IID  VC Dimension  SVM


The Problem

• Why does learning work?

• How do we know that the learned hypothesis h is close to the target function f if we do not know what f is?

Answer provided by:

computational learning theory

Page 3: Computational Learning Theory  PAC  IID  VC Dimension  SVM


The Answer

• Any hypothesis h that is consistent with a sufficiently large number of training examples is unlikely to be seriously wrong.

Therefore it must be:

Probably Approximately Correct

PAC

Page 4: Computational Learning Theory  PAC  IID  VC Dimension  SVM


The Stationarity Assumption

• The training and test sets are drawn randomly from the same population of examples using the same probability distribution.

Therefore training and test data are

Independently and Identically Distributed

IID

“the future is like the past”

Page 5: Computational Learning Theory  PAC  IID  VC Dimension  SVM


How many examples are needed?

$$m \ge \frac{1}{\epsilon}\left(\ln\frac{1}{\delta} + \ln|H|\right)$$

where:
• m – the number of examples (the sample complexity)
• ε – the probability that h and f disagree on an example
• δ – the probability that a wrong hypothesis consistent with all examples exists
• |H| – the size of the hypothesis space
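To get a feel for the bound, here is a minimal sketch (Python; the parameter values are chosen purely for illustration) that evaluates the sample complexity for the space of all boolean functions on n attributes, where |H| = 2^(2^n) and hence ln|H| = 2^n · ln 2:

```python
import math

def sample_complexity(eps, delta, ln_H):
    """Smallest m satisfying m >= (1/eps) * (ln(1/delta) + ln|H|)."""
    return math.ceil((1.0 / eps) * (math.log(1.0 / delta) + ln_H))

# All boolean functions on n attributes: |H| = 2^(2^n), so ln|H| = 2^n * ln 2.
for n in (5, 10, 15):
    m = sample_complexity(eps=0.1, delta=0.05, ln_H=(2 ** n) * math.log(2))
    print(f"n = {n:2d}  ->  m >= {m}")
```

The exponential growth of m with n illustrates why the size of the hypothesis space matters.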

Page 6: Computational Learning Theory  PAC  IID  VC Dimension  SVM


Formal Derivation

[Figure: the hypothesis space H (the set of all possible hypotheses), containing the target function f and HBAD (the set of “wrong” hypotheses).]

A hypothesis h is “wrong” (h ∈ HBAD) if it disagrees with f on more than a fraction ε of examples:

$P(x : h(x) \ne f(x)) > \epsilon$

Equivalently, h agrees with f on a single random example with probability $P(x : h(x) = f(x)) \le 1 - \epsilon$, so h is consistent with all m independent examples with probability at most $(1 - \epsilon)^m$.

The probability that HBAD contains some hypothesis consistent with all m examples is then:

$P(h \in H_{BAD}\ \text{consistent with all}\ m\ \text{examples}) \le |H_{BAD}|\,(1 - \epsilon)^m \le |H|\,(1 - \epsilon)^m$

Requiring this probability to be at most δ, and using $1 - \epsilon \le e^{-\epsilon}$:

$|H|\,(1 - \epsilon)^m \le \delta \quad\Rightarrow\quad m \ge \frac{1}{\epsilon}\left(\ln\frac{1}{\delta} + \ln|H|\right)$
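The key step – a hypothesis with error above ε survives m examples with probability at most $(1 - \epsilon)^m \le e^{-\epsilon m}$ – can be checked empirically. A minimal Monte Carlo sketch (Python; parameter values invented for illustration):

```python
import math
import random

eps, m, trials = 0.1, 50, 100_000

# A hypothesis h with error exactly eps disagrees with f on each i.i.d.
# example independently with probability eps.  Estimate the probability
# that h is nevertheless consistent with (survives) all m examples.
survived = sum(
    all(random.random() >= eps for _ in range(m))
    for _ in range(trials)
)
print(f"empirical survival rate: {survived / trials:.5f}")
print(f"(1 - eps)^m            : {(1 - eps) ** m:.5f}")
print(f"e^(-eps * m) (bound)   : {math.exp(-eps * m):.5f}")
```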

Page 7: Computational Learning Theory  PAC  IID  VC Dimension  SVM


What if hypothesis space is infinite?

• Can’t use our result for finite H.
• Need some other measure of complexity for H:

– Vapnik-Chervonenkis dimension
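The classic example: linear separators in the plane have VC dimension 3 – three points in general position can be shattered (every labeling realized by some line), but four points cannot (the XOR labeling fails). The brute-force check below is an illustrative sketch, not part of the original slides; it uses scikit-learn's SVC with a large C as an approximate hard-margin linear classifier:

```python
from itertools import product

import numpy as np
from sklearn.svm import SVC

def shatterable(points):
    """True if a linear separator can realize every +/- labeling."""
    for labels in product([0, 1], repeat=len(points)):
        if len(set(labels)) < 2:          # one-class labelings are trivial
            continue
        clf = SVC(kernel="linear", C=1e6).fit(points, labels)
        if clf.score(points, labels) < 1.0:
            return False                  # this labeling is not separable
    return True

three = np.array([[0, 0], [1, 0], [0, 1]])           # general position
four = np.array([[0, 0], [1, 1], [1, 0], [0, 1]])    # contains XOR labeling
print(shatterable(three))   # True  -> VC dimension is at least 3
print(shatterable(four))    # False -> these 4 points cannot be shattered
```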


Page 12: Computational Learning Theory  PAC  IID  VC Dimension  SVM


SVM (1): Kernels

[Figure: a complicated separation boundary in the original feature space (f1, f2) becomes a simple separation boundary – a hyperplane – in the mapped space (f1, f2, f3).]

Kernels: polynomial, radial basis, sigmoid.

Implicit mapping to a higher dimensional space where linear separation is possible.
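The implicit mapping can be made concrete for the polynomial kernel. For 2-d inputs, the homogeneous degree-2 kernel K(x, z) = (x·z)² equals an ordinary dot product after the explicit feature map φ(x) = (x₁², √2·x₁x₂, x₂²) into 3-d space. A minimal check (example vectors chosen arbitrarily):

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for 2-d input: R^2 -> R^3."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x, z = np.array([1.0, 2.0]), np.array([3.0, -1.0])

k_implicit = (x @ z) ** 2        # kernel: computed without leaving R^2
k_explicit = phi(x) @ phi(z)     # dot product in the mapped space R^3
print(k_implicit, k_explicit)    # both 1.0: the kernel IS the mapped dot product
```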

Page 13: Computational Learning Theory  PAC  IID  VC Dimension  SVM


SVM (2): Max Margin

[Figure: the “best” separating hyperplane in the (f1, f2) plane, with the support vectors and the max margin marked.]

From all the possible separating hyperplanes, select the one that gives the max margin.

Solution found by quadratic optimization – the “learning”.

Max margin → good generalization.
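As a closing sketch (scikit-learn; the toy data is invented here, not from the slides), fitting a linear max-margin SVM lets us read off exactly the objects on this slide: the support vectors, and the margin width 2/‖w‖ found by the quadratic optimization inside the solver.

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data in the (f1, f2) plane.
X = np.array([[0, 0], [1, 0], [0, 1], [3, 3], [4, 3], [3, 4]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)   # large C ~ hard margin

w = clf.coef_[0]
print("support vectors:\n", clf.support_vectors_)
print("margin width 2/||w|| =", 2 / np.linalg.norm(w))
```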