
COMP 328: Midterm Review

Spring 2010

Nevin L. Zhang

Department of Computer Science & Engineering

The Hong Kong University of Science & Technology

http://www.cse.ust.hk/~lzhang/

Can be used as cheat sheet

Overview

Algorithms for supervised learning
  Decision trees
  Naïve Bayes classifiers
  Neural networks
  Instance-based learning
  Support vector machines

General issues regarding supervised learning
  Classification error and confidence interval
  Bias-Variance tradeoff
  PAC learning theory

Supervised Learning

Decision Trees

Decision trees

Reduced-Error Pruning

Decision Trees

Issues with attributes
  Continuous
  Attributes with many values: use GainRatio instead of Gain
  Missing values

Tree construction is a search process
  Local minimum
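
The point about many-valued attributes can be illustrated with a small sketch. It assumes the usual definitions, which the slide does not spell out: Gain is the reduction in label entropy after a split, and GainRatio divides Gain by the entropy of the attribute's own value distribution (split information). The toy data and function names are made up for illustration.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())

def gain(values, labels):
    """Information gain of splitting `labels` by the attribute `values`."""
    n = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

def gain_ratio(values, labels):
    """Gain divided by split information (entropy of the attribute itself)."""
    split_info = entropy(values)
    return gain(values, labels) / split_info if split_info > 0 else 0.0

# Hypothetical toy data: an ID-like attribute with one value per example,
# and a binary attribute that separates the classes equally well.
labels = ["yes", "yes", "no", "no"]
ids = ["a", "b", "c", "d"]
windy = ["t", "t", "f", "f"]

print(gain(ids, labels), gain(windy, labels))              # Gain cannot tell them apart
print(gain_ratio(ids, labels), gain_ratio(windy, labels))  # GainRatio penalises the ID attribute
```

Both attributes receive the same Gain here, while GainRatio prefers the binary attribute because the ID-like attribute has higher split information.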

Naïve Bayes Classifier

Can classify using this rule:

But, joint too expensive to get

Naïve Bayes Classifier

Learning Naïve Bayes Classifier

Laplace smoothing

Continuous attribute

When independence not true, double counting of evidence

Generalization: Bayesian networks
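
A minimal sketch of learning a Naïve Bayes classifier with Laplace smoothing, assuming discrete attributes and add-one style smoothing with pseudo-count `alpha`. The function names, the `alpha` parameter and the toy weather data are illustrative assumptions, not taken from the slides.

```python
from collections import Counter, defaultdict
from math import log

def train_nb(examples, labels, alpha=1.0):
    """Count class and (attribute, class, value) frequencies.
    examples: list of tuples of discrete attribute values."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attr index, class) -> value counts
    domains = [set() for _ in examples[0]]
    for x, c in zip(examples, labels):
        for i, v in enumerate(x):
            value_counts[(i, c)][v] += 1
            domains[i].add(v)
    return class_counts, value_counts, domains, alpha

def predict_nb(model, x):
    """Return the class maximising log P(c) + sum_i log P(x_i | c)."""
    class_counts, value_counts, domains, alpha = model
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for c, n_c in class_counts.items():
        score = log(n_c / total)
        for i, v in enumerate(x):
            # Laplace smoothing: unseen (value, class) pairs get a small
            # non-zero probability instead of zero.
            p = (value_counts[(i, c)][v] + alpha) / (n_c + alpha * len(domains[i]))
            score += log(p)
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical toy data: (outlook, temperature) -> play?
examples = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train_nb(examples, labels)
print(predict_nb(model, ("sunny", "cool")))
```

In the toy data "cool" never occurs with class "no". Without smoothing that class would get probability zero for the query; with smoothing the evidence from "sunny" can still outweigh it.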

Neural Networks

For classification and regression

Neural Networks

Activation function
  Step, sign
  Sigmoid, tanh (hyperbolic tangent)
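
For reference, the four activation functions can be written out as below. This is a sketch under common conventions: the value returned at exactly `z = 0` for step and sign is an assumption, since texts differ on it.

```python
import math

def step(z):
    """Threshold unit with outputs in {0, 1} (value at z == 0 is a convention)."""
    return 1.0 if z >= 0 else 0.0

def sign(z):
    """Threshold unit with outputs in {-1, +1} (value at z == 0 is a convention)."""
    return 1.0 if z >= 0 else -1.0

def sigmoid(z):
    """Logistic function, a smooth version of step with range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Hyperbolic tangent, a smooth version of sign with range (-1, 1)."""
    return math.tanh(z)

for z in (-2.0, 0.0, 2.0):
    print(z, step(z), sign(z), round(sigmoid(z), 3), round(tanh(z), 3))
```

The two smooth functions are related by tanh(z) = 2·sigmoid(2z) − 1, and unlike step and sign they are differentiable, which gradient-based training needs.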

Neural Network/Properties

Perceptrons are linear classifiers

A two-layer network with enough perceptron units can represent all Boolean functions

One hidden layer with enough sigmoid units can approximate any bounded continuous function well

Neural Network

Perceptron training rule: converges only when the training data are linearly separable
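
The convergence condition can be seen in a small sketch. It assumes the standard perceptron training rule w_i ← w_i + η (t − o) x_i with targets in {−1, +1}; the function name, the learning rate and the epoch cap are illustrative choices.

```python
def train_perceptron(data, eta=1.0, max_epochs=100):
    """data: list of (inputs, target) with target in {-1, +1}.
    Returns (weights, converged); weights[0] is the bias weight."""
    n = len(data[0][0])
    w = [0.0] * (n + 1)
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in data:
            xb = [1.0] + list(x)                      # prepend constant bias input
            o = 1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else -1
            if o != t:
                mistakes += 1
                w = [wi + eta * (t - o) * xi for wi, xi in zip(w, xb)]
        if mistakes == 0:                             # a full pass with no errors
            return w, True
    return w, False

AND = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]   # linearly separable
XOR = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]    # not linearly separable

print(train_perceptron(AND)[1])   # reaches an error-free pass
print(train_perceptron(XOR)[1])   # never does, whatever the epoch cap
```

On AND the rule stops after a handful of passes. On XOR no weight vector classifies all four points, so the loop only ends because of the epoch cap.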

Neural Network

Adaline learning: Delta rule
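
A sketch of the delta rule, assuming the usual formulation: the unit's output is the unthresholded linear sum o = w · x, and each example updates the weights by η (t − o) x_i, which is stochastic gradient descent on squared error. The data, learning rate and epoch count are illustrative assumptions.

```python
def train_delta(data, eta=0.05, epochs=500):
    """Incremental delta rule for a linear unit; weights[0] is the bias weight."""
    n = len(data[0][0])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        for x, t in data:
            xb = [1.0] + list(x)
            o = sum(wi * xi for wi, xi in zip(w, xb))          # linear output, no threshold
            w = [wi + eta * (t - o) * xi for wi, xi in zip(w, xb)]
    return w

def squared_error(w, data):
    """Half the sum of squared errors, the quantity the delta rule descends."""
    return 0.5 * sum((t - (w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))) ** 2
                     for x, t in data)

# Hypothetical noiseless data generated from t = 1 + 2x.
data = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0)]
w = train_delta(data)
print([round(wi, 3) for wi in w], squared_error(w, data))
```

Because the error is measured on the linear output, the rule keeps reducing squared error even on data that is not linearly separable, unlike the perceptron rule.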

Neural Network

Instance-Based Learning

Lazy learning
  k-NN
  Distance-weighted k-NN (kernel regression)
  Locally weighted regression
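
The first two methods can be sketched for regression as follows. The inverse-squared-distance weighting is one common choice and is an assumption here, as are the function name and the toy data.

```python
import math

def knn_predict(train, query, k=3, weighted=True):
    """train: list of (point, value) pairs. Predict the value at `query`
    as a (possibly distance-weighted) average of the k nearest neighbours."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    # Lazy learning: all work happens at query time, nothing is fitted in advance.
    neighbours = sorted(train, key=lambda pv: dist(pv[0], query))[:k]
    if not weighted:
        return sum(v for _, v in neighbours) / len(neighbours)
    num = den = 0.0
    for p, v in neighbours:
        d = dist(p, query)
        if d == 0:
            return v                    # exact match: return its stored value
        wgt = 1.0 / d ** 2              # closer neighbours count more
        num += wgt * v
        den += wgt
    return num / den

train = [((0.0,), 0.0), ((1.0,), 1.0), ((3.0,), 3.0)]
print(knn_predict(train, (2.0,), k=3, weighted=False))
print(knn_predict(train, (2.0,), k=3, weighted=True))
```

With weighting, the far neighbour at 0.0 has less influence, so the prediction at 2.0 moves closer to the values of the two nearer points.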

Support Vector Machines

SVM

SVM

SVM

SVM

Data not linearly separable

SVM

Nonlinear SVM

Impact of σ and C
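
A rough sketch of what the two parameters control, assuming σ is the width of a Gaussian (RBF) kernel K(x, z) = exp(−‖x − z‖² / (2σ²)) and C is the weight on slack in the soft-margin objective ½‖w‖² + C Σ ξ_i. The slide does not state these forms, so both are assumptions, as are the function names and toy numbers.

```python
import math

def rbf_kernel(x, z, sigma):
    """Gaussian kernel; sigma sets how quickly similarity decays with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def soft_margin_objective(w, b, data, C):
    """0.5 * ||w||^2 + C * (sum of hinge slacks); data: list of (x, y), y in {-1, +1}."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    slack = sum(max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
                for x, y in data)
    return margin_term + C * slack

# Small sigma: distinct points look unrelated (very local, flexible boundary).
# Large sigma: all points look alike (smooth, nearly linear boundary).
print(rbf_kernel((0, 0), (1, 0), sigma=0.1), rbf_kernel((0, 0), (1, 0), sigma=10))

# One point violates the margin; C decides how much that violation costs.
data = [((2.0,), 1), ((-2.0,), -1), ((0.5,), -1)]
print(soft_margin_objective([1.0], 0.0, data, C=1),
      soft_margin_objective([1.0], 0.0, data, C=10))
```

A large C makes margin violations expensive, pushing the optimiser to fit the training data closely; a small C tolerates violations in exchange for a wider margin.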

Classifier Evaluation

Relationship between
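
The rest of this slide did not survive in the transcript. One relationship usually covered under classifier evaluation, and named in the overview as "classification error and confidence interval", is between the error measured on a test sample and the true error. The sketch below assumes the normal approximation to the binomial, error_S ± z·sqrt(error_S (1 − error_S) / n), with z = 1.96 for an approximate 95% interval; whether the slide used exactly this form is an assumption.

```python
import math

def error_confidence_interval(n_errors, n, z=1.96):
    """Approximate confidence interval for the true error rate, given
    n_errors misclassifications on n independent test examples.
    The normal approximation is usually considered reasonable for n >= 30."""
    e = n_errors / n
    half_width = z * math.sqrt(e * (1 - e) / n)
    return e - half_width, e + half_width

low, high = error_confidence_interval(12, 40)
print(round(low, 3), round(high, 3))
```

With 12 errors on 40 test examples the sample error is 0.30 and the interval is roughly 0.30 ± 0.14, which shows how wide the uncertainty is for small test sets.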

Algorithm Evaluation/Model Selection

Which learning algorithm to use?

Given algorithm, which model to use? (How many hidden units?)
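
One common way to answer both questions is k-fold cross-validation: estimate each candidate's error on held-out folds and pick the lowest. The slides as transcribed do not name the procedure, so this is an assumed technique, and the two toy learners below are hypothetical stand-ins for real algorithms or model sizes.

```python
def k_fold_error(examples, labels, train_fn, k=5):
    """Average held-out error of `train_fn` over k folds.
    train_fn(xs, ys) must return a function mapping an example to a label."""
    n = len(examples)
    errors = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))             # every k-th example is held out
        train_x = [examples[i] for i in range(n) if i not in test_idx]
        train_y = [labels[i] for i in range(n) if i not in test_idx]
        predict = train_fn(train_x, train_y)
        errors += sum(predict(examples[i]) != labels[i] for i in test_idx)
    return errors / n

def majority_learner(xs, ys):
    """Hypothetical baseline: always predict the most frequent training label."""
    most_common = max(set(ys), key=ys.count)
    return lambda x: most_common

def threshold_learner(xs, ys):
    """Hypothetical 1-D learner: threshold halfway between the class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    mid = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > mid else 0

xs = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = {f.__name__: k_fold_error(xs, ys, f) for f in (majority_learner, threshold_learner)}
print(scores, "->", min(scores, key=scores.get))
```

The same loop works for choosing a model within one algorithm, for example by passing one `train_fn` per candidate number of hidden units.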

Algorithm Evaluation/Model Selection

Bias-Variance Decomposition
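
The formula on this slide is missing from the transcript. For reference, the standard decomposition for squared error at a point x is given below. The notation is assumed rather than taken from the slide: f is the target function, h_D is the hypothesis learned from training set D, and the expectation is over training sets.

```latex
\mathbb{E}_{D}\!\left[\bigl(h_D(x) - f(x)\bigr)^2\right]
  = \underbrace{\bigl(\mathbb{E}_{D}[h_D(x)] - f(x)\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_{D}\!\left[\bigl(h_D(x) - \mathbb{E}_{D}[h_D(x)]\bigr)^2\right]}_{\text{variance}}
```

If the observed targets also carry noise with variance σ², that term is added to the expected error and cannot be reduced by any learner.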

Bias-Variance Tradeoff

Also applies to classification problems

PAC Learning Theory

Probably approximately correct (PAC)

Relationship between

PAC Learning Theory

VC Dimension

Sample Complexity
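
As a worked example, the sketch below evaluates the standard bound for a consistent learner over a finite hypothesis space, m ≥ (1/ε)(ln|H| + ln(1/δ)). Whether the slide showed this bound or the VC-dimension version is not recoverable from the transcript, so treat the choice as an assumption; the function name is illustrative.

```python
import math

def sample_complexity_finite(h_size, epsilon, delta):
    """Number of examples sufficient for any consistent learner over a finite
    hypothesis space of size h_size to be, with probability at least
    1 - delta, within true error epsilon."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)

# Conjunctions over 10 Boolean attributes: each attribute appears positive,
# negated, or not at all, so |H| = 3**10.
m = sample_complexity_finite(3 ** 10, epsilon=0.1, delta=0.05)
print(m)
```

The bound grows only logarithmically in |H| and 1/δ but linearly in 1/ε, so halving the allowed error roughly doubles the number of examples needed.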
