
Tying up loose ends

Understand your data

Unsupervised learning: no answers available, only data
Clustering, SOM, Hebbian learning, PCA…

Supervised learning: training includes inputs and correct answers
Perceptron, Backprop, POS tagging
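A minimal sketch of the contrast above, assuming scikit-learn is available (the iris data, KMeans, and Perceptron are illustrative choices, not part of the slides): the clustering step sees only the inputs X, while the supervised learner also sees the correct answers y.

from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.linear_model import Perceptron

X, y = load_iris(return_X_y=True)

# Unsupervised: only data, no answers -- the algorithm forms its own groups
clusters = KMeans(n_clusters=3, n_init=10).fit_predict(X)

# Supervised: training pairs each input with its correct label
clf = Perceptron().fit(X, y)

print("cluster ids: ", clusters[:5])
print("predictions: ", clf.predict(X[:5]))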

Association

Probability of Y given X, or the most likely Y given X

Collaborative Filtering – people who like X probably like Y

Neural Networks – input X triggers Y output (behaviorism)

Input retrieves similarities or correlations as output
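A toy sketch of the collaborative-filtering idea, assuming only numpy and an invented user-by-item rating matrix: a user's likely rating for an unseen item is a similarity-weighted average of what similar users gave it.

import numpy as np

# rows = users, columns = items; 0 means "not rated yet" (made-up data)
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

target = 0                                        # recommend for user 0
sims = np.array([cosine(R[target], R[u]) for u in range(len(R))])
sims[target] = 0.0                                # ignore self-similarity
scores = sims @ R / sims.sum()                    # similarity-weighted ratings
unseen = np.where(R[target] == 0)[0]
print({int(i): round(float(scores[i]), 2) for i in unseen})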

Classification

X is a…
X is A or B or C or D
X is 1 or 0
X is face or not-face

Goal is prediction

Classification is a type of association

Includes pattern recognition: OCR, faces, diagnosis, speech, NLP…

Includes compression
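A minimal sketch of the "X is 1 or 0" framing, assuming scikit-learn (the digits dataset stands in for a face/not-face problem, and logistic regression is just one possible classifier): each input is assigned exactly one of two discrete labels.

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
y = (y == 0).astype(int)                 # 1 = "this image is a zero", 0 = "it is not"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))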

Regression

If the output is a continuous number

Ex. Automatic steering
inputs: sensors (video, GPS, proximity…)
output: degree of rotation of the wheel
Ex. ALVINN

Backprop Neural Nets work for both classification and regression
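A small sketch of a backprop net producing a continuous output, assuming scikit-learn's MLPRegressor and synthetic data (a noisy sine wave stands in for sensor-to-steering-angle data):

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, size=500)    # continuous target

net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
net.fit(X, y)
print("prediction at x = 1.0:", net.predict([[1.0]])[0])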

Different algorithms use different error calculations

Simplest: # wrong / # total, i.e. 2/5 = .4 or 40%

Other examples: Word Error Rate (WER), Mean Squared Error
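A minimal sketch of two of the measures named above, in plain Python with made-up labels:

def misclassification_rate(y_true, y_pred):
    # "# wrong / # total"
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)

def mean_squared_error(y_true, y_pred):
    # average squared difference between prediction and truth
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

print(misclassification_rate([1, 0, 1, 1, 0], [1, 1, 0, 1, 0]))   # 2/5 = 0.4
print(mean_squared_error([1.0, 2.0, 3.0], [1.1, 1.9, 2.5]))       # 0.09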

[Figure: a table of Input/Output pairs split into a Training set and a Validation set]

[Figure: the same Input/Output pairs divided into Fold 1 through Fold 5]

[Figure: train on Folds 1-4, test on Fold 5 -> Learner 1 error = .01]

[Figure: train on Folds 1-3 and 5, test on Fold 4 -> Learner 2 error = .012]

[Figure: train on Folds 1, 2, 4, 5, test on Fold 3 -> Learner 3 error = .011]
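A minimal sketch of the 5-fold procedure those figures describe, assuming scikit-learn and a synthetic dataset: each learner trains on four folds and reports its error on the held-out fold.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=500, random_state=0)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

errors = []
for i, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    clf = Perceptron().fit(X[train_idx], y[train_idx])
    err = 1 - clf.score(X[test_idx], y[test_idx])   # # wrong / # total on the held-out fold
    errors.append(err)
    print(f"Learner {i} error = {err:.3f}")

print("mean error across folds:", round(float(np.mean(errors)), 3))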

If errors between folds vary greatly, this indicates bias in the training data

Over-fitting – too much training

Misrepresentative data
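A small illustration of over-fitting, assuming only numpy and an invented noisy sine dataset: as model capacity (polynomial degree here) grows, training error keeps falling while validation error typically starts to climb.

import numpy as np

rng = np.random.default_rng(0)
truth = lambda x: np.sin(2 * np.pi * x)
x_train = np.linspace(0, 1, 15)
x_val = np.linspace(0.03, 0.97, 15)
y_train = truth(x_train) + rng.normal(0, 0.2, x_train.size)
y_val = truth(x_val) + rng.normal(0, 0.2, x_val.size)

for degree in (1, 3, 12):
    p = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    train_mse = np.mean((p(x_train) - y_train) ** 2)
    val_mse = np.mean((p(x_val) - y_val) ** 2)
    print(f"degree {degree:2d}: train MSE = {train_mse:.3f}, validation MSE = {val_mse:.3f}")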