Artificial Intelligence: Perceptrons

Instructors: David Suter and Qince Li

Course delivered at Harbin Institute of Technology

[Many slides adapted from those created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. Some others from colleagues at Adelaide University.]



Error-Driven Classification

What to Do About Errors

- Problem: there's still spam in your inbox
- Need more features: words aren't enough!
  - Have you emailed the sender before?
  - Have 1M other people just gotten the same email?
  - Is the sending information consistent?
  - Is the email in ALL CAPS?
  - Do inline URLs point where they say they point?
  - Does the email address you by (your) name?
- Naïve Bayes models can incorporate a variety of features, but tend to do best in homogeneous cases (e.g. all features are word occurrences)

Linear Classifiers

Feature Vectors

Example 1 (email):
  "Hello, Do you want free printr cartriges? Why pay more when you can get them ABSOLUTELY FREE! Just"
  Features: # free : 2, YOUR_NAME : 0, MISSPELLED : 2, FROM_FRIEND : 0, ...
  Label: SPAM or +

Example 2 (digit image):
  Features: PIXEL-7,12 : 1, PIXEL-7,13 : 0, ..., NUM_LOOPS : 1, ...
  Label: "2"
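As a rough illustration of how the email above could be turned into its feature vector, here is a minimal Python sketch. The function name `extract_features`, the tiny `KNOWN_WORDS` vocabulary, the recipient name, and the addresses are all hypothetical and chosen only so that the example reproduces the counts shown on the slide; a real system would use a proper dictionary and address book.

```python
import re

# Hypothetical vocabulary: any token outside it is counted as misspelled.
KNOWN_WORDS = {
    "hello", "do", "you", "want", "free", "why", "pay", "more",
    "when", "can", "get", "them", "absolutely", "just",
}


def extract_features(text, recipient_name, sender, contacts):
    """Map an email to a sparse feature dictionary (feature name -> value)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {
        "# free": tokens.count("free"),
        "YOUR_NAME": int(recipient_name.lower() in tokens),
        "MISSPELLED": sum(1 for t in tokens if t not in KNOWN_WORDS),
        "FROM_FRIEND": int(sender in contacts),
    }


email = ("Hello, Do you want free printr cartriges? Why pay more "
         "when you can get them ABSOLUTELY FREE! Just")
features = extract_features(
    email, "Alice", "promo@example.com", {"bob@example.com"}
)
print(features)
```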

Some (Simplified) Biology

- Very loose inspiration: human neurons

Linear Classifiers

- Inputs are feature values
- Each feature has a weight
- Sum is the activation
- If the activation is:
  - Positive, output +1
  - Negative, output -1

[Figure: inputs f1, f2, f3 are multiplied by weights w1, w2, w3, summed (Σ), and the sum is tested with ">0?"]
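The weighted sum and threshold described above can be sketched in a few lines of Python. The weight and feature values are made up for illustration, and sending an activation of exactly zero to -1 is an assumption, since the slide only covers the positive and negative cases.

```python
def activation(weights, features):
    """Weighted sum of the feature values: sum over i of w_i * f_i."""
    return sum(w * f for w, f in zip(weights, features))


def classify(weights, features):
    """Output +1 for a positive activation, -1 otherwise.

    Assumption: an activation of exactly 0 is sent to -1; the slide
    does not say how ties are handled.
    """
    return 1 if activation(weights, features) > 0 else -1


w = [0.5, -1.0, 2.0]   # hypothetical weights w1, w2, w3
f = [1.0, 2.0, 1.0]    # hypothetical feature values f1, f2, f3
print(activation(w, f), classify(w, f))
```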

Weights

- Binary case: compare features to a weight vector
- Learning: figure out the weight vector from examples

Example feature vector: # free : 2, YOUR_NAME : 0, MISSPELLED : 2, FROM_FRIEND : 0, ...
Weight vector:          # free : 4, YOUR_NAME : -1, MISSPELLED : 1, FROM_FRIEND : -3, ...
Another feature vector: # free : 0, YOUR_NAME : 1, MISSPELLED : 1, FROM_FRIEND : 1, ...

Dot product positive means the positive class
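As a worked check of the dot-product rule, take the vector with entries (4, -1, 1, -3) as the weight vector $w$ and the other two as feature vectors, here called $f(x_1) = (2, 0, 2, 0)$ and $f(x_2) = (0, 1, 1, 1)$. These labels are an assumption read off the values, and the features elided by "..." are ignored:

$$
\begin{aligned}
w \cdot f(x_1) &= 4\cdot 2 + (-1)\cdot 0 + 1\cdot 2 + (-3)\cdot 0 = 10 > 0 \;\Rightarrow\; \text{positive class},\\
w \cdot f(x_2) &= 4\cdot 0 + (-1)\cdot 1 + 1\cdot 1 + (-3)\cdot 1 = -3 < 0 \;\Rightarrow\; \text{negative class}.
\end{aligned}
$$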

Decision Rules

Binary Decision Rule

- In the space of feature vectors
  - Examples are points
  - Any weight vector is a hyperplane
  - One side corresponds to Y = +1
  - Other corresponds to Y = -1

Example weight vector: BIAS : -3, free : 4, money : 2, ...

[Figure: plot with axes "free" and "money" showing the separating line; one side is labeled +1 = SPAM, the other -1 = HAM]
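To see which side of the hyperplane a few points fall on, the sketch below applies the weights BIAS : -3, free : 4, money : 2 from the slide. Treating BIAS as a feature that is always 1 is the usual convention; the function name `decide`, the test points, and the handling of points exactly on the boundary are assumptions made for illustration.

```python
# Weights from the slide; BIAS is an always-on feature with value 1.
weights = {"BIAS": -3, "free": 4, "money": 2}


def decide(free, money):
    """Return 'SPAM' (+1 side) or 'HAM' (-1 side) for the two word counts."""
    features = {"BIAS": 1, "free": free, "money": money}
    score = sum(weights[name] * value for name, value in features.items())
    # Assumption: points exactly on the boundary (score == 0) go to HAM.
    return "SPAM" if score > 0 else "HAM"


for free, money in [(0, 0), (1, 0), (0, 1), (0, 2), (1, 1)]:
    print(free, money, decide(free, money))
```

With these weights the boundary is the line 4 * free + 2 * money = 3, so a single occurrence of "free" is enough to cross it, while a single occurrence of "money" is not.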

Weight Updates


Learning: Binary Perceptron

- Start with weights = 0
- For each training instance:
  - Classify with current weights
  - If correct (i.e., y = y*), no change!
  - If wrong: adjust the weight vector by adding or subtracting the feature vector. Subtract if y* is -1.
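The update rule above amounts to w ← w + y* · f on every mistake. Below is a minimal Python sketch of that loop. The toy data (a bias feature followed by two counts), the `max_passes` cap, and the choice to classify a zero activation as -1 are assumptions for illustration, not part of the slides.

```python
def dot(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))


def classify(w, f):
    # Assumption: an activation of exactly 0 is classified as -1.
    return 1 if dot(w, f) > 0 else -1


def train_binary_perceptron(data, num_features, max_passes=100):
    """data: list of (features, y_star) pairs with y_star in {+1, -1}."""
    w = [0.0] * num_features                  # start with weights = 0
    for _ in range(max_passes):
        mistakes = 0
        for f, y_star in data:
            if classify(w, f) != y_star:      # wrong: add or subtract f
                w = [wi + y_star * fi for wi, fi in zip(w, f)]
                mistakes += 1
        if mistakes == 0:                     # a full pass without errors
            break
    return w


# Hypothetical training set: [bias, count of "free", count of "money"].
data = [([1, 2, 1], 1), ([1, 0, 0], -1), ([1, 1, 2], 1), ([1, 0, 1], -1)]
w = train_binary_perceptron(data, num_features=3)
print(w)
```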

Examples: Perceptron

- Separable Case

Real Data

Multiclass Decision Rule

- If we have multiple classes:
  - A weight vector for each class: $w_y$
  - Score (activation) of a class y: $w_y \cdot f(x)$
  - Prediction: highest score wins, $y = \arg\max_y \, w_y \cdot f(x)$

Binary = multiclass where the negative class has weight zero
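The "highest score wins" rule can be sketched as an argmax over per-class dot products. The class names and weight values below are invented for illustration. The second half shows the remark about the binary case: giving the negative class an all-zero weight vector reduces the rule to checking the sign of the positive class's score.

```python
def score(w_y, f):
    """Activation of one class: dot product of its weights with f."""
    return sum(wi * fi for wi, fi in zip(w_y, f))


def predict(weights, f):
    """weights: dict mapping class label -> weight vector.

    Returns the label with the highest score. On ties, max() keeps the
    first label in dictionary order (an arbitrary choice made here).
    """
    return max(weights, key=lambda y: score(weights[y], f))


# Hypothetical three-class example.
weights = {
    "sports": [1.0, 0.0, -1.0],
    "politics": [0.0, 2.0, 0.0],
    "tech": [-1.0, 0.0, 3.0],
}
print(predict(weights, [1, 1, 0]))
print(predict(weights, [0, 0, 1]))

# Binary case: the negative class has weight zero, so its score is always 0.
binary = {-1: [0.0, 0.0, 0.0], +1: [-3.0, 4.0, 2.0]}
print(predict(binary, [1, 1, 0]), predict(binary, [1, 0, 1]))
```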

Learning: Multiclass Perceptron

- Start with all weights = 0
- Pick up training examples one by one
- Predict with current weights
  - If correct, no change!
  - If wrong: lower score of wrong answer, raise score of right answer
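One common way to "lower the score of the wrong answer and raise the score of the right answer" is to subtract the feature vector from the wrongly predicted class's weights and add it to the true class's weights. The sketch below assumes that form of the update; the class labels, toy data, and `max_passes` cap are hypothetical.

```python
def score(w_y, f):
    return sum(wi * fi for wi, fi in zip(w_y, f))


def predict(weights, f):
    # Ties go to the first class in dictionary order (arbitrary choice).
    return max(weights, key=lambda y: score(weights[y], f))


def train_multiclass_perceptron(data, classes, num_features, max_passes=100):
    """data: list of (features, y_star) pairs; returns dict class -> weights."""
    weights = {y: [0.0] * num_features for y in classes}  # all weights = 0
    for _ in range(max_passes):
        mistakes = 0
        for f, y_star in data:
            y = predict(weights, f)
            if y != y_star:
                # Lower the score of the wrong answer ...
                weights[y] = [wi - fi for wi, fi in zip(weights[y], f)]
                # ... and raise the score of the right answer.
                weights[y_star] = [wi + fi for wi, fi in zip(weights[y_star], f)]
                mistakes += 1
        if mistakes == 0:
            break
    return weights


# Hypothetical data: [bias, x1, x2] with three classes.
data = [([1, 1, 0], "a"), ([1, 0, 1], "b"), ([1, -1, -1], "c")]
weights = train_multiclass_perceptron(data, ["a", "b", "c"], num_features=3)
print(weights)
```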

- The concept of having a separate set of weights (one for each class) can be thought of as having separate "neurons" (a layer of neurons), one for each class: a single-layer network. Rather than taking the class "max" over the weights, one can train to learn a coding vector…
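One possible reading of "learning a coding vector" is sketched below: each class gets its own thresholded neuron, and the layer is trained so that its vector of outputs matches a target code for the true class. The one-hot code (+1 for the true class, -1 elsewhere), the function names, and the toy data are all assumptions for illustration; the slides do not specify the coding.

```python
def one_hot_code(label, classes):
    """Target code: +1 for the neuron of the true class, -1 for the rest."""
    return [1 if c == label else -1 for c in classes]


def layer_output(weights, f):
    """One thresholded neuron per row of weights (zero activation -> -1)."""
    return [1 if sum(wi * fi for wi, fi in zip(w, f)) > 0 else -1
            for w in weights]


def train_layer(data, classes, num_features, max_passes=100):
    """Train each neuron as a binary perceptron against its code bit."""
    weights = [[0.0] * num_features for _ in classes]
    for _ in range(max_passes):
        mistakes = 0
        for f, label in data:
            target = one_hot_code(label, classes)
            output = layer_output(weights, f)
            for k, (t, o) in enumerate(zip(target, output)):
                if t != o:  # this neuron is wrong: add or subtract f
                    weights[k] = [wi + t * fi for wi, fi in zip(weights[k], f)]
                    mistakes += 1
        if mistakes == 0:
            break
    return weights


classes = ["a", "b", "c"]
data = [([1, 1, 0], "a"), ([1, 0, 1], "b"), ([1, -1, -1], "c")]
weights = train_layer(data, classes, num_features=3)
for f, label in data:
    print(label, layer_output(weights, f))
```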

E.g., Learning Digits 0, 1, …, 9

Properties of Perceptrons

- Separability: true if some parameters get the training set perfectly correct
- Convergence: if the training set is separable, the perceptron will eventually converge (binary case)

[Figure: two scatter plots, labeled "Separable" and "Non-Separable"]
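The contrast between the separable and non-separable cases can be tried out on two tiny datasets: logical AND, which a line can separate, and XOR, which no line can. The sketch below counts passes until a mistake-free pass occurs; the datasets, the `max_passes` cap, and the tie rule are choices made for this illustration.

```python
def classify(w, f):
    # Assumption: an activation of exactly 0 is classified as -1.
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) > 0 else -1


def passes_until_converged(data, num_features, max_passes=50):
    """Return the number of the first mistake-free pass, or None."""
    w = [0.0] * num_features
    for n in range(1, max_passes + 1):
        mistakes = 0
        for f, y_star in data:
            if classify(w, f) != y_star:
                w = [wi + y_star * fi for wi, fi in zip(w, f)]
                mistakes += 1
        if mistakes == 0:
            return n
    return None


# Features are [bias, x1, x2].
and_data = [([1, 0, 0], -1), ([1, 0, 1], -1), ([1, 1, 0], -1), ([1, 1, 1], 1)]
xor_data = [([1, 0, 0], -1), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], -1)]

print("AND:", passes_until_converged(and_data, 3))
print("XOR:", passes_until_converged(xor_data, 3))
```

On XOR the function returns None however large `max_passes` is made, because no weight vector classifies all four points correctly.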

Improving the Perceptron

Problems with the Perceptron

- Noise: if the data isn't separable, weights might thrash
  - Averaging weight vectors over time can help (averaged perceptron)
- Mediocre generalization: finds a "barely" separating solution
- Overtraining: test / held-out accuracy usually rises, then falls
  - Overtraining is a kind of overfitting
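One simple guard against overtraining is to measure held-out accuracy after every pass and keep the weights from the best pass rather than the last one. The sketch below assumes that strategy; it is not taken from the slides, and the data, the number of passes, and the function names are hypothetical.

```python
def classify(w, f):
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) > 0 else -1


def accuracy(w, data):
    """Fraction of examples in data that w classifies correctly."""
    return sum(classify(w, f) == y for f, y in data) / len(data)


def train_with_held_out(train, held_out, num_features, passes=20):
    """Perceptron training that returns the best weights seen on held_out."""
    w = [0.0] * num_features
    best_w, best_acc = list(w), accuracy(w, held_out)
    for _ in range(passes):
        for f, y_star in train:
            if classify(w, f) != y_star:
                w = [wi + y_star * fi for wi, fi in zip(w, f)]
        acc = accuracy(w, held_out)       # evaluate after each full pass
        if acc > best_acc:                # keep a copy of the best weights
            best_w, best_acc = list(w), acc
    return best_w, best_acc


# Hypothetical data: [bias, x1, x2]; the last training point is noisy.
train = [([1, 2, 0], 1), ([1, 0, 2], -1), ([1, 3, 1], 1),
         ([1, 1, 3], -1), ([1, 2, 1], -1)]
held_out = [([1, 3, 0], 1), ([1, 0, 3], -1), ([1, 2, 1], 1)]

best_w, best_acc = train_with_held_out(train, held_out, num_features=3)
print(best_w, best_acc)
```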

Fixing the Perceptron

- Lots of literature on changing the step size, averaging weight updates, etc.
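As one example of the averaging idea, the sketch below keeps a running sum of the weight vector after every training example and returns the average instead of the final weights. This is a straightforward, unoptimized version; the toy data and the fixed number of passes are assumptions for illustration.

```python
def classify(w, f):
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) > 0 else -1


def train_averaged_perceptron(data, num_features, passes=10):
    """Return the average of the weight vector over all training steps."""
    w = [0.0] * num_features
    total = [0.0] * num_features          # running sum of w
    count = 0
    for _ in range(passes):
        for f, y_star in data:
            if classify(w, f) != y_star:  # ordinary perceptron update
                w = [wi + y_star * fi for wi, fi in zip(w, f)]
            total = [ti + wi for ti, wi in zip(total, w)]
            count += 1
    return [ti / count for ti in total]


# Hypothetical training set: [bias, x1, x2].
data = [([1, 2, 1], 1), ([1, 0, 0], -1), ([1, 1, 2], 1), ([1, 0, 1], -1)]
w_avg = train_averaged_perceptron(data, num_features=3)
print(w_avg)
```

Because every intermediate weight vector contributes to the result, a few late, noisy updates move the averaged weights much less than they move the final weights.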

Linear Separators

- Which of these linear separators is optimal?

Support Vector Machines

- Maximizing the margin: good according to intuition, theory, practice
- Only support vectors matter; other training examples are ignorable
- Support vector machines (SVMs) find the separator with max margin
- Basically, SVMs are MIRA where you optimize over all examples at once

[Figure: SVM]
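To make "margin" concrete, the sketch below computes the geometric margin of a separator, i.e. the smallest value of y · (w · x + b) / ‖w‖ over the training points, and compares two candidate separators. The data points and both separators are invented for illustration; this only measures margins and does not implement SVM training.

```python
import math


def margin(w, b, data):
    """Smallest signed distance y * (w . x + b) / ||w|| over the data.

    A positive result means every point is on its correct side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in data)


# Hypothetical 2-D data: (point, label).
data = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((1.0, 0.0), -1)]

# Two hypothetical separators, each given as (w, b).
separators = {
    "A": ((1.0, 1.0), -2.5),   # the line x1 + x2 = 2.5
    "B": ((1.0, 0.0), -1.5),   # the line x1 = 1.5
}

for name, (w, b) in separators.items():
    print(name, margin(w, b, data))

best = max(separators, key=lambda name: margin(*separators[name], data))
print("larger margin:", best)
```

Both lines separate the data, but A leaves more room to the nearest points than B, which is the property a maximum-margin method optimizes over all possible separators.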

Classification: Comparison

- Naïve Bayes
  - Builds a model of the training data
  - Gives prediction probabilities
  - Strong assumptions about feature independence
  - One pass through data (counting)
- Perceptrons / SVMs:
  - Make fewer assumptions about data (? Linear separability is a big assumption! But kernel SVMs etc. weaken that assumption)
  - Mistake-driven learning
  - Multiple passes through data (prediction)
  - Often more accurate