Supervised Learning & Classification, part I
Reading: DH&S, Ch 1
Posted 21-Dec-2015
Administrivia...
•Pretest answers back today
•Today’s lecture notes online after class
•Apple Keynote, PDF, PowerPoint
•PDF & PPT auto-converted; may be flakey
Your place in history
•Yesterday:
•Course administrivia
•Fun & fluffy philosophy
•Today:
•The basic ML problem
•Branches of ML: the 20,000-foot view
•Intro to supervised learning
•Definitions and stuff
Pretest results: trends
•Courses dominated by math, stat; followed by algorithms; followed by CS530; followed by AI & CS500
•Proficiencies: probability > algorithms > linear algebra
•μ=56%
•σ=28%
The basic ML problem
[Diagram: World → f(⋅) → “Emphysema” (supervised)]
The basic ML problem
•Our job: Reconstruct f() from observations
•Knowing f() tells us:
•Can recognize new (previously unseen) instances
•Classification or discrimination
[Diagram: new instance → f(⋅) → ??? (e.g., “Hashimoto-Pritzker”)]
The basic ML problem
•Our job: Reconstruct f() from observations
•Knowing f() tells us:
•Can synthesize new data (e.g., speech or images)
•Generation
[Diagram: Random source → f(⋅) → “Emphysema”]
The basic ML problem
•Our job: Reconstruct f() from observations
•Knowing f() tells us:
•Can help us understand the process that generated data
•Description or analysis
•Can tell us/find things we never knew
•Discovery or data mining
[Diagram: f(⋅) over unlabeled data]
How many clusters (“blobs”) are there? Taxonomy of data? Networks of relationships? Unusual/unexpected things? Most important characteristics?
The basic ML problem
•Our job: Reconstruct f() from observations
•Knowing f() tells us:
•Can help us act or perform better
•Control
Turn left? Turn right? Accelerate? Brake? Don’t ride in the rain?
A brief taxonomy (highly abbreviated)
All ML:
•Supervised: have “inputs”, have “outputs”; find “best” f()
•Unsupervised: have “inputs”, no “outputs”; find “best” f()
•Reinforcement Learning: have “inputs”, have “controls”, have “reward”; find “best” f()
A brief taxonomy (highly abbreviated)
All ML: Supervised, Unsupervised, Reinforcement Learning
Supervised learning splits into:
•Classification: discrete outputs
•Regression: continuous outputs
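The discrete-vs-continuous split can be made concrete with two toy learned functions (the names, threshold, and coefficients below are invented for illustration, not from the lecture): a classifier returns one of a finite set of labels, while a regressor returns a real number.

```python
def classify(x: float) -> str:
    """Classification: output is one of a finite set of labels."""
    return "big" if x >= 5.0 else "small"

def regress(x: float) -> float:
    """Regression: output is a continuous value (a toy linear fit)."""
    return 2.0 * x + 1.0

print(classify(6.3))  # a discrete label: "big"
print(regress(6.3))   # a continuous number: 13.6
```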
A classic example: digits
The post office wants to be able to auto-scan envelopes, recognize addresses, etc.
87131
???
Digits to bits
Digitize (sensors) → feature vector:
255, 255, 127, 35, 0, 0 ...
255, 0, 93, 11, 45, 6 ...
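The digitization step can be sketched as flattening a 2-D grid of grayscale pixel intensities (0–255) into a single feature vector; the tiny 2×3 "image" below is made up for illustration, reusing the pixel values from the slide.

```python
# A tiny fake "scanned digit": rows of grayscale pixel values in 0..255.
image = [
    [255, 255, 127],
    [35, 0, 0],
]

# Flatten row-by-row into a single feature vector.
feature_vector = [pixel for row in image for pixel in row]

print(feature_vector)       # [255, 255, 127, 35, 0, 0]
print(len(feature_vector))  # dimension d = 6
```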
Measurements & features
•The collection of numbers from the sensors:
•... is called a feature vector, a.k.a.,
•attribute vector
•measurement vector
•instance
255, 0, 93, 11, 45, 6 ...
Measurements & features
•Written x = ⟨x₁, x₂, …, x_d⟩
•where
•d is the dimension of the vector
•Each xᵢ is drawn from some range
•E.g., xᵢ ∈ ℝ, or xᵢ ∈ {0, 1}, or xᵢ ∈ {0, …, 255}
More on features
•Features (attributes, independent variables) can come in different flavors:
•Continuous
•Discrete
•Categorical or nominal
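The three flavors can be sketched with a made-up record (the feature names and values here are invented, not from the lecture): one continuous, one discrete, one categorical feature in a single instance.

```python
# Hypothetical instance mixing the three feature flavors:
instance = {
    "temperature_c": 38.6,   # continuous: any real value in a range
    "num_prior_visits": 3,   # discrete: integer-valued
    "blood_type": "AB",      # categorical/nominal: unordered symbols
}

# Categorical features have no numeric order; only equality tests
# (and set membership) are meaningful on them.
print(instance["blood_type"] in {"A", "B", "AB", "O"})  # True
```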
More on features
•We (almost always) assume that the set of features is fixed & of finite dimension, d
•Sometimes quite large, though (d ≥ 100,000 not uncommon)
•The set of all possible instances is the instance space or feature space, 𝒳
•E.g., 𝒳 = ℝᵈ
•or 𝒳 = {0, 1}ᵈ
•or 𝒳 = {0, …, 255}ᵈ
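For finite feature ranges, the size of the instance space is the product of the per-feature range sizes, so it grows exponentially in d. A quick sketch (the particular range sizes are invented for illustration):

```python
from math import prod

# Hypothetical per-feature ranges: size of each feature's value set.
range_sizes = [2, 2, 256]  # two binary features, one 8-bit pixel

num_instances = prod(range_sizes)
print(num_instances)  # 2 * 2 * 256 = 1024

# With d binary features, |X| = 2**d: exponential growth in dimension.
print(2 ** 20)  # 1048576 instances for just 20 binary features
```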
Classes
•Every example comes w/ a class
•A.k.a., label, prediction, dependent variable, etc.
•For classification problems, class label is categorical
•For regression problems, it’s continuous
•Usually called dependent or regressed variable
•We’ll write y
•E.g., y = “7” or y = “8”
255, 255, 127, 35, 0, 0 ... → “7”
255, 0, 93, 11, 45, 6 ... → “8”
Classes, cont’d
•The set of possible values of the class variable is called the class set, class space, or range
•Book writes indiv classes as ω₁, ω₂, …
•Presumably whole class set is: Ω = {ω₁, ω₂, …, ω_c}
•So y ∈ Ω
A very simple example
I. setosa | I. versicolor | I. virginica
Features: sepal length, sepal width, petal length, petal width
Feature space: 𝒳 = ℝ⁴
A very simple example
I. setosa | I. versicolor | I. virginica
Class space: Ω = {I. setosa, I. versicolor, I. virginica}
Training data
•Set of all available data for learning == training data
•A.k.a., parameterization set, fitting set, etc.
•Denoted 𝒟 = {⟨x₁, y₁⟩, …, ⟨x_N, y_N⟩}
•Can write as a matrix X (one row per instance), w/ a corresponding class vector y
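The matrix-plus-class-vector layout can be sketched as a list of rows X with a parallel label vector y; the numbers below reuse the digit feature vectors from the earlier slides.

```python
# Training data as a feature matrix X (one row per instance)
# and a parallel class vector y (one label per row).
X = [
    [255, 255, 127, 35, 0, 0],
    [255, 0, 93, 11, 45, 6],
]
y = ["7", "8"]

N, d = len(X), len(X[0])  # N instances, each of dimension d
print(N, d)  # 2 6
print(len(X) == len(y))  # True: one label per instance
```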
Finally, goals
•Now that we have X and y, we have a (mostly) well-defined job:
The supervised learning problem: Find the function f̂ that most closely approximates the “true” function f
Goals?
•Key Questions:
•What candidate functions do we consider?
•What does “most closely approximates” mean?
•How do you find the one you’re looking for?
•How do you know you’ve found the “right” one?
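One concrete answer to "what candidate functions do we consider" is the nearest-neighbor rule (not introduced in this lecture, used here only as a minimal illustration): predict the label of the closest training instance. A sketch on made-up 2-D data:

```python
def nearest_neighbor_predict(X, y, query):
    """1-NN: return the label of the training point closest to `query`."""
    def sq_dist(a, b):
        # Squared Euclidean distance; monotone in true distance,
        # so it gives the same nearest neighbor without a sqrt.
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    best = min(range(len(X)), key=lambda i: sq_dist(X[i], query))
    return y[best]

# Made-up training set: two well-separated classes in a 2-D feature space.
X = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.8]]
y = ["A", "A", "B", "B"]

print(nearest_neighbor_predict(X, y, [0.1, 0.2]))  # "A"
print(nearest_neighbor_predict(X, y, [4.9, 5.2]))  # "B"
```

Here "most closely approximates" is implicit: 1-NN simply memorizes 𝒟 and interpolates by proximity, which already raises the lecture's last question of how to tell whether the recovered function is the "right" one.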