UVA CS 6316: Machine Learning, Lecture 1: Introduction
TRANSCRIPT
UVA CS 6316: Machine Learning
Lecture 1: Introduction
Dr. Yanjun Qi
University of Virginia Department of Computer Science
Today
• Course Logistics
• Machine Learning Basics
• A Rough Plan of Course Content
• Machine Learning History
9/2/19 Yanjun Qi / UVA CS 2
Welcome
• Course Website: https://qiyanjun.github.io/2019f-UVA-CS6316-MachineLearning/
We focus on learning fundamental principles, mathematical formulation, algorithm design and learning theory.
Objective
• To help students become able to build simple machine learning tools (not just a tool user!!!)
• Key Results:
  • Able to build a few simple machine learning methods from scratch
  • Able to understand a few complex machine learning methods at the source-code level
  • Able to reproduce one cutting-edge machine learning paper and understand its pipeline
We focus on developing skills, more than feeding knowledge
Course Staff

• Instructor: Prof. Yanjun Qi
  • QI is pronounced /chee/
  • You can call me "Professor" or "Professor Qi"
  • I have been teaching graduate-level and undergraduate-level machine learning courses for years!
  • My research is about machine learning
• TA and office hour information @ CourseWeb
Course Material
• Textbook for this class: NONE
• My slides: if it is not mentioned in my slides, it is not an official topic of the course
• Your UVA Collab for assignments and the project
• Google Forms for quizzes
Course Background Needed
• Calculus, basic linear algebra, basic probability, and basic algorithms
• Statistics is recommended.
• Students should already have good programming skills; Python is required for all programming assignments
Today
• Course Logistics
• Machine Learning Basics
• A Rough Plan of Course Content
• Machine Learning History
OUR DATA-RICH WORLD
• Biomedicine
  • Patient records, brain imaging, MRI & CT scans, ...
  • Genomic sequences, bio-structure, drug effect info, ...
• Science
  • Historical documents, scanned books, databases from astronomy, environmental data, climate records, ...
• Social media
  • Social interaction data, Twitter, Facebook records, online reviews, ...
• Business
  • Stock market transactions, corporate sales, airline traffic, ...
• Entertainment
  • Internet images, Hollywood movies, music audio files, ...
What can we do with the data wealth? => REAL-WORLD IMPACT
• Business efficiencies
• Scientific breakthroughs
• Improved quality of life:
  • healthcare
  • energy saving / generation
  • environmental disasters
  • nursing homes
  • transportation
  • ...
[Figure: example data sources: medical images, genomic data, transportation data, brain-computer interaction (BCI), device sensor data]
BIG DATA CHALLENGES

• Data capturing (sensors, smart devices, medical instruments, etc.)
• Data transmission
• Data storage (e.g. cloud computing)
• Data management
• High-performance data processing
• Data visualization (e.g. HCI)
• Data security & privacy (e.g. multiple individuals)
• ...
• Data analytics <= this course
  • How can we analyze this big data wealth?
  • E.g. machine learning and data mining
MACHINE LEARNING IS CHANGING THE WORLD
Many more!
BASICS OF MACHINE LEARNING
• “The goal of machine learning is to build computer systems that can learn and adapt from their experience.” – Tom Dietterich
• "Experience" in the form of available data examples (also called instances or samples)
• Available examples are described with properties (data points in a feature space X)
e.g. SUPERVISED LEARNING

• Find a function f that maps the input space X to the output space Y
• so that the difference between y and f(x) on each example x is small.
e.g.
Input x: "I believe that this book is not at all helpful since it does not explain thoroughly the material. It just provides the reader with tables and calculations that sometimes are not easily understood ..."
Output y: -1

• Input X: e.g. a piece of English text
• Output Y: {1 / Yes, -1 / No}, e.g. is this a positive product review?
SUPERVISED Linear Binary Classifier
• Now let us check out a VERY SIMPLE case:
  e.g. binary y / linear f / X as R^2

x -> f -> y, where f(x, w, b) = sign(w^T x + b) and x = (x_1, x_2)
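A minimal sketch of this linear binary classifier in Python; the weight values here are made-up, purely for illustration:

```python
import numpy as np

def f(x, w, b):
    """Linear binary classifier: predicts +1 or -1 via sign(w^T x + b)."""
    return 1 if np.dot(w, x) + b > 0 else -1

# Hypothetical parameters for a 2-D input x = (x_1, x_2)
w = np.array([1.0, -2.0])
b = 0.5

print(f(np.array([3.0, 0.0]), w, b))   # w^T x + b = 3.5 > 0  -> +1
print(f(np.array([0.0, 2.0]), w, b))   # w^T x + b = -3.5 < 0 -> -1
```

The pair (w, b) defines the decision boundary; learning means choosing these values from data, which the later slides cover.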
Traditional Programming: Data + Program -> Computer -> Output

Machine Learning (training phase): Data X + Output Y -> Computer -> Program / Model f()
SUPERVISED Linear Binary Classifier
x -> f -> y, where f(x, w, b) = sign(w^T x + b) and x = (x_1, x_2)

[Figure: in the (x_1, x_2) plane, the line w^T x + b = 0 separates a "Predict Class = +1" zone (w^T x + b > 0) from a "Predict Class = -1" zone (w^T x + b < 0); dots denote the +1 and -1 training points.]

Courtesy slide from Prof. Andrew Moore's tutorial
SUPERVISED Linear Binary Classifier
x -> f -> y, where f(x, w, b) = sign(w^T x + b) and x = (x_1, x_2)

[Figure: the (x_1, x_2) plane with the line w^T x + b = 0, the "Predict Class = +1" and "Predict Class = -1" zones, the labeled +1/-1 training points, and "?" marks denoting future points to be classified.]

Courtesy slide from Prof. Andrew Moore's tutorial
Basic Concepts

• Training (i.e. learning the parameters w, b)
  • The training set includes the available examples x_1, ..., x_L and their corresponding labels y_1, ..., y_L
• Find (w, b) by minimizing a loss (i.e. the difference between y and f(x) on the available examples in the training set):

(w, b) = argmin_{w, b} Σ_i loss( f(x_i), y_i )
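The training step above can be sketched with a tiny iterative loop. This is a perceptron-style update on a made-up, linearly separable toy set, not the course's official algorithm; it simply illustrates "adjust (w, b) until the training loss is small":

```python
import numpy as np

# Toy training set: L = 4 examples x_1..x_L with labels y_1..y_L in {+1, -1}
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])

w = np.zeros(2)
b = 0.0
lr = 0.1   # learning rate

# Approximate (w, b) = argmin loss by repeatedly correcting misclassified examples
for epoch in range(100):
    for xi, yi in zip(X, y):
        if yi * (np.dot(w, xi) + b) <= 0:   # example on the wrong side of the boundary
            w += lr * yi * xi               # nudge the boundary toward the example
            b += lr * yi

train_preds = np.sign(X @ w + b)
print(train_preds)   # should match y on this separable toy set
```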
Basic Concepts

• Testing (i.e. evaluating performance on "future" points)
  • The difference between the true y? and the predicted f(x?) on a set of testing examples (i.e. the testing set)
  • Key: the examples x? are not in the training set
• Generalisation: learn a function / hypothesis from past data in order to "explain", "predict", "model" or "control" new data examples
Basic Concepts

• Loss function
  • e.g. hinge loss for a binary classification task
  • e.g. pairwise ranking loss for a ranking task (i.e. ordering examples by preference)
• Regularization
  • e.g. additional information added to the loss function to control f
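As an illustration of these two concepts, here is the hinge loss plus an L2 regularizer in code. The regularization weight `lam` and the example values are arbitrary choices for the sketch:

```python
import numpy as np

def hinge_loss(w, b, X, y):
    """Average hinge loss max(0, 1 - y * (w^T x + b)) over the examples."""
    margins = y * (X @ w + b)
    return np.mean(np.maximum(0.0, 1.0 - margins))

def regularized_loss(w, b, X, y, lam=0.1):
    """Hinge loss plus an L2 penalty lam * ||w||^2 that controls f."""
    return hinge_loss(w, b, X, y) + lam * np.dot(w, w)

X = np.array([[2.0, 1.0], [-1.0, -2.0]])
y = np.array([1, -1])
w = np.array([0.5, 0.5])
b = 0.0

print(hinge_loss(w, b, X, y))        # both margins are 1.5 -> hinge loss 0
print(regularized_loss(w, b, X, y))  # 0 + 0.1 * ||w||^2 = 0.05
```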
Machine Learning in a Nutshell

Task -> Representation -> Score Function -> Search/Optimization -> Models, Parameters

• ML grew out of work in AI
• Optimize a performance criterion using example data or past experience
• Aiming to generalize to unseen data
Today
• Course Logistics
• Machine Learning Basics
• A Rough Plan of Course Content
• Machine Learning History
Traditional Programming: Data + Program -> Computer -> Output

Machine Learning (training phase): Data X + Output Y -> Computer -> Program / Model f()
TYPICAL MACHINE LEARNING Pipeline (Learning Mode)
Low-level sensing -> Pre-processing -> Feature extraction -> Feature selection -> Inference / Prediction / Recognition, with Label collection and Evaluation alongside
Operational (Deployment) Mode

[Diagram: a Learner takes Reference Data (tagged data, i.e. input-output pairs) and produces a Model; in deployment, an Execution Engine applies the Model to Production Data.]
TYPICAL MACHINE LEARNING Pipeline (Learning Mode)
Low-level sensing -> Pre-processing -> Feature extraction -> Feature selection -> Inference / Prediction / Recognition, with Label collection and Evaluation alongside
(the input side of the pipeline determines the data complexity of X; the label/output side determines the data complexity of Y)
UNSUPERVISED LEARNING : [ COMPLEXITY in Y ]
• No labels are provided (i.e. no Y provided)
• Find patterns from unlabeled data, e.g. clustering
e.g. clustering => to find “natural” grouping of instances given un-labeled data
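A minimal k-means sketch of such clustering. The toy data, the choice of k = 2, and the fixed iteration count are all made up for illustration:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Find a 'natural' grouping of unlabeled points by alternating
    assignment to the nearest center and re-estimation of the centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated blobs of unlabeled points
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [5.0, 5.1], [5.2, 5.0], [5.1, 5.2]])
labels, centers = kmeans(X, k=2)
print(labels)   # the first three points share one label, the last three the other
```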
STRUCTURAL OUTPUT LEARNING : [ COMPLEXITY OF Y ]
• Many prediction tasks involve output labels having structured correlations or constraints among instances
Many more possible structures between the y_i, e.g. spatial, temporal, relational, ...

Structured dependencies between examples' Y values include sequence, tree, and grid structures, e.g.:
• Sequence: input "APAFSVSPASGACGPECA..." -> output "CCEEEEECCCCCHHHCCC..."
• Tree: input "The dog chased the cat" -> output its parse tree
STRUCTURAL INPUT: Kernel Methods [ COMPLEXITY OF X ]

e.g. graphs, sequences, 3D structures, ...

[Figure: a feature map from the original space to a feature space]
MORE RECENT: FEATURE LEARNING [ COMPLEXITY OF X ]

• Deep learning
• Supervised embedding
• Layer-wise pretraining
DEEP LEARNING / FEATURE LEARNING : [ COMPLEXITY OF X ]
Feature Engineering
• Most critical for accuracy
• Accounts for most of the computation at test time
• Most time-consuming part of the development cycle
• Often hand-crafted and task-dependent in practice

Feature Learning
• Easily adaptable to new, similar tasks
• Layer-wise representation
• Layer-by-layer unsupervised training
• Layer-by-layer supervised training
Why learn features?
Reinforcement Learning (RL)[ COMPLEXITY OF (X,Y) ]
• What’s Reinforcement Learning?
[Diagram: an Agent interacts with an Environment; the environment returns {observation, reward} and the agent takes {actions}.]

• The agent interacts with an environment and learns by maximizing a scalar reward signal
  • Basic version: no labels or any other supervision signal
  • Variations like imitation learning: supervised
• Previously suffered from hand-crafted states or representations

Adapted from Professor Qiang Yang of HKUST
Deep Reinforcement Learning [ COMPLEXITY OF (X,Y) ]
• So what's DEEP RL?

[Diagram: the agent (human-like learner) now receives {raw observation, reward} from the environment and outputs {actions}.]

Adapted from Professor Qiang Yang of HKUST
When to use Machine Learning (adapt to / learn from data)?

• 1. Extract knowledge from data
  • Relationships and correlations can be hidden within large amounts of data
  • The amount of knowledge available about certain tasks is simply too large for explicit encoding (e.g. rules) by humans
• 2. Learn tasks that are difficult to formalise
  • Hard to define well, except by examples, e.g. face recognition
• 3. Create software that improves over time
  • New knowledge is constantly being discovered.
  • A rule-based or human-encoded system is difficult to continuously re-design "by hand".
“Big Data” Challenges for Machine Learning
• Large size of samples
• High-dimensional features

Not the focus; these are covered in my advanced-level course.
Large-Scale Machine Learning: SIZE MATTERS
• One thousand data instances
• One million data instances
• One billion data instances

• One trillion data instances
Those are not different numbers; those are different mindsets!!!
Course Content Plan => Six major sections of this course

• Regression (supervised): Y is continuous
• Classification (supervised): Y is discrete
• Unsupervised models: no Y
• Learning theory: about f()
• Graphical models: about interactions among X_1, ..., X_p
• Reinforcement Learning: learn a program that interacts with its environment
Machine Learning in a Nutshell

Task -> Representation -> Score Function -> Search/Optimization -> Models, Parameters

• ML grew out of work in AI
• Optimize a performance criterion using example data or past experience
• Aiming to generalize to unseen data
What we have covered
Task: Regression, classification, clustering, dimension reduction
Representation: Linear function, nonlinear function (e.g. polynomial expansion), local linear, logistic function (e.g. p(c|x)), tree, multi-layer, probability-density family (e.g. Bernoulli, multinomial, Gaussian, mixture of Gaussians), local function smoothness
Score Function: MSE, hinge, log-likelihood, EPE (e.g. L2 loss for KNN, 0-1 loss for Bayes classifier), cross-entropy, cluster points' distance to centers, variance
Search/Optimization: Normal equation, gradient descent, stochastic GD, Newton, linear programming, quadratic programming (quadratic objective with linear constraints), greedy, EM, async SGD, eigen-decomposition
Models, Parameters: Regularization (e.g. L1, L2)
What we will cover
Task: Regression, classification, clustering, dimension reduction
Representation: Linear function, nonlinear function (e.g. polynomial expansion), local linear, logistic function (e.g. p(c|x)), tree, multi-layer, probability-density family (e.g. Bernoulli, multinomial, Gaussian, mixture of Gaussians), local function smoothness, kernel matrix, partition of the feature space
Score Function: MSE, margin, log-likelihood, EPE (e.g. L2 loss for KNN, 0-1 loss for Bayes classifier), cross-entropy, cluster points' distance to centers, variance, conditional log-likelihood, complete-data likelihood, regularized loss function (e.g. L1, L2), goodness of inter-cluster similarity
Search/Optimization: Normal equation, gradient descent, stochastic GD, Newton, linear programming, quadratic programming (quadratic objective with linear constraints), greedy, EM, async SGD, eigen-decomposition, backprop
Models, Parameters: Linear weight vector, basis weight vector, local weight vector, dual weights, training samples, tree dendrogram, multi-layer weights, principal components, member (soft/hard) assignment, cluster centroid, cluster covariance (shape), ...
A Dataset
• Data/points/instances/examples/samples/records: [ rows ]
• Features/attributes/dimensions/independent variables/covariates/predictors/regressors: [ columns, except the last ]
• Target/outcome/response/label/dependent variable: the special column to be predicted [ last column ]
Main Types of Columns

• Continuous: a real number, for example, weight
• Discrete: a symbol, like "Good" or "Bad"
e.g. SUPERVISED Classification
• e.g. the target Y can be a discrete target variable

[Figure: a training dataset consisting of input-output pairs; the learned model predicts f(x?) for a new example x?]
e.g. SUPERVISED Regression
• e.g. the target Y can be a continuous target variable

[Figure: a training dataset consisting of input-output pairs with continuous targets (e.g. 10, 12, 1, 5, 7, 9, 12.5, 3, 5, 20, 7); the learned model predicts f(x?) for a new example x?]
How to know the program works well?
[Figure: a training dataset consisting of input-output pairs; the learned model predicts f(x?) for a new example x?]

Evaluation: measure the loss on the pair => Error( f(x?), y? )
$$\vec{y}_{train} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad \vec{y}_{test} = \begin{bmatrix} y_{n+1} \\ y_{n+2} \\ \vdots \\ y_{n+m} \end{bmatrix}$$

$$\mathbf{X}_{train} = \begin{bmatrix} \mathbf{x}_1^T \\ \mathbf{x}_2^T \\ \vdots \\ \mathbf{x}_n^T \end{bmatrix}, \qquad \mathbf{X}_{test} = \begin{bmatrix} \mathbf{x}_{n+1}^T \\ \mathbf{x}_{n+2}^T \\ \vdots \\ \mathbf{x}_{n+m}^T \end{bmatrix}$$
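In code, these train/test matrices are just row slices of one dataset; the values of n and the data below are arbitrary, chosen only to show the shapes:

```python
import numpy as np

# A dataset of n + m examples: each row of X is one x_i^T, y holds the labels
X = np.arange(12, dtype=float).reshape(6, 2)   # 6 examples, 2 features
y = np.array([1, -1, 1, 1, -1, -1])

n = 4   # first n rows for training, remaining m = 2 rows for testing
X_train, y_train = X[:n], y[:n]
X_test, y_test = X[n:], y[n:]

print(X_train.shape, X_test.shape)   # (4, 2) (2, 2)
```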
Notation
• Inputs
  • X_j (the j-th element of a vector variable X): random variables are written in capital letters
  • p = number of input variables, n = number of observations
  • X: matrices are written in bold capitals
  • Vectors are assumed to be column vectors
  • Discrete inputs are often described by a characteristic vector (dummy variables)
• Outputs
  • Quantitative: Y
  • Qualitative: C (for categorical)
• Observed values are written in lower case
  • The i-th observed sample of the random variable X_j is x_{j,i} (a scalar)
a few sample slides
Deriving the Maximum Likelihood Estimate for Bernoulli

Maximize the likelihood:

$$L(p) = p^x (1-p)^{n-x}$$

Equivalently, maximize the log-likelihood:

$$\log L(p) = \log\left[\, p^x (1-p)^{n-x} \,\right]$$

Or minimize the negative log-likelihood:

$$-l(p) = -\log\left[\, p^x (1-p)^{n-x} \,\right]$$

[Figure: three plots over potential HIV prevalences p from 0.0 to 1.0: the likelihood, the log-likelihood, and the negative log-likelihood; all three are optimized at the same value of p.]
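Setting the derivative of the log-likelihood to zero gives the closed-form MLE p̂ = x/n. A quick numerical check of the "minimize the negative log-likelihood" view, with made-up counts x and n:

```python
import numpy as np

def neg_log_likelihood(p, x, n):
    """-log[ p^x (1-p)^(n-x) ] for a Bernoulli sample with x successes in n trials."""
    return -(x * np.log(p) + (n - x) * np.log(1.0 - p))

x, n = 30, 100   # e.g. 30 positives observed out of 100

# Minimize over a fine grid of p values in (0, 1)
grid = np.linspace(0.001, 0.999, 999)
p_hat = grid[np.argmin(neg_log_likelihood(grid, x, n))]
print(p_hat)   # close to the analytic MLE x / n = 0.3
```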
Today
• Course Logistics
• Machine Learning Basics
• A Rough Plan of Course Content
• Machine Learning History
9/2/19 Yanjun Qi / UVA CS 54
MACHINE LEARNING IN COMPUTER SCIENCE
• Machine learning is already the preferred approach for:
  • Speech recognition, natural language processing
  • Computer vision
  • Medical outcome analysis
  • Robot control, …
• Why is it growing?
  • Improved machine learning algorithms
  • Improved CPU / GPU power
  • Increased data capture, new sensors, networking
  • Systems/software too complex to control manually
  • Demand for self-customization to the user and environment, …
9/2/19 Yanjun Qi / UVA CS 55
HISTORY OF MACHINE LEARNING

• 1950s:
  • Samuel's checker player
  • Selfridge's Pandemonium
• 1960s:
  • Neural networks: the Perceptron
  • Pattern recognition
  • Learning-in-the-limit theory
  • Minsky and Papert prove limitations of the Perceptron
• 1970s:
  • Symbolic concept induction
  • Winston's arch learner
  • Expert systems and the knowledge acquisition bottleneck
  • Quinlan's decision tree learner ID3
  • Michalski's AQ and soybean disease diagnosis
  • Scientific discovery with BACON
  • Mathematical discovery with AM
9/2/19 Yanjun Qi / UVA CS 56
Adapted from Prof. Raymond J. Mooney's slides
HISTORY OF MACHINE LEARNING (CONT.)

• 1980s:
  • Advanced decision tree and rule learning
  • Explanation-based learning (EBL)
  • Learning, planning, and problem solving
  • The utility problem
  • Analogy
  • Cognitive architectures
  • Resurgence of neural networks (connectionism, backpropagation)
  • Valiant's PAC learning theory
  • Focus on experimental methodology
• 1990s:
  • Data mining
  • Adaptive software agents and web applications
  • Text learning
  • Reinforcement learning (RL)
  • Inductive logic programming (ILP)
  • Ensembles: bagging, boosting, and stacking
  • Bayes net learning

9/2/19 Yanjun Qi / UVA CS 57
Adapted from Prof. Raymond J. Mooney's slides
HISTORY OF MACHINE LEARNING (CONT.)

• 2000s:
  • Support vector machines
  • Kernel methods
  • Graphical models
  • Statistical relational learning
  • Transfer learning
  • Sequence labeling
  • Collective classification and structured outputs
  • Computer systems applications:
    • Compilers
    • Debugging
    • Graphics
    • Security (intrusion, virus, and worm detection)
  • Email management
  • Personalized assistants that learn
  • Learning in robotics and vision

9/2/19 Yanjun Qi / UVA CS 58
Adapted from Prof. Raymond J. Mooney's slides
HISTORY OF MACHINE LEARNING (CONT.)
• 2010s:
  • Speech translation, voice recognition (e.g., Siri)
  • Google's search engine uses numerous machine learning techniques (e.g., grouping news, spelling correction, improving search ranking, image retrieval, …)
  • 23andMe (scans a sample of a person's genome, predicts the likelihood of genetic disease, …)
  • DeepMind, Google Brain, …
  • IBM Watson QA system
  • Machine learning as a service (e.g., Google Prediction API, bigml.com, Cloud AutoML, …)
  • IBM healthcare analytics
  • ……

9/2/19 Yanjun Qi / UVA CS 59
Adapted from Prof. Raymond J. Mooney's slides
RELATED DISCIPLINES

• Artificial intelligence
• Data mining
• Probability and statistics
• Information theory
• Numerical optimization
• Computational complexity theory
• Control theory (adaptive)
• Psychology (developmental, cognitive)
• Neurobiology
• Linguistics
• Philosophy
9/2/19 Yanjun Qi / UVA CS 60
What are the goals of AI research?
9/2/19 Yanjun Qi / UVA CS 61
Artifacts that ACT like HUMANS
Artifacts that THINK like HUMANS
Artifacts that THINK RATIONALLY
Artifacts that ACT RATIONALLY
From: M.A. Papalaskar
How can we build a more intelligent computer / machine?

• Able to:
  • perceive the world
  • understand the world
  • react to the world
• This needs:
  • Basic speech capabilities
  • Basic vision capabilities
  • Language/semantic understanding
  • User behavior / emotion understanding
  • The ability to act
  • The ability to think?
9/2/19 Yanjun Qi / UVA CS 62
How can we build a more intelligent computer / machine? Milestones in recent vision/AI fields
9/2/19 Yanjun Qi / UVA CS 63
ImageNet Competition:
[Training on 1.2 million images [X] vs. 1000 different word labels [Y]]
Reasons for the recent breakthroughs:
Yanjun Qi / UVA CS 9/2/19 64
• Plenty of (labeled) data
• Advanced computer architectures that fit DNNs
• Powerful DNN platforms / libraries
Detour: three planned programming assignments about AI tasks
• HW: Semantic language understanding (sentiment classification on movie review text)
• HW: Visual object recognition (labeling images about handwritten digits)
• HW: Audio speech recognition (a speech recognition task based on unsupervised learning)
9/2/19 Yanjun Qi / UVA CS 65
References
q Prof. Andrew Moore’s tutorials q Prof. Raymond J. Mooney’s slidesq Prof. Alexander Gray’s slidesq Prof. Eric Xing’s slidesq http://scikit-learn.org/q Hastie, Trevor, et al. The elements of statistical
learning. Vol. 2. No. 1. New York: Springer, 2009.q Prof. M.A. Papalaskar’s slides
9/2/19 Yanjun Qi / UVA CS 66