AI Lecture 11 & 12 - Machine Learning
TRANSCRIPT
7/28/2019

Machine Learning
Lecture 11 & 12
Artificial Intelligence, Spring 2013
INTRODUCTION

What is Machine Learning?

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience (T. Mitchell).

Principles, methods, and algorithms for learning and prediction on the basis of past experience.

In the broadest sense, any method that incorporates information from training samples in the design of a classifier employs learning.
Our tendency is to view learning only in the manner in which humans learn, i.e. incrementally over time. This may not be the case where ML algorithms are concerned.
[Figure: a simple decision model]
An overly complex decision model: this may lead to worse classification than a simple model.
Maybe this model is an optimal trade-off between model complexity and performance on the training set.
A classification problem: the grades for students taking this course.

Key Steps:
1. Data (what past experience can we rely on?)
2. Assumptions (what can we assume about the students or the course?)
3. Representation (how do we summarize a student?)
4. Estimation (how do we construct a map from students to grades?)
5. Evaluation (how well are we predicting?)
6. Model Selection (perhaps we can do even better?)
1. Data: The data we have available may be:
- names and grades of students in past years' ML courses
- academic records of past and current students

Student   ML   Course X   Course Y
Peter     A    B          A          (training data)
David     B    A          A          (training data)
Jack      ?    C          A          (current data)
Kate      ?    A          A          (current data)
2. Assumptions:

There are many assumptions we can make to facilitate predictions:
1. The course has remained roughly the same over the years
2. Each student performs independently of the others
3. Representation:

Academic records are rather diverse, so we might limit the summaries to a select few courses. For example, we can summarize the i-th student (say Peter) with a vector

X_i = [A C B]

where the grades may correspond to numerical values.
3. Representation:

The available data in this representation is:

Training data              Data for prediction
Student   ML grade         Student   ML grade
X1t       B                X1p       ?
X2t       A                X2p       ?
4. Estimation

Given the training data

Student   ML grade
X1t       B
X2t       A

we need to find a mapping from input vectors x to labels y encoding the grades for the ML course.
Possible solution (nearest neighbor classifier):
1. For any student x, find the closest student x_i in the training set
2. Predict y_i, the grade of the closest student
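The two steps above can be sketched as follows. This is a minimal illustration, not code from the lecture; the grade-to-number coding (A=4, B=3, ...) is an assumption, and the training records follow the Peter/David/Jack/Kate table.

```python
GRADE_VALUE = {"A": 4, "B": 3, "C": 2, "D": 1}  # assumed numeric coding

def to_vector(grades):
    """Turn a list of letter grades into a numerical vector."""
    return [GRADE_VALUE[g] for g in grades]

def nearest_neighbor(x, training):
    """Return the label of the training example closest to x
    (squared Euclidean distance over the grade vectors)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, label = min(training, key=lambda vl: sq_dist(x, vl[0]))
    return label

# Training set from the slides: (Course X, Course Y) grades -> ML grade
training = [
    (to_vector(["B", "A"]), "A"),   # Peter
    (to_vector(["A", "A"]), "B"),   # David
]
print(nearest_neighbor(to_vector(["C", "A"]), training))  # Jack -> A
print(nearest_neighbor(to_vector(["A", "A"]), training))  # Kate -> B
```

Kate's record coincides with David's, so she inherits his grade; Jack's record is closer to Peter's than to David's.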
5. Evaluation

How can we tell how good our predictions are?
- we can wait till the end of this course...
- we can try to assess the accuracy based on the data we already have (training data)

Possible solution:
- divide the training set further into training and test sets
- evaluate the classifier constructed on the basis of the training set on the test set
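The hold-out idea above can be sketched like this. The data, the 70/30 split, and the toy one-dimensional classifier are illustrative assumptions, not part of the lecture.

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle labeled examples and split off a held-out test set."""
    rng = random.Random(seed)        # fixed seed for reproducibility
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

def accuracy(classify, train, test):
    """Fraction of test examples the classifier labels correctly."""
    return sum(1 for x, y in test if classify(x, train) == y) / len(test)

def nn_classify(x, train):
    """Toy 1-nearest-neighbor classifier over 1-D numeric summaries."""
    return min(train, key=lambda vl: abs(vl[0][0] - x[0]))[1]

# Two well-separated groups of students, labeled by outcome
data = [([i], "fail") for i in range(5)] + [([i], "pass") for i in range(10, 15)]
train, test = train_test_split(data, test_fraction=0.3, seed=1)
print(accuracy(nn_classify, train, test))  # 1.0 on this separable toy data
```

The key point is that accuracy is measured on examples the classifier was not built from.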
6. Model Selection

We can refine:
- The estimation algorithm (e.g., using a classifier other than the nearest neighbor classifier)
- The representation (e.g., base the summaries on a different set of courses)
- The assumptions (e.g., perhaps students work in groups), etc.

We have to rely on the method of evaluating the accuracy of our predictions to select among the possible refinements.
Types of Machine Learning

Data can be:
- Symbolic or Categorical (e.g. High Temperature)
- Numerical (e.g. 45°C)

We will be primarily dealing with Symbolic data.

Numerical data is primarily dealt with by Artificial Neural Networks, which have evolved into a separate field.
From the available data we can:
- Model the system which has generated the data
- Find interesting patterns in the data

We will be primarily concerned with rule-based modelling of the system from which the data was generated.

The search for interesting patterns is considered to be the domain of Data Mining.
A complete Pattern Recognition (or classification) system consists of several steps.

We will be primarily concerned with the development of classifier systems.
Supervised learning, where we get a set of training inputs and outputs. The correct output for the training samples is available.

Unsupervised learning, where we are interested in capturing inherent organization in the data. No specific output values are supplied with the learning patterns.

Reinforcement learning, where there are no exact outputs supplied, but there is a reward (reinforcement) for desirable behaviour.
Why Use Machine Learning?

First, there are problems for which there exist no human experts.

Example: in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules.
Second, there are problems where human experts exist, but where they are unable to explain their expertise.

This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs.
Third, there are problems where phenomena are changing rapidly.

Example: people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. The rules and parameters governing these behaviors change frequently, so that the computer program for prediction would need to be rewritten frequently.
Fourth, there are applications that need to be customized for each computer user separately.

Example: a program to filter unwanted electronic mail messages. Different users will need different filters.
VERSION SPACE

Concept Learning by Induction

Learning has been classified into several types.

Much of human learning involves acquiring general concepts from specific training examples (this is called inductive learning).
Example: the concept of "ball"
* red, round, small
* green, round, small
* red, round, medium

Complicated concepts: situations in which I should study more to pass the exam.
Each concept can be thought of as a Boolean-valued function whose value is true for some inputs and false for all the rest (e.g. a function defined over all the animals, whose value is true for birds and false for all the other animals).

The problem of automatically inferring the general definition of some concept, given examples labeled as members or nonmembers of the concept, is called concept learning, or approximating (inferring) a Boolean-valued function from examples.
Target Concept to be learnt: days on which Aldo enjoys his favorite water sport.

The training examples present are:

Example  Sky    Temp  Humidity  Wind    Water  Forecast  EnjoySport
1        sunny  warm  normal    strong  warm   same      Yes
2        sunny  warm  high      strong  warm   same      Yes
3        rainy  cold  high      strong  warm   change    No
4        sunny  warm  high      strong  cool   change    Yes
The training examples are described by the values of seven attributes.

The task is to learn to predict the value of the attribute EnjoySport for an arbitrary day, based on the values of its other attributes.
Concept Learning by Induction: Hypothesis Representation

The possible concepts are called hypotheses, and we need an appropriate representation for the hypotheses.

Let the hypothesis be a conjunction of constraints on the attribute-values.
If sky = sunny, temp = warm, humidity = ?, wind = strong, water = ?, forecast = same
then EnjoySport = Yes
else EnjoySport = No

Alternatively, this can be written as:
{sunny, warm, ?, strong, ?, same}
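A conjunctive hypothesis of this form can be checked against an instance with a single slot-by-slot test. This is a sketch with assumed names; None stands in for the "no value is acceptable" (empty) constraint.

```python
def matches(hypothesis, instance):
    """True if the instance satisfies every constraint of the hypothesis.
    "?" accepts any value; None (the empty constraint) accepts nothing."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))

h = ("sunny", "warm", "?", "strong", "?", "same")
print(matches(h, ("sunny", "warm", "normal", "strong", "warm", "same")))  # True
print(matches(h, ("rainy", "cold", "high", "strong", "warm", "change")))  # False
```

Because None never equals an attribute value, a hypothesis containing even one None rejects every instance.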
For each attribute, the hypothesis will have either:
- "?"      any value is acceptable
- a value  only that single value is acceptable
- ∅        no value is acceptable
If some instance (example/observation) satisfies all the constraints of a hypothesis, then it is classified as positive (belonging to the concept).

The most general hypothesis is {?, ?, ?, ?, ?, ?}. It would classify every example as positive.

The most specific hypothesis is {∅, ∅, ∅, ∅, ∅, ∅}. It would classify every example as negative.
An alternate hypothesis representation could have been a disjunction of several conjunctions of constraints on the attribute-values.

Example:
{sunny, warm, normal, strong, warm, same} ∨ {sunny, warm, high, strong, warm, same} ∨ {sunny, warm, high, strong, cool, change}
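Under this representation an instance is positive if it satisfies any one of the conjunctions. A sketch, with assumed helper names:

```python
def matches(conjunction, instance):
    """Instance satisfies one conjunction of constraints ("?" = any value)."""
    return all(c == "?" or c == v for c, v in zip(conjunction, instance))

def matches_any(disjunction, instance):
    """Instance is positive if it satisfies ANY of the conjunctions."""
    return any(matches(conj, instance) for conj in disjunction)

target = [
    ("sunny", "warm", "normal", "strong", "warm", "same"),
    ("sunny", "warm", "high", "strong", "warm", "same"),
    ("sunny", "warm", "high", "strong", "cool", "change"),
]
print(matches_any(target, ("sunny", "warm", "high", "strong", "cool", "change")))  # True
```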
Another alternate hypothesis representation could have been a conjunction of constraints on the attribute-values, where each constraint may be a disjunction of values.

Example:
{sunny, warm, normal ∨ high, strong, warm ∨ cool, same ∨ change}
Yet another alternate hypothesis representation could have incorporated negations.

Example:
{¬sunny, warm, ¬(normal ∨ high), ?, ?, ?}
By selecting a hypothesis representation, the space of all hypotheses (that the program can ever represent and therefore can ever learn) is implicitly defined.

In our example, the instance space X contains 3·2·2·2·2·2 = 96 distinct instances.

There are 5·4·4·4·4·4 = 5120 syntactically distinct hypotheses. Since every hypothesis containing even one ∅ classifies every instance as negative, the semantically distinct hypotheses number 4·3·3·3·3·3 + 1 = 973.
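The three counts can be checked directly: Sky has 3 values and the other five attributes have 2 each; syntactically each slot also admits "?" and ∅, while semantically all hypotheses containing an ∅ collapse into a single one.

```python
from math import prod

values = [3, 2, 2, 2, 2, 2]                 # attribute value counts

instances = prod(values)                     # 3*2*2*2*2*2
syntactic = prod(v + 2 for v in values)      # each slot: a value, "?", or the empty constraint
semantic = prod(v + 1 for v in values) + 1   # drop the empty slots, add back one all-negative hypothesis

print(instances, syntactic, semantic)        # 96 5120 973
```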
Most practical learning tasks involve much larger, sometimes infinite, hypothesis spaces.
Concept Learning by Induction: Search in Hypothesis Space

Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation.

The goal of this search is to find the hypothesis that best fits the training examples.
Concept Learning by Induction: Basic Assumption

Once a hypothesis that best fits the training examples is found, we can use it to predict the class label of new examples.

The basic assumption while using this hypothesis is:

Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.
Concept Learning by Induction: General-to-Specific Ordering

If we view learning as a search problem, then it is natural that our study of learning algorithms will examine different strategies for searching the hypothesis space.

Many algorithms for concept learning organize the search through the hypothesis space by relying on a general-to-specific ordering of hypotheses.
Example: Consider
h1 = {sunny, ?, ?, strong, ?, ?}
h2 = {sunny, ?, ?, ?, ?, ?}

Any instance classified positive by h1 will also be classified positive by h2 (because h2 imposes fewer constraints on the instance).

Hence h2 is more general than h1, and h1 is more specific than h2.
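For conjunctive hypotheses without the empty constraint, this ordering can be tested slot by slot: h2 is at least as general as h1 if every slot of h2 is "?" or agrees with the corresponding slot of h1. A sketch with assumed names:

```python
def more_general_or_equal(h2, h1):
    """True if h2 is at least as general as h1 (slot-wise check;
    the empty constraint is ignored for simplicity)."""
    return all(g == "?" or g == s for g, s in zip(h2, h1))

h1 = ("sunny", "?", "?", "strong", "?", "?")
h2 = ("sunny", "?", "?", "?", "?", "?")
print(more_general_or_equal(h2, h1))  # True: h2 is more general than h1
print(more_general_or_equal(h1, h2))  # False
```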
Consider the three hypotheses h1, h2 and h3:

[Figure: instance space and hypothesis space diagram relating h1, h2 and h3]
Neither h1 nor h3 is more general than the other. h2 is more general than both h1 and h3.

Note that the more-general-than relationship is independent of the target concept. It depends only on which instances satisfy the two hypotheses, and not on the classification of those instances according to the target concept.
Find-S Algorithm

How do we find a hypothesis consistent with the observed training examples?
- A hypothesis is consistent with the training examples if it correctly classifies these examples

One way is to begin with the most specific possible hypothesis, then generalize it each time it fails to cover a positive training example (i.e. classifies it as negative).

The algorithm based on this method is called Find-S.
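The method just described can be sketched as follows. The function name and the "0" marker for the empty constraint are assumptions; the three positive examples are the ones listed earlier in these slides, and the negative example is the standard one from Mitchell's EnjoySport data, included here for illustration.

```python
def find_s(examples, n_attrs):
    """Most specific conjunctive hypothesis covering all positive examples."""
    h = ["0"] * n_attrs                      # start with the most specific hypothesis
    for instance, label in examples:
        if label != "yes":
            continue                          # Find-S ignores negative examples
        for i, value in enumerate(instance):
            if h[i] == "0":
                h[i] = value                  # first positive example: copy its values
            elif h[i] != value:
                h[i] = "?"                    # conflicting value: generalize the slot
    return h

examples = [
    (("sunny", "warm", "normal", "strong", "warm", "same"), "yes"),
    (("sunny", "warm", "high", "strong", "warm", "same"), "yes"),
    (("rainy", "cold", "high", "strong", "warm", "change"), "no"),
    (("sunny", "warm", "high", "strong", "cool", "change"), "yes"),
]
print(find_s(examples, 6))  # ['sunny', 'warm', '?', 'strong', '?', '?']
```

Each positive example generalizes the hypothesis only in the slots where it disagrees, which is exactly the minimal-generalization behaviour described on the following slides.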
We say that a hypothesis covers a positive training example if it correctly classifies the example as positive.

A positive training example is an example of the concept to be learnt. Similarly, a negative training example is not an example of the concept.
[Figure: the Find-S algorithm and its search through the partially ordered hypothesis space]
The nodes shown in the diagram are the possible hypotheses allowed by our hypothesis representation scheme.

Note that our search is guided by the positive examples, and we consider only those hypotheses which are consistent with the positive training examples.

The search moves from hypothesis to hypothesis, searching from the most specific to progressively more general hypotheses.
At each step, the hypothesis is generalized only as far as necessary to cover the new positive example.

Therefore, at each stage the hypothesis is the most specific hypothesis consistent with the training examples observed up to this point. Hence, it is called Find-S.