pr lecture 1
DESCRIPTION
A god lectueTRANSCRIPT
Pattern Recognition
Lecture 1:
Introduction
Course Overview and Grading Policy
1
Pattern Recognition
Instructor
Dr. Maqsood Hayat
Lecture timings
Wednesday 1-3:30.
Text book
Pattern Recognition, S. Theodoridis, K. Koutroumbas, 3rd ed.
Pattern Classification, By Richard O. Duda, Peter E. Hart and
David G. Stork, 2nd ed. , John Wiley & Sons, New York, 2001.
Additional readings:
Including lecture notes, slides and selected papers from the
literature will be provided periodically on the class.
2
ECE 5713: Pattern Recognition
Prerequisites for this course
§ Digital image processing § Working knowledge of Matlab § Mathematical Foundations required
Linear algebra
- Derivatives of matrices - Determinant and matrix inversion - Eigenvalues and eigenvectors Probability Theory
- Discrete random variables - Expected values, mean, variance, covariance - Statistical independence, correlation - pdf, and cdf - Conditional probability - Law of total probability and Bayes rule - Normal distribution and the central limit theorem 3
Pattern Recognition
GRADING POLICY:
Midterm:
20%
Surprise Quizzes:
05%
Assignments + Paper review
15%
Final:
50%
4
Introduction to Pattern Recognition: Outline § What is a Pattern § Machine Perception and Pattern Recognition (definitions) § Examples of PR Applications § Computer vision, Image Processing and PR § A Pattern classification example § Terminology in PR § Components of Pattern Recognition Systems § The Design Cycle
§ Learning and Adaptation § PR approaches
5
Human Perception
Human Perception
Human Perception
Human and Machine Perception
Human Perception
Pattern Recognition (PR)
Human Perception
Pattern Recognition
What is a Pattern?
§ “A pattern is the opposite of a chaos; it is an entity vaguely defined, that could be given a name.”
6
Machine Perception and Pattern Recognition
Machine Perception
Build a machine that can recognize patterns
Pattern Recognition (PR)
- Theory, Algorithms, Systems to Put Patterns into Categories
- Classification of Noisy or Complex Data
- Relate Perceived Pattern to Previously Perceived Patterns
7
PR Definitions from Literature
§ “The assignment of a physical object or event to one of several pre- specified categories” - Duda and Hart
§ “A problem of estimating density functions in a high-dimensional space and dividing the space into the regions of categories or
classes” - Fukunaga
§ “Given some examples of complex signals and the correct decisions for them, make decisions automatically for a stream of
future examples” - Ripley
§
“The science that concerns the description or classification
(recognition) of measurements” -Schalkoff
§ Pattern Recognition is concerned with answering the question
“What is this?” -Morse
8
Examples of PR Applications
§ Optical Character Recognition (OCR) - Sorting letters by postal code.
- Reconstructing text from printed
materials.
- Handwritten character recognition
§ Analysis and identification of human
patterns (Biometric Classification)
- Face recognition
- Handwriting recognition
- Finger prints and DNA mapping.
- Iris scan identification
9
Examples of Pattern Recognition Problems § Computer aided diagnosis - Medical imaging,
EEG, ECG signal analysis - Designed to assist (not
replace) physicians - Example: X-ray mammography
§ Prediction systems - Weather forecasting
(based on satellite data). - Analysis of seismic patterns
10
Other Examples
§ Speech and voice recognition/speaker identification.
§ Banking and insurance applications - Credit cards applicants classified by income, credit worthiness,
# of dependents, etc.
- Car insurance (pattern including make of car, # of accidents,
age, sex, driving habits, location, etc).
§ Information Retrieval - Data Mining
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
11
Human Perception
A Case Study Fish Classification
Human Perception
A Case Study Fish Classification
Problem Analysis
§ Set up a camera and take some training samples to extract features • Length
• Lightness
• Width
• Number and shape of fins
• Position of the mouth, etc…
- This is the set of all suggested features to explore for use in our
classifier!
Purpose:
To classify the future samples based on the data of extracted
features from the training samples
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
15
Human Perception
A Case Study Fish Classification
Selection Criterion
The length is a poor feature alone!
Select the lightness as a possible feature.
Length or Lightness, which one is a better feature?
No value of either feature will “classify” all fish correctly
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
17
Human Perception
A Case Study Fish Classification
Decision Boundary Selection
Which of the boundaries would you choose? Simple linear boundary -training error > 0
Nonlinear complex boundary -tr. error = 0
Simpler nonlinear boundary -tr. error > 0
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
19
Generalization and Decision Boundary
§ Ideally, the best decision boundary should be the one which provides an optimal performance such as in the following figure:
§ However, the central aim of designing a classifier is to correctly classify novel input (classify future samples)
Issue of generalization
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
20
Selected decision boundary
A simpler but generalized boundary in this case may be the
simple nonlinear curve as shown below:
§ Further features can be added to enhance the performance § However redundancy in information may not be desirable
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
21
Adjustment of decision boundary
§ A classifier, intuitively, is designed to minimize classification error,
the total number of instances (fish) classified incorrectly.
- Is this the best objective (cost) function to minimize? What kinds
of error can be made? Are they all equally bad? What is the real
cost of making an error?
• Sea bass misclassified as salmon: Pleasant surprise for the
consumer, tastier fish
• Salmon misclassified as sea bass: Customer upset, paid too
much for inferior fish - We may want to adjust our decision boundary to minimize overall
risk -in this case, second type error is more costly, so we may
want to minimize this error.
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
22
Adjustment of decision boundary
Error is minimized by moving the threshold to the left
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
23
Terminologies in PR
§ Features: a set of variables believed to carry discriminating and
characterizing information about the objects under consideration
§ Feature vector: A collection of d features, ordered in some
meaningful way into a d-dimensional column vector, that
represents the signature of the object to be identified.
§ Feature space: The d-dimensional space in which the feature
vectors lie. A d-dimensional vector in a d-dimensional space
constitutes a point in that space.
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
24
Features
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
35
Feature Vectors
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
35
Terminologies in PR
§ Class: The category to which a given object belongs, typically
denoted by w
§ Pattern: A collection of features of an object under consideration,
along with the correct class information of that object. In
classification, a pattern is a pair of variables, where x is the feature
vector and w is the corresponding label
§ Decision boundary:A boundary in the d-dimensional feature space
that separates patterns of different classes from each
§ Training Data: Data used during training of a classifier for which
the correct labels are a priori known
§ Field Test Data: Unknown data to be classified -for measurements
from which the classifier is trained. The correct class of this data
are not known a priori
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
25
Terminologies in PR
§ Cost Function: A quantitative measure that represents the cost of making an error. The classifier is trained to minimize this function. § Classifier: An algorithm which adjusts its parameters to find the correct decision boundaries -through a learning algorithm using a training dataset -such that a cost function is minimized. § Error: Incorrect labeling of the data by the classifier § Cost of error: Cost of making a decision, in particular an incorrect one -not all errors are equally costly! § Training Performance: The ability / performance of the classifier in correctly identifying the classes of the training data, which it has already seen. It may not be a good indicator of the generalization performance. § Generalization (Test Performance): The ability / performance of the classifier in identifying the classes of previously unseen
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
26
Classical Model for PR
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
35
Evaluating a Classifier
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
35
Cooperative vs. Uncooperative data
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
27
Good Features vs. Bad Features
§ Ideally, for a given group of patterns coming from the same
class, feature values should all be similar
§ For patterns coming from different classes, the feature
values should be different
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
28
Components of Pattern Recognition System
What is the cost of making a decision -what is the risk of an incorrect decision?
Determine the right type of classifier, right type of training algorithm and right type of classifier parameters.
How to choose a classifier that minimizes the risk and
maximizes the performance? Can we add prior
knowledge, context to the classifier?
Identify and extract features relevant to distinguishing sea
bass from salmon, but invariant to noise, occlusion
rotations, scale, shift. What if we already have irrelevant
features? Dimensionality reduction?
Separate & isolate fish from background image
CCD Camera on a conveyor belt
Fish
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
29
Components of Pattern Recognition System
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
30
Pattern Recognition Systems
§ Sensing - Use of a transducer (camera or microphone)
- PR system depends of the bandwidth, the resolution sensitivity distortion of
the transducer
§ Segmentation and grouping - Patterns should be well separated and should not overlap
§ Feature extraction - Discriminative features
- Invariant features with respect to translation, rotation and scale.
§ Classification - Use a feature vector provided by a feature extractor to assign the object to a
category
§ Post Processing - Exploit context input dependent information other than from the target
pattern itself to improve performance
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
31
The Design Cycle
To know the number of examples of each class should be obtained. Required prior knowledge of the problem
To know what features to select, and how do we select them…? To know whether they are relevant / noisy / outlier? Requires prior
knowledge of the problem.
What type of classifier to use? To select its parameters? And choose the best classifier.
To train and adjust the parameters of the model (classifier) we picked so that the model fits the data
To evaluate our performance. To estimate the true generalization performance of the model, the performance it will exhibit in the real world. To validate the results. Confidence in decision?
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
32
The Design Cycle
§ Data Collection - How do we know when we have collected an adequately large
and representative set of examples for training and testing the system? § Feature Choice
- Depends on the characteristics of the problem domain. Simple to extract, invariant to irrelevant transformation insensitive to noise. § Model Choice
- Unsatisfied with the performance of our fish classifier and want to jump to another class of model
§ Training - Use data to determine the classifier. Many different procedures
for training classifiers and choosing models
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
33
The Design Cycle
§ Evaluation - Measure the error rate (or performance and switch from one set
of features to another one
§ Computational Complexity - What is the trade-off between computational ease and
performance? - (How an algorithm scales as a function of the number of
features, patterns or categories?)
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
34
Learning and Adaptation
§ Supervised learning - A teacher provides a category label or cost for each pattern in
the training set
§ Unsupervised learning - The system forms clusters or “natural groupings” of the input
patterns
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
35
Pattern Recognition
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
35
Pattern Recognition
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
35
Pattern recognition approaches
§ Statistical Approach - Patterns classified based on an underlying statistical model of
the features
• The statistical model is defined by a family of class-
conditional probability density functions p(x|ci) (Probability
of feature vector x given class ci)
§ Neural Networks - Classification is based on the response of a network of
processing units (neurons) to an input stimuli (pattern)
• “Knowledge” is stored in the connectivity and strength of
the synaptic weights - Neural PR is a trainable, non-algorithmic, black-box strategy
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
36
Pattern recognition approaches
§ Syntactic Approach - Patterns classified based on measures of structural similarity
• “Knowledge” is represented by means of formal grammars
or relational descriptions(graphs)
- Syntactic PR is used not only for classification, but also for
description • Typically, Syntactic PR approaches formulate hierarchical
descriptions of complex patterns built up from simpler sub
patterns
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
37
Pattern recognition approaches
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
38
Example of Syntactic PR
§ Consider the problem of recognizing the letters L,P,O,E,Q - Determine a sufficient set of features
- Design a tree-structured classifier
2/1/2008
Muhammad Ali Jinnah University, Islamabad
Pattern Recognition EE5713
39