pr lecture 1

48
Pattern Recognition Lecture 1: Introduction Course Overview and Grading Policy 1

Upload: ayaz

Post on 19-Feb-2015

16 views

Category:

Documents


0 download

DESCRIPTION

A god lectue

TRANSCRIPT

Page 1: PR Lecture 1

Pattern Recognition

Lecture 1:

Introduction

Course Overview and Grading Policy

1

Page 2: PR Lecture 1

Pattern Recognition

Instructor

Dr. Maqsood Hayat

Lecture timings

Wednesday 1-3:30.

Text book

Pattern Recognition, S. Theodoridis, K. Koutroumbas, 3rd ed.

Pattern Classification, By Richard O. Duda, Peter E. Hart and

David G. Stork, 2nd ed. , John Wiley & Sons, New York, 2001.

Additional readings:

Including lecture notes, slides and selected papers from the

literature will be provided periodically on the class.

2

Page 3: PR Lecture 1

ECE 5713: Pattern Recognition

Prerequisites for this course

§ Digital image processing § Working knowledge of Matlab § Mathematical Foundations required

Linear algebra

- Derivatives of matrices - Determinant and matrix inversion - Eigenvalues and eigenvectors Probability Theory

- Discrete random variables - Expected values, mean, variance, covariance - Statistical independence, correlation - pdf, and cdf - Conditional probability - Law of total probability and Bayes rule - Normal distribution and the central limit theorem 3

Page 4: PR Lecture 1

Pattern Recognition

GRADING POLICY:

Midterm:

20%

Surprise Quizzes:

05%

Assignments + Paper review

15%

Final:

50%

4

Page 5: PR Lecture 1

Introduction to Pattern Recognition: Outline § What is a Pattern § Machine Perception and Pattern Recognition (definitions) § Examples of PR Applications § Computer vision, Image Processing and PR § A Pattern classification example § Terminology in PR § Components of Pattern Recognition Systems § The Design Cycle

§ Learning and Adaptation § PR approaches

5

Page 6: PR Lecture 1

Human Perception

Human Perception

Page 7: PR Lecture 1

Human Perception

Human and Machine Perception

Page 8: PR Lecture 1

Human Perception

Pattern Recognition (PR)

Page 9: PR Lecture 1

Human Perception

Pattern Recognition

Page 10: PR Lecture 1

What is a Pattern?

§ “A pattern is the opposite of a chaos; it is an entity vaguely defined, that could be given a name.”

6

Page 11: PR Lecture 1

Machine Perception and Pattern Recognition

Machine Perception

Build a machine that can recognize patterns

Pattern Recognition (PR)

- Theory, Algorithms, Systems to Put Patterns into Categories

- Classification of Noisy or Complex Data

- Relate Perceived Pattern to Previously Perceived Patterns

7

Page 12: PR Lecture 1

PR Definitions from Literature

§ “The assignment of a physical object or event to one of several pre- specified categories” - Duda and Hart

§ “A problem of estimating density functions in a high-dimensional space and dividing the space into the regions of categories or

classes” - Fukunaga

§ “Given some examples of complex signals and the correct decisions for them, make decisions automatically for a stream of

future examples” - Ripley

§

“The science that concerns the description or classification

(recognition) of measurements” -Schalkoff

§ Pattern Recognition is concerned with answering the question

“What is this?” -Morse

8

Page 13: PR Lecture 1

Examples of PR Applications

§ Optical Character Recognition (OCR) - Sorting letters by postal code.

- Reconstructing text from printed

materials.

- Handwritten character recognition

§ Analysis and identification of human

patterns (Biometric Classification)

- Face recognition

- Handwriting recognition

- Finger prints and DNA mapping.

- Iris scan identification

9

Page 14: PR Lecture 1

Examples of Pattern Recognition Problems § Computer aided diagnosis - Medical imaging,

EEG, ECG signal analysis - Designed to assist (not

replace) physicians - Example: X-ray mammography

§ Prediction systems - Weather forecasting

(based on satellite data). - Analysis of seismic patterns

10

Page 15: PR Lecture 1

Other Examples

§ Speech and voice recognition/speaker identification.

§ Banking and insurance applications - Credit cards applicants classified by income, credit worthiness,

# of dependents, etc.

- Car insurance (pattern including make of car, # of accidents,

age, sex, driving habits, location, etc).

§ Information Retrieval - Data Mining

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

11

Page 16: PR Lecture 1

Human Perception

A Case Study Fish Classification

Page 17: PR Lecture 1

Human Perception

A Case Study Fish Classification

Page 18: PR Lecture 1

Problem Analysis

§ Set up a camera and take some training samples to extract features • Length

• Lightness

• Width

• Number and shape of fins

• Position of the mouth, etc…

- This is the set of all suggested features to explore for use in our

classifier!

Purpose:

To classify the future samples based on the data of extracted

features from the training samples

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

15

Page 19: PR Lecture 1

Human Perception

A Case Study Fish Classification

Page 20: PR Lecture 1

Selection Criterion

The length is a poor feature alone!

Select the lightness as a possible feature.

Length or Lightness, which one is a better feature?

No value of either feature will “classify” all fish correctly

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

17

Page 21: PR Lecture 1

Human Perception

A Case Study Fish Classification

Page 22: PR Lecture 1

Decision Boundary Selection

Which of the boundaries would you choose? Simple linear boundary -training error > 0

Nonlinear complex boundary -tr. error = 0

Simpler nonlinear boundary -tr. error > 0

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

19

Page 23: PR Lecture 1

Generalization and Decision Boundary

§ Ideally, the best decision boundary should be the one which provides an optimal performance such as in the following figure:

§ However, the central aim of designing a classifier is to correctly classify novel input (classify future samples)

Issue of generalization

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

20

Page 24: PR Lecture 1

Selected decision boundary

A simpler but generalized boundary in this case may be the

simple nonlinear curve as shown below:

§ Further features can be added to enhance the performance § However redundancy in information may not be desirable

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

21

Page 25: PR Lecture 1

Adjustment of decision boundary

§ A classifier, intuitively, is designed to minimize classification error,

the total number of instances (fish) classified incorrectly.

- Is this the best objective (cost) function to minimize? What kinds

of error can be made? Are they all equally bad? What is the real

cost of making an error?

• Sea bass misclassified as salmon: Pleasant surprise for the

consumer, tastier fish

• Salmon misclassified as sea bass: Customer upset, paid too

much for inferior fish - We may want to adjust our decision boundary to minimize overall

risk -in this case, second type error is more costly, so we may

want to minimize this error.

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

22

Page 26: PR Lecture 1

Adjustment of decision boundary

Error is minimized by moving the threshold to the left

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

23

Page 27: PR Lecture 1

Terminologies in PR

§ Features: a set of variables believed to carry discriminating and

characterizing information about the objects under consideration

§ Feature vector: A collection of d features, ordered in some

meaningful way into a d-dimensional column vector, that

represents the signature of the object to be identified.

§ Feature space: The d-dimensional space in which the feature

vectors lie. A d-dimensional vector in a d-dimensional space

constitutes a point in that space.

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

24

Page 28: PR Lecture 1

Features

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

35

Page 29: PR Lecture 1

Feature Vectors

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

35

Page 30: PR Lecture 1

Terminologies in PR

§ Class: The category to which a given object belongs, typically

denoted by w

§ Pattern: A collection of features of an object under consideration,

along with the correct class information of that object. In

classification, a pattern is a pair of variables, where x is the feature

vector and w is the corresponding label

§ Decision boundary:A boundary in the d-dimensional feature space

that separates patterns of different classes from each

§ Training Data: Data used during training of a classifier for which

the correct labels are a priori known

§ Field Test Data: Unknown data to be classified -for measurements

from which the classifier is trained. The correct class of this data

are not known a priori

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

25

Page 31: PR Lecture 1

Terminologies in PR

§ Cost Function: A quantitative measure that represents the cost of making an error. The classifier is trained to minimize this function. § Classifier: An algorithm which adjusts its parameters to find the correct decision boundaries -through a learning algorithm using a training dataset -such that a cost function is minimized. § Error: Incorrect labeling of the data by the classifier § Cost of error: Cost of making a decision, in particular an incorrect one -not all errors are equally costly! § Training Performance: The ability / performance of the classifier in correctly identifying the classes of the training data, which it has already seen. It may not be a good indicator of the generalization performance. § Generalization (Test Performance): The ability / performance of the classifier in identifying the classes of previously unseen

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

26

Page 32: PR Lecture 1

Classical Model for PR

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

35

Page 33: PR Lecture 1

Evaluating a Classifier

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

35

Page 34: PR Lecture 1

Cooperative vs. Uncooperative data

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

27

Page 35: PR Lecture 1

Good Features vs. Bad Features

§ Ideally, for a given group of patterns coming from the same

class, feature values should all be similar

§ For patterns coming from different classes, the feature

values should be different

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

28

Page 36: PR Lecture 1

Components of Pattern Recognition System

What is the cost of making a decision -what is the risk of an incorrect decision?

Determine the right type of classifier, right type of training algorithm and right type of classifier parameters.

How to choose a classifier that minimizes the risk and

maximizes the performance? Can we add prior

knowledge, context to the classifier?

Identify and extract features relevant to distinguishing sea

bass from salmon, but invariant to noise, occlusion

rotations, scale, shift. What if we already have irrelevant

features? Dimensionality reduction?

Separate & isolate fish from background image

CCD Camera on a conveyor belt

Fish

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

29

Page 37: PR Lecture 1

Components of Pattern Recognition System

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

30

Page 38: PR Lecture 1

Pattern Recognition Systems

§ Sensing - Use of a transducer (camera or microphone)

- PR system depends of the bandwidth, the resolution sensitivity distortion of

the transducer

§ Segmentation and grouping - Patterns should be well separated and should not overlap

§ Feature extraction - Discriminative features

- Invariant features with respect to translation, rotation and scale.

§ Classification - Use a feature vector provided by a feature extractor to assign the object to a

category

§ Post Processing - Exploit context input dependent information other than from the target

pattern itself to improve performance

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

31

Page 39: PR Lecture 1

The Design Cycle

To know the number of examples of each class should be obtained. Required prior knowledge of the problem

To know what features to select, and how do we select them…? To know whether they are relevant / noisy / outlier? Requires prior

knowledge of the problem.

What type of classifier to use? To select its parameters? And choose the best classifier.

To train and adjust the parameters of the model (classifier) we picked so that the model fits the data

To evaluate our performance. To estimate the true generalization performance of the model, the performance it will exhibit in the real world. To validate the results. Confidence in decision?

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

32

Page 40: PR Lecture 1

The Design Cycle

§ Data Collection - How do we know when we have collected an adequately large

and representative set of examples for training and testing the system? § Feature Choice

- Depends on the characteristics of the problem domain. Simple to extract, invariant to irrelevant transformation insensitive to noise. § Model Choice

- Unsatisfied with the performance of our fish classifier and want to jump to another class of model

§ Training - Use data to determine the classifier. Many different procedures

for training classifiers and choosing models

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

33

Page 41: PR Lecture 1

The Design Cycle

§ Evaluation - Measure the error rate (or performance and switch from one set

of features to another one

§ Computational Complexity - What is the trade-off between computational ease and

performance? - (How an algorithm scales as a function of the number of

features, patterns or categories?)

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

34

Page 42: PR Lecture 1

Learning and Adaptation

§ Supervised learning - A teacher provides a category label or cost for each pattern in

the training set

§ Unsupervised learning - The system forms clusters or “natural groupings” of the input

patterns

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

35

Page 43: PR Lecture 1

Pattern Recognition

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

35

Page 44: PR Lecture 1

Pattern Recognition

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

35

Page 45: PR Lecture 1

Pattern recognition approaches

§ Statistical Approach - Patterns classified based on an underlying statistical model of

the features

• The statistical model is defined by a family of class-

conditional probability density functions p(x|ci) (Probability

of feature vector x given class ci)

§ Neural Networks - Classification is based on the response of a network of

processing units (neurons) to an input stimuli (pattern)

• “Knowledge” is stored in the connectivity and strength of

the synaptic weights - Neural PR is a trainable, non-algorithmic, black-box strategy

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

36

Page 46: PR Lecture 1

Pattern recognition approaches

§ Syntactic Approach - Patterns classified based on measures of structural similarity

• “Knowledge” is represented by means of formal grammars

or relational descriptions(graphs)

- Syntactic PR is used not only for classification, but also for

description • Typically, Syntactic PR approaches formulate hierarchical

descriptions of complex patterns built up from simpler sub

patterns

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

37

Page 47: PR Lecture 1

Pattern recognition approaches

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

38

Page 48: PR Lecture 1

Example of Syntactic PR

§ Consider the problem of recognizing the letters L,P,O,E,Q - Determine a sufficient set of features

- Design a tree-structured classifier

2/1/2008

Muhammad Ali Jinnah University, Islamabad

Pattern Recognition EE5713

39