ECE 8443 – Pattern Recognition
LECTURE 04: PERFORMANCE BOUNDS
Objectives: Typical Examples; Performance Bounds; ROC Curves
Resources: D.H.S.: Chapter 2 (Part 3); V.V.: Chernoff Bound; J.G.: Bhattacharyya; T.T.: ROC Curves; NIST: DET Curves


TRANSCRIPT

Page 2: ECE 8443 – Pattern Recognition

ECE 8443: Lecture 04, Slide 2

• A classifier that places a pattern in one of two classes is often referred to as a dichotomizer.

• We can reshape the decision rule:

$g(\mathbf{x}) \equiv g_1(\mathbf{x}) - g_2(\mathbf{x})$; decide $\omega_1$ if $g(\mathbf{x}) > 0$, otherwise decide $\omega_2$.

• If we use the log of the posterior probabilities:

$g(\mathbf{x}) = \ln \frac{p(\mathbf{x}|\omega_1)}{p(\mathbf{x}|\omega_2)} + \ln \frac{P(\omega_1)}{P(\omega_2)}$

• A dichotomizer can be viewed as a machine that computes a single discriminant function and classifies x according to the sign (e.g., support vector machines).

Two-Category Case (Review)
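The log-likelihood-ratio discriminant above can be sketched in a few lines. This is a minimal illustration with hypothetical one-dimensional Gaussian class conditionals (means, variances, and priors are invented for the example):

```python
import numpy as np

# Hypothetical 1-D Gaussian class-conditional densities and priors
mu1, mu2 = 0.0, 2.0
var1, var2 = 1.0, 1.0
P1, P2 = 0.5, 0.5

def log_gaussian(x, mu, var):
    """Log of a univariate Gaussian density."""
    return -0.5 * np.log(2 * np.pi * var) - (x - mu) ** 2 / (2 * var)

def g(x):
    """Single discriminant g(x) = ln p(x|w1)/p(x|w2) + ln P(w1)/P(w2)."""
    return (log_gaussian(x, mu1, var1) - log_gaussian(x, mu2, var2)
            + np.log(P1 / P2))

# The dichotomizer classifies by the sign of g(x):
# positive -> class 1, negative -> class 2
print(g(0.5) > 0)  # True: 0.5 is closer to mu1
print(g(1.8) > 0)  # False
```

With equal priors and equal variances, $g(x)$ crosses zero exactly halfway between the means.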

Page 3: ECE 8443 – Pattern Recognition


Unconstrained or “Full” Covariance (Review)

Page 4: ECE 8443 – Pattern Recognition


• This has a simple geometric interpretation:

$\mathbf{x}_0 = \frac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\sigma^2}{\|\boldsymbol{\mu}_i - \boldsymbol{\mu}_j\|^2} \ln \frac{P(\omega_i)}{P(\omega_j)}\,(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$

• The decision region when the priors are equal and the support regions are spherical is simply halfway between the means (Euclidean distance).

Threshold Decoding (Review)
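The threshold point $\mathbf{x}_0$ above is easy to compute numerically. A small sketch with hypothetical 2-D means, spherical covariance $\sigma^2\mathbf{I}$, and unequal priors:

```python
import numpy as np

# Hypothetical 2-D means, spherical covariance sigma^2 * I, unequal priors
mu_i, mu_j = np.array([0.0, 0.0]), np.array([2.0, 0.0])
sigma2 = 1.0
P_i, P_j = 0.7, 0.3

diff = mu_i - mu_j
# x0 = 1/2 (mu_i + mu_j) - sigma^2 / ||mu_i - mu_j||^2 * ln(P_i/P_j) * (mu_i - mu_j)
x0 = (0.5 * (mu_i + mu_j)
      - sigma2 / np.dot(diff, diff) * np.log(P_i / P_j) * diff)
print(x0)  # shifted from the midpoint (1, 0) toward the less likely mean mu_j
```

When the priors are equal, the log term vanishes and $\mathbf{x}_0$ is exactly the midpoint of the means.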

Page 5: ECE 8443 – Pattern Recognition


• For the general multivariate normal case, the discriminant functions are:

$g_i(\mathbf{x}) = -\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^t \boldsymbol{\Sigma}_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i) - \frac{1}{2}\ln|\boldsymbol{\Sigma}_i| + \ln P(\omega_i)$

• The decision boundary $g_i(\mathbf{x}) - g_j(\mathbf{x}) = 0$ is therefore quadratic:

$g_i(\mathbf{x}) - g_j(\mathbf{x}) = \mathbf{x}^t \mathbf{A}\,\mathbf{x} + \mathbf{b}^t \mathbf{x} + c = 0$, where:

$\mathbf{A} = -\frac{1}{2}\left(\boldsymbol{\Sigma}_i^{-1} - \boldsymbol{\Sigma}_j^{-1}\right)$

$\mathbf{b} = \boldsymbol{\Sigma}_i^{-1}\boldsymbol{\mu}_i - \boldsymbol{\Sigma}_j^{-1}\boldsymbol{\mu}_j$

$c = -\frac{1}{2}\left(\boldsymbol{\mu}_i^t \boldsymbol{\Sigma}_i^{-1}\boldsymbol{\mu}_i - \boldsymbol{\mu}_j^t \boldsymbol{\Sigma}_j^{-1}\boldsymbol{\mu}_j\right) + \frac{1}{2}\ln\frac{|\boldsymbol{\Sigma}_j|}{|\boldsymbol{\Sigma}_i|} + \ln\frac{P(\omega_i)}{P(\omega_j)}$

General Case for Gaussian Classifiers
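The quadratic discriminant for the general Gaussian case can be evaluated directly. A minimal sketch with hypothetical means and full covariance matrices:

```python
import numpy as np

def g(x, mu, Sigma, prior):
    """Quadratic discriminant: -1/2 (x-mu)^t Sigma^{-1} (x-mu) - 1/2 ln|Sigma| + ln P."""
    d = x - mu
    return (-0.5 * d @ np.linalg.inv(Sigma) @ d
            - 0.5 * np.log(np.linalg.det(Sigma))
            + np.log(prior))

# Hypothetical classes with arbitrary (full) covariances and equal priors
mu1, S1 = np.array([0.0, 0.0]), np.eye(2)
mu2, S2 = np.array([2.0, 2.0]), np.array([[2.0, 0.5], [0.5, 1.0]])

x = np.array([1.0, 0.5])
label = 1 if g(x, mu1, S1, 0.5) > g(x, mu2, S2, 0.5) else 2
print(label)  # 1
```

Because the two covariances differ, the implied decision boundary is a quadratic surface (hyperquadric) rather than a hyperplane.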

Page 6: ECE 8443 – Pattern Recognition


• Case: $\boldsymbol{\Sigma}_i = \sigma^2 \mathbf{I}$

$g_i(\mathbf{x}) = \mathbf{w}_i^t \mathbf{x} + w_{i0}$, where $\mathbf{w}_i = \frac{1}{\sigma^2}\boldsymbol{\mu}_i$ and $w_{i0} = -\frac{1}{2\sigma^2}\boldsymbol{\mu}_i^t \boldsymbol{\mu}_i + \ln P(\omega_i)$

• This can be rewritten as:

$\mathbf{w}^t(\mathbf{x} - \mathbf{x}_0) = 0$, where $\mathbf{w} = \boldsymbol{\mu}_i - \boldsymbol{\mu}_j$ and

$\mathbf{x}_0 = \frac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\sigma^2}{\|\boldsymbol{\mu}_i - \boldsymbol{\mu}_j\|^2} \ln\frac{P(\omega_i)}{P(\omega_j)}\,(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$

Identity Covariance
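For $\boldsymbol{\Sigma}_i = \sigma^2\mathbf{I}$ the discriminant is linear, so the classifier is a "linear machine". A small sketch with hypothetical means and equal priors, where this reduces to a minimum-distance classifier:

```python
import numpy as np

sigma2 = 1.0
mus = [np.array([0.0, 0.0]), np.array([3.0, 1.0])]   # hypothetical class means
priors = [0.5, 0.5]

def linear_g(x, mu, prior):
    """g_i(x) = w_i^t x + w_i0, with w_i = mu_i / sigma^2
    and w_i0 = -mu_i^t mu_i / (2 sigma^2) + ln P(w_i)."""
    w = mu / sigma2
    w0 = -(mu @ mu) / (2 * sigma2) + np.log(prior)
    return w @ x + w0

x = np.array([1.0, 0.0])
scores = [linear_g(x, m, p) for m, p in zip(mus, priors)]
print(int(np.argmax(scores)))  # 0: with equal priors this picks the nearest mean
```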

Page 7: ECE 8443 – Pattern Recognition


• Case: $\boldsymbol{\Sigma}_i = \boldsymbol{\Sigma}$

$\mathbf{w}^t(\mathbf{x} - \mathbf{x}_0) = 0$, where $\mathbf{w} = \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$ and

$\mathbf{x}_0 = \frac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\ln[P(\omega_i)/P(\omega_j)]}{(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^t \boldsymbol{\Sigma}^{-1} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)}\,(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$

Equal Covariances
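The shared-covariance boundary point can be verified numerically: both discriminants must agree at $\mathbf{x}_0$. A sketch with a hypothetical shared covariance and unequal priors:

```python
import numpy as np

# Hypothetical shared covariance, means, and unequal priors
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
Sinv = np.linalg.inv(Sigma)
mu_i, mu_j = np.array([0.0, 0.0]), np.array([2.0, 1.0])
P_i, P_j = 0.8, 0.2

diff = mu_i - mu_j
w = Sinv @ diff   # boundary normal: w = Sigma^{-1} (mu_i - mu_j)
x0 = (0.5 * (mu_i + mu_j)
      - np.log(P_i / P_j) / (diff @ Sinv @ diff) * diff)

def g(x, mu, prior):
    """Discriminant for a shared covariance (the ln|Sigma| terms cancel)."""
    d = x - mu
    return -0.5 * d @ Sinv @ d + np.log(prior)

# x0 lies on the decision boundary, so both discriminants agree there
print(abs(g(x0, mu_i, P_i) - g(x0, mu_j, P_j)) < 1e-9)  # True
```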

Page 8: ECE 8443 – Pattern Recognition


Arbitrary Covariances

Page 10: ECE 8443 – Pattern Recognition


• The Bayes decision rule guarantees the lowest average error rate.
• A closed-form solution exists for two-class Gaussian distributions.
• The full calculation is difficult in high-dimensional spaces.
• Bounds provide a way to gain insight into a problem and engineer better solutions.

• We need the following inequality:

$\min[a, b] \le a^\lambda\, b^{1-\lambda}$, for $a, b \ge 0$ and $0 \le \lambda \le 1$

Assume $a \ge b$ without loss of generality: $\min[a,b] = b$.

Also, $a^\lambda b^{1-\lambda} = (a/b)^\lambda\, b$ and $(a/b)^\lambda \ge 1$.

Therefore, $b \le (a/b)^\lambda\, b$, which implies $\min[a,b] \le a^\lambda b^{1-\lambda}$.

• Apply to our standard expression for P(error).

Error Bounds
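The inequality is easy to sanity-check numerically. A quick sketch that tests it on random nonnegative pairs for several values of $\lambda$:

```python
import numpy as np

# Numeric check of min[a, b] <= a^lam * b^(1 - lam) on random nonnegative pairs
rng = np.random.default_rng(0)
a, b = rng.uniform(0.0, 1.0, size=(2, 1000))
for lam in (0.1, 0.5, 0.9):
    assert np.all(np.minimum(a, b) <= a ** lam * b ** (1 - lam))
print("inequality holds for all sampled (a, b) and lambda")
```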

Page 11: ECE 8443 – Pattern Recognition


• Recall:

$P(error) = \int P(error|\mathbf{x})\, p(\mathbf{x})\, d\mathbf{x} = \int \min[P(\omega_1|\mathbf{x}),\, P(\omega_2|\mathbf{x})]\, p(\mathbf{x})\, d\mathbf{x}$

• Applying Bayes' rule and the inequality $\min[a,b] \le a^\lambda b^{1-\lambda}$:

$P(error) = \int \min\!\left[\frac{P(\omega_1)\,p(\mathbf{x}|\omega_1)}{p(\mathbf{x})},\, \frac{P(\omega_2)\,p(\mathbf{x}|\omega_2)}{p(\mathbf{x})}\right] p(\mathbf{x})\, d\mathbf{x}$

$= \int \min[P(\omega_1)\,p(\mathbf{x}|\omega_1),\, P(\omega_2)\,p(\mathbf{x}|\omega_2)]\, d\mathbf{x}$

$\le P(\omega_1)^\lambda\, P(\omega_2)^{1-\lambda} \int p(\mathbf{x}|\omega_1)^\lambda\, p(\mathbf{x}|\omega_2)^{1-\lambda}\, d\mathbf{x}, \qquad 0 \le \lambda \le 1$

• Note that this integral is over the entire feature space, not the decision regions (which makes it simpler).

• If the conditional probabilities are normal, this expression can be simplified.

Chernoff Bound
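The bound can be checked by numerical integration in one dimension. A sketch with hypothetical 1-D Gaussian class conditionals, comparing the Chernoff bound against the exact Bayes error (both computed on a dense grid):

```python
import numpy as np

# Hypothetical 1-D Gaussian class conditionals and equal priors
mu1, mu2, s1, s2 = 0.0, 2.0, 1.0, 1.5
P1, P2, lam = 0.5, 0.5, 0.5

x = np.linspace(-10.0, 12.0, 20001)
dx = x[1] - x[0]
p1 = np.exp(-(x - mu1) ** 2 / (2 * s1 ** 2)) / np.sqrt(2 * np.pi * s1 ** 2)
p2 = np.exp(-(x - mu2) ** 2 / (2 * s2 ** 2)) / np.sqrt(2 * np.pi * s2 ** 2)

# Chernoff bound: P(error) <= P1^lam * P2^(1-lam) * integral p1^lam p2^(1-lam) dx
bound = P1 ** lam * P2 ** (1 - lam) * np.sum(p1 ** lam * p2 ** (1 - lam)) * dx

# Exact Bayes error: integral of min[P1 p1, P2 p2] over the whole feature space
bayes = np.sum(np.minimum(P1 * p1, P2 * p2)) * dx
print(bayes <= bound)  # True
```

Note that, as the slide says, both integrals run over the entire feature space; no decision regions are needed.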

Page 12: ECE 8443 – Pattern Recognition


• If the conditional probabilities are normal, our bound can be evaluated analytically:

$\int p(\mathbf{x}|\omega_1)^\lambda\, p(\mathbf{x}|\omega_2)^{1-\lambda}\, d\mathbf{x} = e^{-k(\lambda)}$

where:

$k(\lambda) = \frac{\lambda(1-\lambda)}{2}(\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1)^t \left[\lambda\boldsymbol{\Sigma}_1 + (1-\lambda)\boldsymbol{\Sigma}_2\right]^{-1} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1) + \frac{1}{2}\ln\frac{\left|\lambda\boldsymbol{\Sigma}_1 + (1-\lambda)\boldsymbol{\Sigma}_2\right|}{|\boldsymbol{\Sigma}_1|^\lambda\, |\boldsymbol{\Sigma}_2|^{1-\lambda}}$

• Procedure: find the value of $\lambda$ that minimizes $\exp(-k(\lambda))$, and then compute $P(error)$ using the bound.

• Benefit: a one-dimensional optimization over $\lambda$, regardless of the dimensionality of the feature space.

Chernoff Bound for Normal Densities
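The one-dimensional optimization over $\lambda$ can be sketched with a simple grid search. The example specializes $k(\lambda)$ to 1-D normals (scalar variances standing in for $\boldsymbol{\Sigma}_1, \boldsymbol{\Sigma}_2$; all parameters hypothetical):

```python
import numpy as np

# Hypothetical 1-D normal densities; variances v1, v2 play the role of Sigma_1, Sigma_2
mu1, mu2, v1, v2 = 0.0, 2.0, 1.0, 2.25
P1 = P2 = 0.5

def k(lam):
    """Chernoff exponent k(lam) for normal densities, specialized to 1-D."""
    v = lam * v1 + (1 - lam) * v2
    return (lam * (1 - lam) / 2 * (mu2 - mu1) ** 2 / v
            + 0.5 * np.log(v / (v1 ** lam * v2 ** (1 - lam))))

# One-dimensional search: maximizing k(lam) minimizes exp(-k(lam))
lams = np.linspace(0.01, 0.99, 99)
best = lams[np.argmax([k(l) for l in lams])]
bound = P1 ** best * P2 ** (1 - best) * np.exp(-best * 0 + -k(best) * 1) if False else \
        P1 ** best * P2 ** (1 - best) * np.exp(-k(best))
print(best, bound)
```

The search is over a single scalar no matter how many feature dimensions the full problem has, which is the "benefit" noted on the previous slide.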

Page 13: ECE 8443 – Pattern Recognition


• The Chernoff bound is loose for extreme values of $\lambda$.
• The Bhattacharyya bound can be derived by setting $\lambda = 1/2$:

$P(error) \le P(\omega_1)^\lambda\, P(\omega_2)^{1-\lambda} \int p(\mathbf{x}|\omega_1)^\lambda\, p(\mathbf{x}|\omega_2)^{1-\lambda}\, d\mathbf{x}$

$\le \sqrt{P(\omega_1)P(\omega_2)} \int \sqrt{p(\mathbf{x}|\omega_1)\, p(\mathbf{x}|\omega_2)}\, d\mathbf{x} = \sqrt{P(\omega_1)P(\omega_2)}\; e^{-k(1/2)}$

where:

$k(1/2) = \frac{1}{8}(\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1)^t \left[\frac{\boldsymbol{\Sigma}_1 + \boldsymbol{\Sigma}_2}{2}\right]^{-1} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1) + \frac{1}{2}\ln\frac{\left|\frac{\boldsymbol{\Sigma}_1 + \boldsymbol{\Sigma}_2}{2}\right|}{\sqrt{|\boldsymbol{\Sigma}_1|\,|\boldsymbol{\Sigma}_2|}}$

• These bounds can still be used if the distributions are not Gaussian (why? hint: Occam’s Razor). However, they might not be adequately tight.

Bhattacharyya Bound
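The $k(1/2)$ expression translates directly into code. A sketch with hypothetical 2-D Gaussian classes and unequal covariances:

```python
import numpy as np

# Hypothetical 2-D Gaussian classes with unequal covariances, equal priors
mu1, S1 = np.array([0.0, 0.0]), np.eye(2)
mu2, S2 = np.array([2.0, 1.0]), np.array([[2.0, 0.3], [0.3, 1.0]])
P1 = P2 = 0.5

Sbar = (S1 + S2) / 2
d = mu2 - mu1
# k(1/2) = 1/8 d^t Sbar^{-1} d + 1/2 ln(|Sbar| / sqrt(|S1||S2|))
k_half = (d @ np.linalg.inv(Sbar) @ d / 8
          + 0.5 * np.log(np.linalg.det(Sbar)
                         / np.sqrt(np.linalg.det(S1) * np.linalg.det(S2))))
bound = np.sqrt(P1 * P2) * np.exp(-k_half)
print(bound)  # Bhattacharyya bound on P(error)
```

Unlike the Chernoff bound, no optimization over $\lambda$ is required, at the price of a (generally) slightly looser bound.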

Page 14: ECE 8443 – Pattern Recognition


• How do we compare two decision rules if they require different thresholds for optimum performance?

• Consider four probabilities:

$P(x > x^* \,|\, \omega_2)$ : hit

$P(x > x^* \,|\, \omega_1)$ : false alarm

$P(x < x^* \,|\, \omega_2)$ : miss

$P(x < x^* \,|\, \omega_1)$ : correct rejection

Receiver Operating Characteristic (ROC)
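Sweeping the threshold $x^*$ and recording the hit and false-alarm rates traces out the ROC curve. A minimal sketch using hypothetical "noise" ($\omega_1$) and "signal" ($\omega_2$) score distributions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 1-D decision statistic: w1 = "noise", w2 = "signal"
noise = rng.normal(0.0, 1.0, 10000)    # class w1
signal = rng.normal(1.5, 1.0, 10000)   # class w2

def rates(x_star):
    """Hit and false-alarm rates for threshold x*."""
    hit = np.mean(signal > x_star)          # P(x > x* | w2)
    false_alarm = np.mean(noise > x_star)   # P(x > x* | w1)
    return hit, false_alarm

# Sweeping x* from low to high traces out the ROC curve
for t in (-1.0, 0.75, 2.5):
    h, fa = rates(t)
    print(f"x* = {t:+.2f}: hit = {h:.3f}, false alarm = {fa:.3f}")
```

Each threshold is one operating point; comparing systems means comparing whole curves, not single points.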

Page 15: ECE 8443 – Pattern Recognition


• An ROC curve is typically monotonic but not symmetric:

• One system can be considered superior to another only if its ROC curve lies above the competing system for the operating region of interest.

General ROC Curves

Page 16: ECE 8443 – Pattern Recognition


Summary

• Gaussian Distributions: how is the shape of the decision region influenced by the mean and covariance?
• Bounds on performance (i.e., Chernoff, Bhattacharyya) are useful abstractions for obtaining closed-form solutions to problems.
• A Receiver Operating Characteristic (ROC) curve is a very useful way to analyze performance and select operating points for systems.
• Discrete features can be handled in a way completely analogous to continuous features.