ECE 8443 – Pattern Recognition
LECTURE 04: PERFORMANCE BOUNDS
• Objectives: Typical Examples, Performance Bounds, ROC Curves
• Resources: D.H.S.: Chapter 2 (Part 3); V.V.: Chernoff Bound; J.G.: Bhattacharyya; T.T.: ROC Curves; NIST: DET Curves
Two-Category Case (Review)

• A classifier that places a pattern in one of two classes is often referred to as a dichotomizer.
• We can reshape the decision rule in terms of a single discriminant function:

$$ g(\mathbf{x}) \equiv g_1(\mathbf{x}) - g_2(\mathbf{x});\quad \text{decide } \omega_1 \text{ if } g(\mathbf{x}) > 0,\ \text{otherwise decide } \omega_2 $$

• If we use the log of the posterior probabilities:

$$ g(\mathbf{x}) = \ln\frac{p(\mathbf{x}\,|\,\omega_1)}{p(\mathbf{x}\,|\,\omega_2)} + \ln\frac{P(\omega_1)}{P(\omega_2)} $$

• A dichotomizer can be viewed as a machine that computes a single discriminant function and classifies x according to its sign (e.g., support vector machines).
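The sign rule above can be sketched numerically. A minimal example, assuming two made-up univariate Gaussian class conditionals with a shared variance (the means, variance, and priors below are illustrative, not from the lecture):

```python
import math

def g(x, mu1, mu2, sigma, p1, p2):
    """Discriminant g(x) = ln p(x|w1)/p(x|w2) + ln P(w1)/P(w2)
    for two univariate Gaussians sharing variance sigma^2."""
    log_likelihood_ratio = (-(x - mu1) ** 2 + (x - mu2) ** 2) / (2 * sigma ** 2)
    return log_likelihood_ratio + math.log(p1 / p2)

def dichotomize(x, **params):
    # Classify by the sign of the single discriminant function.
    return 1 if g(x, **params) > 0 else 2

# Illustrative parameters (not from the lecture)
params = dict(mu1=0.0, mu2=2.0, sigma=1.0, p1=0.5, p2=0.5)
print(dichotomize(0.2, **params))  # -> 1 (closer to mu1)
print(dichotomize(1.8, **params))  # -> 2 (closer to mu2)
```

With equal priors and a shared variance, the boundary falls halfway between the means, so points on either side of x = 1 are assigned to the nearer class.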
Unconstrained or “Full” Covariance (Review)
Threshold Decoding (Review)

• For the case Σᵢ = σ²I, the decision boundary passes through the point:

$$ \mathbf{x}_0 = \frac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\sigma^2}{\|\boldsymbol{\mu}_i - \boldsymbol{\mu}_j\|^2}\,\ln\frac{P(\omega_i)}{P(\omega_j)}\,(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j) $$

• This has a simple geometric interpretation: when the priors are equal and the support regions are spherical, the decision boundary lies simply halfway between the means (Euclidean distance).
General Case for Gaussian Classifiers

• For arbitrary Gaussian densities, the discriminant for each class is:

$$ g_i(\mathbf{x}) = -\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^t \boldsymbol{\Sigma}_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i) - \frac{1}{2}\ln|\boldsymbol{\Sigma}_i| + \ln P(\omega_i) $$

• The difference between two discriminants is quadratic in x:

$$ g_i(\mathbf{x}) - g_j(\mathbf{x}) = \mathbf{x}^t \mathbf{A}\,\mathbf{x} + \mathbf{b}^t \mathbf{x} + c $$

where:

$$ \mathbf{A} = \frac{1}{2}\left(\boldsymbol{\Sigma}_j^{-1} - \boldsymbol{\Sigma}_i^{-1}\right),\qquad \mathbf{b} = \boldsymbol{\Sigma}_i^{-1}\boldsymbol{\mu}_i - \boldsymbol{\Sigma}_j^{-1}\boldsymbol{\mu}_j $$

$$ c = -\frac{1}{2}\left(\boldsymbol{\mu}_i^t \boldsymbol{\Sigma}_i^{-1}\boldsymbol{\mu}_i - \boldsymbol{\mu}_j^t \boldsymbol{\Sigma}_j^{-1}\boldsymbol{\mu}_j\right) - \frac{1}{2}\ln\frac{|\boldsymbol{\Sigma}_i|}{|\boldsymbol{\Sigma}_j|} + \ln\frac{P(\omega_i)}{P(\omega_j)} $$
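A quick numerical check of the quadratic form, assuming invented means, covariances, and priors (nothing here comes from the lecture): the difference of the two Gaussian discriminants is evaluated both directly and via A, b, and c.

```python
import numpy as np

# Illustrative two-class 2D problem (all parameters invented)
mu_i, mu_j = np.array([0.0, 0.0]), np.array([2.0, 1.0])
S_i = np.array([[1.0, 0.2], [0.2, 1.5]])
S_j = np.array([[0.8, -0.1], [-0.1, 0.6]])
Pi, Pj = 0.6, 0.4

Si_inv, Sj_inv = np.linalg.inv(S_i), np.linalg.inv(S_j)

# Quadratic form g_i(x) - g_j(x) = x^T A x + b^T x + c
A = 0.5 * (Sj_inv - Si_inv)
b = Si_inv @ mu_i - Sj_inv @ mu_j
c = (-0.5 * (mu_i @ Si_inv @ mu_i - mu_j @ Sj_inv @ mu_j)
     - 0.5 * np.log(np.linalg.det(S_i) / np.linalg.det(S_j))
     + np.log(Pi / Pj))

def g_diff(x):
    return x @ A @ x + b @ x + c

def g(x, mu, S, S_inv, P):
    # Direct Gaussian discriminant (common constant terms dropped)
    d = x - mu
    return -0.5 * d @ S_inv @ d - 0.5 * np.log(np.linalg.det(S)) + np.log(P)

x = np.array([0.7, -0.3])
assert np.isclose(g_diff(x), g(x, mu_i, S_i, Si_inv, Pi) - g(x, mu_j, S_j, Sj_inv, Pj))
```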
Identity Covariance

• Case: Σᵢ = σ²I

$$ g_i(\mathbf{x}) = -\frac{\|\mathbf{x}-\boldsymbol{\mu}_i\|^2}{2\sigma^2} + \ln P(\omega_i) $$

• This can be rewritten as a linear discriminant:

$$ g_i(\mathbf{x}) = \mathbf{w}_i^t \mathbf{x} + w_{i0},\qquad \mathbf{w}_i = \frac{\boldsymbol{\mu}_i}{\sigma^2},\qquad w_{i0} = -\frac{\boldsymbol{\mu}_i^t\boldsymbol{\mu}_i}{2\sigma^2} + \ln P(\omega_i) $$

• The decision boundary is the hyperplane $\mathbf{w}^t(\mathbf{x}-\mathbf{x}_0)=0$, where $\mathbf{w} = \boldsymbol{\mu}_i - \boldsymbol{\mu}_j$ and

$$ \mathbf{x}_0 = \frac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\sigma^2}{\|\boldsymbol{\mu}_i - \boldsymbol{\mu}_j\|^2}\,\ln\frac{P(\omega_i)}{P(\omega_j)}\,(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j) $$
Equal Covariances

• Case: Σᵢ = Σ

$$ g_i(\mathbf{x}) = \mathbf{w}_i^t \mathbf{x} + w_{i0},\qquad \mathbf{w}_i = \boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_i,\qquad w_{i0} = -\frac{1}{2}\boldsymbol{\mu}_i^t\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_i + \ln P(\omega_i) $$

• The decision boundary is again a hyperplane $\mathbf{w}^t(\mathbf{x}-\mathbf{x}_0)=0$, with $\mathbf{w} = \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$ and

$$ \mathbf{x}_0 = \frac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\ln\left[P(\omega_i)/P(\omega_j)\right]}{(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^t\boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)}\,(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j) $$
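As a sketch, the boundary point x₀ for the shared-covariance case can be verified numerically; the means, covariance, and priors below are invented for illustration.

```python
import numpy as np

# Illustrative shared-covariance problem (all parameters invented)
mu_i, mu_j = np.array([0.0, 0.0]), np.array([3.0, 1.0])
S = np.array([[1.0, 0.3], [0.3, 2.0]])
Pi, Pj = 0.7, 0.3
S_inv = np.linalg.inv(S)

d = mu_i - mu_j
x0 = 0.5 * (mu_i + mu_j) - (np.log(Pi / Pj) / (d @ S_inv @ d)) * d

def g(x, mu, P):
    # Linear discriminant for a shared covariance (quadratic term cancels)
    w = S_inv @ mu
    w0 = -0.5 * mu @ S_inv @ mu + np.log(P)
    return w @ x + w0

# On the boundary point x0 the two discriminants agree.
assert np.isclose(g(x0, mu_i, Pi), g(x0, mu_j, Pj))
```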
Arbitrary Covariances
Typical Examples of 2D Classifiers
Error Bounds

• The Bayes decision rule guarantees the lowest average error rate.
• A closed-form solution exists for two-class Gaussian distributions.
• The full calculation is difficult in high-dimensional spaces.
• Bounds provide a way to get insight into a problem and engineer better solutions.
• We need the following inequality:

$$ \min[a,b] \le a^{\beta} b^{1-\beta},\qquad a,b \ge 0,\quad 0 \le \beta \le 1 $$

  Assume a ≥ b without loss of generality: min[a,b] = b. Also, $a^{\beta} b^{1-\beta} = (a/b)^{\beta} b$ and $(a/b)^{\beta} \ge 1$. Therefore, $b \le (a/b)^{\beta} b$, which implies $\min[a,b] \le a^{\beta} b^{1-\beta}$.
• Apply this to our standard expression for P(error).
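The inequality can be spot-checked numerically; a minimal sketch, assuming random non-negative a, b and β in [0, 1):

```python
import random

# Numerical spot-check of min[a,b] <= a^beta * b^(1-beta)
# for a, b >= 0 and 0 <= beta <= 1 (values are random, just for illustration).
random.seed(0)
for _ in range(1000):
    a, b = random.uniform(0, 10), random.uniform(0, 10)
    beta = random.random()
    assert min(a, b) <= a ** beta * b ** (1 - beta) + 1e-12
print("inequality holds on all samples")
```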
Chernoff Bound

• Recall:

$$ P(\text{error}) = \int \min\left[P(\omega_1\,|\,\mathbf{x}),\, P(\omega_2\,|\,\mathbf{x})\right] p(\mathbf{x})\, d\mathbf{x} = \int \min\left[P(\omega_1)\,p(\mathbf{x}\,|\,\omega_1),\, P(\omega_2)\,p(\mathbf{x}\,|\,\omega_2)\right] d\mathbf{x} $$

• Applying the inequality:

$$ P(\text{error}) \le P^{\beta}(\omega_1)\, P^{1-\beta}(\omega_2) \int p^{\beta}(\mathbf{x}\,|\,\omega_1)\, p^{1-\beta}(\mathbf{x}\,|\,\omega_2)\, d\mathbf{x},\qquad 0 \le \beta \le 1 $$

• Note that this integral is over the entire feature space, not the decision regions (which makes it simpler).
• If the conditional probabilities are normal, this expression can be simplified.
Chernoff Bound for Normal Densities

• If the conditional probabilities are normal, our bound can be evaluated analytically:

$$ \int p^{\beta}(\mathbf{x}\,|\,\omega_1)\, p^{1-\beta}(\mathbf{x}\,|\,\omega_2)\, d\mathbf{x} = e^{-k(\beta)} $$

where:

$$ k(\beta) = \frac{\beta(1-\beta)}{2}(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1)^t \left[\beta\boldsymbol{\Sigma}_1 + (1-\beta)\boldsymbol{\Sigma}_2\right]^{-1}(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1) + \frac{1}{2}\ln\frac{\left|\beta\boldsymbol{\Sigma}_1 + (1-\beta)\boldsymbol{\Sigma}_2\right|}{|\boldsymbol{\Sigma}_1|^{\beta}\,|\boldsymbol{\Sigma}_2|^{1-\beta}} $$

• Procedure: find the value of β that minimizes exp(−k(β)), and then compute P(error) using the bound.
• Benefit: a one-dimensional optimization over β.
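The procedure (evaluate k(β), then minimize the bound over β) reduces to a one-dimensional search; a sketch with invented class parameters and a simple grid search:

```python
import numpy as np

# Illustrative two-class 2D problem (all parameters invented)
mu1, mu2 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
S1 = np.array([[1.0, 0.0], [0.0, 1.5]])
S2 = np.array([[0.9, 0.2], [0.2, 0.7]])
P1, P2 = 0.5, 0.5

def k(beta):
    """k(beta) from the Chernoff bound for Gaussian class conditionals."""
    S = beta * S1 + (1 - beta) * S2
    d = mu2 - mu1
    quad = 0.5 * beta * (1 - beta) * (d @ np.linalg.inv(S) @ d)
    logdet = 0.5 * np.log(np.linalg.det(S)
                          / (np.linalg.det(S1) ** beta
                             * np.linalg.det(S2) ** (1 - beta)))
    return quad + logdet

# One-dimensional optimization: grid-search beta over (0, 1).
betas = np.linspace(0.01, 0.99, 99)
bounds = [P1 ** b * P2 ** (1 - b) * np.exp(-k(b)) for b in betas]
beta_star = betas[int(np.argmin(bounds))]
print(f"beta* = {beta_star:.2f}, Chernoff bound on P(error) = {min(bounds):.4f}")
```

A finer grid or a scalar optimizer could replace the grid search; the point is only that the search is over a single variable.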
Bhattacharyya Bound

• The Chernoff bound is loose for extreme values of β.
• The Bhattacharyya bound can be derived by setting β = 0.5:

$$ P(\text{error}) \le \sqrt{P(\omega_1)P(\omega_2)} \int \sqrt{p(\mathbf{x}\,|\,\omega_1)\, p(\mathbf{x}\,|\,\omega_2)}\, d\mathbf{x} = \sqrt{P(\omega_1)P(\omega_2)}\; e^{-k(1/2)} $$

where:

$$ k(1/2) = \frac{1}{8}(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1)^t \left[\frac{\boldsymbol{\Sigma}_1+\boldsymbol{\Sigma}_2}{2}\right]^{-1}(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1) + \frac{1}{2}\ln\frac{\left|\frac{\boldsymbol{\Sigma}_1+\boldsymbol{\Sigma}_2}{2}\right|}{\sqrt{|\boldsymbol{\Sigma}_1|\,|\boldsymbol{\Sigma}_2|}} $$

• These bounds can still be used if the distributions are not Gaussian (why? hint: Occam’s Razor). However, they might not be adequately tight.
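A minimal sketch of evaluating k(1/2) and the resulting bound, again with invented class parameters:

```python
import numpy as np

# Illustrative two-class 2D problem (all parameters invented)
mu1, mu2 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
S1 = np.array([[1.0, 0.0], [0.0, 1.5]])
S2 = np.array([[0.9, 0.2], [0.2, 0.7]])
P1, P2 = 0.5, 0.5

d = mu2 - mu1
S_avg = 0.5 * (S1 + S2)  # average covariance from beta = 1/2
k_half = (0.125 * d @ np.linalg.inv(S_avg) @ d
          + 0.5 * np.log(np.linalg.det(S_avg)
                         / np.sqrt(np.linalg.det(S1) * np.linalg.det(S2))))
bound = np.sqrt(P1 * P2) * np.exp(-k_half)
print(f"Bhattacharyya bound on P(error) = {bound:.4f}")
```

Because k(1/2) fixes β rather than optimizing it, this bound can only be as tight as (or looser than) the Chernoff bound on the same problem.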
Receiver Operating Characteristic (ROC)

• How do we compare two decision rules if they require different thresholds for optimum performance?
• Consider four probabilities defined by a threshold x*:

$$ P(x > x^{*} \mid \omega_2): \text{hit} \qquad\qquad P(x > x^{*} \mid \omega_1): \text{false alarm} $$
$$ P(x < x^{*} \mid \omega_2): \text{miss} \qquad\qquad P(x < x^{*} \mid \omega_1): \text{correct rejection} $$
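The four probabilities trace out an ROC curve as the threshold x* sweeps; a sketch assuming two made-up 1D Gaussian class conditionals (noise ω₁, signal ω₂):

```python
import math

def norm_cdf(z):
    # Standard normal CDF via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def roc_point(x_star, mu_noise=0.0, mu_signal=1.0, sigma=1.0):
    # Illustrative parameters: noise mean 0, signal mean 1, unit variance
    p_fa = 1 - norm_cdf((x_star - mu_noise) / sigma)    # P(x > x* | w1): false alarm
    p_hit = 1 - norm_cdf((x_star - mu_signal) / sigma)  # P(x > x* | w2): hit
    return p_fa, p_hit

# Sweep the threshold from -2 to 4; each threshold gives one ROC point.
curve = [roc_point(i * 0.25) for i in range(-8, 17)]

# Monotonic: the hit rate never increases as the threshold rises.
assert all(curve[i][1] >= curve[i + 1][1] for i in range(len(curve) - 1))
```

Plotting hit rate against false-alarm rate over the sweep yields the ROC curve; each point corresponds to one operating threshold.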
General ROC Curves

• An ROC curve is typically monotonic but not symmetric.
• One system can be considered superior to another only if its ROC curve lies above the competing system’s curve over the operating region of interest.
Summary

• Gaussian Distributions: how is the shape of the decision region influenced by the mean and covariance?
• Bounds on performance (i.e., Chernoff, Bhattacharyya) are useful abstractions for obtaining closed-form solutions to problems.
• A Receiver Operating Characteristic (ROC) curve is a very useful way to analyze performance and select operating points for systems.
• Discrete features can be handled in a way completely analogous to continuous features.