Classification with several populations

Presented by: Libin Zhou

Classification procedure

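Minimum Expected Cost of Misclassification (ECM) Method

The ECM for two populations is:

$$\mathrm{ECM} = c(2|1)\,P(2|1)\,p_1 + c(1|2)\,P(1|2)\,p_2$$

Where: $P(k|i)$ is the conditional probability of classifying an item from population $i$ as population $k$; $p_i$ is the prior probability of population $i$; $c(k|i)$ is the cost of that misclassification.

For $g$ populations, the conditional expected cost of misclassifying an item from population 1 is:

$$\mathrm{ECM}(1) = P(2|1)\,c(2|1) + P(3|1)\,c(3|1) + \cdots + P(g|1)\,c(g|1) = \sum_{k=2}^{g} P(k|1)\,c(k|1)$$

In the same way, for population $i$:

$$\mathrm{ECM}(i) = \sum_{k=1,\,k \neq i}^{g} P(k|i)\,c(k|i)$$

The overall ECM weights each conditional expected cost by its prior probability:

$$\mathrm{ECM} = p_1\,\mathrm{ECM}(1) + p_2\,\mathrm{ECM}(2) + \cdots + p_g\,\mathrm{ECM}(g) = \sum_{i=1}^{g} p_i \left( \sum_{k=1,\,k \neq i}^{g} P(k|i)\,c(k|i) \right)$$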
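As a small numerical illustration of the overall ECM for several populations, the sketch below evaluates the double sum directly. The function name `expected_cost` and all numbers are hypothetical and chosen only for illustration; `P[i][k]` stands for $P(k|i)$ and `c[i][k]` for $c(k|i)$, both with 0-based indices.

```python
def expected_cost(priors, P, c):
    """Overall ECM = sum_i p_i * sum_{k != i} P(k|i) * c(k|i).

    P[i][k]: probability of classifying an item from population i as k.
    c[i][k]: cost of that misclassification (diagonal entries are unused).
    """
    g = len(priors)
    ecm = 0.0
    for i in range(g):
        # Conditional expected cost ECM(i) for items from population i.
        ecm_i = sum(P[i][k] * c[i][k] for k in range(g) if k != i)
        ecm += priors[i] * ecm_i
    return ecm


# Hypothetical inputs for g = 3 populations (illustration only).
priors = [0.5, 0.3, 0.2]
P = [[0.8, 0.1, 0.1],
     [0.2, 0.7, 0.1],
     [0.1, 0.3, 0.6]]
c = [[0, 10, 20],
     [5, 0, 10],
     [50, 20, 0]]
ecm = expected_cost(priors, P, c)

# With g = 2 the same function reduces to the two-population formula.
ecm_two = expected_cost([0.6, 0.4], [[0.9, 0.1], [0.2, 0.8]], [[0, 5], [10, 0]])
```

Here ECM(1) = 3, ECM(2) = 2 and ECM(3) = 11, so the overall value is 0.5(3) + 0.3(2) + 0.2(11) = 4.3.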

Minimum ECM classification Rule

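Result 11.5 on page 614. The ECM is minimized by assigning x to the population k, k = 1, 2, …, g, for which

$$\sum_{i=1,\,i \neq k}^{g} p_i\,f_i(x)\,c(k|i)$$

is smallest.

• If the misclassification costs are all equal, the rule simplifies to: assign x to population k if

$$p_k f_k(x) > p_i f_i(x) \quad \text{for all } i \neq k$$

Or, equivalently,

$$\ln p_k f_k(x) > \ln p_i f_i(x) \quad \text{for all } i \neq k$$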
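The minimum ECM rule can be sketched in a few lines once the densities $f_i(x)$ have been evaluated at the observation. The function name `min_ecm_allocate` and the numbers are hypothetical; the example only shows that equal costs lead to the largest $p_i f_i(x)$, while a large cost attached to one population can change the allocation.

```python
def min_ecm_allocate(priors, densities, c):
    """Return the 0-based index k that minimizes sum_{i != k} p_i f_i(x) c(k|i).

    densities[i] is f_i(x) evaluated at the observation x.
    c[i][k] is the cost of assigning an item from population i to k.
    """
    g = len(priors)

    def cost(k):
        return sum(priors[i] * densities[i] * c[i][k]
                   for i in range(g) if i != k)

    return min(range(g), key=cost)


# Hypothetical values (illustration only).
priors = [0.5, 0.3, 0.2]
densities = [0.10, 0.40, 0.30]   # f_1(x), f_2(x), f_3(x)

equal_costs = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
k_equal = min_ecm_allocate(priors, densities, equal_costs)

# Misclassifying an item from the third population is made very costly.
unequal_costs = [[0, 1, 1], [1, 0, 1], [20, 20, 0]]
k_unequal = min_ecm_allocate(priors, densities, unequal_costs)
```

With equal costs the products $p_i f_i(x)$ are 0.05, 0.12 and 0.06, so the second population is chosen; with the unequal costs the third population is chosen instead.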

Maximum posterior probability Rule

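The posterior probability is $P(\pi_k|x)$ = P(x comes from population k given that x was observed):

$$P(\pi_k|x) = \frac{p_k f_k(x)}{\sum_{i=1}^{g} p_i f_i(x)} = \frac{(\text{prior}_k) \times (\text{likelihood}_k)}{\sum_{i=1}^{g} [(\text{prior}_i) \times (\text{likelihood}_i)]} \quad \text{for } k = 1, 2, \ldots, g$$

The rule assigns x to the population with the largest posterior probability. It is the generalization of the largest posterior probability rule for two-population classification (Equation (11-9)).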
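The posterior probabilities are simply the products prior × likelihood, normalized to sum to one. A minimal sketch, with a hypothetical function name and hypothetical numbers:

```python
def posteriors(priors, densities):
    """P(pi_k | x) = p_k f_k(x) / sum_i p_i f_i(x) for k = 1, ..., g."""
    joint = [p * f for p, f in zip(priors, densities)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical values (illustration only).
priors = [0.5, 0.3, 0.2]
densities = [0.10, 0.40, 0.30]   # f_1(x), f_2(x), f_3(x)

post = posteriors(priors, densities)
best = post.index(max(post))     # 0-based index of the chosen population
```

Because the denominator is the same for every k, the largest posterior probability always belongs to the population with the largest $p_k f_k(x)$.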

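Classification with Normal populations

When the populations are multivariate normal, the term $\ln p_k f_k(x)$ in the minimum ECM classification rule with equal misclassification costs (Equation (11-41)) can be written using the density

$$f_i(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma_i|^{1/2}} \exp\left[-\tfrac{1}{2}(x-\mu_i)'\,\Sigma_i^{-1}(x-\mu_i)\right], \quad i = 1, 2, \ldots, g$$

Then x is assigned to population k if

$$\ln p_k f_k(x) = \ln p_k - \tfrac{p}{2}\ln(2\pi) - \tfrac{1}{2}\ln|\Sigma_k| - \tfrac{1}{2}(x-\mu_k)'\,\Sigma_k^{-1}(x-\mu_k) = \max_i \ln p_i f_i(x)$$

The constant $\tfrac{p}{2}\ln(2\pi)$ is the same for every population and can be dropped, which gives

$$d_i^Q(x) = -\tfrac{1}{2}\ln|\Sigma_i| - \tfrac{1}{2}(x-\mu_i)'\,\Sigma_i^{-1}(x-\mu_i) + \ln p_i$$

Where $d_i^Q$ is the quadratic discrimination score, i = 1, 2, …, g, and p in the density is the number of variables.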

Minimum total probability of misclassification (TPM) rule for normal populations with different $\Sigma_i$

If the quadratic discrimination score

$$d_k^Q(x) = \max_i \left( d_i^Q(x) \right)$$

then x is allocated to population k.
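The quadratic discrimination score and the resulting allocation can be sketched as follows. To stay within the standard library, the sketch assumes two variables so that the 2×2 determinant and inverse can be written out explicitly; the function names and all parameter values are hypothetical.

```python
import math


def quadratic_score(x, mu, sigma, prior):
    """d_i^Q(x) = -1/2 ln|Sigma_i| - 1/2 (x-mu_i)' Sigma_i^{-1} (x-mu_i) + ln p_i.

    Two-variable case only: sigma is a 2x2 nested list.
    """
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form using the explicit inverse [[s22, -s12], [-s21, s11]] / det.
    quad = (s22 * d0 * d0 - (s12 + s21) * d0 * d1 + s11 * d1 * d1) / det
    return -0.5 * math.log(det) - 0.5 * quad + math.log(prior)


def allocate(x, populations):
    """Return the 0-based index of the population with the largest score."""
    scores = [quadratic_score(x, mu, sigma, prior)
              for mu, sigma, prior in populations]
    return scores.index(max(scores))


# Hypothetical populations: (mean vector, covariance matrix, prior).
populations = [
    ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 0.5),
    ([3.0, 3.0], [[4.0, 0.0], [0.0, 4.0]], 0.5),
]
near_first = allocate([0.5, 0.5], populations)
near_second = allocate([3.0, 3.0], populations)
```

For more than two variables the same formula applies, with the determinant and inverse taken from a linear algebra library.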

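Estimated minimum TPM rule for several normal populations with different $\Sigma_i$

In practice, the $\mu_i$ and $\Sigma_i$ are usually unknown, but a training set of correctly classified observations is often available for the construction of estimates. The relevant sample quantities for population i are the sample mean vector $\bar{x}_i$ and the sample covariance matrix $S_i$.

Then the estimated score $\hat{d}_i^Q$ can be written as

$$\hat{d}_i^Q(x) = -\tfrac{1}{2}\ln|S_i| - \tfrac{1}{2}(x-\bar{x}_i)'\,S_i^{-1}(x-\bar{x}_i) + \ln p_i, \quad i = 1, 2, \ldots, g$$

and x is allocated to the population with the largest $\hat{d}_i^Q(x)$.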
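The sample quantities for one population can be computed from its training observations as in the sketch below. The function name and the tiny data set are hypothetical, and the covariance uses the usual divisor n − 1.

```python
def sample_mean_cov(rows):
    """Sample mean vector and sample covariance matrix (divisor n - 1).

    rows is a list of observations, each a list of p variable values.
    """
    n = len(rows)
    p = len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[j] - mean[j]) * (r[k] - mean[k]) for r in rows) / (n - 1)
            for k in range(p)]
           for j in range(p)]
    return mean, cov


# Hypothetical training observations for one population (illustration only).
training = [[1.0, 2.0], [2.0, 4.0], [3.0, 9.0]]
xbar, S = sample_mean_cov(training)
```

Running the same function on each population's training set gives the estimates that replace the unknown population mean vectors and covariance matrices in the estimated score.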

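The estimated minimum TPM rule for equal-covariance normal populations

If the covariance matrices of the several populations are equal, the quadratic discrimination score simplifies to a linear discriminant score, which is estimated using the pooled estimate of the covariance matrix, $S_{pooled}$. Expanding the quadratic form and dropping the terms that are the same for every population gives

$$\hat{d}_i(x) = \bar{x}_i'\,S_{pooled}^{-1}\,x - \tfrac{1}{2}\,\bar{x}_i'\,S_{pooled}^{-1}\,\bar{x}_i + \ln p_i$$

We can also define a new quantity, the generalized squared distance:

$$D_i^2(x) = (x-\bar{x}_i)'\,S_{pooled}^{-1}(x-\bar{x}_i)$$

Then the sample discriminant score can be written as

$$\hat{d}_i^Q(x) = -\tfrac{1}{2}\ln|S_{pooled}| - \tfrac{1}{2}(x-\bar{x}_i)'\,S_{pooled}^{-1}(x-\bar{x}_i) + \ln p_i = \text{const} - \tfrac{1}{2}D_i^2(x) + \ln p_i$$

so x is allocated to the population for which $-\tfrac{1}{2}D_i^2(x) + \ln p_i$ is largest.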
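A minimal sketch of the equal-covariance case follows. It assumes the standard pooled estimate $S_{pooled} = \sum_i (n_i - 1) S_i / (\sum_i n_i - g)$, where $n_i$ is the size of the training sample from population i, and it is restricted to two variables so that the inverse can be written out by hand. All function names and numbers are hypothetical.

```python
import math


def pooled_cov(covs, ns):
    """S_pooled = sum_i (n_i - 1) S_i / (sum_i n_i - g)."""
    g = len(covs)
    p = len(covs[0])
    dof = sum(ns) - g
    return [[sum((n - 1) * S[j][k] for S, n in zip(covs, ns)) / dof
             for k in range(p)]
            for j in range(p)]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as a nested list."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def linear_score(x, mean, s_inv, prior):
    """d_i(x) = xbar_i' S^-1 x - 1/2 xbar_i' S^-1 xbar_i + ln p_i (two variables)."""
    # w = S^-1 xbar_i; S^-1 is symmetric, so xbar_i' S^-1 x = w' x.
    w = [s_inv[0][0] * mean[0] + s_inv[0][1] * mean[1],
         s_inv[1][0] * mean[0] + s_inv[1][1] * mean[1]]
    return (w[0] * x[0] + w[1] * x[1]
            - 0.5 * (w[0] * mean[0] + w[1] * mean[1])
            + math.log(prior))


# Hypothetical sample quantities for g = 2 populations (illustration only).
covs = [[[1.0, 0.0], [0.0, 1.0]], [[3.0, 0.0], [0.0, 3.0]]]
ns = [11, 11]
means = [[0.0, 0.0], [4.0, 0.0]]

s_pooled = pooled_cov(covs, ns)
s_inv = inverse_2x2(s_pooled)
scores_a = [linear_score([1.0, 0.0], m, s_inv, 0.5) for m in means]
scores_b = [linear_score([3.0, 0.0], m, s_inv, 0.5) for m in means]
```

With equal priors the two scores tie at the midpoint between the two mean vectors, so the point (1, 0) goes to the first population and (3, 0) to the second.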

Example 11.11. Classifying a potential business-school graduate student

Introduction: The admissions officer of a business school has used an “index” of undergraduate grade point average (GPA) and graduate management aptitude test (GMAT) scores to help decide which applicants should be admitted to the school’s graduate programs.

Analysis:

Populations: Pop1: admit; Pop2: do not admit; Pop3: borderline

Variables: x1: GPA; x2: GMAT

Question: Allocate a new applicant with $x_0' = (3.21, 497)$ using the sample discriminant scores.

Resolution

1) Calculate the sample mean vector for each population.

2) Calculate the pooled covariance matrix.

3) Calculate the sample squared distances $D_i^2(x)$ using the sample squared distance function

$$D_i^2(x) = (x-\bar{x}_i)'\,S_{pooled}^{-1}(x-\bar{x}_i)$$

Where i = 1, 2, 3

4) Results: $D_1^2(x_0) = 2.58$; $D_2^2(x_0) = 17.10$; $D_3^2(x_0) = 2.47$

By the rule of assigning x to the “closest” population, which applies when the prior probabilities are equal, the new applicant should be assigned to population 3, borderline.
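The distance calculation for this example can be sketched as below. The slides do not list the sample mean vectors or the pooled covariance matrix, so the values used here are assumptions: rounded summary statistics recalled from the textbook's version of this example, which should be checked against the original data before being relied on. The function name is hypothetical.

```python
def squared_distance(x, mean, s):
    """D_i^2(x) = (x - xbar_i)' S^-1 (x - xbar_i) for two variables."""
    (s11, s12), (s21, s22) = s
    det = s11 * s22 - s12 * s21
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form using the explicit inverse [[s22, -s12], [-s21, s11]] / det.
    return (s22 * d0 * d0 - (s12 + s21) * d0 * d1 + s11 * d1 * d1) / det


x0 = [3.21, 497.0]                      # new applicant (GPA, GMAT), from the slides

# ASSUMED summary statistics (not given in the slides; rounded values).
means = [[3.40, 561.23],                # Pop1: admit
         [2.48, 447.07],                # Pop2: do not admit
         [2.99, 446.23]]                # Pop3: borderline
s_pooled = [[0.0361, -2.0188],
            [-2.0188, 3655.9011]]

distances = [squared_distance(x0, m, s_pooled) for m in means]
closest = min(range(3), key=lambda i: distances[i]) + 1   # 1-based population
```

Under these assumed inputs the three distances agree with the reported results up to rounding, and the smallest distance is to population 3.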
