Bayesian Decision Theory

Shyh-Kang Jeng
Department of Electrical Engineering / Graduate Institute of Communication / Graduate Institute of Networking and Multimedia, National Taiwan University

Upload: reynard-singleton

Post on 13-Dec-2015



Basic Assumptions

- The decision problem is posed in probabilistic terms
- All of the relevant probability values are known

State of Nature

- State of nature: $\omega = \omega_1$ (sea bass) or $\omega = \omega_2$ (salmon)
- A priori probability (prior)
  - $P(\omega_1)$: the next fish is sea bass
  - $P(\omega_2)$: the next fish is salmon
- Decision rule to judge just one fish: decide $\omega_1$ if $P(\omega_1) > P(\omega_2)$; otherwise decide $\omega_2$

Class-Conditional Probability Density

[Figure]

Bayes Formula

$P(\omega_j|x) = \dfrac{p(x|\omega_j)\,P(\omega_j)}{p(x)}$, $\quad p(x) = \sum_{j=1}^{2} p(x|\omega_j)\,P(\omega_j)$

$\text{posterior} = \dfrac{\text{likelihood} \times \text{prior}}{\text{evidence}}$

Posterior Probabilities

$P(\omega_1) = 2/3$, $P(\omega_2) = 1/3$

[Figure]

Bayes Decision Rule

- Probability of error
  - $P(error|x) = P(\omega_1|x)$ if we decide $\omega_2$; $P(error|x) = P(\omega_2|x)$ if we decide $\omega_1$
  - $P(error) = \int P(error, x)\,dx = \int P(error|x)\,p(x)\,dx$
- Bayes decision rule
  - Decide $\omega_1$ if $P(\omega_1|x) > P(\omega_2|x)$; otherwise decide $\omega_2$
  - Or, decide $\omega_1$ if $p(x|\omega_1)P(\omega_1) > p(x|\omega_2)P(\omega_2)$; otherwise decide $\omega_2$
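The two-category rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the likelihood values at the observed $x$ are assumptions made for the example, while the priors are the 2/3 and 1/3 used in the posterior-probability slide.

```python
def posteriors(likelihoods, priors):
    """Return P(w_j | x) for every class from p(x | w_j) and P(w_j)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)  # p(x), the normalizing term in Bayes formula
    return [j / evidence for j in joint]


def bayes_decide(likelihoods, priors):
    """Return the 0-based index of the class with the largest posterior."""
    post = posteriors(likelihoods, priors)
    return max(range(len(post)), key=post.__getitem__)


priors = [2 / 3, 1 / 3]      # P(w1), P(w2) as on the slide
likelihoods = [0.1, 0.4]     # hypothetical p(x | w1), p(x | w2) at one x
post = posteriors(likelihoods, priors)
decision = bayes_decide(likelihoods, priors)   # index 1 means w2
p_error_given_x = 1 - max(post)                # P(error | x) under the Bayes rule
```

Even though $\omega_1$ has the larger prior, the likelihoods assumed here favour $\omega_2$ strongly enough to reverse the decision.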

Bayes Decision Theory (1/3)

- Categories: $\{\omega_1, \ldots, \omega_c\}$
- Actions: $\{\alpha_1, \ldots, \alpha_a\}$
- Loss functions: $\lambda(\alpha_i|\omega_j)$
- Feature vector: $d$-component vector $\mathbf{x}$

Bayes Decision Theory (2/3)

- Bayes formula: $P(\omega_j|\mathbf{x}) = \dfrac{p(\mathbf{x}|\omega_j)\,P(\omega_j)}{p(\mathbf{x})}$, $\quad p(\mathbf{x}) = \sum_{j=1}^{c} p(\mathbf{x}|\omega_j)\,P(\omega_j)$
- Conditional risk: $R(\alpha_i|\mathbf{x}) = \sum_{j=1}^{c} \lambda(\alpha_i|\omega_j)\,P(\omega_j|\mathbf{x})$

Bayes Decision Theory (3/3)

- Decision function $\alpha(\mathbf{x})$ assumes one of the values $\alpha_1, \ldots, \alpha_a$
- Overall risk: $R = \int R(\alpha(\mathbf{x})|\mathbf{x})\,p(\mathbf{x})\,d\mathbf{x}$
- Bayes decision rule: compute the conditional risk $R(\alpha_i|\mathbf{x}) = \sum_{j=1}^{c} \lambda(\alpha_i|\omega_j)\,P(\omega_j|\mathbf{x})$, $i = 1, \ldots, a$, then select the action $\alpha_i$ for which $R(\alpha_i|\mathbf{x})$ is minimum
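A minimal sketch of the minimum-risk rule follows. The loss matrix (two classes plus a hypothetical "reject" action) and the posterior values are assumptions chosen for illustration, not taken from the slides.

```python
def conditional_risks(loss, post):
    """R(alpha_i | x) = sum_j loss[i][j] * P(w_j | x), one value per action."""
    return [sum(l_ij * p_j for l_ij, p_j in zip(row, post)) for row in loss]


def bayes_action(loss, post):
    """Return (index of the minimum-risk action, list of all conditional risks)."""
    risks = conditional_risks(loss, post)
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


# Hypothetical loss matrix: rows are actions, columns are true classes.
loss = [
    [0.0, 2.0],   # decide w1
    [1.0, 0.0],   # decide w2
    [0.3, 0.3],   # reject (assumed extra action with a flat cost)
]
post = [0.6, 0.4]  # assumed posteriors P(w1 | x), P(w2 | x)
best, risks = bayes_action(loss, post)
```

With these numbers the reject action has the smallest conditional risk, which shows that the minimum-risk action need not be the most probable class.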

Two-Category Classification

- Conditional risk, with $\lambda_{ij} = \lambda(\alpha_i|\omega_j)$
  - $R(\alpha_1|\mathbf{x}) = \lambda_{11} P(\omega_1|\mathbf{x}) + \lambda_{12} P(\omega_2|\mathbf{x})$
  - $R(\alpha_2|\mathbf{x}) = \lambda_{21} P(\omega_1|\mathbf{x}) + \lambda_{22} P(\omega_2|\mathbf{x})$
- Decision rule: decide $\omega_1$ if $(\lambda_{21} - \lambda_{11})\,p(\mathbf{x}|\omega_1)\,P(\omega_1) > (\lambda_{12} - \lambda_{22})\,p(\mathbf{x}|\omega_2)\,P(\omega_2)$
- Likelihood ratio: decide $\omega_1$ if $\dfrac{p(\mathbf{x}|\omega_1)}{p(\mathbf{x}|\omega_2)} > \dfrac{\lambda_{12} - \lambda_{22}}{\lambda_{21} - \lambda_{11}} \cdot \dfrac{P(\omega_2)}{P(\omega_1)}$
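The likelihood-ratio form of the rule can be sketched as follows. The function names are illustrative, and the sketch assumes $\lambda_{21} > \lambda_{11}$ (an error costs more than a correct decision) so that dividing by $\lambda_{21} - \lambda_{11}$ keeps the direction of the inequality.

```python
def likelihood_ratio_threshold(loss, priors):
    """Threshold on p(x|w1)/p(x|w2); loss is [[l11, l12], [l21, l22]]."""
    (l11, l12), (l21, l22) = loss
    return (l12 - l22) / (l21 - l11) * priors[1] / priors[0]


def decide(p_x_w1, p_x_w2, loss, priors):
    """Return 1 or 2, the class chosen by the likelihood-ratio test."""
    ratio = p_x_w1 / p_x_w2
    return 1 if ratio > likelihood_ratio_threshold(loss, priors) else 2


zero_one = [[0.0, 1.0], [1.0, 0.0]]   # symmetrical loss
priors = [2 / 3, 1 / 3]
threshold = likelihood_ratio_threshold(zero_one, priors)
```

Under zero-one loss the threshold reduces to $P(\omega_2)/P(\omega_1)$, so the test is the same as comparing posteriors.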

Minimum-Error-Rate Classification

- If action $\alpha_i$ is taken and the true state is $\omega_j$, then the decision is correct if $i = j$ and in error if $i \ne j$
- Error rate (the probability of error) is to be minimized
- Symmetrical or zero-one loss function: $\lambda(\alpha_i|\omega_j) = 0$ if $i = j$, and $1$ if $i \ne j$, for $i, j = 1, \ldots, c$
- Conditional risk: $R(\alpha_i|\mathbf{x}) = \sum_{j=1}^{c} \lambda(\alpha_i|\omega_j)\,P(\omega_j|\mathbf{x}) = \sum_{j \ne i} P(\omega_j|\mathbf{x}) = 1 - P(\omega_i|\mathbf{x})$

Minimum-Error-Rate Classification

[Figure]

Mini-max Criterion

- To perform well over a range of prior probabilities
- Minimize the maximum possible overall risk
  - So that the worst risk for any value of the priors is as small as possible

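Mini-maximizing Risk

$R = \int_{\mathcal{R}_1} [\lambda_{11} P(\omega_1)\,p(\mathbf{x}|\omega_1) + \lambda_{12} P(\omega_2)\,p(\mathbf{x}|\omega_2)]\,d\mathbf{x} + \int_{\mathcal{R}_2} [\lambda_{21} P(\omega_1)\,p(\mathbf{x}|\omega_1) + \lambda_{22} P(\omega_2)\,p(\mathbf{x}|\omega_2)]\,d\mathbf{x}$

$\quad = \lambda_{22} + (\lambda_{12} - \lambda_{22}) \int_{\mathcal{R}_1} p(\mathbf{x}|\omega_2)\,d\mathbf{x} + P(\omega_1) \left[ (\lambda_{11} - \lambda_{22}) + (\lambda_{21} - \lambda_{11}) \int_{\mathcal{R}_2} p(\mathbf{x}|\omega_1)\,d\mathbf{x} - (\lambda_{12} - \lambda_{22}) \int_{\mathcal{R}_1} p(\mathbf{x}|\omega_2)\,d\mathbf{x} \right]$

The second form uses $P(\omega_2) = 1 - P(\omega_1)$ and $\int_{\mathcal{R}_1} p(\mathbf{x}|\omega_1)\,d\mathbf{x} = 1 - \int_{\mathcal{R}_2} p(\mathbf{x}|\omega_1)\,d\mathbf{x}$. When the bracketed coefficient of $P(\omega_1)$ is zero, the risk no longer depends on the priors:

$R_{mm} = \lambda_{22} + (\lambda_{12} - \lambda_{22}) \int_{\mathcal{R}_1} p(\mathbf{x}|\omega_2)\,d\mathbf{x}$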

Searching for Mini-max Boundary

[Figure]

Neyman-Pearson Criterion

- Minimize the overall risk subject to a constraint
- Example
  - Minimize the total risk subject to $\int R(\alpha_i|\mathbf{x})\,d\mathbf{x} < \text{constant}$

Discriminant Functions

- A classifier assigns $\mathbf{x}$ to class $\omega_i$ if $g_i(\mathbf{x}) > g_j(\mathbf{x})$ for all $j \ne i$, where the $g_i(\mathbf{x})$ are called discriminant functions
- A discriminant function for a Bayes classifier: $g_i(\mathbf{x}) = -R(\alpha_i|\mathbf{x})$
- Two discriminant functions for minimum-error-rate classification
  - $g_i(\mathbf{x}) = \dfrac{p(\mathbf{x}|\omega_i)\,P(\omega_i)}{\sum_{j=1}^{c} p(\mathbf{x}|\omega_j)\,P(\omega_j)}$
  - $g_i(\mathbf{x}) = \ln p(\mathbf{x}|\omega_i) + \ln P(\omega_i)$

Discriminant Functions

[Figure]

Two-Dimensional Two-Category Classifier

[Figure]

Dichotomizers

- Place a pattern in one of only two categories
  - cf. Polychotomizers
- More common to define a single discriminant function: $g(\mathbf{x}) = g_1(\mathbf{x}) - g_2(\mathbf{x})$
- Some particular forms
  - $g(\mathbf{x}) = P(\omega_1|\mathbf{x}) - P(\omega_2|\mathbf{x})$
  - $g(\mathbf{x}) = \ln \dfrac{p(\mathbf{x}|\omega_1)}{p(\mathbf{x}|\omega_2)} + \ln \dfrac{P(\omega_1)}{P(\omega_2)}$

Univariate Normal PDF

$p(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\dfrac{1}{2} \left( \dfrac{x - \mu}{\sigma} \right)^2 \right]$, $\quad p(x) \sim N(\mu, \sigma^2)$
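As a quick illustration, the density can be evaluated directly with the standard library. The function name `normal_pdf` is an assumption of this sketch.

```python
import math


def normal_pdf(x, mu, sigma):
    """Univariate normal density N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)


peak = normal_pdf(0.0, 0.0, 1.0)  # height of the standard normal at its mean
```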

Distribution with Maximum Entropy and Central Limit Theorem

- Entropy for discrete distribution: $H = -\sum_{i=1}^{m} P_i \log_2 P_i$ (bits)
- Entropy for continuous distribution: $H(p(x)) = -\int p(x) \ln p(x)\,dx$ (nats)
- Central limit theorem
  - The aggregate effect of the sum of a large number of small, independent random disturbances will lead to a Gaussian distribution
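The sketch below evaluates the discrete entropy in bits and compares two closed-form differential entropies. The closed forms ($\tfrac{1}{2}\ln(2\pi e \sigma^2)$ for a Gaussian, $\ln w$ for a uniform density of width $w$) are standard results that are not derived on the slide, and the function names are illustrative.

```python
import math


def entropy_bits(probs):
    """Discrete entropy H = -sum_i P_i log2 P_i, skipping zero-probability terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def gaussian_entropy_nats(sigma):
    """Differential entropy of N(mu, sigma^2): 0.5 * ln(2 * pi * e * sigma^2)."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)


def uniform_entropy_nats(sigma):
    """Differential entropy of a uniform density with the same variance sigma^2."""
    width = sigma * math.sqrt(12)  # a uniform of width w has variance w^2 / 12
    return math.log(width)


h_uniform4 = entropy_bits([0.25] * 4)        # four equally likely outcomes
h_skewed = entropy_bits([0.5, 0.25, 0.25])
```

For a fixed variance the Gaussian value is larger than the uniform one, in line with the maximum-entropy property named in the slide title.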

Multivariate Normal PDF

$p(\mathbf{x}) = \dfrac{1}{(2\pi)^{d/2}\,|\boldsymbol{\Sigma}|^{1/2}} \exp\left[ -\dfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right]$, $\quad p(\mathbf{x}) \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$

- $\boldsymbol{\mu} = E[\mathbf{x}]$: $d$-component mean vector
- $\boldsymbol{\Sigma} = E[(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})^T]$: $d$-by-$d$ covariance matrix

Linear Combination of Gaussian Random Variables

$p(\mathbf{x}) \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma}),\ \mathbf{y} = \mathbf{A}^t \mathbf{x} \ \Rightarrow\ p(\mathbf{y}) \sim N(\mathbf{A}^t \boldsymbol{\mu}, \mathbf{A}^t \boldsymbol{\Sigma} \mathbf{A})$

Whitening Transform

- $\boldsymbol{\Phi}$: matrix whose columns are the orthonormal eigenvectors of $\boldsymbol{\Sigma}$
- $\boldsymbol{\Lambda}$: diagonal matrix of the corresponding eigenvalues
- Whitening transform: $\mathbf{A}_w = \boldsymbol{\Phi} \boldsymbol{\Lambda}^{-1/2}$, $\quad \mathbf{A}_w^t \boldsymbol{\Sigma} \mathbf{A}_w = \mathbf{I}$
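To make the whitening transform concrete, here is a small standard-library sketch restricted to symmetric, positive-definite 2-by-2 covariance matrices, where the eigen-decomposition has a closed form. The helper names and the example covariance are assumptions of the sketch.

```python
import math


def eig_sym2(s):
    """Eigenvalues and orthonormal eigenvectors of a symmetric 2x2 matrix."""
    (a, b), (_, d) = s
    mean, half_diff = (a + d) / 2, (a - d) / 2
    root = math.hypot(half_diff, b)
    eigvals = [mean + root, mean - root]
    if b == 0:
        # Already diagonal: eigenvectors are the coordinate axes.
        vecs = [(1.0, 0.0), (0.0, 1.0)] if a >= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        vecs = []
        for lam in eigvals:
            v = (lam - d, b)  # solves (S - lam I) v = 0 when b != 0
            n = math.hypot(v[0], v[1])
            vecs.append((v[0] / n, v[1] / n))
    return eigvals, vecs


def whitening_matrix(s):
    """A_w = Phi Lambda^(-1/2): column k is eigenvector k over sqrt(eigenvalue k)."""
    eigvals, vecs = eig_sym2(s)
    return [[vecs[k][i] / math.sqrt(eigvals[k]) for k in range(2)] for i in range(2)]


def congruence(a, s):
    """Return A^t S A for 2x2 matrices given as nested lists."""
    sa = [[sum(s[i][m] * a[m][j] for m in range(2)) for j in range(2)] for i in range(2)]
    return [[sum(a[m][i] * sa[m][j] for m in range(2)) for j in range(2)] for i in range(2)]


sigma = [[2.0, 1.0], [1.0, 2.0]]        # hypothetical covariance matrix
a_w = whitening_matrix(sigma)
whitened_cov = congruence(a_w, sigma)   # should be close to the identity
```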

Bivariate Gaussian PDF

[Figure]

Mahalanobis Distance

- Squared Mahalanobis distance: $r^2 = (\mathbf{x} - \boldsymbol{\mu})^t \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})$
- Volume of the hyperellipsoids of constant Mahalanobis distance $r$: $V = V_d\,|\boldsymbol{\Sigma}|^{1/2}\,r^d$, where
  - $V_d = \pi^{d/2} / (d/2)!$ for $d$ even
  - $V_d = 2^d\,\pi^{(d-1)/2} \left( \frac{d-1}{2} \right)! \,/\, d!$ for $d$ odd
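Both formulas can be checked numerically. The sketch below handles the two-dimensional case for the distance (using the explicit 2-by-2 inverse) and any dimension for the volume; all function names are illustrative assumptions.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mahalanobis_sq(x, mu, sigma):
    """Squared Mahalanobis distance (x - mu)^t Sigma^-1 (x - mu) in two dimensions."""
    s_inv = inv2(sigma)
    dx = [x[0] - mu[0], x[1] - mu[1]]
    return sum(dx[i] * s_inv[i][j] * dx[j] for i in range(2) for j in range(2))


def unit_ball_volume(d):
    """V_d, the volume of the unit hypersphere in d dimensions."""
    if d % 2 == 0:
        return math.pi ** (d / 2) / math.factorial(d // 2)
    return (2 ** d * math.pi ** ((d - 1) / 2)
            * math.factorial((d - 1) // 2) / math.factorial(d))


def hyperellipsoid_volume(det_sigma, r, d):
    """V = V_d |Sigma|^(1/2) r^d."""
    return unit_ball_volume(d) * math.sqrt(det_sigma) * r ** d


r2 = mahalanobis_sq((1.0, 2.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 4.0]])
```

For $d = 2$ and $d = 3$ the unit volumes reduce to the familiar $\pi$ and $4\pi/3$.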

Discriminant Functions for Normal Density

- For minimum-error-rate classification: $g_i(\mathbf{x}) = \ln p(\mathbf{x}|\omega_i) + \ln P(\omega_i)$
- For normal density: $g_i(\mathbf{x}) = -\dfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^t \boldsymbol{\Sigma}_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) - \dfrac{d}{2} \ln 2\pi - \dfrac{1}{2} \ln |\boldsymbol{\Sigma}_i| + \ln P(\omega_i)$

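Case 1: $\boldsymbol{\Sigma}_i = \sigma^2 \mathbf{I}$

$g_i(\mathbf{x}) = -\dfrac{\|\mathbf{x} - \boldsymbol{\mu}_i\|^2}{2\sigma^2} + \ln P(\omega_i)$, $\quad \|\mathbf{x} - \boldsymbol{\mu}_i\|^2 = (\mathbf{x} - \boldsymbol{\mu}_i)^t (\mathbf{x} - \boldsymbol{\mu}_i)$

$g_i(\mathbf{x}) = -\dfrac{1}{2\sigma^2} \left[ \mathbf{x}^t \mathbf{x} - 2 \boldsymbol{\mu}_i^t \mathbf{x} + \boldsymbol{\mu}_i^t \boldsymbol{\mu}_i \right] + \ln P(\omega_i)$

$g_i(\mathbf{x}) = \mathbf{w}_i^t \mathbf{x} + w_{i0}$, $\quad \mathbf{w}_i = \dfrac{1}{\sigma^2} \boldsymbol{\mu}_i$, $\quad w_{i0} = -\dfrac{1}{2\sigma^2} \boldsymbol{\mu}_i^t \boldsymbol{\mu}_i + \ln P(\omega_i)$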

Decision Boundaries

$g_i(\mathbf{x}) = g_j(\mathbf{x}) \ \Rightarrow\ \mathbf{w}^t (\mathbf{x} - \mathbf{x}_0) = 0$

$\mathbf{w} = \boldsymbol{\mu}_i - \boldsymbol{\mu}_j$

$\mathbf{x}_0 = \dfrac{1}{2} (\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \dfrac{\sigma^2}{\|\boldsymbol{\mu}_i - \boldsymbol{\mu}_j\|^2} \ln \dfrac{P(\omega_i)}{P(\omega_j)}\, (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$
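The linear discriminant for $\boldsymbol{\Sigma}_i = \sigma^2 \mathbf{I}$ and the boundary point $\mathbf{x}_0$ can be sketched as below. The means, variance, and priors are hypothetical values chosen so that the effect of unequal priors is visible.

```python
import math


def linear_discriminant(mu, sigma2, prior):
    """Return (w_i, w_i0) for the case Sigma_i = sigma^2 I."""
    w = [m / sigma2 for m in mu]
    w0 = -sum(m * m for m in mu) / (2 * sigma2) + math.log(prior)
    return w, w0


def g(x, w, w0):
    """Evaluate g_i(x) = w_i^t x + w_i0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0


def boundary_point(mu_i, mu_j, sigma2, p_i, p_j):
    """x_0, the point where the boundary crosses the line joining the two means."""
    diff = [a - b for a, b in zip(mu_i, mu_j)]
    norm2 = sum(d * d for d in diff)
    shift = sigma2 / norm2 * math.log(p_i / p_j)
    return [0.5 * (a + b) - shift * d for a, b, d in zip(mu_i, mu_j, diff)]


# Hypothetical two-class problem with unequal priors.
mu_i, mu_j, sigma2 = (0.0, 0.0), (4.0, 0.0), 1.0
p_i, p_j = 0.7, 0.3
x0 = boundary_point(mu_i, mu_j, sigma2, p_i, p_j)
g_i = g(x0, *linear_discriminant(mu_i, sigma2, p_i))
g_j = g(x0, *linear_discriminant(mu_j, sigma2, p_j))
```

Here `x0` lies past the midpoint (2, 0) on the side of the less probable class, and the two discriminants agree there.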

Decision Boundaries when $P(\omega_i) = P(\omega_j)$

[Figure]

Decision Boundaries when $P(\omega_i)$ and $P(\omega_j)$ are unequal

[Figure]

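Case 2: $\boldsymbol{\Sigma}_i = \boldsymbol{\Sigma}$

$g_i(\mathbf{x}) = -\dfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^t \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) + \ln P(\omega_i)$

$g_i(\mathbf{x}) = -\dfrac{1}{2} \left[ \mathbf{x}^t \boldsymbol{\Sigma}^{-1} \mathbf{x} - 2 \boldsymbol{\mu}_i^t \boldsymbol{\Sigma}^{-1} \mathbf{x} + \boldsymbol{\mu}_i^t \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_i \right] + \ln P(\omega_i)$

$g_i(\mathbf{x}) = \mathbf{w}_i^t \mathbf{x} + w_{i0}$, $\quad \mathbf{w}_i = \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_i$, $\quad w_{i0} = -\dfrac{1}{2} \boldsymbol{\mu}_i^t \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_i + \ln P(\omega_i)$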

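Decision Boundaries

$g_i(\mathbf{x}) = g_j(\mathbf{x}) \ \Rightarrow\ \mathbf{w}^t (\mathbf{x} - \mathbf{x}_0) = 0$

$\mathbf{w} = \boldsymbol{\Sigma}^{-1} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$

$\mathbf{x}_0 = \dfrac{1}{2} (\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \dfrac{\ln [P(\omega_i) / P(\omega_j)]}{(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^t \boldsymbol{\Sigma}^{-1} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)}\, (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$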

Decision Boundaries

[Figure]

Case 3: $\boldsymbol{\Sigma}_i$ = arbitrary

$g_i(\mathbf{x}) = \mathbf{x}^t \mathbf{W}_i \mathbf{x} + \mathbf{w}_i^t \mathbf{x} + w_{i0}$

$\mathbf{W}_i = -\dfrac{1}{2} \boldsymbol{\Sigma}_i^{-1}$, $\quad \mathbf{w}_i = \boldsymbol{\Sigma}_i^{-1} \boldsymbol{\mu}_i$

$w_{i0} = -\dfrac{1}{2} \boldsymbol{\mu}_i^t \boldsymbol{\Sigma}_i^{-1} \boldsymbol{\mu}_i - \dfrac{1}{2} \ln |\boldsymbol{\Sigma}_i| + \ln P(\omega_i)$

Decision Boundaries for One-Dimensional Case

[Figure]

Decision Boundaries for Two-Dimensional Case

[Figure]

Decision Boundaries for Three-Dimensional Case (1/2)

[Figure]

Decision Boundaries for Three-Dimensional Case (2/2)

[Figure]

Decision Boundaries for Four Normal Distributions

[Figure]

Example: Decision Regions for Two-Dimensional Gaussian Data

[Figure]

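Example: Decision Regions for Two-Dimensional Gaussian Data

$\boldsymbol{\mu}_1 = \begin{pmatrix} 3 \\ 6 \end{pmatrix}$, $\ \boldsymbol{\Sigma}_1 = \begin{pmatrix} 1/2 & 0 \\ 0 & 2 \end{pmatrix}$, $\ \boldsymbol{\mu}_2 = \begin{pmatrix} 3 \\ -2 \end{pmatrix}$, $\ \boldsymbol{\Sigma}_2 = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$

$\boldsymbol{\Sigma}_1^{-1} = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix}$, $\ \boldsymbol{\Sigma}_2^{-1} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}$

$P(\omega_1) = P(\omega_2) = 0.5$

Decision boundary: $x_2 = 3.514 - 1.125\,x_1 + 0.1875\,x_1^2$, not passing through $\begin{pmatrix} 3 \\ 2 \end{pmatrix}$, the midpoint of the two means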
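The boundary quoted in this example can be checked numerically with the quadratic discriminant for normal densities. The sketch assumes the second mean is $(3, -2)^t$, which is the value that reproduces the quoted coefficients, and it exploits the fact that both covariance matrices are diagonal; the function names are illustrative.

```python
import math


def g_diag(x, mu, var, prior):
    """Quadratic discriminant for a Gaussian with covariance diag(var).

    The -(d/2) ln(2 pi) term is omitted because it is the same for every class.
    """
    quad = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, var))
    log_det = sum(math.log(v) for v in var)
    return -0.5 * quad - 0.5 * log_det + math.log(prior)


def boundary_x2(x1):
    """Decision boundary quoted on the slide."""
    return 3.514 - 1.125 * x1 + 0.1875 * x1 ** 2


mu1, var1 = (3.0, 6.0), (0.5, 2.0)
mu2, var2 = (3.0, -2.0), (2.0, 2.0)   # minus sign assumed, see lead-in


def g_diff(x):
    """g_1(x) - g_2(x); zero on the decision boundary."""
    return g_diag(x, mu1, var1, 0.5) - g_diag(x, mu2, var2, 0.5)


residuals = [g_diff((x1, boundary_x2(x1))) for x1 in (-2.0, 0.0, 3.0, 5.0)]
at_midpoint = g_diff((3.0, 2.0))   # positive, so the midpoint belongs to w1
```

The residuals are zero up to the rounding of the printed coefficients, while the discriminant difference at the midpoint is clearly positive.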

Bayes Decision Compared with Other Decision Strategies

$P(error) = P(\mathbf{x} \in \mathcal{R}_2, \omega_1) + P(\mathbf{x} \in \mathcal{R}_1, \omega_2)$

$\quad = \int_{\mathcal{R}_2} p(\mathbf{x}|\omega_1)\,P(\omega_1)\,d\mathbf{x} + \int_{\mathcal{R}_1} p(\mathbf{x}|\omega_2)\,P(\omega_2)\,d\mathbf{x}$

Multicategory Case

- Probability of being correct: $P(correct) = \sum_{i=1}^{c} \int_{\mathcal{R}_i} p(\mathbf{x}|\omega_i)\,P(\omega_i)\,d\mathbf{x}$
- Bayes classifier maximizes this probability by choosing the regions so that the integrand is maximal for all $\mathbf{x}$
  - No other partitioning can yield a smaller probability of error

Error Bounds for Normal Densities

- Full calculation of the error probability is difficult for the Gaussian case
  - Especially in high dimensions
  - Discontinuous nature of the decision regions
- Upper bound on the error can be obtained for the two-category case
  - By approximating the error integral analytically

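Chernoff Bound

$\min[a, b] \le a^{\beta} b^{1-\beta}$, for $a, b \ge 0$ and $0 \le \beta \le 1$

$P(error|\mathbf{x}) = \min[P(\omega_1|\mathbf{x}), P(\omega_2|\mathbf{x})]$

$P(\omega_j|\mathbf{x}) = \dfrac{p(\mathbf{x}|\omega_j)\,P(\omega_j)}{p(\mathbf{x})}$, $\quad p(\mathbf{x}) = \sum_{j=1}^{2} p(\mathbf{x}|\omega_j)\,P(\omega_j)$

$P(error) \le P^{\beta}(\omega_1)\,P^{1-\beta}(\omega_2) \int p^{\beta}(\mathbf{x}|\omega_1)\,p^{1-\beta}(\mathbf{x}|\omega_2)\,d\mathbf{x}$

For normal densities, $\int p^{\beta}(\mathbf{x}|\omega_1)\,p^{1-\beta}(\mathbf{x}|\omega_2)\,d\mathbf{x} = e^{-k(\beta)}$, with

$k(\beta) = \dfrac{\beta(1-\beta)}{2} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1)^t \left[ (1-\beta)\boldsymbol{\Sigma}_1 + \beta\boldsymbol{\Sigma}_2 \right]^{-1} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1) + \dfrac{1}{2} \ln \dfrac{|(1-\beta)\boldsymbol{\Sigma}_1 + \beta\boldsymbol{\Sigma}_2|}{|\boldsymbol{\Sigma}_1|^{1-\beta}\,|\boldsymbol{\Sigma}_2|^{\beta}}$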

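Bhattacharyya Bound

Set $\beta = 1/2$:

$P(error) \le \sqrt{P(\omega_1)P(\omega_2)} \int \sqrt{p(\mathbf{x}|\omega_1)\,p(\mathbf{x}|\omega_2)}\,d\mathbf{x} = \sqrt{P(\omega_1)P(\omega_2)}\; e^{-k(1/2)}$

$k(1/2) = \dfrac{1}{8} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1)^t \left[ \dfrac{\boldsymbol{\Sigma}_1 + \boldsymbol{\Sigma}_2}{2} \right]^{-1} (\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1) + \dfrac{1}{2} \ln \dfrac{\left| \frac{\boldsymbol{\Sigma}_1 + \boldsymbol{\Sigma}_2}{2} \right|}{\sqrt{|\boldsymbol{\Sigma}_1|\,|\boldsymbol{\Sigma}_2|}}$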

Chernoff Bound and Bhattacharyya Bound

[Figure]

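Example: Error Bounds for Gaussian Distribution

$\boldsymbol{\mu}_1 = \begin{pmatrix} 3 \\ 6 \end{pmatrix}$, $\ \boldsymbol{\Sigma}_1 = \begin{pmatrix} 1/2 & 0 \\ 0 & 2 \end{pmatrix}$, $\ \boldsymbol{\mu}_2 = \begin{pmatrix} 3 \\ -2 \end{pmatrix}$, $\ \boldsymbol{\Sigma}_2 = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$

$\boldsymbol{\Sigma}_1^{-1} = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix}$, $\ \boldsymbol{\Sigma}_2^{-1} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}$

$P(\omega_1) = P(\omega_2) = 0.5$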

Example: Error Bounds for Gaussian Distribution

- Bhattacharyya bound
  - $k(1/2) = 4.11157$
  - $P(error) \le \sqrt{P(\omega_1)P(\omega_2)}\; e^{-k(1/2)} = 0.008191$
- Chernoff bound
  - 0.008190 by numerical searching
- Error rate by numerical integration
  - 0.0021
  - Impractical for higher dimension
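The exponent $k(\beta)$ for this example can be evaluated with a few lines of Python. This is a sketch under stated assumptions: it takes the second mean to be $(3, -2)^t$, uses the diagonal structure of both covariance matrices, writes $k(\beta)$ in the form that matches the integrand $p^{\beta}(\mathbf{x}|\omega_1)\,p^{1-\beta}(\mathbf{x}|\omega_2)$, and replaces the slide's "numerical searching" with a plain grid search.

```python
import math


def k_beta(beta, dmu, var1, var2):
    """Chernoff exponent k(beta) for two Gaussians with diagonal covariances."""
    mix = [(1 - beta) * a + beta * b for a, b in zip(var1, var2)]
    quad = sum(d * d / m for d, m in zip(dmu, mix))
    log_term = sum(math.log(m) - (1 - beta) * math.log(a) - beta * math.log(b)
                   for m, a, b in zip(mix, var1, var2))
    return 0.5 * beta * (1 - beta) * quad + 0.5 * log_term


def error_bound(beta, p1, p2, dmu, var1, var2):
    """P(w1)^beta P(w2)^(1-beta) exp(-k(beta)), an upper bound on P(error)."""
    return p1 ** beta * p2 ** (1 - beta) * math.exp(-k_beta(beta, dmu, var1, var2))


dmu = (0.0, -8.0)                  # mu_2 - mu_1
var1, var2 = (0.5, 2.0), (2.0, 2.0)

k_half = k_beta(0.5, dmu, var1, var2)
bhattacharyya = error_bound(0.5, 0.5, 0.5, dmu, var1, var2)
# Grid search over beta in (0, 1) for the tightest (Chernoff) bound.
chernoff = min(error_bound(i / 1000, 0.5, 0.5, dmu, var1, var2)
               for i in range(1, 1000))
```

The Chernoff value is only marginally below the Bhattacharyya value here, so the simpler $\beta = 1/2$ bound loses very little in this example.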

Signal Detection Theory

- Internal signal in the detector: $x$
  - Has mean $\mu_2$ when external signal (pulse) is present
  - Has mean $\mu_1$ when external signal is not present
  - $p(x|\omega_i) \sim N(\mu_i, \sigma^2)$

Signal Detection Theory

Discriminability: $d' = \dfrac{|\mu_2 - \mu_1|}{\sigma}$

Four Probabilities

- Hit: $P(x > x^* \mid x \in \omega_2)$
- False alarm: $P(x > x^* \mid x \in \omega_1)$
- Miss: $P(x < x^* \mid x \in \omega_2)$
- Correct reject: $P(x < x^* \mid x \in \omega_1)$
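For the Gaussian signal model these four probabilities follow from the normal cumulative distribution function. The sketch below uses `math.erf` from the standard library; the means, standard deviation, and threshold $x^*$ are hypothetical values.

```python
import math


def normal_cdf(x, mu, sigma):
    """P(X < x) for X ~ N(mu, sigma^2)."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


def detection_rates(x_star, mu1, mu2, sigma):
    """Hit, false-alarm, miss, and correct-reject probabilities at threshold x_star."""
    hit = 1 - normal_cdf(x_star, mu2, sigma)
    false_alarm = 1 - normal_cdf(x_star, mu1, sigma)
    return {
        "hit": hit,
        "false_alarm": false_alarm,
        "miss": 1 - hit,
        "correct_reject": 1 - false_alarm,
    }


def discriminability(mu1, mu2, sigma):
    """d' = |mu2 - mu1| / sigma."""
    return abs(mu2 - mu1) / sigma


rates = detection_rates(x_star=1.0, mu1=0.0, mu2=2.0, sigma=1.0)
d_prime = discriminability(0.0, 2.0, 1.0)
```

Sweeping `x_star` and plotting the hit rate against the false-alarm rate traces out one ROC curve for the given $d'$.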

Receiver Operating Characteristic (ROC)

[Figure]

Bayes Decision Theory: Discrete Features

$\int p(\mathbf{x}|\omega_j)\,d\mathbf{x} \ \rightarrow\ \sum_{\mathbf{x}} P(\mathbf{x}|\omega_j)$

$P(\omega_j|\mathbf{x}) = \dfrac{P(\mathbf{x}|\omega_j)\,P(\omega_j)}{P(\mathbf{x})}$, $\quad P(\mathbf{x}) = \sum_{j=1}^{c} P(\mathbf{x}|\omega_j)\,P(\omega_j)$

Select action $\alpha^* = \arg\min_i R(\alpha_i|\mathbf{x})$

Independent Binary Features

$\mathbf{x} = (x_1, \ldots, x_d)^t$, $\quad p_i = \Pr[x_i = 1 \mid \omega_1]$, $\quad q_i = \Pr[x_i = 1 \mid \omega_2]$

Assume conditional independence:

$P(\mathbf{x}|\omega_1) = \prod_{i=1}^{d} p_i^{x_i} (1 - p_i)^{1 - x_i}$, $\quad P(\mathbf{x}|\omega_2) = \prod_{i=1}^{d} q_i^{x_i} (1 - q_i)^{1 - x_i}$

Likelihood ratio:

$\dfrac{P(\mathbf{x}|\omega_1)}{P(\mathbf{x}|\omega_2)} = \prod_{i=1}^{d} \left( \dfrac{p_i}{q_i} \right)^{x_i} \left( \dfrac{1 - p_i}{1 - q_i} \right)^{1 - x_i}$

Discriminant Function

$g(\mathbf{x}) = \sum_{i=1}^{d} \left[ x_i \ln \dfrac{p_i}{q_i} + (1 - x_i) \ln \dfrac{1 - p_i}{1 - q_i} \right] + \ln \dfrac{P(\omega_1)}{P(\omega_2)}$

$g(\mathbf{x}) = \sum_{i=1}^{d} w_i x_i + w_0$

$w_i = \ln \dfrac{p_i (1 - q_i)}{q_i (1 - p_i)}$, $\quad w_0 = \sum_{i=1}^{d} \ln \dfrac{1 - p_i}{1 - q_i} + \ln \dfrac{P(\omega_1)}{P(\omega_2)}$

Decide $\omega_1$ if $g(\mathbf{x}) > 0$ and $\omega_2$ if $g(\mathbf{x}) \le 0$

Example: Three-Dimensional Binary Data

$P(\omega_1) = P(\omega_2) = 0.5$, $\quad p_i = 0.8,\ q_i = 0.5,\ i = 1, 2, 3$

$w_i = \ln \dfrac{0.8\,(1 - 0.5)}{0.5\,(1 - 0.8)} = 1.3863$

$w_0 = \sum_{i=1}^{3} \ln \dfrac{1 - 0.8}{1 - 0.5} + \ln \dfrac{0.5}{0.5} = -2.75$
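The weights in this example are easy to reproduce. The sketch below computes $w_i$ and $w_0$ for independent binary features and evaluates $g(\mathbf{x})$ for two feature vectors; the function names are illustrative.

```python
import math


def binary_discriminant(p, q, prior1, prior2):
    """Return (w, w0) of the linear discriminant for independent binary features."""
    w = [math.log(pi * (1 - qi) / (qi * (1 - pi))) for pi, qi in zip(p, q)]
    w0 = (sum(math.log((1 - pi) / (1 - qi)) for pi, qi in zip(p, q))
          + math.log(prior1 / prior2))
    return w, w0


def g(x, w, w0):
    """g(x) = sum_i w_i x_i + w_0; decide w1 when the value is positive."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0


w, w0 = binary_discriminant([0.8] * 3, [0.5] * 3, 0.5, 0.5)
two_on = g([1, 1, 0], w, w0)   # two features present
one_on = g([1, 0, 0], w, w0)   # one feature present
```

With these parameters at least two of the three features must be 1 before $g(\mathbf{x})$ turns positive and $\omega_1$ is chosen.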

Example: Three-Dimensional Binary Data

$P(\omega_1) = P(\omega_2) = 0.5$, $\quad p_i = 0.8,\ q_i = 0.5,\ i = 1, 2$, $\quad p_3 = q_3 = 0.5$

$w_i = \ln \dfrac{0.8\,(1 - 0.5)}{0.5\,(1 - 0.8)} = 1.3863,\ i = 1, 2$, $\quad w_3 = 0$

$w_0 = \sum_{i=1}^{2} \ln \dfrac{1 - 0.8}{1 - 0.5} + \ln \dfrac{0.5}{0.5} = -1.83$

Illustration of Missing Features

[Figure]

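Decision with Missing Features

$\mathbf{x} = [\mathbf{x}_g, \mathbf{x}_b]$, where $\mathbf{x}_g$ are the good (observed) features and $\mathbf{x}_b$ the bad (missing) ones

$P(\omega_i|\mathbf{x}_g) = \dfrac{p(\omega_i, \mathbf{x}_g)}{p(\mathbf{x}_g)} = \dfrac{\int p(\omega_i, \mathbf{x}_g, \mathbf{x}_b)\,d\mathbf{x}_b}{p(\mathbf{x}_g)}$

$\quad = \dfrac{\int P(\omega_i|\mathbf{x}_g, \mathbf{x}_b)\,p(\mathbf{x}_g, \mathbf{x}_b)\,d\mathbf{x}_b}{p(\mathbf{x}_g)} = \dfrac{\int g_i(\mathbf{x})\,p(\mathbf{x})\,d\mathbf{x}_b}{\int p(\mathbf{x})\,d\mathbf{x}_b}$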

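Noisy Features

Noise model: $p(\mathbf{x}_b|\mathbf{x}_t)$, where $\mathbf{x}_t$ are the true values of the observed noisy features $\mathbf{x}_b$; assume that, given $\mathbf{x}_t$, $\mathbf{x}_b$ is independent of $\omega_i$ and $\mathbf{x}_g$

$P(\omega_i|\mathbf{x}_g, \mathbf{x}_b) = \dfrac{\int p(\omega_i, \mathbf{x}_g, \mathbf{x}_b, \mathbf{x}_t)\,d\mathbf{x}_t}{p(\mathbf{x}_g, \mathbf{x}_b)}$

$p(\omega_i, \mathbf{x}_g, \mathbf{x}_b, \mathbf{x}_t) = P(\omega_i|\mathbf{x}_g, \mathbf{x}_b, \mathbf{x}_t)\,p(\mathbf{x}_g, \mathbf{x}_b, \mathbf{x}_t)$

$P(\omega_i|\mathbf{x}_g, \mathbf{x}_b, \mathbf{x}_t) = P(\omega_i|\mathbf{x}_g, \mathbf{x}_t)$, $\quad p(\mathbf{x}_g, \mathbf{x}_b, \mathbf{x}_t) = p(\mathbf{x}_b|\mathbf{x}_g, \mathbf{x}_t)\,p(\mathbf{x}_g, \mathbf{x}_t) = p(\mathbf{x}_b|\mathbf{x}_t)\,p(\mathbf{x}_g, \mathbf{x}_t)$

$P(\omega_i|\mathbf{x}_g, \mathbf{x}_b) = \dfrac{\int P(\omega_i|\mathbf{x}_g, \mathbf{x}_t)\,p(\mathbf{x}_g, \mathbf{x}_t)\,p(\mathbf{x}_b|\mathbf{x}_t)\,d\mathbf{x}_t}{\int p(\mathbf{x}_g, \mathbf{x}_t)\,p(\mathbf{x}_b|\mathbf{x}_t)\,d\mathbf{x}_t} = \dfrac{\int g_i(\mathbf{x})\,p(\mathbf{x})\,p(\mathbf{x}_b|\mathbf{x}_t)\,d\mathbf{x}_t}{\int p(\mathbf{x})\,p(\mathbf{x}_b|\mathbf{x}_t)\,d\mathbf{x}_t}$, $\quad \mathbf{x} = [\mathbf{x}_g, \mathbf{x}_t]$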

Example of Statistical Dependence and Independence

$p(x_1, x_3) = p(x_1)\,p(x_3)$

[Figure]

Example of Causal Dependence

- State of an automobile
  - Temperature of engine
  - Pressure of brake fluid
  - Pressure of air in the tires
  - Voltages in the wires
  - Oil temperature
  - Coolant temperature
  - Speed of the radiator fan

Bayesian Belief Nets (Causal Networks)

[Figure]

Example: Belief Network for Fish

$P(a_3, b_1, x_2, c_3, d_2) = P(a_3)\,P(b_1)\,P(x_2|a_3, b_1)\,P(c_3|x_2)\,P(d_2|x_2)$

$\quad = 0.25 \times 0.6 \times 0.4 \times 0.5 \times 0.4 = 0.012$

Simple Belief Network 1

$P(\mathbf{d}) = \sum_{\mathbf{a}, \mathbf{b}, \mathbf{c}} P(\mathbf{a}, \mathbf{b}, \mathbf{c}, \mathbf{d})$

$\quad = \sum_{\mathbf{a}, \mathbf{b}, \mathbf{c}} P(\mathbf{a})\,P(\mathbf{b}|\mathbf{a})\,P(\mathbf{c}|\mathbf{b})\,P(\mathbf{d}|\mathbf{c})$

$\quad = \sum_{\mathbf{c}} P(\mathbf{d}|\mathbf{c}) \sum_{\mathbf{b}} P(\mathbf{c}|\mathbf{b}) \sum_{\mathbf{a}} P(\mathbf{b}|\mathbf{a})\,P(\mathbf{a})$
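The gain from pushing the sums inward can be seen on a small chain $a \to b \to c \to d$ of binary variables. All conditional probability tables below are hypothetical values invented for this sketch.

```python
from itertools import product

# Hypothetical tables for a binary chain a -> b -> c -> d.
P_a = [0.3, 0.7]
P_b_a = [[0.9, 0.1], [0.2, 0.8]]   # P_b_a[a][b] = P(b | a)
P_c_b = [[0.6, 0.4], [0.5, 0.5]]   # P_c_b[b][c] = P(c | b)
P_d_c = [[0.7, 0.3], [0.1, 0.9]]   # P_d_c[c][d] = P(d | c)


def p_d_brute(d):
    """Sum the full joint over every (a, b, c) configuration."""
    return sum(P_a[a] * P_b_a[a][b] * P_c_b[b][c] * P_d_c[c][d]
               for a, b, c in product(range(2), repeat=3))


def p_d_nested(d):
    """Evaluate the sums from the inside out, one variable at a time."""
    p_b = [sum(P_b_a[a][b] * P_a[a] for a in range(2)) for b in range(2)]
    p_c = [sum(P_c_b[b][c] * p_b[b] for b in range(2)) for c in range(2)]
    return sum(P_d_c[c][d] * p_c[c] for c in range(2))


marginal = [p_d_nested(d) for d in range(2)]
```

Both routes give the same marginal; the nested form only ever handles one small table at a time, which is what keeps inference tractable on longer chains.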

Simple Belief Network 2

$P(\mathbf{h}) = \sum_{\mathbf{e}, \mathbf{f}, \mathbf{g}} P(\mathbf{e}, \mathbf{f}, \mathbf{g}, \mathbf{h})$

$\quad = \sum_{\mathbf{e}, \mathbf{f}, \mathbf{g}} P(\mathbf{e})\,P(\mathbf{f}|\mathbf{e})\,P(\mathbf{g}|\mathbf{e})\,P(\mathbf{h}|\mathbf{f}, \mathbf{g})$

$\quad = \sum_{\mathbf{f}, \mathbf{g}} P(\mathbf{h}|\mathbf{f}, \mathbf{g}) \sum_{\mathbf{e}} P(\mathbf{e})\,P(\mathbf{f}|\mathbf{e})\,P(\mathbf{g}|\mathbf{e})$

Use of Bayes Belief Nets

- Seek to determine some particular configuration of other variables
  - Given the values of some of the variables (evidence)
- Determine values of several query variables ($\mathbf{x}$) given the evidence of all other variables ($\mathbf{e}$)

$P(\mathbf{x}|\mathbf{e}) = \dfrac{P(\mathbf{x}, \mathbf{e})}{P(\mathbf{e})} = \alpha\,P(\mathbf{x}, \mathbf{e})$

Example

[Figure]

Example

$P(x_1|c_1, b_2) = \dfrac{P(x_1, c_1, b_2)}{P(c_1, b_2)} = \alpha \sum_{\mathbf{a}, \mathbf{d}} P(x_1, \mathbf{a}, b_2, c_1, \mathbf{d})$

$\quad = \alpha \sum_{\mathbf{a}, \mathbf{d}} P(\mathbf{a})\,P(b_2)\,P(x_1|\mathbf{a}, b_2)\,P(c_1|x_1)\,P(\mathbf{d}|x_1)$

$\quad = \alpha\,P(b_2)\,P(c_1|x_1)\,[P(a_1)P(x_1|a_1, b_2) + P(a_2)P(x_1|a_2, b_2) + P(a_3)P(x_1|a_3, b_2) + P(a_4)P(x_1|a_4, b_2)]\,[P(d_1|x_1) + P(d_2|x_1)]$

$\quad = \alpha \cdot 0.114$

$P(x_2|c_1, b_2) = \alpha \cdot 0.066$

$P(x_1|c_1, b_2) = 0.63$, $\quad P(x_2|c_1, b_2) = 0.37$
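The last step of this example is the normalization that fixes $\alpha$: the two unnormalized values must sum to one once scaled. A minimal sketch, with an illustrative helper name:

```python
def normalize(scores):
    """Scale non-negative scores so they sum to one (alpha = 1 / sum)."""
    total = sum(scores)
    return [s / total for s in scores]


unnormalized = [0.114, 0.066]     # alpha-free values for x1 and x2 from the slide
posterior = normalize(unnormalized)
alpha = 1 / sum(unnormalized)
```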

Naïve Bayes' Rule (Idiot Bayes' Rule)

- When the dependency relationships among the features are unknown, we generally take the simplest assumption
  - Features are conditionally independent given the category: $P(\mathbf{x}|\mathbf{a}, \mathbf{b}) = P(\mathbf{x}|\mathbf{a})\,P(\mathbf{x}|\mathbf{b})$
  - Often works quite well

Applications in Medical Diagnosis

- Uppermost nodes represent a fundamental biological agent
  - Such as the presence of a virus or bacteria
- Intermediate nodes describe disease
  - Such as flu or emphysema
- Lowermost nodes describe the symptoms
  - Such as high temperature or coughing
- A physician enters measured values into the net and finds the most likely disease or cause

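Compound Bayesian Decision

$\boldsymbol{\omega} = (\omega(1), \ldots, \omega(n))^t$, where each $\omega(i)$ takes one value from $\omega_1, \ldots, \omega_c$

$\mathbf{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_n)$

$P(\boldsymbol{\omega}|\mathbf{X}) = \dfrac{p(\mathbf{X}|\boldsymbol{\omega})\,P(\boldsymbol{\omega})}{p(\mathbf{X})} = \dfrac{p(\mathbf{X}|\boldsymbol{\omega})\,P(\boldsymbol{\omega})}{\sum_{\boldsymbol{\omega}} p(\mathbf{X}|\boldsymbol{\omega})\,P(\boldsymbol{\omega})}$

Simplification: $p(\mathbf{X}|\boldsymbol{\omega}) = \prod_{i=1}^{n} p(\mathbf{x}_i|\omega(i))$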