Decision Fusion


    Outline

    Advantages of Sensor Network

    Wireless Sensor Network Decision Fusion

    Our approach

    Experiments

    Survey of existing approaches

    Conclusions


    Sensor Network Advantages

    Reliability depends on distance to the target: some nodes are
    valuable, others are not. (M. Duarte and Y.H. Hu, "Distance Based
    Decision Fusion in a Distributed Wireless Sensor Network," IPSN '03.)

    Why take into account all nodes if some are not reliable?

    Methods to filter out redundant, region-corrupting nodes.


    Wireless Sensor Network Decision Fusion

    Collaborative signal processing tasks such as detection,
    classification, localization, and tracking require aggregation of
    sensor data.

    Decision fusion allows each sensor to send quantized data (a
    decision) to a fusion center. This prevents overloading the wireless
    network and conserves energy.

    Question: what is the optimal decision fusion?


    Decision Fusion Approaches

    Existing approaches

    Voting: the simplest, but is it the best?

    Weighted linear combination: generalized voting.

    Stacked generalization (a classifier of classifiers): the most
    general, but the specific method is not specified.

    Basic idea of stacked generalization

    The output of each expert (the individual decisions) can be regarded
    as a meta-feature for the decision fusion algorithm to use in making
    the fused decision. Fusion is then a mapping from the local decisions
    (meta-features) to the final decision.
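    As a concrete reference point, here is a minimal Python sketch of the
    first two approaches. The weight values in the example are
    illustrative assumptions, not taken from the slides.

```python
# Baseline fusion rules: plain voting and weighted linear combination.
from collections import Counter

def majority_vote(decisions):
    """Fuse local decisions by plain voting; ties broken arbitrarily."""
    return Counter(decisions).most_common(1)[0][0]

def weighted_vote(decisions, weights):
    """Generalized voting: each classifier's vote counts with its weight."""
    scores = {}
    for d, w in zip(decisions, weights):
        scores[d] = scores.get(d, 0.0) + w
    return max(scores, key=scores.get)

# Example: three classifiers report classes 1, 1, 4.
print(majority_vote([1, 1, 4]))                    # -> 1
print(weighted_vote([1, 1, 4], [0.2, 0.3, 0.9]))   # -> 4
```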


    Classifier Fusion

    CSP task: region-based classifier fusion.

    Classifying acoustic features into three types of vehicles or no
    vehicle (4 classes). Individual sensor node decision: {1, 2, 3, 4}.

    Assume the sensor-target distance is known: this information is
    available after source localization.

    Not all classifiers have the same classification rate: classifiers
    far away from the target have a lower classification rate than those
    that are closer. The classification rate vs. sensor-target distance
    has been empirically established.


    Classification Rate vs. Sensor-Target Distance

    [Figure: classification probability (0 to 1) as a function of
    sensor-target distance (50-450 meters) and SNR (5-50 dB).]


    Our Approach

    Question: given K classifiers, each classifier's output d(k) is a
    class label ranging from 1 to N (N classes). Assume the kth
    classifier's probability of correct classification, p(k), is known.

    Optimal decision fusion: find a decision fusion classifier that gives
    a combined decision D ∈ {1, 2, ..., N} as a function of {d(k), p(k)}
    such that the probability that D is correct is maximized.

    Stacked-generalization approach:

    Global decision based on local decisions.

    Local decisions may be based on identical features or on different
    features.

    The combination rule D may be linear or nonlinear.

    d(1)  d(2)  ...  d(K) |  D
     1     1   ...    4   |  2
     2     1   ...    3   |  1
    p(1)  p(2)  ...  p(K)


    Our approach (cont'd)

    With K local decisions (classifiers) and N possible decisions
    (classes), there are N^K rows in the assignment table. Each entry
    under D in the table has N possible assignments, hence the total
    number of different fusion rules is N^(N^K).

    For each feature vector x in the feature space, the outcome of all K
    classifiers will be a row in this table, with some probability. To
    calculate this probability, we assume that if a classifier
    misclassifies a feature vector, its output will be one of the
    remaining N-1 class labels with equal probability, and that all
    classifiers make independent decisions.

    d(1)  d(2)  ...  d(K) |  D
     1     1   ...    4   |  2
     2     1   ...    3   |  1
    p(1)  p(2)  ...  p(K)


    Our approach (cont'd)

    For example, let K = 3, N = 3, and p(k) as shown below.

    label  d(1)  d(2)  d(3)
      P     1     1     3
    p(k)   0.7   0.5   0.2

    If the label P = 1, the outcome (1,1,3) will occur with probability
    0.7 × 0.5 × (1 − 0.2)/2 = 0.14.

    If the label P = 2, the outcome (1,1,3) will occur with probability
    (1 − 0.7)/2 × (1 − 0.5)/2 × (1 − 0.2)/2 = 0.015.

    If the label P = 3, the outcome (1,1,3) will occur with probability
    (1 − 0.7)/2 × (1 − 0.5)/2 × 0.2 = 0.0075.

    Given a specific feature vector x, the 27 outcomes of the 3
    classifiers represent all possible outcomes. Hence, for each fixed
    label, the probabilities of the 27 rows add to 1.
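    The slides' outcome-probability model (independent classifiers, with
    errors spread evenly over the other N-1 labels) is easy to state in
    code. A minimal sketch that reproduces the three numbers above:

```python
def outcome_prob(d, true_label, p, N):
    """P(d(1..K) = d | true class) under the slides' model: classifiers
    are independent, a correct label appears with probability p(k), and
    an error is spread evenly over the remaining N-1 labels."""
    prob = 1.0
    for dk, pk in zip(d, p):
        prob *= pk if dk == true_label else (1.0 - pk) / (N - 1)
    return prob

p = [0.7, 0.5, 0.2]   # p(k) from the slide
d = (1, 1, 3)         # the observed outcome row
for label in (1, 2, 3):
    print(label, outcome_prob(d, label, p, N=3))
# -> 1 0.14, 2 0.015, 3 0.0075, matching the slide
```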


    Our approach (cont'd)

    If the assignment is D = 1, the probability that it is a correct
    assignment is

      P(D = 1 correct | d(1)d(2)d(3) = 113)
        = P(d(1)d(2)d(3) = 113 and P = 1) / P(d(1)d(2)d(3) = 113)
        ∝ P(d(1)d(2)d(3) = 113 | P = 1) p(P = 1) = 0.14 × (1/3) ≈ 0.047

    label  d(1)  d(2)  d(3)
      P     1     1     3
    p(k)   0.7   0.5   0.2

    Here we assume p(P = n) = 1/N, namely an uninformed prior
    distribution.

    Similarly,

      P(D = 2 correct | d(1)d(2)d(3) = 113)
        ∝ P(113 | P = 2) p(P = 2) = 0.015 × (1/3) = 0.005

      P(D = 3 correct | d(1)d(2)d(3) = 113)
        ∝ P(113 | P = 3) p(P = 3) = 0.0075 × (1/3) = 0.0025

    so the correct-assignment probability is maximized by D = 1.
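    Under the same assumptions, the optimal fused decision is the class
    whose joint probability with the observed outcome is largest. A
    minimal sketch that reproduces the values above and selects D = 1:

```python
def fuse_map(d, p, N):
    """Pick the fused decision D maximizing P(outcome, P = n) under the
    slides' assumptions, with the uniform prior p(P = n) = 1/N."""
    def outcome_prob(label):
        prob = 1.0
        for dk, pk in zip(d, p):
            prob *= pk if dk == label else (1.0 - pk) / (N - 1)
        return prob
    joints = {n: outcome_prob(n) / N for n in range(1, N + 1)}
    return max(joints, key=joints.get), joints

D, joints = fuse_map(d=(1, 1, 3), p=[0.7, 0.5, 0.2], N=3)
print(joints)  # {1: ~0.047, 2: 0.005, 3: 0.0025}, as on the slide
print(D)       # -> 1
```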


    Evaluation of classification fusion by separate rates

    If the worst classifier has a classification rate less than 1/N, its
    output will be ruled out from fusion; i.e., if this classifier shows
    class n, the maximum fusion will not show class n.

    If the best classifier has a classification rate greater than 0.5,
    its output will be forced in fusion; i.e., if this classifier shows
    class n, the maximum fusion will show class n.

    How does the classification rate of the best classifier compare to
    that of the best fusion? (A brute-force check is sketched below.)

    This will enable different fusion schemes depending on individual
    success rates, and in some cases rule out linear combinations.
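    The comparison just asked for can be computed exactly under the
    slides' model: the rate of the best (maximum-probability) fusion rule
    is the sum, over all N^K outcome rows, of the largest joint
    probability in that row. A sketch, not the authors' code:

```python
from itertools import product

def optimal_fusion_rate(p, N):
    """Classification rate of the best fusion rule for independent
    classifiers with rates p(k), uniform prior, errors spread evenly."""
    def outcome_prob(d, label):
        prob = 1.0
        for dk, pk in zip(d, p):
            prob *= pk if dk == label else (1.0 - pk) / (N - 1)
        return prob
    total = 0.0
    for d in product(range(1, N + 1), repeat=len(p)):  # all N^K rows
        total += max(outcome_prob(d, n) for n in range(1, N + 1)) / N
    return total

# Two-classifier comparison against the best single classifier.
p1, p2 = 0.8, 0.6
print(optimal_fusion_rate([p1, p2], N=4), "vs", max(p1, p2))
```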


    Two-classifier fusion experiments

    [Figure: difference between the maximum mapping classification rate
    and the maximum classifier classification rate; min 0.00, max 0.75.]

    [Figure: maximum classification rate for a two-classifier mapping
    (blue is lowest, red is highest); min 0.33, max 1.00.]


    Survey of Existing Decision Fusion Approaches

    DCS-LA: K. Woods, W.P. Kegelmeyer Jr. and K. Bowyer, "Combination of
    Multiple Classifiers Using Local Accuracy Estimates," IEEE
    Transactions on Pattern Analysis and Machine Intelligence, April 1997.

    Classifier agreement analysis: M. Petrakos, J.A. Benediktsson and
    I. Kanellopoulos, "The Effect of Classifier Agreement on the Accuracy
    of the Combined Classifier in Decision Level Fusion," IEEE
    Transactions on Geoscience and Remote Sensing, November 2001.

    Combination of weak classifiers: C. Ji and S. Ma, "Combinations of
    Weak Classifiers," IEEE Transactions on Neural Networks, January 1997.

    Classifier combination survey: L. Xu, A. Krzyzak and C.Y. Suen,
    "Methods of Combining Multiple Classifiers and Their Applications to
    Handwriting Recognition," IEEE Transactions on Systems, Man and
    Cybernetics, May/June 1992.


    Dynamic Classifier Selection by Local Accuracy (DCS-LA)

    Estimate each classifier's accuracy in local regions of feature
    space.

    Use the decision of the most locally accurate classifier.

    Accuracy can be calculated among all classes or for each separate
    class.

    Analogy to physical space: is a classifier more accurate in a
    spatial region?
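    A minimal numpy sketch of the DCS-LA idea. The held-out validation
    set, the array layout, and the Euclidean k-nearest-neighbor
    neighborhood are assumptions made for illustration:

```python
import numpy as np

def dcs_la(x, X_val, y_val, val_preds, clf_preds, k=10):
    """Return the decision of the classifier that is most accurate on
    the k validation points nearest to the query x.
    x:         query feature vector, shape (d,)
    X_val:     validation features, shape (n, d)
    y_val:     validation labels, shape (n,)
    val_preds: each classifier's predictions on X_val, shape (K, n)
    clf_preds: each classifier's decision for x, shape (K,)"""
    near = np.argsort(np.linalg.norm(X_val - x, axis=1))[:k]
    local_acc = (val_preds[:, near] == y_val[near]).mean(axis=1)
    return clf_preds[np.argmax(local_acc)]
```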


    Combination of Weak Classifiers

    Create low-classification-rate classifiers randomly.

    Train new classifiers on samples marginally classified by the fusion
    of the previous classifiers.

    On fusion, enough classifiers will correctly classify any given
    sample. In general, any fusion scheme should have enough correct
    classifiers for all samples.
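    A loose toy sketch of this idea (not Ji and Ma's actual algorithm):
    random hyperplanes serve as weak classifiers, and each round keeps
    the candidate that best helps the samples the current fused vote
    classifies only marginally. The data, weighting scheme, and pool
    sizes are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))              # toy 2-D features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # toy binary labels

def random_classifier():
    """A randomly created weak classifier: sign of a random hyperplane."""
    w, b = rng.normal(size=2), rng.normal()
    return lambda X: (X @ w + b > 0).astype(int)

classifiers = []
conf = np.zeros(len(X))                    # fused confidence in true label
for _ in range(50):
    # Emphasize samples the current fusion classifies only marginally.
    weights = np.exp(-4 * conf)
    weights /= weights.sum()
    cands = [random_classifier() for _ in range(20)]
    classifiers.append(max(cands, key=lambda c: weights @ (c(X) == y)))
    votes = np.mean([c(X) for c in classifiers], axis=0)
    conf = np.where(y == 1, votes, 1 - votes)
fused = (votes > 0.5).astype(int)          # majority vote of weak learners
print("fused accuracy:", (fused == y).mean())
```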


    Different Levels in Classifier Output Information

    Different information levels merit different fusion schemes:

    Level 1 (Abstract): a classifier outputs only a unique label, or a
    set of labels (uncertainty).

    Level 2 (Rank): a classifier ranks all labels, or a subset of the
    labels, in a queue with the label at the top being the first choice.

    Level 3 (Measurement): each classifier attributes to each label a
    measurement value expressing the degree to which the sample has that
    label.


    Combination of Multiple Classifiers in Dempster-Shafer Formalism

    Propositions A_i, i = 1, ..., M, form the frame Θ = {A_1, ..., A_M}.

    A subset of propositions {A_i1, ..., A_ij} ⊆ Θ represents a
    disjunction.

    Each element represents a singleton: {A_i}.

    All possible subsets of Θ form the superset 2^Θ.

    Each set A ∈ 2^Θ has a value bel(A) ∈ [0, 1].


    Combination of Multiple Classifiers in Dempster-Shafer Formalism

    Belief is determined by the Basic Probability Assignment (BPA) m(A),
    which cannot be subdivided among its elements:

      bel(A) = Σ_{B ⊆ A} m(B)

    Singletons {A_i} are only part of the elements of 2^Θ, thus

      Σ_{i=1}^{M} m({A_i}) ≤ 1

    and the BPA supplies an incomplete probabilistic model.

    Any subset A with positive m is a focal point; for a single focal
    point, m(A) + m(Θ) = 1.
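    A minimal sketch of computing belief from a BPA, representing subsets
    of the frame as Python frozensets (an implementation choice, not from
    the slides):

```python
def belief(A, m):
    """bel(A) = sum of m(B) over all focal subsets B of A.
    m maps frozensets (subsets of the frame) to their BPA mass."""
    return sum(mass for B, mass in m.items() if B <= A)

# Three-proposition frame; mass left on the whole frame encodes the
# ignorance that makes the BPA an incomplete probabilistic model.
theta = frozenset({"A1", "A2", "A3"})
m = {frozenset({"A1"}): 0.5, frozenset({"A1", "A2"}): 0.3, theta: 0.2}
print(belief(frozenset({"A1", "A2"}), m))  # 0.5 + 0.3 = 0.8
```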


    Combination of Multiple Classifiers in Dempster-Shafer Formalism

    Two BPAs can be fused using Dempster's rule:

      m(A) = (m_1 ⊕ m_2)(A) = k Σ_{X ∩ Y = A} m_1(X) m_2(Y)

      k^{-1} = 1 − Σ_{X ∩ Y = ∅} m_1(X) m_2(Y)
             = Σ_{X ∩ Y ≠ ∅} m_1(X) m_2(Y)

    Propositions A_i for the different classes i = 1:M; BPAs m_j for
    each classifier j = 1:K. This method is being reviewed by UWCSP.
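    A minimal sketch of Dempster's rule for two BPAs over the same frame,
    again using frozensets; the example masses are illustrative:

```python
from itertools import product
from collections import defaultdict

def dempster(m1, m2):
    """Fuse two BPAs: m(A) = k * sum of m1(X) m2(Y) over X ∩ Y = A,
    with 1/k = sum of m1(X) m2(Y) over non-empty intersections."""
    fused, conflict = defaultdict(float), 0.0
    for (X, mX), (Y, mY) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            fused[inter] += mX * mY
        else:
            conflict += mX * mY          # mass on empty intersections
    k = 1.0 / (1.0 - conflict)           # renormalize away the conflict
    return {A: k * v for A, v in fused.items()}

theta = frozenset({"A1", "A2"})
m1 = {frozenset({"A1"}): 0.6, theta: 0.4}
m2 = {frozenset({"A2"}): 0.5, theta: 0.5}
print(dempster(m1, m2))  # masses on {A1}, {A2}, and theta, summing to 1
```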


    Conclusions

    Fusion of independent classifiers will not yield a better rate than
    the best classifier (except when the classifiers are bad, and this is
    subject to uncertainty).

    In the sensor network case, if success probabilities are calculated
    for each node, one node will yield the best result, with a smaller
    communication burden.

    This can currently be applied in classification. Expansion to other
    CSP tasks?