Decision Fusion


    Outline

    Advantages of Sensor Network

    Wireless Sensor Network Decision Fusion

    Our approach

    Experiments

    Survey of existing approaches

    Conclusions


    Sensor Network Advantages

    Reliability depends on distance to the target: some nodes are
    valuable, others are not. (M. Duarte and Y.H. Hu, "Distance Based
    Decision Fusion in a Distributed Wireless Sensor Network," IPSN '03.)

    Why take into account all nodes if some are not reliable?

    Methods to filter out redundant, region-corrupting nodes.


    Wireless Sensor Network Decision Fusion

    Collaborative signal processing tasks such as detection,
    classification, localization, and tracking require aggregation of
    sensor data.

    Decision fusion allows each sensor to send quantized data (a
    decision) to a fusion center. This prevents overloading the wireless
    network and conserves energy.

    Question: what is the optimal decision fusion?


    Decision Fusion Approaches

    Existing approaches

    Voting: the simplest, but is it the best?

    Weighted linear combination: generalized voting.

    Stacked generalization (a classifier of classifiers): the most
    general, but the specific method is not specified.

    Basic idea of stacked generalization

    The output of each expert (the individual decisions) can be regarded
    as a meta-feature for the decision fusion algorithm to use in making
    the fused decision. Fusion is then a mapping from the local decisions
    (meta-features) to the final decision.
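    As a concrete reference point, here is a minimal Python sketch of the
    first two approaches. The weight values in the example are
    illustrative assumptions, not taken from the slides.

```python
# Baseline fusion rules: plain voting and weighted linear combination.
from collections import Counter

def majority_vote(decisions):
    """Fuse local decisions by plain voting; ties broken arbitrarily."""
    return Counter(decisions).most_common(1)[0][0]

def weighted_vote(decisions, weights):
    """Generalized voting: each classifier's vote counts with its weight."""
    scores = {}
    for d, w in zip(decisions, weights):
        scores[d] = scores.get(d, 0.0) + w
    return max(scores, key=scores.get)

# Example: three classifiers report classes 1, 1, 4.
print(majority_vote([1, 1, 4]))                    # -> 1
print(weighted_vote([1, 1, 4], [0.2, 0.3, 0.9]))   # -> 4
```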


    Classifier Fusion

    CSP task: region-based classifier fusion.

    Classifying acoustic features into three types of vehicles or no
    vehicle (4 classes). Individual sensor node decision: {1, 2, 3, 4}.

    Assume the sensor-target distance is known: this information is
    available after source localization.

    Not all classifiers have the same classification rate: classifiers
    far away from the target have a lower classification rate than those
    that are closer. The classification rate vs. sensor-target distance
    has been empirically established.


    Classification Rate vs. Sensor-Target Distance

    [Figure: classification probability (0 to 1) as a function of
    sensor-target distance (50-450 meters) and SNR (5-50 dB).]


    Our Approach

    Question: given K classifiers, each classifier's output d(k) is a
    class label ranging from 1 to N (N classes). Assume the kth
    classifier's probability of correct classification, p(k), is known.

    Optimal decision fusion: find a decision fusion classifier that gives
    a combined decision D ∈ {1, 2, ..., N} as a function of {d(k), p(k)}
    such that the probability that D is correct is maximized.

    Stacked-generalization approach:

    Global decision based on local decisions.

    Local decisions may be based on identical features or on different
    features.

    The combination rule D may be linear or nonlinear.

    d(1)  d(2)  ...  d(K) |  D
     1     1   ...    4   |  2
     2     1   ...    3   |  1
    p(1)  p(2)  ...  p(K)


    Our approach (cont'd)

    With K local decisions (classifiers) and N possible decisions
    (classes), there are N^K rows in the assignment table. Each entry
    under D in the table has N possible assignments, hence the total
    number of different fusion rules is N^(N^K).

    For each feature vector x in the feature space, the outcome of all K
    classifiers will be a row in this table, with some probability. To
    calculate this probability, we assume that if a classifier
    misclassifies a feature vector, its output will be one of the
    remaining N-1 class labels with equal probability, and that all
    classifiers make independent decisions.

    d(1)  d(2)  ...  d(K) |  D
     1     1   ...    4   |  2
     2     1   ...    3   |  1
    p(1)  p(2)  ...  p(K)


    Our approach (cont'd)

    For example, let K = 3, N = 3, and p(k) as shown below.

    label  d(1)  d(2)  d(3)
      P     1     1     3
    p(k)   0.7   0.5   0.2

    If the label P = 1, the outcome (1,1,3) will occur with probability
    0.7 × 0.5 × (1 − 0.2)/2 = 0.14.

    If the label P = 2, the outcome (1,1,3) will occur with probability
    (1 − 0.7)/2 × (1 − 0.5)/2 × (1 − 0.2)/2 = 0.015.

    If the label P = 3, the outcome (1,1,3) will occur with probability
    (1 − 0.7)/2 × (1 − 0.5)/2 × 0.2 = 0.0075.

    Given a specific feature vector x, the 27 outcomes of the 3
    classifiers represent all possible outcomes. Hence, for each fixed
    label, the probabilities of the 27 rows add to 1.
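    The slides' outcome-probability model (independent classifiers, with
    errors spread evenly over the other N-1 labels) is easy to state in
    code. A minimal sketch that reproduces the three numbers above:

```python
def outcome_prob(d, true_label, p, N):
    """P(d(1..K) = d | true class) under the slides' model: classifiers
    are independent, a correct label appears with probability p(k), and
    an error is spread evenly over the remaining N-1 labels."""
    prob = 1.0
    for dk, pk in zip(d, p):
        prob *= pk if dk == true_label else (1.0 - pk) / (N - 1)
    return prob

p = [0.7, 0.5, 0.2]   # p(k) from the slide
d = (1, 1, 3)         # the observed outcome row
for label in (1, 2, 3):
    print(label, outcome_prob(d, label, p, N=3))
# -> 1 0.14, 2 0.015, 3 0.0075, matching the slide
```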


    Our approach (cont'd)

    If the assignment is D = 1, the probability that it is a correct
    assignment is

      P(D = 1 correct | d(1)d(2)d(3) = 113)
        = P(d(1)d(2)d(3) = 113 and P = 1) / P(d(1)d(2)d(3) = 113)
        ∝ P(d(1)d(2)d(3) = 113 | P = 1) p(P = 1) = 0.14 × (1/3) ≈ 0.047

    label  d(1)  d(2)  d(3)
      P     1     1     3
    p(k)   0.7   0.5   0.2

    Here we assume p(P = n) = 1/N, namely an uninformed prior
    distribution.

    Similarly,

      P(D = 2 correct | d(1)d(2)d(3) = 113)
        ∝ P(113 | P = 2) p(P = 2) = 0.015 × (1/3) = 0.005

      P(D = 3 correct | d(1)d(2)d(3) = 113)
        ∝ P(113 | P = 3) p(P = 3) = 0.0075 × (1/3) = 0.0025

    so the correct-assignment probability is maximized by D = 1.
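    Under the same assumptions, the optimal fused decision is the class
    whose joint probability with the observed outcome is largest. A
    minimal sketch that reproduces the values above and selects D = 1:

```python
def fuse_map(d, p, N):
    """Pick the fused decision D maximizing P(outcome, P = n) under the
    slides' assumptions, with the uniform prior p(P = n) = 1/N."""
    def outcome_prob(label):
        prob = 1.0
        for dk, pk in zip(d, p):
            prob *= pk if dk == label else (1.0 - pk) / (N - 1)
        return prob
    joints = {n: outcome_prob(n) / N for n in range(1, N + 1)}
    return max(joints, key=joints.get), joints

D, joints = fuse_map(d=(1, 1, 3), p=[0.7, 0.5, 0.2], N=3)
print(joints)  # {1: ~0.047, 2: 0.005, 3: 0.0025}, as on the slide
print(D)       # -> 1
```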


    Evaluation of classification fusion by separate rates

    If the worst classifier has a classification rate less than 1/N, its
    output will be ruled out from fusion; i.e., if this classifier shows
    class n, the maximum fusion will not show class n.

    If the best classifier has a classification rate greater than 0.5,
    its output will be forced in fusion; i.e., if this classifier shows
    class n, the maximum fusion will show class n.

    How does the classification rate of the best classifier compare to
    that of the best fusion? (A brute-force check is sketched below.)

    This will enable different fusion schemes depending on individual
    success rates, and in some cases rule out linear combinations.
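    The comparison just asked for can be computed exactly under the
    slides' model: the rate of the best (maximum-probability) fusion rule
    is the sum, over all N^K outcome rows, of the largest joint
    probability in that row. A sketch, not the authors' code:

```python
from itertools import product

def optimal_fusion_rate(p, N):
    """Classification rate of the best fusion rule for independent
    classifiers with rates p(k), uniform prior, errors spread evenly."""
    def outcome_prob(d, label):
        prob = 1.0
        for dk, pk in zip(d, p):
            prob *= pk if dk == label else (1.0 - pk) / (N - 1)
        return prob
    total = 0.0
    for d in product(range(1, N + 1), repeat=len(p)):  # all N^K rows
        total += max(outcome_prob(d, n) for n in range(1, N + 1)) / N
    return total

# Two-classifier comparison against the best single classifier.
p1, p2 = 0.8, 0.6
print(optimal_fusion_rate([p1, p2], N=4), "vs", max(p1, p2))
```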


    Two-classifier fusion experiments

    [Figure: difference between the maximum mapping classification rate
    and the maximum classifier classification rate; min 0.00, max 0.75.]

    [Figure: maximum classification rate for a two-classifier mapping
    (blue is lowest, red is highest); min 0.33, max 1.00.]


    Survey of Existing Decision Fusion Approaches

    DCS-LA: K. Woods, W.P. Kegelmeyer Jr. and K. Bowyer, "Combination of
    Multiple Classifiers Using Local Accuracy Estimates," IEEE
    Transactions on Pattern Analysis and Machine Intelligence, April 1997.

    Classifier agreement analysis: M. Petrakos, J.A. Benediktsson and
    I. Kanellopoulos, "The Effect of Classifier Agreement on the Accuracy
    of the Combined Classifier in Decision Level Fusion," IEEE
    Transactions on Geoscience and Remote Sensing, November 2001.

    Combination of weak classifiers: C. Ji and S. Ma, "Combinations of
    Weak Classifiers," IEEE Transactions on Neural Networks, January 1997.

    Classifier combination survey: L. Xu, A. Krzyzak and C.Y. Suen,
    "Methods of Combining Multiple Classifiers and Their Applications to
    Handwriting Recognition," IEEE Transactions on Systems, Man and
    Cybernetics, May/June 1992.


    Dynamic Classifier Selection by Local Accuracy (DCS-LA)

    Estimate each classifier's accuracy in local regions of feature
    space.

    Use the decision of the most locally accurate classifier.

    Accuracy can be calculated among all classes or for each separate
    class.

    Analogy to physical space: is a classifier more accurate in a
    spatial region?
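    A minimal numpy sketch of the DCS-LA idea. The held-out validation
    set, the array layout, and the Euclidean k-nearest-neighbor
    neighborhood are assumptions made for illustration:

```python
import numpy as np

def dcs_la(x, X_val, y_val, val_preds, clf_preds, k=10):
    """Return the decision of the classifier that is most accurate on
    the k validation points nearest to the query x.
    x:         query feature vector, shape (d,)
    X_val:     validation features, shape (n, d)
    y_val:     validation labels, shape (n,)
    val_preds: each classifier's predictions on X_val, shape (K, n)
    clf_preds: each classifier's decision for x, shape (K,)"""
    near = np.argsort(np.linalg.norm(X_val - x, axis=1))[:k]
    local_acc = (val_preds[:, near] == y_val[near]).mean(axis=1)
    return clf_preds[np.argmax(local_acc)]
```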


    Combination of Weak Classifiers

    Create low-classification-rate classifiers randomly.

    Train new classifiers on samples marginally classified by the fusion
    of the previous classifiers.

    On fusion, enough classifiers will correctly classify any given
    sample. In general, any fusion scheme should have enough correct
    classifiers for all samples.
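    A loose toy sketch of this idea (not Ji and Ma's actual algorithm):
    random hyperplanes serve as weak classifiers, and each round keeps
    the candidate that best helps the samples the current fused vote
    classifies only marginally. The data, weighting scheme, and pool
    sizes are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))              # toy 2-D features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # toy binary labels

def random_classifier():
    """A randomly created weak classifier: sign of a random hyperplane."""
    w, b = rng.normal(size=2), rng.normal()
    return lambda X: (X @ w + b > 0).astype(int)

classifiers = []
conf = np.zeros(len(X))                    # fused confidence in true label
for _ in range(50):
    # Emphasize samples the current fusion classifies only marginally.
    weights = np.exp(-4 * conf)
    weights /= weights.sum()
    cands = [random_classifier() for _ in range(20)]
    classifiers.append(max(cands, key=lambda c: weights @ (c(X) == y)))
    votes = np.mean([c(X) for c in classifiers], axis=0)
    conf = np.where(y == 1, votes, 1 - votes)
fused = (votes > 0.5).astype(int)          # majority vote of weak learners
print("fused accuracy:", (fused == y).mean())
```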


    Different Levels in Classifier Output Information

    Different information levels merit different fusion schemes:

    Level 1 (Abstract): a classifier outputs only a unique label, or a
    set of labels (uncertainty).

    Level 2 (Rank): a classifier ranks all labels, or a subset of the
    labels, in a queue with the label at the top being the first choice.

    Level 3 (Measurement): each classifier attributes to each label a
    measurement value expressing the degree to which the sample has that
    label.


    Combination of Multiple Classifiers in Dempster-Shafer Formalism

    Propositions A_i, i = 1, ..., M, form the frame Θ = {A_1, ..., A_M}.

    A subset of propositions {A_i1, ..., A_ij} ⊆ Θ represents a
    disjunction.

    Each element represents a singleton: {A_i}.

    All possible subsets of Θ form the superset 2^Θ.

    Each set A ∈ 2^Θ has a value bel(A) ∈ [0, 1].


    Combination of Multiple Classifiers in Dempster-Shafer Formalism

    Belief is determined by the Basic Probability Assignment (BPA) m(A),
    which cannot be subdivided among its elements:

      bel(A) = Σ_{B ⊆ A} m(B)

    Singletons {A_i} are only part of the elements of 2^Θ, thus

      Σ_{i=1}^{M} m({A_i}) ≤ 1

    and the BPA supplies an incomplete probabilistic model.

    Any subset A with positive m is a focal point; for a single focal
    point, m(A) + m(Θ) = 1.
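    A minimal sketch of computing belief from a BPA, representing subsets
    of the frame as Python frozensets (an implementation choice, not from
    the slides):

```python
def belief(A, m):
    """bel(A) = sum of m(B) over all focal subsets B of A.
    m maps frozensets (subsets of the frame) to their BPA mass."""
    return sum(mass for B, mass in m.items() if B <= A)

# Three-proposition frame; mass left on the whole frame encodes the
# ignorance that makes the BPA an incomplete probabilistic model.
theta = frozenset({"A1", "A2", "A3"})
m = {frozenset({"A1"}): 0.5, frozenset({"A1", "A2"}): 0.3, theta: 0.2}
print(belief(frozenset({"A1", "A2"}), m))  # 0.5 + 0.3 = 0.8
```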


    Combination of Multiple Classifiers in Dempster-Shafer Formalism

    Two BPAs can be fused using Dempster's rule:

      m(A) = (m_1 ⊕ m_2)(A) = k Σ_{X ∩ Y = A} m_1(X) m_2(Y)

      k^{-1} = 1 − Σ_{X ∩ Y = ∅} m_1(X) m_2(Y)
             = Σ_{X ∩ Y ≠ ∅} m_1(X) m_2(Y)

    Propositions A_i for the different classes i = 1:M; BPAs m_j for
    each classifier j = 1:K. This method is being reviewed by UWCSP.
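    A minimal sketch of Dempster's rule for two BPAs over the same frame,
    again using frozensets; the example masses are illustrative:

```python
from itertools import product
from collections import defaultdict

def dempster(m1, m2):
    """Fuse two BPAs: m(A) = k * sum of m1(X) m2(Y) over X ∩ Y = A,
    with 1/k = sum of m1(X) m2(Y) over non-empty intersections."""
    fused, conflict = defaultdict(float), 0.0
    for (X, mX), (Y, mY) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            fused[inter] += mX * mY
        else:
            conflict += mX * mY          # mass on empty intersections
    k = 1.0 / (1.0 - conflict)           # renormalize away the conflict
    return {A: k * v for A, v in fused.items()}

theta = frozenset({"A1", "A2"})
m1 = {frozenset({"A1"}): 0.6, theta: 0.4}
m2 = {frozenset({"A2"}): 0.5, theta: 0.5}
print(dempster(m1, m2))  # masses on {A1}, {A2}, and theta, summing to 1
```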


    Conclusions

    Fusion of independent classifiers will not yield a better rate than
    the best classifier (except when the classifiers are bad, and this is
    subject to uncertainty).

    In the sensor network case, if success probabilities are calculated
    for each node, one node will yield the best result, with a smaller
    communication burden.

    This can currently be applied in classification. Expansion to other
    CSP tasks?