
Comparison of Evidence Theory and Bayesian Theory for Uncertainty Modeling

Prabhu Soundappan

Efstratios Nikolaidis1 Mechanical, Industrial and Manufacturing Department

The University of Toledo Toledo, OH-43606

USA Email (Nikolaidis) [email protected]

R. T. Haftka

Department of Aerospace Engineering, Mechanics and Engineering Science The University of Florida

Gainesville, FL 32611-6250 USA

Ramana Grandhi

Department of Mechanical and Materials Engineering Wright State University

Dayton, OH 45435 USA

Robert Canfield

Air Force Institute of Technology WPAFB, OH 45433

USA

Abstract This paper compares Evidence Theory (ET) and Bayesian Theory (BT) for uncertainty modeling and decision under uncertainty, when the evidence about uncertainty is imprecise. The basic concepts of ET and BT are introduced and the ways these theories model uncertainties, propagate them through systems and assess the safety of these systems are presented. ET and BT approaches are demonstrated and compared on challenge problems involving an algebraic function whose input variables are uncertain. The evidence about the input variables consists of intervals provided by experts. It is recommended that a decision-maker compute both the Bayesian probabilities of the outcomes of alternative actions and their plausibility and belief measures when evidence about uncertainty is imprecise, because this helps assess the importance of imprecision and the value of additional information. Finally, the paper presents and demonstrates a method for testing approaches for decision under uncertainty in terms of their effectiveness in making decisions.

1 Corresponding author

1. Introduction

The information in many problems of design under uncertainty, especially those

involving reducible (epistemic) uncertainty, is imprecise. Reducible uncertainty is

uncertainty due to lack of knowledge, as opposed to random (aleatory) uncertainty, which

is due to inherent variability in a physical phenomenon. It is called reducible, because it

can be reduced or eliminated if one collects information. The uncertainty in the

probability of getting heads in one flip of a bent coin is reducible, because it is due to

lack of knowledge. It can be reduced if we conduct experiments. On the other hand, there is aleatory uncertainty when flipping a coin even if we know the probabilities of the events heads and tails. This type of uncertainty cannot be reduced even if we conduct n experiments, where n is a very large number. For example, we know that the probability of heads and of tails for a fair coin is 0.5, but each time we flip the coin we are uncertain about the outcome. This uncertainty is therefore also called irreducible uncertainty. Oberkampf et al. [1]

explained these two types of uncertainty and presented examples in which aleatory and

epistemic uncertainty are encountered in engineering problems.

There is no consensus about what the best theory is for modeling epistemic

uncertainty. Oberkampf et al. [1] studied the differences and similarities of epistemic and

random uncertainty. In their study, they used a hybrid approach in which random

uncertainty was modeled using probability and epistemic uncertainty was modeled using

intervals bounding variables in which there was epistemic uncertainty. Theories for modeling epistemic uncertainty include Coherent Upper and Lower Previsions [2-4], Possibility Theory [5-8], Evidence Theory [9-10], the Transferable Belief Model [11] and Bayesian Theory [12-13]. Information-Gap Decision Theory is another alternative for

decision making under uncertainty when information about uncertainty is scarce [14].


Information about epistemic uncertainty is usually in the form of intervals. For

example, if we show a glass jar containing beans to a person and ask her how many beans

are in the jar, then she is more likely to give a range rather than a precise number.

Similarly, if we ask an expert what he thinks the prime interest rate will be in 2007, he

will probably provide a range rather than a single number. The same is true, if we ask an

expert about the probability of getting “heads” in a flip of a bent coin.

The objective of the study presented in this paper is to compare two approaches,

one using Evidence Theory (ET), the other using Bayesian Theory (BT), for

characterizing uncertainty in situations, such as the ones presented in the previous

paragraph, where the information about uncertainty is imprecise.

Specifically, the following problem is considered:

The performance of a system is characterized by variable Y, which is a function of

uncertain variables X1,...,Xm. We know the relation between the performance variable Y

and variables X1,...,Xm: $y = f(x_1, \ldots, x_m)$. We have information about the values of variables X1,…,Xm, which is in the form of intervals obtained from n experts. These intervals have the form $[x_{l,i,j}, x_{u,i,j}]$, where subscript i specifies the variable, and j

specifies the expert. Suppose that a system survives if the performance variable Y falls in

a certain interval or collection of intervals denoted by S. There is no uncertainty in the

functional relation of the performance variable and the input variables, and in the

definition of survival. We want to model the uncertainty in the independent variables,

derive a model about the uncertainty in the performance variable Y and assess safety.

2 In this report, capital letters denote variables and lower case letters denote values that these variables assume.


In section 2, the assumptions of the ET-based approach are presented first, and then an approach for constructing models of uncertainty is developed on the basis of these assumptions. A method for propagating uncertainty through a system to

estimate the uncertainty in the response from the uncertainty in the input is shown.

Finally, equations for computation of the Belief and Plausibility of failure of the system

are presented. A simple example of an algebraic function is used to demonstrate each

step of the approach.

Section 3 presents a Bayesian approach for constructing models of uncertainty.

First, Bayes rule for updating the prior mass function of a discrete variable or the

probability density function of a continuous variable is reviewed. Then an approach for

constructing a model of uncertainty using Bayes rule from expert evidence about the

random variables, which is in the form of intervals, is presented. An example

demonstrating each step of this approach is also included.

In section 4, ET and BT approaches are demonstrated and compared on a series of

challenge problems involving epistemic uncertainty proposed by the Epistemic

Uncertainty Group [15].

As mentioned earlier, decision-makers have an arsenal of different theories and

methods based on these theories for making decisions under uncertainty. There is no

consensus as to what method is most suitable for problems with epistemic uncertainty,

when information is scarce and imprecise. Comparing alternative approaches on the basis of the decisions they lead to could help us understand these methods better and assess their effectiveness in modeling epistemic uncertainty.

Section 5 proposes an approach for comparing methods for the solution of the challenge


problems. The approach uses alternative methods for characterizing uncertainty to make

decisions, the outcomes of which are later evaluated through numerical simulations or

physical experiments.

2. Evidence Theory Approach

Assumptions

The following are the key assumptions of the ET approach:

1. If some of the evidence is imprecise we can quantify uncertainty about an event

by the maximum and minimum probabilities of that event. Maximum (minimum)

probability of an event is the maximum (minimum) of all probabilities that are

consistent with the available evidence.

2. The process of asking an expert about an uncertain variable is a random

experiment whose outcome can be precise or imprecise. There is randomness

because every time we ask a different expert about the variable we get a different

answer. The expert can be precise and give a single value or imprecise and

provide an interval. Therefore, if the information about uncertainty consists of

intervals from multiple experts, then we have uncertainty due to both imprecision

and randomness.

If all experts are precise they give us pieces of evidence pointing precisely to

specific values. In this case, we can build a probability distribution of the variable. But if

the experts provide intervals, we cannot build such a probability distribution because we

do not know what specific values of the random variables each piece of evidence


supports. In this case, we can use second order probability3, or we can calculate the

maximum and minimum values of the probabilities of events. The latter approach does

not require any additional information beyond what is already available.

To demonstrate this philosophy of calculating maximum and minimum bounds

for the probability of an event when the evidence is imprecise, consider the following

problem. We roll a weighted die n times and videotape the results. The statistical error is

negligible because n is large. Later we discover that we cannot determine precisely the

outcomes in the experiments from the tape. We can only tell that 40% of the experiments resulted in a number less than or equal to three and the other 60% in a number greater than three. In this case, we cannot estimate the probability of each of the numbers 1-6, unless we make arbitrary assumptions about the likelihood of getting numbers between 1 and 3 and between 4 and 6. But we could estimate that there is a 0.4 probability of getting a number between 1 and 3 and a 0.6 probability of getting a number between 4 and 6. Instead of making additional assumptions, such as that all numbers between 1 and 3 are equally likely, we

could conclude that every number from 1 to 3 can have a probability as high as 0.4 and as

low as 0 and every number from 4 to 6 can have a probability as high as 0.6 and as low as

0.

Modeling uncertainty

First consider one variable X1. The information consists of intervals obtained from

n experts, $[x_{l,1,j}, x_{u,1,j}]$, $j = 1, \ldots, n$, each thought to enclose the precise value of $X_1$. The intervals can be

nested, in which case we have consonant evidence, they may overlap, or they may be

3 Second order probability treats the variables associated with epistemic uncertainty as random variables with their own probability distributions and computes a probability distribution of the probability of occurrence of an event. For example, in a experiment that involves flipping a bent coin the probability of the event “heads” is treated as a random variable.


disjoint, in which case we have conflicting evidence. When an expert provides an

interval instead of a value, then the expert is telling us that the true value of the variable

could be anywhere in this interval. Therefore, the evidence from the expert could or could

not support a particular value in that interval. The maximum probability of the variable

being equal to x is the ratio of the pieces of the imprecise evidence from the experts that

could support x to the total number of intervals. For example, if experts 1 and 2 told us that the price of gas ten years from now will be between $1 and $5 and between $1 and $10, respectively, then on the basis of this evidence the probability of any value between $1 and $5 could be as high as 1, and the probability of any value greater than $5 and less than or equal to $10 could be as high as 0.5.

The maximum probability of X1=x , Pu(X1=x), can be found by solving the

following optimization problem:

Find $x_{1,1}, \ldots, x_{1,n}$

to maximize

$$P_u(X_1 = x) = \frac{1}{n}\sum_{i=1}^{n} I_i \qquad (1)$$

where $I_i$ is an indicator function,

$$I_i = \begin{cases} 1 & \text{if } x_{1,i} = x \\ 0 & \text{otherwise} \end{cases}$$

so that $x_{1,i} \in [x_{l,1,i}, x_{u,1,i}]$.

This maximum probability will also be called Plausibility. The above formulation

indicates that the maximum probability of X1=x is the ratio of the number of intervals of

the experts containing x to the total number of intervals.

The minimum probability of $X_1 = x$, $P_l(X_1 = x)$, can be found by solving the following dual optimization problem:

Find $x_{1,1}, \ldots, x_{1,n}$

to minimize

$$P_l(X_1 = x) = \frac{1}{n}\sum_{i=1}^{n} I_i' \qquad (2)$$

where $I_i'$ is an indicator function,

$$I_i' = \begin{cases} 1 & \text{if } x_{l,1,i} = x_{u,1,i} = x \\ 0 & \text{otherwise} \end{cases}$$

so that $x_{1,i} \in [x_{l,1,i}, x_{u,1,i}]$. This minimum probability will also be called Belief.

From the above formulation, we conclude that the minimum probability of X1=x

is the ratio of the number of intervals that coincide with point {x} to the total number of

intervals. This probability is zero unless there is precise evidence pointing at x.

One can easily extend the formulations of the above two optimization problems to

find the maximum and minimum probabilities of any event associated with variable X1,

such as the event that X1 assumes a value in a given interval or set of intervals. The

Plausibility and Belief can also be found by solving equations (8) and (9), once the body of evidence of the input variables is resolved. The body of evidence is constructed from the intervals given by the experts using the mixing (averaging) technique, which we found to be the most intuitive technique when one does not have any knowledge about the experts. The evidence can also be combined using several other techniques, such as Dempster's rule of combination, the Discount+Combine method, Yager's modified Dempster's rule, and Inagaki's unified combination rule. These combination rules are studied in detail

in [10].

The following assertion relates maximum and minimum probabilities to

Plausibility and Belief, which are used in evidence theory [6] to characterize one’s belief


about the occurrence of events. Evidence theory can be viewed as an extension of

probability theory. It is suitable for characterizing uncertainty when evidence is

imprecise because it allows one to estimate probabilities of intervals instead of

probabilities of specific values. These intervals are called focal elements and their

probabilities basic probabilities.

Assertion

Consider the experts providing n intervals about a variable. These intervals are

considered as focal elements and have basic probability 1/n. Then the Plausibility and

Belief of any event associated with the variable are equal to the maximum and minimum

probabilities of the event, respectively.

Justification

The maximum probability of an event represented by a set C is equal to the ratio of

the number of focal elements (intervals provided by the experts) that intersect with C to

the total number of focal elements. Indeed, all of the evidence provided by the experts,

consisting of intervals intersecting with C, could support C because, according to these

experts, the true value of the variable could be in C. The rest of the evidence cannot

support C because the rest of the intervals and C are disjoint. That means the maximum

probability of C is equal to the sum of the basic probabilities of the focal elements that

intersect C, which is the Plausibility of C. Similarly, the minimum probability of C is the

number of the focal elements contained in C divided by the total number of focal

elements, which is the Belief of C.
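To make the assertion concrete, the following short Python sketch (not part of the original study; the event intervals are illustrative) computes the Plausibility and Belief of an interval event from n expert intervals, each treated as a focal element with basic probability 1/n:

def plausibility_belief(intervals, c_lo, c_hi):
    # Event C = [c_lo, c_hi]; each expert interval is a focal element with m = 1/n.
    n = len(intervals)
    intersects = sum(1 for lo, hi in intervals if hi >= c_lo and lo <= c_hi)  # could support C
    contained = sum(1 for lo, hi in intervals if lo >= c_lo and hi <= c_hi)   # must support C
    return intersects / n, contained / n

# Evidence for variable A from Example 1 below: two expert intervals.
intervals_A = [(0.1, 0.4), (0.3, 0.6)]
print(plausibility_belief(intervals_A, 0.0, 0.35))   # Pl = 1.0, Bel = 0.0
print(plausibility_belief(intervals_A, 0.45, 0.70))  # Pl = 0.5, Bel = 0.0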

The following examples are based on the challenge problems [15].


Example 1: Two experts said that variable A is in the following intervals: [0.1, 0.4] and

[0.3,0.6]. Two experts said that variable B is in the following intervals: [0.2, 0.5] and

[0.4, 0.7]. Figure 1 shows the maximum probabilities (Plausibility) of A and B.

Consider m variables. If there is no information about the correlation of the

variables and the experts are equally credible, we can transfer the evidence about each

variable, into the m-dimensional space of all the variables using the following equation:

$$m_{X_1 X_2 \ldots X_m}([x_{1l}, x_{1u}], [-\infty, +\infty], \ldots, [-\infty, +\infty]) = \frac{m_{X_1}([x_{1l}, x_{1u}])}{m} \qquad (3)$$

In the above equation, $m_{X_1}([x_{1l}, x_{1u}])$ is the basic probability of the interval $[x_{1l}, x_{1u}]$. Symbol $m_{X_1 X_2 \ldots X_m}([x_{1l}, x_{1u}], [-\infty, +\infty], \ldots, [-\infty, +\infty])$ is the basic probability of the same interval in the m-dimensional space of the variables. This equation can be justified as follows: if an expert says that a variable is in a given interval, this is true for both the space of that variable and the m-dimensional space of the m variables. But if we have m bodies of evidence for m variables, then the evidence must be normalized by m when transferring evidence from the one-dimensional space of a variable to the m-dimensional space.

If we know that the variables are independent (that is, information about any

group of variables does not change our belief about the others) then we can use the

following approach to combine the evidence about the variables into a single joint body

of evidence. a) Focal elements of the joint body of evidence are the elements of the

Cartesian product of the elements of the evidence about the individual variables. b) The


probability of each element in the m-dimensional space is the product of the individual

probabilities.

$$m_{X_1 \ldots X_m}([x_{1l}, x_{1u}], \ldots, [x_{ml}, x_{mu}]) = m_{X_1}([x_{1l}, x_{1u}]) \cdot \ldots \cdot m_{X_m}([x_{ml}, x_{mu}]) \qquad (4)$$

The above equation is a special case of Dempster’s rule of combination when the

bodies of evidence from different experts are independent and equally credible. Since the

rule has been justified in other publications, such as [9], we will not justify it here.

From the joint body of evidence, we can estimate the maximum joint probability

of the variables.

Example 2: This is a continuation of example 1. If A and B are independent, the joint body of evidence is shown in Fig. 2. Specifically, this figure shows the focal elements of the joint body of evidence, which are the four rectangles (boxes) in Fig. 2. These boxes are the result of the Cartesian product of the individual bodies of evidence of

variables A and B. The maximum joint probability (Plausibility) of these variables is

shown in Figure 3.
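A minimal Python sketch of this construction, assuming independent and equally credible experts as in Eq. (4): the joint focal elements are the Cartesian product of the one-dimensional focal elements, and each box receives the product of the individual basic probabilities.

from itertools import product

def joint_body(bodies):
    # bodies: one list of (interval, basic probability) pairs per variable.
    # Returns joint focal elements (boxes) with product basic probabilities, per Eq. (4).
    joint = []
    for combo in product(*bodies):
        box = tuple(interval for interval, _ in combo)
        prob = 1.0
        for _, m in combo:
            prob *= m
        joint.append((box, prob))
    return joint

body_A = [((0.1, 0.4), 0.5), ((0.3, 0.6), 0.5)]
body_B = [((0.2, 0.5), 0.5), ((0.4, 0.7), 0.5)]
for box, m in joint_body([body_A, body_B]):
    print(box, m)   # four boxes, each with basic probability 0.25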

Propagating uncertainty through a system

Here we compute the maximum probability of variable Y, which is related to the

input variables through function $y = f(x_1, \ldots, x_m)$, from the joint body of evidence about $X_1, \ldots, X_m$. First, we transform the joint body of evidence about the input variables into

evidence about variable Y. For this purpose, we map the focal elements in the m-

dimensional space of the independent variables into the elements in the space of variable

Y. We can do this by solving one maximization and one minimization problem to find

the limits of variable Y, when the input variables vary within each focal element in the


joint probability space. Mathematically, we solve a pair of optimization problems to find

the interval in the space of variable Y given the focal element in the space of the variables

X1, …,Xm.

Find $X_1, \ldots, X_m$

to minimize (maximize)

$$Y = f(X_1, \ldots, X_m) \qquad (5)$$

so that $X_i \in [X_i^l, X_i^u]$, where $X_i^l$ and $X_i^u$ are the lower and upper bounds of $X_i$ corresponding to each focal element in the joint space.

Example 3: This is an extension of example 2. After calculating the joint body of

evidence of variables, the limits of Y can be found using Eq. 5. In Fig. 2, we have four

boxes in the joint space of variables as the outcome of the Cartesian product of the

individual variables. The shaded box in Fig. 2 is the product of the focal elements

A[0.1,0.4] and B[0.2,0.5]. The basic probability assigned to this shaded box (Fig.2) is the

product of the probabilities of the individual focal elements. The corresponding limits of

Y can be found by solving Eq. (5); the resulting interval is [0.897, 1.039], shown as the shaded ellipse in Figs. 4 and 5.

The basic probability assigned to this focal element is ¼. Using the same procedure, the

rest of the focal elements and the basic probabilities are calculated to construct the body

of evidence of Y.

The above optimization problems can be solved using nonlinear programming or

Monte-Carlo simulation. This yields a set of intervals for Y, Ci, and the basic probabilities

of these intervals, mY(Ci). Then we compute the maximum probability of Y being equal

to y through the following equation:

$$P_u(Y = y) = \sum_{y \in C_i} m_Y(C_i) \qquad (6)$$

In the above equation, Ci are the intervals that contain y.

We can also find the maximum and minimum cumulative probability distribution

functions (UCDF and LCDF, respectively) of Y. These functions provide the maximum

and minimum values of the probability of Y being less than or equal to a value y. The

UCDF is obtained using the following equation:

$$F_Y^u(y) = \sum m_Y(C_i) \qquad (7)$$

The sum in the above equation includes all the elements Ci that intersect with

interval [-∞, y]. The LCDF is obtained from the same equation but the sum includes all

elements that are contained in the interval [-∞, y].

Example 4: Consider the function $Y = (A+B)^A$. The maximum probabilities of variables A and

B were found in Example 1 and the joint maximum probability of A and B in Example 2.

Example 3 explains how we construct the body of evidence of Y from the joint space.

Figures 4 and 5 show the body of evidence of variable Y, and the corresponding

maximum probability and the maximum and minimum cumulative probability

distribution functions of Y, respectively.
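The steps of Examples 2-4 can be sketched in a few lines of Python. The min/max in Eq. (5) is approximated here by dense grid sampling rather than the nonlinear programming or Monte Carlo simulation mentioned in the text, so the code is only an illustration:

import numpy as np

def y_interval(box, f, n=201):
    # Eq. (5): bounds of Y over a 2-D focal box, by brute-force sampling.
    (a_lo, a_hi), (b_lo, b_hi) = box
    a, b = np.meshgrid(np.linspace(a_lo, a_hi, n), np.linspace(b_lo, b_hi, n))
    vals = f(a, b)
    return float(vals.min()), float(vals.max())

def cdf_bounds(y_focal, y):
    # Eq. (7) and its counterpart: sum basic probabilities of the Y intervals that
    # intersect (UCDF) or are contained in (LCDF) the interval [-inf, y].
    ucdf = sum(m for (lo, hi), m in y_focal if lo <= y)
    lcdf = sum(m for (lo, hi), m in y_focal if hi <= y)
    return ucdf, lcdf

f = lambda a, b: (a + b) ** a
boxes = [(((0.1, 0.4), (0.2, 0.5)), 0.25), (((0.1, 0.4), (0.4, 0.7)), 0.25),
         (((0.3, 0.6), (0.2, 0.5)), 0.25), (((0.3, 0.6), (0.4, 0.7)), 0.25)]
y_focal = [(y_interval(box, f), m) for box, m in boxes]   # body of evidence of Y
print(y_focal)
print(cdf_bounds(y_focal, 1.0))   # upper and lower cumulative probability at y = 1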

It is observed from Figure 4 that the maximum probability of Y=y is an overly

conservative measure of the Likelihood of this event. Indeed, unless the probability

density function (PDF) of Y has a delta function at Y=y, the probability of this event is

zero, while the maximum probability of this event assumes a nonzero value for values of Y in the range from 0.81 to 1.17.

Figure 5 shows that there is a large gap between the maximum and minimum

bounds of the cumulative probability of Y. This indicates a large uncertainty in the true

value of this probability. There are two reasons for this large uncertainty: a) The


intervals provided by the experts about the values of the independent variables A and B

are wide and they are nested. b) Only the information provided in the problem statement

was used to model the uncertain variables.

Assessing safety

As mentioned in the introduction, when the evidence is imprecise it is useful to

know how low and how high the probability of survival (or failure) of a system can be.

Indeed, when evidence is imprecise, it could be reasonable to design a system, whose

failure could have severe consequences, using the most conservative models that are

consistent with the available evidence. The maximum and minimum probabilities of

survival can be found using the following equations:

$$P_u(S) = Pl(S) = \sum_{C_i \cap S \neq \emptyset} m_Y(C_i) \qquad (8)$$

$$P_l(S) = Bel(S) = \sum_{C_i \subseteq S} m_Y(C_i) \qquad (9)$$
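As a continuation of the earlier sketch, Eqs. (8)-(9) can be evaluated directly from the body of evidence of Y. The survival set S and the focal intervals below are assumed, illustrative values, not results from the challenge problems:

def survival_bounds(y_focal, S):
    # Eqs. (8)-(9): Pl(S) sums focal elements intersecting S; Bel(S) sums those contained in S.
    # S is given as a list of disjoint intervals.
    def intersects(c, s):
        return c[0] <= s[1] and c[1] >= s[0]
    def contained(c):
        return any(s[0] <= c[0] and c[1] <= s[1] for s in S)
    pl = sum(m for c, m in y_focal if any(intersects(c, s) for s in S))
    bel = sum(m for c, m in y_focal if contained(c))
    return pl, bel

# Illustrative body of evidence of Y and survival region Y <= 1.0.
print(survival_bounds([((0.81, 0.95), 0.5), ((0.90, 1.17), 0.5)], [(-1e9, 1.0)]))   # (1.0, 0.5)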

Uncertainty Measures

ET considers two types of uncertainty. One is due to the imprecision in the

evidence; the other is due to the conflict. Nonspecificity and Strife measure the

uncertainty due to imprecision and conflict, respectively. Both measures are expressed in

bits of information. In the following, we briefly present these measures. A detailed

presentation can be found in [16, 17].

The larger the focal elements of a body of evidence, the more imprecise is the

evidence and, consequently, the higher is Nonspecificity. When the evidence is precise


(all of the focal elements consist of a single member), Nonspecificity is zero. In the

challenge problems, the broader the interval of the experts, the higher is Nonspecificity.

Strife measures the degree to which pieces of evidence contradict each other.

Consonant (nested) focal elements imply little or no conflict. Disjoint elements imply

high conflict in the evidence. For example, if the experts’ intervals are disjoint, the

experts contradict each other. Therefore, Strife is large. For finite sets, when evidence is

precise, Strife reduces to Shannon’s entropy, which measures conflict in probability

theory.

Nonspecificity measures the epistemic/reducible uncertainty, the uncertainty

associated with the sizes (cardinalities) of relevant sets of alternatives. Consider a body of

evidence <F,m>, where F represents the set of all focal elements and m their

corresponding basic probability assignments. Here N(m,µ) measures the Nonspecificity

in bits.

$$N(m, \mu) = \sum_{A \subseteq X} m(A) \cdot \mu(A) \qquad (10)$$

where $\mu(A) = \log_2 |A|$ for discrete domains and $\mu(A) = \ln(1 + |A|)$ for continuous domains. $|A|$ is the Lebesgue measure of A. The Lebesgue measure of an interval is its

length.

Strife measures conflict among the various sets of alternatives in a body of

evidence. The Strife measure in evidence theory is given by

$$S(m) = -\sum_{A \in F} m(A) \log_2 \left[ \sum_{B \in F} m(B) \cdot SUB(A, B) \right] \qquad (11)$$

where $SUB(A, B) = \dfrac{|A \cap B|}{|A|}$ for finite sets, and

$$SUB(A, B) \equiv \begin{cases} \dfrac{\lambda(A \cap B)}{\lambda(A)} & \text{if } \lambda(A) > 0 \\ 1 & \text{if } \lambda(A) = 0 \text{ and } A \cap B \neq \emptyset \\ 0 & \text{otherwise} \end{cases}$$

for infinite sets, where $\lambda(A \cap B)$ and $\lambda(A)$ represent the Lebesgue measures of $A \cap B$ and A, respectively.
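The following Python sketch evaluates both measures for one-dimensional interval evidence, assuming the continuous-domain forms above (mu(A) = ln(1 + |A|) and the Lebesgue-measure ratio for SUB); it is an illustration, not code from the paper:

import math

def nonspecificity(body):
    # Eq. (10) with the continuous-domain mu(A) = ln(1 + |A|).
    return sum(m * math.log(1.0 + (hi - lo)) for (lo, hi), m in body)

def strife(body):
    # Eq. (11) with SUB(A, B) = lambda(A intersect B) / lambda(A) for intervals.
    s = 0.0
    for (a_lo, a_hi), m_a in body:
        inner = 0.0
        for (b_lo, b_hi), m_b in body:
            overlap = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
            inner += m_b * overlap / (a_hi - a_lo)
        s -= m_a * math.log2(inner)
    return s

body_A = [((0.1, 0.4), 0.5), ((0.3, 0.6), 0.5)]   # evidence for A in Example 1
print(nonspecificity(body_A), strife(body_A))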

3. Bayesian Approach

This section explains Bayes rule and presents an approach for constructing a

probability mass/density function of discrete and continuous variables using evidence

from experts. Methods for estimating the probability density function of the response of

a system and its probability of failure given the probability density function of the input

variables, such as Monte Carlo simulation and Fast Probability Integration, are well

documented [18]. Therefore, they will not be discussed here.

Bayes rule

Discrete case: Updating a Prior Probability Mass Function using evidence

Suppose we have information about variable X whose Prior Probability Mass

Function (PMF) is given by the set of possible values $x_1, \ldots, x_J$ and the corresponding probabilities $P(X = x_j)$, $j = 1, \ldots, J$. Then we observe a sample value of another variable

Y. The Likelihood probability, P(Y=y/X=xj), is determined from the conditional PMF of

Y given X. Bayes rule can be applied to update the Prior PMF of X, when the sample

value of Y is observed, to estimate the Posterior PMF, P(X= xj | Y=y). Bayes rule for the

discrete case is:

$$P(X = x_j \,|\, Y = y) = \frac{P(Y = y \,|\, X = x_j) \cdot P(X = x_j)}{\sum_{j=1}^{J} P(Y = y \,|\, X = x_j) \cdot P(X = x_j)} \qquad (12)$$
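A minimal Python sketch of Eq. (12); the prior and likelihood values below are illustrative, not taken from the challenge problems:

def bayes_update(prior, likelihood):
    # Eq. (12): posterior PMF of X after observing Y = y.
    # prior: P(X = x_j); likelihood: P(Y = y | X = x_j), both keyed by x_j.
    evidence = sum(likelihood[x] * prior[x] for x in prior)
    return {x: likelihood[x] * prior[x] / evidence for x in prior}

prior = {1: 0.25, 2: 0.50, 3: 0.25}
likelihood = {1: 0.10, 2: 0.60, 3: 0.30}
print(bayes_update(prior, likelihood))   # posterior PMF, sums to 1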

Continuous case: Updating a Prior PDF using evidence [19]

The uncertainty in a continuous random variable, X, can be represented by its

PDF, $f_X^o(x)$, which can be updated on the basis of evidence, E, into a Posterior PDF, $f_X(x/E)$. This function can be calculated using Bayes theorem:

$$f_X(x/E) = \frac{1}{k} \cdot L(E/x) \cdot f_X^o(x) \qquad (13)$$

$f_X(x/E)$ is the Posterior PDF of variable X and k is a normalization constant:

$$k = \int_{-\infty}^{\infty} L(E/x) \cdot f_X^o(x) \, dx \qquad (14)$$

$L(E/x)$ is the Likelihood function, which is the conditional probability of observing evidence E, given X = x.

If we do not know the Prior PDF of X, then we can assume a non-

informative/maximum entropy Prior. If we only know that a variable is in a certain range

then the uniform probability density is the one with maximum entropy. If the evidence is

imprecise (i.e. it consists of intervals instead of single values) Bayes rule cannot be

directly applied. The analyst needs to make assumptions that are described below to

estimate the Likelihood of the evidence. The Posterior PDF of X can be sensitive to these

assumptions.

Example 5: Consider the problem solved in examples 1-4 using the ET approach. Assume

that an analyst only knows that variable A is between 0.1 and 1, and variable B is between


0 and 1. Since the analyst has no other information regarding the priors, the analyst assumes the uniform prior probability distributions for A and B shown in Figure 6.

Method for combining evidence from experts to construct models of uncertainty

When the evidence given by experts about a random variable X, consists of

intervals (Figure 7), we need a method to interpret the evidence and bring it into a form

so that it can be used in the Bayesian framework.

To apply Bayes theorem to this problem we need to estimate the Prior PDF of X

and the Likelihood L(E/X=x). The following assumptions are made to combine evidence

from experts:

a) The analyst converts the interval provided by each expert about a variable into

a point estimate. This can be the midpoint of the interval or another point

obtained based on the analyst's judgment.

b) The point estimate of the expert is equal to the true value of the variable plus an

error. The analyst assumes a joint probability distribution of the errors of the

experts.

Suppose that we have evidence in the form of intervals from n experts,

$[x_i^{min}, x_i^{max}]$ for i = 1,…,n (Figure 7). The analyst can assume that the i-th expert gives a point estimate $\hat{x}_i$, which is the midpoint of the i-th interval:

$$\hat{x}_i = \frac{x_i^{min} + x_i^{max}}{2} \quad \text{for } i = 1, \ldots, n \qquad (15)$$

Let x be the true value of variable X. If the error in the ith expert’s estimate is Di,

then:

Point Estimates = Random Variables + Errors of Experts, or

$$\hat{\mathbf{X}} = \mathbf{X} + \mathbf{D} \qquad (16)$$

where $\hat{\mathbf{X}}$ is the vector of the point estimates of the experts (size n), $\mathbf{X}$ is a vector whose elements are all equal to the true value of variable X, x, and $\mathbf{D}$ is the vector of the errors.

We need the PDF of D to estimate the likelihood of the evidence. As an example

we can assume that random vector D is normal with mean b and covariance matrix C.

$$\mathbf{C} = \begin{bmatrix} \sigma_{D_1}^2 & \cdots & \rho_{1n}\,\sigma_{D_1}\sigma_{D_n} \\ \vdots & \ddots & \vdots \\ \rho_{n1}\,\sigma_{D_n}\sigma_{D_1} & \cdots & \sigma_{D_n}^2 \end{bmatrix} \qquad (17)$$

where $\sigma_{D_i}$ is the standard deviation in the estimate of the i-th expert and $\rho_{ij}$ is the correlation coefficient of the estimates of two experts. The i-th element of vector b, $b_i$, is the bias of the i-th expert. The analyst should estimate the above quantities.

As an example, an analyst could assume that:

a) Bias bi is zero,

b) The endpoints of the interval provided by the i-th expert are equal to the midpoint ±3

standard deviations of the error, respectively.

On the basis of the above assumptions, the analyst can calculate the standard

deviation of the error in each expert estimate:

$$\sigma_{D_i} = \frac{x_i^{max} - x_i^{min}}{6} \qquad (18)$$

Example 6: In example 5 we assumed the prior PDFs of A and B. The next step is to

determine the experts’ error given the evidence from experts. We also assume that the

bias is zero for all the experts (that is the errors of the experts have zero mean). Figure 8

displays the experts’ errors for variables A and B.

Then the Likelihood of the evidence is:

$$L(E \,|\, X = x) = f(\hat{\mathbf{X}} / X = x) = \frac{1}{(2\pi)^{n/2}\,|\mathbf{C}|^{1/2}} \, e^{-\frac{1}{2}(\hat{\mathbf{X}} - \mathbf{X} - \mathbf{b})^T \mathbf{C}^{-1} (\hat{\mathbf{X}} - \mathbf{X} - \mathbf{b})} \qquad (19)$$

The Posterior PDF of variable X can be calculated from Eq. (13).

Example 7: Consider the function $Y = (A+B)^A$. The Prior PDFs and the errors of the experts for variables A and B were found in examples 5 and 6. The Likelihood is

calculated for this example using Eq. (19) and the Posterior PDF using Eq. (13). The

Likelihood PDF and the Posterior PDF of A and B are presented in Figure 9. Using the

convolution integral method we can readily compute the posterior PDF of dependent

variable Y.
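Under the assumptions above, the posterior of a single variable can be computed numerically on a grid. The following Python sketch does this for variable A of Example 1 (uniform prior on [0.1, 1], midpoints per Eq. (15), standard deviations per Eq. (18), independent unbiased experts, so the likelihood of Eq. (19) factors into a product of normal densities); it illustrates the procedure rather than reproducing Figure 9:

import numpy as np

intervals_A = [(0.1, 0.4), (0.3, 0.6)]                       # evidence for A
x_hat = np.array([(lo + hi) / 2 for lo, hi in intervals_A])  # point estimates, Eq. (15)
sigma = np.array([(hi - lo) / 6 for lo, hi in intervals_A])  # error st. deviations, Eq. (18)

a = np.linspace(0.1, 1.0, 1000)        # support of the uniform prior of A
prior = np.ones_like(a) / (1.0 - 0.1)

# Independent, unbiased experts: the joint likelihood is a product of normal densities, Eq. (19).
lik = np.ones_like(a)
for xh, s in zip(x_hat, sigma):
    lik *= np.exp(-0.5 * ((xh - a) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

post = prior * lik
post /= np.trapz(post, a)              # normalization constant k, Eq. (14)
print(a[np.argmax(post)])              # location of the posterior mode of A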

Uncertainty Measures

In standard and Bayesian probability theory, Shannon’s entropy measures the uncertainty

due to conflict. Since evidence is treated as if it were precise, Nonspecificity is zero.

Shannon’s entropy for finite sets is not directly applicable for continuous distributions as

a measure of uncertainty. When a probability density function of a continuous variable is

defined in a real interval, then Shannon’s entropy is defined in relative terms using a

reference probability density function. In this case, entropy can be positive (entropy in


the probability density function is greater than that in the reference probability density

function) or it can be negative. It can only be employed in a modified form:

$$H(X) = -\int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{g(x)} \, dx \qquad (20)$$

where X is the random variable with PDF p(x) and g(x) is the reference density function

of X. In this paper, the reference densities for the input variables are their prior PDFs. For the dependent variable Y, the reference density is the PDF of Y corresponding to the prior PDFs of the input variables.
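A short numerical sketch of Eq. (20), assuming a uniform reference density and an illustrative bell-shaped posterior (the negative result reflects a reduction of uncertainty relative to the reference):

import numpy as np

def relative_entropy(x, p, g):
    # Eq. (20): entropy of p(x) relative to the reference density g(x).
    return -np.trapz(p * np.log(p / g), x)

x = np.linspace(0.1, 1.0, 1000)
g = np.ones_like(x) / 0.9                       # uniform reference (prior) density
p = np.exp(-0.5 * ((x - 0.35) / 0.05) ** 2)     # illustrative posterior shape
p /= np.trapz(p, x)                             # normalize to a proper PDF
print(relative_entropy(x, p, g))                # negative: less uncertainty than the reference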

4. Demonstration and Comparison of ET and BT Approaches

The Epistemic Uncertainty Group [15] proposed solving the following challenge problem using methods for modeling uncertainty to understand how the methods work when evidence is imprecise. Consider the function $Y = (A+B)^A$, where A and B

are independent variables, which means that knowledge about one variable does not alter

our belief about the other. There is no uncertainty in the functional relation between A, B

and Y. There is only uncertainty in the values of A and B. Experts provide information

about A and B in the form of intervals. The objective is to quantify the uncertainty in Y.

In this paper, the above problem is solved in seven cases using both ET and BT

approaches. The objective is to calculate and compare the models that these approaches

construct to characterize the uncertainty in variables A, B and Y. Table 1 presents the

evidence from the experts. In case 1, we have only one expert providing evidence for

each variable, so there is only imprecision. In cases 2 and 3, two experts provided

evidence for each of the variables A and B and there is both imprecision and conflict.


Conflict is considerably lower in case 3 than in case 2, because the intervals are

overlapping in the former case and disjoint in the latter. The experts in case 4 are precise

(their intervals are very narrow) but they contradict each other. In case 5, we have

highly imprecise experts (the intervals for A and B are wide) and there is no conflict (the

intervals are nested, which means that all experts could be right). This is the opposite of the situation in case 4, where conflict dominates over imprecision. In cases 6 and 7, we

have evidence from 3 and 4 experts for variables A and B, respectively. In case 6, the

evidence is nested, so there is no conflict. In case 7, the intervals of the experts are

disjoint and narrower than in case 6. Therefore there is higher conflict and lower

imprecision than in case 6.

The analyst in the Bayesian approach makes the following assumptions for all the

cases:

1. The Priors of A and B are uniform from 0.1 to 1 and from 0 to 1, respectively.

2. The error of each expert is normal with standard deviation equal to 1/6th of the

width of the interval provided by the expert in all cases but 4a and 7a. If the

experts are unbiased the mean of the error is zero. If the experts are independent,

then the correlation coefficients of the errors are zero. Consequently, the

correlation matrix in Eq. (17) is diagonal.

In the Bayesian approach the analyst assumes the probability distributions of the

errors of the experts. In cases 1-7, the experts are assumed independent and unbiased. In

case 3a, the analyst still assumes zero bias but the errors of the two experts are positively

correlated ($\rho = 0.8$) for variable A, and negatively correlated ($\rho = -0.8$) for variable B.

In case 3b, the analyst assumes a bias of 0.1 for variable A and a bias of 0.05 for variable

B, but the experts are independent.

Cases 4 and 7 are challenging for the analyst who uses the Bayesian approach because the experts contradict each other. This means that, based on the experts' intervals, only one expert can be correct. The analyst could still assume that the standard deviations of the experts' errors are equal to 1/6th of the widths of their intervals, but this is a poor

assumption because if both experts were as accurate as this assumption implies then they

would not contradict each other. To overcome this difficulty, the analyst increases the

standard deviation of the error based on the degree of conflict in the experts' evidence. The standard deviation of the error obtained from Eq. (18) is increased by 0.05 and 0.2 for

both variables A and B in cases 4a and 7a, respectively.

Figures 10-16 present the results of the two methods in cases 1-7, respectively.

The ET method yields maximum and minimum cumulative probability distributions of

the input and the output variables. These curves envelop the true cumulative probability

distributions of the variables. They are also labeled Plausibility and Belief, respectively.

The Bayesian approach characterizes uncertainty using a single cumulative probability

distribution function. Because of the assumptions about the probability distributions of

the expert's errors, the maximum and minimum cumulative probability distributions do

not always envelop the Bayesian cumulative distribution.

One can assess the magnitude of each type of uncertainty by studying the

maximum and minimum cumulative probability distributions obtained from the ET

approach. A large horizontal distance between the maximum and minimum cumulative


distributions of a variable indicates high imprecision. For example, in case 5, there is

high imprecision in all variables. Uncertainty due to conflict in a variable can be assessed

by the width of the interval in which its cumulative probability distribution suggests that

it can vary. The flatter the cumulative distribution, the wider is the interval of variation

and the higher is the uncertainty due to conflict. For example, conflict is larger in cases 4

and 7 than in the other cases, because the cumulative Plausibility and Belief distributions

of the variables are flatter (have lower slope) in these cases than in the other cases. From

the Bayesian approach results in Figure 12, we observe that the uncertainty increases

when there is positive correlation in the expert’s errors (variable A) and decreases when

there is negative correlation in the expert's errors (variable B). A shift in the cumulative

distribution function (Figure 12) accounts for the bias assumed in estimating the experts’

errors for variables A, B and Y in case 3b.

As mentioned earlier, conflict is largest in cases 4 and 7. If the Bayesian analyst

assumes that the standard deviation of the errors of the expert’s estimates are 1/6th of the

widths of their intervals, then the analyst will seriously underestimate the uncertainty in

both the input variables and the response variable. For example, in Figures 13 and 16, the

uncertainty in all variables computed by the BT approach in cases 4 and 7 is small

compared to that predicted by the ET approach. But if the Bayesian analyst increases the

standard deviation of the errors of the experts then the analyst will assess the uncertainty

more accurately (cases 4a and 7a) and his/her conclusions will be consistent with those

from ET.

Both Shannon's entropy, used in BT, and Total Uncertainty, used in ET, indicate

that the uncertainty is largest in cases 4a and 7a (Figure 17). It is also observed from


Figure 17 that the conclusions of the BT approach are sensitive to the assumptions about

the experts’ errors. Assuming that the standard deviations of the errors of the expert’s

estimates are 1/6th of the widths of their intervals leads to the conclusion that entropy is

small (cases 4 and 7). But this assumption amounts to saying that both experts are

precise, which is wrong because the experts contradict each other. Using proper

assumptions about the experts' errors, a Bayesian analyst obtains conclusions consistent with ET.

Table 2 presents the Nonspecificity, Conflict and Shannon’s entropy in cases 1 to

7. Even though both Strife and Shannon’s entropy measure conflict, they should not be

compared because Shannon’s entropy measures the conflict relative to a reference

probability density (a uniform probability density in this example). The negative numbers

for the entropy indicate that the uncertainty in variables A and B was reduced when the

prior distributions of these variables were updated based on the expert’s evidence. In

case 1, Strife is zero showing that there is no uncertainty due to conflict. In case 2, Strife

is higher than Nonspecificity, which indicates that uncertainty due to conflict in the

evidence of the experts dominates. In case 3, imprecision and conflict types of

uncertainty are comparable. In case 4, conflict dominates over imprecision because the

evidence is precise but conflicting. In case 5, imprecision dominates over conflict,

because the evidence consists of nested intervals. In Case 6, Nonspecificity and Strife are

comparable. In case 7, where the intervals are disjoint, Strife is larger than

Nonspecificity. These conclusions are consistent with those from Figure 18, which shows

that Nonspecificity dominates in cases 1, 3-3b, 5, 6 whereas Strife dominates in cases 4


and 7. In the BT approach one can only assess the total uncertainty in the input and output variables, because both types of uncertainty are aggregated into one.

When imprecision is large, a decision-maker cannot estimate the probability of an

event accurately. For example, in case 5, the cumulative probability of Y can assume any

value between 0 and 1, for values of Y between 1 and 1.7. This indicates that one should

consider collecting more data before making a decision. However, if the decision-maker

has to make a decision now, then the results of ET do not tell the decision-maker what to

do. For example if two alternative designs have minimum and maximum probabilities of

failure 0.05 and 0.1, and 0.01 and 0.15, respectively, the ET approach does not tell the

decision-maker which design is safer.

Table 3 summarizes the differences between the two approaches. The BT approach does not

distinguish between imprecision and conflict types of uncertainty. It provides single

estimates of the probabilities of events, which help a decision-maker rank alternative

designs in terms of their reliability. On the other hand, a decision-maker cannot tell if it is

worth buying more information by only examining the cumulative probability

distribution of a variable. One can assess the value of additional information by studying

the sensitivity of the results of the BT on the underlying assumptions about the

probability distributions of the random variables.

5. Experimental Comparison of Methods

Different methods have been proposed on how to model and propagate the

uncertainty, and provide information on the uncertainty in the function y in the challenge

problems. We assume that, in many cases, uncertainty is propagated for the purpose of


making decisions. Therefore, we propose a simulation method for testing methods in

terms of their effectiveness for making simple decisions.

The proposed simulations imitate the following physical experiment designed to

see which of two people (say John and Linda) can estimate better the relative weight of

two pieces of cake. We give John a cake, asking him to cut it knowing that Linda will

pick the heavier slice, and that his objective is to end up with the heaviest slice himself.

Under these conditions, John will try to cut the cake as evenly as possible. We then repeat

the experiment by giving Linda an identical cake to slice. Finally, we weigh the two

pieces that John ended up with and the two pieces that Linda has. If Linda has

a substantially heavier total, it would indicate that she estimates the relative weight of two

pieces of cake more accurately than John. This problem belongs to a wide class of real-

life problems in which two players are to divide a certain amount of resources or goods in

two parts. For example, a sales manager wants to divide a town between two salesmen

equitably. One way is for the manager to ask one salesman to divide the town into two

regions by drawing a straight line on the town map and then ask the other salesman to

select a region. Then the salesman who divided the town receives the remaining region.

The analogy is immediate when one wants to estimate the median of the

probability density function of the function y(x) (median: 0.5 probability to fall below it and

0.5 probability to exceed it). We have two methods for constructing models of the

uncertainty in variable X and propagating it to quantify the uncertainty in function y(x).

Each method has an advocate, called John and Linda. We ask John to divide the interval

[yl, yu] into two by picking one point inside the interval, so that Linda can select one

subinterval. We repeat the procedure by asking Linda to slice the interval and allow John


to pick a subinterval. Finally, we conduct an evaluation of who selected better. How we

do that is a crucial part of the proposed procedure, but we will discuss it later.

First, let us give a simple example. Assume that X is a scalar and that we know

only that it resides in the interval [0,1]. We will assume that the person who divides the

interval will do so assuming that the other person uses the same model for characterizing

uncertainty. We also assume that each player wants to get the portion with the highest

probability. We will assume that if the function is y = x, so that [yl, yu]=[0,1], both John

and Linda will slice [yl, yu] in the middle (at y = 0.5). Now consider the function $y = x^2$,

which also has [yl, yu]=[0,1]. This time, John may say that since he does not know

anything about the distribution of X, he does not know anything about the distribution of

Y so he will still divide [yl, yu] in the middle. Linda, on the other hand, may assume a

uniform distribution for X. Using this assumption she will select the interval [0, 0.5] and

leave John the interval [0.5,1]. Then, in her turn, she will divide [yl, yu] at y = 0.25 since

0.25 is the median of the probability distribution of Y. Using the same logic, John will

then pick the interval [0.25,1]. That is, in both cases John will end up with the right

interval, and Linda with the left one.

We now come to the issue of how to decide who made better decisions, John or

Linda. We assume that knowing only the interval where a variable lies is equivalent to

that variable being able to take any probability distribution supported on that interval

with each distribution having the same likelihood. This suggests the following possible

Monte Carlo simulation: Pick a distribution at random, and evaluate the outcomes for

John and Linda based on that distribution. Repeat the process many times and see


whether John or Linda emerges as the winner when a large number of simulations has been

performed.

To make this process more manageable we assume that probability distributions

in common usage, such as the normal or the Weibull distributions, are popular because

they describe well uncertainties we encounter often in practical applications. This allows

us to limit the simulation to a set of the five or ten most popular distributions. Note that

even if we limited ourselves to a single distribution, we still can vary the parameters of

the distribution such as the mean and standard deviation in case of the normal

distribution.

To cater for the possibility that we left out some odd but important distributions,

we can add in some experimental distributions from various sources. For example, we

may ask a class of students to each pick a number from the range of X or a scaled version

of X.

We still need to deal with the question of how to simulate situations where the

information on X is more complex. For example, we may receive information from two

people, one saying that X is in the range [0, 1] and the other saying that it is in the range

[0, 0.5]. The best way of simulating this situation deserves some consideration and

debate. One possibility is to take half of the distributions from one interval and the other

half from the other interval (assuming that both sources of information are equally

credible).

In the following, we present methods for solving the interval splitting problem and

demonstrate them through examples.

Definition of Interval Splitting and Subinterval Selection Problem


Consider the following game played by John and Linda. The players consider a

variable X and a known function of this variable, Y(X). They are told that X is in the

interval I=[xl, xu]. Then Y is in the interval IY=[yl, yu], where yl and yu are the minimum

and maximum values of the function Y(X) when X varies in the interval I. Additional

evidence about X could be available. John divides IY into two subintervals by selecting a

point, y0. Then Linda selects one of the subintervals, IYC, and John gets the remaining

subinterval, IYB. Linda wins if she selects the subinterval with the higher probability. Find

y0 so that John does not lose. The game is repeated with John and Linda switching places

to create a symmetric game.

Solution

Even though the two players may have different models of uncertainty, we

assume that neither player knows the other's model. Therefore, we assume that they

solve the problem under the assumption that the other player uses the same model as they

do.

The interval splitting problem is formulated as follows:

Find y0,

to maximize the objective function P(IYB) - P(IYC), where P(IYB) and P(IYC) are the true probabilities of John's and Linda's subintervals, respectively.

Under the assumption that Linda has the same uncertainty model as John, Linda

can always select the interval with highest probability. The best bet for John is to select

the median of Y as the dividing point, in which case he will break even (see the appendix

for a mathematical proof of this assertion).


John should select the subinterval with the highest probability; that is, he will select

the left subinterval if y0 is to the right of the median of Y (i.e. FY (y0 ) ≥ 0.5). He will

select the right subinterval otherwise.

Examples

In the following, we present two solutions to both the interval splitting and

subinterval selection problems for the function $Y = X^2$ in two cases, in which John knows that

X is in an interval (Case A) and John has evidence about X in the form of three intervals

(Case B). We further assume that one player uses the principle of maximum entropy to

construct a probability distribution, while the other employs the minimum and

maximum cumulative probability distribution function of Y obtained using ET for

dividing and selecting.

Case A:

The only evidence available is that X is between 0 and 1. Therefore, Y ranges

between 0 and 1, too.

Interval splitting problem: Find the splitting point, y0.

Maximum entropy solution: Using the maximum entropy principle, John assumes a uniform probability density of X in [0, 1]. Then,

$$F_Y(y_0) = 0.5 \iff P(Y \le y_0) = 0.5 \iff P(X^2 \le y_0) = 0.5 \iff P(X \le \sqrt{y_0}) = 0.5 \iff F_X(\sqrt{y_0}) = 0.5.$$

Therefore, the optimum value of $y_0$ is the square of the median of X, which is 0.5: $y_0 = 0.5^2 = 0.25$.

ET solution: In this case, John does not assume a probability distribution for X or

Y. Instead, he will find the minimum and maximum cumulative probabilities of X that

are consistent with the available evidence and derive the minimum and maximum

cumulative probabilities of Y. Figure 19 shows the minimum and maximum cumulative

probability distributions of X, and Y, which are identical in this case. According to this

figure, John only knows that the cumulative probability can assume any value between 0

and 1 for any value of Y in the interval [0, 1]. Therefore, it does not matter what value

of Y he selects as long as it is in the interval [0, 1]. In cases where one only knows that

the optimum solution to a problem is in a certain interval it is reasonable to select the

midpoint of that interval for the solution because this will best protect against errors in

calculating the interval. On the basis of this argument, John will select y0 = 0.5.

Subinterval selection problem

Probabilistic solution: If y0 is less than 0.25, John will select the right interval, otherwise

he will select the left.

ET solution: If y0 is less than 0.5, John will select the right interval, otherwise he will

select the left one.

Based on these calculations we will get the following scenario: The maximum

entropy player will divide the interval at 0.25, and the ET player will select the right

subinterval. In the second game, the ET player will divide the interval at 0.5, and the

maximum entropy player will select the left subinterval.
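The scenario above can be evaluated with the kind of Monte Carlo simulation proposed earlier. The Python sketch below assumes, purely for illustration, that the "true" X in each trial follows a Beta distribution with randomly drawn parameters; it then scores how often each player's selected subinterval has the larger true probability:

import numpy as np

rng = np.random.default_rng(0)

def prob_below(y0, a, b, n=20000):
    # True probability P(Y <= y0) for Y = X**2 when X ~ Beta(a, b), by sampling.
    x = rng.beta(a, b, n)
    return float(np.mean(x ** 2 <= y0))

wins_me, wins_et = 0, 0
trials = 1000
for _ in range(trials):
    a, b = rng.uniform(0.5, 5.0, 2)          # randomly drawn "true" distribution
    # Game 1: the maximum entropy player splits at 0.25; the ET player takes the right side.
    p_left = prob_below(0.25, a, b)
    wins_et += (1.0 - p_left) > p_left
    # Game 2: the ET player splits at 0.5; the maximum entropy player takes the left side.
    p_left = prob_below(0.5, a, b)
    wins_me += p_left > (1.0 - p_left)
print(wins_me / trials, wins_et / trials)    # fraction of trials each selector wins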

Case B:

Three experts tell John that X ranges in the following intervals: [0, 1], [0.3, 0.7],

[0.4, 0.6].


Interval splitting problem

Maximum entropy solution: If we considered only the opinion of one expert, we would assume a uniform probability distribution in that expert's interval. Assuming that each distribution is correct one third of the time, we obtain the probability density function of X shown in Figure 20. Based on this function, the median of X is x0 = 0.5. Therefore, the optimum value of y0 is again 0.25.

ET solution: The body of evidence for variable X introduced by the experts is shown in

Figure 21, where m ([a, b]) is the basic probability assignment of interval [a, b].

Figure 22 presents the basic probability assignment of Y . Using this body of evidence

we can compute the minimum and maximum cumulative probabilities of variable Y. For

example, the minimum cumulative probability distribution at point y is equal to the belief

of interval [-∞, y], which is:

$$F_Y^{min}(y) = Bel([-\infty, y]) = \sum_{A \subseteq [-\infty, y]} m(A) \qquad (21)$$

The minimum and maximum cumulative distributions of Y in Figure 23 can tell us

that the median of Y cannot assume any values less than 0.09 or greater than 0.49.

Therefore, the optimum value of y0 could be anywhere in the range [0.09, 0.49]. A reasonable choice for y0 is the midpoint; that is, the splitting point is y0 = 0.29.
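A short sketch that checks these bounds numerically: each expert interval for X is mapped through Y = X^2 (monotone for X >= 0), and the lowest values of y at which the upper and lower CDFs of Y reach 0.5 are reported:

import numpy as np

x_focal = [((0.0, 1.0), 1/3), ((0.3, 0.7), 1/3), ((0.4, 0.6), 1/3)]   # Case B evidence
y_focal = [((lo ** 2, hi ** 2), m) for (lo, hi), m in x_focal]        # Y = X**2 is monotone on [0, 1]

ys = np.linspace(0.0, 1.0, 2001)
ucdf = np.array([sum(m for (lo, hi), m in y_focal if lo <= y) for y in ys])
lcdf = np.array([sum(m for (lo, hi), m in y_focal if hi <= y) for y in ys])
print(ys[ucdf >= 0.5][0], ys[lcdf >= 0.5][0])   # approximately 0.09 and 0.49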

Subinterval selection problem

Probabilistic solution: If y0 is less than 0.25, John will select the right interval, otherwise he will select the left.

ET solution: If y0 is less than 0.29, John will select the right interval, otherwise he will

select the left.

The results in this case are similar to Case A. In the first game the maximum

entropy player will select 0.25 as the splitting point, and the ET player will select the

right interval. In the second game, the splitting point will be 0.29, and the maximum

entropy player will select the left interval.

Observations

Consider the problem $Y = X^n$, where n is large (e.g., 100). The person who used the maximum entropy principle would select an extremely small $y_0$ ($y_0 = 0.5^{100}$). If the only information available to John was that X is in the interval [0, 1] and John used an ET

approach, then he would not know what value of Y to select in the interval [0, 1]. He

could select the midpoint y0 = 0.5. In this case he would lose practically all the time by a

wide margin for most probability distributions. Figure 24 shows John's objective

(payoff) function when John splits the interval, in the case where, although both players assumed that $Y = X^{100}$, the true exponent of the function was 100 times a bias factor ranging from 0.5 to 1.5. Two cases, in which John uses probability and Linda uses minimum and maximum probabilities and vice versa, are considered. Since John is splitting the interval, he has a disadvantage compared to Linda. When John uses

probability his payoff function is close to 0, which means he will break even. But when

John uses maximum and minimum probability he will almost always lose and his payoff

function is practically -1, which is the lowest value this function can assume. The reason

34

is that he will split the interval in half and Linda will select the left interval which almost

always has the higher probability.
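The qualitative behavior in Figure 24 can be reproduced in closed form with a short sketch (ours, under the same illustrative assumptions as before: X is uniform on [0, 1], and Linda always picks the subinterval with the higher true probability; splitter_payoff is our name).

def splitter_payoff(y0, bias):
    # true CDF of Y = X**(100*bias) at y0 when X ~ Uniform(0, 1)
    F = y0 ** (1.0 / (100.0 * bias))
    # Linda takes whichever subinterval has the higher true probability,
    # so John's payoff is -|2F - 1| (zero only when y0 is the true median)
    return -abs(2.0 * F - 1.0)

for bias in (0.5, 0.75, 1.0, 1.25, 1.5):
    print(bias,
          round(splitter_payoff(0.5 ** 100, bias), 3),   # John uses probability
          round(splitter_payoff(0.5, bias), 3))          # John uses min-max probability

Near a bias factor of 1 the probabilistic split breaks even, while the midpoint split stays close to the worst possible payoff of -1 over the whole range, consistent with the discussion above.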

Now consider the same problem when the evidence consists of the three intervals [0, 1], [0.3, 0.7], and [0.4, 0.6]. In this case, if John used the probabilistic method described above, he would still obtain the correct value of y0. On the other hand, if John used the approach based on minimum and maximum probability, he would conclude that the splitting point should lie in the interval [0.3^100, 0.7^100]. This solution is much better than the optimum solution of the same method when only the interval [0, 1] is available. It appears that the performance of the approach based on minimum and maximum probability can be very poor when there is a severe information deficit. This is interesting because our intuition tells us the opposite: the fewer assumptions a method makes, the less sensitive it should be to lack of information. The endpoints of this range are verified in the short sketch below.
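As a quick check (ours, again assuming the function behaves like Y = X^100), the bounds on the median of Y follow directly from the sorted endpoints of the focal elements of Y, since each carries an equal mass of 1/3.

focal = [(0.0, 1.0), (0.3 ** 100, 0.7 ** 100), (0.4 ** 100, 0.6 ** 100)]   # equal masses of 1/3
lefts = sorted(a for a, b in focal)
rights = sorted(b for a, b in focal)
# With three equal masses, the cumulative plausibility (belief) of [-inf, y] first
# reaches 0.5 at the second-smallest left (right) endpoint, bracketing the median of Y.
print(lefts[1], rights[1])   # 0.3**100 and 0.7**100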

5. Conclusions

Two approaches, one based on ET and the other on BT, have been presented. These approaches can be used for modeling uncertainty and assessing the safety of a system when the available evidence consists of expert-provided intervals bounding the values of the input variables.

The Evidence Theory approach does not require the user to assume anything beyond what is already available. This approach treats uncertainty due to imprecision differently than uncertainty due to randomness. It yields maximum and minimum bounds of the probability of survival (and/or the probability of failure) of a system, which can help assess the relative importance of the two types of uncertainty. These results could help a decision-maker decide whether it is worth collecting additional data to reduce imprecision. On the other hand, if the gap between the maximum and minimum probabilities is large, the decision-maker will have difficulty ranking alternative options. If the decision-maker has to make a decision now, the ET approach does not indicate which option is better.

The Bayesian approach requires the analyst to make strong assumptions about the credibility of the experts in order to estimate the likelihood of the available evidence. On the other hand, this approach is more flexible than the Evidence Theory approach because it accounts for the credibility of, and correlation between, the experts. It yields a single estimate of the probability of failure of the system, which makes it easier for a decision-maker to rank alternative options. However, it does not help the decision-maker assess the importance of imprecision relative to random uncertainty.

It is recommended that a decision-maker compute both the Bayesian probability

of events and their minimum and maximum probabilities when there is considerable

imprecision. A large gap between the minimum and maximum probability suggests that

more information should be collected before making a decision. If this is not feasible,

then Bayesian probabilities can help make a decision.

A procedure for testing alternative methods for solving the challenge problems, based on the outcomes of the decisions obtained with these methods, was presented. A simple set of test problems, mimicking real-life decision problems in which a given amount of resources is to be divided equally, was used to test the methods. Using these test problems, we can learn useful lessons about the efficacy of alternative methods that are difficult to learn by examining their theoretical foundations. For example, it was found that, in the long run, a decision-maker who uses an ET approach performs worse than an opponent who uses probability, even when the information about uncertainty is scarce.

Acknowledgements

The work presented in this report has been partially supported by the grant

"Analytical Certification and Multidisciplinary Integration" provided by The Dayton

Area Graduate Study Institute (DAGSI) through the Air Force Institute of Technology.

Appendix: Mathematical derivation of the solution

As mentioned in the main body of this report, John will select the median of the

probability distribution of Y as the dividing point to overcome Linda's advantage of

knowing the true probability distribution of Y. Here we will prove this assertion.

John will assume that Linda will select the subinterval with the higher true probability; that is, the probability of Linda selecting the left subinterval, p(y0), is:

p(y_0) =
\begin{cases}
1 & \text{if } F_Y(y_0) \ge 0.5 \\
0 & \text{if } F_Y(y_0) < 0.5
\end{cases} \qquad (22)

where FY(y0) is the value of the true cumulative probability distribution of Y at Y

= y0. The decision tree is shown in Figure 25.
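Stated compactly (a restatement of the argument above, not an additional result), combining Eq. (22) with the two branch payoffs of Figure 25 gives John's objective as a function of F_Y(y_0) alone:

\text{Objective}(y_0) =
\begin{cases}
1 - 2F_Y(y_0) & \text{if } F_Y(y_0) \ge 0.5 \text{ (Linda takes the left subinterval)} \\
2F_Y(y_0) - 1 & \text{if } F_Y(y_0) < 0.5 \text{ (Linda takes the right subinterval)}
\end{cases}
= -\left| 2F_Y(y_0) - 1 \right| \le 0,

which is zero only when F_Y(y_0) = 0.5, that is, when y_0 is the median of Y.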

Figure 26 shows the objective function as a function of the value of the cumulative probability of Y at y0, FY(y0). It is observed that the optimum value of y0 is the one for which FY(y0) is 0.5, that is, y0 is the median of Y. This is the only choice for which John breaks even; he loses for all other values. Q.E.D.




Figure Captions

Figure 1: Plausibility of A and B. The belief is zero.
Figure 2: Variables A and B in joint space.
Figure 3: Joint maximum probability (plausibility) of A and B.
Figure 4: Body of evidence and plausibility of variable Y. Belief is zero.
Figure 5: Body of evidence and cumulative plausibility and belief of Y.
Figure 6: Prior PDFs of variables A and B.
Figure 7: Evidence from experts.
Figure 8: PDFs of errors of experts for input variables.
Figure 9: Likelihood and posterior probability density functions of variables A and B.
Figure 10: Cumulative plausibility and belief along with cumulative Bayesian PDF for Case 1.
Figure 11: Cumulative plausibility and belief along with cumulative Bayesian PDF for Case 2.
Figure 12: Cumulative plausibility and belief along with cumulative Bayesian PDF for Cases 3, 3a, and 3b.
Figure 13: Cumulative plausibility and belief along with cumulative Bayesian PDF for Cases 4 and 4a.
Figure 14: Cumulative plausibility and belief along with cumulative Bayesian PDF for Case 5.
Figure 15: Cumulative plausibility and belief along with cumulative Bayesian PDF for Case 6.
Figure 16: Cumulative plausibility and belief along with cumulative Bayesian PDF for Cases 7 and 7a.
Figure 17: Uncertainty in ET vs. uncertainty in BT.
Figure 18: Nonspecificity (ET) vs. strife (ET).
Figure 19: Maximum and minimum cumulative probabilities of variables X and Y.
Figure 20: PDF of X.
Figure 21: Body of evidence of variable X.
Figure 22: Body of evidence of variable Y.
Figure 23: Maximum and minimum cumulative probability of Y.
Figure 24: Payoff function graph, probability vs. min-max probability.
Figure 25: Decision tree.
Figure 26: Objective function vs. FY(y0).


Case | Variable A | Variable B | Sizes and positions of intervals | Conflict | Imprecision
1 | [0.2, 0.5] | [0.3, 0.6] | Wide | None | High
2 | [0.1, 0.5], [0.55, 0.95] | [0, 0.5], [0.52, 1] | Wide, disjoint | High | High
3, 3a, 3b | [0.1, 0.5], [0.2, 0.6] | [0, 0.5], [0.2, 0.7] | Wide, overlapping | Low | High
4, 4a | [0.5, 0.52], [0.6, 0.62] | [0.6, 0.62], [0.7, 0.72] | Narrow, conflicting | High | Low
5 | [0.1, 1.0] | [0.6, 0.8], [0.4, 0.85], [0.2, 0.9], [0.0, 1.0] | Wide, overlapping | Low | High
6 | [0.5, 0.7], [0.3, 0.8], [0.1, 1.0] | [0.59, 0.61], [0.4, 0.85], [0.2, 0.9], [0.0, 1.0] | Nested | Low | High
7, 7a | [0.8, 1.0], [0.5, 0.7], [0.1, 0.4] | [0.8, 1.0], [0.5, 0.7], [0.1, 0.4], [0.0, 0.2] | Disjoint | High | High

Table 1: Evidence from experts.


Case | Non-specificity (ET) | Strife (ET) | Shannon's entropy (BT)
1 | A = 0.2624, B = 0.2624, Y = 0.1747 | A = 0, B = 0, Y = 0 | A = -1.484, B = -1.586, Y = -1.62
2 | A = 0.37, B = 0.40, Y = 0.417 | A = 1, B = 1, Y = 0.854 | A = -1.53, B = -1.43, Y = -2.08
3 | A = 0.3365, B = 0.4055, Y = 0.2846 | A = 0.193, B = 0.322, Y = 0.308 | A = -1.53, B = -2.043, Y = -2.46
3a | A = 0.3365, B = 0.4055, Y = 0.2846 | A = 0.193, B = 0.322, Y = 0.308 | A = -1.237, B = -2.2173, Y = -3.08
3b | A = 0.3365, B = 0.4055, Y = 0.2846 | A = 0.193, B = 0.322, Y = 0.308 | A = -1.58, B = -2.11, Y = -2.573
4 | A = 0.0198, B = 0.0198, Y = 0.0247 | A = 1, B = 1, Y = 1.924 | A = -3.59, B = -3.87, Y = -4.03
4a | A = 0.0198, B = 0.0198, Y = 0.0247 | A = 1, B = 1, Y = 1.924 | A = -1.84, B = -1.93, Y = -2.35
5 | A = 0.64, B = 0.444, Y = 0.714 | A = 0, B = 0.358, Y = 0.123 | A = -0.169, B = -2.18, Y = -1.76
6 | A = 0.41, B = 0.404, Y = 0.516 | A = 0.36, B = 0.465, Y = 0.375 | A = -1.81, B = -3.56, Y = -2.58
7 | A = 0.209, B = 0.203, Y = 0.249 | A = 1.585, B = 1.75, Y = 1.805 | A = -2.5, B = -2.24, Y = -2.84
7a | A = 0.209, B = 0.203, Y = 0.249 | A = 1.585, B = 1.75, Y = 1.805 | A = -0.46, B = -0.74, Y = -1.08

Table 2: Uncertainty measures.


ET approach: The analyst does not need to make any additional assumptions beyond what is already available.
BT approach: The analyst assumes the prior and the errors in the experts' estimates; these assumptions can affect the results significantly.

ET approach: Treats uncertainty due to imprecision and conflict separately. Yields maximum and minimum bounds of the probabilities of events, from which the relative magnitude of these two types of uncertainty can be assessed.
BT approach: Does not distinguish between imprecision and conflict. Gives a single value of the probability of an event.

ET approach: The reliability of, and correlation between, experts cannot be taken into account.
BT approach: The experts' reliability and correlation can be taken into account.

ET approach: When the intervals provided by the experts are very broad, the gap between the maximum and minimum probabilities of failure can be very large; a decision-maker might be unable to rank alternative designs in terms of their reliability because of this gap.
BT approach: Since a single value of the probability of failure is provided, it is easier for the decision-maker to rank designs in terms of their reliability.

Table 3: Comparison of ET and BT approaches.

[Figure 1: Plausibility of A and B. The belief is zero.]

[Figure 2: Variables A and B in joint space. Axes: variable A vs. variable B; the focal cells are labeled m = 1/2.]

[Figure 3: Joint maximum probability (plausibility) of A and B.]

[Figure 4: Body of evidence and plausibility of variable Y. Belief is zero. The focal element corresponding to the shaded box of Figure 2 is marked.]

[Figure 5: Body of evidence and cumulative plausibility and belief of Y.]

[Figure 6: Prior PDFs of variables A and B.]

[Figure 7: Evidence from experts.]

[Figure 8: PDFs of errors of experts for input variables A and B.]

[Figure 9: Likelihood and posterior probability density functions of variables A and B.]

[Figure 10: Cumulative plausibility and belief along with cumulative Bayesian PDF for Case 1.]

[Figure 11: Cumulative plausibility and belief along with cumulative Bayesian PDF for Case 2.]

[Figure 12: Cumulative plausibility and belief along with cumulative Bayesian PDF for Cases 3, 3a, and 3b.]

[Figure 13: Cumulative plausibility and belief along with cumulative Bayesian PDF for Cases 4 and 4a.]

[Figure 14: Cumulative plausibility and belief along with cumulative Bayesian PDF for Case 5.]

[Figure 15: Cumulative plausibility and belief along with cumulative Bayesian PDF for Case 6.]

[Figure 16: Cumulative plausibility and belief along with cumulative Bayesian PDF for Cases 7 and 7a.]

[Figure 17: Uncertainty in ET vs. uncertainty in BT. Axes: Shannon's entropy (BT) vs. total uncertainty (nonspecificity + strife, ET); points labeled Cases 1 through 7a.]

[Figure 18: Nonspecificity (ET) vs. strife (ET). Axes: strife vs. nonspecificity; points labeled Cases 1 through 7a.]

[Figure 19: Maximum and minimum cumulative probabilities of variables X and Y.]

[Figure 20: PDF of X.]

[Figure 21: Body of evidence of variable X.]

[Figure 22: Body of evidence of variable Y: m([0.16, 0.36]) = 1/3, m([0.09, 0.49]) = 1/3, m([0, 1]) = 1/3.]

[Figure 23: Maximum and minimum cumulative probability of Y.]

[Figure 24: Payoff function graph, probability vs. min-max probability. Axes: exponent bias factor (0.5 to 1.5) vs. John's payoff function (-1 to 1); curves for "John uses probability" and "John uses min-max probability" when John splits the interval.]

[Figure 25: Decision tree. John selects y0; Linda selects the left subinterval with probability p(y0), giving objective function P(IYB) - P(IYC) = 1 - 2FY(y0), or the right subinterval with probability 1 - p(y0), giving objective function P(IYB) - P(IYC) = 2FY(y0) - 1.]

[Figure 26: Objective function vs. FY(y0).]