Bayesian Networks October 9, 2008 Sung-Bae Cho


Bayesian Networks

October 9, 2008

Sung-Bae Cho


Agenda

• Bayesian Network
– Introduction
– Inference of Bayesian Network
– Modeling of Bayesian Network
• Bayesian Network Application
– Bayesian Network Application Example
– Life Log Application Example
• Summary & Review


Probabilities

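• Probability distribution $P(X \mid \xi)$
– $X$ is a random variable
  • Discrete random variable: finite set of possible outcomes
  • Continuous random variable: probability distribution (density function) over continuous values
– $\xi$ is the background state of information

Discrete case: $X \in \{x_1, x_2, x_3, \ldots, x_n\}$, with $P(x_i) \ge 0$ and $\sum_{i=1}^{n} P(x_i) = 1$
– $X$ binary: $P(x) + P(\neg x) = 1$
[Figure: bar chart of a discrete distribution over X1, X2, X3, X4; vertical axis from 0 to 0.4]

Continuous case: $X \in [0, 10]$, with $P(x) \ge 0$ and $\int_{0}^{10} P(x)\,dx = 1$
– $P(5 \le x \le 7) = \int_{5}^{7} P(x)\,dx$
[Figure: density curve $P(x)$ with the interval from $x = 5$ to $x = 7$ marked]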


Rules of Probability

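• Product Rule
$P(X, Y) = P(X \mid Y)\,P(Y) = P(Y \mid X)\,P(X)$

• Marginalization
$P(Y) = \sum_{i=1}^{n} P(Y, x_i)$
– $X$ binary: $P(Y) = P(Y, x) + P(Y, \neg x)$

• Bayes Rule
$P(H, E) = P(H \mid E)\,P(E) = P(E \mid H)\,P(H)$
$P(H \mid E) = \dfrac{P(E \mid H)\,P(H)}{P(E)}$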


Rules of Probability

• Chain rule of probability

$p(x) = p(x_1, \ldots, x_n)$
$\quad = p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_1, x_2) \cdots$
$\quad = \prod_{i=1}^{n} p(x_i \mid x_1, \ldots, x_{i-1})$


The Joint Distribution

• Recipe for making a joint distribution of M variables

– Make a truth table listing all combinations of values of your variables

– For each combination of values, say how probable it is


Using the Joint

• Once you have the joint distribution, you can ask for the probability of any logical expression involving your attributes

$P(E) = \sum_{\text{rows matching } E} P(\text{row})$


Using the Joint

$P(E) = \sum_{\text{rows matching } E} P(\text{row})$

– $P(\text{Poor} \wedge \text{Male}) = 0.4654$
– $P(\text{Poor}) = 0.7604$


Inference with Joint

$P(E_1 \mid E_2) = \dfrac{P(E_1, E_2)}{P(E_2)} = \dfrac{\sum_{\text{rows matching } E_1 \text{ and } E_2} P(\text{row})}{\sum_{\text{rows matching } E_2} P(\text{row})}$


Inference with the Joint

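$P(E_1 \mid E_2) = \dfrac{P(E_1, E_2)}{P(E_2)} = \dfrac{\sum_{\text{rows matching } E_1 \text{ and } E_2} P(\text{row})}{\sum_{\text{rows matching } E_2} P(\text{row})}$

$P(\text{Male} \mid \text{Poor}) = 0.4654 / 0.7604 = 0.612$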
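The row-summing recipe can be sketched in a few lines of Python. This is only an illustration: the slides do not reproduce the full joint table, so the two "rich" rows below are invented to make the table sum to 1, and the names `JOINT`, `prob`, and `cond_prob` are hypothetical. The two "poor" rows follow from the slide's values of 0.4654 and 0.7604.

```python
# Joint distribution as a table: one probability per combination of values.
# The "poor" rows follow from P(Poor and Male) = 0.4654 and P(Poor) = 0.7604;
# the "rich" rows are invented (hypothetical) so that the table sums to 1.
JOINT = {
    ("male", "poor"): 0.4654,
    ("female", "poor"): 0.2950,
    ("male", "rich"): 0.1400,    # hypothetical value
    ("female", "rich"): 0.0996,  # hypothetical value
}

def prob(event):
    """P(E): sum the probabilities of all rows for which event(row) is true."""
    return sum(p for row, p in JOINT.items() if event(row))

def cond_prob(event1, event2):
    """P(E1 | E2) = P(E1 and E2) / P(E2), both computed by row sums."""
    both = prob(lambda row: event1(row) and event2(row))
    return both / prob(event2)

def is_male(row):
    return row[0] == "male"

def is_poor(row):
    return row[1] == "poor"

p_male_given_poor = cond_prob(is_male, is_poor)
print(round(p_male_given_poor, 3))  # 0.612
```

Any logical expression over the attributes can be passed as `event`, which is the point of the slide: the joint answers every such query, at the cost of storing every row.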


Joint Distributions

• Good news

– Once you have a joint distribution, you can ask important questions about stuff that involves a lot of uncertainty

• Bad news

– Impossible to create for more than about ten attributes because there are so many numbers needed when you build them


Bayesian Networks

• In general

– $P(X_1, \ldots, X_n)$ needs at least $2^n - 1$ numbers to specify the joint probability

– Exponential storage and inference

• Overcome the problem of exponential size by exploiting conditional independence

[Figure: the real joint probability distribution ($2^n - 1$ numbers), combined with conditional independence (from domain knowledge or derived from data), yields a graphical representation of the joint probability distribution (a Bayesian network)]


Bayesian Networks?

$BN = (G, \Theta)$

Conditional independencies give an efficient representation:
$P(A, S, T, L, B, C, D) = P(A)\,P(S)\,P(T \mid A)\,P(L \mid S)\,P(B \mid S)\,P(C \mid T, L)\,P(D \mid T, L, B)$

Nodes and their local distributions [Lauritzen & Spiegelhalter, 95]:
– Visit to Asia (A): P(A)
– Smoking (S): P(S)
– Tuberculosis (T): P(T|A)
– Lung Cancer (L): P(L|S)
– Bronchitis (B): P(B|S)
– Chest X-ray (C): P(C|T,L)
– Dyspnoea (D): P(D|T,L,B)

CPD for P(D|T,L,B):

T  L  B    D=0   D=1
0  0  0    0.1   0.9
0  0  1    0.7   0.3
0  1  0    0.8   0.2
0  1  1    0.9   0.1
...
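To make the saving concrete, the following sketch counts free parameters for the full joint versus the factored form above. It assumes every variable is binary (as the CPD fragment suggests); the dictionary `PARENTS` and both helper functions are illustrative names, not part of the slides.

```python
# Parents of each node, read off the factorization
# P(A) P(S) P(T|A) P(L|S) P(B|S) P(C|T,L) P(D|T,L,B).
PARENTS = {
    "A": [], "S": [],
    "T": ["A"], "L": ["S"], "B": ["S"],
    "C": ["T", "L"], "D": ["T", "L", "B"],
}

def full_joint_size(n_vars, arity=2):
    """Free numbers in an unrestricted joint table: arity^n - 1."""
    return arity ** n_vars - 1

def factored_size(parents, arity=2):
    """Each node needs (arity - 1) free numbers per configuration of its parents."""
    return sum((arity - 1) * arity ** len(pa) for pa in parents.values())

print(full_joint_size(len(PARENTS)))  # 127
print(factored_size(PARENTS))         # 20
```

Under the binary assumption the network needs 20 numbers instead of 127, and the gap widens exponentially as variables are added while the number of parents per node stays small.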


Bayesian Networks?

• Structured, graphical representation of probabilistic relationships between several random variables

• Explicit representation of conditional independencies

• Missing arcs encode conditional independence

• Efficient representation of the joint probability distribution

• Allows arbitrary queries to be answered

• P(lung cancer=yes | smoking=no, dyspnoea=yes)=?


Bayesian Networks?

• Also called belief networks, and (directed acyclic) graphical models

• Bayesian network

– Directed acyclic graph

• Nodes are variables (discrete or continuous)

• Arcs indicate dependence between variables

– Conditional Probabilities (local distributions)


Bayesian Networks

Smoking → Cancer

$S \in \{\text{no}, \text{light}, \text{heavy}\}$
$C \in \{\text{none}, \text{benign}, \text{malignant}\}$

Prior P(S):

P(S=no)     0.80
P(S=light)  0.15
P(S=heavy)  0.05

Conditional P(C|S):

Smoking =     no     light  heavy
P(C=none)     0.96   0.88   0.60
P(C=benign)   0.03   0.08   0.25
P(C=malig)    0.01   0.04   0.15


Product Rule

• P(C,S) = P(C|S) P(S)

Joint distribution P(C,S); the row totals give P(Smoke) and the column totals give P(Cancer):

S \ C    none    benign  malignant  total
no       0.768   0.024   0.008      0.80
light    0.132   0.012   0.006      0.15
heavy    0.035   0.010   0.005      0.05
total    0.935   0.046   0.019


Bayes Rule

$P(S \mid C) = \dfrac{P(C \mid S)\,P(S)}{P(C)} = \dfrac{P(C, S)}{P(C)}$

Each joint entry is divided by its column total P(C):

S \ C    none         benign       malignant
no       0.768/.935   0.024/.046   0.008/.019
light    0.132/.935   0.012/.046   0.006/.019
heavy    0.035/.935   0.010/.046   0.005/.019

Resulting posterior P(S|C):

Cancer =     none    benign  malignant
P(S=no)      0.821   0.522   0.421
P(S=light)   0.141   0.261   0.316
P(S=heavy)   0.037   0.217   0.263
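The posterior table can be reproduced directly from the joint table of the previous slide. The sketch below takes that joint table as given (the variable names are illustrative) and applies $P(S \mid C) = P(C, S) / P(C)$ column by column.

```python
STATES_S = ["no", "light", "heavy"]
STATES_C = ["none", "benign", "malignant"]

# Joint P(C, S) from the Product Rule slide: one row per smoking state,
# one column per cancer state (in the order of STATES_C).
JOINT = {
    "no":    [0.768, 0.024, 0.008],
    "light": [0.132, 0.012, 0.006],
    "heavy": [0.035, 0.010, 0.005],
}

# Marginalization: P(C) is the column total of the joint table.
p_c = [sum(JOINT[s][j] for s in STATES_S) for j in range(len(STATES_C))]

# Bayes rule: P(S | C) = P(C, S) / P(C).
posterior = {
    s: [JOINT[s][j] / p_c[j] for j in range(len(STATES_C))]
    for s in STATES_S
}

for s in STATES_S:
    print(s, [round(v, 3) for v in posterior[s]])
```

Each column of the result sums to 1, since it is a distribution over smoking states for one fixed cancer state.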


Missing Arcs Represent Conditional Independence

Battery → Engine Turns Over → Start

Start and Battery are independent, given Engine Turns Over:
$p(s \mid b, t) = p(s \mid t)$
$p(b, t, s) = p(b)\,p(t \mid b)\,p(s \mid t)$

General product (chain) rule for Bayesian networks:
$p(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(x_i \mid \mathrm{parents}(x_i))$


Bayesian Network

Nodes: Age (A), Gender (G), Exposure to Toxics (E), Smoking (S), Cancer (C), Lung Tumor (L), Serum Calcium (SC)

$P(A, G, E, S, C, L, SC) = P(A)\,P(G)\,P(E \mid A)\,P(S \mid A, G)\,P(C \mid E, S)\,P(L \mid C)\,P(SC \mid C)$


Bayesian Network Knowledge Engineering

• Objective: Construct a model to perform a defined task

• Participants: Collaboration between domain expert and BN modeling expert

• Process: iterate until “done”

– Define task objective

– Construct model

– Evaluate model


The KE Process

• What are the variables?

• What are their values/states?

• What is the graph structure?

• What is the local model structure?

• What are the parameters (Probabilities)?

• What are the preferences (utilities)?


The Knowledge Acquisition Task

• Variables:

– collectively exhaustive, mutually exclusive values

– clarity test: value should be knowable in principle

• Structure

– if data available, can be learned

– constructed by hand (using “expert” knowledge)

– variable ordering matters: causal knowledge usually simplifies

• Probabilities

– can be learned from data

– second decimal usually does not matter; relative probabilities

– sensitivity analysis


What are the Variables?

• “Focus” or “query” variables

– Variables of interest

• “Evidence” or “Observation” variables

– What sources of evidence are available?

• “Context” variables

– Sensing conditions, background causal conditions

• Start with query variables and spread out to related variables


What are the Values/States?

• Variables/values must be exclusive and exhaustive

– Naive modelers sometimes create separate (often Boolean) variables for different states of the same variable

• Types of variables

– Binary (2-valued, including Boolean)

– Qualitative

– Numeric discrete

– Numeric continuous

• Dealing with infinite and continuous domains

– Some BN software requires that continuous variables be discretized

– Discretization should be based on differences in effect on related variables (i.e., not just be even sized chunks)


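What is a variable?

• Collectively exhaustive, mutually exclusive values
– Exhaustive: $x_1 \lor x_2 \lor x_3 \lor x_4$
– Exclusive: $\neg(x_i \land x_j),\ i \neq j$
– Example values: Error Occurred / No Error

• Values versus Probabilities
– Example: “Smoking” versus “Risk of Smoking”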


What is the Graph Structure?

• Goals in specifying graph structure

– Minimize probability elicitation: fewer nodes, fewer arcs, smaller state spaces

– Maximize fidelity of model

• Sometimes requires more nodes, arcs, and states

• Tradeoff between more accurate model and cost of additional modeling

• Too much detail can decrease accuracy


Inference

• We now have compact representations of probability distributions: Bayesian Networks

• Network describes a unique probability distribution P

• How do we answer queries about P?

• We use inference as a name for the process of computing answers to such queries


Inference - The Good & Bad News

• We can do inference

• We can compute any conditional probability

• P( Some variables | Some other variable values )

$P(E_1 \mid E_2) = \dfrac{P(E_1, E_2)}{P(E_2)} = \dfrac{\sum_{\text{joint entries matching } E_1 \text{ and } E_2} P(\text{row})}{\sum_{\text{joint entries matching } E_2} P(\text{row})}$

• The sad, bad news
– Computing conditional probabilities by enumerating all matching entries in the joint is expensive: exponential in the number of variables

• Sadder and worse news
– General querying of Bayesian networks is NP-hard

• Hardness does not mean we cannot solve inference
– It implies that we cannot find a general procedure that works efficiently for all networks
– For particular families of networks, we can have provably efficient procedures


Example

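Network: H → G ← M, G → J, G → S

$P(H{=}T \mid S{=}T) = \dfrac{P(H{=}T, S{=}T)}{P(S{=}T)} = \dfrac{\sum_{G, J, M} P(G, H{=}T, J, M, S{=}T)}{\sum_{G, H, J, M} P(G, H, J, M, S{=}T)}$

– Numerator: $2^3$ joint probability (JP) computations
– Denominator: $2^4$ JP computations

P(H):  H=T 0.7   H=F 0.3
P(M):  M=T 0.3   M=F 0.7

P(G | H, M):

        H=T          H=F
        M=T   M=F    M=T   M=F
G=T     0.8   0.6    0.7   0.3
G=F     0.2   0.4    0.3   0.7

P(J | G):

        G=T   G=F
J=T     0.8   0.6
J=F     0.2   0.4

P(S | G):

        G=T   G=F
S=T     0.6   0.7
S=F     0.4   0.3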
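The query on this slide can be evaluated by brute-force enumeration. The sketch below assumes the structure H → G ← M, G → J, G → S, which is what the shapes of the tables suggest; the function names are illustrative. It also counts how many joint probabilities each sum touches.

```python
from itertools import product

# Table entries from the slide; each value is the probability of "True".
P_H = 0.7
P_M = 0.3
P_G = {(True, True): 0.8, (True, False): 0.6,
       (False, True): 0.7, (False, False): 0.3}   # P(G=T | H, M)
P_J = {True: 0.8, False: 0.6}                     # P(J=T | G)
P_S = {True: 0.6, False: 0.7}                     # P(S=T | G)

def bern(p_true, value):
    """Probability of a Boolean value when P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(h, m, g, j, s):
    """One joint probability (JP), assuming H -> G <- M, G -> J, G -> S."""
    return (bern(P_H, h) * bern(P_M, m) * bern(P_G[(h, m)], g)
            * bern(P_J[g], j) * bern(P_S[g], s))

def sum_joint(fixed):
    """Sum the joint over every variable not in `fixed`; also count the terms."""
    free = [name for name in ("h", "m", "g", "j", "s") if name not in fixed]
    total, n_terms = 0.0, 0
    for values in product([True, False], repeat=len(free)):
        total += joint(**fixed, **dict(zip(free, values)))
        n_terms += 1
    return total, n_terms

numerator, n_num = sum_joint({"h": True, "s": True})
denominator, n_den = sum_joint({"s": True})
print(n_num, n_den)                        # 8 16
print(round(numerator / denominator, 4))   # 0.6921
```

The term counts match the $2^3$ and $2^4$ joint probability computations noted on the slide; with n free binary variables the count grows as $2^n$, which is what motivates the smarter algorithms that follow.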


Queries: Likelihood

• There are many types of queries we might ask.

• Most of these involve evidence

– Evidence e is an assignment of values to a set E of variables in the domain
– Without loss of generality, $E = \{X_{k+1}, \ldots, X_n\}$

• Simplest query: compute probability of evidence

$P(e) = \sum_{x_1} \cdots \sum_{x_k} P(x_1, \ldots, x_k, e)$

• This is often referred to as computing the likelihood of the evidence


Queries

• Often we are interested in the conditional probability of a variable given the evidence

$P(X \mid e) = \dfrac{P(X, e)}{P(e)}$

• This is the a posteriori belief in X, given evidence e
• A related task is computing the term P(X, e)
– i.e., the likelihood of e and X = x for values of X
– we can recover the a posteriori belief by

$P(X = x \mid e) = \dfrac{P(X = x, e)}{\sum_{x'} P(X = x', e)}$


A Posteriori Belief

This query is useful in many cases:

• Prediction: what is the probability of an outcome given the starting condition

– Target is a descendant of the evidence

• Diagnosis: what is the probability of disease/fault given symptoms

– Target is an ancestor of the evidence

• As we shall see, the direction between variables does not restrict the directions of the queries

– Probabilistic inference can combine evidence from all parts of the network


Approaches to Inference

• Exact inference

– Inference in Simple Chains

– Variable elimination

– Clustering / join tree algorithms

• Approximate inference

– Stochastic simulation / sampling methods

– Markov chain Monte Carlo methods

– Mean field theory


Inference in Simple Chains

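X1 → X2

• How do we compute P(X2)?

$P(x_2) = \sum_{x_1} P(x_1, x_2) = \sum_{x_1} P(x_1)\,P(x_2 \mid x_1)$

X1 → X2 → X3

• How do we compute P(X3)?

$P(x_3) = \sum_{x_2} P(x_2, x_3) = \sum_{x_2} P(x_2)\,P(x_3 \mid x_2)$

• We already know how to compute P(X2):

$P(x_2) = \sum_{x_1} P(x_1, x_2) = \sum_{x_1} P(x_1)\,P(x_2 \mid x_1)$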


Inference in Simple Chains

How do we compute P(Xn)?

X1 → X2 → X3 → … → Xn

• Compute P(X1), P(X2), P(X3), …
• We compute each term by using the previous one

$P(x_{i+1}) = \sum_{x_i} P(x_i)\,P(x_{i+1} \mid x_i)$

• Complexity:
– Each step costs $O(|Val(X_i)| \cdot |Val(X_{i+1})|)$ operations
• Compare to naïve evaluation, which requires summing over joint values of n−1 variables

$P(x_n) = \sum_{x_1, \ldots, x_{n-1}} P(x_1, \ldots, x_n)$
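A small sketch of the two evaluation strategies for a chain, using made-up numbers (the prior and the transition table below are hypothetical, as are the function names). `forward_marginals` applies the one-step recursion; `naive_last_marginal` sums the full joint over all earlier variables.

```python
from itertools import product

def forward_marginals(prior, transitions):
    """Marginals P(X1), P(X2), ... of a chain, each computed from the previous one.

    prior[a] = P(X1 = a); transitions[i][a][b] = P(X_{i+2} = b | X_{i+1} = a).
    """
    marginal = list(prior)
    marginals = [marginal]
    for table in transitions:
        marginal = [
            sum(marginal[a] * table[a][b] for a in range(len(marginal)))
            for b in range(len(table[0]))
        ]
        marginals.append(marginal)
    return marginals

def naive_last_marginal(prior, transitions):
    """P(Xn) by summing the joint over all values of X1, ..., X_{n-1}."""
    sizes = [len(prior)] + [len(table[0]) for table in transitions]
    result = [0.0] * sizes[-1]
    for xs in product(*(range(k) for k in sizes)):
        p = prior[xs[0]]
        for i, table in enumerate(transitions):
            p *= table[xs[i]][xs[i + 1]]
        result[xs[-1]] += p
    return result

# Hypothetical chain X1 -> X2 -> X3 -> X4 over binary variables.
prior = [0.6, 0.4]
step = [[0.9, 0.1], [0.2, 0.8]]
transitions = [step, step, step]

fast = forward_marginals(prior, transitions)
slow = naive_last_marginal(prior, transitions)
print([round(p, 4) for p in fast[-1]])
print([round(p, 4) for p in slow])
```

Both calls print the same marginal for the last variable, but the forward pass does a constant amount of work per link while the naive sum visits every joint assignment.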


Elimination in Chains

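A → B → C → D → E

$P(e) = \sum_{d} \sum_{c} \sum_{b} \sum_{a} P(a, b, c, d, e)$
$\quad = \sum_{d} \sum_{c} \sum_{b} \sum_{a} P(a)\,P(b \mid a)\,P(c \mid b)\,P(d \mid c)\,P(e \mid d)$
$\quad = \sum_{d} \sum_{c} \sum_{b} P(c \mid b)\,P(d \mid c)\,P(e \mid d) \sum_{a} P(a)\,P(b \mid a)$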


Elimination in Chains

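A → B → C → D → E, with A eliminated:

$P(e) = \sum_{d} \sum_{c} \sum_{b} P(c \mid b)\,P(d \mid c)\,P(e \mid d) \sum_{a} P(a)\,P(b \mid a)$
$\quad = \sum_{d} \sum_{c} \sum_{b} P(c \mid b)\,P(d \mid c)\,P(e \mid d)\,p(b)$

A → B → C → D → E, with A and B eliminated:

$P(e) = \sum_{d} \sum_{c} \sum_{b} P(c \mid b)\,P(d \mid c)\,P(e \mid d)\,p(b)$
$\quad = \sum_{d} \sum_{c} P(d \mid c)\,P(e \mid d) \sum_{b} P(c \mid b)\,p(b)$
$\quad = \sum_{d} \sum_{c} P(d \mid c)\,P(e \mid d)\,p(c)$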


Variable Elimination

General idea:

• Write query in the form

$P(X_n, e) = \sum_{x_k} \cdots \sum_{x_3} \sum_{x_2} \prod_{i} P(x_i \mid pa_i)$

• Iteratively
– Move all irrelevant terms outside of innermost sum
– Perform innermost sum, getting a new term
– Insert the new term into the product


Stochastic Simulation

• Suppose you are given values for some subset of the variables, G, and want to infer values for unknown variables, U

• Randomly generate a very large number of instantiations from the BN

– Generate instantiations for all variables – start at root variables and work your way “forward”

• Only keep those instantiations that are consistent with the values for G

• Use the frequency of values for U to get estimated probabilities

• Accuracy of the results depends on the size of the sample (asymptotically approaches exact results)
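The procedure above is often called rejection sampling. Here is a minimal sketch on the five-variable network from the earlier example slide, assuming the structure H → G ← M, G → J, G → S; the function names, sample size, and seed are illustrative choices.

```python
import random

# P(G=T | H, M) from the earlier example slide.
P_G = {(True, True): 0.8, (True, False): 0.6,
       (False, True): 0.7, (False, False): 0.3}

def sample_network(rng):
    """Draw one full instantiation, roots first, then children ("forward")."""
    h = rng.random() < 0.7
    m = rng.random() < 0.3
    g = rng.random() < P_G[(h, m)]
    j = rng.random() < (0.8 if g else 0.6)
    s = rng.random() < (0.6 if g else 0.7)
    return {"H": h, "M": m, "G": g, "J": j, "S": s}

def rejection_estimate(query_var, evidence, n_samples, seed=0):
    """Estimate P(query_var = True | evidence) from the samples that match the evidence."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n_samples):
        sample = sample_network(rng)
        if all(sample[name] == value for name, value in evidence.items()):
            kept += 1
            hits += sample[query_var]
    return hits / kept

estimate = rejection_estimate("H", {"S": True}, n_samples=200_000)
print(round(estimate, 3))  # should land close to the exact value of about 0.692
```

Samples that contradict the evidence are simply thrown away, so the method becomes wasteful when the evidence is unlikely: most of the work goes into samples that are never used.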


Markov Chain Monte Carlo Methods

• So called because
– Markov chain: each instance generated in the sample is dependent on the previous instance
– Monte Carlo: statistical sampling method

• Perform a random walk through variable assignment space, collecting statistics as you go
– Start with a random instantiation, consistent with evidence variables
– At each step, for some non-evidence variable, randomly sample its value, consistent with the other current assignments

• Given enough samples, MCMC gives an accurate estimate of the true distribution of values
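One common instance of this scheme is Gibbs sampling, sketched below on the five-variable network from the earlier example slide (structure H → G ← M, G → J, G → S is assumed). The slides do not name a specific MCMC variant, so treat this as one possible reading; the burn-in length, step count, and seed are arbitrary choices.

```python
import random

P_G = {(True, True): 0.8, (True, False): 0.6,
       (False, True): 0.7, (False, False): 0.3}   # P(G=T | H, M)

def bern(p_true, value):
    return p_true if value else 1.0 - p_true

def joint(a):
    """Joint probability of a full assignment {name: bool}."""
    return (bern(0.7, a["H"]) * bern(0.3, a["M"])
            * bern(P_G[(a["H"], a["M"])], a["G"])
            * bern(0.8 if a["G"] else 0.6, a["J"])
            * bern(0.6 if a["G"] else 0.7, a["S"]))

def gibbs_estimate(query_var, evidence, n_steps, burn_in=1000, seed=0):
    """Estimate P(query_var = True | evidence) with a random-scan Gibbs sampler."""
    rng = random.Random(seed)
    hidden = [v for v in ("H", "M", "G", "J", "S") if v not in evidence]
    # Start with a random instantiation that agrees with the evidence.
    state = dict(evidence)
    for v in hidden:
        state[v] = rng.random() < 0.5
    hits = count = 0
    for step in range(n_steps):
        var = rng.choice(hidden)
        # P(var | everything else) is proportional to the joint with var set each way.
        p_true = joint({**state, var: True})
        p_false = joint({**state, var: False})
        state[var] = rng.random() < p_true / (p_true + p_false)
        if step >= burn_in:
            hits += state[query_var]
            count += 1
    return hits / count

estimate = gibbs_estimate("H", {"S": True}, n_steps=200_000)
print(round(estimate, 3))  # should land close to the exact value of about 0.692
```

Unlike rejection sampling, no sample is discarded for contradicting the evidence, because the evidence variables are never resampled. The price is that successive samples are correlated.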


Why Learning

• knowledge-based (expert systems)
– Answer Wizard, Office 95, 97, & 2000
– Troubleshooters, Windows 98 & 2000

• data-based
– Causal discovery
– Data visualization
– Concise model of data
– Prediction


Why Learning?

• Knowledge acquisition is a bottleneck
– Knowledge acquisition is an expensive process
– Often we don’t have an expert

• Data is cheap
– Amount of available information growing rapidly
– Learning allows us to construct models from raw data

• Conditional independencies & graphical language capture structure of many real-world distributions

• Graph structure provides much insight into domain
– Allows “knowledge discovery”

• Learned model can be used for many tasks

• Supports all the features of probabilistic learning
– Model selection criteria
– Dealing with missing data & hidden variables


Why Struggle for Accurate Structure?

• Adding an arc
– Increases the number of parameters to be fitted
– Wrong assumptions about causality and domain structure

• Missing an arc
– Cannot be compensated by accurate fitting of parameters
– Also misses causality and domain structure

[Figure: three versions of a network over Earthquake, Alarm Set, Burglary, and Sound: the original, one with an added arc, and one with a missing arc]


Learning Bayesian Networks from Data

[Figure: data + prior/expert information → Bayesian network learner → Bayesian networks]

Example data:

X1      X2   X3
True    1    0.7
False   5    -1.6
False   3    5.9
True    2    6.3
…       …    …

Example output: a Bayesian network over X1, X2, …, X9


Learning Bayesian Networks

• Known Structure, Complete Data
– Network structure is specified
  • Inducer needs to estimate parameters
– Data does not contain missing values

• Unknown Structure, Complete Data
– Network structure is not specified
  • Inducer needs to select arcs & estimate parameters
– Data does not contain missing values

• Known Structure, Incomplete Data
– Network structure is specified
– Data contains missing values
  • Need to consider assignments to missing values

• Unknown Structure, Incomplete Data
– Network structure is not specified
– Data contains missing values
  • Need to consider assignments to missing values


Two Types of Methods for Learning BNs

• Constraint based

– Finds a Bayesian network structure whose implied independence constraints “match” those found in the data

• Scoring methods (Bayesian, MDL, MML)

– Find the Bayesian network structure that can represent distributions that “match” the data (i.e. could have generated the data)

• Practical considerations

– The number of possible BN structures is super-exponential in the number of variables

– How do we find the best graph(s)?


Approaches to Learning Structure

• Constraint based

– Perform tests of conditional independence

– Search for a network that is consistent with the observed dependencies and independencies

• Pros & Cons

– Intuitive, follows closely the construction of BNs

– Separates structure learning from the form of the independence tests

– Sensitive to errors in individual tests

– Computationally hard


Approaches to Learning Structure

• Score based

– Define a score that evaluates how well the (in)dependencies in a structure match the observations

– Search for a structure that maximizes the score

• Pros & Cons

– Statistically motivated

– Can make compromises

– Takes the structure of conditional probabilities into account

– Computationally hard


Score-based Learning

• Define scoring function that evaluates how well a structure matches the data

• Search for a structure that maximizes the score

Data over (E, B, A): <Y,Y,Y>, <Y,Y,Y>, <N,N,Y>, <N,Y,Y>, …, <N,Y,Y>

[Figure: three candidate structures over E, B, and A]


Structure Search as Optimization

• Input:

– Training data

– Scoring function

– Set of possible structures

• Output

– A network that maximizes the score

• Key computational property: Decomposability

$\mathrm{score}(G) = \sum_{X} \mathrm{score}(\text{family of } X \text{ in } G)$


Tree Structured Networks

• Trees

– At most one parent per variable

• Why trees?

– Elegant math

• We can solve the optimization problem

– Sparse parameterization

• Avoid overfitting


Beyond Trees

• When we consider more complex networks, the problem is not as easy

– Suppose we allow at most two parents per node

– A greedy algorithm is no longer guaranteed to find the optimal network

– In fact, no efficient algorithm is known: the problem is NP-hard


Model Search

• Finding the BN structure with the highest score among those structures with at most k parents is NP hard for k>1 (Chickering, 1995)

• Heuristic methods

– Greedy

– Greedy with restarts

– MCMC methods

[Flowchart: initialize structure → score all possible single changes → any changes better? If yes, perform best change and score again; if no, return saved structure]


Algorithm B


Scoring Functions: MDL

• Minimum Description Length (MDL)

• Learning ⇔ data compression

$MDL(BN \mid D) = -\log P(D \mid \Theta, G) + \dfrac{\log N}{2}\,|\Theta|$

– First term: DL(Data|model)
– Second term: DL(Model)

• Other: MDL = −BIC (Bayesian Information Criterion)

• Bayesian score (BDe): asymptotically equivalent to MDL

[Figure: data records with missing values, e.g. <9.7 0.6 8 14 18>, <0.2 1.3 5 ?? ??>, <1.3 2.8 ?? 0 1>, <?? 5.6 0 10 ??>, …]
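As a rough illustration of the MDL score and of decomposability, the sketch below scores two candidate structures on a small invented data set over E, B, and A (all binary, complete data, maximum-likelihood parameters, natural logarithms). The data, the function names, and the two structures are assumptions made for the example.

```python
import math
from collections import Counter

def family_mdl(data, child, parents):
    """MDL contribution of one family: -log-likelihood + (log N / 2) * #parameters."""
    n = len(data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    family_counts = Counter(
        (tuple(row[p] for p in parents), row[child]) for row in data
    )
    # Maximum-likelihood estimate of P(child | parents) is count / parent count.
    log_lik = sum(
        c * math.log(c / parent_counts[pa]) for (pa, _), c in family_counts.items()
    )
    n_params = 2 ** len(parents)  # one free number per parent configuration (binary child)
    return -log_lik + 0.5 * math.log(n) * n_params

def mdl(data, structure):
    """Decomposability: the score of a graph is the sum of its family scores."""
    return sum(family_mdl(data, child, pa) for child, pa in structure.items())

# Invented data in which A is "Y" exactly when E or B is "Y".
rows = ([("N", "N", "N")] * 40 + [("Y", "N", "Y")] * 20
        + [("N", "Y", "Y")] * 20 + [("Y", "Y", "Y")] * 20)
data = [dict(zip("EBA", r)) for r in rows]

independent = {"E": [], "B": [], "A": []}
common_effect = {"E": [], "B": [], "A": ["E", "B"]}

print(round(mdl(data, independent), 1))    # about 208.8
print(round(mdl(data, common_effect), 1))  # about 148.4
```

The structure with arcs into A pays a larger model penalty (six parameters instead of three) but describes the data so much more compactly that its total description length is lower.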


Typical Operations

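Starting from a network over S, C, E, and D, a single local change is one of:

• Add C → D
• Delete C → E
• Reverse C → E

[Figure: the starting network and the three networks that result from these changes]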


Exploiting Decomposability in Local Search

[Figure: four networks over S, C, E, and D that differ by single arc changes]

• Caching: To update the score after a local change, we only need to re-score the families that were changed in the last move


Greedy Hill-Climbing

• Simplest heuristic local search
– Start with a given network
  • empty network
  • best tree
  • a random network
– At each iteration
  • Evaluate all possible changes
  • Apply change that leads to best improvement in score
  • Reiterate
– Stop when no modification improves score

• Each step requires evaluating approximately n new changes
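A compact sketch of greedy hill-climbing over structures, starting from the empty network. The score is the negated MDL for binary variables with maximum-likelihood parameters, and the data set is invented; both are assumptions for illustration. For brevity this version re-scores whole graphs instead of caching family scores.

```python
import math
from collections import Counter
from itertools import permutations

def family_score(data, child, parents):
    """Negative MDL of one family (binary variables): log-likelihood minus penalty."""
    parents = sorted(parents)
    pa_counts = Counter(tuple(r[p] for p in parents) for r in data)
    fam_counts = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    log_lik = sum(c * math.log(c / pa_counts[pa]) for (pa, _), c in fam_counts.items())
    return log_lik - 0.5 * math.log(len(data)) * 2 ** len(parents)

def total_score(data, graph):
    return sum(family_score(data, v, graph[v]) for v in graph)

def is_acyclic(graph):
    """Depth-first search for a directed cycle in the node -> parents map."""
    state = {}
    def visit(v):
        if state.get(v) == "open":
            return False          # reached a node still on the current path
        if state.get(v) == "done":
            return True
        state[v] = "open"
        ok = all(visit(p) for p in graph[v])
        state[v] = "done"
        return ok
    return all(visit(v) for v in graph)

def neighbors(graph):
    """All acyclic graphs one arc addition, deletion, or reversal away."""
    for x, y in permutations(graph, 2):            # candidate arc x -> y
        new = {v: set(ps) for v, ps in graph.items()}
        if x in graph[y]:
            new[y].discard(x)                       # delete x -> y
            yield new
            rev = {v: set(ps) for v, ps in new.items()}
            rev[x].add(y)                           # reverse to y -> x
            if is_acyclic(rev):
                yield rev
        elif y not in graph[x]:
            new[y].add(x)                           # add x -> y
            if is_acyclic(new):
                yield new

def hill_climb(data, variables):
    graph = {v: set() for v in variables}           # start with the empty network
    best = total_score(data, graph)
    while True:
        top_score, top = max(
            ((total_score(data, g), g) for g in neighbors(graph)),
            key=lambda pair: pair[0],
        )
        if top_score <= best + 1e-9:                # no modification improves the score
            return graph, best
        graph, best = top, top_score

# Invented data in which A is "Y" exactly when E or B is "Y".
rows = ([("N", "N", "N")] * 40 + [("Y", "N", "Y")] * 20
        + [("N", "Y", "Y")] * 20 + [("Y", "Y", "Y")] * 20)
data = [dict(zip("EBA", r)) for r in rows]

empty_score = total_score(data, {v: set() for v in "EBA"})
learned, learned_score = hill_climb(data, ["E", "B", "A"])
print({v: sorted(ps) for v, ps in learned.items()}, round(learned_score, 1))
```

Because several structures score equally, which arcs come out depends on tie-breaking, so the printed graph is one of a few equivalent answers. The search stops at the first structure that no single arc change improves, which is exactly the local-maximum behaviour discussed next.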


Greedy Hill-Climbing: Possible Pitfalls

• Greedy Hill-Climbing can get stuck in:
– Local Maxima:
  • All one-edge changes reduce the score
– Plateaus:
  • Some one-edge changes leave the score unchanged
  • Happens because equivalent networks receive the same score and are neighbors in the search space

• Both occur during structure search

• Standard heuristics can escape both
– Random restarts
– TABU search


What are BNs useful for?

• Prediction: P(symptom|cause)=?
• Diagnosis: P(cause|symptom)=?
• Classification: $\max_{\text{class}} P(\text{class} \mid \text{data})$
• Decision-making (given a cost function)
• Data mining: induce best model from data

Application areas: Medicine, Bio-informatics, Computer troubleshooting, Stock market, Text Classification, Speech recognition

[Figure: Predictive Inference reasons from Cause to Effect; Diagnostic Reasoning reasons from Effect to Cause]


Why use BNs?

• Explicit management of uncertainty

• Modularity (modular specification of a joint distribution) implies maintainability

• Better, flexible and robust decision making – MEU (Maximization of Expected Utility), VOI (Value of Information)

• Can be used to answer arbitrary queries - multiple fault problems (General purpose “inference” algorithm)

• Easy to incorporate prior knowledge

• Easy to understand


Example from Medical Diagnostics

• Network represents a knowledge structure that models the relationship between medical difficulties, their causes and effects, patient information and diagnostic tests

Network nodes, grouped by role:
– Patient Information: Visit to Asia, Smoking
– Medical Difficulties: Tuberculosis, Lung Cancer, Bronchitis, Tuberculosis or Cancer
– Diagnostic Tests: XRay Result, Dyspnea


Example from Medical Diagnostics

• Relationship knowledge is modeled by deterministic functions, logic and conditional probability distributions

Deterministic (logical OR) table for "Tuberculosis or Cancer":

Tuberculosis   Lung Cancer   Tub or Can
Present        Present       True
Present        Absent        True
Absent         Present       True
Absent         Absent        False

Conditional probability table for "Dyspnea":

Tub or Can   Bronchitis   Dyspnea = Present   Dyspnea = Absent
True         Present      0.90                0.10
True         Absent       0.70                0.30
False        Present      0.80                0.20
False        Absent       0.10                0.90
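The two kinds of relationship on this slide, a deterministic function and a conditional probability table, can be written down directly. The sketch below uses only the Dyspnea probabilities and the OR logic given on the slide; the function names are hypothetical, and `marginal_dyspnea` assumes the two parents are independent, which is an assumption of this example and not something the slide states.

```python
# Deterministic node: "Tuberculosis or Cancer" is the logical OR of its parents.
def tub_or_can(tuberculosis, lung_cancer):
    return tuberculosis or lung_cancer


# CPT from the slide: P(Dyspnea = Present | Tub or Can, Bronchitis).
P_DYSPNEA_PRESENT = {
    (True, True): 0.90,
    (True, False): 0.70,
    (False, True): 0.80,
    (False, False): 0.10,
}


def p_dyspnea(present, tub_or_can_value, bronchitis):
    """Look up one CPT entry; the 'Absent' row is the complement of 'Present'."""
    p = P_DYSPNEA_PRESENT[(tub_or_can_value, bronchitis)]
    return p if present else 1.0 - p


def marginal_dyspnea(p_tub_or_can, p_bronchitis):
    """P(Dyspnea = Present), assuming the two parents are independent."""
    total = 0.0
    for t in (True, False):
        for b in (True, False):
            weight = ((p_tub_or_can if t else 1.0 - p_tub_or_can)
                      * (p_bronchitis if b else 1.0 - p_bronchitis))
            total += weight * p_dyspnea(True, t, b)
    return total


print(marginal_dyspnea(0.145, 0.60))
```

The inputs 0.145 and 0.60 are the beliefs the deck reports for "Tuberculosis or Cancer" and "Bronchitis" once both "Visit" and "Smoker" have been entered.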


Example from Medical Diagnostics

• Propagation algorithm processes relationship information to provide an unconditional or marginal probability distribution for each node

• The unconditional or marginal probability distribution is frequently called the belief function of that node

Beliefs (%):
Tuberculosis: Present 1.04, Absent 99.0
XRay Result: Abnormal 11.0, Normal 89.0
Tuberculosis or Cancer: True 6.48, False 93.5
Lung Cancer: Present 5.50, Absent 94.5
Dyspnea: Present 43.6, Absent 56.4
Bronchitis: Present 45.0, Absent 55.0
Visit To Asia: Visit 1.00, No Visit 99.0
Smoking: Smoker 50.0, NonSmoker 50.0


Example from Medical Diagnostics

• As a finding is entered, the propagation algorithm updates the beliefs attached to each relevant node in the network

• Interviewing the patient produces the information that “Visit to Asia” is “Visit”

• This finding propagates through the network and the belief functions of several nodes are updated

Beliefs (%):
Tuberculosis: Present 5.00, Absent 95.0
XRay Result: Abnormal 14.5, Normal 85.5
Tuberculosis or Cancer: True 10.2, False 89.8
Lung Cancer: Present 5.50, Absent 94.5
Dyspnea: Present 45.0, Absent 55.0
Bronchitis: Present 45.0, Absent 55.0
Visit To Asia: Visit 100, No Visit 0
Smoking: Smoker 50.0, NonSmoker 50.0


Example from Medical Diagnostics

• Further interviewing of the patient produces the finding “Smoking” is “Smoker”

• This information propagates through the network

Beliefs (%):
Tuberculosis: Present 5.00, Absent 95.0
XRay Result: Abnormal 18.5, Normal 81.5
Tuberculosis or Cancer: True 14.5, False 85.5
Lung Cancer: Present 10.0, Absent 90.0
Dyspnea: Present 56.4, Absent 43.6
Bronchitis: Present 60.0, Absent 40.0
Visit To Asia: Visit 100, No Visit 0
Smoking: Smoker 100, NonSmoker 0


Example from Medical Diagnostics

• Finished with interviewing the patient, the physician begins the examination

• The physician now moves to specific diagnostic tests such as an X-Ray, which results in a “Normal” finding which propagates through the network

• Note that the information from this finding propagates backward and forward through the arcs

Beliefs (%):
Tuberculosis: Present 0.12, Absent 99.9
XRay Result: Abnormal 0, Normal 100
Tuberculosis or Cancer: True 0.36, False 99.6
Lung Cancer: Present 0.25, Absent 99.8
Dyspnea: Present 52.1, Absent 47.9
Bronchitis: Present 60.0, Absent 40.0
Visit To Asia: Visit 100, No Visit 0
Smoking: Smoker 100, NonSmoker 0


Example from Medical Diagnostics

• The physician also determines that the patient is having difficulty breathing, the finding “Present” is entered for “Dyspnea” and is propagated through the network

• The doctor might now conclude that the patient has bronchitis and does not have tuberculosis or lung cancer

Beliefs (%):
Tuberculosis: Present 0.19, Absent 99.8
XRay Result: Abnormal 0, Normal 100
Tuberculosis or Cancer: True 0.56, False 99.4
Lung Cancer: Present 0.39, Absent 99.6
Dyspnea: Present 100, Absent 0
Bronchitis: Present 92.2, Absent 7.84
Visit To Asia: Visit 100, No Visit 0
Smoking: Smoker 100, NonSmoker 0
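The sequence of findings in this example can be reproduced with a small brute-force sketch. It is not the propagation algorithm the slides refer to: it simply sums the joint distribution over all 256 assignments, which is feasible only because the network is tiny. The Dyspnea table and the OR node come from the slides; the remaining conditional probabilities are assumptions taken from the commonly published version of this "Asia" network, and all variable and function names are hypothetical.

```python
from itertools import product

# Binary variables; True means Visit / Smoker / Present / Abnormal.
VARS = ["asia", "smoke", "tub", "lung", "bronc", "either", "xray", "dysp"]

# P(Dyspnea = Present | Tub or Can, Bronchitis), as given on the slides.
P_DYSP = {(True, True): 0.90, (True, False): 0.70,
          (False, True): 0.80, (False, False): 0.10}


def bern(p_true, value):
    """Probability of a binary value when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(a):
    """Joint probability of one full assignment (dict: variable -> bool)."""
    p = bern(0.01, a["asia"])                               # assumed prior
    p *= bern(0.50, a["smoke"])                             # assumed prior
    p *= bern(0.05 if a["asia"] else 0.01, a["tub"])        # assumed CPT
    p *= bern(0.10 if a["smoke"] else 0.01, a["lung"])      # assumed CPT
    p *= bern(0.60 if a["smoke"] else 0.30, a["bronc"])     # assumed CPT
    p *= bern(1.0 if (a["tub"] or a["lung"]) else 0.0, a["either"])  # logical OR
    p *= bern(0.98 if a["either"] else 0.05, a["xray"])     # assumed CPT
    p *= bern(P_DYSP[(a["either"], a["bronc"])], a["dysp"])
    return p


def belief(query, evidence=None):
    """P(query = True | evidence) by exhaustive enumeration."""
    evidence = evidence or {}
    num = den = 0.0
    for values in product([True, False], repeat=len(VARS)):
        a = dict(zip(VARS, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue  # assignment contradicts a finding
        p = joint(a)
        den += p
        if a[query]:
            num += p
    return num / den


# Enter the findings in the order used on the slides.
findings = {}
for name, value in [("asia", True), ("smoke", True), ("xray", False), ("dysp", True)]:
    findings[name] = value
    print(name, {q: round(100 * belief(q, findings), 2) for q in ("tub", "lung", "bronc")})
```

Under these assumed tables the computed beliefs agree with the percentages on the slides to the displayed precision, for example about 92.2% for Bronchitis once all four findings are entered.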


Applications

• Industrial

– Processor Fault Diagnosis - by Intel

– Auxiliary Turbine Diagnosis - GEMS by GE

– Diagnosis of space shuttle propulsion systems - VISTA by NASA/Rockwell

– Situation assessment for nuclear power plant - NRC

• Military

– Automatic Target Recognition - MITRE

– Autonomous control of unmanned underwater vehicle - Lockheed Martin

– Assessment of Intent

• Medical Diagnosis

– Internal Medicine

– Pathology diagnosis - Intellipath by Chapman & Hall

– Breast Cancer Manager with Intellipath

• Commercial

– Financial Market Analysis

– Information Retrieval

– Software troubleshooting and advice - Windows 95 & Office 97

– Pregnancy and Child Care - Microsoft

– Software debugging - American Airlines’ SABRE online reservation system


Modeling Users in Location-Tracking

• Modeling Users in Location-Tracking Application [Abdelsalam 2004]

– Taxi-calling service that tracks roving users and connects each customer with a suitable taxi

• The user's position is sent at a specified time interval

– The position between reports is uncertain

• General input: direction, last position, velocity

• Proposed solution: consider additional evidence using a Bayesian network

– User habits, user behavior, environment

• The proposed method

– Considers each user's unique goals, personality, and tasks in location reasoning

– Goal: Build location-tracking applications that take into account the individual characteristics, habits, and preferences of the users

W. Abdelsalam, Y. Ebrahim, “Managing uncertainty: Modeling users in location-tracking applications,” IEEE Pervasive Computing, 2004.


Modeling Users

• Temporal variables

– Time information

– Examples: events that occurred, time of year, day of the week, time of day

• Spatial variables

– Information about the user's location

– Examples: building, town, certain part of town, certain road or highway

• Environmental variables

– Environmental information

– Examples: weather conditions, road conditions, special events

• Behavioral variables

– Behavioral features and information

– Examples: typical speeds, resting patterns, preferred work areas, common reactions in certain situations


Collecting User Data

• User behavior log

– User-specific data

– Environment-specific data

– User location data

• Collection frequency depends on the type of data

– Location: periodic

– Event & occurrence: non-periodic


Building the User Model

• Bayesian Networks

– Flexible and automatic techniques are needed: Bayesian Network!

• cf. logic-based methods

• Two major purposes of using a BN

– Reasoning about causes from observations

– Incorporating changed evidence with reduced computation


Using Bayesian Networks

• Taxicab location-tracking application

– Event

– Time of day

• Morning rush hour

• Lunch hour

• Evening rush hour

• Late night

• Etc.

– Source: start location

• Airport, downtown, …

– Destination: end location

– Weather conditions

• Sunny, rainy, snowing, …

– Route

• All available routes are considered

• Whether the route is local or highway is considered

– Speed

• Speed variation range

[Figure: Bayesian network with nodes Last location, Route, Speed, Future location]


Experimental Results (1)

• Comparison with LSR (Last Speed Reported) method

– Distance = velocity × time

• The simulation

– Data: Artificially generated data

– Simulating roving users

• Speed values for routes and weather are simplified

• 45% (less than 10 km/h) + 25% (20 km/h) + 20% (50 km/h) + 10% (100 km/h)

– 200 routes are generated

• local or highway is randomly selected

– A set of Trip Segments (TSs)

• Speed + duration

– Weather

• Good or Bad
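The LSR baseline and the simulated speed mix above can be sketched as follows. This is only an illustration under stated assumptions: 5 km/h stands in for "less than 10 km/h", the segment durations and the way the error is measured are hypothetical, and none of the function names come from the paper.

```python
import random


def lsr_distance_km(last_speed_kmh, elapsed_hours):
    """Last Speed Reported estimate: distance = velocity x time."""
    return last_speed_kmh * elapsed_hours


# Speed mix from the slide; 5 km/h for "less than 10 km/h" is an assumption.
SPEEDS_KMH = [5, 20, 50, 100]
WEIGHTS = [0.45, 0.25, 0.20, 0.10]


def make_trip_segments(n, rng):
    """Hypothetical generator: each trip segment is (speed in km/h, duration in hours)."""
    speeds = rng.choices(SPEEDS_KMH, weights=WEIGHTS, k=n)
    return [(s, rng.uniform(0.05, 0.25)) for s in speeds]


def lsr_error_km(segments):
    """Error of extrapolating the first segment's speed over the whole trip."""
    total_time = sum(duration for _, duration in segments)
    actual = sum(speed * duration for speed, duration in segments)
    predicted = lsr_distance_km(segments[0][0], total_time)
    return abs(predicted - actual)


rng = random.Random(0)  # fixed seed so repeated runs give the same segments
segments = make_trip_segments(5, rng)
print(lsr_error_km(segments))
```

The LSR estimate is exact only while the speed stays constant, which is why speed changes between reports are the source of error that the Bayesian network approach tries to reduce.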


Experimental Results (2)

• Standard distribution of distance error for each reporting interval


Summary & Review

• Bayesian Network

– Inference & Modeling Methods

– Applications

• A Life Log Application Example

– Using BN for inference by user modeling and observation

– Ongoing research

• Detailed modeling: Using SOM

• Route reduction: Using key and secondary route segments

• Application on LBS: Utilization of personalized context for location based services
