Bayesian Nets and Applications

Upload: marshall-giles

Post on 31-Dec-2015

Page 1: Bayesian Nets and Applications

Bayesian Nets and Applications

Page 2: Bayesian Nets and Applications

Naïve Bayes

- What happens if we have more than one piece of evidence?
- If we can assume conditional independence:
  - Overslept and trafficjam are independent, given late
  - A and B are conditionally independent given C just in case B doesn't tell us anything about A if we already know C
  - P(late | overslept ∧ trafficjam) = α P(overslept ∧ trafficjam | late) P(late) = α P(overslept | late) P(trafficjam | late) P(late), where α = 1 / P(overslept ∧ trafficjam) is the normalizing constant
- Naïve Bayes: a single cause directly influences a number of effects, all conditionally independent given the cause
- Independence is often assumed even when it does not hold
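
The update above can be sketched in a few lines of Python. The probabilities below are made-up illustrative values, not numbers from the slides, and the function name is a hypothetical choice. α is computed by normalizing over both hypotheses (late, on time).

```python
from math import prod

def naive_bayes_posterior(prior, likelihoods):
    """Return P(h | e1, ..., en) for each hypothesis h.

    prior:       {hypothesis: P(h)}
    likelihoods: {hypothesis: [P(e1 | h), ..., P(en | h)]}
    Assumes the pieces of evidence are conditionally independent given h.
    """
    # Unnormalized posterior: P(e1 | h) * ... * P(en | h) * P(h)
    unnormalized = {h: prod(likelihoods[h]) * p for h, p in prior.items()}
    # alpha is 1 / P(e1, ..., en), obtained by summing over all hypotheses
    alpha = 1.0 / sum(unnormalized.values())
    return {h: alpha * v for h, v in unnormalized.items()}

# Hypothetical numbers (assumptions for illustration only)
prior = {"late": 0.1, "on_time": 0.9}
likelihoods = {
    "late": [0.6, 0.5],     # P(overslept | late), P(trafficjam | late)
    "on_time": [0.1, 0.2],  # P(overslept | on time), P(trafficjam | on time)
}

posterior = naive_bayes_posterior(prior, likelihoods)
print(posterior)
```

With these assumed numbers the unnormalized scores are 0.03 for late and 0.018 for on time, so the posterior probability of being late is 0.03 / 0.048 = 0.625.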

Page 3: Bayesian Nets and Applications

Bayesian Networks

- A directed acyclic graph in which each node is annotated with quantitative probability information
- A set of random variables makes up the network nodes
- A set of directed links connects pairs of nodes. If there is an arrow from node X to node Y, X is a parent of Y
- Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)) that quantifies the effect of the parents on the node

Page 4: Bayesian Nets and Applications

Example

- Topology of network encodes conditional independence assumptions

Page 5: Bayesian Nets and Applications

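[Figure: the student network. Nodes: Smart, Hard working, Good test taker, Understands material, Exam Grade, Homework Grade. Arrows, as implied by the conditional probability tables on page 7: Smart → Good test taker; Smart and Hard working → Understands material; Good test taker and Understands material → Exam Grade; Understands material → Homework Grade.]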

Page 6: Bayesian Nets and Applications

[Figure: the student network annotated with the tables for Smart, Hard Working, Good Test Taker and Understands Material. The same tables, together with those for Exam Grade and Homework Grade, are listed on page 7.]

Page 7: Bayesian Nets and Applications

Conditional Probability Tables

P(Smart)
  True    False
  .5      .5

P(Hard Working)
  True    False
  .7      .3

P(Good Test Taker | Smart)
  S       True    False
  True    .75     .25
  False   .25     .75

P(Understands Material | Smart, Hard Working)
  S       HW      True    False
  True    True    .95     .05
  True    False   .6      .4
  False   True    .6      .4
  False   False   .2      .8

P(Exam Grade | Good Test Taker, Understands Material)
  GTT     UM      A       B       C       D       F
  True    True    .7      .25     .03     .01     .01
  True    False   .3      .4      .2      .05     .05
  False   True    .4      .3      .2      .08     .02
  False   False   .05     .2      .3      .3      .15

P(Homework Grade | Understands Material)
  UM      A       B       C       D       F
  True    .7      .25     .03     .01     .01
  False   .2      .3      .4      .05     .05
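
Each row of a conditional probability table is a distribution over the node's values, so it must sum to 1. A minimal sketch of that check follows, with the tables above re-entered as Python dictionaries. The dictionary layout and the function name are illustrative assumptions, not part of the slides.

```python
import math

# Each CPT maps a tuple of parent values to a distribution over the node's values.
# Boolean nodes list [P(True), P(False)]; grade nodes list [A, B, C, D, F].
CPTS = {
    "Smart": {(): [0.5, 0.5]},
    "HardWorking": {(): [0.7, 0.3]},
    "GoodTestTaker": {  # parent: Smart
        (True,): [0.75, 0.25],
        (False,): [0.25, 0.75],
    },
    "UnderstandsMaterial": {  # parents: Smart, HardWorking
        (True, True): [0.95, 0.05],
        (True, False): [0.6, 0.4],
        (False, True): [0.6, 0.4],
        (False, False): [0.2, 0.8],
    },
    "ExamGrade": {  # parents: GoodTestTaker, UnderstandsMaterial
        (True, True): [0.7, 0.25, 0.03, 0.01, 0.01],
        (True, False): [0.3, 0.4, 0.2, 0.05, 0.05],
        (False, True): [0.4, 0.3, 0.2, 0.08, 0.02],
        (False, False): [0.05, 0.2, 0.3, 0.3, 0.15],
    },
    "HomeworkGrade": {  # parent: UnderstandsMaterial
        (True,): [0.7, 0.25, 0.03, 0.01, 0.01],
        (False,): [0.2, 0.3, 0.4, 0.05, 0.05],
    },
}

def unnormalized_rows(cpts, tol=1e-9):
    """Return (node, parent_values) for every row that does not sum to 1."""
    return [
        (node, parent_values)
        for node, rows in cpts.items()
        for parent_values, dist in rows.items()
        if not math.isclose(sum(dist), 1.0, abs_tol=tol)
    ]

bad_rows = unnormalized_rows(CPTS)
print(bad_rows)
```

An empty list means every row is a proper distribution. A row that was mistyped during transcription would show up here by node name and parent values.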

Page 8: Bayesian Nets and Applications

Compactness

- A CPT for Boolean Xi with k Boolean parents has 2^k rows for the combinations of parent values
- Each row requires one number p for Xi = true (the number for Xi = false is just 1 - p)
- If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers
- Grows linearly with n, vs. O(2^n) for the full joint distribution
- Student net: 1 + 1 + 2 + 4 + 16 + 8 = 32 numbers (each row of a five-valued grade node needs 4 numbers), vs. 2^4 · 5^2 - 1 = 399 for the full joint distribution
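
The count for the student network can be checked mechanically. The sketch below assumes the structure and value counts shown in the tables on page 7 (four Boolean nodes and two five-valued grade nodes). A node with r values needs r - 1 independent numbers per row, and one row per combination of parent values. The abbreviations and function names are illustrative.

```python
from math import prod

# Number of values per node (assumption: read off the CPTs on page 7)
ARITY = {"S": 2, "HW": 2, "GTT": 2, "UM": 2, "EG": 5, "HG": 5}

# Parents of each node, as given by the CPT headers
PARENTS = {
    "S": [],
    "HW": [],
    "GTT": ["S"],
    "UM": ["S", "HW"],
    "EG": ["GTT", "UM"],
    "HG": ["UM"],
}

def network_parameters(arity, parents):
    """Independent numbers needed by all CPTs: (values - 1) per row, one row per parent combination."""
    return sum(
        (arity[node] - 1) * prod(arity[p] for p in parents[node])
        for node in arity
    )

def full_joint_parameters(arity):
    """Independent numbers in the full joint table: one per outcome, minus one for normalization."""
    return prod(arity.values()) - 1

net_count = network_parameters(ARITY, PARENTS)
joint_count = full_joint_parameters(ARITY)
print(net_count, joint_count)
```

The per-node contributions are 1, 1, 2, 4, 16 and 8, giving 32 for the network against 399 for the full joint table.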

Page 9: Bayesian Nets and Applications


Conditional Probability

Page 10: Bayesian Nets and Applications

Global Semantics/Evaluation

Global semantics defines the full joint distribution as the product of the local conditional distributions:

P(x1, …, xn) = ∏_{i=1}^{n} P(xi | Parents(Xi))

e.g., P(EG=A ∧ GTT ∧ ¬UM ∧ S ∧ HW)

Page 11: Bayesian Nets and Applications

Global Semantics

Global semantics defines the full joint distribution as the product of the local conditional distributions:

P(x1, …, xn) = ∏_{i=1}^{n} P(xi | Parents(Xi))

e.g., Observations: S, HW, not UM. Will I get an A?

P(EG=A ∧ GTT ∧ ¬UM ∧ S ∧ HW)
  = P(EG=A | GTT ∧ ¬UM) · P(GTT | S) · P(¬UM | HW ∧ S) · P(S) · P(HW)
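
The product above can be evaluated directly from the tables on page 7. The sketch below also answers the question as asked, P(EG=A | S, HW, ¬UM), by summing the joint over the unobserved Good Test Taker variable and dividing by the probability of the observations. Homework Grade is left out because it sums to 1 when marginalized. Variable and function names are illustrative assumptions.

```python
# CPT entries copied from page 7 (probability that the Boolean node is True)
P_S = 0.5
P_HW = 0.7
P_GTT = {True: 0.75, False: 0.25}           # P(GTT | S)
P_UM = {                                    # P(UM | S, HW)
    (True, True): 0.95, (True, False): 0.6,
    (False, True): 0.6, (False, False): 0.2,
}
P_EG_A = {                                  # P(EG=A | GTT, UM)
    (True, True): 0.7, (True, False): 0.3,
    (False, True): 0.4, (False, False): 0.05,
}

def bernoulli(p_true, value):
    """Probability of a Boolean value given the probability of True."""
    return p_true if value else 1.0 - p_true

def joint_with_a(s, hw, gtt, um):
    """P(EG=A, GTT=gtt, UM=um, S=s, HW=hw) as a product of local CPT entries."""
    return (
        P_EG_A[(gtt, um)]
        * bernoulli(P_GTT[s], gtt)
        * bernoulli(P_UM[(s, hw)], um)
        * bernoulli(P_S, s)
        * bernoulli(P_HW, hw)
    )

# The single joint entry from the slide: .3 * .75 * .05 * .5 * .7
joint = joint_with_a(s=True, hw=True, gtt=True, um=False)

# P(EG=A | S, HW, not UM): sum out GTT, then divide by P(S, HW, not UM)
numerator = sum(joint_with_a(True, True, gtt, False) for gtt in (True, False))
evidence = bernoulli(P_UM[(True, True)], False) * P_S * P_HW
p_a_given_obs = numerator / evidence

print(joint, p_a_given_obs)
```

The joint entry comes to about 0.0039, while the conditional probability of an A given the three observations is about 0.24 (that is, .3 · .75 + .05 · .25).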

Page 12: Bayesian Nets and Applications

Conditional Independence and Network Structure

- The graphical structure of a Bayesian network forces certain conditional independences to hold regardless of the CPTs.
- This can be determined by the d-separation criterion

Page 13: Bayesian Nets and Applications

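[Figure: the three connection types at an interior node b on a path between a and c. Linear: a → b → c. Converging: a → b ← c. Diverging: a ← b → c.]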

Page 14: Bayesian Nets and Applications

D-separation (opposite of d-connecting)

A path from q to r is d-connecting with respect to the evidence nodes E if every interior node n in the path has the property that either

- it is linear or diverging and is not a member of E, or
- it is converging and either n or one of its descendants is in E

If no path from q to r is d-connecting (q and r are d-separated by E), then q and r are conditionally independent given E

Page 15: Bayesian Nets and Applications

[Figure: the student network (Smart, Hard working, Good test taker, Understands material, Exam Grade, Homework Grade).]

Page 16: Bayesian Nets and Applications

- S and EG are not independent given GTT
- S and HG are independent given UM
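
Both statements can be checked against the d-connecting definition on page 14. The sketch below enumerates every undirected path in the student network and tests each interior node. The edge list is taken from the CPT headers on page 7; the function names are illustrative, and path enumeration is only practical for a network this small.

```python
# Parents of each node in the student network (from the CPT headers on page 7)
PARENTS = {
    "S": [], "HW": [],
    "GTT": ["S"],
    "UM": ["S", "HW"],
    "EG": ["GTT", "UM"],
    "HG": ["UM"],
}

def children(node):
    return [c for c, ps in PARENTS.items() if node in ps]

def descendants(node):
    """All nodes reachable from node by following arrows forward."""
    found, stack = set(), [node]
    while stack:
        for child in children(stack.pop()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found

def undirected_paths(start, goal, path=None):
    """Yield every simple path from start to goal, ignoring arrow direction."""
    path = path or [start]
    last = path[-1]
    if last == goal:
        yield path
        return
    for nxt in PARENTS[last] + children(last):
        if nxt not in path:
            yield from undirected_paths(start, goal, path + [nxt])

def is_d_connecting(path, evidence):
    """Apply the page 14 test to every interior node of the path."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        converging = prev in PARENTS[node] and nxt in PARENTS[node]
        if converging:
            # Needs the node or one of its descendants in the evidence
            if not ({node} | descendants(node)) & evidence:
                return False
        elif node in evidence:
            # Linear or diverging node blocks the path when observed
            return False
    return True

def d_separated(q, r, evidence):
    return not any(is_d_connecting(p, set(evidence)) for p in undirected_paths(q, r))

print(d_separated("S", "EG", {"GTT"}))  # path S -> UM -> EG stays open
print(d_separated("S", "HG", {"UM"}))   # every path is blocked
```

Observing GTT blocks the path through GTT but leaves S → UM → EG open, so S and EG remain connected. Observing UM blocks the direct path to HG, and the remaining path is blocked at the unobserved converging node EG.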

Page 17: Bayesian Nets and Applications

Medical Application of Bayesian Networks: Pathfinder

Page 18: Bayesian Nets and Applications

Pathfinder

- Domain: hematopathology diagnosis
  - Microscopic interpretation of lymph-node biopsies
  - Given: 100s of histologic features appearing in lymph node sections
  - Goal: identify disease type, malignant or benign
  - Difficult for physicians

Page 19: Bayesian Nets and Applications

Pathfinder System

- Bayesian Net implementation
- Reasons about 60 malignant and benign diseases of the lymph node
- Considers evidence about status of up to 100 morphological features presenting in lymph node tissue
- Contains 105,000 subjectively-derived probabilities

Page 20: Bayesian Nets and Applications


Page 21: Bayesian Nets and Applications

Commercialization

- Intellipath
- Integrates with videodisc libraries of histopathology slides
- Pathologists working with the system make significantly more correct diagnoses than those working without
- Several hundred commercial systems in place worldwide

Page 22: Bayesian Nets and Applications


Sequential Diagnosis

Page 23: Bayesian Nets and Applications

Features

- Structured into a set of 2-10 mutually exclusive values
  - Pseudofollicularity: absent, slight, moderate, prominent
- Represent evidence provided by a feature as F1, F2, …, Fn

Page 24: Bayesian Nets and Applications

Value of information

- User enters findings from microscopic analysis of tissue
- Probabilistic reasoner assigns level of belief to different diagnoses
- Value of information determines which tests to perform next
- Full disease utility model making use of life and death decision making
  - Cost of tests
  - Cost of misdiagnoses
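
The idea behind a value-of-information calculation can be sketched with a toy two-disease example. Everything below is a hypothetical illustration (the beliefs, test accuracy, utilities and test cost are made up, and this is not Pathfinder's actual utility model). The value of a test is the expected utility of acting after seeing its result minus the expected utility of acting now, and the test is worth running when that value exceeds its cost.

```python
def expected_utility(belief, action, utility):
    """Expected utility of an action under the current belief over diseases."""
    return sum(p * utility[(action, disease)] for disease, p in belief.items())

def best_expected_utility(belief, actions, utility):
    return max(expected_utility(belief, a, utility) for a in actions)

def posterior(belief, likelihood, outcome):
    """Bayes update: return (P(disease | outcome), P(outcome))."""
    joint = {d: p * likelihood[d][outcome] for d, p in belief.items()}
    total = sum(joint.values())
    return {d: v / total for d, v in joint.items()}, total

def value_of_information(belief, likelihood, outcomes, actions, utility):
    """Expected gain in utility from observing the test result before acting."""
    with_test = 0.0
    for outcome in outcomes:
        updated, p_outcome = posterior(belief, likelihood, outcome)
        with_test += p_outcome * best_expected_utility(updated, actions, utility)
    return with_test - best_expected_utility(belief, actions, utility)

# Hypothetical inputs (assumptions for illustration only)
belief = {"malignant": 0.3, "benign": 0.7}
likelihood = {  # P(test outcome | disease)
    "malignant": {"positive": 0.9, "negative": 0.1},
    "benign": {"positive": 0.2, "negative": 0.8},
}
utility = {  # cost of misdiagnosis expressed as negative utility
    ("treat", "malignant"): 0, ("treat", "benign"): -20,
    ("wait", "malignant"): -100, ("wait", "benign"): 0,
}
test_cost = 2

voi = value_of_information(
    belief, likelihood, ["positive", "negative"], ["treat", "wait"], utility
)
worth_testing = voi > test_cost
print(voi, worth_testing)
```

With these assumed numbers, acting immediately has expected utility -14 (treat), acting after the test has expected utility -5.8, so the test is worth 8.2 utility units and clears its cost of 2. A diagnostic system would run the same comparison over all candidate tests and ask for the one with the largest net value.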

Page 25: Bayesian Nets and Applications


Page 26: Bayesian Nets and Applications


Page 27: Bayesian Nets and Applications

Group Discrimination Strategy

- Select questions based on their ability to discriminate between disease classes
- For a given differential diagnosis, select the most specific level of the hierarchy and select questions to discriminate among groups
- Less efficient
  - Larger number of questions asked

Page 28: Bayesian Nets and Applications


Page 29: Bayesian Nets and Applications


Page 30: Bayesian Nets and Applications

Other Bayesian Net Applications

- Lumiere: who knows what it is?

Page 31: Bayesian Nets and Applications

Other Bayesian Net Applications

- Lumiere
  - Single most widely distributed application of BN
  - Microsoft Office Assistant
  - Infer a user's goals and needs using evidence about user background, actions and queries
- VISTA
  - Help NASA engineers in round-the-clock monitoring of each Space Shuttle orbiter's subsystems
  - Time critical, high impact
  - Interpret telemetry and provide advice about likely failures
  - Direct engineers to the best information
  - In use for several years
- Microsoft Pregnancy and Child Care
  - What questions to ask next to diagnose illness of a child

Page 32: Bayesian Nets and Applications

Other Bayesian Net Applications

- Speech Recognition
- Text Summarization
- Language processing tasks in general