
Page 1:

Soft Computing

Lecture 17

Introduction to probabilistic reasoning. Bayesian nets. Markov models

Page 2:

Why probabilistic methods?

• Probabilistic methods are used to describe events or behavior, or to make decisions, when we do not have enough knowledge about the object of observation

• Probability theory, and the probabilistic methods of AI built on it, aim to bring into the treatment of random processes whatever knowledge we have about the laws or rules governing the sources of random events

• This is an alternative way of describing uncertainty, in contrast to fuzzy logic

Page 3:

The Problem

• We normally deal with assertions and their causal connections:
  – John has fever
  – John has the flu
  – If somebody has the flu, then that person has fever.

• We are not certain that such assertions are true. We believe/disbelieve them to some degree. [Though "belief" and "evidence" are not the same thing, for our purposes they will be treated synonymously.]

• Our problem is how to associate a degree of belief or of disbelief with assertions:
  – How do we associate beliefs with elementary assertions?
  – How do we combine beliefs in composite assertions from the beliefs of the component assertions?
  – What is the relation between the beliefs of causally connected assertions?

• Estimates for elementary assertions are obtained:
  – From experts (subjective probability)
  – From frequencies (if given enough data)

• It is very hard to come up with good estimates for beliefs. Always consider the question "What if the guess is bad?"

• Estimates are needed, given the belief in assertions A and B, for the assertions ~A, A & B, A v B (see the worked sketch after this list)

• Evidence must be combined in cases such as:
  – We have a causal connection from assertion A to assertion B: what can we say about B if A is true, or, vice versa, about A if B is true?
  – We have a causal connection from assertion A to assertions B1 and B2: what can we say about A if both B1 and B2 are true?
  – We have a causal connection from assertion A1 to B and a causal connection from A2 to B: what can we say about B when both A1 and A2 are true?
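The last bullet deserves a worked example. Here is a minimal numeric sketch, with illustrative numbers not taken from the lecture, of which combined beliefs are fixed by the beliefs in A and B alone, and which require extra assumptions:

```python
# Given beliefs in A and B alone, which combined beliefs are determined?
p_a, p_b = 0.7, 0.5

# Negation is fully determined:
p_not_a = 1 - p_a                      # 0.3

# Conjunction is NOT determined by p_a and p_b alone; it is only bounded
# (Frechet bounds) and becomes exact only under extra assumptions,
# e.g. independence of A and B.
conj_lo = max(0.0, p_a + p_b - 1)      # 0.2
conj_hi = min(p_a, p_b)                # 0.5
conj_indep = p_a * p_b                 # 0.35, assuming independence

# Disjunction then follows by inclusion-exclusion:
disj_indep = p_a + p_b - conj_indep    # 0.85, under the same assumption

print(p_not_a, (conj_lo, conj_hi), conj_indep, disj_indep)
```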

Page 4:

Probabilistic methods of reasoning and learning

• Probabilistic neural networks

• Bayesian networks

• Markov models and chains

• Support Vector and Kernel Machines (SVM)

• Genetic algorithms (evolutionary learning)

Page 5:

Bayes’ Law

• P(a,b) = P(a|b) P(b) = P(b|a) P(a)

• Joint probability of a and b = probability of b times the probability of a given b
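A minimal sketch, with hypothetical numbers, checking Bayes’ Law on a small 2×2 joint distribution:

```python
# joint[(a, b)] = P(A=a, B=b); the entries sum to 1
joint = {
    (True,  True):  0.30, (True,  False): 0.20,
    (False, True):  0.10, (False, False): 0.40,
}

def marginal_a(a):   # P(A=a)
    return sum(p for (ai, _), p in joint.items() if ai == a)

def marginal_b(b):   # P(B=b)
    return sum(p for (_, bi), p in joint.items() if bi == b)

a, b = True, True
p_a_given_b = joint[(a, b)] / marginal_b(b)   # P(a|b)
p_b_given_a = joint[(a, b)] / marginal_a(a)   # P(b|a)

# P(a,b) = P(a|b) P(b) = P(b|a) P(a)
assert abs(joint[(a, b)] - p_a_given_b * marginal_b(b)) < 1e-12
assert abs(joint[(a, b)] - p_b_given_a * marginal_a(a)) < 1e-12
```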

Page 6:

Bayesian learning

Page 7:

Bayesian learning (2)

Page 8:

Bayesian learning (3)

Page 9:

Bayesian learning (4)

Page 10:

Bayes theorem

P(Aj | B) = P(B | Aj) P(Aj) / P(B),  where  P(B) = Σj P(B | Aj) P(Aj)

P(Aj | B) – posterior probability of event Aj given event B,
P(B | Aj) – likelihood,
P(Aj) – prior probability,
P(B) – evidence

Bayes theorem is only valid if we know all the conditional probabilities relating to the evidence in question. This makes it hard to apply the theorem in practical AI applications.
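A minimal sketch of the theorem in code, with hypothetical priors and likelihoods for three mutually exclusive hypotheses A1..A3 and evidence B:

```python
priors      = [0.5, 0.3, 0.2]    # P(Aj)
likelihoods = [0.9, 0.4, 0.1]    # P(B | Aj)

# evidence: P(B) = sum_j P(B | Aj) P(Aj)
p_b = sum(l * p for l, p in zip(likelihoods, priors))

# posterior: P(Aj | B) = P(B | Aj) P(Aj) / P(B)
posteriors = [l * p / p_b for l, p in zip(likelihoods, priors)]

print(posteriors)    # [0.762..., 0.203..., 0.033...]
```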

Page 11:

Bayesian Network

• A Bayesian Network is a directed acyclic graph:
  – The directed links of the graph indicate dependencies that exist between nodes (variables).
  – Nodes represent propositions about events, or the events themselves.
  – Conditional probabilities quantify the strength of the dependencies.

• Consider the following example: the probability that my car won't start. If my car won't start, then it is likely that
  – the battery is flat, or
  – the starting motor is broken.

• In order to decide whether to fix the car myself or send it to the garage, I make the following decision (sketched computationally below):
  – If the headlights do not work, then the battery is likely to be flat, so I fix it myself.
  – If the starting motor is defective, then send the car to the garage.
  – If the battery and the starting motor are both gone, send the car to the garage.
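The slides give no concrete probabilities for this example, so the sketch below fills in hypothetical CPT values; the network structure (battery → lights, battery and starter → start) and inference by enumeration over the full joint distribution are the standard construction:

```python
from itertools import product

# Priors for the two root causes (hypothetical values)
P_BATTERY_FLAT   = 0.05
P_STARTER_BROKEN = 0.02

# CPTs (hypothetical): P(lights_off | battery), P(no_start | battery, starter)
def p_lights_off(battery):
    return 0.95 if battery else 0.05

def p_no_start(battery, starter):
    if battery and starter: return 0.99
    if battery:             return 0.90
    if starter:             return 0.95
    return 0.01

def joint(battery, starter, lights_off, no_start):
    """P of one full assignment, via the chain rule along the graph."""
    p  = P_BATTERY_FLAT if battery else 1 - P_BATTERY_FLAT
    p *= P_STARTER_BROKEN if starter else 1 - P_STARTER_BROKEN
    p *= p_lights_off(battery) if lights_off else 1 - p_lights_off(battery)
    p *= p_no_start(battery, starter) if no_start else 1 - p_no_start(battery, starter)
    return p

def posterior(query, evidence):
    """P(query=True | evidence), summing the joint over hidden variables."""
    names = ["battery", "starter", "lights_off", "no_start"]
    num = den = 0.0
    for values in product([True, False], repeat=4):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(*values)
        den += p
        if world[query]:
            num += p
    return num / den

# The decision rule from the slide, in numbers:
print(posterior("battery", {"no_start": True}))                      # car won't start
print(posterior("battery", {"no_start": True, "lights_off": True}))  # ...and lights off
```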

Page 12:

A simple Bayesian network

Page 13:

Kinds of relations between variables in Bayesian nets

a) Sequence (serial connection): influence may be transmitted from A through B to C, and back, while the value of B is unknown (illustrated numerically below)

b) Divergence: influence may be transmitted among the children of A while A is unknown

c) Convergence: nothing is known about A except what may be obtained through its parents
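A minimal numeric sketch of rule (a), with hypothetical numbers: in a serial connection A → B → C, evidence about A changes belief in C while B is unknown; once B is observed, the path is blocked:

```python
from itertools import product

P_A = 0.3
def p_b(a): return 0.9 if a else 0.1    # P(B=true | A=a)
def p_c(b): return 0.8 if b else 0.2    # P(C=true | B=b)

def joint(a, b, c):
    pa = P_A if a else 1 - P_A
    pb = p_b(a) if b else 1 - p_b(a)
    pc = p_c(b) if c else 1 - p_c(b)
    return pa * pb * pc

def p_c_given(**ev):
    """P(C=true | evidence) by brute-force enumeration."""
    num = den = 0.0
    for a, b, c in product([True, False], repeat=3):
        world = {"a": a, "b": b, "c": c}
        if any(world[k] != v for k, v in ev.items()):
            continue
        den += joint(a, b, c)
        if c:
            num += joint(a, b, c)
    return num / den

# B unknown: evidence on A moves belief in C
print(p_c_given(a=True), p_c_given(a=False))                   # 0.74 vs 0.26
# B observed: A no longer matters (the path is blocked)
print(p_c_given(a=True, b=True), p_c_given(a=False, b=True))   # both 0.8
```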

Page 14:

Reasoning in Bayesian nets

• Probabilities in links obey the standard conditional probability axioms.

• Therefore, follow the links in reaching a hypothesis and update beliefs accordingly.

• A few broad classes of algorithms have been used to help with this:
  – Pearl's message-passing method.
  – Clique triangulation.
  – Stochastic methods (see the sampling sketch below).
  – Basically, they all take advantage of clusters in the network and use the limits those clusters place on influence to constrain the search through the net.
  – They also ensure that probabilities are updated correctly.

• Since information is local, it can readily be added and deleted with minimal effect on the whole network: ONLY affected nodes need updating.
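As a concrete instance of the stochastic methods listed above (not Pearl's message passing or clique triangulation), here is a minimal likelihood-weighting sketch on the car network from the earlier slide; all probabilities are hypothetical:

```python
import random

# Likelihood weighting: sample non-evidence variables from their CPTs,
# weight each sample by the probability of the evidence it fixes.
# Network: battery -> lights_off, (battery, starter) -> no_start.

P_BATTERY, P_STARTER = 0.05, 0.02      # hypothetical priors

def p_lights_off(battery):
    return 0.95 if battery else 0.05

def p_no_start(battery, starter):
    return {(1, 1): 0.99, (1, 0): 0.90, (0, 1): 0.95, (0, 0): 0.01}[(battery, starter)]

def likelihood_weighting(evidence, n=100_000):
    """Estimate P(battery=1 | evidence), evidence fixing no_start/lights_off."""
    num = den = 0.0
    for _ in range(n):
        battery = int(random.random() < P_BATTERY)
        starter = int(random.random() < P_STARTER)
        w = 1.0
        # evidence variables are not sampled; they contribute a weight
        if "lights_off" in evidence:
            p = p_lights_off(battery)
            w *= p if evidence["lights_off"] else 1 - p
        if "no_start" in evidence:
            p = p_no_start(battery, starter)
            w *= p if evidence["no_start"] else 1 - p
        den += w
        num += w * battery
    return num / den

print(likelihood_weighting({"no_start": 1}))                    # approximates exact posterior
print(likelihood_weighting({"no_start": 1, "lights_off": 1}))
```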

Page 15:

Synthesis of a Bayes network based on a priori information

• Describe the task in terms of probabilities of the values of the goal variables

• Select the concept space of the task, determine the variables corresponding to the goal variables, and describe their possible values

• Determine the a priori probabilities of the values of the variables

• Describe the causal relations between the nodes (variables) as a graph

• For every node, determine the conditional probabilities of the variable's values for each combination of the values of its parent variables (a small data-structure sketch of these steps follows below)
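A minimal sketch of the synthesis steps above, applied to the flu/fever assertions from "The Problem" slide; all numbers are hypothetical placeholders that an expert or frequency data would supply:

```python
# Steps 1-2: concept space — variables and their possible values
variables = {
    "flu":   [True, False],
    "fever": [True, False],
}

# Step 3: a priori probabilities for the root variable
priors = {
    "flu": {True: 0.02, False: 0.98},
}

# Step 4: causal relations as a graph (parent lists)
parents = {
    "flu":   [],
    "fever": ["flu"],
}

# Step 5: conditional probabilities of each node's values for every
# combination of its parents' values
cpt_fever = {
    # (flu,): P(fever=True | flu)
    (True,):  0.90,
    (False,): 0.05,
}
```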

Page 16:

Applications of Bayes networks

• Medical diagnostic systems
  – PathFinder (1990) for diagnosis of diseases of the lymphatic glands

• Space and military applications
  – Vista (NASA) is used for selecting the information needed for the diagnostic display from telemetric information in real time
  – Netica (Australia) for defence of territory from the sea

• Computers and software
  – Control of helper agents in MS Office

• Image processing
  – Extraction of a 3-dimensional scene from 2-dimensional images

• Finance and economy
  – Estimation of risks and prediction of portfolio yield

Page 21:

Hidden Markov Model for recognition of speech

[Figure: a three-state left-to-right HMM with output distributions P1(.), P2(.), P3(.); self-loop transition probabilities P1,1, P2,2, P3,3 and forward transitions P1,2, P2,3, P3,4]
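A minimal sketch, with hypothetical toy numbers standing in for the P(i,j) and Pi(.) of the figure, of how such a left-to-right HMM scores an observation sequence with the forward algorithm:

```python
# trans[i][j] = P(state j at t+1 | state i at t); only self-loops and
# one-step-forward transitions, as in a left-to-right topology
trans = [
    [0.6, 0.4, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],   # final state loops; exit P3,4 omitted in this sketch
]

# out[i][o] = Pi(o): probability that state i emits observation symbol o
out = [
    {"x": 0.8, "y": 0.2},
    {"x": 0.3, "y": 0.7},
    {"x": 0.5, "y": 0.5},
]

def forward(obs):
    """P(obs | model) by the forward algorithm, O(T * N^2)."""
    n = len(trans)
    alpha = [0.0] * n
    alpha[0] = out[0][obs[0]]          # left-to-right model starts in state 1
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * out[j][o]
                 for j in range(n)]
    return alpha[-1]                   # require ending in the final state

print(forward(["x", "x", "y", "y"]))
```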

Page 22:

Lexical HMMs

• Create a compound HMM for each lexical entry by concatenating the phones making up the pronunciation (a concatenation sketch follows the figure below)
  – example of an HMM for ‘lab’ (following ‘speech’, for the cross-word triphone)

• Multiple pronunciations can be weighted by likelihood into a compound HMM for a word

• (Tri)phone models are independent parts of word models

[Figure: three three-state left-to-right HMMs, one per phone, concatenated into a single word model; each has output distributions P1(.), P2(.), P3(.), self-loops P1,1, P2,2, P3,3, and forward transitions P1,2, P2,3, P3,4]

phone: l a b

triphone: ch-l+a l-a+b a-b+#
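A minimal sketch of the concatenation step: gluing per-phone transition matrices into one word-level left-to-right HMM by routing each phone's exit probability (its P3,4) into the next phone's first state. The matrix values are hypothetical:

```python
# Concatenate per-phone HMMs (here: l, a, b for the word 'lab') into one
# compound word HMM. Each phone is a 3-state left-to-right model whose
# exit mass P3,4 must be routed into the next phone's first state.

def concat_phone_hmms(phone_models):
    """phone_models: list of (trans, exit_prob); trans is a 3x3 matrix whose
    last row leaves exit_prob of mass for the P3,4 transition."""
    n = 3 * len(phone_models)
    T = [[0.0] * n for _ in range(n)]
    for k, (trans, exit_prob) in enumerate(phone_models):
        base = 3 * k
        for i in range(3):
            for j in range(3):
                T[base + i][base + j] = trans[i][j]
        if k + 1 < len(phone_models):
            T[base + 2][base + 3] = exit_prob   # P3,4 -> next phone's state 1
        # (the final phone's exit mass would feed the word-end state, omitted here)
    return T

# hypothetical per-phone transitions; last-state self-loop + exit sum to 1
phone = ([[0.6, 0.4, 0.0],
          [0.0, 0.7, 0.3],
          [0.0, 0.0, 0.5]], 0.5)

word_lab = concat_phone_hmms([phone, phone, phone])   # 9-state model for 'lab'
```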