Hidden Markov Models, Bayesian Networks
Stat 430
Outline
• Definition of HMM
• Set-up of 3 main problems
• Three main algorithms:
• Forward/Backward
• Viterbi
• Baum-Welch
Hidden Markov Models (HMM)
• Idea: each state of a Markov chain emits a letter from a fixed alphabet; the distribution of letters is time independent, but depends on the state
• Situation: usually we have a string of emitted symbols, but don’t know the Markov chain (it’s hidden)
Application Areas
• Pattern recognition
• Search algorithms
• Sequence alignments
• Time series analysis
Setup
• Markov state diagram with transition probabilities A
• Emitted sequence Y with emission probabilities B
• Usually, we only observe Y (several instances of it)
Example
• Suppose we have five amino acid sequences: CAEFTPAVH, CKETTPADH, CAETPDDH, CAEFDDH, CDAEFPDDH
• Find the best possible alignment of all sequences (allowing insertions and deletions)
Run of an HMM
• two-step process: the initial distribution picks q1, which emits O1; transition to q2, which emits O2; transition to q3; and so on
• Sequence of visited states Q = q1 q2 q3 ...
• Sequence of emitted symbols O = O1 O2 O3 ...
• usually we can observe O, but don’t know Q
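To make the two-step process concrete, here is a minimal base-R sketch that simulates a run. It uses the two-state chain from the example that follows; the uniform initial distribution is an assumption, since the slides do not give π.

set.seed(1)
states  <- c("S1", "S2")
symbols <- c("a", "b")
P  <- matrix(c(0.9, 0.1,
               0.8, 0.2), nrow = 2, byrow = TRUE,
             dimnames = list(states, states))    # transition matrix
B  <- matrix(c(0.50, 0.50,
               0.25, 0.75), nrow = 2, byrow = TRUE,
             dimnames = list(states, symbols))   # emission probabilities
pi <- c(S1 = 0.5, S2 = 0.5)                      # assumed initial distribution

run_hmm <- function(T_len) {
  Q <- O <- character(T_len)
  Q[1] <- sample(states, 1, prob = pi)               # draw initial state
  O[1] <- sample(symbols, 1, prob = B[Q[1], ])       # emission from q1
  for (t in 2:T_len) {
    Q[t] <- sample(states, 1, prob = P[Q[t - 1], ])  # transition
    O[t] <- sample(symbols, 1, prob = B[Q[t], ])     # emission
  }
  list(Q = Q, O = O)   # in practice we would only see O
}
run_hmm(5)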
Example
• Markov chain with two states, S1 and S2, and transition matrix P
• Emission alphabet {a, b}
• In S1, emission probabilities for a and b are 0.5 and 0.5
• In S2, emission probabilities for a and b are 0.25 and 0.75
• Observed sequence is bbb
P =
       S1   S2
  S1  0.9  0.1
  S2  0.8  0.2
Example
• Observed sequence is bbb
• What is the most likely sequence Q that emitted bbb? argmax_Q P(Q|O)
• What is the probability of observing O? P(O) = ∑Q P(O|Q) P(Q)
transitions:
       S1   S2
  S1  0.9  0.1
  S2  0.8  0.2

emissions:
        a     b
  S1  0.50  0.50
  S2  0.25  0.75
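For a sequence this short, both questions can be answered by brute force: enumerate all 2^3 state paths Q and compute P(O, Q) = P(O|Q) P(Q). A base-R sketch, with uniform π again an assumption:

states <- c("S1", "S2")
P  <- matrix(c(0.9, 0.1, 0.8, 0.2), 2, byrow = TRUE,
             dimnames = list(states, states))
B  <- matrix(c(0.50, 0.50, 0.25, 0.75), 2, byrow = TRUE,
             dimnames = list(states, c("a", "b")))
pi <- c(S1 = 0.5, S2 = 0.5)
O  <- c("b", "b", "b")

paths <- expand.grid(q1 = states, q2 = states, q3 = states,
                     stringsAsFactors = FALSE)          # all 8 state paths
joint <- apply(paths, 1, function(q) {
  pq  <- pi[q[1]] * P[q[1], q[2]] * P[q[2], q[3]]       # P(Q)
  poq <- B[q[1], O[1]] * B[q[2], O[2]] * B[q[3], O[3]]  # P(O|Q)
  pq * poq                                              # P(O, Q)
})
sum(joint)                 # P(O) = sum over all Q of P(O|Q) P(Q)
paths[which.max(joint), ]  # argmax_Q P(Q|O), since P(Q|O) ∝ P(O, Q)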
Definition
A hidden Markov model (HMM) consists of
• a set of states S1, S2, ..., SN
• the transition matrix P with pij = P(qt+1 = Sj | qt = Si)
• an alphabet of M unique, observed symbols A = {a1, ..., aM}
• emission probabilities bi(a) = P(state Si emits a)
• an initial distribution πi = P(q1 = Si)
Three Main Problems
• Find P(O): a computational problem; the naive solution is intractable → forward-backward algorithm
• Find the sequence of states that most likely produced the observed output O, argmax_Q P(Q|O) → Viterbi algorithm
• For a fixed topology, find P, B, and π that maximize the probability of observing O → Baum-Welch algorithm
Forward/Backward
• given all parameters (P, B, π), find P(O)
• the naive approach (summing over all N^T state sequences) is computationally too intensive
• use the help of forward variables α and backward variables β
• α(t,i) = P(o1 o2 … ot, qt = Si)
• recursion: α(1,i) = πi bi(o1) and α(t+1,j) = [∑i α(t,i) pij] bj(ot+1)
• then P(O) = ∑i α(T,i)
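A base-R sketch of the forward recursion, using the same toy parameters (uniform π remains an assumption). It computes P(O) in O(N²T) time rather than the naive O(N^T · T):

states <- c("S1", "S2")
P  <- matrix(c(0.9, 0.1, 0.8, 0.2), 2, byrow = TRUE,
             dimnames = list(states, states))
B  <- matrix(c(0.50, 0.50, 0.25, 0.75), 2, byrow = TRUE,
             dimnames = list(states, c("a", "b")))
pi <- c(S1 = 0.5, S2 = 0.5)

forward_prob <- function(O, P, B, pi) {
  N <- nrow(P); T_len <- length(O)
  alpha <- matrix(0, N, T_len, dimnames = list(rownames(P), NULL))
  alpha[, 1] <- pi * B[, O[1]]                           # α(1,i) = πi bi(o1)
  for (t in 2:T_len)                                     # α(t,j) = [Σi α(t-1,i) pij] bj(ot)
    alpha[, t] <- (t(P) %*% alpha[, t - 1]) * B[, O[t]]
  sum(alpha[, T_len])                                    # P(O) = Σi α(T,i)
}
forward_prob(c("b", "b", "b"), P, B, pi)   # matches the brute-force sum over paths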
Viterbi
• compute argmax_Q P(Q | O)
• two-step algorithm:
• first maximize the probability (dynamic programming, analogous to α but with max in place of the sum),
• then backtrack to recover the state sequence Q
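A base-R sketch of the two steps, working in log space to avoid underflow on long sequences (same toy parameters and assumed uniform π):

states <- c("S1", "S2")
P  <- matrix(c(0.9, 0.1, 0.8, 0.2), 2, byrow = TRUE,
             dimnames = list(states, states))
B  <- matrix(c(0.50, 0.50, 0.25, 0.75), 2, byrow = TRUE,
             dimnames = list(states, c("a", "b")))
pi <- c(S1 = 0.5, S2 = 0.5)

viterbi_path <- function(O, P, B, pi) {
  N <- nrow(P); T_len <- length(O)
  delta <- matrix(-Inf, N, T_len)  # best log-prob of a path ending in state i at time t
  psi   <- matrix(0L, N, T_len)    # back-pointers
  delta[, 1] <- log(pi) + log(B[, O[1]])
  for (t in 2:T_len) for (j in 1:N) {           # step 1: maximize probability
    cand <- delta[, t - 1] + log(P[, j])
    psi[j, t]   <- which.max(cand)
    delta[j, t] <- max(cand) + log(B[j, O[t]])
  }
  q <- integer(T_len)                            # step 2: recover structure Q
  q[T_len] <- which.max(delta[, T_len])
  for (t in (T_len - 1):1) q[t] <- psi[q[t + 1], t + 1]
  rownames(P)[q]
}
viterbi_path(c("b", "b", "b"), P, B, pi)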
Baum-Welch
• for a fixed topology, iteratively re-estimate P, B, and π to increase P(O); an expectation-maximization scheme built from the forward and backward variables
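The re-estimation formulas are more involved; a minimal sketch using the HMM package (listed under "R packages" below), where the starting values and the observation sequence are made up purely for illustration:

library(HMM)

hmm0 <- initHMM(States = c("S1", "S2"), Symbols = c("a", "b"),
                startProbs    = c(0.5, 0.5),             # assumed starting guesses
                transProbs    = matrix(c(0.7, 0.3,
                                         0.4, 0.6), 2, byrow = TRUE),
                emissionProbs = matrix(c(0.6, 0.4,
                                         0.3, 0.7), 2, byrow = TRUE))
obs <- c("b", "b", "a", "b", "b", "b", "a", "b", "b", "b")  # toy observations
fit <- baumWelch(hmm0, obs, maxIterations = 50)
fit$hmm$transProbs      # re-estimated transition matrix P
fit$hmm$emissionProbs   # re-estimated emission probabilities B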
HMM for Amino Acid/Gene Sequences
CAEFTPAVH, CKETTPADH, CAETPDDH, CAEFDDH, CDAEFPDDH
[Figure: profile HMM with match states m0, m1, ..., insert states i0, i1, ..., and delete states d1, d2, ...]
Example
• CAEFDDH most likely produced by m0 m1 m2 m3 m4 d5 d6 m7 m8 m9 m10
• CDAEFPDDH most likely produced by m0 m1 i1 m2 m3 m4 d5 m6 m7 m8 m9 m10
• Yields the alignment:
  C-AEF--DDH
  CDAEFP-DDH
CAEFTPAVH, CKETTPADH, CAETPDDH, CAEFDDH, CDAEFPDDH
R packages
• HMM
• RHmm
• HiddenMarkov
• msm
• depmix, depmixS4
• flexmix
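As one concrete entry point, a sketch with the HMM package applied to the two-state example from earlier (the uniform initial distribution is still an assumption; the other packages offer similar functionality under different interfaces):

library(HMM)

hmm <- initHMM(States = c("S1", "S2"), Symbols = c("a", "b"),
               startProbs    = c(0.5, 0.5),
               transProbs    = matrix(c(0.9, 0.1,
                                        0.8, 0.2), 2, byrow = TRUE),
               emissionProbs = matrix(c(0.50, 0.50,
                                        0.25, 0.75), 2, byrow = TRUE))
viterbi(hmm, c("b", "b", "b"))        # most likely state sequence Q
f <- forward(hmm, c("b", "b", "b"))   # log forward probabilities α(t, i)
sum(exp(f[, 3]))                      # P(O) = Σi α(T, i)

With these inputs it should agree with the base-R sketches above.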
Bayesian Networks
• HMMs are a special case of Bayesian networks
• A Bayesian network is a directed acyclic graph, where nodes represent variables and edges describe conditional dependence relationships
Setup
• If there is no edge between two nodes, there is no direct dependence: each node is conditionally independent of its non-descendants given its parents
• Edges imply parent/child relationship:
• X1 has children X2, X3, X5
• X5 has parents X1, X2
• P(X1, ..., Xp) = ∏i P(Xi|parents(Xi))
[Figure: DAG on X1, ..., X5; X1 has children X2, X3, X5; X5 has parents X1, X2]
Example
• Given that the grass is wet, what is the probability that it is raining?
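A brute-force sketch of this query. The network structure (Rain → WetGrass ← Sprinkler) and every conditional probability below are assumed purely for illustration; the slide does not specify them. The joint function is exactly the factorization P(X1, ..., Xp) = ∏i P(Xi|parents(Xi)) from the setup:

p_rain      <- 0.2                    # P(Rain), assumed
p_sprinkler <- 0.3                    # P(Sprinkler), assumed independent of Rain here
p_wet <- function(r, s) {             # P(WetGrass = TRUE | Rain, Sprinkler), assumed
  if (r && s) 0.99 else if (r) 0.90 else if (s) 0.80 else 0.05
}

joint <- function(r, s, w) {          # P(R, S, W) = P(R) P(S) P(W | R, S)
  pr <- if (r) p_rain      else 1 - p_rain
  ps <- if (s) p_sprinkler else 1 - p_sprinkler
  pw <- if (w) p_wet(r, s) else 1 - p_wet(r, s)
  pr * ps * pw
}

# P(Rain | WetGrass) = P(Rain, WetGrass) / P(WetGrass), summing out Sprinkler
num <- joint(TRUE, TRUE, TRUE) + joint(TRUE, FALSE, TRUE)
den <- num + joint(FALSE, TRUE, TRUE) + joint(FALSE, FALSE, TRUE)
num / den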
R packages
• deal
• MASTINO