
Hidden Markov models

Sushmita Roy, BMI/CS 576

www.biostat.wisc.edu/bmi576/
[email protected]

Oct 16th, 2014

Key concepts

• What are Hidden Markov models (HMMs)?
  – States, emission characters, parameters

• Three important questions in HMMs and algorithms to solve these questions
  – Probability of a sequence of observations: Forward algorithm
  – Most likely path (sequence of states): Viterbi algorithm
  – Parameter estimation: Baum-Welch/Forward-backward algorithm

Revisiting the CpG question

• Given a sequence x1..xT we can use two Markov chains to decide if x1..xT is a CpG island or not.

• What do we do if we were asked to “find” the CpG islands in the genome?

• We have to search for these “islands” in a “sea” of non-islands

A simple HMM for identifying CpG islands

[Figure: an 8-state HMM with CpG island states A+, C+, G+, T+ and background states A-, C-, G-, T-.]

Note that there is no longer a one-to-one correspondence between states and observed symbols.

An HMM for an occasionally dishonest casino

[Figure: a two-state HMM with states Fair and Loaded.
Fair die emissions: 1, 2, 3, 4, 5, 6 each with probability 1/6.
Loaded die emissions: 1 through 5 each with probability 1/10, 6 with probability 1/2.
Transitions: Fair to Fair 0.95, Fair to Loaded 0.05, Loaded to Loaded 0.9, Loaded to Fair 0.1.]

What is hidden? Which die is rolled (Fair or Loaded).

What is observed? The number (1-6) on the die.

What does an HMM do?

• Enables us to model observed sequences of characters generated by a hidden dynamic system

• The system can exist in a fixed number of “hidden” states

• The system probabilistically transitions between states and at each state it emits a symbol/character
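
To make this generative picture concrete, here is a minimal sketch (in Python; the function and variable names are my own, and the parameters are the casino values from the earlier slide) of sampling a hidden path and its emitted rolls from the occasionally dishonest casino HMM:

```python
import random

# One possible encoding of the occasionally dishonest casino HMM
# (dictionary layout and names are my own, not from the slides).
casino_transitions = {"Fair": {"Fair": 0.95, "Loaded": 0.05},
                      "Loaded": {"Fair": 0.1, "Loaded": 0.9}}
casino_emissions = {"Fair": {str(i): 1 / 6 for i in range(1, 7)},
                    "Loaded": {**{str(i): 0.1 for i in range(1, 6)}, "6": 0.5}}

def sample_path_and_rolls(length, start="Fair"):
    """Generate a hidden state path and the sequence of die rolls it emits."""
    state, path, rolls = start, [], []
    for _ in range(length):
        path.append(state)
        # Emit a symbol from the current state's emission distribution.
        faces, face_probs = zip(*casino_emissions[state].items())
        rolls.append(random.choices(faces, weights=face_probs)[0])
        # Probabilistically transition to the next state.
        next_states, trans_probs = zip(*casino_transitions[state].items())
        state = random.choices(next_states, weights=trans_probs)[0]
    return path, rolls
```

An observer sees only the rolls; the path of Fair/Loaded states stays hidden.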

Formally defining an HMM

• States
• Emission alphabet
• Parameters
  – State transition probabilities for probabilistic transitions from the state at time t to the state at time t+1
  – Emission probabilities for probabilistically emitting symbols from a state

Notation

• States with emissions will be numbered from 1 to K
  – 0 denotes the begin state, N the end state
• x_t: observed character at position t
• x = x_1 ... x_T: observed sequence
• π = π_1 ... π_T: hidden state sequence or path
• Transition probabilities: a_kl = P(π_t = l | π_{t-1} = k)
• Emission probabilities: e_k(b) = P(x_t = b | π_t = k), the probability of emitting symbol b from state k

An example HMM

[Figure: an example HMM with begin state 0, emitting states 1-4, and end state 5.
Emission probabilities:
  state 1: A 0.4, C 0.1, G 0.2, T 0.3
  state 2: A 0.4, C 0.1, G 0.1, T 0.4
  state 3: A 0.2, C 0.3, G 0.3, T 0.2
  state 4: A 0.1, C 0.4, G 0.4, T 0.1
Transition probabilities: a_01 = 0.5, a_02 = 0.5, a_11 = 0.2, a_13 = 0.8, a_22 = 0.8, a_24 = 0.2, a_33 = 0.4, a_35 = 0.6, a_44 = 0.1, a_45 = 0.9.
Annotations: e_2(A) = 0.4 is the probability of emitting character A in state 2; a_13 = 0.8 is the probability of a transition from state 1 to state 3.]
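
For the exercises later in the lecture it helps to have these parameters in machine-readable form. The dictionary below is one possible encoding (the layout and the name example_hmm are my own); the numbers are read off the figure above and are consistent with the forward table computed later for TAGA.

```python
# Hypothetical encoding of the example HMM. States 0 (begin) and 5 (end) are
# silent; states 1-4 emit characters from {A, C, G, T}.
example_hmm = {
    "transitions": {   # a[k][l] = P(next state is l | current state is k)
        0: {1: 0.5, 2: 0.5},
        1: {1: 0.2, 3: 0.8},
        2: {2: 0.8, 4: 0.2},
        3: {3: 0.4, 5: 0.6},
        4: {4: 0.1, 5: 0.9},
    },
    "emissions": {     # e[k][b] = P(emitting character b from state k)
        1: {"A": 0.4, "C": 0.1, "G": 0.2, "T": 0.3},
        2: {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
        3: {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
        4: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    },
    "begin": 0,
    "end": 5,
}
```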

Path notation

[Figure: the same example HMM as on the previous slide, used to illustrate path notation: a path π = π_1 ... π_T is the sequence of states visited while emitting x_1 ... x_T.]

Three important questions in HMMs

• How likely is an HMM to have generated a given sequence?
  – Forward algorithm

• What is the most likely "path" for generating a sequence of observations?
  – Viterbi algorithm

• How can we learn an HMM from a set of sequences?
  – Forward-backward or Baum-Welch (an EM algorithm)

How likely is a given sequence from an HMM?

The joint probability of an observed sequence x and a path π factors as

  P(x, π) = a_{0 π_1} ∏_{t=1}^{T} e_{π_t}(x_t) a_{π_t π_{t+1}}

where a_{0 π_1} is the initial transition, e_{π_t}(x_t) is the probability of emitting symbol x_t, and a_{π_t π_{t+1}} is the state transition between consecutive time points (with π_{T+1} taken to be the end state).
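
As a sketch of this factorization, the following function computes P(x, π) for a given path, using the example_hmm dictionary defined earlier (the function name and the handling of the final transition into the end state are my own choices):

```python
def joint_probability(hmm, x, path):
    """P(x, path) = a_{0,pi_1} * prod over t of e_{pi_t}(x_t) * a_{pi_t, pi_{t+1}}."""
    a, e = hmm["transitions"], hmm["emissions"]
    p = a[hmm["begin"]].get(path[0], 0.0)        # initial transition a_{0, pi_1}
    for t, (state, symbol) in enumerate(zip(path, x)):
        p *= e[state].get(symbol, 0.0)           # emit symbol x_t from state pi_t
        nxt = path[t + 1] if t + 1 < len(path) else hmm["end"]
        p *= a[state].get(nxt, 0.0)              # transition pi_t -> pi_{t+1} (or end)
    return p
```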

How likely is a given sequence from an HMM?

• But we don't know what the path is
• So we need to sum over all paths
• The probability over all paths is:

  P(x) = Σ_π P(x, π)
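
This sum over paths can be written out directly, as in the sketch below (it reuses the joint_probability function from the previous snippet), but it enumerates every possible path and so is only feasible for very short sequences; the Forward algorithm introduced shortly computes the same quantity efficiently.

```python
from itertools import product

def brute_force_probability(hmm, x):
    """P(x) by explicitly summing P(x, path) over every possible path."""
    emitting_states = list(hmm["emissions"])
    return sum(joint_probability(hmm, x, list(path))
               for path in product(emitting_states, repeat=len(x)))
```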

Example

• Consider a candidate CpG island

• Using our CpG island HMM, some possible paths that are consistent with this CpG island are:

CGCGC

C+G+C+G+C+

C-G-C-G-C-

C-G+C-G+C-

Number of paths

[Figure: a small HMM with begin state 0, two emitting states 1 and 2, and end state 3; the emitting states have emission probabilities (A 0.4, C 0.1, G 0.1, T 0.4) and (A 0.4, C 0.1, G 0.2, T 0.3).]

• For a sequence of length T, how many possible paths through this HMM are there?

  2^T

• the Forward algorithm enables us to compute the probability of a sequence by efficiently summing over all possible paths

How likely is a given sequence: Forward algorithm

• Define f_k(t) as the probability of observing x_1 ... x_t and ending in state k at time t:

  f_k(t) = P(x_1, ..., x_t, π_t = k)

• This can be written recursively as follows:

  f_l(t) = e_l(x_t) Σ_k f_k(t-1) a_kl

Steps of the Forward algorithm

• Initialization: f_0(0) = 1 and f_k(0) = 0 for k > 0, where 0 denotes the "begin" state

• Recursion: for t = 1 to T

  f_l(t) = e_l(x_t) Σ_k f_k(t-1) a_kl

• Termination:

  P(x) = Σ_k f_k(T) a_kN, where N denotes the "end" state
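
A minimal sketch of these steps in Python, using the example_hmm dictionary from earlier (the function name and table layout are my own):

```python
def forward(hmm, x):
    """Return P(x) and the table f[k][t] of forward probabilities."""
    a, e = hmm["transitions"], hmm["emissions"]
    begin, end = hmm["begin"], hmm["end"]
    states = list(e)                                   # emitting states 1..K
    T = len(x)
    # Initialization: f_0(0) = 1, f_k(0) = 0 for k > 0
    f = {k: [0.0] * (T + 1) for k in states + [begin]}
    f[begin][0] = 1.0
    # Recursion: f_l(t) = e_l(x_t) * sum over k of f_k(t-1) * a_kl
    for t in range(1, T + 1):
        for l in states:
            incoming = sum(f[k][t - 1] * a.get(k, {}).get(l, 0.0)
                           for k in states + [begin])
            f[l][t] = e[l].get(x[t - 1], 0.0) * incoming
    # Termination: P(x) = sum over k of f_k(T) * a_{k,end}
    px = sum(f[k][T] * a.get(k, {}).get(end, 0.0) for k in states)
    return px, f
```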

Forward algorithm example

[Figure: the same example HMM as before (begin state 0, emitting states 1-4, end state 5, with the emission and transition probabilities listed earlier).]

What is the probability of sequence TAGA ?

In class exercise

Table for TAGA

Forward probabilities f_k(t):

State       t=0    t=1 (T)   t=2 (A)   t=3 (G)    t=4 (A)
0 (begin)   1      0         0         0          0
1           0      0.15      0.012     0.00048    0.0000384
2           0      0.2       0.064     0.00512    0.0016384
3           0      0         0.024     0.00576    0.0005376
4           0      0         0.004     0.00528    0.0001552
5 (end)     0      0         0         0          0.00046224

The last entry, f_5(4) = 0.00046224, is also P(x); computing it does not require f_1(4) and f_2(4).
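
Assuming the forward sketch and the example_hmm encoding shown earlier, the table above can be reproduced as follows:

```python
px, f = forward(example_hmm, "TAGA")
print(px)    # ~0.00046224, matching the P(x) entry in the bottom-right of the table
print(f[2])  # ~[0, 0.2, 0.064, 0.00512, 0.0016384], the row for state 2
```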

Three important questions in HMMs

• How likely is an HMM to have generated a given sequence?
  – Forward algorithm

• What is the most likely "path" for generating a sequence of observations?
  – Viterbi algorithm

• How can we learn an HMM from a set of sequences?
  – Forward-backward or Baum-Welch (an EM algorithm)

Viterbi algorithm

• The Viterbi algorithm gives an efficient way to find the most likely/probable sequence of states (path)

• Consider the dishonest casino example
  – Given a sequence of dice rolls, can you infer when the casino was using the loaded versus the fair die?
  – The Viterbi algorithm gives this answer

• Viterbi is very similar to the Forward algorithm
  – Except instead of summing we maximize

Notation for Viterbi

• v_k(t): probability of the most likely path for x_1 .. x_t ending in state k

• ptr_t(k): pointer to the state that gave the maximizing transition into state k at time t

• π* = argmax_π P(x, π): the most probable path for the sequence x_1, ..., x_T

Steps of the Viterbi algorithm

• Initialization: v_0(0) = 1 and v_k(0) = 0 for k > 0

• Recursion: for t = 1 to T

  v_l(t) = e_l(x_t) max_k [ v_k(t-1) a_kl ]
  ptr_t(l) = argmax_k [ v_k(t-1) a_kl ]

• Termination:

  P(x, π*) = max_k [ v_k(T) a_kN ]   (the probability associated with the most likely path)
  π*_T = argmax_k [ v_k(T) a_kN ]

Traceback in Viterbi

• Traceback: for t = T to 1, set π*_{t-1} = ptr_t(π*_t)
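
A sketch of the recursion, termination, and traceback in Python, again using the example_hmm dictionary (the names are mine; ties in the max are broken arbitrarily, so tied pointer entries may differ from a hand computation):

```python
def viterbi(hmm, x):
    """Return the probability of the most likely path and the path itself."""
    a, e = hmm["transitions"], hmm["emissions"]
    begin, end = hmm["begin"], hmm["end"]
    states = list(e)
    T = len(x)
    # Initialization: v_0(0) = 1, v_k(0) = 0 for k > 0
    v = {k: [0.0] * (T + 1) for k in states + [begin]}
    v[begin][0] = 1.0
    ptr = {l: [None] * (T + 1) for l in states}
    # Recursion: like the Forward algorithm, but maximize instead of summing
    for t in range(1, T + 1):
        for l in states:
            best_k, best = max(((k, v[k][t - 1] * a.get(k, {}).get(l, 0.0))
                                for k in states + [begin]), key=lambda kv: kv[1])
            v[l][t] = e[l].get(x[t - 1], 0.0) * best
            ptr[l][t] = best_k
    # Termination: best final state and probability of the most likely path
    last, p_best = max(((k, v[k][T] * a.get(k, {}).get(end, 0.0))
                        for k in states), key=lambda kv: kv[1])
    # Traceback: pi*_{t-1} = ptr_t(pi*_t)
    path = [last]
    for t in range(T, 1, -1):
        path.append(ptr[path[-1]][t])
    path.reverse()
    return p_best, path
```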

Viterbi algorithm example

[Figure: the same example HMM as before (begin state 0, emitting states 1-4, end state 5).]

What is the most likely path for TAG ?

In class exercise

Viterbi computations for TAG

v_k(t):

State       t=0    t=1 (T)   t=2 (A)   t=3 (G)
0 (begin)   1      0         0         0
1           0      0.15      0.012     0.00048
2           0      0.2       0.064     0.00512
3           0      0         0.024     0.00288
4           0      0         0.004     0.00512
5 (end)     0      0         0         0.004608

ptr_t(k):

State   t=1 (T)   t=2 (A)   t=3 (G)
1       0         1         1
2       0         2         2
3       0         1         3
4       0         2         2

Tracing back from the termination entry 0.004608 = v_4(3) × a_45 gives the most likely path 2, 2, 4.
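
Assuming the viterbi sketch and the example_hmm encoding from earlier, the hand computation above can be checked as follows:

```python
p_best, path = viterbi(example_hmm, "TAG")
print(p_best)  # ~0.004608, matching the termination entry in the table
print(path)    # [2, 2, 4], the traced-back most likely path
```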

Using an HMM to detect CpG islands

• Recall the 8-state HMM for CpG islands
• Apply the Viterbi algorithm to a DNA sequence using this HMM
• Contiguous assignments of '+' states will correspond to CpG islands
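
As a sketch of this last step, the helper below (a hypothetical function, not from the slides) scans a Viterbi path over the 8-state model and reports contiguous runs of '+' states as islands, assuming the states are labeled "A+", "C+", ..., "T-".

```python
def cpg_islands_from_path(path):
    """Yield (start, end) 0-based half-open intervals of contiguous '+' states."""
    start = None
    for i, state in enumerate(path):
        if state.endswith("+") and start is None:
            start = i                      # entering an island
        elif not state.endswith("+") and start is not None:
            yield (start, i)               # leaving an island
            start = None
    if start is not None:
        yield (start, len(path))           # island runs to the end of the sequence
```

For example, for the path ["A-", "C+", "G+", "C+", "G-"] this yields the single interval (1, 4).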

Summary

• Hidden Markov models are extensions of Markov chains that enable us to model and segment sequence data

• HMMs are defined by a set of states and emission characters, transition probabilities, and emission probabilities

• We have examined two questions for HMMs
  – Computing the probability of a sequence of observed characters given an HMM (Forward algorithm)
  – Computing the most likely sequence of states (path) for a sequence of observed characters (Viterbi algorithm)