outline profiles and sequence logos profile hidden markov models aligning profiles multiple sequence...

63

Upload: shon-morgan

Post on 18-Dec-2015

224 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways
Page 2: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

OUTLINE

Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual

Sequence Adition Other Ways of Obtaining Multiple Alignments Sequence Pattern Discovery

Page 3: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

OUTLINE

Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual

Sequence Adition Other Ways of Obtaining Multiple Alignments Sequence Pattern Discovery

Page 4: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markow Models:– A hidden Markov model (HMM) is a statisticsl

model,– in which the system being modeled is

assumed to be a Markov process (Memoryless process: its future and past are independent ),

– with hidden states.

Page 5: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markow Models:– Has a set of states each of which has limited

number of transitions and emissions,– Each transition between states has an

assisgned probability,– Each model strarts from start state and ends

in end state,

Page 6: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markow Models parameters:– A set of finite number of states, Si,– The transition probability from state Si to Sj, aij,– The emission probability density of a symbol ω– in state Si

Page 7: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markow Models parameters:– Firstly discuss:

Morkov Models, Markov Assumption

Page 8: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.):– To understand HMMs:

Talk about weather, Assume there are three types of weather:

– Sunny,

– Rainy,

– Foggy.

Assume weather does not change during the day (if it is sunny it will sunny all the day)

Page 9: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.): Weather prediction is about the what would be the weather tomorrow,– Based on the observations on the past.

Page 10: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.): Weather at day n is

– qn depends on the known weathers of the past days (qn-1, qn-2,…)

},,{ foggyrainysunnyqn

Page 11: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.): We want to find that:

– means given the past weathers what is the probability of any possible weather of today.

Page 12: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.): For example:

if we knew the weather for last three days was:

the probability that tomorrow would be is:

P(q4 = | q3 = , q2 = , q1 = )

Page 13: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.):– For example:

this probability could be infered from the statistics of past observations

the problem: the larger n is, the more observations we must collect.

– for example: if n = 6 we need 3(6-1) = 243 past observations.

Page 14: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.):– Therefore, make a simplifying assumption Markov

assumption: For sequence:

the weather of tomorrow only depends on today (first order Markov model)

Page 15: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.): Examples:

The probabilities table:

Page 16: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.): Examples:

HMM:

Page 17: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.): Examples:

Given that day the weather is sunny, what is the probability that tomorrow is sunny and the next day is rainy ?

Markov assumption

Page 18: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.): Examples:

If the weather yesterday was rainy and today is foggy what is the probability that tomorrow it will be sunny?

Page 19: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Markow Models and Assumption (cont.):– Examples:

If the weather yesterday was rainy and today is foggy what is the probability that tomorrow it will be sunny?

Markov assumption

Page 20: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– What is HMM:

Suppose that you are locked in a room for several days, you try to predict the weather outside, The only piece of evidence you have is whether the

person who comes into the room bringing your daily meal is carrying an umbrella or not.

Page 21: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– What is HMM (cont.):

assume probabilities as seen in the table:

Page 22: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– What is HMM (cont.):

Now the actual weather is hidden from you. You can not directly see what is the weather.

Page 23: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– What is HMM (cont.):

Finding the probability of a certain weather

is based on the observations xi:

},,{ foggyrainysunnyqn

Page 24: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– What is HMM (cont.):

Using Bayes rule:

For n days:

Page 25: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– What is HMM (cont.):

We can omit So:

With Markov assumptions:

Page 26: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– Examples:

Suppose the day you were locked in it was sunny. The next day, the caretaker carried an umbrella into the room.

You would like to know, what the weather was like on this second day.

Page 27: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– Examples:

Calculate 3 probabilities:

Page 28: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– Examples:

Consider the event with highest value. It is most likely to happen.

Page 29: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– Another Examples:

Suppose you do not know how the weather was when your were locked in. The following three days the caretaker always comes without an umbrella. Calculate the likelihood for the weather on these three days to have been

Page 30: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– Another Examples:

As you do not know how the weather is on the first day, you assume the 3 weather situations are equi-probable on this day and the prior probability for sun on day one is therefore

Page 31: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models (HMMs):– Another Examples:

Assumption:

Page 32: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Hidden Markov Models:– Another Examples:

Page 33: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

HMMs to represent a family of sequences– Given a multiple alignment of sequences, we

can use an HMM to model the sequences. – Each column of the alignment may be

represented by a hidden state that produced represented by a hidden state that produced that column.

– Insertions and deletions can be represented by other states

Page 34: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

HMMs to represent a family of sequences

Page 35: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

HMMs to represent a family of sequences

http://www.ifm.liu.se/bioinfo/assignments/hmm-profile.png

Page 36: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

HMMs to represent a family of sequences

http://www.ifm.liu.se/bioinfo/assignments

Page 37: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Determining the states of the HMM– The structure is usually fixed and only the

number of “match” states is to be determined

Page 38: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Determining the states of the HMM– An alignment column with no gaps can be

considered as a “match” state considered as a match state.

– An alignment column with a majority of gaps can be considered an “insert” state can be considered an insert state.

Page 39: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Determining the transition probabilities – From a state u the transition to another state v is

represented by t(u.v).– The summation over all states w that are

connected to state u by transition gives one:

– The transition probabilities from a state (except the end state) always add up to 1.

Page 40: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Determining the transition probabilities

Page 41: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Determining the transition probabilities – mu,v: number of transitions from state u to state v

– Transition probabilities t(u,v) can be estimated:

Page 42: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Determining the emission probabilities Emission probabilities in a match or insert state

also adds up to 1.

Page 43: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Determining the emission probabilities eMu: emission probability for a residue from the u

th match state,

Page 44: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Determining the emission probabilities eIu: emission probability for a residue from the u

th insert state,

The probabilities are usually taken from the overal amino acid comparison of a selected data set.

Page 45: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

HMMs examples:

Page 46: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

HMMs examples:

Page 47: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

HMMs examples:

5 transitions in gap region:• C out, •G out• AC, •CT, •T out• Out transition 3/5• Stay transition 2/5

gap region

Page 48: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMM– Given a profile HMM, any given path through

the model will emit a sequence with an associated probability,

– The path probability is the product of all transition and emission probabilities along the path.

Page 49: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm:

– Given a query sequence we can compute the most probable path that will emit that query sequence.

Page 50: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm:

– Another interesting question: What is the probability that a given sequence can be generated by the hidden Markov model

– Solution:Calculated by summing over all possible path giving rise to a given sequence

Page 51: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm:

– Will be applied to profiles HMM similar to:

Page 52: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm:

– At a profile position u:Mu : Match state,

Iu : Insertion state,

Du : Delete state,

t(Mu, Mu+1) : transition probability from state Mu to Mu+1

Page 53: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm:

– During the algorithm:A record must be kept of the highest

probability up to that point in the model and for a given amount of emited sequences.

Page 54: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm:

– During the algorithm:When the sequence up to and

including residue xi has been emited, the highest probability will be written as VDu(xi)

Page 55: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm:

– Dynamic Programming:

COLUMNS: NUMBER OF STATES IN HMM

ROWS:LENGTH OF THE QUERY OR EMITTED SEQUENCE

Page 56: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm:

– Recurrence relations:

Page 57: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm:

– Recurrence relations (log):

Page 58: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm (EXAMPLE):

Page 59: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm (EXAMPLE):

Emission Probabilities:

Start Probabilities:

Page 60: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm (EXAMPLE):

Page 61: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm (EXAMPLE):

GGCTGATT

Page 62: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

Profile Hidden Markov Models

Scoring a sequence against a profile HMMViterbi algorithm (EXAMPLE):

G G C T GA T TBeginM1M2I3I3I3I3I3M3End

Page 63: OUTLINE Profiles and Sequence Logos Profile Hidden Markov Models Aligning Profiles Multiple Sequence Alignments by Gradual Sequence Adition Other Ways

References

M. Zvelebil, J. O. Baum, “Understanding Bioinformatics”, 2008, Garland Science

Andreas D. Baxevanis, B.F. Francis Ouellette, “Bioinformatics: A practical guide to the analysis of genes and proteins”, 2001, Wiley.

Barbara Resch, “Hidden Markov Models - A Tutorial for the Course Computational Intelligence”, 2010.