Hidden Markov Models for biological sequence analysis

1 Hidden Markov Models for biological sequence analysis Master in Bioinformatics UPF Eduardo Eyras Computational Genomics Pompeu Fabra University - ICREA Barcelona, Spain

2 Example: CpG Islands Wherever a CG dinucleotide (CpG) occurs in the genome, the C tends to be chemically modified by methylation, and methylated C is then likely to mutate into a T. As a result, CpG dinucleotides are less frequent than expected (less than 1/16). This methylation is suppressed in some regions (e.g. promoter regions of genes), which therefore have a high CpG content. These regions are called CpG islands. CpG islands are of variable length, from hundreds to thousands of base pairs. We would like to answer the following questions: given a DNA sequence, is it part of a CpG island? Given a large DNA region, can we find CpG islands in it?

3 Example: CpG Islands Let a⁺_st be the transition probability between two adjacent positions inside CpG islands, and a⁻_st the transition probability between two adjacent positions outside CpG islands (e.g. a⁺_CG = P(G|C) in the + model). Given a sequence S, the log-likelihood ratio is: σ(S) = log [ P(S|+) / P(S|−) ] = Σ_{i=1}^{N} log ( a⁺_{s_{i−1} s_i} / a⁻_{s_{i−1} s_i} ). The larger the value of σ, the more likely the sequence is to be a CpG island.
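As an illustration (not part of the original slides), a minimal Python sketch of this log-odds score, assuming two dinucleotide transition tables; the numerical values below are placeholders rather than the estimates used in the course:

```python
import math

# Placeholder transition probabilities a+[x][y] (inside CpG islands) and
# a-[x][y] (outside); real values would be estimated from annotated data.
A_PLUS = {
    "A": {"A": 0.18, "C": 0.27, "G": 0.43, "T": 0.12},
    "C": {"A": 0.17, "C": 0.37, "G": 0.27, "T": 0.19},
    "G": {"A": 0.16, "C": 0.34, "G": 0.38, "T": 0.12},
    "T": {"A": 0.08, "C": 0.36, "G": 0.38, "T": 0.18},
}
A_MINUS = {
    "A": {"A": 0.30, "C": 0.21, "G": 0.28, "T": 0.21},
    "C": {"A": 0.32, "C": 0.30, "G": 0.08, "T": 0.30},
    "G": {"A": 0.25, "C": 0.24, "G": 0.30, "T": 0.21},
    "T": {"A": 0.18, "C": 0.24, "G": 0.29, "T": 0.29},
}

def cpg_log_odds(seq):
    """sigma(S) = sum_i log( a+_{s_{i-1} s_i} / a-_{s_{i-1} s_i} )."""
    return sum(
        math.log(A_PLUS[x][y] / A_MINUS[x][y])
        for x, y in zip(seq, seq[1:])
    )

print(cpg_log_odds("CGCGCGTACGCG"))   # positive score: CpG-island-like
print(cpg_log_odds("ATTTAATTAAAT"))   # negative score: background-like
```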

4 Example: CpG Islands Given a large stretch of DNA of length L, we extract windows of l nucleotides (with l << L): S^(k) = (s_{k+1}, ..., s_{k+l}), for 1 ≤ k ≤ L − l. For each window we calculate the score σ(S^(k)) = log [ P(S^(k)|+) / P(S^(k)|−) ] = Σ_i log ( a⁺_{s_{i−1} s_i} / a⁻_{s_{i−1} s_i} ). Windows with σ(S^(k)) > 0 (or above a set threshold) are possible CpG islands. Limitation: the Markov chain model cannot determine the lengths of the CpG islands.
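A sketch of the window scan described above; the scoring function is passed in as a parameter (for example the cpg_log_odds sketch shown earlier), and the toy scorer used in the usage example is purely hypothetical:

```python
def scan_windows(seq, window, score_fn, threshold=0.0):
    """Score every window of length `window`; report those above threshold.

    `score_fn` is any per-window scorer, e.g. the cpg_log_odds sketch above.
    Returns a list of (start_position, score) pairs (0-based starts).
    """
    hits = []
    for k in range(len(seq) - window + 1):
        s = score_fn(seq[k:k + window])
        if s > threshold:
            hits.append((k, s))
    return hits

# Usage example with a trivial scorer (fraction of CG dinucleotides), just
# to keep the sketch self-contained:
toy_score = lambda w: w.count("CG") / max(len(w) - 1, 1) - 0.1
print(scan_windows("ATATCGCGCGCGATAT", window=8, score_fn=toy_score))
```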

5 Example: CpG Islands (Diagram: two four-state Markov chains, one with states A+, C+, G+, T+ for CpG islands and one with states A−, C−, G−, T− for the background.) Can we build a single model for the entire sequence that incorporates both Markov chains?

6 Example: CpG Islands Include the probability of switching from one model to the other. (Diagram: the eight states A+, C+, G+, T+ and A−, C−, G−, T− with transitions added between the + and − sets.) For every symbol, in addition to the transitions to symbols of the same type (state), we also consider the transitions to symbols of the other type (state). However, this approach requires the estimation of many probabilities.

7 Example: CpG Islands Simplify by defining a single probability of switching from one model to the other. (Diagram: the + and − chains grouped into two blocks with transitions between them.) We summarize the transitions between + and − symbols into single transition probabilities. The change between the + and − states determines the limits of the CpG island.

8 Example: CpG Islands We can extend the model with all possible transition and emission probabilities. Transition probabilities between states: P(+|+) = a_{++}, P(−|−) = a_{−−}, P(−|+) = a_{+−}, P(+|−) = a_{−+}, and from the begin state P(+|begin) = a_{0+}, P(−|begin) = a_{0−}. Emission probabilities in each state: for the first symbol, P(a|+) = e_+(a) and P(a|−) = e_−(a) (CpG and non-CpG states); for a pair of symbols at any position beyond the first one, P(ab|+) = e_+(ab) and P(ab|−) = e_−(ab).

9 Example: CpG Islands In fact, we can redraw the CpG island model in the following way: each state is a Markov chain model; we have transitions between states, and each state can generate symbols.

10 Example: The coin tossing problem A simpler version of the CpG island problem. Game: flip coins, which results in two possible outcomes: Head (H) or Tail (T). Two coins: Fair coin (F): Head or Tail with the same probability, 1/2: P(H|F) = e_F(H) = 0.5, P(T|F) = e_F(T) = 0.5. Loaded coin (L): Head with probability 3/4 and Tail with probability 1/4: P(H|L) = e_L(H) = 0.75, P(T|L) = e_L(T) = 0.25.

11 Example: The coin tossing problem HTTHHHHHHTTHTHHTHTTTHTHTHHTTHHTHHHHHHTTTTHTHHTHT Which coin produced each observed symbol? Fair coin (F): Head or Tail with the same probability, 1/2: P(H|F) = e_F(H) = 0.5, P(T|F) = e_F(T) = 0.5. Loaded coin (L): Head with probability 3/4 and Tail with probability 1/4: P(H|L) = e_L(H) = 0.75, P(T|L) = e_L(T) = 0.25.

12 Example: The coin tossing problem The crooked dealer switches between the fair and loaded coins 10% of the time (90% of the time the same coin is used). We can define transition probabilities between the coin states: P(F|F) = a_FF = 0.9, P(L|L) = a_LL = 0.9, P(L|F) = a_FL = 0.1, P(F|L) = a_LF = 0.1.

13 Example: The coin tossing problem We can represent this model graphically. (Diagram: two states F and L with self-transitions a_FF = 0.9 and a_LL = 0.9, cross-transitions a_FL = a_LF = 0.1, and emissions e_F(H) = 0.5, e_F(T) = 0.5, e_L(H) = 0.75, e_L(T) = 0.25.) We have two states: fair and loaded. Each state has emission probabilities (head or tail). There are transition probabilities between states (changing the coin).
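For later reference, the crooked-dealer model can be written down directly as plain Python dictionaries; this is just one possible encoding (not any library's API), and the begin-state probabilities (0.5/0.5) are an assumption, since the slide does not specify them:

```python
# States, transition and emission probabilities of the crooked-dealer HMM
# as given on the slide ('0' is the silent begin state).
STATES = ["F", "L"]

TRANSITIONS = {                 # a_kl = P(next state l | current state k)
    "0": {"F": 0.5, "L": 0.5},  # begin-state probabilities (assumed uniform)
    "F": {"F": 0.9, "L": 0.1},
    "L": {"F": 0.1, "L": 0.9},
}

EMISSIONS = {                   # e_k(b) = P(symbol b | state k)
    "F": {"H": 0.5,  "T": 0.5},
    "L": {"H": 0.75, "T": 0.25},
}
```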

14 HMM definition A Hidden Markov Model (HMM) is a triplet M = (Σ, Q, Θ), where: Σ is an alphabet of symbols; Q is a set of states that can emit symbols from the alphabet; Θ is a set of probabilities, namely the transition probabilities a_kl between states k, l ∈ Q and the emission probabilities e_k(b) for k ∈ Q and b ∈ Σ.

15 HMM example (Diagram of an example HMM with numbered states.) a_{π_i, π_{i+1}} is the probability of a transition between states π_i and π_{i+1}, and e_{π_i}(s_i) is the probability that s_i was emitted from state π_i. For example, a_13 is the probability of a transition from state 1 to state 3, and e_2(A) is the probability of emitting character A in state 2. There is no emission at the begin state: the begin/end states are silent states on the path and do not emit symbols.

16 HMM definition A path in the model M is a sequence of states Π = (π_1, ..., π_N). We add two silent states, initial π_0 and final π_{N+1}. Given a sequence S = s_1 s_2 s_3 ... s_N of symbols, the transition probabilities between states k and l, and the emission probabilities of character b at state k, are written as a_kl = P(π_i = l | π_{i−1} = k) = P(π_i | π_{i−1}) and e_k(b) = P(s_i = b | π_i = k) = P(s_i | π_i). The probability that sequence S is generated by model M given the path is: P(S|Π) = P(s_1...s_N | π_1...π_N) = P(π_1|π_0) P(s_1|π_1) P(π_2|π_1) P(s_2|π_2) ... P(π_N|π_{N−1}) P(s_N|π_N) P(π_{N+1}|π_N) = a_{π_0,π_1} Π_{i=1}^{N} e_{π_i}(s_i) a_{π_i,π_{i+1}}.
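A minimal sketch of this joint probability P(S, Π), using the dictionary encoding of the coin example above; the function name and conventions are ours, not part of any library:

```python
import math

def path_probability(symbols, path, transitions, emissions,
                     begin="0", end=None):
    """P(S, Pi) = a_{begin,pi_1} * prod_i e_{pi_i}(s_i) * a_{pi_i,pi_{i+1}}.

    `symbols` and `path` have the same length; `path` excludes the silent
    begin/end states. The product is accumulated in log space for safety.
    """
    assert len(symbols) == len(path)
    logp = math.log(transitions[begin][path[0]])
    for i, (s, k) in enumerate(zip(symbols, path)):
        logp += math.log(emissions[k][s])
        if i + 1 < len(path):
            logp += math.log(transitions[k][path[i + 1]])
    if end is not None:                    # optional end-state transition
        logp += math.log(transitions[path[-1]][end])
    return math.exp(logp)

# Usage with the coin model defined earlier:
# path_probability("HTH", "FFL", TRANSITIONS, EMISSIONS)
```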

17 Why Hidden? In Markov chains, each state accounts for exactly one part of the sequence. In HMMs we distinguish between the observable parts of the problem (emitted characters) and the hidden parts (states): in a Hidden Markov Model, there are multiple states that could account for the same part of the sequence, so the actual state is hidden. That is, there is no one-to-one correspondence between states and observations.

18 HMM example Π = 0113, S = AAC. There is a probability associated with this sequence, given by the transition and emission probabilities: P(S|Π) = a_{π_0,π_1} Π_{i=1}^{L} e_{π_i}(s_i) a_{π_i,π_{i+1}}, so P(AAC|Π) = a_01 · e_1(A) · a_11 · e_1(A) · a_13 · e_3(C) · a_35.

19 Why Hidden? E.g. the toss of a coin is an observable; the coin being used is the state, which we cannot observe. HTTHHHHHHTTHTHHTHTTTHTHTHHTTHHTHHHHHHTTTTHTHHTHT Which coin produced each observed symbol? Fair coin (F): P(H|F) = e_F(H) = 0.5, P(T|F) = e_F(T) = 0.5. Loaded coin (L): P(H|L) = e_L(H) = 0.75, P(T|L) = e_L(T) = 0.25. E.g. given a biological sequence, interesting features are hidden (genes, splice sites, etc.). We will have to calculate the probability of being in a particular state.

20 HMMs are memory-less At each time (or position) step, the only thing that affects future (or downstream) states and emissions is the current state: P(π_i = l | "whatever happened so far") = P(π_i = l | π_1, π_2, ..., π_{i−1}, s_1, s_2, ..., s_{i−1}) = P(π_i = l | π_{i−1} = k) = a_kl. At each time/position, the emission of a character depends only on the current state: P(s_i = b | "whatever happened so far") = P(s_i = b | π_1, π_2, ..., π_i, s_1, s_2, ..., s_{i−1}) = P(s_i = b | π_i = k) = e_k(b).

21 HMMs can be seen as generative models We can sample paths of states (e.g. the orange path in the diagram). This orange path in the HMM generates a sequence: Π = 0113, S = AAC. Generate paths using the transition probabilities; emit symbols using the emission probabilities.
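A sketch of this generative view: sampling a path with the transition probabilities and emitting symbols with the emission probabilities. The helper below assumes the dictionary encoding used in the earlier sketches:

```python
import random

def sample_hmm(transitions, emissions, length, begin="0", seed=None):
    """Sample a state path and an emitted sequence of the given length."""
    rng = random.Random(seed)
    choose = lambda dist: rng.choices(list(dist),
                                      weights=list(dist.values()))[0]
    path, symbols = [], []
    state = choose(transitions[begin])       # first transition out of begin
    for _ in range(length):
        path.append(state)
        symbols.append(choose(emissions[state]))   # emit from current state
        state = choose(transitions[state])         # move to the next state
    return "".join(path), "".join(symbols)

# Usage, e.g. sample_hmm(TRANSITIONS, EMISSIONS, 20, seed=1)
# returns a (state path, emitted symbols) pair.
```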

22 HMM questions However: Π = 0113, S = AAC. This is not necessarily the most probable sequence of states. This is also not the total probability of the sequence of symbols, since other combinations of states can give the same sequence of symbols.

23 HMM questions The LEARNING problem. Goal: find the transition and emission probabilities of the model (learn the model). The EVALUATION problem. What is the probability that this sequence has been generated by the crooked dealer? The DECODING problem. We must find the underlying sequence of states that led to the observed sequence of results: we must label the results by their state. The POSTERIOR DECODING problem. Goal: find the probability that the dealer was using a biased coin at a particular time.

24 HMM questions (algorithms) The LEARNING problem: What are the parameters of the model? Maximum likelihood estimation, Baum-Welch, EM. The EVALUATION problem: How likely is a given sequence? The Forward algorithm. The DECODING problem: What is the most probable path for generating a given sequence? The Viterbi algorithm. The POSTERIOR DECODING problem: Which is the most likely state at a given position? The Forward-Backward algorithm.

25 HMM questions (algorithms) The LEARNING problem: What are the parameters of the model? Maximum likelihood estimation, Baum-Welch, EM. The EVALUATION problem: How likely is a given sequence? The Forward algorithm. The DECODING problem: What is the most probable path for generating a given sequence? The Viterbi algorithm. The POSTERIOR DECODING problem: Which is the most likely state at a given position? The Forward-Backward algorithm.

26 HMM questions The LEARNING problem Given a series of labelled sequences, e.g. HTTHHHHHHTTHTHHTHTTTHTHTHHTTHHTHHHHHHTTTTHTHHTHT with segments labelled loaded / fair / loaded / fair, the goal is to find the transition and emission probabilities of the model: learn the model. (Diagram: the two-state F/L model with a_FF = 0.9, a_LL = 0.9, a_FL = a_LF = 0.1, e_F(H) = e_F(T) = 0.5, e_L(H) = 0.75, e_L(T) = 0.25.)

27 HMM Parameter estimation The simplest way to obtain the parameters of the model is from a set of labelled observations, e.g. a number of sequences for which we know the actual sequence of states. Let A_kl be the number of observed transitions between states k and l, and E_k(b) the number of times the symbol b is emitted by state k. We can then estimate the probabilities as follows: a_kl = A_kl / Σ_{l'} A_{kl'} and e_k(b) = E_k(b) / Σ_{b'} E_k(b').

28 HMM Parameter estimation This estimation is called maximum likelihood, since the probability of all the training sequences given the parameters, P(S^1 ... S^n | θ) = Π_{j=1}^{n} P(S^j | θ), is maximal for parameters estimated from frequencies, assuming that the sequences are independent. To avoid overfitting, use pseudocounts: A_kl → A_kl + r_kl and E_k(b) → E_k(b) + r_k(b). Pseudocounts reflect our prior knowledge.
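A sketch of maximum-likelihood estimation with pseudocounts from labelled (sequence, state-path) pairs. For simplicity the pseudocount is only added to combinations that appear in the data; in practice one would add it over the full alphabet and state set:

```python
from collections import defaultdict

def estimate_parameters(labelled, pseudocount=1.0):
    """Maximum-likelihood HMM parameters from labelled data.

    `labelled` is a list of (symbols, states) pairs of equal length, e.g.
    [("HTTH", "FFLL"), ...]. A constant pseudocount is added to every
    observed transition/emission combination to avoid zero probabilities.
    """
    A = defaultdict(lambda: defaultdict(float))   # A[k][l]: k -> l counts
    E = defaultdict(lambda: defaultdict(float))   # E[k][b]: b emitted in k
    for symbols, states in labelled:
        for s, k in zip(symbols, states):
            E[k][s] += 1
        for k, l in zip(states, states[1:]):
            A[k][l] += 1

    def normalise(counts):
        probs = {}
        for k, row in counts.items():
            total = sum(row.values()) + pseudocount * len(row)
            probs[k] = {x: (c + pseudocount) / total for x, c in row.items()}
        return probs

    return normalise(A), normalise(E)

# Usage: a_kl, e_kb = estimate_parameters([("HTTHHH", "FFFLLL"),
#                                          ("THTH", "FFFF")])
```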

29 HMM questions The LEARNING problem: What are the parameters of the model? Maximum likelihood estimation, Baum-Welch, EM. The EVALUATION problem: How likely is a given sequence? The Forward algorithm. The DECODING problem: What is the most probable path for generating a given sequence? The Viterbi algorithm. The POSTERIOR DECODING problem: Which is the most likely state at a given position? The Forward-Backward algorithm.

30 The DECODING problem Given a sequence of heads (H) and tails (T): HTTHHHHHHTTHTHHTHTTTHTHTHHTTHHTHHHHHHTTTTHTHHTHT Is it possible to reconstruct the coin types the dealer has used? Goal: determine which coin produced each outcome. We must find the underlying sequence of states that led to the observed sequence of results: we must label the results by their state. For instance: Observations: HTTHHHHHHTTHTHHTHTTTHTHTHHTTHHTHHHHHHTTTTHTHHTHT States: LLLLLLLLLLLFFFFFFFFFFFFFFFFFFFFLLLLLLLLFFFFFFFFF (loaded / fair / loaded / fair). Each label assignment has an associated probability.

31 The DECODING problem Finding the most probable path: Viterbi algorithm We want to find the optimal path Π* that maximizes the probability of the observed sequence, i.e. such that P(S|Π*) is maximal. We can express this as: Π* = argmax_Π { P(S|Π) }. We solve this problem using dynamic programming (maximization over all possible paths).

32 The DECODING problem Finding the most probable path: Viterbi algorithm We define a dynamic programming table of size |Q| × L, where |Q| is the number of states and L is the length of the sequence. Each element v_k(i), k = 1, ..., |Q|, i = 1, ..., L, represents the probability of the most probable path for the prefix s_1 s_2 ... s_i (up to character s_i) ending in state k: v_k(i) = max_{π_1,...,π_{i−1}} P(s_1...s_{i−1}, π_1...π_{i−1}, s_i, π_i = k). We want to calculate v_end(L), the probability of the most probable path accounting for the whole sequence and ending in the end state.

33 The DECODING problem Finding the most probable path: Viterbi algorithm We can define this probability recursively:
v_k(i+1) = max_{π_1,...,π_i} P(s_1...s_i, π_1,...,π_i, s_{i+1}, π_{i+1} = k)
= max_{π_1,...,π_i} [ P(s_{i+1}, π_{i+1} = k | s_1...s_i, π_1...π_i) P(s_1...s_i, π_1...π_i) ]   (conditional probability)
= max_{π_1,...,π_i} [ P(s_{i+1}, π_{i+1} = k | π_i) P(s_1...s_{i−1}, π_1...π_{i−1}, s_i, π_i) ]   (Markov property)
= max_l [ P(s_{i+1}, π_{i+1} = k | π_i = l) max_{π_1,...,π_{i−1}} P(s_1...s_{i−1}, π_1...π_{i−1}, s_i, π_i = l) ]   (optimal substructure property; definition of v_l(i))
= max_l [ P(s_{i+1} | π_{i+1} = k) P(π_{i+1} = k | π_i = l) v_l(i) ]   (definitions of the emission and transition probabilities)
= e_k(s_{i+1}) max_l [ v_l(i) a_lk ]

39 The DECODING problem Finding the most probable path: Viterbi algorithm 1. Initialization: v_begin(0) = 1; v_k(0) = 0 for k ≠ begin. 2. Recursion: for positions i = 0, ..., L−1 and states l ∈ Q: v_l(i+1) = e_l(s_{i+1}) · max_{k∈Q} { v_k(i) a_kl }. 3. Termination: P(S|Π*) = max_{k∈Q} { v_k(L) a_{k,end} }.

40 The DECODING problem Finding the most probable path: Viterbi algorithm (keep pointers for back-tracking of states) 1. Initialization: v_begin(0) = 1; v_k(0) = 0 for k ≠ begin. 2. Recursion: for positions i = 0, ..., L−1 and states l ∈ Q: v_l(i+1) = e_l(s_{i+1}) · max_{k∈Q} { v_k(i) a_kl }, ptr_{i+1}(l) = argmax_{k∈Q} { v_k(i) a_kl } (keep track of the most probable path). 3. Termination: P(S|Π*) = max_{k∈Q} { v_k(L) a_{k,end} }, π*_L = argmax_{k∈Q} { v_k(L) a_{k,end} }. Find the optimal path by following the pointers back, starting at π*_L, using the recursion π*_i = ptr_{i+1}(π*_{i+1}).

41 The DECODING problem Finding the most probable path: Viterbi algorithm Use logarithms to avoid underflow problems (v now stores log probabilities): 1. Initialization: v_begin(0) = 0; v_k(0) = −∞ for k ≠ begin. 2. Recursion: for positions i = 0, ..., L−1 and states l ∈ Q: v_l(i+1) = log( e_l(s_{i+1}) ) + max_{k∈Q} { v_k(i) + log(a_kl) }, ptr_{i+1}(l) = argmax_{k∈Q} { v_k(i) + log(a_kl) }. 3. Termination: log P(S, Π*) = max_{k∈Q} { v_k(L) + log(a_{k,end}) }, π*_L = argmax_{k∈Q} { v_k(L) + log(a_{k,end}) }. Find the optimal path using the same recursion π*_i = ptr_{i+1}(π*_{i+1}).
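A sketch of the log-space Viterbi algorithm with back-pointers, following the recursion above; the dictionary conventions are those of the earlier sketches, with missing transitions or emissions treated as probability 0 (−∞ in log space):

```python
import math

NEG_INF = float("-inf")

def viterbi(symbols, states, transitions, emissions, begin="0", end=None):
    """Most probable state path and its log probability (log-space Viterbi).

    `transitions[k][l]` and `emissions[k][b]` follow the dictionary
    conventions of the earlier sketches; missing entries mean probability 0.
    If `end` is None the model has no explicit end state.
    """
    log = lambda p: math.log(p) if p > 0 else NEG_INF
    t = lambda k, l: log(transitions.get(k, {}).get(l, 0.0))
    e = lambda k, b: log(emissions.get(k, {}).get(b, 0.0))

    # Initialisation: v_k(1) = log e_k(s_1) + log a_{begin,k}
    V = [{k: e(k, symbols[0]) + t(begin, k) for k in states}]
    ptr = [{}]
    for i in range(1, len(symbols)):
        V.append({})
        ptr.append({})
        for l in states:
            best_k = max(states, key=lambda k: V[i - 1][k] + t(k, l))
            V[i][l] = e(l, symbols[i]) + V[i - 1][best_k] + t(best_k, l)
            ptr[i][l] = best_k
    end_term = (lambda k: t(k, end)) if end is not None else (lambda k: 0.0)
    last = max(states, key=lambda k: V[-1][k] + end_term(k))
    best_logp = V[-1][last] + end_term(last)
    path = [last]
    for i in range(len(symbols) - 1, 0, -1):   # follow the pointers back
        path.append(ptr[i][path[-1]])
    return "".join(reversed(path)), best_logp

# Usage: viterbi("HTTHHHHHHTTHTHHTH", ["F", "L"], TRANSITIONS, EMISSIONS)
```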

42 Exercise

43 Viterbi algorithm - Example From a DNA sequence, we want to predict whether a region is likely to be exonic or intronic. We do this by labelling the observed bases with a number of states (E: exon, I: intron), e.g.: AGGGCTAGGGTGTCTGGTGCAACTCAGCAGCTCTGTACCGTCCCTAGCCTAAGTATGTTCG EEEEEEEEEEEIIIIIIIIIIIIIIIIIIIEEEEEEEEEIIIIIIIIIIIIIIEEEEEEEE Given a set of transition probabilities between states and emission probabilities for the symbols (nucleotides), we want to determine the most probable labelling of an input sequence (the most probable path/parse).

44 Viterbi algorithm - Example Simple HMM for 5' splice-site recognition (Eddy 2004). This model describes sequences where the initial part is exonic (homogeneous composition), then there is a nucleotide that forms the splice site (mostly G), and the rest of the sequence belongs to the intron (rich in A and T). Example training sequences: ACATCGTAGTGA CGACTGTAATGT GCGTAGAAGTAT TACGGGTAAATA The parameters are in general derived from a training (labelled) set. The problem consists in finding the 5' splice site (donor) in an unknown DNA sequence.

45 Viterbi algorithm - Example Given the path Π = E E D I I (E: exon, D: donor, I: intron), the probability of observing the sequence S = C A G T A is: P(S|Π) = a_BE × e_E(C) × a_EE × e_E(A) × a_ED × e_D(G) × a_DI × e_I(T) × a_II × e_I(A) = 1.0 × 0.25 × 0.9 × 0.25 × 0.1 × 0.95 × 1.0 × 0.4 × 0.9 × 0.4. For the other path Π = E D I I I: P(S|Π) = a_BE × e_E(C) × a_ED × e_D(A) × a_DI × e_I(G) × a_II × e_I(T) × a_II × e_I(A) = 1.0 × 0.25 × 0.1 × 0.05 × 1.0 × 0.1 × 0.9 × 0.4 × 0.9 × 0.4, which is smaller.
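The two products above can be checked with a couple of lines; the factors are exactly the ones quoted on the slide:

```python
from math import prod

# Factors as quoted on the slide for S = CAGTA (Eddy 2004 toy model)
path_EEDII = [1.0, 0.25, 0.9, 0.25, 0.1, 0.95, 1.0, 0.4, 0.9, 0.4]
path_EDIII = [1.0, 0.25, 0.1, 0.05, 1.0, 0.1, 0.9, 0.4, 0.9, 0.4]

print(prod(path_EEDII))   # ~7.7e-04
print(prod(path_EDIII))   # ~1.6e-05  -> indeed smaller than the first path
```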

46 Viterbi algorithm - Example The same model but using logarithms: the logs of the probabilities of the different sequences of states for the same DNA sequence.

47 Viterbi algorithm - Exercise Fill in the Viterbi dynamic programming matrix and obtain the most probable path. (Figure: the input sequence and the model states.)

51 Viterbi algorithm - Exercise For instance, using the log-form of the algorithm, for state I (intron, state 3) at position 8 (symbol T): v_I(8) = log e_I(s_8) + max_{k∈{D,I}} { v_k(7) + log a_kI } = log e_I(s_8) + max{ v_I(7) + log a_II ; v_D(7) + log a_DI }. The maximum is given by the transition from D to I: ptr_8(I) = D. We only need to consider the allowed transitions; transitions that are not allowed can be given probability 0 (−∞ in log form).

52 Exercise: fill in the Viterbi matrix and recover the optimal path of states using the 5' ss model in log scale and the following sequence:

Viterbi matrix:
x:       -  A  C  C  C  G  A  G  T  A  A
begin:
Exon:
donor:
intron:
end:

Optimal path of states:
i:
Π*:

53 Exercise (exam). Consider the hidden Markov model of the coin-tossing problem with start state S, start probabilities P(F|S) and P(L|S), transition probabilities P(F|F) = 3/4, P(L|F) = 1/4, P(F|L) = 1/4, P(L|L) = 3/4, and emission probabilities P(h|F) = 1/2, P(t|F) = 1/2, P(h|L) = 3/4, P(t|L) = 1/4, where S is the start (silent) state, and F and L are the fair and loaded coin states, respectively. Find a value (or condition) for P(F|S) such that the most probable path for two heads in a row, hh, is given by the fair coin, i.e. FF. Can you give an interpretation of this result?

54 Exercise (exam). Note: use the Viterbi algorithm. Recall that the Viterbi algorithm defines a dynamic programming matrix v_k(i), which represents the probability of the most probable path for the sequence prefix s_1 s_2 ... s_i ending at state k, for k = 1, ..., |Q| and i = 1, ..., L, where |Q| is the number of states in the model and L is the length of the sequence. Also recall that the Viterbi matrix is filled with the following algorithm. Initialization: v_start(0) = 1; v_k(0) = 0 for k ≠ start. Recursion: v_l(i+1) = e_l(s_{i+1}) max_{k∈Q} { v_k(i) a_kl }, ptr_{i+1}(l) = argmax_{k∈Q} { v_k(i) a_kl }. Finalization: P(S|Π*) = max_{k∈Q} { v_k(L) a_{k,end} }, π*_L = argmax_{k∈Q} { v_k(L) a_{k,end} }, where e_l(s_i) is the emission probability of the symbol s_i in state l and a_kl is the transition probability from state k to state l. The optimal path Π* is recovered using the recursion π*_i = ptr_{i+1}(π*_{i+1}).

55 HMM questions The LEARNING problem: What are the parameters of the model? Maximum likelihood estimation, Baum-Welch, EM. The EVALUATION problem: How likely is a given sequence? The Forward algorithm. The DECODING problem: What is the most probable path for generating a given sequence? The Viterbi algorithm. The POSTERIOR DECODING problem: Which is the most likely state at a given position? The Forward-Backward algorithm.

56 HMM questions The EVALUATION problem Given a sequence of heads (H) and tails (T): HTTHHHHHHTTHTHHTHTTTHTHTHHTTHHTHHHHHHTTTTHTHHTHT Assuming that we have modelled the coins correctly, what is the probability that this sequence has been generated by the crooked dealer? E.g. P(S) = 1.3 × 10^-35.

57 How likely is a sequence? But that is the probability of the sequence given a specific path. We want the probability of the sequence over all paths, i.e. P(S) = P(s_1...s_L) = Σ_Π P(s_1...s_L, π_0...π_N). But the number of possible paths can grow exponentially: there are N^L possible paths for N states and a sequence of length L. The Forward algorithm allows us to calculate this probability efficiently. This algorithm will also allow us to calculate the probability of being in a specific state k at a position i of the sequence.

58 The Forward algorithm We define the probability of generating the first i letters of S ending up in state k (the forward probability): f_k(i) = P(s_1...s_i, π_i = k). Then: f_k(i) = Σ_{π_1...π_{i−1}} P(s_1...s_{i−1}, π_1...π_{i−1}, π_i = k) e_k(s_i) = Σ_l Σ_{π_1...π_{i−2}} P(s_1...s_{i−1}, π_1...π_{i−2}, π_{i−1} = l) a_lk e_k(s_i) = Σ_l P(s_1...s_{i−1}, π_{i−1} = l) a_lk e_k(s_i) = e_k(s_i) Σ_l f_l(i−1) a_lk. We obtain a recursion formula for this probability, so we can apply a dynamic programming approach.

59 The Forward algorithm 1. Initialization: f_begin(0) = 1; f_k(0) = 0 for k ≠ begin (the probability that we are in the begin state and have observed zero characters of the sequence). 2. Recursion: for positions i = 0, ..., L−1 and states l ∈ Q: f_l(i+1) = e_l(s_{i+1}) Σ_{k∈Q} f_k(i) a_kl. 3. Termination: P(S) = Σ_{k∈Q} f_k(L) a_{k,end} (the probability that we are in the end state and have observed the entire sequence). The Forward algorithm is very similar to Viterbi; the only difference is that here we take a sum instead of a maximum.
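A sketch of the Forward algorithm as given above (sum instead of max); note that for long sequences one would work with scaled or log-space values to avoid underflow, which this sketch omits:

```python
def forward(symbols, states, transitions, emissions, begin="0", end=None):
    """Total probability P(S), summed over all paths (Forward algorithm)."""
    a = lambda k, l: transitions.get(k, {}).get(l, 0.0)
    e = lambda k, b: emissions.get(k, {}).get(b, 0.0)

    # f_k(1) = a_{begin,k} * e_k(s_1)
    f = {k: a(begin, k) * e(k, symbols[0]) for k in states}
    for s in symbols[1:]:
        # f_l(i+1) = e_l(s_{i+1}) * sum_k f_k(i) * a_kl
        f = {l: e(l, s) * sum(f[k] * a(k, l) for k in states)
             for l in states}
    if end is not None:
        return sum(f[k] * a(k, end) for k in states)
    return sum(f.values())

# Usage: forward("HTTH", ["F", "L"], TRANSITIONS, EMISSIONS)
```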

60 The Forward algorithm: Example S = TAGA

61 The Forward algorithm: Example

62 The Forward algorithm In some cases the algorithm can be made more efficient by taking into account the minimum number of steps needed to reach a state. E.g. for this model, we do not need to initialize or calculate the values f_3(0), f_4(0), f_5(0), f_5(1).

63 HMM questions The LEARNING problem: What are the parameters of the model? Maximum likelihood estimation, Baum-Welch, EM. The EVALUATION problem: How likely is a given sequence? The Forward algorithm. The DECODING problem: What is the most probable path for generating a given sequence? The Viterbi algorithm. The POSTERIOR DECODING problem: Which is the most likely state at a given position? The Forward-Backward algorithm.

64 HMM questions The POSTERIOR DECODING problem Given a sequence of heads and tails: HTTHHHHHHTTHTHHTHTTTHTHTHHTTHHTHHHHHHTTTTHTHHTHT Is it possible to catch the dealer cheating? Goal: find the probability that the dealer was using a biased coin at a particular time, e.g. P(Loaded) = 80% at a given position.

65 The posterior probability of a state (Backward algorithm) We want to calculate the most probable state given an observation s_i; for instance, we want to catch the coin dealer cheating. For this, we calculate the probability distribution over states at the i-th position, i.e. the probability that a given position in the sequence has a particular underlying state: P(π_i = k | S). We can compute this posterior probability as P(π_i = k | S) = P(S, π_i = k) / P(S).

66 The posterior probability of a state (Backward algorithm) We can compute the posterior probability as P(π_i = k | S) = P(S, π_i = k) / P(S). Consider the joint probability: P(S, π_i = k) = P(s_1...s_i, π_i = k) P(s_{i+1}...s_L | s_1...s_i, π_i = k) = P(s_1...s_i, π_i = k) P(s_{i+1}...s_L | π_i = k) = f_k(i) b_k(i), where f_k(i) is the forward probability (of generating the first i letters of S ending up in state k) and b_k(i) is the backward probability (of generating the rest of the sequence s_{i+1}...s_L, starting from a given state k and ending at the end state).

67 The posterior probability of a state (Backward algorithm) P(π_i = k | S) = P(S, π_i = k) / P(S), where the numerator is given by the Forward and Backward algorithms and the denominator P(S) by the Forward algorithm. The backward probability satisfies: b_k(i) = P(s_{i+1}...s_L | π_i = k) = Σ_{π_{i+1}...π_N} P(s_{i+1} s_{i+2}...s_L, π_{i+1}...π_N | π_i = k) = Σ_l Σ_{π_{i+2}...π_N} P(s_{i+1} s_{i+2}...s_L, π_{i+1} = l, π_{i+2}...π_N | π_i = k) = Σ_l e_l(s_{i+1}) a_kl Σ_{π_{i+2}...π_N} P(s_{i+2}...s_L, π_{i+2}...π_N | π_{i+1} = l) = Σ_l e_l(s_{i+1}) a_kl b_l(i+1). We can again apply dynamic programming to calculate the backward probabilities.

68 The posterior probability of a state (Backward algorithm) 1. Initialization (i = L): b_k(L) = a_{k,end} for all k ∈ Q. 2. Recursion: for positions i = L−1, ..., 1 and states k ∈ Q: b_k(i) = Σ_{l∈Q} a_kl e_l(s_{i+1}) b_l(i+1). 3. Termination: P(S) = Σ_{l∈Q} a_{begin,l} e_l(s_1) b_l(1) (not usually necessary, as we normally use the Forward algorithm to derive P(S)).

69 The posterior probability of a state (Backward algorithm) Combining the two: P(π_i = k, S) = f_k(i) b_k(i), and since P(π_i = k, S) = P(π_i = k | S) P(S), the posterior probability is P(π_i = k | S) = f_k(i) b_k(i) / P(S).
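A sketch combining the Forward and Backward recursions to obtain the posterior P(π_i = k | S) at every position; as with the Forward sketch, no scaling is applied, so this is only suitable for short sequences:

```python
def forward_table(symbols, states, transitions, emissions, begin="0"):
    """Forward values f_k(i) for every position (no scaling applied)."""
    a = lambda k, l: transitions.get(k, {}).get(l, 0.0)
    e = lambda k, b: emissions.get(k, {}).get(b, 0.0)
    F = [{k: a(begin, k) * e(k, symbols[0]) for k in states}]
    for s in symbols[1:]:
        F.append({l: e(l, s) * sum(F[-1][k] * a(k, l) for k in states)
                  for l in states})
    return F

def posterior(symbols, states, transitions, emissions, begin="0", end=None):
    """Posterior P(pi_i = k | S) = f_k(i) b_k(i) / P(S) for every position."""
    a = lambda k, l: transitions.get(k, {}).get(l, 0.0)
    e = lambda k, b: emissions.get(k, {}).get(b, 0.0)
    F = forward_table(symbols, states, transitions, emissions, begin)
    # Backward initialisation: b_k(L) = a_{k,end} (or 1 with no end state).
    B = [None] * len(symbols)
    B[-1] = {k: (a(k, end) if end is not None else 1.0) for k in states}
    for i in range(len(symbols) - 2, -1, -1):
        B[i] = {k: sum(a(k, l) * e(l, symbols[i + 1]) * B[i + 1][l]
                       for l in states)
                for k in states}
    pS = sum(F[-1][k] * B[-1][k] for k in states)   # = P(S)
    return [{k: F[i][k] * B[i][k] / pS for k in states}
            for i in range(len(symbols))]

# Usage: posterior("HTTHHHHHH", ["F", "L"], TRANSITIONS, EMISSIONS)
```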

70 Posterior decoding We can ask, at each position, what is the most likely state that produced that character in the sequence (i.e. the probability that a symbol is produced by a given state, given the full sequence). E.g. we want to know the probability for each G of being emitted by the state D (donor site). (Figure: comparison of the Viterbi path with the Forward-Backward posterior probabilities.)

71 Posterior decoding If P(fair) > 0.5, the roll is more likely to be generated by a fair die than a loaded die

72 References What is a hidden Markov model? Sean R. Eddy (2004) Nature Biotechnology 22(10):1315. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison. Cambridge University Press, 1999. Problems and Solutions in Biological Sequence Analysis. Mark Borodovsky, Svetlana Ekisheva. Cambridge University Press, 2006.
