HMM : Viterbi algorithm - a toy example


HMM : Viterbi algorithm - a toy example. Let's consider the following simple HMM. This model is composed of 2 states, H (high GC content) and L (low GC content). We can for example consider that state H characterizes coding DNA while L characterizes non-coding DNA. The model can then be used to predict the regions of coding DNA in a given sequence. The parameters of the model are: initial probabilities p(H) = 0.5, p(L) = 0.5; transition probabilities H->H = 0.5, H->L = 0.5, L->H = 0.4, L->L = 0.6; emission probabilities in state H: A = 0.2, C = 0.3, G = 0.3, T = 0.2; in state L: A = 0.3, C = 0.2, G = 0.2, T = 0.3. Sources: for the theory, see Durbin et al. (1998); for the example, see Borodovsky & Ekisheva (2006), pp 80-81.
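As a minimal illustration (not part of the original slides), these parameters can be written down in Python; the variable names below (states, init_p, trans_p, emit_p) are arbitrary choices reused in the sketches that follow.

# Toy HMM of Borodovsky & Ekisheva (2006): two states, H (high GC) and L (low GC)
states = ("H", "L")

# Initial state probabilities
init_p = {"H": 0.5, "L": 0.5}

# Transition probabilities, trans_p[previous_state][next_state]
trans_p = {"H": {"H": 0.5, "L": 0.5},
           "L": {"H": 0.4, "L": 0.6}}

# Emission probabilities, emit_p[state][nucleotide]
emit_p = {"H": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
          "L": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}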

HMM : Viterbi algorithm - a toy example. Consider the sequence S = GGCACTGAA. There are several paths through the hidden states (H and L) that lead to the given sequence S. Example: a path P beginning with the states L, L, H, ... The probability that the HMM produces sequence S through the path P is: p = p(L) * e_L(G) * a_LL * e_L(G) * a_LH * e_H(C) * ... = 0.5 * 0.2 * 0.6 * 0.2 * 0.4 * 0.3 * ... = ...
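A short sketch of this calculation in Python (assuming the init_p, trans_p and emit_p dictionaries defined above; the function name path_probability is an arbitrary choice): it multiplies the initial probability, then one transition and one emission per position along the chosen path.

def path_probability(seq, path, init_p, trans_p, emit_p):
    # Joint probability that the HMM follows the given state path and emits seq
    p = init_p[path[0]] * emit_p[path[0]][seq[0]]
    for x in range(1, len(seq)):
        p *= trans_p[path[x - 1]][path[x]] * emit_p[path[x]][seq[x]]
    return p

# Any state path of the same length as the sequence can be evaluated, e.g.
# path_probability("GGCACTGAA", "HHHLLLLLL", init_p, trans_p, emit_p)  # ~4.25e-08 (the Viterbi path found below)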

HMM : Viterbi algorithm - a toy example. GGCACTGAA. There are several paths through the hidden states (H and L) that lead to the given sequence, but they do not have the same probability. The Viterbi algorithm is a dynamic programming algorithm that allows us to compute the most probable path. Its principle is similar to the DP algorithms used to align two sequences (e.g. Needleman-Wunsch). Source: Borodovsky & Ekisheva, 2006.

HMM : Viterbi algorithm - a toy example. Viterbi algorithm: principle. G G C A C T G A A. The probability of the most probable path ending at position x in state k with observation "i" is p_k(i, x) = e_k(i) * max over l of [ p_l(j, x-1) * a_lk ], where e_k(i) is the probability of observing element i in state k, p_l(j, x-1) is the probability of the most probable path ending at position x-1 in state l (with element j), and a_lk is the probability of the transition from state l to state k.

HMM : Viterbi algorithm - a toy example. Viterbi algorithm: principle. G G C A C T G A A. The probability of the most probable path ending in state k with observation "i" is p_k(i, x) = e_k(i) * max_l [ p_l(j, x-1) * a_lk ]. In our example, the probability of the most probable path ending in state H with observation "A" at the 4th position is: p_H(A, 4) = e_H(A) * max [ p_H(C, 3) * a_HH, p_L(C, 3) * a_LH ]. We can thus compute recursively (from the first to the last element of our sequence) the probability of the most probable path.
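One cell of this recursion can be sketched in Python (an illustrative helper, assuming the trans_p and emit_p dictionaries defined earlier; the name viterbi_step is hypothetical). It works in log2 space, as introduced on the next slide, and returns the best score together with the predecessor state.

from math import log2

def viterbi_step(prev_scores, symbol, state, trans_p, emit_p):
    # p_k(i, x) = e_k(i) + max_l [ p_l(x-1) + a_lk ], all in log2
    best_prev, best_score = max(
        ((l, prev_scores[l] + log2(trans_p[l][state])) for l in prev_scores),
        key=lambda t: t[1])
    return log2(emit_p[state][symbol]) + best_score, best_prev

# Example for the slide above, using the position-3 scores from the matrix shown below:
# viterbi_step({"H": -8.21, "L": -8.79}, "A", "H", trans_p, emit_p)  # -> (~-11.53, "H")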

HMM : Viterbi algorithm - a toy example. Model parameters in log2: initial probabilities: H = -1, L = -1; transition probabilities: H->H = -1, H->L = -1, L->H = -1.322, L->L = -0.737; emission probabilities in state H: A = -2.322, C = -1.737, G = -1.737, T = -2.322; in state L: A = -1.737, C = -2.322, G = -2.322, T = -1.737. Remark: for the calculations, it is convenient to use the log of the probabilities (rather than the probabilities themselves). Indeed, this allows us to compute sums instead of products, which is more efficient and avoids numerical underflow for long sequences. We used here log2(p).
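The log2 values above can be reproduced with a few lines of Python (assuming the init_p, trans_p and emit_p dictionaries from the first sketch; log_init, log_trans and log_emit are arbitrary names).

from math import log2

log_init = {s: log2(p) for s, p in init_p.items()}
log_trans = {s: {t: log2(p) for t, p in row.items()} for s, row in trans_p.items()}
log_emit = {s: {n: log2(p) for n, p in row.items()} for s, row in emit_p.items()}

# e.g. round(log_emit["H"]["A"], 3) == -2.322 and round(log_trans["L"]["L"], 3) == -0.737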

HMM : Viterbi algorithm - a toy example. GGCACTGAA. Probability (in log2) that G at the first position was emitted by state H: p_H(G, 1) = -1 - 1.737 = -2.737. Probability (in log2) that G at the first position was emitted by state L: p_L(G, 1) = -1 - 2.322 = -3.322.

HMM : Viterbi algorithm - a toy example. GGCACTGAA. Probability (in log2) that G at the 2nd position was emitted by state H: p_H(G, 2) = -1.737 + max(p_H(G, 1) + a_HH, p_L(G, 1) + a_LH) = -1.737 + max(-2.737 - 1, -3.322 - 1.322) = -5.474 (obtained from p_H(G, 1)). Probability (in log2) that G at the 2nd position was emitted by state L: p_L(G, 2) = -2.322 + max(p_H(G, 1) + a_HL, p_L(G, 1) + a_LL) = -2.322 + max(-2.737 - 1, -3.322 - 0.737) = -6.059 (obtained from p_H(G, 1)).

HMM : Viterbi algorithm - a toy example. GGCACTGAA. We then compute iteratively the probabilities p_H(i, x) and p_L(i, x) that nucleotide i at position x was emitted by state H or L, respectively:

position:     G      G      C       A       C      T      G      A       A
state H:   -2.73  -5.47  -8.21  -11.53  -14.01    ...    ...    ...  -25.65
state L:   -3.32  -6.06  -8.79  -10.94  -14.01    ...    ...    ...  -24.49

The highest probability obtained for the nucleotide at the last position is the probability of the most probable path. This path can be retrieved by back-tracking.

HMM : Viterbi algorithm - a toy example. GGCACTGAA. Back-tracking (= finding the path which corresponds to the highest probability, -24.49):

position:     G      G      C       A       C      T      G      A       A
state H:   -2.73  -5.47  -8.21  -11.53  -14.01    ...    ...    ...  -25.65
state L:   -3.32  -6.06  -8.79  -10.94  -14.01    ...    ...    ...  -24.49

The most probable path is: HHHLLLLLL. Its probability is 2^-24.49 = 4.25E-8 (remember that we used log2(p)).
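A compact, self-contained sketch of the whole procedure in Python (the function and variable names are illustrative, not from the slides); run as-is it reproduces the path HHHLLLLLL and the probability of about 4.25E-8.

from math import log2

states = ("H", "L")
init_p = {"H": 0.5, "L": 0.5}
trans_p = {"H": {"H": 0.5, "L": 0.5}, "L": {"H": 0.4, "L": 0.6}}
emit_p = {"H": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
          "L": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def viterbi(seq):
    # V[x][k]: log2 probability of the most probable path ending at position x in state k
    V = [{k: log2(init_p[k]) + log2(emit_p[k][seq[0]]) for k in states}]
    ptr = [{}]  # ptr[x][k]: predecessor state of k at position x
    for x in range(1, len(seq)):
        V.append({})
        ptr.append({})
        for k in states:
            prev, score = max(((l, V[x - 1][l] + log2(trans_p[l][k])) for l in states),
                              key=lambda t: t[1])
            V[x][k] = log2(emit_p[k][seq[x]]) + score
            ptr[x][k] = prev
    # back-tracking from the best state at the last position
    best = max(states, key=lambda k: V[-1][k])
    path = [best]
    for x in range(len(seq) - 1, 0, -1):
        path.append(ptr[x][path[-1]])
    path.reverse()
    return "".join(path), V[-1][best]

path, logp = viterbi("GGCACTGAA")
print(path, round(logp, 2), 2 ** logp)  # HHHLLLLLL -24.49 ~4.25e-08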

HMM : Forward algorithm - a toy example. Consider now the sequence S = GGCA. What is the probability P(S) that this sequence S was generated by the HMM model? This probability P(S) is given by the sum of the probabilities p_i(S) of each possible path that produces this sequence. The probability P(S) can be computed by dynamic programming using either the so-called Forward or the Backward algorithm.

HMM : Forward algorithm - a toy example. Forward algorithm. Consider now the sequence S = GGCA. Position 1 (G): state H: 0.5 * 0.3 = 0.15; state L: 0.5 * 0.2 = 0.1.

HMM : Forward algorithm - a toy example. Consider now the sequence S = GGCA. Position 1 (G): state H: 0.15; state L: 0.1. Position 2 (G): state H: 0.15 * 0.5 * 0.3 + 0.1 * 0.4 * 0.3 = 0.0345.

HMM : Forward algorithm - a toy example. Consider now the sequence S = GGCA. Position 1 (G): state H: 0.15; state L: 0.1. Position 2 (G): state H: 0.0345; state L: 0.1 * 0.6 * 0.2 + 0.15 * 0.5 * 0.2 = 0.027.

HMM : Forward algorithm - a toy example. Consider now the sequence S = GGCA. Position 1 (G): state H: 0.15; state L: 0.1. Position 2 (G): state H: 0.0345; state L: 0.027. Positions 3 (C) and 4 (A) are filled in the same way (... + ...).

HMM : Forward algorithm - a toy example. Consider now the sequence S = GGCA. Position 1 (G): state H: 0.15; state L: 0.1. Position 2 (G): state H: 0.0345; state L: 0.027. Position 3 (C): ... + ... for each state. Position 4 (A): state H: 0.0013767; state L: 0.0024665. => The probability that the sequence S was generated by the HMM model is thus P(S) = 0.0013767 + 0.0024665 = 0.0038432.
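A minimal, self-contained sketch of the Forward algorithm in Python (illustrative names, not from the slides); run as-is it reproduces P(GGCA) ≈ 0.0038432.

states = ("H", "L")
init_p = {"H": 0.5, "L": 0.5}
trans_p = {"H": {"H": 0.5, "L": 0.5}, "L": {"H": 0.4, "L": 0.6}}
emit_p = {"H": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
          "L": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def forward(seq):
    # f[k]: probability of emitting seq[0..x] and being in state k at position x
    f = {k: init_p[k] * emit_p[k][seq[0]] for k in states}
    for x in range(1, len(seq)):
        f = {k: emit_p[k][seq[x]] * sum(f[l] * trans_p[l][k] for l in states)
             for k in states}
    return sum(f.values())

print(forward("GGCA"))  # ~0.0038432 (0.0013767 from state H + 0.0024665 from state L)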

HMM : Forward algorithm - a toy example. The probability that sequence S = "GGCA" was generated by the HMM model is P_HMM(S) = 0.0038432. To assess the significance of this value, we have to compare it to the probability that sequence S was generated by the background model (i.e. by chance). Ex: if all nucleotides have the same probability, p_bg = 0.25, the probability of observing S by chance is: P_bg(S) = p_bg^4 = 0.25^4 = 0.0039. Thus, for this particular example, it is likely that the sequence S does not match the HMM model (P_bg > P_HMM). NB: note that this toy model is very simple and does not reflect any biological motif. In fact, both states H and L are characterized by emission probabilities close to the background probabilities, which makes the model unrealistic and unsuitable for detecting specific motifs.
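This comparison is often expressed as a log-odds score (an addition for illustration, not part of the slide); a short sketch:

from math import log2

p_hmm = 0.0038432   # P(S | HMM), from the Forward algorithm above
p_bg = 0.25 ** 4    # P(S | background), uniform nucleotide frequencies
print(round(p_bg, 4), round(log2(p_hmm / p_bg), 2))  # 0.0039 -0.02: slightly negative, no support for the HMM over the background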

HMM : Summary. Summary. The Viterbi algorithm is used to compute the most probable path (as well as its probability). It requires knowledge of the parameters of the HMM model and a particular output sequence, and it finds the state sequence that is most likely to have generated that output sequence. It works by finding a maximum over all possible state sequences. In sequence analysis, this method can be used, for example, to predict coding vs non-coding sequences. In fact, there are often many state sequences that can produce the same output sequence, but with different probabilities. It is possible to calculate the probability for the HMM model to generate that output sequence by summing over all possible state sequences. This can also be done efficiently using the Forward algorithm (or the Backward algorithm), which is also a dynamic programming algorithm. In sequence analysis, this method can be used, for example, to compute the probability that a particular DNA region matches the HMM motif (i.e. was emitted by the HMM model). An HMM motif can represent, for example, a TF binding site.

HMM : Summary. Remarks. To create an HMM model (i.e. find the most likely set of state transition and emission probabilities), we need a set of (training) sequences, which do not need to be aligned. No tractable algorithm is known for solving this problem exactly, but a local maximum likelihood can be derived efficiently using the Baum-Welch algorithm or the Baldi-Chauvin algorithm. The Baum-Welch algorithm is an example of a forward-backward algorithm, and is a special case of the expectation-maximization (EM) algorithm. For more details, see Durbin et al. (1998). HMMER: The HMMER3 package contains a set of programs (developed by S. Eddy) to build HMM models (from a set of aligned sequences) and to use HMM models (to align sequences or to find sequences in databases). These programs are available at the Mobyle platform (http://mobyle.pasteur.fr/cgi-bin/mobyleportal/portal.py).