Hidden Markov Models

Hidden Markov models (HMMs) were developed in the early part of the 1970s and at that time were mostly applied in the area of computerized speech recognition. They were first described in a series of statistical papers by L. E. Baum et al. They began to be used extensively in the area of bioinformatics (to model gene and protein behavior) in the mid 1980s.

General Idea: Similarly to a Markov chain, a hidden Markov model is a statistical model for a sequence of observations which are dependent on each other in a systematic way. A hidden Markov model consists of two parts: a sequence of states $q_1, q_2, q_3, \ldots$ and a sequence of emitted observations $O_1, O_2, O_3, \ldots$. Generally, it is assumed that the observations can be observed (duh!) but that the states cannot be observed in practice. That is the "hidden" part of a hidden Markov model.

[Figure: hidden states $q_1, q_2, q_3, q_4, q_5$ linked by transitions, each emitting an observation $O_1, O_2, O_3, O_4, O_5$.]

The state variables $q$ perform transitions from one state to another according to a discrete finite Markov chain. Each state emits observations $O$ from a finite alphabet with a probability distribution that may depend on the state $q$. To fully describe a hidden Markov model, one needs the following components (a small R illustration follows the list):

- A set of $N$ states $\{S_1, S_2, \ldots, S_N\}$.
- An alphabet of $M$ distinct observation symbols $A = \{a_1, a_2, \ldots, a_M\}$.
- The transition probability matrix for the states, $P = (p_{ij})$ with $p_{ij} = P(q_{t+1} = S_j \mid q_t = S_i)$.
- The emission probabilities (which may depend on the current state $q_t = S_i$): for each state $S_i$ and each $a \in A$ define $b_i(a) = P(S_i \text{ emits symbol } a)$. These emission probabilities may be arranged in the form of an $N \times M$ matrix $B$.
- The initial distribution vector $\pi = (\pi_1, \ldots, \pi_N)$ for the first state, with $\pi_i = P(q_1 = S_i)$.
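As a quick sketch (not part of the original notes), the five components can be represented directly as R objects. The numbers below are arbitrary toy values chosen only to make the example concrete.

```r
# Toy HMM with N = 2 states and M = 3 observation symbols (illustrative values only).
states  <- c("S1", "S2")                 # the N states
symbols <- c("a1", "a2", "a3")           # the alphabet A of M symbols

P  <- matrix(c(0.7, 0.3,                 # transition probabilities p_ij
               0.4, 0.6),
             nrow = 2, byrow = TRUE, dimnames = list(states, states))
B  <- matrix(c(0.5, 0.4, 0.1,            # emission probabilities b_i(a)
               0.1, 0.3, 0.6),
             nrow = 2, byrow = TRUE, dimnames = list(states, symbols))
pi <- c(S1 = 0.6, S2 = 0.4)              # initial distribution

# Each row of P and B, and the vector pi, must sum to 1:
rowSums(P); rowSums(B); sum(pi)
```

Keeping state and symbol labels as dimnames makes later indexing by name (e.g. `B["S1", "a2"]`) straightforward.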
Example: You have two coins: a fair coin in your right hand and a biased coin that tosses heads with probability 0.8 in your left hand. You will toss the coins repeatedly (always starting with the right hand), switching hands according to a Markov chain with transition probability matrix $P$ (rows and columns ordered R = right, L = left), and call out the results:

$$P = \begin{pmatrix} \cdot & \cdot \\ \cdot & \cdot \end{pmatrix}$$

(a) What are the states and the observations in this example? Write down the state space and the observation alphabet.
(b) How many (free) parameters does this HMM have? Write them down.
(c) Suppose the coins are tossed three times and we observe THT. What is the probability of this outcome for this model?
(d) Given this outcome, find the most likely sequence of hidden states that have emitted it.

The collection of parameters of an HMM is denoted $\lambda = (P, B, \pi)$. In practice, the values of these parameters are usually unknown. Sometimes a structure, for instance the distribution of $B$, may be inferred (apart from parameter values) from the biological setting.
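For part (c), the probability of an output sequence can always be obtained by brute force, summing over all hidden state paths. The R sketch below does exactly that; since the numerical entries of the transition matrix are not reproduced above, the values in `P` are assumptions chosen purely for illustration (the fair/biased emission probabilities and the start-with-the-right-hand rule are taken from the example).

```r
# Brute-force P(O = THT | lambda) by enumerating all hidden state paths.
# The entries of P below are ASSUMED for illustration only.
states  <- c("R", "L")
symbols <- c("H", "T")

pi <- c(R = 1, L = 0)                    # always start with the right hand
P  <- matrix(c(0.5, 0.5,                 # assumed transition probabilities
               0.5, 0.5),
             nrow = 2, byrow = TRUE, dimnames = list(states, states))
B  <- matrix(c(0.5, 0.5,                 # fair coin in the right hand
               0.8, 0.2),                # biased coin: P(H) = 0.8
             nrow = 2, byrow = TRUE, dimnames = list(states, symbols))

obs <- c("T", "H", "T")

# Law of total probability: sum P(O | Q) P(Q) over all 2^3 state paths.
paths <- expand.grid(q1 = states, q2 = states, q3 = states,
                     stringsAsFactors = FALSE)
prob <- 0
for (k in seq_len(nrow(paths))) {
  q <- unlist(paths[k, ])
  term <- pi[q[1]] * B[q[1], obs[1]] *
          P[q[1], q[2]] * B[q[2], obs[2]] *
          P[q[2], q[3]] * B[q[3], obs[3]]
  prob <- prob + term
}
unname(prob)   # P(O = THT | lambda) under the assumed transition probabilities
```

With only three tosses there are just $2^3 = 8$ paths, but this enumeration grows exponentially with the sequence length, which is what motivates the forward algorithm discussed later.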
Today, hidden Markov models are used very frequently in biological applications. Examples of their use include:

Gene Finding: A chromosome consists of regions that code for proteins (coding regions) and regions whose function is today still largely unknown (noncoding regions). However, DNA has a grammatical structure, that is, the composition of base pairs is slightly different in coding and noncoding regions. For instance, it is known that the beginning regions of mammalian genes (promoters) are rich in CG combinations. The program GENSCAN relies on a hidden Markov model to divide a given nucleotide sequence into coding and noncoding regions.

Studying Copy Number Variations: During replication or reproduction, whole regions in the genome are sometimes deleted or duplicated. That leads to some genes being present more than once in a genome. The number of times a gene is present is called the copy number of that gene. There may be evolutionary pressure for or against genes in higher copy numbers. Copy numbers can vary amongst individuals and may vary even between identical twins. Microarray technology can be used to measure copy number variation, and hidden Markov models are used to reconstruct which portions of the genome are duplicated or deleted in a specific individual.

Protein Family Characterization: A protein family is a group of evolutionarily related proteins. Proteins in the same family often have similar three-dimensional structures, similar functions, and similar amino acid sequences. Currently, there are over 60,000 defined protein families. How can a hidden Markov model help in defining protein families? Starting with training sequences of proteins that have known similarities in sequence and/or function, a hidden Markov model is fitted that describes how the sequences evolved from each other. The parameters of the HMM are estimated from the training data. Knowing the statistical properties of the model allows one to perform hypothesis tests for new protein structures of unknown family:

    H_0: the new protein does not belong to the family
    versus
    H_a: the new protein does belong to the family

The architecture of the HMM used for protein family modeling is described in great detail in Krogh et al. (J. Mol. Biol., 1994). Consider the following four amino acid sequences that may have evolved from a common ancestor:

    Protein 1:  M P L H L T Q D E L D V
    Protein 2:  I P H H F A Q D E L S S
    Protein 3:  I P L H A A Y Q N L S W
    Protein 4:  V V T H M A Q N F V D L
Evolutionary events that any amino acid in the ancestral sequence may have undergone include:

- Match: the same distribution of amino acids on all sequences.
- Deletion: the amino acid is deleted.
- Insertion: a new amino acid is inserted between two existing ones.

Consider the following HMM for a short ancestral sequence with only four amino acids.

[Figure: profile HMM architecture with match states M1-M4, deletion states D1-D4, insertion states I1-I5, plus Begin and End states.]

Here, the M-states are the match states. Each match state has a distinct distribution over the 20-letter amino acid alphabet. Corresponding to each match state, there is a deletion state D that emits a dummy variable $\delta$ (or "-") that stands for deletion of an amino acid. On either side of a match state there is an insertion state I, which generates an amino acid, again from the 20-letter amino acid alphabet, but with a distribution characterized by the state I. The Begin and End states do not produce emissions.

(a) How many states does this HMM have?
(b) What are the model parameters $P$ and $B$ in this example?
(c) What would the model parameters have to be so that this model produces completely random sequences? Always the same sequence?
Hidden Markov Models in Gene Finding

In most (non-bacterial) genomes, genes are split into several coding and noncoding regions. The beginning and end of coding sequences within the genome are flanked by start and stop codons. The start codon is preceded by a promoter region. This is a DNA sequence to which the RNA polymerase (which facilitates the copying of DNA into RNA) can bind. The DNA sequences at the splice sites, where two different regions meet, are quite characteristic (usually GT at the 5' end and AG at the 3' end), and properties of the nucleotide sequence (for instance CG content) can differ within these regions.

Interactions of proteins with promoter sites can block the promoter and effectively make it impossible for the RNA polymerase to bind. This can cause the gene to be (at least temporarily) turned off. This concept will become very important for us when we study microarray technology and its applications in a few weeks.

When DNA is transcribed into RNA, it is first transcribed into pre-mRNA, which contains both the introns and exons. RNA splicing is the process of removing the introns and joining the exons to form a continuous coding sequence.

The goal of a gene finding algorithm is to automatically detect the coding regions, given a long sequence of DNA. The algorithm may classify the regions into upstream, start codon, intron, exon, stop codon, downstream, and intergenic regions. There are several different HMM-based algorithms whose goal it is to identify coding regions in DNA. GENSCAN uses the hidden state structure shown in the figure to the right. The states move from noncoding intergenic regions to promoters to start codons and then to one or more exons (interspersed with introns if there is more than one exon). The model specifies the probabilities with which each state (blue circle or red diamond in the figure) emits the nucleotides A, T, C, G. (Burge & Karlin, J. Mol. Biol., 1997)
Statistical Inference for Hidden Markov Models

There are three different types of questions that can be asked (and answered) in the context of hidden Markov models.

(1) Given the parameters $\lambda = (P, B, \pi)$ of the model, efficiently calculate the probability of some given output sequence. One algorithm that can efficiently compute $P(O \mid \lambda)$ is called the Forward Algorithm.

(2) Given the parameters $\lambda = (P, B, \pi)$ of the model, find the sequence of hidden states that is most likely to have generated a specific sequence of observations. The algorithm that performs this task is called the Viterbi Algorithm. It finds $\arg\max_Q P(Q \mid O)$.

(3) The hardest task is to estimate the values of the model parameters, while not knowing the hidden states, in order to maximize the likelihood of a given sequence of observations. In effect, both the model parameters and the hidden states have to be estimated from the data in order to make the model likelihood as large as possible:

$$\arg\max_\lambda P(O \mid \lambda)$$

This task is sometimes referred to in the literature as Machine Learning. The general solution for this problem is called the EM-Algorithm ("E" stands for expectation, "M" stands for maximization). The special case of the EM-algorithm for hidden Markov models is called the Baum-Welch method.

The Forward Algorithm

Recall that the task of the Forward algorithm is to compute the probability of a specific sequence of outputs (assuming the model structure of the HMM is fixed and the model parameters are known). Naively, this could be done by using the law of total probability and by considering all possible hidden state sequences $Q$:

$$P(O \mid \lambda) = \sum_{\text{all } Q} P(O \mid Q, \lambda) \, P(Q \mid \lambda)$$

However, considering all possible state sequences that may have led to an observed emission sequence $O$ quickly makes the search space very large, particularly if the observation sequence is long. It would require summing $N^T$ products of $2T$ terms each ($N$ is the size of the state space and $T$ is the length of the sequence of observations). Thus, an iterative procedure is clearly preferable.
The Forward Algorithm iteratively computes the joint probability that at time $t$ the Markov chain is in state $S_i \in S$ and that the observations along the way were $O_1, \ldots, O_t$:

$$\alpha_i(t) = P(O_1, O_2, \ldots, O_t, q_t = S_i \mid \lambda), \qquad t = 1, 2, \ldots, T$$

This allows for an efficient computation of the terminal probabilities

$$\alpha_i(T) = P(O_1, O_2, \ldots, O_T, q_T = S_i \mid \lambda)$$

The probability to observe any particular sequence $O = (O_1, \ldots, O_T)$ of emissions can then be written as

$$P(O) = \sum_{i=1}^{N} \alpha_i(T)$$

This calculation still requires recursive computation of the $\alpha$'s, but it is much less costly than an exhaustive search of all possible state sequences. To recursively compute $\alpha_i(T)$ for $i = 1, \ldots, N$, we need an initialization

$$\alpha_i(1) = $$

We also need an induction step in which $\alpha_i(t+1)$ is formulated as a function of previous $\alpha$'s. Then

$$P(O) = \sum_{i=1}^{N} \alpha_i(T)$$

The Forward algorithm provides a solution for problem (1): calculating the probability of a given output sequence $O$. It can be carried out in $T N^2$ computations.
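A minimal R sketch of this recursion is given below. It is not part of the original notes: the blanks above are filled here with the standard textbook formulas $\alpha_i(1) = \pi_i \, b_i(O_1)$ and $\alpha_j(t+1) = \big[\sum_i \alpha_i(t)\, p_{ij}\big] b_j(O_{t+1})$, and the helper name `forward_prob` is a choice made for this sketch. It assumes `B` has column names matching the observation symbols, as in the coin example code above.

```r
# Forward algorithm sketch.
# pi: initial distribution, P: N x N transition matrix,
# B:  N x M emission matrix with columns named by the symbols,
# obs: character vector of observed symbols.
forward_prob <- function(pi, P, B, obs) {
  N <- length(pi)
  T_len <- length(obs)
  alpha <- matrix(0, nrow = N, ncol = T_len)

  # Initialization: alpha_i(1) = pi_i * b_i(O_1)
  alpha[, 1] <- pi * B[, obs[1]]

  # Induction: alpha_j(t+1) = [ sum_i alpha_i(t) * p_ij ] * b_j(O_{t+1})
  for (t in seq_len(T_len - 1)) {
    alpha[, t + 1] <- as.vector(t(P) %*% alpha[, t]) * B[, obs[t + 1]]
  }

  list(alpha = alpha, prob = sum(alpha[, T_len]))   # P(O | lambda)
}
```

For instance, `forward_prob(pi, P, B, c("T", "H", "T"))$prob` reproduces the brute-force result from the coin example, but in $TN^2$ rather than $N^T$ work.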
Example: Recall the coin tossing example introduced earlier. In this example, the state space was $S = \{\text{right}, \text{left}\}$, the emissions were $A = \{H, T\}$, and the given model parameters were $\pi = (1, 0)$ together with the transition matrix $P$ (rows/columns R, L) and the emission matrix $B$ (rows R, L; columns H, T) from before. Use the forward algorithm to compute the probability of observing the sequence THT.

The Backward Algorithm

In the forward algorithm, we started by considering the possible values of the first state and working our way forward from there. Similarly, one could also start by considering the possible values for the last state $q_T$ and work backwards. That is, we find the probability of the ending sequence $(O_{t+1}, \ldots, O_T)$ given the hidden state occupied at time $t$. Define

$$\beta_i(t) = P(O_{t+1}, \ldots, O_T \mid q_t = S_i, \lambda)$$

Note that in this definition, $\beta$ is a conditional probability, whereas the $\alpha$'s were defined as joint probabilities. The initial $\beta$ is defined as

$$\beta_i(T) = 1, \quad \text{for all } i \in S$$

The $\beta$-terms are now defined recursively, backwards in time:

$$\begin{aligned}
\beta_i(t) &= P(O_{t+1}, \ldots, O_T \mid q_t = S_i, \lambda) \\
&= \sum_{j=1}^{N} P(O_{t+1}, \ldots, O_T, q_{t+1} = S_j \mid q_t = S_i, \lambda) \\
&= \sum_{j=1}^{N} P(O_{t+1}, \ldots, O_T \mid q_t = S_i, q_{t+1} = S_j, \lambda) \, P(q_{t+1} = S_j \mid q_t = S_i, \lambda) \\
&= \sum_{j=1}^{N} P(O_{t+1}, \ldots, O_T \mid q_{t+1} = S_j, \lambda) \, P(q_{t+1} = S_j \mid q_t = S_i, \lambda) \\
&= \sum_{j=1}^{N} P(O_{t+2}, \ldots, O_T \mid q_{t+1} = S_j, \lambda) \, P(O_{t+1} \mid q_{t+1} = S_j, \lambda) \, P(q_{t+1} = S_j \mid q_t = S_i, \lambda) \\
&= \sum_{j=1}^{N} \beta_j(t+1) \, b_j(O_{t+1}) \, p_{ij}
\end{aligned}$$
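The last line of this derivation translates directly into code. The sketch below (with the hypothetical helper name `backward_prob`, matching the conventions of the forward sketch above) implements the recursion $\beta_i(t) = \sum_j p_{ij}\, b_j(O_{t+1})\, \beta_j(t+1)$.

```r
# Backward algorithm sketch.
# P: N x N transition matrix, B: emission matrix with symbol column names,
# obs: character vector of observed symbols.
backward_prob <- function(P, B, obs) {
  N <- nrow(P)
  T_len <- length(obs)
  beta <- matrix(0, nrow = N, ncol = T_len)

  beta[, T_len] <- 1                       # beta_i(T) = 1 for all i

  # Recursion, backwards in time:
  # beta_i(t) = sum_j p_ij * b_j(O_{t+1}) * beta_j(t+1)
  for (t in rev(seq_len(T_len - 1))) {
    beta[, t] <- P %*% (B[, obs[t + 1]] * beta[, t + 1])
  }
  beta
}
```

The matrix product `P %*% (B[, obs[t+1]] * beta[, t+1])` computes, for each state $i$, exactly the sum over $j$ in the recursion.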
The Viterbi Algorithm

Recall that the goal of the Viterbi algorithm is to find the most likely sequence of states that has produced a given output sequence. Here, we assume that the parameters $\lambda$ of the hidden Markov model are known. We want to find

$$\arg\max_Q P(Q \mid O, \lambda) = \arg\max_Q P(Q, O \mid \lambda)$$

Just as the forward and backward algorithms, the Viterbi algorithm is defined recursively. Let $v_i(t)$ be the probability of the most likely state sequence for the first $t$ observations that ends in state $S_i$. That is,

$$v_i(t) = \max_{1 \le j \le N} \big( P(O_t \mid q_t = S_i) \, p_{ji} \, v_j(t-1) \big)$$

The sequences are initialized by defining

$$v_i(1) = P(O_1 \mid q_1 = S_i) \, \pi_i$$

The Viterbi path $x_1, \ldots, x_T$ is defined as the sequence of states that maximizes the $v_i(t)$ expressions. That is,

$$x_T = \arg\max_{1 \le i \le N} v_i(T)$$

and the earlier states are recovered by tracing back through the maximizing terms, $x_{t-1} = \arg\max_{1 \le j \le N} \big( p_{j\,x_t} \, v_j(t-1) \big)$.

As for the forward and backward algorithms, the complexity of this algorithm is $O(T N^2)$.

Example: Recall the coin tossing example introduced earlier. In this example, the state space was $S = \{\text{right}, \text{left}\}$, the emissions were $A = \{H, T\}$, and the given model parameters were $\pi = (1, 0)$ together with the transition matrix $P$ (rows/columns R, L) and the emission matrix $B$ (rows R, L; columns H, T) from before. Use the Viterbi algorithm to find the most likely state sequence that gave rise to the sequence THT.
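A sketch of the recursion with explicit back-pointers is shown below (again not from the notes; the helper name `viterbi_path` and the use of row names on `P` to label the states are choices made here).

```r
# Viterbi algorithm sketch with back-pointer traceback.
viterbi_path <- function(pi, P, B, obs) {
  N <- length(pi)
  T_len <- length(obs)
  v   <- matrix(0,  nrow = N, ncol = T_len)   # v[i, t]: best-path probability ending in state i at time t
  ptr <- matrix(0L, nrow = N, ncol = T_len)   # back-pointers

  # Initialization: v_i(1) = P(O_1 | q_1 = S_i) * pi_i
  v[, 1] <- B[, obs[1]] * pi

  # Recursion: v_i(t) = b_i(O_t) * max_j ( p_ji * v_j(t-1) )
  for (t in seq_len(T_len)[-1]) {
    for (i in seq_len(N)) {
      cand <- P[, i] * v[, t - 1]             # p_ji * v_j(t-1), over j
      ptr[i, t] <- which.max(cand)
      v[i, t]   <- B[i, obs[t]] * max(cand)
    }
  }

  # Traceback: start from the best final state and follow the back-pointers.
  path <- integer(T_len)
  path[T_len] <- which.max(v[, T_len])
  for (t in rev(seq_len(T_len)[-1])) path[t - 1] <- ptr[path[t], t]

  rownames(P)[path]                           # most likely state sequence as labels
}
```

Applied to the coin example with the same assumed transition probabilities as in the brute-force sketch, `viterbi_path(pi, P, B, c("T", "H", "T"))` returns the most likely hidden hand sequence for part (d).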
The EM-Algorithm

Recall that the goal of the EM-algorithm is to estimate the (possibly numerous) parameters of a hidden Markov model from the sequence of observations, without knowing the explicit sequence of hidden states. The original EM-algorithm was developed for maximum likelihood estimation of parameters in probability models with missing data (Dempster, Laird, Rubin, 1977). Consider the following scenario: we have a probability model that generates observations $x$. The model has parameters $\theta$. There may be some missing data $y$. In the hidden Markov model context we have:

    Notation   Description        HMM analog
    x          observed data      output sequence O_1, ..., O_T
    y          missing data       state sequence q_1, ..., q_T
    theta      model parameters   lambda = (pi, P, B)

The likelihood of a set of parameter values $\theta$ is defined as the probability to observe a given set of outcomes given those parameter values:

$$L(\theta \mid x) = P(x \mid \theta)$$

Since joint probabilities are usually computed as products with many terms, it is more convenient in many cases to work with log-likelihood functions rather than with likelihood functions directly. A maximum-likelihood parameter estimate is the value $\hat{\theta}$ that maximizes the likelihood function. The same value will also maximize the log-likelihood function, since $\log(x)$ is monotone. That is, we want to find $\hat{\theta}$ to maximize

$$\log P(x \mid \theta) = \log \sum_y P(x, y \mid \theta)$$

Here the sum should be taken over all possible values of the missing data. We will now provide a heuristic justification (not a strict proof) for how the EM-algorithm works. First, note that $P(x, y \mid \theta) = P(y \mid x, \theta) \, P(x \mid \theta)$ and write

$$\log P(x \mid \theta) = \log P(x, y \mid \theta) - \log P(y \mid x, \theta)$$

Suppose $\theta_t$ is a current estimate (not necessarily the optimal estimate) of the parameter vector $\theta$. Multiply both sides of the above equation by $P(y \mid x, \theta_t)$ and sum over all possible values of $y$. On the left side of the equation, nothing will change, since

$$\sum_y \log P(x \mid \theta) \, P(y \mid x, \theta_t) = \log P(x \mid \theta) \sum_y P(y \mid x, \theta_t) = \log P(x \mid \theta) \cdot 1 = \log P(x \mid \theta)$$
On the other side, we will have

$$\log P(x \mid \theta) = \sum_y P(y \mid x, \theta_t) \log P(x, y \mid \theta) \;-\; \sum_y P(y \mid x, \theta_t) \log P(y \mid x, \theta)$$

Give the first term in this equation a name:

$$Q(\theta \mid \theta_t) = \sum_y P(y \mid x, \theta_t) \log P(x, y \mid \theta)$$

One can show that maximizing $Q(\theta \mid \theta_t)$ with respect to $\theta$ always increases the likelihood function $P(x \mid \theta)$ (compared to $P(x \mid \theta_t)$). Note that here $Q(\theta \mid \theta_t)$ is the expected value of $\log P(x, y \mid \theta)$ with respect to $y$ (under the distribution $P(y \mid x, \theta_t)$). The name of the EM algorithm comes from the two steps that are now carried out in an alternating fashion:

Initialization: Pick an initial value for $\theta$, say $\theta_0$.
E-step: For the current $\theta$ estimate, say $\theta_t$, compute $Q(\theta \mid \theta_t)$, that is, the expectation $E_y[\log P(x, y \mid \theta)]$ under $P(y \mid x, \theta_t)$.
M-step: Maximize $Q(\theta \mid \theta_t)$ with respect to $\theta$. Call the new argmax $\theta_{t+1}$.
Repeat: Check whether the termination criterion is met and, if not, go back to the E-step with the updated $\theta_{t+1}$.

The EM-algorithm can be shown to converge to a local maximum of the likelihood function $P(x \mid \theta)$. Note that a local maximum is not necessarily equal to the global maximum. It is usually a good idea to restart the algorithm with different initial values $\theta_0$. The algorithm should be terminated either after a fixed number of steps or after the likelihood function does not change much anymore (% change less than some threshold).

The Baum-Welch Algorithm

In the specific context of hidden Markov models, the observed data are the outputs $O = (O_1, \ldots, O_T)$, the missing data are the states $q = (q_1, \ldots, q_T)$, and the model parameters are $\lambda = (\pi, P, B)$. Next, we need to introduce two new quantities. Define

$$\gamma_i(t) = P(q_t = S_i \mid O, \lambda)$$

Note that because of the Markov property we can write

$$P(O, q_t = S_i \mid \lambda) = $$
so that

$$\gamma_i(t) = $$

Also define

$$\xi_{ij}(t) = P(q_t = S_i, q_{t+1} = S_j \mid O, \lambda)$$

In a similar way one can show that

$$\xi_{ij}(t) = \frac{\alpha_i(t) \, p_{ij} \, b_j(O_{t+1}) \, \beta_j(t+1)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i(t) \, p_{ij} \, b_j(O_{t+1}) \, \beta_j(t+1)}$$

Note that both $\gamma_i(t)$ and $\xi_{ij}(t)$ involve the forward probabilities $\alpha_i(t)$ as well as the backward probabilities $\beta_j(t)$. Why do we need all this complicated notation? Let's take a look at what the two new quantities $\gamma$ and $\xi$ represent. $\gamma_i(t)$ is the probability that the hidden state is $S_i$ at time $t$, given the complete sequence of observations and the current set of parameters. $\xi_{ij}(t)$ is the probability that from time $t$ to time $t + 1$ the hidden states will transition from $i$ to $j$, given the complete sequence of observations and the current set of parameters.

To estimate the initial distribution $\pi$, we would like an estimate for the probability that the hidden state is in state $i$ at time $t = 1$. Note that

$$\hat{\pi}_i = \gamma_i(1) = P(q_1 = S_i \mid O, \lambda)$$

where $\lambda$ is the current set of parameter estimates. Furthermore, $\sum_{t=1}^{T-1} \gamma_i(t)$ is the expected number of times the hidden chain will be in state $i$, whereas $\sum_{t=1}^{T-1} \xi_{ij}(t)$ is the expected number of times the hidden chain transitions from $i$ to $j$. Finally,

$$\hat{p}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_{ij}(t)}{\sum_{t=1}^{T-1} \gamma_i(t)}, \qquad \hat{b}_j(o) = \frac{\sum_{t=1,\; O_t = o}^{T} \gamma_j(t)}{\sum_{t=1}^{T} \gamma_j(t)}$$
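The sketch below implements one Baum-Welch (EM) iteration from these re-estimation formulas. It is not part of the original notes; it reuses the hypothetical `forward_prob()` and `backward_prob()` helpers from the earlier sketches and assumes `B` carries symbol column names.

```r
# One Baum-Welch iteration: E-step (alpha, beta, gamma, xi), then M-step.
baum_welch_step <- function(pi, P, B, obs) {
  N <- length(pi)
  T_len <- length(obs)
  alpha <- forward_prob(pi, P, B, obs)$alpha
  beta  <- backward_prob(P, B, obs)
  likelihood <- sum(alpha[, T_len])              # P(O | lambda)

  # gamma[i, t] = P(q_t = S_i | O, lambda) = alpha_i(t) * beta_i(t) / P(O)
  gamma <- (alpha * beta) / likelihood

  # xi[i, j, t] = P(q_t = S_i, q_{t+1} = S_j | O, lambda)
  xi <- array(0, dim = c(N, N, T_len - 1))
  for (t in seq_len(T_len - 1)) {
    xi[, , t] <- (alpha[, t] %o% (B[, obs[t + 1]] * beta[, t + 1])) * P
    xi[, , t] <- xi[, , t] / sum(xi[, , t])      # normalize by the double sum
  }

  # M-step: re-estimate pi, P, and B from gamma and xi.
  pi_new <- gamma[, 1]
  P_new  <- apply(xi, c(1, 2), sum) /
              rowSums(gamma[, seq_len(T_len - 1), drop = FALSE])
  B_new  <- sapply(colnames(B), function(o)
              rowSums(gamma[, obs == o, drop = FALSE])) / rowSums(gamma)

  list(pi = pi_new, P = P_new, B = B_new, loglik = log(likelihood))
}
```

In practice one would call `baum_welch_step()` repeatedly, stopping once the log-likelihood changes by less than some threshold, and restart from several initial values to reduce the risk of ending in a poor local maximum.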
Initialization: Pick an initial value for $\lambda$, say $\lambda_0$. Often, the entries in $\pi$, $P$, and $B$ are chosen to all be equally likely. If entries are chosen to be zero, they will remain zero during the algorithm.
E-step: For the current $\lambda$ estimate, say $\lambda_t$, first compute $\alpha_i(t)$ and $\beta_i(t)$ (for $i = 1, \ldots, N$ and $t = 1, \ldots, T$). Then use them to compute $\gamma_i(t)$ and $\xi_{ij}(t)$ (for $i, j = 1, \ldots, N$ and $t = 1, \ldots, T$).
M-step: Compute $\hat{\pi}_i$, $\hat{p}_{ij}$, and $\hat{b}_i(o)$ for $i, j = 1, \ldots, N$ and $o \in A$.
Repeat: Compute the model likelihood $P(O \mid \lambda)$. Check whether the termination criterion is met and, if not, go back to the E-step with the updated $\lambda_{t+1}$.

The HMM package in R

For hidden Markov models, numerous R packages have become available in the first decade of this century for simulating data and executing the algorithms we discussed (Forward, Backward, Viterbi, Baum-Welch, EM). The most prominent examples of such packages are HMM and HiddenMarkov. The different packages have many overlapping functionalities.

Example: The HMM package can simulate data from a hidden Markov model with specified parameters. Let the state space be $S = \{1, 2\}$ and the emission alphabet $A = \{a, b, c\}$. Specify the initial state distribution $\pi = (0.5, 0.5)$, together with a $2 \times 2$ transition probability matrix $P$ and a $2 \times 3$ emission probability matrix $B$.

In R, one initializes the model (i.e., defines parameters, state space, and emission alphabet) with the command initHMM(). Then the command simHMM() can be used to simulate a sequence of n observations generated from the HMM previously defined. The results are random; that means if you repeat the command, you may get different values.
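A minimal sketch of this example using the HMM package is shown below. The numerical entries of `transProbs` and `emissionProbs` are assumed for illustration, since they are not reproduced above; the state space, alphabet, and initial distribution follow the example.

```r
# Sketch: define and simulate from an HMM with the HMM package.
library(HMM)

hmm <- initHMM(States        = c("1", "2"),
               Symbols       = c("a", "b", "c"),
               startProbs    = c(0.5, 0.5),
               transProbs    = matrix(c(0.9, 0.1,          # assumed values
                                        0.1, 0.9), nrow = 2, byrow = TRUE),
               emissionProbs = matrix(c(0.4, 0.4, 0.2,     # assumed values
                                        0.1, 0.1, 0.8), nrow = 2, byrow = TRUE))

set.seed(1)                     # only needed if you want reproducible output
simHMM(hmm, length = 10)        # returns a list with $states and $observation
```

Without `set.seed()`, repeating the `simHMM()` call gives different sequences, as noted above.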
Given the parameters of an HMM (in the form of an initHMM() object), the forward() function computes, for each time $t$ and each state, the probability of observing the output sequence up to time $t$ together with the hidden state at time $t$, i.e., the forward probabilities $\alpha_i(t)$. The probabilities are reported on a log scale.

The backward() function computes the probabilities of observing the remaining outputs $O_{t+1}, \ldots, O_T$, given the hidden state at time $t$, i.e., the backward probabilities $\beta_i(t)$.

Recall that the viterbi() algorithm finds the most likely sequence of states that generated a given sequence of observations. The user has to provide model parameters in the form of an initHMM() object for this algorithm.

The baumWelch() algorithm finds estimates of the parameters $(\pi, P, B)$ of the model, given an output sequence and initial parameter estimates; the most likely state sequence can then be recovered with viterbi() using the fitted parameters. The initial parameter estimates are often noninformative matrices, in which every transition or emission is equally likely.
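The sketch below applies these functions to the model defined above. It assumes the HMM package's standard calling conventions (forward/backward returning states-by-time matrices of log probabilities, and baumWelch returning a fitted model in `$hmm`); `obs`, `hmm`, and the uniform starting model `hmm0` are names chosen here for illustration.

```r
# Sketch: inference with the HMM package on a simulated sequence.
obs <- simHMM(hmm, length = 50)$observation

f <- forward(hmm, obs)          # log alpha_i(t), a states x time matrix
b <- backward(hmm, obs)         # log beta_i(t)

# log P(O | lambda): log-sum-exp over the last column of the forward matrix
logLik <- log(sum(exp(f[, ncol(f)])))

viterbi(hmm, obs)               # most likely hidden state sequence

# Baum-Welch, starting from a noninformative model (uniform parameters)
hmm0 <- initHMM(c("1", "2"), c("a", "b", "c"))
fit  <- baumWelch(hmm0, obs, maxIterations = 100)
fit$hmm$transProbs              # estimated P
fit$hmm$emissionProbs           # estimated B
```

Because the likelihood surface can have several local maxima, it is worth rerunning baumWelch() from a few different starting models and keeping the fit with the highest likelihood.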