
CS1820 Notes
hgupta1, kjline, smechery
April 3-April 5

April 3 Notes

1 Minichiello-Durbin Algorithm

input: a set of sequences
output: a plausible Ancestral Recombination Graph (ARG)
note: the optimal ARG is the one with the minimum number of recombinations.

2 Definitions

0 and 1 are alleles; $*$ is an undefined allele.
The sequences involved are haplotype sequences of length $m$ over the alphabet $\{0, 1, *\}$.
Let $C$ be a sequence; $C[i]$ is its $i$th symbol, $1 \le i \le m$.
Compatibility: $C_1[i] \equiv C_2[i]$ iff $C_1[i] = C_2[i]$, or $C_1[i] = *$, or $C_2[i] = *$.
$T$ indexes the time step.
$S_T$ is the sample of sequences at time step $T$.

Operations

1. Coalesce
Rule: If there are two sequences $C_1$ and $C_2$ in $S_T$ such that $C_1[i] \equiv C_2[i]$ for all $i$, then $C_1$ and $C_2$ coalesce into an ancestor sequence $C$.
Transition: $S_{T+1} = (S_T \setminus \{C_1, C_2\}) \cup \{C\}$, where $C[i] = C_1[i]$ when $C_1[i] \ne *$, and $C[i] = C_2[i]$ otherwise.

2. Mutation
Rule: If there exists a sequence $C_1$ in $S_T$ and a marker $i$ such that, for all $C_2$ in $S_T \setminus \{C_1\}$, we have $C_2[i] \ne C_1[i]$ or $C_2[i] = *$, then we can remove the derived allele $C_1[i]$ from the population.
Transition: $S_{T+1} = (S_T \setminus \{C_1\}) \cup \{C'\}$, where $C'[i] = 1 - C_1[i]$ and $C'[j] = C_1[j]$ for all $j \ne i$.

3. Recombination
Rule: When the coalesce and mutation rules do not apply, we must apply a recombination or a pair of recombinations. Denote a recombination break point by $(\alpha, \beta)$, meaning that it occurs between markers $\alpha$ and $\beta$. Picking a shared tract $\{C_1, C_2\}[\alpha, \beta]$ from those available in $S_T$, we aim to make one recombination parent of $C_1$ and one recombination parent of $C_2$ satisfy the rule of coalescence. To do this, we must put a break point at $(\alpha - 1, \alpha)$ if $\alpha \ne 1$, and a break point at $(\beta, \beta + 1)$ if $\beta \ne m$.
Transition: From the tract $\{C_1, C_2\}[\alpha, \beta]$, pick (1) a valid break point $(\alpha', \beta')$, where either $(\alpha', \beta') = (\alpha - 1, \alpha)$ or $(\alpha', \beta') = (\beta, \beta + 1)$, and (2) a recombinant sequence $C_R$, where $R = 1$ or $R = 2$ (i.e., $C_R$ is $C_1$ or $C_2$). Then $S_{T+1} = (S_T \setminus \{C_R\}) \cup \{C_1', C_2'\}$, where $C_1'[i] = C_R[i]$ for all $i \le \alpha'$ and $C_1'[i] = *$ otherwise, and $C_2'[i] = C_R[i]$ for all $i \ge \beta'$ and $C_2'[i] = *$ otherwise. If both $(\alpha - 1, \alpha)$ and $(\beta, \beta + 1)$ are valid break points, we must put the second recombination (taking us to $S_{T+2}$) on an appropriate ancestor of $C_1'$ or $C_2'$.

3 The Algorithm

The goal of the algorithm is to reduce a sample of sequences to a single ancestral sequence.
1. The algorithm starts at $T = 1$.
2. For each iteration $S_i$ (where $i = T$), starting from $i = 1$, apply the Coalesce, Mutation, and Recombination rules to $S_i$.
3. Stop when $S_i$ contains one sequence.
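To make the coalesce and mutation rules concrete, here is a minimal Python sketch (not from the notes): sequences are lists over {'0', '1', '*'}, and the helper names compatible, try_coalesce, and try_mutation are illustrative. The mutation step flips the singleton allele to the other allele, matching the transition stated above.

STAR = '*'

def compatible(c1, c2):
    # C1[i] is compatible with C2[i] for all i: equal, or at least one undefined
    return all(a == b or a == STAR or b == STAR for a, b in zip(c1, c2))

def coalesce(c1, c2):
    # Merge two compatible sequences into a single ancestor sequence
    return [a if a != STAR else b for a, b in zip(c1, c2)]

def try_coalesce(sample):
    # Apply the coalesce rule once, if any compatible pair exists; else return None
    seqs = list(sample)
    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if compatible(seqs[i], seqs[j]):
                rest = [s for k, s in enumerate(seqs) if k not in (i, j)]
                return rest + [coalesce(seqs[i], seqs[j])]
    return None

def try_mutation(sample):
    # Apply the mutation rule once: remove a singleton (assumed derived) allele
    seqs = list(sample)
    for idx, c1 in enumerate(seqs):
        for i, allele in enumerate(c1):
            if allele == STAR:
                continue
            others = [c2[i] for k, c2 in enumerate(seqs) if k != idx]
            if all(o != allele for o in others):        # C1 alone carries this allele
                new = list(c1)
                new[i] = '0' if allele == '1' else '1'  # assumption: flip to the other allele
                return seqs[:idx] + [new] + seqs[idx + 1:]
    return None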

April 5 Notes

Chapter 4: Hidden Markov Models - The Learning Problem and Algorithm

Three Fundamental Problems

A Hidden Markov Model has the following inputs:
The observation sequence: $\Theta = \theta_1 \theta_2 \ldots \theta_T$
The model: $\lambda = (A, B, \pi)$

Problem 1: the Evaluation problem, or Model Scoring problem.
Given: $\Theta, \lambda$
Compute: $P(\Theta \mid \lambda)$, the probability of the observation sequence given the model.

Problem 2: the Decoding problem, or Uncovering the Hidden Part.
Given: $\Theta, \lambda$
Compute: the state sequence $q_1, q_2, \ldots, q_T$ that optimally explains the observed sequence.
The Viterbi Algorithm (maximum likelihood) is used for this problem.

Problem 3: the Learning problem, or Training problem.
Given: $\Theta$
Compute: the parameters of a model $\lambda = (A, B, \pi)$ that maximize $P(\Theta \mid \lambda)$.

Definition of a Hidden Markov Model

An HMM has 5 elements: $N, M, A, B, \pi$.
1. $N$ = the number of states, $S = \{s_1, s_2, \ldots, s_N\}$.
2. $M$ = the number of distinct observation symbols per state.
3. $A$ = the state transition probability distribution, $A = \{a_{ij}\}$, where $a_{ij} = P[q_{t+1} = s_j \mid q_t = s_i]$, $1 \le i, j \le N$.
4. $B$ = the observation symbol probability distribution, $B = \{b_j(k)\}$, where $b_j(k) = P[v_k \text{ at time } t \mid q_t = s_j]$, $1 \le j \le N$, $1 \le k \le M$.
5. $\pi$ = the initial state distribution, $\pi = \{\pi_i\}$, where $\pi_i = P[q_1 = s_i]$, $1 \le i \le N$.

We need a number of auxiliary variables.
The forward variable: $\alpha_t(i) = P(\theta_1 \theta_2 \ldots \theta_t, q_t = s_i \mid \lambda)$. Note: $\theta_1 \theta_2 \ldots \theta_t$ is the prefix of the observation sequence up to time $t$.
The backward variable: $\beta_t(i) = P(\theta_{t+1} \theta_{t+2} \ldots \theta_T \mid q_t = s_i, \lambda)$. Here $\theta_{t+1} \theta_{t+2} \ldots \theta_T$ is the suffix of the observation sequence after time $t$.
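As a concrete illustration of Problem 1 and of the forward and backward variables, here is a minimal NumPy sketch (not part of the notes; the array shapes and the function names forward and backward are assumptions): A is the N x N transition matrix, B the N x M emission matrix, pi the initial distribution, and obs a list of observation-symbol indices.

import numpy as np

def forward(A, B, pi, obs):
    # alpha[t, i] = P(theta_1 .. theta_{t+1}, q_{t+1} = s_i | lambda), 0-indexed in t
    N, T = len(pi), len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha            # P(Theta | lambda) = alpha[-1].sum()  (Problem 1)

def backward(A, B, obs):
    # beta[t, i] = P(theta_{t+2} .. theta_T | q_{t+1} = s_i, lambda), 0-indexed in t
    N, T = A.shape[0], len(obs)
    beta = np.zeros((T, N))
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta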

Delta: $\delta_t(i) = \max_{q_1, q_2, \ldots, q_{t-1}} P[q_1 q_2 \ldots q_{t-1}, q_t = s_i, \theta_1 \theta_2 \ldots \theta_t \mid \lambda]$.

Solution to Problem 3:
We want to choose the parameters of the model $\lambda = (A, B, \pi)$ to maximize the probability of observing the sequence $\Theta$ under $\lambda$.
There is no analytical exact solution (unlike Problems 1 and 2).
Instead, we construct a $\bar\lambda = (\bar A, \bar B, \bar\pi)$ that is a local maximum of $P(\Theta \mid \bar\lambda)$.

An iterative algorithm (Baum-Welch):
$\zeta_t(i, j)$ = the probability of being in state $s_i$ at time $t$ and transitioning to state $s_j$ at time $t + 1$:
$\zeta_t(i, j) = P(q_t = s_i, q_{t+1} = s_j \mid \Theta, \lambda)$
$\zeta_t(i, j) = \dfrac{\alpha_t(i)\, a_{ij}\, b_j(\theta_{t+1})\, \beta_{t+1}(j)}{P(\Theta \mid \lambda)} = \dfrac{\alpha_t(i)\, a_{ij}\, b_j(\theta_{t+1})\, \beta_{t+1}(j)}{\sum_{i'=1}^{N} \sum_{j'=1}^{N} \alpha_t(i')\, a_{i'j'}\, b_{j'}(\theta_{t+1})\, \beta_{t+1}(j')}$
The double sum in the denominator is the normalizing factor $P(\Theta \mid \lambda)$.

Gamma: $\gamma_t(i)$ = the probability of being in state $s_i$ at time $t$ given the observation sequence:
$\gamma_t(i) = \sum_{j=1}^{N} \zeta_t(i, j)$

The expected number of times that state $s_i$ is visited (equivalently, the expected number of transitions out of $s_i$): $\sum_{t=1}^{T-1} \gamma_t(i)$.
The expected number of transitions from $s_i$ to $s_j$: $\sum_{t=1}^{T-1} \zeta_t(i, j)$.

Suppose that we have a model $\lambda = (A, B, \pi)$. Construct a new model $\bar\lambda = (\bar A, \bar B, \bar\pi)$:

$\bar\pi_i$ = the expected frequency (number of times) in state $s_i$ at $t = 1$ = $\gamma_1(i)$.

$\bar a_{ij} = \dfrac{\text{expected number of transitions from } s_i \text{ to } s_j}{\text{expected number of transitions from } s_i} = \dfrac{\sum_{t=1}^{T-1} \zeta_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$

$\bar b_j(k) = \dfrac{\text{expected number of times in state } s_j \text{ observing symbol } v_k}{\text{expected number of times in state } s_j} = \dfrac{\sum_{t=1,\, \theta_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}$

The re-estimated model is $\bar\lambda = (\bar A, \bar B, \bar\pi)$; iterating this re-estimation converges to a local maximum of $P(\Theta \mid \lambda)$.
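The re-estimation formulas above amount to one iteration of the Baum-Welch (EM) procedure. Below is a minimal sketch that reuses the forward and backward functions sketched earlier; the function name baum_welch_step and the array conventions are assumptions, not part of the notes.

import numpy as np

def baum_welch_step(A, B, pi, obs):
    # Return re-estimated (A_bar, B_bar, pi_bar) from one observation sequence
    N, M, T = len(pi), B.shape[1], len(obs)
    alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
    prob = alpha[-1].sum()                       # P(Theta | lambda)

    # zeta[t, i, j] = P(q_t = s_i, q_{t+1} = s_j | Theta, lambda)
    zeta = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        zeta[t] = (alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]) / prob

    # gamma[t, i] = P(q_t = s_i | Theta, lambda)
    gamma = alpha * beta / prob

    pi_bar = gamma[0]                                            # gamma_1(i)
    A_bar = zeta.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]   # sums over t = 1 .. T-1
    B_bar = np.zeros_like(B)
    for k in range(M):
        mask = np.array(obs) == k                # times at which symbol v_k was observed
        B_bar[:, k] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
    return A_bar, B_bar, pi_bar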

Figure 1: An example of a full ARG

Figure 2: First part of a partial recombination

Figure 3: Second part of a partial recombination