Stochastic processes and Markov chains (part II)
1 Stochastic processes and Markov chains (part II). Wessel van Wieringen (w.n.van.wieringen@vu.nl), Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University Amsterdam, The Netherlands
2 Example
3 Example The DNA copy number of a genomic segment is simply the number of copies of that segment present in the cell under study. Healthy normal cell: chr 1: 2 copies; chr 22: 2; chr X: 1 or 2; chr Y: 0 or 1.
4 Example Chromosomes of a tumor cell Technique: SKY
5 Example The DNA copy number is often categorized into: L : loss : < 2 copies N : normal : 2 copies G : gain : > 2 copies In cancer: The number of DNA copy number aberrations accumulates with the progression of the disease. DNA copy number aberrations are believed to be irreversible. Let us model the accumulation process of DNA copy number aberrations.
6 Example State diagram for the accumulation process of a locus (figure): the states L, N and G of the locus tracked over times t_0, t_1, t_2, t_3.
7 Example The associated initial distribution (a locus starts out in the normal state): π = (π_L, π_N, π_G)^T = (0, 1, 0)^T, and associated transition matrix (aberrations are irreversible, so L and G are absorbing): P = [ 1, 0, 0 ; p_NL, 1 − p_NL − p_NG, p_NG ; 0, 0, 1 ], with parameter constraints: p_NL ≥ 0, p_NG ≥ 0 and p_NL + p_NG ≤ 1.
8 Example Calculate the probability of a loss, a normal copy number, and a gain at this locus after p generations. Using the p-step transition probabilities (the p-th power of the transition matrix), these probabilities simplify to, e.g.: P(X_p = N) = (1 − p_NL − p_NG)^p, where p_NL and p_NG denote the per-generation probabilities of an N → L and an N → G transition.
9 Example In practice, a sample is only observed once the cancer has already developed. Hence, the number of generations p is unknown. This may be accommodated by modeling p as being Poisson distributed: p ~ Poisson(λ). This yields, e.g.: P(X = N) = E[(1 − p_NL − p_NG)^p] = exp(−λ(p_NL + p_NG)).
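A small numerical sketch of this computation (in Python rather than the R used elsewhere in these slides): the marginal state probabilities are obtained by averaging the p-generation distribution over p ~ Poisson(λ). The per-generation rates a, b and the Poisson mean lam are hypothetical values, not taken from the slides.

```python
import math
import numpy as np

# Hypothetical per-generation aberration rates and Poisson mean (illustrative only).
a, b, lam = 0.01, 0.02, 50.0

# States: 0 = L (loss), 1 = N (normal), 2 = G (gain); L and G are absorbing.
P = np.array([[1.0,           0.0, 0.0],
              [a,   1.0 - a - b,    b],
              [0.0,           0.0, 1.0]])
pi0 = np.array([0.0, 1.0, 0.0])   # a locus starts in the normal state

# Marginalize over p ~ Poisson(lam): P(X = E_i) = sum_p P(p generations) [pi0' P^p]_i
probs = np.zeros(3)
for p in range(400):               # Poisson(50) mass beyond 400 is negligible
    log_pmf = p * math.log(lam) - lam - math.lgamma(p + 1)
    probs += math.exp(log_pmf) * (pi0 @ np.linalg.matrix_power(P, p))

# Closed form for the normal state: E[(1-a-b)^p] = exp(-lam*(a+b))
print(probs, math.exp(-lam * (a + b)))
```

The probability of still being normal matches the closed form exp(−λ(a + b)), which follows from the probability generating function of the Poisson distribution.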
10 Example So far, we only considered one locus i, followed over times t_0, t_1, t_2, t_3 (figure).
11 Example For multiple loci: each of the loci is followed over times t_0, t_1, t_2, t_3 (figure).
12 Example Multiple loci make this a multivariate problem. Complications: p is unknown, and loci are not independent. Solution: model p as random, and assume a particular dependence structure. After likelihood formulation and parameter estimation, one may identify the most aberrated loci and reconstruct the time of onset of the cancer.
13 Stationary distribution
14 Stationary distribution We generated a long DNA sequence in accordance with a 1st order Markov chain. For ever longer stretches of DNA, ever farther away from the first base pair, we calculated the nucleotide percentages %A, %C, %G and %T (figure: the percentages converge with increasing stretch length).
15 Stationary distribution Hence, after a while an equilibrium sets in. Not necessarily a fixed state or pattern, but: the proportion of a time period that is spent in a particular state converges to a limit value. The limit values of all states form the stationary distribution of the Markov process, denoted by: φ = (φ_1, …, φ_S)^T.
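The convergence experiment of the previous slide can be reproduced in a few lines. A sketch in Python (the slides themselves use R), with a purely illustrative transition matrix: the empirical state proportions of a long simulated chain settle near a row of a high matrix power of P.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical A, C, G, T transition matrix (illustrative values only).
P = np.array([[0.5, 0.2, 0.2, 0.1],
              [0.2, 0.4, 0.3, 0.1],
              [0.1, 0.3, 0.4, 0.2],
              [0.1, 0.2, 0.3, 0.4]])

# Simulate a first order Markov chain of length n, starting in state A (= 0).
n = 50_000
x = np.empty(n, dtype=np.int64)
x[0] = 0
for t in range(1, n):
    x[t] = rng.choice(4, p=P[x[t - 1]])

emp = np.bincount(x, minlength=4) / n     # empirical nucleotide proportions
phi = np.linalg.matrix_power(P, 100)[0]   # a row of P^100 approximates phi
print(emp, phi)
```

The two printed vectors agree up to simulation noise, illustrating that the time spent in each state converges to the stationary proportions.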
16 Stationary distribution For a stationary process, it holds that P(X_t = E_i) = φ_i for all t and i. In particular: P(X_t = E_i) = φ_i = P(X_{t+1} = E_i). Of course, this does not imply P(X_t = E_i, X_{t+1} = E_i) = φ_i φ_i, as this ignores the 1st order Markov dependency.
17 Stationary distribution The stationary distribution is associated with the first-order Markov process, parameterized by (π, P). Question How do φ and (π, P) relate? Hereto, recall: φ_i = P(X_t = E_i).
18 Stationary distribution We then obtain the following relation: φ_j = P(X_{t+1} = E_j) (definition)
19 Stationary distribution We then obtain the following relation: φ_j = Σ_i P(X_{t+1} = E_j, X_t = E_i) (use the fact that P(A, B) + P(A, B^c) = P(A))
20 Stationary distribution We then obtain the following relation: φ_j = Σ_i P(X_{t+1} = E_j | X_t = E_i) P(X_t = E_i) (use the definition of conditional probability: P(A, B) = P(A | B) P(B))
21 Stationary distribution We then obtain the following relation: φ_j = Σ_i p_ij φ_i
22 Stationary distribution We then obtain the following relation: φ_j = Σ_i p_ij φ_i for all j. Thus: φ^T P = φ^T. Eigenvectors!
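The eigenvector relation can be checked numerically. A minimal Python sketch, assuming an illustrative transition matrix: since φ^T P = φ^T, φ is the eigenvector of P^T with eigenvalue 1, rescaled to sum to one.

```python
import numpy as np

# Hypothetical A, C, G, T transition matrix (illustrative values only).
P = np.array([[0.5, 0.2, 0.2, 0.1],
              [0.2, 0.4, 0.3, 0.1],
              [0.1, 0.3, 0.4, 0.2],
              [0.1, 0.2, 0.3, 0.4]])

# phi' P = phi' says phi is a left eigenvector of P with eigenvalue 1,
# i.e. an ordinary (right) eigenvector of the transpose P'.
eigvals, eigvecs = np.linalg.eig(P.T)
k = np.argmin(np.abs(eigvals - 1.0))   # locate the eigenvalue 1
phi = np.real(eigvecs[:, k])
phi = phi / phi.sum()                  # rescale to a probability vector
print(phi)
```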
23 Stationary distribution Theorem Irreducible, aperiodic Markov chains with a finite state space S have a stationary distribution to which the chain converges as t → ∞. (Figure: a four-state chain on A, C, G, T with φ_A = 0.1, φ_C = 0.2, φ_G = 0.3, φ_T = 0.4.)
24 Stationary distribution A Markov chain is aperiodic if there is no state that can only be reached at multiples of a certain period. E.g., state E_i only at t = 0, 3, 6, 9, et cetera. (Figure: an example of an aperiodic Markov chain on A, C, G, T, with all transition probabilities p_AA, p_AC, …, p_TT positive, and a sample path ATCGATCGATCGATCG.)
25 Stationary distribution Example of a periodic Markov chain: a chain on A, C, G, T with p_AT = 1, p_TC = 1, p_CG = 1 and p_GA = 1 cycles deterministically: ATCGATCGATCGATCG. The fraction of time spent in A (roughly φ_A): P(X_{t+1000} = A) = ¼, whereas P(X_{t+1000} = A | X_4 = A) = 1.
26 Stationary distribution A Markov chain is irreducible if every state can (in principle) be reached (after enough time has passed) from every other state. (Figure: examples of reducible Markov chains on A, C, G, T, in which C is an absorbing state, and in which A will not be reached.)
27 Stationary distribution How do we find the stationary distribution? We know the stationary distribution satisfies: φ^T P = φ^T (*) and Σ_i φ_i = 1. We thus have S + 1 equations for S unknowns: one of the equations in (*) can be dropped (which one is irrelevant), and the system of S remaining equations needs to be solved.
28 Stationary distribution Consider the transition matrix: In order to find the stationary distribution we need to solve the following system of equations: This yields: (φ_1, φ_2)^T = ( , )^T
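The slide's own matrix entries did not survive transcription, but the recipe can be illustrated on a hypothetical 2-state chain: drop one stationarity equation, append the normalization, and solve the resulting linear system.

```python
import numpy as np

# Hypothetical 2-state chain (the slide's own numbers were not transcribed).
a, b = 0.3, 0.1
P = np.array([[1 - a, a],
              [b, 1 - b]])
S = P.shape[0]

# phi' (P - I) = 0 gives S equations, one of which is redundant:
# drop the last one and append the normalization sum(phi) = 1.
A = (P.T - np.eye(S))[:-1, :]
A = np.vstack([A, np.ones(S)])
rhs = np.zeros(S)
rhs[-1] = 1.0
phi = np.linalg.solve(A, rhs)
print(phi)   # for a 2-state chain this equals (b, a) / (a + b)
```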
29 Stationary distribution On the other hand, for n large: P(X_{t+n} = E_j | X_t = E_i) = φ_j, independent of i. Or: p_ij^(n) = φ_j. Hence, the n-step transition matrix P^(n) has identical rows, each equal to φ^T. This motivates a numerical way to find the stationary distribution.
30 Stationary distribution Same example as before, with stationary distribution: (φ_1, φ_2)^T = ( , )^T. Then: (matrix multiplication: rows times columns)
31 Stationary distribution Thus: In similar fashion we obtain:
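A minimal sketch of this numerical route, using a hypothetical 2-state chain: raising P to a high power yields a matrix with (numerically) identical rows, each equal to the stationary distribution.

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.1, 0.9]])              # hypothetical 2-state chain
Pn = np.linalg.matrix_power(P, 100)

# For large n every row of P^n equals the stationary distribution.
print(Pn)
```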
32 Convergence to the stationary distribution
33 Convergence to the stationary distribution Define the vector 1 = (1, …, 1)^T. We have already seen: P^n ≈ 1 φ^T for large n. Question How fast does P^n go to 1 φ^T as n → ∞? Answer 1) Use linear algebra. 2) Calculate numerically.
34 Convergence to the stationary distribution Fact The transition matrix P of a finite, aperiodic, irreducible Markov chain has an eigenvalue equal to 1 (λ_1 = 1), while all other eigenvalues are (in the absolute sense) smaller than one: |λ_k| < 1, k = 2, …, S. Focus on λ_1 = 1. We know φ^T P = φ^T for the stationary distribution. Hence, φ is a left eigenvector with eigenvalue λ = 1. Also, the row sums of P equal 1: P 1 = 1. Hence, 1 is a right eigenvector with eigenvalue λ = 1.
35 Convergence to the stationary distribution The spectral decomposition of a square matrix P is given by: P = V Λ V^{-1}, where: Λ is a diagonal matrix containing the eigenvalues, and the columns of V contain the corresponding eigenvectors. In case P is symmetric, V is orthogonal: V^{-1} = V^T. Then: P = V Λ V^T.
36 Convergence to the stationary distribution The spectral decomposition of P, reformulated: P = Σ_{k=1}^{S} λ_k v_k u_k^T, with eigenvalues λ_k, right eigenvectors v_k and left eigenvectors u_k. The eigenvectors are normalized: u_k^T v_l = 1 if k = l, and 0 otherwise.
37 Convergence to the stationary distribution We now obtain the spectral decomposition of the n-step transition matrix P^n. Hereto observe that: P^2 = (Σ_k λ_k v_k u_k^T)(Σ_l λ_l v_l u_l^T) (plug in the spectral decomposition of P)
38 Convergence to the stationary distribution We now obtain the spectral decomposition of the n-step transition matrix P^n. Hereto observe that: P^2 = Σ_k Σ_l λ_k λ_l v_k (u_k^T v_l) u_l^T (bring the right eigenvector into the sum, and use the properties of the transpose operator)
39 Convergence to the stationary distribution We now obtain the spectral decomposition of the n-step transition matrix P^n. Hereto observe that: P^2 = Σ_k λ_k^2 v_k u_k^T (the eigenvectors are normalized: u_k^T v_l = 1 if k = l, and 0 otherwise)
40 Convergence to the stationary distribution We now obtain the spectral decomposition of the n-step transition matrix P^n. Hereto observe that: P^2 = Σ_k λ_k^2 v_k u_k^T, i.e., the λ_k^2 are the eigenvalues of P^2, with the same left and right eigenvectors.
41 Convergence to the stationary distribution Repeating this argument n times yields: P^n = Σ_k λ_k^n v_k u_k^T. Hence, u_k and v_k are left and right eigenvectors of P^n with eigenvalue λ_k^n. Thus: P^n = V Λ^n V^{-1}.
42 Convergence to the stationary distribution Verify the spectral decomposition for P^2:
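A numerical verification sketch (Python; the 2-state matrix is illustrative): build P^2 from the eigenvalues, right eigenvectors and left eigenvectors, and compare with the direct product P·P.

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.1, 0.9]])              # hypothetical 2-state chain
lam, V = np.linalg.eig(P)               # right eigenvectors in the columns of V
U = np.linalg.inv(V)                    # rows of V^{-1} are the left eigenvectors

# P^2 = sum_k lambda_k^2 v_k u_k', with v_k the k-th column of V
# and u_k' the k-th row of V^{-1}.
P2 = sum(lam[k] ** 2 * np.outer(V[:, k], U[k, :]) for k in range(len(lam)))
print(np.allclose(P2, P @ P))           # True
```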
43 Convergence to the stationary distribution Use the spectral decomposition of P^n to show how fast P^n converges to 1 φ^T as n → ∞. We know: λ_1 = 1, |λ_k| < 1 for k = 2, …, S, and v_1 = 1, u_1 = φ. Then: P^n = 1 φ^T + Σ_{k=2}^{S} λ_k^n v_k u_k^T.
44 Convergence to the stationary distribution Expanding this: P^n = 1 φ^T + λ_2^n v_2 u_2^T + … + λ_S^n v_S u_S^T, where each λ_k^n → 0 as n → ∞.
45 Convergence to the stationary distribution Clearly: P^n → 1 φ^T as n → ∞. Furthermore, as |λ_2| ≥ |λ_3| ≥ … ≥ |λ_S|, it is the second largest (in absolute sense) eigenvalue that dominates, and thus determines the convergence speed to the stationary distribution.
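This rate can be seen numerically. A sketch with a hypothetical 2-state chain whose second eigenvalue is 0.6: the maximal entrywise distance between P^n and 1 φ^T decays proportionally to |λ_2|^n.

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.1, 0.9]])                  # hypothetical 2-state chain
phi = np.array([0.25, 0.75])                # its stationary distribution
lam2 = 0.6                                  # second eigenvalue: 1 - 0.3 - 0.1
limit = np.outer(np.ones(2), phi)           # the limit matrix 1 phi'

# The maximal entrywise error of P^n shrinks proportionally to |lambda_2|^n.
for n in (1, 5, 10, 20):
    err = np.abs(np.linalg.matrix_power(P, n) - limit).max()
    print(n, err, err / lam2 ** n)          # the ratio stays constant
```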
46 Convergence to the stationary distribution Fact A Markov chain with a symmetric P has a uniform stationary distribution. Proof Symmetry of P implies that left and right eigenvectors are identical (up to a constant). The first right eigenvector is the vector of ones, 1. Hence, the first left eigenvector equals c 1. The left eigenvector is the stationary distribution and should sum to one: c = 1 / (number of states).
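A quick numerical check of this Fact, with an illustrative symmetric transition matrix:

```python
import numpy as np

# A symmetric transition matrix (hypothetical values; rows sum to one).
P = np.array([[0.5, 0.2, 0.3],
              [0.2, 0.6, 0.2],
              [0.3, 0.2, 0.5]])
assert np.allclose(P, P.T) and np.allclose(P.sum(axis=1), 1.0)

# The uniform distribution satisfies phi' P = phi', as the Fact claims.
phi = np.ones(3) / 3
print(np.allclose(phi @ P, phi))   # True
```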
47 Convergence to the stationary distribution Question Suppose the DNA may reasonably be described by a first order Markov model with transition matrix P: and stationary distribution: and eigenvalues:
48 Convergence to the stationary distribution Question What is the probability of a G at position 2, 3, 4, 5, 10? And how does this depend on the starting nucleotide? In other words, give: P(X_2 = G | X_1 = A), P(X_2 = G | X_1 = C), P(X_2 = G | X_1 = G), P(X_2 = G | X_1 = T). But also: P(X_3 = G | X_1 = A), et cetera.
49 Convergence to the stationary distribution Thus, calculate P(X_t = G | X_1 = x_1) with t = 2, 3, 4, 5, 10, and x_1 = A, C, G, T (table of these probabilities by t and x_1). Study the influence of the first nucleotide on the calculated probability for increasing t.
50 Convergence to the stationary distribution
> # define pi and P
> pi <- matrix(c(1, 0, 0, 0), ncol=1)
> P <- matrix(c(2, 3, 3, 2, 1, 4, 4, 1, 3, 2, 2, 3, 4, 1, 1, 4), ncol=4, byrow=TRUE)/10
> # define function that calculates the powers of a
> # matrix (inefficiently though)
> matrixpower <- function(X, power){
    Xpower <- X
    for (i in 2:power){
      Xpower <- Xpower %*% X
    }
    return(Xpower)
  }
> # calculate P to the power 100
> matrixpower(P, 100)
51 Convergence to the stationary distribution Question Suppose the DNA may reasonably be described by a first order Markov model with transition matrix P: and stationary distribution: and eigenvalues:
52 Convergence to the stationary distribution Again calculate P(X_t = G | X_1 = x_1) with t = 2, 3, 4, 5, 10, and x_1 = A, C, G, T (table of these probabilities by t and x_1). Now the influence of the first nucleotide fades slowly. This can be explained by the large 2nd eigenvalue.
53 Processes back in time
54 Processes back in time So far, we have studied Markov chains forward in time. In practice, we may wish to study processes back in time. Example Evolutionary models that describe the occurrence of SNPs in DNA sequences. We aim to attribute two DNA sequences (human, chimp) to a common ancestor (figure).
55 Processes back in time Consider a Markov chain {X_t}_{t=1,2,…}. The reverse Markov chain {X_r*}_{r=1,2,…} is then defined by: X_r* = X_{t−r}, with transition probabilities: p_ij = P(X_t = E_j | X_{t−1} = E_i) and p_ij* = P(X_r* = E_j | X_{r−1}* = E_i).
56 Processes back in time Show how the transition probabilities p_ij* relate to p_ij: p_ij* = P(X_r* = E_j | X_{r−1}* = E_i) (just the definition)
57 Processes back in time = P(X_{t−r} = E_j | X_{t−r+1} = E_i) (express this in terms of the original Markov chain, using that X_r* = X_{t−r})
58 Processes back in time = P(X_{t−r+1} = E_i | X_{t−r} = E_j) P(X_{t−r} = E_j) / P(X_{t−r+1} = E_i) (apply the definition of conditional probability twice (Bayes): P(A | B) = P(B | A) P(A) / P(B))
59 Processes back in time = p_ji φ_j / φ_i (by stationarity). Hence: p_ij* = p_ji φ_j / φ_i.
60 Processes back in time Check that the rows of the transition matrix P* sum to one, i.e.: p_i1* + p_i2* + … + p_iS* = 1. Hereto: Σ_j p_ij* = Σ_j p_ji φ_j / φ_i = φ_i / φ_i = 1, using the fact that Σ_j φ_j p_ji = φ_i.
61 Processes back in time The two Markov chains defined by P and P* have the same stationary distribution. Indeed, as p_ij* = p_ji φ_j / φ_i, we have: Σ_i φ_i p_ij* = φ_j Σ_i p_ji = φ_j.
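These properties of P* are easy to verify numerically. A sketch with a hypothetical doubly stochastic (hence uniform-stationary) 3-state chain; note that with a uniform φ the reverse chain is simply the transpose of P.

```python
import numpy as np

# Hypothetical doubly stochastic (but non-symmetric) 3-state chain.
P = np.array([[0.2, 0.5, 0.3],
              [0.3, 0.2, 0.5],
              [0.5, 0.3, 0.2]])
phi = np.ones(3) / 3                      # its stationary distribution is uniform

# Reverse-chain transition probabilities: p*_ij = phi_j p_ji / phi_i.
Pstar = (phi[None, :] * P.T) / phi[:, None]

print(Pstar.sum(axis=1))                  # rows of P* sum to one
print(phi @ Pstar)                        # P* has the same stationary distribution
```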
62 Processes back in time Definition A Markov chain is called reversible if p_ij* = p_ij. In that case: p_ij = p_ij* = p_ji φ_j / φ_i. Or: φ_i p_ij = φ_j p_ji for all i and j. These are the so-called detailed balance equations. Theorem A Markov chain is reversible if and only if the detailed balance equations hold.
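The detailed balance criterion translates directly into code. A sketch (Python) checking two chains: a deterministic 3-state cycle fails detailed balance, while any two-state chain satisfies it.

```python
import numpy as np

def is_reversible(P, phi):
    """Detailed balance: phi_i p_ij = phi_j p_ji for all i, j."""
    F = phi[:, None] * P          # F_ij = phi_i p_ij
    return np.allclose(F, F.T)    # detailed balance <=> F is symmetric

# The deterministic cycle A -> B -> C -> A is irreversible.
P_cycle = np.array([[0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0],
                    [1.0, 0.0, 0.0]])
print(is_reversible(P_cycle, np.ones(3) / 3))     # False

# Any 2-state chain satisfies detailed balance, hence is reversible.
P2 = np.array([[0.7, 0.3],
               [0.1, 0.9]])
print(is_reversible(P2, np.array([0.25, 0.75])))  # True
```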
63 Processes back in time Example 1 The 1st order Markov chain on states A, B, C with transition matrix given by p_AB = p_BC = p_CA = 1 (all other transition probabilities zero) is irreversible. Check that this (deterministic) Markov chain does not satisfy the detailed balance equations. Irreversibility can be seen from a sample of this chain: ABCABCABCABCABCABC. In the reverse direction, transitions from B to C do not occur!
64 Processes back in time Example 2 The 1st order Markov chain with transition matrix: is irreversible. Again, a uniform stationary distribution: φ_i = 1/3. As P is not symmetric, the detailed balance equations are not satisfied: φ_i p_ij = p_ij / 3 ≠ p_ji / 3 = φ_j p_ji for some i and j.
65 Processes back in time Example 2 (continued) The irreversibility of this chain implies: P(A → B → C → A) = P(A → B) P(B → C) P(C → A) ≠ P(A → C) P(C → B) P(B → A) = P(A → C → B → A). It matters how one walks from A to A. Or: it matters whether one walks forward or backward.
66 Processes back in time Kolmogorov condition for reversibility A stationary Markov chain is reversible if and only if any path from state E_i back to state E_i has the same probability as the path in the opposite direction. Or, the Markov chain is reversible if and only if: p_{i i_1} p_{i_1 i_2} ⋯ p_{i_k i} = p_{i i_k} p_{i_k i_{k−1}} ⋯ p_{i_1 i} for all i, i_1, i_2, …, i_k. E.g.: P(A → B → C → A) = P(A → C → B → A).
67 Processes back in time Interpretation For a reversible Markov chain it is not possible to determine the direction of the process from the observed state sequence alone. Molecular phylogenetics aims to reconstruct evolutionary relationships between present-day species from their DNA sequences. Reversibility is then an essential assumption. Genes are transcribed in one direction only (the template strand is read from the 3' end to the 5' end), and the promoter is located at one end only. This suggests irreversibility.
68 Processes back in time E. coli For a gene in the E. coli genome, we estimate the transition matrix and the stationary distribution (R output). Then, the detailed balance equations do not hold, e.g.: φ_1 p_12 ≠ φ_2 p_21.
69 Processes back in time Note Within evolutionary theory, the notion of irreversibility refers to the presumption that complex organisms, once lost to evolution, will not appear again in the same form. Indeed, the likelihood of reconstructing a particular phylogenetic system is infinitesimally small.
70 Application: motifs
71 Application: motifs Study the sequence of the promoter region upstream of a gene (figure). This region contains binding sites for the transcription factors that regulate the transcription of the gene.
72 Application: motifs The binding sites of a transcription factor (which may regulate multiple genes) share certain sequence patterns, called motifs. Not all transcription factors and motifs are known. Hence, a high occurrence of a particular sequence pattern in the upstream region of a gene may indicate that it has a regulatory function (e.g., as a binding site). Problem Determine the probability of observing m motifs in a background generated by a 1st order stationary Markov chain.
73 Application: motifs An h-letter word W = w_1 w_2 ⋯ w_h is a map from {1, …, h} to A, where A is some non-empty set, called the alphabet. In the DNA example: A = {A, C, G, T}, and, e.g.: W = CATGA.
74 Application: motifs A word W is p-periodic if w_i = w_{i+p} for i = 1, …, h − p: p is the lag between two overlapping occurrences of the word. The set of all periods of W (less than h) is the period set, denoted by P(W). In other words, P(W) is the set of integers 0 < p < h such that a new occurrence of W can start p letters after an occurrence of W.
75 Application: motifs If W_1 = CGATCGATCG, then P(W_1) = {4, 8}: a new occurrence may start 4 letters later (CGATCGATCGATCG) or 8 letters later (CGATCGATCGATCGATCG). If W_2 = CGAACTG, then P(W_2) = ∅.
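The period set definition transcribes directly into code:

```python
def period_set(word):
    """Integers 0 < p < len(word) at which the word overlaps itself."""
    h = len(word)
    # p is a period when the first h - p letters equal the last h - p letters.
    return {p for p in range(1, h) if word[:h - p] == word[p:]}

print(period_set("CGATCGATCG"))   # {4, 8}
print(period_set("CGAACTG"))      # set()
```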
76 Application: motifs Let N(W) be the number of (overlapping) occurrences of an h-letter word W in a random sequence of length n on A. If Y_t is the random variable defined by: Y_t = 1 if X_t X_{t+1} ⋯ X_{t+h−1} = W, and Y_t = 0 otherwise, then N(W) = Σ_{t=1}^{n−h+1} Y_t.
77 Application: motifs Also, denote the number of occurrences of the letter a in W by n(a), and the number of occurrences of the letter pair ab in W by n(ab). If W = CATGA, then: n(A) = 2 and n(AT) = 1.
78 Application: motifs Assume a stationary 1st order Markov model for the random sequence of length n. The probability of an occurrence of W at a given position of the random sequence is given by: μ(W) = P(Y_t = 1) = φ_{w_1} p_{w_1 w_2} p_{w_2 w_3} ⋯ p_{w_{h−1} w_h}.
79 Application: motifs In the 1st order Markov model, the expected number of occurrences of W is approximated by: E[N(W)] ≈ (n − h + 1) μ(W), and its variance by an analogous expression that accounts for overlapping occurrences via the period set of W.
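The occurrence probability and the expected count can be sketched as follows (Python; the background transition matrix is illustrative, and the variance term with its overlap corrections is omitted):

```python
import numpy as np

letter = {"A": 0, "C": 1, "G": 2, "T": 3}
# Hypothetical stationary first order Markov background (illustrative values).
P = np.array([[0.5, 0.2, 0.2, 0.1],
              [0.2, 0.4, 0.3, 0.1],
              [0.1, 0.3, 0.4, 0.2],
              [0.1, 0.2, 0.3, 0.4]])
eigvals, eigvecs = np.linalg.eig(P.T)
phi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
phi = phi / phi.sum()              # stationary distribution of the background

def mu(word):
    """Probability that the word starts at a fixed position of the chain."""
    idx = [letter[c] for c in word]
    prob = phi[idx[0]]
    for i, j in zip(idx, idx[1:]):
        prob *= P[i, j]
    return prob

def expected_count(word, n):
    """Approximate E[N(W)] in a sequence of length n: (n - h + 1) mu(W)."""
    return (n - len(word) + 1) * mu(word)

print(expected_count("CATGA", 10_000))
```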
80 Application: motifs To find words with exceptionally high frequency in the DNA, the following (asymptotically) standard normal statistic is used: z(W) = (N(W) − E[N(W)]) / √Var(N(W)). The p-value of word W is then given by: P(Z ≥ z(W)) = 1 − Φ(z(W)), with Φ the standard normal distribution function.
81 Application: motifs Note Robin, Daudin (1999) provide exact probabilities of word occurrences in random sequences. However, Robin, Schbath (2001) point out that calculation of the exact probabilities is computationally intensive. Hence, the use of an approximation here.
82 References & further reading
83 References and further reading
Ewens, W.J., Grant, G.R. (2006), Statistical Methods in Bioinformatics, Springer, New York.
Reinert, G., Schbath, S., Waterman, M.S. (2000), Probabilistic and statistical properties of words: an overview, Journal of Computational Biology, 7.
Robin, S., Daudin, J.-J. (1999), Exact distribution of word occurrences in a random sequence of letters, Journal of Applied Probability, 36.
Robin, S., Schbath, S. (2001), Numerical comparison of several approximations of the word count distribution in random sequences, Journal of Computational Biology, 8(4).
Schbath, S. (2000), An overview on the distribution of word counts in Markov chains, Journal of Computational Biology, 7.
Schbath, S., Robin, S. (2009), How can pattern statistics be useful for DNA motif discovery? In: Glaz, J. et al. (eds.), Scan Statistics: Methods and Applications.
84 This material is provided under the Creative Commons Attribution/Share-Alike/Non-Commercial License. See for details.
6.207/14.15: Networks Lectures 4, 5 & 6: Linear Dynamics, Markov Chains, Centralities 1 Outline Outline Dynamical systems. Linear and Non-linear. Convergence. Linear algebra and Lyapunov functions. Markov
More informationLinear Algebra review Powers of a diagonalizable matrix Spectral decomposition
Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Prof. Tesler Math 283 Fall 2016 Also see the separate version of this with Matlab and R commands. Prof. Tesler Diagonalizing
More informationSMSTC (2007/08) Probability.
SMSTC (27/8) Probability www.smstc.ac.uk Contents 12 Markov chains in continuous time 12 1 12.1 Markov property and the Kolmogorov equations.................... 12 2 12.1.1 Finite state space.................................
More informationLasso regression. Wessel van Wieringen
Lasso regression Wessel van Wieringen w.n.van.wieringen@vu.nl Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University Amsterdam, The Netherlands Lasso regression Instead
More informationStochastic modelling of epidemic spread
Stochastic modelling of epidemic spread Julien Arino Department of Mathematics University of Manitoba Winnipeg Julien Arino@umanitoba.ca 19 May 2012 1 Introduction 2 Stochastic processes 3 The SIS model
More informationReinforcement Learning
Reinforcement Learning March May, 2013 Schedule Update Introduction 03/13/2015 (10:15-12:15) Sala conferenze MDPs 03/18/2015 (10:15-12:15) Sala conferenze Solving MDPs 03/20/2015 (10:15-12:15) Aula Alpha
More informationMinimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA
Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals Satoshi AOKI and Akimichi TAKEMURA Graduate School of Information Science and Technology University
More informationN.G.Bean, D.A.Green and P.G.Taylor. University of Adelaide. Adelaide. Abstract. process of an MMPP/M/1 queue is not a MAP unless the queue is a
WHEN IS A MAP POISSON N.G.Bean, D.A.Green and P.G.Taylor Department of Applied Mathematics University of Adelaide Adelaide 55 Abstract In a recent paper, Olivier and Walrand (994) claimed that the departure
More informationMathematical Methods for Computer Science
Mathematical Methods for Computer Science Computer Science Tripos, Part IB Michaelmas Term 2016/17 R.J. Gibbens Problem sheets for Probability methods William Gates Building 15 JJ Thomson Avenue Cambridge
More informationMarkov Chain Model for ALOHA protocol
Markov Chain Model for ALOHA protocol Laila Daniel and Krishnan Narayanan April 22, 2012 Outline of the talk A Markov chain (MC) model for Slotted ALOHA Basic properties of Discrete-time Markov Chain Stability
More informationRegulatory Sequence Analysis. Sequence models (Bernoulli and Markov models)
Regulatory Sequence Analysis Sequence models (Bernoulli and Markov models) 1 Why do we need random models? Any pattern discovery relies on an underlying model to estimate the random expectation. This model
More informationROBI POLIKAR. ECE 402/504 Lecture Hidden Markov Models IGNAL PROCESSING & PATTERN RECOGNITION ROWAN UNIVERSITY
BIOINFORMATICS Lecture 11-12 Hidden Markov Models ROBI POLIKAR 2011, All Rights Reserved, Robi Polikar. IGNAL PROCESSING & PATTERN RECOGNITION LABORATORY @ ROWAN UNIVERSITY These lecture notes are prepared
More informationKaraliopoulou Margarita 1. Introduction
ESAIM: Probability and Statistics URL: http://www.emath.fr/ps/ Will be set by the publisher ON THE NUMBER OF WORD OCCURRENCES IN A SEMI-MARKOV SEQUENCE OF LETTERS Karaliopoulou Margarita 1 Abstract. Let
More informationSequence analysis and comparison
The aim with sequence identification: Sequence analysis and comparison Marjolein Thunnissen Lund September 2012 Is there any known protein sequence that is homologous to mine? Are there any other species
More informationComputational Biology: Basics & Interesting Problems
Computational Biology: Basics & Interesting Problems Summary Sources of information Biological concepts: structure & terminology Sequencing Gene finding Protein structure prediction Sources of information
More informationThe genome encodes biology as patterns or motifs. We search the genome for biologically important patterns.
Curriculum, fourth lecture: Niels Richard Hansen November 30, 2011 NRH: Handout pages 1-8 (NRH: Sections 2.1-2.5) Keywords: binomial distribution, dice games, discrete probability distributions, geometric
More informationPattern Matching (Exact Matching) Overview
CSI/BINF 5330 Pattern Matching (Exact Matching) Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Pattern Matching Exhaustive Search DFA Algorithm KMP Algorithm
More informationMolecular evolution 2. Please sit in row K or forward
Molecular evolution 2 Please sit in row K or forward RBFD: cat, mouse, parasite Toxoplamsa gondii cyst in a mouse brain http://phenomena.nationalgeographic.com/2013/04/26/mind-bending-parasite-permanently-quells-cat-fear-in-mice/
More information- mutations can occur at different levels from single nucleotide positions in DNA to entire genomes.
February 8, 2005 Bio 107/207 Winter 2005 Lecture 11 Mutation and transposable elements - the term mutation has an interesting history. - as far back as the 17th century, it was used to describe any drastic
More informationMAS275 Probability Modelling Exercises
MAS75 Probability Modelling Exercises Note: these questions are intended to be of variable difficulty. In particular: Questions or part questions labelled (*) are intended to be a bit more challenging.
More informationSummary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University)
Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University) The authors explain how the NCut algorithm for graph bisection
More informationThis operation is - associative A + (B + C) = (A + B) + C; - commutative A + B = B + A; - has a neutral element O + A = A, here O is the null matrix
1 Matrix Algebra Reading [SB] 81-85, pp 153-180 11 Matrix Operations 1 Addition a 11 a 12 a 1n a 21 a 22 a 2n a m1 a m2 a mn + b 11 b 12 b 1n b 21 b 22 b 2n b m1 b m2 b mn a 11 + b 11 a 12 + b 12 a 1n
More informationPrediction o f of cis -regulatory B inding Binding Sites and Regulons 11/ / /
Prediction of cis-regulatory Binding Sites and Regulons 11/21/2008 Mapping predicted motifs to their cognate TFs After prediction of cis-regulatory motifs in genome, we want to know what are their cognate
More informationHidden Markov Models and some applications
Oleg Makhnin New Mexico Tech Dept. of Mathematics November 11, 2011 HMM description Application to genetic analysis Applications to weather and climate modeling Discussion HMM description Application to
More informationGenotype Imputation. Biostatistics 666
Genotype Imputation Biostatistics 666 Previously Hidden Markov Models for Relative Pairs Linkage analysis using affected sibling pairs Estimation of pairwise relationships Identity-by-Descent Relatives
More informationMutation models I: basic nucleotide sequence mutation models
Mutation models I: basic nucleotide sequence mutation models Peter Beerli September 3, 009 Mutations are irreversible changes in the DNA. This changes may be introduced by chance, by chemical agents, or
More informationPhylogenetic Assumptions
Substitution Models and the Phylogenetic Assumptions Vivek Jayaswal Lars S. Jermiin COMMONWEALTH OF AUSTRALIA Copyright htregulation WARNING This material has been reproduced and communicated to you by
More informationHigh-dimensional data: Exploratory data analysis
High-dimensional data: Exploratory data analysis Mark van de Wiel mark.vdwiel@vumc.nl Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University Contributions by Wessel
More informationStatistical NLP: Hidden Markov Models. Updated 12/15
Statistical NLP: Hidden Markov Models Updated 12/15 Markov Models Markov models are statistical tools that are useful for NLP because they can be used for part-of-speech-tagging applications Their first
More informationLEARNING DYNAMIC SYSTEMS: MARKOV MODELS
LEARNING DYNAMIC SYSTEMS: MARKOV MODELS Markov Process and Markov Chains Hidden Markov Models Kalman Filters Types of dynamic systems Problem of future state prediction Predictability Observability Easily
More information6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008
MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, Networks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationLinear Algebra review Powers of a diagonalizable matrix Spectral decomposition
Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Prof. Tesler Math 283 Fall 2018 Also see the separate version of this with Matlab and R commands. Prof. Tesler Diagonalizing
More informationChapter 1. Vectors, Matrices, and Linear Spaces
1.7 Applications to Population Distributions 1 Chapter 1. Vectors, Matrices, and Linear Spaces 1.7. Applications to Population Distributions Note. In this section we break a population into states and
More informationINDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Lecture -33 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc.
INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY Lecture -33 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. Summary of the previous lecture Regression on Principal components
More informationStochastic modelling of epidemic spread
Stochastic modelling of epidemic spread Julien Arino Centre for Research on Inner City Health St Michael s Hospital Toronto On leave from Department of Mathematics University of Manitoba Julien Arino@umanitoba.ca
More informationMATH3200, Lecture 31: Applications of Eigenvectors. Markov Chains and Chemical Reaction Systems
Lecture 31: Some Applications of Eigenvectors: Markov Chains and Chemical Reaction Systems Winfried Just Department of Mathematics, Ohio University April 9 11, 2018 Review: Eigenvectors and left eigenvectors
More informationCAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools. Giri Narasimhan
CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools Giri Narasimhan ECS 254; Phone: x3748 giri@cis.fiu.edu www.cis.fiu.edu/~giri/teach/bioinfs15.html Describing & Modeling Patterns
More information3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM
I529: Machine Learning in Bioinformatics (Spring 2017) Content HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University,
More informationHumans have two copies of each chromosome. Inherited from mother and father. Genotyping technologies do not maintain the phase
Humans have two copies of each chromosome Inherited from mother and father. Genotyping technologies do not maintain the phase Genotyping technologies do not maintain the phase Recall that proximal SNPs
More informationAn Introduction to Entropy and Subshifts of. Finite Type
An Introduction to Entropy and Subshifts of Finite Type Abby Pekoske Department of Mathematics Oregon State University pekoskea@math.oregonstate.edu August 4, 2015 Abstract This work gives an overview
More informationLecture 8 Learning Sequence Motif Models Using Expectation Maximization (EM) Colin Dewey February 14, 2008
Lecture 8 Learning Sequence Motif Models Using Expectation Maximization (EM) Colin Dewey February 14, 2008 1 Sequence Motifs what is a sequence motif? a sequence pattern of biological significance typically
More informationMotifs and Logos. Six Introduction to Bioinformatics. Importance and Abundance of Motifs. Getting the CDS. From DNA to Protein 6.1.
Motifs and Logos Six Discovering Genomics, Proteomics, and Bioinformatics by A. Malcolm Campbell and Laurie J. Heyer Chapter 2 Genome Sequence Acquisition and Analysis Sami Khuri Department of Computer
More information