Stochastic processes and

Size: px
Start display at page:

Download "Stochastic processes and"

Transcription

1 Stochastic processes and Markov chains (part II) Wessel van Wieringen nl Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University Amsterdam, The Netherlands

2 Example

3 Example DNA copy number of a genomic segment is simply the number of copies of that segment present in the cell under study. Healthy normal cell: chr 1 : 2 chr 22 : 2 chr X : 1 or 2 chr Y : 0 or 1

4 Example Chromosomes of a tumor cell Technique: SKY

5 Example The DNA copy number is often categorized into: L : loss : < 2 copies N : normal : 2 copies G : gain : > 2 copies In cancer: The number of DNA copy number aberrations accumulates with the progression of the disease. DNA copy number aberrations are believed to be irreversible. Let us model the accumulation process of DNA copy number aberrations.

6 Example State diagram for the accumulation process of a locus. G G G G N N N N N L L L L t 0 t 1 t 2 t 3 time

7 Example The associated initial distribution: and, associated transition matrix: with parameter constraints:

8 Example Calculate the probability of a loss, normal and gain at this locus after p generations: Using: These probabilities simplify to, e.g.:

9 Example In practice, a sample is only observed once the cancer has already developed. Hence, the number of generations p is unknown. This may be accommodated by modeling p as being Poisson distributed: This yields, e.g.:

10 Example So far, we only considered one locus. Hence: DNA i DNA i DNA i DNA i t 0 t 1 t 2 t 3

11 Example For multiple loci: DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA i DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA DNA t 0 t 1 t 2 t 3

12 Example Multiple loci multivariate problem. Complications: p unknown, loci not independent. Solution: p random, assume particular dependence structure. After likelihood formulation and parameter estimation: After likelihood formulation and parameter estimation: identify most aberrated loci, reconstruct time of onset of cancer.

13 Stationary distribution

14 Stationary distribution We generated a DNA sequence of bases long in accordance with a 1 st order Markov chain. For stretches DNA ever longer and ever farther away from the first base pair we calculated l the nucleotide %. bp bp %A %C %G %T e e e e e rgence e conve

15 Stationary distribution Hence, after a while an equilibrium sets in. Not necessarily a fixed state or pattern, but: the proportion of a time period that is spent in a particular state t converges to a limit it value. The limit values of all states form the stationary The limit values of all states form the stationary distribution of the Markov process, denoted by:

16 Stationary distribution For a stationary process, it holds that P(X t =E i ) = φ i for all t and i. In particular: P(X t =EE i ) = φ i = P(X t+1 =EE i ). Of course, this does not imply: P(X t =E i, X t+1 =E i )= φ i φ i as this ignores the 1 st order Markov dependency. d

17 Stationary distribution The stationary distribution is associated with the first- order Markov process, parameterized by (π, P). Question How do φ and (π, P) relate? Hereto, recall: φ i = P(X t =E i )

18 Stationary distribution We then obtain the following relation: definition

19 Stationary distribution We then obtain the following relation: use the fact that P(A, B) +P(A P(A, B C )=P(A)

20 Stationary distribution We then obtain the following relation: use the definition of conditional probability: P(A, B) = P(A B) P(B)

21 Stationary distribution We then obtain the following relation:

22 Stationary distribution We then obtain the following relation: Thus: Eigenvectors!

23 Stationary distribution Theorem Irreducible, aperiodic Markov chains with a finite it state t space S have a stationary distribution to which the chain converges as t. A C G T φ A =0.1, 01 φ C =0.2, 02 φ G =0.3, 03 φ T =0.4 04

24 Stationary distribution A Markov chain is aperiodic if there is no state that can only be reached at multiples of a certain period. E.g., state E i only at t = 0, 3, 6, 9, et cetera. Example of an aperiodic Markov chain A C G T p AC A A C p CA C p GC p GC G p GA p AG p TC T ATCGATCGATCGATCG p AA p GG G p GT p TG T p CT p CC p TT

25 Stationary distribution Example of a periodic Markov chain A C G T A C G T A p AT =1 p GA =1 C p CG =1 p TC =1 ATCGATCGATCGATCG G T The fraction of time spent in A (roughly φ A ): P(X t+1000 = A) ) = ¼ whereas P(X t+1000 = A X 4 = A) = 1.

26 Stationary distribution A Markov chain is irreducible if every state can (in principle) be reached (after enough time has passed) from every other state. Examples of a reducible Markov chain A C A C G T G T C is an absorbing state. A will not be reached.

27 Stationary distribution How do we find the stationary distribution? We know the stationary distribution satisfies: (*) and We thus have S+1 equations for S unknowns: one of the equations in (*) can be dropped (which is irrelevant), and the system of S remaining equations needs to be solved.

28 Stationary distribution Consider the transition matrix: In order to find the stationary distribution we need to solve the following system of equations: This yields: (φ 1, φ 2 ) T = ( , ) T

29 Stationary distribution On the other hand, for n large: P(X t+n =E j X t =E i ) = φ j is independent of i. Or,p (n) ij = φ j. Hence, the n-step transition matrix P (n) has identical rows: This motivates a numerical way to find the stationary This motivates a numerical way to find the stationary distribution.

30 Stationary distribution Same example as before: with stationary distribution: (φ 1, φ 2 ) T = ( , ) T Then: matrix multiplication ( rows times columns )

31 Stationary distribution Thus: In similar fashion we obtain:

32 Convergence of the stationary distribution

33 Convergence to the stationary distribution Define the vector 1 = (1,,1) T. We have already seen: P n = 1 φ T for large n Question How fast does P n go to 1 φ T as n? Answer 1) Use linear algebra 2) Calculate numerically

34 Convergence to the stationary distribution Fact The transition matrix P of a finite, it aperiodic, irreducible ibl Markov chain has an eigenvalue equal to 1 (λ 1 =1), while all other eigenvalues are (in the absolute sense) smaller than one: λ k < 1, k=2,,3. Focus on λ 1 =1 1 We know φ T P = φ T for the stationary distribution. Hence, φ is the left eigenvector of eigenvalue λ=1. Also, row sums of P equal 1: P 1= 1. Hence, 1 is a right eigenvector of eigenvalue λ=1.

35 Convergence to the stationary distribution The spectral decomposition of a square matrix P is given by: where: diagonal matrix containing the eigenvalues, columns contain the corresponding eigenvectors. In case of P is symmetric, Then: is orthogonal:

36 Convergence to the stationary distribution The spectral decomposition of P, reformulated: eigenvalues right eigenvectors left eigenvectors The eigenvectors are normalized:

37 Convergence to the stationary distribution We now obtain the spectral decomposition of the n- step transition matrix P n. Hereto observe that: plug in the spectral decomposition of P

38 Convergence to the stationary distribution We now obtain the spectral decomposition of the n- step transition matrix P n. Hereto observe that: bring the right eigenvector in the sum, and use the properties of the transpose operator

39 Convergence to the stationary distribution We now obtain the spectral decomposition of the n- step transition matrix P n. Hereto observe that: the eigenvectors are normalized

40 Convergence to the stationary distribution We now obtain the spectral decomposition of the n- step transition matrix P n. Hereto observe that:

41 Convergence to the stationary distribution Repeating this argument n times yields: Hence, and are left and right eigenvector with eigenvalue of. Thus:

42 Convergence to the stationary distribution Verify the spectral decomposition for P 2 :

43 Convergence to the stationary distribution Use the spectral decomposition of P n to show how fast P n converges to 1 φ T as n. We know: λ 1 =1, λ k < 1 for k=2,,s, and Then:

44 Convergence to the stationary distribution Expanding this: as n 0 0

45 Convergence to the stationary distribution Clearly: Furthermore, as: It is the second largest (in absolute sense) eigenvalue that dominates, and thus determines the convergence speed to the stationary distribution.

46 Convergence to the stationary distribution Fact AM Markov chain with a symmetric ti P has a uniform stationary distribution. Proof Symmetry of P implies that left- and right eigenvectors are identical (up to a constant). First right eigenvector corresponds vector of ones, 1. Hence, the left eigenvector equals c1. The left eigenvector is the stationary distribution and should sum to one: c = 1 / (number of states).

47 Convergence to the stationary distribution Question Suppose the DNA may reasonably described d by a first order Markov model with transition matrix P: and stationary distribution: and eigenvalues:

48 Convergence to the stationary distribution Question What is the probability of a G at position 2, 3, 4, 5, 10? And how does this depend on the starting nucleotide? In other words, give: But also: P(X 2 = G X 1 = A) ) = P(X 2 = G X 1 = C) = P(X 2 = G X 1 = G) = P(X 2 = G X 1 = T) = P(X 3 = G X 1 = A) = et cetera.

49 Convergence to the stationary distribution Thus, calculate P(X t = G X 1 = x 1 ) with t=2, 3, 4, 5, 10, and x 1 = A, C, G, T. x1=a x1=c x1=g x1=t t= t= t= t= t= Study the influence of the first nucleotide on the calculated probability for increasing t.

50 Convergence to the stationary distribution > # define π and P > pi <- matrix(c(1, 0, 0, 0), ncol=1) > P <- matrix(c(2, 3, 3, 2, 1, 4, 4, 1, 3, 2, 2, 3, 4, 1, 1, 4), ncol=4, byrow=true)/10 > # define function that calculates the powers of a > # matrix (inefficiently though) > matrixpower <- function(x, power){ Xpower <- X for (i in 2:power){ Xpower <- Xpower %*% X } return(xpower) } > # calculate P to the power 100 > matrixpower(p, 100)

51 Convergence to the stationary distribution Question Suppose the DNA may reasonably described d by a first order Markov model with transition matrix P: and stationary distribution: and eigenvalues:

52 Convergence to the stationary distribution Again calculate P(X t = G X 1 = x 1 ) with t=2, 3, 4, 5, 10, and x 1 = A, C, G, T. x1=a x1=c x1=g x1=t t= t= t= t= t= t= Now the influence of the first nucleotide fades slowly. This can be explained by the large 2 nd eigenvalue.

53 Processes back in time

54 Processes back in time So far, we have studied Markov chains forward in time. In practice, we may wish to study processes back in time. Example Evolutionary models that describe occurrence of SNPs in DNA sequences. We aim to attribute two DNA sequences to a common ancestor. human chimp common ancestor

55 Processes back in time Consider a Markov chain {X t } t=1,2,. The reverse Markov t t,, chain {X r *} r=1,2, is then defined by: X r * = X t-r X 2 * X 1 * X 0 * X -1 * X t-2 X t-1 X t X t+1 With transition probabilities: p ij = P(X t =E j X t-1 =E i ) p ij *=P(X r *=E j X r-1 *=E i )

56 Processes back in time Show how the transition probabilities p ji * relate to p ij : just the definition

57 Processes back in time Show how the transition probabilities p ji * relate to p ij : express this in terms of the original Markov chain express this in terms of the original Markov chain using that X r * = X t-r

58 Processes back in time Show how the transition probabilities p ji * relate to p ij : apply definition of conditional probability twice (Bayes): apply definition of conditional probability twice (Bayes): P(A B) = P(B A) P(A) / P(B)

59 Processes back in time Show how the transition probabilities p ji * relate to p ij : Hence:

60 Processes back in time Check that rows of the transition matrix P* sum to one, i.e.: p i1 * + p i2 * + + p js * = 1 Hereto: use the fact that

61 Processes back in time The two Markov chains defined by P and P* have the same stationary distribution. Indeed, as: we have:

62 Processes back in time Definition A Markov chain is called reversible if p ij * = p ij. In that case: p ij * = p ij = p ji φ j / φ i Or, φ i p ij = φ j p ji for all i and j. These are the so-called detailed balance equations. Theorem A Markov chain is reversible if and only if the detailed balance equations hold.

63 Processes back in time Example 1 The 1 st order Markov chain with transition matrix: is irreversible. Check that this (deterministic) Markov chain does not satisfy the detailed balance equations. Irreversibility can be seen from a sample of this chain: ABCABCABCABCABCABC In the reverse direction transitions from B to C do not occur!

64 Processes back in time Example 2 The 1 st order Markov chain with transition matrix: is irreversible. Again, a uniform stationary distribution: As P is not symmetric, the detailed balance equations are not satisfies: p ij /3 p ji / 3 for all i and j.

65 Processes back in time Example 2 (continued) The irreversibility of this chain implies: P(A B C A) B C A A A A = P(A B) P(B C) P(C A) B B B B = 08* * * 0.1 * 0.1 = P(A C) P(C B) P(B A) = P(A C B A). It matters how one walks from A to A. C C C C A A A A B B B B C C C C Or, it matters whether one walks forward or backward.

66 Processes back in time Kolmogorov condition for reversibility A stationary Markov chain is reversible if and only if any path from state E i to state E i has the same probability as the path in the opposite direction. Or, the Markov chain is reversible if and only if: for all i, i 1, i 2,, i k. 1 2 k Eg: E.g.: P(A B C A) = P(A C B A). A A A A A A A A B B B B B B B B C C C C C C C C

67 Processes back in time Interpretation For a reversible Markov chain it is not possible to determine the direction of the process from the observed state sequence alone. Molecular l phylogenetics aims to reconstruct t evolutionary relationships between present day species from their DNA sequences. Reversibility is then an essential assumption. Genes are transcribed in one direction only (from the 3 end to the 5 end). The promoter is only on the 3 end. This suggests irreversibility. e

68 Processes back in time E. Coli For a gene in the E. Coli genome, we estimate: Transition matrix [,1] [,2] [,3] [,4] [1,] [2,] [3,] [4,] Stationary distribution [1] Then, the detailed balance equation do not hold, e.g.: π 1 p 12 π 2 p 21.

69 Processes back in time Note Within evolution theory the notion of irreversibility refers to the presumption that complex organisms once lost evolution will not appear in the same form. Indeed, the likelihood of reconstructing a particular phylogenic system is infinitesimal small.

70 Application: motifs

71 Application: motifs Study the sequence of the promotor region up-stream of a gene. gene DNA upstream promotor region This region contains binding sites for the transcription factors that regulate the transcription of the gene. gene DNA

72 Application: motifs The binding sites of a transciption factor (that may regulate multiple genes) share certain sequence patterns, motifs. Not all transcription factors and motifs are known. Hence, a high occurrence of a particular sequence pattern in the upstream regions of a gene may indicate that it has a regulatory function (e.g., binding site). Problem Determine the probability of observing m motifs in a background generated by a 1 st order stationary Markov chain.

73 Application: motifs An h-letter word W = w 1 w 2 w h is a map from {1,, h} to, where some non-empty set, called the alphabet. In the DNA example: and, e.g.:

74 Application: motifs A word W is p-periodic if The lag between two overlapping occurrences of the word. The set of all periods of W (less than h) is the period set, p ( ) p, denoted by. In other words, the set of integers 0 < p < h such that a new occurrence of W can start p letters after an occurrence of W.

75 Application: motifs If W 1 = CGATCGATCG, then For: CGATCGATC CGATCGATCGATC CGATCGATCGATCGATC If W 2 = CGAACTG, then

76 Application: motifs Let N(W) be the number of (overlapping) occurrences of an h-letter word W in a random sequence n on A. If Y t is the random variable defined by: then

77 Application: motifs Also, denote the number of occurences in W by: and If W = CATGA, then: n(a) = 2 and n(a ) = 1

78 Application: motifs Assume a stationary 1 st order Markov model for the random sequence of length n. The probability of an occurrence of W in the random sequence is given by:

79 Application: motifs In the 1 st order Markov model, the expected number of occurrences of W is approximated by: and its variance by:

80 Application: motifs To find words with exceptionally frequency in the DNA, the following (asymptotically) standard normal statistic is used: The p-value of word W is then given by:

81 Application: motifs Note Robin, Daudon (1999) provide exact probabilities of word occurences in random sequences. However, Robin, Schbath (2001) point out that calculation of the exact probabilities is computationally intensive. Hence, the use of an approximation here.

82 References & further reading

83 References and further reading Ewens, W.J, Grant, G (2006), Statistical Methods for Bioinformatics, Springer, New York. Reinert, G., Schbath, S., Waterman, M.S. (2000), Probabilistic and statistical properties of words: an overview, Journal of Computational Biology, 7, Robin, S., Daudin, J.-J. (1999), Exact distribution of word occurrences in a random sequence of letters, Journal of Applied Probability, 36, Robin, S., Schbath, S. (2001), Numerical comparison of several approximations of the word count distribution ib ti in random sequences, Journal of Computational Biology, 8(4), Schbath, S. (2000), An overview on the distribution of word counts in Markov chains, Journal of Computational. Biology, 7, Schbath, S., Robin, R. (2009), How can pattern statistics be useful for DNA motif discovery?. In Scan Statistics: ti ti Methods and Applications by Glaz, J. et al. (eds.).

84 This material is provided under the Creative Commons Attribution/Share-Alike/Non-Commercial License. See for details.

Stochastic processes and Markov chains (part II)

Stochastic processes and Markov chains (part II) Stochastic processes and Markov chains (part II) Wessel van Wieringen w.n.van.wieringen@vu.nl Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University Amsterdam, The

More information

Stochastic processes and

Stochastic processes and Stochastic processes and Markov chains (part I) Wessel van Wieringen w.n.van.wieringen@vu.nl wieringen@vu nl Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University

More information

Answers Lecture 2 Stochastic Processes and Markov Chains, Part2

Answers Lecture 2 Stochastic Processes and Markov Chains, Part2 Answers Lecture 2 Stochastic Processes and Markov Chains, Part2 Question 1 Question 1a Solve system of equations: (1 a)φ 1 + bφ 2 φ 1 φ 1 + φ 2 1. From the second equation we obtain φ 2 1 φ 1. Substitition

More information

Markov chains and the number of occurrences of a word in a sequence ( , 11.1,2,4,6)

Markov chains and the number of occurrences of a word in a sequence ( , 11.1,2,4,6) Markov chains and the number of occurrences of a word in a sequence (4.5 4.9,.,2,4,6) Prof. Tesler Math 283 Fall 208 Prof. Tesler Markov Chains Math 283 / Fall 208 / 44 Locating overlapping occurrences

More information

Lecture 3: Markov chains.

Lecture 3: Markov chains. 1 BIOINFORMATIK II PROBABILITY & STATISTICS Summer semester 2008 The University of Zürich and ETH Zürich Lecture 3: Markov chains. Prof. Andrew Barbour Dr. Nicolas Pétrélis Adapted from a course by Dr.

More information

Biology 644: Bioinformatics

Biology 644: Bioinformatics A stochastic (probabilistic) model that assumes the Markov property Markov property is satisfied when the conditional probability distribution of future states of the process (conditional on both past

More information

Lecture #5. Dependencies along the genome

Lecture #5. Dependencies along the genome Markov Chains Lecture #5 Background Readings: Durbin et. al. Section 3., Polanski&Kimmel Section 2.8. Prepared by Shlomo Moran, based on Danny Geiger s and Nir Friedman s. Dependencies along the genome

More information

8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains

8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains 8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains 8.1 Review 8.2 Statistical Equilibrium 8.3 Two-State Markov Chain 8.4 Existence of P ( ) 8.5 Classification of States

More information

Statistics 992 Continuous-time Markov Chains Spring 2004

Statistics 992 Continuous-time Markov Chains Spring 2004 Summary Continuous-time finite-state-space Markov chains are stochastic processes that are widely used to model the process of nucleotide substitution. This chapter aims to present much of the mathematics

More information

Markov Chains, Stochastic Processes, and Matrix Decompositions

Markov Chains, Stochastic Processes, and Matrix Decompositions Markov Chains, Stochastic Processes, and Matrix Decompositions 5 May 2014 Outline 1 Markov Chains Outline 1 Markov Chains 2 Introduction Perron-Frobenius Matrix Decompositions and Markov Chains Spectral

More information

Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)

Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from

More information

O 3 O 4 O 5. q 3. q 4. Transition

O 3 O 4 O 5. q 3. q 4. Transition Hidden Markov Models Hidden Markov models (HMM) were developed in the early part of the 1970 s and at that time mostly applied in the area of computerized speech recognition. They are first described in

More information

Variable Selection in Structured High-dimensional Covariate Spaces

Variable Selection in Structured High-dimensional Covariate Spaces Variable Selection in Structured High-dimensional Covariate Spaces Fan Li 1 Nancy Zhang 2 1 Department of Health Care Policy Harvard University 2 Department of Statistics Stanford University May 14 2007

More information

Introduction to Hidden Markov Models for Gene Prediction ECE-S690

Introduction to Hidden Markov Models for Gene Prediction ECE-S690 Introduction to Hidden Markov Models for Gene Prediction ECE-S690 Outline Markov Models The Hidden Part How can we use this for gene prediction? Learning Models Want to recognize patterns (e.g. sequence

More information

MCMC: Markov Chain Monte Carlo

MCMC: Markov Chain Monte Carlo I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov

More information

Example Linear Algebra Competency Test

Example Linear Algebra Competency Test Example Linear Algebra Competency Test The 4 questions below are a combination of True or False, multiple choice, fill in the blank, and computations involving matrices and vectors. In the latter case,

More information

STOCHASTIC PROCESSES Basic notions

STOCHASTIC PROCESSES Basic notions J. Virtamo 38.3143 Queueing Theory / Stochastic processes 1 STOCHASTIC PROCESSES Basic notions Often the systems we consider evolve in time and we are interested in their dynamic behaviour, usually involving

More information

Markov Models & DNA Sequence Evolution

Markov Models & DNA Sequence Evolution 7.91 / 7.36 / BE.490 Lecture #5 Mar. 9, 2004 Markov Models & DNA Sequence Evolution Chris Burge Review of Markov & HMM Models for DNA Markov Models for splice sites Hidden Markov Models - looking under

More information

Introduction to Bioinformatics

Introduction to Bioinformatics CSCI8980: Applied Machine Learning in Computational Biology Introduction to Bioinformatics Rui Kuang Department of Computer Science and Engineering University of Minnesota kuang@cs.umn.edu History of Bioinformatics

More information

Lecture Notes: Markov chains

Lecture Notes: Markov chains Computational Genomics and Molecular Biology, Fall 5 Lecture Notes: Markov chains Dannie Durand At the beginning of the semester, we introduced two simple scoring functions for pairwise alignments: a similarity

More information

The Distribution of Mixing Times in Markov Chains

The Distribution of Mixing Times in Markov Chains The Distribution of Mixing Times in Markov Chains Jeffrey J. Hunter School of Computing & Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand December 2010 Abstract The distribution

More information

Lecture 7: Simple genetic circuits I

Lecture 7: Simple genetic circuits I Lecture 7: Simple genetic circuits I Paul C Bressloff (Fall 2018) 7.1 Transcription and translation In Fig. 20 we show the two main stages in the expression of a single gene according to the central dogma.

More information

URN MODELS: the Ewens Sampling Lemma

URN MODELS: the Ewens Sampling Lemma Department of Computer Science Brown University, Providence sorin@cs.brown.edu October 3, 2014 1 2 3 4 Mutation Mutation: typical values for parameters Equilibrium Probability of fixation 5 6 Ewens Sampling

More information

Bioinformatics 2 - Lecture 4

Bioinformatics 2 - Lecture 4 Bioinformatics 2 - Lecture 4 Guido Sanguinetti School of Informatics University of Edinburgh February 14, 2011 Sequences Many data types are ordered, i.e. you can naturally say what is before and what

More information

Hidden Markov Models. music recognition. deal with variations in - pitch - timing - timbre 2

Hidden Markov Models. music recognition. deal with variations in - pitch - timing - timbre 2 Hidden Markov Models based on chapters from the book Durbin, Eddy, Krogh and Mitchison Biological Sequence Analysis Shamir s lecture notes and Rabiner s tutorial on HMM 1 music recognition deal with variations

More information

Finite-Horizon Statistics for Markov chains

Finite-Horizon Statistics for Markov chains Analyzing FSDT Markov chains Friday, September 30, 2011 2:03 PM Simulating FSDT Markov chains, as we have said is very straightforward, either by using probability transition matrix or stochastic update

More information

2 Discrete-Time Markov Chains

2 Discrete-Time Markov Chains 2 Discrete-Time Markov Chains Angela Peace Biomathematics II MATH 5355 Spring 2017 Lecture notes follow: Allen, Linda JS. An introduction to stochastic processes with applications to biology. CRC Press,

More information

Note that in the example in Lecture 1, the state Home is recurrent (and even absorbing), but all other states are transient. f ii (n) f ii = n=1 < +

Note that in the example in Lecture 1, the state Home is recurrent (and even absorbing), but all other states are transient. f ii (n) f ii = n=1 < + Random Walks: WEEK 2 Recurrence and transience Consider the event {X n = i for some n > 0} by which we mean {X = i}or{x 2 = i,x i}or{x 3 = i,x 2 i,x i},. Definition.. A state i S is recurrent if P(X n

More information

LECTURE NOTES: Discrete time Markov Chains (2/24/03; SG)

LECTURE NOTES: Discrete time Markov Chains (2/24/03; SG) LECTURE NOTES: Discrete time Markov Chains (2/24/03; SG) A discrete time Markov Chains is a stochastic dynamical system in which the probability of arriving in a particular state at a particular time moment

More information

(b) What is the variance of the time until the second customer arrives, starting empty, assuming that we measure time in minutes?

(b) What is the variance of the time until the second customer arrives, starting empty, assuming that we measure time in minutes? IEOR 3106: Introduction to Operations Research: Stochastic Models Fall 2006, Professor Whitt SOLUTIONS to Final Exam Chapters 4-7 and 10 in Ross, Tuesday, December 19, 4:10pm-7:00pm Open Book: but only

More information

Markov Chains Handout for Stat 110

Markov Chains Handout for Stat 110 Markov Chains Handout for Stat 0 Prof. Joe Blitzstein (Harvard Statistics Department) Introduction Markov chains were first introduced in 906 by Andrey Markov, with the goal of showing that the Law of

More information

Markov Chains. Sarah Filippi Department of Statistics TA: Luke Kelly

Markov Chains. Sarah Filippi Department of Statistics  TA: Luke Kelly Markov Chains Sarah Filippi Department of Statistics http://www.stats.ox.ac.uk/~filippi TA: Luke Kelly With grateful acknowledgements to Prof. Yee Whye Teh's slides from 2013 14. Schedule 09:30-10:30 Lecture:

More information

Computational Biology and Chemistry

Computational Biology and Chemistry Computational Biology and Chemistry 33 (2009) 245 252 Contents lists available at ScienceDirect Computational Biology and Chemistry journal homepage: www.elsevier.com/locate/compbiolchem Research Article

More information

EVOLUTIONARY DISTANCES

EVOLUTIONARY DISTANCES EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:

More information

CS145: Probability & Computing Lecture 18: Discrete Markov Chains, Equilibrium Distributions

CS145: Probability & Computing Lecture 18: Discrete Markov Chains, Equilibrium Distributions CS145: Probability & Computing Lecture 18: Discrete Markov Chains, Equilibrium Distributions Instructor: Erik Sudderth Brown University Computer Science April 14, 215 Review: Discrete Markov Chains Some

More information

Stochastic Processes

Stochastic Processes Stochastic Processes 8.445 MIT, fall 20 Mid Term Exam Solutions October 27, 20 Your Name: Alberto De Sole Exercise Max Grade Grade 5 5 2 5 5 3 5 5 4 5 5 5 5 5 6 5 5 Total 30 30 Problem :. True / False

More information

Lecture 4: Hidden Markov Models: An Introduction to Dynamic Decision Making. November 11, 2010

Lecture 4: Hidden Markov Models: An Introduction to Dynamic Decision Making. November 11, 2010 Hidden Lecture 4: Hidden : An Introduction to Dynamic Decision Making November 11, 2010 Special Meeting 1/26 Markov Model Hidden When a dynamical system is probabilistic it may be determined by the transition

More information

Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM).

Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). 1 Bioinformatics: In-depth PROBABILITY & STATISTICS Spring Semester 2011 University of Zürich and ETH Zürich Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). Dr. Stefanie Muff

More information

Stochastic processes. MAS275 Probability Modelling. Introduction and Markov chains. Continuous time. Markov property

Stochastic processes. MAS275 Probability Modelling. Introduction and Markov chains. Continuous time. Markov property Chapter 1: and Markov chains Stochastic processes We study stochastic processes, which are families of random variables describing the evolution of a quantity with time. In some situations, we can treat

More information

Reading for Lecture 13 Release v10

Reading for Lecture 13 Release v10 Reading for Lecture 13 Release v10 Christopher Lee November 15, 2011 Contents 1 Evolutionary Trees i 1.1 Evolution as a Markov Process...................................... ii 1.2 Rooted vs. Unrooted Trees........................................

More information

Mining Infrequent Patterns of Two Frequent Substrings from a Single Set of Biological Sequences

Mining Infrequent Patterns of Two Frequent Substrings from a Single Set of Biological Sequences Mining Infrequent Patterns of Two Frequent Substrings from a Single Set of Biological Sequences Daisuke Ikeda Department of Informatics, Kyushu University 744 Moto-oka, Fukuoka 819-0395, Japan. daisuke@inf.kyushu-u.ac.jp

More information

18.175: Lecture 30 Markov chains

18.175: Lecture 30 Markov chains 18.175: Lecture 30 Markov chains Scott Sheffield MIT Outline Review what you know about finite state Markov chains Finite state ergodicity and stationarity More general setup Outline Review what you know

More information

Homework set 2 - Solutions

Homework set 2 - Solutions Homework set 2 - Solutions Math 495 Renato Feres Simulating a Markov chain in R Generating sample sequences of a finite state Markov chain. The following is a simple program for generating sample sequences

More information

Unit 16: Hidden Markov Models

Unit 16: Hidden Markov Models Computational Statistics with Application to Bioinformatics Prof. William H. Press Spring Term, 2008 The University of Texas at Austin Unit 16: Hidden Markov Models The University of Texas at Austin, CS

More information

Statistics 150: Spring 2007

Statistics 150: Spring 2007 Statistics 150: Spring 2007 April 23, 2008 0-1 1 Limiting Probabilities If the discrete-time Markov chain with transition probabilities p ij is irreducible and positive recurrent; then the limiting probabilities

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

Course 495: Advanced Statistical Machine Learning/Pattern Recognition

Course 495: Advanced Statistical Machine Learning/Pattern Recognition Course 495: Advanced Statistical Machine Learning/Pattern Recognition Lecturer: Stefanos Zafeiriou Goal (Lectures): To present discrete and continuous valued probabilistic linear dynamical systems (HMMs

More information

Introduction to Computational Biology Lecture # 14: MCMC - Markov Chain Monte Carlo

Introduction to Computational Biology Lecture # 14: MCMC - Markov Chain Monte Carlo Introduction to Computational Biology Lecture # 14: MCMC - Markov Chain Monte Carlo Assaf Weiner Tuesday, March 13, 2007 1 Introduction Today we will return to the motif finding problem, in lecture 10

More information

AARMS Homework Exercises

AARMS Homework Exercises 1 For the gamma distribution, AARMS Homework Exercises (a) Show that the mgf is M(t) = (1 βt) α for t < 1/β (b) Use the mgf to find the mean and variance of the gamma distribution 2 A well-known inequality

More information

Evolutionary Models. Evolutionary Models

Evolutionary Models. Evolutionary Models Edit Operators In standard pairwise alignment, what are the allowed edit operators that transform one sequence into the other? Describe how each of these edit operations are represented on a sequence alignment

More information

The Wright-Fisher Model and Genetic Drift

The Wright-Fisher Model and Genetic Drift The Wright-Fisher Model and Genetic Drift January 22, 2015 1 1 Hardy-Weinberg Equilibrium Our goal is to understand the dynamics of allele and genotype frequencies in an infinite, randomlymating population

More information

Modeling Noise in Genetic Sequences

Modeling Noise in Genetic Sequences Modeling Noise in Genetic Sequences M. Radavičius 1 and T. Rekašius 2 1 Institute of Mathematics and Informatics, Vilnius, Lithuania 2 Vilnius Gediminas Technical University, Vilnius, Lithuania 1. Introduction:

More information

TMA 4265 Stochastic Processes Semester project, fall 2014 Student number and

TMA 4265 Stochastic Processes Semester project, fall 2014 Student number and TMA 4265 Stochastic Processes Semester project, fall 2014 Student number 730631 and 732038 Exercise 1 We shall study a discrete Markov chain (MC) {X n } n=0 with state space S = {0, 1, 2, 3, 4, 5, 6}.

More information

Lecture 5: Random Walks and Markov Chain

Lecture 5: Random Walks and Markov Chain Spectral Graph Theory and Applications WS 20/202 Lecture 5: Random Walks and Markov Chain Lecturer: Thomas Sauerwald & He Sun Introduction to Markov Chains Definition 5.. A sequence of random variables

More information

Stochastic Modelling Unit 1: Markov chain models

Stochastic Modelling Unit 1: Markov chain models Stochastic Modelling Unit 1: Markov chain models Russell Gerrard and Douglas Wright Cass Business School, City University, London June 2004 Contents of Unit 1 1 Stochastic Processes 2 Markov Chains 3 Poisson

More information

A Relationship Between Minimum Bending Energy and Degree Elevation for Bézier Curves

A Relationship Between Minimum Bending Energy and Degree Elevation for Bézier Curves A Relationship Between Minimum Bending Energy and Degree Elevation for Bézier Curves David Eberly, Geometric Tools, Redmond WA 9852 https://www.geometrictools.com/ This work is licensed under the Creative

More information

ELEC633: Graphical Models

ELEC633: Graphical Models ELEC633: Graphical Models Tahira isa Saleem Scribe from 7 October 2008 References: Casella and George Exploring the Gibbs sampler (1992) Chib and Greenberg Understanding the Metropolis-Hastings algorithm

More information

6.207/14.15: Networks Lectures 4, 5 & 6: Linear Dynamics, Markov Chains, Centralities

6.207/14.15: Networks Lectures 4, 5 & 6: Linear Dynamics, Markov Chains, Centralities 6.207/14.15: Networks Lectures 4, 5 & 6: Linear Dynamics, Markov Chains, Centralities 1 Outline Outline Dynamical systems. Linear and Non-linear. Convergence. Linear algebra and Lyapunov functions. Markov

More information

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Prof. Tesler Math 283 Fall 2016 Also see the separate version of this with Matlab and R commands. Prof. Tesler Diagonalizing

More information

SMSTC (2007/08) Probability.

SMSTC (2007/08) Probability. SMSTC (27/8) Probability www.smstc.ac.uk Contents 12 Markov chains in continuous time 12 1 12.1 Markov property and the Kolmogorov equations.................... 12 2 12.1.1 Finite state space.................................

More information

Lasso regression. Wessel van Wieringen

Lasso regression. Wessel van Wieringen Lasso regression Wessel van Wieringen w.n.van.wieringen@vu.nl Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University Amsterdam, The Netherlands Lasso regression Instead

More information

Stochastic modelling of epidemic spread

Stochastic modelling of epidemic spread Stochastic modelling of epidemic spread Julien Arino Department of Mathematics University of Manitoba Winnipeg Julien Arino@umanitoba.ca 19 May 2012 1 Introduction 2 Stochastic processes 3 The SIS model

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning March May, 2013 Schedule Update Introduction 03/13/2015 (10:15-12:15) Sala conferenze MDPs 03/18/2015 (10:15-12:15) Sala conferenze Solving MDPs 03/20/2015 (10:15-12:15) Aula Alpha

More information

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals Satoshi AOKI and Akimichi TAKEMURA Graduate School of Information Science and Technology University

More information

N.G.Bean, D.A.Green and P.G.Taylor. University of Adelaide. Adelaide. Abstract. process of an MMPP/M/1 queue is not a MAP unless the queue is a

N.G.Bean, D.A.Green and P.G.Taylor. University of Adelaide. Adelaide. Abstract. process of an MMPP/M/1 queue is not a MAP unless the queue is a WHEN IS A MAP POISSON N.G.Bean, D.A.Green and P.G.Taylor Department of Applied Mathematics University of Adelaide Adelaide 55 Abstract In a recent paper, Olivier and Walrand (994) claimed that the departure

More information

Mathematical Methods for Computer Science

Mathematical Methods for Computer Science Mathematical Methods for Computer Science Computer Science Tripos, Part IB Michaelmas Term 2016/17 R.J. Gibbens Problem sheets for Probability methods William Gates Building 15 JJ Thomson Avenue Cambridge

More information

Markov Chain Model for ALOHA protocol

Markov Chain Model for ALOHA protocol Markov Chain Model for ALOHA protocol Laila Daniel and Krishnan Narayanan April 22, 2012 Outline of the talk A Markov chain (MC) model for Slotted ALOHA Basic properties of Discrete-time Markov Chain Stability

More information

Regulatory Sequence Analysis. Sequence models (Bernoulli and Markov models)

Regulatory Sequence Analysis. Sequence models (Bernoulli and Markov models) Regulatory Sequence Analysis Sequence models (Bernoulli and Markov models) 1 Why do we need random models? Any pattern discovery relies on an underlying model to estimate the random expectation. This model

More information

ROBI POLIKAR. ECE 402/504 Lecture Hidden Markov Models IGNAL PROCESSING & PATTERN RECOGNITION ROWAN UNIVERSITY

ROBI POLIKAR. ECE 402/504 Lecture Hidden Markov Models IGNAL PROCESSING & PATTERN RECOGNITION ROWAN UNIVERSITY BIOINFORMATICS Lecture 11-12 Hidden Markov Models ROBI POLIKAR 2011, All Rights Reserved, Robi Polikar. IGNAL PROCESSING & PATTERN RECOGNITION LABORATORY @ ROWAN UNIVERSITY These lecture notes are prepared

More information

Karaliopoulou Margarita 1. Introduction

Karaliopoulou Margarita 1. Introduction ESAIM: Probability and Statistics URL: http://www.emath.fr/ps/ Will be set by the publisher ON THE NUMBER OF WORD OCCURRENCES IN A SEMI-MARKOV SEQUENCE OF LETTERS Karaliopoulou Margarita 1 Abstract. Let

More information

Sequence analysis and comparison

Sequence analysis and comparison The aim with sequence identification: Sequence analysis and comparison Marjolein Thunnissen Lund September 2012 Is there any known protein sequence that is homologous to mine? Are there any other species

More information

Computational Biology: Basics & Interesting Problems

Computational Biology: Basics & Interesting Problems Computational Biology: Basics & Interesting Problems Summary Sources of information Biological concepts: structure & terminology Sequencing Gene finding Protein structure prediction Sources of information

More information

The genome encodes biology as patterns or motifs. We search the genome for biologically important patterns.

The genome encodes biology as patterns or motifs. We search the genome for biologically important patterns. Curriculum, fourth lecture: Niels Richard Hansen November 30, 2011 NRH: Handout pages 1-8 (NRH: Sections 2.1-2.5) Keywords: binomial distribution, dice games, discrete probability distributions, geometric

More information

Pattern Matching (Exact Matching) Overview

Pattern Matching (Exact Matching) Overview CSI/BINF 5330 Pattern Matching (Exact Matching) Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Pattern Matching Exhaustive Search DFA Algorithm KMP Algorithm

More information

Molecular evolution 2. Please sit in row K or forward

Molecular evolution 2. Please sit in row K or forward Molecular evolution 2 Please sit in row K or forward RBFD: cat, mouse, parasite Toxoplamsa gondii cyst in a mouse brain http://phenomena.nationalgeographic.com/2013/04/26/mind-bending-parasite-permanently-quells-cat-fear-in-mice/

More information

- mutations can occur at different levels from single nucleotide positions in DNA to entire genomes.

- mutations can occur at different levels from single nucleotide positions in DNA to entire genomes. February 8, 2005 Bio 107/207 Winter 2005 Lecture 11 Mutation and transposable elements - the term mutation has an interesting history. - as far back as the 17th century, it was used to describe any drastic

More information

MAS275 Probability Modelling Exercises

MAS275 Probability Modelling Exercises MAS75 Probability Modelling Exercises Note: these questions are intended to be of variable difficulty. In particular: Questions or part questions labelled (*) are intended to be a bit more challenging.

More information

Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University)

Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University) Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University) The authors explain how the NCut algorithm for graph bisection

More information

This operation is - associative A + (B + C) = (A + B) + C; - commutative A + B = B + A; - has a neutral element O + A = A, here O is the null matrix

This operation is - associative A + (B + C) = (A + B) + C; - commutative A + B = B + A; - has a neutral element O + A = A, here O is the null matrix 1 Matrix Algebra Reading [SB] 81-85, pp 153-180 11 Matrix Operations 1 Addition a 11 a 12 a 1n a 21 a 22 a 2n a m1 a m2 a mn + b 11 b 12 b 1n b 21 b 22 b 2n b m1 b m2 b mn a 11 + b 11 a 12 + b 12 a 1n

More information

Prediction o f of cis -regulatory B inding Binding Sites and Regulons 11/ / /

Prediction o f of cis -regulatory B inding Binding Sites and Regulons 11/ / / Prediction of cis-regulatory Binding Sites and Regulons 11/21/2008 Mapping predicted motifs to their cognate TFs After prediction of cis-regulatory motifs in genome, we want to know what are their cognate

More information

Hidden Markov Models and some applications

Hidden Markov Models and some applications Oleg Makhnin New Mexico Tech Dept. of Mathematics November 11, 2011 HMM description Application to genetic analysis Applications to weather and climate modeling Discussion HMM description Application to

More information

Genotype Imputation. Biostatistics 666

Genotype Imputation. Biostatistics 666 Genotype Imputation Biostatistics 666 Previously Hidden Markov Models for Relative Pairs Linkage analysis using affected sibling pairs Estimation of pairwise relationships Identity-by-Descent Relatives

More information

Mutation models I: basic nucleotide sequence mutation models

Mutation models I: basic nucleotide sequence mutation models Mutation models I: basic nucleotide sequence mutation models Peter Beerli September 3, 009 Mutations are irreversible changes in the DNA. This changes may be introduced by chance, by chemical agents, or

More information

Phylogenetic Assumptions

Phylogenetic Assumptions Substitution Models and the Phylogenetic Assumptions Vivek Jayaswal Lars S. Jermiin COMMONWEALTH OF AUSTRALIA Copyright htregulation WARNING This material has been reproduced and communicated to you by

More information

High-dimensional data: Exploratory data analysis

High-dimensional data: Exploratory data analysis High-dimensional data: Exploratory data analysis Mark van de Wiel mark.vdwiel@vumc.nl Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University Contributions by Wessel

More information

Statistical NLP: Hidden Markov Models. Updated 12/15

Statistical NLP: Hidden Markov Models. Updated 12/15 Statistical NLP: Hidden Markov Models Updated 12/15 Markov Models Markov models are statistical tools that are useful for NLP because they can be used for part-of-speech-tagging applications Their first

More information

LEARNING DYNAMIC SYSTEMS: MARKOV MODELS

LEARNING DYNAMIC SYSTEMS: MARKOV MODELS LEARNING DYNAMIC SYSTEMS: MARKOV MODELS Markov Process and Markov Chains Hidden Markov Models Kalman Filters Types of dynamic systems Problem of future state prediction Predictability Observability Easily

More information

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008 MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, Networks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Prof. Tesler Math 283 Fall 2018 Also see the separate version of this with Matlab and R commands. Prof. Tesler Diagonalizing

More information

Chapter 1. Vectors, Matrices, and Linear Spaces

Chapter 1. Vectors, Matrices, and Linear Spaces 1.7 Applications to Population Distributions 1 Chapter 1. Vectors, Matrices, and Linear Spaces 1.7. Applications to Population Distributions Note. In this section we break a population into states and

More information

INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Lecture -33 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc.

INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Lecture -33 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY Lecture -33 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. Summary of the previous lecture Regression on Principal components

More information

Stochastic modelling of epidemic spread

Stochastic modelling of epidemic spread Stochastic modelling of epidemic spread Julien Arino Centre for Research on Inner City Health St Michael s Hospital Toronto On leave from Department of Mathematics University of Manitoba Julien Arino@umanitoba.ca

More information

MATH3200, Lecture 31: Applications of Eigenvectors. Markov Chains and Chemical Reaction Systems

MATH3200, Lecture 31: Applications of Eigenvectors. Markov Chains and Chemical Reaction Systems Lecture 31: Some Applications of Eigenvectors: Markov Chains and Chemical Reaction Systems Winfried Just Department of Mathematics, Ohio University April 9 11, 2018 Review: Eigenvectors and left eigenvectors

More information

CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools. Giri Narasimhan

CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools. Giri Narasimhan CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools Giri Narasimhan ECS 254; Phone: x3748 giri@cis.fiu.edu www.cis.fiu.edu/~giri/teach/bioinfs15.html Describing & Modeling Patterns

More information

3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM

3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM I529: Machine Learning in Bioinformatics (Spring 2017) Content HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University,

More information

Humans have two copies of each chromosome. Inherited from mother and father. Genotyping technologies do not maintain the phase

Humans have two copies of each chromosome. Inherited from mother and father. Genotyping technologies do not maintain the phase Humans have two copies of each chromosome Inherited from mother and father. Genotyping technologies do not maintain the phase Genotyping technologies do not maintain the phase Recall that proximal SNPs

More information

An Introduction to Entropy and Subshifts of. Finite Type

An Introduction to Entropy and Subshifts of. Finite Type An Introduction to Entropy and Subshifts of Finite Type Abby Pekoske Department of Mathematics Oregon State University pekoskea@math.oregonstate.edu August 4, 2015 Abstract This work gives an overview

More information

Lecture 8 Learning Sequence Motif Models Using Expectation Maximization (EM) Colin Dewey February 14, 2008

Lecture 8 Learning Sequence Motif Models Using Expectation Maximization (EM) Colin Dewey February 14, 2008 Lecture 8 Learning Sequence Motif Models Using Expectation Maximization (EM) Colin Dewey February 14, 2008 1 Sequence Motifs what is a sequence motif? a sequence pattern of biological significance typically

More information

Motifs and Logos. Six Introduction to Bioinformatics. Importance and Abundance of Motifs. Getting the CDS. From DNA to Protein 6.1.

Motifs and Logos. Six Introduction to Bioinformatics. Importance and Abundance of Motifs. Getting the CDS. From DNA to Protein 6.1. Motifs and Logos Six Discovering Genomics, Proteomics, and Bioinformatics by A. Malcolm Campbell and Laurie J. Heyer Chapter 2 Genome Sequence Acquisition and Analysis Sami Khuri Department of Computer

More information