Answers Lecture 2 Stochastic Processes and Markov Chains, Part 2

Question 1

Question 1a
Solve the system of equations:

    (1 - a)φ_1 + bφ_2 = φ_1,
    φ_1 + φ_2 = 1.

From the second equation we obtain φ_2 = 1 - φ_1. Substitution in the first yields

    (1 - a)φ_1 + b(1 - φ_1) = φ_1,

or:

    (a + b)φ_1 = b.

Thus: φ_1 = b/(a + b) and φ_2 = a/(a + b).

Question 1b
Reversibility is assessed through the detailed balance equations: it needs to be checked whether φ_i (P)_{i,j} = φ_j (P)_{j,i} holds for every i and j. Hereto use the answer to Question 1a. Then, e.g.:

    φ_1 (P)_{1,2} = [b/(a + b)] a = ab/(a + b) = [a/(a + b)] b = φ_2 (P)_{2,1}.

The equation also holds for the other choices of i and j. Hence, reversibility holds for all a and b.

Question 1c
Either when a = 0 = b, or when a = 1 = b.
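
As an added numerical check (not part of the original answer), the R sketch below verifies Questions 1a and 1b for the two-state transition matrix implied by the equations above; the values a = 0.3 and b = 0.7 are arbitrary example choices.

# added check of Questions 1a/1b for arbitrary example values of a and b
a <- 0.3; b <- 0.7
P <- matrix(c(1 - a, a,
              b, 1 - b), ncol=2, byrow=TRUE)
phi <- c(b, a) / (a + b)                # stationary distribution from Question 1a
phi %*% P                               # should reproduce phi
phi[1] * P[1,2] - phi[2] * P[2,1]       # detailed balance: should equal zero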

Question 2

Question 2a
The state space S comprises the states 0, 1, 2, and 3. The transition matrix is:

    P = [ 1       0         0       0
          λ    1-λ-µ        µ       0
          0       λ      1-λ-µ      µ
          0       0         0       1 ],

where λ ≥ 0, µ ≥ 0 and λ + µ ≤ 1.

Question 2b
No, the process has two absorbing states and is therefore not irreducible. Irreducibility is a prerequisite for a Markov process to have a unique stationary distribution.

Question 2c
The transition matrix now is:

    P = [ 1       0       0
          λ    1-λ-µ      µ
          0       0       1 ].

Thus, starting from state 1:

    P(X_t = 0 for some t) = λ + λ(1-λ-µ) + λ(1-λ-µ)^2 + ... = Σ_{t=0}^∞ λ(1-λ-µ)^t.

Similarly for X_t = 2. When µ = λ, P(X_t = 2 for some t) = P(X_t = 0 for some t).

Question 2d
Using Question 2c:

    Σ_{t=0}^∞ λ(1-λ-µ)^t = Σ_{t=0}^∞ µ(1-λ-µ)^t

when λ = µ.

Question 2e
The transition matrix now (after substitution of λ = µ) is:

    P = [ 1      0      0     0
          λ   1-2λ      λ     0
          0      λ   1-2λ     λ
          0      0      0     1 ].

From this we conclude that the vector of state probabilities evolves as

    (P(X_{t+1}=0), P(X_{t+1}=1), P(X_{t+1}=2), P(X_{t+1}=3))^T = P^T (P(X_t=0), P(X_t=1), P(X_t=2), P(X_t=3))^T.

Focusing on the probabilities of X_{t+1} = 1 and X_{t+1} = 2, and writing

    M = [ 1-2λ     λ
             λ  1-2λ ],

one obtains:

    (P(X_{t+1}=1), P(X_{t+1}=2))^T = M^T (P(X_t=1), P(X_t=2))^T.

Iterative application of this result yields a reformulation in terms of the initial probabilities:

    (P(X_{t+1}=1), P(X_{t+1}=2))^T = (M^T)^t (P(X_1=1), P(X_1=2))^T,

where, by the symmetry of M,

    M^t = [ a_t(λ)  b_t(λ)
            b_t(λ)  a_t(λ) ],

and a_t(λ) and b_t(λ) are both positive for λ = 0.1. In particular, a_t(λ) > b_t(λ) for all t. From the above we obtain (using P(X_1 = 1) = 1):

    P(X_{t+2} = 0) = λ Σ_{s=1}^{t+1} P(X_s = 1) = λ + λ Σ_{s=1}^{t} [a_s(λ) P(X_1=1) + b_s(λ) P(X_1=2)] = λ + λ Σ_{s=1}^{t} a_s(λ),
    P(X_{t+2} = 3) = λ Σ_{s=1}^{t+1} P(X_s = 2) = λ Σ_{s=1}^{t} b_s(λ).

Thus: P(X_{t+2} = 0) > P(X_{t+2} = 3) for all t.
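
The geometric series of Questions 2c and 2d can be checked numerically. The sketch below is an addition (not part of the original answer); it uses the arbitrary example values λ = µ = 0.1, iterates the three-state chain of Question 2c starting from state 1, and compares the mass absorbed in state 0 with the closed form λ/(λ + µ) of the series.

# added check of Questions 2c/2d with example values lambda = mu = 0.1
lambda <- 0.1; mu <- 0.1
P <- matrix(c(1, 0, 0,
              lambda, 1 - lambda - mu, mu,
              0, 0, 1), ncol=3, byrow=TRUE)
p <- c(0, 1, 0)                     # start in state 1
for (t in 1:1000){ p <- p %*% P }   # run the chain (effectively) to absorption
p[1]                                # numerical P(X_t = 0 for some t)
lambda / (lambda + mu)              # closed form of the geometric series above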

Question 3

Question 3a

# define the matrix
P <- matrix(c(0.1500, 0.3500, 0.3500, 0.1500,
              0.1660, 0.3340, 0.3340, 0.1660,
              0.1875, 0.3125, 0.3125, 0.1875,
              0.2000, 0.3000, 0.3000, 0.2000), ncol=4, byrow=TRUE)

# calculate its stationary distribution analytically
statDist <- solve(t(P[1:3, 1:3]) - t(matrix(P[4, 1:3], ncol=3, nrow=3, byrow=TRUE)) - diag(3),
                  -P[4, 1:3])
statDist <- c(statDist, 1 - sum(statDist))
statDist

# calculate its stationary distribution numerically
# define a function for raising a matrix to a power
matrixPower <- function(X, power){
    Xpower <- X
    if (power >= 2){
        for (i in 2:power){
            Xpower <- X %*% Xpower
        }
    }
    return(Xpower)
}
matrixPower(P, 1000)[1, ]

Question 3b

GCcontent <- sum(statDist[2:3])
GCcontent
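
As an added cross-check (not part of the original answer), the stationary distribution can also be obtained from the leading left eigenvector of the matrix P defined in Question 3a:

# added cross-check: stationary distribution via the leading left eigenvector of P
e <- eigen(t(P))
statDistEigen <- Re(e$vectors[, 1]) / sum(Re(e$vectors[, 1]))
statDistEigen                # should agree with statDist
sum(statDistEigen[2:3])      # and hence with the GC content of Question 3b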

Question 4

Question 4a
The detailed balance equations are given by: p_{ij} φ_i = p_{ji} φ_j. The irreducibility and aperiodicity properties imply that all φ_j > 0. Co-occurrence of this and a nonsymmetric off-diagonal zero (p_{ij} = 0 while p_{ji} > 0 for i ≠ j, or vice versa) violates the detailed balance equations.

Question 4b
Symmetry of the transition matrix implies a uniform stationary distribution. This uniformity and the symmetry together imply that the detailed balance equations hold.

Question 5

Question 5a
Insert picture.

Question 5b
No, once you end up in state II or III, you will never be able to reach states I and IV.

Question 5c
A limiting distribution with lim_{t→∞} P(X_t = I) = 0 = lim_{t→∞} P(X_t = IV).

Question 5d

# define the matrix
P <- matrix(c(0.40, 0.10, 0.30, 0.20,
              0.00, 0.20, 0.80, 0.00,
              0.00, 0.60, 0.40, 0.00,
              0.30, 0.30, 0.10, 0.30), ncol=4, byrow=TRUE)

# perform the eigendecomposition
eigen(P)

Question 5e

V <- eigen(P)$vectors
Vinv <- solve(eigen(P)$vectors)
D <- diag(eigen(P)$values)
V %*% D %*% Vinv

Question 5f
The second largest eigenvalue equals 0.6. Hence, the influence of the initial values wanes at the rate 0.6^n.
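
To illustrate Question 5f numerically, the following added sketch (not part of the original answer) compares the 10-step distributions started in states I and IV; by the answer above, their gap should shrink at roughly the rate 0.6^n.

# added illustration of Question 5f: decay of the influence of the initial state
P <- matrix(c(0.40, 0.10, 0.30, 0.20,
              0.00, 0.20, 0.80, 0.00,
              0.00, 0.60, 0.40, 0.00,
              0.30, 0.30, 0.10, 0.30), ncol=4, byrow=TRUE)
Pn <- diag(4)
for (i in 1:10){ Pn <- Pn %*% P }    # P^10
max(abs(Pn[1, ] - Pn[4, ]))          # gap between the 10-step distributions from states I and IV
0.6^10                               # the rate 0.6^n of Question 5f, for comparison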

Question 6

Question 6a
See lecture notes.

Question 6b
Now the state space S = {A, G, C, T} is employed. Note that the order of the nucleotides is not alphabetical, but ordered by their chemical properties: first the purines, then the pyrimidines. The transition matrix is then:

    P = [ 0.35   0.15   0.25   0.25
          0.15   0.35   0.25   0.25
          0.20   0.20   0.25   0.35
          0.20   0.20   0.35   0.25 ]

and its stationary distribution is φ^T = (0.222, 0.222, 0.278, 0.278).

Question 6c
The transition matrix (using the same state space as in Question 6b) is:

    P = [ 0.40   0.20   0.20   0.20
          0.20   0.40   0.20   0.20
          0.20   0.20   0.40   0.20
          0.20   0.20   0.20   0.40 ]

and its stationary distribution is φ^T = (0.25, 0.25, 0.25, 0.25).

Question 6d
φ^T = (0.236, 0.236, 0.264, 0.264).

Question 6e
φ^T = (0.2332, 0.2332, 0.2668, 0.2668).
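
The stationary distributions reported above can be reproduced in R. Below is a minimal added sketch (not part of the original answer) for the matrix of Question 6b, using the same state ordering A, G, C, T; the analogous computation applies to Questions 6c-6e.

# added check of the stationary distribution reported in Question 6b
P <- matrix(c(0.35, 0.15, 0.25, 0.25,
              0.15, 0.35, 0.25, 0.25,
              0.20, 0.20, 0.25, 0.35,
              0.20, 0.20, 0.35, 0.25), ncol=4, byrow=TRUE)
e <- eigen(t(P))
phi <- Re(e$vectors[, 1]) / sum(Re(e$vectors[, 1]))
round(phi, 3)                # should give 0.222 0.222 0.278 0.278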

Question 7

Question 7a
P(TAGA) = 0.

Question 7b

# define vector with nucleotide frequencies
nuclfreq <- c(186, 242, 265, 307)
nuclprob <- nuclfreq / sum(nuclfreq)

# define the transition matrix with ML estimates
dimerfreq <- matrix(c(58, 60, 0, 68,
                      39, 77, 83, 43,
                      43, 0, 87, 134,
                      46, 105, 95, 61), ncol=4, byrow=TRUE)
dimerprob <- dimerfreq / rowSums(dimerfreq)

# calculate the ingredients for the test statistic
motifprob <- nuclprob[4] * dimerprob[4, 3] * dimerprob[3, 4] * dimerprob[4, 2]
motifexpfreq <- (500 - 4 + 1) * motifprob
motifvarfreq <- (500 - 4 + 1) * motifprob * (1 - motifprob)

# calculate the test statistic and its significance
teststat <- (17 - motifexpfreq) / sqrt(motifvarfreq)
pvalue <- pnorm(teststat, lower.tail=FALSE)
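
A short follow-up (not part of the original answer): inspecting the two quantities completes the one-sided test; a small p-value indicates that the 17 observed motif occurrences exceed what the fitted first-order Markov model predicts for a sequence of length 500.

# added follow-up: inspect the z-statistic and one-sided p-value computed above
teststat
pvalue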