Stat 516, Homework 1
Due date: October 7

1. Consider an urn with $n$ distinct balls numbered $1, \dots, n$. We sample balls from the urn with replacement. Let $N$ be the number of draws until we encounter a ball that we have seen before.
   (a) Find an expression for $\Pr(N > k)$ for $k = 2, 3, \dots, n + 1$.
   (b) Show that
   $$\Pr(N = k) = \frac{n!\,(k-1)}{(n-k+1)!\,n^k} \quad \text{for } k = 2, 3, \dots, n + 1.$$
   (c) Prove that
   $$E(N) = 2 + \sum_{k=2}^{n} \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right).$$
   Hint: (b) and (c) follow directly from (a), so skipping (a) will make your life more difficult.

2. (Gamma-Poisson mixture) Recall that if $Z \sim \text{Poisson}(\lambda)$, then $E(Z) = \text{Var}(Z) = \lambda$, where $\lambda$ is the intensity parameter. This property of the Poisson distribution is unsatisfying, because in practice the variance of observed count data often exceeds the mean. One possible solution to this discrepancy is a Gamma-Poisson mixture, which is generated by first randomly drawing $X \sim \text{Gamma}(\alpha, \beta)$, where $\alpha, \beta > 0$ (I am using the inverse-scale parameterization, so that $E(X) = \alpha/\beta$ and $\text{Var}(X) = \alpha/\beta^2$). Then, we generate $Y \mid X \sim \text{Poisson}(X)$. Derive $E(Y)$ and $\text{Var}(Y)$ and show that $E(Y) < \text{Var}(Y)$.

3. (Lack of memory property)
   (a) Let $T$ be a geometric random variable. Show that for any integers $k, k_0 \geq 1$, we have $\Pr(T = k + k_0 \mid T > k_0) = \Pr(T = k)$.
   (b) Let $X$ be an exponential random variable with rate parameter $\lambda$. Show that for all $t, t_0 > 0$, we have $\Pr(X \geq t + t_0 \mid X \geq t_0) = \Pr(X \geq t)$.
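As a sanity check on problem 1, the pmf in (b) and the product formula in (c) can be verified to agree numerically for a small $n$; a minimal sketch (function names are my own, not part of the assignment):

```python
import math

def p_repeat(n, k):
    """Pr(N = k): first k-1 draws distinct, k-th draw repeats one of them."""
    return math.factorial(n) * (k - 1) / (math.factorial(n - k + 1) * n**k)

def mean_repeat(n):
    """E(N) via the product formula from part (c)."""
    total, prod = 2.0, 1.0
    for k in range(2, n + 1):
        prod *= 1 - (k - 1) / n   # multiply in the factor (1 - (k-1)/n)
        total += prod
    return total

n = 10
pmf_mass = sum(p_repeat(n, k) for k in range(2, n + 2))       # should be 1
mean_from_pmf = sum(k * p_repeat(n, k) for k in range(2, n + 2))
```

Here `pmf_mass` should equal 1 and `mean_from_pmf` should match `mean_repeat(n)`, since both compute $E(N)$.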
Stat 516, Homework 2
Due date: October 14

1. Let $\{X_n\}_{n \geq 0}$ be a homogeneous Markov chain with state space $E$ and transition matrix $P$. Let $\tau$ be the first time $n$ for which $X_n \neq X_0$, where $\tau = +\infty$ if $X_n = X_0$ for all $n \geq 0$. Express $E[\tau \mid X_0 = i]$ in terms of $p_{ii}$.

2. Let $\{X_n\}_{n \geq 0}$ be a homogeneous Markov chain with state space $E = \{1, 2, 3, 4\}$ and transition matrix
$$P = \begin{pmatrix} 0.2 & 0.3 & 0.5 & 0 \\ 0 & 0.2 & 0.3 & 0.5 \\ 0.5 & 0 & 0.2 & 0.3 \\ 0.3 & 0.5 & 0 & 0.2 \end{pmatrix}.$$
What is the probability that, starting from state 1, the chain hits state 3 before it hits state 4?

3. Write a routine to simulate realizations of the gambler's ruin chain $\{X_n\}$ with probabilities $p_{i,i+1} = p$, $p_{i,i-1} = q$, $p + q = 1$. The routine should stop simulating as soon as the chain hits one of the absorbing states. Your input will consist of an initial state $i$, state space size $N$, and probability $p$ of increasing the gambler's fortune. The routine should return a vector of Markov chain states until absorption.
   (a) Provide the source code in any computer language of your choice and the output of your routine in the form of 20 random realizations of the Markov chain for input parameters $N = 10$, $i = 3$, and $p = 0.27$.
   (b) Use your simulation routine to estimate the probability of reaching the largest state $N = 10$ starting at state 5, $u(5, p)$, for probabilities $p_{i,i+1} = p = 0.1, 0.2, \dots, 0.9$. Turn in a graph with estimated $u(5, p)$ plotted against $p$. In your graph, include values of $u(5, p)$ computed using the formulae that we derived in class.
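A minimal sketch of the routine requested in problem 3 (function names, defaults, and the seeding convention are my choices, not prescribed by the assignment):

```python
import random

def gamblers_ruin_path(i, N, p, rng=random):
    """Simulate the gambler's ruin chain started at i on {0, ..., N},
    stepping +1 w.p. p and -1 w.p. q = 1 - p, until absorption at 0 or N.
    Returns the list of visited states, starting with i."""
    path = [i]
    while 0 < path[-1] < N:
        step = 1 if rng.random() < p else -1
        path.append(path[-1] + step)
    return path

def estimate_u(i, N, p, reps=2000, seed=516):
    """Monte Carlo estimate of u(i, p) = Pr(hit N before 0 | start at i)."""
    rng = random.Random(seed)
    wins = sum(gamblers_ruin_path(i, N, p, rng)[-1] == N for _ in range(reps))
    return wins / reps
```

For part (a), call `gamblers_ruin_path(3, 10, 0.27)` twenty times; for part (b), evaluate `estimate_u(5, 10, p)` over the grid of $p$ values and plot against the analytic curve.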
Stat 516, Homework 3
Due date: October 21

1. Consider a sequence of $L$ nucleotides (A, G, C, and T). We model the evolution of this sequence as a discrete-time Markov chain. At each step, we randomly choose one of the $L$ nucleotides and replace it with one of the three equally probable alternatives. Notice that the randomly chosen sequence position must change its state. We assume that one of the $4^L$ possible nucleotide sequences of length $L$ has the special property of being able to bind regulatory proteins and control expression of one or more genes nearby. Let $\{X_n\}$ be a Markov chain that counts the number of positions where our randomly evolving sequence at step $n$ matches the special, regulatory sequence.
   (a) For what $i, j \in \{0, \dots, L\}$ are the transition probabilities $p_{ij} = \Pr(X_1 = j \mid X_0 = i)$ not equal to zero? Provide algebraic expressions for these non-zero transition probabilities.
   (b) Show that the stationary distribution of $\{X_n\}$, $\pi = (\pi_0, \dots, \pi_L)$, is binomial with $L$ trials and probability of success $\frac{1}{4}$. Explain why this stationary distribution is unique.
   (c) Let $T_L = \inf\{n \geq 1 : X_n = L\}$ be the first time $X_n$ matches the target. Using your knowledge of $\pi$, show that $E(T_L \mid X_0 = L) = 4^L$.
   (d) Let $\mu_n(i) = E(X_n \mid X_0 = i)$ be the mean number of matches in the evolving sequence at step $n$, given $i$ matches at step 0. Show that for $n \geq 1$, $\mu_n(i)$ satisfies the following recursive equations:
   $$\mu_n(0) = \mu_{n-1}(0)\,\frac{2}{3} + \mu_{n-1}(1)\,\frac{1}{3},$$
   $$\mu_n(i) = \mu_{n-1}(i-1)\,\frac{i}{L} + \mu_{n-1}(i)\,\frac{L-i}{L}\cdot\frac{2}{3} + \mu_{n-1}(i+1)\,\frac{L-i}{L}\cdot\frac{1}{3}, \quad i = 1, \dots, L-1,$$
   $$\mu_n(L) = \mu_{n-1}(L-1).$$
   What initial conditions do these recursive equations satisfy?

2. Prove that recurrence is a communication class property: $i \leftrightarrow j$ and $i$ is recurrent $\Rightarrow$ $j$ is recurrent.

3. Let $\{X_n\}$ be a homogeneous Markov chain with state space $E$ and transition matrix $P$. Define $Y_n = (X_n, X_{n+1})$. The process $\{Y_n\}$ is also a homogeneous Markov chain with state space $F = \{(i_0, i_1) \in E^2 : p_{i_0 i_1} > 0\}$.
   (a) Derive the general entry of the transition matrix of $\{Y_n\}$.
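Once you have derived the transition probabilities in part (a) of problem 1, part (b) is easy to check numerically before proving it. The sketch below assumes rates of the form $i/L$, $2(L-i)/(3L)$, and $(L-i)/(3L)$, which are consistent with the recursions in (d) but which you should verify yourself:

```python
from math import comb

def transition_matrix(L):
    """Match-count chain: from i matches, move to i-1 w.p. i/L,
    stay w.p. 2(L-i)/(3L), move to i+1 w.p. (L-i)/(3L)."""
    P = [[0.0] * (L + 1) for _ in range(L + 1)]
    for i in range(L + 1):
        if i > 0:
            P[i][i - 1] = i / L
        if i < L:
            P[i][i] = 2 * (L - i) / (3 * L)
            P[i][i + 1] = (L - i) / (3 * L)
    return P

L = 8
P = transition_matrix(L)
# Candidate stationary distribution: Binomial(L, 1/4).
pi = [comb(L, i) * 0.25**i * 0.75**(L - i) for i in range(L + 1)]
# One step of the chain applied to pi; should reproduce pi if pi P = pi.
pi_next = [sum(pi[i] * P[i][j] for i in range(L + 1)) for j in range(L + 1)]
```

If the rates are right, each row of `P` sums to 1 and `pi_next` agrees with `pi` to machine precision.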
   (b) Show that if $\{X_n\}$ is irreducible, then so is $\{Y_n\}$.
   (c) Show that if $\{X_n\}$ has a stationary distribution $\pi$, then $\{Y_n\}$ also has a stationary distribution. Express the general entry of this stationary distribution in terms of $\pi$ and $P$.
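For intuition on part (c), it can help to test a candidate stationary distribution numerically on a small example before proving anything. The sketch below tries weights of the form $\pi_{i}\,p_{ij}$ on $F$ (an educated guess that you should confirm analytically); the example matrix is mine:

```python
def stationary(P, iters=2000):
    """Approximate the stationary distribution by power iteration."""
    n = len(P)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
    return v

P = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.4, 0.1, 0.5]]   # small irreducible, aperiodic example
pi = stationary(P)

# Snake-chain states and candidate stationary weights rho_{(i,j)} = pi_i p_{ij}.
F = [(i, j) for i in range(3) for j in range(3) if P[i][j] > 0]
rho = {s: pi[s[0]] * P[s[0]][s[1]] for s in F}

# One step of {Y_n}: (i, j) -> (j, l) with probability p_{jl}.
rho_next = {s: 0.0 for s in F}
for (i, j) in F:
    for l in range(3):
        if P[j][l] > 0:
            rho_next[(j, l)] += rho[(i, j)] * P[j][l]
```

If the guess is right, `rho_next` reproduces `rho` (and `rho` sums to 1).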
Stat 516, Homework 4
Due date: October 28

1. Prove that an irreducible homogeneous Markov chain on a finite state space is positive recurrent. Hint: The main step in the proof is to establish recurrence of the Markov chain. Try to complete this part of the proof by contradiction.

2. Let $\{X_n\}$ be an irreducible positive recurrent homogeneous Markov chain with stationary distribution $\pi$. Define $k(n)$ as the number of returns of the chain to a subset of states $A \subset E$ during the first $n$ steps. Prove that
$$\frac{k(n)}{n} \xrightarrow{a.s.} \sum_{i \in A} \pi_i.$$

3. Consider a Markov chain with transition probability matrix
$$P = \begin{pmatrix} 1/3 & 1/3 & 1/3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$
   (a) Show that this Markov chain has a limiting distribution and find this distribution analytically.
   (b) Take three arbitrary numbers $x_1$, $x_2$, and $x_3$ and form the successive running averages $x_n = (x_{n-1} + x_{n-2} + x_{n-3})/3$, starting with $x_4$. Using what you know about $\lim_{n \to \infty} P^n$, prove that
$$\lim_{n \to \infty} x_n = \frac{x_1 + 2x_2 + 3x_3}{6}.$$

4. Consider the Ehrenfest model of diffusion with $N = 100$ gas molecules. From our derivations we know that the stationary distribution of the chain is $\text{Bin}(N, 0.5)$. We also know that the chain is irreducible and positive recurrent. Use simulations and the ergodic theorem to approximate the variance of the stationary distribution and compare your approximation with the true value of the stationary variance.

5. Square matrices $A$ and $B$ are called similar if there exists a non-singular matrix $T$ such that $A = T^{-1}BT$. Prove that the transition probability matrix of an irreducible and reversible Markov chain defined on a finite state space is similar to a symmetric matrix.
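A minimal sketch of the simulation in problem 4, assuming the usual Ehrenfest dynamics (from state $x$, one of the $N$ molecules is chosen uniformly at random and moved to the other chamber, so the chain steps to $x-1$ with probability $x/N$ and to $x+1$ otherwise); names and defaults are mine:

```python
import random

def ehrenfest_variance(N=100, steps=200_000, seed=516):
    """Ergodic-average estimate of the stationary variance of the Ehrenfest
    chain on {0, ..., N}, started from the middle state."""
    rng = random.Random(seed)
    x = N // 2
    s1 = s2 = 0.0
    for _ in range(steps):
        x += -1 if rng.random() < x / N else 1
        s1 += x          # running sums for the ergodic averages
        s2 += x * x
    mean = s1 / steps
    return s2 / steps - mean * mean

est = ehrenfest_variance()
true_var = 100 * 0.5 * 0.5  # Var of Bin(100, 1/2) = N p (1 - p) = 25
```

The ergodic theorem justifies comparing the time-average estimate `est` with `true_var`.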
Stat 516, Homework 5
Due date: November 16

1. In this exercise, you will statistically analyze the Wright-Fisher model with mutations. To simplify the analysis, assume that $\Pr(a \to A) = \Pr(A \to a) = u$, so that the transition probabilities of $\{X_n\}$ are
$$p_{ij} = \binom{2m}{j} p_i^j (1 - p_i)^{2m - j}, \quad \text{where } p_i = \frac{i}{2m}(1 - u) + \left(1 - \frac{i}{2m}\right)u.$$
   (a) Write a simulation routine to generate realizations from the Markov chain. Setting the mutation probability $u = 0.5$ and gene number $2m = 10$, generate 200 iterations of the chain starting from state 0.
   (b) Using your simulated data, compute the maximum likelihood estimate of the mutation probability $u$. I suggest doing this numerically.
   (c) Obtain a 95% confidence interval for $u$ using asymptotic results discussed in class. You will need to estimate the stationary distribution.
   (d) Check your asymptotics-based answers by repeating the simulation and estimation 1000 times and reporting relevant summaries of the resulting empirical distribution of estimates of $u$.
   (e) Test the null hypothesis $H_0: u = 0.4$ against the alternative $H_1: u \neq 0.4$ using a likelihood ratio test.
Attach the source code with comments describing your steps.
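A minimal simulation sketch for part (a); the function name, seeding, and the Bernoulli-sum binomial draw (fine here since $2m = 10$ is small) are my choices:

```python
import random

def wright_fisher_path(m=5, u=0.5, n_iter=200, x0=0, seed=516):
    """Simulate the Wright-Fisher chain with symmetric mutation: given
    X_n = i, draw X_{n+1} ~ Binomial(2m, p_i) with
    p_i = (i / 2m)(1 - u) + (1 - i / 2m) u."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_iter):
        i = path[-1]
        p = (i / (2 * m)) * (1 - u) + (1 - i / (2 * m)) * u
        # Binomial(2m, p) as a sum of 2m independent Bernoulli(p) trials.
        path.append(sum(rng.random() < p for _ in range(2 * m)))
    return path
```

The returned path (initial state plus 200 iterations) is the data set for the likelihood computations in parts (b)-(e).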
Stat 516, Homework 6
Due date: November 23

1. Suppose we want to estimate the mean of the standard normal distribution $N(0, 1)$, so that our target density is
$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}.$$
   (a) Use the double-exponential distribution with density $g(x) = \frac{1}{2} e^{-|x|}$ and importance sampling to estimate the mean of the standard normal distribution. Compare the Monte Carlo errors of importance sampling and naive Monte Carlo.
   (b) Implement the Metropolis-Hastings algorithm from the notes to approximate the mean of the standard normal distribution. Adjust the tuning parameter $\delta$ so that your acceptance probability is between 0.3 and 0.4.

2. Consider a toric Ising model with state space $\Omega = \{x = (x_1, \dots, x_k) : x_i = \pm 1\}$ and
$$\pi(x) = \frac{1}{Z} e^{\beta \sum_{i=1}^{k} x_i x_{i+1}},$$
where $x_{k+1}$ is understood to be equal to $x_1$. Set $k = 50$ and $\beta = 0.9$. Implement the Metropolis-Hastings sampler discussed in class to approximate $E[M(x)]$ and $\text{Var}[M(x)]$, where $M(x) = \sum_{i=1}^{k} x_i$ is the total magnetization. In each algorithm, start from a random state $x = (x_1, \dots, x_k)$, obtained by flipping $k$ independent fair coins and assigning values $-1$ or $1$ to each component of $x$. Run your MCMC chains for $N$ iterations. During the first $L < N$ iterations, do not save sampled states of the system; $L$ is the length of a burn-in period, needed for the Markov chain to achieve stationarity (hopefully).
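A sketch of the importance sampling estimator in problem 1(a); the use of self-normalized weights and the inverse-CDF-style Laplace draw (exponential magnitude with a random sign) are my design choices:

```python
import math
import random

def is_mean_normal(n=100_000, seed=516):
    """Self-normalized importance sampling estimate of E[X] for X ~ N(0,1),
    using double-exponential proposals g(x) = exp(-|x|) / 2."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        # Laplace(0, 1) draw: exponential magnitude, random sign.
        x = rng.expovariate(1.0) * (1 if rng.random() < 0.5 else -1)
        f = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)  # target density
        g = 0.5 * math.exp(-abs(x))                        # proposal density
        w = f / g
        num += w * x
        den += w
    return num / den
```

Since the true mean is 0, the estimate should be close to 0; comparing its spread over repeated runs with that of a naive Monte Carlo average gives the error comparison the problem asks for.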
Stat 516, Homework 7
Due date: December 7

1. For the ABO blood type example:
   (a) Implement the EM algorithm and apply it to the data $n = (n_A, n_{AB}, n_B, n_O) = (6, 4, 55, 5)$.
   (b) The nonparametric bootstrap is a Monte Carlo technique for studying the sampling properties of statistics (data summaries). Suppose we observe iid data $y = (y_1, \dots, y_n)$. We would like to study distributional properties of a statistic $T(y)$ (e.g., maximum likelihood estimators). The bootstrap prescribes creating synthetic data sets $y^{rep,1}, \dots, y^{rep,n}$ by drawing $n$ samples from $y$ with replacement. Distributional properties of $T(y)$ are then obtained via the sampling properties of $T(y^{rep,1}), \dots, T(y^{rep,n})$. Use the nonparametric bootstrap with 1000 synthetic blood type counts to compute 95% confidence intervals for $p_A$, $p_B$, and $p_O$. Report your estimates and confidence intervals in tabular form.
   (c) Implement the Bayesian data augmentation algorithm assuming a priori that $(p_A, p_B, p_O) \sim \text{Dirichlet}(1, 1, 1)$. Use the data from part (a) to approximate the posterior distribution $\Pr(p_A, p_B, p_O, m_{AA}, m_{BB} \mid n)$. Report histograms of posterior samples for parameters and missing data, and include posterior medians and Bayesian credible intervals in tabular form.

2. Show that if $(x, y)$ form a hidden Markov model with
$$\Pr(x, y) = \Pr(x_1) \prod_{t=2}^{n} \Pr(x_t \mid x_{t-1}) \prod_{t=1}^{n} \Pr(y_t \mid x_t), \qquad (1)$$
then $\Pr(y_t \mid x_{1:t}, y_{1:t-1}) = \Pr(y_t \mid x_t)$ for $t = 1, \dots, n$. In your derivation, you are allowed to use only the factorization (1) and elementary manipulations of conditional probabilities, marginal probabilities, etc.
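For problem 1(a), the E- and M-steps reduce to "gene counting": the missing data are the genotype splits AA/AO within phenotype A and BB/BO within phenotype B, under Hardy-Weinberg phenotype probabilities (e.g., $\Pr(\text{type A}) = p_A^2 + 2 p_A p_O$). A minimal sketch, assuming that setup (the function name and uniform starting values are mine):

```python
def abo_em(nA, nAB, nB, nO, iters=200):
    """Gene-counting EM for the ABO allele frequencies (pA, pB, pO)."""
    n = nA + nAB + nB + nO
    pA, pB, pO = 1 / 3, 1 / 3, 1 / 3   # uniform starting point
    for _ in range(iters):
        # E-step: expected genotype counts given current frequencies.
        mAA = nA * pA**2 / (pA**2 + 2 * pA * pO)
        mBB = nB * pB**2 / (pB**2 + 2 * pB * pO)
        # M-step: count alleles among the 2n genes.
        # Each AA carries two A alleles, each AO one, each AB one.
        pA = (2 * mAA + (nA - mAA) + nAB) / (2 * n)
        pB = (2 * mBB + (nB - mBB) + nAB) / (2 * n)
        pO = 1 - pA - pB
    return pA, pB, pO
```

The same E-step quantities $m_{AA}$ and $m_{BB}$ reappear as the imputed missing data in the data augmentation algorithm of part (c).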