Asymptotic properties of the maximum likelihood estimator for a ballistic random walk in a random environment


Catherine Matias
Joint work with F. Comets, M. Falconnet, D. Loukianova & O. Loukianov
Currently: Laboratoire Statistique & Génome, Évry, France
Soon: Lab. Probabilités & Modèles Aléatoires, Paris, France
http://cmatias.perso.math.cnrs.fr/

Outline
- Biophysical context
- Nearest-neighbour one-dimensional random walk in random environment
- MLE construction and properties
- RWRE and branching process with immigration in random environment (BPIRE)
- Three examples
- Simulations

DNA unzipping
RWRE were introduced by [Chernov (67)] to model DNA replication. By the end of the 1990s, various DNA unzipping experiments had appeared.
[Figure: a double-stranded DNA molecule pulled apart by a force f applied to each strand; base pairs are indexed 1, 2, ..., M.]
Goals
- DNA sequencing (exploratory),
- study of the structural properties of the molecule.

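Model description I
Random environment on $\mathbb{Z}$
- $\omega = \{\omega_x\}_{x \in \mathbb{Z}}$ i.i.d. with $\omega_x \in (0, 1)$ and $\omega_x \sim \nu_\theta$,
- $\theta \in \Theta$ unknown parameter, $\Theta \subset \mathbb{R}^d$ compact set,
- $P^\theta = \nu_\theta^{\otimes \mathbb{Z}}$ law on $(0, 1)^{\mathbb{Z}}$ of $\omega$, and $E^\theta$ the corresponding expectation.
Markov process conditional on the environment
For fixed $\omega$, let $X = \{X_t\}_{t \in \mathbb{N}}$ be the Markov chain on $\mathbb{Z}$ starting at $X_0 = 0$ and with transitions
\[
P_\omega(X_{t+1} = y \mid X_t = x) =
\begin{cases}
\omega_x & \text{if } y = x + 1, \\
1 - \omega_x & \text{if } y = x - 1, \\
0 & \text{otherwise.}
\end{cases}
\]
$P_\omega$ is the measure on the path space of $X$ given $\omega$ (quenched law).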

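Model description II
Random walk in random environment (RWRE)
The (unconditional) law of $X$ is the annealed law
\[
\mathbf{P}^\theta(\cdot) = \int P_\omega(\cdot)\, dP^\theta(\omega).
\]
Note that $X$ is not a Markov process under the annealed law.
[Figure: from site $x$, the walk jumps to $x + 1$ with probability $\omega_x$ and to $x - 1$ with probability $1 - \omega_x$.]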
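The two-stage construction (draw the environment, then run the walk in it) can be sketched in a few lines. This is only an illustration: the function name `simulate_rwre`, the lazy generation of the environment (each site is drawn on first visit and then frozen) and the Beta(5, 1) choice are assumptions made for the example, not part of the talk.

```python
import random

def simulate_rwre(n, sample_omega, rng):
    """Run a nearest-neighbour walk from 0 until it first hits site n.

    sample_omega(rng) draws one omega_x in (0, 1). The environment is
    generated lazily, one site at a time, and then kept fixed (quenched).
    Returns the path X_0, ..., X_{T_n} and the environment that was visited.
    """
    omega = {}          # site -> probability of stepping right
    path = [0]
    x = 0
    while x < n:
        if x not in omega:
            omega[x] = sample_omega(rng)
        # Quenched transition: right with probability omega_x, left otherwise.
        x += 1 if rng.random() < omega[x] else -1
        path.append(x)
    return path, omega

rng = random.Random(0)
# Hypothetical choice of environment law: omega_x ~ Beta(5, 1).
path, omega = simulate_rwre(200, lambda r: r.betavariate(5.0, 1.0), rng)
hitting_time = len(path) - 1    # T_n for n = 200
```

Drawing each $\omega_x$ only when the site is first visited gives the same annealed law as drawing the whole environment in advance, because the $\omega_x$ are i.i.d. and independent of the coin flips used by the walk.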

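Limiting behaviour of X
Let $\rho_x = \frac{1 - \omega_x}{\omega_x}$, $x \in \mathbb{Z}$. [Solomon (75)] proved the classification:
(a) Recurrent case: if $E^\theta(\log \rho_0) = 0$, then $-\infty = \liminf_{t \to \infty} X_t < \limsup_{t \to \infty} X_t = +\infty$, $\mathbf{P}^\theta$-almost surely.
(b) Transient case: if $E^\theta(\log \rho_0) < 0$, then $\lim_{t \to \infty} X_t = +\infty$, $\mathbf{P}^\theta$-almost surely.
If we moreover let $T_n = \inf\{t \in \mathbb{N} : X_t = n\}$, then, when $n$ tends to infinity,
(b1) Ballistic case: if $E^\theta(\rho_0) < 1$, then $T_n / n \to c$ for a finite constant $c$, $\mathbf{P}^\theta$-almost surely.
(b2) Sub-ballistic case: if $E^\theta(\rho_0) \ge 1$, then $T_n / n \to +\infty$, $\mathbf{P}^\theta$-almost surely.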
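For a two-point environment law the two expectations in this classification are finite sums, so the regime can be read off directly. The sketch below is an illustration under assumptions: the function name `classify_two_point` is hypothetical, the branch for $E^\theta(\log \rho_0) > 0$ is the mirror image of case (b) and is not listed on the slide, and the value $c = (1 + E^\theta \rho_0)/(1 - E^\theta \rho_0)$ is the classical constant from Solomon's theorem, which the slide does not spell out.

```python
import math

def classify_two_point(p, a1, a2, tol=1e-12):
    """Regime of the walk when nu = p*delta_{a1} + (1-p)*delta_{a2}.

    Returns (regime, c), where c is the limit of T_n / n in the ballistic
    case and None otherwise.
    """
    rho1, rho2 = (1 - a1) / a1, (1 - a2) / a2
    mean_log_rho = p * math.log(rho1) + (1 - p) * math.log(rho2)
    mean_rho = p * rho1 + (1 - p) * rho2
    if abs(mean_log_rho) <= tol:
        return "recurrent", None
    if mean_log_rho > 0:
        # Assumption: not on the slide, obtained from case (b) by symmetry.
        return "transient to the left", None
    if mean_rho < 1:
        # Assumption: classical value of the constant c, not given on the slide.
        return "ballistic", (1 + mean_rho) / (1 - mean_rho)
    return "sub-ballistic", None

# Values (p, a1, a2) = (0.3, 0.4, 0.7) are those of the talk's simulation table.
regime, c = classify_two_point(0.3, 0.4, 0.7)
```

With these values $E^\theta(\rho_0) = 0.3 \cdot 1.5 + 0.7 \cdot 3/7 = 0.75 < 1$, so the walk is ballistic.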

Goal and context
Goal: estimate the parameter $\theta$ from the observation of the path $X_{[0,T_n]} = (X_0, \dots, X_{T_n})$.
- In a much more general setting, [Adelman & Enriquez (04)] provide a link between the RWRE and the environment, leading to moment estimators for the distribution $\nu_\theta$.
- Drawback: one has to estimate some moments first and then invert a function to recover the parameter $\theta$, which may induce a loss of efficiency.
- We focus on maximum likelihood estimation (MLE).
- We assume a transient ballistic random walk.

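MLE construction I
We let
\[
L^n_x := \sum_{s=0}^{T_n - 1} \mathbf{1}\{(X_s, X_{s+1}) = (x, x - 1)\}
\]
be the number of left steps from site $x$, and $R^n_x$ the number of right steps (defined similarly). We have
\[
P_\omega(X_{[0,T_n]}) = \prod_{x \in \mathbb{Z}} \omega_x^{R^n_x} (1 - \omega_x)^{L^n_x}
\]
and (i.i.d. environment)
\[
\mathbf{P}^\theta(X_{[0,T_n]}) = \prod_{x \in \mathbb{Z}} \int_0^1 a^{R^n_x} (1 - a)^{L^n_x}\, d\nu_\theta(a).
\]
Note that
- only the visited sites contribute to this product,
- the number of visited sites $x < 0$ is bounded,
- for $x = 1, \dots, n - 1$, $R^n_x = L^n_{x+1} + 1$.
[Figure: the sites $0, \dots, x, x + 1, \dots, n$ on a line, with the $R^n_x$ right steps from $x$ and the $L^n_{x+1}$ left steps from $x + 1$ across the edge between $x$ and $x + 1$.]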
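The step counts and the relation between right and left steps can be checked on a small path. The following sketch uses a hand-written toy path (an illustration, not data from the talk) and a hypothetical helper `step_counts`; it also checks the relation at $x = 0$, where it holds because the walk starts at 0 and stops at its first visit to $n$.

```python
from collections import Counter

def step_counts(path):
    """Count left and right steps from each site along a nearest-neighbour path."""
    left, right = Counter(), Counter()
    for here, nxt in zip(path, path[1:]):
        if nxt == here + 1:
            right[here] += 1
        elif nxt == here - 1:
            left[here] += 1
        else:
            raise ValueError("path is not nearest-neighbour")
    return left, right

# Toy path from 0 to its first visit of n = 3.
path = [0, 1, 0, -1, 0, 1, 2, 1, 2, 3]
n = path[-1]
left, right = step_counts(path)

# Every edge (x, x+1) with 0 <= x < n is crossed once more to the right
# than to the left, because the walk starts at 0 and ends at n.
identity_holds = all(right[x] == left[x + 1] + 1 for x in range(n))
```

For sites $x < 0$ the walk crosses each edge as many times in each direction, so there $R^n_x = L^n_{x+1}$ without the extra 1.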

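MLE construction II
Let $\phi_\theta$ be the function from $\mathbb{N}^2$ to $\mathbb{R}$ given by
\[
\phi_\theta(x, y) = \log \int_0^1 a^{x+1} (1 - a)^y\, d\nu_\theta(a). \tag{1}
\]
The criterion function $\theta \mapsto \ell_n(\theta)$ is defined as
\[
\ell_n(\theta) = \sum_{x=0}^{n-1} \phi_\theta(L^n_{x+1}, L^n_x),
\]
and our estimator is
\[
\hat\theta_n \in \operatorname*{Argmax}_{\theta \in \Theta} \ell_n(\theta).
\]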
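The step from the annealed likelihood to the criterion can be written out as follows. This is a sketch: it keeps only the sites $0 \le x \le n - 1$ (the sites $x < 0$ are dropped, their number being bounded) and uses the relation between right and left steps, which also holds at $x = 0$ because the walk starts at 0 and stops at its first visit to $n$.

```latex
\begin{align*}
\log \prod_{x=0}^{n-1} \int_0^1 a^{R^n_x} (1-a)^{L^n_x}\, d\nu_\theta(a)
  &= \sum_{x=0}^{n-1} \log \int_0^1 a^{L^n_{x+1}+1} (1-a)^{L^n_x}\, d\nu_\theta(a) \\
  &= \sum_{x=0}^{n-1} \phi_\theta\bigl(L^n_{x+1}, L^n_x\bigr)
   = \ell_n(\theta).
\end{align*}
```

The criterion therefore depends on the path only through the left-step counts $(L^n_0, \dots, L^n_n)$.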

Results: consistency, asymptotic normality and efficiency
Under appropriate (and classical) assumptions, in the transient ballistic case, we establish that the MLE satisfies
- Consistency: $\lim_{n \to +\infty} \hat\theta_n = \theta^\star$ in $\mathbf{P}^\star$-probability,
- Asymptotic normality: $\sqrt{n}\,(\hat\theta_n - \theta^\star) \to \mathcal{N}(0, \Sigma_{\theta^\star}^{-1})$ in $\mathbf{P}^\star$-distribution,
- Efficiency: $\Sigma_{\theta^\star}$ is the Fisher information matrix.
Here $\theta^\star$ denotes the true parameter value and $\mathbf{P}^\star = \mathbf{P}^{\theta^\star}$ the corresponding annealed law.
Francis Comets, Mikael Falconnet, Oleg Loukianov, Dasha Loukianova & Catherine Matias. Maximum likelihood estimator consistency for ballistic random walk in a parametric random environment. Stochastic Processes & Applications, 124(1):268-288, 2014.
Mikael Falconnet, Dasha Loukianova & Catherine Matias. Asymptotic normality and efficiency of the maximum likelihood estimator for the parameter of a ballistic random walk in a random environment. Mathematical Methods of Statistics, 23(1):1-19, 2014.
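One standard way to turn the asymptotic normality into a confidence region is a Wald-type region. The display below is a sketch under assumptions: $\hat\Sigma_n$ stands for some consistent estimate of the Fisher information matrix and $q_{d,1-\gamma}$ for the $(1-\gamma)$ quantile of the chi-square distribution with $d$ degrees of freedom; the slides do not say which construction was actually used.

```latex
\[
\mathcal{R}_{n,\gamma}
  = \Bigl\{ \theta \in \Theta :
      n\,(\hat\theta_n - \theta)^{\top}\, \hat\Sigma_n\, (\hat\theta_n - \theta)
      \le q_{d,1-\gamma} \Bigr\},
\qquad
\text{and for } d = 1:\quad
\hat\theta_n \pm \frac{z_{1-\gamma/2}}{\sqrt{n\,\hat\Sigma_n}} .
\]
```

Here $z_{1-\gamma/2}$ is the standard normal quantile; the region has asymptotic level $1 - \gamma$ because $n\,(\hat\theta_n - \theta)^{\top} \Sigma\, (\hat\theta_n - \theta)$ is asymptotically chi-square with $d$ degrees of freedom when $\sqrt{n}\,(\hat\theta_n - \theta)$ is asymptotically $\mathcal{N}(0, \Sigma^{-1})$.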

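Underlying BPIRE I
Main property (Kesten, Kozlov, Spitzer, 75)
Under the annealed law, $(L^n_n, L^n_{n-1}, \dots, L^n_0)$ has the same distribution as $(Z_0, \dots, Z_n)$, where $Z_0 = 0$ and, for $k = 0, \dots, n - 1$,
\[
Z_{k+1} = \sum_{i=0}^{Z_k} \xi_{k+1,i},
\]
with $\{\xi_{k,i}\}_{k \in \mathbb{N},\, i \in \mathbb{N}}$ independent and
\[
\forall m \in \mathbb{N}, \quad P_\omega(\xi_{k,i} = m) = (1 - \omega_k)^m\, \omega_k.
\]
Under the annealed law $\mathbf{P}^\theta$, $\{Z_n\}_{n \in \mathbb{N}}$ is an irreducible positive recurrent homogeneous Markov chain with transition kernel $Q_\theta$.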
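The recursion for the branching process can be simulated directly. The sketch below is an illustration under assumptions: the helper names are hypothetical, the geometric variables are drawn by inverse transform, and the two-point environment with $(p, a_1, a_2) = (0.3, 0.4, 0.7)$ is just one convenient choice (it requires $\omega < 1$ so that the logarithm is defined).

```python
import math
import random

def geometric(omega, rng):
    """Draw m in {0, 1, ...} with probability (1 - omega)**m * omega."""
    u = 1.0 - rng.random()          # uniform on (0, 1]
    # P(floor(log u / log(1 - omega)) >= m) = (1 - omega)**m
    return int(math.log(u) / math.log(1.0 - omega))

def simulate_bpire(n, sample_omega, rng):
    """Return Z_0, ..., Z_n with Z_0 = 0 and one immigrant per generation."""
    z = [0]
    for _ in range(n):
        omega = sample_omega(rng)   # environment attached to the new generation
        # The Z_k individuals plus one immigrant each leave a geometric offspring.
        z.append(sum(geometric(omega, rng) for _ in range(z[-1] + 1)))
    return z

def two_point(rng, p=0.3, a1=0.4, a2=0.7):
    """Hypothetical environment law: a1 with probability p, a2 otherwise."""
    return a1 if rng.random() < p else a2

rng = random.Random(1)
z = simulate_bpire(5000, two_point, rng)
```

The sum runs over $Z_k + 1$ terms (indices $i = 0, \dots, Z_k$), which is where the immigrant comes from: even when $Z_k = 0$ the next generation can be non-empty.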

Underlying BPIRE II
Consequence
We have an equality in distribution under $\mathbf{P}^\star$ (the annealed law at the true parameter):
\[
\ell_n(\theta) \overset{d}{=} \sum_{k=0}^{n-1} \phi_\theta(Z_k, Z_{k+1}),
\]
and the right-hand side is (up to a constant) the log-likelihood of a positive recurrent Markov chain.
About the ballistic assumption
- The stationary measure of $(Z_n)$ has a finite first-order moment only in the ballistic case. In this case, $\ell_n / n$ converges to a finite limit $\ell$.
- The sub-ballistic case is studied in Mikael Falconnet, Arnaud Gloter & Dasha Loukianova. Maximum likelihood estimation in the context of a sub-ballistic random walk in a parametric random environment. arXiv 1405.2880.

Examples of environment distributions I
Example 1: Finite and known support
Fix $a_1 < a_2 \in (0, 1)$ and let $\nu_p = p\,\delta_{a_1} + (1 - p)\,\delta_{a_2}$, where $\delta_a$ is the Dirac mass located at value $a$.
- Unknown parameter $p \in \Theta \subset (0, 1)$ (namely $\theta = p$).
- Assume that $a_1$, $a_2$ and $\Theta$ are such that the process is transient and ballistic.
- Then the assumptions are satisfied and one can estimate $p$ consistently and efficiently.
- May be generalised to $K > 2$ fixed and known support points and $\theta = (p_1, \dots, p_{K-1})$.
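For Example 1 the integral in the criterion is a two-term sum, so the whole estimation can be sketched with the standard library. The code below is an illustration under assumptions: the function names are hypothetical, the data are simulated with $(a_1, a_2) = (0.4, 0.7)$ and $p = 0.3$, and the maximisation is a crude grid search over $[0.01, 0.99]$ standing in for a proper optimiser.

```python
import math
import random
from collections import Counter

A1, A2 = 0.4, 0.7          # known support points
P_TRUE = 0.3               # parameter used to generate the toy data

def left_counts(n, rng):
    """Simulate the walk up to T_n and return L_0, ..., L_n (left steps per site)."""
    omega, left, x = {}, Counter(), 0
    while x < n:
        if x not in omega:                       # environment drawn on first visit
            omega[x] = A1 if rng.random() < P_TRUE else A2
        if rng.random() < omega[x]:
            x += 1
        else:
            left[x] += 1
            x -= 1
    return [left[s] for s in range(n + 1)]

def phi(p, x, y):
    """log(p*a1^(x+1)*(1-a1)^y + (1-p)*a2^(x+1)*(1-a2)^y), via log-sum-exp."""
    t1 = math.log(p) + (x + 1) * math.log(A1) + y * math.log(1 - A1)
    t2 = math.log(1 - p) + (x + 1) * math.log(A2) + y * math.log(1 - A2)
    m = max(t1, t2)
    return m + math.log(math.exp(t1 - m) + math.exp(t2 - m))

n = 1000
L = left_counts(n, random.Random(2))
# Group identical pairs (L_{x+1}, L_x) so the criterion is cheap to evaluate.
pairs = Counter((L[x + 1], L[x]) for x in range(n))

def criterion(p):
    """l_n(p) = sum over x = 0..n-1 of phi_p(L_{x+1}, L_x)."""
    return sum(count * phi(p, x, y) for (x, y), count in pairs.items())

grid = [i / 1000 for i in range(10, 991)]   # crude search over [0.01, 0.99]
p_hat = max(grid, key=criterion)
```

Since each term is the logarithm of a function that is affine in $p$, the criterion is concave in $p$, which is why a one-dimensional search is enough here.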

Examples of environment distributions II
Example 2: Two unknown support points
$\nu_\theta = p\,\delta_{a_1} + (1 - p)\,\delta_{a_2}$ with unknown parameter $\theta = (p, a_1, a_2) \in \Theta$, where $\Theta$ is a compact subset of
\[
(0, 1) \times \{(a_1, a_2) \in (0, 1)^2 : a_1 < a_2\}
\]
such that the process is transient and ballistic.
- Then the assumptions are satisfied and one can estimate $\theta$ consistently.
- Moreover, if $E^\theta(\rho_0^3) < 1$, the MLE is asymptotically normal and efficient.

Examples of environment distributions III
Example 3: Beta distribution
\[
d\nu_\theta(a) = \frac{1}{B(\alpha, \beta)}\, a^{\alpha - 1} (1 - a)^{\beta - 1}\, da.
\]
- Unknown parameter $\theta = (\alpha, \beta) \in \Theta$, where $\Theta$ is a compact subset of $\{(\alpha, \beta) \in (0, +\infty)^2 : \alpha > \beta + 1\}$.
- As $E^\theta(\rho_0) = \beta / (\alpha - 1)$, the constraint $\alpha > \beta + 1$ ensures that the process is transient and ballistic.
- Then the assumptions are satisfied and one can estimate $\theta$ consistently and efficiently.
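In the Beta case the integral defining $\phi_\theta$ is a ratio of Beta functions, $\phi_\theta(x, y) = \log B(\alpha + x + 1, \beta + y) - \log B(\alpha, \beta)$, which is convenient numerically. A minimal sketch, with hypothetical function names and using only `math.lgamma`:

```python
import math

def log_beta(a, b):
    """log B(a, b) computed from log-gamma values."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def phi_beta(x, y, alpha, beta):
    """phi_theta(x, y) = log E[a^(x+1) * (1-a)^y] for a ~ Beta(alpha, beta)."""
    return log_beta(alpha + x + 1, beta + y) - log_beta(alpha, beta)

def mean_rho(alpha, beta):
    """E[(1-a)/a] for a ~ Beta(alpha, beta); finite only when alpha > 1."""
    return beta / (alpha - 1)

# With (alpha, beta) = (5, 1): phi(0, 0) is log E[a] = log(alpha / (alpha + beta)).
value = phi_beta(0, 0, 5.0, 1.0)
ballistic = mean_rho(5.0, 1.0) < 1
```

Working with `lgamma` rather than with the Beta function itself avoids underflow when the step counts $x$ and $y$ are large.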

Simulation protocol
Three models corresponding to the previous 3 examples, with $\theta^\star$ as in Table 1. In each model, 1,000 repeats of the following procedure:
- Generate a random environment according to the distribution $\nu_{\theta^\star}$ on the set of sites $\{-10^4, \dots, 10^4\}$.
- Run a random walk in this environment and stop it successively at the hitting times $T_n$, with $n \in \{10^3 k;\ 1 \le k \le 10\}$.
- For each value of $n$:
  - estimate $\theta^\star$ with the MLE and with the procedure of [Adelman & Enriquez (04)],
  - estimate the Fisher information matrix $\Sigma_{\theta^\star}$ and compute a confidence interval for $\theta^\star$.

| Simulation | Fixed parameter | Estimated parameter |
|---|---|---|
| Example 1 | $(a_1, a_2) = (0.4, 0.7)$ | $p^\star = 0.3$ |
| Example 2 | - | $(a_1^\star, a_2^\star, p^\star) = (0.4, 0.7, 0.3)$ |
| Example 3 | - | $(\alpha^\star, \beta^\star) = (5, 1)$ |

Table 1: Parameter values for each experiment.

[Figure: Boxplots of the MLE (white) and of the [Adelman & Enriquez (04)] estimate (grey) for Example 1 ($\hat p$) and Example 3 ($\hat\alpha$, $\hat\beta$). Three panels, each with ten groups of boxes indexed 1 to 10 on the horizontal axis.]

[Figure: Boxplots of the MLE for Example 2 ($\hat p$, $\hat a_1$, $\hat a_2$). Three panels, each with ten boxes indexed 1 to 10 on the horizontal axis.]

Empirical coverages of confidence regions

| n | Ex. 1, γ = 0.01 | Ex. 1, γ = 0.05 | Ex. 1, γ = 0.1 | Ex. 2, γ = 0.01 | Ex. 2, γ = 0.05 | Ex. 2, γ = 0.1 | Ex. 3, γ = 0.01 | Ex. 3, γ = 0.05 | Ex. 3, γ = 0.1 |
|---|---|---|---|---|---|---|---|---|---|
| 1000 | 0.994 | 0.952 | 0.899 | 0.992 | 0.953 | 0.909 | 0.977 | 0.942 | 0.901 |
| 2000 | 0.989 | 0.952 | 0.903 | 0.994 | 0.953 | 0.910 | 0.978 | 0.928 | 0.884 |
| 3000 | 0.988 | 0.942 | 0.901 | 0.990 | 0.938 | 0.886 | 0.981 | 0.940 | 0.889 |
| 4000 | 0.991 | 0.944 | 0.896 | 0.991 | 0.951 | 0.894 | 0.988 | 0.945 | 0.900 |
| 5000 | 0.990 | 0.942 | 0.896 | 0.993 | 0.942 | 0.891 | 0.986 | 0.941 | 0.883 |
| 6000 | 0.983 | 0.948 | 0.901 | 0.987 | 0.951 | 0.888 | 0.988 | 0.937 | 0.897 |
| 7000 | 0.986 | 0.950 | 0.900 | 0.992 | 0.951 | 0.900 | 0.986 | 0.942 | 0.898 |
| 8000 | 0.987 | 0.956 | 0.898 | 0.988 | 0.950 | 0.903 | 0.981 | 0.946 | 0.903 |
| 9000 | 0.990 | 0.959 | 0.913 | 0.990 | 0.949 | 0.893 | 0.985 | 0.939 | 0.901 |
| 10000 | 0.987 | 0.954 | 0.908 | 0.990 | 0.949 | 0.899 | 0.983 | 0.944 | 0.892 |

Table 2: Empirical coverages of $(1 - \gamma)$ asymptotic level confidence regions, for $\gamma \in \{0.01, 0.05, 0.1\}$ and relying on 1000 iterations.
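When reading empirical coverages based on 1000 iterations, it can help to keep the Monte Carlo error in mind. The sketch below is a reading aid under a simplifying assumption (independent repeats and exact nominal coverage, so that the number of covering regions is binomial); the function name and the two-standard-error band are illustrative choices, not something taken from the talk.

```python
import math

def coverage_standard_error(level, repeats):
    """Standard error of an empirical coverage when the true coverage is `level`."""
    return math.sqrt(level * (1.0 - level) / repeats)

repeats = 1000
bands = {}
for gamma in (0.01, 0.05, 0.1):
    level = 1.0 - gamma
    se = coverage_standard_error(level, repeats)
    # Rough band of two standard errors around the nominal level.
    bands[gamma] = (level - 2 * se, level + 2 * se)
```

For instance, at $\gamma = 0.05$ the standard error is about 0.007, so fluctuations of the empirical coverage in the second decimal place are expected from the simulation alone.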

Conclusions
Good performance of $\hat\theta_n$ on simulated data:
- Unbiased estimator (like that of [Adelman & Enriquez (04)]).
- Less spread out than the estimator of [Adelman & Enriquez (04)] (in fact efficient).
- Easier to compute (in Ex. 2, the estimate of [Adelman & Enriquez (04)] is out of reach).
- Confidence regions built from $\hat\theta_n$ have accurate empirical coverage.
Questions?

References
O. Adelman and N. Enriquez. Random walks in random environment: what a single trajectory tells. Israel J. Math., 142:205-220, 2004.
A.A. Chernov. Replication of a multicomponent chain by the lightning mechanism. Biofizika, 12:297-301, 1967.
F. Solomon. Random walks in a random environment. Ann. Probability, 3:1-31, 1975.