RESEARCH REPORT. A note on gaps in proofs of central limit theorems. Christophe A.N. Biscio, Arnaud Poinas and Rasmus Waagepetersen

Similar documents
Estimating functions for inhomogeneous spatial point processes with incomplete covariate data

Estimating functions for inhomogeneous spatial point processes with incomplete covariate data

Mi-Hwa Ko. t=1 Z t is true. j=0

Christophe Ange Napoléon Biscio and Rasmus Waagepetersen

ESTIMATING FUNCTIONS FOR INHOMOGENEOUS COX PROCESSES

SDS : Theoretical Statistics

A Thinned Block Bootstrap Variance Estimation. Procedure for Inhomogeneous Spatial Point Patterns

Decomposition of variance for spatial Cox processes Jalilian, Abdollah; Guan, Yongtao; Waagepetersen, Rasmus Plenge

Lecture 1: Review of Basic Asymptotic Theory

Spatial analysis of tropical rain forest plot data

Large Sample Properties of Estimators in the Classical Linear Regression Model

Studentization and Prediction in a Multivariate Normal Setting

On the smallest eigenvalues of covariance matrices of multivariate spatial processes

Testing Statistical Hypotheses

Peter Hoff Minimax estimation October 31, Motivation and definition. 2 Least favorable prior 3. 3 Least favorable prior sequence 11

Jean-Michel Billiot, Jean-François Coeurjolly and Rémy Drouilhet

STAT 331. Martingale Central Limit Theorem and Related Results

. Find E(V ) and var(v ).

Testing Statistical Hypotheses

Covariance function estimation in Gaussian process regression

Peter Hoff Minimax estimation November 12, Motivation and definition. 2 Least favorable prior 3. 3 Least favorable prior sequence 11

The properties of L p -GMM estimators

Second-Order Analysis of Spatial Point Processes

Asymptotic Statistics-III. Changliang Zou

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION

Likelihood Function for Multivariate Hawkes Processes

P n. This is called the law of large numbers but it comes in two forms: Strong and Weak.

Malaya J. Mat. 2(1)(2014) 35 42

Mohsen Pourahmadi. 1. A sampling theorem for multivariate stationary processes. J. of Multivariate Analysis, Vol. 13, No. 1 (1983),

On Model Fitting Procedures for Inhomogeneous Neyman-Scott Processes

SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan

Point processes, spatial temporal

Statistics 222, Spatial Statistics. Outline for the day: 1. Problems and code from last lecture. 2. Likelihood. 3. MLE. 4. Simulation.

On the convergence of sequences of random variables: A primer

PACKAGE LMest FOR LATENT MARKOV ANALYSIS

Variance Estimation for Statistics Computed from. Inhomogeneous Spatial Point Processes

Decomposition of variance for spatial Cox processes

Using R in Undergraduate and Graduate Probability and Mathematical Statistics Courses*

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½

Kriging models with Gaussian processes - covariance function estimation and impact of spatial sampling

The Bayesian Approach to Multi-equation Econometric Model Estimation

Decomposition of variance for spatial Cox processes

We are now going to go back to the concept of sequences, and look at some properties of sequences in R

Finite Population Sampling and Inference

Chapter 2. Some basic tools. 2.1 Time series: Theory Stochastic processes

Using Estimating Equations for Spatially Correlated A

Made available courtesy of Elsevier:

A Note on Hilbertian Elliptically Contoured Distributions

ROBUST - September 10-14, 2012

Siegel s formula via Stein s identities

BAYESIAN MODEL FOR SPATIAL DEPENDANCE AND PREDICTION OF TUBERCULOSIS

arxiv: v1 [stat.me] 24 May 2010

Statistícal Methods for Spatial Data Analysis

For iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions.

Conditional independence, conditional mixing and conditional association

Mechanistic spatio-temporal point process models for marked point processes, with a view to forest stand data

Asymptotic Normality under Two-Phase Sampling Designs

Dale L. Zimmerman Department of Statistics and Actuarial Science, University of Iowa, USA

ROYAL INSTITUTE OF TECHNOLOGY KUNGL TEKNISKA HÖGSKOLAN. Department of Signals, Sensors & Systems

PATTERN CLASSIFICATION

SUMMARY OF RESULTS ON PATH SPACES AND CONVERGENCE IN DISTRIBUTION FOR STOCHASTIC PROCESSES

(Practice Version) Midterm Exam 2

Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho

Spectral representations and ergodic theorems for stationary stochastic processes

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

For a stochastic process {Y t : t = 0, ±1, ±2, ±3, }, the mean function is defined by (2.2.1) ± 2..., γ t,

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of

Mean squared error matrix comparison of least aquares and Stein-rule estimators for regression coefficients under non-normal disturbances

Time Series Analysis. Asymptotic Results for Spatial ARMA Models

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770

Theoretical Statistics. Lecture 1.

Non-Gaussian Maximum Entropy Processes

1 Probability theory. 2 Random variables and probability theory.

Sample path large deviations of a Gaussian process with stationary increments and regularily varying variance

A Primer on Asymptotics

Estimating the parameters of hidden binomial trials by the EM algorithm

Regression Graphics. 1 Introduction. 2 The Central Subspace. R. D. Cook Department of Applied Statistics University of Minnesota St.

RESEARCH REPORT. Log Gaussian Cox processes on the sphere. Francisco Cuevas-Pacheco and Jesper Møller. No.

Maximum-Entropy Models in Science and Engineering

HANDBOOK OF APPLICABLE MATHEMATICS

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1

is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications.

Coregionalization by Linear Combination of Nonorthogonal Components 1

DA Freedman Notes on the MLE Fall 2003

STAT Sample Problem: General Asymptotic Results

Monitoring Wafer Geometric Quality using Additive Gaussian Process

AALBORG UNIVERSITY. An estimating function approach to inference for inhomogeneous Neyman-Scott processes. Rasmus Plenge Waagepetersen

5.1 Approximation of convex functions

An exponential family of distributions is a parametric statistical model having densities with respect to some positive measure λ of the form.

Properties of spatial Cox process models

Econometrics I, Estimation

Consistency of the maximum likelihood estimator for general hidden Markov models

Bayesian Inference for the Multivariate Normal

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Financial Econometrics / 49

Week 2: Sequences and Series

Weak Laws of Large Numbers for Dependent Random Variables

Large Deviations for Small-Noise Stochastic Differential Equations

A Guide to Modern Econometric:

ON THE COMPLETE CONVERGENCE FOR WEIGHTED SUMS OF DEPENDENT RANDOM VARIABLES UNDER CONDITION OF WEIGHTED INTEGRABILITY

Transcription:

CENTRE FOR STOCHASTIC GEOMETRY AND ADVANCED BIOIMAGING 2017 www.csgb.dk RESEARCH REPORT Christophe A.N. Biscio, Arnaud Poinas and Rasmus Waagepetersen A note on gaps in proofs of central limit theorems No. 12, November 2017

A note on gaps in proofs of central limit theorems Christophe A.N. Biscio 1, Arnaud Poinas 2 and Rasmus Waagepetersen 1 1 Department of Mathematical Sciences, Aalborg University, Denmark 2 IRMAR, Campus de Beaulieu, Bat. 22/23, France Abstract We fill two gaps in the literature on central limit theorems. First we state and prove a generalization of the Cramér-Wold device which is useful for establishing multivariate central limit theorems without the need for assuming the existence of a limiting covariance matrix. Second we extend and provide a detailed proof of a very useful result for establishing univariate central limit theorems. Keywords: central limit theorem, Cramér-Wold device, random field. 1 Introduction Many applications of spatial statistics involve spatially varying covariates. One common example is universal kriging (e.g. Chilès and Delfiner, 2008), where a deterministic component of a spatial variable is modelled using a regression on spatial covariates. Another example, which we will discuss in some detail for illustrative purpose, is log-linear modelling of intensity functions for spatial point processes, see e.g. Rathbun and Cressie (1994), Rathbun (1996), Schoenberg (2005), Waagepetersen (2007), Guan and Loh (2007). Such models have for example found much use in ecology where point processes are used to model locations of plants and animals, and the covariates could describe landscape type, topography or soil properties. Letting Z = {Z(u)} u R d where for each u R d, Z(u) R p is a covariate vector, the intensity function is often assumed to be of the form ρ(u; β) = exp(β T Z(u)) (1.1) where β R p. In the aforementioned references, the regression parameter β is inferred using an estimating function given by the score of the log likelihood function of a Poisson process. For deriving asymptotic results it is crucial to establish asymptotic normality of the estimating function. Often increasing domain asymptotics are used where a sequence {W n } n 1 of bounded but increasing observation windows W n R d are considered. The score estimating function is then e n (β) = Z(u) Z(u)ρ(u; β)du (1.2) u X W W n n 1

where X denotes the spatial point process. An estimate of β is obtained by solving e n (β) = 0. In spatial statistics in general, central limit theorems for α-mixing spatial processes are very popular tools for establishing asymptotic results both for random field and point process models. In case of the model (1.1), asymptotic properties of estimators of β were for example established by central limit theorems for α-mixing spatial processes in Guan and Loh (2007) and Waagepetersen and Guan (2009). Bolthausen (1982) provided a much cited central limit theorem for stationary α- mixing random fields. This result was later extended to the non-stationary case by Guyon (1995). Karácsony (2006) extended the result further to triangular arrays considering a combination of infill and increasing domain asymptotics. When taking a deeper look at the techniques employed in the proofs of the aforementioned references, a couple of issues may disturb the reader. These are addressed in Sections 2 3 below. 2 Cramér-Wold device Bolthausen (1982), Guyon (1995) and Karácsony (2006) only gave detailed proofs of central limit theorems in the univariate case. Guyon (1995) and Karácsony (2006) also stated multivariate central limit theorems and just referred to the Cramér-Wold device for extending the univariate central limit theorems to the multivariate case. However, a closer look shows that a simple application of the Cramér-Wold device does not always suffice. Suppose that {X n } n 1 is a sequence of random vectors in R p. The Cramér-Wold device (e.g. p. 383 in Billingsley, 1995) then says that X n converges in distribution to a random vector X if and only if, for all a R p, a T X n converges in distribution toward a T X. In statistical applications we often want to show that Var(X n ) 1/2 X n converges in distribution toward N (0, I p ) as n goes to infinity for some sequence of statistics X n. If Var(X n ) converges to a fixed positive definite matrix Σ then this is equivalent to showing that X n converges in distribution toward N (0, Σ) and the application of the Cramér-Wold device is trivial. However, in many applications a limit for Var(X n ) does not exist. Returning again to the problem of inferring the intensity function (1.1) and letting X n = e n (β)/ W n 1/2 be given by (1.2) normalized by W n 1/2, the covariance matrix of X n is Σ n = 1 ( ) Z(u) T Z(u)ρ(u; β)du + Z(u) T Z(v)[g(u, v) 1]dudv W n W n W n W n where g(, ) is the so-called pair correlation function (e.g. Møller and Waagepetersen, 2004). Due to the dependency on Z(u), u W n, it is not reasonable to assume that Σ n converges to a fixed limit. Guyon (1995) and Karácsony (2006) just refer to the simple application of the Cramér-Wold device and in particular do not discuss the issue of whether a limiting covariance matrix exists or not. To be on firm ground, a Cramér-Wold type result covering the case with no limiting covariance matrix seems missing. Therefore, we state the following generalization of the Cramér-Wold device. 2

Lemma 2.1. Let {X n } n N be a sequence of random variables in R p such that 0 < lim inf λ min(var(x n )) < lim sup λ max (Var(X n )) <, where for a symmetric matrix M, λ min (M) and λ max (M) denote the minimal and maximal eigenvalues of M. Then, Var(X n ) 1/2 X n N (0, I p) if for all a R p, ( a T Var(X n )a ) 1 2 a T X n N (0, 1). The condition in this lemma regarding lim inf λ min Var(X n ) is the kind of condition used in the central limit theorems of Guyon (1995) and Karácsony (2006) and so is not restrictive in practice. A proof of this lemma is provided in Section 4. 3 A lemma by Bolthausen Bolthausen (1982), Guyon (1995) and Karácsony (2006) all use the following key lemma. Lemma 3.1. For n N, let {X n } n N be random variables such that sup n N E(X 2 n) < and for all t R, Then X n N (0, 1). E[(it X n )e itxn ] 0. Karácsony (2006) somewhat misleadingly coins this Stein s lemma and refers to Stein (1972) and Guyon (1995). Guyon (1995) in turn refers to Stein (1972). However, the original reference is Bolthausen (1982) who states and proves the lemma while acknowledging inspiration from Stein (1972). Stein s lemma (Stein, 1981) says that if Z is standard normal then E[f (Z) Zf(Z)] = 0 for any differentiable f with E f (Z) <. This result is related to but nevertheless different from Lemma 3.1. We do not find Bolthausen (1982) s very condensed proof easily accessible and believe it is useful to provide a more detailed proof of this crucial lemma. Moreover, the conclusion of Lemma 3.1 holds under a weaker assumption as stated below. Lemma 3.2. For n N, let {X n } n N be uniformly integrable random variables such that for all t R, E[(it X n )e itxn ] 0. (3.1) Then X n N (0, 1). A proof of the last more general lemma is presented in Section 5. 3

4 Proof of Lemma 2.1 Assume by contradiction that Var(X n ) 1/2 X n Y N (0, I p) does not hold. Then there exists a bounded and continuous function f such that Ef(Var(X n ) 1/2 X n ) Ef(Y ) does not converge toward 0. Thus, there exists ɛ > 0 and a strictly increasing function b : N N such that for all n N, Ef(Var (X b(n) ) 1/2 X b(n) ) Ef(Y ) > ɛ. (4.1) Further, since for all a R p, (a T Var(X n )a) 1 2 a T X n ( a T Var(X b(n) )a ) 1 2 a T X b(n) N (0, 1) N (0, 1), we also have for any a R p. Since the sequence of eigenvalues of the matrices Var(X n ) is bounded, there exists by the Bolzano-Weierstrass theorem a strictly increasing function c : N N such that {c(n)} n N {b(n)} n N, and a matrix Σ such that Since Var (X c(n) ) Σ. ( a T Var (X c(n))a ) 1 2 a T X c(n) N (0, 1) we obtain by multiplying with a T Var (X c(n) )a and using Slutsky s lemma, that a T X c(n) Hence, by the Cramér-Wold device, which is equivalent to Therefore, by Slutsky s lemma, X c(n) Σ 1 2 Xc(n) Var (X c(n) ) 1 2 Xc(n) at N (0, Σ). N (0, Σ) N (0, I p). N (0, I p). From this we can conclude Ef(Var (Xc(n) ) 1/2 X c(n) ) Ef(Y ) ɛ for n large enough which contradicts (4.1). 4

5 Proof of Lemma 3.2 Since the X n are uniformly integrable, we have by (25.11) in Billingsley (1995), By Markov s inequality, for all ɛ > 0, P ( X n > ɛ) E X n ɛ sup E X n <. (5.1) n N sup n N E X n. (5.2) ɛ By (5.2), it follows that the sequence {X n } n N is tight. Suppose now that b : N N is an increasing function and X a random variable such that X b(n) X. Then, by Theorem 25.11 in Billingsley (1995) and (5.1), E X lim inf E X b(n) < so that E(X) and E(Xe itx ), for t R, are well defined. Since the random variables X n are uniformly integrable, so are X b(n) and X b(n) e itx b(n), for all t R. Thus, by Theorem 25.12 in Billingsley (1995), lim E(X b(n)) = E(X) and lim E(X b(n) e itx b(n) ) = E(Xe itx ). Then, by (3.1), which may be written E[(it X)e itx ] = 0 tφ X (t) + t φ X(t) = 0 (5.3) where φ X denotes the characteristic function of X. For all t R, we may multiply each side of (5.3) by e t2 /2 so that (5.3) is equivalent to t (et2 /2 φ X (t)) = 0 which, since φ X is a characteristic function, has the unique solution φ X (t) = e t2 /2. Thus, X N (0, 1). Hence, we have shown that each convergent subsequence of {X n } n N converges in distribution towards a standard normal distribution. Moreover, since {X n } n N is tight, such a subsequence exists by Theorem 25.10 in Billingsley (1995). Therefore, by the Corollary, p. 337 in Billingsley (1995), X n N (0, 1). Acknowledgements The research of Christophe A.N. Biscio and Rasmus Waagepetersen are supported by The Danish Council for Independent Research Natural Sciences, grant DFF 7014-00074 Statistics for point processes in space and beyond, and by the Centre for Stochastic Geometry and Advanced Bioimaging, funded by grant 8721 from the Villum Foundation. 5

References Billingsley, P., 1995. Probability and Measure, 3rd Edition. John Wiley & Sons, New York- Chichester-Brisbane, wiley Series in Probability and Mathematical Statistics. Bolthausen, E., 1982. On the central limit theorem for stationary mixing random fields. The Annals of Probability 10 (4), 1047 1050. Chilès, J.-P., Delfiner, P., 2008. Geostatistics. John Wiley & Sons, Inc. Guan, Y., Loh, J. M., 2007. A thinned block bootstrap variance estimation procedure for inhomogeneous spatial point patterns. Journal of the American Statistical Association 102 (480), 1377 1386. Guyon, X., 1995. Random fields on a network. Probability and its Applications. Springer, New York. Karácsony, Z., 2006. A central limit theorem for mixing random fields. Miskolc Mathematical Notes. A Publication of the University of Miskolc 7 (2), 147 160. Møller, J., Waagepetersen, R. P., 2004. Statistical Inference and Simulation for Spatial Point Processes. Chapman and Hall/CRC, Boca Raton. Rathbun, S. L., 1996. Estimation of Poisson intensity using partially observed concomitant variables. Biometrics 52 (1), 226 242. Rathbun, S. L., Cressie, N., 1994. Asymptotic properties of estimators for the parameters of spatial inhomogeneous Poisson point processes. Advances in Applied Probability 26 (1), 122 154. Schoenberg, F., 2005. Consistent parametric estimation of the intensity of a spatialtemporal point process. Journal of Statistical Planning and Inference 128, 79 93. Stein, C. M., 1972. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory. University of California Press, pp. 583 602. Stein, C. M., 1981. Estimation of the mean of a multivariate normal distribution. The Annals of Statistics 9 (6), 1135 1151. Waagepetersen, R., Guan, Y., 2009. Two-step estimation for inhomogeneous spatial point processes. Journal of the Royal Statistical Society. Series B. Statistical Methodology 71 (3), 685 702. Waagepetersen, R. P., 2007. An estimating function approach to inference for inhomogeneous Neyman-scott processes. Biometrics 63, 252 258. 6