Bootstrapping high dimensional vector: interplay between dependence and dimensionality


Bootstrapping high dimensional vector: interplay between dependence and dimensionality

Xianyang Zhang (joint work with Guang Cheng)
University of Missouri-Columbia
LDHD: Transition Workshop, 2014

Xianyang Zhang (Mizzou) | Bootstrapping high dimensional vector | LDHD 2014

Overview

Let $x_1, x_2, \ldots, x_n$ be a sequence of mean-zero dependent random vectors in $\mathbb{R}^p$, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})'$ for $1 \le i \le n$. We provide a general (non-asymptotic) theory for quantifying

$$\rho_n := \sup_{t \in \mathbb{R}} \left| P(T_X \le t) - P(T_Y \le t) \right|,$$

where $T_X = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^n x_{ij}$ and $T_Y = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^n y_{ij}$, with $y_i = (y_{i1}, y_{i2}, \ldots, y_{ip})'$ being a Gaussian vector.

Key techniques: Slepian interpolation and a leave-one-block-out argument (a modification of Stein's leave-one-out method).

Two examples on inference for high dimensional time series.

Outline

1. Inference for high dimensional time series
   - Uniform confidence band for the mean
   - Specification testing on the covariance structure
2. Gaussian approximation for maxima of non-Gaussian sums
   - M-dependent time series
   - Weakly dependent time series
3. Bootstrap
   - Blockwise multiplier bootstrap
   - Non-overlapping block bootstrap

Example I: Uniform confidence band

Consider a $p$-dimensional weakly dependent time series $\{x_i\}$.

Goal: construct a uniform confidence band for $\mu_0 = E x_i \in \mathbb{R}^p$ based on the observations $\{x_i\}_{i=1}^n$ with $p \gg n$.

Consider the $(1-\alpha)$ confidence band

$$\left\{ \mu = (\mu_1, \ldots, \mu_p)' \in \mathbb{R}^p : \sqrt{n} \max_{1 \le j \le p} |\mu_j - \bar{x}_j| \le c(\alpha) \right\},$$

where $\bar{x} = (\bar{x}_1, \ldots, \bar{x}_p)' = \sum_{i=1}^n x_i / n$ is the sample mean.

Question: how to obtain the critical value $c(\alpha)$?
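Once a critical value $c(\alpha)$ is available, the band itself is just the sample mean plus or minus a constant half-width in every coordinate. A minimal NumPy sketch (the function name is my own, not from the slides):

```python
import numpy as np

def uniform_band(x, c_alpha):
    """Simultaneous (1 - alpha) band: xbar_j +/- c(alpha)/sqrt(n), j = 1,...,p.

    x is the (n, p) data matrix; c_alpha is the critical value c(alpha).
    Returns the lower and upper limits of the band for each coordinate.
    """
    n = x.shape[0]
    xbar = x.mean(axis=0)          # sample mean of each coordinate
    half = c_alpha / np.sqrt(n)    # half-width, constant across coordinates
    return xbar - half, xbar + half
```

A vector $\mu$ is covered by the band exactly when $\sqrt{n}\max_j |\mu_j - \bar{x}_j| \le c(\alpha)$.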

Blockwise multiplier bootstrap

Capture the dependence within and between the data vectors. Suppose $n = b_n l_n$ with $b_n, l_n \in \mathbb{Z}$. Define the block sums

$$A_{ij} = \sum_{l=(i-1)b_n+1}^{i b_n} (x_{lj} - \bar{x}_j), \quad i = 1, 2, \ldots, l_n.$$

Allow $p = O(\exp(n^b))$ and $b_n = O(n^{\tilde{b}})$ with $4\tilde{b} + 7b < 1$ and $\tilde{b} > 2b$. Define the bootstrap statistic

$$T_A = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \left| \sum_{i=1}^{l_n} A_{ij} e_i \right|,$$

where $\{e_i\}$ is a sequence of i.i.d. $N(0,1)$ random variables that are independent of $\{x_i\}$.

Compute $c(\alpha) := \inf\{ t \in \mathbb{R} : P(T_A \le t \mid \{x_i\}_{i=1}^n) \ge 1 - \alpha \}$.
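The procedure above can be sketched in a few lines of NumPy. This is an illustrative implementation under my own naming and defaults, not the authors' code; it assumes $b_n$ divides $n$ up to a discarded remainder:

```python
import numpy as np

def multiplier_bootstrap_critical_value(x, b_n, alpha=0.05, n_boot=500, seed=None):
    """Blockwise multiplier bootstrap critical value for the uniform band.

    x: (n, p) data matrix. Returns the (1 - alpha) conditional quantile of
    T_A = max_j |sum_i A_ij e_i| / sqrt(n), with centered block sums A_ij.
    """
    rng = np.random.default_rng(seed)
    n, p = x.shape
    l_n = n // b_n
    # centered block sums A_ij, shape (l_n, p)
    A = (x[:l_n * b_n] - x.mean(axis=0)).reshape(l_n, b_n, p).sum(axis=1)
    e = rng.standard_normal((n_boot, l_n))        # i.i.d. N(0,1) multipliers
    T_A = np.abs(e @ A).max(axis=1) / np.sqrt(n)  # bootstrap statistics
    return np.quantile(T_A, 1 - alpha)
```

Each bootstrap replicate reuses the same block sums and only redraws the Gaussian multipliers, so the cost per replicate is one $(l_n \times p)$ matrix product.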

Some numerical results

Consider a $p$-dimensional VAR(1) process, $x_t = \rho x_{t-1} + \sqrt{1-\rho^2}\,\epsilon_t$, with three innovation designs:

1. $\epsilon_{tj} = (\varepsilon_{tj} + \varepsilon_{t0})/\sqrt{2}$, where $(\varepsilon_{t0}, \varepsilon_{t1}, \ldots, \varepsilon_{tp})' \sim$ i.i.d. $N(0, I_{p+1})$;
2. $\epsilon_{tj} = \rho_1 \zeta_{tj} + \rho_2 \zeta_{t(j+1)} + \cdots + \rho_p \zeta_{t(j+p-1)}$, where $\{\rho_j\}_{j=1}^p$ are generated independently from $U(2,3)$, and $\{\zeta_{tj}\}$ are i.i.d. $N(0,1)$ random variables;
3. $\epsilon_{tj}$ is generated from the moving average model in design 2 with $\{\zeta_{tj}\}$ being i.i.d. centralized Gamma(4,1) random variables.
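A simulator for design 1 is straightforward; here is a sketch (function name and burn-in length are my own choices). Note that each $\epsilon_{tj}$ has unit variance, and the shared $\varepsilon_{t0}$ induces cross-sectional dependence:

```python
import numpy as np

def simulate_var1_design1(n, p, rho, seed=None, burn=100):
    """x_t = rho * x_{t-1} + sqrt(1 - rho^2) * eps_t with design-1 innovations:
    eps_tj = (xi_tj + xi_t0) / sqrt(2), (xi_t0, ..., xi_tp)' ~ N(0, I_{p+1}).
    A burn-in period is discarded so the output is (approximately) stationary."""
    rng = np.random.default_rng(seed)
    x = np.zeros(p)
    out = np.empty((n, p))
    for t in range(n + burn):
        xi = rng.standard_normal(p + 1)
        eps = (xi[1:] + xi[0]) / np.sqrt(2.0)       # shared-factor innovation
        x = rho * x + np.sqrt(1.0 - rho**2) * eps   # VAR(1) recursion
        if t >= burn:
            out[t - burn] = x
    return out
```

The scaling $\sqrt{1-\rho^2}$ makes each coordinate marginally standard normal in stationarity.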

Some numerical results (cont'd)

Table: Coverage probabilities (%) of the uniform confidence band, where n = 120 and p = 500.

             Design 1        Design 2        Design 3
             95%    99%      95%    99%      95%    99%
ρ = 0.3
  b_n = 4    89.7   97.2     90.5   97.5     90.1   97.1
  b_n = 6    92.5   98.3     91.6   97.8     91.6   97.7
  b_n = 8    94.6   99.0     91.5   97.6     92.4   97.9
  b_n = 10   95.0   99.2     91.8   97.8     91.6   97.7
  b_n = 12   94.8   99.3     91.3   97.9     92.0   97.5
ρ = 0.5
  b_n = 4    76.9   92.9     83.5   94.0     83.3   93.7
  b_n = 6    87.1   96.3     87.3   96.2     87.4   95.9
  b_n = 8    91.6   98.3     88.8   96.6     89.4   96.9
  b_n = 10   92.5   98.6     89.8   97.1     89.3   97.0
  b_n = 12   93.0   99.0     90.0   97.2     90.5   97.0

Example II: Specification testing on the covariance structure

For a mean-zero $p$-dimensional time series $\{x_i\}$, define $\Gamma(h) = E x_{i+h} x_i' \in \mathbb{R}^{p \times p}$. Consider

$$H_0: \Gamma(h) = \Gamma_0(h) \text{ for all } h \in \Lambda \quad \text{versus} \quad H_a: \Gamma(h) \ne \Gamma_0(h) \text{ for some } h \in \Lambda \subseteq \{0, 1, 2, \ldots\},$$

where $\Gamma_0(h)$ is a hypothesized autocovariance. Special cases:

1. $\Lambda = \{0\}$: testing the covariance structure. See Cai and Jiang (2011), Chen et al. (2010), Li and Chen (2012) and Qiu and Chen (2012) for some developments when the $\{x_i\}$ are i.i.d.
2. $\Lambda = \{1, 2, \ldots, H\}$ and $\Gamma_0(h) = 0$ for $h \in \Lambda$: white noise testing.

Testing for white noise

Consider the white noise testing problem. Our test statistic is

$$T = \sqrt{n} \max_{1 \le h \le H} \max_{1 \le j,k \le p} |\hat{\gamma}_{jk}(h)|, \quad \text{where } \hat{\Gamma}(h) = \frac{1}{n}\sum_{i=1}^{n-h} x_{i+h} x_i' = (\hat{\gamma}_{jk}(h))_{j,k=1}^p.$$

Let $z_i = (z_{i,1}, \ldots, z_{i,p^2 H})' = (\mathrm{vec}(x_{i+1} x_i')', \ldots, \mathrm{vec}(x_{i+H} x_i')')' \in \mathbb{R}^{p^2 H}$ for $i = 1, \ldots, N := n - H$. Suppose $N = b_n l_n$ with $b_n, l_n \in \mathbb{Z}$. Define

$$T_A = \max_{1 \le j \le p^2 H} \frac{1}{\sqrt{n}} \left| \sum_{i=1}^{l_n} A_{ij} e_i \right|, \quad A_{ij} = \sum_{l=(i-1)b_n+1}^{i b_n} (z_{l,j} - \bar{z}_j),$$

where $\{e_i\}$ is a sequence of i.i.d. $N(0,1)$ random variables that are independent of $\{x_i\}$, and $\bar{z}_j = \sum_{i=1}^N z_{i,j}/N$.

Compute $c(\alpha) := \inf\{t \in \mathbb{R}: P(T_A \le t \mid \{x_i\}_{i=1}^n) \ge 1 - \alpha\}$, and reject the white noise null hypothesis if $T > c(\alpha)$.
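The test statistic $T$ is simple to compute directly from the lagged cross-products; a sketch (my own function name, assuming the data are already centered as on the slide):

```python
import numpy as np

def white_noise_stat(x, H):
    """T = sqrt(n) * max_{1<=h<=H} max_{j,k} |gamma_hat_jk(h)|, where
    Gamma_hat(h) = sum_{i=1}^{n-h} x_{i+h} x_i' / n for a mean-zero series x."""
    n = x.shape[0]
    t_max = 0.0
    for h in range(1, H + 1):
        G = x[h:].T @ x[:n - h] / n  # Gamma_hat(h), a p x p matrix
        t_max = max(t_max, np.abs(G).max())
    return np.sqrt(n) * t_max
```

Note that the bootstrap side works with the $p^2 H$-dimensional vectors $z_i$, so memory grows quickly with $p$ and $H$; the statistic itself only needs the $H$ matrices $\hat{\Gamma}(h)$.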

Some numerical results

We are interested in testing

$$H_0: \Gamma(h) = 0 \text{ for } 1 \le h \le L \quad \text{versus} \quad H_a: \Gamma(h) \ne 0 \text{ for some } 1 \le h \le L.$$

Consider the following data generating processes:

1. multivariate normal: $x_{tj} = \rho_1 \zeta_{tj} + \rho_2 \zeta_{t(j+1)} + \cdots + \rho_p \zeta_{t(j+p-1)}$, where $\{\rho_j\}_{j=1}^p$ are generated independently from $U(2,3)$, and $\{\zeta_{tj}\}$ are i.i.d. $N(0,1)$ random variables;
2. multivariate ARCH model: $x_t = \Sigma_t^{1/2} \epsilon_t$ with $\epsilon_t \sim N(0, I_p)$ and $\Sigma_t = 0.1 I_p + 0.9 x_{t-1} x_{t-1}'$, where $\Sigma_t^{1/2}$ is the lower triangular matrix from the Cholesky decomposition of $\Sigma_t$;
3. VAR(1) model: $x_t = \rho x_{t-1} + \sqrt{1-\rho^2}\,\epsilon_t$, where $\rho = 0.2$ and the errors $\{\epsilon_t\}$ are generated according to design 1.

Some numerical results (cont'd)

Table: Rejection percentages for testing uncorrelatedness, where n = 240, p = 20, and the actual number of parameters is p²L.

             Design 1       Design 2       Design 3
             5%     1%      5%     1%      5%     1%
L = 1
  b_n = 1    4.3    0.8     2.8    0.3     90.3   71.9
  b_n = 4    5.0    1.0     1.0    0.3     86.3   63.3
  b_n = 8    5.3    1.2     1.6    0.9     86.0   59.2
  b_n = 12   5.1    1.0     2.3    1.4     86.5   59.2
L = 3
  b_n = 1    4.7    1.0     2.3    0.3     79.4   57.7
  b_n = 4    3.6    0.7     0.6    0.3     74.0   46.2
  b_n = 8    3.7    0.4     1.3    0.8     71.4   41.0
  b_n = 12   4.0    0.6     2.2    1.3     72.1   40.6

Maxima of non-gaussian sum The above applications hinge on a general theoretical result. Let x 1, x 2,..., x n be a sequence of mean-zero dependent random vectors in R p, where x i = (x i1, x i2,..., x ip ) with 1 i n. Target: approximate the distribution of 1 T X = max 1 j p n n x ij. i=1 Xianyang Zhang (Mizzou) Bootstrapping high dimensional vector LDHD 2014 12 / 25

Gaussian approximation

Let $y_1, y_2, \ldots, y_n$ be a sequence of mean-zero Gaussian random vectors in $\mathbb{R}^p$, where $y_i = (y_{i1}, \ldots, y_{ip})'$ with $1 \le i \le n$. Suppose that $\{y_i\}$ preserves the autocovariance structure of $\{x_i\}$, i.e., $\mathrm{cov}(y_i, y_j) = \mathrm{cov}(x_i, x_j)$.

Goal: quantify the Kolmogorov distance

$$\rho_n := \sup_{t \in \mathbb{R}} |P(T_X \le t) - P(T_Y \le t)|,$$

where $T_Y = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^n y_{ij}$.
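The distance $\rho_n$ can be estimated by simulation in simple settings. The sketch below (my own construction, purely illustrative) uses i.i.d. centered-exponential entries for $x$, so matching the autocovariance reduces to matching the unit variance with $N(0,1)$ entries for $y$:

```python
import numpy as np

def rho_n_mc(n, p, n_rep=1000, seed=None):
    """Monte Carlo estimate of rho_n = sup_t |P(T_X <= t) - P(T_Y <= t)| for
    i.i.d. x_ij = Exp(1) - 1 (mean 0, var 1) vs. matched y_ij ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    TX = ((rng.exponential(size=(n_rep, n, p)) - 1).sum(axis=1)
          / np.sqrt(n)).max(axis=1)
    TY = (rng.standard_normal((n_rep, n, p)).sum(axis=1)
          / np.sqrt(n)).max(axis=1)
    # sup_t |F_X(t) - F_Y(t)|, evaluated on the pooled sample points
    grid = np.concatenate([TX, TY])
    Fx = np.searchsorted(np.sort(TX), grid, side="right") / n_rep
    Fy = np.searchsorted(np.sort(TY), grid, side="right") / n_rep
    return np.abs(Fx - Fy).max()
```

Increasing $n$ for fixed $p$ should drive the estimate toward zero, mirroring the theory (up to Monte Carlo noise of order $n_{\text{rep}}^{-1/2}$).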

Existing results in the independent case

Question: how large can $p$ be in relation to $n$ so that $\rho_n \to 0$?

- Bentkus (2003): $\rho_n \to 0$ provided that $p^{7/2} = o(n)$.
- Chernozhukov et al. (2013): $\rho_n \to 0$ if $p = O(\exp(n^b))$ with $b < 1/7$ (an astounding improvement).

Motivation: study the interplay between the dependence structure and the growth rate of $p$ so that $\rho_n \to 0$.

Dependence structure I: M-dependent time series

A time series $\{x_i\}$ is called $M$-dependent if $x_i$ and $x_j$ are independent whenever $|i - j| > M$.

Under suitable restrictions on the tail of $x_i$ and weak dependence assumptions holding uniformly across the components of $x_i$, we show that

$$\rho_n \lesssim \frac{M^{1/2} \left(\log(pn/\gamma) \vee 1\right)^{7/8}}{n^{1/8}} + \gamma$$

for some $\gamma \in (0,1)$. When $p = O(\exp(n^b))$ for $b < 1/11$ and $M = O(n^{\tilde{b}})$ with $4\tilde{b} + 7b < 1$, we have $\rho_n \le C n^{-c}$ for some $c, C > 0$. If $\tilde{b} = 0$ (i.e., $M = O(1)$), our result allows $b < 1/7$ [Chernozhukov et al. (2013)].

Dependence structure II: physical dependence measure [Wu (2005)]

The sequence $\{x_i\}$ has the causal representation

$$x_i = G(\ldots, \epsilon_{i-1}, \epsilon_i),$$

where $G$ is a measurable function and $\{\epsilon_i\}$ is a sequence of i.i.d. random variables. Let $\{\epsilon_i'\}$ be an i.i.d. copy of $\{\epsilon_i\}$ and define the coupled process

$$x_i^* = G(\ldots, \epsilon_{-1}, \epsilon_0', \epsilon_1, \ldots, \epsilon_i).$$

The strength of the dependence can be quantified via

$$\theta_{i,j,q}(x) = \left(E|x_{ij} - x_{ij}^*|^q\right)^{1/q}, \quad \Theta_{i,j,q}(x) = \sum_{l=i}^{+\infty} \theta_{l,j,q}(x).$$
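For a scalar linear process $x_i = \sum_k a_k \epsilon_{i-k}$, only the $\epsilon_0$ coordinate differs in the coupled copy, so $x_i - x_i^* = a_i(\epsilon_0 - \epsilon_0')$ and $\theta_{i,q} = |a_i|\,(E|\epsilon_0 - \epsilon_0'|^q)^{1/q}$. A Monte Carlo sanity check of this identity (my own sketch, Gaussian innovations):

```python
import numpy as np

def theta_mc(a_i, q=2, n_mc=50000, seed=None):
    """Monte Carlo estimate of theta_{i,q} for a scalar linear process
    x_i = sum_k a_k eps_{i-k} with N(0,1) innovations: the coupled copy
    replaces eps_0 by an independent eps_0', so x_i - x_i* = a_i (eps_0 - eps_0')."""
    rng = np.random.default_rng(seed)
    diff = a_i * (rng.standard_normal(n_mc) - rng.standard_normal(n_mc))
    return np.mean(np.abs(diff) ** q) ** (1.0 / q)
```

For $q = 2$ the closed form is $|a_i|\sqrt{2}$, since $\epsilon_0 - \epsilon_0' \sim N(0,2)$.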

Bound on the Kolmogorov distance

Theorem. Under suitable conditions on the tail of $\{x_i\}$ and certain weak dependence assumptions, we have

$$\rho_n \lesssim n^{-1/8} M^{1/2} \ell_n^{7/8} + \left(n^{-1/8} M^{1/2} \ell_n^{3/8}\right)^{\frac{q}{1+q}} \left( \sum_{j=1}^p \Theta_{M,j,q}^q \right)^{\frac{1}{1+q}} + \gamma,$$

where $\ell_n$ denotes the logarithmic factor $\log(pn/\gamma) \vee 1$ and $\Theta_{i,j,q} = \Theta_{i,j,q}(x) \vee \Theta_{i,j,q}(y)$.

- The tradeoff between the first two terms reflects the interaction between the dimensionality and the dependence;
- Key step in the proof: $M$-dependent approximation.

Bound on the Kolmogorov distance (cont'd)

Corollary. Suppose that

1. $\max_{1 \le j \le p} \Theta_{M,j,q} = O(\rho^M)$ for some $\rho < 1$ and $q \ge 2$;
2. $p = O(\exp(n^b))$ for $0 < b < 1/11$.

Then we have $\rho_n \le C n^{-c}$ for some $c, C > 0$.

Dimension-free dependence structure

Question: is there a so-called dimension-free dependence structure, i.e., a dependence assumption that does not affect the growth rate of $p$?

For a permutation $\pi(\cdot)$, write $(x_{i\pi(1)}, \ldots, x_{i\pi(p)})' = (z_{i1}', z_{i2}')'$. Suppose $\{z_{i1}\}$ is an $s$-dimensional time series and $\{z_{i2}\}$ is a $(p-s)$-dimensional sequence of independent variables. Assume that $\{z_{i1}\}$ and $\{z_{i2}\}$ are independent, and $s/p \to 0$.

Under suitable assumptions, it can be shown that $\rho_n \le C n^{-c}$ for some $c, C > 0$ whenever $p = O(\exp(n^b))$ with $b < 1/7$.

Resampling

Summary: for $M$-dependent or, more generally, weakly dependent time series, we have shown that

$$\rho_n := \sup_{t \in \mathbb{R}} |P(T_X \le t) - P(T_Y \le t)| \le C n^{-c}, \quad c, C > 0.$$

Question: in practice the autocovariance structure of $\{x_i\}$ is typically unknown. How can we approximate the distribution of $T_X$ or $T_Y$?

Solution: resampling methods.

Blockwise multiplier bootstrap

1. Suppose $n = b_n l_n$. Compute the block sums

$$A_{ij} = \sum_{l=(i-1)b_n+1}^{i b_n} x_{lj}, \quad i = 1, 2, \ldots, l_n.$$

2. Generate a sequence of i.i.d. $N(0,1)$ random variables $\{e_i\}$ and compute

$$T_A = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{l_n} A_{ij} e_i.$$

3. Repeat step 2 several times and compute the $\alpha$-quantile of $T_A$,

$$c_{T_A}(\alpha) = \inf\{ t \in \mathbb{R} : P(T_A \le t \mid \{x_i\}_{i=1}^n) \ge \alpha \}.$$

Validity of the blockwise multiplier bootstrap

Theorem. Under suitable assumptions, we have for $p = O(\exp(n^b))$ with $0 < b < 1/15$,

$$\sup_{\alpha \in (0,1)} \left| P\left(T_X \le c_{T_A}(\alpha)\right) - \alpha \right| \lesssim n^{-c}, \quad c > 0.$$

Non-overlapping block bootstrap

1. Let $A_{1j}^*, \ldots, A_{l_n j}^*$ be an i.i.d. draw from the empirical distribution of $\{A_{ij}\}_{i=1}^{l_n}$ and compute

$$T_A^* = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{l_n} (A_{ij}^* - \bar{A}_j), \quad \bar{A}_j = \sum_{i=1}^{l_n} A_{ij}/l_n.$$

2. Repeat the above step several times to obtain the $\alpha$-quantile of $T_A^*$,

$$c_{T_A^*}(\alpha) = \inf\{ t \in \mathbb{R} : P(T_A^* \le t \mid \{x_i\}_{i=1}^n) \ge \alpha \}.$$

Theorem. Under suitable assumptions, we have with probability $1 - o(1)$,

$$\sup_{\alpha \in (0,1)} \left| P\left(T_X \le c_{T_A^*}(\alpha)\right) - \alpha \right| = o(1).$$
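The two steps above amount to resampling block labels with replacement; a NumPy sketch under my own naming and defaults (assuming any remainder after blocking is discarded):

```python
import numpy as np

def block_bootstrap_quantile(x, b_n, alpha=0.95, n_boot=500, seed=None):
    """alpha-quantile of T_A^* = max_j sum_i (A*_ij - Abar_j) / sqrt(n), where
    the A*_ij are drawn i.i.d. from the empirical distribution of the
    non-overlapping block sums A_ij."""
    rng = np.random.default_rng(seed)
    n, p = x.shape
    l_n = n // b_n
    A = x[:l_n * b_n].reshape(l_n, b_n, p).sum(axis=1)  # block sums, (l_n, p)
    Abar = A.mean(axis=0)
    idx = rng.integers(l_n, size=(n_boot, l_n))          # resampled block labels
    T = (A[idx] - Abar).sum(axis=1).max(axis=1) / np.sqrt(n)
    return np.quantile(T, alpha)
```

Unlike the multiplier bootstrap, this version redraws whole blocks rather than reweighting them with Gaussian multipliers, so it does not require generating auxiliary normal variables.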

Future work

1. Choice of the block size in the blockwise multiplier bootstrap and the non-overlapping block bootstrap;
2. Maximum eigenvalue of a sum of random matrices: a natural step in going from vectors to matrices.

Thank you!