Bootstrapping high dimensional vector: interplay between dependence and dimensionality


1 Bootstrapping high dimensional vector: interplay between dependence and dimensionality Xianyang Zhang Joint work with Guang Cheng University of Missouri-Columbia LDHD: Transition Workshop, 2014 Xianyang Zhang (Mizzou) Bootstrapping high dimensional vector LDHD / 25

2 Overview
Let $x_1, x_2, \dots, x_n$ be a sequence of mean-zero dependent random vectors in $\mathbb{R}^p$, where $x_i = (x_{i1}, x_{i2}, \dots, x_{ip})$ with $1 \le i \le n$.
We provide a general (non-asymptotic) theory for quantifying
$$\rho_n := \sup_{t \in \mathbb{R}} \left| P(T_X \le t) - P(T_Y \le t) \right|,$$
where $T_X = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} x_{ij}$ and $T_Y = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} y_{ij}$, with $y_i = (y_{i1}, y_{i2}, \dots, y_{ip})$ being a Gaussian vector.
Key techniques: Slepian interpolation and the leave-one-block-out argument (a modification of Stein's leave-one-out method).
Two examples on inference for high dimensional time series.

3 Outline
1. Inference for high dimensional time series
   - Uniform confidence band for the mean
   - Specification testing on the covariance structure
2. Gaussian approximation for maxima of non-Gaussian sum
   - M-dependent time series
   - Weakly dependent time series
3. Bootstrap
   - Blockwise multiplier bootstrap
   - Non-overlapping block bootstrap

4 Example I: Uniform confidence band
Consider a p-dimensional weakly dependent time series $\{x_i\}$.
Goal: construct a uniform confidence band for $\mu_0 = E x_i \in \mathbb{R}^p$ based on the observations $\{x_i\}_{i=1}^{n}$ with $n \ll p$.
Consider the $(1 - \alpha)$ confidence band
$$\left\{ \mu = (\mu_1, \dots, \mu_p) \in \mathbb{R}^p : \sqrt{n} \max_{1 \le j \le p} |\mu_j - \bar{x}_j| \le c(1 - \alpha) \right\},$$
where $\bar{x} = (\bar{x}_1, \dots, \bar{x}_p) = \sum_{i=1}^{n} x_i / n$ is the sample mean and $c(1 - \alpha)$ is the $(1 - \alpha)$-quantile of the distribution of $\sqrt{n} \max_{1 \le j \le p} |\mu_{0j} - \bar{x}_j|$.
Question: how to obtain this critical value?

5 Blockwise Multiplier Bootstrap
Capture the dependence within and between the data vectors.
Suppose $n = b_n l_n$ with $b_n, l_n \in \mathbb{Z}$. Define the block sum
$$A_{ij} = \sum_{l=(i-1)b_n+1}^{i b_n} (x_{lj} - \bar{x}_j), \qquad i = 1, 2, \dots, l_n.$$
Assume $p = O(\exp(n^{b}))$ and $b_n = O(n^{b'})$ with $4b' + 7b < 1$ and $b' > 2b$.
Define the bootstrap statistic
$$T_A = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \left| \sum_{i=1}^{l_n} A_{ij} e_i \right|,$$
where $\{e_i\}$ is a sequence of i.i.d. $N(0, 1)$ random variables that are independent of $\{x_i\}$.
Compute $c(\alpha) := \inf\{t \in \mathbb{R} : P(T_A \le t \mid \{x_i\}_{i=1}^{n}) \ge \alpha\}$.
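The block sums and bootstrap statistic above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names, `n_boot`, `seed`, and the choice of `np.quantile` (which interpolates linearly rather than taking the exact infimum) are assumptions made here. It also assumes the absolute value sits inside the maximum, to match the two-sided band, and that a band of level 1 - α uses the (1 - α) bootstrap quantile.

```python
import numpy as np

def blockwise_multiplier_critical_value(x, block_size, level=0.95, n_boot=500, seed=0):
    """Bootstrap quantile of T_A = max_j |sum_i A_ij e_i| / sqrt(n)."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    if n % block_size != 0:
        raise ValueError("block_size must divide n (n = b_n * l_n)")
    n_blocks = n // block_size
    centered = x - x.mean(axis=0)
    # A[i, j]: sum of the centered j-th coordinate over the i-th block.
    A = centered.reshape(n_blocks, block_size, p).sum(axis=1)
    rng = np.random.default_rng(seed)
    # One row of Gaussian multipliers e_1, ..., e_{l_n} per bootstrap draw.
    e = rng.standard_normal((n_boot, n_blocks))
    T_A = np.abs(e @ A).max(axis=1) / np.sqrt(n)
    # Empirical quantile (linear interpolation) of the bootstrap draws.
    return float(np.quantile(T_A, level))

def uniform_band(x, block_size, level=0.95, n_boot=500, seed=0):
    """Band {mu : sqrt(n) * max_j |mu_j - xbar_j| <= c} as (lower, upper)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    c = blockwise_multiplier_critical_value(x, block_size, level, n_boot, seed)
    center = x.mean(axis=0)
    half_width = c / np.sqrt(n)
    return center - half_width, center + half_width
```

For example, `lower, upper = uniform_band(x, block_size=4)` on an `(n, p)` array returns two length-`p` arrays; the band has the same half-width in every coordinate because a single critical value controls the maximum deviation.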

6 Some numerical results
Consider a p-dimensional VAR(1) process, $x_t = \rho x_{t-1} + \sqrt{1 - \rho^2}\, \epsilon_t$, with one of three error models:
1. $\epsilon_{tj} = (\varepsilon_{tj} + \varepsilon_{t0})/\sqrt{2}$, where $(\varepsilon_{t0}, \varepsilon_{t1}, \dots, \varepsilon_{tp})$ are i.i.d. $N(0, I_{p+1})$;
2. $\epsilon_{tj} = \rho_1 \zeta_{tj} + \rho_2 \zeta_{t(j+1)} + \cdots + \rho_p \zeta_{t(j+p-1)}$, where $\{\rho_j\}_{j=1}^{p}$ are generated independently from $U(2, 3)$, and $\{\zeta_{tj}\}$ are i.i.d. $N(0, 1)$ random variables;
3. $\epsilon_{tj}$ is generated from the moving average model above with $\{\zeta_{tj}\}$ being i.i.d. centered Gamma(4, 1) random variables.
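A short sketch of how the VAR(1) design with error models 1 and 2 might be simulated. It assumes the recursion $x_t = \rho x_{t-1} + \sqrt{1-\rho^2}\,\epsilon_t$; the function name, the `burn_in` period, the initialisation at the first error vector, and the seeding are illustrative choices made here, not details given in the slides.

```python
import numpy as np

def simulate_var1(n, p, rho, error_model=1, burn_in=100, seed=0):
    """Simulate n observations of x_t = rho * x_{t-1} + sqrt(1 - rho^2) * eps_t."""
    rng = np.random.default_rng(seed)
    total = n + burn_in
    if error_model == 1:
        # eps_tj = (e_tj + e_t0) / sqrt(2): a common factor e_t0 shared by all j.
        e = rng.standard_normal((total, p + 1))
        errors = (e[:, 1:] + e[:, [0]]) / np.sqrt(2)
    elif error_model == 2:
        # eps_tj = sum_k w_k * zeta_{t, j+k}: moving average across coordinates,
        # with weights drawn once from U(2, 3).
        weights = rng.uniform(2.0, 3.0, size=p)
        zeta = rng.standard_normal((total, 2 * p - 1))
        errors = np.zeros((total, p))
        for k in range(p):
            errors += weights[k] * zeta[:, k:k + p]
    else:
        raise ValueError("error_model must be 1 or 2 in this sketch")
    x = np.empty((total, p))
    x[0] = errors[0]
    scale = np.sqrt(1.0 - rho ** 2)
    for t in range(1, total):
        x[t] = rho * x[t - 1] + scale * errors[t]
    # Drop the burn-in (an assumption) so the start-up value has little effect.
    return x[burn_in:]
```

Under error model 1 each coordinate of the error has unit variance, so each coordinate of the simulated series has variance close to one and lag-one autocorrelation close to `rho`.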

7 Some numerical results (cont.)
Table: Coverage probabilities of the uniform confidence band, where n = 120.
Columns: error models 1, 2 and 3 with p = 500, each at nominal levels 95% and 99%. Rows: ρ = 0.3 and ρ = 0.5, each with five block sizes $b_n$.
[The block sizes and table entries are missing from this transcript.]

8 Example II: Specification testing on the covariance structure
For a mean-zero p-dimensional time series $\{x_i\}$, define $\Gamma(h) = E x_{i+h} x_i' \in \mathbb{R}^{p \times p}$.
Consider
$$H_0 : \Gamma(h) = \tilde{\Gamma}(h) \quad \text{versus} \quad H_a : \Gamma(h) \ne \tilde{\Gamma}(h), \quad \text{for some } h \in \Lambda \subseteq \{0, 1, 2, \dots\}.$$
Special cases:
1. $\Lambda = \{0\}$: testing the covariance structure. See Cai and Jiang (2011), Chen et al. (2010), Li and Chen (2012) and Qiu and Chen (2012) for some developments when $\{x_i\}$ are i.i.d.
2. $\Lambda = \{1, 2, \dots, H\}$ and $\tilde{\Gamma}(h) = 0$ for $h \in \Lambda$: white noise testing.

9 Testing for white noise
Consider the white noise testing problem. Our test is given by
$$T = \sqrt{n} \max_{1 \le h \le H} \max_{1 \le j, k \le p} |\hat{\gamma}_{jk}(h)|,$$
where $\hat{\Gamma}(h) = \sum_{i=1}^{n-h} x_{i+h} x_i' / n = (\hat{\gamma}_{jk}(h))_{j,k=1}^{p}$.
Let $z_i = (z_{i,1}, \dots, z_{i,p^2 H})' = (\mathrm{vec}(x_{i+1} x_i')', \dots, \mathrm{vec}(x_{i+H} x_i')')' \in \mathbb{R}^{p^2 H}$ for $i = 1, \dots, N := n - H$.
Suppose $N = b_n l_n$ for $b_n, l_n \in \mathbb{Z}$. Define
$$T_A = \max_{1 \le j \le p^2 H} \frac{1}{\sqrt{n}} \left| \sum_{i=1}^{l_n} A_{ij} e_i \right|, \qquad A_{ij} = \sum_{l=(i-1)b_n+1}^{i b_n} (z_{l,j} - \bar{z}_j),$$
where $\{e_i\}$ is a sequence of i.i.d. $N(0, 1)$ random variables that are independent of $\{x_i\}$, and $\bar{z}_j = \sum_{i=1}^{N} z_{i,j} / N$.
Compute $c(\alpha) := \inf\{t \in \mathbb{R} : P(T_A \le t \mid \{x_i\}_{i=1}^{n}) \ge \alpha\}$, and reject the white noise null hypothesis if $T > c(\alpha)$.
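The white noise test above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions: the function name, `n_boot`, `seed` and the `level` argument (the bootstrap quantile used as critical value) are hypothetical, `np.quantile` interpolates linearly instead of taking the exact infimum, and absolute values are assumed in both $T$ and $T_A$.

```python
import numpy as np

def white_noise_test(x, max_lag, block_size, level=0.95, n_boot=500, seed=0):
    """Return (T, critical_value, reject) for H0: Gamma(h) = 0, 1 <= h <= max_lag."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    N = n - max_lag
    if N <= 0 or N % block_size != 0:
        raise ValueError("n - max_lag must be a positive multiple of block_size")
    # Test statistic: sqrt(n) * max over lags and entries of |Gamma_hat(h)|.
    T = 0.0
    for h in range(1, max_lag + 1):
        gamma_hat = x[h:].T @ x[:n - h] / n  # sum_i x_{i+h} x_i' / n
        T = max(T, float(np.sqrt(n) * np.abs(gamma_hat).max()))
    # z[i] stacks the p*p entries of x_{i+h} x_i' for h = 1, ..., max_lag.
    z = np.concatenate(
        [(x[h:h + N, :, None] * x[:N, None, :]).reshape(N, p * p)
         for h in range(1, max_lag + 1)],
        axis=1,
    )
    n_blocks = N // block_size
    centered = z - z.mean(axis=0)
    # A[i, j]: block sums of the centered j-th coordinate of z.
    A = centered.reshape(n_blocks, block_size, -1).sum(axis=1)
    rng = np.random.default_rng(seed)
    e = rng.standard_normal((n_boot, n_blocks))
    T_A = np.abs(e @ A).max(axis=1) / np.sqrt(n)
    critical = float(np.quantile(T_A, level))
    return T, critical, bool(T > critical)
```

The dimension of `z` is `p * p * max_lag`, so memory grows quadratically in `p`; for large `p` one would likely avoid materialising `z` in full.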

10 Some numerical results
We are interested in testing
$$H_0 : \Gamma(h) = 0 \text{ for } 1 \le h \le L \quad \text{versus} \quad H_a : \Gamma(h) \ne 0 \text{ for some } 1 \le h \le L.$$
Consider the following data generating processes:
1. multivariate normal: $x_{tj} = \rho_1 \zeta_{tj} + \rho_2 \zeta_{t(j+1)} + \cdots + \rho_p \zeta_{t(j+p-1)}$, where $\{\rho_j\}_{j=1}^{p}$ are generated independently from $U(2, 3)$, and $\{\zeta_{tj}\}$ are i.i.d. $N(0, 1)$ random variables;
2. multivariate ARCH model: $x_t = \Sigma_t^{1/2} \epsilon_t$ with $\epsilon_t \sim N(0, I_p)$ and $\Sigma_t = 0.1 I_p + 0.9\, x_{t-1} x_{t-1}'$, where $\Sigma_t^{1/2}$ is a lower triangular matrix based on the Cholesky decomposition of $\Sigma_t$;
3. VAR(1) model: $x_t = \rho x_{t-1} + \sqrt{1 - \rho^2}\, \epsilon_t$, where $\rho = 0.2$ and the errors $\{\epsilon_t\}$ are generated according to 1.

11 Some numerical results (cont.)
Table: Rejection percentages for testing the uncorrelatedness, where n = 240 and the actual number of parameters is $p^2 L$.
Columns: data generating processes 1, 2 and 3 with p = 20, each at nominal levels 5% and 1%. Rows: L = 1 and L = 3, each with four block sizes $b_n$.
[The block sizes and table entries are missing from this transcript.]

12 Maxima of non-Gaussian sum
The above applications hinge on a general theoretical result.
Let $x_1, x_2, \dots, x_n$ be a sequence of mean-zero dependent random vectors in $\mathbb{R}^p$, where $x_i = (x_{i1}, x_{i2}, \dots, x_{ip})$ with $1 \le i \le n$.
Target: approximate the distribution of
$$T_X = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} x_{ij}.$$

13 Gaussian approximation
Let $y_1, y_2, \dots, y_n$ be a sequence of mean-zero Gaussian random vectors in $\mathbb{R}^p$, where $y_i = (y_{i1}, y_{i2}, \dots, y_{ip})$ with $1 \le i \le n$.
Suppose that $\{y_i\}$ preserves the autocovariance structure of $\{x_i\}$, i.e., $\mathrm{cov}(y_i, y_j) = \mathrm{cov}(x_i, x_j)$.
Goal: quantify the Kolmogorov distance
$$\rho_n := \sup_{t \in \mathbb{R}} \left| P(T_X \le t) - P(T_Y \le t) \right|,$$
where $T_Y = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} y_{ij}$.
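To get a feel for the Kolmogorov distance, the following sketch estimates it by Monte Carlo in the simplest setting. Everything specific here is an assumption made for illustration: the data are independent (no time dependence), the non-Gaussian entries are centered exponential variables with unit variance, and the function names and replication count are hypothetical. The estimate also carries Monte Carlo error, so it only roughly tracks the true distance.

```python
import numpy as np

def max_of_normalized_sum(sample):
    """T = max_j n^{-1/2} * sum_i sample[i, j] for one (n, p) array."""
    n = sample.shape[0]
    return sample.sum(axis=0).max() / np.sqrt(n)

def kolmogorov_distance(a, b):
    """sup_t |F_a(t) - F_b(t)| between two empirical distribution functions."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # The supremum is attained at one of the pooled sample points.
    grid = np.concatenate([a, b])
    F_a = np.searchsorted(a, grid, side="right") / a.size
    F_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.abs(F_a - F_b).max())

def simulate_rho_n(n, p, n_rep=2000, seed=0):
    """Monte Carlo estimate of rho_n for i.i.d. centered exponential entries."""
    rng = np.random.default_rng(seed)
    T_X = np.empty(n_rep)
    T_Y = np.empty(n_rep)
    for r in range(n_rep):
        x = rng.exponential(1.0, size=(n, p)) - 1.0  # mean 0, variance 1
        y = rng.standard_normal((n, p))              # Gaussian, same covariance
        T_X[r] = max_of_normalized_sum(x)
        T_Y[r] = max_of_normalized_sum(y)
    return kolmogorov_distance(T_X, T_Y)
```

Calling `simulate_rho_n(n, p)` for a few values of `n` with `p` fixed gives a rough picture of how the approximation improves with the sample size.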

14 Existing results in the independent case
Question: how large can p be relative to n so that $\rho_n \to 0$?
Bentkus (2003): $\rho_n \to 0$ provided that $p^{7/2} = o(n)$.
Chernozhukov et al. (2013): $\rho_n \to 0$ if $p = O(\exp(n^{b}))$ with $b < 1/7$ (an astounding improvement).
Motivation: study the interplay between the dependence structure and the growth rate of p so that $\rho_n \to 0$.

15 Dependence Structure I: M-dependent time series
A time series $\{x_i\}$ is called M-dependent if for $|i - j| > M$, $x_i$ and $x_j$ are independent.
Under suitable restrictions on the tail of $x_i$ and weak dependence assumptions uniformly across the components of $x_i$, we show that
$$\rho_n \lesssim \frac{M^{1/2} (\log(pn/\gamma) \vee 1)^{7/8}}{n^{1/8}} + \gamma,$$
for some $\gamma \in (0, 1)$.
When $p = O(\exp(n^{b}))$ for $b < 1/11$, and $M = O(n^{b'})$ with $4b' + 7b < 1$, we have
$$\rho_n \le C n^{-c}, \qquad c, C > 0.$$
If $b' = 0$ (i.e., $M = O(1)$), our result allows $b < 1/7$ [Chernozhukov et al. (2013)].

16 Dependence Structure II: Physical dependence measure [Wu (2005)]
The sequence $\{x_i\}$ has the following causal representation,
$$x_i = G(\dots, \epsilon_{i-1}, \epsilon_i),$$
where $G$ is a measurable function and $\{\epsilon_i\}$ is a sequence of i.i.d. random variables.
Let $\{\epsilon_i'\}$ be an i.i.d. copy of $\{\epsilon_i\}$ and define
$$x_i^* = G(\dots, \epsilon_{-1}, \epsilon_0', \epsilon_1, \dots, \epsilon_i).$$
The strength of the dependence can be quantified via
$$\theta_{i,j,q}(x) = (E |x_{ij} - x_{ij}^*|^q)^{1/q}, \qquad \Theta_{i,j,q}(x) = \sum_{l=i}^{+\infty} \theta_{l,j,q}(x).$$
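As a worked illustration (an added example, not taken from the slides), consider a scalar linear process with i.i.d. innovations that have a finite q-th moment. Write $x_i^*$ for the coupled version of $x_i$ in which $\epsilon_0$ is replaced by an independent copy $\epsilon_0'$. Only the coefficient on $\epsilon_0$ changes, so the dependence measure can be computed in closed form:

```latex
% Linear process: x_i = \sum_{k \ge 0} a_k \epsilon_{i-k}, for i \ge 0
x_i - x_i^* = a_i (\epsilon_0 - \epsilon_0'),
\qquad
\theta_{i,q}(x) = |a_i| \, \| \epsilon_0 - \epsilon_0' \|_q .

% AR(1) special case: a_k = \rho^k with |\rho| < 1
\Theta_{M,q}(x) = \sum_{l \ge M} |\rho|^{l} \, \| \epsilon_0 - \epsilon_0' \|_q
                = \frac{|\rho|^{M}}{1 - |\rho|} \, \| \epsilon_0 - \epsilon_0' \|_q
                = O(|\rho|^{M}).
```

Geometric decay of this kind is the form of condition that the corollary on the Kolmogorov distance imposes on $\max_{1 \le j \le p} \Theta_{M,j,q}$.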

17 Bound on the Kolmogorov distance
Theorem. Under suitable conditions on the tail of $\{x_i\}$ and certain weak dependence assumptions, we have
$$\rho_n \lesssim n^{-1/8} M^{1/2} l_n^{7/8} + \left( n^{1/8} M^{-1/2} l_n^{-3/8} \right)^{\frac{q}{1+q}} \left( \sum_{j=1}^{p} \Theta_{M,j,q}^{q} \right)^{\frac{1}{1+q}} + \gamma,$$
where $\Theta_{i,j,q} = \Theta_{i,j,q}(x) \vee \Theta_{i,j,q}(y)$. Matching the first term with the M-dependent bound, $l_n$ here denotes $\log(pn/\gamma) \vee 1$ rather than the number of blocks.
The tradeoff between the first two terms reflects the interaction between the dimensionality and dependence.
Key step in the proof: M-dependent approximation.

18 Bound on the Kolmogorov distance (cont.)
Corollary. Suppose that
1. $\max_{1 \le j \le p} \Theta_{M,j,q} = O(\rho^{M})$ for $\rho < 1$ and $q \ge 2$;
2. $p = O(\exp(n^{b}))$ for $0 < b < 1/11$.
Then we have
$$\rho_n \le C n^{-c}, \qquad c, C > 0.$$

19 Dimension free dependence structure
Question: is there any so-called dimension free dependence structure? What kind of dependence assumption will not affect the growth rate of p?
For a permutation $\pi(\cdot)$, write $(x_{i\pi(1)}, \dots, x_{i\pi(p)}) = (z_{i1}, z_{i2})$.
Suppose $\{z_{i1}\}$ is an s-dimensional time series and $\{z_{i2}\}$ is a $(p - s)$-dimensional sequence of independent variables.
Assume that $\{z_{i1}\}$ and $\{z_{i2}\}$ are independent, and $s/p \to 0$.
Under suitable assumptions, it can be shown that for $p = O(\exp(n^{b}))$ with $b < 1/7$,
$$\rho_n \le C n^{-c}, \qquad c, C > 0.$$

20 Resampling
Summary: for M-dependent or more generally weakly dependent time series, we have shown that
$$\rho_n := \sup_{t \in \mathbb{R}} \left| P(T_X \le t) - P(T_Y \le t) \right| \le C n^{-c}, \qquad c, C > 0.$$
Question: in practice the autocovariance structure of $\{x_i\}$ is typically unknown. How can we approximate the distribution of $T_X$ or $T_Y$?
Solution: resampling methods.

21 Blockwise multiplier bootstrap
1. Suppose $n = b_n l_n$. Compute the block sum
$$A_{ij} = \sum_{l=(i-1)b_n+1}^{i b_n} x_{lj}, \qquad i = 1, 2, \dots, l_n.$$
2. Generate a sequence of i.i.d. $N(0, 1)$ random variables $\{e_i\}$ and compute
$$T_A = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{l_n} A_{ij} e_i.$$
3. Repeat step 2 several times and compute the α-quantile of $T_A$,
$$c_{T_A}(\alpha) = \inf\{t \in \mathbb{R} : P(T_A \le t \mid \{x_i\}_{i=1}^{n}) \ge \alpha\}.$$

22 Validity of the blockwise multiplier bootstrap
Theorem. Under suitable assumptions, we have for $p = O(\exp(n^{b}))$ with $0 < b < 1/15$,
$$\sup_{\alpha \in (0,1)} \left| P(T_X \le c_{T_A}(\alpha)) - \alpha \right| \lesssim n^{-c}, \qquad c > 0.$$

23 Non-overlapping block bootstrap
1. Let $A^*_{1j}, \dots, A^*_{l_n j}$ be an i.i.d. draw from the empirical distribution of $\{A_{ij}\}_{i=1}^{l_n}$ and compute
$$T_{A^*} = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{l_n} (A^*_{ij} - \bar{A}_j), \qquad \bar{A}_j = \sum_{i=1}^{l_n} A_{ij} / l_n.$$
2. Repeat the above step several times to obtain the α-quantile of $T_{A^*}$,
$$c_{T_{A^*}}(\alpha) = \inf\{t \in \mathbb{R} : P(T_{A^*} \le t \mid \{x_i\}_{i=1}^{n}) \ge \alpha\}.$$
Theorem. Under suitable assumptions, we have with probability $1 - o(1)$,
$$\sup_{\alpha \in (0,1)} \left| P(T_X \le c_{T_{A^*}}(\alpha) \mid c_{T_{A^*}}(\alpha)) - \alpha \right| = o(1).$$
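The two resampling steps above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name, `n_boot`, `seed`, and the use of `np.quantile` (linear interpolation instead of the exact infimum) are assumptions made here. Whole blocks are resampled with replacement, so all coordinates of a block move together and the cross-sectional dependence is preserved.

```python
import numpy as np

def block_bootstrap_quantile(x, block_size, level=0.95, n_boot=500, seed=0):
    """Quantile of T_{A*} = max_j sum_i (A*_ij - Abar_j) / sqrt(n)."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    if n % block_size != 0:
        raise ValueError("block_size must divide n (n = b_n * l_n)")
    n_blocks = n // block_size
    # A[i, j]: sum of the j-th coordinate over the i-th non-overlapping block.
    A = x.reshape(n_blocks, block_size, p).sum(axis=1)
    A_bar = A.mean(axis=0)
    rng = np.random.default_rng(seed)
    T_star = np.empty(n_boot)
    for b in range(n_boot):
        # Draw l_n block indices with replacement (same index for all j).
        idx = rng.integers(0, n_blocks, size=n_blocks)
        T_star[b] = (A[idx] - A_bar).sum(axis=0).max() / np.sqrt(n)
    return float(np.quantile(T_star, level))
```

Unlike the multiplier version, no Gaussian weights are generated; the randomness comes only from which blocks are drawn.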

24 Future work
1. Choice of the block size in the blockwise multiplier bootstrap and non-overlapping block bootstrap;
2. Maximum eigenvalue of a sum of random matrices: a natural step going from vectors to matrices.

25 Thank you!
