arxiv: v2 [math.st] 11 Dec 2018


Testing for high-dimensional network parameters in auto-regressive models

arXiv:1812.03659v2 [math.ST] 11 Dec 2018

Lili Zheng and Garvesh Raskutti
Department of Statistics, University of Wisconsin-Madison

Abstract

High-dimensional auto-regressive models provide a natural way to model influence between $M$ actors given multi-variate time series data for $T$ time intervals. While there has been considerable work on network estimation, there is limited work in the context of inference and hypothesis testing. In particular, prior work on hypothesis testing in time series has been restricted to linear Gaussian auto-regressive models. From a practical perspective, it is important to determine suitable statistical tests for connections between actors that go beyond the Gaussian assumption. In the context of high-dimensional time series models, confidence intervals present additional challenges, since most estimators such as the Lasso and Dantzig selector are biased, which has led to de-biased estimators. In this paper we address these challenges and provide convergence in distribution results and confidence intervals for the multi-variate AR($p$) model with sub-Gaussian noise, a generalization of Gaussian noise that broadens applicability and presents numerous technical challenges. The main technical challenge lies in the fact that, unlike Gaussian random vectors, for sub-Gaussian vectors zero correlation does not imply independence. The proof relies on an intricate truncation argument to develop novel concentration bounds for quadratic forms of dependent sub-Gaussian random variables. Our convergence in distribution results hold provided $T = \Omega((s \vee \rho)^2 \log^2 M)$, where $s$ and $\rho$ refer to sparsity parameters, which matches existing results for hypothesis testing with i.i.d. samples. We validate our theoretical results with simulation results for both block-structured and chain-structured networks.

1 Introduction

Vector autoregressive models arise in a number of applications including macroeconomics (see e.g. Ang and Piazzesi [2003], Hansen [2003], Shan [2005]), computational neuroscience (see e.g. Goebel et al. [2003], Seth et al. [2015], Harrison et al. [2003], Bressler et al. [2007]), and many others (see e.g. Michailidis and d'Alché-Buc [2013], Fujita et al. [2007]). Recent years have seen substantial development in the theory and methodology of high-dimensional auto-regressive models with respect to parameter estimation (see e.g. Song and Bickel [2011], Basu et al. [2015], Davis et al. [2016], Medeiros and Mendes [2016], Mark et al. [2018]). In particular, if there are $M$ dependent time series (e.g. voxels in the brain, actors in a social network, measurements at different spatial locations), time series network models allow us to model temporal dependence between actors/nodes in a network. More precisely, consider the following time series auto-regressive network model with lag $p$,

$$X_{t+1} = \sum_{j=1}^{p} A^{(j)} X_{t+1-j} + \epsilon_t, \tag{1}$$

where $\{X_t\} \subset \mathbb{R}^M$ is the time series data we have access to, $\{A^{(j)} \in \mathbb{R}^{M \times M},\ j = 1, \dots, p\}$ are the network parameters of interest and $\epsilon_t \in \mathbb{R}^M$ is zero-mean noise. We consider the high-dimensional setting where the number of nodes $M$ in the network is much larger than the sample size $T$. Prior work in Basu et al. [2015] has addressed the question of how to estimate the network parameter $A$ with Gaussian noise $\epsilon_t$ under sparsity assumptions and various structural constraints. In this paper, we focus on inference and hypothesis testing for the parameter $A$ given the data $\{X_t\}$.

In high-dimensional statistics, there has recently been a growing body of work on confidence intervals and hypothesis testing under structural assumptions such as sparsity. Since the widely used Lasso estimator for sparse linear regression is asymptotically biased, one-step estimators based on bias correction have been studied in works such as Zhang and Zhang [2014], Van de Geer et al. [2014] and Javanmard and Montanari [2014], which are referred to as LDPE, de-sparsifying and de-biasing estimators respectively. Low-dimensional components of these estimators are asymptotically normal and thus can be used for constructing hypothesis tests and confidence intervals. In this paper, we adopt the framework of Ning and Liu (Ning et al. [2017]), who propose a high-dimensional test statistic based on the score function, called the decorrelated score function, which we briefly describe here.

Formally, consider a statistical model $\mathcal{P} = \{P_\beta : \beta \in \Omega\}$ with high-dimensional parameter vector $\beta = (\theta, \gamma) \in \mathbb{R}^d$. Suppose we are interested in the scalar parameter $\theta$, and $\gamma \in \mathbb{R}^{d-1}$ is the nuisance parameter. Suppose the data $\{U_i,\ i = 1, \dots, n\}$ are i.i.d. following the distribution $P_\beta$; then the negative log-likelihood function is defined as

$$\ell(\theta, \gamma) = -\frac{1}{n} \sum_{i=1}^{n} \log f(U_i; \theta, \gamma).$$

It is known that the score function $\sqrt{n}\, \nabla_\theta \ell(0, \gamma)$ is asymptotically normal if the true parameter is $\beta = (0, \gamma)$. If $\gamma$ is substituted by some estimator $\hat\gamma$, the estimation-induced error can be approximated as

$$\sqrt{n}\, \nabla_\theta \ell(0, \hat\gamma) - \sqrt{n}\, \nabla_\theta \ell(0, \gamma) \approx \sqrt{n}\, \nabla^2_{\theta\gamma} \ell(0, \gamma) (\hat\gamma - \gamma),$$

when $\hat\gamma - \gamma$ is small enough. Although $\hat\gamma - \gamma$ converges to 0 with a properly chosen $\hat\gamma$, e.g. the Lasso estimator, $\sqrt{n}\, \nabla^2_{\theta\gamma} \ell(0, \gamma)(\hat\gamma - \gamma)$ would not vanish if $\mathbb{E}_\beta \nabla^2_{\theta\gamma} \ell(0, \gamma) \neq 0$. This fact motivates the decorrelated score function:

$$S(\theta, \gamma) = \nabla_\theta \ell(\theta, \gamma) - I_{\theta\gamma} I_{\gamma\gamma}^{-1} \nabla_\gamma \ell(\theta, \gamma),$$

with Fisher information matrix $I = \mathbb{E}_\beta \nabla^2 \ell(\beta)$. One can check that $\mathbb{E}\, \nabla_\gamma S(\theta, \gamma) = 0$. Both $\gamma$ and $I_{\theta\gamma} I_{\gamma\gamma}^{-1}$ are substituted by some estimators, and it is shown in Ning et al. [2017] that the decorrelated score function is asymptotically normal. In the linear regression case, the test statistic generated by the decorrelated score function in Ning et al. [2017] is equivalent to that constructed from the de-biased estimator in Van de Geer et al. [2014]. However, Ning et al. [2017] allow a more general form, which is thus easier to adapt to the time series case. In fact, Neykov et al. [2018] consider, amongst other examples, high-dimensional time series with Gaussian error innovations.

While Gaussian error innovations are widely used, many time series models include data that has bounded range or discrete data, for which the Gaussian distribution is not a natural fit. In this paper, we address the more general and technically challenging setting in which the noise $\epsilon_t$ is sub-Gaussian. One of the important technical challenges in going from the Gaussian to the sub-Gaussian case is that dependent Gaussian vectors can be rotated to be independent, while such a result does not hold for sub-Gaussian vectors. Prior work in Wong et al. [2016] addresses this challenge by imposing stationarity and $\beta$-mixing conditions. In order to avoid these conditions, we develop novel concentration bounds for sub-Gaussian random vectors.

In this paper, we investigate hypothesis testing and confidence regions with respect to a low-dimensional component of the parameter matrices $\{A^{(j)},\ j = 1, \dots, p\}$ for sub-Gaussian data, using the testing framework in Ning et al. [2017]. Our major contributions are as follows:

- Extending theoretical results in Ning et al. [2017] for high-dimensional hypothesis testing from Gaussian to sub-Gaussian temporally dependent data (VAR model), both under the null and the alternative hypothesis. We also show that our techniques lead to results similar to Neykov et al. [2018] in the Gaussian case but under less restrictive conditions;
- A novel concentration bound for quadratic forms of sub-Gaussian time series data. Note that unlike Gaussian vectors, which can be rotated to be independent, sub-Gaussian vectors cannot, which presents additional technical challenges. Our analysis also leads to estimators for covariance and regression parameters for time series data under sub-Gaussian assumptions, which are of independent interest;

- We also construct a semi-parametric efficient confidence region for multivariate parameters with fixed dimension;
- Finally, we support our theoretical guarantees with a simulation study on bounded noise, which is sub-Gaussian but not Gaussian.

1.1 Related Work

In the literature on inference for high-dimensional VAR models, most work focuses on the estimation problem. Song and Bickel [2011] investigate penalized least squares algorithms for different penalties, with some externally imposed assumptions on the temporal dependence. Theoretical guarantees on Dantzig type and Lasso type estimators are studied in Han et al. [2015] and Basu et al. [2015], but with Gaussian noise. Barigozzi and Brownlees [2018] consider inference for the stationary dependence structure built among variables, rather than for the parameters in the VAR model. In our work, we control the error bounds of Lasso and Dantzig type estimators for the parameter matrices with sub-Gaussian noise. We then establish the asymptotic distribution of the test statistic based on this.

In the high-dimensional hypothesis testing literature, there is some work on testing high-dimensional mean vectors (Srivastava [2009]), covariance matrices (Chen et al. [2010], Zhang et al. [2013]) and independence among variables (Schott [2005]). For testing regression parameters, most work assumes i.i.d. samples. Lockhart et al. [2014], Taylor et al. [2014] and Lee et al. [2016] propose methods to test whether a covariate should be selected conditional on the selection of some other covariates. A penalized score test depending on the tuning parameter $\lambda$ is considered in Voorman et al. [2014]. Our work follows a line of work by Zhang and Zhang [2014], Van de Geer et al. [2014], Javanmard and Montanari [2014] and Ning et al. [2017], the de-sparsifying or decorrelated literature. We construct a VAR version of the decorrelated score test proposed by Ning et al. [2017]. Chen and Wu [2018] tackle the hypothesis testing problem for time series data as well, but they test the trend in a time series rather than the autoregressive parameter, which encodes the influence structure among variables.

As mentioned earlier, our work is most closely related to the prior work of Neykov et al. [2018], which provides a hypothesis testing framework with high-dimensional Gaussian time series as a special case. In our work, we consider the more general and technically challenging case of sub-Gaussian vector auto-regressive models. Throughout this paper, we provide a comparison to the results derived in that work for the Gaussian case.

1.2 Organization of the Paper

Section 2 explains the problem setup and proposes our test statistic. Theoretical guarantees are given in Section 3. Specifically, Sections 3.1 and 3.2 present the weak convergence rate of the test statistic under the null and alternative hypotheses $H_0$ and $H_A$. Section 3.3 proposes some feasible estimators, which satisfy the required assumptions and can be plugged into the test statistic. Section 3.4 considers the case when the variance of the noise is unknown, and we construct a confidence region for multivariate parameter vectors in Section 3.5. We consider the special case of the AR(1) model with Gaussian noise, and a detailed comparison with Neykov et al. [2018] is provided in Section 3.6. Section 4 provides simulation results and Section 5 includes the proofs of the two main theorems. Much of the proof is deferred to the Appendices.

1.3 Notation

We define the following norms for vectors and matrices. For a vector $u = (u_1, \dots, u_d)^\top \in \mathbb{R}^d$, we define the $\ell_p$ norm, where $p \ge 1$, as $\|u\|_p = \big(\sum_{i=1}^{d} |u_i|^p\big)^{1/p}$. For a matrix $U \in \mathbb{R}^{n \times m}$, the $\ell_p$ norm and Frobenius norm of $U$ are defined as

$$\|U\|_p = \sup_{v \neq 0} \frac{\|Uv\|_p}{\|v\|_p}, \qquad \|U\|_F = \Big(\sum_{i=1}^{n} \sum_{j=1}^{m} U_{ij}^2\Big)^{1/2}.$$

We also use the notation $\|U\|_{1,1}$ to denote the $\ell_1$ penalty on $U$, which is $\sum_{i=1}^{n} \sum_{j=1}^{m} |U_{i,j}|$. Furthermore, if $U$ is symmetric, the trace norm of $U$ is $\|U\|_{\mathrm{tr}} = \operatorname{tr}\big(\sqrt{U^2}\big)$.

Throughout the paper, we assume that the entries of the noise vectors $\{\epsilon_{ti},\ 1 \le i \le M\}_t$ are independent sub-Gaussian variables with constant scale factor. A univariate centered random variable $X$ has a sub-Gaussian distribution with scale factor $\tau$ if

$$M_X(t) \triangleq \mathbb{E}[\exp(tX)] \le \exp(\tau^2 t^2 / 2).$$

2 Problem Setup

We consider a general vector auto-regressive time series with lag $p$, where $p$ is known, finite and independent of $T$ and the other dimensions:

$$X_{t+1} = \sum_{j=1}^{p} A^{(j)} X_{t-j+1} + \epsilon_t, \tag{2}$$

where $X_t \in \mathbb{R}^M$, $\epsilon_t \in \mathbb{R}^M$ is zero-mean, entry-wise independent sub-Gaussian noise with identity covariance matrix, and $A^{(j)} \in \mathbb{R}^{M \times M}$, $j = 1, \dots, p$, are the parameters of interest. Define the matrix $A = (A^{(1)}, \dots, A^{(p)}) \in \mathbb{R}^{M \times pM}$ and the stacked vector $\mathcal{X}_t = (X_t^\top, \dots, X_{t-p+1}^\top)^\top \in \mathbb{R}^{pM}$; then we can also write (2) as

$$X_{t+1} = A \mathcal{X}_t + \epsilon_t. \tag{3}$$
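As an illustration of the lag-$p$ model and its stacked form just described, the following is a minimal NumPy sketch. The function names `simulate_var` and `stacked`, the zero initial values, and the choice of Uniform$(-\sqrt{3}, \sqrt{3})$ noise (bounded, hence sub-Gaussian, with unit variance) are assumptions made for this example, not part of the paper.

```python
import numpy as np

def simulate_var(A_lags, T, rng):
    """Simulate X_{t+1} = sum_{j=1}^p A^(j) X_{t-j+1} + eps_t for t = 0, ..., T-1.

    A_lags: list of p arrays of shape (M, M), i.e. [A^(1), ..., A^(p)].
    Returns (X, eps); X has shape (T + p, M) and holds X_{1-p}, ..., X_T,
    so the row index of X_t is t + p - 1. Initial values are set to zero
    (an assumption of this sketch; the chain is not started at stationarity).
    """
    p, M = len(A_lags), A_lags[0].shape[0]
    X = np.zeros((T + p, M))
    # Bounded noise with unit variance: Uniform(-sqrt(3), sqrt(3)).
    eps = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(T, M))
    for t in range(T):
        r = t + p - 1  # row holding X_t
        X[r + 1] = sum(A_lags[j] @ X[r - j] for j in range(p)) + eps[t]
    return X, eps

def stacked(X, t, p):
    """Return the stacked vector (X_t, X_{t-1}, ..., X_{t-p+1}) in R^{pM}."""
    r = t + p - 1
    return np.concatenate([X[r - j] for j in range(p)])

# Usage: the lag-wise recursion and the stacked form agree.
rng = np.random.default_rng(0)
A_lags = [0.4 * np.eye(3), 0.2 * np.eye(3)]   # p = 2, M = 3
X, eps = simulate_var(A_lags, T=100, rng=rng)
A = np.hstack(A_lags)                          # A = (A^(1), A^(2)), shape (M, pM)
t = 10
same = np.allclose(X[t + 2], A @ stacked(X, t, 2) + eps[t])  # row t + 2 holds X_{t+1}
```

Storing the series with an offset of $p - 1$ rows keeps the $p$ pre-sample values $X_{1-p}, \dots, X_0$ in the same array as the $T$ regression targets.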

For notational convenience, we assume the time series data $X_t$ has time range $1 - p \le t \le T$. Based on the data $(X_t)_{t=1-p}^{T}$, we test the hypothesis of whether a subset of the entries of $A$ are 0. Let $A_i$ be the $i$-th row vector of $A$. Without loss of generality, suppose the entries we test are in rows $1, \dots, k$. Define $D_m \subset \{1, \dots, pM\}$ as the columns we test in the $m$-th row with $d_m = |D_m|$, and $D = \{(i, j) : 1 \le i \le k,\ j \in D_i\}$, with $d = |D| = \sum_{m=1}^{k} d_m$. We test the null hypothesis

$$H_0 : \tilde{A}_D = 0, \tag{4}$$

where $\tilde{A}_D = (A_{1,D_1}, \dots, A_{k,D_k})^\top \in \mathbb{R}^d$. We also assume that $d$ is finite and does not increase with $T$. In the work of Neykov et al. [2018], $d$ is assumed to be 1.

2.1 Stationary distribution

Since we are developing a hypothesis testing framework based on the decorrelated score test, it is important to specify a stationary distribution for $X_t$. Using standard notation from autoregressive time series models, define the polynomial $\mathcal{A}(z) = I_M - \sum_{j=1}^{p} A^{(j)} z^j$, where $I_M$ is the $M \times M$ identity matrix and $z$ is a complex number. To guarantee the existence of a stationary solution to (3), we assume

$$\det(\mathcal{A}(z)) \neq 0, \qquad |z| \le 1.$$

Then we can write

$$\mathcal{A}(z)^{-1} = \sum_{j=0}^{\infty} \Psi_j z^j,$$

where $\Psi_j$, $j \ge 0$, are all real-valued matrices which are polynomial functions of $A^{(i)}$, $1 \le i \le p$. Note that in the special case where $p = 1$, $\Psi_j = A^j$. It can be shown that the unique stationary solution to (2) is

$$X_t = \sum_{j=0}^{\infty} \Psi_j \epsilon_{t-j-1},$$

and the covariance matrix $\Sigma$ of $X_t$ satisfies

$$\Sigma = \operatorname{Cov}(X_t) = \sum_{j=0}^{\infty} \Psi_j \Psi_j^\top. \tag{5}$$

2.2 Decorrelated Score Function

Using the framework developed in Ning et al. [2017] for independent design, we consider the decorrelated score test. First we define the score function $S(A) \in \mathbb{R}^{M \times pM}$, with each entry defined as follows:

$$[S(A)]_{jk} = -\frac{1}{T} \sum_{t=0}^{T-1} (X_{t+1,j} - A_j \mathcal{X}_t)\, \mathcal{X}_{tk} = -\frac{1}{T} \sum_{t=0}^{T-1} \epsilon_{t,j}\, \mathcal{X}_{tk}.$$
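For the lag-one case, where $\Psi_j = A^j$, the stationary covariance series can be checked numerically. The sketch below truncates the series and verifies the fixed-point relation $\Sigma = A \Sigma A^\top + I$ that holds for identity noise covariance; the function name `stationary_cov_ar1`, the truncation length and the example matrix are assumptions of this illustration.

```python
import numpy as np

def stationary_cov_ar1(A, n_terms=200):
    """Truncated series Sigma = sum_{j>=0} A^j (A^j)^T for X_{t+1} = A X_t + eps_t
    with Cov(eps_t) = I (the p = 1 case, where Psi_j = A^j)."""
    M = A.shape[0]
    Sigma = np.zeros((M, M))
    Psi = np.eye(M)            # Psi_0 = I
    for _ in range(n_terms):
        Sigma += Psi @ Psi.T
        Psi = A @ Psi          # Psi_{j+1} = A Psi_j
    return Sigma

# Example: a symmetric 2 x 2 matrix with eigenvalues 0.75 and 0.25.
A = np.array([[0.5, 0.25],
              [0.25, 0.5]])
spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))   # must be < 1 for stationarity
Sigma = stationary_cov_ar1(A)
fixed_point_gap = np.max(np.abs(Sigma - (A @ Sigma @ A.T + np.eye(2))))
```

For a symmetric $A$ the series also has the closed form $(I - A^2)^{-1}$, which gives a second way to check the truncation.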

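As pointed out in Ning et al. [2017], the standard score function is infeasible and we need to consider the decorrelated score function $S = (S_1^\top, S_2^\top, \dots, S_k^\top)^\top \in \mathbb{R}^d$, with each $S_m \in \mathbb{R}^{d_m}$ corresponding to the tested entries $(m, D_m)$:

$$S_m = -\frac{1}{T} \sum_{t=0}^{T-1} \epsilon_{t,m} \big(\mathcal{X}_{t,D_m} - w_m^\top \mathcal{X}_{t,D_m^c}\big),$$

where $\mathcal{X}_{t,D_m} \in \mathbb{R}^{d_m}$ is composed of the entries of $\mathcal{X}_t$ whose indices are within the set $D_m$, $\mathcal{X}_{t,D_m^c} \in \mathbb{R}^{pM - d_m}$ is defined similarly, and $w_m \in \mathbb{R}^{(pM - d_m) \times d_m}$ is chosen to satisfy

$$\operatorname{Cov}\big(\mathcal{X}_{t,D_m} - w_m^\top \mathcal{X}_{t,D_m^c},\ \mathcal{X}_{t,D_m^c}\big) = 0. \tag{6}$$

Specifically, $w_m$ is defined as a function of $\Upsilon = \operatorname{Cov}(\mathcal{X}_t) \in \mathbb{R}^{pM \times pM}$:

$$w_m = \Upsilon_{D_m^c, D_m^c}^{-1} \Upsilon_{D_m^c, D_m}. \tag{7}$$

2.3 Test Statistic

Based on the decorrelated score function $S$, we first define the statistic $V_{T,m} \in \mathbb{R}^{d_m}$:

$$V_{T,m} \triangleq \sqrt{T}\, (\Upsilon^{(m)})^{-1/2} S_m,$$

with $\Upsilon^{(m)} \in \mathbb{R}^{d_m \times d_m}$ defined as

$$\Upsilon^{(m)} \triangleq \operatorname{Cov}\big(\mathcal{X}_{t,D_m} - w_m^\top \mathcal{X}_{t,D_m^c}\big) = \operatorname{Cov}\big(\mathcal{X}_{t,D_m} \mid \mathcal{X}_{t,D_m^c}\big) = \Upsilon_{D_m,D_m} - \Upsilon_{D_m,D_m^c} \Upsilon_{D_m^c,D_m^c}^{-1} \Upsilon_{D_m^c,D_m}. \tag{8}$$

Let $V_T$ be the $d$-dimensional vector obtained by concatenating the $V_{T,m}$'s: $V_T = (V_{T,1}^\top, \dots, V_{T,k}^\top)^\top$. One of the main results of the paper is to show that $V_T$ is asymptotically Gaussian. Define $U_T = \|V_T\|_2^2$; then $U_T$ is asymptotically $\chi^2_d$. Since we do not know $\epsilon_t$, $w_m$ and $\Upsilon^{(m)}$, we later define estimators for these quantities. Formally, we define our test statistic $\hat{U}_T$ as

$$\hat{U}_T = T \sum_{m=1}^{k} \hat{S}_m^\top (\hat\Upsilon^{(m)})^{-1} \hat{S}_m, \tag{9}$$

where $\hat\Upsilon^{(m)} \in \mathbb{R}^{d_m \times d_m}$ is an estimator for $\Upsilon^{(m)}$ and $\hat{S}_m \in \mathbb{R}^{d_m}$ is defined as

$$\hat{S}_m = -\frac{1}{T} \sum_{t=0}^{T-1} \big(X_{t+1,m} - \hat{A}_{m,D_m^c} \mathcal{X}_{t,D_m^c}\big) \big(\mathcal{X}_{t,D_m} - \hat{w}_m^\top \mathcal{X}_{t,D_m^c}\big),$$

with $\hat{A}_m \in \mathbb{R}^{pM}$ and $\hat{w}_m \in \mathbb{R}^{(pM - d_m) \times d_m}$ estimating $A_m$ and $w_m$. Here we are not worried about the invertibility of $\hat\Upsilon^{(m)}$, since $\Upsilon^{(m)}$ is a low-dimensional covariance matrix.

To guarantee good estimation of the high-dimensional parameters $A_m$ and $w_m$, we impose sparsity conditions upon them. Specifically, for each $1 \le i \le M$ and $1 \le m \le k$ define

$$\rho_i \triangleq \|A_i\|_0, \qquad s_m \triangleq \|w_m\|_0, \tag{10}$$

and note that they both depend on $A$. The sparsity of $w_m$ can be implied by the sparsity of $\Upsilon^{-1}$, which is a common condition in the high-dimensional hypothesis testing literature (e.g. see Van de Geer et al. [2014]). Specifically, the following lemma shows that when the lag is $p = 1$ and $A$ is symmetric, the sparsity of $w_m$ is implied by the sparsity of $A$:

Lemma 2.1. If $p = 1$ and $A \in \mathbb{R}^{M \times M}$ is symmetric, then $s_m$ defined in (10) satisfies $s_m \le d_m \max_{1 \le i \le M} \rho_i$ for $1 \le m \le k$.

The proof of Lemma 2.1 is included in Appendix E.

3 Theoretical guarantee

In this section, we present uniform convergence results for the test statistic $\hat{U}_T$ under $H_0$ and $H_A$, with $A$ and the estimators satisfying certain conditions. We also provide feasible estimators, and prove that they satisfy the corresponding conditions in Section 3.3. Unknown variance and confidence region construction are discussed in Sections 3.4 and 3.5. In Section 3.6 we provide consequences of our theory under the AR(1) model with Gaussian noise and compare our results with Neykov et al. [2018].

Recall that the null hypothesis is $H_0 : \tilde{A}_D = 0$, with $\tilde{A}_D \in \mathbb{R}^d$ obtained by concatenating $A_{1,D_1}, \dots, A_{k,D_k}$. For the alternative hypothesis, as in Ning et al. [2017], we consider

$$H_A : \tilde{A}_D = T^{-\varphi} \Delta, \tag{11}$$

with some constant $\varphi > 0$ and constant vector $\Delta \in \mathbb{R}^d$. Write

$$\Delta = (\Delta_1^\top, \dots, \Delta_k^\top)^\top, \tag{12}$$

where each $\Delta_m \in \mathbb{R}^{d_m}$. The reason why $T^{-\varphi} \Delta$ instead of $\Delta$ is considered in (11) is that we expect the test to be more sensitive as the sample size $T$ increases. We will see how the value of $\varphi$ influences the convergence of $\hat{U}_T$ in Theorem 3.2. We still assume the $\epsilon_{ti}$'s are i.i.d. sub-Gaussian random variables, and also consider a special case where $\epsilon_t \sim N(0, I)$. We compare our result in the Gaussian case to the results in Neykov et al. [2018].

First we define the sets $\Omega_0$ and $\Omega_1$ of feasible parameter matrices $A$ under $H_0$ and $H_A$ respectively. To control the stability of $\{X_t\}$ in model (3), we impose the condition

$$\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \|\Psi_{i+j}\|_2 \le \beta, \tag{13}$$

for some constant $\beta > 0$. In the case $p = 1$, condition (13) reduces to

$$\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \|A^{i+j}\|_2 \le \beta, \tag{14}$$

which is implied by $\|A\|_2 \le 1 - \varepsilon$ for some $0 < \varepsilon < 1$, a typical condition (see e.g. Neykov et al. [2018]). Then define the sets $\Omega_0$ and $\Omega_1$ for any $\beta, \rho, s, M, T, \varphi > 0$, set $D$ of size $d$ and vector $\Delta = (\Delta_1^\top, \dots, \Delta_k^\top)^\top \in \mathbb{R}^d$:

$$\Omega_0 = \Big\{A \in \mathbb{R}^{M \times pM} : \tilde{A}_D = 0,\ \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \|\Psi_{i+j}\|_2 \le \beta,\ \max_i \rho_i(A) \le \rho,\ \max_m s_m(A) \le s\Big\}, \tag{15}$$

$$\Omega_1 = \Big\{A \in \mathbb{R}^{M \times pM} : \tilde{A}_D = T^{-\varphi} \Delta,\ \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \|\Psi_{i+j}\|_2 \le \beta,\ \max_i \rho_i(A) \le \rho,\ \max_m s_m(A) \le s\Big\}. \tag{16}$$

Note here that $\rho_i(A)$ and $s_m(A)$ are still functions of $A$, since $\Upsilon$ is determined by $A$.

Clearly we need reliable estimators $\hat{A}_m$, $\hat{w}_m$ and $\hat\Upsilon^{(m)}$, $1 \le m \le k$, to guarantee the weak convergence of $\hat{U}_T$. We present the following assumptions for these estimators, which we verify in Section 3.3. Note that the constants $C$ may depend on $p$, $d$, $\beta$ and $\tau$, but do not depend on either $M$ or $T$.

Assumption 3.1 (Estimation Error for $A_m$). For each $A \in \Omega_0 \cup \Omega_1$,

$$\|\hat{A}_m - A_m\|_1 \le C \rho \sqrt{\frac{\log M}{T}}, \qquad \|\hat{A}_m - A_m\|_2 \le C \sqrt{\frac{\rho \log M}{T}}, \qquad (\hat{A}_m - A_m) \Big(\frac{1}{T} \sum_{t=0}^{T-1} \mathcal{X}_t \mathcal{X}_t^\top\Big) (\hat{A}_m - A_m)^\top \le C \frac{\rho \log M}{T}, \tag{17}$$

hold for $1 \le m \le k$, with probability at least $1 - c_1 \exp\{-c_2 \log M\}$.

These are standard error bounds for the Lasso estimator and the Dantzig selector with independent design. In this paper we verify Assumption 3.1 and the remaining two assumptions in Section 3.3 when we have dependent sub-Gaussian random variables, as we do in our vector auto-regressive model setting.

Assumption 3.2 (Estimation Error for $w_m$). For each $A \in \Omega_0 \cup \Omega_1$,

$$\|\hat{w}_m - w_m\|_1 \le C s \sqrt{\frac{\log M}{T}}, \qquad \operatorname{tr}\Big[(\hat{w}_m - w_m)^\top \Big(\frac{1}{T} \sum_{t=0}^{T-1} \mathcal{X}_{t,D_m^c} \mathcal{X}_{t,D_m^c}^\top\Big) (\hat{w}_m - w_m)\Big] \le C \frac{s \log M}{T}, \tag{18}$$

hold for $1 \le m \le k$, with probability at least $1 - c_1 \exp\{-c_2 \log M\}$.

Similar to Assumption 3.1, we will show that both the Lasso estimator and the Dantzig selector under model (3) satisfy Assumption 3.2.

Assumption 3.3 (Estimation Error for $\Upsilon^{(m)}$). For each $A \in \Omega_0 \cup \Omega_1$,

$$\Big\|(\Upsilon^{(m)})^{-1/2}\, \hat\Upsilon^{(m)}\, (\Upsilon^{(m)})^{-1/2} - I\Big\|_2 \le C \frac{(s \vee \rho) \log M}{\sqrt{T}}, \tag{19}$$

hold for $1 \le m \le k$, with probability at least $1 - c_1 \exp\{-c_2 \log M\}$.

Note that $\Upsilon^{(m)} \in \mathbb{R}^{d_m \times d_m}$ is a low-dimensional matrix, and thus it is computationally feasible to use the sample covariance matrix of $\mathcal{X}_{t,D_m} - \hat{w}_m^\top \mathcal{X}_{t,D_m^c}$ as an estimator for $\Upsilon^{(m)}$. We show in Section 3.3 that, as long as $\hat{w}_m$ is a reliable estimator for $w_m$, $\hat\Upsilon^{(m)}$ satisfies a tighter bound than (19). The looser bound in Assumption 3.3 actually allows more choices of estimators for $\Upsilon^{(m)}$, as shown in Section 3.5.

3.1 Uniform convergence under null hypothesis

Based on these assumptions, we have the following main theorem.

Theorem 3.1. Consider the model (3) with i.i.d. sub-Gaussian noise $\epsilon_{ti}$ with sub-Gaussian parameter $\tau$. If Assumptions 3.1-3.3 are satisfied, and $(\rho \vee s) \log M = o(\sqrt{T})$, then $\hat{U}_T$ defined in (9) satisfies

$$\sup_{x \in \mathbb{R},\, A \in \Omega_0} \big|P(\hat{U}_T \le x) - F_d(x)\big| \le \frac{C_1}{T^{1/8}} + C_2 \frac{(s \vee \rho) \log M}{\sqrt{T}} + \frac{C_3}{M^{C_4}}, \tag{20}$$

when $T > C$ for some constant $C$, where $F_d$ denotes the cumulative distribution function of the $\chi^2_d$ distribution. Here the constants $C_i$ depend on $p$, $d$, $\beta$, $\tau$.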
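The population quantities entering the statistic, namely the decorrelation weights $w_m$ and the partial covariance $\Upsilon^{(m)}$, are simple functions of $\Upsilon$. The following minimal sketch computes them for a given covariance matrix and index set and exposes the decorrelation property numerically. The function name `decorrelation_weights` and the randomly generated covariance are assumptions made for the example.

```python
import numpy as np

def decorrelation_weights(Upsilon, D):
    """Population decorrelation weights and partial covariance for index set D.

    Returns (w, Ups_m, Dc) with
      w     = Upsilon[Dc, Dc]^{-1} Upsilon[Dc, D]      (shape (q - d) x d),
      Ups_m = Upsilon[D, D] - Upsilon[D, Dc] w          (Schur complement),
    where Dc is the complement of D in {0, ..., q - 1}.
    """
    D = np.asarray(D)
    Dc = np.setdiff1d(np.arange(Upsilon.shape[0]), D)
    U_cc = Upsilon[np.ix_(Dc, Dc)]
    U_cd = Upsilon[np.ix_(Dc, D)]
    w = np.linalg.solve(U_cc, U_cd)
    Ups_m = Upsilon[np.ix_(D, D)] - U_cd.T @ w
    return w, Ups_m, Dc

# Usage on a hypothetical 6 x 6 covariance matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
Upsilon = B @ B.T + np.eye(6)          # symmetric positive definite
D = [0, 3]
w, Ups_m, Dc = decorrelation_weights(Upsilon, D)

# Cross-covariance between X_D - w^T X_{D^c} and X_{D^c}; it should vanish.
cross_cov = Upsilon[np.ix_(D, Dc)] - w.T @ Upsilon[np.ix_(Dc, Dc)]
```

The Schur complement returned as `Ups_m` coincides with the inverse of the $(D, D)$ block of $\Upsilon^{-1}$, which is why sparsity of the precision matrix translates into sparsity of the weights.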

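Theorem 3.1 proves weak convergence of $\hat{U}_T$ to $\chi^2_d$. The uniform convergence rate can be understood as follows: the first term is due to the rate obtained by the martingale CLT, where we require $T^{1/8}$ rather than $T^{1/2}$ due to the dependence; the remaining two terms arise from estimation error, with the second being the error bounds and the third being the probability that the error bounds do not hold. If we assume Gaussianity, we can improve the first term in the rate of convergence from $T^{-1/8}$ to $T^{-1/4 + \alpha}$ for any $\alpha > 0$. To the best of our knowledge, ours is the first work that formally attempts to characterize the rates of convergence.

Remark 3.1. Compared to the theoretical result for independent design in Ning et al. [2017], the only additional condition we add is $\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \|\Psi_{i+j}\|_2 \le \beta$, which is used to control the strength of dependence uniformly. Also, we consider multivariate testing, which is more general, and derive the explicit convergence rate.

Remark 3.2. The test statistics proposed in Van de Geer et al. [2014] and Javanmard and Montanari [2014] for the independent design share similar ideas with our test statistic. Instead of imposing a sparsity assumption upon $w_m$, Van de Geer et al. [2014] assume $\Upsilon^{-1}$ to be row-wise sparse. This is actually equivalent to the sparsity assumption on $w_m$ in the univariate case. Javanmard and Montanari [2014] do not require the sparsity condition on $\Upsilon^{-1}$, but it is hard to extend their theory to the time series setting, due to a difficulty in applying the martingale CLT.

Remark 3.3. The theoretical guarantee we obtain here is more general and stronger than the result achieved in Neykov et al. [2018]. A more detailed comparison is presented in Section 3.6.

3.2 Uniform convergence under alternative hypothesis

Recall the definition of $\Omega_1$ in (16). The following theorem establishes the asymptotic behavior of $\hat{U}_T$ for $A \in \Omega_1$, with different values of $\varphi$. First define

$$\tilde\Delta = (\tilde\Delta_1^\top, \dots, \tilde\Delta_k^\top)^\top, \qquad \tilde\Delta_m = (\Upsilon^{(m)})^{1/2} \Delta_m, \tag{21}$$

where $\Upsilon^{(m)}$ is defined in (8), and let $F_{d,\lambda}$ denote the cumulative distribution function of the non-central $\chi^2_d$ distribution with noncentrality parameter $\lambda$.

Theorem 3.2. Consider the model (3) with i.i.d. sub-Gaussian noise $\epsilon_{ti}$ and sub-Gaussian parameter $\tau$. If Assumptions 3.1-3.3 are satisfied, and $(\rho \vee s) \log M = o(\sqrt{T})$, then when $T > C$ for some constant $C$:

1. If $\varphi = 1/2$,

$$\sup_{x \in \mathbb{R},\, A \in \Omega_1} \big|P(\hat{U}_T \le x) - F_{d, \|\tilde\Delta\|_2^2}(x)\big| \le \frac{C_1}{T^{1/8}} + C_2 \frac{(s \vee \rho) \log M}{\sqrt{T}} + \frac{C_3}{M^{C_4}}; \tag{22}$$

2. If $0 < \varphi < 1/2$,

$$\sup_{A \in \Omega_1} P(\hat{U}_T \le x) \le \frac{C_1}{T^{1/8}} + \frac{C_2}{M^{C_3}} + C_4 \exp\{-C_5 T^{1 - 2\varphi} + C_6 x\}; \tag{23}$$

3. If $\varphi > 1/2$,

$$\sup_{x \in \mathbb{R},\, A \in \Omega_1} \big|P(\hat{U}_T \le x) - F_d(x)\big| \le \frac{C_1}{T^{1/8}} + C_2 \frac{(s \vee \rho) \log M}{\sqrt{T}} + \frac{C_3}{M^{C_4}} + \cdots, \tag{24}$$

where the bound in (24) contains one further term, depending on $T$ and $\varphi$, whose exact form is not legible in the source.

Here the $C_i$'s are constants depending on $p$, $d$, $\beta$, $\Delta$, $\tau$.

Theorem 3.2 shows the threshold value of $\varphi$ for $H_A$ to be detectable. When $\varphi > 1/2$, we cannot distinguish $H_0$ and $H_A$, since in both cases $\hat{U}_T$ converges to $\chi^2_d$. When $\varphi < 1/2$, $\hat{U}_T$ diverges to $+\infty$ in probability, so it is very easy to detect $H_A$. When $\varphi = 1/2$, $\hat{U}_T$ converges to a non-central $\chi^2_d$ with noncentrality parameter determined by the constant vector $\Delta$ and $\Upsilon = \operatorname{Cov}(\mathcal{X}_t)$, which determines the power of the test. Note that (23) also holds for the trivial case $\varphi < 0$, since we do not use the fact that $\varphi > 0$ in the proof.

Remark 3.4. Theorem 3.2 is also consistent with the threshold value of $\varphi$ given by Ning et al. [2017] for linear regression with i.i.d. samples. However, Ning et al. [2017] assume additional conditions on the scaling of the sample size, the number of covariates and the sparsity of $w$ for proving asymptotic power. Our conditions are exactly the same as the ones for $H_0$, due to a more specific model and careful analysis.

3.3 Feasible Estimators

Both the estimation of $w_m$ and of $A_m$ can be viewed as high-dimensional sparse regression problems, so we can use the Lasso or the Dantzig selector. Formally, define

$$\hat{A}^L = \arg\min_{A \in \mathbb{R}^{M \times pM}} \frac{1}{T} \sum_{t=0}^{T-1} \|X_{t+1} - A \mathcal{X}_t\|_2^2 + \lambda_A \|A\|_{1,1}, \tag{25}$$

as the Lasso estimator for $A$, and

$$\hat{A}^D = \arg\min_{A \in \mathbb{R}^{M \times pM}} \|A\|_{1,1}, \quad \text{s.t. } \Big\|\frac{1}{T} \sum_{t=0}^{T-1} (X_{t+1} - A \mathcal{X}_t) \mathcal{X}_t^\top\Big\|_\infty \le \lambda_A, \tag{26}$$

as the Dantzig selector estimator for $A$. Similarly, for $1 \le m \le k$, define

$$\hat{w}_m^L = \arg\min_{w \in \mathbb{R}^{(pM - d_m) \times d_m}} \frac{1}{T} \sum_{t=0}^{T-1} \|\mathcal{X}_{t,D_m} - w^\top \mathcal{X}_{t,D_m^c}\|_2^2 + \lambda_w \|w\|_{1,1}, \tag{27}$$

and

$$\hat{w}_m^D = \arg\min_{w \in \mathbb{R}^{(pM - d_m) \times d_m}} \|w\|_{1,1}, \quad \text{s.t. } \Big\|\frac{1}{T} \sum_{t=0}^{T-1} (\mathcal{X}_{t,D_m} - w^\top \mathcal{X}_{t,D_m^c}) \mathcal{X}_{t,D_m^c}^\top\Big\|_\infty \le \lambda_w. \tag{28}$$

For estimating $\Upsilon^{(m)}$, since this is a low-dimensional covariance matrix for $\mathcal{X}_{t,D_m} - w_m^\top \mathcal{X}_{t,D_m^c}$, we can directly use the sample covariance of $\mathcal{X}_{t,D_m} - \hat{w}_m^\top \mathcal{X}_{t,D_m^c}$ as $\hat\Upsilon^{(m)}$:

$$\hat\Upsilon^{(m)} = \frac{1}{T} \sum_{t=0}^{T-1} \big(\mathcal{X}_{t,D_m} - \hat{w}_m^\top \mathcal{X}_{t,D_m^c}\big) \big(\mathcal{X}_{t,D_m} - \hat{w}_m^\top \mathcal{X}_{t,D_m^c}\big)^\top, \tag{29}$$

for $1 \le m \le k$. Here $\hat{w}_m$ in the definition (29) is either $\hat{w}_m^L$ or $\hat{w}_m^D$. As shown in the following, the estimators (25) to (29) all satisfy Assumptions 3.1 to 3.3 under the model setting stated in (3):

Lemma 3.1. If $\hat{A} = \hat{A}^L$ or $\hat{A} = \hat{A}^D$, defined as in (25) and (26) with $\lambda_A \asymp \sqrt{\log M / T}$, then $\hat{A}$ satisfies Assumption 3.1 when $T > C \rho \log M$.

Lemma 3.2. If $\hat{w}_m = \hat{w}_m^L$ or $\hat{w}_m = \hat{w}_m^D$, defined as in (27) and (28) with $\lambda_w \asymp \sqrt{\log M / T}$, then the $\hat{w}_m$'s satisfy Assumption 3.2 when $T > C s \log M$.

Lemma 3.3. If the $\hat\Upsilon^{(m)}$'s are defined as in (29), where $\hat{w}_m$ satisfies (18) with probability at least $1 - c_1 \exp\{-c_2 \log M\}$, then

$$\Big\|(\Upsilon^{(m)})^{-1/2}\, \hat\Upsilon^{(m)}\, (\Upsilon^{(m)})^{-1/2} - I\Big\|_2 \le C \sqrt{\frac{\log M}{T}},$$

with probability at least $1 - c_1 \exp\{-c_2 \log M\}$, when $T > C s \log M$.

Note that Lemma 3.3 is stronger than Assumption 3.3. The proofs of these lemmas are deferred to Appendix A. By these lemmas and Theorems 3.1 and 3.2, we arrive at the following corollary.

Corollary 3.1. Under model (3) with i.i.d. sub-Gaussian noise $\epsilon_{ti}$ with parameter $\tau$, if $\hat{A} = \hat{A}^L$ or $\hat{A}^D$, $\hat{w}_m = \hat{w}_m^L$ or $\hat{w}_m^D$, and the $\hat\Upsilon^{(m)}$'s are defined as in (29) for $1 \le m \le k$ with $\lambda_A \asymp \lambda_w \asymp \sqrt{\log M / T}$, then if $(\rho \vee s) \log M = o(\sqrt{T})$ and $T > C$ for some constant $C > 0$, the bounds (20) to (24) from Theorems 3.1 and 3.2 hold.

3.4 Variance Estimation

In this section, we consider the case where $\sigma^2 = \operatorname{Var}(\epsilon_{ti})$ is unknown under model (3). Actually, if $\sigma$ is known, it is straightforward to extend Theorems 3.1 and 3.2 to $\hat{U}_T$ defined as follows:

$$\hat{U}_T = T \sum_{m=1}^{k} \hat{S}_m^\top (\hat\Upsilon^{(m)})^{-1} \hat{S}_m / \sigma^2. \tag{30}$$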

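This follows since, if we consider $Y_t = X_t / \sigma$, the time series data $Y_t$ satisfies the same model but with unit-variance noise. When $\sigma$ is unknown, we apply the estimator

$$\hat\sigma^2 = \frac{1}{MT} \sum_{t=0}^{T-1} \|X_{t+1} - \hat{A} \mathcal{X}_t\|_2^2, \tag{31}$$

and define the test statistic

$$\tilde{U}_T = T \sum_{m=1}^{k} \hat{S}_m^\top (\hat\Upsilon^{(m)})^{-1} \hat{S}_m / \hat\sigma^2. \tag{32}$$

We show that $\tilde{U}_T$ has the same convergence results as we derive for the unit-variance noise case.

Theorem 3.3. Consider the model (3) with i.i.d. sub-Gaussian noise $\epsilon_{ti}$ of variance $\sigma^2 = \operatorname{Var}(\epsilon_{ti}) \ge \sigma_0^2 > 0$ and scale factor $\tau\sigma$. Then Theorems 3.1 and 3.2 hold for $\tilde{U}_T$ under each corresponding condition, and the constants $C_i$ also depend on $\sigma_0$.

Theorem 3.3 shows that when we have to estimate the unknown $\sigma$, the test statistic $\tilde{U}_T$ maintains the same asymptotic behavior as $\hat{U}_T$ in the known-variance case, given that all the assumptions on the estimation errors are satisfied and $\sigma$ is lower bounded by some constant.

Remark 3.5. With sub-Gaussian noise $\epsilon_{ti}$, if we still assume the scale factor $\tau\sigma$ of $\epsilon_{ti}$ to be bounded by a constant, then Lemmas 3.1 to 3.3 still hold. Thus the assumptions imposed on the estimation errors of $\hat{A}_m$, $\hat{w}_m$ and $\hat\Upsilon^{(m)}$ are all satisfied. However, if we do not assume $\sigma$ to be bounded, then the tuning parameters $\lambda_A$ and $\lambda_w$ have to scale with $\sigma$.

Remark 3.6. Neykov et al. [2018] propose another estimator for the variance of $\epsilon_{ti}$, based on the fact that $\Sigma = A \Sigma A^\top + \operatorname{Cov}(\epsilon_t)$. Both these estimators are consistent and lead to convergence in distribution results.

3.5 Semi-parametric Optimal Confidence Region

In this section, we construct a confidence region for $\tilde{A}_D$ under model (3) with unknown noise variance $\sigma^2$. Similar to Ning et al. [2017], we consider the one-step estimator $\hat{a}_m$ for each $A_{m,D_m}$, based on the decorrelated score function:

$$\hat{a}_m = \hat{A}_{m,D_m}^\top - (\tilde\Upsilon^{(m)})^{-1} \tilde{S}_m, \tag{33}$$

where $\hat{A}_m$ is any estimator satisfying Assumption 3.1 on the error bounds for $\hat{A}_m - A_m$; both the Lasso and the Dantzig estimator for $A$ are suitable. $\tilde\Upsilon^{(m)}$ takes the form

$$\tilde\Upsilon^{(m)} = \frac{1}{T} \sum_{t=0}^{T-1} \big(\mathcal{X}_{t,D_m} - \hat{w}_m^\top \mathcal{X}_{t,D_m^c}\big) \mathcal{X}_{t,D_m}^\top, \tag{34}$$

which is another estimator for $\Upsilon^{(m)}$, and

$$\tilde{S}_m = -\frac{1}{T} \sum_{t=0}^{T-1} \big(X_{t+1,m} - \hat{A}_m \mathcal{X}_t\big) \big(\mathcal{X}_{t,D_m} - \hat{w}_m^\top \mathcal{X}_{t,D_m^c}\big).$$

We will show that $\hat{a}_m - A_{m,D_m}^\top$ is asymptotically Gaussian with covariance matrix $\frac{\sigma^2}{T} (\Upsilon^{(m)})^{-1}$. Thus we construct the following confidence region for $\tilde{A}_D$, with asymptotic confidence coefficient $1 - \alpha$:

$$CR_\alpha = \Big\{\theta = (\theta_1^\top, \dots, \theta_k^\top)^\top : \theta_m \in \mathbb{R}^{d_m},\ \frac{T}{\hat\sigma^2} \sum_{m=1}^{k} (\hat{a}_m - \theta_m)^\top \hat\Upsilon^{(m)} (\hat{a}_m - \theta_m) \le \chi^2_{d, 1-\alpha}\Big\}, \tag{35}$$

where $\chi^2_{d, 1-\alpha}$ is the $(1 - \alpha)$ quantile of the $\chi^2_d$ distribution. This is a $d$-dimensional elliptical ball with center vector $(\hat{a}_1^\top, \dots, \hat{a}_k^\top)^\top$. The following theorem shows the weak convergence result for

$$\hat{R}_T = \frac{T}{\hat\sigma^2} \sum_{m=1}^{k} (\hat{a}_m - A_{m,D_m}^\top)^\top \hat\Upsilon^{(m)} (\hat{a}_m - A_{m,D_m}^\top). \tag{36}$$

Theorem 3.4. Under model (3) with i.i.d. sub-Gaussian noise $\epsilon_{ti}$ with variance $\sigma^2 = \operatorname{Var}(\epsilon_{ti}) \ge \sigma_0^2 > 0$ and sub-Gaussian parameter $\tau\sigma$, Theorems 3.1 and 3.2 hold for $\hat{R}_T$ under each corresponding condition, and the constants $C_i$ also depend on $\sigma_0$.

Remark 3.7. In the definition of the one-step estimator $\hat{a}_m$, we use $\tilde\Upsilon^{(m)}$ instead of $\hat\Upsilon^{(m)}$ for theoretical convenience. Theorem 3.4 would still hold true if $\hat{a}_m$ were defined as $\hat{A}_{m,D_m}^\top - (\hat\Upsilon^{(m)})^{-1} \tilde{S}_m$.

Remark 3.8. We have exactly the same theoretical result for $\tilde{U}_T$ and $\hat{R}_T$, and this is due to the close relationship between these two quantities. In particular,

$$\hat{R}_T = T \sum_{m=1}^{k} \hat{S}_m^\top (\tilde\Upsilon^{(m)})^{-\top} \hat\Upsilon^{(m)} (\tilde\Upsilon^{(m)})^{-1} \hat{S}_m / \hat\sigma^2, \quad \text{compared to} \quad \tilde{U}_T = T \sum_{m=1}^{k} \hat{S}_m^\top (\hat\Upsilon^{(m)})^{-1} \hat{S}_m / \hat\sigma^2.$$

We show in the proof of Theorem 3.4 that $\tilde\Upsilon^{(m)} (\hat\Upsilon^{(m)})^{-1} (\tilde\Upsilon^{(m)})^\top$ also satisfies Assumption 3.3 as an estimator for $\Upsilon^{(m)}$.

Remark 3.9. The one-step estimator $\hat{a}_m$ is asymptotically unbiased, and shares a similar form with the de-biased estimator proposed by Zhang and Zhang [2014] and Van de Geer et al. [2014]. The de-biased estimator in Van de Geer et al. [2014] would take the following form under our setting:

$$\hat{b}_m = \hat{A}_{m,D_m}^\top + \hat\Theta_{D_m, \cdot}\, \frac{1}{T} \sum_{t=0}^{T-1} \mathcal{X}_t \big(X_{t+1,m} - \mathcal{X}_t^\top \hat{A}_m^\top\big),$$

where $\hat\Theta$ is computed by node-wise regression, as an estimator for $\Upsilon^{-1}$. When $d = |D| = 1$, this is essentially the same as our estimator $\hat{a}_m$, but it is slightly different in the multivariate case. Note that the asymptotic covariance matrix of $\hat{a}_m$ equals the inverse of the partial information matrix $I_{A_{m,D_m} \mid A_{m,D_m^c}}$, and thus $\hat{a}_m$ is semi-parametric efficient, while $\hat{b}_m$ is only efficient when it is a scalar.

Remark 3.10. $\hat{R}_T$ is also very similar to the test statistic proposed by Neykov et al. [2018] for the VAR model with lag 1. The only difference lies in the estimation of $\operatorname{Var}(\epsilon_{ti})$, and they only consider the Dantzig selector for estimating $A$ and $w_m$. We provide a detailed comparison between their theoretical result and ours in Section 3.6.

3.6 Special case: AR(1) with Gaussian noise

Our theoretical guarantee covers VAR models with lag $p$ and sub-Gaussian noise, of which the AR(1) model and Gaussian noise are special cases. Here we explain the consequences of our result in this special case and provide a comparison with Neykov et al. [2018]. When we consider lag $p = 1$, the constraint on $A$ becomes

$$\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \|A^{i+j}\|_2 \le \beta, \qquad \max_i \rho_i(A) \le \rho, \qquad \max_m s_m(A) \le s,$$

with $(\rho \vee s) \log M = o(\sqrt{T})$. The two sparsity conditions and the sample size requirement are included in the conditions Neykov et al. [2018] propose. In addition, they assume three further conditions for some $0 < \varepsilon < 1$: a norm bound of the form $\|A\| \le C$, the bound $\|A\|_2 \le 1 - \varepsilon$, and a norm bound of the form $\|\Sigma\| \le C$ (the norm subscripts of the first and third conditions, and any inverse, are not legible in the source). Note that we do not require these conditions, among which the first and third are quite strong, while the second one, $\|A\|_2 \le 1 - \varepsilon$, is sufficient for our condition $\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \|A^{i+j}\|_2 \le \beta$. This follows since, if $\|A\|_2 \le 1 - \varepsilon$,

$$\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \|A^{i+j}\|_2 \le \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \|A\|_2^{i+j} \le \sum_{i=0}^{\infty} \frac{(1 - \varepsilon)^i}{\varepsilon} = \frac{1}{\varepsilon^2}.$$

Until now the discussion has focused on the case where the $\epsilon_{ti}$ are i.i.d. sub-Gaussian noise with scale factor $C\sigma$, with $\sigma^2$ being the variance of $\epsilon_{ti}$ and lower bounded by some constant. Thus our setting covers the case where $\epsilon_t \sim N(0, \sigma^2 I)$ with $\sigma \ge c$. If $\epsilon_t \sim N(0, \Psi)$ with $\Psi_{ii} \ge c$ as assumed in Neykov et al. [2018], we can still prove the same theoretical guarantee, under an even weaker condition based on the spectral density, due to the established concentration bounds in Basu et al. [2015].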
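To make the plug-in construction of this section concrete, the following is a minimal sketch for the lag-one model and a single tested row. It fits the row of $A$ and the decorrelation weights by the Lasso, then forms the decorrelated score, the sample partial covariance and the variance-normalized statistic. Several choices here are assumptions of this sketch rather than the paper's procedure: the Lasso is solved by plain proximal gradient (ISTA) via the hypothetical helper `lasso_ista`, the tuning parameters are set to $\sqrt{\log M / T}$ with constant one instead of being cross-validated, the variance estimate uses only the residuals of the tested row, and the simulated series starts at zero rather than at stationarity.

```python
import numpy as np

def soft_threshold(Z, lam):
    """Entrywise soft-thresholding, the proximal map of lam * ||.||_{1,1}."""
    return np.sign(Z) * np.maximum(np.abs(Z) - lam, 0.0)

def lasso_ista(X, Y, lam, n_iter=500):
    """Minimize (1/T) ||Y - X B||_F^2 + lam * ||B||_{1,1} by proximal gradient.
    X: (T, q) design, Y: (T, r) responses; returns B with shape (q, r)."""
    T = X.shape[0]
    G = X.T @ X / T
    c = X.T @ Y / T
    step = 1.0 / (2.0 * np.linalg.eigvalsh(G)[-1])   # 1 / Lipschitz constant
    B = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = 2.0 * (G @ B - c)                     # gradient of the squared loss
        B = soft_threshold(B - step * grad, step * lam)
    return B

def decorrelated_test_stat(X, m, D, lam_A, lam_w):
    """Variance-normalized plug-in statistic for H0: A[m, D] = 0 in
    X_{t+1} = A X_t + eps_t. X has shape (T + 1, M) and holds X_0, ..., X_T."""
    past, future = X[:-1], X[1:, m]
    T, M = past.shape
    D = np.asarray(D)
    Dc = np.setdiff1d(np.arange(M), D)
    A_m = lasso_ista(past, future[:, None], lam_A)[:, 0]   # Lasso fit of row m
    w = lasso_ista(past[:, Dc], past[:, D], lam_w)         # decorrelation weights
    resid_x = past[:, D] - past[:, Dc] @ w                 # X_{t,D} - w^T X_{t,D^c}
    resid_y = future - past[:, Dc] @ A_m[Dc]               # residual under H0
    S_hat = -(resid_x * resid_y[:, None]).mean(axis=0)     # decorrelated score
    Ups_hat = resid_x.T @ resid_x / T                      # sample partial covariance
    sigma2_hat = np.mean((future - past @ A_m) ** 2)       # row-m variance estimate
    return T * S_hat @ np.linalg.solve(Ups_hat, S_hat) / sigma2_hat

# Usage on a simulated symmetric chain with Uniform(-1, 1) noise.
rng = np.random.default_rng(0)
M, T = 10, 300
A = np.zeros((M, M))
idx = np.arange(M - 1)
A[idx, idx + 1] = 0.3
A[idx + 1, idx] = 0.3                 # spectral norm below 0.6
X = np.zeros((T + 1, M))
for t in range(T):
    X[t + 1] = A @ X[t] + rng.uniform(-1.0, 1.0, size=M)
lam = np.sqrt(np.log(M) / T)
stat = decorrelated_test_stat(X, m=0, D=[4, 7], lam_A=lam, lam_w=lam)
```

In this example the tested entries of row 0 are truly zero, so `stat` would be compared with a $\chi^2_2$ quantile; for two degrees of freedom the 0.95 quantile is $-2 \log 0.05 \approx 5.99$.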

17 4 Nuerical Experients In this section, we provide a siulation study to validate our theoretical results For siplicity, our siulation is based on the AR odel: X t+ = A X t + ɛ t, t = 0,,, 37 where A R M M is set to be row-wise sparse Syetricity is not required in our theory, but in order to ensure the sparsity of w, we focus on syetric atrices under H 0, and slightly asyetric ones under H A he eigenvalues of A all fall in the unit circle of the coplex plane, which ensures the existence of stationary solution to this odel White noise ɛ ti is siulated as independent Unifor, in order to satisfy the sub-gaussianity condition Other distributions were also used but not reported since the results were very siilar o consider ulti-variate test sets, throughout the siulation we test the index set D with d = D = 6, which involves three different rows and two coluns in each row: D = {, 3,, 5, 3, 3, 3, 4, 5, 4, 5, 8} he null hypothesis takes the for H 0 : ÃD = µ with soe d-diensional vector µ Correspondingly, we consider alternative hypothesis H A : ÃD = µ + φ, with randoly selected fro d-diensional Gaussian distribution, and φ ranges fro 05 to Under H 0, we generate A with different row-wise sparsity levels and structures, and for each A, vector µ ay differ depending on the corresponding ÃD Under H A, A are still the sae atrices as under H 0, but only adding the tested indices ÃD by φ he experients are repeated under different settings of A,, M, and φ We use Lasso estiators defined in 5, 7 for the estiation of A and w, k, and tuning paraeters λ A, λ w are selected using cross validation In cross validation, the training sets are coposed of consecutive tie series data, with the reaining 0% of the original data set being testing sets Under H 0, 000 siulations are carried out under each paraeter setting, while under H A, we have 00 siulations In the following sections, we look into false positive rates FPR and true positive rates PR of test statistics Ũ and R as defined in 3 
and 36, when we set the level of test as α = Under the Null Hypothesis Varying sparsity Here we suarize the experients with randoly generated A, that are syetric and row-wise sparse, with different sparsity levels ρ defined in 0 Figure shows how FPR of Ũ and R averaged over 000 experients vary with We can see that when 7

increases to about 500, the FPR becomes stable and close to $\alpha = 0.05$ regardless of $\rho$, $M$, and the choice between $\tilde U$ and $R$. When the sample size is small, the test tends to be conservative, which is a consequence of estimating the variance $\sigma^2$ and the covariances $\Upsilon$. In the simulation we use naive estimators for these two quantities, as defined in 3 and 9, which tend to be smaller than the true parameters. This is because we usually fit noise in the regression, as noticed by Fan et al. [2012]. As shown in these two figures, $R$ is less conservative than $\tilde U$ when $T$ is small, since the covariance estimator it uses is probably a better estimator for $\Upsilon$. We also summarize the FPR when the variance $\sigma^2$ of $\epsilon_{ti}$ is known in Figure 2. We can see from these figures that $\hat U$ is still a little conservative when $T$ is small, while $R$ with $\hat\sigma$ substituted by $\sigma$ is not conservative.

Figure 1: False positive rate (FPR) of $\tilde U$ and $R$ vs. $T$, with various dimensions $M$ and sparsity levels $\rho$. The red line is the significance level $\alpha = 0.05$.

Different Graph Structures. If we consider the $M$ actors in the time series as nodes in a network, and a nonzero $A_{ij}$ represents a directed edge from $j$ to $i$, then each matrix $A$ corresponds to an $M$-dimensional directed graph. We experiment with different structures of $A$, which also correspond to different graph structures, including a block graph and a chain graph. Specifically, we consider
matrices with norm equal to 0.75: $A_1$, a block-diagonal matrix with nonzero entries $1/4$ and $1/2$, which is a block graph; $A_2$, a banded matrix whose nonzero entries all equal a constant $c$ chosen to ensure $\|A_2\| = 0.75$, which is a chain graph; and $A_3$, a randomly generated symmetric matrix with fixed sparsity level $\rho$ and largest eigenvalue equal to 0.75.

Figure 2: FPR of $\tilde U$ and $R$ when the residual variance is known.

Figure 3 shows the difference among these three structures. We can see
that the block graph is less accurate than the other two, which is due to a larger variance for each $X_{t,D} - w^\top X_{t,D^c}$. Investigating how graph structure theoretically influences testing performance remains an open and interesting direction.

Figure 3: FPR under different graph structures. Block refers to $A_1$, chain refers to $A_2$ and random refers to $A_3$.

4.2 Alternative Hypothesis

First we look into how the true positive rate (TPR) varies with the size of the deviation from the null: since we set $H_A$ as $\tilde{A}_D = \mu + T^{-\phi}\Delta$, the magnitude of $T^{-\phi}\Delta$ may be viewed as a measure of distance from the null hypothesis. Figure 4 only presents the simulation results for one of the two structured choices of $A$ (block or chain) and $M = 300$; the other choices of $A$ and $M$ generate very similar results. We can see from these two figures that as this distance increases, the TPR approaches 1. The slope increases when the sample size gets larger, or when the test statistic changes from $R$ to $\tilde U$. This aligns with intuition, since when $T$ increases we are supposed to distinguish between $H_0$ and $H_A$ better, and $\tilde U$ is more conservative than $R$, as we show in subsection 4.1.

We also check the influence of $\phi$. Figure 5 reveals how the TPR changes when $T$ increases, with $\Delta$ and $\phi$ fixed. If $\phi < 0.5$, the TPR converges to 1 very quickly, while if $\phi > 0.5$, the TPR converges to 0.05, at a speed that depends on $\phi$ and on the size of the deviation. When $\phi = 0.5$, Theorems 3.3 and 3.4 state that $\tilde U$ and $R$ converge to a noncentral $\chi^2_d$ distribution, so the TPR should converge to some value between 0.05 and 1, depending on $d$ and the noncentrality parameter. The black lines in Figure 5 indicate this convergence value, but since the test tends to be conservative when $T$ is not large enough, the TPR when $\phi = 0.5$ is usually above the black line. The conservative issue is more severe under $H_A$ since the deviation is also multiplied by the estimated variances, which exaggerates
the conservative tendency. However, this may not be a big concern under $H_A$, since we always want the TPR to be large.

Figure 4: True positive rate (TPR) of $\tilde U$ and $R$, for one structured choice of $A$ and $M = 300$.

5 Proof Overview

One of the main contributions of this work is the proof technique, which addresses a number of technical challenges and develops novel concentration bounds for dependent sub-Gaussian random vectors. In this section, we present and discuss the key lemmas for the proof and provide the main steps for proving Theorems 3.1 and 3.2, deferring the more technically intensive steps to the supplement.

5.1 Key Lemmas

The major technical challenge lies in proving the following two concentration bounds for dependent sub-Gaussian random vectors.

Lemma 5.1 (Deviation Bound for $A$). Under model 3, when $\epsilon_{ti}$ are sub-Gaussian noise with scale factor $\tau$, and $A \in \Omega_0 \cup \Omega_1$,

$$P\left( \left\| \frac{1}{T} \sum_{t} \epsilon_t X_t^\top \right\|_\infty > C \sqrt{\frac{\log M}{T}} \right) \le c_1 \exp\{-c_2 \log M\}$$

when $T \ge C \log M$.

Lemma 5.1 is a standard deviation bound for proving estimation error bounds of Lasso-type or Dantzig-selector-type estimators. We apply this lemma both in the proof of Theorems 3.1, 3.2 and Lemma 3.
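As a rough empirical illustration of the deviation bound in Lemma 5.1, the sketch below simulates the AR(1) model of Section 4 with a chain-structured transition matrix and bounded (hence sub-Gaussian) uniform noise, then compares the largest entry of $\frac{1}{T}\sum_t \epsilon_t X_t^\top$ with $\sqrt{\log M / T}$. The function names, the value $c = 0.375$ (which makes every absolute row sum at most 0.75), the unit-variance Uniform$[-\sqrt{3}, \sqrt{3}]$ noise, the burn-in length, and the values of $M$ and $T$ are all assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def chain_matrix(M, c=0.375):
    """Symmetric chain graph: A[i][i-1] = A[i][i+1] = c, zero elsewhere.

    Hypothetical choice: with c = 0.375 every absolute row sum is <= 0.75,
    so the spectral radius is below 1 and the AR(1) process is stable.
    """
    A = [[0.0] * M for _ in range(M)]
    for i in range(M - 1):
        A[i][i + 1] = c
        A[i + 1][i] = c
    return A


def simulate_deviation(M=10, T=2000, burn_in=200, seed=0):
    """Simulate X_{t+1} = A X_t + eps_t and return (deviation, rate).

    deviation = max_{i,j} |(1/T) sum_t eps_{t,i} X_{t,j}|
    rate      = sqrt(log(M) / T)
    """
    rng = random.Random(seed)
    A = chain_matrix(M)
    half_width = math.sqrt(3.0)  # Uniform[-sqrt(3), sqrt(3)] has unit variance
    x = [0.0] * M
    cross = [[0.0] * M for _ in range(M)]  # running sum of eps_t X_t^T
    for t in range(burn_in + T):
        eps = [rng.uniform(-half_width, half_width) for _ in range(M)]
        if t >= burn_in:
            for i in range(M):
                for j in range(M):
                    cross[i][j] += eps[i] * x[j]
        # eps_t is drawn independently of X_t and then drives X_{t+1}
        x = [sum(A[i][j] * x[j] for j in range(M)) + eps[i] for i in range(M)]
    deviation = max(abs(v) for row in cross for v in row) / T
    rate = math.sqrt(math.log(M) / T)
    return deviation, rate


if __name__ == "__main__":
    dev, rate = simulate_deviation()
    print(f"deviation = {dev:.4f}, sqrt(log M / T) = {rate:.4f}, "
          f"ratio = {dev / rate:.2f}")
```

A single run like this only illustrates the scaling; it says nothing about the constants in the lemma, and a different noise law or matrix structure may behave differently.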

Figure 5: TPR of $\tilde U$ and $R$ with the alternative and $A$ fixed. Results for different graph sizes $M$ from 30 to 300 are combined and the average TPR is taken. The red line is the significance level $\alpha$, the value that the TPR should converge to when $\phi > 0.5$, while the black line is the convergence point specified in Theorem 3.2 when $\phi = 0.5$.

Lemma 5.2. Under model 3, when $\epsilon_{ti}$ are sub-Gaussian noise with constant scale factor $\tau$, and $A \in \Omega_0 \cup \Omega_1$, if $B \in \mathbb{R}^{pM \times pM}$ is a symmetric matrix, then for any $\delta > 0$ the probability

$$P\left( \left| \frac{1}{T} \sum_{t} X_t^\top B X_t - \mathrm{tr}(B\Upsilon) \right| > \delta \right)$$

is bounded by an exponential term of the form $c \exp\{-c \min\{\cdot, \cdot\}\}$, where the minimum is taken over one term in $\delta$ and one term in $\delta^2$, each normalized by norms and traces of $B$.

Lemma 5.2 provides a concentration bound for the sample average of a general quadratic form $X_t^\top B X_t$, and is very helpful in proving the martingale CLT under our setting, the REC, Lemma 3.3, etc.

In the Gaussian case, both of these lemmas follow from prior work in Basu et al. [2015], which relies on the fact that dependent Gaussian vectors can be rotated to be independent. Since dependent sub-Gaussian random variables cannot be rotated to be independent (only uncorrelated), we exploit the independence of the $\epsilon_t$ by representing each $X_t$ as a linear function of the infinite series $\{\epsilon_i\}_{i=-\infty}^{t-1}$ and then use a careful truncation argument. We analyze sufficiently many terms in the summation, and control the infinite residues.

5.2 Proof of Theorem 3.1

Proof. Suppose $A \in \Omega_0$. We will use $C_i$, $c_i$ to refer to constants that only depend on $p$, $d$, $\beta$, $\tau$ (not on $M$ or $T$), and different constants might share the same notation. The proof can be divided into two major parts: showing the convergence of $U$ to $\chi^2_d$, and
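bounding the estimation error $|\hat U - U|$. Formally, for any $\varepsilon > 0$,

$$\begin{aligned}
P(\hat U \le x) - F_d(x) &\le P(U \le x + \varepsilon) + P(|\hat U - U| > \varepsilon) - F_d(x) \\
&\le |P(U \le x + \varepsilon) - F_d(x + \varepsilon)| + F_d(x + \varepsilon) - F_d(x) + P(|\hat U - U| > \varepsilon),
\end{aligned}$$

and

$$\begin{aligned}
F_d(x) - P(\hat U \le x) &= P(\hat U > x) - (1 - F_d(x)) \\
&\le P(U > x - \varepsilon) + P(|\hat U - U| > \varepsilon) - (1 - F_d(x)) \\
&\le |P(U \le x - \varepsilon) - F_d(x - \varepsilon)| + F_d(x) - F_d(x - \varepsilon) + P(|\hat U - U| > \varepsilon),
\end{aligned}$$

which implies

$$|P(\hat U \le x) - F_d(x)| \le \sup_{y \in \mathbb{R}} |P(U \le y) - F_d(y)| + F_d(x + \varepsilon) - F_d(x - \varepsilon) + P(|\hat U - U| > \varepsilon). \qquad (38)$$

In the following, we provide bounds on each of the three terms. The following lemma shows the uniform weak convergence rate of $\|V + \mu\|_2^2$ to $\chi^2_{d, \|\mu\|_2^2}$, of which the convergence of $U = \|V\|_2^2$ to $\chi^2_d$ is a special case.

Lemma 5.3 (Convergence Rate of $\|V + \mu\|_2^2$). Under model 3 with $\epsilon_{ti}$ being sub-Gaussian noise of scale factor $\tau$, for any $A \in \Omega_0$ and $\mu \in \mathbb{R}^d$,

$$\sup_{x \in \mathbb{R}} \left| P(\|V + \mu\|_2^2 \le x) - F_{d, \|\mu\|_2^2}(x) \right| \le C_\mu T^{-1/8}, \qquad (39)$$

when $T > C$ for some absolute constant $C$, where $C_\mu$ is a constant that is non-decreasing with respect to $\|\mu\|_2$.

This lemma is proved in Section C, by applying a uniform martingale central limit theorem result. Thus, by Lemma 5.3, if $T > C$ for some constant $C$,

$$\sup_{y \in \mathbb{R}} |P(U \le y) - F_d(y)| \le C T^{-1/8}.$$

Meanwhile, $F_d(x + \varepsilon) - F_d(x - \varepsilon) \le C\varepsilon$, since $\chi^2_d$ has bounded density.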
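The bounded-density step can be checked numerically. For $d \ge 2$ the $\chi^2_d$ density $f_d$ is bounded, so by the mean value theorem $F_d(x + \varepsilon) - F_d(x - \varepsilon) \le 2\varepsilon \sup_y f_d(y)$. The sketch below verifies this for $d = 6$, the size of the test set used in Section 4, using the closed-form CDF available for even $d$. The function names are hypothetical, and the restriction to even $d$ is a simplification made here to avoid external libraries.

```python
import math


def chi2_cdf_even(x, d):
    """CDF of chi-square with even d >= 2 degrees of freedom.

    Uses F_d(x) = 1 - exp(-x/2) * sum_{j=0}^{d/2-1} (x/2)^j / j!.
    """
    if d < 2 or d % 2 != 0:
        raise ValueError("this sketch only handles even d >= 2")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, d // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_pdf(x, d):
    """Density of chi-square with d degrees of freedom (0 for x <= 0)."""
    if x <= 0:
        return 0.0
    k = d / 2.0
    return math.exp((k - 1) * math.log(x) - x / 2.0
                    - k * math.log(2.0) - math.lgamma(k))


def density_sup(d):
    """Supremum of the density for even d >= 2 (mode at d - 2 when d > 2)."""
    return 0.5 if d == 2 else chi2_pdf(d - 2, d)


def window_mass(x, eps, d):
    """F_d(x + eps) - F_d(x - eps), the second term in the bound (38)."""
    return chi2_cdf_even(x + eps, d) - chi2_cdf_even(x - eps, d)


if __name__ == "__main__":
    d = 6
    for eps in (0.01, 0.1, 0.5):
        worst = max(window_mass(0.1 * i, eps, d) for i in range(400))
        print(f"eps = {eps}: max window mass = {worst:.5f}, "
              f"bound 2*eps*sup f = {2 * eps * density_sup(d):.5f}")
```

For $d = 1$ the density is unbounded near zero, so this particular argument would need modification; the check above does not cover that case.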

24 Now we only need to choose a proper ε and bound P Û U > ε Û U k Ŝ Υ Ŝ V, = k Ŝ Υ Υ Ŝ + Υ Ŝ V, k Υ Υ I Υ Ŝ = Υ = Define E = Υ + Υ Ŝ S + V, Υ Ŝ S Ŝ S, then 40 turns into 40 Û U k E + V, E = + Υ We can bound V, using Lea 53 and Υ 4 Υ I V, + E Υ Υ Υ I using Lea 9, while for bounding the estiation induced error E, we first apply the following lea to bound the eigenvalues of Υ Lea 54 Consider the odel with independent noise ɛ ti of unit variance, A satisfies 3, then the eigenvalues of Υ can be bounded as follows: 0 < C β Λ in Υ Λ ax Υ C β Lea 54 is proved based on established results in Basu et al [05] Note that we assued unit variance in heore 3 and 3, so we can apply Lea 54 here Since Υ = Υ D,D, applying Lea 54 would lead us to the following: Λ in Υ Λ in Υ = Λ ax Υ C, Λ ax Υ Λ ax Υ = Λ in Υ C 4 hus we have E C Ŝ S, 4

25 with Ŝ S =ŵ w X t,d c ɛ t, + X t,d w X t,d c Xt,D  c D A c D c ŵ w X t,d c Xt,D c  D c A D c he following two leas provide bounds for X t,d c ɛ t,, and X t,d w X t,d c X t,d c 43 Lea 55 When C log M, P ɛ t Xt log M > C c exp{ c log M} Lea 5 is a coon condition in high-diensional regression probles, and is usually referred to as deviation bound We will prove it in Section C Lea 56 Deviation Bound for w With probability at least c exp{ c log M}, for all k, X t,d w X t,d c X t,d c log M C Lea 56 can also be viewed as a deviation bound, if we consider a regression proble with X t,d as response and X t,d c as covariates his is also proved in Section C Applying Assuptions 3 and 3, with probability at least c exp{ c log M}, E C s ρ log M + Q Q C s ρ log M, where Q =  D c Q =tr A D c [ŵ w X t,d c X t,d c X t,d c X t,d c ŵ w  ], D c A D c 5

26 ρ log M and Assuption 3 and 3 iplies Q C and Q C straightforward: to see why it holds true, let ĥ = Â A and H = have Q = = [ ] Xt,D ĥ c D c [ Xt ĥ Xt,D ĥ D ] s log M [ ] Xt ĥ + Xt,D ĥ D =ĥ Hĥ + ĥ H D,D ĥ D D C ρ log M Here we apply Assuption 3, and the fact that ĥ D H D,D ĥ D d H ĥ d H Υ + Λ ax Υ ρ log M C ρ log M he last inequality is due to Lea 54 and the following lea: Lea 57 With probability at least c exp{ c log M}, X t Xt log M Υ C herefore, by taking a union bound, we show that E C s ρ log M, he forer is not X txt, then we 44 for any k, with probability at least c exp{ c log M} Meanwhile, by applying Lea 53, one can show that for y > 5d, P V, > y C 8 + Fd y C 8 + exp{ y d/4} C 8 + Cy, 45 6

27 where the second inequality is due to a χ d tail bound established in Laurent and Massart [000] see Lea in Laurent and Massart [000], and the third inequality coes fro the fact that, constant C > 0, constant C such that Let y = s ρ log M 4 probability at least the following holds: sup y e C y C y 0 and plug it into 4, then with Assuption 33, we can show that with c exp{ c log M} c 3 s ρ log M 8 c4, Û U C s ρ log M C s ρ log M, s ρ log M 3 s ρ log M 4 + C if s ρ log M = o and > C for soe constant C ε = C s ρ log M, herefore, applying 38 with PÛ x F d x C s ρ log M 8 + C + C3 exp{ c log M} Since constants C i only depend on d, β and τ, this bound also holds for supreu over A Ω 0 and x R Note that for a clear presentation, we are not showing the sharpest bound, which can be obtained by choosing a different y 53 Proof of heore 3 proof of heore 3 We prove this case by case We will use C i, c i to refer to constants that only depend on d, β,, φ, and different constants ight share the sae notation Siilar fro the proof of heore 3, the ajor part of the proof is devoted to bounding Û V + µ with high probability for soe vector µ R d φ = Suppose A Ω Using siilar deduction as in the proof of heore 3, for any ε > 0, PÛ x F d, x sup P V y F d, y y R 46 + F d, x + ε F d, x ε + P Û V > ε 7

28 a Bounding the first two ters he first ter is the convergence rate of V to χ d, By Lea 53, sup P V y F d, y C 8 C 8 y R he last inequality is due to = k = and an upper bound for Λ ax Υ in 4 k Λ ax Υ, = Bounding the second ter in 46 is not straightforward as bounding F d x + ε F d x ε in the proof of heore 3, since is not aconstant vector when A takes different values in Ω We only have a unifor bound of as shown above One can show that Z F d, x + ε F d, x ε = P + x ε, x + ε] Cd x + ε d x ε d e x ε /, x ε Cd x + ε d x ε d,, x ε < where Z is a d-diensional standard Gaussian rando vector with density φz = Cd exp{ z /} he last inequality holds because that, for any set C R d, P Z C sup φz dz z C Suppose 0 < ε, then if x ε, x + ε d d x ε otherwise, hus, b Bounding Û V exp dεx + ε d exp{ x ε/8} z C dεe ε 4 sup y d exp{ y/8} Cdε, y 0 x + ε d d x ε { x ε } / dεx + ε d Cdε F d, x + ε F d, x ε Cdε 8

29 Siilar fro 4 in the proof of heore 3, it is straightforward to show that Û V k E + V, E = + Υ Υ Υ I V, + E, where E = Υ Ŝ V, + o bound E, note that 47 and V, = Υ S = Υ S Υ A D, S Υ A D = with S R d [ t = S + W X t,d w X t,d c X t,d Υ ] A D X t+, A D c X t,d c X t,d w X t,d c and W R d M defined as follows: S = t X t X t,d Υ,D A D, X t+, A D c X t,d c X t,d w X t,d c, W,D = I d d, W,D c = w 48 herefore, E Υ Ŝ S + Υ W X t Xt,D Υ,D t C Ŝ S + C d ax W i i X t Xt he last inequality applies 4 Meanwhile, t Υ ax i W i = + ax w i i + ax i s w i + s Λ in Υ D c,d c ax i + C s ax Υ ii i + C s Λ ax Υ C s Υ i 49 9

30 he first equality and second inequality coe fro the definition of W and w ; the third inequality is because that Υ i = Υ ii ; the fourth inequality is due to that Υ ii = e i Υ e i Λ ax Υ ; and the last inequality is obtained fro Lea 54 Applying Lea 57 leads us to We can write Ŝ S as E C Ŝ S =ŵ w s log M + C Ŝ S X t,d c ɛ t, + X t,d +  A D D c c X t,d c X t,d w X t,d c  A D D c c X t,d c Xt,D c ŵ w Note that X t,d c Xt,D X t Xt Υ + Υ t log M C + Υ C, due to Lea 54 and 57, which further iplies ŵ w X t,d c Xt,D C ŵ w 50 Applying Assuption 3 to 33, Lea 5, 56, one can show that with probability at least c exp{ c log M}, E C s ρ log M, 5 with the sae arguents as bounding Ŝ S under H 0 While for V, Υ, applying Lea 53 leads us to V, P Υ > y C 8 + Fd, y =C 8 + P Z + > y C 8 + P Z > y C, 30

31 for any y 0, where Z N 0, I d We apply the tail bound for χ d Lea in Laurent and Massart [000] as in 45, and obtain P Z > y C C y C Cy, when y > C for soe constant C Let y = y, 5 and 9 into 47, one can show that Û V s ρ log M 4 3 s ρ log M 4 s ρ log M C + C C with probability at least s ρ log M,, and plug V, Υ s ρ log M c exp{ c log M} c 3 s ρ log M 8 c4, if s ρ log M = o and > C herefore, applying 46 with ε = C PÛ x F d x s ρ log M leads to C s ρ log M 8 + C + C3 exp{ C 4 log M} Since constants C i only depend on d, β,, τ, this bound also holds for supreu over A Ω and x R 0 < φ < First we provide a lower bound for Û with high probability Since bounds in Assuption 3 to 33, Lea 5 to 57 hold with probability at least c exp{ c log M}, we apply these bounds directly in following deduction Meanwhile, we always assue ρ s log M = 3

32 o and > C for desired constant C With these conditions, one can show that Û = k Ŝ Υ Ŝ = k Υ Ŝ Υ d Υ Υ I = C C k = Υ Ŝ k Υ Ŝ S = V he third line is due to Assuption 33, which iplies Υ Υ Υ I con- verges to 0 under our scaling ρ s log M = o We provide a lower bound for Υ Ŝ S in the following First write Ŝ S as Ŝ S =ŵ w ɛ t, X t,d c + X t,d ŵx t,d c Xt,D A D X t,d ŵ X t,d c X t,d c  D c A D c E + E + E 3, we find the upper bounds for E, E 3 and lower bound for E in the following Applying Assuption 3 and Lea 5 provides an upper bound for E : Since E 3 E ŵ w X t ɛ t s log M ŵ w X t,d c Xt,D c  D c A D c + d X t,d w X t,d c Xt,D c  D c A D c, then using the sae arguent as bounding Ŝ S when proving heore 3, we have E 3 C s ρ log M 3 5

33 o lower bound E, first note that X t,d ŵ X t,d Xt,D c Υ ax W i i X t Xt Υ + ŵ w X t X t log M Cs, where we apply 49, Lea 57, Assuption 3, and bound X txt using the sae arguent as in 50 hus, E φ Υ Cs log M φ C φ, since is a constant vector, and Λ in Υ is lower bounded by constant as in 4 Applying these bounds for E i, i 3, one can show that, 53 k Υ Ŝ S = k = C φ C s ρ log M C φ Plug this into 5 and apply Lea 53, we have PÛ x C exp{ c log M} + P V C φ C x C exp{ c log M} + C 8 + Fd C 3 φ C 4 x C exp{ c log M} + C 8 + C3 exp{ C 3 φ C 4 x }, where in the last line we apply the χ d tail bound as in 45 Since the constants here only depend on d, β,, τ, this bound holds when taking supreu over A Ω and x R 3 φ > he proof of this case is siilar to that of heore 3 he only thing different lies in the choice of ε and bounding P Û U > ε he bound 4 for Û U still holds here, with E = Υ Ŝ S We directly apply the bounds in Assuptions 33

34 3 to 33, and Lea 5 to Lea 57 in the following First we write Ŝ S =ŵ w X t,d c ɛ t, + X t,d w X t,d c Xt,D  c D A c D c ŵ w X t,d c Xt,D c  D c A D c +φ X t,d ŵx t,d c Xt,D Note here that the first three ters are exactly the sae as in 43, and thus can be bounded as in the proof of heore 3 We only have to tackle the last ter By 53, one can show that, X t,d ŵ X t,d c X t,d hus, going through the sae arguents as bounding Υ + Cs log M E C s ρ log M + C φ, C, Ŝ S under H 0, we have with probability at least C exp{ c log M} Recall that in 45, when y > C for soe constant C, P V, y C 8 + C y Let y = s ρ log M 4 φ 6, then by 4 one can show that Û U C s ρ log M s ρ log M C + C φ 3, with probability at least s ρ log M 3 s ρ log M 4 + C + C3 φ 3 c exp{ c log M} c 3 s ρ log M 8 c4 c5 φ 3, if s ρ log M = o and > C for soe constant C herefore, applying 38 with ε = C s ρ log M + C φ 3, PÛ x F d x 34

C s ρ log M 8 + C + C3 φ 3 + C 4 exp{ C 5 log M}

Since the constants $C_i$ only depend on $d$, $\beta$, $\tau$, this bound also holds for the supremum over $A \in \Omega_1$ and $x \in \mathbb{R}$.

6 Conclusion

In this paper, we have provided theoretical guarantees for hypothesis tests for sparse high-dimensional auto-regressive models with sub-Gaussian innovations. Specific upper bounds for the convergence rates of the test statistics are given. Importantly, our results go beyond the Gaussian assumption and do not rely on mixing assumptions. As a consequence of our theory, we also develop novel concentration bounds for quadratic forms of dependent sub-Gaussian random variables using a careful truncation argument. It would be of interest to consider other variance estimation methods, e.g., the scaled Lasso (Sun and Zhang [2012]) or the cross-validation based method of Fan et al. [2012], and to establish the corresponding theoretical guarantees. There also remain a number of open questions and challenges, including extensions to generalized linear models, heavy-tailed innovations, and incorporating hidden variables in the time series setting.

Acknowledgements

We would like to thank both Sumanta Basu and Yiming Sun for useful discussions and comments. LZ and GR were supported by ARO W9NF and NGA HM. GR was also supported by NSF DMS-8767.

References

A. Ang and M. Piazzesi. A no-arbitrage vector autoregression of term structure dynamics with macroeconomic and latent variables. Journal of Monetary Economics, 2003.

M. Barigozzi and C. Brownlees. Nets: Network estimation for time series. 2018.

S. Basu, G. Michailidis, et al. Regularized estimation in sparse high-dimensional time series models. The Annals of Statistics, 2015.

S. L. Bressler, C. G. Richter, Y. Chen, and M. Ding. Cortical functional network organization from autoregressive modeling of local field potential oscillations. Statistics in Medicine, 2007.

L. Chen and W. B. Wu. Testing for trends in high-dimensional time series. Journal of the American Statistical Association (just accepted), 2018.

S. X. Chen, L.-X. Zhang, and P.-S. Zhong. Tests for high-dimensional covariance matrices. Journal of the American Statistical Association, 2010.

R. A. Davis, P. Zang, and T. Zheng. Sparse vector autoregressive modeling. Journal of Computational and Graphical Statistics, 2016.

J. Fan, S. Guo, and N. Hao. Variance estimation using refitted cross-validation in ultrahigh dimensional regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2012.

A. Fujita, J. R. Sato, H. M. Garay-Malpartida, R. Yamaguchi, S. Miyano, M. C. Sogayar, and C. E. Ferreira. Modeling gene expression regulatory networks with the sparse vector autoregressive model. BMC Systems Biology, 2007.

R. Goebel, A. Roebroeck, D.-S. Kim, and E. Formisano. Investigating directed cortical interactions in time-resolved fMRI data using vector autoregressive modeling and Granger causality mapping. Magnetic Resonance Imaging, 2003.

I. Grama and E. Haeusler. An asymptotic expansion for probabilities of moderate deviations for multivariate martingales. Journal of Theoretical Probability, 2006.

R. M. Gray et al. Toeplitz and circulant matrices: A review. Foundations and Trends in Communications and Information Theory, 2006.

F. Han, H. Lu, and H. Liu. A direct estimation of high dimensional stationary vector autoregressions. The Journal of Machine Learning Research, 2015.

P. R. Hansen. Structural changes in the cointegrated vector autoregressive model. Journal of Econometrics, 2003.

M. Hardy. Combinatorics of partial derivatives. The Electronic Journal of Combinatorics, 2006.

L. Harrison, W. D. Penny, and K. Friston. Multivariate autoregressive modeling of fMRI time series. NeuroImage, 2003.

A. Javanmard and A. Montanari. Confidence intervals and hypothesis testing for high-dimensional regression. The Journal of Machine Learning Research, 2014.

The proofs of Theorem 1-3 are along the lines of Wied and Galeano (2013).

The proofs of Theorem 1-3 are along the lines of Wied and Galeano (2013). A Appendix: Proofs The proofs of Theore 1-3 are along the lines of Wied and Galeano (2013) Proof of Theore 1 Let D[d 1, d 2 ] be the space of càdlàg functions on the interval [d 1, d 2 ] equipped with

More information

Block designs and statistics

Block designs and statistics Bloc designs and statistics Notes for Math 447 May 3, 2011 The ain paraeters of a bloc design are nuber of varieties v, bloc size, nuber of blocs b. A design is built on a set of v eleents. Each eleent

More information

Feature Extraction Techniques

Feature Extraction Techniques Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that

More information

Non-Parametric Non-Line-of-Sight Identification 1

Non-Parametric Non-Line-of-Sight Identification 1 Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,

More information

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices CS71 Randoness & Coputation Spring 018 Instructor: Alistair Sinclair Lecture 13: February 7 Disclaier: These notes have not been subjected to the usual scrutiny accorded to foral publications. They ay

More information

A note on the multiplication of sparse matrices

A note on the multiplication of sparse matrices Cent. Eur. J. Cop. Sci. 41) 2014 1-11 DOI: 10.2478/s13537-014-0201-x Central European Journal of Coputer Science A note on the ultiplication of sparse atrices Research Article Keivan Borna 12, Sohrab Aboozarkhani

More information

Testing the lag length of vector autoregressive models: A power comparison between portmanteau and Lagrange multiplier tests

Testing the lag length of vector autoregressive models: A power comparison between portmanteau and Lagrange multiplier tests Working Papers 2017-03 Testing the lag length of vector autoregressive odels: A power coparison between portanteau and Lagrange ultiplier tests Raja Ben Hajria National Engineering School, University of

More information

Sharp Time Data Tradeoffs for Linear Inverse Problems

Sharp Time Data Tradeoffs for Linear Inverse Problems Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used

More information

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis E0 370 tatistical Learning Theory Lecture 6 (Aug 30, 20) Margin Analysis Lecturer: hivani Agarwal cribe: Narasihan R Introduction In the last few lectures we have seen how to obtain high confidence bounds

More information

Machine Learning Basics: Estimators, Bias and Variance

Machine Learning Basics: Estimators, Bias and Variance Machine Learning Basics: Estiators, Bias and Variance Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics in Basics

More information

are equal to zero, where, q = p 1. For each gene j, the pairwise null and alternative hypotheses are,

are equal to zero, where, q = p 1. For each gene j, the pairwise null and alternative hypotheses are, Page of 8 Suppleentary Materials: A ultiple testing procedure for ulti-diensional pairwise coparisons with application to gene expression studies Anjana Grandhi, Wenge Guo, Shyaal D. Peddada S Notations

More information

Testing equality of variances for multiple univariate normal populations

Testing equality of variances for multiple univariate normal populations University of Wollongong Research Online Centre for Statistical & Survey Methodology Working Paper Series Faculty of Engineering and Inforation Sciences 0 esting equality of variances for ultiple univariate

More information

Boosting with log-loss

Boosting with log-loss Boosting with log-loss Marco Cusuano-Towner Septeber 2, 202 The proble Suppose we have data exaples {x i, y i ) i =... } for a two-class proble with y i {, }. Let F x) be the predictor function with the

More information

Support recovery in compressed sensing: An estimation theoretic approach

Support recovery in compressed sensing: An estimation theoretic approach Support recovery in copressed sensing: An estiation theoretic approach Ain Karbasi, Ali Horati, Soheil Mohajer, Martin Vetterli School of Coputer and Counication Sciences École Polytechnique Fédérale de

More information

3.3 Variational Characterization of Singular Values

3.3 Variational Characterization of Singular Values 3.3. Variational Characterization of Singular Values 61 3.3 Variational Characterization of Singular Values Since the singular values are square roots of the eigenvalues of the Heritian atrices A A and

More information

W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS

W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS. Introduction When it coes to applying econoetric odels to analyze georeferenced data, researchers are well

More information

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution Testing approxiate norality of an estiator using the estiated MSE and bias with an application to the shape paraeter of the generalized Pareto distribution J. Martin van Zyl Abstract In this work the norality

More information

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements 1 Copressive Distilled Sensing: Sparse Recovery Using Adaptivity in Copressive Measureents Jarvis D. Haupt 1 Richard G. Baraniuk 1 Rui M. Castro 2 and Robert D. Nowak 3 1 Dept. of Electrical and Coputer

More information

Multi-Dimensional Hegselmann-Krause Dynamics

Multi-Dimensional Hegselmann-Krause Dynamics Multi-Diensional Hegselann-Krause Dynaics A. Nedić Industrial and Enterprise Systes Engineering Dept. University of Illinois Urbana, IL 680 angelia@illinois.edu B. Touri Coordinated Science Laboratory

More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

arxiv: v1 [stat.ot] 7 Jul 2010

arxiv: v1 [stat.ot] 7 Jul 2010 Hotelling s test for highly correlated data P. Bubeliny e-ail: bubeliny@karlin.ff.cuni.cz Charles University, Faculty of Matheatics and Physics, KPMS, Sokolovska 83, Prague, Czech Republic, 8675. arxiv:007.094v

More information

CS Lecture 13. More Maximum Likelihood

CS Lecture 13. More Maximum Likelihood CS 6347 Lecture 13 More Maxiu Likelihood Recap Last tie: Introduction to axiu likelihood estiation MLE for Bayesian networks Optial CPTs correspond to epirical counts Today: MLE for CRFs 2 Maxiu Likelihood

More information

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels Extension of CSRSM for the Paraetric Study of the Face Stability of Pressurized Tunnels Guilhe Mollon 1, Daniel Dias 2, and Abdul-Haid Soubra 3, M.ASCE 1 LGCIE, INSA Lyon, Université de Lyon, Doaine scientifique

More information

1 Bounding the Margin

1 Bounding the Margin COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost

More information

Supplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data

Supplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data Suppleentary to Learning Discriinative Bayesian Networks fro High-diensional Continuous Neuroiaging Data Luping Zhou, Lei Wang, Lingqiao Liu, Philip Ogunbona, and Dinggang Shen Proposition. Given a sparse

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher

More information

Lecture 9 November 23, 2015

Lecture 9 November 23, 2015 CSC244: Discrepancy Theory in Coputer Science Fall 25 Aleksandar Nikolov Lecture 9 Noveber 23, 25 Scribe: Nick Spooner Properties of γ 2 Recall that γ 2 (A) is defined for A R n as follows: γ 2 (A) = in{r(u)

More information

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES S. E. Ahed, R. J. Tokins and A. I. Volodin Departent of Matheatics and Statistics University of Regina Regina,

More information

Bootstrapping Dependent Data

Bootstrapping Dependent Data Bootstrapping Dependent Data One of the key issues confronting bootstrap resapling approxiations is how to deal with dependent data. Consider a sequence fx t g n t= of dependent rando variables. Clearly

More information

A Simple Regression Problem

A Simple Regression Problem A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

e-companion ONLY AVAILABLE IN ELECTRONIC FORM OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October

More information

3.8 Three Types of Convergence
