Sample Size Requirement For Some Low-Dimensional Estimation Problems


1 Sample Size Requirement For Some Low-Dimensional Estimation Problems Cun-Hui Zhang, Rutgers University September 10, 2013 SAMSI Thanks for the invitation!

2 Acknowledgements/References
- Sun, T. and Zhang, C.-H. (2012). Scaled sparse linear regression. Biometrika 99.
- Zhang, C.-H. and Zhang, S. S. (2011). Confidence intervals for low-dimensional parameters with high-dimensional data. Technical report, arXiv.
- Zhang, C.-H. (2011). Statistical inference for high-dimensional data. In Mathematisches Forschungsinstitut Oberwolfach: Very High Dimensional Semiparametric Models, Report No. 48/2011.
- Sun, T. and Zhang, C.-H. (2012). Comments on: Optimal rates of convergence for sparse covariance matrix estimation. Statistica Sinica 22.
- Ren, Z., Sun, T., Zhang, C.-H. and Zhou, H. H. (2013). Asymptotic normality and optimalities in estimation of large Gaussian graphical model. Preprint.

3 Outline 1 LD problems 2 LDPE 3 Variable selection 4 SemiLD inference 5 Extensions 6 Sample size requirement 7 Simulation results

4 LD estimation problems
Consider HD models. Is it feasible to make regular statistical inference at the n^{-1/2} rate?
- Regular estimation: stable limit distribution; working in a connected sample space; not super-efficient; not requiring variable selection consistency
- Statistical inference: confidence intervals, p-values, asymptotic normality, efficiency with information bound, etc.
- This is typically not possible for the estimation of HD parameters or of non-smooth LD parameters
- What is the sample size requirement?

5 Feasibility; Example
Linear model y = Xβ + ε, ε ~ N(0, σ²I_n), where β ∈ R^p with p ≫ n
Scaled Lasso: λ_univ = √((2 log p)/n), A > 1,
{β̂, σ̂} = arg min_{b,σ} { ‖y − Xb‖₂²/(2σn) + σ/2 + Aλ_univ‖b‖₁ }
Städler et al (10), Antoniadis (10), Sun-Z (10,12), Belloni et al (11)
Suppose s log p ≪ n^{1/2} and a side regularity condition on X, where s = ‖β‖₀ or s = Σ_j min{|β_j|/(σλ_univ), 1}. Then
σ̂/σ* − 1 = o_P(n^{-1/2}), where σ* = ‖y − Xβ‖₂/√n is the oracle estimator
Consequently, √n(σ̂/σ − 1) → N(0, 1/2)
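A minimal sketch of the scaled Lasso as alternating minimization, using scikit-learn's Lasso for the b-step; the mapping to sklearn's penalty (alpha = σ·A·λ_univ) follows from multiplying the objective by σ, and the initial noise level is an arbitrary choice, not part of the slides.

```python
import numpy as np
from sklearn.linear_model import Lasso

def scaled_lasso(X, y, A=1.1, n_iter=20, tol=1e-6):
    """Scaled Lasso via alternating minimization.
    Objective: ||y - Xb||_2^2/(2*sigma*n) + sigma/2 + A*lam_univ*||b||_1, A > 1.
    """
    n, p = X.shape
    lam_univ = np.sqrt(2.0 * np.log(p) / n)
    sigma = np.std(y)  # crude initial noise level
    for _ in range(n_iter):
        # For fixed sigma, the b-step is an ordinary Lasso with
        # penalty level sigma * A * lam_univ (sklearn's alpha).
        fit = Lasso(alpha=sigma * A * lam_univ, fit_intercept=False).fit(X, y)
        b = fit.coef_
        # For fixed b, the sigma-step has the closed form sigma = ||y - Xb||_2 / sqrt(n).
        sigma_new = np.linalg.norm(y - X @ b) / np.sqrt(n)
        if abs(sigma_new - sigma) < tol * sigma:
            sigma = sigma_new
            break
        sigma = sigma_new
    return b, sigma
```

The LSE-after-selection refit of slide 10 would then be an ordinary least squares fit on the support of b.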

6 Feasibility; Example
Estimation of σ is special since it is orthogonal to the estimation of β in terms of score functions
The scores for the estimation of β_j and β_k are not orthogonal in general
Need to use the efficient score → bias correction
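The orthogonality claim follows from a standard calculation for the Gaussian log-likelihood, spelled out below for completeness.

```latex
% Gaussian linear model: log-likelihood and scores, with eps = y - X beta
\ell(\beta,\sigma) = -n\log\sigma - \tfrac{1}{2\sigma^2}\|y - X\beta\|_2^2, \qquad
\dot\ell_\beta = \frac{X^\top\varepsilon}{\sigma^2}, \qquad
\dot\ell_\sigma = -\frac{n}{\sigma} + \frac{\|\varepsilon\|_2^2}{\sigma^3}.
% Cross-information vanishes because odd Gaussian moments are zero:
E\bigl[\dot\ell_\beta\,\dot\ell_\sigma\bigr]
  = \frac{1}{\sigma^5}\,E\bigl[X^\top\varepsilon\,(\|\varepsilon\|_2^2 - n\sigma^2)\bigr] = 0,
\qquad
E\bigl[\dot\ell_\beta\,\dot\ell_\beta^\top\bigr] = \frac{X^\top X}{\sigma^2}.
% So the (beta, sigma) information is block diagonal, but X'X/sigma^2 is not
% diagonal in general: the scores for beta_j and beta_k are correlated.
```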

7 Outline 1 LD problems 2 LDPE 3 Variable selection 4 SemiLD inference 5 Extensions 6 Sample size requirement 7 Simulation results

8 Low-dimensional projection estimator (LDPE)
Bias correction with a linear estimator: β̂_j = β̂_j^{(init)} + w_j^T(y − Xβ̂^{(init)})
Error decomposition: with η_j = (X^Tw_j − e_j)/‖w_j‖₂,
β̂_j − β_j = w_j^Tε + (e_j − X^Tw_j)^T(β̂^{(init)} − β) = w_j^Tε − ‖w_j‖₂ η_j^T(β̂^{(init)} − β)
The property X^Tw_j = e_j only needs to hold approximately
Since w_j^Tε ~ N(0, σ²‖w_j‖₂²), η_j^T(β̂^{(init)} − β) → 0 implies (β̂_j − β_j)/(σ‖w_j‖₂) → N(0, 1)

9 Low-dimensional projection estimator (LDPE)
Bias correction with a linear estimator: β̂_j = β̂_j^{(init)} + w_j^T(y − Xβ̂^{(init)})
Approximate confidence interval: P{ |β̂_j − β_j|/(‖w_j‖₂σ̂) ≤ 1.96 } → 0.95, provided that σ̂ ≈ σ and η_j^T(β̂^{(init)} − β) → 0
The key condition holds when ‖β̂^{(init)} − β‖₁ = O_P(sλ), e.g. the Lasso, and ‖η_j‖_∞ ≤ C√(2 log p) = o(1/(λs)), where η_j = (X^Tw_j − e_j)/‖w_j‖₂ and the default choice is C = 1
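A sketch of the bias correction for one coordinate, with w_j = z_j/(z_j^Tx_j) built from a Lasso of x_j on the remaining columns; the fixed penalty level is a simplification of the scaled-Lasso/quadratic-program choices described on the next slide.

```python
import numpy as np
from sklearn.linear_model import Lasso

def ldpe_coordinate(X, y, j, beta_init, sigma_hat, lam=None):
    """Bias-corrected estimate and 95% CI for beta_j (one LDPE coordinate).
    The score vector z_j is the residual of a Lasso of x_j on the other
    columns, so that w_j = z_j / (z_j' x_j) in the notation of the slides."""
    n, p = X.shape
    if lam is None:
        lam = np.sqrt(2.0 * np.log(p) / n)  # simple default penalty level
    others = np.delete(np.arange(p), j)
    fit = Lasso(alpha=lam, fit_intercept=False).fit(X[:, others], X[:, j])
    z = X[:, j] - X[:, others] @ fit.coef_           # relaxed projection residual
    denom = z @ X[:, j]
    beta_j = beta_init[j] + z @ (y - X @ beta_init) / denom   # bias correction
    se_j = sigma_hat * np.linalg.norm(z) / abs(denom)         # sd of w_j' eps
    return beta_j, (beta_j - 1.96 * se_j, beta_j + 1.96 * se_j)
```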

10 LDPE Theory
We use the scaled Lasso, or the LSE after scaled Lasso selection, for the initial estimator
We use the scaled Lasso or a quadratic program to pick w_j
The compatibility factor (van de Geer, 07; van de Geer-Bühlmann, 09):
κ = inf{ |S|^{1/2}‖Xu‖₂ / (n^{1/2}‖u_S‖₁) : ‖u_{S^c}‖₁ ≤ ξ‖u_S‖₁ }
where S = {j : |β_j| ≥ σλ_univ} and ξ > (A + 1)/(A − 1)

11 LDPE Theory
Theorem 1 (Deterministic designs). Suppose s log p ≪ n^{1/2}. If 1/κ = O(1) and the choice of w_j is feasible, then
(β̂_j − β_j)/(σ̂τ_j) → N(0, 1), σ̂/σ* = 1 + o_P(n^{-1/2}), (1)
where τ_j = ‖w_j‖₂ and σ* = ‖y − Xβ‖₂/√n
Theorem 2 (Random designs). Suppose s log p ≪ n^{1/2} and X has iid N(0, Σ) rows. If max_j Σ_jj + ‖Σ^{-1}‖_(S) = O(1) (spectral norm), then (1) holds and τ_j = (1 + o(1))√(n^{-1}(Σ^{-1})_{jj})

12 Remarks
Confidence intervals can be constructed without assuming min_{β_j ≠ 0} |β_j| ≥ Cλ_univ
This uniform signal strength, or β_min, condition, required for variable selection consistency, divides the parameter space into (p choose s)·3^s disconnected regions according to sgn(β_j) ∈ {−1, 0, 1}
Stability selection: Meinshausen-Bühlmann (10)
Recent developments: Bühlmann (12), Belloni et al (12), Javanmard-Montanari (13)

13 Multiplicity adjustments and thresholded LDPE
Low-dimensional projection estimator (LDPE, Z-Zhang, 11): β̂_1, ŝe_1, ..., β̂_p, ŝe_p, with ŝe_j = σ̂τ_j ≍ σn^{-1/2}
β̂_j^{(thr)}: threshold β̂_j at level t_j = ŝe_j L_ε, L_ε = √(2 log(p/ε))
Estimation of β: for certain Ω_n with P(Ω_n) ≥ 1 − ε/p,
E‖β̂^{(thr)} − β‖₂² I_{Ω_n} ≤ (1 + o(1)) Σ_j [ β_j² ∧ (στ_j L_ε)² + (ε/p)(στ_j)² ]
Selection with any signal strength: {j : |β_j| > 2t_j} ⊆ {j : β̂_j^{(thr)} ≠ 0} ⊆ {j : β_j ≠ 0}
Similar to thresholding N(β_j, ŝe_j²)
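The thresholding rule is a one-liner; a minimal sketch, assuming the LDPE estimates and standard errors are already in hand.

```python
import numpy as np

def threshold_ldpe(beta_hat, se_hat, p, eps=0.05):
    """Hard-threshold LDPE estimates at t_j = se_j * sqrt(2*log(p/eps)).
    Returns the thresholded vector and the selected support."""
    L_eps = np.sqrt(2.0 * np.log(p / eps))
    t = se_hat * L_eps
    beta_thr = np.where(np.abs(beta_hat) > t, beta_hat, 0.0)
    return beta_thr, np.flatnonzero(beta_thr)
```

Per the slide, with high probability every j with |β_j| > 2t_j is selected and no true zero survives.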

14 Outline 1 LD problems 2 LDPE 3 Variable selection 4 SemiLD inference 5 Extensions 6 Sample size requirement 7 Simulation results

15 Inference after variable selection
Optimistic approach:
Find an estimate of S = supp(β) ⊆ {1, ..., p}, say Ŝ
Estimate β_j by the LSE β̂_{j,Ŝ}, where β̂_M = (X_M^TX_M)^{-1}X_M^Ty
If P{Ŝ = S} → 1, then P{ |β̂_{j,Ŝ} − β_j| / (σ̂_Ŝ ((X_Ŝ^TX_Ŝ)^{-1})_{jj}^{1/2}) ≤ 1.96 } → 95%
Tibshirani (96), Fan-Li (01), Fan-Peng (04), Meinshausen-Bühlmann (06), Tropp (06), Zhao-Yu (06), Wainwright (06), Z (07,10), Zhang (09,11), Z-Zhang (2012)
Super-efficiency: outperforms the oracle LSE with known A, where A ⊇ supp(β) with |A| > ‖β‖₀

16 Inference after model selection
Conservative approach: Leeb-Pötscher (06), Berk et al (09,11), Laber-Murphy (11)
If P{ max_M |((X_M^TX_M)^{-1}X_M^Tε)_j| / (σ̂_M ((X_M^TX_M)^{-1})_{jj}^{1/2}) ≤ K_j } = 95%, then
P{ |β̂_{j,Ŝ} − β_{j,Ŝ}| / (σ̂_Ŝ ((X_Ŝ^TX_Ŝ)^{-1})_{jj}^{1/2}) ≤ K_j } ≥ 95%
where β_M = (X_M^TX_M)^{-1}X_M^TXβ
A conservative confidence interval for a random parameter
Bias: β_{j,M} ≠ β_j when M ⊉ S
Inefficiency: √(log p) ≲ K_j ≲ √p
A toy computation of K_j follows.
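A toy Monte Carlo for the constant K_j, taking σ = 1 known for simplicity; exhaustive enumeration over all submodels is only feasible for tiny p, and all sizes here are illustrative.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n, p, j, reps = 50, 6, 0, 2000
X = rng.standard_normal((n, p))

# All submodels M containing coordinate j (feasible only for tiny p).
models = [M for r in range(1, p + 1) for M in combinations(range(p), r) if j in M]
pinvs = [(M, np.linalg.pinv(X[:, M])) for M in models]

stats = np.empty(reps)
for b in range(reps):
    eps = rng.standard_normal(n)        # sigma = 1, so no sigma-hat needed
    tmax = 0.0
    for M, P in pinvs:
        coef = P @ eps                  # ((X_M'X_M)^{-1} X_M' eps)
        jj = M.index(j)
        se = np.sqrt(P[jj] @ P[jj])     # sd of coef[jj] when sigma = 1
        tmax = max(tmax, abs(coef[jj]) / se)
    stats[b] = tmax
K_j = np.quantile(stats, 0.95)          # the conservative constant for coordinate j
print(f"K_j = {K_j:.2f}  vs  1.96 for a single fixed model")
```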

17 Low-dimensional case
Consider the estimation of µ based on iid X_i ~ N(µ, 1), i ≤ n
Hodges example: let µ̂ = X̄ I{|X̄| > λ} with √n λ → ∞ and λ → 0
For fixed µ: √n(µ̂ − µ) → N(0, 1) if µ ≠ 0, and → 0 if µ = 0
Optimistic approach: if µ ∈ (0, 2λ), ...
Hájek-Le Cam's LAM (local asymptotic minimax): √n(µ̂ − µ) does not converge in distribution if µ = h/√n ≠ 0;
max_µ nE_µ(µ̂ − µ)² ≥ nλ² → ∞, so X̄ is better than µ̂
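A quick simulation makes the nλ² blow-up visible; n, λ = n^{-1/4}, and the grid of µ values are illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10_000, 100_000
lam = n ** -0.25                    # lam -> 0 while sqrt(n)*lam = n^{1/4} -> infinity

for mu in (0.0, lam, 1.0):          # mu near lam is the bad local regime
    xbar = mu + rng.standard_normal(reps) / np.sqrt(n)   # distribution of the sample mean
    hodges = xbar * (np.abs(xbar) > lam)                 # threshold the mean at lam
    print(f"mu = {mu:.4f}:  n * MSE = {n * np.mean((hodges - mu) ** 2):8.2f}")
```

Near µ = 0 the normalized MSE is far below 1 (super-efficiency); near µ = λ it is of order nλ² = √n; at a fixed µ it returns to about 1.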

18 Outline 1 LD problems 2 LDPE 3 Variable selection 4 SemiLD inference 5 Extensions 6 Sample size requirement 7 Simulation results

19 Statistical inference with high-dimensional data: The sampling distribution of high-dimensional regularized estimators is not tractable in general. However, low-dimensional statistical inference, such as p-values and confidence intervals for real parameters, is possible in high-dimensional models with high-dimensional data.

20 Semiparametric inference: parametric component + NP component
Low-dimensional statistical inference with high-dimensional data, or SemiLD inference: LD component + HD component
General method of SemiLD inference: [HD estimation → SemiLD inference] is parallel to [NP estimation → semiparametric inference]
We borrow ideas from Engle et al (81), Chen (85,88), Rice (86), Heckman (86), Bickel et al (90), ...

21 Minimum Fisher information; Stein (1956)
Linear model y = Xβ + ε, ε ~ N(0, σ²I_n)
For a fixed a ∈ R^p, consider the estimation of θ = a^Tβ
Consider the one-dimensional submodel β = uθ with a^Tu = 1 and known σ: y/σ = (Xu/σ)θ + N(0, I_n)
Minimum Fisher information: subject to a^Tu = 1,
F = min_u E(x^Tu/σ)² = min_u u^TΣu/σ² = u_0^TΣu_0/σ²
For the estimation of β_j, the LDPE attains the minimum Fisher information bound
Efficiency of LDPE: Z (11), van de Geer-Bühlmann-Ritov (13)
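The minimizer u_0 comes from a Lagrange-multiplier argument; the short derivation below, added for completeness, also shows why the bound matches τ_j in Theorem 2.

```latex
% Minimize u' Sigma u subject to a'u = 1:
\min_{a^\top u = 1} u^\top \Sigma u
\;\Longrightarrow\; 2\Sigma u = \mu a
\;\Longrightarrow\; u_0 = \frac{\Sigma^{-1} a}{a^\top \Sigma^{-1} a},
\qquad
F = \frac{u_0^\top \Sigma u_0}{\sigma^2} = \frac{1}{\sigma^2\, a^\top \Sigma^{-1} a}.
% For a = e_j this gives F = 1/(sigma^2 (Sigma^{-1})_{jj}), so the efficient
% variance for beta_j is 1/(nF) = sigma^2 (Sigma^{-1})_{jj}/n, matching
% tau_j^2 = (1 + o(1)) (Sigma^{-1})_{jj}/n in Theorem 2.
```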

22 Outline 1 LD problems 2 LDPE 3 Variable selection 4 SemiLD inference 5 Extensions 6 Sample size requirement 7 Simulation results

23 General picture
Linear regression: X with iid N(0, Σ) rows
- Known Σ versus unknown Σ (Robins and Ritov, 97)
- Additional costs in computational complexity and theoretical assumptions if Σ is unknown
- n^{-1/4} convergence of the initial estimator ⟺ s log p ≪ n^{1/2} (Z-Zhang, 11)
- Weaker condition in semisupervised learning: s log p ≪ N^{1/2}
- Precision matrix and partial correlation (Sun-Z, 11; Ren et al, 13)
- Deterministic X: will cost a little more
- Extensions to GLM and problems with sample Hessian depending on unknowns: will cost even more
- Extensions to quantile regression and problems without a sample Hessian: will cost a lot more

24 Precision matrix and partial correlation
Data X with iid N(0, Σ) rows; precision matrix Θ = Σ^{-1}
Multivariate regression model: Cov(ε_A, X_{A^c}) = 0 and
X_A = X_{A^c}Σ_{A^c}^{-1}Σ_{A^c,A} + ε_A = X_{A^c}B_{A^c,A} + ε_A
The residual has the covariance structure Eε_A^Tε_A/n = Σ_A − Σ_{A,A^c}Σ_{A^c}^{-1}Σ_{A^c,A} = Θ_A^{-1}
For a given A, we are interested in smooth functions τ(Θ_A^{-1}) = τ(Eε_A^Tε_A/n); the oracle estimator is τ* = τ(ε_A^Tε_A/n)
For an oracle expert with the knowledge of B_{A^c,A}, ε_A is sufficient for Θ_A, so that τ* is as efficient as the MLE in a fixed-dimensional regular parametric model (exponential family)

25 Precision matrix and partial correlation
X has iid N(0, Σ) rows; Θ = Σ^{-1}
Multivariate regression: Cov(ε_A, X_{A^c}) = 0, X_A = X_{A^c}B_{A^c,A} + ε_A, Eε_A^Tε_A/n = Θ_A^{-1}
The oracle τ* = τ(ε_A^Tε_A/n) is efficient for the estimation of τ(Θ_A^{-1})
Let B̂_{A^c,A} be the scaled Lasso estimator of B_{A^c,A} and z_A = X_A − X_{A^c}B̂_{A^c,A}. We propose τ̂ = τ(z_A^Tz_A/n)
Theorem. Suppose ‖Σ_{A^c}^{-1}‖_(S) + max_j Σ_jj + ‖Σ_A^{-1}‖_(S) = O(1). Let A be fixed and s_A = max_{j∈A} #{k ∉ A : Θ_jk ≠ 0}. Then
τ̂ = τ* + O_P(λ²s_A) = τ* + o_P(n^{-1/2})
when τ is a Lipschitz function and s_A log p ≪ n^{1/2}. Consequently, √(nF_τ)(τ̂ − τ) → N(0, 1), where F_τ is the minimum Fisher information for the estimation of τ

26 Precision matrix and partial correlation
We pick A = {j, k} for the estimation of individual elements
Corollary 1. For the estimation of the partial correlation r_jk = −Θ_jk/√(Θ_jjΘ_kk),
√n(r̂_jk − r_jk)/(1 − r_jk²) → N(0, 1)
Corollary 2. For the estimation of individual elements Θ_jk of the precision matrix,
√n(Θ̂_jk − Θ_jk)/√(Θ_jjΘ_kk + Θ_jk²) → N(0, 1)
We may then threshold these estimates for inference about high-dimensional quantities
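For A = {j, k} the proposal reduces to two column-wise Lasso regressions plus a 2×2 matrix inversion. A sketch, in which a plain Lasso with a fixed penalty level stands in for the scaled Lasso of the slides:

```python
import numpy as np
from sklearn.linear_model import Lasso

def precision_pair(X, j, k, lam=None):
    """Estimate the 2x2 block Theta_A, A = {j, k}, from residuals of
    column-wise Lasso regressions of x_j, x_k on the remaining columns."""
    n, p = X.shape
    if lam is None:
        lam = np.sqrt(2.0 * np.log(p) / n)  # universal penalty level, up to the noise scale
    A = [j, k]
    rest = [m for m in range(p) if m not in A]
    Z = np.empty((n, 2))
    for i, col in enumerate(A):
        fit = Lasso(alpha=lam, fit_intercept=False).fit(X[:, rest], X[:, col])
        Z[:, i] = X[:, col] - X[:, rest] @ fit.coef_   # z_col = residual
    Theta_A = np.linalg.inv(Z.T @ Z / n)               # (z_A' z_A / n)^{-1}
    r_jk = -Theta_A[0, 1] / np.sqrt(Theta_A[0, 0] * Theta_A[1, 1])  # partial correlation
    return Theta_A, r_jk
```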

27 Outline 1 LD problems 2 LDPE 3 Variable selection 4 SemiLD inference 5 Extensions 6 Sample size requirement 7 Simulation results

28 Precision matrix estimation
What is the sample size required for estimating LD parameters at the parametric rate?
Let G = { Σ : ‖Θ‖_(S) + max_j Σ_jj ≤ M, max_j #{k : Θ_jk ≠ 0} ≤ s }
Theorem 1. Suppose s log p ≤ c₀n for a certain small c₀. There exists c₁ depending on c₀ and M only such that, for all j, k,
inf_{Θ̂_jk} sup_{Σ∈G} P{ |Θ̂_jk − Θ_jk| ≥ c₁ max((s/n) log p, n^{-1/2}) } ≥ c₁
For the low-dimensional projection estimator,
lim_{t→∞} max_{j,k} sup_{Σ∈G} P{ |Θ̂_jk − Θ_jk| ≥ t max((s/n) log p, n^{-1/2}) } = 0
The sample size requirement is n ≫ (s log p)² for the n^{-1/2} rate

29 Precision matrix estimation
Theorem 2. Suppose s log p ≤ c₀n. There exists c₁ depending on c₀ and M only such that
inf_{Θ̂} sup_{Σ∈G} P{ max_{j,k} |Θ̂_jk − Θ_jk| ≥ c₁ max((s/n) log p, √((log p)/n)) } ≥ c₁
For the low-dimensional projection estimator,
lim_{t→∞} sup_{Σ∈G} P{ max_{j,k} |Θ̂_jk − Θ_jk| ≥ t max((s/n) log p, √((log p)/n)) } = 0
The sample size requirement is n ≫ s² log p for Bonferroni adjustments, and this rate is attained
This implies ‖Θ̂^{(thr)} − Θ‖_(S) ≲ s√((log p)/n) when s√((log p)/n) ≤ c₀
This is similar to the estimation of Σ (Bickel-Levina, 08)

30 Regression, estimation of individual coefficients
The sample size requirement in the regression case is more complicated
Deterministic design (under a side condition on the design):
- n ≫ (s log p)² for asymptotic normality
- n ≫ s² log p for simultaneous C.I. via Bonferroni
Random design (under a side condition on Σ = EX^TX/n):
- Known Σ: n ≫ s log p for efficient estimation; n ≫ s log p for simultaneous confidence intervals
- Unknown Σ: n ≫ s(d ∨ s)(log p)² for efficient estimation; n ≫ s(d ∨ s) log p for simultaneous confidence intervals
- Semi-supervised learning with unknown Σ: n ≫ s log p and N ≫ s(d ∨ s)(log p)² for efficient estimation; N ≫ s(d ∨ s) log p for simultaneous confidence intervals
We note that X is ancillary for the estimation of β. Are these conditions necessary?

31 Outline 1 LD problems 2 LDPE 3 Variable selection 4 SemiLD inference 5 Extensions 6 Sample size requirement 7 Simulation results

32-35 Simulation settings
- n = 200, p = 3000, σ = 1, λ_univ = √((2/n) log p) = 0.283; β_j = 3λ_univ = 0.849 for j = 1500, 1800, ..., 3000, and β_j = 3λ_univ/j^α otherwise; β_j ≠ 0 for all j
- (s, s log p/√n) = (8.93, 5.05) and (29.24, 16.55) for α = 2 and α = 1 respectively, while the theory requires s log p/√n → 0, where s = Σ_j min(|β_j|/λ_univ, 1)
- Generate (X̃, X, ε) in each replication, where X̃ has iid N(0, Σ) rows with Σ = (ρ^{|j−k|})_{p×p} and X is the column-normalized version of X̃
- Four settings, labeled (A), (B), (C), and (D), with (α, ρ) = (2, 1/5), (1, 1/5), (2, 4/5), and (1, 4/5) respectively; case (D) is the most difficult
A data-generating sketch for one replication follows.
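A sketch of one replication under these settings; the AR(1) construction reproduces Σ = (ρ^{|j−k|}) exactly, while normalizing columns to length √n is the assumed reading of "column normalized".

```python
import numpy as np

def make_setting(n=200, p=3000, alpha=2, rho=0.2, seed=0):
    """One replication of settings (A)-(D): Sigma_{jk} = rho^{|j-k|},
    6 strong coefficients, polynomially decaying weak ones, sigma = 1."""
    rng = np.random.default_rng(seed)
    lam_univ = np.sqrt(2.0 * np.log(p) / n)            # = 0.283 here
    j = np.arange(1, p + 1)
    beta = 3.0 * lam_univ / j ** alpha                 # weak, never exactly 0
    beta[np.arange(1500, p + 1, 300) - 1] = 3.0 * lam_univ   # strong: j = 1500, ..., 3000
    # AR(1)-type rows give Cov = rho^{|j-k|} with unit marginal variance
    Z = rng.standard_normal((n, p))
    X_tilde = np.empty_like(Z)
    X_tilde[:, 0] = Z[:, 0]
    for k in range(1, p):
        X_tilde[:, k] = rho * X_tilde[:, k - 1] + np.sqrt(1 - rho ** 2) * Z[:, k]
    X = X_tilde * np.sqrt(n) / np.linalg.norm(X_tilde, axis=0)  # columns of length sqrt(n)
    y = X @ beta + rng.standard_normal(n)              # noise with sigma = 1
    return X, y, beta
```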

36-39 LDPE
Unknown {β, σ}; deterministic X, or unknown Σ for random X
- scaled Lasso = arg min_{b,σ} { ‖y − Xb‖₂²/(2nσ) + σ/2 + λ_univ‖b‖₁ }
- {β̂^{(init)}, σ̂} = LSE after scaled Lasso selection
- z_j = residual of Lasso(x_j, X_{−j}), a regularized/approximate projection of x_j to the span of x_k, k ≠ j
- β̂_j = β̂_j^{(init)} + z_j^T(y − Xβ̂^{(init)})/(z_j^Tx_j)
- ŝe_j = σ̂‖z_j‖₂/|z_j^Tx_j|
- Low-dimensional projection estimator (LDPE): β̂_1, ŝe_1, ..., β̂_p, ŝe_p
- Restricted LDPE: z_j ⊥ x_k for the m = 4 most correlated columns, i.e. the smallest |k − j| under Σ = (ρ^{|j−k|})
- Oracle: β̂_j^{(oracle)} = e_j^T(X_{K_j}^TX_{K_j})^{-1}X_{K_j}^T(y − X_{K_j^c}β_{K_j^c}), |K_j| = 3
An oracle-benchmark sketch follows.
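The oracle benchmark is easy to emulate once β outside K_j is known; taking K_j to be j plus its two nearest neighbors (|K_j| = 3) is an assumption consistent with this Σ, not stated on the slides.

```python
import numpy as np

def oracle_coordinate(X, y, beta, j, K):
    """Oracle benchmark: knowing beta outside K (with j in K, |K| = 3 here),
    subtract X_{K^c} beta_{K^c} and run the LSE on the |K| columns in K."""
    Kc = np.setdiff1d(np.arange(X.shape[1]), K)
    y_adj = y - X[:, Kc] @ beta[Kc]                  # remove the known part
    coef, *_ = np.linalg.lstsq(X[:, K], y_adj, rcond=None)
    return coef[list(K).index(j)]                    # the entry for coordinate j
```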

40 Bias correction: histograms of β̂_j − β_j for the maximal β_j [figure; panels: scaled Lasso, least squares estimation after scaled Lasso selection, LDPE, restricted LDPE]

41 Bias correction: summary statistics for β̂_j − β_j with maximal β_j [table; rows: bias, sd, median absolute error for settings (A)-(D); columns: Lasso, scaled Lasso, LSE after scaled Lasso, oracle, LDPE, R-LDPE; values lost in transcription]

42 Overall relative coverage frequency, target = 95% [table: mean coverage of the LDPE and R-LDPE, over all β_j and over the maximal β_j, in settings (A)-(D); values lost in transcription]

43 More bias correction: distribution of simulated relative coverage frequencies [figure; panels: LDPE, restricted LDPE]

44 Efficiency: median ratio of interval width, LDPE/restricted LDPE versus oracle [table over settings (A)-(D); values lost in transcription]

45 Efficiency: median ratio of interval width, LDPE/restricted LDPE versus oracle [figure; panels: LDPE, restricted LDPE]

46 Relative efficiency of the LDPE/restricted LDPE versus oracle [table: median efficiency (ratio of MSEs) of the LDPE and restricted LDPE versus the oracle estimator, settings (A)-(D); values lost in transcription]

47 Relative efficiency of the LDPE/restricted LDPE versus oracle [figure; panels: LDPE, restricted LDPE]

48 Thresholded LDPE: ‖β̂ − β‖₂ at the Bonferroni 1/p level [table; rows: mean, sd, median for settings (A)-(D); columns: Lasso, scaled Lasso, LSE after scaled Lasso, oracle, t-LDPE; values lost in transcription]

49 Variable selection with Bonferroni correction [table: first block and maximal β_j, comparing Lasso, scaled Lasso, oracle, and LDPE in settings (A)-(D); values lost in transcription]

50 Thanks!
