A General Framework for High-Dimensional Inference and Multiple Testing


1. A General Framework for High-Dimensional Inference and Multiple Testing
Yang Ning, Department of Statistical Science. Joint work with Han Liu.

2. Overview
Goal: control false scientific discoveries in high-dimensional data analysis.
Challenge: obtaining valid p-values for hypothesis testing in high-dimensional data.
Proposed method: post-regularization inference for high-dimensional data (decorrelated score, Wald, and likelihood ratio tests), whose p-values feed into multiple testing.

3. A Simulation Study
Model: $Y = X\beta + \epsilon$, where $Y$ collects $n$ samples, $X$ has $d$ covariates, and $\beta$ is $s$-sparse with $\beta_1 = 0$. Here $n = 100$, $d = 500$, and $s = 4$.

4. Empirical Null Distribution of P-values
$H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$.
(1) Perform variable selection by the Lasso.
(2) Compute the least squares estimator after model selection.
[Histogram of the resulting null p-values.] Not uniformly distributed: the p-value is wrong!
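The distortion above is easy to reproduce. Below is a minimal sketch, not the authors' code: it simulates the slide's setting ($n = 100$, $d = 500$, $s = 4$), selects variables with a cross-validated Lasso, refits by OLS, and records the naive p-value for the null coefficient $\beta_1$. The use of `LassoCV` and `statsmodels` is an illustrative assumption.

```python
# A minimal sketch reproducing the distorted null p-values:
# Lasso selection followed by naive OLS inference on beta_1.
import numpy as np
from sklearn.linear_model import LassoCV
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, d, s = 100, 500, 4
pvals = []
for _ in range(200):
    X = rng.standard_normal((n, d))
    beta = np.zeros(d)
    beta[1:s + 1] = 1.0                 # signals; beta_1 (index 0) is null
    y = X @ beta + rng.standard_normal(n)
    sel = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)
    sel = np.union1d(sel, [0])          # keep the tested coefficient in the refit
    ols = sm.OLS(y, sm.add_constant(X[:, sel])).fit()
    pvals.append(ols.pvalues[1 + int(np.searchsorted(sel, 0))])
# A histogram of `pvals` is far from Uniform(0,1): selection distorts inference.
```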

5. False Discovery Rate (FDR)
$H_{0j}: \beta_j = 0$ vs $H_{1j}: \beta_j \neq 0$, for $1 \le j \le d$.
$$\mathrm{FDR} = E\left[\frac{\#\,\text{false discoveries}}{\#\,\text{total discoveries}}\right]$$
Goal: control the FDR at no greater than a given value (e.g. 0.1). FDR control is well studied under independence, e.g. Benjamini and Hochberg (1995), and has wide applications in scientific studies.
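The Benjamini-Hochberg step-up procedure mentioned on the slide is simple to state in code; here is a minimal sketch (the helper name `benjamini_hochberg` is ours, not from the talk).

```python
# A minimal sketch of the Benjamini-Hochberg (1995) step-up procedure,
# which controls the FDR at level alpha for independent p-values.
import numpy as np

def benjamini_hochberg(pvals, alpha=0.1):
    """Return a boolean mask of rejected hypotheses."""
    p = np.asarray(pvals)
    d = p.size
    order = np.argsort(p)
    # find the largest k with p_(k) <= alpha * k / d
    below = p[order] <= alpha * np.arange(1, d + 1) / d
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    reject = np.zeros(d, dtype=bool)
    reject[order[:k]] = True
    return reject
```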

6-7. Empirical FDR
Setting: $n = 500$, $d = 500$, and $s = 50$.
[Plot: empirical FDR against the target FDR.] The empirical FDR far exceeds the target: most discoveries are false!

8. Challenges
What do we learn from the numerical studies?
- P-values are distorted by variable selection, so we must account for variable selection/regularization.
- Multiple hypothesis tests are dependent, so we must carefully design the test statistics to reduce dependence.

9. Problem Setup
Data: $X_1, \dots, X_n$, i.i.d. copies of $X \sim f(x; \beta)$.
Partition $\beta = (\theta, \gamma)$: the parameter of interest $\theta \in \mathbb{R}$ and the nuisance parameter $\gamma \in \mathbb{R}^{d-1}$; the model is high-dimensional with $d \gg n$.
Goal: test the hypothesis $H_0: \theta = 0$ vs $H_1: \theta \neq 0$.

10. Existing Work
- Sample splitting: Meinshausen & Bühlmann; Shah & Samworth
- Bootstrap and resampling: Chatterjee & Lahiri; Zhou & Min; Zhang & Cheng; Dezeure et al.; McKeague & Qian
- Selective (conditional) inference: Lockhart et al.; Lee & Taylor; Fithian et al.; Tian & Taylor
- Debiased (desparsifying) estimators: Zhang & Zhang; van de Geer et al.; Javanmard & Montanari; Belloni et al.; Athey et al.; Cai & Guo; Zhu & Bradic
- FDR control: Fan et al.; Barber & Candès; G'Sell et al.; Liu; Ramdas et al.; Candès et al.

11-14. Likelihood-based Inference
Log-likelihood: $L(\beta) = n^{-1}\sum_{i=1}^n \log f(X_i; \beta)$.
Low dimension ($d \ll n$):
- Profile log-likelihood: $L(\theta, \hat\gamma_\theta)$, where $\hat\gamma_\theta = \arg\max_{\gamma \in \mathbb{R}^{d-1}} L(\theta, \gamma)$
- Profile score: $\nabla_\theta L(\theta, \hat\gamma_\theta)$
- MLE: $\hat\theta = \arg\max_{\theta \in \mathbb{R}} L(\theta, \hat\gamma_\theta)$
High dimension ($d \gg n$), with $\hat\gamma$ and $\hat w$ obtained by regularized estimation (next slide):
- Decorrelated log-likelihood: $L_D(\theta) = L(\theta, \hat\gamma - (\theta - \hat\theta)\hat w)$
- Decorrelated score: $S_D(\theta) = \nabla_\theta L(\theta, \hat\gamma) - \hat w^\top \nabla_\gamma L(\theta, \hat\gamma)$
- M-estimator: $\hat\theta_D = \arg\max_{\theta \in \mathbb{R}} L_D(\theta)$
The decorrelated construction works for both high and low dimensions.

15-18. A Framework for HD Inference
1. Calculate the initial estimator $\hat\beta = \arg\max_{\beta \in \mathbb{R}^d}\{L(\beta) - P_\lambda(\beta)\}$, where $P_\lambda$ is a generic penalty function with tuning parameter $\lambda \ge 0$.
2. Calculate $\hat w = \arg\min \|w\|_1$ s.t. $\|\nabla^2_{\theta\gamma} L(\hat\beta) - w^\top \nabla^2_{\gamma\gamma} L(\hat\beta)\|_\infty \le \mu$ ($\mu$ a tuning parameter). Here $\hat w$ is an estimate of Stein's least favorable direction (1956).
3. Form test statistics for $H_0: \theta = 0$ vs $H_1: \theta \neq 0$ by plugging $\hat w$ into $L_D(\theta) = L(\theta, \hat\gamma - (\theta - \hat\theta)\hat w)$ and $S_D(\theta) = \nabla_\theta L(\theta, \hat\gamma) - \hat w^\top \nabla_\gamma L(\theta, \hat\gamma)$:
- LR test: $2n[L_D(\hat\theta_D) - L_D(0)] \to_d \chi^2_1$
- Score test: $n^{1/2} S_D(0) \to_d$ Normal
- Wald test: $n^{1/2}\, \hat\theta_D \to_d$ Normal
(A worked sketch for the linear model follows.)
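To make the three steps concrete, here is a minimal sketch for the Gaussian linear model, testing $H_0: \theta = 0$ for the first coefficient. It is not the authors' implementation: the Dantzig-type program in step 2 is replaced by a nodewise Lasso of $Z$ on $U$ (a common surrogate), and the tuning parameters `lam` and `mu` are fixed constants chosen for illustration rather than of order $\sqrt{\log d/n}$.

```python
# A minimal sketch of the three-step decorrelated score test for the
# Gaussian linear model, with theta the coefficient of Z = X[:, 0].
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import Lasso

def decorrelated_score_pvalue(X, y, lam=0.1, mu=0.1):
    n, d = X.shape
    Z, U = X[:, 0], X[:, 1:]
    # Step 1: penalized initial estimator beta-hat = (theta-hat, gamma-hat)
    gamma = Lasso(alpha=lam).fit(X, y).coef_[1:]
    # Step 2: w-hat from a nodewise Lasso regression of Z on U
    w = Lasso(alpha=mu).fit(U, Z).coef_
    # Step 3: decorrelated score at theta = 0
    resid = y - U @ gamma                    # residual under H0: theta = 0
    zres = Z - U @ w                         # decorrelated covariate
    S = np.mean(resid * zres)                # S_D(0)
    sd = np.sqrt(np.mean((resid * zres) ** 2) / n)
    return 2 * norm.sf(abs(S / sd))          # two-sided p-value
```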

19. A Brief History on Nuisance Parameters
- Neyman's C($\alpha$) test (Neyman, 1959)
- Efficient score function (Bickel et al., 1993)
- Projected score function (Lindsay, 1995)
- Locally ancillary estimating equations (McLeish and Small, 1988)
- Doubly robust estimators (Robins et al., 1994; Scharfstein et al., 1999)

20. Example: Generalized Linear Model
Model: $f(y \mid X; \beta) = h(y)\exp\{y\,\beta^\top X - b(\beta^\top X)\}$, with $\beta = (\theta, \gamma)$ and $X = (Z, U)$.
1. Compute the initial estimator: $\hat\beta = \arg\max$ (penalized log-likelihood).
2. Compute $\hat w = \arg\min \|w\|_1$ s.t. $\|n^{-1}\sum_{i=1}^n b''(\hat\beta^\top X_i)(Z_i - w^\top U_i)U_i\|_\infty \le \mu$.
3. Compute the decorrelated score function $S_D(\theta) = n^{-1}\sum_{i=1}^n \{Y_i - b'(\theta Z_i + \hat\gamma^\top U_i)\}(Z_i - \hat w^\top U_i)$.
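The same recipe for logistic regression, where $b'(t) = 1/(1+e^{-t})$ and $b''(t) = b'(t)(1-b'(t))$. A minimal sketch under illustrative assumptions: step 2 uses a $b''$-weighted Lasso as a surrogate for the constrained program, and `logistic_decorrelated_score` is a hypothetical helper, not the authors' code.

```python
# Decorrelated score z-statistic for l1-penalized logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression, Lasso

def logistic_decorrelated_score(X, y, C=1.0, mu=0.1):
    n = X.shape[0]
    Z, U = X[:, 0], X[:, 1:]
    # Step 1: l1-penalized logistic regression as the initial estimator
    fit = LogisticRegression(penalty="l1", C=C, solver="liblinear",
                             fit_intercept=False).fit(X, y)
    beta = fit.coef_.ravel()
    gamma = beta[1:]
    # Step 2: b''-weighted nodewise Lasso surrogate for w-hat
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w2 = p * (1 - p)                          # b''(beta-hat^T x_i)
    sw = np.sqrt(w2)
    w = Lasso(alpha=mu).fit(U * sw[:, None], Z * sw).coef_
    # Step 3: decorrelated score at theta = 0
    mean0 = 1.0 / (1.0 + np.exp(-(U @ gamma)))  # b'(0*Z_i + gamma-hat^T U_i)
    zres = Z - U @ w
    S = np.mean((y - mean0) * zres)
    sd = np.sqrt(np.mean(w2 * zres ** 2) / n)
    return S / sd                             # approximately N(0,1) under H0
```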

21-23. Main Theorem for GLM
Theorem [Uniform normal approximation]. For linear models with sub-Gaussian noise, assume the covariate is sub-Gaussian with a nonsingular covariance matrix. Then, under $H_0$,
$$\sup_{\|\beta\|_0 \le s,\ \|w\|_0 \le s'}\ \sup_{t \in \mathbb{R}}\ \left| P\left(\frac{\sqrt{n}\, S_D(0)}{\sigma} \le t\right) - \Phi(t) \right| \lesssim \frac{(s \vee s')\log d}{\sqrt{n}}.$$
The convergence is uniform over $\beta$ and $w$ in a sparse set; $\sigma$ is the standard deviation of $\epsilon_i(Z_i - w^\top U_i)$; the rate $(s \vee s')\log d/\sqrt{n}$ determines the quality of the normal approximation.
Remarks:
1. The tuning parameters satisfy $\lambda \asymp \mu \asymp \sqrt{\log d/n}$.
2. A noncentral normal approximation holds under the local alternative $H_{1n}: \theta = Cn^{-1/2}$.
3. The same results hold for most GLMs.
4. The estimator attains the information lower bound.

24. Local Asymptotic Power for GLM
Theorem [Uniform normal approximation under the alternative]. For linear models with sub-Gaussian noise, assume the covariate is sub-Gaussian with a nonsingular covariance matrix. Then, under the local alternative hypothesis,
$$\sup_{\beta \in \Theta_1(C),\ \|w\|_0 \le s'}\ \sup_{t \in \mathbb{R}}\ \left| P\left(\frac{\sqrt{n}\, S_D(0)}{\sigma} \le t\right) - \Phi(t + \sigma C) \right| \lesssim \frac{(s \vee s')\log d}{\sqrt{n}},$$
a noncentral normal distribution, where $\Theta_1(C) = \{(\theta, \gamma): \theta = Cn^{-1/2},\ \|\gamma\|_0 \le s\}$.

25. Data-Driven Tuning Parameters
Assume the tuning parameters $\lambda$ and $\mu$ are chosen from a grid $M = \{a_1, \dots, a_M\}\sqrt{\log d/n}$.
Theorem [Normal approximation for data-driven tuning parameters]. Under the same conditions, for any data-dependent tuning parameters $\hat\lambda, \hat\mu: \mathcal{X} \to M$, it holds that
$$\sup_{\|\beta\|_0 \le s,\ \|w\|_0 \le s'}\ \sup_{t \in \mathbb{R}}\ \left| P\left(\frac{\sqrt{n}\, S_{D,\hat\lambda,\hat\mu}(0)}{\sigma} \le t\right) - \Phi(t) \right| = o(1).$$

26. Other Examples
- Graphical models (GGM)
- Classification (LDA)
- Survival analysis (AH/PH models)

27. Gaussian Graphical Model
Model: $X = (X_1, \dots, X_d) \sim N(0, \Sigma)$, with parameter of interest $\theta = (\Sigma^{-1})_{jk}$ and $\beta = (\Sigma^{-1})_{\cdot k}$, the $k$-th column of the precision matrix.
1. Compute the initial estimator $\hat\beta = \arg\min_\beta\{\tfrac{1}{2}\beta^\top \hat\Sigma \beta - e_k^\top \beta + \lambda\|\beta\|_1\}$.
2. Compute $\hat w = \arg\min \|w\|_1$ s.t. $\|\hat\Sigma_{12} - w^\top \hat\Sigma_{22}\|_\infty \le \lambda'$.
3. Compute the decorrelated score function $S_D(\theta) = (1, -\hat w^\top)(\hat\Sigma\hat\beta - e_k)$, where $\hat\beta = (\theta, \hat\beta_{-1})$.
The construction can be extended to cluster-based graphical models (Bunea et al.).
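A numpy sketch of this construction, assuming the tested entry is $\Omega_{0k}$ for some $k \neq 0$ (coordinates reordered so the entry of interest sits in row 0). Both $\ell_1$ problems are solved by plain proximal gradient (ISTA), and step 2 uses the Lagrangian Lasso surrogate of the Dantzig-type constraint; these are illustrative choices, not the authors' code.

```python
import numpy as np

def l1_quadratic(Sig, b, lam, iters=500):
    """Minimize 0.5 x^T Sig x - b^T x + lam * ||x||_1 by ISTA."""
    step = 1.0 / np.linalg.eigvalsh(Sig)[-1]   # 1 / Lipschitz constant
    x = np.zeros(b.size)
    for _ in range(iters):
        z = x - step * (Sig @ x - b)           # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return x

def ggm_decorrelated_score(X, k, lam=0.1, lam2=0.1):
    n, d = X.shape
    Sig = X.T @ X / n
    e_k = np.zeros(d)
    e_k[k] = 1.0
    beta = l1_quadratic(Sig, e_k, lam)               # estimates Omega e_k
    w = l1_quadratic(Sig[1:, 1:], Sig[1:, 0], lam2)  # nodewise surrogate for w-hat
    beta0 = beta.copy()
    beta0[0] = 0.0                                   # impose theta = 0 under H0
    v = np.concatenate(([1.0], -w))
    return v @ (Sig @ beta0 - e_k)  # S_D(0); standardize by sqrt(n)/sigma-hat to test
```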

28. Inference on Misspecified Models
Working model: $f(x; \theta, \gamma)$, which need not contain the true density $f_o$. The target $(\theta_o, \gamma_o)$ minimizes the KL divergence between $f_o$ and the working model. Then
$$n^{1/2}(\hat\theta_D - \theta_o) \to_d N(0, A^{-1}BA^{-1}),$$
the sandwich variance.
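For reference, a generic plug-in sketch of the sandwich variance $A^{-1}BA^{-1}$: given per-observation score contributions and an averaged Hessian from any fitted working model, it estimates the asymptotic variance of the estimator. This is the standard misspecification-robust formula, not code specific to $\hat\theta_D$.

```python
import numpy as np

def sandwich_variance(scores, hessian):
    """scores: (n, p) per-observation score vectors; hessian: (p, p) average Hessian A."""
    n = scores.shape[0]
    B = scores.T @ scores / n          # empirical second moment of the score
    A_inv = np.linalg.inv(hessian)
    return A_inv @ B @ A_inv / n       # estimated covariance of the estimator
```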

29-31. FDR Control
1. Calculate p-values $p_j$ for $H_{0j}: \beta_j = 0$ vs $H_{1j}: \beta_j \neq 0$, for $1 \le j \le d$.
2. Bound the FDR for a cutoff $u \in (0, 1)$:
$$\mathrm{FDR}(u) = \frac{\sum_{j \in \mathcal{H}_0} I(p_j \le u)}{\max\{\sum_{j=1}^d I(p_j \le u),\ 1\}},$$
where the set of null hypotheses $\mathcal{H}_0$ is unknown. Since $|\mathcal{H}_0| \le d$ (a bound that can be improved by a two-stage procedure), estimate
$$\widehat{\mathrm{FDR}}(u) = \frac{u\,d}{\max\{\sum_{j=1}^d I(p_j \le u),\ 1\}}.$$

32-33. Main Results on FDR Control
Given a level $\alpha$, calculate the cutoff
$$\hat u_\alpha = \sup\{0 < u < 1: \widehat{\mathrm{FDR}}(u) \le \alpha\}.$$
Theorem [FDR control for GLM]. Under the same conditions, if $s \to \infty$ and $d \lesssim n^q$ for some $q > 1$, then $\mathrm{FDR}(\hat u_\alpha) \le \alpha$, in probability, as $n \to \infty$.
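A minimal sketch computing $\hat u_\alpha$ from the p-values, evaluating $\widehat{\mathrm{FDR}}(u)$ at the observed p-values (which yields the same rejection set as scanning all $u \in (0,1)$, and coincides with Benjamini-Hochberg). The function name `fdr_cutoff` is hypothetical.

```python
import numpy as np

def fdr_cutoff(pvals, alpha=0.1):
    """Return u-hat = sup{u : FDR-hat(u) <= alpha}, evaluated at observed p-values."""
    p = np.sort(np.asarray(pvals))
    d = p.size
    fdr_hat = p * d / np.arange(1, d + 1)   # FDR-hat at u = p_(k) is p_(k)*d/k
    ok = np.nonzero(fdr_hat <= alpha)[0]
    return p[ok.max()] if ok.size else 0.0

# Reject H_{0j} whenever p_j <= fdr_cutoff(pvals, alpha).
```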

34. A Simulation Study (Revisited)
Model: $Y = X\beta + \epsilon$ with $n$ samples, $d$ covariates, $s$-sparse $\beta$, and $\beta_1 = 0$. Here $n = 100$, $d = 500$, and $s = 4$.

35. Empirical Null Distribution
$H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$.
[Histograms of null p-values: MLE post model selection vs the proposed method; the proposed method's p-values are approximately uniform.]

36. Empirical FDR
[Plots of empirical FDR against the target FDR: BH + MLE post model selection vs the proposed method.]

37-38. Summary
- High-dimensional hypothesis testing is a common problem.
- We propose a unified framework to control testing errors in high-dimensional models.
- The resulting p-values can be combined to control the FDR in high-dimensional multiple hypothesis testing.
Reference: Ning & Liu, "A General Theory of Hypothesis Tests and Confidence Regions for Sparse High Dimensional Models," Annals of Statistics, 2017.

39. Thank You
