Consistent change point estimation
Chi Tim Ng (Chonnam National University), Woojoo Lee (Inha University), Youngjo Lee (Seoul National University)


1 Consistent change point estimation. Chi Tim Ng (Chonnam National University), Woojoo Lee (Inha University), Youngjo Lee (Seoul National University)

2 Outline of presentation
- Change point problem = variable selection problem
- Penalized likelihood methods: lasso (Tibshirani, 1996), SCAD (Fan and Li, 2001), bridge (Frank and Friedman, 1993), unbounded (Lee and Oh, 2009)
- New theory of selection consistency: all local solutions are consistent!
- Simulation studies

5 Change point problem
Home Depot stock returns ( ). Returns: r_t = log(S_t / S_{t-1}).
Subseries: Sub I ( ), Sub II ( ), Sub III ( ).
The lengths of the full series and of each subseries are, respectively, 1510, 502, 506, and 502.
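To fix notation, a tiny sketch of my own (not from the slides; the price series S below is made up) of computing the returns r_t = log(S_t / S_{t-1}) and a subseries standard deviation:

```python
import numpy as np

# Hypothetical adjusted-close prices S_t (illustrative values only).
S = np.array([40.1, 40.8, 40.5, 41.2, 40.9, 41.6])
r = np.diff(np.log(S))    # r_t = log(S_t / S_{t-1})
print(r, r.std(ddof=1))   # returns and their standard deviation
```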

6 [Figure: adjusted close price (Adj Close) of Home Depot]

7 [Figure: rate of return series]

8 Change point problem
[Table: standard deviations of the full series and of Sub I, Sub II, Sub III; numeric entries lost in extraction.]

9 Change point problem
Other applications:
- Meteorology: Beaulieu, Chen, and Sarmiento (2012)
- Engineering: Gillet, Essid, and Richard (2007); Yao (1987)
- Econometrics: Perron (2005)

10 Change point problem
Existing approaches:
- Cumulative sum method (CUSUM): Inclán and Tiao (1994); Lee, Ha, Na, and Na (2003); Kokoszka and Leipus (2003)
- Maximum likelihood + dynamic programming: Bai, Lumsdaine, and Stock (1998); Bai and Perron (1998)
- Bayesian method: Lai and Xing (2013)

11 Change point problem = variable selection
Model: X_t = μ_t + ε_t, t = 1, 2, ..., n, where the ε_t are independent N(0, σ²).
Re-parameterization:
ξ_1 = μ_1 − μ_2, ξ_2 = μ_2 − μ_3, ..., ξ_{n−1} = μ_{n−1} − μ_n, ξ_n = μ_n.
Inverting,
μ_1 = ξ_1 + ... + ξ_n, μ_2 = ξ_2 + ... + ξ_n, ..., μ_{n−1} = ξ_{n−1} + ξ_n, μ_n = ξ_n.

12 Change point problem = variable selection
Likelihood: (μ̂_1, ..., μ̂_n) = arg max { −(1/2) Σ_{t=1}^n (X_t − μ_t)² }.
Equivalently, (ξ̂_1, ..., ξ̂_n) = arg max { −(1/2) ‖X − Uξ‖² },
where X = (X_1, ..., X_n)ᵀ, ε = (ε_1, ..., ε_n)ᵀ, and U is the n × n upper-triangular matrix of ones (so that μ = Uξ).
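To make the reparameterization concrete, a small illustration of my own (not from the slides): U has ones on and above the diagonal, so μ = Uξ and a nonzero ξ_t with t < n flags a change at time t.

```python
import numpy as np

n = 5
U = np.triu(np.ones((n, n)))                # U[t, s] = 1 for s >= t, so mu = U @ xi
mu = np.array([1.0, 1.0, 2.0, 2.0, 2.0])    # mean path with one change point
xi = np.linalg.solve(U, mu)                 # xi_t = mu_t - mu_{t+1}, xi_n = mu_n
print(xi)                                   # [ 0. -1.  0.  0.  2.]: change at t = 2
```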

13 Change point problem = variable selection
Lasso penalized likelihood estimation:
(ξ̂_1, ..., ξ̂_n) = arg max { −(1/2) Σ_{t=1}^n (X_t − ξ_t − ξ_{t+1} − ... − ξ_n)² − λ Σ_{t=1}^{n−1} |ξ_t| }.
Equivalently,
(μ̂_1, ..., μ̂_n) = arg max { −(1/2) Σ_{t=1}^n (X_t − μ_t)² − λ Σ_{t=1}^{n−1} |μ_t − μ_{t+1}| }.
Computation algorithm?

15 Change point problem = variable selection
Local quadratic approximation (Fan and Li, 2001; Hunter and Li, 2005):
|ξ| ≈ |ξ_old|/2 + ξ²/(2|ξ_old|), perturbed to |ξ| ≈ |ξ_old|/2 + ξ²/(2(δ + |ξ_old|)).
O(n) iterative algorithm:
arg max_μ { −(1/2) Σ_{t=1}^n (X_t − μ_t)² − λ Σ_{t=1}^{n−1} (μ_t − μ_{t+1})² / (2(δ + |μ_t^old − μ_{t+1}^old|)) }.
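A minimal sketch of this O(n) LQA iteration (my own illustration, not the authors' code): each pass maximizes the quadratic surrogate, i.e., solves the symmetric tridiagonal system (I + DᵀWD)μ = X, where D is the first-difference matrix and W = diag(λ/(δ + |μ_t^old − μ_{t+1}^old|)).

```python
import numpy as np
from scipy.linalg import solveh_banded

def lqa_fused_lasso(x, lam, delta=1e-8, n_iter=100, tol=1e-10):
    """LQA iteration (Hunter-Li perturbed) for
    min_mu 0.5*sum((x - mu)^2) + lam*sum(|mu_t - mu_{t+1}|)."""
    n = len(x)
    mu = np.asarray(x, dtype=float).copy()  # start from the saturated fit
    for _ in range(n_iter):
        d = np.diff(mu)                     # mu_{t+1} - mu_t, length n-1
        w = lam / (delta + np.abs(d))       # LQA weights lam/(delta + |d_old|)
        # Normal equations (I + D^T diag(w) D) mu = x, assembled in banded form.
        ab = np.zeros((2, n))
        ab[1] = 1.0                         # main diagonal: identity part
        ab[1, :-1] += w                     # + w_t from the (mu_t - mu_{t+1})^2 terms
        ab[1, 1:] += w
        ab[0, 1:] = -w                      # superdiagonal entries
        mu_new = solveh_banded(ab, x)       # O(n) tridiagonal solve
        if np.max(np.abs(mu_new - mu)) < tol:
            return mu_new
        mu = mu_new
    return mu
```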

16 Change point problem = variable selection
Example: X = (0.12, 0.16, 0.76, 0.80).
λ = 0.00: ξ̂ = (−0.04, −0.60, −0.04, 0.80), i.e. μ̂ = (0.12, 0.16, 0.76, 0.80)
λ = 0.01: ξ̂ = (−0.03, −0.60, −0.03, 0.79), i.e. μ̂ = (0.13, 0.16, 0.76, 0.79)
λ = 0.06: ξ̂ = (0.00, −0.58, 0.00, 0.75), i.e. μ̂ = (0.17, 0.17, 0.75, 0.75)
λ = 1.00: ξ̂ = (0.00, 0.00, 0.00, 0.46), i.e. μ̂ = (0.46, 0.46, 0.46, 0.46)
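Running the sketch above on this data should approximately reproduce the slide's solutions (up to the small δ perturbation):

```python
import numpy as np

x = np.array([0.12, 0.16, 0.76, 0.80])
for lam in [0.00, 0.01, 0.06, 1.00]:
    print(lam, np.round(lqa_fused_lasso(x, lam), 2))
# expected (approximately) the mu-hat vectors listed above
```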

18 Change point problem = variable selection
Alternative penalized likelihood estimators:
(μ̂_1, ..., μ̂_n) = arg max { −(1/2) Σ_{t=1}^n (X_t − μ_t)² − Σ_{t=1}^{n−1} P_λ(|μ_t − μ_{t+1}|) }.
SCAD:
P_λ(z) = λ|z| for |z| ≤ n⁻¹λ;
P_λ(z) = −(nz² − 2aλ|z| + n⁻¹λ²) / [2(a − 1)] for n⁻¹λ < |z| ≤ a n⁻¹λ;
P_λ(z) = (a + 1) n⁻¹λ² / 2 for |z| > a n⁻¹λ.
Fan and Li (2001) suggest a = 3.7.
Bridge: P_λ(z) = λ|z|^γ, 0 < γ < 1.
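For reference, a direct transcription of these two penalties into code (my sketch; the n⁻¹λ thresholds follow the slide's parameterization):

```python
import numpy as np

def scad_penalty(z, lam, n, a=3.7):
    """SCAD penalty with knots at lam/n and a*lam/n, as on slide 18."""
    z, t = np.abs(z), lam / n
    return np.where(z <= t, lam * z,
           np.where(z <= a * t,
                    -(n * z**2 - 2 * a * lam * z + t * lam) / (2 * (a - 1)),
                    (a + 1) * lam * t / 2))

def bridge_penalty(z, lam, gamma=0.5):
    """Bridge penalty lam * |z|^gamma with 0 < gamma < 1."""
    return lam * np.abs(z) ** gamma
```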

19 Change point problem = variable selection
Unbounded penalty (Lee and Oh, 2009):
P_λ(z) = λ { log Γ(1/τ) + (log τ)/τ + ((τ − 2)/(2τ)) log g(z²; τ, ν) + z² / (2ν g(z²; τ, ν)) + g(z²; τ, ν)/τ },
where τ > 2, ν > 0, and g(z²; τ, ν) = (1/4) [ (2 − τ) + √( (2 − τ)² + 8τz²/ν ) ].

21 Change point problem = variable selection
Local quadratic approximation (Fan and Li, 2001): approximate P_λ(|ξ|) ≈ b_0 + b_1 ξ², matching P_λ(|ξ|) and P′_λ(|ξ|) at ξ_old. That is,
P_λ(|ξ|) ≈ { P_λ(|ξ_old|) − (1/2) P′_λ(|ξ_old|) |ξ_old| } + [ P′_λ(|ξ_old|) / (2|ξ_old|) ] ξ².
Modification to avoid the singularity at zero (Hunter and Li, 2005):
P_λ(|ξ|) ≈ { P_λ(|ξ_old|) − (1/2) P′_λ(|ξ_old|) |ξ_old| } + [ P′_λ(|ξ_old|) / (2(δ + |ξ_old|)) ] ξ².
Computational complexity: O(n).
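The same O(n) iteration therefore works for any of these penalties once its derivative is supplied; a sketch of the perturbed quadratic coefficient b_1 from this slide, with `dP` a user-supplied derivative P′_λ(·):

```python
def lqa_weight(xi_old, dP, delta=1e-8):
    """b_1 in the Hunter-Li surrogate P(|xi|) ~ b_0 + b_1 * xi**2,
    with b_1 = P'(|xi_old|) / (2 * (delta + |xi_old|))."""
    z = abs(xi_old)
    return dP(z) / (2.0 * (delta + z))
```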

22 Evaluation of penalty functions
- Gradual change vs. abrupt change
- Uniqueness of the local solution
- Oracle: AT LEAST ONE of the local maxima is good
- True identification: ALL local maxima are good

23 Evaluation of penalty functions
Technical definition of a local maximum. Let Q be the penalized likelihood function and ∇_i the gradient operator with respect to μ_i. Then μ̂ is said to be a local maximum if there exists a neighborhood N(μ̂) such that for all μ ∈ N(μ̂) \ {μ : μ_i = μ̂_i, i = 1, 2, ..., n}, we have Σ_{i=1}^n (μ_i − μ̂_i)ᵀ ∇_i Q(μ) < 0.
This definition applies even to the bridge and unbounded penalties, for which P′_λ(0+) = ∞.

25 Evaluation of penalty functions
Gradual change (one change or five changes?):
μ̂ = (1, 1, 1, 1, 1, 1, 1.2, 1.4, 1.6, 1.8, 2, 2, 2, 2, 2)
Abrupt change:
μ̂ = (1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2)

26 Evaluation of penalty functions: gradual change
Theorem. Suppose that the true model has only one change point. Let K_L ∈ {0, ±1, ±2, ...} be a given constant, let k > 1, and let P_k be the probability that there exists a local penalized likelihood solution with k > 1 changes, located at t^(1) (= [nq^(1)] + K_L), t^(1) + 1, ..., t^(1) + k − 1. Then,
(i) for the unbounded and bridge penalties, P_k → 0;
(ii) for the lasso with λ = O(n^α) and α ≤ 1/2, P_k → C for some constant C < 1;
(iii) for SCAD with n^(−1/2)λ → ∞, P_k → 1.

27 Evaluation of penalty functions: oracle property
Oracle: AT LEAST ONE of the local maxima is good.
True model: changes at [nq^(1)], [nq^(2)], ..., [nq^(k)].
Estimate: changes at t^(1), t^(2), ..., t^(k̂).

28 Evaluation of penalty functions: oracle property
Theorem. Suppose that the true model has k > 0 change points and k is finite. Let P_k be the probability that some local solution recovers the true change points. Then,
(i) for the lasso with λ = O(n^α) and α ≤ 1/2, P_k → 0;
(ii) for SCAD with λ = O(n^(1/2)), P_k → C ∈ (0, 1);
(iii) for SCAD with λ = O(n^α) and α > 1/2, the bridge penalty with α > γ/2, and the unbounded penalty with α > 0, P_k → 1.

29 Evaluation of penalty functions: true identification
True identification: ALL local maxima are good.
Penalized likelihood: convex (lasso) vs. non-convex (SCAD, bridge, unbounded).
Inference on the global maximum, e.g., Kim and Kwon (2012), vs. inference on the set of all local maxima.

30 Evaluation of penalty functions: true identification
Theorem. Suppose that the true model has k changes and k is finite. For the bridge and unbounded penalties, under some regularity conditions, the probability that all local solutions (fulfilling some criteria) have k ≤ k̂ ≤ 2k goes to one.
Ideal: all local solutions have k̂ = k.

33 Modified unbounded penalty
Defined as
P_{λ,λ′}(z) = P_λ(z) if |z| > B, and P_{λ,λ′}(z) = P_λ(B) − λ′(B − |z|) otherwise,
where P_λ(z) is the unbounded penalty function. Here λ′ = n^β for some 1/2 < β < 1, and B is chosen such that n⁻¹P_λ(B) → 0.
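A sketch of this construction (assuming some implementation `P` of the unbounded penalty): below the threshold B the penalty is continued linearly with slope λ′, so the singular behavior at the origin is removed while the penalty is unchanged above B.

```python
def modified_penalty(z, P, B, lam_prime):
    """P_{lam,lam'}(z): equal to P(z) for |z| > B, linear with slope
    lam' below B, and continuous at |z| = B."""
    z = abs(z)
    return P(z) if z > B else P(B) - lam_prime * (B - z)
```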

34 Modified unbounded penalty
Theorem. Suppose that the true model has k changes and k is finite. Then, for the modified unbounded penalty, under some regularity conditions:
- Consistency in the number of changes: the probability that all local solutions (fulfilling some criteria) have k̂ = k goes to one.
- Consistency in the change dates: n⁻¹ |t^(l) − [nq^(l)]| → 0 for l = 1, 2, ..., k.
- Consistency in the parameter estimates: |μ̂_i − μ_i| → 0 for i = 1, 2, ..., n.

35 Modified unbounded penalty
True identification property = trinity of:
- consistency in the number of changes,
- consistency in the change dates,
- consistency in the parameter estimation.

36 Modified unbounded penalty
- Uniqueness: lasso
- No gradual change: bridge, unbounded, modified
- Oracle: SCAD, bridge, unbounded, modified
- True identification: modified

37 Simulation studies: k = 0
Penalties compared: lasso, SCAD, bridge, unbounded, and modified unbounded.
Settings: SCAD with a = 3.7; bridge with γ = 1/2; unbounded with τ = 30 and ν = 1; modified unbounded with τ = 30, ν = 1, B = 1/n, and λ′ = n^0.6. The Bayesian information criterion is used to select λ.
X_1, ..., X_n (n = 500, 1000) are generated independently from N(μ_i, 1); in the k = 0 setting the mean is constant (no change point).
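The slides do not spell out the criterion, so as an assumption this sketch uses a common BIC form, n·log(RSS/n) + df·log(n), with df the number of fitted segments; `lqa_fused_lasso` is the solver sketched earlier.

```python
import numpy as np

def select_lambda_bic(x, lambdas, tol=1e-4):
    """Grid-search lambda by an assumed BIC: n*log(RSS/n) + df*log(n).
    Use strictly positive lambdas so RSS > 0 at every fit."""
    n, best = len(x), None
    for lam in lambdas:
        mu = lqa_fused_lasso(x, lam)
        rss = np.sum((x - mu) ** 2)
        df = 1 + int(np.sum(np.abs(np.diff(mu)) > tol))  # segments = changes + 1
        bic = n * np.log(rss / n) + df * np.log(n)
        if best is None or bic < best[0]:
            best = (bic, lam, mu)
    return best[1], best[2]
```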

38 Simulation studies: k = 0
[Table: distribution of the estimated number of change points for each penalty (lasso, SCAD, bridge, unbounded, modified), n = 500 and n = 1000; numeric entries lost in extraction.]

39 Simulation studies: k = 0
Root mean square error: RMSE = √( (1/n) Σ_{t=1}^n (μ̂_t − μ_t)² ).

40 Simulation studies: k = 0
[Table: median and mean RMSE(μ̂) for each penalty (lasso, SCAD, bridge, unbounded, modified), n = 500 and n = 1000; numeric entries lost in extraction.]

41 Simulation studies: k = 1
X_1, ..., X_n (n = 500, 1000) are generated independently from N(μ_i, 1) with μ_i = 10 for 1 ≤ i ≤ n/2 and μ_i = 20 for n/2 + 1 ≤ i ≤ n.
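A sketch of this k = 1 design, run end to end with the functions above (the seed and the λ grid are my choices, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
mu_true = np.r_[np.full(n // 2, 10.0), np.full(n - n // 2, 20.0)]
x = mu_true + rng.standard_normal(n)                  # X_i ~ N(mu_i, 1)

lam, mu_hat = select_lambda_bic(x, np.linspace(0.5, 20.0, 40))
rmse = np.sqrt(np.mean((mu_hat - mu_true) ** 2))
print(lam, rmse)                                      # selected lambda and RMSE
```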

43 Simulation studies: k = 1
[Table: distribution of the estimated number of change points for each penalty (lasso, SCAD, bridge, unbounded, modified), n = 500 and n = 1000; numeric entries lost in extraction.]

44 Simulation studies: k = 1
Root mean square error: RMSE = √( (1/n) Σ_{t=1}^n (μ̂_t − μ_t)² ).

45 Simulation studies: k = 1
[Table: median and mean RMSE(μ̂) for each penalty (lasso, SCAD, bridge, unbounded, modified), n = 500 and n = 1000; numeric entries lost in extraction.]

46 Simulation studies: k = 4
(X_1, ..., X_1500) is generated independently from N(μ_i, 1) with μ_i = 10 for 1 ≤ i ≤ 300, 601 ≤ i ≤ 900, and 1201 ≤ i ≤ 1500, and μ_i = 20 for 301 ≤ i ≤ 600 and 901 ≤ i ≤ 1200.

48 Simulation studies: k = 4
[Table: distribution of the estimated number of change points for each penalty (lasso, SCAD, bridge, unbounded, modified); numeric entries lost in extraction.]

49 Simulation studies: k = 4
Root mean square error: RMSE = √( (1/n) Σ_{t=1}^n (μ̂_t − μ_t)² ).

50 Simulation studies: k = 4
[Table: median and mean RMSE(μ̂) for each penalty (lasso, SCAD, bridge, unbounded, modified); numeric entries lost in extraction.]
