Single Index Quantile Regression for Heteroscedastic Data


E. Christou, M. G. Akritas
Department of Statistics, The Pennsylvania State University
SMAC, November 6, 2015

Outline

1. Introduction: Motivation; Possible Models
2. Single Index Quantile Regression: Main Results
3. Numerical Studies
4. Variable Selection: Motivation; Main Results
5. Numerical Studies: Example 1; Example 2; Real Data Example
6. Conclusions
7. Future Work


Introduction - Motivation

Least Squares Regression: models the relationship between a d-dimensional vector of covariates X and the conditional mean of the response Y given X = x.

The least squares estimator:
1. provides only a single summary measure of the conditional distribution of the response given the covariates;
2. is sensitive to outliers and can perform very poorly for many non-Gaussian, especially long-tailed, distributions.

Introduction - Motivation

Quantile Regression (QR): models the relationship between a d-dimensional vector of covariates X and the τth conditional quantile of the response Y given X = x, denoted Q_τ(Y | x).

Quantile regression:
1. provides a more complete picture of the conditional distribution of the response given the covariates;
2. is useful for modeling data with heterogeneous conditional distributions, especially data where the extremes are important.



Introduction - Possible Models

Linear QR: Q_τ(Y | x) = β'x, for a d-dimensional vector of unknown parameters β.
Disadvantage: the linearity assumption is not always valid.

Nonparametric QR: Q_τ(Y | x) = g(x), for an unspecified function g : R^d → R.
Disadvantage: suffers from the data sparseness problem in high dimensions.

Single-Index QR: Q_τ(Y | x) = g(β'x | β) depends on x only through the single linear combination β'x.
Advantage: reduces the dimension while maintaining some nonparametric flexibility.

10 Outline 1 Introduction Motivation Possible Models 2 Single Index Quantile Regression Main Results 3 Numerical Studies 4 Variable Selection Motivation Main Results 5 Numerical Studies Example 1 Example 2 Real Data Example 6 Conclusions 7 Future Work E. Christou, M. G. Akritas (PSU) SIQR SMAC, November 6, / 63


The Model

Let {Y_i, X_i}_{i=1}^n be independent and identically distributed (iid) observations satisfying

    Y_i = Q_τ(Y | X_i) + ɛ_i,

where Q_τ(Y | x) = Q_τ(Y | β_1'x) and the error term satisfies Q_τ(ɛ_i | x) = 0.

For identifiability, we impose a condition on β_1:
- the most common one: ‖β_1‖ = 1 with its first coordinate positive;
- the one adopted here: β_1 = (1, β')', for β ∈ R^{d-1}.
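The adopted constraint can be imposed on any candidate direction by rescaling it so that its first coordinate equals 1. A minimal Python sketch of this rescaling (the helper name is illustrative, not from the talk):

```python
import numpy as np

def to_index_form(beta1):
    """Rescale a direction vector so its first coordinate is 1,
    matching the identifiability constraint beta_1 = (1, beta')'."""
    beta1 = np.asarray(beta1, dtype=float)
    if beta1[0] == 0:
        raise ValueError("first coordinate must be nonzero under this constraint")
    return beta1 / beta1[0]

b = to_index_form([2.0, 4.0, -1.0])  # equals [1., 2., -0.5]
```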

The Model

The true parameter vector β satisfies

    β = arg min_b E[ρ_τ(Y − Q_τ(Y | b_1'X))],

where b_1 = (1, b')' and ρ_τ(u) = (τ − I(u < 0))u.
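The check loss ρ_τ(u) = (τ − I(u < 0))u weights positive and negative residuals asymmetrically, which is what makes its minimizer a τ-quantile. A small illustrative implementation (the function name is an assumption):

```python
import numpy as np

def rho_tau(u, tau):
    """Check (pinball) loss: rho_tau(u) = (tau - I(u < 0)) * u."""
    u = np.asarray(u, dtype=float)
    return (tau - (u < 0)) * u

# An under-prediction of size 1 costs tau; an over-prediction costs 1 - tau.
print(rho_tau(1.0, 0.25), rho_tau(-1.0, 0.25))  # 0.25 0.75
```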

Iterative algorithms

As in the single index mean regression (SIMR) problem, the unknown Q_τ(Y | b_1'X) must be replaced by an estimator. Unlike in the SIMR problem, however, there is no closed-form expression for the estimator of Q_τ(Y | b_1'X_i), which has led to iterative algorithms for estimating β (Wu, Yu, and Yu, 2010; Kong and Xia, 2012).


The New Approach

For any given b ∈ R^{d-1}, define the function g(· | b) : R → R as

    g(u | b) = E(Q_τ(Y | X) | b_1'X = u),

where b_1 = (1, b')'. Noting that g(β_1'X | β) = Q_τ(Y | X), β also satisfies

    β = arg min_b E[ρ_τ(Y − g(b_1'X | b))].   (1)


The New Approach

The sample-level version of (1) consists of minimizing

    S_n(τ, b) = Σ_{i=1}^n ρ_τ(Y_i − g(b_1'X_i | b)).

Again, g(· | b) is unknown, but it can be estimated in a non-iterative fashion by first obtaining estimators Q̂_τ(Y | X_i), for i = 1, ..., n, and forming the Nadaraya-Watson-type estimator

    ĝ_Q^NW(t | b) = Σ_{i=1}^n Q̂_τ(Y | X_i) K((t − b_1'X_i)/h) / Σ_{k=1}^n K((t − b_1'X_k)/h),   (2)

where K(·) is a univariate kernel function and h is a bandwidth.
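The Nadaraya-Watson step in (2) is an ordinary kernel-weighted average of the plug-in quantile estimates against the index values b_1'X_i. A sketch with a Gaussian kernel (the kernel choice and function name are assumptions for illustration):

```python
import numpy as np

def nw_estimate(t, index_vals, qhat, h):
    """Nadaraya-Watson smoother of plug-in quantile estimates qhat_i
    against the index values b1'X_i, evaluated at t (Gaussian kernel)."""
    w = np.exp(-0.5 * ((t - index_vals) / h) ** 2)
    return np.sum(w * qhat) / np.sum(w)
```

If all plug-in estimates are equal, the weighted average returns that common value, as a quick sanity check.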

The New Approach

For the estimator Q̂_τ(Y | X_i), we use the local linear conditional quantile estimator (Guerre and Sabbah, 2012). Specifically, for a multivariate kernel function K(x) = K(x_1, ..., x_d) and a univariate bandwidth h, let

    L_n((α_0, α_1); τ, x) = (1/(nh^d)) Σ_{i=1}^n ρ_τ(Y_i − α_0 − α_1'(X_i − x)) K((X_i − x)/h)

and define Q̂_τ(Y | x) as α̂_0(τ; x), where

    (α̂_0(τ; x), α̂_1(τ; x)) = arg min_{(α_0, α_1)} L_n((α_0, α_1); τ, x).
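For intuition, with a one-dimensional covariate the local linear conditional quantile estimator minimizes a kernel-weighted check loss in (α_0, α_1) and returns α̂_0. An illustrative sketch using a Gaussian kernel and a derivative-free optimizer (these implementation details are assumptions; a dedicated quantile-regression solver would be used in practice):

```python
import numpy as np
from scipy.optimize import minimize

def local_linear_quantile(x0, X, Y, tau, h):
    """Local linear conditional quantile estimate of Q_tau(Y | x0):
    minimize sum_i rho_tau(Y_i - a0 - a1*(X_i - x0)) * K((X_i - x0)/h)
    over (a0, a1), returning a0 (1-d covariate, Gaussian kernel)."""
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)
    def obj(ab):
        u = Y - ab[0] - ab[1] * (X - x0)
        return np.sum(((tau - (u < 0)) * u) * w)
    res = minimize(obj, x0=[np.median(Y), 0.0], method="Nelder-Mead")
    return res.x[0]
```

For Y = 2X plus small noise, the τ = 0.5 estimate at x0 = 0.5 should sit near 1.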

The New Approach

The proposed estimator is obtained as

    β̂ = arg min_{b ∈ Θ} Σ_{i=1}^n ρ_τ(Y_i − ĝ_Q^NW(b_1'X_i | b)),   (3)

where Θ ⊂ R^{d-1} is a compact set and β is in the interior of Θ.



Main Results

Proposition 1. Let ĝ_Q^NW(t | b) be as defined in (2). Under some regularity conditions,

    sup_{b ∈ Θ, t ∈ T_b} |ĝ_Q^NW(t | b) − g(t | b)| = O_p(a_n + a_n* + h²),

where T_b = {t : t = b_1'x, x ∈ X_0}, a_n = (log n / n)^{s/(2s+d)} and a_n* = (log n / (nh²))^{1/2}.

Proposition 2. Let β̂ be as defined in (3). Under some regularity conditions, β̂ is a √n-consistent estimator of β.

Main Results

Theorem. Let β̂ be as defined in (3). Under the assumptions of Proposition 2,

    √n(β̂ − β) →_d N(0, τ(1 − τ) V^{-1} Σ V^{-1}),

where

    V = E[(g'(β_1'X | β))² (X_{-1} − E(X_{-1} | β_1'X))(X_{-1} − E(X_{-1} | β_1'X))' f_{ɛ|X}(0 | X)],

    Σ = E[(g'(β_1'X | β))² (X_{-1} − E(X_{-1} | β_1'X))(X_{-1} − E(X_{-1} | β_1'X))'],

for g'(t | b) = (∂/∂t) g(t | b) and X_{-1} the (d − 1)-dimensional vector obtained by removing the first coordinate of X.

Numerical Studies - Notation

NWQR: the proposed estimator.
WYY: the estimator resulting from the iterative algorithm proposed by Wu et al. (2010).
WYY-1: the estimator resulting from dividing the estimator of Wu et al. (2010) by its first component.
WYY-2: the estimator resulting from using the proposed constraint in conjunction with the iterative algorithm of Wu et al. (2010).


Numerical Studies - Example 1

Consider the model

    Y = exp(β_1'X) + ɛ,   (4)

where X = (X_1, X_2)', the X_i ~ N(0, 1) are iid, β_1 = (1, 2)', and the residual ɛ follows a standard normal distribution.

We fit the single index quantile regression model for five quantile levels, τ = 0.1, 0.25, 0.5, 0.75, 0.9. We use a sample size of n = 400 and perform N = 100 replications.

Numerical Studies - Example 1

To compare the above estimators, consider the mean squared error, R(β̂), and the mean check-based absolute residuals, R_τ(Q̂^LL), defined as

    R(β̂) = (1/N) Σ_{i=1}^N (β̂^{(i)} − 2)²,

    R_τ(Q̂^LL) = (1/n) Σ_{i=1}^n ρ_τ(Y_i − Q̂_τ^LL(Y | β̂_1'X_i)),

where Q̂_τ^LL(Y | β̂_1'x) is the local linear conditional quantile estimator based on the univariate variable β̂_1'X_i.
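Both performance measures are straightforward to compute from simulation output; a minimal sketch (function names are assumptions):

```python
import numpy as np

def mean_check_residuals(Y, qhat, tau):
    """R_tau: mean check-based absolute residuals (1/n) sum rho_tau(Y_i - qhat_i)."""
    u = np.asarray(Y, dtype=float) - np.asarray(qhat, dtype=float)
    return np.mean((tau - (u < 0)) * u)

def mse_over_replications(beta_hats, beta_true):
    """R(beta_hat): average squared error of the estimates over N replications."""
    b = np.asarray(beta_hats, dtype=float)
    return np.mean((b - beta_true) ** 2)
```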

Numerical Studies - Example 1

Table 1: mean values and standard errors (in parentheses) of the estimated coefficient for NWQR, WYY-1, and WYY-2, together with R(β̂), R_τ(Q̂^LL), and the 95% coverage probability for NWQR, WYY-2, and the second estimated coefficient of Wu et al. (2010), for Model (4) at τ = 0.1, 0.25, 0.5, 0.75, 0.9. [Table entries omitted.]



Variable Selection - Motivation

- Ubiquity of high dimensional data.
- Including unnecessary variables vs. variable selection.
- Variable selection is also important in quantile regression.
- Large body of work. For example: Wang, Li, and Jiang (2007) and Wu and Liu (2009) for linear QR; Koenker (2011) for nonparametric/semiparametric QR; Alkenani and Yu (2013) for the SIQR model.

36 Outline 1 Introduction Motivation Possible Models 2 Single Index Quantile Regression Main Results 3 Numerical Studies 4 Variable Selection Motivation Main Results 5 Numerical Studies Example 1 Example 2 Real Data Example 6 Conclusions 7 Future Work E. Christou, M. G. Akritas (PSU) SIQR SMAC, November 6, / 63


The Model

Let {Y_i, X_i}_{i=1}^n be iid observations satisfying Y_i = Q_τ(Y | X_i) + ɛ_i, where Q_τ(Y | x) = Q_τ(Y | β_1'x) and the error term satisfies Q_τ(ɛ_i | x) = 0.

Sparsity Assumption: X_i = (X_i1', X_i2')', with X_i1 ∈ R^{d*} and X_i2 ∈ R^{d−d*}, where d* ≤ d is an integer specifying the d* relevant variables. Correspondingly, β_1 = (1, β')' = (1, β_11', β_12')' = (1, β_11', 0')', where β_11 ∈ R^{d*−1} is the nonzero parameter vector and β_12 ∈ R^{d−d*} is the zero vector corresponding to the irrelevant variables.


Iterations

The true parameter vector β satisfies

    β = arg min_b E[ρ_τ(Y − Q_τ(Y | b_1'X))],

where b_1 = (1, b')' and ρ_τ(u) = (τ − I(u < 0))u.

Problem: the same difficulty arises. Existing algorithms are iterative, computationally expensive, and prone to convergence issues.


New Approach

The proposed estimator is obtained as

    β̂ = arg min_{b ∈ Θ} Σ_{i=1}^n ρ_τ(Y_i − ĝ_Q^NW(b_1'X_i | b)),

where ĝ_Q^NW is the Nadaraya-Watson-type estimator, Θ ⊂ R^{d−1} is a compact set, and β is in the interior of Θ. Recall,

    ĝ_Q^NW(t | b) = Σ_{i=1}^n Q̂_τ(Y | X_i) K((t − b_1'X_i)/h) / Σ_{k=1}^n K((t − b_1'X_k)/h),

where Q̂_τ(Y | X_i) is the local linear conditional quantile estimator.


New Approach

Goals:
- For high dimensional data, it is possible to improve the estimation of the local linear conditional quantile estimators Q̂_τ(Y | X_i). Consider the use of a penalty term for variable selection when estimating Q_τ(Y | x).
- Consider a penalty term for producing the estimator of the parametric component β of the SIQR model.


New Approach

Steps:
- Step 1 can be thought of as an initial variable selection step used to achieve better estimation of the conditional quantiles Q_τ(Y | X_i).
- Step 2 uses these improved estimators to perform simultaneous variable selection and estimation of the parametric component of the SIQR model.

New Approach - Step 1

Specifically, for Step 1 we minimize with respect to b the objective function

    L_n(τ, b) = Σ_{i=1}^n ρ_τ(Y_i − ĝ_Q^NW(b_1'X_i | b)) + n Σ_{j=2}^d p_{λ_1}(|b_j|),   (5)

where b_1 = (1, b')' = (1, b_2, ..., b_d)' and λ_1 is the tuning parameter for the SCAD penalty p_{λ_1}(·).
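The SCAD penalty (Fan and Li, 2001) is linear near zero, quadratic in a middle range, and constant beyond aλ, so large coefficients are not shrunk. A sketch with the conventional choice a = 3.7 (an assumption here; the slides do not fix a):

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lambda(|theta|): linear on [0, lam], quadratic on
    (lam, a*lam], constant (a+1)*lam^2/2 beyond a*lam; requires a > 2."""
    t = np.abs(np.asarray(theta, dtype=float))
    small = t <= lam
    mid = (t > lam) & (t <= a * lam)
    p = np.empty_like(t)
    p[small] = lam * t[small]
    p[mid] = (2 * a * lam * t[mid] - t[mid] ** 2 - lam ** 2) / (2 * (a - 1))
    p[~small & ~mid] = (a + 1) * lam ** 2 / 2
    return p
```

The three pieces meet continuously at |θ| = λ and |θ| = aλ, which is easy to verify numerically.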

New Approach - Step 1

Denote by β̂^VS the minimizer of the objective function (5). This estimator can be used for variable selection: let Â_n = {j ∈ {2, ..., d} : β̂_j^VS ≠ 0}, of cardinality d* − 1.

Construct improved estimators, Q̂_τ^VS(Y | x), of the conditional quantiles. In particular, for x* the d*-dimensional vector of relevant covariates, define Q̂_τ^VS(Y | x*) as the local linear conditional quantile estimator based on the relevant variables only. Q̂_τ^VS(Y | x) indeed has a faster rate of convergence than Q̂_τ(Y | x).

New Approach - Step 2

For Step 2, we set

    ĝ_VS^NW(t | b) = Σ_{i=1}^n Q̂_τ^VS(Y | X_i) K((t − b_1'X_i)/h) / Σ_{k=1}^n K((t − b_1'X_k)/h)

and define the minimization problem

    Ŝ_n(τ, b) = Σ_{i=1}^n ρ_τ(Y_i − ĝ_VS^NW(b_1'X_i | b)) + n Σ_{j=2}^d p_{λ_2}(|b_j|),

where b_1 = (1, b')' = (1, b_2, ..., b_d)' and λ_2 > 0 is the tuning parameter for the SCAD penalty p_{λ_2}(·). The proposed estimator is obtained as

    β̂^SCAD = arg min_{b ∈ Θ} Ŝ_n(τ, b),   (6)

where Θ ⊂ R^{d−1} is a compact set assumed to contain the true value of β.


Comments

Some comments:
- The SCAD penalty is motivated by the fact that directly applying an L_1 penalty tends to include inactive variables and to introduce bias. The SCAD penalty also possesses the oracle property, defined shortly.
- An alternative algorithm, in which Step 1 is replaced by local variable selection for the computation of Q̂_τ^VS(Y | x), was also used in some simulations and performed equally well in the settings tried. While such an algorithm may be more appropriate for very high dimensional data, it is time consuming to run.


Main Results

Theorem (Model Selection Consistency). Let β̂^VS = (β̂_2^VS, ..., β̂_d^VS)' be the minimizer of the objective function L_n(τ, b) defined in (5). Define Â_n = {j ∈ {2, ..., d} : β̂_j^VS ≠ 0} and A = {j ∈ {2, ..., d} : β_j ≠ 0}. Under some regularity conditions and the conditions λ_1 → 0 and √n λ_1 → ∞ as n → ∞, we have

    lim_{n→∞} P(Â_n = A) = 1.


Main Results

Proposition 1. Let Q̂_τ^VS(Y | x) denote the local linear conditional quantile estimator based on the relevant variables. Then, under some regularity assumptions,

    sup_{x ∈ X_0} |Q̂_τ^VS(Y | x) − Q_τ(Y | x)| = O_p((log n / n)^{s/(2s+d*)}) = O_p(a_n*)

if the bandwidth h is asymptotically proportional to (log n / n)^{1/(2s+d*)}.

Proposition 2. Let β̂^SCAD be as defined in (6). Then, under the assumptions of Proposition 1 and the assumption that λ_2 → 0 as n → ∞, β̂^SCAD is a √n-consistent estimator of β.

Main Results

Theorem (Oracle Property). Let β̂^SCAD be as defined in (6) and write β̂^SCAD = (β̂_11^SCAD', β̂_12^SCAD')', where β̂_11^SCAD is of cardinality d* − 1. If the assumptions of Proposition 1 hold and moreover λ_2 → 0 and √n λ_2 → ∞ as n → ∞, then with probability tending to one,

1. Sparsity: β̂_12^SCAD = 0.
2. Asymptotic Normality: √n(β̂_11^SCAD − β_11) →_d N(0, τ(1 − τ) V_11^{-1} Σ_11 V_11^{-1}), where

    V_11 = E[(g'(β_1'X_1 | β))² (X_{1,−1} − E(X_{1,−1} | β_1'X))(X_{1,−1} − E(X_{1,−1} | β_1'X))' f_{ɛ|X}(0 | X)],

    Σ_11 = E[(g'(β_1'X_1 | β))² (X_{1,−1} − E(X_{1,−1} | β_1'X))(X_{1,−1} − E(X_{1,−1} | β_1'X))'],

for g'(t | b) = (∂/∂t) g(t | b) and X_{1,−1} the (d* − 1)-dimensional vector of relevant covariates excluding the first.

Numerical Studies

NWQR: the unpenalized estimator.
SCAD-NWQR: the proposed estimator.
LASSO-AY: the LASSO estimator of Alkenani and Yu (2013).
ALASSO-AY: the adaptive-LASSO estimator of Alkenani and Yu (2013).

Numerical Studies

How to evaluate the performance?
- size of the estimated parametric component (number of nonzero coefficients),
- number of its correct and incorrect zeros,
- absolute estimation error of the estimated parametric component, and
- mean squared error (MSE) of β̂_1^SCAD'X, defined as

    MSE(β̂_1^SCAD'X) = (1/n) Σ_{i=1}^n (β̂_1^SCAD'X_i − β_1'X_i)².

For the final conditional quantile estimator Q̂_τ^LL(Y | β̂_1^SCAD'X_i), we consider the mean check-based absolute residuals

    R_τ(Q̂_τ^LL) = (1/n) Σ_{i=1}^n ρ_τ(Y_i − Q̂_τ^LL(Y | β̂_1^SCAD'X_i)).
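The size and the counts of correct and incorrect zeros can be tabulated per replication with a small helper (the function name is hypothetical):

```python
import numpy as np

def selection_summary(beta_hat, beta_true):
    """Size (nonzero count) and numbers of correct and incorrect zeros
    of an estimated coefficient vector relative to the truth."""
    bh = np.asarray(beta_hat, dtype=float)
    bt = np.asarray(beta_true, dtype=float)
    size = int(np.sum(bh != 0))
    correct_zeros = int(np.sum((bh == 0) & (bt == 0)))
    incorrect_zeros = int(np.sum((bh == 0) & (bt != 0)))
    return size, correct_zeros, incorrect_zeros
```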



Numerical Studies - Example 1 (Asymmetric Homoscedastic Errors)

Consider the model

    Y = 5 cos(β_1'X) + exp(−(β_1'X)²) + ɛ,   (7)

where X = (X_1, ..., X_5)', the X_i ~ U(0, 1) are iid, β_1 = (1, 2, 0, 0, 0)', the residual ɛ follows an exponential distribution with mean 2, and the X_i's and ɛ are mutually independent.

We fit the SIQR model for five quantile levels, τ = 0.1, 0.25, 0.5, 0.75, 0.9. We use a sample size of n = 400 and perform N = 100 replications.

Numerical Studies - Example 1

Table 2: mean values and standard deviations (in parentheses) for the size and the number of correct and incorrect zeros of the estimated parametric component β̂^SCAD, together with mean values of R(β̂^SCAD) and R_τ(Q̂_τ^LL), for Model (7) at τ = 0.1, 0.25, 0.5, 0.75, 0.9. [Table entries omitted.]

Numerical Studies - Example 1

Comments:
- The proposed methodology correctly selects X_2 as the significant covariate 100% of the time for all quantile levels (an average number of incorrect zeros of 0), and gives an average number of correct zeros close to the true value of 3.
- The average R(β̂^SCAD) of the estimated parametric component seems to increase with the quantile level; the asymptotic variance formula of sample quantiles involves the inverse density of the error term evaluated at the specific quantile of interest.
- Overall performance is good.

Numerical Studies - Example 1

Table 3: mean values and standard deviations (in parentheses) of MSE(β̂_1'X) for the SCAD-NWQR, LASSO-AY, and ALASSO-AY estimated parametric components for Model (7) at τ = 0.1, 0.25, 0.5, 0.75, 0.9. [Table entries omitted.]

Numerical Studies - Example 1

Comment: the proposed methodology outperforms the LASSO and adaptive-LASSO estimators of Alkenani and Yu (2013) for all quantile levels except the lowest one.



Numerical Studies - Example 2 (Heteroscedastic Errors)

Consider the model

    Y = sin(2πβ_1'X) + ((1 + β_2'X)²/4) ɛ,   (8)

where X = (X_1, ..., X_5)', the X_i ~ U(0, 1) are iid, β_1 = (1, 2, 0, 0, 0)', β_2 = (1, 1, 1, 0, 0)', and the residual ɛ follows an exponential distribution with mean 1.

We fit the SIQR model for five quantile levels, τ = 0.1, 0.25, 0.5, 0.75, 0.9. We use a sample size of n = 400 and perform N = 100 replications. We compare the penalized estimated parametric component (SCAD-NWQR) with the unpenalized one (NWQR).
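Because the scale factor (1 + β_2'X)²/4 is positive, the conditional τ-quantile under model (8) is sin(2πβ_1'x) + ((1 + β_2'x)²/4)·q_τ, where q_τ = −log(1 − τ) is the τ-quantile of the Exp(1) error. A quick Monte Carlo check of this at a fixed covariate value:

```python
import numpy as np

rng = np.random.default_rng(1)
beta1 = np.array([1.0, 2.0, 0.0, 0.0, 0.0])
beta2 = np.array([1.0, 1.0, 1.0, 0.0, 0.0])
x = np.full(5, 0.5)  # fix a covariate value
tau = 0.75

# Simulate Y | X = x from model (8): Y = sin(2*pi*b1'x) + ((1 + b2'x)^2 / 4) * eps
eps = rng.exponential(scale=1.0, size=200_000)
Y = np.sin(2 * np.pi * beta1 @ x) + ((1 + beta2 @ x) ** 2 / 4) * eps

# A positive scale just shifts and rescales the Exp(1) quantile
q_theory = np.sin(2 * np.pi * beta1 @ x) + ((1 + beta2 @ x) ** 2 / 4) * (-np.log(1 - tau))
print(abs(np.quantile(Y, tau) - q_theory))  # close to 0
```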

Numerical Studies - Example 2

Table 4: mean values and standard deviations (in parentheses) for the size and the number of correct and incorrect zeros of the estimated parametric component β̂^SCAD for Model (8) at τ = 0.1, 0.25, 0.5, 0.75, 0.9. [Table entries omitted.]

Numerical Studies - Example 2

Table 5: mean values and standard deviations (in parentheses) of MSE(β̂_1'X), average R(β̂), and average R_τ(Q̂_τ^LL) for the NWQR and SCAD-NWQR estimated parametric components for Model (8) at τ = 0.1, 0.25, 0.5, 0.75, 0.9. [Table entries omitted.]


Numerical Studies - Boston Housing Data

- 506 observations; 14 variables.
- Response: medv, the median value of owner-occupied homes in $1000s.
- Predictors: measurements on the 506 census tracts in suburban Boston from the 1970 census.
- Available in the MASS library in R.
- Collinearity in the data set: Breiman and Friedman (1985) applied the ACE method and selected the four covariates RM, TAX, PTRATIO, and LSTAT.
- Variable selection in QR: Wu and Liu (2009) considered SCAD and adaptive-LASSO penalties under the linear QR model; Alkenani and Yu (2013) considered the LASSO and adaptive-LASSO penalties under the SIQR model.

Numerical Studies - Boston Housing Data

We exclude CHAS and RAD, standardize the response and the remaining 11 predictor variables, denote the predictor variables by X_1, X_2, ..., X_11, and consider the SIQR model

    Q_τ(medv | X) = g(X_1 + β_2 X_2 + β_3 X_3 + ... + β_11 X_11 | β),   (9)

for τ = 0.1, 0.25, 0.5, 0.75, 0.9, which assumes that RM is a significant predictor.

Numerical Studies - Boston Housing Data

Table 6: parameter estimates (RM, CRIM, ZN, INDUS, NOX, AGE, DIS, TAX, PTRATIO, BLACK, LSTAT) for the Boston housing data from fitting the SIQR model (9) at each quantile level. [Table entries omitted.]

Numerical Studies - Boston Housing Data

Tables 7 and 8: number of zero coefficients and mean check-based absolute residuals, R_τ(Q̂_τ^LL), for SCAD-NWQR, LASSO-AY, and ALASSO-AY from fitting the SIQR model (9) at each quantile level. [Table entries omitted.]

Numerical Studies - Boston Housing Data

Comments:
- The methods select different significant predictor variables.
- The proposed methodology gives sparser models than the LASSO and adaptive-LASSO methodologies of Alkenani and Yu (2013).
- The mean check-based absolute residuals resulting from the proposed SCAD-NWQR are slightly larger than those of the other methods for all except the 90th quantile. The difference is probably due to the different penalty functions used by the methods, but could also be explained by the different levels of sparsity attained.

Numerical Studies - Boston Housing Data

Figure: estimated single index quantile regression fits for the Boston housing data (medv against the estimated index, one panel per quantile level).


Conclusions

- Conditional quantiles provide a more complete picture of the conditional distribution and enjoy nice properties such as robustness.
- We consider a single index model to reduce the dimensionality while maintaining some nonparametric flexibility.
- Existing iterative algorithms for estimating the parametric vector suffer from convergence issues.
- We proposed a method that estimates the parametric component directly and non-iteratively, and derived its asymptotic distribution under heteroscedastic errors.
- For high-dimensional data, we introduce penalty terms both for variable selection in estimating Q_τ(Y | x) and for producing the estimator of the parametric component of the SIQR model.

84 Future Work
Extend the proposed methodology to censored data. The proposed estimator is obtained by
β̂ = arg min_{b ∈ Θ} Ŝ_n^CD(τ, b),
where
Ŝ_n^CD(τ, b) = Σ_{i=1}^n [Δ_i / Ĝ_n(Z_i | X_i)] ρ_τ(Z_i − ĝ_CD^NW(b^T X_i; b)),
with Δ_i the indicator of an uncensored observation and Ĝ_n the estimated censoring survival function.
Christou and Akritas (2015); working paper.
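
The weight Δ_i / Ĝ_n(Z_i | X_i) in Ŝ_n^CD is an inverse-probability-of-censoring weight. As a hedged sketch of how such weights can be computed — not the authors' implementation; in particular it substitutes an unconditional Kaplan–Meier estimate of the censoring survival function for the covariate-dependent Ĝ_n(· | X_i), and all names are illustrative:

```python
import numpy as np

def ipcw_weights(z, delta):
    """Inverse-probability-of-censoring weights delta_i / G_hat(Z_i-),
    where G_hat is the Kaplan-Meier estimator of the censoring survival
    function (censoring, delta == 0, plays the role of the 'event')."""
    n = len(z)
    order = np.argsort(z, kind="stable")
    z_s, d_s = z[order], delta[order]
    at_risk = n - np.arange(n)                 # at risk just before each sorted time
    factors = 1.0 - (d_s == 0) / at_risk       # KM factors for censoring events
    surv = np.cumprod(factors)                 # G_hat at each sorted time
    g_minus = np.concatenate(([1.0], surv[:-1]))  # G_hat(Z_(i)-): just before each time
    w_sorted = np.where(d_s == 1, 1.0 / g_minus, 0.0)
    w = np.empty(n)
    w[order] = w_sorted                        # map weights back to original order
    return w

# Toy data: Z = min(T, C), delta = 1{T <= C}.
rng = np.random.default_rng(1)
t = rng.exponential(1.0, size=200)
c = rng.exponential(2.0, size=200)
z = np.minimum(t, c)
delta = (t <= c).astype(int)
w = ipcw_weights(z, delta)
```

Censored observations get weight zero; uncensored ones are up-weighted by 1/Ĝ(Z_i−) ≥ 1, and these weights would multiply the check-loss terms in the objective above.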

85 References I
A. Alkenani and K. Yu. Penalized single-index quantile regression. International Journal of Statistics and Probability, 2:12–30, 2013.
L. Breiman and J. H. Friedman. Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80:580–598, 1985.
E. Christou and M. G. Akritas. Single index quantile regression for censored data. Working paper, 2015.
E. Guerre and C. Sabbah. Uniform bias study and Bahadur representation for local polynomial estimators of the conditional quantile function. Econometric Theory, 28(1):87–129, 2012.

86 References II
R. Koenker. Additive models for quantile regression: model selection and confidence bandaids. Brazilian Journal of Probability and Statistics, 25:239–262, 2011.
E. Kong and Y. Xia. A single-index quantile regression model and its estimation. Econometric Theory, 28:730–768, 2012.
J. D. Opsomer and D. Ruppert. A fully automated bandwidth selection method for fitting additive models. Journal of the American Statistical Association, 93:605–619, 1998.
H. Wang, G. Li and G. Jiang. Robust regression shrinkage and consistent variable selection through the LAD-Lasso. Journal of Business and Economic Statistics, 25:347–355, 2007.

87 References III
Y. Wu and Y. Liu. Variable selection in quantile regression. Statistica Sinica, 19:801–817, 2009.
T. Z. Wu, K. Yu and Y. Yu. Single-index quantile regression. Journal of Multivariate Analysis, 101(7):1607–1621, 2010.
K. Yu and Z. Lu. Local linear additive quantile regression. Scandinavian Journal of Statistics, 31:333–346, 2004.

88 Thank you! Many thanks to my advisor, Prof. Michael Akritas.


More information

Problem Set 7. Ideally, these would be the same observations left out when you

Problem Set 7. Ideally, these would be the same observations left out when you Business 4903 Instructor: Christian Hansen Problem Set 7. Use the data in MROZ.raw to answer this question. The data consist of 753 observations. Before answering any of parts a.-b., remove 253 observations

More information

Linear regression methods

Linear regression methods Linear regression methods Most of our intuition about statistical methods stem from linear regression. For observations i = 1,..., n, the model is Y i = p X ij β j + ε i, j=1 where Y i is the response

More information

Causal Inference: Discussion

Causal Inference: Discussion Causal Inference: Discussion Mladen Kolar The University of Chicago Booth School of Business Sept 23, 2016 Types of machine learning problems Based on the information available: Supervised learning Reinforcement

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Robust Variable Selection Methods for Grouped Data. Kristin Lee Seamon Lilly

Robust Variable Selection Methods for Grouped Data. Kristin Lee Seamon Lilly Robust Variable Selection Methods for Grouped Data by Kristin Lee Seamon Lilly A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

Semiparametric modeling and estimation of the dispersion function in regression

Semiparametric modeling and estimation of the dispersion function in regression Semiparametric modeling and estimation of the dispersion function in regression Ingrid Van Keilegom Lan Wang September 4, 2008 Abstract Modeling heteroscedasticity in semiparametric regression can improve

More information

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y

More information

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. We consider asymptotic and finite-sample confidence bounds in instrumental

More information

Variable Selection for Highly Correlated Predictors

Variable Selection for Highly Correlated Predictors Variable Selection for Highly Correlated Predictors Fei Xue and Annie Qu Department of Statistics, University of Illinois at Urbana-Champaign WHOA-PSI, Aug, 2017 St. Louis, Missouri 1 / 30 Background Variable

More information

Statistical Inference

Statistical Inference Statistical Inference Liu Yang Florida State University October 27, 2016 Liu Yang, Libo Wang (Florida State University) Statistical Inference October 27, 2016 1 / 27 Outline The Bayesian Lasso Trevor Park

More information

Local Polynomial Regression

Local Polynomial Regression VI Local Polynomial Regression (1) Global polynomial regression We observe random pairs (X 1, Y 1 ),, (X n, Y n ) where (X 1, Y 1 ),, (X n, Y n ) iid (X, Y ). We want to estimate m(x) = E(Y X = x) based

More information