Smoothing and variable selection using P-splines
1 Smoothing and variable selection using P-splines
Irène Gijbels
Department of Mathematics & Leuven Statistics Research Center
Katholieke Universiteit Leuven, Belgium
Joint work with Anestis Antoniadis, Anneleen Verhasselt, Sophie Lambert-Lacroix
2 Anestis Antoniadis
- XIVth International Biometrics Conference in Namur, Belgium, in July 1988
- first reading: Nonparametric penalized maximum likelihood estimation of the intensity of a counting process, AISM, 1989
- joint work on:
  - model selection using wavelet decomposition (with G. Grégoire)
  - change point detection
  - change point detection in hazard function (with B. MacGibbon)
  - unfolding sphere size distributions using wavelets (with J. Fan)
Workshop in honor of Anestis, Villard de Lans, March 2011 p.1
3 Joint work (continued)
- penalized wavelet monotone regression (with J. Bigot)
- smoothing non-equispaced heavy noisy data with wavelets (with Jean-Michel Poggi)
- penalized likelihood regression for generalized linear models with nonquadratic penalties (with Mila Nikolova)
- variable selection in additive models and in varying coefficient models using P-splines (with Anneleen Verhasselt and Sophie Lambert-Lacroix)
4 Seminal contributions to many areas
- wavelets and their applications
- intensity function estimation
- survival analysis and point processes
- inverse problems (in particular Poisson inverse problems)
- constrained estimation
- analysis of functional data
- development of statistical methods for microarray data
An excellent and very dynamic researcher, with extremely broad knowledge.
5 Some characteristics
- many services to the profession: associate editor of several international journals, serving on scientific evaluation boards for many years
- very supportive of young (and older) researchers, always helping with scientific advice
- a remarkable honesty and modesty
- an admirable personality; an example for many...
6 THANK YOU!
7 Additive models: introduction
- Y: response variable
- $(X_1, \ldots, X_d)$: vector of d explanatory variables
- additive model: $Y = f_0 + \sum_{j=1}^d f_j(X_j) + \varepsilon$, with $E(f_j(X_j)) = 0$
- $\varepsilon$: random noise term with mean 0 and variance $\sigma^2$
- $f_j$: unknown univariate functions
- often only a few components $f_j$ are different from 0
- aim: to select and estimate the non-zero $f_j$ components
8 Nonnegative garrote method
Original nonnegative garrote method proposed by Breiman (1995) in a multiple linear regression model:
- data $(Y_i, X_{i1}, \ldots, X_{id})$, $i = 1, \ldots, n$, from $Y_i = \beta_0 + \sum_{j=1}^d \beta_j X_{ij} + \varepsilon_i$
- $\hat\beta_j^{OLS}$: ordinary least squares estimator for $\beta_j$
9
- basic idea: the nng method shrinks the least squares estimators $\hat\beta_j^{OLS}$
- shrinkage done via $c_j \hat\beta_j^{OLS}$, with $c_j \geq 0$ and a bound on $\sum_{j=1}^d c_j$
- task: how to find the shrinkage factors $c_j$?
10 The nonnegative garrote shrinkage factors $\hat c_j$ are found by solving
$$(\hat c_1, \ldots, \hat c_d) = \arg\min_{c_1,\ldots,c_d} \frac{1}{2}\sum_{i=1}^n \Big(Y_i - \hat\beta_0^{OLS} - \sum_{j=1}^d c_j \hat\beta_j^{OLS} X_{ij}\Big)^2 \quad \text{s.t. } 0 \leq c_j \ (j = 1, \ldots, d), \ \sum_{j=1}^d c_j \leq s$$
for given s, or equivalently
$$(\hat c_1, \ldots, \hat c_d) = \arg\min_{c_1,\ldots,c_d} \frac{1}{2}\sum_{i=1}^n \Big(Y_i - \hat\beta_0^{OLS} - \sum_{j=1}^d c_j \hat\beta_j^{OLS} X_{ij}\Big)^2 + \theta \sum_{j=1}^d c_j \quad \text{s.t. } 0 \leq c_j \ (j = 1, \ldots, d)$$
for given $\theta > 0$.
$s > 0$ and $\theta > 0$: regularization parameters (see e.g. Xiong (2010)).
11 The nonnegative garrote estimator of the regression coefficient $\beta_j$ is
$$\hat\beta_j^{NNG} = \hat c_j \hat\beta_j^{OLS}$$
Special case: orthogonal design, i.e. $X'X = I$:
$$\hat c_j = \Big(1 - \frac{\theta}{(\hat\beta_j^{OLS})^2}\Big)_+ \qquad z_+ = \max(z, 0)$$
The larger $\theta$, the stronger the shrinkage effect.
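The closed-form shrinkage under an orthogonal design can be sketched in a few lines of numpy; this is an illustrative helper (name and example values are ours, not from the slides):

```python
import numpy as np

def nng_shrinkage_orthogonal(beta_ols, theta):
    """Nonnegative garrote under an orthogonal design:
    c_j = (1 - theta / beta_j^2)_+  and  beta_j^NNG = c_j * beta_j."""
    beta_ols = np.asarray(beta_ols, dtype=float)
    c = np.maximum(0.0, 1.0 - theta / beta_ols**2)
    return c, c * beta_ols

beta = np.array([0.5, 1.0, 3.0])
c, beta_nng = nng_shrinkage_orthogonal(beta, theta=0.3)
```

Small coefficients are killed (here the first one, since $0.5^2 < 0.3$ fails the threshold), large ones are only mildly shrunk; increasing theta shrinks every factor further.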
12 Figure 1: Shrinkage effect of the nonnegative garrote for different values of $\theta$ ($\theta = 0, 0.01, 0.05, \ldots$), plotted as $\hat\beta^{NNG}$ against $\hat\beta^{OLS}$.
13 Relation with other estimation methods
LASSO:
$$(\hat\beta_1^{Lasso}, \ldots, \hat\beta_d^{Lasso}) = \arg\min_\beta \sum_{i=1}^n \Big(Y_i - \beta_0 - \sum_{j=1}^d \beta_j X_{ij}\Big)^2 \quad \text{s.t. } \sum_{j=1}^d |\beta_j| \leq s$$
Ridge:
$$(\hat\beta_1^{Ridge}, \ldots, \hat\beta_d^{Ridge}) = \arg\min_\beta \sum_{i=1}^n \Big(Y_i - \beta_0 - \sum_{j=1}^d \beta_j X_{ij}\Big)^2 \quad \text{s.t. } \sum_{j=1}^d \beta_j^2 \leq s$$
Figure: the identity, nonnegative garrote, lasso and ridge rules plotted against the least squares estimate.
14 More relations with (other) thresholding rules (see, e.g., the literature on wavelet methods and the work by Anestis Antoniadis)
hard-thresholding rule:
$$\delta_\lambda^H(\hat\beta_j) = \begin{cases} 0 & \text{if } |\hat\beta_j| \leq \lambda \\ \hat\beta_j & \text{if } |\hat\beta_j| > \lambda \end{cases}$$
soft-thresholding rule:
$$\delta_\lambda^S(\hat\beta_j) = \begin{cases} 0 & \text{if } |\hat\beta_j| \leq \lambda \\ \hat\beta_j - \lambda & \text{if } \hat\beta_j > \lambda \\ \hat\beta_j + \lambda & \text{if } \hat\beta_j < -\lambda \end{cases}$$
- hard-thresholding (a discontinuous function): keep or kill rule
- soft-thresholding (a continuous function): shrink or kill rule
Bruce & Gao (1996) and Marron, Adak, Johnstone, Neumann & Patil (1998), ..., Gao (2008), ...
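The two rules above are one-liners in numpy; a minimal sketch (function names are ours):

```python
import numpy as np

def hard_threshold(b, lam):
    # keep-or-kill: zero out coefficients with |b| <= lam, keep the rest as-is
    b = np.asarray(b, dtype=float)
    return np.where(np.abs(b) > lam, b, 0.0)

def soft_threshold(b, lam):
    # shrink-or-kill: move every surviving coefficient lam towards zero
    b = np.asarray(b, dtype=float)
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)
```

E.g. with $\lambda = 1$, the vector $(-2, 0.5, 3)$ becomes $(-2, 0, 3)$ under hard- and $(-1, 0, 2)$ under soft-thresholding, which makes the discontinuity of the hard rule at $|\hat\beta_j| = \lambda$ visible.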
15 Another thresholding rule: Antoniadis & Fan (2001) suggested the SCAD (Smoothly Clipped Absolute Deviation) thresholding rule
$$\delta_\lambda^{SCAD}(\hat\beta_j) = \begin{cases} \mathrm{sign}(\hat\beta_j)\max(0, |\hat\beta_j| - \lambda) & \text{if } |\hat\beta_j| \leq 2\lambda \\ \dfrac{(a-1)\hat\beta_j - a\lambda\,\mathrm{sign}(\hat\beta_j)}{a-2} & \text{if } 2\lambda < |\hat\beta_j| \leq a\lambda \\ \hat\beta_j & \text{if } |\hat\beta_j| > a\lambda \end{cases}$$
where $a > 2$.
- this is also a shrink or kill rule (a piecewise linear function)
- this rule does not over-penalize large values of $|\hat\beta_j|$ and hence does not create excessive bias when the regression coefficients are large
- Antoniadis & Fan (2001), based on a Bayesian argument, recommended the value $a = 3.7$
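The three branches translate directly into numpy; a sketch (not the authors' code) that also makes the unbiasedness for large coefficients checkable:

```python
import numpy as np

def scad_threshold(b, lam, a=3.7):
    """SCAD thresholding rule of Antoniadis & Fan (2001), a > 2:
    soft-thresholding near zero, linear interpolation in the middle,
    and the identity beyond a*lam (no bias for large coefficients)."""
    b = np.asarray(b, dtype=float)
    ab = np.abs(b)
    small = np.sign(b) * np.maximum(ab - lam, 0.0)        # |b| <= 2*lam
    mid = ((a - 1) * b - a * lam * np.sign(b)) / (a - 2)  # 2*lam < |b| <= a*lam
    return np.where(ab <= 2 * lam, small, np.where(ab <= a * lam, mid, b))
```

The middle branch joins the two outer ones continuously: at $|\hat\beta_j| = 2\lambda$ both give $\lambda$, and at $|\hat\beta_j| = a\lambda$ the rule reaches the identity.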
16 Figure: the four thresholding functions $\delta_\lambda$ (hard-thresholding, soft-thresholding, NNG-thresholding, SCAD).
17 Functional nonnegative garrote method
- extension of the nng to additive models: see Cantoni, Fleming & Ronchetti (2011) and Yuan (2007)
- start with an initial estimator $\hat f_j^{init}(X_j)$ of $f_j(X_j)$
- the nonnegative garrote shrinkage factors are then found via
$$\min_{c_1,\ldots,c_d} \Big\{ \frac{1}{2}\sum_{i=1}^n \Big(Y_i - \hat f_0^{init} - \sum_{j=1}^d c_j \hat f_j^{init}(X_{ij})\Big)^2 + \theta \sum_{j=1}^d c_j \Big\} \quad \text{s.t. } 0 \leq c_j \ (j = 1, \ldots, d)$$
- the nonnegative garrote estimate of $f_j$ is $\hat f_j^{NNG}(\cdot) = \hat c_j \hat f_j^{init}(\cdot)$
- Cantoni, Fleming & Ronchetti (2011) use smoothing splines for the initial estimator; here: using P-splines...
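The constrained problem above is a nonnegatively constrained least squares problem with a linear penalty; a minimal numpy sketch via projected gradient descent (the cited papers use dedicated solvers, and all names here are ours):

```python
import numpy as np

def nng_factors(F, y, theta, n_iter=2000):
    """Minimise 0.5*||y - F c||^2 + theta*sum(c) subject to c >= 0
    by projected gradient descent. Column j of F holds the initial
    component estimate f_j^init evaluated at the n observations."""
    step = 1.0 / np.linalg.norm(F.T @ F, 2)  # 1 / Lipschitz constant
    c = np.zeros(F.shape[1])
    for _ in range(n_iter):
        grad = F.T @ (F @ c - y) + theta
        c = np.maximum(0.0, c - step * grad)  # project onto c >= 0
    return c
```

As a sanity check, for orthonormal columns ($F'F = I$) the solution decouples into $c_j = \max(0, F_j'y - \theta)$, the componentwise shrink-or-kill rule.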
18 Univariate P-spline estimation
- P-splines, introduced by Eilers & Marx (1996), in the univariate nonparametric smoothing context $Y_i = f(X_i) + \varepsilon_i$, $i = 1, \ldots, n$
- P-splines are an extension of regression splines, with a penalty on the coefficients of adjacent B-splines
- data $(X_i, Y_i)$, $i = 1, \ldots, n$, with $X_i \in [0, 1] \subset \mathbb{R}$
- regression spline model: approximate $f(x)$ with $\sum_{j=1}^m \alpha_j B_j(x; q)$, where $\{B_j(\cdot; q) : j = 1, \ldots, K + q = m\}$ is the q-th degree B-spline basis, using normalized B-splines such that $\sum_j B_j(x; q) = 1$, with $K + 1$ equidistant knot points $t_0 = 0, t_1 = \frac{1}{K}, \ldots, t_K = 1$ in $[0, 1]$
- $\alpha = (\alpha_1, \ldots, \alpha_m)'$: unknown column vector of regression coefficients
19 Penalized least squares estimator: $\hat\alpha$ is the minimizer of
$$S(\alpha) = \sum_{i=1}^n \Big(Y_i - \sum_{j=1}^m \alpha_j B_j(X_i; q)\Big)^2 + \lambda \sum_{j=k+1}^m (\Delta^k \alpha_j)^2$$
- $\lambda > 0$: smoothing parameter
- the differencing operator: $\Delta^k \alpha_j = \sum_{t=0}^k (-1)^t \binom{k}{t} \alpha_{j-t}$, with $k \in \mathbb{N}$
- examples: $k = 1$: $\Delta^1 \alpha_j = \alpha_j - \alpha_{j-1}$; $k = 2$: $\Delta^2 \alpha_j = \alpha_j - 2\alpha_{j-1} + \alpha_{j-2}$
- rewriting in matrix notation: $S(\alpha) = (Y - B\alpha)'(Y - B\alpha) + \lambda \alpha' D_k' D_k \alpha$
- the elements $B_{ij}$ of $B \ (\in \mathbb{R}^{n \times m})$ are $B_j(X_i; q)$
- $D_k \ (\in \mathbb{R}^{(m-k) \times m})$: matrix representation of the k-th order differencing operator $\Delta^k$
20 Additive model: nonnegative garrote with P-splines initial estimator
additive model: $Y = f_0 + \sum_{j=1}^d f_j(X_j) + \varepsilon$, with $E(f_j(X_j)) = 0$
key ingredients to prove the consistency of the nonnegative garrote with P-splines:
(i) a consistency result for (univariate) P-splines (Claeskens, Krivobokova & Opsomer (2009))
(ii) the extension of a univariate smoothing estimator to additive models via backfitting (relying on results from Horowitz, Klemelä & Mammen (2006))
(iii) a consistency result for the functional nonnegative garrote (e.g. Yuan (2007))
21 Consistency of nonnegative garrote with P-splines
notation: $f_j = (f_j(X_{1j}), \ldots, f_j(X_{nj}))'$
Theorem: Under some assumptions, and if $\theta_n$ tends to 0 such that $\kappa_n = n^{-(q+1)/(2q+3)} = o(\theta_n)$, then (given $X_{ij} = x_{ij}$)
(1) $P(\hat f_j^{NNG} = 0) \to 1$ for any j such that $f_j = 0$
(2) $\sup_j \frac{1}{n} \|\hat f_j^{NNG} - f_j\|_2^2 = O_P(\theta_n^2)$
In other words, the nonnegative garrote method with P-splines is variable selection consistent (1) and estimation consistent (2).
22 Other selection methods
- COSSO (Component Selection and Smoothing Operator, Lin & Zhang (2006))
- ACOSSO (Adaptive Component Selection and Smoothing Operator, Storlie, Bondell, Reich & Zhang (2010))
- APSO (Adaptive P-splines Selection Operator, Antoniadis et al. (2011))
23 Varying coefficient models: introduction (Fan and Zhang (2000), ...)
$$Y(t) = \beta_0(t) + \sum_{p=1}^d X^{(p)}(t)\beta_p(t) + \varepsilon(t) = X(t)'\beta(t) + \varepsilon(t)$$
- $Y(t)$ is the response at time t ($t \in \mathcal{T} = [0, T]$)
- $X(t) = (X^{(0)}(t), \ldots, X^{(d)}(t))'$: covariate vector at time t, with $X^{(0)}(t) \equiv 1$
- $\beta(t) = (\beta_0(t), \ldots, \beta_d(t))'$: the vector of coefficients at time t
- $\beta_0(t)$ is the baseline effect
- $\varepsilon(t)$: a mean zero stochastic process at time t
24 Longitudinal data, i.e. samples with n independent subjects or individuals, each measured repeatedly over a time period
- the j-th measurement for subject i of $(t, Y(t), X(t))$ is denoted by $(t_{ij}, Y_{ij}, X_{ij})$, $1 \leq i \leq n$, $1 \leq j \leq N_i$
- $N_i$ is the number of repeated measurements of subject i
- $t_{ij}$ is the measurement time, $Y_{ij}$ is the observed response at time $t_{ij}$, and $X_{ij} = (X_{ij}^{(0)}, \ldots, X_{ij}^{(d)})'$
- $N = \sum_{i=1}^n N_i$ is the total number of observations
25 P-spline estimation in varying coefficient models (see Lu, Zhang & Zhu (2008), Wang & Huang (2008), ...)
suppose each unknown function $\beta_p(t)$, $p = 0, \ldots, d$, can be approximated by a B-spline basis expansion
$$\beta_p(t) = \sum_{l=1}^{m_p} B_{pl}(t; q_p)\alpha_{pl}$$
where $\{B_{pl}(\cdot; q_p) : l = 1, \ldots, K_p + q_p = m_p\}$ is the $q_p$-th degree B-spline basis with $K_p + 1$ equidistant knots for the p-th component
26 The P-spline estimates of the regression coefficients $\alpha_{pl}$ are obtained by minimizing $S(\alpha)$ with respect to $\alpha = (\alpha_0', \ldots, \alpha_d')' \in \mathbb{R}^{\dim \times 1}$, where $\alpha_p = (\alpha_{p1}, \ldots, \alpha_{pm_p})'$ and $\dim = \sum_p m_p$:
$$S(\alpha) = \sum_{i=1}^n \frac{1}{N_i} \sum_{j=1}^{N_i} \Big(Y_{ij} - \sum_{p=0}^d \sum_{l=1}^{m_p} X_{ij}^{(p)} B_{pl}(t_{ij}; q_p)\alpha_{pl}\Big)^2 + \sum_{p=0}^d \lambda_p \alpha_p' D_{k_p}' D_{k_p} \alpha_p$$
- $k_p$ is the differencing order for the p-th component
- $\lambda_p$ are the smoothing parameters
27 In matrix notation:
$$S(\alpha) = \sum_{i=1}^n \frac{1}{N_i} \sum_{j=1}^{N_i} \Big(Y_{ij} - \sum_{p=0}^d \sum_{l=1}^{m_p} X_{ij}^{(p)} B_{pl}(t_{ij}; q_p)\alpha_{pl}\Big)^2 + \sum_{p=0}^d \lambda_p \alpha_p' D_{k_p}' D_{k_p} \alpha_p = \sum_{i=1}^n (Y_i - U_i\alpha)' W_i (Y_i - U_i\alpha) + \alpha' Q_\lambda \alpha$$
with
- $Y_i = (Y_{i1}, \ldots, Y_{iN_i})'$
- $B(t)$: the block-diagonal matrix with blocks $(B_{01}(t; q_0), \ldots, B_{0m_0}(t; q_0)), \ldots, (B_{d1}(t; q_d), \ldots, B_{dm_d}(t; q_d))$
- $U_{ij} = X_{ij}' B(t_{ij}) \in \mathbb{R}^{1 \times \dim}$, $U_i = (U_{i1}', \ldots, U_{iN_i}')' \in \mathbb{R}^{N_i \times \dim}$
- $W_i = \mathrm{diag}(N_i^{-1}, \ldots, N_i^{-1}) \in \mathbb{R}^{N_i \times N_i}$ (a diagonal matrix with $N_i$ times $N_i^{-1}$ on the diagonal)
- $Q_\lambda = \mathrm{diag}(\lambda_0 D_{k_0}' D_{k_0}, \ldots, \lambda_d D_{k_d}' D_{k_d}) \in \mathbb{R}^{\dim \times \dim}$ (a block-diagonal matrix with the matrices $\lambda_p D_{k_p}' D_{k_p}$ on the diagonal)
28 $S(\alpha) = \sum_{i=1}^n (Y_i - U_i\alpha)' W_i (Y_i - U_i\alpha) + \alpha' Q_\lambda \alpha$
introducing further matrix notations:
$$S(\alpha) = \|\tilde Y - \tilde U \alpha\|_2^2 + \alpha' Q_\lambda \alpha$$
with
- $Y = (Y_1', \ldots, Y_n')' \in \mathbb{R}^{N \times 1}$
- $W = \mathrm{diag}(W_i)_{i=1,\ldots,n} \in \mathbb{R}^{N \times N}$
- $\tilde Y = W^{1/2} Y$
- $U = [U_0, \ldots, U_d]$, $\tilde U = W^{1/2} U$
29 Theoretical results
- consistency results are proved for the case where the number of knots $K_p + 1$ (and thus $m_p = K_p + q_p$) grows with n
- $\beta_p(\cdot)$ is not a spline function itself, but can be approximated by a spline function
- consistency result:
$$\|\hat\beta_p(t) - \beta_p(t)\|_{L_2} = O_P\Big(\Big(\frac{1}{n^2}\sum_{i=1}^n \frac{1}{N_i}\Big)^{q/(2q+1)}\Big)$$
- asymptotic normality
30 Nonnegative garrote selection method
the nonnegative garrote shrinkage factors $\hat c = (\hat c_0, \ldots, \hat c_d)'$ are obtained from the optimization problem
$$\min_{c_0,\ldots,c_d} \sum_{i=1}^n \frac{1}{N_i} \sum_{j=1}^{N_i} \Big(Y_{ij} - \sum_{p=0}^d X_{ij}^{(p)} c_p \hat\beta_p^{init}(t_{ij})\Big)^2 + \theta \sum_{p=0}^d c_p \quad \text{s.t. } 0 \leq c_p \ (p = 0, \ldots, d)$$
- $\hat\beta_p^{init}(\cdot)$: initial estimator for the regression coefficient function $\beta_p(\cdot)$
- $\theta > 0$ is a regularization parameter
- we use the P-spline estimator as the initial estimator
31 With some more matrix notation:
$$\min_c \|\tilde Y - \tilde Z c\|_2^2 + \theta \sum_{p=0}^d c_p \quad \text{s.t. } 0 \leq c_p \ (p = 0, \ldots, d)$$
where
- $Y = (Y_1', \ldots, Y_n')' \in \mathbb{R}^{N \times 1}$, $W = \mathrm{diag}(W_i)_{i=1,\ldots,n} \in \mathbb{R}^{N \times N}$, $\tilde Y = W^{1/2} Y$
- $z_i^{(p)} = (X_{i1}^{(p)}, \ldots, X_{iN_i}^{(p)})\,\mathrm{diag}(\hat\beta_p^{init}(t_{ij}))_{j=1,\ldots,N_i} \in \mathbb{R}^{1 \times N_i}$
- $Z_p = (z_1^{(p)}, \ldots, z_n^{(p)})' \in \mathbb{R}^{N \times 1}$, $Z = [Z_0, \ldots, Z_d]$, $\tilde Z = W^{1/2} Z$
- $c = (c_0, \ldots, c_d)'$
32
- the P-spline estimator $\hat f_p(t) = X^{(p)}(t)\hat\beta_p(t)$ of the p-th component $f_p(t) = X^{(p)}(t)\beta_p(t)$ is consistent
- it can be shown that the nonnegative garrote estimator with the P-spline estimator as initial estimator for $\beta_p(t)$,
$$\hat f_p^{NNG}(t) = \hat c_p \hat f_p(t),$$
is estimation consistent and variable selection consistent
- other selection methods: Adaptive P-spline Selection Operator (APSO), ...
33 Example: CD4 data
- the data are a subset from the Multicenter AIDS Cohort Study (Kaslow et al. (1987))
- they contain repeated measurements of physical examinations, laboratory results, CD4 cell counts and CD4 percentages of 283 homosexual men who became HIV-positive between 1984 and 1991
- unequal numbers of repeated measurements and different measurement times for each individual
- aim: to evaluate the effects of cigarette smoking, pre-HIV-infection CD4 cell percentage and age at HIV infection on the mean CD4 percentage after infection
- the number of repeated measurements ranged from 1 to 14, with a median of 6 and a mean of 6.57
34
- the number of distinct time points was 59
- covariates:
  - $X_i^{(1)}$: the smoking status of the i-th individual (1 or 0 if the individual ever or never smoked cigarettes)
  - $X_i^{(2)}$: the centered age at HIV infection for the i-th individual
  - $X_i^{(3)}$: the centered pre-infection CD4 percentage
35 Varying coefficient model for $Y_{ij}$:
$$Y_{ij} = \beta_0(t_{ij}) + \sum_{p=1}^3 X_i^{(p)}\beta_p(t_{ij}) + \varepsilon_{ij}$$
$\beta_0(t)$ is the baseline CD4 percentage: it represents the mean CD4 percentage t years after the HIV infection for a nonsmoker with average pre-infection CD4 percentage and average age at infection.
Table 1: Aids data. Summary parameters (NS and RSS) for the ngp and APSO methods.
36 Figure 2: Aids data. Fitted (a) baseline effect; (b) coefficient of smoking status; (c) coefficient of age at HIV infection; (d) coefficient of pre-infection CD4, for the ngp and APSO methods.
37 Figure 3: Aids data. Fitted CD4 percentage as a function of time since infection, for 3 pre-infection CD4 percentages.
38 Grouped regularization methods
$$Y(t) = \beta_0(t) + \sum_{p=1}^d X^{(p)}(t)\beta_p(t) + \varepsilon(t) = X(t)'\beta(t) + \varepsilon(t)$$
approximation in terms of a basis of smooth functions:
$$\beta_p(t) \approx \sum_{l=1}^{L_p} B_l^{(p)}(t)\gamma_{pl} \qquad (L_p \text{ large})$$
as before, introducing the appropriate matrix notations,
$$\sum_{i=1}^n \frac{1}{N_i} \sum_{j=1}^{N_i} \Big(Y_i(t_{ij}) - \sum_{k=0}^d \sum_{l=1}^{L_k} \gamma_{kl} X_i^{(k)}(t_{ij}) B_l^{(k)}(t_{ij})\Big)^2 = \|\tilde Y - \tilde Z\gamma\|_2^2$$
- $\tilde Z$: dimension $N \times (\sum_{p=0}^d L_p)$
- $\gamma$: dimension $(\sum_{p=0}^d L_p) \times 1$
39 Grouped Lasso regularization: minimize
$$\frac{1}{2n}\|\tilde Y - \tilde Z\gamma\|_2^2 + \lambda \sum_{p=0}^d w_p \|\gamma_p\|_2, \qquad w_p = \sqrt{L_p},$$
with respect to the vector of parameters $\gamma = (\gamma_0', \ldots, \gamma_d')'$.
Defining $p_\lambda(v) = \lambda v$, for $v \geq 0$, this can be written as: minimize
$$\frac{1}{2n}\|\tilde Y - \tilde Z\gamma\|_2^2 + \sum_{p=0}^d p_\lambda(w_p \|\gamma_p\|_2), \qquad w_p = \sqrt{L_p}$$
(Liu and Zhang (2008))
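The key operation behind grouped-lasso algorithms is the groupwise shrink-or-kill step; a minimal numpy sketch of that proximal operator (names and the example grouping are ours, not the cited implementation):

```python
import numpy as np

def group_soft_threshold(gamma, groups, lam, w):
    """Proximal operator of the grouped lasso penalty lam * sum_p w_p*||gamma_p||_2:
    each coefficient group is shrunk towards zero and killed entirely once
    its Euclidean norm falls below lam * w_p."""
    out = np.array(gamma, dtype=float)
    for p, idx in enumerate(groups):
        norm = np.linalg.norm(out[idx])
        scale = max(0.0, 1.0 - lam * w[p] / norm) if norm > 0 else 0.0
        out[idx] = scale * out[idx]
    return out
```

So a group with small norm is set to zero as a whole (all basis coefficients of that component at once), which is exactly why the penalty selects or drops entire coefficient functions.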
40 Grouped SCAD regularization: the SCAD penalty $p_\lambda(v)$, for $v \geq 0$, is
$$p_\lambda(v) = \begin{cases} \lambda v & \text{if } 0 \leq v \leq \lambda \\ -\dfrac{v^2 - 2a\lambda v + \lambda^2}{2(a-1)} & \text{if } \lambda < v < a\lambda \\ \dfrac{(a+1)\lambda^2}{2} & \text{if } v \geq a\lambda \end{cases}$$
grouped SCAD procedure: minimize
$$\frac{1}{2n}\|\tilde Y - \tilde Z\gamma\|_2^2 + \sum_{p=0}^d p_\lambda(w_p \|\gamma_p\|_2), \qquad w_p = \sqrt{L_p},$$
with respect to the vector of parameters $\gamma$
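The piecewise penalty above is easy to code and to verify: it is continuous at the two breakpoints and flat beyond $a\lambda$. A small numpy sketch (function name is ours):

```python
import numpy as np

def scad_penalty(v, lam, a=3.7):
    """SCAD penalty p_lambda(v) for v >= 0: linear near zero, quadratic
    in between, and constant beyond a*lam, so large coefficient groups
    are not over-penalised."""
    v = np.asarray(v, dtype=float)
    mid = -(v**2 - 2 * a * lam * v + lam**2) / (2 * (a - 1))
    return np.where(v <= lam, lam * v,
                    np.where(v < a * lam, mid, (a + 1) * lam**2 / 2))
```

At $v = \lambda$ both the linear and quadratic pieces equal $\lambda^2$, and at $v = a\lambda$ the quadratic piece reaches the plateau $(a+1)\lambda^2/2$, confirming the penalty is continuous.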
41 Possible approach: a Taylor expansion of $p_\lambda(v)$ for v around $v_0$ gives
$$p_\lambda(v) \approx p_\lambda(v_0) + \frac{1}{2}\frac{p_\lambda'(v_0)}{v_0}(v^2 - v_0^2)$$
(see Fan and Li (2001)); this leads to solving a Ridge-regression type of problem (restricted to $d_n < n$).
42 First grouped SCAD regularization procedure: minimize
$$\frac{1}{2n}\|\tilde Y - \tilde Z\gamma\|_2^2 + \sum_{p=0}^d p_\lambda(w_p \|\gamma_p\|_2), \qquad w_p = \sqrt{L_p},$$
with respect to the vector of parameters $\gamma$.
Second grouped SCAD regularization procedure: minimize, with respect to $\gamma$,
$$\frac{1}{2n}\|\tilde Y - \tilde Z\gamma\|_2^2 + \sum_{p=0}^d p_\lambda(\|\gamma_p\|_1)$$
An algorithm for solving this optimization problem: Breheny & Huang (2009).
43 Grouped Bridge regularization: penalty function, for $v > 0$,
$$p_\lambda(v) = \lambda v^q \quad \text{with } 0 < q < 1$$
grouped Bridge approach: minimize the objective function
$$\frac{1}{2n}\|\tilde Y - \tilde Z\gamma\|_2^2 + \sum_{p=0}^d p_\lambda(w_p \|\gamma_p\|_2)$$
An algorithm for solving this optimization problem: Breheny & Huang (2009).
44 Simulation studies: comparison of 5 grouped regularization methods
- gbridge: grouped Bridge (q = 1/2); local coordinate descent algorithm from Breheny & Huang (2009)
- gscad: grouped SCAD; idem
- gmcp: grouped MCP; idem
- glasso 1: grouped Lasso; idem
- glasso 2: grouped Lasso; implementation by Meier, van de Geer & Bühlmann (2008)
45 Tuning parameter $\lambda$ chosen by a BIC-type of criterion:
$$\log\Big(\frac{RSS_\lambda}{\sum_{i=1}^n N_i}\Big) + \frac{\log\big(\sum_{i=1}^n N_i\big)}{\sum_{i=1}^n N_i}\, df_\lambda$$
- $RSS_\lambda$ is the residual sum of squares
- $df_\lambda$ is the number of nonzero coefficients of $\hat\gamma$
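The criterion is a one-line computation once $RSS_\lambda$ and $\hat\gamma$ are available; a minimal sketch (function name is ours):

```python
import numpy as np

def bic_criterion(rss, gamma_hat, n_total):
    """BIC-type score for a given lambda: log(RSS/N) + (log N / N) * df,
    with N the total number of observations and df the number of
    nonzero coefficients in gamma_hat; smaller is better."""
    df = np.count_nonzero(gamma_hat)
    return np.log(rss / n_total) + np.log(n_total) / n_total * df
```

The second term charges log(N)/N per nonzero coefficient, so between two fits with equal residual sum of squares the sparser one always scores lower.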
46 Simulation model (from Huang, Wu & Zhou (2002) and Wang, Li & Huang (2008)):
$$Y_i(t_{ij}) = \beta_0(t_{ij}) + \sum_{p=1}^{23} \beta_p(t_{ij}) X_i^{(p)}(t_{ij}) + \varepsilon_i(t_{ij}), \quad i = 1, \ldots, n, \ j = 1, \ldots, \tilde N$$
intercept term and the three true relevant variables, for $t \in [1, 30]$:
$$\beta_0(t) = \sin(\pi t/\ldots), \quad \beta_1(t) = 2 - 3\cos(\pi(t - 25)/\ldots), \quad \beta_2(t) = 6 - 0.2t, \quad \beta_3(t) = -4 + (20 - t)\ldots$$
remaining coefficients: $\beta_p(t) = 0$, $p = 4, \ldots, 23$
time points $t_{ij}$ given by $1, 2, \ldots, 30$ ($\tilde N = 30$) and $n = 100$
47 Simulation of the three relevant variables $X_i^{(k)}(t)$, $k = 1, \ldots, 3$:
- at any point t, the variable $X_i^{(1)}(t)$ is sampled uniformly from $[t/10, 2 + t/10]$
- conditionally on $X_i^{(1)}(t)$, the variable $X_i^{(2)}(t)$ is centered Gaussian with variance $(1 + X_i^{(1)}(t))/(2 + X_i^{(1)}(t))$
- the variable $X_i^{(3)}(t)$ is independent of $X_i^{(1)}$ and $X_i^{(2)}$ and is a Bernoulli random variable with success rate 0.6
- the irrelevant variables $X_i^{(k)}$, $k = 4, \ldots, 23$, are paths of a centered Gaussian process with covariance function $\mathrm{Cov}(X_i^{(k)}(t), X_i^{(k)}(s)) = 4\exp(-|t - s|)$
- the irrelevant variables are independent of one another and of the first three variables
48
- three levels of noise for the random error: $\sigma = 1$, 1.25 and 2, corresponding to signal-to-noise ratios (SNR) of 6.39, 5.11 and 3.19
- the SNR is defined via $\gamma' \tilde Z' \tilde Z \gamma / N$
- for each simulated data set: cubic splines with five equidistant internal knots are used
- number of simulations: 500
49 Reported criteria:
- the mean value of the tuning parameter $\lambda$
- the average number of variables selected (S)
- the average number of truly zero variables that were selected (false positives, FP)
- the average number of truly nonzero variables that were not selected (false negatives, FN)
- the mean and standard deviation of the model error (ME): $(\hat\gamma - \gamma)' \tilde Z' \tilde Z (\hat\gamma - \gamma)/N$
50 Table 2: Model selection ability ($\lambda$, S, FP, FN and ME, with standard deviations in parentheses) of gbridge, gscad, gmcp, glasso 1 and glasso 2, for $\sigma = 1$ and $\sigma = 2$.
51 Remarks from this simulation study:
- for all signal-to-noise ratios, gbridge and glasso 1 lead to the best results, both in selection ability and in model error, compared to the other methods
- the gbridge method is better in model error, while the glasso 1 method is better in selection ability
- the gscad and gmcp procedures are not very good in selection ability: the number of false positives is rather high
- the second implementation of the group lasso (glasso 2) gives reasonably correct selection results, but a very poor model error
52 Typical performance of the estimators of the first four coefficients, for noise level $\sigma = 1.25$.
Figure 4: Model error (boxplots) of gbridge, gscad, gmcp and glasso 1 at the three signal-to-noise ratios.
53 THANK YOU!
More informationStatistics for high-dimensional data: Group Lasso and additive models
Statistics for high-dimensional data: Group Lasso and additive models Peter Bühlmann and Sara van de Geer Seminar für Statistik, ETH Zürich May 2012 The Group Lasso (Yuan & Lin, 2006) high-dimensional
More informationHomogeneity Pursuit. Jianqing Fan
Jianqing Fan Princeton University with Tracy Ke and Yichao Wu http://www.princeton.edu/ jqfan June 5, 2014 Get my own profile - Help Amazing Follow this author Grace Wahba 9 Followers Follow new articles
More informationComments on \Wavelets in Statistics: A Review" by. A. Antoniadis. Jianqing Fan. University of North Carolina, Chapel Hill
Comments on \Wavelets in Statistics: A Review" by A. Antoniadis Jianqing Fan University of North Carolina, Chapel Hill and University of California, Los Angeles I would like to congratulate Professor Antoniadis
More informationLinear Regression Linear Regression with Shrinkage
Linear Regression Linear Regression ith Shrinkage Introduction Regression means predicting a continuous (usually scalar) output y from a vector of continuous inputs (features) x. Example: Predicting vehicle
More informationProperties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation
Properties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation Adam J. Rothman School of Statistics University of Minnesota October 8, 2014, joint work with Liliana
More informationA New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables
A New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables Qi Tang (Joint work with Kam-Wah Tsui and Sijian Wang) Department of Statistics University of Wisconsin-Madison Feb. 8,
More informationCS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS
CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS * Some contents are adapted from Dr. Hung Huang and Dr. Chengkai Li at UT Arlington Mingon Kang, Ph.D. Computer Science, Kennesaw State University Problems
More informationTECHNICAL REPORT NO. 1091r. A Note on the Lasso and Related Procedures in Model Selection
DEPARTMENT OF STATISTICS University of Wisconsin 1210 West Dayton St. Madison, WI 53706 TECHNICAL REPORT NO. 1091r April 2004, Revised December 2004 A Note on the Lasso and Related Procedures in Model
More informationLecture 14: Shrinkage
Lecture 14: Shrinkage Reading: Section 6.2 STATS 202: Data mining and analysis October 27, 2017 1 / 19 Shrinkage methods The idea is to perform a linear regression, while regularizing or shrinking the
More informationLinear regression methods
Linear regression methods Most of our intuition about statistical methods stem from linear regression. For observations i = 1,..., n, the model is Y i = p X ij β j + ε i, j=1 where Y i is the response
More informationMSA220/MVE440 Statistical Learning for Big Data
MSA220/MVE440 Statistical Learning for Big Data Lecture 9-10 - High-dimensional regression Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Recap from
More informationLinear Methods for Regression. Lijun Zhang
Linear Methods for Regression Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Linear Regression Models and Least Squares Subset Selection Shrinkage Methods Methods Using Derived
More informationEnsemble estimation and variable selection with semiparametric regression models
Ensemble estimation and variable selection with semiparametric regression models Sunyoung Shin Department of Mathematical Sciences University of Texas at Dallas Joint work with Jason Fine, Yufeng Liu,
More informationLinear model selection and regularization
Linear model selection and regularization Problems with linear regression with least square 1. Prediction Accuracy: linear regression has low bias but suffer from high variance, especially when n p. It
More informationVariable Selection in Restricted Linear Regression Models. Y. Tuaç 1 and O. Arslan 1
Variable Selection in Restricted Linear Regression Models Y. Tuaç 1 and O. Arslan 1 Ankara University, Faculty of Science, Department of Statistics, 06100 Ankara/Turkey ytuac@ankara.edu.tr, oarslan@ankara.edu.tr
More informationShrinkage Tuning Parameter Selection in Precision Matrices Estimation
arxiv:0909.1123v1 [stat.me] 7 Sep 2009 Shrinkage Tuning Parameter Selection in Precision Matrices Estimation Heng Lian Division of Mathematical Sciences School of Physical and Mathematical Sciences Nanyang
More informationHigh-dimensional regression modeling
High-dimensional regression modeling David Causeur Department of Statistics and Computer Science Agrocampus Ouest IRMAR CNRS UMR 6625 http://www.agrocampus-ouest.fr/math/causeur/ Course objectives Making
More informationA Significance Test for the Lasso
A Significance Test for the Lasso Lockhart R, Taylor J, Tibshirani R, and Tibshirani R Ashley Petersen May 14, 2013 1 Last time Problem: Many clinical covariates which are important to a certain medical
More informationNonparametric Regression. Badr Missaoui
Badr Missaoui Outline Kernel and local polynomial regression. Penalized regression. We are given n pairs of observations (X 1, Y 1 ),...,(X n, Y n ) where Y i = r(x i ) + ε i, i = 1,..., n and r(x) = E(Y
More informationA Survey of L 1. Regression. Céline Cunen, 20/10/2014. Vidaurre, Bielza and Larranaga (2013)
A Survey of L 1 Regression Vidaurre, Bielza and Larranaga (2013) Céline Cunen, 20/10/2014 Outline of article 1.Introduction 2.The Lasso for Linear Regression a) Notation and Main Concepts b) Statistical
More informationSelection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty
Journal of Data Science 9(2011), 549-564 Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty Masaru Kanba and Kanta Naito Shimane University Abstract: This paper discusses the
More informationBAGUS: Bayesian Regularization for Graphical Models with Unequal Shrinkage
BAGUS: Bayesian Regularization for Graphical Models with Unequal Shrinkage Lingrui Gan, Naveen N. Narisetty, Feng Liang Department of Statistics University of Illinois at Urbana-Champaign Problem Statement
More informationUNIVERSITETET I OSLO
UNIVERSITETET I OSLO Det matematisk-naturvitenskapelige fakultet Examination in: STK4030 Modern data analysis - FASIT Day of examination: Friday 13. Desember 2013. Examination hours: 14.30 18.30. This
More informationUltra High Dimensional Variable Selection with Endogenous Variables
1 / 39 Ultra High Dimensional Variable Selection with Endogenous Variables Yuan Liao Princeton University Joint work with Jianqing Fan Job Market Talk January, 2012 2 / 39 Outline 1 Examples of Ultra High
More informationOutlier Detection and Robust Estimation in Nonparametric Regression
Outlier Detection and Robust Estimation in Nonparametric Regression Dehan Kong Howard Bondell Weining Shen University of Toronto University of Melbourne University of California, Irvine Abstract This paper
More informationLinear Model Selection and Regularization
Linear Model Selection and Regularization Recall the linear model Y = β 0 + β 1 X 1 + + β p X p + ɛ. In the lectures that follow, we consider some approaches for extending the linear model framework. In
More informationLikelihood Ratio Tests. that Certain Variance Components Are Zero. Ciprian M. Crainiceanu. Department of Statistical Science
1 Likelihood Ratio Tests that Certain Variance Components Are Zero Ciprian M. Crainiceanu Department of Statistical Science www.people.cornell.edu/pages/cmc59 Work done jointly with David Ruppert, School
More informationA General Framework for Variable Selection in Linear Mixed Models with Applications to Genetic Studies with Structured Populations
A General Framework for Variable Selection in Linear Mixed Models with Applications to Genetic Studies with Structured Populations Joint work with Karim Oualkacha (UQÀM), Yi Yang (McGill), Celia Greenwood
More informationarxiv: v1 [stat.me] 4 Oct 2013
Monotone Splines Lasso Linn Cecilie Bergersen, Kukatharmini Tharmaratnam and Ingrid K. Glad Department of Mathematics, University of Oslo arxiv:1310.1282v1 [stat.me] 4 Oct 2013 Abstract We consider the
More informationAsymptotic Equivalence of Regularization Methods in Thresholded Parameter Space
Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space Jinchi Lv Data Sciences and Operations Department Marshall School of Business University of Southern California http://bcf.usc.edu/
More informationGroup descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors
Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors Patrick Breheny Department of Biostatistics University of Iowa Jian Huang Department of Statistics
More informationarxiv: v1 [stat.me] 30 Dec 2017
arxiv:1801.00105v1 [stat.me] 30 Dec 2017 An ISIS screening approach involving threshold/partition for variable selection in linear regression 1. Introduction Yu-Hsiang Cheng e-mail: 96354501@nccu.edu.tw
More informationVariable Selection for Nonparametric Quantile. Regression via Smoothing Spline ANOVA
Variable Selection for Nonparametric Quantile Regression via Smoothing Spline ANOVA Chen-Yen Lin, Hao Helen Zhang, Howard D. Bondell and Hui Zou February 15, 2012 Author s Footnote: Chen-Yen Lin (E-mail:
More informationPaper Review: Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties by Jianqing Fan and Runze Li (2001)
Paper Review: Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties by Jianqing Fan and Runze Li (2001) Presented by Yang Zhao March 5, 2010 1 / 36 Outlines 2 / 36 Motivation
More informationExam: high-dimensional data analysis January 20, 2014
Exam: high-dimensional data analysis January 20, 204 Instructions: - Write clearly. Scribbles will not be deciphered. - Answer each main question not the subquestions on a separate piece of paper. - Finish
More informationBayesian Adaptation. Aad van der Vaart. Vrije Universiteit Amsterdam. aad. Bayesian Adaptation p. 1/4
Bayesian Adaptation Aad van der Vaart http://www.math.vu.nl/ aad Vrije Universiteit Amsterdam Bayesian Adaptation p. 1/4 Joint work with Jyri Lember Bayesian Adaptation p. 2/4 Adaptation Given a collection
More informationESL Chap3. Some extensions of lasso
ESL Chap3 Some extensions of lasso 1 Outline Consistency of lasso for model selection Adaptive lasso Elastic net Group lasso 2 Consistency of lasso for model selection A number of authors have studied
More informationLASSO-Type Penalization in the Framework of Generalized Additive Models for Location, Scale and Shape
LASSO-Type Penalization in the Framework of Generalized Additive Models for Location, Scale and Shape Nikolaus Umlauf https://eeecon.uibk.ac.at/~umlauf/ Overview Joint work with Andreas Groll, Julien Hambuckers
More informationPENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA
PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University
More informationA Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression
A Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression Noah Simon Jerome Friedman Trevor Hastie November 5, 013 Abstract In this paper we purpose a blockwise descent
More informationHigh-dimensional Ordinary Least-squares Projection for Screening Variables
1 / 38 High-dimensional Ordinary Least-squares Projection for Screening Variables Chenlei Leng Joint with Xiangyu Wang (Duke) Conference on Nonparametric Statistics for Big Data and Celebration to Honor
More informationOutlier detection and variable selection via difference based regression model and penalized regression
Journal of the Korean Data & Information Science Society 2018, 29(3), 815 825 http://dx.doi.org/10.7465/jkdi.2018.29.3.815 한국데이터정보과학회지 Outlier detection and variable selection via difference based regression
More informationLinear Models for Regression CS534
Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict
More informationCOMMENTS ON «WAVELETS IN STATISTICS: A REVIEW» BY A. ANTONIADIS
J. /ta/. Statist. Soc. (/997) 2, pp. /31-138 COMMENTS ON «WAVELETS IN STATISTICS: A REVIEW» BY A. ANTONIADIS Jianqing Fan University ofnorth Carolina, Chapel Hill and University of California. Los Angeles
More informationBusiness Statistics. Tommaso Proietti. Model Evaluation and Selection. DEF - Università di Roma 'Tor Vergata'
Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Model Evaluation and Selection Predictive Ability of a Model: Denition and Estimation We aim at achieving a balance between parsimony
More informationVARIABLE SELECTION AND ESTIMATION WITH THE SEAMLESS-L 0 PENALTY
Statistica Sinica 23 (2013), 929-962 doi:http://dx.doi.org/10.5705/ss.2011.074 VARIABLE SELECTION AND ESTIMATION WITH THE SEAMLESS-L 0 PENALTY Lee Dicker, Baosheng Huang and Xihong Lin Rutgers University,
More informationISyE 691 Data mining and analytics
ISyE 691 Data mining and analytics Regression Instructor: Prof. Kaibo Liu Department of Industrial and Systems Engineering UW-Madison Email: kliu8@wisc.edu Office: Room 3017 (Mechanical Engineering Building)
More informationRatemaking application of Bayesian LASSO with conjugate hyperprior
Ratemaking application of Bayesian LASSO with conjugate hyperprior Himchan Jeong and Emiliano A. Valdez University of Connecticut Actuarial Science Seminar Department of Mathematics University of Illinois
More informationBoosting Methods: Why They Can Be Useful for High-Dimensional Data
New URL: http://www.r-project.org/conferences/dsc-2003/ Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003) March 20 22, Vienna, Austria ISSN 1609-395X Kurt Hornik,
More informationBayesian shrinkage approach in variable selection for mixed
Bayesian shrinkage approach in variable selection for mixed effects s GGI Statistics Conference, Florence, 2015 Bayesian Variable Selection June 22-26, 2015 Outline 1 Introduction 2 3 4 Outline Introduction
More informationLinear Models for Regression CS534
Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict
More informationLearning with Sparsity Constraints
Stanford 2010 Trevor Hastie, Stanford Statistics 1 Learning with Sparsity Constraints Trevor Hastie Stanford University recent joint work with Rahul Mazumder, Jerome Friedman and Rob Tibshirani earlier
More informationMath 533 Extra Hour Material
Math 533 Extra Hour Material A Justification for Regression The Justification for Regression It is well-known that if we want to predict a random quantity Y using some quantity m according to a mean-squared
More informationWavelet Estimators in Nonparametric Regression: A Comparative Simulation Study
Wavelet Estimators in Nonparametric Regression: A Comparative Simulation Study Anestis Antoniadis, Jeremie Bigot Laboratoire IMAG-LMC, University Joseph Fourier, BP 53, 384 Grenoble Cedex 9, France. and
More informationLocal Polynomial Modelling and Its Applications
Local Polynomial Modelling and Its Applications J. Fan Department of Statistics University of North Carolina Chapel Hill, USA and I. Gijbels Institute of Statistics Catholic University oflouvain Louvain-la-Neuve,
More informationConsistent Group Identification and Variable Selection in Regression with Correlated Predictors
Consistent Group Identification and Variable Selection in Regression with Correlated Predictors Dhruv B. Sharma, Howard D. Bondell and Hao Helen Zhang Abstract Statistical procedures for variable selection
More information