Regression When the Predictors Are Images

University of Haifa, From the SelectedWorks of Philip T. Reiss, May 2009
Regression When the Predictors Are Images
Philip T. Reiss, New York University
Available at:

Regression When the Predictors Are Images
Philip T. Reiss, New York University
Wright State University, Dayton, Ohio, May 1, 2009
Joint work with Todd Ogden
Research supported in part by grant number 5F31MH, NIMH

Simple linear regression: fit the least-squares line.

[Figure: scatterplots of y versus x with fitted least-squares lines.]

Multiple linear regression: fit the least-squares (hyper)plane.

[Figure: 3-D scatterplots of y versus (x1, x2) with fitted least-squares planes.]

Basic model equation for multiple linear regression:

y_i = β_0 + β_1 x_{i1} + ⋯ + β_{p−1} x_{i,p−1} + ε_i, for i = 1, ..., n,

where

- y_i is the ith subject's outcome,
- β_0 is the intercept of the true line (or plane, ...),
- x_{i1}, ..., x_{i,p−1} are subject i's values for predictors 1, ..., p−1,
- β_1, ..., β_{p−1} are the coefficients, or effects, of predictors 1, ..., p−1,
- ε_i is an error term, which in the nicest case is "iid normal": (1) independent across subjects, (2) identically distributed for each subject, (3) normally distributed. Each of these three conditions can be relaxed.

The above system of n equations in p unknowns can be written in inner product form as

y_i = x_i^T β + ε_i, i = 1, ..., n,

where x_i = (1, x_{i1}, ..., x_{i,p−1})^T and β = (β_0, β_1, ..., β_{p−1})^T, or in matrix form as

y = Xβ + ε,

where X is the n × p matrix with ith row x_i^T (the "design matrix"). The least-squares estimate of β is then

β̂ = argmin_{β ∈ R^p} ‖y − Xβ‖².

If rank(X) = p, then we have the unique minimizer β̂ = (X^T X)^{−1} X^T y. But if n < p, there are ordinarily infinitely many β ∈ R^p such that ‖y − Xβ‖² = 0, so we generally seek the optimal β within some reasonable subset of R^p.
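A minimal numerical sketch of these two regimes, using hypothetical simulated data and numpy only: with n > p the normal equations yield the unique least-squares solution, while with n < p the fit is exact and non-unique, and np.linalg.lstsq merely returns the minimum-norm member of the solution set.

```python
import numpy as np

rng = np.random.default_rng(0)

# n > p: rank(X) = p, so the least-squares minimizer is unique.
n, p = 100, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)    # (X'X)^{-1} X'y

# n < p: infinitely many beta give zero residual; lstsq picks the
# minimum-norm one among them.
X_wide = rng.normal(size=(10, 50))
y_wide = rng.normal(size=10)
beta_mn, *_ = np.linalg.lstsq(X_wide, y_wide, rcond=None)
print(np.allclose(X_wide @ beta_mn, y_wide))    # True: exact fit
```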

A motivating example: serotonin receptors in depression

Major depressive disorder affects 9.5% of the U.S. population each year. Serotonin (5-HT), a neurotransmitter, is believed to play an important role in the disorder, mediated in part by the distribution of 5-HT_1A receptors in the brain.

5-HT_1A receptor binding potential (BP), a measure of the receptors' availability, can be mapped at each voxel (volume unit) of the brain via PET imaging and tracer kinetic modeling.

Question: How can we relate these BP maps to a depression-related outcome such as the Hamilton Depression Score (HAM-D)?

Idea: regress HAM-D (scalar) on BP (image):

y_i = α + s_i^T f + ε_i

This can be thought of

1. as super-high-dimensional regression, or
2. as regression with functional data (L² inner product).

Many potential applications in psychiatry, including

1. predicting treatment response in depression;
2. predicting progression from mild cognitive impairment to Alzheimer's disease.

Road map: a 2 × 2 grid crossing {linear regression, generalized linear regression} with {1-D signal predictors, 2-D image predictors}.

An easier application: near-infrared spectroscopy

[Figure: raw and differenced wheat near-infrared spectra, and coefficient function estimates, plotted against wavelength.]

We model a vector y of n scalar responses as

y = Xα + Sf + ε

where

- X is an n × p covariate matrix (it may just be a column of 1s);
- S is an n × N matrix whose ith row s_i^T represents a signal predictor defined at points v_1, ..., v_N, and each column of S has mean zero;
- ε denotes iid errors.

Since n ≪ N, we must reduce the dimension of S to solve for the coefficient function f.

Previous approaches to solving for f:

1. Principal component regression (Massy, 1965). Replace the model y = Xα + Sf + ε with y = Xα + SV_q β + ε, where V_q comprises the q leading columns of V in the singular value decomposition S = UDV^T; take f̂ = V_q β̂.
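A sketch of approach 1 in Python (numpy only). The helper name pcr_fit is hypothetical, and the covariate term Xα is omitted for brevity, as if X were just an intercept absorbed by centering y:

```python
import numpy as np

def pcr_fit(S, y, q):
    """Principal component regression: regress y on the q leading
    right singular vectors of S; return f_hat = V_q @ beta_hat."""
    U, d, Vt = np.linalg.svd(S, full_matrices=False)
    Vq = Vt[:q].T                       # N x q leading PC directions
    Z = S @ Vq                          # reduced design matrix, n x q
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return Vq @ beta                    # coefficient function estimate
```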

2. Penalized B-spline expansion. Replace y = Xα + Sf + ε with y = Xα + SBβ + ε, where the columns of B form a B-spline basis; take f̂ = Bβ̂, where (α̂, β̂) minimize the penalized least squares criterion

‖y − Xα − SBβ‖² + λ β^T P β.

P is chosen so that β^T P β is a measure of the roughness of f = Bβ, e.g., ∫ f″(v)² dv (Cardot, Ferraty, and Sarda, 2003) or a difference penalty (Marx and Eilers, 1999). The choice of λ > 0 governs the tradeoff between fidelity to the data and smoothness of the coefficient function (higher λ, smoother fit).
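A sketch of approach 2 under a Marx-Eilers-style difference penalty, again with the Xα term omitted; B is assumed to be a precomputed N × K B-spline basis matrix evaluated at the sampling points v_1, ..., v_N:

```python
import numpy as np

def penalized_spline_fit(S, B, y, lam, order=2):
    """Penalized B-spline expansion: minimize
    ||y - S B beta||^2 + lam * beta' P beta, where P = D'D is built
    from the order-th difference matrix D (a difference penalty)."""
    Z = S @ B                                  # n x K reduced design
    K = B.shape[1]
    D = np.diff(np.eye(K), n=order, axis=0)    # difference matrix
    P = D.T @ D
    beta = np.linalg.solve(Z.T @ Z + lam * P, Z.T @ y)
    return B @ beta                            # f_hat at v_1, ..., v_N
```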

Functional principal component regression

Reiss and Ogden (2007) propose to combine the above two approaches to obtain a coefficient function estimate

f̂ = B V_q ζ̂,

where

- B is N × K, with columns forming a B-spline basis as before;
- V_q is a K × q matrix whose columns are the leading columns of V in the singular value decomposition SB = UDV^T.

This functional principal component regression (FPCR) modifies the above penalized least squares criterion

‖y − Xα − SBβ‖² + λ β^T P β

by further restricting to the q leading principal components, to obtain

‖y − Xα − SBV_q ζ‖² + λ ζ^T V_q^T P V_q ζ.
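Putting the two pieces together, a minimal FPCR sketch (the hypothetical helper fpcr_fit uses a difference penalty as a stand-in for a generic P, and omits covariates beyond an implicit intercept):

```python
import numpy as np

def fpcr_fit(S, B, y, q, lam, order=2):
    """FPCR sketch: project SB onto its q leading principal components,
    then solve the roughness-penalized least-squares problem
    ||y - S B V_q zeta||^2 + lam * zeta' V_q' P V_q zeta."""
    Z = S @ B                                   # n x K
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    Vq = Vt[:q].T                               # K x q
    D = np.diff(np.eye(B.shape[1]), n=order, axis=0)
    Pq = Vq.T @ (D.T @ D) @ Vq                  # q x q restricted penalty
    Zq = Z @ Vq                                 # n x q FPCR design matrix
    zeta = np.linalg.solve(Zq.T @ Zq + lam * Pq, Zq.T @ y)
    return B @ (Vq @ zeta)                      # f_hat
```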

The second dimension reduction, SB → SBV_q, is optimal in the following minimax sense.

Proposition 1. Let UDV^T be the singular value decomposition of the n × K matrix Z, let U_q be the matrix consisting of the first q < min(n, K) columns of U, and let D_q be the q × q upper left submatrix of D. Then M_0 = U_q D_q minimizes

max_{w ∈ R^K, ‖w‖ = 1} ‖Zw − proj_M Zw‖

over all n × q matrices M, where proj_M denotes projection onto the column space of M.

In other words: if we wish to replace an n × K design matrix Z = SB with an n × q design matrix M, q < K, the maximum perturbation in fitted values is minimized by taking M to be U_q D_q, which equals SBV_q, the FPCR design matrix.

Tuning parameter 1: number of principal components

The q in

‖y − Xα − SBV_q ζ‖² + λ ζ^T V_q^T P V_q ζ

can be chosen by multifold cross-validation:

1. Divide the data points (y_i, x_i, s_i) into, say, 5 validation sets.
2. For given q, and for k = 1, ..., 5: remove the kth validation set and obtain estimates α̂^(−k)(q), ζ̂^(−k)(q) from the remaining data (the "training set"); use those estimates to obtain predicted values ŷ_k^(−k)(q) for the left-out outcomes; record the prediction error ‖y_k − ŷ_k^(−k)(q)‖² for the kth validation set.
3. Choose q such that ∑_{k=1}^5 ‖y_k − ŷ_k^(−k)(q)‖² is minimized.

Alternatively, just choose a q that seems large enough to capture the necessary detail. A sketch of the cross-validation procedure appears below.
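A minimal sketch of this multifold CV loop, reusing the hypothetical fpcr_fit helper from the FPCR sketch above (covariates again omitted):

```python
import numpy as np

def cv_choose_q(S, B, y, q_grid, lam, n_folds=5, seed=0):
    """Choose the number of FPCR components by multifold CV:
    minimize the summed squared prediction error over held-out sets."""
    n = len(y)
    perm = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(perm, n_folds)
    err = {}
    for q in q_grid:
        sse = 0.0
        for holdout in folds:
            train = np.setdiff1d(np.arange(n), holdout)
            f_hat = fpcr_fit(S[train], B, y[train], q, lam)
            y_pred = S[holdout] @ f_hat      # predicted left-out outcomes
            sse += np.sum((y[holdout] - y_pred) ** 2)
        err[q] = sse
    return min(err, key=err.get)             # q with smallest CV error
```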

Tuning parameter 2: roughness penalty parameter λ

[Figure: GCV criterion versus λ, with fits ranging from "too bumpy" (λ too small) through "just right" to "too smooth" (λ too large).]

Given q, the λ in

‖y − Xα − SBV_q ζ‖² + λ ζ^T V_q^T P V_q ζ

can be chosen by optimizing over λ a criterion function such as:

1. generalized cross-validation (GCV; Craven and Wahba, 1979), which is related to cross-validation but does not require fitting the model repeatedly (a sketch follows below);
2. restricted maximum likelihood (REML; Ruppert, Wand, and Carroll, 2003), which relies on the connection with linear mixed models.
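A sketch of the GCV criterion for the FPCR fit, written as GCV(λ) = n‖y − ŷ_λ‖² / [n − tr(H_λ)]², where H_λ is the hat matrix of the ridge-type solve. Zq = SBV_q and Pq = V_q^T P V_q are as in the earlier sketch; the helper name gcv is hypothetical:

```python
import numpy as np

def gcv(lam, Zq, Pq, y):
    """GCV criterion (Craven & Wahba, 1979) for the penalized fit
    zeta = (Zq'Zq + lam*Pq)^{-1} Zq'y."""
    n = len(y)
    A = np.linalg.solve(Zq.T @ Zq + lam * Pq, Zq.T)
    H = Zq @ A                               # n x n hat matrix
    resid = y - H @ y
    return n * np.sum(resid ** 2) / (n - np.trace(H)) ** 2

# Minimize over a log-spaced grid of candidate values:
# lam_hat = min(np.logspace(-4, 4, 50), key=lambda l: gcv(l, Zq, Pq, y))
```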

Reiss and Ogden (2009a) derived a one-parameter family of functions h_k (k ≥ 1) such that

sgn[(d/dλ) REML(λ)] = sgn[h_2(λ) − h_1(λ)]  and  sgn[(d/dλ) GCV(λ)] = sgn[h_2(λ) − h_3(λ)].

In non-pathological cases, h_1 crosses h_2 (from smaller to larger) at a unique value λ̂_REML, whereas h_3 crosses h_2 (from larger to smaller) at a unique point λ̂_GCV. REML smooths more than GCV iff the latter crossing of h_2 precedes the former one.

[Figure: h_1, h_2, h_3 versus log(λ), with the GCV and REML crossing points marked.]

Some asymptotic results

Assume the outcomes are generated by the model

y = α1 + S*Bβ* + ε,

where

- S* = (s*_1, ..., s*_n)^T, with s*_i = (s*_{i1}, ..., s*_{iN})^T, i = 1, 2, ..., iid random vectors such that E(s*_{1j}) = 0 and E(s*_{1j}⁴) < ∞ for each j = 1, ..., N;
- ε is a vector of iid errors with mean zero and finite variance, independent of S*;
- B is a fixed N × K B-spline basis matrix.

Let V*_q be the K × q population principal component matrix whose columns are the eigenvectors of E(B^T s*_1 s*_1^T B) corresponding to its leading eigenvalues ξ_1 > ⋯ > ξ_q > 0, and let Ξ_q = diag(ξ_1, ..., ξ_q).

Theorem 1. Suppose β* = V*_q ζ* for some ζ* = (ζ*_1, ..., ζ*_q)^T. If β̂_n = V_q ζ̂ denotes a q-component FPCR estimate for which λ_n is chosen to be o_P(n^{1/2}), then

n^{1/2}(β̂_n − β*) →_d Z_1 + Z_2,

where Z_1 ~ N_K(0, σ² V*_q Ξ_q^{−1} V*_q^T) and Z_2 ~ N_K(0, W) for a K × K matrix W not depending on σ².

Under mild assumptions, the λ_n = o_P(n^{1/2}) condition imposed in Theorem 1 is met if λ_n is chosen by GCV or REML. Indeed, a stronger condition then holds:

Theorem 2. Let λ_n be the GCV or REML value associated with the q-component FPCR estimate. Assume that V*_q^T P V*_q is nonsingular, and that f = Bβ* with V*_q^T β* ≠ 0. Then there exists M > 0 such that P(λ_n > M) → 0 as n → ∞.

Consistency of choosing the number of components by multifold CV:

Theorem 3. Assume that f = BV*_q ζ* with ζ*_q ≠ 0. Suppose the number of FPCR components is chosen by multifold CV using D_n divisions of the n observations into training and validation sets of sizes n_t and n_v = n − n_t, respectively, with λ_n = o_P(n^{1/2}). If n_t → ∞, n_v → ∞ and D_n = o_P[(min{n_t, n_v})^{1/2}], then for any positive integer q_1 < q, the q-component model will be chosen over the q_1-component model with probability tending to 1 as n → ∞.

[Figure: (a) raw wheat spectra, log(1/R) versus wavelength (nm); (b) differenced wheat spectra; (c) coefficient function for moisture estimated by PCR; (d) coefficient function for moisture estimated by FPCR-REML.]

Simulation study

Given an n × N signal matrix S (an N-dimensional signal for each of n subjects), create a true coefficient function f ∈ R^N and simulate outcomes

y = Sf + ε,

where ε consists of n iid normal errors. Then use FPCR to derive the estimate f̂. Two key performance metrics are

- estimation error ‖f̂ − f‖²,
- prediction error ‖ŷ − E(y | S)‖² = ‖Sf̂ − Sf‖².

A sketch of one simulation replicate appears below.
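One replicate of this design, reusing the hypothetical fpcr_fit sketch from above (the helper name one_sim and its arguments are assumptions, not the authors' actual simulation code):

```python
import numpy as np

def one_sim(S, B, f_true, sigma, q, lam, rng):
    """Draw y = S f + eps, fit f by FPCR, and return the two
    performance metrics from the slide."""
    y = S @ f_true + rng.normal(scale=sigma, size=S.shape[0])
    f_hat = fpcr_fit(S, B, y, q, lam)
    est_err = np.sum((f_hat - f_true) ** 2)           # ||f_hat - f||^2
    pred_err = np.sum((S @ (f_hat - f_true)) ** 2)    # ||S f_hat - S f||^2
    return est_err, pred_err
```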

Table 1: Average scaled mean squared error of prediction, under a smooth and a bumpy true coefficient function, for the wheat and gasoline data sets. Methods compared: PBSE (GCV, REML, oracle), PCR, B-spline PCR, FPCR_C, FPCR_R (GCV, REML, oracle, plug-in), PLS, B-spline PLS, FPLS_C, and FPLS_R (GCV, REML). [Numeric entries not preserved in transcription.]

Table 2: Mean of the L² norm of the error (root integrated squared error) in estimating f, under the same settings and methods as Table 1. [Numeric entries not preserved in transcription.]

Road map (progress): a 2 × 2 grid crossing {linear regression, generalized linear regression} with {1-D signal predictors, 2-D image predictors}.

Generalized linear models

Problem: the regression model y_i = x_i^T β + ε_i is not appropriate for certain types of outcome y_i, e.g., if y_i is binary (0/1).

Under the standard assumptions that ε_i is independent of x_i and has expectation 0, µ_i ≡ E(y_i | x_i) = x_i^T β. This motivates the generalized linear model

µ_i = g^{−1}(x_i^T β),

where g is an invertible link function.

Key example: logistic regression. If y_i is binary, µ_i is simply p_i ≡ P(y_i = 1). With inverse link g^{−1}(t) = e^t / (1 + e^t), the above becomes

p_i = exp(x_i^T β) / (1 + exp(x_i^T β)), i.e., y_i ~ Bernoulli( exp(x_i^T β) / (1 + exp(x_i^T β)) ).
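A compact sketch of fitting this logistic model by iteratively reweighted least squares, the standard GLM fitting algorithm (generic textbook method, not specific to this talk):

```python
import numpy as np

def logistic_irls(X, y, n_iter=25):
    """Iteratively reweighted least squares for logistic regression:
    repeatedly solve a weighted least-squares problem (Newton steps)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.clip(1.0 / (1.0 + np.exp(-eta)), 1e-8, 1 - 1e-8)
        w = mu * (1.0 - mu)              # Bernoulli variance function
        z = eta + (y - mu) / w           # working response
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
    return beta
```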

FPCR for generalized linear models

Reiss and Ogden (2009b) extended FPCR from the linear model µ = Xα + Sf to the generalized linear model

g(µ) ≡ [g(µ_1), ..., g(µ_n)]^T = Xα + Sf,

where g is an appropriate link function. Again we apply the restriction f = BV_q ζ to obtain, e.g., the logistic model

y_i ~ Bernoulli( exp[(Xα + SBV_q ζ)_i] / (1 + exp[(Xα + SBV_q ζ)_i]) ).

Road map (progress): a 2 × 2 grid crossing {linear regression, generalized linear regression} with {1-D signal predictors, 2-D image predictors}.

Extension to image predictors

As for the linear model with 1-D predictors, we seek to minimize

‖y − Xα − SBV_q ζ‖² + λ ζ^T V_q^T P V_q ζ,

but must modify B and P.

For the columns of B, we chose radial cubic B-splines (Saranli and Baykal, 1998) centered at an equally spaced grid of knots κ_1, ..., κ_K ∈ R². We chose P to yield the thin plate penalty, given in two dimensions by

ζ^T V_q^T P V_q ζ = ∫∫ [ (∂²f/∂v₁²)² + 2(∂²f/∂v₁∂v₂)² + (∂²f/∂v₂²)² ] dv₁ dv₂.
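For illustration only, a common stand-in for a 2-D roughness penalty: a Kronecker-sum difference penalty over a tensor-product grid of basis coefficients. Note this swaps the talk's radial-spline thin-plate penalty for a P-spline-style difference penalty; it is a sketch of the general idea, not the authors' construction:

```python
import numpy as np

def grid_roughness_penalty(K1, K2, order=2):
    """Kronecker-sum difference penalty for coefficients laid out on a
    K1 x K2 grid: penalizes roughness along each coordinate direction."""
    def diff_pen(K):
        D = np.diff(np.eye(K), n=order, axis=0)   # order-th differences
        return D.T @ D
    I1, I2 = np.eye(K1), np.eye(K2)
    # Roughness across rows plus roughness across columns.
    return np.kron(diff_pen(K1), I2) + np.kron(I1, diff_pen(K2))
```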

y_i = α + s_i^T f + ε_i

Key scientific question regarding the PET images: where in the brain (at which voxels v) is f(v) significantly positive or negative?

This question can be approached through simultaneous confidence bands for the coefficient image.

A 95% (pointwise) confidence interval for f(v) is an interval [L, U] (determined by the data) such that P(L ≤ f(v) ≤ U) = .95.

A 95% simultaneous confidence band for f is given by functions f̂_L(·), f̂_U(·) such that P[f̂_L(v) ≤ f(v) ≤ f̂_U(v) for all v] = .95.

[Figure: a 1-D coefficient function with a simultaneous band, with regions flagged as significantly positive or significantly negative where the band excludes zero.]

No analytic expression is available for f̂_L, f̂_U, so we pursue a bootstrapping approach (Mandel and Betensky, 2008):

1. Using, say, 999 random samples with replacement of n data points (y, x, s) from the n observations, obtain bootstrap estimates f̂*_1, ..., f̂*_999 of the coefficient function.
2. At each v, order the function estimates: f̂*_(1)(v) ≤ ⋯ ≤ f̂*_(999)(v).
3. [f̂*_(25)(v), f̂*_(975)(v)] is a pointwise 95% confidence interval for f(v).
4. The Cartesian product ∏_v [f̂*_(25)(v), f̂*_(975)(v)] is in general not a simultaneous 95% confidence band for f; but the Cartesian product of, say, 99% pointwise intervals will be.
5. Given a 95% simultaneous confidence band [f̂_L, f̂_U] for f, the coefficient function is declared significantly positive at all v such that f̂_L(v) > 0.

A sketch of the percentile step appears below.
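A minimal sketch of steps 1-3, reusing the hypothetical fpcr_fit helper from earlier (covariates x omitted; widening the pointwise level, per step 4, yields the simultaneous band):

```python
import numpy as np

def bootstrap_bands(S, B, y, q, lam, n_boot=999, lo=25, hi=975, seed=0):
    """Percentile bootstrap bands for f: resample (y_i, s_i) pairs with
    replacement, refit, and take pointwise order statistics."""
    rng = np.random.default_rng(seed)
    n, N = S.shape
    fits = np.empty((n_boot, N))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample subjects
        fits[b] = fpcr_fit(S[idx], B, y[idx], q, lam)
    fits.sort(axis=0)                          # order estimates at each v
    # Rows lo-1 and hi-1 are the pointwise interval endpoints; use a
    # wider pointwise level (e.g. 99%) to obtain a simultaneous band.
    return fits[lo - 1], fits[hi - 1]
```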

Oversimplified example: [figure].

More realistic example

Function estimates cross over each other. The simultaneous band is formed by the Cartesian product of pointwise intervals running from the 2nd-smallest to the 2nd-largest function estimate:

[Figure.]

Simulation studies

We used 68 maps of binding potential of 5-HT_1A receptors, obtained by Parsey et al. (2006), to perform two simulation studies:

Study 1: compare the performance of FPCR with two other methods.
Study 2: assess how well the simultaneous inference procedure detects nonzero regions of f.

True coefficient image f for the simulation studies: [figure].

Simulation study 1: performance comparisons

We compared three methods, for both linear and logistic regression. The methods all begin by restricting the coefficient function to a spline basis, but differ in what happens next.

(a) Roughness penalty only: referred to above as the penalized B-spline expansion.
(b) Reduction to leading PCs only: similar to FPCR, but without a roughness penalty (i.e., λ = 0). Thus the only smoothing parameter is the number of components, which we chose by 5-fold cross-validation from among the values 1-10.
(c) Leading PCs plus roughness penalty: i.e., FPCR, with 35 PCs (accounting for 96% of the variation in SB).

For both linear and logistic regression, we generated 100 outcome vectors with equally spaced R² ("proportion of explained variation") values from .2 to .95, and computed each method's prediction error.

Figure 1: (a) Linear regression results. (b) Logistic regression results. [Prediction error versus R² for the three methods: all components; 1-10 components with λ = 0; 35 components.]

Simulation study 2: detecting significance

For R²_L = .5, .6, .7, .8, .9 (Menard, 2000), we simulated 20 sets of binary outcomes

y_i ~ Bernoulli( exp(s_i^T f) / (1 + exp(s_i^T f)) ), i = 1, ..., 68,

where s_i denotes subject i's binding potential map (1 slice, 5778 voxels), and f = k f_0, where f_0 is the artificial coefficient image shown previously and k is chosen to attain the specified R²_L.

For each set of outcomes, we estimated f by logistic FPCR with 35 PCs, and formed simultaneous confidence bands from 1999 bootstrap samples. Using these bands, we determined how many times (out of 20) each voxel was deemed significantly positive (or negative).

[Figure: maps of the number of times (out of 20) each voxel was found significant, for R² = 0.5, 0.6, 0.7, 0.8, 0.9.]

f, f̂, and simultaneous bands: 10 examples with R²_L = .7

[Figure: ten panels plotting the coefficient function against voxel index; panel titles report detections, e.g. "9/9 true, 2 false", "4/9 true, 0 false", "0/9 true, 0 false".]

f, f̂, and simultaneous bands: zooming in

[Figure: the same ten panels as above, zoomed in.]

Discussion

Regressing scalar outcomes on entire images is a very challenging problem, but our method seems to do quite well at detecting salient (f ≠ 0) regions.

Extension to 3-D images will require tackling several practical issues. Another extension: a locally adaptive smoothing parameter.

Ongoing work (Todd Ogden and Yihong Zhao) uses wavelets rather than splines, and seeks a sparse representation of the coefficient function.

Thank you!

References

[1] Cardot, H., Ferraty, F., and Sarda, P. (2003). Spline estimators for the functional linear model. Statistica Sinica 13.
[2] Craven, P., and Wahba, G. (1979). Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik 31.
[3] Mandel, M., and Betensky, R. A. (2008). Simultaneous confidence intervals based on the percentile bootstrap approach. Computational Statistics and Data Analysis 52.
[4] Marx, B. D., and Eilers, P. H. C. (1999). Generalized linear regression on sampled signals and curves: a P-spline approach. Technometrics 41.
[5] Massy, W. F. (1965). Principal components regression in exploratory statistical research. Journal of the American Statistical Association 60.
[6] Menard, S. (2000). Coefficients of determination for multiple logistic regression analysis. The American Statistician 54.
[7] Parsey, R. V., Oquendo, M. A., Ogden, R. T., Olvet, D. M., Simpson, N., Huang, Y., Van Heertum, R. L., Arango, V., and Mann, J. J. (2006). Altered serotonin 1A binding in major depression: a [carbonyl-C-11]WAY positron emission tomography study. Biological Psychiatry 59.

[8] Reiss, P. T., and Ogden, R. T. (2007). Functional principal component regression and functional partial least squares. Journal of the American Statistical Association 102.
[9] Reiss, P. T., and Ogden, R. T. (2009a). Smoothing parameter selection for a class of semiparametric linear models. Journal of the Royal Statistical Society, Series B 71.
[10] Reiss, P. T., and Ogden, R. T. (2009b). Functional generalized linear models with images as predictors. Biometrics, to appear.
[11] Ruppert, D., Wand, M. P., and Carroll, R. J. (2003). Semiparametric Regression. Cambridge and New York: Cambridge University Press.
[12] Saranli, A., and Baykal, B. (1998). Complexity reduction in radial basis function (RBF) networks by using radial B-spline functions. Neurocomputing 18.
