Estimation of spatiotemporal effects by the fused lasso for densely sampled spatial data using body condition data set from common minke whales
Mariko Yamamura 1, Hirokazu Yanagihara 2, Keisuke Fukui 3, Hiroko Solvang 4, Nils Øien 4, Tore Haug 4

1 Graduate School of Education, Hiroshima University; 2 Graduate School of Science, Hiroshima University; 3 Research & Development Center, Osaka Medical College; 4 Institute of Marine Research, Norway

1 Introduction

Samples evenly distributed over the whole population are not always available in real data analysis. As an example of spatial data, a data set from common minke whales in Norwegian waters provides body condition measurements together with the whaling locations (longitude and latitude). Although the whales are distributed throughout Norwegian waters, the whaling locations are almost the same every year, so samples are dense at particular locations. Solvang et al. (2017) and Yamamura et al. (2016) modeled the spatial effects in such densely sampled spatial data by polynomials, to keep the estimation results from being strongly influenced by the dense sampling. However, a polynomial is not flexible enough. For a more flexible fit, nonparametric smoothing with basis functions, such as the spline method, is available for estimating spatial effects. However, with densely sampled spatial data, nonparametric smoothing may not give precise estimates of the spatial effects in regions containing few samples. We therefore propose an estimation method for the spatial effect that is hardly affected by the dense samples. In the proposed method, the space to be analyzed is subdivided into
several subregions, and the spatial effect is estimated by the fused lasso, which combines the spatial effects of the subdivided spaces.

2 Estimation Method

2.1 Additive Model with Spatial Effect

Let $y_{ij}$ be a response variable of the $i$-th sample in the $j$-th space for $i = 1, \ldots, n_j$ and $j = 1, \ldots, m$, where $n = \sum_{j=1}^{m} n_j$ and $m$ are the sample size and the number of spaces, respectively, and let $x_{l,ij}$ be the $l$-th explanatory variable of the $i$-th sample in the $j$-th space for $i = 1, \ldots, n_j$, $j = 1, \ldots, m$ and $l = 1, \ldots, p$. The additive model with spatial effect is defined by

  $y_{ij} = \sum_{l=1}^{p} f_l(x_{l,ij}) + \mu_j + \varepsilon_{ij}, \quad (i = 1, \ldots, n_j;\ j = 1, \ldots, m),$

where $f_l$ is a function expressing the influence of the $l$-th explanatory variable, $\mu_j$ is the spatial effect of the $j$-th space, and $\varepsilon_{ij}$ is an error variable of the $i$-th sample in the $j$-th space. Here, we assume that $\varepsilon_{11}, \ldots, \varepsilon_{n_1 1}, \ldots, \varepsilon_{1m}, \ldots, \varepsilon_{n_m m}$ are independently and identically distributed according to a distribution with mean 0 and variance $\sigma^2$. We estimate the spatial effects $\mu_1, \ldots, \mu_m$ and the non-spatial effects $f_1, \ldots, f_p$ by the backfitting algorithm; see Hastie and Tibshirani (1990) for details. Each $f_l$ can be any function, but it should be one that fits the analyzed data properly. For the common minke whale data we estimate $f_l$ nonparametrically, because some explanatory variables showed non-linear shapes in the previous study of Yamamura et al. (2016). We assume that $f_1, \ldots, f_{p_1}$ are linear functions and $f_{p_1+1}, \ldots, f_p$ are nonlinear functions approximated by truncated cubic basis functions, i.e.,

  $\sum_{l=1}^{p} f_l(x_{l,ij}) = \sum_{l=1}^{p_1} \beta_l x_{l,ij} + \sum_{l=p_1+1}^{p} f_l(x_{l,ij}),$
where, for $l = p_1 + 1, \ldots, p$,

  $f_l(x_{l,ij}) = \beta_{l,1} x_{l,ij} + \beta_{l,2} x_{l,ij}^2 + \beta_{l,3} x_{l,ij}^3 + \sum_{g=1}^{b_0} \alpha_{l,g} (x_{l,ij} - \tau_{l,g})_+^3. \quad (1)$

Here, $\tau_{l,g}$ is the knot of the basis function given by the $100g/(b_0+1)\%$ $(g = 1, \ldots, b_0)$ point of the $l$-th explanatory variable, and $(x - \tau)_+^3 = I(x > \tau)(x - \tau)^3$ is the truncated cubic basis function, where $I(A)$ is the indicator function, i.e., $I(A) = 1$ if $A$ is true and $I(A) = 0$ if $A$ is false. Let $\beta_l = (\beta_{l,1}, \beta_{l,2}, \beta_{l,3})'$, $x_{l,ij} = (x_{l,ij}, x_{l,ij}^2, x_{l,ij}^3)'$, $\alpha_l = (\alpha_{l,1}, \ldots, \alpha_{l,b_0})'$ and $b_{l,ij} = ((x_{l,ij} - \tau_{l,1})_+^3, \ldots, (x_{l,ij} - \tau_{l,b_0})_+^3)'$. Then $f_l$ in (1) is written as $f_l(x_{l,ij}) = \beta_l' x_{l,ij} + \alpha_l' b_{l,ij}$. Let $y_j$ and $\varepsilon_j$ be the $n_j$-dimensional vectors obtained by stacking the response variables and error variables of the $j$-th space, respectively, i.e., $y_j = (y_{1j}, \ldots, y_{n_j j})'$ and $\varepsilon_j = (\varepsilon_{1j}, \ldots, \varepsilon_{n_j j})'$. Focusing on one space, the additive model for the $j$-th space is written as

  $y_j = X_j \beta + B_j \alpha + \mu_j 1_{n_j} + \varepsilon_j, \quad (j = 1, \ldots, m), \quad (2)$

where $1_{n_j}$ is the $n_j$-dimensional vector of ones, and the other vectors and matrices are as follows: $\beta = (\beta_1, \ldots, \beta_{p_1}, \beta_{p_1+1}', \ldots, \beta_p')'$ and $\alpha = (\alpha_{p_1+1}', \ldots, \alpha_p')'$; $X_j$ is the $n_j \times k$ matrix ($k = 3p - 2p_1$) whose $i$-th row is $(x_{1,ij}, \ldots, x_{p_1,ij}, x_{p_1+1,ij}', \ldots, x_{p,ij}')$; and $B_j$ is the $n_j \times b$ matrix ($b = b_0(p - p_1)$) whose $i$-th row is $(b_{p_1+1,ij}', \ldots, b_{p,ij}')$. Let $y$ and $\varepsilon$ be the $n$-dimensional vectors obtained by stacking the vectors of response variables and error variables over the spaces, respectively, i.e., $y = (y_1', \ldots, y_m')'$ and $\varepsilon = (\varepsilon_1', \ldots, \varepsilon_m')'$, and let $X$ and $B$ be the $n \times k$ and $n \times b$ matrices obtained by stacking the
matrices of explanatory variables and basis functions of each space, respectively, i.e., $X = (X_1', \ldots, X_m')'$ and $B = (B_1', \ldots, B_m')'$. For the whole space, the additive model in (2) is written as

  $y = X\beta + B\alpha + R\mu + \varepsilon,$

where $\mu = (\mu_1, \ldots, \mu_m)'$ and $R$ is the $n \times m$ matrix defined by

  $R = \begin{pmatrix} 1_{n_1} \otimes e_1' \\ \vdots \\ 1_{n_m} \otimes e_m' \end{pmatrix}.$

Here, $e_j$ is the $m$-dimensional vector whose $j$-th element is 1 while all the other elements are 0, and $\otimes$ denotes the Kronecker product of two matrices.

2.2 Estimation of $\alpha$ and $\beta$

We first explain the estimation method for $\alpha$ and $\beta$, even though $\mu$ also has to be estimated; here we regard the estimator of $\mu$ as given and denote it by $\hat{\mu}$. We use the penalized spline regression introduced by Yanagihara (2012, 2018) to estimate $\alpha$ and $\beta$. Yanagihara (2012) showed that choosing the smoothing parameters in the penalized smoothing spline is equivalent to choosing the ridge parameters in a generalized ridge regression that uses the matrix of transformed basis function values as the matrix of explanatory variables, and Yanagihara (2018) optimized the ridge parameters in generalized ridge regression by minimizing a model selection criterion, the generalized cross-validation (GCV) criterion. Hence, we estimate $\alpha$ by the penalized smoothing spline optimized by minimizing GCV. Let $Q$ be the $b \times b$ orthogonal matrix which diagonalizes $B'(I_n - P_X)B$, where $P_X = X(X'X)^{-1}X'$, such that

  $Q' B'(I_n - P_X) B Q = D = \mathrm{diag}(d_1, \ldots, d_b), \quad (d_1 \ge \cdots \ge d_b). \quad (3)$

By using $Q$ and $D$, we define the following $n \times b$ matrix:

  $H = (I_n - P_X) B Q D^{-1/2}. \quad (4)$
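As a numerical illustration of (3) and (4), the following sketch builds $Q$, $D$, and $H$ from generic matrices $X$ and $B$. The function name make_H and the synthetic data are our own illustrative assumptions, not part of the paper; the sketch also assumes $B'(I_n - P_X)B$ has full rank (truncating near-zero eigenvalues is the role of $\gamma$ in the text).

```python
import numpy as np

def make_H(X, B):
    """Diagonalize B'(I - P_X)B as in (3) and build H = (I - P_X) B Q D^{-1/2} as in (4).

    Returns Q, the eigenvalues d (in descending order), and H."""
    n = X.shape[0]
    P_X = X @ np.linalg.solve(X.T @ X, X.T)   # projection onto the column space of X
    M = np.eye(n) - P_X                       # I_n - P_X (symmetric, idempotent)
    S = B.T @ M @ B                           # B'(I_n - P_X)B
    d, Q = np.linalg.eigh(S)                  # eigh returns ascending eigenvalues
    d, Q = d[::-1], Q[:, ::-1]                # reorder so that d_1 >= ... >= d_b
    H = M @ B @ Q @ np.diag(d ** -0.5)
    return Q, d, H

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
B = rng.standard_normal((50, 5))
Q, d, H = make_H(X, B)
# By construction H'H = I_b and H'X = 0 (up to rounding), which is what
# makes the transformed-basis ridge formulation work.
print(np.allclose(H.T @ H, np.eye(5)), np.allclose(H.T @ X, 0))
```

The two printed checks follow directly from $M$ being a symmetric idempotent matrix with $MX = 0$.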
From $H$, a $b$-dimensional vector $z$ is defined by

  $z = (z_1, \ldots, z_b)' = H'(y - R\hat{\mu}). \quad (5)$

In the penalized spline, some singular values of the matrix of basis functions may become very small, in which case the estimates of $\alpha$ and $\beta$ become unstable. Removing the very small singular values eliminates this weakness of the estimation. Hence we consider using only $d_1, \ldots, d_\gamma$ $(\gamma = 1, \ldots, b)$, i.e., removing $d_{\gamma+1}, \ldots, d_b$, when estimating $\alpha$ and $\beta$. Let $Q_\gamma$ and $H_\gamma$ be the $b \times \gamma$ and $n \times \gamma$ matrices consisting of the 1st to the $\gamma$-th columns of $Q$ in (3) and $H$ in (4), respectively, and let $D_\gamma$ be the $\gamma \times \gamma$ diagonal matrix $D_\gamma = \mathrm{diag}(d_1, \ldots, d_\gamma)$. Moreover, let $t_{\gamma,1} \le \cdots \le t_{\gamma,\gamma}$ be the order statistics of $z_1^2, \ldots, z_\gamma^2$, where $z_j$ is given by (5). By using the order statistics, we define the following dispersion statistics $s_{\gamma,a}^2$ and regions $\pi_{\gamma,a}$:

  $s_{\gamma,a}^2 = \begin{cases} \dfrac{(y - R\hat{\mu})'(I_n - P_X - H_\gamma H_\gamma')(y - R\hat{\mu})}{n - k - \gamma} & (a = 0) \\[2mm] \dfrac{(n - k - \gamma)\, s_{\gamma,0}^2 + \sum_{j=1}^{a} t_{\gamma,j}}{n - k - \gamma + a} & (a = 1, \ldots, \gamma) \end{cases}$

  $\pi_{\gamma,a} = \begin{cases} (0, t_{\gamma,1}] & (a = 0) \\ (t_{\gamma,a}, t_{\gamma,a+1}] & (a = 1, \ldots, \gamma - 1) \\ (t_{\gamma,\gamma}, \infty) & (a = \gamma) \end{cases}$

Let $A_\gamma$ be the set of integers defined by $A_\gamma = \{a \in \{0, 1, \ldots, \gamma\} \mid s_{\gamma,a}^2 \in \pi_{\gamma,a}\}$. From Yanagihara (2018), we can see that $\#(A_\gamma) = 1$. Hence, we write the only element of the set $A_\gamma$ as $a_\gamma$. Let $V_\gamma$ be the $\gamma \times \gamma$ diagonal matrix

  $V_\gamma = \mathrm{diag}(\nu_{\gamma,1}, \ldots, \nu_{\gamma,\gamma}), \quad \nu_{\gamma,j} = I(z_j^2 > s_{\gamma,a_\gamma}^2)\left(1 - \frac{s_{\gamma,a_\gamma}^2}{z_j^2}\right) \quad (j = 1, \ldots, \gamma).$

From Yanagihara (2012, 2018), the estimates of $\alpha$ and $\beta$ based on $d_1, \ldots, d_\gamma$, after optimizing the smoothing parameters by GCV, are given by

  $\hat{\alpha}_\gamma = Q_\gamma V_\gamma D_\gamma^{-1/2} z_\gamma, \quad \hat{\beta}_\gamma = (X'X)^{-1} X'(y - R\hat{\mu} - B\hat{\alpha}_\gamma),$
respectively, where $z_\gamma = (z_1, \ldots, z_\gamma)'$. Since $\gamma$ itself still has to be optimized, we choose $\gamma$ by minimizing GCV:

  $\hat{\gamma} = \arg\min_{\gamma \in \{1, \ldots, b\}} \frac{(y - R\hat{\mu})'(I_n - P_X - H_\gamma V_\gamma H_\gamma')(y - R\hat{\mu})}{\{1 - (k + \gamma)/n\}^2}.$

Therefore, the final estimates of $\alpha$ and $\beta$ optimized by GCV are

  $\hat{\alpha} = \hat{\alpha}_{\hat{\gamma}} = Q_{\hat{\gamma}} V_{\hat{\gamma}} D_{\hat{\gamma}}^{-1/2} z_{\hat{\gamma}}, \quad \hat{\beta} = \hat{\beta}_{\hat{\gamma}} = (X'X)^{-1} X'(y - R\hat{\mu} - B\hat{\alpha}_{\hat{\gamma}}). \quad (6)$

2.3 Estimation of $\mu$

We now explain the estimation method for $\mu$; the estimators of $\alpha$ and $\beta$ are given as $\hat{\alpha}$ and $\hat{\beta}$ in (6). The penalized residual sum of squares ($\mathrm{PRSS}_\lambda$) for the adaptive fused lasso is given by

  $\mathrm{PRSS}_\lambda(\mu \mid \hat{f}) = \|y - X\hat{\beta} - B\hat{\alpha} - R\mu\|^2 + \lambda \sum_{j=1}^{m} \sum_{l \in D_j} w_{jl}\,|\mu_j - \mu_l|, \quad (7)$

where $\lambda$ is the non-negative regularization parameter, i.e., $\lambda \ge 0$. Here, $D_j$ is a set indicating the spaces adjacent to the $j$-th space; for example, $D_1 = \{2, 3, 5\}$ expresses that the 2nd, 3rd, and 5th spaces are adjacent to the 1st space. The number of elements of the set $D_j$ is denoted by $m_j$, i.e., $m_j = \#(D_j)$. The weight of the adaptive fused lasso is $w_{jl} = 1/|\tilde{\mu}_j - \tilde{\mu}_l|$, where $\tilde{\mu}_j$ is the $j$-th element of $(M'M)^{-1}M'y$ and $M = (R, X, B)$. The spatial effect $\mu$ is estimated by minimizing $\mathrm{PRSS}_\lambda$ as

  $\hat{\mu}_\lambda = \arg\min_{\mu \in \mathbb{R}^m} \mathrm{PRSS}_\lambda(\mu \mid \hat{f}).$

This minimization problem can be solved by the coordinate descent algorithm of Friedman et al. (2007). Suppose that all of the values $\mu_l$ other than $\mu_\gamma$ $(\gamma = 1, \ldots, m)$ are given. Then equation (7) can be expressed as a function of $\mu_\gamma$, say $\phi_\gamma(\mu_\gamma)$, such that

  $\mathrm{PRSS}_\lambda(\mu \mid \hat{f}) = \phi_\gamma(\mu_\gamma) + (\text{a term not depending on } \mu_\gamma).$
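To make (7) concrete, the sketch below evaluates the adaptive weights and the PRSS for a toy configuration of three spaces. The helper names (adaptive_weights, prss), the 0-based indexing, and the toy adjacency are our own illustrative assumptions.

```python
import numpy as np

def adaptive_weights(mu_tilde, D):
    """Adaptive fused lasso weights w_jl = 1/|mu~_j - mu~_l| from an initial fit.

    D maps each space j to the list of its adjacent spaces (0-based here)."""
    return {(j, l): 1.0 / abs(mu_tilde[j] - mu_tilde[l])
            for j in D for l in D[j]}

def prss(mu, resid_fixed, R, lam, w, D):
    """PRSS_lambda(mu | f-hat) = ||y - X b - B a - R mu||^2
       + lam * sum_j sum_{l in D_j} w_jl |mu_j - mu_l|,
    where resid_fixed stands for y - X b - B a."""
    fit = resid_fixed - R @ mu
    penalty = sum(w[j, l] * abs(mu[j] - mu[l]) for j in D for l in D[j])
    return float(fit @ fit) + lam * penalty

# toy example: 3 spaces, space 0 adjacent to spaces 1 and 2, 2 samples per space
D = {0: [1, 2], 1: [0], 2: [0]}
mu_tilde = np.array([1.0, 2.0, 4.0])
w = adaptive_weights(mu_tilde, D)
R = np.repeat(np.eye(3), [2, 2, 2], axis=0)
resid = R @ mu_tilde                    # pretend residuals, so the fit term is 0
print(prss(mu_tilde, resid, R, lam=0.5, w=w, D=D))  # -> 2.0 (0.5 * penalty of 4)
```

Note that because adjacency is symmetric, each pair of neighbors contributes two penalty terms, one from each direction.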
Let $t_{\gamma,1} \le \cdots \le t_{\gamma,m_\gamma}$ be the order statistics of the spatial effects of the spaces adjacent to the $\gamma$-th space, i.e., the order statistics of the sequence $\{\hat{\mu}_l\}_{l \in D_\gamma}$, and define the regions

  $\pi_{\gamma,a} = \begin{cases} (-\infty, t_{\gamma,1}] & (a = 0) \\ (t_{\gamma,a}, t_{\gamma,a+1}] & (a = 1, \ldots, m_\gamma - 1) \\ (t_{\gamma,m_\gamma}, \infty) & (a = m_\gamma) \end{cases}$

By using these regions and the set $D_\gamma$, we define

  $D_{\gamma,a} = \Big\{ l \in D_\gamma \ \Big|\ \hat{\mu}_l \in \bigcup_{h=0}^{a} \pi_{\gamma,h} \Big\}, \quad \nu_{\gamma,a} = u_\gamma + \frac{\lambda}{n_\gamma}\Big( \sum_{l \in D_\gamma \setminus D_{\gamma,a}} w_{\gamma l} - \sum_{l \in D_{\gamma,a}} w_{\gamma l} \Big),$

where $u_j$ is given by

  $u_j = \frac{1}{n_j}\, 1_{n_j}'(y_j - X_j\hat{\beta} - B_j\hat{\alpha}). \quad (8)$

Then $\phi_\gamma$ is given by a piecewise quadratic function,

  $\phi_\gamma(\mu_\gamma) = \phi_{\gamma,a}(\mu_\gamma) = n_\gamma(\mu_\gamma^2 - 2\nu_{\gamma,a}\mu_\gamma), \quad (\mu_\gamma \in \pi_{\gamma,a},\ a = 0, 1, \ldots, m_\gamma). \quad (9)$

Let $A_{\gamma,1}$ and $A_{\gamma,2}$ be the sets of integers defined by

  $A_{\gamma,1} = \{ a \in \{0, 1, \ldots, m_\gamma\} \mid \nu_{\gamma,a} \in \pi_{\gamma,a} \},$
  $A_{\gamma,2} = \{ a \in \{0, 1, \ldots, m_\gamma - 1\} \mid \nu_{\gamma,a} > t_{\gamma,a+1} \text{ and } \nu_{\gamma,a+1} \le t_{\gamma,a+1} \}.$

Notice that $\#(A_{\gamma,1}) = 1$ and $\#(A_{\gamma,2}) = 1$ if $A_{\gamma,1} \neq \emptyset$ and $A_{\gamma,2} \neq \emptyset$, respectively. Hence, we write the only elements of $A_{\gamma,1}$ and $A_{\gamma,2}$ as $a_{\gamma,1}$ and $a_{\gamma,2}$, respectively. Moreover, we can see that $\#(A_{\gamma,1} \cup A_{\gamma,2}) = 1$ when $A_{\gamma,1} \cup A_{\gamma,2} \neq \emptyset$. By using the above equations, we obtain the minimizer of $\phi_\gamma(\mu_\gamma)$ in (9) as

  $\hat{\mu}_\gamma = \arg\min_{\mu_\gamma \in \mathbb{R}} \phi_\gamma(\mu_\gamma) = \begin{cases} \nu_{\gamma,a_{\gamma,1}} & (A_{\gamma,1} \neq \emptyset) \\ t_{\gamma,a_{\gamma,2}+1} & (A_{\gamma,1} = \emptyset,\ A_{\gamma,2} \neq \emptyset) \end{cases}$
The regularization parameter $\lambda$ is optimized by minimizing GCV. Let us define

  $\bar{\mu} = \frac{1}{n}\, 1_m' R'(y - X\hat{\beta} - B\hat{\alpha}).$

By using this and $u_j$ in (8), we define

  $\lambda_{\max} = \max_{j \in \{1, \ldots, m\}} \frac{|u_j - \bar{\mu}|}{m_j / n_j}.$

Moreover, we prepare a set of candidate values $\Lambda = \{\lambda_1, \ldots, \lambda_{100}\}$, where $\lambda_j = \lambda_{\max}(0.75)^{100-j}$. The optimization of $\lambda$ is performed by minimizing GCV as

  $\hat{\lambda} = \arg\min_{\lambda \in \Lambda} \frac{\|y - X\hat{\beta} - B\hat{\alpha} - R\hat{\mu}_\lambda\|^2}{(1 - \mathrm{df}_\lambda/n)^2},$

where $\mathrm{df}_\lambda$ is the number of non-zero elements of $\hat{\mu}_\lambda$.

3 Data

Over the study period, body condition data were obtained from common minke whales taken in Norwegian scientific and commercial whaling operations in the Northeast Atlantic during the months April to September. The data are basically the same as those used in Solvang et al. (2017) and Yamamura et al. (2016), but more recent samples have been added here. Immediately after death, the whales were taken onboard and hauled across the foredeck of the boat. Total body length was measured in a straight line from the tip of the upper jaw to the apex of the tail fluke notch; girth was measured right behind the flipper; and blubber thickness was measured at three sites (Fig. 1): dorsally behind the blowhole (BT1), dorsally behind the dorsal fin (BT2), and laterally just above the center of the flipper (BT3). Blubber measurements were made perpendicular from the skin surface to the muscle-connective tissue interface. Length and girth measurements were made to the nearest centimeter, while blubber measurements were made to the nearest millimeter. For all whales, the year, month, day, latitude, and longitude were recorded. After removing data from certain periods and data with missing values, the final number of individuals included
in the analysis is 11,505. We use $y_{ij}$ as BT1, $x_{1,ij}$ as sex, $x_{2,ij}$ as year, $x_{3,ij}$ as calendar day, $x_{4,ij}$ as body length, $\mu_j$ as the spatial effect, and $\varepsilon_{ij}$ as the error term of the $i$-th sample $(i = 1, \ldots, n_j)$ at the $j$-th space $(j = 1, \ldots, m)$.

Figure 1: Measurement sites.

The study region covers the geographic distribution of the five International Whaling Commission (IWC) management areas: ES (Svalbard-Bear Island area), EB (Eastern Barents Sea), EW (Norwegian Sea and coastal zones off North Norway, including the Lofoten area), EN (North Sea), and CM (Western Norwegian Sea-Jan Mayen area). We subdivide each area so that each subdivided area contains about 300 samples, and estimate $\hat{\mu}$ for the subdivided areas. If the $\hat{\mu}$ of a subdivided area is equal to that of a neighboring area, the two areas are united.

4 Estimation Result

In the fused lasso estimation result, the subdivided areas are narrowed down to 11 spaces, and the whales have the thickest BT1 in the northernmost space. Further detailed results are presented at the seminar.

References

[1] Friedman, J., Hastie, T., Höfling, H. and Tibshirani, R. (2007). Pathwise coordinate optimization. Annals of Applied Statistics, 1.

[2] Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models. Chapman & Hall Ltd., London, New York.

[3] Solvang, H. K., Yanagihara, H., Øien, N. and Haug, T. (2017). Temporal and geographical variation in body condition of common minke whales (Balaenoptera acutorostrata acutorostrata) in the northeast Atlantic. Polar Biology, 40.
[4] Yamamura, M., Yanagihara, H., Solvang, H. K., Øien, N. and Haug, T. (2016). Canonical correlation analysis for geographical and chronological responses. Procedia Computer Science, 96.

[5] Yanagihara, H. (2012). A non-iterative optimization method for smoothness in penalized spline regression. Statistics and Computing, 22.

[6] Yanagihara, H. (2018). Explicit solution to the minimization problem of generalized cross-validation criterion for selecting ridge parameters in generalized ridge regression. Hiroshima Mathematical Journal, 48.
A Significance Test for the Lasso Lockhart R, Taylor J, Tibshirani R, and Tibshirani R Ashley Petersen May 14, 2013 1 Last time Problem: Many clinical covariates which are important to a certain medical
More informationNow consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.
Weighting We have seen that if E(Y) = Xβ and V (Y) = σ 2 G, where G is known, the model can be rewritten as a linear model. This is known as generalized least squares or, if G is diagonal, with trace(g)
More informationAdditive Terms. Flexible Regression and Smoothing. Mikis Stasinopoulos 1 Bob Rigby 1
1 Flexible Regression and Smoothing Mikis Stasinopoulos 1 Bob Rigby 1 1 STORM, London Metropolitan University XXV SIMPOSIO INTERNACIONAL DE ESTADêSTICA, Armenia, Colombia, August 2015 2 Outline 1 Linear
More informationThe Algebra of the Kronecker Product. Consider the matrix equation Y = AXB where
21 : CHAPTER Seemingly-Unrelated Regressions The Algebra of the Kronecker Product Consider the matrix equation Y = AXB where Y =[y kl ]; k =1,,r,l =1,,s, (1) X =[x ij ]; i =1,,m,j =1,,n, A=[a ki ]; k =1,,r,i=1,,m,
More informationAnalysis Methods for Supersaturated Design: Some Comparisons
Journal of Data Science 1(2003), 249-260 Analysis Methods for Supersaturated Design: Some Comparisons Runze Li 1 and Dennis K. J. Lin 2 The Pennsylvania State University Abstract: Supersaturated designs
More informationarxiv: v3 [stat.ml] 14 Apr 2016
arxiv:1307.0048v3 [stat.ml] 14 Apr 2016 Simple one-pass algorithm for penalized linear regression with cross-validation on MapReduce Kun Yang April 15, 2016 Abstract In this paper, we propose a one-pass
More informationLinear Regression Linear Regression with Shrinkage
Linear Regression Linear Regression ith Shrinkage Introduction Regression means predicting a continuous (usually scalar) output y from a vector of continuous inputs (features) x. Example: Predicting vehicle
More informationUsing P-splines to smooth two-dimensional Poisson data
1 Using P-splines to smooth two-dimensional Poisson data Maria Durbán 1, Iain Currie 2, Paul Eilers 3 17th IWSM, July 2002. 1 Dept. Statistics and Econometrics, Universidad Carlos III de Madrid, Spain.
More informationSpatially Adaptive Smoothing Splines
Spatially Adaptive Smoothing Splines Paul Speckman University of Missouri-Columbia speckman@statmissouriedu September 11, 23 Banff 9/7/3 Ordinary Simple Spline Smoothing Observe y i = f(t i ) + ε i, =
More informationComputational Physics
Interpolation, Extrapolation & Polynomial Approximation Lectures based on course notes by Pablo Laguna and Kostas Kokkotas revamped by Deirdre Shoemaker Spring 2014 Introduction In many cases, a function
More informationA General Framework for Variable Selection in Linear Mixed Models with Applications to Genetic Studies with Structured Populations
A General Framework for Variable Selection in Linear Mixed Models with Applications to Genetic Studies with Structured Populations Joint work with Karim Oualkacha (UQÀM), Yi Yang (McGill), Celia Greenwood
More information20.1. Balanced One-Way Classification Cell means parametrization: ε 1. ε I. + ˆɛ 2 ij =
20. ONE-WAY ANALYSIS OF VARIANCE 1 20.1. Balanced One-Way Classification Cell means parametrization: Y ij = µ i + ε ij, i = 1,..., I; j = 1,..., J, ε ij N(0, σ 2 ), In matrix form, Y = Xβ + ε, or 1 Y J
More informationLocal Polynomial Modelling and Its Applications
Local Polynomial Modelling and Its Applications J. Fan Department of Statistics University of North Carolina Chapel Hill, USA and I. Gijbels Institute of Statistics Catholic University oflouvain Louvain-la-Neuve,
More informationDirect Learning: Linear Classification. Donglin Zeng, Department of Biostatistics, University of North Carolina
Direct Learning: Linear Classification Logistic regression models for classification problem We consider two class problem: Y {0, 1}. The Bayes rule for the classification is I(P(Y = 1 X = x) > 1/2) so
More informationLeast Angle Regression, Forward Stagewise and the Lasso
January 2005 Rob Tibshirani, Stanford 1 Least Angle Regression, Forward Stagewise and the Lasso Brad Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani Stanford University Annals of Statistics,
More informationGradient Descent. Ryan Tibshirani Convex Optimization /36-725
Gradient Descent Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: canonical convex programs Linear program (LP): takes the form min x subject to c T x Gx h Ax = b Quadratic program (QP): like
More informationCompressed Sensing in Cancer Biology? (A Work in Progress)
Compressed Sensing in Cancer Biology? (A Work in Progress) M. Vidyasagar FRS Cecil & Ida Green Chair The University of Texas at Dallas M.Vidyasagar@utdallas.edu www.utdallas.edu/ m.vidyasagar University
More informationSimultaneous Coefficient Penalization and Model Selection in Geographically Weighted Regression: The Geographically Weighted Lasso
Simultaneous Coefficient Penalization and Model Selection in Geographically Weighted Regression: The Geographically Weighted Lasso by David C. Wheeler Technical Report 07-08 October 2007 Department of
More informationLinear Regression Linear Regression with Shrinkage
Linear Regression Linear Regression ith Shrinkage Introduction Regression means predicting a continuous (usually scalar) output y from a vector of continuous inputs (features) x. Example: Predicting vehicle
More informationA note on the group lasso and a sparse group lasso
A note on the group lasso and a sparse group lasso arxiv:1001.0736v1 [math.st] 5 Jan 2010 Jerome Friedman Trevor Hastie and Robert Tibshirani January 5, 2010 Abstract We consider the group lasso penalty
More informationORIE 4741: Learning with Big Messy Data. Regularization
ORIE 4741: Learning with Big Messy Data Regularization Professor Udell Operations Research and Information Engineering Cornell October 26, 2017 1 / 24 Regularized empirical risk minimization choose model
More informationModel Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model
Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population
More informationPackage NonpModelCheck
Type Package Package NonpModelCheck April 27, 2017 Title Model Checking and Variable Selection in Nonparametric Regression Version 3.0 Date 2017-04-27 Author Adriano Zanin Zambom Maintainer Adriano Zanin
More informationConvex Optimization / Homework 1, due September 19
Convex Optimization 1-725/36-725 Homework 1, due September 19 Instructions: You must complete Problems 1 3 and either Problem 4 or Problem 5 (your choice between the two). When you submit the homework,
More informationSupport Vector Machines for Classification: A Statistical Portrait
Support Vector Machines for Classification: A Statistical Portrait Yoonkyung Lee Department of Statistics The Ohio State University May 27, 2011 The Spring Conference of Korean Statistical Society KAIST,
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics
More informationLinear Models for Regression CS534
Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict
More informationFunctional Mixed Effects Spectral Analysis
Joint with Robert Krafty and Martica Hall June 4, 2014 Outline Introduction Motivating example Brief review Functional mixed effects spectral analysis Estimation Procedure Application Remarks Introduction
More informationRecap from previous lecture
Recap from previous lecture Learning is using past experience to improve future performance. Different types of learning: supervised unsupervised reinforcement active online... For a machine, experience
More informationThreshold Autoregressions and NonLinear Autoregressions
Threshold Autoregressions and NonLinear Autoregressions Original Presentation: Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Threshold Regression 1 / 47 Threshold Models
More informationRobust estimators for additive models using backfitting
Robust estimators for additive models using backfitting Graciela Boente Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires and CONICET, Argentina Alejandra Martínez Facultad de Ciencias
More informationFunctional SVD for Big Data
Functional SVD for Big Data Pan Chao April 23, 2014 Pan Chao Functional SVD for Big Data April 23, 2014 1 / 24 Outline 1 One-Way Functional SVD a) Interpretation b) Robustness c) CV/GCV 2 Two-Way Problem
More information