Multivariate Calibration with Robust Signal Regression
1 Multivariate Calibration with Robust Signal Regression
Bin Li and Brian Marx, Louisiana State University; Somsubhra Chakraborty, Indian Institute of Technology Kharagpur; David C. Weindorf, Texas Tech University. July 31.
2 Outline
- Motivating example
- Recap: Penalized Signal Regression (PSR)
- Generalized Huber loss and robust PSR
- Simulation and empirical results
- Related issues
3 A Soil Data Example
- Data: 675 soil samples collected from CA, NE, and TX in 2014, 225 samples per location.
- All soil samples were scanned with a portable VisNIR spectroradiometer over the spectral range 350–2500 nm.
- Ten physicochemical properties were measured: soil cation exchange capacity (CEC), total nitrogen, electrical conductivity (EC), total carbon, loss on ignition (LOI), soil organic matter (SOM), clay, sand, silt, and soil pH. LOI and SOM are highly correlated, so LOI was removed.
- Objective: use VisNIR spectra to predict soil properties.
4 Sample Spectra
[Figure: thirty sample spectra (first derivative) for the soil data, plotted against wavelength (nm).]
5 Penalized Signal Regression
PSR: P. Eilers and B.D. Marx (Statistical Science, 1996). PSR minimizes the objective
S(α) = ‖y − XBα‖² + λ‖Dα‖²,
where the difference matrix D penalizes differences of α. With U = XB, the closed-form solution for α is
α̂ = (U′U + λD′D)⁻¹ U′y.
- Response y: soil property indicator (m × 1 column vector, m = 675).
- Input X: VisNIR spectra, an m × p matrix, p = 214.
- B-spline basis matrix B: p × n, n = 100.
- Difference matrix D: (n − d) × n, where d is the order of the difference penalty (d = 0, 1, 2, 3).
- Coefficient vector α: n × 1.
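The closed-form PSR solution above can be sketched in a few lines. This is a minimal illustration, not the authors' code; for simplicity a random matrix stands in for the B-spline basis B, and the difference matrix D is built by repeated differencing of the identity.

```python
import numpy as np

def psr_fit(X, B, y, lam, d=2):
    """Penalized signal regression: minimize ||y - XB a||^2 + lam * ||D a||^2."""
    U = X @ B                                   # effective regressors, m x n
    n = B.shape[1]
    D = np.diff(np.eye(n), n=d, axis=0)         # (n - d) x n difference matrix
    alpha = np.linalg.solve(U.T @ U + lam * D.T @ D, U.T @ y)
    return alpha, B @ alpha                     # basis coefficients and smooth coefficient curve

# Toy call with a random stand-in for the B-spline basis (illustration only)
rng = np.random.default_rng(0)
m, p, n = 675, 214, 100
X = rng.normal(size=(m, p))
B = rng.normal(size=(p, n))                     # placeholder for the p x n B-spline basis
y = rng.normal(size=m)
alpha, beta = psr_fit(X, B, y, lam=1.0)
```

Increasing λ shrinks the dth-order differences of α toward zero, which is what makes the estimated coefficient curve smooth.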
6 Q-Q Plot of PSR Residuals
[Figure: normal quantile–quantile plots of the residuals (from PSR models) for nine soil property indicators: CEC, carbon, EC, LOI, nitrogen, SOM, clay, sand, silt.]
7 With vs. Without Outliers on PSR and Robust PSR
[Figure: PSR (top) and rPSR (bottom) coefficient curves vs. wavelength (nm), fitted with and without outliers, alongside predicted vs. measured values.]
8 Generalized Huber Loss
ρ_η(e) = e²                        if |e| < K,
ρ_η(e) = K² + 2ηK(|e| − K)         if |e| ≥ K,
with 0 ≤ η ≤ 1.
[Figure: ρ_η(e) plotted for η = 1, 0.5, and 0, shown on two vertical scales.]
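The loss above is simple to implement directly. A minimal sketch: quadratic inside the band [−K, K], then growth controlled by η; at η = 1 the loss keeps growing linearly beyond K, while at η = 0 it is capped at K², so gross outliers contribute a bounded amount.

```python
import numpy as np

def gen_huber(e, K, eta):
    """Generalized Huber loss: e^2 for |e| < K, K^2 + 2*eta*K*(|e| - K) otherwise."""
    e = np.asarray(e, dtype=float)
    a = np.abs(e)
    return np.where(a < K, e**2, K**2 + 2.0 * eta * K * (a - K))
```

Note the two branches agree at |e| = K (both equal K²), so the loss is continuous for every η in [0, 1].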
9 Robust Penalized Signal Regression (rPSR)
The rPSR estimator minimizes
Q(α) = Σ_{i=1}^m ρ_η(y_i − u_i′α) + λ α′D_d′D_d α,
which can be represented as a difference of two convex functions,
Q(α) = h₁(α) − h₂(α),
where, writing e_i = y_i − u_i′α,
h₁(α) = Σ_{i=1}^m e_i² + λ α′D′Dα,
h₂(α) = Σ_{i=1}^m I(|e_i| > K) [e_i² + 2ηK(K − |e_i|) − K²].
10 Difference Convex Programming
Difference convex (D.C.) programming: An and Tao (1997). Consider minimizing a nonconvex objective g(w) = g₁(w) − g₂(w), where both g₁(w) and g₂(w) are convex in w. D.C. programming constructs a sequence of subproblems and solves them iteratively. Given the solution w^(m−1) of the (m − 1)th subproblem, the mth subproblem solves
w^(m) = argmin_w { g₁(w) − [g₂(w^(m−1)) + ⟨w − w^(m−1), ∇g₂(w^(m−1))⟩] }
      = argmin_w { g₁(w) − ⟨w, ∇g₂(w^(m−1))⟩ },
where ∇g₂(w^(m−1)) is a subgradient of g₂(w) at w^(m−1) with respect to w.
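The D.C. iteration is just "linearize g₂ at the current point, then minimize the resulting convex surrogate." A minimal sketch on a hypothetical one-dimensional problem (not from the talk): g(w) = w² − |w|, with g₁(w) = w² and g₂(w) = |w|, where the linearized subproblem argmin_w {w² − s·w} has the closed-form solution s/2.

```python
import numpy as np

def dc_minimize(solve_linearized, subgrad_g2, w0, iters=100):
    """D.C. programming: repeatedly minimize g1(w) - <w, s>, s a subgradient of g2."""
    w = w0
    for _ in range(iters):
        s = subgrad_g2(w)          # linearize the concave part at the current iterate
        w = solve_linearized(s)    # solve the convex subproblem exactly
    return w

# Toy nonconvex problem: g(w) = w^2 - |w|; minima at w = +/- 1/2 with value -1/4
w_star = dc_minimize(solve_linearized=lambda s: s / 2.0,
                     subgrad_g2=np.sign,
                     w0=0.3)
```

Starting from w0 = 0.3, the subgradient of |w| is +1, so the first subproblem already lands on the stationary point w = 0.5 and the iteration stays there.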
11 Robust PSR Algorithm
Minimizing the rPSR objective reduces to minimizing a sequence of PSR problems with the adjusted response Y_A, whose ith entry is
y_i^A = y_i − I(|e_i| > K) [e_i − ηK sign(e_i)],  i = 1, …, m.
Only the observations with residuals greater than K in absolute value are adjusted. If K is greater than all the residuals |e_i|, then the rPSR and PSR solutions are the same.
12 Robust PSR Algorithm (cont.)
- The initial α̂ is the PSR estimate (with the same value of λ).
- The algorithm stops when max_{1 ≤ j ≤ n} |(α̂_j^cur − α̂_j^pre) / α̂_j^pre| falls below a preset tolerance.
- The cutoff value K is chosen by the 1.5 × IQR rule on the residuals in each iteration.
- Grid search for the optimal values of λ and η, based on CV performance.
- The rPSR algorithm usually converges within just a few iterations.
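The loop above can be sketched as repeated closed-form PSR solves on the adjusted response. This is an illustrative reading, not the authors' implementation: the tolerance value and the exact form of the 1.5 × IQR cutoff (here K = 1.5 × IQR of the current residuals) are assumptions, since the slides do not spell them out.

```python
import numpy as np

def rpsr_fit(U, y, lam, eta, d=2, max_iter=100, tol=1e-4):
    """Robust PSR via repeated PSR fits on the adjusted response Y_A (sketch)."""
    n = U.shape[1]
    D = np.diff(np.eye(n), n=d, axis=0)
    A = U.T @ U + lam * D.T @ D
    alpha = np.linalg.solve(A, U.T @ y)              # initial PSR estimate
    for _ in range(max_iter):
        e = y - U @ alpha
        q1, q3 = np.percentile(e, [25, 75])
        K = 1.5 * (q3 - q1)                          # assumed reading of the 1.5*IQR rule
        flag = np.abs(e) > K
        y_adj = y - flag * (e - eta * K * np.sign(e))  # adjust only the flagged residuals
        alpha_new = np.linalg.solve(A, U.T @ y_adj)
        rel = np.abs(alpha_new - alpha) / (np.abs(alpha) + 1e-12)
        alpha = alpha_new
        if rel.max() < tol:                          # tol is an assumed value
            break
    return alpha

# Hypothetical demonstration: clean signal plus a few gross outliers
rng = np.random.default_rng(1)
m, n = 300, 20
U = rng.normal(size=(m, n))
alpha0 = np.sin(np.linspace(0, 3, n))
y = U @ alpha0 + 0.1 * rng.normal(size=m)
y[:10] += 25.0                                       # contaminate 10 observations
alpha_rob = rpsr_fit(U, y, lam=0.1, eta=0.0)
```

Because A does not change across iterations, each pass costs only one linear solve with a new right-hand side, which is why the procedure converges cheaply.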
13 Simulation Studies
- Underlying model: Y_i = f(x_i) + ε_i, where f(x_i) is the PSR fitted value on CEC (fixed λ).
- Three error distributions for ε_i: normal; mixed normal, a 0.95/0.05 mixture of two normals (the contaminating component with much larger variance); and the slash distribution, e_i ∼ N(0, 1)/U(0, 1).
- Three levels of η are considered: 0, 0.5, and 1.
- Cross-validation to find the optimal value of λ.
- 50 random splits of the dataset: 75% training and 25% test sets.
- Comparative RMSE and MAE on test samples.
14 Simulation Results
[Figure: boxplots of comparative RMSE and MAE on test sets for PSR and rPSR (η = 1, 0.5, 0) under normal, mixed-normal, and slash errors.]
15 Simulation Results (cont.)
Average of test RMSEs and MAEs based on 50 replications.
[Table: average test RMSE and MAE by error distribution (normal, mixed, slash) for PSR and rPSR (η = 1, 0.5, 0); numeric entries not preserved in this transcription.]
16 Model Stability
- Three error distributions as above: normal, mixed normal, and slash.
- Three levels of η are considered: 0, 0.5, and 1.
- PSR and rPSR are fitted on 95% of the random sample with a fixed λ; 20 random splits of the dataset.
- Model stability of the coefficient estimates is evaluated with the L₂-distance standard deviation (L₂DSD) criterion,
L₂DSD = SD({‖β̂^(i) − β̄‖₂}_{i=1}^{20}),
where β̄ is the average β̂ over the 20 replications.
- Model stability of the predictions is evaluated with the SD of the predicted values on all 675 samples.
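The L₂DSD criterion is a one-liner once the replicated coefficient vectors are stacked into an array. A minimal sketch (the ddof convention for the SD is an assumption; NumPy's default population SD is used here):

```python
import numpy as np

def l2dsd(beta_hats):
    """L2-distance standard deviation across replications.

    beta_hats: (r, p) array, one estimated coefficient vector per replication.
    """
    beta_bar = beta_hats.mean(axis=0)                       # average beta-hat
    dists = np.linalg.norm(beta_hats - beta_bar, axis=1)    # ||beta^(i) - beta_bar||_2
    return dists.std()                                      # ddof=0 assumed
```

Smaller L₂DSD means the coefficient estimates vary less across random splits, i.e. a more stable model.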
17 Simulation Results (cont.)
Summary of L₂DSD of β̂ and SD of ŷ based on 20 replications.
[Table: L₂DSD of β̂ and SD of ŷ by error distribution (normal, mixed, slash) for PSR and rPSR (η = 1, 0.5, 0); numeric entries not preserved in this transcription.]
18 Soil Data Study
- 50 random splits: 75% training and 25% test sets.
- Three levels of η are considered: 0, 0.5, and 1.
- RMSE on test samples.
[Table: test RMSE by soil property (CEC, EC, nitrogen, carbon, SOM, clay, sand, silt, pH) for PSR and rPSR (η = 1, 0.5, 0); numeric entries not preserved in this transcription.]
19 Soil Data Study (cont.)
MAE on test samples.
[Table: test MAE by soil property (CEC, EC, nitrogen, carbon, SOM, clay, sand, silt, pH) for PSR and rPSR (η = 1, 0.5, 0); numeric entries not preserved in this transcription.]
20 Soil Data Study (cont.)
- Compare PSR and rPSR coefficients and identify the outliers.
- Data: all 675 samples used as the training set; response Y: carbon.
- PSR vs. rPSR model with η = 0.5.
- The leading two PCs explain 79.8% of the total variance.
- rPSR identifies 17 outliers (about 2.5% of the total samples).
[Figure: PC scores by location (NE, CA, TX); predicted vs. measured carbon; PSR and rPSR coefficient curves vs. wavelength (nm).]
21 Connection With Lee and Oh's Procedure (2007)
Lee and Oh (2007) explored robust penalized regression splines using the Huber loss. They proposed an iterative fitting procedure based on the pseudo-response ỹ:
ỹ_i = ŷ_i + ψ(e_i)/2,
where ψ(·) is the first-order derivative of the Huber loss ρ_H(·). We can show that the pseudo-response ỹ_i is equivalent to our adjusted response Y_A with η = 1:
ỹ_i = y_i − I(|e_i| > K)[e_i − ηK sign(e_i)].
Lee and Oh's approach is theoretically supported by Cox's result (1983), which requires ψ(·) to be second-order differentiable. The proposed rPSR procedure is a generalization of Lee and Oh's procedure, motivated from a different perspective.
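The equivalence is easy to verify numerically. In this sketch ψ is taken as the derivative of the generalized Huber loss at η = 1 as defined on the earlier slide (ψ(e) = 2e for |e| < K, 2K sign(e) otherwise), which is consistent with the factor of 1/2 in the pseudo-response; the data are hypothetical.

```python
import numpy as np

K = 1.0
rng = np.random.default_rng(2)
y = 3.0 * rng.normal(size=1000)                              # hypothetical responses
yhat = rng.normal(size=1000)                                 # hypothetical fitted values
e = y - yhat                                                 # residuals

psi = np.where(np.abs(e) < K, 2.0 * e, 2.0 * K * np.sign(e))  # derivative of rho_1
pseudo = yhat + psi / 2.0                                     # Lee & Oh pseudo-response
adjusted = y - (np.abs(e) > K) * (e - K * np.sign(e))         # adjusted response, eta = 1
```

For |e_i| < K both expressions reduce to y_i; for |e_i| > K both equal y_i − e_i + K sign(e_i), so the two updates coincide everywhere.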
22 References
- An, L. and Tao, P. (1997). Solving a class of linearly constrained indefinite quadratic problems by D.C. algorithms. Journal of Global Optimization, 11.
- Cox, D. (1983). Asymptotics for M-type smoothing splines. The Annals of Statistics, 11(2).
- Eilers, P. and Marx, B. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2).
- Lee, T. and Oh, H. (2007). Robust penalized regression spline fitting with application to additive mixed modeling. Computational Statistics, 22(1).
- Li, B. and Marx, B. (2018). Multivariate calibration with robust signal regression. Accepted in Statistical Modelling: An International Journal.
Agricultural Machinery Conference May, 006 On-the the-go Soil Sensing Technology Viacheslav I. Adamchuk iological Systems Engineering University of Nebraska - Lincoln Agenda Family of on-the-go soil sensors
More informationECS171: Machine Learning
ECS171: Machine Learning Lecture 3: Linear Models I (LFD 3.2, 3.3) Cho-Jui Hsieh UC Davis Jan 17, 2018 Linear Regression (LFD 3.2) Regression Classification: Customer record Yes/No Regression: predicting
More informationIs the Whole Greater Than the Sum of Its Parts?
Is the Whole Greater Than the Sum of Its Parts? Presenter: Liangyue Li Joint work with Liangyue Li (ASU) Hanghang Tong (ASU) Yong Wang (HKUST) Conglei Shi (IBM->Airbnb) Nan Cao (Tongji) Norbou Buchler
More informationProximal Newton Method. Ryan Tibshirani Convex Optimization /36-725
Proximal Newton Method Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: primal-dual interior-point method Given the problem min x subject to f(x) h i (x) 0, i = 1,... m Ax = b where f, h
More informationA Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models
A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models Jingyi Jessica Li Department of Statistics University of California, Los
More informationStatistical Methods as Optimization Problems
models ( 1 Statistical Methods as Optimization Problems Optimization problems maximization or imization arise in many areas of statistics. Statistical estimation and modeling both are usually special types
More informationGeneralized Elastic Net Regression
Abstract Generalized Elastic Net Regression Geoffroy MOURET Jean-Jules BRAULT Vahid PARTOVINIA This work presents a variation of the elastic net penalization method. We propose applying a combined l 1
More informationRelaxed linearized algorithms for faster X-ray CT image reconstruction
Relaxed linearized algorithms for faster X-ray CT image reconstruction Hung Nien and Jeffrey A. Fessler University of Michigan, Ann Arbor The 13th Fully 3D Meeting June 2, 2015 1/20 Statistical image reconstruction
More informationFast Regularization Paths via Coordinate Descent
August 2008 Trevor Hastie, Stanford Statistics 1 Fast Regularization Paths via Coordinate Descent Trevor Hastie Stanford University joint work with Jerry Friedman and Rob Tibshirani. August 2008 Trevor
More information
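The PSR estimator on slide 5, S(α) = ‖y − XBα‖² + λ‖Dα‖² with closed form α̂ = (UᵀU + λDᵀD)⁻¹Uᵀy and U = XB, can be sketched directly in NumPy. The sketch below uses simulated data in place of the soil spectra and a Gaussian-bump basis as a hypothetical stand-in for the B-spline basis B; the dimensions (m = 675 samples, p = 214 channels, n = 100 basis functions) follow the slides, but the basis width and λ values are illustrative assumptions, not the authors' settings.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, n, d = 675, 214, 100, 2  # samples, channels, basis size, penalty order

# Gaussian-bump basis as a stand-in for the B-spline basis B (p x n);
# any smooth local basis works for this illustration.
grid = np.linspace(0.0, 1.0, p)
centers = np.linspace(0.0, 1.0, n)
B = np.exp(-0.5 * ((grid[:, None] - centers[None, :]) / 0.05) ** 2)

# Simulated spectra X (m x p) and response y (m,) in place of the soil data.
X = rng.standard_normal((m, p))
y = rng.standard_normal(m)

U = X @ B                             # effective regressors, m x n
D = np.diff(np.eye(n), n=d, axis=0)   # d-th order difference matrix, (n-d) x n

def psr_fit(lam):
    """Closed-form PSR solution: (U'U + lam D'D)^{-1} U'y."""
    return np.linalg.solve(U.T @ U + lam * D.T @ D, U.T @ y)

alpha_small = psr_fit(0.1)     # light penalty: rough coefficient curve
alpha_large = psr_fit(1000.0)  # heavy penalty: smooth coefficient curve
```

Increasing λ shrinks the roughness measure ‖Dα̂‖, which is the mechanism the slides rely on to produce a smooth coefficient function over the 350–2500 nm range.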