Accounting for measurement uncertainties in industrial data analysis

Marco S. Reis*; Pedro M. Saraiva
GEPSI-PSE Group, Department of Chemical Engineering, University of Coimbra, Pólo II, Pinhal de Marrocos, Coimbra, Portugal
(* Author to whom correspondence should be addressed: marco@eq.uc.pt)

Abstract

This paper addresses the issue of integrating measurement uncertainty information in data analysis, namely in parametric and non-parametric regression. Several existing and new approaches are presented and critically assessed regarding their prediction and parameter estimation ability under different scenarios. The results show that methods which explicitly incorporate measurement uncertainty information are quite sound and promising, but do not always outperform other, simpler approaches.

1. Introduction

With the development of measurement instrumentation methods and metrology, the depth of knowledge regarding measurement quality, features and uncertainty has increased significantly (ISO, 1993). Even though many efforts have been made regarding the specification of uncertainty in data generation and measurement, the same is not very often true when one moves to the corresponding task of data analysis, where we should also use techniques that take into account not only the data but also their associated uncertainty. The work presented in this paper addresses this issue. Several methodologies with the potential for integrating uncertainty information in the analysis of industrial data are briefly presented and compared with their current counterparts, which ignore measurement uncertainties. Some new methodologies are also proposed, in order to overcome some of the existing shortcomings. We provide examples from the two extremes of the modelling paradigm: the non-parametric (nearest neighbour methodology) and the parametric (linear regression methodologies).

2. The General Modelling Problem

When we have at our disposal a reference data set with both inputs, X, and outputs, Y, from a given system, and the goal is to develop approaches that will allow us to make, in the future, predictive inferences about Y under given scenarios in the X domain, a wide spectrum of approaches can be used, with two major poles: non-parametric approaches make very mild assumptions about the X-to-Y relationship and basically use only the reference data as they are ("data-driven" techniques); parametric approaches assume a given well-defined underlying model of reality and adjust some of its parameters according to the data. Here, we pick two particular cases that represent each of these categories: the nearest neighbour method and the class of methods relying on linear models. They will be used to illustrate what options are available for explicitly accounting for known measurement uncertainties in addition to the X and Y values. For each technique that makes use of the [X, Y] data, we present its counterpart that exploits the availability of both the measurement values and the corresponding uncertainties, respectively [X, Y] and [unc(X), unc(Y)].

2.1 Non-parametric approaches

Nearest neighbour regression (NNR) consists of using only those k observations from the reference (or training) data set that are closest to the new X value whose Y we want to estimate, the inference for Y(x) being (Hastie et al., 2001):

$\hat{Y}(x) = \frac{1}{k} \sum_{x_i \in N_k(x)} y_i$   (1)

where $N_k(x)$ is the set of k nearest neighbours of x. However, when data measurement uncertainties are also available, the distance in the X space should reflect them as well: if x is at the same Euclidean distance from x_i and x_k, but unc(x_i) > unc(x_k), it is more likely for x_i to be further away from x than x_k. Therefore, we propose the following modification of the Euclidean distance for the counterpart, uncertainty-based approach (uNNR):

$D_W(x_i, x_k) = \sum_{j=1}^{N} \frac{(x_{ij} - x_{kj})^2}{\mathrm{unc}(x_{ij})^2 + \mathrm{unc}(x_{kj})^2}$   (2)

This should be complemented with a modified averaging methodology that also takes care of the information regarding the uncertainties in Y, leading to:

$\hat{Y}(x) = \left( \sum_{x_i \in N_{W,k}(x)} \frac{y_i}{\mathrm{unc}(y_i)^2} \right) \bigg/ \left( \sum_{x_i \in N_{W,k}(x)} \frac{1}{\mathrm{unc}(y_i)^2} \right)$   (3)

where $N_{W,k}(x)$ is the set of k nearest neighbours of x under the weighted distance (2).
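To make the two estimators concrete, the sketch below implements equations (1)-(3) directly. It is a minimal illustration rather than the authors' code: the function names, the choice of NumPy, and the one-query-point-at-a-time interface are our own assumptions.

```python
import numpy as np

def nnr_predict(x_new, X, y, k):
    """Plain NNR, eq. (1): unweighted average of the k nearest outputs."""
    d = np.sum((X - x_new) ** 2, axis=1)   # squared Euclidean distances
    idx = np.argsort(d)[:k]                # the k nearest training points
    return y[idx].mean()

def unnr_predict(x_new, unc_x_new, X, unc_X, y, unc_y, k):
    """uNNR sketch: uncertainty-weighted distance, eq. (2), followed by
    inverse-variance averaging of the neighbours' outputs, eq. (3)."""
    # eq. (2): coordinates measured with large uncertainty count less
    d_w = np.sum((X - x_new) ** 2 / (unc_X ** 2 + unc_x_new ** 2), axis=1)
    idx = np.argsort(d_w)[:k]
    # eq. (3): outputs with small unc(y) dominate the average
    w = 1.0 / unc_y[idx] ** 2
    return np.sum(w * y[idx]) / np.sum(w)
```

Here X is an n × p array of training inputs, unc_X the array of corresponding uncertainties, and x_new a single p-dimensional query point.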

2.2 Parametric approaches

We restrict ourselves here to linear regression models with single or multiple inputs and a single output (i.e. SISO/MISO models). Furthermore, due to space limitations, some well-known linear regression approaches, such as weighted least squares (WLS), will not be part of our comparison study. Classical EIV approaches, which simultaneously estimate parameters and true data, are also not considered, since our purpose is to compare (i) estimated parameter vectors and (ii) predictions over new data sets.

2.2.1 Ordinary Least Squares (OLS) and Multivariate Least Squares (MLS)

The well-known OLS estimate considers only homoscedastic errors in Y and is given by:

$\hat{B}(\mathrm{OLS}) = \arg\min_{B} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \quad \Rightarrow \quad \hat{B}(\mathrm{OLS}) = (X^T X)^{-1} X^T y$   (4)

where X is the n × (p+1) matrix with the n observations of the p inputs plus one column for the intercept, y is the n × 1 vector of outputs, and B is the (p+1) × 1 vector with the intercept and input coefficients. The full consideration of measurement uncertainties in both inputs and outputs, which can be heteroscedastic, is carried out with MLS (Martínez et al., 2002), and consists of numerically solving the following optimization problem:

$\hat{B}(\mathrm{MLS}) = \arg\min_{B} \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{s_{e_i}^2}$   (5)

where $s_{e_i}^2$ is the variance of the regression residual at observation i when uncertainties in both inputs and outputs are accounted for, calculated using error propagation theory. Although the statistical properties of the OLS estimator are well established, it is pertinent to analyze the less well known properties of MLS.
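One simple numerical reading of criterion (5) is sketched below: the residual variance at each observation is obtained by first-order error propagation through the current linear model, and the weighted problem is re-solved until the coefficients stabilize. This fixed-point scheme is our own illustration and is not necessarily the algorithm used by Martínez et al. (2002).

```python
import numpy as np

def mls_fit(X, y, unc_X, unc_y, n_iter=20):
    """Sketch of MLS, eq. (5). X is n x p (no intercept column);
    unc_X and unc_y hold the per-observation measurement uncertainties."""
    n, p = X.shape
    Xa = np.hstack([np.ones((n, 1)), X])        # prepend intercept column
    b = np.linalg.lstsq(Xa, y, rcond=None)[0]   # OLS starting point
    for _ in range(n_iter):
        # first-order error propagation for the residual of a linear model:
        # var(e_i) = unc(y_i)^2 + sum_j b_j^2 * unc(x_ij)^2
        s2 = unc_y ** 2 + (unc_X ** 2) @ (b[1:] ** 2)
        w = 1.0 / s2
        # weighted normal equations for the criterion in eq. (5)
        A = Xa.T @ (Xa * w[:, None])
        b = np.linalg.solve(A, Xa.T @ (w * y))
    return b
```

Because the weights depend on the coefficients themselves, the criterion is non-linear even for a linear model, which is why eq. (5) has to be solved numerically.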

2.2.2 Stepwise regression (SR) and best subset (BS) regression with OLS and MLS

A well-known problem of the OLS estimator is the increasing variability of the estimated parameters when the inputs are correlated. To overcome this problem, often only a subset of variables is found and used, by stepwise regression or best subset regression (Draper and Smith, 1998). These two procedures are based on OLS estimates applied to a subset of variables (we will refer to them as SR-OLS and BS-OLS). Furthermore, SR-OLS uses the ANOVA decomposition, along with a normality assumption, to perform the necessary significance tests. In order to develop the counterpart methodologies that account for measurement uncertainties, we replace the OLS steps with corresponding MLS steps. This does not raise any conceptual problem regarding an implementation of the so-called BS-MLS, but, to the best of our knowledge, there is no exact ANOVA decomposition for the MLS regression, and thus one needs to come up with an appropriate procedure for SR-MLS, by using the weighted sums of squares provided by the MLS algorithm (5) instead of the usual unweighted ones of SR-OLS.

2.2.3 Partial least squares (PLS) and principal components regression (PCR)

Another class of methodologies for overcoming collinearity and achieving a sort of dimensionality reduction involves choosing several orthogonal directions in the X space and regressing Y onto those directions, which are linear combinations of the X variables (called X-scores). In PCR, the linear combinations are those that explain most of the X variability; in PLS, they are those that, taking the X variability also into account, correlate most with the output (or, more generally, the Y-scores). Both PCR and PLS present some robustness to noise and, to some extent, have means to take it into account through adequate scaling. However, this is not enough for general error structures, like heteroscedastic noise. Based on previous work on maximum likelihood (ML) PCA, which incorporates X uncertainties in the estimation of the PCA model, Wentzell and Andrews (1997) developed ML-PCR*, using OLS in the regression step. Martínez et al. (2002) replaced OLS by MLS, incorporating in this way the Y uncertainties. This method will henceforth be called ML-PCR. So far, to the best of our knowledge, no analogous methodology has been presented for PLS. Here, we propose a methodology that preserves the original, successful algorithmic nature of PLS, but modifies the optimization problems solved at each step of PLS into counterparts that incorporate uncertainty, leading to what we will call uPLS. Thus, OLS regression steps are replaced by MLS steps, and least squares optimization problems by general weighted least squares problems, where the weights are given by the inverse of the square of the data measurement uncertainties. Furthermore, score uncertainties are calculated using error propagation formulas.

3. A Case Study Comparative Analysis

In this section we describe the main comparative results obtained by applying the different approaches mentioned before (with and without fully accounting for uncertainties) to a set of Monte Carlo simulated examples. To provide the basis of comparison, the following quantities were varied: the number of variables or number of latent dimensions; the correlation structure (COST) of the input variables (all variables with a fixed correlation among themselves), studied at both 0.1 and 0.9; and the heterogeneity level (HLEV) of the uncertainties for each variable, also studied at two levels (high/low; a high level means a highly heteroscedastic behaviour of the measurement noise standard deviation, or uncertainty, from observation to observation). The uncertainty variations occur randomly (uniform distribution) within a range given by 0.01 (HLEV = low) or 1 (HLEV = high) times the mean uncertainty for each variable. The mean uncertainty for each variable was kept constant at 0.1 times its standard deviation. For each scenario, reference data were generated (using a linear model with unit coefficients), and we use the mean relative error, MRE (or the mean absolute error, MAE), for parameter estimation performance assessment. Then, another data set is generated and predictions are made using the estimated coefficient vector, after which the root mean square error of prediction (RMSEP) is calculated. This process is repeated 100 times, and the mean values of MRE, MAE and RMSEP are presented. For the comparison between NNR and uNNR, our simulation procedure is simpler, since the goal is mainly to illustrate the advantage of incorporating uncertainty information. A sketch of this data-generation scheme is given after Section 3.3.

3.1 NNR and uNNR

Our simulation study considers a non-linear relationship between Y and X (a sine wave), where we: (i) generate 500 samples uniformly distributed in [0, π]; (ii) add heteroscedastic noise to X and Y (mean uncertainty of 0.1 for X and Y; HLEV = high); (iii) create 50 testing samples, for which the weighted root mean square error (RMSE_W) is calculated. This process was repeated 50 times for each value of the number of nearest neighbours, and the means are shown in Figure 1, where uNNR outperforms NNR consistently.

3.2 Ordinary Least Squares (OLS) and Multivariate Least Squares (MLS)

In this analysis the number of variables was varied at two levels: 1 and 6. It can be seen in Table 1 that the results for MRE differ widely, depending on whether or not the intercept term is considered in the calculations. For just one regressor variable, MLS does a better job of estimating the parameters correctly, but its performance deteriorates for 6 variables.

3.3 Stepwise regression (SR) and best subset (BS) regression with OLS and MLS

We considered 4 and 10 variables, only half of which have non-zero coefficients. For the subset methods, we used the a priori known optimal number of variables. The detailed results cannot be shown due to space restrictions, but in general SR-MLS does a better parameter estimation job than its counterpart, SR-OLS, in particular for COST = 0.1; the same happens for the BS methods, but the results are similar for COST = 0.9. SR-MLS and BS-MLS results are quite similar.
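As noted above, here is a minimal sketch of the data-generation scheme used in the Monte Carlo comparisons. Since the exact sample sizes and ranges are only partially recoverable, the defaults below (and the function name simulate) are illustrative assumptions.

```python
import numpy as np

def simulate(n=100, p=6, cost=0.1, hlev="low", mean_unc=0.1, rng=None):
    """Generate one reference data set: inputs with a fixed pairwise
    correlation (COST), a linear model with unit coefficients, and
    heteroscedastic noise whose standard deviation (the uncertainty)
    varies uniformly from observation to observation (HLEV)."""
    rng = np.random.default_rng() if rng is None else rng
    # covariance with unit variances and constant correlation COST
    cov = np.full((p, p), cost) + (1.0 - cost) * np.eye(p)
    X_true = rng.multivariate_normal(np.zeros(p), cov, size=n)
    y_true = X_true.sum(axis=1)            # unit coefficients
    # uncertainty range around its mean, set by the heterogeneity level
    half = (0.01 if hlev == "low" else 1.0) * mean_unc / 2
    unc_X = mean_unc + rng.uniform(-half, half, size=(n, p))
    unc_y = mean_unc + rng.uniform(-half, half, size=n)
    # measured data = true values + heteroscedastic measurement noise
    X = X_true + rng.normal(0.0, unc_X)
    y = y_true + rng.normal(0.0, unc_y)
    return X, y, unc_X, unc_y
```

Fitting an estimator on one simulated set and computing the RMSEP on a second, independently generated set, then repeating over many such pairs, reproduces the evaluation loop described at the start of this section.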

Figure 1. Mean RMSE_W for NNR (dashed line) and uNNR (solid line) with an increasing number of nearest neighbours considered.

Table 1. Mean values of MRE for the regression coefficient vector with OLS and MLS, for each combination of # variables, HLEV and COST (* values computed without considering the intercept term).

3.4 PLS, PCR, uPLS and ML-PCR

In the Monte Carlo study for this class of methods, the number of latent dimensions used was kept constant at two levels (1 and 2 latent dimensions), in order to allow for a fair comparison at similar levels of complexity. The mean MRE results are shown in Table 2. With one latent variable, the methods tend to perform better at high correlation levels than at low correlations, which could be expected: at high input correlations, the use of only one latent dimension does not limit the explanation of variability as strongly as in the case where the variables are almost uncorrelated. The fact that the PLS-based methods perform better for COST = 0.1 may indicate a more effective use of the specified latent dimension. When 2 latent dimensions are used, the pattern of results for PLS and PCR changes. One possible explanation is that they use the second dimension to better estimate the remaining variability in the uncorrelated case (COST = 0.1), but are mostly fitting noise in the high correlation case (COST = 0.9). This explanation is consistent with the prediction results obtained (Table 3), where we can see a similar pattern. The proposed uPLS does not show this strong pattern and presents a consistently interesting estimation performance at COST = 0.9. As for prediction, there is a certain dependency of uPLS on HLEV, with the best performance being achieved at the low level.
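For orientation, plain PCR with the number of latent dimensions fixed in advance, as in the comparison above, can be sketched in a few lines. uPLS and ML-PCR additionally require the uncertainty-weighted steps of Section 2.2.3 (ML-PCA and MLS regressions) and are not reproduced here; the function and variable names below are our own.

```python
import numpy as np

def pcr_fit_predict(X, y, X_new, n_components):
    """Plain PCR: project centred X onto its first principal directions,
    then regress the centred output on the scores by OLS."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # principal directions of X from the singular value decomposition
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                 # loadings, p x a
    T = Xc @ P                              # X-scores, n x a
    b = np.linalg.lstsq(T, y - y_mean, rcond=None)[0]
    return (X_new - x_mean) @ P @ b + y_mean
```

Keeping n_components fixed at 1 or 2, as in Tables 2 and 3, makes the comparison between methods one of equal model complexity rather than of tuned performance.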

Table 2. Mean values of MRE for the regression coefficient vector with PLS, uPLS, PCR and ML-PCR (values without considering the intercept term), for each combination of # latent dimensions, HLEV and COST.

Table 3. RMSEP results for PLS, uPLS, PCR and ML-PCR, for each combination of # latent dimensions, HLEV and COST.

4. Discussion and Conclusions

In any simulation study of this kind, the results are strictly linked to the simulation settings used, but they hopefully provide useful guidelines for adequately using the suggested methods, as well as pointing out future research directions. In this line of thought, the results shown allow us to conclude that methods which explicitly incorporate measurement uncertainty information are quite sound and promising, but do not always clearly outperform other approaches. For instance, MLS shows problems in the multivariate collinear case, and ML-PCR seems to require substantially more dimensions to achieve the same predictive performance as the other methods. In general, the predictive performance of the uncertainty-based methods can also be a matter of concern. This underlines the importance of developing methodologies that consistently perform better in multivariate noise environments, whether in estimation or in prediction. In this regard, we have proposed several methodologies (uNNR, SR-MLS, BS-MLS, uPLS) under a common general framework for measurement uncertainty incorporation, which seems to be particularly promising under certain operating scenarios.

Acknowledgements

The authors would like to acknowledge FCT for financial support through the project POCTI/3647/EQU/000.

References

Draper, N.R.; H. Smith, 1998, Applied Regression Analysis, 3rd ed., Wiley, NY.
Hastie, T.; R. Tibshirani; J. Friedman, 2001, The Elements of Statistical Learning, Springer.
ISO, 1993, Guide to the Expression of Uncertainty in Measurement, Geneva, Switzerland.
Martínez, À.; J. Riu; F.X. Rius, 2002, J. Chemometrics, 16.
Wentzell, P.D.; D.T. Andrews; B.R. Kowalski, 1997, Anal. Chem., 69.
