A New Robust Partial Least Squares Regression Method

1 A New Robust Partial Least Squares Regression Method. 8th of September, 2005. Universidad Carlos III de Madrid, Departamento de Estadística.

2 Outline
- Motivation
- Robust PLS Methods
- PLS Algorithm
- Computing a Robust Variance-Covariance Matrix
- Monte Carlo Simulation
- A Service Quality Application
- Conclusions
- Future Work

3 Why robust methods? "...just which robust methods you use is not important, what is important is that you use some." J. W. Tukey (1979). Fundamental continuity concept: small changes in the data should result in only small changes in the estimate. "Change a few, so what?" J. W. Tukey (1977).

4 Outliers. Outliers are atypical observations that are well separated from the bulk of the data. The covariance matrix and mean vector are very sensitive to outliers in the data. Outliers in 1-D are relatively easy to detect, in 2-D harder, and in higher dimensions very hard.

8 Literature review
- Wakeling and Macfie (1992): first PLSR robustification; an algorithm with weighted regressions for PLS1 and PLS2.
- Griep et al. (1995): comparative study of LMS, RM, and IRLS (algorithms not resistant to leverage points).
- Gil and Romera (1998): robustification of the cross-covariance matrix through the Stahel-Donoho (SD) estimator.
- Hubert and Vanden Branden (2003): PLS robustification based on the SIMPLS algorithm.

12 Data and general considerations
- We analyze the case q = 1.
- The population loading vector is computed as in Helland (1988).
- The optimal number of PLS components is chosen through leave-one-out cross-validation.
- The one-step robustification comes from using a robust covariance matrix.

13 PLS algorithm. The joint covariance matrix of $[y, x]$ is
$$\Sigma_{[y,x]} = \begin{pmatrix} \sigma_y^2 & \delta_{y,x}^T \\ \delta_{y,x} & \Sigma_x \end{pmatrix}.$$
The population loading vectors are given by Helland (1988):
$$w_1 \propto \delta_{y,x}, \qquad w_{a+1} \propto \delta_{y,x} - \Sigma_x W_a (W_a^T \Sigma_x W_a)^{-1} W_a^T \delta_{y,x},$$
where $W_a = [w_1, w_2, \ldots, w_a]$, $1 \le a \le A$, and $A$ is the selected number of PLS components.

15 The population regression vector (non-robust) is then given by
$$\beta_A = W_A (W_A^T \Sigma_x W_A)^{-1} W_A^T \delta_{y,x}.$$
The proposed global robust algorithm comes from replacing $\Sigma_{[y,x]}$ with a robust covariance matrix $S_{[y,x]}$ obtained from the original data; in this case $w_1$ is a normalization of the robust $\delta_{y,x}$ and
$$w_{a+1} \propto \delta_{y,x} - S_x W_a (W_a^T S_x W_a)^{-1} W_a^T \delta_{y,x}.$$
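To make the plug-in structure concrete, here is a minimal numpy sketch of the recurrence above: it builds the loading vectors and the regression vector from whatever covariance estimate of $[y, x]$ is supplied, so the classical and robust fits differ only in the matrix passed in. The function name and interface are illustrative assumptions, not the authors' code (which was implemented in Matlab).

```python
import numpy as np

def pls_beta_from_cov(delta_yx, Sigma_x, A):
    """PLS1 loading vectors and regression vector beta_A from a
    (possibly robust) covariance estimate, via the Helland (1988)
    recurrence described above.

    delta_yx : (p,) cross-covariance between y and x
    Sigma_x  : (p, p) covariance matrix of x
    A        : number of PLS components
    """
    p = delta_yx.shape[0]
    W = np.zeros((p, A))
    W[:, 0] = delta_yx / np.linalg.norm(delta_yx)   # w_1 is proportional to delta_{y,x}
    for a in range(1, A):
        Wa = W[:, :a]
        # w_{a+1} prop. to delta - Sigma_x W_a (W_a' Sigma_x W_a)^{-1} W_a' delta
        M = Wa.T @ Sigma_x @ Wa
        w = delta_yx - Sigma_x @ Wa @ np.linalg.solve(M, Wa.T @ delta_yx)
        W[:, a] = w / np.linalg.norm(w)
    # beta_A = W_A (W_A' Sigma_x W_A)^{-1} W_A' delta
    M = W.T @ Sigma_x @ W
    return W @ np.linalg.solve(M, W.T @ delta_yx)

# classical plug-in: sample covariance of [y, X]
# S = np.cov(np.column_stack([y, X]), rowvar=False)
# beta = pls_beta_from_cov(S[0, 1:], S[1:, 1:], A=2)
```

For the robust fit one would instead pass the blocks of the robust matrix $S_{[y,x]}$ produced by the outlier-detection procedure described next.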

18 Outlier detection and computation of S. The algorithm of Peña and Prieto (2005) works in three steps after the data are centered and scaled, $\tilde x_i = S^{-1/2}(x_i - \bar x)$, $i = 1, \ldots, n$:
STEP 1: find the directions maximizing and minimizing the kurtosis, project the data onto them, and identify outliers.
STEP 2: generate random directions, stratify the sample, and identify outliers again.
STEP 3: check the suspicious observations using the Mahalanobis distance and repeat until no more outliers are found.

19 Step I: searching kurtosis directions. The direction that maximizes (minimizes) the coefficient of kurtosis is obtained as the solution of the optimization problem
$$d_j = \arg\max_d (\min_d) \; \frac{1}{n} \sum_{i=1}^{n} \left( d^T \tilde x_i^{(j)} \right)^4 \quad \text{s.t. } d^T d = 1.$$
The sample points are then projected onto a lower-dimensional subspace orthogonal to the directions $d_j$.

21 Why kurtosis directions? Because the presence of outliers shows up in the kurtosis values of the projections, so this moment coefficient can be used to identify them. Symmetric contamination, and a small proportion of outliers generated by asymmetric contamination, increase the kurtosis coefficient of the observed data. A large proportion of outliers generated by asymmetric contamination can make the kurtosis coefficient very small.
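As a rough illustration of Step I under these definitions, the sketch below searches for a kurtosis-maximizing (or minimizing) direction numerically, assuming the data are already centered and sphered. Peña and Prieto (2005) use a specialized optimization procedure; the general-purpose optimizer and the helper name here are stand-in assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def kurtosis_direction(X, maximize=True, seed=0):
    """Search for a unit-norm direction d that locally maximizes or
    minimizes the kurtosis coefficient of the projections d'x_i.
    X is assumed centered and sphered, x_i = S^(-1/2)(x_i - xbar)."""
    n, p = X.shape
    sign = -1.0 if maximize else 1.0

    def objective(d):
        d = d / np.linalg.norm(d)              # enforce d'd = 1
        z = X @ d
        return sign * np.mean(z**4) / np.mean(z**2)**2

    d0 = np.random.default_rng(seed).standard_normal(p)
    res = minimize(objective, d0, method="Nelder-Mead")
    return res.x / np.linalg.norm(res.x)
```

After each direction is found, the data would be projected onto the orthogonal complement of $d_j$ and the search repeated, as described above.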

24 Step II: searching random directions. Choose two observations at random from the sample and compute the direction $\hat d_l$ defined by them. The observations are then projected onto this direction to obtain the values $\hat z_i^l = \hat d_l^T \tilde x_i$. The sample is then partitioned into $K$ groups of size $n/K$, where $K$ is a prespecified number, based on the ordered values of the projections $\hat z_i^l$, so that group $k$, $1 \le k \le K$, contains those observations $i$ satisfying
$$\hat z^l_{((k-1)n/K + 1)} \le \hat z_i^l \le \hat z^l_{(kn/K)}.$$

25 From each group $k$, $1 \le k \le K$, a subsample of $p$ observations is chosen without replacement, the orthogonal direction is computed, and the corresponding projections are obtained. Why random directions? Because a procedure is needed that detects outliers when the proportion of contamination is between 0.2 and 0.3 and the contamination distribution has the same variance as the original distribution (the case where the kurtosis approach fails).
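A small sketch of the basic mechanics of this step: one random direction defined by two sample points, the projections onto it, and the partition into K groups by ordered projections. The helper name is hypothetical, and the within-group subsampling and orthogonal-direction refinement are not shown.

```python
import numpy as np

def random_direction_strata(X, K, rng):
    """One Step II iteration: direction from two random observations,
    projections z_i = d'x_i, and K strata of the ordered projections."""
    n, p = X.shape
    i, j = rng.choice(n, size=2, replace=False)
    d = X[i] - X[j]
    d = d / np.linalg.norm(d)            # direction defined by the pair
    z = X @ d
    order = np.argsort(z)
    groups = np.array_split(order, K)    # group k holds the k-th block
    return d, z, groups                  # of observations by ordered z

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
d, z, groups = random_direction_strata(X, K=5, rng=rng)
```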

26 Step III: deleting outliers and computing S. A Mahalanobis distance is computed for all observations labeled as outliers in the preceding steps. Let $U$ be the set of all observations not labeled as outliers:
$$m = \frac{1}{|U|} \sum_{i \in U} x_i, \qquad S = \frac{1}{|U| - 1} \sum_{i \in U} (x_i - m)(x_i - m)^T, \qquad v_i = (x_i - m)^T S^{-1} (x_i - m).$$
Those observations $i \notin U$ such that $v_i < \xi_p$, a chi-square-based cutoff, are considered not to be outliers and are included in $U$.
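A sketch of this re-admission test. The paper's exact cutoff $\xi_p$ is not reproduced here; a plain chi-square quantile is used as a simplifying assumption, and the function name is mine.

```python
import numpy as np
from scipy.stats import chi2

def recheck_outliers(X, outlier_mask, alpha=0.01):
    """Step III sketch: recompute m and S from the clean set U, then
    re-admit flagged observations whose squared Mahalanobis distance
    falls below a chi-square(p) quantile (stand-in for xi_p)."""
    n, p = X.shape
    U = ~outlier_mask
    m = X[U].mean(axis=0)
    S = np.cov(X[U], rowvar=False)
    diff = X - m
    v = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(S), diff)
    readmit = outlier_mask & (v < chi2.ppf(1 - alpha, df=p))
    return outlier_mask & ~readmit, m, S
```

In the full algorithm this check is iterated until no more observations change status.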

30 Simulation study
- Methods compared: PLS (Helland, 1988), PLS-SD (Gil and Romera, 1998), RSIMPLS (Hubert and Vanden Branden, 2003), and the proposed method (González, Peña and Romera).
- Four types of outliers were generated: bad leverage points, vertical outliers, orthogonal outliers, and very concentrated outliers.
- Three measures for comparing the methods.
- Simulations were run on a Pentium III 650 MHz personal computer with 128 MB of memory; code implemented in Matlab.

31 Simulation study: bilinear model.
$$T \sim N_A(0_A, \Sigma_t) \text{ with } A < p, \qquad X = T I_{A,p} + N_p(0_p, 0.1 I_p), \qquad Y = TQ + N_q(0_q, I_q),$$
where $(I_{A,p})_{i,j} = 1$ for $i = j$ and $0$ otherwise, and $Q$ is an $A \times q$ matrix with $Q_{i,j} = 1$ for all $i, j$. The simulation is done with the number of components known, $A = A_{opt}$, and a number $n_\epsilon$ of outliers generated at random. Settings: $q = 1$, $\Sigma_t = \text{diag}(4, 2)$, contamination levels of 10% and 30% of outliers.
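A sketch of this data-generating process, using $\Sigma_t = \text{diag}(4, 2)$ and the contamination mechanism of the bad-leverage case from the next slide; the values of n and p below are my own illustrative choices.

```python
import numpy as np

def simulate_bilinear(n=100, p=10, A=2, q=1, eps=0.10, seed=0):
    """Bilinear model from the slide with a fraction eps of bad
    leverage points; n and p are illustrative, Sigma_t = diag(4, 2)."""
    rng = np.random.default_rng(seed)
    Sigma_t = np.diag([4.0, 2.0])
    I_Ap = np.eye(A, p)                   # (I_{A,p})_ij = 1 iff i = j
    Q = np.ones((A, q))                   # Q_ij = 1 for all i, j

    T = rng.multivariate_normal(np.zeros(A), Sigma_t, size=n)
    X = T @ I_Ap + rng.normal(0.0, np.sqrt(0.1), size=(n, p))
    Y = T @ Q + rng.normal(0.0, 1.0, size=(n, q))

    # bad leverage points: score means shifted from 0 to 10
    n_eps = int(round(eps * n))
    T_eps = rng.multivariate_normal(10 * np.ones(A), Sigma_t, size=n_eps)
    X[:n_eps] = T_eps @ I_Ap + rng.normal(0.0, np.sqrt(0.1), size=(n_eps, p))
    return X, Y
```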

35 Simulation study: outliers.
- Bad leverage regression points: $T_\epsilon \sim N_A(10_A, \Sigma_t)$, $X_\epsilon = T_\epsilon I_{A,p} + N_p(0_p, 0.1 I_p)$.
- Vertical outliers: $Y_\epsilon = T Q_{A,q} + N_q(10_q, 0.1 I_q)$.
- Orthogonal outliers: $X_\epsilon = T_\epsilon I_{A,p} + N_p((0_A, 10_{p-A}), 0.1 I_p)$.
- Very concentrated outliers: $X_\epsilon = T_\epsilon I_{A,p} + N_p(10_p, 0.001 I_p)$.

38 MC simulation study: measures.
- The angle between the true slope and the estimate $\hat\beta^{(l)}$ of each method: $\text{ang}\,\beta_{1,A} = \text{ang}(\beta, \hat\beta_{[y_c,x_c],A})$.
- The mean squared error of the norms: $\text{MSE}_A(\hat\beta) = \frac{1}{m} \sum_{l=1}^{m} \| \hat\beta_A^{(l)} - \beta \|^2$.
- A test set of $n_t$ observations is generated from the original model and we compute $\text{RMSE}_A = \sqrt{\frac{1}{n_t} \sum_{i=1}^{n_t} (y_i - \hat y_{i,A})^2}$, where $\hat y_{i,A}$ is the predicted value of $y$ for observation $i$.
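These three measures translate directly into code; a short numpy sketch with function names of my own choosing:

```python
import numpy as np

def angle(beta, beta_hat):
    """Angle (radians) between the true and estimated slope vectors."""
    c = beta @ beta_hat / (np.linalg.norm(beta) * np.linalg.norm(beta_hat))
    return np.arccos(np.clip(c, -1.0, 1.0))

def mse_beta(beta, beta_hats):
    """(1/m) sum_l ||beta_hat^(l) - beta||^2 over m replications."""
    diffs = np.asarray(beta_hats) - beta
    return np.mean(np.sum(diffs**2, axis=1))

def test_rmse(y_test, X_test, beta_hat):
    """RMSE of predictions on a clean test set."""
    resid = y_test - X_test @ beta_hat
    return np.sqrt(np.mean(resid**2))
```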

39 MC simulation study: results under 10% contamination (figure).

40 MC simulation study: results under 30% contamination (figure).

43 RENFE data
- 17 independent variables representing measures of the service of RENFE (the public railroad system in Spain): station security, train cleanliness, noise level, ...
- One dependent variable corresponding to a measure of the customers' global satisfaction with the service quality.
- All variables were rated on a 0-9 scale.

45 The sample includes 1,499 questionnaires. The first principal component explains 93.4% of the variability. Computational times in seconds for the algorithms PLS, PLS-KurSD, and RSIMPLS (table).

48 Conclusions. The proposed method:
- behaves well with any type of contamination and is easy to compute;
- is resistant to outliers even with a large percentage of contamination;
- is very fast and is useful for large data sets.

51 Future work
- Extend the work to the case of several dependent variables.
- Develop a robust PLS for the case $p > n$.
- Analyze in which cases a robustification of the regression is necessary.

52 4th Symposium of PLS and Related Methods, Barcelona, 7th-9th of September, 2005.
