WEIGHTED LEAST SQUARES - used to give more emphasis to selected points in the analysis.

What are weighted least squares? Recall, in OLS we minimize

    Q = Σ e_i² = Σ (Y_i − b0 − b1 X_i)²    or    Q = (Y − Xb)′(Y − Xb)

In weighted least squares, we minimize

    Q = Σ w_i e_i² = Σ w_i (Y_i − b0 − b1 X_i)²

The normal equations become

    b0 Σw_i + b1 Σw_i X_i = Σw_i Y_i
    b0 Σw_i X_i + b1 Σw_i X_i² = Σw_i X_i Y_i

For the intermediate calculations we get (where w_i is the weight)

    Σw_i, Σw_i X_i, Σw_i Y_i, Σw_i X_i Y_i, Σw_i X_i², Σw_i Y_i²

Calculation of the corrected sum of squares is

    Σw_i y_i² = Σw_i (Y_i − Ȳw)² = Σw_i Y_i² − (Σw_i Y_i)² / Σw_i

and the slope is

    b1 = Σw_i x_i y_i / Σw_i x_i² = [Σw_i X_i Y_i − (Σw_i X_i)(Σw_i Y_i)/Σw_i] / [Σw_i X_i² − (Σw_i X_i)²/Σw_i]

The intercept is calculated with X̄ and Ȳ as usual, but these means are weighted:

    Ȳw = Σw_i Y_i / Σw_i    and    X̄w = Σw_i X_i / Σw_i

If the weights are all 1, then all results are the same as OLS. The variance is the usual σ²_Y = σ²_ε = σ²_e = σ².

WEIGHTED LEAST SQUARES handout

In OLS we minimize Q = Σ e_i² = Σ (Y_i − b0 − b1 X_i)², or Q = (Y − Xb)′(Y − Xb). In weighted least squares, we minimize Q = Σ w_i e_i² = Σ w_i (Y_i − b0 − b1 X_i)². If the weights are all 1, then all results are the same as OLS.

The normal equations become

    b0 Σw_i + b1 Σw_i X_i = Σw_i Y_i
    b0 Σw_i X_i + b1 Σw_i X_i² = Σw_i X_i Y_i

The intermediate calculations, corrected sums of squares, slope, and intercept are as given above, with every sum weighted by w_i.

For multiple regression,

    Q = Σ w_i e_i² = Σ w_i (Y_i − b0 − b1 X_1i − b2 X_2i − ... − b_{p−1} X_{p−1,i})²

the normal equations are (X′WX)B = X′WY, the regression coefficients are B = (X′WX)⁻¹ X′WY, and the variance-covariance matrix is s²_B = σ² (X′WX)⁻¹, where σ² is estimated by MSE based on weighted deviations:

    MSE_w = Σ w_i (Y_i − Ŷ_i)² / (n − p)
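As a numerical check on the sum-notation formulas, the slope and intercept computed from the weighted sums satisfy the weighted normal equations. A minimal sketch (the data and weights are invented for illustration, not from the handout):

```python
import numpy as np

# Invented data and weights for illustration
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([1.8, 4.2, 5.9, 8.3, 9.7])
w = np.array([1.0, 2.0, 1.0, 2.0, 1.0])

# Weighted means
Xbar = (w * X).sum() / w.sum()
Ybar = (w * Y).sum() / w.sum()

# Corrected (weighted) sums of squares and cross-products
Sxy = (w * X * Y).sum() - (w * X).sum() * (w * Y).sum() / w.sum()
Sxx = (w * X**2).sum() - (w * X).sum() ** 2 / w.sum()

b1 = Sxy / Sxx                # slope
b0 = Ybar - b1 * Xbar         # intercept from the weighted means
```

Plugging b0 and b1 back into the two weighted normal equations reproduces Σw_i Y_i and Σw_i X_i Y_i exactly.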

In a multiple regression, we minimize

    Q = Σ w_i e_i² = Σ w_i (Y_i − b0 − b1 X_1i − b2 X_2i − ... − b_{p−1} X_{p−1,i})²

The matrix containing the weights is the n×n diagonal matrix

    W = diag(w_1, w_2, ..., w_n)

The matrix equations for the regression solutions are:

    the normal equations:              (X′WX)B = X′WY
    the regression coefficients:       B = (X′WX)⁻¹ X′WY
    the variance-covariance matrix:    s²_B = σ² (X′WX)⁻¹

where σ² is estimated by MSE_w, based on weighted deviations:

    MSE_w = Σ w_i (Y_i − Ŷ_i)² / (n − p) = Σ w_i e_i² / (n − p)
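The matrix formulas can be computed directly. A minimal sketch (the data, weights, and variable names are invented for illustration, not from the handout):

```python
import numpy as np

# Invented small data set: intercept-plus-slope design, five observations
X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
Y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])
w = np.array([1.0, 1.0, 2.0, 2.0, 4.0])

W = np.diag(w)                                  # n x n diagonal weight matrix

# Normal equations (X'WX)B = X'WY  ->  B = (X'WX)^{-1} X'WY
B = np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)

# Weighted MSE and variance-covariance matrix of B
resid = Y - X @ B
n, p = X.shape
MSE_w = (w * resid**2).sum() / (n - p)          # sum w_i (Y_i - Yhat_i)^2 / (n - p)
cov_B = MSE_w * np.linalg.inv(X.T @ W @ X)      # estimates sigma^2 (X'WX)^{-1}
```

Note that the weighted residuals are orthogonal to the columns of X (Σw_i e_i = 0 and Σw_i e_i X_i = 0), just as the unweighted residuals are in OLS.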

Estimating weights

For ANOVA this may be relatively easy if enough observations are available in each group; variances can be estimated directly. However, in regression we usually have a "smooth" function of changing residual variance with changing X values. To estimate these weights we note that σ_i² can be estimated by the squared residuals e_i². We can first estimate the residuals with OLS, and then regress the residuals (squared or unsquared) on a suitable variable (usually one of the X variables or Ŷ). The procedure can be iterated. That is, if the new regression coefficients differ substantially from the old, the residuals can be estimated again from the weighted regression, the weights can be calculated again, and the regression fit again with the new weights.
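One pass of this procedure might be sketched as follows, assuming (for illustration only) that the residual variance grows with X²; the data are simulated and all names are my own:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)   # noise sd grows with x (simulated)
X = np.column_stack([np.ones(n), x])

# Step 1: OLS fit, obtain residuals
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b_ols

# Step 2: regress the squared residuals on x^2 to model the changing variance
Z = np.column_stack([np.ones(n), x**2])
c = np.linalg.lstsq(Z, e**2, rcond=None)[0]
var_hat = np.clip(Z @ c, 1e-4, None)           # fitted sigma_i^2, kept positive

# Step 3: weight by the inverse of the estimated variance and refit
w = 1.0 / var_hat
b_wls = np.linalg.solve(X.T @ np.diag(w) @ X, X.T @ np.diag(w) @ y)
```

Iterating would mean recomputing e from the weighted fit, re-estimating var_hat, and refitting until the coefficients stabilize.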

Weighting may be used for various cases:

1) One common use is to adjust for non-homogeneous variance. One common approach is to assume that the variance is non-homogeneous because it is a function of X. Then we must determine what that function is; for simple linear regression the function is commonly

    σ_i² = σ² X_i,    σ_i² = σ² X_i²,    or    σ_i² = σ² √X_i

To adjust for this, we would weight by the inverse of the function.

2) Another common case is where the function is not known, but the data can be subset into smaller groups. These may be separate samples, or they may be cells from an analysis of variance (not regression, but weighting also works for ANOVA). Then we determine a value of s_i² for each cell i, and the weighting function is the reciprocal (inverse) of the variance for each subgroup, w_i = 1/s_i².

3) When the values analyzed are means, the variance will differ between "observations" if the sample sizes are not equal. Since we may still assume that the variance of the original observations is homogeneous, the variance of each point is a simple function of n_i (the variance of a mean is σ²/n_i), and we would weight by the inverse of the coefficient, or simply w_i = n_i. If we could not assume that the variances were equal, then for each mean we would have the variance s_i²/n_i, and the weight would be w_i = n_i/s_i².

NOTE: the actual value of a weight is not important, only the proportional value between weights, so all weights could be multiplied or divided by some constant value. Some people recommend weighting by n_i/s_i² without concern for the homogeneity of variance. Since all s_i² are equal to σ² when the variance is homogeneous, this is the same as taking all weights w_i = n_i and multiplying by a constant 1/σ².
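Case 3 can be verified numerically: fitting weighted least squares to group means with weights w_i = n_i reproduces the coefficients from OLS on the raw observations. A sketch with invented data:

```python
import numpy as np

# Invented raw data in three groups of unequal size, one X level per group
groups = [np.array([2.0, 2.4, 1.9, 2.2]),
          np.array([3.1, 2.9]),
          np.array([4.0, 4.3, 3.8])]
x_levels = np.array([1.0, 2.0, 3.0])

# OLS on the raw observations
x_raw = np.concatenate([np.full(len(g), xl) for g, xl in zip(groups, x_levels)])
y_raw = np.concatenate(groups)
Xr = np.column_stack([np.ones(len(y_raw)), x_raw])
b_raw = np.linalg.lstsq(Xr, y_raw, rcond=None)[0]

# WLS on the group means, weighting each mean by its sample size n_i
n_i = np.array([len(g) for g in groups], dtype=float)
ybar = np.array([g.mean() for g in groups])
Xm = np.column_stack([np.ones(3), x_levels])
W = np.diag(n_i)
b_wls = np.linalg.solve(Xm.T @ W @ Xm, Xm.T @ W @ ybar)
```

The two coefficient vectors agree because Xm′W Xm and Xm′W ȳ equal the corresponding raw-data sums when X is constant within each group.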

8. ROBUST REGRESSION

There are many regressions developed that call themselves "robust". Some are based on the median deviation, others on trimmed analyses. The one we will discuss uses weighted least squares, and is called ITERATIVELY REWEIGHTED LEAST SQUARES.

a) Ordinary least squares is considered "robust", but it is sensitive to points which deviate greatly from the line. This occurs
(1) with data which is not normal or symmetrical (data with a large tail)
(2) when there are outliers (contamination)

b) ROBUST REGRESSION is basically weighted least squares where the weights are an inverse function of the residuals.
(1) Points within a range "close" to the regression line get a weight of 1; if all weights are 1, then the result is ordinary least squares regression.
(2) Points outside the range get a weight less than 1.
(3) Ordinary least squares minimizes Σ(Y_i − X_i β)² = Σ(Y_i − Ŷ_i)² = Σ(residual)². Robust regression minimizes Σ ρ[(Y_i − X_i β)/σ], where ρ is some function of the standardized residual and should be chosen to minimize the influence of large residuals. The ρ chosen determines a weight, and if the proper function is chosen, the calculations may be done as a weighted least squares, minimizing Σ w_i (Y_i − X_i β)².

(4) The function chosen by Huber was chosen such that if the residual is small (|e_i| ≤ 1.345 σ) the point is not downweighted (weight = 1). First we define a robust estimate of scale called the median absolute deviation (MAD):

    MAD = median{|e_i − median(e_i)|} / 0.6745

where the constant 0.6745 will give an unbiased estimate of σ for independent observations from a normal distribution. MAD is an alternative estimate of √MSE. Then we define a scaled residual u_i = e_i / MAD, and the weight is

    w_i = 1            if |u_i| ≤ 1.345
    w_i = 1.345 / |u_i|    for |u_i| > 1.345

where 1.345 is called the tuning constant, and is chosen to make the technique robust and 95% efficient. Efficiency refers to the ratio of variances from this model relative to normal regression models.

(5) The solution is iterative:
(a) do the regression and obtain the e_i values
(b) compute the weights
(c) run the weighted least squares analysis using the weights
(d) iteratively recalculate the weights and rerun the weighted least squares analysis UNTIL
(1) there is no change in the regression coefficients to some predetermined level of precision
(2) note that the solution may not be the one with the minimum sum of squared deviations
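The iterative procedure in steps (a) through (d) might be sketched as below; the function name huber_irls and the simulated data are my own, not from the handout:

```python
import numpy as np

def huber_irls(X, y, c=1.345, tol=1e-8, max_iter=50):
    """Iteratively reweighted least squares with Huber weights (a sketch)."""
    w = np.ones(len(y))                     # start from OLS (all weights 1)
    b = np.zeros(X.shape[1])
    for _ in range(max_iter):
        W = np.diag(w)
        b_new = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # weighted LS step
        e = y - X @ b_new
        # robust scale: MAD = median|e - median(e)| / 0.6745
        mad = np.median(np.abs(e - np.median(e))) / 0.6745
        u = e / mad                          # scaled residuals
        # weight 1 inside the tuning constant, c/|u| outside
        w = np.minimum(1.0, c / np.maximum(np.abs(u), 1e-12))
        if np.max(np.abs(b_new - b)) < tol:
            b = b_new
            break
        b = b_new
    return b, w

# Simulated line y = 1 + 2x with one gross outlier (invented data)
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 30)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, 30)
y[5] += 25.0                                 # contaminate one point
X = np.column_stack([np.ones(30), x])
b_rob, w_rob = huber_irls(X, y)
```

With the contaminated point heavily downweighted, the fitted slope stays near 2, whereas an OLS fit on the same data would be pulled toward the outlier.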

(6) PROBLEMS
(a) This is not ordinary least squares, but the hypotheses are likely to be tested as if it were; the distributional properties of the estimator are not well documented.
(b) Schreuder, Boos and Hafley (Forest Resources Inventories Workshop Proc., Colorado State U., 1979) suggest running both regressions:
if "similar", use ordinary least squares;
if "different", find out why; if the difference is because of an outlier, the robust regression is "probably" better.
They further suggest using robust regression as "a tool for analysis, not a cure-all for bad data". The technique is useful for detecting outliers.