Multiple linear regression (with M predictors)

Dependent data set: $y_i$, $i = 1, \dots, n$ (one predictand), and predictors $x_{i,k}$, $i = 1, \dots, n$, $k = 1, \dots, M$. The forecast equation is

$$\hat{y}_i = b_0 + \sum_{k=1}^{M} b_k x_{ik}$$

Use matrix notation:

$$Y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \qquad X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1M} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nM} \end{pmatrix}$$

The regression coefficients are $B = (b_0, b_1, \dots, b_M)^T$, the forecast equation is $\hat{Y} = XB$, and the forecast error vector is $E = Y - XB$.

Least squares approach: choose $B$ to minimize $SSE = E^T E$. The total sum of squares (variance) of $Y$ is $SST = \sum_i y_i'^2 = Y'^T Y'$, where $Y'$ is the vector of anomalies $y_i' = y_i - \bar{y}$. The residual (forecast errors) sum of squares is

$$SSE = E^T E = (Y - XB)^T (Y - XB) = Y^T Y - Y^T XB - B^T X^T Y + B^T X^T XB$$

(all terms are scalars).
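As an illustration, here is a minimal numpy sketch of this matrix setup; the sample size, predictors, and coefficient values are synthetic and purely illustrative, not from the text:

```python
import numpy as np

# Synthetic dependent sample: n cases, M predictors (illustrative values only).
rng = np.random.default_rng(0)
n, M = 50, 3
X = np.column_stack([np.ones(n),                  # leading column of 1s for b0
                     rng.normal(size=(n, M))])    # the M predictors x_ik
true_B = np.array([1.0, 0.5, -0.3, 2.0])          # made-up "true" coefficients
Y = X @ true_B + rng.normal(scale=0.5, size=n)    # predictand with noise

# Least-squares coefficients B = (X^T X)^{-1} X^T Y,
# solved without forming the explicit inverse.
B = np.linalg.solve(X.T @ X, X.T @ Y)

Y_hat = X @ B      # forecast  Yhat = X B
E = Y - Y_hat      # forecast error vector  E = Y - X B
SSE = E @ E        # residual sum of squares  E^T E
```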

The minimization is clearer in sum notation:

$$SSE = E^T E = \sum_{i=1}^{n} \left( y_i - b_0 - \sum_{l=1}^{M} b_l x_{il} \right)^2$$

The minimization gives the "normal equations":

$$\frac{\partial SSE}{\partial b_k} = -2 \sum_{i=1}^{n} \left( y_i - b_0 - \sum_{l=1}^{M} b_l x_{il} \right) x_{ik} = 0, \qquad k = 0, 1, \dots, M$$

i.e., $\sum_i x^T_{ki} y_i - \sum_i x^T_{ki} \sum_l b_l x_{il} = 0$, which is $X^T Y - X^T XB = 0$. So, in matrix form, the normal equations for the coefficients are

$$X^T XB = X^T Y \qquad \text{or} \qquad B = (X^T X)^{-1} X^T Y$$

Again, we separate the total sum of squares SST into the regression (explained) sum of squares (SSR) and the error sum of squares (SSE):

$$SST = Y'^T Y'$$

$$SSE = (Y - XB)^T (Y - XB) = Y^T Y - B^T X^T Y - Y^T XB + B^T X^T XB = Y^T Y - Y^T XB$$

since $X^T XB = X^T Y$. Since these are all scalars, they are the same as their transpose: $Y^T XB = (Y^T XB)^T = B^T X^T Y$, so that

$$SSE = Y^T Y - Y^T XB = (Y^T - B^T X^T) Y = E^T Y$$

(the scalar product of the error and the predictand). Note (prove!) that $E^T \hat{Y} = 0$: the error is orthogonal to the forecast.

$$SSR = SST - SSE = R^2\, SST$$

$R^2 = SSR / SST$ is the square of the generalized correlation coefficient, or explained variance.
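Continuing the sketch above (same variables), we can verify the normal equations and the orthogonality $E^T \hat{Y} = 0$ numerically, and compute $R^2$:

```python
# Normal equations X^T X B = X^T Y hold at the least-squares solution.
assert np.allclose(X.T @ X @ B, X.T @ Y)

# The error is orthogonal to the forecast (zero up to rounding).
assert abs(E @ Y_hat) < 1e-8

# Explained variance: anomalies Y' = Y - mean(Y), SST = Y'^T Y',
# then SSR = SST - SSE and R^2 = SSR / SST.
Yp = Y - Y.mean()
SST = Yp @ Yp
SSR = SST - SSE
R2 = SSR / SST
print(f"R^2 = {R2:.3f}")
```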

This is an estimate for the dependent sample used for training the regression, so it is overoptimistic. In an independent sample, the explained variance is smaller.

Note that the naïve forecast error variance

$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \frac{SSE}{n}$$

is seriously biased (overoptimistic). This is for two reasons: a) we are using $M$ predictors, and b) it is the estimate for the dependent (training) sample. The unbiased estimate of the forecast error variance for the dependent sample is

$$s^2 = \frac{\sum_i (y_i - \hat{y}_i)^2}{n - M - 1} = \frac{SSE}{n - M - 1}$$

This is because the SSE is estimated with $n - M - 1$ degrees of freedom (dof): SST is estimated with $n - 1$ dof (one dof was used to compute $\bar{y}$), and SSR uses $M$ dof for the regression coefficients $b_k$, so that SSE is left with $n - M - 1$ dof.

It is clear that if we use too many predictors, i.e., if $M \sim O(n)$, we can have overfitting. If $M = n - 1$, we can fit the dependent sample perfectly, so that the naïve dependent sum of squared errors is $SSE = 0$. However, the estimate of the dependent forecast error variance is in that case $s^2 = SSE / (n - M - 1) = 0/0$, indeterminate. So we should never overfit, and should keep the number of predictors such that $M \ll n$. Moreover, the regression coefficients (trained on the dependent sample) also have sampling errors: they are only estimates of the true regression coefficients.
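In code, the two estimates differ only in the divisor (reusing n, M, and SSE from the sketch above):

```python
# Naive forecast error variance: divides by n; biased (overoptimistic)
# on the dependent (training) sample.
naive_var = SSE / n

# Unbiased estimate for the dependent sample: SSE carries n - M - 1 dof.
s2 = SSE / (n - M - 1)
print(naive_var, s2)   # s2 is always the larger of the two
```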

Therefore, when we apply the regression formula to new predictors $x_0$ (an independent data set), the standard deviation of the error is given by

$$s \sqrt{1 + x_0^T (X^T X)^{-1} x_0}$$

Uncertainty increases with the number of predictors, and it increases due to errors in the sampling of $B$ when the coefficients are used with independent data. For $M = 1$ this correction for independent data is

$$1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \approx 1 + \frac{2}{n}$$

if $x_0$ is not far from $\bar{x}$. Therefore the forecast error for independent data is given by the $t$ distribution:

$$\frac{x_0^T B - E(x_0^T B)}{s \sqrt{1 + x_0^T (X^T X)^{-1} x_0}} \sim t_{n-M-1}$$

We can estimate a $100(1-a)\%$ confidence interval for predicting $Y(x_0)$ as

$$Y(x_0) = x_0^T B \pm s \sqrt{1 + x_0^T (X^T X)^{-1} x_0}\; t_{a/2,\, n-M-1}$$

If $a = 5\%$, we look up values of $t_{0.025,\, n-M-1}$.
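A sketch of this prediction interval for the synthetic example, using scipy's t quantile; the new predictor vector x0 is hypothetical:

```python
from scipy import stats

x0 = np.array([1.0, 0.2, -0.1, 1.5])   # hypothetical new case; leading 1 for b0
s = np.sqrt(SSE / (n - M - 1))

# Standard deviation of the independent-data forecast error:
# s * sqrt(1 + x0^T (X^T X)^{-1} x0)
se_pred = s * np.sqrt(1.0 + x0 @ np.linalg.solve(X.T @ X, x0))

# Two-sided t quantile with n - M - 1 dof, a = 5%.
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - M - 1)

y0 = x0 @ B
print(f"Y(x0) = {y0:.2f} +/- {t_crit * se_pred:.2f}")
```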

These prediction error estimates are only estimates. A better estimation of the error is to reserve part of the data for independent testing (cross-validation). For example, we use 90% of the data to obtain the regression coefficients, and test the forecast on the remaining 10%. This can be repeated 10 times for 10 different 10% subsets ("jackknifing"). This will give a good estimate of the forecast errors to be expected with an independent data set, as well as of the error variance of the regression coefficients.

Statistical packages provide information not only about the regression coefficients but also about their error estimates (using the formulas above) and about how significantly smaller the forecast error is. This is given in analysis of variance (ANOVA) and parameter error tables:

ANOVA

Source of variance | dof       | Sum of Squares (SS) | Mean Square (MS = SS/dof) | F ratio (test statistic)
Total              | n - 1     | SST                 | SST/(n - 1)               |
Regression         | M         | SSR                 | MSR = SSR/M               | MSR/MSE
Error (residual)   | n - M - 1 | SSE                 | MSE = SSE/(n - M - 1)     |

Regression Summary

Predictor | Coefficient | Standard error | t-ratio (n - M - 1 dof)
Constant  | $b_0$       | $s_{b_0}$      | $b_0 / s_{b_0}$
$x_k$     | $b_k$       | $s_{b_k}$      | $b_k / s_{b_k}$

If the F ratio is large compared to $F_{0.05,\, M,\, n-M-1}$, then we reject the null hypothesis that the coefficients $b_k$ are really zero for the population, with $b_k \neq 0$ only due to sampling.
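A sketch of the 10-fold cross-validation and the F test described above, continuing with the synthetic sample (scipy supplies the F quantile):

```python
# 10-fold cross-validation: fit on 90% of the cases, forecast the held-out
# 10%, and rotate through the 10 subsets ("jackknifing" in the sense above).
folds = np.array_split(rng.permutation(n), 10)
cv_sse = 0.0
for held_out in folds:
    train = np.setdiff1d(np.arange(n), held_out)
    B_cv = np.linalg.solve(X[train].T @ X[train], X[train].T @ Y[train])
    resid = Y[held_out] - X[held_out] @ B_cv
    cv_sse += resid @ resid
cv_var = cv_sse / n   # independent-sample estimate of the forecast error variance

# ANOVA F ratio: MSR / MSE compared against F_{0.05, M, n-M-1}.
MSR = SSR / M
MSE = SSE / (n - M - 1)
F = MSR / MSE
F_crit = stats.f.ppf(0.95, M, n - M - 1)
print(f"F = {F:.1f}, critical F = {F_crit:.2f}, reject H0: {F > F_crit}")
```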