Stat 74 Estimation for the General Linear Model, Prof. Goel

Broad Outline

General Linear Model (GLM):

Training Sample Model: Given $n$ observations $[(Y_i, x_i),\; x_i = (x_{i1}, \ldots, x_{ir})]$, $i = 1, \ldots, n$, the sample model can be expressed as
$$Y_i = \mu(x_{i1}, x_{i2}, \ldots, x_{ir}) + \varepsilon_i, \quad i = 1, \ldots, n, \qquad (1)$$
where $\varepsilon_i$, $i = 1, \ldots, n$, denote the noise (random errors), each with mean zero and variance $\sigma^2$.

From now on, we denote the features $f_j$, $j = 1, \ldots, p$, themselves as coded predictor variables $x_1, x_2, \ldots, x_p$. In the simplest setting, the random errors are assumed to be uncorrelated with equal variance. Thus the sample GLM can be expressed as
$$Y_i = \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i, \qquad E[\varepsilon_i] = 0, \quad \mathrm{Var}[\varepsilon_i] = \sigma^2, \quad \mathrm{Cov}(\varepsilon_i, \varepsilon_k) = 0, \; i \neq k, \qquad (2)$$
$$E[Y_i] = \mu_i = \sum_{j=1}^{p} \beta_j x_{ij}. \qquad (3)$$

Vector/matrix notation for the response, predictor variables, error terms and the unknown coefficients:
$$Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}, \quad \text{and} \quad X = \begin{pmatrix} x_{11} & \cdots & x_{1p} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{np} \end{pmatrix}.$$
Also, let $x_{.j}$ denote the $j$-th column of $X$, i.e., $X = [x_{.1}, x_{.2}, \ldots, x_{.p}]$.

Given the response vector $Y$ and the design matrix $X$, the sample GLM can be written as
$$Y = X\beta + \varepsilon, \qquad E[\varepsilon] = 0, \quad \mathrm{Cov}(\varepsilon) = ((\mathrm{cov}(\varepsilon_i, \varepsilon_j))) = ((E(\varepsilon_i \varepsilon_j))) = E[\varepsilon\varepsilon'] = \sigma^2 I. \qquad (4)$$
Thus,
$$E[Y] = \mu = X\beta, \qquad \mathrm{Cov}(Y) = E[(Y - X\beta)(Y - X\beta)'] = E[\varepsilon\varepsilon'] = \sigma^2 I. \qquad (5)$$
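As a small numerical companion to (4)-(5), here is a minimal NumPy sketch that simulates one training sample from the model; the sizes $n$ and $p$, the coefficient vector, and the noise level are illustrative choices, not values from these notes.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, sigma = 50, 3, 0.5                  # illustrative sizes, not from the notes
    X = rng.normal(size=(n, p))               # design matrix (n x p)
    beta = np.array([1.0, -2.0, 0.5])         # illustrative coefficient vector
    eps = rng.normal(scale=sigma, size=n)     # errors: mean 0, variance sigma^2, uncorrelated
    Y = X @ beta + eps                        # sample GLM (4): Y = X beta + eps
    # E[Y] = X beta and Cov(Y) = sigma^2 I, as in (5).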

Ordinary Least Squares (OLS): For an estimate $\tilde{\beta}$ of $\beta$, the corresponding Residual Sum of Squares is
$$l(\tilde{\beta}) = \sum_{i=1}^{n} \Big( Y_i - \sum_{j=1}^{p} \tilde{\beta}_j x_{ij} \Big)^2 = \sum_{i=1}^{n} e_i^2(\tilde{\beta}) = e(\tilde{\beta})'e(\tilde{\beta}).$$

Problem: Find an estimated coefficient vector $\hat{\beta} = \arg\min_{\beta \in R^p} l(\beta)$. In matrix notation,
$$\min_{\beta \in R^p} e(\beta)'e(\beta) = \min_{\beta \in R^p} (Y - X\beta)'(Y - X\beta) = e(\hat{\beta})'e(\hat{\beta}). \qquad (6)$$

Expand $S(\beta) = (Y - X\beta)'(Y - X\beta) = Y'Y - Y'X\beta - \beta'X'Y + \beta'X'X\beta$. On setting the partial derivatives of $S(\beta)$ with respect to $\beta$ equal to zero, we get the Normal Equations
$$X'X\beta = X'Y. \qquad (7)$$
Any solution of (7) is an optimal solution to the OLS problem.

Full-rank case: Examples - Simple Linear Regression, Multiple Regression with linearly independent features. If the matrix $X'X$ is non-singular (the design matrix $X$ is of full column rank $p$), the inverse of $X'X$ exists, and the unique optimal least squares solution is
$$\hat{\beta} = (X'X)^{-1} X'Y. \qquad (8)$$
Note that $E[\hat{\beta}] = (X'X)^{-1} X' E[Y] = (X'X)^{-1} X'X\beta = \beta$. Since $\mathrm{Cov}(TY) = T\,\mathrm{Cov}(Y)\,T'$, we have $\mathrm{Cov}(\hat{\beta}) = \mathrm{Cov}(TY) = \sigma^2 [T I T']$, where $T = (X'X)^{-1} X'$. Therefore, $\mathrm{Cov}(\hat{\beta}) = \sigma^2 (X'X)^{-1}$.
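A short NumPy sketch of the full-rank case, with illustrative simulated data (regenerated here so the snippet runs on its own): it solves the normal equations (7) for $\hat{\beta}$ and forms $\mathrm{Cov}(\hat{\beta}) = \sigma^2 (X'X)^{-1}$.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, sigma = 50, 3, 0.5                      # illustrative sizes
    X = rng.normal(size=(n, p))                   # full column rank (almost surely)
    beta = np.array([1.0, -2.0, 0.5])             # illustrative coefficients
    Y = X @ beta + rng.normal(scale=sigma, size=n)

    XtX = X.T @ X
    beta_hat = np.linalg.solve(XtX, X.T @ Y)      # unique OLS solution (8)
    # Equivalently: beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    cov_beta_hat = sigma**2 * np.linalg.inv(XtX)  # Cov(beta_hat) = sigma^2 (X'X)^{-1}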

Not full-rank case: Example - ANOVA for Designed Experiments. The normal equations (7) are consistent, but the system has infinitely many solutions. Each solution can be expressed as $\tilde{\beta} = (X'X)^{-} X'Y$, where $(X'X)^{-}$ is a generalized inverse of $X'X$. In fact,
$$E[\tilde{\beta}] = (X'X)^{-} X' E[Y] = (X'X)^{-}(X'X)\beta = H\beta \neq \beta,$$
so some components of $\beta$ do not possess unbiased estimators. The biases in $\tilde{\beta}$ may depend on the generalized inverse used in obtaining the particular OLS solution. All these estimators can't be regarded as optimal estimators of the vector $\beta$. Why? Optimal with respect to what criteria? One may need to add an additional criterion to OLS to get a unique solution, e.g., the Minimum norm OLS estimator: $\beta^{+} = (X'X)^{+} X'Y$, where $(X'X)^{+}$ is the Moore-Penrose generalized inverse (pseudo-inverse) of $X'X$.

Coordinate-Free (Vector Space) Approach: Interpret the model $\mu = X\beta$ as $\mu \in C[X]$. However, note that $\hat{\mu} = X\tilde{\beta} = PY = X(X'X)^{-}X'Y$, the projection of $Y$ onto $C[X]$. The symmetric matrix $P = X(X'X)^{-}X'$ is the orthogonal projection matrix onto $C[X]$, the space spanned by the columns of $X$. Even though in the non-full-rank case there are infinitely many solutions $\tilde{\beta}$ to the normal equations, the projection $\hat{\mu} = PY$ (also called $\hat{Y}$) is unique, i.e., the matrix $P$ does not change with the choice of a generalized inverse of $X'X$. For a vector $u \in C[X]$, i.e., $u = Xb$ for some vector $b$,
$$Pu = X(X'X)^{-}X'u = X(X'X)^{-}X'Xb = Xb = u.$$
Thus the projection of a vector $u \in C[X]$ onto $C[X]$ is $u$ itself.
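A sketch of the non-full-rank case with an illustrative one-way ANOVA design (intercept plus three group indicators, so $X$ has column rank 3 < 4): two different generalized inverses of $X'X$ give different solutions of the normal equations, but the same projection $\hat{Y} = PY$. The second generalized inverse (invert a non-singular $3 \times 3$ block and pad with zeros) is one standard construction, used here only for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    groups = np.repeat([0, 1, 2], 4)              # illustrative one-way ANOVA layout
    X = np.column_stack([np.ones(12)] + [(groups == g).astype(float) for g in range(3)])
    Y = rng.normal(size=12)
    XtX = X.T @ X                                 # rank 3 < 4: infinitely many solutions of (7)

    # Minimum-norm OLS solution via the Moore-Penrose pseudo-inverse (X'X)^+:
    beta_plus = np.linalg.pinv(XtX) @ X.T @ Y

    # A different generalized inverse: invert a non-singular 3x3 block, pad with zeros.
    G = np.zeros((4, 4))
    G[:3, :3] = np.linalg.inv(XtX[:3, :3])
    assert np.allclose(XtX @ G @ XtX, XtX)        # G is a generalized inverse of X'X
    beta_tilde = G @ X.T @ Y                      # another solution of the normal equations

    print(beta_plus)                              # the two coefficient vectors differ ...
    print(beta_tilde)
    assert np.allclose(X @ beta_plus, X @ beta_tilde)   # ... but Y_hat = P Y is the same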

Furthermore, for an arbitrary vector $Y \in V$, $PY \in C[X]$; therefore $P(PY) = PY$ holds for all $Y \in R^n$. Hence $(P^2 - P) = P(I - P) = 0$, so $P^2 = P$ (i.e., $P$ is an idempotent matrix). Is $P$ a symmetric matrix?

Fact: Every symmetric idempotent matrix is an orthogonal projection matrix onto the space spanned by its columns. Since $P(I - P) = 0$, the rows (columns) of $P$ are orthogonal to the columns (rows) of $(I - P)$, i.e., $Py$ and $(I - P)y$ are orthogonal. Note that $X\tilde{\beta} = \hat{\mu} = PY = \hat{Y}$ and $Y - \hat{Y} = (I - P)Y = e$, the vector of residuals. Therefore, the vectors $\hat{Y}$ and $e$ are orthogonal, i.e., $\hat{Y}'e = \hat{y}'e = 0$.

Example: Details of the full-rank linear regression model.

The Key Question: How can we characterize the class of linear functions $c'\beta$ that can be estimated uniquely through the least squares solutions?

Estimable functions: A linear parametric function $c'\beta$ is said to be estimable if there exists at least one unbiased estimator of it. If there does not exist any unbiased estimator of the linear function $c'\beta$, it is said to be non-estimable. Why consider estimable functions? We will discuss its connection with the concept of Identifiability.

Note that $Y_i$ is an unbiased estimator of $x_{i.}'\beta$. (Why?) Thus $x_{i.}'\beta$ is estimable for each row $x_{i.}'$ of the matrix $X$. Hence $c'\beta$, where the vector $c$ is some linear combination of the rows of $X$, is also estimable.

Fact: A linear parametric function $c'\beta$ of $\beta$ is estimable if and only if $c \in C(X') =$ Row space of $X$. (Prove it.) $c \in C(X') \iff c = X'l$ for some $l$.
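These facts are easy to verify numerically. The following sketch reuses the illustrative ANOVA design from above and checks that $P$ is symmetric and idempotent, that $\hat{Y}'e = 0$, and whether two example vectors $c$ give estimable $c'\beta$; the estimability test $c \in C(X') = C(X'X)$ is implemented as invariance of $c$ under the orthogonal projector $(X'X)(X'X)^{+}$.

    import numpy as np

    groups = np.repeat([0, 1, 2], 4)              # illustrative non-full-rank design again
    X = np.column_stack([np.ones(12)] + [(groups == g).astype(float) for g in range(3)])
    Y = np.random.default_rng(2).normal(size=12)

    XtX_pinv = np.linalg.pinv(X.T @ X)
    P = X @ XtX_pinv @ X.T                        # orthogonal projection onto C[X]
    assert np.allclose(P, P.T)                    # P is symmetric
    assert np.allclose(P @ P, P)                  # P is idempotent: P^2 = P

    Y_hat = P @ Y                                 # fitted values
    e = Y - Y_hat                                 # residuals (I - P) Y
    assert np.isclose(Y_hat @ e, 0.0)             # Y_hat and e are orthogonal

    def is_estimable(c):
        # c'beta is estimable iff c lies in C(X') = C(X'X).
        return np.allclose(X.T @ X @ XtX_pinv @ c, c)

    print(is_estimable(np.array([0., 1., -1., 0.])))   # group contrast: True
    print(is_estimable(np.array([0., 1.,  0., 0.])))   # single group effect: False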

Therefore, for any OLS solution $\tilde{\beta}$, $c'\tilde{\beta} = l'X(X'X)^{-}X'Y = l'PY = l'\hat{\mu}$. Thus $c'\tilde{\beta}$ is invariant to the choice of generalized inverse [a unique OLS solution for the unbiased estimator of $c'\beta$].

Gauss-Markov Theorem: $c'\hat{\beta}$ is the Best (Minimum Variance) Linear Unbiased Estimator (B.L.U.E.) of an estimable linear function $c'\beta$.

Generalized Least Squares: $\mathrm{Var}(\varepsilon) = \sigma^2 V$, where $V$ is a known p.d. matrix. Reduce this problem to an OLS problem by a non-singular transformation. Since $V$ is a positive definite matrix, there exists a non-singular matrix $T$ such that $V = TT'$. Now consider the linear transformation $Z = T^{-1}Y$. Note
$$E(Z) = T^{-1}E(Y) = T^{-1}X\beta, \qquad \mathrm{Cov}(Z) = \sigma^2 T^{-1}V(T^{-1})' = \sigma^2 T^{-1}(TT')(T^{-1})' = \sigma^2 I.$$
One can therefore consider the OLS problem for $Z$.
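A NumPy sketch of this reduction, with an illustrative known p.d. matrix $V$ (an AR(1)-type correlation structure chosen only for the example): take $T$ from the Cholesky factorization $V = TT'$, transform $Z = T^{-1}Y$ and $W = T^{-1}X$, and run OLS on $(W, Z)$. The result agrees with the usual closed-form GLS estimator $(X'V^{-1}X)^{-1}X'V^{-1}Y$.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 30
    X = np.column_stack([np.ones(n), rng.normal(size=n)])               # illustrative design
    rho = 0.6
    V = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))    # known p.d. V (illustrative)
    beta = np.array([1.0, 2.0])
    Y = X @ beta + np.linalg.cholesky(V) @ rng.normal(size=n)           # errors with Cov = V (sigma = 1)

    T = np.linalg.cholesky(V)                     # non-singular T with V = T T'
    Z = np.linalg.solve(T, Y)                     # Z = T^{-1} Y, so Cov(Z) = sigma^2 I
    W = np.linalg.solve(T, X)                     # transformed design T^{-1} X
    beta_gls = np.linalg.solve(W.T @ W, W.T @ Z)  # OLS on the transformed problem

    Vinv = np.linalg.inv(V)
    assert np.allclose(beta_gls, np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ Y))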

Background - Vector differentiation: Vector of partial derivatives of a linear form $l'u = \sum_{i=1}^{n} l_i u_i$ and of a quadratic form $u'Au = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} u_i u_j$, for a symmetric matrix $A$:
$$\frac{\partial (l'u)}{\partial u} = \begin{pmatrix} \partial(l'u)/\partial u_1 \\ \vdots \\ \partial(l'u)/\partial u_n \end{pmatrix} = \begin{pmatrix} l_1 \\ \vdots \\ l_n \end{pmatrix} = l; \qquad \frac{\partial (u'Au)}{\partial u} = \begin{pmatrix} 2a_{11}u_1 + 2\sum_{j \neq 1} a_{1j}u_j \\ \vdots \\ 2a_{nn}u_n + 2\sum_{j \neq n} a_{nj}u_j \end{pmatrix} = 2Au.$$
When $A$ is not symmetric, $u'Au = u'\{(A + A')/2\}u$, with $\{(A + A')/2\}$ symmetric. Therefore,
$$\frac{\partial (u'Au)}{\partial u} = (A + A')u.$$
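These identities can be sanity-checked numerically; the helper num_grad below is a hypothetical central-difference gradient written only for this check.

    import numpy as np

    rng = np.random.default_rng(4)
    n = 4
    l = rng.normal(size=n)
    A = rng.normal(size=(n, n))                   # a general (not necessarily symmetric) matrix
    u = rng.normal(size=n)

    def num_grad(f, u, h=1e-6):
        # Central-difference approximation of the gradient of a scalar function f at u.
        g = np.zeros_like(u)
        for i in range(len(u)):
            e = np.zeros_like(u)
            e[i] = h
            g[i] = (f(u + e) - f(u - e)) / (2 * h)
        return g

    assert np.allclose(num_grad(lambda v: l @ v, u), l)                  # d(l'u)/du = l
    assert np.allclose(num_grad(lambda v: v @ A @ v, u), (A + A.T) @ u,
                       atol=1e-4)                 # d(u'Au)/du = (A + A')u, = 2Au when A = A'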