
Fall 2007 Solution to Midterm Examination, STAT 7, Dr. Goel

1. [10 points] For the general linear model $y = X\beta + \varepsilon$, with uncorrelated errors having mean zero and variance $\sigma^2$, suppose that the design matrix $X$ is not necessarily of full rank. Let $\hat{y} = X\hat{\beta}$ denote the projection of $y$ onto the space spanned by the columns of $X$, where $\hat{\beta}$ is any OLS solution to the normal equations $X'X\beta = X'y$. Find the variance-covariance matrix $\Sigma_{\hat{y}}$ of $\hat{y}$. Furthermore, show that $\mathrm{trace}(\Sigma_{\hat{y}}) = \sigma^2\,\mathrm{rank}(X)$.

For the GLM $y = X\beta + \varepsilon$, $E[\varepsilon] = 0$, $\mathrm{Cov}[\varepsilon] = ((\mathrm{cov}(\varepsilon_i, \varepsilon_j))) = \sigma^2 I$, the estimator $\hat{y} = X\hat{\beta} = X(X'X)^- X'y = Py$, where $P = X(X'X)^- X'$ is the orthogonal projection matrix onto the space spanned by the columns of $X$, is uniquely defined, even if the column rank of $X$ is not full. Now, $\Sigma_{\hat{y}} = \mathrm{Cov}(Py) = P\,\mathrm{Cov}(y)\,P' = \sigma^2 PIP'$. However, since $P$ is a symmetric idempotent matrix, $\Sigma_{\hat{y}} = \sigma^2 PIP' = \sigma^2 P^2 = \sigma^2 P$. Furthermore, the eigenvalues $\lambda_i$, $i = 1, 2, \ldots, n$, of $P$ are either 0 or 1 [the eigenvalues of $P^2$ are equal to the squares of the eigenvalues of $P$, and $P^2 = P$ implies that these values must be either 0 or 1]. Now,

$\sum_{i=1}^{n} \mathrm{Var}(\hat{y}_i) = \mathrm{trace}[\Sigma_{\hat{y}}] = \mathrm{trace}[\sigma^2 P] = \sigma^2\,\mathrm{trace}[P] = \sigma^2 \sum_{i=1}^{n} \lambda_i = \sigma^2\,\mathrm{rank}(P) = \sigma^2\,\mathrm{rank}(X)$.

2. [50 points] Consider a two-way ANOVA model $y_{ij} = \alpha_i + \tau_j + \varepsilon_{ij}$, $i = 1, 2$; $j = 1, 2$. The parameters $\alpha_i$ and $\tau_j$ are unknown.

a. [5 points] Suppose that the errors $\varepsilon_{ij}$ have mean zero, variance $\sigma^2$, and are uncorrelated. Express these observations in the general linear model form $y = X\beta + \varepsilon$, where $\beta = (\alpha_1, \alpha_2, \tau_1, \tau_2)'$, i.e., define the vector $y$ and the matrix $X$, and the covariance matrix of the error vector $\varepsilon$.

Let us stack the data points in the vector given by $y = (y_{11}, y_{12}, y_{21}, y_{22})'$, and let $e = (\varepsilon_{11}, \varepsilon_{12}, \varepsilon_{21}, \varepsilon_{22})'$ and $\beta = (\alpha_1, \alpha_2, \tau_1, \tau_2)'$. Then the design matrix

$X = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}$.

Furthermore, $\mathrm{Cov}(e) = \sigma^2 I$.
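The claims in Problem 1, that every eigenvalue of $P$ is 0 or 1 and that $\mathrm{trace}(P) = \mathrm{rank}(X)$, are easy to verify numerically. The minimal sketch below (an illustration added here, not part of the original solution) uses the rank-deficient design matrix from Problem 2(a); NumPy's Moore-Penrose pseudo-inverse stands in for a g-inverse, which is harmless since $P$ is the same for every choice of g-inverse of $X'X$:

import numpy as np

# Rank-deficient design matrix from Problem 2(a): rank(X) = 3, not 4.
X = np.array([[1, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 1, 0],
              [0, 1, 0, 1]], dtype=float)

# P = X (X'X)^- X'; the pseudo-inverse is one convenient g-inverse.
P = X @ np.linalg.pinv(X.T @ X) @ X.T

print(np.allclose(P, P.T), np.allclose(P @ P, P))       # symmetric, idempotent
print(np.round(np.linalg.eigvalsh(P), 8))               # eigenvalues are 0 and 1 only
print(np.round(np.trace(P), 8), np.linalg.matrix_rank(X))  # both equal 3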

b. [10 points] Show that for this design, a parametric function $c_1\alpha_1 + c_2\alpha_2 + d_1\tau_1 + d_2\tau_2$ is estimable if and only if $c_1 + c_2 = d_1 + d_2$.

The parametric function $l'\beta = c_1\alpha_1 + c_2\alpha_2 + d_1\tau_1 + d_2\tau_2$ is estimable iff there exists a vector $t$ such that $l' = t'X$, i.e.,

$(c_1\ \ c_2\ \ d_1\ \ d_2) = (t_1 + t_2,\ \ t_3 + t_4,\ \ t_1 + t_3,\ \ t_2 + t_4)$.

Therefore, $c_1 + c_2 = t_1 + t_2 + t_3 + t_4 = d_1 + d_2$.

Now suppose that $c_1 + c_2 = d_1 + d_2$ holds; then $c_2 = d_1 + d_2 - c_1$. Now

$(c_1\ \ c_2\ \ d_1\ \ d_2) = (c_1,\ \ d_1 + d_2 - c_1,\ \ d_1,\ \ d_2) = (0,\ \ c_1,\ \ d_1,\ \ d_2 - c_1)\,X$,

so $l'\beta$ is estimable.

c. [5 points] Consider the parametric functions $\alpha_1 - \alpha_2$, $\tau_1 - \tau_2$ and $\alpha_1 + \alpha_2 + \tau_1 + \tau_2$. Show that the coefficient vectors in these parametric functions form an orthogonal basis of the row space of $X$.

Note that these functions can be expressed as $K\beta$, where

$K = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{pmatrix}$.

It is easy to check that $KK'$ is a diagonal matrix. Hence the rows of $K$ are orthogonal. Furthermore, $\mathrm{rank}(K) = 3 = \mathrm{rank}(X)$, and each row of $X$ can be written as a linear combination of rows of $K$ (e.g., the first row of $X$ is one half of the sum of the three rows of $K$). Therefore, the rows of $K$ form an orthogonal basis of the row space $R[X]$.

d. [5 points] A g-inverse of the matrix $X'X$ for the above model is given below:

$(X'X)^- = \begin{pmatrix} 1 & 0 & -1/2 & -1/2 \\ 0 & 0 & 0 & 0 \\ -1/2 & 0 & 3/4 & 1/4 \\ -1/2 & 0 & 1/4 & 3/4 \end{pmatrix}$.

Find the best linear unbiased estimators of the three parametric functions in part (c) above and their variance-covariance matrix.

Given a generalized inverse of $X'X$, the BLUEs of $K\beta$ are given by $\widehat{K\beta} = K[X'X]^- X'y = Ly$, where

$L = \frac{1}{2}\begin{pmatrix} 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{pmatrix}$.

Note that the three rows of $L$ are orthogonal, and the length of each row vector equals one. Hence the variance-covariance matrix of $\widehat{K\beta}$ is $\Sigma_{\widehat{K\beta}} = \sigma^2 LL' = \sigma^2 I_3$.
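As a consistency check on part (d), the sketch below (illustrative only; the g-inverse is the reconstruction quoted above) verifies numerically that the quoted matrix satisfies the defining g-inverse identity, that $L = K(X'X)^- X'$ has the stated form, and that $LL' = I_3$:

import numpy as np

X = np.array([[1, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 1, 0],
              [0, 1, 0, 1]], dtype=float)
K = np.array([[1, -1, 0,  0],                 # alpha_1 - alpha_2
              [0,  0, 1, -1],                 # tau_1 - tau_2
              [1,  1, 1,  1]], dtype=float)   # alpha_1 + alpha_2 + tau_1 + tau_2
G = np.array([[ 1.0, 0.0, -0.50, -0.50],      # the g-inverse quoted in part (d)
              [ 0.0, 0.0,  0.00,  0.00],
              [-0.5, 0.0,  0.75,  0.25],
              [-0.5, 0.0,  0.25,  0.75]])

XtX = X.T @ X
print(np.allclose(XtX @ G @ XtX, XtX))   # defining property of a g-inverse: True
L = K @ G @ X.T                          # BLUE coefficients: K beta-hat = L y
print(2 * L)                             # rows (1,1,-1,-1), (1,-1,1,-1), (1,1,1,1)
print(np.allclose(L @ L.T, np.eye(3)))   # Cov(K beta-hat) = sigma^2 I_3: True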

e. [5 points] Is $y_{11}$ the best linear unbiased estimator of $\alpha_1 + \tau_1$? Explain.

No, it is not. Since $\alpha_1 + \tau_1$ is estimable, its BLUE is unique. It is easy to check that

$(1\ \ 0\ \ 1\ \ 0)\,\hat{\beta} = \frac{1}{4}(3y_{11} + y_{12} + y_{21} - y_{22})$.

Note that the variance of the BLUE is $\frac{(9+1+1+1)}{16}\sigma^2 = \frac{3}{4}\sigma^2 < \sigma^2 = \mathrm{Var}(y_{11})$.

f. [10 points] Consider a reduced model for this problem under the restriction $\alpha_1 - \alpha_2 = 0$. Find the difference of the ERROR Sum of Squares for the reduced model and the full model.

From part (c) above, $(\hat{\alpha}_1 - \hat{\alpha}_2) = \frac{1}{2}(y_{11} + y_{12} - y_{21} - y_{22})$, with $\mathrm{Var}(\hat{\alpha}_1 - \hat{\alpha}_2) = \sigma^2$. From the handout on Optimization of Error SS under Linear Restrictions on the parameter vector, it is known that for estimable linear restrictions,

Error SS(reduced model under the restriction $\alpha_1 - \alpha_2 = 0$) $-$ Error SS(full model) $= \sigma^2 (\hat{\alpha}_1 - \hat{\alpha}_2)^2 / \mathrm{Var}(\hat{\alpha}_1 - \hat{\alpha}_2) = (\hat{\alpha}_1 - \hat{\alpha}_2)^2$.

3. [20 points] Assume that the 4-dimensional random vector $y$ follows the model $y_i = \mu + \varepsilon_i$, $i = 1, 2, 3, 4$, where the errors have mean zero, and given the scalar $c$, the variance-covariance matrix of $\varepsilon$ is given by

$V = \sigma^2 \begin{pmatrix} 1 & c & c & 0 \\ c & 1 & 0 & c \\ c & 0 & 1 & c \\ 0 & c & c & 1 \end{pmatrix}$.

a) [5 points] Find all values of $c$ for which the above matrix is a covariance matrix.

For the matrix $V$ above to be a covariance matrix, it must be n.n.d. Thus all its eigenvalues must be non-negative. Considering the appropriate $2 \times 2$ partitioned matrices in $V$ (taking $\sigma^2 = 1$ without loss of generality),

$V - \lambda I = \begin{pmatrix} A & B \\ B & A \end{pmatrix}$, where $A = \begin{pmatrix} 1-\lambda & c \\ c & 1-\lambda \end{pmatrix}$, $B = \begin{pmatrix} c & 0 \\ 0 & c \end{pmatrix} = cI_2$.

Since the blocks commute,

$|V - \lambda I| = |A + B|\,|A - B| = \big[(1-\lambda+c)^2 - c^2\big]\big[(1-\lambda-c)^2 - c^2\big] = (1-\lambda)(1-\lambda+2c)\,(1-\lambda)(1-\lambda-2c) = (1-\lambda)^2\big((1-\lambda)^2 - 4c^2\big)$.

Thus the roots of the characteristic polynomial $|V - \lambda I| = 0$ are $\lambda = 1$ (with multiplicity 2) and $1 - \lambda = \pm 2c$. Now $\lambda \geq 0 \iff -1/2 \leq c \leq 1/2$.
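A quick numerical check of part (a) (again an illustration, not part of the original solution): for any $c$, the spectrum of $V/\sigma^2$ should be $\{1-2c,\ 1,\ 1,\ 1+2c\}$, so $V$ is non-negative definite exactly when $|c| \le 1/2$. The values of $c$ tried below are arbitrary test points:

import numpy as np

def V(c):
    # V / sigma^2 from Problem 3
    return np.array([[1, c, c, 0],
                     [c, 1, 0, c],
                     [c, 0, 1, c],
                     [0, c, c, 1]], dtype=float)

for c in [0.3, 0.5, -0.5, 0.6]:
    ev = np.linalg.eigvalsh(V(c))        # ascending: 1-2|c|, 1, 1, 1+2|c|
    print(c, np.round(ev, 8), "n.n.d." if ev.min() >= -1e-12 else "not n.n.d.")
# c = 0.6 produces the negative eigenvalue 1 - 2(0.6) = -0.2, as the solution predicts.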

b) [10 points] Find the Gauss-Markov estimator $\hat{\mu}$ of $\mu$ based on the vector $y$.

Note that for this model the design matrix is $X = \mathbf{1}$, a column of 1s. Furthermore, $VX = (1 + 2c)\,\mathbf{1}$. Therefore, the column space of $VX$ is the same as the column space of $X$. Hence the Gauss-Markov estimator of $\mu$ = the OLS estimator of $\mu$ = $X'y / X'X = \bar{y}$.

c) [5 points] Find the ratio of the variances of $\hat{\mu}$ and the OLS estimator of $\mu$.

Now $\mathrm{Var}(\bar{y}) = \frac{(1+2c)}{4}\sigma^2$. Of course, since the two estimators are the same, the ratio of their variances equals 1. Note that for $c = -1/2$, $\mathrm{Var}(\bar{y}) = 0$, and $\bar{y} = \mu$ with probability 1.
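The equality of the Gauss-Markov (GLS) and OLS estimators in part (b) can also be seen numerically. In this sketch (illustrative; $c$ is set to an arbitrary admissible value) the GLS weight vector $(X'V^{-1}X)^{-1}X'V^{-1}$ collapses to $\frac{1}{4}\mathbf{1}'$, the OLS weights:

import numpy as np

c = 0.3                                   # any value with |c| < 1/2 works here
V = np.array([[1, c, c, 0],
              [c, 1, 0, c],
              [c, 0, 1, c],
              [0, c, c, 1]], dtype=float)
X = np.ones((4, 1))                       # design matrix: a column of ones

Vinv = np.linalg.inv(V)
gls_weights = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv)   # 1 x 4 row vector
print(np.round(gls_weights, 8))           # [0.25 0.25 0.25 0.25] = OLS weights
print(np.allclose(V @ X, (1 + 2*c) * X))  # V 1 = (1+2c) 1, the key fact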

4. [5 points each] Explain why each of the following statements is True or False. If you make the correct choice, but provide an incorrect explanation, you will not receive any credit.

a. [True/False] In a general linear model $y = X\beta + \varepsilon$, let $x_i$ denote the $i$-th column of $X$, $i = 1, \ldots, p$. The parametric function $c_1\beta_1 + c_2\beta_2$ is estimable if the vectors $\{x_i,\ i = 1, 2\}$ do not belong to the space spanned by the vectors $\{c_2 x_1 - c_1 x_2, x_3, \ldots, x_p\}$.

False, but something close to this holds. Since the vectors $(c_1, c_2)$ and $(c_2, -c_1)$ are orthogonal, one can reparametrize $\beta = (\beta_1, \beta_2, \beta_3, \ldots, \beta_p)'$ by $\beta^* = (\gamma_1, \gamma_2, \beta_3, \ldots, \beta_p)'$ using $\gamma_1 = c_1\beta_1 + c_2\beta_2$ and $\gamma_2 = c_2\beta_1 - c_1\beta_2$. Now solve for $\beta_1, \beta_2$ in terms of $\gamma_1, \gamma_2$, i.e., $\beta_1 = c_1\gamma_1 + c_2\gamma_2$ and $\beta_2 = c_2\gamma_1 - c_1\gamma_2$. Assume, without loss of generality, that $c_1^2 + c_2^2 = 1$, and substitute for $\beta_1, \beta_2$ in the original model in terms of $\gamma_1, \gamma_2$. Then the first two columns in the design matrix of the reparametrized model are $c_1 x_1 + c_2 x_2$ and $c_2 x_1 - c_1 x_2$. Now $\gamma_1$ is estimable iff the first column of the new matrix, i.e., $c_1 x_1 + c_2 x_2$, does not belong to the space spanned by the last $(p-1)$ columns of the new matrix, i.e., $c_1 x_1 + c_2 x_2 \notin R[c_2 x_1 - c_1 x_2, x_3, \ldots, x_p]$. [This was a homework problem, and was discussed in class.] The key is to think of reparametrization; the problem statement instead imposed the condition on the individual columns $x_1, x_2$ rather than on the combination $c_1 x_1 + c_2 x_2$, and so it is not the right condition.

b. [True/False] If the marginal distributions of $X_1$ and $X_2$ are normal with means zero and variance 1, then their joint distribution must be a bivariate normal distribution.

False, since the marginals do not determine the joint distribution. (For example, if $X_1$ is standard normal and $X_2 = S X_1$, where $S = \pm 1$ with probability 1/2 each, independently of $X_1$, then both marginals are standard normal but the joint distribution is concentrated on two lines and hence is not bivariate normal.)

c. [True/False] In the simple model $y_t = \beta x_t + \varepsilon_t$, $t = 1, \ldots, n$, with errors $\{\varepsilon_t\}$ having mean zero, variance $\sigma^2$, and pair-wise correlations $\rho$, the B.L.U.E. of the parameter $\beta$ is given by the ratio estimator $\hat{\beta} = \sum_{t=1}^{n} y_t \big/ \sum_{t=1}^{n} x_t$.

False. In this case, since $X$ does not contain the column of 1s, the GLS and OLS estimators may be different, unless $\rho = 0$. However, since $[(1-\rho)I + \rho J]^{-1} = c_1 I + c_2 J$, where the $c$'s are non-zero, $X'V^{-1}X = c_1 x'x + c_2 (\mathbf{1}'x)^2$ and $X'V^{-1}y = c_1 x'y + c_2 (\mathbf{1}'x)(\mathbf{1}'y)$. Thus, the Gauss-Markov estimator is given by their ratio. This is not equal to the ratio estimator $(\mathbf{1}'y)/(\mathbf{1}'x)$.

d. [True/False] Let $c'y$ and $d'y$ both be BLUE of some parametric function $l'\beta$. Then $c$ and $d$ must be equal.

This is false if the function $l'\beta$ is not estimable, since $l'\hat{\beta} = l'(X'X)^- X'y$ depends on the choice of g-inverse. However, if it is estimable, the BLUE is unique. Thus $c'y = d'y$ for all $y$, and hence $c = d$.
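To make part (c) above concrete, the sketch below (illustrative only; the data values are made up) computes the Gauss-Markov estimator under the equicorrelated covariance $V = (1-\rho)I + \rho J$ and compares it with the ratio estimator; the two differ whenever $\rho \neq 0$:

import numpy as np

rho, n = 0.4, 5
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])            # hypothetical regressor values
y = np.array([1.1, 2.3, 2.8, 4.2, 5.1])            # hypothetical responses
V = (1 - rho) * np.eye(n) + rho * np.ones((n, n))  # equicorrelated error covariance

Vinv = np.linalg.inv(V)
beta_gm = (x @ Vinv @ y) / (x @ Vinv @ x)          # Gauss-Markov (GLS) estimator
beta_ratio = y.sum() / x.sum()                     # the ratio estimator
print(beta_gm, beta_ratio)                         # generally not equal when rho != 0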