Statistics 910, #5: Regression Methods

Overview

1. Idea: effects of dependence
2. Examples of estimation (in R)
3. Review of regression
4. Comparisons and relative efficiencies

Idea

Decomposition  A well-known way to approach time series analysis is to decompose the observed series into several components,

    X_t = \text{Trend}_t + \text{Seasonal}_t + \text{Irregular}_t,

and then model the additive components separately. Here the trend usually means a linear or quadratic pattern.

Stationarity  The methods we have developed require (or at least assume to be plausible) stationarity in order to permit averaging over time. That assumption does not make sense for the trend component. A simpler view of the problem is to think of the data as

    X_t = \text{deterministic pattern}_t + \text{stationary variation}_t

and then model the deterministic component using tools like regression, leaving the rest to be handled as a stationary process, eventually reducing the data to

    X_t = \text{predictable variation}_t + \text{white noise}_t.

Examples  Data series from the text are:

Global temperature (since 1900, comparing annual and monthly): a univariate model, with no exogenous variables.

Mortality data, with several exogenous variables.

Fish harvest (SOI), with many plausible explanatory variables.

Question  In these examples, the regression residuals show correlation at several lags. Are the usual summaries of the fitted models (standard errors and t-statistics) reliable?

Quick Review of Regression

Data  The response is a column vector y = y_{1:n} = (y_1, \dots, y_n)' that is a time series in our examples. The explanatory variables (including a leading column of 1s for the intercept) are collected as the columns of the n \times q matrix X. Denote a row of X by the column vector x_t.

Model  The linear equation with stationary errors is, in scalar form,

    y_t = x_t'\beta + e_t,   E(e_t) = 0,   Cov(e_t, e_s) = \gamma(t, s) = \gamma(|t - s|),

and in matrix form, with the errors collected in the vector e = e_{1:n},

    y = X\beta + e,   E(e) = 0,   Var(e) = \Gamma = \sigma^2 \Phi,    (1)

where \Phi is normalized to have 1s along its diagonal (a correlation matrix) by dividing each element of \Gamma by \gamma(0). Hence Var(e_t) = \gamma(0) = \sigma^2. (Writing \gamma(0) = \sigma^2 gives expressions that look more familiar.)

OLS estimator  Assuming the explanatory variables are known and observed, the OLS estimator of \beta is

    \hat\beta = (X'X)^{-1} X'y = \beta + (X'X)^{-1} X'e    (2)

with residuals \hat e_t = y_t - x_t'\hat\beta. The usual unbiased estimator of \sigma^2 (valid for uncorrelated errors) is

    \hat\sigma^2 = \sum_t \hat e_t^2 / (n - q).
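To make the Question concrete, here is a minimal R sketch on simulated data (the series and parameter values are illustrative assumptions, not the text's data): OLS is fit with lm(), and the residual ACF shows the kind of correlation at several lags described above.

## Simulated illustration (not the text's data): OLS when the errors are AR(1).
set.seed(910)
n <- 200
x <- as.numeric(arima.sim(model = list(ar = 0.8), n = n))  # autocorrelated regressor
e <- as.numeric(arima.sim(model = list(ar = 0.8), n = n))  # AR(1) errors, alpha = 0.8
y <- 2 + 1.5 * x + e
fit <- lm(y ~ x)
summary(fit)       # the reported standard errors assume uncorrelated errors
acf(resid(fit))    # residual correlations at several lags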

Tests  The text reviews the basic tests (such as the F test for added variables) in Section 2.2.

Properties of the OLS estimator  The second form of (2) makes it easy to see that, although it ignores the dependence among the observations, the OLS estimator is unbiased. The variance of the estimator is

    Var(\hat\beta) = (X'X)^{-1} X' Var(e) X (X'X)^{-1}
                   = (X'X)^{-1} X' \Gamma X (X'X)^{-1}
                   = \sigma^2 (X'X)^{-1} X' \Phi X (X'X)^{-1}.

This contrasts with the usual expression in the case of uncorrelated errors,

    Var(\hat\beta) = \sigma^2 (X'X)^{-1}.

Sandwich estimator  The form of the variance of the OLS estimator suggests a simple, robust estimate of its variance:

    \widehat{Var}(\hat\beta) = (X'X)^{-1} X' \hat\Gamma X (X'X)^{-1}.    (3)

In the heteroscedastic case, this leads to the White estimator, which uses \hat\Gamma = diag(\hat e_t^2). For dependence, one can use blocked estimates of the covariances.

GLS estimator  The GLS estimator (which is the MLE under a normal distribution) is the solution to the minimization

    \min_\alpha (y - X\alpha)' \Gamma^{-1} (y - X\alpha).    (4)

To solve (4), assume that \Gamma is known, factor it as \Gamma = \Gamma^{1/2}\Gamma^{1/2}, and express the sum of squares to be minimized as

    (\Gamma^{-1/2} y - (\Gamma^{-1/2} X)\alpha)' (\Gamma^{-1/2} y - (\Gamma^{-1/2} X)\alpha) = (\tilde y - \tilde X \alpha)' (\tilde y - \tilde X \alpha),

where \tilde y = \Gamma^{-1/2} y and \tilde X = \Gamma^{-1/2} X. This is now a regular least squares problem, for which the solution is

    \tilde\beta = (\tilde X'\tilde X)^{-1} \tilde X'\tilde y = (X'\Gamma^{-1}X)^{-1} X'\Gamma^{-1} y = \beta + (X'\Gamma^{-1}X)^{-1} X'\Gamma^{-1} e.

The variance of the GLS estimator is

    Var(\tilde\beta) = \sigma^2 (X'\Phi^{-1}X)^{-1}.
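Continuing the simulated example above, the following R sketch carries out this computation for AR(1) errors with the autoregressive coefficient treated as known (an assumption for illustration): \Phi has entries \alpha^{|i-j|}, and the GLS estimate can be computed either directly as (X'\Phi^{-1}X)^{-1}X'\Phi^{-1}y or by transforming the data and running OLS.

## GLS for the simulated example, with the AR(1) correlation matrix assumed known.
alpha <- 0.8
X     <- cbind(1, x)                       # design matrix with intercept
Phi   <- alpha^abs(outer(1:n, 1:n, "-"))   # Phi[i, j] = alpha^|i - j|
PhiInvX  <- solve(Phi, X)                  # Phi^{-1} X
beta_gls <- solve(t(X) %*% PhiInvX, t(PhiInvX) %*% y)
beta_gls
## Equivalent route: transform by a matrix M with M'M = Phi^{-1}, then use OLS.
R  <- chol(Phi)                            # Phi = R'R with R upper triangular
yt <- backsolve(R, y, transpose = TRUE)    # ytilde = (R')^{-1} y
Xt <- backsolve(R, X, transpose = TRUE)    # Xtilde = (R')^{-1} X
coef(lm(yt ~ Xt - 1))                      # matches beta_gls

Using \Phi in place of \Gamma changes nothing in the coefficient estimate: \Gamma = \sigma^2\Phi, and the scalar \sigma^2 cancels.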

Comparison of Estimators: Scalar Model

Scalar model  The scalar model (q = 1) simplifies the arithmetic and suggests the matrix expressions:

    y_t = \beta x_t + e_t,    \hat\beta = \frac{x'y}{x'x} = \beta + \frac{x'e}{x'x}.

Assume that the errors are autoregressive with coefficient \alpha:

    e_t = \alpha e_{t-1} + w_t,   |\alpha| < 1.

Impact of dependence  The variance of the OLS estimator is (using the fact that \Phi_{ij} = \alpha^{|i-j|})

    Var(\hat\beta) = \frac{Var(x'e)}{(x'x)^2} = \frac{\sigma^2}{(x'x)^2} x'\Phi x
                   = \frac{\sigma^2}{(x'x)^2} \Big( \sum_t x_t^2 + 2\alpha \sum_t x_t x_{t-1} + 2\alpha^2 \sum_t x_t x_{t-2} + \cdots \Big)
                   = \frac{\sigma^2}{x'x} \Big( 1 + 2\alpha \frac{\sum_t x_t x_{t-1}}{\sum_t x_t^2} + 2\alpha^2 \frac{\sum_t x_t x_{t-2}}{\sum_t x_t^2} + \cdots \Big).

We can obtain nicer expressions by choosing values of x_t so that these sums (which look like estimated correlations) have known values, at least approximately. The easiest way to do this is to imagine x_t as an autoregression with some coefficient \rho,

    x_t = \rho x_{t-1} + u_t,   |\rho| < 1,

so that \sum_t x_{t+h} x_t / \sum_t x_t^2 \approx \rho^h. With this choice, the sums become

    Var(\hat\beta) \approx \frac{\sigma^2}{x'x} (1 + 2\alpha\rho + 2\alpha^2\rho^2 + \cdots) = \frac{\sigma^2}{x'x} \cdot \frac{1 + \alpha\rho}{1 - \alpha\rho}.

Notice that we are not thinking of x_t as a random variable; we are using this device to generate a sequence of numbers with a specific set of autocorrelations. We can also approximate x'x by n Var(X) and write

    Var(\hat\beta) \approx \frac{1 + \alpha\rho}{1 - \alpha\rho} \cdot \frac{\sigma^2}{n\, Var(X)}.    (5)
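A quick Monte Carlo check of (5) in R (a sketch with assumed values \alpha = \rho = 0.8 and a fixed simulated x sequence): the simulated sampling variance of the OLS slope should be close to the approximation.

## Monte Carlo check of approximation (5) for the scalar (no-intercept) model.
set.seed(5)
alpha <- 0.8; rho <- 0.8; nn <- 400; nrep <- 2000; beta <- 1
xs <- as.numeric(arima.sim(model = list(ar = rho), n = nn))      # fixed x sequence
bhat <- replicate(nrep, {
  es <- as.numeric(arima.sim(model = list(ar = alpha), n = nn))  # AR(1) errors
  sum(xs * (beta * xs + es)) / sum(xs^2)                         # OLS slope, no intercept
})
var(bhat)                                    # simulated Var(beta-hat)
sig2 <- 1 / (1 - alpha^2)                    # Var(e_t) when Var(w_t) = 1
sig2 / sum(xs^2) * (1 + alpha * rho) / (1 - alpha * rho)   # approximation (5)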

Optimistic  Compared with the expression for the variance under uncorrelated errors, the actual variance differs from the usual expression by the factor

    \frac{1 + \alpha\rho}{1 - \alpha\rho}.

If you set \rho = 1, you get the factor from the previous lecture for the variance of \bar{X} in the presence of autocorrelated data. In this case, the presence of positive correlation in the exogenous series compounds the loss of efficiency. For example, if \alpha = \rho = 0.8, then (1 + \alpha\rho)/(1 - \alpha\rho) = 1.64/0.36 \approx 4.56. Ignoring the dependence leads to a false sense of accuracy (i.e., confidence intervals based on the OLS expression are too narrow for the stated coverage).

Bias of \hat\sigma^2  The estimator \hat\sigma^2 is also biased because of the autocorrelation. In the scalar model under these same conditions, we can approximate the bias easily (note that \hat\beta - \beta = \sum_t x_t e_t / \sum_t x_t^2):

    E \sum_t \hat e_t^2 = E \sum_t \big(e_t - x_t(\hat\beta - \beta)\big)^2
                        = E \Big( \sum_t e_t^2 + \sum_t x_t^2 (\hat\beta - \beta)^2 - 2(\hat\beta - \beta) \sum_t x_t e_t \Big)
                        \approx \sigma^2 \Big( n + \frac{1 + \alpha\rho}{1 - \alpha\rho} - 2\,\frac{1 + \alpha\rho}{1 - \alpha\rho} \Big)
                        = \sigma^2 \Big( n - \frac{1 + \alpha\rho}{1 - \alpha\rho} \Big),

which is also shrunken toward zero. Hence the usual OLS expression s^2/(x'x) is going to be much too small.

Relative efficiency  The simplified univariate setup makes it easy to compare the variances of the OLS and GLS estimators. To obtain the GLS estimator in the scalar case, we can write out the effect of multiplying by \Gamma^{-1/2} directly: subtract \alpha times the equation for y_{t-1} from the equation for y_t,

    y_t - \alpha y_{t-1} = \beta(x_t - \alpha x_{t-1}) + (e_t - \alpha e_{t-1}) = \beta(x_t - \alpha x_{t-1}) + w_t.

Hence, we can estimate \beta efficiently via OLS applied to these so-called generalized differences of the observed data (a short R sketch of this computation follows).
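Continuing the simulated scalar example (with \alpha again treated as known), a brief sketch of the generalized-difference computation in R:

## Generalized differences z_t - alpha * z_{t-1}, for t = 2, ..., n.
gd <- function(z, a) z[-1] - a * z[-length(z)]
ys <- beta * xs + as.numeric(arima.sim(model = list(ar = alpha), n = nn))
beta_gd <- sum(gd(xs, alpha) * gd(ys, alpha)) / sum(gd(xs, alpha)^2)
beta_gd    # (approximate) GLS estimate of beta from the generalized differences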

The resulting GLS estimator is

    \tilde\beta = \frac{\sum_t (x_t - \alpha x_{t-1})(y_t - \alpha y_{t-1})}{\sum_t (x_t - \alpha x_{t-1})^2}.

Its variance is then (treating x_t as uncorrelated with w_t, a must for least squares, and then viewing the variance as conditional on x_t)

    Var(\tilde\beta) = Var\Big( \frac{\sum_t (x_t - \alpha x_{t-1}) w_t}{\sum_t (x_t - \alpha x_{t-1})^2} \Big) = \frac{\sigma_w^2}{\sum_t (x_t - \alpha x_{t-1})^2}.

Assuming {x_t} is autoregressive with coefficient \rho as before implies (expand the square and approximate the three sums)

    (1/n) \sum_t (x_t - \alpha x_{t-1})^2 \approx \sigma_x^2 (1 + \alpha^2 - 2\alpha\rho).

We also know that Var(e_t) = \sigma_w^2/(1 - \alpha^2), so \sigma_w^2 = (1 - \alpha^2)\sigma^2 (recall \sigma^2 = Var(e_t)). Pulling these together gives

    Var(\tilde\beta) \approx \frac{\sigma^2}{n \sigma_x^2} \cdot \frac{1 - \alpha^2}{1 + \alpha^2 - 2\alpha\rho},

and the ratio of variances (see (5)) is

    \frac{Var(\tilde\beta)}{Var(\hat\beta)} \approx \frac{1 - \alpha\rho}{1 + \alpha\rho} \cdot \frac{1 - \alpha^2}{1 + \alpha^2 - 2\alpha\rho}.

For the example with \alpha = \rho = 0.8, the efficiency is approximately 0.22.

Conclusion  OLS is not only inefficient here, it makes claims as though it were efficient. Under this model of positively correlated data, the OLS expressions underestimate the variance of \hat\beta, and the OLS estimates are much less efficient than the GLS estimates.

Comparison of Estimators: General Case

Variances  The variances of the estimators are

    OLS: Var(\hat\beta) = \sigma^2 (X'X)^{-1} X'\Phi X (X'X)^{-1}    and    GLS: Var(\tilde\beta) = \sigma^2 (X'\Phi^{-1}X)^{-1}.
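These two formulas can be evaluated directly for the simulated design and AR(1) \Phi constructed earlier (a sketch; the common factor \sigma^2 is omitted because it cancels in the ratio), along with the scalar efficiency ratio just derived:

## Variance matrices (up to sigma^2) of OLS and GLS for the simulated design.
XtX   <- crossprod(X)                                      # X'X
v_ols <- solve(XtX) %*% t(X) %*% Phi %*% X %*% solve(XtX)  # (X'X)^-1 X'Phi X (X'X)^-1
v_gls <- solve(t(X) %*% solve(Phi, X))                     # (X'Phi^-1 X)^-1
diag(v_gls) / diag(v_ols)        # per-coefficient ratio Var(GLS)/Var(OLS), <= 1
## Scalar-model efficiency ratio from the formula above, at alpha = rho = 0.8.
eff <- function(a, r) (1 - a * r) / (1 + a * r) * (1 - a^2) / (1 + a^2 - 2 * a * r)
eff(0.8, 0.8)                    # approximately 0.22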

Equality of estimators  For the OLS and GLS estimators to coincide, the columns of X must span the same linear subspace as the columns of \Phi X (as would happen if \Phi X = X). In general, that does not happen. (We will later see some situations in which it does, at least for large n.) If the columns of X are eigenvectors of \Phi, equality also obtains. Asymptotically, that will occur for some very special X's.

Comparison criterion  There is no unique way to order matrices. To find a criterion, consider this one: because the GLS estimator has smaller variance than OLS, we know that for any linear combination of the estimates

    Var(a'\tilde\beta) \le Var(a'\hat\beta).

It follows that we can use generalized variances (i.e., determinants) to order the estimators, |Var(\tilde\beta)| \le |Var(\hat\beta)|, and the ratio of these generalized variances (an efficiency of sorts) is

    \frac{|Var(\tilde\beta)|}{|Var(\hat\beta)|} = \frac{|X'X|^2}{|X'\Phi X| \, |X'\Phi^{-1} X|}.

Bounds  By orthogonalizing X so that X'X = I, it can be shown that there is a lower bound on the efficiency:

    \frac{Var(a'\tilde\beta)}{Var(a'\hat\beta)} \ge \frac{4 \lambda_1 \lambda_n}{(\lambda_1 + \lambda_n)^2},

where \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > 0 are the eigenvalues of \Phi (which is assumed to be of full rank). These bounds come from the Kantorovich inequality: for 0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n and nonnegative weights p_i with \sum_i p_i = 1,

    \Big( \sum_i p_i \lambda_i \Big) \Big( \sum_i \frac{p_i}{\lambda_i} \Big) \le \frac{A^2}{G^2},

where A = (\lambda_1 + \lambda_n)/2 is the arithmetic mean and G = \sqrt{\lambda_1 \lambda_n} is the geometric mean of the extreme eigenvalues.
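For the simulated design and AR(1) \Phi used earlier, both the generalized-variance ratio and the eigenvalue bound stated above can be computed directly in R (a sketch):

## Generalized-variance ratio |X'X|^2 / (|X'Phi X| |X'Phi^-1 X|) for the example.
det(crossprod(X))^2 / (det(t(X) %*% Phi %*% X) * det(t(X) %*% solve(Phi, X)))
## Kantorovich lower bound on Var(a' beta_gls) / Var(a' beta_ols) for any a.
lam <- eigen(Phi, symmetric = TRUE, only.values = TRUE)$values
4 * min(lam) * max(lam) / (min(lam) + max(lam))^2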

Lingering Questions

Computing in practice  Suppose that we estimate \alpha from the data. Is the resulting approximate (feasible) GLS estimator as good as these approximations suggest?

Model specification  What if we don't know that the process is autoregressive, much less its coefficient? Are the gains going to be as large as these calculations indicate?