Prediction

Rasmus Waagepetersen
Department of Mathematics, Aalborg University, Denmark
March 22, 2017

WLS and BLUE (prelude to BLUP)

Suppose that Y has mean Xβ and known covariance matrix V (but Y need not be normal). Then

  β̂ = (XᵀV⁻¹X)⁻¹XᵀV⁻¹Y

is a weighted least squares (WLS) estimate since it minimizes

  (Y − Xβ)ᵀV⁻¹(Y − Xβ).

It is also the best linear unbiased estimate (BLUE), that is, the unbiased estimate with smallest variance in the sense that Var β̃ − Var β̂ is positive semi-definite for any other linear unbiased estimate β̃.

[Slide 1]

More generally, suppose Y has mean vector µ in a linear subspace M with orthogonal projection P. Then Aµ̂ = APY is BLUE for ψ = Aµ.

[Slide 2]

BLUE for general parameter

Suppose EY = µ is in a linear subspace M, Cov Y = σ²I and ψ = Aµ. Then the BLUE of ψ is ψ̂ = Aµ̂, where µ̂ = PY and P is the orthogonal projection on M.

Proof: Assume ψ̃ is a LUE, i.e. ψ̃ = BY and Eψ̃ = Bµ = Aµ for all µ ∈ M. We also have APµ = Aµ for all µ ∈ M (which implies that ψ̂ is unbiased). Thus for all w ∈ Rⁿ,

  (B − AP)Pw = BPw − APw = APw − APw = 0

since Pw ∈ M. This implies (B − AP)P = 0, which gives

  Cov(ψ̃ − ψ̂, ψ̂) = σ²(B − AP)PᵀAᵀ = 0,

so that

  Var ψ̃ = Var(ψ̃ − ψ̂ + ψ̂) = Var(ψ̃ − ψ̂) + Var ψ̂.

Note: this is like Pythagoras if we say that ψ̃ − ψ̂ and ψ̂ are orthogonal when their covariance is zero.

[Slide 3]

Case: V = LLᵀ different from σ²I, and µ = Xβ where X has full rank. Then the BLUE of ψ = Aµ is Aµ̂, where

  µ̂ = X(XᵀV⁻¹X)⁻¹XᵀV⁻¹Y

is the WLS estimate of µ.

Proof: combine the previous result with the result below, letting Ỹ = L⁻¹Y.

Result: suppose ψ̂ = CỸ is BLUE of ψ based on data Ỹ and K is an invertible matrix. Then ψ̂ = CK⁻¹Y is BLUE based on Y = KỸ as well.

[Slide 4]
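A short numerical companion (not from the original slides; the design matrix, covariance matrix and data below are made up for illustration): it computes the WLS/BLUE estimate directly and, mirroring the proof above, via the transformed data Ỹ = L⁻¹Y with V = LLᵀ.

#Minimal R sketch (illustration only): WLS/BLUE with made-up X, V and data
set.seed(1)
n=50
X=cbind(1,rnorm(n))                        #made-up design matrix
V=diag(runif(n,0.5,2))                     #known (here diagonal) covariance matrix
beta=c(2,-1)
Y=as.numeric(X%*%beta)+rnorm(n,sd=sqrt(diag(V)))

Vinv=solve(V)
betahat=solve(t(X)%*%Vinv%*%X, t(X)%*%Vinv%*%Y)   #WLS/BLUE estimate

#same estimate via the transformation Ytilde = L^{-1} Y, where V = L L^T
L=t(chol(V))
betahat2=lm.fit(solve(L,X), solve(L,Y))$coefficients

Both computations give the same estimate; the second shows how the general V reduces to the σ²I case.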

Estimable parameters and BLUE

Definition: A linear combination aᵀβ is estimable if it has a LUE bᵀY.

Result: aᵀβ is estimable if and only if aᵀβ = cᵀµ for some c. If aᵀβ is estimable, then the BLUE is cᵀµ̂.

[Slide 5]

Optimal prediction

Let X and Y be random variables and g a real function. General result:

  Cov(Y − E[Y | X], g(X)) = Cov(E[Y − E[Y | X] | X], E[g(X) | X]) + E Cov(Y − E[Y | X], g(X) | X) = 0,

where the first term vanishes since E[Y − E[Y | X] | X] = 0 and the second since g(X) is fixed given X. Note, since E[Y − E[Y | X]] = 0, we also have

  E[(Y − E[Y | X]) g(X)] = 0.

In particular, for any prediction Ỹ = f(X) of Y:

  E[(Y − E[Y | X])(E[Y | X] − f(X))] = 0,

from which it follows that

  E(Y − Ỹ)² = E(Y − E[Y | X])² + E(E[Y | X] − Ỹ)² ≥ E(Y − E[Y | X])².

Thus E[Y | X] minimizes the mean square prediction error.

[Slide 6]

Pythagoras and conditional expectation

The space of real random variables with finite variance may be viewed as a vector space with inner product and (L²) norm

  ⟨X, Y⟩ = E(XY),  ‖X‖² = EX².

Orthogonal decomposition (Pythagoras):

  ‖Y‖² = ‖E[Y | X]‖² + ‖Y − E[Y | X]‖².

E[Y | X] may be viewed as the projection of Y on X since it minimizes the distance E(Y − Ỹ)² among all predictors Ỹ = f(X). For zero-mean random variables, orthogonal is the same as uncorrelated.

Decomposition of Y: Y = E[Y | X] + (Y − E[Y | X]), where the predictor E[Y | X] and the prediction error Y − E[Y | X] are uncorrelated. Thus

  Var Y = Var E[Y | X] + Var(Y − E[Y | X]) = Var E[Y | X] + E Var[Y | X],

whereby Var(Y − E[Y | X]) = E Var[Y | X]: the prediction variance is equal to the expected conditional variance of Y.

(Grimmett & Stirzaker, Probability and Random Processes, Chapter 7.9, is a good source on this perspective on prediction and conditional expectation.)

[Slides 7-8]
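The decomposition Var Y = Var E[Y | X] + E Var[Y | X] is easy to check by simulation. A minimal sketch, assuming the toy model X ∼ N(0, 4) and Y | X = x ∼ N(x, 1) (chosen purely for illustration):

#Minimal R sketch (illustration only): law of total variance, optimality of E[Y|X]
set.seed(1)
m=1e6
X=rnorm(m,0,2)                 #Var X = 4
Y=rnorm(m,mean=X,sd=1)         #Y | X = x ~ N(x,1), so E[Y|X] = X, Var[Y|X] = 1
var(Y)                         #approx 5
var(X)+1                       #Var E[Y|X] + E Var[Y|X] = 4 + 1
mean((Y-X)^2)                  #approx 1: MSPE of the optimal predictor E[Y|X]
mean((Y-0.9*X)^2)              #approx 1.04: any other f(X) does worse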

BLUP

Consider random vectors Y and X with mean vectors EY = µ_Y, EX = µ_X and covariance matrix

  Σ = [ Σ_Y    Σ_YX ]
      [ Σ_XY   Σ_X  ]

Then the best linear unbiased predictor of Y given X is

  Ŷ = µ_Y + Σ_YX Σ_X⁻¹ (X − µ_X)

in the sense that

  Var[Y − (a + BX)] − Var[Y − Ŷ]

is positive semi-definite for all linear unbiased predictors a + BX, with equality only if Ŷ = a + BX (unbiased: E[Y − a − BX] = 0).

[Slide 9]

Fact:

  Cov[Y − Ŷ, Ŷ] = 0 and Cov[CX, Y − Ŷ] = 0 for all C.   (1)

Thus

  Var Ŷ = Cov(Y, Ŷ) = Σ_YX Σ_X⁻¹ Σ_XY.

It follows that the mean square prediction error is

  Var[Y − Ŷ] = Var Y + Var Ŷ − 2 Cov(Y, Ŷ) = Σ_Y − Σ_YX Σ_X⁻¹ Σ_XY.

Proof of fact:

  Cov[CX, Y − Ŷ] = Cov[CX, Y] − Cov[CX, Ŷ] = C Σ_XY − C Σ_X Σ_X⁻¹ Σ_XY = 0.

[Slide 10]

Proof of BLUP

One version: Recall Cov[BX, Y − Ŷ] = 0 for all B. Then

  Var[Y − (a + BX)] = Var[Y − Ŷ] + Var[Ŷ − (a + BX)]
                      + Cov[Y − Ŷ, Ŷ − (a + BX)] + Cov[Ŷ − (a + BX), Y − Ŷ]
                    = Var[Y − Ŷ] + Var[Ŷ − (a + BX)].

Hence Var[Y − (a + BX)] − Var[Y − Ŷ] = Var[Ŷ − (a + BX)], where the right hand side is positive semi-definite.

Second version (Y scalar for consistency with the previous slide): the BLUP is the projection of Y onto the linear subspace spanned by 1, X_1, ..., X_n (with orthonormal basis U_1, ..., U_n, where U = Σ_X^(−1/2) X), in analogy with least squares. NB: the conditional expectation E[Y | X] is the projection onto the space of all variables Z = f(X_1, ..., X_n), f a real function.

[Slide 11]

Conditional distribution in the multivariate normal distribution

Consider jointly normal random vectors Y and X with mean vector µ = (µ_Y, µ_X) and covariance matrix

  Σ = [ Σ_Y    Σ_YX ]
      [ Σ_XY   Σ_X  ]

Then (provided Σ_X is invertible)

  Y | X = x ∼ N(µ_Y + Σ_YX Σ_X⁻¹ (x − µ_X), Σ_Y − Σ_YX Σ_X⁻¹ Σ_XY).

Proof: By the BLUP results, Y = Ŷ + R, where Ŷ = µ_Y + Σ_YX Σ_X⁻¹ (X − µ_X), R = Y − Ŷ and Cov(R, X) = 0. By normality R is independent of X, and the result follows.

[Slide 12]
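Referring to slides 9-10, here is a minimal R sketch of the BLUP and its mean square prediction error for jointly normal (Y, X), with made-up mean vectors and covariance blocks; MASS::mvrnorm is used to check property (1) empirically.

#Minimal R sketch (illustration only): BLUP of scalar Y given bivariate X
library(MASS)                                #for mvrnorm
set.seed(1)
muY=1; muX=c(0,0)
SigmaY=2
SigmaYX=c(1,0.5)                             #Cov(Y,X)
SigmaX=matrix(c(1,0.3,0.3,1),2,2)

blup=function(x) as.numeric(muY+SigmaYX%*%solve(SigmaX,x-muX))
mspe=SigmaY-SigmaYX%*%solve(SigmaX,SigmaYX)  #Var[Y - Yhat]

#empirical check
Sigma=rbind(c(SigmaY,SigmaYX),cbind(SigmaYX,SigmaX))
sim=mvrnorm(1e5,mu=c(muY,muX),Sigma=Sigma)
R=sim[,1]-apply(sim[,-1],1,blup)             #prediction errors Y - Yhat
c(var(R),mspe)                               #agree
cov(R,sim[,2])                               #approx 0, cf. (1)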

Conditional simulation using prediction

Suppose Y and X are jointly normal and we wish to simulate from Y | X = x. By the previous result, Y | X = x has the distribution of ŷ + R, where ŷ = µ_Y + Σ_YX Σ_X⁻¹ (x − µ_X). We thus need to simulate R. This can be done by "simulated prediction": simulate (Y, X) and compute Ŷ and R = Y − Ŷ. Then our conditional simulation is ŷ + R.

This is advantageous if it is easier to simulate (Y, X) and compute Ŷ than to simulate directly from the conditional distribution of Y | X = x (e.g. if simulation of (Y, X) is easy but Σ_Y − Σ_YX Σ_X⁻¹ Σ_XY is difficult).

[Slide 13]

Prediction in the linear mixed model

Let U ∼ N(0, Ψ) and Y | U = u ∼ N(Xβ + Zu, Σ). Then Cov[U, Y] = ΨZᵀ and Var Y = V = ZΨZᵀ + Σ, and

  Û = E[U | Y] = ΨZᵀV⁻¹(Y − Xβ),  Var Û = ΨZᵀV⁻¹ZΨᵀ.

NB:

  ΨZᵀ(ZΨZᵀ + Σ)⁻¹ = (Ψ⁻¹ + ZᵀΣ⁻¹Z)⁻¹ZᵀΣ⁻¹,

which is e.g. useful if Ψ⁻¹ is sparse (as for an AR model). Similarly,

  Var[U | Y] = Var[U − Û] = Ψ − ΨZᵀV⁻¹ZΨᵀ = (Ψ⁻¹ + ZᵀΣ⁻¹Z)⁻¹.

One-way anova example at p. 186 in M & T.

[Slide 14]

IQ example

Y measurement of IQ, U subject-specific random effect: Y = µ + U + ɛ, where the standard deviations of U and ɛ are 15 and 5 and µ = 100. Given Y = 130,

  E[µ + U | Y = 130] = 100 + 15²/(15² + 5²) · (130 − 100) = 127.

An example of shrinkage to the mean.

[Slide 15]

BLUP as hierarchical likelihood estimates

Maximization of the joint density ("hierarchical likelihood") f(y | u; β) f(u; ψ) with respect to u gives the BLUP (M & T p. 186 for one-way anova and p. 183 for the general linear mixed model). Joint maximization wrt. u and β gives Henderson's mixed-model equations (M & T p. 184), leading to the BLUE β̂ and the BLUP û.

[Slide 16]
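Returning to conditional simulation by simulated prediction (slide 13), a minimal R sketch in a bivariate normal toy example (all parameter values made up). Here (Y, X) is simulated via the conditional distribution for convenience; in applications the joint mechanism, e.g. a mixed model, would be used, and the exact conditional distribution makes the check straightforward.

#Minimal R sketch (illustration only): conditional simulation of Y given X = x
set.seed(1)
muY=0; muX=0; sY=2; sX=1; rho=0.8
SigYX=rho*sY*sX; SigX=sX^2
x=1.5                                      #the observed value of X

yhat=muY+SigYX/SigX*(x-muX)                #prediction at the observed x

#simulated prediction: simulate (Y,X), predict Y from X, keep the residual
m=1e5
Xs=rnorm(m,muX,sX)
Ys=rnorm(m,muY+SigYX/SigX*(Xs-muX),sqrt(sY^2-SigYX^2/SigX))
Rs=Ys-(muY+SigYX/SigX*(Xs-muX))            #R = Y - Yhat
Ycond=yhat+Rs                              #draws from Y | X = x

c(mean(Ycond),yhat)                        #agree
c(var(Ycond),sY^2-SigYX^2/SigX)            #agree with the conditional variance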

BLUP of mixed effect with unknown β

Assume EX = Cβ and EY = Dβ. Given X and β, the BLUP of K = Aβ + BY is

  K̂(β) = Aβ + BŶ(β),

where Ŷ(β) = Dβ + Σ_YX Σ_X⁻¹ (X − Cβ) is the BLUP of Y. Typically β is unknown. Then the BLUP is

  K̂ = Aβ̂ + BŶ(β̂),

where β̂ is the BLUE (Harville, 1991).

Proof: K̂(β) can be rewritten as

  Aβ + BŶ(β) = [A + BD − BΣ_YX Σ_X⁻¹ C]β + BΣ_YX Σ_X⁻¹ X = T + BΣ_YX Σ_X⁻¹ X.

Note that the BLUE of T = [A + BD − BΣ_YX Σ_X⁻¹ C]β is T̂ = [A + BD − BΣ_YX Σ_X⁻¹ C]β̂.

Now consider a LUP K̃ = HX = [H − BΣ_YX Σ_X⁻¹]X + BΣ_YX Σ_X⁻¹ X of K. By unbiasedness, T̃ = [H − BΣ_YX Σ_X⁻¹]X is a LUE of T, so Var[T̃ − T] − Var[T̂ − T] is positive semi-definite. Also note that by (1),

  Cov[T̃ − T, K̂(β) − K] = 0 and Cov[T̂ − T, K̂(β) − K] = 0.

Using this it follows that Var[K̃ − K] − Var[K̂ − K] is positive semi-definite.

[Slide 17]

Hint: subtract and add K̂(β).

[Slide 18]

EBLUP and EBLUE

Typically the covariance matrix depends on unknown parameters. EBLUPs are obtained by replacing the unknown variance parameters by their estimates (similarly for the EBLUE).

[Slide 19]

Model assessment

Make histograms, qq-plots etc. for the EBLUPs of ɛ and U. It may be advantageous to consider standardized EBLUPs; the standardized BLUP is [Cov Û]^(−1/2) Û.

[Slide 20]
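As a concrete illustration of EBLUPs (not from the slides; the data and model below are simulated), the formula Û = ΨZᵀV⁻¹(Y − Xβ̂), with REML estimates of the variance parameters plugged in, can be compared against the EBLUPs returned by lme4's ranef() in a one-way anova:

#Minimal R sketch (illustration only): EBLUPs in a simulated one-way anova
library(lme4)
set.seed(1)
k=30; r=5                                  #30 groups, 5 observations per group
g=factor(rep(1:k,each=r))
u=rnorm(k,0,2)                             #true random effects, sd 2
y=10+u[g]+rnorm(k*r,0,1)                   #residual sd 1
fit=lmer(y~1+(1|g),REML=TRUE)

vc=as.data.frame(VarCorr(fit))
s2u=vc$vcov[1]                             #estimated Var(U)
s2e=vc$vcov[2]                             #estimated residual variance
Z=model.matrix(~g-1)                       #n x k indicator matrix
V=s2u*Z%*%t(Z)+s2e*diag(k*r)               #estimated Var(Y)
uhat=s2u*t(Z)%*%solve(V,y-fixef(fit))      #Psi Z^T V^{-1} (Y - X betahat)

max(abs(uhat-ranef(fit)$g[,1]))            #approx 0: matches lme4's EBLUPs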

Example: prediction of random intercepts and slopes in orthodont data

library(lme4)                    #lme4 for lmer(); Orthodont data from nlme
data(Orthodont, package="nlme")

ort7=lmer(distance~age+factor(Sex)+(1|Subject),data=Orthodont)

#check of model ort7
res=residuals(ort7)
qqnorm(res)
qqline(res)
#outliers occur for subjects M09 and M13

#plot residuals against subjects
boxplot(res~Orthodont$Subject)

#plot residuals against fitted values
fitted=fitted(ort7)
plot(rank(fitted),res)

#extract predictions of random intercepts
raneffects=ranef(ort7)
ranint=raneffects$Subject

#qqplot of random intercepts
qqnorm(ranint[[1]])
qqline(ranint[[1]])

#plot for subject M09
M09=Orthodont$Subject=="M09"
plot(Orthodont$age[M09],fitted[M09],type="l",ylim=c(20,32))
points(Orthodont$age[M09],Orthodont$distance[M09])

[Slides 21-22]

[Slide 23: diagnostic figures: normal Q-Q plot and histogram of the residuals, residuals against ranked fitted values, normal Q-Q plot of the random intercepts, and the fitted line with observed distances for subject M09.]

Example: quantitative genetics (Sorensen and Waagepetersen 2003)

X_ij: size of the jth litter of the ith pig.

[Pedigree figure: pigs M16, M11, M03, M14, M06, M10 and F06, F05, F02, F03, distinguishing pigs with and without litter size observations.]

[Slide 24]

U_i, Ũ_i random genetic effects influencing the size and the variability of X_ij:

  X_ij | U_i = u_i, Ũ_i = ũ_i ∼ N(µ_i + u_i, exp(µ̃_i + ũ_i))

  (U_1, ..., U_n, Ũ_1, ..., Ũ_n) ∼ N(0, G ⊗ A)

A: additive genetic relationship (correlation) matrix (depending on the pedigree). The correlation structure is derived from the simple model

  U_offspring = (U_father + U_mother)/2 + ɛ.

Q = A⁻¹ is sparse! (a generalization of AR(1))

  G = [ σ_u²       ρ σ_u σ_ũ ]
      [ ρ σ_u σ_ũ  σ_ũ²      ]

ρ: coefficient of genetic correlation between U_i and Ũ_i. (A toy numerical illustration of the G ⊗ A structure is given after the exercises below.)

NB: high dimension, n > …

Aim: identify pigs with favorable genetic effects.

[Slide 25]

Exercises

1. Write out in detail the proofs of the results on slide 2.
2. For the orthodont data: fit models where the subject-specific intercepts are treated a) as fixed effects, b) as random effects. Compare the BLUEs of the fixed effects with the BLUPs of the random effects.
3. Fill in the details of the proof on slide 4.
4. Verify the results on page 186 in M&T regarding BLUPs in the case of a one-way anova.
5. Consider slides 13-14 and the special case of the hidden AR(1) model. Explain how you can do conditional simulation of the hidden AR(1) process U given the observed data Y using 1) conditional simulation using prediction and 2) the expressions for the conditional mean and covariance (cf. slide 14). Try to implement the solutions in R.

[Slide 26]
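A toy numerical illustration (not from the slides) of the G ⊗ A covariance structure, using a made-up three-pig pedigree; kronecker() (operator %x%) builds the joint covariance of (U_1, ..., U_n, Ũ_1, ..., Ũ_n).

#Minimal R sketch (illustration only): the G %x% A covariance structure
su=1; sut=0.5; rho=0.4                     #made-up variance parameters
G=matrix(c(su^2,rho*su*sut,rho*su*sut,sut^2),2,2)

#made-up pedigree: pigs 1 and 2 unrelated founders, pig 3 their offspring
A=matrix(c(1,0,0.5,
           0,1,0.5,
           0.5,0.5,1),3,3)

Sigma=G%x%A                                #6 x 6 covariance of (U, Utilde)
Q=solve(A)                                 #precision of A; sparse for large pedigrees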
