4 Multiple Linear Regression
|
|
- Sophia Sims
- 5 years ago
- Views:
Transcription
1 4 Multiple Linear Regression 4. The Model Definition 4.. random variable Y fits a Multiple Linear Regression Model, iff there exist β, β,..., β k R so that for all (x, x 2,..., x k ) R k where ε N (, σ 2 ). Y = β + β x + β 2 x β k x k + ε (4.) orollary: If Y fits a Multiple Linear Regression Model, then for a fixed value of x = (x, x 2,..., x k ) R k the conditional expectation of Y given x equals E(Y x) = E(β + β x + β 2 x β k x k + ε) = β + β x + β 2 x β k x k (4.2) and the conditional variance of Y given x is V ar(y x) = V ar(β + β x + β 2 x β k x k + ε) = V ar(ε) = σ 2 (4.3) β... β k are called the regression coefficients, and they and σ 2 are the parameters of the model. This implies that the mean of Y depends on x in a linear fashion and the variance of Y is independent from the value of x, which is called homoscedasticity. 4.2 Visualize 4.2. Without Interaction Let limit ourselves for the moment to two predictor variables x and x 2 a third variable is Y the response variable. In a model without error Y = β + β x + β 2 x 2. Let Y = 5 + x + 2x 2. The following diagram shows this function in a 3d-plot from R.
2 The equation does not include an interaction term, which means that values of x 2 do not have an effect on the slope in the relationship between x and y, therefore the graph in a plane With Interaction When interaction is present this indicates that for different values of x 2 the slope of the line relating x and y can change the graph is not a plane anymore. In the model this can be introduced by including the product term for the interacting variables, x x 2. Let Y = 5 + x + 2x 2 3 x x 2. Here you see that the result is no longer plane but a curved surface. 2
3 4.3 Estimation Like for the simple linear regression model the first goal is to estimate the parameters of the model, using point estimators and confidence intervals. The data is given as an array like y x x 2... x k y 2 x 2 x x 2k.. y n x n x n2... x nk Using the data we will see if a multiple linear model is an appropriate description for the population, and if it is what estimates for the model parameters can be found. If the MLR model is not a good fit can we adjust the model to obtain a better fit We will find if outliers are causing problems, or if the model should be discarded entirely, because Y is independent from x,..., x k. onsidering the above array, it seems to be reasonable to use matrices to represent the data Matrix notation If The MLR model fits the population then each observation in the sample can be modeled by a random variable which fits the model equation, i.e. for i n Y i = β + β x i + β 2 x i2 + + β k x ik + ε i For the analysis it is assumed that the n random variables are independent. Then with Y = (Y, Y 2,..., Y n ), ε = (ε,..., ε n ), β = (β, β,..., β k ), and ˆ x x 2... x k X =. x 2 x x 2k x n x n2... x nk The model for n independent observations can be given in the following form: Y = X β + ε, ε N (, σ 2 I n ) 4.4 Point Estimators The least squares estimators are chosen such that the total squared distances to the regression function are as small as possible. Theorem 4.. In the case that X X is invertible the least squares estimator for the coefficient vector β, ˆβ, is given by ˆβ = (X X) X Y 3
4 Proof: For β R k+ the total of the squared distances of y, y 2,..., y n to the model function is ε ε = ( Y X β) ( Y X β) = Y Y Y X β (X β) Y + (X β) (X β) = Y Y 2 β X Y + β X X β In order to find least squares estimator ˆβ, which minimizes the expression above, find the partial derivatives of the expression and set them to zero, then solve for β. The linear system giving ˆβ (Math 24 we will not dwell on this here) which is equivalent to the normal equation giving in the case that X X is invertible Which concludes the proof. 2X Y + 2X X ˆβ = k X X ˆβ = X Y ˆβ = (X X) X Y X X is invertible, iff the columns of X are linearly independent. The i th column of X gives the values for the n individuals from the sample for the i th predictor variable. It follows that in the case of linearly dependent columns the least square estimator is not unique. In the case the columns are linearly dependent one usually removes columns until the design matrix has full rank, and the least squares estimator becomes unique. This is also the reason it is disadvantageous if predictor variables have a high correlation, this results in a small determinant of X X making it hard numerically to invert the matrix and calculating the least squares estimator with high precisions. Least squares estimator as projection Geometrically regression is the orthogonal projection of the vector y R n into the p dimensional space spanned by the columns from X. The projection ˆµ is according to Linear lgebra ˆµ = X (X X) X y = X ˆβ In regression it is tradition to use ŷ instead of ˆµ. The hat matrix is defined as because when applied to Y, it gets a hat. H = X (X X) X The vector ŷ gives the fitted values for observed values y from the model estimates. The model for the n observations are Y = Xβ + ε where ε has en expected value of. The fitted value of y, ŷ is then ŷ = X ˆβ 4
5 Lemma. The observed residual vector for a data vector is. e = y ŷ = y X ˆβ = y X(X X) X y = (I n X(X X) X ) y = (I n H) y H is called the hat matrix and is central in regression analysis. Properties of ˆβ Theorem ˆβ is an unbiased estimator of β. 2. The covariance matrix of ˆβ is ov( ˆβ) = σ 2 (X X) 3. ˆβ has a multivariate normal distribution. Proof:. 2. E( ˆβ) = E ((X X) X Y ) = (X X) X E(Y ) = (X X) X Xβ = I n β = β ov( ˆβ) = ov ((X X) X Y ) = (X X) X ov(y )X((X X) ) = (X X) X σ 2 I n X((X X) ) = σ 2 (X X) X X(X X) = σ 2 (X X) 3. y definition is ˆβ a linear combination in normally distributed random variables and therefore multivariate normal. ˆβ has also minimal variance. This property has to be defined, because ˆβ is a random vector and therefore has a p = k + dimensional covariance matrix, and how to order matrices needs to be clarified. Theorem 4.3. (Gauss-Markov Theorem) Let ψ = c β, for c R p, be a linear combination of the parameter vector β in the MLR model, assume the design matrix X has full rank, then ˆψ = c ˆβ is an unbiased estimator for ψ and has smallest variance in the class of linear unbiased estimators for ψ. 5
6 Proof:(ompare Faraway(22)) ecause of E( ˆψ) = E( c ˆβ) = c E( ˆβ) = c β = ψ, ˆψ is an unbiased estimator for ψ. Suppose a Y is some unbiased estimate of c β so that which means that a X = c (). Now let b = (X X) c, then c = X X b (2) and E( a Y ) = c β for all β a X β = c β for all β ˆψ = c ˆβ = b X X ˆβ = b X X(X X) X Y = b X Y (3) Now find the variance of a Y (show that V ar( a Y ) V ar( ˆψ)): V ar( a Y ) = V ar( a Y c ˆβ + c ˆβ) = V ar( a Y c ˆβ) + V ar( c ˆβ) + 2ov( a Y c ˆβ, c ˆβ) ut using (3) ov( a Y c ˆβ, c ˆβ) = ov( a Y b X Y, b X Y ) = ov(( a b X ) Y, b X Y ) = ( a b X )ov( Y )X b according to intermission 4 (ov( l Y, h Y ) = l ov( Y ) h). Then ( a b X )ov( Y )X b = ( a b X )σ 2 I n X b = σ 2 ( a X b X X) b = σ 2 ( c c ) b = because of () and (2). Therefore V ar( a Y ) = V ar( a Y c ˆβ) + V ar( c ˆβ) Since variances are always at least, this means that for all unbiased estimators a Y Which concludes the proof. Example 4.. V ar( a Y ) V ar( c ˆβ) small example relating age and weight to blood pressure: The data 6
7 weight age blood pressure The scatterplots do not display strong linear relationships, but we will never the less propose the following model for blood pressure in the population: bp = β + β (weight) + β 2 (age) + ε, ε N (, σ 2 ) First find the vector of parameter estimates for β = (β, β, β 2 ), based on the sample data above: X = with X X = and (X X) = (Last matrix found with R) With y = , X y = we get ˆβ = (X X) X y = and ŷ = X ˆβ = omparing the vector of fitted values, ŷ, with the measurement vector, y, we can observe a close fit. 7
8 For closer examination find the residual vector: e = y ŷ = The residuals giving the difference in blood pressure measured and predicted by the model, are quite small, and seem to support the model. Interpretation of the parameter estimates: The mean blood pressure for a person of age and weight is estimated to be This is not making sense!. We would be extrapolating using this interpretation. The blood pressure increase in average by 2.38 for every extra year when the weight is unchanged. The blood pressure increase in average by.44 for every extra kilogram when the age is unchanged. To be continued. The estimator for the variance of the error in the MLR model, σ 2, is again based on the residuals. Definition 4.2. Let p := k + ( the number of regression parameters). The residual sum of squares are 2. The mean squares of error are SS Res = e e MS Res = SS Res n p Theorem SS Res = Y Y ˆβX Y = Y [I n X(X X) X ] Y 2. E(SS Res ) = (n p)σ 2 3. SS Res /σ 2 is χ 2 distributed with df = (n p) and non centrality parameter λ =. Proof: 8
9 . SS Res = e e = ( Y Ŷ ) ( Y Ŷ ) = Y Y Y X ˆβ ˆβ X Y + ˆβ X X ˆβ = Y Y Y X ˆβ ˆβ X Y + Y X(X X) X X ˆβ = Y Y Y X ˆβ ˆβ X Y + Y X ˆβ = Y Y ˆβ X Y which proves the first part. ontinue from here SS Res = Y Y ˆβ X Y = Y Y Y X(X X) X Y = Y [I X(X X) X ] Y which concludes the second part of the equation. 2. To find the mean use E( Y Y ) = tr(ov(y )) + E( Y ) E( Y ) from intermission 4. Therefore we need the tr(i X(X X) X ): Then tr([i X(X X) X ]) = tr(i) tr(x(x X) X ) = n tr((x X) X X) = n tr(i p ) = n p E(SS Res ) = E( Y [I X(X X) X ] Y ) = tr([i X(X X) X ] σ 2 I) + E( Y ) [I X(X X) X ] E( Y ) = σ 2 (n p) + β X [I X(X X) X ] Xβ = σ 2 (n p) + β X Xβ β X X(X X) X Xβ = σ 2 (n p) + β X Xβ β X Xβ = σ 2 (n p) 3. ecause of intermission 5 it follows that SS Res /σ 2 is χ 2 distributed with df = n p. Rest without proof. onclusion: ˆσ 2 = MS Res is an unbiased estimator for σ 2. ontinue Example. 4.: SS Res = e e = (.5,.853,.869,.53,.2,.53) The estimated variance ˆσ 2 = 5.78/(6 3) =.93, and therefore ˆσ =.93 =.39. = 5.78 We estimate that the measurements vary with a standard deviation of.39 around the regression function. 9
3 Multiple Linear Regression
3 Multiple Linear Regression 3.1 The Model Essentially, all models are wrong, but some are useful. Quote by George E.P. Box. Models are supposed to be exact descriptions of the population, but that is
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More information4.7 Confidence and Prediction Intervals
4.7 Confidence and Prediction Intervals Instead of conducting tests we could find confidence intervals for a regression coefficient, or a set of regression coefficient, or for the mean of the response
More information3. For a given dataset and linear model, what do you think is true about least squares estimates? Is Ŷ always unique? Yes. Is ˆβ always unique? No.
7. LEAST SQUARES ESTIMATION 1 EXERCISE: Least-Squares Estimation and Uniqueness of Estimates 1. For n real numbers a 1,...,a n, what value of a minimizes the sum of squared distances from a to each of
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationSimple Linear Regression
Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)
More informationSTAT5044: Regression and Anova
STAT5044: Regression and Anova Inyoung Kim 1 / 25 Outline 1 Multiple Linear Regression 2 / 25 Basic Idea An extra sum of squares: the marginal reduction in the error sum of squares when one or several
More informationMultiple Linear Regression
Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationSummer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.
Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall
More informationChapter 3: Multiple Regression. August 14, 2018
Chapter 3: Multiple Regression August 14, 2018 1 The multiple linear regression model The model y = β 0 +β 1 x 1 + +β k x k +ǫ (1) is called a multiple linear regression model with k regressors. The parametersβ
More informationSTA 302f16 Assignment Five 1
STA 30f16 Assignment Five 1 Except for Problem??, these problems are preparation for the quiz in tutorial on Thursday October 0th, and are not to be handed in As usual, at times you may be asked to prove
More informationIntroduction to Estimation Methods for Time Series models. Lecture 1
Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation
More informationMa 3/103: Lecture 24 Linear Regression I: Estimation
Ma 3/103: Lecture 24 Linear Regression I: Estimation March 3, 2017 KC Border Linear Regression I March 3, 2017 1 / 32 Regression analysis Regression analysis Estimate and test E(Y X) = f (X). f is the
More informationProblems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B
Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2
More informationQuantitative Analysis of Financial Markets. Summary of Part II. Key Concepts & Formulas. Christopher Ting. November 11, 2017
Summary of Part II Key Concepts & Formulas Christopher Ting November 11, 2017 christopherting@smu.edu.sg http://www.mysmu.edu/faculty/christophert/ Christopher Ting 1 of 16 Why Regression Analysis? Understand
More informationMath 3330: Solution to midterm Exam
Math 3330: Solution to midterm Exam Question 1: (14 marks) Suppose the regression model is y i = β 0 + β 1 x i + ε i, i = 1,, n, where ε i are iid Normal distribution N(0, σ 2 ). a. (2 marks) Compute the
More informationEconometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018
Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate
More informationLecture 4 Multiple linear regression
Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationSTAT 540: Data Analysis and Regression
STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State
More informationBasic Distributional Assumptions of the Linear Model: 1. The errors are unbiased: E[ε] = The errors are uncorrelated with common variance:
8. PROPERTIES OF LEAST SQUARES ESTIMATES 1 Basic Distributional Assumptions of the Linear Model: 1. The errors are unbiased: E[ε] = 0. 2. The errors are uncorrelated with common variance: These assumptions
More informationLecture 6 Multiple Linear Regression, cont.
Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression
More informationReference: Davidson and MacKinnon Ch 2. In particular page
RNy, econ460 autumn 03 Lecture note Reference: Davidson and MacKinnon Ch. In particular page 57-8. Projection matrices The matrix M I X(X X) X () is often called the residual maker. That nickname is easy
More informationSTAT5044: Regression and Anova. Inyoung Kim
STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:
More informationMAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik
MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,
More informationMultiple Linear Regression
Multiple Linear Regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 / 42 Passenger car mileage Consider the carmpg dataset taken from
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationStatistics for Engineers Lecture 9 Linear Regression
Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April
More informationPeter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8
Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall
More informationApplied Econometrics (QEM)
Applied Econometrics (QEM) The Simple Linear Regression Model based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #2 The Simple
More informationSTAT 135 Lab 13 (Review) Linear Regression, Multivariate Random Variables, Prediction, Logistic Regression and the δ-method.
STAT 135 Lab 13 (Review) Linear Regression, Multivariate Random Variables, Prediction, Logistic Regression and the δ-method. Rebecca Barter May 5, 2015 Linear Regression Review Linear Regression Review
More informationLecture 15. Hypothesis testing in the linear model
14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma
More informationWell-developed and understood properties
1 INTRODUCTION TO LINEAR MODELS 1 THE CLASSICAL LINEAR MODEL Most commonly used statistical models Flexible models Well-developed and understood properties Ease of interpretation Building block for more
More informationHomoskedasticity. Var (u X) = σ 2. (23)
Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This
More informationHomework 2: Simple Linear Regression
STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA
More informationBusiness Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata'
Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Linear Regression Specication Let Y be a univariate quantitative response variable. We model Y as follows: Y = f(x) + ε where
More informationECON The Simple Regression Model
ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In
More information14 Multiple Linear Regression
B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 14 Multiple Linear Regression 14.1 The multiple linear regression model In simple linear regression, the response variable y is expressed in
More informationRegression diagnostics
Regression diagnostics Kerby Shedden Department of Statistics, University of Michigan November 5, 018 1 / 6 Motivation When working with a linear model with design matrix X, the conventional linear model
More informationInverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1
Inverse of a Square Matrix For an N N square matrix A, the inverse of A, 1 A, exists if and only if A is of full rank, i.e., if and only if no column of A is a linear combination 1 of the others. A is
More informationx 21 x 22 x 23 f X 1 X 2 X 3 ε
Chapter 2 Estimation 2.1 Example Let s start with an example. Suppose that Y is the fuel consumption of a particular model of car in m.p.g. Suppose that the predictors are 1. X 1 the weight of the car
More informationMatrix Approach to Simple Linear Regression: An Overview
Matrix Approach to Simple Linear Regression: An Overview Aspects of matrices that you should know: Definition of a matrix Addition/subtraction/multiplication of matrices Symmetric/diagonal/identity matrix
More informationMa 3/103: Lecture 25 Linear Regression II: Hypothesis Testing and ANOVA
Ma 3/103: Lecture 25 Linear Regression II: Hypothesis Testing and ANOVA March 6, 2017 KC Border Linear Regression II March 6, 2017 1 / 44 1 OLS estimator 2 Restricted regression 3 Errors in variables 4
More information1 Least Squares Estimation - multiple regression.
Introduction to multiple regression. Fall 2010 1 Least Squares Estimation - multiple regression. Let y = {y 1,, y n } be a n 1 vector of dependent variable observations. Let β = {β 0, β 1 } be the 2 1
More informationLecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is
Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Q = (Y i β 0 β 1 X i1 β 2 X i2 β p 1 X i.p 1 ) 2, which in matrix notation is Q = (Y Xβ) (Y
More informationProperties of the least squares estimates
Properties of the least squares estimates 2019-01-18 Warmup Let a and b be scalar constants, and X be a scalar random variable. Fill in the blanks E ax + b) = Var ax + b) = Goal Recall that the least squares
More informationSSR = The sum of squared errors measures how much Y varies around the regression line n. It happily turns out that SSR + SSE = SSTO.
Analysis of variance approach to regression If x is useless, i.e. β 1 = 0, then E(Y i ) = β 0. In this case β 0 is estimated by Ȳ. The ith deviation about this grand mean can be written: deviation about
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix
More informationwhere x and ȳ are the sample means of x 1,, x n
y y Animal Studies of Side Effects Simple Linear Regression Basic Ideas In simple linear regression there is an approximately linear relation between two variables say y = pressure in the pancreas x =
More informationMaximum Likelihood Estimation
Maximum Likelihood Estimation Merlise Clyde STA721 Linear Models Duke University August 31, 2017 Outline Topics Likelihood Function Projections Maximum Likelihood Estimates Readings: Christensen Chapter
More informationMEI Exam Review. June 7, 2002
MEI Exam Review June 7, 2002 1 Final Exam Revision Notes 1.1 Random Rules and Formulas Linear transformations of random variables. f y (Y ) = f x (X) dx. dg Inverse Proof. (AB)(AB) 1 = I. (B 1 A 1 )(AB)(AB)
More informationMultivariate Regression Analysis
Matrices and vectors The model from the sample is: Y = Xβ +u with n individuals, l response variable, k regressors Y is a n 1 vector or a n l matrix with the notation Y T = (y 1,y 2,...,y n ) 1 x 11 x
More informationMultiple Regression Analysis. Part III. Multiple Regression Analysis
Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant
More informationThe linear model is the most fundamental of all serious statistical models encompassing:
Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x
More informationLecture Notes on Different Aspects of Regression Analysis
Andreas Groll WS 2012/2013 Lecture Notes on Different Aspects of Regression Analysis Department of Mathematics, Workgroup Financial Mathematics, Ludwig-Maximilians-University Munich, Theresienstr. 39,
More informationMLES & Multivariate Normal Theory
Merlise Clyde September 6, 2016 Outline Expectations of Quadratic Forms Distribution Linear Transformations Distribution of estimates under normality Properties of MLE s Recap Ŷ = ˆµ is an unbiased estimate
More informationSimple Linear Regression for the MPG Data
Simple Linear Regression for the MPG Data 2000 2500 3000 3500 15 20 25 30 35 40 45 Wgt MPG What do we do with the data? y i = MPG of i th car x i = Weight of i th car i =1,...,n n = Sample Size Exploratory
More information2. Regression Review
2. Regression Review 2.1 The Regression Model The general form of the regression model y t = f(x t, β) + ε t where x t = (x t1,, x tp ), β = (β 1,..., β m ). ε t is a random variable, Eε t = 0, Var(ε t
More informationWeighted Least Squares
Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w
More informationAdvanced Quantitative Methods: ordinary least squares
Advanced Quantitative Methods: Ordinary Least Squares University College Dublin 31 January 2012 1 2 3 4 5 Terminology y is the dependent variable referred to also (by Greene) as a regressand X are the
More informationFor more information about how to cite these materials visit
Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/
More informationReview of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley
Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate
More informationChapter 5 Matrix Approach to Simple Linear Regression
STAT 525 SPRING 2018 Chapter 5 Matrix Approach to Simple Linear Regression Professor Min Zhang Matrix Collection of elements arranged in rows and columns Elements will be numbers or symbols For example:
More informationREGRESSION ANALYSIS AND INDICATOR VARIABLES
REGRESSION ANALYSIS AND INDICATOR VARIABLES Thesis Submitted in partial fulfillment of the requirements for the award of degree of Masters of Science in Mathematics and Computing Submitted by Sweety Arora
More informationLecture 14 Simple Linear Regression
Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent
More informationIn the bivariate regression model, the original parameterization is. Y i = β 1 + β 2 X2 + β 2 X2. + β 2 (X 2i X 2 ) + ε i (2)
RNy, econ460 autumn 04 Lecture note Orthogonalization and re-parameterization 5..3 and 7.. in HN Orthogonalization of variables, for example X i and X means that variables that are correlated are made
More informationGeneral Linear Model (Chapter 4)
General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients
More informationMatrices and vectors A matrix is a rectangular array of numbers. Here s an example: A =
Matrices and vectors A matrix is a rectangular array of numbers Here s an example: 23 14 17 A = 225 0 2 This matrix has dimensions 2 3 The number of rows is first, then the number of columns We can write
More informationEstadística II Chapter 5. Regression analysis (second part)
Estadística II Chapter 5. Regression analysis (second part) Chapter 5. Regression analysis (second part) Contents Diagnostic: Residual analysis The ANOVA (ANalysis Of VAriance) decomposition Nonlinear
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationRegression. Oscar García
Regression Oscar García Regression methods are fundamental in Forest Mensuration For a more concise and general presentation, we shall first review some matrix concepts 1 Matrices An order n m matrix is
More informationLecture 18 MA Applied Statistics II D 2004
Lecture 18 MA 2612 - Applied Statistics II D 2004 Today 1. Examples of multiple linear regression 2. The modeling process (PNC 8.4) 3. The graphical exploration of multivariable data (PNC 8.5) 4. Fitting
More informationAdvanced Econometrics I
Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics
More informationLecture 2. The Simple Linear Regression Model: Matrix Approach
Lecture 2 The Simple Linear Regression Model: Matrix Approach Matrix algebra Matrix representation of simple linear regression model 1 Vectors and Matrices Where it is necessary to consider a distribution
More informationAppendix A: Review of the General Linear Model
Appendix A: Review of the General Linear Model The generallinear modelis an important toolin many fmri data analyses. As the name general suggests, this model can be used for many different types of analyses,
More informationMath 423/533: The Main Theoretical Topics
Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)
More informationWeighted Least Squares
Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w
More informationStatistical View of Least Squares
May 23, 2006 Purpose of Regression Some Examples Least Squares Purpose of Regression Purpose of Regression Some Examples Least Squares Suppose we have two variables x and y Purpose of Regression Some Examples
More information9 Correlation and Regression
9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the
More informationAnalysis of Bivariate Data
Analysis of Bivariate Data Data Two Quantitative variables GPA and GAES Interest rates and indices Tax and fund allocation Population size and prison population Bivariate data (x,y) Case corr® 2 Independent
More information2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23
2.4. ASSESSING THE MODEL 23 2.4.3 Estimatingσ 2 Note that the sums of squares are functions of the conditional random variables Y i = (Y X = x i ). Hence, the sums of squares are random variables as well.
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More information1. Simple Linear Regression
1. Simple Linear Regression Suppose that we are interested in the average height of male undergrads at UF. We put each male student s name (population) in a hat and randomly select 100 (sample). Then their
More informationLECTURE 2 LINEAR REGRESSION MODEL AND OLS
SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another
More informationInstructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses
ISQS 5349 Final Spring 2011 Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses 1. (10) What is the definition of a regression model that we have used throughout
More informationLinear Model Under General Variance
Linear Model Under General Variance We have a sample of T random variables y 1, y 2,, y T, satisfying the linear model Y = X β + e, where Y = (y 1,, y T )' is a (T 1) vector of random variables, X = (T
More informationAnalisi Statistica per le Imprese
, Analisi Statistica per le Imprese Dip. di Economia Politica e Statistica 4.3. 1 / 33 You should be able to:, Underst model building using multiple regression analysis Apply multiple regression analysis
More informationDensity Temp vs Ratio. temp
Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationThis model of the conditional expectation is linear in the parameters. A more practical and relaxed attitude towards linear regression is to say that
Linear Regression For (X, Y ) a pair of random variables with values in R p R we assume that E(Y X) = β 0 + with β R p+1. p X j β j = (1, X T )β j=1 This model of the conditional expectation is linear
More information1 A Review of Correlation and Regression
1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then
More informationMFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators
MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators Thilo Klein University of Cambridge Judge Business School Session 4: Linear regression,
More informationLAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION
LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION In this lab you will learn how to use Excel to display the relationship between two quantitative variables, measure the strength and direction of the
More informationApplied linear statistical models: An overview
Applied linear statistical models: An overview Gunnar Stefansson 1 Dept. of Mathematics Univ. Iceland August 27, 2010 Outline Some basics Course: Applied linear statistical models This lecture: A description
More informationSTAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)
STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points
More informationMATH11400 Statistics Homepage
MATH11400 Statistics 1 2010 11 Homepage http://www.stats.bris.ac.uk/%7emapjg/teach/stats1/ 4. Linear Regression 4.1 Introduction So far our data have consisted of observations on a single variable of interest.
More informationRidge Regression. Flachs, Munkholt og Skotte. May 4, 2009
Ridge Regression Flachs, Munkholt og Skotte May 4, 2009 As in usual regression we consider a pair of random variables (X, Y ) with values in R p R and assume that for some (β 0, β) R +p it holds that E(Y
More informationSTAT763: Applied Regression Analysis. Multiple linear regression. 4.4 Hypothesis testing
STAT763: Applied Regression Analysis Multiple linear regression 4.4 Hypothesis testing Chunsheng Ma E-mail: cma@math.wichita.edu 4.4.1 Significance of regression Null hypothesis (Test whether all β j =
More information