Ma 3/103: Lecture 25 Linear Regression II: Hypothesis Testing and ANOVA
1 Ma 3/103: Lecture 25. Linear Regression II: Hypothesis Testing and ANOVA. March 6, 2017. KC Border, Linear Regression II, March 6, 2017 (44 slides).
2 Outline: 1. OLS estimator. 2. Restricted regression. 3. Errors in variables. 4. ANOVA. 5. The F test in an ANOVA framework. 6. Contrasts.
3 OLS estimator. Standard Linear Model: $y = X\beta + \varepsilon$, where $E\varepsilon = 0$ and $\operatorname{Var}\varepsilon = E(\varepsilon\varepsilon') = \sigma^2 I$.
4 OLS estimator. OLS estimation. With $N$ observations on $X_1, \dots, X_K, Y$, let $X$ be the $N \times K$ matrix of regressors and $y$ the $N \times 1$ vector of observations on the response $Y$. Then if $X$ has rank $K$, the OLS estimator $\hat\beta_{OLS}$ of the parameter vector $\beta$ is given by
$$\hat\beta_{OLS} = (X'X)^{-1}X'y. \tag{1}$$
It is obtained by orthogonally projecting $y$ onto the column space of $X$.
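Equation (1) is easy to check numerically. Here is a minimal sketch (the simulated data and variable names are my own, not from the slides) that solves the normal equations and verifies that the residuals are orthogonal to the columns of $X$, which is what "orthogonal projection" means in practice:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 50, 3
# Design matrix: a constant column plus two simulated regressors
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(scale=0.1, size=N)

# Equation (1): beta_hat = (X'X)^{-1} X'y, solved without forming the inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# The fitted values are the orthogonal projection of y onto col(X),
# so the residuals must be orthogonal to every column of X.
e = y - X @ beta_hat
print(beta_hat)
print(np.max(np.abs(X.T @ e)))
```

Solving the linear system with `np.linalg.solve` rather than explicitly inverting $X'X$ is the standard numerically stable choice.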
5 OLS estimator. Regression and correlation. For the simple regression $y_t = \beta_0 + \beta_1 x_t$,
$$X'X = \begin{pmatrix} N & \sum_t x_t \\ \sum_t x_t & \sum_t x_t^2 \end{pmatrix}, \qquad X'y = \begin{pmatrix} \sum_t y_t \\ \sum_t y_t x_t \end{pmatrix},$$
and
$$(X'X)^{-1} = \frac{1}{N \sum_t (x_t - \bar x)^2} \begin{pmatrix} \sum_t x_t^2 & -\sum_t x_t \\ -\sum_t x_t & N \end{pmatrix},$$
so $(\hat\beta_0, \hat\beta_1)' = (X'X)^{-1}X'y$ and
$$\hat\beta_1 = \frac{N \sum_t y_t x_t - (\sum_t x_t)(\sum_t y_t)}{N \sum_t x_t^2 - (\sum_t x_t)^2} = \frac{\sum_t y_t x_t - (\sum_t x_t)(\sum_t y_t)/N}{\sum_t x_t^2 - (\sum_t x_t)^2/N}.$$
6 OLS estimator. Recall $\operatorname{Corr}(X, Y) = \operatorname{Cov}(X, Y)/\bigl((\operatorname{SD} X)(\operatorname{SD} Y)\bigr)$; for standardized variables this is just $\operatorname{Cov}(X, Y) = E(XY)$. Given pairs $(x_t, y_t)$, $t = 1, \dots, N$, of observations, define the sample correlation coefficient $r$ by
$$r = \frac{\sum_{t=1}^N (x_t - \bar x)(y_t - \bar y)}{\sqrt{\sum_{t=1}^N (x_t - \bar x)^2}\,\sqrt{\sum_{t=1}^N (y_t - \bar y)^2}},$$
which is the sample analog of the correlation. It is also known as the Pearson product-moment correlation coefficient. Consider the centered variables $\tilde x_t = x_t - \bar x$, $\tilde y_t = y_t - \bar y$. Then
$$\hat\beta_1 = \frac{N \sum_t y_t x_t - (\sum_t x_t)(\sum_t y_t)}{N \sum_t x_t^2 - (\sum_t x_t)^2} = \frac{N \sum_t \tilde y_t \tilde x_t - (\sum_t \tilde x_t)(\sum_t \tilde y_t)}{N \sum_t \tilde x_t^2 - (\sum_t \tilde x_t)^2},$$
but by construction $\sum_t \tilde x_t = \sum_t \tilde y_t = 0$.
7 OLS estimator. Now look at the formula for the correlation coefficient. It can be rewritten as
$$r = \frac{\sum_{t=1}^N \tilde x_t \tilde y_t}{s_x s_y} = \frac{s_x}{s_y}\,\hat\beta_1,$$
where $s_x = \sqrt{\sum_t (x_t - \bar x)^2} = \sqrt{\sum_t \tilde x_t^2}$ and $s_y = \sqrt{\sum_t \tilde y_t^2}$. Among other things, this implies that $r = 0$ if and only if the slope $\hat\beta_1$ of the regression line is zero. (If $s_x = 0$, then all the $x_t$ are the same, and the slope is not identifiable.)
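The identity $r = (s_x/s_y)\hat\beta_1$ can be verified numerically. A small sketch with simulated data (my own, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2.0 + 0.7 * x + rng.normal(scale=0.5, size=100)

# Centered variables, as on the slide
xt, yt = x - x.mean(), y - y.mean()
s_x = np.sqrt((xt ** 2).sum())
s_y = np.sqrt((yt ** 2).sum())

beta1_hat = (xt * yt).sum() / (xt ** 2).sum()   # OLS slope
r = (xt * yt).sum() / (s_x * s_y)               # Pearson r

# The slide's identity: r = (s_x / s_y) * beta1_hat
print(r, (s_x / s_y) * beta1_hat)
```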
8 OLS estimator. Testing for serial correlation. Regress $e_t$ on $e_{t-1}$, that is, fit $e_t = \beta_0 + \beta_1 e_{t-1}$, and test $\beta_1 = 0$.
9 OLS estimator. Testing linear restrictions on $\beta$. To test $q$ simultaneous restrictions, let $H_0 : a = A\beta$, where $A$ is a $q \times K$ matrix with rank $q$. Theorem: under the null hypothesis, the test statistic
$$F = \frac{1}{q s^2}\,(a - A\hat\beta_{OLS})'\bigl[A(X'X)^{-1}A'\bigr]^{-1}(a - A\hat\beta_{OLS})$$
has an F-distribution with $(q, N-K)$ degrees of freedom.
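A hedged sketch of this Wald-type F statistic; the design, restriction matrix, and data below are illustrative assumptions, not the slides' example. The null imposed here (two zero coefficients) happens to be true in the simulation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
N, K, q = 80, 3, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
y = X @ np.array([1.0, 0.0, 0.0]) + rng.normal(size=N)  # H0: beta_2 = beta_3 = 0 is true

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e = y - X @ beta_hat
s2 = e @ e / (N - K)                 # unbiased estimate of sigma^2

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])      # q x K restriction matrix, rank q
a = np.zeros(q)

d = a - A @ beta_hat
F = d @ np.linalg.solve(A @ XtX_inv @ A.T, d) / (q * s2)
p = stats.f.sf(F, q, N - K)          # p-value from the F(q, N-K) distribution
print(F, p)
```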
10 OLS estimator. The F-test of the regression. Many software packages, including R, compute for you something called the F-statistic for the regression. The F-statistic for the regression tests the null hypothesis that all the coefficients on the non-constant terms are zero:
$$H_0 : \beta_2 = \beta_3 = \dots = \beta_K = 0.$$
(If you have a constant term, it is usually $X_1$ in our terminology.)
11 OLS estimator. Coefficient of multiple correlation. Decompose
$$y'y = \hat\beta_{OLS}'X'X\hat\beta_{OLS} + e'e + 2\hat\beta_{OLS}'\underbrace{X'e}_{=0}.$$
The coefficient of multiple correlation $R$ is a measure of the fraction of $y'y$ explained by the regressors. Specifically,
$$1 - R^2 = \frac{e'e}{y'y}, \qquad\text{or}\qquad R^2 = \frac{\hat y'\hat y}{y'y} = \frac{\hat\beta_{OLS}'X'X\hat\beta_{OLS}}{y'y}.$$
The Pythagorean Theorem implies $y'y = e'e + \hat y'\hat y$, so $0 \le R^2 \le 1$.
12 OLS estimator. Geometry of $R^2$: $R = \sqrt{R^2}$ is the cosine of the angle $\varphi$ between $y$ and $\hat y = X\hat\beta_{OLS}$. [Figure: the vector $y$, its projection $\hat y = \hat\beta_1 x_1 + \hat\beta_2 x_2$ onto the plane spanned by $x_1$ and $x_2$, the residual $e$, and the angle $\varphi$ between $y$ and $\hat y$.]
13 OLS estimator. Adjusted $R^2$. Increasing the number of right-hand-side variates can only decrease the sum of squared residuals, so it is desirable to penalize the measure of fit. The adjusted $R^2$, written $\bar R^2$, is defined by
$$1 - \bar R^2 = \frac{e'e/(N-K)}{y'y/(N-1)} = \frac{N-1}{N-K}(1 - R^2),$$
or
$$\bar R^2 = \frac{1-K}{N-K} + \frac{N-1}{N-K}R^2.$$
It is possible for the adjusted $R^2$ to be negative.
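The two forms of the definition can be checked in a few lines. Note that the slides use the uncentered total $y'y$; most software instead centers about $\bar y$, so these values will differ from, say, R's `summary.lm` output. A sketch with simulated data (my own assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 40, 4
X = np.column_stack([np.ones(N), rng.normal(size=(N, 3))])
y = X @ np.array([1.0, 0.5, -0.2, 0.0]) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat

R2 = 1.0 - (e @ e) / (y @ y)                          # slide's (uncentered) R^2
R2_adj = 1.0 - (e @ e / (N - K)) / (y @ y / (N - 1))  # definitional form

# Algebraic identity from the slide: R2_adj = (1-K)/(N-K) + (N-1)/(N-K) * R2
R2_adj_check = (1 - K) / (N - K) + (N - 1) / (N - K) * R2
print(R2, R2_adj, R2_adj_check)
```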
14 OLS estimator. What is a good value for $R^2$?
15 OLS estimator. Prediction intervals. Let $y^* = x^{*\prime}\beta + \varepsilon^*$ and $\hat y^* = x^{*\prime}\hat\beta_{OLS}$. But what is the confidence interval for $y^*$?
$$\hat y^* - y^* = x^{*\prime}\hat\beta_{OLS} - x^{*\prime}\beta - \varepsilon^* = x^{*\prime}(\hat\beta_{OLS} - \beta) - \varepsilon^*.$$
Therefore
$$\operatorname{Var}(\hat y^* - y^*) = \operatorname{Var}\bigl(x^{*\prime}(\hat\beta_{OLS} - \beta)\bigr) + \operatorname{Var}(\varepsilon^*) = \sigma^2\bigl(x^{*\prime}(X'X)^{-1}x^* + 1\bigr).$$
16 OLS estimator. Confidence intervals. Under the normality hypothesis,
$$\frac{x^{*\prime}\hat\beta_{OLS} - y^*}{\sqrt{\sigma^2\bigl(x^{*\prime}(X'X)^{-1}x^* + 1\bigr)}} \Bigg/ \sqrt{\frac{(N-K)s^2}{\sigma^2}\Big/(N-K)} = \frac{x^{*\prime}\hat\beta_{OLS} - y^*}{s\sqrt{x^{*\prime}(X'X)^{-1}x^* + 1}} \sim t(N-K).$$
Thus a $(1-\alpha)$ confidence interval for $y^*$ is
$$\Bigl[\hat y^* - t_{\alpha/2,\,N-K}\, s\sqrt{x^{*\prime}(X'X)^{-1}x^* + 1},\ \hat y^* + t_{\alpha/2,\,N-K}\, s\sqrt{x^{*\prime}(X'X)^{-1}x^* + 1}\Bigr].$$
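A sketch of this prediction interval; the new point $x^*$ and the simulated data are illustrative assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
N, K = 60, 2
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=N)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e = y - X @ beta_hat
s = np.sqrt(e @ e / (N - K))             # estimate of sigma

x_star = np.array([1.0, 0.3])            # hypothetical new observation
y_star_hat = x_star @ beta_hat

# 95% prediction interval: the "+ 1" accounts for the new error term
half = stats.t.ppf(0.975, N - K) * s * np.sqrt(x_star @ XtX_inv @ x_star + 1.0)
lo, hi = y_star_hat - half, y_star_hat + half
print(lo, hi)
```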
17 Restricted regression. The Lagrange Multiplier Theorem. If $x^*$ minimizes $f(x)$ subject to $g_i(x) = 0$ $(i = 1, \dots, m)$, and if $\nabla g_1(x^*), \dots, \nabla g_m(x^*)$ are linearly independent, then there exist Lagrange multipliers $\lambda_i$, $i = 1, \dots, m$, such that
$$\nabla f(x^*) + \lambda_1 \nabla g_1(x^*) + \dots + \lambda_m \nabla g_m(x^*) = 0.$$
18 Restricted regression. Restricted OLS. To minimize $(y - Xb)'(y - Xb)$ subject to the constraint $Ab = a$ (where $A$ is $q \times K$), the LMT tells us to form the Lagrangean
$$(y - Xb)'(y - Xb) + \lambda'(Ab - a)$$
and solve the first-order condition
$$-2X'y + 2X'Xb + A'\lambda = 0. \tag{2}$$
19 Restricted regression. Solving the FOC. Premultiply (2) by $A(X'X)^{-1}$:
$$-2A(X'X)^{-1}X'y + 2\underbrace{A(X'X)^{-1}(X'X)b}_{=Ab=a} + A(X'X)^{-1}A'\lambda = 0,$$
so, solving for $\lambda$,
$$\lambda = -2\bigl[A(X'X)^{-1}A'\bigr]^{-1}\bigl[a - A(X'X)^{-1}X'y\bigr].$$
Substitute this into (2) to get
$$-X'y + X'Xb - A'\bigl[A(X'X)^{-1}A'\bigr]^{-1}\bigl[a - A(X'X)^{-1}X'y\bigr] = 0,$$
which after premultiplying by $(X'X)^{-1}$, with some work, simplifies to
$$b = \hat\beta_{OLS} + (X'X)^{-1}A'\bigl[A(X'X)^{-1}A'\bigr]^{-1}\bigl(a - A\hat\beta_{OLS}\bigr).$$
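The closed form for the restricted estimator can be sanity-checked against a reparametrized regression. In this sketch the restriction $\beta_2 = \beta_3$ is my own illustrative choice; imposing it is equivalent to regressing $y$ on the constant and the sum of the two regressors:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 50
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
y = X @ np.array([1.0, 1.2, 0.8]) + rng.normal(size=N)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                    # unrestricted OLS

A = np.array([[0.0, 1.0, -1.0]])                # one restriction: beta_2 = beta_3
a = np.array([0.0])

# Restricted estimator from the slide's closed form
M = np.linalg.inv(A @ XtX_inv @ A.T)
b = beta_hat + XtX_inv @ A.T @ M @ (a - A @ beta_hat)

# Cross-check: same fit as regressing y on [1, x2 + x3]
Z = np.column_stack([X[:, 0], X[:, 1] + X[:, 2]])
g = np.linalg.lstsq(Z, y, rcond=None)[0]
print(b, g)
```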
20 Restricted regression. Restricted residuals. Let $e_r = y - Xb$ be the vector of residuals from the restricted regression. It can be shown that
$$e_r'e_r = e_u'e_u + (a - A\hat\beta_{OLS})'\bigl[A(X'X)^{-1}A'\bigr]^{-1}(a - A\hat\beta_{OLS}),$$
where $e_u'e_u$ is the sum of squared residuals from the unrestricted OLS regression. Thus
$$e_r'e_r - e_u'e_u = (a - A\hat\beta_{OLS})'\bigl[A(X'X)^{-1}A'\bigr]^{-1}(a - A\hat\beta_{OLS})$$
is a quadratic form in the $q$ variables $a - A\hat\beta_{OLS}$.
21 Restricted regression. Testing a linear restriction $H_0 : A\beta = a$. Let $e_u$ and $e_r$ be the vectors of residuals from the unrestricted and restricted regressions. Then under the null hypothesis,
$$F = \frac{(e_r'e_r - e_u'e_u)/q}{e_u'e_u/(N-K)}$$
has an F-distribution with $(q, N-K)$ degrees of freedom. The null hypothesis should be rejected if $F \ge F_{1-\alpha,\,q,\,N-K}$.
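The residual-comparison form of $F$ on this slide and the quadratic form in $a - A\hat\beta_{OLS}$ agree numerically, as the restricted-residuals identity says they must. A sketch with an illustrative restriction and simulated data (my own assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
N, K, q = 70, 3, 1
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
y = X @ np.array([0.5, 1.0, 1.0]) + rng.normal(size=N)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e_u = y - X @ beta_hat                          # unrestricted residuals

A = np.array([[0.0, 1.0, -1.0]])                # H0: beta_2 = beta_3 (true here)
a = np.array([0.0])
M = np.linalg.inv(A @ XtX_inv @ A.T)
b = beta_hat + XtX_inv @ A.T @ M @ (a - A @ beta_hat)
e_r = y - X @ b                                 # restricted residuals

# F via the residual comparison
F = ((e_r @ e_r - e_u @ e_u) / q) / (e_u @ e_u / (N - K))

# Same statistic via the quadratic form in (a - A beta_hat)
d = a - A @ beta_hat
F_wald = (d @ M @ d) / q / (e_u @ e_u / (N - K))
print(F, F_wald)
```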
22 Restricted regression. [Figure: restricted regression with the restriction $\beta_1 = 1$. The response $y$ is projected onto the column space of $X$ (giving the unrestricted fit $\hat y_u$, with residual $e_u$) and onto the affine set $\{Xb : Ab = a\}$ (giving the restricted fit $\hat y_r$, with residual $e_r$). The points $\hat y_r$, $\hat y_u$, $y$ form a right triangle with hypotenuse $\hat y_r y$.]
23 Restricted regression. F-tests and t-tests may seem to conflict!
24 Errors in variables. Measurement error. True model: $y = X\beta + \varepsilon$, but we observe $\tilde X = X + V$. So the estimated model is
$$y = \tilde X\beta + \eta, \tag{3}$$
where $\eta = \varepsilon - V\beta$. The OLS estimate derived from (3) is
$$\hat\beta = \beta + (\tilde X'\tilde X)^{-1}\tilde X'(\varepsilon - V\beta).$$
Its expectation is $E\hat\beta = \beta - E\bigl[(\tilde X'\tilde X)^{-1}\tilde X'V\bigr]\beta$, so the estimator is, in general, neither unbiased nor consistent.
25 ANOVA. Some jargon. According to Larsen and Marx, the word factor is used to denote any treatment or therapy applied to the subjects being measured, or any relevant feature (age, sex, ethnicity, etc.) characteristic of those subjects. Different versions, extents, or aspects of a factor are referred to as levels. Sometimes subjects or environments share certain characteristics that affect the way levels of a factor respond, yet those characteristics are of no intrinsic interest to the experimenter. Any such set of conditions or subjects is called a block.
26 ANOVA. ANOVA is an acronym for ANalysis Of VAriance.
27 ANOVA. Model equations: one factor with $k$ levels. $Y_{ij}$ is the $i$-th measurement of the response at factor level $j$, with $n_j$ observations at level $j$:
$$Y_{ij} = \mu_j + \varepsilon_{ij}, \qquad (i = 1, \dots, n_j;\ j = 1, \dots, k).$$
Here $n = n_1 + \dots + n_k$ is the total number of observations, the $\varepsilon_{ij}$ are assumed to be independent with common mean zero and common variance $\sigma^2$, and $\mu_j$ is just the expected value of the response at level $j$.
28 ANOVA. ANOVA is a special case of the Standard Linear Model: $X_j$ is a dummy variable, or indicator, for the $j$-th level, and stacking the observations by level,
$$\begin{pmatrix} y_{11} \\ \vdots \\ y_{n_1 1} \\ y_{12} \\ \vdots \\ y_{n_2 2} \\ \vdots \\ y_{1k} \\ \vdots \\ y_{n_k k} \end{pmatrix} = X \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_k \end{pmatrix} + \begin{pmatrix} \varepsilon_{11} \\ \vdots \\ \varepsilon_{n_1 1} \\ \varepsilon_{12} \\ \vdots \\ \varepsilon_{n_2 2} \\ \vdots \\ \varepsilon_{1k} \\ \vdots \\ \varepsilon_{n_k k} \end{pmatrix}.$$
29-33 ANOVA. OLS and ANOVA. With the dummy-variable design matrix,
$$X'X = \begin{pmatrix} n_1 & 0 & \cdots & 0 \\ 0 & n_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & n_k \end{pmatrix}, \qquad X'y = \begin{pmatrix} \sum_{i=1}^{n_1} y_{i1} \\ \sum_{i=1}^{n_2} y_{i2} \\ \vdots \\ \sum_{i=1}^{n_k} y_{ik} \end{pmatrix},$$
so
$$(X'X)^{-1}X'y = \begin{pmatrix} \frac{1}{n_1}\sum_{i=1}^{n_1} y_{i1} \\ \frac{1}{n_2}\sum_{i=1}^{n_2} y_{i2} \\ \vdots \\ \frac{1}{n_k}\sum_{i=1}^{n_k} y_{ik} \end{pmatrix}:$$
the OLS estimate of $\mu_j$ is the sample mean of the responses at level $j$.
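The claim that OLS on the dummy design recovers the level means can be verified directly. In this sketch the group sizes and means are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(7)
n_j = [5, 8, 6]                                   # observations at each of k = 3 levels
k = len(n_j)
levels = np.repeat(np.arange(k), n_j)             # level label for each observation
y = np.concatenate([rng.normal(mu, 1.0, n) for mu, n in zip([1.0, 2.0, 3.0], n_j)])

# Dummy-variable design matrix: one indicator column per level
X = (levels[:, None] == np.arange(k)).astype(float)

mu_hat = np.linalg.solve(X.T @ X, X.T @ y)        # OLS estimates of mu_1..mu_k
group_means = np.array([y[levels == j].mean() for j in range(k)])
print(mu_hat, group_means)
```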
34 The F test in an ANOVA framework. Hypothesis testing in ANOVA. The most common hypothesis is $H_0 : \mu_1 = \dots = \mu_k$, against the alternative $H_1$: not all the $\mu_j$'s are equal.
35 The F test in an ANOVA framework. A little more jargon. $y_{ij}$ is the response of the $i$-th observation at level $j$; $T_j = \sum_{i=1}^{n_j} y_{ij}$ is the response total at level $j$; $\bar Y_j = T_j/n_j$ is the sample mean at level $j$; $T = \sum_{j=1}^k \sum_{i=1}^{n_j} y_{ij} = \sum_{j=1}^k T_j$ is the sample overall total response; and $\bar Y = \frac{1}{n}\sum_{j=1}^k \sum_{i=1}^{n_j} y_{ij} = \frac{1}{n}\sum_{j=1}^k T_j$ is the sample overall average response.
36 The F test in an ANOVA framework. The fundamental identity. For any list $x_1, \dots, x_n$,
$$\sum_{i=1}^n x_i^2 = \sum_{i=1}^n (x_i - \bar x)^2 + n\bar x^2,$$
since
$$\sum_{i=1}^n (x_i - \bar x)^2 = \sum_{i=1}^n (x_i^2 - 2x_i\bar x + \bar x^2) = \sum_{i=1}^n x_i^2 - 2\bar x \sum_{i=1}^n x_i + n\bar x^2 = \sum_{i=1}^n x_i^2 - n\bar x^2.$$
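A one-line numerical check of the fundamental identity (the data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.normal(size=25)
xbar = x.mean()
n = x.size

# sum x_i^2 = sum (x_i - xbar)^2 + n * xbar^2
lhs = (x ** 2).sum()
rhs = ((x - xbar) ** 2).sum() + n * xbar ** 2
print(lhs, rhs)
```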
37 The F test in an ANOVA framework. The treatment sum of squares SSTR is defined to be
$$SSTR = \sum_{j=1}^k n_j (\bar Y_j - \bar Y)^2.$$
It is not hard to show, using the fundamental identity and other tricks (see L&M Theorem 12.2.1), that
$$E(SSTR) = (k-1)\sigma^2 + \sum_{j=1}^k n_j(\mu_j - \bar\mu)^2, \tag{4}$$
where $\bar\mu = \sum_{j=1}^k \frac{n_j}{n}\mu_j$ is the overall average of the (unobserved) $\mu_j$'s. That is, a large value of SSTR relative to $(k-1)\sigma^2$ indicates that the null hypothesis $H_0 : \mu_1 = \dots = \mu_k = \mu$ should be rejected.
38 The F test in an ANOVA framework. Estimating $\sigma^2$. Start by defining
$$s_j^2 = \frac{\sum_{i=1}^{n_j} (Y_{ij} - \bar Y_j)^2}{n_j - 1}$$
and aggregating:
$$SSE = \sum_{j=1}^k (n_j - 1)s_j^2 = \sum_{j=1}^k \sum_{i=1}^{n_j} (y_{ij} - \bar y_j)^2, \tag{5}$$
which is called the error sum of squares. The important facts about these are that $SSE/\sigma^2 \sim \chi^2(n-k)$ and that SSE and SSTR are stochastically independent (L&M, Theorem 12.2.3, p. 600).
39 The F test in an ANOVA framework. Under the null $H_0 : \mu_1 = \dots = \mu_k = \mu$, $SSTR/\sigma^2 \sim \chi^2(k-1)$. Therefore, under the null,
$$F = \frac{SSTR/(k-1)}{SSE/(n-k)} \sim F_{k-1,\,n-k}.$$
40 The F test in an ANOVA framework. The F-test. At the $\alpha$ level of significance, reject $H_0 : \mu_1 = \dots = \mu_k$ if
$$\frac{SSTR/(k-1)}{SSE/(n-k)} \ge F_{1-\alpha,\,k-1,\,n-k}.$$
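Computing SSTR and SSE by hand reproduces scipy's one-way ANOVA. A minimal example with simulated groups (the sizes and means are my own choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
groups = [rng.normal(0.0, 1.0, 10),
          rng.normal(0.5, 1.0, 12),
          rng.normal(1.0, 1.0, 9)]
k = len(groups)
n = sum(g.size for g in groups)
ybar = np.concatenate(groups).mean()              # overall sample mean

# Treatment and error sums of squares, as defined on the slides
SSTR = sum(g.size * (g.mean() - ybar) ** 2 for g in groups)
SSE = sum(((g - g.mean()) ** 2).sum() for g in groups)

F = (SSTR / (k - 1)) / (SSE / (n - k))
p = stats.f.sf(F, k - 1, n - k)
print(F, p)
```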
41 The F test in an ANOVA framework. ANOVA tables. The traditional way to present ANOVA data is in the form of a table like this:

Source      df     SS      MS            F                           P
Treatment   k-1    SSTR    SSTR/(k-1)    [SSTR/(k-1)]/[SSE/(n-k)]    P(F_{k-1,n-k} >= F)
Error       n-k    SSE     SSE/(n-k)
Total       n-1    SSTOT

Two more terms: the mean square for treatments is MSTR = SSTR/(k-1), and the mean square for errors is MSE = SSE/(n-k); under the null, F = MSTR/MSE has the $F_{k-1,\,n-k}$ distribution.
42 The F test in an ANOVA framework. Contrasts. A linear combination of the form $C = w'\mu$, where $w'\mathbf{1} = 0$ (the weights sum to zero), is called a contrast. A typical contrast uses a vector of the form
$$w = (0, \dots, 0, \underbrace{1}_{j}, 0, \dots, 0, \underbrace{-1}_{j'}, 0, \dots, 0),$$
so $C = w'\mu = \mu_j - \mu_{j'}$. Then the hypothesis $H_0 : C = 0$ amounts to $H_0 : \mu_j = \mu_{j'}$. This is probably why it is called a contrast.
43 The F test in an ANOVA framework. To test the hypothesis that $C = 0$, we weight the sample means:
$$\hat C = \sum_{j=1}^k w_j \bar Y_j.$$
Then $E\hat C = C$ and $\operatorname{Var}\hat C = \sigma^2 \sum_{j=1}^k \frac{w_j^2}{n_j}$. Define
$$SS_C = \frac{\hat C^2}{\sum_{j=1}^k w_j^2/n_j}.$$
44 The F test in an ANOVA framework. F test of a contrast. The test statistic
$$F = \frac{SS_C}{SSE/(n-k)}$$
has an F-distribution with $(1, n-k)$ degrees of freedom. The null hypothesis $H_0 : w'\mu = 0$ should be rejected if $F \ge F_{1-\alpha,\,1,\,n-k}$.
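A sketch of the contrast F test; the groups, their sizes, and the contrast vector below are illustrative assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
groups = [rng.normal(1.0, 1.0, 8),
          rng.normal(1.0, 1.0, 10),
          rng.normal(2.0, 1.0, 7)]
k = len(groups)
n = sum(g.size for g in groups)
n_j = np.array([g.size for g in groups])
ybar_j = np.array([g.mean() for g in groups])

w = np.array([1.0, -1.0, 0.0])        # contrast mu_1 - mu_2; weights sum to zero
C_hat = w @ ybar_j                    # estimated contrast
SS_C = C_hat ** 2 / (w ** 2 / n_j).sum()

SSE = sum(((g - g.mean()) ** 2).sum() for g in groups)
F = SS_C / (SSE / (n - k))            # F with (1, n-k) degrees of freedom
p = stats.f.sf(F, 1, n - k)
print(F, p)
```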
The Standard Linear Model: Hypothesis Testing. Department of Mathematics, Ma 3/103: Introduction to Probability and Statistics, Winter 2017, KC Border. Lecture 25: The Standard Linear Model: Hypothesis Testing. Relevant textbook passages: Larsen and Marx [4].
More informationAdvanced Quantitative Methods: ordinary least squares
Advanced Quantitative Methods: Ordinary Least Squares University College Dublin 31 January 2012 1 2 3 4 5 Terminology y is the dependent variable referred to also (by Greene) as a regressand X are the
More informationAnswers to Problem Set #4
Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2
More informationAdvanced Econometrics I
Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationAMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression
AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number
More informationChapter 4. Regression Models. Learning Objectives
Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing
More informationSTAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)
STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points
More informationNATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST4233 Linear Models: Solutions. (Semester I: ) November/December, 2007 Time Allowed : 2 Hours
NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION Linear Models: Solutions (Semester I: 2007 2008) November/December, 2007 Time Allowed : 2 Hours Matriculation No: Grade Table Problem 1 2 3 4 Total Full marks
More informationFormal Statement of Simple Linear Regression Model
Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor
More informationInference for Regression Simple Linear Regression
Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating
More informationy ˆ i = ˆ " T u i ( i th fitted value or i th fit)
1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u
More informationSTAT5044: Regression and Anova. Inyoung Kim
STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:
More informationEconometrics Multiple Regression Analysis: Heteroskedasticity
Econometrics Multiple Regression Analysis: João Valle e Azevedo Faculdade de Economia Universidade Nova de Lisboa Spring Semester João Valle e Azevedo (FEUNL) Econometrics Lisbon, April 2011 1 / 19 Properties
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationSimple linear regression
Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single
More informationANOVA (Analysis of Variance) output RLS 11/20/2016
ANOVA (Analysis of Variance) output RLS 11/20/2016 1. Analysis of Variance (ANOVA) The goal of ANOVA is to see if the variation in the data can explain enough to see if there are differences in the means.
More informationWISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A
WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This
More informationThe regression model with one fixed regressor cont d
The regression model with one fixed regressor cont d 3150/4150 Lecture 4 Ragnar Nymoen 27 January 2012 The model with transformed variables Regression with transformed variables I References HGL Ch 2.8
More informationLECTURE 5 HYPOTHESIS TESTING
October 25, 2016 LECTURE 5 HYPOTHESIS TESTING Basic concepts In this lecture we continue to discuss the normal classical linear regression defined by Assumptions A1-A5. Let θ Θ R d be a parameter of interest.
More informationGreene, Econometric Analysis (7th ed, 2012)
EC771: Econometrics, Spring 2012 Greene, Econometric Analysis (7th ed, 2012) Chapters 2 3: Classical Linear Regression The classical linear regression model is the single most useful tool in econometrics.
More informationChapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression
Chapter 14 Student Lecture Notes 14-1 Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Multiple Regression QMIS 0 Dr. Mohammad Zainal Chapter Goals After completing
More informationSSR = The sum of squared errors measures how much Y varies around the regression line n. It happily turns out that SSR + SSE = SSTO.
Analysis of variance approach to regression If x is useless, i.e. β 1 = 0, then E(Y i ) = β 0. In this case β 0 is estimated by Ȳ. The ith deviation about this grand mean can be written: deviation about
More information