Chapter 8 Conclusion
- Richard Jacobs
Three questions about test scores (score) and the student-teacher ratio (str):

a) After controlling for differences in the economic characteristics of different districts, does the effect of str on score depend on the fraction of English learners (pctel)?
b) Does this effect depend on the level of str? (Is there a non-linear relationship?)
c) After taking economic factors and nonlinearities into account, what is the estimated effect on score of reducing str?
> teachdata = read.csv("
> attach(teachdata)
> head(teachdata)
  sublunch score str avginc pctel
An economics study should always include a description of the data:

sublunch   percentage of students qualifying for reduced-price lunch
score      average test score
str        student-teacher ratio
avginc     district average income (in $1000s)
pctel      percentage of English learners

It is also common to provide descriptive statistics for the variables.

The variable of interest is str (the policy variable). Two variables measure the economic background of students: sublunch and avginc. pctel is also important because of omitted variable bias (O.V.B.).
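The descriptive statistics mentioned above can be produced in a few lines. The following is a minimal sketch only: the data frame is a synthetic stand-in (the path to the real CSV is not reproduced in these notes), so the numbers it prints are not the lecture's. The name `desc` and the simulated distributions are assumptions made for illustration.

```r
# Synthetic stand-in for the district data; NOT the real data set.
set.seed(1)
n <- 420
teachdata <- data.frame(
  sublunch = runif(n, 0, 100),
  score    = rnorm(n, mean = 654, sd = 19),
  str      = rnorm(n, mean = 19.6, sd = 1.9),
  avginc   = runif(n, 5, 55),
  pctel    = runif(n, 0, 85)
)

# One row per variable: mean, standard deviation, minimum, maximum.
desc <- t(sapply(teachdata, function(x)
  c(mean = mean(x), sd = sd(x), min = min(x), max = max(x))))
print(round(desc, 2))
```

With the real data, `summary(teachdata)` gives a similar overview, including quartiles.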
In a previous lecture, it was argued that avginc might have a non-linear relationship with score:

> plot(avginc, score, xlim = c(5,60), ylim = c(600,710))

[Figure: scatterplot of score against avginc]
What are some ways we can deal with this?

(i) Polynomials:

> avginc2 = avginc^2
> avginc3 = avginc^3
> eqcubic = lm(score ~ avginc + avginc2 + avginc3)
> summary(eqcubic)

Call:
lm(formula = score ~ avginc + avginc2 + avginc3)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.001e   e                  < 2e-16 ***
avginc      5.019e   e                  e-08 ***
avginc2     e        e                  *
avginc3     e        e
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 416 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 3 and 416 DF, p-value: < 2.2e-16
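The polynomial terms do not have to be stored as separate variables: wrapping them in `I()` inside the formula gives the same fit. This is a sketch on synthetic data (the simulated `score` and `avginc` are assumptions, not the lecture data), intended only to show that the two ways of writing the cubic agree.

```r
# Synthetic income and score; NOT the real data set.
set.seed(2)
n <- 420
avginc <- runif(n, 5, 55)
score  <- 600 + 25 * log(avginc) + rnorm(n, sd = 10)

# Style used in the notes: build the powers first.
avginc2 <- avginc^2
avginc3 <- avginc^3
eq_manual <- lm(score ~ avginc + avginc2 + avginc3)

# Equivalent: let the formula build the powers via I().
eq_inline <- lm(score ~ avginc + I(avginc^2) + I(avginc^3))

# Same estimates, only the coefficient names differ.
all.equal(unname(coef(eq_manual)), unname(coef(eq_inline)))
```

The inline form has the practical advantage that `predict()` can rebuild the powers from a new `avginc` value on its own.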
Let's plot the cubic regression function:

> par(new = TRUE)
> curve( *x *x^ *x^3, xlim = c(5,60), ylim = c(600,710), ylab = "", xlab = "", col = 2)

[Figure: scatterplot of score against avginc with the fitted cubic curve]
(ii) Logarithms:

> eqlog = lm(score ~ log(avginc))
> summary(eqlog)

Call:
lm(formula = score ~ log(avginc))

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                             <2e-16 ***
log(avginc)                             <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 418 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 1 and 418 DF, p-value: < 2.2e-16

Add this regression to the plot:
> par(new = TRUE)
> curve( *log(x), xlim = c(5,60), ylim = c(600,710), ylab = "", xlab = "", col = 3)
> legend("bottomright", c("cubic", "Lin-Log"), pch =" ", col=c(2,3))

[Figure: score against avginc with the fitted cubic and lin-log curves]
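Instead of typing estimated coefficients into `curve()`, the fitted curves can be computed with `predict()` on a grid of income values. The sketch below uses synthetic data (an assumption, since the real file is not available here); the object name `grid` is chosen for illustration.

```r
# Synthetic income and score; NOT the real data set.
set.seed(3)
n <- 420
avginc <- runif(n, 5, 55)
score  <- 560 + 36 * log(avginc) + rnorm(n, sd = 12)

eqcubic <- lm(score ~ avginc + I(avginc^2) + I(avginc^3))
eqlog   <- lm(score ~ log(avginc))

# Evaluate both fitted regression functions on the plotting range.
grid <- data.frame(avginc = seq(5, 60, by = 0.5))
grid$cubic  <- predict(eqcubic, newdata = grid)
grid$linlog <- predict(eqlog,   newdata = grid)

# To draw them over the scatterplot:
#   plot(avginc, score, xlim = c(5, 60))
#   lines(grid$avginc, grid$cubic,  col = 2)
#   lines(grid$avginc, grid$linlog, col = 3)
head(grid)
```

This avoids copying rounded coefficients by hand and keeps the plot in step with the model if the specification changes.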
Do you like the cubic or the lin-log model better? What are the advantages and disadvantages of each? Does heteroskedasticity appear to be present?

We will proceed by using log(avginc). But first, to review omitted variable bias, let's see what happens if we leave log(avginc) out of the regression.

> eq1 = lm(score ~ str + pctel + sublunch)
> summary(eq1)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                             < 2e-16 ***
str                                     e-05 ***
pctel                                   ***
sublunch                                < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.08 on 416 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 3 and 416 DF, p-value: < 2.2e-16
Now add log(avginc):

> eq2 = lm(score ~ str + pctel + sublunch + log(avginc))
> summary(eq2)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                             < 2e-16 ***
str                                     **
pctel                                   e-08 ***
sublunch                                < 2e-16 ***
log(avginc)                             e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 415 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 4 and 415 DF, p-value: < 2.2e-16

How have the results changed? What is going on here?
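As a heuristic for the question above, the standard textbook expression for omitted variable bias with a single regressor may help. It is a sketch only: the notation ($\rho_{Xu}$, $\sigma_u$, $\sigma_X$) is the usual textbook notation rather than anything defined in these notes, and the multiple-regression case is analogous but not identical.

```latex
\[
  \hat{\beta}_1 \;\xrightarrow{\;p\;}\; \beta_1 + \rho_{Xu}\,\frac{\sigma_u}{\sigma_X}
\]
```

Here $X$ is str and $u$ contains everything left out of the regression. If log(avginc) is omitted, it sits in $u$; if richer districts tend to have both higher scores and lower str, then $\rho_{Xu} < 0$ and the estimated coefficient on str is pushed downward. That would be consistent with the str estimate being more negative when log(avginc) is left out than when it is included.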
Regressor    (1)               (2)               (3)               (4)               (5)               (6)               (7)
str          -1.00** (0.24)    -0.73** (0.23)
str^2
str^3
pctel        ** (0.033)        ** (0.032)
hiel
hiel*str
hiel*str^2
hiel*str^3
sublunch     ** (0.022)        ** (0.030)
log(avginc)                    11.57** (1.74)
Intercept    700.2** (4.7)     658.6** (7.7)
R^2
Let's address (a): After controlling for differences in the economic characteristics of different districts, does the effect of str on score depend on the fraction of English learners (pctel)?

An easier way to examine this might be to create a dummy variable. Let's define a new variable, hiel (high percentage of English learners):

hiel = 0 for classes with a small percentage of English learners
hiel = 1 for classes with a large percentage of English learners

How should we determine the threshold?

> summary(pctel)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
Create hiel:

hiel = rep(0, length(pctel))
hiel[pctel >= 10] = 1

To address (a), create the interaction term:

hielstr = hiel*str
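A vectorised comparison builds the dummy in one step and does not depend on how the vector was initialised. The sketch below uses synthetic `pctel` and `str` values (assumptions, not the lecture data) and the 10 percent threshold from the notes.

```r
# Synthetic stand-ins; NOT the real data set.
set.seed(4)
pctel <- runif(420, 0, 85)
str   <- rnorm(420, mean = 19.6, sd = 1.9)

# 1 if at least 10% of students are English learners, 0 otherwise.
hiel <- as.numeric(pctel >= 10)

# Interaction term: equals str where hiel is 1 and 0 elsewhere.
hielstr <- hiel * str

# Quick check of how many observations fall in each group.
table(hiel)
```

Checking `table(hiel)` and `anyNA(hiel)` after creating a dummy is a cheap way to catch indexing mistakes before they silently drop observations from a regression.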
Try a regression without economic controls:

> eq3 = lm(score ~ str + hiel + hielstr)
> summary(eq3)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                             <2e-16 ***
str
hiel
hielstr
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 416 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: 62.4 on 3 and 416 DF, p-value: < 2.2e-16

Which coefficient should we be testing to see if str has a different effect for classes with many English learners? What do we conclude?

In anticipation of (c), let's test whether str matters. Does it appear to matter from the results above?
H_0: the student-teacher ratio has no effect on test scores
H_A: model (3)

The model under the null hypothesis is:

> eqnul1 = lm(score ~ hiel)
> summary(eqnul1)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                             <2e-16 ***
hiel                                    <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 418 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 1 and 418 DF, p-value: < 2.2e-16
Formula for the F-statistic:

\[
  F = \frac{(R_U^2 - R_R^2)/q}{(1 - R_U^2)/(n - k_U - 1)} = 7.57
\]

where U denotes the unrestricted model (3), R the restricted model, and q = 2 is the number of restrictions (the coefficients on str and hielstr).

Since this is greater than the 5% critical value of 3.00, we reject the null. Alternatively, use the following R code to perform the test:

> anova(eq3,eqnul1)
Analysis of Variance Table

Model 1: score ~ str + hiel + hielstr
Model 2: score ~ hiel
  Res.Df RSS Df Sum of Sq F Pr(>F)
                                 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
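The R-squared form of the F-statistic and the value reported by `anova()` should agree, because both models share the same total sum of squares. The sketch below checks this on synthetic data (an assumption; the numbers are not those of the lecture). The names `F_manual` and `F_anova` are illustrative.

```r
# Synthetic stand-ins; NOT the real data set.
set.seed(5)
n <- 420
str   <- rnorm(n, mean = 19.6, sd = 1.9)
hiel  <- rbinom(n, 1, 0.5)
score <- 680 - 1.0 * str - 18 * hiel + rnorm(n, sd = 15)
hielstr <- hiel * str

eq3    <- lm(score ~ str + hiel + hielstr)   # unrestricted
eqnul1 <- lm(score ~ hiel)                   # restricted: str has no effect

r2_u <- summary(eq3)$r.squared
r2_r <- summary(eqnul1)$r.squared
q    <- 2   # restrictions: coefficients on str and hielstr
k_u  <- 3   # regressors in the unrestricted model

# Homoskedasticity-only F-statistic from the two R-squared values.
F_manual <- ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k_u - 1))

# Same statistic from the nested-model comparison.
F_anova <- anova(eqnul1, eq3)$F[2]

c(manual = F_manual, anova = F_anova)
```

Note that this is the homoskedasticity-only F-statistic; if heteroskedasticity is a concern, a robust version of the test would be needed.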
Let's try a model with economic controls.

> eq4 = lm(score ~ str + hiel + hielstr + sublunch + log(avginc))
> summary(eq4)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                             < 2e-16 ***
str
hiel
hielstr
sublunch                                < 2e-16 ***
log(avginc)                             e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 414 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 5 and 414 DF, p-value: < 2.2e-16

Has the conclusion (about a different effect for classes with many English learners) changed?
Again, let's test the null that str doesn't matter. Restricted model:

> eqnul2 = lm(score ~ hiel + sublunch + log(avginc))
> anova(eq4,eqnul2)
Analysis of Variance Table

Model 1: score ~ str + hiel + hielstr + sublunch + log(avginc)
Model 2: score ~ hiel + sublunch + log(avginc)
  Res.Df RSS Df Sum of Sq F Pr(>F)
                                 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Regressor    (1)               (2)               (3)               (4)               (5)               (6)               (7)
str          -1.00** (0.24)    -0.73** (0.23)    (0.54)            (0.30)
str^2
str^3
pctel        ** (0.033)        ** (0.032)
hiel                                             5.64 (16.7)       5.50 (9.1)
hiel*str                                         (0.84)            (0.47)
hiel*str^2
hiel*str^3
sublunch     ** (0.022)        ** (0.030)                          ** (0.029)
log(avginc)                    11.57** (1.74)                      12.12** (1.8)
Intercept    700.2** (4.7)     658.6** (7.7)     682.2** (10.5)    653.7** (8.9)
R^2
Now let's address (b): is the relationship between str and score non-linear?

> str2 = str^2
> str3 = str^3
> eq5 = lm(score ~ str + str2 + str3 + hiel + sublunch + log(avginc))
> summary(eq5)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)
str                                     *
str2                                    **
str3                                    **
hiel                                    e-07 ***
sublunch                                < 2e-16 ***
log(avginc)                             e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 413 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 6 and 413 DF, p-value: < 2.2e-16
Regressor    (1)               (2)               (3)               (4)               (5)               (6)               (7)
str          -1.00** (0.24)    -0.73** (0.23)    (0.54)            (0.30)            64.33** (25.5)
str^2                                                                                ** (1.29)
str^3                                                                                ** (0.022)
pctel        ** (0.033)        ** (0.032)
hiel                                             5.64 (16.7)       5.50 (9.1)        -5.47** (1.03)
hiel*str                                         (0.84)            (0.47)
hiel*str^2
hiel*str^3
sublunch     ** (0.022)        ** (0.030)                          ** (0.029)        ** (0.028)
log(avginc)                    11.57** (1.74)                      12.12** (1.8)     11.75** (1.7)
Intercept    700.2** (4.7)     658.6** (7.7)     682.2** (10.5)    653.7** (8.9)     (165.8)
R^2
To test the null hypothesis that the relationship between str and score is linear, estimate a restricted model that keeps str but drops str2 and str3, and compare it to model (5):

> eqnul3 = lm(score ~ str + hiel + sublunch + log(avginc))
> anova(eq5,eqnul3)
Analysis of Variance Table

Model 1: score ~ str + str2 + str3 + hiel + sublunch + log(avginc)
Model 2: score ~ str + hiel + sublunch + log(avginc)
  Res.Df RSS Df Sum of Sq F Pr(>F)
                                 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What do you conclude? What other way might you try to capture this non-linear effect? How would you test to see if str matters, using model (5)?
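Two different nulls can be tested against model (5), and they differ only in the restricted model. The sketch below, on synthetic data (an assumption; results will not match the lecture), contrasts them: dropping only the squared and cubed terms tests linearity (2 restrictions), while dropping all three str terms tests whether str matters at all (3 restrictions). The names `eq_linear` and `eq_nostr` are illustrative.

```r
# Synthetic stand-ins; NOT the real data set.
set.seed(6)
n <- 420
str      <- rnorm(n, mean = 19.6, sd = 1.9)
hiel     <- rbinom(n, 1, 0.5)
sublunch <- runif(n, 0, 100)
avginc   <- runif(n, 5, 55)
score <- 650 - 0.7 * str - 5 * hiel - 0.4 * sublunch +
  12 * log(avginc) + rnorm(n, sd = 9)

# Unrestricted cubic specification, as in model (5).
eq5 <- lm(score ~ str + I(str^2) + I(str^3) + hiel + sublunch + log(avginc))

# Null of linearity: keep str, drop the squared and cubed terms.
eq_linear <- lm(score ~ str + hiel + sublunch + log(avginc))
lin_test  <- anova(eq_linear, eq5)

# Null that str does not matter: drop all three str terms.
eq_nostr <- lm(score ~ hiel + sublunch + log(avginc))
str_test <- anova(eq_nostr, eq5)

# Number of restrictions in each test.
c(linearity = lin_test$Df[2], str_matters = str_test$Df[2])
```

The `Df` column of the `anova()` table is a quick sanity check that the restricted model imposes the intended number of restrictions.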
Let's reconsider (a) under the cubic specification. We want to know if the effect of str on score is different for classes with a high percentage of English learners. Again, the strategy is:

- have the dummy variable hiel interact with all terms involving str
- this allows the marginal effect to differ between the two groups
- testing whether the coefficients on the interaction terms are jointly equal to zero is equivalent to testing that there is no difference between the two groups

Create the new interaction terms:

hielstr2 = hiel*str2
hielstr3 = hiel*str3

Add the interaction terms to model (5):

eq6 = lm(score ~ str + str2 + str3 + hiel + hielstr + hielstr2 + hielstr3 + sublunch + log(avginc))
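The same specification can be written without creating the interaction variables by hand, using R's formula operators. This is a sketch on synthetic data (an assumption, not the lecture data); `eq6_manual` and `eq6_formula` are illustrative names. In a formula, `hiel * (a + b + c)` expands to the main effects plus every `hiel:` interaction.

```r
# Synthetic stand-ins; NOT the real data set.
set.seed(7)
n <- 420
str      <- rnorm(n, mean = 19.6, sd = 1.9)
hiel     <- rbinom(n, 1, 0.5)
sublunch <- runif(n, 0, 100)
avginc   <- runif(n, 5, 55)
score <- 650 - 0.7 * str - 5 * hiel - 0.4 * sublunch +
  12 * log(avginc) + rnorm(n, sd = 9)

# Hand-built terms, following the notes.
str2 <- str^2; str3 <- str^3
hielstr <- hiel * str; hielstr2 <- hiel * str2; hielstr3 <- hiel * str3
eq6_manual <- lm(score ~ str + str2 + str3 + hiel + hielstr + hielstr2 +
                   hielstr3 + sublunch + log(avginc))

# Formula version: hiel interacted with every str term.
eq6_formula <- lm(score ~ hiel * (str + I(str^2) + I(str^3)) +
                    sublunch + log(avginc))

# Both fits have 10 coefficients and the same residual sum of squares.
c(length(coef(eq6_manual)), length(coef(eq6_formula)))
c(deviance(eq6_manual), deviance(eq6_formula))
```

The coefficient order differs between the two fits, so compare by name or by fit statistics rather than by position.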
Regressor    (1)               (2)               (3)               (4)               (5)               (6)               (7)
str          -1.00** (0.24)    -0.73** (0.23)    (0.54)            (0.30)            64.33** (25.5)    83.70** (29.69)
str^2                                                                                ** (1.29)         -4.38** (1.51)
str^3                                                                                ** (0.022)        0.075** (0.025)
pctel        ** (0.033)        ** (0.032)
hiel                                             5.64 (16.7)       5.50 (9.1)        -5.47** (1.03)    816.1* (434.61)
hiel*str                                         (0.84)            (0.47)                              * (66.35)
hiel*str^2                                                                                             * (3.35)
hiel*str^3                                                                                             * (0.056)
sublunch     ** (0.022)        ** (0.030)                          ** (0.029)        ** (0.028)        ** (0.029)
log(avginc)                    11.57** (1.74)                      12.12** (1.8)     11.75** (1.7)     11.80** (1.75)
Intercept    700.2** (4.7)     658.6** (7.7)     682.2** (10.5)    653.7** (8.9)     (165.8)           (192.2)
R^2
How do we test (a) using model (6)?

> anova(eq6,eq5)
Analysis of Variance Table

Model 1: score ~ str + str2 + str3 + hiel + hielstr + hielstr2 + hielstr3 + sublunch + log(avginc)
Model 2: score ~ str + str2 + str3 + hiel + sublunch + log(avginc)
  Res.Df RSS Df Sum of Sq F Pr(>F)

So, once again, we can't reject the null that the effect of str on score is the same regardless of the number of English learners. This suggests that the interaction terms are not needed, and model (5) is adequate.

For a final model, let's make sure that our results are invariant to the use of hiel or pctel.

eq7 = lm(score ~ str + str2 + str3 + pctel + sublunch + log(avginc))
Regressor    (1)               (2)               (3)               (4)               (5)               (6)               (7)
str          -1.00** (0.24)    -0.73** (0.23)    (0.54)            (0.30)            64.33** (25.5)    83.70** (29.69)   65.29** (25.48)
str^2                                                                                ** (1.29)         -4.38** (1.51)    -3.47** (1.30)
str^3                                                                                ** (0.022)        0.075** (0.025)   0.060** (0.022)
pctel        ** (0.033)        ** (0.032)                                                                                ** (0.032)
hiel                                             5.64 (16.7)       5.50 (9.1)        -5.47** (1.03)    816.1* (434.61)
hiel*str                                         (0.84)            (0.47)                              * (66.35)
hiel*str^2                                                                                             * (3.35)
hiel*str^3                                                                                             * (0.056)
sublunch     ** (0.022)        ** (0.030)                          ** (0.029)        ** (0.028)        ** (0.029)        ** (0.030)
log(avginc)                    11.57** (1.74)                      12.12** (1.8)     11.75** (1.7)     11.80** (1.75)    11.51** (1.73)
Intercept    700.2** (4.7)     658.6** (7.7)     682.2** (10.5)    653.7** (8.9)     (165.8)           (192.2)           (165.9)
R^2
Summary

(a) Based on hypothesis tests involving models (3), (4) and (6), there doesn't appear to be a substantial difference in the effect of str on score for classes with many English learners.
(b) A hypothesis test involving model (5) indicates that the relationship between str and score is non-linear.
(c) Using F-tests, the null hypothesis that str has no effect on score is rejected in all models. (Only some of these F-tests were shown.)

Models (5) and (7) should be our preferred models based on the sequence of testing. Let's use them to provide a policy recommendation.

If str = 20, then reducing str to 18 would improve score by 3.00 using model (5), and by 2.93 using model (7).
If str = 22, then reducing str to 20 would improve score by 1.93 (model 5) or 1.90 (model 7).
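In a cubic specification the effect of a change in str depends on the starting value, so it is computed as a difference in predicted scores; the other regressors cancel because they are held fixed. The sketch below uses the rounded model (7) coefficients on str, str^2 and str^3 from the summary table. Because those coefficients are rounded, the results (about 3.06 and 2.02) differ somewhat from the figures quoted above, which were presumably computed from unrounded estimates. The function name `effect` is illustrative.

```r
# Rounded model (7) coefficients on str, str^2, str^3 (from the table).
b <- c(str = 65.29, str2 = -3.47, str3 = 0.060)

# Predicted change in score when str moves from `from` to `to`,
# holding all other regressors fixed.
effect <- function(from, to, b) {
  unname(b["str"]  * (to   - from) +
         b["str2"] * (to^2 - from^2) +
         b["str3"] * (to^3 - from^3))
}

effect(20, 18, b)   # reduce str from 20 to 18
effect(22, 20, b)   # reduce str from 22 to 20
```

With a fitted model object, the same quantity can be obtained as the difference of two `predict()` calls that vary only str, which avoids rounding altogether.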
Inference ME104: Linear Regression Analysis Kenneth Benoit August 15, 2012 August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Stata output resvisited. reg votes1st spend_total incumb minister
More informationMATH 423/533 - ASSIGNMENT 4 SOLUTIONS
MATH 423/533 - ASSIGNMENT 4 SOLUTIONS INTRODUCTION This assignment concerns the use of factor predictors in linear regression modelling, and focusses on models with two factors X 1 and X 2 with M 1 and
More information22s:152 Applied Linear Regression
22s:152 Applied Linear Regression Chapter 7: Dummy Variable Regression So far, we ve only considered quantitative variables in our models. We can integrate categorical predictors by constructing artificial
More informationIntroduction to Linear Regression Rebecca C. Steorts September 15, 2015
Introduction to Linear Regression Rebecca C. Steorts September 15, 2015 Today (Re-)Introduction to linear models and the model space What is linear regression Basic properties of linear regression Using
More informationMultiple Regression Part I STAT315, 19-20/3/2014
Multiple Regression Part I STAT315, 19-20/3/2014 Regression problem Predictors/independent variables/features Or: Error which can never be eliminated. Our task is to estimate the regression function f.
More information36-707: Regression Analysis Homework Solutions. Homework 3
36-707: Regression Analysis Homework Solutions Homework 3 Fall 2012 Problem 1 Y i = βx i + ɛ i, i {1, 2,..., n}. (a) Find the LS estimator of β: RSS = Σ n i=1(y i βx i ) 2 RSS β = Σ n i=1( 2X i )(Y i βx
More informationNature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.
Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences
More informationMultiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =
Economics 130 Lecture 6 Midterm Review Next Steps for the Class Multiple Regression Review & Issues Model Specification Issues Launching the Projects!!!!! Midterm results: AVG = 26.5 (88%) A = 27+ B =
More informationMotor Trend Car Road Analysis
Motor Trend Car Road Analysis Zakia Sultana February 28, 2016 Executive Summary You work for Motor Trend, a magazine about the automobile industry. Looking at a data set of a collection of cars, they are
More informationFinal Exam. Name: Solution:
Final Exam. Name: Instructions. Answer all questions on the exam. Open books, open notes, but no electronic devices. The first 13 problems are worth 5 points each. The rest are worth 1 point each. HW1.
More informationSCHOOL OF MATHEMATICS AND STATISTICS Autumn Semester
RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: "Statistics Tables" by H.R. Neave PAS 371 SCHOOL OF MATHEMATICS AND STATISTICS Autumn Semester 2008 9 Linear
More informationSTAT 572 Assignment 5 - Answers Due: March 2, 2007
1. The file glue.txt contains a data set with the results of an experiment on the dry sheer strength (in pounds per square inch) of birch plywood, bonded with 5 different resin glues A, B, C, D, and E.
More informationMultiple Regression Analysis. Basic Estimation Techniques. Multiple Regression Analysis. Multiple Regression Analysis
Multiple Regression Analysis Basic Estimation Techniques Herbert Stocker herbert.stocker@uibk.ac.at University of Innsbruck & IIS, University of Ramkhamhaeng Regression Analysis: Statistical procedure
More informationECO321: Economic Statistics II
ECO321: Economic Statistics II Chapter 6: Linear Regression a Hiroshi Morita hmorita@hunter.cuny.edu Department of Economics Hunter College, The City University of New York a c 2010 by Hiroshi Morita.
More informationEssential of Simple regression
Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship
More informationExtensions of One-Way ANOVA.
Extensions of One-Way ANOVA http://www.pelagicos.net/classes_biometry_fa18.htm What do I want You to Know What are two main limitations of ANOVA? What two approaches can follow a significant ANOVA? How
More informationRockefeller College University at Albany
Rockefeller College University at Albany PAD 705 Handout: Suggested Review Problems from Pindyck & Rubinfeld Original prepared by Professor Suzanne Cooper John F. Kennedy School of Government, Harvard
More informationInference with Heteroskedasticity
Inference with Heteroskedasticity Note on required packages: The following code requires the packages sandwich and lmtest to estimate regression error variance that may change with the explanatory variables.
More informationMGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu
Last Name (Print): Solution First Name (Print): Student Number: MGECHY L Introduction to Regression Analysis Term Test Friday July, PM Instructor: Victor Yu Aids allowed: Time allowed: Calculator and one
More informationFinal Exam - Solutions
Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your
More informationTable 1: Fish Biomass data set on 26 streams
Math 221: Multiple Regression S. K. Hyde Chapter 27 (Moore, 5th Ed.) The following data set contains observations on the fish biomass of 26 streams. The potential regressors from which we wish to explain
More information