Multiple Linear Regression for the Salary Data


1 Multiple Linear Regression for the Salary Data
[Figure: scatterplots of Salary vs. Experience, color-coded by Education (HS, BS, BS+) and by Manager (No, Yes)]

2 Problem & Data Overview Primary Research Questions: 1. Are company pay guidelines being followed? (Inference) 2. Use this model to establish a fair salary system. (Prediction) Issues: 1. How do we deal with the categorical variables Education and Manager?

3 Exploratory Techniques For Categorical Covariates
1. Side-by-side Boxplots
[Figure: Salary vs. Experience, with side-by-side boxplots of Salary by Education (HS, BS, BS+) and by Manager (No, Yes)]

4 Exploratory Techniques For Categorical Covariates
2. Color-coded Scatterplots
[Figure: Salary vs. Experience, color-coded by Education (HS, BS, BS+) in one panel and by Manager (No, Yes) in the other]

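5 MLR with Categorical Covariates
We want to use MLR:
$$y_i \overset{iid}{\sim} N\left(\beta_0 + \sum_{p=1}^{P} \beta_p x_{ip},\ \sigma^2\right)$$
where $y_i$ = $i$th person's salary and $x_{i1}$ = $i$th person's experience level.
But how do we put categories in a mathematical function? Use indicator functions:
$$I(A) = \begin{cases} 1 & \text{if } A \text{ is true} \\ 0 & \text{otherwise.} \end{cases}$$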

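6 MLR with Categorical Covariates
The MLR Model:
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$
Or, alternatively:
$$y_i \overset{iid}{\sim} N\left(\beta_0 + \sum_{p=1}^{P} \beta_p x_{ip},\ \sigma^2\right)$$
where $y_i$ = $i$th person's salary, $x_{i1}$ = $i$th person's experience level, and
$$x_{i2} = \begin{cases} 1 & \text{if BS} \\ 0 & \text{otherwise,} \end{cases} \qquad x_{i3} = \begin{cases} 1 & \text{if BS+} \\ 0 & \text{otherwise,} \end{cases} \qquad x_{i4} = \begin{cases} 1 & \text{if Manager} \\ 0 & \text{otherwise.} \end{cases}$$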
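
The indicator coding above can be sketched in a few lines of Python. This is only an illustration: the function names `indicator` and `design_row` and the three example employees are made up for the sketch, not taken from the salary data.

```python
def indicator(condition):
    """I(A): 1 if A is true, 0 otherwise."""
    return 1 if condition else 0


def design_row(experience, education, manager):
    """Build (1, x_i1, x_i2, x_i3, x_i4) for one employee.

    HS and non-manager are the baseline categories, so they get no
    column of their own.
    """
    return [
        1,                              # intercept
        experience,                     # x_i1
        indicator(education == "BS"),   # x_i2
        indicator(education == "BS+"),  # x_i3
        indicator(manager == "Yes"),    # x_i4
    ]


# Hypothetical employees, for illustration only.
rows = [
    design_row(3, "HS", "No"),
    design_row(5, "BS", "Yes"),
    design_row(10, "BS+", "No"),
]
```

A three-level factor needs only two indicator columns; adding a third for HS would make the columns sum to the intercept column.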

7 MLR with Categorical Covariates
The MLR Model:
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$
What about HS and Not a Manager? They become absorbed into the intercept term.

8 MLR with Categorical Covariates
The MLR Model:
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$
How do you interpret $\beta_0$? For non-managers with a HS education and zero years of experience, the salary is $\beta_0$, on average.

9 MLR with Categorical Covariates
The MLR Model:
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$
How do you interpret $\beta_1$? Holding all else constant, as the years of experience go up by 1, the salary goes up by $\beta_1$, on average.

10 MLR with Categorical Covariates
The MLR Model:
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$
How do you interpret $\beta_2$? For equal years of experience and managerial levels, a person with a BS has a salary $\beta_2$ higher than a person with a HS degree, on average.

11 MLR with Categorical Covariates
The MLR Model:
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$
How do you interpret $\beta_3$? For equal years of experience and managerial levels, a person with a BS+ has a salary $\beta_3$ higher than a person with a HS degree, on average.
For equal experience and manager levels, how much more salary does a BS+ get than a BS, on average? $\beta_3 - \beta_2$

12 MLR with Categorical Covariates
The MLR Model:
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$
How do you interpret $\beta_4$? For equal years of experience and education levels, a person with a managerial position has a salary $\beta_4$ higher than a person without a managerial position, on average.
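
The coefficient interpretations can be checked numerically with a short sketch. The coefficient values in `BETA` are hypothetical placeholders chosen for easy arithmetic; they are not the estimates from the salary data, and `mean_salary` is an illustrative name.

```python
# Hypothetical coefficients (beta_0 ... beta_4); NOT the fitted values
# from the salary data.
BETA = [10000.0, 500.0, 3000.0, 4000.0, 7000.0]


def mean_salary(experience, education, manager, beta=BETA):
    """Mean salary under the main-effects model with indicator coding."""
    b0, b1, b2, b3, b4 = beta
    return (
        b0
        + b1 * experience
        + b2 * (education == "BS")
        + b3 * (education == "BS+")
        + b4 * (manager == "Yes")
    )


# Baseline group (HS, non-manager, zero experience) recovers the intercept.
baseline = mean_salary(0, "HS", "No")

# One more year of experience, everything else fixed: the experience slope.
experience_step = mean_salary(6, "BS", "No") - mean_salary(5, "BS", "No")

# BS+ versus BS at equal experience and manager level: beta_3 - beta_2.
education_gap = mean_salary(5, "BS+", "Yes") - mean_salary(5, "BS", "Yes")

# Manager versus non-manager at equal experience and education: beta_4.
manager_gap = mean_salary(5, "BS", "Yes") - mean_salary(5, "BS", "No")
```

Each difference isolates one coefficient (or a difference of two) because every other term cancels, which is what "holding all else constant" means.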

13 MLR with Categorical Covariates
Fitted MLR Model:
$$\hat{y} = \hat\beta_0 + \hat\beta_1(\text{Experience}) + \hat\beta_2 I(\text{Education} = \text{BS}) + \hat\beta_3 I(\text{Education} = \text{BS+}) + \hat\beta_4 I(\text{Manager} = \text{Yes})$$
For equal years of experience and education, how much more does a manager make than a non-manager, on average? You want to say $\hat\beta_4$, the fitted Manager coefficient, but you're wrong. Why? $\hat\beta_4$ is our best guess, but we are uncertain and we should express this uncertainty.

14 Expressing Uncertainty in MLR
You can show (in, say, STAT 535) that for any $p = 0, \ldots, P$,
$$t = \frac{\hat\beta_p - \beta_p}{SE(\hat\beta_p)} \sim T_{n-P-1}.$$
1. Confidence Intervals: $\hat\beta_p \pm t^\star_{1-\alpha/2}\, SE(\hat\beta_p)$
Computers calculate SEs for us (at least in this class).
Example: a 95% interval for $\beta_4$ is $\hat\beta_4 \pm t^\star_{0.975}\, SE(\hat\beta_4)$.
How do you interpret this interval? For equal experience and education levels, a manager would make between 6212 and 7521 more than a non-manager, on average.
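
A small sketch of the interval arithmetic. The numeric estimate and standard error did not survive on the slide, so the example works backward from the rounded endpoints 6212 and 7521 quoted in the interpretation. It assumes the interval is symmetric about the estimate; the function name `confidence_interval` is illustrative.

```python
def confidence_interval(estimate, t_star, se):
    """Return (lower, upper) for estimate +/- t_star * se."""
    margin = t_star * se
    return (estimate - margin, estimate + margin)


# Rounded endpoints quoted in the slide's interpretation for the Manager effect.
lower, upper = 6212.0, 7521.0

# For a symmetric interval, the midpoint is the implied point estimate
# and the half-width is the implied t_star * SE.
midpoint = (lower + upper) / 2
half_width = (upper - lower) / 2

# Rebuild the interval; t_star is folded into the margin, so pass 1.0.
rebuilt = confidence_interval(midpoint, 1.0, half_width)
```

Because the endpoints are rounded, the recovered midpoint and half-width are approximations to the values the software reported.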

15 Expressing Uncertainty in MLR 2. Hypothesis Testing 1. All coefficients simultaneously 2. Some coefficients simultaneously 3. One coefficient at a time

16 Expressing Uncertainty in MLR
Hypothesis Test #1: Testing all coefficients simultaneously.
One can show (in Stat 535) that if
$H_0: \beta_1 = \beta_2 = \cdots = \beta_P = 0$ (Reduced Model)
$H_A$: At least one $\beta_p$ is non-zero (Full Model)
then, under $H_0$,
$$F = \frac{R^2/P}{(1-R^2)/(n-P-1)} \sim F_{P,\,n-P-1}$$
(the F-distribution with $P$ and $n-P-1$ degrees of freedom), so the F-distribution can be used to compute p-values.

17 Expressing Uncertainty in MLR
Consider:
$$F = \frac{R^2/P}{(1-R^2)/(n-P-1)} \sim F_{P,\,n-P-1}$$
$R^2$: How much variation you explain
$(1-R^2)$: How much variation you don't explain
The F-statistic:
1. Analyzes variances (ANOVA)
2. Is the ratio of explained to unexplained variance.
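
The overall F statistic is simple arithmetic once R-squared, the sample size, and the number of covariates are known. A minimal sketch follows; the inputs are hypothetical (not the salary data), and turning the statistic into a p-value would still need an F-distribution routine from a statistics package.

```python
def overall_f(r2, n, P):
    """F = (R^2 / P) / ((1 - R^2) / (n - P - 1)) for testing all P slopes at once."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    df_resid = n - P - 1
    if P <= 0 or df_resid <= 0:
        raise ValueError("need P >= 1 and n > P + 1")
    explained = r2 / P                  # explained variation per covariate
    unexplained = (1.0 - r2) / df_resid  # unexplained variation per residual df
    return explained / unexplained


# Hypothetical inputs: R^2 = 0.5 with n = 25 observations and P = 4 covariates.
f_stat = overall_f(0.5, 25, 4)
```

For fixed n and P, the statistic grows as R-squared approaches 1, so a model that explains most of the variation produces a large F.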

18 Expressing Uncertainty in MLR
Testing all coefficients simultaneously for salary data:
$H_0: \beta_1 = \beta_2 = \cdots = \beta_4 = 0$
$H_A$: At least one $\beta_p$ is non-zero
$F = 211.7 \rightarrow$ p-value $\approx 0$
[Figure: density of the reference F-distribution]
What is the conclusion? Reject the null hypothesis and conclude that at least one covariate significantly explains salary.

19 Expressing Uncertainty in MLR
Hypothesis Test #2: Testing some coefficients simultaneously.
Let $\beta_1, \ldots, \beta_Q$ be the coefficients you think are non-zero and don't want to test, and let $\beta_{Q+1}, \ldots, \beta_P$ be the coefficients you do want to test. Then
$H_0: \beta_{Q+1} = \cdots = \beta_P = 0$ (Reduced Model)
$H_A$: At least one is non-zero (Full Model)
$$F = \frac{(R^2_P - R^2_Q)/(P-Q)}{(1-R^2_P)/(n-P-1)} \sim F_{P-Q,\,n-P-1}$$
where $R^2_P$ is the $R^2$ with all $P$ variables and $R^2_Q$ is the $R^2$ with just $Q$ variables, so the F-distribution can, again, be used to compute p-values.
F is the ratio of how much explained variation you lost, relative to total unexplained variation.
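
The partial F statistic can be sketched the same way as the overall one. The R-squared values below are hypothetical, chosen only to make the arithmetic easy to follow; `partial_f` is an illustrative name.

```python
def partial_f(r2_full, r2_reduced, n, P, Q):
    """Partial F for dropping P - Q covariates from a model with P covariates.

    r2_full is R^2 with all P variables; r2_reduced is R^2 with just Q variables.
    """
    df_num = P - Q
    df_den = n - P - 1
    if df_num <= 0 or df_den <= 0:
        raise ValueError("need Q < P and n > P + 1")
    lost = (r2_full - r2_reduced) / df_num   # explained variation lost, per dropped term
    unexplained = (1.0 - r2_full) / df_den   # unexplained variation in the full model
    return lost / unexplained


# Hypothetical inputs: dropping 2 of 4 covariates lowers R^2 from 0.8 to 0.6.
f_stat = partial_f(0.8, 0.6, 25, 4, 2)

# With Q = 0 and a reduced R^2 of 0, this reduces to the overall F test.
f_overall = partial_f(0.5, 0.0, 25, 4, 0)
```

The second call illustrates that the test of all coefficients is the special case in which the reduced model contains only the intercept.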

20 Expressing Uncertainty in MLR
Hypothesis Test #2: Salary Data Example.
Suppose we want to test if Education has an effect on salary.
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$
Then the hypotheses are
$H_0: \beta_2 = \beta_3 = 0$
$H_A$: At least one is non-zero
The F statistic gives a p-value $\approx 0$.
What is the conclusion? Education has a non-zero effect on salary.

21 Expressing Uncertainty in MLR
Hypothesis Test #3: Testing individual coefficients.
Remember:
$$t = \frac{\hat\beta_p - \beta_p}{SE(\hat\beta_p)} \sim T_{n-P-1}$$
so the t-distribution can be used to calculate p-values.
Example: Does managerial position have an effect on salary?
$H_0: \beta_4 = 0$
$H_A: \beta_4 \neq 0$
Under $H_0$, $t = \hat\beta_4 / SE(\hat\beta_4)$, which gives a p-value $\approx 0$.
What is the conclusion? Managerial position has an effect on salary.

22 Expressing Uncertainty in MLR Question: Why test coefficients simultaneously? We want to avoid the multiple comparison problem. Multiplicity (or multiple comparison problem): If you do lots of tests then you are more likely to commit errors (your overall Type I error rate is inflated).
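
To see how quickly the overall Type I error rate inflates, here is a minimal sketch. It assumes the individual tests are independent and each is run at the same level; real regression tests are usually correlated, so treat this as an illustration rather than an exact rate for the salary analysis.

```python
def familywise_error_rate(alpha, m):
    """P(at least one Type I error) across m independent tests at level alpha."""
    if not 0.0 <= alpha <= 1.0 or m < 1:
        raise ValueError("need 0 <= alpha <= 1 and m >= 1")
    return 1.0 - (1.0 - alpha) ** m


# One test at the 5% level versus ten such tests.
single = familywise_error_rate(0.05, 1)
ten = familywise_error_rate(0.05, 10)
```

With ten independent tests at the 5% level, the chance of at least one false rejection is roughly 40%, which is why a single simultaneous F-test is preferred.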

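23 Expressing Uncertainty in Predictions
Fitted MLR Model:
$$\hat{y} = \hat\beta_0 + \hat\beta_1(\text{Experience}) + \hat\beta_2 I(\text{Education} = \text{BS}) + \hat\beta_3 I(\text{Education} = \text{BS+}) + \hat\beta_4 I(\text{Manager} = \text{Yes})$$
What is the predicted salary for a manager with a BS education and 10 years of experience? You want to say $\hat{y} = \hat\beta_0 + 10\hat\beta_1 + \hat\beta_2 + \hat\beta_4$, but you're wrong. Why? $\hat{y}$ is our best guess, but we are uncertain and we should express this uncertainty.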

24 Expressing Uncertainty in Predictions
1. Confidence Intervals for the Mean: $\hat\mu(x_1, \ldots, x_P) \pm t^\star_{1-\alpha/2}\, SE(\hat\mu)$, with df $= n-P-1$
2. Prediction Intervals for One Value: $\hat{y} \pm t^\star_{1-\alpha/2}\, SE(\hat{y})$, with df $= n-P-1$
Important Notes:
1. SE formulas are ugly (but not in matrix notation), so let the computer calculate them for you.
2. $SE(\hat\mu) < SE(\hat{y})$, so prediction intervals will be wider than confidence intervals.
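
One way to see why the prediction interval is wider: under the usual MLR assumptions, the prediction error for a new observation adds the residual variance to the variance of the estimated mean, so SE(y-hat)^2 = SE(mu-hat)^2 + sigma-hat^2. The sketch below uses hypothetical numbers throughout, including the critical value `t_star`.

```python
import math


def prediction_se(se_mean, sigma_hat):
    """SE for predicting one new value: sqrt(SE(mean)^2 + sigma_hat^2)."""
    return math.sqrt(se_mean ** 2 + sigma_hat ** 2)


def interval(center, t_star, se):
    """Return (lower, upper) for center +/- t_star * se."""
    return (center - t_star * se, center + t_star * se)


# Hypothetical values, not taken from the salary fit.
center, t_star = 20000.0, 2.0
se_mean, sigma_hat = 300.0, 400.0

se_pred = prediction_se(se_mean, sigma_hat)
conf_int = interval(center, t_star, se_mean)   # interval for the mean
pred_int = interval(center, t_star, se_pred)   # interval for one new value
```

Both intervals share the same center and critical value; only the standard error differs, and the residual standard deviation keeps the prediction interval from shrinking to zero even with a very large sample.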

25 Cross-Validation Revisited
When we perform cross-validation, we are used to calculating:
1. Bias
2. RPMSE
But we should also generate a prediction interval for each test observation and calculate the following:
1. Coverage = % of prediction intervals that contain the true value
2. Predictive Interval Width = average width of the prediction intervals
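
The two interval diagnostics can be computed directly from the test-set intervals. In this sketch the intervals and true values are invented for illustration, and the function names `coverage` and `mean_width` are not from the course code.

```python
def coverage(intervals, truths):
    """Percent of (lower, upper) intervals that contain the matching true value."""
    if len(intervals) != len(truths) or not truths:
        raise ValueError("need equal-length, non-empty inputs")
    hits = sum(1 for (lo, hi), y in zip(intervals, truths) if lo <= y <= hi)
    return 100.0 * hits / len(truths)


def mean_width(intervals):
    """Average width (upper - lower) of the prediction intervals."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


# Hypothetical prediction intervals and held-out true values.
intervals = [(10.0, 20.0), (15.0, 25.0), (30.0, 50.0), (5.0, 9.0)]
truths = [12.0, 26.0, 40.0, 9.0]

pct_covered = coverage(intervals, truths)
avg_width = mean_width(intervals)
```

Coverage should be read together with width: intervals can always reach high coverage by being very wide, so a good model has coverage near the nominal level with intervals as narrow as possible.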

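26 MLR with Categorical Covariates
Fitted MLR Model:
$$\hat{y} = \hat\beta_0 + \hat\beta_1(\text{Experience}) + \hat\beta_2 I(\text{Education} = \text{BS}) + \hat\beta_3 I(\text{Education} = \text{BS+}) + \hat\beta_4 I(\text{Manager} = \text{Yes})$$
[Figure: fitted lines of Salary vs. Experience for each group: HS,No; BS,No; BS+,No; HS,Yes; BS,Yes; BS+,Yes]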

27 Interactions
Just based on this picture, if you have a HS degree and become a manager, how much does your salary go up on average? About $3000 or $4000.
[Figure: Salary vs. Experience, color-coded by Education (HS, BS, BS+) and by Manager (No, Yes)]

28 Interactions
Just based on this picture, if you have a BS degree and become a manager, how much does your salary go up on average? About $8000 or $9000.
[Figure: Salary vs. Experience, color-coded by Education (HS, BS, BS+) and by Manager (No, Yes)]

29 Interactions
Key Observation: How much your salary goes up when you become a manager depends on how much education you have.
[Figure: Salary vs. Experience, color-coded by Education (HS, BS, BS+) and by Manager (No, Yes)]

30 Interactions Interaction: Two (or more) variables work simultaneously to affect the response. In other words, the effect of one covariate on the response depends on the value of another covariate. Interactions enter the regression multiplicatively. Types of Interactions: 1. Quantitative-Quantitative (Q-Q) 2. Quantitative-Categorical (Q-C) 3. Categorical-Categorical (C-C)

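31 Interactions
Q-Q Interactions: $x_{i1}, x_{i2}$: Quantitative Variables
$$y_i = \beta_0 + \underbrace{\beta_1 x_{i1} + \beta_2 x_{i2}}_{\text{Main Effects}} + \underbrace{\beta_3 x_{i1} x_{i2}}_{\text{Interaction Term}} + \epsilon_i$$
Holding $x_{i1}$ constant, as $x_{i2}$ goes up by 1, how much does $y_i$ go up on average?
$$\beta_0 + \beta_1 x_{i1} + \beta_2 (x_{i2}+1) + \beta_3 x_{i1}(x_{i2}+1) + \epsilon_i - \left(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i1} x_{i2} + \epsilon_i\right) = \beta_2 + \beta_3 x_{i1}$$
It depends on the value of $x_{i1}$!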
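
The algebra for the Q-Q interaction can be verified numerically. The coefficients below are hypothetical and chosen as whole numbers so the differences are exact; `mean_response` and `effect_of_x2` are illustrative names.

```python
def mean_response(x1, x2, beta):
    """Mean of y: beta_0 + beta_1*x1 + beta_2*x2 + beta_3*x1*x2."""
    b0, b1, b2, b3 = beta
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def effect_of_x2(x1, beta):
    """Change in the mean when x2 goes up by 1 with x1 held fixed: beta_2 + beta_3*x1."""
    return beta[2] + beta[3] * x1


# Hypothetical coefficients (beta_0, beta_1, beta_2, beta_3).
beta = (1.0, 2.0, 3.0, 4.0)

# The one-unit effect of x2 differs across values of x1.
effects = {x1: effect_of_x2(x1, beta) for x1 in (-1.0, 0.0, 2.0)}

# Direct check: difference of means at x2 + 1 and x2 for a fixed x1.
direct = mean_response(2.0, 6.0, beta) - mean_response(2.0, 5.0, beta)
```

With an interaction in the model, the effect of one covariate is no longer a single number but a function of the other covariate.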

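32 Interactions
Q-Q Interactions (Example): $x_{i1}, x_{i2}$: Quantitative Variables
$$y_i = 0 + \underbrace{0.5\,x_{i1} + (-0.5)\,x_{i2}}_{\text{Main Effects}} + \underbrace{(-1)\,x_{i1} x_{i2}}_{\text{Interaction Term}} + \epsilon_i$$
[Figure: $y$ vs. $x_2$, with a separate line for each of several values of $x_1$]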

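33 Interactions
Q-C Interactions: $x_{i1}$: Quantitative Variable,
$$x_{i2} = \begin{cases} 1 & \text{if Category} \\ 0 & \text{otherwise.} \end{cases}$$
$$y_i = \beta_0 + \underbrace{\beta_1 x_{i1} + \beta_2 x_{i2}}_{\text{Main Effects}} + \underbrace{\beta_3 x_{i1} x_{i2}}_{\text{Interaction Term}} + \epsilon_i$$
Holding $x_{i2}$ constant, as $x_{i1}$ goes up by 1, how much does $y_i$ go up on average?
$$\begin{cases} \beta_1 & \text{if } x_{i2} = 0 \\ \beta_1 + \beta_3 & \text{if } x_{i2} = 1 \end{cases}$$
It depends on the value of $x_{i2}$ BUT is easily interpretable.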

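34 Interactions
C-C Interactions: $x_{i1}, x_{i2}$: Categorical Variables
$$y_i = \beta_0 + \underbrace{\beta_1 x_{i1} + \beta_2 x_{i2}}_{\text{Main Effects}} + \underbrace{\beta_3 x_{i1} x_{i2}}_{\text{Interaction Term}} + \epsilon_i$$
Holding $x_{i1}$ constant, as $x_{i2}$ goes up by 1, how much does $y_i$ go up on average?
$$\begin{cases} \beta_2 & \text{if } x_{i1} = 0 \\ \beta_2 + \beta_3 & \text{if } x_{i1} = 1 \end{cases}$$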

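35 Salary Example with Interactions
The MLR Model for Salary with Interactions:
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \beta_5 I(\text{Education}_i = \text{BS}) I(\text{Manager}_i = \text{Yes}) + \beta_6 I(\text{Education}_i = \text{BS+}) I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$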

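36 Salary Example with Interactions
The MLR Model for Salary with Interactions:
$$\hat{y} = \hat\beta_0 + \hat\beta_1(\text{Experience}_i) + \hat\beta_2 I(\text{Education}_i = \text{BS}) + \hat\beta_3 I(\text{Education}_i = \text{BS+}) + \hat\beta_4 I(\text{Manager}_i = \text{Yes}) + \hat\beta_5 I(\text{Education}_i = \text{BS}) I(\text{Manager}_i = \text{Yes}) + \hat\beta_6 I(\text{Education}_i = \text{BS+}) I(\text{Manager}_i = \text{Yes})$$
What is the effect of becoming a manager on salary if you have a HS education? The salary would go up by $\hat\beta_4$, on average.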

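37 Salary Example with Interactions
The MLR Model for Salary with Interactions:
$$\hat{y} = \hat\beta_0 + \hat\beta_1(\text{Experience}_i) + \hat\beta_2 I(\text{Education}_i = \text{BS}) + \hat\beta_3 I(\text{Education}_i = \text{BS+}) + \hat\beta_4 I(\text{Manager}_i = \text{Yes}) + \hat\beta_5 I(\text{Education}_i = \text{BS}) I(\text{Manager}_i = \text{Yes}) + \hat\beta_6 I(\text{Education}_i = \text{BS+}) I(\text{Manager}_i = \text{Yes})$$
What is the effect of becoming a manager on salary if you have a BS education? The salary would go up by $\hat\beta_4 + \hat\beta_5 = 9038.1$, on average.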

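38 Salary Example with Interactions
The MLR Model for Salary with Interactions:
$$\hat{y} = \hat\beta_0 + \hat\beta_1(\text{Experience}_i) + \hat\beta_2 I(\text{Education}_i = \text{BS}) + \hat\beta_3 I(\text{Education}_i = \text{BS+}) + \hat\beta_4 I(\text{Manager}_i = \text{Yes}) + \hat\beta_5 I(\text{Education}_i = \text{BS}) I(\text{Manager}_i = \text{Yes}) + \hat\beta_6 I(\text{Education}_i = \text{BS+}) I(\text{Manager}_i = \text{Yes})$$
What is the effect of becoming a manager on salary if you have a BS+ education? The salary would go up by $\hat\beta_4 + \hat\beta_6$, on average.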

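39 Salary Example with Interactions
The MLR Model for Salary with Interactions:
$$\hat{y} = \hat\beta_0 + \hat\beta_1(\text{Experience}_i) + \hat\beta_2 I(\text{Education}_i = \text{BS}) + \hat\beta_3 I(\text{Education}_i = \text{BS+}) + \hat\beta_4 I(\text{Manager}_i = \text{Yes}) + \hat\beta_5 I(\text{Education}_i = \text{BS}) I(\text{Manager}_i = \text{Yes}) + \hat\beta_6 I(\text{Education}_i = \text{BS+}) I(\text{Manager}_i = \text{Yes})$$
For managers, what is the effect of having a BS+ education vs. a BS education with 2 years of experience?
$$(\hat\beta_0 + 2\hat\beta_1 + \hat\beta_3 + \hat\beta_4 + \hat\beta_6) - (\hat\beta_0 + 2\hat\beta_1 + \hat\beta_2 + \hat\beta_4 + \hat\beta_5) = (\hat\beta_3 + \hat\beta_6) - (\hat\beta_2 + \hat\beta_5)$$
This difference is negative for the fitted model, so salary would go DOWN, on average.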
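
The education-specific manager effects follow a single pattern, sketched below. All coefficient values are hypothetical stand-ins (the fitted numbers are not reproduced here); they were chosen so that the BS+ versus BS comparison for managers comes out negative, matching the direction described above.

```python
def manager_effect(education, b4, b5, b6):
    """Mean salary change from becoming a manager, by education level.

    b4 is the Manager main effect; b5 and b6 are the BS x Manager and
    BS+ x Manager interaction coefficients.
    """
    return b4 + b5 * (education == "BS") + b6 * (education == "BS+")


def bs_plus_vs_bs_for_managers(b2, b3, b5, b6):
    """Mean difference, BS+ minus BS, among managers with equal experience."""
    return (b3 + b6) - (b2 + b5)


# Hypothetical coefficients, NOT the fitted salary-data estimates.
b2, b3, b4, b5, b6 = 2000.0, 4000.0, 3000.0, 5000.0, 1000.0

effects = {edu: manager_effect(edu, b4, b5, b6) for edu in ("HS", "BS", "BS+")}
gap = bs_plus_vs_bs_for_managers(b2, b3, b5, b6)
```

Experience drops out of the BS+ versus BS comparison because the model has no experience interaction, so the answer is the same at 2 years as at any other experience level.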

40 Salary Example with Interactions
The MLR Model for Salary with Interactions:
[Figure: fitted lines of Salary vs. Experience for each group: HS,No; BS,No; BS+,No; HS,Yes; BS,Yes; BS+,Yes]
Are the company's salary policies being followed ($R^2 = 0.99$)?

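41 Salary Example with Interactions
How would I test if the interaction between education and manager position is significant?
$$y_i = \beta_0 + \beta_1(\text{Experience}_i) + \beta_2 I(\text{Education}_i = \text{BS}) + \beta_3 I(\text{Education}_i = \text{BS+}) + \beta_4 I(\text{Manager}_i = \text{Yes}) + \beta_5 I(\text{Education}_i = \text{BS}) I(\text{Manager}_i = \text{Yes}) + \beta_6 I(\text{Education}_i = \text{BS+}) I(\text{Manager}_i = \text{Yes}) + \epsilon_i$$
Perform an F-test (see earlier notes in this unit):
$H_0: \beta_5 = \beta_6 = 0$
$H_A$: At least one is non-zero
The F statistic gives a p-value $\approx 0$.
Conclusion: There is an education-manager interaction.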

42 End of Salary Analysis (see webpage for R and SAS code)


More information

Tables Table A Table B Table C Table D Table E 675

Tables Table A Table B Table C Table D Table E 675 BMTables.indd Page 675 11/15/11 4:25:16 PM user-s163 Tables Table A Standard Normal Probabilities Table B Random Digits Table C t Distribution Critical Values Table D Chi-square Distribution Critical Values

More information

Multiple Regression: Chapter 13. July 24, 2015

Multiple Regression: Chapter 13. July 24, 2015 Multiple Regression: Chapter 13 July 24, 2015 Multiple Regression (MR) Response Variable: Y - only one response variable (quantitative) Several Predictor Variables: X 1, X 2, X 3,..., X p (p = # predictors)

More information

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS STAT 512 MidTerm I (2/21/2013) Spring 2013 Name: Key INSTRUCTIONS 1. This exam is open book/open notes. All papers (but no electronic devices except for calculators) are allowed. 2. There are 5 pages in

More information

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as ST 51, Summer, Dr. Jason A. Osborne Homework assignment # - Solutions 1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

1 Independent Practice: Hypothesis tests for one parameter:

1 Independent Practice: Hypothesis tests for one parameter: 1 Independent Practice: Hypothesis tests for one parameter: Data from the Indian DHS survey from 2006 includes a measure of autonomy of the women surveyed (a scale from 0-10, 10 being the most autonomous)

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

INFERENCE FOR REGRESSION

INFERENCE FOR REGRESSION CHAPTER 3 INFERENCE FOR REGRESSION OVERVIEW In Chapter 5 of the textbook, we first encountered regression. The assumptions that describe the regression model we use in this chapter are the following. We

More information

One-way between-subjects ANOVA. Comparing three or more independent means

One-way between-subjects ANOVA. Comparing three or more independent means One-way between-subjects ANOVA Comparing three or more independent means ANOVA: A Framework Understand the basic principles of ANOVA Why it is done? What it tells us? Theory of one-way between-subjects

More information

Homework 1 Solutions

Homework 1 Solutions Homework 1 Solutions January 18, 2012 Contents 1 Normal Probability Calculations 2 2 Stereo System (SLR) 2 3 Match Histograms 3 4 Match Scatter Plots 4 5 Housing (SLR) 4 6 Shock Absorber (SLR) 5 7 Participation

More information

EC2001 Econometrics 1 Dr. Jose Olmo Room D309

EC2001 Econometrics 1 Dr. Jose Olmo Room D309 EC2001 Econometrics 1 Dr. Jose Olmo Room D309 J.Olmo@City.ac.uk 1 Revision of Statistical Inference 1.1 Sample, observations, population A sample is a number of observations drawn from a population. Population:

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the

More information

22s:152 Applied Linear Regression. Take random samples from each of m populations.

22s:152 Applied Linear Regression. Take random samples from each of m populations. 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Statistics 191 Introduction to Regression Analysis and Applied Statistics Practice Exam

Statistics 191 Introduction to Regression Analysis and Applied Statistics Practice Exam Statistics 191 Introduction to Regression Analysis and Applied Statistics Practice Exam Prof. J. Taylor You may use your 4 single-sided pages of notes This exam is 14 pages long. There are 4 questions,

More information

STATISTICS 110/201 PRACTICE FINAL EXAM

STATISTICS 110/201 PRACTICE FINAL EXAM STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable

More information

STA Module 11 Inferences for Two Population Means

STA Module 11 Inferences for Two Population Means STA 2023 Module 11 Inferences for Two Population Means Learning Objectives Upon completing this module, you should be able to: 1. Perform inferences based on independent simple random samples to compare

More information

STA Rev. F Learning Objectives. Two Population Means. Module 11 Inferences for Two Population Means

STA Rev. F Learning Objectives. Two Population Means. Module 11 Inferences for Two Population Means STA 2023 Module 11 Inferences for Two Population Means Learning Objectives Upon completing this module, you should be able to: 1. Perform inferences based on independent simple random samples to compare

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor

More information

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing

More information

STA441: Spring Multiple Regression. More than one explanatory variable at the same time

STA441: Spring Multiple Regression. More than one explanatory variable at the same time STA441: Spring 2016 Multiple Regression More than one explanatory variable at the same time This slide show is a free open source document. See the last slide for copyright information. One Explanatory

More information

Regression Analysis. Regression: Methodology for studying the relationship among two or more variables

Regression Analysis. Regression: Methodology for studying the relationship among two or more variables Regression Analysis Regression: Methodology for studying the relationship among two or more variables Two major aims: Determine an appropriate model for the relationship between the variables Predict the

More information

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t = 2. The distribution of t values that would be obtained if a value of t were calculated for each sample mean for all possible random of a given size from a population _ t ratio: (X - µ hyp ) t s x The result

More information

Correlation and the Analysis of Variance Approach to Simple Linear Regression

Correlation and the Analysis of Variance Approach to Simple Linear Regression Correlation and the Analysis of Variance Approach to Simple Linear Regression Biometry 755 Spring 2009 Correlation and the Analysis of Variance Approach to Simple Linear Regression p. 1/35 Correlation

More information

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define

More information

Correlation. A statistics method to measure the relationship between two variables. Three characteristics

Correlation. A statistics method to measure the relationship between two variables. Three characteristics Correlation Correlation A statistics method to measure the relationship between two variables Three characteristics Direction of the relationship Form of the relationship Strength/Consistency Direction

More information

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2 Review 6 Use the traditional method to test the given hypothesis. Assume that the samples are independent and that they have been randomly selected ) A researcher finds that of,000 people who said that

More information

STATISTICS 141 Final Review

STATISTICS 141 Final Review STATISTICS 141 Final Review Bin Zou bzou@ualberta.ca Department of Mathematical & Statistical Sciences University of Alberta Winter 2015 Bin Zou (bzou@ualberta.ca) STAT 141 Final Review Winter 2015 1 /

More information

Lecture 10 Multiple Linear Regression

Lecture 10 Multiple Linear Regression Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable

More information

Example. Multiple Regression. Review of ANOVA & Simple Regression /749 Experimental Design for Behavioral and Social Sciences

Example. Multiple Regression. Review of ANOVA & Simple Regression /749 Experimental Design for Behavioral and Social Sciences 36-309/749 Experimental Design for Behavioral and Social Sciences Sep. 29, 2015 Lecture 5: Multiple Regression Review of ANOVA & Simple Regression Both Quantitative outcome Independent, Gaussian errors

More information

Intro to Linear Regression

Intro to Linear Regression Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor

More information

Exam Applied Statistical Regression. Good Luck!

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Table 1: Fish Biomass data set on 26 streams

Table 1: Fish Biomass data set on 26 streams Math 221: Multiple Regression S. K. Hyde Chapter 27 (Moore, 5th Ed.) The following data set contains observations on the fish biomass of 26 streams. The potential regressors from which we wish to explain

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Intro to Linear Regression

Intro to Linear Regression Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor

More information