Multiple Regression: Chapter 13. July 24, 2015
- Magnus Richards
2 Multiple Regression (MR)
Response variable: Y (only one response variable, quantitative)
Several predictor variables: X1, X2, X3, ..., Xp (p = # predictors)
Note: the predictors can be quantitative, categorical, quadratic terms, or interaction terms.
3 Concentrate on:
- Reading computer output
- Interpreting the coefficient of each predictor
- Determining which terms to test, and in what order, to pick the simplest model that does a good job of predicting Y
4 The Basic MR Model:
Y = α + β1 X1 + β2 X2 + ... + βp Xp + ɛ   (predictors: X1, X2, ..., Xp; # predictors: p)
Assumptions: ɛ iid N(0, σ)
Parameters: coefficients β1, β2, ..., βp; constant α
5 Reading the computer output:
1. Fitted Equation: ŷ = a + b1 X1 + b2 X2 + ... + bp Xp
2. ANOVA Test:
   H0: β1 = β2 = ... = βp = 0 (nothing good in model)
   Ha: at least one βi ≠ 0 (something good)
   Test statistic: F = MSR/MSE; p-value from the F table with (df num, df denom)

   ANOVA table for regression
   Source       df         SS      MS    F
   Regression   p          SSReg   MSR   MSR/MSE
   Error        n - p - 1  SSE     MSE
   Total        n - 1      SST
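The ANOVA decomposition above can be checked numerically. The sketch below uses synthetic data (all numbers invented for illustration, not the examples in these notes) and plain least squares:

```python
import numpy as np

# Synthetic data: n observations, p = 2 quantitative predictors.
rng = np.random.default_rng(1)
n, p = 50, 2
X = rng.normal(size=(n, p))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Least-squares fit: design matrix with an intercept column.
D = np.column_stack([np.ones(n), X])
coef = np.linalg.lstsq(D, y, rcond=None)[0]
yhat = D @ coef

SST = np.sum((y - y.mean()) ** 2)   # total sum of squares, df = n - 1
SSE = np.sum((y - yhat) ** 2)       # error sum of squares, df = n - p - 1
SSReg = SST - SSE                   # regression sum of squares, df = p
MSR = SSReg / p
MSE = SSE / (n - p - 1)
F = MSR / MSE                       # compare with the F(p, n - p - 1) table
```

With a strong signal like this, F lands far beyond any F-table cutoff, matching the "something good in the model" conclusion.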
6 3. t test for Individual Predictors:
   H0: βi = 0 vs Ha: βi ≠ 0
   Test statistic: t = (bi - 0) / (standard error of bi)
   p-value computed from the t table with df = n - p - 1 (error df)
   Interpretation: if the p-value is small, reject H0 and conclude that predictor Xi is a GOOD predictor of Y (Xi provides significant information about Y) AFTER all other predictors in the model are accounted for.
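A minimal sketch of the individual t statistic, again on invented data; here x2 is deliberately built to be a useless predictor, so its |t| should come out small relative to x1's:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 3.0 + 1.2 * x1 + rng.normal(size=n)      # y does not depend on x2

D = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(D, y, rcond=None)[0]
resid = y - D @ b
MSE = resid @ resid / (n - 3)                # error df = n - p - 1 = 37
# Standard errors from the diagonal of MSE * (D'D)^(-1).
se = np.sqrt(MSE * np.diag(np.linalg.inv(D.T @ D)))
t = b / se                                   # compare each |t_i| with the t table, df = 37
```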
7 Important Issues in Multiple Regression
- Don't just add predictors to the model - think!
- With as many parameters as observations (p = n - 1 predictors plus an intercept) we have an oversaturated model with R² = 100% (not useful for predicting in the larger population, only for this particular dataset).
- Adjusted R² only increases if the new predictor added to the model is good, whereas R² goes up or stays the same even if the new predictors are bad.
- Remember to look at the p-value for each predictor.
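The adjusted R² penalty can be made concrete with the usual formula, adj R² = 1 - (1 - R²)(n - 1)/(n - p - 1). The R² values below are made up purely to show the effect:

```python
def adj_r2(r2, n, p):
    # adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Adding a useless predictor can nudge R^2 up while adjusted R^2 drops:
before = adj_r2(0.800, 30, 2)   # 2 predictors
after = adj_r2(0.801, 30, 3)    # a 3rd (bad) predictor barely raises R^2
```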
8 Multicollinearity: when several predictors are correlated with each other, the ANOVA p-value may be small even though all the individual t-test p-values are large. Correlated predictors give overlapping or redundant information. (Don't throw out all such predictors at once; remove them from the model one at a time.)
The sample size should be at least 5 to 20 times the number of predictors.
9 Example: The following dataset records Blood Alcohol Content (BAC) and the Number of Beers consumed (NOB), along with two more variables, Weight and Sex. We fit different regression models and compare the output.
(Data table: columns BAC, NOB, Weight, Sex (coded f/m), and a 0/1 male dummy M_1; the numeric values did not survive transcription.)
10 Regression Analysis: BAC versus NOB
The regression equation is
   BAC = ... + ... NOB   (coefficient estimates did not survive transcription)
Predictor (Constant, NOB): Coef, SE Coef, T, P values lost
S = ...   R-Sq = 80.0%   R-Sq(adj) = 78.6%
Analysis of Variance (Regression, Residual Error, Total): DF, SS, MS, F, P values lost
11 Regression Analysis: BAC versus NOB, M_1
The regression equation is
   BAC = ... + ... NOB + ... M_1   (coefficient estimates did not survive transcription)
Predictor (Constant, NOB, M_1): Coef, SE Coef, T, P values lost
S = ...   R-Sq = 85.3%   R-Sq(adj) = 83.1%
Analysis of Variance (Regression, Residual Error, Total): DF, SS, MS, F, P values lost
12 Regression with Dummy Variables:
Dummy variable: a categorical variable coded as 0 or 1.
Example: Let X2 = Gender = 1 if male, 0 if female (the baseline group has zero for the dummy variable).
Model (no interaction): Y = α + β1 X1 + β2 X2 + ɛ
Note: This model gives two lines, one for females and one for males, with the same slope but different intercepts.
F (X2 = 0): Y = α + β1 X1 + ɛ
M (X2 = 1): Y = (α + β2) + β1 X1 + ɛ
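A sketch of the parallel-lines fact on made-up data: fitting the no-interaction dummy model recovers one common slope and two intercepts.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60
x1 = rng.uniform(0, 10, size=n)          # quantitative predictor
x2 = rng.integers(0, 2, size=n)          # dummy: 0 = female (baseline), 1 = male
y = 2.0 + 0.8 * x1 + 1.5 * x2 + rng.normal(scale=0.3, size=n)

D = np.column_stack([np.ones(n), x1, x2])
a, b1, b2 = np.linalg.lstsq(D, y, rcond=None)[0]

# Female line: intercept a,      slope b1
# Male line:   intercept a + b2, slope b1  (same slope, shifted intercept)
```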
13 Interpret Coefficients:
α : y-intercept for the baseline group (F)
β1: slope for both groups
β2: change in intercept for males compared to females
14 Regression Analysis: BAC versus NOB, Weight
The regression equation is
   BAC = ... + ... NOB + ... Weight   (coefficient estimates did not survive transcription)
Predictor (Constant, NOB, Weight): Coef, SE Coef, T, P values lost
S = ...   R-Sq = 95.2%   R-Sq(adj) = 94.4%
Analysis of Variance (Regression, Residual Error, Total): DF, SS, MS, F, P values lost
15 Regression Analysis: BAC versus NOB, Weight, M_1
The regression equation is
   BAC = ... + ... NOB + ... Weight + ... M_1   (coefficient estimates did not survive transcription)
Predictor (Constant, NOB, Weight, M_1): Coef, SE Coef, T, P values lost
S = ...   R-Sq = 95.3%   R-Sq(adj) = 94.1%
Analysis of Variance (Regression, Residual Error, Total): DF, SS, MS, F, P values lost
16 Question: What if gender were coded the other way?
Regression Analysis: BAC versus NOB, Weight, F_1
The regression equation is
   BAC = ... + ... NOB + ... Weight + ... F_1   (coefficient estimates did not survive transcription)
Predictor (Constant, NOB, Weight, F_1): Coef, SE Coef, T, P values lost
S = ...   R-Sq = 95.3%   R-Sq(adj) = 94.1%
Analysis of Variance (Regression, Residual Error, Total): DF, SS, MS, F, P values lost
17 Interaction Model (with dummy):
Y = α + β1 X1 + β2 X2 + β3 X1 X2 + ɛ
Note: This model gives two lines, one for females and one for males, with different slopes and different intercepts.
F (X2 = 0): Y = α + β1 X1 + ɛ
M (X2 = 1): Y = (α + β2) + (β1 + β3) X1 + ɛ
Interpret Coefficients:
α : y-intercept for the baseline group (F)
β1: slope for the baseline group (F)
β2: change in intercept for males compared to females
β3: change in slope for males compared to females
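The interaction model can be sketched the same way on invented data; now the dummy shifts both the intercept and the slope:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 80
x1 = rng.uniform(0, 5, size=n)
x2 = rng.integers(0, 2, size=n)           # 0 = F (baseline), 1 = M
y = 1.0 + 0.5 * x1 + 2.0 * x2 + 1.0 * x1 * x2 + rng.normal(scale=0.2, size=n)

# Design matrix includes the product term X1*X2.
D = np.column_stack([np.ones(n), x1, x2, x1 * x2])
a, b1, b2, b3 = np.linalg.lstsq(D, y, rcond=None)[0]

slope_F = b1          # baseline slope
slope_M = b1 + b3     # males: slope changes by b3
```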
18 Regression Analysis: BAC versus NOB, Weight, M_1, Weight*M_1
The regression equation is
   BAC = ... + ... NOB + ... Weight + ... M_1 + ... Weight*M_1   (coefficient estimates did not survive transcription)
Predictor (Constant, NOB, Weight, M_1, Weight*M_1): Coef, SE Coef, T, P values lost
S = ...   R-Sq = 95.5%   R-Sq(adj) = 93.9%
Analysis of Variance (Regression, Residual Error, Total): DF, SS, MS, F, P values lost
19 What if we had 3 groups? Suppose we want to predict BAC from NOB and Race: white, black, hispanic.
We need 2 dummy variables for 3 categories.
Let X2 = 1 if black, 0 otherwise;  X3 = 1 if hispanic, 0 otherwise.
Note: Race = white is the baseline, with zero for both dummy variables.
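The two-dummy coding can be written out directly; the labels below are just illustrative strings:

```python
import numpy as np

race = np.array(["W", "B", "H", "W", "B", "H", "W", "W"])
x2 = (race == "B").astype(float)   # 1 if black, 0 otherwise
x3 = (race == "H").astype(float)   # 1 if hispanic, 0 otherwise

# The baseline group (white) has zero for both dummies.
baseline = (x2 == 0) & (x3 == 0)
```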
20 No-Interaction Model with 2 Dummies:
Y = α + β1 X1 + β2 X2 + β3 X3 + ɛ
which gives the following 3 equations:
X2 = 0, X3 = 0 (W): Y = α + β1 X1 + ɛ
X2 = 1, X3 = 0 (B): Y = (α + β2) + β1 X1 + ɛ
X2 = 0, X3 = 1 (H): Y = (α + β3) + β1 X1 + ɛ
Interpret Coefficients:
α : intercept for the baseline group (W)
β1: slope for all 3 groups
β2: change in intercept for blacks compared to whites
β3: change in intercept for hispanics compared to whites
21 Interaction Model: add interactions between the quantitative variable (X1) and the dummy variables (X2, X3):
Y = α + β1 X1 + β2 X2 + β3 X3 + β4 X1 X2 + β5 X1 X3 + ɛ
which gives the following 3 equations:
X2 = 0, X3 = 0 (W): Y = α + β1 X1 + ɛ
X2 = 1, X3 = 0 (B): Y = (α + β2) + (β1 + β4) X1 + ɛ
X2 = 0, X3 = 1 (H): Y = (α + β3) + (β1 + β5) X1 + ɛ
Interpret Coefficients:
α : intercept for the baseline group (W)
β1: slope for W
β2: change in intercept for B compared to W
β3: change in intercept for H compared to W
β4: change in slope for B compared to W
β5: change in slope for H compared to W
22 In regression, if we have only one categorical predictor, REGRESSION is equivalent to ONE-WAY ANOVA.
Revisit the ONE-WAY ANOVA Example: Compare average weight loss for three diets.
Data: weight loss under 3 diets: low FAT, low CAL, low CARB.
23 ANOVA results (output):
One-way ANOVA: lowfat, lowcal, lowcarb
Source (Factor, Error, Total): DF, SS, MS, F, P values lost in transcription
S = ...   R-Sq = 66.79%   R-Sq(adj) = 59.41%
Level (lowfat, lowcal, lowcarb): N, Mean, StDev, and individual 95% CIs based on pooled StDev lost in transcription
Pooled StDev = 2.598
24 Now, let's set up the problem as regression with dummy variables.
Y = weight loss (response)
Let X1 = 1 if lowcal, 0 otherwise;  X2 = 1 if lowcarb, 0 otherwise.
Model: Y = α + β1 X1 + β2 X2 + ɛ
Interpret Coefficients:
α : intercept for the baseline group (lowfat)
β1: change in intercept for lowcal compared to lowfat
β2: change in intercept for lowcarb compared to lowfat
25 REGRESSION results (output):
Regression Analysis: Y versus x1, x2
The regression equation is
   Y = ... + ... x1 + ... x2   (coefficient estimates did not survive transcription)
Predictor (Constant, x1, x2): Coef, SE Coef, T, P values lost
S = ...   R-Sq = 66.8%   R-Sq(adj) = 59.4%
Analysis of Variance (Regression, Residual Error, Total): DF, SS, MS, F, P values lost
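The regression-equals-ANOVA claim can be verified directly: with the two dummies above, least squares reproduces the three group means exactly. The weight-loss numbers below are invented for illustration (the real data did not survive transcription):

```python
import numpy as np

# Hypothetical weight-loss observations, 5 per diet.
lowfat = np.array([8.0, 9.1, 7.5, 8.4, 9.0])
lowcal = np.array([6.2, 5.8, 6.9, 6.1, 5.5])
lowcarb = np.array([9.5, 10.2, 9.8, 10.0, 9.1])

y = np.concatenate([lowfat, lowcal, lowcarb])
x1 = np.concatenate([np.zeros(5), np.ones(5), np.zeros(5)])   # 1 if lowcal
x2 = np.concatenate([np.zeros(5), np.zeros(5), np.ones(5)])   # 1 if lowcarb

D = np.column_stack([np.ones(15), x1, x2])
a, b1, b2 = np.linalg.lstsq(D, y, rcond=None)[0]

# a = mean(lowfat); a + b1 = mean(lowcal); a + b2 = mean(lowcarb)
```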
26 More about RESIDUALS:
A plot of RESIDUALS vs FITTED values will exaggerate any pattern present in the data other than the linear trend. How do we judge non-constant variance in the response from the residual-vs-fitted plot? (example in class)
Recall: residual = y - ŷ (i.e., the linear trend has been removed). Any pattern (or trend) still present in the residual-vs-fitted plot suggests that linear regression was not enough; we need to add quadratic (or other polynomial) terms to the equation (examples in class).
27 QUADRATIC REGRESSION
Model: Y = α + β1 X + β2 X² + ɛ   (note: p = 2 predictors, X and X²)
Assumptions: ɛ iid N(0, σ)
Fitted Equation (output): ŷ = a + b1 X + b2 X²
Interpret Coefficients: only interpret the coefficient of the quadratic term.
Is β2 significantly different from zero?
- if yes: keep the quadratic term; the sign of b2 determines whether the curvature opens up or down
- if no: throw X² out and do SLR
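A quadratic-regression sketch on invented data with downward curvature; the sign of b2 tells the direction the curvature opens:

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0, 10, 40)
y = 2.0 + 3.0 * x - 0.4 * x**2 + rng.normal(scale=0.5, size=40)

# Quadratic regression is just MLR with predictors X and X^2.
D = np.column_stack([np.ones_like(x), x, x**2])
a, b1, b2 = np.linalg.lstsq(D, y, rcond=None)[0]

opens_down = b2 < 0     # negative quadratic coefficient: curvature opens down
```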
30 Example: Suppose we are interested in predicting the college GPA of students (CGPA) using 16 different predictor variables. Data were collected from a random sample of 59 college students.
- What is the response variable in this problem?
- What are the values of n and p?
- What are H0 and Ha that you can test using the ANOVA table?
- What is your decision, based on the following ANOVA table? What is your conclusion?
31 Regression Analysis: CGPA versus Height, Gender, ...
The regression equation is
   CGPA = ... plus coefficients on Height, Gender, Haircut, Job, Studytime, Smokecig, Dated, HSGPA, HomeDist, BrowseInternet, WatchTV, Exercise, ReadNewsP, Vegan, PoliticalDegree, PoliticalAff
   (the numeric estimates did not survive transcription)
Predictor (Constant plus the 16 predictors above): Coef, SE Coef, T, P values lost
32 S = ...   R-Sq = 43.2%   R-Sq(adj) = 21.5%
Analysis of Variance (Regression, Residual Error, Total): DF, SS, MS, F, P values lost
33 Best Subsets Regression: CGPA versus Height, Gender, ...
Response is CGPA.
(Best-subsets table: for each number of variables it lists R-Sq, R-Sq(adj), Mallows Cp, and S, with an X marking which of the 16 predictors appear in each candidate model; the table itself did not survive transcription.)
35 Regression Analysis: CGPA versus HSGPA, Exercise
The regression equation is
   CGPA = ... + ... HSGPA + ... Exercise   (coefficient estimates did not survive transcription)
Predictor (Constant, HSGPA, Exercise): Coef, SE Coef, T, P values lost
S = ...   R-Sq = 31.6%   R-Sq(adj) = 29.2%
Analysis of Variance (Regression, Residual Error, Total): DF, SS, MS, F, P values lost
37 LOGISTIC REGRESSION
Y = categorical response (Yes/No) or binary response (1 or 0)
Example: Predict the probability that a person pays bills on time based on past credit history, income, employment, age, etc.
Example: Predict the probability that a person gets lung cancer based on smoking, family history, asthma, age, gender, race, eating habits, exercise habits, etc.
38 Logistic Regression Model (with 1 predictor variable):
p = exp(α + βx) / (1 + exp(α + βx))
Example: Whether a person has a travel credit card.
X = annual income (in thousand euros), y = 1 if yes, 0 if no.
(The partial dataset of income and y values did not survive transcription.)
39 Link Function: Logit
Response Information
Variable  Value  Count
y         1      31   (Event)
          0      69
Total            100
Logistic Regression Table
Predictor (Constant, income): Coef, SE Coef, Z, P values lost in transcription
40 Interpretations: Annual income is a good predictor of the probability of having a travel credit card; the probability of having a travel credit card increases with higher annual income (because of the positive sign of the coefficient).
41 Prediction Equation:
ˆp = exp(a + bX) / (1 + exp(a + bX)), with a = -3.52 (the fitted slope b did not survive transcription)
- Predict the probability that a person with annual income 12K (euros) has a travel credit card (answer: ˆp = 0.09).
- Predict the probability that a person with annual income 65K (euros) has a travel credit card (answer: ˆp = 0.97).
- The probability of having a travel credit card is 50% when X = -a/b (why?).
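A sketch of the prediction equation. The intercept follows the text (a = -3.52); the slope b = 0.105 is an assumed value, chosen only so the curve roughly reproduces the quoted answers (0.09 at 12K, 0.97 at 65K), since the fitted value did not survive transcription:

```python
import math

def logistic_p(a, b, x):
    # P(Y = 1) under the logistic model p = exp(a + b*x) / (1 + exp(a + b*x))
    z = a + b * x
    return math.exp(z) / (1.0 + math.exp(z))

# a from the text; b is an assumed, illustrative slope (see lead-in).
a, b = -3.52, 0.105
p12 = logistic_p(a, b, 12)   # low income -> small probability
p65 = logistic_p(a, b, 65)   # high income -> large probability
x50 = -a / b                 # income at which the probability is exactly 50%
```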
42 Multiple Logistic Regression:
Example: Predict marijuana use (Y/N) based on alcohol use (Y/N) and cigarette smoking (Y/N) for HS seniors.
Data: 2276 HS seniors in a non-urban area outside Dayton, Ohio.
(The frequency table of Marijuana by Cigarette by Alcohol did not survive transcription.)
43 Binary Logistic Regression: Marijuana versus Alcohol, Cigarette
Link Function: Logit
Response Information: Variable Marijuana, Value 1 (Event) and 0, with counts lost in transcription; Total 2276
Logistic Regression Table
Predictor (Constant, Alcohol, Cigarette): Coef, SE Coef, Z, P values lost in transcription
44 Predict the probability of using marijuana if:
- Alcohol use = Yes and Cigarette smoking = Yes:
  ˆp = exp(constant + b_alcohol + b_cigarette) / (1 + exp(constant + b_alcohol + b_cigarette))   (the numeric answer did not survive transcription)
- Alcohol use = No and Cigarette smoking = Yes:
  ˆp = exp(constant + b_cigarette) / (1 + exp(constant + b_cigarette)) = 0.079
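The same two predictions as a sketch. The coefficient values below are assumptions: this Dayton dataset is a textbook standard (Agresti), and these are the commonly published estimates, which are consistent with the 0.079 answer quoted above, but they were not read from this document's lost output:

```python
import math

def p_hat(const, b_alc, b_cig, alcohol, cigarette):
    # Multiple logistic model: logit(p) = const + b_alc*alcohol + b_cig*cigarette
    z = const + b_alc * alcohol + b_cig * cigarette
    return math.exp(z) / (1.0 + math.exp(z))

# Assumed coefficients (commonly published for this dataset, not from this output):
const, b_alc, b_cig = -5.31, 2.99, 2.85

p_yes_yes = p_hat(const, b_alc, b_cig, 1, 1)   # alcohol yes, cigarettes yes
p_no_yes = p_hat(const, b_alc, b_cig, 0, 1)    # alcohol no, cigarettes yes (~0.079)
```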
More informationModels with qualitative explanatory variables p216
Models with qualitative explanatory variables p216 Example gen = 1 for female Row gpa hsm gen 1 3.32 10 0 2 2.26 6 0 3 2.35 8 0 4 2.08 9 0 5 3.38 8 0 6 3.29 10 0 7 3.21 8 0 8 2.00 3 0 9 3.18 9 0 10 2.34
More informationClassification & Regression. Multicollinearity Intro to Nominal Data
Multicollinearity Intro to Nominal Let s Start With A Question y = β 0 + β 1 x 1 +β 2 x 2 y = Anxiety Level x 1 = heart rate x 2 = recorded pulse Since we can all agree heart rate and pulse are related,
More informationRegression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.
Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate
More informationFinal Review. Yang Feng. Yang Feng (Columbia University) Final Review 1 / 58
Final Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Final Review 1 / 58 Outline 1 Multiple Linear Regression (Estimation, Inference) 2 Special Topics for Multiple
More informationECON3150/4150 Spring 2015
ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2
More informationREVIEW 8/2/2017 陈芳华东师大英语系
REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p
More informationApplied Regression Analysis. Section 2: Multiple Linear Regression
Applied Regression Analysis Section 2: Multiple Linear Regression 1 The Multiple Regression Model Many problems involve more than one independent variable or factor which affects the dependent or response
More informationPART I. (a) Describe all the assumptions for a normal error regression model with one predictor variable,
Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/2 01 Examination Date Time Pages Final December 2002 3 hours 6 Instructors Course Examiner Marks Y.P.
More informationSTAT 3900/4950 MIDTERM TWO Name: Spring, 2015 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis
STAT 3900/4950 MIDTERM TWO Name: Spring, 205 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis Instructions: You may use your books, notes, and SPSS/SAS. NO
More informationassumes a linear relationship between mean of Y and the X s with additive normal errors the errors are assumed to be a sample from N(0, σ 2 )
Multiple Linear Regression is used to relate a continuous response (or dependent) variable Y to several explanatory (or independent) (or predictor) variables X 1, X 2,, X k assumes a linear relationship
More informationSociology 593 Exam 2 March 28, 2002
Sociology 59 Exam March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably means that
More informationTable of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).
Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X.04) =.8508. For z < 0 subtract the value from,
More informationProblem Set 10: Panel Data
Problem Set 10: Panel Data 1. Read in the data set, e11panel1.dta from the course website. This contains data on a sample or 1252 men and women who were asked about their hourly wage in two years, 2005
More information10. Alternative case influence statistics
10. Alternative case influence statistics a. Alternative to D i : dffits i (and others) b. Alternative to studres i : externally-studentized residual c. Suggestion: use whatever is convenient with the
More informationMultiple Regression and Model Building Lecture 20 1 May 2006 R. Ryznar
Multiple Regression and Model Building 11.220 Lecture 20 1 May 2006 R. Ryznar Building Models: Making Sure the Assumptions Hold 1. There is a linear relationship between the explanatory (independent) variable(s)
More informationANOVA: Analysis of Variation
ANOVA: Analysis of Variation The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative variables depend on which group (given by categorical
More informationCh 11- One Way Analysis of Variance
Multiple Choice Questions Ch 11- One Way Analysis of Variance Use the following to solve questions 1 &. Suppose n = 8 and there are 4 groups, how many between groups (samples) degrees of freedom are there?
More information5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is
Practice Final Exam Last Name:, First Name:. Please write LEGIBLY. Answer all questions on this exam in the space provided (you may use the back of any page if you need more space). Show all work but do
More informationACOVA and Interactions
Chapter 15 ACOVA and Interactions Analysis of covariance (ACOVA) incorporates one or more regression variables into an analysis of variance. As such, we can think of it as analogous to the two-way ANOVA
More informationAnswer Key: Problem Set 5
: Problem Set 5. Let nopc be a dummy variable equal to one if the student does not own a PC, and zero otherwise. i. If nopc is used instead of PC in the model of: colgpa = β + δ PC + β hsgpa + β ACT +
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More information