Unit 2 Regression and Correlation Practice Problems. SOLUTIONS Version STATA
|
|
- Judith Tyler
- 6 years ago
- Views:
Transcription
1 PubHlth 640. Regression and Correlation Page 1 of 19 Unit Regression and Correlation Practice Problems SOLUTIONS Version STATA 1. A regression analysis of measurements of a dependent variable Y on an independent variable X produces a statistically significant association between X and Y. Drawing upon your education in introductory biostatistics, the theory of epidemiology, the scientific method, etc see how many explanations you can offer for this finding. Hint I get seven (7) (i) X causes Y Example: X=smoking Y=cancer (ii) Y causes X Example: X=cancer Y=smoking (iii) Positive confounding Example: X=alcohol Y=cancer Confounder = Z = smoking (iv) Necessary but not sufficient Example: X=sun exposure Y=melanoma Sun exposure alone does not cause melanoma. Melanoma is the result of a gene-environment interaction. (v) Intermediary Example: X = asthma Y=lesions on the lung X=asthma is an intermediary in the pathway coal exposure asthma lesions on lung (vi) Confounding Example: X=yellow finger Y=lung cancer Z=smoking causes both yellow finger and lung cancer. (vii) Anomoly There is no association. An event of low probability has occurred.
2 PubHlth 640. Regression and Correlation Page of 19. Below is a figure summarizing some data for which a simple linear regression analysis has been performed. The point denoted X that appears on the line is (x,y). The two points indicated by open circles were NOT included in the original analysis.
3 PubHlth 640. Regression and Correlation Page 3 of 19 Multiple Choice (Choose ONE): Suppose the two points represented by the open circles are added to the data set and the regression analysis is repeated. What is the effect of adding these points on: 1. The estimated slope. (a) increase (b) decrease X (c) no change. The residual sum of squares. (a) increase (b) decrease X_ (c) no change 3. The degrees of freedom. X (a) increase (b) decrease (c) no change 4. Standard error of the estimated slope. (a) increase X (b) decrease (c) no change 5. The predicted value of y at x=4. (a) increase (b) decrease X (c) no change
4 PubHlth 640. Regression and Correlation Page 4 of The course website page REGRESSION AND CORRELATION will have some examples of code to produce regression analyses in STATA and SAS. The data in the table below are values of boiling points (Y) and temperature (X). Carry out an exploratory analysis to determine whether the relationship between temperature and boiling point is better represented using (i) (ii) Y = β 0+β1X or ' ' 100 log (Y) = β +β X In developing your answer, use whatever statistical software you like (SAS, STATA, or Minitab). Try your hand at producing (a) Estimates of the regression line parameters (b) Analysis of variance tables (c) R (d) Scatter plot with overlay of fitted line. Complete your answer with a one paragraph text that is an interpretation of your work. Take your time with this and have fun. X=Temp Y=Boiling Pt X=Temp Y=Boiling Pt X=Temp Y=Boiling Pt
5 PubHlth 640. Regression and Correlation Page 5 of 19 Solution (one paragraph of text that is interpretation of analysis): Did you notice that the scatter plot of these data reveal two outlying values? Their inclusion may or may not be appropriate. If all n=31 data points are included in the analysis, then the model that explains more of the variability in boiling point is Y=boiling point modeled linearly in X=temperature. It has a greater R (9% v 89%). Be careful - It would not make sense to compare the residual mean squares of the two models because the scales of measurement involved are different. (a). Estimates of Regression Line Parameters i. Y= *X ii. 100log 10 (Y)= *X Table 1. Parameters estimations for dependent = Boiling Point Parameter Estimates Parameter Standard Variable Label DF Estimate Error t Value Pr > t Intercept Intercept <.0001 Temp Temperature <.0001 Table. Parameters estimations for dependent = 100 log 10 (Boiling Point) Parameter Estimates Parameter Standard Variable Label DF Estimate Error t Value Pr > t Intercept Intercept Temp Temperature <.0001 (b). Analysis of Variance Tables Y=Boiling Point Analysis of Variance Sum of Mean Source DF Squares Square F Value Pr > F Model <.0001 Error Corrected Total
6 PubHlth 640. Regression and Correlation Page 6 of 19 Y = 100 log 10 (Boiling Point) Analysis of Variance Sum of Mean Source DF Squares Square F Value Pr > F Model <.0001 Error Corrected Total (c). R Y=Boiling Point: R = Y = 100 log 10 (Boiling Point) R = (d) Scatterplot with Overlay of Fitted Line Y=Boiling Point
7 PubHlth 640. Regression and Correlation Page 7 of 19 Y = 100 log 10 (Boiling Point)
8 PubHlth 640. Regression and Correlation Page 8 of 19 For STATA users (version 10.1) *comments begin with an asterisk and are shown in green Read in data and create NEWY = 100 log 10 (BOILING). * toggle off the screen by screen pausing of results. set more off. * read into memory the data for exercise #3. It is named week0.dta. use " * Use the command GENERATE to create a new variable called newy. generate newy=100*log10(boiling) Model 1: Y=Boiling Point Least squares estimation and analysis of variance table. regress boiling temp You should see.. Source SS df MS Number of obs = F( 1, 9) = Model Prob > F = Residual R-squared = Adj R-squared = Total Root MSE = boiling Coef. Std. Err. t P> t [95% Conf. Interval] temp _cons Scatter plot with overlay of fitted line.. graph twoway (scatter boiling temp, msymbol(d)) (lfit boiling temp), title("simple Linear Regression") subtitle("y=boiling v X=temp") note("unit_graph01.png")
9 PubHlth 640. Regression and Correlation Page 9 of 19 You should see.. Model : NEWY=100*log10(Boiling Point) Least squares estimation and analysis of variance table. regress newy temp You should see.. Source SS df MS Number of obs = F( 1, 9) = 3.69 Model Prob > F = Residual R-squared = Adj R-squared = Total Root MSE = newy Coef. Std. Err. t P> t [95% Conf. Interval] temp _cons
10 PubHlth 640. Regression and Correlation Page 10 of 19 Scatter plot with overlay of fitted line. Note - This requires saving the predicted values. Here I ve saved them as hat_temp. graph twoway (scatter newy temp, msymbol(d)) (lfit newy temp), title("simple Linear Regression") subtitle("y=100*log10(boiling) v X=temp") note("unit_graph0.png") You should see..
11 PubHlth 640. Regression and Correlation Page 11 of A psychiatrist wants to know whether the level of pathology (Y) in psychotic patients 6 months after treatment could be predicted with reasonable accuracy from knowledge of pretreatment symptom ratings of thinking disturbance (X 1 ) and hostile suspiciousness (X ). (a) The least squares estimation equation involving both independent variables is given by Y = (X 1 ) 7.147(X ) Using this equation, determine the predicted level of pathology (Y) for a patient with pretreatment scores of.80 on thinking disturbance and 7.0 on hostile suspiciousness. How does the predicted value obtained compare with the actual value of 5 observed for this patient? Y = X X X1 X with =.80 and =7.0 Y = This value is lower than the observed value of 5 (b) Using the analysis of variance tables below, carry out the overall regression F tests for models containing both X 1 and X, X 1 alone, and X alone. Source DF SS Regression on X Residual Source DF SS Regression on X Residual
12 PubHlth 640. Regression and Correlation Page 1 of 19 Source DF SS Regression on X 1, X 784 Residual Model Containing X 1 and X F,784 / = = ,008/ 50 on DF=,50 p-value= Conclude linear model in X 1 and X explains statistically significantly more of the variability in level of pathology (Y) than is explained by Y (the intercept model) alone. Model Containing X 1 ALONE F 1546 /1 = = ,46 / 51 on DF=1,51 p-value= Conclude linear model in X 1 explains statistically significantly more of the variability in level of pathology (Y) than is explained by Y (the intercept model) alone. Model Containing X ALONE F 160 /1 = = ,63 / 51 on DF=1,51 p-value= Conclude linear model in X does NOT explain statistically significantly more of the variability in level of pathology (Y) than is explained by Y (the intercept model) alone.
13 PubHlth 640. Regression and Correlation Page 13 of 19 (c) Based on your results in part (b), how would you rate the importance of the two variables in predicting Y? X explains a significant proportion of the variability in Y when modelled as a linear predictor. 1 X does not. (However, we don't know if a different functional form might have been important.) (d) What are the R values for the three regressions referred to in part (b)? Total SSQ= (Regression SSQ) + (Regression SSQ) is constant. Therefore total SSQ can be calculated from just one anova table: Total (SSQ)= 1, ,46 = 13,79 ( ) R X1 only = (Re gression SSQ)/(Total SSQ) R (X only) = (160)/(13,79) = ( ) ( ) = (1546)/(13,79) = R X1and X = 784 /(13, 79) = (e) What is the best model involving either one or both of the two independent variables? Eliminate from consideration model with X only. Compare model with X1 alone versus X1 and X using partial F test. Re gression SSQ / DF ( ) /1 Partial F = = Re sidual SSQ( full) / DF (11,008) / 50 = on DF=1,50 P-value = Most appropriate model includes X and X 1
14 PubHlth 640. Regression and Correlation Page 14 of In an experiment to describe the toxic action of a certain chemical on silkworm larvae, the relationship of log 10 (dose) and log 10 (larva weight) to log 10 (survival) was sought. The data, obtained by feeding each larva a precisely measured dose of the chemical in an aqueous solution and then recording the survival time (ie time until death) are given in the table. Also given are relevant computer results and the analysis of variance table. Larva Y = log 10 (survival time) X 1 =log 10 (dose) X =log 10 (weight) Larva Y = log 10 (survival time) X 1 =log 10 (dose) X =log 10 (weight) Y = (X 1 ) Y = (X ) Y = (X 1 ) + ).871 (X ) Source DF SS Regression on X Residual Source DF SS Regression on X Residual Source DF SS Regression on X 1, X Residual
15 PubHlth 640. Regression and Correlation Page 15 of 19 (a) Test for the significance of the overall regression involving both independent variables X 1 and X. X1and X (0.464) / F = = (0.0471) /1 on DF =,1 P value < (b) Test to see whether using X 1 alone significantly helps in predicting survival time. X1 alone (0.3633) /1 F = = on DF = 1,13 (0.1480) /13 P value = (c) Test to see whether using X alone significantly helps in predicting survival time. X alone (0.3367) /1 F = = 5.07 (0.1746) /13 on = 1,13 P value = (d) Compute R for each of the three models. TotalSSQ = ( ) R X1 and X = / = R (X1 alone) = / = ( ) R X alone = / = (e) Which independent predictor do you consider to be the best single predictor of survival time? Based on overall F test and comparison of, the single predictor model containing X is better. R 1
16 PubHlth 640. Regression and Correlation Page 16 of 19 (f) Which model involving one or both of the independent predictors do you prefer and why? Partial F for comparing model with X1alone versus model with X1 an d X ( gressionssq) ( sidual SSQ) ( ) Re / Re g DF /1 = = Re / Re sidual DF /1 = on DF=1,1 P-value= Choose model with both X and X 1 6. Using whatever software package you like, try your hand at reproducing the analysis of variance tables you worked with in problem #5. For Stata users Green comments (note: comments begin with an asterisk) Black stata syntax Blue output (note: I have shown only the analysis of variance table portions of the output). * Use FILE > OPEN to read in data from week03.dta. use " * Anova table for X1. regress y x1 Source SS df MS Number of obs = F( 1, 13) = Model Prob > F = Residual R-squared = Adj R-squared = Total Root MSE = * Anova table for X. regress y x Source SS df MS Number of obs = F( 1, 13) = 5.07 Model Prob > F = Residual R-squared = Adj R-squared = 0.63 Total Root MSE =.11589
17 PubHlth 640. Regression and Correlation Page 17 of 19. * Anova table for X1 and X. regress y x1 x Source SS df MS Number of obs = F(, 1) = Model Prob > F = Residual R-squared = Adj R-squared = Total Root MSE = An educator examined the relationship between number of hours devoted to reading each week (Y) and the independent variables social class (X 1 ), number of years school completed (X ), and reading speed measured by pages read per hour (X 3 ). The analysis of variance table obtained from a stepwise regression analysis on data for a sample of 19 women over the age of 60 is shown. Source DF SSQ Regression (X 3 ) (X X 3 ) (X 1 X,X 3 ) Residual (a) Test the significance of each variable as it enters the model. Total SSQ = X 3 Step 1 Enters Regression SSQ = on DF=1 Residual SSQ = ( ) + (37.98) + ( )= F= / /17 = on DF=1,17 P-value = on DF=17 Step X 3 already in X Enters Additional Regression SSQ = on DF=1 Residual SSQ = ( ) + (37.98)=401.8 on DF= /1 F= /16 P-value = = 7.36 on DF=1,16 Step 3 X and X 3 already exist in X 1 Enters Additional Regression SSQ = on DF=1 Residual SSQ = on DF=15
18 PubHlth 640. Regression and Correlation Page 18 of /1 F= 363.3/15 P-value=0.964 = on DF=1,15 (b) Test H O : β 1 = β = 0 in the model Y = β 0 + β 1 X 1 + β X + β 3 X 3 + E. Models to Compare: X alone versus (X 1, X and X 3) 3 Partial F = Re gression SSQ / Re g DF Re sidual SSQ / Re s DF P-value= ( + ) / = = / on DF (c) Why can t we test H O : β 1 = β 3 = 0 using the ANOVA table given? What formula would you use for this test? The regression SSQ for the model containing X alone is not available. Re g SSQ(mod el w X1and X and X 3) Re g SSQ( X alone) / Partial F = Re sid SSQ( X, X and X ) / (d) What is your overall evaluation concerning the appropriate model to use given the results in parts (a) and (b)? The most appropriate model is the one with two predictors, X3and X ( R = ). 1 The additional predictive information in X (change in R = 0.031) is not statistically significant (p=0.3)
19 PubHlth 640. Regression and Correlation Page 19 of Consider the following analysis of variance table. Source DF SS Regression (X 1 ) 1 18, (X 3 X 1 ) 1 7, (X X 1,X 3 ) Residual 16,48.3 8,.3 Using a type I error of 0.05, (a) Provide a test to compare the following two models: Y = β 0 + β 1 X 1 + β X + β 3 X 3 + E. VERSUS Y = β 0 + β 1 X 1 + E. RegSSQ(X 1, X,X 3)-RegSSQ(X 1)/ Partial F= ResSSQ(X,X X )/ P-value=< ( )/ = = 4.983on DF=,16 (48.3)/16 (b) Provide a test to compare the following two models: Y = β 0 + β 1 X 1 + β 3 X 3 + E. VERSUS Y = β 0 + E. RegSSQ(X 1,X )/ (18, )/ Overall F= = ResSSQ(X,X )/17 (, )/ P-value< = on DF=,17 (c) State which two models are being compared in computing: (18, , )/3 F = (48.3)/16 Y = β + β X + β X + β X + E Y= β + E versus
BIOSTATS 640 Spring 2018 Unit 2. Regression and Correlation (Part 1 of 2) STATA Users
Unit Regression and Correlation 1 of - Practice Problems Solutions Stata Users 1. In this exercise, you will gain some practice doing a simple linear regression using a Stata data set called week0.dta.
More informationBIOSTATS 640 Spring 2018 Unit 2. Regression and Correlation (Part 1 of 2) R Users
BIOSTATS 640 Spring 08 Unit. Regression and Correlation (Part of ) R Users Unit Regression and Correlation of - Practice Problems Solutions R Users. In this exercise, you will gain some practice doing
More informationUnit 2 Regression and Correlation WEEK 3 - Practice Problems. Due: Wednesday February 10, 2016
BIOSTATS 640 Spring 2016 2. Regression and Correlation Page 1 of 5 Unit 2 Regression and Correlation WEEK 3 - Practice Problems Due: Wednesday February 10, 2016 1. A psychiatrist wants to know whether
More informationSTATISTICS 110/201 PRACTICE FINAL EXAM
STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable
More information1 A Review of Correlation and Regression
1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then
More informationGeneral Linear Model (Chapter 4)
General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients
More informationLinear Modelling in Stata Session 6: Further Topics in Linear Modelling
Linear Modelling in Stata Session 6: Further Topics in Linear Modelling Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 14/11/2017 This Week Categorical Variables Categorical
More informationAcknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression
INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical
More informationUnit 9 Regression and Correlation Homework #14 (Unit 9 Regression and Correlation) SOLUTIONS. X = cigarette consumption (per capita in 1930)
BIOSTATS 540 Fall 2015 Introductory Biostatistics Page 1 of 10 Unit 9 Regression and Correlation Homework #14 (Unit 9 Regression and Correlation) SOLUTIONS Consider the following study of the relationship
More informationName: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam. June 8 th, 2016: 9am to 1pm
Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam June 8 th, 2016: 9am to 1pm Instructions: 1. This is exam is to be completed independently. Do not discuss your work with
More informationCorrelation and Simple Linear Regression
Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline
More informationSection Least Squares Regression
Section 2.3 - Least Squares Regression Statistics 104 Autumn 2004 Copyright c 2004 by Mark E. Irwin Regression Correlation gives us a strength of a linear relationship is, but it doesn t tell us what it
More informationLab 10 - Binary Variables
Lab 10 - Binary Variables Spring 2017 Contents 1 Introduction 1 2 SLR on a Dummy 2 3 MLR with binary independent variables 3 3.1 MLR with a Dummy: different intercepts, same slope................. 4 3.2
More informationEssential of Simple regression
Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship
More informationLecture 3: Multiple Regression. Prof. Sharyn O Halloran Sustainable Development U9611 Econometrics II
Lecture 3: Multiple Regression Prof. Sharyn O Halloran Sustainable Development Econometrics II Outline Basics of Multiple Regression Dummy Variables Interactive terms Curvilinear models Review Strategies
More informationECO220Y Simple Regression: Testing the Slope
ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x
More informationStatistical Modelling in Stata 5: Linear Models
Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does
More informationModel Building Chap 5 p251
Model Building Chap 5 p251 Models with one qualitative variable, 5.7 p277 Example 4 Colours : Blue, Green, Lemon Yellow and white Row Blue Green Lemon Insects trapped 1 0 0 1 45 2 0 0 1 59 3 0 0 1 48 4
More informationProblem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics
Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics C1.1 Use the data set Wage1.dta to answer the following questions. Estimate regression equation wage =
More informationSociology Exam 2 Answer Key March 30, 2012
Sociology 63993 Exam 2 Answer Key March 30, 2012 I. True-False. (20 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. A researcher has constructed scales
More informationProblem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval]
Problem Set #3-Key Sonoma State University Economics 317- Introduction to Econometrics Dr. Cuellar 1. Use the data set Wage1.dta to answer the following questions. a. For the regression model Wage i =
More informationsociology 362 regression
sociology 36 regression Regression is a means of modeling how the conditional distribution of a response variable (say, Y) varies for different values of one or more independent explanatory variables (say,
More informationLab 6 - Simple Regression
Lab 6 - Simple Regression Spring 2017 Contents 1 Thinking About Regression 2 2 Regression Output 3 3 Fitted Values 5 4 Residuals 6 5 Functional Forms 8 Updated from Stata tutorials provided by Prof. Cichello
More informationSTAT 350 Final (new Material) Review Problems Key Spring 2016
1. The editor of a statistics textbook would like to plan for the next edition. A key variable is the number of pages that will be in the final version. Text files are prepared by the authors using LaTeX,
More informationProblem Set 1 ANSWERS
Economics 20 Prof. Patricia M. Anderson Problem Set 1 ANSWERS Part I. Multiple Choice Problems 1. If X and Z are two random variables, then E[X-Z] is d. E[X] E[Z] This is just a simple application of one
More informationSOCY5601 Handout 8, Fall DETECTING CURVILINEARITY (continued) CONDITIONAL EFFECTS PLOTS
SOCY5601 DETECTING CURVILINEARITY (continued) CONDITIONAL EFFECTS PLOTS More on use of X 2 terms to detect curvilinearity: As we have said, a quick way to detect curvilinearity in the relationship between
More informationECON3150/4150 Spring 2015
ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2
More informationEmpirical Application of Simple Regression (Chapter 2)
Empirical Application of Simple Regression (Chapter 2) 1. The data file is House Data, which can be downloaded from my webpage. 2. Use stata menu File Import Excel Spreadsheet to read the data. Don t forget
More information1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e
Economics 102: Analysis of Economic Data Cameron Spring 2016 Department of Economics, U.C.-Davis Final Exam (A) Tuesday June 7 Compulsory. Closed book. Total of 58 points and worth 45% of course grade.
More informationSoc 63993, Homework #7 Answer Key: Nonlinear effects/ Intro to path analysis
Soc 63993, Homework #7 Answer Key: Nonlinear effects/ Intro to path analysis Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Problem 1. The files
More informationsociology sociology Scatterplots Quantitative Research Methods: Introduction to correlation and regression Age vs Income
Scatterplots Quantitative Research Methods: Introduction to correlation and regression Scatterplots can be considered as interval/ratio analogue of cross-tabs: arbitrarily many values mapped out in -dimensions
More informationECON3150/4150 Spring 2016
ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression
More informationInterpreting coefficients for transformed variables
Interpreting coefficients for transformed variables! Recall that when both independent and dependent variables are untransformed, an estimated coefficient represents the change in the dependent variable
More informationAnalysis of repeated measurements (KLMED8008)
Analysis of repeated measurements (KLMED8008) Eirik Skogvoll, MD PhD Professor and Consultant Institute of Circulation and Medical Imaging Dept. of Anaesthesiology and Emergency Medicine 1 Day 2 Practical
More information2.1. Consider the following production function, known in the literature as the transcendental production function (TPF).
CHAPTER Functional Forms of Regression Models.1. Consider the following production function, known in the literature as the transcendental production function (TPF). Q i B 1 L B i K i B 3 e B L B K 4 i
More informationsociology 362 regression
sociology 36 regression Regression is a means of studying how the conditional distribution of a response variable (say, Y) varies for different values of one or more independent explanatory variables (say,
More informationInteraction effects between continuous variables (Optional)
Interaction effects between continuous variables (Optional) Richard Williams, University of Notre Dame, https://www.nd.edu/~rwilliam/ Last revised February 0, 05 This is a very brief overview of this somewhat
More informationCorrelation and regression. Correlation and regression analysis. Measures of association. Why bother? Positive linear relationship
1 Correlation and regression analsis 12 Januar 2009 Monda, 14.00-16.00 (C1058) Frank Haege Department of Politics and Public Administration Universit of Limerick frank.haege@ul.ie www.frankhaege.eu Correlation
More informationSix Sigma Black Belt Study Guides
Six Sigma Black Belt Study Guides 1 www.pmtutor.org Powered by POeT Solvers Limited. Analyze Correlation and Regression Analysis 2 www.pmtutor.org Powered by POeT Solvers Limited. Variables and relationships
More informationAnswer all questions from part I. Answer two question from part II.a, and one question from part II.b.
B203: Quantitative Methods Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. Part I: Compulsory Questions. Answer all questions. Each question carries
More informationSection I. Define or explain the following terms (3 points each) 1. centered vs. uncentered 2 R - 2. Frisch theorem -
First Exam: Economics 388, Econometrics Spring 006 in R. Butler s class YOUR NAME: Section I (30 points) Questions 1-10 (3 points each) Section II (40 points) Questions 11-15 (10 points each) Section III
More informationEconometrics Midterm Examination Answers
Econometrics Midterm Examination Answers March 4, 204. Question (35 points) Answer the following short questions. (i) De ne what is an unbiased estimator. Show that X is an unbiased estimator for E(X i
More informationespecially with continuous
Handling interactions in Stata, especially with continuous predictors Patrick Royston & Willi Sauerbrei UK Stata Users meeting, London, 13-14 September 2012 Interactions general concepts General idea of
More informationSplineLinear.doc 1 # 9 Last save: Saturday, 9. December 2006
SplineLinear.doc 1 # 9 Problem:... 2 Objective... 2 Reformulate... 2 Wording... 2 Simulating an example... 3 SPSS 13... 4 Substituting the indicator function... 4 SPSS-Syntax... 4 Remark... 4 Result...
More informationEconometrics Homework 1
Econometrics Homework Due Date: March, 24. by This problem set includes questions for Lecture -4 covered before midterm exam. Question Let z be a random column vector of size 3 : z = @ (a) Write out z
More informationProblem Set 10: Panel Data
Problem Set 10: Panel Data 1. Read in the data set, e11panel1.dta from the course website. This contains data on a sample or 1252 men and women who were asked about their hourly wage in two years, 2005
More information9. Linear Regression and Correlation
9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,
More information1. The shoe size of five randomly selected men in the class is 7, 7.5, 6, 6.5 the shoe size of 4 randomly selected women is 6, 5.
Economics 3 Introduction to Econometrics Winter 2004 Professor Dobkin Name Final Exam (Sample) You must answer all the questions. The exam is closed book and closed notes you may use calculators. You must
More informationECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor
ECON4150 - Introductory Econometrics Lecture 4: Linear Regression with One Regressor Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 4 Lecture outline 2 The OLS estimators The effect of
More informationInference with Simple Regression
1 Introduction Inference with Simple Regression Alan B. Gelder 06E:071, The University of Iowa 1 Moving to infinite means: In this course we have seen one-mean problems, twomean problems, and problems
More information5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is
Practice Final Exam Last Name:, First Name:. Please write LEGIBLY. Answer all questions on this exam in the space provided (you may use the back of any page if you need more space). Show all work but do
More informationLecture (chapter 13): Association between variables measured at the interval-ratio level
Lecture (chapter 13): Association between variables measured at the interval-ratio level Ernesto F. L. Amaral April 9 11, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015.
More information1 Warm-Up: 2 Adjusted R 2. Introductory Applied Econometrics EEP/IAS 118 Spring Sylvan Herskowitz Section #
Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Sylvan Herskowitz Section #10 4-1-15 1 Warm-Up: Remember that exam you took before break? We had a question that said this A researcher wants to
More informationPubHlth 540 Fall Summarizing Data Page 1 of 18. Unit 1 - Summarizing Data Practice Problems. Solutions
PubHlth 50 Fall 0. Summarizing Data Page of 8 Unit - Summarizing Data Practice Problems Solutions #. a. Qualitative - ordinal b. Qualitative - nominal c. Quantitative continuous, ratio d. Qualitative -
More informationIntroduction to Regression
Introduction to Regression ιατµηµατικό Πρόγραµµα Μεταπτυχιακών Σπουδών Τεχνο-Οικονοµικά Συστήµατα ηµήτρης Φουσκάκης Introduction Basic idea: Use data to identify relationships among variables and use these
More informationConfidence Interval for the mean response
Week 3: Prediction and Confidence Intervals at specified x. Testing lack of fit with replicates at some x's. Inference for the correlation. Introduction to regression with several explanatory variables.
More informationSTAT 3022 Spring 2007
Simple Linear Regression Example These commands reproduce what we did in class. You should enter these in R and see what they do. Start by typing > set.seed(42) to reset the random number generator so
More informationHomework Solutions Applied Logistic Regression
Homework Solutions Applied Logistic Regression WEEK 6 Exercise 1 From the ICU data, use as the outcome variable vital status (STA) and CPR prior to ICU admission (CPR) as a covariate. (a) Demonstrate that
More informationLecture 12: Interactions and Splines
Lecture 12: Interactions and Splines Sandy Eckel seckel@jhsph.edu 12 May 2007 1 Definition Effect Modification The phenomenon in which the relationship between the primary predictor and outcome varies
More informationInference for the Regression Coefficient
Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates
More informationECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests
ECON4150 - Introductory Econometrics Lecture 7: OLS with Multiple Regressors Hypotheses tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 7 Lecture outline 2 Hypothesis test for single
More informationQuestion 1a 1b 1c 1d 1e 2a 2b 2c 2d 2e 2f 3a 3b 3c 3d 3e 3f M ult: choice Points
Economics 102: Analysis of Economic Data Cameron Spring 2016 May 12 Department of Economics, U.C.-Davis Second Midterm Exam (Version A) Compulsory. Closed book. Total of 30 points and worth 22.5% of course
More informationUnit 2. Regression and Correlation
BIOSTATS 640 - Spring 016. Regression and Correlation Page 1 of 88 Unit. Regression and Correlation Don t let us quarrel, the White Queen said in an anxious tone. What is the cause of lighting? The cause
More information1 The basics of panel data
Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Related materials: Steven Buck Notes to accompany fixed effects material 4-16-14 ˆ Wooldridge 5e, Ch. 1.3: The Structure of Economic Data ˆ Wooldridge
More informationExam Applied Statistical Regression. Good Luck!
Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.
More informationS o c i o l o g y E x a m 2 A n s w e r K e y - D R A F T M a r c h 2 7,
S o c i o l o g y 63993 E x a m 2 A n s w e r K e y - D R A F T M a r c h 2 7, 2 0 0 9 I. True-False. (20 points) Indicate whether the following statements are true or false. If false, briefly explain
More informationCAMPBELL COLLABORATION
CAMPBELL COLLABORATION Random and Mixed-effects Modeling C Training Materials 1 Overview Effect-size estimates Random-effects model Mixed model C Training Materials Effect sizes Suppose we have computed
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationWorking with Stata Inference on the mean
Working with Stata Inference on the mean Nicola Orsini Biostatistics Team Department of Public Health Sciences Karolinska Institutet Dataset: hyponatremia.dta Motivating example Outcome: Serum sodium concentration,
More informationUNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016
UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016 Work all problems. 60 points are needed to pass at the Masters Level and 75 to pass at the
More informationLab 07 Introduction to Econometrics
Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand
More informationSociology 63993, Exam 2 Answer Key [DRAFT] March 27, 2015 Richard Williams, University of Notre Dame,
Sociology 63993, Exam 2 Answer Key [DRAFT] March 27, 2015 Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ I. True-False. (20 points) Indicate whether the following statements
More informationChapter 14. Multiple Regression Models. Multiple Regression Models. Multiple Regression Models
Chapter 14 Multiple Regression Models 1 Multiple Regression Models A general additive multiple regression model, which relates a dependent variable y to k predictor variables,,, is given by the model equation
More informationSpecification Error: Omitted and Extraneous Variables
Specification Error: Omitted and Extraneous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 5, 05 Omitted variable bias. Suppose that the correct
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationA discussion on multiple regression models
A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value
More informationNonlinear relationships Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 20, 2015
Nonlinear relationships Richard Williams, University of Notre Dame, https://www.nd.edu/~rwilliam/ Last revised February, 5 Sources: Berry & Feldman s Multiple Regression in Practice 985; Pindyck and Rubinfeld
More informationThe simple linear regression model discussed in Chapter 13 was written as
1519T_c14 03/27/2006 07:28 AM Page 614 Chapter Jose Luis Pelaez Inc/Blend Images/Getty Images, Inc./Getty Images, Inc. 14 Multiple Regression 14.1 Multiple Regression Analysis 14.2 Assumptions of the Multiple
More informationSTA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6
STA 8 Applied Linear Models: Regression Analysis Spring 011 Solution for Homework #6 6. a) = 11 1 31 41 51 1 3 4 5 11 1 31 41 51 β = β1 β β 3 b) = 1 1 1 1 1 11 1 31 41 51 1 3 4 5 β = β 0 β1 β 6.15 a) Stem-and-leaf
More informationLINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises
LINEAR REGRESSION ANALYSIS MODULE XVI Lecture - 44 Exercises Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Exercise 1 The following data has been obtained on
More informationAP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation
Scatterplots and Correlation Name Hr A scatterplot shows the relationship between two quantitative variables measured on the same individuals. variable (y) measures an outcome of a study variable (x) may
More informationRegression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph.
Regression, Part I I. Difference from correlation. II. Basic idea: A) Correlation describes the relationship between two variables, where neither is independent or a predictor. - In correlation, it would
More information171:162 Design and Analysis of Biomedical Studies, Summer 2011 Exam #3, July 16th
Name 171:162 Design and Analysis of Biomedical Studies, Summer 2011 Exam #3, July 16th Use the selected SAS output to help you answer the questions. The SAS output is all at the back of the exam on pages
More informationAnalysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.
Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a
More information1 Introduction to Minitab
1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you
More informationEconomics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2
Economics 326 Methods of Empirical Research in Economics Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Vadim Marmer University of British Columbia May 5, 2010 Multiple restrictions
More informationECON3150/4150 Spring 2016
ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and
More informationAnalysis of Bivariate Data
Analysis of Bivariate Data Data Two Quantitative variables GPA and GAES Interest rates and indices Tax and fund allocation Population size and prison population Bivariate data (x,y) Case corr® 2 Independent
More informationINFERENCE FOR REGRESSION
CHAPTER 3 INFERENCE FOR REGRESSION OVERVIEW In Chapter 5 of the textbook, we first encountered regression. The assumptions that describe the regression model we use in this chapter are the following. We
More informationInference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58
Inference ME104: Linear Regression Analysis Kenneth Benoit August 15, 2012 August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Stata output resvisited. reg votes1st spend_total incumb minister
More informationThis module focuses on the logic of ANOVA with special attention given to variance components and the relationship between ANOVA and regression.
WISE ANOVA and Regression Lab Introduction to the WISE Correlation/Regression and ANOVA Applet This module focuses on the logic of ANOVA with special attention given to variance components and the relationship
More informationLecture 18: Simple Linear Regression
Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength
More information9) Time series econometrics
30C00200 Econometrics 9) Time series econometrics Timo Kuosmanen Professor Management Science http://nomepre.net/index.php/timokuosmanen 1 Macroeconomic data: GDP Inflation rate Examples of time series
More informationLecture 4: Multivariate Regression, Part 2
Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above
More information8 Analysis of Covariance
8 Analysis of Covariance Let us recall our previous one-way ANOVA problem, where we compared the mean birth weight (weight) for children in three groups defined by the mother s smoking habits. The three
More informationEconometrics. 8) Instrumental variables
30C00200 Econometrics 8) Instrumental variables Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Thery of IV regression Overidentification Two-stage least squates
More informationREVIEW 8/2/2017 陈芳华东师大英语系
REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p
More informationECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests
ECON4150 - Introductory Econometrics Lecture 5: OLS with One Regressor: Hypothesis Tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 5 Lecture outline 2 Testing Hypotheses about one
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More informationMeta-analysis of epidemiological dose-response studies
Meta-analysis of epidemiological dose-response studies Nicola Orsini 2nd Italian Stata Users Group meeting October 10-11, 2005 Institute Environmental Medicine, Karolinska Institutet Rino Bellocco Dept.
More information