Bivariate and Multiple Linear Regression (SECOND PART)

ACADEMIC YEAR 2013/2014
Università degli Studi di Milano
GRADUATE SCHOOL IN SOCIAL AND POLITICAL SCIENCES
APPLIED MULTIVARIATE ANALYSIS
Luigi Curini
Do not quote without the author's permission

5. Multiple Linear Regressions with continuous variables

There is always the possibility that alternative causes (rival explanations) are at work, affecting the observed relationship between X and Y. It is therefore possible (and advisable) to include more than one explanatory variable in a regression equation, and there are various reasons why we might want to do this. In fact, we rarely believe that there is only one factor influencing the outcome variable of interest. This raises several questions about the association between variables. For instance, we sometimes want to know whether there is an effect of x on y after controlling for the effect of some other variable z. After all, it may be that x is correlated with z and z is correlated with y, but for any fixed level of z there is no correlation between x and y. So an increase in x may be associated with a change in y, but this change depends entirely on what happens to z. For example, it may be that democracies are less likely to go to war with each other because they are more likely to trade with each other, and it is trading, rather than democracy, that reduces the chances of war. So we would say there is no effect of democracy on the chances of war after controlling for trade. In this sense we could have a spurious relationship: once the researcher controls for a rival causal factor, the original relationship becomes weak, or disappears. Another possibility is that we may observe no relationship between x and y because the pattern of association between x and z and between z and y confounds the effect of x on y.
For example, it may be that democracy actually decreases the chances of going to war, but democracies also have higher military capacities, which make them more likely to go to war, and the net result is that democracies are no more or less likely to go to war than other countries. These kinds of issues can sometimes be resolved using multiple linear regression. This is a direct extension of linear regression, except with more than one independent (or explanatory) variable. With p explanatory variables we have:

y = a + b1x1 + b2x2 + ... + bpxp + e

Just as in the case of regression with a single regressor, the factors that determine Y in addition to X1 and X2 are incorporated into the regression equation as an error term e. The error term is the

deviation of a particular observation from the average population relationship (from our estimate according to our model). The intercept and the slope coefficients are estimated once again by minimizing the sum of squared prediction mistakes, that is, by choosing the estimators b0, b1, and so on, so as to minimize:

Σ(i=1..n) (Yi − b0 − b1X1i − b2X2i − ... − bkXki)²

The estimators of the coefficients b0, b1, ..., bk that minimize the sum of squared mistakes are called the ordinary least squares (OLS) estimators of b0, b1, ..., bk. The OLS regression line is the straight line constructed using the OLS estimators b̂0, b̂1, etc. The predicted value based on the OLS regression line is Ŷi = b̂0 + b̂1X1i + ... + b̂kXki. The OLS residual for the ith observation is the difference between Yi and its predicted value, that is, the difference between Yi and Ŷi. In multiple linear regression the bi are partial regression coefficients. So bk is the estimated change in the dependent variable y associated with a unit increase in xk, keeping all the other independent variables in the model constant. What do we mean when we say that a particular coefficient in a multiple regression is the effect on Y of a unit change in X1, "holding X2 constant" or "controlling for X2"? Let's write a hypothetical regression function, Y = b0 + b1X1 + b2X2, and imagine changing X1 by the amount ΔX1 while not changing X2, that is, while holding X2 constant. Because X1 has changed, Y will change by some amount, say ΔY. After this change the new value of Y, Y + ΔY, is:

Y + ΔY = b0 + b1(X1 + ΔX1) + b2X2

An equation for ΔY in terms of ΔX1 is obtained by subtracting Y = b0 + b1X1 + b2X2 from Y + ΔY = b0 + b1(X1 + ΔX1) + b2X2, yielding ΔY = b1ΔX1, that is: b1 = ΔY/ΔX1, holding X2 constant. The coefficient b1 is the effect on Y (the expected change in Y) of a unit change in X1, holding X2 fixed. Another phrase used to describe b1 is the partial effect of X1 on Y, holding X2 fixed.
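The partial-effect algebra above can be checked numerically. A minimal sketch in Python (the coefficient values are made up for illustration):

```python
# With Y = b0 + b1*X1 + b2*X2, changing X1 by dX1 while holding X2 fixed
# changes Y by exactly b1*dX1, so dY/dX1 recovers b1.
b0, b1, b2 = 3.0, 2.0, -0.5  # illustrative coefficients

def predict(x1, x2):
    return b0 + b1 * x1 + b2 * x2

x1, x2, dx1 = 10.0, 4.0, 1.0
dy = predict(x1 + dx1, x2) - predict(x1, x2)
print(dy / dx1)  # 2.0, i.e. b1
```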
Finally, the interpretation of the intercept in the multiple regression model, b0, is similar to the interpretation of the intercept in the single-regressor model: it is the expected value of Y when X1 and X2 are zero. Simply put, the intercept b0 determines how far up the Y axis the population regression line starts. Look at the multiple linear regression of ecogr709 on const45 and federal45 together. Let's assume that your theory states that as the level of certainty of the rules increases (const45) and as the level of intra-region economic competition decreases (federal45), economic growth should increase.
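The OLS machinery described above can be sketched in a few lines, using Python/numpy as a stand-in for Stata. The data are synthetic and the "true" coefficients are chosen by us, with no error term, so the fit recovers them exactly:

```python
import numpy as np

# Simulate two regressors and an outcome that is an exact linear function
# of them: y = 1.0 + 2.0*x1 - 0.5*x2 (illustrative values, no noise).
rng = np.random.default_rng(42)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
y = 1.0 + 2.0 * x1 - 0.5 * x2

# Design matrix: a column of ones for the intercept, then the regressors.
X = np.column_stack([np.ones(100), x1, x2])
# lstsq minimizes the sum of squared residuals, i.e. it computes OLS.
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(b_hat, 6))  # ≈ [1.0, 2.0, -0.5]
```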

Let's first look at the scatterplot matrix for the variables in our regression model as a preliminary step.

graph matrix ecogr709 const45 federal45, half mlabel(countryt)
twoway (scatter ecogr709 const45, mlabel(countryt) mlabp(9) mlabs(vsmall)) (lfit ecogr709 const45)
twoway (scatter ecogr709 federal45, mlabel(countryt) mlabp(9) mlabs(vsmall)) (lfit ecogr709 federal45)

Now let's compare the following three equations. The first two are simply bivariate ones, while the third is a multivariate OLS.

reg ecogr709 const45
reg ecogr709 federal45
reg ecogr709 const45 federal45

Has anything changed? Why are the two IVs not significant in the first two equations, while in the third model both coefficients are significant? Now let's add one further variable (judrev45, that is, the existence of a constitution subject to judicial review; after all, this aspect too increases the certainty of rules against political interference):

reg ecogr709 const45 federal45 judrev45

So far, we have concerned ourselves with testing a single variable at a time, for example looking at the coefficient for const45 and determining whether it is significant. We can also test sets of variables, using the test command, to see if a set of variables is jointly significant. First, let's start by testing a single variable, const45, with the test command.

test const45==0
( 1) const45 = 0.0
F( 1, 385) = 12.62
Prob > F =

If you compare this output with the output from the last regression, you can see that the result of the F-test, 12.62, is the same as the square of the result of the t-test in the regression (3.55^2 = 12.62). Note that you could get the same result by typing the following, since Stata defaults to comparing the listed term(s) to 0.

test const45
( 1) const45 = 0.0
F( 1, 385) = 12.62
Prob > F =

Perhaps a more interesting test would be to see whether the contribution of the Constitution variables is significant.
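The F = t² relation quoted above can be verified from first principles. A sketch with simulated data (any dataset would do): for a single restriction, the F statistic built from restricted and unrestricted sums of squared residuals equals the square of the coefficient's t statistic.

```python
import numpy as np

# Simple regression y = 0.5*x + e on made-up data.
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
s2 = resid @ resid / (n - 2)         # residual variance estimate
cov = s2 * np.linalg.inv(X.T @ X)    # covariance matrix of the estimates
t_stat = b[1] / np.sqrt(cov[1, 1])   # t statistic for the slope

ssr_u = resid @ resid                # unrestricted SSR
ssr_r = np.sum((y - y.mean()) ** 2)  # restricted model: intercept only
f_stat = (ssr_r - ssr_u) / (ssr_u / (n - 2))  # F with one restriction
print(np.isclose(f_stat, t_stat ** 2))  # True
```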
Since the information regarding the Constitution is contained in two variables, const45 and judrev45, we include both of these with the test command.

test const45 judrev45
( 1) const45 = 0.0
( 2) judrev45 = 0.0
F( 2, 385) = 3.95
Prob > F =

The significant F-test, 3.95, means that the collective contribution of these two variables is significant. One way to think of this is that there is a significant difference between a model with const45 and judrev45 and a model without them, i.e., there is a significant difference between the "full" model and the "reduced" model. You can also compute some point estimates with the corresponding confidence intervals:

lincom _b[_cons] + _b[const45]*2 + _b[federal45]*1 + _b[judrev45]*3

Finally, we can also estimate the difference between the previous value and another value that differs only in the value of the federal45 variable.

lincom (_b[_cons] + _b[const45]*2 + _b[federal45]*3 + _b[judrev45]*3) - (_b[_cons] + _b[const45]*2 + _b[federal45]*1 + _b[judrev45]*3)

In our case, increasing the value of federalism compared to the previous situation would reduce the predicted level of growth by 1.13 points.

Addendum: Which variable is more important?

Significance testing only tells us how confident we can be that the true effect is not zero, or how confident we can be that the sign of the coefficient is correct. Often we want to know which of several predictor variables is the most important. This is a complex question for which there is no simple answer. We can start by observing that in the regression reg ecogr709 const45 federal45, the absolute coefficient of const45 is actually bigger than that of federal45. But this does not mean that const45 is more important than federal45, since a unit change in const45 does not mean the same as a unit change in federal45. They are not measured in the same units (tab const45 federal45). To address this problem, we can add an option to the regress command called beta, which will give us the standardized regression coefficients. The beta coefficients are used by some researchers to compare the relative strength of the various predictors within the model.
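As a sketch of what the beta option computes: a standardized slope rescales the raw slope into standard-deviation units, beta = b · sd(x)/sd(y). The data and the raw slope below are illustrative, with Python standing in for Stata:

```python
import statistics as st

# Standardized (beta) coefficient: raw slope rescaled by the ratio of the
# regressor's standard deviation to the outcome's standard deviation.
def beta_coefficient(b_raw, x, y):
    return b_raw * st.pstdev(x) / st.pstdev(y)

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]  # y = 2x exactly, so the standardized slope is 1
print(beta_coefficient(2.0, x, y))  # 1.0
```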
Because the beta coefficients are all measured in standard deviations, instead of the units of the variables, they can be compared to one another. In other words, the beta coefficients are the coefficients that you would obtain if the outcome and predictor variables were all transformed into standard scores, also called z-scores, before running the regression. The standard score is:

z = (x − μ) / σ

where: x is a raw score to be standardized; μ is the mean of the population;

σ is the standard deviation of the population. The quantity z represents the distance between the raw score and the population mean in units of the standard deviation. z is negative when the raw score is below the mean, positive when above. The standard score indicates how many standard deviations an observation is above or below the mean: the standard deviation is the unit of measurement of the z-score. It allows comparison of observations from different normal distributions, which is done frequently in research.

reg ecogr709 const45 federal45, beta

Because the coefficients in the Beta column are all in the same standardized units, you can compare them to assess the relative strength of each of the predictors. In this example, const45 has the largest Beta coefficient. Thus, a one standard deviation increase in const45 leads to a 1.0 standard deviation increase in predicted ecogr, with the other variables held constant. In interpreting this output, remember that the difference between the numbers listed in the Coef. column and the Beta column lies in the units of measurement. For example, to describe the raw coefficient for const45 you would say: "A one-unit decrease in const45 would yield a .70-unit increase in the predicted ecogrowth." This makes a lot of sense to me! However, for the standardized coefficient (Beta) you would say: "A one-standard-deviation decrease in const45 would yield a 0.7 standard deviation increase in the predicted ecogrowth." Not that easy to understand, after all! Standardizing therefore makes the coefficients substantially more difficult to interpret! Moreover, several critics of standardized regression coefficients argue that the comparison is illusory: there is no reason why a change of one SD in one predictor should be equivalent to a change of one SD in another predictor. Some variables are easy to change, the amount of time watching television, for example. Others are more difficult: weight or cholesterol level.
Others are impossible: height or age. In summary, standardized coefficients are in general (1) more difficult to interpret, and (2) they may add seriously misleading information. The original, unstandardized coefficients are meaningful and are not subject to these problems, although they generally cannot be compared for importance. A more important and final point is that most of the time scholars are not interested in finding out which variable will win the race. Most often it is theoretically "good enough" to say that, even after controlling for a set of variables (i.e., plausible rival hypotheses, possible confounding influences), the variable in which we are interested still seems to have an important influence on the dependent variable. This is precisely the empirical evidence for which we search to substantiate or refute our theoretical expectations. Usually, little (social or political) understanding is gained by hypothesizing a winner in a race of the variables.

6. Multiple Linear Regressions with categorical variables

Suppose that we want to investigate which factors affect the shape of party systems, as estimated by the effective number of elective parties. To answer this question we will employ the Neto and Cox (1997) dataset.

The effective number of elective parties is a continuous measure of the number of parties, defined as:

ENEP = 1 / Σi vi²

where vi is the share of the vote for party i. If vote shares are replaced with shares of seats, we have the effective number of parliamentary parties instead. Suppose further that our theory predicts a relationship between ENEP and the type of government system. In particular, since the president is just one person in a national competition, presidential elections can be like a single-member simple plurality election with only one district: the whole country. So having a presidential system can be a powerful factor reducing the number of parties in the legislature. There is an indicator (or dummy) variable called pres which takes the value 1 if there is a president with executive or legislative powers and zero otherwise. First let's just create a dummy of presidential systems against non-presidential ones:

codebook prestype
recode prestype (0=0 "Parliamentary democracy") (1/2=1 "Presidential democracy"), gen(presidential)
tab prestype presidential
reg enpv presidential
table country, contents(mean enpv)

This test shows that presidential systems actually have more parties on average than parliamentary systems, but we cannot quite be confident of it at the 5% level. However, if you use a one-tailed test (i.e., you predict that the parameter will go in a particular direction), then you can divide the p-value by 2 before comparing it to your preselected alpha level. In this case, the presidential variable turns out to be significant at the 5% level.

Two-sided alternative hypothesis to H0: H0: βj = βj,0 vs. H1: βj ≠ βj,0
One-sided alternative hypothesis to H0: H0: βj = βj,0 vs. H1: βj > βj,0 (or H1: βj < βj,0)

Of course, it may be that countries with a run-off election for president are less likely to see a strong curtailing effect on the effective number of parties in the legislature.
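The ENEP formula above is a one-liner in any language. A sketch in Python with hypothetical vote shares:

```python
# Effective number of elective parties: ENEP = 1 / sum of squared vote shares.
def enep(vote_shares):
    return 1.0 / sum(v * v for v in vote_shares)

print(enep([0.5, 0.5]))    # 2.0: two equal parties count as two
print(enep([0.25] * 4))    # 4.0: four equal parties count as four
print(round(enep([0.7, 0.2, 0.1]), 2))  # 1.85: dominance shrinks the count
```

Note how one dominant party pulls the effective count well below the raw count of three.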
The variable prestype is a classification of different types of presidential system according to whether they have run-off elections for president or not. Consider the table:

table prestype, contents(freq mean enpv)

This shows that there are differences in the effective number of parties between countries with run-off and single-shot presidential elections. Suppose then that you want to test for differences between presidential types. We need to create dummy variables, one for each type of presidential system, and introduce these into the model.

tab prestype, gen(presdummy)
reg enpv presdummy2 presdummy3

A quick way to do this is to use the xi command.

xi: regress enpv i.prestype

The xi: command (that we have already discussed) can be placed before a regression-type command to create indicator variables for any explanatory variables with the i. prefix. This is helpful, especially when there are lots of different categories. Note that Stata creates new variables _Iprestype_1 and _Iprestype_2 for each value of prestype except the first. The first category is dropped because we cannot include indicators for all the different types of system in the model at once: the model would not be identified. Instead, the first category of the categorical variable is treated as the baseline category. So the coefficient of _Iprestype_1 estimates the additional number of parties associated with one-shot presidential electoral systems relative to the number in the baseline category. In this case the baseline is no presidency at all, so the _cons value is the mean for the parliamentary democracies. The coefficient of _Iprestype_1 is the mean effective number of parties for presidential systems with a single election minus the mean of the omitted group, and the coefficient of _Iprestype_2 is the mean of the presidential democracies with run-off minus the mean of the parliamentary democracies. If we are interested in whether there is an effect of the categorical variable prestype as a whole, we need to test the hypothesis that the coefficients of both the _Iprestype_ dummies are simultaneously zero. Do this using:

test _Iprestype_1 _Iprestype_2
test _Iprestype_1 _Iprestype_2, m

In this case we are doing a test of a joint hypothesis on two or more coefficients:

H0: β1 = 0 and β2 = 0 vs. H1: β1 ≠ 0 and/or β2 ≠ 0

Why can't we just test the individual coefficients one at a time? This is problematic every time there is some correlation between the regressors. We should use the F-statistic to test joint hypotheses about regression coefficients. The F-test shows that we cannot reject the hypothesis that both of the prestype dummies are zero at the 90% confidence level.
This is similar to the value we got from reg enpv presidential. We can also test whether the two effects are significantly different from each other. Do presidential democracies with a run-off system present a higher number of parties compared to presidential democracies with a single election?

test _Iprestype_1 = _Iprestype_2

Stata tests the null hypothesis that the two coefficients are equal to each other and returns a p-value for this test. Here we can conclude that the two coefficients are not significantly different from each other. What if we wanted a different group to be the reference group? For example, the run-off presidential democracies?

char prestype [omit] 2
xi: regress enpv i.prestype
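The baseline-category logic above can be checked with a toy calculation: with a full set of dummies (first category omitted), the OLS intercept is the baseline group mean, and each dummy coefficient is that group's mean minus the baseline mean. Group labels and values here are invented for illustration:

```python
import statistics as st

# Three invented groups of ENEP-like values; the first plays the baseline.
groups = {
    "parliamentary": [2.5, 3.0, 3.5],  # baseline: no dummy created
    "pres_single":   [2.0, 2.4, 2.2],
    "pres_runoff":   [3.8, 4.0, 4.2],
}
means = {g: st.mean(v) for g, v in groups.items()}
intercept = means["parliamentary"]              # plays the role of _cons
coef_single = means["pres_single"] - intercept  # first dummy coefficient
coef_runoff = means["pres_runoff"] - intercept  # second dummy coefficient
print(intercept, coef_single, coef_runoff)
```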

7. Multiple Linear Regressions with continuous and categorical variables

Let's go back to the nes2004 dataset, and let's try to explore a bit better the determinants of the popularity of the 2004 presidential candidate Kerry.

use "D:\Pdh08_09\Lezione 1\nes2004.dta", clear
codebook gender
xi: reg kerry_therm welfare_therm i.gender

(Note that Stata automatically created a dummy variable _Igender_2, coded 0 for men, the lowest value on gender, and 1 for women. Otherwise we would have to create it with the tab command!) According to our model, the impact of gender on kerry_therm is the same regardless of the value of welfare_therm (once the value of welfare_therm is fixed at a given level). That is, the impact of gender on kerry_therm is an additive one! What do we mean by that? When do we have an additive relationship between the IV, the DV and the other control variable? Every time the strength and the tendency of the relationship between IV and DV remains similar for all the values of the control variable.

lincom (_b[_cons] + _b[_Igender_2] + _b[welfare_therm]*40) - (_b[_cons] + _b[_Igender_2]*0 + _b[welfare_therm]*40)
lincom (_b[_cons] + _b[_Igender_2] + _b[welfare_therm]*70) - (_b[_cons] + _b[_Igender_2]*0 + _b[welfare_therm]*70)
predict yhat
scatter yhat welfare_therm

The coefficient for welfare_therm gives the predicted increase in kerry_therm for every one-unit increase in welfare_therm; this is the slope of the lines shown in the graph. The graph has two lines, one for males and one for females. The coefficient for gender is 3.65, indicating that being a woman, compared to being a man, is expected to increase the kerry_therm score by about 3.65 units. As you can see in the graph, the top line is about 3.65 units higher than the lower line. Moreover, the intercept is around 39 for the lower line (male: when gender is 0) and around 44 for the upper line (female: when gender is 1).
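The additivity claim (and what the two lincom commands verify) can be illustrated with a toy prediction function. The gender coefficient echoes the 3.65 from the text; the intercept and welfare slope are assumed values for the sketch:

```python
# Additive model: yhat = b0 + b_w*welfare + b_f*female. The female/male gap
# equals b_f at EVERY welfare score, i.e. the two lines are parallel.
b0, b_w, b_f = 39.0, 0.2, 3.65  # b0 and b_w are assumed; b_f is from the text

def yhat(welfare, female):
    return b0 + b_w * welfare + b_f * female

gap_40 = yhat(40, 1) - yhat(40, 0)  # gap at welfare_therm = 40
gap_70 = yhat(70, 1) - yhat(70, 0)  # gap at welfare_therm = 70
print(gap_40, gap_70)  # both equal b_f
```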
Of course, we can also use more than one dummy and more than one continuous variable within the same OLS! Let's say that we suspect that partisanship has a big effect on the Kerry thermometer as well.

codebook partyid3

We want the intercept to have a meaning, so we re-center the welfare_therm variable (this does not affect the coefficient for welfare_therm, just its interpretation):

mean welfare_therm
gen welfaremean = (welfare_therm )
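The claim in parentheses above, that centering changes only the intercept's interpretation and not the slope, can be checked with a closed-form simple regression on toy data:

```python
import statistics as st

# Closed-form simple OLS: slope = Sxy/Sxx, intercept = mean(y) - slope*mean(x).
def ols_simple(x, y):
    mx, my = st.mean(x), st.mean(y)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
         sum((xi - mx) ** 2 for xi in x)
    return my - b1 * mx, b1  # (intercept, slope)

x = [10.0, 20.0, 30.0, 40.0]  # made-up regressor values
y = [15.0, 24.0, 37.0, 44.0]  # made-up outcome values
a_raw, b_raw = ols_simple(x, y)
a_ctr, b_ctr = ols_simple([xi - st.mean(x) for xi in x], y)  # centered x
# Slopes match; the centered intercept equals the mean of y.
print(b_raw == b_ctr, a_ctr == st.mean(y))  # True True
```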

Alternatively:

egen welfaremean = mean(welfare_therm)
list welfare_therm welfaremean in 1/10

We also want independents (partyid3 = 2) to be the omitted category, therefore:

char partyid3 [omit] 2
xi: reg kerry_therm welfare_therm i.gender i.partyid3

Now gender is significant only at the 90% level once we control for partyid3 (almost a spurious relationship between gender and kerry_therm: from this we can infer that women are more likely than men to be Democrats), and the coefficient for welfare_therm decreases (even if it is still significant after controlling for partyid3). Now the constant, our point of reference, identifies the mean kerry_therm score for a male independent with an average score on welfare_therm. The coefficient on Female tells us how much to adjust the male part of the intercept, controlling for partisanship and welfare_therm. Thus, compared with male independents with an average welfare_therm value/attitude, male Democrats average 17 degrees higher on the Kerry thermometer.

test _Ipartyid3_1 _Ipartyid3_3
test _Ipartyid3_1 = _Ipartyid3_3

SECOND ASSIGNMENT

Using the dataset on satisfaction with democracy (satisfaction_with_democracy_europe.dta), develop two competing models to explain the differences among European countries in the level of satisfaction with democracy. Introduce first your (main) hypotheses, then present the tables of the regression coefficients and describe your results in no more than 500 words. NB: ASSIGNMENTS THAT EXCEED THE WORD LIMIT WILL NOT BE MARKED.

Summing up: some GOLDEN RULES

1. My R² is bigger than yours! So publish me! Sometimes R² is considered to be a measure of the fit between the statistical model and the true model. A high R² is considered to be proof that the correct model has been specified or that the theory being tested is correct. A higher R² in one model is also taken to mean that that model is better. All these interpretations are wrong.
R² is a measure of the spread of points around a regression line. Full stop! There is nothing intrinsically interesting in the spread of points around a regression line. If you are interested in the precision with which you can confidently make inferences, then look at your standard errors (see below)!!!

2. You do not run an OLS to maximize R²! You only run an OLS to test hypotheses (hopefully derived from a theory!!!). Besides, if your R² is close to 1, you (usually) can have problems with your estimated model (such as using as an IV a different version of your DV).

3. Remember: R² does NOT tell you whether: 1) an included variable is statistically significant (to ascertain this you need to perform hypothesis testing using the t-statistics); 2) the regressors are a true cause of the movements of the DV (you can have a high R² but a relationship that is not causal: a spurious one!); 3) you have chosen the most appropriate set of regressors (this question relates only to theory and the nature of the questions being addressed; a high R², or a low one, does not mean that you have the most appropriate, or an inappropriate, set of regressors!); 4) there is no omitted variable bias just because R² is high (you have omitted variable bias in an estimator when a variable that is a determinant of the DV and is correlated with a regressor has been omitted from the regression; omitted variable bias can occur in regressions with a low or a high R²: even in this case, the theory is crucial!).

4. Never drop IVs from your model just because they are not significant! You added them according to the literature and the theory (otherwise why did you add them in the first place?). Therefore, by dropping them, you incur an (at least theoretical) omission-bias problem! Besides the statistical problems: the model estimated when you have included all your (theoretically) relevant variables is a different thing from the one you get when you drop them!!!

5. Never drop influential observations from your dataset! They are influential for a reason!!! Try to understand it to improve your model. And if you cannot, add a dummy for them or use robust standard errors (more on this later).

6. Never select your data according to the values they display on your DV: selection-bias problems! (Example: analyses of extreme-right parties in Europe.)

7. Finally, no data mining! Or if you do it, at least DO NOT SAY IT!

Sources: King, Gary.
"How Not to Lie With Statistics: Avoiding Common Mistakes in Quantitative Political Science," American Journal of Political Science, Vol. 30, No. 3 (August 1986). King, Gary. "Truth is Stranger than Prediction, More Questionable Than Causal Inference," American Journal of Political Science, Vol. 35, No. 4 (November 1991). King, Gary; Michael Tomz; and Jason Wittenberg. "Making the Most of Statistical Analyses: Improving Interpretation and Presentation," American Journal of Political Science, Vol. 44, No. 2 (April 2000).


More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM 1 REGRESSION AND CORRELATION As we learned in Chapter 9 ( Bivariate Tables ), the differential access to the Internet is real and persistent. Celeste Campos-Castillo s (015) research confirmed the impact

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 7 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 68 Outline of Lecture 7 1 Empirical example: Italian labor force

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does

More information

Midterm 2 - Solutions

Midterm 2 - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Lecture 7: OLS with qualitative information

Lecture 7: OLS with qualitative information Lecture 7: OLS with qualitative information Dummy variables Dummy variable: an indicator that says whether a particular observation is in a category or not Like a light switch: on or off Most useful values:

More information

REVIEW 8/2/2017 陈芳华东师大英语系

REVIEW 8/2/2017 陈芳华东师大英语系 REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p

More information

S o c i o l o g y E x a m 2 A n s w e r K e y - D R A F T M a r c h 2 7,

S o c i o l o g y E x a m 2 A n s w e r K e y - D R A F T M a r c h 2 7, S o c i o l o g y 63993 E x a m 2 A n s w e r K e y - D R A F T M a r c h 2 7, 2 0 0 9 I. True-False. (20 points) Indicate whether the following statements are true or false. If false, briefly explain

More information

Lab 6 - Simple Regression

Lab 6 - Simple Regression Lab 6 - Simple Regression Spring 2017 Contents 1 Thinking About Regression 2 2 Regression Output 3 3 Fitted Values 5 4 Residuals 6 5 Functional Forms 8 Updated from Stata tutorials provided by Prof. Cichello

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc. Notes on regression analysis 1. Basics in regression analysis key concepts (actual implementation is more complicated) A. Collect data B. Plot data on graph, draw a line through the middle of the scatter

More information

CHAPTER 4 & 5 Linear Regression with One Regressor. Kazu Matsuda IBEC PHBU 430 Econometrics

CHAPTER 4 & 5 Linear Regression with One Regressor. Kazu Matsuda IBEC PHBU 430 Econometrics CHAPTER 4 & 5 Linear Regression with One Regressor Kazu Matsuda IBEC PHBU 430 Econometrics Introduction Simple linear regression model = Linear model with one independent variable. y = dependent variable

More information

1 A Non-technical Introduction to Regression

1 A Non-technical Introduction to Regression 1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in

More information

Lab 11 - Heteroskedasticity

Lab 11 - Heteroskedasticity Lab 11 - Heteroskedasticity Spring 2017 Contents 1 Introduction 2 2 Heteroskedasticity 2 3 Addressing heteroskedasticity in Stata 3 4 Testing for heteroskedasticity 4 5 A simple example 5 1 1 Introduction

More information

Sociology 593 Exam 2 Answer Key March 28, 2002

Sociology 593 Exam 2 Answer Key March 28, 2002 Sociology 59 Exam Answer Key March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably

More information

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

More information

Multiple linear regression

Multiple linear regression Multiple linear regression Course MF 930: Introduction to statistics June 0 Tron Anders Moger Department of biostatistics, IMB University of Oslo Aims for this lecture: Continue where we left off. Repeat

More information

(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections

(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections Answer Key Fixed Effect and First Difference Models 1. See discussion in class.. David Neumark and William Wascher published a study in 199 of the effect of minimum wages on teenage employment using a

More information

Answer Key: Problem Set 6

Answer Key: Problem Set 6 : Problem Set 6 1. Consider a linear model to explain monthly beer consumption: beer = + inc + price + educ + female + u 0 1 3 4 E ( u inc, price, educ, female ) = 0 ( u inc price educ female) σ inc var,,,

More information

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Inference ME104: Linear Regression Analysis Kenneth Benoit August 15, 2012 August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Stata output resvisited. reg votes1st spend_total incumb minister

More information

Linear Regression with one Regressor

Linear Regression with one Regressor 1 Linear Regression with one Regressor Covering Chapters 4.1 and 4.2. We ve seen the California test score data before. Now we will try to estimate the marginal effect of STR on SCORE. To motivate these

More information

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity R.G. Pierse 1 Omitted Variables Suppose that the true model is Y i β 1 + β X i + β 3 X 3i + u i, i 1,, n (1.1) where β 3 0 but that the

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Unit 10: Simple Linear Regression and Correlation

Unit 10: Simple Linear Regression and Correlation Unit 10: Simple Linear Regression and Correlation Statistics 571: Statistical Methods Ramón V. León 6/28/2004 Unit 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regression analysis is a method for

More information

sociology sociology Scatterplots Quantitative Research Methods: Introduction to correlation and regression Age vs Income

sociology sociology Scatterplots Quantitative Research Methods: Introduction to correlation and regression Age vs Income Scatterplots Quantitative Research Methods: Introduction to correlation and regression Scatterplots can be considered as interval/ratio analogue of cross-tabs: arbitrarily many values mapped out in -dimensions

More information

Multiple Regression Theory 2006 Samuel L. Baker

Multiple Regression Theory 2006 Samuel L. Baker MULTIPLE REGRESSION THEORY 1 Multiple Regression Theory 2006 Samuel L. Baker Multiple regression is regression with two or more independent variables on the right-hand side of the equation. Use multiple

More information

Regression and Stats Primer

Regression and Stats Primer D. Alex Hughes dhughes@ucsd.edu March 1, 2012 Why Statistics? Theory, Hypotheses & Inference Research Design and Data-Gathering Mechanics of OLS Regression Mechanics Assumptions Practical Regression Interpretation

More information

Practice exam questions

Practice exam questions Practice exam questions Nathaniel Higgins nhiggins@jhu.edu, nhiggins@ers.usda.gov 1. The following question is based on the model y = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 3 + u. Discuss the following two hypotheses.

More information

A Re-Introduction to General Linear Models (GLM)

A Re-Introduction to General Linear Models (GLM) A Re-Introduction to General Linear Models (GLM) Today s Class: You do know the GLM Estimation (where the numbers in the output come from): From least squares to restricted maximum likelihood (REML) Reviewing

More information

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information. STA441: Spring 2018 Multiple Regression This slide show is a free open source document. See the last slide for copyright information. 1 Least Squares Plane 2 Statistical MODEL There are p-1 explanatory

More information

SCATTERPLOTS. We can talk about the correlation or relationship or association between two variables and mean the same thing.

SCATTERPLOTS. We can talk about the correlation or relationship or association between two variables and mean the same thing. SCATTERPLOTS When we want to know if there is some sort of relationship between 2 numerical variables, we can use a scatterplot. It gives a visual display of the relationship between the 2 variables. Graphing

More information

Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur

Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur Lecture 10 Software Implementation in Simple Linear Regression Model using

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution

More information

THE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS

THE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS THE ROYAL STATISTICAL SOCIETY 008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS The Society provides these solutions to assist candidates preparing for the examinations

More information

MORE ON MULTIPLE REGRESSION

MORE ON MULTIPLE REGRESSION DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 816 MORE ON MULTIPLE REGRESSION I. AGENDA: A. Multiple regression 1. Categorical variables with more than two categories 2. Interaction

More information

Simple Linear Regression Using Ordinary Least Squares

Simple Linear Regression Using Ordinary Least Squares Simple Linear Regression Using Ordinary Least Squares Purpose: To approximate a linear relationship with a line. Reason: We want to be able to predict Y using X. Definition: The Least Squares Regression

More information

LECTURE 2: SIMPLE REGRESSION I

LECTURE 2: SIMPLE REGRESSION I LECTURE 2: SIMPLE REGRESSION I 2 Introducing Simple Regression Introducing Simple Regression 3 simple regression = regression with 2 variables y dependent variable explained variable response variable

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model EPSY 905: Multivariate Analysis Lecture 1 20 January 2016 EPSY 905: Lecture 1 -

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Sociology 63993, Exam 2 Answer Key [DRAFT] March 27, 2015 Richard Williams, University of Notre Dame,

Sociology 63993, Exam 2 Answer Key [DRAFT] March 27, 2015 Richard Williams, University of Notre Dame, Sociology 63993, Exam 2 Answer Key [DRAFT] March 27, 2015 Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ I. True-False. (20 points) Indicate whether the following statements

More information

Empirical Application of Simple Regression (Chapter 2)

Empirical Application of Simple Regression (Chapter 2) Empirical Application of Simple Regression (Chapter 2) 1. The data file is House Data, which can be downloaded from my webpage. 2. Use stata menu File Import Excel Spreadsheet to read the data. Don t forget

More information

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias Contact Information Elena Llaudet Sections are voluntary. My office hours are Thursdays 5pm-7pm in Littauer Mezzanine 34-36 (Note room change) You can email me administrative questions to ellaudet@gmail.com.

More information

ECON 497 Midterm Spring

ECON 497 Midterm Spring ECON 497 Midterm Spring 2009 1 ECON 497: Economic Research and Forecasting Name: Spring 2009 Bellas Midterm You have three hours and twenty minutes to complete this exam. Answer all questions and explain

More information

Nonlinear Regression Functions

Nonlinear Regression Functions Nonlinear Regression Functions (SW Chapter 8) Outline 1. Nonlinear regression functions general comments 2. Nonlinear functions of one variable 3. Nonlinear functions of two variables: interactions 4.

More information

Self-Assessment Weeks 8: Multiple Regression with Qualitative Predictors; Multiple Comparisons

Self-Assessment Weeks 8: Multiple Regression with Qualitative Predictors; Multiple Comparisons Self-Assessment Weeks 8: Multiple Regression with Qualitative Predictors; Multiple Comparisons 1. Suppose we wish to assess the impact of five treatments while blocking for study participant race (Black,

More information

The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January

More information

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds Chapter 6 Logistic Regression In logistic regression, there is a categorical response variables, often coded 1=Yes and 0=No. Many important phenomena fit this framework. The patient survives the operation,

More information

appstats8.notebook October 11, 2016

appstats8.notebook October 11, 2016 Chapter 8 Linear Regression Objective: Students will construct and analyze a linear model for a given set of data. Fat Versus Protein: An Example pg 168 The following is a scatterplot of total fat versus

More information

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression Chapter 7 Hypothesis Tests and Confidence Intervals in Multiple Regression Outline 1. Hypothesis tests and confidence intervals for a single coefficie. Joint hypothesis tests on multiple coefficients 3.

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

SIMPLE TWO VARIABLE REGRESSION

SIMPLE TWO VARIABLE REGRESSION DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 816 SIMPLE TWO VARIABLE REGRESSION I. AGENDA: A. Causal inference and non-experimental research B. Least squares principle C. Regression

More information

THE PEARSON CORRELATION COEFFICIENT

THE PEARSON CORRELATION COEFFICIENT CORRELATION Two variables are said to have a relation if knowing the value of one variable gives you information about the likely value of the second variable this is known as a bivariate relation There

More information

ECO220Y Simple Regression: Testing the Slope

ECO220Y Simple Regression: Testing the Slope ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

PBAF 528 Week 8. B. Regression Residuals These properties have implications for the residuals of the regression.

PBAF 528 Week 8. B. Regression Residuals These properties have implications for the residuals of the regression. PBAF 528 Week 8 What are some problems with our model? Regression models are used to represent relationships between a dependent variable and one or more predictors. In order to make inference from the

More information

t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression

t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression Recall, back some time ago, we used a descriptive statistic which allowed us to draw the best fit line through a scatter plot. We

More information

Day 4: Shrinkage Estimators

Day 4: Shrinkage Estimators Day 4: Shrinkage Estimators Kenneth Benoit Data Mining and Statistical Learning March 9, 2015 n versus p (aka k) Classical regression framework: n > p. Without this inequality, the OLS coefficients have

More information

1 Correlation and Inference from Regression

1 Correlation and Inference from Regression 1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is

More information

Lecture 24: Partial correlation, multiple regression, and correlation

Lecture 24: Partial correlation, multiple regression, and correlation Lecture 24: Partial correlation, multiple regression, and correlation Ernesto F. L. Amaral November 21, 2017 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015. Statistics: A

More information

LECTURE 9: GENTLE INTRODUCTION TO

LECTURE 9: GENTLE INTRODUCTION TO LECTURE 9: GENTLE INTRODUCTION TO REGRESSION WITH TIME SERIES From random variables to random processes (cont d) 2 in cross-sectional regression, we were making inferences about the whole population based

More information

STA Module 5 Regression and Correlation. Learning Objectives. Learning Objectives (Cont.) Upon completing this module, you should be able to:

STA Module 5 Regression and Correlation. Learning Objectives. Learning Objectives (Cont.) Upon completing this module, you should be able to: STA 2023 Module 5 Regression and Correlation Learning Objectives Upon completing this module, you should be able to: 1. Define and apply the concepts related to linear equations with one independent variable.

More information

Psych 230. Psychological Measurement and Statistics

Psych 230. Psychological Measurement and Statistics Psych 230 Psychological Measurement and Statistics Pedro Wolf December 9, 2009 This Time. Non-Parametric statistics Chi-Square test One-way Two-way Statistical Testing 1. Decide which test to use 2. State

More information

Two-Variable Regression Model: The Problem of Estimation

Two-Variable Regression Model: The Problem of Estimation Two-Variable Regression Model: The Problem of Estimation Introducing the Ordinary Least Squares Estimator Jamie Monogan University of Georgia Intermediate Political Methodology Jamie Monogan (UGA) Two-Variable

More information

Hypothesis Tests and Confidence Intervals in Multiple Regression

Hypothesis Tests and Confidence Intervals in Multiple Regression Hypothesis Tests and Confidence Intervals in Multiple Regression (SW Chapter 7) Outline 1. Hypothesis tests and confidence intervals for one coefficient. Joint hypothesis tests on multiple coefficients

More information

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor 1. The regression equation 2. Estimating the equation 3. Assumptions required for

More information

Ordinary Least Squares Regression Explained: Vartanian

Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Eplained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information

Classification & Regression. Multicollinearity Intro to Nominal Data

Classification & Regression. Multicollinearity Intro to Nominal Data Multicollinearity Intro to Nominal Let s Start With A Question y = β 0 + β 1 x 1 +β 2 x 2 y = Anxiety Level x 1 = heart rate x 2 = recorded pulse Since we can all agree heart rate and pulse are related,

More information

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler Basic econometrics Tutorial 3 Dipl.Kfm. Introduction Some of you were asking about material to revise/prepare econometrics fundamentals. First of all, be aware that I will not be too technical, only as

More information

Chapter 4 Regression with Categorical Predictor Variables Page 1. Overview of regression with categorical predictors

Chapter 4 Regression with Categorical Predictor Variables Page 1. Overview of regression with categorical predictors Chapter 4 Regression with Categorical Predictor Variables Page. Overview of regression with categorical predictors 4-. Dummy coding 4-3 4-5 A. Karpinski Regression with Categorical Predictor Variables.

More information