POLI 618 Notes. Stuart Soroka, Department of Political Science, McGill University. March 2010


These pages were written originally as my own lecture notes, but are now designed to be distributed to students taking the stats methods course Poli 618 at McGill University. They are also freely available online, at snsoroka.com. The notes draw on a good number of statistics texts, including Kennedy's Econometrics, Greene's Econometric Analysis, and a number of volumes in Sage's quantitative methods series. That said, please do keep in mind that these are just lecture notes: there are errors and omissions, and for no single topic is there enough information included in this file to learn statistics from the notes alone. (There are of course many textbooks that are better equipped for that purpose.) The notes are nonetheless a useful background guide to Poli 618 and perhaps, more generally, to some of the basic statistics most common in empirical political science. If you find errors (and you will), please do let me know.

Thanks,
Stuart Soroka
stuart.soroka@mcgill.ca

Table of Contents

Variance, Covariance and Correlation
Introducing Bivariate Ordinary Least Squares Regression
Multivariate Ordinary Least Squares Regression
Error, and Model Fit
Assumptions of OLS regression
Nonlinearities
Collinearity and Multicollinearity
Heteroskedasticity
Outliers
Models for dichotomous data
    Linear Probability Models
    Nonlinear Probability Model: Logistic Regression
    An Alternative Description: The Latent Variable Model
    Nonlinear Probability Model: Probit Regression
Maximum Likelihood Estimation
Interpretation & Goodness of Fit Measures for Categorical Models
Models for Categorical Data
    Ordinal Outcomes
    Nominal Outcomes
Time Series: Autocorrelation
    Univariate Statistics
    Bivariate Statistics
    Multivariate Models
Significance Tests
    Distribution Functions
    The chi-square test
    The t test
    The F Test
Factor Analysis
    Background: Correlations and Factor Analysis
    An Algebraic Description
    Factor Analysis Results
    Rotated Factor Analyses

Variance, Covariance and Correlation

Let's begin with Y_i, a continuous variable measuring some value for each individual (i) in a representative sample of the population. Y_i can be income, or age, or a thermometer score expressing degrees of approval for a presidential candidate. Variance in our variable Y_i is calculated as follows:

(1) S²_Y = Σ(Y_i - Ȳ)² / (N - 1), or

(2) S²_Y = [N ΣY_i² - (ΣY_i)²] / [N(N - 1)],

where both versions are equivalent, and the latter is referred to as the computational formula (because it is, in principle, easier to calculate by hand). Note that the equation is pretty simple: we are interested in variance in Y_i, and Equation 1 is basically taking the average of each individual Y_i's variance around the mean (Ȳ). There are a few tricky parts. First, the differences between each individual Y_i and Ȳ (that is, Y_i - Ȳ) are squared in Equation 1, so that negative values do not cancel out positive values (since squaring will lead to only positive values). Second, we use N-1 as the denominator rather than N (where N is the number of cases). This produces a more conservative (slightly inflated) result, in light of the fact that we're working with a sample variance rather than the population variance; that is, the values of Y_i in our (hopefully) representative sample, versus the values of Y_i that we believe may exist in the total real-world population. For small-N samples, where we might suspect that we under-estimate the variance in the population, using N-1 effectively adjusts the estimated variance upwards. With a large-N sample, the difference between N-1 and N is increasingly marginal. That the adjustment matters more for small samples than for big samples reflects our increasing confidence in the representativeness of our sample as it increases.

(Note that some texts distinguish between S²_Y and σ²_Y, where the Roman S² is the sample variance and the Greek σ² is the population variance. Indeed, some texts will distinguish between sample values and population values using Roman and Greek versions across the board: B for an estimated slope coefficient, for instance, and β for an actual slope in the population. I am not this systematic below.)

The standard deviation is a simple function of variance:

(3) S_Y = √(S²_Y) = √[ Σ(Y_i - Ȳ)² / (N - 1) ].

So standard deviations are also indications of the extent to which a given variable varies around its mean. S_Y is important for understanding distributions and significance tests, as we shall see below.

So far, we've looked only at univariate statistics, statistics describing a single variable. Most of the time, though, what we want to do is describe relationships between two (or more) variables. Covariance, a measure of common variance between two variables, or how much two variables change together, is calculated as follows:

(4) S_XY = Σ(X_i - X̄)(Y_i - Ȳ) / (N - 1), or

(5) S_XY = [N ΣX_iY_i - ΣX_i ΣY_i] / [N(N - 1)],

the latter of which is the computational formula. Again, we use N-1 as the denominator, for the same reasons as above. Pearson's correlation coefficient is a ratio of the covariance to the product of the standard deviations, as follows:

(6) r = S_XY / (S_X S_Y), or

(7) r = Σ(X_i - X̄)(Y_i - Ȳ) / √[ Σ(X_i - X̄)² Σ(Y_i - Ȳ)² ],

where S_XY is the sample covariance between X_i and Y_i, and S_X and S_Y are the sample standard deviations of X_i and Y_i respectively. (Note the relationship between this Equation 7 and the preceding equations for standard deviations and covariances, Equation 3 and Equation 4.)
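As a quick check on Equations 1 through 7, here is a minimal sketch (not part of the original notes) that computes the sample variance, covariance, and Pearson's r directly from the formulas; the data are made up purely for illustration.

```python
# Sketch: sample variance, covariance, and Pearson's r (Equations 1, 4, and 6).
# The x and y values below are made-up illustrative data.
x = [2, 4, 5, 6, 8]
y = [1, 3, 2, 5, 6]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Equation 1: sample variance, with N - 1 in the denominator
var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
var_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)

# Equation 4: sample covariance
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Equation 6: Pearson's r as covariance over the product of standard deviations
r = cov_xy / (var_x ** 0.5 * var_y ** 0.5)

print(f"S2_X = {var_x:.3f}, S2_Y = {var_y:.3f}, S_XY = {cov_xy:.3f}, r = {r:.3f}")
```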

Introducing Bivariate Ordinary Least Squares Regression

Take a simple data series, and plot it:

[Table and scatterplot of a simple X, Y data series; values not reproduced here.]

What we want to do is describe the relationship between X and Y. Essentially, we want to draw a line between the dots, and describe that line. Given that the data here are relatively simple, we can just do this by hand, and describe it using two basic properties, α and β, where α, the constant, is in this case equal to 1, and β, the slope, is 1 (the increase in Y) divided by 2 (the increase in X) = .5. So we can produce an

equation for this line allowing us to predict values of Y based on values of X. The general model is,

(8) Y_i = α + βX_i,

and the particular model in this case is Y = 1 + .5X. Note that the constant is simply a function of the means of both X and Y, along with the slope. That is:

(9) α = Ȳ - βX̄.

So, following Equation 9, α = Ȳ - βX̄ = 3.5 - (.5)(5) = 3.5 - 2.5 = 1. This is pretty simple. The difficulty is that data aren't like this; they don't fall along a perfect line. They're likely more like this:

[Table and scatterplot of a messier X, Y data series; values not reproduced here.]

Now, note that we can draw any number of lines that will satisfy Equation 9. All that matters is that the line goes through the means of X and Y. So the means are:

X̄ = 5, and Ȳ = 3.75.

And let's make up an equation where Y = 3.75 when X = 5:

Y = α + βX
3.75 = α + β(5)
3.75 = 4 + β(5)
3.75 = 4 + (-.05)(5)
3.75 = 4 + (-.25)

So here it is: Y = 4 + (-.05)X. Plotted, it looks like this:

[Figure: scatterplot of the data with the line Y = 4 - .05X drawn through the means.]

Note that this new model has to be expressed in a slightly different manner, including an error term:

(10) Y_i = α + βX_i + ε_i,

or, alternatively:

(11) Y_i = Ŷ_i + ε_i,

where Ŷ_i are the estimated values of the actual Y_i, and where the error can be expressed in the following ways:

(12) ε_i = Y_i - Ŷ_i, or ε_i = Y_i - (α + βX_i).

So we've now accounted for the fact that we work with messy data, and that there will consequently be a certain degree of error in the model. This is

inevitable, of course, since we're trying to draw a straight line through points that are unlikely to be perfectly distributed along a straight line.

Of course, the line above won't do: it quite clearly does not describe the relationship between X and Y. What we need is a method of deriving a model that better describes the effect that X has on Y; essentially, a method that draws a line that comes as close to all the dots as possible. Or, more precisely, a model that minimizes the total amount of error (ε_i).

We first need a measure of the total amount of error, the degree to which our predictions miss the actual values of Y_i. We can't simply take the sum of all errors, Σε_i, because positive and negative errors can cancel each other out. We could take the sum of the absolute values, Σ|ε_i|, which in fact is used in some estimations. The norm is to use the sum of squared errors, the SSE or Σε_i². This sum is most greatly affected by large errors: by squaring residuals, large residuals take on very large magnitudes. An estimation of Equation 10 that tries to minimize Σε_i² accordingly tries especially hard to avoid large errors. (By implication, outlying cases will have a particularly strong effect on the overall estimation. We return to this in the section on outliers below.)

This is what we are trying to do in ordinary least squares (OLS) regression: minimize the SSE, and have an estimate of β (on which our estimate of α relies) that comes as close to all the dots as is possible. Least-squares coefficients for simple bivariate regression are estimated as follows:

(13) β = Σ(X_i - X̄)(Y_i - Ȳ) / Σ(X_i - X̄)², or

(14) β = [N ΣX_iY_i - ΣX_i ΣY_i] / [N ΣX_i² - (ΣX_i)²].

The latter is referred to as the computational formula, as it's supposed to be easier to compute by hand. (I actually prefer the former, which I find easier to compute, and which has the added advantage of nicely illustrating the important features of OLS regression.)

We can use Equation 13 to calculate the least squares estimate for the above data:

[Table: for each case, X_i, Y_i, X_i - X̄, Y_i - Ȳ, (X_i - X̄)(Y_i - Ȳ), and (X_i - X̄)². The column sums give X̄ = 5, Ȳ = 3.75, Σ(X_i - X̄)(Y_i - Ȳ) = 9, and Σ(X_i - X̄)² = 20.]

So solving Equation 13 with the values above looks like this:

β = Σ(X_i - X̄)(Y_i - Ȳ) / Σ(X_i - X̄)² = 9 / 20 = .45

And we can use these results in Equation 9 to find the constant:

α = Ȳ - βX̄ = 3.75 - (.45)(5) = 3.75 - 2.25 = 1.5

So the final model looks like this: Y_i = 1.5 + (.45)X_i.
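Here is a small sketch (not in the original notes) that applies Equations 13 and 9 in code. The x, y values are hypothetical; they are chosen so that they reproduce the sums used in the worked example (X̄ = 5, Ȳ = 3.75, Σ(X_i - X̄)(Y_i - Ȳ) = 9, Σ(X_i - X̄)² = 20), but they are not necessarily the original four data points.

```python
# Sketch: least-squares slope and intercept from Equations 13 and 9.
# Hypothetical data chosen to match the sums in the worked example above.
x = [2, 4, 6, 8]
y = [2.5, 3.0, 4.5, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Equation 13: beta = sum((X - Xbar)(Y - Ybar)) / sum((X - Xbar)^2)
num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
den = sum((xi - x_bar) ** 2 for xi in x)
beta = num / den

# Equation 9: alpha = Ybar - beta * Xbar
alpha = y_bar - beta * x_bar

# Equation 17: the slope is also the ratio of cov(X, Y) to var(X)
cov_xy = num / (n - 1)
var_x = den / (n - 1)

print(f"beta = {beta:.2f}, alpha = {alpha:.2f}")   # 0.45 and 1.50
print(f"cov/var ratio = {cov_xy / var_x:.2f}")     # same slope, 0.45
```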

Using this model, we can easily see what the individual predicted values (Ŷ_i) are, as well as the associated errors (ε_i):

[Table: X_i, Y_i, Ŷ_i, and ε_i = Y_i - Ŷ_i for each case; X̄ = 5, Ȳ = 3.75.]

One further note about Equation 13, and our means of estimating OLS slope coefficients. Recall the equations for variance (Equation 1) and covariance (Equation 4). If we take the ratio of covariance to variance, as follows,

(15) S_XY / S²_X = [ Σ(X_i - X̄)(Y_i - Ȳ) / (N - 1) ] / [ Σ(X_i - X̄)² / (N - 1) ],

we can adjust somewhat to produce the following,

(16) S_XY / S²_X = Σ(X_i - X̄)(Y_i - Ȳ) / Σ(X_i - X̄)²,

where Equation 16 simply drops the N-1 denominators, which cancel each other out. More importantly, Equation 16 looks suspiciously (indeed, exactly) like the formula for β (Equation 13). β is thus essentially a ratio of the covariance between X and Y to the variance of X, as follows:

(17) β_YX = S_YX / S²_X.

This should make sense when we consider the standard interpretation of β: for a one-unit shift in X, how much does Y change?

Multivariate Ordinary Least Squares Regression

Things are more complicated for multiple, or multivariate, regression, where there is more than one independent variable. The standard OLS multivariate model is nevertheless a relatively simple extension of bivariate regression; imagine, for instance, plotting a line through dots plotted along two X axes, in what amounts to three-dimensional space:

[Figure: three-dimensional scatterplot of Y against X_1 and X_2.]

This is all we're doing in multivariate regression: drawing a line through these dots, where values of Y are driven by a combination of X_1 and X_2, and where the model itself would be as follows:

(18) Y_i = α + β_1X_1i + β_2X_2i + ε_i.

That said, when we have more than two regressors, we start plotting lines through four- and five-dimensional space, and that gets hard to draw.

Least squares coefficients for multiple regression with two regressors, as in Equation 18, are calculated as follows:

(19) β_1 = [ Σ(X_1i - X̄_1)(Y_i - Ȳ) Σ(X_2i - X̄_2)² - Σ(X_2i - X̄_2)(Y_i - Ȳ) Σ(X_1i - X̄_1)(X_2i - X̄_2) ] / [ Σ(X_1i - X̄_1)² Σ(X_2i - X̄_2)² - ( Σ(X_1i - X̄_1)(X_2i - X̄_2) )² ],

and

(20) β_2 = [ Σ(X_2i - X̄_2)(Y_i - Ȳ) Σ(X_1i - X̄_1)² - Σ(X_1i - X̄_1)(Y_i - Ȳ) Σ(X_1i - X̄_1)(X_2i - X̄_2) ] / [ Σ(X_1i - X̄_1)² Σ(X_2i - X̄_2)² - ( Σ(X_1i - X̄_1)(X_2i - X̄_2) )² ],

and the constant is now estimated as follows:

(21) α = Ȳ - β_1X̄_1 - β_2X̄_2.

Error, and Model Fit

The standard deviation of the residuals, or the standard error of the slope, is as follows,

(22) SE_β = √[ Σε_i² / (N - 2) ],

or, more generally,

(23) SE_β = √[ Σε_i² / (N - K - 1) ].

Equation 22 is the same as Equation 23, except that the former is a simple version that applies to bivariate regression only, and the latter is a more general version that applies to multivariate regression with any number of independent variables. N in these equations refers to the total number of cases, while K is the total number of independent variables in the model.

The SE_β is a useful measure of the fit of a regression slope; it gives you the average error of the prediction. It's also used to test the significance of the slope coefficient. For instance, if we are going to be 95% confident that our estimate is significantly different from zero, zero should not fall within the interval β ± 2(SE_β). Alternatively, if we are using t-statistics to examine coefficients' significance, then the ratio of β to SE_β should be roughly 2. Assuming you remember the basic sampling and distributional material in your basic statistics course, this reasoning should sound familiar. Here's a quick refresher.

Testing model fit is based on some standard beliefs about distributions. Normal distributions are unimodal, symmetric, and are described by the following probability distribution:

(24) p(Y) = [ 1 / √(2πσ²_Y) ] e^( -(Y - μ_Y)² / 2σ²_Y ),

where p(Y) refers to the probability of a given value of Y, and where the shape of the curve is determined by only two values: the population mean, μ_Y, and its variance, σ²_Y. (Also see our discussion of distribution functions, below.) Assuming two distributions with the same mean (of zero, for instance), the effect of changing variances is something like this:

[Figure: two normal curves with the same mean but different variances.]

We know that many natural phenomena follow a normal distribution. So we assume that many political phenomena do as well. Indeed, where the current case is concerned, we believe that our estimated slope coefficient, β, is one of a distribution of possible βs we might find in repeated samples. These βs are normally distributed, with a standard deviation that we try to estimate from our data. We also know that in any normal distribution, roughly 68% of all cases fall within plus or minus one standard deviation from the mean, and 95% of all cases fall within plus or minus two standard deviations from the mean. It follows that our slope should not be within two standard errors of zero. If it is, we cannot be 95% confident that our coefficient is significantly different from zero; that is, we cannot reject the null hypothesis that there is no significant effect.

Going through this process step-by-step is useful. Let's begin with our estimated bivariate model from above, where the model is Y_i = 1.5 + (.45)X_i, and the data are,

[Table: X_i, Y_i, Ŷ_i, ε_i = Y_i - Ŷ_i, and ε_i² for each of the four cases; X̄ = 5, Ȳ = 3.75, and Σε_i² = 2.7.]

Based on Equation 22, we calculate the standard error of the slope as follows:

SE_β = √[ Σε_i² / (N - 2) ] = √[ 2.7 / (4 - 2) ] = √1.35 = 1.16

So, we can be 95% confident that the slope estimate in the population is .45 ± (2 × 1.16), or .45 ± 2.32. Zero is certainly within this interval, so our results are not statistically significant. This is mainly due to our very small sample size. Imagine the same slope and the same sum of squared errors, but based on a sample of 200 cases:

SE_β = √[ Σε_i² / (N - 2) ] = √[ 2.7 / (200 - 2) ] = √.014 = .118

Now we can be 95% confident that the slope estimate in the population is .45 ± (2 × .118), or .45 ± .236. Zero is not within this interval, so our results in this case would be statistically significant.

Just to recap, our decision about the statistical significance of the slope is based on a combination of the magnitude of the slope (β), the total amount of error in the estimate (using the SE_β), and the sample size (N, used in our calculation of the SE_β). Any one of these things can contribute to significant findings: a greater slope, less error, and/or a larger sample size. (Here, we saw the effect that sample size can have.)

Another means of examining the overall model fit (that is, including all independent variables in a multivariate context) is by looking at the proportion of the total variation in Y_i explained by the model. First, total variation can be decomposed into explained and unexplained components as follows:

TSS is the Total Sum of Squares
RSS is the Regression Sum of Squares (note that some texts call this RegSS)
ESS is the Error Sum of Squares (some texts call this the residual sum of squares, RSS)

So, TSS = RSS + ESS, where

(25) TSS = Σ(Y_i - Ȳ)²,
(26) RSS = Σ(Ŷ_i - Ȳ)², and
(27) ESS = Σ(Y_i - Ŷ_i)².

We're basically dividing up the total variance in Y_i around its mean (TSS) into two parts: the variance accounted for by the regression model (RSS), and the variance not accounted for by the regression model (ESS). Indeed, we can illustrate on a case-by-case basis the variance from the mean that is accounted for by the model, and the remaining, unaccounted for, variance:

[Figure: for each case, the distance from Ȳ to Ŷ_i (explained) and from Ŷ_i to Y_i (unexplained).]

All the explained variance (squared) is summed to form RSS; all the unexplained variance (squared) is summed to form ESS. Using these terms, the coefficient of determination, more commonly the R², is calculated as follows:

(28) R² = RSS / TSS, or R² = 1 - ESS / TSS, or R² = (TSS - ESS) / TSS.

Or, alternatively, following from Equation 25 to Equation 27:

(29) R² = RSS / TSS = Σ(Ŷ_i - Ȳ)² / Σ(Y_i - Ȳ)² = [ Σ(Y_i - Ȳ)² - Σ(Y_i - Ŷ_i)² ] / Σ(Y_i - Ȳ)².

And we can estimate all of this as follows:

[Table: X_i, Y_i, Ŷ_i, (Y_i - Ȳ)², (Ŷ_i - Ȳ)², and (Y_i - Ŷ_i)² for each case; X̄ = 5, Ȳ = 3.75, and the column sums give TSS = 6.74, RSS = 4.04, ESS = 2.7.]

The coefficient of determination is thus R² = RSS / TSS = 4.04 / 6.74 = .599. The coefficient of determination is calculated the same way for multivariate regression.

The R² has one problem, though: it can only ever increase or stay the same as variables are added to the equation. More to the point, including extra variables can never lower the R², and the measure accordingly does not reward model parsimony. If you want a measure that does so, you need to use a correction for degrees of freedom (sometimes called an adjusted R-squared):

(30) R̄² = 1 - [ ESS / (N - K - 1) ] / [ TSS / (N - 1) ].

Note that this should only make a difference when the sample size is relatively small, or the number of independent variables is relatively large. You can see in Equation 30 that adding variables shrinks N - K - 1, which inflates the error term being subtracted, and thus reduces the adjusted R²; with a small sample, this penalty matters more.

One further note about the coefficient of determination: in the bivariate case, the R² is equivalent to the square of Pearson's r (Equation 6). That is,

(31) r_XY = S_XY / (S_X S_Y) = √(R²_XY).

There is, then, a clear relationship between the correlation coefficient and the coefficient of determination. There is also a relationship between a bivariate correlation coefficient and the regression coefficient. Let's begin with an equation for the regression coefficient, as in Equation 17 above:

(32) β_YX = S_XY / S²_X,

and rearrange these terms to isolate the covariance:

(33) S_XY = β_YX S²_X.

Now, let's substitute this for S_XY in the equation for correlation (Equation 6):

(34) r_XY = S_XY / (S_X S_Y) = β_YX S²_X / (S_X S_Y).

So the correlation coefficient and the bivariate regression coefficient are simple functions of each other. More clearly:

(35) r_XY = β_YX (S_X / S_Y), and

(36) β_YX = r_XY (S_Y / S_X).

The relationship between the two in multivariate regression is of course much more complicated. But the point is that all these measures, measures capturing various aspects of the relationship between two (or more) variables, are related to each other, each a function of a given set of variances and covariances.
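The relationships in Equations 28, 31, 35, and 36 can be checked numerically. The sketch below (not part of the original notes) fits a bivariate regression to made-up data and confirms that R² equals r², and that the slope equals r multiplied by S_Y/S_X.

```python
# Sketch: in bivariate regression, R^2 = r^2 and beta = r * (S_Y / S_X).
# Illustrative data only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.5, 3.9, 3.2, 5.0, 5.5, 6.1, 7.2]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
s2_x = sum((a - x_bar) ** 2 for a in x) / (n - 1)
s2_y = sum((b - y_bar) ** 2 for b in y) / (n - 1)

beta = s_xy / s2_x                       # Equation 17
alpha = y_bar - beta * x_bar             # Equation 9
r = s_xy / (s2_x ** 0.5 * s2_y ** 0.5)   # Equation 6

# Sums of squares (Equations 25-27) and R^2 (Equation 28)
y_hat = [alpha + beta * a for a in x]
tss = sum((b - y_bar) ** 2 for b in y)
ess = sum((b - bh) ** 2 for b, bh in zip(y, y_hat))
r_squared = 1 - ess / tss

print(f"r^2 = {r ** 2:.4f}  R^2 = {r_squared:.4f}")                        # equal
print(f"beta = {beta:.4f}  r * S_Y/S_X = {r * (s2_y / s2_x) ** 0.5:.4f}")  # equal
```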

Assumptions of OLS regression

The preceding OLS linear regression models are unbiased and efficient (that is, they provide the Best Linear Unbiased Estimator, or BLUE) provided five assumptions are not violated. If any of these assumptions are violated, the regular linear OLS model ceases to be unbiased and/or efficient. The assumptions themselves, as well as problems resulting from violating each one, are listed below (drawn from Kennedy, Econometrics). Of course, many data or models violate one or more of these assumptions, so much of what we have to cover now is how to deal with these problems.

1. Y can be calculated as a linear function of X, plus a disturbance term.
   Problems: wrong regressors, nonlinearity, changing parameters

2. The expected value of the disturbance term is zero (its mean is zero).
   Problems: biased intercept

3. Disturbance terms have the same variance and are not correlated with one another.
   Problems: heteroskedasticity, autocorrelated errors

4. Observations of the independent variables are fixed in repeated samples; it is possible to repeat the sample with the same independent variable values.
   Problems: errors in variables, autoregression, simultaneity

5. The number of observations is greater than the number of independent variables, and there are no exact linear relationships between the independent variables.
   Problems: multicollinearity

Nonlinearities

So far, we've assumed that the relationship between Y_i and X_i is linear. In many cases, this will not be true. We could imagine any number of non-linear relationships. Here are just two common possibilities:

[Figures: two scatterplots, one showing an exponentially increasing relationship, the other a ceiling effect.]

We can of course estimate a linear relationship in both cases; it doesn't capture the actual relationship very well, though. In order to better capture the relationship between Y and X, we may want to adjust our variables to represent this non-linearity. Let's begin with the basic multivariate model,

(37) Y_i = α + β_1X_1i + β_2X_2i + ε_i.

Where a single X is believed to have a nonlinear relationship with Y, the simplest approach is to manipulate that X; to use X² in place of X, for instance:

(38) Y_i = α + β_1X_1i² + β_2X_2i + ε_i.

This may capture the exponential increase depicted in the first figure above. To capture the ceiling effect in the second figure, we could use both the linear (X) and quadratic (X²) terms, with the expectation that the coefficient for the former (β_1) would be positive and large, and the coefficient for the latter (β_2) would be negative and small:

(39) Y_i = α + β_1X_1i + β_2X_1i² + β_3X_2i + ε_i.

The coefficient on the quadratic will gradually, and increasingly, reduce the positive effect of X_1. Indeed, if the effect of the quadratic is great enough, it can, in combination with the linear version of X_1, produce a line that increases, peaks, and then begins to decrease.

Of course, these are just two of the simplest (and most common) nonlinearities. You can imagine any number of different non-linear relationships; most can be captured by some kind of mathematical adjustment to regressors.

Sometimes we believe there is a nonlinear relationship between all the Xs and Y; that is, all Xs combined have a nonlinear effect on Y, for instance:

(40) Y_i = (α + β_1X_1i + β_2X_2i)² + ε_i.

The easiest way to estimate this is not Equation 40, though, but rather an adjustment as follows:

(41) √Y_i = α + β_1X_1i + β_2X_2i + ε_i.

Here, we simply transform the dependent variable. I've replaced the squared version of the right-hand side (RHS) variables with the square root of the left-hand side (LHS) because it's a simple example of a nonlinear transformation. It's not the most common, however. The most common is taking the log of Y, as follows:

(42) ln(Y_i) = α + β_1X_1i + β_2X_2i + ε_i.

Doing so serves two purposes. First, we might believe that the shape of the effect of our RHS variables on Y_i is actually nonlinear, and specifically logistic in shape (an S-curve). This transformation may quite nicely capture this nonlinearity. Second, taking the log of Y_i can solve a distributional problem with that variable. OLS estimations will work more efficiently with variables that are normally distributed. If Y_i has a great many small values and a long right-hand tail (as many of our variables will; for instance, income), then taking the log of Y_i often does a nice job of generating a more normal distribution.

This example highlights a second reason for transforming a variable, on the LHS or RHS. Sometimes, a transformation is based on a particular expected shape of an effect, based on theory. Other times, a transformation is used to fix a non-normally distributed variable. The first transformation is based on theoretical expectations; the second is based on a statistical problem. (In practice, separating the two is not always easy.)
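As a concrete illustration of Equations 39 and 42, here is a sketch (not from the original notes) that fits a quadratic specification and a logged-Y specification by least squares on simulated data; numpy's lstsq is used simply as a convenient OLS solver.

```python
# Sketch: a quadratic specification (Equation 39) and a logged dependent
# variable (Equation 42), both fit by OLS on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(0, 10, n)
x2 = rng.normal(size=n)

# Simulated ceiling effect in X1, plus a linear effect of X2
y = 1 + 1.5 * x1 - 0.08 * x1 ** 2 + 0.5 * x2 + rng.normal(scale=0.5, size=n)

# Equation 39: include both X1 and X1 squared as regressors
X_quad = np.column_stack([np.ones(n), x1, x1 ** 2, x2])
b_quad, *_ = np.linalg.lstsq(X_quad, y, rcond=None)
print("quadratic model (alpha, b1, b2, b3):", np.round(b_quad, 3))

# Equation 42: log the dependent variable (requires Y > 0)
y_pos = np.exp(0.2 + 0.1 * x1 + 0.3 * x2 + rng.normal(scale=0.2, size=n))
X_lin = np.column_stack([np.ones(n), x1, x2])
b_log, *_ = np.linalg.lstsq(X_lin, np.log(y_pos), rcond=None)
print("logged-Y model (alpha, b1, b2):", np.round(b_log, 3))
```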

Collinearity and Multicollinearity

When there is an exact linear relationship among the regressors, the OLS coefficients are not uniquely identified; when the relationship is close to exact, the coefficients are identified but estimated very imprecisely. This is not a problem if your goal is only to predict Y; multicollinearity will not affect the overall prediction of the regression model. If your goal is to understand how the individual RHS variables impact Y, however, multicollinearity is a big problem. One problem is that the individual p-values can be misleading: confidence intervals on the regression coefficients will be very wide.

Essentially, what we are concerned about is the correlation amongst regressors, for instance, X_1 and X_2:

(43) r_12 = Σ(X_1i - X̄_1)(X_2i - X̄_2) / √[ Σ(X_1i - X̄_1)² Σ(X_2i - X̄_2)² ].

This is of course just a simple adjustment to the Pearson's r equation (Equation 7). Equation 43 deals just with the relationship between two variables, however, and we are often worried about a more complicated situation, one in which a given regressor is correlated with a combination of several, or even all, of the other regressors in a model. (Note that this multicollinearity can exist even if there are no striking bivariate relationships between regressors.)

Multicollinearity is perhaps most easily depicted as a regression model in which one X is regressed on all others. That is, for the regression model,

(44) Y_i = α + β_1X_1i + β_2X_2i + β_3X_3i + β_4X_4i + ε_i,

we might be concerned that the following regression produces strong results:

(45) X_1i = α + β_2X_2i + β_3X_3i + β_4X_4i + ε_i.

If X_1 is well predicted by X_2 through X_4, it will be very difficult to identify the slope (and error) for X_1 separately from the set of other slopes (and errors). (The slopes and errors for the other variables may be affected as well.)

Variance inflation factors are one measure that can be used to detect multicollinearity. Essentially, VIFs are a scaled version of the multiple correlation coefficient between variable j and the rest of the independent variables. Specifically,

(46) VIF_j = 1 / (1 - R²_j),

where R²_j would be based on results from a model as in Equation 45. If R²_j equals zero (i.e., no correlation between X_j and the remaining independent

variables), then VIF_j equals 1. This is the minimum value. As R²_j increases, however, the denominator of Equation 46 decreases, and the estimated VIF rises as a consequence. A value greater than 10 represents a pretty big multicollinearity problem.

VIFs tell us how much the variance of the estimated regression coefficient is 'inflated' by the existence of correlation among the predictor variables in the model. The square root of the VIF actually tells us how much the standard error is inflated. This table, drawn from the Sage volume by Fox, shows the relationship between a given R²_j, the VIF, and the estimated amount by which the standard error of X_j is inflated by multicollinearity.

[Table: Coefficient Variance Inflation as a Function of Inter-Regressor Multiple Correlation. Columns: R²_j, VIF_j, and the impact on SE_β_j (the square root of the VIF); values not reproduced here.]

Ways of dealing with multicollinearity include (a) dropping variables, (b) combining multiple collinear variables into a single measure, and/or (c) if collinearity is only moderate, and all variables are of substantive importance to the model, simply interpreting coefficients and standard errors taking into account the effects of multicollinearity.
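The sketch below (not in the original notes) computes VIF_j as in Equation 46, by regressing each regressor on the others and plugging the resulting R²_j into 1/(1 - R²_j). The data are simulated, with X2 deliberately built to be collinear with X1.

```python
# Sketch: variance inflation factors (Equation 46). Each column of X is
# regressed on the remaining columns; VIF_j = 1 / (1 - R^2_j).
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.5, size=n)   # deliberately collinear with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """R^2 from regressing column j on the other columns, then 1 / (1 - R^2)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(len(y)), others])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ b
    r2_j = 1 - resid.var() / y.var()
    return 1 / (1 - r2_j)

for j in range(X.shape[1]):
    print(f"VIF for X{j + 1}: {vif(X, j):.2f}")
```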

Heteroskedasticity

Heteroskedasticity refers to unequal variance in the regression errors. Note that there can be heteroskedasticity relating to the effect of individual independent variables, and also heteroskedasticity related to the combined effect of all independent variables. (In addition, there can be heteroskedasticity in terms of unequal variance over time.) The following figure portrays the standard case of heteroskedasticity, where the variance in Y (and thus the regression error as well) is systematically related to values of X.

[Figure: scatterplot in which the spread of Y around the regression line grows as X increases.]

The difficulty here is that the error of the slope will be poorly estimated: it will over-estimate the error at small values of X, and under-estimate the error at large values of X.

Diagnosing heteroskedasticity is often easiest by looking at a plot of errors (ε_i) by values of the dependent variable (Y_i). Basically, we begin with the standard bivariate model of Y_i,

(47) Y_i = α + βX_i + ε_i,

and then plot the resulting values of ε_i by Y_i. If we did so for the data in the preceding figure, then the resulting residuals plot would look as follows:

[Figure: residuals plotted against Y_i, fanning out as Y_i increases.]

As Y_i increases here, so too does the variance in ε_i. There are of course other possible (heteroskedastic) relationships between Y_i and ε_i; for instance, the variance may be much greater in the middle. Any version of heteroskedasticity presents problems for OLS models.

When the sample size is relatively small, these diagnostic graphs are probably the best means of identifying heteroskedasticity. When the sample size is large, there are too many dots on the graph to distinguish what's going on. There are several tests for heteroskedasticity, however. The Breusch-Pagan test looks for a relationship between the error and the independent variables. It starts with a standard multivariate regression model,

(48) Y_i = α + β_1X_1i + β_2X_2i + ... + β_kX_ki + ε_i,

and then substitutes the estimated errors, squared, for the dependent variable,

(49) ε̂_i² = α + β_1X_1i + β_2X_2i + ... + β_kX_ki + ν_i.

We then use a standard F-test to test the joint significance of the coefficients in Equation 49. If they are significant, there is some kind of systematic relationship between the independent variables and the error.
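A sketch of the Breusch-Pagan idea (not from the original notes): squared OLS residuals are regressed on the regressors, as in Equation 49. The notes describe an F-test on that auxiliary regression; the sketch below uses the closely related N·R² (Lagrange multiplier) form of the statistic, which is asymptotically χ² with K degrees of freedom. The data are simulated to be heteroskedastic.

```python
# Sketch: Breusch-Pagan-style test (Equations 48-49), using the N * R^2 form.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(1, 10, n)
y = 2 + 0.5 * x + rng.normal(scale=0.3 * x, size=n)   # error variance grows with x

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ b) ** 2                                  # squared residuals

g, *_ = np.linalg.lstsq(X, e2, rcond=None)             # auxiliary regression (Eq. 49)
r2_aux = 1 - ((e2 - X @ g) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()

lm_stat = n * r2_aux                                   # Lagrange multiplier statistic
p_value = stats.chi2.sf(lm_stat, df=1)                 # df = number of regressors (1 here)
print(f"LM statistic = {lm_stat:.2f}, p-value = {p_value:.4f}")
```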

Outliers

Recall that OLS regression pays particularly close attention to avoiding large errors. It follows that outliers, cases that are unusual, can have a particularly large effect on an estimated regression slope. Consider the following two possibilities, where a single outlier has a huge effect on the estimated slope:

[Figures: two scatterplots in which one outlying case pulls the fitted line away from the bulk of the data.]

Hat values (h_i) are the common measure of leverage in a regression. It is possible to express the fitted values of Y in terms of the observed values:

(50) Ŷ_j = h_1jY_1 + h_2jY_2 + ... + h_njY_n = Σ_i h_ijY_i.

The coefficient, or weight, h_ij captures the contribution of each observation Y_i to the fitted value Ŷ_j. Outlying cases can usually not be discovered by looking at residuals; OLS estimation tries, after all, to minimize the error for high-leverage cases. In fact, the variance in residuals is in part a function of leverage,

(51) V(E_i) = σ²(1 - h_i).

The greater the hat value in Equation 51, the lower the variance. How can we identify high-leverage cases? Sometimes, simply plotting data can be very helpful. Also, we can look closely at residuals. Start with the model for standardized residuals, as follows,

(52) E′_i = E_i / ( S_E √(1 - h_i) ),

which simply expresses each residual as a number (or increment) of standard deviations in E_i. The problem with Equation 52 is that case i is included in the

estimation of the variance; what we really want is a sense of how case i looks in relation to the variance in all other cases. This is a studentized residual,

(53) E*_i = E_i / ( S_E(-i) √(1 - h_i) ),

where S_E(-i) is the standard deviation of the residuals estimated with case i omitted. It provides a good indication of just how far out a given case is in relation to all other cases. (To test significance, the statistic follows a t-distribution with N-K-2 degrees of freedom.)

Note that you can estimate studentized residuals in a quite different way (though with the same results). Start by defining a variable D, equal to 1 for case i and equal to 0 for all other cases. Now, for a multivariate regression model as follows:

(54) Y_i = α + β_1X_1 + β_2X_2 + ... + β_kX_k + ε_i,

add variable D and estimate,

(55) Y_i = α + β_1X_1 + β_2X_2 + ... + β_kX_k + γD_i + ε_i.

This is referred to as a mean-shift outlier model, and the t-statistic for γ provides a test equivalent to the studentized residual.

What do we do if we have outliers? That depends. If there are reasons to believe the case is abnormal, then sometimes it's best just to drop it from the dataset. If you believe the case is correct, or justifiable, however, in spite of the fact that it's an outlier, then you may choose to keep it in the model. At a minimum, you will want to test your model with and without this outlier, to explore the extent to which your results are driven by a single case (or, in the case of several outliers, a small number of cases).
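The sketch below (not part of the original notes) computes leverage and studentized residuals along the lines of Equations 50 through 53, on simulated data that include one deliberately unusual case. The leave-one-out standard deviation S_E(-i) is computed with the usual shortcut rather than by literally re-running the regression N times.

```python
# Sketch: hat values (leverage) and studentized residuals (Equations 50-53).
# H = X (X'X)^-1 X'; its diagonal gives each case's leverage.
import numpy as np

rng = np.random.default_rng(3)
n = 30
x = rng.normal(size=n)
x[-1] = 6.0                                   # one high-leverage X value ...
y = 1 + 0.5 * x + rng.normal(scale=0.5, size=n)
y[-1] += 4.0                                  # ... that is also far off the line

X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)                                # hat values
e = y - H @ y                                 # residuals
k = X.shape[1] - 1                            # number of regressors

# Studentized residuals: each residual scaled by a sigma estimated without case i.
stud = np.empty(n)
for i in range(n):
    sse_without_i = (e ** 2).sum() - e[i] ** 2 / (1 - h[i])
    s_e_without_i = np.sqrt(sse_without_i / (n - k - 2))
    stud[i] = e[i] / (s_e_without_i * np.sqrt(1 - h[i]))

worst = int(np.argmax(np.abs(stud)))
print(f"case {worst}: leverage = {h[worst]:.3f}, studentized residual = {stud[worst]:.2f}")
```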

Models for dichotomous data

Linear Probability Models

Let's begin with a simple definition of our binary dependent variable. We have a variable, Y_i, which only takes on the values 0 or 1. We want to predict when Y_i is equal to 0, or 1; put differently, we want to know for each individual case i the probability that Y_i is equal to 1, given X_i. More formally,

(56) E(Y_i) = Pr(Y_i = 1 | X_i),

which states that the expected value of Y_i is equal to the probability that Y_i is equal to one, given X_i. Now, a linear probability model simply estimates Pr(Y_i = 1) in the same way as we would estimate an interval-level Y_i:

(57) Pr(Y_i = 1) = α + βX_i.

There are two difficulties with this kind of model. First, while the estimated slope coefficients are good, the standard errors are incorrect due to heteroskedasticity (errors increase in the middle range, first negative, then positive). Graphing the data with a regular linear regression line, for instance, would look something like this:

[Figure: 0/1 outcomes plotted against X, with a straight regression line running below 0 and above 1 at the extremes.]

The second problem with the linear probability model is that it will generate predictions that are greater than 1 and/or less than 0 (as shown in the preceding figure), even though these are nonsensical where probabilities are concerned. As a consequence, it is desirable to try and transform either the LHS or RHS of the model so predictions are both realistic and efficient.

Nonlinear Probability Model: Logistic Regression

One option is to transform Y_i, to develop a nonlinear probability model. To extend the range beyond 0 to 1, we first transform the probability into the odds,

(58) Pr(Y_i = 1 | X_i) / Pr(Y_i = 0 | X_i) = Pr(Y_i = 1 | X_i) / [ 1 - Pr(Y_i = 1 | X_i) ],

which indicate how often something happens relative to how often it does not, and which range from 0 to infinity as the probability moves from 0 to 1. We then take the log of this to get,

(59) ln( Pr(Y_i = 1 | X_i) / [ 1 - Pr(Y_i = 1 | X_i) ] ),

or more simply,

(60) ln( p_i / (1 - p_i) ),

where,

(61) p_i = Pr(Y_i = 1 | X_i).

Modelling what we've seen in Equation 60 then captures the log odds that something will happen. By taking the log, we've effectively stretched out the ends of the 0 to 1 range, and consequently have a comparatively unconstrained dependent variable that can be modelled as a linear function of the regressors:

(62) ln( p_i / (1 - p_i) ) = βX_i.

Just to make clear the effects of our transformation, here's what taking the log odds of a simple probability looks like:

Probability   Odds              Logit
.01           .01/.99 = .01     -4.60
.05           .05/.95 = .05     -2.94
.10           .10/.90 = .11     -2.20
.30           .30/.70 = .43      -.85
.50           .50/.50 = 1.00      .00
.70           .70/.30 = 2.33      .85
.90           .90/.10 = 9.00     2.20
.95           .95/.05 = 19.00    2.94
.99           .99/.01 = 99.00    4.60

Note that there is another way of representing a logit model, essentially the inverse (un-logging of both sides) of Equation 62:

(63) Pr(Y_i = 1 | X_i) = exp(βX_i) / [ 1 + exp(βX_i) ].

Just to be clear, we can work our way backwards from Equation 63 to Equation 62 as follows:

(64) Pr(Y_i = 1 | X_i) = exp(βX_i) / [ 1 + exp(βX_i) ], and Pr(Y_i = 0 | X_i) = 1 / [ 1 + exp(βX_i) ], or 1 - exp(βX_i) / [ 1 + exp(βX_i) ].

So,

(65) p_1 / p_0 = p_1 / (1 - p_1) = { exp(βX_i) / [ 1 + exp(βX_i) ] } / { 1 / [ 1 + exp(βX_i) ] } = exp(βX_i) / 1,

and,

(66) p_i / (1 - p_i) = exp(βX_i),

which, when logging both sides, becomes,

(67) ln( p_i / (1 - p_i) ) = βX_i.
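A tiny sketch (not in the original notes) that reproduces the probability-to-odds-to-logit mapping in the table above, and then maps the logit back to a probability via Equation 63.

```python
# Sketch: probability -> odds -> logit, and the inverse mapping (Equation 63).
import math

def logit(p):
    """Log odds of a probability (Equations 58-60)."""
    return math.log(p / (1 - p))

def inv_logit(z):
    """Equation 63: map a linear prediction back onto the 0-1 probability scale."""
    return math.exp(z) / (1 + math.exp(z))

for p in [0.01, 0.05, 0.10, 0.30, 0.50, 0.70, 0.90, 0.95, 0.99]:
    z = logit(p)
    print(f"p = {p:.2f}  odds = {p / (1 - p):6.2f}  logit = {z:5.2f}  "
          f"back-transformed = {inv_logit(z):.2f}")
```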

The notation in Equation 62 is perhaps the most useful in connecting logistic regression with probit and other non-linear estimations for binary data. The logit transformation is just one possible transformation that effectively maps the linear prediction into the 0 to 1 interval, allowing us to retain the fundamentally linear structure of the model while at the same time avoiding the contradiction of probabilities below 0 or above 1. Many cumulative distribution functions (CDFs) will meet this requirement. (Note that CDFs define the probability mass to the left of a given value of X; they are closely related to PDFs, which are dealt with in more detail in the section on significance tests.)

Equation 63 is in contrast useful for thinking about the logit model as just one example of transformations in which Pr(Y_i = 1) is a function of a non-linear transformation of the RHS variables, based on any number of CDFs. A more general version of Equation 63 is, then,

(68) Pr(Y_i = 1 | X_i) = F(βX_i),

where F is the logistic CDF for the logit model, as follows,

(69) Pr(Y_i = 1 | X_i) = F(βX_i), where F(x) = 1 / [ 1 + exp(-(x - µ)/s) ],

but F could just as easily be the normal CDF for the probit model, or a variety of other CDFs. How do we know which CDF to use? The CDF we choose should reflect our beliefs about the distribution of Y_i, or, alternatively (and equivalently), the distribution of error in Y_i. We discuss this more below.

An Alternative Description: The Latent Variable Model

Another way to draw the link between logistic and regular regression is through the latent variable model, which posits that there is an unobserved, latent variable Y*_i, where

(70) Y*_i = βX_i + ε_i,

and the link between the observed binary Y_i and the latent Y*_i is as follows:

(71) Y_i = 1 if Y*_i > 0, and

(72) Y_i = 0 if Y*_i ≤ 0.

Using this example, the relationship between the observed binary Y_i and the latent Y*_i can be graphed as follows:

[Figure: the latent variable Y*_i plotted against X_i, with the Y_i = 0 and Y_i = 1 regions divided at Y*_i = 0.]

So, at any given value of X_i there is a given probability that Y*_i is greater than zero. This figure also shows how our beliefs about the distribution of the error (ε_i) are fundamental: there is a distribution of possible outcomes in Y*_i when, in this figure, X_i = 4. For a probit model, we assume that Var(ε_i) = 1; for a logit model, we assume that Var(ε_i) = π²/3. Other CDFs make other assumptions.

The distribution of error (ε_i) at any given value of X_i is related to a non-linear increase in the probability that Y_i = 1. Indeed, we can show this non-linear shift first by plotting a distribution of ε_i at each value of X_i, and then by looking at how the movement of this distribution across the zero line shifts the probability that Y_i = 1:

[Figure: error distributions drawn at several values of X_i; as each distribution crosses the zero line, Pr(Y_i = 1) rises along an S-shaped curve.]

As the thick part of the distribution moves across the zero line, the probability increases dramatically.

Nonlinear Probability Model: Probit Regression

As noted above, probit models are based on the same logic as logistic models. Again, they can be thought of as a non-linear transformation of the LHS or RHS variables. The only difference for probit models is that rather than assume a logistic distribution, we assume a normal one. In Equation 68, then, F would now be the cumulative distribution function for a normal distribution.

Why assume a normal distribution? The critical question is really why assume a logistic one? We typically assume a logistic distribution because it is very close to normal, and estimating a logistic model is computationally much easier than estimating a probit model. We now have faster computers, so there is now less reason to rely on logit rather than probit models. That said, logit has some advantages where teaching is concerned. Compared to probit, it's very simple.
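To make the logit/probit comparison concrete, here is a sketch (not in the original notes) that evaluates the two candidate CDFs for F in Equation 68. Because the standard logistic error has variance π²/3 while the probit error has variance 1, the normal CDF is evaluated at z divided by π/√3 so the two are compared on a common scale; with that adjustment the curves are very close.

```python
# Sketch: logistic vs. normal CDF as the choice of F in Equation 68.
# The standard logistic error has sd pi/sqrt(3); dividing z by that puts the
# normal CDF on a comparable scale.
import math
from scipy import stats

def logistic_cdf(z):
    return 1 / (1 + math.exp(-z))

scale = math.pi / math.sqrt(3)   # sd of the standard logistic distribution
for z in (-3, -2, -1, 0, 1, 2, 3):
    logit_p = logistic_cdf(z)
    probit_p = stats.norm.cdf(z / scale)
    print(f"z = {z:+d}  logistic = {logit_p:.3f}  normal (rescaled) = {probit_p:.3f}")
```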

Maximum Likelihood Estimation

Models for categorical variables are not estimated using OLS, but using maximum likelihood. ML estimates are the values of the parameters that have the greatest likelihood (that is, the maximum likelihood) of generating the observed sample of data, if the assumptions of the model are true. For a simple model like Y_i = α + βX_i, an ML estimation looks at many different possible values of α and β, and finds the combination which is most likely to have generated the observed values of Y_i.

[Figure: the observed values of Y_i along the horizontal axis, with two candidate probability distributions, one produced by parameter set A and one by parameter set B.]

Take, for instance, the above graph, which shows the observed values of Y_i on the bottom axis. There are two different probability distributions, one produced by one set of parameters, A, and one produced by another set of parameters, B. MLE asks which distribution seems more likely to have produced the observed data. Here, it looks like the B parameters have an estimated distribution more likely to produce the observed data.

Alternatively, consider the following. If we are interested in the probability that Y_i = 1, given a certain set of parameters (p), then an ML estimation is interested in the likelihood of p given the observed data,

(73) L(p | Y_i).

This is a likelihood function. Finding the best set of parameters is an iterative process, which starts somewhere and starts searching; different optimization algorithms may start in slightly different places, and conduct the search differently; all base their decision about searching for parameters on the rate of improvement in the model. (The way in which model fit is judged is addressed below.)

Note that our being vague about parameters here is purposeful. As analysts, the parameters we are thinking about are the coefficients for the various independent variables (βX). The parameters critical to the ML estimation, however, are those that define the shape of the distribution; for a normal distribution, for instance, these are the mean (µ) and variance (σ²) (see Equation 24). Every set of parameters, βX, however, produces a given estimated normal distribution of Y_i with mean µ and variance σ²; the ML estimation tries to find the βX producing the distribution most likely to have generated our observed data.

Note also that while we speak about ML estimations maximizing the likelihood equation, in practice programs maximize the log of the likelihood, which simplifies computations considerably (and gets the same results). Because the likelihood is always between 0 and 1, the log likelihood is always negative. We can see this in the iteration log in STATA logit estimates, for instance.
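The following sketch (not part of the original notes) shows the mechanics being described: a logit log likelihood is written down for simulated data and maximized numerically by a general-purpose optimizer, which is essentially what the iteration log in a canned logit routine reflects.

```python
# Sketch: maximum likelihood for a logit model "by hand", maximizing the log
# likelihood (by minimizing its negative) with a numerical optimizer.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n = 1000
x = rng.normal(size=n)
true_alpha, true_beta = -0.5, 1.2
p = 1 / (1 + np.exp(-(true_alpha + true_beta * x)))
y = rng.binomial(1, p)                       # simulated 0/1 outcomes

X = np.column_stack([np.ones(n), x])

def neg_log_likelihood(params):
    """Negative log likelihood for candidate (alpha, beta).
    log L = sum[ y * xb - log(1 + exp(xb)) ], with xb = alpha + beta * x."""
    xb = X @ params
    return -np.sum(y * xb - np.log1p(np.exp(xb)))

result = minimize(neg_log_likelihood, x0=np.zeros(2), method="BFGS")
print("ML estimates (alpha, beta):", np.round(result.x, 3))
print("maximized log likelihood:", round(-result.fun, 2))   # negative, as noted above
```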

Interpretation & Goodness of Fit Measures for Categorical Models

Indeed, the -2 log likelihood is the basis of the standard measure of model fit for most categorical models. It is as follows,

(74) -2(LL_B - LL_A),

where LL_A is the log likelihood of finding our sample of Y_i in the distribution produced by our parameterized model, and LL_B is the log likelihood of finding our sample of Y_i in the distribution produced when all parameters are restricted to 0. Essentially, then, we're looking at the total improvement in the model's predictive power: the difference between our model and no model (save for a distributional assumption). Multiplying this difference by -2 has the (albeit mysterious) advantage of producing a statistic that is asymptotically χ² distributed. There are various versions of a pseudo R² for categorical models, usually based on some manipulation of the -2 log likelihood.

To interpret individual coefficients resulting from a categorical model, we usually transform them into odds ratios (from log-odds ratios, which are not readily interpretable). This transformation is relatively simple. Recall that one version of the logit model is as follows,

(75) ln( p_i / (1 - p_i) ) = βX_i.

This is the log odds, of course, equivalent to the following,

(76) p_i / (1 - p_i) = exp(βX_i).

This transformation of coefficients produces odds ratios: each exponentiated coefficient, exp(β), expresses the multiplicative change in the odds that Y_i is equal to 1 (rather than 0) associated with a one-unit increase in X_i. (There are equivalent transformations for probit coefficients.)
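A short sketch (not in the original notes) of the quantities just described. The two log likelihoods and the coefficient vector below are hypothetical numbers, standing in for output from an ML routine like the one sketched above; McFadden's measure is used as one example of a pseudo R² built from the log likelihoods.

```python
# Sketch: likelihood-ratio statistic (Equation 74), a pseudo R^2, and odds
# ratios (Equation 76). The inputs are hypothetical, for illustration only.
import numpy as np

ll_full = -612.4                 # LL_A: log likelihood of the fitted model (hypothetical)
ll_null = -689.9                 # LL_B: log likelihood with all slopes restricted to 0
beta = np.array([-0.48, 1.17])   # fitted (alpha, beta), hypothetical

lr_stat = -2 * (ll_null - ll_full)   # Equation 74; asymptotically chi-square distributed
pseudo_r2 = 1 - ll_full / ll_null    # McFadden's pseudo R^2, one common variant

odds_ratios = np.exp(beta)           # Equation 76: exponentiated coefficients
print(f"LR statistic = {lr_stat:.1f}")
print(f"McFadden pseudo R^2 = {pseudo_r2:.3f}")
print("odds ratios:", np.round(odds_ratios, 3))
```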

Models for Categorical Data

Ordinal Outcomes

For models where the dependent variable is categorical, but ordered, ordered logit is the most appropriate modelling strategy. A typical description begins with a latent variable Y*_i, which is a function of

(77) Y*_i = βX_i + ε_i,

and a link between the observed Y_i and the latent Y*_i as follows:

(78) Y_i = 1 if Y*_i ≤ δ_1, and
     Y_i = 2 if δ_1 < Y*_i ≤ δ_2, and
     Y_i = 3 if Y*_i > δ_2,

where δ_1 and δ_2 are unknown parameters (cutpoints) to be estimated along with the β in Equation 77. We can restate the model, then, as follows:

(79) Pr(Y_i = 1 | X_i) = Pr(βX_i + ε_i ≤ δ_1) = Pr(ε_i ≤ δ_1 - βX_i), and
     Pr(Y_i = 2 | X_i) = Pr(δ_1 < βX_i + ε_i ≤ δ_2) = Pr(δ_1 - βX_i < ε_i ≤ δ_2 - βX_i), and
     Pr(Y_i = 3 | X_i) = Pr(βX_i + ε_i > δ_2) = Pr(ε_i > δ_2 - βX_i).

The last statement of each line here makes clear the importance that the distribution of the error plays in the estimation: the probability of a given outcome can be expressed as the probability that the error is, in the first line for instance, smaller than the difference between the cutpoint (δ_1) and the estimated value (βX_i). This set of statements can also be expressed as follows, adding hats to denote estimated values, substituting the predicted Ŷ_i for βX_i, and inserting a given cumulative distribution function, F, from which we derive our probability estimates:

(80) p̂_i1 = Pr(ε_i ≤ δ̂_1 - Ŷ_i) = F(δ̂_1 - Ŷ_i), and
     p̂_i2 = Pr(δ̂_1 - Ŷ_i < ε_i ≤ δ̂_2 - Ŷ_i) = F(δ̂_2 - Ŷ_i) - F(δ̂_1 - Ŷ_i), and
     p̂_i3 = Pr(ε_i > δ̂_2 - Ŷ_i) = 1 - F(δ̂_2 - Ŷ_i),

where F can again be the logistic CDF (for ordered logit), but also the normal CDF (for ordered probit), and so on. Again, using the logistic version as the

example is far easier, and we can express the whole system in another way, as follows:

(81) ln( p_1 / (1 - p_1) ) = βX,
     ln( (p_1 + p_2) / (1 - p_1 - p_2) ) = βX,
     ...
     ln( (p_1 + p_2 + ... + p_k) / (1 - p_1 - p_2 - ... - p_k) ) = βX.

Note that these models rest on the parallel slopes assumption: the slope coefficients do not vary between different categories of the dependent variable (i.e., from the first to second category, the second to third category, and so on). If this assumption is unreasonable, a multinomial model is more appropriate. (In fact, this assumption can be tested by fitting a multinomial model and examining differences and similarities in coefficients across categories.) And now, when we talk about odds ratios, we are talking about a shift in the odds of falling into (or above) a given category (m),

(82) OR(m) = Pr(Y_i ≥ m) / Pr(Y_i < m).

Nominal Outcomes

Multinomial logit is essentially a series of logit regressions examining the probability that Y_i = m rather than Y_i = k, where k is a reference category. This means that one category of the dependent variable is set aside as the reference category, and all models show the probability of Y_i being one outcome rather than outcome k. Say, for instance, there are four outcomes: k, m, n, and q, where k is the reference category. The models estimated are:

(83) ln( Pr(Y_i = m) / Pr(Y_i = k) ) = β_m X,
     ln( Pr(Y_i = n) / Pr(Y_i = k) ) = β_n X,
     ln( Pr(Y_i = q) / Pr(Y_i = k) ) = β_q X.

These models explore the variables that distinguish each of m, n, and q from k. Any category can be the base category, of course. It may be that it is additionally interesting to see how q is distinguished from the other categories, in which case the following models can be estimated:

(84) ln( Pr(Y_i = k) / Pr(Y_i = q) ) = β_k X,
     ln( Pr(Y_i = m) / Pr(Y_i = q) ) = β_m X,
     ln( Pr(Y_i = n) / Pr(Y_i = q) ) = β_n X.

Results for multinomial logit models aren't expressed as odds ratios, since odds ratios refer to the probability of an outcome divided by one minus that probability. Rather, multinomial results are expressed as a risk ratio, or relative risk, which is easily calculated by taking the exponential of the log risk-ratio, where the log risk-ratio is


More information

12 Statistical Justifications; the Bias-Variance Decomposition

12 Statistical Justifications; the Bias-Variance Decomposition Statistical Justifications; the Bias-Variance Decomposition 65 12 Statistical Justifications; the Bias-Variance Decomposition STATISTICAL JUSTIFICATIONS FOR REGRESSION [So far, I ve talked about regression

More information

Quadratic Equations Part I

Quadratic Equations Part I Quadratic Equations Part I Before proceeding with this section we should note that the topic of solving quadratic equations will be covered in two sections. This is done for the benefit of those viewing

More information

Regression with Nonlinear Transformations

Regression with Nonlinear Transformations Regression with Nonlinear Transformations Joel S Steele Portland State University Abstract Gaussian Likelihood When data are drawn from a Normal distribution, N (µ, σ 2 ), we can use the Gaussian distribution

More information

ECONOMETRICS HONOR S EXAM REVIEW SESSION

ECONOMETRICS HONOR S EXAM REVIEW SESSION ECONOMETRICS HONOR S EXAM REVIEW SESSION Eunice Han ehan@fas.harvard.edu March 26 th, 2013 Harvard University Information 2 Exam: April 3 rd 3-6pm @ Emerson 105 Bring a calculator and extra pens. Notes

More information

Chapter 19: Logistic regression

Chapter 19: Logistic regression Chapter 19: Logistic regression Self-test answers SELF-TEST Rerun this analysis using a stepwise method (Forward: LR) entry method of analysis. The main analysis To open the main Logistic Regression dialog

More information

Lecture 3: Multiple Regression

Lecture 3: Multiple Regression Lecture 3: Multiple Regression R.G. Pierse 1 The General Linear Model Suppose that we have k explanatory variables Y i = β 1 + β X i + β 3 X 3i + + β k X ki + u i, i = 1,, n (1.1) or Y i = β j X ji + u

More information

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson Lecture 10: Alternatives to OLS with limited dependent variables PEA vs APE Logit/Probit Poisson PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

More information

POL 681 Lecture Notes: Statistical Interactions

POL 681 Lecture Notes: Statistical Interactions POL 681 Lecture Notes: Statistical Interactions 1 Preliminaries To this point, the linear models we have considered have all been interpreted in terms of additive relationships. That is, the relationship

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Algebra. Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Algebra. Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. This document was written and copyrighted by Paul Dawkins. Use of this document and its online version is governed by the Terms and Conditions of Use located at. The online version of this document is

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

Linear Regression & Correlation

Linear Regression & Correlation Linear Regression & Correlation Jamie Monogan University of Georgia Introduction to Data Analysis Jamie Monogan (UGA) Linear Regression & Correlation POLS 7012 1 / 25 Objectives By the end of these meetings,

More information

Business Economics BUSINESS ECONOMICS. PAPER No. : 8, FUNDAMENTALS OF ECONOMETRICS MODULE No. : 3, GAUSS MARKOV THEOREM

Business Economics BUSINESS ECONOMICS. PAPER No. : 8, FUNDAMENTALS OF ECONOMETRICS MODULE No. : 3, GAUSS MARKOV THEOREM Subject Business Economics Paper No and Title Module No and Title Module Tag 8, Fundamentals of Econometrics 3, The gauss Markov theorem BSE_P8_M3 1 TABLE OF CONTENTS 1. INTRODUCTION 2. ASSUMPTIONS OF

More information

EC4051 Project and Introductory Econometrics

EC4051 Project and Introductory Econometrics EC4051 Project and Introductory Econometrics Dudley Cooke Trinity College Dublin Dudley Cooke (Trinity College Dublin) Intro to Econometrics 1 / 23 Project Guidelines Each student is required to undertake

More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

ACE 564 Spring Lecture 8. Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information. by Professor Scott H.

ACE 564 Spring Lecture 8. Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information. by Professor Scott H. ACE 564 Spring 2006 Lecture 8 Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information by Professor Scott H. Irwin Readings: Griffiths, Hill and Judge. "Collinear Economic Variables,

More information

1 Correlation and Inference from Regression

1 Correlation and Inference from Regression 1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is

More information

Lesson 21 Not So Dramatic Quadratics

Lesson 21 Not So Dramatic Quadratics STUDENT MANUAL ALGEBRA II / LESSON 21 Lesson 21 Not So Dramatic Quadratics Quadratic equations are probably one of the most popular types of equations that you ll see in algebra. A quadratic equation has

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006 Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Introduction to Econometrics. Heteroskedasticity

Introduction to Econometrics. Heteroskedasticity Introduction to Econometrics Introduction Heteroskedasticity When the variance of the errors changes across segments of the population, where the segments are determined by different values for the explanatory

More information

Econometric Modelling Prof. Rudra P. Pradhan Department of Management Indian Institute of Technology, Kharagpur

Econometric Modelling Prof. Rudra P. Pradhan Department of Management Indian Institute of Technology, Kharagpur Econometric Modelling Prof. Rudra P. Pradhan Department of Management Indian Institute of Technology, Kharagpur Module No. # 01 Lecture No. # 28 LOGIT and PROBIT Model Good afternoon, this is doctor Pradhan

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

Sociology 593 Exam 2 Answer Key March 28, 2002

Sociology 593 Exam 2 Answer Key March 28, 2002 Sociology 59 Exam Answer Key March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably

More information

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapte The McGraw-Hill Companies, Inc. All rights reserved. 12er12 Chapte Bivariate i Regression (Part 1) Bivariate Regression Visual Displays Begin the analysis of bivariate data (i.e., two variables) with a scatter plot. A scatter plot - displays each observed

More information

Correlation & Simple Regression

Correlation & Simple Regression Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.

More information

Correlation and Regression

Correlation and Regression Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math.

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math. Regression, part II I. What does it all mean? A) Notice that so far all we ve done is math. 1) One can calculate the Least Squares Regression Line for anything, regardless of any assumptions. 2) But, if

More information

MATH CRASH COURSE GRA6020 SPRING 2012

MATH CRASH COURSE GRA6020 SPRING 2012 MATH CRASH COURSE GRA6020 SPRING 2012 STEFFEN GRØNNEBERG Contents 1. Basic stuff concerning equations and functions 2 2. Sums, with the Greek letter Sigma (Σ) 3 2.1. Why sums are so important to us 3 2.2.

More information

Ordinary Least Squares Regression Explained: Vartanian

Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Explained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information

ISQS 5349 Final Exam, Spring 2017.

ISQS 5349 Final Exam, Spring 2017. ISQS 5349 Final Exam, Spring 7. Instructions: Put all answers on paper other than this exam. If you do not have paper, some will be provided to you. The exam is OPEN BOOKS, OPEN NOTES, but NO ELECTRONIC

More information

Discrete Dependent Variable Models

Discrete Dependent Variable Models Discrete Dependent Variable Models James J. Heckman University of Chicago This draft, April 10, 2006 Here s the general approach of this lecture: Economic model Decision rule (e.g. utility maximization)

More information

ECON 497 Midterm Spring

ECON 497 Midterm Spring ECON 497 Midterm Spring 2009 1 ECON 497: Economic Research and Forecasting Name: Spring 2009 Bellas Midterm You have three hours and twenty minutes to complete this exam. Answer all questions and explain

More information

Algebra & Trig Review

Algebra & Trig Review Algebra & Trig Review 1 Algebra & Trig Review This review was originally written for my Calculus I class, but it should be accessible to anyone needing a review in some basic algebra and trig topics. The

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

Do not copy, post, or distribute

Do not copy, post, or distribute 14 CORRELATION ANALYSIS AND LINEAR REGRESSION Assessing the Covariability of Two Quantitative Properties 14.0 LEARNING OBJECTIVES In this chapter, we discuss two related techniques for assessing a possible

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Suggested Review Problems from Pindyck & Rubinfeld Original prepared by Professor Suzanne Cooper John F. Kennedy School of Government, Harvard

More information

Generalized Linear Models

Generalized Linear Models York SPIDA John Fox Notes Generalized Linear Models Copyright 2010 by John Fox Generalized Linear Models 1 1. Topics I The structure of generalized linear models I Poisson and other generalized linear

More information

LECTURE 15: SIMPLE LINEAR REGRESSION I

LECTURE 15: SIMPLE LINEAR REGRESSION I David Youngberg BSAD 20 Montgomery College LECTURE 5: SIMPLE LINEAR REGRESSION I I. From Correlation to Regression a. Recall last class when we discussed two basic types of correlation (positive and negative).

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Generalized Models: Part 1

Generalized Models: Part 1 Generalized Models: Part 1 Topics: Introduction to generalized models Introduction to maximum likelihood estimation Models for binary outcomes Models for proportion outcomes Models for categorical outcomes

More information

Gov 2000: 9. Regression with Two Independent Variables

Gov 2000: 9. Regression with Two Independent Variables Gov 2000: 9. Regression with Two Independent Variables Matthew Blackwell Fall 2016 1 / 62 1. Why Add Variables to a Regression? 2. Adding a Binary Covariate 3. Adding a Continuous Covariate 4. OLS Mechanics

More information

Overview. Overview. Overview. Specific Examples. General Examples. Bivariate Regression & Correlation

Overview. Overview. Overview. Specific Examples. General Examples. Bivariate Regression & Correlation Bivariate Regression & Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient Bivariate Linear Regression Line SPSS Output Interpretation Covariance ou already

More information

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a Chapter 9 Regression with a Binary Dependent Variable Multiple Choice ) The binary dependent variable model is an example of a a. regression model, which has as a regressor, among others, a binary variable.

More information

Heteroskedasticity. y i = β 0 + β 1 x 1i + β 2 x 2i β k x ki + e i. where E(e i. ) σ 2, non-constant variance.

Heteroskedasticity. y i = β 0 + β 1 x 1i + β 2 x 2i β k x ki + e i. where E(e i. ) σ 2, non-constant variance. Heteroskedasticity y i = β + β x i + β x i +... + β k x ki + e i where E(e i ) σ, non-constant variance. Common problem with samples over individuals. ê i e ˆi x k x k AREC-ECON 535 Lec F Suppose y i =

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 39 Regression Analysis Hello and welcome to the course on Biostatistics

More information

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An

More information

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u Interval estimation and hypothesis tests So far our focus has been on estimation of the parameter vector β in the linear model y i = β 1 x 1i + β 2 x 2i +... + β K x Ki + u i = x iβ + u i for i = 1, 2,...,

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation AP Statistics Chapter 6 Scatterplots, Association, and Correlation Objectives: Scatterplots Association Outliers Response Variable Explanatory Variable Correlation Correlation Coefficient Lurking Variables

More information

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018 Statistics Boot Camp Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018 March 21, 2018 Outline of boot camp Summarizing and simplifying data Point and interval estimation Foundations of statistical

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s

More information

Linear Regression 9/23/17. Simple linear regression. Advertising sales: Variance changes based on # of TVs. Advertising sales: Normal error?

Linear Regression 9/23/17. Simple linear regression. Advertising sales: Variance changes based on # of TVs. Advertising sales: Normal error? Simple linear regression Linear Regression Nicole Beckage y " = β % + β ' x " + ε so y* " = β+ % + β+ ' x " Method to assess and evaluate the correlation between two (continuous) variables. The slope of

More information

Review of Multiple Regression

Review of Multiple Regression Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Exploratory Factor Analysis and Principal Component Analysis

Exploratory Factor Analysis and Principal Component Analysis Exploratory Factor Analysis and Principal Component Analysis Today s Topics: What are EFA and PCA for? Planning a factor analytic study Analysis steps: Extraction methods How many factors Rotation and

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

A Re-Introduction to General Linear Models

A Re-Introduction to General Linear Models A Re-Introduction to General Linear Models Today s Class: Big picture overview Why we are using restricted maximum likelihood within MIXED instead of least squares within GLM Linear model interpretation

More information

Multiple Linear Regression

Multiple Linear Regression Andrew Lonardelli December 20, 2013 Multiple Linear Regression 1 Table Of Contents Introduction: p.3 Multiple Linear Regression Model: p.3 Least Squares Estimation of the Parameters: p.4-5 The matrix approach

More information

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35 Rewrap ECON 4135 November 18, 2011 () Rewrap ECON 4135 November 18, 2011 1 / 35 What should you now know? 1 What is econometrics? 2 Fundamental regression analysis 1 Bivariate regression 2 Multivariate

More information