Chapter 15: Other Regression Statistics and Pitfalls


Chapter 15 Outline

Two-Tailed Confidence Intervals
o Confidence Interval Approach: Which Theories Are Consistent with the Data?
o A Confidence Interval Example: Television Growth Rates
o Calculating Confidence Intervals with Statistical Software

Coefficient of Determination, R-Squared (R²)

Pitfalls
o Explanatory Variable Has the Same Value for All Observations
o One Explanatory Variable Is a Linear Combination of Other Explanatory Variables
o Dependent Variable Is a Linear Combination of Explanatory Variables
o Outlier Observations
o Dummy Variable Trap

Chapter 15 Prep Questions

1. A friend believes that the internet is displacing the television as a source of news and entertainment. The friend theorizes that, after accounting for other factors, television usage is falling by 1 percent annually:

−1.0 Percent Growth Rate Theory: After accounting for all other factors, the annual growth rate of television users is negative 1.0 percent.

Recall the model we used previously to explain television use:

LogUsersTV_t = β_Const + β_Year Year_t + β_CapHum CapitalHuman_t + β_CapPhy CapitalPhysical_t + β_GDP GdpPC_t + β_Auth Auth_t + e_t

and the data we used:

Internet and TV Data: Panel data of Internet, TV, economic, and political statistics for 208 countries from 1995 to 2002. [Link to MIT-InternetFlat.wf1 goes here.]

LogUsersInternet_t   Logarithm of Internet users per 1,000 people for observation t
LogUsersTV_t   Logarithm of television users per 1,000 people for observation t
Year_t   Year for observation t

CapitalHuman_t   Literacy rate for observation t (percent of the population 15 and over)
CapitalPhysical_t   Telephone mainlines per 10,000 people for observation t
GdpPC_t   Per capita real GDP in nation t (1,000's of international dollars)
Auth_t   The Freedom House measure of political authoritarianism for observation t, normalized to a 0 to 10 scale: 0 represents the most democratic rating and 10 the most authoritarian. During the period, Canada and the U.S. had a 0 rating; Iraq and the Democratic People's Republic of Korea (North Korea) rated 10.

Now, assess your friend's theory.

a. Use the ordinary least squares (OLS) estimation procedure to estimate the model's parameters. [Link to MIT-InternetFlat.wf1 goes here.]
b. Formulate the appropriate null and alternative hypotheses. Is a one-tailed or a two-tailed test appropriate?
c. Use the Econometrics Lab to calculate the Prob[Results IF H0 True]. [Link to MIT-TTest 0.1 goes here.]

2. A regression's coefficient of determination, called the R-squared, is referred to as the "goodness of fit." It equals the portion of the dependent variable's squared deviations from its mean that is explained by the parameter estimates:

R² = Explained Squared Deviations from the Mean / Actual Squared Deviations from the Mean
   = [ Σ_{t=1}^{T} (Esty_t − ȳ)² ] / [ Σ_{t=1}^{T} (y_t − ȳ)² ]

Calculate the R-squared for Professor Lord's first quiz by filling in the following blanks (Esty_t equals b_Const + b_x x_t):

Student   x_t   y_t   Actual Deviation from Mean, y_t − ȳ   Actual Squared Deviation, (y_t − ȳ)²   Esty_t   Explained Deviation from Mean, Esty_t − ȳ   Explained Squared Deviation, (Esty_t − ȳ)²
1         ___   ___   ___                                    ___                                     ___      ___                                         ___
2         ___   ___   ___                                    ___                                     ___      ___                                         ___
3         ___   ___   ___                                    ___                                     ___      ___                                         ___

Σ y_t = ___     ȳ = ___ / 3 = ___     Σ (y_t − ȳ)² = ___     Σ (Esty_t − ȳ)² = ___

R-Squared = Σ (Esty_t − ȳ)² / Σ (y_t − ȳ)² = ___ / ___ = ___

3. Students frequently experience difficulties when analyzing data. To illustrate some of these we first review the goal of multiple regression analysis:

Goal of Multiple Regression Analysis: Multiple regression analysis attempts to sort out the individual effect of each explanatory variable. An explanatory variable's coefficient estimate allows us to estimate the change in the dependent variable resulting from a change in that particular explanatory variable while all other explanatory variables remain constant.

Reconsider our baseball data:

Baseball Data: Panel data of baseball statistics for the 588 American League games played during the summer of 1996.

Attendance_t   Paid attendance for game t
DH_t   Designated hitter for game t (1 if DH permitted; 0 otherwise)
HomeSalary_t   Player salaries of the home team for game t (millions of dollars)
PriceTicket_t   Average price of tickets sold for game t's home team (dollars)
VisitSalary_t   Player salaries of the visiting team for game t (millions of dollars)

Now, consider several pitfalls that students often encounter:

a. Explanatory variable has the same value for all observations. Run the following regression: [Link to MIT-ALSummer-1996.wf1 goes here.]

Dependent variable: Attendance
Explanatory variables: PriceTicket, HomeSalary, and DH

1) What happens?
2) What is the value of DH_t for each of the observations?
3) Why is it impossible to determine the effect of an explanatory variable if the explanatory variable has the same value for each observation? Explain.

b. One explanatory variable is a linear combination of other explanatory variables. Generate a new variable, the ticket price in terms of cents:

PriceCents = 100 × PriceTicket

Run the following regression: [Link to MIT-ALSummer-1996.wf1 goes here.]

Dependent variable: Attendance
Explanatory variables: PriceTicket, PriceCents, and HomeSalary

1) What happens?
2) Is it possible to sort out the effect of two explanatory variables when they contain redundant information?

c. One explanatory variable is a linear combination of other explanatory variables: another example. Generate a new variable, the total salaries of the two teams playing:

TotalSalary = HomeSalary + VisitSalary

Run the following regression: [Link to MIT-ALSummer-1996.wf1 goes here.]

Dependent variable: Attendance
Explanatory variables: PriceTicket, HomeSalary, VisitSalary, and TotalSalary

1) What happens?
2) Is it possible to sort out the effect of explanatory variables when they are linear combinations of each other and therefore contain redundant information?

d. Dependent variable is a linear combination of explanatory variables. Run the following regression: [Link to MIT-ALSummer-1996.wf1 goes here.]

Dependent variable: TotalSalary
Explanatory variables: HomeSalary and VisitSalary

What happens?

e. Outlier observations. First, run the following regression: [Link to MIT-ALSummer-1996.wf1 goes here.]

Dependent variable: Attendance
Explanatory variables: PriceTicket and HomeSalary

1) What is the coefficient estimate for the ticket price?
2) Look at the first observation. What is the value of HomeSalary for the first observation?

Now, access a second workfile in which a single value was entered incorrectly. [Link to MIT-ALSummerOutlier-1996.wf1 goes here.]

3) Look at the first observation. What is the value of HomeSalary for the first observation? Was the value entered correctly?

Run the following regression:

Dependent variable: Attendance
Explanatory variables: PriceTicket and HomeSalary

4) Compare the coefficient estimates in the two regressions.

4. Return to our faculty salary data.

Faculty Salary Data: Artificially constructed cross section salary data and characteristics for 200 faculty members.

Salary_t   Salary of faculty member t (dollars)
Experience_t   Teaching experience for faculty member t (years)
Articles_t   Number of articles published by faculty member t
SexM1_t   1 if faculty member t is male; 0 if female

As we did in Chapter 13, generate the dummy variable SexF1, which equals 1 for a woman and 0 for a man. Run the following three regressions specifying Salary as the dependent variable: [Link to MIT-FacultySalaries.wf1 goes here.]

a. Explanatory variables: SexF1 and Experience
b. Explanatory variables: SexM1 and Experience
c. Explanatory variables: SexF1, SexM1, and Experience, but without a constant

Getting Started in EViews: To estimate the third model (part c) using EViews, you must "fool" EViews into running the appropriate regression:

In the Workfile window: highlight Salary and then, while depressing <Ctrl>, highlight SexF1, SexM1, and Experience.
In the Workfile window: double click on a highlighted variable.
Click Open Equation.
In the Equation Specification window, delete c so that the window looks like this: salary sexf1 sexm1 experience.
Click OK.

For each regression, what is the equation that estimates the salary for
1) men?

2) women?

Last, run one more regression specifying Salary as the dependent variable:

d. Explanatory variables: SexF1, SexM1, and Experience, but with a constant. What happens?

5. Consider a system of 2 linear equations and 3 unknowns. Can you solve for all three unknowns?

Two-Tailed Confidence Intervals: Which Theories Are Consistent with the Data?

Our approach thus far has been to present a theory first and then use data to assess the theory:

First, we presented a theory.
Second, we analyzed the data to determine whether or not the data were consistent with the theory.

In other words, we have started with a theory and then decided whether or not the data were consistent with the theory. The confidence interval approach reverses this process. Confidence intervals indicate the range of theories that are consistent with the data:

First, we analyze the data.
Second, we consider various theories and determine which theories are consistent with the data and which are not.

In other words, the confidence interval approach starts with the data and then decides which theories are compatible.

Hypothesis testing plays a key role in both approaches. Consequently, we must choose a significance level. A confidence interval's size determines the significance level. We use significance levels to distinguish between a small probability and a large probability. The significance level associated with a confidence interval equals 100 percent less the size of the two-tailed confidence interval. Three commonly used confidence interval sizes are 90, 95, and 99 percent:

For a 90 percent confidence interval, the significance level is 10 percent.
For a 95 percent confidence interval, the significance level is 5 percent.
For a 99 percent confidence interval, the significance level is 1 percent.

A theory is consistent with the data if we cannot reject the null hypothesis at the confidence interval's significance level.

A Confidence Interval Example: Television Growth Rates

No doubt this sounds confusing, so let us work through an example using our international television data:

Project: Which growth theories are consistent with the international television data?

Internet and TV Data: Panel data of Internet, TV, economic, and political statistics for 208 countries from 1995 to 2002. [Link to MIT-InternetFlat.wf1 goes here.]

LogUsersInternet_t   Logarithm of Internet users per 1,000 people for observation t
LogUsersTV_t   Logarithm of television users per 1,000 people for observation t
Year_t   Year for observation t
CapitalHuman_t   Literacy rate for observation t (percent of the population 15 and over)
CapitalPhysical_t   Telephone mainlines per 10,000 people for observation t
GdpPC_t   Per capita real GDP in nation t (1,000's of international dollars)
Auth_t   The Freedom House measure of political authoritarianism for observation t, normalized to a 0 to 10 scale: 0 represents the most democratic rating and 10 the most authoritarian. During the period, Canada and the U.S. had a 0 rating; Iraq and the Democratic People's Republic of Korea (North Korea) rated 10.

We begin by specifying the size of the confidence interval. It is most common to specify a 95 percent confidence interval. This means that we are choosing a significance level of 5 percent. The following two steps formalize the procedure for deciding whether a theory lies within the two-tailed 95 percent confidence interval:

Step 1: Analyze the data. Use the ordinary least squares (OLS) estimation procedure to estimate the model's parameters.

Step 2: Consider a specific theory. Is the theory consistent with the data? Does the theory lie within the confidence interval?
o Step 2a: Based on the theory, construct the null and alternative hypotheses. The null hypothesis reflects the theory.
o Step 2b: Compute Prob[Results IF H0 True].
o Step 2c: Do we reject the null hypothesis?
  Yes: Reject the theory. The data are not consistent with the theory. The theory does not lie within the confidence interval.
  No: The data are consistent with the theory. The theory does lie within the confidence interval.

Since our example uses a 95 percent confidence interval and hence a 5 percent significance level:

Prob[Results IF H0 True] < .05: Reject H0. The theory is not consistent with the data; the theory does not lie within the 95 percent confidence interval.
Prob[Results IF H0 True] > .05: Do not reject H0. The theory is consistent with the data; the theory does lie within the 95 percent confidence interval.

We shall illustrate the steps by focusing on four growth rate theories postulating what the growth rate of television use equals after accounting for other relevant factors:

0.0 Percent Growth Rate Theory
−1.0 Percent Growth Rate Theory
4.0 Percent Growth Rate Theory
6.0 Percent Growth Rate Theory

0.0 Percent Growth Rate Theory

Since television is a mature technology, we begin with a theory postulating that time will have no impact on television use after accounting for other factors; that is, after accounting for other factors, the growth rate of television use will equal 0.0 percent. We shall now apply our two steps to determine whether the 0.0 percent growth rate theory lies within the 95 percent confidence interval:

Step 1: Analyze the data. Use the ordinary least squares (OLS) estimation procedure to estimate the model's parameters. We shall apply the same model to explain television use that we used previously:

Model: LogUsersTV_t = β_Const + β_Year Year_t + β_CapHum CapitalHuman_t + β_CapPhy CapitalPhysical_t + β_GDP GdpPC_t + β_Auth Auth_t + e_t

We already estimated the parameters of this model in Chapter 13:

Ordinary Least Squares (OLS)
Dependent Variable: LogUsersTV
Explanatory Variable(s):   Estimate   SE      t-Statistic   Prob
Year                       .023       .0159
CapitalHuman
CapitalPhysical
GdpPC
Auth
Const
Number of Observations     742

Estimated Equation: EstLogUsersTV = … + .023Year + …CapitalHuman + .00…CapitalPhysical + .058GdpPC + .064Auth

Table 15.1: Television Regression Results

Step 2: 0.0 Percent Growth Rate Theory. Focus on the effect of time. Is the 0.0 percent growth theory consistent with the data? Does the theory lie within the confidence interval?

0.0 Percent Growth Rate Theory: After accounting for all other explanatory variables, time has no effect on television use; that is, after accounting for all other explanatory variables, the annual growth rate of television use equals 0.0 percent. Accordingly, the actual coefficient of Year, β_Year, equals .000.

o Step 2a: Based on the theory, construct the null and alternative hypotheses.
  H0: β_Year = .000
  H1: β_Year ≠ .000

o Step 2b: Compute Prob[Results IF H0 True].

Prob[Results IF H0 True] = Probability that the coefficient estimate would be at least .023 from .000, if H0 were true (that is, if the actual coefficient, β_Year, equals .000).

Since the OLS estimation procedure is unbiased and H0 is true, Mean[b_Year] = β_Year = 0. The standard error is SE[b_Year] = .0159. The degrees of freedom equal the number of observations less the number of parameters: DF = 742 − 6 = 736.

We can use the Econometrics Lab to calculate the probability of obtaining the results if the null hypothesis is true. Remember that we are conducting a two-tailed test.

Econometrics Lab 15.1: Calculate Prob[Results IF H0 True].

First, calculate the right hand tail probability. [Link to MIT-Lab 15.1a goes here.]

Question: What is the probability that the estimate lies at or above .023?
Answer: .074.

[Figure 15.1: Probability Distribution of the Coefficient Estimate, 0.0 Percent Growth Rate Theory. Student t-distribution for b_Year with Mean = .000, SE = .0159, DF = 736.]

Second, calculate the left hand tail probability. [Link to MIT-Lab 15.1b goes here.]

Question: What is the probability that the estimate lies at or below −.023?
Answer: .074.

The Prob[Results IF H0 True] equals the sum of the right and left tail probabilities:

Prob[Results IF H0 True] = Left Tail + Right Tail = .074 + .074 = .148

o Step 2c: Do we reject the null hypothesis? No, we do not reject the null hypothesis at a 5 percent significance level; Prob[Results IF H0 True] equals .148, which is greater than .05. The theory is consistent with the data; hence, .000 does lie within the 95 percent confidence interval.

Let us now apply the procedure to three other theories:

−1.0 Percent Growth Rate Theory: After accounting for all other factors, the annual growth rate of television users is negative 1.0 percent; that is, β_Year equals −.010.
4.0 Percent Growth Rate Theory: After accounting for all other factors, the annual growth rate of television users is 4.0 percent; that is, β_Year equals .040.
6.0 Percent Growth Rate Theory: After accounting for all other factors, the annual growth rate of television users is 6.0 percent; that is, β_Year equals .060.
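The tail probabilities can also be reproduced outside the Econometrics Lab. The sketch below is a minimal illustration using SciPy's Student t distribution; the only inputs are the figures from the regression above (.023, .0159, and 736 degrees of freedom), and changing null_value to −.010, .040, or .060 reproduces the calculations for the other three theories.

```python
from scipy import stats

estimate = 0.023    # coefficient estimate for Year, b_Year
null_value = 0.000  # value of beta_Year under H0 (0.0 percent growth rate theory)
se = 0.0159         # standard error of b_Year
df = 736            # degrees of freedom: 742 observations - 6 parameters

# How many standard errors the estimate lies from the value specified by H0
t_stat = abs(estimate - null_value) / se

# Two-tailed probability: right tail plus the symmetric left tail
right_tail = stats.t.sf(t_stat, df)    # probability of an estimate at least this far above
left_tail = stats.t.cdf(-t_stat, df)   # probability of an estimate at least this far below
prob_results_if_h0_true = right_tail + left_tail

print(round(right_tail, 3), round(left_tail, 3), round(prob_results_if_h0_true, 3))
# roughly 0.074, 0.074, and 0.148, matching the Econometrics Lab calculation
```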

We shall not provide justification for any of these theories. The confidence interval approach does not worry about justifying the theory. The approach is pragmatic; it simply asks whether or not the data support the theory.

−1.0 Percent Growth Rate Theory

Step 1: Analyze the data. Use the ordinary least squares (OLS) estimation procedure to estimate the model's parameters. We have already done this.

Step 2: −1.0 Percent Growth Rate Theory. Is the theory consistent with the data? Does the theory lie within the confidence interval?

o Step 2a: Based on the theory, construct the null and alternative hypotheses.
  H0: β_Year = −.010
  H1: β_Year ≠ −.010

o Step 2b: Compute Prob[Results IF H0 True]. To compute Prob[Results IF H0 True] we first pose a question:

Question: How far is the coefficient estimate, .023, from the value of the coefficient specified by the null hypothesis, −.010?
Answer: .033.

Accordingly,

Prob[Results IF H0 True] = Probability that the coefficient estimate would be at least .033 from −.010, if H0 were true (that is, if the actual coefficient, β_Year, equals −.010).

Since the OLS estimation procedure is unbiased and H0 is true, Mean[b_Year] = β_Year = −.010. The standard error is SE[b_Year] = .0159. The degrees of freedom: DF = 742 − 6 = 736.

We can use the Econometrics Lab to calculate the probability of obtaining the results if the null hypothesis is true. Once again, remember that we are conducting a two-tailed test:

Econometrics Lab 15.2: Calculate Prob[Results IF H0 True].

First, calculate the right hand tail probability. [Link to MIT-Lab 15.2a goes here.]

[Figure 15.2: Probability Distribution of the Coefficient Estimate, −1.0 Percent Growth Rate Theory. Student t-distribution for b_Year with Mean = −.010, SE = .0159, DF = 736.]

Question: What is the probability that the estimate lies .033 or more above −.010, at or above .023?
Answer: .019.

Second, calculate the left hand tail probability. [Link to MIT-Lab 15.2b goes here.]

Question: What is the probability that the estimate lies .033 or more below −.010, at or below −.043?
Answer: .019.

The Prob[Results IF H0 True] equals the sum of the two probabilities:

Prob[Results IF H0 True] = Left Tail + Right Tail = .019 + .019 = .038

o Step 2c: Do we reject the null hypothesis? Yes, we do reject the null hypothesis at a 5 percent significance level; Prob[Results IF H0 True] equals .038, which is less than .05.

The theory is not consistent with the data; hence −.010 does not lie within the 95 percent confidence interval.

4.0 Percent Growth Rate Theory

Following the same procedure for the 4.0 percent growth rate theory:

Prob[Results IF H0 True] ≈ .285

We do not reject the null hypothesis at a 5 percent significance level. The theory is consistent with the data; hence .040 does lie within the 95 percent confidence interval.

6.0 Percent Growth Rate Theory

Again, following the same procedure for the 6.0 percent growth rate theory:

Prob[Results IF H0 True] ≈ .020

We do reject the null hypothesis at a 5 percent significance level. The theory is not consistent with the data; hence .060 does not lie within the 95 percent confidence interval.

Now, let us summarize the four theories:

[Figure 15.3: Probability Distributions of the Coefficient Estimate, Comparison of the Growth Rate Theories.]

Growth Rate Theory   Null and Alternative Hypotheses          Prob[Results IF H0 True]   Within Confidence Interval
−1%                  H0: β_Year = −.010   H1: β_Year ≠ −.010   .038                       No
0%                   H0: β_Year = .000    H1: β_Year ≠ .000    .148                       Yes
4%                   H0: β_Year = .040    H1: β_Year ≠ .040    .285                       Yes
6%                   H0: β_Year = .060    H1: β_Year ≠ .060    .020                       No

Table 15.2: Growth Rate Theories and the 95 Percent Confidence Interval

Now, we shall make two observations and pose two questions:

The 0.0 percent growth rate theory lies within the confidence interval, but the −1.0 percent theory does not.
Question: What is the lowest growth rate theory that is consistent with the data; that is, what is the lower bound of the confidence interval, β_LB?

The 4.0 percent growth rate theory lies within the confidence interval, but the 6.0 percent theory does not.
Question: What is the highest growth rate theory that is consistent with the data; that is, what is the upper bound of the confidence interval, β_UB?

[Figure 15.4: Lower and Upper Confidence Interval Bounds. The horizontal axis plots the growth rate theories (−1.0%, 0.0%, 4.0%, 6.0%); the vertical axis plots Prob[Results IF H0 True]. Theories between the lower bound β_LB and the upper bound β_UB lie within the 95 percent confidence interval (do not reject H0); theories below β_LB or above β_UB are rejected. Significance level = 5% = .05.]

Figure 15.5 answers these questions visually by illustrating the lower and upper bounds. The Prob[Results IF H0 True] equals .05 for both the lower and upper bound growth theories because our calculations are based on a 95 percent confidence interval:

The lower bound growth theory postulates a growth rate that is less than that estimated. Hence, the coefficient estimate, .023, marks the right tail border of the lower bound.
The upper bound growth theory postulates a growth rate that is greater than that estimated. Hence, the coefficient estimate, .023, marks the left tail border of the upper bound.

[Figure 15.5: Probability Distributions of the Coefficient Estimate, Lower and Upper Confidence Interval Bounds.]

Econometrics Lab 15.3: Calculating the 95 Percent Confidence Interval.

We can use the Econometrics Lab to calculate the lower and upper bounds:

Calculating the Lower Bound, β_LB: For the lower bound, the right tail probability equals .025. [Link to MIT-Lab 15.3a goes here.] The appropriate information is already entered for us:

Standard Error: .0159
Value: .023
Degrees of Freedom: 736
Area to Right: .025

Click Calculate. The reported Mean is the lower bound.

Mean: −.008

β_LB = −.008

Calculating the Upper Bound, β_UB: For the upper bound, the left tail probability equals .025. Accordingly, the right tail probability will equal .975. [Link to MIT-Lab 15.3b goes here.] The appropriate information is already entered for us:

Standard Error: .0159
Value: .023
Degrees of Freedom: 736
Area to Right: .975

Click Calculate. The reported Mean is the upper bound.

Mean: .054

β_UB = .054

−.008 and .054 mark the bounds of the two-tailed 95 percent confidence interval:

For any growth rate theory between −.8 percent and 5.4 percent: Prob[Results IF H0 True] > .05. Do not reject H0 at the 5 percent significance level.
For any growth rate theory below −.8 percent or above 5.4 percent: Prob[Results IF H0 True] < .05. Reject H0 at the 5 percent significance level.
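The two bounds follow directly from the estimate, its standard error, and the degrees of freedom. A minimal sketch using SciPy (the inputs are the same .023, .0159, and 736 used in the Econometrics Lab) is shown below; the critical t value cuts off .025 in each tail.

```python
from scipy import stats

estimate = 0.023  # b_Year
se = 0.0159       # SE[b_Year]
df = 736          # degrees of freedom

# For a 95 percent two-tailed interval each tail contains .025, so we need
# the t value whose right tail probability is .025 (left tail area .975).
t_crit = stats.t.ppf(0.975, df)

lower_bound = estimate - t_crit * se
upper_bound = estimate + t_crit * se

print(round(lower_bound, 3), round(upper_bound, 3))
# roughly -0.008 and 0.054, the bounds reported by the Econometrics Lab
```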

Calculating Confidence Intervals with Statistical Software

Fortunately, statistical software provides us with an easy and convenient way to compute confidence intervals. The software does all the work for us.

Getting Started in EViews: After running the appropriate regression:

In the Equation window: Click View, Coefficient Diagnostics, and Confidence Intervals.
In the Confidence Intervals window: Enter the confidence levels you wish to compute. (By default the values .90, .95, and .99 are entered.)
Click OK.

95 Percent Interval Estimates
Dependent Variable: LogUsersTV
Explanatory Variable(s):   Estimate   Lower   Upper
Year                       .023       −.008   .054
CapitalHuman
CapitalPhysical
GdpPC
Auth
Const
Number of Observations     742

Table 15.3: 95 Percent Confidence Interval Calculations

Table 15.3 reports that the lower and upper bounds of the 95 percent confidence interval for the Year coefficient are −.008 and .054. These are the same values that we calculated using the Econometrics Lab.
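Other statistical packages report the same information. The sketch below is a hypothetical illustration: the variable names mirror the television model, but the data are synthetic stand-ins generated inside the script (not the MIT workfile), so the printed numbers will not match Table 15.3. It simply shows that statsmodels' conf_int method returns the lower and upper bounds for every coefficient at once.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the television workfile (illustration only)
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "Year": rng.integers(1995, 2003, n),
    "CapitalHuman": rng.uniform(40, 100, n),
    "CapitalPhysical": rng.uniform(0, 600, n),
    "GdpPC": rng.uniform(0.5, 40, n),
    "Auth": rng.uniform(0, 10, n),
})
data["LogUsersTV"] = (0.02 * data["Year"] + 0.01 * data["CapitalHuman"]
                      + 0.002 * data["CapitalPhysical"] + 0.05 * data["GdpPC"]
                      - 0.03 * data["Auth"] + rng.normal(0, 1, n))

results = smf.ols("LogUsersTV ~ Year + CapitalHuman + CapitalPhysical"
                  " + GdpPC + Auth", data=data).fit()

# Each row reports the lower and upper bound of the 95 percent confidence
# interval for the corresponding coefficient.
print(results.conf_int(alpha=0.05))
```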

Coefficient of Determination (Goodness of Fit), R-Squared (R²)

All statistical packages report the coefficient of determination, the R-squared, in their regression printouts. The R-squared seeks to capture the "goodness of fit." It equals the portion of the dependent variable's squared deviations from its mean that is explained by the parameter estimates:

R² = Explained Squared Deviations from the Mean / Actual Squared Deviations from the Mean
   = [ Σ_{t=1}^{T} (Esty_t − ȳ)² ] / [ Σ_{t=1}^{T} (y_t − ȳ)² ]

To explain how the coefficient of determination is calculated, we shall revisit Professor Lord's first quiz:

Student   Minutes Studied (x)   Quiz Score (y)
1         5                     66
2         15                    87
3         25                    90

Table 15.4: First Quiz Data

Recall the theory, the model, and our analysis:

Theory: An increase in the number of minutes studied results in an increased quiz score.

Model: y_t = β_Const + β_x x_t + e_t
  x_t = Minutes studied by student t
  y_t = Quiz score earned by student t

Theory: β_x > 0

We used the ordinary least squares (OLS) estimation procedure to estimate the model's parameters: [Link to MIT-Quiz1.wf1 goes here.]

Ordinary Least Squares (OLS)
Dependent Variable: y
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
x                          1.2                            .2601
Const                      63
Number of Observations     3
R-squared                  .84

Estimated Equation: Esty = 63 + 1.2x

Interpretation of Estimates:
b_Const = 63: Students receive 63 points for showing up.
b_x = 1.2: Students receive 1.2 additional points for each additional minute studied.

Critical Result: The coefficient estimate equals 1.2. The positive sign of the coefficient estimate suggests that additional studying increases quiz scores. This evidence lends support to our theory.

Table 15.5: First Quiz Regression Results

Next, we formulated the null and alternative hypotheses to determine how much confidence we should have in the theory:

H0: β_x = 0   Studying has no impact on a student's quiz score
H1: β_x > 0   Additional studying increases a student's quiz score

We then calculated Prob[Results IF H0 True], the probability of results like those we obtained (or even stronger) if studying in fact had no impact on quiz scores. The tails probability reported in the regression printout allows us to calculate this easily. Since a one-tailed test is appropriate, we divide the tails probability by 2:

Prob[Results IF H0 True] = .2601 / 2 ≈ .13

We cannot reject the null hypothesis that studying has no impact even at the 10 percent significance level.

The regression printout reports that the R-squared equals about .84; this means that 84 percent of the dependent variable's squared deviations from its mean are explained by the parameter estimates. Table 15.6 shows the calculations required to compute the R-squared:

Student   x_t   y_t   y_t − ȳ   (y_t − ȳ)²   Esty_t   Esty_t − ȳ   (Esty_t − ȳ)²
1         5     66    −15       225          69       −12          144
2         15    87    6         36           81       0            0
3         25    90    9         81           93       12           144

Σ y_t = 243     ȳ = 243/3 = 81     Σ (y_t − ȳ)² = 342     Σ (Esty_t − ȳ)² = 288

R-Squared = 288/342 ≈ .84

Table 15.6: R-Squared Calculations for First Quiz

The R-squared equals Σ (Esty_t − ȳ)² divided by Σ (y_t − ȳ)²:

R² = Explained Squared Deviations from the Mean / Actual Squared Deviations from the Mean = Σ (Esty_t − ȳ)² / Σ (y_t − ȳ)² = 288/342 ≈ .84

84 percent of y's squared deviations are explained by the estimated constant and coefficient. Our calculation of the R-squared agrees with the regression printout.
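The arithmetic in Table 15.6 is easy to verify. The sketch below assumes the first-quiz values as reconstructed above (x = 5, 15, 25 minutes and y = 66, 87, 90 points) and computes the explained and actual squared deviations directly; it reproduces the intercept of 63, the slope of 1.2, and an R-squared of about .84.

```python
import numpy as np

# First quiz data (assumed values, as in Table 15.4)
x = np.array([5.0, 15.0, 25.0])   # minutes studied
y = np.array([66.0, 87.0, 90.0])  # quiz scores

# Ordinary least squares estimates for y = b_Const + b_x * x
b_x = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b_const = y.mean() - b_x * x.mean()
est_y = b_const + b_x * x

explained = np.sum((est_y - y.mean()) ** 2)  # explained squared deviations from the mean
actual = np.sum((y - y.mean()) ** 2)         # actual squared deviations from the mean
r_squared = explained / actual

print(round(b_const, 2), round(b_x, 2))      # about 63 and 1.2
print(round(explained, 2), round(actual, 2)) # about 288 and 342
print(round(r_squared, 2))                   # about 0.84
```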

While the R-squared is always calculated and reported by statistical software, it is not useful in assessing theories. We shall justify this claim by considering a second quiz that Professor Lord administered. Each student studies the same number of minutes and earns the same score on the second quiz as he/she did on the first quiz:

Student   Minutes Studied (x)   Quiz Score (y)
1         5                     66
2         15                    87
3         25                    90

Table 15.7: Second Quiz Data

Before we run another regression that includes the data from both quizzes, let us apply our intuition:

Begin by focusing on only the first quiz. Taken in isolation, the first quiz suggests that studying improves quiz scores. We cannot be very confident of this, however, since we cannot reject the null hypothesis even at a 10 percent significance level.
Next, consider only the second quiz. Since the data from the second quiz are identical to the data from the first quiz, the regression results would be identical. Hence, taken in isolation, the second quiz also suggests that studying improves quiz scores.

Each quiz in isolation suggests that studying improves quiz scores. Now, consider both quizzes together. The two quizzes taken together reinforce each other; this should make us more confident in concluding that studying improves quiz scores, should it not? If our intuition is correct, how should the Prob[Results IF H0 True] be affected when we consider both quizzes together? Since we are more confident in concluding that studying improves quiz scores, the probability should be less. Let us run a regression using data from both the first and second quizzes to determine whether or not this is true: [Link to MIT-Quiz1&2.wf1 goes here.]

Ordinary Least Squares (OLS)
Dependent Variable: y
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
x                          1.2                            .0099
Const                      63
Number of Observations     6
R-squared                  .84

Table 15.8: First and Second Quiz Regression Results

Using data from both quizzes:

Prob[Results IF H0 True] = .0099 / 2 ≈ .005

As a consequence of the second quiz, the probability has fallen from .13 to .005; clearly, our confidence in the theory rises. We can now reject the null hypothesis that studying has no impact at the traditional significance levels of 1, 5, and 10 percent. Our calculations confirm our intuition.

Next, consider the R-squared for the last regression that includes both quizzes. The regression printout reports that the R-squared has not changed; the R-squared is still .84. Table 15.9 explains why:

Quiz/Student   x_t   y_t   y_t − ȳ   (y_t − ȳ)²   Esty_t   Esty_t − ȳ   (Esty_t − ȳ)²
1/1            5     66    −15       225          69       −12          144
1/2            15    87    6         36           81       0            0
1/3            25    90    9         81           93       12           144
2/1            5     66    −15       225          69       −12          144
2/2            15    87    6         36           81       0            0
2/3            25    90    9         81           93       12           144

Σ y_t = 486     ȳ = 486/6 = 81     Σ (y_t − ȳ)² = 684     Σ (Esty_t − ȳ)² = 576

R-Squared = 576/684 ≈ .84

Table 15.9: R-Squared Calculations for First and Second Quizzes

R² = Explained Squared Deviations from the Mean / Actual Squared Deviations from the Mean = Σ (Esty_t − ȳ)² / Σ (y_t − ȳ)² = 576/684 ≈ .84

The R-squared still equals .84. Both the actual and explained squared deviations have doubled; consequently, their ratio, the R-squared, remains unchanged. Clearly, the R-squared does not help us assess our theory. We are now more confident in the theory, but the value of the R-squared has not changed. The bottom line is that if we are interested in assessing our theories we should focus on hypothesis testing, not on the R-squared.

Pitfalls

Frequently, econometrics students using statistical software encounter pitfalls that are frustrating. We shall now discuss several of these pitfalls and describe the warning signs that accompany them. We begin by reviewing the goal of multiple regression analysis:

Goal of Multiple Regression Analysis: Multiple regression analysis attempts to sort out the individual effect of each explanatory variable. An explanatory variable's coefficient estimate allows us to estimate the change in the dependent variable resulting from a change in that particular explanatory variable while all other explanatory variables remain constant.

We shall consider five common pitfalls that often befall students:

Explanatory variable has the same value for all observations.
One explanatory variable is a linear combination of other explanatory variables.
Dependent variable is a linear combination of explanatory variables.
Outlier observations.
Dummy variable trap.

We shall illustrate the first four pitfalls by revisiting our baseball attendance data, which report on every game played in the American League during the summer of the 1996 season.

Project: Assess the determinants of baseball attendance.

Baseball Data: Panel data of baseball statistics for the 588 American League games played during the summer of 1996.

Attendance_t   Paid attendance for game t
DH_t   Designated hitter for game t (1 if DH permitted; 0 otherwise)
HomeSalary_t   Player salaries of the home team for game t (millions of dollars)
PriceTicket_t   Average price of tickets sold for game t's home team (dollars)
VisitSalary_t   Player salaries of the visiting team for game t (millions of dollars)

[Link to MIT-ALSummer-1996.wf1 goes here.]

We begin with a model that we have studied before in which attendance, Attendance, depends on two explanatory variables, the ticket price, PriceTicket, and the home team salary, HomeSalary:

Attendance_t = β_Const + β_Price PriceTicket_t + β_HomeSalary HomeSalary_t + e_t

Recall the regression results from Chapter 14:

Ordinary Least Squares (OLS)
Dependent Variable: Attendance
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
PriceTicket                −591
HomeSalary                 783
Const
Number of Observations     585

Estimated Equation: EstAttendance = … − 591PriceTicket + 783HomeSalary

Interpretation:
b_PriceTicket = −591. We estimate that a $1.00 increase in the price of tickets decreases attendance by 591 per game.
b_HomeSalary = 783. We estimate that a $1 million increase in the home team salary increases attendance by 783 per game.

Table 15.10: Baseball Attendance Regression Results

Explanatory Variable Has the Same Value for All Observations

One common pitfall is to include an explanatory variable in a regression that has the same value for each observation. To illustrate this, consider the variable DH:

DH_t   Designated hitter for game t (1 if DH permitted; 0 otherwise)

Our baseball data include only American League games in 1996. Since interleague play did not begin until 1997 and all American League games allowed designated hitters, the variable DH_t equals 1 for each observation. Let us try to use the ticket price, PriceTicket, the home team salary, HomeSalary, and the designated hitter dummy variable, DH, to explain attendance, Attendance: [Link to MIT-ALSummer-1996.wf1 goes here.]

The statistical software issues a diagnostic. While the verbiage differs from software package to software package, the message is the same: the software cannot perform the calculations that we requested. That is, the statistical software is telling us that it is being asked to do the impossible.

What is the intuition behind this? To determine how a dependent variable is affected by an explanatory variable, we must observe how the dependent variable changes when the explanatory variable changes. The intuition is straightforward:

If the dependent variable tends to rise when the explanatory variable rises, the explanatory variable affects the dependent variable positively, suggesting a positive coefficient.
On the other hand, if the dependent variable tends to fall when the explanatory variable rises, the explanatory variable affects the dependent variable negatively, suggesting a negative coefficient.

The evidence of how the dependent variable changes when the explanatory variable changes is essential. In our baseball example, however, there is no variation in the designated hitter explanatory variable; DH_t equals 1 for each observation. We have no way to assess the effect that the designated hitter has on attendance. We are asking our statistical software to do the impossible. While we have attendance information for games in which the designated hitter was used, we have no attendance information for games in which the designated hitter was not used. How then can we expect the software to assess the impact of the designated hitter on attendance?
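A quick way to see why the software balks: with a constant in the model, a regressor that never varies is just a multiple of the constant column, so the normal equations cannot be solved. The sketch below uses made-up numbers (not the baseball workfile) to reproduce the failure with plain linear algebra.

```python
import numpy as np

# Made-up attendance data (illustration only)
attendance = np.array([30000.0, 25000.0, 41000.0, 18000.0, 27000.0])
price = np.array([10.0, 12.0, 15.0, 8.0, 11.0])
dh = np.ones(5)  # the "explanatory" variable equals 1 for every game

# Design matrix: constant, PriceTicket, DH
X = np.column_stack([np.ones(5), price, dh])

print(np.linalg.matrix_rank(X))  # 2, not 3: the DH column duplicates the constant column

try:
    b = np.linalg.solve(X.T @ X, X.T @ attendance)  # the normal equations
except np.linalg.LinAlgError as err:
    print("cannot estimate:", err)  # singular matrix: the software's diagnostic
```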

One Explanatory Variable Is a Linear Combination of Other Explanatory Variables

We have already seen one example of this when we discussed multicollinearity in the previous chapter. We included both the ticket price in terms of dollars and the ticket price in terms of cents as explanatory variables. The ticket price in terms of cents was a linear combination of the ticket price in terms of dollars:

PriceCents = 100 × PriceTicket

Let us try to use the ticket price, PriceTicket, the home team salary, HomeSalary, and the ticket price in terms of cents, PriceCents, to explain attendance, Attendance: [Link to MIT-ALSummer-1996.wf1 goes here.]

When both measures of the price are included in the regression, our statistical software issues a diagnostic indicating that it is being asked to do the impossible. Statistical software cannot separate out the individual influence of the two explanatory variables, PriceTicket and PriceCents, because they contain precisely the same information; the two explanatory variables are redundant. We are asking the software to do the impossible.

In fact, any linear combination of explanatory variables produces this problem. To illustrate this, we consider two regressions. The first specifies three explanatory variables: the ticket price, the home team salary, and the visiting team salary. [Link to MIT-ALSummer-1996.wf1 goes here.]

Ordinary Least Squares (OLS)
Dependent Variable: Attendance
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
PriceTicket                −587
HomeSalary                 791
VisitSalary                163
Const
Number of Observations     585

Estimated Equation: EstAttendance = … − 587PriceTicket + 791HomeSalary + 163VisitSalary

Table 15.11: Baseball Attendance Regression Results

Now, generate a new variable, TotalSalary:

TotalSalary = HomeSalary + VisitSalary

TotalSalary is a linear combination of HomeSalary and VisitSalary. Let us try to use the ticket price, PriceTicket, the home team salary, HomeSalary, the visiting team salary, VisitSalary, and the total salary, TotalSalary, to explain attendance, Attendance: [Link to MIT-ALSummer-1996.wf1 goes here.]

Our statistical software will issue a diagnostic indicating that it is being asked to do the impossible. The information contained in TotalSalary is already included in HomeSalary and VisitSalary. Statistical software cannot separate out the individual influence of the three explanatory variables because they contain redundant information. We are asking the software to do the impossible.

Dependent Variable Is a Linear Combination of Explanatory Variables

Suppose that the dependent variable is a linear combination of the explanatory variables. The following regression illustrates this scenario. TotalSalary is by definition the sum of HomeSalary and VisitSalary. Total salary, TotalSalary, is the dependent variable; the home team salary, HomeSalary, and the visiting team salary, VisitSalary, are the explanatory variables: [Link to MIT-ALSummer-1996.wf1 goes here.]

Ordinary Least Squares (OLS)
Dependent Variable: TotalSalary
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
HomeSalary                 1.000
VisitSalary                1.000
Const                      0.000
Number of Observations     588

Estimated Equation: EstTotalSalary = 1.000HomeSalary + 1.000VisitSalary

Table 15.12: Total Salary Regression Results

The estimates of the constant and coefficients reveal the definition of TotalSalary:

TotalSalary = HomeSalary + VisitSalary

Furthermore, the standard errors are very small, approximately 0. In fact, they are precisely equal to 0, but they are not reported as 0's as a consequence of how digital computers process numbers. We can think of these very small standard errors as telling us that we are dealing with an identity here, something that is true by definition.
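Both of these salary pitfalls can be reproduced with a few lines of made-up data (illustration only, not the baseball workfile): when one regressor is a linear combination of others the design matrix loses rank, and when the dependent variable is a linear combination of the regressors the fit simply returns the identity.

```python
import numpy as np
import statsmodels.api as sm

# Made-up team salaries in millions of dollars (illustration only)
home_salary = np.array([20.3, 25.1, 38.9, 52.4, 44.7])
visit_salary = np.array([33.0, 61.2, 28.4, 19.8, 40.5])
total_salary = home_salary + visit_salary  # a linear combination by construction

# Pitfall 1: TotalSalary as an extra regressor adds no new information,
# so the design matrix has fewer independent columns than regressors.
X = sm.add_constant(np.column_stack([home_salary, visit_salary, total_salary]))
print(np.linalg.matrix_rank(X), "independent columns out of", X.shape[1])  # 3 out of 4

# Pitfall 2: the dependent variable is a linear combination of the regressors,
# so the regression just recovers the identity TotalSalary = HomeSalary + VisitSalary.
results = sm.OLS(total_salary, sm.add_constant(
    np.column_stack([home_salary, visit_salary]))).fit()
print(results.params)  # constant near 0, both coefficients near 1.000
```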

Outlier Observations

We should be aware of the possibility of outliers because the ordinary least squares (OLS) estimation procedure is very sensitive to them. An outlier can occur for many reasons: one observation could have a unique characteristic, or one observation could include a mundane typo. To illustrate the effect that an outlier may have, once again consider the games played in the summer of the 1996 American League season. [Link to MIT-ALSummer-1996.wf1 goes here.]

The first observation reports the game played in Milwaukee on June 1, 1996, when the Cleveland Indians visited the Milwaukee Brewers. The salary for the Brewers totaled 20.3 million dollars in 1996:

Observation   Month   Day   Home Team   Visiting Team   Home Team Salary
1             6       1     Milwaukee   Cleveland       20.3
2             …       …     Oakland     New York        …
3             …       …     Seattle     Boston          …
4             …       …     Toronto     Kansas City     …
5             …       …     Texas       Minnesota       …

Review the following regression:

Ordinary Least Squares (OLS)
Dependent Variable: Attendance
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
PriceTicket                −591
HomeSalary                 783
Const
Number of Observations     585

Estimated Equation: EstAttendance = … − 591PriceTicket + 783HomeSalary

Table 15.13: Baseball Attendance Regression with Correct Data

Now, suppose that a mistake was made in entering Milwaukee's player salary for the first observation; suppose that the decimal point was misplaced when the value was entered. All the other values were entered correctly. You can access the data including this outlier: [Link to MIT-ALSummerOutlier-1996.wf1 goes here.]

Observation   Month   Day   Home Team   Visiting Team   Home Team Salary
1             6       1     Milwaukee   Cleveland       (misplaced decimal point)
2             …       …     Oakland     New York        …
3             …       …     Seattle     Boston          …
4             …       …     Toronto     Kansas City     …
5             …       …     Texas       Minnesota       …

Ordinary Least Squares (OLS)
Dependent Variable: Attendance
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
PriceTicket
HomeSalary
Const
Number of Observations     585

Table 15.14: Baseball Attendance Regression with an Outlier

Even though only a single value has been altered, the estimates of both coefficients change dramatically. The estimate of the ticket price coefficient changes from about −591 to 1,896, and the estimate of the home salary coefficient changes from 783 to .088. This illustrates how sensitive the ordinary least squares (OLS) estimation procedure can be to an outlier. Consequently, we must take care to enter data properly and to check to be certain that we have generated any new variables correctly.
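The sensitivity is easy to reproduce with simulated numbers. The sketch below uses invented data (not the American League workfile): misplacing the decimal point in a single salary value pulls the coefficient estimate far away from the value that generated the data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
salary = rng.uniform(15, 60, 50)                          # home team salary, millions of dollars
attendance = 10_000 + 800 * salary + rng.normal(0, 3_000, 50)

def salary_coefficient(x, y):
    """OLS coefficient of y on a constant and x."""
    return sm.OLS(y, sm.add_constant(x)).fit().params[1]

print(round(salary_coefficient(salary, attendance)))      # close to the true value of 800

# Misplace one decimal point: ten times the correct salary for one observation
salary_outlier = salary.copy()
salary_outlier[0] *= 10

print(round(salary_coefficient(salary_outlier, attendance)))  # far from 800
```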

Dummy Variable Trap

To illustrate the dummy variable trap, we shall revisit our faculty salary data:

Project: Assess the possibility of discrimination in academia.

Faculty Salary Data: Artificially constructed cross section salary data and characteristics for 200 faculty members.

Salary_t   Salary of faculty member t (dollars)
Experience_t   Years of teaching experience for faculty member t
Articles_t   Number of articles published by faculty member t
SexM1_t   1 if faculty member t is male; 0 if female

We shall investigate models that include only dummy variables and years of teaching experience. More specifically, we shall consider four cases:

Model   Dependent Variable   Explanatory Variables           Constant
1       Salary               SexF1 and Experience            Yes
2       Salary               SexM1 and Experience            Yes
3       Salary               SexF1, SexM1, and Experience    No
4       Salary               SexF1, SexM1, and Experience    Yes

We begin by generating the variable SexF1 as we did in Chapter 13:

SexF1 = 1 − SexM1

Now, we shall estimate the parameters of the four models. First, Model 1.

Model 1: Salary_t = β_Const + β_SexF1 SexF1_t + β_E Experience_t + e_t

[Link to MIT-FacultySalaries.wf1 goes here.]

Ordinary Least Squares (OLS)
Dependent Variable: Salary
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
SexF1                      −2,240
Experience                 2,447
Const                      42,238
Number of Observations     200

Estimated Equation: EstSalary = 42,238 − 2,240SexF1 + 2,447Experience

Table 15.15: Faculty Salary Regression Results

Now, calculate the estimated salary equation for men and women:

For men, SexF1 = 0:
EstSalary = 42,238 − 2,240SexF1 + 2,447Experience
EstSalary_Men = 42,238 − 0 + 2,447Experience = 42,238 + 2,447Experience

The intercept for men equals $42,238; the slope equals 2,447.

For women, SexF1 = 1:
EstSalary = 42,238 − 2,240SexF1 + 2,447Experience
EstSalary_Women = 42,238 − 2,240 + 2,447Experience = 39,998 + 2,447Experience

The intercept for women equals $39,998; the slope equals 2,447.

It is easy to plot the estimated salary equations for men and women.

[Figure 15.6: Estimated Salary Equations for Men and Women. Both lines have a slope of 2,447; the men's line, EstSalary_Men = 42,238 + 2,447Experience, has an intercept of 42,238, while the women's line, EstSalary_Women = 39,998 + 2,447Experience, has an intercept of 39,998.]

Both plotted lines have the same slope, 2,447. The intercepts differ, however. The intercept for men is 42,238 while the intercept for women is 39,998.

Model 2: Salary_t = β_Const + β_SexM1 SexM1_t + β_E Experience_t + e_t

EstSalary = b_Const + b_SexM1 SexM1 + b_E Experience

Let us attempt to calculate the second model's estimated constant and estimated male sex dummy coefficient, b_Const and b_SexM1, using the intercepts from Model 1.

For men, SexM1 = 1:
EstSalary_Men = b_Const + b_SexM1 + b_E Experience
Intercept_Men = b_Const + b_SexM1
42,238 = b_Const + b_SexM1

For women, SexM1 = 0:
EstSalary_Women = b_Const + b_E Experience
Intercept_Women = b_Const
39,998 = b_Const

We now have two equations:

42,238 = b_Const + b_SexM1
39,998 = b_Const

and two unknowns, b_Const and b_SexM1. It is easy to solve for the unknowns. The second equation tells us that b_Const equals 39,998:

b_Const = 39,998

Next, focus on the first equation:

42,238 = b_Const + b_SexM1

Substituting for b_Const:

42,238 = 39,998 + b_SexM1

Solving for b_SexM1:

b_SexM1 = 42,238 − 39,998 = 2,240

Using the estimates from Model 1, we compute that Model 2's estimate for the constant should be 39,998 and its estimate for the male sex dummy coefficient should be 2,240. Let us now run the regression:

Ordinary Least Squares (OLS)
Dependent Variable: Salary
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
SexM1                      2,240
Experience                 2,447
Const                      39,998
Number of Observations     200

Estimated Equation: EstSalary = 39,998 + 2,240SexM1 + 2,447Experience

Table 15.16: Faculty Salary Regression Results

The regression confirms our calculations.
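The relationship between the two parameterizations holds for any data set, not just this one. The sketch below uses invented salary data (not the MIT faculty workfile) to confirm that Model 2's constant equals Model 1's intercept for women and that its SexM1 coefficient equals the gap between the two intercepts.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Invented faculty data (illustration only)
rng = np.random.default_rng(2)
n = 200
data = pd.DataFrame({"SexM1": rng.integers(0, 2, n),
                     "Experience": rng.uniform(0, 30, n)})
data["SexF1"] = 1 - data["SexM1"]
data["Salary"] = (40_000 + 2_000 * data["SexM1"] + 2_500 * data["Experience"]
                  + rng.normal(0, 5_000, n))

model1 = smf.ols("Salary ~ SexF1 + Experience", data=data).fit()  # Model 1
model2 = smf.ols("Salary ~ SexM1 + Experience", data=data).fit()  # Model 2

intercept_men = model1.params["Intercept"]                      # men: SexF1 = 0
intercept_women = model1.params["Intercept"] + model1.params["SexF1"]

# Model 2's constant is the women's intercept; its SexM1 coefficient is the gap.
print(round(model2.params["Intercept"]), round(intercept_women))
print(round(model2.params["SexM1"]), round(intercept_men - intercept_women))
```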

Model 3: Salary_t = β_SexF1 SexF1_t + β_SexM1 SexM1_t + β_E Experience_t + e_t

EstSalary = b_SexF1 SexF1 + b_SexM1 SexM1 + b_E Experience

Again, let us attempt to calculate the third model's estimated female sex dummy coefficient and male sex dummy coefficient, b_SexF1 and b_SexM1, using the intercepts from Model 1.

For men, SexF1 = 0 and SexM1 = 1:
EstSalary_Men = b_SexM1 + b_E Experience
Intercept_Men = b_SexM1
42,238 = b_SexM1

For women, SexF1 = 1 and SexM1 = 0:
EstSalary_Women = b_SexF1 + b_E Experience
Intercept_Women = b_SexF1
39,998 = b_SexF1

We now have two equations:

42,238 = b_SexM1
39,998 = b_SexF1

and two unknowns, b_SexF1 and b_SexM1. Using the estimates from Model 1, we compute that Model 3's estimate for the male sex dummy coefficient should be 42,238 and its estimate for the female sex dummy coefficient should be 39,998. Let us now run the regression:

Getting Started in EViews: To estimate the third model using EViews, you must "fool" EViews into running the appropriate regression:

In the Workfile window: highlight Salary and then, while depressing <Ctrl>, highlight SexF1, SexM1, and Experience.
In the Workfile window: double click on a highlighted variable.
Click Open Equation.
In the Equation Specification window, delete c so that the window looks like this: salary sexf1 sexm1 experience.
Click OK.

Ordinary Least Squares (OLS)
Dependent Variable: Salary
Explanatory Variable(s):   Estimate   SE   t-Statistic   Prob
SexF1                      39,998
SexM1                      42,238
Experience                 2,447
Number of Observations     200

Estimated Equation: EstSalary = 39,998SexF1 + 42,238SexM1 + 2,447Experience

Table 15.17: Faculty Salary Regression Results

Again, the regression results confirm our calculations.

Model 4: Salary_t = β_Const + β_SexF1 SexF1_t + β_SexM1 SexM1_t + β_E Experience_t + e_t

EstSalary = b_Const + b_SexF1 SexF1 + b_SexM1 SexM1 + b_E Experience

Question: Can we calculate the fourth model's b_Const, b_SexF1, and b_SexM1 using Model 1's intercepts?

For men, SexF1 = 0 and SexM1 = 1:
EstSalary_Men = b_Const + b_SexM1 + b_E Experience
Intercept_Men = b_Const + b_SexM1
42,238 = b_Const + b_SexM1

For women, SexF1 = 1 and SexM1 = 0:
EstSalary_Women = b_Const + b_SexF1 + b_E Experience
Intercept_Women = b_Const + b_SexF1
39,998 = b_Const + b_SexF1

We now have two equations:

42,238 = b_Const + b_SexM1
39,998 = b_Const + b_SexF1

and three unknowns, b_Const, b_SexF1, and b_SexM1. We have more unknowns than equations; we cannot solve for the three unknowns. It is impossible. This is called the dummy variable trap:

Dummy Variable Trap: A model in which there are more parameters representing the intercepts than there are intercepts.

There are three parameters, b_Const, b_SexF1, and b_SexM1, estimating the two intercepts.

Now, let us try to run the regression: [Link to MIT-FacultySalaries.wf1 goes here.]

Our statistical software will issue a diagnostic telling us that it is being asked to do the impossible. In some sense, the software is being asked to solve for three unknowns with only two equations.
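The trap is visible in the algebra of the design matrix itself. The sketch below uses invented data (not the faculty workfile): with a constant plus both sex dummies, the constant column equals SexF1 + SexM1, so the design matrix loses rank and the least squares problem has no unique solution; dropping the constant (Model 3) or one dummy (Models 1 and 2) restores full rank.

```python
import numpy as np

# Invented faculty data (illustration only)
rng = np.random.default_rng(3)
n = 200
sexm1 = rng.integers(0, 2, n).astype(float)
sexf1 = 1.0 - sexm1
experience = rng.uniform(0, 30, n)
salary = 40_000 + 2_000 * sexm1 + 2_500 * experience + rng.normal(0, 5_000, n)

# Model 4: constant, SexF1, SexM1, and Experience -- the dummy variable trap.
# The constant column equals SexF1 + SexM1, an exact linear combination.
X4 = np.column_stack([np.ones(n), sexf1, sexm1, experience])
print(np.linalg.matrix_rank(X4), "independent columns out of", X4.shape[1])  # 3 out of 4

# lstsq reports the deficient rank; the coefficients it returns are only one of
# infinitely many combinations that fit the data equally well.
_, _, rank, _ = np.linalg.lstsq(X4, salary, rcond=None)
print("rank reported by lstsq:", rank)  # 3

# Model 3 drops the constant (Models 1 and 2 drop one dummy instead): full rank.
X3 = np.column_stack([sexf1, sexm1, experience])
print(np.linalg.matrix_rank(X3), "independent columns out of", X3.shape[1])  # 3 out of 3
```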


Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Explained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information

Inference with Simple Regression

Inference with Simple Regression 1 Introduction Inference with Simple Regression Alan B. Gelder 06E:071, The University of Iowa 1 Moving to infinite means: In this course we have seen one-mean problems, twomean problems, and problems

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 28. SIMPLE LINEAR REGRESSION III Fitted Values and Residuals To each observed x i, there corresponds a y-value on the fitted line, y = βˆ + βˆ x. The are called fitted values. ŷ i They are the values of

More information

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a Chapter 9 Regression with a Binary Dependent Variable Multiple Choice ) The binary dependent variable model is an example of a a. regression model, which has as a regressor, among others, a binary variable.

More information

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity R.G. Pierse 1 Omitted Variables Suppose that the true model is Y i β 1 + β X i + β 3 X 3i + u i, i 1,, n (1.1) where β 3 0 but that the

More information

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc. Notes on regression analysis 1. Basics in regression analysis key concepts (actual implementation is more complicated) A. Collect data B. Plot data on graph, draw a line through the middle of the scatter

More information

Amherst College Department of Economics Economics 360 Fall 2015 Monday, December 7 Problem Set Solutions

Amherst College Department of Economics Economics 360 Fall 2015 Monday, December 7 Problem Set Solutions Amherst College epartment of Economics Economics 3 Fall 2015 Monday, ecember 7 roblem et olutions 1. Consider the following linear model: tate residential Election ata: Cross section data for the fifty

More information

LI EAR REGRESSIO A D CORRELATIO

LI EAR REGRESSIO A D CORRELATIO CHAPTER 6 LI EAR REGRESSIO A D CORRELATIO Page Contents 6.1 Introduction 10 6. Curve Fitting 10 6.3 Fitting a Simple Linear Regression Line 103 6.4 Linear Correlation Analysis 107 6.5 Spearman s Rank Correlation

More information

Inference in Regression Analysis

Inference in Regression Analysis ECNS 561 Inference Inference in Regression Analysis Up to this point 1.) OLS is unbiased 2.) OLS is BLUE (best linear unbiased estimator i.e., the variance is smallest among linear unbiased estimators)

More information

1 A Non-technical Introduction to Regression

1 A Non-technical Introduction to Regression 1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in

More information

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2. Updated: November 17, 2011 Lecturer: Thilo Klein Contact: tk375@cam.ac.uk Contest Quiz 3 Question Sheet In this quiz we will review concepts of linear regression covered in lecture 2. NOTE: Please round

More information

Simple Linear Regression

Simple Linear Regression CHAPTER 13 Simple Linear Regression CHAPTER OUTLINE 13.1 Simple Linear Regression Analysis 13.2 Using Excel s built-in Regression tool 13.3 Linear Correlation 13.4 Hypothesis Tests about the Linear Correlation

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Regression Models. Chapter 4. Introduction. Introduction. Introduction Chapter 4 Regression Models Quantitative Analysis for Management, Tenth Edition, by Render, Stair, and Hanna 008 Prentice-Hall, Inc. Introduction Regression analysis is a very valuable tool for a manager

More information

Monday, November 26: Explanatory Variable Explanatory Premise, Bias, and Large Sample Properties

Monday, November 26: Explanatory Variable Explanatory Premise, Bias, and Large Sample Properties Amherst College Department of Economics Economics 360 Fall 2012 Monday, November 26: Explanatory Variable Explanatory Premise, Bias, and Large Sample Properties Chapter 18 Outline Review o Regression Model

More information

University of Maryland Spring Economics 422 Final Examination

University of Maryland Spring Economics 422 Final Examination Department of Economics John C. Chao University of Maryland Spring 2009 Economics 422 Final Examination This exam contains 4 regular questions and 1 bonus question. The total number of points for the regular

More information

Essential Statistics. Gould Ryan Wong

Essential Statistics. Gould Ryan Wong Global Global Essential Statistics Eploring the World through Data For these Global Editions, the editorial team at Pearson has collaborated with educators across the world to address a wide range of subjects

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Looking Ahead to Chapter 10

Looking Ahead to Chapter 10 Looking Ahead to Chapter Focus In Chapter, you will learn about polynomials, including how to add, subtract, multiply, and divide polynomials. You will also learn about polynomial and rational functions.

More information

Chapter 20 Comparing Groups

Chapter 20 Comparing Groups Chapter 20 Comparing Groups Comparing Proportions Example Researchers want to test the effect of a new anti-anxiety medication. In clinical testing, 64 of 200 people taking the medicine reported symptoms

More information

Psych 10 / Stats 60, Practice Problem Set 10 (Week 10 Material), Solutions

Psych 10 / Stats 60, Practice Problem Set 10 (Week 10 Material), Solutions Psych 10 / Stats 60, Practice Problem Set 10 (Week 10 Material), Solutions Part 1: Conceptual ideas about correlation and regression Tintle 10.1.1 The association would be negative (as distance increases,

More information

1 Correlation and Inference from Regression

1 Correlation and Inference from Regression 1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is

More information

ECON 497 Midterm Spring

ECON 497 Midterm Spring ECON 497 Midterm Spring 2009 1 ECON 497: Economic Research and Forecasting Name: Spring 2009 Bellas Midterm You have three hours and twenty minutes to complete this exam. Answer all questions and explain

More information

ANOVA - analysis of variance - used to compare the means of several populations.

ANOVA - analysis of variance - used to compare the means of several populations. 12.1 One-Way Analysis of Variance ANOVA - analysis of variance - used to compare the means of several populations. Assumptions for One-Way ANOVA: 1. Independent samples are taken using a randomized design.

More information

ECON 4230 Intermediate Econometric Theory Exam

ECON 4230 Intermediate Econometric Theory Exam ECON 4230 Intermediate Econometric Theory Exam Multiple Choice (20 pts). Circle the best answer. 1. The Classical assumption of mean zero errors is satisfied if the regression model a) is linear in the

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Correlation and Regression

Correlation and Regression Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class

More information

Solutions to Exercises in Chapter 9

Solutions to Exercises in Chapter 9 in 9. (a) When a GPA is increased by one unit, and other variables are held constant, average starting salary will increase by the amount $643. Students who take econometrics will have a starting salary

More information

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur

Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur Lecture 10 Software Implementation in Simple Linear Regression Model using

More information

Chapter 4. Regression Models. Learning Objectives

Chapter 4. Regression Models. Learning Objectives Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing

More information

Applied Quantitative Methods II

Applied Quantitative Methods II Applied Quantitative Methods II Lecture 4: OLS and Statistics revision Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 4 VŠE, SS 2016/17 1 / 68 Outline 1 Econometric analysis Properties of an estimator

More information

Chapter 14 Student Lecture Notes 14-1

Chapter 14 Student Lecture Notes 14-1 Chapter 14 Student Lecture Notes 14-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Building Chap 14-1 Chapter Goals After completing this

More information

Ch 7: Dummy (binary, indicator) variables

Ch 7: Dummy (binary, indicator) variables Ch 7: Dummy (binary, indicator) variables :Examples Dummy variable are used to indicate the presence or absence of a characteristic. For example, define female i 1 if obs i is female 0 otherwise or male

More information

Multiple Regression Analysis

Multiple Regression Analysis Chapter 4 Multiple Regression Analysis The simple linear regression covered in Chapter 2 can be generalized to include more than one variable. Multiple regression analysis is an extension of the simple

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Chapter 9. Dummy (Binary) Variables. 9.1 Introduction The multiple regression model (9.1.1) Assumption MR1 is

Chapter 9. Dummy (Binary) Variables. 9.1 Introduction The multiple regression model (9.1.1) Assumption MR1 is Chapter 9 Dummy (Binary) Variables 9.1 Introduction The multiple regression model y = β+β x +β x + +β x + e (9.1.1) t 1 2 t2 3 t3 K tk t Assumption MR1 is 1. yt =β 1+β 2xt2 + L+β KxtK + et, t = 1, K, T

More information

POL 681 Lecture Notes: Statistical Interactions

POL 681 Lecture Notes: Statistical Interactions POL 681 Lecture Notes: Statistical Interactions 1 Preliminaries To this point, the linear models we have considered have all been interpreted in terms of additive relationships. That is, the relationship

More information

Sampling, Frequency Distributions, and Graphs (12.1)

Sampling, Frequency Distributions, and Graphs (12.1) 1 Sampling, Frequency Distributions, and Graphs (1.1) Design: Plan how to obtain the data. What are typical Statistical Methods? Collect the data, which is then subjected to statistical analysis, which

More information

Regression Analysis IV... More MLR and Model Building

Regression Analysis IV... More MLR and Model Building Regression Analysis IV... More MLR and Model Building This session finishes up presenting the formal methods of inference based on the MLR model and then begins discussion of "model building" (use of regression

More information

Chapter 23: Inferences About Means

Chapter 23: Inferences About Means Chapter 3: Inferences About Means Sample of Means: number of observations in one sample the population mean (theoretical mean) sample mean (observed mean) is the theoretical standard deviation of the population

More information

Practice Questions for Exam 1

Practice Questions for Exam 1 Practice Questions for Exam 1 1. A used car lot evaluates their cars on a number of features as they arrive in the lot in order to determine their worth. Among the features looked at are miles per gallon

More information

Ch 13 & 14 - Regression Analysis

Ch 13 & 14 - Regression Analysis Ch 3 & 4 - Regression Analysis Simple Regression Model I. Multiple Choice:. A simple regression is a regression model that contains a. only one independent variable b. only one dependent variable c. more

More information

Psych 230. Psychological Measurement and Statistics

Psych 230. Psychological Measurement and Statistics Psych 230 Psychological Measurement and Statistics Pedro Wolf December 9, 2009 This Time. Non-Parametric statistics Chi-Square test One-way Two-way Statistical Testing 1. Decide which test to use 2. State

More information

Do not copy, post, or distribute. Independent-Samples t Test and Mann- C h a p t e r 13

Do not copy, post, or distribute. Independent-Samples t Test and Mann- C h a p t e r 13 C h a p t e r 13 Independent-Samples t Test and Mann- Whitney U Test 13.1 Introduction and Objectives This chapter continues the theme of hypothesis testing as an inferential statistical procedure. In

More information

Black White Total Observed Expected χ 2 = (f observed f expected ) 2 f expected (83 126) 2 ( )2 126

Black White Total Observed Expected χ 2 = (f observed f expected ) 2 f expected (83 126) 2 ( )2 126 Psychology 60 Fall 2013 Practice Final Actual Exam: This Wednesday. Good luck! Name: To view the solutions, check the link at the end of the document. This practice final should supplement your studying;

More information

Ordinary Least Squares Regression Explained: Vartanian

Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Eplained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information

ECONOMETRIC MODEL WITH QUALITATIVE VARIABLES

ECONOMETRIC MODEL WITH QUALITATIVE VARIABLES ECONOMETRIC MODEL WITH QUALITATIVE VARIABLES How to quantify qualitative variables to quantitative variables? Why do we need to do this? Econometric model needs quantitative variables to estimate its parameters

More information

ECO375 Tutorial 4 Wooldridge: Chapter 6 and 7

ECO375 Tutorial 4 Wooldridge: Chapter 6 and 7 ECO375 Tutorial 4 Wooldridge: Chapter 6 and 7 Matt Tudball University of Toronto St. George October 6, 2017 Matt Tudball (University of Toronto) ECO375H1 October 6, 2017 1 / 36 ECO375 Tutorial 4 Welcome

More information

11.5 Regression Linear Relationships

11.5 Regression Linear Relationships Contents 11.5 Regression............................. 835 11.5.1 Linear Relationships................... 835 11.5.2 The Least Squares Regression Line........... 837 11.5.3 Using the Regression Line................

More information

Ch. 16: Correlation and Regression

Ch. 16: Correlation and Regression Ch. 1: Correlation and Regression With the shift to correlational analyses, we change the very nature of the question we are asking of our data. Heretofore, we were asking if a difference was likely to

More information

Inferential statistics

Inferential statistics Inferential statistics Inference involves making a Generalization about a larger group of individuals on the basis of a subset or sample. Ahmed-Refat-ZU Null and alternative hypotheses In hypotheses testing,

More information

Midterm 2 - Solutions

Midterm 2 - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis February 24, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put

More information

Outline. Lesson 3: Linear Functions. Objectives:

Outline. Lesson 3: Linear Functions. Objectives: Lesson 3: Linear Functions Objectives: Outline I can determine the dependent and independent variables in a linear function. I can read and interpret characteristics of linear functions including x- and

More information

their contents. If the sample mean is 15.2 oz. and the sample standard deviation is 0.50 oz., find the 95% confidence interval of the true mean.

their contents. If the sample mean is 15.2 oz. and the sample standard deviation is 0.50 oz., find the 95% confidence interval of the true mean. Math 1342 Exam 3-Review Chapters 7-9 HCCS **************************************************************************************** Name Date **********************************************************************************************

More information

Announcements. J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 February 8, / 45

Announcements. J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 February 8, / 45 Announcements Solutions to Problem Set 3 are posted Problem Set 4 is posted, It will be graded and is due a week from Friday You already know everything you need to work on Problem Set 4 Professor Miller

More information

Module 7 Practice problem and Homework answers

Module 7 Practice problem and Homework answers Module 7 Practice problem and Homework answers Practice problem, page 1 Is the research hypothesis one-tailed or two-tailed? Answer: one tailed In the set up for the problem, we predicted a specific outcome

More information

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. B203: Quantitative Methods Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. Part I: Compulsory Questions. Answer all questions. Each question carries

More information

Trendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues

Trendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues Trendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues Overfitting Categorical Variables Interaction Terms Non-linear Terms Linear Logarithmic y = a +

More information

Binary Logistic Regression

Binary Logistic Regression The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b

More information