15.8 MULTIPLE REGRESSION WITH MANY EXPLANATORY VARIABLES
The method of multiple regression that we have studied, through the two-explanatory-variable life expectancy example, can be extended to any number of explanatory (sometimes called predictor) variables. This is very useful when we have several potentially useful explanatory variables measured for each observation and wish to explore which of them help predict a response variable of interest. Performing a multiple regression analysis on all the available potentially explanatory variables can help with this. Consider the following example.
Example

Data were gathered over a period of 17 years on the average price of beef and on a number of factors that were believed to have potential effects on the price of beef. But we do not know, without statistical analysis, whether all six explanatory variables are simultaneously needed to predict beef pricing. These explanatory variables are as follows:*

CBE: consumption of beef per capita (lb)
PPO: price of pork (cents/lb)
CPO: consumption of pork per capita (lb)
DINC: disposable income per capita index
CFO: food consumption per capita index
RDINC: index of real disposable income per capita

*F. B. Waugh, Graphic Analysis in Agricultural Economics, Agricultural Handbook 128 (Washington, D.C.: U.S. Department of Agriculture, 1957). The term per capita means per person here.

It is easy to see why each of these variables could have an effect on the price of beef. For example, the more pork that is consumed, or the cheaper pork is, the less the demand for beef should be. Obtainable from any standard statistical computer package, the usual least squares multiple regression analysis yields the following regression relationship between the price of beef (in cents/lb) and the six explanatory variables:

Beef price = …(CBE) + 0.32(PPO) − 0.87(CPO) + 0.07(DINC) + 0.37(CFO) − 0.16(RDINC)

The ANOVA table for regression, as introduced in the previous section, is as follows:

Source        Sum of squares   Degrees of freedom   Mean square   F
Regression    743.13           6                    123.86
Error                          10
Total                          16

Clearly, we strongly reject the null hypothesis that the regression is not worthwhile (consult the F tables for the 5% and 1% significance values to see whether you agree!). The equation we have constructed thus does have the power to explain the price of beef. Two essential issues arise, however. First, why does the RDINC coefficient have a sign opposite to that which we might expect? That is, we would expect that as real disposable income (RDINC) increased, the price of beef would increase, since people would consume more high-priced beef, increasing its demand and driving up its price. However, the negative sign for that term indicates the
opposite relationship. It is possible that prediction is helped by the RDINC term, but we wonder! A second and crucial question is, could we have gotten by with fewer explanatory variables and done just as well in explaining the price of beef? Perhaps we are overfitting the data by including some useless explanatory variables. We will address these questions later in this section.

Estimation of the Regression Equation and Using the Equation for Prediction

The idea of using least squares as a method of finding coefficients in linear regression was introduced in Section 3.5 and discussed briefly in the last section. In Section 3.5 we had only one explanatory variable, and we attempted to determine the value of the slope of the regression line that would minimize the mean squared error. The method here is exactly the same, except that we minimize over all explanatory variables simultaneously: we want to find the set of coefficients that, when applied in the regression equation, minimizes the mean squared error. Of course, we cannot perform this minimization by hand without severe difficulty and a great amount of time. That is why we turn to a convenient computer package to provide the proper least squares estimates of the coefficients of the explanatory variables. Once this regression equation has been found, we can use it, as we did in Chapter 3, to make predictions of future responses based on observed explanatory variables. As was explained in Chapter 3, we want to be careful to use only interpolation, not extrapolation. In multiple regression, interpolation means that every observed explanatory value used to make the prediction should be within the range of the data values of the corresponding explanatory variable used in forming the regression equation.
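To make this minimization concrete, the least squares coefficients can be found by solving the normal equations (XᵀX)b = Xᵀy. The sketch below does this in plain Python for a small invented data set with two explanatory variables (the numbers are hypothetical, chosen so y fits exactly and the answer is easy to check); a statistical package does the same thing, more carefully, for real data:

```python
# Least squares multiple regression via the normal equations (X'X)b = X'y.
# Invented illustration data: y = 1 + 2*x1 + 3*x2 exactly, so the fitted
# coefficients should recover 1, 2, 3.

def solve(A, b):
    """Solve the square linear system A x = b by Gaussian elimination."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit(X, y):
    """Least squares coefficients; each row of X starts with a 1 (intercept)."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]   # columns: 1, x1, x2
y = [1, 3, 4, 6, 8]
b = fit(X, y)
print([round(v, 6) for v in b])                   # → [1.0, 2.0, 3.0]
print(round(b[0] + b[1] * 1.5 + b[2] * 0.5, 6))   # prediction at (1.5, 0.5) → 5.5
```

With real data the fit is not exact; the same coefficients then minimize the sum of squared residuals, which is exactly what the ANOVA table decomposes.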
For example, with the beef data, the range of each of the explanatory variables was as follows:

CBE:   PPO:   CPO:   DINC:   CFO:   RDINC:

So, if we were going to use this equation for prediction, we would want to make sure that each of the explanatory values we were using was within these ranges (a few exceptions, as long as they are not too far out of the range, would be acceptable).
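This interpolation check is easy to automate: keep the observed minimum and maximum of each explanatory variable, and flag any prediction whose inputs fall outside them. A minimal sketch, with invented observations standing in for the beef ranges:

```python
# Interpolation guard: a new point is safe to predict for only if every
# coordinate lies within the range observed when the equation was fit.
# Invented rows of (x1, x2) observations, not the beef data.

def within_ranges(new_point, observed_rows):
    """True if every coordinate of new_point is inside the observed min..max."""
    columns = list(zip(*observed_rows))
    return all(min(col) <= v <= max(col) for v, col in zip(new_point, columns))

observed = [(50, 60), (55, 70), (62, 65), (48, 80)]
print(within_ranges((52, 75), observed))   # → True:  52 in 48..62, 75 in 60..80
print(within_ranges((52, 95), observed))   # → False: 95 is outside 60..80
```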
Testing Explanatory Variables for Usefulness

We now want to explore the questions we posed at the end of the example: namely, why one of the coefficients in the regression equation has a sign opposite to what was expected, and whether we can eliminate certain explanatory variables without lessening our ability to explain the price of beef. It turns out that the answers to these questions are related. First, it is important to understand that the coefficient of a particular variable, such as RDINC, depends on which of the other explanatory variables are present in the regression equation. There exists a certain amount of total variation in the response variable, as measured by its total sum of squares. Although one explanatory variable may explain some of this total response variation, other explanatory variables will also share in explaining the total variation, depending on which of them are included in the model. To understand this perhaps puzzling idea, let's consider an example. Consider prediction of college freshman grade point average (GPA) using both SAT and ACT college entrance scores. First, we note that although they are somewhat different, these two college entrance tests in fact measure very similar things and are highly correlated in the population entering college. We would expect, as is true, that the SAT score by itself does a good job of predicting freshman GPA, and hence will have an influential coefficient in the regression equation GPA = c + m(SAT score). But if both scores are used together as explanatory variables, the coefficient of the SAT score will be much smaller, because the ACT score is now also sharing in predicting freshman GPA. The point is that if several explanatory variables are present, then the coefficient of each variable represents the explanatory capacity of that variable viewed in cooperation with the explanatory capacities of the other variables.
Let's see how this relates to our beef pricing prediction equation. Consider the case of the variable RDINC. We now understand that the variation it explains in the regression equation of the example is variation not being explained by the other five variables, including in particular the variable DINC. It is natural to suppose that RDINC and DINC, which are indeed defined to be very similar, would explain much the same variation in the price of beef. In fact, the sample correlation coefficient between RDINC and DINC is 0.82, indicating a strong relationship between them and hence a similar prediction role for them. The point is that the coefficient of any particular explanatory variable in a regression equation has to be understood in the context of all the other explanatory variables present in the equation.
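This sharing of explanatory capacity is easy to demonstrate numerically. In the small invented data set below, y is built to depend equally on two nearly identical predictors (playing the role of SAT and ACT, or DINC and RDINC). Regressed on x1 alone, x1 gets a large slope; once x2 enters, x1's coefficient drops because the two variables split the same variation:

```python
# Invented demonstration: a coefficient shrinks when a highly correlated
# second predictor is added to the model.

def sums(u, v):
    """Centered cross-product sum S_uv = sum (u_i - ubar)(v_i - vbar)."""
    ub, vb = sum(u) / len(u), sum(v) / len(v)
    return sum((a - ub) * (b - vb) for a, b in zip(u, v))

x1 = [1, 2, 3, 4]
x2 = [1, 2, 3, 5]                       # nearly identical to x1
y = [a + b for a, b in zip(x1, x2)]     # y = x1 + x2 exactly

s11, s22, s12 = sums(x1, x1), sums(x2, x2), sums(x1, x2)
s1y, s2y = sums(x1, y), sums(x2, y)

r12 = s12 / (s11 * s22) ** 0.5          # correlation between the predictors
alone = s1y / s11                       # slope when y is regressed on x1 alone
den = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / den      # x1's coefficient with x2 present

print(round(r12, 2))    # → 0.98  the predictors are highly correlated
print(round(alone, 3))  # → 2.3   alone, x1 absorbs x2's share of the variation
print(round(b1, 3))     # → 1.0   with x2 present, x1's coefficient drops
```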
This leads to the issue of whether it is of value to include both RDINC and DINC in the regression. If they both explain approximately the same thing, then why include them both? This is a very important issue for statisticians doing multiple regression, since in developing models to predict and explain the world around us, we are always interested in creating the simplest model possible (in particular, one with the fewest explanatory variables) while retaining our ability to explain or predict the response variable as well as possible. The ANOVA table shown in the example considers the regression in one line of the table. However, it is possible (though we do not explain how here) to split off, from the regression sum of squares, a sum of squares component for RDINC. For our example, this expanded ANOVA table is as follows:

Source                 Sum of squares   Degrees of freedom   Mean square   F
Other five variables                    5
RDINC                                   1                                  0.16
Error                                   10
Total                                   16

Note that the six-degrees-of-freedom regression sum of squares (743.13) of the example has here been decomposed into the five-degrees-of-freedom sum of squares for the combined influence of CBE, PPO, CPO, DINC, and CFO and the single-degree-of-freedom sum of squares for RDINC, which, as the theory says, must add to 743.13 (check it!). Now we have a separate F test for the explanatory variable RDINC. It is important to recall, however, that this sum of squares for RDINC assumes that the other five explanatory variables are included in the equation. Thus the F test is asking whether including the RDINC variable helps predict the response variable (the price of beef) given that the other five variables are part of the prediction equation. Because of this, if we find by using the F distribution that an explanatory variable is not important, we will want to redo the least squares regression with that variable removed. When a variable is removed, the coefficients of the remaining explanatory variables will change.
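The mechanics of this extra-variable F test can be sketched in code: fit the full and the reduced models, take the drop in the error (residual) sum of squares as the one-degree-of-freedom sum of squares for the added variable, and divide by the full model's error mean square. Everything below is invented illustration data, not the beef series:

```python
# Extra sum of squares F test: does x2 help, given x1 is already in the model?

def sums(u, v):
    """Centered cross-product sum S_uv."""
    ub, vb = sum(u) / len(u), sum(v) / len(v)
    return sum((a - ub) * (b - vb) for a, b in zip(u, v))

def sse_one(x, y):
    """Residual sum of squares for y = a + b*x."""
    b = sums(x, y) / sums(x, x)
    a = sum(y) / len(y) - b * sum(x) / len(x)
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def sse_two(x1, x2, y):
    """Residual sum of squares for y = a + b1*x1 + b2*x2 (closed form)."""
    s11, s22, s12 = sums(x1, x1), sums(x2, x2), sums(x1, x2)
    s1y, s2y = sums(x1, y), sums(x2, y)
    den = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / den
    b2 = (s2y * s11 - s1y * s12) / den
    a = sum(y) / len(y) - b1 * sum(x1) / len(x1) - b2 * sum(x2) / len(x2)
    return sum((yi - (a + b1 * u + b2 * v)) ** 2
               for u, v, yi in zip(x1, x2, y))

x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [2.2, 3.9, 6.0, 8.1, 9.8, 12.0]      # roughly 2*x1; x2 should add little

n = len(y)
sse_red, sse_full = sse_one(x1, y), sse_two(x1, x2, y)
# 1 numerator df (one extra variable); n - 3 denominator df (full model)
F = (sse_red - sse_full) / (sse_full / (n - 3))
print(round(F, 2))   # compare with the 5% point of F(1, n - 3)
```

Adding a variable can never increase the residual sum of squares, so the numerator is always nonnegative; the test asks whether the reduction is large relative to the remaining noise.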
In our example, the RDINC F statistic is 0.16. We test the null hypothesis that the RDINC variable is of no use in the model (that its coefficient is 0) by comparing 0.16 to the 5% value of the F distribution with 1 numerator and 10 denominator degrees of freedom. This 5% point is 4.96, so we clearly cannot reject the null hypothesis. Thus we conclude that RDINC is not of use in the presence of the other explanatory variables (and
hence its original negative coefficient was not to be trusted). We remove this variable from the regression equation. We could explore removing other explanatory variables as well. Indeed, it is important to ask how many and which variables are needed to obtain a regression equation in which each included variable is useful for prediction in addition to the others present, and in which adding any other explanatory variable does not improve prediction. An advanced ANOVA analysis that considers all possible regression equations formed by including various subsets of the six explanatory variables produces a solution to this question:

Beef price = …(CFO) + 1.27(CBE) − 0.78(CPO) + 0.31(PPO)

Here the F test for each coefficient rejects the null hypothesis that the coefficient is zero, indicating the predictive usefulness of each of the variables, even with the other three explanatory variables present. Recall that the multiple correlation coefficient was only 66% in the life expectancy example of the previous section. By contrast, the R² value here is 97%, a very high value indicating very effective predictive capability. Comparing the coefficients of the four explanatory variables in this equation with those in the original equation with all six explanatory variables, we note that two of the coefficients changed little, one changed a moderate amount, and one is now much different. Interestingly, this four-variable equation has dropped both RDINC and DINC, because they are ineffective in the presence of the other four variables.

SECTION 15.8 EXERCISES

1. For both of the following sets of values, using the regression equation found in the section, predict the price of beef, if it is appropriate. If it is not appropriate, explain why not.
   a. CBE = 52, PPO = 51.2, CPO = 56.3, DINC = 48.7, CFO = 90.5, RDINC = …
   b. CBE = 48, PPO = 86.3, CPO = 48.1, DINC = 21.9, CFO = 96.2, RDINC = …

2. Answer true or false, and explain: Since the sign for the explanatory variable DINC is positive, the correlation between DINC and the price of beef is necessarily also positive.

3. Can you determine the value of R² for the beef price example? Refer back to the example on page 656.

4. Consider the regression with two predictor variables based on 59 metropolitan areas, where Y = average income (in $1000s), X = average educational level, and Z = percentage of workers who are white-collar.
   a. The ANOVA table below shows the sum of squares for X as the explanatory variable, and the sum of squares for Z after X has explained what it can:

      Source   Sum of squares   Degrees of freedom   Mean square   F
      X                                              ?             ?
      Z                                              ?             ?
      Error                                          ?
      Total

      Fill in the mean squares for X, Z, and Error, and the F statistics for X and Z.
   b. Perform the F test for the X variable, with significance level .05. What do you conclude?
   c. Perform the F test for the Z variable. What do you conclude?
   d. The next ANOVA table shows the sum of squares for Z as the explanatory variable, and a blank for the sum of squares for X after Z has explained what it can:

      Source   Sum of squares   Degrees of freedom   Mean square   F
      Z                                              ?             ?
      X        ?                1                    ?             ?
      Error                                          ?
      Total

      Fill in the sum of squares for X, the mean squares, and the F's.
   e. Perform the F test for the Z variable. What do you conclude? Compare your conclusion to that in part (c). Is there a contradiction? Explain.
   f. Perform the F test for the X variable. What do you conclude?
   g. Which equation would you prefer? Explain.
      (i) Y = a + bX + cZ
      (ii) Y = a + bX
      (iii) Y = a + cZ

5. The scores for 107 statistics students included the following: HW, score on book homework; Labs, score on computer laboratory assignments; In Class, score on in-class assignments; Exams, score on exams during the semester (not including the final); and Final, score on the final exam.
   a. Let Y = Final. Which single one of the other variables would you expect to best predict Y?
   b. The ANOVA table below has the sums of squares for predicting the final exam score from the others. The first line has the sum of squares due to the three variables Labs, In Class, and Exams, and the second has the sum of squares due to HW after the other three have explained what they can:

      Source                  Sum of squares   Degrees of freedom   Mean square   F
      Labs, In Class, Exams                    ?                    ?             ?
      HW                      ?                ?                    ?             ?
      Error                                    ?                    ?
      Total                                    ?

      Fill in the spaces that have question marks.
   c. Test whether Labs, In Class, and Exams together have significant predictive value for the final exam score.
   d. Test whether HW has significant additional predictive power after the other three variables have explained what they can.
   e. The next ANOVA table has the sum of squares for Labs and In Class together, then the sum of squares for the additional effect of Exams:

      Source           Sum of squares   Degrees of freedom   Mean square   F
      Labs, In Class                    ?                    ?
      Exams            ?                ?                    ?             ?
      Error                             ?                    ?
      Total                             ?

      Fill in the missing information.
   f. Test whether the Labs and In Class variables combined have significant predictive power.
   g. Test whether Exams has significant additional predictive power after Labs and In Class have explained what they can.
   h. Which of the following equations would you prefer for predicting the final score? Why?
      (i) Final = a + b(Labs) + c(In Class) + d(Exams)
      (ii) Final = a + b(HW) + c(Labs) + d(In Class) + e(Exams)
      (iii) Final = a + b(Labs) + c(In Class)
More informationChapter 10: Multiple Regression Analysis Introduction
Chapter 10: Multiple Regression Analysis Introduction Chapter 10 Outline Simple versus Multiple Regression Analysis Goal of Multiple Regression Analysis A One-Tailed Test: Downward Sloping Demand Theory
More informationBasic Linear Model. Chapters 4 and 4: Part II. Basic Linear Model
Basic Linear Model Chapters 4 and 4: Part II Statistical Properties of Least Square Estimates Y i = α+βx i + ε I Want to chooses estimates for α and β that best fit the data Objective minimize the sum
More informationUsing regression to study economic relationships is called econometrics. econo = of or pertaining to the economy. metrics = measurement
EconS 450 Forecasting part 3 Forecasting with Regression Using regression to study economic relationships is called econometrics econo = of or pertaining to the economy metrics = measurement Econometrics
More informationChapter 6. Exploring Data: Relationships. Solutions. Exercises:
Chapter 6 Exploring Data: Relationships Solutions Exercises: 1. (a) It is more reasonable to explore study time as an explanatory variable and the exam grade as the response variable. (b) It is more reasonable
More informationYou are permitted to use your own calculator where it has been stamped as approved by the University.
ECONOMICS TRIPOS Part I Friday 11 June 004 9 1 Paper 3 Quantitative Methods in Economics This exam comprises four sections. Sections A and B are on Mathematics; Sections C and D are on Statistics. You
More informationREVIEW 8/2/2017 陈芳华东师大英语系
REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p
More informationChapter 9: Roots and Irrational Numbers
Chapter 9: Roots and Irrational Numbers Index: A: Square Roots B: Irrational Numbers C: Square Root Functions & Shifting D: Finding Zeros by Completing the Square E: The Quadratic Formula F: Quadratic
More informationSimple Linear Regression: One Qualitative IV
Simple Linear Regression: One Qualitative IV 1. Purpose As noted before regression is used both to explain and predict variation in DVs, and adding to the equation categorical variables extends regression
More informationy response variable x 1, x 2,, x k -- a set of explanatory variables
11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate
More informationPaired Samples. Lecture 37 Sections 11.1, 11.2, Robb T. Koether. Hampden-Sydney College. Mon, Apr 2, 2012
Paired Samples Lecture 37 Sections 11.1, 11.2, 11.3 Robb T. Koether Hampden-Sydney College Mon, Apr 2, 2012 Robb T. Koether (Hampden-Sydney College) Paired Samples Mon, Apr 2, 2012 1 / 17 Outline 1 Dependent
More informationTesting and Model Selection
Testing and Model Selection This is another digression on general statistics: see PE App C.8.4. The EViews output for least squares, probit and logit includes some statistics relevant to testing hypotheses
More information(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.
FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December
More informationCorrelation & Regression Chapter 5
Correlation & Regression Chapter 5 Correlation: Do you have a relationship? Between two Quantitative Variables (measured on Same Person) (1) If you have a relationship (p
More informationLesson Least Squares Regression Line as Line of Best Fit
STATWAY STUDENT HANDOUT STUDENT NAME DATE INTRODUCTION Comparing Lines for Predicting Textbook Costs In the previous lesson, you predicted the value of the response variable knowing the value of the explanatory
More informationInferences for Correlation
Inferences for Correlation Quantitative Methods II Plan for Today Recall: correlation coefficient Bivariate normal distributions Hypotheses testing for population correlation Confidence intervals for population
More informationInformation Sources. Class webpage (also linked to my.ucdavis page for the class):
STATISTICS 108 Outline for today: Go over syllabus Provide requested information I will hand out blank paper and ask questions Brief introduction and hands-on activity Information Sources Class webpage
More informationCorrelation. A statistics method to measure the relationship between two variables. Three characteristics
Correlation Correlation A statistics method to measure the relationship between two variables Three characteristics Direction of the relationship Form of the relationship Strength/Consistency Direction
More informationStat 500 Midterm 2 12 November 2009 page 0 of 11
Stat 500 Midterm 2 12 November 2009 page 0 of 11 Please put your name on the back of your answer book. Do NOT put it on the front. Thanks. Do not start until I tell you to. The exam is closed book, closed
More informationChapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania
Chapter 10 Regression Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania Scatter Diagrams A graph in which pairs of points, (x, y), are
More informationVariance Estimates and the F Ratio. ERSH 8310 Lecture 3 September 2, 2009
Variance Estimates and the F Ratio ERSH 8310 Lecture 3 September 2, 2009 Today s Class Completing the analysis (the ANOVA table) Evaluating the F ratio Errors in hypothesis testing A complete numerical
More informationQUEEN S UNIVERSITY FINAL EXAMINATION FACULTY OF ARTS AND SCIENCE DEPARTMENT OF ECONOMICS APRIL 2018
Page 1 of 4 QUEEN S UNIVERSITY FINAL EXAMINATION FACULTY OF ARTS AND SCIENCE DEPARTMENT OF ECONOMICS APRIL 2018 ECONOMICS 250 Introduction to Statistics Instructor: Gregor Smith Instructions: The exam
More informationSTA2601. Tutorial Letter 104/1/2014. Applied Statistics II. Semester 1. Department of Statistics STA2601/104/1/2014 TRIAL EXAMINATION PAPER
STA2601/104/1/2014 Tutorial Letter 104/1/2014 Applied Statistics II STA2601 Semester 1 Department of Statistics TRIAL EXAMINATION PAPER BAR CODE Learn without limits. university of south africa Dear Student
More informationTable of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).
Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X 1.04) =.8508. For z < 0 subtract the value from
More informationQ1: What is the interpretation of the number 4.1? A: There were 4.1 million visits to ER by people 85 and older, Q2: What percent of people 65-74
Lecture 4 This week lab:exam 1! Review lectures, practice labs 1 to 4 and homework 1 to 5!!!!! Need help? See me during my office hrs, or goto open lab or GS 211. Bring your picture ID and simple calculator.(note
More informationIntroduction To Confirmatory Factor Analysis and Item Response Theory
Introduction To Confirmatory Factor Analysis and Item Response Theory Lecture 23 May 3, 2005 Applied Regression Analysis Lecture #23-5/3/2005 Slide 1 of 21 Today s Lecture Confirmatory Factor Analysis.
More information1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests
Overall Overview INFOWO Statistics lecture S3: Hypothesis testing Peter de Waal Department of Information and Computing Sciences Faculty of Science, Universiteit Utrecht 1 Descriptive statistics 2 Scores
More informationScatterplots and Correlation
Bivariate Data Page 1 Scatterplots and Correlation Essential Question: What is the correlation coefficient and what does it tell you? Most statistical studies examine data on more than one variable. Fortunately,
More informationDISTRIBUTIONS USED IN STATISTICAL WORK
DISTRIBUTIONS USED IN STATISTICAL WORK In one of the classic introductory statistics books used in Education and Psychology (Glass and Stanley, 1970, Prentice-Hall) there was an excellent chapter on different
More informationWooldridge, Introductory Econometrics, 4th ed. Chapter 6: Multiple regression analysis: Further issues
Wooldridge, Introductory Econometrics, 4th ed. Chapter 6: Multiple regression analysis: Further issues What effects will the scale of the X and y variables have upon multiple regression? The coefficients
More informationMGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu
Last Name (Print): Solution First Name (Print): Student Number: MGECHY L Introduction to Regression Analysis Term Test Friday July, PM Instructor: Victor Yu Aids allowed: Time allowed: Calculator and one
More informationSTA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.
STA441: Spring 2018 Multiple Regression This slide show is a free open source document. See the last slide for copyright information. 1 Least Squares Plane 2 Statistical MODEL There are p-1 explanatory
More informationExtra Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences , July 2, 2015
Extra Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences 12.00 14.45, July 2, 2015 Also hand in this exam and your scrap paper. Always motivate your answers. Write your answers in
More informationMgmt 469. Causality and Identification
Mgmt 469 Causality and Identification As you have learned by now, a key issue in empirical research is identifying the direction of causality in the relationship between two variables. This problem often
More information
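The fitting and testing steps described above can be sketched in code. This is a minimal illustration only: the original beef-price data are not reproduced here, so it uses synthetic data with assumed coefficients, fits the least-squares regression, and then builds the overall ANOVA F statistic (regression mean square over error mean square) exactly as in the table in the text.

```python
import numpy as np

# Synthetic stand-in for the data (the beef-price series itself is not
# reproduced in this sketch): 17 observations, 3 explanatory variables.
rng = np.random.default_rng(0)
n, k = 17, 3
X = rng.normal(size=(n, k))
true_beta = np.array([1.5, -0.9, 0.5])            # assumed, for illustration
y = 80 + X @ true_beta + rng.normal(scale=0.5, size=n)

# Add an intercept column and solve the least-squares problem.
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)

# ANOVA decomposition: total SS = regression SS + error SS.
fitted = Xd @ beta
sse = np.sum((y - fitted) ** 2)                   # error sum of squares
ssr = np.sum((fitted - y.mean()) ** 2)            # regression sum of squares

# Overall F statistic: (SSR / k) / (SSE / (n - k - 1)).
F = (ssr / k) / (sse / (n - k - 1))
print(f"F = {F:.1f}")
```

A large F relative to the tabulated 5% or 1% critical value of the F(k, n - k - 1) distribution leads to rejecting the null hypothesis that the regression is not worthwhile, just as in the beef-price example.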