FAQ: Linear and Multiple Regression Analysis: Coefficients
Domenic Rose
Question 1: How do I calculate a least squares regression line?

Answer 1: Regression analysis is a statistical tool that utilizes the relation between two or more quantitative variables so that one variable (the dependent variable) can be predicted from the others (the independent variables). For example, if one knows the relationship between corporate research and development expenditures and future sales, one may be able to predict future sales. The linear regression model is typically represented as y = a + bx + e. To construct a regression model, information on both x and y must be obtained from a sample of objects or individuals; the relationship between the two can then be estimated. Assume we are estimating the relation between the number of operating hours of a machine and its annual repair and maintenance costs. In this example, the number of operating hours is the independent variable and the repair cost is the dependent variable. The parameters a and b can take on any of an infinite number of real values. The goal of the regression procedure is to create a model that predicts the value of y with good accuracy. More exactly, we estimate the regression to minimize the sum of the squared deviations between predicted y and actual y (the residuals). Fortunately, statisticians have developed equations (called the least squares equations) to estimate a and b so that the best-fitting regression line is obtained. So, let's use the following information on machine hours and repair costs to calculate a least squares regression line.

First, collect data to estimate the relation. We limit the sample to 5 observations for simplicity only; typically at least 3 observations are required to fit any type of line.

Hours        Costs
  500       $1,000
  800       $2,000
1,500       $2,500
2,500       $7,000
6,000       $9,000
Mean 2,260  Mean $4,300

Second, estimate the regression coefficient b, the average amount repair costs increase when operating hours increase one unit and other independent variables are held constant. b is estimated as

b = Σ(x − mean of x)(y − mean of y) / Σ(x − mean of x)²

This is the covariance of x and y divided by the variance of x. In our example, this estimate is

b = [(500−2,260)(1,000−4,300) + (800−2,260)(2,000−4,300) + (1,500−2,260)(2,500−4,300) + (2,500−2,260)(7,000−4,300) + (6,000−2,260)(9,000−4,300)] / [(500−2,260)² + (800−2,260)² + (1,500−2,260)² + (2,500−2,260)² + (6,000−2,260)²]
  = 28,760,000 / 19,852,000 ≈ 1.449

Third, the estimate for a is computed as mean of y − (b × mean of x). In our example, this estimate is

a = 4,300 − (1.449 × 2,260) ≈ 1,025

The final estimated line is thus y = 1,025 + 1.449x + e. This means we can predict repair costs. For x equal to 3,000 hours, we estimate repair costs as 1,025 + (1.449 × 3,000) ≈ $5,372.
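The computation above can be sketched in plain Python using the example data (a minimal illustration, not production code):

```python
# Least squares estimates for the machine-hours example.
hours = [500, 800, 1500, 2500, 6000]    # x: annual operating hours
costs = [1000, 2000, 2500, 7000, 9000]  # y: annual repair costs ($)

n = len(hours)
mean_x = sum(hours) / n   # 2,260
mean_y = sum(costs) / n   # 4,300

# b = sum((x - mean x)(y - mean y)) / sum((x - mean x)^2)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, costs))
sxx = sum((x - mean_x) ** 2 for x in hours)
b = sxy / sxx             # ~1.449
a = mean_y - b * mean_x   # ~1,026 before rounding

# Predicted repair cost for 3,000 operating hours (~$5,372)
predicted = a + b * 3000
```

Carrying the unrounded coefficients gives a prediction a few dollars away from the hand-rounded figure, which is why worked answers that round b to three decimals can differ slightly.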
Note that for b coefficients of dummy variables which have been binary coded (1 or 0) based on categories, b is relative to the reference category (the category left out). Thus for a set of dummy variables for "Region," assuming "North" is the reference category and income is the dependent, a b of -1.5 for the dummy "South" means that the expected income in the South is 1.5 units less than the average of "North" respondents. Also, note that t-tests are used to assess the significance of individual b coefficients, specifically testing the null hypothesis that the regression coefficient is zero. A common rule of thumb is to drop from the equation all variables not significant at the .05 level or better.

Note that there is always error in the regression procedure. Error may be incorporated into the information collected to estimate the line. Regression models are a source of information about the world, but they must be used wisely.

Question 2: How do I calculate the coefficient of correlation?

Answer 2: The correlation coefficient measures the strength of the relationship between two variables. A perfectly positive correlation coefficient is +1.0. This means that for every change in one variable, the other variable changes in the same direction and in the same proportion as every other change in the first variable. For example, if every $1,000 increase in a company's reported net income is associated with a 5% increase in its stock price, then the relationship between reported net income and stock price is perfectly positively correlated (the correlation coefficient equals 1.0). Note that this correlation implies nothing about cause and effect. That is, an increase in the stock price may not be caused by the increase in net income; it may be that another, unidentified variable is affecting both. Perfectly negative correlations also exist, such as the relationship between age and purchases of CDs. When no correlation exists, the coefficient is zero, such as the relation between air temperature and birth rates. Like any statistic, we can test the significance of the measured correlation using a t-statistic.
The correlation coefficient is defined as the covariance of x and y divided by the product of the standard deviations of x and y. Correlation coefficients are particularly useful because they are symmetric: they apply both in situations where y is considered the dependent variable and x the independent variable, and in situations where x is considered the dependent variable and y the independent variable. So how do we calculate the coefficient if we do not use the function in Excel? Consider the following example of 5 observations of the relation between number of years of education past high school and dollars spent on new cars over the 10-year period following high school.

Years       $
  0      30,000
  2      50,000
  4      90,000
  6      95,000
  8      80,000

First, compute the mean of each variable.

Mean of years = 4.0 years
Mean of $ = $69,000

Next, compute the difference of each variable from its mean.

Years Deviation    $ Deviation
     -4.0             -39K
     -2.0             -19K
      0.0              21K
      2.0              26K
      4.0              11K

Third, for each pair of observations, sum the product of the deviations. The sum is

(-4.0 × -39K) + (-2.0 × -19K) + (0.0 × 21K) + (2.0 × 26K) + (4.0 × 11K) = 290K

Next, compute the product of the sums of the squared deviations.

For Years: (-4.0)² + (-2.0)² + (0.0)² + (2.0)² + (4.0)² = 40
For $: (-39K)² + (-19K)² + (21K)² + (26K)² + (11K)² = 3,120,000,000

The product of the sums is then 124,800,000,000.

Finally, divide the sum of the products by the square root of the product of the sums of the squared deviations.

290,000 / SQRT(124,800,000,000) = 290,000 / 353,270 ≈ 0.82

This is r, the correlation coefficient. In this case the variables are positively but not perfectly correlated. We could use a t-statistic to test the significance.

Question 3: How do I calculate the coefficient of determination?

Answer 3: The coefficient of determination, R², is the percent of the variance in the dependent variable explained by the independents. R² can also be interpreted as the proportionate reduction in error in estimating the dependent when knowing the independents. That is, R² is 1 minus the ratio of the errors made when using the regression model to guess the value of the dependent to the total errors made when using only the dependent's mean as the basis for estimating all cases.
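As a check on the arithmetic, the same computation can be sketched in Python using the example data, along with the t-statistic mentioned above for testing whether the correlation differs from zero (a minimal illustration):

```python
import math

years = [0, 2, 4, 6, 8]                        # education past high school
dollars = [30000, 50000, 90000, 95000, 80000]  # new-car spending ($)

n = len(years)
mean_x = sum(years) / n    # 4.0
mean_y = sum(dollars) / n  # 69,000

# Covariance numerator and the two sums of squared deviations
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, dollars))  # 290,000
sxx = sum((x - mean_x) ** 2 for x in years)    # 40
syy = sum((y - mean_y) ** 2 for y in dollars)  # 3,120,000,000

r = sxy / math.sqrt(sxx * syy)  # ~0.82

# Standard t-test of H0: correlation = 0, with n - 2 degrees of freedom
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
```

With n = 5 the t-statistic is only around 2.5, a reminder that such a tiny sample gives very little power even when r looks large.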
Mathematically, R² = 1 − (SSE/SST), where SSE = error sum of squares and SST = total sum of squares. The error sum of squares is the sum of the squared residuals (predicted y less actual y), and the total sum of squares is the sum of the squared deviations between actual y and the mean value of y. SSE is the error not explained by the model, and SST is the total error, whether explained or not. In our example of machine hours and costs, we can compute R². First, using the estimated regression (y = 1,025 + 1.449x), compute the predicted y for each x.

Hours        Actual Costs    Predicted Costs
  500          $1,000           $1,750
  800          $2,000           $2,185
1,500          $2,500           $3,200
2,500          $7,000           $4,649
6,000          $9,000           $9,720
Mean 2,260   Mean $4,300      Mean $4,301

Second, compute SSE as

(1,000 − 1,750)² + (2,000 − 2,185)² + (2,500 − 3,200)² + (7,000 − 4,649)² + (9,000 − 9,720)² ≈ 7,130,000

Third, compute SST from the actual costs and their mean as

(1,000 − 4,300)² + (2,000 − 4,300)² + (2,500 − 4,300)² + (7,000 − 4,300)² + (9,000 − 4,300)² = 48,800,000

Thus, R² = 1 − (7,130,000 / 48,800,000) ≈ 85.4%.

While R² can be increased by adding variables, this is inappropriate unless variables are added to the equation for sound theoretical reason.
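The R² computation can be sketched as follows (a minimal illustration; the coefficients are the rounded estimates from Question 1, so the result differs slightly from an exact fit, and the adjusted R² formula shown is the standard one):

```python
hours = [500, 800, 1500, 2500, 6000]
costs = [1000, 2000, 2500, 7000, 9000]
a, b = 1025.9, 1.449  # least squares estimates from Question 1 (rounded)

predicted = [a + b * x for x in hours]
mean_y = sum(costs) / len(costs)

sse = sum((y - p) ** 2 for y, p in zip(costs, predicted))  # unexplained error
sst = sum((y - mean_y) ** 2 for y in costs)                # total variation
r_squared = 1 - sse / sst                                  # ~0.85

# Adjusted R^2 penalizes for k, the number of independents
n, k = len(costs), 1
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
```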
At an extreme, when n − 1 variables are added to a regression equation fitted to n cases, R² will be 1.0, but this result is meaningless. Adjusted R² is used as a conservative reduction to R² to penalize for adding variables. Adjusted R² is an adjustment for the fact that when one has a large number of independents, it is possible that R² will become artificially high simply because some independents' chance variations "explain" small parts of the variance of the dependent. At the extreme, when there are as many independents as cases in the sample, R² will always be 1.0. The adjustment to the formula lowers R² as p, the number of independents, increases. Also note that, typically, R² should not be compared between samples, because one sample may simply have more variation to explain; more variance might allow more explanation of the variance.

Question 4: How do I calculate the standard error of the estimate?

Answer 4: Residual variance is a measure of the variation of the y values about the regression line, that is, of the deviations between actual y and predicted y. The square root of the residual variance is the standard error of the estimate. If the standard error is too large, the model may not be useful. To compute the standard error: first, determine the predicted y values. Second, compute the difference between predicted y and actual y values and square the differences; the sum of these is the error sum of squares (SSE), the error that cannot be explained by the estimated regression model. Finally, scale the error sum of squares (SSE) by the sample size less the number of independent variables less one. The square root of the result is the standard error of the estimate, and it measures the dispersion of the actual values of y around the fitted regression line. The standard error is also called the standard deviation of the regression model.
Even if the coefficient of determination is high, the standard error may be too high to provide adequate confidence in the model. For example, a prediction interval of plus or minus two standard errors might be too wide to be practically useful. A better model, with different or additional variables, is then warranted.
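Following the three steps in Answer 4, the standard error of the estimate for the machine-hours example can be sketched as (illustrative only; the coefficients are the rounded estimates from Question 1):

```python
import math

hours = [500, 800, 1500, 2500, 6000]
costs = [1000, 2000, 2500, 7000, 9000]
a, b = 1025.9, 1.449  # estimates from Question 1 (rounded)

# Steps 1-2: predicted values and the error sum of squares (SSE)
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(hours, costs))

# Step 3: scale by n - k - 1 and take the square root
n, k = len(hours), 1
standard_error = math.sqrt(sse / (n - k - 1))  # roughly $1,540
```

A standard error of roughly $1,540 means a two-standard-error prediction interval spans about ±$3,080, which illustrates the caveat above: a high R² does not by itself guarantee practically useful predictions.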
Question 5: What are the similarities between simple linear regression analysis and multiple regression analysis?

Answer 5: The multiple regression equation takes the form: y = a + b1x1 + b2x2 + ... + bnxn + e. In multiple regression, more than one independent variable explains the one dependent variable. This is expected to reduce the standard error. The b's are the regression coefficients, each representing the amount the dependent variable y changes when the corresponding independent variable changes 1 unit and the others are held constant. The a is the constant, where the regression line intercepts the y axis, representing the amount the dependent y will be when all the independent variables are 0. Associated with multiple regression is R², the squared multiple correlation, which is the percent of variance in the dependent variable explained collectively by all of the independent variables. While for simple linear regression R² = r² (that is, the simple correlation coefficient squared), for multiple regression the same equality does not hold for the individual pairwise correlations. Like simple linear regression, multiple regression shares all the assumptions of correlation: linearity of relationships, the same level of relationship throughout the range of the independent variable ("homoscedasticity"), interval or near-interval data, and data whose range is not truncated. In addition, it is important that the model being tested is correctly specified. The exclusion of important causal variables or the inclusion of extraneous variables can markedly change the estimated coefficients and thus the interpretation of the importance of the independent variables.

Question 6: What are the basic assumptions of regression analysis?

Answer 6: There are four general assumptions of regression analysis.

1. First, model errors are normally distributed.
2. Second, the mean of the model error terms is zero.

3. Third, the model error terms have a constant variance for all values and combinations of values of the independent variables ("homoscedasticity").

4. Finally, the error for each x is independent of the others.

Error, represented by the residuals, should be normally distributed for each set of values of the independents. A histogram of standardized residuals should show a roughly normal curve. An alternative for the same purpose is the normal probability plot, with the observed cumulative probabilities of occurrence of the standardized residuals on the Y axis and the expected normal probabilities of occurrence on the X axis, such that a 45-degree line will appear when the observed conforms to the normally expected. By the central limit theorem, even when error is not normally distributed, when sample size is large the sampling distribution of the b coefficients will still be approximately normal. Therefore violations of this assumption usually have little or no impact on substantive conclusions for large samples, but when sample size is small, tests of normality are important.

Next is the assumption that the error term is independent of the x variables. This is a critical regression assumption which, when violated, may lead to substantive misinterpretation of output. The (population) error term, which is the difference between the actual values of the dependent and those estimated by the population regression equation, should be uncorrelated with each of the independent variables. Since the population regression line is not known for sample data, the assumption must be assessed by theory. Specifically, one must be confident that the dependent is not also a cause of one or more of the independents, and that the variables not included in the equation are not causes of y that are correlated with the variables which are included. Either circumstance would violate the assumption of uncorrelated error.
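Assumption 2 can be illustrated numerically: when a least squares line with an intercept is fitted, the sample residuals average exactly zero by construction (a minimal sketch using the machine-hours data from Question 1). Note that the assumption itself concerns the unobservable population errors; the residuals averaging zero is a mechanical property of least squares, so it cannot be used to test the assumption.

```python
hours = [500, 800, 1500, 2500, 6000]
costs = [1000, 2000, 2500, 7000, 9000]

# Exact (unrounded) least squares coefficients
mean_x = sum(hours) / len(hours)
mean_y = sum(costs) / len(costs)
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, costs))
     / sum((x - mean_x) ** 2 for x in hours))
a = mean_y - b * mean_x

# Residuals (actual minus predicted) average ~0 when an intercept is included
residuals = [y - (a + b * x) for x, y in zip(hours, costs)]
mean_residual = sum(residuals) / len(residuals)
```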
One common type of correlated error occurs due to selection bias with regard to membership in the independent variable "group" (representing membership in a treatment vs. a comparison group): measured factors such as gender, race, and education may cause differential selection into the two groups and may also be correlated with the dependent variable. When there is correlated error, conventional computation of standard deviations, t-tests, and significance is biased and cannot be used validly.

Other assumptions are also part of regression. Some of the major ones are discussed here.

Model specification is critical: relevant variables must not be omitted. If relevant variables are omitted from the model, the common variance they share with included variables may be wrongly attributed to those variables, and the error term is inflated. Similarly, if causally irrelevant variables are included in the model, the common variance they share with included variables may be wrongly attributed to the irrelevant variables. Omission and irrelevancy can both substantially affect the size of the b coefficients. This is one reason why it is better to use regression to compare the relative fit of two models rather than to seek to establish the validity of a single model specification.

Continuous data (interval or ratio) are required, though it is common to use ordinal data. Dummy variables form a special case and are allowed in regression as independents. Dichotomies may be used as independents but not as the dependent variable. Use of a dichotomous dependent in regression violates the assumptions of normality and homoscedasticity, as a normal distribution is impossible with only two values. Also, when the values can only be 0 or 1, residuals will be low for the portions of the regression line near y = 0 and y = 1 but high in the middle; hence the error term will violate the assumption of homoscedasticity (equal variances) when a dichotomy is used as a dependent.

Unbounded data are an assumption. That is, the regression line produced can be extrapolated in both directions but is meaningful only within the upper and lower natural bounds of the dependent. Data are not censored, sample-selected, or truncated, and there are as many observations of the independents as of the dependents.

Absence of perfect multicollinearity is another assumption.
When there is perfect multicollinearity, there is no unique regression solution. Perfect multicollinearity occurs if independents are linear functions of each other (e.g., age and year of birth), when the researcher creates dummy variables for all values of a categorical variable rather than leaving one out, and when there are fewer observations than variables.

Regression analysis is a linear procedure. To the extent nonlinear relationships are present, conventional regression analysis will underestimate the relationship. Nonlinear transformation of selected variables may be a pre-processing step, but this is not common because it runs the danger of overfitting the model to what are, in fact, chance variations in the data. When nonlinearity is present, there may be a need for exponential or interaction terms.

The same underlying distribution is assumed for all variables. To the extent that an independent variable has a different underlying distribution compared to the dependent (bimodal vs. normal, for instance), a unit increase in the independent will have nonlinear impacts on the dependent. Even when independent/dependent data pairs are ordered perfectly, unit increases in the independent cannot be associated with fixed linear changes in the dependent. For instance, perfect ordering of a bimodal independent with a normal dependent will generate an s-shaped scatter plot not amenable to a linear solution. Linear regression will underestimate the correlation of the independent and dependent when they come from different underlying distributions.

Variable measurement must be reliable and valid. To the extent there is systematic error in the measurement of the variables, the regression coefficients will be simply wrong.

Independent observations (absence of autocorrelation) lead to uncorrelated error terms. Current values should not be correlated with previous values in a data series. This is often a problem with time series data, where many variables tend to increase over time such that knowing the value of the current observation helps one estimate the value of the previous observation.
That is, each observation should be independent of each other observation if the error terms are not to be correlated, which would in turn lead to biased estimates of standard deviations and significance.
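The perfect multicollinearity case described earlier (e.g., age and year of birth as independents) can be demonstrated numerically: the determinant of the centered X'X matrix is zero, so the normal equations have no unique solution. A minimal sketch with made-up ages:

```python
# Two independents that are exact linear functions of each other
age = [25, 32, 41, 58, 63]                # hypothetical respondents' ages
year_of_birth = [2020 - x for x in age]   # perfectly collinear with age

def deviations(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

x1, x2 = deviations(age), deviations(year_of_birth)
s11 = sum(d * d for d in x1)
s22 = sum(d * d for d in x2)
s12 = sum(d1 * d2 for d1, d2 in zip(x1, x2))

# Determinant of the 2x2 centered X'X matrix; zero means the least squares
# coefficients for the two variables are not uniquely determined
det = s11 * s22 - s12 ** 2
```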
More informationUnit 6 - Introduction to linear regression
Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,
More informationChapter 14 Student Lecture Notes 14-1
Chapter 14 Student Lecture Notes 14-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Building Chap 14-1 Chapter Goals After completing this
More informationRegression With a Categorical Independent Variable
Regression ith a Independent Variable ERSH 8320 Slide 1 of 34 Today s Lecture Regression with a single categorical independent variable. Today s Lecture Coding procedures for analysis. Dummy coding. Relationship
More information1 Motivation for Instrumental Variable (IV) Regression
ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data
More informationEcn Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman. Midterm 2. Name: ID Number: Section:
Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 You have until 10:20am to complete this exam. Please remember to put your name,
More informationAnalysing data: regression and correlation S6 and S7
Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association
More informationIntroduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017
Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent
More informationApplied Quantitative Methods II
Applied Quantitative Methods II Lecture 4: OLS and Statistics revision Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 4 VŠE, SS 2016/17 1 / 68 Outline 1 Econometric analysis Properties of an estimator
More informationApplied Econometrics Lecture 1
Lecture 1 1 1 Università di Urbino Università di Urbino PhD Programme in Global Studies Spring 2018 Outline of this module Beyond OLS (very brief sketch) Regression and causality: sources of endogeneity
More informationProject Report for STAT571 Statistical Methods Instructor: Dr. Ramon V. Leon. Wage Data Analysis. Yuanlei Zhang
Project Report for STAT7 Statistical Methods Instructor: Dr. Ramon V. Leon Wage Data Analysis Yuanlei Zhang 77--7 November, Part : Introduction Data Set The data set contains a random sample of observations
More informationCHAPTER 6: SPECIFICATION VARIABLES
Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero
More informationCorrelation Analysis
Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the
More informationECON 5350 Class Notes Functional Form and Structural Change
ECON 5350 Class Notes Functional Form and Structural Change 1 Introduction Although OLS is considered a linear estimator, it does not mean that the relationship between Y and X needs to be linear. In this
More informationRockefeller College University at Albany
Rockefeller College University at Albany PAD 705 Handout: Suggested Review Problems from Pindyck & Rubinfeld Original prepared by Professor Suzanne Cooper John F. Kennedy School of Government, Harvard
More informationstatistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:
Wooldridge, Introductory Econometrics, d ed. Chapter 3: Multiple regression analysis: Estimation In multiple regression analysis, we extend the simple (two-variable) regression model to consider the possibility
More informationMaking sense of Econometrics: Basics
Making sense of Econometrics: Basics Lecture 7: Multicollinearity Egypt Scholars Economic Society November 22, 2014 Assignment & feedback Multicollinearity enter classroom at room name c28efb78 http://b.socrative.com/login/student/
More informationClassification & Regression. Multicollinearity Intro to Nominal Data
Multicollinearity Intro to Nominal Let s Start With A Question y = β 0 + β 1 x 1 +β 2 x 2 y = Anxiety Level x 1 = heart rate x 2 = recorded pulse Since we can all agree heart rate and pulse are related,
More informationBusiness Statistics. Lecture 10: Correlation and Linear Regression
Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form
More informationBusiness Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal
Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing
More informationIntroduction to Econometrics. Heteroskedasticity
Introduction to Econometrics Introduction Heteroskedasticity When the variance of the errors changes across segments of the population, where the segments are determined by different values for the explanatory
More informationChapter 9 Regression. 9.1 Simple linear regression Linear models Least squares Predictions and residuals.
9.1 Simple linear regression 9.1.1 Linear models Response and eplanatory variables Chapter 9 Regression With bivariate data, it is often useful to predict the value of one variable (the response variable,
More informationInference with Simple Regression
1 Introduction Inference with Simple Regression Alan B. Gelder 06E:071, The University of Iowa 1 Moving to infinite means: In this course we have seen one-mean problems, twomean problems, and problems
More information6. Assessing studies based on multiple regression
6. Assessing studies based on multiple regression Questions of this section: What makes a study using multiple regression (un)reliable? When does multiple regression provide a useful estimate of the causal
More informationIV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors
IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors Laura Mayoral IAE, Barcelona GSE and University of Gothenburg Gothenburg, May 2015 Roadmap Deviations from the standard
More informationWooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems
Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Functional form misspecification We may have a model that is correctly specified, in terms of including
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationDEMAND ESTIMATION (PART III)
BEC 30325: MANAGERIAL ECONOMICS Session 04 DEMAND ESTIMATION (PART III) Dr. Sumudu Perera Session Outline 2 Multiple Regression Model Test the Goodness of Fit Coefficient of Determination F Statistic t
More informationChapter 12 - Part I: Correlation Analysis
ST coursework due Friday, April - Chapter - Part I: Correlation Analysis Textbook Assignment Page - # Page - #, Page - # Lab Assignment # (available on ST webpage) GOALS When you have completed this lecture,
More informationCan you tell the relationship between students SAT scores and their college grades?
Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower
More information6. Dummy variable regression
6. Dummy variable regression Why include a qualitative independent variable?........................................ 2 Simplest model 3 Simplest case.............................................................
More informationMultiple Regression Methods
Chapter 1: Multiple Regression Methods Hildebrand, Ott and Gray Basic Statistical Ideas for Managers Second Edition 1 Learning Objectives for Ch. 1 The Multiple Linear Regression Model How to interpret
More informationISQS 5349 Spring 2013 Final Exam
ISQS 5349 Spring 2013 Final Exam Name: General Instructions: Closed books, notes, no electronic devices. Points (out of 200) are in parentheses. Put written answers on separate paper; multiple choices
More informationFriday, March 15, 13. Mul$ple Regression
Mul$ple Regression Mul$ple Regression I have a hypothesis about the effect of X on Y. Why might we need addi$onal variables? Confounding variables Condi$onal independence Reduce/eliminate bias in es$mates
More informationChapter Fifteen. Frequency Distribution, Cross-Tabulation, and Hypothesis Testing
Chapter Fifteen Frequency Distribution, Cross-Tabulation, and Hypothesis Testing Copyright 2010 Pearson Education, Inc. publishing as Prentice Hall 15-1 Internet Usage Data Table 15.1 Respondent Sex Familiarity
More informationSTOCKHOLM UNIVERSITY Department of Economics Course name: Empirical Methods Course code: EC40 Examiner: Lena Nekby Number of credits: 7,5 credits Date of exam: Saturday, May 9, 008 Examination time: 3
More informationFNCE 926 Empirical Methods in CF
FNCE 926 Empirical Methods in CF Lecture 2 Linear Regression II Professor Todd Gormley Today's Agenda n Quick review n Finish discussion of linear regression q Hypothesis testing n n Standard errors Robustness,
More informationBasic Business Statistics, 10/e
Chapter 4 4- Basic Business Statistics th Edition Chapter 4 Introduction to Multiple Regression Basic Business Statistics, e 9 Prentice-Hall, Inc. Chap 4- Learning Objectives In this chapter, you learn:
More informationChapter 16. Simple Linear Regression and dcorrelation
Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationWISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A
WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2015-16 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This
More informationWooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares
Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit
More informationCh 13 & 14 - Regression Analysis
Ch 3 & 4 - Regression Analysis Simple Regression Model I. Multiple Choice:. A simple regression is a regression model that contains a. only one independent variable b. only one dependent variable c. more
More information1 Linear Regression Analysis The Mincer Wage Equation Data Econometric Model Estimation... 11
Econ 495 - Econometric Review 1 Contents 1 Linear Regression Analysis 4 1.1 The Mincer Wage Equation................. 4 1.2 Data............................. 6 1.3 Econometric Model.....................
More informationUnit 6 - Simple linear regression
Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable
More informationLecture (chapter 13): Association between variables measured at the interval-ratio level
Lecture (chapter 13): Association between variables measured at the interval-ratio level Ernesto F. L. Amaral April 9 11, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015.
More information