Applied Quantitative Methods II


1 Applied Quantitative Methods II
Lecture 4: OLS and Statistics revision
Klára Kalíšková
AQM II - Lecture 4, VŠE, SS 2016/17

2 Outline
1 Econometric analysis
   - Properties of an estimator
   - Steps in empirical analysis
   - Assumptions
   - Choice of functional form
   - Dummy variables
   - Hypothesis testing
2 Conclusion
3 Appendix - stats recap

3 Properties of an estimator
An estimator is good when it is:
- unbiased: the mean of its sampling distribution is equal to the value of the population parameter
- consistent: it converges to the value of the true parameter as the sample size increases
- efficient: the variance of its sampling distribution is smaller than the variance of the distribution of any other unbiased estimator
Be careful: distinguish between the variance (and standard error) of the estimator and the variance (standard deviation) of the sample/population!
OLS: under the Gauss-Markov assumptions, OLS is BLUE (best linear unbiased estimator)

4 Properties of a good estimator - example
Let $X_i$ be observations sampled from a distribution with mean $\mu$ and variance $\sigma^2$
Let us consider the sample mean $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ as an estimator of $\mu$
It can be shown that:
1 $E[\bar{X}_n] = \mu$
2 $\bar{X}_n \to \mu$ as $n$ increases
3 $\bar{X}_n$ has the smallest variance among all linear unbiased estimators of $\mu$
Hence, $\bar{X}_n$ is an unbiased, consistent and efficient estimator of $\mu$
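The three properties of the sample mean can be checked by simulation. The sketch below is only an illustration under assumed values (normal draws with $\mu = 5$, $\sigma = 2$, and the hypothetical helper names `sample_mean` and `simulate`); it is not part of the original slides. Across many replications the average of $\bar{X}_n$ should stay close to $\mu$, and its variance should shrink roughly like $\sigma^2 / n$.

```python
import random
import statistics


def sample_mean(n, mu, sigma, rng):
    """Mean of n independent draws from N(mu, sigma^2)."""
    return statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))


def simulate(n, replications=1000, mu=5.0, sigma=2.0, seed=0):
    """Return (average, variance) of the sample mean across replications."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = [sample_mean(n, mu, sigma, rng) for _ in range(replications)]
    return statistics.fmean(means), statistics.variance(means)


for n in (10, 100, 1000):
    avg, var = simulate(n)
    # Compare the simulated variance of the estimator with sigma^2 / n = 4 / n.
    print(f"n={n:5d}  mean of X_bar={avg:.3f}  var of X_bar={var:.5f}  sigma^2/n={4.0 / n:.5f}")
```

Note the distinction stressed on the previous slide: the printed variance is the variance of the estimator $\bar{X}_n$, not the variance $\sigma^2$ of the individual observations.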

5 Steps of an empirical analysis
1 Finding an idea
2 Formulation of an economic model (rigorous or intuitive)
3 Formulation of an econometric model based on the economic model
4 Collection of data
5 Estimation of the econometric model
6 Interpretation of results
Reading: Wooldridge: Introductory Econometrics

6 Example - Economic model
Denote:
p ... price of the good
c ... firm's average cost per one unit of output
q(p) ... demand for the firm's output
Firm profit: $\pi = q(p)\,(p - c)$
Demand for the good: $q(p) = a - b\,p$
Derive: $q = \frac{a}{2} - \frac{b}{2}\,c$
We call q the dependent variable and c the explanatory variable
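The "Derive" step can be filled in as follows. This is a sketch that assumes the firm chooses the price $p$ to maximize profit, which the slide does not state explicitly:

```latex
\begin{aligned}
\pi(p) &= (a - b p)(p - c) \\
\frac{d\pi}{dp} &= a - 2 b p + b c = 0
  \;\Rightarrow\; p^{*} = \frac{a + b c}{2 b} \\
q(p^{*}) &= a - b \cdot \frac{a + b c}{2 b} = \frac{a}{2} - \frac{b}{2}\, c
\end{aligned}
```

Under that assumption, output is a linear function of average cost with intercept $a/2$ and slope $-b/2$.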

7 Example - Econometric model
Write the relationship in a simple linear form: $q = \beta_0 + \beta_1 c$ (keep in mind that $\beta_0 = \frac{a}{2}$ and $\beta_1 = -\frac{b}{2}$)
There are other (unpredictable) things that influence firms' sales, so add a disturbance term: $q = \beta_0 + \beta_1 c + \varepsilon$
Find the values of the parameters $\beta_1$ and $\beta_0$

8 Example - Data
Ideally: investigate all firms in the economy
Really: investigate a sample of firms
We need a random (unbiased) sample of firms
Collect data for each firm: q and c

9 Example - Data
[Figure: output plotted against average cost]

10 Example - Estimation
[Figure: output plotted against average cost]

11 Example - Estimation
OLS method:
- Make the fit as good as possible
- Make the misfit as low as possible
- Minimize the (vertical) distance between data points and regression line
- Minimize the sum of squared deviations

12 Ordinary Least Squares
OLS = fitting the regression line by minimizing the sum of squared vertical distances between the regression line and the observed points
[Figure: output plotted against average cost, with the fitted regression line]
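For a single explanatory variable, minimizing the sum of squared deviations has a well-known closed-form solution. The following is a minimal sketch with made-up data for five hypothetical firms (the numbers and the helper names `ols_simple` and `sum_squared_residuals` are illustrative assumptions, not the data from the slides):

```python
def ols_simple(x, y):
    """Closed-form OLS for y = b0 + b1 * x + error; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # slope: sample covariance over sample variance
    b0 = y_bar - b1 * x_bar    # the fitted line passes through (x_bar, y_bar)
    return b0, b1


def sum_squared_residuals(x, y, b0, b1):
    """Objective that OLS minimizes: sum of squared vertical distances."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data: average cost (x) and output (y) of five firms.
cost = [2.0, 3.0, 4.0, 5.0, 6.0]
output = [9.1, 8.4, 7.9, 7.1, 6.5]

b0, b1 = ols_simple(cost, output)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"SSR at the OLS estimates = {sum_squared_residuals(cost, output, b0, b1):.4f}")
```

Any other line, for example one with a slightly different slope, gives a larger sum of squared residuals on the same data.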

13 Terminology
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ ... regression line
$y_i$ ... dependent/explained variable (i-th observation)
$x_i$ ... independent/explanatory variable (i-th observation)
$\varepsilon_i$ ... random error term/disturbance (of the i-th observation)
$\beta_0$ ... intercept parameter ($\hat\beta_0$ ... estimate of this parameter)
$\beta_1$ ... slope parameter ($\hat\beta_1$ ... estimate of this parameter)

14 OLS assumptions: violations
- Regression model is linear in parameters and correctly specified
- Error term has a zero population mean and constant variance: $u_i \sim iid(0, \sigma^2)$
- Observations of the error term are uncorrelated with each other (time series)
- All explanatory variables are uncorrelated with the error term (zero conditional mean of errors)
- No explanatory variable is an (almost) perfect linear function of any other explanatory variable
- (The error term is normally distributed)

15 OLS assumptions: violations
Regression model is linear in parameters and correctly specified
- if not linear, we can rewrite it so that it is linear
- if misspecified, then estimates are biased: modify the functional form to obtain the correct specification

16 OLS assumptions: violations
Error term has a zero population mean and constant variance: $u_i \sim iid(0, \sigma^2)$
- zero mean: if not, estimates are biased. Solution: include an intercept in the equation
- constant variance = homoskedasticity: if the variance is not constant, we have heteroskedasticity and the standard errors (SE) of the estimates are unreliable. Solution: White's variance-covariance matrix (robust SE)

17 OLS assumptions: violations
Observations of the error term are uncorrelated with each other (time series)
- correlated error terms = serial correlation
- under serial correlation, SE are unreliable (similar to heteroskedasticity)
- Solution: e.g. ARMA adjustment of the error term (not in this course)

18 OLS assumptions: violations
All explanatory variables are uncorrelated with the error term (zero conditional mean of errors)
- The most important assumption!
- if violated, estimates of coefficients are biased
- e.g. selection bias, omitted variable bias, reversed causality, measurement error
- Solution: include more control variables, use econometric methods to solve the bias (next lectures)

19 OLS assumptions: violations
No explanatory variable is an (almost) perfect linear function of any other explanatory variable
- if violated, we call it multicollinearity
- Detect multicollinearity: look at the correlation coefficients of the explanatory variables; VIF (variance inflation factor) > 10
- under perfect multicollinearity the coefficients cannot be estimated; under near-perfect multicollinearity the estimates remain unbiased but become very imprecise, with large standard errors (OLS cannot distinguish between the explanatory variables)
- Solution: exclude one of the variables
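The VIF rule of thumb can be illustrated for the simplest case. The VIF of a regressor is $1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing that regressor on the other explanatory variables; with only two regressors, $R_j^2$ is just the squared correlation between them. The sketch below uses made-up data and hypothetical helper names:

```python
import math


def pearson_corr(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    cov = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    var_u = sum((a - u_bar) ** 2 for a in u)
    var_v = sum((b - v_bar) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - R^2); with two regressors R^2 equals corr(x1, x2)^2."""
    r = pearson_corr(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical regressors: x2 is almost a copy of x1, x3 is only moderately related.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
x3 = [2.0, 1.0, 4.0, 3.0, 5.0]

print(f"VIF(x1, x2) = {vif_two_regressors(x1, x2):.1f}")  # far above the threshold of 10
print(f"VIF(x1, x3) = {vif_two_regressors(x1, x3):.2f}")  # below the threshold
```

With more than two regressors, each $R_j^2$ has to come from a full auxiliary regression, which this two-variable shortcut does not cover.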

20 OLS assumptions: violations
(The error term is normally distributed)
- OK if the sample size is large (central limit theorem)
- if not satisfied in a small sample, the usual t and F statistics do not follow their exact distributions, and we need to use special regression types or test statistics (e.g. not a t-test, but some non-parametric test)

21 OLS assumptions: Summary
What to do to make sure the assumptions are satisfied?
- include an intercept in the equation
- think about the correct functional form (linear vs non-linear, see next slides)
- use robust standard errors to deal with possible heteroskedasticity (most cross-sectional data have a heteroskedasticity problem)
- you need not worry about serial correlation unless you use time series data
- check for multicollinearity and use common sense (including three different measures of one phenomenon likely leads to multicollinearity)
- when you have a large sample, you need not worry about the normality assumption (if the sample is small, use a non-parametric test to test the significance of coefficients and the validity of your hypotheses)
Thus, the only assumption that might be difficult to satisfy is that the explanatory variables are uncorrelated with the error term: see next lectures!

22 Linear model and nonlinear specification
OLS assumption: the equation is linear in coefficients
However, it can be nonlinear in variables:
- $\ln y = \beta_0 + \beta_1 \ln x + \beta_2 z + \varepsilon$ is a linear model
- $y = x^{\beta_1} + \varepsilon$ is NOT a linear model
We have to choose the functional form carefully: it should be based on the underlying economic theory and/or intuition
- Do we expect a complicated curve instead of a straight line?
- Does the effect of a variable peak at some point and then start to decline?

23 Linear form
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
Assumes that the effect of the explanatory variable on the dependent variable is constant: $\frac{\partial y}{\partial x_k} = \beta_k$, $k = 1, 2$
Interpretation: if $x_k$ increases by 1 x-unit, then y will change by $\beta_k$ y-units (marginal effect)
The linear form is used as the default functional form unless there is strong evidence against it

24 Double-log form
$\ln y = \beta_0 + \beta_1 \ln x_1 + \beta_2 \ln x_2 + \varepsilon$
Assumes that the elasticity of the dependent variable with respect to the explanatory variable is constant: $\frac{\partial \ln y}{\partial \ln x_k} = \frac{\partial y / y}{\partial x_k / x_k} = \beta_k$, $k = 1, 2$
Interpretation: if $x_k$ increases by 1 percent, y changes by $\beta_k$ percent
Hint: before using a double-log model, make sure that there are no negative or zero observations in the data set

26 Example
Estimating the production function of the Indian sugar industry:
$\ln Q = \hat\beta_0 + 0.59 \ln L + \hat\beta_2 \ln K$, with standard errors (0.14) for $\ln L$ and (0.17) for $\ln K$
(the numerical values of $\hat\beta_0$ and $\hat\beta_2$ are missing from this transcription)
Q ... output
L ... labor
K ... capital employed
Interpretation: if we increase labor by 1%, production of sugar increases by 0.59%, ceteris paribus.
Ceteris paribus is a Latin phrase meaning "other things being equal".

27 Semilog forms
Linear-log form: $y = \beta_0 + \beta_1 \ln x_1 + \beta_2 \ln x_2 + \varepsilon$
Interpretation: if $x_k$ increases by 1 percent, y changes by $\beta_k / 100$ units ($k = 1, 2$)
Log-linear form: $\ln y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
Interpretation: if $x_k$ increases by 1 unit, then y will change by $\beta_k \cdot 100$ percent ($k = 1, 2$)

29 Example of semilog forms
Estimating the influence of education and experience on wages:
$\ln wage = \hat\beta_0 + 0.098\, educ + 0.010\, exper$, with standard errors (0.008) for educ and (0.002) for exper
(the numerical value of $\hat\beta_0$ is missing from this transcription)
wage ... annual wage (USD)
educ ... years of education
exper ... years of experience
Interpretation:
- an increase in education by one year increases the annual wage by 9.8%, ceteris paribus
- an increase in experience by one year increases the annual wage by 1%, ceteris paribus
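The "$\beta_k \cdot 100$ percent" reading of a log-linear coefficient is an approximation that works well for small coefficients. A standard refinement, added here as a side note rather than taken from the slides, is the exact change $100 \cdot (e^{\beta_k} - 1)$ percent. The sketch below compares the two for the coefficients quoted in the interpretation above (function names are hypothetical):

```python
import math


def approx_pct_change(beta, delta_x=1.0):
    """Approximate percent change in y for a change delta_x in x (log-linear model)."""
    return 100.0 * beta * delta_x


def exact_pct_change(beta, delta_x=1.0):
    """Exact percent change in y implied by ln y = ... + beta * x."""
    return 100.0 * (math.exp(beta * delta_x) - 1.0)


for name, beta in (("educ", 0.098), ("exper", 0.010)):
    print(f"{name}: approx {approx_pct_change(beta):.2f}%, exact {exact_pct_change(beta):.2f}%")
```

The gap between the two numbers grows with the size of the coefficient, so the approximation is safest for effects of a few percent.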

30 Polynomial form (quadratic)
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \varepsilon$
To determine the effect of $x_1$ on y, we need to calculate the derivative: $\frac{\partial y}{\partial x_1} = \beta_1 + 2 \beta_2 x_1$
Clearly, the effect of $x_1$ on y is not constant, but changes with the level of $x_1$
Turning point: $x_1^* = -\hat\beta_1 / (2 \hat\beta_2)$
We might also have higher-order polynomials, e.g.: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_1^3 + \beta_4 x_1^4 + \varepsilon$
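The marginal effect and the turning point of a quadratic specification are easy to compute once the coefficients are estimated. A small sketch with made-up coefficients (the values 4.0 and -0.1 and the function names are assumptions chosen for illustration only):

```python
def marginal_effect(b1, b2, x):
    """Derivative of b0 + b1 * x + b2 * x^2 with respect to x."""
    return b1 + 2.0 * b2 * x


def turning_point(b1, b2):
    """Value of x at which the marginal effect is zero."""
    if b2 == 0:
        raise ValueError("no turning point: the model is linear in x")
    return -b1 / (2.0 * b2)


# Hypothetical estimates: positive linear term, negative quadratic term.
b1_hat, b2_hat = 4.0, -0.1

for x in (0.0, 10.0, 20.0, 30.0):
    print(f"x = {x:4.1f}: marginal effect = {marginal_effect(b1_hat, b2_hat, x):+.1f}")
print(f"turning point at x = {turning_point(b1_hat, b2_hat):.1f}")
```

With a negative quadratic coefficient the effect is positive but shrinking up to the turning point and negative beyond it.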

32 Example of polynomial form
Impact of the number of hours of studying on the grade from AQM II:
$\widehat{grade} = \hat\beta_0 + \hat\beta_1\, hours + \hat\beta_2\, hours^2$
(the numerical estimates are missing from this transcription)
To determine the effect of hours on grade, calculate the derivative: $\frac{\partial y}{\partial x} = \frac{\partial\, grade}{\partial\, hours} = \hat\beta_1 + 2 \hat\beta_2\, hours$
Decreasing returns to hours of studying: more hours imply a higher grade, but the positive effect decreases with more hours

33 Interactions
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
The effect of $x_1$ on y may depend on the magnitude of $x_2$: $\frac{\partial y}{\partial x_1} = \beta_1 + \beta_3 x_2$
If $\beta_3 > 0$, a higher $x_2$ means a higher effect of $x_1$

34 Choice of correct functional form
- Functional form has to be correctly specified in order to avoid biased and inconsistent estimates
- One of the OLS assumptions is that the model is correctly specified
- Ideally: specification given by the underlying theory of the equation
- In reality: the underlying theory does not give a precise functional form
- In most cases, either the linear form is adequate, or common sense will point out an easy choice from among the alternatives

35 Choice of correct functional form
Nonlinearity of explanatory variables
- often approximated by a polynomial form
- missing higher powers of a variable can be detected as omitted variables
Nonlinearity of dependent variable
- harder to detect based on the statistical fit of the regression
- $R^2$ is incomparable across models where y is transformed
- dependent variables are often transformed to log form in order to make their distribution closer to the normal distribution

36 Dummy variables: Intercept dummy
Dummy variable: takes values of 0 or 1, depending on a qualitative attribute (e.g. Female)
A dummy variable included in a regression alone is an intercept dummy
It changes the intercept for the subset of data defined by the dummy variable condition:
$y_i = \beta_0 + \beta_1 D_i + \beta_2 x_i + \varepsilon_i$
where $D_i = 1$ if the i-th observation meets a particular condition and $D_i = 0$ otherwise
We have
$y_i = (\beta_0 + \beta_1) + \beta_2 x_i + \varepsilon_i$ if $D_i = 1$
$y_i = \beta_0 + \beta_2 x_i + \varepsilon_i$ if $D_i = 0$

37 Intercept dummy
[Figure: two parallel regression lines of Y against X, both with slope $\beta_2$; the line for $D_i = 1$ has intercept $\beta_0 + \beta_1$, the line for $D_i = 0$ has intercept $\beta_0$]

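39 Example
Estimating the determinants of wages:
$\widehat{wage}_i = \hat\beta_0 + 2.156\, M_i + \hat\beta_2\, educ_i + \hat\beta_3\, exper_i$, with standard errors (0.270) for $M_i$, (0.051) for $educ_i$ and (0.064) for $exper_i$
(the numerical values of $\hat\beta_0$, $\hat\beta_2$ and $\hat\beta_3$ are missing from this transcription)
where $M_i = 1$ if the i-th person is male and $M_i = 0$ if the i-th person is female
wage ... average hourly wage in USD
Interpretation: men earn on average $2.156 per hour more than women, ceteris paribus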

40 Slope dummy
If a dummy variable is interacted with another variable (x), it is a slope dummy
It changes the relationship between x and y for the subset of data defined by the dummy variable condition:
$y_i = \beta_0 + \beta_1 x_i + \beta_2 (x_i D_i) + \varepsilon_i$
where $D_i = 1$ if the i-th observation meets a particular condition and $D_i = 0$ otherwise
We have
$y_i = \beta_0 + (\beta_1 + \beta_2) x_i + \varepsilon_i$ if $D_i = 1$
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ if $D_i = 0$

41 Slope dummy
[Figure: two regression lines of Y against X with common intercept $\beta_0$; the line for $D_i = 1$ has slope $\beta_1 + \beta_2$, the line for $D_i = 0$ has slope $\beta_1$]

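43 Example
Estimating the determinants of wages:
$\widehat{wage}_i = \hat\beta_0 + \hat\beta_1\, educ_i + 0.17\, M_i \cdot educ_i + \hat\beta_3\, exper_i$, with standard errors (0.054) for $educ_i$, (0.021) for $M_i \cdot educ_i$ and (0.065) for $exper_i$
(the numerical values of $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_3$ are missing from this transcription)
where $M_i = 1$ if the i-th person is male and $M_i = 0$ if the i-th person is female
wage ... average hourly wage in USD
Interpretation: men get on average 17 cents per hour more than women for each additional year of education, ceteris paribus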

44 Slope and intercept dummies
Allow for both a different slope and a different intercept for two subsets of data:
$y_i = \beta_0 + \beta_1 D_i + \beta_2 x_i + \beta_3 (x_i D_i) + \varepsilon_i$
where $D_i = 1$ if the i-th observation meets a particular condition and $D_i = 0$ otherwise
We have
$y_i = (\beta_0 + \beta_1) + (\beta_2 + \beta_3) x_i + \varepsilon_i$ if $D_i = 1$
$y_i = \beta_0 + \beta_2 x_i + \varepsilon_i$ if $D_i = 0$

45 Slope and intercept dummies
[Figure: two regression lines of Y against X; the line for $D_i = 1$ has intercept $\beta_0 + \beta_1$ and slope $\beta_2 + \beta_3$, the line for $D_i = 0$ has intercept $\beta_0$ and slope $\beta_2$]

46 Dummy variable for each category
What if a variable defines three (or more) attributes?
Example: level of education (elementary school, high school, and college)
Define and use a set of dummy variables: H = 1 if high school, 0 otherwise; and C = 1 if college, 0 otherwise
Should we also include a third dummy in the regression, equal to 1 for people with elementary education?
No, unless we exclude the intercept!
Using the full set of dummies leads to perfect multicollinearity (the dummy variable trap)
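The dummy variable trap can be seen directly in the data. In the sketch below (made-up observations, hypothetical helper `make_dummies`), dropping a base category gives the H and C dummies from the slide, while the full set of dummies sums to 1 in every row and therefore duplicates the intercept column:

```python
def make_dummies(values, base=None):
    """One 0/1 column per category; the base category (if given) is left out."""
    categories = sorted(set(values) - {base})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical sample of five people.
education = ["elementary", "high school", "college", "high school", "elementary"]

# Correct coding: elementary school is the base category, captured by the intercept.
dummies = make_dummies(education, base="elementary")
print(dummies)

# Full set of dummies: each row sums to 1, exactly like the intercept column.
full = make_dummies(education)
row_sums = [sum(col[i] for col in full.values()) for i in range(len(education))]
print(row_sums)
```

Because the row sums reproduce the constant term, the intercept is a perfect linear function of the three dummies, which is the perfect multicollinearity described above.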

47 Hypothesis testing
Once we have our model correctly specified, we can proceed to estimation and testing of our hypothesis.
Basic principles of hypothesis testing:
- We cannot prove that a given hypothesis is correct using hypothesis testing
- We can only reject a given hypothesis with a certain degree of confidence, because our sample does not conform to the hypothesis
- In such a case, we conclude that it is very unlikely the sample result would have been observed if the hypothesized theory were correct

48 Null and Alternative Hypotheses
First step: state the hypothesis explicitly
Null hypothesis: statement of the range of values of the regression coefficient that would be expected to occur if the researcher's theory were not correct
Alternative hypothesis: specification of the range of values of the coefficient that would be expected to occur if the researcher's theory were correct

49 Null and Alternative Hypotheses
Notation:
$H_0$ ... null hypothesis
$H_A$ ... alternative hypothesis
Examples:
One-sided test: $H_0: \beta \le 0$, $H_A: \beta > 0$
Two-sided test: $H_0: \beta = 0$, $H_A: \beta \ne 0$

50 Type I and type II errors
It would be unrealistic to think that conclusions drawn from regression analysis will always be right
There are two types of errors we can make:
- Type I: we reject a true null hypothesis
- Type II: we do not reject a false null hypothesis
Example: $H_0: \beta = 0$, $H_A: \beta \ne 0$
- Type I error: it holds that $\beta = 0$, but we conclude that $\beta \ne 0$
- Type II error: it holds that $\beta \ne 0$, but we conclude that $\beta = 0$

51 Type I and type II errors
Example
$H_0$: The defendant is innocent
$H_A$: The defendant is guilty
Type I error = sending an innocent person to jail
Type II error = freeing a guilty person
Obviously, lowering the probability of Type I error means increasing the probability of Type II error
In hypothesis testing, we focus on Type I error and we ensure that its probability is not unreasonably large

53 Decision rule
1 Calculate the sample statistic
2 Compare it with the critical value (from statistical tables)
The critical value divides the range of possible values of the statistic into two regions: the acceptance region and the rejection region
- If the sample statistic falls into the rejection region, we reject $H_0$
- If the sample statistic falls into the acceptance region, we do not reject $H_0$
The idea is that if the value of the coefficient does not support $H_0$, the sample statistic should fall into the rejection region
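The decision rule can be sketched in a few lines. As a simplifying assumption, the example uses standard normal critical values (a large-sample approximation) instead of looking up the t distribution in statistical tables; the function names are hypothetical. The first test uses the coefficient 2.156 and standard error 0.270 quoted in the wage example earlier in the lecture; the second uses made-up numbers.

```python
from statistics import NormalDist


def critical_value(alpha, two_sided=True):
    """Large-sample critical value from the standard normal distribution."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def reject_null(beta_hat, se, alpha=0.05, two_sided=True, beta_null=0.0):
    """Apply the decision rule: True means the statistic is in the rejection region."""
    t_stat = (beta_hat - beta_null) / se
    crit = critical_value(alpha, two_sided)
    if two_sided:
        return abs(t_stat) > crit
    return t_stat > crit  # one-sided test of H0: beta <= beta_null


print(f"two-sided 5% critical value: {critical_value(0.05):.3f}")
print(f"one-sided 5% critical value: {critical_value(0.05, two_sided=False):.3f}")
print("reject H0 (beta_hat=2.156, se=0.270):", reject_null(2.156, 0.270))
print("reject H0 (beta_hat=0.100, se=0.200):", reject_null(0.100, 0.200))
```

In small samples the critical values should come from the t distribution with the appropriate degrees of freedom, which are somewhat larger than the normal ones.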

54 One-sided rejection region
$H_0: \beta \le 0$ vs $H_A: \beta > 0$
[Figure: distribution of $\hat\beta$, with the acceptance region and, in the right tail, the rejection region; the area of the rejection region is the probability of Type I error]

55 Two-sided rejection region
$H_0: \beta = 0$ vs $H_A: \beta \ne 0$
[Figure: distribution of $\hat\beta$, with the acceptance region in the middle and a rejection region in each tail; the combined area of the rejection regions is the probability of Type I error]

56 Level of significance
Classical approach to hypothesis testing: first choose the significance level (e.g. 5%), then test the hypothesis
- there is no correct significance level
- standard values: * 10%, ** 5%, *** 1%, (**** 0.1%)

57 The p-value
Or we can ask: what is the smallest significance level at which the null hypothesis would still be rejected? This is the p-value.
Intuition: if I repeated the experiment 100 times and there were no effect, in how many cases would I get a result at least this extreme?
Remember: the significance level describes the probability of a type I error if the null is true
The smaller the p-value, the smaller the probability of rejecting a true null hypothesis, and the greater the confidence that the null hypothesis is indeed correctly rejected
The p-value for $H_0: \beta = 0$ is displayed in most regression outputs
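A two-sided p-value for $H_0: \beta = \beta_{null}$ can be computed from the estimate and its standard error. The sketch below assumes a large sample, so that the test statistic is approximately standard normal (regression software typically uses the t distribution instead); the function name and the example numbers are illustrative assumptions.

```python
from statistics import NormalDist


def p_value_two_sided(beta_hat, se, beta_null=0.0):
    """Probability, under H0, of a statistic at least as extreme as the observed one."""
    z = abs((beta_hat - beta_null) / se)
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Hypothetical estimates with the same standard error but different magnitudes.
for beta_hat in (0.5, 1.0, 1.96, 3.0):
    print(f"beta_hat = {beta_hat:4.2f}, se = 1.00 -> p-value = {p_value_two_sided(beta_hat, 1.0):.4f}")
```

The null hypothesis is rejected at significance level $\alpha$ exactly when the p-value is below $\alpha$, which is why the p-value can be read as the smallest significance level at which rejection still occurs.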

59 Multiple-comparisons: problem of chance
Example: "Eating chocolate results in weight loss"
- a serious clinical trial
- 5 men and 11 women over 3 weeks, randomly divided into 3 groups: control, low-carb diet, chocolate diet
- significant weight loss in the treatment groups
Why? 18 different measures! There is actually a 60% chance of some significant result
It could just as well have been "Chocolate improves sleeping" or whatever

60 Multiple-comparisons: problem of chance
Comparing a lot of variables by simple t-tests may bring significant results by chance (this is not a problem of OLS)
A measure is like a lottery ticket: the more measures, the higher the chance of winning
$p(\text{winning}) = 1 - (1 - p)^n$
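The lottery-ticket formula reproduces the 60% figure from the chocolate example. A minimal sketch, assuming the tests are independent and each uses a 5% significance level (the function name is hypothetical):

```python
def prob_at_least_one_false_positive(alpha, n_tests):
    """Chance of at least one significant result when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n in (1, 5, 18, 50):
    chance = prob_at_least_one_false_positive(0.05, n)
    print(f"{n:2d} measures -> {100 * chance:.1f}% chance of some significant result")
```

With 18 measures the chance is about 60%, even though each individual test has only a 5% probability of a Type I error.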

61 Multiple-comparisons: problem of chance
Solution: correction of the significance level
Bonferroni method
- simply divide the original alpha by the number of comparisons
- can become too conservative when many comparisons are made
Others: Tukey's honest significant difference test, LSD test, Scheffé, Dunnett, ...
Name for not correcting for multiple measures: p-hacking
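The Bonferroni method amounts to comparing each p-value with alpha divided by the number of comparisons. The sketch below uses made-up p-values and hypothetical function names:

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after the Bonferroni correction."""
    return alpha / n_tests


def bonferroni_reject(p_values, alpha=0.05):
    """For each p-value, True if the null is rejected at the corrected level."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical p-values from three separate t-tests.
p_values = [0.001, 0.020, 0.040]
print(f"corrected threshold: {bonferroni_alpha(0.05, len(p_values)):.4f}")
print(bonferroni_reject(p_values))
```

All three p-values are below the uncorrected 5% level, but only the first survives the correction, which illustrates why the method can become conservative as the number of comparisons grows.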

63 Conclusion
Now everybody should understand:
- OLS
- Gauss-Markov assumptions and problems with their violations
- Interpretation of coefficients in various functional forms
- Using dummy variables
- Hypothesis testing
- Multiple-comparison tests

65 Random variable
A random variable X is a variable whose numerical value is determined by chance. It is a quantification of the outcome of a random phenomenon.
Discrete random variable: has a countable number of possible values
Example: the number of times that a coin will be flipped before heads is obtained
Continuous random variable: can take on any value in an interval
Example: time until the first goal is scored in a football match between FC Barcelona and Real Madrid

66 Discrete random variables
Probability distribution of a variable X that can take values $x_1, x_2, x_3, \ldots$:
$P(X = x_1) = p_1$
$P(X = x_2) = p_2$
$P(X = x_3) = p_3$
...
Cumulative distribution function (CDF): $F_X(x) = P(X \le x) = \sum_{i:\, x_i \le x} P(X = x_i)$

67 Continuous random variables
Probability density function (PDF) $f_X(x)$: describes the relative likelihood for the random variable X to occur at a given point x
Cumulative distribution function (CDF): $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$
Example: Normal (Gaussian) distribution

68 Expected value and variance
Expected value (mean):
Discrete variable: $E[X] = \sum_{i=1}^{\infty} x_i P(X = x_i)$
Continuous variable: $E[X] = \int_{-\infty}^{+\infty} x f_X(x)\,dx$
Variance: $Var[X] = E\left[(X - E[X])^2\right] = E[X^2] - (E[X])^2$
Standard deviation: $\sigma_X = \sqrt{Var[X]}$
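For a discrete variable these formulas are plain weighted sums. A small sketch using a fair six-sided die as an assumed example (exact fractions avoid rounding; the function names are hypothetical):

```python
from fractions import Fraction


def expected_value(dist):
    """E[X] for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var[X] = E[(X - E[X])^2]."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def second_moment(dist):
    """E[X^2], used in the shortcut Var[X] = E[X^2] - (E[X])^2."""
    return sum(x ** 2 * p for x, p in dist.items())


die = {k: Fraction(1, 6) for k in range(1, 7)}

print("E[X]   =", expected_value(die))
print("Var[X] =", variance(die))
print("E[X^2] - (E[X])^2 =", second_moment(die) - expected_value(die) ** 2)
```

Both expressions for the variance give the same value, as the identity on the slide states.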

69 Covariance, correlation, independence
Covariance: $Cov(X, Y) = E\left[(X - E[X])(Y - E[Y])\right] = E[XY] - E[X]E[Y]$
(Pearson's) correlation: $Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$
Independence: X and Y are independent if the conditional probability distribution of X given the observed value of Y is the same as if the value of Y had not been observed.
If X and Y are independent, then $E[XY] = E[X]E[Y]$ and $Cov(X, Y) = 0$ (not the other way round in general)

70 Examples of correlation

71 Sample moments
Counterparts of the theoretical moments of the distribution of X, computed based on observations $X_1, \ldots, X_n$ drawn from this distribution
Sample mean: $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$
Sample variance: $S_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$
Sample covariance: $Cov_n(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)(Y_i - \bar{Y}_n)$
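The three sample moments translate directly into code. A minimal sketch on made-up observations (the data and function names are assumptions for illustration), using the $n - 1$ divisor from the formulas above:

```python
def sample_mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    x_bar = sample_mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    x_bar, y_bar = sample_mean(xs), sample_mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Hypothetical observations; y is exactly twice x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

print("sample mean of x:      ", sample_mean(x))
print("sample variance of x:  ", sample_variance(x))
print("sample covariance(x, y):", sample_covariance(x, y))
```

The sample covariance of a variable with itself equals its sample variance, which is a quick consistency check on the two functions.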

72 Vectors of random variables
Sometimes, we deal with vectors of random variables
Example: $\mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix}$
Expected value: $E[\mathbf{X}] = \begin{pmatrix} E[X_1] \\ E[X_2] \\ E[X_3] \end{pmatrix}$
Variance/covariance matrix:
$Var[\mathbf{X}] = \begin{pmatrix} Var[X_1] & Cov(X_1, X_2) & Cov(X_1, X_3) \\ Cov(X_2, X_1) & Var[X_2] & Cov(X_2, X_3) \\ Cov(X_3, X_1) & Cov(X_3, X_2) & Var[X_3] \end{pmatrix}$
Computational rule: $Var[A\mathbf{X}] = A\, Var[\mathbf{X}]\, A'$

73 Chi squared distribution
Chi-squared distribution with k degrees of freedom: $\chi^2_k$
Let $Z_i \sim N(0, 1)$ for each i, independent of each other; then $X = \sum_{i=1}^{k} Z_i^2 \sim \chi^2_k$


2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

Econometrics - 30C00200

Econometrics - 30C00200 Econometrics - 30C00200 Lecture 11: Heteroskedasticity Antti Saastamoinen VATT Institute for Economic Research Fall 2015 30C00200 Lecture 11: Heteroskedasticity 12.10.2015 Aalto University School of Business

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

Applied Quantitative Methods II

Applied Quantitative Methods II Applied Quantitative Methods II Lecture 10: Panel Data Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 10 VŠE, SS 2016/17 1 / 38 Outline 1 Introduction 2 Pooled OLS 3 First differences 4 Fixed effects

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

Testing for Discrimination

Testing for Discrimination Testing for Discrimination Spring 2010 Alicia Rosburg (ISU) Testing for Discrimination Spring 2010 1 / 40 Relevant Readings BFW Appendix 7A (pgs 250-255) Alicia Rosburg (ISU) Testing for Discrimination

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

ECON 4230 Intermediate Econometric Theory Exam

ECON 4230 Intermediate Econometric Theory Exam ECON 4230 Intermediate Econometric Theory Exam Multiple Choice (20 pts). Circle the best answer. 1. The Classical assumption of mean zero errors is satisfied if the regression model a) is linear in the

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

FNCE 926 Empirical Methods in CF

FNCE 926 Empirical Methods in CF FNCE 926 Empirical Methods in CF Lecture 2 Linear Regression II Professor Todd Gormley Today's Agenda n Quick review n Finish discussion of linear regression q Hypothesis testing n n Standard errors Robustness,

More information

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C = Economics 130 Lecture 6 Midterm Review Next Steps for the Class Multiple Regression Review & Issues Model Specification Issues Launching the Projects!!!!! Midterm results: AVG = 26.5 (88%) A = 27+ B =

More information

Regression #8: Loose Ends

Regression #8: Loose Ends Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2015-16 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

Steps in Regression Analysis

Steps in Regression Analysis MGMG 522 : Session #2 Learning to Use Regression Analysis & The Classical Model (Ch. 3 & 4) 2-1 Steps in Regression Analysis 1. Review the literature and develop the theoretical model 2. Specify the model:

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

REED TUTORIALS (Pty) LTD ECS3706 EXAM PACK

REED TUTORIALS (Pty) LTD ECS3706 EXAM PACK REED TUTORIALS (Pty) LTD ECS3706 EXAM PACK 1 ECONOMETRICS STUDY PACK MAY/JUNE 2016 Question 1 (a) (i) Describing economic reality (ii) Testing hypothesis about economic theory (iii) Forecasting future

More information

The general linear regression with k explanatory variables is just an extension of the simple regression as follows

The general linear regression with k explanatory variables is just an extension of the simple regression as follows 3. Multiple Regression Analysis The general linear regression with k explanatory variables is just an extension of the simple regression as follows (1) y i = β 0 + β 1 x i1 + + β k x ik + u i. Because

More information

Freeing up the Classical Assumptions. () Introductory Econometrics: Topic 5 1 / 94

Freeing up the Classical Assumptions. () Introductory Econometrics: Topic 5 1 / 94 Freeing up the Classical Assumptions () Introductory Econometrics: Topic 5 1 / 94 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions needed for derivations

More information

Making sense of Econometrics: Basics

Making sense of Econometrics: Basics Making sense of Econometrics: Basics Lecture 4: Qualitative influences and Heteroskedasticity Egypt Scholars Economic Society November 1, 2014 Assignment & feedback enter classroom at http://b.socrative.com/login/student/

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics T H I R D E D I T I O N Global Edition James H. Stock Harvard University Mark W. Watson Princeton University Boston Columbus Indianapolis New York San Francisco Upper Saddle

More information

Bias Variance Trade-off

Bias Variance Trade-off Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Linear Models in Econometrics

Linear Models in Econometrics Linear Models in Econometrics Nicky Grant At the most fundamental level econometrics is the development of statistical techniques suited primarily to answering economic questions and testing economic theories.

More information

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors: Wooldridge, Introductory Econometrics, d ed. Chapter 3: Multiple regression analysis: Estimation In multiple regression analysis, we extend the simple (two-variable) regression model to consider the possibility

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Heteroskedasticity Text: Wooldridge, 8 July 17, 2011 Heteroskedasticity Assumption of homoskedasticity, Var(u i x i1,..., x ik ) = E(u 2 i x i1,..., x ik ) = σ 2. That is, the

More information

Homoskedasticity. Var (u X) = σ 2. (23)

Homoskedasticity. Var (u X) = σ 2. (23) Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This

More information

Econometrics -- Final Exam (Sample)

Econometrics -- Final Exam (Sample) Econometrics -- Final Exam (Sample) 1) The sample regression line estimated by OLS A) has an intercept that is equal to zero. B) is the same as the population regression line. C) cannot have negative and

More information

Applied Health Economics (for B.Sc.)

Applied Health Economics (for B.Sc.) Applied Health Economics (for B.Sc.) Helmut Farbmacher Department of Economics University of Mannheim Autumn Semester 2017 Outlook 1 Linear models (OLS, Omitted variables, 2SLS) 2 Limited and qualitative

More information

Probability. Paul Schrimpf. January 23, Definitions 2. 2 Properties 3

Probability. Paul Schrimpf. January 23, Definitions 2. 2 Properties 3 Probability Paul Schrimpf January 23, 2018 Contents 1 Definitions 2 2 Properties 3 3 Random variables 4 3.1 Discrete........................................... 4 3.2 Continuous.........................................

More information

Econ 510 B. Brown Spring 2014 Final Exam Answers

Econ 510 B. Brown Spring 2014 Final Exam Answers Econ 510 B. Brown Spring 2014 Final Exam Answers Answer five of the following questions. You must answer question 7. The question are weighted equally. You have 2.5 hours. You may use a calculator. Brevity

More information

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

More information

An overview of applied econometrics

An overview of applied econometrics An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical

More information

ECON 497 Midterm Spring

ECON 497 Midterm Spring ECON 497 Midterm Spring 2009 1 ECON 497: Economic Research and Forecasting Name: Spring 2009 Bellas Midterm You have three hours and twenty minutes to complete this exam. Answer all questions and explain

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

More information

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43 Panel Data March 2, 212 () Applied Economoetrics: Topic March 2, 212 1 / 43 Overview Many economic applications involve panel data. Panel data has both cross-sectional and time series aspects. Regression

More information

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity

More information

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects Economics 113 Simple Regression Models Simple Regression Assumptions Simple Regression Derivation Changing Units of Measurement Nonlinear effects OLS and unbiased estimates Variance of the OLS estimates

More information

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc. Notes on regression analysis 1. Basics in regression analysis key concepts (actual implementation is more complicated) A. Collect data B. Plot data on graph, draw a line through the middle of the scatter

More information

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests ECON4150 - Introductory Econometrics Lecture 5: OLS with One Regressor: Hypothesis Tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 5 Lecture outline 2 Testing Hypotheses about one

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

Review of Statistics

Review of Statistics Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and

More information

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level

More information

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s

More information

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is Practice Final Exam Last Name:, First Name:. Please write LEGIBLY. Answer all questions on this exam in the space provided (you may use the back of any page if you need more space). Show all work but do

More information

STOCKHOLM UNIVERSITY Department of Economics Course name: Empirical Methods Course code: EC40 Examiner: Lena Nekby Number of credits: 7,5 credits Date of exam: Friday, June 5, 009 Examination time: 3 hours

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

Ch 7: Dummy (binary, indicator) variables

Ch 7: Dummy (binary, indicator) variables Ch 7: Dummy (binary, indicator) variables :Examples Dummy variable are used to indicate the presence or absence of a characteristic. For example, define female i 1 if obs i is female 0 otherwise or male

More information

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler Basic econometrics Tutorial 3 Dipl.Kfm. Introduction Some of you were asking about material to revise/prepare econometrics fundamentals. First of all, be aware that I will not be too technical, only as

More information

Econometrics I Lecture 7: Dummy Variables

Econometrics I Lecture 7: Dummy Variables Econometrics I Lecture 7: Dummy Variables Mohammad Vesal Graduate School of Management and Economics Sharif University of Technology 44716 Fall 1397 1 / 27 Introduction Dummy variable: d i is a dummy variable

More information

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity R.G. Pierse 1 Omitted Variables Suppose that the true model is Y i β 1 + β X i + β 3 X 3i + u i, i 1,, n (1.1) where β 3 0 but that the

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

FinQuiz Notes

FinQuiz Notes Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

More information

Introduction to Econometrics. Heteroskedasticity

Introduction to Econometrics. Heteroskedasticity Introduction to Econometrics Introduction Heteroskedasticity When the variance of the errors changes across segments of the population, where the segments are determined by different values for the explanatory

More information

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).

More information

Inference in Regression Analysis

Inference in Regression Analysis ECNS 561 Inference Inference in Regression Analysis Up to this point 1.) OLS is unbiased 2.) OLS is BLUE (best linear unbiased estimator i.e., the variance is smallest among linear unbiased estimators)

More information

Analysing data: regression and correlation S6 and S7

Analysing data: regression and correlation S6 and S7 Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Suggested Review Problems from Pindyck & Rubinfeld Original prepared by Professor Suzanne Cooper John F. Kennedy School of Government, Harvard

More information

1 The Multiple Regression Model: Freeing Up the Classical Assumptions

1 The Multiple Regression Model: Freeing Up the Classical Assumptions 1 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions were crucial for many of the derivations of the previous chapters. Derivation of the OLS estimator

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

ORF 245 Fundamentals of Statistics Chapter 9 Hypothesis Testing

ORF 245 Fundamentals of Statistics Chapter 9 Hypothesis Testing ORF 245 Fundamentals of Statistics Chapter 9 Hypothesis Testing Robert Vanderbei Fall 2014 Slides last edited on November 24, 2014 http://www.princeton.edu/ rvdb Coin Tossing Example Consider two coins.

More information

ECON 4160, Autumn term Lecture 1

ECON 4160, Autumn term Lecture 1 ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) The Simple Linear Regression Model based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #2 The Simple

More information

review session gov 2000 gov 2000 () review session 1 / 38

review session gov 2000 gov 2000 () review session 1 / 38 review session gov 2000 gov 2000 () review session 1 / 38 Overview Random Variables and Probability Univariate Statistics Bivariate Statistics Multivariate Statistics Causal Inference gov 2000 () review

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

8. Instrumental variables regression

8. Instrumental variables regression 8. Instrumental variables regression Recall: In Section 5 we analyzed five sources of estimation bias arising because the regressor is correlated with the error term Violation of the first OLS assumption

More information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information 1. Qualitative Information Qualitative Information Up to now, we assume that all the variables has quantitative meaning. But often in empirical work, we must incorporate qualitative factor into regression

More information

Hypothesis testing. Data to decisions

Hypothesis testing. Data to decisions Hypothesis testing Data to decisions The idea Null hypothesis: H 0 : the DGP/population has property P Under the null, a sample statistic has a known distribution If, under that that distribution, the

More information

1 Correlation between an independent variable and the error

1 Correlation between an independent variable and the error Chapter 7 outline, Econometrics Instrumental variables and model estimation 1 Correlation between an independent variable and the error Recall that one of the assumptions that we make when proving the

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

CHAPTER 7. + ˆ δ. (1 nopc) + ˆ β1. =.157, so the new intercept is = The coefficient on nopc is.157.

CHAPTER 7. + ˆ δ. (1 nopc) + ˆ β1. =.157, so the new intercept is = The coefficient on nopc is.157. CHAPTER 7 SOLUTIONS TO PROBLEMS 7. (i) The coefficient on male is 87.75, so a man is estimated to sleep almost one and one-half hours more per week than a comparable woman. Further, t male = 87.75/34.33

More information

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

More information

Review of probability and statistics 1 / 31

Review of probability and statistics 1 / 31 Review of probability and statistics 1 / 31 2 / 31 Why? This chapter follows Stock and Watson (all graphs are from Stock and Watson). You may as well refer to the appendix in Wooldridge or any other introduction

More information

The multiple regression model; Indicator variables as regressors

The multiple regression model; Indicator variables as regressors The multiple regression model; Indicator variables as regressors Ragnar Nymoen University of Oslo 28 February 2013 1 / 21 This lecture (#12): Based on the econometric model specification from Lecture 9

More information

Chapter 1. An Overview of Regression Analysis. Econometrics and Quantitative Analysis. What is Econometrics? (cont.) What is Econometrics?

Chapter 1. An Overview of Regression Analysis. Econometrics and Quantitative Analysis. What is Econometrics? (cont.) What is Econometrics? Econometrics and Quantitative Analysis Using Econometrics: A Practical Guide A.H. Studenmund 6th Edition. Addison Wesley Longman Chapter 1 An Overview of Regression Analysis Instructor: Dr. Samir Safi

More information

ECON Introductory Econometrics. Lecture 13: Internal and external validity

ECON Introductory Econometrics. Lecture 13: Internal and external validity ECON4150 - Introductory Econometrics Lecture 13: Internal and external validity Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 9 Lecture outline 2 Definitions of internal and external

More information