Applied Quantitative Methods II


Lecture 4: OLS and Statistics revision
Klára Kalíšková
AQM II - Lecture 4, VŠE, SS 2016/17

Outline
1. Econometric analysis
   - Properties of an estimator
   - Steps in empirical analysis
   - Assumptions
   - Choice of functional form
   - Dummy variables
   - Hypothesis testing
2. Conclusion
3. Appendix - stats recap

Properties of an estimator
An estimator is good when it is:
- unbiased: the mean of its sampling distribution equals the value of the population parameter
- consistent: it converges to the value of the true parameter as the sample size increases
- efficient: the variance of its sampling distribution is smaller than the variance of the distribution of any other unbiased estimator
Be careful: distinguish between the variance (and standard error) of the estimator and the variance (and standard deviation) of the sample or population!
OLS: under the Gauss-Markov assumptions, OLS is BLUE (best linear unbiased estimator).

Properties of a good estimator - example
Let $X_i$ be observations sampled from a distribution with mean $\mu$ and variance $\sigma^2$.
Consider the sample mean $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ as an estimator of $\mu$.
It can be shown that:
1. $E[\bar{X}_n] = \mu$
2. $\bar{X}_n \to \mu$ as $n$ increases
3. $\bar{X}_n$ has the smallest variance among all linear unbiased estimators of $\mu$
Hence, $\bar{X}_n$ is an unbiased, consistent and efficient estimator of $\mu$.
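The first two properties can be checked with a small simulation. The sketch below is only an illustration: the normal distribution, the values mu = 5 and sigma = 2, the sample sizes and the helper name `sample_means` are all assumptions, not part of the lecture.

```python
import random
import statistics

def sample_means(n, reps, mu=5.0, sigma=2.0, seed=0):
    """Draw `reps` samples of size `n` and return the sample mean of each one."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

# Hypothetical settings: 500 repeated samples of size 10 and of size 400.
small = sample_means(n=10, reps=500)
large = sample_means(n=400, reps=500)

# Unbiasedness: the average of the sample means is close to mu for both sizes.
print(statistics.fmean(small), statistics.fmean(large))
# Consistency: the spread of the sample means shrinks as n grows.
print(statistics.pvariance(small), statistics.pvariance(large))
```

The variance printed on the second line is the variance of the estimator (roughly sigma^2 / n), not the variance of the population, which stays at sigma^2 whatever the sample size.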

Steps of an empirical analysis
1. Finding an idea
2. Formulation of an economic model (rigorous or intuitive)
3. Formulation of an econometric model based on the economic model
4. Collection of data
5. Estimation of the econometric model
6. Interpretation of results
Reading: Wooldridge: Introductory Econometrics

Example - Economic model
Denote:
- $p$ ... price of the good
- $c$ ... firm's average cost per one unit of output
- $q(p)$ ... demand for the firm's output
Firm profit: $\pi = q(p) \cdot (p - c)$
Demand for the good: $q(p) = a - b \cdot p$
Derive (by maximizing profit over $p$, which gives $p = \frac{a + b c}{2 b}$): $q = \frac{a}{2} - \frac{b}{2} c$
We call $q$ the dependent variable and $c$ the explanatory variable.

Example - Econometric model
Write the relationship in a simple linear form: $q = \beta_0 + \beta_1 c$ (keep in mind that $\beta_0 = \frac{a}{2}$ and $\beta_1 = -\frac{b}{2}$).
There are other (unpredictable) things that influence firms' sales, so we add a disturbance term: $q = \beta_0 + \beta_1 c + \varepsilon$
Goal: find the values of the parameters $\beta_1$ and $\beta_0$.

Example - Data
Ideally: investigate all firms in the economy.
Really: investigate a sample of firms. We need a random (unbiased) sample of firms.
Collect data: [Table with columns Firm, q, c; values not reproduced]

Example - Data
[Figure: scatter plot of output against average cost]

Example - Estimation
[Figure: scatter plot of output against average cost]

Example - Estimation
OLS method:
- Make the fit as good as possible
- Make the misfit as low as possible
- Minimize the (vertical) distance between the data points and the regression line
- Minimize the sum of squared deviations

Ordinary Least Squares
OLS = fitting the regression line by minimizing the sum of squared vertical distances between the regression line and the observed points.
[Figure: scatter plot of output against average cost with the fitted regression line]
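For a single explanatory variable, the minimization has a well-known closed-form solution: the slope is the sample covariance of x and y divided by the sample variance of x, and the intercept makes the line pass through the point of means. The sketch below implements these formulas; the function name `ols_simple` and the cost and output figures are hypothetical and are not the data from the slides.

```python
def ols_simple(x, y):
    """Return (intercept, slope) minimizing the sum of squared vertical deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical sample of firms: average cost (c) and output (q).
cost = [1.0, 2.0, 3.0, 4.0, 5.0]
output = [9.1, 7.9, 7.2, 5.8, 5.0]

b0_hat, b1_hat = ols_simple(cost, output)
residuals = [q - (b0_hat + b1_hat * c) for c, q in zip(cost, output)]
print(b0_hat, b1_hat)
print(sum(r ** 2 for r in residuals))  # the minimized sum of squared deviations
```

With an intercept in the model, the residuals sum to zero up to rounding, which is a quick sanity check on any implementation.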

Terminology
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ ... regression line
- $y_i$ ... dependent/explained variable (i-th observation)
- $x_i$ ... independent/explanatory variable (i-th observation)
- $\varepsilon_i$ ... random error term/disturbance (of the i-th observation)
- $\beta_0$ ... intercept parameter ($\hat{\beta}_0$ ... estimate of this parameter)
- $\beta_1$ ... slope parameter ($\hat{\beta}_1$ ... estimate of this parameter)

OLS assumptions: violations
1. The regression model is linear in parameters and correctly specified
2. The error term has a zero population mean and constant variance: $\varepsilon_i \sim \text{iid}(0, \sigma^2)$
3. Observations of the error term are uncorrelated with each other (relevant for time series)
4. All explanatory variables are uncorrelated with the error term (zero conditional mean of errors)
5. No explanatory variable is an (almost) perfect linear function of any other explanatory variable
6. (The error term is normally distributed)

OLS assumptions: violations (linearity and specification)
Assumption: the regression model is linear in parameters and correctly specified.
- if it is not linear, we can often rewrite it so that it is linear
- if it is misspecified, the estimates are biased: modify the functional form to obtain the correct specification

OLS assumptions: violations (zero mean and constant variance)
Assumption: the error term has a zero population mean and constant variance: $\varepsilon_i \sim \text{iid}(0, \sigma^2)$.
- zero mean: if violated, the estimates are biased. Solution: include an intercept in the equation.
- constant variance = homoskedasticity: if the variance is not constant, we have heteroskedasticity and the standard errors (SE) of the estimates are unreliable. Solution: White's variance-covariance matrix (robust SE).

OLS assumptions: violations (serial correlation)
Assumption: observations of the error term are uncorrelated with each other (relevant for time series).
- correlated error terms = serial correlation
- under serial correlation, SE are unreliable (similar to heteroskedasticity)
- Solution: e.g. ARMA adjustment of the error term (not in this course)

OLS assumptions: violations (explanatory variables and the error term)
Assumption: all explanatory variables are uncorrelated with the error term (zero conditional mean of errors).
- The most important assumption!
- if violated, the estimates of the coefficients are biased
- e.g. selection bias, omitted variable bias, reverse causality, measurement error
- Solution: include more control variables, or use econometric methods that address the bias (next lectures)

OLS assumptions: violations (multicollinearity)
Assumption: no explanatory variable is an (almost) perfect linear function of any other explanatory variable.
- if violated, we call it multicollinearity
- Detecting multicollinearity: look at the correlation coefficients of the explanatory variables, or check whether the VIF (variance inflation factor) exceeds 10
- under multicollinearity, the estimates of the coefficients are imprecise, with large standard errors (OLS cannot distinguish between the explanatory variables); under perfect multicollinearity they cannot be computed at all
- Solution: exclude one of the variables
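The VIF of a regressor is 1 / (1 - R^2), where R^2 comes from regressing that regressor on all the other explanatory variables. With only two explanatory variables this R^2 is simply their squared correlation, which the sketch below uses. The function names and the data are hypothetical, and the two-regressor shortcut is an assumption of this example, not a general recipe.

```python
import math

def pearson_r(x, z):
    """Sample correlation coefficient between two equally long lists."""
    n = len(x)
    x_bar, z_bar = sum(x) / n, sum(z) / n
    s_xz = sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_zz = sum((b - z_bar) ** 2 for b in z)
    return s_xz / math.sqrt(s_xx * s_zz)

def vif_two_regressors(x, z):
    """VIF when the model has exactly two explanatory variables, x and z."""
    r = pearson_r(x, z)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical regressors: z is almost a copy of x, w is unrelated to x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [1.1, 1.9, 3.2, 3.9, 5.1]
w = [1.0, -1.0, 0.0, -1.0, 1.0]

print(vif_two_regressors(x, z))  # far above the rule-of-thumb threshold of 10
print(vif_two_regressors(x, w))  # equal to 1: no variance inflation
```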

OLS assumptions: violations (normality)
Assumption: (the error term is normally distributed).
- OK if the sample size is large (central limit theorem)
- if not satisfied in a small sample, inference based on the usual standard errors is not reliable, and we need to use special regression types or test statistics (e.g. not a t-test, but some non-parametric test)

OLS assumptions: Summary
What to do to make sure the assumptions are satisfied?
- include an intercept in the equation
- think about the correct functional form (linear vs non-linear, see next slides)
- use robust standard errors to deal with possible heteroskedasticity (most cross-sectional data have a heteroskedasticity problem)
- you need not worry about serial correlation unless you use time series data
- check for multicollinearity and use common sense (including three different measures of one phenomenon likely leads to multicollinearity)
- when you have a large sample, you need not worry about the normality assumption (if the sample is small, use a non-parametric test to test the significance of the coefficients and the validity of your hypotheses)
Thus, the only assumption that might be difficult to satisfy is that the explanatory variables are uncorrelated with the error term: see next lectures!

Linear model and nonlinear specification
OLS assumption: the equation is linear in coefficients. However, it can be nonlinear in variables:
- $\ln y = \beta_0 + \beta_1 \ln x + \beta_2 z + \varepsilon$ is a linear model
- $y = x^{\beta_1} + \varepsilon$ is NOT a linear model
We have to choose the functional form carefully: it should be based on the underlying economic theory and/or intuition.
- Do we expect a complicated curve instead of a straight line?
- Does the effect of a variable peak at some point and then start to decline?

Linear form
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
Assumes that the effect of the explanatory variable on the dependent variable is constant: $\frac{\partial y}{\partial x_k} = \beta_k$, $k = 1, 2$
Interpretation: if $x_k$ increases by 1 x-unit, then $y$ changes by $\beta_k$ y-units (the marginal effect).
The linear form is used as the default functional form unless there is strong evidence against it.

Double-log form
$\ln y = \beta_0 + \beta_1 \ln x_1 + \beta_2 \ln x_2 + \varepsilon$
Assumes that the elasticity of the dependent variable with respect to the explanatory variable is constant: $\frac{\partial \ln y}{\partial \ln x_k} = \frac{\partial y / y}{\partial x_k / x_k} = \beta_k$, $k = 1, 2$
Interpretation: if $x_k$ increases by 1 percent, $y$ changes by $\beta_k$ percent.
Hint: before using a double-log model, make sure that there are no negative or zero observations in the data set.


Example
Estimating the production function of the Indian sugar industry:
$\ln Q = [\ldots] + [\ldots] \ln L + [\ldots] \ln K$ (coefficient values not reproduced; the values in parentheses are 0.14 for $\ln L$ and 0.17 for $\ln K$)
- $Q$ ... output
- $L$ ... labor
- $K$ ... capital employed
Interpretation: if we increase labor by 1%, production of sugar increases by 0.59%, ceteris paribus.
Ceteris paribus is a Latin phrase meaning "other things being equal".

Semilog forms
Linear-log form: $y = \beta_0 + \beta_1 \ln x_1 + \beta_2 \ln x_2 + \varepsilon$
Interpretation: if $x_k$ increases by 1 percent, $y$ changes by $\beta_k / 100$ units ($k = 1, 2$).
Log-linear form: $\ln y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
Interpretation: if $x_k$ increases by 1 unit, then $y$ changes by $\beta_k \cdot 100$ percent ($k = 1, 2$).


Example of semilog forms
Estimating the influence of education and experience on wages:
$\ln wage = [\ldots] + [\ldots] \, educ + [\ldots] \, exper$ (coefficient values not reproduced; the values in parentheses are 0.008 for $educ$ and 0.002 for $exper$)
- $wage$ ... annual wage (USD)
- $educ$ ... years of education
- $exper$ ... years of experience
Interpretation:
- an increase in education by one year increases the annual wage by 9.8%, ceteris paribus
- an increase in experience by one year increases the annual wage by 1%, ceteris paribus
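The "times 100" reading of a log-linear coefficient is a first-order approximation; the exact percentage change in y for a one-unit increase in x is 100 * (exp(beta) - 1). The sketch below compares the two. The coefficient 0.098 is an assumption, backed out from the 9.8% interpretation above, and the function names are hypothetical.

```python
import math

def approx_percent_change(beta):
    """Rule-of-thumb reading of a log-linear coefficient: 100 * beta percent."""
    return 100.0 * beta

def exact_percent_change(beta):
    """Exact percentage change in y when x rises by one unit in ln y = ... + beta * x."""
    return 100.0 * (math.exp(beta) - 1.0)

beta_educ = 0.098  # assumed value, implied by the 9.8% interpretation
print(approx_percent_change(beta_educ))  # the rule-of-thumb reading
print(exact_percent_change(beta_educ))   # slightly larger than the approximation
```

For small coefficients the two readings nearly coincide; the gap widens as the coefficient grows, so the approximation should be used with care for large effects.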

Polynomial form (quadratic)
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \varepsilon$
To determine the effect of $x_1$ on $y$, we need to calculate the derivative: $\frac{\partial y}{\partial x_1} = \beta_1 + 2 \beta_2 x_1$
Clearly, the effect of $x_1$ on $y$ is not constant, but changes with the level of $x_1$.
Turning point: $x_1^* = -\frac{\hat{\beta}_1}{2 \hat{\beta}_2}$
We might also have higher order polynomials, e.g.: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_1^3 + \beta_4 x_1^4 + \varepsilon$
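The marginal effect and the turning point of the quadratic form are easy to compute once the coefficients are estimated. The sketch below uses hypothetical coefficient values (b1 = 2.0, b2 = -0.25) and hypothetical function names, purely for illustration.

```python
def marginal_effect(b1, b2, x1):
    """Derivative of b0 + b1*x1 + b2*x1**2 with respect to x1."""
    return b1 + 2.0 * b2 * x1

def turning_point(b1, b2):
    """Value of x1 at which the marginal effect is zero."""
    if b2 == 0:
        raise ValueError("no turning point: the model is linear in x1")
    return -b1 / (2.0 * b2)

# Hypothetical estimates: positive linear term, negative quadratic term.
b1_hat, b2_hat = 2.0, -0.25
x_star = turning_point(b1_hat, b2_hat)
print(x_star)                                      # where the effect changes sign
print(marginal_effect(b1_hat, b2_hat, 1.0))        # positive effect below the turning point
print(marginal_effect(b1_hat, b2_hat, x_star))     # zero effect at the turning point
```

With a negative quadratic coefficient the effect is positive below the turning point and negative above it, which is the "peak and decline" shape mentioned earlier.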


Example of polynomial form
Impact of the number of hours of studying on the grade from AQM II:
$\widehat{grade} = [\ldots] + [\ldots] \, hours + [\ldots] \, hours^2$ (coefficient values not reproduced)
To determine the effect of hours on grade, calculate the derivative: $\frac{\partial y}{\partial x} = \frac{\partial grade}{\partial hours} = \hat{\beta}_1 + 2 \hat{\beta}_2 \, hours$
Decreasing returns to hours of studying: more hours imply a higher grade, but the positive effect decreases with more hours.

Interactions
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
The effect of $x_1$ on $y$ may depend on the magnitude of $x_2$: $\frac{\partial y}{\partial x_1} = \beta_1 + \beta_3 x_2$
If $\beta_3 > 0$, a higher $x_2$ implies a higher effect of $x_1$.

Choice of correct functional form
- The functional form has to be correctly specified in order to avoid biased and inconsistent estimates
- One of the OLS assumptions is that the model is correctly specified
- Ideally: the specification is given by the underlying theory of the equation
- In reality: the underlying theory does not give a precise functional form
- In most cases, either the linear form is adequate, or common sense will point out an easy choice from among the alternatives

Choice of correct functional form
Nonlinearity of explanatory variables:
- often approximated by a polynomial form
- missing higher powers of a variable can be detected as omitted variables
Nonlinearity of the dependent variable:
- harder to detect based on the statistical fit of the regression
- $R^2$ is not comparable across models in which $y$ is transformed
- dependent variables are often transformed to log form in order to make their distribution closer to the normal distribution

Dummy variables: Intercept dummy
Dummy variable: takes values of 0 or 1, depending on a qualitative attribute (e.g. Female).
A dummy variable included in a regression alone is an intercept dummy. It changes the intercept for the subset of data defined by the dummy variable condition:
$y_i = \beta_0 + \beta_1 D_i + \beta_2 x_i + \varepsilon_i$
where $D_i = 1$ if the i-th observation meets a particular condition, and $D_i = 0$ otherwise.
We have:
- $y_i = (\beta_0 + \beta_1) + \beta_2 x_i + \varepsilon_i$ if $D_i = 1$
- $y_i = \beta_0 + \beta_2 x_i + \varepsilon_i$ if $D_i = 0$

Intercept dummy
[Figure: two parallel lines in the (X, Y) plane, both with slope $\beta_2$; the line for $D_i = 1$ has intercept $\beta_0 + \beta_1$, the line for $D_i = 0$ has intercept $\beta_0$]


Example
Estimating the determinants of wages:
$\widehat{wage}_i = [\ldots] + [\ldots] M_i + [\ldots] \, educ_i + [\ldots] \, exper_i$ (coefficient values not reproduced; the values in parentheses are 0.270 for $M_i$, 0.051 for $educ_i$ and 0.064 for $exper_i$)
where $M_i = 1$ if the i-th person is male and $M_i = 0$ if the i-th person is female
- $wage$ ... average hourly wage in USD
Interpretation of the coefficient on $M$: men earn on average USD 2.156 per hour more than women, ceteris paribus.

Slope dummy
If a dummy variable is interacted with another variable ($x$), it is a slope dummy. It changes the relationship between $x$ and $y$ for the subset of data defined by the dummy variable condition:
$y_i = \beta_0 + \beta_1 x_i + \beta_2 (x_i \cdot D_i) + \varepsilon_i$
where $D_i = 1$ if the i-th observation meets a particular condition, and $D_i = 0$ otherwise.
We have:
- $y_i = \beta_0 + (\beta_1 + \beta_2) x_i + \varepsilon_i$ if $D_i = 1$
- $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ if $D_i = 0$

Slope dummy
[Figure: two lines in the (X, Y) plane with common intercept $\beta_0$; the line for $D_i = 1$ has slope $\beta_1 + \beta_2$, the line for $D_i = 0$ has slope $\beta_1$]


Example
Estimating the determinants of wages:
$\widehat{wage}_i = [\ldots] + [\ldots] \, educ_i + [\ldots] \, M_i \cdot educ_i + [\ldots] \, exper_i$ (coefficient values not reproduced; the values in parentheses are 0.054 for $educ_i$, 0.021 for $M_i \cdot educ_i$ and 0.065 for $exper_i$)
where $M_i = 1$ if the i-th person is male and $M_i = 0$ if the i-th person is female
- $wage$ ... average hourly wage in USD
Interpretation: men get on average 17 cents per hour more than women for each additional year of education, ceteris paribus.

Slope and intercept dummies
Allow for both a different slope and a different intercept for two subsets of data:
$y_i = \beta_0 + \beta_1 D_i + \beta_2 x_i + \beta_3 (x_i \cdot D_i) + \varepsilon_i$
where $D_i = 1$ if the i-th observation meets a particular condition, and $D_i = 0$ otherwise.
We have:
- $y_i = (\beta_0 + \beta_1) + (\beta_2 + \beta_3) x_i + \varepsilon_i$ if $D_i = 1$
- $y_i = \beta_0 + \beta_2 x_i + \varepsilon_i$ if $D_i = 0$
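The two group-specific lines can be read off the coefficients mechanically. The sketch below does this for the model with both an intercept dummy and a slope dummy; the coefficient values and function names are hypothetical, chosen only for illustration.

```python
def predicted_y(x, d, b0, b1, b2, b3):
    """Fitted value from y = b0 + b1*D + b2*x + b3*(x*D), ignoring the error term."""
    return b0 + b1 * d + b2 * x + b3 * (x * d)

def group_line(d, b0, b1, b2, b3):
    """Return (intercept, slope) of the regression line for the group with dummy value d."""
    intercept = b0 + b1 * d
    slope = b2 + b3 * d
    return intercept, slope

# Hypothetical coefficient values.
coefs = dict(b0=1.0, b1=2.0, b2=0.5, b3=0.25)

print(group_line(0, **coefs))  # intercept b0, slope b2
print(group_line(1, **coefs))  # intercept b0 + b1, slope b2 + b3
print(predicted_y(4.0, 1, **coefs))
```

Setting b3 to zero gives the pure intercept-dummy model, and setting b1 to zero gives the pure slope-dummy model.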

Slope and intercept dummies
[Figure: two lines in the (X, Y) plane; the line for $D_i = 1$ has intercept $\beta_0 + \beta_1$ and slope $\beta_2 + \beta_3$, the line for $D_i = 0$ has intercept $\beta_0$ and slope $\beta_2$]

Dummy variable for each category
What if a variable defines three (or more) attributes?
Example: level of education (elementary school, high school, and college).
Define and use a set of dummy variables: $H = 1$ if high school, 0 otherwise; and $C = 1$ if college, 0 otherwise.
Should we also include a third dummy in the regression, equal to 1 for people with elementary education?
No, unless we exclude the intercept! Using the full set of dummies leads to perfect multicollinearity (the dummy variable trap).
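The trap arises because the full set of dummies always sums to one, which is exactly the intercept column. The sketch below shows this on a hypothetical list of education levels; the helper name `make_dummies` and the data are assumptions for illustration.

```python
def make_dummies(levels, categories):
    """One row per observation, one 0/1 column per category in `categories`."""
    return [[1 if level == cat else 0 for cat in categories] for level in levels]

# Hypothetical observations of the education variable.
education = ["elementary", "high school", "college", "high school", "elementary"]

# Full set of dummies: every row sums to 1, i.e. it reproduces the intercept column.
full = make_dummies(education, ["elementary", "high school", "college"])
row_sums = [sum(row) for row in full]
print(row_sums)

# Dropping one category (here elementary, the reference group) removes the exact
# linear dependence with the intercept.
reduced = make_dummies(education, ["high school", "college"])
print(reduced)
```

The coefficients on H and C are then interpreted relative to the omitted category, elementary education.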

Hypothesis testing
Once we have our model correctly specified, we can proceed to estimation and testing of our hypothesis.
Basic principles of hypothesis testing:
- We cannot prove that a given hypothesis is correct using hypothesis testing
- We can only reject a given hypothesis with a certain degree of confidence, because our sample does not conform to the hypothesis
- In such a case, we conclude that it is very unlikely the sample result would have been observed if the hypothesized theory were correct

Null and Alternative Hypotheses
First step: state the hypothesis explicitly.
- Null hypothesis: statement of the range of values of the regression coefficient that would be expected to occur if the researcher's theory were not correct
- Alternative hypothesis: specification of the range of values of the coefficient that would be expected to occur if the researcher's theory were correct

Null and Alternative Hypotheses
Notation:
- $H_0$ ... null hypothesis
- $H_A$ ... alternative hypothesis
Examples:
- One-sided test: $H_0: \beta \le 0$, $H_A: \beta > 0$
- Two-sided test: $H_0: \beta = 0$, $H_A: \beta \ne 0$

Type I and type II errors
It would be unrealistic to think that conclusions drawn from regression analysis will always be right. There are two types of errors we can make:
- Type I: we reject a true null hypothesis
- Type II: we do not reject a false null hypothesis
Example: $H_0: \beta = 0$, $H_A: \beta \ne 0$
- Type I error: it holds that $\beta = 0$, but we conclude that $\beta \ne 0$
- Type II error: it holds that $\beta \ne 0$, but we conclude that $\beta = 0$

Type I and type II errors
Example: $H_0$: the defendant is innocent, $H_A$: the defendant is guilty
- Type I error = sending an innocent person to jail
- Type II error = freeing a guilty person
For a given sample, lowering the probability of a Type I error means increasing the probability of a Type II error.
In hypothesis testing, we focus on the Type I error and ensure that its probability is not unreasonably large.


Decision rule
1. Calculate the sample statistic
2. Compare it with the critical value (from statistical tables)
The critical value divides the range of possible values of the statistic into two regions: the acceptance region and the rejection region.
- If the sample statistic falls into the rejection region, we reject $H_0$
- If the sample statistic falls into the acceptance region, we do not reject $H_0$
The idea is that if the value of the coefficient does not support $H_0$, the sample statistic should fall into the rejection region.
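The decision rule can be sketched for a two-sided test of $H_0: \beta = 0$. The example below makes several assumptions that are not in the slides: it uses the large-sample normal approximation for the critical value instead of t tables, it reads the 0.270 reported next to the male dummy in the wage example as that coefficient's standard error, and the function name is hypothetical.

```python
from statistics import NormalDist

def two_sided_test(beta_hat, se, alpha=0.05):
    """Return (reject, statistic, critical value) for H0: beta = 0 vs HA: beta != 0."""
    statistic = beta_hat / se
    # Large-sample approximation: critical value from the standard normal distribution.
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    reject = abs(statistic) > critical
    return reject, statistic, critical

# Coefficient on the male dummy from the wage example; the SE is an assumed reading.
reject, statistic, critical = two_sided_test(beta_hat=2.156, se=0.270)
print(statistic, critical, reject)
```

In small samples the critical value should come from the t distribution with the appropriate degrees of freedom, which is larger than the normal value used here.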

One-sided rejection region
$H_0: \beta \le 0$ vs $H_A: \beta > 0$
[Figure: distribution of $\hat{\beta}$ with the acceptance region on the left and the rejection region in the right tail; the area of the rejection region is the probability of a Type I error]

Two-sided rejection region
$H_0: \beta = 0$ vs $H_A: \beta \ne 0$
[Figure: distribution of $\hat{\beta}$ with the acceptance region in the middle and rejection regions in both tails; the total area of the rejection regions is the probability of a Type I error]

Level of significance
Classical approach to hypothesis testing: first choose the significance level (e.g. 5%), then test the hypothesis.
- there is no "correct" significance level
- standard values: * 10%, ** 5%, *** 1%, (**** 0.1%)

The p-value
Or we can ask: what is the smallest significance level at which the null hypothesis would still be rejected? This is the p-value.
- Intuition: if I repeated the experiment 100 times and there were no effect, in how many cases would I get a result at least this extreme?
- Remember: the significance level describes the probability of a type I error if the null is true
- The smaller the p-value, the smaller the probability of rejecting a true null hypothesis, and the greater the confidence that the null hypothesis is correctly rejected
The p-value for $H_0: \beta = 0$ is displayed in most regression outputs.


Multiple-comparisons: problem of chance
Example: "Eating chocolate results in weight loss"
- a serious clinical trial
- 5 men and 11 women over 3 weeks, randomly divided into 3 groups: control, low-carb diet, chocolate diet
- significant weight loss in the treatment groups
Why? 18 different measures! With that many measures, there is about a 60% chance of some significant result.
It could just as well have been "Chocolate improves sleeping" or whatever.


Multiple-comparisons: problem of chance
Comparing a lot of variables by simple t-tests may bring significant results by chance (this is not a problem of OLS).
Each measure is like a lottery ticket: the more measures, the higher the chance of winning:
$p(\text{winning}) = 1 - (1 - p)^n$
where $p$ is the chance that a single measure comes out significant by chance and $n$ is the number of measures.
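The formula reproduces the roughly 60% figure from the chocolate example. The sketch below assumes a 5% significance level for each of the 18 measures and treats the measures as independent; both are assumptions of this illustration, and the function name is hypothetical.

```python
def chance_of_any_significant(p, n):
    """Probability that at least one of n independent tests is significant by chance."""
    return 1.0 - (1.0 - p) ** n

# Assumed setting: 18 measures, each tested at the 5% significance level.
print(chance_of_any_significant(0.05, 18))  # about 0.60

# The chance grows quickly with the number of measures.
for n_measures in (1, 5, 18, 50):
    print(n_measures, round(chance_of_any_significant(0.05, n_measures), 3))
```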

Multiple-comparisons: problem of chance
Solution: correction of the significance level
- Bonferroni method: simply divide the original alpha by the number of comparisons; it can become too conservative when many comparisons are in place
- Others: Tukey's honest significant difference test, LSD test, Scheffé, Dunnett, ...
Name for not correcting for multiple measures: p-hacking
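A minimal sketch of the Bonferroni correction follows, again assuming 18 independent measures and an original alpha of 5% as in the chocolate example; the function names are hypothetical.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_comparisons

def familywise_error(alpha_per_test, n_comparisons):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha_per_test) ** n_comparisons

n_comparisons = 18
adjusted = bonferroni_alpha(0.05, n_comparisons)

print(adjusted)                                   # per-test level after correction
print(familywise_error(0.05, n_comparisons))      # uncorrected: about 0.60
print(familywise_error(adjusted, n_comparisons))  # corrected: just below 0.05
```

The corrected per-test level shrinks in proportion to the number of comparisons, which is why the method becomes conservative when many comparisons are made.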


Conclusion
Now everybody should understand:
- OLS
- Gauss-Markov assumptions and problems with their violations
- Interpretation of coefficients in various functional forms
- Using dummy variables
- Hypothesis testing
- Multiple-comparison tests

Appendix - stats recap

Random variable
A random variable $X$ is a variable whose numerical value is determined by chance. It is a quantification of the outcome of a random phenomenon.
- Discrete random variable: has a countable number of possible values. Example: the number of times that a coin will be flipped before heads is obtained.
- Continuous random variable: can take on any value in an interval. Example: time until the first goal is scored in a football match between FC Barcelona and Real Madrid.

Discrete random variables
Probability distribution of a variable $X$ that can take values $x_1, x_2, x_3, \ldots$:
$P(X = x_1) = p_1$, $P(X = x_2) = p_2$, $P(X = x_3) = p_3$, ...
Cumulative distribution function (CDF): $F_X(x) = P(X \le x) = \sum_{i: \, x_i \le x} P(X = x_i)$


Continuous random variables
Probability density function (PDF) $f_X(x)$: describes the relative likelihood for the random variable $X$ to occur at a given point $x$.
Cumulative distribution function (CDF): $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t) \, dt$
Example: Normal (Gaussian) distribution

Expected value and variance
Expected value (mean):
- Discrete variable: $E[X] = \sum_{i=1}^{\infty} x_i P(X = x_i)$
- Continuous variable: $E[X] = \int_{-\infty}^{+\infty} x f_X(x) \, dx$
Variance: $Var[X] = E\left[(X - E[X])^2\right] = E[X^2] - (E[X])^2$
Standard deviation: $\sigma_X = \sqrt{Var[X]}$
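For a discrete variable these definitions translate directly into sums. The sketch below applies them to a fair six-sided die, an example chosen here for illustration, and uses exact fractions so that both expressions for the variance can be compared without rounding. The function names are hypothetical.

```python
from fractions import Fraction

def expected_value(values, probs):
    """E[X] = sum of x_i * P(X = x_i) for a discrete random variable."""
    return sum(x * p for x, p in zip(values, probs))

def variance(values, probs):
    """Var[X] = E[(X - E[X])^2], computed from the definition."""
    mean = expected_value(values, probs)
    return sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# A fair die: faces 1..6, each with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mean = expected_value(faces, probs)
var = variance(faces, probs)
# Shortcut formula: E[X^2] - (E[X])^2 gives the same number.
shortcut = expected_value([x ** 2 for x in faces], probs) - mean ** 2

print(mean, var, shortcut)
```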

Covariance, correlation, independence
Covariance: $Cov(X, Y) = E\left[(X - E[X])(Y - E[Y])\right] = E[XY] - E[X]E[Y]$
(Pearson's) correlation: $Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$
Independence: $X$ and $Y$ are independent if the conditional probability distribution of $X$ given the observed value of $Y$ is the same as if the value of $Y$ had not been observed.
If $X$ and $Y$ are independent, then $E[XY] = E[X]E[Y]$ and $Cov(X, Y) = 0$ (not the other way round in general).

Examples of correlation
[Figure not reproduced]

Sample moments
Counterparts of the theoretical moments of the distribution of $X$, computed from observations $X_1, \ldots, X_n$ drawn from this distribution.
Sample mean: $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$
Sample variance: $S_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$
Sample covariance: $Cov_n(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)(Y_i - \bar{Y}_n)$
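The three sample moments can be computed directly from their formulas. The sketch below uses hypothetical observations and function names, and cross-checks the variance against `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
import statistics

def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_covariance(xs, ys):
    """Sum of cross-products of deviations, divided by n - 1."""
    mx, my = sample_mean(xs), sample_mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical observations.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

print(sample_mean(xs))
print(sample_variance(xs), statistics.variance(xs))  # both use the n - 1 divisor
print(sample_covariance(xs, ys))
```

The covariance of a variable with itself equals its sample variance, which gives a simple consistency check between the last two functions.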

Vectors of random variables
Sometimes, we deal with vectors of random variables.
Example: $\mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix}$
Expected value: $E[\mathbf{X}] = \begin{pmatrix} E[X_1] \\ E[X_2] \\ E[X_3] \end{pmatrix}$
Variance/covariance matrix:
$Var[\mathbf{X}] = \begin{pmatrix} Var[X_1] & Cov(X_1, X_2) & Cov(X_1, X_3) \\ Cov(X_2, X_1) & Var[X_2] & Cov(X_2, X_3) \\ Cov(X_3, X_1) & Cov(X_3, X_2) & Var[X_3] \end{pmatrix}$
Computational rule: $Var[A\mathbf{X}] = A \, Var[\mathbf{X}] \, A^{\top}$

Chi-squared distribution
Chi-squared distribution with $k$ degrees of freedom: $\chi^2_k$
Let $Z_i \sim N(0, 1)$ for each $i$, independent of each other. Then $X = \sum_{i=1}^{k} Z_i^2 \sim \chi^2_k$
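The construction can be checked by simulation: summing k squared standard normal draws should give a variable whose mean is k and whose variance is 2k, two standard properties of the chi-squared distribution. The choice k = 3, the number of draws, the seed and the function name below are assumptions made for illustration.

```python
import random
import statistics

def chi_squared_draw(k, rng):
    """One draw of X = Z_1^2 + ... + Z_k^2 with independent standard normal Z_i."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(42)  # fixed seed so the run is reproducible
k = 3
draws = [chi_squared_draw(k, rng) for _ in range(20000)]

sim_mean = statistics.fmean(draws)
sim_var = statistics.pvariance(draws)
print(sim_mean)  # close to k
print(sim_var)   # close to 2 * k
```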
