Econometrics I. Andrea Beccarini. Summer 2011


1 Econometrics I Andrea Beccarini Summer 2011

2 Outline Very brief review of statistical basics Simple linear regression model (specification, point estimation, interval estimation, hypothesis tests, forecasting, maximum likelihood estimation) Multiple linear regression model Violations of (some) model assumptions 9

3 Review of basic statistics Random experiment (Zufallsexperiment) Sample space (Ergebnismenge) Event (Ereignis) Set operations (Verknüpfungen von Ereignissen) Partition (Partition oder vollständige Zerlegung) 10

4 Probability (Wahrscheinlichkeit) Kolmogorov's axioms (Kolmogorovs Axiome) Conditional probability (bedingte Wahrscheinlichkeit) Total probability (Satz von der totalen Wahrscheinlichkeit) Bayes' theorem (Satz von Bayes) Independence (Unabhängigkeit) 11

5 Random variables (Zufallsvariable) Definition and intuition Distribution function and quantile function (Verteilungsfunktion und Quantilfunktion) Discrete and continuous random variables (diskrete und stetige Zufallsvariable) Density function (Dichtefunktion) Expectation (Erwartungswert) Variance (Varianz) 12

6 Special discrete distributions, e.g. Bernoulli, binomial, Poisson, geometric, hypergeometric, ... Special continuous distributions, e.g. normal, standard normal, exponential, Pareto, χ², F, t, ... There are many more special distributions Which distribution can be used when? 13

7 Simple linear regression model Econometrics: Application of statistical methods to empirical research in economics Econometric problems: Specification of an appropriate model Estimation of the model (Schätzung) Hypothesis testing Forecasting (Prognose) 14

8 Economic model → SPECIFICATION (functional: A-assumptions; error term: B-assumptions; variables: C-assumptions) → Econometric model → ESTIMATION → Estimated model → HYPOTHESIS TESTS → FORECASTING 15

9 Data Empirical research requires (high quality) data Often, collecting data is the main problem of empirical research There is no systematic approach Kinds of data: Time series data (Zeitreihendaten), cross sectional data (Querschnittsdaten), panel data (Paneldaten) 16

10 Specification Numeric illustration: Data of the gratuity example — billing amount x_t and tip y_t (both in euro) of 20 observed guests [data table not recoverable from the transcription] 17

11 Functional dependence (generic) y = f (x) More specifically, the functional dependence is assumed to be y = α + βx Other functional forms are of course possible; more on that later The econometric model is specified using the A-, B- and C-assumptions 18

12 Economic model: y_t = α + βx_t for t = 1, ..., 20 [Scatter plot of tip y (Trinkgeld) against billing amount x (Rechnungsbetrag) with intercept α and slope β] 19

13 Econometric model: y_t = α + βx_t + u_t for t = 1, ..., 20 [Scatter plot of y_t against x_t] 20

14 The A-assumptions (functional specification): Assumption a1: No relevant exogenous variable is omitted from the econometric model, and the exogenous variable included in the model is relevant Assumption a2: The true functional dependence between x_t and y_t is linear Assumption a3: The parameters α and β are constant for all T observations (x_t, y_t) 21

15 The B-assumptions (error term specification): Assumption b1: E(u_t) = 0 for t = 1, ..., T Assumption b2: Homoskedasticity: Var(u_t) = σ² for t = 1, ..., T Assumption b3: For all t ≠ s with t = 1, 2, ..., T and s = 1, 2, ..., T we have Cov(u_t, u_s) = 0 Assumption b4: The error terms u_t are normally distributed Compact notation of all B-assumptions: u_t ~ NID(0, σ²) for t = 1, ..., T 22

16 Graphical illustration of the error term distribution 23

17 The C-assumptions (variable specification): Assumption c1: The exogenous variable x_t is not stochastic, but can be controlled as in an experimental situation Assumption c2: The exogenous variable x_t is not constant for all observations t Of course, many (or even all?) of the A-, B-, and C-assumptions are restrictive and unrealistic We will nevertheless suppose they are satisfied for the time being, and consider their violations later on 24

18 Point estimation The simple (two-variable) linear regression model is y_t = α + βx_t + u_t Numeric illustration: The first data of the gratuity example [data table and scatter plot not recoverable from the transcription] 25

19 Estimation: Compute estimated values α̂ and β̂ Distinguish between true and estimated values If the true econometric model is y_t = α + βx_t + u_t then the corresponding estimated model is ŷ_t = α̂ + β̂x_t 26

20 How can we estimate the coefficients? [Scatter plot of y_t against x_t with a fitted line and the residuals û_2, û_3, ... marked] 27

21 Least squares method Sum of squared residuals S_ûû = Σ_{t=1}^T û_t² where the residuals are û_t = y_t − ŷ_t = y_t − α̂ − β̂x_t Residual (Residuum): Difference between the observed value y_t and the estimated (predicted) value ŷ_t 28

22 Choose α̂ and β̂ such that the sum of squared residuals S_ûû = Σ_{t=1}^T û_t² = Σ_{t=1}^T (y_t − α̂ − β̂x_t)² is minimized Derivation of estimators (Schätzer) [1] yields β̂ = S_xy/S_xx and α̂ = ȳ − β̂x̄ with S_xx = Σ(x_t − x̄)² = Σx_t² − T x̄² and S_xy = Σ(x_t − x̄)(y_t − ȳ) = Σ x_t y_t − T x̄ȳ 29
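The least squares formulas can be computed directly by hand. A minimal sketch in Python (the course's own scripts are in R; Python is used here only for illustration), with hypothetical data rather than the lecture's gratuity data:

```python
# Least squares "by hand" for the simple linear model y_t = alpha + beta*x_t + u_t.
# The three (x, y) pairs below are hypothetical illustration values.

def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) minimizing the sum of squared residuals."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    S_xx = sum((xt - x_bar) ** 2 for xt in x)                       # variation of x
    S_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y)) # covariation
    beta_hat = S_xy / S_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 5.0]
alpha_hat, beta_hat = ols_simple(x, y)
print(alpha_hat, beta_hat)  # beta_hat = S_xy/S_xx = 3/2, alpha_hat = 11/3 - 3 = 2/3
```

Here S_xx = 2 and S_xy = 3, so β̂ = 1.5 and α̂ = 2/3, which can be verified with pencil and paper.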

23 Numeric illustration for the three-points example [data table not recoverable from the transcription] Calculate {1} α̂, β̂ ŷ_1, ŷ_2, ŷ_3 û_1, û_2, û_3 S_ûû 30

24 The coefficient of determination R² Variation of the endogenous variable S_yy = Σ(y_t − ȳ)² [Graphical illustration of the deviations y_t − ȳ] 31

25 Variation S_ŷŷ = Σ(ŷ_t − ȳ)² and sum of squared residuals S_ûû = Σû_t² [Graphical illustration of fitted values, residuals and deviations from ȳ] 32

26 Decomposition of sum of squares (Streuungszerlegungssatz): [2] S_yy = S_ŷŷ + S_ûû or Σ(y_t − ȳ)² = Σ(ŷ_t − ȳ)² + Σû_t² Coefficient of determination (Bestimmtheitsmaß) R² = explained variation / total variation = (S_yy − S_ûû)/S_yy = S_ŷŷ/S_yy Computation of R² {2} R² = β̂S_xy/S_yy = S_xy²/(S_xx S_yy) 33
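The decomposition and the equivalent R² formulas can be checked numerically. A small Python sketch with hypothetical data (not the lecture's gratuity data):

```python
# R^2 computed three equivalent ways: S_yhat/S_yy, 1 - S_uu/S_yy, and
# S_xy^2/(S_xx*S_yy). Hypothetical illustration data.

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 5.0]
T = len(x)
x_bar, y_bar = sum(x) / T, sum(y) / T
S_xx = sum((xt - x_bar) ** 2 for xt in x)
S_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
S_yy = sum((yt - y_bar) ** 2 for yt in y)
beta_hat = S_xy / S_xx
alpha_hat = y_bar - beta_hat * x_bar

y_hat = [alpha_hat + beta_hat * xt for xt in x]         # fitted values
S_yhat = sum((yh - y_bar) ** 2 for yh in y_hat)         # explained variation
S_uu = sum((yt - yh) ** 2 for yt, yh in zip(y, y_hat))  # unexplained variation

R2_a = S_yhat / S_yy
R2_b = 1 - S_uu / S_yy
R2_c = S_xy ** 2 / (S_xx * S_yy)
print(R2_a, R2_b, R2_c)  # all three agree
```

With these numbers S_yy = 14/3 splits into S_ŷŷ = 4.5 and S_ûû = 1/6, giving R² = 27/28 ≈ 0.964.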

27 Properties of the estimators The estimators β̂ = S_xy/S_xx and α̂ = ȳ − β̂x̄ are random variables Thought experiment: repeated samples Computer simulation [experiment.r] 34

28 Under the a-, b- and c-assumptions (without b4) [3] E(α̂) = α and E(β̂) = β, and [4] Var(α̂) = σ²(1/T + x̄²/S_xx) Var(β̂) = σ²/S_xx Cov(α̂, β̂) = −σ²x̄/S_xx BLUE property: α̂ and β̂ are the best linear unbiased estimators [5] If, additionally, b4 is true, then α̂ and β̂ are the best unbiased estimators 35

29 How are y_t, α̂ and β̂ distributed? Because u_t ~ NID(0, σ²), y_t is normally distributed for t = 1, ..., T The expectation of y_t is E(y_t) = E(α + βx_t + u_t) = E(α) + E(βx_t) + E(u_t) = α + βx_t 36

30 The variance of y_t is Var(y_t) = E[(y_t − E(y_t))²] = E[(y_t − α − βx_t)²] = E[u_t²] = E[(u_t − E(u_t))²] = σ² Further, for t = 1, ..., T y_t ~ NID(α + βx_t, σ²) 37

31 Since β̂ = S_xy/S_xx and α̂ = ȳ − β̂x̄, both α̂ and β̂ are linear transformations of the y_t Linear transformations of independent normally distributed random variables are normally distributed Hence α̂ ~ N(α, σ²(1/T + x̄²/S_xx)) and β̂ ~ N(β, σ²/S_xx) 38

32 Interval estimation (Intervallschätzung) We already know that β̂ is a random variable and β̂ ~ N(β, σ²/S_xx) Instead of a point estimator β̂ we now want an interval estimator [β̂ − k; β̂ + k] satisfying P(β̂ − k ≤ β ≤ β̂ + k) = 1 − a The interval [β̂ − k; β̂ + k] is called (1 − a)-confidence interval (Konfidenzintervall) 39

33 Confidence interval when σ² is known Step 1: Standardization of β̂ with se(β̂) = √(σ²/S_xx): z = (β̂ − E(β̂))/se(β̂) = (β̂ − β)/se(β̂) ~ N(0, 1) The random variable z = (β̂ − β)/se(β̂) is a pivot (Pivot), i.e. its distribution does not depend on unknown parameters 40

34 Step 2: Find the (1 − a/2)-quantile z_{a/2} with P(−z_{a/2} ≤ z ≤ z_{a/2}) = 1 − a Step 3: Substitute z by (β̂ − β)/se(β̂): P(−z_{a/2} ≤ (β̂ − β)/se(β̂) ≤ z_{a/2}) = 1 − a Rewriting yields the (1 − a)-interval [6]{3} [β̂ − z_{a/2} se(β̂); β̂ + z_{a/2} se(β̂)] 41

35 Confidence interval when σ² is unknown Step 1: Estimation of σ² and se(β̂): σ̂² = (1/(T − 2)) Σ_{t=1}^T û_t² is a consistent and unbiased estimator of σ², and cse(β̂) = √(σ̂²/S_xx) is a consistent estimator of se(β̂) (we postpone the proofs) Step 2: Standardization of β̂: t = (β̂ − E(β̂))/cse(β̂) = (β̂ − β)/cse(β̂) ~ t(T − 2) 42

36 The random variable t = (β̂ − β)/cse(β̂) is a pivot Step 3: Find the (1 − a/2)-quantile t_{a/2} with P(−t_{a/2} ≤ t ≤ t_{a/2}) = 1 − a Step 4: Substitute and solve for β: P(β̂ − t_{a/2} cse(β̂) ≤ β ≤ β̂ + t_{a/2} cse(β̂)) = 1 − a The interval estimator is {4} [β̂ − t_{a/2} cse(β̂); β̂ + t_{a/2} cse(β̂)] 43
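These steps can be sketched numerically. The data are hypothetical; with T = 3 there are T − 2 = 1 degrees of freedom, and the critical value t_{0.025}(1) = 12.706 is taken as given from a t-table (Python's standard library has no t-quantile function):

```python
# (1-a)-confidence interval for the slope when sigma^2 is unknown.
# Hypothetical data; a = 0.05, t_{0.025}(1) = 12.706 from a t-table.

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 5.0]
T = len(x)
x_bar, y_bar = sum(x) / T, sum(y) / T
S_xx = sum((xt - x_bar) ** 2 for xt in x)
S_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
beta_hat = S_xy / S_xx
alpha_hat = y_bar - beta_hat * x_bar

residuals = [yt - alpha_hat - beta_hat * xt for xt, yt in zip(x, y)]
sigma2_hat = sum(u ** 2 for u in residuals) / (T - 2)  # unbiased estimator
se_beta = (sigma2_hat / S_xx) ** 0.5                   # estimated standard error

t_crit = 12.706                                        # t_{a/2} with T-2 = 1 df
ci = (beta_hat - t_crit * se_beta, beta_hat + t_crit * se_beta)
print(ci)
```

The interval is wide here because one degree of freedom gives a huge t-quantile; with realistic sample sizes it shrinks quickly.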

37 Interval estimator for intercept α [α̂ − t_{a/2} cse(α̂); α̂ + t_{a/2} cse(α̂)] where cse(α̂) = √(σ̂²(1/T + x̄²/S_xx)) Some terminology: The standard error (Standardfehler) is se(β̂); the estimated standard error is cse(β̂) Usually, both se(β̂) and cse(β̂) are called standard error (Standardfehler) Interpretation of interval estimators? 44

38 Hypothesis tests How can we test hypotheses about the regression coefficients (usually about the slope β)? Null hypothesis H_0 and alternative hypothesis H_1 (Nullhypothese und Alternativhypothese) There are one-sided and two-sided tests We already know that β̂ ~ N(β, σ²/S_xx) 45

39 If the null hypothesis H_0: β = q is true, then β can be substituted by q: β̂ ~ N(q, σ²/S_xx) Then P(β̂ − k ≤ q ≤ β̂ + k) = 1 − a, equivalently P(q − k ≤ β̂ ≤ q + k) = 1 − a With high probability 1 − a, the estimator β̂ will be inside the interval [q − k; q + k] if H_0 is true If the estimator β̂ is outside the interval, that is evidence against the null hypothesis 46

40 Graphical illustration 47

41 The analytical approach is slightly different Step 1: Set up H_0 and H_1 and fix the significance level a H_0: β = q H_1: β ≠ q Step 2: Estimate se(β̂) with σ̂² = S_ûû/(T − 2) and cse(β̂) = √(σ̂²/S_xx) 48

42 Step 3: Compute the t-test statistic t = (β̂ − q)/cse(β̂) If H_0: β = q is true, then t ~ t(T − 2) Step 4: Find the critical value t_{a/2} with P(−t_{a/2} ≤ t ≤ t_{a/2}) = 1 − a Step 5: Compare t_{a/2} and t. If t is outside [−t_{a/2}; t_{a/2}], i.e. if |t| > t_{a/2}, then reject H_0 {5} 49
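The five steps can be sketched as follows, again on hypothetical data with the t(1) critical value 12.706 taken as given from a t-table:

```python
# Two-sided t-test of H0: beta = q following steps 1-5 above.
# Hypothetical data; T = 3, so T - 2 = 1 degrees of freedom.

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 5.0]
T = len(x)
x_bar, y_bar = sum(x) / T, sum(y) / T
S_xx = sum((xt - x_bar) ** 2 for xt in x)
S_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
beta_hat = S_xy / S_xx
alpha_hat = y_bar - beta_hat * x_bar
S_uu = sum((yt - alpha_hat - beta_hat * xt) ** 2 for xt, yt in zip(x, y))
cse_beta = (S_uu / (T - 2) / S_xx) ** 0.5   # estimated standard error

q = 0.0                                     # step 1: H0: beta = 0
t_stat = (beta_hat - q) / cse_beta          # step 3
t_crit = 12.706                             # step 4: t_{0.025}(1)
reject = abs(t_stat) > t_crit               # step 5
print(t_stat, reject)
```

Here t ≈ 5.20 < 12.706, so with only three observations H_0: β = 0 cannot be rejected even though β̂ = 1.5.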

43 Connections between hypothesis testing and confidence intervals Under the (two-sided) null hypothesis H_0 P(q − t_{a/2} cse(β̂) ≤ β̂ ≤ q + t_{a/2} cse(β̂)) = 1 − a The (1 − a)-confidence interval is [β̂ − t_{a/2} cse(β̂); β̂ + t_{a/2} cse(β̂)] Conclusion: If q is outside the confidence interval, H_0 is rejected {6} 50

44 One-sided hypothesis tests (einseitige Tests) Right- or left-sided tests Right-sided null hypothesis H_0: β ≤ q H_1: β > q The basic idea remains the same: If β̂ is much larger than q, reject H_0 51

45 Graphical illustration: 52

46 Analytical approach (right-sided null hypothesis) Step 1: State H_0 and H_1 and set the significance level a H_0: β ≤ q H_1: β > q Step 2: Estimate se(β̂) Step 3: Compute the t-statistic t = (β̂ − q)/cse(β̂) Under H_0 its distribution is t ~ t(T − 2) 53

47 Step 4: Find the critical value t_a with P(t ≤ t_a) = 1 − a For left-sided null hypotheses, the steps 1, 2 and 3 are the same; the critical value is the lower quantile t_{1−a} = −t_a with P(t < −t_a) = a Step 5: Compare t_a and t; reject H_0 if t > t_a {7} For left-sided null hypotheses, H_0 is rejected if t is less than the critical value, i.e. if t < t_{1−a} = −t_a 54

48 The p-value (p-Wert) The p-value is the probability, computed under the null hypothesis, that the test statistic (a random variable) is greater than the realized test statistic (for a right-sided test) Traditional approach: Reject the null hypothesis if the test statistic is inside the critical region, e.g. if t > t_a Alternative approach: Comparison of probabilities; reject the null hypothesis if the p-value is less than the significance level a 55
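The p-value approach can be sketched in a few lines. The exact reference distribution is t(T − 2); the sketch below uses the standard normal CDF from Python's standard library instead, which is only a large-sample approximation of the t-distribution:

```python
# p-value for a right-sided test, using the standard normal CDF as a
# large-sample approximation of the t(T-2) distribution.
from statistics import NormalDist

t_realized = 1.96                        # hypothetical realized test statistic
p_value = 1 - NormalDist().cdf(t_realized)
a = 0.05
print(p_value, p_value < a)              # reject H0 iff p-value < a
```

The decision "reject iff p-value < a" reproduces the traditional decision "reject iff t > t_a" exactly, which is the point made on the next slide.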

49 Graphical illustration: 56

50 The two approaches, comparison of t-statistic and critical value, or comparison of p-value and significance level, are essentially identical {8} Advantages of the p-value approach? Disadvantages? p-value formulas for right- and left-sided hypothesis tests? [7] p-value formula for two-sided hypothesis test? 57

51 How to choose the null and alternative hypotheses There are basically two strategies: State the opposite of the conjecture as the null hypothesis and try to reject it State the conjecture as the null hypothesis and show that it cannot be rejected There is an important asymmetry between rejection and non-rejection 58

52 Maximum likelihood estimation Main idea: Find those parameter values that maximize the probability (or likelihood) of observing the actually observed data Notation: θ: Parameter vector, e.g. θ = (α, β, σ²) L(θ): Likelihood (given all the data) ln L(θ): Log-likelihood Maximum likelihood estimator θ̂ = argmax ln L(θ) 59

53 We already know that, for t = 1, ..., T, y_t ~ NID(α + βx_t, σ²), hence the density of y_t is f_{y_t}(y) = (1/√(2πσ²)) exp(−(1/2)(y − α − βx_t)²/σ²) Due to independence, the joint likelihood and log-likelihood are L(α, β, σ²) = f_{y_1,...,y_T}(y_1, ..., y_T) = Π_{t=1}^T f_{y_t}(y_t) and ln L(α, β, σ²) = ln f_{y_1,...,y_T}(y_1, ..., y_T) = Σ_{t=1}^T ln f_{y_t}(y_t) 60

54 Maximize ln L(α, β, σ²) = Σ_{t=1}^T ln[(1/√(2πσ²)) exp(−(1/2)(y_t − α − βx_t)²/σ²)] with respect to the parameters α, β, σ² [8] The ML estimators are α̂_ML = ȳ − β̂_ML x̄ β̂_ML = S_xy/S_xx σ̂²_ML = (1/T) Σ_{t=1}^T û_t² 61
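The only difference to OLS is the variance estimator: ML divides the sum of squared residuals by T, the unbiased estimator by T − 2. A short Python sketch with hypothetical data:

```python
# ML vs. unbiased estimation of sigma^2 in the simple linear model.
# The slope/intercept ML estimators coincide with OLS; hypothetical data.

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 5.0]
T = len(x)
x_bar, y_bar = sum(x) / T, sum(y) / T
S_xx = sum((xt - x_bar) ** 2 for xt in x)
S_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
beta_ml = S_xy / S_xx                 # identical to the OLS estimator
alpha_ml = y_bar - beta_ml * x_bar
S_uu = sum((yt - alpha_ml - beta_ml * xt) ** 2 for xt, yt in zip(x, y))

sigma2_ml = S_uu / T                  # biased downwards
sigma2_unbiased = S_uu / (T - 2)
print(sigma2_ml, sigma2_unbiased)
```

The downward bias of σ̂²_ML vanishes as T grows, since (T − 2)/T → 1.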

55 Hypothesis tests in the maximum likelihood framework (the three classical tests: Wald, LR, LM) Null and alternative hypotheses, e.g. H_0: β = β_0 H_1: β ≠ β_0 Derivation of the test statistics [exercise] 62

56 Forecasting Conditional forecast: the value x_0 of the exogenous variable is known and non-stochastic Point forecast of the endogenous variable is {9} ŷ_0 = α̂ + β̂x_0 The true value of y_0 is usually not ŷ_0 but y_0 = α + βx_0 + u_0 63

57 The forecasting error is ŷ_0 − y_0 = α̂ + β̂x_0 − (α + βx_0 + u_0) = (α̂ − α) + (β̂ − β)x_0 − u_0 There are two error sources: 1. The error term u_0 will not vanish, in general. 2. The parameter estimates α̂ and β̂ will deviate from the true values α and β. 64

58 Properties of the point forecast Expected forecasting error: E(ŷ_0 − y_0) = E(α̂ − α) + E(β̂ − β)x_0 − E(u_0) = 0 Variance of the forecasting error [9] Var(ŷ_0 − y_0) = σ²[1 + 1/T + (x_0 − x̄)²/S_xx] Estimated variance of the forecasting error {9} V̂ar(ŷ_0 − y_0) = σ̂²[1 + 1/T + (x_0 − x̄)²/S_xx] 65
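The variance formula shows that forecasts become less precise the further x_0 lies from x̄. A small Python sketch with hypothetical data:

```python
# Estimated variance of the forecasting error for a conditional forecast at x0:
# Var(y0_hat - y0) = sigma2_hat * (1 + 1/T + (x0 - x_bar)^2 / S_xx).
# Hypothetical data.

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 5.0]
T = len(x)
x_bar, y_bar = sum(x) / T, sum(y) / T
S_xx = sum((xt - x_bar) ** 2 for xt in x)
S_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
beta_hat = S_xy / S_xx
alpha_hat = y_bar - beta_hat * x_bar
sigma2_hat = sum((yt - alpha_hat - beta_hat * xt) ** 2
                 for xt, yt in zip(x, y)) / (T - 2)

def forecast_error_var(x0):
    # grows quadratically as x0 moves away from x_bar
    return sigma2_hat * (1 + 1/T + (x0 - x_bar) ** 2 / S_xx)

print(forecast_error_var(2.0), forecast_error_var(5.0))
```

At x_0 = x̄ the variance reaches its minimum σ̂²(1 + 1/T); extrapolating to x_0 = 5 more than quadruples it in this example.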

59 Interval forecast Step 1: Estimation of se(ŷ_0 − y_0) Step 2: Standardization of (ŷ_0 − y_0): since E(ŷ_0 − y_0) = 0, t = ((ŷ_0 − y_0) − E(ŷ_0 − y_0))/cse(ŷ_0 − y_0) = (ŷ_0 − y_0)/cse(ŷ_0 − y_0) ~ t(T − 2) Step 3: Find the t_{a/2}-value (from statistical tables or using statistical computer software) 66

60 Step 4: With large probability 1 − a, the random variable t will be inside the interval [−t_{a/2}; t_{a/2}]: P(−t_{a/2} ≤ (ŷ_0 − y_0)/cse(ŷ_0 − y_0) ≤ t_{a/2}) = 1 − a Solve for y_0: P(ŷ_0 − t_{a/2} cse(ŷ_0 − y_0) ≤ y_0 ≤ ŷ_0 + t_{a/2} cse(ŷ_0 − y_0)) = 1 − a Hence, the interval forecast is {9} [ŷ_0 − t_{a/2} cse(ŷ_0 − y_0); ŷ_0 + t_{a/2} cse(ŷ_0 − y_0)] 67

61 Width of the interval 68

62 Multiple linear regression model Until now we only considered a single exogenous variable, but in most empirical problems we face many exogenous variables Many of the results from the simple linear regression model can be transferred to the multiple case Important tool: matrix algebra (main diagonal, transpose, addition, scalar multiplication, inner product, matrix multiplication, idempotent matrices, determinant, rank, inverse, trace, definite matrices, semidefinite matrices) 69

63 Specification Example: Estimation of a production function for barley Conduct an experiment where the barley output (Gerste, g_t) is observed for different combinations of phosphate (p_t) and nitrogen (n_t) There are T = 30 different combinations The following table shows the data 70

64 [Data table: 30 observations of phosphate p_t, nitrogen n_t and barley output g_t; the values are garbled in the transcription] 71

65 Functional specification (A-assumptions) The economic (agro-economic) model formalizes the connection between the barley output (g) and the fertilizers (p and n): g = f(p, n) Possible functional form g = α + β_1 p + β_2 n A more realistic functional form g = A p^{β_1} n^{β_2}, where A, β_1 and β_2 are constant parameters 72

66 Take logarithms of the production function g = A p^{β_1} n^{β_2}: ln g = ln A + β_1 ln p + β_2 ln n Define α = ln A, y = ln g, x_1 = ln p and x_2 = ln n, then y = α + β_1 x_1 + β_2 x_2 [Table of log-values x_1 = ln p_t, x_2 = ln n_t, y = ln g_t; the values are garbled in the transcription] 73

67 The econometric model is y_t = α + β_1 x_1t + β_2 x_2t + u_t for t = 1, ..., T General model for K exogenous variables: y_t = α + β_1 x_1t + β_2 x_2t + ... + β_K x_Kt + u_t for t = 1, ..., T or y_1 = α + β_1 x_11 + β_2 x_21 + ... + β_K x_K1 + u_1 y_2 = α + β_1 x_12 + β_2 x_22 + ... + β_K x_K2 + u_2 ... y_T = α + β_1 x_1T + β_2 x_2T + ... + β_K x_KT + u_T 74

68 Matrix notation: Define y = (y_1, y_2, ..., y_T)′, β = (α, β_1, ..., β_K)′, u = (u_1, u_2, ..., u_T)′, and X as the T × (K + 1) matrix whose t-th row is (1, x_1t, ..., x_Kt) Compact notation for the multiple regression model: y = Xβ + u 75

69 The A-assumptions Assumption A1: No relevant exogenous variable is omitted from the econometric model, and all exogenous variables included in the model are relevant Assumption A2: The true functional dependence between X and y is linear Assumption A3: The parameters β are constant for all T observations (x_t, y_t) 76

70 The B-assumptions The B-assumptions are the same as in the simple linear model, i.e. E(u_t) = 0, Var(u_t) = σ², Cov(u_t, u_s) = 0 for t ≠ s, and normality B1 to B4 in matrix notation: u ~ N(0, σ²I_T) 77

71 The C-assumptions Assumption C1: The exogenous variables x_1t, ..., x_Kt are not stochastic, but can be controlled as in an experimental situation Assumption C2: No perfect multicollinearity: There are no parameter values γ_0, γ_1, γ_2, ..., γ_K (with at least one γ_k ≠ 0) such that γ_0 + γ_1 x_1t + γ_2 x_2t + ... + γ_K x_Kt = 0 for all t = 1, ..., T Assumption C2 in matrix notation: rank(X) = K + 1 (implication: T ≥ K + 1) 78

72 Perfect multicollinearity with two regressors If C2 is violated, there are γ_0, γ_1, γ_2 (not all 0) such that γ_0 + γ_1 x_1t + γ_2 x_2t = 0 for all t = 1, ..., T, thus x_2t = −(γ_0/γ_2) − (γ_1/γ_2) x_1t = δ_0 + δ_1 x_1t with δ_0 = −(γ_0/γ_2) and δ_1 = −(γ_1/γ_2) Hence, there are not really two regressors, since y_t = α + β_1 x_1t + β_2 x_2t + u_t = (α + β_2 δ_0) + (β_1 + β_2 δ_1) x_1t + u_t = α′ + β′ x_1t + u_t 79

73 Point estimation The econometric model is y = Xβ + u, i.e. y_t = α + β_1 x_1t + ... + β_K x_Kt + u_t for t = 1, ..., T The estimated model is ŷ = Xβ̂, i.e. ŷ_t = α̂ + β̂_1 x_1t + ... + β̂_K x_Kt for t = 1, ..., T 80

74 Define the residuals û = y − ŷ, i.e. û_t = y_t − ŷ_t for t = 1, ..., T How can we find an estimator β̂ in the multiple regression model? The sum of squared residuals is S_ûû = û′û = Σ û_t² 81

75 Because of û = y − Xβ̂, i.e. û_t = y_t − α̂ − β̂_1 x_1t − ... − β̂_K x_Kt, we have S_ûû = (y − Xβ̂)′(y − Xβ̂) = Σ (y_t − α̂ − β̂_1 x_1t − ... − β̂_K x_Kt)² First order conditions: ∂S_ûû/∂β̂ = (∂S_ûû/∂α̂, ∂S_ûû/∂β̂_1, ..., ∂S_ûû/∂β̂_K)′ = 0 82

76 Vector of derivatives ∂S_ûû/∂β̂ = ∂[(y − Xβ̂)′(y − Xβ̂)]/∂β̂ = ∂[y′y − 2y′Xβ̂ + β̂′X′Xβ̂]/∂β̂ = −2X′y + 2X′Xβ̂ J. R. Magnus, H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, rev. ed., John Wiley & Sons: Chichester; Phoebus J. Dhrymes, Mathematics for Econometrics, 3rd ed., Springer: New York 83

77 Solving the first order conditions yields the normal equations X′Xβ̂ = X′y and thus β̂ = (X′X)⁻¹X′y The terms are X′X, the (K + 1) × (K + 1) matrix with first row (T, Σx_1t, ..., Σx_Kt) and typical element Σ x_jt x_kt, and X′y = (Σy_t, Σx_1t y_t, ..., Σx_Kt y_t)′ Numeric illustration {10} 84
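The normal equations can be solved without any matrix library. A Python sketch (pure standard library, Gaussian elimination), using hypothetical exact data y = 1 + 2x_1 + 3x_2 with no error term so that the estimator must recover (1, 2, 3):

```python
# OLS via the normal equations X'X beta_hat = X'y, solved with plain
# Gaussian elimination. Hypothetical noise-free data.

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]     # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k])) # pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):                       # back substitution
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

x1 = [0.0, 1.0, 2.0, 3.0]
x2 = [1.0, 0.0, 1.0, 0.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
X = [[1.0, a, b] for a, b in zip(x1, x2)]                # column of ones = intercept

XtX = matmul(transpose(X), X)
Xty = [sum(row[k] * yt for row, yt in zip(X, y)) for k in range(3)]
beta_hat = solve(XtX, Xty)
print(beta_hat)  # ~ [1.0, 2.0, 3.0]
```

In practice the inverse (X′X)⁻¹ is never formed explicitly; solving the linear system directly, as here, is both cheaper and numerically more stable.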

78 Meaning of the estimators α̂, β̂_1 and β̂_2 Formal meaning: ∂ŷ_t/∂x_1t = β̂_1 and ∂ŷ_t/∂x_2t = β̂_2 Meaning of α̂: for x_1t = x_2t = 0, ln ĝ_t = α̂, i.e. ĝ_t = e^{α̂} 85

79 Meaning of β̂_1 and β̂_2: β̂_1 = ∂ŷ_t/∂x_1t = ∂(ln ĝ_t)/∂(ln p_t) Because of ∂ ln ĝ_t/∂ĝ_t = 1/ĝ_t and ∂ ln p_t/∂p_t = 1/p_t we find β̂_1 = (dĝ_t/ĝ_t)/(dp_t/p_t) β̂_1 is the estimated elasticity of the barley output with respect to the phosphate fertilizer 86

80 Coefficient of determination R² The total variation of y can be decomposed in the same way as in the simple linear model: S_yy (total variation) = S_ŷŷ (explained variation) + S_ûû (unexplained variation) The coefficient of determination is defined as R² = explained variation / total variation = S_ŷŷ/S_yy = (S_yy − S_ûû)/S_yy 87

81 [Graphical illustration: Venn diagram of the variations S_yy, S_11, S_22 with regions A, B, C, D, E, F, G] Here R² = (A + B + C)/(A + B + C + E) 88

82 Computation of R²: In the simple linear regression model R² = S_ŷŷ/S_yy = β̂S_xy/S_yy It can be shown that in the multiple linear regression model S_ŷŷ = Σ_{k=1}^K β̂_k S_ky with the covariations S_ky = Σ_{t=1}^T (x_kt − x̄_k)(y_t − ȳ) Then {11} R² = (Σ_{k=1}^K β̂_k S_ky)/S_yy 89

83 Properties of the OLS estimators The estimator β̂ is a random vector The expectation vector is [10] (unbiasedness, Erwartungstreue) E(β̂) = β The covariance matrix of β̂ is [11] V(β̂) = σ²(X′X)⁻¹ 90

84 Special case: Covariance matrix in the two regressor model: Var(β̂_1) = σ²/(S_11(1 − R²)) Var(β̂_2) = σ²/(S_22(1 − R²)) Cov(β̂_1, β̂_2) = −σ²R²/(S_12(1 − R²)) Var(α̂) = σ²/T + x̄_1² Var(β̂_1) + 2x̄_1x̄_2 Cov(β̂_1, β̂_2) + x̄_2² Var(β̂_2) where R² = S_12²/(S_11 S_22) 91

85 Gauss-Markov theorem The estimator β̂ = (X′X)⁻¹X′y is linear in y, since β̂ = Dy with D = (X′X)⁻¹X′ β̂ = (X′X)⁻¹X′y is not only unbiased but also efficient: Let β̌ be another linear unbiased estimator of β Then V(β̌) − V(β̂) is positive semidefinite [12] 92

86 Distribution of the estimator The model is y = Xβ + u From u ~ N(0, σ²I_T) we conclude that y is multivariate normally distributed Expectation vector and covariance matrix of the endogenous variable: E(y) = E(Xβ + u) = Xβ V(y) = V(Xβ + u) = V(u) = σ²I_T Thus y ~ N(Xβ, σ²I_T) 93

87 How is the estimator β̂ distributed? Since β̂ = (X′X)⁻¹X′y, the estimator β̂ also has a multivariate normal distribution Expectation vector and covariance matrix are already known Hence β̂ ~ N(β, σ²(X′X)⁻¹) Problem: The error term variance σ² is unknown 94

88 The covariance matrix V(β̂) cannot be computed without σ² Since usually σ² is unknown, it has to be estimated An estimator of σ² is σ̂² = S_ûû/(T − K − 1) Its expectation is E(σ̂²) = σ² [13]{12} The residual maker matrix is M = I_T − X(X′X)⁻¹X′ 95

89 Interval estimation Interval estimation of a single component β_k of the vector β̂: P(β̂_k − c ≤ β_k ≤ β̂_k + c) = 1 − a We know that β̂_k ~ N(β_k, Var(β̂_k)) where Var(β̂_k) is the (k + 1)-th diagonal element of σ²(X′X)⁻¹ Problem: σ² and Var(β̂_k) are unknown 96

90 Step 1: Estimation of σ² by σ̂², and of se(β̂_k) = √(Var(β̂_k)) by cse(β̂_k) = √(V̂ar(β̂_k)) Step 2: Standardization of β̂_k: t = (β̂_k − E(β̂_k))/cse(β̂_k) = (β̂_k − β_k)/cse(β̂_k) ~ t(T − K − 1) Step 3: Find the t_{a/2}-value Step 4: The (1 − a)-interval estimator is {13} [β̂_k − t_{a/2} cse(β̂_k); β̂_k + t_{a/2} cse(β̂_k)] 97

91 Interval estimation of linear combinations of β̂ Let r be an arbitrary (K + 1)-column vector How can we find a confidence interval of r′β? Fertilizer example: r = [0, 1, 1]′, then r′β = β_1 + β_2 (economies of scale?) The point estimator of r′β is r′β̂ The variance of r′β̂ is r′V(β̂)r = σ²r′(X′X)⁻¹r 98

92 The confidence interval for r′β is [r′β̂ − t_{a/2} σ̂ √(r′(X′X)⁻¹r); r′β̂ + t_{a/2} σ̂ √(r′(X′X)⁻¹r)] Special case of a single component: β_k = r′β for r = [0, ..., 0, 1, 0, ..., 0]′ where the 1 is located at the position of β_k Then Var(β̂_k) = σ²r′(X′X)⁻¹r 99

93 Hypothesis tests: t-test There are tests of a single linear combination (t-tests) and tests of multiple linear combinations (F-tests) Testing a single linear combination of parameters: t-test (two-sided) Remember: In the simple linear regression case H_0: β = q H_1: β ≠ q 100

94 In the multiple linear model the null and alternative hypotheses are H_0: r_0 α + r_1 β_1 + ... + r_K β_K = q H_1: r_0 α + r_1 β_1 + ... + r_K β_K ≠ q or H_0: r′β = q H_1: r′β ≠ q where r = [r_0, r_1, ..., r_K]′ 101

95 The test procedure: 1. Set up H_0 and H_1 and fix the significance level a 2. Estimate se(r′β̂) 3. Compute the t-statistic 4. Find the critical value t_{a/2} 5. Test decision: Compare t_{a/2} and t {14} 102

96 The left-sided t-test H_0: r′β ≥ q H_1: r′β < q and the right-sided test H_0: r′β ≤ q H_1: r′β > q are similar The critical values are lower quantiles of the t-distribution for the left-sided test and upper quantiles for the right-sided test {14} 103

97 Hypothesis tests: F-test Simultaneous test of two or more linear combinations (restrictions) Null hypothesis and alternative hypothesis H_0: Rβ = q H_1: Rβ ≠ q Examples: H_0: β_1 = β_2 = ... = β_K = 0 H_0: β_1 = β_2 = ... = β_K H_0: β_1 + ... + β_k = 1 and β_1 = 2β_2 H_0: β_1 = 5 and β_2 = ... = β_K = 0 104

98 Basic idea of the F-test: Compare the restricted and the unrestricted model Sum of squared residuals of the econometric model and of the model under the null hypothesis: S_ûû = û′û = Σ_{t=1}^T û_t² and S_û⁰û⁰ = û⁰′û⁰ = Σ_{t=1}^T (û⁰_t)² where û⁰ are the residuals if the model is estimated under the restrictions of the null hypothesis 105

99 Example: Null hypothesis y_t = α + 0·x_1t + ... + 0·x_Kt + u_t = α + u_t Obviously, S_û⁰û⁰ ≥ S_ûû; the null hypothesis is likely to be false if S_û⁰û⁰ is much larger than S_ûû The test statistic is F = [(S_û⁰û⁰ − S_ûû)/L] / [S_ûû/(T − K − 1)] where L is the number of restrictions in H_0 If the null hypothesis is true, then F ~ F(L, T − K − 1) 106

100 The five steps of the F-test 1. Set up H_0 and H_1 and choose the significance level a 2. Calculate S_ûû and S_û⁰û⁰ (more on the computation of S_û⁰û⁰ later) 3. Compute the F-test statistic 4. Find the critical value F_a, i.e. the upper a-quantile of the F(L, T − K − 1)-distribution 5. Reject H_0 if F > F_a {15} 107
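Step 3 is plain arithmetic once the two sums of squared residuals are available. A Python sketch with hypothetical numbers (not the lecture's fertilizer data):

```python
# F-test statistic from the restricted and unrestricted sums of squared
# residuals: F = ((S0 - S)/L) / (S/(T - K - 1)). Hypothetical values.

S_uu = 10.0         # unrestricted sum of squared residuals
S0_uu = 16.0        # restricted sum of squared residuals (S0 >= S always)
T, K, L = 20, 2, 2  # observations, regressors, number of restrictions

F = ((S0_uu - S_uu) / L) / (S_uu / (T - K - 1))
print(F)            # compare with the upper a-quantile of F(L, T-K-1)
```

Here F = 3/(10/17) = 5.1, to be compared with the upper a-quantile of the F(2, 17)-distribution from a table or software.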

101 Remarks: For L = 1 the F-test is identical to a two-sided t-test Careful: A combination of t-tests is not the same as a single F-test The decisions of t-tests and an F-test can be contradictory Distinction between individual t-tests and a simultaneous F-test 108

102 Example: H_0: β_1 = β_2 = 0.33 {16} 109

103 Computation of û⁰′û⁰ Estimate β subject to the restrictions Rβ = q given in the null hypothesis Optimization under constraints: Minimize S_û⁰û⁰ = (y − Xβ)′(y − Xβ) with respect to β subject to Rβ = q A standard Lagrange approach yields [14] β̂_RLS = β̂ − (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹(Rβ̂ − q) 110

104 Residuals of the restricted model: û⁰ = y − Xβ̂_RLS {17} The F-test statistic can also be written as [15] F = [(Rβ̂ − q)′[R(X′X)⁻¹R′]⁻¹(Rβ̂ − q)/L] / [û′û/(T − K − 1)] Note the similarity to the t-test statistic t² = (r′β̂ − q)²/[σ̂² r′(X′X)⁻¹r] Standard statistical software includes simultaneous tests of linear combinations (F-tests) 111

105 Maximum likelihood estimation Repetition: If X is a K-dimensional random vector with multivariate normal distribution N(μ, Σ), then its joint density is f_X(x) = (2π)^{−K/2} (det Σ)^{−1/2} exp(−(1/2)(x − μ)′Σ⁻¹(x − μ)) Multiple linear regression model y = Xβ + u with u ~ N(0, σ²I) Distribution of the endogenous variables: y ~ N(Xβ, σ²I) 112

106 Joint density of y: f_y(y) = (2π)^{−T/2} (det σ²I)^{−1/2} exp(−(1/2)(y − Xβ)′(σ²I)⁻¹(y − Xβ)) = (2π)^{−T/2} (σ^{2T})^{−1/2} exp(−(y − Xβ)′(y − Xβ)/(2σ²)) Log-likelihood function ln L(β, σ²) = −(T/2) ln(2π) − (T/2) ln σ² − (y − Xβ)′(y − Xβ)/(2σ²) 113

107 First order conditions for a maximum: ∂ln L/∂β = X′(y − Xβ)/σ² = 0 ∂ln L/∂σ² = −T/(2σ²) + (y − Xβ)′(y − Xβ)/(2σ⁴) = 0 Solution of the FOCs [16] β̂_ML = (X′X)⁻¹X′y σ̂²_ML = (1/T) û′û The ML estimator of β is identical to the OLS estimator; the ML estimator of σ² is different and thus biased (but asymptotically unbiased) 114

108 The classical tests (LR, Wald, LM) Illustration of the basic test ideas [threetests.r] Generalization to multiple restrictions H_0: g(β) = 0 H_1: g(β) ≠ 0 where β is the coefficient vector of a multiple linear regression model and g is a (possibly nonlinear) vector-valued function Test of L linear restrictions: g(β) = Rβ − q 115

109 Wald test Idea: If g(β̂_ML) is significantly different from 0, reject H_0 Test statistic (for multiple restrictions) W = g(β̂_ML)′ [Ĉov(g(β̂_ML))]⁻¹ g(β̂_ML), which is asymptotically χ²(L)-distributed if the null hypothesis is true Wald test statistic for L linear restrictions Rβ − q = 0 [17] 116

110 Likelihood ratio (LR) test Idea: If the maximal likelihood under the restrictions L(β̂_R, σ̂²_R) is significantly lower than the maximal likelihood without restrictions L(β̂_ML, σ̂²_ML), then reject H_0 Test statistic LR = 2[ln L(β̂_ML, σ̂²_ML) − ln L(β̂_R, σ̂²_R)], which is asymptotically χ²(L)-distributed if the null hypothesis is true LR test statistic for L linear restrictions Rβ − q = 0 [18] 117

111 Lagrange multiplier (LM) test Idea: If the slope of the log-likelihood function ∂ln L(β̂_R)/∂β is significantly different from 0, reject H_0 Test statistic LM = (∂ln L(β̂_R)/∂β)′ [Ĉov(β̂_R)]⁻¹ (∂ln L(β̂_R)/∂β), which is asymptotically χ²(L)-distributed if the null hypothesis is true LM test statistic for L linear restrictions Rβ − q = 0 [19] 118

112 Forecasting The approach is similar to forecasting in the simple linear regression Let x_0 = [1, x_10, x_20, ..., x_K0]′ denote the vector of exogenous variables Point forecast ŷ_0 = x_0′β̂ Variance of the forecast error [20] Var(ŷ_0 − y_0) = σ²(1 + x_0′(X′X)⁻¹x_0) 119

113 Presentation of the results In the literature, the results of regression analyses are often presented as follows: ŷ = α̂ + β̂_1 x_1 + ... + β̂_K x_K with (cse(α̂)), (cse(β̂_1)), ..., (cse(β̂_K)) in parentheses below the coefficients Sometimes you find t-values in the parentheses, i.e. the values of the test statistics for the tests H_0: β_k = 0 vs H_1: β_k ≠ 0 Often, R² and σ̂ and the value of the test statistic of the F-test H_0: β_1 = ... = β_K = 0 vs H_1: not H_0 are reported additionally 120

114 Fertilizer example: ŷ = α̂ + β̂_1 x_1 + β̂_2 x_2 with estimated standard errors in parentheses Additional results: R², σ̂², σ̂ Test statistics for H_0: β_1 = 0, H_0: β_2 = 0 and H_0: β_1 = β_2 = 0 [numeric values lost in the transcription] 121

115 Examples of computer output: Excel SPSS EViews Stata R matlab 122

116 Assumptions A1: No relevant variable is omitted, and no irrelevant variables are included A2: The true functional dependence between X and y is linear A3: The parameters β are constant for all T observations (x_t, y_t) B1-B4: u ~ N(0, σ²I_T) C1: The exogenous variables are not stochastic C2: No perfect multicollinearity: rank(X) = K + 1 All assumptions can be violated What happens if they are violated? 123

117 Omitted or irrelevant variables Assumption A1: No relevant exogenous variable is omitted from the econometric model, and all exogenous variables included in the model are relevant What happens if relevant variables are missing? What happens if there are irrelevant variables included in the model? Example: Wage structure in a firm with 20 employees; what are the determinants of the wage y_t? 124

118 Data: Education x_1t; age x_2t; firm tenure x_3t [data table of the 20 observations lost in the transcription] 125

119 Three potential models (M2 is the true model): (M1) y_t = α + βx_1t + u′_t (M2) y_t = α + β_1 x_1t + β_2 x_2t + u_t (M3) y_t = α + β_1 x_1t + β_2 x_2t + β_3 x_3t + u″_t [Estimation results table — coefficients, estimated standard errors, t-tests and p-values for M1-M3 — lost in the transcription] 126

120 Omitted relevant variables [Graphical representation: Venn diagram of the variations S_yy, S_11, S_22] 127

121 The models: (M1) y_t = α + βx_1t + u′_t (M2) y_t = α + β_1 x_1t + β_2 x_2t + u_t (M3) y_t = α + β_1 x_1t + β_2 x_2t + β_3 x_3t + u″_t The error term of M1 is u′_t = β_2 x_2t + u_t with E(u′_t) = E(β_2 x_2t + u_t) = β_2 x_2t + E(u_t) = β_2 x_2t + 0 ≠ 0 128

122 If a relevant exogenous variable is omitted, assumption B1 is violated! Consequence for point estimation: β̂′_1 = β̂_1 + β̂_2 S_12/S_11 and E(β̂′_1) = E(β̂_1 + β̂_2 S_12/S_11) = β_1 + β_2 S_12/S_11 Consequence for interval estimation: [β̂′_1 − t_{a/2} cse(β̂′_1); β̂′_1 + t_{a/2} cse(β̂′_1)] 129

123 Further, se(β̂′_1) = √(Var(β̂′_1)) with Var(β̂′_1) = σ²/S_11 The estimator σ̂² = S_û′û′/(T − 2) is biased; the unbiased estimator is σ̂² = S_ûû/(T − 3) 130

124 Conclusion: The coverage probability of the confidence intervals is not 1 − a Hypothesis tests are also biased: The probability of an error of the first kind does not equal the significance level If a relevant exogenous variable is omitted, then the point estimators are biased and inconsistent, and the interval estimators and hypothesis tests are no longer valid {18} 131

125 Irrelevant variables The error term in the misspecified model M3 is u″_t = u_t − β_3 x_3t and since β_3 = 0, u″_t = u_t Consequently, E(α̂″) = α E(β̂″_1) = β_1 E(β̂″_2) = β_2 E(β̂″_3) = β_3 = 0 132

126 The variances of the estimators are

$Var(\hat\beta_1) = \dfrac{\sigma^2}{S_{11}\left(1 - R^2_{1\cdot 2}\right)}$ in the true model M2, and $Var(\hat\beta''_1) = \dfrac{\sigma^2}{S_{11}\left(1 - R^2_{1\cdot 23}\right)}$ in the extended model M3,

where $R^2_{1\cdot 2} \le R^2_{1\cdot 23}$ are the coefficients of determination of auxiliary regressions of $x_{1t}$ on the remaining regressors, so the variance in M3 is at least as large.

The estimated error term variance is $\hat\sigma^2 = S_{\hat u''\hat u''}/(T-4)$

Conclusion: Omitted relevant variables are a serious problem; redundant variables are not (but they inflate the standard errors)

127 Diagnosis

How can we find the correct model? The coefficient of determination $R^2$ does not help select a model.

Adjusted $R^2$:

$\bar R^2 = 1 - \dfrac{S_{\hat u\hat u}/(T-K-1)}{S_{yy}/(T-1)} = 1 - \left(1 - R^2\right)\dfrac{T-1}{T-K-1}$

128 Further model selection criteria (trade-off between bias and inefficiency)

Akaike information criterion (AIC):

$AIC = \ln\left(\dfrac{S_{\hat u\hat u}}{T}\right) + \dfrac{2(K+1)}{T}$

$t$-test for single variables; $F$-test for multiple variables
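As an illustration of how these criteria compare the three candidate models, the sketch below computes the adjusted $R^2$ and the AIC (using exactly the formulas above) on simulated data in which, as in the wage example, only the first two regressors matter. The data, seed, and function name are made up for illustration; the course's own examples use R.

```python
import numpy as np

def adj_r2_and_aic(y, X):
    """Adjusted R^2 and AIC for an OLS fit of y on a constant plus K regressors."""
    T, K = X.shape
    Z = np.column_stack([np.ones(T), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    S_uu = resid @ resid                              # sum of squared residuals
    S_yy = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - S_uu / S_yy
    adj_r2 = 1.0 - (1.0 - r2) * (T - 1) / (T - K - 1)
    aic = np.log(S_uu / T) + 2.0 * (K + 1) / T        # slide's AIC formula
    return adj_r2, aic

rng = np.random.default_rng(1)
T = 20
x1, x2, x3 = rng.normal(size=(3, T))
y = 1.0 + 2.0 * x1 + 1.5 * x2 + rng.normal(0.0, 0.5, T)   # x3 is irrelevant

results = {}
for name, cols in [("M1", [x1]), ("M2", [x1, x2]), ("M3", [x1, x2, x3])]:
    results[name] = adj_r2_and_aic(y, np.column_stack(cols))
    print(name, results[name])
```

The true model M2 should beat the underspecified M1 clearly on both criteria; whether it also beats M3 depends on the sample, which is exactly the bias-inefficiency trade-off mentioned above.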

129 Functional form

Assumption A2: The true functional dependence between $X$ and $y$ is linear.

Milk example: Milk production $m_t$ depends on the amount of concentrated feed $f_t$

[Table of the observations $(f_t, m_t)$ not reproduced in this transcription]

130 [Scatter plot of milk quantity (Milchmenge) against concentrated feed (Kraftfutter) not reproduced in this transcription]

131 A misspecified model returns useless results. Some nonlinear dependencies:

Semi-logarithmic: $m_t = \alpha + \beta \ln f_t + u_t$
Inverse: $m_t = \alpha + \beta (1/f_t) + u_t$
Exponential: $\ln m_t = \alpha + \beta f_t + u_t$
Logarithmic: $\ln m_t = \alpha + \beta \ln f_t + u_t$
Quadratic: $m_t = \alpha + \beta_1 f_t + \beta_2 f_t^2 + u_t$

132 Approach I: Estimation of a nonlinear regression $y_t = g(x_t) + u_t$ with criterion function

$\sum_{t=1}^{T} \left(y_t - g(x_t)\right)^2$

Optimization by numerical methods.

Approach II: Linearization of the model, then linear regression $y_t = \alpha + \beta x_t + u_t$ with $y_t = \ln m_t$ and $x_t = \ln f_t$
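Approach II can be illustrated in a few lines. The data below are made up (generated from a true log-log relation, not the milk data from the course); taking logs turns the model into a plain linear regression:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
f = rng.uniform(1.0, 10.0, n)                                  # "feed"
m = np.exp(0.5 + 0.3 * np.log(f) + rng.normal(0.0, 0.05, n))   # "milk"

# Linearization: ln m_t = alpha + beta * ln f_t + u_t
y, x = np.log(m), np.log(f)
Z = np.column_stack([np.ones(n), x])
(alpha_hat, beta_hat), *_ = np.linalg.lstsq(Z, y, rcond=None)

print(alpha_hat, beta_hat)  # should be close to the true values 0.5 and 0.3
```

Note that the linearized model minimizes squared errors in $\ln m_t$, not in $m_t$, so Approach I and Approach II are not numerically identical even when both are feasible.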

133 Diagnosis: Regression Specification Error Test (RESET)

Higher order Taylor approximation:

$y_t = f(x_t) = \alpha + \beta_1 x_t + \beta_2 x_t^2 + \beta_3 x_t^3 + \ldots$

Are the higher orders (jointly) significant? $F$-test of $\beta_2 = \beta_3 = \ldots = 0$

Problem: What happens if there are many exogenous variables?

134 Basic idea of the RESET: $\hat y_t^2, \hat y_t^3, \ldots$ are included as additional exogenous variables

$y_t = \alpha + \beta_1 x_t + \gamma_2 \hat y_t^2 + \gamma_3 \hat y_t^3 + u_t$

If $\gamma_2$ and/or $\gamma_3$ are significant, then there are nonlinearities: $F$-test of $\gamma_2 = \gamma_3 = 0$ (maybe even higher orders)

The test is implemented in many statistical software packages

135 RESET in the linear model:

1. Estimate the linear model and calculate the sum of squared residuals $S_{\hat u\hat u}$ and the fitted values $\hat y_t$
2. Add $L$ powers of $\hat y_t$ to the linear model, $y_t = \alpha + \beta_1 x_t + \gamma_2 \hat y_t^2 + \gamma_3 \hat y_t^3 + u_t$; estimate the extended model and calculate its sum of squared residuals $S_{\tilde u\tilde u}$
3. The null hypothesis is $H_0: \gamma_2 = \gamma_3 = 0$

136 4. Compute the $F$-test statistic

$F(L, T-K-1) = \dfrac{\left(S_{\hat u\hat u} - S_{\tilde u\tilde u}\right)/L}{S_{\tilde u\tilde u}/(T-K-1)}$

where $S_{\hat u\hat u}$ and $S_{\tilde u\tilde u}$ are the sums of squared residuals of the linear and the extended model, and $K$ is the number of exogenous variables in the extended model

5. If $F > F_a$ (significance level $a$, degrees of freedom $L$ and $T-K-1$), then $H_0$ is rejected and the linear model is discarded

Milk example
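The five steps can be coded directly. The sketch below is an illustrative Python implementation for the simple-regression case with hypothetical data (one sample that is truly linear, one with a quadratic term); it is not the course's own code, which is in R, and packaged versions of the test exist in standard econometrics libraries.

```python
import numpy as np

def ssr(y, Z):
    """OLS fit of y on Z; returns fitted values and the sum of squared residuals."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    fitted = Z @ beta
    resid = y - fitted
    return fitted, resid @ resid

def reset_F(y, x, L=2):
    """RESET F statistic for a regression of y on a constant and x,
    adding L powers yhat^2, ..., yhat^(L+1) of the fitted values."""
    T = len(y)
    Z0 = np.column_stack([np.ones(T), x])
    yhat, S_restricted = ssr(y, Z0)                        # step 1
    powers = np.column_stack([yhat ** p for p in range(2, L + 2)])
    Z1 = np.column_stack([Z0, powers])                     # step 2
    _, S_unrestricted = ssr(y, Z1)
    K = Z1.shape[1] - 1       # regressors in the extended model (without constant)
    return ((S_restricted - S_unrestricted) / L) / (S_unrestricted / (T - K - 1))

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 5.0, 100)
y_lin = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, 100)               # truly linear
y_quad = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(0.0, 1.0, 100) # nonlinear

F_lin, F_quad = reset_F(y_lin, x), reset_F(y_quad, x)
print(F_lin, F_quad)
```

Compared with the $F(2, 96)$ critical value (about 3.1 at the 5% level), $F_{quad}$ should come out far above it, while $F_{lin}$ typically stays below.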

137 Qualitative exogenous variables

Assumption A3: The parameters $\beta$ are constant for all $T$ observations $(x_t, y_t)$.

Example: The wage $y_t$ depends on education $x_{1t}$ and age $x_{2t}$:

$y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t$

The wage equations for males and females might be different:

$y_t = \alpha_M + \beta_{M1} x_{1t} + \beta_{M2} x_{2t} + u_t$
$y_t = \alpha_F + \beta_{F1} x_{1t} + \beta_{F2} x_{2t} + u_t$

What happens if the difference is neglected? [qualitative.r]

138 Dummy variable

$D_t = \begin{cases} 0 & \text{if male} \\ 1 & \text{if female} \end{cases}$

Extended model:

$y_t = \alpha + \gamma D_t + \beta_1 x_{1t} + \delta_1 D_t x_{1t} + \beta_2 x_{2t} + \delta_2 D_t x_{2t} + u_t$

Model for men ($D_t = 0$):

$y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t$

Model for women ($D_t = 1$):

$y_t = (\alpha + \gamma) + (\beta_1 + \delta_1) x_{1t} + (\beta_2 + \delta_2) x_{2t} + u_t$

139 If the qualitative variable has more than two values, we need more than one dummy variable.

Example: Religion (protestant, catholic, other)

$D_{Pt} = \begin{cases} 1 & \text{for protestant} \\ 0 & \text{for catholic} \\ 0 & \text{for other} \end{cases} \qquad D_{Ct} = \begin{cases} 0 & \text{for protestant} \\ 1 & \text{for catholic} \\ 0 & \text{for other} \end{cases}$

Meaning of the coefficients; testing structural stability

140 Estimation of the model

Use the ordinary $t$- or $F$-tests to detect differences in the coefficients, e.g. $H_0: \gamma = \delta_1 = \delta_2 = 0$.

Very often, the model includes only a level effect, i.e.

$y_t = \alpha + \gamma D_t + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t$

Then use a $t$-test for $\gamma$.

141 Estimation of the wage equation model

$y_t = \alpha + \gamma D_t + \beta_1 x_{1t} + \delta_1 D_t x_{1t} + \beta_2 x_{2t} + \delta_2 D_t x_{2t} + u_t$

Compare with separate estimation of the two models [wages.r]:

$y_t = \alpha_M + \beta_{M1} x_{1t} + \beta_{M2} x_{2t} + u_t$ for men
$y_t = \alpha_F + \beta_{F1} x_{1t} + \beta_{F2} x_{2t} + u_t$ for women

The point estimates and the sums of squared residuals are identical (why?) The standard errors differ (why?)

142 For simplicity we only consider one exogenous variable:

$y_t = \alpha + \gamma D_t + \beta x_t + \delta D_t x_t + u_t$

Order the observations such that $D_t = 0$ for $t = 1, \ldots, T_1$ and $D_t = 1$ for $t = T_1+1, \ldots, T$.

The joint estimation minimizes (with respect to $\alpha, \beta, \gamma, \delta$)

$S(\alpha, \beta, \gamma, \delta) = \sum_{t=1}^{T_1} (y_t - \alpha - \beta x_t)^2 + \sum_{t=T_1+1}^{T} \left(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\right)^2$

143 The first order conditions for the joint estimation are

$\dfrac{\partial S}{\partial \alpha}: \ \sum_{t=1}^{T_1} (y_t - \alpha - \beta x_t) + \sum_{t=T_1+1}^{T} \left(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\right) = 0$

$\dfrac{\partial S}{\partial \beta}: \ \sum_{t=1}^{T_1} (y_t - \alpha - \beta x_t)\, x_t + \sum_{t=T_1+1}^{T} \left(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\right) x_t = 0$

$\dfrac{\partial S}{\partial \gamma}: \ \sum_{t=T_1+1}^{T} \left(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\right) = 0$

$\dfrac{\partial S}{\partial \delta}: \ \sum_{t=T_1+1}^{T} \left(y_t - (\alpha + \gamma) - (\beta + \delta) x_t\right) x_t = 0$

144 Hence, the point estimates in the joint estimation are identical to those of the separate estimations.

If the point estimates are identical, then so are the residuals; and if the residuals are identical, then so are the sums of squared residuals.

As to the standard errors, in the joint model we estimate

$\hat\sigma^2 = S_{\hat u\hat u}/(T-4)$

while in the separate estimations we estimate

$\hat\sigma^2_0 = S^0_{\hat u\hat u}/(T_1 - 2)$
$\hat\sigma^2_1 = S^1_{\hat u\hat u}/((T - T_1) - 2)$
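This equivalence is easy to verify numerically. The following sketch (illustrative Python with made-up data, standing in for the course's wages.r) estimates the joint dummy-interaction model and the two separate regressions and compares the point estimates:

```python
import numpy as np

rng = np.random.default_rng(4)
T1, T = 12, 20                              # D_t = 0 for the first T1 observations
D = np.array([0.0] * T1 + [1.0] * (T - T1))
x = rng.normal(10.0, 2.0, T)
y = 1.0 + 0.5 * D + 2.0 * x + 0.3 * D * x + rng.normal(0.0, 1.0, T)

def ols(y, Z):
    return np.linalg.lstsq(Z, y, rcond=None)[0]

# Joint estimation: constant, dummy, x, interaction
a, g, b, d = ols(y, np.column_stack([np.ones(T), D, x, D * x]))

# Separate estimations for the two groups
a0, b0 = ols(y[:T1], np.column_stack([np.ones(T1), x[:T1]]))
a1, b1 = ols(y[T1:], np.column_stack([np.ones(T - T1), x[T1:]]))

# alpha = a0, beta = b0, alpha + gamma = a1, beta + delta = b1
print(a - a0, b - b0, a + g - a1, b + d - b1)  # all numerically zero
```

Only the standard errors would differ, because the joint model pools the residuals into a single variance estimate while the separate regressions estimate one error variance per group.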

145 Remarks

What happens if the dummy variables are not 0/1-coded but 1/2-coded?

Consider the model

$y_t = \alpha + \gamma D_{1t} + \delta D_{2t} + \beta x_t + u_t$

where

$D_{1t} = \begin{cases} 0 & \text{for males} \\ 1 & \text{for females} \end{cases} \qquad D_{2t} = \begin{cases} 0 & \text{for German citizenship} \\ 1 & \text{else} \end{cases}$

Interaction terms


More information

Multiple Regression Analysis

Multiple Regression Analysis Chapter 4 Multiple Regression Analysis The simple linear regression covered in Chapter 2 can be generalized to include more than one variable. Multiple regression analysis is an extension of the simple

More information

INTRODUCTION TO BASIC LINEAR REGRESSION MODEL

INTRODUCTION TO BASIC LINEAR REGRESSION MODEL INTRODUCTION TO BASIC LINEAR REGRESSION MODEL 13 September 2011 Yogyakarta, Indonesia Cosimo Beverelli (World Trade Organization) 1 LINEAR REGRESSION MODEL In general, regression models estimate the effect

More information

the error term could vary over the observations, in ways that are related

the error term could vary over the observations, in ways that are related Heteroskedasticity We now consider the implications of relaxing the assumption that the conditional variance Var(u i x i ) = σ 2 is common to all observations i = 1,..., n In many applications, we may

More information

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:. MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss

More information

Classical Least Squares Theory

Classical Least Squares Theory Classical Least Squares Theory CHUNG-MING KUAN Department of Finance & CRETA National Taiwan University October 18, 2014 C.-M. Kuan (Finance & CRETA, NTU) Classical Least Squares Theory October 18, 2014

More information

Financial Econometrics

Financial Econometrics Material : solution Class : Teacher(s) : zacharias psaradakis, marian vavra Example 1.1: Consider the linear regression model y Xβ + u, (1) where y is a (n 1) vector of observations on the dependent variable,

More information

1. You have data on years of work experience, EXPER, its square, EXPER2, years of education, EDUC, and the log of hourly wages, LWAGE

1. You have data on years of work experience, EXPER, its square, EXPER2, years of education, EDUC, and the log of hourly wages, LWAGE 1. You have data on years of work experience, EXPER, its square, EXPER, years of education, EDUC, and the log of hourly wages, LWAGE You estimate the following regressions: (1) LWAGE =.00 + 0.05*EDUC +

More information

Ordinary Least Squares Regression

Ordinary Least Squares Regression Ordinary Least Squares Regression Goals for this unit More on notation and terminology OLS scalar versus matrix derivation Some Preliminaries In this class we will be learning to analyze Cross Section

More information

ECON 4160, Autumn term Lecture 1

ECON 4160, Autumn term Lecture 1 ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least

More information

Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I.

Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I. Vector Autoregressive Model Vector Autoregressions II Empirical Macroeconomics - Lect 2 Dr. Ana Beatriz Galvao Queen Mary University of London January 2012 A VAR(p) model of the m 1 vector of time series

More information

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) 1 2 Model Consider a system of two regressions y 1 = β 1 y 2 + u 1 (1) y 2 = β 2 y 1 + u 2 (2) This is a simultaneous equation model

More information

Quantitative Analysis of Financial Markets. Summary of Part II. Key Concepts & Formulas. Christopher Ting. November 11, 2017

Quantitative Analysis of Financial Markets. Summary of Part II. Key Concepts & Formulas. Christopher Ting. November 11, 2017 Summary of Part II Key Concepts & Formulas Christopher Ting November 11, 2017 christopherting@smu.edu.sg http://www.mysmu.edu/faculty/christophert/ Christopher Ting 1 of 16 Why Regression Analysis? Understand

More information

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let

More information