The Quasi-Maximum Likelihood Method: Theory

Chapter 9
The Quasi-Maximum Likelihood Method: Theory

As discussed in preceding chapters, estimating linear and nonlinear regressions by the least squares method yields an approximation to the conditional mean function of the dependent variable. While this approach is important and common in practice, its scope is limited and does not fit the purposes of various empirical studies. First, it is not suitable for analyzing dependent variables with special features, such as binary and truncated (censored) variables, because it is unable to take such data characteristics into account. Second, it leaves little room for characterizing other conditional moments of the dependent variable, such as the conditional variance. As far as a complete description of the conditional behavior of a dependent variable is concerned, it is desirable to specify a density function that admits specifications of different conditional moments and other distribution characteristics. This motivates researchers to consider the method of quasi-maximum likelihood (QML). The QML method is essentially the same as the ML method usually seen in statistics and econometrics textbooks. It is conceivable that specifying a density function, while more general and more flexible than specifying a function for the conditional mean, is also more likely to result in specification errors. How to draw statistical inferences under potential model misspecification is thus a major concern of the QML method. By contrast, the traditional ML method assumes that the specified density function is the true density function, so that specification errors are assumed away. Therefore, the results of the ML method are just special cases of the QML method. Our discussion below is primarily based on White (1994); for related discussion we also refer to White (1982), Amemiya (1985), Godfrey (1988), and Gourieroux (1994).

9.1 Kullback-Leibler Information Criterion

We first discuss the concept of information. In a random experiment, suppose that the event $A$ occurs with probability $p$. The message that $A$ will surely occur is more valuable when $p$ is small but less important when $p$ is large. For example, when $p = 0.99$, the message that $A$ will occur is not very informative, but when $p = 0.01$, such a message is very helpful. Hence, the information content of the message that $A$ will occur ought to be a decreasing function of $p$. A common choice of the information function is $\iota(p) = \log(1/p)$, which decreases from positive infinity ($p \to 0$) to zero ($p = 1$). It should be clear that $\iota(1-p)$, the information that $A$ will not occur, is not the same as $\iota(p)$ unless $p = 0.5$. The expected information of these two messages is

$$ I = p\,\iota(p) + (1-p)\,\iota(1-p) = p \log\Big(\frac{1}{p}\Big) + (1-p)\log\Big(\frac{1}{1-p}\Big). $$

It is also common to interpret $\iota(p)$ as the surprise resulting from learning that the event $A$ will occur when $\mathbb{P}(A) = p$. The expected information $I$ is thus also a weighted average of surprises and is known as the entropy of the event $A$.

Similarly, the information that the probability of the event $A$ changes from $p$ to $q$ is useful when $p$ and $q$ are very different, but not of much value when $p$ and $q$ are close. The resulting information content is the difference between these two pieces of information: $\iota(p) - \iota(q) = \log(q/p)$, which is positive (negative) when $q > p$ ($q < p$). Given $n$ mutually exclusive events $A_1, \ldots, A_n$, each with an information value $\log(q_i/p_i)$, the expected information value is then

$$ I = \sum_{i=1}^{n} q_i \log\Big(\frac{q_i}{p_i}\Big). $$

This idea is readily generalized to the information content of density functions, as discussed below. Let $g$ be the density function of the random variable $z$ and $f$ be another density function. Define the Kullback-Leibler Information Criterion (KLIC) of $g$ relative to $f$ as

$$ \mathbb{I}(g:f) = \int_{\mathbb{R}} \log\Big(\frac{g(\zeta)}{f(\zeta)}\Big)\, g(\zeta)\, d\zeta. $$
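The KLIC is easy to evaluate numerically. The following sketch (Python; the two normal densities are illustrative choices, not from the text) computes $\mathbb{I}(g:f)$ by quadrature and confirms that it is positive when $f$ differs from $g$ and zero when $f = g$ almost everywhere.

```python
import numpy as np
from scipy import integrate, stats

# A minimal numerical illustration of the KLIC II(g:f).  The choice of g and f
# (two normal densities) is purely illustrative.
g = stats.norm(loc=0.0, scale=1.0).pdf   # "true" density g
f = stats.norm(loc=0.5, scale=1.5).pdf   # postulated density f

def klic(g, f, lo=-12.0, hi=12.0):
    # integrate log(g/f) * g over a range covering essentially all the mass of g
    integrand = lambda z: np.log(g(z) / f(z)) * g(z)
    value, _ = integrate.quad(integrand, lo, hi)
    return value

print(klic(g, f))   # > 0, since f differs from g
print(klic(g, g))   # ~ 0 (up to quadrature error), since f = g a.e.
```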

When $f$ is used to describe $z$, the value $\mathbb{I}(g:f)$ is the expected surprise resulting from learning that $g$ is in fact the true density of $z$. The following result shows that the KLIC of $g$ relative to $f$ is non-negative.

Theorem 9.1 Let $g$ be the density function of the random variable $z$ and $f$ be another density function. Then $\mathbb{I}(g:f) \ge 0$, with equality holding if, and only if, $g = f$ almost everywhere (i.e., $g = f$ except on a set with Lebesgue measure zero).

Proof: Using the fact that $\log(1+x) \le x$ for all $x > -1$, with equality only at $x = 0$, we have

$$ \log\Big(\frac{g}{f}\Big) = -\log\Big(1 + \frac{f - g}{g}\Big) \ge -\frac{f-g}{g} = 1 - \frac{f}{g}. $$

It follows that

$$ \int_{\mathbb{R}} \log\Big(\frac{g(\zeta)}{f(\zeta)}\Big)\, g(\zeta)\, d\zeta \ge \int_{\mathbb{R}} \Big(1 - \frac{f(\zeta)}{g(\zeta)}\Big)\, g(\zeta)\, d\zeta = 0. $$

Clearly, if $g = f$ almost everywhere, $\mathbb{I}(g:f) = 0$. Conversely, given $\mathbb{I}(g:f) = 0$, suppose without loss of generality that $g = f$ except that $g > f$ on a set $B$ with non-zero Lebesgue measure. Then

$$ \int_{\mathbb{R}} \log\Big(\frac{g(\zeta)}{f(\zeta)}\Big)\, g(\zeta)\, d\zeta = \int_{B} \log\Big(\frac{g(\zeta)}{f(\zeta)}\Big)\, g(\zeta)\, d\zeta > \int_{B} (\log 1)\, g(\zeta)\, d\zeta = 0, $$

contradicting $\mathbb{I}(g:f) = 0$. Thus, $g$ must be the same as $f$ almost everywhere. $\Box$

Note, however, that the KLIC is not a metric: it is not symmetric in general, i.e., $\mathbb{I}(g:f) \ne \mathbb{I}(f:g)$, and it does not obey the triangle inequality; see Exercise 9.1. Hence, the KLIC is only a crude measure of the closeness between $f$ and $g$.

Let $\{z_t\}$ be a sequence of $\mathbb{R}^{\nu}$-valued random variables defined on the probability space $(\Omega, \mathcal{F}, \mathbb{P}_o)$. Without causing any confusion, we use the same notation for random vectors and their realizations. Let $z^T = (z_1, z_2, \ldots, z_T)$ denote the information set generated by all $z_t$. The vector $z_t$ is divided into two parts, $y_t$ and $w_t$, where $y_t$ is a scalar and $w_t$ is $(\nu - 1) \times 1$. Corresponding to $\mathbb{P}_o$, there exists a joint density function $g^T$ for $z^T$:

$$ g^T(z^T) = g(z_1) \prod_{t=2}^{T} \frac{g^t(z^t)}{g^{t-1}(z^{t-1})} = g(z_1) \prod_{t=2}^{T} g_t(z_t \mid z^{t-1}), $$

where $g^t$ denotes the joint density of the $t$ random variables $z_1, \ldots, z_t$, and $g_t$ is the density of $z_t$ conditional on the past information $z_1, \ldots, z_{t-1}$. The joint density function

$g^T$ is the random mechanism governing the behavior of $z^T$ and will be referred to as the data generation process (DGP) of $z^T$. As $g^T$ is unknown, we may postulate a conditional density function $f_t(z_t \mid z^{t-1}; \theta)$, where $\theta \in \Theta \subseteq \mathbb{R}^k$, and approximate $g^T(z^T)$ by

$$ f^T(z^T; \theta) = f(z_1) \prod_{t=2}^{T} f_t(z_t \mid z^{t-1}; \theta). $$

The function $f^T$ is referred to as the quasi-likelihood function, in the sense that $f^T$ need not agree with $g^T$. For notational convenience, we treat the unconditional density $g(z_1)$ (or $f(z_1)$) as a conditional density and write $g^T$ (or $f^T$) as the product of all conditional density functions. Clearly, the postulated density $f^T$ would be useful if it is close to the DGP $g^T$. It is therefore natural to consider minimizing the KLIC of $g^T$ relative to $f^T$:

$$ \mathbb{I}(g^T : f^T; \theta) = \int_{\mathbb{R}^{\nu T}} \log\Big(\frac{g^T(\zeta^T)}{f^T(\zeta^T; \theta)}\Big)\, g^T(\zeta^T)\, d\zeta^T. \tag{9.1} $$

This amounts to minimizing the surprise level resulting from specifying an $f^T$ for the DGP $g^T$. As $g^T$ does not involve $\theta$, minimizing the KLIC (9.1) with respect to $\theta$ is equivalent to maximizing

$$ \int_{\mathbb{R}^{\nu T}} \log\big(f^T(\zeta^T; \theta)\big)\, g^T(\zeta^T)\, d\zeta^T = \mathbb{E}[\log f^T(z^T; \theta)], $$

where $\mathbb{E}$ is the expectation operator with respect to the DGP $g^T(z^T)$. This is, in turn, equivalent to maximizing the average of the $\log f_t$:

$$ \bar{L}_T(\theta) = \frac{1}{T}\, \mathbb{E}[\log f^T(z^T; \theta)] = \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}\big[\log f_t(z_t \mid z^{t-1}; \theta)\big]. \tag{9.2} $$

Let $\theta^*$ be the maximizer of $\bar{L}_T(\theta)$. Then $\theta^*$ is also the minimizer of the KLIC (9.1). If there exists a unique $\theta_o \in \Theta$ such that $f^T(z^T; \theta_o) = g^T(z^T)$ for all $T$, we say that $\{f_t\}$ is specified correctly in its entirety for $\{z_t\}$. In this case, the resulting KLIC $\mathbb{I}(g^T : f^T; \theta_o) = 0$ and attains the minimum. This shows that $\theta^* = \theta_o$ when $\{f_t\}$ is specified correctly in its entirety.

Maximizing $\bar{L}_T$ is, however, not a readily solvable problem because the objective function (9.2) involves the expectation operator and hence depends on the unknown DGP $g^T$. It is therefore natural to maximize the sample counterpart of $\bar{L}_T(\theta)$:

$$ L_T(z^T; \theta) := \frac{1}{T} \sum_{t=1}^{T} \log f_t(z_t \mid z^{t-1}; \theta), \tag{9.3} $$

which is known as the quasi-log-likelihood function. The maximizer of $L_T(z^T; \theta)$, denoted $\hat{\theta}_T$, is known as the quasi-maximum likelihood estimator (QMLE) of $\theta$. The prefix "quasi" indicates that this solution may be obtained from a misspecified log-likelihood function. When $\{f_t\}$ is specified correctly in its entirety for $\{z_t\}$, the QMLE is understood as the MLE, as in standard statistics textbooks.

Specifying a complete probability model for $z^T$ may be a formidable task in practice because it involves too many random variables ($T$ random vectors $z_t$, each with $\nu$ random variables). Instead, econometricians are typically interested in modeling a variable of interest, say $y_t$, conditional on a set of pre-determined variables, say $x_t$, where $x_t$ includes some elements of $w_t$ and $z^{t-1}$. This is a relatively simple job because only the conditional behavior of $y_t$ needs to be considered. As $w_t$ are not explicitly modeled, the conditional density $g_t(y_t \mid x_t)$ provides only a partial description of $\{z_t\}$. We then find a quasi-likelihood function $f_t(y_t \mid x_t; \theta)$ to approximate $g_t(y_t \mid x_t)$. Analogous to (9.1), the resulting average KLIC of $g_t$ relative to $f_t$ is

$$ \bar{\mathbb{I}}_T(\{g_t : f_t\}; \theta) := \frac{1}{T} \sum_{t=1}^{T} \mathbb{I}(g_t : f_t; \theta). \tag{9.4} $$

Let $y^T = (y_1, \ldots, y_T)$ and $x^T = (x_1, \ldots, x_T)$. Minimizing $\bar{\mathbb{I}}_T(\{g_t : f_t\}; \theta)$ in (9.4) is thus equivalent to maximizing

$$ \bar{L}_T(\theta) = \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}[\log f_t(y_t \mid x_t; \theta)]; \tag{9.5} $$

cf. (9.2). The parameter $\theta^*$ that maximizes (9.5) must also minimize the average KLIC (9.4). If there exists a $\theta_o \in \Theta$ such that $f_t(y_t \mid x_t; \theta_o) = g_t(y_t \mid x_t)$ for all $t$, we say that $\{f_t\}$ is correctly specified for $\{y_t \mid x_t\}$. In this case, $\bar{\mathbb{I}}_T(\{g_t : f_t\}; \theta_o)$ is zero, and $\theta^* = \theta_o$. As before, $\bar{L}_T(\theta)$ is not directly observable, so we instead maximize its sample counterpart:

$$ L_T(y^T, x^T; \theta) := \frac{1}{T} \sum_{t=1}^{T} \log f_t(y_t \mid x_t; \theta), \tag{9.6} $$

which will also be referred to as the quasi-log-likelihood function; cf. (9.3). The resulting solution is the QMLE $\hat{\theta}_T$. When $\{f_t\}$ is correctly specified for $\{y_t \mid x_t\}$, the QMLE $\hat{\theta}_T$ is also understood as the usual MLE.

As a special case, one may concentrate on a certain conditional attribute of $y_t$ and postulate a specification $\mu_t(x_t; \theta)$ for this attribute. A leading example is the following

specification of conditional normality, with $\mu_t(x_t; \beta)$ as the specification of its mean:

$$ y_t \mid x_t \sim N\big(\mu_t(x_t; \beta),\, \sigma^2\big); $$

note that the conditional variance is not explicitly modeled. Setting $\theta = (\beta' \ \sigma^2)'$, it is easy to see that the maximizer of the quasi-log-likelihood function $\frac{1}{T}\sum_{t=1}^{T} \log f_t(y_t \mid x_t; \theta)$ is also the solution to

$$ \min_{\beta}\ \frac{1}{T} \sum_{t=1}^{T} \big[y_t - \mu_t(x_t; \beta)\big]^2. $$

The resulting QMLE of $\beta$ is thus the NLS estimator. Therefore, the NLS estimator can be viewed as a QMLE under the assumption of conditional normality with conditional homoskedasticity. We say that $\{\mu_t\}$ is correctly specified for the conditional mean $\mathbb{E}(y_t \mid x_t)$ if there exists a $\theta_o$ such that $\mu_t(x_t; \theta_o) = \mathbb{E}(y_t \mid x_t)$. A more flexible specification, such as $y_t \mid x_t \sim N\big(\mu_t(x_t; \beta),\, h(x_t; \alpha)\big)$, would allow us to characterize the conditional variance as well.

9.2 Asymptotic Properties of the QMLE

The quasi-log-likelihood function is, in general, a nonlinear function of $\theta$, so the QMLE must be computed numerically using a nonlinear optimization algorithm. Given that maximizing $L_T$ is equivalent to minimizing $-L_T$, the nonlinear optimization algorithms discussed in Chapter 8 are readily applied. We shall not repeat these methods here but proceed to the asymptotic properties of the QMLE.
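As a concrete illustration of this computation, the following Python sketch maximizes a Gaussian quasi-log-likelihood of the kind introduced above by generic numerical optimization on simulated data. All data and variable names here are illustrative assumptions, not part of the text.

```python
import numpy as np
from scipy.optimize import minimize

# A minimal sketch of computing a QMLE numerically for the specification
# y_t | x_t ~ N(x_t'beta, sigma^2).  Simulated data; sigma^2 is parameterized
# as exp(log_s2) to keep it positive during the search.
rng = np.random.default_rng(0)
T, k = 200, 3
X = rng.normal(size=(T, k))
y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=T)

def neg_loglik(theta):
    beta, log_s2 = theta[:k], theta[k]
    s2 = np.exp(log_s2)
    resid = y - X @ beta
    # minus the average quasi-log-likelihood L_T(theta)
    return 0.5 * np.log(2 * np.pi) + 0.5 * log_s2 + np.mean(resid ** 2) / (2 * s2)

res = minimize(neg_loglik, x0=np.zeros(k + 1), method="BFGS")
beta_qmle, s2_qmle = res.x[:k], np.exp(res.x[k])
```

Reparameterizing $\sigma^2$ as an exponential is one simple way to keep the variance positive without resorting to constrained optimization.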

For our subsequent analysis, we always assume that the specified quasi-log-likelihood function is twice continuously differentiable on the compact parameter space $\Theta$ with probability one and that integration and differentiation can be interchanged. Moreover, we maintain the following identification condition.

[ID-2] There exists a unique $\theta^*$ that minimizes the KLIC (9.1) or (9.4).

9.2.1 Consistency

We sketch the idea underlying the consistency of the QMLE. In light of the uniform law of large numbers discussed in Section 8.3, we know that if $L_T(z^T; \theta)$ (or $L_T(y^T, x^T; \theta)$) tends to $\bar{L}_T(\theta)$ uniformly in $\theta \in \Theta$, i.e., $L_T(z^T; \theta)$ obeys a WULLN, then $\hat{\theta}_T - \theta^* \to 0$ in probability. This shows that the QMLE is a weakly consistent estimator of the minimizer of the KLIC $\mathbb{I}(g^T : f^T; \theta)$ (or of the average KLIC $\bar{\mathbb{I}}_T(\{g_t : f_t\}; \theta)$), whether or not the specification is correct. When $\{f_t\}$ is specified correctly in its entirety (or for $\{y_t \mid x_t\}$), the KLIC minimizer $\theta^*$ is also the true parameter $\theta_o$; in this case, the QMLE is weakly consistent for $\theta_o$. Therefore, the conditions required to ensure QMLE consistency are those for a WULLN; we omit the details.

9.2.2 Asymptotic Normality

Consider first the quasi-log-likelihood function specified for $z^T$: $L_T(z^T; \theta)$. When $\theta^*$ is in the interior of $\Theta$, the mean-value expansion of $\nabla L_T(z^T; \hat{\theta}_T)$ about $\theta^*$ is

$$ \nabla L_T(z^T; \hat{\theta}_T) = \nabla L_T(z^T; \theta^*) + \nabla^2 L_T(z^T; \theta_T^{\dagger})\,(\hat{\theta}_T - \theta^*), \tag{9.7} $$

where $\theta_T^{\dagger}$ lies between $\hat{\theta}_T$ and $\theta^*$. Note that requiring $\theta^*$ to be in the interior of $\Theta$ ensures that the mean-value expansion and the subsequent asymptotics are valid in the parameter space. The left-hand side of (9.7) is zero because the QMLE $\hat{\theta}_T$ solves the first-order condition $\nabla L_T(z^T; \theta) = 0$. Then, as long as $\nabla^2 L_T(z^T; \theta_T^{\dagger})$ is invertible with probability one, (9.7) can be written as

$$ \sqrt{T}\,(\hat{\theta}_T - \theta^*) = -\big[\nabla^2 L_T(z^T; \theta_T^{\dagger})\big]^{-1} \sqrt{T}\, \nabla L_T(z^T; \theta^*). $$

Note that the invertibility of $\nabla^2 L_T(z^T; \theta_T^{\dagger})$ amounts to requiring the quasi-log-likelihood function $L_T$ to be locally quadratic at $\theta_T^{\dagger}$. Let $\mathbb{E}[\nabla^2 L_T(z^T; \theta)] = H_T(\theta)$ be the expected Hessian matrix. When $\nabla^2 L_T(z^T; \theta)$ obeys a WULLN, we have

$$ \nabla^2 L_T(z^T; \theta) - H_T(\theta) \xrightarrow{\ \mathbb{P}\ } 0, $$

uniformly in $\Theta$. As $\hat{\theta}_T$ is weakly consistent for $\theta^*$, so is $\theta_T^{\dagger}$. The assumed WULLN then implies that $\nabla^2 L_T(z^T; \theta_T^{\dagger}) - H_T(\theta^*) \xrightarrow{\mathbb{P}} 0$. It follows that

$$ \sqrt{T}\,(\hat{\theta}_T - \theta^*) = -H_T(\theta^*)^{-1} \sqrt{T}\, \nabla L_T(z^T; \theta^*) + o_{\mathbb{P}}(1). \tag{9.8} $$

This shows that the asymptotic distribution of $\sqrt{T}\,(\hat{\theta}_T - \theta^*)$ is essentially determined by the asymptotic distribution of the normalized score, $\sqrt{T}\, \nabla L_T(z^T; \theta^*)$.

Let $B_T$ denote the variance-covariance matrix of the normalized score $\sqrt{T}\, \nabla L_T(z^T; \theta)$:

$$ B_T(\theta) = \mathrm{var}\big(\sqrt{T}\, \nabla L_T(z^T; \theta)\big), $$

which will also be referred to as the information matrix. Then, provided that $\{\nabla \log f_t(z_t \mid z^{t-1}; \theta)\}$ obeys a CLT, we have

$$ B_T(\theta^*)^{-1/2}\, \sqrt{T}\, \big( \nabla L_T(z^T; \theta^*) - \mathbb{E}[\nabla L_T(z^T; \theta^*)] \big) \xrightarrow{\ D\ } N(0, I_k). \tag{9.9} $$

When differentiation and integration can be interchanged,

$$ \mathbb{E}[\nabla L_T(z^T; \theta)] = \nabla\, \mathbb{E}[L_T(z^T; \theta)] = \nabla \bar{L}_T(\theta), $$

where the right-hand side is the first-order derivative of (9.2). As $\theta^*$ is the KLIC minimizer, $\nabla \bar{L}_T(\theta^*) = 0$, so that $\mathbb{E}[\nabla L_T(z^T; \theta^*)] = 0$. By (9.8) and (9.9),

$$ \sqrt{T}\,(\hat{\theta}_T - \theta^*) = -H_T(\theta^*)^{-1} B_T(\theta^*)^{1/2} \big[ B_T(\theta^*)^{-1/2} \sqrt{T}\, \nabla L_T(z^T; \theta^*) \big] + o_{\mathbb{P}}(1), $$

which has an asymptotic normal distribution. This immediately leads to the following result.

Theorem 9.2 When (9.7), (9.8) and (9.9) hold,

$$ C_T(\theta^*)^{-1/2}\, \sqrt{T}\, (\hat{\theta}_T - \theta^*) \xrightarrow{\ D\ } N(0, I_k), $$

where $C_T(\theta^*) = H_T(\theta^*)^{-1} B_T(\theta^*) H_T(\theta^*)^{-1}$, with $H_T(\theta^*) = \mathbb{E}[\nabla^2 L_T(z^T; \theta^*)]$ and $B_T(\theta^*) = \mathrm{var}\big(\sqrt{T}\, \nabla L_T(z^T; \theta^*)\big)$.

Remark: For the specification of $\{y_t \mid x_t\}$, the QMLE is obtained from the quasi-log-likelihood function $L_T(y^T, x^T; \theta)$, and its asymptotic normality holds similarly as in Theorem 9.2. That is,

$$ C_T(\theta^*)^{-1/2}\, \sqrt{T}\, (\hat{\theta}_T - \theta^*) \xrightarrow{\ D\ } N(0, I_k), $$

where $C_T(\theta^*) = H_T(\theta^*)^{-1} B_T(\theta^*) H_T(\theta^*)^{-1}$ with $H_T(\theta^*) = \mathbb{E}[\nabla^2 L_T(y^T, x^T; \theta^*)]$ and $B_T(\theta^*) = \mathrm{var}\big(\sqrt{T}\, \nabla L_T(y^T, x^T; \theta^*)\big)$.
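To make Theorem 9.2 operational, one replaces $H_T$ and $B_T$ with sample analogues. The sketch below continues the simulated Gaussian example from Section 9.2 (reusing `X`, `y`, `beta_qmle`, and `s2_qmle` from that sketch, all illustrative names) and estimates the sandwich covariance $C_T = H_T^{-1} B_T H_T^{-1}$ from the analytic per-observation scores and Hessian; the outer-product estimator of $B_T$ is appropriate here because the simulated scores are serially uncorrelated.

```python
import numpy as np

# A sketch of estimating C_T = H_T^{-1} B_T H_T^{-1} (Theorem 9.2) for the
# Gaussian quasi-likelihood; assumes X, y, beta_qmle, s2_qmle from the
# previous sketch.  With no serial correlation in the scores, B_T is
# estimated by the average outer product of the per-observation scores.
T, k = X.shape
resid = y - X @ beta_qmle

# per-observation scores with respect to (beta, sigma^2): one row per t
scores = np.column_stack([
    X * (resid / s2_qmle)[:, None],
    -1.0 / (2 * s2_qmle) + resid ** 2 / (2 * s2_qmle ** 2),
])

# sample analogue of the expected Hessian H_T
H = np.zeros((k + 1, k + 1))
H[:k, :k] = -(X.T @ X) / (T * s2_qmle)
H[:k, k] = H[k, :k] = -(X.T @ resid) / (T * s2_qmle ** 2)
H[k, k] = 1.0 / (2 * s2_qmle ** 2) - np.mean(resid ** 2) / s2_qmle ** 3

B = scores.T @ scores / T       # information matrix estimate
Hinv = np.linalg.inv(H)
C = Hinv @ B @ Hinv             # estimates avar of sqrt(T)(theta_hat - theta*)
se = np.sqrt(np.diag(C) / T)    # standard errors for theta_hat itself
```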

9.3 Information Matrix Equality

A useful result in quasi-maximum likelihood theory is the information matrix equality. This equality shows that, under certain conditions, the information matrix $B_T(\theta)$ is the same as the negative of the expected Hessian matrix $H_T(\theta)$, so that the covariance matrix $C_T(\theta)$ can be simplified. In such a case, the estimation of $C_T(\theta)$ is much simpler.

Define the score functions

$$ s_t(z^t; \theta) = \nabla \log f_t(z_t \mid z^{t-1}; \theta). $$

The average of these scores is

$$ \bar{s}_T(z^T; \theta) = \nabla L_T(z^T; \theta) = \frac{1}{T} \sum_{t=1}^{T} s_t(z^t; \theta). $$

Clearly, $s_t(z^t; \theta)\, f_t(z_t \mid z^{t-1}; \theta) = \nabla f_t(z_t \mid z^{t-1}; \theta)$. It follows that

$$ \bar{s}_T(z^T; \theta)\, f^T(z^T; \theta) = \frac{1}{T} \sum_{t=1}^{T} s_t(z^t; \theta) \prod_{\tau=1}^{T} f_\tau(z_\tau \mid z^{\tau-1}; \theta) = \frac{1}{T} \sum_{t=1}^{T} \nabla f_t(z_t \mid z^{t-1}; \theta) \prod_{\tau \ne t} f_\tau(z_\tau \mid z^{\tau-1}; \theta) = \frac{1}{T}\, \nabla f^T(z^T; \theta). $$

As $\int f^T(z^T; \theta)\, dz^T = 1$, its derivative must be zero. By permitting interchange of differentiation and integration, we have

$$ 0 = \int \nabla f^T(z^T; \theta)\, dz^T = T \int \bar{s}_T(z^T; \theta)\, f^T(z^T; \theta)\, dz^T, $$

and

$$ 0 = \int \nabla^2 f^T(z^T; \theta)\, dz^T = T \int \big[ \nabla \bar{s}_T(z^T; \theta) + T\, \bar{s}_T(z^T; \theta)\, \bar{s}_T(z^T; \theta)' \big] f^T(z^T; \theta)\, dz^T. $$

These equalities are simply expectations with respect to $f^T(z^T; \theta)$; they need not hold when the integrator is not $f^T(z^T; \theta)$. If $\{f_t\}$ is correctly specified in its entirety for $\{z_t\}$, the results above imply

$$ \mathbb{E}[\bar{s}_T(z^T; \theta_o)] = 0, \qquad \mathbb{E}[\nabla \bar{s}_T(z^T; \theta_o)] + T\, \mathbb{E}[\bar{s}_T(z^T; \theta_o)\, \bar{s}_T(z^T; \theta_o)'] = 0, $$

where the expectations are taken with respect to the true density $g^T(z^T) = f^T(z^T; \theta_o)$. This is the information matrix equality stated below.

Theorem 9.3 Suppose that there exists a $\theta_o$ such that $f_t(z_t \mid z^{t-1}; \theta_o) = g_t(z_t \mid z^{t-1})$. Then

$$ H_T(\theta_o) + B_T(\theta_o) = 0, $$

where $H_T(\theta_o) = \mathbb{E}[\nabla^2 L_T(z^T; \theta_o)]$ and $B_T(\theta_o) = \mathrm{var}\big(\sqrt{T}\, \nabla L_T(z^T; \theta_o)\big)$.

When this equality holds, the covariance matrix $C_T$ in Theorem 9.2 simplifies to $C_T(\theta_o) = B_T(\theta_o)^{-1} = -H_T(\theta_o)^{-1}$. That is, the QMLE achieves the Cramér-Rao lower bound asymptotically.

On the other hand, when $f^T$ is not a correct specification for $z^T$, the score function is not related to the true density $g^T$. Hence, there is no guarantee that the mean score is zero, i.e.,

$$ \mathbb{E}[\bar{s}_T(z^T; \theta)] = \int \bar{s}_T(z^T; \theta)\, g^T(z^T)\, dz^T \ne 0, $$

even when this expectation is evaluated at $\theta^*$. For the same reason,

$$ \mathbb{E}[\nabla \bar{s}_T(z^T; \theta)] + T\, \mathbb{E}[\bar{s}_T(z^T; \theta)\, \bar{s}_T(z^T; \theta)'] \ne 0, $$

even when it is evaluated at $\theta^*$.

For the specification of $\{y_t \mid x_t\}$, we have

$$ \nabla f_t(y_t \mid x_t; \theta) = s_t(y_t, x_t; \theta)\, f_t(y_t \mid x_t; \theta), $$

and $\int f_t(y_t \mid x_t; \theta)\, dy_t = 1$. Similarly as above,

$$ \int s_t(y_t, x_t; \theta)\, f_t(y_t \mid x_t; \theta)\, dy_t = 0, $$

and

$$ \int \big[ \nabla s_t(y_t, x_t; \theta) + s_t(y_t, x_t; \theta)\, s_t(y_t, x_t; \theta)' \big] f_t(y_t \mid x_t; \theta)\, dy_t = 0. $$

If $\{f_t\}$ is correctly specified for $\{y_t \mid x_t\}$, we have $\mathbb{E}[s_t(y_t, x_t; \theta_o) \mid x_t] = 0$. Then, by the law of iterated expectations, $\mathbb{E}[s_t(y_t, x_t; \theta_o)] = 0$, so that the mean score is still zero under correct specification. Moreover,

$$ \mathbb{E}[\nabla s_t(y_t, x_t; \theta_o) \mid x_t] + \mathbb{E}[s_t(y_t, x_t; \theta_o)\, s_t(y_t, x_t; \theta_o)' \mid x_t] = 0, $$

which implies

$$ \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}[\nabla s_t(y_t, x_t; \theta_o)] + \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}[s_t(y_t, x_t; \theta_o)\, s_t(y_t, x_t; \theta_o)'] = H_T(\theta_o) + \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}[s_t(y_t, x_t; \theta_o)\, s_t(y_t, x_t; \theta_o)'] = 0. $$

The equality above is not necessarily equivalent to the information matrix equality, however. To see this, consider specifications $\{f_t(y_t \mid x_t; \theta)\}$ that are correct for $\{y_t \mid x_t\}$. These specifications are said to have dynamic misspecification if they are not correctly specified for $\{y_t \mid w_t, z^{t-1}\}$; that is, there does not exist any $\theta_o$ such that $f_t(y_t \mid x_t; \theta_o) = g_t(y_t \mid w_t, z^{t-1})$. Thus, the information contained in $w_t$ and $z^{t-1}$ cannot be fully represented by $x_t$. On the other hand, when dynamic misspecification is absent, it is easily seen that

$$ \mathbb{E}[s_t(y_t, x_t; \theta_o) \mid x_t] = \mathbb{E}[s_t(y_t, x_t; \theta_o) \mid w_t, z^{t-1}]. \tag{9.10} $$

It is then easily verified that

$$ B_T(\theta_o) = \mathbb{E}\Big[ \Big( \frac{1}{\sqrt{T}} \sum_{t=1}^{T} s_t(y_t, x_t; \theta_o) \Big) \Big( \frac{1}{\sqrt{T}} \sum_{t=1}^{T} s_t(y_t, x_t; \theta_o) \Big)' \Big] $$
$$ = \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}[s_t(y_t, x_t; \theta_o)\, s_t(y_t, x_t; \theta_o)'] + \frac{1}{T} \sum_{\tau=1}^{T-1} \sum_{t=\tau+1}^{T} \Big( \mathbb{E}[s_{t-\tau}(y_{t-\tau}, x_{t-\tau}; \theta_o)\, s_t(y_t, x_t; \theta_o)'] + \mathbb{E}[s_t(y_t, x_t; \theta_o)\, s_{t-\tau}(y_{t-\tau}, x_{t-\tau}; \theta_o)'] \Big) $$
$$ = \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}[s_t(y_t, x_t; \theta_o)\, s_t(y_t, x_t; \theta_o)'], $$

where the last equality holds by (9.10) and the law of iterated expectations. When there is dynamic misspecification, the last equality fails because the covariances of the scores do not vanish. This shows that the average of the individual information matrices, $\frac{1}{T}\sum_{t=1}^{T} \mathrm{var}\big(s_t(y_t, x_t; \theta_o)\big)$, need not be the information matrix.

Theorem 9.4 Suppose that there exists a $\theta_o$ such that $f_t(y_t \mid x_t; \theta_o) = g_t(y_t \mid x_t)$ and there is no dynamic misspecification. Then

$$ H_T(\theta_o) + B_T(\theta_o) = 0, $$

where $H_T(\theta_o) = \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}[\nabla s_t(y_t, x_t; \theta_o)]$ and $B_T(\theta_o) = \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}[s_t(y_t, x_t; \theta_o)\, s_t(y_t, x_t; \theta_o)']$.

When Theorem 9.4 holds, the covariance matrix needed to normalize $\sqrt{T}\,(\hat{\theta}_T - \theta_o)$ again simplifies to $B_T(\theta_o)^{-1} = -H_T(\theta_o)^{-1}$, and the QMLE achieves the Cramér-Rao lower bound asymptotically.

Example 9.5 Consider the following specification: $y_t \mid x_t \sim N(x_t'\beta, \sigma^2)$ for all $t$. Let $\theta = (\beta' \ \sigma^2)'$; then

$$ \log f(y_t \mid x_t; \theta) = -\frac{1}{2} \log(2\pi) - \frac{1}{2} \log(\sigma^2) - \frac{(y_t - x_t'\beta)^2}{2\sigma^2}, $$

and

$$ L_T(y^T, x^T; \theta) = -\frac{1}{2} \log(2\pi) - \frac{1}{2} \log(\sigma^2) - \frac{1}{T} \sum_{t=1}^{T} \frac{(y_t - x_t'\beta)^2}{2\sigma^2}. $$

Straightforward calculation yields

$$ \nabla L_T(y^T, x^T; \theta) = \begin{pmatrix} \dfrac{1}{T} \displaystyle\sum_{t=1}^{T} \dfrac{x_t (y_t - x_t'\beta)}{\sigma^2} \\[2ex] -\dfrac{1}{2\sigma^2} + \dfrac{1}{T} \displaystyle\sum_{t=1}^{T} \dfrac{(y_t - x_t'\beta)^2}{2(\sigma^2)^2} \end{pmatrix}, $$

$$ \nabla^2 L_T(y^T, x^T; \theta) = \begin{pmatrix} -\dfrac{1}{T} \displaystyle\sum_t \dfrac{x_t x_t'}{\sigma^2} & -\dfrac{1}{T} \displaystyle\sum_t \dfrac{x_t (y_t - x_t'\beta)}{(\sigma^2)^2} \\[2ex] -\dfrac{1}{T} \displaystyle\sum_t \dfrac{(y_t - x_t'\beta)\, x_t'}{(\sigma^2)^2} & \dfrac{1}{2(\sigma^2)^2} - \dfrac{1}{T} \displaystyle\sum_t \dfrac{(y_t - x_t'\beta)^2}{(\sigma^2)^3} \end{pmatrix}. $$

Setting $\nabla L_T(y^T, x^T; \theta) = 0$, we can solve for the QMLE. It is easily verified that the QMLE of $\beta$ is the OLS estimator $\hat{\beta}_T$ and that the QMLE of $\sigma^2$ is the average of the squared OLS residuals:

$$ \hat{\sigma}_T^2 = \frac{1}{T} \sum_{t=1}^{T} (y_t - x_t'\hat{\beta}_T)^2. $$
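Continuing the simulated example from Section 9.2, these first-order conditions can be checked directly in code: the numerical QMLE of $\beta$ matches the OLS estimator, and the QMLE of $\sigma^2$ matches the average squared residual (up to optimizer tolerance).

```python
import numpy as np

# Check of Example 9.5 on the simulated data: the QMLE of beta is OLS, and the
# QMLE of sigma^2 is the average squared OLS residual.  Assumes X, y,
# beta_qmle, s2_qmle from the earlier optimization sketch.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
s2_avg = np.mean((y - X @ beta_ols) ** 2)
print(np.allclose(beta_qmle, beta_ols, atol=1e-4))  # True
print(np.isclose(s2_qmle, s2_avg, atol=1e-4))       # True
```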

If the specification above is correct for $\{y_t \mid x_t\}$, there exists $\theta_o = (\beta_o' \ \sigma_o^2)'$ such that the conditional distribution of $y_t$ given $x_t$ is $N(x_t'\beta_o, \sigma_o^2)$. Taking expectations with respect to the true distribution, we have

$$ \mathbb{E}[x_t (y_t - x_t'\beta)] = \mathbb{E}(x_t x_t')(\beta_o - \beta), $$

which is zero when evaluated at $\beta = \beta_o$. Similarly,

$$ \mathbb{E}[(y_t - x_t'\beta)^2] = \mathbb{E}[(y_t - x_t'\beta_o + x_t'\beta_o - x_t'\beta)^2] = \mathbb{E}[(y_t - x_t'\beta_o)^2] + \mathbb{E}[(x_t'\beta_o - x_t'\beta)^2] = \sigma_o^2 + \mathbb{E}[(x_t'\beta_o - x_t'\beta)^2], $$

where the second term on the right-hand side is zero when evaluated at $\beta = \beta_o$. These results together show that

$$ H_T(\theta) = \mathbb{E}[\nabla^2 L_T(\theta)] = \begin{pmatrix} -\dfrac{\mathbb{E}(x_t x_t')}{\sigma^2} & -\dfrac{\mathbb{E}(x_t x_t')(\beta_o - \beta)}{(\sigma^2)^2} \\[2ex] -\dfrac{(\beta_o - \beta)'\,\mathbb{E}(x_t x_t')}{(\sigma^2)^2} & \dfrac{1}{2(\sigma^2)^2} - \dfrac{\sigma_o^2 + \mathbb{E}[(x_t'\beta_o - x_t'\beta)^2]}{(\sigma^2)^3} \end{pmatrix}. $$

When this matrix is evaluated at $\theta_o = (\beta_o' \ \sigma_o^2)'$,

$$ H_T(\theta_o) = \begin{pmatrix} -\dfrac{\mathbb{E}(x_t x_t')}{\sigma_o^2} & 0 \\ 0' & -\dfrac{1}{2(\sigma_o^2)^2} \end{pmatrix}. $$

If there is no dynamic misspecification,

$$ B_T(\theta) = \frac{1}{T}\sum_{t=1}^{T} \mathbb{E} \begin{pmatrix} \dfrac{(y_t - x_t'\beta)^2\, x_t x_t'}{(\sigma^2)^2} & -\dfrac{x_t (y_t - x_t'\beta)}{2(\sigma^2)^2} + \dfrac{x_t (y_t - x_t'\beta)^3}{2(\sigma^2)^3} \\[2ex] -\dfrac{(y_t - x_t'\beta)\, x_t'}{2(\sigma^2)^2} + \dfrac{(y_t - x_t'\beta)^3\, x_t'}{2(\sigma^2)^3} & \dfrac{1}{4(\sigma^2)^2} - \dfrac{(y_t - x_t'\beta)^2}{2(\sigma^2)^3} + \dfrac{(y_t - x_t'\beta)^4}{4(\sigma^2)^4} \end{pmatrix}. $$

Given that $y_t$ is conditionally normally distributed, its conditional third and fourth central moments are zero and $3(\sigma_o^2)^2$, respectively. It can then be verified that

$$ \mathbb{E}[(y_t - x_t'\beta)^3] = 3\sigma_o^2\, \mathbb{E}[(x_t'\beta_o - x_t'\beta)] + \mathbb{E}[(x_t'\beta_o - x_t'\beta)^3], \tag{9.11} $$

which is zero when evaluated at $\beta = \beta_o$, and that

$$ \mathbb{E}[(y_t - x_t'\beta)^4] = 3(\sigma_o^2)^2 + 6\sigma_o^2\, \mathbb{E}[(x_t'\beta_o - x_t'\beta)^2] + \mathbb{E}[(x_t'\beta_o - x_t'\beta)^4], \tag{9.12} $$

which is $3(\sigma_o^2)^2$ when evaluated at $\beta = \beta_o$; see Exercise 9.2. Consequently,

$$ B_T(\theta_o) = \begin{pmatrix} \dfrac{\mathbb{E}(x_t x_t')}{\sigma_o^2} & 0 \\ 0' & \dfrac{1}{2(\sigma_o^2)^2} \end{pmatrix}. $$

This shows that the information matrix equality holds. A typical consistent estimator of $H_T(\theta_o)$ is

$$ \hat{H}_T = \begin{pmatrix} -\dfrac{1}{T}\displaystyle\sum_t \dfrac{x_t x_t'}{\hat{\sigma}_T^2} & 0 \\ 0' & -\dfrac{1}{2(\hat{\sigma}_T^2)^2} \end{pmatrix}. $$

Due to the information matrix equality, a consistent estimator of $B_T(\theta_o)$ is $\hat{B}_T = -\hat{H}_T$. It can be seen that the upper-left block of $-\hat{H}_T^{-1}$, namely $\hat{\sigma}_T^2\,(\sum_t x_t x_t'/T)^{-1}$, is the standard estimator for the covariance matrix of the OLS estimator $\hat{\beta}_T$. On the other hand, when there is dynamic misspecification, $B_T(\theta)$ is not as given above, so that the information matrix equality fails; see Exercise 9.3.

The information matrix equality may also fail in other circumstances. The example below shows that, even when the specification for $\mathbb{E}(y_t \mid x_t)$ is correct and there is no dynamic misspecification, the information matrix equality may still fail if other conditional moments are misspecified, as with neglected conditional heteroskedasticity.

Example 9.6 Consider the specification of Example 9.5: $y_t \mid x_t \sim N(x_t'\beta, \sigma^2)$ for all $t$. Suppose that the DGP is $y_t \mid x_t \sim N\big(x_t'\beta_o,\, h(x_t'\gamma_o)\big)$, where the conditional variance $h(x_t'\gamma_o)$ varies with $x_t$. This specification thus contains a correct specification of the conditional mean but is itself incorrect for $\{y_t \mid x_t\}$ because it ignores conditional heteroskedasticity. From Example 9.5 we see that the upper-left block of $H_T(\theta)$ is $-\mathbb{E}(x_t x_t')/\sigma^2$, and that the corresponding block of $B_T(\theta)$ is $\mathbb{E}[(y_t - x_t'\beta)^2\, x_t x_t']/(\sigma^2)^2$. As the specification is correct for the conditional mean but not for the conditional variance, the KLIC minimizer is $\theta^* = (\beta_o',\, (\sigma^*)^2)'$. Evaluating the corresponding submatrix of $H_T(\theta)$ at $\theta^*$ yields $-\mathbb{E}(x_t x_t')/(\sigma^*)^2$, which differs from the submatrix of $B_T(\theta)$ evaluated at $\theta^*$:

$$ \frac{\mathbb{E}[h(x_t'\gamma_o)\, x_t x_t']}{[(\sigma^*)^2]^2}. $$

Thus, the information matrix equality breaks down, even though the conditional mean specification is correct. A consistent estimator of $H_T(\theta^*)$ is

$$ \hat{H}_T = \begin{pmatrix} -\dfrac{1}{T}\displaystyle\sum_t \dfrac{x_t x_t'}{\hat{\sigma}_T^2} & 0 \\ 0' & -\dfrac{1}{2(\hat{\sigma}_T^2)^2} \end{pmatrix}, $$

yet a consistent estimator of $B_T(\theta^*)$ is

$$ \hat{B}_T = \begin{pmatrix} \dfrac{1}{T}\displaystyle\sum_t \dfrac{\hat{e}_t^2\, x_t x_t'}{(\hat{\sigma}_T^2)^2} & 0 \\ 0' & -\dfrac{1}{4(\hat{\sigma}_T^2)^2} + \dfrac{1}{T}\displaystyle\sum_t \dfrac{\hat{e}_t^4}{4(\hat{\sigma}_T^2)^4} \end{pmatrix}; $$

see Exercise 9.4. Due to the block diagonality of $\hat{H}_T$ and $\hat{B}_T$, it is easy to verify that the upper-left block of $\hat{H}_T^{-1} \hat{B}_T \hat{H}_T^{-1}$ is

$$ \Big( \frac{1}{T}\sum_t x_t x_t' \Big)^{-1} \Big( \frac{1}{T}\sum_t \hat{e}_t^2\, x_t x_t' \Big) \Big( \frac{1}{T}\sum_t x_t x_t' \Big)^{-1}. $$

This is precisely the Eicker-White estimator (6.10) for the covariance matrix of the OLS estimator $\hat{\beta}_T$, as discussed in Chapter 6.
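In code, the block computation above is exactly the familiar heteroskedasticity-robust "sandwich" calculation. A minimal sketch, reusing `X` and `y` from the earlier simulated example:

```python
import numpy as np

# Eicker-White covariance for the OLS/QML estimator of beta, matching the
# upper-left block derived above.  Assumes X, y from the simulated example.
T = len(y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_ols

bread = np.linalg.inv(X.T @ X / T)
meat = (X * (e ** 2)[:, None]).T @ X / T
cov_sqrtT = bread @ meat @ bread   # covariance of sqrt(T)(beta_hat - beta_o)
cov_beta = cov_sqrtT / T           # finite-sample covariance of beta_hat
```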

9.4 Hypothesis Testing

In this section we discuss three classical large-sample tests (the Wald, LM, and likelihood ratio tests), the Hausman (1978) test, and the information matrix test of White (1982, 1987) for the null hypothesis $R\theta^* = r$, where $R$ is a $q \times k$ matrix with full row rank. Failure to reject the null hypothesis suggests that there is no empirical evidence against the specified model under the null hypothesis.

9.4.1 Wald Test

Similar to the Wald test for linear regression discussed in Section 6.4.1, the Wald test under the QML framework is based on the difference $R\hat{\theta}_T - r$. From (9.8),

$$ \sqrt{T}\, R(\hat{\theta}_T - \theta^*) = -R H_T(\theta^*)^{-1} B_T(\theta^*)^{1/2} \big[ B_T(\theta^*)^{-1/2} \sqrt{T}\, \nabla L_T(z^T; \theta^*) \big] + o_{\mathbb{P}}(1). $$

This shows that $R C_T(\theta^*) R' = R H_T(\theta^*)^{-1} B_T(\theta^*) H_T(\theta^*)^{-1} R'$ can be used to normalize $\sqrt{T}\, R(\hat{\theta}_T - \theta^*)$ to obtain asymptotic normality. Under the null hypothesis, $R\theta^* = r$, so that

$$ [R C_T(\theta^*) R']^{-1/2}\, \sqrt{T}\, (R\hat{\theta}_T - r) \xrightarrow{\ D\ } N(0, I_q). \tag{9.13} $$

This is the key distributional result for the Wald test. For notational simplicity, let $\hat{H}_T = H_T(\hat{\theta}_T)$ denote a consistent estimator of $H_T(\theta^*)$ and $\hat{B}_T = B_T(\hat{\theta}_T)$ a consistent estimator of $B_T(\theta^*)$. It follows that a consistent estimator of $C_T(\theta^*)$ is

$$ \hat{C}_T = \hat{H}_T^{-1} \hat{B}_T \hat{H}_T^{-1}. $$

Substituting $\hat{C}_T$ for $C_T(\theta^*)$ in (9.13), we have

$$ [R \hat{C}_T R']^{-1/2}\, \sqrt{T}\, (R\hat{\theta}_T - r) \xrightarrow{\ D\ } N(0, I_q). \tag{9.14} $$

The Wald test statistic is the inner product of the left-hand side of (9.14):

$$ \mathcal{W}_T = T\, (R\hat{\theta}_T - r)' (R \hat{C}_T R')^{-1} (R\hat{\theta}_T - r). \tag{9.15} $$

The limiting distribution of the Wald test now follows from (9.14) and the continuous mapping theorem.

Theorem 9.7 Suppose that Theorem 9.2 holds for the QMLE $\hat{\theta}_T$. Then, under the null hypothesis,

$$ \mathcal{W}_T \xrightarrow{\ D\ } \chi^2(q), $$

where $\mathcal{W}_T$ is defined in (9.15) and $q$ is the number of rows of $R$.

Example 9.8 Consider the quasi-log-likelihood function specified in Example 9.5. We write $\theta = (\sigma^2 \ \beta')'$ and $\beta = (b_1' \ b_2')'$, where $b_1$ is $(k - s) \times 1$ and $b_2$ is $s \times 1$. We are interested in the null hypothesis that $b_2 = R\theta = 0$, where $R = [0 \ R_1]$ is $s \times (k+1)$ and $R_1 = [0 \ I_s]$ is $s \times k$. The Wald test can be computed according to (9.15):

$$ \mathcal{W}_T = T\, \hat{\beta}_{2,T}'\, (R \hat{C}_T R')^{-1} \hat{\beta}_{2,T}, $$

where $\hat{\beta}_{2,T} = R\hat{\theta}_T$ is the estimator of $b_2$. As shown in Example 9.5, when the information matrix equality holds, $\hat{C}_T = -\hat{H}_T^{-1}$ is block diagonal, so that

$$ R \hat{C}_T R' = \hat{\sigma}_T^2\, R_1 (X'X/T)^{-1} R_1'. $$

The Wald test then becomes

$$ \mathcal{W}_T = T\, \hat{\beta}_{2,T}'\, \big[ R_1 (X'X/T)^{-1} R_1' \big]^{-1} \hat{\beta}_{2,T} \big/ \hat{\sigma}_T^2. $$

In this case, the Wald statistic is just $s$ times the standard $F$ statistic, which is readily available from most econometric packages.
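Once $\hat{\theta}_T$ and $\hat{C}_T$ are available, the Wald statistic (9.15) is a few lines of linear algebra. The sketch below continues the simulated example (reusing `beta_qmle`, `s2_qmle`, and the sandwich estimate `C` from the earlier sketches, with $\theta$ ordered $(\beta', \sigma^2)'$ as there) and tests the illustrative hypothesis that the last slope coefficient is zero.

```python
import numpy as np
from scipy import stats

# Wald test (9.15) of H0: R theta* = r, with C the sandwich estimate from the
# earlier sketch.  theta is ordered (beta, sigma^2) as in that sketch, and R
# picks out the last slope coefficient (an illustrative restriction, s = 1).
theta_hat = np.concatenate([beta_qmle, [s2_qmle]])
R = np.zeros((1, k + 1)); R[0, k - 1] = 1.0
r = np.zeros(1)

diff = R @ theta_hat - r
W = T * diff @ np.linalg.solve(R @ C @ R.T, diff)
p_value = stats.chi2.sf(W, df=R.shape[0])
```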

9.4.2 Lagrange Multiplier Test

Consider now the problem of maximizing $L_T(\theta)$ subject to the constraint $R\theta = r$. The Lagrangian is

$$ L_T(\theta) + \theta' R' \lambda, $$

where $\lambda$ is the vector of Lagrange multipliers. The maximizers of the Lagrangian are denoted $\tilde{\theta}_T$ and $\tilde{\lambda}_T$, where $\tilde{\theta}_T$ is the constrained QMLE of $\theta$. Analogous to Section 6.4.2, the LM test under the QML framework also checks whether $\tilde{\lambda}_T$ is sufficiently close to zero.

First note that $\tilde{\theta}_T$ and $\tilde{\lambda}_T$ satisfy the saddle-point condition

$$ \nabla L_T(\tilde{\theta}_T) + R'\tilde{\lambda}_T = 0. $$

The mean-value expansion of $\nabla L_T(\tilde{\theta}_T)$ about $\theta^*$ yields

$$ \nabla L_T(\theta^*) + \nabla^2 L_T(\theta_T^{\dagger})\,(\tilde{\theta}_T - \theta^*) + R'\tilde{\lambda}_T = 0, $$

where $\theta_T^{\dagger}$ is the mean value between $\tilde{\theta}_T$ and $\theta^*$. It has been shown in (9.8) that

$$ \sqrt{T}\,(\hat{\theta}_T - \theta^*) = -H_T(\theta^*)^{-1} \sqrt{T}\, \nabla L_T(\theta^*) + o_{\mathbb{P}}(1). $$

Hence,

$$ -H_T(\theta^*)\, \sqrt{T}\,(\hat{\theta}_T - \theta^*) + \nabla^2 L_T(\theta_T^{\dagger})\, \sqrt{T}\,(\tilde{\theta}_T - \theta^*) + \sqrt{T}\, R'\tilde{\lambda}_T = o_{\mathbb{P}}(1). $$

Using the WULLN result $\nabla^2 L_T(\theta_T^{\dagger}) - H_T(\theta^*) \xrightarrow{\mathbb{P}} 0$, we obtain

$$ \sqrt{T}\,(\tilde{\theta}_T - \theta^*) = \sqrt{T}\,(\hat{\theta}_T - \theta^*) - H_T(\theta^*)^{-1} R'\, \sqrt{T}\, \tilde{\lambda}_T + o_{\mathbb{P}}(1). \tag{9.16} $$

This establishes a relationship between the constrained and unconstrained QMLEs. Pre-multiplying both sides of (9.16) by $R$ and noting that the constrained estimator $\tilde{\theta}_T$ must satisfy the constraint, so that $R(\tilde{\theta}_T - \theta^*) = 0$ under the null hypothesis, we have

$$ \sqrt{T}\, \tilde{\lambda}_T = [R H_T(\theta^*)^{-1} R']^{-1} R\, \sqrt{T}\,(\hat{\theta}_T - \theta^*) + o_{\mathbb{P}}(1), \tag{9.17} $$

which relates the Lagrange multiplier to the unconstrained QMLE $\hat{\theta}_T$. When Theorem 9.2 holds for the normalized $\hat{\theta}_T$, we obtain the following asymptotic normality result for the normalized Lagrange multiplier:

$$ \Lambda_T(\theta^*)^{-1/2}\, \sqrt{T}\, \tilde{\lambda}_T = \Lambda_T(\theta^*)^{-1/2} [R H_T(\theta^*)^{-1} R']^{-1} R\, \sqrt{T}\,(\hat{\theta}_T - \theta^*) \xrightarrow{\ D\ } N(0, I_q), \tag{9.18} $$

where

$$ \Lambda_T(\theta^*) = [R H_T(\theta^*)^{-1} R']^{-1}\, R C_T(\theta^*) R'\, [R H_T(\theta^*)^{-1} R']^{-1}. $$

Let $\tilde{H}_T = H_T(\tilde{\theta}_T)$ denote a consistent estimator of $H_T(\theta^*)$ and $\tilde{C}_T = C_T(\tilde{\theta}_T)$ a consistent estimator of $C_T(\theta^*)$, both based on the constrained QMLE $\tilde{\theta}_T$. Then,

$$ \tilde{\Lambda}_T = (R \tilde{H}_T^{-1} R')^{-1}\, R \tilde{C}_T R'\, (R \tilde{H}_T^{-1} R')^{-1} $$

is consistent for $\Lambda_T(\theta^*)$. It follows from (9.18) that

$$ \tilde{\Lambda}_T^{-1/2}\, \sqrt{T}\, \tilde{\lambda}_T \xrightarrow{\ D\ } N(0, I_q). \tag{9.19} $$

The LM test statistic is the inner product of the left-hand side of (9.19):

$$ \mathcal{LM}_T = T\, \tilde{\lambda}_T' \tilde{\Lambda}_T^{-1} \tilde{\lambda}_T = T\, \tilde{\lambda}_T' R \tilde{H}_T^{-1} R'\, (R \tilde{C}_T R')^{-1}\, R \tilde{H}_T^{-1} R' \tilde{\lambda}_T. \tag{9.20} $$

The limiting distribution of the LM test now follows easily from (9.19) and the continuous mapping theorem.

Theorem 9.9 Suppose that Theorem 9.2 holds for the QMLE $\hat{\theta}_T$. Then, under the null hypothesis,

$$ \mathcal{LM}_T \xrightarrow{\ D\ } \chi^2(q), $$

where $\mathcal{LM}_T$ is defined in (9.20) and $q$ is the number of rows of $R$.

Remark: When the information matrix equality holds, the LM statistic (9.20) becomes

$$ \mathcal{LM}_T = -T\, \tilde{\lambda}_T' R \tilde{H}_T^{-1} R' \tilde{\lambda}_T = -T\, \nabla L_T(\tilde{\theta}_T)'\, \tilde{H}_T^{-1}\, \nabla L_T(\tilde{\theta}_T), $$

which mainly involves the average of the scores, $\nabla L_T(\tilde{\theta}_T)$. The LM test is thus a test that checks whether the average score is sufficiently close to zero and is hence also known as the score test.

Example 9.10 Consider the quasi-log-likelihood function specified in Example 9.5. We write $\theta = (\sigma^2 \ \beta')'$ and $\beta = (b_1' \ b_2')'$, where $b_1$ is $(k - s) \times 1$ and $b_2$ is $s \times 1$. We are interested in the null hypothesis that $b_2 = R\theta = 0$, where $R = [0 \ R_1]$ is $s \times (k+1)$ and $R_1 = [0 \ I_s]$ is $s \times k$. From the saddle-point condition, $\nabla L_T(\tilde{\theta}_T) = -R'\tilde{\lambda}_T$, which can be partitioned as

$$ \nabla L_T(\tilde{\theta}_T) = \begin{pmatrix} \nabla_{\sigma^2} L_T(\tilde{\theta}_T) \\ \nabla_{b_1} L_T(\tilde{\theta}_T) \\ \nabla_{b_2} L_T(\tilde{\theta}_T) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -\tilde{\lambda}_T \end{pmatrix} = -R'\tilde{\lambda}_T. $$

Partitioning $x_t$ accordingly as $(x_{1t}' \ x_{2t}')'$, we have

$$ \nabla_{b_2} L_T(\tilde{\theta}_T) = \frac{1}{T \tilde{\sigma}_T^2} \sum_{t=1}^{T} x_{2t}\, \tilde{\epsilon}_t = X_2'\tilde{\epsilon}\, /\, (T \tilde{\sigma}_T^2), $$

where $\tilde{\sigma}_T^2 = \tilde{\epsilon}'\tilde{\epsilon}/T$, $\tilde{\epsilon}$ is the vector of constrained residuals obtained from regressing $y_t$ on $x_{1t}$, and $X_2$ is the $T \times s$ matrix whose $t$-th row is $x_{2t}'$. The LM test can be computed according to (9.20):

$$ \mathcal{LM}_T = T \begin{pmatrix} 0 \\ 0 \\ X_2'\tilde{\epsilon}/(T\tilde{\sigma}_T^2) \end{pmatrix}' \tilde{H}_T^{-1} R'\, (R \tilde{C}_T R')^{-1}\, R \tilde{H}_T^{-1} \begin{pmatrix} 0 \\ 0 \\ X_2'\tilde{\epsilon}/(T\tilde{\sigma}_T^2) \end{pmatrix}, $$

which converges in distribution to $\chi^2(s)$ under the null hypothesis. Note that we do not have to evaluate the complete score vector for computing the LM test; only the subvector of the score that corresponds to the constraint matters. When the information matrix equality holds, the LM statistic has a simpler form:

$$ \mathcal{LM}_T = [0 \ \ \tilde{\epsilon}'X_2/\sqrt{T}]\, (X'X/T)^{-1}\, [0 \ \ \tilde{\epsilon}'X_2/\sqrt{T}]' \big/ \tilde{\sigma}_T^2 = T\, \frac{\tilde{\epsilon}'X(X'X)^{-1}X'\tilde{\epsilon}}{\tilde{\epsilon}'\tilde{\epsilon}} = T R^2, $$

using $X_1'\tilde{\epsilon} = 0$, where $R^2$ is the non-centered coefficient of determination obtained from the auxiliary regression of the constrained residuals $\tilde{\epsilon}_t$ on $x_{1t}$ and $x_{2t}$.

Example 9.11 (Breusch-Pagan) Suppose that the specification is

$$ y_t \mid x_t, \zeta_t \sim N\big(x_t'\beta,\, h(\zeta_t'\alpha)\big), $$

where $h: \mathbb{R} \to (0, \infty)$ is a differentiable function, and $\zeta_t'\alpha = \alpha_0 + \sum_{i=1}^{p} \zeta_{ti}\alpha_i$. The null hypothesis is conditional homoskedasticity, i.e., $\alpha_1 = \cdots = \alpha_p = 0$, so that $h(\alpha_0) = \sigma_0^2$. Breusch and Pagan (1979) derived the LM test for this hypothesis under the assumption that the information matrix equality holds. This test is now usually referred to as the Breusch-Pagan test. Note that the constrained specification is $y_t \mid x_t, \zeta_t \sim N(x_t'\beta, \sigma^2)$, where $\sigma^2 = h(\alpha_0)$. This leads to the standard linear regression model without heteroskedasticity. The constrained QMLEs for $\beta$ and $\sigma^2$ are, respectively, the OLS estimators $\hat{\beta}_T$ and $\hat{\sigma}_T^2 = \sum_{t=1}^{T} \hat{e}_t^2/T$, where $\hat{e}_t$ are the OLS residuals. The score vector corresponding to $\alpha$ is

$$ \nabla_{\alpha} L_T(y_t, x_t, \zeta_t; \theta) = \frac{1}{T} \sum_{t=1}^{T} \frac{h'(\zeta_t'\alpha)\, \zeta_t}{2h(\zeta_t'\alpha)} \Big( \frac{(y_t - x_t'\beta)^2}{h(\zeta_t'\alpha)} - 1 \Big), $$

where $h'(\eta) = dh(\eta)/d\eta$. Under the null hypothesis, $h'(\zeta_t'\alpha^*) = h'(\alpha_0)$ is just a constant, say $c$. The score vector above, evaluated at the constrained QMLEs, is

$$ \nabla_{\alpha} L_T(y_t, x_t, \zeta_t; \tilde{\theta}_T) = \frac{c}{2\hat{\sigma}_T^2} \cdot \frac{1}{T} \sum_{t=1}^{T} \zeta_t \Big( \frac{\hat{e}_t^2}{\hat{\sigma}_T^2} - 1 \Big). $$

The $(p+1) \times (p+1)$ block of the Hessian matrix corresponding to $\alpha$ is

$$ \frac{1}{T} \sum_{t=1}^{T} \Big[ -\frac{(y_t - x_t'\beta)^2}{h^3(\zeta_t'\alpha)} + \frac{1}{2h^2(\zeta_t'\alpha)} \Big] [h'(\zeta_t'\alpha)]^2\, \zeta_t \zeta_t' + \frac{1}{T} \sum_{t=1}^{T} \Big[ \frac{(y_t - x_t'\beta)^2}{2h^2(\zeta_t'\alpha)} - \frac{1}{2h(\zeta_t'\alpha)} \Big] h''(\zeta_t'\alpha)\, \zeta_t \zeta_t'. $$

Evaluating the expectation of this block at $\theta^* = (\beta_o' \ \alpha_0 \ 0')'$ and noting that $(\sigma^*)^2 = h(\alpha_0)$, we have

$$ -\Big( \frac{c^2}{2[(\sigma^*)^2]^2} \Big)\, \mathbb{E}(\zeta_t \zeta_t'), $$

which, apart from the constant $c^2$, can be estimated by

$$ -\Big( \frac{c^2}{2(\hat{\sigma}_T^2)^2} \Big) \frac{1}{T} \sum_{t=1}^{T} \zeta_t \zeta_t'. $$

The LM test is now readily derived from the results above when the information matrix equality holds. Setting $d_t = \hat{e}_t^2/\hat{\sigma}_T^2 - 1$, the LM statistic is

$$ \mathcal{LM}_T = \Big( \sum_{t=1}^{T} d_t \zeta_t' \Big) \Big( \sum_{t=1}^{T} \zeta_t \zeta_t' \Big)^{-1} \Big( \sum_{t=1}^{T} \zeta_t d_t \Big) \Big/ 2 = d'Z(Z'Z)^{-1}Z'd\, /\, 2 \xrightarrow{\ D\ } \chi^2(p), $$

where $d$ is the $T \times 1$ vector with $t$-th element $d_t$, and $Z$ is the $T \times (p+1)$ matrix with $t$-th row $\zeta_t'$. It can be seen that the numerator of the LM statistic is the (centered) regression sum of squares (RSS) of regressing $d_t$ on $\zeta_t$. This shows that the Breusch-Pagan test can also be computed by running an auxiliary regression and using the resulting RSS$/2$ as the statistic. Intuitively, this amounts to checking whether the variables in $\zeta_t$ are capable of explaining the squares of the (standardized) OLS residuals. It is also interesting to see that the value of $c$ and the functional form of $h$ do not matter in deriving the statistic.
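The RSS$/2$ form is straightforward to compute via the auxiliary regression just described. A minimal sketch (reusing `X` and `y` from the simulated example, and taking $\zeta_t$ to be a constant plus the regressors, an illustrative choice):

```python
import numpy as np
from scipy import stats

# Breusch-Pagan statistic RSS/2 from regressing d_t = e_t^2/s2 - 1 on zeta_t.
# Assumes X, y from the earlier simulated example; zeta_t = (1, x_t')' is an
# illustrative choice of variance regressors.
T = len(y)
e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)    # OLS residuals
s2 = np.mean(e ** 2)
d = e ** 2 / s2 - 1.0                            # d_t has sample mean zero

Z = np.column_stack([np.ones(T), X])
d_fit = Z @ np.linalg.solve(Z.T @ Z, Z.T @ d)    # fitted values of the auxiliary regression
bp = (d_fit @ d_fit) / 2.0                       # = d'Z(Z'Z)^{-1}Z'd / 2
p_bp = stats.chi2.sf(bp, df=X.shape[1])          # df = p, the non-constant variance regressors
```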

This invariance to $c$ and to the functional form of $h$ makes the Breusch-Pagan test a general test for conditional heteroskedasticity.

Koenker (1981) noted that, under conditional normality, $\sum_{t=1}^{T} d_t^2/T \xrightarrow{\mathbb{P}} 2$. Thus, a test that is more robust to non-normality and asymptotically equivalent to the Breusch-Pagan test is obtained by replacing the denominator $2$ with $\sum_{t=1}^{T} d_t^2/T$. This robust version can be expressed as

$$ \mathcal{LM}_T = T\, \big[ d'Z(Z'Z)^{-1}Z'd\, /\, d'd \big] = T R^2, $$

where $R^2$ is the (centered) $R^2$ from regressing $d_t$ on $\zeta_t$. This is also equivalent to $T$ times the centered $R^2$ from regressing $\hat{e}_t^2$ on $\zeta_t$.

Remarks:

1. To compute the Breusch-Pagan test, one must specify a vector $\zeta_t$ that determines the conditional variance. Here, $\zeta_t$ may contain some or all of the variables in $x_t$. If $\zeta_t$ is chosen to include all elements of $x_t$, their squares, and their pairwise products, the resulting $T R^2$ statistic is also the White (1980) test for (conditional) heteroskedasticity of unknown form. The White test can also be interpreted as an information matrix test, discussed below.

2. The Breusch-Pagan test is obtained under the condition that the information matrix equality holds. We have seen that the information matrix equality may fail when there is dynamic misspecification. Thus, the Breusch-Pagan test is not valid when, e.g., the errors are serially correlated.

Example 9.12 (Breusch-Godfrey) Given the specification $y_t \mid x_t \sim N(x_t'\beta, \sigma^2)$, suppose that one would like to check whether the errors are serially correlated. Consider first AR(1) errors: $y_t - x_t'\beta = \rho(y_{t-1} - x_{t-1}'\beta) + u_t$, with $|\rho| < 1$ and $\{u_t\}$ a white noise. The null hypothesis is $\rho = 0$, i.e., no serial correlation. It can be seen that a general specification that allows for serial correlation is

$$ y_t \mid y_{t-1}, x_t, x_{t-1} \sim N\big(x_t'\beta + \rho(y_{t-1} - x_{t-1}'\beta),\, \sigma_u^2\big). $$

The constrained specification is just the standard linear regression model $y_t = x_t'\beta + \epsilon_t$. Testing the null hypothesis that $\rho = 0$ is thus equivalent to testing whether the additional variable $y_{t-1} - x_{t-1}'\beta$ should be included in the mean specification. In light of Example 9.10, if the information matrix equality holds, the LM test can be computed as $T R^2$, where $R^2$ is obtained from regressing the OLS residuals $\hat{\epsilon}_t$ on $x_t$ and

$\hat{\epsilon}_{t-1} = y_{t-1} - x_{t-1}'\hat{\beta}_T$. This is precisely the Breusch (1978) and Godfrey (1978) test for AR(1) errors, with a limiting $\chi^2(1)$ distribution.

The test above extends straightforwardly to AR(p) errors. By regressing $\hat{\epsilon}_t$ on $x_t$ and $\hat{\epsilon}_{t-1}, \ldots, \hat{\epsilon}_{t-p}$, the resulting $T R^2$ is the LM statistic when the information matrix equality holds and has a limiting $\chi^2(p)$ distribution. Such tests are known as the Breusch-Godfrey test. Moreover, if the specification is $y_t - x_t'\beta = u_t + \alpha u_{t-1}$, i.e., the errors follow an MA(1) process, we can write

$$ y_t \mid x_t, u_{t-1} \sim N(x_t'\beta + \alpha u_{t-1},\, \sigma_u^2). $$

The null hypothesis is $\alpha = 0$. It is readily seen that the resulting LM test is identical to that for AR(1) errors. Thus, the Breusch-Godfrey test for MA(p) errors is the same as that for AR(p) errors.

Remarks:

1. The Breusch-Godfrey tests discussed above are obtained under the condition that the information matrix equality holds. We have seen that the information matrix equality may fail when, for example, there is neglected conditional heteroskedasticity. Thus, the Breusch-Godfrey tests are not valid when conditional heteroskedasticity is present.

2. It can be shown that the square of Durbin's h test is also an LM test. While Durbin's h test may not be feasible in practice, the Breusch-Godfrey test can always be computed.
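The Breusch-Godfrey $TR^2$ statistic is equally simple to compute from the auxiliary regression of the OLS residuals on the original regressors and their own lags. A minimal sketch (`X`, `y` from the simulated example; pre-sample lagged residuals set to zero, a common convention):

```python
import numpy as np
from scipy import stats

# Breusch-Godfrey TR^2 test for AR(p) errors.  Assumes X, y from the earlier
# simulated example; pre-sample residuals are padded with zeros.
p = 2
T = len(y)
e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)    # OLS residuals

lags = np.column_stack([np.r_[np.zeros(j), e[:-j]] for j in range(1, p + 1)])
W_aux = np.column_stack([X, lags])               # auxiliary regressors
e_fit = W_aux @ np.linalg.solve(W_aux.T @ W_aux, W_aux.T @ e)

r2 = (e_fit @ e_fit) / (e @ e)                   # uncentered R^2 of the auxiliary regression
bg = T * r2
p_bg = stats.chi2.sf(bg, df=p)
```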

9.4.3 Likelihood Ratio Test

As discussed in Section 6.4.3, the LR test compares the performance of the constrained and unconstrained specifications based on their likelihood ratio:

$$ \mathcal{LR}_T = -2T\, \big[ L_T(\tilde{\theta}_T) - L_T(\hat{\theta}_T) \big]. \tag{9.21} $$

Utilizing the relationship between the constrained and unconstrained QMLEs (9.16) and the relationship between the Lagrange multiplier and the unconstrained QMLE (9.17), we can write

$$ \sqrt{T}\,(\hat{\theta}_T - \tilde{\theta}_T) = H_T(\theta^*)^{-1} R'\, \sqrt{T}\, \tilde{\lambda}_T + o_{\mathbb{P}}(1) = H_T(\theta^*)^{-1} R' [R H_T(\theta^*)^{-1} R']^{-1} R\, \sqrt{T}\,(\hat{\theta}_T - \theta^*) + o_{\mathbb{P}}(1). \tag{9.22} $$

By Taylor expansion of $L_T(\tilde{\theta}_T)$ about $\hat{\theta}_T$,

$$ -2T \big[ L_T(\tilde{\theta}_T) - L_T(\hat{\theta}_T) \big] = -2T\, \nabla L_T(\hat{\theta}_T)'(\tilde{\theta}_T - \hat{\theta}_T) - T\, (\tilde{\theta}_T - \hat{\theta}_T)' H_T(\theta^*) (\tilde{\theta}_T - \hat{\theta}_T) + o_{\mathbb{P}}(1) $$
$$ = -T\, (\hat{\theta}_T - \tilde{\theta}_T)' H_T(\theta^*) (\hat{\theta}_T - \tilde{\theta}_T) + o_{\mathbb{P}}(1) $$
$$ = -T\, (\hat{\theta}_T - \theta^*)' R' [R H_T(\theta^*)^{-1} R']^{-1} R\, (\hat{\theta}_T - \theta^*) + o_{\mathbb{P}}(1), $$

where the second equality follows because $\nabla L_T(\hat{\theta}_T) = 0$. It can be seen that the right-hand side is essentially the Wald statistic when $-R H_T(\theta^*)^{-1} R'$ is the normalizing variance-covariance matrix. This leads to the following distributional result for the LR test.

Theorem 9.13 Suppose that Theorem 9.2 for the QMLE $\hat{\theta}_T$ and the information matrix equality both hold. Then, under the null hypothesis,

$$ \mathcal{LR}_T \xrightarrow{\ D\ } \chi^2(q), $$

where $\mathcal{LR}_T$ is defined in (9.21) and $q$ is the number of rows of $R$.

Theorem 9.13 differs from Theorem 9.7 and Theorem 9.9 in that it also requires the validity of the information matrix equality. When the information matrix equality does not hold, $-R H_T(\theta^*)^{-1} R'$ is not a proper normalizing matrix, so that $\mathcal{LR}_T$ does not have a limiting $\chi^2$ distribution. This result clearly indicates that, when $L_T$ is constructed by specifying density functions for $\{y_t \mid x_t\}$, the LR test is not robust to dynamic misspecification and misspecifications of other conditional attributes, such as neglected conditional heteroskedasticity. By contrast, the Wald and LM tests can be made robust by employing a proper normalizing variance-covariance matrix.
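When the information matrix equality is credible, the LR statistic (9.21) requires nothing beyond the two maximized quasi-log-likelihoods. A sketch continuing the simulated example (reusing `neg_loglik`, `T`, and `k` from the optimization sketch; the constraint fixing the last slope coefficient at zero is illustrative):

```python
import numpy as np
from scipy.optimize import minimize

# LR statistic (9.21): 2T times the gap between unconstrained and constrained
# maximized average quasi-log-likelihoods.  Assumes neg_loglik, T, k from the
# earlier sketch; the restriction beta_k = 0 is illustrative.
res_u = minimize(neg_loglik, np.zeros(k + 1), method="BFGS")

def neg_loglik_constrained(phi):
    # embed the free parameters, imposing the last slope coefficient = 0
    theta = np.concatenate([phi[:k - 1], [0.0], phi[k - 1:]])
    return neg_loglik(theta)

res_c = minimize(neg_loglik_constrained, np.zeros(k), method="BFGS")
LR = 2 * T * (res_c.fun - res_u.fun)   # neg_loglik is -L_T, so the signs flip
```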

Exercises

9.1 Let $g$ and $f$ be two density functions. Show that the KLIC $\mathbb{I}(g:f)$ does not obey the triangle inequality, i.e., that $\mathbb{I}(g:f) \le \mathbb{I}(g:h) + \mathbb{I}(h:f)$ need not hold for another density function $h$.

9.2 Prove equations (9.11) and (9.12) in Example 9.5.

9.3 In Example 9.5, suppose there is dynamic misspecification. What is $B_T(\theta^*)$?

9.4 In Example 9.6, what is $B_T(\theta^*)$? Show that

$$ \hat{B}_T = \begin{pmatrix} \dfrac{1}{T}\displaystyle\sum_t \dfrac{\hat{e}_t^2\, x_t x_t'}{(\hat{\sigma}_T^2)^2} & 0 \\ 0' & -\dfrac{1}{4(\hat{\sigma}_T^2)^2} + \dfrac{1}{T}\displaystyle\sum_t \dfrac{\hat{e}_t^4}{4(\hat{\sigma}_T^2)^4} \end{pmatrix} $$

is a consistent estimator for $B_T(\theta^*)$.

9.5 Consider the specification $y_t \mid x_t \sim N\big(x_t'\beta,\, h(\zeta_t'\alpha)\big)$. What condition would ensure that $H_T(\theta^*)$ and $B_T(\theta^*)$ are block diagonal?

9.6 Consider the specification $y_t \mid x_t, y_{t-1} \sim N(\gamma y_{t-1} + x_t'\beta,\, \sigma^2)$ and the AR(1) errors $y_t - \gamma y_{t-1} - x_t'\beta = \rho(y_{t-1} - \gamma y_{t-2} - x_{t-1}'\beta) + u_t$, with $|\rho| < 1$ and $\{u_t\}$ a white noise. Derive the LM test for the null hypothesis $\rho = 0$ and show that its square root is Durbin's h test.

References

Amemiya, Takeshi (1985). Advanced Econometrics, Cambridge, MA: Harvard University Press.

Breusch, T. S. (1978). Testing for autocorrelation in dynamic linear models, Australian Economic Papers, 17, 334-355.

Breusch, T. S. and A. R. Pagan (1979). A simple test for heteroscedasticity and random coefficient variation, Econometrica, 47, 1287-1294.

Engle, Robert F. (1982). Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation, Econometrica, 50, 987-1008.

Godfrey, L. G. (1978). Testing against general autoregressive and moving average error models when the regressors include lagged dependent variables, Econometrica, 46, 1293-1301.

Godfrey, L. G. (1988). Misspecification Tests in Econometrics: The Lagrange Multiplier Principle and Other Approaches, New York: Cambridge University Press.

Hamilton, James D. (1994). Time Series Analysis, Princeton: Princeton University Press.

Hausman, Jerry A. (1978). Specification tests in econometrics, Econometrica, 46, 1251-1271.

Koenker, Roger (1981). A note on studentizing a test for heteroscedasticity, Journal of Econometrics, 17, 107-112.

White, Halbert (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity, Econometrica, 48, 817-838.

White, Halbert (1982). Maximum likelihood estimation of misspecified models, Econometrica, 50, 1-25.

White, Halbert (1994). Estimation, Inference, and Specification Analysis, New York: Cambridge University Press.


More information

Freeing up the Classical Assumptions. () Introductory Econometrics: Topic 5 1 / 94

Freeing up the Classical Assumptions. () Introductory Econometrics: Topic 5 1 / 94 Freeing up the Classical Assumptions () Introductory Econometrics: Topic 5 1 / 94 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions needed for derivations

More information

Empirical Economic Research, Part II

Empirical Economic Research, Part II Based on the text book by Ramanathan: Introductory Econometrics Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna December 7, 2011 Outline Introduction

More information

Interpreting Regression Results

Interpreting Regression Results Interpreting Regression Results Carlo Favero Favero () Interpreting Regression Results 1 / 42 Interpreting Regression Results Interpreting regression results is not a simple exercise. We propose to split

More information

ECON2228 Notes 10. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 48

ECON2228 Notes 10. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 48 ECON2228 Notes 10 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 10 2014 2015 1 / 48 Serial correlation and heteroskedasticity in time series regressions Chapter 12:

More information

LECTURE 10: MORE ON RANDOM PROCESSES

LECTURE 10: MORE ON RANDOM PROCESSES LECTURE 10: MORE ON RANDOM PROCESSES AND SERIAL CORRELATION 2 Classification of random processes (cont d) stationary vs. non-stationary processes stationary = distribution does not change over time more

More information

Asymptotic Least Squares Theory: Part II

Asymptotic Least Squares Theory: Part II Chapter 7 Asymptotic Least Squares heory: Part II In the preceding chapter the asymptotic properties of the OLS estimator were derived under standard regularity conditions that require data to obey suitable

More information

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication G. S. Maddala Kajal Lahiri WILEY A John Wiley and Sons, Ltd., Publication TEMT Foreword Preface to the Fourth Edition xvii xix Part I Introduction and the Linear Regression Model 1 CHAPTER 1 What is Econometrics?

More information

Asymptotic Least Squares Theory

Asymptotic Least Squares Theory Asymptotic Least Squares Theory CHUNG-MING KUAN Department of Finance & CRETA December 5, 2011 C.-M. Kuan (National Taiwan Univ.) Asymptotic Least Squares Theory December 5, 2011 1 / 85 Lecture Outline

More information

GARCH Models Estimation and Inference. Eduardo Rossi University of Pavia

GARCH Models Estimation and Inference. Eduardo Rossi University of Pavia GARCH Models Estimation and Inference Eduardo Rossi University of Pavia Likelihood function The procedure most often used in estimating θ 0 in ARCH models involves the maximization of a likelihood function

More information

Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Structural Breaks October 29-31, / 91. Bruce E.

Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Structural Breaks October 29-31, / 91. Bruce E. Forecasting Lecture 3 Structural Breaks Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Structural Breaks October 29-31, 2013 1 / 91 Bruce E. Hansen Organization Detection

More information

Regression with time series

Regression with time series Regression with time series Class Notes Manuel Arellano February 22, 2018 1 Classical regression model with time series Model and assumptions The basic assumption is E y t x 1,, x T = E y t x t = x tβ

More information

Reading Assignment. Serial Correlation and Heteroskedasticity. Chapters 12 and 11. Kennedy: Chapter 8. AREC-ECON 535 Lec F1 1

Reading Assignment. Serial Correlation and Heteroskedasticity. Chapters 12 and 11. Kennedy: Chapter 8. AREC-ECON 535 Lec F1 1 Reading Assignment Serial Correlation and Heteroskedasticity Chapters 1 and 11. Kennedy: Chapter 8. AREC-ECON 535 Lec F1 1 Serial Correlation or Autocorrelation y t = β 0 + β 1 x 1t + β x t +... + β k

More information

Statistics 910, #5 1. Regression Methods

Statistics 910, #5 1. Regression Methods Statistics 910, #5 1 Overview Regression Methods 1. Idea: effects of dependence 2. Examples of estimation (in R) 3. Review of regression 4. Comparisons and relative efficiencies Idea Decomposition Well-known

More information

Generalized Method of Moments (GMM) Estimation

Generalized Method of Moments (GMM) Estimation Econometrics 2 Fall 2004 Generalized Method of Moments (GMM) Estimation Heino Bohn Nielsen of29 Outline of the Lecture () Introduction. (2) Moment conditions and methods of moments (MM) estimation. Ordinary

More information

Formulary Applied Econometrics

Formulary Applied Econometrics Department of Economics Formulary Applied Econometrics c c Seminar of Statistics University of Fribourg Formulary Applied Econometrics 1 Rescaling With y = cy we have: ˆβ = cˆβ With x = Cx we have: ˆβ

More information

Additional Topics on Linear Regression

Additional Topics on Linear Regression Additional Topics on Linear Regression Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Additional Topics 1 / 49 1 Tests for Functional Form Misspecification 2 Nonlinear

More information

Birkbeck Working Papers in Economics & Finance

Birkbeck Working Papers in Economics & Finance ISSN 1745-8587 Birkbeck Working Papers in Economics & Finance Department of Economics, Mathematics and Statistics BWPEF 1809 A Note on Specification Testing in Some Structural Regression Models Walter

More information

Using all observations when forecasting under structural breaks

Using all observations when forecasting under structural breaks Using all observations when forecasting under structural breaks Stanislav Anatolyev New Economic School Victor Kitov Moscow State University December 2007 Abstract We extend the idea of the trade-off window

More information

13.2 Example: W, LM and LR Tests

13.2 Example: W, LM and LR Tests 13.2 Example: W, LM and LR Tests Date file = cons99.txt (same data as before) Each column denotes year, nominal household expenditures ( 10 billion yen), household disposable income ( 10 billion yen) and

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level

More information

The Size and Power of Four Tests for Detecting Autoregressive Conditional Heteroskedasticity in the Presence of Serial Correlation

The Size and Power of Four Tests for Detecting Autoregressive Conditional Heteroskedasticity in the Presence of Serial Correlation The Size and Power of Four s for Detecting Conditional Heteroskedasticity in the Presence of Serial Correlation A. Stan Hurn Department of Economics Unversity of Melbourne Australia and A. David McDonald

More information

Follow links for Class Use and other Permissions. For more information send to:

Follow links for Class Use and other Permissions. For more information send  to: COPYRIGH NOICE: Kenneth. Singleton: Empirical Dynamic Asset Pricing is published by Princeton University Press and copyrighted, 00, by Princeton University Press. All rights reserved. No part of this book

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = 0 + 1 x 1 + x +... k x k + u 6. Heteroskedasticity What is Heteroskedasticity?! Recall the assumption of homoskedasticity implied that conditional on the explanatory variables,

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics T H I R D E D I T I O N Global Edition James H. Stock Harvard University Mark W. Watson Princeton University Boston Columbus Indianapolis New York San Francisco Upper Saddle

More information

Elements of Probability Theory

Elements of Probability Theory Elements of Probability Theory CHUNG-MING KUAN Department of Finance National Taiwan University December 5, 2009 C.-M. Kuan (National Taiwan Univ.) Elements of Probability Theory December 5, 2009 1 / 58

More information

Diagnostics of Linear Regression

Diagnostics of Linear Regression Diagnostics of Linear Regression Junhui Qian October 7, 14 The Objectives After estimating a model, we should always perform diagnostics on the model. In particular, we should check whether the assumptions

More information

ECON2228 Notes 10. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 54

ECON2228 Notes 10. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 54 ECON2228 Notes 10 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 10 2014 2015 1 / 54 erial correlation and heteroskedasticity in time series regressions Chapter 12:

More information

Reliability of inference (1 of 2 lectures)

Reliability of inference (1 of 2 lectures) Reliability of inference (1 of 2 lectures) Ragnar Nymoen University of Oslo 5 March 2013 1 / 19 This lecture (#13 and 14): I The optimality of the OLS estimators and tests depend on the assumptions of

More information

388 Index Differencing test ,232 Distributed lags , 147 arithmetic lag.

388 Index Differencing test ,232 Distributed lags , 147 arithmetic lag. INDEX Aggregation... 104 Almon lag... 135-140,149 AR(1) process... 114-130,240,246,324-325,366,370,374 ARCH... 376-379 ARlMA... 365 Asymptotically unbiased... 13,50 Autocorrelation... 113-130, 142-150,324-325,365-369

More information

Greene, Econometric Analysis (6th ed, 2008)

Greene, Econometric Analysis (6th ed, 2008) EC771: Econometrics, Spring 2010 Greene, Econometric Analysis (6th ed, 2008) Chapter 17: Maximum Likelihood Estimation The preferred estimator in a wide variety of econometric settings is that derived

More information

Section 6: Heteroskedasticity and Serial Correlation

Section 6: Heteroskedasticity and Serial Correlation From the SelectedWorks of Econ 240B Section February, 2007 Section 6: Heteroskedasticity and Serial Correlation Jeffrey Greenbaum, University of California, Berkeley Available at: https://works.bepress.com/econ_240b_econometrics/14/

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna November 23, 2013 Outline Introduction

More information

Ch.10 Autocorrelated Disturbances (June 15, 2016)

Ch.10 Autocorrelated Disturbances (June 15, 2016) Ch10 Autocorrelated Disturbances (June 15, 2016) In a time-series linear regression model setting, Y t = x tβ + u t, t = 1, 2,, T, (10-1) a common problem is autocorrelation, or serial correlation of the

More information

1 The Multiple Regression Model: Freeing Up the Classical Assumptions

1 The Multiple Regression Model: Freeing Up the Classical Assumptions 1 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions were crucial for many of the derivations of the previous chapters. Derivation of the OLS estimator

More information

Questions and Answers on Unit Roots, Cointegration, VARs and VECMs

Questions and Answers on Unit Roots, Cointegration, VARs and VECMs Questions and Answers on Unit Roots, Cointegration, VARs and VECMs L. Magee Winter, 2012 1. Let ɛ t, t = 1,..., T be a series of independent draws from a N[0,1] distribution. Let w t, t = 1,..., T, be

More information

Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Introduction For pedagogical reasons, OLS is presented initially under strong simplifying assumptions. One of these is homoskedastic errors,

More information

Econometrics - 30C00200

Econometrics - 30C00200 Econometrics - 30C00200 Lecture 11: Heteroskedasticity Antti Saastamoinen VATT Institute for Economic Research Fall 2015 30C00200 Lecture 11: Heteroskedasticity 12.10.2015 Aalto University School of Business

More information

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models

More information