COPYRIGHT NOTICE: Kenneth J. Singleton: Empirical Dynamic Asset Pricing is published by Princeton University Press and copyrighted, 2006, by Princeton University Press. All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher, except for reading and browsing via the World Wide Web. Users are not permitted to mount this file on any network servers. Follow links for Class Use and other Permissions. For more information send to:

Large-Sample Properties of Extremum Estimators

Extremum estimators are estimators obtained by either maximizing or minimizing a criterion function over the admissible parameter space. In this chapter we introduce more formally the concept of an extremum estimator and discuss the large-sample properties of these estimators. After briefly setting up notation and describing the probability environment within which we discuss estimation, we describe regularity conditions under which an estimator converges almost surely to its population counterpart. We then turn to the large-sample distributions of extremum estimators. Throughout this discussion we maintain the assumption that $\theta_T$ is a consistent estimator of $\theta_0$ and focus on properties of the distribution of $\theta_T$ as $T$ gets large. Whereas discussions of consistency are often criterion-function specific, the large-sample analyses of most of the extremum estimators we use subsequently can be treated concurrently. We formally define a family of estimators that encompasses the first-order conditions of the ML, standard GMM, and LLP estimators as special cases. Then, after presenting a quite general central limit theorem, we establish the asymptotic normality of these estimators. Finally, we examine the relative asymptotic efficiencies of the GMM, LLP, and ML estimators and interpret their asymptotic efficiencies in terms of the restrictions on the joint distribution of the data used in estimation.

Basic Probability Model

Notationally, we let $\Omega$ denote the sample space, $\mathcal{F}$ the set of events about which we want to make probability statements (a $\sigma$-algebra of events), and

Footnote: The perspective on the large-sample properties of extremum estimators taken in this chapter has been shaped by my discussions and collaborations with Lars Hansen over the past years. In particular, the approach to establishing consistency and asymptotic normality follows that of Hansen (1982b, 00).

Pr the probability measure. Thus, we denote the probability space by $(\Omega, \mathcal{F}, \Pr)$. Similarly, we let $\mathcal{B}^K$ denote the Borel algebra of events in $\mathbb{R}^K$, which is the smallest $\sigma$-algebra containing all open and closed rectangles in $\mathbb{R}^K$. A $K$-dimensional vector random variable $X$ is a function from the sample space $\Omega$ to $\mathbb{R}^K$ with the property that for each $B \in \mathcal{B}^K$, $\{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}$. Each random variable $X$ induces a probability space $(\mathbb{R}^K, \mathcal{B}^K, \mu_X)$ by the correspondence $\mu_X(B) = \Pr\{\omega : X(\omega) \in B\}$, for all $B \in \mathcal{B}^K$. Two notions of convergence of sequences of random variables that we use extensively are as follows.

Definition. The sequence of random variables $\{X_T\}$ is said to converge almost surely (a.s.) to the random variable $X$ if and only if there exists a null set $N$ such that

$$\forall \omega \in \Omega \setminus N : \lim_{T \to \infty} X_T(\omega) = X(\omega).$$

Definition. The sequence of random variables $\{X_T\}$ is said to converge in probability to $X$ if and only if, for every $\epsilon > 0$, we have

$$\lim_{T \to \infty} \Pr\{|X_T - X| > \epsilon\} = 0.$$

When the $T$th element of the sequence is the estimator $\theta_T$ for sample size $T$ and the limit is the population parameter vector of interest $\theta_0$, then we call the estimator $\theta_T$ consistent for $\theta_0$.

Definition. A sequence of estimators $\{\theta_T\}$ is said to be strongly (weakly) consistent for a constant parameter vector $\theta_0$ if and only if $\theta_T$ converges almost surely (in probability) to $\theta_0$ as $T \to \infty$.

There are many different sets of sufficient conditions on the structure of asset pricing models and the probability models generating uncertainty for extremum estimators to be consistent. In this chapter we follow closely the approach in Hansen (1982b), which assumes that the underlying random vector of interest, $z_t$, is a stationary and ergodic time series. Subsequent chapters discuss how stochastic trends have been accommodated in DAPMs.

Footnote: The topics discussed in this section are covered in more depth in most intermediate statistics books. See Chung (1974) and Billingsley (1979).

Footnote: A null set $N$ for $\Pr$ is a set with the property that $\Pr\{N\} = 0$.

Footnote: We let $\mathbb{R}^\infty$ denote the space consisting of all infinite sequences $x = (x_1, x_2, \ldots)$ of real numbers (lower-case $x$ indicates $x \in \mathbb{R}^\infty$). A finite-dimensional rectangle is of the form $\{x \in \mathbb{R}^\infty : x_1 \in I_1, \ldots, x_n \in I_n\}$, where $I_1, \ldots, I_n$ are finite or infinite intervals in $\mathbb{R}$. If $\mathcal{B}^\infty$ denotes the smallest $\sigma$-algebra of subsets of $\mathbb{R}^\infty$ containing all finite-dimensional rectangles, then $X = (X_1, X_2, \ldots)$ is a measurable mapping from $\Omega$ to $(\mathbb{R}^\infty, \mathcal{B}^\infty)$ (here the $X_t$'s are random variables).

Definition. A process $\{X_t\}$ is called stationary if, for every $k$, the process $\{X_t\}_{t=k}^{\infty}$ has the same distribution as $\{X_t\}_{t=1}^{\infty}$; that is,

$$\Pr\{(X_1, X_2, \ldots) \in B\} = \Pr\{(X_{k+1}, X_{k+2}, \ldots) \in B\}, \quad B \in \mathcal{B}^\infty.$$

In practical terms, a stationary process is one such that the functional forms of the joint distributions of collections $(X_k, X_{k+1}, \ldots, X_{k+l})$ do not change over time. An important property of a stationary process is that the process $\{Y_k\}$ defined by $Y_k = f(X_k, X_{k+1}, \ldots)$ is also stationary for any $f$ that is measurable relative to $\mathcal{B}^\infty$.

The assumption that $\{X_t\}$ is stationary is not sufficient to ensure that sample averages of the process converge to $EX_1$, a requirement that underlies our large-sample analysis of estimators. (Here we use $EX_1$, because all $X_t$ have the same mean.) The reason is that the sample we observe is the realization $(X_1(\omega_0), X_2(\omega_0), \ldots)$ associated with a single $\omega_0$ in the sample space. If we are to learn about the distribution of the time series $\{X_t\}$ from this realization, then, as we move along the series $\{X_t(\omega_0)\}$, it must be as if we are observing realizations of $X_t(\omega)$ for fixed $t$ as $\omega$ ranges over $\Omega$. To make this idea more precise, suppose there is an event $A \in \mathcal{F}$ with the property that one can find a $B \in \mathcal{B}^\infty$ such that for every $t \geq 1$,

$$A = \{\omega : (X_t(\omega), X_{t+1}(\omega), \ldots) \in B\}.$$

Such an event $A$ is called invariant because, for $\omega_0 \in A$, the information provided by $\{X_t(\omega_0), X_{t+1}(\omega_0), \ldots\}$ is essentially unchanged as $t$ increases. On the other hand, if such a $B$ does not exist, then

$$A = \{\omega : (X_1(\omega), X_2(\omega), \ldots) \in B\} \neq \{\omega : (X_t(\omega), X_{t+1}(\omega), \ldots) \in B\}$$

for some $t > 1$, and $\{X_t(\omega), X_{t+1}(\omega), \ldots\}$ conveys information about a different event in $\mathcal{F}$ (a different part of $\Omega$).

Definition. A stationary process is ergodic if every invariant event has probability zero or one.

If the process is ergodic, then a single realization conveys sufficient information about $\Omega$ for a strong law of large numbers (SLLN) to hold. For further discussion of stationary and ergodic stochastic processes see, e.g., Breiman (1968).

Theorem (Ergodic Theorem). If $X_1, X_2, \ldots$ is a stationary and ergodic process and $E|X_1| < \infty$, then

$$\frac{1}{T} \sum_{t=1}^{T} X_t \to EX_1 \quad \text{a.s.}$$

One can relax the assumption of stationarity, thereby allowing the marginal distributions of $z_t$ to change over time, and still obtain a SLLN. However, this is typically accomplished by replacing the relatively weak requirements implicit in the assumption of stationarity on the dependence between $z_t$ and $z_{t-s}$, for $s \neq 0$, with stronger assumptions (see, e.g., Gallant and White, 1988). Two considerations motivate our focus on the case of stationary and ergodic time series. First, in dynamic asset pricing models, the pricing relations are typically the solutions to a dynamic optimization problem by investors or a replication argument based on no-arbitrage opportunities. As we will see more formally in later chapters, both of these arguments involve optimal forecasts of future variables, and these optimal forecasting problems are typically solved under the assumption of stationary time series. Indeed, these forecasting problems will generally not lend themselves to tractable solutions in the absence of stationarity. Second, the assumption that a time series is stationary does not preclude variation over time in the conditional distributions of $z_t$ conditioned on its own history. In particular, the time variation in conditional means and variances that is often the focus of financial econometric modeling is easily accommodated within the framework of stationary and ergodic time series. Of course, neither of these considerations rules out the possibility that the real world is one in which time series are in fact nonstationary. At a conceptual level, the economic argument for nonstationarity often comes down to the need to include additional conditioning variables.
For example, the case of a change in operating procedures by a monetary authority, as we experienced in the United States in the early 1980s, could be handled by conditioning on variables that determine a monetary authority's operating procedures. However, many of the changes in a pricing environment that would lead us to be concerned about stationarity happen infrequently. Therefore, we do not have repeated observations on the changes that concern us the most. The pragmatic solution to this problem has often been to judiciously choose the sample period so that the state vector $z_t$ in an asset pricing model can reasonably be assumed to be stationary. With these considerations in mind, we proceed under the formal assumption of stationary time series. An important exception is the case of nonstationarity induced by stochastic trends.
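The ergodic theorem can be illustrated numerically. The sketch below (a made-up Gaussian AR(1) example, not from the text) simulates a single realization of a stationary, ergodic process whose unconditional mean is known in closed form, and checks that the time average over longer and longer stretches of that one path approaches the population mean:

```python
import numpy as np

def ar1_path(T, c=1.0, phi=0.9, sigma=1.0, rng=None):
    """Simulate a stationary AR(1): x_t = c + phi*x_{t-1} + sigma*eps_t."""
    rng = rng if rng is not None else np.random.default_rng(0)
    mu = c / (1.0 - phi)                  # unconditional mean
    sd = sigma / np.sqrt(1.0 - phi**2)    # unconditional standard deviation
    x = np.empty(T)
    x[0] = rng.normal(mu, sd)             # draw x_0 from the stationary law
    for t in range(1, T):
        x[t] = c + phi * x[t - 1] + sigma * rng.normal()
    return x

# Sample means over increasing stretches of one realization:
rng = np.random.default_rng(42)
x = ar1_path(200_000, rng=rng)
for T in (100, 10_000, 200_000):
    print(T, x[:T].mean())   # should approach mu = 1/(1 - 0.9) = 10
```

Note that the process is serially dependent, yet a single realization suffices: this is precisely what ergodicity buys.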

Consistency: General Considerations

Let $Q_T(z_T, \theta)$ denote the function to be minimized by choice of the $K$-vector $\theta$ of unknown parameters within an admissible parameter space $\Theta \subseteq \mathbb{R}^K$, and let $Q_0(\theta)$ be its population counterpart. Throughout this chapter, it will be assumed that $Q_0(\theta)$ is uniquely minimized at $\theta_0$, the model parameters that generate the data. We begin by presenting a set of quite general sufficient conditions for $\theta_T$ to be a consistent estimator of $\theta_0$. The discussion of these conditions is intended to illustrate the essential features of a probability model that lead to strong consistency ($\theta_T$ converges almost surely to $\theta_0$). Without further assumptions, however, the general conditions proposed are not easily verified in practice. Therefore, we proceed to examine a more primitive set of conditions that imply the conditions of our initial consistency theorem. One critical assumption underlying consistency is the uniform convergence of sample criterion functions to their population counterparts as $T$ gets large. Following are definitions of two notions of uniform convergence.

Definition. Let $g_T(\theta)$ be a nonnegative sequence of random variables depending on the parameter $\theta$. Consider the two modes of uniform convergence of $g_T(\theta)$ to 0:

$$\Pr\Big\{\lim_{T \to \infty} \sup_{\theta \in \Theta} g_T(\theta) = 0\Big\} = 1,$$

$$\lim_{T \to \infty} \Pr\Big\{\sup_{\theta \in \Theta} g_T(\theta) < \epsilon\Big\} = 1 \quad \text{for any } \epsilon > 0.$$

If the first condition holds, then $g_T(\theta)$ is said to converge to 0 almost surely uniformly in $\theta$. If the second holds, then $g_T(\theta)$ is said to converge to 0 in probability uniformly in $\theta$.

The following theorem presents a useful set of sufficient conditions for $\theta_T$ to converge almost surely to $\theta_0$.

Theorem (Consistency). Suppose (i) $\Theta$ is compact. (ii) The nonnegative sample criterion function $Q_T(z_T, \theta)$ is continuous in $\theta$ and is a measurable function of $z_T$ for all $\theta$. (iii) $Q_T(z_T, \theta)$ converges to a nonstochastic function $Q_0(\theta)$ almost surely uniformly in $\theta$ as $T \to \infty$; and $Q_0(\theta)$ attains a unique minimum at $\theta_0$. Define $\theta_T$ as a value of $\theta$ that satisfies

$$Q_T(z_T, \theta_T) = \min_{\theta \in \Theta} Q_T(z_T, \theta).$$
Then $\theta_T$ converges almost surely to $\theta_0$.

Footnote: In situations where $\theta_T$ is not unique, if we let $\Gamma_T$ denote the set of minimizers, we can show that $\delta_T(\omega) = \sup\{\|\theta - \theta_0\| : \theta \in \Gamma_T\}$ converges almost surely to 0 as $T \to \infty$.
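Before turning to the proof, the content of the theorem can be illustrated with a minimal simulation (a made-up Gaussian location example, not from the text). Here $Q_T(\theta) = \frac{1}{T}\sum_t (y_t - \theta)^2$ converges uniformly on the compact set $\Theta = [0, 3]$ to $Q_0(\theta) = 1 + (\theta - \theta_0)^2$, which is uniquely minimized at $\theta_0$, so the sample minimizer converges to $\theta_0$:

```python
import numpy as np

rng = np.random.default_rng(2)
theta0 = 1.25                       # population minimizer (chosen for illustration)
grid = np.linspace(0.0, 3.0, 3001)  # compact parameter set Theta = [0, 3]

for T in (50, 5_000, 500_000):
    y = theta0 + rng.normal(size=T)           # stationary, ergodic (here i.i.d.) data
    # Q_T(theta) = (1/T) sum (y_t - theta)^2, expanded in sample moments:
    m1, m2 = y.mean(), (y ** 2).mean()
    QT = m2 - 2.0 * grid * m1 + grid ** 2
    theta_T = grid[int(np.argmin(QT))]        # extremum estimator over Theta
    print(T, theta_T)                          # approaches theta0 = 1.25
```

The grid search stands in for the abstract minimization over $\Theta$; expanding $Q_T$ in the sample moments $m_1, m_2$ just makes the evaluation cheap.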

Proof. Define the function

$$\rho(\epsilon) = \inf\{Q_0(\theta) - Q_0(\theta_0), \text{ for } \|\theta - \theta_0\| \geq \epsilon\}.$$

As long as $\epsilon > 0$, Assumptions (i)–(iii) guarantee that $\rho(\epsilon) > 0$. (Continuity of $Q_0$ follows from our assumptions.) Assumption (iii) implies that there exists a set $\bar{\Omega}$ with $\Pr(\bar{\Omega}) = 1$ and a positive, finite function $T^*(\omega, \epsilon)$, such that

$$\rho_T(\omega) \equiv \sup_{\theta \in \Theta} |Q_T(\omega, \theta) - Q_0(\theta)| < \rho(\epsilon)/2,$$

for all $\omega \in \bar{\Omega}$, $\epsilon > 0$, and $T \geq T^*(\omega, \epsilon)$. This inequality guarantees that for all $\omega \in \bar{\Omega}$, $\epsilon > 0$, and $T \geq T^*(\omega, \epsilon)$,

$$\begin{aligned} Q_0(\theta_T) - Q_0(\theta_0) &= [Q_0(\theta_T) - Q_T(\omega, \theta_T)] + [Q_T(\omega, \theta_T) - Q_T(\omega, \theta_0)] + [Q_T(\omega, \theta_0) - Q_0(\theta_0)] \\ &\leq |Q_0(\theta_T) - Q_T(\omega, \theta_T)| + |Q_T(\omega, \theta_0) - Q_0(\theta_0)| \\ &\leq 2\rho_T(\omega) < \rho(\epsilon), \end{aligned}$$

(the middle term in the first line is nonpositive because $\theta_T$ minimizes $Q_T$), which implies that $\|\theta_T - \theta_0\| < \epsilon$ for all $\omega \in \bar{\Omega}$, $\epsilon > 0$, and $T \geq T^*(\omega, \epsilon)$.

The assumptions of this consistency theorem are quite general. In particular, the $z_t$'s need not be identically distributed or independent. However, this generality is of little practical value unless the assumptions of the theorem can be verified in actual applications. In practice, this amounts to verifying Assumption (iii). The regularity conditions imposed in the econometrics literature to assure that (iii) holds typically depend on the specification of $Q_T$ and $Q_0$ and, thus, are often criterion-function specific. We present a set of sufficient conditions to establish the almost sure uniform convergence of the sample mean

$$G_T(z_T, \theta) = \frac{1}{T} \sum_{t=1}^{T} g(z_t, \theta)$$

to its population counterpart $G_0(\theta) = E[g(z_t, \theta)]$. This result then is used to establish the uniform convergence of $Q_T$ to $Q_0$ for the cases of ML and GMM estimators for stationary processes. To motivate the regularity conditions we impose on the time series $\{z_t\}$ and the function $g$, it is instructive to examine how far the assumption that

$\{z_t\}$ is stationary and ergodic takes us toward fulfilling the assumptions of the consistency theorem. Therefore, we begin by assuming:

Assumption. $\{z_t : t \geq 1\}$ is a stationary and ergodic stochastic process.

As discussed earlier, the sample and population criterion functions for LLP are

$$Q_0(\delta) = E\big[(y_t - x_t'\delta)^2\big], \qquad Q_T(\delta) = \frac{1}{T} \sum_{t=1}^{T} (y_t - x_t'\delta)^2, \qquad \delta \in \mathbb{R}^K.$$

For the LLP problem, $Q_0(\delta)$ is assured of having a unique minimizer $\delta_0$ if the second-moment matrix $E[x_t x_t']$ has full rank. Thus, with this additional assumption, the second part of Condition (iii) of the consistency theorem is satisfied. Furthermore, under the assumption of ergodicity,

$$\frac{1}{T} \sum_{t=1}^{T} x_t x_t' \to E[x_t x_t'] \quad \text{and} \quad \frac{1}{T} \sum_{t=1}^{T} x_t y_t \to E[x_t y_t] \quad \text{a.s.}$$

It follows immediately that $\delta_T \to \delta_0$ a.s. Though unnecessary in this case, we can also establish the strong consistency of $\delta_T$ for $\delta_0$ from the observation that $Q_T(\delta) \to Q_0(\delta)$ a.s., for all $\delta \in \mathbb{R}^K$. From the figure below it is seen that the criterion functions are quadratic and eventually overlap (for large $T$), so the minimizers of $Q_T(\delta)$ and $Q_0(\delta)$ must eventually coincide. We conclude that the strong consistency of estimators in LLP problems is essentially implied by the assumption that $\{z_t\}$ is stationary and ergodic (and the rank condition on $E[x_t x_t']$).

More generally, the assumptions of ergodicity of $\{z_t\}$ and the continuity of $Q_T(z_T, \theta)$ in its second argument do not imply the strong consistency of the minimizer $\theta_T$ of the criterion function $Q_T(\theta)$. The reason is that ergodicity guarantees only pointwise convergence, and the behavior in the tails of some nonlinear criterion functions may be problematic.

[Figure: Sample and population criterion functions for a least-squares projection.]

To illustrate this

point, the first figure below depicts a relatively well-behaved function $Q_T$ that implies the convergence of $\theta_T$ to $\theta_0$.

[Figure: Well-behaved $Q_0$, $Q_T$.]

In contrast, although the function $Q_T(\theta)$ in the next figure can be constructed to converge pointwise to $Q_0(\theta)$, $\theta_0$ and $\theta_T$ may grow increasingly far apart as $T$ increases if the dip moves farther out to the right as $T$ grows.

[Figure: Poorly behaved $Q_T$.]

This potential problem is ruled out by the assumptions that $\{Q_T : T \geq 1\}$ converges almost surely uniformly in $\theta$ to a function $Q_0$ and that $\theta_0$ is the unique minimizer of $Q_0$. Even uniform convergence of $Q_T$ to $Q_0$ combined with stationarity and ergodicity is not sufficient to ensure that $\theta_T$ converges to $\theta_0$, however. To see why, consider the situation in the last figure below. If $Q_0(\theta)$ asymptotes to the minimum of $Q_0(\theta)$ over $\Theta \subseteq \mathbb{R}$ (but does not achieve this minimum) in the left tail, then $Q_T(\theta_T)$ can get arbitrarily close to $Q_0(\theta_0)$, even though $\theta_T$ and $\theta_0$ are growing infinitely far apart. To rule this case out, we need to impose a restriction on the behavior of $Q_0$ in the tails. This can be accomplished either by imposing restrictions on the admissible parameter space or by restricting $Q_0$ directly. For example, if it is required that

$$\inf\{Q_0(\theta) - Q_0(\theta_0) : \theta \in \Theta, \|\theta - \theta_0\| > \rho\} > 0,$$

then $Q_0(\theta)$ cannot asymptote to $Q_0(\theta_0)$, for $\theta$ far away from $\theta_0$, and convergence of $\theta_T$ to $\theta_0$ is ensured. This condition is satisfied by the least-squares

criterion function for linear models. For nonlinear models, potentially undesirable behavior in the tails is typically ruled out by assuming that $\Theta$ is compact (the tails are chopped off).

[Figure: $Q_T$ converging to an asymptoting $Q_0$.]

With these observations as background, we next provide a primitive set of assumptions that assure the strong consistency of $\theta_T$ for $\theta_0$. As noted earlier, most of the criterion functions we will examine can be expressed as sample means of functions $g(z_t, \theta)$, or are simple functions of such sample means (e.g., a quadratic form). Accordingly, we first present sufficient conditions (beyond stationarity and ergodicity) for the convergence of

$$G_T(\theta) = \frac{1}{T} \sum_{t=1}^{T} g(z_t, \theta)$$

to $E[g(z_t, \theta)]$ almost surely, uniformly in $\theta$. Our first assumption rules out bad behavior in the tails and the second states that the function $g(z_t, \theta)$ has a finite mean for all $\theta$:

Assumption. $\Theta$ is a compact metric space.

Assumption. The function $g(\cdot, \theta)$ is Borel measurable for each $\theta$ in $\Theta$; $Eg(z_t, \theta)$ exists and is finite for all $\theta$ in $\Theta$.

We will also need a stronger notion of continuity of $g(z_t, \theta)$. Let

$$\epsilon_t(\theta, \delta) = \sup\{|g(z_t, \theta) - g(z_t, \alpha)| : \alpha \in \Theta,\ \|\alpha - \theta\| < \delta\}.$$

Definition. The random function $g(z_t, \theta)$ is first-moment continuous at $\theta$ if $\lim_{\delta \to 0} E[\epsilon_t(\theta, \delta)] = 0$.

Footnote: Compactness guarantees that $\Theta$ has a countable dense subset. Hence, under the preceding assumptions, the function $\epsilon_t(\theta, \delta)$ is Borel measurable (it can be represented as the almost sure supremum of a countable collection of Borel measurable functions).

First-moment continuity of $g(z_t, \theta)$ is a joint property of the function $g$ and the random vector $z_t$. Under the preceding assumptions, if $g(z_t, \theta)$ is first-moment continuous at $\theta$ for some $t$, then $g(z_t, \theta)$ is first-moment continuous for every $t$.

Assumption. The random function $g(z_t, \theta)$ is first-moment continuous at all $\theta \in \Theta$.

The measure of distance between $G_T$ and $E[g(z_t, \cdot)]$ we are concerned with is

$$\rho_T = \sup_{\theta \in \Theta} \|G_T(\theta) - Eg(z_t, \theta)\|.$$

Using the compactness of $\Theta$ and the continuity of $g$, it can be shown that $\{\rho_T : T \geq 1\}$ converges almost surely to zero. The proof proceeds as follows: Let $\{\theta_i : i \geq 1\}$ be a countable dense subset of $\Theta$. The distance between $G_T(\theta)$ and $Eg(z_t, \theta)$ satisfies the following inequality:

$$\|G_T(\theta) - Eg(z_t, \theta)\| \leq \|G_T(\theta) - G_T(\theta_i)\| + \|G_T(\theta_i) - Eg(z_t, \theta_i)\| + \|Eg(z_t, \theta_i) - Eg(z_t, \theta)\|.$$

For all $\theta$, the first term on the right-hand side can be made arbitrarily small by choosing $\theta_i$ such that $\|\theta_i - \theta\|$ is small (because the $\theta_i$ are a dense subset of $\Theta$) and then using ergodicity and the uniform continuity of $g(z_t, \theta)$ (uniform continuity follows from the compactness and first-moment continuity assumptions). The second term can be made arbitrarily small for large enough $T$ by ergodicity. Finally, the last term can be made small by exploiting the uniform continuity of $g$. The following theorem summarizes this result, a formal proof of which is provided in Hansen (00).

Theorem (Hansen, 1982b). Suppose the preceding assumptions are satisfied. Then $\{\rho_T : T \geq 1\}$ converges almost surely to zero.

Consistency of Extremum Estimators

Equipped with this uniform convergence theorem, the strong consistency of the extremum estimators discussed earlier can be established.

Maximum Likelihood Estimators

Suppose that the functional form of the density function of $y_t$ conditioned on $y_{t-1}^J \equiv (y_{t-1}, \ldots, y_{t-J})$, $f(y_t \mid y_{t-1}^J; \beta)$, is known for all $t$. Let $Q_0(\beta) = E[\log f(y_t \mid y_{t-1}^J; \beta)]$

denote the population criterion function and suppose that $\beta_0$, the parameter vector of the data-generating process for $y_t$, is a maximizer of $Q_0(\beta)$. To show the uniqueness of $\beta_0$ as a maximizer of $Q_0(\beta)$, required by Condition (iii) of the consistency theorem, we use Jensen's inequality to obtain

$$E\left[\log \frac{f\big(y_t \mid y_{t-1}^J; \beta\big)}{f\big(y_t \mid y_{t-1}^J; \beta_0\big)}\right] < \log E\left[\frac{f\big(y_t \mid y_{t-1}^J; \beta\big)}{f\big(y_t \mid y_{t-1}^J; \beta_0\big)}\right], \quad \beta \neq \beta_0.$$

The right-hand side of this inequality is zero (by the law of iterated expectations) because

$$\int \frac{f\big(y_t \mid y_{t-1}^J; \beta\big)}{f\big(y_t \mid y_{t-1}^J; \beta_0\big)}\, f\big(y_t \mid y_{t-1}^J; \beta_0\big)\, dy_t = 1.$$

Therefore,

$$E\big[\log f\big(y_t \mid y_{t-1}^J; \beta\big)\big] < E\big[\log f\big(y_t \mid y_{t-1}^J; \beta_0\big)\big], \quad \text{if } \beta \neq \beta_0,$$

and $\beta_0$ is the unique maximizer of $Q_0(\beta)$.

The approximate sample log-likelihood function is

$$l_T(\beta) = \frac{1}{T} \sum_{t=J+1}^{T} \log f\big(y_t \mid y_{t-1}^J; \beta\big).$$

Thus, setting $z_t \equiv (y_t, y_{t-1}^J)$ and

$$g(z_t, \beta) = \log f\big(y_t \mid y_{t-1}^J; \beta\big),$$

$G_T$ in the preceding section becomes the log-likelihood function. If the preceding assumptions are satisfied, then the uniform convergence theorem implies the almost sure, uniform convergence of the sample log-likelihood function to $Q_0(\beta)$.

Generalized Method of Moments Estimators

The GMM criterion function is based on the model-implied $M$-vector of moment conditions $E[h(z_t, \theta_0)] = 0$. With use of the sample counterpart to this expectation, the sample and population criterion functions are constructed as quadratic forms with distance matrices $W_T$ and $W_0$, respectively:

Footnote: See DeGroot (1970) for a discussion of the use of first-moment continuity of $\log f(y_t \mid y_{t-1}^J; \beta)$ in proving the strong consistency of ML estimators. DeGroot refers to first-moment continuity as supercontinuity.

$$Q_T(\theta) = H_T(z_T, \theta)'\, W_T\, H_T(z_T, \theta), \qquad Q_0(\theta) = H_0(\theta)'\, W_0\, H_0(\theta),$$

where $H_T(z_T, \theta) = \frac{1}{T}\sum_{t=1}^{T} h(z_t, \theta)$ and $H_0(\theta) = E[h(z_t, \theta)]$. Since $H_0(\theta)$ is zero at $\theta_0$, the function $Q_0(\cdot)$ achieves its minimum (zero) at $\theta_0$. To apply the consistency theorem to these criterion functions we impose an additional assumption.

Assumption. $\{W_T : T \geq 1\}$ is a sequence of $M \times M$ positive semidefinite matrices of random variables with elements that converge almost surely to the corresponding elements of the $M \times M$ constant, positive semidefinite matrix $W_0$ with $\mathrm{rank}(W_0) \geq K$.

In addition, we let

$$\rho_T^* = \sup\{|Q_T(\theta) - Q_0(\theta)| : \theta \in \Theta\}$$

denote the maximum error in approximating $Q_0$ by its sample counterpart $Q_T$. The following lemma shows that the preceding assumptions are sufficient for this approximation error to converge almost surely to zero.

Lemma. Suppose the preceding assumptions are satisfied. Then $\{\rho_T^* : T \geq 1\}$ converges almost surely to zero.

Proof. Repeated application of the Triangle and Cauchy-Schwarz inequalities gives

$$|Q_T(\theta) - Q_0(\theta)| \leq \|H_T(\theta) - H_0(\theta)\| \|W_T\| \|H_T(\theta)\| + \|H_0(\theta)\| \|W_T - W_0\| \|H_T(\theta)\| + \|H_0(\theta)\| \|W_0\| \|H_T(\theta) - H_0(\theta)\|,$$

where $\|W\| = \mathrm{tr}(W'W)^{1/2}$. Therefore, letting $\phi_0 = \max\{\|H_0(\theta)\| : \theta \in \Theta\}$ and $\rho_T \equiv \sup\{\|H_T(\theta) - H_0(\theta)\| : \theta \in \Theta\}$,

$$0 \leq \rho_T^* \leq \rho_T \|W_T\| [\phi_0 + \rho_T] + \phi_0 \|W_T - W_0\| [\phi_0 + \rho_T] + \phi_0 \|W_0\| \rho_T.$$

Since $h(z_t, \theta)$ is first-moment continuous, $H_0(\theta)$ is a continuous function of $\theta$. Therefore, $\phi_0$ is finite because a continuous function on a compact set achieves its maximum. The uniform convergence theorem implies that $\rho_T$ converges almost surely to zero. Since each of the three terms on the right-hand side of the last inequality converges almost surely to zero, it follows that $\{\rho_T^* : T \geq 1\}$ converges almost surely to zero.
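As a concrete, hypothetical instance of this construction (the moment function and parameter values below are invented for illustration), the sketch estimates the mean $m$ and variance $v$ of an i.i.d. sample from the exactly identified moment conditions $h(z_t, \theta) = \big(z_t - m,\ (z_t - m)^2 - v\big)'$, minimizing $Q_T(\theta) = H_T(\theta)' W_T H_T(\theta)$ with $W_T = I$ by grid search over a compact $\Theta$, in the spirit of the compactness assumption:

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(2.0, 1.5, size=50_000)     # true mean 2.0, true variance 2.25

# Sample moments entering H_T(theta) = (1/T) sum h(z_t, theta):
zbar, z2bar = z.mean(), (z ** 2).mean()

# Compact grid for Theta, chosen around plausible values:
m_grid = np.linspace(1.5, 2.5, 201)
v_grid = np.linspace(1.75, 2.75, 201)
M, V = np.meshgrid(m_grid, v_grid)

H1 = zbar - M                              # sample mean of z_t - m
H2 = z2bar - 2.0 * M * zbar + M ** 2 - V   # sample mean of (z_t - m)^2 - v
Q = H1 ** 2 + H2 ** 2                      # Q_T = H_T' W_T H_T with W_T = I

i, j = np.unravel_index(int(np.argmin(Q)), Q.shape)
m_hat, v_hat = M[i, j], V[i, j]
print(m_hat, v_hat)                        # near (2.0, 2.25)
```

Because the system is exactly identified ($M = K$), the minimized criterion is essentially zero and the choice of $W_T$ does not matter; the weighting matrix becomes consequential in the overidentified case.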

When this result is combined with the consistency and uniform convergence theorems, it follows that the GMM estimator $\{\theta_T : T \geq 1\}$ converges almost surely to $\theta_0$.

QML Estimators

Key to the consistency of QML estimators is verifying that the population moment equation based on the normal likelihood function is satisfied at $\theta_0$. As noted earlier, this is generally true if the functional forms of the conditional mean and variance of $y_t$ are correctly specified (the moments implied by a DAPM are those in the probability model generating $y_t$). It is informative to verify that this condition is satisfied at $\theta_0$ for the interest rate example. This discussion is, in fact, generic to any one-dimensional state process $y_t$, since it does not depend on the functional forms of the conditional mean $\mu_{rt}$ or variance $\sigma_{rt}^2$. Extensions to the multivariate case, with some increase in notational complexity, are immediate (see, e.g., Bollerslev and Wooldridge, 1992). Recalling the first-order conditions for the normal quasi-likelihood, the limit of the middle term on the right-hand side is

$$\frac{1}{T}\sum_{t} \frac{(r_t - \hat{\mu}_{rt})^2}{2\hat{\sigma}_{rt}^4} \frac{\partial \hat{\sigma}_{rt}^2}{\partial \theta_j} \to E\left[\frac{(r_t - \mu_{rt})^2}{2\sigma_{rt}^4} \frac{\partial \sigma_{rt}^2}{\partial \theta_j}\right].$$

Using the law of iterated expectations, we find that this expectation simplifies as

$$E\left[\frac{(r_t - \mu_{rt})^2}{2\sigma_{rt}^4} \frac{\partial \sigma_{rt}^2}{\partial \theta_j}\right] = E\left[E\left[\frac{(r_t - \mu_{rt})^2}{\sigma_{rt}^2} \,\Big|\, r_{t-1}\right] \frac{1}{2\sigma_{rt}^2} \frac{\partial \sigma_{rt}^2}{\partial \theta_j}\right] = E\left[\frac{1}{2\sigma_{rt}^2} \frac{\partial \sigma_{rt}^2}{\partial \theta_j}\right].$$

This expectation is seen to be minus the limit of the first term in the first-order conditions, so the first and second terms cancel. Thus, for the population first-order conditions to have a zero at $\theta_0$, it remains to show that the limit of the last term, evaluated at $\theta_0$, is zero. This limit is

$$E\left[\frac{(r_t - \mu_{rt})}{\sigma_{rt}^2} \frac{\partial \mu_{rt}}{\partial \theta_j}\right],$$

which is indeed zero, because $E[r_t - \mu_{rt} \mid r_{t-1}] = 0$ by construction and all of the other terms are constant conditional on $r_{t-1}$.
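The cancellation argument above can be checked by simulation. The sketch below (an invented AR(1) model with constant conditional variance; parameter values and names are hypothetical) generates data with deliberately non-Gaussian errors, then verifies that the normal-quasi-likelihood score terms still average to zero at the true parameters, because the conditional mean and variance are correctly specified:

```python
import numpy as np

rng = np.random.default_rng(4)
a0, b0, c0 = 0.1, 0.8, 0.25        # true mean parameters (a, b) and variance c
T = 200_000

# Non-Gaussian errors with mean 0 and variance 1 (uniform on [-sqrt(3), sqrt(3)]):
e = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=T)
r = np.empty(T)
r[0] = a0 / (1.0 - b0)             # start at the unconditional mean
for t in range(1, T):
    r[t] = a0 + b0 * r[t - 1] + np.sqrt(c0) * e[t]

# Normal quasi-log-likelihood score terms, evaluated at the true parameters:
resid = r[1:] - (a0 + b0 * r[:-1])
score_a = resid / c0                                  # d log f / d a
score_c = -0.5 / c0 + 0.5 * resid ** 2 / c0 ** 2      # d log f / d c
print(score_a.mean(), score_c.mean())                 # both near zero
```

The variance score averages to zero even though the likelihood is misspecified (the errors are uniform, not normal), which is exactly the mechanism behind QML consistency.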

Consistency of the QML estimator then follows under the regularity conditions of the consistency theorem.

Asymptotic Normality of Extremum Estimators

The consistency of $\theta_T$ for $\theta_0$ implies that the limiting distribution of $\theta_T$ is degenerate at $\theta_0$. For the purpose of conducting inference about the population value $\theta_0$ of $\theta$, we would like to know the distribution of $\theta_T$ for finite $T$. This distribution is generally not known, but often it can be reliably approximated using the limiting distribution of $\sqrt{T}(\theta_T - \theta_0)$ obtained by a central limit theorem. Applicable central limit theorems have been proven under a wide variety of regularity conditions. We continue our focus on stationary and ergodic economic environments.

Suppose that $\theta_T$ is strongly consistent for $\theta_0$. To show the asymptotic normality of $\theta_T$, we focus on the first-order conditions for the maximization or minimization of $Q_T$, the sample mean of the function $D_0(z_t; \theta)$ introduced earlier. More precisely, we let

$$h(z_t, \theta) = \begin{cases} \partial \log f\big(y_t \mid y_{t-1}^J; \theta\big)/\partial\theta & \text{for the ML estimator}, \\ h(z_t, \theta) & \text{for the GMM estimator}, \\ (y_t - x_t'\theta)\, x_t & \text{for the LLP estimator}. \end{cases}$$

In each case, by appropriate choice of $z_t$ and $\theta$, $E[h(z_t, \theta_0)] = 0$. Thus, the function $D_0(z_t; \theta)$, representing the first-order conditions for $Q_0$, is

$$D_0(z_t; \theta) = A_0\, h(z_t; \theta),$$

where the $K \times M$ matrix $A_0$ is

$$A_0 = \begin{cases} I_K & \text{for the ML estimator}, \\ E[\partial h(z_t, \theta_0)/\partial \theta']'\, W_0 & \text{for the GMM estimator}, \\ I_K & \text{for the LLP estimator}, \end{cases}$$

where $I_K$ denotes the $K \times K$ identity matrix. The choice of $A_0$ for the GMM estimator is motivated subsequently as part of the proof of the asymptotic normality theorem. Using this notation and letting

$$H_T(\theta) = \frac{1}{T} \sum_{t=1}^{T} h(z_t, \theta),$$

we can view all of these estimators as special cases of the following definition of a GMM estimator (Hansen, 1982b).

Definition. The GMM estimator $\{\theta_T : T \geq 1\}$ is a sequence of random vectors that converges in probability to $\theta_0$ for which $\{\sqrt{T}\, A_T H_T(\theta_T) : T \geq 1\}$ converges in probability to zero, where $\{A_T\}$ is a sequence of $K \times M$ matrices converging in probability to the full-rank matrix $A_0$.

For a sequence of random variables $\{X_T\}$, convergence in distribution is defined as follows.

Definition. Let $F_1, F_2, \ldots$ be the distribution functions of the random variables $X_1, X_2, \ldots$. Then the sequence $\{X_T\}$ converges in distribution to $X$ (denoted $X_T \Rightarrow X$) if and only if $F_T(b) \to F_X(b)$ for all $b$ at which $F_X$ is continuous.

The classical central limit theorem examines the partial sums $S_T = (1/\sqrt{T}) \sum_t (X_t - \mu)$ of an independently and identically distributed process $\{X_t\}$ with mean $\mu$ and finite variance. Under these assumptions, the distribution of $S_T$ converges to that of a normal with mean zero and covariance matrix $\mathrm{Var}[X_t]$. However, for the study of asset pricing models, the assumption of independence is typically too strong. It rules out, in particular, persistence in the state variables and time-varying conditional volatilities. The assumption that $\{X_t\}$ is a stationary and ergodic time series, which is much weaker than the i.i.d. assumption in the classical model, is not sufficient to establish a central limit theorem. Essentially, the problem is that an ergodic time series can be highly persistent, so that $X_t$ and $X_s$, for $s \neq t$, are too highly correlated for $S_T$ to converge to a normal random vector. The assumption of independence in the classical central limit theorem avoids this problem by assuming away any temporal dependence. Instead, we will work with the much weaker assumption that $\{X_t\}$ is a Martingale Difference Sequence (MDS), meaning that

$$E[X_t \mid X_{t-1}, X_{t-2}, \ldots] = 0$$
with probability one. The assumption that $X_t$ is mean-independent of its past imposes sufficient structure on the dependence of $\{X_t\}$ for the following central limit theorem to be true.

Theorem (Billingsley, 1961). Let $\{X_t\}_{t=1}^{\infty}$ be a stationary and ergodic MDS such that $EX_1^2$ is finite. Then the distribution of $(1/\sqrt{T}) \sum_{t=1}^{T} X_t$ approaches the normal distribution with mean zero and variance $EX_1^2$.
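Billingsley's theorem can be illustrated with a process that is serially dependent yet mean-independent of its past (an invented example, not from the text): $X_t = \varepsilon_t \varepsilon_{t-1}$ with i.i.d. standard normal $\varepsilon$ is a stationary, ergodic MDS, since $E[X_t \mid \text{past}] = \varepsilon_{t-1} E[\varepsilon_t] = 0$, with $EX_t^2 = 1$. The scaled sums are then approximately $N(0, 1)$:

```python
import numpy as np

rng = np.random.default_rng(11)

def scaled_sum(T):
    """(1/sqrt(T)) * sum of X_t, where X_t = eps_t * eps_{t-1} is an MDS."""
    eps = rng.normal(size=T + 1)
    x = eps[1:] * eps[:-1]        # dependent (adjacent X's share an eps) but an MDS
    return x.sum() / np.sqrt(T)

# Monte Carlo distribution of the scaled sums:
draws = np.array([scaled_sum(2_000) for _ in range(5_000)])
print(draws.mean(), draws.var())   # near 0 and near E[X_t^2] = 1
```

Note that independence fails here (adjacent terms share a common $\varepsilon$), which is exactly the extra generality the MDS assumption provides over the classical CLT.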

Though many financial time series are not MDSs, it will turn out that they can be expressed as moving averages of an MDS, and this will be shown to be sufficient for our purposes. Equipped with Billingsley's theorem, under the following conditions we can prove that the GMM estimator is asymptotically normal.

Theorem (Hansen, 1982b). Suppose that
(i) $\{z_t\}$ is stationary and ergodic.
(ii) $\Theta$ is an open subset of $\mathbb{R}^K$.
(iii) $h$ is a measurable function of $z_t$ for all $\theta \in \Theta$,
$$d_0 \equiv E\left[\frac{\partial h}{\partial \theta}(z_t, \theta_0)\right]$$
is finite and has full rank, and $\partial h/\partial \theta$ is first-moment continuous at all $\theta \in \Theta$.
(iv) $\theta_T$ is a GMM estimator of $\theta_0$.
(v) $\sqrt{T}\, H_T(z_T, \theta_0) \Rightarrow N(0, \Sigma_0)$, where $\Sigma_0 = \lim_{T \to \infty} E[T\, H_T(\theta_0) H_T(\theta_0)']$.
(vi) $A_T$ converges in probability to $A_0$, a constant matrix of full rank, and $A_0 d_0$ has full rank.
Then $\sqrt{T}(\theta_T - \theta_0) \Rightarrow N(0, \Omega_0)$, where

$$\Omega_0 = (A_0 d_0)^{-1} A_0 \Sigma_0 A_0' (d_0' A_0')^{-1}.$$

In proving this theorem, we will need the following very useful lemma.

Lemma. Suppose that $\{z_t\}$ is stationary and ergodic and the function $g(z_t, \theta)$ satisfies: (a) $E[g(z_t, \theta_0)]$ exists and is finite, (b) $g$ is first-moment continuous at $\theta_0$; and suppose that $\theta_T$ converges to $\theta_0$ in probability. Then $(1/T) \sum_{t=1}^{T} g(z_t, \theta_T)$ converges to $E[g(z_t, \theta_0)]$ in probability.

Proof of the theorem. When we apply Taylor's theorem on a coordinate-by-coordinate basis,

$$H_T(\theta_T) = H_T(\theta_0) + G_T(\theta_T^*)(\theta_T - \theta_0),$$

where $\theta_T^*$ is a $K \times M$ matrix with the $m$th column, $\theta_{Tm}^*$, satisfying $\|\theta_{Tm}^* - \theta_0\| \leq \|\theta_T - \theta_0\|$, for $m = 1, \ldots, M$, and the $ij$th element of the $M \times K$ matrix $G_T(\theta_T^*)$ is the $j$th

element of the $K$-vector $\partial H_T^i(\theta_T^{*i})/\partial \theta$. The matrix $G_T(\theta_T^*)$ converges in probability to the matrix $d_0$ by the preceding lemma. Furthermore, since $\sqrt{T}\, A_T H_T(\theta_T)$ converges in probability to zero, $\sqrt{T}(\theta_T - \theta_0)$ and $[-(A_0 d_0)^{-1} A_0 \sqrt{T}\, H_T(\theta_0)]$ have the same limiting distribution. Finally, from (v) it follows that $\sqrt{T}(\theta_T - \theta_0)$ is asymptotically normal with mean zero and covariance matrix $(A_0 d_0)^{-1} A_0 \Sigma_0 A_0' (d_0' A_0')^{-1}$.

A key assumption of the theorem is Condition (v), as it takes us a long way toward the desired result. Prior to discussing applications of this theorem, it will be instructive to discuss more primitive conditions for Condition (v) to hold and to characterize $\Sigma_0$. Letting $I_t$ denote the information set generated by $\{z_t, z_{t-1}, \ldots\}$, and $h_t \equiv h(z_t; \theta_0)$, we begin with the special case (where ACh is shorthand for autocorrelation in $h$):

Case ACh(0). $E[h_t \mid I_{t-1}] = 0$. Since $I_{t-1}$ includes $h_s$, for $s \leq t - 1$, $\{h_t\}$ is an MDS. Thus, the MDS central limit theorem (CLT) applies directly and implies Condition (v) with

$$\Sigma_0 = E[h_t h_t'].$$

Case ACh(n−1). $E[h_{t+n} \mid I_t] = 0$, for some $n \geq 1$. When $n > 1$, this case allows for serial correlation in the process $h_t$ up to order $n - 1$. We cannot apply the MDS CLT directly in this case because it presumes that $h_t$ is an MDS. However, it turns out that we can decompose $h_t$ into a finite sum of terms that do follow an MDS, and then Billingsley's CLT can be applied. Toward this end, $h_t$ is written as

$$h_t = \sum_{j=0}^{n-1} u_{t,j},$$

where $u_{t,j} \in I_{t-j}$ and satisfies the property that $E[u_{t,j} \mid I_{t-j-1}] = 0$. This representation follows from the observation that

$$h_t = E[h_t \mid I_{t-1}] + u_{t,0} = E[h_t \mid I_{t-2}] + u_{t,0} + u_{t,1} = \cdots = \sum_{j=0}^{n-1} u_{t,j},$$

where the law of iterated expectations has been used repeatedly, and the final step uses $E[h_t \mid I_{t-n}] = 0$. Thus,

    (1/√T) Σ_{t=1}^T h_t = (1/√T) Σ_{t=1}^T Σ_{j=0}^{n−1} u_{t,j}.

Combining terms for which t − j is the same (and, hence, that reside in the same information set) and defining

    u_t* = Σ_{j=0}^{n−1} u_{t+j,j}

gives

    (1/√T) Σ_{t=1}^T h_t = (1/√T) Σ_t u_t* + V_T^n,

where V_T^n involves a fixed number of the u_{t,j} depending only on n, for all T. Since V_T^n converges to zero in probability as T → ∞, we can focus on the sample mean of u_t* in deriving the limiting distribution of the sample mean of h_t. The series {u_t*} is an MDS. Thus, Billingsley's theorem implies that

    (1/√T) Σ_t u_t* ⇒ N(0, Σ_0),    Σ_0 = E[u_t* u_t*'].

Moreover, substituting the scaled sample mean of the h_t for the scaled sample mean of the u_t* gives

    Σ_0 = lim_{T→∞} E[((1/√T) Σ_{t=1}^T h_t)((1/√T) Σ_{t=1}^T h_t)']
        = lim_{T→∞} Σ_{j=−n+1}^{n−1} ((T − |j|)/T) E[h_t h_{t−j}']
        = Σ_{j=−n+1}^{n−1} E[h_t h_{t−j}'].

In words, the asymptotic covariance matrix of the scaled sample mean of h_t is the sum of the autocovariances of h_t out to order n − 1.
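As a purely illustrative check (not from the text), this sum-of-autocovariances result can be verified numerically for a simple moving-average process. For h_t = ε_t + φ ε_{t−1} with ε_t i.i.d. (so E[h_{t+2} | I_t] = 0, i.e., Case ACh(1) with n = 2), the sum of autocovariances is Γ(0) + 2Γ(1) = (1 + φ)² σ².

```python
import numpy as np

# h_t = eps_t + phi * eps_{t-1}: an MA(1), Case ACh(1) with n = 2.
# Long-run variance = Gamma(0) + 2*Gamma(1)
#                   = (1 + phi^2) sigma^2 + 2 phi sigma^2 = (1 + phi)^2 sigma^2.
rng = np.random.default_rng(0)
phi, sigma, T = 0.5, 1.0, 200_000

eps = rng.normal(0.0, sigma, T + 1)
h = eps[1:] + phi * eps[:-1]

# Sample autocovariances out to the known order n - 1 = 1
gamma0 = np.mean(h * h)
gamma1 = np.mean(h[1:] * h[:-1])
sigma0_hat = gamma0 + 2.0 * gamma1            # estimate of Sigma_0
sigma0_true = (1.0 + phi) ** 2 * sigma ** 2   # analytic value: 2.25

print(sigma0_hat, sigma0_true)
```

With T = 200,000 draws, the estimated sum of autocovariances lands close to the analytic value 2.25, while the naive MDS formula E[h_t h_t'] = (1 + φ²)σ² = 1.25 would understate the asymptotic variance of the scaled sample mean.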

Case ACh(∞). E[h_t h_{t−s}'] ≠ 0, for all s.

Since, in Case ACh(n − 1), n − 1 is the order of the last nonzero autocovariance of h_t, the finite-sum expression for Σ_0 can be rewritten equivalently as

    Σ_0 = Σ_{j=−∞}^{∞} E[h(z_t, θ_0) h(z_{t−j}, θ_0)'].

This suggests that, for the case where E[h_t h_{t−s}'] ≠ 0 for all s (i.e., n = ∞), this expression holds as well. Hansen (1982b) shows that this is indeed the case under the additional assumption that the autocovariance matrices of h_t are absolutely summable.

Distributions of Specific Estimators

In applying the asymptotic normality theorem, it must be verified that the problem of interest satisfies Conditions (iii), (iv), and (v). We next discuss some of the implications of these conditions for the cases of the ML, GMM, and LLP criterion functions. In addition, we examine the form of the asymptotic covariance matrix Ω_0 implied by these criterion functions, and discuss consistent estimators of Ω_0.

Maximum Likelihood Estimation

In the case of ML estimation, we proved in an earlier chapter that

    E[D_0(y_t, y_{t−1}^J; β_0) | y_{t−1}^J] = 0,    where    D_0(y_t, y_{t−1}^J; β) = ∂ log f(y_t | y_{t−1}^J; β)/∂β.

Since the density of y_t conditioned on y_{t−1}^J is, by assumption, the same as the density conditioned on the entire past, this conditional moment restriction implies that the score is an MDS. Therefore, the asymptotic normality theorem and Case ACh(0) apply, and

    √T H_T(θ_0) = (1/√T) Σ_{t=1}^T ∂ log f(y_t | y_{t−1}^J; β_0)/∂β

converges in distribution to a normal random vector with asymptotic covariance matrix

    Σ_0 = E[ (∂ log f(y_t | y_{t−1}^J; β_0)/∂β)(∂ log f(y_t | y_{t−1}^J; β_0)/∂β)' ].

[Footnote: In deriving this result, we implicitly assumed that we could reverse the order of integration and differentiation. Formally, this is justified by the assumption that the partial derivative of log f(y_t | y_{t−1}^J; β) is first-moment continuous at β_0. More precisely, consider a function h(z, θ). Suppose that for some δ > 0, the partial derivative ∂h(z, θ)/∂θ exists for all values of z and all θ such that ‖θ − θ_0‖ < δ, and suppose that this derivative is first-moment continuous at θ_0. If E[h(z, θ)] exists for all ‖θ − θ_0‖ < δ and if E[‖∂h(z, θ)/∂θ‖] < ∞, then ∂E[h(z, θ)]/∂θ evaluated at θ_0 equals E[∂h(z, θ)/∂θ] evaluated at θ_0.]

Furthermore, the first-order conditions to the log-likelihood function give K equations in the K unknowns β, so A_T is I_K in this case and Ω_0^ML = d_0^{-1} Σ_0 (d_0')^{-1}. Thus, it remains to determine d_0. Since E[D_0(z_t, β_0)] = 0, differentiating both sides of this expression with respect to β gives

    d_0^ML = E[∂² log f(y_t | y_{t−1}^J; β_0)/∂β∂β']
           = −E[ (∂ log f(y_t | y_{t−1}^J; β_0)/∂β)(∂ log f(y_t | y_{t−1}^J; β_0)/∂β)' ].

When we combine these expressions and use the fact that if X ~ N(0, Ω_X), then AX ~ N(0, A Ω_X A'), it follows that

    √T (b_ML − β_0) ⇒ N(0, −E[∂² log f(y_t | y_{t−1}^J; β_0)/∂β∂β']^{-1}).

In actual implementations of ML estimation, this asymptotic covariance matrix is replaced by its sample counterpart. From the second equality above it follows that this matrix can be estimated either as the inverse of the sample mean of the outer product of the likelihood scores or as minus the inverse of the sample mean of the second-derivative matrix evaluated at b_ML,

    [ −(1/T) Σ_{t=1}^T ∂² log f(y_t | y_{t−1}^J; b_ML)/∂β∂β' ]^{-1}.

[Footnote: The second equality in the expression for d_0^ML is an important property of conditional density functions that follows from the conditional moment restriction on the score. By definition,

    0 = ∫ [∂ log f(y_t | y_{t−1}^J; β_0)/∂β] f(y_t | y_{t−1}^J; β_0) dy_t.

Differentiating under the integral sign and using the chain rule gives

    0 = E[∂² log f(y_t | y_{t−1}^J; β_0)/∂β∂β' | y_{t−1}^J]
      + E[ (∂ log f(y_t | y_{t−1}^J; β_0)/∂β)(∂ log f(y_t | y_{t−1}^J; β_0)/∂β)' | y_{t−1}^J ].]
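The equivalence of the two sample estimators can be illustrated with a simple sketch (not from the text) using i.i.d. draws from N(μ, σ²), for which the information matrix is diag(1/σ², 1/(2σ⁴)). Both the average outer product of the analytic scores and minus the average Hessian approximate it in large samples.

```python
import numpy as np

# Information-matrix equality for i.i.d. N(mu, s2):
#   (1/T) sum score*score'  ~  -(1/T) sum Hessian  ~  diag(1/s2, 1/(2 s2^2))
rng = np.random.default_rng(5)
mu0, s20, T = 1.0, 2.0, 500_000
y = rng.normal(mu0, np.sqrt(s20), T)
e = y - mu0

# Analytic scores of log f(y; mu, s2), evaluated at the true parameters
score = np.column_stack([e / s20, -0.5 / s20 + e**2 / (2 * s20**2)])
outer = score.T @ score / T

# Analytic second derivatives, averaged over the sample
h11 = -1.0 / s20
h12 = np.mean(-e / s20**2)
h22 = np.mean(0.5 / s20**2 - e**2 / s20**3)
neg_hess = -np.array([[h11, h12], [h12, h22]])

info = np.diag([1.0 / s20, 0.5 / s20**2])  # Fisher information, analytic
print(outer, neg_hess, info)
```

Both `outer` and `neg_hess` converge to `info` as T grows; in a correctly specified model either one can be inverted to estimate the asymptotic covariance of the ML estimator.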

Assuming that the regularity conditions for the lemma are satisfied by the likelihood score, we see that this sample counterpart converges to the asymptotic covariance matrix of b_ML as T → ∞. The asymptotic covariance matrix of b_ML is the Cramér-Rao lower bound, the inverse of the information matrix (minus the expected Hessian of the log-likelihood). This suggests that, even though the ML estimator may be biased in small samples, as T gets large, the ML estimator is the most efficient estimator in the sense of having the smallest asymptotic covariance matrix among all consistent estimators of β_0. This is indeed the case, and we present a partial proof of this result in the section on relative efficiency below.

GMM Estimation

The asymptotic normality theorem applies directly to the case of GMM estimators. The GMM estimator minimizes the quadratic form H_T(θ)' W_T H_T(θ), so the regularity conditions require that h(z_t, θ) be differentiable, that ∂h(z_t, θ)/∂θ be first-moment continuous at θ_0, and that W_T converge in probability to a constant, positive-semidefinite matrix W_0. The first-order conditions to this minimization problem are

    [∂H_T(θ_T)/∂θ]' W_T H_T(θ_T) = 0.

Therefore, the A_T implied by the GMM criterion function is

    A_T = [∂H_T(θ_T)/∂θ]' W_T.

By the lemma and the assumption that W_T converges to W_0, it follows that A_T converges in probability to A_0 = d_0' W_0. Substituting this expression into the general formula for Ω_0, we conclude that √T (θ_T − θ_0) converges in distribution to a normal with mean zero and covariance matrix

    Ω_0^GMM = (d_0' W_0 d_0)^{-1} d_0' W_0 Σ_0 W_0 d_0 (d_0' W_0 d_0)^{-1}.

If the probability limit of the distance matrix defining the GMM criterion function is chosen to be W_0 = Σ_0^{-1}, then this simplifies to

    Ω_0^GMM = (d_0' Σ_0^{-1} d_0)^{-1}.

We show in the section on relative efficiency that this choice of distance matrix is the optimal choice among GMM estimators constructed from linear combinations of the moment equation E[h(z_t, θ_0)] = 0. A consistent estimator of Ω_0^GMM is constructed by replacing all of the matrices in these expressions by their sample counterparts.
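Both covariance formulas are simple matrix algebra, so the simplification under W_0 = Σ_0^{-1} (and the efficiency cost of an arbitrary W_0) can be checked directly. The sketch below, an illustration rather than anything from the text, builds the sandwich from arbitrary full-rank matrices standing in for d_0 and Σ_0.

```python
import numpy as np

def gmm_cov(d0, W0, S0):
    """Sandwich covariance (d' W d)^{-1} d' W S W d (d' W d)^{-1}."""
    B = np.linalg.inv(d0.T @ W0 @ d0)
    return B @ d0.T @ W0 @ S0 @ W0 @ d0 @ B

rng = np.random.default_rng(1)
M, K = 5, 3
d0 = rng.normal(size=(M, K))         # stand-in for E[dh/dtheta'], full-rank M x K
A = rng.normal(size=(M, M))
S0 = A @ A.T + np.eye(M)             # a positive-definite stand-in for Sigma_0

W_arb = np.eye(M)                    # an arbitrary distance matrix
W_opt = np.linalg.inv(S0)            # the optimal choice W_0 = Sigma_0^{-1}

cov_arb = gmm_cov(d0, W_arb, S0)
cov_opt = gmm_cov(d0, W_opt, S0)
cov_eff = np.linalg.inv(d0.T @ np.linalg.inv(S0) @ d0)  # (d' S^{-1} d)^{-1}

# With W_opt the sandwich collapses to the efficient form, and the
# arbitrary-W covariance exceeds it by a positive-semidefinite matrix.
eig_min = np.linalg.eigvalsh(cov_arb - cov_eff).min()
print(np.allclose(cov_opt, cov_eff), eig_min)
```

The positive-semidefiniteness of `cov_arb - cov_eff` is the matrix sense in which W_0 = Σ_0^{-1} is optimal, anticipating the efficiency discussion below.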

The matrix W_0 is estimated by W_T, the matrix used to construct the GMM criterion function, and d_0 is replaced by ∂H_T(θ_T)/∂θ. The construction of a consistent estimator of Σ_0 depends on the degree of autocorrelation in h(z_t, θ_0). In Case ACh(n − 1), with finite n, the autocovariances of h comprising Σ_0 are replaced by their sample counterparts, using the fitted h(z_t, θ_T) in place of h(z_t, θ_0):

    (1/T) Σ_{t=j+1}^T h(z_t, θ_T) h(z_{t−j}, θ_T)'.

An asymptotically equivalent estimator is obtained by subtracting the sample mean from h(z_t, θ_T) before computing the sample autocovariances. If, on the other hand, n = ∞ or n is very large relative to the sample size, then an alternative approach to estimating Σ_0 is required. In Case ACh(∞), Σ_0 is the infinite sum of autocovariances displayed above. Letting Γ_{h0}(j) = E[h_t h_{t−j}'], we proceed by constructing an estimator as a weighted sum of the autocovariances that can feasibly be estimated with a finite sample of length T:

    Σ_T = (T/(T − K)) Σ_{j=−T+1}^{T−1} k(j/B_T) Γ_{hT}(j),

where the sample autocovariances are given by

    Γ_{hT}(j) = (1/T) Σ_{t=j+1}^T h(z_t, θ_T) h(z_{t−j}, θ_T)'     for j ≥ 0,
    Γ_{hT}(j) = (1/T) Σ_{t=−j+1}^T h(z_{t+j}, θ_T) h(z_t, θ_T)'    for j < 0,

and B_T is a bandwidth parameter discussed later. The scaling factor T/(T − K) is a small-sample adjustment for the estimation of θ. The function k(·), called a kernel, determines the weight given to past sample autocovariances in constructing Σ_T. The basic idea of this estimation strategy is that, for fixed j, the sample size T must increase to infinity for Γ_{hT}(j) to be a consistent estimator of Γ_{h0}(j). At the same time, the number of nonzero autocovariances in the weighted sum must increase without bound for Σ_T to be a consistent estimator of Σ_0. The potential problem is that if terms are added proportionately as T gets large, then the number of products h_t h_{t−j}' in the sample estimate of Γ_{hT}(j) stays small regardless of the size of T. To avoid this problem, the kernel must be chosen so that the number of autocovariances included grows, but at a slower rate than T,

so that the number of terms in each sample estimate Γ_{hT}(j) increases to infinity. Two popular kernels for estimating Σ_0 are

    Truncated:  k(x) = 1 for |x| ≤ 1,  and 0 otherwise;
    Bartlett:   k(x) = 1 − |x| for |x| ≤ 1,  and 0 otherwise.

For both of these kernels, the bandwidth B_T determines the number of autocovariances included in the estimation of Σ_T. In the case of the truncated kernel, all lags out to order B_T are included with equal weight. This is the kernel studied by White (1984). In the case of the Bartlett kernel, the autocovariances are given declining weights out to order |j| ≤ B_T. Newey and West (1987b) show that, by using declining weights, the Bartlett kernel guarantees that Σ_T is positive-semidefinite. This need not be the case in finite samples for the truncated kernel. The choice of the bandwidth parameter B_T is discussed in Andrews (1991).

Quasi-Maximum Likelihood Estimation

The QML estimator is a special case of the GMM estimator. Specifically, continuing our discussion of the scalar process r_t with conditional mean μ_rt and variance σ²_rt that depend on the parameter vector θ, let the jth component of h(z_t, θ) be the score associated with θ_j:

    h_j(z_t, θ) = −(1/(2σ²_rt(θ))) ∂σ²_rt(θ)/∂θ_j
                + ((r_t − μ_rt(θ))²/(2σ⁴_rt(θ))) ∂σ²_rt(θ)/∂θ_j
                + ((r_t − μ_rt(θ))/σ²_rt(θ)) ∂μ_rt(θ)/∂θ_j,      j = 1, ..., K.

The asymptotic distribution of the QML estimator is thus determined by the properties of h(z_t, θ_0). From this expression it is seen that E[h_j(z_t, θ_0) | I_{t−1}] = 0; that is, {h(z_t, θ_0)} is an MDS. This follows from the observations that, after taking conditional expectations, the first and second terms cancel and the third term has a conditional mean of zero. Therefore, the QML estimator θ_T^QML falls under Case ACh(0) with M = K (the number of moment equations equals the number of parameters).
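The kernel-weighted estimator of Σ_0 described above can be sketched in a few lines. The following is an illustrative implementation (not from the text): it demeans the fitted moments, accumulates kernel-weighted sample autocovariances out to the bandwidth, and, for compactness, omits the small-sample factor T/(T − K).

```python
import numpy as np

def hac_cov(h, bandwidth, kernel="bartlett"):
    """HAC estimate of Sigma_0 from a T x M array of fitted moments.

    Weighted sum of sample autocovariances Gamma_T(j) with weights
    k(j / bandwidth) from the truncated or Bartlett kernel.
    (The T/(T-K) degrees-of-freedom adjustment is omitted here.)"""
    h = np.asarray(h, dtype=float)
    T, M = h.shape
    hd = h - h.mean(axis=0)          # demeaning: asymptotically equivalent
    S = hd.T @ hd / T                # Gamma_T(0)
    for j in range(1, T):
        x = j / bandwidth
        if abs(x) > 1:
            break                    # both kernels vanish beyond the bandwidth
        w = 1.0 if kernel == "truncated" else 1.0 - abs(x)
        G = hd[j:].T @ hd[:-j] / T   # Gamma_T(j)
        S += w * (G + G.T)           # Gamma_T(-j) = Gamma_T(j)'
    return S

# MA(1) moments with long-run variance (1 + 0.5)^2 = 2.25
rng = np.random.default_rng(2)
eps = rng.normal(size=100_001)
h = (eps[1:] + 0.5 * eps[:-1]).reshape(-1, 1)

S_bart = hac_cov(h, bandwidth=20, kernel="bartlett")
S_trunc = hac_cov(h, bandwidth=20, kernel="truncated")
print(S_bart[0, 0], S_trunc[0, 0])
```

The Bartlett weights shrink the first-order autocovariance slightly (weight 1 − 1/B_T rather than 1), the price paid for a guaranteed positive-semidefinite estimate; both versions land near the true value 2.25 here.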

Its limiting distribution is

    √T (θ_T^QML − θ_0) ⇒ N(0, (d_0^QML)^{-1} Σ_0 (d_0^QML ')^{-1}),

where Σ_0 = E[h(z_t, θ_0) h(z_t, θ_0)'], with h given by the QML score above, and

    d_0^QML = E[∂² log f^N(r_t | I_{t−1}; θ_0)/∂θ∂θ'].

Though these components are exactly the same in form as in the case of full-information ML estimation, d_0^QML and Σ_0 are not related by the information-matrix equality, so the covariance matrix does not simplify further.

Linear Least-Squares Projection

The LLP estimator is the special case of the GMM estimator with z_t = (y_t, x_t')', h(z_t, δ) = (y_t − x_t'δ) x_t, and A_0 = I_K. Also,

    d_0^LLP = −E[x_t x_t'],

and, with u_t ≡ (y_t − x_t'δ_0), where δ_0 is the probability limit of the least-squares estimator δ_T,

    Σ_0 = Σ_{j=−∞}^{∞} E[x_t u_t u_{t−j} x_{t−j}'].

It follows that

    Ω_0^LLP = E[x_t x_t']^{-1} ( Σ_{j=−∞}^{∞} E[x_t u_t u_{t−j} x_{t−j}'] ) E[x_t x_t']^{-1}.

In order to examine several special cases of LLP for forecasting the future, we assume that the variable being forecasted is dated t + n, n ≥ 1, and let x_t denote the vector of forecast variables observed at date t:

    y_{t+n} = x_t' δ_0 + u_{t+n}.

We consider several different assumptions about the projection error u_{t+n}. Unless otherwise noted, throughout the following discussion, the information set I_t denotes the information generated by current and past x_t and u_t.

Consider first Case ACh(0) with n = 1 and E[u_{t+1} | I_t] = 0. One circumstance where this case arises is when a researcher is interested in testing whether y_{t+1} is unforecastable given information in I_t. For instance, if we assume that x_t includes the constant as the first component, and partition x_t as x_t = (1, x̃_t')' and δ_0 conformably as δ_0 = (δ_c, δ_x')', then this case implies that E[y_{t+1} | I_t] = δ_c, δ_x = 0, and y_{t+1} is unforecastable given past information about x_t and y_t. The alternative hypothesis is that

    E[y_{t+1} | I_t] = δ_c + x̃_t' δ_x,

with the (typical) understanding that the projection error under this alternative satisfies E[u_{t+1} | I_t] = 0. A more general alternative would allow δ_x ≠ 0 and the projection error u_{t+1} to be correlated with other variables in I_t. We examine this case later. Since d_0 = −E[x_t x_t'] and this problem fits into Case ACh(0),

    √T (δ_T − δ_0) ⇒ N(0, Ω_0^LLP),

where

    Ω_0^LLP = E[x_t x_t']^{-1} E[u²_{t+1} x_t x_t'] E[x_t x_t']^{-1}.

Without further assumptions, Ω_0^LLP does not simplify. One simplifying assumption that is sometimes made is that the variance of u_{t+1} conditioned on I_t is constant:

    E[u²_{t+1} | I_t] = σ²_u, a constant.

Under this assumption, Σ_0 simplifies to σ²_u E[x_t x_t'] and

    Ω_0^LLP = σ²_u E[x_t x_t']^{-1}.

These characterizations of Ω_0^LLP are not directly applicable because the asymptotic covariance matrices are unknown (they are functions of unknown population moments). Therefore, we replace these unknown moments with their sample counterparts. Let û_{t+1} ≡ (y_{t+1} − x_t' δ_T). With the homoskedasticity assumption, the distribution of δ_T used for inference is

    δ_T ≈ N(δ_0, Ω_T^LLP / T),

where

    Ω_T^LLP = σ̂²_u [ (1/T) Σ_t x_t x_t' ]^{-1},    σ̂²_u = (1/T) Σ_t û²_{t+1}.

This is, of course, the usual distribution theory used in the classical linear least-squares estimation problem. Letting σ̂²_δi denote the ith diagonal element of Ω_T^LLP / T, we can test the null hypothesis H_0: δ^i = δ_0^i.
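The classical case can be made concrete with a small simulated sketch (illustrative, not from the text): a projection of y_{t+1} on a constant and one forecasting variable under the null δ_x = 0, with homoskedastic errors, followed by the usual standard-error and t-ratio calculation.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 50_000
x = np.column_stack([np.ones(T), rng.normal(size=T)])  # x_t = (1, x~_t)'
delta0 = np.array([0.2, 0.0])   # delta_x = 0: y_{t+1} unforecastable by x~_t
u = rng.normal(size=T)          # homoskedastic projection error
y = x @ delta0 + u

# Least-squares projection and classical covariance sigma^2_u (X'X)^{-1}
delta_T = np.linalg.solve(x.T @ x, x.T @ y)
uhat = y - x @ delta_T
sigma2_u = np.mean(uhat ** 2)
cov_T = sigma2_u * np.linalg.inv(x.T @ x)       # Omega_T^LLP / T
t_stat = delta_T[1] / np.sqrt(cov_T[1, 1])      # test H_0: delta_x = 0

print(delta_T, t_stat)
```

Under the null the t-ratio is approximately standard normal, so values far outside ±2 would be evidence that y_{t+1} is forecastable by x̃_t.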

The test statistic is

    t = (δ_T^i − δ_0^i)/σ̂_δi ⇒ N(0, 1).

Suppose that we relax the homoskedasticity assumption and let the conditional variance of u_{t+1} be time varying. Then Ω_0^LLP is given by the general sandwich expression, and Σ_0 is now estimated by

    Σ_T = (1/T) Σ_t û²_{t+1} x_t x_t',

and

    Ω_T^LLP = [ (1/T) Σ_t x_t x_t' ]^{-1} Σ_T [ (1/T) Σ_t x_t x_t' ]^{-1}.

Testing proceeds as before, but with a different calculation of σ̂_δi.

Next, consider Case ACh(n − 1), which has n > 1 and E[u_{t+n} | I_t] = 0. This case would arise, for example, in asking whether y_{t+n} is forecastable given information in I_t. For this case, d_0 is unchanged, but the calculation of Σ_0 is modified so that

    Ω_0^LLP = E[x_t x_t']^{-1} ( Σ_{j=−n+1}^{n−1} E[u_{t+n} u_{t+n−j} x_t x_{t−j}'] ) E[x_t x_t']^{-1}.

Analogously to Case ACh(0), this expression simplifies further if the conditional variances and autocorrelations of u_t are constants. To estimate the asymptotic covariance matrix for this case, we replace E[x_t x_t'] by (1/T) Σ_t x_t x_t' and Σ_0 by

    Σ_T = (1/T) Σ_{j=−n+1}^{n−1} Σ_t û_{t+n} û_{t+n−j} x_t x_{t−j}'.

Testing proceeds in exactly the same way as before.
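An illustrative sketch (not from the text) shows why the autocovariance correction matters for overlapping, multiperiod forecasts. With n = 3 the projection error is MA(2), and ignoring its serial correlation (the j = 0, White-style estimate) understates the sampling variance of the slope when the regressor is persistent.

```python
import numpy as np

def ols_hac_cov(x, uhat, n):
    """Sandwich covariance for sqrt(T)(delta_T - delta_0) when the
    projection error of an n-period-ahead forecast is MA(n-1):
    [(1/T) sum x x']^{-1} Sigma_T [(1/T) sum x x']^{-1}, with Sigma_T the
    sum of sample autocovariances of x_t * uhat_{t+n} out to order n-1."""
    T = x.shape[0]
    xu = x * uhat[:, None]
    Sxx_inv = np.linalg.inv(x.T @ x / T)
    S = xu.T @ xu / T                  # j = 0 term (White estimator when n = 1)
    for j in range(1, n):
        G = xu[j:].T @ xu[:-j] / T     # order-j sample autocovariance
        S += G + G.T
    return Sxx_inv @ S @ Sxx_inv

rng = np.random.default_rng(4)
T, n, rho = 100_000, 3, 0.9

# Persistent forecasting variable x~_t (AR(1)), and an overlapping MA(2)
# projection error u_{t+n} independent of x, so E[u_{t+n} | I_t] = 0.
e = rng.normal(size=T)
xt = np.empty(T)
xt[0] = e[0] / np.sqrt(1 - rho**2)
for t in range(1, T):
    xt[t] = rho * xt[t - 1] + e[t]
eps = rng.normal(size=T + 2)
u = eps[2:] + eps[1:-1] + eps[:-2]

x = np.column_stack([np.ones(T), xt])
y = x @ np.array([0.1, 0.0]) + u

delta_T = np.linalg.solve(x.T @ x, x.T @ y)
uhat = y - x @ delta_T
cov_hac = ols_hac_cov(x, uhat, n)      # corrects for the MA(2) overlap
cov_white = ols_hac_cov(x, uhat, 1)    # ignores it
print(cov_hac[1, 1], cov_white[1, 1])
```

Here the serial-correlation-adjusted variance of the slope is roughly two to three times the uncorrected one, so t-ratios built from the uncorrected covariance would be badly oversized.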

Relative Efficiency of Estimators

The efficiency of an estimator can only be judged relative to an a priori set of restrictions on the joint distribution of the z_t that are to be used in estimation. These restrictions enter the formulation of a GMM estimator in two ways: through the choice of the h function and through the choice of A_0. The form of the asymptotic covariance matrix Ω_0 shows the dependence of the limiting distribution on both of these choices. In many circumstances, a researcher will have considerable latitude in choosing either A_0 or h(z_t, θ), or both. Therefore, a natural question is: which is the most efficient GMM estimator among all admissible estimators? In this section, we characterize the optimal GMM estimator, in the sense of being most efficient or, equivalently, having the smallest asymptotic covariance matrix among all estimators that exploit the same information about the distribution of z_t.

GMM Estimators

To highlight the dependence of the distributions of GMM estimators on the information used in specifying the moment equations, it is instructive to start with the conditional version of the moment equations underlying GMM estimation,

    (1/T) Σ_{t=1}^T A_t h(z_t, θ_T) = 0,

where A_t is a (possibly random) K × M matrix in the information set I_t, and h(z_t, θ) is an M-vector, with K ≤ M, satisfying

    E[h(z_t; θ_0) | I_t] = 0.

In this section, we treat z_t as a generic random vector that is not presumed to be in I_t and, indeed, in all of the examples considered subsequently z_t ∉ I_t. Initially, we treat h(z_t; θ) as given by the asset pricing theory and, as such, not subject to the choice of the researcher. We also let

    A = { A_t ∈ I_t  such that  E[A_t ∂h(z_t; θ_0)/∂θ'] has full rank }

denote the class of admissible GMM estimators, where each estimator is indexed by the (possibly random) weights A_t. The efficiency question at hand is: in estimating θ_0, what is the optimal choice of A_t? (Which choice of A_t gives the smallest asymptotic covariance matrix for θ_T among all estimators based on matrices in A?) The following lemma, based on the analysis in Hansen (1985), provides a general characterization of the optimal A* ∈ A.

Lemma. Suppose that the assumptions of the asymptotic normality theorem are satisfied and {A_t} ∈ A is a stationary and ergodic process (jointly with z_t). Then the optimal choice A* ∈ A satisfies


More information

is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications.

is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications. Stat 811 Lecture Notes The Wald Consistency Theorem Charles J. Geyer April 9, 01 1 Analyticity Assumptions Let { f θ : θ Θ } be a family of subprobability densities 1 with respect to a measure µ on a measurable

More information

Department of Economics, UCSD UC San Diego

Department of Economics, UCSD UC San Diego Department of Economics, UCSD UC San Diego itle: Spurious Regressions with Stationary Series Author: Granger, Clive W.J., University of California, San Diego Hyung, Namwon, University of Seoul Jeon, Yongil,

More information

Econometría 2: Análisis de series de Tiempo

Econometría 2: Análisis de series de Tiempo Econometría 2: Análisis de series de Tiempo Karoll GOMEZ kgomezp@unal.edu.co http://karollgomez.wordpress.com Segundo semestre 2016 IX. Vector Time Series Models VARMA Models A. 1. Motivation: The vector

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

Analogy Principle. Asymptotic Theory Part II. James J. Heckman University of Chicago. Econ 312 This draft, April 5, 2006

Analogy Principle. Asymptotic Theory Part II. James J. Heckman University of Chicago. Econ 312 This draft, April 5, 2006 Analogy Principle Asymptotic Theory Part II James J. Heckman University of Chicago Econ 312 This draft, April 5, 2006 Consider four methods: 1. Maximum Likelihood Estimation (MLE) 2. (Nonlinear) Least

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han and Robert de Jong January 28, 2002 Abstract This paper considers Closest Moment (CM) estimation with a general distance function, and avoids

More information

Econ 623 Econometrics II Topic 2: Stationary Time Series

Econ 623 Econometrics II Topic 2: Stationary Time Series 1 Introduction Econ 623 Econometrics II Topic 2: Stationary Time Series In the regression model we can model the error term as an autoregression AR(1) process. That is, we can use the past value of the

More information

Estimation, Inference, and Hypothesis Testing

Estimation, Inference, and Hypothesis Testing Chapter 2 Estimation, Inference, and Hypothesis Testing Note: The primary reference for these notes is Ch. 7 and 8 of Casella & Berger 2. This text may be challenging if new to this topic and Ch. 7 of

More information

Chapter 11 GMM: General Formulas and Application

Chapter 11 GMM: General Formulas and Application Chapter 11 GMM: General Formulas and Application Main Content General GMM Formulas esting Moments Standard Errors of Anything by Delta Method Using GMM for Regressions Prespecified weighting Matrices and

More information

4 Sums of Independent Random Variables

4 Sums of Independent Random Variables 4 Sums of Independent Random Variables Standing Assumptions: Assume throughout this section that (,F,P) is a fixed probability space and that X 1, X 2, X 3,... are independent real-valued random variables

More information

ECON3327: Financial Econometrics, Spring 2016

ECON3327: Financial Econometrics, Spring 2016 ECON3327: Financial Econometrics, Spring 2016 Wooldridge, Introductory Econometrics (5th ed, 2012) Chapter 11: OLS with time series data Stationary and weakly dependent time series The notion of a stationary

More information

Probability Space. J. McNames Portland State University ECE 538/638 Stochastic Signals Ver

Probability Space. J. McNames Portland State University ECE 538/638 Stochastic Signals Ver Stochastic Signals Overview Definitions Second order statistics Stationarity and ergodicity Random signal variability Power spectral density Linear systems with stationary inputs Random signal memory Correlation

More information

Parameter Estimation

Parameter Estimation Parameter Estimation Consider a sample of observations on a random variable Y. his generates random variables: (y 1, y 2,, y ). A random sample is a sample (y 1, y 2,, y ) where the random variables y

More information

Online Appendix. j=1. φ T (ω j ) vec (EI T (ω j ) f θ0 (ω j )). vec (EI T (ω) f θ0 (ω)) = O T β+1/2) = o(1), M 1. M T (s) exp ( isω)

Online Appendix. j=1. φ T (ω j ) vec (EI T (ω j ) f θ0 (ω j )). vec (EI T (ω) f θ0 (ω)) = O T β+1/2) = o(1), M 1. M T (s) exp ( isω) Online Appendix Proof of Lemma A.. he proof uses similar arguments as in Dunsmuir 979), but allowing for weak identification and selecting a subset of frequencies using W ω). It consists of two steps.

More information

1. Fundamental concepts

1. Fundamental concepts . Fundamental concepts A time series is a sequence of data points, measured typically at successive times spaced at uniform intervals. Time series are used in such fields as statistics, signal processing

More information

GMM, HAC estimators, & Standard Errors for Business Cycle Statistics

GMM, HAC estimators, & Standard Errors for Business Cycle Statistics GMM, HAC estimators, & Standard Errors for Business Cycle Statistics Wouter J. Den Haan London School of Economics c Wouter J. Den Haan Overview Generic GMM problem Estimation Heteroskedastic and Autocorrelation

More information

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models University of Illinois Fall 2016 Department of Economics Roger Koenker Economics 536 Lecture 7 Introduction to Specification Testing in Dynamic Econometric Models In this lecture I want to briefly describe

More information

STATISTICS/ECONOMETRICS PREP COURSE PROF. MASSIMO GUIDOLIN

STATISTICS/ECONOMETRICS PREP COURSE PROF. MASSIMO GUIDOLIN Massimo Guidolin Massimo.Guidolin@unibocconi.it Dept. of Finance STATISTICS/ECONOMETRICS PREP COURSE PROF. MASSIMO GUIDOLIN SECOND PART, LECTURE 2: MODES OF CONVERGENCE AND POINT ESTIMATION Lecture 2:

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

Introduction to the Mathematical and Statistical Foundations of Econometrics Herman J. Bierens Pennsylvania State University

Introduction to the Mathematical and Statistical Foundations of Econometrics Herman J. Bierens Pennsylvania State University Introduction to the Mathematical and Statistical Foundations of Econometrics 1 Herman J. Bierens Pennsylvania State University November 13, 2003 Revised: March 15, 2004 2 Contents Preface Chapter 1: Probability

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

Ch.10 Autocorrelated Disturbances (June 15, 2016)

Ch.10 Autocorrelated Disturbances (June 15, 2016) Ch10 Autocorrelated Disturbances (June 15, 2016) In a time-series linear regression model setting, Y t = x tβ + u t, t = 1, 2,, T, (10-1) a common problem is autocorrelation, or serial correlation of the

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence

Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Lecture 8: Multivariate GARCH and Conditional Correlation Models

Lecture 8: Multivariate GARCH and Conditional Correlation Models Lecture 8: Multivariate GARCH and Conditional Correlation Models Prof. Massimo Guidolin 20192 Financial Econometrics Winter/Spring 2018 Overview Three issues in multivariate modelling of CH covariances

More information

The Uniform Weak Law of Large Numbers and the Consistency of M-Estimators of Cross-Section and Time Series Models

The Uniform Weak Law of Large Numbers and the Consistency of M-Estimators of Cross-Section and Time Series Models The Uniform Weak Law of Large Numbers and the Consistency of M-Estimators of Cross-Section and Time Series Models Herman J. Bierens Pennsylvania State University September 16, 2005 1. The uniform weak

More information

Linear models. Linear models are computationally convenient and remain widely used in. applied econometric research

Linear models. Linear models are computationally convenient and remain widely used in. applied econometric research Linear models Linear models are computationally convenient and remain widely used in applied econometric research Our main focus in these lectures will be on single equation linear models of the form y

More information

1 Procedures robust to weak instruments

1 Procedures robust to weak instruments Comment on Weak instrument robust tests in GMM and the new Keynesian Phillips curve By Anna Mikusheva We are witnessing a growing awareness among applied researchers about the possibility of having weak

More information

11. Further Issues in Using OLS with TS Data

11. Further Issues in Using OLS with TS Data 11. Further Issues in Using OLS with TS Data With TS, including lags of the dependent variable often allow us to fit much better the variation in y Exact distribution theory is rarely available in TS applications,

More information

Empirical Market Microstructure Analysis (EMMA)

Empirical Market Microstructure Analysis (EMMA) Empirical Market Microstructure Analysis (EMMA) Lecture 3: Statistical Building Blocks and Econometric Basics Prof. Dr. Michael Stein michael.stein@vwl.uni-freiburg.de Albert-Ludwigs-University of Freiburg

More information

Some Background Material

Some Background Material Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important

More information

On detection of unit roots generalizing the classic Dickey-Fuller approach

On detection of unit roots generalizing the classic Dickey-Fuller approach On detection of unit roots generalizing the classic Dickey-Fuller approach A. Steland Ruhr-Universität Bochum Fakultät für Mathematik Building NA 3/71 D-4478 Bochum, Germany February 18, 25 1 Abstract

More information

Multivariate Time Series

Multivariate Time Series Multivariate Time Series Notation: I do not use boldface (or anything else) to distinguish vectors from scalars. Tsay (and many other writers) do. I denote a multivariate stochastic process in the form

More information

ARIMA Modelling and Forecasting

ARIMA Modelling and Forecasting ARIMA Modelling and Forecasting Economic time series often appear nonstationary, because of trends, seasonal patterns, cycles, etc. However, the differences may appear stationary. Δx t x t x t 1 (first

More information

Introduction to Stochastic processes

Introduction to Stochastic processes Università di Pavia Introduction to Stochastic processes Eduardo Rossi Stochastic Process Stochastic Process: A stochastic process is an ordered sequence of random variables defined on a probability space

More information

This note introduces some key concepts in time series econometrics. First, we

This note introduces some key concepts in time series econometrics. First, we INTRODUCTION TO TIME SERIES Econometrics 2 Heino Bohn Nielsen September, 2005 This note introduces some key concepts in time series econometrics. First, we present by means of examples some characteristic

More information

A TIME SERIES PARADOX: UNIT ROOT TESTS PERFORM POORLY WHEN DATA ARE COINTEGRATED

A TIME SERIES PARADOX: UNIT ROOT TESTS PERFORM POORLY WHEN DATA ARE COINTEGRATED A TIME SERIES PARADOX: UNIT ROOT TESTS PERFORM POORLY WHEN DATA ARE COINTEGRATED by W. Robert Reed Department of Economics and Finance University of Canterbury, New Zealand Email: bob.reed@canterbury.ac.nz

More information

ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications

ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications Yongmiao Hong Department of Economics & Department of Statistical Sciences Cornell University Spring 2019 Time and uncertainty

More information

Nonlinear time series

Nonlinear time series Based on the book by Fan/Yao: Nonlinear Time Series Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna October 27, 2009 Outline Characteristics of

More information

Spectral representations and ergodic theorems for stationary stochastic processes

Spectral representations and ergodic theorems for stationary stochastic processes AMS 263 Stochastic Processes (Fall 2005) Instructor: Athanasios Kottas Spectral representations and ergodic theorems for stationary stochastic processes Stationary stochastic processes Theory and methods

More information

Uncertainty and Disagreement in Equilibrium Models

Uncertainty and Disagreement in Equilibrium Models Uncertainty and Disagreement in Equilibrium Models Nabil I. Al-Najjar & Northwestern University Eran Shmaya Tel Aviv University RUD, Warwick, June 2014 Forthcoming: Journal of Political Economy Motivation

More information

Lecture 6: Univariate Volatility Modelling: ARCH and GARCH Models

Lecture 6: Univariate Volatility Modelling: ARCH and GARCH Models Lecture 6: Univariate Volatility Modelling: ARCH and GARCH Models Prof. Massimo Guidolin 019 Financial Econometrics Winter/Spring 018 Overview ARCH models and their limitations Generalized ARCH models

More information

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle

More information

GMM and SMM. 1. Hansen, L Large Sample Properties of Generalized Method of Moments Estimators, Econometrica, 50, p

GMM and SMM. 1. Hansen, L Large Sample Properties of Generalized Method of Moments Estimators, Econometrica, 50, p GMM and SMM Some useful references: 1. Hansen, L. 1982. Large Sample Properties of Generalized Method of Moments Estimators, Econometrica, 50, p. 1029-54. 2. Lee, B.S. and B. Ingram. 1991 Simulation estimation

More information

A Bayesian perspective on GMM and IV

A Bayesian perspective on GMM and IV A Bayesian perspective on GMM and IV Christopher A. Sims Princeton University sims@princeton.edu November 26, 2013 What is a Bayesian perspective? A Bayesian perspective on scientific reporting views all

More information

STA205 Probability: Week 8 R. Wolpert

STA205 Probability: Week 8 R. Wolpert INFINITE COIN-TOSS AND THE LAWS OF LARGE NUMBERS The traditional interpretation of the probability of an event E is its asymptotic frequency: the limit as n of the fraction of n repeated, similar, and

More information

Robust Unit Root and Cointegration Rank Tests for Panels and Large Systems *

Robust Unit Root and Cointegration Rank Tests for Panels and Large Systems * February, 2005 Robust Unit Root and Cointegration Rank Tests for Panels and Large Systems * Peter Pedroni Williams College Tim Vogelsang Cornell University -------------------------------------------------------------------------------------------------------------------

More information

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1 Chapter 2 Probability measures 1. Existence Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension to the generated σ-field Proof of Theorem 2.1. Let F 0 be

More information

Review Session: Econometrics - CLEFIN (20192)

Review Session: Econometrics - CLEFIN (20192) Review Session: Econometrics - CLEFIN (20192) Part II: Univariate time series analysis Daniele Bianchi March 20, 2013 Fundamentals Stationarity A time series is a sequence of random variables x t, t =

More information

Multivariate GARCH models.

Multivariate GARCH models. Multivariate GARCH models. Financial market volatility moves together over time across assets and markets. Recognizing this commonality through a multivariate modeling framework leads to obvious gains

More information

Unconstrained optimization

Unconstrained optimization Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout

More information