Information Theory for Maximum Likelihood Estimation of Diffusion Models

Size: px
Start display at page:

Download "Information Theory for Maximum Likelihood Estimation of Diffusion Models"

Transcription

1 Information heory for Maximum Likelihood Estimation of Diffusion Models Hwan-sik Choi Binghamton University Mar. 4 Abstract We develop an information theoretic framework for maximum likelihood estimation of diffusion models. wo information criteria that measure the divergence of a diffusion process from the true diffusion are defined. he maximum likelihood estimator MLE converges asymptotically to the limit that minimizes the criteria when sampling interval decreases as sampling span increases. When both drift and diffusion specifications are correct, the criteria become zero and the inverse curvatures of the criteria equal the asymptotic variance of the MLE. For misspecified models, the minimizer of the criteria defines pseudo-true parameters. Pseudo-true drift parameters depend on approximate transitions if used. JEL: C3, C, C4. Keywords: Kullback-Leibler information criterion, fiber bundle, high frequency, infinitesimal generator, infill asymptotics, misspecification. I thank the seminar participants at Purdue University Statistics, Symposium of Probability and Stochastic Processes CIMA, Mexico for helpful discussions. hwansik@binghamton.edu, Department of Economics, Binghamton University, 44 Vestal Parkway E. P.O. Box 6, Binghamton, NY 39, USA.

2 Introduction We develop an information theoretic framework for maximum likelihood estimation of diffusion models in an asymptotic environment in which sampling interval shrinks to zero as sampling span increases to infinity. he new approach provides a unified framework within which we can analyze both correctly specified and misspecified diffusion models. For discrete time models, White 98 gives an information theoretic foundation for analyzing the maximum likelihood estimator MLE. His theory is suitable for the models analyzed with the conventional large sample theory with fixed sampling interval. For diffusion models, estimation and inference under decreasing sampling interval were studied previously by, for example, Bandi and Phillips 3, Jeong and Park 3, and Choi, Jeong, and Park 3. Bandi and Phillips 3 study fully nonparametric estimation. Jeong and Park 3 study the maximum likelihood estimation for correctly specified parametric models for both stationary and non-stationary processes. Choi, Jeong, and Park 3 provide some asymptotic results for misspecified models, although their main focus is on model selection testing. he theory developed in Jeong and Park 3 and Choi, Jeong, and Park 3 can explain the differential properties of the drift and diffusion parameter estimators observed in high frequency samples. hey show that the rates of information accumulation for drift and diffusion parameters are different, and both sample size and sampling span are important for diffusion parameter estimation whereas sampling span only matters for drift parameter estimation. he present paper provides an information theoretic foundation for their new asymptotic theory. In White s theory, the Kullback-Leibler information criterion KLIC, Kullback and Leibler 95 plays a key role in understanding the asymptotic properties of the MLE. He shows that the MLE converges to the limit that minimizes the KLIC from the true distribution and derives asymptotic properties of the MLE under model misspecification. He also shows that the usual inference procedure with the Likelihood Ratio, Lagrange Multiplier, and Wald test statistics is not valid for a misspecified model. Naturally, we are interested in the questions similar to those that White has addressed in his paper. Under the new asymptotic framework with decreasing sampling interval, do the MLEs of misspecified diffusion models converge to some limits asymptotically, do the limits have any meaning, are the usual inference procedures based on likelihood ratios valid, what is the consequence of partial misspecification, what is the impact of having to use an approximate

3 transition when the exact transition is unknown in closed-form? We provide an answer to each of these questions in this paper. Based on the KLIC, we develop two-tier information criteria, which we refer to as the primary and secondary information criteria. he criteria measure the divergence of a diffusion process M from the true diffusion X. he primary criterion depends on the diffusion function of M, and equals zero if and only if the diffusion function is the same as the diffusion function of X. he secondary criterion becomes zero if both drift and diffusion functions of M are correct. Our new information criteria play a similar role as the KLIC in White 98. When sampling interval decreases as sampling span increases, the MLE asymptotically minimizes the two criteria, and the minimizer of the criteria is said to be the pseudo-true parameter. Because the primary criterion does not depend on the drift specification of M, the MLE for a diffusion function parameter is consistent to the same limit regardless of drift specifications. he secondary criterion depends on both diffusion and drift specification. When the exact transition is not known in closed-form and an approximate transition is used, the secondary criterion depends on the choice of an approximate transition as well. herefore, the pseudo-true drift parameter depends on diffusion and drift function specifications and the choice of approximate transitions if used. Under the partial misspecification where the diffusion function is correctly specified, the pseudo-true values of drift function parameters become invariant for a large class of approximate transitions. hus, the MLE of drift parameters would be less sensitive to the choice of an approximate transition when the diffusion function is correctly specified. When the drift function is correctly specified but the diffusion function is misspecified, the drift estimator does not converge to the true value in general. However, we show that the Euler approximate transition has a robustness property in the sense that the MLE of the drift parameters converges asymptotically to the true value. For the Milstein approximate transition and the closed-form approximate transition of Aït-Sahalia, the MLE of the drift parameter may not converge to the true parameter value. We also study the asymptotic distribution of the MLE for both correctly specified and misspecified models. When a diffusion model is correctly specified, we show that the asymptotic variances of the MLE of diffusion and drift parameters are given by the inverse curvatures of the primary and secondary criteria respectively. For misspecified models, the asymptotic variance of the MLE depends on both expected Hessian matrix and covariance matrix of the score functions. 3

4 We develop the new information criteria in the next section, and give information theoretic interpretation of the asymptotic properties of the MLE in Section 3. Divergence of diffusion processes Let W be the standard Brownian motion defined on a probability space Ω, F, P, and X be a stationary diffusion on D R that solves the stochastic differential equation SDE dx t = µ X t dt + σ X t dw t. with a drift function µ and a diffusion function σ. We assume that SDEs in this paper satisfy the conditions in Karatzas and Shreve 99 to admit a weak solution. Suppose we have a parametric diffusion model Mθ for X given by Mθ : dx t = µx t ; α dt + σx t ; β dw t, where µ ; α and σ ; β > are known functions with an unknown parameter vector θ = α, β in a compact set Θ R k. A diffusion model Mθ is misspecified if there exists no θ Θ such that µ = µ ; α and σ = σ ; β on D. Let p t, x, y be the transition density from X = x to X t = y of the true process in. and pt, x, y; θ be the exact or approximate transition density of Mθ. We consider the KLIC from the transition density p t, x, y to pt, x, y; θ. he KLICP, Q from a probability measure P to an equivalent probability measure Q, is defined by ˆ KLICP, Q dp log dp, dq and it is infinite if P and Q are not equivalent. See Csiszár 967a,b, 975 for more general divergence measures. Based on the KLIC, define the divergence measure D t θ E log p t, X, X t,. pt, X, X t ; θ as a function of t given θ. he function D t θ characterizes the divergence from the true process X to Mθ at various transition intervals. For instance, D t θ for small t would measure how close the model transition pt, x, y; θ is from the true transition p t, x, y for observations X and X t obtained over a small time interval t. In the limit of t, 4

5 Dtθ α =., β =. α =.8, β =.7 rue: α =.6, β = ransition interval t Figure : For Mθ : dx t =.α X t dt + β X t dw t, the two curves show D t θ from the true diffusion θ = α, β =.6,. to two diffusion processes θ =.,. and θ =.8,.7. For small t, the process with θ =.,. is closer to the true process, but θ =.8,.7 is closer for large t. D θ would measure the divergence of the marginal densities of X and Mθ. Since D t θ depends on the transition interval t given θ, the proximity of two processes X and Mθ in D t θ should be understood separately for different t. In this sense, we may consider the combination of Mθ and t as a model if sampling occurs at a specific time interval only. Consequently, for the discrete time analysis with fixed sampling interval, it is sufficient to consider D t θ at a specific t as the only relevant criterion. In this paper, however, we are particularly interested in the asymptotic framework in which t since the exact transition of a diffusion model is rarely known in closed-form and all approximate transitions would be valid only if t. his is also one of the reasons that diffusion models are mostly used for high frequency samples. In Figure, we give two examples of the divergence measure D t θ calculated with the CIR process Cox, Ingersoll, and Ross 985 Mθ : dx t =.α X t dt + β X t dw t with a two dimensional parameter vector θ = α, β. he true process X is defined by θ = α, β =.6,., and D t θ from X to the diffusion processes with θ =.,. 5

6 and θ =.8,.7 are shown in Figure. he process with θ =.,. has the diffusion function parameter close to the true value β =. whereas θ =.8,.7 has the drift function parameter close to the true parameter α =.6 compared to the other. he figure shows that the process with θ =.,. is closer to X in D t θ at small t. As t increases, the process θ =.8,.7 becomes closer to the true process. o see the relative effect of α and β on D t θ more clearly as t changes, we compare the contour plots of D t θ for various θ in Figure under quarterly t = /4, monthly t = /, weekly t = /5, and daily t = /5 transition intervals. he true process X is shown as the dot at the center of the plot. Each contour level curve represents the diffusion processes with equal D t θ from X. From this figure, we can notice that the drift function parameter α horizontal moves and the diffusion function parameter β vertical moves influence the divergence differently depending on the transition interval. Given β, a fixed amount of deviation of α from its true value.6 increases the divergence very much with quarterly transition interval, but the increase is quite small under daily transition interval. On the other hand, deviation of β from its true value. has similar impact on D t θ at all four transition intervals. his implies that the divergence measure D t θ is more sensitive to diffusion function specification than drift function specification as transition interval decreases. his example motivates that it is crucial to study the behavior of D t θ with small t to understand the asymmetric impact of drift and diffusion specifications on D t θ for high frequency data. From this motivation, we define the primary criterion as follows. Definition. Primary Criterion. he primary criterion is the limit of the divergence function D t θ as t. C P θ lim t D t θ We show that C P θ plays a crucial role in analyzing the MLE under the asymptotic framework in which the sampling interval decreases to zero. o ensure that C P θ is well defined, we develop our theory under the following assumptions. Let lt, x, y = log pt, x, y be a generic log transition density of Mθ for any θ Θ or of the true process X with a drift function µ and a diffusion function σ. We consider the scaled log transition density lt, x, y tlt, x, y + log t, for t >..3 6

7 β β Quaterly α Weekly e α β β Monthly e α Daily e α Figure : Contour plots of the divergence measure D t θ defined in. from the true process X to Mθ : dx t =.α X t dt + β X t dw t. he true process is defined by θ = α, β =.6,. and shown as the dot at the center. Each level represents the processes in Mθ with equal divergence from X under quarterly t = /4, monthly t = /, weekly t = /5, and daily t = /5 transition intervals. 7

8 Although the transition pt, x, y is not well defined at t =, we extend the domain of the scaled function lt, x, y to t by defining its value at t = as lim t lt, x, y which is assumed to exist by the following assumption. he derivatives of lt, x, y at t = are defined similarly. Assumption. lt, x, y, µx and σx are infinitely differentiable in t, x, y D and θ Θ. Let ht, x, y be one of these functions or their derivative functions. hen ht, x, y is dominated by g in the sense that ht, x, y gxgy, for all small t, y, x D, and θ in the interior of Θ. Moreover, g is locally bounded, increasing at a polynomial rate of order p on the boundary of D, and EgX t d <, where d is the dimension of θ. Assumption. here exists q > such that q sup X t p t [, ] as. If D =,, we further assume that q sup X p t. t [, ] With the above technical conditions, we introduce the following notation for simplicity of exposition. Let ft, x, y be a function that satisfies Assumption. We denote the functions f / x f, x, x, f /t x f, x, x, t with the single argument x as the function ft, x, y and its derivative with respect to t at y = x and t =. Other higher order derivatives such as f /y x, f /yy x, or f /yt x are defined similarly. Using this notation, for the scaled log transition density lt, x, y in.3, we can write l / x = l, x, x, l /t x = l, x, x/ t, l /y x = l, x, x/ y, and other derivatives similarly. We also assume the following for lt, x, y and its derivatives at y = x and t =. 8

9 Assumption 3. We have the following limits for lt, x, y. l/ x =, l/t x = log σ x logπ, l/y x =, l/yy x = σ x. Assumption 4. he functions l /yyy x and l /yyyy x depend on σx but not on µx. Assumptions 3 and 4 can be shown to be satisfied for the exact transition density always. When the exact transition must be approximated because it is not available in closed-form, we can show that the Euler and Milstein approximate transitions and the closed-form approximate transition of Aït-Sahalia satisfy them also. See Lemma A.5 of Jeong and Park 3 for details. For expositional simplicity, we write f fx for any function f with a single argument. For example, we write l / = l / X, µ = µx and σ = σx. Also, we suppress θ for the functions such as the log-likelihoods and the drift and diffusion functions of Mθ. With this notional convention, we derive the explicit form of C P in the following theorem. All proofs are in Appendix. heorem.. Let A t + µ y y + σ y y.4 be the infinitesimal generator of the true process X. Given X = x, define ft, x, y l t, x, y lt, x, y,.5 where l t, x, y and lt, x, y are the scaled log-likelihoods defined in.3 of the true process X and the model Mθ respectively. If Assumptions -4 are satisfied, then we have C P θ = EAf / = [ ] E log σ σ σ + σ..6 Note that Af / = Af / X, σ = σ X and σ = σ X ; β from our notational convention. he theorem shows that the drift specification of Mθ becomes irrelevant for D t θ as t because the primary criterion C P θ does not depend on the drift function 9

10 parameter α. his result also supports our observation from Figure. We will denote C P θ as C P β in the rest of this paper. here is a convenient way to interpret the primary criterion. Suppose that Y β is a random variable of which the distribution is the marginal distribution of the diffusion process hen we can write rx t ; β σ X t σ X t ; β. C P β = E{Y β log Y β}. Intuitively, C P β measures the expected error from approximating log Y with its linear approximation Y for given β. For a correctly specified diffusion function with σ = σ ; β, we would have C P β = since r ; β =. For given β = β, let M β α Mθ be the sub-model defined on the subset Θ β = {α, β Θ β = β } of Θ. invariant on Θ β. heorem. implies that the primary criterion C P β is herefore, if there exists β that minimizes C P β, all the diffusion processes in M β α would minimize the primary criterion and be considered to be closest to X. We define the secondary divergence measure that depends on both α and β to further distinguish the members of M β α. Definition. Secondary Criterion. Let D tθ D tθ t be the derivative of D t θ at t. he secondary criterion is the limit as t. C S θ lim t D tθ We derive the explicit form of C S θ to understand the role of diffusion and drift specifications in determining the secondary divergence of Mθ from the true process X. Let µ /x x µx/ x and σ /x x σ x/ x, and denote µ /x µ /x X and σ /x σ /x X. Other functions and their derivatives without arguments are defined in the same manner.

11 heorem.. Suppose Assumptions -4 are satisfied. We have C S θ = EA f / = [{ } E µ µ + σ /x + σ µ /x + σ /xx 4 σ + σ + σ µ + σ /x l /yyy l /yyy + σ4 4 l /yyyy l /yyyy ] + µ l /yt l /yt + l /tt l /tt + σ l /yyt l /yyt,.7 where A is the infinitesimal generator defined in.4, the function f is the scaled loglikelihood ratio defined in.5 with the exact transition densities of X and Mθ, l/yyy = 3σ /x σ 4, l/yyyy = σ /xx σ 4 5 σ/x 4 σ 6, l/yt = µ σ 3σ /x 4σ,.8 l/tt = µ σ µ /x + µσ /x σ 3σ /x 6σ + σ /xx 4,.9 l/yyt = µ σ /x + µσ /x σ 3σ /x 4σ + 3σ /xx 4 and the derivatives of l are similarly defined with µ x and σ x.,. When an approximate transition is used, the exact form of C S θ would depend on the approximation method generally because the derivatives in.8-. depend on the local properties of the transition of Mθ around y = x and t =. However, it is shown in Jeong and Park that these derivatives have identical forms for the exact transition and Aït-Sahalia s closed-form approximate transition. herefore, Aït-Sahalia s closed-form approximation is sufficiently accurate to give the same C S θ as the exact transition would. Moreover, in the following corollary, we show that if diffusion function specification is correct, C S θ is invariant for a larger class of approximate transitions. Corollary.. If the diffusion function σ ; β of Mθ is identical to the true diffusion function, i.e. σ ; β = σ on D, then C S θ conditional on β = β is simplified to C S θ = [ E µ µ ] σ

12 for the exact transition as well as the Euler, Milstein, and Aït-Sahalia s closed-form approximate transitions. Further, we have C S θ and the equality holds if and only if θ = α, β Θ such that µ ; α = µ on D. As an example, we give the explicit forms of C P and C S for the CIR model Mθ : dx t = α α X t dt + β X t dw t. Let the process with θ = ᾱ, ᾱ, β be the true diffusion. In this example, the function Af / used to calculate the primary criterion becomes a constant that depends on β only rather than random, and is simply given by C P β = β log β + β β. he secondary criterion C S θ = EA f / can be calculated from [ A f / x = ᾱᾱ x + ᾱ ᾱ x β + β x ᾱ β x + β x + 3 ᾱ ᾱ x + β β x β + 5 β β 6x x β ᾱ ᾱ x + ᾱ ᾱ x α α x β x β x ᾱ + ᾱ x + α α x β x β + ᾱ α x + ᾱᾱ x α α x 3 β β x 6 x + ᾱ ᾱᾱ x x + β β α α α x x Let us set the true process as ᾱ, ᾱ, β =.6,.,., and calculate C P β and C S θ. Figure 3 shows the values of C P β for different values of β. he closest diffusion function specification does not depend on drift specifications, and C P becomes zero if and only if the diffusion specification is correct β =.. In Figure 4, we provide the contour plots of C S θ from the true process to processes with α, α given β =.8,.,.,.4. he true value of α, α is shown as the ].

13 C P β β Figure 3: he primary criterion C P β from the true process X to Mθ : dx t = α α X t dt + β X t dw t. he true process is defined by α, α, β =.6,.,.. C P β does not depend on α, α, and it is minimized at the true parameter value β =.. dot at the center of each plot. he true value of the diffusion function parameter is β =.. We can see in the figure that C S θ is minimized at the true value α, α =.6,. when β =.. However, if β., C S θ is minimized at a different α, α point. As shown in this example, slight misspecification of diffusion functions leads to a large change of the closest drift specification in C S. his implies that when diffusion specification is incorrect, the best drift specification in the sense of minimizing C S may not be the correct drift specification. Consequently, the diffusion process with correct drift specification may not be the closest diffusion to the true diffusion in this case. For further development of our theory, we assume that the primary and secondary criteria satisfy the following. Assumption 5. he functions C P β and C S θ satisfy the following conditions. a here is a unique β that solves the minimization problem C P β = min C P β. β b Let Θ β = {α, β β = β }, then we have a unique θ = α, β in the interior of Θ 3

14 α α β = α.8. β = α α α β = α.8. β = α Figure 4: Contour plot of the secondary criterion C S θ from the true process X to Mθ : dx t = α α X t dt + β X t dw t. he true process is defined by α, α, β =.6,.,. and shown as the dot at the center. Each level represents the diffusion processes with α, α in Mθ with equal C S θ from X under four different values of β =.8,.,.,.4. Note that β =. is the true parameter value, and C S θ is minimized at the true parameter α, α =.6,. in this case. 4

15 that solves the minimization problem c C P and C S are twice differentiable at θ. C S θ = min C S θ. θ Θ β Let θ = α, β solve the system of equations he sub-model M β α defined on Θ β β = argmin C P β,. β θ = argmin C S θ.. θ Θ represents the closest members in Mθ to the true diffusion process under the primary criterion C P β. he value α gives the closest diffusion Mθ in M β α that minimizes C S θ given β. he first order conditions for. and. are respectively given by [ ] EAf /β = E σ + σ σ /β =,.3 [ EA µ/α f /α = E σ µ µ + σ σ µ /xα µ /ασ/x ] σ =..4 For example, if the diffusion function of Mθ is constant, σ ; β = β, then we have β = {Eσ X } / implying that β is the mean of the true diffusion function. When an approximate transition is used, the first order condition for the drift parameter α depends on the approximate transition. Aït-Sahalia s closed-form approximate transition results in the same condition in.4 as the exact transition, but for the Euler and Milstein approximate transitions, the first order conditions for α in.4 are given by E [ µ/α σ µ µ [ ] µ/α E µ µ =, for the Euler,.5 σ σ σ 3µ/α σ/x ] σ =, for the Milstein. Compared to the first order condition with the Euler approximation, the Milstein approximation adds an additional term σ 3µ/α σ /x that incorporates the dependence of the σ σ diffusion function σ/x on x, and the first order condition with the exact transition adds the 5

16 term σ µ σ /xα + µ /ασ /x that incorporates the dependence on both diffusion and σ drift functions σ /x and µ /xα on x. When diffusion function is misspecified but drift specification is correct, the solution α with the exact or Milstein approximation is not the true drift parameter α generally. However, the solution of.5 with the Euler approximation would be the same as the true parameter. In this sense, the Euler approximation gives a robust drift estimator under incorrect diffusion specification, which is useful when we are interested in finding correct drift specification rather than the closest diffusion process to the true process. When diffusion specification is correct σ = σ, from Corollary., the first order condition for α is simplified to [ ] EA µ/α f /α = E σ µ µ = for the exact, Euler, Milstein approximations. Considering this result, we can expect that the MLE of drift parameters would be less sensitive to the choice of an approximate density in this case. We study the asymptotic behavior of the ML estimators and the relationship of their asymptotic properties with the primary and secondary criteria in the following section. 3 Information theoretic analysis of maximum likelihood estimation Suppose we have N + observations {X i } N i= of the diffusion X from time zero to = N at a non-random time interval. Denote the log-likelihood function N L log p, X i, X i i= of the true diffusion process, and N Lθ log p, X i, X i ; θ i= 6

17 for Mθ conditional on X. We suppress the dependency of the log-likelihoods on N and or equivalently and for notational simplicity. he MLE ˆθ is defined by ˆθ arg max Lθ. θ Θ When Mθ is misspecified, Lθ and ˆθ are also called the quasi-log-likelihood function and the quasi-mle respectively. he exact transition density pt, x, y; θ of Mθ is known in closed-form for only a few diffusion processes such as the Ornstein-Uhlenbeck or Feller s square root processes. When pt, x, y; θ is not available in closed-form, we must use an approximate transition density such as the Euler or Milstein approximate transitions, the simulated likelihoods of Brandt and Santa-Clara, or the closed-form approximation of Aït-Sahalia based on the Hermite expansion. he log-likelihood ratio L Lθ has an interesting relationship with the primary criterion. If we fix and consider the limit of average log-likelihood ratio plim N L Lθ = plim L Lθ = E log p, X, X = D θ p, X, X ; θ as, then the sequential limit becomes the primary criterion. lim plim N L Lθ = C P β he following theorem shows that the MLE ˆθ is consistent to the minimizer θ = α, β in.-., which is defined as the pseudo-true value. heorem 3.. Suppose that Assumptions -5 are satisfied. Under 3 as and, the MLE ˆθ converges in probability to the solution θ = α, β of the system of equations in.-.. he above theorem shows that the probability limit of ˆβ does not depend on the specification of drift functions, and is robust to the choice of the approximation method for the transition density. But the limit of ˆα depends on the transition density approximation methods as well as diffusion specification. his explains the high sensitivity of drift estimators to the choice of an approximate transition in practice. We also can see that when 7

18 the diffusion function is misspecified but the drift function is correctly specified, ˆα may not converge the true parameter value because the true parameter value may not minimize the secondary criterion. Based on this theorem, we illustrate the geometry of the ML estimation of diffusion models. Geometrically, our model Mθ has the structure of a fiber bundle. A fiber bundle E = B F consists of a base space B, a fiber space F, and a projection of E onto B that defines how to attach fibers on B. It is also denoted simply by F E π B. For us, the total space E of the fiber bundle represents all possible combinations of drift and diffusion specifications defined in Mθ. hus, we can denote the total space as Eθ indexed by θ Θ. he base space B is the space of all diffusion functions σ ; β of Mθ. We denote the base space as Bβ indexed by β. Over B, we define the fiber space F of drift functions µ ; α of Mθ attached to each element β in B. We denote the fiber space as F α indexed by α. Finally, the mapping π : E B of the fiber bundle Eθ onto Bβ defines how each fiber space is attached to B. In our case, π is the projection of θ = α, β in E onto the second factor β in B, in which sense the fiber bundle is a trivial bundle. From these definitions, the fiber attached to β on B can be written as π β, which represents the space of all the diffusion processes in Mθ with diffusion function specification σ ; β. On the right side of Figure 5, we show the fiber bundle E with the fiber space F over the base space B. he primary criterion C P shown in Figure 5 measures the information divergence from the true process µ, σ to σ ; β in the base space Bβ. he σ-orbit represents the space of all diffusion functions with equal divergence in C P from the true diffusion. he divergence C P is defined on the horizontal plane that σ-orbit belongs to, which represents that C P does not depend on F and is determined by diffusion functions only. Under the primary criterion C P, the closest diffusion specification in Bβ is shown as β in Figure 5. hen given β, consider the fiber π β. he secondary criterion C S measures the divergence from µ, σ to an element of π β, and the minimizer of C S is shown as α. herefore, we can think the minimization with respect to C P and C S as two distinct projections, and the pseudo-true value θ = α, β is obtained by sequential application of the two projections. We now study the asymptotic distribution of the MLE. For correctly specified models, the primary and secondary criteria are closely related to the asymptotic variance of the MLE. Suppose we have σ = σ ; β and α = µ ; α for the true parameter θ = α, β. When we have as, Jeong and Park 3 show in their heorem 8

19 E = B F F α fiber µ, σ C S α σ σ-orbit C P β Bβ base Figure 5: he geometry of the ML estimation. On the right, the model Mθ is shown as the fiber bundle E = B F with the fiber space F of the drift functions µ ; α over the base space B of the diffusion functions σ ; β and the projection π : E B that defines how the fibers are attached on B. he primary criterion C P measures the information divergence from the true process µ, σ to σ ; β in Bβ. he σ-orbit represents the space of all diffusion functions with equal divergence in C P from the true diffusion function σ. Under C P, the closest member of the diffusion functions in Bβ is shown as β. he secondary criterion C S measures the divergence from µ, σ to an element of the fiber π β given β, and the minimizer of C S is shown as α. he pseudo-true value θ = α, β is determined by the sequential minimization of C P and C S. 9

20 4.3 that [ ˆβ β d σ/β σ ] /β N, E σ, 3. [ ˆα α d µ/α µ ] /α N, E σ, 3. where σ /β σx ; β / β and µ /α µx ; α / α. In 3.-3., the asymptotic variances of the ML estimators are given by the inverse of the matrices σ/β σ /β µ/α µ /α E and E σ σ 3.3 for ˆβ and ˆα respectively. We show that the matrices in 3.3 are derived from the limits as of the Fisher information matrices calculated under discrete sampling with fixed. For a fixed sampling interval >, the Fisher information matrix G is induced by the KLIC D t θ by the second derivative { } D t θ G t θ =, θ i θ j where { } is the i, j element of G t. herefore, the asymptotic variance of the ML estimator ˆθ under fixed would be given by ˆθ θ d N, G θ, as. Based on this result, we consider the limit of the -sequence of the Fisher { information matrix G θ for β as. Let G p β lim D tθ t be the limit, then we have β i β j } { G P C P } { β } Af / β = = E. 3.4 β i β j β i β j Similarly, for the Fisher information for α, we consider the limit G S θ lim t { D t θ α i α j },

21 which is given by G S θ = { C S } θ α i α j = { E A } f /. 3.5 α i α j his shows that the asymptotic Fisher information matrices are given by the curvatures of C P and C S at θ. In the following theorem, we show that the curvatures given in 3.4 and 3.5 coincide with the inverse asymptotic variance matrices in 3.3 under and. heorem 3.. Suppose that Assumptions -5 are satisfied, and Mθ is correctly specified with the true parameter θ = α, β. Let p, q be the constants defined in Assumption. Under 4pq+ as and, we have where G P 3.5 respectively. ˆβ β d N, G P β, ˆα α d N, G S θ, and G S are the asymptotic Fisher information matrices defined in 3.4 and For misspecified models, the second Bartlett identity does not hold because the Fisher information matrix is different from the negative expected Hessian matrix. herefore, we derive the Fisher information and expected Hessian matrices separately. We first give the asymptotic expansion of the score functions. heorem 3.3. Suppose that Assumptions -5 are satisfied. For misspecified models, under 3 as and, we have the explicit form of the score functions { } Lθ s β θ = β i { } Lθ s α θ = α i + ˆ ˆ ˆ σ σ σ /β dt + O p + O p µ /α σ µ µ + σ σ, 3.6 µ /xα + µ /ασ /x σ σ µ /α σ dw t + O p + O p. 3.7 he asymptotic leading term of s β θ is O p /, and it is generally larger than the second term which is O p. But at β = β, the first term becomes O p / because dt

22 [ ] E σ σ σ /β = from.3. herefore, the first term in 3.6 is larger than the second term only if. Similarly, the first term in the asymptotic expansion of s α θ given 3.7 would be O p at θ = θ. his term becomes the largest term under. We will assume for the rest of the paper to derive the asymptotic distribution of the MLE. As shown in Jeong and Park 3, the asymptotic order of s β θ for correctly specified models is O p /, or equivalently O p N. From heorem 3.3, we find that the score function s β θ for misspecified models becomes larger than O p N. Intuitively, for correctly specified models, we identify the true parameter by finding β that vanishes the first term in 3.6, but for misspecified models, we minimize the primary criterion to find the pseudo-true parameter β that vanishes the first term in 3.6 on the long run average. As shown later in this section, this fact reduces the convergence rate of ˆβ to for misspecified models compared to the rate N of ˆβ for correctly specified models. In the following theorem, we calculate the asymptotic Fisher information matrix from the asymptotic variances of the score functions I β β plim, s βθs β θ and the asymptotic expected Hessian matrices where h β θ = function H β β and the speed density, I α θ plim, s αθs α θ, plim βθ, h, H α θ plim αθ, h, { } { Lθ β i β j and h α θ = Lθ α i α j }. We first define the derivative of the scale ˆ x s x exp m x x µ v σ dv v σ xs x, for any x D fixed. With the scale function s x = x x s w dw, the transformed diffusion s X of the true diffusion X becomes a driftless diffusion, and the stationary density of the diffusion X is given by m x/ D m xdx.

23 heorem 3.4. Suppose that Assumptions -5 are satisfied. Let as and. For a misspecified diffusion model with the pseudo-true parameter θ = α, β, we have the asymptotic Fisher information matrices given by I β β = EV β V β, I α θ = EV α V α, where V β = V β X and V α = V α X are defined from the functions where ˆ x V β x = σ xs x A l /β v m vdv, x V α x = σ xs x ˆ x A l /β v = σ v σ vσ /β v, A l/α v = µ /αv σ v µ v µv + x A l/αv σ v σ v m vdv + σ x σ x µ /αx, µ /xα v + µ /αvσ /x v σ v he asymptotic expected Hessian matrices are the same as the negative second derivatives of the primary and secondary criteria, and their explicit forms are given by { H β β C P β } = β i β j { H α θ C S θ } = = α i α j EA l/αα = E [ µ/αα σ = EA l /ββ = [ ] E σ 4 σ /β σ + σ + σ /β σ, /ββ µ µ µ /αµ /α σ + σ σ µ /xαα + µ ] /αα σ /x σ. From the above theorem, we can get the asymptotic distribution of the MLE. Corollary 3.. Suppose that Assumptions -5 are satisfied. Let as and. For a misspecified diffusion model with the pseudo-true parameter θ = α, β,. 3

24 we have ˆβ β d N, H β β I β β H β β, ˆα α d N, H α θ I α θ H α θ. From the corollary, we can get the explicit form of the asymptotic distributions of the MLE in terms of the model s drift and diffusion functions. It also clarifies that the differences of the asymptotic behaviors of the MLE under correct specification and misspecification. his result also shows that the likelihood ratio test statistics would not have the Chi-squared distribution asymptotically. Consequently, the Lagrange Multiplier, and Wald test statistics would not be distributed as the Chi-squared distribution asymptotically either. 4 Conclusion he information criteria developed in this paper provide the information theoretic foundation for studying asymptotic behaviors of the MLE and likelihood based test statistics for high frequency data. he new framework also allows us to analyze the estimation and inference problems geometrically. We surmise that we can derive the higher order properties of the MLE and other general estimators on this foundation using the differential geometric methods developed in Efron 978 and Amari 98. Our new framework would make the development of these theories easier and the interpretation of them more intuitive. Another important issue is asymptotic analysis of inference with decreasing sampling interval. Since the conventional form of the likelihood based tests would not be valid under model misspecification, a robust test statistic would be desirable. Based on the results of this paper, we can develop a robust version of the likelihood based test statistics and derive its limiting distribution. Also, asymptotic analysis of specification testing for diffusion models in our new framework is another important topic. hese topics will be investigated in the future research. We did not consider in this paper more general processes such as jump diffusions, or Lévy processes. Although these processes are important, the development of the information theory for these processes would warrant a separate paper, and we leave it as a future research topic. 4

25 Appendices A Notational convention We summarize the notional convention used in this paper for reference. For a function ft, x, y, we define the functions f / x f, x, x, f /t x f, x, x, t with the single argument x as the functions evaluated at y = x and t =. Other higher order derivatives such as f /y x, f /yy x, or f /yt x are defined similarly. For a function fx with the single argument x, we define f fx omitting the argument X. For example, we write l/ = l / X, µ = µx, σ = σx. Also, we suppress θ for the functions such as the log-likelihoods and the drift and diffusion functions of Mθ. Using the notations defined above, we define µ /x x µx/ x, σ/x x σ x/ x, and σ/xx x σ x/ x, and denote µ /x µ /x X, σ /x σ /x X, σ /xx σ /xx X. Other functions and their derivatives without arguments are defined in the same manner. For the infinitesimal generator A, we define Af / Af / X, Af /y Af /y X = Af, X, X y, Af /yβ Af /yx. β When these functions are used as integrands, the argument omitted is X t rather than X. 5

26 B Proofs B. Proof of heorem. Proof. We omit θ in D t θ for notational simplicity. he following proof applies to all θ Θ. Define the conditional expectation g t x E x {l t, x, X t lt, x, X t }, where E x is the conditional expectation with respect to X t given X = x. hen we can rewrite D t = E{l t, X, X t lt, X, X t } as D t = Eg t X. Under Assumption, we get lim t D t = E lim t g t X by the dominated convergence theorem. herefore, we need show that lim t g t x = Af / x. From.3 and.5, we can rewrite g t x = E x { l t, x, X t lt, x, X t }/t = E x ft, x, X t /t = Ex ft, x, X t f / x. t he last equality above is from f / x = given by Assumption 3. hen we get the limit E x ft, x, X t f / x lim = Af t t / x by the definition of the infinitesimal generator A. For the explicit expression in.6, we first apply A = t + µ y y + σ y to ft, x, y = l t, x, y lt, x, y to evaluate Aft, x, y at t = and y = x. As the result, y 6

27 we would get From Assumption 3, we have and herefore, Af / = f /t + µ f /y + σ f /yy = l /t l /t + µ l /y l /y + σ l /yy l /yy. f /t x = log σ x σ x, f /yx =, f /yy x = σ x + σ x. Af / x = log σ x σ x + σ x σ x + σ, x which proves the second equality in the theorem. B. Proof of heorem. Proof. For both D t θ and D tθ, we omit θ for notational simplicity. We define the conditional expectation g t x Ex ft, x, X t f / x t given X = x as in the proof of heorem.. D = EAf / by heorem., we can write D lim t D t D t B. Since D t = Eg t X by definition and = lim E g tx Af / X. t t Since lim t E gtx Af / X g tx t = E lim Af / X t t theorem, we only need to show from the dominated convergence g t x Af / x lim = t t A f / x. B. 7

28 o show the above equality, we use Dynkin s formula Dynkin 965 p.33, Øksendal 998 heorem 7.4.., ˆ τ E x ϕx τ = ϕx + E x AϕX s ds, for a twice continuously differentiable function ϕ on R and τ is a stopping time such that E x τ <. By applying Dynkin s formula to the numerator of g t x in B. after setting the stopping time τ = t, we get ˆ t E x ft, x, X t f / x = E x Afs, x, X s ds = ˆ t E x Afs, x, X s ds. B.3 We apply Dynkin s formula twice sequentially to E x Afs, x, X s to have ˆ s E x Afs, x, X s = Af / x + E x A fu, x, X u du herefore, B.3 becomes = Af / x + ˆ s ˆ v {A f / x + E x A 3 fv, x, X v dv} du = Af / x + sa f / x + E x ˆ s ˆ v E x ft, x, X t f / x = taf / x + t A f / x + E x ˆ t From this, we can write A 3 fv, x, X v dv du. ˆ s ˆ v A 3 fv, x, X v dv du ds. g t x Af / x t = A f / x + t Ex ˆ t ˆ s ˆ v A 3 fv, x, X v dv du ds. B.4 Since E x t s v A3 fv, x, X v dv du ds = Ot 3 for any x D from Assumption, we get B. from B.4 by taking t on the both sides. For the second equality in.7, we apply 8

29 A = µ /x yµ y + µ /xx y σ y y { + µ y µ y + σ /x y + σ y µ /x y + σ /xx y } 4 y + σy µ y + σ /x y 3 y 3 + σ4 y 4 4 y t + µ y y t + σ y y t, to ft, x, y = l t, x, y lt, x, y. hen we evaluate A ft, x, y at t = and y = x to get A f / = µ µ + σ /x + σ µ + σ /x where + σ µ /x + σ /xx σ + σ B B + σ4 4 C C + D D + µ A A + σ E E, A = l /yt = µ σ 3σ /x 4σ, B = l /yyy = 3σ /x σ 4, C = l /yyyy = σ /xx σ σ /x σ 6, D = l /tt = µ σ µ /x + µσ /x σ 3σ /x 6σ + σ /xx 4, E = l /yyt = σ µ /x µσ /x σ + 3σ /x 4σ 3σ /xx, 4 and the terms A, B,..., E are defined similarly with µ and σ from the true diffusion. he explicit forms of these terms are given from the corresponding terms A j,..., E j in Section 3 of Choi, Jeong, and Park 3 derived for the exact transition density. 9

30 B.3 Proof of Corollary. Proof. For C S θ, we omit θ for notational simplicity. Since l /yyy = l /yyy and l /yyyy = l/yyyy from σ = σ, we have C S = E From σ = σ, we also have [ l /yt l /yt µ + l /tt l /tt + σ l /yyt l /yyt l/yt l /yt = µ µ σ, l/tt l /tt = µ /x µ /x µ µ l/yyt l /yyt = µ /x µ /x σ σ µ σ /x σ 3 µ σ /x + µσ /x σ 3 ] µσ /x σ σ.., herefore, C S = E [ µ µµ σ µ ] µ σ = [ E µ µ ] σ. For Euler approximation, we use the explicit forms of l /yt, l /tt, and l /yyt which correspond to A j, D j, E j respectively derived for the Euler approximate transition densities in Section 3 of Choi, Jeong, and Park 3. From l/yt l /yt = µ µ σ l/yyt l /yyt =., l/tt l /tt = µ µ σ, we get C S = E [ µ µµ σ µ ] µ σ = [ E µ µ ] σ. For Milstein approximation, we use the explicit forms of l /yt, l /tt, and l /yyt which correspond to A j, D j, E j respectively derived for the Milstein approximate transition densities 3

31 in Section 3 of Choi, Jeong, and Park 3. From l/yt l /yt = µ µ σ, l/tt l /tt = µ µ σ l/yyt l /yyt = 3µ µσ/x σ 4, + 3µ µσ/x σ, we get C S = E [ µ µµ σ µ ] µ σ = [ E µ µ ] σ. B.4 Proof of heorem 3. Proof. he limiting distribution is derived in heorem 4.3 in Jeong and Park 3. We only need to prove that the inverse asymptotic variances of diffusion and drift parameter estimators are the same as the second derivatives of the primary and secondary criteria respectively. For the Fisher information matrix for ˆβ, we first get the first derivative with respect to β i C P β β i = [ ] E σ σ σ σ β i by the dominated convergence theorem. Similarly, we get the second derivative with respect to β j C P β β i β j = E [ σ β j σ 4 σ σ σ + β i σ σ σ σ σ 4 + β j β i σ σ σ ] σ. β i β j Since σ = σ at β = β, we get g P ijβ = C P β β i β j = E [ σ σ σ ] σ 6 = [ σ β i β j E σ ] σ 4, β i β j which is the inverse of the asymptotic variance of the diffusion parameter estimator ˆβ in heorem 4.3 in Jeong and Park 3. For the Fisher information matrix for ˆα, we first collect the part of C S θ that depends 3

32 on α with σ = σ. hat part is given by [ ] E σ µ µ µ. From the dominated convergence theorem, the second derivative is given by g S ijθ = C S θ α i α j = E [ ] µ µ σ, α i α j which is the inverse of the asymptotic variance of the drift parameter estimator ˆα in heorem 4.3 in Jeong and Park 3. B.5 Proof of heorem 3. Proof. Lemma B.7 of Choi, Jeong, and Park 3 gives the asymptotic first order condition that the probability limit θ of the MLE satisfies under 3 as and. he asymptotic first order order condition in their Lemma B.7 can be rewritten as EA l /β = EA l/α =, B.5 which is the first order condition of minimizing the primary and secondary criteria. herefore, θ also minimizes the criteria. B.6 Proof of heorem 3.3 Proof. Define the operator B σ y y. Let A l /β A l / / β and σ /β σ / β and define other derivatives such as A l/β, B l/β, A l/α, AB l /α similarly. When these functions are used in integrands, the argument is X t which is omitted for notational simplicity. From the proof of heorem 3. of Choi, Jeong, and Park 3, under 3, the score functions of β and α are given by s β θ = s α θ = ˆ ˆ A l /β dt + A l/α dt + ˆ ˆ A l/β dt + ˆ B l/β dv t + O p, AB l /α dw t + O p + O p, B.6 B.7 3

33 where V is a standard Brownian motion independent of W. he asymptotic orders of the terms in the above theorem are ˆ ˆ A l /β dt = O p A l/α dt = O p,, ˆ ˆ A l/β dt = O p, AB l /α dw t = O p. ˆ B l/β dv t = O p, he explicit forms of the leading terms of s β θ and s α θ are calculated from A l /β = σ σ σ /β, A l/α = µ /α σ µ µ + AB l /α = σ µ /α σ. σ σ µ /xα + µ /ασ /x σ We get the main result of the theorem by plugging in these terms in B.6-B.7., B.7 Proof of heorem 3.4 From B.6-B.7, under as and, we have ˆ s β θ p A l /β dt and ˆ s α θ A l/α dt + ˆ AB l /α dw t p. herefore, we only need to get the asymptotic distributions of ˆ A l /β dt for s β θ and ˆ A l/α dt + ˆ AB l /α dw t 33

34 for s α θ. We can derive the limiting distributions of A l /β dt and A l/α dt from Lemma B. of Choi, Jeong, and Park 3. heir lemma proves that for fx t such that EfX t =, we have where ˆ fx t dt d EVX t W, ˆ x Vx = σ xs x fvm vdv x for x D. Using the fact EA l /β = EA l/α = from B.5, we apply their lemma to A l /β dt and A l/α dt, which gives us s β θ d N, EV β V β, s α θ d N, EV α V α, B.8 B.9 where V β = V β X and V α = V α X are defined from the functions ˆ x V β x = σ xs x A l /β v m vdv, x ˆ x V β x = σ xs x x A l/α v m vdv + σ x µ /αx σ x. For the expected Hessian matrices, we use the relationships h β θ = ˆ A l /ββ dt + O p + O p, h α θ = ˆ A l/αα dt + O p + O p, 34

35 which are shown in the proof of Lemma B.6 of Choi, Jeong, and Park 3. From as and, and h βθ h αθ ˆ ˆ ˆ ˆ A l /ββ dt A l/αα dt A l /ββ dt p EA l /ββ, p, p, A l/αα dt p EA l/αα, as, the asymptotic expected Hessian matrices of β and α are the same as the negative second derivatives of the primary and secondary criteria respectively. he explicit forms are calculated by taking the second derivatives of the criteria. B.8 Proof of Corollary 3. Proof. Under as and, we get ˆβ β h βθ s βθ p, ˆα α h αθ s α θ p, from the proof of heorem 3. of Choi, Jeong, and Park 3. We only need to show the asymptotic distributions of h βθ s βθ and h αθ s α θ. heir asymptotic distributions are from the results of heorem 3.4 and B.8-B.9. 35

36 References Aït-Sahalia, Y. : Maximum likelihood estimation of discretely sampled diffusions: a closed-form approximation approach, Econometrica, 7, 3 6. Amari, S. I. 98: Differential geometry of curved exponential families - curvatures and information loss, Annals of Statistics,, Bandi, F., and P. Phillips 3: Fully nonparametric estimation of scalar diffusion models, Econometrica, 7, Brandt, M., and P. Santa-Clara : Simulated likelihood estimation of diffusions with an application to exchange rate dynamics in incomplete markets, Journal of Financial Economics, 63, 6. Choi, H.-S., M. Jeong, and J. Y. Park 3: An asymptotic analysis of likelihoodbased diffusion model selection using high frequency data, Journal of Econometrics, 783, Cox, J., J. Ingersoll, Jr, and S. Ross 985: A theory of the term structure of interest rates, Econometrica, 53, Csiszár, I. 967a: Information type measures of difference of probability distributions and indirect observations, Studia Scientiarum Mathematicarum Hungarica,, b: On topological properties of f-divergence, Studia Scientiarum Mathematicarum Hungarica,, : I-divergence geometry of probability distributions and minimization problems, Annals of Probability, 3, Dynkin, E. B. 965: Markov processes. Springer-Verlag. Efron, B. 978: he geometry of exponential families, Annals of Statistics, 6, Jeong, M., and J. Y. Park : Asymptotic expansions and second order efficiency of maximum likelihood estimators for diffusion models, Working Paper, Indiana University. 36

37 3: Asymptotic theory of maximum likelihood estimator for diffusion model, Working Paper, Indiana University. Karatzas, I., and S. Shreve 99: Springer, nd edn. Brownian Motion and Stochastic Calculus. Kullback, S., and R. Leibler 95: On information and sufficiency, Annals of Mathematical Statistics,, Øksendal, B. K. 998: Stochastic Differential Equations: An Introduction with Applications. Springer. White, H. 98: Maximum likelihood estimation of misspecified models, Econometrica, 5, 6. 37

Graduate Econometrics I: Maximum Likelihood II

Graduate Econometrics I: Maximum Likelihood II Graduate Econometrics I: Maximum Likelihood II Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood

More information

Estimation of Dynamic Regression Models

Estimation of Dynamic Regression Models University of Pavia 2007 Estimation of Dynamic Regression Models Eduardo Rossi University of Pavia Factorization of the density DGP: D t (x t χ t 1, d t ; Ψ) x t represent all the variables in the economy.

More information

I forgot to mention last time: in the Ito formula for two standard processes, putting

I forgot to mention last time: in the Ito formula for two standard processes, putting I forgot to mention last time: in the Ito formula for two standard processes, putting dx t = a t dt + b t db t dy t = α t dt + β t db t, and taking f(x, y = xy, one has f x = y, f y = x, and f xx = f yy

More information

A nonparametric method of multi-step ahead forecasting in diffusion processes

A nonparametric method of multi-step ahead forecasting in diffusion processes A nonparametric method of multi-step ahead forecasting in diffusion processes Mariko Yamamura a, Isao Shoji b a School of Pharmacy, Kitasato University, Minato-ku, Tokyo, 108-8641, Japan. b Graduate School

More information

Parameter Estimation

Parameter Estimation Parameter Estimation Consider a sample of observations on a random variable Y. his generates random variables: (y 1, y 2,, y ). A random sample is a sample (y 1, y 2,, y ) where the random variables y

More information

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider

More information

Lecture on Parameter Estimation for Stochastic Differential Equations. Erik Lindström

Lecture on Parameter Estimation for Stochastic Differential Equations. Erik Lindström Lecture on Parameter Estimation for Stochastic Differential Equations Erik Lindström Recap We are interested in the parameters θ in the Stochastic Integral Equations X(t) = X(0) + t 0 µ θ (s, X(s))ds +

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

A MODEL FOR THE LONG-TERM OPTIMAL CAPACITY LEVEL OF AN INVESTMENT PROJECT

A MODEL FOR THE LONG-TERM OPTIMAL CAPACITY LEVEL OF AN INVESTMENT PROJECT A MODEL FOR HE LONG-ERM OPIMAL CAPACIY LEVEL OF AN INVESMEN PROJEC ARNE LØKKA AND MIHAIL ZERVOS Abstract. We consider an investment project that produces a single commodity. he project s operation yields

More information

The loss function and estimating equations

The loss function and estimating equations Chapter 6 he loss function and estimating equations 6 Loss functions Up until now our main focus has been on parameter estimating via the maximum likelihood However, the negative maximum likelihood is

More information

Understanding Regressions with Observations Collected at High Frequency over Long Span

Understanding Regressions with Observations Collected at High Frequency over Long Span Understanding Regressions with Observations Collected at High Frequency over Long Span Yoosoon Chang Department of Economics, Indiana University Joon Y. Park Department of Economics, Indiana University

More information

Econometrics I, Estimation

Econometrics I, Estimation Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

Least Squares Estimators for Stochastic Differential Equations Driven by Small Lévy Noises

Least Squares Estimators for Stochastic Differential Equations Driven by Small Lévy Noises Least Squares Estimators for Stochastic Differential Equations Driven by Small Lévy Noises Hongwei Long* Department of Mathematical Sciences, Florida Atlantic University, Boca Raton Florida 33431-991,

More information

CALCULATION METHOD FOR NONLINEAR DYNAMIC LEAST-ABSOLUTE DEVIATIONS ESTIMATOR

CALCULATION METHOD FOR NONLINEAR DYNAMIC LEAST-ABSOLUTE DEVIATIONS ESTIMATOR J. Japan Statist. Soc. Vol. 3 No. 200 39 5 CALCULAION MEHOD FOR NONLINEAR DYNAMIC LEAS-ABSOLUE DEVIAIONS ESIMAOR Kohtaro Hitomi * and Masato Kagihara ** In a nonlinear dynamic model, the consistency and

More information

Brief Review on Estimation Theory

Brief Review on Estimation Theory Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on

More information

A PARAMETRIC MODEL FOR DISCRETE-VALUED TIME SERIES. 1. Introduction

A PARAMETRIC MODEL FOR DISCRETE-VALUED TIME SERIES. 1. Introduction tm Tatra Mt. Math. Publ. 00 (XXXX), 1 10 A PARAMETRIC MODEL FOR DISCRETE-VALUED TIME SERIES Martin Janžura and Lucie Fialová ABSTRACT. A parametric model for statistical analysis of Markov chains type

More information

is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications.

is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications. Stat 811 Lecture Notes The Wald Consistency Theorem Charles J. Geyer April 9, 01 1 Analyticity Assumptions Let { f θ : θ Θ } be a family of subprobability densities 1 with respect to a measure µ on a measurable

More information

Graduate Econometrics I: Maximum Likelihood I

Graduate Econometrics I: Maximum Likelihood I Graduate Econometrics I: Maximum Likelihood I Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood

More information

3 Applications of partial differentiation

3 Applications of partial differentiation Advanced Calculus Chapter 3 Applications of partial differentiation 37 3 Applications of partial differentiation 3.1 Stationary points Higher derivatives Let U R 2 and f : U R. The partial derivatives

More information

ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME

ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME SAUL D. JACKA AND ALEKSANDAR MIJATOVIĆ Abstract. We develop a general approach to the Policy Improvement Algorithm (PIA) for stochastic control problems

More information

i=1 h n (ˆθ n ) = 0. (2)

i=1 h n (ˆθ n ) = 0. (2) Stat 8112 Lecture Notes Unbiased Estimating Equations Charles J. Geyer April 29, 2012 1 Introduction In this handout we generalize the notion of maximum likelihood estimation to solution of unbiased estimating

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Interest Rate Models:

Interest Rate Models: 1/17 Interest Rate Models: from Parametric Statistics to Infinite Dimensional Stochastic Analysis René Carmona Bendheim Center for Finance ORFE & PACM, Princeton University email: rcarmna@princeton.edu

More information

Bernardo D Auria Stochastic Processes /10. Notes. Abril 13 th, 2010

Bernardo D Auria Stochastic Processes /10. Notes. Abril 13 th, 2010 1 Stochastic Calculus Notes Abril 13 th, 1 As we have seen in previous lessons, the stochastic integral with respect to the Brownian motion shows a behavior different from the classical Riemann-Stieltjes

More information

OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS

OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS APPLICATIONES MATHEMATICAE 29,4 (22), pp. 387 398 Mariusz Michta (Zielona Góra) OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS Abstract. A martingale problem approach is used first to analyze

More information

Least Squares Bias in Time Series with Moderate. Deviations from a Unit Root

Least Squares Bias in Time Series with Moderate. Deviations from a Unit Root Least Squares Bias in ime Series with Moderate Deviations from a Unit Root Marian Z. Stoykov University of Essex March 7 Abstract his paper derives the approximate bias of the least squares estimator of

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han and Robert de Jong January 28, 2002 Abstract This paper considers Closest Moment (CM) estimation with a general distance function, and avoids

More information

EM Algorithm II. September 11, 2018

EM Algorithm II. September 11, 2018 EM Algorithm II September 11, 2018 Review EM 1/27 (Y obs, Y mis ) f (y obs, y mis θ), we observe Y obs but not Y mis Complete-data log likelihood: l C (θ Y obs, Y mis ) = log { f (Y obs, Y mis θ) Observed-data

More information

Short-time expansions for close-to-the-money options under a Lévy jump model with stochastic volatility

Short-time expansions for close-to-the-money options under a Lévy jump model with stochastic volatility Short-time expansions for close-to-the-money options under a Lévy jump model with stochastic volatility José Enrique Figueroa-López 1 1 Department of Statistics Purdue University Statistics, Jump Processes,

More information

6. MAXIMUM LIKELIHOOD ESTIMATION

6. MAXIMUM LIKELIHOOD ESTIMATION 6 MAXIMUM LIKELIHOOD ESIMAION [1] Maximum Likelihood Estimator (1) Cases in which θ (unknown parameter) is scalar Notational Clarification: From now on, we denote the true value of θ as θ o hen, view θ

More information

Ch. 5 Hypothesis Testing

Ch. 5 Hypothesis Testing Ch. 5 Hypothesis Testing The current framework of hypothesis testing is largely due to the work of Neyman and Pearson in the late 1920s, early 30s, complementing Fisher s work on estimation. As in estimation,

More information

1 Brownian Local Time

1 Brownian Local Time 1 Brownian Local Time We first begin by defining the space and variables for Brownian local time. Let W t be a standard 1-D Wiener process. We know that for the set, {t : W t = } P (µ{t : W t = } = ) =

More information

LAN property for sde s with additive fractional noise and continuous time observation

LAN property for sde s with additive fractional noise and continuous time observation LAN property for sde s with additive fractional noise and continuous time observation Eulalia Nualart (Universitat Pompeu Fabra, Barcelona) joint work with Samy Tindel (Purdue University) Vlad s 6th birthday,

More information

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain 0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher

More information

Partial Differential Equations with Applications to Finance Seminar 1: Proving and applying Dynkin s formula

Partial Differential Equations with Applications to Finance Seminar 1: Proving and applying Dynkin s formula Partial Differential Equations with Applications to Finance Seminar 1: Proving and applying Dynkin s formula Group 4: Bertan Yilmaz, Richard Oti-Aboagye and Di Liu May, 15 Chapter 1 Proving Dynkin s formula

More information

Introduction to Diffusion Processes.

Introduction to Diffusion Processes. Introduction to Diffusion Processes. Arka P. Ghosh Department of Statistics Iowa State University Ames, IA 511-121 apghosh@iastate.edu (515) 294-7851. February 1, 21 Abstract In this section we describe

More information

Exercises Chapter 4 Statistical Hypothesis Testing

Exercises Chapter 4 Statistical Hypothesis Testing Exercises Chapter 4 Statistical Hypothesis Testing Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans December 5, 013 Christophe Hurlin (University of Orléans) Advanced Econometrics

More information

Asymptotic inference for a nonstationary double ar(1) model

Asymptotic inference for a nonstationary double ar(1) model Asymptotic inference for a nonstationary double ar() model By SHIQING LING and DONG LI Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong maling@ust.hk malidong@ust.hk

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Asymptotic properties of maximum likelihood estimator for the growth rate for a jump-type CIR process

Asymptotic properties of maximum likelihood estimator for the growth rate for a jump-type CIR process Asymptotic properties of maximum likelihood estimator for the growth rate for a jump-type CIR process Mátyás Barczy University of Debrecen, Hungary The 3 rd Workshop on Branching Processes and Related

More information

ECON 3150/4150, Spring term Lecture 6

ECON 3150/4150, Spring term Lecture 6 ECON 3150/4150, Spring term 2013. Lecture 6 Review of theoretical statistics for econometric modelling (II) Ragnar Nymoen University of Oslo 31 January 2013 1 / 25 References to Lecture 3 and 6 Lecture

More information

ẋ = f(x, y), ẏ = g(x, y), (x, y) D, can only have periodic solutions if (f,g) changes sign in D or if (f,g)=0in D.

ẋ = f(x, y), ẏ = g(x, y), (x, y) D, can only have periodic solutions if (f,g) changes sign in D or if (f,g)=0in D. 4 Periodic Solutions We have shown that in the case of an autonomous equation the periodic solutions correspond with closed orbits in phase-space. Autonomous two-dimensional systems with phase-space R

More information

LAN property for ergodic jump-diffusion processes with discrete observations

LAN property for ergodic jump-diffusion processes with discrete observations LAN property for ergodic jump-diffusion processes with discrete observations Eulalia Nualart (Universitat Pompeu Fabra, Barcelona) joint work with Arturo Kohatsu-Higa (Ritsumeikan University, Japan) &

More information

The multidimensional Ito Integral and the multidimensional Ito Formula. Eric Mu ller June 1, 2015 Seminar on Stochastic Geometry and its applications

The multidimensional Ito Integral and the multidimensional Ito Formula. Eric Mu ller June 1, 2015 Seminar on Stochastic Geometry and its applications The multidimensional Ito Integral and the multidimensional Ito Formula Eric Mu ller June 1, 215 Seminar on Stochastic Geometry and its applications page 2 Seminar on Stochastic Geometry and its applications

More information

Advanced Quantitative Methods: maximum likelihood

Advanced Quantitative Methods: maximum likelihood Advanced Quantitative Methods: Maximum Likelihood University College Dublin 4 March 2014 1 2 3 4 5 6 Outline 1 2 3 4 5 6 of straight lines y = 1 2 x + 2 dy dx = 1 2 of curves y = x 2 4x + 5 of curves y

More information

Statistical Problems Related to Excitation Threshold and Reset Value of Membrane Potentials

Statistical Problems Related to Excitation Threshold and Reset Value of Membrane Potentials Statistical Problems Related to Excitation Threshold and Reset Value of Membrane Potentials jahn@mathematik.uni-mainz.de Le Mans - March 18, 2009 Contents 1 Biological Background and the Problem Definition

More information

Estimation of arrival and service rates for M/M/c queue system

Estimation of arrival and service rates for M/M/c queue system Estimation of arrival and service rates for M/M/c queue system Katarína Starinská starinskak@gmail.com Charles University Faculty of Mathematics and Physics Department of Probability and Mathematical Statistics

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

An exponential family of distributions is a parametric statistical model having densities with respect to some positive measure λ of the form.

An exponential family of distributions is a parametric statistical model having densities with respect to some positive measure λ of the form. Stat 8112 Lecture Notes Asymptotics of Exponential Families Charles J. Geyer January 23, 2013 1 Exponential Families An exponential family of distributions is a parametric statistical model having densities

More information

Change detection problems in branching processes

Change detection problems in branching processes Change detection problems in branching processes Outline of Ph.D. thesis by Tamás T. Szabó Thesis advisor: Professor Gyula Pap Doctoral School of Mathematics and Computer Science Bolyai Institute, University

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation Assume X P θ, θ Θ, with joint pdf (or pmf) f(x θ). Suppose we observe X = x. The Likelihood function is L(θ x) = f(x θ) as a function of θ (with the data x held fixed). The

More information

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3 Brownian Motion Contents 1 Definition 2 1.1 Brownian Motion................................. 2 1.2 Wiener measure.................................. 3 2 Construction 4 2.1 Gaussian process.................................

More information

P n. This is called the law of large numbers but it comes in two forms: Strong and Weak.

P n. This is called the law of large numbers but it comes in two forms: Strong and Weak. Large Sample Theory Large Sample Theory is a name given to the search for approximations to the behaviour of statistical procedures which are derived by computing limits as the sample size, n, tends to

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function

More information

Interval Estimation for AR(1) and GARCH(1,1) Models

Interval Estimation for AR(1) and GARCH(1,1) Models for AR(1) and GARCH(1,1) Models School of Mathematics Georgia Institute of Technology 2010 This talk is based on the following papers: Ngai Hang Chan, Deyuan Li and (2010). Toward a Unified of Autoregressions.

More information

On the quantiles of the Brownian motion and their hitting times.

On the quantiles of the Brownian motion and their hitting times. On the quantiles of the Brownian motion and their hitting times. Angelos Dassios London School of Economics May 23 Abstract The distribution of the α-quantile of a Brownian motion on an interval [, t]

More information

Link lecture - Lagrange Multipliers

Link lecture - Lagrange Multipliers Link lecture - Lagrange Multipliers Lagrange multipliers provide a method for finding a stationary point of a function, say f(x, y) when the variables are subject to constraints, say of the form g(x, y)

More information

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X. Optimization Background: Problem: given a function f(x) defined on X, find x such that f(x ) f(x) for all x X. The value x is called a maximizer of f and is written argmax X f. In general, argmax X f may

More information

Stability of Stochastic Differential Equations

Stability of Stochastic Differential Equations Lyapunov stability theory for ODEs s Stability of Stochastic Differential Equations Part 1: Introduction Department of Mathematics and Statistics University of Strathclyde Glasgow, G1 1XH December 2010

More information

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let

More information

Problem Set 6 Solution

Problem Set 6 Solution Problem Set 6 Solution May st, 009 by Yang. Causal Expression of AR Let φz : αz βz. Zeros of φ are α and β, both of which are greater than in absolute value by the assumption in the question. By the theorem

More information

Introduction to Random Diffusions

Introduction to Random Diffusions Introduction to Random Diffusions The main reason to study random diffusions is that this class of processes combines two key features of modern probability theory. On the one hand they are semi-martingales

More information

Testing Restrictions and Comparing Models

Testing Restrictions and Comparing Models Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by

More information

More Empirical Process Theory

More Empirical Process Theory More Empirical Process heory 4.384 ime Series Analysis, Fall 2008 Recitation by Paul Schrimpf Supplementary to lectures given by Anna Mikusheva October 24, 2008 Recitation 8 More Empirical Process heory

More information

Multivariable Calculus Notes. Faraad Armwood. Fall: Chapter 1: Vectors, Dot Product, Cross Product, Planes, Cylindrical & Spherical Coordinates

Multivariable Calculus Notes. Faraad Armwood. Fall: Chapter 1: Vectors, Dot Product, Cross Product, Planes, Cylindrical & Spherical Coordinates Multivariable Calculus Notes Faraad Armwood Fall: 2017 Chapter 1: Vectors, Dot Product, Cross Product, Planes, Cylindrical & Spherical Coordinates Chapter 2: Vector-Valued Functions, Tangent Vectors, Arc

More information

This chapter reviews properties of regression estimators and test statistics based on

This chapter reviews properties of regression estimators and test statistics based on Chapter 12 COINTEGRATING AND SPURIOUS REGRESSIONS This chapter reviews properties of regression estimators and test statistics based on the estimators when the regressors and regressant are difference

More information

Online Appendix. j=1. φ T (ω j ) vec (EI T (ω j ) f θ0 (ω j )). vec (EI T (ω) f θ0 (ω)) = O T β+1/2) = o(1), M 1. M T (s) exp ( isω)

Online Appendix. j=1. φ T (ω j ) vec (EI T (ω j ) f θ0 (ω j )). vec (EI T (ω) f θ0 (ω)) = O T β+1/2) = o(1), M 1. M T (s) exp ( isω) Online Appendix Proof of Lemma A.. he proof uses similar arguments as in Dunsmuir 979), but allowing for weak identification and selecting a subset of frequencies using W ω). It consists of two steps.

More information

Econ 623 Econometrics II Topic 2: Stationary Time Series

Econ 623 Econometrics II Topic 2: Stationary Time Series 1 Introduction Econ 623 Econometrics II Topic 2: Stationary Time Series In the regression model we can model the error term as an autoregression AR(1) process. That is, we can use the past value of the

More information

Gaussian Process Regression

Gaussian Process Regression Gaussian Process Regression 4F1 Pattern Recognition, 21 Carl Edward Rasmussen Department of Engineering, University of Cambridge November 11th - 16th, 21 Rasmussen (Engineering, Cambridge) Gaussian Process

More information

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle

More information

Generalized Method of Moments Estimation

Generalized Method of Moments Estimation Generalized Method of Moments Estimation Lars Peter Hansen March 0, 2007 Introduction Generalized methods of moments (GMM) refers to a class of estimators which are constructed from exploiting the sample

More information

Universal examples. Chapter The Bernoulli process

Universal examples. Chapter The Bernoulli process Chapter 1 Universal examples 1.1 The Bernoulli process First description: Bernoulli random variables Y i for i = 1, 2, 3,... independent with P [Y i = 1] = p and P [Y i = ] = 1 p. Second description: Binomial

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han Victoria University of Wellington New Zealand Robert de Jong Ohio State University U.S.A October, 2003 Abstract This paper considers Closest

More information

Brownian Motion. An Undergraduate Introduction to Financial Mathematics. J. Robert Buchanan. J. Robert Buchanan Brownian Motion

Brownian Motion. An Undergraduate Introduction to Financial Mathematics. J. Robert Buchanan. J. Robert Buchanan Brownian Motion Brownian Motion An Undergraduate Introduction to Financial Mathematics J. Robert Buchanan 2010 Background We have already seen that the limiting behavior of a discrete random walk yields a derivation of

More information

A note on the asymptotic efficiency of the restricted estimation

A note on the asymptotic efficiency of the restricted estimation Fac. CC. Económicas y Empresariales Universidad de La Laguna Fac. CC. Económicas y Empresariales Univ. de Las Palmas de Gran Canaria A note on the asymptotic efficiency of the restricted estimation José

More information

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( ) Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca November 26th, 2014 E. Tanré (INRIA - Team Tosca) Mathematical

More information

10 Transfer Matrix Models

10 Transfer Matrix Models MIT EECS 6.241 (FALL 26) LECTURE NOTES BY A. MEGRETSKI 1 Transfer Matrix Models So far, transfer matrices were introduced for finite order state space LTI models, in which case they serve as an important

More information

The Wiener Itô Chaos Expansion

The Wiener Itô Chaos Expansion 1 The Wiener Itô Chaos Expansion The celebrated Wiener Itô chaos expansion is fundamental in stochastic analysis. In particular, it plays a crucial role in the Malliavin calculus as it is presented in

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Half of Final Exam Name: Practice Problems October 28, 2014

Half of Final Exam Name: Practice Problems October 28, 2014 Math 54. Treibergs Half of Final Exam Name: Practice Problems October 28, 24 Half of the final will be over material since the last midterm exam, such as the practice problems given here. The other half

More information

MATH Solutions to Probability Exercises

MATH Solutions to Probability Exercises MATH 5 9 MATH 5 9 Problem. Suppose we flip a fair coin once and observe either T for tails or H for heads. Let X denote the random variable that equals when we observe tails and equals when we observe

More information

Before you begin read these instructions carefully.

Before you begin read these instructions carefully. MATHEMATICAL TRIPOS Part IB Friday, 8 June, 2012 1:30 pm to 4:30 pm PAPER 4 Before you begin read these instructions carefully. Each question in Section II carries twice the number of marks of each question

More information

ON THE FIRST TIME THAT AN ITO PROCESS HITS A BARRIER

ON THE FIRST TIME THAT AN ITO PROCESS HITS A BARRIER ON THE FIRST TIME THAT AN ITO PROCESS HITS A BARRIER GERARDO HERNANDEZ-DEL-VALLE arxiv:1209.2411v1 [math.pr] 10 Sep 2012 Abstract. This work deals with first hitting time densities of Ito processes whose

More information

Lecture 25: Review. Statistics 104. April 23, Colin Rundel

Lecture 25: Review. Statistics 104. April 23, Colin Rundel Lecture 25: Review Statistics 104 Colin Rundel April 23, 2012 Joint CDF F (x, y) = P [X x, Y y] = P [(X, Y ) lies south-west of the point (x, y)] Y (x,y) X Statistics 104 (Colin Rundel) Lecture 25 April

More information

Parameter estimation of diffusion models

Parameter estimation of diffusion models 129 Parameter estimation of diffusion models Miljenko Huzak Abstract. Parameter estimation problems of diffusion models are discussed. The problems of maximum likelihood estimation and model selections

More information

The Azéma-Yor Embedding in Non-Singular Diffusions

The Azéma-Yor Embedding in Non-Singular Diffusions Stochastic Process. Appl. Vol. 96, No. 2, 2001, 305-312 Research Report No. 406, 1999, Dept. Theoret. Statist. Aarhus The Azéma-Yor Embedding in Non-Singular Diffusions J. L. Pedersen and G. Peskir Let

More information

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ).

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Estimation February 3, 206 Debdeep Pati General problem Model: {P θ : θ Θ}. Observe X P θ, θ Θ unknown. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Examples: θ = (µ,

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

Economics Division University of Southampton Southampton SO17 1BJ, UK. Title Overlapping Sub-sampling and invariance to initial conditions

Economics Division University of Southampton Southampton SO17 1BJ, UK. Title Overlapping Sub-sampling and invariance to initial conditions Economics Division University of Southampton Southampton SO17 1BJ, UK Discussion Papers in Economics and Econometrics Title Overlapping Sub-sampling and invariance to initial conditions By Maria Kyriacou

More information

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Yingying Dong and Arthur Lewbel California State University Fullerton and Boston College July 2010 Abstract

More information

Stochastic Volatility and Correction to the Heat Equation

Stochastic Volatility and Correction to the Heat Equation Stochastic Volatility and Correction to the Heat Equation Jean-Pierre Fouque, George Papanicolaou and Ronnie Sircar Abstract. From a probabilist s point of view the Twentieth Century has been a century

More information

(θ θ ), θ θ = 2 L(θ ) θ θ θ θ θ (θ )= H θθ (θ ) 1 d θ (θ )

(θ θ ), θ θ = 2 L(θ ) θ θ θ θ θ (θ )= H θθ (θ ) 1 d θ (θ ) Setting RHS to be zero, 0= (θ )+ 2 L(θ ) (θ θ ), θ θ = 2 L(θ ) 1 (θ )= H θθ (θ ) 1 d θ (θ ) O =0 θ 1 θ 3 θ 2 θ Figure 1: The Newton-Raphson Algorithm where H is the Hessian matrix, d θ is the derivative

More information

ASYMPTOTIC PROPERTIES OF MAXIMUM LIKELIHOOD ES- TIMATION: PARAMETERISED DIFFUSION IN A MANIFOLD

ASYMPTOTIC PROPERTIES OF MAXIMUM LIKELIHOOD ES- TIMATION: PARAMETERISED DIFFUSION IN A MANIFOLD ASYMPOIC PROPERIES OF MAXIMUM LIKELIHOOD ES- IMAION: PARAMEERISED DIFFUSION IN A MANIFOLD S. SAID, he University of Melbourne J. H. MANON, he University of Melbourne Abstract his paper studies maximum

More information

Feller Processes and Semigroups

Feller Processes and Semigroups Stat25B: Probability Theory (Spring 23) Lecture: 27 Feller Processes and Semigroups Lecturer: Rui Dong Scribe: Rui Dong ruidong@stat.berkeley.edu For convenience, we can have a look at the list of materials

More information

Controlled Diffusions and Hamilton-Jacobi Bellman Equations

Controlled Diffusions and Hamilton-Jacobi Bellman Equations Controlled Diffusions and Hamilton-Jacobi Bellman Equations Emo Todorov Applied Mathematics and Computer Science & Engineering University of Washington Winter 2014 Emo Todorov (UW) AMATH/CSE 579, Winter

More information

Thomas Knispel Leibniz Universität Hannover

Thomas Knispel Leibniz Universität Hannover Optimal long term investment under model ambiguity Optimal long term investment under model ambiguity homas Knispel Leibniz Universität Hannover knispel@stochastik.uni-hannover.de AnStAp0 Vienna, July

More information