Estimating the efficient price from the order flow : a Brownian Cox process approach

Similar documents
arxiv: v2 [q-fin.tr] 11 Apr 2013

Convergence at first and second order of some approximations of stochastic integrals

Statistical Inference for Ergodic Point Processes and Applications to Limit Order Book

Functional Limit theorems for the quadratic variation of a continuous time random walk and for certain stochastic integrals

Stochastic Volatility and Correction to the Heat Equation

Point Process Control

Sample path large deviations of a Gaussian process with stationary increments and regularily varying variance

Regular Variation and Extreme Events for Stochastic Processes

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Outline. A Central Limit Theorem for Truncating Stochastic Algorithms

Lecture 17 Brownian motion as a Markov process

On the quantiles of the Brownian motion and their hitting times.

A MODEL FOR THE LONG-TERM OPTIMAL CAPACITY LEVEL OF AN INVESTMENT PROJECT

Stochastic Calculus for Finance II - some Solutions to Chapter VII

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 7 9/25/2013

Liquidation in Limit Order Books. LOBs with Controlled Intensity

Optimal exit strategies for investment projects. 7th AMaMeF and Swissquote Conference

Introduction General Framework Toy models Discrete Markov model Data Analysis Conclusion. The Micro-Price. Sasha Stoikov. Cornell University

On Lead-Lag Estimation

An essay on the general theory of stochastic processes

Weak quenched limiting distributions of a one-dimensional random walk in a random environment

STAT 331. Martingale Central Limit Theorem and Related Results

A Barrier Version of the Russian Option

On the submartingale / supermartingale property of diffusions in natural scale

Nonparametric regression with martingale increment errors

Limit theorems for multipower variation in the presence of jumps

Statistical inference for ergodic point processes and application to Limit Order Book

P (A G) dp G P (A G)

Selected Exercises on Expectations and Some Probability Inequalities

Lecture 12. F o s, (1.1) F t := s>t

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

Additive functionals of infinite-variance moving averages. Wei Biao Wu The University of Chicago TECHNICAL REPORT NO. 535

Filtrations, Markov Processes and Martingales. Lectures on Lévy Processes and Stochastic Calculus, Braunschweig, Lecture 3: The Lévy-Itô Decomposition

Exercises in stochastic analysis

Duration-Based Volatility Estimation

Lecture 21 Representations of Martingales

1. Aufgabenblatt zur Vorlesung Probability Theory

Multivariate Normal-Laplace Distribution and Processes

Exponential martingales: uniform integrability results and applications to point processes

arxiv: v1 [math.pr] 11 Jan 2013

Gaussian Processes. 1. Basic Notions

1. Stochastic Processes and filtrations

On pathwise stochastic integration

Consistency of the maximum likelihood estimator for general hidden Markov models

LIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974

A Note on the Central Limit Theorem for a Class of Linear Systems 1

UNCERTAINTY FUNCTIONAL DIFFERENTIAL EQUATIONS FOR FINANCE

A one-level limit order book model with memory and variable spread

A slow transient diusion in a drifted stable potential

The Asymptotic Theory of Transaction Costs

Proofs of the martingale FCLT

Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan

GENERALIZED RAY KNIGHT THEORY AND LIMIT THEOREMS FOR SELF-INTERACTING RANDOM WALKS ON Z 1. Hungarian Academy of Sciences

Practical conditions on Markov chains for weak convergence of tail empirical processes

Variance-GGC asset price models and their sensitivity analysis

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

Order book resilience, price manipulation, and the positive portfolio problem

Stochastic integral. Introduction. Ito integral. References. Appendices Stochastic Calculus I. Geneviève Gauthier.

Information and Credit Risk

SEPARABILITY AND COMPLETENESS FOR THE WASSERSTEIN DISTANCE

Strong approximation for additive functionals of geometrically ergodic Markov chains

Solution for Problem 7.1. We argue by contradiction. If the limit were not infinite, then since τ M (ω) is nondecreasing we would have

Bivariate Uniqueness in the Logistic Recursive Distributional Equation

Empirical Processes: General Weak Convergence Theory

Some Aspects of Universal Portfolio

Stochastic Differential Equations.

Recent results in game theoretic mathematical finance

1 Continuous-time chains, finite state space

Branching Processes II: Convergence of critical branching to Feller s CSB

Other properties of M M 1

e - c o m p a n i o n

SUPPLEMENT TO PAPER CONVERGENCE OF ADAPTIVE AND INTERACTING MARKOV CHAIN MONTE CARLO ALGORITHMS

(6, 4) Is there arbitrage in this market? If so, find all arbitrages. If not, find all pricing kernels.

9 Brownian Motion: Construction

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales.

Minimization of ruin probabilities by investment under transaction costs

SUMMARY OF RESULTS ON PATH SPACES AND CONVERGENCE IN DISTRIBUTION FOR STOCHASTIC PROCESSES

A Model of Optimal Portfolio Selection under. Liquidity Risk and Price Impact

LECTURE 2: LOCAL TIME FOR BROWNIAN MOTION

Technical Appendix for: When Promotions Meet Operations: Cross-Selling and Its Effect on Call-Center Performance

On the martingales obtained by an extension due to Saisho, Tanemura and Yor of Pitman s theorem

On the Goodness-of-Fit Tests for Some Continuous Time Processes

Dynamics of Order Positions and Related Queues in a LOB. Xin Guo UC Berkeley

Feller Processes and Semigroups

On a class of stochastic differential equations in a financial network model

MS&E 321 Spring Stochastic Systems June 1, 2013 Prof. Peter W. Glynn Page 1 of 10

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A )

AN EXTREME-VALUE ANALYSIS OF THE LIL FOR BROWNIAN MOTION 1. INTRODUCTION

Stability of Stochastic Differential Equations

Non-Essential Uses of Probability in Analysis Part IV Efficient Markovian Couplings. Krzysztof Burdzy University of Washington

arxiv: v1 [math.pr] 24 Sep 2018

The Azéma-Yor Embedding in Non-Singular Diffusions

HJB equations. Seminar in Stochastic Modelling in Economics and Finance January 10, 2011

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

Markov processes Course note 2. Martingale problems, recurrence properties of discrete time chains.

OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS

Random Bernstein-Markov factors

Markov Chains. Arnoldo Frigessi Bernd Heidergott November 4, 2015

One-level limit order books with sparsity and memory

STABILITY AND STRUCTURAL PROPERTIES OF STOCHASTIC STORAGE NETWORKS 1

Transcription:

Estimating the efficient price from the order flow : a Brownian Cox process approach - Sylvain DELARE Université Paris Diderot, LPMA) - Christian ROBER Université Lyon 1, Laboratoire SAF) - Mathieu ROSENBAUM Université Pierre et Marie Curie, LPMA) 213.1 Laboratoire SAF 5 Avenue ony Garnier - 69366 Lyon cedex 7 http://www.isfa.fr/la_recherche

Estimating the efficient price from the order flow: a Brownian Cox process approach Sylvain Delattre LPMA, Université Paris Diderot delattre@math.univ-paris-diderot.fr Christian Y. Robert SAF, Université Lyon 1 christian.robert@univ-lyon1.fr Mathieu Rosenbaum LPMA, Université Pierre et Marie Curie mathieu.rosenbaum@upmc.fr January 14, 213 Abstract At the ultra high frequency level, the notion of price of an asset is very ambiguous. Indeed, many different prices can be defined last traded price, best bid price, mid price,... ). hus, in practice, market participants face the problem of choosing a price when implementing their strategies. In this work, we propose a notion of efficient price which seems relevant in practice. Furthermore, we provide a statistical methodology enabling to estimate this price form the order flow. Key words: Efficient price, order flow, response function, market microstructure, Cox processes, fractional part of Brownian motion, non parametric estimation, functional limit theorems. 1 Introduction 1.1 What is the high frequency price? he classical approach of mathematical finance is to consider that the prices of basic products future, stock,... ) are observed on the market. In particular, their values are used in order to price complex derivatives. Since options traders typically rebalance their portfolio once or a few times a day, such derivatives pricing problems typically occur at the daily scale. When working at the ultra high frequency scale, even pricing a basic product, that is assigning a price to it, becomes a challenging issue. Indeed, one has access to trades 1

and quotes in the order book so that at a given time, many different notions of price can be defined for the same asset: last traded price, best bid price, best ask price, mid price, volume weighted average price,... his multiplicity of prices is problematic for many market participants. For example, market making strategies or brokers optimal execution algorithms often require single prices of plain assets as inputs. Choosing one definition or another for the price can sometimes lead to very significantly different outcomes for the strategies. his is for example the case when the tick value the minimum price increment allowed on the market) is rather large. Indeed, this implies that the prices mentioned above differ in a non negligible way. In practice, high frequency market participants are not looking for the fair economic value of the asset. What they need is rather a price whose value at some given time summarizes in a suitable way the opinions of market participants at this time. his price is called efficient price. Hence, this paper aims at providing a statistical procedure in order to estimate this efficient price. 1.2 Ideas for the estimation strategy In this paper, we focus on the case of large tick assets. We define them as assets for which the bid-ask spread is almost always equal to one tick. Our goal is then to infer an efficient price for this type of asset. Naturally, it is reasonable to assume that the efficient price essentially lies inside the bid-ask spread but we wish to say more. In order to retrieve the efficient price, our idea is to use the order flow. Indeed, it is often said by market participants that the price is where the volume is not. herefore, we assume that the intensity of arrival of the limit order flow at the best bid or the best ask level depends on the distance between the efficient price and the considered level: if this distance is large, the intensity should be high and conversely. hus, we assume the intensity can be written as an increasing deterministic function of this distance. his function is called the order flow response function. In our approach, a crucial step is to estimate the response function in a non parametric way. hen, this functional estimator is used in order to retrieve the efficient price. Note that it is also possible to use the buy or sell market order flow. In that case, the intensity of the flow should be high when the distance is small. Indeed, in this situation, market takers are not loosing too much money with respect to the efficient price) when crossing the spread. 1.3 Organization of the paper he paper is organized as follows. he model and the assumptions are described in Section 2. Particular properties of the efficient price are given in Section 3 and the main statistical procedure is explained in Section 4. he theorems about the response function can be found in Section 5 and the limiting behavior of the estimator of the efficient price is given in Section 6. Finally, the proofs are relegated to Section 7. 2

2 he model 2.1 Description of the model We assume the tick size is equal to one, meaning that the asset can only take integer values. Moreover, the efficient price is given by P t = P + σw t, t [, ], where W t ) t is a Brownian motion on some filtered probability space Ω, F, F t ) t, P) and P is a F -measurable random variable, independent of W t ) t and uniformly distributed on [p, p + 1), with p N. Note that such a simple dynamics for the efficient price is probably still reasonable at our high frequency scale. Let Y t be the fractional part of P t, denoted by {P t }, that is: Y t = {P t } = P t P t. o fix ideas and without loss of generality, we focus in the rest of the paper on the limit order flow at the best bid level. We assume that when a limit order is posted at time t at the best bid level, its price B t is given by P t. herefore at time t, the efficient price is P t = B t + Y t. We denote by N t the total number of limit orders posted over [, t]. In order to translate the fact that the intensity of N t should be an increasing function of Y t, we assume that N t ) t is a Cox process also known as doubly stochastic Poisson process) with arrival intensity at time t given by µhy t ), where µ is a positive constant and where the response function h : [, 1) R + satisfies the following assumption: Assumption H1: Response function he function h is C 1 on [, 1), with derivative h for which there exist some positive constants c 1 and c 2 such that for any x in [, 1), c 1 h x) c 2. Moreover, we have the following identifiability condition: 1 hx)dx = 1. Remark that H1 implies in particular that h is bounded on [, 1). he response function expresses the influence of the efficient price on the order flow. In particular, the limiting case where h is constant corresponds to orders arriving according to a standard Poisson process. his simple arrival mechanism is often assumed in practice for technical convenience, see for example [4], [5]. However, it is probably not very realistic. We assume that we observe the point process N t ) on [, ]. o get asymptotic properties, we let tend to infinity. It will be also necessary to assume that µ = µ depends on. More precisely, we have the following assumption: 3

Assumption H2: Asymptotic setting For some ε >, as tends to infinity, 5/2+ε /µ. Note that up to scaling modifications, we could also consider a setting where is fixed and the tick size is not constant equal to 1 but tends to zero. 3 Properties of the process Y t he intensity of the order flow is µ hy t ). his process Y t ) t has several nice properties. 3.1 Markov property First, recall that if U is uniformly distributed on [, 1] and X is a real-valued random variable, which is independent of U then {U + X} is also uniformly distributed on [, 1]. hus, since P is uniformly distributed on [, 1], we obtain that Y t ) is a stationary Markov process. hen, from the properties of the sample paths of the Brownian motion, it is clear that Y t ) is Harris recurrent and therefore, for f a bounded positive measurable function, almost surely, 1 1 lim fy s )ds = fs)ds, + see for example heorem 3.12 in [7]. 3.2 Regenerative property Beyond the Markov property, the process Y t ) also enjoys a regenerative property. We define the sequence of stopping time ν n ) n N the following way: ν =, ν 1 = inf {t > : P t N} and for n 2: ν n = inf{t > ν n 1 : P t = P νn 1 ± 1} = inf{t > ν n 1 : W t = W νn 1 ± 1/σ}. 1) he cycles Y t+νn ) t<νn+1 ν n are independent and identically distributed for n 1. hus Y t ) is a regenerative process. Note that ν 2 ν 1 has the same law as he Laplace transform of τ 1 is given by τ 1 = inf{t, W t = 1/σ}. E[e γτ 1 ] = 1/ cosh 2γ)/σ ), γ. From this expression, we get E[τ 1 ] = 1/σ 2 and E[τ 2 1 ] = 5/3σ4 ). hus for f a bounded positive measurable function, by heorem VI.3.1 in [1], we get that almost surely lim + 1 fy s )ds = 1 ν2 E[ν 2 ν 1 ] E[ fy t )dt ] = σ 2 E τ1 f{σw t })dt ). ν 1 4

In particular, this implies that σ 2 E [ τ 1 f{σw t })dt ] = 1 fs)ds. Furthermore, by heorem VI.3.2, we get the following convergence in distribution as : 1 with Z f a centered random variable defined by Z f = fy t )dt σ 2 E [ τ 1 f{σw t })dt ]) d N, σ 2 Var[Z f ]), 2) τ1 f{σw t })dt τ 1 σ 2 E [ τ 1 f{σw t })dt ]. We end this section with two particular cases for the function f which will be useful in the following. First, if f = h, using the identifiability condition in Assumption H1, we get that almost surely: herefore, Z h = = τ1 1/σ lim + 1/σ 1 hy t )dt = σ 2 E [ τ 1 h{σw t })dt ] = 1. I{Wt<}h1 + σw t ) 1) + I {Wt>}hσW t ) 1) ) dt I{u<} h1 + σu) 1) + I {u>} hσu) 1))L 1/σ,1/σ u)du, where L 1/σ,1/σ is the local time of W t ) t stopped at τ 1. In the same way, for fx) = I {hx) t}, t [, h1 )), we have 1 I {hs) t} ds = h 1 t). So we define Z e t) = Z f by τ1 Z e t) = I{Ws<,h1+σW s) t} + I {Ws>,hσWs) t} h 1 t) ) ds = 1/σ 1/σ I{u<,h1+σu) t} + I {u>,hσu) t} h 1 t) ) L 1/σ,1/σ u)du. In fact, it will be useful to see Z e t)) t [,h1 )) as a process indexed by t. hus, we introduce the covariance function ρt 1, t 2 ) = Cov[Z e t 1 ), Z e t 2 )] = E[Z e t 1 )Z e t 2 )]. 3) Note that this function and Var[Z h ] can be computed as multiple integrals on [ 1/σ, 1/σ] 2. Indeed, for u, v) [ 1/σ, 1/σ] 2 an explicit expression for the bivariate Laplace transform of L 1/σ,1/σ u), L 1/σ,1/σ v) ) is available, see Formula I.3.18.1) in [3]. Using differentiation, from this expression, one can easily derive 1 E[L 1/σ,1/σ u)l 1/σ,1/σ v)] which enables to deduce ρt 1, t 2 ) and Var[Z h ]. 1 We skip the explicit writing of E[L 1/σ,1/σ u)l 1/σ,1/σ v)], the expression being quite heavy. 5

4 Estimating the flow response function and its inverse We want to estimate the flow response function. Before estimating h, we need to estimate µ. Since E [ N ] [ 1 = E hy t )dt ] = 1 E[hY t )]dt = 1, µ it is natural to propose ˆµ = N 4) as an estimator for µ. We have the following proposition, whose proof is given in Section 7.1. Proposition 1 If µ as, we have ˆµ µ 1 ) d N, σ 2 Var[Z h ]). We now explain how the estimator of the flow response function is built. Let k be a known deterministic sequence of positive integers. hen define for j = 1,..., k ˆθ j = k N j/k N j 1)/k ˆµ = k N N j/k N j 1)/k ). Now remark that ˆθ j is approximately equal to 1 µ /k Nj 1)/k +i/µ N j 1)/k +i 1)/µ ). µ /k i=1 Conditional on the path of Y t ), the variables in the sum are independent and if /k is small enough, they approximately follow a Poisson law with parameter µ hy j 1)/k )/k. herefore, if moreover µ /k is sufficiently large, one can expect that ˆθ j is close to hy j 1)/k ). In fact, throughout the paper, we assume that k is chosen so that for some p >, as tends to infinity, p+1/2 k 1/2 k p/2, µ. Note that since 5/2+ε /µ such a sequence k exists. he ˆθ j introduced above are k estimators of quantities of the form hu j ), u j [, 1]. However, we do not have access to the values of the u i. Nevertheless, we know that they are uniformly distributed on [, 1]. We therefore rank the ˆθ j : ˆθ 1) ˆθ 2)... ˆθ k ). For u [, 1), we define the estimator of hu) the following way: ĥu) = ˆθ uk ). Note that the identifiability condition holds since 1 ĥu)du = 1 k k ˆθ j = 1. 6

hen, the estimator of h 1 is naturally defined by the right continuous generalized inverse of ĥ: ĥ 1 t) = 1 k I k {ˆθj t}. 5) 5 Limit theorems for the response function We now give the theorems associated to ĥ 1 and ĥ. For b >, we write D[, b) resp. D[, b]), for the space of càdlàg functions from [, b) resp. [, b]) into R. We define the convergence in law in D[, b) as the convergence in law of the restrictions of the stochastic processes to any compact [, a], < a < b, for the Skorohod topology J 1. We have the following result for ĥ 1 : heorem 1 Under H1 and H2, as tends to infinity, we have ĥ 1 ) h 1 ) ) d σg ) ) h h 1 ) ) h1 ) σgv)dv, in D[, h1 )), where G ) is a continuous centered Gaussian process with covariance function ρ defined by 3). he same type of result holds also for ĥ: heorem 2 Under H1 and H2, as tends to infinity, we have in D[, 1). ) d ĥ ) h ) h )σg h ) ) h1 ) + h ) σgv)dv, Note that both in heorem 1 and heorem 2, the covariance functions of the limiting processes can be computed explicitly, see Section 3.2. 6 Estimation of the efficient price Let t, ]. A nice corollary of heorem 1 is the possibility of estimating Y t, which is the fractional part of the efficient price at time t, and therefore the efficient price itself provided t is an order arrival time). Indeed, we have an estimator ĥ 1 of h 1 and we know how to estimate hy t ), namely we take 2 ĥy t ) = k N t N t /k ˆµ 2 Of course we can take any other time interval of length /k containing t.. 7

herefore, it is natural to define an estimator Y t ) of Y t by Ŷ t = ĥ 1 ĥy t ) ). he following corollary shows that this estimator is indeed a good estimator of the efficient price Y t. Corollary 1 For t >, under H1 and H2, as tends to infinity, we have Ŷt Y t ) d σg hyt ) ), with G independent of Y t. hus, thanks to this approach it is possible to retrieve the efficient price from the order flow, with the usual parametric rate. Remark that the law of σg hy t ) ) can be explicitly computed. Indeed, conditional on Y t we have a centered Gaussian random variable with variance σ 2 ρ hy t ), hy t ) ). hen we can use the fact that Y t is independent of G and uniformly distributed. 7 Proofs All the proofs are done under H1 and H2. In the following, c denotes a constant that may vary from line to line, and even in the same line. 7.1 Proof of Proposition 1 We give here an elementary proof of Proposition 1. However, note that this result can also be obtained as a by product of Corollary 2 shown in Section 7.3. he characteristic function of ˆµ /µ 1) satisfies E [ exp it ˆµ /µ 1) )] = exp it [ [ N )] ] )E E Y exp it, µ where E Y denotes the expectation conditional of the path of Y t ). Since conditional on the path of Y t ), N follows a Poisson random variable with parameter the characteristic function is equal to Now µ hy t )dt, exp it [ )E exp e itµ )] ) 1 1)µ hy t )dt. e itµ ) 1 1 itµ ) 1 t 2 µ ) 2. 8

Using the boundedness of h and the fact that µ goes to infinity, we get that the limit as tends to infinity of the characteristic function is given by [ lim E exp it ) 1 hy t )dt 1 ))]. + From Equation 2), this is equal to 7.2 An auxiliary result exp σ2 t 2 2 Var[Zh ] ). We now give an auxiliary result from which heorem 1 and heorem 2 will be easily deduced. For j = 1,..., k, we set θ j = and define the function ĥe by for u [, 1). Note that k µ N j/k N j 1)/k ) = ˆµ µ ˆθj, ĥ e u) = θ uk ) = ĥu) ˆµ µ, ˆµ µ = N µ = 1 k k θ j = 1 he right continuous generalized inverse of ĥe is given by and satisfies We have the following proposition: ĥ 1 e t) = 1 k k I {θj t} ĥ 1 ˆµ ) t) = ĥ 1 e t. µ ĥ e u)du. Proposition 2 As tends to infinity, we have ĥ 1 e ) h 1 ) ) d σg ), in D[, h1 )). By heorem 15.1 in [2], it is enough to prove that the finite dimensional distributions converge and that a tightness criterion holds. he proof is splitted into two Lemmas: Lemma 1 for the convergence of the finite dimensional distributions and Lemma 2 for the C-tightness. Lemma 1 For any n N, for any t 1,..., t n ) [, h1 )) n, as, ĥ 1 e t i ) h 1 t i ) ) d N, σ 2 ν) i=1,...,n where ν i,j = ρ t i, t j ), with ρ the covariance function defined in 3). 9

Proof. We give the proof for n = 1. he other cases are obtained the same way by linear combinations of the components of ĥ 1 e t i ) h 1 t i ) ) i=1,...,n. We write ĥ 1 e t) h 1 t) = 1 + 2 + 3, with 1 = 1 k 2 = 1 3 = 1 k k I k k { I {θj t} 1 k k I k { j /k j 1) /k hy u)du t} 1 I {hyu) t}du h 1 t). We study each component separately. We have 1 = 1 k j /k j 1) /k hy u)du t}, k j/k k v j,k, j 1)/k I {hyu) t}du, with v j,k, = or - v j,k, = 1 if θ j t < k j/k j 1)/k hy t )dt, - v j,k, = 1 if k j/k j 1)/k hy t )dt t < θ j. For ε >, we have E[ v j,k, ] A 1 + A 2, with We have A 1 = P [ k j/k A 2 = E [ I {vj,k, =±1}I { t k j 1)/k hy u )du t ε ], j /k j 1) /k hy t)dt >ε } ]. k j/k hy t )dt hy j 1)/k ) c k j 1)/k j/k j 1)/k Y t Y j 1)/k dt. herefore, for p >, P [ k hus j/k j 1)/k hy t )dt hy j 1)/k ) ε ] P [ sup t [,/k ] A 1 P [ hy j 1)/k ) t ] /k ) p/2 2ε + c. ] /k ) p/2 W t cε c. ε p ε p 1

Using the stationary distribution of Y j 1)/k we obtain that for ε small enough, A 1 h 1 t + 2ε ) h 1 t 2ε ) + c /k ) p/2 ε p cε + c /k ) p/2 ε p. We now turn to A 2. Remarking that v j,k, = 1 implies θ j k j/k hy t )dt t k j 1)/k j/k j 1)/k hy t )dt, we get A 2 P [ θj k j/k j 1)/k hy t )dt > ε ]. d Recall that conditional on the path of Y t ), θ j = Zk /µ ), where Z is a Poisson random variable with parameter j/k µ hy t )dt. j 1)/k hus we obtain Using that we finally get k A 2 c µ ε 2. E[ 1 ] 1 k k E[ v j,k, ], E[ p+1)/2 k 1 ] cε + c + c. µ ε 2 ε p kp/2 We take ε = ζ /, with ζ tending to zero. hen E[ 1 ] cζ + c p+1/2 ζ p kp/2 + c k µ ζ 2. hanks to the assumptions on k, we can find a sequence ζ tending to such that E[ 1 ] goes to. We now turn to 2. We have 2 = 1 = 1 k j/k I{ k j 1)/k k j/k j 1)/k w j,k,s, ds I ) j /k hy {hy j 1) /k u)du t} s) t} ds with w j,k,s, = or 11

- w j,k,s, = 1 if k j/k j 1)/k hy u )du t < hy s ) - w j,k,t, = 1 if hy s ) t < k j/k j 1)/k hy u )du. Now remark that w j,k,s, = 1 implies hy s ) k j/k hy u )du t k j 1)/k j/k j 1)/k hy u )du. Using the same kind of computations as for 1, for ε positive and small enough and p >, this leads to P [ w j,k,s, = 1 ] cε + c /k ) p/2 ε. )p hen, in the same way as for 1, we easily obtain that E[ 2 ] goes to. Eventually, from 2), we get that 3 converges in law towards a centered Gaussian random variable with variance σ 2 Var[Z e t)]. We now give the second Lemma needed to prove Proposition 2. Lemma 2 Let α t) = 1 he sequence α t) ) t<h1 ) is tight in D[, h1 )). I {hys) t}ds h 1 t) ). Proof. Recall first the following classical C-tightness criterion, see heorem 15.6 in [2]: If for p >, p 1 > 1 and all t 1, t 2 < h1 ) E [ α t 1 ) α t 2 ) p] c t 1 t 2 p ) 1, then α t) ) t h1) is tight in D[, h1 )). We now use the sequence of stopping time ν n ) n N defined by 1). Let n = inf{i : ν i }. From heorem II.5.1 and heorem II.5.2 in [6], we have E[ν n ] c 6) and E[ n σ 2 )2 ] c. 7) 12

and We now define Y i t 1, t 2 ) = 1 ν i ν i 1 I {t1 <hy t) t 2 }dt 1 σ 2 h 1 t 2 ) h 1 t 1 ) )) Ỹ n +1t 1, t 2 ) = 1 νn I {t1 <hy t) t 2 }dt. Remark that for i 2, the Y i t 1, t 2 ) are centered see Section 3.2) and iid. Moreover, using the occupation formula together with a aylor expansion, we get E [ Y 2 t 1, t 2 ) ) [ 2] νi 1 E I {t1 <hy t) t 2 }dt ) ] 2 c 1 t 2 t 1 2 E[L ) 2 ], ν i 1 with L = sup L 1/σ,1/σ u) ). u [ 1/σ,1/σ] From the Ray-Knight version of Burkhölder-Davis-Gundy inequality, see [7], we know that all polynomial moments of L are finite and thus We show in the same way that E [ Y 2 t 1, t 2 ) ) 2] c 1 t 2 t 1 2. E [ Y 1 t 1, t 2 ) ) 2] + E [Ỹn +1t 1, t 2 ) ) 2] c 1 t 2 t 1 2. We now use the preceding inequalities in order to show that the tightness criterion holds. We have α t 1 ) α t 2 ) = B 1 + B 2 B 3 + B 4, with hus By heorem I.5.1 in [6], B 1 = Y 1 t 1, t 2 ), B 2 = n i=2 Y i t 1, t 2 ), B 3 = Ỹn +1t 1, t 2 ), B 4 = n 1 σ 2 1) h 1 t 2 ) h 1 t 1 ) ). E [ E [ α t 1 ) α t 2 ) 2] c n i=2 Moreover, using 6), we obtain Finally, we easily get from 7) that 4 E[ B i 2 ]. i=1 Y i t 1, t 2 ) 2 ] ce[n ]E [ Y 2 t 1, t 2 ) ) 2]. E[n ]E [ Y 2 t 1, t 2 ) 2] c t 1 t 2 2. E[B 4 ) 2 ] c t 1 t 2 2. and therefore the tightness criterion is satisfied. 13

7.3 Proofs of heorem 1 and heorem 2 In order to prove heorem 1 and heorem 2, we start with the following corollary of Proposition 2: Corollary 2 As, converges in law towards ĥ 1 e ) h 1 ), ĥe ) h ), σg ), h )σg h ) ), 1 h1 ) in D[, h1 )) D[, 1) R for the product topology). ) ĥ e u)du 1 ) σgv)dv, Proof. he result follows directly from Proposition 2 together with heorem 13.7.2 in [8] and the continuous mapping theorem. We now give the proof of heorem 1. We write ĥ 1 ) h 1 ) ) = V 1 + V 2, with V 1 = ĥ 1 e ˆµ /µ ) h 1 ˆµ /µ ) ), V 2 = h 1 ˆµ /µ ) h 1 ) ). From Proposition 2 together with the continuity of the composition map, see heorem 13.2.2 in [8], we have d V1 σg ), in D[, h1 )). Now recall that 1 ĥ e u)du = ˆµ /µ. hus using Corollary 2 together with heorem 13.3.3 in [8], we get V2 d ) h h 1 ) ) h1 ) σgv)dv, in D[, h1 )). he convergences of V 1 and V 2 taking place jointly, this ends the proof of heorem 1. From heorem 1, heorem 2 follows from heorem 13.7.2 in [8]. 14

7.4 Proof of Corollary 1 Let ĥey t ) be the oracle estimate of hy t ) defined the same way as ĥy t) but with µ instead of ˆµ. Now we use that ĥ 1 µ ) = ĥ 1 e ) ˆµ and the fact that the cycles on which ĥey t ) is defined are negligible in the estimation of h 1 ) and µ. hen, from Proposition 2, we have ĥ 1 µ ) h 1 ) ) ) d, ˆµ ĥey t ) σg ), hyt ) ) in D[, h1 )) R +, with G independent of Y t. Using heorem 13.2.1 in [8] and the fact that ĥy t ) = µ ˆµ ĥ e Y t ), we derive ĥ 1 ĥy t ) ) h 1 ĥ e Y t ) )) d σg hyt ) ). Now recall that h 1 ĥ e Y t ) ) h 1 hy t ) ) c ĥ e Y t ) hy t ). and that conditional on the path of Y t ), ĥey t ) d = Zk /µ ), where Z is a Poisson random variable with parameter µ t t /k hy u )du. herefore, using a aylor expansion, we get that [ ĥe E Y t ) hy t ) 2] is smaller than ce [ sup u [t /k,t+/k ] Y u Y t 2] + c k / µ ) ) c/k ) + c k / µ ) ). Eventually, since / k and k /µ tend to zero, we get which concludes the proof. E[ ĥe Y t ) hy t ) 2] c 2 /k ) + ck /µ ), References [1] Asmussen, S. 23). Applied probability and queues. 2nd ed. Springer. [2] Billingsley, P. 1968). Convergence of probability measures. Wiley, New York. [3] Borodin, A. and Salminen, P. 23). Handbook of Brownian motion: facts and formulae. Birkhauser. 15

[4] Cont, R. and de Larrard, A. 212). Price dynamics in a Markovian limit order book market. o appear in SIAM Journal for Financial Mathematics. [5] Farmer, J.D., Patelli, P. and Zovko, I.I. 25). he predictive power of zero intelligence in financial markets. PNAS 12, 22542259. [6] Gut, A. 1988). Stopped random walks. Limit theorems and applications. Springer- Verlag, New York. [7] Revuz, D. and Yor, M. 25). Continuous martingale and Brownian motion. Springer, third edition. [8] Whitt, W. 22). Stochastic-Process Limits. Springer, New York. 16