Empirical Market Microstructure Analysis (EMMA)


Lecture 3: Statistical Building Blocks and Econometric Basics
Prof. Dr. Michael Stein (michael.stein@vwl.uni-freiburg.de)
Albert-Ludwigs-University of Freiburg, Summer Term 2016

Outline
1. Introduction: Financial Markets and Market Structure
2. Financial Market Equilibrium Theory and Asset Pricing Models
3. Statistical Building Blocks and Econometric Basics
4. Transaction and Trading Models
5. Information-Based Models
6. Inventory Models
7. Limit Order Book Models
8. Price Discovery and Liquidity
9. High-frequency Trading
10. Current Developments

Important Notes
Having discussed the different market types, trading systems and order types in the first lecture, and the relevant economic equilibrium models in the second lecture, this set of slides concludes the basics part of the course. The basic regression model is briefly reviewed before the time-series properties of financial market variables are discussed. These time-series basics will help in understanding some of the models treated throughout the course. All relevant elements will of course be covered in detail when the respective papers are discussed; these slides are meant as a reference file.

Linear regression review
The simple regression function with several explanatory variables:

y_i = \alpha + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + u_i

with
y_i = dependent variable
\alpha = constant
x_{ki} = explanatory variable k
\beta_k = coefficient of explanatory variable k
u_i = random error term, independently and identically distributed (i.i.d.)

The method of least squares minimizes the sum of the squared residuals.

Linear regression review

\min \sum_{i=1}^{N} \hat{u}_i^2 = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2

y_i = \hat{\alpha} + \sum_{k=1}^{K} \hat{\beta}_k x_{ki} + \hat{u}_i

\hat{y}_i = \hat{\alpha} + \sum_{k=1}^{K} \hat{\beta}_k x_{ki}

\hat{u}_i = y_i - \hat{\alpha} - \sum_{k=1}^{K} \hat{\beta}_k x_{ki} = y_i - \hat{y}_i

Linear regression review
Residual and estimator in the case of one variable:

\sum_{i=1}^{N} \hat{u}_i^2 = \sum_i (y_i - \hat{y}_i)^2 = \sum_i (y_i - \hat{\alpha} - \hat{\beta}_1 x_i)^2

\hat{\beta}_1 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}_1 \bar{x}

Total, explained and residual sum of squares:

\sum_i (y_i - \bar{y})^2 = \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i \hat{u}_i^2 = \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i (y_i - \hat{y}_i)^2

TSS = ESS + RSS
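As a quick numerical check (a minimal sketch with simulated data, not part of the original slides), the closed-form estimator and the TSS = ESS + RSS decomposition can be verified directly:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data for a single-regressor model: y_i = alpha + beta_1 * x_i + u_i
N = 200
x = rng.normal(size=N)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=N)

# Closed-form OLS estimates from the slide formulas
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta1_hat * x.mean()

# Decomposition of the total sum of squares: TSS = ESS + RSS
y_hat = alpha_hat + beta1_hat * x
u_hat = y - y_hat
TSS = np.sum((y - y.mean()) ** 2)
ESS = np.sum((y_hat - y.mean()) ** 2)
RSS = np.sum(u_hat ** 2)

print(f"alpha_hat = {alpha_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
print(f"TSS = {TSS:.2f}, ESS + RSS = {ESS + RSS:.2f}")  # equal up to rounding
```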

Linear regression review
[Figure: illustration of the OLS fit; source: Brooks (2008)]

Linear regression review
Quality of the regression: the explanatory power of the estimate can be represented by several different measures:

R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum \hat{u}_i^2}{\sum (y_i - \bar{y})^2}

\bar{R}^2 = 1 - \frac{RSS/(N - K - 1)}{TSS/(N - 1)} = 1 - (1 - R^2)\frac{N - 1}{N - K}

(the two forms differ only in whether K counts the constant)

\log L = -\frac{N}{2}\left(1 + \log(2\pi) + \log\left(\frac{\sum \hat{u}_i^2}{N}\right)\right)

AIC = -\frac{2\ell}{N} + \frac{2K}{N}, \qquad SIC = -\frac{2\ell}{N} + \frac{K\log(N)}{N}

Hypothesis tests
t-test of the null hypothesis that the respective observed coefficient is zero:

t_k = \frac{\hat{\beta}_k}{\hat{\sigma}(\hat{\beta}_k)}

F-test of the null hypothesis that all coefficients are jointly zero:

F = \frac{R^2/(K - 1)}{(1 - R^2)/(N - K)}
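These quantities are available off the shelf; below is a minimal sketch using statsmodels with simulated, illustrative data. Note that statsmodels reports AIC/BIC on the -2ℓ + penalty scale, i.e. not divided by N as in the slide formulas (SIC is reported as BIC):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Two explanatory variables plus a constant
N = 500
X = rng.normal(size=(N, 2))
y = 0.5 + 1.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=N)

res = sm.OLS(y, sm.add_constant(X)).fit()

print(res.tvalues)                      # t-statistics for H0: beta_k = 0
print(res.fvalue)                       # F-statistic for H0: all slopes are 0
print(res.rsquared, res.rsquared_adj)   # R^2 and adjusted R^2
print(res.aic, res.bic)                 # information criteria
```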

Hypothesis tests
[Figures: worked examples and critical values for the t- and F-tests; source: Brooks (2008)]

OLS and MLE
Ordinary least squares is one possibility to determine the parameters; maximum likelihood estimation can be employed as well. The log-likelihood function for the simplest case is:

\log L = -\frac{T}{2}\log(2\pi) - \frac{T}{2}\log(\sigma_u^2) - \sum_{t=1}^{T}\frac{(y_t - \alpha - \beta_1 x_{1t})^2}{2\sigma_u^2}

(other common notations can be found as well). Again, the normal distribution of the residuals is assumed. MLE maximizes the likelihood by choice of the respective parameters. As we will see, many of the studies discussed in the course employ MLE.
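A minimal sketch of MLE for this simple case, maximizing the above log likelihood numerically with scipy on simulated data (in practice one would use a packaged estimator; the parametrization of \sigma^2 via its log is a convenience to keep it positive):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
T = 300
x = rng.normal(size=T)
y = 0.5 + 1.5 * x + rng.normal(scale=0.8, size=T)

def neg_loglik(params):
    alpha, beta1, log_sigma2 = params        # log-parametrize sigma^2 > 0
    sigma2 = np.exp(log_sigma2)
    resid = y - alpha - beta1 * x
    # Gaussian log likelihood from the slide
    ll = (-T / 2 * np.log(2 * np.pi) - T / 2 * np.log(sigma2)
          - np.sum(resid ** 2) / (2 * sigma2))
    return -ll

opt = minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
alpha_hat, beta1_hat = opt.x[0], opt.x[1]
sigma2_hat = np.exp(opt.x[2])
print(alpha_hat, beta1_hat, sigma2_hat)  # coincides with OLS up to numerics
```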

Time series properties
Studies of financial market data typically work with time series. Econometric models that consider one or more variables over time are called dynamic models. Generally, the aim is to describe a stochastic process underlying the variables under consideration. This process is also called the data generating process (DGP). When the data generating process is known, we are able to describe the series recursively. This lecture covers the basics of time series analysis, which help in understanding the models discussed later in the course.

Time series properties
Modified example from Enders ("Applied Econometric Time Series", 2010): a variable consisting of, for example, a trend, a seasonal and an irregular component:

T_t = 1 + 0.1t
S_t = 1.6\sin(t\pi/6)
I_t = 0.7 I_{t-1} + \epsilon_t

[Figures: the components of the time series and realizations of the resulting series for t = 0, ..., 50. Source: own example, modification based on Enders, W. (2010), Applied Econometric Time Series, Wiley.]
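The three components can be simulated in a few lines; a sketch assuming the seasonal period shown above (realizations will differ from the original figure since the random draws differ):

```python
import numpy as np

rng = np.random.default_rng(7)
t = np.arange(1, 51)

trend = 1 + 0.1 * t                      # T_t
seasonal = 1.6 * np.sin(t * np.pi / 6)   # S_t

# I_t = 0.7 * I_{t-1} + eps_t, built up recursively
eps = rng.normal(size=len(t))
irregular = np.zeros(len(t))
irregular[0] = eps[0]
for i in range(1, len(t)):
    irregular[i] = 0.7 * irregular[i - 1] + eps[i]

y = trend + seasonal + irregular         # realization of the series
```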

Stationarity
The example above shows that the time series depends on previous observations in all components. For stationarity, different characteristics of these dependencies are essential: a stationary variable exhibits mean reversion, that is, it fluctuates around a constant long-term mean (E(Y_t) = \mu). Furthermore, stationary time series possess a finite, constant variance (Var(Y_t) = \sigma^2 for all t) and are covariance stationary, meaning that the covariance between observations depends only on the time span between them, not on the time of observation (Cov(Y_t, Y_{t-k}) = \gamma_k for all k).

Autoregression
Usually one speaks of stationarity if the time series is covariance stationary. This can be investigated using the autoregressive representation:

y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + \epsilon_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + \dots + a_p y_{t-p} + \epsilon_t

The sequence of error terms is a white noise process (the error term is a purely random variable, independently and identically distributed with mean 0 and constant variance), and the coefficients are constant.

Autoregression
Using the lag operator, the process becomes:

(1 - a_1 L - a_2 L^2 - \dots - a_p L^p)\, y_t = a_0 + \epsilon_t

a(L)\, y_t = a_0 + \epsilon_t, with a(L) = 1 - a_1 L - a_2 L^2 - \dots - a_p L^p

The lag operator is linear, with a negative exponent shifting to future observations:

L^i y_t = y_{t-i}, \quad L^{-i} y_t = y_{t+i}, \quad \Delta y_t = y_t - y_{t-1} = (1 - L)\, y_t

Autoregression
As outlined above, the autoregressive process is:

y_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + \dots + a_p y_{t-p} + \epsilon_t

Conditions for stationarity of the process derive from the solution of the homogeneous AR equation, i.e. excluding constant and error terms:

y_t - a_1 L y_t - a_2 L^2 y_t - \dots - a_p L^p y_t = 0

Replacing the lag operator with the complex variable z yields:

1 - a_1 z - a_2 z^2 - \dots - a_p z^p = 0

The expression is the characteristic polynomial of the process, and its solutions are the characteristic roots of the AR(p) process. Stationarity requires all roots to lie outside the unit circle, |z| > 1. For an AR(1):

1 - a_1 z = 0 \;\Rightarrow\; z = \frac{1}{a_1}, \text{ so } |z| > 1 \Leftrightarrow |a_1| < 1

Stationarity and Autoregression
If this condition is fulfilled, we call the process weakly stationary. For an AR(2) process, the conditions are:

a_1 + a_2 < 1, \quad a_2 - a_1 < 1, \quad -1 < a_2 < 1

A stationary variable exhibits mean reversion, fluctuating around a constant long-term mean. Furthermore, stationary time series possess a finite, constant variance and are covariance stationary, meaning that the covariance between observations depends only on the time span between them, not on the time of observation:

E(y_t) = \mu
Var(y_t) = E(y_t - \mu)^2 = \sigma^2
Cov(y_t, y_{t-s}) = E[(y_t - \mu)(y_{t-s} - \mu)] = \gamma_s

where for all s: \mu = const., \sigma^2 = const., \gamma_s = const.
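A small helper (illustrative, not from the slides) that checks stationarity numerically by computing the characteristic roots of 1 - a_1 z - ... - a_p z^p:

```python
import numpy as np

def is_stationary_ar(coeffs):
    """Stationarity check for an AR(p): all roots of
    1 - a1*z - ... - ap*z^p must lie outside the unit circle."""
    # np.roots expects coefficients from the highest power of z downward
    poly = np.r_[-np.asarray(coeffs)[::-1], 1.0]
    roots = np.roots(poly)
    return bool(np.all(np.abs(roots) > 1.0)), roots

print(is_stationary_ar([0.5, 0.3]))  # satisfies all AR(2) conditions -> True
print(is_stationary_ar([0.6, 0.5]))  # a1 + a2 > 1 -> root inside circle, False
```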

Integration
Definition of integration (Engle and Granger, 1987, "Co-Integration and Error Correction: Representation, Estimation and Testing"): a series with no deterministic trend components which has a stationary, invertible ARMA representation after differencing d times is said to be integrated of order d, denoted I(d). This means that a variable which is stationary after differencing once is integrated of order 1, namely I(1). Testing for stationarity can be accomplished with many different procedures; the most common is the Augmented Dickey-Fuller test (ADF test). It tests whether the process exhibits a unit root, i.e. whether previous observations have a persistent, non-decaying influence on subsequent observations. If the null hypothesis of a unit root can be rejected, the variable is stationary; if not, the variable is non-stationary.
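A minimal sketch of the ADF test using statsmodels, applied to a simulated stationary AR(1) and to a random walk (an I(1) process):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(3)
eps = rng.normal(size=500)

# Stationary AR(1) with a1 = 0.5, and a random walk (unit root)
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.5 * ar1[t - 1] + eps[t]
random_walk = np.cumsum(eps)

for name, series in [("AR(1)", ar1), ("random walk", random_walk)]:
    stat, pvalue = adfuller(series)[:2]
    print(f"{name}: ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")

# A small p-value rejects the unit-root null -> stationary.
# Differencing the random walk once (np.diff) renders it I(0).
```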

White Noise
A white noise process is a purely random process. The observations are independently and identically distributed with constant mean and constant variance. Additionally, the covariance for any lag greater than 0 is 0:

E(y_t) = \mu, \quad var(y_t) = \sigma^2

\gamma_{t-r} = \begin{cases} \sigma^2 & \text{if } t = r \\ 0 & \text{otherwise} \end{cases}

Every observation is independent of the previous ones.

Moving Average
A sequence built from a white noise process is termed a moving average process of order q, MA(q):

y_t = \mu + u_t + \theta_1 u_{t-1} + \dots + \theta_q u_{t-q} = \mu + \sum_{i=1}^{q} \theta_i u_{t-i} + u_t

An MA process is a linear combination of white noise processes: the variable y_t depends on previous error terms.

Moving Average
Using the lag operator, with the exponent giving the lag order (L^i u_t = u_{t-i}), yields:

y_t = \mu + \sum_{i=1}^{q} \theta_i L^i u_t + u_t

Another transformation yields:

y_t = \mu + \theta(L)\, u_t, with \theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \dots + \theta_q L^q

Properties of an MA process:

E(y_t) = \mu
var(y_t) = \gamma_0 = (1 + \theta_1^2 + \theta_2^2 + \dots + \theta_q^2)\,\sigma^2

\gamma_s = \begin{cases} (\theta_s + \theta_{s+1}\theta_1 + \theta_{s+2}\theta_2 + \dots + \theta_q\theta_{q-s})\,\sigma^2 & \text{for } s = 1, 2, \dots, q \\ 0 & \text{for } s > q \end{cases}
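These moment formulas can be checked by simulation; a sketch for an MA(2) with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(5)
theta1, theta2, sigma2 = 0.6, 0.3, 1.0
n = 200_000

u = rng.normal(scale=np.sqrt(sigma2), size=n + 2)
y = u[2:] + theta1 * u[1:-1] + theta2 * u[:-2]   # MA(2) with mu = 0

# Theoretical moments from the slide
gamma0 = (1 + theta1**2 + theta2**2) * sigma2
gamma1 = (theta1 + theta2 * theta1) * sigma2
gamma2 = theta2 * sigma2

print(np.var(y), gamma0)                          # close for large n
print(np.cov(y[1:], y[:-1])[0, 1], gamma1)
print(np.cov(y[2:], y[:-2])[0, 1], gamma2)
```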

AR-Process
In an autoregressive process, the current value of a variable depends on its own values from previous periods and an error term. An AR(p) process can be formulated as:

y_t = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + u_t = \mu + \sum_{i=1}^{p} \phi_i y_{t-i} + u_t

Using the lag operator yields:

y_t = \mu + \sum_{i=1}^{p} \phi_i L^i y_t + u_t

\phi(L)\, y_t = \mu + u_t, with \phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \dots - \phi_p L^p

ARMA
An ARMA process is a combination of an AR(p) and an MA(q) process: in an ARMA model, the value of a variable depends on its own previous realizations and on a combination of the current and previous error terms:

\phi(L)\, y_t = \mu + \theta(L)\, u_t, with

\phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \dots - \phi_p L^p
\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \dots + \theta_q L^q
E(u_t) = 0; \quad E(u_t^2) = \sigma^2; \quad E(u_t u_s) = 0 \text{ for } t \neq s

ARMA processes can be investigated systematically, for example with the Box-Jenkins method, which breaks the investigation into identification, estimation and evaluation. In practice, however, ARMA model building is often done iteratively and often needs to rely on information criteria, as in the sketch below.
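A minimal sketch of such an iterative, information-criteria-based order search using statsmodels (simulated ARMA(1,1) data; the grid and sample size are illustrative):

```python
import warnings
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import ArmaProcess

rng = np.random.default_rng(11)

# Simulate an ARMA(1,1): (1 - 0.7L) y_t = (1 + 0.4L) u_t
proc = ArmaProcess(ar=[1, -0.7], ma=[1, 0.4])
y = proc.generate_sample(nsample=1000, distrvs=rng.standard_normal)

warnings.filterwarnings("ignore")  # silence convergence chatter

# Estimate all (p, q) combinations on a small grid, keep the lowest AIC
best = min(
    ((p, q, ARIMA(y, order=(p, 0, q)).fit().aic)
     for p in range(3) for q in range(3)),
    key=lambda t: t[2],
)
print(f"AIC-preferred order: ARMA({best[0]},{best[1]})")
```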

ACF
For a covariance stationary process, the (sample) autocorrelation function is:

\rho_s = \frac{\frac{1}{T-s}\sum_{t=s+1}^{T}(y_t - \bar{y})(y_{t-s} - \bar{y})}{\frac{1}{T-1}\sum_{t=1}^{T}(y_t - \bar{y})^2}, \text{ with } \bar{y} = \frac{1}{T}\sum_{t=1}^{T} y_t \text{ as arithmetic mean}

As the autocovariance is constant for covariance stationary processes, so is the autocorrelation; thus the autocorrelation is independent of time. In the case of all autocorrelations being zero, we obtain the white noise process again. The latter is an assumption for the residuals in ordinary least squares regression analysis. If, for at least one time span, there is either positive or negative correlation between the residuals, we have autocorrelation in the residuals. This is called serial correlation.

PACF
The partial autocorrelation function expresses the correlation between two observations at different points in time, where the influence of the observations in between is excluded:

\phi_{11} = \rho_1

\phi_{22} = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2}

\phi_{ss} = \frac{\rho_s - \sum_{j=1}^{s-1} \phi_{s-1,j}\,\rho_{s-j}}{1 - \sum_{j=1}^{s-1} \phi_{s-1,j}\,\rho_j}, \quad s = 3, 4, 5, \dots

with \phi_{sj} = \phi_{s-1,j} - \phi_{ss}\,\phi_{s-1,s-j} for j = 1, 2, \dots, s-1

Accordingly, the autocorrelation function is informative about the unconditional correlation, whereas the partial autocorrelation function is informative about the conditional correlation between two points in time. Both functions are used to identify ARMA structures.
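Sample ACF and PACF values are available in statsmodels; a sketch for a simulated AR(1), for which the ACF should decay geometrically and the PACF should cut off after lag 1:

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(2)

# AR(1) with coefficient 0.8
y = np.zeros(2000)
eps = rng.normal(size=2000)
for t in range(1, 2000):
    y[t] = 0.8 * y[t - 1] + eps[t]

print(acf(y, nlags=5))    # roughly 0.8**s at lag s
print(pacf(y, nlags=5))   # large at lag 1, near zero afterwards
```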

ACF and PACF
[Tables: theoretical ACF and PACF patterns for AR, MA and ARMA processes; sources: first table Enders (2010), second table Rachev et al. (2007)]

AR(I)MA
Considering the ARMA specification as a difference equation, we can solve it to obtain the so-called moving average representation of the dependent variable:

y_t = \mu + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{i=1}^{q} \theta_i u_{t-i}

\left(1 - \sum_{i=1}^{p} \phi_i L^i\right) y_t = \mu + \sum_{i=1}^{q} \theta_i u_{t-i}

y_t = \frac{\mu + \sum_{i=1}^{q} \theta_i u_{t-i}}{1 - \sum_{i=1}^{p} \phi_i L^i}

The resulting stochastic difference equation can be seen as an MA process of infinite order, MA(\infty).

AR(I)MA
The main interest in this representation is the condition for stability of the stochastic difference equation, namely the convergence of the MA(\infty) process. It can be shown that this holds if all characteristic roots of the polynomial 1 - \sum_{i=1}^{p} \phi_i L^i lie outside the unit circle. In addition, if the dependent variable is explained by a linear stochastic difference equation, the stability condition is a necessary condition for stationarity of the dependent variable. If all characteristic roots lie outside the unit circle, there is a stable, stationary process. If, however, at least one root lies on the unit circle (a unit root), the sequence of the endogenous variable is integrated. This results in an Autoregressive Integrated Moving Average (ARIMA) process.
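The link between the characteristic roots, the MA(\infty) weights and integration can be illustrated with statsmodels' ArmaProcess (parameter values are illustrative):

```python
from statsmodels.tsa.arima_process import ArmaProcess

# Stationary ARMA(1,1) with phi = 0.7: the root of (1 - 0.7z) is 1/0.7 > 1
stable = ArmaProcess(ar=[1, -0.7], ma=[1, 0.4])
print(stable.arroots, stable.isstationary)   # root outside unit circle -> True
print(stable.arma2ma(lags=6))                # converging MA(infinity) weights

# phi = 1: unit root -> integrated; the MA(infinity) weights do not die out
unit_root = ArmaProcess(ar=[1, -1.0], ma=[1, 0.4])
print(unit_root.arroots, unit_root.isstationary)  # root on unit circle -> False
print(unit_root.arma2ma(lags=6))                  # weights stay at a constant level
```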

Process comparison
[Figures: comparison of simulated realizations of different processes; source: Brooks (2008)]

References
For econometrics in general, the recommended literature is that listed in Lecture 1.