Vast Volatility Matrix Estimation for High Frequency Data


Vast Volatility Matrix Estimation for High Frequency Data
Yazhen Wang, National Science Foundation
Yale Workshop, May 14-17, 2009
Disclaimer: My opinion, not the views of NSF
Y. Wang (at NSF) 1 / 36

Outline
1. Basic Setting
2. Regularization and Estimation
3. Numerical Studies

High-Frequency Finance

High-Frequency Data: Intradaily observations on asset prices, such as tick-by-tick stock price data and minute-by-minute exchange rate data.

Data Characteristics: High-frequency data have a complex structure with microstructure noise.

One-Dim Model: Observed data Y_{t_i}, i = 1, ..., n, and X_t = true log-price of a stock:

    Y_{t_i} = X_{t_i} + ε_{t_i},  i = 1, ..., n,

where ε_{t_i} is microstructure noise, independent of X_t.
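The bias this noise induces is easy to see in simulation. The sketch below is our own illustration (parameter values and variable names are not from the talk): it generates a Brownian log-price, adds i.i.d. noise, and shows that the full-frequency realized variance of Y overshoots that of X by roughly 2na², where a² is the noise variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 23400                # e.g. one observation per second over a trading day
sigma = 0.2              # diffusion scale of the true log-price (illustrative)
a = 0.001                # microstructure noise standard deviation (illustrative)

# True log-price X: discretized Brownian motion with integrated volatility sigma^2
X = np.cumsum(rng.normal(0.0, sigma / np.sqrt(n), size=n))
# Observed log-price Y = X + noise, with noise i.i.d. and independent of X
Y = X + rng.normal(0.0, a, size=n)

rv_X = np.sum(np.diff(X) ** 2)   # ~ sigma^2
rv_Y = np.sum(np.diff(Y) ** 2)   # inflated by about 2 * n * a**2
print(rv_X, rv_Y)
```

At these illustrative values the noise term 2na² ≈ 0.047 dominates the true integrated volatility 0.04, which is why naive full-frequency estimators must be corrected.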

Very High Dim: Large Volatility Matrix

High Dim Model: For the i-th asset, observation times t_{ij}, i = 1, ..., p, j = 1, ..., n_i, and observed log price Y_i(t_{i,j}):

    Y_i(t_{i,j}) = X_i(t_{i,j}) + ε_i(t_{i,j}),

where X_i(t) is the true log price of asset i, and the microstructure noise ε_i(·) is i.i.d. with zero mean and independent of X_i(t).

Nonsynchronization: Stock transactions occur at distinct times, so the prices of different stocks are recorded at mismatched time points.

[Figure: nonsynchronized observation times for Stock 1, Stock 2, and Stock 3]

Price Model

X_t = (X_{1t}, ..., X_{pt})': log prices of p assets,

    dX_t = μ_t dt + σ_t dW_t,  t ∈ [0, 1],

where W_t is a p-dimensional BM and σ_t is a p × p matrix.

Integrated volatility matrix:

    Γ = ∫_0^1 γ(t) dt,  γ(t) = σ_t σ_t'.

Goal: Estimate Γ based on data Y_i(t_{ij}).

Methodology:
1. Form the realized volatility (RV) matrix
2. Regularize the RV matrix

Realized co-volatility

τ = {τ_r = r/m, r = 1, ..., m}: pre-determined sampling frequency. For assets i_1 and i_2, select previous-tick times

    τ_{i_s, r} = max{ t_{i_s, j} ≤ τ_r, j = 1, ..., n_{i_s} },  s = 1, 2.

Realized volatility matrix ˆΓ(τ):

    ˆΓ_{i_1, i_2}(τ) = Σ_{r=1}^m [Y_{i_1}(τ_{i_1,r}) − Y_{i_1}(τ_{i_1,r−1})] [Y_{i_2}(τ_{i_2,r}) − Y_{i_2}(τ_{i_2,r−1})].
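A minimal sketch of the previous-tick construction above (function and variable names are ours, not the talk's): `np.searchsorted` picks, for each grid point τ_r, the last observation time at or before it.

```python
import numpy as np

def previous_tick_rv(t1, y1, t2, y2, m):
    """Previous-tick realized co-volatility of two assets on the grid
    tau_r = r/m, r = 1, ..., m (a sketch of the slide's estimator;
    tau_0 is approximated by the first available observation)."""
    taus = np.arange(1, m + 1) / m
    # index of the last observation at or before each grid point tau_r
    i1 = np.maximum(np.searchsorted(t1, taus, side="right") - 1, 0)
    i2 = np.maximum(np.searchsorted(t2, taus, side="right") - 1, 0)
    # previous-tick prices on the grid; differences across the grid
    d1 = np.diff(y1[i1], prepend=y1[0])
    d2 = np.diff(y2[i2], prepend=y2[0])
    return np.sum(d1 * d2)

# With synchronized observations of the path y(t) = t, the estimator reduces
# to the ordinary realized covariance: 10 increments of 0.1 each, sum 0.1
t = np.arange(0, 101) / 100
print(previous_tick_rv(t, t, t, t, 10))
```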

[Figure: previous-tick sampling times on the common grid for Stock 1, Stock 2, and Stock 3]

Take n = (n_1 + ... + n_p)/p, τ_k = τ + (k−1)/n, k = 1, ..., K = [n/m], and average:

    ˜Γ = (1/K) Σ_{k=1}^K ˆΓ(τ_k),

where the diagonal entries ˜Γ_ii are adjusted by subtracting from them the estimated noise variance components

    (2m/n_i) Σ_{l=1}^{n_i} [Y_i(t_{i,l}) − Y_i(t_{i,l−1})]².

Matrix Size

For each entry ˆΓ_{i_1, i_2}:

    ˆΓ_{i_1, i_2} − Γ_{i_1, i_2} = O_P(n^{−η}),  η = 1/2, 1/3, 1/4, or 1/6.

Dimension Reduction: For moderate to large p, the (p² + p)/2 entries in Γ amount to too many parameters, each subject to too much random fluctuation.

Issue: Usual dimension reduction techniques are not applicable to non-synchronized data.

Numerical Illustration

X(t) = (W_1(t), ..., W_p(t)): vector of p independent Brownian motions. Observations: X(k/n), k = 0, 1, ..., n. Then Γ = I_p, and

    ˜Γ = (˜Γ_ij),  ˜Γ_ij = (1/n) Σ_{k=1}^n Z_ik Z_jk,

    Z_ik = √n [W_i(k/n) − W_i((k−1)/n)] ~ N(0, 1).

Take n = 100 and p = 100. We compute the eigenvalues of ˜Γ in a simulation with 50 replications.
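This experiment is easy to reproduce. One replication, sketched below (seed and variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 100

# Z_ik = sqrt(n) * (W_i(k/n) - W_i((k-1)/n)) are i.i.d. N(0, 1), so the
# population integrated volatility matrix is Gamma = I_p
Z = rng.normal(size=(p, n))
Gamma_tilde = Z @ Z.T / n

eig = np.linalg.eigvalsh(Gamma_tilde)
print(eig.min(), eig.max())   # spread far from 1 despite Gamma = I_p
```

With p = n the eigenvalues spread over roughly [0, 4] (the Marchenko-Pastur support) even though every true eigenvalue is 1; this is exactly the fluctuation the regularization step targets.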

[Figure: (a) 50 sets of 100 ordered eigenvalues; (b) 50 pairs of max and min eigenvalues]

Regularize Volatility Matrix

Write Γ = (Γ_ij).

Sparsity: assume Γ has a sparse representation

    Σ_{j=1}^p |Γ_ij|^δ ≤ M π(p),  i = 1, ..., p,  E[M] ≤ C,

where 0 ≤ δ < 1 and π(p) = 1, log p, or a small power of p.

Examples:
(1) Block diagonal matrices
(2) Matrices whose elements decay away from the diagonal
(3) Matrices with a small number of non-zero elements in each row
(4) Random permutations of the rows and columns of the above matrices

Estimation with Regularization

Write ˆΓ = (ˆΓ_ij).

Thresholding: for sparse Γ, regularize ˆΓ by thresholding:

    T_ϖ[ˆΓ] = ( ˆΓ_ij 1(|ˆΓ_ij| ≥ ϖ) ),

where ϖ is a threshold.
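The thresholding operator is essentially one line with numpy (a sketch; the function name is ours):

```python
import numpy as np

def hard_threshold(mat, w):
    """Entrywise hard thresholding T_w[mat]: keep entries with
    |entry| >= w, zero out the rest."""
    return np.where(np.abs(mat) >= w, mat, 0.0)

A = np.array([[1.00, 0.03],
              [0.03, 0.50]])
print(hard_threshold(A, 0.1))   # small off-diagonal entries are zeroed
```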

Asymptotic Theory

Define the matrix norm ‖Γ‖_2 = sup{ ‖Γ x‖_2 : ‖x‖_2 = 1 } = maximum absolute eigenvalue.

Technical conditions:

A1: For some β > 0,

    max_{1≤i≤p} max_{0≤t≤1} E[ γ_ii(t)^β ] < ∞,
    max_{1≤i≤p} max_{0≤t≤1} E[ |μ_i(t)|^β ] < ∞,
    max_{1≤i≤p} E[ |ε_i(t_il)|^{2β} ] < ∞.

A2: Each asset has at least one observation between consecutive time points of the selected sampling frequency. With n = (n_1 + ... + n_p)/p,

    C_1 ≤ min_{1≤i≤p} n_i/n ≤ max_{1≤i≤p} n_i/n ≤ C_2,
    max_{1≤i≤p} max_{1≤l≤n_i} |t_il − t_{i,l−1}| = O(n^{−1}).

Theorem

For sparse Γ, under conditions A1-A2, we have

    ‖T_ϖ[ˆΓ] − Γ‖_2 = O_P( π(p) [ p e_n^{β/2} ]^{2(1−δ)/β} ),

where the threshold ϖ = e_n p^{2/β} log log(np), and

    for noisy data, e_n ≍ n^{−1/6}: convergence rate π(p) [ p n^{−β/12} ]^{2(1−δ)/β};
    for noiseless data, e_n ≍ n^{−1/3}: convergence rate π(p) [ p n^{−β/6} ]^{2(1−δ)/β}.

Some Insights

(1) Multiple random sources; (2) Fat tails; (3) Non-synchronization problem.

    Y_i(t_{i,j}) = X_i(t_{i,j}) + ε_i(t_{i,j}),

    X_i(t_{i,j}) = ∫_0^{t_{i,j}} μ_{i,s} ds + ∫_0^{t_{i,j}} σ_{i,s} dW_s.

Simulations

    Y_i(t_{ij}) = ∫_0^{t_{ij}} σ_{i,s} dB_s + ε_i(t_{ij}),  i = 1, ..., p,  j = 1, ..., n,

with p = 512, n = 200, γ(t) = σ_t σ_t'. For γ_ii(t): geometric OU, sum of two CIR, Nelson GARCH diffusion, and two-factor log-linear SV models.

Correlations:

    κ(t) = (e^{2u(t)} − 1)/(e^{2u(t)} + 1),
    γ_ij(t) = {κ(t)}^{|i−j|} √(γ_ii(t) γ_jj(t)),  1 ≤ i ≠ j ≤ p,
    du(t) = 0.03 [0.64 − u(t)] dt + 0.118 u(t) dW_{κ,t},
    W_{κ,t} = 0.96 W⁰_{κ,t} − 0.2 Σ_{i=1}^p B_{it} / √p.

[Figure: sample paths of κ(t) for κ(0) = 0.537, 0.762, 0.905, 0.964, 0.98, 0.995]

Simulate γ_ii: the Brownian motions governing γ_ii are constructed as

    W¹_{it} = ρ_i B_{it} + √(1 − ρ_i²) U¹_{it},
    W²_{it} = ρ_i B_{it} + √(1 − ρ_i²) U²_{it},

where U¹_i and U²_i are independent standard BMs, and

    ρ_i = 0.62,  1 ≤ i ≤ p/4,
          0.50,  p/4 < i ≤ p/2,
          0.25,  p/2 < i ≤ 3p/4,
          0.30,  3p/4 < i ≤ p.

Geometric OU process:

    d log γ_ii(t) = −0.6 (0.157 + log γ_ii(t)) dt + 0.25 dW¹_{i,t}.

Sum of two CIR processes:

    γ_ii(t) = 0.98 (v_{1,t} + v_{2,t}),
    dv_{1,t} = 0.0429 (0.108 − v_{1,t}) dt + 0.1539 √v_{1,t} dW¹_{i,t},
    dv_{2,t} = 3.74 (0.401 − v_{2,t}) dt + 1.4369 √v_{2,t} dW²_{i,t}.

Nelson GARCH diffusion:

    dγ_ii(t) = [0.1 − γ_ii(t)] dt + 0.2 γ_ii(t) dW¹_{i,t}.

Two-factor log-linear SV model:

    γ_ii(t) = e^{−6.8753} s-exp(0.04 v_{1,t} + 1.5 v_{2,t} − 1.2),
    dv_{1,t} = −0.00137 v_{1,t} dt + dW¹_{i,t},
    dv_{2,t} = −1.386 v_{2,t} dt + (1 + 0.25 v_{2,t}) dW²_{i,t},

    s-exp(u) = e^u,                                      if u ≤ log(8.5),
             = 8.5 {1 − log(8.5) + u²/log(8.5)}^{1/2},   if u > log(8.5).
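All of these volatility specifications can be simulated with an Euler-Maruyama scheme. A sketch for the GARCH diffusion case (the function name and seed are ours, and the mean-reverting drift sign is our reading of the slide):

```python
import numpy as np

def euler_garch_diffusion(gamma0, T=1.0, nsteps=200, seed=0):
    """Euler-Maruyama discretization of the Nelson GARCH diffusion
    d gamma = (0.1 - gamma) dt + 0.2 * gamma dW (a sketch)."""
    rng = np.random.default_rng(seed)
    dt = T / nsteps
    g = np.empty(nsteps + 1)
    g[0] = gamma0
    for k in range(nsteps):
        dW = rng.normal(0.0, np.sqrt(dt))
        g[k + 1] = g[k] + (0.1 - g[k]) * dt + 0.2 * g[k] * dW
    return g

path = euler_garch_diffusion(0.1)
print(path.min(), path.max())
```

The same loop, with the drift and diffusion coefficients swapped out, covers the geometric OU, CIR, and log-linear SV factors.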


ε_i(t_{ij}) i.i.d. N(0, a²) with a = 0.002, 0.127, 0.2 for low, medium, and high noise levels.

MSE for noisy synchronized data

                        κ(0)
Noise   Estimator   0.537   0.762   0.905    0.964    0.995
Low     ˆΓ          5.595   6.039   7.511    9.959   18.270
Low     T[ˆΓ]       0.845   2.456   4.595    7.457
Med     ˆΓ          5.641   6.097   7.649   10.479   18.398
Med     T[ˆΓ]       0.871   2.466   4.101    7.680
High    ˆΓ          5.769   6.234   7.717   10.521   19.26
High    T[ˆΓ]       0.896   2.429   4.043    8.765

Simulate 3n = 600 observations and divide them into 200 groups of 3 consecutive observations. Randomly pick one observation from each group to form non-synchronized data.

MSE for noisy non-synchronized data

                        κ(0)
Noise   Estimator   0.537    0.762    0.905    0.964    0.995
Low     ˆΓ          12.86    16.99    27.13    48.37   152.75
Low     T[ˆΓ]        3.842    5.281   13.38    30.29   121.15
Med     ˆΓ          12.98    17.10    27.15    48.57   153.57
Med     T[ˆΓ]        3.374    4.728   11.662   30.53   123.22
High    ˆΓ          13.16    17.15    27.50    48.07   151.85
High    T[ˆΓ]        3.997    4.902   11.70    29.98   100.13
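The non-synchronization device can be sketched as follows (our own illustration of the sampling scheme, not the talk's code): one random point survives from each group, so different assets end up observed at mismatched times.

```python
import numpy as np

rng = np.random.default_rng(2)
n_groups, group_size = 200, 3   # 3n = 600 points split into 200 groups of 3

# For one asset: keep one randomly chosen point per group; a fresh draw
# for each asset yields mismatched observation times across assets
picks = rng.integers(0, group_size, size=n_groups)
times = (np.arange(n_groups) * group_size + picks + 1) / (n_groups * group_size)
print(times[:5])
```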

Application

High-frequency price data on 630 stocks traded on the Shanghai market over 177 days in 2003. For each day we compute a volatility matrix estimator ˜Γ_i, i = 1, ..., 177.

[Figure: average RV matrix at various threshold levels]

[Figure: average wavelet RV matrix at various threshold levels]

[Figure: concentration plots of RV and wavelet RV estimators for 9 days]

Threshold ˜Γ_i

1. Calibrate the threshold value: let ϖ_{i,a} = the a-quantile of the absolute entries of ˜Γ_i, and choose the value of a to minimize

       Λ(a) = Σ_{i=1}^{176} ‖ ˜Γ_{i+1} − T_{ϖ_{i,a}}[˜Γ_i] ‖_2².

   We find â = 0.95 and evaluate T_{ϖ_{i,0.95}}[˜Γ_i].

2. Compute the eigenvalues of ˜Γ_i and T_{ϖ_{i,0.95}}[˜Γ_i] and plot their corresponding largest eigenvalues for comparison.
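A sketch of this calibration (helper names are ours; `mats` stands for the list of daily RV matrix estimates ˜Γ_i):

```python
import numpy as np

def calibrate_quantile(mats, quantiles):
    """For each candidate level a, threshold day i's matrix at the a-quantile
    of its absolute entries and measure how well the result predicts day
    i+1's raw matrix in squared spectral norm; return the best level a."""
    def spectral(M):
        sym = (M + M.T) / 2          # symmetrize before the eigen-decomposition
        return np.max(np.abs(np.linalg.eigvalsh(sym)))

    best_a, best_loss = None, np.inf
    for a in quantiles:
        loss = 0.0
        for i in range(len(mats) - 1):
            w = np.quantile(np.abs(mats[i]), a)
            thresholded = np.where(np.abs(mats[i]) >= w, mats[i], 0.0)
            loss += spectral(mats[i + 1] - thresholded) ** 2
        if loss < best_loss:
            best_a, best_loss = a, loss
    return best_a, best_loss

# Toy demo: five noisy copies of a sparse (diagonal) daily matrix
rng = np.random.default_rng(3)
mats = [np.eye(4) + 0.01 * rng.normal(size=(4, 4)) for _ in range(5)]
print(calibrate_quantile(mats, [0.5, 0.75, 0.9]))
```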

[Figure: largest eigenvalues of ˜Γ_i and their regularizations]

Largest eigenvalues for ˜Γ_i, regularizations, and wavelet versions

[Figure: superimposed plot of the largest eigenvalue by day for Gamma, T(Gamma), and T(W)]

[Figure: zoomed-in superimposed plot of the largest eigenvalue by day for Gamma, T(Gamma), and T(W)]

Concluding Remarks

1. Good matrix estimators may perform poorly when the matrix size is very large. We need to regularize large sample covariance and RV matrix estimators.

2. For sparse matrices, thresholding yields good performance for sample covariance and RV based matrix estimators.

Papers can be downloaded from http://www.stat.uconn.edu/~yzwang