

Testing Jumps via False Discovery Rate Control

Yu-Min Yen. Institute of Economics, Academia Sinica, Taipei, Taiwan. E-mail: YMYEN@econ.sinica.edu.tw.

SUPPLEMENTARY MATERIALS

Supplementary Materials contain the following sections:

Some Proofs: This section provides proofs of some theoretical results in section 3.

The PRDS condition: This section provides a more detailed discussion on the PRDS condition.

Simulation with the SVFJ model: This section provides simulation results from another stochastic volatility plus jump model, SVFJ [1].

Data descriptions: This section provides descriptions of the real data used in section 5. Some discussions on the daily realized variance, bipower variation, the jump test statistics and microstructure issues of the data are also presented.

Proofs of some theoretical results

Proof of Theorem 1

Proof. Let us start our proof from how to construct the probability of rejecting $v$ true null and $s$ false hypotheses under the BH procedure. Without loss of generality, suppose that the first $m_0$ hypotheses are true, and the rest are false. Now consider events such that we reject the first $v$ true null hypotheses and the first $s$ false hypotheses. Let the optimal significance level selected by the BH procedure be $k\gamma/m = q_{v+s}$. Then

$$\Pr\big(p_{M,1} \le q_{v+s}, \ldots, p_{M,v} \le q_{v+s},\; p_{M,v+1} > q_{v+s+1}, \ldots, p_{M,m_0} > q_{m_0+s},\; p_{M,m_0+1} \le q_{v+s}, \ldots, p_{M,m_0+s} \le q_{v+s},\; p_{M,m_0+s+1} > q_{m_0+s+1}, \ldots, p_{M,m} > q_m\big)$$

represents the probability of one such event. Note that here $k = v+s$, and $q_i = i\gamma/m$ for $i = v+s+1, \ldots, m$ is the criterion corresponding to a hypothesis which is not rejected. Consider the $m$-dimensional cube

$$[0, q_{v+s}]^{v} \times \prod_{i=v+s+1}^{m_0+s} (q_i, 1] \times [0, q_{v+s}]^{s} \times \prod_{i=m_0+s+1}^{m} (q_i, 1],$$

so that the above probability can be rewritten as the probability that $p_M$ falls in this cube. Let $E = \prod_{i=1}^{m} [0, 1]$ be the $m$-fold product of the interval $[0, 1]$. Note that the joint density of $p_M$ is integrable over the set $E$. The cube is a subset of $E$, so this probability exists. By suitably varying permutations of the intervals $[0, q_{v+s}]$ and $(q_i, 1]$, $i = v+s+1, \ldots, m$, we can obtain different $m$-dimensional cubes to construct sets for the events of rejecting $s$ false and $v$ true null hypotheses, and the total number of such permutations is $\binom{m_0}{v}\binom{m-m_0}{s}(m-s-v)!$.

To see this, at first we focus on the events when $p_{M,1} \le q_{v+s}, \ldots, p_{M,v} \le q_{v+s}$ and $p_{M,m_0+1} \le q_{v+s}, \ldots, p_{M,m_0+s} \le q_{v+s}$ occur, and the rest of the p-values are greater than their corresponding significance

levels. In this case, there are in total $(m-s-v)!$ possible permutations of $(q_i, 1]$ for these non-rejected hypotheses. Take the union, over $j = 1, \ldots, (m-s-v)!$, of the corresponding cubes to represent such events; obviously, this union is also a subset of $E$. Furthermore, if we vary permutations of the interval $[0, q_{v+s}]$ for the $s$ false and the $v$ true null hypotheses, there are $\binom{m_0}{v}\binom{m-m_0}{s}$ such different permutations. For each $h = 1, \ldots, \binom{m_0}{v}\binom{m-m_0}{s}$, take the union over $j = 1, \ldots, (m-s-v)!$ of the corresponding $m$-dimensional cubes, and finally take the union over $h$ of these sets. The resulting set is a subset of $E$, since every cube indexed by $(h, j)$ is a subset of $E$. When there are $m_0$ true null hypotheses, the probability of rejecting $v$ true null and $s$ false hypotheses under the BH procedure is thus given by the probability that $p_M$ falls in this union, which equals the sum over $h$ and $j$ of the probabilities that $p_M$ falls in the cube indexed by $(h, j)$.

The same approach can be used to construct the probability of rejecting $v$ true null and $s$ false hypotheses when we implement the BH procedure with $p$. Furthermore, if the consistency for multivariate distribution holds, both probabilities exist when $m \to \infty$. Then $E(V/R)$ and $E_{p_M}(V/R)$ can be expressed as functions of the marginal distributions of the p-values. Let us use $E(V/R)$ as an example. As shown in Lemma 4.1 of [2], the probability of rejecting $v$ true null and $s$ false hypotheses equals $\frac{1}{v}$ times the sum, over $i = 1, \ldots, m_0$, of the probability that $p_i \le q_{v+s}$ occurs jointly with this rejection event. Therefore $E(V/R)$, which is the sum over $s$ and $v$ of $\frac{v}{v+s}$ times the probability of rejecting $v$ true null and $s$ false hypotheses, equals the sum over $s$, $v$ and $i = 1, \ldots, m_0$ of $\frac{1}{v+s}$ times these joint probabilities.

Let Λ v,s i, denote the event that if p i q v+s occurs, then v true null and s false hypotheses are rejected. We can see that {p i q v+s } p h, h {p i q v+s } Λ v,s i,. Also let q {q v+s : v + s } α, and Λ i, { } Λ v,s i, : v + s. Note that Λ v,s i, is mutually disjoint for different v and s. Λ i, is the event that except Hi 0, we reject the other hypotheses given true null hypotheses, and it is disjoint for different i. Then Considering the same way, Thus V E R s0 v s0 v p M,i q Λi,0, an analog of E pm V R v + s p i q v+s p v + s p i q v+s Λ v,s p i q Λ i,0. h, h i, p i q Λ when p i,0 M is used. Following p M,i q Λ i,0. V V E p M R E R p M,i q Λ i,0 p i q Λ i,0.

Note that the consistency for multivariate distribution should hold; then the above joint probability functions exist when $m \to \infty$. p i q Λ is just the probability that if p i,0 i q, then the other hypotheses are rejected. Therefore p i q Λ can be explicitly expressed as i,0 Then p i q Λ i,0 p i q, p i q, p i p i q, p i p i q Λ i,0 > q +,... p i > q > q +,... p i > q p i q, p i > q, p i > q +,... p i > q. p i q, p i > q +,... p i > q p i q, p i > q, p i > q +,... p i > q.

The first term of the above summation is ter is/ is 2 2 p i q p i q, p i > q p i q, p i > q 2,... p i > q, while the last. Summation of the middle $m - 2$ terms p i q, p i > q +,... p i > q p i q, p i > q, p i > q +,... p i > q p i q, p i > q +,... p i > q 2 + p i q +, p i > q +, p i + > q +2,... p i > q p i q, p i > q +,... p i > q + p i q +, p i p i q, p i > q 2+,... p i > q > q +, p i + > q +2,... p i > q + p i q, p i > q.

Therefore p i q Λ i,0 p i q, p i > q +,..., p i > q p i q +, p i + > q +,..., p i > q + p i q In a similar way, p M,i q Λ i,0 p M,i q, p i M, > q +,..., p i M, > q + p M,i q +, p i M, > q +,..., p i M, > q + p M,i q. Note that $q_k = k\gamma/m$, so in general as $m$ grows large, p i q, p i > q +,..., p i > q p i q +, p i > q +,..., p i > q.

Then p i q, p i > q +,..., p i > q + p i q +, p i > q +,..., p i > q p i q, p i > q +,..., p i > q p i q, p i + > q +,..., p i > q + p i q, p i > q +,..., p i > q. Also p M,i q, p i M, > q +,..., p i M, > q + p M,i q +, p i M, > q +,..., p i M, > q + p M,i q, p i M, > q +,..., p i M, > q. Finally V V E p M R E R p M,i q Λ i,0 p i q Λ i,0 i I p M,i q, p i M, > q +,..., p i M, > q 0 + p i q, p i > q +,..., p i > q + p M,i q p i q p M,i q, p i M, > q +,..., p i M, > q + p i q, p i > q +,..., p i > q + p M,i q p i q.

If condition 4 holds, the second term of the last inequality is bounded by $O(1/M^{\delta})$. If condition 5 holds, the first term of the last inequality becomes p M,i q, p i M, > q +,..., p i M, > q + p i q, p i > q +,..., p i > q sup sup p M,i q, p i M, > q +,..., p i M, > q p i q, p i > q +,..., p i > q 0 o.

We then can conclude that V V E p M R E R 0 o + 0 O M δ. Then V V E p M E R R V 0 E p M R 0 0 E V 00 R 0 0 VR V E pm E R 0 0 o + 0 O M δ 0 E o + O M δ o.

As shown in the proof of Theorem 1.2 of [2], if condition 2 holds, then Λ i, p i q. By the assumption that p i q γ, {p i q } Λ Λ i,0 i, p i q γ. Thus V E R V E R γ 0 0 E {p i q } Λ i, Λ i, p i q p i q Λ i, p i q γ Λ p i,0 i q γ γ, V R γ E γ γ.

Finally we can conclude that $\lim_{M \to \infty} E_{p_M}(V/R) = E(V/R) \le \gamma$.
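The BH step-up rule with thresholds $q_k = k\gamma/m$, whose false discovery rate the proof above bounds by $\gamma$, can be sketched in a few lines. This is only a minimal illustration: the function name `benjamini_hochberg` and the example p-values are hypothetical and do not come from the paper.

```python
def benjamini_hochberg(pvalues, gamma=0.05):
    """Return a list of booleans: True where the BH step-up rule rejects.

    Thresholds are q_k = k * gamma / m, with m the number of hypotheses.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_star = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up: keep the LARGEST rank k with p_(k) <= q_k.
        if pvalues[idx] <= rank * gamma / m:
            k_star = rank
    reject = [False] * m
    for idx in order[:k_star]:
        reject[idx] = True
    return reject


# Hypothetical p-values: the smallest one misses q_1 = 0.0125, but the second
# meets q_2 = 0.025, so the step-up rule rejects both.
print(benjamini_hochberg([0.013, 0.02, 0.9, 0.9]))  # [True, True, False, False]
```

The example shows why the procedure is called step-up: a p-value that fails its own threshold can still be rejected when a larger p-value meets a later threshold.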

Proof of Theorem 2

Proof. To start our proof, we first take a look at the inequality $\Pr(p_{M,i} \le a) \le a$, where $a \in (0, 1)$ and $i \in I_0$. Suppose that $a = q_k = k\gamma/m$, $k = 1, \ldots, m$, and $\gamma \in (0, 1)$; then the above inequality becomes p M,i q γ. It implies p M,i q γ for all $k = 1, \ldots, m$ and $i \in I_0$. Let p M,i q F pi,M q, therefore for $i \in I_0$, F pi,M q is bounded by $\gamma$ as $m \to \infty$. Furthermore, since $T_1, \ldots, T_m$ are continuous random variables, $\Pr(p_i \le q_k) = k\gamma/m$. Let F pi q p i a /, then for $i \in I_0$, F pi q is also bounded. Since both F pi q and F pi,M q are bounded and continuous functions of p i q and p M,i q respectively, we can conclude that as $M \to \infty$, if sup sup p M,i q p i q O M δ, then sup sup sup F pM,i q F pi q sup p M,i q p i q O M δ.

Since $T_1, \ldots, T_m$ are independent, $p_1, \ldots, p_m$ are also independent. Therefore the event Λ i, and {p i q } are independent, and Λ i,0 p i q Λ i,0. Furthermore, by Λ i, are mutually exclusive for and is the whole space, therefore Λ i, Λ p i,0 i q Λ i,0 Λ i,. Since T M,,..., T M, are also mutually independent, by a similar argument as above, Λ i,. From the proof of Theorem 1, we know that V V E p M R E R p M,i q Λ i,0 p i q Λ i,0. Λ i,

It can be shown that p M,i q Λ i,0 p i q Λ i,0 Λ p i,0 M,i q p M,i q Λ p i,0 M,i q p i q + Λ p i,0 M,i q p i q Λ p i,0 i q p i q Λ p i,0 M,i q p M,i q p i q + Λ p i,0 M,i q Λ p i,0 i q p i q. Therefore O since Λ V V E p M R E R p M,i q Λ i,0 p i q Λ i,0 Λ i,0 p M,i q p M,i q p i q + Λ p i,0 M,i q Λ p i,0 i q p i q Λ i,0 p M,i q p M,i q p i q + Λ p i,0 M,i q Λ i, p i q Λ i,0 sup sup i I 0 p M,i q p i q + Λ i,0 Λ i, γ Λ i,0 O M δ + γ M δ, i, Λ i,0 0. So V V E p M E R R VR E pm E 0 E O M δ o. V R Λ i,0 Λ i,

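Finally, if $T_1, \ldots, T_m$ are mutually independent, their joint distribution is PRDS on the subset of p-values corresponding to true null hypotheses. Thus the conclusion follows.

Proof of Proposition 1

Similar to [3], we apply the Orlicz norm to prove the proposition. The Orlicz norm $\|U\|_{\psi}$ is defined as

$$\|U\|_{\psi} = \inf\left\{ c_3 > 0 : E\,\psi\!\left(\frac{|U|}{c_3}\right) \le 1 \right\},$$

where $\psi$ is a non-decreasing and convex function with $\psi(0) = 0$. As suggested by [4], we set $\psi$ as $\psi_\rho(u) = \exp(u^\rho) - 1$ in the following proof. The corresponding Orlicz norm of $\psi_\rho(u)$ is called an exponential Orlicz norm. For all nonnegative $u$, $u^\rho \le \psi_\rho(u)$, which implies that $\|U\|_\rho \le \|U\|_{\psi_\rho}$ for each $\rho$.

Proof. Let $M^{\delta}|T_{M,i} - T_i| = U_{i,M}$. With $\psi_\rho(u) = \exp(u^\rho) - 1$ and $\psi_\rho^{-1}(m) = (\log(1+m))^{1/\rho}$, the proof directly follows from Lemmas 2.2.1 and 2.2.2 in [4]. Given $m_0$ true null hypotheses, as $M \ge M_0$,

$$\Big\|\max_{i \in I_0} U_{i,M}\Big\|_{\rho} \le \Big\|\max_{i \in I_0} U_{i,M}\Big\|_{\psi_\rho} \le c_5 (\log(1+m))^{1/\rho} \max_{i \in I_0} \|U_{i,M}\|_{\psi_\rho} \le c_5 (\log(1+m))^{1/\rho} \left(\frac{1+c_1}{c_2}\right)^{1/\rho} \le 2c_5 (\log m)^{1/\rho} \left(\frac{1+c_1}{c_2}\right)^{1/\rho},$$

by $\log(1+m) \le 2\log m$. Thus

$$\Big\|\max_{i \in I_0} |T_i - T_{M,i}|\Big\|_{\rho} = \frac{\big\|\max_{i \in I_0} U_{i,M}\big\|_{\rho}}{M^{\delta}} \le c_6 \frac{(\log m)^{1/\rho}}{M^{\delta}},$$

where $c_6 = 2c_5\left[\frac{1+c_1}{c_2}\right]^{1/\rho} < \infty$. Therefore if $M^{-\delta}(\log m)^{1/\rho} = o(1)$ as $M, m \to \infty$, we can conclude that $T_{M,i} \xrightarrow{P} T_i$ for all $i \in I_0$, and $\sup_{k} \sup_{i \in I_0} |\Pr(p_{M,i} \le q_k) - \Pr(p_i \le q_k)| = o(1)$, since convergence in probability implies convergence in law.

More discussions on the PRDS condition

PRDS is a special case of positive regression dependence. Lehmann [5] defined a random variable $Y$ to be positive regression dependent on a random variable $X$ if

$$\Pr(Y \le y \mid X = x) \quad (1)$$

is non-increasing in $x$, while $Y$ is negative regression dependent on $X$ if $\Pr(Y \le y \mid X = x)$ is non-decreasing in $x$. Positive (negative) regression dependence of $Y$ on $X$ is also called stochastic monotonicity of $\Pr(Y \le y \mid X = x)$.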

Positive regression dependence of $Y$ on $X$ also implies that for all $x \le x'$,

$$\Pr(Y \le y \mid X \le x) \ge \Pr(Y \le y \mid X \le x'), \quad (2)$$

and

$$\Pr(Y \le y, X \le x) \ge \Pr(Y \le y)\Pr(X \le x). \quad (3)$$

When (3) holds, $X$ and $Y$ are called positively quadrant dependent. It says that the more likely $X$ is to be small (large), the more likely $Y$ is also to be small (large). If we let $x' \to \infty$, then (2) becomes (3). With simple algebra, it can be shown that the positive regression dependence condition (1) implies (2), and (2) implies (3). All of the three conditions can be extended to multiple variables. Positive regression dependence of an $l$-dimensional random vector $Y$ on an $m$-dimensional random vector $X$ means that

$$\Pr(Y_1 \le y_1, \ldots, Y_l \le y_l \mid X_1 = x_1, \ldots, X_m = x_m) \quad (4)$$

is non-increasing in $x_1, \ldots, x_m$. Obviously, the condition that $Y$ is PRDS on a subset $I_0$ of $X$ is less stringent than (4).

Another frequently used but more restrictive criterion for dependency of multivariate random variables is multivariate total positivity of order 2 (MTP$_2$). Karlin and Rinott [6] defined an $m$-dimensional random vector $X$ to have an MTP$_2$ distribution if the corresponding joint density $f_X$ satisfies

$$f_X(y \vee z)\, f_X(y \wedge z) \ge f_X(y)\, f_X(z),$$

where $y = (y_1, \ldots, y_m)$, $z = (z_1, \ldots, z_m)$, $y \vee z = (\max(y_1, z_1), \ldots, \max(y_m, z_m))$ and $y \wedge z = (\min(y_1, z_1), \ldots, \min(y_m, z_m))$. The number of dimensions can be extended to infinity or even to the continuous case. MTP$_2$ implies positive regression dependence, and therefore implies PRDS [7]. It can be shown that a joint density of random variables $X_i$ satisfying MTP$_2$ implies $\mathrm{Cov}(X_i, X_j) \ge 0$ for $i, j = 1, \ldots, m$. Nevertheless, except in the case of the multivariate normal, PRDS and $\mathrm{Cov}(X_i, X_j) \ge 0$ may not imply each other [2]. In a more general situation, empirically verifying whether the data structure satisfies the above conditions may be difficult. But some solutions have been suggested, for example, a nonparametric test for stochastic monotonicity proposed by [8].
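The MTP$_2$ inequality can be checked numerically for a simple case. The sketch below evaluates the log of $f_X(y \vee z) f_X(y \wedge z) / (f_X(y) f_X(z))$ for a standard bivariate normal density with correlation `rho`; the choice of this density and the helper names `log_density_bvn` and `mtp2_gap` are assumptions made for illustration only and are not part of the paper.

```python
def log_density_bvn(x, rho):
    """Log-density of a standard bivariate normal with correlation rho,
    up to an additive constant (the constant cancels in mtp2_gap)."""
    x1, x2 = x
    return -(x1 * x1 - 2.0 * rho * x1 * x2 + x2 * x2) / (2.0 * (1.0 - rho * rho))


def mtp2_gap(y, z, rho):
    """log f(y v z) + log f(y ^ z) - log f(y) - log f(z).

    MTP2 requires this gap to be non-negative for every pair (y, z).
    """
    upper = tuple(max(a, b) for a, b in zip(y, z))  # componentwise maximum
    lower = tuple(min(a, b) for a, b in zip(y, z))  # componentwise minimum
    return (log_density_bvn(upper, rho) + log_density_bvn(lower, rho)
            - log_density_bvn(y, rho) - log_density_bvn(z, rho))


# Positive correlation: the inequality holds for this pair of points.
print(mtp2_gap((1.0, 0.0), (0.0, 1.0), rho=0.5) >= 0)   # True
# Negative correlation: the same pair violates the inequality.
print(mtp2_gap((1.0, 0.0), (0.0, 1.0), rho=-0.5) >= 0)  # False
```

For this pair of points the gap equals $\rho/(1-\rho^2)$, so its sign follows the sign of the correlation, in line with the statement that MTP$_2$ implies non-negative covariances.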
A Simulation study with SVFJ

For an additional simulation study, we use the following stochastic volatility with one jump component model (SVFJ), which was also considered in [1]:

$$d\log P_t = \mu\,dt + \exp(\beta_0 + \beta_1 \sigma_t)\,dW_{1,t} + dJ_t,$$
$$d\sigma_t = a\sigma_t\,dt + dW_{2,t},$$
$$J_t = \sum_{j=1}^{N_t} D_{t,j}, \quad D_{t,j} \overset{iid}{\sim} N(0, 1), \quad N_t \overset{iid}{\sim} \mathrm{Poisson}(\lambda\,dt),$$

where $dW_{1,t}$ and $dW_{2,t}$ follow the standard Brownian motion, and $\sigma^2_t$ follows a simple stochastic process. $J_t$ follows a Compound Poisson Process (CPP) with a constant intensity $\lambda\,dt$, and $N_t$ is the number of jumps occurring within the small interval $(t - dt, t]$.
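A discretized version of the SVFJ dynamics can be sketched with a simple Euler scheme. Everything below is an illustrative assumption rather than the simulation design of the paper: the function names, the default parameter values, the jump intensity `lam`, the step size, and the use of a correlation `rho` between the two Brownian increments are all placeholders.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw a Poisson count with Knuth's multiplication method
    (adequate for the small means lam * dt used here)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate_svfj(n_steps, dt, mu=0.03, beta0=0.0, beta1=0.125, a=-0.1,
                  rho=-0.62, lam=0.1, seed=0):
    """Euler sketch of
        d log P = mu dt + exp(beta0 + beta1 * sigma) dW1 + dJ,
        d sigma = a * sigma dt + dW2,
    with N(0, 1) jump sizes arriving at Poisson rate lam.
    Returns the log-price increments and a per-step jump indicator."""
    rng = random.Random(seed)
    sigma = 0.0
    increments, jump_flags = [], []
    for _ in range(n_steps):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dw1 = math.sqrt(dt) * z1
        # Correlate the volatility shock with the price shock.
        dw2 = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        n_jumps = poisson_draw(rng, lam * dt)
        jump = sum(rng.gauss(0.0, 1.0) for _ in range(n_jumps))
        increments.append(mu * dt + math.exp(beta0 + beta1 * sigma) * dw1 + jump)
        jump_flags.append(n_jumps > 0)
        sigma += a * sigma * dt + dw2
    return increments, jump_flags


# One hypothetical "day" of 390 one-minute steps, with dt measured in days.
increments, jump_flags = simulate_svfj(n_steps=390, dt=1.0 / 390)
print(len(increments), sum(jump_flags))
```

Setting `lam=0` switches the jump component off, which gives a convenient no-jump benchmark when checking how often a jump test rejects by chance.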

For the simulation, we set the parameters to the following values: $\mu = 0.03$, $\beta_0 = 0$, $\beta_1 = 0.125$, and $a = -0.1$. In addition, we also add the leverage effect into the model, and the correlation between $dW_{1,t}$ and $dW_{2,t}$ is set to $-0.62$. All of the other settings for the simulation are the same as in the SVJ case. Relevant results are shown in Figure S1 to Figure S5. It can be seen that all the results are qualitatively similar to those of the SVJ case.

Data descriptions

The raw data used for the empirical applications are one-minute recorded prices of the S&P 500 (SPC500) index in cash and the Dow Jones Industrial Average (DJIA) index. The sample period spans from Jan-02-2003 to Dec-31-2007. The data sets are provided by Olsen Financial Technologies in Zürich, Switzerland. During the sample period, the market closed at 1 pm on a few days. Such days were inactive trading days, and we exclude them from our samples. After eliminating these inactive trading days, we have 1247 active trading days for both the DJIA and S&P 500 indices. In our empirical analysis in section 5, all estimated realized price variations and test statistics are based on the data from the 1247 active trading days. To estimate the intradaily price variations, we use five-minute log returns but exclude overnight returns.

Some issues of microstructure noise are also of concern here. When observed prices contain microstructure noise, realized variations estimated with different sampling frequencies will have different degrees of bias. Since the two indices are not really traded, their price series would be less likely to suffer distortions from the microstructure noise than those of traded futures. The property of immunizing the microstructure noise can be seen in Figure S6, which shows volatility signature plots. The horizontal dashed line in each plot is the average daily realized variance when the 5-min log returns are used. It can be seen that the average values of the realized variances are downward biased when their sampling intervals are small. As the sampling interval becomes moderately large, the average values become stable, and the bias is mitigated. However, the downward bias reappears when the sampling interval increases beyond one hour. From the figure, we can see that the realized variances estimated from the 5-min log return data seem to suffer little microstructural effect. This is the reason why the 5-min log return data are used to construct the realized variance estimates.

We then calculate the three different jump test statistics Z.5,i, $Z_{\log,i}$ and $Z_{\mathrm{ratio},i}$ and their corresponding p-values. To avoid effects of abnormal trades, we omit data of the first five minutes (09:31-09:35) and the last ten minutes (16:01-16:10), so the number of samples for each day equals 77. This additional step of screening the data makes our estimates reflect intradaily dynamics of the two indices more homogeneously and efficiently. Note that the additional screening step only applies to $JV_i$ and the daily jump test statistics. For $RV_i$ and $BV_i$, we still keep the 80 samples each day. Figure S7 shows time series plots of RV, BV and $JV_{i,0.05}$ for the two indices. It can be seen that the log-type statistic identifies the most jump days.
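The daily quantities RV, BV and JV mentioned above can be illustrated with their textbook forms: realized variance as the sum of squared intraday returns, bipower variation as $\pi/2$ times the sum of products of adjacent absolute returns, and the jump variation as the positive part of their difference. This is a simplified sketch under those assumptions; it omits finite-sample corrections and the significance-level truncation implied by $JV_{i,0.05}$, and the function name and sample returns are hypothetical.

```python
import math


def realized_measures(returns):
    """Return (RV, BV, JV) for one day of intraday log returns.

    RV = sum of squared returns,
    BV = (pi / 2) * sum of |r_j| * |r_{j-1}|,
    JV = max(RV - BV, 0).
    """
    rv = sum(r * r for r in returns)
    bv = (math.pi / 2.0) * sum(abs(cur) * abs(prev)
                               for cur, prev in zip(returns[1:], returns[:-1]))
    jv = max(rv - bv, 0.0)
    return rv, bv, jv


# Hypothetical day: small diffusive returns with one large jump-like return.
day = [0.001] * 10 + [0.05] + [0.001] * 10
rv, bv, jv = realized_measures(day)
print(rv, bv, jv)
```

Because a single large return enters RV squared but enters BV only through products with its small neighbours, the difference RV minus BV isolates most of the jump contribution in this example.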

References

1. Huang X, Tauchen GE (2005) The relative contribution of jumps to total price variance. Journal of Financial Econometrics 3: 456-499.
2. Benjamini Y, Yekutieli D (2001) The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics 29: 1165-1188.
3. Kosorok MR, Ma S (2007) Marginal asymptotics for the "large p, small n" paradigm: With applications to microarray data. The Annals of Statistics 35: 1456-1486.
4. van der Vaart A, Wellner J (1996) Weak Convergence and Empirical Processes: With Applications to Statistics. New York: Springer-Verlag.
5. Lehmann EL (1966) Some concepts of dependence. The Annals of Mathematical Statistics 37: 1137-1153.
6. Karlin S, Rinott Y (1981) Total positivity properties of absolute value multinormal variables with applications to confidence interval estimates and related probabilistic inequalities. The Annals of Statistics 9: 1035-1049.
7. Sarkar SK (2002) Some results on false discovery rate in stepwise multiple testing procedures. The Annals of Statistics 30: 239-257.
8. Lee S, Linton O, Whang YJ (2009) Testing for stochastic monotonicity. Econometrica 77: 585-602.