Inference from Data Partitions


Ricardo Bórquez and Melvin Hinich

May 18, 2010

1 Introduction

Consider a stationary process $X = \{X_t, t = 1, 2, \ldots, T\}$ defined on $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}, P)$, where $\mathcal{F}$ denotes the Borel sets, $\{\mathcal{F}_t\}$ is a filtration and $P$ is a probability measure which is assumed to be absolutely continuous with respect to the Lebesgue measure. For this process we are interested in testing for any form of time-dependence (linear or nonlinear). Usually, we would expect a researcher to run some convenient test over the whole sample in order to infer the kind of dependence that is present in the data. For instance, the Box and Jenkins modelling strategy for ARMA models requires inference over the whole sample, using the autocorrelation and partial autocorrelation functions, in order to select the appropriate model, and it also requires an inference procedure over the whole sample when selecting a parsimonious model using information criteria such as that of Akaike. However, in these examples as well as in several other statistical settings it is implicit that the result of the test does not depend on how we partition the data; otherwise the inference made without this additional information is generally invalid.

In this study, we propose a test of whether a specific way of partitioning the data sample is informative. The test is based on finding evidence of transient dependency (i.e., an unstable structure of dependence in the data). This problem can be restated to apply whenever the data are given some order, not necessarily in time (e.g., the order of individuals in a cross-section), although our examples are taken exclusively from the time-series context, where the problem is more evident.

To be precise, let $\{X^{(i)}, i = 1, 2, \ldots, k\}$ be an arbitrary collection of subsamples of $X$ with elements of length $N_i$ and such that $\sum_i N_i = T$. Define a sufficient statistic $S \in \mathbb{R}^d$ to be used in the inference procedure, for which we know its limit distribution $Q$, and denote by $S(X^{(i)})$ the statistic evaluated at each $X^{(i)}$; $S(X)$ corresponds to the same statistic evaluated at the entire sample.

The problem of transient dependency can be stated as follows. Let $T: \mathbb{R}^{k+d} \to \mathbb{R}^d$ be a mapping over the sets $\{S(X^{(i)}) \in A_i, i = 1, 2, \ldots, k\}$, $k > 2$, and denote the corresponding composition as $T(X) = \{S(X^{(i)}) \in A_i, i = 1, 2, \ldots, k\} \circ T$. If the partition of the data is informative and if this information is summarized in the parameter $\phi$, then we can build a similar test based on $T(X)$ with a similar region of size $\alpha$ (the size of the test that is based on $S(X)$). That is, we attempt to have (Bartlett, 1937):

$P(S(X) \in A; \phi) = \alpha \quad \forall \phi \in \Phi$

where $\phi$ is a nuisance parameter. But if $T(X)$ is a sufficient statistic for $\phi$ then under the null hypothesis the conditional distribution $P(S(X^{(i)}) \in A_i \mid T(X) \in A)$ (with $A = \bigcup_i A_i$) will not depend on the parameter $\phi$ for $i = 1, 2, \ldots, k$. Thus, evidence supporting transient dependence can be found by rejecting the null hypothesis for some subsample based on this conditional test. We show below that this property is satisfied by the family of union-intersection tests.

We first need to define formally the relationship between an inference based on $S(X)$ and one based on $T(X)$ that holds under the null hypothesis.

Proposition 1. For $S(X)$ to provide the same inference as $T(X)$ it is necessary that $P[S(X) \in A] \to_n Q$, where $A \in \mathcal{F}$ is a Borel set such that $Q(\partial A) = 0$ ($\partial A$ is the boundary of $A$) and $Q$ is proportional to the limit distribution of $T(X)$.

Proof.
If $P[S(X) \in A] \to_n Q$ then there exists a collection of disjoint sets $A_i$ for $i = 1, 2, \ldots, k$ forming a partition of $A = \bigcup_i A_i$ such that $P[T(X) \in A] \to_n \lambda Q$, where $\lambda = \lambda(X) > 0$ is a constant, but this is not possible because the latter distribution is tight on $\mathbb{R}^d$ for $d \geq 1$ and the finite-dimensional distributions form a convergence-determining class on that space (Billingsley, 1999).

Thus, when the null hypothesis is true the partition of the sample data $\{X^{(i)}, i = 1, 2, \ldots, k\}$ is not informative, and knowing it does not change the inference that we can make through the statistic $S(X)$.

We operationalize this proposition as follows. For simplicity assume that $S, T \in \mathbb{R}^1$. The null hypothesis of the test of interest is described as the intersection of complementary events $\bigcap_i [S(X^{(i)}) \leq c]$ for some $c$. A level $\alpha$ test can be formed through the union-intersection approach with the maximum order statistic, with the rejection region defined as

$[S(X^{(i)}) > c \text{ for some } i = 1, \ldots, k] = [\max_i S(X^{(i)}) > c]$

where $c = c(\alpha, n)$. To see this, note that the event $[S(X^{(i)}) > c \text{ for some } i = 1, \ldots, k]$ is the union $\bigcup_i [S(X^{(i)}) > c]$ and that $P(\bigcup_i [S(X^{(i)}) > c]) = \alpha$ (it is only required that $S$ provides a size $\alpha$ test). We then show that the union-intersection test can be used to answer the question of whether a partition of the data sample is informative.

Proposition 2. Under the null hypothesis, $T(X) = \max_i S(X^{(i)})$ is a sufficient statistic for $\phi$.

Proof. For union-intersection tests we only need to note that

$P(S(X) > c \mid \max_i S(X^{(i)}) > c; \phi) = P(S(X) > c \mid \bigcup_i \{S(X^{(i)}) > c\}; \phi) = 1$

which does not depend on $\phi$.

Then, we identify $\lambda Q$ in Proposition 1 with the limit distribution of $T(X) = \max_i S(X^{(i)})$ and

$1/\lambda = P(S(X^{(i)}) > c \mid \max_i S(X^{(i)}) > c) < 1$ for $i = 1, 2, \ldots, k$.

A well-known convergence-to-types result due to Gnedenko (1943) shows that the limit distribution of the maximum order statistic for a sequence of independent, identically distributed random variables exists and is one of three types, depending on the support of the distribution. Extensions of this result that allow for dependency in the data, in either discrete or continuous time, are available (e.g. Watson, 1954; Welsch, 1971; Durrett and Resnick, 1978), and there are also results for stationary processes and some forms of weak dependency (e.g. Berman, 1964; Leadbetter, 1974; Adler, 1978). It is clear that, depending on the particular context, a suitable result for the limit distribution of the maxima is often available, and this is enough for our purposes.

2 Testing for Transient Dependence

In this section, we apply the previous results to derive a method for testing transient dependence when $S(X)$ is a centered chi-squared variable. An example is now provided in the context of the Hinich (1996) test for nonlinearity. Consider a zero-mean second-order stationary process for which we are interested in finding significant elements of the third-order cumulants. These are moments of the form $C(r, s) = E(X_t X_{t+r} X_{t+s})$, and their sample counterparts are referred to as bicorrelations. A stochastic process can show non-zero bicorrelations and still have a white noise representation, which turns out to be a convenient specification for describing time-dependence in many applications. The Hinich (1996) test is a test of the null hypothesis of a pure white noise process (i.e., a white noise process with independent innovations) against a process having many significant bicorrelations.

As usual, the test relies on assumptions about the stability of the dependence structure in the sample. But note that such stability may be unlikely if the sample covers a relatively long period of time, which is commonly the case in time-series applications. Motivated by this fact, Hinich and Patterson (2005) studied transient dependence in white noise. Using financial data they found that periods of time dependence alternate with periods of independence, a result that can have implications regarding the efficiency of financial markets. From a statistical point of view, that result can also have implications for the forecasting ability of linear time-series models. In their setting, Hinich and Patterson (2005) applied the test separately over data grouped in consecutive window frames of fixed but rather short length.
Thus, a penalty in the size and power of the test is expected for that procedure because of the limited information contained in a single window, even when the test is applied to consecutive or overlapping windows.

Alternatively, we can use a union-intersection approach to control the size of the test and increase its power. In particular, let $X = \{X_t, t = 1, 2, \ldots, T\}$ be a sequence of linearly filtered data where $E X_t = 0$ and $E X_t^2 = 1$ for all $t \leq T$. The testing procedure employs non-overlapped data windows; thus, if $N$ is the window length, then $[X(t_{i+1}), X(t_{i+1} + 1), \ldots, X(t_{i+1} + N - 1)]$ is the $i$-th window, where $X_t = X(t)$, $i = 1, 2, \ldots, k$ and $t = 1, 2, \ldots, T$. The next non-overlapped window simply considers $t_{i+1} = t_i + N$. Define the statistic

$H_i = \sum_{s=2}^{L} \sum_{r=1}^{s-1} G_i^2(r, s)$, where $G_i(r, s) = (N - s)^{-1/2} \sum_t X_t X_{t+r} X_{t+s}$ for $0 < r < s$,

which is indexed to the window $i$. The $H_i$ statistic is distributed chi-squared with $(L - 1)(L/2)$ degrees of freedom for a test of size $\alpha$. $L$ is the number of lags that enter the window, and it is determined endogenously as $L = N^b$ with $0 < b < 0.5$ (recommended to maximize the power of the test).

Under the null hypothesis $\{H_i, i = 1, 2, \ldots, k\}$ is a collection of independent and identically distributed random variables, so we can characterize this hypothesis as $\bigcap_i \{H_i \leq c\}$ and its probability as $P(\bigcap_i \{H_i \leq c\}) = P(H_1 \leq c)^k$ when all the windows have the same length (or in general as $\prod_i P(H_i \leq c_i)$, where $c_i = c(\alpha, N_i)$ and $i = 1, 2, \ldots, k$). The rejection region for the union-intersection test is given by $\{\max_i H_i > c\}$.

Proposition 3. $P(H_1 \leq b_k u)^k \to_k \exp(-u^{-\gamma})$, where $\gamma = \gamma(\alpha, k) > 0$ and $b_k$ is a normalizing constant such that $1 - P(H_1 \leq b_k) = 1/n$.

Proof. In order to apply Proposition 3 to the $H$ statistic, we have to show that we can write $1 - P(H_1 \leq u) = u^{-\delta} h(u)$ for some $\delta > 0$ and slowly varying function $h(u)$. But it suffices to assume that $E u^{\delta} < \infty$ with $\delta = 1$ (so that the process $X$ is second-order stationary). The rest is a standard result and can be found in Ferguson (1996), p. 95. It is immediate that $\max_i H_i$ is distributed reverse Weibull with parameters $(\gamma, 1)$.
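The window-level statistic and its maximum can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, $L = N^b$ is truncated to an integer, the inner sum runs over the $N - s$ products available inside a window, trailing observations that do not fill a window are dropped, and the data are assumed to be already standardised to mean 0 and variance 1.

```python
import math


def split_windows(x, n):
    """Cut x into consecutive non-overlapping windows of length n.

    Trailing observations that do not fill a window are dropped
    (an assumption of this sketch).
    """
    k = len(x) // n
    return [x[i * n:(i + 1) * n] for i in range(k)]


def lag_count(n, b=0.4):
    """Number of lags L = N^b, truncated to an integer (assumed rounding)."""
    return int(n ** b)


def g_stat(window, r, s):
    """G(r, s) = (N - s)^(-1/2) * sum_t X_t X_{t+r} X_{t+s}, for 0 < r < s."""
    n = len(window)
    total = sum(window[t] * window[t + r] * window[t + s] for t in range(n - s))
    return total / math.sqrt(n - s)


def h_stat(window, b=0.4):
    """H = sum of G(r, s)^2 over 1 <= r < s <= L.

    The window is assumed to be standardised already (mean 0, variance 1).
    """
    lags = lag_count(len(window), b)
    return sum(
        g_stat(window, r, s) ** 2
        for s in range(2, lags + 1)
        for r in range(1, s)
    )


def max_h(x, n, b=0.4):
    """Union-intersection statistic: the largest window-level H."""
    return max(h_stat(w, b) for w in split_windows(x, n))
```

With these conventions the double sum contains $L(L-1)/2$ terms, which matches the $(L - 1)(L/2)$ degrees of freedom quoted in the text; the value `b=0.4` is only a placeholder inside the admissible range $0 < b < 0.5$.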
In order to have a similar proposition for windows of different lengths, one could apply the limit results for the maxima of arrays of independent random variables in Serfozo (1982), but the limiting distribution differs from that of Proposition 3.
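For finite $k$, the product form $\prod_i P(H_i \leq c_i)$ given earlier already suggests a simple way to calibrate the test without a limit theorem. The sketch below assumes that the windows are independent under the null and that every window receives the same tail probability; this equal split is an assumption of the example, not something the paper prescribes, and the function names are hypothetical. Each $c_i$ is then the upper quantile of window $i$'s own null distribution at that common level, so unequal window lengths are handled through the p-values.

```python
def per_window_level(alpha, k):
    """Tail probability given to each window so that, under independence,
    the probability that no window rejects equals 1 - alpha."""
    if not 0.0 < alpha < 1.0 or k < 1:
        raise ValueError("need 0 < alpha < 1 and k >= 1")
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


def union_intersection_reject(p_values, alpha):
    """Reject the overall null if any window is significant at the
    per-window level; p_values holds one null p-value per window."""
    level = per_window_level(alpha, len(p_values))
    return min(p_values) < level
```

Working with p-values rather than critical values keeps the sketch independent of the particular null distribution used for each window.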

Figure 1: Size of the test according to window length (N) and sample size (T)

2.1 Size and Power of the Test

In this section we provide evidence on the size and power of the union-intersection test of Section 2 through a Monte Carlo experiment. We generate pseudo-random numbers for the pure noise process from four alternative distributions: Gaussian, Student's t (with $v$ degrees of freedom), Uniform and Exponential. Both the size and power of the test vary in a complex manner according to the sample size $T$ and the window length $N$, which is controlled by adequately choosing the value of the parameter $\gamma$. Our Monte Carlo results show that we can use $\gamma = k^{0.2}$ as a valid approximation for most empirical applications.

2.1.1 Size

To estimate the size of the test we counted the number of times that the null hypothesis was erroneously rejected, running ten thousand replications in each case. The results are summarized in Figure 1. For a given window length the size of the test increases as the sample size increases, which is a standard result. The size also varies with the window length for a given sample size, although this is expected. The reason can be associated with the informational content of a single window frame relative to the whole sample, which differs according to the number of windows and the window length.

For a given size of the test and a given sample size, we can deduce from Figure 1 the window length that is consistent with the asymptotic theory. With $T = 1100$ observations and $\alpha = 0.1$ we should use a window length of $N = 90$ observations if Gaussian innovations are assumed, and a window length of $N = 80$ observations for Student's t innovations. Alternatively, by fixing the window length for a given sample size we can obtain the respective probability of a Type I error. For example, consider again $T = 1100$ and $N = 105$. In the case of Gaussian innovations the size of the test is approximately 0.08, but it is near 0.05 for Uniform innovations.
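The size experiment described above amounts to drawing pure-noise samples and counting false rejections. The following Python sketch shows that logic under stated assumptions: the function names and the `reject` callback are hypothetical, the innovations are rescaled to mean 0 and variance 1, and the default `v=5` is only a placeholder because the degrees of freedom used in the paper are not stated.

```python
import math
import random


def draw_innovations(dist, n, rng, v=5):
    """Draw n i.i.d. innovations, rescaled to mean 0 and variance 1."""
    if dist == "gaussian":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    if dist == "student":
        # t_v = Z / sqrt(chi2_v / v); its variance v / (v - 2) requires v > 2.
        scale = math.sqrt((v - 2) / v)
        return [
            scale * rng.gauss(0.0, 1.0)
            / math.sqrt(rng.gammavariate(v / 2.0, 2.0) / v)
            for _ in range(n)
        ]
    if dist == "uniform":
        half_width = math.sqrt(3.0)  # Uniform(-sqrt(3), sqrt(3)) has variance 1
        return [rng.uniform(-half_width, half_width) for _ in range(n)]
    if dist == "exponential":
        # Exp(1) has mean 1 and variance 1; shift it to mean 0.
        return [rng.expovariate(1.0) - 1.0 for _ in range(n)]
    raise ValueError("unknown distribution: " + dist)


def estimate_size(reject, dist, t, reps=10000, seed=0):
    """Share of replications in which `reject` rejects a true null.

    `reject` is a user-supplied decision rule that takes a sample of
    length t and returns True or False.
    """
    rng = random.Random(seed)
    rejections = sum(
        bool(reject(draw_innovations(dist, t, rng))) for _ in range(reps)
    )
    return rejections / reps
```

Any decision rule can be plugged in as `reject`, for example one that compares the maximum window statistic with a chosen critical value; fixing `seed` makes a run reproducible.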

Note that although the results differ across the four distributions for the innovations, such differences remain within the range of values that are commonly used in empirical work. Consequently, our results seem to be robust to the particular distribution that is assumed. In practice this means that low p-values should be considered strong evidence in favor of the alternative hypothesis, even if the distribution of the innovations is assumed to have fat tails.

2.1.2 Power

The power of the test is evaluated against two nonlinear models: a nonlinear moving average (NLMA) model and a bilinear (BL) model. The particular specification we use for the NLMA model is

$X_t = e_t + \beta e_{t-1} e_{t-2}$

where $e_t$ denotes a zero-mean innovation with variance equal to $\sigma^2$. This model allows the parameter $\beta$ to take any nonzero value, while the process is clearly not independent yet is white, which has many desirable properties for our study. Note that although there is no correlation between $X_t$ and $X_{t+r}$ for $r \neq 0$, the elements of the third-order cumulants of the process $\{X_t, t = 1, 2, \ldots, T\}$ can be different from zero. In fact, we have that $C(r, s) = \beta \sigma^4$, but there are only six of these elements for this particular process, which makes it very difficult to capture the underlying time-dependence structure with a nonparametric test.

On the other hand, the bilinear model can be thought of as a reduced form of some higher-order nonlinear moving average process and is therefore characterized by having several non-zero bicorrelations. These models have the property of approximating with arbitrary accuracy any model that can reasonably be represented by Volterra expansions, and consequently they have been proposed as natural nonlinear extensions of ARMA models (Tong, 1990; Granger and Andersen, 1978). For instance, a model of the form

$X_t = e_t + \beta X_{t-p} e_{t-q}$

is (second-order) stationary if $\beta \sigma^2 < 1$, and the series is generally white for $p \neq q$. In our study we use $p = 1$ and $q = 2$.
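The two alternatives can be simulated directly from their defining recursions. The sketch below is illustrative only: the function names are hypothetical, and pre-sample values of $e_t$ and $X_t$ are set to zero, which is an assumption of the example (in practice one would usually discard an initial burn-in stretch).

```python
def simulate_nlma(e, beta):
    """X_t = e_t + beta * e_{t-1} * e_{t-2}, pre-sample innovations set to 0."""
    x = []
    for t, e_t in enumerate(e):
        e_lag1 = e[t - 1] if t >= 1 else 0.0
        e_lag2 = e[t - 2] if t >= 2 else 0.0
        x.append(e_t + beta * e_lag1 * e_lag2)
    return x


def simulate_bilinear(e, beta, p=1, q=2):
    """X_t = e_t + beta * X_{t-p} * e_{t-q}, pre-sample values set to 0."""
    x = []
    for t, e_t in enumerate(e):
        x_lag = x[t - p] if t >= p else 0.0
        e_lag = e[t - q] if t >= q else 0.0
        x.append(e_t + beta * x_lag * e_lag)
    return x
```

Setting `beta` to zero returns the innovations unchanged, so the same functions also generate the pure noise case used for the size experiment.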
The results for the test are summarized in Exhibit 1 for each model and two alternative values of the parameters. We report the percentage of correct decisions using a size of 0.01. As usual, the power depends greatly on the values of the model parameters, and it becomes more difficult to reject the null hypothesis as their absolute value approaches zero.¹ The power of the test is higher as the number of windows is higher, which can be achieved by increasing the window length and/or the number of observations. This result differs across the four distributional alternatives for the innovations, being most sensitive in the case of the Uniform distribution. We also note that the power of the test is higher for the bilinear case than for the NLMA, which is expected because the number of possibly nonzero bicorrelations is higher in the former model.

¹ If the parameter is exactly zero then the process reduces to a pure white noise.

2.2 An Empirical Example

We return to the problem stated in Hinich and Patterson (2005).

References

[1] Adler (1978)
[2] Bartlett (1937)
[3] Berman (1964)
[4] Billingsley (1999)
[5] Durrett and Resnick (1978)
[6] Ferguson (1996)
[7] Gnedenko (1943)
[8] Granger and Andersen (1978)
[9] Hinich (1996)
[10] Hinich and Patterson (2005)
[11] Leadbetter (1974)
[12] Serfozo (1982)
[13] Tong (1990)
[14] Watson (1954)
[15] Welsch (1971)