AR-order estimation by testing sets using the Modified Information Criterion


AR-order estimation by testing sets using the Modified Information Criterion

Rudy Moddemeijer*

14th March 2006

Abstract

The Modified Information Criterion (MIC) is an Akaike-like criterion which allows performance control by means of a simple a priori defined parameter: the upper-bound on the error of the first kind (false alarm probability). The criterion MIC is for example used to estimate the order of Auto-Regressive (AR) processes. The criterion can only be used to test pairs of composite hypotheses; in AR-order estimation this leads to sequential testing. Usually the Akaike criterion is used to test sets of composite hypotheses. The difference between sequential and set testing corresponds with the difference between searching for the first local minimum and the global minimum of the Akaike criterion. We extend the criterion MIC to testing a composite null-hypothesis versus a set of composite alternative hypotheses; these alternative hypotheses form a sequence where every element introduces one additional parameter. The theory is verified by simulations and is compared with the Akaike criterion used in sequential and set testing. Due to the excellent correspondence between the theory and the experimental results we consider the AR-model order estimation problem for low-order AR-processes with Gaussian white noise as solved.

Key words: AIC, Akaike criterion, AR, autoregressive processes, composite hypothesis, maximum likelihood, model order, system identification, time series analysis.

1 Introduction

In a recent publication [1] (see also [2, 3]) we introduced the Modified Information Criterion (MIC 1)) and applied this criterion to Auto-Regressive (AR) model order estimation. In its original form the criterion MIC could only be applied to testing pairs of composite hypotheses [5, chapter 35]; later some extensions have been derived. The criterion MIC can be compared with the Akaike criterion (AIC) [6, 4, 7] and the Generalized Information Criterion (GIC) [8]. All criteria mentioned are used to test composite hypotheses [9]; a composite hypothesis is a hypothesis with some unknown parameters to be estimated.

* University of Groningen, Department of Computing Science, P.O. Box 800, NL-9700 AV Groningen, The Netherlands, rudy@cs.rug.nl
1) IC stands for information criterion and A is added so that similar statistics, BIC, DIC etc., may follow [4]. Similarly the M of modified is added.

  ML:   Maximum Likelihood
  MLL:  Mean Log-Likelihood (expected log-likelihood)
  MMLL: Maximum Mean Log-Likelihood; i.e. the maximum of the MLL-function
  ALL:  Average Log-Likelihood; an unbiased statistic to estimate the MLL
  MALL: Maximum Average Log-Likelihood; i.e. the ALL-function at the ML-estimate; a biased statistic to estimate the MMLL

Table 1: Frequently used abbreviations

In AR-model order estimation there is an essential difference between the application of MIC versus AIC or GIC. The models with different AR-order are called the hypotheses H_I; the index I of a hypothesis corresponds to the AR-model order. There exist two essentially different approaches with respect to testing composite hypotheses applied to AR-model order estimation: sequential testing and set testing.

In sequential testing we test an I-th order AR-model versus a J-th order AR-model, where J = I + 1, for increasing I until the I-th order model is preferred above the J-th order model. The last value of I is considered to be an estimate of the correct AR-model order M. For sequential testing a reliable method to test pairs of composite hypotheses is needed. In case of the Akaike criterion or GIC the method of order selection is not described, although this is essential for the result. Most researchers use set testing: they compute the criterion AIC or GIC for all orders up to a certain maximum candidate order L and use the order with the smallest AIC or GIC as an estimate of the correct order. Using sequential testing the estimated AR-model order corresponds to the first local minimum of the criterion; set testing selects the AR-order corresponding with the absolute minimum of the criterion as a function of the model order given L. A minimal sketch contrasting the two selection rules is given at the end of this section. In a recent publication [10] we showed that significantly better results can be achieved by using sequential testing instead of set testing in case of AR model selection using AIC or GIC.

The criterion MIC has been designed to test pairs of composite hypotheses, so it can only be used in sequential testing. The introduction of the criterion MIC made it possible to estimate the AR-order with a preselected maximum probability α of selecting a model with a too high AR-order. Reading referee reports on our recent publications indicates that the signal processing community is reluctant to accept sequential testing. Is it possible to apply the criterion MIC in case of set testing? In this publication we present such an extension to the criterion MIC.
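The distinction between the two strategies can be made concrete with a small sketch. The following Python fragment is only illustrative (the array of criterion values and the function names are ours, not part of the paper); it selects the order at the first local minimum (sequential testing) and at the global minimum (set testing) of a generic order-selection criterion.

    import numpy as np

    def select_order_set(criterion):
        # set testing: the order with the smallest criterion value (global minimum)
        return int(np.argmin(criterion))

    def select_order_sequential(criterion):
        # sequential testing: the first local minimum, i.e. the first order I for which
        # the criterion does not improve when going from order I to order I + 1
        for I in range(len(criterion) - 1):
            if criterion[I] <= criterion[I + 1]:
                return I
        return len(criterion) - 1

    # hypothetical criterion values for orders 0..L (e.g. AIC or GIC)
    aic = np.array([1.30, 0.90, 0.95, 0.93, 0.80, 0.85])
    print(select_order_sequential(aic))   # -> 1 (first local minimum)
    print(select_order_set(aic))          # -> 4 (global minimum)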

  I  AR-order of the (null-)hypothesis
  J  AR-order of the alternative hypothesis
  K  size of the alternative set of hypotheses
  L  maximum candidate order
  M  correct AR-order
  N  number of samples

Table 2: Used symbols

2 Formulation of the problem

Assume the stochastic signal x, generated by an M-th order AR-model:

    x_n = ε_n + a_1 x_{n-1} + a_2 x_{n-2} + ... + a_M x_{n-M}    (1)

where the noise ε is stationary, white and normally distributed with zero mean and unit variance. The correct conditional probability density function (pdf) g(x_n | x_{n-1}, ..., x_{n-M}) of the stochastic process x is according to hypothesis H_I modeled by the conditional pdf of an I-th order AR-model

    f_x(x_n | x_{n-1}, ..., x_{n-I}; a_1, ..., a_I, σ) = f_ε(ε_n; σ)    (2)

where f_ε(ε_n; σ) is a normal distribution with variance σ². This AR-model has a parameter vector p_I = (a_1, a_2, ..., a_I, σ) which consists of dim p_I = I + 1 independently adjustable parameters.

Determine the AR-model order by selecting the best hypothesis H_I given a sequence of N observations x_1, x_2, ..., x_N. We aim to estimate the simplest model (lowest AR-order) which maximizes the Mean Log-Likelihood (MLL) E{l_n(p_I)}. The single-observation log-likelihood of hypothesis H_I given the parameter vector p_I is defined by

    l_n(p_I) = log f_x(x_n | x_{n-1}, ..., x_{n-I}; p_I)    (3)

The MLL is a function of the parameter vector p_I and is, due to the (information theoretical) information inequality [11, (2.82)], bounded by the conditional neg(ative)-entropy:

    E{l_n(p_I)} ≤ E{log g(x_n | x_{n-1}, ..., x_{n-M})} = -H{x_n | x_{n-1}, ..., x_{n-M}}    (4)

where the conditional entropy H{x_n | x_{n-1}, ..., x_{n-M}} corresponds with the entropy per sample of the stationary stochastic process x. Equality holds if and only if g(x_n | x_{n-1}, ..., x_{n-M}) = f_x(x_n | x_{n-1}, ..., x_{n-I}; p_I) and I ≥ M, i.e. if the correct conditional pdf can exactly be modeled. This can only be the case if p_I equals the optimal parameter vector; so searching for the vector p_I for which the MLL reaches its maximum, the Maximum Mean Log-Likelihood (MMLL), searches for the optimal fit of f_x to g. The method of Maximum Likelihood (ML) is based on this principle.

Theoretically the MMLL is a non-decreasing (increasing or flat) function of I: adding superfluous parameters never decreases the MMLL, so the MMLL is flat for I ≥ M.
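As an aside, the generating model (1) is straightforward to simulate, which is also how the experiments of section 5 can be reproduced in spirit. The following Python sketch is only an illustration under the stated model assumptions (unit-variance Gaussian noise, zero start-up values, a hypothetical coefficient vector); it is not the authors' original simulation code.

    import numpy as np

    def simulate_ar(a, N, seed=0):
        # generate N samples of x_n = eps_n + a_1 x_{n-1} + ... + a_M x_{n-M}, cf. (1),
        # with eps stationary white Gaussian noise of unit variance and zero start-up values
        a = np.asarray(a, dtype=float)
        M = len(a)
        rng = np.random.default_rng(seed)
        eps = rng.standard_normal(N)
        x = np.zeros(N + M)
        for n in range(M, N + M):
            past = x[n - M:n][::-1]           # x_{n-1}, ..., x_{n-M}
            x[n] = eps[n - M] + a @ past
        return x[M:]

    x0 = simulate_ar([], 1000)                # the 0-th order process x_n = eps_n of table 4
    x2 = simulate_ar([0.5, 0.2], 1000)        # a hypothetical stable second-order example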

Testing two composite hypotheses, the null-hypothesis H_I and the alternative hypothesis H_J, we distinguish two essentially different test situations: the critical test and the non-critical test.

In case of a critical test we are testing a correct null-hypothesis (I ≥ M) versus an alternative hypothesis (J > I) with (more) superfluous parameters. Both hypotheses are equally likely and consequently have an equal MMLL, so the null-hypothesis is the simplest and therefore the best choice. In AR-model order estimation the tests between hypotheses become critical if both hypotheses model the AR-process sufficiently (J > I ≥ M). The non-critical test occurs if the alternative hypothesis is significantly more likely than the null-hypothesis (I < M and J > I); consequently the MMLL's differ. Because the order estimation algorithm always performs better in the non-critical case than in the critical case, the non-critical case is irrelevant for the development of an order estimation algorithm.

All criteria (AIC, GIC and MIC) are based on a statistic, the Maximum Average Log-Likelihood (MALL), to estimate the MMLL. First we introduce the Average Log-Likelihood (ALL), being an unbiased statistic to estimate the MLL [1]:

    Ê{l_n(p_I)} = (1/N) Σ_{n=1}^{N} log f_x(x_n | x_{n-1}, ..., x_{n-I}; p_I)    (5)

where Ê{...} denotes a statistic to estimate the mean. This ALL-function corresponds with a normalized log-likelihood function as in the method of ML [12, chapter 7][13, section 11.5]. Analogous to the average being an unbiased statistic to estimate the mean, the ALL is for every value of p_I an unbiased statistic to estimate the MLL. The MMLL can be estimated by replacing p_I in the ALL-function by the ML-estimate p̂_I of p_I; the resulting statistic is the MALL. This MALL is biased with respect to the MMLL due to the finite number of observations [1]:

    E{Ê{l(p̂_I)}} ≈ E{l(p_I)} + (dim p_I)/(2N)    (6)

where dim p_I is the dimension, i.e. the number of elements, of p_I.

2.1 Set testing using GIC

The Akaike criterion [6, 4, 7] is a member of a family of criteria, the Generalized Information Criteria (GIC) [8]:

    GIC(λ_K) = -Ê{l(p̂_I)} + λ_K (dim p̂_I)/N    (7)

Different values for λ_K are used: no correction (λ_K = 0), Bhansali (λ_K = 1/2) [14], Akaike (λ_K = 1) [6], Broersen (λ_K = 3/2) [15] and Åström (λ_K = 2) [16]. Even considerably larger values of λ_K, even depending on N, have been used [17, 18]. The estimated AR-order is the order for which GIC reaches a minimum. GIC can be used for both set testing and sequential testing; a small numerical sketch of the MALL and of GIC follows below.
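To make (5)-(7) concrete, the following Python sketch computes the MALL of a Gaussian AR(I) model fitted by conditional least squares (which coincides with ML for the AR coefficients under the Gaussian assumption, ignoring end effects of the first I samples) and the corresponding GIC value. It is a minimal illustration under these assumptions, not the implementation used for the experiments, and the function names are ours.

    import numpy as np

    def mall(x, I):
        # Maximum Average Log-Likelihood E^{l(p^_I)} of an AR(I) model with Gaussian
        # residuals, fitted by conditional least squares on the last N - I samples
        x = np.asarray(x, dtype=float)
        N = len(x)
        if I == 0:
            resid = x
        else:
            # regression matrix: the row for x_n contains x_{n-1}, ..., x_{n-I}
            X = np.column_stack([x[I - k - 1:N - k - 1] for k in range(I)])
            y = x[I:]
            a, *_ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ a
        sigma2 = np.mean(resid ** 2)                       # ML estimate of sigma^2
        return -0.5 * np.log(2 * np.pi * sigma2) - 0.5     # average log-likelihood at the ML fit

    def gic(x, I, lam):
        # GIC(lambda) = -MALL + lambda * dim(p_I)/N with dim(p_I) = I + 1, following (7)
        return -mall(x, I) + lam * (I + 1) / len(x)

    # hypothetical usage: criterion values for candidate orders 0..L
    # x = simulate_ar([0.5, 0.2], 1000)           # from the earlier sketch
    # values = [gic(x, I, lam=1.0) for I in range(0, 7)]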

2.2 Sequential testing using MIC

The criterion MIC [1] is designed to test a pair of composite hypotheses with a test which can be compared with a likelihood-ratio test. The test statistic is the difference in Maximum Average Log-Likelihood (ΔMALL 2)):

    C_{I,J} = Ê{l(p̂_J)} - Ê{l(p̂_I)}    (8)

Notice that C_{I,J} = C_{I,k} + C_{k,J} for any k. The statistic 2N C_{I,J} is in case of a critical test approximately chi-squared distributed with (dim p̂_J - dim p̂_I) = (J - I) degrees of freedom [1, 19, 20]. Consequently:

    E{C_{I,J}} = BIAS{C_{I,J}} = (J - I)/(2N),    VAR{C_{I,J}} = (J - I)/(2N²)    (9)

The expectation equals the bias because the difference in Maximum Mean Log-Likelihood (ΔMMLL) is zero in case of a critical test. In the non-critical case the statistic is, due to the central limit theorem, normally distributed with expectation ΔMMLL. The range of the outcomes of the statistic will, depending on a threshold η_high, be divided into two intervals:

    C_{I,J} < η_high    accept H_I
    C_{I,J} ≥ η_high    accept H_J    (10)

The constant η_high depends on the probability α of erroneously selecting the alternative hypothesis H_J (the too high order) and can be solved from the equation:

    α = ∫_{2Nη_high}^{∞} 1/(Γ(k) 2^k) x^{k-1} e^{-x/2} dx    where k = (J - I)/2    (11)

For J = I + 1 the constant η_high = λ_1/N, where λ_1 equals λ_K of (7) with K = 1, is shown in table 3. The criterion MIC, testing the I-th order AR-model versus the J-th order AR-model where J = I + 1, is equivalent to using GIC(λ_1) to test a pair of hypotheses.

Table 3: The λ_K = Nη_{J,J+1,K} as a function of the size K of the alternative set of hypotheses (columns K = 1, ..., 5) and the false alarm probability α (rows for α from 1% up to 20% and for the Akaike case). [Table entries omitted.]

2) In related work we used the term MALL-ratio instead. That term does not correctly represent the order in which the operations are performed: take the logarithm, take the average, find the maximum and, instead of dividing the likelihoods, subtract the log-likelihoods.
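For a single alternative hypothesis the threshold of (11) is just a chi-squared quantile, so it can be computed directly. The sketch below (Python with SciPy, assumed available; not part of the original paper) computes λ_1 = Nη_high for a given false alarm probability α and also recovers the correspondence with the Akaike penalty λ = 1, which gives α ≈ 15.7% as quoted in section 5.

    from scipy.stats import chi2

    def lambda1(alpha):
        # solve alpha = P(chi2_1 > 2*lambda_1), i.e. (11) with J - I = 1
        return 0.5 * chi2.ppf(1.0 - alpha, df=1)

    print(lambda1(0.05))                  # threshold lambda_1 for alpha = 5%
    print(1.0 - chi2.cdf(2.0, df=1))      # alpha of the Akaike choice lambda = 1: ~0.157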

3 Testing hypotheses

Originally the theory with respect to MIC was developed for testing a composite null-hypothesis versus a composite alternative hypothesis [1]. In a previous publication [21] we have extended the theory to testing a composite null-hypothesis versus an alternative set of composite hypotheses where all these hypotheses introduce one additional parameter. Now we will extend the theory to testing a composite null-hypothesis versus an alternative set of composite hypotheses where these hypotheses introduce additional parameters incrementally. The null-hypothesis consists of an I-th order model and the alternative set of hypotheses consists of K = L - I models with order I + 1, I + 2, ..., L.

Now we generalize the theory of testing two composite hypotheses using MIC (see subsection 2.2) to testing a composite null-hypothesis versus an alternative set of composite hypotheses. We accept the null-hypothesis if, similar to (10), the following conditions are fulfilled; otherwise we accept the alternative set of hypotheses:

    C_{I,I+1} < η_{I,I+1,K}  and
    C_{I,I+2} < η_{I,I+2,K}  and
    ...
    C_{I,L}   < η_{I,L,K}    (12)

There are K = L - I different thresholds η_{I,J,K} to be determined, where K is the number of elements in the alternative set of hypotheses. To reduce the number of degrees of freedom we choose a relation between the constants η_{I,J,K}. To make the test procedure more Akaike-like, we have chosen the relation:

    η_{I,J,K} = λ_K (J - I)/N    (13)

Alternative choices are possible. Now we possess the framework for testing the null-hypothesis consisting of an I-th order model versus an alternative set of hypotheses consisting of the (I+1)-st, (I+2)-nd, ..., L-th order models. Performing these tests for increasing I until the null-hypothesis is selected leads to an algorithm to estimate the AR-order. An example of a Pascal implementation is:

    I := 0;
    REPEAT
      stop := true;
      { accept order I only if no higher order J <= L improves the MALL by more
        than the threshold eta(I,J,K) = lambda(K)*(J-I)/N of (13) }
      FOR J := I + 1 TO L DO
        stop := stop AND (MALL[J] - MALL[I] <= lambda[L - I] * (J - I) / N);
      IF NOT stop THEN
        I := I + 1;
    UNTIL stop;

where MALL[J] corresponds with Ê{l(p̂_J)} and lambda[L-I] corresponds with λ_{L-I}. The selected model order remains in the variable I.

Suppose we want to perform the tests of (12) by using GIC or AIC. We select the null-hypothesis if the I-th order model has the minimum GIC and we select the alternative hypothesis if one of the models with order I + 1, I + 2, ..., L has the minimum GIC. Every condition in (12) corresponds with comparing the GIC, as defined in (7), of the I-th order model with the GIC of the J-th order model:

    -Ê{l(p̂_J)} + λ_K (dim p̂_J)/N < -Ê{l(p̂_I)} + λ_K (dim p̂_I)/N    (14)

This inequality is, due to dim p_I = I + 1 and (8), equivalent to:

    C_{I,J} > λ_K (J - I)/N = η_{I,J,K}    (15)

This inequality justifies our choice for the relation (13). It also suggests that the criteria GIC and AIC should depend on K, which is, due to K = L - I, a function of the maximum candidate order L. In the next section we will derive λ_K given an a priori determined probability of selecting a too high order (false alarm probability).

4 Computation of the threshold

In the method described in the previous section there remains a threshold λ_K to be determined. This threshold will be determined such that the probability α of selecting a too high order has a preselected value. In case of a critical test we are testing a correct null-hypothesis versus a superfluous alternative hypothesis. Both hypotheses are equally likely, so the null-hypothesis is the best choice. The test statistic 2N C_{I,J} is chi-squared distributed with J - I (assume J > I) degrees of freedom [19, 20, 1].

In the relevant critical case we assume that all C_{I,I+1} for different I are independent amongst each other. At this moment we cannot prove this assumption. The assumption seems reasonable because the sum 2N C_{I,J} = 2N C_{I,I+1} + 2N C_{I+1,I+2} + ... + 2N C_{J-1,J} must be chi-squared distributed with J - I degrees of freedom; this condition is satisfied if all 2N C_{I,I+1} are chi-squared distributed with one degree of freedom and are independent amongst each other. The thresholds based on this assumption are verified by simulations; the simulation results match the theory perfectly.

We will show how to compute the λ_K for K = 1, 2 and 3; the reader may extend the theory for K > 3. In case of a critical test and K = 1 the alternative hypothesis will, according to (11), erroneously be selected with a probability

    α = ∫_{2λ_1}^{∞} f_χ(c_1) dc_1    (16)

where f_χ is a chi-squared distribution with one degree of freedom, where c_1 = 2N C_{I,I+1} and where λ_K = Nη_{J,J+1,K}. Searching λ_1 for a preselected value of α such that (16) holds provides us with the threshold for K = 1 (see table 3).

In case of K = 2 we have an alternative set of two hypotheses. Now two chi-squared distributed variables c_1 = 2N C_{I,I+1} and c_2 = 2N C_{I+1,I+2}, where c_1 + c_2 = 2N C_{I,I+2}, play a role. According to (12) the alternative set of hypotheses will be selected if one of the inequalities

    c_1 > 2λ_2    or    c_1 + c_2 > 4λ_2    (17)

is satisfied. The probability of erroneously selecting the alternative hypothesis can be computed by integration of the product of chi-squared distributions f_χ(c_1) f_χ(c_2) over the area defined by (17):

    α = ∫_{2λ_2}^{∞} f_χ(c_1) dc_1 + ∫_{0}^{2λ_2} ∫_{4λ_2 - c_1}^{∞} f_χ(c_1) f_χ(c_2) dc_2 dc_1    (18)

Solving λ_2 from this equation provides us with λ_2 given α for K = 2 (see also table 3).
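Solving (18) can be done with standard numerical tools. The following Python sketch (using SciPy, assumed available; our own illustration, not the computation used for table 3) evaluates the right-hand side of (18) for a trial λ_2 and then solves for λ_2 with a root finder.

    from scipy.stats import chi2
    from scipy.integrate import quad
    from scipy.optimize import brentq

    def alpha_K2(lam):
        # right-hand side of (18): false alarm probability for K = 2 and threshold lam
        term1 = chi2.sf(2 * lam, df=1)                             # P(c1 > 2*lam)
        inner = lambda c1: chi2.pdf(c1, df=1) * chi2.sf(4 * lam - c1, df=1)
        term2, _ = quad(inner, 0, 2 * lam)                         # P(c1 <= 2*lam, c1 + c2 > 4*lam)
        return term1 + term2

    def lambda2(alpha):
        # alpha_K2 decreases monotonically in lam, so a bracketing root finder applies
        return brentq(lambda lam: alpha_K2(lam) - alpha, 1e-6, 50.0)

    print(lambda2(0.05))    # threshold lambda_2 for a 5% false alarm probability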

For K = 3 the set of conditions is:

    c_1 > 2λ_3    or    c_1 + c_2 > 4λ_3    or    c_1 + c_2 + c_3 > 6λ_3    (19)

The corresponding equation is:

    α = ∫_{2λ_3}^{∞} f_χ(c_1) dc_1
        + ∫_{0}^{2λ_3} ∫_{4λ_3 - c_1}^{∞} f_χ(c_1) f_χ(c_2) dc_2 dc_1
        + ∫_{0}^{2λ_3} ∫_{0}^{4λ_3 - c_1} ∫_{6λ_3 - c_1 - c_2}^{∞} f_χ(c_1) f_χ(c_2) f_χ(c_3) dc_3 dc_2 dc_1    (20)

Generalization to higher values of K is in theory trivial. Computation or approximation of the threshold λ_K for K > 5 is, however, an unsolved numerical problem. We have tried to solve this problem using Mathematica; for K > 5 the integration method becomes too inaccurate or the computational effort becomes too large. For the missing values we have assumed λ_K ≈ λ_5; for large K this is a reasonable assumption (see table 3).
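Since the multiple integrals become impractical for larger K, a Monte Carlo alternative suggests itself: under the independence assumption of this section the c_i are i.i.d. chi-squared variables with one degree of freedom, so the false alarm probability for any K can be estimated by simulation and λ_K approximated by bisection. The sketch below is our own illustration of this idea, not the method used in the paper, and its accuracy is limited by the number of Monte Carlo draws.

    import numpy as np

    def alpha_mc(lam, K, n_draws=200_000, seed=1):
        # Monte Carlo estimate of the false alarm probability for threshold lam and
        # alternative set size K, assuming c_1, ..., c_K i.i.d. chi-squared(1)
        rng = np.random.default_rng(seed)
        c = rng.chisquare(df=1, size=(n_draws, K))
        partial = np.cumsum(c, axis=1)                      # c1, c1+c2, ..., c1+...+cK
        bounds = 2 * lam * np.arange(1, K + 1)              # 2*lam, 4*lam, ..., 2K*lam
        return np.mean(np.any(partial > bounds, axis=1))    # any condition of (12) violated

    def lambda_mc(alpha, K, lo=0.1, hi=20.0, iters=40):
        # bisection on the (decreasing) Monte Carlo alpha to approximate lambda_K
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if alpha_mc(mid, K) > alpha else (lo, mid)
        return 0.5 * (lo + hi)

    print(lambda_mc(0.05, K=8))   # approximate threshold for K = 8 and alpha = 5%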

5 Simulations

Within the scope of this article we compare our estimated AR-orders with the estimated AR-orders using the Akaike criterion in case of set testing and sequential testing. For a more extended comparison of MIC with AIC and GIC see our earlier work [1, 10, 21]. The simulation results in table 4 to table 7 confirm our earlier results. Sequential testing applied to AIC performs significantly better than set testing [10]. In case of sequential testing the Akaike criterion has a theoretical probability of 15.7% [1] of estimating a too high order. Notice that sequential testing using AIC in an AR-order estimation context is equivalent to sequential testing using MIC where α = 15.7%.

Of course our method should select a 0-th order model in case of no model, x_n = ε_n. The estimated number of 0-th order models in table 4 corresponds with the preselected value of α. Only for L = 10 and α = 20% some deviations are observed; these deviations are mainly caused by the practical approximation λ_K ≈ λ_5 for K > 5. The Akaike criterion in case of set testing depends strongly on the maximum candidate order L; this problem disappears using sequential testing. Notice how accurately the theoretical probability α = 15.7%, in theory 8430 correct 0-th order identifications, is reproduced.

The model x_n = 0.55 x_{n-1} + … x_{n-2} + ε_n is a typical example of a second order model which can (for small N easily) be misidentified as a first order model (see table 5). For N = 100 or N = 1000 the 1st order model is erroneously preferred above the 2nd order model; for the given number of samples N these models are indistinguishable. Our claim with respect to the probability of estimating a too high order is never seriously violated. Notice the better performance of AIC using sequential testing instead of set testing. Our method performs better due to the performance control using different values of α.

The model x_n = 0.5 x_{n-1} + … x_{n-6} + ε_n is hard to identify as a sixth order model. Only for N = … the correct order has been estimated a reasonable number of times (see table 6). Because λ_K for K > 5 has never been used we expect no deviations. Our claim with respect to the probability of estimating a too high order is never violated. As expected the estimated order increases with the number of samples N. AIC in a set testing context performs better because this method will in general overestimate the correct order, which is a benefit in this situation.

The last model, x_n = 0.5 x_{n-1} + … x_{n-4} + ε_n where a_2 = a_3 = 0, is typically a model where set testing could perform better than sequential testing (see table 7). For N ≤ 1000 the set testing strategy using AIC performs better than the sequential testing strategy; for N = … sequential testing performs better. Our method with α = 20% has for small N a similar performance as AIC using set testing; the overall performance of our method is in general better. Is for N = 1000 the 4th order model significantly better than the 1st order model? If these models are in 1731 cases more or less equivalent, sequential testing is to be preferred. The problem of indistinguishable models demands a better study. Due to the noticeable deviations we may conclude that for α = 20% it is important to develop techniques to compute λ_K for K > 5. Except for this known deficiency the method of order selection works satisfactorily.

6 Discussion

Although many researchers think otherwise, sequential testing is to be preferred above set testing. Theoretically the MMLL is a non-decreasing (increasing or flat) function of the order: adding superfluous parameters never decreases the MMLL. Consequently spurious local minima of the MALL can only be caused by statistical fluctuations. Notice that the bias is also an increasing function (6), so the bias correction is decreasing. Overcompensation for bias creates an absolute maximum of the bias-corrected MALL function; this corresponds with a minimum of GIC(λ) where λ > 1/2. The position of this absolute maximum highly depends on the chosen value of λ, i.e. on the overcompensation for bias.

What is the meaning of the first local maximum of the MALL? At the first local maximum the estimation error is of the same order of magnitude as the increase in MMLL between the I-th and (I+1)-st model, i.e. the improvement of the model is negligible with respect to the estimation noise. Consequently the models are indistinguishable. Only if the MMLL as a function of the order has a plateau is the strategy of selecting the first local maximum incorrect.

Broersen [15] states: "the influence of L is very important for all asymptotic criteria, and remains a nuisance for the finite sample criteria." The value of λ in case of set testing should depend on L. Assume we increase L by adding a superfluous parameter. There is a finite probability that the AR-order corresponding to this superfluous parameter will be selected. To keep the probability of selecting a too high order constant, the probability of selecting any other model containing superfluous parameters should decrease. Therefore the value of λ should increase, and it therefore depends on L. The dependence is shown in table 3 and derived in section 3.

7 Conclusions

In this publication, as in earlier work [1, 21], the correspondence between the theory and the simulations is so good that we consider the AR-model order estimation problem for low-order AR-processes with Gaussian white noise as solved. Earlier work [10] showed that sequential testing leads to excellent results and that set testing is insufficiently motivated. Except for AR-models where the maximum mean log-likelihood as a function of the order has a plateau, we prefer sequential testing. Criteria like AIC, GIC or MIC should depend on the application and should therefore depend on the testing strategy: sequential testing or set testing. The missing information with respect to the testing strategy is a nuisance for criteria like AIC and GIC.

In this publication we have presented an extension of the criterion MIC, which was designed for sequential testing, such that it can also be applied in a set testing strategy. Now we possess a theoretically well-founded criterion, MIC, which can be used in both sequential testing and set testing. This criterion allows in both situations performance control by means of the upper-bound on the probability of selecting a too high order (false alarm probability). For criteria like AIC and GIC using set testing the constant λ_K should depend on K and therefore also on the maximum candidate order L; this is a serious shortcoming of AIC and GIC.

At this moment, although in theory the values are known, the computation or approximation of the threshold λ_K for K > 5 is an unsolved numerical problem. This problem is a limitation of our method. Furthermore a theoretical foundation of the assumption that the C_{I,I+1}'s are independent is necessary.

References

[1] R. Moddemeijer, Testing composite hypotheses applied to AR order estimation; the Akaike-criterion revised, submitted to IEEE Transactions on Signal Processing.
[2] R. Moddemeijer, Testing composite hypotheses applied to AR order estimation; the Akaike-criterion revised, in Signal Processing Symposium (SPS 98), Leuven (B), Mar. 1998, IEEE Benelux Signal Processing Chapter.
[3] R. Moddemeijer, Testing composite hypotheses applied to AR order estimation; the Akaike-criterion revised, in Nineteenth Symposium on Information Theory in the Benelux, P. H. N. de With and M. van der Schaar-Mitrea, Eds., Veldhoven (NL), May 1998, Werkgemeenschap Informatie- en Communicatietheorie, Enschede (NL).
[4] H. Akaike, A new look at the statistical model identification, IEEE Trans. on Information Theory, vol. 19, no. 6, 1974.
[5] H. Cramér, Mathematical methods of statistics, Princeton Univ. Press, Princeton, 1946.
[6] H. Akaike, Information theory and an extension of the maximum likelihood principle, in Proc. 2nd Int. Symp. on Information Theory, P. N. Petrov and F. Csaki, Eds., Budapest (H), 1973, Akademia Kiado.

[7] Y. Sakamoto, Akaike information criterion statistics, Reidel Publ. Comp., Dordrecht (NL), 1986.
[8] P. M. T. Broersen and H. E. Wensink, On the penalty factor for autoregressive order selection in finite samples, IEEE Trans. on Signal Processing, vol. 44, no. 3, 1996.
[9] H. L. van Trees, Detection, estimation, and modulation theory, part I, John Wiley & Sons, Inc., New York, 1968.
[10] R. Moddemeijer, Application of information criteria to AR order estimation, resubmitted as Technical Paper for publication in IEEE Transactions on Automatic Control.
[11] T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley & Sons, Inc., New York, 1991.
[12] S. Brandt, Statistical and Computational Methods in Data Analysis, North-Holland Publ. Comp., Amsterdam (NL), 2nd edition.
[13] E. Kreyszig, Introductory Mathematical Statistics, John Wiley & Sons, Inc., New York.
[14] R. J. Bhansali, A Monte Carlo comparison of the regression method and the spectral methods of regression, Journal of the American Statistical Association, vol. 68, no. 343, 1973.
[15] P. M. T. Broersen, The ABC of autoregressive order selection criteria, in 11th IFAC Symp. on System Identification, SYSID 97, Kitakyushu, Fukuoka, Japan, July 1997, vol. 1, Society of Instrument and Control Engineers (SICE).
[16] K. J. Åström, Maximum likelihood and prediction error methods, Automatica, vol. 16, 1980.
[17] J. Rissanen, Modelling by shortest data description, Automatica, vol. 14, 1978.
[18] E. J. Hannan and B. G. Quinn, The determination of the order of an autoregression, J. R. Statist. Soc. Ser. B, vol. 41, 1979.
[19] S. S. Wilks, The large-sample distribution of the likelihood ratio for testing composite hypotheses, Ann. Math. Stat., vol. 9, 1938.
[20] V. K. Rohatgi, An introduction to probability theory and mathematical statistics, John Wiley & Sons, Inc., New York.
[21] R. Moddemeijer, An efficient algorithm for selecting optimal configurations of AR-coefficients, submitted for publication in IEEE Transactions on Signal Processing.

Table 4: The estimated AR-order in case of estimation of the 0-th order AR-process x_n = ε_n as a function of the number of samples N and the maximum candidate order L (L = 4, 6 and 10). We have estimated the order using the algorithm proposed in section 3 in case of α = 1%, 5% and 20%. The results are compared with the estimated order using the Akaike criterion using both set testing and sequential testing. [Table entries omitted.]

Table 5: As table 4 for the 2nd order AR-process x_n = 0.55 x_{n-1} + … x_{n-2} + ε_n. [Table entries omitted.]

Table 6: As table 4 for the 6th order AR-process x_n = 0.5 x_{n-1} + … x_{n-6} + ε_n. [Table entries omitted.]

Table 7: As table 4 for the 4th order AR-process x_n = 0.5 x_{n-1} + … x_{n-4} + ε_n, where a_2 = a_3 = 0. [Table entries omitted.]


Expressions for the covariance matrix of covariance data Expressions for the covariance matrix of covariance data Torsten Söderström Division of Systems and Control, Department of Information Technology, Uppsala University, P O Box 337, SE-7505 Uppsala, Sweden

More information

Exploring Granger Causality for Time series via Wald Test on Estimated Models with Guaranteed Stability

Exploring Granger Causality for Time series via Wald Test on Estimated Models with Guaranteed Stability Exploring Granger Causality for Time series via Wald Test on Estimated Models with Guaranteed Stability Nuntanut Raksasri Jitkomut Songsiri Department of Electrical Engineering, Faculty of Engineering,

More information

5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1)

5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1) 5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1) Assumption #A1: Our regression model does not lack of any further relevant exogenous variables beyond x 1i, x 2i,..., x Ki and

More information

Testing Goodness-of-Fit for Exponential Distribution Based on Cumulative Residual Entropy

Testing Goodness-of-Fit for Exponential Distribution Based on Cumulative Residual Entropy This article was downloaded by: [Ferdowsi University] On: 16 April 212, At: 4:53 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

A. Motivation To motivate the analysis of variance framework, we consider the following example.

A. Motivation To motivate the analysis of variance framework, we consider the following example. 9.07 ntroduction to Statistics for Brain and Cognitive Sciences Emery N. Brown Lecture 14: Analysis of Variance. Objectives Understand analysis of variance as a special case of the linear model. Understand

More information

5 Autoregressive-Moving-Average Modeling

5 Autoregressive-Moving-Average Modeling 5 Autoregressive-Moving-Average Modeling 5. Purpose. Autoregressive-moving-average (ARMA models are mathematical models of the persistence, or autocorrelation, in a time series. ARMA models are widely

More information

Univariate ARIMA Models

Univariate ARIMA Models Univariate ARIMA Models ARIMA Model Building Steps: Identification: Using graphs, statistics, ACFs and PACFs, transformations, etc. to achieve stationary and tentatively identify patterns and model components.

More information

Bootstrap for model selection: linear approximation of the optimism

Bootstrap for model selection: linear approximation of the optimism Bootstrap for model selection: linear approximation of the optimism G. Simon 1, A. Lendasse 2, M. Verleysen 1, Université catholique de Louvain 1 DICE - Place du Levant 3, B-1348 Louvain-la-Neuve, Belgium,

More information

Case of single exogenous (iv) variable (with single or multiple mediators) iv à med à dv. = β 0. iv i. med i + α 1

Case of single exogenous (iv) variable (with single or multiple mediators) iv à med à dv. = β 0. iv i. med i + α 1 Mediation Analysis: OLS vs. SUR vs. ISUR vs. 3SLS vs. SEM Note by Hubert Gatignon July 7, 2013, updated November 15, 2013, April 11, 2014, May 21, 2016 and August 10, 2016 In Chap. 11 of Statistical Analysis

More information

Automatic Autocorrelation and Spectral Analysis

Automatic Autocorrelation and Spectral Analysis Piet M.T. Broersen Automatic Autocorrelation and Spectral Analysis With 104 Figures Sprin ger 1 Introduction 1 1.1 Time Series Problems 1 2 Basic Concepts 11 2.1 Random Variables 11 2.2 Normal Distribution

More information

The Method of Finite Difference Regression

The Method of Finite Difference Regression Research Report 2014 Intel Science Talent Search This report received the prestigious Scientific Research Report Badge, awarded to only 264 students nationwide, from the 2014 INTEL Science Talent Search

More information

Model Selection by Sequentially Normalized Least Squares

Model Selection by Sequentially Normalized Least Squares Model Selection by Sequentially Normalized Least Squares Jorma Rissanen, Teemu Roos, Petri Myllymäki Helsinki Institute for Information Technology HIIT Abstract Model selection by the predictive least

More information

LATVIAN GDP: TIME SERIES FORECASTING USING VECTOR AUTO REGRESSION

LATVIAN GDP: TIME SERIES FORECASTING USING VECTOR AUTO REGRESSION LATVIAN GDP: TIME SERIES FORECASTING USING VECTOR AUTO REGRESSION BEZRUCKO Aleksandrs, (LV) Abstract: The target goal of this work is to develop a methodology of forecasting Latvian GDP using ARMA (AutoRegressive-Moving-Average)

More information

Design of Time Series Model for Road Accident Fatal Death in Tamilnadu

Design of Time Series Model for Road Accident Fatal Death in Tamilnadu Volume 109 No. 8 2016, 225-232 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu Design of Time Series Model for Road Accident Fatal Death in Tamilnadu

More information

Review Session: Econometrics - CLEFIN (20192)

Review Session: Econometrics - CLEFIN (20192) Review Session: Econometrics - CLEFIN (20192) Part II: Univariate time series analysis Daniele Bianchi March 20, 2013 Fundamentals Stationarity A time series is a sequence of random variables x t, t =

More information

Topic 4: Model Specifications

Topic 4: Model Specifications Topic 4: Model Specifications Advanced Econometrics (I) Dong Chen School of Economics, Peking University 1 Functional Forms 1.1 Redefining Variables Change the unit of measurement of the variables will

More information