Journal of Multivariate Analysis. Independence tests for continuous random variables based on the longest increasing subsequence

Journal of Multivariate Analysis 127 (2014)

Independence tests for continuous random variables based on the longest increasing subsequence

Jesús E. García, V. A. González-López

Department of Statistics, University of Campinas, Rua Sérgio Buarque de Holanda, 651, Campinas, São Paulo. CEP , Brazil

Corresponding author. E-mail addresses: jg@ime.unicamp.br (J.E. García), veronica@ime.unicamp.br (V. A. González-López).

Article history: Received 13 March 2013. Available online 3 March 2014.
AMS 2000 subject classifications: 62G10, 62G30.
Keywords: Longest increasing subsequence; Test for independence; Copula.

Abstract

We propose a new class of nonparametric tests for the hypothesis of independence between two continuous random variables $X$ and $Y$. Given a sample of size $n$, let $\pi$ be the permutation which maps the ranks of the $X$ observations to the ranks of the $Y$ observations. We identify the independence assumption of the null hypothesis with the uniform distribution on the permutation space. A test based on the size of the longest increasing subsequence of $\pi$ ($L_n$) is defined. The exact distribution of $L_n$ is computed from Schensted's theorem (Schensted, 1961). The asymptotic distribution of $L_n$ was obtained by Baik et al. (1999). As the statistic $L_n$ is discrete, there is only a small set of possible significance levels. To solve this problem we define the $JL_n$ statistic, which is a jackknife version of $L_n$, as well as the corresponding hypothesis test. A third test is based on the $JLM_n$ statistic, which is a jackknife version of the longest monotonic subsequence of $\pi$. In a simulation study we apply our tests to diverse dependence situations with null or very small correlations, where the independence hypothesis is difficult to reject. We show that the $L_n$, $JL_n$ and $JLM_n$ tests perform very well in such situations. We illustrate the use of those tests on two real data examples with small sample size.

© 2014 Elsevier Inc. All rights reserved.

1. Introduction

Call $\Omega$ the space of univariate, continuous cumulative distributions. Let $(X, Y)$ be a random vector with unknown joint cumulative distribution $H$ and univariate marginal distributions $F$ and $G$ respectively, $F \in \Omega$, $G \in \Omega$. Suppose that $(x_1, y_1), \ldots, (x_n, y_n)$ is a paired sample of size $n$ of $(X, Y)$. Set

$$H_0 : X \text{ and } Y \text{ are independent.} \tag{1}$$

A test is constructed with no extra assumption (other than continuity) about the form of the marginal distributions (marginal free test). The procedure is based on the size of the longest increasing subsequence of the random permutation defined by the paired sample, denoted by $L_n$. Theorem 3.1 shows how to compute the exact distribution of $L_n$; it is a straightforward application of Schensted's theorem and the theorem of Frame et al., see Schensted [12] and Frame et al. [6]. In addition, we propose two test statistics denoted briefly by $JL_n$ and $JLM_n$, respectively. $JL_n$ is a jackknife version of $L_n$, while $JLM_n$ is based on the size of the longest monotonic subsequence. The power of these tests is compared with that of various existing tests by simulation. This new class of tests is rank-based; therefore, it is compared with other rank-based procedures for testing independence, such as the nonparametric tests of Kendall, Spearman and Hoeffding and the independence test from Genest et al. [7], denoted here by Genest's test.

Fig. 1. The left panel is the scatter plot of a sample (size = 200) from a mixture of two bivariate Normal distributions, with correlations 0.9 and −0.9 respectively (distribution D1 from Section 4). The right panel shows the sample size vs. the empirical power (level 0.01) for the same distribution.

We also include the MIC test, based on the maximal information coefficient, from Reshef et al. [11]. In addition we include Pearson's test for its well-known performance in the normal case. In the case of Kendall's, Spearman's, Hoeffding's and Pearson's tests, each methodology estimates the association between $X$ and $Y$ and tests whether the association is zero. They use different measures of association, all of them in the interval $[-1, 1]$, with 0 indicating no association/correlation. The asymptotic version of Genest's test consists of computing approximate p-values of the test statistic with respect to the empirical distribution obtained by simulation. For the MIC test, the p-value of a given MIC score is computed by selecting a probability $\delta$ of false rejection, creating a set of $1/\delta - 1$ surrogate datasets, and comparing the MIC of the real data with the MIC scores of the surrogate datasets. To compute the p-values for the Kendall, Spearman and Pearson methods, we use the cor.test function, available in the stats package from R-project. Details about each test may be found in Hollander et al. [9]. For Hoeffding's test, we compute the p-values with the hoeffd function, available in the Hmisc package from R-project. For Genest's test we use the indepTest function, available in the copula package from R-project. For the MIC test we used the support program made available for that method.

We performed a simulation study under different conditions. For example, we use a mixture of two bivariate Normal distributions, with correlations $\rho$ and $-\rho$ respectively (zero expected correlation). In this case $L_n$, $JL_n$ and $JLM_n$ were competitive and markedly more powerful than the other six tests considered. Fig. 1, on the left, shows a scatter plot for a sample (size = 200) of this mixture when $\rho = 0.9$, and Fig. 1, on the right, shows the sample size versus the empirical power (level 0.01). The other tests do not detect the dependence for any sample size. This situation illustrates the usefulness of our proposal; we explore more situations of this kind in Section 4.2.

We applied the tests based on the longest increasing subsequence to two real datasets, both with small sample sizes, considering that for bigger sample sizes there exist very efficient procedures designed for asymptotic situations. The first dataset was provided by Professor Dalia Chakrabarty, researcher in the School of Physics and Astronomy, University of Nottingham. It consists of two measures, the projected radius and the radial velocity, for 30 Globular Clusters around the galaxy NGC 3379 (see Chakrabarty [5]). The second dataset appears in VGAM (package from R-project) under the name coalminers. The data are about coal miners who are smokers without radiological pneumoconiosis, classified by age, breathlessness and wheeze.

We adapted and implemented (in C language) the algorithm provided by Zoghbi et al. [13]. We use that algorithm to compute the exact probability of $L_n$ in the case $n \le 100$. For $n > 100$ the asymptotic distribution of $L_n$, obtained by Baik et al. [3], can be used, and we show how to use it in our test in Section 3. Nevertheless, the exact probability could also be calculated for $n > 100$. The probabilities for $JL_n$ and $JLM_n$ were estimated by simulation. The tests and simulations were implemented in the R-project environment (LIStest package).

Section 2 provides the main concepts and the definition of the test statistic. In Section 3 we calculate the distribution of the proposed test statistic. Theorem 3.1 gives the exact distribution of the test statistic under the independence assumption, by a direct application of results from Schensted [12] and Frame et al. [6]. Section 4 is devoted to showing the capacity of each test statistic introduced here to detect dependence. Through simulations, we evaluate each of the test statistics against several dependence situations. We apply the tests to real datasets in Section 4.3. Appendix A contains the proof of Theorem 3.1.

Table 1. Paired sample of size 5 (columns: $x_i$, $y_i$).

Fig. 2. Scatter plot and permutation (Example 2.2). (a) Scatter plot of the sample; (b) the permutation defined by the sample, where the solid line shows the longest increasing subsequence; (c) the empirical copula of the sample.

2. Preliminaries

We introduce some basic concepts related to the size of the longest increasing subsequence associated with a paired sample of size $n$ of $(X, Y)$ with continuous marginal distributions.

Definition 2.1. Let $S_n$ denote the group of permutations of $\{1, \ldots, n\}$. If $\pi \in S_n$, we say that $\pi(i_1), \ldots, \pi(i_k)$ is an increasing subsequence of $\pi$ if $1 \le i_1 < \cdots < i_k \le n$ and $1 \le \pi(i_1) < \pi(i_2) < \cdots < \pi(i_k) \le n$.

Definition 2.2. Given a permutation $\pi \in S_n$, we call $l_n(\pi)$ (or $ld_n(\pi)$) the length of the longest increasing (or decreasing) subsequence of $\pi$.

Example 2.1. Consider the set $\{1, 2, 3, 4, 5, 6, 7, 8\}$. Let $\pi$ be the permutation which transforms this set into $\{3, 6, 1, 7, 4, 2, 5, 8\}$, that is, $\pi(1) = 3$, $\pi(2) = 6$, $\pi(3) = 1$, $\pi(4) = 7$, $\pi(5) = 4$, $\pi(6) = 2$, $\pi(7) = 5$, $\pi(8) = 8$. Examples of increasing subsequences are $\{1, 7, 8\}$, $\{3, 6, 7, 8\}$ and $\{1, 2, 5, 8\}$. The maximal size of an increasing subsequence is 4, which is reached by the sequences $\{1, 2, 5, 8\}$, $\{1, 4, 5, 8\}$ and $\{3, 6, 7, 8\}$, so $l_8(\pi) = 4$.

We bring the concept of the longest increasing subsequence to the sample space using the next example, in which we connect the sample with a specific permutation $\pi$ of $n$ points.

Example 2.2. Consider the paired sample $\{(x_i, y_i)\}_{i=1}^{n}$ from Table 1. First, sort the sample in increasing order of the marginal sample $\{x_i\}_{i=1}^{n}$ and replace each $x_i$ value with its rank in the sequence; this produces $\{(1, 3.5), (2, 2.86), (3, 4.17), (4, 3.18), (5, 3.2)\}$. Next, replace each $y_i$ with its rank in the $\{y_i\}_{i=1}^{n}$ sequence; this produces $\{(1, 4), (2, 1), (3, 5), (4, 2), (5, 3)\}$. The permutation $\pi$ related to this sample is defined by $\pi(1) = 4$, $\pi(2) = 1$, $\pi(3) = 5$, $\pi(4) = 2$, $\pi(5) = 3$. The longest increasing subsequence is $\{1, 2, 3\}$ and $l_5(\pi) = 3$, see Fig. 2(b).

We now define the length of the longest increasing subsequence as a random variable.

Definition 2.3. Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be replications of $(X, Y)$ with continuous marginal distributions. We denote by $L_n$ the random variable $L_n = l_n(\pi_D)$, where $D = \{(X_i, Y_i)\}_{i=1}^{n}$ and $\pi_D$ is the permutation which assigns $\pi_D(\operatorname{rank}(X_i)) = \operatorname{rank}(Y_i)$, $i = 1, \ldots, n$.

In the next section we give the distribution of $L_n$ under the assumption of independence between $X$ and $Y$.
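The construction in Examples 2.1 and 2.2 and Definition 2.3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' C or R implementation: the function names `rank_permutation` and `lis_length` are hypothetical, and the longest increasing subsequence is computed with the standard patience-sorting technique rather than any method named in the paper. The $x$ values used below are placeholders (only their order matters), since the $x$ column of Table 1 is not available here; ties are not handled because the marginals are assumed continuous.

```python
from bisect import bisect_left


def rank_permutation(xs, ys):
    """Return pi as a list: pi[r - 1] is the y-rank of the point whose x-rank is r."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    order_y = sorted(range(len(ys)), key=lambda i: ys[i])
    y_rank = {i: r for r, i in enumerate(order_y, start=1)}
    order_x = sorted(range(len(xs)), key=lambda i: xs[i])
    return [y_rank[i] for i in order_x]


def lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[k] = smallest possible last value of an increasing run of length k + 1
    for value in seq:
        pos = bisect_left(tails, value)
        if pos == len(tails):
            tails.append(value)
        else:
            tails[pos] = value
    return len(tails)


# Example 2.1: the permutation 3, 6, 1, 7, 4, 2, 5, 8
l8 = lis_length([3, 6, 1, 7, 4, 2, 5, 8])

# Example 2.2: y values listed in increasing order of x (x values are placeholders)
pi = rank_permutation([1, 2, 3, 4, 5], [3.5, 2.86, 4.17, 3.18, 3.2])
l5 = lis_length(pi)
# Longest decreasing subsequence: negate the values and reuse lis_length
ld5 = lis_length([-v for v in pi])
```

The same `lis_length` routine gives the longest decreasing subsequence when applied to the negated sequence, which is the device used later for the monotonic version of the statistic.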

3. The distribution of $L_n$

The exact distribution of $L_n$ in the case of independence can be obtained using the next theorem, in which the probability that $L_n$ equals $k$, for $k = 1, \ldots, n$, is denoted by $p_k^n$.

Theorem 3.1. Let $(X, Y)$ be a random vector with continuous marginal distributions, under Hypothesis (1). Suppose that $(x_1, y_1), \ldots, (x_n, y_n)$ is a paired sample of size $n$ of $(X, Y)$. Let $S_n$ denote the group of permutations of $\{1, \ldots, n\}$ and let $S_{n,U}$ be $S_n$ with the uniform distribution $U$. Then, for $k = 1, 2, \ldots, n$, if $p_k^n = \mathrm{Prob}(L_n = k)$,

$$p_k^n = \frac{1}{n!} \sum_{m=1}^{n} \sum_{W \in V_n(k,m)} N(W)^2 \tag{2}$$

where $L_n$ is given by Definition 2.3, $V_n(k, m)$ is the set of shapes of standard Young tableaux of order $n$ having $k$ columns and $m$ rows, and $N(W)$ is the number of standard Young tableaux with shape $W$, as given by Formula (6).

Proof. See Appendix A.

Remark 1. $S_{n,U}$ is the space of permutations $\pi_D$, where $D = \{(X_i, Y_i)\}_{i=1}^{n}$ and the $(X_i, Y_i)$ are i.i.d. with the same law as $(X, Y)$ under Hypothesis (1). For $k = 1, 2, \ldots, n$,

$$p_k^n = \frac{\#\{\pi \in S_n : l_n(\pi) = k\}}{n!}.$$

There are diverse algorithms in the literature to find $V_n(k, m)$; we implemented the ZS2 algorithm by Zoghbi et al. [13]. Using Theorem 3.1 we compute $p_k^n$ for $1 \le k \le n$, $n \le 100$. The table can be accessed from the LIStest package, implemented in R-project.

The asymptotic distribution of $L_n$ in the case of independence, after appropriate centering and scaling, was first obtained by Baik et al. [3]. Let $q(z)$ denote the solution of the Painlevé II equation

$$q''(z) = 2q^3(z) + z\,q(z),$$

satisfying the boundary condition $q(z) \sim \mathrm{Ai}(z)$ when $z \to \infty$, where $\mathrm{Ai}$ is the Airy function and $q''$ denotes the second derivative of $q$. Hastings et al. [8] show the asymptotic solutions

$$q(z) = \mathrm{Ai}(z) + O\!\left(\frac{e^{-(4/3)z^{3/2}}}{z^{1/4}}\right) \text{ as } z \to \infty, \qquad q(z) = \sqrt{\frac{-z}{2}}\left(1 + O\!\left(\frac{1}{z^{2}}\right)\right) \text{ as } z \to -\infty.$$

The Tracy–Widom distribution is defined by the following cumulative distribution

$$F_{TW}(t) = \exp\!\left(-\int_{t}^{\infty} (z - t)\,q^2(z)\,dz\right), \quad t \in \mathbb{R}. \tag{3}$$

Theorem 3.2. Under the assumptions of Theorem 3.1, if $\chi$ is a random variable whose distribution function is $F_{TW}$, given by Eq. (3), then

$$\chi_n = \frac{L_n - 2\sqrt{n}}{n^{1/6}} \to \chi \text{ in distribution, as } n \to \infty. \tag{4}$$

Proof. See Baik et al. [3].

For $n > 100$ we use the asymptotic distribution of $L_n$ through Eq. (4). We calculate the asymptotic p-values using the R package RMTstat, specifying the parameter $\beta = 2$ in the cumulative distribution function ptw.

3.1. The $L_n$ independence test

Let $(x_1, y_1), \ldots, (x_n, y_n)$ be a paired sample of size $n$ of $(X, Y)$ with continuous marginal distributions. The p-value for a statistical test with null hypothesis of independence against the alternative hypothesis of non-independence between $X$ and $Y$ is defined in the following way.

Definition 3.1. The two-sided p-value is

$$\min\left\{ 2F_{L_n}(l_0)\, I_{\{F_{L_n}(l_0) \le \frac{1}{2}\}} + 2\left(1 - F_{L_n}(l_0)\right) I_{\{F_{L_n}(l_0) > \frac{1}{2}\}},\; 1 \right\},$$

where $l_0$ is the observed value of $L_n$ in the sample, $F_{L_n}$ is the cumulative distribution function, $F_{L_n}(l_0) = \sum_{k=1}^{l_0} p_k^n$ (see Eq. (2)), and $I_E$ denotes the indicator function of the set $E$.
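For small $n$, Eq. (2) and the two-sided p-value of Definition 3.1 can be checked numerically with the sketch below. It rests on two assumptions that are not stated in this section: that $N(W)$ is computed with the hook-length formula of Frame et al. (taken here to be the content of Formula (6)), and that the p-value reads $\min\{2F \cdot I_{F \le 1/2} + 2(1 - F) \cdot I_{F > 1/2},\, 1\}$. All function names are hypothetical. Shapes are enumerated by plain recursion, not by the ZS2 algorithm the authors use, so this is only practical for small $n$.

```python
from fractions import Fraction
from math import factorial


def partitions_with_first_part(n, k):
    """Yield the partitions of n (non-increasing tuples) whose largest part is exactly k."""
    def rest(remaining, max_part):
        if remaining == 0:
            yield ()
            return
        for part in range(min(remaining, max_part), 0, -1):
            for tail in rest(remaining - part, part):
                yield (part,) + tail

    if 1 <= k <= n:
        for tail in rest(n - k, k):
            yield (k,) + tail


def num_standard_tableaux(shape):
    """Number of standard Young tableaux of a given shape (hook-length formula)."""
    n = sum(shape)
    col_len = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    hook_product = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1          # cells to the right in the same row
            leg = col_len[j] - i - 1   # cells below in the same column
            hook_product *= arm + leg + 1
    return factorial(n) // hook_product


def lis_distribution(n):
    """Exact null probabilities [p_1^n, ..., p_n^n] of Eq. (2), as fractions."""
    total = factorial(n)
    return [
        Fraction(sum(num_standard_tableaux(w) ** 2
                     for w in partitions_with_first_part(n, k)), total)
        for k in range(1, n + 1)
    ]


def two_sided_p_value(l0, n):
    """Two-sided p-value, following the reading of Definition 3.1 stated above."""
    cdf = sum(lis_distribution(n)[:l0])
    doubled = 2 * cdf if cdf <= Fraction(1, 2) else 2 * (1 - cdf)
    return float(min(doubled, 1))


dist3 = lis_distribution(3)
dist4 = lis_distribution(4)
```

For $n = 3$ the six permutations give one with longest increasing subsequence 1, four with 2 and one with 3, which matches the output of `lis_distribution(3)`.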

3.2. The $JL_n$ statistic

The $JL_n$ statistic is obtained from two modifications to the $L_n$ statistic. The first modification is based on Johansson [10]. That paper shows that, in the independent case, if we consider $U_i = \operatorname{rank}(X_i)$ and $V_i = \operatorname{rank}(Y_i)$, for $i = 1, \ldots, n$, then the typical deviations of a maximal path from the diagonal $U = V$ are of order $n^{5/6}$. The first modification is to consider only points whose ranks are at a distance less than or equal to $c\,n^{5/6}$ from the diagonal $U = V$, i.e. $|U_i - V_i| \le c\,n^{5/6}$, where $c$ is a constant. To choose the value of $c$, we tried different values on simulated data and used the best one. We started with $c = 0.1$, then $c = 0.2$, which gave better power, and so on until the power started to go down, which happened for $c = 0.5$. Table 15 shows the results of the simulation study used to choose $c$. Note that the power of the test does not change much for values of $c$ between 0.3 and 0.5. We choose $c = 0.4$ as it seems to give the best power for the distributions and sample sizes used in the simulation. Formally, we introduce the set

$$D_{diag} = \left\{ (U_i, V_i),\ i = 1, \ldots, n : |U_i - V_i| \le 0.4\,n^{5/6} \right\},$$

and we define $L_n^{diag} = l_{n_{diag}}(\pi_{D_{diag}})$, with $n_{diag} = \# D_{diag}$.

The second modification is a jackknife procedure. The statistic $L_n$ is discrete; Fig. 3(a) shows $F_{L_{80}}$, which is the cumulative distribution function of $L_n$ for $n = 80$. For example, $F_{L_{80}}(\cdot)$ jumps between $x = 11$ and $x = 12$ in such a way that, for a test at level 0.05 against a unilateral alternative, the attainable unilateral p-values skip over 0.05 and the exact level 0.05 cannot be achieved. To mitigate this characteristic we define a jackknife version of the $L_n^{diag}$ statistic.

Definition 3.2. Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be replications of $(X, Y)$ with continuous marginal distributions. We define

$$JL_n = \frac{1}{n_{diag}} \sum_{(u,v) \in D_{diag}} L_n^{diag}(u, v),$$

where $L_n^{diag}(u, v) = l_{n_{diag}-1}(\pi_{D_{(u,v)}})$ with $D_{(u,v)} = D_{diag} \setminus \{(u, v)\}$, for each $(u, v) \in D_{diag}$.

Fig. 3(b) shows $F_{JL_{80}}$, which is the cumulative distribution function of $JL_n$ for $n = 80$. We can see that the number of steps in the function has grown. In this case the jumps of $F_{JL_{80}}(\cdot)$ are much smaller, so a test at nominal level 0.05 is carried out in practice at an attainable level close to 0.05.

Remark 2. If we re-examine the cumulative distribution of the statistic $L_{80}$ in Fig. 3 (left), we can see that (under Hypothesis (1)) the set of values with probabilities significantly different from zero for $L_{80}$ is $\{11, 12, \ldots, 17, 18, 19\}$, as can be seen in the following table.

Table: $l$ versus $p_l^{80}$.

In the same way for $JL_n$, under Hypothesis (1), the set of values with probabilities significantly different from zero lies inside the interval $(9, 22)$, as can be seen from the right panel in Fig. 3. Because of this, if some underlying dependence structure between the random variables increases or decreases the size of the longest increasing subsequence, even by a small quantity (4 or 5 for $n = 80$), it can easily take the $L_n$ or $JL_n$ statistic into regions of very low probability. Another useful characteristic of the $L_n$ (and $JL_n$) statistic is that, as seen in Aldous et al. [1], under Hypothesis (1),

$$\lim_{n \to \infty} \frac{E(L_n)}{\sqrt{n}} = 2;$$

in other words, under the assumption of independence, when $n$ grows $L_n$ grows like $2\sqrt{n}$.

3.3. The $JLM_n$ statistic

The idea behind the $JLM_n$ statistic is to use the size of the longest monotonic subsequence, which is the maximum between the longest increasing subsequence and the longest decreasing subsequence. The size of the longest decreasing subsequence of a sample $\{(x_i, y_i)\}_{i=1}^{n}$ is the size of the longest increasing subsequence of the sample $\{(x_i, -y_i)\}_{i=1}^{n}$. As before, we only consider points that are at a distance smaller than $0.4\,n^{5/6}$ from the corresponding diagonal, derived from the observations transformed through the ranks, as described in Section 3.2.

Definition 3.3. Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be replications of $(X, Y)$ with continuous marginal distributions. We define $JLM_n = \max\{JL_n, JL_n^{-}\}$, with $JL_n$ and $JL_n^{-}$ given by Definition 3.2 applied over $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ and $(X_1, -Y_1), (X_2, -Y_2), \ldots, (X_n, -Y_n)$, respectively.
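Definitions 3.2 and 3.3 translate into the following sketch: rank both coordinates, keep the points within $c\,n^{5/6}$ of the diagonal, average the leave-one-out longest increasing subsequence lengths, and take the maximum over the original and the $y$-negated sample. The function names are hypothetical and this is not the LIStest code. Two choices below are assumptions: the band uses "less than or equal to" as in Section 3.2, and an empty $D_{diag}$ returns 0, a case the definitions do not address. Null distributions for these statistics still have to be estimated by simulation, as the text states.

```python
from bisect import bisect_left


def _ranks(values):
    """Ranks 1..n of the values (no ties expected for continuous data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks


def _lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []
    for value in seq:
        pos = bisect_left(tails, value)
        if pos == len(tails):
            tails.append(value)
        else:
            tails[pos] = value
    return len(tails)


def jl_statistic(xs, ys, c=0.4):
    """Jackknife statistic JL_n over the points near the diagonal U = V."""
    n = len(xs)
    u, v = _ranks(xs), _ranks(ys)
    band = c * n ** (5 / 6)
    # D_diag, sorted by x-rank so that the v-values form the sequence to scan
    d_diag = sorted((ui, vi) for ui, vi in zip(u, v) if abs(ui - vi) <= band)
    if not d_diag:
        return 0.0  # assumption: convention for an empty band
    v_seq = [vi for _, vi in d_diag]
    # Leave each point out in turn and average the resulting LIS lengths
    total = sum(_lis_length(v_seq[:i] + v_seq[i + 1:]) for i in range(len(v_seq)))
    return total / len(v_seq)


def jlm_statistic(xs, ys, c=0.4):
    """JLM_n: maximum of JL_n on (x, y) and on (x, -y)."""
    return max(jl_statistic(xs, ys, c), jl_statistic(xs, [-y for y in ys], c))


xs = list(range(10))
jl_increasing = jl_statistic(xs, xs)               # every point lies on the diagonal
jl_decreasing = jl_statistic(xs, xs[::-1])         # only two points fall inside the band
jlm_decreasing = jlm_statistic(xs, xs[::-1])       # negating y restores the increasing case
```

With ten perfectly increasing points every leave-one-out subsequence has length 9, so $JL_n = 9$; for perfectly decreasing points $JL_n$ collapses but $JLM_n$ recovers the same value through the negated sample.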

Fig. 3. (a) $L_n$ cumulative distribution function for $n = 80$ (left). (b) $JL_n$ cumulative distribution function for $n = 80$ (right).

4. Simulations

To compare the power of our tests against Pearson's test, Kendall's test, Spearman's test, Hoeffding's test, Genest's test and the MIC test, we carried out a simulation study in which, for each test, we estimate the power function for different sample sizes and diverse joint distributions. For each joint distribution and each of the sample sizes 20, 40, 60, 80, 100 we simulated 5000 samples and computed the p-values. The simulation was implemented in R-project. Denote by $\{(X_i^j, Y_i^j)\}_{i=1}^{n}$ the $j$-th simulated sample, with $j = 1, \ldots, 5000$ and $n = 20, 40, 60, 80, 100$. Given a level $\alpha$, we calculate the empirical significance level as

$$\frac{\#\left\{ j : \text{p-value}\left(\{(X_i^j, Y_i^j)\}_{i=1}^{n}\right) \le \alpha \right\}}{5000} \tag{5}$$

where $\text{p-value}\left(\{(X_i^j, Y_i^j)\}_{i=1}^{n}\right)$ denotes the p-value associated with the sample $j$, $\{(X_i^j, Y_i^j)\}_{i=1}^{n}$. The p-values for the $L_n$, $JL_n$ and $JLM_n$ statistics were calculated using our R package LIStest. For $n \le 100$ the LIStest package uses the exact probabilities for $L_n$, computed using Theorem 3.1. For $n > 100$ it uses the Tracy–Widom approximation given by Theorem 3.2. In the LIStest package, the distributions of $JL_n$ and $JLM_n$ were estimated by simulation for $n \le 200$. We divided our simulations into two parts: the independence case, to compare (5) for different sample sizes, and the dependent case, to measure and compare the power of the tests in diverse situations. Complete tables with the empirical power obtained in each situation can be consulted in the Appendix of this paper (see Appendix B).

4.1. Independence

Considering that the tests (except Pearson's test) are marginal free, we analyze only two distributions for the case of independence. The first distribution is pairs of independent random variables with Normal (standard) marginal distributions.
The second distribution, with heavier tails, consists of pairs of independent random variables with Pareto marginal distributions with parameter 4. Fig. 4 and Tables 7 and 8 show the behavior of the empirical significance levels for these two distributions. We note that the empirical significance level of Pearson's test can reach values significantly higher than $\alpha = 0.01$ under the effect of the marginal distributions, for example when the marginal distributions are heavy-tailed, like the Pareto distribution. For illustration, see the lower panel in Fig. 4, which shows the effect of the marginal distribution on Pearson's test. Situations such as these show the importance of using a marginal free methodology to detect dependence. We can also see that the empirical significance levels of the $JL_n$ and $JLM_n$ tests are much closer to the theoretical $\alpha$ than that of the $L_n$ test.

Fig. 4. The picture on the left is the scatter plot of a sample of size 200; the picture on the right is the sample size vs. the empirical significance level ($\alpha = 0.01$) for the same distribution. On top: independent $N(0, 1)$ random variables; on bottom: independent Pareto(4) random variables.

4.2. Dependence

4.2.1. The conjecture

No dependence test is known to be optimal under all dependence structures. With this in mind, our conjecture is that the proposed family of tests is efficient at detecting types of dependence with null correlation or with very small correlation, and we concentrate our study on joint distributions with zero or very small correlations that challenge many tests of independence. In order to give some intuition about the behavior of the new tests in traditional situations, we first explore four models with medium and high correlations:

(i) Gumbel's copula with parameter $\theta$, where the cumulative distribution is given by $C_G(x, y \mid \theta) = \exp\{-[(-\ln x)^{\theta} + (-\ln y)^{\theta}]^{1/\theta}\}$, $\theta \in [1, \infty)$;
(ii) Frank's copula with parameter $\theta$ and cumulative distribution $C_F(x, y \mid \theta) = -\frac{1}{\theta} \ln\left\{1 + \frac{(\exp(-\theta x) - 1)(\exp(-\theta y) - 1)}{\exp(-\theta) - 1}\right\}$, $\theta \in (-\infty, \infty) \setminus \{0\}$;
(iii) Clayton's copula with parameter $\theta$ and cumulative distribution $C_C(x, y \mid \theta) = \max\{(x^{-\theta} + y^{-\theta} - 1)^{-1/\theta}, 0\}$, $\theta \in [-1, \infty) \setminus \{0\}$;
(iv) the bivariate normal distribution with correlation $\rho$.

Tables 2 and 3 show the results. In all the simulated cases we fixed the parameters $\theta$ and $\rho$ in order to obtain a correlation between $x$ and $y$ approximately equal to 0.5 and 0.7. For large sample sizes, at the nominal level 0.01, the new family of tests detects the dependence but with lower power than the other tests, except in some situations, such as the case of correlation approximately equal to 0.7, where the new family of tests showed positive results. In general, Pearson's test shows the highest power in all the distributions with moderate to large correlation considered in this study, see Tables 2 and 3. For a detailed illustration of the behavior observed in the cases included in Tables 2 and 3, see Fig. 5.
We also observed that the statistics introduced in this paper are not consistent against all alternatives. For example, they are not able to recognize the difference between the Uniform distribution on $[0, 1]^2$ and the distribution given by $f(x, y) = 1$ if $(x, y) \in [0, 1]^2 \setminus (A \cup B)$, $f(x, y) = 2$ if $(x, y) \in B$ and $f(x, y) = 0$ otherwise, where $A = [0.5 - a, 0.5 + a] \times [1 - a, 1]$ and $B = [0.5 - a, 0.5 + a] \times [0, a]$, with $a$ a very small value. That is, the distribution $f$ is equal to the Uniform except on the sets $A$ and $B$: on $A$, $f$ is zero, and on $B$, $f$ takes the value 2 (because the mass on $A$ was reallocated to $B$). The family of tests given in this paper recognizes the magnitude of the density distributed on the diagonals ($x = y$ and/or $x = -y$) and in their neighborhood.

Table 2. Empirical significance level ($\alpha = 0.01$). For each case the best power is highlighted in bold. Columns: Dist., $n$, Spe, Ken, Pea, Hoe, Mine, Gen, $L_n$, $JL_n$, $JLM_n$. Rows: Gumbel, Frank and Clayton copulas, each with two values of $\theta$.

Table 3. Empirical significance level ($\alpha = 0.01$), bivariate Normal distribution with variance 1 and correlation $\rho$. For each case the best power is highlighted in bold. Columns: $\rho$, $n$, Spe, Ken, Pea, Hoe, Mine, Gen, $L_n$, $JL_n$, $JLM_n$.

4.2.2. Settings of dependence

We consider two main situations in which the samples show low correlation: (a) visible dependence; (b) hidden dependence. In the first group we explore distributions with the following $x$–$y$ plot shapes: (i) a cross, (ii) a ring, and (iii) a square. All of them are types of dependence with null expected correlation coefficients. We note that Pearson's test, Kendall's test and Spearman's test are not consistent for Hypothesis (1), which explains their poor performance in almost all the cases presented in this section.

Fig. 5. The picture on the left is the scatter plot of a sample of size 200 from the Gumbel copula. The picture on the right is the sample size vs. the empirical significance level ($\alpha = 0.01$) for the same distribution. On top $\theta = 1.55$ (correlation 0.5), and on bottom $\theta = 2.07$ (correlation 0.7).

For case (a) we implemented the following joint distributions:

D1. Mixture of two bivariate Normal distributions with variances 1 and correlations $\rho$ and $-\rho$: $(X, Y) \sim \frac{1}{2} N_2(0, \Sigma_1) + \frac{1}{2} N_2(0, \Sigma_2)$, where $0 = (0, 0)$, $\Sigma_1 = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$ and $\Sigma_2 = \begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}$.
D2. Uniform ring centered at 0 with internal radius $\rho$ and external radius 1.
D3. Uniform distribution on $\{[-1, 1] \times [-1, 1]\} \setminus \{[-\rho, \rho] \times [-\rho, \rho]\}$ (border of a square).

In all the cases (see Figs. 1, 6–9 and Tables 9–11 respectively), the new family of tests attains the highest empirical power. The D1, D2 and D3 situations show how it is possible to enhance the efficiency of statistical tests based on $L_n$, exemplified here by the statistics $JL_n$ and $JLM_n$.

For case (b) we implemented the following joint distributions:

D4. Mixture of two bivariate Normal distributions, one independent with standard deviation 4 and the other dependent with standard deviation 1 and correlation $\rho$: $(X, Y) \sim \frac{1}{2} N_2(0, 16I) + \frac{1}{2} N_2(0, \Sigma)$, where $0 = (0, 0)$, $16I = \begin{pmatrix} 16 & 0 \\ 0 & 16 \end{pmatrix}$ and $\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$.
D5. Mixture of two bivariate Normal distributions, a proportion $\gamma$ independent with standard deviation 4 and a proportion $(1 - \gamma)$ dependent with standard deviation 0.5 and correlation $\rho = 0.95$: $(X, Y) \sim \gamma N_2(0, 16I) + (1 - \gamma) N_2(0, \Sigma)$, where $0 = (0, 0)$, $16I = \begin{pmatrix} 16 & 0 \\ 0 & 16 \end{pmatrix}$ and $\Sigma = \begin{pmatrix} 0.25 & 0.25\rho \\ 0.25\rho & 0.25 \end{pmatrix}$.
D6. Mixture of two bivariate Clayton copulas, one with parameter −0.1 and the other with parameter 10: $(X, Y) \sim 0.75\, C_C(\cdot, \cdot \mid -0.1) + 0.25\, C_C(\cdot, \cdot \mid 10)$.

For the case of distribution D6, Spearman's correlation was about . Spearman's, Kendall's and Pearson's tests cannot detect the dependence, because the sample proportion (25%) which has strong correlation (around 0.94) is too small compared with the sample proportion (75%) having negative and small correlation (about −0.17). According to the results in the corresponding figures and tables, in scenarios D4, D5 and D6 the statistics $L_n$, $JL_n$ and $JLM_n$ reach the best results, followed by Hoeffding's test.

Fig. 6. The picture on the left is the scatter plot of a sample of size 200 from D1. The picture on the right is the sample size vs. the empirical significance level ($\alpha = 0.01$) for the same distribution with $\rho = 0.7$.

4.3. Applications

4.3.1. Application 1, Globular Clusters data

The dataset is composed of a sample of globular clusters (GC) around the galaxy NGC 3379 (see Bergond et al. [4] and Chakrabarty [5]). NGC 3379 is the brightest elliptical galaxy in the constellation Leo and it is known to have a supermassive black hole. The measures (Fig. 13) are the Projected Distance to the galaxy, expressed in kpc ($x$ axis), and the Line of Sight (LOS) radial Velocity, expressed in km/s ($y$ axis), for 30 GCs. While conceptually the dependence between the Projected Distance and the LOS Velocity exists, it is not detected by Pearson's test, Kendall's test, Spearman's test, Hoeffding's test, Genest's test or the MIC test, see the results in Table 4. Astronomers use this kind of relation to infer the total mass distribution in galaxies. They can compare, for example, the globular cluster system of NGC 3379 (with a black hole) and some planetary nebulae (without a black hole) in order to infer the influence produced by the presence of a black hole. The $L_n$, $JL_n$ and $JLM_n$ tests are able to show that dependence.

Table 4. P-values, Globular Clusters, NGC 3379 galaxy. Columns: Spe, Ken, Hoe, Pea, Gen, MIC, $L_n$, $JL_n$, $JLM_n$.

Spearman's correlation between the Projected Distance and the LOS Velocity is equal to

Fig. 7. The left panel is the scatter plot of a sample of size 200 from D2. The right panel is the sample size vs. the empirical significance level ($\alpha = 0.01$) for the same distribution. On top $\rho = 0$, and on bottom $\rho =$ .

4.3.2. Application 2, coalminers data

The dataset named coalminers appears in VGAM (package from R-project). The data are about coal miners who are smokers without radiological pneumoconiosis, classified by age, breathlessness and wheeze. Denote by BW the counts with breathlessness and wheeze, by BnW the counts with breathlessness but no wheeze, and by nBW the counts with no breathlessness and wheeze. Fig. 14, on the left, shows the plot of BnW against BW, while Fig. 15, on the left, shows the plot of BW against nBW. Each point corresponds to one of 9 age groups. In both situations the dependency appears as a consequence of event B (breathlessness) or W (wheeze), respectively. Since Figs. 14 and 15 (left) show an increasing tendency, we can test unilateral hypotheses for the relations BnW versus BW and BW versus nBW, respectively. For the tests based on a specific measure, we test "measure > 0"; for the $L_n$ test we test $L_n > M_0$, where $M_0$ is the mode of the distribution under the independence assumption. For that kind of hypothesis, we compute the exact p-values for Pearson's test, Spearman's test, Kendall's test (see the description of the function cor.test from R-project) and the $L_n$ test. In the first case (BnW versus BW) all the tests reject independence in favor of an increasing tendency, with small p-values (less than 1e-05) and with dependence coefficients approximately equal to 1 (Pearson's correlation, Spearman's rank correlation and Kendall's rank correlation). In contrast, in the second situation (BW versus nBW) the tests based on correlations fail to reject the null hypothesis, while the $L_n$ test and the $JL_n$ test reject it at level 5%. Table 5 shows the results for this case, in which the $L_n$ and $JL_n$ tests show the best performance.

Table 5. Unilateral hypotheses (BW, nBW). Columns: Test (Spe, Ken, Pea, $L_n$, $JL_n$); rows: Coefficient, p-value.

Fig. 8. The left panel is the scatter plot of a sample of size 200 from D2. The right panel is the sample size vs. the empirical significance level ($\alpha = 0.01$) for the same distribution. On top $\rho = 0.3$, and on bottom $\rho = 0.5$.

Fig. 9. The left panel is the scatter plot of a sample of size 200 from D3. The right panel is the sample size vs. the empirical significance level ($\alpha = 0.01$) for the same distribution. On top $\rho = 0.5$, and on bottom $\rho = 0.7$.

5. Conclusions

In this work we develop a new class of nonparametric tests for the independence of two continuous random variables. For the $L_n$ test we give the exact distribution of the test statistic; for the $JL_n$ and $JLM_n$ tests we estimated the distribution by simulation. We compare our family of tests with Pearson's test, Kendall's test, Spearman's test, Genest's test, Hoeffding's test and the MIC test using simulations. We apply the $L_n$ test and its variants to two real data examples. The inability of $L_n$ to reach the nominal $\alpha$ level (by construction) is overcome in its variants $JL_n$ and $JLM_n$. For the sample sizes considered in our study, the tests based on the longest increasing subsequence were the only ones able to detect dependence for the distributions D1–D6, followed by Hoeffding's test in the cases D4 and D6. It is necessary to emphasize the ability of these new tests to address situations with moderate (even small) sample sizes. In all the cases in which the $L_n$ test and its variants work well, they have the highest power for sample sizes bigger than 40. This property, added to the capacity to control the significance level (through the $JL_n$ and/or $JLM_n$ versions), puts this procedure in an advantageous position relative to the other tests, as shown in the simulation study and in the applications to real data. According to the simulation study (Section 4), the $L_n$, $JL_n$ and $JLM_n$ tests behave remarkably well in the mixture cases, in which the samples are composed of two subsamples, one coming from a strongly correlated distribution and the other from a weakly correlated distribution. In summary, in this paper we wish to draw attention to the potential of statistics constructed from the longest increasing subsequence. Such statistics can be useful for detecting types of dependence that are difficult to identify with the tests available in the literature.

Fig. 10. Left panel: scatter plot of a sample of size 200 from D4. Right panel: sample size vs. the empirical significance level (α = 0.01) for the same distribution. Top: ρ = 0.9; bottom: ρ = …

Acknowledgments

The authors gratefully acknowledge the support for this research provided by (a) USP project "Mathematics, computation, language and the brain", (b) "Portuguese in time and space: linguistic contact, grammars in competition and parametric change", FAPESP's project, grant 2012/ and (c) FAPESP Center for Neuromathematics (grant 2013/ , S. Paulo Research Foundation). Special thanks to Professor Dalia Chakrabarty for making the astronomical data used in this paper available to us. We wish to thank the referees and an associate editor for their many helpful comments and suggestions on an earlier draft of this paper.

Appendix A. Proof of Theorem 3.1

The combinatorial concepts that we will introduce, useful in representation theory, have been extensively developed in Frame et al. [6], Schensted [12] and Baer et al. [2].

Fig. 11. Left panel: scatter plot of a sample of size 200 from D5. Right panel: sample size vs. the empirical significance level (α = 0.01) for the same distribution. Top: γ = 0.75; bottom: γ = …

Definition A.1. A standard Young tableau of order n is an arrangement of n distinct natural numbers in rows and columns such that the numbers in each row and in each column form increasing sequences, there is an element of each row in the first column and an element of each column in the first row, and there are no gaps between numbers.

There is a 1-1 correspondence between permutations and standard Young tableaux. To each permutation we can assign a corresponding standard Young tableau, following Baer et al. [2], in the following way. Let the permutation be $\{x_1, x_2, \ldots, x_n\}$. For the moment, define the first entry in the first row of the tableau to be $x_1$. Now, if at the i-th step the first i entries of the sequence have been used in the developing tableau, then at the next step the element $x_{i+1}$ is inserted into the first row of the tableau by displacing the smallest entry in the first row which is larger than $x_{i+1}$, or by appending $x_{i+1}$ at the end of the first row if it is larger than all entries in the first row. If an entry y is displaced from the first row by $x_{i+1}$, then y is inserted into the second row by letting it displace the smallest entry in the second row which is larger than y, or by simply appending y to the second row if there is no such element. The process is continued from row to row until either the original $x_{i+1}$ or a displaced element is appended to the end of a row. Then the whole process is repeated for $x_{i+2}, \ldots$ until all of the entries of the original permutation sequence have been entered into the tableau.
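The insertion procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `schensted_insert` and `insertion_tableau` are hypothetical, and the tableau is assumed to be stored as a list of rows, each row an increasing list.

```python
from bisect import bisect_right

def schensted_insert(tableau, x):
    """Insert x into the tableau (a list of increasing rows) by row bumping."""
    for row in tableau:
        # Index of the smallest entry in this row that is larger than x.
        pos = bisect_right(row, x)
        if pos == len(row):
            row.append(x)  # x is larger than every entry: append and stop
            return tableau
        # Displace that entry; the displaced value is inserted into the next row.
        row[pos], x = x, row[pos]
    tableau.append([x])  # no row left: the displaced value starts a new row
    return tableau

def insertion_tableau(perm):
    """Build the tableau by inserting the entries of perm one at a time."""
    tableau = []
    for x in perm:
        schensted_insert(tableau, x)
    return tableau

print(insertion_tableau([2, 3, 1]))  # [[1, 3], [2]]
```

For the small permutation {2, 3, 1}, the entry 1 displaces 2 from the first row, and 2 then starts the second row.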

Fig. 12. Left panel: scatter plot of a sample of size 200 from D6. Right panel: sample size vs. the empirical significance level (α = 0.01) for the same distribution.

Fig. 13. The plots show the Projected Distance (kpc) vs. the Line of Sight Velocity (left) and the ranks of the Projected Distance vs. the ranks of the Line of Sight Velocity (right) for 30 GCs associated with NGC

Example A.1. Applying the algorithm given by Baer et al. [2], reproduced in the previous paragraph, to the permutation {3, 6, 1, 5, 7, 2, 4, 8}, we find the following sequence of arrangements (rows are listed from top to bottom and separated by "/"). The last arrangement (step 8) is the standard Young tableau.

step 1: 3
step 2: 3 6
step 3: 1 6 / 3
step 4: 1 5 / 3 6
step 5: 1 5 7 / 3 6
step 6: 1 2 7 / 3 5 / 6
step 7: 1 2 4 / 3 5 7 / 6
step 8: 1 2 4 8 / 3 5 7 / 6

Remark 3. The first row of the standard Young tableau corresponds to one of the longest increasing subsequences.
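As a hedged illustration of Remark 3, the sketch below tracks only the first row of the tableau and compares its length with the length of the longest increasing subsequence computed independently by dynamic programming. The names `first_row` and `lis_length` are hypothetical; the property checked is that the two lengths agree.

```python
from bisect import bisect_left
from itertools import permutations

def first_row(perm):
    """First row of the insertion tableau; lower rows are not tracked."""
    row = []
    for x in perm:
        pos = bisect_left(row, x)  # smallest entry larger than x (entries distinct)
        if pos == len(row):
            row.append(x)          # x extends the first row
        else:
            row[pos] = x           # x displaces an entry, which leaves the first row
    return row

def lis_length(perm):
    """Length of the longest increasing subsequence, O(n^2) dynamic programming."""
    best = []
    for i, x in enumerate(perm):
        best.append(1 + max((best[j] for j in range(i) if perm[j] < x), default=0))
    return max(best, default=0)

perm = [3, 6, 1, 5, 7, 2, 4, 8]
print(first_row(perm))   # [1, 2, 4, 8]
print(lis_length(perm))  # 4

# Exhaustive check on all permutations of {1, ..., 6}.
assert all(len(first_row(p)) == lis_length(p) for p in permutations(range(1, 7)))
```

Tracking only the first row is enough when only the size of the longest increasing subsequence is needed, and it costs O(n log n) operations.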

Fig. 14. The plots show counts with breathlessness and no wheeze (BnW) vs. counts with breathlessness and wheeze (BW) (left) and the ranks of BnW vs. the ranks of BW (right).

Fig. 15. The plots show counts with breathlessness and wheeze (BW) vs. counts with no breathlessness and wheeze (nbw) (left) and the ranks of BW vs. the ranks of nbw (right).

Definition A.2. If T is a standard Young tableau of order n, for each element j, $j \in \{1, \ldots, n\}$, of the arrangement we define the hook number of j, $h_j$, as the number of elements in the same column and in the same row in which j is included, counting from the bottom of the column up to the element j and from the right end of the row up to the element j. That is, $h_j$ counts the elements below j in its column, the elements to the right of j in its row, and j itself.

Example A.2. We illustrate the concepts introduced by means of Example 2.2. (Left: standard Young tableau. Right: hook numbers.)

Remark 4. By definition, the $h_j$ numbers depend on the shape of the tableau, not on the numbers filling it. Each permutation is directly associated with the shape of a standard Young tableau, but different permutations of {1, ..., n} can give the same tableau shape. The next example shows all the possible shapes of standard Young tableaux that can be obtained from the permutations of 5 numbers.

Example A.3. Consider the set {1, 2, 3, 4, 5}. Each shape in the list (Table 6) is associated with an integer partition, which is a way of writing n as a sum of positive integers, denoted by IP(n) (n = 5 in this case).
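A minimal sketch of these two notions follows, under the assumption (ours, not the paper's) that a shape is encoded as a tuple of row lengths from top to bottom. The helper names `partitions` and `hook_numbers` are hypothetical.

```python
def partitions(n, largest=None):
    """Yield the integer partitions of n as non-increasing tuples (row lengths)."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def hook_numbers(shape):
    """Hook number of every cell of a shape given by its row lengths."""
    # Column lengths: the number of rows long enough to reach column c.
    cols = [sum(1 for r in shape if r > c) for c in range(shape[0])] if shape else []
    # Cells to the right in the row, cells below in the column, plus the cell itself.
    return [[(shape[i] - j - 1) + (cols[j] - i - 1) + 1 for j in range(shape[i])]
            for i in range(len(shape))]

print(list(partitions(5)))   # the 7 shapes of order 5
print(hook_numbers((3, 2)))  # [[4, 3, 1], [2, 1]]
```

The enumeration returns seven partitions of 5, matching the seven shapes listed in Table 6, although the order produced here need not match the numbering of the shapes in the table.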

Table 6. List of shapes (of standard Young tableaux) and hook numbers for the permutations of {1, 2, 3, 4, 5}. Columns: Shape 1 to Shape 7, with the associated integer partitions IP1(5) to IP7(5).

Table 7. Empirical power at level α = 0.01 for independent random variables with Normal marginal distributions. Columns: n, Spe, Ken, Pea, Hoe, MIC, Gen, $L_n$, $JL_n$, $JLM_n$.

Table 8. Empirical power at level α = 0.01 for independent random variables with Pareto(4) marginal distributions. Columns: n, Spe, Ken, Pea, Hoe, MIC, Gen, $L_n$, $JL_n$, $JLM_n$.

Table 9. Empirical power at level α = 0.01 for distribution D1. For each case, the best power is highlighted in bold. Columns: ρ, n, Spe, Ken, Pea, Hoe, MIC, Gen, $L_n$, $JL_n$, $JLM_n$.

Shape 1 corresponds to the permutation π(1) = 5, π(2) = 4, π(3) = 3, π(4) = 2, π(5) = 1 ($ld_5 = 5$) and it is associated with the integer partition of n = 5 given by IP1(5) = 5 (the number of elements in the first column of shape 1). Shape 5 is associated with the integer partition of n = 5 given by IP5(5) = 3 + 1 + 1, where each term of IP5(5) (from left to right) is the number of elements in the corresponding column of shape 5.

Given a permutation π, the size of the longest increasing subsequence of π is the size of the first row in the shape of the tableau corresponding to the permutation. The next results allow us to compute the number of permutations of n numbers such that $l_n(\pi) = k$, which is determined by the number of standard Young tableaux whose shape has a first row of size k.

Theorem A.1 (Frame et al. [6]). Given a shape W, the number of standard Young tableaux with shape W containing the integers {1, ..., n} is

$$N(W) = \frac{n!}{\prod_{j=1}^{n} h_j}, \tag{6}$$

where the $h_j$, j = 1, ..., n, are the hook numbers for each cell of the tableau.
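Formula (6) is straightforward to evaluate. The sketch below is an illustration only: it assumes shapes are encoded as tuples of row lengths and uses the hypothetical helper names `hook_numbers` and `num_tableaux`. It evaluates N(W) for the seven shapes of order 5 and, as a sanity check, uses the known identity that the squares of these counts add up to n!.

```python
from math import factorial, prod

def hook_numbers(shape):
    """Flat list of hook numbers for a shape given by its row lengths."""
    cols = [sum(1 for r in shape if r > c) for c in range(shape[0])]
    # (cells from j to the end of its row) + (cells from j to the bottom
    # of its column) - 1, since the cell j itself is counted in both terms.
    return [(shape[i] - j) + (cols[j] - i) - 1
            for i in range(len(shape)) for j in range(shape[i])]

def num_tableaux(shape):
    """Number of standard Young tableaux of the given shape, formula (6)."""
    return factorial(sum(shape)) // prod(hook_numbers(shape))

shapes_of_5 = [(5,), (4, 1), (3, 2), (3, 1, 1), (2, 2, 1), (2, 1, 1, 1), (1, 1, 1, 1, 1)]
counts = {shape: num_tableaux(shape) for shape in shapes_of_5}
print(counts)
print(sum(c * c for c in counts.values()) == factorial(5))  # True
```

`math.prod` requires Python 3.8 or later; integer division is exact here because the hook product always divides n!.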

Table 10. Empirical power at level α = 0.01 for distribution D2. For each case, the best power is highlighted in bold. Columns: ρ, n, Spe, Ken, Pea, Hoe, MIC, Gen, $L_n$, $JL_n$, $JLM_n$.

Table 11. Empirical power at level α = 0.01 for distribution D3. For each case, the best power is highlighted in bold. Columns: ρ, n, Spe, Ken, Pea, Hoe, MIC, Gen, $L_n$, $JL_n$, $JLM_n$.

Table 12. Empirical power at level α = 0.01 for distribution D4. For each case, the best power is highlighted in bold. Columns: ρ, n, Spe, Ken, Pea, Hoe, MIC, Gen, $L_n$, $JL_n$, $JLM_n$.

Table 13. Empirical power at level α = 0.01 for distribution D5. For each case, the best power is highlighted in bold. Columns: ρ, n, Spe, Ken, Pea, Hoe, MIC, Gen, $L_n$, $JL_n$, $JLM_n$.

Table 14. Empirical power at level α = 0.01 for distribution D6. For each case, the best power is highlighted in bold. Columns: n, Spe, Ken, Pea, Hoe, MIC, Gen, $L_n$, $JL_n$, $JLM_n$.

Table 15. Empirical power of the $JL_n$ test at level α = 0.01, for c = 0.1, 0.2, 0.3, 0.4 and 0.5, for six dependence situations. Columns: Distribution, c. Rows: five D distributions (each with its value of ρ) and the Bivariate Normal (with its value of ρ).

Example A.4. The number of standard Young tableaux containing the numbers {1, 2, 3, 4, 5} with the shape given in Example A.2 (left) is 5!/[4·3·2] = 5 (using the values of Example A.2 (right)).

Theorem A.2 (Schensted [12]). Let $V_n(k, m)$ be the set of shapes of Young tableaux of order n having k columns and m rows. The number of permutations of n elements with a longest increasing subsequence of size k and a longest decreasing subsequence of size m is

$$\sum_{W \in V_n(k,m)} N(W)^2.$$

Example A.5. Considering the set of numbers {1, 2, 3, 4, 5}, we want to calculate the number of sequences having $l_5 = 3$. Let us denote by #{A} the cardinality of the set A. Then #{$l_5 = 3$} = #{$l_5 = 3$, $ld_5 = 2$} + #{$l_5 = 3$, $ld_5 = 3$}, corresponding to only two possible shapes of Young tableaux, shape 4 and shape 5 (see Table 6). Using Theorem A.2, #{$l_5 = 3$, $ld_5 = 2$} = $5^2$ = 25, #{$l_5 = 3$, $ld_5 = 3$} = $6^2$ = 36 and #{$l_5 = 3$} = 25 + 36 = 61.

Proof. Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be independent, identically distributed bivariate random vectors with the same distribution as (X, Y). X and Y satisfy Hypothesis (1), with continuous marginal distributions. We can define $L_n$ as
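Returning to Example A.5, the counts given by Theorem A.2 can be checked independently by enumerating all 120 permutations of {1, 2, 3, 4, 5}. The sketch below is a brute-force check, not the hook-number computation used in the paper; the helper name `longest_monotone` is hypothetical.

```python
from collections import Counter
from itertools import permutations

def longest_monotone(seq, increasing=True):
    """Length of the longest increasing (or decreasing) subsequence of seq."""
    best = []
    for i, x in enumerate(seq):
        # Entries are distinct, so "not smaller" means "larger".
        prev = (best[j] for j in range(i) if (seq[j] < x) == increasing)
        best.append(1 + max(prev, default=0))
    return max(best, default=0)

# Joint counts of (l_5, ld_5) over all permutations of {1, ..., 5}.
counts = Counter(
    (longest_monotone(p, True), longest_monotone(p, False))
    for p in permutations(range(1, 6))
)

print(counts[(3, 2)], counts[(3, 3)])                  # 25 36
print(sum(c for (l, _), c in counts.items() if l == 3))  # 61
```

Dividing the counts for each value of $l_5$ by 5! gives the exact null distribution of the statistic for n = 5; for larger n the enumeration becomes infeasible and the hook-number route is the practical one.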


More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

More information

UNIT 4 RANK CORRELATION (Rho AND KENDALL RANK CORRELATION

UNIT 4 RANK CORRELATION (Rho AND KENDALL RANK CORRELATION UNIT 4 RANK CORRELATION (Rho AND KENDALL RANK CORRELATION Structure 4.0 Introduction 4.1 Objectives 4. Rank-Order s 4..1 Rank-order data 4.. Assumptions Underlying Pearson s r are Not Satisfied 4.3 Spearman

More information

Dover- Sherborn High School Mathematics Curriculum Probability and Statistics

Dover- Sherborn High School Mathematics Curriculum Probability and Statistics Mathematics Curriculum A. DESCRIPTION This is a full year courses designed to introduce students to the basic elements of statistics and probability. Emphasis is placed on understanding terminology and

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Multivariate Non-Normally Distributed Random Variables

Multivariate Non-Normally Distributed Random Variables Multivariate Non-Normally Distributed Random Variables An Introduction to the Copula Approach Workgroup seminar on climate dynamics Meteorological Institute at the University of Bonn 18 January 2008, Bonn

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES REVSTAT Statistical Journal Volume 13, Number 3, November 2015, 233 243 MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES Authors: Serpil Aktas Department of

More information

Simulating Realistic Ecological Count Data

Simulating Realistic Ecological Count Data 1 / 76 Simulating Realistic Ecological Count Data Lisa Madsen Dave Birkes Oregon State University Statistics Department Seminar May 2, 2011 2 / 76 Outline 1 Motivation Example: Weed Counts 2 Pearson Correlation

More information

Discrete Multivariate Statistics

Discrete Multivariate Statistics Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are

More information

6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter 6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

GENERAL MULTIVARIATE DEPENDENCE USING ASSOCIATED COPULAS

GENERAL MULTIVARIATE DEPENDENCE USING ASSOCIATED COPULAS REVSTAT Statistical Journal Volume 14, Number 1, February 2016, 1 28 GENERAL MULTIVARIATE DEPENDENCE USING ASSOCIATED COPULAS Author: Yuri Salazar Flores Centre for Financial Risk, Macquarie University,

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Smooth Tests of Copula Specifications

Smooth Tests of Copula Specifications Smooth Tests of Copula Specifications Juan Lin Ximing Wu December 24, 2012 Abstract We present a family of smooth tests for the goodness of fit of semiparametric multivariate copula models. The proposed

More information

Introduction to Statistical Data Analysis III

Introduction to Statistical Data Analysis III Introduction to Statistical Data Analysis III JULY 2011 Afsaneh Yazdani Preface Major branches of Statistics: - Descriptive Statistics - Inferential Statistics Preface What is Inferential Statistics? The

More information

Correlation. We don't consider one variable independent and the other dependent. Does x go up as y goes up? Does x go down as y goes up?

Correlation. We don't consider one variable independent and the other dependent. Does x go up as y goes up? Does x go down as y goes up? Comment: notes are adapted from BIOL 214/312. I. Correlation. Correlation A) Correlation is used when we want to examine the relationship of two continuous variables. We are not interested in prediction.

More information

Web-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros

Web-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros Web-based Supplementary Material for A Two-Part Joint Model for the Analysis of Survival and Longitudinal Binary Data with excess Zeros Dimitris Rizopoulos, 1 Geert Verbeke, 1 Emmanuel Lesaffre 1 and Yves

More information

How To Use CORREP to Estimate Multivariate Correlation and Statistical Inference Procedures

How To Use CORREP to Estimate Multivariate Correlation and Statistical Inference Procedures How To Use CORREP to Estimate Multivariate Correlation and Statistical Inference Procedures Dongxiao Zhu June 13, 2018 1 Introduction OMICS data are increasingly available to biomedical researchers, and

More information

Politecnico di Torino. Porto Institutional Repository

Politecnico di Torino. Porto Institutional Repository Politecnico di Torino Porto Institutional Repository [Article] On preservation of ageing under minimum for dependent random lifetimes Original Citation: Pellerey F.; Zalzadeh S. (204). On preservation

More information

Discrete Applied Mathematics

Discrete Applied Mathematics Discrete Applied Mathematics 194 (015) 37 59 Contents lists available at ScienceDirect Discrete Applied Mathematics journal homepage: wwwelseviercom/locate/dam Loopy, Hankel, and combinatorially skew-hankel

More information

On detection of unit roots generalizing the classic Dickey-Fuller approach

On detection of unit roots generalizing the classic Dickey-Fuller approach On detection of unit roots generalizing the classic Dickey-Fuller approach A. Steland Ruhr-Universität Bochum Fakultät für Mathematik Building NA 3/71 D-4478 Bochum, Germany February 18, 25 1 Abstract

More information

MULTIDIMENSIONAL POVERTY MEASUREMENT: DEPENDENCE BETWEEN WELL-BEING DIMENSIONS USING COPULA FUNCTION

MULTIDIMENSIONAL POVERTY MEASUREMENT: DEPENDENCE BETWEEN WELL-BEING DIMENSIONS USING COPULA FUNCTION Rivista Italiana di Economia Demografia e Statistica Volume LXXII n. 3 Luglio-Settembre 2018 MULTIDIMENSIONAL POVERTY MEASUREMENT: DEPENDENCE BETWEEN WELL-BEING DIMENSIONS USING COPULA FUNCTION Kateryna

More information

Lecture 3. Inference about multivariate normal distribution

Lecture 3. Inference about multivariate normal distribution Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates

More information

Supplementary Material for Wang and Serfling paper

Supplementary Material for Wang and Serfling paper Supplementary Material for Wang and Serfling paper March 6, 2017 1 Simulation study Here we provide a simulation study to compare empirically the masking and swamping robustness of our selected outlyingness

More information

Introduction to Maximum Likelihood Estimation

Introduction to Maximum Likelihood Estimation Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:

More information

A Conditional Approach to Modeling Multivariate Extremes

A Conditional Approach to Modeling Multivariate Extremes A Approach to ing Multivariate Extremes By Heffernan & Tawn Department of Statistics Purdue University s April 30, 2014 Outline s s Multivariate Extremes s A central aim of multivariate extremes is trying

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

5 Introduction to the Theory of Order Statistics and Rank Statistics

5 Introduction to the Theory of Order Statistics and Rank Statistics 5 Introduction to the Theory of Order Statistics and Rank Statistics This section will contain a summary of important definitions and theorems that will be useful for understanding the theory of order

More information

Concentration Inequalities for Random Matrices

Concentration Inequalities for Random Matrices Concentration Inequalities for Random Matrices M. Ledoux Institut de Mathématiques de Toulouse, France exponential tail inequalities classical theme in probability and statistics quantify the asymptotic

More information

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information