Nonparametric estimation of the entropy using a ranked set sample


arXiv: v3 [stat.CO] 14 Jun 2015

Morteza Amini and Mahdi Mahdizadeh

Department of Statistics, School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, P.O. Box, Tehran, Iran
School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), P.O. Box, Tehran, Iran
Department of Statistics, Hakim Sabzevari University, P.O. Box 397, Sabzevar, Iran

June 16, 2015

Abstract

This paper is concerned with nonparametric estimation of the entropy in ranked set sampling. Theoretical properties of the proposed estimator are studied, and the proposed estimator is compared with the rival estimator in simple random sampling. Applications of the proposed estimator to the estimation of the mutual information as well as the Kullback-Leibler divergence are provided. Several Monte Carlo simulation studies are conducted to examine the performance of the estimator. The results are applied to the longleaf pine (Pinus palustris) trees and the body fat percentage data sets to illustrate the applicability of the theoretical results.

Keywords: Kernel density estimation; Multi-stage ranked set sampling; Nonlinear dependency; Optimal bandwidth selection.

AMS Subject Classifications: primary: 94A17, 62D05; secondary: 62G07, 62P10.

Corresponding author. Email addresses: morteza.amini@ut.ac.ir (Morteza Amini), mahdizadeh.m@live.com (Mahdi Mahdizadeh)

1 Introduction

In many fields of science, including environmental, biological and ecological studies, measurement of the variable of interest is usually time-consuming and/or costly. Under simple random sampling (SRS), this leads to samples of small sizes with which statistical inference is less reliable. Thus, if the situation permits, a more efficient sampling technique may be desirable. A proper choice in such situations is ranked set sampling (RSS), which was originally introduced by McIntyre (1952) in the context of an agricultural experiment. The application of RSS in many other areas of science has been studied by many researchers, including Kvam (2003), who applied RSS to environmental monitoring, Halls and Dell (1966), who discussed the application of RSS in forestry, and Chen et al. (2005), who applied RSS in medicine.

RSS is highly applicable when a few units in a set are easily ranked without full measurement. One of the judgment ranking methods is to use an auxiliary variable for ranking. The auxiliary variable is often highly correlated with the variable of interest, and its measurement is usually inexpensive and fast. For example, in a clinical trial, measurement of the body fat percentage is time-consuming and expensive, whereas measurement of the body (waist) circumference, which is highly correlated with the body fat percentage, is inexpensive and fast. This auxiliary variable may be used to extract a ranked set sample of the values of the variable of interest. In general, the judgment ranking may be based on any means not involving formal measurement, e.g. visual ranking or a concomitant variable.

Many authors have introduced extensions of RSS to construct improved estimators of different population attributes. Ozturk (2011) proposed some variations on RSS in which rankers are permitted to declare ties. Multistage ranked set sampling (MRSS), proposed by Al-Saleh and Al-Omari (2002), is another example of such a design. The sampling methodology of MRSS is described in the following algorithm; balanced RSS is simply the case $r = 1$. A code sketch of the procedure is given after the list.

1. Randomly identify $k^{r+1}$ ($r \geq 1$) units from the target population and allocate them randomly into $k^{r-1}$ sets, each of size $k^2$.

2. For each set in step 1, partition it into $k$ samples of size $k$; apply judgment ordering, by any cheap method not involving formal measurement, to the elements of the $i$th ($i = 1,\dots,k$) sample, and identify the $i$th (judgment) smallest unit, to get a (judgment) ranked set of size $k$. This step gives $k^{r-1}$ (judgment) ranked sets, each of size $k$.

3. Repeat step 2, $r-1$ times, applying each ranking stage to the ranked sets of its previous stage. A single ranked set of size $k$ is acquired in the $(r-1)$th stage.

4. Actually measure the $k$ units identified in step 3.

5. Repeat steps 1-4, $m-1$ further times, to obtain a sample of size $n = mk$.
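The following sketch (ours, not the authors'; perfect ranking stands in for the cheap judgment ordering, and the names `ranked_set`, `mrss_cycle`, `mrss_sample` and `pop_draw` are hypothetical) implements the algorithm above for any routine `pop_draw` that returns a set of candidate units:

    import numpy as np

    def ranked_set(sets):
        # one ranking step: from k sets of size k, keep the i-th smallest
        # unit of the i-th set, giving one (judgment) ranked set of size k
        k = len(sets)
        return np.array([np.sort(sets[i])[i] for i in range(k)])

    def mrss_cycle(pop_draw, k, r):
        # one cycle of r-stage MRSS: k**(r+1) units are reduced, through r
        # ranking stages, to the k units that are actually measured
        # (r = 1 is balanced RSS)
        sets = [pop_draw(k) for _ in range(k ** r)]   # k**r raw sets of size k
        for _ in range(r):
            sets = [ranked_set(sets[j:j + k]) for j in range(0, len(sets), k)]
        return sets[0]

    def mrss_sample(pop_draw, k, m, r):
        # m independent cycles give the sample {X_[i]j}, arranged as (k, m)
        return np.column_stack([mrss_cycle(pop_draw, k, r) for _ in range(m)])

    rng = np.random.default_rng(0)
    sample = mrss_sample(rng.standard_normal, k=3, m=10, r=2)  # a DRSS sample
    print(sample.shape)                                        # (3, 10)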

We denote the sample collected using MRSS by $\{X_{[i]j} : i = 1,\dots,k;\ j = 1,\dots,m\}$, where $X_{[i]j}$ is the $i$th judgment order statistic from the $j$th cycle. The special case $r = 2$ of MRSS (Al-Saleh and Al-Kadiri, 2000) is known as double ranked set sampling (DRSS). The ranked set sample contains not only measurements on the quantified units but also additional information on their ranks. Since RSS provides more structured samples than SRS, improved inference may result from the use of the RSS design.

Investigation of RSS in nonparametric settings has attracted the attention of many researchers. Dell and Clutter (1972) formally showed that the sample mean under RSS is an unbiased estimator of the population mean regardless of ranking errors, and that it has a smaller variance than the sample mean under SRS when the numbers of measured units are the same. Stokes (1980) considered estimation of the variance and showed that improved estimates of the variance can be produced under RSS. Stokes and Sager (1988) characterized a ranked set sample as a sample from a conditional distribution, conditioning on a multinomial random vector, and applied RSS to the estimation of the cumulative distribution function. Chen (1999) studied the kernel method of density estimation in RSS. Deshpande et al. (2006) obtained nonparametric confidence intervals for quantiles of the population based on a ranked set sample.

Since the concept of differential entropy was introduced in Shannon's (1948) paper, it has been of great interest, both because of its wide applications, especially in signal analysis and the analysis of neural data, and because of its interesting theoretical properties. The concept of mutual information, defined in terms of the Shannon entropy, is also well known, especially as a well-defined dependence measure (see Cover and Thomas, 2012 for more details). The mutual information has several advantages over the classical dependence measures, such as Pearson's, Spearman's and Kendall's correlation coefficients. The Pearson correlation coefficient measures only linear dependence, while the Spearman and Kendall correlation coefficients use only the ranks of the observations. The mutual information enables us to measure both linear and nonlinear dependence between two or more variables.

Joe (1989) studied the large sample properties of the kernel-based estimator of the entropy based on SRS; he also obtained the optimal bandwidth and a proper kernel function. In this paper, we are especially interested in estimating the entropy based on RSS and MRSS by adapting the work of Joe (1989). The proposed estimator is applied to estimate the mutual information of variables as well as the Kullback-Leibler divergence measure. We compare the proposed kernel-based estimator of the entropy based on RSS and MRSS samples with that of Joe (1989) by approximating the relative efficiency of the proposed estimator with respect to its competitor in SRS. We observe that the RSS and MRSS methods substantially improve the kernel-based estimation of the entropy.

The outline of this paper is as follows. Section 2 introduces a nonparametric estimator of the entropy in RSS. The approximate bias and mean square error (MSE) of the proposed estimator are derived, the choice of kernel and the optimal bandwidth are discussed, and estimation of the MSE of the entropy estimator and approximation of its efficiency relative to the corresponding estimator in SRS are studied. Applications of the proposed estimator to estimation of the mutual information and the Kullback-Leibler divergence criteria are discussed in Section 3. A simulation study and two real data examples are presented in Section 4 to examine the performance of the proposed estimators and to illustrate the applicability of the theoretical results.

2 The proposed estimator

Let $f_X$ be a $p$-variate ($p \geq 1$) probability density function (pdf) with respect to the Lebesgue measure $\mu$, with distribution function $F_X$. We consider estimation of

$$H(X) = -\int_S f_X \log f_X \, d\mu,$$

using a ranked set sample, where $S$ is a bounded set such that $f_X$ is bounded below on $S$ by a positive constant and $\int_S f_X \log f_X \, d\mu \approx \int_{\mathbb{R}^p} f_X \log f_X \, d\mu$. We assume that $f_X$ has continuous first and second derivatives, which are dominated by integrable functions.

For each $i$, $X_{[i]1},\dots,X_{[i]m}$ are independent and identically distributed as $X_{[i]}$. For $p = 1$ and ordinary RSS ($r = 1$), when the ranking is perfect, $X_{[i]}$ is distributed as the $i$th order statistic of a sample of size $k$ from the parent distribution $F_X$, while $X_{[1]},\dots,X_{[k]}$ are independent. For $p = 2$, if the values of $X = (X^{(1)}, X^{(2)})$ are ranked by $X^{(1)}$, then the distribution of $X_{[i]}$ is identical to the joint distribution of the $i$th order statistic and its concomitant in an iid sample of size $k$ from $f_X$. For perfect ranking in the DRSS case ($r = 2$), the distribution of $X_{[i]}$ is identical to that of the $i$th order statistic and its concomitants of the independent but not identically distributed sample $X_{[1]},\dots,X_{[k]}$.

We assume throughout that the ranked set sample $X_{[i]j}$, $i = 1,\dots,k$, $j = 1,\dots,m$, is consistent, that is,

(C1) for each $x$, $\frac{1}{k}\sum_{i=1}^{k} F_{X_{[i]}}(x) = F_X(x)$.

It is well known that if the ranking of $X$ is performed using one of its components, then both RSS and MRSS samples are consistent (see, e.g., Al-Saleh and Al-Omari (2002) for a proof of consistency in the DRSS case).
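Under perfect ranking with $p = 1$, $F_{X_{[i]}}(x)$ is the Beta$(i, k-i+1)$ cdf evaluated at $F_X(x)$, so condition (C1) can be checked numerically; the short sketch below (our own illustration, not from the paper) does this for a standard normal parent:

    import numpy as np
    from scipy.stats import beta, norm

    k = 5
    x = np.linspace(-3.0, 3.0, 7)
    u = norm.cdf(x)                      # F_X(x) for the standard normal parent

    # F_{X_[i]}(x) = P(i-th of k uniform order statistics <= F_X(x))
    F_ranked = np.array([beta.cdf(u, i, k - i + 1) for i in range(1, k + 1)])

    # condition (C1): averaging the k ranked cdfs recovers F_X
    print(np.max(np.abs(F_ranked.mean(axis=0) - u)))   # ~ 1e-16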

For any $t$ in the support of $X$, the kernel density estimate of $f_X(t)$ based on the ranked set sample $X_{[i]j}$, $i = 1,\dots,k$, $j = 1,\dots,m$, is (Chen, 1999)

$$\hat f_n^{RSS}(t) = \frac{1}{n\gamma^p} \sum_{i=1}^{k}\sum_{j=1}^{m} K_p\left(\frac{t - X_{[i]j}}{\gamma}\right),$$

where $K_p$ is a kernel function and $\gamma$ is a positive bandwidth. We assume throughout that $K_p(u_1,\dots,u_p) = \prod_{j=1}^{p} k_0(u_j)$, where $k_0$ is a univariate kernel function satisfying the following conditions:

(C2) $k_0(u) = k_0(-u)$;
(C3) $\int k_0(u)\, du = 1$;
(C4) $\int u^2 k_0(u)\, du = 1$.

Let $F_n^{RSS}(t) = \frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{m} I(X_{[i]j} \leq t)$ be the estimator of $F_X$ in RSS; then

$$\hat f_n^{RSS}(t) = \frac{1}{\gamma^p}\int K_p\left(\frac{t-u}{\gamma}\right) dF_n^{RSS}(u).$$

We propose the estimator of the entropy of $X$, $H(X)$, in RSS based on $X_{[i]j}$, $i = 1,\dots,k$, $j = 1,\dots,m$, as

$$H_n^{RSS}(X) = -\frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{m} \log \hat f_n^{RSS}(X_{[i]j})\, I_S(X_{[i]j}) = -\int_S \log \hat f_n^{RSS}(x)\, dF_n^{RSS}(x).$$

The asymptotic properties of the analogous estimator $H_n^{SRS}(X)$ were studied by Joe (1989). In the next section, we study the asymptotic properties of $H_n^{RSS}(X)$.

2.1 Characteristics of the estimator

In this section, we approximate the MSE and the bias of $H_n^{RSS}(X)$. With a suitable choice of the bandwidth $\gamma$ (see Section 4), we deduce that $\mathrm{MSE}(H_n^{RSS}(X))$ is $O(n^{-1})$, that is, $H_n^{RSS}(X)$ is a consistent estimator of $H(X)$. To prove the results, we need the following lemma, which is an analog of Lemma 2.1 of Joe (1989).

Lemma 1 Let $U_n(x) = \sqrt{n}\,(F_n^{RSS}(x) - F_X(x))$. Then, under condition (C1),

(i) for any integrable function $a(x)$ with $E(a(X)) < \infty$, we have $E\int a(x)\, dU_n(x) = 0$;
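A minimal sketch of $\hat f_n^{RSS}$ and $H_n^{RSS}(X)$ (our own; a standard Gaussian $k_0$ is used purely as a placeholder, since the kernels recommended by the paper appear in Section 2.2, and $I_S$ is taken as identically one):

    import numpy as np

    def kde_rss(t, data, gamma):
        # product-kernel density estimate at the points t (q, p), built from
        # the pooled ranked set sample `data` (n, p) with bandwidth gamma
        n, p = data.shape
        u = (t[:, None, :] - data[None, :, :]) / gamma          # (q, n, p)
        k0 = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)       # placeholder k0
        return k0.prod(axis=2).sum(axis=1) / (n * gamma ** p)

    def entropy_rss(data, gamma):
        # H_n^RSS = -(1/n) sum_{i,j} log f_n(X_[i]j), with I_S == 1
        return -np.mean(np.log(kde_rss(data, data, gamma)))

    rng = np.random.default_rng(1)
    x = rng.standard_normal((30, 1))     # stand-in for a pooled RSS sample
    print(entropy_rss(x, gamma=0.5))     # target: 0.5*log(2*pi*e) ~ 1.4189

Note that the estimator of Chen (1999) pools all $n = mk$ measured points, regardless of rank, into a single kernel density estimate, which is why the rank structure does not appear explicitly in this sketch.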

(ii) for any integrable function $a(x,y)$, we have

$$E\int\int a(x,y)\, dU_n(x)\, dU_n(y) = \int a(x,x)\, dF_X(x) - \frac{1}{k}\sum_{i=1}^{k}\int\int a(x,y)\, f_{X_{[i]}}(x)\, f_{X_{[i]}}(y)\, dx\, dy.$$

Proof. See the Appendix.

By using Lemma 1, we get the following result.

Theorem 1 (i) The bias of $H_n^{RSS}(X)$ is

$$E(H_n^{RSS}(X)) - H(X) = (H_\gamma - H(X)) + n^{-1}\alpha_1 + O(n^{-3/2}),$$

where $H_\gamma = -E\big(I_S(X)\log(\bar f_\gamma(X))\big)$, $\bar f_\gamma(x) = E\left(\frac{1}{\gamma^p} K_p\left(\frac{x - X}{\gamma}\right)\right)$,

$$\alpha_1 = \int_S B(x,x)\, dF_X(x) - \frac{1}{k}\sum_{i=1}^{k}\int\int B(x,y)\, f_{X_{[i]}}(x)\, f_{X_{[i]}}(y)\, dx\, dy,$$

and

$$B(x,y) = -\frac{1}{2}\int_S \frac{\frac{1}{\gamma^{2p}} K_p\left(\frac{z-x}{\gamma}\right) K_p\left(\frac{z-y}{\gamma}\right)}{(\bar f_\gamma(z))^2}\, dF_X(z) + \frac{\frac{1}{\gamma^{p}} K_p\left(\frac{y-x}{\gamma}\right)}{\bar f_\gamma(x)}\, I_S(x). \qquad (1)$$

(ii) The MSE of $H_n^{RSS}(X)$ is

$$E(H_n^{RSS}(X) - H(X))^2 = (H_\gamma - H(X))^2 + n^{-1}\big(\alpha_2 - 2\alpha_1 (H_\gamma - H(X))\big) + O(n^{-3/2}), \qquad (2)$$

where

$$\alpha_2 = \int_S A^2(x)\, dF_X(x) - \frac{1}{k}\sum_{i=1}^{k}\left(\int A(x)\, dF_{X_{[i]}}(x)\right)^2,$$

and

$$A(x) = \int \frac{\frac{1}{\gamma^{p}} K_p\left(\frac{y-x}{\gamma}\right)}{\bar f_\gamma(y)}\, dF_X(y) + \log(\bar f_\gamma(x))\, I_S(x). \qquad (3)$$

Proof. See the Appendix.

2.2 Choosing the optimal bandwidth and kernel

To choose an optimal value for the bandwidth and a suitable kernel function, note that

$$\alpha_1 = \theta + \gamma^{-p} K(0) \int_S \bar f_\gamma(x)^{-1}\, dF_X(x) - \frac{\gamma^{-p}\kappa_2}{2} \int_S \frac{\tilde f_\gamma(z)}{\bar f_\gamma(z)^2}\, dF_X(z)$$
$$\phantom{\alpha_1} = \theta - \gamma^{-p}\left(\frac{\kappa_2}{2} - K(0)\right) \int_S \bar f_\gamma(x)^{-1}\, dF_X(x) - \frac{\gamma^{-p}\kappa_2}{2} \int_S \frac{\tilde f_\gamma(z) - \bar f_\gamma(z)}{\bar f_\gamma(z)^2}\, dF_X(z),$$

where

$$\theta = -\frac{1}{k}\sum_{i=1}^{k}\int\int B(x,y)\, f_{X_{[i]}}(x)\, f_{X_{[i]}}(y)\, dx\, dy,$$

$\tilde f_\gamma(z) = E\big(I_S(X)\, \gamma^{-p} K^2((z - X)/\gamma)\big)/\kappa_2$ and $\kappa_2 = \int K^2(u)\, du$. The term $\theta$ is $O(1)$. In addition,

$$\bar f_\gamma(z) - \tilde f_\gamma(z) = \frac{\gamma^{2}}{2}\, \mathrm{tr}\, f''_X(z) \left[\int \frac{v^2 k_0^2(v)}{\kappa_{02}}\, dv - 1\right] + O(\gamma^{2}),$$

where $\kappa_{02} = \int k_0^2(u)\, du$. Thus, if

$$k_0(0) = \kappa_{02}\, 2^{-1/p} \qquad (4)$$

and

$$\int \frac{v^2 k_0^2(v)}{\kappa_{02}}\, dv = 1, \qquad (5)$$

then we have

$$\alpha_1 = O(1) + O(\gamma^{2-p}).$$

Joe (1989) showed that a kernel function of the form

$$k_0(u) = \begin{cases} \eta_1 + \eta_2 |u|, & |u| < \xi_1, \\ \eta_3 - \eta_4 |u|, & \xi_1 < |u| < \xi_2, \end{cases} \qquad (6)$$

satisfies conditions (4) and (5), and computed $\eta_i$, $\xi_j$, $i = 1,\dots,4$, $j = 1, 2$, for $p = 1,\dots,4$. For $p \leq 2$, a competitor of the kernel function in (6) would be

$$k_0(u) = (4\pi)^{-0.5} \exp\{-u^2/4\}, \qquad (7)$$

which satisfies condition (5).
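The claim that the kernel in (7) satisfies (C3) and condition (5) is easy to confirm numerically; a check of our own:

    import numpy as np
    from scipy.integrate import quad

    k0 = lambda u: (4.0 * np.pi) ** -0.5 * np.exp(-u ** 2 / 4.0)  # kernel (7)

    kappa02 = quad(lambda u: k0(u) ** 2, -np.inf, np.inf)[0]
    print(quad(k0, -np.inf, np.inf)[0])                           # (C3): 1.0
    print(quad(lambda u: u ** 2 * k0(u) ** 2 / kappa02,
               -np.inf, np.inf)[0])                               # (5): 1.0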

Using (7), we have, for $p \leq 2$,

$$\alpha_1 = O(1) + O(\gamma^{-p}).$$

On the other hand, since $H_\gamma - H(X) = O(\gamma^2)$, for the kernel function in (6) we get

$$E(H_n^{RSS}(X) - H(X))^2 = O(n^{-1}) + O(n^{-1}\gamma^{2}) + O(n^{-1}\gamma^{4-p}) + O(\gamma^{4})$$

and

$$E(H_n^{RSS}(X) - H(X)) = O(n^{-1}) + O(n^{-1}\gamma^{2-p}) + O(\gamma^{2}),$$

while for the kernel function in (7) we get

$$E(H_n^{RSS}(X) - H(X))^2 = O(n^{-1}) + O(n^{-1}\gamma^{2}) + O(n^{-1}\gamma^{2-p}) + O(\gamma^{4})$$

and

$$E(H_n^{RSS}(X) - H(X)) = O(n^{-1}) + O(n^{-1}\gamma^{-p}) + O(\gamma^{2}).$$

Using the kernel function in (6), for $p \leq 2$, setting $\gamma = O(n^{-1/(2+0.5p)})$, we deduce that

$$E(H_n^{RSS}(X) - H(X))^2 = O(n^{-1}) + O(n^{-4/(2+0.5p)})$$

and

$$E(H_n^{RSS}(X) - H(X)) = O(n^{-2/(2+0.5p)}).$$

Using the kernel function in (7), for $p \leq 2$, setting $\gamma = O(n^{-1/(2+0.5p)})$, we have

$$E(H_n^{RSS}(X) - H(X))^2 = O(n^{-1}) + O(n^{-(4-0.5p)/(2+0.5p)})$$

and

$$E(H_n^{RSS}(X) - H(X)) = O(n^{-(2-0.5p)/(2+0.5p)}).$$

Thus, using either of the kernels in (6) and (7), for $p \leq 2$,

$$E(H_n^{RSS}(X) - H(X))^2 = O(n^{-1}).$$

Although the rate of convergence of the bias is somewhat faster with the kernel in (6), simulations show that, for $p \leq 2$, the kernel function in (7) yields better performance of the entropy estimator than the kernel function in (6), especially for smaller sample sizes.

From the above discussion, for both kernel functions in (6) and (7), and for $p \leq 2$, the optimal bandwidth is of the form $\gamma = c\, n^{-1/(2+0.5p)}$, where $c$ depends on the parameters of $f_X$. To discuss a suitable value of $c$, we first study the behavior of the optimal value of $\gamma$ through simulation, obtained by minimizing the cross-validation criterion

$$CV_\gamma = \frac{1}{m}\sum_{j=1}^{m}\left(H_n^{RSS}(X) - H_n^{RSS}(X^{(-j)})\right)^2, \qquad (8)$$

where $X^{(-j)} = \{X_{[i]l} : i = 1,\dots,k;\ l\,(\neq j) = 1,\dots,m\}$.
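A sketch of the leave-one-cycle-out computation of (8) (ours; it reuses `entropy_rss` from the earlier sketch and assumes the sample is stored with shape (k, m, p)); the signed average of the same differences, which is used as $D_\gamma$ in Section 2.3, is returned alongside:

    import numpy as np

    def cv_gamma(data_km, gamma):
        # data_km: ranked set sample of shape (k, m, p)
        k, m, p = data_km.shape
        full = entropy_rss(data_km.reshape(k * m, p), gamma)
        diffs = np.array([
            full - entropy_rss(
                np.delete(data_km, j, axis=1).reshape(k * (m - 1), p), gamma)
            for j in range(m)])
        return np.mean(diffs ** 2), np.mean(diffs)   # CV_gamma and D_gamma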

Simulations show that the bandwidth should decrease as the dependence in a bivariate density becomes stronger; accordingly, if $\Sigma$ is the covariance matrix of the population, the constant $c$ depends on $\Sigma$. For the multivariate normal distribution with covariance matrix $\Sigma$ (with $|\Sigma|$ not too close to zero), a rough approximation of the proportion $\hat\alpha$ defined below is $\int_R \varphi(x, \Sigma)\, dx$, where $R = [-0.674, 0.674]^p$ (see Joe, 1989). Thus, we take the optimal bandwidth to be

$$\gamma = d_1\, n^{-\hat\alpha/(2+0.5p)}\, \overline{\mathrm{IQR}}, \qquad (9)$$

where $d_1$ is a suitable normalizing constant, which is decreasing in $p$, $\hat\alpha$ is the proportion of the data falling in the rectangle $\prod_{j=1}^{p} [q_{j1}, q_{j3}]$, $q_{j1}$ and $q_{j3}$ are the lower and upper quartiles of the $j$th coordinate, and $\overline{\mathrm{IQR}}$ is the average of the interquartile ranges. Simulations suggest that, for $p \leq 2$, the rule (9) works quite well with the kernel in (7), for $\rho = 0.5(0.1)0.9$, with the values of $d_1$ given in Table 1, for $k = 3, 5$, $p = 1, 2$ and $r = 1, 2$. The value of $d_1$ corresponding to the minimum of the simulated MSE of the estimator $H_n^{RSS}(X)$ over the grid of values $d_1 = 0.5(0.05)2.00$ is chosen as the optimal value of $d_1$.

Remark 1. Simulations suggest that, for $p = 3, 4$, the bandwidth in (9) works quite well with the kernel in (6) under the multivariate normal distribution. We have not provided the suitable values of $d_1$ for this case, for the sake of brevity; however, an illustrative example is provided for the case $p = 3$ in Section 4.3.
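The rule (9) translates into a few lines of code; note that the exact placement of $\hat\alpha$ in the exponent follows our reading of the display above, so this sketch (with the hypothetical name `bandwidth_rule`) should be treated as an interpretation rather than the authors' definitive rule:

    import numpy as np

    def bandwidth_rule(data, d1):
        # gamma = d1 * n^(-alpha_hat/(2+0.5p)) * mean interquartile range
        n, p = data.shape
        q1 = np.quantile(data, 0.25, axis=0)
        q3 = np.quantile(data, 0.75, axis=0)
        alpha_hat = np.mean(np.all((data >= q1) & (data <= q3), axis=1))
        return d1 * n ** (-alpha_hat / (2.0 + 0.5 * p)) * np.mean(q3 - q1)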

Table 1: The suitable choices of $d_1$ in (9) for the estimator $H_n(X)$ under the bivariate normal distribution, for $n = 30$, $k = 3, 5$, $p = 1, 2$, $r = 1, 2$ and $\rho = 0.5(0.1)0.9$.

2.3 Estimation of the MSE of the estimator

Computations show that $CV_\gamma$ in (8) has a large negative bias as an estimator of the MSE of $H_n^{RSS}(X)$. To propose a less biased estimator of the MSE of $H_n^{RSS}(X)$, we use the approximate MSE in Theorem 1, in which $CV_\gamma$ is employed as an estimator of the term $(H_\gamma - H(X))^2$. Since $H_\gamma - H(X)$ is negative for suitable values of $\gamma$, we use $-D_\gamma$ as an estimator of $H_\gamma - H(X)$, where

$$D_\gamma = \frac{1}{m}\sum_{j=1}^{m}\left(H_n^{RSS}(X) - H_n^{RSS}(X^{(-j)})\right).$$

Thus, our proposed estimator of the MSE of $H_n^{RSS}(X)$ is

$$\hat M_\gamma = \left(CV_\gamma + n^{-1}(\hat\alpha_2 + 2\hat\alpha_1 D_\gamma)\right)\delta\left(CV_\gamma + n^{-1}(\hat\alpha_2 + 2\hat\alpha_1 D_\gamma)\right), \qquad (10)$$

where

$$\delta(t) = \begin{cases} 1, & t \geq 0, \\ 0, & t < 0, \end{cases}$$

$$\hat\alpha_1 = \frac{1}{mk}\sum_{i=1}^{k}\sum_{j=1}^{m} \hat B(X_{[i]j}, X_{[i]j})\, I_S(X_{[i]j}) - \frac{1}{m(m-1)k}\sum_{i=1}^{k}\sum_{j\neq j'} \hat B(X_{[i]j}, X_{[i]j'})\, I_S(X_{[i]j})\, I_S(X_{[i]j'}),$$

$$\hat\alpha_2 = \frac{1}{mk}\sum_{i=1}^{k}\sum_{j=1}^{m} \hat A^2(X_{[i]j})\, I_S(X_{[i]j}) - \frac{1}{k}\sum_{i=1}^{k}\left[\frac{1}{m}\sum_{j=1}^{m} \hat A(X_{[i]j})\, I_S(X_{[i]j})\right]^2,$$

$$\hat B(x,y) = -\frac{1}{2mk}\sum_{i=1}^{k}\sum_{j=1}^{m} \frac{\frac{1}{\gamma^{2p}} K_p\left(\frac{X_{[i]j}-x}{\gamma}\right) K_p\left(\frac{X_{[i]j}-y}{\gamma}\right)}{(\hat f_n^{RSS}(X_{[i]j}))^2}\, I_S(X_{[i]j}) + \frac{\frac{1}{\gamma^{p}} K_p\left(\frac{y-x}{\gamma}\right)}{\hat f_n^{RSS}(x)}\, I_S(x),$$

and

$$\hat A(x) = \frac{1}{mk}\sum_{i=1}^{k}\sum_{j=1}^{m} \frac{\frac{1}{\gamma^{p}} K_p\left(\frac{X_{[i]j}-x}{\gamma}\right)}{\hat f_n^{RSS}(X_{[i]j})}\, I_S(X_{[i]j}) + \log(\hat f_n^{RSS}(x))\, I_S(x).$$

The performance of the estimators in (8) and (10) is examined through the simulation studies in Section 4.1; a code sketch assembling (10) is given below.
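Assembling the pieces, a sketch of the corrected MSE estimator (10) (ours; it relies on `entropy_rss` and `cv_gamma` above, uses the Gaussian placeholder kernel, and takes $I_S \equiv 1$):

    import numpy as np

    def mse_estimate(data_km, gamma):
        k, m, p = data_km.shape
        X = data_km.reshape(k * m, p)
        n = k * m
        # Kxx[s, t] = gamma^{-p} K_p((X_s - X_t) / gamma), Gaussian k0
        u = (X[:, None, :] - X[None, :, :]) / gamma
        Kxx = np.exp(-0.5 * u ** 2).prod(axis=2) / \
            ((2.0 * np.pi) ** (p / 2.0) * gamma ** p)
        f = Kxx.mean(axis=1)                       # f_n at the sample points
        # plug-in A_hat at the sample points, cf. (3)
        A = (Kxx / f[:, None]).mean(axis=0) + np.log(f)
        A_km = A.reshape(k, m)
        alpha2 = np.mean(A ** 2) - np.mean(A_km.mean(axis=1) ** 2)
        # plug-in B_hat at all pairs of sample points, cf. (1)
        B = -0.5 * (Kxx / f[:, None] ** 2).T @ Kxx / n + Kxx / f[:, None]
        diag = np.mean(np.diag(B))
        cross = 0.0
        for i in range(k):                         # same rank, distinct cycles
            blk = B[i * m:(i + 1) * m, i * m:(i + 1) * m]
            cross += (blk.sum() - np.trace(blk)) / (m * (m - 1))
        alpha1 = diag - cross / k
        cv, d = cv_gamma(data_km, gamma)
        raw = cv + (alpha2 + 2.0 * alpha1 * d) / n
        return max(raw, 0.0)                       # the delta truncation in (10)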

2.4 The relative efficiency

To approximate the RE of $H_n^{RSS}(X)$ with respect to $H_n^{SRS}(X)$, we use the approximate MSE of $H_n^{SRS}(X)$ obtained by Joe (1989),

$$E(H_n^{SRS}(X) - H(X))^2 = (H_\gamma - H(X))^2 + n^{-1}\big(\beta_2 - 2\beta_1 (H_\gamma - H(X))\big) + O(n^{-3/2}), \qquad (11)$$

where

$$\beta_1 = \int_S B(x,x)\, dF_X(x) - \int\int B(x,y)\, dF_X(x)\, dF_X(y) = \int_S B(x,x)\, dF_X(x) - \frac{1}{2},$$

and

$$\beta_2 = \int_S A^2(x)\, dF_X(x) - \left(\int A(x)\, dF_X(x)\right)^2 = \int_S A^2(x)\, dF_X(x) - (1 - H_\gamma)^2,$$

in which $A(x)$ and $B(x,y)$ are given in (3) and (1), respectively. Joe (1989) also showed that the bias of $H_n^{SRS}(X)$ is

$$E(H_n^{SRS}(X)) - H(X) = (H_\gamma - H(X)) + n^{-1}\beta_1 + O(n^{-3/2}).$$

As a corollary of Theorem 1, and using (11), the RE of $H_n^{RSS}(X)$ relative to $H_n^{SRS}(X)$ is approximated as

$$RE(H_n^{RSS}(X), H_n^{SRS}(X)) \approx 1 + \frac{n^{-1}\big(\beta_2 - \alpha_2 - 2(\beta_1 - \alpha_1)(H_\gamma - H(X))\big)}{(H_\gamma - H(X))^2 + n^{-1}\big(\alpha_2 - 2\alpha_1 (H_\gamma - H(X))\big)}. \qquad (12)$$

Using the Cauchy-Schwarz inequality for sums and the consistency of the RSS sample, we have

$$\frac{1}{k}\sum_{i=1}^{k}\left(\int A(x)\, dF_{X_{[i]}}(x)\right)^2 \geq \left(\frac{1}{k}\sum_{i=1}^{k}\int A(x)\, dF_{X_{[i]}}(x)\right)^2 = \left(\int A(x)\, dF_X(x)\right)^2,$$

and consequently $\beta_2 \geq \alpha_2$.

Numerical computations under the bivariate normal distribution, for different values of $k$, $m$ and $\rho$, show that, for a suitable choice of $\gamma$, the approximated relative efficiency in (12) is greater than 1. The computed values of the approximated relative efficiencies of the estimators of $H(X^{(1)})$ are given in Table 2, for different values of $\rho$, $n$ and $k$, for the RSS and DRSS schemes, under the bivariate normal distribution. Since the values given in Table 2 are population parameters, different optimal values of $\gamma$ are needed for this computation. We used $\gamma = c\, n^{-1/(2+0.5p)}$, with suitable values of the normalizing constant $c$. It is found that, for the SRS scheme, suitable values of $c$ are $c = 1.35$ for $\rho = 0.9$ and $c = 1.30$ for $\rho = 0.8$. For RSS, we used $c = 1.40$ for $k = 3$ and $\rho = 0.9$, $c = 1.35$ for $k = 3$ and $\rho = 0.8$, $c = 1.65$ for $k = 5$ and $\rho = 0.9$, and $c = 1.60$ for $k = 5$ and $\rho = 0.8$.

Table 2: Approximated relative efficiencies, under the bivariate normal distribution, for different values of $\rho$, $n$ and $k$, for the RSS and DRSS schemes.

For DRSS, our suggested values are $c = 1.45$ for $k = 3$ and $\rho = 0.9$, $c = 1.40$ for $k = 3$ and $\rho = 0.8$, $c = 1.70$ for $k = 5$ and $\rho = 0.9$, and $c = 1.65$ for $k = 5$ and $\rho = 0.8$.

As one can see from Table 2, all approximated values of $RE(H_n^{RSS}(X^{(1)}), H_n^{SRS}(X^{(1)}))$ are greater than 1. Also, the relative efficiencies decrease as $\rho$ and/or $n$ increase, and also as $k$ decreases. For DRSS, the relative efficiencies are greater than for RSS, as expected.

Finally, using the results of Al-Saleh and Al-Omari (2002), we have, for $i = 1,\dots,k$ and $x$ in the support of $X$,

$$\lim_{r\to\infty} f_{X_{[i]}}(x) = \begin{cases} k f_X(x), & F_X^{-1}\left(\frac{i-1}{k}\right) \leq x < F_X^{-1}\left(\frac{i}{k}\right), \\ 0, & \text{otherwise}, \end{cases} \qquad (13)$$

and consequently, for each $x, y$ in the support of $X$,

$$\lim_{r\to\infty} \frac{1}{k}\sum_{i=1}^{k} f_{X_{[i]}}(x)\, f_{X_{[i]}}(y) = k f_X(x)\, f_X(y).$$

Thus

$$\lim_{r\to\infty} \alpha_1 = \beta_1 - \frac{k-1}{2}, \qquad \lim_{r\to\infty} \alpha_2 = \beta_2 - (k-1)(1 - H_\gamma)^2,$$

and thereby

$$\lim_{r\to\infty} RE(H_n^{RSS}(X), H_n^{SRS}(X)) \approx 1 + \frac{n^{-1}(k-1)\big((1 - H_\gamma)^2 - H_\gamma + H(X)\big)}{(H_\gamma - H(X))^2 + n^{-1}\left(\beta_2 - (k-1)(1 - H_\gamma)^2 - 2\left(\beta_1 - \frac{k-1}{2}\right)(H_\gamma - H(X))\right)}.$$

3 Applications

In addition to estimation of the entropy, the developed results may be applied to estimation of the Kullback-Leibler distance measure as well as the mutual information criterion. The estimator of the Kullback-Leibler distance measure can be used to construct homogeneity test statistics, while the mutual information is well known as a measure of dependence. These applications are described in the following sections.

3.1 Application to estimation of the mutual information

For $p \geq 2$ and $X = (X^{(1)}, X^{(2)})$, the mutual information of $X^{(1)}$ and $X^{(2)}$, defined as

$$I(X^{(1)}, X^{(2)}) = H(X^{(1)}) + H(X^{(2)}) - H(X) = \int f_X \log\left(\frac{f_X}{f_{X^{(1)}}\, f_{X^{(2)}}}\right) d\mu,$$

which is the Kullback-Leibler divergence between the joint distribution of $X = (X^{(1)}, X^{(2)})$ and the product of its marginal pdfs, is used as a popular and well-defined nonlinear correlation criterion. If, for example, $X$ follows the bivariate normal distribution with correlation parameter $\rho$, the mutual information between $X^{(1)}$ and $X^{(2)}$ is $I(X^{(1)}, X^{(2)}) = -0.5\log(1 - \rho^2)$. Motivated by this relationship, and using the fact that $I(X^{(1)}, X^{(2)}) \geq 0$, a standardized nonlinear correlation measure based on the mutual information may be defined as

$$I^*(X^{(1)}, X^{(2)}) = 1 - e^{-2 I(X^{(1)}, X^{(2)})} \in (0, 1). \qquad (14)$$

Using the proposed estimator of the entropy, the mutual information $I(X^{(1)}, X^{(2)})$ is estimated by

$$\hat I(X^{(1)}, X^{(2)}) = H_n^{RSS}(X^{(1)}) + H_n^{RSS}(X^{(2)}) - H_n^{RSS}(X) = \int \log\left(\frac{\hat f_n^{X;RSS}(x_1, x_2)}{\hat f_n^{X^{(1)};RSS}(x_1)\, \hat f_n^{X^{(2)};RSS}(x_2)}\right) dF_n^{X;RSS}(x_1, x_2), \qquad (15)$$

where $H_n^{RSS}(X^{(l)})$ is the estimator of $H(X^{(l)})$ in RSS based on $X^{(l)}_{[i]j}$, $i = 1,\dots,k$, $j = 1,\dots,m$, $l = 1, 2$, and $H_n^{RSS}(X)$ is the estimator of $H(X)$ based on $(X^{(1)}_{(i)j}, X^{(2)}_{[i]j})$, $i = 1,\dots,k$, $j = 1,\dots,m$, that is,

$$H_n^{RSS}(X) = -\frac{1}{mk}\sum_{i=1}^{k}\sum_{j=1}^{m} \log \hat f_n^{RSS}(X^{(1)}_{(i)j}, X^{(2)}_{[i]j})\, I_S(X^{(1)}_{(i)j}, X^{(2)}_{[i]j}).$$

Using the kernel function $K_p(u_1,\dots,u_p) = \prod_{j=1}^{p} k_0(u_j)$ and the same bandwidth parameter for the estimation of $\hat f_n^{X;RSS}(x_1, x_2)$, $\hat f_n^{X^{(1)};RSS}(x_1)$ and $\hat f_n^{X^{(2)};RSS}(x_2)$, it is straightforward to show that $\hat I(X^{(1)}, X^{(2)}) \geq 0$. The corresponding estimator of $I^*(X^{(1)}, X^{(2)})$ is obtained by plugging $\hat I(X^{(1)}, X^{(2)})$ into (14).
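A sketch of (15) and (14) for a pooled bivariate RSS sample (ours, reusing `kde_rss` above with its placeholder kernel and one common bandwidth, and taking $I_S \equiv 1$):

    import numpy as np

    def mutual_info_rss(data2, gamma):
        # data2: pooled RSS sample of shape (n, 2)
        f12 = kde_rss(data2, data2, gamma)                # joint density
        f1 = kde_rss(data2[:, :1], data2[:, :1], gamma)   # marginal of X^(1)
        f2 = kde_rss(data2[:, 1:], data2[:, 1:], gamma)   # marginal of X^(2)
        I_hat = np.mean(np.log(f12 / (f1 * f2)))          # estimator (15)
        return I_hat, 1.0 - np.exp(-2.0 * I_hat)          # I* from (14)

For a bivariate normal sample with $\rho = 0.8$, say, the targets would be $I = -0.5\log(0.36) \approx 0.511$ and $I^* = 0.64$.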

Table 3: The suitable choices of $d_1$ in (9) for the estimator in (15) under the bivariate normal distribution, for $n = 30$, $p = 2$ and different values of $k$, $r$ and $\rho$.

Simulations show that the bandwidth parameter in (9) works well for the estimator in (15), with $p$ equal to the dimension of $X = (X^{(1)}, X^{(2)})$, for suitable values of the constant $d_1$. Table 3 presents the suitable choices of $d_1$ in (9) for the estimator in (15), under the bivariate normal distribution, for $n = 30$, $p = 2$ and different values of $k$, $r$ and $\rho = 0.5(0.1)0.9$. The value of $d_1$ corresponding to the minimum of the simulated MSE of the estimator in (15), over the grid of values $d_1 = 0.5(0.05)2.00$, is chosen as the optimal value of $d_1$.

3.2 Application to estimation of the Kullback-Leibler divergence

Using the proposed estimator of the entropy, the Kullback-Leibler divergence between $f_{X^{(1)}}$ and $f_{X^{(2)}}$, defined as

$$KL(f_{X^{(1)}}, f_{X^{(2)}}) = \int f_{X^{(1)}} \log\left(\frac{f_{X^{(1)}}}{f_{X^{(2)}}}\right) d\mu, \qquad (16)$$

can be estimated as

$$\widehat{KL}(f_{X^{(1)}}, f_{X^{(2)}}) = \int \log\left(\frac{\hat f_n^{X^{(1)};RSS}(x)}{\hat f_n^{X^{(2)};RSS}(x)}\right) dF_n^{X^{(1)};RSS}(x) = \frac{1}{mk}\sum_{i=1}^{k}\sum_{j=1}^{m} \log\frac{\hat f_n^{X^{(1)};RSS}(X^{(1)}_{[i]j})}{\hat f_n^{X^{(2)};RSS}(X^{(1)}_{[i]j})}. \qquad (17)$$

Statistical properties of the estimator in (17), as well as its application to homogeneity testing, remain open problems.

4 Numerical studies

In this section, several numerical studies are conducted to examine the performance of the proposed estimators and to illustrate the theoretical results.
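The estimator (17) is a one-liner on top of the same machinery (again ours, with the placeholder kernel):

    import numpy as np

    def kl_rss(sample1, sample2, gamma):
        # (17): average, over the first sample, of the log ratio of the two
        # univariate kernel density estimates evaluated at those points
        f1 = kde_rss(sample1, sample1, gamma)  # estimate of f_{X^(1)} at sample1
        f2 = kde_rss(sample1, sample2, gamma)  # estimate of f_{X^(2)} at sample1
        return np.mean(np.log(f1 / f2))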

A simulation study under the bivariate normal distribution is conducted to examine the performance of the estimators in (8) and (10), and a data analysis is performed to examine the performance of the estimators on a real data example. Finally, the proposed theoretical results are applied to the problem of variable selection based on the estimator of the mutual information as a correlation measure.

4.1 Simulation study

In order to compare the performance of $CV_\gamma$ in (8) and $\hat M_\gamma$ in (10), two separate simulation studies were conducted, each with 10,000 iterations. Since computation of the multiple integrals in $\alpha_i$ and $\beta_i$, $i = 1, 2$, is very time-consuming, we considered the simple case $p = 2$, $p_1 = p_2 = 1$, and estimation of $H(X^{(1)})$, under the bivariate normal distribution with correlation parameter $\rho$, using the kernel in (7) and the bandwidth in (9). The variable $X^{(2)}$ is taken as the ranker variable, and the computations are performed for the RSS and DRSS schemes.

In the first simulation study, the values of $\mathrm{MSE}(H_n^{RSS}(X^{(1)}))$ were computed; they are given in Table 4, for different values of $n$, $k$ and $\rho$, for the RSS and DRSS schemes. In the second simulation study, the variance and expectation of $CV_\gamma$ and $\hat M_\gamma$ were computed; they are reported alongside each simulated $\mathrm{MSE}(H_n^{RSS}(X^{(1)}))$ in Table 4. As one can see from Table 4, the estimator $\hat M_\gamma$ is less biased than $CV_\gamma$, except for the case $k = 5$ and $m = 3$. As expected, the variance of $CV_\gamma$ is less than that of $\hat M_\gamma$. In the simulation study, it is found that the suitable values of $d_1$ for $H_n^{RSS}(X^{(1)})$ are almost the same as those for $CV_\gamma$ and $\hat M_\gamma$.

4.2 Data analysis

We apply the results to a data set first analyzed by Platt et al. (1988). The original data set consists of measurements made on 399 longleaf pine (Pinus palustris) trees. We use a truncated version of this data set, given in Chen et al. (2004), which contains the diameter at breast height, in centimeters ($X^{(1)}$), and the height, in feet ($X^{(2)}$), of 396 trees. To examine the distributional properties of the proposed estimators for this real data set, we extract samples of size $n = mk = 30$, with $k = 3$ and $m = 10$, using RSS, as well as samples of the same size using DRSS and SRS, from this data set, with the diameter of the trees as the ranking variable. Since the population size is finite, the sampling is with replacement. Also, since the population does not have a density, so that the differential entropy is not defined, the true entropy of the population is estimated by

$$H_N(X) = -\frac{1}{N}\sum_{i=1}^{N} \log f_N(X_i)\, I_S(X_i),$$

where $f_N(x) = \frac{1}{N\gamma}\sum_{i=1}^{N} k_0\left(\frac{x - X_i}{\gamma}\right)$, and similarly for $I(X^{(1)}, X^{(2)})$.
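The finite-population benchmark $H_N$ is the same plug-in computed over the whole data set; a sketch (ours, with a Gaussian placeholder $k_0$ and $I_S \equiv 1$):

    import numpy as np

    def entropy_population(pop, gamma):
        # pop: all N population values, shape (N,); returns H_N of the text
        u = (pop[:, None] - pop[None, :]) / gamma
        f_N = (np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)).mean(axis=1) / gamma
        return -np.mean(np.log(f_N))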

Table 4: Simulated MSEs, together with the simulated variance and expectation of the estimated MSEs, under the bivariate normal distribution, for different values of $\rho$, $n$ and $k$. For each of the RSS and DRSS schemes, the columns report the simulated MSE, $\mathrm{Var}(\hat M_\gamma)$, $E(\hat M_\gamma)$, $\mathrm{Var}(CV_\gamma)$ and $E(CV_\gamma)$.

Table 5: The simulated biases and MSEs of the entropy and mutual information estimators for the tree data: $H_n^{RSS}(X^{(1)})$, $H_n^{DRSS}(X^{(1)})$ and $H_n^{SRS}(X^{(1)})$; $\hat I^{RSS}(X^{(1)}, X^{(2)})$, $\hat I^{DRSS}(X^{(1)}, X^{(2)})$ and $\hat I^{SRS}(X^{(1)}, X^{(2)})$; $\hat I^{*RSS}(X^{(1)}, X^{(2)})$, $\hat I^{*DRSS}(X^{(1)}, X^{(2)})$ and $\hat I^{*SRS}(X^{(1)}, X^{(2)})$.

Using the optimal bandwidth and kernel function for a simple random sample of size $N = 396$, the full-population estimates of the true parameters are $H_N(X^{(1)}) = 4.921$, $I_N(X^{(1)}, X^{(2)}) = \ldots$ and $I_N^*(X^{(1)}, X^{(2)}) = \ldots$. The simulated bias and MSE of the estimators were computed; they are given in Table 5. As can be observed from Table 5, the estimators of $I_N(X^{(1)}, X^{(2)})$ and $I_N^*(X^{(1)}, X^{(2)})$ based on DRSS have the smallest MSEs and biases. Also, the estimator of $H_N(X^{(1)})$ has the smallest MSE, but is more biased, under the RSS and DRSS schemes.

4.3 Application to a real data set

In this section, we apply the proposed theoretical results to the problem of variable selection in regression estimation based on ranked set sampling (Yu and Lam, 1997). We consider a real data set consisting of measurements of different body fat percentage evaluations and various body circumference measurements for $N = 252$ men; this data set is taken from Penrose et al. (1985). Consider the following variables:

Y: percent of body fat using Brozek's equation;
X^(1): abdomen circumference;
X^(2): weight;
X^(3): chest circumference.

Suppose that the regression estimator of the mean percent of body fat using Brozek's equation is of interest. Three auxiliary variables $X^{(1)}$, $X^{(2)}$ and $X^{(3)}$ are considered,

and suppose that one wishes to choose the two auxiliary variables most correlated with the variable of interest, $Y$. The classical correlation criteria fail to measure the correlation between three variables; thus, the mutual information criterion is used as the variable selection tool. A double ranked set sample with $k = 3$ and $m = 10$ is extracted from this data set, with $X^{(1)}$ as the ranking variable. This sample is given in Table 6. Based on this sample, the determinant of the correlation matrix of the sample is obtained as $|R| = \ldots$. The simulation results under the multivariate normal distribution with variance-covariance matrix $\Sigma$ satisfying $|\Sigma| = 0.01$ suggest using the bandwidth in (9) with $d_1 = \ldots$. The resulting estimators are

$$H_n(Y) = 3.599,$$
$$H_n(X^{(1)}, X^{(2)}, Y) = \ldots, \quad H_n(X^{(1)}, X^{(2)}) = 7.807,$$
$$H_n(X^{(1)}, X^{(3)}, Y) = \ldots, \quad H_n(X^{(1)}, X^{(3)}) = 6.894,$$
$$H_n(X^{(2)}, X^{(3)}, Y) = \ldots, \quad H_n(X^{(2)}, X^{(3)}) = 7.679,$$

and

$$\hat I((X^{(1)}, X^{(2)}), Y) = 0.180, \quad \hat I^*((X^{(1)}, X^{(2)}), Y) = 0.302,$$
$$\hat I((X^{(1)}, X^{(3)}), Y) = 0.029, \quad \hat I^*((X^{(1)}, X^{(3)}), Y) = 0.057,$$
$$\hat I((X^{(2)}, X^{(3)}), Y) = 0.160, \quad \hat I^*((X^{(2)}, X^{(3)}), Y) = 0.274.$$

Since

$$\hat I((X^{(1)}, X^{(2)}), Y) > \hat I((X^{(2)}, X^{(3)}), Y) > \hat I((X^{(1)}, X^{(3)}), Y),$$

the variables $X^{(1)}$ and $X^{(2)}$ might be chosen out of the three auxiliary variables, based on the sample given in Table 6.
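The selection step amounts to scoring every pair of auxiliary variables by its estimated mutual information with $Y$; a sketch (ours, built on `entropy_rss` above, with the hypothetical names `mi_with_y` and `select_pair`):

    import numpy as np
    from itertools import combinations

    def mi_with_y(block, y, gamma):
        # I((X^(a), X^(b)), Y) = H(X^(a), X^(b)) + H(Y) - H(X^(a), X^(b), Y)
        joint = np.column_stack([block, y])
        return (entropy_rss(block, gamma) + entropy_rss(y[:, None], gamma)
                - entropy_rss(joint, gamma))

    def select_pair(X, y, gamma):
        # return the pair of column indices of X with the largest estimated
        # mutual information with y, together with all the scores
        pairs = list(combinations(range(X.shape[1]), 2))
        scores = [mi_with_y(X[:, list(pr)], y, gamma) for pr in pairs]
        return pairs[int(np.argmax(scores))], scores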

Acknowledgements

The first author's research is supported in part by a grant from the Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.

Table 6: Double ranked set sample extracted from the body fat data set: the values of $Y_{[i]j}$, $X^{(1)}_{[i]j}$, $X^{(2)}_{[i]j}$ and $X^{(3)}_{[i]j}$, for $i = 1,\dots,3$ and $j = 1,\dots,10$.

Figure 1: Scatterplots for pairs of the variables Y, X^(1), X^(2) and X^(3) for the body fat data set.

Appendix

Proof of Lemma 1. The proof of part (i) is trivial. For the consistent ranked set sample, we have

$$E\int\int a(x,y)\, dU_n(x)\, dU_n(y) = nE\int\int a(x,y)\, dF_n^{RSS}(x)\, dF_n^{RSS}(y) - n\int\int a(x,y)\, dF_X(x)\, dF_X(y)$$
$$= n^{-1} E\left\{\sum_{i=1}^{k}\sum_{j=1}^{m} a(X_{[i]j}, X_{[i]j}) + \sum_{(i_1,j_1)\neq(i_2,j_2)} a(X_{[i_1]j_1}, X_{[i_2]j_2})\right\} - n\int\int a(x,y)\, dF_X(x)\, dF_X(y)$$
$$= \int a(x,x)\, dF_X(x) + n\int\int a(x,y)\, dF_X(x)\, dF_X(y) - \frac{1}{k}\sum_{i=1}^{k}\int\int a(x,y)\, f_{X_{[i]}}(x)\, f_{X_{[i]}}(y)\, dx\, dy - n\int\int a(x,y)\, dF_X(x)\, dF_X(y)$$
$$= \int a(x,x)\, dF_X(x) - \frac{1}{k}\sum_{i=1}^{k}\int\int a(x,y)\, f_{X_{[i]}}(x)\, f_{X_{[i]}}(y)\, dx\, dy,$$

where the first equality uses part (i), and the third follows because units indexed by distinct pairs $(i,j)$ are mutually independent and, by condition (C1), $\frac{1}{k}\sum_{i=1}^{k} f_{X_{[i]}} = f_X$.

Proof of Theorem 1. By a Taylor expansion of the integrands, and using the fact that

$$\hat f_n^{RSS}(x) - \bar f_\gamma(x) = n^{-1/2}\int \frac{1}{\gamma^{p}} K_p\left(\frac{y-x}{\gamma}\right) dU_n(y),$$

we can write

$$H_n^{RSS}(X) = -\int_S \log(\hat f_n^{RSS}(x))\, dF_n^{RSS}(x)$$
$$= -\int_S \log(\bar f_\gamma(x))\, dF_X(x) - n^{-1/2}\int_S \log(\bar f_\gamma(x))\, dU_n(x) - \int_S \big(\log(\hat f_n^{RSS}(x)) - \log(\bar f_\gamma(x))\big)\, dF_X(x) - n^{-1/2}\int_S \big(\log(\hat f_n^{RSS}(x)) - \log(\bar f_\gamma(x))\big)\, dU_n(x)$$
$$= H_\gamma - n^{-1/2}\int_S \log(\bar f_\gamma(x))\, dU_n(x) - \int_S \frac{\hat f_n^{RSS}(x) - \bar f_\gamma(x)}{\bar f_\gamma(x)}\, dF_X(x) + \int_S \frac{(\hat f_n^{RSS}(x) - \bar f_\gamma(x))^2}{2\,\bar f_\gamma(x)^2}\, dF_X(x) - n^{-1/2}\int_S \frac{\hat f_n^{RSS}(x) - \bar f_\gamma(x)}{\bar f_\gamma(x)}\, dU_n(x) + O(n^{-3/2}).$$

Thus

$$H_n^{RSS}(X) = H_\gamma - n^{-1/2}\left[\int_S \int \frac{\frac{1}{\gamma^{p}} K_p\left(\frac{y-x}{\gamma}\right)}{\bar f_\gamma(x)}\, dU_n(y)\, dF_X(x) + \int_S \log(\bar f_\gamma(x))\, dU_n(x)\right]$$
$$\phantom{H_n^{RSS}(X) =} - n^{-1}\left[\int\int_S \frac{\frac{1}{\gamma^{p}} K_p\left(\frac{y-x}{\gamma}\right)}{\bar f_\gamma(y)}\, dU_n(y)\, dU_n(x) - \int\int\left(\int_S \frac{\frac{1}{\gamma^{2p}} K_p\left(\frac{y-x}{\gamma}\right) K_p\left(\frac{z-x}{\gamma}\right)}{2\,\bar f_\gamma(x)^2}\, dF_X(x)\right) dU_n(y)\, dU_n(z)\right] + O(n^{-3/2}).$$

By applying Lemma 1, the required result follows.

References

1. Al-Saleh, M.F. and Al-Kadiri, M. (2000). Double ranked set sampling. Statistics & Probability Letters, 48.

2. Al-Saleh, M.F. and Al-Omari, A.I. (2002). Multistage ranked set sampling. Journal of Statistical Planning and Inference, 102.

3. Chen, Z. (1999). Density estimation using ranked set sampling data. Environmental and Ecological Statistics, 6.

4. Chen, Z., Bai, Z. and Sinha, B. (2004). Ranked Set Sampling: Theory and Applications. Lecture Notes in Statistics. Springer, New York.

5. Chen, H., Stasny, E.A. and Wolfe, D.A. (2005). Ranked set sampling for efficient estimation of a population proportion. Statistics in Medicine, 24.

6. Cover, T.M. and Thomas, J.A. (2012). Elements of Information Theory. 2nd ed., Wiley, New York.

7. Dell, T.R. and Clutter, J.L. (1972). Ranked set sampling theory with order statistics background. Biometrics, 28.

8. Kvam, P.H. (2003). Ranked set sampling based on binary water quality data with covariates. Journal of Agricultural, Biological, and Environmental Statistics, 8.

9. Halls, L.K. and Dell, T.R. (1966). Trial of ranked-set sampling for forage yields. Forest Science, 12.

10. Deshpande, J.V., Frey, J. and Ozturk, O. (2006). Nonparametric ranked-set sampling confidence intervals for quantiles of a finite population. Environmental and Ecological Statistics, 13.

11. Joe, H. (1989). Estimation of entropy and other functionals of a multivariate density. Annals of the Institute of Statistical Mathematics, 41.

12. Ozturk, O. (2011). Sampling from partially rank-ordered sets. Environmental and Ecological Statistics, 18.

13. Penrose, K.W., Nelson, A.G. and Fisher, A.G. (1985). Generalized body composition prediction equation for men using simple measurement techniques. Medicine and Science in Sports and Exercise, 17.

14. Platt, W.J., Evans, G.W. and Rathbun, S.L. (1988). The population dynamics of a long-lived conifer (Pinus palustris). American Naturalist, 131.

15. McIntyre, G.A. (1952). A method of unbiased selective sampling using ranked sets. Australian Journal of Agricultural Research, 3.

16. Shannon, C.E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27.

17. Stokes, S.L. (1980). Estimation of variance using judgement ordered ranked set samples. Biometrics, 36.

18. Stokes, S.L. and Sager, T.W. (1988). Characterization of a ranked-set sample with application to estimating distribution functions. Journal of the American Statistical Association, 83.

19. Yu, P.L.H. and Lam, K. (1997). Regression estimator in ranked set sampling. Biometrics, 53(3).


MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions

More information

Recall that if X 1,...,X n are random variables with finite expectations, then. The X i can be continuous or discrete or of any other type.

Recall that if X 1,...,X n are random variables with finite expectations, then. The X i can be continuous or discrete or of any other type. Expectations of Sums of Random Variables STAT/MTHE 353: 4 - More on Expectations and Variances T. Linder Queen s University Winter 017 Recall that if X 1,...,X n are random variables with finite expectations,

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random

More information

Model Assisted Survey Sampling

Model Assisted Survey Sampling Carl-Erik Sarndal Jan Wretman Bengt Swensson Model Assisted Survey Sampling Springer Preface v PARTI Principles of Estimation for Finite Populations and Important Sampling Designs CHAPTER 1 Survey Sampling

More information

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015 Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)

More information

ON SMALL SAMPLE PROPERTIES OF PERMUTATION TESTS: INDEPENDENCE BETWEEN TWO SAMPLES

ON SMALL SAMPLE PROPERTIES OF PERMUTATION TESTS: INDEPENDENCE BETWEEN TWO SAMPLES ON SMALL SAMPLE PROPERTIES OF PERMUTATION TESTS: INDEPENDENCE BETWEEN TWO SAMPLES Hisashi Tanizaki Graduate School of Economics, Kobe University, Kobe 657-8501, Japan e-mail: tanizaki@kobe-u.ac.jp Abstract:

More information

A Bootstrap Test for Conditional Symmetry

A Bootstrap Test for Conditional Symmetry ANNALS OF ECONOMICS AND FINANCE 6, 51 61 005) A Bootstrap Test for Conditional Symmetry Liangjun Su Guanghua School of Management, Peking University E-mail: lsu@gsm.pku.edu.cn and Sainan Jin Guanghua School

More information

Mutual Information and Optimal Data Coding

Mutual Information and Optimal Data Coding Mutual Information and Optimal Data Coding May 9 th 2012 Jules de Tibeiro Université de Moncton à Shippagan Bernard Colin François Dubeau Hussein Khreibani Université de Sherbooe Abstract Introduction

More information

ECE534, Spring 2018: Solutions for Problem Set #3

ECE534, Spring 2018: Solutions for Problem Set #3 ECE534, Spring 08: Solutions for Problem Set #3 Jointly Gaussian Random Variables and MMSE Estimation Suppose that X, Y are jointly Gaussian random variables with µ X = µ Y = 0 and σ X = σ Y = Let their

More information

The Delta Method and Applications

The Delta Method and Applications Chapter 5 The Delta Method and Applications 5.1 Local linear approximations Suppose that a particular random sequence converges in distribution to a particular constant. The idea of using a first-order

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample

More information

Multivariate random variables

Multivariate random variables DS-GA 002 Lecture notes 3 Fall 206 Introduction Multivariate random variables Probabilistic models usually include multiple uncertain numerical quantities. In this section we develop tools to characterize

More information

Random Eigenvalue Problems Revisited

Random Eigenvalue Problems Revisited Random Eigenvalue Problems Revisited S Adhikari Department of Aerospace Engineering, University of Bristol, Bristol, U.K. Email: S.Adhikari@bristol.ac.uk URL: http://www.aer.bris.ac.uk/contact/academic/adhikari/home.html

More information

A Magiv CV Theory for Large-Margin Classifiers

A Magiv CV Theory for Large-Margin Classifiers A Magiv CV Theory for Large-Margin Classifiers Hui Zou School of Statistics, University of Minnesota June 30, 2018 Joint work with Boxiang Wang Outline 1 Background 2 Magic CV formula 3 Magic support vector

More information

Probability Distributions and Estimation of Ali-Mikhail-Haq Copula

Probability Distributions and Estimation of Ali-Mikhail-Haq Copula Applied Mathematical Sciences, Vol. 4, 2010, no. 14, 657-666 Probability Distributions and Estimation of Ali-Mikhail-Haq Copula Pranesh Kumar Mathematics Department University of Northern British Columbia

More information

conditional cdf, conditional pdf, total probability theorem?

conditional cdf, conditional pdf, total probability theorem? 6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random

More information

02 Background Minimum background on probability. Random process

02 Background Minimum background on probability. Random process 0 Background 0.03 Minimum background on probability Random processes Probability Conditional probability Bayes theorem Random variables Sampling and estimation Variance, covariance and correlation Probability

More information

Multivariate random variables

Multivariate random variables Multivariate random variables DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Joint distributions Tool to characterize several

More information

Estimation of information-theoretic quantities

Estimation of information-theoretic quantities Estimation of information-theoretic quantities Liam Paninski Gatsby Computational Neuroscience Unit University College London http://www.gatsby.ucl.ac.uk/ liam liam@gatsby.ucl.ac.uk November 16, 2004 Some

More information

GENERALIZED NONLINEARITY OF S-BOXES. Sugata Gangopadhyay

GENERALIZED NONLINEARITY OF S-BOXES. Sugata Gangopadhyay Volume X, No. 0X, 0xx, X XX doi:0.3934/amc.xx.xx.xx GENERALIZED NONLINEARITY OF -BOXE ugata Gangopadhyay Department of Computer cience and Engineering, Indian Institute of Technology Roorkee, Roorkee 47667,

More information

CSCI-6971 Lecture Notes: Monte Carlo integration

CSCI-6971 Lecture Notes: Monte Carlo integration CSCI-6971 Lecture otes: Monte Carlo integration Kristopher R. Beevers Department of Computer Science Rensselaer Polytechnic Institute beevek@cs.rpi.edu February 21, 2006 1 Overview Consider the following

More information