arxiv: v4 [math.st] 26 Aug 2017


A GENERAL ASYMPTOTIC FRAMEWORK FOR DISTRIBUTION-FREE GRAPH-BASED TWO-SAMPLE TESTS

BHASWAR B. BHATTACHARYA

Abstract. Testing equality of two multivariate distributions is a classical problem for which many non-parametric tests have been proposed over the years. Most of the popular two-sample tests, which are asymptotically distribution-free, are based either on geometric graphs constructed using inter-point distances between the observations (multivariate generalizations of the Wald-Wolfowitz runs test) or on multivariate data-depth (generalizations of the Mann-Whitney rank test). This paper introduces a general notion of distribution-free graph-based two-sample tests, which includes the tests described above, and provides a unified framework for analyzing and comparing their asymptotic properties. The asymptotic efficiency of a general graph-based test is derived, which relates combinatorial properties of the underlying graph to the performance of the associated test. As a consequence, it is shown that tests based on geometric graphs such as the Friedman-Rafsky test [7], the test based on the K-nearest neighbor graph, and the cross-match test of Rosenbaum [38] have zero asymptotic (Pitman) efficiency against $O(N^{-1/2})$ alternatives. Asymptotic efficiencies of the tests based on multivariate depth functions (the Liu-Singh rank sum statistic [7]) are also derived from the proposed general theory. Applications of these results are illustrated both on synthetic and real datasets.

1. Introduction

Let $F$ and $G$ be two continuous distribution functions in $\mathbb{R}^d$ with densities $f$ and $g$ with respect to Lebesgue measure, respectively. Given independent and identically distributed samples

$\mathcal{X}_{N_1} = \{X_1, X_2, \ldots, X_{N_1}\}$ and $\mathcal{Y}_{N_2} = \{Y_1, Y_2, \ldots, Y_{N_2}\}$ (1.1)

from two unknown densities $f$ and $g$, respectively, the two-sample problem is to distinguish the hypotheses

$H_0: f = g$, versus $H_1: f \ne g$. (1.2)

More precisely, $H_0$ is the collection of all distributions of mutually independent i.i.d. observations with sample sizes $N_1$ and $N_2$ from a distribution in $\mathcal{P}_d$, the collection of all probability distributions in $\mathbb{R}^d$ which have a density with respect to Lebesgue measure. Similarly, the alternative $H_1$ is the collection of all distributions of mutually independent i.i.d. observations with sample size $N_1$ from some distribution $F \in \mathcal{P}_d$, and i.i.d. observations with sample size $N_2$ from some other distribution $G \in \mathcal{P}_d$, such that their respective densities $f$ and $g$ differ on a set of positive Lebesgue measure.

2010 Mathematics Subject Classification. 62G10, 62G30, 60D05, 60F05, 60C05.
Key words and phrases. Asymptotic efficiency, Distribution-free tests, Minimum spanning tree, Nearest-neighbor graphs, Two-sample problem.

There are many multivariate two-sample testing procedures, ranging from tests for parametric hypotheses such as Hotelling's $T^2$-test and the likelihood ratio test, to more general non-parametric procedures [4, 5, 8, 9, 4, 7, 9, 0,, 37, 38, 39, 44]. In this paper, we consider multivariate two-sample tests which are asymptotically distribution-free, that is, tests for which the asymptotic null distribution does not depend on the underlying (unknown) distribution of the data. As a result, these tests can be directly implemented as asymptotically level $\alpha$ tests, making them practically convenient.

For univariate data, there are several celebrated nonparametric distribution-free tests such as the Wald-Wolfowitz runs test [43] and the Mann-Whitney rank test [9] (see the textbook [8] for more on these tests). Many multivariate generalizations of these tests, which are asymptotically distribution-free, have been proposed. Most of these tests can be broadly classified into two categories:

(1) Tests based on geometric graphs: These tests are constructed using the inter-point distances between the observations. This includes the test based on the Euclidean minimal spanning tree by Friedman and Rafsky [7] (a generalization of the Wald-Wolfowitz runs test [43] to higher dimensions) and tests based on nearest neighbor graphs [, 39]. This also includes Rosenbaum's [38] test based on minimum non-bipartite matching and the test based on the Hamiltonian path by Biswas et al. [9] (both of which are distribution-free in finite samples), and the recent tests of Chen and Friedman []. Refer to Maa et al. [8] for theoretical motivations for using tests based on inter-point distances.

(2) Tests based on depth functions: The Liu-Singh rank sum tests [7] are a class of multivariate two-sample tests that generalize the Mann-Whitney rank test using the notion of data-depth. These include tests based on halfspace depth [4] and simplicial depth [6], among others. For other generalizations of the Mann-Whitney test, refer to the survey by Oja [3] and the references therein.

In this paper, we provide a general framework of graph-based two-sample tests, which includes all the tests discussed above. We begin with a few definitions: A subset $S \subset \mathbb{R}^d$ is locally finite if $|S \cap C|$ is finite for all compact subsets $C \subset \mathbb{R}^d$. A locally finite set $S \subset \mathbb{R}^d$ is nice if all inter-point distances among the elements of $S$ are distinct. If $S$ is a set of $N$ i.i.d. points $W_1, W_2, \ldots, W_N$ from some continuous distribution $F$, then the distribution of $\|W_1 - W_2\|$ does not have any point mass, and $S$ is nice almost surely.

A graph functional $\mathcal{G}$ in $\mathbb{R}^d$ defines a graph for all finite subsets of $\mathbb{R}^d$, that is, given $S \subset \mathbb{R}^d$ finite, $\mathcal{G}(S)$ is a graph with vertex set $S$. A graph functional is said to be undirected/directed if the graph $\mathcal{G}(S)$ is an undirected/directed graph with vertex set $S$. We assume that $\mathcal{G}(S)$ has no self-loops or multiple edges, that is, no edge is repeated more than once in the undirected case, and no edge in the same direction is repeated more than once in the directed case. The set of edges in the graph $\mathcal{G}(S)$ will be denoted by $E(\mathcal{G}(S))$.

Definition 1.1. Let $\mathcal{X}_{N_1}$ and $\mathcal{Y}_{N_2}$ be i.i.d. samples of size $N_1$ and $N_2$ from densities $f$ and $g$, respectively, as in (1.1). The two-sample test statistic based on the graph functional $\mathcal{G}$ is defined as

$T(\mathcal{G}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2})) = \frac{\sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \mathbf{1}\{(X_i, Y_j) \in E(\mathcal{G}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2}))\}}{|E(\mathcal{G}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2}))|}$. (1.3)
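As a minimal sketch of the statistic (1.3), assume the graph is already available as a list of index pairs into the pooled sample and that each pooled point carries the label 1 (first sample) or 2 (second sample). The function name `cross_edge_proportion` and the edge-list representation are illustrative choices, not taken from the paper.

```python
def cross_edge_proportion(edges, labels, directed=False):
    """Proportion of edges joining the two samples, as in (1.3).

    edges  : iterable of (u, v) index pairs into the pooled sample
             (for a directed graph, the edge points from u to v)
    labels : labels[i] is 1 if pooled point i is from the first sample,
             2 if it is from the second sample (assumed convention)
    """
    edges = list(edges)
    if not edges:
        raise ValueError("the graph must have at least one edge")
    count = 0
    for u, v in edges:
        if directed:
            # outward end in the first sample, inward end in the second
            count += labels[u] == 1 and labels[v] == 2
        else:
            # one endpoint in each sample
            count += labels[u] != labels[v]
    return count / len(edges)


# Hypothetical example: a path through six pooled points.
labels = [1, 1, 2, 1, 2, 2]
path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
t_undirected = cross_edge_proportion(path, labels)
t_directed = cross_edge_proportion(path, labels, directed=True)
```

Keeping the graph construction separate from the counting step mirrors the role of the graph functional: any rule that produces an edge list can be plugged in without changing the statistic.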

If $\mathcal{G}$ is an undirected graph functional, then the statistic $T$ counts the proportion of edges in the graph $\mathcal{G}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2})$ with one end point in $\mathcal{X}_{N_1}$ and the other end point in $\mathcal{Y}_{N_2}$. If $\mathcal{G}$ is a directed graph functional, then $T$ is the proportion of directed edges with the outward end in $\mathcal{X}_{N_1}$ and the inward end in $\mathcal{Y}_{N_2}$. By conditioning on the graph $\mathcal{G}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2})$, it is easy to see that under the null $H_0$, $E(T(\mathcal{G}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2})))$ is $\frac{2 N_1 N_2}{N(N-1)}$ or $\frac{N_1 N_2}{N(N-1)}$, depending on whether the graph functional is undirected or directed, respectively, where $N = N_1 + N_2$.

In this paper, the graph functionals are computed based on the Euclidean distance in $\mathbb{R}^d$, and the rejection region of the statistic (1.3) will be based on its asymptotic null distribution in the usual limiting regime $N \to \infty$ with

$\frac{N_1}{N_1 + N_2} \to p \in (0, 1)$, $\frac{N_2}{N_1 + N_2} \to q := 1 - p$, (1.4)

and $r := 2pq$. The test statistics considered in this paper will have $N^{-1/2}$-fluctuations under $H_0$. Thus, depending on the type of alternative, the test based on (1.3) will reject $H_0$ for large and/or small values of the standardized statistic

$R(\mathcal{G}(\mathcal{Z}_N)) = \sqrt{N}\,\{T(\mathcal{G}(\mathcal{Z}_N)) - E(T(\mathcal{G}(\mathcal{Z}_N)))\}$, (1.5)

where $\mathcal{Z}_N = \mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2}$ is the pooled sample. Let $\mathcal{Z}_N = \{Z_1, Z_2, \ldots, Z_N\}$ be the elements of the pooled sample, with $Z_i$ labelled

$c_i = 1$ if $Z_i \in \mathcal{X}_{N_1}$, and $c_i = 2$ if $Z_i \in \mathcal{Y}_{N_2}$. (1.6)

Then (1.3) can be re-written as

$T(\mathcal{G}(\mathcal{Z}_N)) := \frac{\sum_{1 \le i \ne j \le N} \psi(c_i, c_j)\, \mathbf{1}\{(Z_i, Z_j) \in E(\mathcal{G}(\mathcal{Z}_N))\}}{|E(\mathcal{G}(\mathcal{Z}_N))|}$, (1.7)

where $\psi(c_i, c_j) = \mathbf{1}\{c_i = 1, c_j = 2\}$.

1.1. Two-Sample Tests Based on Geometric Graphs. Many popular multivariate two-sample test statistics are of the form (1.3) where the graph functional $\mathcal{G}$ is constructed using the inter-point distances of the pooled sample.

1.1.1. Wald-Wolfowitz (WW) Runs Test. This is one of the earliest known non-parametric tests for the equality of two univariate distributions [43]: Let $\mathcal{X}_{N_1}$ and $\mathcal{Y}_{N_2}$ be i.i.d. samples of size $N_1$ and $N_2$ as in (1.1). A run in the pooled sample $\mathcal{Z}_N = \mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2}$ is a maximal non-empty segment of adjacent elements with the same label when the elements in $\mathcal{Z}_N$ are arranged in increasing order. (Ties occur with zero probability by the assumption that both the distribution functions $F$ and $G$ are continuous.) If the two distributions are different, the elements with labels 1 and 2 would be clumped together, and the total number of runs $C(\mathcal{Z}_N)$ in $\mathcal{Z}_N$ will be small. On the other hand, for distributions which are equal/close, the different labels are jumbled up and $C(\mathcal{Z}_N)$ will be large. Thus, the WW-test rejects $H_0$ for small values of $C(\mathcal{Z}_N)$. It is easy to see that the WW-test is a graph-based test (1.3): every change of label along the sorted pooled sample starts a new run, so

$C(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2}) = 1 + (N - 1)\, T(\mathcal{P}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2}))$,

where $\mathcal{P}(S)$ is the path with $|S| - 1$ edges through the elements of $S$ arranged in increasing order, for all finite $S \subset \mathbb{R}$. Let $R(\mathcal{P}(\mathcal{Z}_N))$ be the standardized version of $T(\mathcal{P}(\mathcal{Z}_N))$ as in (1.5). Wald and Wolfowitz [43] proved that $R(\mathcal{P}(\mathcal{Z}_N))$ is distribution-free in finite samples, is asymptotically normal under $H_0$, and is consistent under all fixed alternatives. The WW-test often has low power in practice and has zero asymptotic efficiency, that is, it is powerless against $O(N^{-1/2})$ alternatives [30].

1.1.2. Friedman-Rafsky (FR) Test. Friedman and Rafsky [7] generalized the Wald-Wolfowitz runs test to higher dimensions by using the Euclidean minimal spanning tree of the pooled sample.

Definition 1.2. Given a nice finite set $S \subset \mathbb{R}^d$, a spanning tree of $S$ is a connected graph $T$ with vertex-set $S$ and no cycles. The length $w(T)$ of $T$ is the sum of the Euclidean lengths of the edges of $T$. A minimum spanning tree (MST) of $S$, denoted by $\mathcal{T}(S)$, is a spanning tree with the smallest length, that is, $w(\mathcal{T}(S)) \le w(T)$ for all spanning trees $T$ of $S$.

Thus, $\mathcal{T}$ defines a graph functional in $\mathbb{R}^d$, and given $\mathcal{X}_{N_1}$ and $\mathcal{Y}_{N_2}$ as in (1.1), the FR-test rejects $H_0$ for small values of

$T(\mathcal{T}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2})) = \frac{\sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \mathbf{1}\{(X_i, Y_j) \in E(\mathcal{T}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2}))\}}{N - 1}$. (1.8)

This is precisely the WW-test in $d = 1$, and is motivated by the same intuition: when the two distributions are different, the number of edges across labels 1 and 2 is small. Friedman and Rafsky [7] calibrated (1.8) as a permutation test, and showed that it has good power in practice for multivariate data. Later, Henze and Penrose [] proved that $R(\mathcal{T}(\mathcal{Z}_N))$ is asymptotically normal under $H_0$ and is consistent under all fixed alternatives. Recently, Chen and Zhang [0] used the FR and related graph-based tests in change-point detection problems, and suggested new modifications of the FR-test for high-dimensional and object data [, ].

1.1.3. Test Based on K-Nearest Neighbor (K-NN) Graphs. As in (1.8), a multivariate two-sample test can be constructed using the K-nearest neighbor graph of $\mathcal{Z}_N$. This was originally suggested by Friedman and Rafsky [7] and later studied by Schilling [39] and Henze [].

Definition 1.3. Given a nice finite set $S \subset \mathbb{R}^d$, the (undirected) K-nearest neighbor graph (K-NN) is a graph with vertex set $S$ with an edge $(a, b)$, for $a, b \in S$, if the Euclidean distance between $a$ and $b$ is among the $K$ smallest distances from $a$ to any other point in $S$ and/or among the $K$ smallest distances from $b$ to any other point in $S$. Denote the undirected K-NN graph of $S$ by $\mathcal{N}_K(S)$.

Given $\mathcal{X}_{N_1}$ and $\mathcal{Y}_{N_2}$ as in (1.1), the K-NN statistic is

$T(\mathcal{N}_K(\mathcal{Z}_N)) = \frac{\sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \mathbf{1}\{(X_i, Y_j) \in E(\mathcal{N}_K(\mathcal{Z}_N))\}}{|E(\mathcal{N}_K(\mathcal{Z}_N))|}$, (1.9)

where $\mathcal{Z}_N = \mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2}$. As before, when the two distributions are different, the number of edges across the two samples is small (see Figure 1), so the K-NN test rejects $H_0$ for small values of (1.9). Schilling [39] showed that the test based on K nearest neighbors is asymptotically normal under $H_0$ and consistent under fixed alternatives.

Figure 1. The 3-NN graph on a pooled sample of i.i.d. points from two normal distributions with identity covariance (one sample colored blue, the other red), shown for two values of the mean shift in panels (a) and (b); panel (b) uses shift 0.05. The caption reports the number of edges with endpoints in different samples for each panel.

1.1.4. Cross-Match (CM) Test. Rosenbaum [38] proposed a distribution-free multivariate two-sample test based on minimum non-bipartite matching. For simplicity, assume that the total number of samples $N$ is even; otherwise, add or delete a sample point to make it even.

Definition 1.4. Given a finite $S \subset \mathbb{R}^d$ of even cardinality $N$ and a symmetric distance matrix $D := ((d(a, b)))_{a \ne b \in S}$, a non-bipartite matching of $S$ is a pairing of the elements of $S$ into $N/2$ non-overlapping pairs, that is, a partition $S = \bigcup_{i=1}^{N/2} S_i$, where $|S_i| = 2$ and $S_i \cap S_j = \emptyset$ for $i \ne j$. The weight of a matching is the sum of the distances between the $N/2$ matched pairs. A minimum non-bipartite matching (NBM) of $S$ is a matching which has the minimum weight over all matchings of $S$.

The NBM defines a graph functional $\mathcal{W}$ as follows: for every $S \subset \mathbb{R}^d$ finite, $\mathcal{W}(S)$ is the graph with vertex set $S$ and an edge $(a, b)$ whenever there exists $i \in [N/2]$ such that $S_i = \{a, b\}$. Note that $\mathcal{W}(S)$ is a graph with $N/2$ pairwise disjoint edges. Given $\mathcal{X}_{N_1}$ and $\mathcal{Y}_{N_2}$ as in (1.1), the CM-test rejects $H_0$ for small values of

$T(\mathcal{W}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2})) = \frac{\sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \mathbf{1}\{(X_i, Y_j) \in E(\mathcal{W}(\mathcal{X}_{N_1} \cup \mathcal{Y}_{N_2}))\}}{N/2}$. (1.10)

^2 The statistic (1.9) is slightly different from the test used by Schilling [39, Section ], which can also be re-written as a graph-based test (1.3) by allowing multiple edges in the K-NN graph.
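The K-NN statistic (1.9) can be sketched in a few lines. This is a brute-force illustration that assumes small samples stored as tuples of coordinates with all inter-point distances distinct; the helper names `knn_edges` and `knn_statistic` are hypothetical, and the quadratic search over all pairs is chosen for clarity rather than speed.

```python
import math


def knn_edges(points, k):
    """Undirected K-NN graph: {a, b} is an edge if b is among the k nearest
    points to a, and/or a is among the k nearest points to b."""
    n = len(points)
    edges = set()
    for a in range(n):
        others = sorted(
            (b for b in range(n) if b != a),
            key=lambda b: math.dist(points[a], points[b]),
        )
        for b in others[:k]:
            # store each undirected edge once, smaller index first
            edges.add((min(a, b), max(a, b)))
    return edges


def knn_statistic(x, y, k):
    """Proportion of K-NN edges with one endpoint in each sample."""
    pooled = list(x) + list(y)
    labels = [1] * len(x) + [2] * len(y)
    edges = knn_edges(pooled, k)
    cross = sum(labels[a] != labels[b] for a, b in edges)
    return cross / len(edges)


# Hypothetical data: two well-separated clusters give no cross edges for k = 1.
x = [(0.0, 0.0), (0.0, 1.0), (1.5, 0.0)]
y = [(10.0, 10.0), (10.0, 11.2), (11.7, 10.0)]
t_separated = knn_statistic(x, y, k=1)
```

Storing edges as sorted index pairs in a set implements the "and/or" in the definition: an edge proposed from both endpoints is kept only once, so the graph has no multiple edges.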

Like the WW-test, but unlike the FR and the K-NN tests, the CM-test is distribution-free in finite samples under $H_0$. Rosenbaum [38] implemented this as a permutation test and derived the asymptotic normal distribution under the permutation null.

1.2. Two-Sample Tests Based on Data-Depth. Many non-parametric two-sample tests are based on depth functions, which are multivariate generalizations of ranks [6, 3]. Given a distribution function $F$ in $\mathbb{R}^d$, a depth function $D(\cdot, F): \mathbb{R}^d \to [0, 1]$ is a function that provides a ranking of points in $\mathbb{R}^d$. High depth corresponds to centrality, while low depth corresponds to outlyingness. The center consists of the points that globally maximize the depth, and is often considered as a multivariate median of $F$. For $X \sim F$ and $Y \sim G$ two independent random variables in $\mathbb{R}^d$,

$R_D(y, F) = P(X : D(X, F) \le D(y, F))$ (1.11)

is a measure of the relative outlyingness of a point $y \in \mathbb{R}^d$ with respect to $F$. In other words, $R_D(y, F)$ is the fraction of the $F$ population with depth at most that of the point $y$. The dependence on $D$ in the notation $R_D(\cdot, F)$ will be dropped when it is clear from the context. The quality index

$Q(F, G) := \int R(y, F)\, dG(y) = P(D(X, F) \le D(Y, F) \mid X \sim F, Y \sim G)$ (1.12)

is the fraction of $F$ with depth at most that of the point $y$, averaged over $y$ distributed as $G$. When $F = G$ and $D(X, F)$ has a continuous distribution, then $R(Y, F) \sim \mathrm{Unif}[0, 1]$ and $Q(F, G) = 1/2$ [7, Proposition 3.].

Definition 1.5. Let $\mathcal{X}_{N_1}$ and $\mathcal{Y}_{N_2}$ be as in (1.1) and $F_{N_1}$ and $G_{N_2}$ be the corresponding empirical distribution functions, respectively. The Liu-Singh rank sum statistic [7] is

$Q(F_{N_1}, G_{N_2}) := \int R(y, F_{N_1})\, dG_{N_2}(y) = \frac{1}{N_2} \sum_{j=1}^{N_2} R(Y_j, F_{N_1})$, (1.13)

the sample estimator of $Q(F, G)$.

The test rejects $H_0$ for large values of $(Q(F_{N_1}, G_{N_2}) - \frac{1}{2})^2$ and can be re-written as a graph-based test (1.3). To see this, note that

$Q(F_{N_1}, G_{N_2}) = \frac{1}{N_1 N_2} \sum_{j=1}^{N_2} |\{i \in [N_1] : D(X_i, F_{N_1}) \le D(Y_j, F_{N_1})\}| = \frac{1}{N_1 N_2} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \mathbf{1}\{D(X_i, F_{N_1}) \le D(Y_j, F_{N_1})\}$. (1.14)

Let $\mathcal{Z}_N$ be the pooled sample with the labeling as in (1.6). Construct a graph $\mathcal{G}_D(\mathcal{Z}_N)$ with vertices $\mathcal{Z}_N$ and a directed edge from $Z_i$ to $Z_j$ whenever $D(Z_i, F_{N_1}) \le D(Z_j, F_{N_1})$. Note that

$\mathcal{G}_D(\mathcal{Z}_N)$ is a complete graph with directions on the edges depending on the relative order of the depth of the two endpoints. Then from (1.14),

$Q(F_{N_1}, G_{N_2}) = \frac{1}{N_1 N_2} \sum_{1 \le i \ne j \le N} \psi(c_i, c_j)\, \mathbf{1}\{D(Z_i, F_{N_1}) \le D(Z_j, F_{N_1})\} = \frac{1}{N_1 N_2} \sum_{1 \le i \ne j \le N} \psi(c_i, c_j)\, \mathbf{1}\{(Z_i, Z_j) \in E(\mathcal{G}_D(\mathcal{Z}_N))\}$,

with $\psi(\cdot, \cdot)$ as in (1.7). Since $|E(\mathcal{G}_D(\mathcal{Z}_N))| = \binom{N}{2}$, this implies

$Q(F_{N_1}, G_{N_2}) = \frac{1}{N_1 N_2} \binom{N}{2}\, T(\mathcal{G}_D(\mathcal{Z}_N))$, (1.15)

where $T$ is defined in (1.7). Thus, the Liu-Singh rank sum statistic based on a depth function $D$ is a graph-based test (1.3).

Common depth functions include the Mahalanobis depth, the halfspace depth, the simplicial depth, and the projection depth, among others. In the following, we recall the definitions for a few of these (refer to [6] for more on depth functions).

1. Mann-Whitney Test: This is one of the oldest non-parametric two-sample tests for univariate data [8, 4]. Depending on the alternative, the Mann-Whitney rank test rejects $H_0$ for small or large values of $\sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \mathbf{1}\{X_i < Y_j\}$. This is a distribution-free test, which corresponds to taking $D(x, F) = F(x)$ in (1.14).

2. Halfspace Depth: Tukey [4] suggested the following depth function

$HD(x, F) = \inf_{H_x} \int_{H_x} dF$, (1.16)

where the infimum is taken over all closed halfspaces $H_x$ with $x$ on the boundary. The two-sample test based on the halfspace depth is obtained by using (1.16) in the statistic (1.14).

3. Mahalanobis Depth: Given a distribution function $F$, the Mahalanobis depth of a point $x \in \mathbb{R}^d$ is

$MD(x, F) = \frac{1}{1 + (x - \mu(F))^\top \Sigma^{-1}(F) (x - \mu(F))}$, (1.17)

where $\mu(F) = \int x\, dF(x)$ and $\Sigma(F) = \int (x - \mu(F))(x - \mu(F))^\top dF(x)$ are the mean and covariance of the distribution $F$.

The Liu-Singh rank sum statistic is consistent against alternatives for which the quality index $Q(F, G) \ne 1/2$ (recall (1.12)), under mild conditions [7]. A special version of the Liu-Singh rank sum statistic, with a reference sample, inherits the distribution-free property of the Mann-Whitney test in finite samples [7, Section 4]. However, the Liu-Singh rank sum statistic defined above (1.13) is only asymptotically distribution-free under the null hypothesis [7, 45]. Zuo and He [45, Theorem ] proved the asymptotic normality of the Liu-Singh rank sum statistic under general alternatives for depth functions satisfying certain regularity conditions.
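The rank sum statistic (1.14) only needs a depth function evaluated against the empirical distribution of the first sample. The sketch below assumes univariate data so that it stays in the standard library; `liu_singh_q`, `cdf_depth` and `mahalanobis_depth_1d` are illustrative names, and the univariate Mahalanobis depth is the $d = 1$ case of (1.17) with the empirical mean and (population-style) variance plugged in.

```python
from statistics import mean, pvariance


def liu_singh_q(x, y, depth):
    """Q(F_N1, G_N2) as in (1.14): fraction of pairs (X_i, Y_j) with
    depth(X_i) <= depth(Y_j), both depths taken with respect to sample x."""
    depth_x = [depth(xi, x) for xi in x]
    total = 0
    for yj in y:
        depth_yj = depth(yj, x)
        total += sum(d <= depth_yj for d in depth_x)
    return total / (len(x) * len(y))


def cdf_depth(point, sample):
    """D(x, F) = F(x) with F the empirical CDF (the Mann-Whitney case)."""
    return sum(s <= point for s in sample) / len(sample)


def mahalanobis_depth_1d(point, sample):
    """Univariate Mahalanobis depth: 1 / (1 + (x - mean)^2 / variance)."""
    mu = mean(sample)
    var = pvariance(sample)
    return 1.0 / (1.0 + (point - mu) ** 2 / var)


# Hypothetical data: with the CDF as depth, Q counts pairs with X_i <= Y_j.
q_mann_whitney = liu_singh_q([1.0, 2.0, 3.0], [2.5, 4.0], cdf_depth)
q_mahalanobis = liu_singh_q([-1.0, 0.0, 1.0], [0.5, 3.0], mahalanobis_depth_1d)
```

Passing the depth as a callable keeps the pair-counting step identical across depth functions, which is the same separation the graph $\mathcal{G}_D$ expresses: only the edge directions depend on $D$.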

1.3. Properties of Graph-Based Tests. A test function $\phi_N$ for the testing problem (1.2) is said to be asymptotically exact level $\alpha$ if $\lim_{N \to \infty} E_{H_0}(\phi_N) = \alpha$. An asymptotically exact level $\alpha$ test function $\phi_N$ is said to be consistent against the alternative $H_1$ if $\lim_{N \to \infty} E_{H_1}(\phi_N) = 1$, that is, the power of the test $\phi_N$ converges to 1 in the usual asymptotic regime (1.4). All the tests described in Sections 1.1 and 1.2 have the following common properties:

Asymptotically Distribution-Free: The standardized test statistic (1.5) converges to $N(0, \sigma_1^2)$ under $H_0$, where the limiting variance $\sigma_1^2$ depends on the graph functional $\mathcal{G}$, but not on the unknown distributions. Therefore, under the null, the asymptotic distribution of the test statistic (1.5) is distribution-free. This was proved for the FR-test by Henze and Penrose [], for the test based on the K-NN graph by Henze [], and by Liu and Singh [7] for depth-based tests. Finally, recall that the CM test is exactly distribution-free in finite samples [38].

Consistent Against Fixed Alternatives: For a geometric graph functional $\mathcal{G}$ in $\mathbb{R}^d$, the test function with rejection region

$\{R(\mathcal{G}(\mathcal{Z}_N)) < \sigma_1 z_\alpha\}$, (1.18)

where $z_\alpha$ is the $\alpha$-th quantile of the standard normal, is asymptotically exact level $\alpha$ and consistent against all fixed alternatives, for the testing problem (1.2). This was proved for the MST by Henze and Penrose [34] and for the K-NN graph by Schilling [39] and later by Henze []. Recently, Arias-Castro and Pelletier [3] showed that the CM test is consistent against general alternatives. Similarly, the Liu-Singh rank sum statistic^3 is asymptotically exact level $\alpha$ and consistent against alternatives for which the quality index $Q(F, G) \ne 1/2$.

Even though the basic asymptotic properties of the tests based on geometric graphs and those based on data-depth are quite similar, their algorithmic complexities are very different. Tests based on geometric graphs such as the MST, the K-NN graph, and the CM-test can be computed in polynomial time with respect to both the number of data points $N$ and the dimension $d$. For instance, the MST and the K-NN graph of a set of $N$ points in $\mathbb{R}^d$ can be computed easily in $O(dN^2)$ time, and the NBM matching can be computed in $O(dN^3)$ time [3]. On the other hand, depth-based tests are generally difficult to compute when the dimension is large because the computation time is polynomial in $N$ but exponential in $d$ (with the exception of Mahalanobis depth (1.17), which is easy to compute but gives rise to non-robust procedures because it involves the sample mean and covariance matrix). In fact, computing the Tukey depth of a set of $N$ points in $\mathbb{R}^d$ is NP-hard if both $N$ and $d$ are parts of the input [4], and it is even hard to approximate []. For more details on algorithms to compute various depth functions, refer to the recent survey of Rousseeuw and Hubert [36] and the references therein.

^3 For depth-based tests, the rejection region can be of the form $\{|R(\mathcal{G}(\mathcal{Z}_N))| \ge \sigma_1 z_{1-\alpha/2}\}$, depending on the alternative.

1.4. Summary of Results. The general notion of graph-based two-sample tests (1.3) introduced above provides a unified framework for analyzing and statistically comparing their asymptotic properties. We derive the limiting power of these tests against local alternatives,

that is, alternatives which shrink towards the null as the sample size grows to infinity. If the statistic (1.5) is asymptotically normal under the null, and the test based on (1.18) is consistent, it will generally have non-trivial power (greater than the size of the test) against $O(N^{-1/2})$ local alternatives. This can be formalized using the notion of Pitman efficiency of a test (see Definition 2.2 below), and can be used to compare the asymptotic performances of different tests. The following is a summary of the results obtained:

1. In Section 3 the asymptotic (Pitman) efficiency of a general graph-based test is derived. This result can be used as a black-box to derive the efficiency of any such two-sample test (Theorem 3. and Corollary 3.). The results obtained show how combinatorial properties of the underlying graph affect the performance of the associated test, which can be useful in the future for constructing and analyzing new tests. The results are illustrated through simulations and compared with other parametric tests in Section 5.

2. As a consequence of the general result, the asymptotic efficiency of the tests described in Sections 1.1 and 1.2 can be derived:

It is shown that the Friedman-Rafsky test has zero asymptotic efficiency, that is, it is powerless against any $O(N^{-1/2})$ local alternatives (Theorem 4.). In fact, this phenomenon extends to a large class of random geometric graphs that exhibit local spatial dependence. This can be formalized using the notion of stabilization of geometric graphs [34], which includes the MST and the K-NN graph, among others (Theorem 4.).

On the other hand, the tests based on data-depth (1.13) (see also Section 4.4) have non-trivial power, and hence non-zero asymptotic efficiencies, for many $O(N^{-1/2})$ alternatives (Theorem 4.).

The results above, together with the discussion at the end of Section 1.3, show that there is a very interesting trade-off between the asymptotic efficiency and the computation time of the tests based on geometric graphs and those based on data-depth: Tests based on geometric graphs can be computed efficiently; however, they are asymptotically inefficient. On the other hand, tests based on data-depth are difficult to compute, but generally have non-trivial efficiencies. This raises the following intriguing question: Is it possible to construct an asymptotically distribution-free test which has non-trivial asymptotic efficiency and is also computationally tractable when the dimension is large? In Section 4.5, we address the fundamental difficulties and discuss practical alternatives.

3. Finally, the performance of the Friedman-Rafsky test and the test based on the halfspace depth are compared on the sensorless drive diagnosis data set [7] (Section 6).

Preliminaries about asymptotic efficiency and Le Cam's theory are discussed in Section 2. Proofs of the results are given in Appendices A-F.

2. Preliminaries

Begin by recalling some definitions and notation: For two sequences $\{a_n\}_{n \ge 1}$ and $\{b_n\}_{n \ge 1}$, $a_n \lesssim b_n$ means $a_n = O(b_n)$; $a_n \sim b_n$ means $a_n = (1 + o(1)) b_n$. For two random sequences $\{a_n\}_{n \ge 1}$ and $\{b_n\}_{n \ge 1}$ defined on the same probability space, $a_n = O_P(b_n)$ means

$\lim_{M \to \infty} \lim_{n \to \infty} P(|a_n / b_n| > M) = 0$ and $a_n = o_P(b_n)$ means $a_n / b_n$ converges in probability to zero. Finally, for $x \in \mathbb{R}$, $x_+ = \max\{x, 0\}$, and for a vector $z \in \mathbb{R}^s$, $\|z\| = (\sum_{i=1}^s z_i^2)^{1/2}$ is the Euclidean norm of $z$.

For $S \subseteq \mathbb{R}^d$ measurable, denote by $\mathcal{B}(S)$ the Borel sigma-algebra generated by the open subsets of $S$. Let $(S_n, \mathcal{B}(S_n))$ be a sequence of measurable spaces with two probability distributions $P_n$ and $Q_n$. The sequence $\{Q_n\}_{n \ge 1}$ is said to be contiguous with respect to $\{P_n\}_{n \ge 1}$ if $\lim_{n \to \infty} P_n(E_n) = 0$ implies $\lim_{n \to \infty} Q_n(E_n) = 0$ for every sequence $\{E_n\}$ of measurable sets. Contiguity is relevant for analyzing test functions under local alternatives, that is, alternatives shrinking towards the null as the sample size goes to infinity.

To quantify the notion of local alternatives, let $\Theta \subseteq \mathbb{R}^p$ and $\{P_\theta\}_{\theta \in \Theta}$ be a parametric family of distributions in $\mathbb{R}^d$ with density $f(\cdot | \theta)$, with respect to Lebesgue measure, indexed by a $p$-dimensional parameter $\theta \in \Theta$. Denote the score function by $\eta(\cdot, \theta) = \frac{\nabla_\theta f(\cdot | \theta)}{f(\cdot | \theta)}$, and the Fisher information matrix by $I(\theta)$. To compute the asymptotic efficiency of tests, certain smoothness conditions are required on $f(\cdot | \theta)$. The standard technical condition is to assume differentiability in norm of the square root of the density:

Definition 2.1. [5, Definition..] The family $\{P_\theta\}_{\theta \in \Theta}$ is quadratic mean differentiable (QMD) at $\theta = \theta_0 \in \Theta$ if there exists a vector of real-valued functions $\nu(\cdot, \theta_0) = (\nu_1(\cdot, \theta_0), \nu_2(\cdot, \theta_0), \ldots, \nu_p(\cdot, \theta_0))$ such that

$\int_{\mathbb{R}^d} \left[ \sqrt{f(x | \theta_0 + h)} - \sqrt{f(x | \theta_0)} - \langle h, \nu(x, \theta_0) \rangle \right]^2 dx = o(\|h\|^2)$, as $\|h\| \to 0$.

The above condition holds for most standard families of distributions, including exponential families in natural form. Hereafter, we assume that $\{P_\theta\}_{\theta \in \Theta}$ is a QMD family such that the score function has bounded fourth moment.

Assumption 2.1. A parametric family of distributions $\{P_\theta\}_{\theta \in \Theta}$, with $\Theta \subseteq \mathbb{R}^p$, is said to satisfy Assumption 2.1 at $\theta = \theta_1$ if $P_\theta$ is QMD at $\theta = \theta_1$, and for $V \sim P_{\theta_1}$ and all $h \in \mathbb{R}^p$, $E(|\langle h, \eta(V, \theta_1) \rangle|^4) < \infty$.

Let $\mathcal{X}_{N_1}$ and $\mathcal{Y}_{N_2}$ be samples from $P_{\theta_1}$ and $P_{\theta_2}$ as in (1.1), respectively. For $h \in \mathbb{R}^p$, consider the testing problem

$H_0: \theta_2 - \theta_1 = 0$, versus $H_1: \theta_2 - \theta_1 = \frac{h}{\sqrt{N}}$. (2.1)

Note that the tests are still carried out in the non-parametric setup, assuming no knowledge of the distributions of the two samples. However, the efficiency is computed assuming a parametric form for the unknown distributions.^4

^4 Another choice of alternatives for computing the efficiency of a non-parametric test is to consider $H_0: g = f$, versus $H_1: g = (1 - \frac{\delta}{\sqrt{N}}) f + \frac{\delta}{\sqrt{N}} \tilde{g}$, for some density $\tilde{g}$ in $\mathbb{R}^d$ and $\delta > 0$. Then, under mild integrability assumptions, the densities associated with the alternative are contiguous, and the efficiency results derived in this paper easily extend to this case as well. We chose the formulation in (2.1) because it yields slightly cleaner asymptotic expansions and formulas.
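As a worked illustration of the quantities entering the local testing problem, consider the normal location family, chosen here only as an example (it is an exponential family in natural form, so it is QMD). The display below computes the score, the Fisher information, and the fourth-moment condition, under the reading that the alternative shifts the parameter by $h/\sqrt{N}$.

```latex
% Normal location family in R^d (illustrative choice), with p = d:
%   f(x | \theta) = (2\pi)^{-d/2} \exp(-\|x - \theta\|^2 / 2).
\begin{align*}
\eta(x, \theta) &= \frac{\nabla_\theta f(x \mid \theta)}{f(x \mid \theta)} = x - \theta, \\
I(\theta) &= E_\theta\!\left[\eta(V, \theta)\,\eta(V, \theta)^\top\right] = I_d,
  \qquad \langle h, I(\theta) h \rangle = \|h\|^2, \\
\langle h, \eta(V, \theta) \rangle &\sim N(0, \|h\|^2)
  \;\Longrightarrow\;
  E\!\left(|\langle h, \eta(V, \theta) \rangle|^4\right) = 3\|h\|^4 < \infty .
\end{align*}
% Under the local alternative, the second sample is drawn from
%   N(\theta_1 + h/\sqrt{N}, I_d),
% so the two means differ by a vector of length \|h\|/\sqrt{N}.
```

In this family the fourth-moment requirement on the score holds for every $h$, so the moment assumption is satisfied at every $\theta_1$.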

11 DISTRIBUTIO-FREE GRAPH-BASED TWO-SAMPLE TESTS Let Z = X Y be the pooled sample. The log-likelihood ratio for the testing problem (.) is: ( ) f Y j θ + h L := log = log f(z j θ + h ) {c j = }, (.) f(y j θ ) f(z j θ ) j= where c j is the label of Z j as in (.6). Since P θ is QMD, by Lehmann and Romano [5, Theorem..3], in the usual asymptotic regime (.4), ( L D q h, I(θ ) )h, q h, I(θ )h. This implies the joint distributions of X and Y under H 0 and H (as in (.)) are mutually contiguous [5, Corollary.3.]. Then by Le Cam s Third Lemma [5, Corollary.3.], if under the null H 0, ( ) R(G (Z )) L (( D j= 0 q h,i(θ )h ) ( σ, σ σ q h, I(θ )h )), (.3) then under the alternative H, R(G (Z )) D (σ, σ). Then the limiting power of the two-sample test based on G with rejection region (.8) is given by Φ(z α σ σ ), where Φ( ) is the standard normal distribution function. Definition.. The quantity (σ ) + σ will be referred to as the asymptotic (Pitman) efficiency of the test statistic R(G (Z )) for the testing problem (.), and will be denoted by AE(G ). The ratio of the asymptotic efficiencies of two test statistics is the relative (Pitman) efficiency, which is the ratio of the number of samples required to achieve the same limiting power between two level α-tests. Refer to the textbooks [8, 5, 4] for more on contiguity, local power and asymptotic efficiencies of tests. 3. Asymptotic Efficiency of Graph-Based Two-Sample Tests This section describes the main results about the asymptotic efficiencies of general graphbased two-sample tests. There are two cases, depending on whether the graph functional is directed or undirected. To begin, let G be a directed graph functional in R d. Denote by E(G (S)) the set of edges in G (S), and by E + (G (S)) the set of pairs of vertices with edges in both directions (that is, the set of ordered pairs of vertices (x, y) such that both (x, y) E(G (S)) and (y, x) E(G (S))), respectively. 
For x R d, let d (x, G (S)) be the in-degree of the vertex x in the graph G (S {x}), d (x, G (S)) be the out-degree of the vertex x in the graph G (S {x}), d(x, G (S)) = d (x, G (S)) + d (x, G (S)), be the total degree of the vertex x in the graph G (S {x}). Define the scaled in-degree and the scaled out-degree of a vertex as follows: λ (x, G (S)) = d (x, G (S x )) E(G (S x )), λ (x, G (S)) = d (x, G (S x )), (3.) E(G (S x ))

12 B. B. BHATTACHARYA where S x = S {x}. Also, let T (G (S)) = ( ) d (x, G (S)), T (G (S)) = ( ) d (x, G (S)) x S x S (3.) be the number of outward -stars and inward -stars in G (S), respectively. Finally, let T + (G (S)) be the number of -stars in G (S) with different directions on the two edges. For an undirected graph functional G, denote by d(x, G (S)) the degree of the vertex x in the graph G (S {x}). As in (3.) and (3.), let λ(x, G (S)) = d(x, G (Sx )), T E(G (S x (G (S)) = ( ) d(x, G (S)), (3.3) )) x S where S x = S {x}. Intuitively, the functions λ (x, G (S)), λ (x, G (S)), and λ(x, G (S)), measure the relative position of the point x in set S {x}. For example, in the FR-test or the test based on the K- graph, large values of λ correspond to points near the center of the data-cloud, whereas small values correspond to outliers. Similarly, for depth-based tests, small/large values of λ generally correspond to points which are outliers. In fact, for such tests, this is directly related to the relative outlyingness of the point x, as defined in (.) (see Observation F.). 3.. Asymptotic Efficiency for Directed Graph Functionals. Let G be a directed graph functional in R d and {P θ } θ Θ a parametric family of distributions satisfying Assumption. at θ = θ Θ R p. To derive the asymptotic efficiency of a general graph-based test using (.3) various assumptions are required on the graph functional G. The variance condition stated below requires that the proportion of edges and the (scaled) number of -stars converge, which ensures that the variance of R(G (Z )) converges (see Lemma A. in Appendix A.). Assumption 3.. (Variance Condition) The pair (G, P θ ) is said to satisfy the variance condition with parameters (β 0, β 0 +, β, β, β + ), if for V := {V, V,..., V } i.i.d. from P θ, there exist finite non-negative constants β 0, β 0 +, β, β, and β + (which do not depend on P θ ) such that (a) E(G (V )) (b) T (G (V )) E(G (V )) P β, (c) T + (G (V )) E(G (V )) P β +. 
P β 0, and E+ (G (V )) E(G (V )) P β + 0, T (G (V )) E(G (V )) P β, and The variance condition is a very mild requirement which is satisfied by all the graph functionals described in Section (see Section 4 for details). ext, to show (.3), the limiting covariance of the statistic R(G (Z )) and the loglikelihood ratio L has to be computed. To ensure the existence of this limit the following condition is required: Assumption 3.. (Covariance Condition) The pair (G, P θ ) is said to satisfy the covariance condition if for V := {V, V,..., V } i.i.d. from P θ the following holds:

13 DISTRIBUTIO-FREE GRAPH-BASED TWO-SAMPLE TESTS 3 (a) There exists functions λ, λ : R d R, such that for almost all z R d, λ (z) = lim E(λ (z, G (V ))), and λ (z) = lim E(λ (z, G (V ))), (3.4) and zero otherwise. (b) For h R p as in (.) ˆ h, η(v i, θ ) λ (V i, G (V )) P h, f(z θ ) λ (z)dz, (3.5) and the same holds for λ. Finally, to show (.3), the joint normality of R(G (Z )) and L must be proved. To this end, define (G (S)) := max i [] d(v i, G (S)) the total maximum degree of the graph G (S), where [] := {,,..., }. Assumption 3.3. (ormality Condition) The pair (G, P θ ) is said to satisfy the normality condition if for V := {V, V,..., V } i.i.d. from P θ either one of the following holds: (a) (Condition ) The maximum degree (b) (Condition ) The maximum degree (G (V )) P and (G (V )) = O P (). (3.6) (G (V )) E(G (V )) = O P (). (3.7) Remark 3.. Assumption 3.3 ensures that either (a) the graph has bounded degree (for a slightly weaker condition refer to Proposition B. in Appendix B), or (b) the maximum degree (G (V )) is of the same order as the average degree E(G (V )) /. The examples in Section satisfy either one of these conditions, and in Propositions B. and C. the joint normality of the likelihood ratio and the statistic (.5) is proved under condition and, respectively. However, determining a necessary and sufficient condition for the asymptotic normality of the statistic (.5) is a non-trivial problem. This can be reformulated as proving the CLT of the number of monochromatic edges in a randomly -colored graph, which has been extensively studied, especially in the equal sample sizes (p = q) case (refer to [6] and the references therein). The following result gives the asymptotic efficiency of a graph-based test, if the graph functional satisfies the above conditions. The proof of the theorem is given in Appendix A. Theorem 3.. Let G be a directed graph functional and {P θ } θ Θ a parametric family of distributions in R d satisfying Assumption. at θ = θ Θ R p. 
Suppose the pair (G, P θ ) satisfies the variance condition 3. with parameters (β 0, β 0 +, β, β, β + ), the covariance condition 3., and the normality condition 3.3. Then the asymptotic efficiency of the test statistic (.5) for the testing problem (.) is AE(G ) := r r ( p h, f(z θ ) λ (z)dz q h, f(z θ ) λ (z)dz ) + { ( )}, β 0 + qβ + pβ r β 0 + β+ 0 + β + β + β +

whenever the denominator above is strictly positive.

The graph functionals defined in Section 2 satisfy the variance, covariance, and normality conditions, and the above theorem can be used to compute their asymptotic efficiencies (see Section 4 for details). More generally, the efficiency formula shows how combinatorial properties of the underlying graph affect the performance of the associated test.

Remark 3.2. The efficiency of a test is governed by the functions $\lambda^{\uparrow}, \lambda^{\downarrow}$. Recall that $\lambda^{\uparrow}(z), \lambda^{\downarrow}(z)$ are the scaled out/in-degrees of the vertex $z$ in the graph constructed with $z$ and the other data points. Note that for points close to the center of the data cloud, $\lambda^{\uparrow}(z)$ and/or $\lambda^{\downarrow}(z)$ will be large, that is, these functions behave like depth measures of the data set. In fact, it will be shown in Section 4.4 that, for tests based on depth functions, they are directly related to the relative outlyingness of the associated depth function. On the other hand, in Section 4.2 it will be shown that, for tests based on stabilizing geometric graphs, $\lambda^{\uparrow}, \lambda^{\downarrow}$ are constant functions. Intuitively, this means that for these graphs the degree of a new point is blind to the geometry of the data distribution and, as a consequence, the associated test has zero efficiency.

The proof of Theorem 3.1 also gives the limiting null distribution of the test statistic (.5).

Remark 3.3. Let $\mathcal{G}$ be a directed graph functional and $f$ and $g$ be two densities in $\mathbb{R}^d$. Note that the variance condition 3.1 and the normality condition 3.3 naturally extend to the pair $(\mathcal{G}, P_f)$, where $P_f$ is the distribution corresponding to $f$. The proof of Theorem 3.1 shows that if the pair $(\mathcal{G}, P_f)$ satisfies the variance condition 3.1 and the normality condition 3.3, then under the null hypothesis ($f = g$), $R(\mathcal{G}(Z_N)) \xrightarrow{D} N(0, \sigma^2)$, where $\sigma$ is the denominator of the formula in Theorem 3.1. This gives the asymptotic null distribution of a general graph-based two-sample test.
In particular, this gives a unified proof for the asymptotic null distribution of the FR-test [, Theorem ], the K-NN test [39, Theorem 3.], and the Liu-Singh rank sum statistic [7, Theorem 6.].

Remark 3.4. The condition that the denominator in the formula in Theorem 3.1 is positive ensures that the limiting variance of the statistic (.5) is positive (see Lemma A. in Appendix A.). If the limiting variance happens to be zero, then the statistic does not have a non-degenerate distribution at the $\sqrt{N}$-scale and the asymptotic efficiency against the alternatives (.) is undefined.

3.. Asymptotic Efficiency for Undirected Graph Functionals. Every undirected graph functional $\mathcal{G}$ can be modified to a directed graph functional $\mathcal{G}^+$ in a natural way: for $S \subset \mathbb{R}^d$ finite, $\mathcal{G}^+(S)$ is obtained by replacing every edge in $\mathcal{G}(S)$ with two edges, one in each direction. The asymptotic efficiency of the test based on $\mathcal{G}$ can then be derived by applying Theorem 3.1 to the directed graph functional $\mathcal{G}^+$. The following is the analogue of the above assumptions for undirected graph functionals.

Assumption 3.4. Let $\mathcal{G}$ be an undirected graph functional and let $V_N := \{V_1, V_2, \ldots, V_N\}$ be i.i.d. from $P_{\theta_1}$. The pair $(\mathcal{G}, P_{\theta_1})$ is said to satisfy Assumption 3.4 if the following hold:

15 DISTRIBUTIO-FREE GRAPH-BASED TWO-SAMPLE TESTS 5 ((γ 0, γ )-Undirected Variance Condition) There exists finite non-negative constants γ 0, γ such that E(G (V )) P γ 0, and T (G (V )) P γ E(G (V )). (3.8) (Undirected Covariance Condition) There exists a function λ : R d R, such that for almost all z R d, lim E(λ(z, G (V ))) λ(z), and zero otherwise. Moreover, for h R p as in (.) ˆ h, η(v i, θ ) λ(v i, G (V )) P h, f(z θ ) λ(z)dz. (3.9) (ormality Condition) The pair (G +, P θ ) satisfies the normality condition as in Assumption 3.3. If G satisfies the above condition, then G + satisfies Assumptions (see Appendix A.3.), and applying Theorem 3. for G + we get the following: Corollary 3.. Let G be an undirected graph functional and {P θ } θ Θ a parametric family of distributions in R p satisfying Assumption. at θ = θ Θ. If the pair (G, P θ ) satisfies Assumption 3.4, then the asymptotic efficiency of the test statistic (.5) is ( r AE(G ) = (p q) h, f(z θ ) λ(z)dz ) + r {γ0 ( r) + (γ )( r)}, (3.0) whenever the denominator above is strictly positive. Remark 3.5. ote that the numerator in (3.0) is zero when p = q = /, that is, when the two sample sizes and are asymptotically equal. This is because, conditional on the graph, the variables {ψ(c i, c j ) : (Z i, Z j ) E(G (Z ))} are pairwise independent when p = q, and, as a result, the test statistic (.5) does not correlate with the likelihood ratio. Therefore, depending on the value of the denominator in (3.0) the following two cases arise: If the graph functional is non-sparse, that is, E(G (V )) /, then γ 0 = 0 in (3.8) and the denominator in (3.0) is zero, since r = pq = /. Therefore, non-sparse undirected graph functionals do not have a non-degenerate distribution at the scale when p = q, and Corollary 3. does not apply. If the graph is sparse, that is, γ 0 > 0, then the denominator of (3.0) is non-zero when p = q. 
This shows that tests based on sparse graph functionals cannot have non-zero efficiency when the two sample sizes are asymptotically equal.

4. Applications

In this section we compute the asymptotic efficiencies of the tests described in Section 2 using Theorem 3.1 and Corollary 3.1. Extensions and generalizations which can be used to construct locally efficient tests are also discussed.
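The pairwise-independence claim in Remark 3.5 can be checked by direct enumeration. The sketch below is only an illustration and makes a simplifying assumption that is not part of the original argument: labels are drawn independently, each point joining the first sample with probability $p$, rather than by a permutation with fixed sample sizes. For two edges $(a, b)$ and $(b, c)$ that share a vertex, it computes the exact covariance of the two cross-edge indicators. The function name `cross_edge_covariance` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def cross_edge_covariance(p):
    """Exact covariance of the cross-edge indicators of edges (a, b) and (b, c).

    Assumption (simplification): the labels of a, b, c are independent,
    equal to 1 with probability p and to 2 with probability q = 1 - p.
    """
    q = 1 - p
    prob = {1: p, 2: q}
    e_ab = e_bc = e_both = Fraction(0)
    for la, lb, lc in product((1, 2), repeat=3):
        weight = prob[la] * prob[lb] * prob[lc]
        ab = int(la != lb)  # edge (a, b) joins the two samples
        bc = int(lb != lc)  # edge (b, c) joins the two samples
        e_ab += weight * ab
        e_bc += weight * bc
        e_both += weight * ab * bc
    return e_both - e_ab * e_bc


if __name__ == "__main__":
    for p in (Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)):
        print(p, cross_edge_covariance(p))
```

Under this simplification the covariance equals $pq(1 - 4pq)$, which vanishes exactly when $p = q = 1/2$, in line with the remark.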

4.1. The Friedman-Rafsky (FR) Test. Let $\mathcal{T}$ be the MST functional as in Definition . and consider the two-sample test based on $\mathcal{T}$ (.8). Examples suggested that this test has zero asymptotic efficiency, which is formalized in the following theorem:

Theorem 4.1. Let $\{P_\theta\}_{\theta \in \Theta}$ be a parametric family of distributions in $\mathbb{R}^d$ satisfying Assumption. at $\theta = \theta_1 \in \Theta$. Then the asymptotic efficiency of the Friedman-Rafsky test (.8), for the testing problem (.), is zero, that is, $AE(\mathcal{T}) = 0$.

The FR-test has zero asymptotic efficiency because the function $\lambda(z)$ does not depend on $z \in \mathbb{R}^d$, and hence the numerator in (3.10) is zero. This is a consequence of the following well-known result: for $V_N = \{V_1, V_2, \ldots, V_N\}$ i.i.d. $P_{\theta_1}$, $\lim_{N \to \infty} E(\lambda_N(z, \mathcal{T}(V_N))) = 2$ (refer to [, Proposition ]). The Efron-Stein inequality [5] can then be used to show that $(\mathcal{T}, P_{\theta_1})$ satisfies the covariance condition with the function $\lambda(z) = 2$, for all $z \in \mathbb{R}^d$. The details of the proof are given in Appendix D.

4.2. Tests Based on Stabilizing Graphs. The proof of Theorem 4.1 uses the fact that the MST graph functional has local dependence, that is, addition or deletion of a point only affects the edges incident on the neighborhood of that point. This phenomenon holds for many other random geometric graphs and was formalized by Penrose and Yukich [34] using the notion of stabilization.

Let $\mathcal{G}$ be a graph functional defined for all locally finite subsets of $\mathbb{R}^d$. (The K-NN graph can be naturally extended to locally finite infinite point sets. Aldous and Steele [] extended the MST graph functional to locally finite infinite point sets using Prim's algorithm.) For $S \subset \mathbb{R}^d$ locally finite and $x \in \mathbb{R}^d$, let $E(x, \mathcal{G}(S))$ be the set of edges incident on $x$ in $\mathcal{G}(S \cup \{x\})$. Note that $|E(x, \mathcal{G}(S))| = d(x, \mathcal{G}(S))$, the (total) degree of the vertex $x$ in $\mathcal{G}(S \cup \{x\})$.

Definition 4.1. Given $S \subset \mathbb{R}^d$, $y \in \mathbb{R}^d$, and $a \in \mathbb{R}$, denote $y + S := \{y + z : z \in S\}$ and $aS := \{az : z \in S\}$.
A graph functional $\mathcal{G}$ is said to be translation invariant if the graphs $\mathcal{G}(x + S)$ and $\mathcal{G}(S)$ are isomorphic for all points $x \in \mathbb{R}^d$ and all locally finite $S \subset \mathbb{R}^d$. A graph functional $\mathcal{G}$ is scale invariant if $\mathcal{G}(aS)$ and $\mathcal{G}(S)$ are isomorphic for all $a \in \mathbb{R}$ and all locally finite $S \subset \mathbb{R}^d$.

Let $\mathcal{P}_\lambda$ be the Poisson process of intensity $\lambda > 0$ in $\mathbb{R}^d$, and $\mathcal{P}_{\lambda, x} := \mathcal{P}_\lambda \cup \{x\}$, for $x \in \mathbb{R}^d$. Penrose and Yukich [34] defined stabilization of graph functionals over homogeneous Poisson processes as follows:

Definition 4.2 (Penrose and Yukich [34]). A translation and scale invariant graph functional $\mathcal{G}$ stabilizes $\mathcal{P}_\lambda$ if there exists a random but almost surely finite variable $R$ such that

$$E(0, \mathcal{G}(\mathcal{P}_{\lambda, 0})) = E\big(0, \mathcal{G}((\mathcal{P}_{\lambda, 0} \cap B(0, R)) \cup A)\big), \tag{4.1}$$

for all finite $A \subset \mathbb{R}^d \setminus B(0, R)$, where $B(0, R)$ is the (Euclidean) ball of radius $R$ centered at the point $0 \in \mathbb{R}^d$.

Informally, stabilization ensures that the insertion of a point (or finitely many points) far away from the origin $0$ does not affect the degree of $0$ in the graph $\mathcal{G}(\mathcal{P}_{\lambda, 0})$, that is, it has only a local effect. Many geometric graphs, like the MST, the K-NN graph, and the Delaunay graph, are stabilizing [34].
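The locality expressed by stabilization can be illustrated on a finite point set. The sketch below is a hypothetical toy experiment, not the construction of Penrose and Yukich: it builds a directed K-NN graph on a jittered grid around the origin, adds one far-away point, and compares the edges incident on the origin before and after. The helper names `knn_edges` and `edges_at`, the grid, and the location of the added point are all assumptions chosen for illustration.

```python
import math
import random


def knn_edges(points, k):
    """Directed K-NN graph: edge (i, j) if j is among the k nearest points to i."""
    edges = set()
    for i, a in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(a, points[j]),
        )
        edges.update((i, j) for j in others[:k])
    return edges


def edges_at(edges, v):
    """All directed edges incident on vertex v (incoming or outgoing)."""
    return {e for e in edges if v in e}


if __name__ == "__main__":
    rng = random.Random(0)
    # Vertex 0 is the origin; the remaining points form a jittered grid around it.
    cloud = [(0.0, 0.0)] + [
        (x + rng.uniform(-0.1, 0.1), y + rng.uniform(-0.1, 0.1))
        for x in range(-5, 6)
        for y in range(-5, 6)
        if (x, y) != (0, 0)
    ]
    before = edges_at(knn_edges(cloud, k=3), 0)
    after = edges_at(knn_edges(cloud + [(100.0, 100.0)], k=3), 0)
    print(before == after)
```

In this toy setting the far-away point attaches to points near the corner of the grid, so the edge set at the origin is unchanged.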

Asymptotic Efficiency of Stabilizing Graphs. The following theorem shows that tests based on stabilizing graph functionals have zero asymptotic efficiency.

Theorem 4.2. Let $\{P_\theta\}_{\theta \in \Theta}$ be a parametric family of distributions in $\mathbb{R}^d$ satisfying Assumption. at $\theta = \theta_1 \in \Theta$, and $\mathcal{G}$ be a translation and scale invariant graph functional which stabilizes $\mathcal{P}_1$. If the pair $(\mathcal{G}, P_{\theta_1})$ satisfies Assumption 3.3 and

$$\sup_{N} \sup_{z \in \mathbb{R}^d} E\big(d(z, \mathcal{G}(Z_N))^s\big) < \infty, \tag{4.2}$$

for some $s > 4$, then the asymptotic efficiency of the two-sample test based on $\mathcal{G}$ (.5), for the testing problem (.), is zero, that is, $AE(\mathcal{G}) = 0$.

The outline of the proof is as follows (refer to Appendix D for details): for undirected stabilizing graph functionals, $|E(\mathcal{G}(V_N))|/N \xrightarrow{P} \frac{1}{2} E\,d(0, \mathcal{G}(\mathcal{P}_1))$ and $E\,d(z, \mathcal{G}(V_N)) \to E\,d(0, \mathcal{G}(\mathcal{P}_1))$, where $V_N = \{V_1, V_2, \ldots, V_N\}$ are i.i.d. $P_{\theta_1}$. Therefore, $\lambda_N(z, V_N) \to 2$ and, as in the FR-test, it can be shown that $(\mathcal{G}, P_{\theta_1})$ satisfies the covariance condition with $\lambda(z) = 2$, for all $z \in \mathbb{R}^d$. Thus, the numerator in (3.10) is zero, and the result follows. The proof holds almost verbatim for directed stabilizing graph functionals as well.

Applications of Theorem 4.2. Graph functionals such as the MST and the K-NN, and others not considered in this paper, like the Delaunay graph and the Gabriel graph, are stabilizing [34]. Theorem 4.2 can be used to show that tests based on such graph functionals have zero asymptotic efficiency.

Example 4.1 (Minimum Spanning Tree (MST)). By [34, Lemma .] the MST graph functional $\mathcal{T}$ stabilizes $\mathcal{P}_1$. Moreover, the degree of a vertex in the MST of a set of points in $\mathbb{R}^d$ is bounded by a constant $B_d$, depending only on the dimension $d$ [, Lemma 4]. Therefore, the normality condition in Assumption 3.3 and the moment condition (4.2) are trivially satisfied. Theorem 4.2 then implies $AE(\mathcal{T}) = 0$, thus re-deriving Theorem 4.1.

Example 4.2 (K-Nearest Neighbor (NN)). By [33, Lemma 6.], the K-NN graph functional $\mathcal{N}_K$ stabilizes $\mathcal{P}_1$. Condition 1 in Assumption 3.3 and the moment condition (4.2) are trivially satisfied, since $\mathcal{N}_K$ is a bounded degree graph functional. This implies $AE(\mathcal{N}_K) = 0$, showing that tests based on K-NN graphs have no asymptotic local power.

Cross-Match (CM) Test. Let $\mathcal{W}$ be the minimum non-bipartite matching (NBM) graph functional as in Definition .4. It is unknown whether $\mathcal{W}$ is stabilizing [34], and so Theorem 4.2 cannot be applied to compute the asymptotic efficiency of the cross-match test. However, in this case, Corollary 3.1 can be used directly to derive the following:

Corollary 4.3. Let $\{P_\theta\}_{\theta \in \Theta}$ be a parametric family of distributions in $\mathbb{R}^d$ satisfying Assumption. at $\theta = \theta_1 \in \Theta$. Then the asymptotic efficiency of the cross-match (CM) test (.0), for the testing problem (.), is zero, that is, $AE(\mathcal{W}) = 0$.

Proof. Let $N$ be even and $V_N = \{V_1, V_2, \ldots, V_N\}$ i.i.d. $P_{\theta_1}$. In this case, $|E(\mathcal{W}(V_N))| = N/2$ and $T_2(\mathcal{W}(V_N)) = 0$. Therefore, $(\mathcal{W}, P_{\theta_1})$ satisfies the $(2, 0)$-undirected variance condition (3.8). The normality condition (3.6) holds, since $d(V_i, \mathcal{W}(V_N)) = 1$ for all $V_i \in V_N$. This also implies that the undirected covariance condition holds with $\lambda(z) = 2$, for all $z \in \mathbb{R}^d$. The result then follows by Corollary 3.1.
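The combinatorial quantities used in the proof above are easy to compute for any finite graph. The following sketch assumes, as a reading of the garbled notation rather than something stated in this excerpt, that the second quantity in the variance condition counts 2-stars, that is, unordered pairs of edges sharing a vertex. The function name `edge_and_two_star_counts` is hypothetical, and the input is taken to be a simple undirected graph given as a list of vertex pairs.

```python
from collections import Counter
from math import comb


def edge_and_two_star_counts(edges):
    """Return (number of edges, number of 2-stars, maximum degree).

    A 2-star is an unordered pair of edges sharing a vertex, so the count
    equals the sum over vertices of C(degree, 2).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    two_stars = sum(comb(d, 2) for d in degree.values())
    return len(edges), two_stars, max(degree.values(), default=0)


if __name__ == "__main__":
    n = 10
    matching = [(i, i + 1) for i in range(0, n, 2)]  # a perfect matching on n points
    path = [(i, i + 1) for i in range(n - 1)]        # a spanning path, for contrast
    print(edge_and_two_star_counts(matching))
    print(edge_and_two_star_counts(path))
```

For a perfect matching on $N$ points this gives $N/2$ edges, no 2-stars, and maximum degree 1, which are the three facts the proof relies on.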

4.4. Depth-Based Tests. Let $Z_N = \mathscr{X}_{N_1} \cup \mathscr{Y}_{N_2}$ be the pooled sample, and $F_{N_1}$ the empirical distribution of $\mathscr{X}_{N_1}$. The two-sample test based on a depth function $D$ (4.4) rejects for large values of $T(\mathcal{G}_D(Z_N))$, where the graph $\mathcal{G}_D(Z_N)$ has vertex set $Z_N$ with a directed edge $(Z_i, Z_j)$ whenever $D(Z_i, F_{N_1}) \le D(Z_j, F_{N_1})$. If the depth function $D(X, F)$, where $X \sim F$, has a continuous distribution, then $\mathcal{G}_D(Z_N)$ is a complete graph ($|E(\mathcal{G}_D(Z_N))| = N(N-1)/2$) with directions on the edges depending on the relative ordering of the depths of the two endpoints.

Definition 4.3. Let $F$ be a distribution function in $\mathbb{R}^d$ with empirical distribution function $F_N$. A depth function $D$ is said to be good with respect to $F$ if

(A1) For $X \sim F$, the distribution of $D(X, F)$ is continuous.

(A2) $P(y_1 \le D(Y, F) \le y_2) \le C |y_1 - y_2|$, for some constant $C$ and any $y_1, y_2 \in [0, 1]$.

(A3) $\sup_{x \in \mathbb{R}^d} |D(x, F_N) - D(x, F)| = o(1)$ almost surely and in expectation.

The standard depth functions, such as the halfspace depth, the Mahalanobis depth, and the projection depth, among others, satisfy the above conditions for any continuous distribution function $F$. The following result gives the asymptotic efficiencies of tests based on such depth functions. To this end, let $\{P_\theta\}_{\theta \in \Theta}$ be a parametric family of distributions in $\mathbb{R}^d$ satisfying Assumption. at $\theta = \theta_1 \in \Theta \subseteq \mathbb{R}^p$. Moreover, let $F_\theta$ be the distribution function of $P_\theta$.

Theorem 4.4. The asymptotic efficiency of the two-sample test (4.4) based on a good depth function $D$ (with respect to $F_{\theta_1}$), for the testing problem (.), is

$$AE(\mathcal{G}_D) = \sqrt{6r} \int \langle h, \nabla f(x|\theta_1) \rangle \, R(x, F_{\theta_1})\,dx, \tag{4.3}$$

where $R(x, F_{\theta_1})$ is as defined in (.).

The proof of the theorem is given in Appendix F. It entails verifying the different conditions in Theorem 3.1 for the directed graph functional $\mathcal{G}_D$, for a good depth function $D$.

Example 4.3 (Mann-Whitney U-Statistic). This is one of the most popular univariate two-sample tests, which corresponds to (4.4) with $D(x, F) = F(x)$.
This implies $R(x, F) = F(x)$ and, for $\theta = \theta_1 \in \Theta$, by Theorem 4.4, the asymptotic efficiency of the Mann-Whitney test is

$$\sqrt{6r} \int \langle h, \nabla f(x|\theta_1) \rangle \, R(x, F_{\theta_1})\,dx = \sqrt{6r} \int \langle h, \nabla f(x|\theta_1) \rangle \, F_{\theta_1}(x)\,dx. \tag{4.4}$$

Tests based on depth functions generally have non-trivial local power against $O(N^{-1/2})$ alternatives, especially for scale problems. However, they have no power in location problems, as discussed below:

Example 4.4 (Location Family). As in Example 5., consider the parametric family $P_\theta \sim N(\theta, I)$. Then $R_{MD}(y, F_\theta) = R_D(y, F_\theta)$, where $F_\theta$ is the distribution function of $P_\theta$, for any depth function $D$ which is affine-invariant and satisfies the strict monotonicity property (refer to [7, Theorem 5.] for details). Now,

$$R_{MD}(y, F_\theta) = P_{F_\theta}\big(X : MD(X, F_\theta) \le MD(y, F_\theta)\big) = P\big(\chi^2_d > (y - \theta)^T (y - \theta)\big).$$

Then

$$\int \langle h, \nabla f(x|\theta) \rangle \, R(x, F_\theta)\,dx = \int_{\mathbb{R}^d} \frac{\langle h, z \rangle \, e^{-z^T z/2}}{(2\pi)^{d/2}} \, P(\chi^2_d > z^T z)\,dz = 0,$$

since the integrand is an odd function of $z$. This implies that the asymptotic efficiency of any depth-based test is zero for the normal location problem, as seen in Example 5. for the halfspace depth (HD).⁵ In fact, the same argument shows that depth-based tests have no power against location alternatives for any elliptical distribution [6].

Discussion. The examples above raise the following question: Can one construct computationally efficient and distribution-free two-sample tests which are consistent against a large class of alternatives and have non-zero Pitman efficiency? Theorem 3.1 shows that to obtain two-sample tests with non-zero efficiency, one has to construct graph functionals for which the degree functions $\lambda^{\uparrow}/\lambda^{\downarrow}$ (defined in (3.4)) are non-constant. One natural approach is to consider inhomogeneous (weighted) nearest-neighbor graphs: given a weight function $w : \mathbb{R}^d \to \{0, 1, \ldots, K\}$, where $K$ is a fixed positive integer, the directed $w$-weighted nearest neighbor graph functional, denoted by $\mathcal{N}_w$, is defined as follows: for a finite set $S \subset \mathbb{R}^d$, $\mathcal{N}_w(S)$ is a graph with vertex set $S$ and directed edges $(a, b)$, for $a, b \in S$, if the Euclidean distance between $a$ and $b$ is among the $w(a)$ smallest distances from $a$ to any other point in $S$. (Note that the unweighted directed K-NN graph corresponds to the constant weight function $K$.) Similar to the unweighted K-NN graph, the weighted nearest-neighbor graph can be computed efficiently in any dimension. Moreover, our general result implies that the asymptotic efficiency of the two-sample test based on $\mathcal{N}_w$ is⁶

$$AE(\mathcal{N}_w) = \frac{r(p - q)}{\sigma_w} \int \langle h, \nabla f(z|\theta_1) \rangle \, w(z)\,dz, \tag{4.5}$$

where $\sigma_w^2$ is the asymptotic variance of the test statistic (.5) (which can be calculated using Lemma A. in Appendix A.). This shows that if $p \ne q$, one can, in principle, choose a weight function $w$ such that the corresponding test has non-zero efficiency.⁷ The optimal weight function, which maximizes the asymptotic efficiency, is

$$w(z) = K \cdot \mathbf{1}\{(p - q) \langle h, \nabla f(z|\theta_1) \rangle > 0\}. \tag{4.6}$$

However, this is practically useless because we do not know $f$. It is possible to use (4.6) by estimating $f$ from the data; however, in doing so, the test will no longer be asymptotically distribution-free.

The results above illustrate a fundamental difficulty in constructing distribution-free two-sample tests, which are generally based on the number of edges across the two samples, that are both computationally and statistically efficient. A natural practical way to reduce the computational cost and maintain non-zero Pitman efficiency is by projection: For example,

⁵ The univariate depth function $D(x, F) = F(x)$ does not satisfy the strict monotonicity property. In fact, (4.4) implies that the asymptotic efficiency of the Mann-Whitney test for the normal location problem is $\sqrt{6r} \int f^2(x|\theta_1)\,dx > 0$, unlike the strictly monotonic depth functions.

⁶ For well-behaved weight functions, that is, for each $a \in \{0, 1, \ldots, K\}$, the set $B_a := \{x \in \mathbb{R}^d : w(x) = a\}$ is either empty or has a non-empty interior.

⁷ The test has no power when $p = q$, irrespective of the choice of $w$, for reasons discussed in Remark 3.5.
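The directed $w$-weighted nearest-neighbor graph described in the Discussion can be sketched in a few lines. This is a minimal brute-force illustration under stated assumptions, not an efficient or authoritative implementation: the function name `weighted_nn_edges`, the tie-breaking by index order, and the example weight function are all hypothetical choices made here.

```python
import math


def weighted_nn_edges(points, weight):
    """Directed w-weighted nearest-neighbor graph on a finite point set.

    There is an edge (i, j) when point j is among the weight(points[i])
    nearest points to point i in Euclidean distance. Ties are broken by
    index order (an arbitrary choice made for this sketch).
    """
    edges = set()
    for i, a in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(a, points[j]),
        )
        edges.update((i, j) for j in others[: weight(a)])
    return edges


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
    # Hypothetical weight: two neighbours for points left of x = 2, none otherwise.
    w = lambda z: 2 if z[0] < 2 else 0
    print(sorted(weighted_nn_edges(pts, w)))
```

Passing a constant weight function recovers the ordinary directed K-NN graph, as noted in the text.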


More information

Strongly chordal and chordal bipartite graphs are sandwich monotone

Strongly chordal and chordal bipartite graphs are sandwich monotone Strongly chordal and chordal bipartite graphs are sandwich monotone Pinar Heggernes Federico Mancini Charis Papadopoulos R. Sritharan Abstract A graph class is sandwich monotone if, for every pair of its

More information

Measures. Chapter Some prerequisites. 1.2 Introduction

Measures. Chapter Some prerequisites. 1.2 Introduction Lecture notes Course Analysis for PhD students Uppsala University, Spring 2018 Rostyslav Kozhan Chapter 1 Measures 1.1 Some prerequisites I will follow closely the textbook Real analysis: Modern Techniques

More information

CHAPTER 6. Differentiation

CHAPTER 6. Differentiation CHPTER 6 Differentiation The generalization from elementary calculus of differentiation in measure theory is less obvious than that of integration, and the methods of treating it are somewhat involved.

More information

Lebesgue Measure on R n

Lebesgue Measure on R n CHAPTER 2 Lebesgue Measure on R n Our goal is to construct a notion of the volume, or Lebesgue measure, of rather general subsets of R n that reduces to the usual volume of elementary geometrical sets

More information

Some Background Material

Some Background Material Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important

More information

Hard-Core Model on Random Graphs

Hard-Core Model on Random Graphs Hard-Core Model on Random Graphs Antar Bandyopadhyay Theoretical Statistics and Mathematics Unit Seminar Theoretical Statistics and Mathematics Unit Indian Statistical Institute, New Delhi Centre New Delhi,

More information

Optimal global rates of convergence for interpolation problems with random design

Optimal global rates of convergence for interpolation problems with random design Optimal global rates of convergence for interpolation problems with random design Michael Kohler 1 and Adam Krzyżak 2, 1 Fachbereich Mathematik, Technische Universität Darmstadt, Schlossgartenstr. 7, 64289

More information

The Strong Largeur d Arborescence

The Strong Largeur d Arborescence The Strong Largeur d Arborescence Rik Steenkamp (5887321) November 12, 2013 Master Thesis Supervisor: prof.dr. Monique Laurent Local Supervisor: prof.dr. Alexander Schrijver KdV Institute for Mathematics

More information

Multi-coloring and Mycielski s construction

Multi-coloring and Mycielski s construction Multi-coloring and Mycielski s construction Tim Meagher Fall 2010 Abstract We consider a number of related results taken from two papers one by W. Lin [1], and the other D. C. Fisher[2]. These articles

More information

Construction of a general measure structure

Construction of a general measure structure Chapter 4 Construction of a general measure structure We turn to the development of general measure theory. The ingredients are a set describing the universe of points, a class of measurable subsets along

More information

Estimates for probabilities of independent events and infinite series

Estimates for probabilities of independent events and infinite series Estimates for probabilities of independent events and infinite series Jürgen Grahl and Shahar evo September 9, 06 arxiv:609.0894v [math.pr] 8 Sep 06 Abstract This paper deals with finite or infinite sequences

More information

The minimum G c cut problem

The minimum G c cut problem The minimum G c cut problem Abstract In this paper we define and study the G c -cut problem. Given a complete undirected graph G = (V ; E) with V = n, edge weighted by w(v i, v j ) 0 and an undirected

More information

The Canonical Gaussian Measure on R

The Canonical Gaussian Measure on R The Canonical Gaussian Measure on R 1. Introduction The main goal of this course is to study Gaussian measures. The simplest example of a Gaussian measure is the canonical Gaussian measure P on R where

More information

The number of Euler tours of random directed graphs

The number of Euler tours of random directed graphs The number of Euler tours of random directed graphs Páidí Creed School of Mathematical Sciences Queen Mary, University of London United Kingdom P.Creed@qmul.ac.uk Mary Cryan School of Informatics University

More information

More on Unsupervised Learning

More on Unsupervised Learning More on Unsupervised Learning Two types of problems are to find association rules for occurrences in common in observations (market basket analysis), and finding the groups of values of observational data

More information

LECTURE Itineraries We start with a simple example of a dynamical system obtained by iterating the quadratic polynomial

LECTURE Itineraries We start with a simple example of a dynamical system obtained by iterating the quadratic polynomial LECTURE. Itineraries We start with a simple example of a dynamical system obtained by iterating the quadratic polynomial f λ : R R x λx( x), where λ [, 4). Starting with the critical point x 0 := /2, we

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

Chapter 4. Theory of Tests. 4.1 Introduction

Chapter 4. Theory of Tests. 4.1 Introduction Chapter 4 Theory of Tests 4.1 Introduction Parametric model: (X, B X, P θ ), P θ P = {P θ θ Θ} where Θ = H 0 +H 1 X = K +A : K: critical region = rejection region / A: acceptance region A decision rule

More information

Measurable functions are approximately nice, even if look terrible.

Measurable functions are approximately nice, even if look terrible. Tel Aviv University, 2015 Functions of real variables 74 7 Approximation 7a A terrible integrable function........... 74 7b Approximation of sets................ 76 7c Approximation of functions............

More information

Abstract. 2. We construct several transcendental numbers.

Abstract. 2. We construct several transcendental numbers. Abstract. We prove Liouville s Theorem for the order of approximation by rationals of real algebraic numbers. 2. We construct several transcendental numbers. 3. We define Poissonian Behaviour, and study

More information

Module 3. Function of a Random Variable and its distribution

Module 3. Function of a Random Variable and its distribution Module 3 Function of a Random Variable and its distribution 1. Function of a Random Variable Let Ω, F, be a probability space and let be random variable defined on Ω, F,. Further let h: R R be a given

More information

A weighted edge-count two-sample test for multivariate and object data

A weighted edge-count two-sample test for multivariate and object data A weighted edge-count two-sample test for multivariate and object data Hao Chen, Xu Chen, and Yi Su arxiv:1604.06515v1 [stat.me] 22 Apr 2016 Abstract Two-sample tests for multivariate data and non-euclidean

More information

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS Maya Gupta, Luca Cazzanti, and Santosh Srivastava University of Washington Dept. of Electrical Engineering Seattle,

More information

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R. Ergodic Theorems Samy Tindel Purdue University Probability Theory 2 - MA 539 Taken from Probability: Theory and examples by R. Durrett Samy T. Ergodic theorems Probability Theory 1 / 92 Outline 1 Definitions

More information

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS PROBABILITY: LIMIT THEOREMS II, SPRING 218. HOMEWORK PROBLEMS PROF. YURI BAKHTIN Instructions. You are allowed to work on solutions in groups, but you are required to write up solutions on your own. Please

More information

Mathematical Preliminaries

Mathematical Preliminaries Chapter 33 Mathematical Preliminaries In this appendix, we provide essential definitions and key results which are used at various points in the book. We also provide a list of sources where more details

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

Regression Clustering

Regression Clustering Regression Clustering In regression clustering, we assume a model of the form y = f g (x, θ g ) + ɛ g for observations y and x in the g th group. Usually, of course, we assume linear models of the form

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Weihua Zhou 1 University of North Carolina at Charlotte and Robert Serfling 2 University of Texas at Dallas Final revision for

More information

Statistical Depth Function

Statistical Depth Function Machine Learning Journal Club, Gatsby Unit June 27, 2016 Outline L-statistics, order statistics, ranking. Instead of moments: median, dispersion, scale, skewness,... New visualization tools: depth contours,

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

A new graph-based two-sample test for multivariate and object data

A new graph-based two-sample test for multivariate and object data A new graph-based two-sample test for multivariate and object data Hao Chen Department of Statistics, University of California, Davis and Jerome H. Friedman Department of Statistics, Stanford University

More information

arxiv: v3 [math.pr] 18 Aug 2017

arxiv: v3 [math.pr] 18 Aug 2017 Sparse Exchangeable Graphs and Their Limits via Graphon Processes arxiv:1601.07134v3 [math.pr] 18 Aug 2017 Christian Borgs Microsoft Research One Memorial Drive Cambridge, MA 02142, USA Jennifer T. Chayes

More information

Existence and uniqueness of solutions for nonlinear ODEs

Existence and uniqueness of solutions for nonlinear ODEs Chapter 4 Existence and uniqueness of solutions for nonlinear ODEs In this chapter we consider the existence and uniqueness of solutions for the initial value problem for general nonlinear ODEs. Recall

More information

Lecture 17: Likelihood ratio and asymptotic tests

Lecture 17: Likelihood ratio and asymptotic tests Lecture 17: Likelihood ratio and asymptotic tests Likelihood ratio When both H 0 and H 1 are simple (i.e., Θ 0 = {θ 0 } and Θ 1 = {θ 1 }), Theorem 6.1 applies and a UMP test rejects H 0 when f θ1 (X) f

More information

Unsupervised machine learning

Unsupervised machine learning Chapter 9 Unsupervised machine learning Unsupervised machine learning (a.k.a. cluster analysis) is a set of methods to assign objects into clusters under a predefined distance measure when class labels

More information

TREE AND GRID FACTORS FOR GENERAL POINT PROCESSES

TREE AND GRID FACTORS FOR GENERAL POINT PROCESSES Elect. Comm. in Probab. 9 (2004) 53 59 ELECTRONIC COMMUNICATIONS in PROBABILITY TREE AND GRID FACTORS FOR GENERAL POINT PROCESSES ÁDÁM TIMÁR1 Department of Mathematics, Indiana University, Bloomington,

More information

Continued fractions for complex numbers and values of binary quadratic forms

Continued fractions for complex numbers and values of binary quadratic forms arxiv:110.3754v1 [math.nt] 18 Feb 011 Continued fractions for complex numbers and values of binary quadratic forms S.G. Dani and Arnaldo Nogueira February 1, 011 Abstract We describe various properties

More information

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011 LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS S. G. Bobkov and F. L. Nazarov September 25, 20 Abstract We study large deviations of linear functionals on an isotropic

More information

Stein s Method: Distributional Approximation and Concentration of Measure

Stein s Method: Distributional Approximation and Concentration of Measure Stein s Method: Distributional Approximation and Concentration of Measure Larry Goldstein University of Southern California 36 th Midwest Probability Colloquium, 2014 Concentration of Measure Distributional

More information

Probability and Measure

Probability and Measure Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 84 Paper 4, Section II 26J Let (X, A) be a measurable space. Let T : X X be a measurable map, and µ a probability

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth

More information

ACO Comprehensive Exam March 20 and 21, Computability, Complexity and Algorithms

ACO Comprehensive Exam March 20 and 21, Computability, Complexity and Algorithms 1. Computability, Complexity and Algorithms Part a: You are given a graph G = (V,E) with edge weights w(e) > 0 for e E. You are also given a minimum cost spanning tree (MST) T. For one particular edge

More information

Some Background Math Notes on Limsups, Sets, and Convexity

Some Background Math Notes on Limsups, Sets, and Convexity EE599 STOCHASTIC NETWORK OPTIMIZATION, MICHAEL J. NEELY, FALL 2008 1 Some Background Math Notes on Limsups, Sets, and Convexity I. LIMITS Let f(t) be a real valued function of time. Suppose f(t) converges

More information

Optimal Estimation of a Nonsmooth Functional

Optimal Estimation of a Nonsmooth Functional Optimal Estimation of a Nonsmooth Functional T. Tony Cai Department of Statistics The Wharton School University of Pennsylvania http://stat.wharton.upenn.edu/ tcai Joint work with Mark Low 1 Question Suppose

More information

NATIONAL BOARD FOR HIGHER MATHEMATICS. Research Scholarships Screening Test. Saturday, February 2, Time Allowed: Two Hours Maximum Marks: 40

NATIONAL BOARD FOR HIGHER MATHEMATICS. Research Scholarships Screening Test. Saturday, February 2, Time Allowed: Two Hours Maximum Marks: 40 NATIONAL BOARD FOR HIGHER MATHEMATICS Research Scholarships Screening Test Saturday, February 2, 2008 Time Allowed: Two Hours Maximum Marks: 40 Please read, carefully, the instructions on the following

More information

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding

More information

STAT 7032 Probability Spring Wlodek Bryc

STAT 7032 Probability Spring Wlodek Bryc STAT 7032 Probability Spring 2018 Wlodek Bryc Created: Friday, Jan 2, 2014 Revised for Spring 2018 Printed: January 9, 2018 File: Grad-Prob-2018.TEX Department of Mathematical Sciences, University of Cincinnati,

More information

Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm

Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm Chapter 13 Radon Measures Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm (13.1) f = sup x X f(x). We want to identify

More information

CHAPTER I THE RIESZ REPRESENTATION THEOREM

CHAPTER I THE RIESZ REPRESENTATION THEOREM CHAPTER I THE RIESZ REPRESENTATION THEOREM We begin our study by identifying certain special kinds of linear functionals on certain special vector spaces of functions. We describe these linear functionals

More information

arxiv: v1 [math.fa] 14 Jul 2018

arxiv: v1 [math.fa] 14 Jul 2018 Construction of Regular Non-Atomic arxiv:180705437v1 [mathfa] 14 Jul 2018 Strictly-Positive Measures in Second-Countable Locally Compact Non-Atomic Hausdorff Spaces Abstract Jason Bentley Department of

More information

Discrete Geometry. Problem 1. Austin Mohr. April 26, 2012

Discrete Geometry. Problem 1. Austin Mohr. April 26, 2012 Discrete Geometry Austin Mohr April 26, 2012 Problem 1 Theorem 1 (Linear Programming Duality). Suppose x, y, b, c R n and A R n n, Ax b, x 0, A T y c, and y 0. If x maximizes c T x and y minimizes b T

More information

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.

More information

A Convex Hull Peeling Depth Approach to Nonparametric Massive Multivariate Data Analysis with Applications

A Convex Hull Peeling Depth Approach to Nonparametric Massive Multivariate Data Analysis with Applications A Convex Hull Peeling Depth Approach to Nonparametric Massive Multivariate Data Analysis with Applications Hyunsook Lee. hlee@stat.psu.edu Department of Statistics The Pennsylvania State University Hyunsook

More information

Laplace s Equation. Chapter Mean Value Formulas

Laplace s Equation. Chapter Mean Value Formulas Chapter 1 Laplace s Equation Let be an open set in R n. A function u C 2 () is called harmonic in if it satisfies Laplace s equation n (1.1) u := D ii u = 0 in. i=1 A function u C 2 () is called subharmonic

More information