An Information-Theoretic Measure of Dependency Among Variables in Large Datasets

Ali Mousavi, Richard G. Baraniuk
Department of Electrical and Computer Engineering, Rice University, Houston, TX
(This work was supported by grants from NSF, DARPA/ONR, ONR, and ARO MURI.)

Abstract — The maximal information coefficient (MIC), which measures the amount of dependence between two variables, is able to detect both linear and non-linear associations. However, its computational cost grows rapidly as a function of the dataset size. In this paper, we develop a computationally efficient approximation to the MIC that replaces its dynamic programming step with a much simpler technique based on uniform partitioning of the data grid. A variety of experiments demonstrate the quality of our approximation.

I. INTRODUCTION

One of the challenging issues for statisticians and computer scientists is dealing with large datasets containing hundreds of variables, some of which may have interesting but unexplored relationships with each other. Massive datasets of this kind arise in areas such as social networks, astronomy, genomics, medical records, and political science. Hence, it is of interest to develop methods that help us discover these relationships.

Measuring the amount of dependence between two variables has been studied extensively in the literature, and several methods have been proposed for it. We review some of them in the following.

In [1], the author suggested seven properties that should be satisfied by any measure φ(x,y) used for determining the amount of dependence between x and y. These properties, known as Rényi's axioms, are:

- In defining φ(x,y) between any two random variables, neither x nor y should be constant with probability 1.
- 0 ≤ φ(x,y) ≤ 1.
- φ(x,y) = 0 if and only if x and y are independent of each other.
- φ(x,y) = 1 if there is an arbitrary functional dependency between x and y; in other words, if y = f(x) or x = g(y), where f(·) and g(·) are Borel-measurable functions.
- φ(x,y) = φ(y,x).
- If f(·) and g(·) are strictly monotonic functions, then φ(x,y) = φ(f(x), g(y)).
- If x and y are jointly Gaussian random variables, then φ(x,y) = |PCC(x,y)|, where PCC is the Pearson correlation coefficient.

The Pearson correlation coefficient (PCC) is the best-known dependency measure. However, it is unable to detect non-linear associations; in other words, the PCC can only capture linear associations between two variables. As another measure of dependency, the correlation ratio of a random variable y (if σ²(y) exists and σ(y) > 0) on a random variable x, introduced in [2] and [3], is defined as

    Θ(y|x) = σ(E(y|x)) / σ(y).   (1)

It is easy to show that 0 ≤ Θ(y|x) ≤ 1, where Θ(y|x) = 1 if and only if y = f(x) for some Borel-measurable function f(·), and Θ(y|x) = 0 if x and y are independent. The alternative formula for the correlation ratio mentioned in [1] is

    Θ(y|x) = sup_g PCC(y, g(x)).   (2)

This alternative formula leads to another measure of dependency called maximal correlation [4]:

    S(x,y) = sup_{f,g} PCC(f(x), g(y)),   (3)

where f(·) and g(·) are Borel-measurable functions. The author in [1] has shown that S(x,y) = 0 if and only if x and y are independent. Furthermore, if there is an arbitrary functional relationship between x and y, then S(x,y) = 1. The authors in [5] introduced the alternating conditional expectation (ACE) algorithm to find the optimal transformations.
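As a small numerical illustration of the contrast between the PCC and the correlation ratio (1) — this example is not from the paper; it is a sketch in Python/NumPy using an assumed binning-based estimator of E(y|x) — a purely functional but non-linear relationship yields a PCC near 0 while the estimated Θ(y|x) is near 1:

    import numpy as np

    def correlation_ratio(x, y, num_bins=20):
        # Estimate Theta(y|x) = sigma(E[y|x]) / sigma(y) from Eq. (1) by binning x.
        edges = np.linspace(x.min(), x.max(), num_bins + 1)
        bins = np.clip(np.digitize(x, edges) - 1, 0, num_bins - 1)
        occupied = [b for b in range(num_bins) if np.any(bins == b)]
        cond_means = np.array([y[bins == b].mean() for b in occupied])
        weights = np.array([np.mean(bins == b) for b in occupied])
        # Variance of the conditional mean, estimated with bin-occupancy weights.
        var_cond_mean = np.sum(weights * (cond_means - y.mean()) ** 2)
        return np.sqrt(var_cond_mean / y.var())

    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, 2000)
    y = x ** 2                                                   # functional but non-linear
    print("PCC  :", round(float(np.corrcoef(x, y)[0, 1]), 3))    # near 0
    print("Theta:", round(float(correlation_ratio(x, y)), 3))    # near 1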
The Spearman correlation coefficient [6] is defined similarly to the PCC; however, it is computed between the two ranked variables. By ranked variables we mean replacing each data point by its rank (or the average rank for equal sample points) in ascending order. Therefore, if x̃_i and ỹ_i denote the ranked versions of x_i and y_i, the Spearman correlation coefficient is

    ρ = Σ_i (x̃_i − mean(x̃)) (ỹ_i − mean(ỹ)) / sqrt( Σ_i (x̃_i − mean(x̃))² · Σ_i (ỹ_i − mean(ỹ))² ).   (4)
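A minimal sketch of the rank-and-correlate computation in (4) (my own illustration, not the authors' code), using scipy.stats.rankdata so that ties receive average ranks:

    import numpy as np
    from scipy.stats import rankdata, spearmanr

    rng = np.random.default_rng(1)
    x = rng.uniform(0, 1, 500)
    y = np.exp(3 * x) + rng.normal(0, 0.1, 500)   # monotone but non-linear in x

    x_rank = rankdata(x)                          # average ranks are assigned to ties
    y_rank = rankdata(y)
    rho_eq4 = np.corrcoef(x_rank, y_rank)[0, 1]   # Pearson correlation of the ranks, i.e., Eq. (4)
    rho_ref, _ = spearmanr(x, y)                  # library implementation for comparison
    print(round(float(rho_eq4), 4), round(float(rho_ref), 4))   # the two agree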

The authors in [7] expressed covariance and linear correlation in terms of principal components and generalized them to variables distributed along a curve; they estimate their measures using principal curves. Mutual information [8] is another measure that can be used to quantify the dependency between two variables, since it satisfies some common properties of other dependency measures; for example, I(x,y) = 0 if and only if x and y are independent. The authors in [9] used kernel density estimation of the probability density functions in order to estimate the mutual information between two variables. In [10], a method of mutual information estimation based on binning and estimating entropy from k-nearest neighbors is proposed.

The MIC [11] was recently proposed for quantifying the dependency between two random variables. It is based on binning the dataset using a dynamic programming technique to compute the mutual information between different variables. It has two main properties that make it superior to the aforementioned measures. First, it has generality, meaning that if the sample size is large enough, it is able to detect different kinds of associations rather than only specific types. Second, it is an equitable measure, meaning that it gives similar scores to equally noisy associations no matter what the type of the association is.

One of the problems with the MIC is the fact that its computational cost grows rapidly as a function of the dataset size. Since this computational cost may become infeasible, the authors in [11] apply a heuristic so as not to compute the mutual information for all possible grids. This heuristic may result in finding only a local maximum.

In this paper, we develop a computationally efficient approximation to the MIC. This approximation is based on replacing the dynamic programming step used in the computation of the MIC with a very efficient technique, namely uniform binning of the data. We show that our proposed method is able to detect both functional and non-functional associations between different variables, similarly to the MIC but more efficiently. In addition, it has better performance in recognizing independence between different variables.

The rest of this paper is organized as follows. In Section II, we review the MIC and the algorithm used to compute it from [11]. In Section III, we introduce our new measure of dependency, which is a modification of the MIC. We present simulation results in Section IV. Finally, Section V concludes the paper.

II. THE MAXIMAL INFORMATION COEFFICIENT (MIC)

A. MIC Definition and Properties

For any finite dataset D containing ordered pairs of two random variables, one can partition the first elements, i.e., the x-values of these pairs, into ℓ_x bins and similarly partition the second elements, i.e., the y-values of these pairs, into ℓ_y bins. As a result of this partitioning, we obtain an ℓ_x-by-ℓ_y grid G. Every cell of this grid may or may not contain sample points from the set D. The grid induces a probability distribution on the cells of G, where the probability of each cell equals the fraction of sample points located in that cell. That is to say,

    p_ij = |D_ij| / |D|,   (5)

where p_ij denotes the probability corresponding to the cell located at the i-th row and the j-th column and |D_ij| denotes the number of sample points falling into the i-th row and the j-th column (see Figure 1 for a graphical view of the grid G).

Fig. 1. Partitioning of dataset D into ℓ_x columns and ℓ_y rows. D_ij denotes the set of sample points located in the i-th row and the j-th column.

It is obvious that each choice of (ℓ_x, ℓ_y) yields a grid that induces a new probability distribution and hence results in a different mutual information between the two variables.
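To make (5) and the grid-induced mutual information concrete, the following sketch (my own illustration in Python/NumPy, not code from the paper) builds the cell probabilities p_ij for a given pair of axis partitions with numpy.histogram2d and evaluates the resulting I(P;Q):

    import numpy as np

    def grid_mutual_information(x, y, x_edges, y_edges):
        # Cell probabilities p_ij = |D_ij| / |D| (Eq. (5)) and the induced I(P;Q).
        counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
        p = counts / counts.sum()
        px = p.sum(axis=1, keepdims=True)   # marginal over columns (X-axis partition P)
        py = p.sum(axis=0, keepdims=True)   # marginal over rows (Y-axis partition Q)
        nz = p > 0
        return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))

    rng = np.random.default_rng(2)
    x = rng.uniform(0, 1, 1000)
    y = np.sin(10 * x)
    x_edges = np.linspace(0, 1, 6)                 # 5 columns
    y_edges = np.linspace(y.min(), y.max(), 6)     # 5 rows
    print(grid_mutual_information(x, y, x_edges, y_edges))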
Let I*_D(ℓ_x, ℓ_y) = max_G I_{D|G}(P;Q) be the largest possible mutual information achievable by an ℓ_x-by-ℓ_y grid G on a set D of sample points, where P and Q are the partitions of the X-axis and Y-axis of the grid G, respectively. In order to have a fair comparison among different grids, the computed values of mutual information should be normalized. Since I(P;Q) = H(Q) − H(Q|P) = H(P) − H(P|Q), we divide I*_D(ℓ_x, ℓ_y) by log(min(ℓ_x, ℓ_y)). Therefore, we have

    0 ≤ I*_D(ℓ_x, ℓ_y) / log(min(ℓ_x, ℓ_y)) ≤ 1.   (6)

This inequality motivates the definition of the MIC as a measure of dependency between two variables. For a dataset D containing n samples of two variables, we have

    MIC(D) = max_{ℓ_x ℓ_y < B(n)} I*_D(ℓ_x, ℓ_y) / log(min(ℓ_x, ℓ_y)),   (7)

where B(n) = n^0.6 [11] or, more generally, ω(1) ≤ B(n) ≤ O(n^{1−ǫ}). According to this definition, the MIC has the following properties:

- 0 ≤ MIC(D) ≤ 1.
- MIC(x,y) = MIC(y,x).
- It is invariant under order-preserving transformations applied to the dataset D.
- It is not invariant under rotation of the coordinate axes. For example, if y = x, then MIC(D) = 1; however, after a 45° clockwise rotation of the coordinate axes, instead of y = x we have y = 0 and hence MIC(D) = 0.

B. MIC Algorithm

Although the algorithm for computing the MIC is fully described in [11], here we only review the OptimizeXAxis algorithm, which is used to compute the highest mutual information achievable by an ℓ_x-by-ℓ_y grid.

Fig. 2. OptimizeXAxis [11] considers only consecutive points falling into the same row and draws partitions between them. A set of consecutive points falling into the same row is called a clump.

Any ℓ_x-by-ℓ_y grid imposes two sets of partitions: one on the x-values (the columns of the grid) and one on the y-values (the rows of the grid). We denote the columns of the grid by c_1, c_2, ..., c_{ℓ_x}, where c_i is the endpoint (largest x-value) of the i-th column. Since I(P;Q) is upper-bounded by H(P) and H(Q), in order to maximize it one can equipartition either the Y or the X axis, i.e., impose a discrete uniform distribution on either Q or P. Without loss of generality, we consider the version of the algorithm that equipartitions the Y-axis. However, we should obviously check both cases (equipartitioning either the X or the Y axis) separately for each ℓ_x-by-ℓ_y grid and choose the maximum resulting mutual information.

Let H(P) denote the entropy of the distribution imposed by m sample points (m ≤ |D| = n) on the partition of the X-axis. Similarly, let H(Q) denote the entropy of the distribution imposed by m sample points (m ≤ |D| = n) on the partition of the Y-axis. Since we have assumed that the Y-axis is equipartitioned, H(Q) is constant and equal to log(|Q|). Finally, let H(P,Q) denote the entropy of the distribution imposed by m sample points (m ≤ |D| = n) on the cells of the grid G that has X-axis partition P and Y-axis partition Q. Since I(P;Q) = H(Q) − H(Q|P) and we have already maximized H(Q) by equipartitioning the Y-axis, to achieve the highest mutual information we have to minimize H(Q|P). This is done by the OptimizeXAxis algorithm [11]. An alternative formula for the mutual information is I(P;Q) = H(Q) + H(P) − H(P,Q). Since H(Q) is constant, OptimizeXAxis only needs to maximize H(P) − H(P,Q). The following theorem [11] is the key to solving this problem.

Theorem II.1. For a dataset D of size n and a fixed row partition Q, and for every m, ℓ ∈ N, define

    F(m, ℓ) = max_P {H(P) − H(P,Q)},

where the maximum is taken over all partitions P of size ℓ of the first m points D(1:m). Then for ℓ > 1 and 1 < m ≤ n we have the recursive equation

    F(m, ℓ) = max_{1 ≤ i < m} { (i/m) F(i, ℓ−1) − ((m−i)/m) H(⟨i, m⟩, Q) }.   (8)

Proof of Theorem II.1: See Proposition 3.2 in [11].

OptimizeXAxis uses a dynamic programming technique motivated by Theorem II.1. It ensures that F(n, ℓ), the value attained by the desired partition of the dataset D (which has n sample points) into ℓ columns imposing partition P on the X-axis, is found.

Algorithm 1 OptimizeXAxis(D, Q, ℓ_x) [11]
Require: D is a set of ordered pairs sorted in increasing order by x-values
Require: Q is a Y-axis partition of D
Require: ℓ_x is an integer greater than 1
Ensure: Returns a list of scores (I_2, ..., I_{ℓ_x}) such that each I_ℓ is the maximum value of I(P;Q) over all partitions P of size ℓ
1: ⟨c_0, ..., c_k⟩ ← GetClumpsPartition(D, Q)
2:
3: Find the optimal partition of size 2
4: for t = 2 to k do
5:    Find s ∈ {1, ..., t} maximizing H(⟨c_s, c_t⟩) − H(⟨c_s, c_t⟩, Q)
6:    P_{t,2} ← ⟨c_s, c_t⟩
7:    I_{t,2} ← H(Q) + H(P_{t,2}) − H(P_{t,2}, Q)
8: end for
9:
10: Inductively build the rest of the table of optimal partitions
11: for ℓ = 3 to ℓ_x do
12:    for t = 2 to k do
13:       Find s ∈ {1, ..., t} maximizing F(s, t, ℓ) := (c_s/c_t)(I_{s,ℓ−1} − H(Q)) + Σ_{i=1}^{|Q|} (#_{i,ℓ}/c_t) log(#_{i,ℓ}/#_{*,ℓ}), where #_{*,j} is the number of points in the j-th column of P_{s,ℓ−1} ∪ {c_t} and #_{i,j} is the number of points in the j-th column of P_{s,ℓ−1} ∪ {c_t} that fall in the i-th row of Q
14:       P_{t,ℓ} ← P_{s,ℓ−1} ∪ {c_t}
15:       I_{t,ℓ} ← H(Q) + H(P_{t,ℓ}) − H(P_{t,ℓ}, Q)
16:    end for
17: end for
18: return (I_{k,2}, ..., I_{k,ℓ_x})

In order to minimize H(Q|P), OptimizeXAxis considers only consecutive points falling into the same row and draws partitions between them. A set of consecutive points falling into the same row is called a clump (see Figure 2 for a graphical view of a clump).
In Algorithm 1, the GetClumpsPartition subroutine is responsible for finding and partitioning the clumps. Moreover, P_{t,ℓ} is an optimal partition of size ℓ for the first t clumps.

III. THE UNIFORM MIC (U-MIC)

A. Noiseless Setting

The major drawback of Algorithm 1 is its computational complexity. If there are k clumps in the given partition of an ℓ_x-by-ℓ_y grid, the runtime of this algorithm is O(k² ℓ_x ℓ_y). If there is a functional association between the two variables, the number of clumps in the corresponding grid is fairly small. However, for noisy or random datasets the number of clumps can easily be very large, and hence the computational complexity of Algorithm 1 becomes large.
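The clump structure is easy to compute directly: after sorting the points by x, a new clump starts wherever the row index of a point differs from that of its predecessor. The following sketch (my own illustration in Python/NumPy, not the GetClumpsPartition routine of [11]) counts clumps for a fixed equipartitioned Q and shows the contrast between a functional and a random relationship that drives the runtime difference described above:

    import numpy as np

    def count_clumps(x, y, y_edges):
        # Number of clumps: maximal runs of x-consecutive points that fall in the same row of Q.
        order = np.argsort(x)
        rows = np.digitize(y[order], y_edges[1:-1])   # row index of each point, scanned in x order
        return 1 + int(np.count_nonzero(np.diff(rows)))

    rng = np.random.default_rng(3)
    x = rng.uniform(0, 1, 200)
    edges = np.linspace(0, 1, 6)                             # 5 equipartitioned rows for Q
    print(count_clumps(x, x ** 3, edges))                    # functional relationship: few clumps
    print(count_clumps(x, rng.uniform(0, 1, 200), edges))    # random data: close to the number of points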

Furthermore, due to this problem, the algorithm cannot be generalized efficiently to detect associations among more than two variables. As an example, if we want to detect whether or not three variables are related to each other, we may write the formula for the generalized mutual information as

    I(P;Q;R) = H(P) + H(Q) + H(R) − H(P,Q) − H(P,R) − H(Q,R) + H(P,Q,R).   (9)

Hence, intuitively, and as in the case of two random variables, in order to maximize the generalized mutual information we have to equipartition one axis to maximize its entropy. Nevertheless, we should partition the two other axes with respect to the places of the clumps in them. If we equipartition the first axis and there are k_1 clumps along the second axis and k_2 clumps along the third axis, then the runtime of this algorithm would be O(k_1² k_2² ℓ_x ℓ_y ℓ_z), where ℓ_x, ℓ_y, ℓ_z are the partition sizes. This runtime is not acceptable for large datasets. Therefore, we have to modify the algorithm in order to decrease its runtime and, as a result, make it generalizable to higher dimensions.

The algorithm we propose here to replace Algorithm 1 is uniform partitioning (Algorithm 2). Let y_min = min_i y_i, y_max = max_i y_i, and similarly x_min = min_i x_i and x_max = max_i x_i. We then partition both the X and Y axes such that all the columns have length (x_max − x_min)/ℓ_x and, similarly, all the rows have length (y_max − y_min)/ℓ_y. We call the new measure derived by replacing Algorithm 1 with Algorithm 2 the U-MIC (Uniform Maximal Information Coefficient).

Algorithm 2 UniformPartition(ℓ_x, ℓ_y)
Require: Dataset D
Require: ℓ_x and ℓ_y are integers greater than 1
Ensure: Returns a score I which is the value of I(P;Q)/log(min(ℓ_x, ℓ_y)), where P and Q are the distributions obtained from uniform partitioning of both axes
1: P ← uniform partition of the X-axis into ℓ_x columns, each of length (x_max − x_min)/ℓ_x
2: Q ← uniform partition of the Y-axis into ℓ_y rows, each of length (y_max − y_min)/ℓ_y
3: I ← (H(P) + H(Q) − H(P,Q)) / log(min(ℓ_x, ℓ_y))
4: return I
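The resulting measure is straightforward to implement. The sketch below is my own illustrative Python/NumPy rendering; the grid-size search over ℓ_x ℓ_y < B(n) = n^0.6 mirrors Eq. (7) and is an assumption about how the score is maximized, since the paper itself only specifies the uniform partitioning step of Algorithm 2:

    import numpy as np

    def normalized_mi(x, y, lx, ly):
        # Algorithm 2: uniform lx-by-ly grid over the data range, then I(P;Q)/log(min(lx, ly)).
        counts, _, _ = np.histogram2d(x, y, bins=[lx, ly])
        p = counts / counts.sum()
        px = p.sum(axis=1, keepdims=True)
        py = p.sum(axis=0, keepdims=True)
        nz = p > 0
        mi = np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz]))
        return float(mi / np.log(min(lx, ly)))

    def u_mic(x, y):
        # Maximize the uniformly-binned normalized MI over all grids with lx * ly < B(n) = n**0.6.
        n = len(x)
        b = n ** 0.6
        best = 0.0
        for lx in range(2, int(b / 2) + 1):
            for ly in range(2, int(b / lx) + 1):
                if lx * ly >= b:
                    break
                best = max(best, normalized_mi(x, y, lx, ly))
        return best

Because every grid is evaluated with a single histogram pass rather than a dynamic program over clumps, the cost per grid is roughly O(n + ℓ_x ℓ_y).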
In the following, we prove that the U-MIC approaches 1 as the sample size grows when there exists a functional association between the two variables (with finite derivative). Without loss of generality, we carry out the proofs for the case (x,y) ∈ [0,1] × [0,1]; they can easily be generalized to other cases.

Proposition III.1. If D = {(x_i, y_i)}_{i=1}^{n} where y_i = h(x_i) and |h'(x)| < ∞, then lim_{n→∞} U-MIC(D) = 1.

Proof of Proposition III.1: We denote by g_h(α) the sublevel function of the function h(·), i.e.,

    g_h(α) = λ({x : h(x) ≤ α}),   (10)

where λ(T) denotes the fraction of sample points in the set T. Consequently,

    g_h(α) = F_y(α) = P(y ≤ α) = P(h(x) ≤ α),   (11)

where F_y(·) denotes the cumulative distribution function (CDF) and P(·) denotes the probability function. Using this notation, and assuming that we uniformly partition the Y-axis into ℓ_y rows, we can write the entropy of Q, the uniform partition of the Y-axis, as

    H(Q) = − Σ_{i=0}^{ℓ_y−1} P(Q = i) log(P(Q = i))   (12)
         = − Σ_{i=0}^{ℓ_y−1} P(i/ℓ_y ≤ Y < (i+1)/ℓ_y) log(P(i/ℓ_y ≤ Y < (i+1)/ℓ_y))
         = − Σ_{i=0}^{ℓ_y−1} (1/ℓ_y) g'_h(α_i) log((1/ℓ_y) g'_h(α_i))
         = − Σ_{i=0}^{ℓ_y−1} (1/ℓ_y) g'_h(α_i) log(g'_h(α_i)) + Σ_{i=0}^{ℓ_y−1} (1/ℓ_y) g'_h(α_i) log(ℓ_y),

where, for each i (0 ≤ i ≤ ℓ_y − 1), the point α_i with i/ℓ_y ≤ α_i < (i+1)/ℓ_y is obtained from the mean value theorem applied to g_h. If, without loss of generality, we assume that min(ℓ_x, ℓ_y) = ℓ_y, then we can write

    H(Q)/log(ℓ_y) = −(1/log(ℓ_y)) Σ_{i=0}^{ℓ_y−1} (1/ℓ_y) g'_h(α_i) log(g'_h(α_i)) + Σ_{i=0}^{ℓ_y−1} (1/ℓ_y) g'_h(α_i).   (13)

As a result, in the asymptotic setting we can write

    lim_{ℓ_y→∞} H(Q)/log(ℓ_y) = lim_{ℓ_y→∞} Σ_{i=0}^{ℓ_y−1} (1/ℓ_y) g'_h(α_i) = 1,   (14)

where the last equality holds since lim_{ℓ_y→∞} Σ_i (1/ℓ_y) g'_h(α_i) is the Riemann integral of the function g'_h over [0,1], which equals g_h(1) − g_h(0) = 1.

If we assume that |h'(·)| < c, then according to the mean value theorem we have

    |h((i+1)/ℓ_y) − h(i/ℓ_y)| ≤ c/ℓ_y.   (15)

Equation (15) states that, within a particular column of the X-axis partition, the curve of the function passes through at most c + 1 cells of that column. We use this fact in upper-bounding H(Q|P). Similarly to (12) and (13), we have

    H(Q|P = k) = − Σ_{i=0}^{ℓ_y−1} P(Q = i | P = k) log(P(Q = i | P = k))   (16)
               = − Σ_{i=0}^{ℓ_y−1} P(i/ℓ_y ≤ Y < (i+1)/ℓ_y | P = k) log(P(i/ℓ_y ≤ Y < (i+1)/ℓ_y | P = k))
               = − Σ_{i=0}^{ℓ_y−1} (1/ℓ_y) f_{y|x}(α_i | P = k) log((1/ℓ_y) f_{y|x}(α_i | P = k))
               = − Σ_{i=0}^{ℓ_y−1} (1/ℓ_y) f_{y|x}(α_i | P = k) log(f_{y|x}(α_i | P = k)) + Σ_{i=0}^{ℓ_y−1} (1/ℓ_y) f_{y|x}(α_i | P = k) log(ℓ_y),

where f_{y|x} denotes the conditional probability density function. Because of equation (15), only the cells that the curve crosses contribute, so we can simplify (16) as

    H(Q|P = k) = − Σ_{i=j}^{j+c+1} (1/ℓ_y) f_{y|x}(α_i | P = k) log(f_{y|x}(α_i | P = k)) + Σ_{i=j}^{j+c+1} (1/ℓ_y) f_{y|x}(α_i | P = k) log(ℓ_y).   (17)

If we define k* = argmax_k H(Q|P = k), then since H(Q|P) = Σ_k P(P = k) H(Q|P = k) ≤ H(Q|P = k*), we can write

    H(Q|P) ≤ − Σ_{i=j}^{j+c+1} (1/ℓ_y) f_{y|x}(α_i | P = k*) log(f_{y|x}(α_i | P = k*)) + Σ_{i=j}^{j+c+1} (1/ℓ_y) f_{y|x}(α_i | P = k*) log(ℓ_y),   (18)

and hence

    lim_{ℓ_y→∞} H(Q|P)/log(ℓ_y) ≤ lim_{ℓ_y→∞} Σ_{i=j}^{j+c+1} (1/ℓ_y) f_{y|x}(α_i | P = k*) = 0.   (19)

The last equality holds since 1/ℓ_y → 0 while c < ∞. As a result,

    lim_{ℓ_y→∞} U-MIC(D) = lim_{ℓ_y→∞} I(P;Q)/log(min{ℓ_x, ℓ_y}) = lim_{ℓ_y→∞} (H(Q) − H(Q|P))/log(min{ℓ_x, ℓ_y}) = 1.   (20)

If x and y are independent, then according to the following proposition we have U-MIC(D) = 0.

Proposition III.2. If D = {(x_i, y_i)}_{i=1}^{n} where x_i is independent of y_i for 1 ≤ i ≤ n, then U-MIC(D) = 0.

Proof of Proposition III.2: The line of reasoning is straightforward and similar to the proof of Proposition III.1. Since x and y are independent of each other, we can write

    H(Q) = − Σ_{i=0}^{ℓ_y−1} P(Q = i) log(P(Q = i))   (21)
         = − Σ_{i=0}^{ℓ_y−1} P(i/ℓ_y ≤ Y < (i+1)/ℓ_y) log(P(i/ℓ_y ≤ Y < (i+1)/ℓ_y))
         = − Σ_{i=0}^{ℓ_y−1} P(i/ℓ_y ≤ Y < (i+1)/ℓ_y | P = k) log(P(i/ℓ_y ≤ Y < (i+1)/ℓ_y | P = k))
         = − Σ_{i=0}^{ℓ_y−1} P(Q = i | P = k) log(P(Q = i | P = k))
         = H(Q|P = k).

Therefore, H(Q) = H(Q|P = k) for every k with 0 ≤ k ≤ ℓ_x − 1. Now, since H(Q|P) = Σ_k P(P = k) H(Q|P = k), we have H(Q) = H(Q|P) and, as a result, U-MIC(D) = 0.

B. Noisy Setting

In this section, we study the performance of the U-MIC in the noisy setting. We first give a lower bound on it when the two variables x and y have a noisy functional association in which the noise is bounded. After that, we study the case of unbounded noise.

For the bounded-noise case, without loss of generality, we assume that x ∼ U[0,1] and that the noise has a uniform distribution. Specifically, we assume that the sample points (x_i, y_i) have the form (x_i, h(x_i) + z_ǫ), where z_ǫ ∼ U[−ǫ, ǫ]. We define y_mid = (y_max + y_min)/2. In Algorithm 2, we divide the Y-axis into two rows by drawing a horizontal line at y_mid. In addition, we divide the X-axis into ℓ_x columns, each having length 1/ℓ_x (since x ∼ U[0,1]). Let D_1 = {(x_i, y_i) : y_i < y_mid} and D_2 = {(x_i, y_i) : y_i > y_mid}. We use P and Q to denote the partitions of the X-axis and the Y-axis of the grid in this setting. With this setting and notation in mind, the following corollary gives a simple lower bound for U-MIC(D) in this case.

Corollary III.3. Let m be the number of columns of P in which there exists a sample point (x̂, ŷ) such that |ŷ − y_mid| ≤ ǫ. Then U-MIC(D) is lower-bounded by

    −(|D_1|/|D|) log(|D_1|/|D|) − (|D_2|/|D|) log(|D_2|/|D|) − m/ℓ_x.

Proof of Corollary III.3: Since I(P;Q) = H(Q) − H(Q|P), we need an upper bound on H(Q|P) in order to obtain a lower bound on I(P;Q). According to the entropy definition, we can write

    H(Q) = −(|D_1|/|D|) log(|D_1|/|D|) − (|D_2|/|D|) log(|D_2|/|D|).   (22)

Let M = {p_1, ..., p_m} denote the columns in which there exists a data point (x̂, ŷ) such that |ŷ − y_mid| ≤ ǫ. Since Q has only two rows, we can upper-bound H(Q|P) as follows:

    H(Q|P) = Σ_{k=0}^{ℓ_x−1} P(P = k) H(Q|P = k)   (23)
        (a) = (1/ℓ_x) [ Σ_{k∈M} H(Q|P = k) + Σ_{k∉M} H(Q|P = k) ]
        (b) = (1/ℓ_x) Σ_{k∈M} H(Q|P = k)
            ≤ |M|/ℓ_x = m/ℓ_x,

where (a) holds since x ∼ U[0,1], (b) holds because z_ǫ ∼ U[−ǫ, ǫ] (every column outside M has all of its points on one side of y_mid and thus contributes zero conditional entropy), and the final inequality uses the fact that each term is at most 1 since Q has only two rows. The lower bound is then derived by combining (22) and (23).

Fig. 3. Using the k-nearest-neighbors method to bound the noise in noisy relationships. We replace each point with the average of the sample points in its δ_n-neighborhood.

The main issue with generalizing this lower-bounding idea to other noise distributions is that the noise values can be unbounded. Hence, we use the idea of k-nearest neighbors to bound the noise, so as to arrive at a consistent version of the association detector. We study this idea for the case where the noise is drawn from a Gaussian distribution with mean 0 and variance σ². For each sample point, we consider its δ_n-neighborhood (we use the subscript n to show the dependency on the size n of the dataset) and replace the data point with the average of the sample points located in its δ_n-neighborhood (see Figure 3). The following lemma characterizes the number of sample points in this neighborhood.

Lemma III.4. Let x be uniformly distributed, i.e., x ∼ U[0,1], and let (x_i, y_i) denote the i-th data point in D, where y_i = h(x_i) + z_i. If N = {(x_j, y_j) : (x_i − x_j)² ≤ δ_n²}, then lim_{n→∞} |N| = 2nδ_n.

Proof of Lemma III.4: Let I(·) denote the indicator function. Then we can write

    2ǫ_n = |N| = Σ_{j=1}^{n} I(x_i − δ_n ≤ x_j ≤ x_i + δ_n).   (24)

As a result, E[2ǫ_n] = 2nδ_n. Using the Hoeffding inequality, we have

    P(|2ǫ_n − E[2ǫ_n]| ≥ t) ≤ 2e^{−2ct²n²},   (25)

for some constant c. If we let t = log n, then asymptotically ǫ_n = nδ_n or, equivalently, lim_{n→∞} |N| = 2nδ_n.

Assume that h(·) is a Lipschitz continuous function of order β, i.e., |h(v) − h(w)| ≤ k|v − w|^β, where k is a constant that depends on the function h(·). If we estimate (or replace) the y-value of each noisy sample point by the average of the sample points in its δ_n-neighborhood, then in the case of Gaussian noise (mean 0 and variance σ²) we can write the estimation mean squared error as

    (1/n) Σ_{i=1}^{n} E( ĥ(x_i) − h(x_i) )²
        = (1/n) Σ_{i=1}^{n} E[ (1/(2ǫ_n + 1)) Σ_{j=−ǫ_n}^{ǫ_n} (h(x_{i−j}) + z_{i−j}) − h(x_i) ]²
        ≤ k² ǫ_n^{2β} / n^{2β} + σ² / (2ǫ_n + 1).   (26)

In order to minimize the estimation error, we can take the derivative with respect to ǫ_n and set it to 0. Therefore, the ǫ_n* that minimizes the mean squared error is

    ǫ_n* = ( σ² / (4k²β) )^{1/(2β+1)} n^{2β/(2β+1)}.   (27)

We use this ǫ_n* later to bound the noise. The following lemma gives a probabilistic bound on the noise values.

Lemma III.5. If z_1, z_2, ..., z_n are i.i.d. drawn from N(0, σ²), then P{max_{1≤i≤n} |z_i| > t} ≤ 2n e^{−t²/(2σ²)}.

Proof of Lemma III.5: First, for a zero-mean Gaussian random variable z_i we prove that P{|z_i| > t} ≤ 2e^{−t²/(2σ²)}. Let u = z_i − t, so that u ∼ N(−t, σ²). We have

    P{z_i > t} = ∫_t^∞ (1/√(2πσ²)) e^{−z_i²/(2σ²)} dz_i = ∫_0^∞ (1/√(2πσ²)) e^{−(u+t)²/(2σ²)} du.   (28)

As a result, we can write

    ∫_0^∞ (1/√(2πσ²)) e^{−(u+t)²/(2σ²)} du = e^{−t²/(2σ²)} ∫_0^∞ (1/√(2πσ²)) e^{−(u² + 2ut)/(2σ²)} du ≤ e^{−t²/(2σ²)}.   (29)

Similarly, (29) holds for the interval (−∞, −t], and hence P{|z_i| > t} ≤ 2e^{−t²/(2σ²)}. The result of the lemma then follows from a union bound over the z_i's.

By using the k-nearest-neighbors method, each z_i is replaced by z̄_i, which is the average of 2ǫ_n + 1 i.i.d. noise values, and hence its variance is decreased by a factor of 2ǫ_n + 1.
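The averaging step itself is simple to sketch. The code below is my own illustration (the window is taken directly as the 2ǫ_n + 1 points closest in x, with ǫ_n set by Eq. (27); β and k are treated as known inputs, which the analysis assumes rather than estimates):

    import numpy as np

    def smooth_y(x, y, sigma, beta=1.0, k=1.0):
        # Replace each y_i by the average of the 2*eps_n + 1 sample points closest to x_i.
        n = len(x)
        # Eq. (27): half-window (in number of points) balancing the bias and variance terms of (26).
        eps_n = (sigma ** 2 / (4 * k ** 2 * beta)) ** (1 / (2 * beta + 1)) * n ** (2 * beta / (2 * beta + 1))
        half = max(1, int(eps_n))
        order = np.argsort(x)
        y_sorted = y[order]
        y_hat = np.empty(n)
        for rank, idx in enumerate(order):
            lo, hi = max(0, rank - half), min(n, rank + half + 1)   # window truncated at the boundaries
            y_hat[idx] = y_sorted[lo:hi].mean()
        return y_hat

    rng = np.random.default_rng(5)
    x = rng.uniform(0, 1, 5000)
    y = np.cos(3 * x) + rng.normal(0, 0.3, 5000)
    y_hat = smooth_y(x, y, sigma=0.3)
    print(float(np.mean((y_hat - np.cos(3 * x)) ** 2)))   # much smaller than the raw noise variance 0.09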

This idea motivates the following corollary, which lets us bound the noise.

Corollary III.6. By using the k-nearest-neighbors method, z̄_i = (1/(2ǫ_n + 1)) Σ_{j=−ǫ_n}^{ǫ_n} z_{i−j}, and as a result lim_{n→∞} max_{1≤i≤n} |z̄_i| = 0.

Proof of Corollary III.6: According to Lemma III.5, we can write P{max_{1≤i≤n} |z̄_i| > t} ≤ 2n e^{−t²(2ǫ_n+1)/(2σ²)}. The result then follows by letting t = 1/log n and ǫ_n = ǫ_n* as derived in (27).

In the next section, we show how the U-MIC works in practice in comparison with the MIC.

IV. SIMULATION RESULTS

In this section, we study the performance of our proposed measure. We first show how it works for functional associations. Second, we study its performance for non-functional associations. Finally, we run experiments for the case of noisy relationships. As mentioned previously, the authors in [11] apply a heuristic to compute the MIC, which may not result in the true MIC. We do not apply any heuristic in the simulations reported here, in order to have a precise comparison with our proposed method.

Fig. 4. Test functional relationships for Tables I, II, and III.

Figure 4 shows the functional associations on which we have tested the performance of the MIC and U-MIC algorithms. Table I summarizes the results for the case of 200 sample points. One interesting point in Table I is the value of the U-MIC for the sinusoidal function with different frequencies: MIC(D) = 1 while U-MIC(D) = 0.75 for this function. One interpretation of this difference is that in the proof of Proposition III.1 we assumed that the absolute value of the derivative of the function h(·) is upper-bounded by a constant c. However, this is not the case for the sinusoidal function with different frequencies, since there is a discontinuity in this function. If we increase the sample size, as reported in Table II, this issue is alleviated.

TABLE I: MIC(D) and U-MIC(D) for different functional relationships in Figure 4. For this set of experiments, |D| = 200.
          Linear   Parabolic   Periodic   Cubic   Sin (Diff. Freq.)   Sin (Single Freq.)
   MIC
   U-MIC

TABLE II: MIC(D) and U-MIC(D) for different functional relationships in Figure 4, for a larger sample size.
          Linear   Parabolic   Periodic   Cubic   Sin (Diff. Freq.)   Sin (Single Freq.)
   MIC
   U-MIC

Although the same issue holds for the periodic function in Figure 4, we do not see as large an effect. Qualitatively, the derivative of the continuous pieces of the periodic function in Figure 4 is smaller than the maximum of the derivative of the sinusoidal function with different frequencies (y = sin(10x), y = sin(20x)). Hence, if we uniformly partition the X-axis in the case of the periodic function, there are fewer sample points spread over the rows of a given column and, most likely, a higher entropy (resulting in a higher U-MIC), as is the case in Table I.

Table III summarizes the runtime for the calculation of the MIC and the U-MIC for the different functional associations in Figure 4. As we can see, the U-MIC is at least 10 times faster in these cases. This is expected, since the MIC uses dynamic programming to find a close-to-optimal grid for the data, while the U-MIC just uniformly partitions the axes.

TABLE III: Run time (in sec.) for the calculation of MIC(D) and U-MIC(D) for different functional relationships in Figure 4. For this set of experiments, |D| = 200.
          Linear   Parabolic   Periodic   Cubic   Sin (Diff. Freq.)   Sin (Single Freq.)
   MIC
   U-MIC
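The kind of benchmark behind Tables I and III can be reproduced schematically as follows; this is my own sketch, and the specific formulas for the test relationships are guesses for illustration (the paper does not list them). Timing the MIC itself would additionally require an external MIC implementation, which is not included here.

    import time
    import numpy as np

    # Uses the u_mic function from the sketch in Section III.
    rng = np.random.default_rng(6)
    x = rng.uniform(0, 1, 200)
    relationships = {
        "linear": x,
        "parabolic": (x - 0.5) ** 2,
        "cubic": x ** 3,
        "sin (single freq.)": np.sin(8 * x),
        "sin (diff. freq.)": np.where(x < 0.5, np.sin(10 * x), np.sin(20 * x)),
    }
    for name, y in relationships.items():
        start = time.perf_counter()
        score = u_mic(x, y)
        elapsed = time.perf_counter() - start
        print(f"{name:20s} U-MIC = {score:.2f}  ({elapsed:.4f} s)")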

Fig. 5. Test non-functional relationships for Tables IV and V.

TABLE IV: MIC(D) and U-MIC(D) for different non-functional relationships in Figure 5. For this set of experiments, |D| = 200.
          Circle   Sinusoidal Mixture   Two Lines   Random
   MIC
   U-MIC

Table IV summarizes the results for the non-functional associations presented in Figure 5. One important point about Table IV is that the U-MIC performs better in the case of random sample points (i.e., x independent of y). In this case, the ideal value of both the MIC and the U-MIC is 0; however, as we can see, MIC(D) = 0.6 while U-MIC(D) = 0.06. This issue is related to one of the criticisms of the MIC in the literature [12]: one of the drawbacks of the MIC is the fact that, as a statistical test, it has lower power than other measures of dependency such as distance correlation [12]. In other words, it gives more false positives when detecting associations. According to our simulation results and Proposition III.2, this issue is alleviated in the U-MIC.

TABLE V: Run time (in sec.) for the calculation of MIC(D) and U-MIC(D) for different non-functional relationships in Figure 5. For this set of experiments, |D| = 200.
          Circle   Sinusoidal Mixture   Two Lines   Random
   MIC
   U-MIC

Table V shows the runtime for the calculation of the MIC and the U-MIC. In the case of non-functional relationships, there are more clumps in the initial grid of sample points used in the calculation of the MIC. Hence, Algorithm 1, which essentially runs dynamic programming over the initial grid to find the optimal grid, has a larger runtime, as we can see in Table V. On the other hand, since the U-MIC deals with a uniform partitioning of the grid of sample points, it does not matter what type of relationship the two random variables have: the runtime is almost constant and similar to the cases where there is a functional association between the two variables.

Fig. 6. Test noisy non-functional relationships for Tables VI and VII.

Tables VI and VII summarize the results for the noisy non-functional associations presented in Figure 6. Figure 6 is similar to Figure 5 except for the fact that we have added noise drawn from a uniform distribution, i.e., U[−0.05, 0.05], to the sample points. Comparing Table VI with Table IV, we can see that the amount of decrease for the different associations is almost the same for both the MIC and the U-MIC. We expected this for the MIC, since it has an important property called equitability [11]. On the other hand, we observe that, at least according to the simulation results reported here, the U-MIC has approximately the same equitability property.

TABLE VI: MIC(D) and U-MIC(D) for different noisy non-functional relationships in Figure 6. For this set of experiments, |D| = 200 and the noise is uniformly distributed in [−0.05, 0.05].
          Circle   Sinusoidal Mixture   Two Lines
   MIC
   U-MIC

TABLE VII: Run time (in sec.) for the calculation of MIC(D) and U-MIC(D) for different noisy non-functional relationships in Figure 6. For this set of experiments, |D| = 200 and the noise is uniformly distributed in [−0.05, 0.05].
          Circle   Sinusoidal Mixture   Two Lines
   MIC
   U-MIC

V. CONCLUSION

In this paper, we introduced a novel measure of dependency between two variables. This measure is called the uniform maximal information coefficient (U-MIC) because it is a modification of the original MIC [11]. It is derived from a uniform partitioning of both the X and Y axes; therefore, it does not rely on dynamic programming the way the MIC does and, hence, is much faster. We proved that, asymptotically, the U-MIC equals 1 if there is a functional relationship between the two variables, and that it equals 0 if the two variables are truly independent of each other. Moreover, according to the simulation results, the U-MIC does a better job of recognizing independence between variables than the MIC.

REFERENCES
[1] A. Rényi, "New version of the probabilistic generalization of the large sieve," Acta Mathematica Hungarica, vol. 10, 1959.
[2] H. Cramér, Mathematical Methods of Statistics. Princeton Univ. Press, 1999, vol. 9.
[3] A. Kolmogorov, Grundbegriffe der Wahrscheinlichkeitsrechnung, 1933.
[4] H. Gebelein, "Das statistische Problem der Korrelation als Variations- und Eigenwertproblem und sein Zusammenhang mit der Ausgleichsrechnung," ZAMM - Journal of Applied Mathematics and Mechanics / Zeitschrift für Angewandte Mathematik und Mechanik, vol. 21, no. 6, 1941.
[5] L. Breiman and J. Friedman, "Estimating optimal transformations for multiple regression and correlation," Journal of the American Statistical Association, 1985.
[6] W. Pirie, "Spearman rank correlation coefficient," Encyclopedia of Statistical Sciences, 1988.
[7] P. Delicado and M. Smrekar, "Measuring non-linear dependence for two random variables distributed along a curve," Statistics and Computing, vol. 19, no. 3, 2009.
[8] T. Cover and J. Thomas, Elements of Information Theory. Wiley Online Library, 1991, vol. 6.
[9] Y. Moon, B. Rajagopalan, and U. Lall, "Estimation of mutual information using kernel density estimators," Physical Review E, vol. 52, no. 3, 1995.
[10] A. Kraskov, H. Stögbauer, and P. Grassberger, "Estimating mutual information," Physical Review E, vol. 69, no. 6, 2004.
[11] D. Reshef, Y. Reshef, H. Finucane, S. Grossman, G. McVean, P. Turnbaugh, E. Lander, M. Mitzenmacher, and P. Sabeti, "Detecting novel associations in large data sets," Science, vol. 334, no. 6062, 2011.
[12] N. Simon and R. Tibshirani, "Comment on 'Detecting novel associations in large data sets' by Reshef et al., Science Dec. 16, 2011," arXiv preprint, 2014.


More information

Bayesian Unscented Kalman Filter for State Estimation of Nonlinear and Non-Gaussian Systems

Bayesian Unscented Kalman Filter for State Estimation of Nonlinear and Non-Gaussian Systems Bayesian Unscented Kaman Fiter for State Estimation of Noninear and Non-aussian Systems Zhong Liu, Shing-Chow Chan, Ho-Chun Wu and iafei Wu Department of Eectrica and Eectronic Engineering, he University

More information

Appendix for Stochastic Gradient Monomial Gamma Sampler

Appendix for Stochastic Gradient Monomial Gamma Sampler 3 4 5 6 7 8 9 3 4 5 6 7 8 9 3 4 5 6 7 8 9 3 3 3 33 34 35 36 37 38 39 4 4 4 43 44 45 46 47 48 49 5 5 5 53 54 Appendix for Stochastic Gradient Monomia Gamma Samper A The Main Theorem We provide the foowing

More information

Integrating Factor Methods as Exponential Integrators

Integrating Factor Methods as Exponential Integrators Integrating Factor Methods as Exponentia Integrators Borisav V. Minchev Department of Mathematica Science, NTNU, 7491 Trondheim, Norway Borko.Minchev@ii.uib.no Abstract. Recenty a ot of effort has been

More information

FRIEZE GROUPS IN R 2

FRIEZE GROUPS IN R 2 FRIEZE GROUPS IN R 2 MAXWELL STOLARSKI Abstract. Focusing on the Eucidean pane under the Pythagorean Metric, our goa is to cassify the frieze groups, discrete subgroups of the set of isometries of the

More information

Componentwise Determination of the Interval Hull Solution for Linear Interval Parameter Systems

Componentwise Determination of the Interval Hull Solution for Linear Interval Parameter Systems Componentwise Determination of the Interva Hu Soution for Linear Interva Parameter Systems L. V. Koev Dept. of Theoretica Eectrotechnics, Facuty of Automatics, Technica University of Sofia, 1000 Sofia,

More information

Research of Data Fusion Method of Multi-Sensor Based on Correlation Coefficient of Confidence Distance

Research of Data Fusion Method of Multi-Sensor Based on Correlation Coefficient of Confidence Distance Send Orders for Reprints to reprints@benthamscience.ae 340 The Open Cybernetics & Systemics Journa, 015, 9, 340-344 Open Access Research of Data Fusion Method of Muti-Sensor Based on Correation Coefficient

More information

Appendix of the Paper The Role of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Model

Appendix of the Paper The Role of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Model Appendix of the Paper The Roe of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Mode Caio Ameida cameida@fgv.br José Vicente jose.vaentim@bcb.gov.br June 008 1 Introduction In this

More information

Efficient Generation of Random Bits from Finite State Markov Chains

Efficient Generation of Random Bits from Finite State Markov Chains Efficient Generation of Random Bits from Finite State Markov Chains Hongchao Zhou and Jehoshua Bruck, Feow, IEEE arxiv:0.5339v [cs.it] 4 Dec 00 Abstract The probem of random number generation from an uncorreated

More information

14 Separation of Variables Method

14 Separation of Variables Method 14 Separation of Variabes Method Consider, for exampe, the Dirichet probem u t = Du xx < x u(x, ) = f(x) < x < u(, t) = = u(, t) t > Let u(x, t) = T (t)φ(x); now substitute into the equation: dt

More information

Throughput Optimal Scheduling for Wireless Downlinks with Reconfiguration Delay

Throughput Optimal Scheduling for Wireless Downlinks with Reconfiguration Delay Throughput Optima Scheduing for Wireess Downinks with Reconfiguration Deay Vineeth Baa Sukumaran vineethbs@gmai.com Department of Avionics Indian Institute of Space Science and Technoogy. Abstract We consider

More information

First-Order Corrections to Gutzwiller s Trace Formula for Systems with Discrete Symmetries

First-Order Corrections to Gutzwiller s Trace Formula for Systems with Discrete Symmetries c 26 Noninear Phenomena in Compex Systems First-Order Corrections to Gutzwier s Trace Formua for Systems with Discrete Symmetries Hoger Cartarius, Jörg Main, and Günter Wunner Institut für Theoretische

More information

Control Chart For Monitoring Nonparametric Profiles With Arbitrary Design

Control Chart For Monitoring Nonparametric Profiles With Arbitrary Design Contro Chart For Monitoring Nonparametric Profies With Arbitrary Design Peihua Qiu 1 and Changiang Zou 2 1 Schoo of Statistics, University of Minnesota, USA 2 LPMC and Department of Statistics, Nankai

More information

Sequential Decoding of Polar Codes with Arbitrary Binary Kernel

Sequential Decoding of Polar Codes with Arbitrary Binary Kernel Sequentia Decoding of Poar Codes with Arbitrary Binary Kerne Vera Miosavskaya, Peter Trifonov Saint-Petersburg State Poytechnic University Emai: veram,petert}@dcn.icc.spbstu.ru Abstract The probem of efficient

More information

Convergence Property of the Iri-Imai Algorithm for Some Smooth Convex Programming Problems

Convergence Property of the Iri-Imai Algorithm for Some Smooth Convex Programming Problems Convergence Property of the Iri-Imai Agorithm for Some Smooth Convex Programming Probems S. Zhang Communicated by Z.Q. Luo Assistant Professor, Department of Econometrics, University of Groningen, Groningen,

More information

Approximate Bandwidth Allocation for Fixed-Priority-Scheduled Periodic Resources (WSU-CS Technical Report Version)

Approximate Bandwidth Allocation for Fixed-Priority-Scheduled Periodic Resources (WSU-CS Technical Report Version) Approximate Bandwidth Aocation for Fixed-Priority-Schedued Periodic Resources WSU-CS Technica Report Version) Farhana Dewan Nathan Fisher Abstract Recent research in compositiona rea-time systems has focused

More information

Competitive Diffusion in Social Networks: Quality or Seeding?

Competitive Diffusion in Social Networks: Quality or Seeding? Competitive Diffusion in Socia Networks: Quaity or Seeding? Arastoo Fazei Amir Ajorou Ai Jadbabaie arxiv:1503.01220v1 [cs.gt] 4 Mar 2015 Abstract In this paper, we study a strategic mode of marketing and

More information

MA 201: Partial Differential Equations Lecture - 10

MA 201: Partial Differential Equations Lecture - 10 MA 201: Partia Differentia Equations Lecture - 10 Separation of Variabes, One dimensiona Wave Equation Initia Boundary Vaue Probem (IBVP) Reca: A physica probem governed by a PDE may contain both boundary

More information

Two-sample inference for normal mean vectors based on monotone missing data

Two-sample inference for normal mean vectors based on monotone missing data Journa of Mutivariate Anaysis 97 (006 6 76 wwweseviercom/ocate/jmva Two-sampe inference for norma mean vectors based on monotone missing data Jianqi Yu a, K Krishnamoorthy a,, Maruthy K Pannaa b a Department

More information

Coded Caching for Files with Distinct File Sizes

Coded Caching for Files with Distinct File Sizes Coded Caching for Fies with Distinct Fie Sizes Jinbei Zhang iaojun Lin Chih-Chun Wang inbing Wang Department of Eectronic Engineering Shanghai Jiao ong University China Schoo of Eectrica and Computer Engineering

More information