Characterization of multivariate Bernoulli distributions with given margins


arXiv: v1 [math.ST] 5 Jun 2017

Roberto Fontana (1) and Patrizia Semeraro (2)

(1) Department of Mathematical Sciences, Politecnico di Torino, roberto.fontana@polito.it
(2) Department of Mathematical Sciences, Politecnico di Torino, patrizia.semeraro@polito.it

Abstract

We express each Fréchet class of multivariate Bernoulli distributions with given margins as the convex hull of a set of densities, which belong to the same Fréchet class. This characterisation allows us to establish whether a given correlation matrix is compatible with the assigned margins and, if it is, to easily construct one of the corresponding joint densities. We reduce the problem of finding a density belonging to a Fréchet class and with given correlation matrix to the solution of a linear system of equations. Our methodology also provides the bounds that each correlation must satisfy to be compatible with the assigned margins. An algorithm and its use in some examples are shown.

Keywords: Algebraic statistics; Correlation; Fréchet class; Multivariate binary distribution; Simulation.

1 Introduction

Dependent binary variables play a key role in many important scientific fields such as clinical trials and health studies. The problem of the simulation of correlated binary data is extensively addressed in the statistical literature, e.g. [3], [6], [15] and [9]. Simulation studies are a useful tool for analysing extensions or alternatives to current estimating methodologies, such as generalised linear mixed models, or for the evaluation of statistical procedures for marginal regression models ([13]).

The simulation problem consists of constructing multivariate distributions for given Bernoulli marginal distributions and a given correlation matrix ρ. Frequently, assumptions are made about the correlation structure. Probably the most common is equicorrelation, e.g. [3]. A popular approach also uses working correlation matrices ([10] and [16]), such as first order moving average correlations or first order autoregressive correlations ([12] and references therein). An important issue for these simulation procedures is the compatibility of marginal binary variables and their correlations, since problems may arise when the margins and the correlation matrix are not compatible ([4], [14] and [3]). The range of admissible correlation matrices for binary variables is well known in the bivariate case. This problem has been widely identified in the literature but, to the best of our knowledge, no effective solution exists for multivariate binary distributions with more than three variables ([3]).

We propose a new but simple methodology to characterise Bernoulli variables belonging to a given Fréchet class, i.e. with given marginal distributions. This characterisation allows us to establish whether a given correlation matrix is compatible with the assigned margins and, if it is, to easily construct one of the corresponding joint densities. It also provides the bounds that each correlation must satisfy to be compatible with the assigned margins. Furthermore, if the correlation structure and the margins are not compatible, we can find a new correlation matrix which is close to the desired one but compatible with the given margins. It is worth noting that this methodology puts no restriction either on the number of variables or on the correlation structure. It also provides a new computational procedure to simulate multivariate distributions of binary variables with assigned margins and given moments.

The proposed methodology is based on a polynomial representation of all the multivariate Bernoulli distributions of a given Fréchet class, i.e. of all the distributions with fixed Bernoulli margins. This representation is linked to the Farlie-Gumbel-Morgenstern copula ([11]). It allows us to write each Fréchet class as the convex hull of the ray densities, which are densities that belong to the Fréchet class under consideration. By so doing, the problem of finding one distribution with given moments in a Fréchet class is reduced to the solution of a linear system of equations.

2 Preliminaries

Let $\mathcal{F}_m$ be the set of m-dimensional distributions which have Bernoulli univariate marginal distributions. Let us consider the Fréchet class $\mathcal{F}(p_1,\ldots,p_m) \subseteq \mathcal{F}_m$ of distribution functions in $\mathcal{F}_m$ which have the same Bernoulli marginal distributions $B(p_i)$, $0 < p_i < 1$, $i = 1,\ldots,m$. If $X = (X_1,\ldots,X_m)$ is a random vector with joint distribution in $\mathcal{F}(p_1,\ldots,p_m)$, we denote

- its cumulative distribution function by $F_p$ and its density function by $f_p$, where $p = (p_1,\ldots,p_m)$;
- the column vectors which contain the values of $F_p$ and $f_p$ over $\mathcal{S}_m := \{0,1\}^m$, with a small abuse of notation, still by $F_p = (F_p(x) : x \in \mathcal{S}_m)$ and $f_p = (f_p(x) : x \in \mathcal{S}_m)$ respectively; we make the non-restrictive hypothesis that $\mathcal{S}_m$ is ordered according to the reverse-lexicographical criterion;
- the marginal cumulative distribution function and the marginal density function of $X_i$ by $F_{p,i}$ and $f_{p,i}$ respectively, $i = 1,\ldots,m$;
- the values $f_{p,i}(0) \equiv F_{p,i}(0)$ and $f_{p,i}(1)$ by $q_i$ and $p_i$ respectively, $i = 1,\ldots,m$.

We observe that $q_i = 1 - p_i$ and that the expected value of $X_i$ is $p_i$, $E[X_i] = p_i$, $i = 1,\ldots,m$.

Given two matrices $A \in M(n \times m)$ and $B \in M(d \times l)$, the matrix $A \otimes B \in M(nd \times ml)$ indicates their Kronecker product, and $A^{\otimes n}$ is $A \otimes \cdots \otimes A$ (n times).

If we consider a Bernoulli variable $B(\tau)$, $0 < \tau < 1$, with $F_\tau$ and $f_\tau$ as cumulative and density function respectively, the following holds

$$\begin{pmatrix} f_\tau(0) \\ f_\tau(1) \end{pmatrix} = D \begin{pmatrix} F_\tau(0) \\ F_\tau(1) \end{pmatrix}, \qquad \text{where } D = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}$$

is the difference matrix. It follows that, given $F_p$ and $f_p$ in $\mathcal{F}(p_1,\ldots,p_m)$, we have

$$f_p = D^{\otimes m} F_p. \qquad (2.1)$$

Finally, we can write $f_p \in \mathcal{F}(p_1,\ldots,p_m)$, $F_p \in \mathcal{F}(p_1,\ldots,p_m)$ and $X \in \mathcal{F}(p_1,\ldots,p_m)$.
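Equation (2.1) can be illustrated with a small sketch. It uses only the Python standard library; the helper names (`kron`, `kron_power`, `support`, `density_from_cdf`) are hypothetical, and the ordering of the support (first coordinate varying fastest) is an assumption about what the reverse-lexicographical criterion means here. The example takes two independent Bernoulli variables, for which the expected density is simply the product of the margins.

```python
from fractions import Fraction
from itertools import product


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def kron_power(a, m):
    """m-fold Kronecker power of a."""
    result = [[1]]
    for _ in range(m):
        result = kron(result, a)
    return result


def support(m):
    """{0,1}^m, assuming the first coordinate varies fastest."""
    return [tuple(reversed(t)) for t in product((0, 1), repeat=m)]


D = [[1, 0], [-1, 1]]  # difference matrix of the Preliminaries


def density_from_cdf(cdf_values, m):
    """Apply f = D^{(x) m} F to the vector of CDF values."""
    matrix = kron_power(D, m)
    return [sum(c * v for c, v in zip(row, cdf_values)) for row in matrix]


# Illustrative margins (an assumption, not taken from the paper).
p = [Fraction(1, 2), Fraction(1, 4)]
q = [1 - pi for pi in p]
S = support(2)
# CDF of two independent Bernoulli variables: F(x) = prod_i F_i(x_i),
# with F_i(0) = q_i and F_i(1) = 1.
cdf = [(q[0] if x[0] == 0 else 1) * (q[1] if x[1] == 0 else 1) for x in S]
density = density_from_cdf(cdf, 2)
```

Exact fractions are used so that the recovered density can be compared with the product of the margins without rounding error.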

3 Construction of multivariate Bernoulli distributions with given margins

We give a polynomial and matrix representation of all the $F_p \in \mathcal{F}(p_1,\ldots,p_m)$. We make the non-restrictive hypothesis that $\{q_1,1\} \times \cdots \times \{q_m,1\}$ is ordered according to the reverse-lexicographical criterion. We denote $\{q_1,1\} \times \cdots \times \{q_m,1\}$ by $\mathcal{Q}_m$.

Theorem 3.1. Any distribution $F_p \in \mathcal{F}(p_1,\ldots,p_m)$ admits the following representation over $\mathcal{Q}_m$:

$$F_p = \Lambda_p U_p \theta$$

where $\Lambda_p = \mathrm{diag}\big(q_1^{(1-\alpha_1)} \cdots q_m^{(1-\alpha_m)},\ (\alpha_1,\ldots,\alpha_m) \in \mathcal{S}_m\big)$, $U_p = U_{p_1} \otimes \cdots \otimes U_{p_m}$,

$$U_{p_i} = \begin{pmatrix} 1 & 1-q_i \\ 1 & 0 \end{pmatrix}, \quad i = 1,\ldots,m,$$

and $\theta = (\theta_0, \theta_m, \theta_{m-1}, \theta_{m,m-1}, \ldots, \theta_{12\ldots m})$. Necessary conditions for $F_p$ being a distribution are $\theta_0 = 1$ and $\theta_i = 0$, $i = 1,\ldots,m$.

Proof. Given $u = (u_1,\ldots,u_m) \in \mathcal{Q}_m$ let us define

$$g(u) = \Big(\prod_{i=1}^m u_i\Big)\Big(\theta_0 + \sum_{j=1}^m \theta_j (1-u_j) + \sum_{1 \le j < k \le m} \theta_{jk}(1-u_j)(1-u_k) + \cdots + \theta_{12\ldots m}\prod_{i=1}^m (1-u_i)\Big)$$

and the row vectors $a_i = (1,\ 1-u_i)$, $i = 1,\ldots,m$. We can write $g(u) \in \mathbb{R}$ as

$$g(u) = \Big(\prod_{i=1}^m u_i\Big)(a_1 \otimes \cdots \otimes a_m)\,\theta.$$

Considering all the $u \in \mathcal{Q}_m$ we get the $2^m$-vector $(g(u), u \in \mathcal{Q}_m) = \Lambda_p U_p \theta$.

We observe that the determinant of $U_{p_i}$ is $\det(U_{p_i}) = -p_i$. It follows that the determinant of $U_p$, which is $\prod_{i=1}^m (-p_i)^{2^{m-1}}$, is also different from zero. Since the determinant of $\Lambda_p$ is different from zero as well (its diagonal entries are products of the $q_i > 0$), the determinant of $\Lambda_p U_p$ is different from zero. It follows that the rank of $\Lambda_p U_p$ is $2^m$, and then any vector $y \in \mathbb{R}^{2^m}$, and in particular any distribution $F_p$, can be written as $F_p = \Lambda_p U_p \theta$.

If $F_p$ is a distribution in $\mathcal{F}(p_1,\ldots,p_m)$, the vector parameter $\theta$ must satisfy the following necessary conditions:

1. $\theta_0 = 1$. The condition $F_p(1,\ldots,1) = 1$ implies $\theta_0 = 1$, since $F_p(1,\ldots,1) = \theta_0$;
2. $\theta_i = 0$, $i = 1,\ldots,m$. The condition $F_p(1,\ldots,1,0,1,\ldots,1) = q_i$ implies $\theta_i = 0$, $i = 1,\ldots,m$, since $F_p(1,\ldots,1,0,1,\ldots,1) = q_i(1 + \theta_i(1-q_i))$.
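As a worked instance (a sketch for m = 2, obtained by evaluating g on the four points of Q_2 under the necessary conditions θ_0 = 1, θ_1 = θ_2 = 0 and then differencing as in Eq. (2.1)), the representation reduces to a single free parameter θ_12:

```latex
\begin{aligned}
F_p(0,0) &= g(q_1,q_2) = q_1 q_2\,(1 + \theta_{12}\, p_1 p_2), &
F_p(1,0) &= g(1,q_2) = q_2, \\
F_p(0,1) &= g(q_1,1) = q_1, &
F_p(1,1) &= g(1,1) = 1, \\[4pt]
f_p(0,0) &= q_1 q_2 + \theta_{12}\, p_1 p_2 q_1 q_2, &
f_p(1,0) &= p_1 q_2 - \theta_{12}\, p_1 p_2 q_1 q_2, \\
f_p(0,1) &= q_1 p_2 - \theta_{12}\, p_1 p_2 q_1 q_2, &
f_p(1,1) &= p_1 p_2 + \theta_{12}\, p_1 p_2 q_1 q_2 .
\end{aligned}
```

The four density values sum to one for every θ_12, and since E[X_1 X_2] = f_p(1,1), the parameter θ_12 is related to the correlation by ρ_12 = θ_12 √(p_1 q_1 p_2 q_2). Non-negativity of the four values is what restricts the admissible range of θ_12.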

Remark 1. Under the necessary assumptions $\theta_0 = 1$ and $\theta_i = 0$, $i = 1,\ldots,m$, the polynomial function $g(u)$ in Theorem 3.1 is the restriction of the well-known Farlie-Gumbel-Morgenstern copula $C(u)$ to $\mathcal{Q}_m$:

$$C(u) := \Big(\prod_{i=1}^m u_i\Big)\Big(1 + \sum_{1 \le j < k \le m} \theta_{jk}(1-u_j)(1-u_k) + \cdots + \theta_{12\ldots m}\prod_{i=1}^m (1-u_i)\Big), \quad u \in [0,1]^m.$$

Notice that the condition $\theta_0 = 1$ derives from $C(1,\ldots,1) = 1$, and the condition $\theta_i = 0$ is necessary since a requirement to be a copula is that $C(1,\ldots,1,q_i,1,\ldots,1) = q_i$, $i = 1,\ldots,m$. Our representation shows that the restriction to $\mathcal{Q}_m$ of the Farlie-Gumbel-Morgenstern copula allows us to represent all the binary distributions with given margins, and therefore to model all the possible dependence structures of multivariate Bernoulli distributions.

As a consequence of Theorem 3.1 and Equation (2.1), any density $f_p \in \mathcal{F}(p_1,\ldots,p_m)$ admits the following representation over $\mathcal{S}_m$:

$$f_p = D^{\otimes m} \Lambda_p U_p \theta. \qquad (3.1)$$

We observe that, given $f_p \in \mathcal{F}(p_1,\ldots,p_m)$, we can write it as in Eq. (3.1). Vice versa, Theorem 3.1 does not provide any condition on $\theta_{i_1,\ldots,i_k}$ for $k \ge 2$ such that $D^{\otimes m} \Lambda_p U_p \theta$ represents a density function $f_p$ over $\mathcal{S}_m$. In the remaining part of this section we will provide a representation of all the densities $f_p \in \mathcal{F}(p_1,\ldots,p_m)$.

Theorem 3.2. Let $f_p \in \mathcal{F}(p_1,\ldots,p_m)$. It holds that

$$f_p = \sum_{i=1}^{n_F} \lambda_i R_p^{(i)}, \qquad (3.2)$$

where $R_p^{(i)} = (R_p^{(i)}(x), x \in \mathcal{S}_m) \in \mathcal{F}(p_1,\ldots,p_m)$, $\lambda_i \ge 0$, $i = 1,\ldots,n_F$, and $\sum_{i=1}^{n_F} \lambda_i = 1$.

Proof. Let us define $Y_p = D^{\otimes m} \Lambda_p U_p$. From Eq. (3.1) it holds that $f_p = Y_p \theta$, with the conditions $\theta_0 = 1$ and $\theta_i = 0$, $i = 1,\ldots,m$. We can write $\theta = Y_p^{-1} f_p$. The conditions $\theta_i = 0$, $i = 1,\ldots,m$, can be written as

$$H f_p = 0, \qquad (3.3)$$

where $H$ is the $m \times 2^m$ sub-matrix of $Y_p^{-1}$ obtained by selecting the rows corresponding to $\theta_i$, $i = 1,\ldots,m$. The condition $\theta_0 = 1 \Leftrightarrow F_p(1,\ldots,1) = 1$ is ensured by requiring that $f_p$ is a density, i.e.

1. $f_p(x) \ge 0$;
2. $\sum_x f_p(x) = 1$,

where $x \in \mathcal{S}_m$. All the non-negative solutions $f_p$ of (3.3) have the following form:

$$f_p = \sum_{i=1}^{n_F} \tilde\lambda_i \tilde R_p^{(i)}, \quad \tilde\lambda_i \ge 0,$$

where $\tilde R_p^{(i)} = (\tilde R_{p,j}^{(i)}, j = 1,\ldots,2^m) \in \mathbb{R}^{2^m}$, $\tilde R_{p,j}^{(i)} \ge 0$ and $H \tilde R_p^{(i)} = 0$, $i = 1,\ldots,n_F$, are the extremal rays of the cone defined by $H f_p = 0$ ([1] and [7]). By dividing $\tilde R_p^{(i)}$ by the sum of its elements $\tilde R_{p,+}^{(i)} = \sum_{j=1}^{2^m} \tilde R_{p,j}^{(i)}$ we can write

$$f_p = \sum_{i=1}^{n_F} \lambda_i R_p^{(i)},$$

where $\lambda_i = \tilde\lambda_i \tilde R_{p,+}^{(i)}$ and $R_p^{(i)} = \tilde R_p^{(i)} / \tilde R_{p,+}^{(i)}$, $i = 1,\ldots,n_F$. It follows that $\sum_{j=1}^{2^m} R_{p,j}^{(i)} = 1$ and that the ray density defined as $R_p^{(i)}(x) := R_{p,j}^{(i)}$, $x$ being the j-th element of $\mathcal{S}_m$, belongs to $\mathcal{F}(p_1,\ldots,p_m)$, $i = 1,\ldots,n_F$. Finally, the condition $\sum_x f_p(x) = 1$ implies $\sum_{i=1}^{n_F} \lambda_i \big(\sum_{j=1}^{2^m} R_{p,j}^{(i)}\big) = \sum_{i=1}^{n_F} \lambda_i = 1$. Then we have $\lambda_i \ge 0$, $i = 1,\ldots,n_F$, and $\sum_{i=1}^{n_F} \lambda_i = 1$, and the assertion is proved.

Notice that Theorem 3.2 makes it extremely easy to generate any density $f_p$ of the Fréchet class $\mathcal{F}(p_1,\ldots,p_m)$. It is enough to take a non-negative vector $\lambda = (\lambda_1,\ldots,\lambda_{n_F})$ such that $\sum_{i=1}^{n_F} \lambda_i = 1$, and build $f_p = \sum_{i=1}^{n_F} \lambda_i R_p^{(i)}$.

The constraints $E[X_i] = p_i$, $i = 1,\ldots,m$, allow us to obtain an interesting interpretation of the matrix $H$ of (3.3). We have

$$E[X_i] = \sum_{(x_1,\ldots,x_m) \in \mathcal{S}_m} x_i f_p(x_1,\ldots,x_m).$$

It follows that

$$x_i^T f_p = p_i, \qquad (1 - x_i)^T f_p = q_i,$$

where $x_i$ is the vector which contains the i-th coordinate of each $x \in \mathcal{S}_m$, $i = 1,\ldots,m$. If we consider the odds of the event $X_i = 1$, $\gamma_i = p_i/q_i$, we have $\gamma_i q_i - p_i = 0$. We can write

$$(\gamma_i (1 - x_i)^T - x_i^T) f_p = 0.$$

Then $H$ is simply the $m \times 2^m$ matrix whose rows, up to a non-influential multiplicative constant, are $(\gamma_i (1 - x_i)^T - x_i^T)$, $i = 1,\ldots,m$.

Using Theorem 3.2 we represent each Fréchet class $\mathcal{F}(p_1,\ldots,p_m)$ as the convex hull of the ray densities. We observe that the ray densities depend only on the marginal distributions $F_1,\ldots,F_m$. Building the ray matrix $R_p$

$$R_p = \begin{pmatrix} R_{p,1}^{(1)} & \cdots & R_{p,1}^{(n_F)} \\ \vdots & & \vdots \\ R_{p,2^m}^{(1)} & \cdots & R_{p,2^m}^{(n_F)} \end{pmatrix}$$

whose columns are the ray densities $R_p^{(i)}$, $i = 1,\ldots,n_F$, we write Eq. (3.2) simply as

$$f_p = R_p \lambda$$

with $\lambda = (\lambda_1,\ldots,\lambda_{n_F})$, $\lambda_i \ge 0$ and $\sum_{i=1}^{n_F} \lambda_i = 1$.

In practical applications the rays $\tilde R_p^{(i)}$, and therefore the ray densities $R_p^{(i)}$, can be found using the software 4ti2, [1]. In Section 5 we will use SAS and 4ti2 to show some numerical examples. In the next sections we will see that the representation of $f_p$ as in Theorem 3.2 plays a key role in determining the densities with given moments.

3.1 Moments of multivariate Bernoulli variables

We observe that, given the Bernoulli variable $X \sim B(\tau)$, $0 < \tau < 1$, with density function $f_\tau$, we can compute the moments $E[X^\alpha]$, $\alpha \in \{0,1\}$, as

$$\begin{pmatrix} E[1] \\ E[X] \end{pmatrix} = M \begin{pmatrix} f_\tau(0) \\ f_\tau(1) \end{pmatrix}, \qquad \text{where } M = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

It follows that, given $X = (X_1,\ldots,X_m) \in \mathcal{F}(p_1,\ldots,p_m)$ with multivariate joint density $f_p$, we can compute the vector of its moments $E[X^\alpha] \equiv E[X_1^{\alpha_1} \cdots X_m^{\alpha_m}]$, $\alpha = (\alpha_1,\ldots,\alpha_m) \in \mathcal{S}_m$, as

$$E[X^\alpha] = M^{\otimes m} f_p.$$

We also observe that the correlation $\rho_{ij}$ between two Bernoulli variables $X_i \sim B(p_i)$ and $X_j \sim B(p_j)$ is related to the second-order moment $E[X_i X_j]$ as follows:

$$E[X_i X_j] = \rho_{ij} \sqrt{p_i q_i p_j q_j} + p_i p_j. \qquad (3.4)$$
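The odds interpretation of H and the moment relations of Section 3.1 can be sketched as follows. The function names are hypothetical, the ordering of the support (first coordinate fastest) is an assumption, and the moments are computed by direct summation over the support, which gives the same numbers as the rows of the Kronecker power of M. The example density is that of three independent Bernoulli variables, chosen only because its moments are known in closed form.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def support(m):
    """{0,1}^m, assuming the first coordinate varies fastest."""
    return [tuple(reversed(t)) for t in product((0, 1), repeat=m)]


def h_matrix(p):
    """Rows gamma_i * (1 - x_i)^T - x_i^T with odds gamma_i = p_i / q_i."""
    S = support(len(p))
    return [[(pi / (1 - pi)) * (1 - x[i]) - x[i] for x in S]
            for i, pi in enumerate(p)]


def moment(f, alpha, S):
    """E[X^alpha]: total mass of the points with x_k = 1 wherever alpha_k = 1."""
    return sum(fx for fx, x in zip(f, S)
               if all(xk >= ak for xk, ak in zip(x, alpha)))


def correlation(f, i, j, p, S):
    """Invert Eq. (3.4): rho_ij = (E[X_i X_j] - p_i p_j) / sqrt(p_i q_i p_j q_j)."""
    alpha = tuple(1 if k in (i, j) else 0 for k in range(len(p)))
    scale = sqrt(p[i] * (1 - p[i]) * p[j] * (1 - p[j]))
    return float(moment(f, alpha, S) - p[i] * p[j]) / scale


# Hypothetical example: three independent Bernoulli variables.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
S = support(3)
f = []
for x in S:
    mass = Fraction(1)
    for xi, pi in zip(x, p):
        mass *= pi if xi == 1 else 1 - pi
    f.append(mass)

H = h_matrix(p)
residual = [sum(h * fx for h, fx in zip(row, f)) for row in H]  # H f
mu_12 = moment(f, (1, 1, 0), S)                                 # E[X_1 X_2]
rho_12 = correlation(f, 0, 1, p, S)
```

Any density with the prescribed margins gives a zero residual, so the same check can be applied to a candidate density before computing its correlations.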

3.2 Second-order moments of multivariate Bernoulli variables with given margins

From Theorem 3.2 we get

$$E[X^\alpha] = M^{\otimes m} f_p = M^{\otimes m} R_p \lambda.$$

In particular, for the second-order moments $\mu_2 = (E[X^\alpha] : |\alpha| = 2)$, where $|\alpha| = \sum_{i=1}^m \alpha_i$, we get the following result, which is crucial for the solution of the problem of simulating multivariate binary distributions with a given correlation matrix.

Proposition 3.1. It holds that

$$\mu_2 = A_{2p} \lambda \qquad (3.5)$$

where $A_{2p} = (M^{\otimes m})_2 R_p$, $(M^{\otimes m})_2$ is the sub-matrix of $M^{\otimes m}$ obtained by selecting the rows corresponding to the second-order moments, $R_p$ is the ray matrix and $\lambda = (\lambda_1,\ldots,\lambda_{n_F})$, $\lambda_i \ge 0$, $i = 1,\ldots,n_F$, and $\sum_{i=1}^{n_F} \lambda_i = 1$.

It follows that the target second-order moments are compatible with the means if they belong to the convex hull generated by the points which are the columns of the $A_{2p} = (M^{\otimes m})_2 R_p$ matrix. As a direct consequence of Proposition 3.1 we also get the univariate bounds for the second-order moments and the correlations.

Proposition 3.2. For each $\alpha$, $|\alpha| = 2$, the second-order moment $\mu_2^{(\alpha)}$ must satisfy the following bounds

$$\min A_{2p}^{(\alpha)} \le \mu_2^{(\alpha)} \le \max A_{2p}^{(\alpha)} \qquad (3.6)$$

and the corresponding correlation $\rho_{ij}$ must satisfy the following bounds

$$\frac{\min A_{2p}^{(\alpha)} - p_i p_j}{\sqrt{p_i q_i p_j q_j}} \le \rho_{ij} \le \frac{\max A_{2p}^{(\alpha)} - p_i p_j}{\sqrt{p_i q_i p_j q_j}} \qquad (3.7)$$

where $A_{2p}^{(\alpha)}$ is the row of the matrix $A_{2p}$ such that $\mu_2^{(\alpha)} = A_{2p}^{(\alpha)} \lambda$ and $\{i,j\} = \{k : \alpha_k = 1\}$.

Proof. From Proposition 3.1, using the proper row of $A_{2p}$, we get

$$\mu_2^{(\alpha)} = A_{2p}^{(\alpha)} \lambda.$$

To prove (3.6) it is enough to observe that

1. since $\lambda_i \ge 0$ and $\sum_{i=1}^{n_F} \lambda_i = 1$, the minimum (maximum) value of $\mu_2^{(\alpha)}$ is obtained by choosing $\lambda$ equal to one of the $e_i$'s, where $e_i \in \{0,1\}^{n_F}$ is the binary vector with all the elements equal to zero apart from the i-th, which is equal to one, $i = 1,\ldots,n_F$;

2. the product $A_{2p}^{(\alpha)} e_i$ gives the i-th element of $A_{2p}^{(\alpha)}$.

To prove (3.7) we simply observe that, using equation (3.4), the bounds in (3.6) can be transformed to those suitable for correlations.

Now we solve the problem of constructing a multivariate Bernoulli density $f_p \in \mathcal{F}(p_1,\ldots,p_m)$ with given correlation matrix $\rho = (\rho_{ij})_{i,j=1,\ldots,m}$. Using Equation (3.4) we transform the desired correlations $\rho_{ij}$ into the corresponding desired second-order moments $E[X_i X_j]$, $i,j = 1,\ldots,m$, $i < j$. In this way the density $f_p$ with means $p_1,\ldots,p_m$ and correlation matrix $\rho$ can be built as $R_p \lambda$, where $\lambda = (\lambda_1,\ldots,\lambda_{n_F})$, $\lambda_i \ge 0$, $\sum_{i=1}^{n_F} \lambda_i = 1$, is a solution, if it exists, of the system of equations (3.5).

The space of solutions $\lambda$ of the system (3.5) defines the set of distributions in the Fréchet class with correlation matrix $\rho$. The choice of a particular solution does not modify the distributions of the sample means and of the sample second-order moments, which depend only on $p_1,\ldots,p_m$ and $\rho$ respectively. To explain this point let us consider a random sample $\{(X_{k1},\ldots,X_{km}),\ k = 1,\ldots,N\}$ extracted from a randomly selected m-dimensional Bernoulli variable belonging to the Fréchet class $\mathcal{F}(p_1,\ldots,p_m)$ and with given second-order moments $\mu_{ij} := E[X_i X_j]$, $i,j = 1,\ldots,m$. The sample means $\bar X_i$, $i = 1,\ldots,m$, are $\frac{1}{N}\mathrm{Binomial}(N, p_i)$ and the sample second-order moments $\overline{X_i X_j} := \frac{1}{N}\sum_{k=1}^N X_{ki} X_{kj}$, $i,j = 1,\ldots,m$, $i < j$, are $\frac{1}{N}\mathrm{Binomial}(N, \mu_{ij})$.

In general, different distributions which belong to the same Fréchet class and which have the same correlation matrix $\rho$ (or equivalently the same vector of second-order moments $\mu_2$) will have different k-order moments, with $k \ge 3$. This methodology offers the opportunity to choose the best distribution according to a certain criterion. For example, as the moments of multivariate Bernoulli variables are always non-negative, it could be of interest to find one of the distributions with the smallest sum of all the moments with order greater than 2. This problem can be efficiently solved using linear programming techniques ([2]). It can be simply stated as

$$\min_{f \in \mathcal{F}_m} \mathbf{1}^T (M^{\otimes m})_{3\ldots m} f \quad \text{subject to} \quad \begin{cases} H f = 0 \\ (M^{\otimes m})_2 f = \mu_2 \end{cases}$$

where $\mathbf{1}$ is the vector with all the elements equal to 1 and $(M^{\otimes m})_{3\ldots m}$ is the sub-matrix of $M^{\otimes m}$ obtained by selecting the rows corresponding to the k-moments, with $k \ge 3$.

As we already mentioned, from a geometrical point of view a solution of the system of equations (3.5) exists if and only if the point whose coordinates are the desired second-order moments belongs to the convex hull generated by the points which are the columns of the $A_{2p} = (M^{\otimes m})_2 R_p$ matrix. If the margins and the correlation matrix are not compatible, the system (3.5) does not have any solution. In this case it is possible to search for a feasible correlation matrix which is the closest to the desired $\rho$, according to a chosen distance.

Finally, it is worth noting that the method can be applied to the moments of order greater than 2, or to any selection of moments, by simply replacing the $(M^{\otimes m})_2$ matrix with the proper one.

3.3 Margins of multivariate Bernoulli variables with given second-order moments

In Section 3.2 we studied second-order moments of multivariate Bernoulli variables with given margins. The methodology can be easily generalised to solve the problem of studying h-order moments of multivariate Bernoulli variables with given k-order moments, $h,k \in \{1,\ldots,m\}$, $h \ne k$. We show this point by studying the $h = 1$, $k = 2$ case, i.e. studying margins of multivariate Bernoulli variables $f_{\mu_2}$ with given second-order moments $\mu_2 = (\mu_{ij} : i,j = 1,\ldots,m,\ i < j)$.

We observe that

$$E[X_i X_j] = \sum_{(x_1,\ldots,x_m) \in \mathcal{S}_m} x_i x_j f_{\mu_2}(x_1,\ldots,x_m),$$

that is

$$x_{ij}^T f_{\mu_2} = \mu_{ij}, \qquad (1 - x_{ij})^T f_{\mu_2} = 1 - \mu_{ij},$$

where $x_{ij}$ is the vector which contains the product $x_i x_j$ of the i-th and the j-th coordinate of each $x \in \mathcal{S}_m$. If we consider the odds of the event $X_i X_j = 1$, $\gamma_{ij} = \mu_{ij}/(1 - \mu_{ij})$, we have $\gamma_{ij}(1 - \mu_{ij}) - \mu_{ij} = 0$, that is

$$(\gamma_{ij}(1 - x_{ij})^T - x_{ij}^T) f_{\mu_2} = 0.$$

Building the matrix $H_2$ whose rows are $(\gamma_{ij}(1 - x_{ij})^T - x_{ij}^T)$, all the densities $f_{\mu_2}$ must satisfy the system of equations

$$H_2 f_{\mu_2} = 0.$$

The following proposition is the equivalent of Theorem 3.2, Proposition 3.1 and Proposition 3.2 for the case under study.

Proposition 3.3. Let $f_{\mu_2}$ be a multivariate Bernoulli density with second-order moments $\mu_2 = (\mu_{ij} : i,j = 1,\ldots,m,\ i < j)$:

1. all the densities $f_{\mu_2}$ can be written as

$$f_{\mu_2} = \sum_{i=1}^{n_F} \lambda_i R_{\mu_2}^{(i)}, \qquad (3.8)$$

where $R_{\mu_2}^{(i)} = (R_{\mu_2}^{(i)}(x), x \in \mathcal{S}_m)$, $i = 1,\ldots,n_F$, are multivariate Bernoulli densities with second-order moments $\mu_2$, $\lambda_i \ge 0$, $i = 1,\ldots,n_F$, and $\sum_{i=1}^{n_F} \lambda_i = 1$.

2. The vector $p = (p_1,\ldots,p_m)$ is

$$p = A_{1\mu_2} \lambda \qquad (3.9)$$

where $A_{1\mu_2} = (M^{\otimes m})_1 R_{\mu_2}$, $(M^{\otimes m})_1$ is the sub-matrix of $M^{\otimes m}$ obtained by selecting the rows corresponding to the first-order moments, $R_{\mu_2}$ is the ray matrix and $\lambda = (\lambda_1,\ldots,\lambda_{n_F})$, $\lambda_i \ge 0$, $i = 1,\ldots,n_F$, and $\sum_{i=1}^{n_F} \lambda_i = 1$.

3. For each $\alpha$, $|\alpha| = 1$, the first-order moment $\mu_1^{(\alpha)} \equiv p_i$ must satisfy the following bounds

$$\min A_{1\mu_2}^{(\alpha)} \le p_i \le \max A_{1\mu_2}^{(\alpha)} \qquad (3.10)$$

where $A_{1\mu_2}^{(\alpha)}$ is the row of the matrix $A_{1\mu_2}$ such that $p_i = A_{1\mu_2}^{(\alpha)} \lambda$ and $\{i\} = \{k : \alpha_k = 1\}$.

4 Bivariate Bernoulli density with given margins

In this section we consider bivariate distributions, i.e. the class $\mathcal{F}(p_1,p_2)$ of 2-dimensional random variables $(X_1,X_2)$ which have Bernoulli marginal distributions $F_i \sim B(p_i)$, $i = 1,2$. In the bivariate case two key distributions are $F_L$ and $F_U$, the lower and upper Fréchet bound of $\mathcal{F}(p_1,p_2)$ respectively:

$$F_L(x) = \max\{F_1(x_1) + F_2(x_2) - 1,\ 0\} \qquad (4.1)$$

$$F_U(x) = \min\{F_1(x_1),\ F_2(x_2)\} \qquad (4.2)$$

where $x = (x_1,x_2) \in \{0,1\}^2$. For any $F_p \in \mathcal{F}(p_1,p_2)$ it holds that

$$F_L(x) \le F_p(x) \le F_U(x), \quad x \in \{0,1\}^2. \qquad (4.3)$$

For an overview of Fréchet classes and their bounds see [5].

We now analyse Theorem 3.2 in the bivariate case. The number of rays is independent of the Fréchet class $\mathcal{F}(p_1,p_2)$. We have two ray densities, which are the lower and upper Fréchet bound of each class.

Proposition 4.1. Let $f_p \in \mathcal{F}(p_1,p_2)$, then

$$f_p = \lambda f_L + (1-\lambda) f_U, \quad \lambda \in [0,1],$$

where $f_L$ and $f_U$ are the discrete densities corresponding to $F_L$ and $F_U$, respectively.

Proof. We observe that in $x = (0,0)$ the distribution function and the density function take the same value. Then using (4.3) we can write

$$f_L(0,0) \le f_p(0,0) \le f_U(0,0). \qquad (4.4)$$

It follows that $f_p(0,0) = \lambda f_L(0,0) + (1-\lambda) f_U(0,0)$ with

$$\lambda = \frac{f_p(0,0) - f_U(0,0)}{f_L(0,0) - f_U(0,0)}.$$

It holds that $0 \le \lambda \le 1$. Now we observe that for any density function $f \in \mathcal{F}(p_1,p_2)$ we have $f(0,1) = q_1 - f(0,0)$. Then using (4.4) we can write

$$q_1 - f_L(0,0) \ge q_1 - f_p(0,0) \ge q_1 - f_U(0,0),$$

that is

$$f_U(0,1) \le f_p(0,1) \le f_L(0,1).$$

We can write $f_p(0,1) = \lambda_1 f_L(0,1) + (1-\lambda_1) f_U(0,1)$. It is easy to verify that $\lambda_1 = \lambda$. We proceed in an analogous way for $f_p(1,0) = q_2 - f_p(0,0)$ and $f_p(1,1) = 1 - q_1 - q_2 + f_p(0,0)$, and we get $f_p(x) = \lambda f_L(x) + (1-\lambda) f_U(x)$, $x \in \{0,1\}^2$, with $0 \le \lambda \le 1$.

Proposition 4.1 states that $\mathcal{F}(p_1,p_2)$ is the convex hull of the upper and lower Fréchet bound.

In the bivariate case we can also find the domain of $\theta_{12}$ expressed as a function of the margins $p_1, p_2$. From Eq. (3.1) we get

$$f_p(0,0) = q_1 q_2 (1 + \theta_{12} p_1 p_2) \qquad (4.5)$$

and consequently

$$\theta_{12} = \frac{f_p(0,0) - q_1 q_2}{q_1 q_2 p_1 p_2}. \qquad (4.6)$$

Using (4.4) it follows that

$$\frac{f_L(0,0) - q_1 q_2}{q_1 q_2 p_1 p_2} \le \theta_{12} \le \frac{f_U(0,0) - q_1 q_2}{q_1 q_2 p_1 p_2}.$$

Now, without loss of generality, we assume $q_2 \ge q_1$. From Eq. (4.1) and (4.2) we get

1. if $q_1 + q_2 \le 1$ then $-\dfrac{1}{p_1 p_2} \le \theta_{12} \le \dfrac{1}{p_1 q_2}$;
2. if $q_1 + q_2 > 1$ then $\dfrac{q_1 + q_2 - 1 - q_1 q_2}{q_1 q_2 p_1 p_2} = -\dfrac{1}{q_1 q_2} \le \theta_{12} \le \dfrac{1}{p_1 q_2}$.

Finally (see also Theorem 1 in [8]) we obtain the bounds for the correlation coefficient

$$\rho_{12} = \frac{E[X_1 X_2] - p_1 p_2}{\sqrt{p_1 q_1 p_2 q_2}}.$$

Since $E[X_1 X_2] = f_p(1,1)$, $f_L(1,1) \le f_p(1,1) \le f_U(1,1)$ and $f(1,1) = 1 - q_1 - q_2 + f(0,0)$ for any density function $f \in \mathcal{F}(p_1,p_2)$, we obtain:

1. if $q_1 + q_2 \le 1$ then
$$\frac{1 - q_1 - q_2 - p_1 p_2}{\sqrt{p_1 q_1 p_2 q_2}} = -\sqrt{\frac{q_1 q_2}{p_1 p_2}} \le \rho_{12} \le \frac{1 - q_2 - p_1 p_2}{\sqrt{p_1 q_1 p_2 q_2}} = \sqrt{\frac{p_2 q_1}{p_1 q_2}};$$
2. if $q_1 + q_2 > 1$ then
$$\frac{-p_1 p_2}{\sqrt{p_1 q_1 p_2 q_2}} = -\sqrt{\frac{p_1 p_2}{q_1 q_2}} \le \rho_{12} \le \frac{1 - q_2 - p_1 p_2}{\sqrt{p_1 q_1 p_2 q_2}} = \sqrt{\frac{p_2 q_1}{p_1 q_2}}.$$
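The bivariate results can be checked numerically with a short sketch. It builds the two ray densities (the Fréchet bounds), reads the correlation bounds off the single row of the second-order moment matrix as in Proposition 3.2, and recovers the mixing weight λ of Proposition 4.1 for a given density. The function names and the example margins are hypothetical; densities are listed over the points (0,0), (1,0), (0,1), (1,1).

```python
from math import sqrt


def frechet_ray_densities(p1, p2):
    """Lower and upper Frechet bound densities over (0,0),(1,0),(0,1),(1,1)."""
    q1, q2 = 1 - p1, 1 - p2

    def density(f00):
        # Every density with margins p1, p2 is fixed by its value at (0,0).
        return [f00, q2 - f00, q1 - f00, 1 - q1 - q2 + f00]

    return density(max(q1 + q2 - 1, 0.0)), density(min(q1, q2))


def correlation_bounds(p1, p2):
    """Bounds on rho_12 from the values of E[X1 X2] = f(1,1) on the two rays."""
    f_low, f_up = frechet_ray_densities(p1, p2)
    a_row = [f_low[3], f_up[3]]
    scale = sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    return (min(a_row) - p1 * p2) / scale, (max(a_row) - p1 * p2) / scale


def mixing_weight(f, p1, p2):
    """lambda such that f = lambda * f_L + (1 - lambda) * f_U."""
    f_low, f_up = frechet_ray_densities(p1, p2)
    return (f[0] - f_up[0]) / (f_low[0] - f_up[0])


# Illustrative margins (an assumption): q1 = 0.5 <= q2 = 0.75, q1 + q2 > 1.
p1, p2 = 0.5, 0.25
f_low, f_up = frechet_ray_densities(p1, p2)
rho_min, rho_max = correlation_bounds(p1, p2)

# Independent density with the same margins, decomposed on the two rays.
f_indep = [0.375, 0.375, 0.125, 0.125]
lam = mixing_weight(f_indep, p1, p2)
mixture = [lam * a + (1 - lam) * b for a, b in zip(f_low, f_up)]
```

For these margins the computed bounds agree with the closed forms of case 2, namely -sqrt(p1 p2 / (q1 q2)) and sqrt(p2 q1 / (p1 q2)).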

5 Examples

In this section we show some results corresponding to different multivariate Bernoulli distributions. The algorithm is described in Section 5.4.

5.1 Trivariate Bernoulli distributions

Let us consider the case m = 3 and p = (1/2, 1/2, 1/2). From Theorem 3.2, solving the system of equations (3.3), we get 6 ray densities. The ray matrix R_p is

R_p = [matrix entries not recoverable]

and the matrix A_2p as defined in Proposition 3.1 is

A_2p = [matrix entries not recoverable]

Using Eq. (3.7) we get

-1 ≤ ρ_ij ≤ 1, i,j = 1,2,3, i < j.

Let us consider the case in which the X_i, i = 1,...,3, must be uncorrelated. We want to find a distribution F_p ∈ F(1/2, 1/2, 1/2) such that ρ_12 = ρ_13 = ρ_23 = 0. From Eq. (3.5) we obtain λ_1 = λ_2 = λ_3 = λ_5 = 0.25 and λ_4 = λ_6 = 0. The corresponding density is uniform, f_p(x) = 1/8, x ∈ S_3, as expected.

If we choose ρ_12 = 0.2, ρ_13 = -0.3 and ρ_23 = 0.4, we obtain λ_1 = 0.275, λ_2 = 0.025, λ_3 = 0.375, λ_4 = 0, λ_5 = 0.325 and λ_6 = 0 as one of the solutions of Eq. (3.5). The corresponding density is

f_p^T = (0.1625, 0.1875, 0.0125, 0.1375, 0.1375, 0.0125, 0.1875, 0.1625).

If we choose ρ_12 = .9, ρ_13 = .3 and ρ_23 = .6, we do not find any f_p with such correlations, even if each ρ_ij satisfies the constraints found for bivariate distributions, which, as we said before, in this case are -1 ≤ ρ_ij ≤ 1, i,j = 1,2,3, i < j.
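The second trivariate example can be checked directly from its density. The sketch below assumes that the support is ordered with the first coordinate varying fastest, and reads the eight density entries as 0.1625, 0.1875, 0.0125, 0.1375, 0.1375, 0.0125, 0.1875, 0.1625, the reading under which they sum to one. With these assumptions the margins come out as 1/2 and the pairwise correlations as 0.2, -0.3 and 0.4 for the pairs (1,2), (1,3) and (2,3).

```python
from itertools import product
from math import sqrt

# Support {0,1}^3 with the first coordinate varying fastest (an assumption).
S = [tuple(reversed(t)) for t in product((0, 1), repeat=3)]

# Density of the example, entries read so that they sum to one.
f = [0.1625, 0.1875, 0.0125, 0.1375, 0.1375, 0.0125, 0.1875, 0.1625]


def prob_all_ones(indices):
    """P(X_i = 1 for every i in indices), i.e. the moment E[prod X_i]."""
    return sum(fx for fx, x in zip(f, S) if all(x[i] == 1 for i in indices))


margins = [prob_all_ones([i]) for i in range(3)]


def corr(i, j):
    """Correlation from the second-order moment, inverting Eq. (3.4)."""
    pi, pj = margins[i], margins[j]
    return (prob_all_ones([i, j]) - pi * pj) / sqrt(pi * (1 - pi) * pj * (1 - pj))


# Keys use the paper's 1-based variable labels.
rho = {(i + 1, j + 1): corr(i, j) for i in range(3) for j in range(i + 1, 3)}
```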

If we search for the feasible correlation matrix closest to the desired ρ (the distance can be freely chosen; in this example we used the Euclidean distance), we obtain ρ_12 = 0.63, ρ_13 = 0.33 and ρ_23 = -0.03. The corresponding density is

f_p^T = (0.2416, 0, 0.0916, 0.1666, 0.1666, 0.0916, 0, 0.2416).

Let us now consider the case p = ( 1, 3, ). The ray matrix R_p contains 6 ray densities

R_p = [matrix entries not recoverable]

and the A_2p matrix is

A_2p = [matrix entries not recoverable]

Using Eq. (3.7) we get the bounds ρ and .577 ρ 13, ρ [bounds only partly recoverable].

If we choose ρ_12 = .3, ρ_13 = .25 and ρ_23 = .1, we obtain λ_1 = .2835, λ_2 = .25, λ_3 = , λ_4 = , λ_5 = .2781 and λ_6 = . The corresponding density is

f_p^T = (.1729, .185, .63, .144, .79, .3258, , .133).

As the last example of trivariate Bernoulli distribution we consider p = ( 1, 1, ). The ray matrix R_p (rounded to the third decimal digit) has 11 ray densities

R_p = [matrix entries not recoverable]

Using Eq. (3.7) we get the bounds .236 ρ 12 .77, .48 ρ and .289 ρ [bounds only partly recoverable].

If we choose ρ_12 = .3, ρ_13 = .25 and ρ_23 = .2, we obtain

f_p^T = (.146, , .1197, .199, .665, .617, .491, .4893).

5.2 Multivariate m = 5 Bernoulli distributions

Let us consider the case p = ( 1, 1, 1, 1, ). We obtain 2,712 ray densities. If we choose ρ_12 = .3, ρ_13 = .2, ρ_14 = .2, ρ_15 = .1, ρ_23 = .2, ρ_24 = .3, ρ_25 = .2, ρ_34 = .2, ρ_35 = .1 and ρ_45 = .2, we obtain

f_p = [vector entries not recoverable]

5.3 Multivariate m ≥ 6 Bernoulli distributions

For m = 6 and p = (1/2, 1/2, 1/2, 1/2, 1/2, 1/2) we obtain 77,264 ray densities. In general we observe that, if the number of rays is too large with respect to the available computer power and if the objective can be reduced to the problem of finding just one density $f \in \mathcal{F}_m$ with given margins $p$ and second-order moments $\mu_2$, it is enough to solve the system

$$\begin{cases} (M^{\otimes m})_1 f = p \\ (M^{\otimes m})_2 f = \mu_2 \end{cases}$$

using standard linear programming tools (e.g. [2]).

5.4 The algorithm

In this section we briefly describe the algorithm that we used in Section 5. Given m, p and ρ as input, the algorithm returns the ray matrix R_p and, if it exists, the density f_p, which has Bernoulli B(p_i), i = 1,...,m, as marginal distributions and pairwise correlations ρ = (ρ_ij, i,j = 1,...,m, i < j). The algorithm has the following main steps:

1. the construction of the matrix H, see (3.3) of Theorem 3.2;
2. the generation of the ray matrix R_p;
3. the construction of the density f_p as the solution of the system (3.5) of Proposition 3.1.

The construction of the matrix H and of the density f_p is implemented in SAS/IML. In particular, the system (3.5) is solved using the Proc Lpsolve that is part of SAS/QC. The rays are generated using 4ti2 ([1]). The software code is available on request. We performed the analysis using a standard laptop (Intel Core i7-2620M CPU, 2.70 GHz, 8 GB RAM).

6 Discussion

The proposed approach can be applied to any given set of moments, even of different orders. All the results given for moments and correlations can be easily adapted to other widely-used measures of dependence, such as Kendall's τ and Spearman's ρ. Furthermore, the polynomial representation of the distributions of any Fréchet class provides a link to copulas, which are a powerful instrument to model dependence.

7 Acknowledgements

Roberto Fontana wishes to thank professor Antonio Di Scala (Politecnico di Torino, Department of Mathematical Sciences) and professor Giovanni Pistone (Collegio Carlo Alberto, Moncalieri) for the helpful discussions he had with them.

References

[1] 4ti2 team. 4ti2: a software package for algebraic, geometric and combinatorial problems on linear spaces. Available at

[2] Michel Berkelaar, Kjell Eikland, Peter Notebaert, et al. lpsolve: Open source (mixed-integer) linear programming system. Eindhoven U. of Technology, 2004.
[3] N Rao Chaganty and Harry Joe. Range of correlation matrices for dependent Bernoulli random variables. Biometrika, 93(1):197-206, 2006.
[4] Martin Crowder. On the use of a working correlation matrix in using generalised linear models for repeated measures. Biometrika, 82(2):407-410.
[5] Giorgio Dall'Aglio, Samuel Kotz, and Gabriella Salinetti. Advances in probability distributions with given marginals: beyond the copulas, volume 67. Springer Science & Business Media, 2012.
[6] Mary E Haynes, Roy T Sabo, and N Rao Chaganty. Simulating dependent binary variables through multinomial sampling. Journal of Statistical Computation and Simulation, 86(3):510-523, 2016.
[7] Raymond Hemmecke. On the computation of Hilbert bases of cones. Mathematical Software, ICMS, 2002.
[8] Mark Huber, Nevena Marić, et al. Multivariate distributions with fixed marginals and correlations. Journal of Applied Probability, 52(2):602-608, 2015.
[9] Seung-Ho Kang and Sin-Ho Jung. Generating correlated binary variables with complete specification of the joint distribution. Biometrical Journal, 43(3), 2001.
[10] Kung-Yee Liang and Scott L Zeger. Longitudinal data analysis using generalized linear models. Biometrika, 73(1):13-22.
[11] RB Nelsen. An introduction to copulas, ser. Lecture Notes in Statistics. New York: Springer, 2006.
[12] Samuel D Oman. Easily simulated multivariate binary distributions with given positive and negative correlations. Computational Statistics & Data Analysis, 53(4):999-1005, 2009.
[13] Bahjat F Qaqish. A family of multivariate binary distributions for simulating correlated binary variables with specified marginal means and correlations. Biometrika, 90(2), 2003.
[14] N Rao Chaganty and Harry Joe. Efficiency of generalized estimating equations for binary responses. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66(4):851-860, 2004.
[15] Justine Shults. Simulating longer vectors of correlated binary random variables via multinomial sampling.
[16] Scott L Zeger and Kung-Yee Liang. Longitudinal data analysis for discrete and continuous outcomes. Biometrics.


HILBERT BASIS OF THE LIPMAN SEMIGROUP Available at: http://publications.ictp.it IC/2010/061 United Nations Educational, Scientific and Cultural Organization and International Atomic Energy Agency THE ABDUS SALAM INTERNATIONAL CENTRE FOR THEORETICAL

More information

Characterizations of indicator functions of fractional factorial designs

Characterizations of indicator functions of fractional factorial designs Characterizations of indicator functions of fractional factorial designs arxiv:1810.08417v2 [math.st] 26 Oct 2018 Satoshi Aoki Abstract A polynomial indicator function of designs is first introduced by

More information

PQL Estimation Biases in Generalized Linear Mixed Models

PQL Estimation Biases in Generalized Linear Mixed Models PQL Estimation Biases in Generalized Linear Mixed Models Woncheol Jang Johan Lim March 18, 2006 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized

More information

arxiv: v1 [math.pr] 9 Jan 2016

arxiv: v1 [math.pr] 9 Jan 2016 SKLAR S THEOREM IN AN IMPRECISE SETTING IGNACIO MONTES, ENRIQUE MIRANDA, RENATO PELESSONI, AND PAOLO VICIG arxiv:1601.02121v1 [math.pr] 9 Jan 2016 Abstract. Sklar s theorem is an important tool that connects

More information

Explicit Bounds for the Distribution Function of the Sum of Dependent Normally Distributed Random Variables

Explicit Bounds for the Distribution Function of the Sum of Dependent Normally Distributed Random Variables Explicit Bounds for the Distribution Function of the Sum of Dependent Normally Distributed Random Variables Walter Schneider July 26, 20 Abstract In this paper an analytic expression is given for the bounds

More information
