WALD TESTS OF SINGULAR HYPOTHESES

By Mathias Drton and Han Xiao

University of Washington and Rutgers University


Motivated by the problem of testing tetrad constraints in factor analysis, we study the large-sample distribution of Wald statistics at parameter points at which the gradient of the tested constraint vanishes. When based on an asymptotically normal estimator, the Wald statistic converges to a rational function of a normal random vector. The rational function is determined by a homogeneous polynomial and a covariance matrix. For quadratic forms and bivariate monomials of arbitrary degree, we show unexpected relationships to chi-square distributions that explain conservative behavior of certain Wald tests. For general monomials, we offer a conjecture according to which the reciprocal of a certain quadratic form in the reciprocals of dependent normal random variables is chi-square distributed.

AMS 2000 subject classifications: 62F05, 62E20.
Keywords and phrases: Asymptotic distribution, factor analysis, large-sample theory, singular parameter point, tetrad, Wald statistic.

1. Introduction. Let f ∈ R[x_1, ..., x_k] be a homogeneous k-variate polynomial with gradient ∇f, and let Σ be a k×k positive semidefinite matrix with positive diagonal entries. In this paper, we study the distribution of the random variable

(1.1)  W_{f,Σ} = f(X)² / ((∇f(X))^T Σ ∇f(X)),

where X ∼ N_k(0, Σ) is a normal random vector with zero mean and covariance matrix Σ. The random variable W_{f,Σ} arises in the description of the large-sample behavior of Wald tests, with Σ being the asymptotic covariance matrix of an estimator and the polynomial f appearing in a Taylor approximation to the function that defines the constraint to be tested. In regular settings, the Wald statistic for a single constraint converges to χ²_1, the chi-square distribution with one degree of freedom. This familiar fact is recovered when f(x) = a^T x, a ≠ 0, is a linear form and

(1.2)  W_{f,Σ} = (a^T X)² / (a^T Σ a)

becomes the square of a standard normal random variable; the vector a corresponds to a nonzero gradient of the tested constraint.
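The regular case (1.2) is easy to check numerically. The following sketch is not part of the original paper; the helper name simulate_W is ours, and any homogeneous polynomial with an explicit gradient can be plugged in.

```python
# Minimal Monte Carlo sketch (ours): simulate W_{f,Sigma} from (1.1) and, for a linear
# form f(x) = a^T x, compare with the chi-square_1 law predicted by (1.2).
import numpy as np
from scipy import stats

def simulate_W(f, grad_f, Sigma, n_draws=100_000, seed=0):
    """Draw X ~ N_k(0, Sigma) and return W = f(X)^2 / (grad f(X)^T Sigma grad f(X))."""
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal(np.zeros(Sigma.shape[0]), Sigma, size=n_draws)
    num = np.array([f(x) for x in X]) ** 2
    grads = np.array([grad_f(x) for x in X])
    den = np.einsum('ij,jk,ik->i', grads, Sigma, grads)
    return num / den

a = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.1], [0.3, 1.0, -0.2], [0.1, -0.2, 0.5]])
W = simulate_W(lambda x: a @ x, lambda x: a, Sigma)
print(stats.kstest(W, stats.chi2(df=1).cdf))   # large p-value: consistent with (1.2)
```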

Our attention is devoted to cases in which f has degree two or larger. These singular cases occur when the gradient of the constraint is zero at the true parameter. For likelihood ratio tests, a large body of literature starting with Chernoff (1954) describes large-sample behavior in irregular settings; examples of recent work are Azaïs, Gassiat and Mercadier (2006), Drton (2009), Kato and Kuriki (2013) and Ritz and Skovgaard (2005). In contrast, much less work appears to exist for Wald tests. Three examples we are aware of are Glonek (1993), Gaffke, Steyer and von Davier (1999) and Gaffke, Heiligers and Offinger (2002), who treat singular hypotheses that correspond to collapsibility of contingency tables and confounding in regression. Our own interest is motivated by the fact that graphical models with hidden variables are singular (Drton, Sturmfels and Sullivant, 2009, Chap. 4). In graphical modeling, or more specifically in factor analysis, the testing of so-called tetrad constraints is a problem of particular practical relevance (Bollen, Lennox and Dahly, 2009; Bollen and Ting, 2000; Hipp and Bollen, 2003; Silva et al., 2006; Spirtes, Glymour and Scheines, 2000). This problem goes back to Spearman (1904); for some of the history see Harman (1976). The desire to better understand the Wald statistic for a tetrad was the initial statistical motivation for this work. We solve the tetrad problem in Section 5; the relevant polynomial is quadratic, namely, f(x) = x_1x_2 − x_3x_4. However, many other hypotheses are of interest in graphical modeling and beyond (Drton, Sturmfels and Sullivant, 2007; Drton, Massam and Olkin, 2008; Sullivant, Talaska and Draisma, 2010; Zwiernik and Smith, 2012). In principle, any homogeneous polynomial f could arise in the description of a large-sample limit and, thus, a general distribution theory for the random variable W_{f,Σ} from (1.1) would be desirable.

At first sight, it may seem as if not much concrete can be said about W_{f,Σ} when f has degree two or larger. However, the distribution of W_{f,Σ} can in surprising ways be independent of the covariance matrix Σ even if degree(f) ≥ 2. Glonek (1993) was the first to show this in his study of the case f(x) = x_1x_2 that is relevant, in particular, for hypotheses that are the union of two sets. Moreover, the asymptotic distribution in this case is smaller than χ²_1, making the Wald test maintain (at times quite conservatively) a desired asymptotic level across the entire null hypothesis. We will show that similar phenomena hold also in degree higher than two; see Section 3, which treats monomials f(x) = x_1^{α_1} x_2^{α_2}. For the tetrad, conservativeness has been remarked upon in work such as Johnson and Bodner (2007). According to our work in Section 5, this is due to the singular nature of the hypothesis rather than to effects of too small a sample size.

We remark that in singular settings standard n-out-of-n bootstrap tests may fail to achieve a desired asymptotic size, making it necessary to consider m-out-of-n and subsampling procedures; compare the discussion and references in Drton and Williams (2011).

In the remainder of this paper we first clarify the connection between Wald tests and the random variables W_{f,Σ} from (1.1); see Section 2. Bivariate monomials f of arbitrary degree are the topic of Section 3. Quadratic forms f are treated in Section 4, which gives a full classification of the bivariate case. The tetrad is studied in Section 5. Our proofs make heavy use of the polar coordinate representation of a pair of independent standard normal random variables and, unfortunately, we have so far not been able to prove the following conjecture, which we discuss further in Section 6.

Conjecture 1.1. Let Σ be any positive semidefinite k×k matrix with positive diagonal entries. If f(x_1, ..., x_k) = x_1^{α_1} x_2^{α_2} ⋯ x_k^{α_k} with nonnegative integer exponents α_1, ..., α_k that are not all zero, then

W_{f,Σ} ∼ (α_1 + ⋯ + α_k)^{−2} χ²_1.

It is not difficult to show that the conjecture holds when Σ is diagonal.

Proof under independence. Let Z be a standard normal random variable, and α > 0. Then α²/Z² follows the one-sided stable distribution of index 1/2 with parameter α, which has the density

(1.3)  p_α(x) = (α/√(2π)) x^{−3/2} e^{−α²/(2x)},  x > 0.

The law in (1.3) is the distribution of the first passage time of a Brownian motion to the level α (Feller, 1966). Hence, it obeys the convolution rule

(1.4)  p_α * p_β = p_{α+β},  α, β > 0.

When f(x) = x_1^{α_1} ⋯ x_k^{α_k} and Σ = (σ_{ij}) is diagonal with σ_{11}, ..., σ_{kk} > 0, then

(1.5)  1/W_{f,Σ} = α_1² σ_{11}/X_1² + ⋯ + α_k² σ_{kk}/X_k².

By (1.4), the distribution of 1/W_{f,Σ} is that of (α_1 + ⋯ + α_k)²/Z². Therefore,

(1.6)  W_{f,Σ} ∼ (α_1 + ⋯ + α_k)^{−2} χ²_1,

as claimed in Conjecture 1.1.
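The proof just given covers diagonal Σ only; for dependent coordinates the conjecture can at least be probed by simulation. The sketch below is ours and uses the double-sum form of 1/W_{f,Σ} stated in Remark 1.2 below, with a randomly chosen non-diagonal Σ.

```python
# Monte Carlo probe (ours) of Conjecture 1.1 for a dependent normal vector.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = np.array([1.0, 2.0, 0.5])              # nonnegative exponents, not all zero
A = rng.normal(size=(3, 3))
Sigma = A @ A.T + 0.5 * np.eye(3)              # generic covariance, positive diagonal
X = rng.multivariate_normal(np.zeros(3), Sigma, size=200_000)
inv_W = np.einsum('ni,ij,nj->n', alpha / X, Sigma, alpha / X)
W = 1.0 / inv_W
print(stats.kstest(W * alpha.sum() ** 2, stats.chi2(df=1).cdf))   # Conjecture 1.1: ~ chi2_1
```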

The preceding argument can be traced back to Shepp (1964); see also Cohen (1981), Reid (1987), Quine (1994) and DasGupta and Shepp (2004). However, if X is a dependent random vector, the argument no longer applies. The case k = 2, α_1 = α_2 = 1 and Σ arbitrary was proved by Glonek (1993); see Theorem 2.3 below. We prove the general statement for k = 2 as Theorem 3.1.

Remark 1.2. When simplified as in (1.5), the random variable W_{f,Σ} is well-defined when allowing α_1, ..., α_k to take nonnegative real as opposed to nonnegative integer values, and the above proof under independence goes through in that case as well. Subsequently, we will thus consider the random variable W_{f,Σ} for a monomial f(x_1, ..., x_k) = x_1^{α_1} x_2^{α_2} ⋯ x_k^{α_k} with α_1, ..., α_k nonnegative real. To be precise, we then refer to W_{f,Σ} rewritten as

(1.7)  W_{f,Σ} = ( Σ_{i=1}^k Σ_{j=1}^k σ_{ij} α_i α_j / (X_i X_j) )^{−1}.

With this convention, we believe Conjecture 1.1 to be true for α_1, ..., α_k nonnegative real.

2. Wald tests. To make the connection between Wald tests and the random variables W_{f,Σ} from (1.1) explicit, suppose that θ ∈ R^k is a parameter of a statistical model and that, based on a sample of size n, we wish to test the hypothesis

(2.1)  H_0: γ(θ) = 0  versus  H_1: γ(θ) ≠ 0

for a continuously differentiable function γ: R^k → R. Suppose further that there is a √n-consistent estimator θ̂ of θ such that, as n → ∞, we have the convergence in distribution √n(θ̂ − θ) →_d N_k(0, Σ(θ)), where the asymptotic covariance matrix Σ(θ) is a continuous function of the parameter. The Wald statistic for testing (2.1) is the ratio

(2.2)  T_γ = γ(θ̂)² / var̂[γ(θ̂)] = n γ(θ̂)² / ((∇γ(θ̂))^T Σ(θ̂) ∇γ(θ̂)),

where the denominator of the right-most term estimates the asymptotic variance of γ(θ̂), which by the delta method is given by (∇γ(θ))^T Σ(θ) ∇γ(θ).
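As a small illustration of (2.2), the following sketch is ours; gamma, its gradient, the estimate and the plug-in covariance are placeholder inputs supplied by the user.

```python
# Sketch (ours) of the Wald statistic T_gamma in (2.2) for a single constraint.
import numpy as np

def wald_statistic(gamma, grad_gamma, theta_hat, Sigma_hat, n):
    """n * gamma(theta_hat)^2 / (grad gamma(theta_hat)^T Sigma_hat grad gamma(theta_hat))."""
    g = grad_gamma(theta_hat)
    return n * gamma(theta_hat) ** 2 / (g @ Sigma_hat @ g)

# Toy inputs for the product constraint gamma(theta) = theta_1 * theta_2.
theta_hat = np.array([0.03, -0.02])
Sigma_hat = np.array([[1.0, 0.2], [0.2, 1.0]])
T = wald_statistic(lambda t: t[0] * t[1], lambda t: np.array([t[1], t[0]]),
                   theta_hat, Sigma_hat, n=500)
print(T)   # compare with the 0.95 quantile of chi-square_1 (about 3.84)
```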

Consider now a true distribution from H_0, that is, the true parameter satisfies γ(θ) = 0. Without loss of generality, we assume that θ = 0. If the gradient is nonzero at the true parameter, then the limiting distribution of T_γ is the distribution of the random variable in (1.2) with a = ∇γ(0) ≠ 0 and Σ = Σ(0). Hence, the limit is χ²_1. However, if ∇γ(0) = 0 (i.e., the constraint γ is singular at the true parameter), then the asymptotic distribution of T_γ is no longer χ²_1 but rather given by (1.1) with the polynomial f having higher degree; the degree of f is determined by how many derivatives of γ vanish at the true parameter.

Proposition 2.1. Assume that γ(0) = 0 and that there is a homogeneous polynomial f of degree d ≥ 2 such that, as x → 0,

γ(x) = f(x) + o(‖x‖^d)  and  ∇γ(x) = ∇f(x) + o(‖x‖^{d−1}).

If √n θ̂ →_d N_k(0, Σ), then T_γ →_d W_{f,Σ}.

Example 2.2. Glonek (1993) studied testing collapsibility properties of contingency tables. Under an assumption of no three-way interaction, collapsibility with respect to a chosen margin amounts to the vanishing of at least one of two pairwise interactions, which we here simply denote by θ_1 and θ_2. In the (θ_1, θ_2)-plane, the hypothesis is the union of the two coordinate axes, which can be described as the solution set of γ(θ_1, θ_2) = θ_1θ_2 = 0 and tested using the Wald statistic T_γ based on maximum likelihood estimates of θ_1 and θ_2. The hypothesis is singular at the origin, as reflected by the vanishing of ∇γ when θ_1 = θ_2 = 0. Away from the origin, T_γ has the expected asymptotic χ²_1 distribution. At the origin, by Proposition 2.1, T_γ converges to W_{f,Σ}, where f(x) = x_1x_2 and Σ is the asymptotic covariance matrix of the two maximum likelihood estimates. The main result of Glonek (1993), stated as a theorem below, gives the distribution of W_{f,Σ} in this case. Glonek's surprising result clarifies that the Wald test for this hypothesis is conservative at (and in finite samples near) the intersection of the two sets making up the null hypothesis.

Theorem 2.3 (Glonek, 1993). If f(x) = x_1x_2 and Σ is any positive semidefinite 2×2 matrix with positive diagonal entries, then W_{f,Σ} ∼ (1/4) χ²_1.
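Theorem 2.3 is easy to confirm by simulation. The following sketch (ours) draws W_{f,Σ} for f(x) = x_1x_2 under a correlated covariance matrix and compares 4W_{f,Σ} with χ²_1.

```python
# Monte Carlo check (ours) of Theorem 2.3.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
Sigma = np.array([[1.0, 0.7], [0.7, 2.0]])
X = rng.multivariate_normal([0.0, 0.0], Sigma, size=200_000)
grad = X[:, ::-1]                               # gradient of x1*x2 is (x2, x1)
W = (X[:, 0] * X[:, 1]) ** 2 / np.einsum('ni,ij,nj->n', grad, Sigma, grad)
print(stats.kstest(4.0 * W, stats.chi2(df=1).cdf))   # consistent with chi-square_1 / 4
```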

Before turning to concrete problems, we make two simple observations that we will use to bring (f, Σ) into a convenient form.

Lemma 2.4. Let f ∈ R[x_1, ..., x_k] be a homogeneous polynomial, and let Σ be a positive semidefinite k×k matrix with positive diagonal entries.
(i) If c ∈ R \ {0} is a nonzero scalar, then W_{cf,Σ} = W_{f,Σ}.
(ii) If B is an invertible k×k matrix, then W_{f∘B, B^{−1}Σ(B^{−1})^T} has the same distribution as W_{f,Σ}.

Proof. (i) Obvious, since ∇(cf) = c∇f. (ii) Let X ∼ N(0, Σ) and define Y = B^{−1}X ∼ N(0, B^{−1}Σ(B^{−1})^T). Then f(X) = (f∘B)(Y) and ∇(f∘B)(Y) = B^T ∇f(X). Substituting into (1.1) gives

W_{f,Σ} = (f∘B)(Y)² / ((∇(f∘B)(Y))^T B^{−1}Σ(B^{−1})^T ∇(f∘B)(Y)) = W_{f∘B, B^{−1}Σ(B^{−1})^T}.

3. Bivariate Monomials. In this section, we study the random variable W_{f,Σ} when f(x) = x_1^{α_1} x_2^{α_2}. If the exponents α_1, α_2 are positive integers, then f is a bivariate monomial. However, all our arguments go through for the slightly more general case in which α_1, α_2 are positive real numbers; recall Remark 1.2. Our main result is that the distribution of W_{f,Σ} does not depend on Σ.

Theorem 3.1. Let f(x) = x_1^{α_1} x_2^{α_2} with α_1, α_2 > 0, and let Σ be any positive semidefinite 2×2 matrix with positive diagonal entries. Then

W_{f,Σ} ∼ (α_1 + α_2)^{−2} χ²_1.

Proof. As shown in Section 1, the claim is true if Σ = (σ_{ij}) is diagonal. It thus suffices to show that W_{f,Σ} has the same distribution as W_f := W_{f,I}. By Lemma 2.4, we can assume without loss of generality that σ_{11} = σ_{22} = 1 and ρ := σ_{12} > 0. Since

1/W_{f,Σ} = α_1²/X_1² + 2ρ α_1α_2/(X_1X_2) + α_2²/X_2²,

we can also assume α_1 = 1 for simplicity. With σ = α_2, we have

1/W_{f,Σ} = 1/X_1² + 2ρσ/(X_1X_2) + σ²/X_2²,

and need to show that

W_{f,Σ} ∼ (1 + σ)^{−2} χ²_1.

If ρ = 1, then X_1 and X_2 are almost surely equal and it is clear that W_{f,Σ} has the same distribution as W_f. Hence, it remains to consider 0 < ρ < 1. Let Z_1 and Z_2 be independent standard normal random variables. When expressing Z_1 = R cos(Ψ) and Z_2 = R sin(Ψ) in polar coordinates, it holds that R and Ψ are independent, and Ψ is uniformly distributed over [0, 2π]. Let ρ = sin(φ) with 0 ≤ φ < π/2; then the joint distribution of X_1 and X_2 can be represented as

X_1 = R cos(Ψ − φ/2),  X_2 = R sin(Ψ + φ/2),

which leads to

1/W_{f,Σ} = T/R²,

with

T = 1/cos²(Ψ − φ/2) + 2σ sin(φ)/(cos(Ψ − φ/2) sin(Ψ + φ/2)) + σ²/sin²(Ψ + φ/2).

Routine trigonometric calculations show that 1/T can be expressed as a function of the doubled angle 2Ψ. More precisely,

1/T = (1/4) t(2Ψ, φ),

where

t(ψ, φ) = [2 − cos(2φ) − cos(2ψ) + 2cos(ψ − φ) − 2cos(ψ + φ)] / [1 − σ cos(2φ) + (1 + σ)(σ + σ cos(ψ − φ) − cos(ψ + φ))].

Since 2Ψ is uniformly distributed on [0, 4π], the distribution of T is independent of φ if and only if the same is true for the distribution of T_1 = t(Ψ, φ). We proceed by calculating the moments of T_1 and show that they are independent of φ. For each 0 ≤ φ < π/2, there exists a small interval L = [φ − ε, φ + ε] such that, for every m ≥ 1, the function

sup_{φ′ ∈ L} | [t(ψ, φ′)]^{m−1} ∂t(ψ, φ′)/∂φ |

is integrable over 0 ≤ ψ < 2π. Therefore, we have

(3.1)  ∂/∂φ E(T_1^m) = (m/2π) ∫_0^{2π} [t(ψ, φ)]^{m−1} ∂t(ψ, φ)/∂φ dψ.

(The expression for ∂t(ψ, φ)/∂φ is long, so we omit it here.)

We introduce the complex numbers z = e^{iψ} and a = e^{iφ}, and express the functions t(ψ, φ) and ∂t(ψ, φ)/∂φ in terms of z and a:

t(ψ, φ) = u(z, a) = (a − z)²(1 + az)² / [z(a + aσ + a²z − σz)(1 + a²σ − az − aσz)],

∂t(ψ, φ)/∂φ = v(z, a) = a(a − z)(1 + az)(1 + a²σ + 2az − 2aσz + a²z² + σz²)(1 − a − aσ − z − a²z + σz + a²σz − az² − aσz²) / [iz(a + aσ + a²z − σz)²(1 + a²σ − az − aσz)²].

The integral in (3.1) can be computed as a complex contour integral over the unit circle T = {z : |z| = 1}:

∫_0^{2π} [t(ψ, φ)]^{m−1} ∂t(ψ, φ)/∂φ dψ = ∮_T [u(z, a)]^{m−1} v(z, a) dz/(iz).

Let q(z, a) = [u(z, a)]^{m−1} v(z, a)/(iz). As a function of z, it has two poles within the unit disc. These two poles are at z_0 = 0 and z_1 = (a²σ − 1)/(a + aσ) and have the same order m + 1. By the Residue Theorem,

(3.2)  (1/2πi) ∮_T q(z, a) dz = Res(q; 0) + Res(q; z_1),

where Res(q; 0) and Res(q; z_1) are the residues at 0 and z_1, respectively. Let ζ_0 = {c e^{iψ} : 0 ≤ ψ ≤ 2π} be a small circle around 0 such that z_1 lies outside the circle. Let S be the Möbius transform

S(w) = (z_1 − w)/(1 − z̄_1 w).

Then S is one-to-one from the unit disk onto itself, maps 0 to z_1, and maps ζ_0 to a closed curve ζ_1 = {S(c e^{iψ}) : 0 ≤ ψ ≤ 2π} around z_1 with winding number one. It holds that

Res(q; z_1) = (1/2πi) ∮_{ζ_1} q(z, a) dz = (1/2πi) ∮_{ζ_0} q(S(w), a) S′(w) dw.

It also holds that

q(S(w), a) S′(w) = −q(w, a),

from which we deduce

(1/2πi) ∮_{ζ_0} q(S(w), a) S′(w) dw = −(1/2πi) ∮_{ζ_0} q(w, a) dw = −Res(q; 0).

Hence, the integral in (3.2) is zero. We have shown that the integral in (3.1) is zero for every m ≥ 1, which means that the moments of T_1 do not depend on φ for 0 ≤ φ < π/2. When φ = 0, the random variable T_1 is bounded, so its moments uniquely determine the distribution. Therefore, the distribution of T_1 does not depend on φ, and the proof is complete.

Remark 3.2. If α_1 = α_2, then Theorem 3.1 reduces to Theorem 2.3. In this case, our proof above would only need to treat σ = 1. Glonek's proof of Theorem 2.3 finds the distribution function of a random variable related to our T. If σ = 1, this requires solving a quadratic equation. When σ ≠ 1, we were unable to extend this approach, as a complicated quartic equation arises in the computation of the distribution function. We thus turned to the presented method of moments.

Let X = (X_1, X_2)^T and Y = (Y_1, Y_2)^T be two independent N_2(0, Σ) random vectors, where Σ has positive diagonal entries. Let p_1, p_2 be nonnegative numbers such that p_1 + p_2 = 1. The random variable

Q = (p_1 X_2 Y_1 + p_2 X_1 Y_2) / √((p_1X_2, p_2X_1) Σ (p_1X_2, p_2X_1)^T)

has the standard normal distribution, and is independent of X. For f(x) = x_1^{p_1} x_2^{p_2}, let

V_{f,Σ} = f(X) / √((∇f(X))^T Σ ∇f(X))   and   W_{f,Σ} = V²_{f,Σ}.

Then

(3.3)  p_1 Y_1/X_1 + p_2 Y_2/X_2 = Q / V_{f,Σ}.

By taking the conditional expectation given V_{f,Σ}, the characteristic function of (3.3) is seen to be

E[exp{itQ/V_{f,Σ}}] = E[exp{−t²/(2W_{f,Σ})}].

The uniqueness of the moment generating function for positive random variables (Billingsley, 1995, Thm. 22.2) yields that (3.3) has a standard Cauchy distribution (with characteristic function e^{−|t|}) if and only if W_{f,Σ} ∼ χ²_1. Therefore, we have the following equivalent version of Theorem 3.1.

Corollary 3.3. Let X = (X_1, X_2)^T and Y = (Y_1, Y_2)^T be independent N_2(0, Σ) random vectors, where Σ has positive diagonal entries. If p_1, p_2 are nonnegative numbers such that p_1 + p_2 = 1, then the random variable

p_1 Y_1/X_1 + p_2 Y_2/X_2

has the standard Cauchy distribution.

4. Quadratic Forms. In this section, we consider the distribution of W_{f,Σ} when f is a quadratic form, that is,

f(x_1, x_2, ..., x_k) = Σ_{1 ≤ i ≤ j ≤ k} a_{ij} x_i x_j

for real coefficients a_{ij}. Equivalently,

(4.1)  f(x_1, x_2, ..., x_k) = x^T A x,

where A = (ā_{ij}) is symmetric, with ā_{ii} = a_{ii} and ā_{ij} = ā_{ji} = a_{ij}/2 for i < j.

4.1. Canonical form. Let I denote the k×k identity matrix. We use the shorthand W_f := W_{f,I} when the covariance matrix Σ is the identity.

Lemma 4.1. If f ∈ R[x_1, ..., x_k] is homogeneous of degree d and Σ is a positive semidefinite k×k matrix, then W_{f,Σ} has the same distribution as W_g, where g is a homogeneous degree d polynomial in rank(Σ) many variables.

Proof. If Σ has full rank, then Σ = BB^T for an invertible matrix B. Use Lemma 2.4(ii) to transform W_{f,Σ} to W_g, where g = f∘B is homogeneous of degree d. If Σ has rank m < k, then Σ = B E_m B^T, where B is invertible and E_m is zero apart from the first m diagonal entries, which are equal to one. Form g by substituting x_{m+1} = ⋯ = x_k = 0 into f∘B.

Further simplifications are possible for a treatment of the random variables W_f. In the case of quadratic forms, we may restrict attention to canonical forms

f(x) = λ_1 x_1² + ⋯ + λ_k x_k²,

as shown in the next lemma.

Lemma 4.2. Let f(x) = x^T A x be a quadratic form given by a symmetric k×k matrix A ≠ 0. If Σ is a positive definite k×k matrix and λ_1, ..., λ_k are the eigenvalues of AΣ, then W_{f,Σ} has the same distribution as

(4.2)  (λ_1 Z_1² + ⋯ + λ_k Z_k²)² / (4(λ_1² Z_1² + ⋯ + λ_k² Z_k²)),

where Z_1, ..., Z_k are independent standard normal random variables.

Proof. Write Σ = BB^T for an invertible matrix B. By Lemma 2.4(ii), W_{f,Σ} has the same distribution as W_g with g(x) = x^T(B^T A B)x. Let Q^T(B^T A B)Q = diag(λ_1, ..., λ_k) be the spectral decomposition of B^T A B, with Q orthogonal. Then λ_1, ..., λ_k are also the eigenvalues of AΣ. Applying Lemma 2.4(ii) again, we find that W_{f,Σ} has the same distribution as W_h with

h(x) = x^T(Q^T B^T A B Q)x = λ_1 x_1² + ⋯ + λ_k x_k².

Since ∇h(x) = 2(λ_1 x_1, ..., λ_k x_k)^T, the claim follows.

In (4.2), the set of eigenvalues {λ_i : 1 ≤ i ≤ k} can be scaled to {cλ_i : 1 ≤ i ≤ k} for any c ≠ 0 without changing the distribution; recall also Lemma 2.4(i). For instance, we may scale one nonzero eigenvalue to become equal to one. When all (scaled) λ_i are in {−1, 1}, the description of the distribution of W_{f,Σ} can be simplified. We write Beta(α, β) for the Beta distribution with parameters α, β > 0.

Lemma 4.3. Let k_1 and k_2 be two positive integers, and let k = k_1 + k_2. If

f(x_1, ..., x_k) = x_1² + ⋯ + x_{k_1}² − x_{k_1+1}² − ⋯ − x_{k_1+k_2}²,

then W_f has the same distribution as (1/4) R² (2B − 1)², where R² and B are independent, R² ∼ χ²_k, and B ∼ Beta(k_1/2, k_2/2).

Proof. The distribution of W_f is that of

(Z_1² + ⋯ + Z_{k_1}² − Z_{k_1+1}² − ⋯ − Z_{k_1+k_2}²)² / (4(Z_1² + ⋯ + Z_k²))

with Z_1, ..., Z_k independent and standard normal. Let

Y_1 := Z_1² + ⋯ + Z_{k_1}² ∼ χ²_{k_1},  Y_2 := Z_{k_1+1}² + ⋯ + Z_k² ∼ χ²_{k_2}.

Then R² := Y_1 + Y_2 ∼ χ²_k. Representing Z_1, ..., Z_k in polar coordinates shows that R² and W_f/R² are independent (Muirhead, 1982, Thm. 1.5.5). Since B = Y_1/(Y_1 + Y_2) ∼ Beta(k_1/2, k_2/2) and (Y_1 − Y_2)²/(Y_1 + Y_2)² = (2B − 1)², we deduce that the two random variables W_f/R² and (1/4)(2B − 1)² have the same distribution.

We note that when k = 4 and k_1 = k_2 = 2, Lemma 4.3 gives the equality of distributions

(4.3)  W_f =_d (1/4) R² U².

The equality holds because, in this special case, U := Y_1/(Y_1 + Y_2) is uniformly distributed on [0, 1], and (2U − 1)² has the same distribution as U². The distribution from (4.3) will appear in Section 5. For general eigenvalues λ_i, it seems that the distribution from (4.2) cannot be described in as simple terms.

4.2. Classification of bivariate quadratic forms. We now turn to the bivariate case (k = 2), that is, we are considering a quadratic form in two variables,

f(x_1, x_2) = a x_1² + 2b x_1x_2 + c x_2².

In this case, we are able to give a full classification of the possible distributions of W_f in terms of linear combinations of a pair of independent χ²_1 random variables; see Johnson, Kotz and Balakrishnan (1994, Sect. 18.8) for a discussion of such distributions. Our classification reveals that for k = 2 the distributions for quadratic forms are stochastically bounded below and above by (1/4)χ²_1 and (1/4)χ²_2, respectively.

Theorem 4.4. Let Σ be a positive definite 2×2 matrix, and let f(x_1, x_2) = a x_1² + 2b x_1x_2 + c x_2² be a nonzero quadratic form with matrix

A := ( a  b ; b  c ) ≠ 0.

(a) If b² − ac ≥ 0, then W_{f,Σ} ∼ (1/4) χ²_1.
(b) If b² − ac < 0, then

W_{f,Σ} =_d (1/4)( Z_1² + (4 det(AΣ)/tr(AΣ)²) Z_2² ),

where Z_1 and Z_2 are independent standard normal random variables.
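Part (b) of Theorem 4.4 can be checked numerically. The sketch below is ours; it simulates W_{f,Σ} for a definite bivariate quadratic form and compares it with a sample from the classified limit.

```python
# Simulation sketch (ours) of Theorem 4.4(b).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
A = np.array([[2.0, 0.5], [0.5, 1.0]])          # b^2 - ac < 0, so case (b) applies
Sigma = np.array([[1.0, -0.4], [-0.4, 0.8]])
X = rng.multivariate_normal([0.0, 0.0], Sigma, size=200_000)
grad = 2.0 * X @ A                               # gradient of x^T A x
W = np.einsum('ni,ij,nj->n', X, A, X) ** 2 / np.einsum('ni,ij,nj->n', grad, Sigma, grad)
c = 4.0 * np.linalg.det(A @ Sigma) / np.trace(A @ Sigma) ** 2
Z = rng.standard_normal((200_000, 2))
ref = (Z[:, 0] ** 2 + c * Z[:, 1] ** 2) / 4.0   # the distribution in part (b)
print(stats.ks_2samp(W, ref))
```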

Before giving a proof of the theorem, we would like to point out that the key insight, Lemma 4.5 below, can also be obtained from a theorem of Marianne Mora that is based on properties of the Cauchy distribution (Seshadri, 1993, Theorem 2.3).

Proof. (a) When the discriminant b² − ac ≥ 0, then f factors into a product of two linear forms. The joint distribution of the two linear forms is bivariate normal. Writing Σ̃ for the covariance matrix of the linear forms, the distribution of W_{f,Σ} is equal to the distribution of W_{g,Σ̃} with g(x_1, x_2) = x_1x_2. Hence, the distribution is (1/4)χ²_1 by Theorem 2.3/Theorem 3.1.

(b) In this case, the discriminant is negative and f does not factor. By Lemma 4.2, we can assume Σ = I and consider the distribution of W_f for f(x_1, x_2) = λ_1 x_1² + λ_2 x_2², where λ_1 and λ_2 are the eigenvalues of AΣ. Since det(AΣ) = λ_1λ_2 and tr(AΣ) = λ_1 + λ_2, to prove the claim we must show that in this case

(4.4)  W_f =_d (1/4)( Z_1² + (4λ_1λ_2/(λ_1+λ_2)²) Z_2² ) = (1/4)( Z_1² + (4c/(1+c)²) Z_2² ),

where c = λ_2/λ_1 > 0. To show (4.4) we use polar coordinates again. So represent the two considered independent standard normal random variables as X_1 = R cos(Ψ) and X_2 = R sin(Ψ), where R² ∼ χ²_2 and Ψ ∼ Uniform[0, 2π] are independent. Then

W_f = R² [cos(Ψ)² + c sin(Ψ)²]² / (4[cos(Ψ)² + c² sin(Ψ)²])
    = (R²/4) ( 1 − ((1−c)²/(1+c)²) · (1+c)² cos(Ψ)² sin(Ψ)² / (cos(Ψ)² + c² sin(Ψ)²) ).

Using Lemma 4.5, we have

(4.5)  W_f =_d (R²/4)( 1 − ((1−c)²/(1+c)²) cos(Ψ)² ) = (R²/4)( (4c/(1+c)²) cos(Ψ)² + sin(Ψ)² ).

This is the claim from (4.4) because R cos(Ψ) and R sin(Ψ) are independent and standard normal.

Lemma 4.5. If c ≥ 0 and Ψ has a uniform distribution over [0, 2π], then

S_c(Ψ) := (1+c)² cos(Ψ)² sin(Ψ)² / (cos(Ψ)² + c² sin(Ψ)²) =_d cos(Ψ)².

Proof. Let R² ∼ χ²_2 be independent of Ψ. Then R sin(Ψ) and R cos(Ψ) are independent and standard normal. Therefore,

1/(R² S_c(Ψ)) = (1+c)^{−2}/[R sin(Ψ)]² + c²(1+c)^{−2}/[R cos(Ψ)]²

is the sum of two independent random variables that follow the one-sided stable distribution of index 1/2. Since c > 0, the first summand has the stable distribution with parameter 1/(1+c) and the second summand has parameter c/(1+c). Hence, by (1.4), their sum follows a stable law with parameter 1. Expressing this in terms of the reciprocals,

R² S_c(Ψ) =_d R² cos(Ψ)² ∼ χ²_1.

It follows that S_c(Ψ) has the same distribution as cos(Ψ)². For instance, we may argue that S_c(Ψ) and cos(Ψ)² have identical moments, which implies equality of the distributions as both are compactly supported.

The claim of Lemma 4.5 is false for c < 0. Indeed, the distribution of S_c(Ψ) varies with c when c < 0.

4.3. Stochastic bounds. To understand possible conservativeness of Wald tests, it is interesting to look for stochastic bounds on W_{f,Σ} that hold for all f and Σ. We denote the stochastic ordering of two random variables by U ≤_st V when P(U > t) ≤ P(V > t) for all t ∈ R.

Proposition 4.6. If f ∈ R[x_1, ..., x_k] is a quadratic form and Σ is any nonzero positive semidefinite k×k matrix, then W_{f,Σ} ≤_st (1/4) χ²_k. Equality is achieved when f(x) = x_1² + ⋯ + x_k² and Σ is the identity matrix.

Proof. The second claim is obvious. For the first claim, without loss of generality, we can restrict our attention to the distributions from (4.2). The Cauchy–Schwarz inequality gives

(λ_1 Z_1² + ⋯ + λ_k Z_k²)² ≤ (Z_1² + ⋯ + Z_k²)(λ_1² Z_1² + ⋯ + λ_k² Z_k²),

so that

(λ_1 Z_1² + ⋯ + λ_k Z_k²)² / (4(λ_1² Z_1² + ⋯ + λ_k² Z_k²)) ≤ (1/4)(Z_1² + ⋯ + Z_k²),

which is the desired chi-square bound.

The considered Wald test rejects the hypothesis that γ(θ) = 0 when the statistic T_γ from (2.2) exceeds c_α, where c_α is the (1 − α) quantile of the χ²_1 distribution. Let k_α be the largest degrees of freedom k such that a (1/4)χ²_k random variable exceeds c_α with probability at most α. According to Proposition 4.6, if the true parameter is a singularity at which γ can be approximated by a quadratic form in at most k_α variables, then the Wald test is guaranteed to be asymptotically conservative. Some values are k_{0.05} = 7, k_{0.01} = 16, k_{0.005} = 20 and k_{0.001} = 29.

Turning to a lower bound, we can offer the following simple observation.

Proposition 4.7. Suppose the quadratic form f is given by a symmetric k×k matrix A ≠ 0, and suppose that Σ is a positive definite k×k matrix such that all eigenvalues of AΣ are nonnegative. Then W_{f,Σ} ≥_st (1/4) χ²_1.

Proof. Let λ_1, ..., λ_k ≥ 0 be the eigenvalues of AΣ. By scaling, we can assume without loss of generality that λ_1 = 1 and 0 ≤ λ_i ≤ 1 for 2 ≤ i ≤ k. Then

(λ_1 Z_1² + ⋯ + λ_k Z_k²)² / (4(λ_1² Z_1² + ⋯ + λ_k² Z_k²)) ≥ (λ_1² Z_1² + ⋯ + λ_k² Z_k²)² / (4(λ_1² Z_1² + ⋯ + λ_k² Z_k²)) = (λ_1² Z_1² + ⋯ + λ_k² Z_k²)/4 ≥ Z_1²/4,

and the claim follows from Lemma 4.2.

Proposition 4.7, Theorem 4.4 and simulation experiments lead us to conjecture that (1/4)χ²_1 is still a stochastic lower bound when there are both positive and negative eigenvalues λ_i.

Conjecture 4.8. For any quadratic form f ≠ 0 and any positive semidefinite matrix Σ ≠ 0, the distribution of W_{f,Σ} stochastically dominates (1/4)χ²_1.

While we do not know how to prove this conjecture in general, we are able to treat the special case where the eigenvalues λ_i are either 1 or −1.

Theorem 4.9. Let k_1, k_2 > 0, and k = k_1 + k_2. If

f(x_1, ..., x_k) = x_1² + ⋯ + x_{k_1}² − x_{k_1+1}² − ⋯ − x_{k_1+k_2}²,

then W_f ≥_st (1/4)χ²_1.

Proof. Without loss of generality we assume k_1 ≤ k_2. If k_1 = 0 or k_1 = k_2 = 1, the claim follows from Proposition 4.7 and Theorem 4.4, respectively. We now consider the case k_1 ≥ 1 and k_2 ≥ 2.

By Lemma 4.3, we know that

(4.6)  W_f =_d (1/4) R² (2B − 1)²,

where R² and B are independent, R² ∼ χ²_k, and B ∼ Beta(k_1/2, k_2/2). On the other hand, if B′ ∼ Beta(1/2, (k−1)/2) is independent of R², then

(4.7)  R² B′ ∼ χ²_1.

Let g(x) and h(x) be the density functions of (2B − 1)² and B′, respectively. The comparison of (4.6) and (4.7) shows that it suffices to prove that (2B − 1)² is stochastically larger than B′. We will show a stronger result, namely, that the likelihood ratio g(x)/h(x) is an increasing function over [0, 1]. To simplify the argument, we rescale the density functions to

g(x) ∝ x^{−1/2} [ (1 + √x)^{k_1/2 − 1} (1 − √x)^{k_2/2 − 1} + (1 − √x)^{k_1/2 − 1} (1 + √x)^{k_2/2 − 1} ]

and

h(x) ∝ x^{−1/2} (1 − x)^{(k−3)/2} = x^{−1/2} (1 − √x)^{(k_1+k_2−3)/2} (1 + √x)^{(k_1+k_2−3)/2}.

For our purpose, it is equivalent to show the monotonicity of g(x²)/h(x²), which is proportional to

l(x) := (1 + x)^{(−k_2+1)/2} (1 − x)^{(−k_1+1)/2} + (1 − x)^{(−k_2+1)/2} (1 + x)^{(−k_1+1)/2}.

When k_1 = 1, the derivative of l(x) satisfies

2 l′(x) = (k_2 − 1)(1 − x)^{(−k_2−1)/2} − (k_2 − 1)(1 + x)^{(−k_2−1)/2} > 0

for 0 < x < 1, and thus the likelihood ratio is an increasing function. When k_1 ≥ 2, we have

2 l′(x) (1 + x)^{(k_2+1)/2} (1 − x)^{(k_2+1)/2}
 = (1 + x)[ (k_2 − 1)(1 + x)^{(k_2−k_1)/2} + (k_1 − 1)(1 − x)^{(k_2−k_1)/2} ] − (1 − x)[ (k_1 − 1)(1 + x)^{(k_2−k_1)/2} + (k_2 − 1)(1 − x)^{(k_2−k_1)/2} ]
 > (k_2 − k_1)[ (1 + x)^{(k_2−k_1)/2} − (1 − x)^{(k_2−k_1)/2} ] ≥ 0

for all 0 < x < 1. Therefore, l(x) is an increasing function.
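Conjecture 4.8 can be probed along the lines of the simulation experiments mentioned above. The sketch below is ours; it uses the representation (4.2) with randomly drawn mixed-sign eigenvalues and compares survival functions with the conjectured bound.

```python
# Simulation sketch (ours) probing Conjecture 4.8 via the representation (4.2).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
lam = rng.normal(size=5)                         # eigenvalues of A*Sigma, both signs allowed
Z2 = rng.standard_normal((500_000, 5)) ** 2
W = (Z2 @ lam) ** 2 / (4.0 * (Z2 @ lam ** 2))
grid = np.linspace(0.01, 6.0, 50)
surv_W = np.array([(W > t).mean() for t in grid])
surv_bound = stats.chi2(df=1).sf(4.0 * grid)     # survival function of chi-square_1 / 4
print(np.all(surv_W >= surv_bound - 0.003))      # True (up to Monte Carlo noise) supports the conjecture
```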

5. Tetrads. We now turn to the problem that sparked our interest in Wald tests of singular hypotheses, namely, the problem of testing tetrad constraints on the covariance matrix Θ = (θ_{ij}) of a random vector Y in R^p with p ≥ 4. A tetrad is a 2×2 subdeterminant that only involves off-diagonal entries and, without loss of generality, we consider the tetrad

(5.1)  γ(Θ) = θ_{13}θ_{24} − θ_{14}θ_{23} = det( θ_{13}  θ_{14} ; θ_{23}  θ_{24} ).

Example 5.1. Consider a factor analysis model in which the coordinates of Y are linear functions of a latent variable X and noise terms. More precisely, Y_i = β_{0i} + β_i X + ε_i, where X ∼ N(0, 1) is independent of ε_1, ..., ε_p, which in turn are independent normal random variables. Then the covariance between Y_i and Y_j is θ_{ij} = β_iβ_j, and the tetrad from (5.1) vanishes.

Suppose now that we observe a sample of independent and identically distributed random vectors Y^{(1)}, ..., Y^{(n)} with covariance matrix Θ. Let Ȳ_n be the sample mean vector, and let

Θ̂ = (1/n) Σ_{i=1}^n (Y^{(i)} − Ȳ_n)(Y^{(i)} − Ȳ_n)^T

be the empirical covariance matrix. Assuming that the data-generating distribution has finite fourth moments, it holds that √n(Θ̂ − Θ) →_d N_k(0, V(Θ)) with k = p². The rows and columns of the asymptotic covariance matrix V(Θ) are indexed by the pairs ij := (i, j), 1 ≤ i, j ≤ p. Since the tetrad from (5.1) only involves the covariances indexed by the pairs in C = {13, 14, 23, 24}, only the principal submatrix

Σ(Θ) := V(Θ)_{C×C}

is of relevance for the large-sample distribution of the sample tetrad γ(Θ̂). The gradient of the tetrad is

∇γ(Θ) = (θ_{24}, −θ_{23}, −θ_{14}, θ_{13}).

Hence, if at least one of the four covariances in the tetrad is nonzero, the Wald statistic T_γ converges to a χ²_1 distribution.
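For concreteness, the following sketch is ours; the function name tetrad_wald and the normal-theory plug-in for V(Θ) are our choices, and the data are generated under the one-factor model of Example 5.1, where the tetrad vanishes.

```python
# Sketch (ours) of the Wald statistic for the tetrad (5.1), with the normal-theory
# asymptotic covariance V(Theta)_{ij,kl} = theta_ik*theta_jl + theta_il*theta_jk plugged in.
import numpy as np

def tetrad_wald(Y):
    n = Y.shape[0]
    S = np.cov(Y, rowvar=False, bias=True)        # empirical covariance matrix
    pairs = [(0, 2), (0, 3), (1, 2), (1, 3)]      # C = {13, 14, 23, 24}, 0-based
    gamma = S[0, 2] * S[1, 3] - S[0, 3] * S[1, 2]
    grad = np.array([S[1, 3], -S[1, 2], -S[0, 3], S[0, 2]])
    V = np.array([[S[i, k] * S[j, l] + S[i, l] * S[j, k] for (k, l) in pairs]
                  for (i, j) in pairs])
    return n * gamma ** 2 / (grad @ V @ grad)

rng = np.random.default_rng(4)
beta = np.array([1.0, 0.8, 1.2, 0.6])             # one-factor loadings: tetrad vanishes
Y = rng.standard_normal((2000, 1)) * beta + rng.standard_normal((2000, 4))
print(tetrad_wald(Y))
```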

If, on the other hand, θ_{13} = θ_{14} = θ_{23} = θ_{24} = 0, then the large-sample limit of T_γ has the distribution of W_{f,Σ(Θ)}, where f(x) = x_1x_4 − x_2x_3 is a quadratic form in k = 4 variables; recall Proposition 2.1. This form can be written as x^T A x with a matrix that is a Kronecker product, namely

(5.2)  A = (1/2) ( 0  1 ; −1  0 ) ⊗ ( 0  1 ; −1  0 ) = (1/2) ( 0 0 0 1 ; 0 0 −1 0 ; 0 −1 0 0 ; 1 0 0 0 ).

If Y is multivariate normal, then the asymptotic covariance matrix has the entries

V(Θ)_{ij,kl} = θ_{ik}θ_{jl} + θ_{il}θ_{jk}.

In the singular case with θ_{13} = θ_{14} = θ_{23} = θ_{24} = 0, we thus have

Σ(Θ) = ( θ_{11}θ_{33}  θ_{11}θ_{34}  θ_{12}θ_{33}  θ_{12}θ_{34} ;
         θ_{11}θ_{34}  θ_{11}θ_{44}  θ_{12}θ_{34}  θ_{12}θ_{44} ;
         θ_{12}θ_{33}  θ_{12}θ_{34}  θ_{22}θ_{33}  θ_{22}θ_{34} ;
         θ_{12}θ_{34}  θ_{12}θ_{44}  θ_{22}θ_{34}  θ_{22}θ_{44} )
       = ( θ_{11}  θ_{12} ; θ_{12}  θ_{22} ) ⊗ ( θ_{33}  θ_{34} ; θ_{34}  θ_{44} ),

which again is a Kronecker product. We remark that Σ(Θ) would also be a Kronecker product if we had started with an elliptical distribution instead of the normal, compare Iwashita and Siotani (1994, eqn. (2.1)), or if (Y_1, Y_2) and (Y_3, Y_4) were independent in the data-generating distribution. As we show next, in the singular case, the Kronecker structure of the two matrices A and Σ(Θ) gives a limiting distribution of the Wald statistic for the tetrad that does not depend on the block-diagonal covariance matrix Θ.

Theorem 5.2. Let Σ = Σ^{(1)} ⊗ Σ^{(2)} be the Kronecker product of two positive definite 2×2 matrices Σ^{(1)}, Σ^{(2)}. Let f(x) = x_1x_4 − x_2x_3. Then

W_{f,Σ} =_d (1/4) R² U²,

where R² ∼ χ²_4 and U ∼ Uniform[0, 1] are independent.

Proof. Since f is a quadratic form, we may consider the canonical form from Lemma 4.2, which depends on the (real) eigenvalues of AΣ. The claim follows from Lemma 4.3 and the comments in the paragraph following its proof, provided the four eigenvalues of AΣ all have the same absolute value, two of them are positive and two are negative.

Let Σ^{(i)} = (σ^{(i)}_{kl}). Then, by (5.2),

AΣ = (1/2) ( σ^{(1)}_{12}  σ^{(1)}_{22} ; −σ^{(1)}_{11}  −σ^{(1)}_{12} ) ⊗ ( σ^{(2)}_{12}  σ^{(2)}_{22} ; −σ^{(2)}_{11}  −σ^{(2)}_{12} ).

For i = 1, 2, since Σ^{(i)} is positive definite, the matrix

( σ^{(i)}_{12}  σ^{(i)}_{22} ; −σ^{(i)}_{11}  −σ^{(i)}_{12} )

has the imaginary eigenvalues ±λ^{(i)} = ±√( (σ^{(i)}_{12})² − σ^{(i)}_{11}σ^{(i)}_{22} ). It follows that AΣ has the real eigenvalues λ^{(1)}λ^{(2)}/2 and −λ^{(1)}λ^{(2)}/2, each with multiplicity two. Hence, Lemma 4.3 applies with k_1 = k_2 = 2.

The distribution function of (1/4)R²U² is

F_sing(t) = 1 − e^{−2t} + √(2πt) (1 − Φ(2√t)),  t ≥ 0,

where Φ(t) is the distribution function of N(0, 1). The density f_sing(t) of (1/4)R²U² is strictly decreasing on (0, ∞), and f_sing(t) → ∞ as t → 0. In light of Theorem 4.4, it is interesting to note that the distribution of (1/4)R²U² is not the distribution of a linear combination of four independent χ²_1 random variables, because the χ²_d distribution has a finite density at zero when d ≥ 2. However, the distribution satisfies

(1/4)χ²_1 ≤_st (1/4)R²U² ≤_st (1/4)χ²_2.

The first inequality holds according to Theorem 4.9. The second inequality holds because U² ≤ U and R²U ∼ χ²_2. According to the next result, the distribution is also no larger than a χ²_1 distribution, which means that the Wald test of a tetrad constraint is asymptotically conservative at the tetrad's singularities (which are given by block-diagonal covariance matrices).
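The limiting distribution function F_sing is easy to evaluate. The sketch below is ours; it checks F_sing against simulated values of (1/4)R²U² and reports the limiting rejection probability of a nominal 5% tetrad test at the singularity.

```python
# Numerical sketch (ours): F_sing versus simulation, and the limiting size of a nominal 5% test.
import numpy as np
from scipy import stats

def F_sing(t):
    t = np.asarray(t, dtype=float)
    return 1.0 - np.exp(-2.0 * t) + np.sqrt(2.0 * np.pi * t) * (1.0 - stats.norm.cdf(2.0 * np.sqrt(t)))

rng = np.random.default_rng(5)
W = 0.25 * stats.chi2(df=4).rvs(500_000, random_state=rng) * rng.uniform(size=500_000) ** 2
print(stats.kstest(W, F_sing))                    # consistent with the stated F_sing
c = stats.chi2(df=1).ppf(0.95)
print(1.0 - F_sing(c))                            # far below 0.05: the test is conservative
```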

Proposition 5.3. Suppose R² ∼ χ²_4 and U ∼ Uniform[0, 1] are independent. Then

(1/4) R² U² ≤_st χ²_1.

Proof. Let Z_1, ..., Z_4 be independent standard normal random variables. Then the sum of squares

Z_1² + Z_2² + Z_3² + Z_4² =_d R² ∼ χ²_4

and the ratio

Z_1² / (Z_1² + Z_2² + Z_3² + Z_4²) ∼ Beta(1/2, 3/2)

are independent. Hence, the claim holds if and only if (1/2)U ≤_st √B, where U ∼ Uniform[0, 1] and B ∼ Beta(1/2, 3/2). The distribution of U/2 is supported on the interval [0, 1/2], on which it has distribution function F_{U/2}(t) = 2t. For t ∈ (0, 1), the distribution function of √B has first and second derivatives

F′_{√B}(t) = (4/π)√(1 − t²)   and   F″_{√B}(t) = −4t/(π√(1 − t²)).

Hence, F_{√B} is strictly concave on (0, 1) and has a tangent with slope 4/π < 2 at t = 0. Consequently, F_{U/2}(t) ≥ F_{√B}(t) for all t ∈ R, giving the claimed ordering of (1/4)R²U² and the χ²_1 distribution.

6. Conjectures. In Section 3, we mentioned that Theorem 3.1 and Corollary 3.3 are equivalent. Similarly, Conjecture 1.1 is equivalent to the following one.

Conjecture 6.1. Let X = (X_1, X_2, ..., X_k)^T and Y = (Y_1, Y_2, ..., Y_k)^T be independent and have the same distribution N_k(0, Σ), where Σ has positive diagonal entries. If p_1, p_2, ..., p_k are nonnegative numbers such that p_1 + p_2 + ⋯ + p_k = 1, then

p_1 Y_1/X_1 + p_2 Y_2/X_2 + ⋯ + p_k Y_k/X_k

has the standard Cauchy distribution.
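Conjecture 6.1 can again be probed by simulation. The sketch below is ours; it draws a random covariance matrix and nonnegative weights and compares the weighted sum of ratios with the standard Cauchy distribution.

```python
# Monte Carlo probe (ours) of Conjecture 6.1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
k = 4
A = rng.normal(size=(k, k))
Sigma = A @ A.T + np.eye(k)
p = rng.dirichlet(np.ones(k))                     # nonnegative weights summing to one
X = rng.multivariate_normal(np.zeros(k), Sigma, size=300_000)
Y = rng.multivariate_normal(np.zeros(k), Sigma, size=300_000)
print(stats.kstest((p * Y / X).sum(axis=1), stats.cauchy.cdf))
```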

For a proof of this conjecture it is natural to try an induction-type argument, which might involve the ratio of normal random variables with nonzero means (Marsaglia, 1965). However, we were unable to make this work. By taking the reciprocal of W_{f,Σ}, we can translate Conjecture 1.1 into another equivalent form.

Conjecture 6.2. Let X = (X_1, X_2, ..., X_k)^T ∼ N_k(0, Σ) be such that none of its entries is a point mass. If p_1, p_2, ..., p_k are nonnegative numbers such that p_1 + p_2 + ⋯ + p_k = 1, then

(6.1)  [ (p_1/X_1, p_2/X_2, ..., p_k/X_k) Σ (p_1/X_1, p_2/X_2, ..., p_k/X_k)^T ]^{−1} ∼ χ²_1.

Simulation provides strong evidence for the validity of these conjectures. We have tried many randomly generated scenarios with 2 ≤ k ≤ 5, simulating large numbers of values for the rational functions in question. In all cases, empirical distribution functions were indistinguishable from the conjectured χ²_1 or Cauchy distribution functions. On the other hand, the positivity requirement for p_1, p_2, ..., p_k is crucial for the validity of the conjectures. For instance, let Q be the reciprocal of the quantity on the left-hand side of (6.1), and consider the special case where k = 2, var(X_1) = var(X_2) = 1, cor(X_1, X_2) = ρ, and p_1 = −p_2 = 1/2. Assuming that |ρ| < 1, change coordinates to Z_1 = (X_1 + X_2)/√(2(1+ρ)), Z_2 = (X_1 − X_2)/√(2(1−ρ)), and then to polar coordinates Z_1 = R cos Ψ and Z_2 = R sin Ψ. We obtain that

Q = 4 ( 1/X_1² − 2ρ/(X_1X_2) + 1/X_2² )^{−1} = R² [ρ + cos(2Ψ)]² / (1 − ρ²).

The distribution of Q now depends on ρ. For instance, E[Q] = (1 + 2ρ²)/(1 − ρ²).

7. Conclusion. In regular settings, the Wald statistic for testing a constraint on the parameters of a statistical model converges to a χ²_1 distribution as the sample size increases. When the true parameter is a singularity of the constraint, the limiting distribution is instead determined by a rational function of jointly normal random variables (recall Section 2). The distributions of these rational functions are in surprising ways related to chi-square distributions, as we showed in our main results in Sections 3–5.

Our work led to several, in our opinion, intriguing conjectures about the limiting distributions of Wald statistics. Although the conjectures can be stated in elementary terms, we are not aware of any other work that suggests these properties for the multivariate normal distribution. For quadratic forms, the usual canonical form leads to a particular class of distributions parametrized by a collection of eigenvalues (recall Lemma 4.2). It would be interesting to study Schur convexity properties of this class of distributions, which would provide further insight into the asymptotic conservativeness of Wald tests of singular hypotheses. Finally, this paper has focused on testing a single constraint. It would be interesting to develop a general theory for Wald tests of hypotheses that are defined in terms of several constraints. In this setting, the choice of the constraints representing a null hypothesis will play an important role in the distribution theory, as exemplified by Gaffke, Steyer and von Davier (1999) and Gaffke, Heiligers and Offinger (2002).

Acknowledgments. We would like to thank Gérard Letac and Lek-Heng Lim for helpful comments on our conjectures. This work was supported by the NSF under Grant No. DMS. Mathias Drton was also supported by an Alfred P. Sloan Fellowship.

REFERENCES

Azaïs, J.-M., Gassiat, É. and Mercadier, C. (2006). Asymptotic distribution and local power of the log-likelihood ratio test for mixtures: bounded and unbounded cases. Bernoulli.
Billingsley, P. (1995). Probability and Measure, 3rd ed. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, New York.
Bollen, K. A., Lennox, R. D. and Dahly, D. L. (2009). Practical application of the vanishing tetrad test for causal indicator measurement models: an example from health-related quality of life. Stat. Med.
Bollen, K. A. and Ting, K.-F. (2000). A tetrad test for causal indicators. Psychological Methods.
Chernoff, H. (1954). On the distribution of the likelihood ratio. Ann. Math. Statistics.
Cohen, E. A. Jr. (1981). A note on normal functions of normal random variables. Comput. Math. Appl.
DasGupta, A. and Shepp, L. (2004). Chebyshev polynomials and G-distributed functions of F-distributed variables. In A Festschrift for Herman Rubin. IMS Lecture Notes Monogr. Ser. Inst. Math. Statist., Beachwood, OH.
Drton, M. (2009). Likelihood ratio tests and singularities. Ann. Statist.

Drton, M., Massam, H. and Olkin, I. (2008). Moments of minors of Wishart matrices. Ann. Statist.
Drton, M., Sturmfels, B. and Sullivant, S. (2007). Algebraic factor analysis: tetrads, pentads and beyond. Probab. Theory Related Fields.
Drton, M., Sturmfels, B. and Sullivant, S. (2009). Lectures on Algebraic Statistics. Oberwolfach Seminars 39. Birkhäuser Verlag, Basel.
Drton, M. and Williams, B. (2011). Quantifying the failure of bootstrap likelihood ratio tests. Biometrika.
Feller, W. (1966). An Introduction to Probability Theory and Its Applications. Vol. II. John Wiley & Sons, New York.
Gaffke, N., Heiligers, B. and Offinger, R. (2002). On the asymptotic null-distribution of the Wald statistic at singular parameter points. Statist. Decisions.
Gaffke, N., Steyer, R. and von Davier, A. A. (1999). On the asymptotic null-distribution of the Wald statistic at singular parameter points. Statist. Decisions.
Glonek, G. F. V. (1993). On the behaviour of Wald statistics for the disjunction of two regular hypotheses. J. Roy. Statist. Soc. Ser. B.
Harman, H. H. (1976). Modern Factor Analysis, 3rd ed. University of Chicago Press, Chicago, Ill.
Hipp, J. R. and Bollen, K. A. (2003). Model fit in structural equation models with censored, ordinal, and dichotomous variables: testing vanishing tetrads. Sociological Methodology.
Iwashita, T. and Siotani, M. (1994). Asymptotic distributions of functions of a sample covariance matrix under the elliptical distribution. Canad. J. Statist.
Johnson, T. R. and Bodner, T. E. (2007). A note on the use of bootstrap tetrad tests for covariance structures. Struct. Equ. Model.
Johnson, N. L., Kotz, S. and Balakrishnan, N. (1994). Continuous Univariate Distributions. Vol. 1, 2nd ed. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, New York.
Kato, N. and Kuriki, S. (2013). Likelihood ratio tests for positivity in polynomial regressions. J. Multivariate Anal.
Marsaglia, G. (1965). Ratios of normal variables and ratios of sums of uniform variables. J. Amer. Statist. Assoc.
Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, New York.
Quine, M. P. (1994). A result of Shepp. Appl. Math. Lett.
Reid, J. G. (1987). Normal functions of normal random variables. Comput. Math. Appl.
Ritz, C. and Skovgaard, I. M. (2005). Likelihood ratio tests in curved exponential families with nuisance parameters present only under the alternative. Biometrika.
Seshadri, V. (1993). The Inverse Gaussian Distribution: A Case Study in Exponential Families. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York.

Shepp, L. (1964). Normal functions of normal random variables. SIAM Rev.
Silva, R., Scheines, R., Glymour, C. and Spirtes, P. (2006). Learning the structure of linear latent variable models. J. Mach. Learn. Res.
Spearman, C. (1904). General intelligence, objectively determined and measured. The American Journal of Psychology.
Spirtes, P., Glymour, C. and Scheines, R. (2000). Causation, Prediction, and Search, 2nd ed. Adaptive Computation and Machine Learning. MIT Press, Cambridge, MA. With additional material by David Heckerman, Christopher Meek, Gregory F. Cooper and Thomas Richardson.
Sullivant, S., Talaska, K. and Draisma, J. (2010). Trek separation for Gaussian graphical models. Ann. Statist.
Zwiernik, P. and Smith, J. Q. (2012). Tree cumulants and the geometry of binary tree models. Bernoulli.

Department of Statistics
University of Washington
Seattle, WA, U.S.A.
md5@uw.edu

Department of Statistics & Biostatistics
Rutgers University
Piscataway, NJ, U.S.A.
hxiao@stat.rutgers.edu


More information

1 Lyapunov theory of stability

1 Lyapunov theory of stability M.Kawski, APM 581 Diff Equns Intro to Lyapunov theory. November 15, 29 1 1 Lyapunov theory of stability Introduction. Lyapunov s second (or direct) method provides tools for studying (asymptotic) stability

More information

~ g-inverses are indeed an integral part of linear algebra and should be treated as such even at an elementary level.

~ g-inverses are indeed an integral part of linear algebra and should be treated as such even at an elementary level. Existence of Generalized Inverse: Ten Proofs and Some Remarks R B Bapat Introduction The theory of g-inverses has seen a substantial growth over the past few decades. It is an area of great theoretical

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

Solutions to Complex Analysis Prelims Ben Strasser

Solutions to Complex Analysis Prelims Ben Strasser Solutions to Complex Analysis Prelims Ben Strasser In preparation for the complex analysis prelim, I typed up solutions to some old exams. This document includes complete solutions to both exams in 23,

More information

F (z) =f(z). f(z) = a n (z z 0 ) n. F (z) = a n (z z 0 ) n

F (z) =f(z). f(z) = a n (z z 0 ) n. F (z) = a n (z z 0 ) n 6 Chapter 2. CAUCHY S THEOREM AND ITS APPLICATIONS Theorem 5.6 (Schwarz reflection principle) Suppose that f is a holomorphic function in Ω + that extends continuously to I and such that f is real-valued

More information

Open Problems in Algebraic Statistics

Open Problems in Algebraic Statistics Open Problems inalgebraic Statistics p. Open Problems in Algebraic Statistics BERND STURMFELS UNIVERSITY OF CALIFORNIA, BERKELEY and TECHNISCHE UNIVERSITÄT BERLIN Advertisement Oberwolfach Seminar Algebraic

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Control of Directional Errors in Fixed Sequence Multiple Testing

Control of Directional Errors in Fixed Sequence Multiple Testing Control of Directional Errors in Fixed Sequence Multiple Testing Anjana Grandhi Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102-1982 Wenge Guo Department of Mathematical

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

k-protected VERTICES IN BINARY SEARCH TREES

k-protected VERTICES IN BINARY SEARCH TREES k-protected VERTICES IN BINARY SEARCH TREES MIKLÓS BÓNA Abstract. We show that for every k, the probability that a randomly selected vertex of a random binary search tree on n nodes is at distance k from

More information

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2 EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 2 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory April 5, 2012 Andre Tkacenko

More information

Invertibility of random matrices

Invertibility of random matrices University of Michigan February 2011, Princeton University Origins of Random Matrix Theory Statistics (Wishart matrices) PCA of a multivariate Gaussian distribution. [Gaël Varoquaux s blog gael-varoquaux.info]

More information

PRIMARY DECOMPOSITION FOR THE INTERSECTION AXIOM

PRIMARY DECOMPOSITION FOR THE INTERSECTION AXIOM PRIMARY DECOMPOSITION FOR THE INTERSECTION AXIOM ALEX FINK 1. Introduction and background Consider the discrete conditional independence model M given by {X 1 X 2 X 3, X 1 X 3 X 2 }. The intersection axiom

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Iterative Solution of a Matrix Riccati Equation Arising in Stochastic Control

Iterative Solution of a Matrix Riccati Equation Arising in Stochastic Control Iterative Solution of a Matrix Riccati Equation Arising in Stochastic Control Chun-Hua Guo Dedicated to Peter Lancaster on the occasion of his 70th birthday We consider iterative methods for finding the

More information

arxiv: v1 [math.ra] 13 Jan 2009

arxiv: v1 [math.ra] 13 Jan 2009 A CONCISE PROOF OF KRUSKAL S THEOREM ON TENSOR DECOMPOSITION arxiv:0901.1796v1 [math.ra] 13 Jan 2009 JOHN A. RHODES Abstract. A theorem of J. Kruskal from 1977, motivated by a latent-class statistical

More information

Operators with numerical range in a closed halfplane

Operators with numerical range in a closed halfplane Operators with numerical range in a closed halfplane Wai-Shun Cheung 1 Department of Mathematics, University of Hong Kong, Hong Kong, P. R. China. wshun@graduate.hku.hk Chi-Kwong Li 2 Department of Mathematics,

More information

EE226a - Summary of Lecture 13 and 14 Kalman Filter: Convergence

EE226a - Summary of Lecture 13 and 14 Kalman Filter: Convergence 1 EE226a - Summary of Lecture 13 and 14 Kalman Filter: Convergence Jean Walrand I. SUMMARY Here are the key ideas and results of this important topic. Section II reviews Kalman Filter. A system is observable

More information

On the adjacency matrix of a block graph

On the adjacency matrix of a block graph On the adjacency matrix of a block graph R. B. Bapat Stat-Math Unit Indian Statistical Institute, Delhi 7-SJSS Marg, New Delhi 110 016, India. email: rbb@isid.ac.in Souvik Roy Economics and Planning Unit

More information

Recall the convention that, for us, all vectors are column vectors.

Recall the convention that, for us, all vectors are column vectors. Some linear algebra Recall the convention that, for us, all vectors are column vectors. 1. Symmetric matrices Let A be a real matrix. Recall that a complex number λ is an eigenvalue of A if there exists

More information

1 The linear algebra of linear programs (March 15 and 22, 2015)

1 The linear algebra of linear programs (March 15 and 22, 2015) 1 The linear algebra of linear programs (March 15 and 22, 2015) Many optimization problems can be formulated as linear programs. The main features of a linear program are the following: Variables are real

More information

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A =

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = 30 MATHEMATICS REVIEW G A.1.1 Matrices and Vectors Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = a 11 a 12... a 1N a 21 a 22... a 2N...... a M1 a M2... a MN A matrix can

More information

BALANCING GAUSSIAN VECTORS. 1. Introduction

BALANCING GAUSSIAN VECTORS. 1. Introduction BALANCING GAUSSIAN VECTORS KEVIN P. COSTELLO Abstract. Let x 1,... x n be independent normally distributed vectors on R d. We determine the distribution function of the minimum norm of the 2 n vectors

More information

Functions of Several Variables

Functions of Several Variables Functions of Several Variables The Unconstrained Minimization Problem where In n dimensions the unconstrained problem is stated as f() x variables. minimize f()x x, is a scalar objective function of vector

More information

b jσ(j), Keywords: Decomposable numerical range, principal character AMS Subject Classification: 15A60

b jσ(j), Keywords: Decomposable numerical range, principal character AMS Subject Classification: 15A60 On the Hu-Hurley-Tam Conjecture Concerning The Generalized Numerical Range Che-Man Cheng Faculty of Science and Technology, University of Macau, Macau. E-mail: fstcmc@umac.mo and Chi-Kwong Li Department

More information

Spectral inequalities and equalities involving products of matrices

Spectral inequalities and equalities involving products of matrices Spectral inequalities and equalities involving products of matrices Chi-Kwong Li 1 Department of Mathematics, College of William & Mary, Williamsburg, Virginia 23187 (ckli@math.wm.edu) Yiu-Tung Poon Department

More information

Linear Systems and Matrices

Linear Systems and Matrices Department of Mathematics The Chinese University of Hong Kong 1 System of m linear equations in n unknowns (linear system) a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.......

More information

Throughout these notes we assume V, W are finite dimensional inner product spaces over C.

Throughout these notes we assume V, W are finite dimensional inner product spaces over C. Math 342 - Linear Algebra II Notes Throughout these notes we assume V, W are finite dimensional inner product spaces over C 1 Upper Triangular Representation Proposition: Let T L(V ) There exists an orthonormal

More information

Failure of the Raikov Theorem for Free Random Variables

Failure of the Raikov Theorem for Free Random Variables Failure of the Raikov Theorem for Free Random Variables Florent Benaych-Georges DMA, École Normale Supérieure, 45 rue d Ulm, 75230 Paris Cedex 05 e-mail: benaych@dma.ens.fr http://www.dma.ens.fr/ benaych

More information

The extreme rays of the 5 5 copositive cone

The extreme rays of the 5 5 copositive cone The extreme rays of the copositive cone Roland Hildebrand March 8, 0 Abstract We give an explicit characterization of all extreme rays of the cone C of copositive matrices. The results are based on the

More information

arxiv: v1 [math.co] 3 Nov 2014

arxiv: v1 [math.co] 3 Nov 2014 SPARSE MATRICES DESCRIBING ITERATIONS OF INTEGER-VALUED FUNCTIONS BERND C. KELLNER arxiv:1411.0590v1 [math.co] 3 Nov 014 Abstract. We consider iterations of integer-valued functions φ, which have no fixed

More information

Notes on Linear Algebra and Matrix Theory

Notes on Linear Algebra and Matrix Theory Massimo Franceschet featuring Enrico Bozzo Scalar product The scalar product (a.k.a. dot product or inner product) of two real vectors x = (x 1,..., x n ) and y = (y 1,..., y n ) is not a vector but a

More information

Krzysztof Burdzy University of Washington. = X(Y (t)), t 0}

Krzysztof Burdzy University of Washington. = X(Y (t)), t 0} VARIATION OF ITERATED BROWNIAN MOTION Krzysztof Burdzy University of Washington 1. Introduction and main results. Suppose that X 1, X 2 and Y are independent standard Brownian motions starting from 0 and

More information

Supermodular ordering of Poisson arrays

Supermodular ordering of Poisson arrays Supermodular ordering of Poisson arrays Bünyamin Kızıldemir Nicolas Privault Division of Mathematical Sciences School of Physical and Mathematical Sciences Nanyang Technological University 637371 Singapore

More information

Taylor and Laurent Series

Taylor and Laurent Series Chapter 4 Taylor and Laurent Series 4.. Taylor Series 4... Taylor Series for Holomorphic Functions. In Real Analysis, the Taylor series of a given function f : R R is given by: f (x + f (x (x x + f (x

More information

The Delta Method and Applications

The Delta Method and Applications Chapter 5 The Delta Method and Applications 5.1 Local linear approximations Suppose that a particular random sequence converges in distribution to a particular constant. The idea of using a first-order

More information

1 Last time: least-squares problems

1 Last time: least-squares problems MATH Linear algebra (Fall 07) Lecture Last time: least-squares problems Definition. If A is an m n matrix and b R m, then a least-squares solution to the linear system Ax = b is a vector x R n such that

More information

2. Matrix Algebra and Random Vectors

2. Matrix Algebra and Random Vectors 2. Matrix Algebra and Random Vectors 2.1 Introduction Multivariate data can be conveniently display as array of numbers. In general, a rectangular array of numbers with, for instance, n rows and p columns

More information

Fiedler s Theorems on Nodal Domains

Fiedler s Theorems on Nodal Domains Spectral Graph Theory Lecture 7 Fiedler s Theorems on Nodal Domains Daniel A. Spielman September 19, 2018 7.1 Overview In today s lecture we will justify some of the behavior we observed when using eigenvectors

More information

On Expected Gaussian Random Determinants

On Expected Gaussian Random Determinants On Expected Gaussian Random Determinants Moo K. Chung 1 Department of Statistics University of Wisconsin-Madison 1210 West Dayton St. Madison, WI 53706 Abstract The expectation of random determinants whose

More information

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ). .8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics

More information

A MODIFICATION OF THE HARTUNG KNAPP CONFIDENCE INTERVAL ON THE VARIANCE COMPONENT IN TWO VARIANCE COMPONENT MODELS

A MODIFICATION OF THE HARTUNG KNAPP CONFIDENCE INTERVAL ON THE VARIANCE COMPONENT IN TWO VARIANCE COMPONENT MODELS K Y B E R N E T I K A V O L U M E 4 3 ( 2 0 0 7, N U M B E R 4, P A G E S 4 7 1 4 8 0 A MODIFICATION OF THE HARTUNG KNAPP CONFIDENCE INTERVAL ON THE VARIANCE COMPONENT IN TWO VARIANCE COMPONENT MODELS

More information

Problem Set 2. Assigned: Mon. November. 23, 2015

Problem Set 2. Assigned: Mon. November. 23, 2015 Pseudorandomness Prof. Salil Vadhan Problem Set 2 Assigned: Mon. November. 23, 2015 Chi-Ning Chou Index Problem Progress 1 SchwartzZippel lemma 1/1 2 Robustness of the model 1/1 3 Zero error versus 1-sided

More information

Lecture 2: Computing functions of dense matrices

Lecture 2: Computing functions of dense matrices Lecture 2: Computing functions of dense matrices Paola Boito and Federico Poloni Università di Pisa Pisa - Hokkaido - Roma2 Summer School Pisa, August 27 - September 8, 2018 Introduction In this lecture

More information

Math 443 Differential Geometry Spring Handout 3: Bilinear and Quadratic Forms This handout should be read just before Chapter 4 of the textbook.

Math 443 Differential Geometry Spring Handout 3: Bilinear and Quadratic Forms This handout should be read just before Chapter 4 of the textbook. Math 443 Differential Geometry Spring 2013 Handout 3: Bilinear and Quadratic Forms This handout should be read just before Chapter 4 of the textbook. Endomorphisms of a Vector Space This handout discusses

More information

WEYL S LEMMA, ONE OF MANY. Daniel W. Stroock

WEYL S LEMMA, ONE OF MANY. Daniel W. Stroock WEYL S LEMMA, ONE OF MANY Daniel W Stroock Abstract This note is a brief, and somewhat biased, account of the evolution of what people working in PDE s call Weyl s Lemma about the regularity of solutions

More information

New lower bounds for hypergraph Ramsey numbers

New lower bounds for hypergraph Ramsey numbers New lower bounds for hypergraph Ramsey numbers Dhruv Mubayi Andrew Suk Abstract The Ramsey number r k (s, n) is the minimum N such that for every red-blue coloring of the k-tuples of {1,..., N}, there

More information

Local strong convexity and local Lipschitz continuity of the gradient of convex functions

Local strong convexity and local Lipschitz continuity of the gradient of convex functions Local strong convexity and local Lipschitz continuity of the gradient of convex functions R. Goebel and R.T. Rockafellar May 23, 2007 Abstract. Given a pair of convex conjugate functions f and f, we investigate

More information

arxiv: v1 [math.ca] 23 Oct 2018

arxiv: v1 [math.ca] 23 Oct 2018 A REMARK ON THE ARCSINE DISTRIBUTION AND THE HILBERT TRANSFORM arxiv:80.08v [math.ca] 3 Oct 08 RONALD R. COIFMAN AND STEFAN STEINERBERGER Abstract. We prove that if fx) ) /4 L,) and its Hilbert transform

More information

Gaussian Models (9/9/13)

Gaussian Models (9/9/13) STA561: Probabilistic machine learning Gaussian Models (9/9/13) Lecturer: Barbara Engelhardt Scribes: Xi He, Jiangwei Pan, Ali Razeen, Animesh Srivastava 1 Multivariate Normal Distribution The multivariate

More information

Kernels of Directed Graph Laplacians. J. S. Caughman and J.J.P. Veerman

Kernels of Directed Graph Laplacians. J. S. Caughman and J.J.P. Veerman Kernels of Directed Graph Laplacians J. S. Caughman and J.J.P. Veerman Department of Mathematics and Statistics Portland State University PO Box 751, Portland, OR 97207. caughman@pdx.edu, veerman@pdx.edu

More information