Siegel s formula via Stein s identities

Size: px

Start display at page:

Download "Siegel s formula via Stein s identities"

Veronica Grant
5 years ago
Views:

1 Siegel s formula via Stein s identities Jun S. Liu Department of Statistics Harvard University Abstract Inspired by a surprising formula in Siegel (1993), we find it convenient to compute covariances, even for order statistics, by using Stein (1972) s identities. Generalizations of Siegel s formula to other order statistics as well as other distributions are obtained along this line. KEY WORDS: Exponential family; multivariate Stein s identity; covariance with order statistics. 1 Siegel s result and Stein s identities In a recent paper of Siegel (1993), a remarkable formula is found for the covariance of X 1 with the minimum of the multivariate normal vector (X 1,..., X n ). The theorem is stated as follows. Theorem [Siegel] Let (X 1,..., X n ) be multivariate normal, such that the variables are distinct, with arbitrary mean and variance structure. Then cov[x 1, min(x 1,..., X n )] = cov(x 1, X i )Pr[X i = min(x 1,..., X n )]. i=1 On the other hand, Stein (1972) observes that for Z N(µ, σ 2 ) and any function f such that E f (Z) <, E[(Z µ)f(z)] = σ 2 E[f (Z)]. (1) Similar identities hold for many other distributions. For example, if Z Poisson(λ), then E[(Z λ)f(z)] = λe[f(z + 1) f(z)]. 0 Address corresponds to: Jun S. Liu, Department of Statistics, Harvard University, Cambridge, MA

2 If Z t ν (0, σ 2 ), i.e., Z = Y/ W/ν, where Y N(0, σ 2 ) and W χ 2 ν, then E[Zf(Z)] = ν/(ν 2)σ 2 E[f (Z )], where Z t ν 2 (µ, σ 2 ). The above identities are directly proved as an application of integration by parts. To illustrate, we show the last identity when σ = 1: E[Zf(Z)] = = = = ( ) (ν+1)/2 Γ[(ν + 1)/2] zf(z) 1 + z2 dz νπγ(ν/2) ν ( ) ( Γ[(ν + 1)/2] ν f(z) νπγ(ν/2) ν z2 ν ) (ν 1)/2 ( ) ν (ν 1)/2 f Γ[(ν 1)/2] (z) 1 + z2 dz ν 2 (ν 2)πΓ(ν/2 1) ν 2 ν/(ν 2)E[f (Z )]. dz There is also a multivariate version of Stein s identity as follows. Lemma 1 Let X = (X 1,..., X n ) be multivariate normally distributed with arbitrary mean vector µ and covariance matrix Σ. For any function h(x 1,..., x n ) such that h/ x i exists almost everywhere and E x i h(x) <, i = 1,..., n, we write h(x) = ( h(x),..., h(x) x n ) T. Then the following identity is true: x 1 cov[x, h(x)] = Σ E[ h(x)]. (2) Specifically, cov[x 1, h(x 1,..., X n )] = i=1 cov(x 1, X i )E[ x i h(x 1,..., X n )]. (3) Proof: Let Z = (Z 1,..., Z n ), where the Z i are i.i.d. N(0, 1) random variables. It is straightforward to show by (1) that for any g(x), differentiable almost everywhere, cov[z i, g(z)] = E[ g(z)/ z i ]. Hence cov[z, g(z)] = E[ g(z)]. Also see Stein (1981) for a more elaborate proof. Now for the random vector X, we can write X = Σ 1/2 Z + µ, and g(z) = h(σ 1/2 Z + µ). Hence the left hand side of (2) is cov[x, h(x)] = cov[σ 1/2 Z, g(z)] = Σ 1/2 E[ g(z)]. Since g(z) = Σ 1/2 h(x), the identity (2) follows immediately. 2

3 REMARK: A version of this lemma appears in Stein (1981) with Σ being the identity matrix. However, the search of an explicit formula as (2) is not successful, although it is believed that the form is well-known in the fields of sliced inverse regression and decision theory. Siegel (1993) s method can also be applied to prove the identity (2) but is less direct. We find that Siegel s theorem is closely related to Stein s identities and can be generalized to covariances of X 1 with other order statistics, other functionals, and for other distributions. 2 A simple proof of the generalized Siegel s formula Let f u (z) = min(z, u). Then f u is piecewise linear in z. Hence for Z N(µ, σ 2 ), Stein s identity can be applied to get that cov[z, f u (Z)] = σ 2 E[f u(z)] = σ 2 E(I {Z<u} ) = σ 2 Pr[Z = min(z, u)], which easily leads to Lemmas 2 and 3 of Siegel (1993). Then some more properties of the normal distribution are used in Siegel (1993) to obtain the proof of his theorem. We find, however, that it is convenient to use a multivariate version of Stein s identity directly to prove the following generalized Siegel s formula. Theorem 1 Let (X 1,..., X n ) be multivariate normal, such that the variables are distinct, with arbitrary mean and variance structure. Let X (i) be the ith largest among X 1,..., X n, then cov[x 1, X (i) ] = cov(x 1, X j )P r(x j = X (i) ). j=1 Proof: Define h(x 1,..., x n ) = x (i). Then h is piecewise linear and therefore almost everywhere differentiable. Also it is noticed that h(x 1,..., x n )/ x j is one if x j = x (i) and zero if x j x (i). Hence by applying identity (3) we obtain the result. REMARK: When X 1,..., X n are i.i.d., Carl Morris and I found the following ancillary argument to prove the above result. Since X (i) X is independent of X, cov( X, X(i) ) = cov( X, X) = var(x 1 )/n. On the other hand, since cov(x j, X (i) ) = cov(x k, X (i) ) for any j and k because of the i.i.d. assumption, we obtain that cov(x 1, X (i) ) = var(x 1 )/n. For a multivariate normal random vector, the covariance of X 1 with min( X 1,..., X n ) can also be computed similarly. Define h(x 1,..., x n ) = min( x 1,..., x n ), then h(x)/ x j is equal to 3

4 1 if 0 < x j < u; 1 if u < x j < 0; and 0 if x j > u, where u = min( X k, k j). Hence Stein s identity (3) leads to the following corollary. Corollary 1 Let Y = min( X 1,..., X n ), where (X 1,..., X n ) is multivariate normal with arbitrary mean and covariance structure. Then cov(x 1, Y ) = cov(x 1, X i )[Pr( X i = Y, X i > 0) Pr( X i = Y, X i < 0)]. i=1 When X 1,..., X n have mean zero, the above covariance equals zero. 3 Generalizations to other distributions Stein s identities for other distributions can also be used to obtain similar results. Corollary 2 Let X 1 Poisson(λ 1 ) be independent of X 2,..., X n, then cov[x 1, min(x 1,..., X n )] = λ 1 P r[x 1 < X j, for all j 1). Proof: Define f u (x) the same way as before, then by Stein s identity for a Poisson distribution: cov[x 1, f u (X 1 )] = λ 1 E[f u (X 1 + 1) f u (X 1 )] = λ 1 Pr(X 1 < u). Hence conditionally cov[x 1, f u (X 1 ) X 2,..., X n ] = λ 1 Pr[X 1 < min(x 2,..., X n ) X 2,... X n ]. Since cov(u, V )=E[cov(U, V W )]+cov[e(u W ), E(V W )], and X 1 is independent of X j, j 1, the conclusion follows. Corollary 3 Let X 1 t ν (0, σ 2 ) be independent of X 2,..., X n, then ν cov[x 1, min(x 1,..., X n )] = ν 2 σ2 P r[x1 < min(x 2,..., X n )], where X 1 t ν 2(0, σ 2 ). Let X (i) be the ith order statistics, then more generally, ν cov[x 1, X (i) ] = ν 2 σ2 P r[x (i 1) < X1 < X (i+1) ]. Proof: Use Stein s identity for t-distribution. 4

5 More generally, Morris (1983) provides Stein-type identities (theorem 5.2) for natural exponential families with quadratic variance functions (NEF-QVF), which can be used to generalize the above arguments to a broader class of distributions (NEF-QVF) without much difficulty. To illustrate, we compute cov[y 1, min(y 1,..., Y n )] where Y i Exponential(θ i ) are independent, E(Y i ) = 1/θ i, and var(y i ) = 1/θi 2. Stein s identity takes the form cov[y 1, f(y 1 )] = 1 θ 1 E[Y 1 f (Y 1 )]. Let f u (x) = min(x, u), then cov[y 1, f u (Y 1 )] = (1/θ 1 )E(Y 1 I {Y1 <u}). Since U = min(y 2,..., Y n ) Exponential(θ θ n ), cov[y 1, min(y 1,..., Y n )] = 1 θ 1 E[E(Y 1 I {Y1 <U} U)] = 1 θ 1 E[Y 1 Pr(U > Y 1 Y 1 )] = 1 (θ θ n ) 2. REMARK: Firstly, it is noted that this covariance does not depend on θ 1 at all, and a more general version is cov[y i, min(y 1,..., Y n )] = var[min(y 1,..., Y n )]. Secondly, Professor Carl Morris provides another ancillary argument for the result: consider a location shift family of exponential distributions with known scale parameters θ 1,..., θ n. Then Y (1) = min(y 1,..., Y n ) is complete sufficient for the location parameter. Hence by Basu s theorem, Y 1 Y (1) is independent of Y (1), and therefore the result holds. Acknowledgement: The author thanks Professors Hal Stern and Carl Morris for bringing up the problem and providing very stimulating discussions. Professor Morris also suggested the computations for the exponential distribution. REFERENCES Morris, C.N. (1983) Natural exponential families with quadratic variance functions: statistical theory, Ann. Statist., 11, Siegel, A.F. (1993) A surprising covariance involving the minimum of multivariate normal variables, J. Amer. Statist. Assoc., 88, Stein, C.M. (1972) A bound for the error in the normal approximation to the distribution of a sum of dependent random variables, Proc. Sixth Berkeley Symp. Math. Statist. Probab., 2,

6 Stein, C.M. (1981) Estimation of the mean of a multivariate normal distribution, Ann. Statist., 6,

ECE Lecture #9 Part 2 Overview

ECE Lecture #9 Part 2 Overview ECE 450 - Lecture #9 Part Overview Bivariate Moments Mean or Expected Value of Z = g(x, Y) Correlation and Covariance of RV s Functions of RV s: Z = g(x, Y); finding f Z (z) Method : First find F(z), by